Written By: Bernd Harzog
Yoav Dembak, the CEO of B-hive, was kind enough to make an hour of his time available to me the week after the announcement that VMware was going to acquire B-hive. This was followed up by an hour with their CTO, Asaf Wexler, who gave me a pretty in-depth demonstration of B-hive. I took quite a few screenshots during the demo, which I will share with you in this article, so this is my way of saying sorry it is so long (since I have to explain all of the screenshots).
B-hive Benefits
Before I start with the product review, let's go over B-hive at a high level. First of all, here are the benefits that B-hive claims you will derive from the product:
- A true picture of applications response time across the virtualized environment provides a better mechanism for sizing decisions than a resource utilization approach
- When applications move from physical servers "owned" by the business to virtual servers owned by IT, IT needs to take more responsibility for applications performance than it did in the physical environment. B-hive claims to provide this information in a manner appropriate to IT.
- Having quality application response time data allows virtualization projects to be accelerated, since it allows IT to provide accurate per-application performance information to the business, which is what the business needs in order to feel comfortable endorsing the migration of "their" business critical applications over to a virtual infrastructure.
- Appropriate to IT means that IT does not have to do any per application work to make it work. B-hive will either automatically discover transactions in your applications, or use a generic TCP/IP mechanism to calculate response times.
- B-hive is unique in the way it automatically discovers transactions in applications that are either specifically supported (Outlook/Exchange) or that are built on top of a supported middleware platform (like HTTP, J2EE, and SQL). If the application is built in a manner where B-hive can discover its transactions (see the table later in this review), then B-hive will automatically calculate response times for the transactions that comprise this application. The application middleware support in B-hive is broad enough that it is likely to be able to calculate application and transaction response time for a reasonable set of your business critical applications.
- The data collected by B-hive is comprehensive enough, and of high enough quality, that automatic management decisions can be made with it. In other words, with this data, and with the integration that B-hive has with VMotion and DRS, B-hive claims that you can substantially automate the process of meeting service levels by setting up B-hive to automatically create new instances of servers in VMs, or to move a VM to a host that has more capacity.
- B-hive wants to "play nice" with other systems management and monitoring vendors. Their plan is to make all of their application and transaction performance data available via a web services API in "the very near future".
Why the Old Way No Longer Works
Before I get into how the product works, I want to spend a moment on why it is important to do things in the way that B-hive does them. There is a right way and a wrong way to do Applications Performance Management in virtualized environments. The reason for this is that when you stick a piece of software in a VM, the Windows OS (assume Windows for a moment) no longer owns the clock (the hypervisor does). This means that anything that counts time inside of a VM will do so incorrectly. This includes management agents from systems management vendors and APM vendors. This in turn means that you cannot collect resource usage information or response times from within a guest and try to use that information to infer anything about the performance of the application running in the guest. Time based metrics include CPU utilization, Page Faults per Second, Context Switches per Second, Disk I/O Reads/Writes per Second, Network Bytes Sent/Received per Second, and most importantly any measure of the time elapsed between Event A (start of a transaction) and Event B (end of a transaction). So, neither resource based metrics nor applications response time metrics collected from inside of a guest VM are valid. All of this is described in a VMware Whitepaper on the subject if you do not believe me. Bottom line - products that install agents to measure resource utilization and/or response time in virtualized guests do not work. So once you virtualize, a new way to do APM is needed.
The Basics of How B-hive Works
Given that the "old" way of doing APM does not work inside of a virtualized guest, there must be a new way, right? Yes there is. There are two keys to doing this the right way. The first is that while resource utilization is important, it is not the key metric to focus upon. The key metric is application response time. This is because per application resource utilization is no longer reliably available, and because the business really prefers a metric that they can understand (response time) to one that they cannot understand (CPU utilization). The second key is to collect that response time data on a per application (and, if possible, per transaction) basis without being impacted by the issues of collecting time based metrics inside of guests. So, you have to use an "outside-in" methodology to collect application response time data about an application inside of a guest from outside of the guest.
There have been HTTP appliances around for years that did this by attaching to mirror ports on the switches that supported physical web servers. However, the problem with this approach is that it cannot capture response times between two guests within one physical VMware host. In order to do this, you need to measure response times from within each host but from outside of each guest. This is done via a virtual appliance that sits on the virtual mirror port of the virtual switch inside of the VMware host. This is exactly how B-hive is implemented.
So, here is how B-hive works:
- As mentioned above a B-hive virtual appliance is deployed in each VMware host that is running applications that are considered business critical.
- If you have multi-tier applications (web server - app server - database server) it is important to put virtual appliances in the hosts that are running all of the tiers of the application, with one exception: it is not necessary to put a B-hive appliance in the host running the database server; it is enough to put one in the hosts that contain the portions of the application that talk to the database server. The reason for this is that B-hive decodes the web and the database transactions, but it needs to see both of them in order to decode them and their response times for you.
- Each of the virtual appliances that collect data is referred to as a probe. Probes forward their data to an analysis appliance, which then stores the data either in a local database or in the SQL Server or Oracle database of your choice.
- You can tell B-hive to find all of the applications, or you can select the ones that you want it to focus upon.
- B-hive does not just look at response time from a TCP/IP standpoint. It decodes the applications level data to find the "atomic" transactions that comprise the application, and to map these atomic transactions to their corresponding database queries.
- B-hive automatically discovers and calculates response times for transactions in a wide set of applications middleware. There is also a specific protocol decode for Outlook clients talking to MS Exchange Servers. Other applications are dealt with in a more generic manner and basic TCP/IP response times are used to calculate applications performance (instead of atomic transactions). One thing you will have to pay very close attention to when evaluating B-hive is how well their set of supported application protocols maps to your set of business critical applications.
- B-hive provides you with round trip and per hop per application layer response time data for every auto-discovered application for which they have an applications level protocol decode. This is in and of itself highly valuable information.
- You can get alerts off of this information, or you can tie it to rules that tell DRS to fire up another VM that contains another web server if the response time for a web server is over a threshold you specify.
- Finally, B-hive contains automatic baselining technology that automatically builds a picture of the normal behavior of the application for you. So you can set your thresholds relative to normal instead of as hard manual values, which really helps cut down on false alarms.
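To make the "outside-in" idea concrete, here is a minimal sketch of how response times could be derived purely from packet timestamps taken by a probe sitting outside the guests. This is not B-hive's implementation, which I have not seen; the `Packet` record, its `conn` key and the `response_times` function are all hypothetical names invented for illustration. The point is only that the clock doing the measuring belongs to the probe, not to the guest being measured.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Packet:
    """One observed packet. All fields are hypothetical, for illustration only."""
    ts_ms: float  # timestamp from the probe's clock, taken outside the guest
    conn: str     # connection key, e.g. "client:port->server:port"
    kind: str     # "request" or "response"


def response_times(packets: List[Packet]) -> List[Tuple[str, float]]:
    """Pair each request with the first response seen on the same connection.

    Returns (connection, elapsed_ms) tuples in the order responses arrive.
    Elapsed time runs from the first request packet to the first response
    packet, so this is a time-to-first-response measure.
    """
    pending: Dict[str, float] = {}
    results: List[Tuple[str, float]] = []
    for p in sorted(packets, key=lambda pkt: pkt.ts_ms):
        if p.kind == "request":
            # Keep the timestamp of the first packet of an outstanding request.
            pending.setdefault(p.conn, p.ts_ms)
        elif p.kind == "response" and p.conn in pending:
            results.append((p.conn, p.ts_ms - pending.pop(p.conn)))
    return results


observed = [
    Packet(0.0, "a", "request"),
    Packet(5.0, "b", "request"),
    Packet(12.5, "a", "response"),
    Packet(30.0, "b", "response"),
]
timings = response_times(observed)
```

A real probe would also have to decode the application protocol to know where a request starts and ends; the sketch assumes that classification has already been done.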
The Product Details
B-hive starts out by giving you a very nice per machine (physical or virtual) view of your applications system. The little window on the lower right is used to zoom into whatever portion of the application system you want to see in the larger window. You can go from seeing the entire end-to-end view of an application system to seeing just transactions and database queries for one web server in a few mouse clicks.

If you focus in on a specific web server two things automatically happen. The first is that the transactions for the application on that web server (in the case of this demo it was Sugar CRM) get shown to the right of the web server. Note that these transactions were discovered automatically by B-hive. The second is that B-hive also automatically calculates the response time for each of the discovered transactions, and if you keep the focus on the web server, it shows you a rollup of the response time statistics for all of the transactions flowing through that server. You get three different pictures of response time: the WAN (the round trip from the user to the web server), the Network (the network latency, not ping, for this application from the web server to the back end database server and back again), and the Transaction response time, which is itself split into the infrastructure and applications components. Note that in the example below, total response time is 359 ms, of which 333 ms is in the application. So one important question has already been answered (the time is being spent in the application itself, not the infrastructure for the application). Just knowing this piece of information for each business critical application in the virtual infrastructure is (in my opinion) enough to buy B-hive, since it lets the IT manager prove to the application owner that the slowness is not in the infrastructure.
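The arithmetic behind that conclusion is simple enough to spell out. The snippet below uses only the two numbers from the demo (359 ms total, 333 ms in the application); treating the whole remainder as infrastructure time is my assumption, based on the transaction response time being split into just those two components.

```python
total_ms = 359        # total response time shown in the demo
application_ms = 333  # portion attributed to the application

# Assumption: whatever is not application time is infrastructure time.
infrastructure_ms = total_ms - application_ms
application_share = application_ms / total_ms

print(f"infrastructure: {infrastructure_ms} ms, "
      f"application share: {application_share:.1%}")
# prints "infrastructure: 26 ms, application share: 92.8%"
```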

If you want to drill into a specific transaction, you can highlight that transaction, and B-hive will highlight the corresponding database tables and queries for you. So the IT Manager can not only say to the application owner, "the problem is not in my infrastructure - it is in your application", but some relevant detail about where the problem is occurring is also easy to get.

You can even drill down into the detail of the actual SQL Query that is at fault for the slow response time.

One of the difficult tasks when using any monitoring product is to decide where to set the thresholds that separate normal from abnormal behavior. This is very tricky to do since if you set them too low you will get overwhelmed with false alarms, and if you set them too high you may miss something important. It is also the case that normal is often time of day and day of the week dependent (you may expect things to be a little slower in the busiest part of each day). B-hive lets you ask it to suggest threshold values based upon the recent statistical behavior of the response time metrics of interest.
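I do not know what statistical method B-hive actually uses to suggest thresholds, but the general idea can be sketched with a common textbook approach: take recent response time samples and place the threshold a few standard deviations above their mean. The function name `suggest_threshold` and the multiplier `k` are hypothetical choices for this illustration.

```python
import statistics
from typing import Sequence


def suggest_threshold(samples_ms: Sequence[float], k: float = 3.0) -> float:
    """Suggest an alert threshold as mean + k standard deviations.

    samples_ms: recent response time samples for one metric, in milliseconds.
    k: how many standard deviations above normal counts as abnormal
       (an assumed default, not a value taken from the product).
    """
    mean = statistics.fmean(samples_ms)
    spread = statistics.pstdev(samples_ms)
    return mean + k * spread


recent = [100, 110, 90, 105, 95]
threshold = suggest_threshold(recent)
```

To account for time of day and day of the week effects, the same calculation could be run separately on the samples from each hour or weekday, giving one threshold per time bucket instead of a single global value.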

Like any good management product there is an extensive reporting system that allows you to run a wide variety of canned reports whenever you want and distribute them to whomever you want.

Any good management product needs a dashboard that rolls up the information that various users of the product need. B-hive provides a very nice dashboard that can easily be configured to address the appropriate level of detail for a variety of constituents.

When you get an alert off of the dashboard, you get quite a bit of detail with the alert. This helps you figure out where the problem lies (on the first pass, infrastructure or application; on the second pass, where in the application). So, this information is highly valuable not only to the IT staff supporting the infrastructure but also to the teams who own and support the applications.

Supported Protocols and Applications
B-hive uses deep packet inspection techniques to identify applications and then within certain types of applications identify response times for specific transactions. The table below lists the different applications and protocols supported by B-hive, and indicates whether or not application specific transaction support is available. In the transaction column the type of transaction for the protocol is listed if transaction support exists for that protocol. If a transaction is listed, then B-hive will automatically discover transactions for this protocol, and you can via configuration combine atomic transactions into compound transactions. If the transactions are not supported, then B-hive uses a generic TCP/IP request response mechanism to calculate response times for this protocol.
B-hive Protocol, Application and Transaction Support
Protocol | Application | Transaction |
HTTP/S | All Web | HTTP Request/Response |
Database RPC | SQL Server, Oracle, MySQL | SQL Query/Response |
Outlook/Exchange RPC | Outlook and Exchange | End User Action in Outlook/Response |
J2EE App Servers | Weblogic, Websphere, Tomcat, Jboss, etc. | HTTP Request/Response |
DNS RPC | DNS Services | DNS Command/Reply |
Active Directory RPC | Active Directory Services | LDAP Command (bind)/Response |
Domain Controller RPC | Domain Name Services | DC Command/Response |
File Server RPC | File Services | Open/Read/Write |
ICA | Citrix | TCP/IP Request/Response |
RDP | MS Terminal Services | TCP/IP Request/Response |
Blackberry | Blackberry | TCP/IP Request/Response |
VDI | VDI | TCP/IP Request/Response |
Pricing
B-hive is priced at $1000 per physical CPU socket on your VMware servers. This strikes me as right in line with both how the industry charges for these kinds of products and how much they should cost. If this seems like a lot of money to you, then I would suggest that you think about what it would be worth to you to be able to prove your innocence (or the innocence of your applications infrastructure) when the business guys complain about performance, and that you think about how much you could accelerate your virtualization projects if you could use this data to prove performance to the reluctant owners of business critical applications.
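As a rough budgeting aid, the per socket pricing turns into a one line calculation. The host counts below are made up for illustration; only the $1000 per physical CPU socket figure comes from B-hive, and the sketch ignores any discounts or maintenance fees, which I have no information about.

```python
from typing import Sequence

PRICE_PER_SOCKET_USD = 1000  # list price quoted above


def license_cost(sockets_per_host: Sequence[int]) -> int:
    """Total list price for a set of VMware hosts, given sockets per host."""
    return PRICE_PER_SOCKET_USD * sum(sockets_per_host)


# Hypothetical environment: ten 2-socket hosts and two 4-socket hosts.
hosts = [2] * 10 + [4] * 2
cost = license_cost(hosts)
```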
The Good
I do not claim to have seen every APM product that is relevant to managing virtualized applications, but I am pretty sure I have seen most of them. B-hive is one of the nicest ones that I have seen due to how much work it does for you without requiring that IT know anything about the applications to begin with or as they change. If you have business critical applications in production on VMware servers, and/or you plan to grow your list of business critical applications in your VMware environment, then you absolutely should take a look at B-hive.
The Caveats
With any product there are always caveats. The perfect product would provide transaction level visibility for all applications automatically. However the perfect product is not possible due to the wide variety with which applications have been built since the beginning of business computing. So, you should be aware of the following caveats when evaluating B-hive or any other APM solution for your virtualized environment:
- B-hive has two levels of support for applications. The difference between the two levels is whether or not atomic transactions are automatically discovered for an application. For applications written to supported middleware platforms and Outlook/Exchange, B-hive will automatically discover the atomic transactions, and calculate response times for each of them. For other applications, a generic TCP/IP request response mechanism is used to calculate a response time number for that application (with no visibility into the corresponding atomic transactions). So you will get great visibility into some or most of your applications. But you will not get atomic transaction visibility into applications built to middleware layers not supported by B-hive. This is not a slam against B-hive; it is simply recognition of the fact that the problem is too large to ever be addressed in its entirety.
- The transactions that B-hive automatically discovers are at the "atomic" level. For example, in an HTTP application they are the individual HTTP request/responses, each one of which is considered a transaction. This is at a much more granular level than the way your business colleagues would view a transaction, as many atomic transactions are required to create one business transaction like the entry of an order. This is not a reason not to buy B-hive; it is just a setting of expectations that B-hive's out of the box discovery of transactions is appropriate for an IT person who wants an overall view of application performance, and not for a business analyst who wants to know how long the application system took to respond to an entire submit of an order. B-hive does provide a way to combine atomic transactions into business transactions, but that feature of the product was outside of the scope of this review since it is for a different audience than the IT staff who manage virtualized systems.
- B-hive is appropriate for virtualized servers, and is far less so for virtualized desktops. The reason for this is that the product focuses on the server tiers of the application system, and that it does not include the visibility up into the user's virtualized desktop which would be necessary for a VDI APM solution. vmSight's Connector ID technology is unique in its ability to provide insight into who the actual users are in all cases, and should therefore be strongly considered for VDI implementations.
- There is also no single product that provides a complete picture. While B-hive provides a great picture of applications response time across the applications and server tiers of the virtualized applications systems, it does not tell you how and when things like I/O contention or SAN configuration are in fact the root cause of problems. This is what Akorri specializes in, and a true solution to the problem of how to performance manage a virtualized system from end to end may in fact require both B-hive and Akorri.
- Right now B-hive claims to support VMware and Citrix Xen. Once VMware takes control of B-hive's product roadmap, it would be reasonable to expect VMware to focus B-hive upon the VMware platform to the exclusion of Citrix Xen and potentially in the future Microsoft Hyper-V (this is speculation on my part - VMware may well choose to be a cross-platform APM vendor, I just do not know how this will turn out). If you expect to have both VMware and, for example, Hyper-V servers, with perhaps some virtualized Citrix desktops thrown in for good measure, then you probably have a problem that is outside of the scope of anyone who does APM at a response time level for virtualized systems right now (of course this will change).
- What VMware plans to do with B-hive is unknown at this point. Clearly tighter integration is going to happen, but I simply do not know how B-hive is going to be represented in the set of management tools from VMware. So, if you buy B-hive now, you may end up upgrading to a different packaging and licensing scheme at some point in time in the future as things get more integrated with the VMware management tool set.
- B-hive strongly emphasizes the service level automation aspects of their product. While the benefits of adjusting the infrastructure to automatically meet service levels are obvious and highly desirable, I can see that a lot of thinking and planning would have to go into the rules so that unintended and surprising things do not happen. For example if a decision is made to move a VM that is experiencing slow response time to a host that has more capacity, and to move it back again when response time improves, that VM could easily end up getting moved back and forth between two hosts in a rapid and continuous manner.
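The back and forth problem in the last caveat is a classic one, and the usual remedy is to combine two separate thresholds (hysteresis) with a minimum wait between moves (a cooldown). The sketch below shows that general technique only; it is not how B-hive or DRS rules are actually configured, and the class name, field names and action strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigrationRule:
    """Hypothetical rule that avoids moving a VM back and forth too quickly."""
    move_out_above_ms: float   # move to a larger host above this response time
    move_back_below_ms: float  # move back only once well below this value
    cooldown_s: float          # minimum seconds between any two moves
    last_move_s: Optional[float] = None
    moved: bool = False

    def decide(self, now_s: float, response_ms: float) -> str:
        # Cooldown: refuse any move shortly after the previous one.
        if self.last_move_s is not None and now_s - self.last_move_s < self.cooldown_s:
            return "hold"
        # Hysteresis: the two thresholds differ, so a response time that
        # hovers between them triggers nothing.
        if not self.moved and response_ms > self.move_out_above_ms:
            self.moved, self.last_move_s = True, now_s
            return "move_to_larger_host"
        if self.moved and response_ms < self.move_back_below_ms:
            self.moved, self.last_move_s = False, now_s
            return "move_back"
        return "hold"


rule = MigrationRule(move_out_above_ms=500, move_back_below_ms=200, cooldown_s=600)
decisions = [
    rule.decide(0, 650),    # slow: move out
    rule.decide(60, 150),   # fast again, but still inside the cooldown
    rule.decide(700, 300),  # between the two thresholds
    rule.decide(800, 150),  # clearly recovered: move back
    rule.decide(900, 650),  # slow again, but inside the new cooldown
]
```

The gap between the two thresholds and the length of the cooldown are exactly the kind of thing that needs the thinking and planning mentioned above; the values here are placeholders.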
Conclusion
By buying B-hive, VMware did not just acquire yet another product that watches resource utilization on servers. B-hive moved the ball forward in terms of how to measure performance the right way (response time), with IT Operations as the target audience. This will be a highly valuable tool to VMware customers with virtualized servers, and will significantly enhance the value of the VMware platform relative to competing platforms from Microsoft and Citrix, neither of which has anything like this in its portfolio.