Posts Tagged vCommentary
In my last post I touched briefly on a claim I’m hearing a lot in IT circles these days. This claim is often heard in discussions surrounding multi-hypervisor environments and most recently in VDI discussions. The claim in question, at its core, says this: “If you have two procedures to perform the same task, you double your operational expense in performing that task”. Given the prevalence of this argument, I wanted to focus on it in one post even though I’ve touched on it elsewhere.
As mentioned in my last post, Shawn Bass recently displayed this logic in a debate at VMworld. The example given is a company with a mixture of physical and virtual desktops. In this scenario they manage their physical desktops with Altiris/SCCM and use image-based management techniques for their non-persistent virtual desktops. Since you are using two different procedures to accomplish the same task (update desktops), it is claimed that you then “double” your operational expense.
As I’ve said, in many scenarios this is clearly false. The only way having two procedures “doubles” your operational cost is if both procedures require an equal amount of time/effort/training/etc. to implement and maintain. And the odd thing about this example is that it actually proves the opposite of what it claims. It’s very common for organizations to have physical desktops that they manage differently than their non-persistent virtual desktops. Are these organizations just not privy to the nuances of operational expenditures? I don’t think so. In many cases these organizations chose VDI at least in part for easier desktop management. For many, it’s just easier and much faster to maintain a small group of “golden images” rather than hundreds or thousands of individual images. So in this example, adding the second procedure of image-based management can actually reduce the overall operational expense. A large portion of my desktops can now be managed much more efficiently than they were before, which reduces the overall time and energy I spend managing my total desktops and thus reduces my operational expense.
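To make the arithmetic concrete, here is a minimal sketch in Python. The desktop counts and hours-per-desktop figures are hypothetical, chosen only to illustrate the point; plug in numbers from your own environment.

```python
def total_opex_hours(groups):
    """Sum yearly admin hours over (desktop_count, hours_per_desktop) groups."""
    return sum(count * hours for count, hours in groups)


# Hypothetical figures, for illustration only.
PER_DESKTOP_HOURS = 2.0   # yearly hours to manage one individually managed desktop
GOLDEN_IMAGE_HOURS = 0.5  # yearly hours per desktop under image-based management

# One procedure: all 1,000 desktops managed individually.
one_procedure = total_opex_hours([(1000, PER_DESKTOP_HOURS)])

# Two procedures: 600 physical desktops managed individually,
# 400 non-persistent virtual desktops managed from golden images.
two_procedures = total_opex_hours([
    (600, PER_DESKTOP_HOURS),
    (400, GOLDEN_IMAGE_HOURS),
])

print(one_procedure, two_procedures)
```

With these assumed inputs the two-procedure environment costs fewer total hours than the single-procedure one. The cost only "doubles" if the second procedure is as expensive as the first and is applied on top of it rather than instead of it.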
We see this same logic in a lot of multi-hypervisor discussions as well. “Two hypervisors, two ways of managing things, double the operational expense”. When done wrong, a multi-hypervisor environment can fall into this trap. However, before treating this logic as universally true you have to evaluate your own IT staff and workload requirements. Some workloads will be managed/backed up/recovered in a disaster/etc. differently than the rest of your infrastructure anyway, so putting these workloads on a separate hypervisor isn’t going to add to that expense. The management of the second hypervisor itself doesn’t necessarily “double” your cost, as in many cases the knowledge your staff already possesses on how a hypervisor works in general can translate well into managing an alternate hypervisor. A lot more could be said here but in the end, CAPEX savings should override any nominal added OPEX or you’re doing it wrong.
In general, standardization and common management platforms are things every IT department should strive for. Like “best practice” recommendations from vendors, however, we don’t apply them universally. The main problem with this line of thinking is that it states a generalization as a universal truth and applies it to all situations while ignoring the subtle complexities of individual environments. In IT, it’s just not that easy.
After reading a bevy of excellent articles on multi-hypervisor datacenters, I thought I’d put pen to paper with my own thoughts on the subject. This article by Joe Onisick will serve as a primer to this discussion, not only because it was recently written, but because it does an excellent job of fairly laying out the arguments on both sides of the issue. The article mentions three justifications organizations often use for deploying multiple hypervisors in their datacenter. These are 1) cost, 2) leverage and 3) lock-in avoidance. I am in complete agreement that 2 and 3 are poor reasons to deploy multiple hypervisors. My disagreement on #1, however, is what I’d like to discuss in this post.
The discussion on the validity of multi-hypervisor environments has been going on for several years now. Steve Kaplan wrote an excellent article on this subject back in 2010 that mentions the ongoing debate at that time, and discussions on this subject pre-date even that post. The recent acquisition of DynamicOps by VMware has made this a popular topic again and a slew of articles have been written covering the subject. Most of these articles seem to agree on a few things. First, despite what’s best for them, multi-hypervisor environments are increasing across organizations and service providers. Second, cost is usually the deciding factor in deploying multiple hypervisors, but this is not a good reason because you’ll spend more money managing the environment and training your engineers than you saved on the cost of the alternative hypervisor. Third, deploying multiple hypervisors in this way doesn’t allow you to move to a truly “private cloud” infrastructure. You now have two hypervisors and need two DR plans, two different deployment methods and two different management models. Let’s take each of these arguments against cost in turn and see how they hold up.
OpEx outweighs CapEx
As alluded to above, there’s really no denying that an organization can save money buying alternative hypervisors that are cheaper than VMware ESXi. But do those cost savings outweigh potential increases in operational expenditures now that you’re managing two separate hypervisors? As the article by Onisick I linked to above suggests, this will vary from organization to organization. I’d like to suggest, however, that the increase in OpEx cited by many other sources as a reason to abandon multi-hypervisor deployments is often greatly exaggerated. Frequently cited is the increase in training costs: you have two hypervisors and now you have to send your people to two different training classes. I don’t necessarily see that as the case. If you’ve been trained and have a good grasp of the ESXi hypervisor, learning and administering the nuances and feature sets of another hypervisor is really not that difficult and formal training may not be necessary. Understanding the core mechanisms of what a hypervisor is and how it works will go a long way in allowing you to manage multiple hypervisors. And even if you did have to send your people to a one-time training class, is it really all that likely that the cost of the class will outweigh the ongoing hypervisor cost savings? If it does, then you probably aren’t saving enough money to justify multiple hypervisors in the first place. Doing a quick search, I’ve found week-long XenServer training available for $5,000. Evaluate your people, do the math and figure out the cost savings in your scenario. Just don’t rule out multi-hypervisor environments thinking training costs will necessarily be astronomical or even essential for all of your employees.
Similar to the OpEx discussion, another argument often presented against the cost saving benefits of multi-hypervisor environments is that they are harder to administer, as you have to come up with separate management strategies for VMs residing on the different hypervisors. Managing things in two separate ways, it is argued, moves away from the type of Private Cloud infrastructure most organizations should strive for. The main problem with this argument is that it assumes you would manage all of your VMs the same way even if they were on the same hypervisor. This is clearly false. A couple of clear examples of this are XenApp and VDI. The way you manage these types of environments, deploy VMs, or plan DR is often vastly different from how you would handle the rest of your server infrastructure. And so, if there is a significant cost savings, it is these types of environments that are often good targets for alternate hypervisors. They are good candidates not only because they are managed differently, regardless of hypervisor, but because they often don’t require many of the advanced features only ESXi provides.
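"Do the math" can be as simple as a break-even calculation. The sketch below is in Python; the $5,000 class price is the figure quoted above, while the number of people trained and the monthly savings are hypothetical placeholders.

```python
import math


def breakeven_months(training_cost_per_person, people_trained, monthly_savings):
    """Months of hypervisor savings needed to pay back a one-time training cost."""
    if monthly_savings <= 0:
        raise ValueError("no ongoing savings, so training is never paid back")
    total_training = training_cost_per_person * people_trained
    return math.ceil(total_training / monthly_savings)


# $5,000 per class is the price quoted in the post.
# Three trainees and $2,500/month in savings are assumptions for illustration.
months = breakeven_months(5000, people_trained=3, monthly_savings=2500)
print(months)
```

If the break-even point lands years out, the savings probably don't justify a second hypervisor. If it lands within a few months, training cost is not a serious objection.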
I’m in complete agreement that having test/dev and production on separate hypervisors is a bad idea. Testing things on a different platform than they run in production is never good. But if you can save significant amounts of money by moving some of these systems that are managed in ways unique to your environment onto an alternate hypervisor, I’m all for it. This may not be the best solution for every organization (or even most), but like all things, should be evaluated carefully before ruling it out or adopting it.
Citrix announced its acquisition of ShareFile back in October and has recently offered partners a free one-year trial with 20 “employee” accounts and 20GB of space. I’ve been kicking the tires on ShareFile for the past few weeks and wanted to share my thoughts.
What is it?
If you’re familiar with solutions like DropBox and SugarSync then you already have a pretty good idea of what ShareFile is – an online file sync and collaboration tool. Unlike these other solutions, however, ShareFile is designed to be used by businesses. ShareFile provides you with SSL encrypted storage, allows you to add users and assign permissions to particular folders, and lets you add additional administrators to help manage your data and users. You’ll get configurable email alerts on file uploads and downloads and can even control the amount of bandwidth allotted to particular users in a given month. ShareFile provides you with a customizable web portal (yourdomain.sharefile.com) that allows you to brand the website with your logos and corporate colors. This web portal can be used as an alternative to FTP and even gives you the ability to search the site for particular files. Other items of note:
-Hosting: ShareFile is hosted almost entirely out of Amazon AWS and its services are spread across all 5 major Amazon datacenters.
-Desktop Widget: Basically a fat-client built on Adobe AIR that allows you to upload and download files to ShareFile without having to launch a web browser.
-Outlook Plugin: Allows you to link to existing ShareFile documents and upload and send new files to ShareFile. Administrators can even set policies that dictate that files over a certain size are automatically uploaded to ShareFile instead of attached using the corporate email system. I’ve found this to be the most used ShareFile feature for me.
-Desktop Sync: This gives you the ability to select folders on your PC to be automatically synced to ShareFile. There is an “Enterprise Sync” as well that’s designed for server use and allows for sync jobs to be created under multiple user accounts.
-ShareFile Mobile: A mobile website designed to be accessed from a tablet or smartphone. In addition, there’s a ShareFile app for iOS, Android, Blackberry and Windows Phone.
ShareFile has more features that you can read about on their website.
What does this mean for the enterprise?
Citrix is incorporating ShareFile into what it’s calling the “Follow-Me-Data Fabric”, which is comprised of ShareFile, Follow-Me-Data and GoToMeeting with Workspaces. Citrix has long had the goal of allowing you to access your applications anywhere, from any device and they’re now attempting to extend this philosophy to your data as well.
In all honesty, it was initially hard for me to see this adding much value to the Citrix portfolio. After all, doesn’t XenApp, XenDesktop, Netscaler, et al. already give me the ability to access my applications and data wherever I’m at? My virtual desktop is accessible from almost any device already and all the data I work on is either saved on that desktop or accessible on corporate network shares from that desktop. As I began to think about the future of IT though, and the shift to public and hybrid clouds, the strategy here became much more obvious. While almost all the data I work on now is stored in one centralized location, the push to public and hybrid clouds will create a dispersion of corporate data across different cloud providers. Corporations may be utilizing CloudCompany-A, B and C for SaaS applications and CloudCompany-D for portions of their infrastructure. Even if you’ve only chosen one Cloud provider, most companies aren’t ready to dump all of their data and applications into the Cloud yet and may not ever. This will obviously create a de-centralization of data that could get messy if not managed properly, and that’s where ShareFile comes in.
Working in conjunction with StoreFront and Follow-Me-Data, ShareFile would give you the ability to centralize all the data stored in any private and public cloud infrastructures you’ve invested in. You’d have StoreFront on the front-end tying your internal and SaaS applications into one unified interface and Follow-Me-Data and ShareFile on the back-end allowing you to access dispersed data in a centralized fashion. That, at least, is the vision. The key here will be integration – something Citrix has historically not done very well (e.g. VDI-in-a-Box, management consoles, etc.). To the user, ShareFile needs to go almost unnoticed and be seamlessly integrated into the Citrix product stack so that it does not feel like a separate technology. Doing this will make it natural for users to store their public and private cloud data and access it from anywhere. If it’s seamlessly integrated into the products the user is already utilizing for their job then I think it will go a long way to securing corporate data. After all, why would I put my corporate data on DropBox or SugarSync when it’s so much easier to get this same functionality with tools that are already integrated with the work I do? And that, too, will be a key factor in how successful this will be – corporations can’t lock this down to such a degree that it’s not easy for users to work with, or else it will drive them to more “open” solutions.
In the end, I think this was a smart move whose success will ultimately depend on the ever increasing push towards the public Cloud and Citrix’s ability to integrate this seamlessly with their already existing products. It will also be interesting to see how DropBox and other similar companies respond to this. Whether they want to define themselves as competitors or not, the bottom line is that there are currently tons of corporate data on DropBox and SugarSync, and a well-integrated ShareFile means less data on these types of solutions. Whether they add more “business-friendly” features to their products or are content with “personal” data remains to be seen. And if they do add more features that allow companies more control of the data that is stored on them, how will Citrix respond? Citrix has generally been very receptive to having their services run on multiple platforms (e.g. XenDesktop on ESXi/Hyper-V), so they might look to just provide integration with these other online file shares from Citrix Receiver as well. And will this service always be hosted in the public Cloud or will there be an option in the future to host a ShareFile-like service for your company within your own datacenter?
There’s a lot that remains to be seen but overall, this appears to be a “win” for Citrix and a trend that other companies have already adopted as well. End-user computing was a huge component at VMworld and Synergy this past year and I anticipate and look forward to even more rapid development in this space!
Citrix Provisioning Server (PVS) has been a vital component in the Citrix technology stack for years. Allowing for the rapid provisioning of machines through OS streaming, it has been the bedrock provisioning mechanism for XenDesktop and is also used in provisioning XenApp servers and streaming to physical endpoints. Even though PVS provides all these benefits and has been so integral to various Citrix technologies, its days are clearly numbered. Fundamentally, streaming an OS over the network is inferior to provisioning machines and delivering the OS locally in some way. As an example, technologies like Machine Creation Services (MCS) can be used to provision an OS without the additional streaming component. And while the initial scalability numbers for MCS were lower than those for PVS, and MCS is currently limited to the XenDesktop technology stack, MCS is new, its scalability estimates are improving all the time, and there’s no reason to think it can’t or won’t be integrated with other Citrix products. Indeed, there has been talk for years of merging XenDesktop itself with other Citrix products. So, what other possible reasons will there be for holding onto PVS in the future?
- “PVS can use the caching capabilities inherent to the local OS, this reduces IOPS”
When a target device boots up or accesses portions of the base image, those portions of the OS are then cached in RAM on the PVS server. Subsequent attempts by additional target devices to access those portions of the OS will be read from RAM, thereby reducing the amount of IOPS required on the backend storage. Since IOPS are one of the biggest concerns for VDI deployments, this has been a major selling point for PVS. However, with the rise in popularity of VDI over the past couple of years, storage vendors have really focused on optimizing their arrays for IOPS, with many having terabytes of caching capabilities in them. So, if you now have enough RAM to cache at the storage level, is there really much benefit in being able to cache at the OS level? In addition to that, you have emerging technologies like Intellicache and whole distributed storage models being developed for VDI that should make IOPS less of a concern in the future.
- “MCS will never be able to deliver an OS to a physical endpoint”
This is true. You will never be able to use a locally delivered OS solution for remote endpoints. However, what is the purpose of streaming an OS to physical endpoints? Two use-cases come to mind. The first involves streaming the OS to desktop PCs outside the datacenter. Companies usually choose this option as a first step into the VDI world. It’s cheap because it uses already existing hardware and it gives you the single-image management and security benefits of VDI without purchasing thin-clients, hypervisors and backend storage arrays. But the important thing to point out here is that this is usually just a stepping stone towards much more robust VDI rollouts. Once their currently functioning PCs reach end of life, these companies start to replace them with thin-clients and are more willing to invest in hypervisors and backend storage rather than a hardware refresh, thus eliminating the need to stream the OS over the network. The use-case for this in the future will become extremely “niche” as companies move away from purchasing fat-clients as a standard. The second use-case involves streaming to blade PCs. This is usually done when high performance desktops are a “must”. Like the previous use-case we examined though, there is limited need for this today and as hypervisors continue to advance, there will soon be very little reason, if any, why a desktop cannot be run as a virtual machine and still deliver optimal performance.
Now don’t get me wrong, PVS today is still a great solution and should be the main provisioning mechanism for most XenDesktop deployments. For the reasons listed above, however, the next few years should see PVS use-cases diminishing rapidly. MCS or some future locally delivered OS solution will take its place.
The statement, “there is no technical benefit to memory overcommitment” is usually met with universal scorn, righteous indignation and a healthy dose of boisterous laughter. However, at second glance this statement is not quite so absurd. Overcommitting memory does not make your VMs faster, it doesn’t make them more highly available and it doesn’t make them more crash-resistant. So, what is memory overcommitment good for? The sole goal of memory overcommitment is to put more VMs per host. This saves you money. Thus, the benefit that memory overcommitment provides is a financial benefit.
Are there other ways to attain this financial benefit?
Memory overcommitment is one of the main reasons people choose the ESX hypervisor. If the goal of memory overcommitment is to save money and there are other ways to attain these cost savings on other hypervisors without utilizing memory overcommitment, does that change the hypervisor landscape at all? Before delving into that question, let’s first see if there is a way to save as much money without using memory overcommitment.
One way around this I’ve heard suggested is to just increase the memory in your hosts. So, if you had an ESX host with 10GB of memory and were 40% overcommitted, then you could use XenServer or Hyper-V with the same number of VMs but each host would have 14GB of memory. This to me does not seem fair, as you could also add 4GB more to your ESX host and achieve even more cost savings. However, you can only add so much memory before becoming CPU-bound, right? I’m not referring to CPU utilization but to the number of vCPUs you can overcommit before running into high CPU Ready times. Let’s use my earlier example. You have 14 VMs, each with 1 CPU and 1GB of memory, on a 4CPU/10GB ESX host. You want to put more VMs per host so you increase your host memory to 20GB. You now try putting 28 1CPU/1GB VMs on the host. This is now twice the number of vCPUs on the same number of pCPUs, and let’s say your CPU Ready times are around 5%-10%. Adding more VMs to this host, regardless of how many more memory slots you have available, would adversely impact performance, so you have a ceiling of around 28 VMs per host.
Knowing this number, couldn’t you then size your hosts for 4CPU and 30GB of RAM on XenServer or Hyper-V and then be saving just as much money as ESX? And this is only one way to recoup the financial benefits overcommitment provides you. If you already have Windows or Citrix products you might already own these hypervisors from a license perspective, and it might not save you money to go out and buy another hypervisor. Also, some hypervisors (like XenServer) are licensed by server count rather than by socket count (as ESX is), so you could potentially save a lot of money by using these hypervisors. In any of these cases, an in-depth analysis of your specific environment will have to be done to ensure you’re making the most cost-effective decision.
Of course, memory overcommitment is not the only reason you choose a hypervisor. There are many other factors that still have to be considered. But given this discussion, is memory overcommitment one of these considerations? I think once you realize what memory overcommitment is really good for, it becomes less of a factor in your overall decision making process. Does this realization change the hypervisor landscape at all? As I mentioned earlier, memory overcommitment is a major sell for ESX. If you can attain the financial benefit of memory overcommitment without overcommitting memory then I think this does take a bite out of the VMware marketing machine. That said, ESX is still the #1 hypervisor out there in my opinion and I would recommend it for the vast majority of workloads, but not necessarily because of memory overcommitment. There is room for other hypervisors in the datacenter, and once people realize what memory overcommitment “is” and what it “isn’t” and really start analyzing the financial impact of their hypervisor choices, I think you’ll see some of these other hypervisors grabbing more market share.
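The sizing numbers in this example can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are mine, and the 2GB of headroom is an assumption I've chosen so the result matches the 30GB figure above.

```python
def overcommit_ratio(vm_count, vm_mem_gb, host_mem_gb):
    """Fraction by which allocated VM memory exceeds physical host memory."""
    return (vm_count * vm_mem_gb) / host_mem_gb - 1.0


def vcpu_to_pcpu(vm_count, vcpus_per_vm, pcpus):
    """vCPU:pCPU consolidation ratio for a host."""
    return (vm_count * vcpus_per_vm) / pcpus


def host_mem_without_overcommit(vm_count, vm_mem_gb, headroom_gb=2):
    """Host RAM needed to back every VM fully, plus assumed headroom."""
    return vm_count * vm_mem_gb + headroom_gb


# 14 VMs (1 vCPU / 1GB each) on a 4 CPU / 10GB ESX host: 40% overcommitted.
print(round(overcommit_ratio(14, 1, 10), 2), vcpu_to_pcpu(14, 1, 4))

# 28 VMs on a 4 CPU / 20GB host: still 40% overcommitted, but 7 vCPUs per pCPU.
print(round(overcommit_ratio(28, 1, 20), 2), vcpu_to_pcpu(28, 1, 4))

# The same 28 VMs with no memory overcommitment at all.
print(host_mem_without_overcommit(28, 1))
```

The point of the exercise is that the CPU ceiling, not memory, caps the VM count. Once you know that ceiling, the memory needed to hit it without overcommitment is a fixed and fairly small number.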
Thoughts? Rebuttals? Counterpoints?