Dirty little secrets of virtualization

The virtualized data center has accelerated the pace of operational change.

This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.

The virtualized data center has accelerated the pace of operational change. Virtual machines are reconfigured, computing loads are moved, and applications are scaled up and down rapidly. Rapid change brings more mistakes; analysts estimate that 60 per cent to 80 per cent of data center problems are caused by management mistakes. How can you ensure the stability of your data center while still taking full advantage of the flexibility of virtualization?

Virtualization promises to improve data center operations and indeed it does. Server consolidation has great benefits. The ability to migrate loads without stopping them greatly eases hardware management. The ability to deploy new virtual machines in a fraction of the time of a physical machine makes application development and deployment more rapid and effective.

However, the advantages of virtualization bring some associated costs. The hypervisor adds another layer of complexity to the software stack and imposes requirements on the servers, the storage system, and especially the network. While the hypervisor automates some server operations, the environment around the virtual cluster is affected by virtualization without becoming any simpler to manage.

In a recent survey of Infoblox customers, 70 per cent reported that virtualization put more pressure on their network operations. It's easy to see the source of this pressure. Every virtualization initiative is surrounded by physical resources:

• Storage systems

• Users, workstations and partner networks

• Load balancers and security devices

• Remote peer servers

• Physical unvirtualized servers

• Competing hypervisors that are not compatible with one another

• Private clouds, laboratory systems and other specialized clusters

The boundary between each of these elements and the virtualized environment is a place where operational mistakes can be made. Both sides of the boundary matter; the hypervisor's configuration may be incorrect, or the external environment may be misconfigured. When a performance problem arises, information from both sides of the boundary must be integrated to find the solution.

When new applications are deployed, both sides must be validated in advance. Mistakes and inconsistencies will show up in three different ways: in application performance issues; in delays in operational procedures; or in inefficient operations that eat up staff time. Each data center will have its own pattern; here are some examples:

Application performance becomes poor or inconsistent

• Port and network access parameters can be mismatched. Many parameters affect performance, including port duplex, network QoS settings, firewall access lists and more.

• Rogue devices may be attached to the network, with incorrect IP addresses or incorrect protocol settings that disturb production devices.

• Configurations can "drift" from best practice whenever manual procedures are followed incorrectly or standards are incomplete. The result can be old and new devices with very different settings, producing erratic performance.
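The rogue-device case above can be illustrated with a short check that compares what is seen on the network against what the address records say should be there. This is only a sketch: the record layout, the MAC and IP values, and the `find_rogues` helper are all assumptions made up for illustration, not part of any particular product.

```python
import ipaddress

# Hypothetical address records: MAC address -> assigned IP address.
ipam_records = {
    "00:1a:2b:3c:4d:01": "10.0.1.10",
    "00:1a:2b:3c:4d:02": "10.0.1.11",
}

# Hypothetical discovery results: (MAC, IP) pairs seen on the network.
discovered = [
    ("00:1a:2b:3c:4d:01", "10.0.1.10"),  # matches its record
    ("00:1a:2b:3c:4d:02", "10.0.1.99"),  # known device, wrong address
    ("de:ad:be:ef:00:01", "10.0.1.11"),  # unknown device on an assigned address
]


def find_rogues(records, seen):
    """Return (mac, ip, reason) for every device that disagrees with the records."""
    findings = []
    # Reverse index so an address conflict can name the rightful owner.
    assigned = {ipaddress.ip_address(ip): mac for mac, ip in records.items()}
    for mac, ip in seen:
        addr = ipaddress.ip_address(ip)
        if mac not in records:
            reason = "unknown device"
            if addr in assigned:
                reason += ", address conflicts with " + assigned[addr]
            findings.append((mac, ip, reason))
        elif ipaddress.ip_address(records[mac]) != addr:
            findings.append((mac, ip, "expected " + records[mac]))
    return findings


rogues = find_rogues(ipam_records, discovered)
for mac, ip, reason in rogues:
    print(mac, ip, reason)
```

In practice the discovered list would come from whatever network discovery is already in place; the point is only that the comparison itself is mechanical and easy to run continuously.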

Requests for changes take too long

• When a virtual server is migrated for updates or maintenance, its destination must have the right network settings. Manual setup adds delay, especially when compared to the near-instant speed of a virtual hot-migration.

• When a disaster recovery site is created, tested or updated, its network settings must be verified to match the master site. Manual verification adds delay.

• When new servers are added to scale up a load-balanced system, several devices may need carefully sequenced updates, including the physical switch, firewall and load balancer. Manual configuration adds delay, which is often much longer than the time it takes to spin up a new virtual server.
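The "carefully sequenced updates" in the last item can be sketched as an ordered list of steps, each paired with an undo action, so that a failure part-way through does not leave the switch and firewall changed while the load balancer is not. The step names, log messages and the `apply_in_order` helper below are hypothetical; real steps would call each device's own management interface.

```python
def apply_in_order(steps):
    """Run (name, apply, undo) steps in order; undo completed steps on failure."""
    done = []
    for name, apply, undo in steps:
        try:
            apply()
        except Exception as exc:
            # Roll back in reverse order so dependencies unwind cleanly.
            for _, prev_undo in reversed(done):
                prev_undo()
            return False, f"{name} failed: {exc}"
        done.append((name, undo))
    return True, "all steps applied"


log = []


def fail_load_balancer():
    # Simulated failure on the last device in the sequence.
    raise RuntimeError("pool update rejected")


steps = [
    ("switch",
     lambda: log.append("switch: port added to VLAN"),
     lambda: log.append("switch: port removed from VLAN")),
    ("firewall",
     lambda: log.append("firewall: rule added"),
     lambda: log.append("firewall: rule removed")),
    ("load balancer", fail_load_balancer, lambda: None),
]

ok, message = apply_in_order(steps)
print(ok, message)
print(log)
```

Here the simulated load balancer failure causes the firewall rule and then the switch change to be undone, which is the behavior a manual runbook has to reproduce by hand.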

Staff wastes time in routine operations

• Daily tasks like IP address assignment must be coordinated. Mistakes can be hard to track down in a constantly changing environment.

• Troubleshooting problems often involves correlating logs and alerts from multiple sources. With virtualized systems, there's often a gap between the physical and virtual systems where data must be matched by hand.

• If an unauthorized person performs a move or change, time can be wasted rechecking the work (or even worse, fixing mistakes).

• Auditing and compliance reporting are a regular headache, and virtual systems can add complexity.
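The gap between physical and virtual data in the troubleshooting item is often bridged by a shared key such as a MAC address. The sketch below assumes a hypervisor inventory, a switch MAC table and an alert tuple, all with invented names and values, and shows the lookup that would otherwise be done by hand.

```python
# Hypothetical hypervisor inventory: VM name -> virtual NIC MAC address.
vm_inventory = {
    "web-01": "00:50:56:aa:00:01",
    "web-02": "00:50:56:aa:00:02",
    "db-01": "00:50:56:aa:00:03",
}

# Hypothetical switch MAC table: MAC address -> (switch, port) where it was learned.
switch_mac_table = {
    "00:50:56:aa:00:01": ("sw-a", "Gi1/0/4"),
    "00:50:56:aa:00:02": ("sw-a", "Gi1/0/4"),
    "00:50:56:aa:00:03": ("sw-b", "Gi1/0/9"),
}

# Hypothetical physical-side alert: (switch, port, message).
alert = ("sw-a", "Gi1/0/4", "duplex mismatch")


def vms_behind_alert(alert, inventory, mac_table):
    """Return the names of VMs whose traffic crosses the alerting switch port."""
    switch, port, _message = alert
    return sorted(
        vm for vm, mac in inventory.items()
        if mac_table.get(mac) == (switch, port)
    )


affected = vms_behind_alert(alert, vm_inventory, switch_mac_table)
print(alert[2], "affects", affected)
```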

In a virtualized data center, the changes are more complex, and they occur more often thanks to the flexibility of virtual machines. Mistakes become more costly, and they may occur more frequently.

But there is a way to master the complexity and minimize the mistakes, and it doesn't require a complete infrastructure overhaul. The answer is augmenting existing infrastructure with automation.

If a configuration management platform can be embedded in the data center network, and if it can perform automated procedures, all of the issues above can be addressed.

An automated platform can be filled with "gold standard" configurations for all elements on the virtual system boundary. Deviations from those standards, whether from rogue devices or drifting configurations, can be prevented, isolated or repaired. The gold configurations can be applied in a single step, resulting in quick and consistent response to requested changes.
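As a rough illustration of the gold-standard idea, the following sketch compares a device's settings to a reference configuration, reports the deviations, and produces a repaired copy in one step. The setting names, their values and the `drift` and `remediate` helpers are assumptions chosen for the example.

```python
# Hypothetical gold-standard settings for a server-facing switch port.
GOLD = {"duplex": "full", "speed": 10000, "qos": "trust-dscp", "vlan": 120}


def drift(actual, gold=GOLD):
    """Return {setting: (actual_value, gold_value)} for every deviation."""
    return {
        key: (actual.get(key), wanted)
        for key, wanted in gold.items()
        if actual.get(key) != wanted
    }


def remediate(actual, gold=GOLD):
    """Return a copy of the settings with every gold value applied."""
    fixed = dict(actual)
    fixed.update(gold)
    return fixed


# A port that was configured by hand some time ago (invented values).
old_port = {"duplex": "half", "speed": 10000, "vlan": 120, "description": "rack 12"}

deviations = drift(old_port)
print(deviations)

repaired = remediate(old_port)
print(drift(repaired))
```

Settings outside the standard, such as the description, are left untouched; only the values the standard covers are checked and rewritten.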

Troubleshooting can be accelerated when data from physical systems is correlated with data from virtual ones. Authorization and delegation rules can block unapproved changes and audit approved ones.
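The authorization and audit rules can be pictured as a single gate that every change request passes through. In this sketch the user names, action names and the `request_change` function are hypothetical; the design point is that denied and approved requests are both recorded.

```python
from datetime import datetime, timezone

# Hypothetical delegation rules: user -> set of actions that user may perform.
PERMISSIONS = {
    "alice": {"migrate_vm", "update_firewall"},
    "bob": {"migrate_vm"},
}

audit_log = []


def request_change(user, action, target):
    """Record the request, then return True only if the user is authorized."""
    allowed = action in PERMISSIONS.get(user, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "target": target,
        "allowed": allowed,
    })
    return allowed


first = request_change("alice", "update_firewall", "fw-edge-1")
second = request_change("bob", "update_firewall", "fw-edge-1")
third = request_change("mallory", "migrate_vm", "web-01")

for entry in audit_log:
    print(entry["user"], entry["action"], entry["target"], entry["allowed"])
```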

Automation is needed in the network around the hypervisor to realize the full benefits of virtual systems. A network-resident, data-center-wide platform for management and automation can minimize errors, promote flexibility and cut the hidden costs of virtualization.

Infoblox is a developer of network infrastructure automation and control solutions. Infoblox's technologies, including the Infoblox Grid (a real-time data distribution technology), increase network availability and control while automating time-consuming manual tasks associated with network infrastructure services such as domain name resolution (DNS), IP address management (IPAM), network change and configuration management (NCCM) and network discovery.