How to keep your Linux-heavy data center up and running
- 30 January, 2014 16:20
This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.
If you've built something yourself rather than buying it, like a bookshelf or a birdhouse, you know the satisfaction of shaping something to your needs. And as long as nothing goes wrong, you're in good shape. But if it breaks, you can't return it to the store for an exchange; you have to fix it yourself. And while repairing a bookshelf is one thing, recovering applications in a data center when they fail is something else entirely.
Linux is an excellent tool for creating the IT environment you want. Its flexibility and open-source architecture mean you can use it to support nearly any need, running mission-critical systems effectively while keeping costs low. This flexibility, however, means that if something does go wrong, it's up to you to ensure your business operations can continue without disruption. And while many disaster recovery solutions focus on recovering data in case of an outage, leaving it at that is leaving the job half done. The data itself is useless if the applications that depend on it don't function and you are unable to meet service-level agreements (SLAs).
Businesses that value the independence Linux provides can benefit from partnering with a technology provider that can keep their business running in the event of disaster. And as we have seen all too frequently in the last several years, disasters happen to organizations of all sizes, from natural disasters to large-scale hacks that take down servers company-wide. It seems that every week we hear in the news about another large company that is experiencing a significant service failure.
As you consider what to look for in solutions to keep your Linux-heavy data center up and running, we recommend focusing on the following criteria:
* Speed of failure detection and recovery: Every minute counts when it comes to business downtime. The first step to effective recovery is rapid detection of failure; even the best recovery solution will be insufficient if the detection process itself takes minutes rather than seconds. The ideal tool should provide fast detection with minimal resource usage, to meet recovery time objectives.
* Failover that covers the entire range of business services: Business-critical applications may require the preservation of several different layers of the information stack, to perform complementary processes such as the Web functionality, the application itself and the databases feeding it information. High availability can be a challenge when complex recovery is needed. Be sure your backup and recovery solutions can handle the interconnected processes necessary to maintain business operations.
* Advanced failover: Keeping a standby server ready for every server you use can be costly. Look for more advanced failover capabilities that allow you to maintain redundancy with one server that can take over for any that fail.
* Failover testing capability: You shouldn't wait until you have a disaster to learn how well your recovery solutions work. Look for tools that include the ability to test failover, to assess performance without affecting normal operations.
* Automation: Another requirement for fast recovery is the automation of the process. Ideally, detection and failover should occur automatically, and a failover that is initiated manually should still run to completion without further intervention, to maintain continuity regardless of where or when failure occurs. This also frees up IT staff to address other needs in the event of a large-scale problem and ensures you don't have to rely on their presence in the event of a disaster. Recovery should be fast and complete even if your recovery site is thousands of miles away, regardless of the different operating systems or virtualization platforms that must be supported.
* Replication across technology from different vendors: Many organizations use storage from multiple vendors, which can increase the challenge of interruption-free failover. Choose technology that will work between arrays from each storage vendor you employ. This also keeps costs lower at the disaster recovery site.
* Keep it simple: It's helpful to minimize the number of providers and solutions you are working with. Look for the tools with the most comprehensive, wide-ranging capabilities. Working with a single vendor wherever possible will enhance interoperability and keep costs lower.
* Exploit virtualization: Virtualization technologies allow users to maintain a minimal footprint at the recovery site during normal (non-disaster) processing, and to activate just the capacity needed in order to effect recovery of operations.
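To make the detection and advanced-failover criteria concrete, here is a minimal Python sketch of heartbeat-based failure detection with a single shared standby. It is an illustration under stated assumptions, not any vendor's implementation: the class name `FailoverMonitor`, the server names, and the five-second timeout are all hypothetical. Timestamps are passed in explicitly so the logic can be exercised in a dry run without touching live systems; a real monitor would more likely read a clock such as `time.monotonic()`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FailoverMonitor:
    """Tracks heartbeats from primaries and assigns one shared standby (hypothetical design)."""

    timeout_s: float                   # silence longer than this counts as a failure
    standby: str = "standby-1"         # hypothetical name of the single shared standby
    last_seen: Dict[str, float] = field(default_factory=dict)
    standby_for: Optional[str] = None  # primary the standby currently covers, if any

    def heartbeat(self, server: str, now: float) -> None:
        """Record that `server` reported healthy at time `now` (in seconds)."""
        self.last_seen[server] = now

    def failed_servers(self, now: float) -> List[str]:
        """Return, in name order, servers whose last heartbeat is older than the timeout."""
        return sorted(s for s, t in self.last_seen.items() if now - t > self.timeout_s)

    def check(self, now: float) -> Optional[str]:
        """Fail over the first failed server if the standby is free.

        Returns the name of the server that was failed over, or None.
        """
        if self.standby_for is not None:
            return None  # the one standby is already in use
        failed = self.failed_servers(now)
        if not failed:
            return None
        self.standby_for = failed[0]
        return self.standby_for


if __name__ == "__main__":
    monitor = FailoverMonitor(timeout_s=5.0)
    monitor.heartbeat("web-1", now=0.0)
    monitor.heartbeat("db-1", now=0.0)
    monitor.heartbeat("web-1", now=4.0)   # web-1 keeps reporting; db-1 goes silent
    print(monitor.check(now=6.0))         # db-1 has been silent longer than the timeout
```

The timeout is the knob that trades detection speed against false alarms: a shorter value detects failures in seconds but needs more frequent heartbeats. Because one standby covers every primary, a second simultaneous failure is not handled in this sketch, which is the cost side of maintaining redundancy with a single server.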
Visibility can also be a challenge in a Linux environment. Choosing the right technology can not only keep applications and information available in the event of a data center outage, it can also make the day-to-day operations of IT more efficient. Look for solutions that not only meet current needs, but allow for future changes.
Businesses that use a Linux-based infrastructure are looking to maximize their IT investments while retaining control of resources. To ensure consistent availability in such a highly customized environment, these organizations will be well served to draw on outside expertise to keep applications and information constantly accessible. Look for solutions that can provide fast, automated failover of all business-critical operations in case of an outage. With the right preparation, you can enjoy flexibility without worrying about how you will continue to do business in the event of a disaster.