The Aussie attitude 'she'll be right mate' is fast becoming the enemy of disaster recovery. The best bet to boost uptake may be to go on a rampage, snipping major power feeds and clinching the sale while customers are still feeling the pain, suggests Kevin Cosgriff.
It's Monday morning and a stream of reluctant IDGers is trickling into the office mourning the loss of yet another joy-filled weekend. They are met in the foyer by the accountant -- barefoot, tieless, with his suit pants rolled up to the knee. The hot-water system has burst, spewing water at the rate of an open garden tap for the last 16 hours. Half the lower level of IDG resembles a rice paddy, effectively displacing 21 workers, including the main reception. The IT manager charges past, already well into damage control.
The dependence of today's work processes on IT and telecommunications sees enormous pressure put on IT staff in the event of even minor catastrophes. Few modern-day businesses can tolerate being unproductive for longer than 24 hours. In fact, following the events of September 11, Gartner estimates that 43 per cent of businesses bearing the brunt of such a disaster will not re-open and 51 per cent will close their doors within two years.
Frightening figures such as these have jolted businesses to attention, but their success rate in digesting recovery strategies and processes is not fabulous; among those that have made an attempt, many have failed to look beyond the data centre.
Who is responsible for DR?
According to EMC's business continuity manager, Michael Cunningham, data and technology and perhaps facilities management sit squarely with the IT department. Any other considerations -- such as procedures for public relations and damage control; communications with customers, suppliers, business partners and investors; friends and family of employees; and vital services such as public utilities, fire and police services -- cannot possibly be managed by the IT department.
"Business continuity planning, as with any project, must begin with the fundamental question: what does the business need? The answer to this question should lie with business managers and not with IT," says Cunningham. "At the very least, it should be a collaborative process with IT becoming the business continuity facilitator."
However, collaboration between IT and business management is not the norm. While nearly half of corporate managers say DR is a collaborative venture, only a third of IT executives concur, according to a recent survey by Information Week.
Cunningham admits such a dichotomy is not surprising. "Personal experience has told me that business continuity continues to be the sole domain of the CIO," he says. "Most senior corporate managers struggle to articulate the hourly cost of downtime to their business."
Identify critical elements
As the DR market matures, Simon Hackett, managing director of Internode, says it is slowly dawning on businesses that it is not about total risk mitigation. "Any disaster is bound to have some effect on business. DR solutions are about minimising the impact of a disaster and being prepared. That allows peace of mind; if something does foul up, they can cope and they can survive," Hackett says.
Paul Marriott, business development manager, Oracle 9I, says the next step is to identify which applications require top priority. "Implementing DR technology doesn't guarantee that everything will work in the event of a primary failure," he says. "Some companies have 70 steps within their recovery process."
Generally speaking, the more immediate the requirement for application and data availability, the greater the cost of recovery. Hence organisations need to evaluate which data deserves immediate recovery and which data can tolerate a delay of a day or two. This distinction has contributed to the development of mixed-medium environments, whereby tape (a cheap, space-conservative and reliable technology with a slower transfer rate) is used to store non-critical data, while disk (about four times more expensive with approximately 10 times the transfer speed, but prone to 'head crashing') is used for mission-critical data.
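As a rough illustration of that trade-off, the sketch below sorts data sets onto disk or tape according to how long the business can wait to get them back. The data set names, the 24-hour cut-off and the function name are assumptions made up for the example, not figures from any vendor quoted here.

```python
# Hypothetical example: the names and the 24-hour cut-off are illustrative assumptions.
DISK_CUTOFF_HOURS = 24  # data needed back sooner than this goes on disk


def choose_medium(tolerable_delay_hours):
    """Return 'disk' for data needed quickly, 'tape' for data that can wait a day or more."""
    return "disk" if tolerable_delay_hours < DISK_CUTOFF_HOURS else "tape"


# Tolerable recovery delay, in hours, for some made-up data sets.
datasets = {
    "customer orders": 1,
    "e-mail": 4,
    "payroll archive": 48,
    "old project files": 72,
}

plan = {name: choose_medium(hours) for name, hours in datasets.items()}
for name, medium in plan.items():
    print(f"{name}: {medium}")
```

In practice the cut-off would come out of the business's own assessment of what each hour of delay costs, not from a fixed number.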
"Depending on the size of the company, DR is an expensive to extremely expensive undertaking," says BMC Software's David Tighe. "What's more, having implemented a DR policy and site you hope you will never have to use it. For companies to undertake such an exercise, they need to weigh up the costs of a failure and the likelihood of such a failure against the cost of implementation. It becomes a cost/benefit issue. In times when companies are looking to cut costs, not running a DR test may save real dollars."
Money is the number-one reason for the failure of DR plans, agrees Ian Cameron, senior consultant, business continuity and recovery services, IBM Global Services. "Organisations are driven to show an ROI (return on investment) on basically every activity. Given that business continuity planning is planning for the unexpected, it is very difficult to produce an ROI unless the company actually suffers some form of disaster," he says.
DR is a lucrative business for those that have the skills; however, few vendors feel the channel is qualified to tackle a rollout unaided. Service providers should check a supplier's attitude towards cooperative implementations before committing loyalty and resources to its brand.
Who is best suited to tackle DR?
IBM and Compaq seem to favour the direct approach. "In some cases, IBM Global Services does work with business partners where the relationship is already established between the business partner and the customer," says Cameron.
In Hewlett-Packard's case, Annemarie Riga, marketing manager of HP's business software group, says service providers need to have the manpower but the skill set is usually provided by the vendor. "Most channel partners have not thought much about what DR truly encompasses and probably don't have the expertise and resources to then execute a strategy," she says.
"The sale of business continuity, disaster recovery planning services and associated technologies tends to be a longer sales cycle than most of the channel would like," agrees EMC's Cunningham. "Typically, solutions of this type cannot be made to fit the kit bag' of solutions that most VARs and channel players tout. Instead, the hard questions of whom, what and when will be provided by product-independent consultants."
Still, this doesn't rule the channel out. Many of these consultants are channel entities, such as Linus Risk Management Solutions, which works in partnership with StorageTek.
What it does mean, says John Wellar, director of advanced storage systems at Lan 1, is that providers determined to tackle DR should be prepared to team with other companies to compensate for shortcomings. "If you are just talking about the IT side of DR, then a good systems/business analyst with the aid of appropriate specialists in various technical areas can put the plan together. However, there should be a risk analysis, which is usually beyond the scope of IT technical staff," he says.
What happened to the soft sell?
While this all sounds very intense, one should remember that DR covers every level of computing -- from the CD rewriter in the desktop to the remote data centre providing a full hot-site with rentable desks, chairs and telephones. People's reactive nature might make DR a tough sell up-front, but the biggest problem, according to Wellar, is that resellers fail to suggest a suitable preventive measure while the customer is in the store getting the current 'disaster' cured.
"It's the old fries with the hamburger trick," he says.
A small businessperson will never feel more comfortable parting with $650 for a CD burner and $40 for a copy of Easy CD Creator (a user-friendly program that does a complete backup of the hard drive) than when they are standing at the service counter thinking their entire customer list has disintegrated. It makes no difference to this customer that what disabled their computer was simply the video card popping out. From the reseller's point of view, that simple service job has just turned into a sale of nearly $700, plus an additional service fee for installing the CD burner.
Why DR plans fail
1. Lack of commitment by senior management.
2. Death by cost-cutting -- IT execs are often asked by directors to recut a business recovery plan according to a designated budget.
3. Poor planning -- not enough preliminary work prior to developing the DR plan.
4. No scheduled testing. No scheduled testing. No scheduled testing.
5. No maintenance of the plan -- many companies have a fine DR plan which sits on the corporate shelf awaiting disaster. As time goes by, the company changes and the plan loses relevance. Technical changes in the environment are not being taken into account.
6. Inseparability of IT infrastructure from the business location -- notebooks and the ability to dial in mean employees can still work if the primary site is out of action.
Yes, we are suckers for punishment
Out of 65 things that have changed since September 11, as reported by HQ magazine (April issue), not one has anything to do with backing up our IT systems. Granted, HQ is a tenuous yardstick to measure business concerns, but it does speak volumes about our personal priorities, which are inevitably carried into the work environment.
"Australians have the highest number of uninsured motor vehicles per capita than any other nation and stories of uninsured homes damaged in the recent NSW bushfires were astounding," says EMC's Michael Cunningham.
Out of more than 11 disaster recovery specialists interviewed for this feature, only one, Oracle's Paul Marriott, felt businesses were being proactive in formulating DR plans.
September 11: Lessons learned
1. Distance is key. The physical scope of a disaster can go far beyond your local facility, cutting off support people from the site and breaking site-to-site communications. Many people were unable to travel from their disaster recovery vaults to the recovery site because many streets, bridges, tunnels and all airports were closed.
2. Tape as a medium of recovery is not effective. It became abundantly clear that relying on tape as a means of backup and recovery leaves you vulnerable. Recovery time can be too slow for effective resumption of business processes. Even when files could be accessed and restored from tape, many were found to have degraded or to be unreliable. Restore time often stretched to five days -- and the process typically had to be done more than once.
3. All applications are critical. E-mail has become one of the most critical communications vehicles for corporate knowledge. When the communications lines are severed, so is the stream of business. On September 11, many businesses found that proposals in process, agreements for trades, and the ability to document transactions and agreements were contained only in their e-mail systems. But more than e-mail is at stake: today, most operations and applications are interdependent. If content or other information assets are lost in underlying or tertiary applications, that loss often affects higher-order applications such as CRM or ERP.
4. Inconsistent backup is no backup at all. Before September 11, backing up data was a necessary task not always executed with precision or regularity. Today, it has become imperative. Different backup schedules and strategies for different applications mean that information necessary for broad-based business processes cannot be matched up or reassembled. Also, inconsistent backup of applications significantly increases recovery time.
5. People-dependent processes do not suffice. What was your first thought when the news broke? Chances are it had nothing to do with protecting your company's information. In a crisis of this magnitude people first think of their families and other personal responsibilities. IT systems that performed best were those that could automate the task of recovery and limit the need for human intervention and manual activities such as tape transport and loading. In addition, fatigued, worried employees become error-prone, leading to mistakes and actually extending the recovery process.
6. Two sites are not enough. Many companies learnt a harsh lesson: even with a second site, they were left completely exposed following the disaster -- with business processes now dependent upon a single facility. With service providers overwhelmed, these companies were faced with the prospect of functioning well below their set policy levels for protection and business continuance for an extended period of time. Clearly, information and people need to be dispersed in new ways.
7. Managed services providers can be overwhelmed. Companies that relied on tape or a third-party provider found in many cases they had difficulty meeting their recovery time objectives. The reason? Disaster recovery providers plan for only a percentage of their customers to require services simultaneously. Therefore, a sudden, unexpected and massive demand on their capabilities was created by this large-scale, geographically focused event as clients tried to gain access to their finite resources at the same time.
8. People are irreplaceable; so is your information. Facilities can be rented. Mobile phones can step in for land lines. But for every business, the ability to conduct business depends on the availability of key personnel and the critical information and systems they need to function. Once people were protected, information was the one asset businesses found they could not replace fast enough -- and without it, the most diligent employees were hindered in their ability to re-establish business operations.
9. All disasters are possible. The reality of September 11 and the ensuing events have heightened the urgency to have disaster recovery plans in place to ensure business continuity. IT executives are now faced with an increased burden of responsibility for balancing the need for protection with corporate fiscal and resource realities.