The importance of data to an organisation wasn't suddenly realised in the last 12 months, despite what the vendors would have us believe. Businesses with critical operations have always been conscious of it, and those at the upper echelons have long been aware of the need to protect their information and willing to spend millions of dollars doing so.
With the proliferation of the Internet as a supply tool, however, new risks have emerged concerning the loss of data while in transit. Most in the channel agree that it is the second level of companies - with annual revenues of between $15 million and $20 million - that are heavily impacted by this digital invasion. "Banks have whole departments that investigate risk aversion, but if you come down a few rungs you will find something of a void," says John Wellar, Lan 1's director of advanced systems.
At a recent HP user conference, Will Beremen, Special Broadcasting Service's IT manager, outlined the predicament most modern-day businesses find themselves in. "SBS uses everything and anything in its environment. Our IT guys are electronic engineers, so interoperability, not just between computing platforms but between analog and digital systems, is a top priority."
Because general-purpose computing platforms are the newest of the bunch - the unproven technology by comparison - they are a common point of failure. Joan Tunstall of StorageTek says 40 per cent of all application downtime is caused by hardware error. A further 12 per cent is rooted in application failure.
The digital environment also requires a different user type and recovery mechanism than analog or manual systems. Human error as a cause of downtime is right up there with hardware failure - a massive 40 per cent - and often the only one privy to the unique quirks of the company's IT set-up is the administrator, which means that not just anyone can take the helm in an emergency.
While the risks have increased, so too have the stakes. Consumers have less tolerance for downtime than ever, and revenue raising is tied ever more closely to uninterrupted performance. "Mary Kostakidas can't sit there on the 6:30 evening news and say, 'hang on folks while we just reboot the servers'," Beremen says. "When we stop broadcasting for five minutes we lose viewers and credibility. With three quarters of our funding coming from taxpayers, that situation is not ideal."
There are also peripheral concerns burdening the customer. Doug Oates, general manager of managed service provider Pihana Pacific, says companies today need to protect against financial damage resulting from service providers that go bust. Integrators and MSPs should encourage clients to be carrier-neutral, giving them access to a number of providers so that if one goes down, connections can be routed to another and the client remains online, Oates says.
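Oates' carrier-neutral advice boils down to keeping an ordered list of upstream providers and failing over to the first one that responds. A minimal sketch of that logic, assuming a simple TCP health check stands in for whatever reachability test a real multi-carrier setup would use (the function name and parameters are illustrative, not anything Pihana Pacific describes):

```python
import socket

def first_reachable(providers, port=80, timeout=2.0):
    """Return the first provider endpoint that accepts a TCP connection.

    'providers' is an ordered list of hostnames, preferred carrier first.
    A TCP connect is used here as a crude stand-in for a real
    reachability check.
    """
    for host in providers:
        try:
            # create_connection raises OSError if the host is down or
            # unreachable within the timeout.
            with socket.create_connection((host, port), timeout=timeout):
                return host
        except OSError:
            continue
    return None  # every carrier is down

```

In practice this decision is usually made at the routing layer (multi-homing across carriers) rather than in application code; the sketch only illustrates the fail-over behaviour Oates describes.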
Most in the channel will tell you that organisations like SBS are in the minority for even being aware of their vulnerability. Of the businesses that took the "Prove It" disaster-readiness consortium's best-practice test, 75 per cent were in the danger zone, 21 per cent needed improvement and only 4 per cent met best practices.
Customers in denial
So what is the typical reaction from customers when their IT suppliers raise the topic of disaster recovery? "Denial," says a spokesperson for Canberra-based integrator 90East. "Their first reaction is an introspective look; they are not aware of their exposure."
90East says the desire to look the other way is a hangover from the days when disaster recovery (DR) carried a significant premium. Now you can isolate yourself from a hard drive failure with a replacement device costing a couple of hundred dollars, but this awareness has not dawned. "It takes one of two things for customers to see the light," says 90East. "A compelling event or an anecdotal example that hits close to their hearts."
The perception of cost is not altogether false, though. DR implementations typically generate a tidy buck for integrators in hardware, software and services. "The revenue split [between hardware, software and services] is relatively even at the point of sale and implementation, with consultancy and ongoing management being the significant long-term cost to the customer," says Andrew Manners of HP.
Dunson, a specialist in thin-client networks, is experiencing great success testing firms' recovery capabilities. According to technical director Phil Lancaster, it requires a large services component and, because existing plans are usually found wanting, it typically leads to the implementation of additional equipment.
"Dunson has a client who uses Hyperion to amalgamate financial information from business units throughout the world. At specific times in the month, this function is highly critical in nature. We were asked to prove and document the availability and recovery of MetaFrame and database servers during both outages and 'meltdowns'. This included failing over to alternate servers on different sites, data recovery and data integrity. Further to this, we had to prove that equipment could be replaced and rebuilt in a timely and standard manner while the recovery site was in operation. We simulated disruptions ranging from comms links, to disks, to whole servers, to the whole data centre. The result was the customer purchasing many hours of services, new servers, switches, routers and additional comms," says Lancaster.
At the same time, he warns against attacking DR with technology alone. "Don't be like many vendors and just focus on technology," he says. "This is a business continuity requirement. If manual methods will work, that is fine." Vendors are particularly prone to deluding customers that technology will solve all their ills. Lan 1's Wellar says IT as an industry tends to over-complicate. "Maybe you can process orders manually for four hours while the network is down," he says. "We shouldn't exclude a simple solution."
More important, however, is the question of whether resellers have the ability to carry out risk assessments themselves. Graham Schultz, regional partner manager for Brocade Australia, says this depends on the company; for example, are they a consulting group or a reseller? "In most cases, typically in larger accounts, the reseller may only get involved after the consulting group has done the preliminary investigation and report," says Schultz.
Wellar, on the other hand, feels a certain amount of the assessment is common sense and can be addressed with a tick-list. Resellers can find such resources on the Internet, and Wellar recommends they speak to accountants for an understanding of how system failures and lost data convert into a business expense.
"A lot of resellers have never had to look at DR from a pure business point of view. The equation is not hard. If your critical data and access to it costs you $1 million you are going to spend more than $10 to protect it but less than $2 million. A good financial officer will be able to identify the cost to the company, but it doesn't have to be dead accurate."
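Wellar's back-of-the-envelope equation can be sketched in a few lines. This is purely illustrative; the function name is made up, and the $10 floor and $2 million ceiling simply restate the figures from his $1 million example:

```python
def dr_spend_bounds(value_at_risk, floor=10, ceiling_multiple=2.0):
    """Rough bounds on sensible DR spend for an asset.

    Spend more than a trivial floor, but less than some multiple of
    what the critical data (and access to it) is worth. Per Wellar,
    the figures only need to be roughly right, not dead accurate.
    """
    return floor, value_at_risk * ceiling_multiple

# Wellar's example: critical data and access worth $1 million.
low, high = dr_spend_bounds(1_000_000)
print(f"Spend between ${low:,} and ${high:,.0f}")
```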
Many businesses are attempting to alleviate the possibility of total system failure by engaging the services of a third-party data centre, either for specific functions or to create a complete replica of their infrastructure, a "hot-site". Managed services providers, such as Virtual.Offis, are making this easy to digest by packaging services in tidy bundles - secure off-site tape storage with pre-built and tested hot-swap hard drives ready for rebuild for $550 per server per month, for example. At the same time, Pihana Pacific says it has seen growing interest in its Backup Office Environment, which allows customers to set up shop in its facility, inclusive of telephone systems, PCs and servers.
However, StorageTek's Tunstall warns that outsourcing is often used as a way to shirk the responsibility of constructing a thorough DR solution while still being seen to be doing something.
Nothing beats doing it yourself.
Causes of catastrophe
Integrators outline the common triggers of modern-day disasters.
- The "fat finger" syndrome, i.e. user error. Network Appliance claims up to 80 per cent of system outages are a direct result of user error, such as corrupting, deleting or simply losing files.
- Hardware error is the cause of 40 per cent of all downtime, according to StorageTek.
- Power outages account for more disasters than bombs, hurricanes and earthquakes combined.
- Application failure is estimated to account for around 12 per cent.
- The demise of service providers, such as One.Tel and WorldCom.
- Other environmental causes account for up to 3 per cent.