Planning for business continuity means weighing risk: What systems can your customer’s business not live without? How can they get access to that data or computing power should something happen? And, perhaps most importantly, how much do they want to spend to have a backup plan — a plan that they, hopefully, might never need?
Taking the first step and conducting a business impact analysis not only shows executives the importance of planning for business continuity, but also highlights which areas need immediate attention and which can be pushed out or planned for future budgets.
“Making [business continuity] a business issue, not an IT issue, is top of the list,” said Jim Harding, CIO of Henry Schein, a US distributor of health care products and services. “If they think it’s ‘just an IT thing,’ then it’s very difficult to get anybody to participate and have an effective plan.”
Business continuity has evolved beyond IT, encompassing more than technology and moving toward a more communications- or people-centric view. Nevertheless, IT remains an important part of business continuity, often serving as the common link that ties everything together.
Companies adopt business continuity plans in many different ways and for different reasons, but all agree on one thing: business continuity can no longer be ignored.
All about accessibility
Michael Zepernick, president of New York-based systems integrator Computer Integrated Services (CIS), learned just how important accessibility can be when operating in a distributed business model. CIS had been backing up, among other things, a Windows 2000 server, all its office applications, a Red Hat Linux database server, a NetWare server and its SQL-based help desk dispatch software, all as part of an in-the-works plan to possibly create a remote-backup option for customers.
On September 11, CIS’ office lost all phones, power and Internet connectivity, and because the office was inside the “frozen zone” encircling the World Trade Center, nobody could enter, Zepernick said. But by restoring the most recent data from its tape libraries and setting up Citrix Systems’ MetaFrame application-access solution, CIS was up and running again within 24 hours.
The company adopted multiple locations — a rented site in Manhattan with a fluctuating T1 line (data connection at 1.544 Mbps) for sales staff, technical staff working from CIS customer sites, a network of help desks throughout the city for customer calls, and a good number of employees working out of their homes.
“We were accessing our data whether it was over dialup lines, DSL, cable modems, T1 lines, and it didn’t matter, MetaFrame handled it,” Zepernick said.
Despite the scattered locations, everybody was able to get to the information he or she needed to be productive. The key, said Zepernick, was the commitment to doing remote backups consistently.
“At the end of the day, because we had the distributed environment ability, our sales and service — compared to what had happened — was really not impacted. ... We just had to inform people where to report to and where to go. People could choose to work where they felt most comfortable, not in a location dictated by where our systems were,” Zepernick said.
Taking this experience to heart, Zepernick is convinced of the importance of backing up systems and data on a regular basis, and testing those backups.
“[A business continuity plan] is all about staying in business — that’s of paramount importance,” Zepernick said. “It’s the continuation of your operations, period. If you don’t have a plan and something happens, you can almost just start to assume that you’re going to be out of business completely.”
Protecting centralised resources
Henry Schein’s combination of distributed call centres and distribution centres with a centralised network and computing architecture presents an interesting business continuity planning challenge for Harding.
“If we lost one of our distribution centres, because we have eight of them we can easily reroute that volume to other distribution centres,” he said. “But if we lose our centralised computing or our network, we’re down for the count.”
Because of its centralised computing structure, the company put great emphasis on revamping its entire disaster recovery plan during 2001. This included working with telco AT&T to shore up and back up various systems, both centralised and decentralised. For example, by working with AT&T and adding call centre technology from Avaya, the company can automatically reroute customer calls should one call centre go down, or if too many calls flood a single location.
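The rerouting behaviour Harding describes, in which calls move elsewhere when a centre is down or flooded, can be illustrated with a minimal Python sketch. It is not how the AT&T or Avaya systems work internally; the `CallCentre` class, the `route_call` function, the capacity figures and the "most spare capacity" rule are all hypothetical choices made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class CallCentre:
    """A call centre as seen by the router (all fields are illustrative)."""
    name: str
    capacity: int          # assumed maximum number of simultaneous calls
    active_calls: int = 0
    online: bool = True

    def has_room(self) -> bool:
        # A centre can take a call only if it is up and below capacity.
        return self.online and self.active_calls < self.capacity


def route_call(preferred: str, centres: list[CallCentre]) -> str:
    """Place one call and return the name of the centre that took it.

    The preferred centre is used when it is online and has room.
    Otherwise the call goes to the available centre with the most
    spare capacity (an assumed policy, not one taken from the article).
    """
    by_name = {c.name: c for c in centres}
    home = by_name[preferred]
    if home.has_room():
        target = home
    else:
        candidates = [c for c in centres if c.has_room()]
        if not candidates:
            raise RuntimeError("no call centre available")
        target = max(candidates, key=lambda c: c.capacity - c.active_calls)
    target.active_calls += 1
    return target.name
```

With two centres of capacity 2 and 3, the third call aimed at the smaller centre overflows to the larger one, and a call aimed at an offline centre is redirected in the same way.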
But the main networking resources required a more complex plan.
Harding said Henry Schein changed its disaster services provider to IBM and added network capabilities from AT&T to back up systems, data and other resources to IBM’s offsite location, contracting with these companies to allow full recovery within 24 hours.
“We could have gone for a four-hour recovery, which means every time you record a transaction, you’re rewriting that to a disk drive at the IBM location,” Harding said. “We opted out of that because the cost is about 10 times what it is to do what we’re doing.”
After losing some network capacity when a previous carrier was affected by the September 11 events, Henry Schein consolidated more services with AT&T “because their own internal recovery capabilities are so far superior to most carriers,” Harding said.
Still, the company is going a step further and is in the process of adding a third frame carrier to its major sites as a backup to the ISDN dialup line that backs up the AT&T connectivity.
“That wouldn’t normally be necessary if we weren’t so highly centralised,” Harding said. “There’s a price for being centralised, and that’s part of it.”
The entire continuity plan, including the IBM and AT&T services, will be put to work in July when Henry Schein tests its resiliency by building systems from scratch based on a previous night’s backup tapes, and then cutting the network over for a short period to test recovery plans.
Harding said the company expected to do this kind of full test “at least annually”, with more moderate continuity tests performed semi-annually.
One key concern for Harding was finding providers that had excellent business continuity and disaster recovery plans of their own.
“Without the capability, I don’t care what the price is,” Harding said. “Part of that is not only ‘Do you have this hardware and connectivity and so forth?’ but ‘How can you handle a regional disaster? If 10 of us go down, do you have the capacity to handle that?’ That was big criteria in making our selection.”
Part of the criteria was company buy-in; at Henry Schein, executives understood the need for a full business continuity plan involving not just IT but the support and participation of other business units as well, and were willing to make the necessary investments of time and money.
From there, it was a matter of putting all the pieces in place over time, Harding said.
After three years of strong growth, US advertising agency The Tracey Edwards Company had a legacy network, a need for a better storage strategy and “no real business continuance plan, which is to say: none. It was a nightmare waiting to happen,” said Scot Villeneuve, vice-president of e-business for Tracey Edwards.
The company’s advertising business creates numerous large graphics and text files in both PC and Macintosh formats that must be stored and remain accessible for long periods of time. “This information is crucial — the No. 1 thing to keeping our business alive if a disaster happened,” Villeneuve said.
Because Tracey Edwards needed to upgrade its network and technology platform, Villeneuve was able to use that upgrade to add some business continuity elements as well. He chose a NAS solution from Storage Computer, with drives running RAID 5 for fault tolerance, and a “pretty robust nightly/weekly/monthly backup system with 80GB DLT tapes”.
“Those [tapes] are removed offsite, put into safe deposit, and if something happened to that network room and everything was destroyed, we could recover all our data a week back and only be out a week’s worth of work, so the disaster recovery problem is addressed nicely,” Villeneuve said.
Should something go wrong within the network, such as a failed drive or an overheating server, Villeneuve and his team also receive an alarm or a page notifying them of the event.
The weekly schedule was chosen after examining and balancing the agency’s specific backup needs with the built-in redundancy of the Storage Computer product and the cost of data backups, which become more expensive if done more frequently. Even though the business continuity elements were added as a subset of the network and storage upgrade, they were still a good chunk of the project budget.
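The nightly/weekly/monthly rotation and the weekly offsite run can be sketched in a few lines of Python. The article does not say which days the agency uses, so the convention below (monthly tape on the first Friday, weekly tape on other Fridays, nightly tape otherwise) is an assumption, as are the function names `backup_tier` and `days_of_work_lost`.

```python
from datetime import date


def backup_tier(day: date) -> str:
    """Return which tape set a given night's backup belongs to.

    Assumed convention (not from the article): the first Friday of the
    month is the monthly backup, every other Friday is the weekly
    backup, and all remaining nights are nightly backups.
    """
    if day.weekday() == 4:  # Monday is 0, so 4 is Friday
        return "monthly" if day.day <= 7 else "weekly"
    return "nightly"


def days_of_work_lost(disaster: date, offsite_runs: list[date]) -> int:
    """Days of work at risk if only offsite tapes survive a disaster.

    Recovery falls back to the most recent offsite tape taken on or
    before the disaster date.
    """
    earlier = [run for run in offsite_runs if run <= disaster]
    if not earlier:
        raise ValueError("no offsite tape predates the disaster")
    return (disaster - max(earlier)).days
```

For example, with offsite runs on Friday 5 January and Friday 12 January 2024, a disaster on Thursday 11 January leaves six days of work unrecoverable, which is close to the week's worth of work Villeneuve accepts as the worst case. Moving tapes offsite more often shrinks that window at a higher handling cost.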
“To be honest, the business continuity component of our hardware and software deployment purchase was probably 10 per cent of the total cost of the network, so we’re talking a lot of money to have this tape backup automated and configured,” Villeneuve said.