COVID-19 upends disaster recovery planning
- 29 April, 2021 20:00
The COVID-19 pandemic exposed gaps in enterprise disaster recovery and business continuity planning in areas such as remote access, networking, SaaS applications and ransomware. Over the past year, IT execs have been scrambling to plug those gaps and update disaster recovery plans on the fly.
More significantly, the pandemic triggered fundamental IT changes at many organisations, including a hasty migration of applications to the cloud, an acceleration of digital transformation efforts, the emergency provisioning of new systems and services outside of traditional procurement procedures, and, in many industries, the emergence of a new category of full-time, work-at-home employees who are handling mission-critical data on their personal devices.
According to new research from Accenture, "More than three-quarters of business executives plan to redesign how staff work, accelerate digital transformation plans and fundamentally change how they engage with customers," says group chief executive Manish Sharma.
All of those factors will need to be taken into account as organisations rewrite their disaster recovery and business continuity plans for 2021 and beyond. The pandemic demonstrated that worst-case scenarios (or scenarios even worse than we could ever imagine) do happen. And disaster recovery and business continuity planning will never be the same.
DR and BC plans put to the test
Most enterprises had disaster recovery plans in place prior to the pandemic, and many were conscientious about performing practice drills, called tabletop exercises, in which key players gather to respond to a disaster scenario. Pandemic response is typically part of a DR plan.
In fact, Dan Johnson, director of global business continuity and disaster recovery at IT service management company Ensono, recalls that he was conducting disaster-recovery war games at a client site in late 2019 and just happened to select a pandemic as the test scenario.
But no one anticipated anything resembling the magnitude, the global scope and the duration of the current pandemic. Viewed from the narrow focus of IT keeping the lights on without any service interruptions or data loss (the recovery part of the equation), the pandemic didn't present issues that IT wasn't prepared to handle. After all, the virus didn't cut the power, crash servers or infect networks with malware.
As a business continuity challenge – which is the broader discipline of maintaining business processes, coordinating the people aspect of the response, and communicating internally and externally – the pandemic exposed some shortcomings.
"For the most part, disaster recovery plans were still sound—the recovery of technology. The biggest impact was with business continuity plans, which is the recovery of processes and people, making sure all business systems are working," Johnson says.
Losing data wasn't the issue, adds Christophe Bertrand, senior analyst at ESG Research. It was more about having the right people in place, and understanding all the steps that needed to be taken in terms of things like VPNs and networking. The most obvious challenge was the sudden requirement that employees work from home full-time.
"Some organisations were more ready than others," Bertrand says. "Some were caught flatfooted in terms of collaboration tools and the ability to support a remote workforce."
Not only did organisations need to make sure that employees had the right physical equipment and Internet connection, they had to provide robust and secure collaboration tools, and they had to protect this new, broad attack surface of home Wi-Fi networks from the surge in ransomware targeted at remote workers.
"There was a lot of scrambling. We took a lot of panicked phone calls," particularly in connection with ransomware, says Doug Matthews, vice president of product management at Veritas Technologies, which provides enterprise backup and recovery services. Organisations needed to quickly make sure that endpoints were protected, and that enterprise firewalls were set up to repel attacks that might originate from an unsecured endpoint.
Matthews added that the pandemic also triggered "a hasty movement to cloud adoption," even at organisations that had previously been resistant to cloud technologies. This has created what he calls a "resiliency gap," as companies are just now going back, reading their agreements with SaaS vendors, and realising that their data is not necessarily protected.
"One area that is problematic is SaaS," Bertrand says. "People don't understand SLAs all that well. Just because you have data in Office 365 or Salesforce that doesn't mean they do backup for you." SaaS providers offer SLAs for availability, but that's not the same as recoverability.
"You're always responsible for your data," Bertrand adds. He says organisations need to make decisions on what data is mission critical and therefore needs to be backed up. "Salesforce is mission critical, and Office 365. But what about GitHub or Zendesk? There's a lot more than meets the eye."
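The kind of inventory Bertrand describes can be kept as a simple list that flags mission-critical data with no backup of its own. The sketch below is illustrative only: the `DataSource` fields and every flag in the sample inventory are hypothetical, not findings from ESG or statements about what any vendor does or does not back up.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    mission_critical: bool    # assumption: a call made with the business units
    independent_backup: bool  # assumption: True only if the organisation keeps its own copy


def resiliency_gaps(sources):
    """Return the names of mission-critical sources that lack an independent backup."""
    return [s.name for s in sources if s.mission_critical and not s.independent_backup]


# Hypothetical inventory; the flags are invented for illustration.
inventory = [
    DataSource("Salesforce", mission_critical=True, independent_backup=True),
    DataSource("Office 365", mission_critical=True, independent_backup=False),
    DataSource("GitHub", mission_critical=True, independent_backup=False),
    DataSource("Zendesk", mission_critical=False, independent_backup=False),
]

print(resiliency_gaps(inventory))
```

With these made-up flags, the check surfaces Office 365 and GitHub as the gaps to close first. The point is the process: classify the data, then confirm who actually holds a recoverable copy.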
The pandemic also exposed a lack of foresight when it came to the networking aspect of DR/BC planning, since the sudden shift of employees to remote sites scrambled traditional network traffic patterns and created potential bandwidth issues.
Ravi Ravishanker, CIO and associate provost at Wellesley College, says he was fortunate that the college recently moved all of its ERP applications, Web servers and learning management system to the cloud, because the college would not have had the bandwidth to accommodate students and staffers connecting from remote locations if all of those applications had been on-premises.
Wellesley College transitions to remote learning
The pandemic affected industries in different ways, and higher education was faced with a difficult challenge: How do you suddenly switch from in-person classrooms to fully remote learning?
"We all prepared for a pandemic, but nobody was prepared for COVID," Ravishanker says. Fortunately, with core applications living in the cloud, Wellesley was able to keep its systems available to end users and students no matter where they were located.
The difference between a disaster like a fiber cut or power outage and the COVID pandemic is that by mid-March, people could see it coming and do some quick planning. "One of the most fundamental things we needed to get decisions on was the technology platform for remote learning. We were all in agreement the choice should be Zoom," Ravishanker says.
The college quickly obtained an institutional licence and set up Zoom drop-in sessions for students prior to their leaving the campus last March. "Decisions were made at a pace never seen before," he adds. There were a small number of students who had outdated computers or couldn't afford Internet connections at their home locations, so the college took care of those arrangements.
After students were sent home, the next week was spring break, so the team had some time to train teachers before remote classes began. They started with the basics and then over time moved to more advanced features like creating breakout rooms and managing chats.
When the semester ended, the team conducted de-briefing sessions with faculty and began gearing up for the fall semester in which approximately half of the students were on campus and half were remote.
Having to support two systems at once created technology challenges, as well as issues such as making sure classrooms were set up for social distancing and ensuring that everyone was following the college’s COVID protocols, such as wearing masks. With students wearing masks it was difficult to be heard, so the team provided additional microphones in each classroom.
Amid disasters, there are always positives. Some faculty have been reporting that the chat option on Zoom has resulted in more contributions from students who might have been reluctant to raise their hands and speak publicly. And Ravishanker says the crisis has accelerated the adoption of digital technologies like Zoom, which will likely continue to be used even after the pandemic is over.
At IT service management company Ensono, everyone had laptops and many were already working from home, so that wasn't an issue. The company made sure employees were connecting over a VPN, and bandwidth was expanded to accommodate increased traffic from remote workers.
On the business continuity angle, the CEO conducted regular town hall meetings with employees, and managers were encouraged to check in on employees often. Customers were also kept in the loop.
At this point, Ensono employees are still working from home, and Johnson doesn't anticipate that employees will ever return to their pre-pandemic work schedule. It seems more likely that some employees will work from home permanently, and others will adopt a hybrid model of 2-3 days a week in the office.
Once decisions like that shake out at organisations across the globe, DR plans will have to be altered. For example, many companies have physical backup sites equipped with computers, phones, etc., so that employees can continue to work in the event that headquarters is hit with a disaster.
But those facilities proved useless in the pandemic because they weren't set up for social distancing, and if employees are able to do their jobs from home, maybe it's time to abandon those sites.
Similarly, as companies move their backup data to the cloud, and many turn to disaster-recovery-as-a-service (DRaaS), maybe it's time to reconsider the need for owning and operating a physical data backup site.
The future of DR/BC
As a result of the pandemic, disaster recovery has gone from the back burner to the forefront of enterprise IT planning. ESG asked 600 IT professionals in North America and Western Europe about their spending intentions for 2021, and they indicated that DR was a top priority.
For example, disaster-recovery-as-a-service (DRaaS) has exploded due to COVID, according to the research firm MarketsandMarkets, which predicts the global DRaaS market will increase 23.3 per cent a year between now and 2025.
"The DRaaS market will continue to grow post-COVID-19 as more enterprises migrate IT infrastructure to the cloud, boost business continuity and improve IT operations," according to the report.
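To put the 23.3 per cent figure in perspective, the short calculation below compounds it over the forecast period. It is a rough sketch under two assumptions that the article does not spell out: that the rate compounds annually, and that "between now and 2025" spans four years (2021 to 2025).

```python
def growth_multiple(annual_rate, years):
    """Return how many times larger a market becomes at a compound annual rate."""
    return (1 + annual_rate) ** years


rate = 0.233  # 23.3 per cent a year, as cited from the forecast
years = 4     # assumption: 2021 through 2025

print(f"{growth_multiple(rate, years):.2f}x")  # prints 2.31x
```

Under those assumptions, the market would be a little more than twice its current size by 2025.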
Adding to the pressure on IT execs in the post-COVID era, Bertrand notes that the business requirements for data recovery times have shrunk from hours to minutes. And 15 per cent of respondents in the ESG survey say their business units won't tolerate any downtime at all.
On top of that, organisations may now have data in multiple public clouds, in SaaS applications that they don't directly control, and on-premises. In this hybrid cloud world, "you have to have solutions in place that protect mission-critical applications in a way that is congruent with business objectives," Bertrand says.
The good news is that there are automated tools that can help. Vendors such as Collibra, Digital Guardian and Varonis can help companies automate data classification. Zerto, Veeam, Unitrends and others provide automated, non-disruptive DR testing tools as part of their broad DR/BC platforms. And vendors such as Rubrik, Cohesity and Actifio offer highly automated data protection and data management platforms built for hybrid cloud environments.
IT execs who are thinking strategically about data protection should also consider the concept of dual use. For example, if there is backup data sitting in the cloud, it could be an opportunity to apply analytics to that data in order to support the company's digital transformation efforts.
"There is a tremendous amount of work still ahead," Bertrand says. "It's definitely not a time to let up."