More than a month on from the major storage hardware failure that took a host of its systems offline for days, the Australian Taxation Office (ATO) is still working to straighten out the kinks left by the “unprecedented” event.
The problems began on December 12 last year, when storage hardware that had been upgraded in November 2015 by Hewlett Packard Enterprise (HPE) experienced a failure, which was, in turn, compounded by the subsequent failure of the ATO’s backup systems.
The technical issues took out some of the ATO’s core internal systems and public-facing services for days, with the agency confirming that it took a full nine days after the initial outage for its business-critical systems to be live once again.
On December 16, four days after trouble first struck, Australia’s Commissioner of Taxation, Chris Jordan, referred to the issues as the “worst unplanned system outage in recent memory”.
“This was an extremely unusual and unfortunate event with the outage caused by a significant and unprecedented failure of storage hardware,” Jordan said at the time.
“The storage hardware was upgraded in November 2015 by Hewlett Packard Enterprise (HPE) after a lengthy and thorough selection process, and was seen to be ‘state-of-the-art’ at the time.
“We understand the use of this storage hardware is not unique to the ATO and is basically the same used by other large clients of HPE in Australia and across the globe,” he said.
Now, over five weeks after the initial outage, the ATO is set to take its online services offline once again as it continues its work to restore its systems to their full capacity.
On January 10, the agency warned Australians that its online services would be offline for the second time in as many weeks, from the evening of January 13 to the morning of January 15, having taken services offline the weekend prior while it worked to bring additional services online.
“We will be conducting critical system maintenance again this weekend as we continue to restore our systems to full functionality,” the ATO said in a statement.
In early January, the ATO said that it had undertaken a “significant amount of work” over the Christmas period last year in its efforts to restore services.
“The performance of our core services such as the website and portals continue to improve and we are bringing other functions and tools back online every day,” the ATO said in a statement published on January 3.
“The complexity of the restoration means there is more work to be done to return our services to normal. There may be some disruptions to service as this work is undertaken,” it said.
The ongoing systems restoration efforts continue the work the ATO began in partnership with HPE in December 2016 to rectify the fallout from the storage hardware failure.
Both the ATO and HPE have vowed to get to the bottom of the hardware failure, with the agency saying it would commission an independent investigation into the outage, and its technology partner, HPE, subsequently embarking on an internal investigation of its own.
“We refrain from speculation on possible causes while the investigation is underway and at this time, HPE does not believe that other customers are at risk,” the company told ARN in a statement in late December 2016.
While it remains to be seen what the respective investigations will uncover, the last time a government agency was hit by a similarly high-profile systems outage, it saw the Australian Bureau of Statistics (ABS) and its 2016 Census portal technology lead, IBM, dragged into a Parliamentary inquiry over the incident.
For now, the ATO continues to clean up the fallout from the storage failure, with HPE saying it is still helping the agency to restore systems to full functionality.
However, the lion’s share of the restoration work looks like it could finally be coming to an end, with the agency’s system maintenance schedule indicating that regular maintenance will resume once the online services outage in January is done and dusted.