Archiving, McIsaac said, should be treated as distinct from backup: only active data, which may represent 10-40 per cent of the total, should be backed up. If a file is no longer being accessed or changed, it should become reference data, archived somewhere more permanent, and shouldn’t need to be repeatedly backed up.
One of the more interesting technologies designed to address this archiving problem is Content Addressable Storage (CAS).
CAS systems, as the name suggests, store information based on the content and integrity of the data itself, whereas traditional systems store files according to the location of the file on disk.
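To make the idea concrete, here is a minimal sketch of content addressing in Python. It assumes a SHA-256 digest serves as the address and an in-memory dictionary stands in for the storage nodes. The `ContentStore` class and its method names are hypothetical and do not represent any vendor's product or API.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the address is derived from the bytes."""

    def __init__(self):
        # Hypothetical stand-in for the storage back end: address -> bytes.
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is the SHA-256 digest of the content (an assumption;
        # real products may use other digests or add further metadata).
        address = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same address, so it is
        # stored only once.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on read: if the bytes have changed, the digest no longer
        # matches the address and the change is detected.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data
```

Because the address is computed from the data, the caller never chooses a location on disk, and any alteration of a stored object shows up as a mismatch when it is read back.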
Priced slightly higher than traditional storage, CAS solutions are designed to store data that has not been accessed for a long time, relegating it to a safer, more secure, but ultimately slower, storage medium.
“There is a key point in the lifecycle of data in which the information isn’t accessed anymore,” EMC product marketing director, Clive Gold, said. “It doesn’t need the split second access traditional storage provides.
“With traditional storage you have no way of guaranteeing a file hasn’t changed and you can only add as many drives [as the array will fit]. With CAS, the data is completely redundant of any independent node. You can scale theoretically forever, with new nodes working alongside old ones.”
CAS systems work hand-in-hand with content management tools, which determine which files have not been accessed within a given period of time (or within whatever policy rules the administrator creates) and assign them to the CAS archive.
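A policy of this kind can be sketched in a few lines of Python. The example assumes the tool already holds a last-accessed timestamp for every file. The function name `select_for_archive` and the 30-day default are illustrative only and are not taken from any particular content management product.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def select_for_archive(
    last_access: Dict[str, datetime],
    now: datetime,
    max_idle_days: int = 30,
) -> List[str]:
    """Return the names of files idle for longer than the policy allows."""
    # Anything last accessed before this moment is a candidate for archiving.
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        name for name, accessed in last_access.items() if accessed < cutoff
    )
```

An administrator's rule would then amount to a choice of `max_idle_days`, with the selected files handed on to the archive.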
The real smarts, however, come when these tools are used in such a way that users don’t even know their older files are being stored in a different system. The administrator might decide, for example, that after 30 or 40 days, email threads are moved off the Exchange server and into archive storage. With some of the latest archiving tools on the market, the email still appears to be in Exchange when the user performs a search. It can still be retrieved; it just takes a little longer than if it were still on the Exchange server.
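One way to picture this transparency is a mailbox that keeps its search index in place while the message bodies move between tiers. The Python sketch below is purely illustrative: the `Mailbox` class and its methods are hypothetical, and it makes no claim about how Exchange or any archiving tool is actually implemented.

```python
class Mailbox:
    """Toy two-tier mailbox: the index stays put, bodies can move."""

    def __init__(self):
        self.index = {}    # message id -> subject, always searchable
        self.primary = {}  # message id -> body held on the fast mail server
        self.archive = {}  # message id -> body held in slower archive storage

    def deliver(self, message_id: str, subject: str, body: str) -> None:
        self.index[message_id] = subject
        self.primary[message_id] = body

    def archive_message(self, message_id: str) -> None:
        # Move the body to the archive tier; the index entry is untouched.
        self.archive[message_id] = self.primary.pop(message_id)

    def search(self, term: str) -> list:
        # Search consults only the index, so archived mail still appears.
        return sorted(
            mid for mid, subject in self.index.items()
            if term.lower() in subject.lower()
        )

    def open(self, message_id: str) -> str:
        # Try the fast tier first, then fall back to the archive.
        if message_id in self.primary:
            return self.primary[message_id]
        return self.archive[message_id]
```

From the user's side, `search` and `open` behave the same before and after a message is archived; only the retrieval path, and therefore the response time, differs.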
Notably, though, none of the CAS vendors – EMC, Hitachi Data Systems or HP – has found a huge amount of traction in the Australian market to date. McIsaac claimed the high upfront cost of CAS has meant most sales have been limited to the top end of town such as banking, finance and healthcare.