ARN

Striking a solid storage strategy

With the data flood continuing to rise, organisations must strike a sensible storage management strategy to help them forge ahead.

Everybody wants absolutely everything instantaneously. From staff craving access to files and emails that are years old, to clients hankering after entire transaction histories online, the pressure on storage systems to be ever faster and more reliable is growing.

In 2008, the compound annual growth rate in data storage requirements could hit 50 per cent, outstripping the industry’s attempts to increase disk capacity, which is growing at around 30 per cent.

And while growth is uneven, with some organisations reporting increases of 200-300 per cent and others registering nominal growth at best, you can’t afford not to have a forward-looking storage strategy in place.

“Storage needs are growing,” Gartner managing vice-president and storage specialist, Phillip Sargeant, said. “But storage growth is not necessarily associated with the sheer volume of data being stored by users. A lot of the growth can be attributed to the tendency for disk-based systems being used for backup and replication.

“People want to recover quickly from a failure. They want services restored in 15 minutes, not an hour, not half a day. They have to put in more storage to do that.”

In fact, as more and more organisations look to ease storage pain points, there are ample opportunities around a variety of technologies for the channel to help clients strike a storage strategy.

Formulating a strategy

To realise sales or consulting opportunities in this storage growth trend around disaster recovery (DR) or the latest backup and archiving technologies, resellers must first get customers thinking about classifying their data, Sun Microsystems storage manager, Steve Stavridis, said.

Data needs to be prioritised along the lines of how critical it is to the business, how easily and often it needs to be accessed, and the length of time it needs to be retained for business or compliance purposes.

“If you’ve got data out there that is important but not critical, you need to tier your storage,” Stavridis said. “You need to distinguish between fast, high-performing storage and adequate, high-capacity storage. Both come at a very different price point – the former costs an arm and a leg.”

The first distinction a customer might make is between structured data (transactional data, such as that held in databases or for online transactions) and unstructured data (data stored on file servers, email and so on).

Transactional data is generally growing at a manageable rate (around 20-30 per cent per year), but needs to be protected with expensive equipment to meet stringent service levels. Unstructured data, meanwhile, is less vital, but harder to manage in growth terms.


Once this distinction is made, an organisation might then split each into two or three levels or tiers of priority.

“Transactional data is generally the lifeblood of the business,” IBRS analyst, Dr Kevin McIsaac, said. “You might consider a ‘Platinum’ class for your customer information system data, your ERP and supply chain data and your online transaction data. It has to be highly available during business hours if not 24/7, has to be stored on the best disk available, and mirrored and backed-up so that systems can be restored within minutes or seconds of any downtime.

“You might then have a slightly lower class, let’s call it ‘Gold’ class, applied to transactional data that is less time-sensitive – perhaps for accounting systems and the like. For this data, restoring after downtime might be a little more relaxed. The business could still function if it had to wait an hour to bring this data back online.”

The same kind of formula can be applied to unstructured data. Email might be considered ‘Gold’ for example, as many organisations would all but shut down without access. But documents on file servers may fit into a ‘Bronze’ class.
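To make the tiering idea concrete, the sketch below shows one way such a classification might be expressed in code. It is a minimal illustration only: the data sets, thresholds and criteria names are invented for this example rather than drawn from Gartner, IBRS or any vendor.

```python
from dataclasses import dataclass

@dataclass
class DataSet:
    name: str
    business_critical: bool      # would the business stop without it?
    max_restore_minutes: int     # acceptable time to bring it back online
    frequently_accessed: bool    # is it in day-to-day use?

def assign_tier(ds: DataSet) -> str:
    """Assign a storage tier from the criteria above; thresholds are illustrative only."""
    if ds.business_critical and ds.max_restore_minutes <= 15:
        return "Platinum"   # best disk, mirrored, recovery in minutes
    if ds.business_critical or ds.frequently_accessed:
        return "Gold"       # an hour of downtime is tolerable
    return "Bronze"         # file-server documents and the like

for ds in [
    DataSet("customer information system", True, 10, True),
    DataSet("accounting system", True, 60, True),
    DataSet("departmental file server", False, 480, False),
]:
    print(f"{ds.name}: {assign_tier(ds)}")
```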

Once a business has a good view of its data, it can make far smarter decisions as to what storage technologies to use in its overall mix. ‘Platinum’ data, for example, might require the more expensive fibre channel connectivity and best-grade hardware available to ensure uptime and speed of access. But the disaster recovery site it fails over to might only require ‘Gold’ or ‘Bronze’ level disk.

However, ask any line-of-business manager what level of storage they want and they will demand the best. McIsaac recommended IT departments bill business units per terabyte of storage, with a different price per terabyte for each level (Platinum, Gold or Bronze).

“A high price on Platinum and a far lower price on Bronze drives line of business managers down to realistic expectations,” McIsaac claimed. “It gives business units guidance on what service levels they truly require and ensures that storage technologies are used appropriately.”
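A minimal sketch of that chargeback model follows; the per-terabyte prices are hypothetical figures chosen purely to show how tier pricing shapes demand.

```python
# Hypothetical monthly price per terabyte for each tier (illustrative only).
PRICE_PER_TB = {"Platinum": 1000, "Gold": 400, "Bronze": 100}

def monthly_bill(usage_tb: dict) -> float:
    """usage_tb maps tier name -> terabytes consumed by a business unit."""
    return sum(PRICE_PER_TB[tier] * tb for tier, tb in usage_tb.items())

# A business unit demanding everything on Platinum pays a clear premium...
print(monthly_bill({"Platinum": 10}))                          # 10000
# ...which encourages pushing less critical data down the tiers.
print(monthly_bill({"Platinum": 2, "Gold": 3, "Bronze": 5}))   # 3700
```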

Automating archives

After classifying their data, customers still face the problem of dealing with the rising data flood.

Traditionally, when capacity ran out, IT decision makers either increased the supply of disk or set limits on usage. These days, those taking the first course of action are increasingly purchasing SATA drives.

Although slower and more failure-prone than fibre channel, SATA is cheaper and often paired with applications that aren’t as essential, according to NEC server and storage specialist, Anthony Pepin.


“SATA is a hell of a lot cheaper than fibre drives,” he said. “And price is king in a lot of places.”

Tips and tricks

Dimension Data general manager datacentre solutions, Ronnie Altit, gave us his top five tips for implementing a storage management strategy.

1. Understand your data, its location, size and age, and its growth trend. It’s critical to understand the growth trajectory of data so as to determine the ongoing costs and the “points of inflection” (times at which you’ll need to invest significantly in storage infrastructure, such as a new SAN/NAS or a new tape library). A rough projection of this kind is sketched after this list.

2. Understand the business value of data, so as to get a firm grip on what needs to be stored, where and for how long.

3. Understand the cost of the data on the overall storage environment to determine an effective storage strategy that will address both the short and the long term.

4. Don’t assume that all technology will be a fit. While there are some excellent technologies available – such as archive, de-duplication, VTL and the like – not every approach will provide a suitable ROI for every organisation.

5. Make sure you understand the roadmap of the vendors you work with. Is the environment becoming more complex, or are you working to simplify the management?
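As a rough illustration of the first tip, the sketch below projects compound data growth to estimate when an existing array will run out, one of the “points of inflection” Altit describes. The capacity and growth figures are arbitrary examples.

```python
def years_until_capacity_exhausted(current_tb: float,
                                   capacity_tb: float,
                                   annual_growth: float) -> int:
    """Apply compound annual growth and return the number of whole years
    before current_tb exceeds capacity_tb (the point at which new
    infrastructure must be bought). Figures are illustrative only."""
    years = 0
    while current_tb <= capacity_tb:
        current_tb *= (1 + annual_growth)
        years += 1
    return years

# e.g. 20TB today, a 60TB array, data growing at 50 per cent a year
print(years_until_capacity_exhausted(20, 60, 0.5))  # 3
```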

Many vendors have noted the popularity of this cheaper disk. NEC, for example, has decided to phase out its S-series fibre channel drives, replacing them with Serial Attached SCSI (SAS). SAS, the serial successor to parallel SCSI, overcomes many of its predecessor’s limitations in terms of distance and performance, while coming in at around 40 per cent of the cost of fibre channel.

However, rather than buying more disk, some customers may find more value in investing in tools and implementing policies that maximise existing storage investments. Simply setting limits and leaving users to manage the overflow themselves, McIsaac said, can create more problems than it solves.

“If everybody pulls their emails off the Exchange server onto unmanaged devices, and if many users in the same network have the same conversations and attachments stored, you actually end up replicating data all over the place,” he said.

“Most people are pack-rats. My experience is the only way to make it work is to automate the archiving.”


Archiving, McIsaac said, should be made distinct from backup and only active data, which may represent 10-40 per cent of the total, should be backed up. If a file is no longer being accessed or changed, it should become reference data archived somewhere more permanently, and shouldn’t need to be repeatedly backed up.
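A minimal sketch of that separation, assuming time since last access is the only policy criterion and using an arbitrary 90-day threshold:

```python
import os
import time

ARCHIVE_AFTER_DAYS = 90  # illustrative policy threshold

def split_active_from_reference(paths):
    """Return (active, reference) lists: active files stay in the regular
    backup cycle, reference files become candidates for the archive and no
    longer need to be backed up repeatedly."""
    cutoff = time.time() - ARCHIVE_AFTER_DAYS * 86400
    active, reference = [], []
    for path in paths:
        last_access = os.path.getatime(path)
        (active if last_access >= cutoff else reference).append(path)
    return active, reference
```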

Addressing content

One of the more interesting technologies designed to address this archiving problem is Content Addressable Storage (CAS).

CAS systems, as the name suggests, store information based on the content and integrity of the data itself, whereas traditional systems store files according to the location of the file on disk.

Priced slightly higher than traditional storage, CAS solutions are designed to store data that has not been accessed in a long period of time, relegating it to a safer, more secure, but ultimately slower, storage medium.

“There is a key point in the lifecycle of data in which the information isn’t accessed anymore,” EMC product marketing director, Clive Gold, said. “It doesn’t need the split second access traditional storage provides.

“With traditional storage you have no way of guaranteeing a file hasn’t changed and you can only add as many drives [as the array will fit]. With CAS, the data is fully redundant across independent nodes. You can scale theoretically forever, with new nodes working alongside old ones.”
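To make the content-addressing idea concrete, here is a toy in-memory sketch (not any vendor’s implementation) in which a SHA-256 digest of the data serves as its address; re-hashing on read is what lets the store guarantee an object has not changed.

```python
import hashlib

class ContentAddressableStore:
    """Toy content-addressed store: objects are written and read by the
    hash of their content rather than by a location on disk."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._objects[address] = data      # identical content is stored once
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hashing on read proves the object is exactly what was stored.
        assert hashlib.sha256(data).hexdigest() == address
        return data

store = ContentAddressableStore()
addr = store.put(b"quarterly report, final version")
print(addr, store.get(addr) == b"quarterly report, final version")
```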

CAS systems work hand-in-hand with content management tools, which determine what files have not been accessed within a given period of time (or within whatever policy rules the administrator creates) and assign them to the CAS archive.

The real smarts, however, come when these tools are used in such a way that the user doesn’t even know their older files are being stored in a different system. The administrator might decide, for example, that after 30 or 40 days, email threads are moved off the Exchange server and into archive storage. Using some of the latest archiving tools on the market, the email still appears to be in Exchange should the user perform a search. It can still be retrieved, but retrieval takes a little longer than if the file were still on the Exchange server.

Notably, though, none of the CAS vendors – EMC, Hitachi Data Systems or HP – has found a huge amount of traction in the Australian market to date. McIsaac claimed the high upfront cost of CAS has meant most sales have been limited to the top end of town such as banking, finance and healthcare.


Backup: Tape vs disk

Another area that has become a key issue amid the data flood is the manageability of backups. Once exclusively the realm of tape drives, backup is slowly but surely encompassing disk as users look for faster, more manageable solutions.

A study by Gartner in April this year found disk would be the primary medium for recovery of critical data by 2010. However, the analyst firm also noted enterprise storage vendors had only made slight progress in terms of winning customers over to the virtues of disk-based backup.

“We still back up to tape every night,” said Lex Moses, CIO of Vita Group, the company behind the retail brands Fone Zone and Next Byte.

“With tapes, it’s a simple cost equation; the cost is justifiable. Disk is getting more reliable, but achieving the granularity [of tape] on disk is incredibly difficult.”

Gartner’s Sargeant contended tape still has its place in the archiving of long-term data that is rarely (if ever) accessed. It is far easier to take physical tapes offsite in disaster recovery terms, whereas disk-to-disk DR requires significant bandwidth, he said.

“Tape still has value,” Sun’s Stavridis said. “Some organisations are storing data for very long periods of time. Is it wise to have that data on disk if it is never accessed? That disk is going to spin continuously and generate heat for hundreds of years. That’s where tape provides economic benefit.”

With the market largely undecided on tape versus disk, the obvious fix is agnostic data management software tools.

“People are increasingly doing their first level of backup to disk, then copying it to tape,” McIsaac said. “One master tool to do backup and recovery to disk or tape is ideal.”

Of the software vendors providing these management tools, Gartner scores CommVault (Simpana) and Symantec (NetBackup) at the head of the pack in terms of functionality, with EMC (Avamar), IBM (Tivoli), HP (Data Protector) and CA (ArcServe) among those trailing.

While Symantec holds nearly half the market share if you consider both new licences and maintenance revenue, McIsaac claimed CommVault could help customers save money, despite its higher price tag.

“The upfront cost of the CommVault product is less than the yearly maintenance cost of Symantec’s NetBackup,” he said. “CommVault is a relatively new entrant to the backup and recovery market, but it has done a good job of developing a next-generation product and selling it at an aggressive price point.”


CommVault managing director, Gerry Sillars, said the company was growing at 36.7 per cent globally and hiring in Australia in the middle of a global financial crisis.

“Some 78 per cent of our new business comes up against Symantec,” he said. “And some 70 per cent of our displacement customers are ex-Symantec.”

De-duplication

Many of those organisations choosing to back up to disk rather than tape are also looking to take advantage of de-duplication.

The basic premise is that, at some stage in the backup process, an algorithm is run over the files to identify duplicate copies of the same file. One copy is preserved and the others are deleted, with a pointer to the single preserved instance left in their place.
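A minimal sketch of that premise follows, using whole-file hashes to spot duplicates; commercial products typically de-duplicate at the sub-file block level, which this toy example does not attempt.

```python
import hashlib
from pathlib import Path

def deduplicate(paths):
    """Map each file to either its own content or a pointer to an identical
    file seen earlier. Returns {path: ("data", path)} for first copies and
    {path: ("pointer", first_path)} for duplicates."""
    seen = {}        # content hash -> first path with that content
    catalogue = {}
    for path in map(Path, paths):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            catalogue[path] = ("pointer", seen[digest])  # duplicate: keep a pointer only
        else:
            seen[digest] = path
            catalogue[path] = ("data", path)             # first copy: keep the data
    return catalogue
```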

“De-duplication is becoming commonplace – it’s now not a question of whether to do it but why you wouldn’t do it,” Fujitsu solutions architect, Aaron Bell, said. “It simply allows you to store more information on disk, making more information accessible and retrievable.”

Sillars agreed and expected deduplication to win more users over to disk.

“We have been convinced for several years that people would want to do backup to disk for rapid restore,” he said. “With de-duplication, you can significantly reduce the amount of disk needed for each backup.”

That said, CommVault is also “actively looking” at ways to take deduplication technology to tape.

Another technology changing the face of storage management is virtualisation. While it has existed within storage arrays for many years, the popularity of server virtualisation has naturally engulfed the storage world with its charms.

Virtualised storage abstracts the usable disk space available within storage systems from the physical device itself. Its main benefit to customers is to increase the utilisation of storage assets they already own.

“The key tangible outcome is the re-use of existing infrastructure – using older hardware as a lower tier of storage,” Hitachi Data Systems chief technology officer, Simon Elisha, said. “Customers love the thought of not throwing stuff out.”
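A simplified sketch of that abstraction, with invented array names and tiers: capacity from newer and older physical arrays is pooled, and logical volumes are carved out of whichever device in the requested tier has space, so the consumer never sees the physical hardware.

```python
from dataclasses import dataclass

@dataclass
class PhysicalArray:
    name: str
    free_tb: float
    tier: str          # e.g. newer kit presented as "Gold", older kit as "Bronze"

class VirtualisedPool:
    """Toy storage-virtualisation layer: logical volumes are placed on any
    physical array in the requested tier, hiding the device from the consumer."""

    def __init__(self, arrays):
        self.arrays = arrays

    def create_volume(self, size_tb: float, tier: str) -> str:
        for array in self.arrays:
            if array.tier == tier and array.free_tb >= size_tb:
                array.free_tb -= size_tb
                return f"{size_tb}TB volume placed on {array.name}"
        raise RuntimeError(f"no capacity left in tier {tier}")

pool = VirtualisedPool([
    PhysicalArray("new-array", free_tb=20, tier="Gold"),
    PhysicalArray("old-array", free_tb=50, tier="Bronze"),  # reused, not retired
])
print(pool.create_volume(5, "Bronze"))
```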


Some vendors, like NetApp, claim storage virtualisation can double utilisation rates. The company is so confident of the technology’s value that it is offering a guarantee that customers will use 50 per cent less storage in their virtual environments than they would have with traditional storage. And if they don’t, the vendor is offering to pony up whatever further hardware is required to hit the promised figure.

“We’re that confident about it. It is that smart,” director of partner sales, Scott Morris, said. “With this kind of guarantee, resellers can boldly go to the market with a real claim to creating value for their customers.”

Thin edge of the wedge

Another powerful outcome of virtualised storage is the potential for its use, in combination with virtualisation at server level, to simplify DR. But the virtualisation technology with the biggest ramifications in the storage world is thin provisioning.

Traditionally, when provisioning a server, an IT manager would have to assign it a given quantity of storage. They would estimate how much storage would be required in the years to come, and would generally over-provision to save having to repeat the task once the application was live.

Thin provisioning, by contrast, allows for the “just-in-time provisioning of storage”, negating the need to have a whole lot of spindles powered up but not utilised.

The virtualisation technology, HDS’ Elisha said, takes the total pool of storage and divvies up the set of disks provisioned for an organisation’s servers into small (42MB) chunks. The servers take these storage chunks in increments as they need them. Such a technology saves on the purchasing costs of as-yet-unused disks spinning, consuming power and generating heat while waiting to be used.
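A toy sketch of those allocate-on-demand mechanics follows; the 42MB chunk size comes from Elisha’s description, while the pool and volume sizes are invented for illustration.

```python
CHUNK_MB = 42  # allocation unit mentioned by HDS

class ThinPool:
    """Servers are promised a logical size up front, but physical chunks
    are only handed out when data is actually written."""

    def __init__(self, physical_mb: int):
        self.free_chunks = physical_mb // CHUNK_MB

    def provision(self, logical_mb: int) -> "ThinVolume":
        return ThinVolume(self, logical_mb)   # costs nothing physically yet

class ThinVolume:
    def __init__(self, pool: ThinPool, logical_mb: int):
        self.pool, self.logical_mb, self.used_chunks = pool, logical_mb, 0

    def write(self, mb: int):
        needed = -(-mb // CHUNK_MB)           # round up to whole chunks
        if needed > self.pool.free_chunks:
            raise RuntimeError("physical pool exhausted: time to buy more disk")
        self.pool.free_chunks -= needed
        self.used_chunks += needed

pool = ThinPool(physical_mb=10_000)           # ~238 chunks of real disk
vol = pool.provision(logical_mb=100_000)      # server believes it has 100GB
vol.write(420)                                # only 10 chunks actually consumed
print(pool.free_chunks)                       # 228
```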

HDS research suggests most organisations only use 35 per cent of their storage capacity. Using thin provisioning, a server consumes only the capacity it has actually written while believing it has its full allocation available.

“For a medium to large organisation with 20TB of capacity, that might equate to [short-term] savings of $500,000 to $600,000 capital expenditure over a four-year period,” HDS senior marketing manager, Tim Smith, said.

“The cost of storage is going down by 30 per cent a year, so every 12 months you delay buying disk is a saving of 30 per cent.”
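As a rough worked example of that arithmetic (the 30 per cent annual decline is Smith’s figure; the capacity and per-terabyte price are hypothetical):

```python
def cost_if_deferred(tb: float, price_per_tb_today: float,
                     years_deferred: int, annual_decline: float = 0.30) -> float:
    """Price of buying the same capacity later, assuming the per-terabyte
    price keeps falling at the stated annual rate."""
    return tb * price_per_tb_today * (1 - annual_decline) ** years_deferred

today = cost_if_deferred(20, 5000, 0)   # buy 20TB now at a hypothetical $5000/TB
later = cost_if_deferred(20, 5000, 1)   # the same purchase a year later
print(today - later)                    # 30000.0 saved by waiting 12 months
```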

However, thin provisioning is relatively new and not everyone is sold on its immediate value.

“We have evaluated it, but customers don’t at present have a demand for it,” Fujitsu’s Bell said. “It’s a relatively new feature set. It will become more useful to companies when carbon trading is on the agenda.”

Gartner’s Sargeant claimed virtualisation was “still expensive to implement” at storage level.

“A lot of organisations can’t quite justify putting it in,” he said. “It’s not as widespread in storage systems as people make out. It only will be when standards emerge and prices go down.”

While bullish about the power of virtualisation, even NetApp’s Morris qualified his enthusiasm for the technology.

“Virtualisation is not all things to all people,” he said. “There is almost as much benefit from pure consolidation.”

But whatever technology you choose as appropriate for a client to help them deal with the data flood, most organisations are still in the early stages of understanding what data to store and how long to keep it for, HDS’ Elisha concluded.

“They want solutions to problems, not for you to meet your sales goals,” he said. “As a reseller you should be saying, ‘here’s something you might want to do, but don’t have to do’.”