A small step to improve disk reliability

The hard slog of pushing disks away from the 512-byte block size

The storage industry isn't necessarily known for being a fast mover. Take for example this memo, dated October 2003 (it's in Microsoft Word format), in which Seagate made clear the need to move the physical block size of disk drives to 4,096 bytes instead of 512 bytes.

"One near-term solution to improve reliability is advanced channels, but these need to be implemented in a way that does not excessively impact format efficiency. This progress will require larger physical block sizes," says that old memo.

So, while it's intuitive that a larger block translates into faster transfers, Seagate was really saying that reliability is the main motivation for increasing the size of the physical block.

"Seagate supports an industry-wide transition to a large, 4 Kbyte, sector-size standard with product introduction times in 2006." Sounds very good, but unfortunately there is an obstacle: "The primary roadblock is legacy software that requires a 512-byte block size."

Obviously this did not happen in 2006, because disk drives are still shipping with 512-byte blocks. But as reported by Computerworld's Brian Fonseca, the idea of a larger block is making some progress. And as often happens when vendors are engaged in a very competitive arena, there's some controversy about the larger-block proposals, too.

Speaking of controversy, disk vendors have come under some criticism lately, mainly about the reliability of disk drives. A study published early this year at FAST (File and Storage Technologies) reached the conclusion that vendor-provided numbers on disk drive reliability could be somewhat inflated. You can find a PDF of the study here.

Unfortunately for the disk drive industry, another study, conducted at Google and presented at the same event, reaches similarly disturbing conclusions.

It's not a surprise that those two papers caused quite a hubbub, of which you can find traces in posts at the StorageMojo blog.

Are disk vendors too soft when it comes to estimating the reliability of their products? Do we need more measurable metrics in disk drive specs than the resounding figures (1 million hours mean time between failures comes to mind) that we have today?

To be quite frank, I haven't yet formed a definite opinion, so for now I'll answer "perhaps" to the first question and "yes" to the second. Still, it's impossible to entirely dismiss those studies -- there must be some truth in their conclusions.
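To put a figure like 1 million hours MTBF in more tangible terms, here is a rough back-of-the-envelope conversion to an annualized failure rate. It is only a sketch: it assumes a constant failure rate over the drive's life (the textbook exponential model, not something taken from any vendor's spec sheet) and a drive powered on around the clock. The function name is mine, for illustration only.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 power-on hours, assuming 24x7 operation


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Chance that a drive fails within one year of continuous use.

    Assumes a constant failure rate (exponential lifetime model),
    which is a simplification made for this example.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


afr = annualized_failure_rate(1_000_000)
print(f"{afr:.2%}")  # prints 0.87%
```

Under those assumptions, the resounding 1 million hours works out to somewhat less than one failed drive per hundred per year, a number that is much easier to check against what actually happens in a data center.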

Regardless, if we accept for a moment that disk vendors are delivering the best technology they can produce today (why should they not?), then to increase the reliability of disk drives we need to charge that technological cannon with some fresh gunpowder, such as the apparently inconspicuous larger block size. The sooner, the better.

