Right now I am trying to spec out a storage area network (SAN) to handle the data generated by our 454 instrument and, potentially, other sequencing instruments down the line. A recurring theme when dealing with the major vendors is the huge discrepancy between the unit cost of hard drives (a 1TB drive can be had for as little as £54 these days) and the cost of deploying enterprise-grade storage. Having gathered quotes from Dell, HP and IBM, I have been surprised to find that you don’t get much change from £10,000 for a 12TB SAN populated with disks. That works out at a whopping £800 or so per TB! Even scaled up to 60TB, the HP MSA solution doesn’t manage to get much below £500/TB.

That’s why I was quite interested to read “Petabytes on a budget” on the Backblaze blog, where they show how they build custom rack-mountable storage devices, each capable of storing 67TB, and manage to get the cost down to $117 (£72)/TB, a very small premium on the cost of the disks themselves.

Now, I dare say most bioinformatics readers won’t fancy building this themselves at work: it would take some time to source the components and build it all from scratch. Plus you won’t get any fancy disk configuration utilities, and you don’t get high-I/O block-level access via Fibre Channel or iSCSI, only networked access over HTTP. But these may be luxuries we can’t afford when Solexa machines are routinely chucking out 90 gigabases of nucleotide read data per run (probably 200 gigabytes uncompressed on disk with quality scores).

The article also helps answer the commonly posed question “so why don’t we just store our data in the cloud?” The graph it provides goes some way towards explaining why Amazon S3 hasn’t taken over just yet.
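
For anyone who wants to redo the sums, here is a minimal Python sketch of the cost-per-terabyte comparison. It uses only the figures quoted in this post. The function and variable names are my own, the SAN quote is assumed to be a flat £10,000 (the real quotes came in slightly under that), and 1TB is taken as 1,000GB, so treat the output as a rough guide rather than exact pricing.

```python
# Rough cost-per-terabyte comparison using figures quoted in the post.
# All names here are illustrative; the prices are approximations.

def cost_per_tb(total_cost_gbp, capacity_tb):
    """Return the cost in pounds per terabyte of capacity."""
    return total_cost_gbp / capacity_tb

def runs_per_device(capacity_tb, run_size_gb):
    """How many sequencing runs fit on a device (assumes 1TB = 1000GB)."""
    return int(capacity_tb * 1000 // run_size_gb)

bare_disk = cost_per_tb(54, 1)        # a single 1TB drive at £54
san_12tb = cost_per_tb(10_000, 12)    # assumption: flat £10,000 for 12TB
hp_msa_60tb = 500.0                   # approx. £/TB quoted for 60TB
backblaze_pod = 72.0                  # £/TB quoted from the Backblaze post

options = [
    ("Bare 1TB disk", bare_disk),
    ("12TB vendor SAN", san_12tb),
    ("60TB HP MSA", hp_msa_60tb),
    ("Backblaze pod", backblaze_pod),
]

for label, price in options:
    # Show each option's price and its multiple of the bare-disk cost.
    print(f"{label:16s} £{price:6.0f}/TB  ({price / bare_disk:4.1f}x bare disk)")

# Number of ~200GB Solexa runs that would fit on one 67TB pod.
pod_runs = runs_per_device(67, 200)
print(f"Runs per 67TB pod: {pod_runs}")
```

On these numbers the vendor SAN costs roughly fifteen times the bare disk price per terabyte, while the Backblaze design comes in at well under one and a half times.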