How will we manage our overflowing hard drives?
Manufacturers of hard drive storage technologies are having a ball. Never before has market demand been so insatiable or the growth curve so steep. Everything is being digitised faster than we can cope with, and as a result drive capacities are growing and prices are falling.
What is the root cause? Legislation, regulation, security and entertainment.
Every major company is now saving all paper and electronic documents, including emails, letters and diaries, to satisfy the legal requirements of government and modern business regulation. In addition, security services and government departments are recording more than we need to know, and at home the personal collection of life-bits is escalating with digital cameras, movies, music, games and other forms of entertainment.
Then of course we have the sensor networks built into our transport, utility supply, infrastructure, climate monitoring and other systems.
Our ability to store all this data so easily is affecting every aspect of our lives. Look how far we've come already. Only four years ago my son built and installed our first terabyte (TB), or about 1,000GB, home storage server at a cost of $4,000. A month ago we installed the second TB for only $600. What a drop in price and physical size - two parallel drives instead of five.
I was reflecting on this topic with some students recently. I drew gasps from the class when I told them: "When I was just two years older than you I was purchasing Winchester Disk Drives at $20,000 for 20MB - and they were about the size of five stacked dinner plates - and we thought we had a lot of memory." That was just 31 years ago when there were no TB storage systems anywhere. It seems that Moore's Law has migrated to storage.
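For anyone who wants to check the Moore's Law comparison, here is a rough back-of-envelope sketch in Python. It uses only the prices and capacities quoted above; the decimal conversion (1TB = 1,000,000MB), the 31-year span and the names in the code are my own assumptions for illustration.

```python
import math

# Prices and capacities as quoted in the column.
# Assumption: decimal units, so 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000
YEARS_ELAPSED = 31  # Winchester purchase to the most recent terabyte


def cost_per_mb(price_usd: float, capacity_mb: float) -> float:
    """Return the price of one megabyte in US dollars."""
    return price_usd / capacity_mb


winchester = cost_per_mb(20_000, 20)        # 20MB Winchester drive
first_tb = cost_per_mb(4_000, MB_PER_TB)    # first home terabyte
second_tb = cost_per_mb(600, MB_PER_TB)     # second home terabyte

# How many times cheaper a megabyte has become, and how often
# the price would have to halve to get there in 31 years.
price_drop = winchester / second_tb
halvings = math.log2(price_drop)
years_per_halving = YEARS_ELAPSED / halvings

print(f"Winchester drive: ${winchester:,.2f} per MB")
print(f"First terabyte:   ${first_tb:.4f} per MB")
print(f"Second terabyte:  ${second_tb:.4f} per MB")
print(f"Price per MB fell by a factor of about {price_drop:,.0f}")
print(f"That is one halving roughly every {years_per_halving:.1f} years")
```

On these figures the price per megabyte has halved about every 18 months, which is the pace usually associated with Moore's Law.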
We are now within an ace of the TB PC and laptop. My guess is that by 2015 we will also be using around 10GB of RAM. You may think this ought to be enough for most individuals and businesses, but when you consider that movies may soon be distributed digitally at around 6 to 12GB each, you can see it all being eaten away. And how about all 26 million books in the Library of Congress? They will need around 1TB too.
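To put those numbers in perspective, a quick sketch of the arithmetic follows. It takes the movie sizes and the book count from the paragraph above and assumes decimal units (1TB = 1,000GB = 10^12 bytes); the variable names are illustrative only.

```python
# Figures quoted in the column.
# Assumption: decimal units, so 1 TB = 1,000 GB = 10**12 bytes.
GB_PER_TB = 1_000
BYTES_PER_TB = 10**12
MOVIE_SIZES_GB = (6, 12)
BOOKS = 26_000_000

# Whole movies that fit on one terabyte at each quoted size.
movies_per_tb = {size: GB_PER_TB // size for size in MOVIE_SIZES_GB}

# Average space available per book if the whole library fits in 1 TB.
bytes_per_book = BYTES_PER_TB / BOOKS

for size, count in movies_per_tb.items():
    print(f"{size}GB movies per TB: {count}")
print(f"Average budget per book: {bytes_per_book / 1000:.1f}KB")
```

So a terabyte holds somewhere between 83 and 166 such movies, and the 1TB estimate for the library works out at roughly 38KB per book on average.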
The technology to realise hard drives up to and beyond 100TB is emerging in the R&D labs whilst petabyte (PB), or about 1,000TB, systems have already been engineered at great expense for specialised applications. I see no limit on the horizon to the growth of information storage as we have yet to enter the quantum world of the really small.
In the near future your MP3 player will be able to store all the music ever recorded, and your entertainment centre will hold every movie ever made. It will also be possible to record every conversation and action through personal and wearable devices, not to mention all our medical records.
At this point I can see the moral philosophers reeling: should we store all this data just because we can, what will we and others do with it, and how will we keep it private and secure?
Then there are the entertainment executives wringing their hands. They will say: "This is worse than pirated MP3s, movies, games and software combined. What will happen to our business, our revenue streams, profitability, share value and jobs?"
But I think there is a more fundamental question we should be asking. We will be faced with an infinite and distributed sea of bits - much of which will not be itemised, catalogued, categorised or labelled in any way. Moreover, there will be duplication and storage corruption on a global scale. The really big question is: how the heck are we going to identify and locate anything in this mostly uncharted sea?
Will today's search techniques fit the bill? I think not - they are far too crude. We are going to need much higher degrees of sophistication to the point of machine cognition in recognising scenes, contexts and relationships in our data. The first machine capable of realising such a capability will probably arrive around 2010 in the form of an adaptive supercomputer and, given our rate of progress, on a PC scale by 2030 or earlier.
Now for the kicker: I cannot see how a fully centralised or fully distributed system spanning the global network will satisfy our future needs. Seems to me an agent-based hybrid scheme is the only contender that will do. Right now I don't see how we are going to do that but artificial life plus artificial intelligence in an evolvable form seem to be strong contenders.
I can hear the sceptics now: what a waste! Well, how about accessing all the medical records of the entire global population along with their dietary, exercise, work and leisure profiles? Can you imagine the rich goldmine of true, rather than politically correct or mistakenly assumed, correlations we will discover? For sure 99.99 per cent of the bits stored will be a waste but we cannot guess which.
To date IT has been to the world of medicine, transport, technology, food and mass production what electricity has been to IT. If we lose IT, all progress stops and civilisation as we know it collapses. There are those who choose to write off IT and look toward the newer sciences of biotech, nanotech and materials for the future. Increasingly I see it as the same thing - they are all inseparable. To try and categorise them as entirely different is about as dumb as thinking that chemistry, physics and biology are separate sciences - in reality there is just science.
What is totally new, and not previously evolved by nature, is knowledge. Getting from a sea of bits to information, and then to knowledge, is a whole new ball game - one we have not even started to play yet.
Written on BA flight 49 from LA to Seattle after a double disc back-up session at home. Rewritten on UA flight 6342 between Seattle and Portland and despatched to silicon.com via a free hotel Wi-Fi service the same day.