Published March 30, 2012
Have you heard about Big Data? No? Well, that’s okay, because Big Data has heard about you.
Big Data is the ultimate power behind such companies as Facebook, Google, and Twitter. Those services are free to use, but only because the companies are harvesting your user information--making use of Big Data.
Big Data is also the emerging power behind such traditional retail companies as Wal-Mart and Target; they can bring things to you cheaper and faster because they use Big Data to manage their supply chains--and also to target you, the customer, with a carefully crafted sales pitch.
And oh yes, Big Data also puts a new kind of “big” in Big Government, because bureaucrats will be able to monitor and analyze all these uses for Big Data. Indeed, Big Data will transform the 21st century as completely as steam power transformed the 19th century, or as electronics transformed the 20th century. Transform, that is, for better and for worse--but ultimately, unmistakably, for better.
Big Data is every piece of knowledge that has been or will be digitalized--that is, stored on a computer hard drive, in a database on a server, or in “the cloud.”
So how big is big?
Let’s compare present-day Big Data to past human history--hereafter to be known as the era of Small Data.
Google’s Eric Schmidt once calculated that all knowledge since the beginning of human history--all the books, documents, sacred texts, statistics, music, graffiti, everything worldwide--would amount to five exabytes of data, an exabyte being a quintillion bytes of computerized information. A quintillion is a billion billions, that is, a 1 with 18 zeroes after it.
That’s a decent enough amount of content creation, but it’s dwarfed by what’s being created these days: In 2011, the world generated 1.8 zettabytes of data, a zettabyte being a sextillion bytes. A sextillion is a thousand quintillions, which is to say, a 1 with 21 zeroes after it.
To put it another way, the website Mashable says that it would require 57.5 billion iPads, each with a 32-gigabyte memory--a gigabyte being a mere billion bytes--to store all that information.
And that’s just for 2011: Big Data is growing by 50 percent a year, and so 2012 will produce 2.7 zettabytes, and on and on.
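For readers who want to check the arithmetic, here is a minimal sketch in Python. It assumes decimal units as defined above (a gigabyte is a billion bytes, and so on) and takes the 50 percent growth rate at face value; the function names are hypothetical, chosen only for illustration.

```python
# Decimal byte units, as defined in the text above.
GIGABYTE = 10**9    # a billion bytes
EXABYTE = 10**18    # a quintillion bytes
ZETTABYTE = 10**21  # a sextillion bytes

IPAD_CAPACITY = 32 * GIGABYTE  # the 32-gigabyte iPad in Mashable's comparison


def ipads_to_zettabytes(ipad_count):
    """Total storage of `ipad_count` 32-gigabyte iPads, in zettabytes."""
    return ipad_count * IPAD_CAPACITY / ZETTABYTE


def projected_zettabytes(start_zb, growth_rate, years):
    """Compound a starting volume forward at a fixed annual growth rate."""
    return start_zb * (1 + growth_rate) ** years


# 57.5 billion iPads hold about 1.84 zettabytes, which rounds to the 1.8 quoted.
print(round(ipads_to_zettabytes(57_500_000_000), 2))

# 1.8 zettabytes is 360 times the five exabytes attributed to all prior history.
times_all_prior_history = (18 * ZETTABYTE // 10) // (5 * EXABYTE)
print(times_all_prior_history)

# Growing 50 percent a year from 1.8 zettabytes in 2011.
for years_after_2011 in (1, 2):
    print(2011 + years_after_2011,
          round(projected_zettabytes(1.8, 0.5, years_after_2011), 2))
```

The same compounding, if it held, would put 2013 at roughly four zettabytes; that figure is only an extrapolation of the stated growth rate, not a reported number.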
Yet, if data is to be useful, it has to be processed; that is, it has to be made intelligible. Infinities of data from insurance policy holders, from weather monitors, from traffic signals, from surveillance cameras, from credit-card transactions--all these zillions of bytes are just noise until someone makes sense of them and then figures out what to do with them.
In industry parlance, data spewing from everywhere across the planet is “unstructured,” and it has to be “structured” in order to be useful. That is why SQL, which stands for “structured query language,” has been developed and endlessly refined, so that humans can access this information and make use of it.
A computer, sped along by Moore’s Law, processing data at speeds measured in “teraflops”--that is, trillions of calculations per second--can solve brain-crushing problems in nanotime.
So, for instance, FedEx or UPS can track every package that runs through their respective systems, and yet it still takes a human to decide, on the basis of all that information, how often a FedEx airplane needs a new tire or when a UPS truck needs to change its delivery route.
The better use of Big Data was the topic of a March 21-22 conference in Manhattan, “Structure: Data,” hosted by GigaOm, a tech information service. As GigaOm’s Chris Albrecht told the group, humans have been structuring data, of course, since long before the computer.
Language itself is a way of structuring data--what we hear, what we know, what we say--out of the unstructured data of all the sounds we hear, from the chirp of a bird to the babble of a brook to the sound of music to the voice of another human. “We’ve been doing analytics all our lives,” Albrecht reminded us. “It’s all about bringing order out of chaos.”
Indeed, structured Big Data can bring order to the past, as well. To cite just one small example, the Montreux Jazz Festival in Switzerland has digitalized video and audio of all its performances going back to its beginnings in 1967. So today, nearly a half-century of shows is available to anyone online or on DVD.
Individuals, too, can make use of Big Data; photo services such as Picasa and Flickr, for instance, bring order to the chaos of a swelling photo collection, enabling users to tag images by date and location, even as they edit the photos online, for free.
Yet free comes at a price, or at least a consequence.
Remember that scene in the 2002 movie “Minority Report,” in which Tom Cruise’s character walks through a mall and the digitalized advertisements all recognize him and speak to him directly? That’s coming. In the meantime, if you want to use a service such as AroundMe or Foursquare to know where to get the best cup of coffee, the inevitable flip side is that Starbucks and Caribou--and everyone else with something to sell--will know exactly where you are, too.
Is that good or bad? Well, we report, you decide--although we should all understand that if you are part of any kind of digital network, the network will know you.
Such total information awareness is coming, not just for the benefit of consumers and marketers, but also for the use of homeland security enforcers.
Just this month we learned that a company in Japan, Hitachi Kokusai Electric, has developed a new system that enables a computer to scan 36 million faces a second, comparing each one to every other. Is that cool, or creepy? Maybe both. But at the next big world event--say, the London Olympics this summer--such scanning capacity could be a life-saving tragedy-preventer.
So the imperatives of homeland security alone will be enough to drive Big Data ever forward.
Earlier this month Wired magazine reported that the US government’s National Security Agency is building a new data center in rural Utah; the heavily fortified $2 billion facility should be operational in September 2013. As information pours into the center from all over the world, the supercomputers on site will track and analyze everything from phone calls to e-mails to parking receipts.
Wired writer James Bamford notes that many communications will be in code, and so cracking those codes will be another task for the center. Thus, it will be a battle of the big brains, the encryptors vs. the de-cryptors. As one intelligence veteran told Bamford, “Everybody with communications is a target.”
So that’s all of us.
Wait a second--is all this legal? Constitutional? It seems to be, at least according to the Obama administration. And for their part, Republicans don’t seem to be complaining, either. After all, it’s a dangerous world, and terrorists can be clever. And other countries, notably Iran, Russia, and China, can be even more clever. So we find ourselves in a new kind of arms race--a Big Data race.
What conclusion to draw from all this?
Is Big Data good or bad?
Actually, it’s both.
Big Data is like any tool or technology--it can be used any which way. A knife is a tool that can kill, but it can also cut food, carve a sculpture, and remove a tumor.
Similarly, a far more sophisticated tool, Big Data, can drive e-commerce, and e-production--and it can also be wielded by Big Brother. In other words, as with every other kind of tool or technology, we will have to learn to handle Big Data both carefully and productively.
That last word, “productively,” is key, because Big Data has the potential to advance the frontiers of human knowledge faster than we can say “teraflop.”
Yes, companies might use Big Data to sell us stuff--although there are worse things than being a well-informed consumer enjoying the lowest possible price. And yes, the US government might use Big Data to spy on us--although hopefully somebody, mindful of our liberties, is pushing back. And yes, enemies might use Big Data to harm us--although presumably our government can use its own Big Data to protect us.
History tells us, definitively, that we can do more good than bad with our discoveries. So long as we have the capacity to use our minds, and hearts, to harness the power of this wondrous new technology, it’s most likely that Big Data will be remembered as just another tool that humans used to achieve their full destiny--here on earth, throughout the solar system, and to the far reaches of the universe.
James P. Pinkerton is a writer and Fox News contributor. He is the editor/founder of the Serious Medicine Strategy blog.