Published April 02, 2012
A telescope so massive that it spans a continent won't be any better than a pair of binoculars unless you can find a way to carry and sift through its data.
The Square Kilometer Array (SKA) is planet Earth’s next big science project. It won’t be operational until the next decade, and the group planning it hasn’t even settled on a location yet, having shortlisted South Africa and Australia. But they’ve already hit a problem: On its own, it will generate as much data as the entire Internet carries on a typical day.
What do you do with 1 billion gigabytes of data?
IBM on Monday announced Project Dome, a 32.9 million euro, five-year collaboration with Astron, the Netherlands Institute for Radio Astronomy, toward "exascale" computing. The pair will work together to design new technology to analyze data from the massive telescope, which will effectively double the amount of data carried on the Internet.
And they’re in a rush to find the answers, said Ronald Luijten, of IBM research in Zurich.
“We only have four years before we have to begin building the hardware,” he told FoxNews.com. “We know what needs to be done, we know how it needs to be done. But today we don’t know how to build it in an economic way.”
The technical challenges faced by the immense science project are mind-boggling. After all, it will create 1 exabyte of data every day -- 1 million terabytes, or 10^18 bytes, enough raw data to fill more than 15 million 64GB iPods. How to carry it? How to sift through it? Where to store it?
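For readers who want to check those figures, here is a rough back-of-the-envelope sketch in Python. It assumes decimal (SI) units, so that 1 gigabyte is 10^9 bytes, and a perfectly steady flow of data around the clock; the function and variable names are illustrative only and do not come from IBM or Astron.

```python
# Back-of-the-envelope check of the article's data figures.
# Assumption: decimal (SI) units, i.e. 1 GB = 10**9 bytes.
GIGABYTE = 10**9
TERABYTE = 10**12
EXABYTE = 10**18
SECONDS_PER_DAY = 24 * 60 * 60

# The article's headline figure: 1 exabyte of raw data per day.
DAILY_BYTES = 1 * EXABYTE


def daily_volume_summary(daily_bytes, ipod_gb=64):
    """Express a daily data volume in more familiar units.

    ipod_gb is the capacity of one music player in gigabytes
    (64 GB in the article's comparison).
    """
    return {
        "terabytes": daily_bytes // TERABYTE,
        "gigabytes": daily_bytes // GIGABYTE,
        "ipods": daily_bytes / (ipod_gb * GIGABYTE),
        # Assumption: data arrives at a constant rate all day long.
        "terabytes_per_second": daily_bytes / SECONDS_PER_DAY / TERABYTE,
    }


summary = daily_volume_summary(DAILY_BYTES)
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

Under those assumptions, 1 exabyte works out to 1 million terabytes, 1 billion gigabytes and a little over 15.6 million 64GB iPods. Spread evenly over 24 hours, it also means moving more than 11 terabytes every second, which gives a sense of why carrying the data is a problem in its own right.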
“If you take the current global daily Internet traffic and multiply it by two, you are in the range of the data set that the Square Kilometer Array radio telescope will be collecting every day,” said Ton Engbersen, IBM Research -- Zurich. “This is Big Data analytics to the extreme.”
The telescope will generate that data from thousands of receptors, spaced roughly 1 kilometer apart and linked across an entire continent. Its 3,000 dishes, each 50 feet wide, will be arranged in five spiral arms like a galaxy, extending at least 1,860 miles (3,000 kilometers) from a central core -- about the distance from New York City to Albuquerque, N.M.
To carry around the staggering amount of information this beast will generate, IBM will likely turn to fiber optic cables and advanced photonics, Luijten said.
“We’re going to look to see if photonics can be used in a much more innovative way. Electromagnetic waves must be converted to an optical signal for transmission to fibers. Can we sample those electromagnetic waves directly in optical form?” he told FoxNews.com.
Moving data around is one challenge; powering up radio dishes set in the remote reaches of Australia or Africa, where there are neither data nor power lines, is another problem.
“It can’t be so expensive that no one can afford to turn on the instrument,” he joked. And off-the-shelf computer chips -- which are often lumped together by the hundreds or even thousands to power supercomputers -- simply won’t work.
“We don’t expect that with commodity CPUs we can actually do a solution that will be good enough from a power viewpoint,” Luijten told FoxNews.com. One solution may lie in dedicated signal processors, chips from companies like Nvidia that were designed initially for graphics but are very effective at math.
The other solution is so simple, you have to wonder why no one thought of it already: Build upwards.
“We can stack almost a hundred of these chips on top of each other,” Chris Sciacca, a spokesman for IBM Zurich, told FoxNews.com.
“Ninety-eight percent of energy is consumed in moving data within high-end servers today,” Sciacca said. One way to solve that challenge is moving from a 2D to a 3D world, putting chips not next to each other on a circuit board, but stacking them like an Oreo cookie.
The challenge of Big Data is common to science; in fact, it’s a problem already addressed by one of the world’s other biggest science projects. The Large Hadron Collider (LHC), a giant atom smasher run by CERN in Europe, is already operating -- and creates tremendous rafts of data. It’s not quite the same thing, Luijten said.
“At CERN, they create many big bangs within their proton accelerator … [whereas] the SKA guys are looking at the real thing,” Luijten pointed out. But both science projects faced similar challenges. Why not reuse CERN’s solution?
“They have a similar type of issue,” he admitted. “But, for instance, the amount of data produced at CERN is 15 petabytes per year, or 10 to 100 times less than what the SKA will produce.”
The LHC also operates in a convenient 17-mile ring straddling the French-Swiss border. The SKA will be spread across an entire continent to boost precision and sensitivity.
“They don’t have this global networking issue,” Luijten told FoxNews.com.
If everything comes together, the project has incredible potential, helping scientists gain a fundamental understanding of what happened at the Big Bang, about 13.7 billion years ago. CERN’s atom smasher aims to recreate those conditions. The SKA aims to study them: The signals created at the Big Bang are still traveling at light speed through the universe, after all.
“This is a very exciting opportunity -- probably a once-in-a-lifetime opportunity. It will have a fundamental impact on what we understand about how the universe was formed,” he said.