"Charles Curran, a physicist who recently retired as the longtime storage consultant at CERN, remembers the old days of data access, when filling a request from a researcher was often a labor-intensive, daylong misadventure.
In the 1970s, information from CERN’s accelerators and experiments was stored on tapes, held in a huge library in the IT department, originally retrieved manually by operators and then copied to disk for the researcher. Overworked operators fell asleep, went missing for hours at a time, invented trickery to make the machines work faster, and overloaded the conveyor belts, causing tapes to fall off and disappear. Tape-retrieval robots squared off against mice (in one documented case, the mouse was found months later, desiccated) or overheated when they couldn’t reach tapes, melting their wheels in frustration. A request to see a certain tape often took 24 hours to fill.
Now the wait is about two minutes, hardly enough time to get a cup of coffee."
***"The fundamentals of grid computing, first developed to enable complex physics projects, have led to a related technology known as cloud computing (heavily virtualized distributed computing), which has been adopted for many commercial applications. The public may not know they are using a cloud, but they are. Online banking, photo-sharing sites like Flickr, and web-based email are all examples of heavily virtualized services that exist “out in the cloud.” Coming full circle, grid computing itself is adopting aspects of cloud technology, making more use of virtualization and setting up grid sites in the cloud. However, true grid infrastructure still excels at collaborative sharing of resources belonging to different institutions, whereas clouds spread the resources of one domain to the rest of the world for remote access. Collaboration is the basis of all the large-scale scientific challenges (e.g., CERN has 20 member states). Projects like the LHC are too big for any one organization or one country to do alone; collaboration is the only option. The same holds for the major challenges facing society across other disciplines (energy, climate change, food production). Now that we have excellent ways to reach and share data, we have a whole new set of problems, albeit more sophisticated ones. Who owns freely shared data? How long should it be kept? What besides the data must be kept so we can use it? Who pays for the energy to store data? How can researchers or disciplines resistant to sharing, afraid their ideas will be poached, be encouraged in a “publish or perish” world?"