Researchers Jim Gray, Wyman Chong, Tom Barclay, Alex Szalay, and Jan vandenBerg, of Microsoft Research and Johns Hopkins University, describe an approach to moving very large datasets that they call “TeraScale SneakerNet.” The method uses inexpensive disk storage to transfer terabyte-scale datasets economically, particularly for backup, archiving, and data exchange.
The team argues that shipping physical storage media by parcel post is more cost-effective than wide-area networking at current prices. The Sloan Digital Sky Survey, a major astronomical survey, uses this method to move large datasets both within the United States and to sites in Europe and Asia. The approach is built around “storage bricks”: compact, self-contained units with a GHz processor, a GB of RAM, Gbps Ethernet, and a TB of disk storage. Each brick costs approximately $2,000 and functions as a database server on a local area network (LAN).
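The economic argument is easier to see as effective bandwidth. The sketch below is a back-of-the-envelope calculation, not a reproduction of the paper's figures: the 24-hour transit time and the 100 Mbps link are illustrative assumptions, and the function names are hypothetical.

```python
def sneakernet_throughput_mbps(capacity_tb: float, transit_hours: float,
                               handling_hours: float = 0.0) -> float:
    """Effective throughput, in megabits per second, of shipping a disk.

    capacity_tb    -- data on the brick, in decimal terabytes (1 TB = 1e12 bytes)
    transit_hours  -- door-to-door shipping time (assumed, not from the paper)
    handling_hours -- extra time to load and unload the brick (assumed)
    """
    total_bits = capacity_tb * 1e12 * 8
    total_seconds = (transit_hours + handling_hours) * 3600
    return total_bits / total_seconds / 1e6


def network_transfer_hours(capacity_tb: float, link_mbps: float) -> float:
    """Hours needed to send the same data over a fully utilized link."""
    total_bits = capacity_tb * 1e12 * 8
    return total_bits / (link_mbps * 1e6) / 3600


if __name__ == "__main__":
    # Illustrative assumption: one 1 TB brick delivered in 24 hours.
    print(f"{sneakernet_throughput_mbps(1.0, 24.0):.1f} Mbps effective")
    # Illustrative assumption: a dedicated 100 Mbps wide-area link.
    print(f"{network_transfer_hours(1.0, 100.0):.1f} hours over the network")
```

Under these assumptions, a single overnight shipment behaves like a sustained link of roughly 90 Mbps, and the figure scales linearly with the number of bricks in the box.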
The storage bricks are loaded with data at one site and then physically transported to the destination site, where the data is read and used. The researchers present the method as both economical and reliable, with data integrity and availability maintained throughout the process, which makes it a viable option for large-scale data management.
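One way a receiving site might confirm that a shipped brick arrived intact is to compare file checksums against a manifest built before shipping. The following is a minimal sketch under that assumption; it is not the authors' procedure, and `build_manifest` and `verify_manifest` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """At the sending site: map each file's relative path to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """At the receiving site: list files that are missing or whose digest differs."""
    problems = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or file_sha256(target) != expected:
            problems.append(relative_path)
    return problems
```

The manifest itself is small, so it could travel separately from the brick (for example by email), giving the receiver an independent reference to check against.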
The paper covers the design of the storage bricks, their economics, and the software issues that arise from their use. By analyzing both the hardware and software components, the researchers show how TeraScale SneakerNet can be applied in data-intensive settings. For organizations that need to move large datasets, the approach offers a practical and economical alternative to network transfer.

