Data Anywhere Project: Hackathon Day Two
Public data is often available only in scattered pieces, while the aggregated versions are held by companies that want to charge for them. Other data may be free in aggregate form but not available for live query or access. This project aims to solve both problems, one data set at a time.
On day two of the hackathon, the Data Anywhere project used an Occupy Sandy data set from Staten Island to set up a simple, self-replicating database and simple scrapers on several virtual machines hosted on a new CUNY server running Fedora.
Although it is just taking off, the Data Anywhere project has the potential to help many organizations. If one machine is shut down, no data is permanently lost, because the data set has already replicated itself to several other machines. These servers can aggregate any type of data and make it accessible to the public at large through a simple RESTful web interface.
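To make the idea of a simple RESTful interface concrete, here is a minimal sketch of a read-only endpoint using only the Python standard library. It is not the project's actual code: the `/records` URL layout, the `route` and `serve` names, and the placeholder records are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Placeholder records (hypothetical; not taken from the real data set).
RECORDS = {
    "1": {"id": "1", "site": "example site A", "need": "blankets"},
    "2": {"id": "2", "site": "example site B", "need": "water"},
}


def route(path):
    """Map a GET path to a (status, payload) pair.

    Assumed layout: /records lists everything, /records/<id> returns one record.
    """
    parts = [p for p in urlsplit(path).path.split("/") if p]
    if parts == ["records"]:
        return 200, list(RECORDS.values())
    if len(parts) == 2 and parts[0] == "records" and parts[1] in RECORDS:
        return 200, RECORDS[parts[1]]
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Resolve the path, then send the result back as JSON.
        status, payload = route(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and then requesting `http://127.0.0.1:8000/records/1` would return the first placeholder record as JSON. Keeping the routing in a plain function, separate from the handler, means the same logic could sit in front of whichever replicated database a given machine happens to run.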
The longer-term vision of this project is to use the data in a Freakonomics-style analysis, comparing seemingly disparate data sets, chronologically at first, though they could be compared along any index.
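A chronological comparison like this could start by lining up otherwise unrelated data sets on a shared date key. The sketch below is only an illustration: the `align_by_date` helper, the series names, and the numbers are hypothetical, and it assumes dates are stored as ISO strings (YYYY-MM-DD) so that sorting them as text also sorts them by time.

```python
from collections import defaultdict


def align_by_date(*named_series):
    """Merge (name, {date: value}) pairs into rows ordered by date.

    Dates are assumed to be ISO strings, so text order equals time order.
    """
    merged = defaultdict(dict)
    for name, series in named_series:
        for date, value in series.items():
            merged[date][name] = value
    return [(date, merged[date]) for date in sorted(merged)]


# Placeholder values, purely for illustration.
requests = {"2012-11-01": 3, "2012-11-02": 5}
deliveries = {"2012-11-02": 7}

rows = align_by_date(("requests", requests), ("deliveries", deliveries))
for date, values in rows:
    print(date, values)
```

Swapping the date for another shared key, such as a ZIP code, would give the same kind of side-by-side view along a different index.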
Code to get your own site up and running can be found on Data Anywhere’s GitHub page.