BigDAWG Polystore: programmer productivity for complex, heterogeneous big data applications

Thursday, September 22, 2016 - 7:00pm
32-G449 (Kiva)
Tim Mattson, Intel

If every algorithm looked like "map reduce" and all data naturally fit a single data store, solving Big Data problems would be straightforward. The real world, however, is not so simple. Most big data problems require complex analytics over data that is spread out among multiple data stores. Current technology could be force-fit to address these problems, but only by sacrificing programmer productivity.

Research at the Intel Big Data Science and Technology Center (based at MIT with support from four other universities) is addressing this problem. Our central idea is a concept we call "polystore". In a polystore system, multiple database systems with potentially different data models are exposed to the programmer through a single framework. Middleware supports location transparency and semantic completeness through a uniform interface. Our reference implementation for this concept is the BigDAWG stack (Big Data Analytics Working Group). In this talk, we will discuss the motivations and vision for BigDAWG, the current state of its architecture, and the progress we have made in implementing it, and we will highlight the major challenges that lie ahead of us.

An overview of some of t