Apache Calcite

  • A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources
  • Edmon Begoli, Jesús Camacho-Rodríguez, Julian Hyde, Michael J. Mior, Daniel Lemire
  • May 2018, ACM (Association for Computing Machinery)
  • DOI: 10.1145/3183713.3190662

An open source framework for heterogeneous query optimization

Photo by Campaign Creators on Unsplash

What is it about?

Apache Calcite provides many common pieces of a database system that can be used separately or combined as needed to build complete data processing systems. This can save significant development time, and Calcite's extensible nature makes it easy to add further optimizations and connect to new data sources. Since the project is open source, any aspect of the system can be modified as needed.
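To make the idea of reusable, pluggable pieces concrete, here is a minimal toy sketch in Python. It is not Calcite's API (Calcite itself is written in Java), and every name in it (Scan, Filter, PushedScan, ListAdapter, push_filter_into_scan, execute) is hypothetical and chosen only for illustration. It assumes a data source is wrapped in an "adapter" object and that an optimization is a rule that rewrites a small query plan, here by handing a filter to an adapter that can evaluate it itself.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

Row = Dict[str, object]


# Hypothetical plan nodes for this sketch; these are NOT Calcite classes.
@dataclass(frozen=True)
class Scan:
    source: str  # name of a registered adapter


@dataclass(frozen=True)
class Filter:
    child: object  # another plan node
    column: str
    value: object


@dataclass(frozen=True)
class PushedScan:
    source: str  # a scan whose equality filter is evaluated by the adapter
    column: str
    value: object


class ListAdapter:
    """Hypothetical adapter over an in-memory list that can filter natively."""

    supports_filter = True

    def __init__(self, rows: Iterable[Row]) -> None:
        self.rows: List[Row] = list(rows)

    def scan(self) -> List[Row]:
        return list(self.rows)

    def scan_where(self, column: str, value: object) -> List[Row]:
        return [r for r in self.rows if r.get(column) == value]


def push_filter_into_scan(plan, adapters):
    """Rewrite rule: Filter(Scan) becomes PushedScan if the adapter can filter."""
    if isinstance(plan, Filter) and isinstance(plan.child, Scan):
        adapter = adapters[plan.child.source]
        if getattr(adapter, "supports_filter", False):
            return PushedScan(plan.child.source, plan.column, plan.value)
    return plan  # rule does not apply; leave the plan unchanged


def execute(plan, adapters) -> List[Row]:
    """Evaluate a plan against the registered adapters."""
    if isinstance(plan, Scan):
        return adapters[plan.source].scan()
    if isinstance(plan, PushedScan):
        return adapters[plan.source].scan_where(plan.column, plan.value)
    if isinstance(plan, Filter):
        rows = execute(plan.child, adapters)
        return [r for r in rows if r.get(plan.column) == plan.value]
    raise TypeError(f"unknown plan node: {plan!r}")


adapters = {
    "emps": ListAdapter([
        {"name": "Ada", "dept": "eng"},
        {"name": "Bob", "dept": "ops"},
    ])
}
plan = Filter(Scan("emps"), "dept", "eng")
optimized = push_filter_into_scan(plan, adapters)
```

In this sketch the original and the rewritten plan return the same rows; the rewritten one simply lets the data source do the filtering. Connecting a new source would mean writing another adapter class, and adding an optimization would mean writing another rewrite function, which is the kind of extension point the paragraph above describes.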

Why is it important?

Apache Calcite is used by a number of high-profile companies and projects including Uber, Alibaba, and Apache Hive. Calcite is also useful to researchers who prototype query optimization techniques or build data integration platforms, because its reusable components can save a significant amount of time when building a complete query processing system.

Perspectives

Assistant Professor Michael J. Mior
Rochester Institute of Technology

I started working on the Apache Calcite project as a PhD student when I found it to be a system relevant to my own research. Within a couple of years, after being welcomed by the community, I ended up serving as chair of the project for a year. Since there were now a couple of academics deeply involved in the project, we decided we wanted a publication to showcase the usefulness of Calcite in research to the academic community.

Read Publication

http://dx.doi.org/10.1145/3183713.3190662

The following have contributed to this page: Assistant Professor Michael J. Mior