What is it about?
SEDAR is an open-source semantic data lake built on big data technology. We establish a semantic layer that annotates data sources with concepts from knowledge graphs. A pipeline of components assists the user in creating semantic models. These semantic models are then used for ontology-based data access (OBDA), a mechanism for querying the different underlying storage systems in the lake uniformly.
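To make the OBDA idea concrete, here is a minimal Python sketch. It is not SEDAR's implementation or API: the `Source` class, the `query` function, the sample records, and the `ex:` concept names are all hypothetical and chosen only for illustration. Each source keeps its own column names, a small semantic model maps those columns to shared concepts, and a query is phrased against the concepts rather than the columns.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A toy data source (hypothetical, not part of SEDAR)."""
    name: str
    rows: list      # records as dicts, using the source's own column names
    mapping: dict   # semantic model: column name -> ontology concept


def query(sources, concepts):
    """Return records from every source that covers all requested concepts.

    Each result is keyed by concept, so callers never see source columns.
    """
    results = []
    for src in sources:
        # Invert the semantic model: concept -> column in this source.
        concept_to_col = {concept: col for col, concept in src.mapping.items()}
        if not all(c in concept_to_col for c in concepts):
            continue  # this source cannot answer the query
        for row in src.rows:
            results.append({c: row[concept_to_col[c]] for c in concepts})
    return results


# Two sources describing the same kind of data with different schemas.
csv_source = Source(
    name="weather_csv",
    rows=[{"city": "Aachen", "temp_c": 12.5}],
    mapping={"city": "ex:City", "temp_c": "ex:Temperature"},
)
json_source = Source(
    name="sensor_json",
    rows=[{"location": "Cologne", "temperature": 14.0}],
    mapping={"location": "ex:City", "temperature": "ex:Temperature"},
)

answers = query([csv_source, json_source], ["ex:City", "ex:Temperature"])
for record in answers:
    print(record)
```

The same concept-level query reaches both sources even though their schemas differ, which is the uniformity that OBDA aims for. A real system would translate such queries into the native query language of each storage backend instead of scanning in-memory records.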
Why is it important?
SEDAR is the first system to combine semantic modelling, scalable data management, and OBDA in one uniform system. It also provides further features such as MLOps, source-independent ingestion, metadata extraction, and a data catalog, and is intended to serve as a benchmark for future work on data lakes.
Read the Original
This page is a summary of: SEDAR: A Semantic Data Reservoir for Heterogeneous Datasets, October 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3583780.3614753.