What is it about?

SEDAR is an open-source semantic data lake built on big data technology. We establish a semantic layer that annotates data sources with concepts from knowledge graphs. A pipeline of components assists the user in creating semantic models. These models are then used to perform ontology-based data access (OBDA), a mechanism for querying the different underlying storage systems in the lake uniformly.

Why is it important?

SEDAR is the first to combine semantic modelling, scalable data management, and OBDA in one uniform system. Beyond that, it provides further features such as MLOps, source-independent ingestion, metadata extraction, and a data catalog, and is intended to serve as a benchmark for future work on data lakes.

Read the Original

This page is a summary of: SEDAR: A Semantic Data Reservoir for Heterogeneous Datasets, October 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3583780.3614753.
