What is it about?

cuJSON is a GPU-accelerated JSON parser designed for modern data workloads. JSON is everywhere (logs, APIs, clickstreams), but turning raw JSON text into usable structured data is often a slow step that can bottleneck analytics and machine learning pipelines. cuJSON uses the massive parallelism of GPUs to parse many JSON records at once, helping convert large JSON/JSONL datasets into structured outputs faster and enabling end-to-end workflows that keep data on the GPU for processing.

Why is it important?

Most systems treat JSON parsing as an unavoidable serial preprocessing step. cuJSON challenges that assumption by showing that JSON parsing can be parallelized effectively on GPUs. By accelerating conversion from raw JSON/JSONL text into structured data, cuJSON enables faster ETL, log analytics, and feature preparation, and helps close the gap between GPU-accelerated compute and CPU-limited data ingestion.

Perspectives

One thing that motivated this work was how often JSON parsing becomes the “silent” bottleneck in real pipelines: everything else can be GPU-accelerated, but ingestion still stalls on the CPU. Building cuJSON was our attempt to close that gap and make GPU-native data processing more end-to-end. I hope this paper helps practitioners treat parsing as a first-class performance problem, and encourages more work on GPU-accelerated data formats and queries, not just GPU compute kernels.

Soroosh Safari Loaliyan
University of California, Riverside

Read the Original

This page is a summary of: cuJSON: A Highly Parallel JSON Parser for GPUs, December 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3760250.3762222.
