Swift: High Parallelism Program Generation of Tensor Operators for Accelerating Deep Learning Inference

Xiyue Yu; Jun Bi; Yuanbo Wen; Jianxing Xu; Di Huang; Jiaming Guo; Wei Li; Zidong Du; Jing Li; Tianshi Chen; Qi Guo

doi:10.1145/3762660

What is it about?

Swift is a compilation framework that generates highly optimized GPU code to accelerate deep learning inference, particularly for small-batch workloads. It builds a large search space that combines conventional tiling with reduction parallelization, then explores that space efficiently to find near-optimal programs that maximize hardware utilization.
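To make "reduction parallelization" concrete, here is a minimal sketch in plain Python, not Swift's actual code generator. It assumes a matrix multiplication as the tensor operator; the function names `matmul_sequential_reduction` and `matmul_split_reduction` and the `splits` parameter are hypothetical, chosen only for illustration. The idea is that the summation axis is cut into chunks whose partial results do not depend on each other, so on a GPU each chunk could be handed to a different group of threads before the partial results are combined.

```python
def matmul_sequential_reduction(a, b):
    """Baseline: every output element is one sequential sum over the k axis."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def matmul_split_reduction(a, b, splits):
    """Split the reduction (k) axis into `splits` chunks.

    Each chunk produces an independent partial result, so the chunks
    could run in parallel; a final step adds the partial results together.
    """
    m, k, n = len(a), len(b), len(b[0])
    chunk = -(-k // splits)  # ceiling division
    partials = []
    for s in range(splits):
        lo, hi = s * chunk, min((s + 1) * chunk, k)
        partials.append(
            [[sum(a[i][p] * b[p][j] for p in range(lo, hi)) for j in range(n)]
             for i in range(m)]
        )
    # Combine step: element-wise sum of all partial results.
    return [[sum(part[i][j] for part in partials) for j in range(n)]
            for i in range(m)]


# A batch-size-1 style input: one row, so tiling the output alone
# exposes very little independent work.
a = [[1, 2, 3, 4, 5, 6]]
b = [[1, 0], [0, 1], [1, 1], [2, 0], [0, 2], [1, 1]]
print(matmul_sequential_reduction(a, b))
print(matmul_split_reduction(a, b, splits=3))
```

Both calls return the same matrix; only the amount of independent work differs. How many chunks to use, and how to combine that choice with tile sizes, is the kind of decision a search over program variants has to make.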


Why is it important?

Small-batch inference suffers from high latency because current tools fail to fully utilize GPUs, mainly because reduction operations are difficult to parallelize. Swift addresses this problem directly, and the resulting speedups lower operational costs, improve real-time performance, and allow powerful AI models to run on a wider range of devices.
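A rough back-of-the-envelope sketch of the utilization problem follows. All numbers are illustrative assumptions, not measurements from the paper: the operator shape, the tile sizes, the split factor, and `NUM_SMS` (the number of streaming multiprocessors on a hypothetical GPU) are made up for the example, as is the helper name `independent_blocks`.

```python
import math


def independent_blocks(m, n, tile_m, tile_n, reduction_splits=1):
    """Count independent work units for an (m x n) output.

    Tiling alone yields one unit per output tile; splitting the
    reduction axis multiplies that count by `reduction_splits`.
    """
    output_tiles = math.ceil(m / tile_m) * math.ceil(n / tile_n)
    return output_tiles * reduction_splits


NUM_SMS = 108  # hypothetical GPU size, for illustration only

# Batch size 1: the output has a single row, so there are few tiles.
tiling_only = independent_blocks(m=1, n=4096, tile_m=16, tile_n=128)
with_split = independent_blocks(m=1, n=4096, tile_m=16, tile_n=128,
                                reduction_splits=4)

print(f"tiling only:      {tiling_only} blocks for {NUM_SMS} SMs")
print(f"with split of 4:  {with_split} blocks for {NUM_SMS} SMs")
```

Under these assumed numbers, tiling alone produces fewer work units than there are multiprocessors, so part of the GPU sits idle; splitting the reduction raises the count above that threshold.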

Perspectives

I think this work is valuable because it bridges the gap between high-performance hardware and low-computation workloads.
Xiyue Yu
University of Science and Technology of China

This page is a summary of: Swift: High Parallelism Program Generation of Tensor Operators for Accelerating Deep Learning Inference, ACM Transactions on Architecture and Code Optimization, December 2025, ACM (Association for Computing Machinery), DOI: 10.1145/3762660.

Contributors

The following have contributed to this page

Xiyue Yu
University of Science and Technology of China
