What is it about?
N-body algorithms calculate the interactions between n different bodies to obtain their trajectories. The n-body problem occurs in various fields of science, e.g., astrophysics, molecular dynamics, plasma physics, and fluid dynamics. Algorithms that solve the n-body problem can exploit significant amounts of parallelism. Today, CPUs are commonly used alongside GPUs to execute parallel algorithms. However, targeting several hardware platforms at once often requires different programming languages, such as CUDA for NVIDIA GPUs or HIP for AMD GPUs. In this work, we have implemented a naive n-body algorithm with quadratic complexity and the tree-based Barnes-Hut algorithm, which uses approximations to speed up the computation. To target CPUs and GPUs with the same programming language, our implementation uses SYCL. SYCL 2020, developed by the Khronos Group, is a standard that acts as an abstraction layer for parallel programming on heterogeneous systems while using only standard ISO C++. We compare both algorithms on heterogeneous hardware platforms and across different SYCL implementations with respect to their runtime behavior and their support for several performance optimizations. Our results show that some optimizations behave unexpectedly under different SYCL implementations. Furthermore, optimizations initially designed for GPUs also improved performance on CPUs. Finally, although data center GPUs have a clear performance advantage for the naive algorithm, consumer GPUs surprisingly offer competitive runtimes for the Barnes-Hut algorithm.
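To make the quadratic cost of the naive approach concrete, the following is a minimal serial sketch in plain C++ of an all-pairs gravitational acceleration computation. It is not the SYCL code from the paper: the `Body` struct, the `compute_accelerations` function, the gravitational constant `G` defaulting to 1, and the `softening` parameter are all assumptions made for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical body representation (illustrative, not taken from the paper).
struct Body {
    double x, y, z;  // position
    double mass;
};

// All-pairs acceleration: every body interacts with every other body,
// so the work grows as O(n^2). G and softening are assumed parameters;
// softening avoids a division by zero when two bodies coincide.
std::vector<std::array<double, 3>> compute_accelerations(
    const std::vector<Body>& bodies, double G = 1.0, double softening = 1e-9)
{
    std::vector<std::array<double, 3>> acc(bodies.size(), {0.0, 0.0, 0.0});
    for (std::size_t i = 0; i < bodies.size(); ++i) {
        for (std::size_t j = 0; j < bodies.size(); ++j) {
            if (i == j) continue;  // a body exerts no force on itself
            const double dx = bodies[j].x - bodies[i].x;
            const double dy = bodies[j].y - bodies[i].y;
            const double dz = bodies[j].z - bodies[i].z;
            const double dist2 = dx * dx + dy * dy + dz * dz + softening * softening;
            const double inv_dist3 = 1.0 / (std::sqrt(dist2) * dist2);
            const double s = G * bodies[j].mass * inv_dist3;
            acc[i][0] += s * dx;
            acc[i][1] += s * dy;
            acc[i][2] += s * dz;
        }
    }
    return acc;
}
```

Each iteration of the outer loop is independent of the others, which is the parallelism the summary refers to; a parallel version would plausibly assign one body per work-item. Barnes-Hut replaces the inner loop with a tree traversal that approximates distant groups of bodies by a single aggregate mass.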
Read the Original
This page is a summary of: Comparing a Naive and a Tree-Based N-Body Algorithm using Different Standard SYCL Implementations on Various Hardware, November 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3624062.3624604.