All Stories

  1. Durable Engines of Discovery
  2. Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes
  3. Task-Based Polar Decomposition Using SLATE on Massively Parallel Systems with Hardware Accelerators
  4. GPU-based LU Factorization and Solve on Batches of Matrices with Band Structure
  5. Using Additive Modifications in LU Factorization Instead of Pivoting
  6. A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines
  7. Using Advanced Vector Extensions AVX-512 for MPI Reductions
  8. Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications
  9. Load-balancing Sparse Matrix Vector Product Kernels on GPUs
  10. Guest editors’ note: Special issue on clusters, clouds, and data for scientific computing
  11. Massively Parallel Automated Software Tuning
  12. PLASMA
  13. Big data and extreme-scale computing
  14. A look back on 30 years of the Gordon Bell Prize
  15. GPU-accelerated co-design of induced dimension reduction
  16. Exascale computing and big data