What is it about?

Low-cost arithmetic is fundamental for embedded deep learning. We provide very efficient arithmetic units for the design of deep neural networks in embedded FPGAs

Featured Image

Why is it important?

Hardware architectures for deep learning computing have thousands of cores. Each of these cores has basic parallel multiply-accumulation units. The efficient design of these units determines the hardware solution's final cost and the throughput performance. We were able to design very efficient fused multiply-add units that improve the area and performance of dot-product units.


This work opens new perspectives on applying FPGAs in the design of deep learning models.

Mário Véstias

Read the Original

This page is a summary of: Efficient Design of Low Bitwidth Convolutional Neural Networks on FPGA with Optimized Dot Product Units, ACM Transactions on Reconfigurable Technology and Systems, December 2022, ACM (Association for Computing Machinery),
DOI: 10.1145/3546182.
You can read the full text:



The following have contributed to this page