All Stories

  1. LUTMUL: Exceed Conventional FPGA Roofline Limit by LUT-based Efficient Multiplication for Neural Network Inference
  2. Efficient Neural Networks on the Edge with FPGAs by Optimizing an Adaptive Activation Function
  3. Artifact Evaluation for ACM TRETS Papers Submitted from the FPT Journal Track
  4. FPGA-aware automatic acceleration framework for vision transformer with mixed-scheme quantization
  5. Strategies and Demonstration to Support Multiple Wireless Protocols with a Single RF Front-End
  6. A Novel Physical Layer Authentication With PAPR Reduction Based on Channel and Hardware Frequency Responses
  7. Real Time Receiver Baseband Processing Platform for Sub 6 GHz PHY Layer Experiments
  8. SIFO: Secure Computational Infrastructure Using FPGA Overlays
  9. QuTiBench
  10. Garbled Circuits in the Cloud using FPGA Enabled Nodes
  11. Detection of Different Wireless Protocols on an FPGA with the Same Analog/RF Front End
  12. High-Level and Compact Design of Cross-Channel LTE DownLink Channel Encoder
  13. FINN-R
  14. Local and Global Shared Memory for Task Based HPC Applications on Heterogeneous Platforms
  15. Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic
  16. Accelerating big data applications using lightweight virtualization framework on enterprise cloud
  17. FPGA modeling techniques for detecting and demodulating multiple wireless protocols
  18. FIM: Performance Prediction for Parallel Computation in Iterative Data Processing Applications
  19. Secure Function Evaluation Using an FPGA Overlay Architecture
  20. A Framework for Developing Parallel Applications with high level Tasks on Heterogeneous Platforms
  21. Using High Level GPU Tasks to Explore Memory and Communications Options on Heterogeneous Platforms
  22. Performance prediction techniques for scalable large data processing in distributed MPI systems
  23. Open-Source Variable-Precision Floating-Point Library for Major Commercial FPGAs
  24. Design space exploration of GPU Accelerated cluster systems for optimal data transfer using PCIe bus
  25. Unified and lightweight tasks and conduits: A high level parallel programming framework
  26. State-Action Based Link Layer Design for IEEE 802.11b Compliant MATLAB-Based SDR
  27. High-level hardware-software co-design of an 802.11a transceiver system using Zynq SoC
  28. Cardiac MRI compressed sensing image reconstruction with a graphics processing unit
  29. High-Level System Design of IEEE 802.11b Standard-Compliant Link Layer for MATLAB-Based SDR
  30. Validity and reliability of Kinect skeleton for measuring shoulder joint angles: a feasibility study
  31. Accelerating K-Means clustering with parallel implementations and GPU computing
  32. GPU implementation of reverse coordinate conversion for proteins
  33. Leakage evaluation on power balance countermeasure against side-channel attack on FPGAs
  34. Balance power leakage to fight against side-channel analysis at gate level in FPGAs
  35. Side-channel analysis of MAC-Keccak hardware implementations
  36. Accuracy of Kinect for measuring shoulder joint angles in multiple planes of motion
  37. Kernel Specialization Provides Adaptable GPU Code for Particle Image Velocimetry
  38. Behavioral Non-portability in Scientific Numeric Computing
  39. Implementing a MATLAB-Based Self-configurable Software Defined Radio Transceiver
  40. Power analysis attack on hardware implementation of MAC-Keccak on FPGAs
  41. Accelerating protein coordinate conversion using GPUs
  42. Fast reconstruction of 3D volumes from 2D CT projection data with GPUs
  43. Reducing Processing Latency with a Heterogeneous FPGA-Processor Framework
  44. Validity and reliability of Kinect for measuring shoulder joint angles
  45. Make it real: Effective floating-point reasoning via exact arithmetic
  46. FPGA-based hyperspectral covariance coprocessor for size, weight, and power constrained platforms
  47. Vendor agnostic, high performance, double precision Floating Point division for FPGAs
  48. Development of a low-cost, adaptive, clinician-friendly virtual rehabilitation system
  49. Kernel Specialization for Improved Adaptability and Performance on Graphics Processing Units (GPUs)
  50. A Message from the General Chair and Program Chair
  51. Minimum energy operation for clustered island-style FPGAs
  52. Minimum Energy Analysis and Experimental Verification of a Latch-Based Subthreshold FPGA
  53. Characterization of a single-supply subthreshold FPGA
  54. Cognitive radio universal software hardware
  55. CUDA and OpenCL implementations of 3D CT reconstruction for biomedical imaging
  56. VForce: An environment for portable applications on high performance systems with accelerators
  57. CRUSH: Cognitive Radio Universal Software Hardware
  58. Heterogeneous tasks and conduits framework for rapid application portability and deployment
  59. Cognitive Radio Universal Software Hardware
  60. Incremental clustering applied to radar deinterleaving
  61. Adaptable Two-Dimension Sliding Windows on NVIDIA GPUs with Runtime Compilation
  62. An Autonomous Vector/Scalar Floating Point Coprocessor for FPGAs
  63. VFloat
  64. Efficient template matching with variable size templates in CUDA
  65. A truly two-dimensional systolic array FPGA implementation of QR decomposition
  66. Implementing a Highly Parameterized Digital PIV System on Reconfigurable Hardware
  67. Accelerating phase unwrapping and affine transformations for optical quadrature microscopy using CUDA
  68. FPGA Supercomputing Platforms, Architectures, and Techniques for Accelerating Computationally Complex Algorithms
  69. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing
  70. Implementing phase unwrapping using Field Programmable Gate Arrays or Graphics Processing Units: A comparison
  71. Special issue: General-purpose processing using graphics processing units
  72. Efficient Communication Between the Embedded Processor and the Reconfigurable Logic on an FPGA
  73. An efficient implementation of a phase unwrapping kernel on reconfigurable hardware
  74. An FPGA Implementation of Explicit-State Model Checking
  75. An Efficient Implementation of a Phase Unwrapping Kernel on Reconfigurable Hardware
  76. Dynamo: a runtime partitioning system for FPGA-based HW/SW image processing systems
  77. K-means Clustering for Multispectral Images Using Floating-Point Divide
  78. Writing Portable Applications that Dynamically Bind at Run Time to Reconfigurable Hardware
  79. Vforce: An Extensible Framework for Reconfigurable Supercomputing
  80. Advanced Components in the Variable Precision Floating-Point Library
  81. Automatic Sliding Window Operation Optimization for FPGA-Based
  82. Enabling MPEG-2 video playback in embedded systems through improved data cache efficiency
  83. Real-Time Particle Image Velocimetry for Feedback Loops Using FPGA Implementation
  84. Field-Programmable Gate Arrays in Embedded Systems
  85. Poster reception: Improving the performance of parallel backprojection on a reconfigurable supercomputer
  86. Parallel-Beam Backprojection: An FPGA Implementation Optimized for Medical Imaging
  87. Optimizing data intensive window-based image processing on reconfigurable hardware boards
  88. Applying reconfigurable hardware to the analysis of multispectral and hyperspectral imagery
  89. Accurate Power Estimation for Sequential CMOS Circuits Using Graph-based Methods
  90. Design issues for hardware implementation of an algorithm for segmenting hyperspectral imagery
  91. Effect of data truncation in an implementation of pixel clustering on a custom computing machine
  92. HML, a novel hardware description language and its translation to VHDL
  93. A data-centric approach to high-level synthesis
  94. Spatial and color clustering on an FPGA-based computer system
  95. Rothko: a three-dimensional FPGA
  96. Division and square root: choosing the right implementation
  97. Optimizing the data cache performance of a software MPEG-2 video decoder
  98. Rothko: A three dimensional FPGA architecture, its fabrication, and design tools
  99. Area and performance tradeoffs in floating-point divide and square-root implementations
  100. An automaton model for scheduling constraints in synchronous machines
  101. Non-restoring integer square root: A case study in design by principled optimization
  102. Reasoning about pipelines with structural hazards
  103. Verifying a logic-synthesis algorithm and implementation: a case study in software verification
  104. A methodology for efficient hardware verification
  105. PBS: proven Boolean simplification
  106. Erratum to: High level synthesis and generation FPGAs with the BEDROC system
  107. High level synthesis and generating FPGAs with the BEDROC system
  108. Formally verified synthesis of combinational CMOS circuits
  109. From programs to transistors: Verifying hardware synthesis tools
  110. Reasoning about the function and timing of integrated circuits with interval temporal logic
  111. Automatic determination of signal flow through MOS transistor networks
  112. Runtime assignment of reconfigurable hardware components for image processing pipelines
  113. Run-time execution of reconfigurable hardware in a Java environment
  114. Design tradeoffs in a hardware implementation of the k-means clustering algorithm
  115. High level synthesis for designing custom computing hardware
  116. Truly rapid prototyping requires high level synthesis