All Stories

  1. Low-Cost, Efficient Output-Only Infrastructure Damage Detection With Wireless Sensor Networks
  2. Vector Coprocessor Virtualization for Simultaneous Multithreading
  3. Wireless sensor networks for monitoring infrastructure
  4. A Method to Measure Packet Processing Time of Hosts Using High-Speed Transmission Lines
  5. Instruction Fusion for Multiscalar and Many-Core Processors
  6. Modular vector processor architecture targeting at data-level parallelism
  7. Performance-Energy Optimizations for Shared Vector Accelerators in Multicores
  8. A multiprocessor-on-a-programmable-chip reconfigurable system for matrix operations with power-grid case studies
  9. ASIC Design of Shared Vector Accelerators for Multicore Processors
  10. Message from CSE2013 Chairs
  11. Efficient on-chip vector processing for multicore processors
  12. Multicore-based vector coprocessor sharing for performance and energy gains
  13. Packet classification using rule caching
  14. FPGA and ASIC square root designs for high performance and power efficiency
  15. Message from General Chairs - Volume I
  16. Message from General Chairs - Volume II
  17. Efficient face recognition using frequency distribution curve matching
  18. Versatile design of shared vector coprocessors for multicores
  19. Replicating Tag Entries for Reliability Enhancement in Cache Tag Arrays
  20. Exploring branch target buffer access filtering for low-energy and high-performance microarchitectures
  21. On-chip Vector Coprocessor Sharing for Multicores
  22. Efficient packet classification on FPGAs also targeting at manageable memory consumption
  23. Message from the General Chairs - CSE 2010
  24. Efficient hardware support for pattern matching in network intrusion detection
  25. Novel Pipelined Architecture for Efficient Evaluation of the Square Root Using a Modified Non-Restoring Algorithm
  26. Hardware-Based Speed Up of Face Recognition Towards Real-Time Performance
  27. Online Anonymity Protection in Computer-Mediated Communication
  28. PREFACE
  29. TRB: Tag Replication Buffer for Enhancing the Reliability of the Cache Tag Array
  30. Scheduling for input-queued packet switches by a re-configurable parallel match evaluator
  31. Pipelined implementation of fixed point square root in FPGA using modified non-restoring algorithm
  32. FPGA-based static analysis tool for detecting malicious binaries
  33. Novel FPGA-Based Signature Matching for Deep Packet Inspection
  34. Preventing Unwanted Social Inferences with Classification Tree Analysis
  35. Message from the Program Chair
  36. On the Characterization and Optimization of On-Chip Cache Reliability against Soft Errors
  37. Re-Configurable Parallel Match Evaluators Applied to Scheduling Schemes for Input-Queued Packet Switches
  38. On the Exploitation of Narrow-Width Values for Improving Register File Reliability
  39. Exploiting narrow-width values for thermal-aware register file designs
  40. Low-power multiplierless DCT for image/video coders
  41. Social Inference Risk Modeling in Mobile and Social Applications
  42. Designing for different levels of social inference risk
  43. Identity Inference as a Privacy Risk in Computer-Mediated Communication
  44. Self-Adaptive Data Caches for Soft-Error Reliability
  45. Asymmetrically banked value-aware register files for low-energy and high-performance
  46. Concatenating Packets in Variable-Length Input-Queued Packet Switches with Cell-Based and Packet-Based Scheduling
  47. BTB Access Filtering: A Low Energy and High Performance Design
  48. Partially Reconfigurable Vector Processor for Embedded Applications
  49. Resource management for dynamically-challenged reconfigurable systems
  50. Robust scalability analysis and SPM case studies
  51. A Study of Data Exchange Protocols for the Grid Computing Environment
  52. Measuring Network Parameters with Hardware Support
  53. Parallel solution of Newton’s power flow equations on configurable chips
  54. Runtime Partial Reconfiguration for Embedded Vector Processors
  55. Coprocessor design to support MPI primitives in configurable multiprocessors
  56. FPGA-based Vector Processing for Matrix Operations
  57. Performance-Energy Tradeoffs for Matrix Multiplication on FPGA-Based Mixed-Mode Chip Multiprocessors
  58. Reconfiguration support for vector operations
  59. Vector Processing Support for FPGA-Oriented High Performance Applications
  60. Asymmetrically Banked Value-Aware Register Files
  61. System-Level Energy Modeling for Heterogeneous Reconfigurable Chip Multiprocessors
  62. On the Characterization of Data Cache Vulnerability in High-Performance Embedded Microprocessors
  63. A Coarse-Grain Hierarchical Technique for 2-Dimensional FFT on Configurable Parallel Computers
  64. Exploiting mixed-mode parallelism for matrix operations on the HERA architecture through reconfiguration
  65. Message from the PEN-PCGCS Workshop Co-chairs
  66. Modeling distributed data representation and its effect on parallel data accesses
  67. Load-balanced CICB packet switch with support for long round-trip times
  68. An FPGA-Based Parallel Accelerator for Matrix Multiplications in the Newton-Raphson Method
  69. FPGA implementation of a Cholesky algorithm for a shared-memory multiprocessor architecture
  70. A super-programming approach for mining association rules in parallel on PC clusters
  71. Parallel LU factorization of sparse matrices on FPGA-based configurable computing engines
  72. PowerGrid - A Computation Engine for Large-Scale Electric Networks
  73. Processor design based on dataflow concurrency
  74. Viable Architectures for High-Performance Computing
  75. Dataflow computation with intelligent memories emulated on field-programmable gate arrays (FPGAs)
  76. Investigation of a Low-Cost High-Performance Shared-Memory Multiprocessor System for Real-Time Applications
  77. A Universal, Dynamically Adaptable and Programmable Network Router for Parallel Computers
  78. A new-generation parallel computer and its performance evaluation
  79. Evaluating the communications capabilities of the generalized hypercube interconnection network
  80. Investigation of Various Mesh Architectures With Broadcast Buses for High-Performance Computing
  81. Material identification algorithms for parallel systems
  82. Parallel DSP algorithms on TurboNet: an experimental system with hybrid message‐passing/shared‐memory architecture
  83. Parallel DSP algorithms on TurboNet: an experimental system with hybrid message-passing/shared-memory architecture
  84. Data broadcasting and reduction, prefix computation, and sorting on reduced hypercube parallel computers
  85. Facilitating High-Performance Image Analysis on Reduced Hypercube (RH) Parallel Computers
  86. Facilitating High-Performance Image Analysis on Reduced Hypercube (RH) Parallel Computers
  87. Scalable Multifolded Hypercubes for Versatile Parallel Computers
  88. Adaptive Multiresolution Structures for Image Processing on Parallel Computers
  89. High-performance emulation of hierarchical structures on hypercube supercomputers
  90. RH: a versatile family of reduced hypercube interconnection networks
  91. Binary trees of modified hypercubes: a family of networks for hypercube-like parallel computers
  92. Processor allocation strategies for modified hypercubes
  93. Connected component labelling on the BLITZEN massively parallel processor
  94. Pyramid mappings onto hypercubes for computer vision: Connection Machine comparative study
  95. Mapping single and multiple multilevel structures onto the hypercube
  96. Efficient mapping algorithms for a class of hierarchical systems
  97. On the problem of expanding hypercube-based systems
  98. Connection Machine results for pyramid embedding algorithms
  99. On the mapping problem for multi-level systems
  100. Improved algorithms for translation of pictures represented by leaf codes
  101. In-Register Duplication: Exploiting Narrow-Width Value for Improving Register File Reliability
  102. Adaptive Scheduling of Array-Intensive Applications on Mixed-Mode Reconfigurable Multiprocessors
  103. Load balancing on PC clusters with the super-programming model
  104. Load balancing on PC clusters with the super-programming model
  105. H-SIMD machine: configurable parallel computing for matrix multiplication
  106. A configurable multiprocessor and dynamic load balancing for parallel LU factorization
  107. Performance optimization of an FPGA-based configurable multiprocessor for matrix operations
  108. Optimizing the thermal behavior of subarrayed data caches
  109. Versatile processor design for efficiency and high performance
  110. A framework for dynamic resource assignment and scheduling on reconfigurable mixed-mode on-chip multiprocessors
  111. Parallel direct solution of linear equations on FPGA-based machines
  112. A class of scalable architectures for high-performance, cost-effective parallel computing
  113. A low-complexity parallel system for gracious scalable performance. Case study for near PetaFLOPS computing
  114. Powerful and feasible processor interconnections with an evaluation of their communications capabilities
  115. High performance mapping for massively parallel hierarchical structures
  116. Embedding multilevel structures into massively parallel hypercubes: Connection Machine results for computer vision algorithms
  117. Performance analysis for an important class of parallel-processing networks
  118. A defect identification algorithm for sequential and parallel computers
  119. Processor allocation for a class of hypercube-like supercomputers
  120. Efficient implementation of multilevel algorithms on hypercube supercomputers for computer vision
  121. A hierarchically-controlled SIMD machine for 2D DCT on FPGAs
  122. FPGA-Based Vector Processing for Solving Sparse Sets of Equations