This paper presents a GALS-compatible circuit-switched on-chip network that is well suited for use in many-core platforms targeting streaming DSP and embedded applications, which typically have a high degree of task-level parallelism among computational kernels. Inter-processor communication is achieved through a simple yet effective reconfigurable source-synchronous network. Interconnect paths between processors can sustain a peak throughput of one word per cycle. A theoretical model is developed for analyzing the performance of the network. A 65 nm CMOS GALS chip utilizing this network was fabricated; it contains 164 programmable processors, three accelerators, and three shared memory modules. To evaluate the efficiency of this platform, a complete 802.11a WLAN baseband receiver was implemented. It has a real-time throughput of 54 Mbps with all processors running at 594 MHz and 0.95 V, and consumes an average of 174.8 mW, with 12.2 mW (or 7.0%) dissipated by its interconnect links and switches. With the chip's dual supply voltages set at 0.95 V and 0.75 V, and individual processors' oscillators operating at workload-based optimal frequencies, the receiver consumes 123.2 mW, which is a 29.5% reduction in power. Measured power consumption values from the chip are within 2% to 5% of the estimated values.
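The two percentages quoted in the abstract follow directly from the reported power figures. The short sketch below reproduces that arithmetic; the variable and function names are illustrative and do not come from the paper.

```python
# Power figures quoted in the abstract for the 802.11a baseband receiver.
total_mw_single = 174.8  # all processors at 594 MHz and 0.95 V
interconnect_mw = 12.2   # dissipated by interconnect links and switches
total_mw_dual = 123.2    # dual supplies (0.95 V / 0.75 V), tuned frequencies


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of total power spent in the interconnect (single-voltage case).
interconnect_share = percent(interconnect_mw, total_mw_single)

# Relative saving from dual supplies and workload-based frequencies.
power_reduction = percent(total_mw_single - total_mw_dual, total_mw_single)

print(f"interconnect share: {interconnect_share:.1f}%")
print(f"power reduction:    {power_reduction:.1f}%")
```

Rounded to one decimal place, these give the 7.0% and 29.5% stated above.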
(c) Copyright, 2010, IEEE
A. T. Tran, D. N. Truong, and B. M. Baas, "A Reconfigurable Source-Synchronous On-Chip Network for GALS Many-Core Platforms," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 29, no. 6, pp. 897-910, June 2010.
@ARTICLE{Tran:TCAD2010,
author={A. T. Tran and D. N. Truong and B. M. Baas},
journal={Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on},
title={A Reconfigurable Source-Synchronous On-Chip Network for {GALS} Many-Core Platforms},
year={2010},
month={Jun.},
volume={29},
number={6},
pages={897-910},
doi={10.1109/TCAD.2010.2048594},
ISSN={0278-0070}
}