A GALS Many-Core Heterogeneous DSP Platform with Source-Synchronous On-Chip Interconnection Network

Anh T. Tran
Dean N. Truong
Bevan M. Baas
VLSI Computation Laboratory
Department of Electrical and Computer Engineering
University of California, Davis


This paper presents a many-core heterogeneous computational platform 
that employs a GALS-compatible circuit-switched on-chip network. The
platform targets streaming DSP and embedded applications that have a 
high degree of task-level parallelism among computational kernels.
The test chip, fabricated in 65 nm CMOS, consists of 164 simple, 
small programmable cores, three dedicated-purpose accelerators, and 
three shared memory modules. All processors are clocked by their own 
local oscillators, and communication is achieved through a simple yet 
effective source-synchronous communication technique that allows 
each interconnection link between any two processors to sustain a 
peak throughput of one data word per cycle.

A complete 802.11a WLAN baseband receiver was implemented on this 
platform. It has a real-time throughput of 54 Mbps with all 
processors running at 594 MHz and 0.95 V, and consumes an average of
174.76 mW, of which 12.18 mW (7.0%) is dissipated by its 
interconnection links. To exploit the GALS architecture fully, each 
processor's oscillator is adjusted to run at a workload-based optimal 
clock frequency with the chip's dual supply voltages set at 0.95 V 
and 0.75 V; the receiver then consumes only 123.18 mW, a 29.5% 
reduction in power. Power measured on the real chip differs from the 
estimated results by only 2-5%, which supports the reliability and 
efficiency of our design.
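
The percentages quoted above follow directly from the reported power figures. The short Python sketch below reproduces that arithmetic as a sanity check; the constant and function names are illustrative only and do not come from the paper.

```python
# Power figures quoted in the abstract, in milliwatts.
TOTAL_MW = 174.76   # all processors at 594 MHz and 0.95 V
LINKS_MW = 12.18    # portion dissipated by the interconnection links
TUNED_MW = 123.18   # per-processor clocks tuned, 0.95 V / 0.75 V supplies


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of total power spent in the interconnection links.
link_share = percent(LINKS_MW, TOTAL_MW)

# Relative saving from workload-based clock and voltage settings.
power_reduction = percent(TOTAL_MW - TUNED_MW, TOTAL_MW)

print(f"link share:      {link_share:.1f}%")       # 7.0%
print(f"power reduction: {power_reduction:.1f}%")  # 29.5%
```

Both values agree with the 7.0% and 29.5% stated in the abstract when rounded to one decimal place.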


Anh T. Tran, Dean N. Truong, Bevan M. Baas, "A GALS Many-Core Heterogeneous DSP Platform with Source-Synchronous On-Chip Interconnection Network," ACM/IEEE International Symposium on Networks on Chip (NOCS), May 2009, pp. 214-223.

