**Design and Optimization of Adaptive Multimedia
Systems – Challenges, Approaches and Opportunities**

**The tutorial will have five parts.**


__PART I – Introduction__

In this section we will cover the motivation, objectives and outline of the rest of the tutorial.

This part will have the following four sub-sections:

(a) What is the problem and why should multimedia system designers/researchers be interested in it?

__Complexity:__ The complexity of next-generation multimedia apps such as H.264, MPEG-21 etc. is very high. This is due both to enhanced functionality and to the fact that multiple applications may be running concurrently.

__Heterogeneity:__ One-size-fits-all will no longer work because desktops/laptops are not the only devices. This is especially true with the advent of digital homes such as http://www.homegateway.org/

__Resource Constraints:__ Due to cost, weight and other practical concerns, end-devices will always be resource-constrained, especially in terms of CPU processing power, memory size, bandwidth and energy.

(b) What are the recent developments in technology that **enable** a new way of thinking?

__Flexible Computation/Multi-Fidelity Algorithms:__ Over the past decades, work in Economics (mechanism design), Artificial Intelligence (reasoning under bounded resources, anytime algorithms, and time-dependent planning), Real-time Computing (imprecise computation, IRIS), Approximate Signal Processing, and Complexity Scalable Multimedia Algorithms has shown that computation is best viewed as a relation, not as a function: one can trade off the “value” or “utility” of the outputs against the resources used to generate them.

__Adaptive Hardware:__ Hardware is no longer a fixed monolithic entity. *Programmability is no longer restricted to just changing the software.* Modern platforms can and will have many __knobs__, and these knobs can be *tuned* both statically and dynamically (at runtime). The knobs include voltage/frequency scaling, configurable functional units and interconnect, degree of parallelism at both the thread/task level and the instruction level, configurable memory hierarchies where cache parameters can be tailored for specific applications, sophisticated DPM (dynamic power management) solutions etc. In addition, techniques such as mutable binaries, dynamic binary translation, just-in-time compilation and feedback-driven compiler optimization offer new flexibility to software.


(c) What is missing?

A __systematic methodology__ that takes advantage of the recent developments in algorithms and hardware platforms to design and optimize multimedia algorithms while satisfying the constraints imposed by emerging applications in universal multimedia access. This is what we will try to address in this tutorial, thereby uncovering new opportunities for hardware architects, system designers and application/algorithm designers.

(d) OUTLINE OF THE REST OF THE PRESENTATION:

We will describe approaches that some of us are developing to address the concerns listed above. In __Part II__, we will describe a framework that extends traditional rate-distortion theory to incorporate a notion of complexity in a __generic__ way. This allows the construction of multi-fidelity or complexity scalable algorithms to be formulated as a classic (non-linear) optimization problem and solved with standard methods such as dynamic programming and Lagrangian methods. In __Part III__, we will describe configurable hardware platforms with examples from our recent work on tile-based architectures for embedded processors, granularity studies, aggressive voltage scaling (or voltage overscaling), multi-voltage caches and opportunities for reconfigurable logic in implementing future multimedia apps. In __Part IV__, we will describe a systematic technique for workload shaping based on abstracting complexity. This will help us deal with the problems imposed by the heterogeneity of embedded devices. In __Part V__, we will focus on a complete case study, energy-aware system design for wireless multimedia, where an integrated approach to energy management is proposed that addresses cross-layer optimization issues.

__PART II – Complexity Modeling__

· Complexity Basis Functions

· Rate-Distortion-Complexity Models

· Complexity Scalable Algorithm – motion-compensation example perhaps

· Formulation of the optimization problem to select an operating point given resource constraints such as energy or processing power. If space is an issue, we can skip this or merge it with Part IV.
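As a rough illustration of the last bullet, the sketch below selects one operating point from a small table of distortion/complexity pairs by minimizing the Lagrangian cost D + λ·C and bisecting on λ until the complexity budget is met. The `OperatingPoint` type, the function names and the single-multiplier search are assumptions made for illustration; they are not the formulation the tutorial will present.

```python
from typing import NamedTuple, Sequence


class OperatingPoint(NamedTuple):
    name: str
    distortion: float   # e.g. mean squared error (lower is better)
    complexity: float   # e.g. normalized decoding cycles per frame


def best_for_lambda(points: Sequence[OperatingPoint], lam: float) -> OperatingPoint:
    """Return the point minimizing the Lagrangian cost D + lam * C."""
    return min(points, key=lambda p: p.distortion + lam * p.complexity)


def select_operating_point(points, budget, iterations=50):
    """Bisect on lam to find a low-distortion point with complexity <= budget.

    A Lagrangian search only reaches points on the lower convex hull of the
    distortion-complexity set, so the result can be slightly suboptimal.
    """
    if min(p.complexity for p in points) > budget:
        raise ValueError("no operating point fits the complexity budget")
    lo, hi = 0.0, 1.0
    # Grow hi until the point it selects is feasible.
    while best_for_lambda(points, hi).complexity > budget:
        hi *= 2.0
    best = best_for_lambda(points, hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        candidate = best_for_lambda(points, mid)
        if candidate.complexity <= budget:
            best, hi = candidate, mid   # feasible: try a smaller penalty
        else:
            lo = mid                    # infeasible: penalize complexity more
    return best
```

Dynamic programming would be the natural alternative when the budget has to be split across several dependent units (for example, frames in a group of pictures) rather than applied to a single choice.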

__PART III – Programmable Hardware Architectures__

We will cover some of the new developments on the horizon beyond DVS (dynamic voltage scaling) and configurable caches. To satisfy the increased complexity in terms of processing power and multiple concurrent processes, tile-based chip-level multiprocessors will be required. We are already seeing this in network processors, and future multimedia processors will follow. The question then is: how are they architected and what are their tunable knobs? In that context, the right granularity of a tile becomes very relevant: should one use two simple Blackfin or C55x DSPs, or go with a 4-way VLIW? Next, we will describe an aggressive approach to voltage scaling, namely voltage overscaling, i.e. deliberately compromising reliability in the hope of saving power. I will draw examples from circuit-level speculation, algorithmic noise tolerance and our recent work on power versus reliability trade-offs in data caches and multi-voltage data caches. Finally, we will show some new opportunities for dynamically reconfigurable logic in the complexity scalable implementation of wavelet-based video decoders.

We will draw this material from the following recent papers of ours:

· Synchroscalar design issues from our ISCA 2004 paper

· Impact of granularity on power consumption of tile-based processors

· Voltage Overscaling – trading power for reliability

· Opportunities for Dynamically Reconfigurable Logic

__PART IV – Workload Modeling and Reshaping__

The problem addressed here is the following: given a programmable hardware platform with a set of knobs that represent different resource utilization versus performance configurations (hardware operating points), how can applications choose the optimal operating point and *track the operating point under time-varying operating conditions and data inputs*? The emphasis will be on formalizing the methodology. There are four challenges that need to be addressed: (a) workload modeling, (b) efficient algorithms for reconfiguration, (c) workload reshaping, analogous to traffic reshaping, to meet the resource constraints at the end user, and (d) dealing with heterogeneous architectures and platforms.
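To make the problem statement concrete, here is a minimal sketch of choosing and tracking a hardware operating point per frame. It assumes a platform whose only knob is a set of frequency/voltage pairs, a dynamic energy model proportional to cycles × V², and a purely reactive predictor that reuses the previous frame's measured cycle count. The `HardwarePoint` type and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwarePoint:
    freq_mhz: float   # clock frequency
    volts: float      # supply voltage at that frequency

    def energy(self, cycles: float) -> float:
        # Dynamic energy scales roughly with cycles * V^2 (arbitrary units).
        return cycles * self.volts ** 2

    def time_ms(self, cycles: float) -> float:
        # 1 MHz corresponds to 1e3 cycles per millisecond.
        return cycles / (self.freq_mhz * 1e3)


def choose_point(points, predicted_cycles, deadline_ms):
    """Lowest-energy point that finishes predicted_cycles within the deadline."""
    feasible = [p for p in points if p.time_ms(predicted_cycles) <= deadline_ms]
    if not feasible:
        # Best effort: nothing meets the deadline, so run at full speed.
        return max(points, key=lambda p: p.freq_mhz)
    return min(feasible, key=lambda p: p.energy(predicted_cycles))


def track(points, measured_cycles, deadline_ms, initial_prediction):
    """Reactive tracking: predict each frame's workload from the previous one."""
    prediction = initial_prediction
    schedule = []
    for actual in measured_cycles:
        schedule.append(choose_point(points, prediction, deadline_ms))
        prediction = actual   # react to what was just observed
    return schedule
```

Because the predictor lags by one frame, a sudden jump in workload is served at the previous frame's setting and may miss its deadline. That lag is one reason to compare reactive schemes with proactive, model-based ones.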

In terms of workload modeling, there can be on-line models like those developed by the GRACE project at UIUC, or off-line models like those we have been developing at UC Davis. In terms of algorithms for reconfiguration, we will discuss reactive and proactive techniques and their trade-offs. Heterogeneity will be addressed by our notion of generic complexity metrics, which are translated into real complexity metrics at run-time. Finally, we will talk about the need for hardware abstraction layers in future hardware platforms for run-time resource measurement and reconfiguration.

I will draw the material for this section from our recent ICASSP and ICME 2005 papers, and the topics will include:

· Workload Modeling – Generic Complexity Metrics

· On-line versus off-line models

· GCM to RCM translation and validation, and using it for DVS

· Workload Reshaping using models for slack and slack borrowing/accumulation, with and without compromising distortion

· We will integrate complexity modeling and complexity scalable algorithms as part of the workload reshaping problem with reduction in quality/distortion

· Other sources and pointers are appreciated
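The GCM-to-RCM translation and slack accumulation topics above can be sketched roughly as follows. The sketch assumes that a generic complexity metric (GCM) is a table of platform-independent operation counts, that the real complexity metric (RCM) is a cycle count obtained by weighting those counts with per-operation costs measured on the target, and that time saved on one frame can be carried over to the next. The operation names, cost values and function names are hypothetical, and a real system would bound the slack by the available buffering.

```python
def gcm_to_rcm(gcm, cycles_per_op):
    """Translate generic operation counts into cycles on one platform."""
    return sum(count * cycles_per_op[op] for op, count in gcm.items())


def frequencies_with_slack(frame_cycles, period_ms, freqs_mhz):
    """Pick, per frame, the slowest frequency that meets the deadline,
    letting time saved on earlier frames (slack) extend later deadlines."""
    slack_ms = 0.0
    chosen = []
    for cycles in frame_cycles:
        available_ms = period_ms + slack_ms
        # 1 MHz corresponds to 1e3 cycles per millisecond.
        fitting = [f for f in sorted(freqs_mhz)
                   if cycles / (f * 1e3) <= available_ms]
        freq = fitting[0] if fitting else max(freqs_mhz)
        chosen.append(freq)
        # Unused time accumulates; an overrun resets the slack to zero.
        slack_ms = max(0.0, available_ms - cycles / (freq * 1e3))
    return chosen


# Hypothetical per-operation costs, as might be calibrated on one device.
costs = {"entropy_decode": 40, "inverse_transform": 25, "motion_comp": 60}
frame_gcm = {"entropy_decode": 1000, "inverse_transform": 2000, "motion_comp": 500}
frame_rcm = gcm_to_rcm(frame_gcm, costs)   # cycles for this frame on this device
```

Only the cost table is platform specific in this model, so the same GCM stream could be reused across heterogeneous devices.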

__PART V – System Level Design__

The concepts and techniques that have been discussed so far are actually being implemented and tested at some universities such as UC Irvine. Preliminary results are quite promising.

We will therefore wrap up with a complete case study of an energy-aware system design methodology being explored at UC Irvine as part of the FORGE and DYNAMO projects. It illustrates a specific example of integrated system-level design for wireless multimedia. Its key feature is cross-layer optimization, i.e. how the middleware, operating system, network and hardware platform are jointly optimized for a given quality of service. This will illustrate the potential benefits of such an integrated approach. The role of intelligent proxies, and tools and techniques for rapid prototyping, will also be considered.