Computer VLSI Arithmetic

Prof. Vojin G. Oklobdzija
Electrical and Computer Engineering Department
University of California

 

Content of the Course

This course consists of a set of lectures on topics and issues in the design of arithmetic units for high-performance and low-power systems. The issues covered range from algorithms to the VLSI implementation of various arithmetic structures.

The course starts with a set of papers on number representation systems. They include fundamental papers on number representation and treat the use of redundancy in number representation.

The next set of lectures is dedicated to arithmetic operations: addition, multiplication, division, and square root. The speed of these operations often determines the speed of a processor. The power consumed by the arithmetic processor is becoming very important in mobile and portable appliances and applications, so we will also treat the power consumed by these operations.

The next section deals with floating-point numbers and floating-point computation. It contains papers on the efficient implementation of floating-point processors. The following set of papers deals with the floating-point standard and the issues relevant to it.

Evaluation of functions will also be addressed, along with techniques that achieve both functionality and high performance.

Throughout the course we will deal with VLSI algorithms and the relationship between implementation techniques, the choice of an appropriate algorithm, and logic technology. The goal is to extract the benefits of both algorithm and technology and achieve an efficient and fast implementation. This section addresses papers on fast and optimal implementation of ALUs, parallel multipliers, and MAC units, which are common building blocks of Digital Signal Processing (DSP) systems. The presented work emphasizes the importance of an appropriate algorithm and its proper mapping into the technology of choice.

This course is intended for graduate students in electrical and computer engineering, but it also serves as a reference for the practicing engineer. It aims to provide a useful reference to the collection of accumulated experience necessary for good and successful design.
 
 

Lectures

| Lecture No. | Lecture Date | Lecture Topic | Handouts | Assignments | Reading Assignment |
|---|---|---|---|---|---|
| 1 | March 9 | Digital Circuits (lecture by Prof. Y. Leblebici) | | | |
| 2 | March 16 | Number Representations (lecture by Prof. Paolo Ienne) | Number Systems, Paolo's Lecture (4-slides-p-page) | | |
| 3 | March 23 | VLSI Adders | VLSI Addition | | |
| 4 | March 30 | VLSI Adders: CLA, Optimized CLA, Ling's Adder | Lecture-4 (VLSI Adders) | Assignment No. 1 (due April 6th in class) | Reading-1 |
| 5 | April 6 | VLSI Adders: Ling's Adder - HP Implementation, Prefix Adders | Lecture-5 (Ling, P-Pfx) | Expanded Assignment No. 1 | Naffziger's patent |
| 6 | April 13 | No Classes | | | |
| 7 | April 20 | VLSI Adders, Logical Effort | Lecture-6 | Assignment 2 | Logical Effort, paper by Bedrij, paper by Knowles |
| 8 | April 27 | Estimating Speed, Tuning the Circuit: Logical Effort | Lecture-7: D. Harris LE presentation | | |
| 9 | May 4 | Energy-Delay relationship, Estimation of speed and power. Some practical examples (Intel presentation) | Lecture-8a, Lecture-8b | | |
| 10 | May 11 | Carry-Save Addition, Multi-Operand Addition, Signed-Digit Arithmetic | Lecture 8c | | |
| 11 | May 14 | Make-up Class: Multiplication | Lecture 9 | Assignment 3 | |
| 11 | May 18 | No Class | | | |
| 12 | May 28 | MIDTERM EXAM | Exam | Exam Solution (courtesy of Theo Kluter) | |
| 13 | June 1 | Multiplication: TDM Algorithm, Division | Lecture 10, Lecture 11 | | |
| 14 | June 8 | Prof. Alain Guyot: | Announcement | | |
| 15 | June 11 | Prof. J-M Muller: Evaluation of Functions | Lecture 12: Prof. J-M Muller | | |

 

List of Papers

A partial list of the papers covered in this course will be posted here. We will try to keep difficult-to-find papers in PDF format on this page. We would also like to build a comprehensive list of references, kept on this page. Contributions to this list from course participants will be appreciated.

You can find those papers at:  Collection of Papers

References
Number Systems

  1. A. Avizienis, "Digital Computer Arithmetic: A Unified Algorithmic Specification", Proceedings of Symposium on Computers and Automata, p.509-525, Brooklyn, New York, April 13-15, 1971.

Redundant Number Systems:

  1. A. Avizienis, "On a Flexible Implementation of Digital Computer Arithmetic", Proceedings of IFIP Congress 62, Munich 1962.

  2. A. Avizienis, "Arithmetic Microsystems for the Synthesis of Function Generators", Proceedings of the IEEE, Vol.54, No.12, December 1966.

  3. A. Avizienis, "Theory of Digital Computer Arithmetic", Class notes for Engr 225A, UCLA, 1968/69.

  4. A. K. Yeung, J. M. Rabaey, "A 210Mb/s Radix-4 Bit-level Pipelined Viterbi Decoder", Proceedings of International Solid-State Circuits Conference, San Francisco, February 1995.

Adders:

Manchester Carry Chain:

  1. T. Kilburn, D. B. G. Edwards, D. Aspinall, "Parallel Addition in Digital Computers: A New Fast "Carry" Circuit", Proceedings of IEE, Vol. 106, pt. B, p. 464, September 1959.

  2. V. G. Oklobdzija and E. R. Barnes, "Some Optimal Schemes For ALU Implementation In VLSI Technology," Proceedings of the 7th Symposium on Computer Arithmetic ARITH-7, pp. 2-8. Reprinted in Computer Arithmetic, E. E. Swartzlander, (editor), Vol. II, pp. 137-142, 1985.

  3. V. G. Oklobdzija and E. R. Barnes, "On Implementing Addition In VLSI Technology," IEEE Journal of Parallel and Distributed Computing, No. 5, pp. 716-728, 1988.

  4.  V. G. Oklobdzija, "Simple And Efficient CMOS Circuit For Fast VLSI Adder Realization", Proceedings of the International Symposium on Circuits and Systems, pp. 1-4, 1988.

 

Carry-Select Adder:

  1. O. J. Bedrij, "Carry-Select Adder", IRE Transactions on Electronic Computers, p. 340, June 1962.

 

Conditional-Sum Adder:

  1. J. Sklansky, "Conditional-Sum Addition Logic", IRE Transactions on Electronic Computers, EC-9, pp. 226-231, 1960.

 

CLA Adder:

  1. A. Weinberger, J. L. Smith, "A Logic for High-Speed Addition", National Bureau of Standards, Circular 591, p. 3-12, 1958.

  2. A. Naini, D. Bearden, W. Anderson, "A 4.5ns 96-b CMOS Adder Design", IEEE 1992 Custom Integrated Circuits Conference, 1992.

  3. B.D. Lee, V.G. Oklobdzija, "Improved CLA Scheme with Optimized Delay", Journal of VLSI Signal Processing, Vol. 3, p. 265-274, 1991.

 

Ling Adder:

  1. H. Ling, "High Speed Binary Parallel Adder", IEEE Transactions on Electronic Computers, EC-15, p.799-809, October, 1966.

  2. H. Ling, “High-Speed Binary Adder”, IBM J. Res. Dev., vol.25, p.156-66, 1981.

  3. R. W. Doran, "Variants on an Improved Carry Look-Ahead Adder", IEEE Transactions on Computers, Vol.37, No.9, September 1988.

  4. N. T. Quach, M. J. Flynn, "High-Speed Addition in CMOS", IEEE Transactions on Computers, Vol.41, No.12, December, 1992.

  5. S. Naffziger, "A Sub-Nanosecond 0.5um 64b Adder Design", Digest of Technical Papers, 1996 IEEE International Solid-State Circuits Conference, San Francisco, 8-10 Feb. 1996, p. 362-363.

  6. S. Naffziger, "High Speed Addition Using Ling's Equations and Dynamic CMOS Logic", U.S. Patent No. 5,719,803, Issued: February 17, 1998.

 

Parallel Prefix Adders:

  1. R. P. Brent and H. T. Kung, "A Regular Layout for Parallel Adders", IEEE Transactions on Computers, Vol. C-31, No.3, March 1982, p.260-264.

  2. S. Knowles, "A Family of Adders", Proceedings of the 14th IEEE Symposium on Computer Arithmetic, Adelaide, Australia, April 14-16, 1999.

  3. F. K. Gurkaynak, et al, "Higher-Radix Kogge-Stone Parallel Prefix Adder Architectures", Proceedings of IEEE International Symposium on Circuits and Systems, Geneva, Switzerland, May 28-31, 2000.

  4. A. Farooqui, V. G. Oklobdzija, F. Chehrazi, "Multiplexer Based Adder for Media Signal Processing", 1999 International Symposium on VLSI Technology, Systems, and Applications, Taipei, Taiwan, June 8-10, 1999.

High-Performance Adders:

  1. V. G. Oklobdzija, Bart R. Zeydel, Hoang Dao, Sanu Mathew, Ram Krishnamurthy, "Energy-Delay Estimation Technique for High-Performance Microprocessor", Proceedings of the Symposium on Computer Arithmetic, 1063-6899, 2003.

  2. Sanu Mathew, Mark Anders, Ram K. Krishnamurthy, Shekhar Borkar, "A 4-GHz 130-nm Address Generation Unit With 32-bit Sparse-Tree Adder Core", IEEE Journal of Solid-State Circuits, Vol. 38, No. 5, May 2003.

  3. D. W. Dobberpuhl, et al., "A 200-MHz 64-b dual-issue CMOS Microprocessor", IEEE Journal of Solid-State Circuits, Vol. 27, No. 11, November 1992, p. 1555-1567.

  4. J. Park, H. C. Ngo, J. A. Silberman, S. H. Dhong, "470 ps 64-bit Parallel Binary Adder", Digest of Technical Papers, 2000 Symposium on VLSI Circuits, 15-17 June Honolulu, 2000, p. 192 - 193.

 

Please see the interesting exchange between Sklansky and Lehman in 1963, which deals with the same issues and arguments we have today: "Ultimate Speed Adders", J. Sklansky, and Dr. Lehman's comments, IEEE Transactions on Electronic Computers, April 1963.

Also: M. Lehman, "A Comparative Study of Propagation Speed-up Circuits in Binary Arithmetic Units", IFIP Congress, Munich, Germany, 1962.
 

Multi-Operand Addition:

  1. P. Kornerup, "Reviewing 4-2 Adders for Multi-Operand Addition", Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors, ASAP’02, San Jose, California, July 17-19, 2002.

 

Multipliers: 

(1984 comparison)

  1. C. S. Wallace, "A Suggestion for a Fast Multiplier", IEEE Transactions on Electronic Computers, EC-13, p.14-17, 1964.
  2. L. Dadda, "Some Schemes for Parallel Multipliers", Alta Frequenza, Vol.34, p.349-356, March 1965.
  3. L. Dadda, "On Parallel Digital Multipliers", Reprinted from Alta Frequenza, Vol.45, p.574-580, 1976.
  4. W. J. Stenzel, W. J. Kubitz, "A Compact High-Speed Parallel Multiplication Scheme", IEEE Transactions on Computers, C-26, p.948-957, 1977.
  5. Irving T. Ho, Tien Chi Chen, "Multiple Addition by Residue Threshold Functions and Their Representations by Array Logic", IEEE Trans. on Computers, Vol. C-22, No. 8, pp. 762-767, August 1973.
  6. D. Villeger and V. G. Oklobdzija, "Analysis Of Booth Encoding Efficiency In Parallel Multipliers Using Compressors For Reduction Of Partial Products", Proceedings of the 27th Asilomar Conference on Signals, Systems and Computers, pp. 781-784, 1993.
  7. D. Villeger and V. G. Oklobdzija, "Evaluation Of Booth Encoding Techniques For Parallel Multiplier Implementation", Electronics Letters, Vol. 29, No. 23, pp. 2016-2017, 1993.
  8. V. G. Oklobdzija and D. Villeger, "Improving Multiplier Design By Using Improved Column Compression Tree And Optimized Final Adder In CMOS Technology", IEEE Transactions on VLSI Systems, Vol.3, No.2, June, 1995.
  9. Gary W. Bewick, "Fast Multiplication: Algorithms and Implementation", Ph.D. Thesis, Department of Electrical Engineering, Stanford University, February 1994.
  10. V. G. Oklobdzija, D. Villeger, S. S. Liu, "A Method for Speed Optimized Partial Product Reduction and Generation of Fast Parallel Multipliers Using an Algorithmic Approach", IEEE Transactions on Computers, Vol.45, No.3, March 1996.
  11. V. Oklobdzija, "High-Speed VLSI Arithmetic Units: Adders and Multipliers", in "Design of High-Performance Microprocessor Circuits", Book Chapter, Book edited by A. Chandrakasan, IEEE Press, 2000.
  12. P. Stelling, V. G. Oklobdzija, "Design Strategies for Optimal Hybrid Final Adders in a Parallel Multiplier", special issue on VLSI Arithmetic, Journal of VLSI Signal Processing, Kluwer Academic Publishers, Vol.14, No.3, December 1996.
  13. P. Stelling, C. Martel, V. G. Oklobdzija, R. Ravi, "Optimal Circuits for Parallel Multipliers", IEEE Transactions on Computers, Vol. 47, No.3, pp. 273-285, March, 1998.
  14. Sanu Mathew, Mark Anders, Ram K. Krishnamurthy, Shekhar Borkar, "A 4-GHz 130-nm Address Generation Unit With 32-bit Sparse-Tree Adder Core", IEEE Journal of Solid-State Circuits, Vol. 38, No. 5, May 2003.

Division:

  1. J. E. Robertson, "A New Class of Digital Division Methods", IRE Trans. on Electronic Computers, Vol. EC-7, pp. 218-222, September 1958.
  2. M. J. Flynn, "On Division by Functional Iteration", IEEE Transactions on Computers, C-19, p.702-706, 1970.
  3. M. Ercegovac, "A Higher-Radix Division with Simple Selection of Quotient Digits", Proceedings of the 6th Symposium on Computer Arithmetic, Aarhus, Denmark, June 20 - 22, 1983.
  4. J. Fandrianto, "Algorithm for High-Speed Shared Radix 4 Division and Radix 4 Square Root", Proceedings of the 8th Symposium on Computer Arithmetic, Como, Italy, May 19-21, 1987.

Square Root:

  1. G. Metze, "Minimal Square Rooting", IEEE Trans. on Electronic Computers, Vol. EC-14, pp. 181-185, April 1965.

Function Evaluation:

  1. J. S. Walther, "A Unified Algorithm for Elementary Functions", Spring Joint Computer Conf. Proc., pp. 379-385, 1971.
  2. W. H. Specker, "A Class of Algorithms for ln x, exp x, sin x, cos x, tan^-1 x, cot^-1 x", IEEE Trans. on Electronic Computers, Vol. EC-14, No. 1, pp. 85-86, Feb. 1965.
  3. J. E. Volder, "The CORDIC Trigonometric Computing Technique", IRE Transactions on Electronic Computers, EC-8, p.330-334, 1959.

 

Floating-Point:

  1. V. G. Oklobdzija, "An Algorithmic and Novel Design of a Leading Zero Detector Circuit: Comparison with Logic Synthesis", IEEE Transactions on VLSI Systems, Vol. 2, No. 1, March 1994.