subroutine and a programmed time-sharing system was that the entry to the conventional subroutine is under normal programme control, but that the entry to a subroutine associated with a time-sharing system is forced by the computer control unit. It was stressed by Mr. Strachey, in reply to a point raised by Mr. N. A. F. Williams, that he considered it essential to supply equipment in a computer to stop two 'parallel' programmes interacting one with the other. The aim should always be to make the job of the programmer as simple as possible.

Regarding the question of fast adding circuits which several speakers had described, some discussion ensued on the relative merits of synchronous and asynchronous techniques. A point made by **Dr. D. J. Wheeler** in reply was that even with synchronous systems it does not follow that an addition, for example, need take a fixed time, since 'end of carry' detection can terminate the process.

Discussing a point from Dr. M. V. Wilkes's lecture, Dr. Lindsey questioned whether it was necessary to scale the factor in a floating-point multiplication operation, since this required equipment to determine which factor was the smaller, and similar results could be obtained by scaling either. Dr. M. V. Wilkes replied that one might have to compromise on this matter, but that the finding of the smaller factor was not very difficult, because with non-standardized numbers it was necessary only to see which had fewer zeros before the significant digit.

## SESSION 6.—SPECIAL ASPECTS OF LOGICAL DESIGN—II

## A SYSTEM OF CONTROL FOR A FAST COMPUTER

By G. ORD, M.Sc.

When a problem is introduced into a digital computer, the programme to carry out the problem consists of instructions from the instruction code of the computer. Each instruction may call for a series of actions, e.g. the opening of a gate between two registers of the arithmetic section followed by a shift in significance of the contents of one of the registers. In early computers the individual instructions required a few separate actions to be carried out sequentially, and it was possible to design separate electronic circuits to provide a sequence as well as the actions needed for each instruction. In more recent computers the individual instructions have increased considerably in complication, some requiring up to 50 steps in sequence. It is possible to design a sequencer for each of the instructions as before, but obviously it is preferable to design one apparatus which can be arranged to provide all the sequences required. Wilkes, Wheeler and Renwick\* have described such an apparatus. Although in principle the system is applicable at high speeds, the use of the ferrite cores currently available limits the speed of the system to approximately 500 kc/s.

In the design of the proposed sequencer advantage has been taken of two facts, namely

(a) All the shifting, counting, transferring and exchanging operations in and between registers take the same time (about 200 millimicrosec) and comprise the large majority of the actions needed in the arithmetic section of the computer.

\* WILKES, M. V., RENWICK, W., and WHEELER, D. J.: 'The Design of a Control Unit of an Electronic Digital Computer', *Proceedings I.E.E.*, Paper No. 2365 M, June, 1957 (105 B, p. 121).

Mr. Ord is at the Royal Radar Establishment.

(b) In four out of five cases the detailed steps of an instruction follow one another in a regular sequence.

A shifting register is arranged to provide an active state on only one of its outputs, all the other outputs being inactive. Unless prevented from doing so, the active state progresses regularly from one digit of the register to the next every 250 millimicrosec. To the output of each digit is connected a set of transistor switches controlled by the instruction. Each instruction controls the same number of switches as there are steps in its sequence, but only one of the switches connected to any digit output is operated at the same time. The outputs of the switches are connected to circuits which, when energized, provide gating pulses to various parts of the computer. A gating-pulse generator is therefore energized only when an appropriately timed output from the shifting register is routed to it by a switch.

If an alteration to the regular sequence is required, it is arranged that at a particular time the position of the active state in the register is altered. For some actions which require more than the normal sequencing time, the regular progression of the active state along the register is inhibited.

Since the transistor switches are set up immediately after the instruction is known and are not altered during the sequence corresponding to that instruction, the time needed to set the switches can be as long as 0.5 microsec without affecting seriously the overall speed of the sequencer. The sequencer's speed is determined primarily by the speed of the shifting operations, which can be as rapid as 4 or 5 Mc/s.
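The behaviour of such a sequencer can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the function `run_sequence`, its `hold` and `jump` tables and the gating-pulse names are hypothetical and do not come from the paper; the 250 millimicrosec step is the figure quoted above.

```python
def run_sequence(switches, n_digits, hold=None, jump=None, step_ns=250):
    """Simulate one instruction's sequence.

    switches: digit index -> name of the gating-pulse generator routed
              from that digit output (hypothetical names).
    hold:     digit index -> extra step periods for which progression
              of the active state is inhibited.
    jump:     digit index -> digit to which the active state is moved
              instead of the next one (altered sequence).
    Returns a list of (time_ns, digit, gating_pulse) events.
    """
    hold = hold or {}
    jump = jump or {}
    events = []
    t, digit, steps = 0, 0, 0
    # Guard on the step count so that a badly formed jump table cannot loop for ever.
    while 0 <= digit < n_digits and steps < 1000:
        pulse = switches.get(digit)
        if pulse is not None:          # only a routed output energizes a generator
            events.append((t, digit, pulse))
        t += step_ns * (1 + hold.get(digit, 0))
        digit = jump.get(digit, digit + 1)
        steps += 1
    return events


# Hypothetical three-step instruction: the shift at digit 1 is a slow action
# and holds the active state for two extra periods.
events = run_sequence(
    switches={0: "open_gate_A_to_B", 1: "shift_B", 3: "add"},
    n_digits=4,
    hold={1: 2},
)
```

Because the switch table is fixed before the sequence starts, only the stepping of the active state lies on the time-critical path, which is the point made in the preceding paragraph.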

## PARALLEL ADDITION IN DIGITAL COMPUTERS: A NEW FAST 'CARRY' CIRCUIT

By T. KILBURN, M.A., D.Sc., Ph.D., Member, D. B. G. EDWARDS, Ph.D., M.Sc., Associate Member, and D. ASPINALL, M.Sc.

When two numbers X and Y are added together, the kth digit of the answer is dependent on the kth digits in X and Y and also on the 'carry' digit which could be initiated by X and Y digits of less significance. If addition is carried out serially, the least-significant digits of the numbers are added first and it is necessary to delay any 'carry' indication until the next more-significant digits are processed. To add two n-digit numbers in this way takes n digit periods. When the parallel mode of operation is used, all n digits of the numbers X and Y appear simultaneously in the same digit period. However, successive digits of increasing significance in the answer must still appear sequentially in time, because of the need to propagate the 'carry' from one adder stage to the next stage of higher significance. The delay between successive digits in the answer can, however

The authors are in the Electrical Engineering Laboratories, Manchester University.

\* RICHARDS, R. K.: 'Arithmetic Operations in Digital Computers' (Van Nostrand, 1955).

Proc. IEE, Vol. 106, pt.B, p. 464, September 1959

mode of operation still offers a speed advantage over the more economical serial type of addition. In practice, with computing machines having digit periods of 6-10 microsec it is relatively easy to ensure that all the answer digits, typically 40 in number, can be evaluated in one digit period. However, in computing machines now being designed, where the digit periods are of the order of 0·1 to 0·2 microsec, it becomes impossible to complete the addition in one digit period, and the effectiveness of the parallel mode in increasing the speed of addition is thus reduced. The following techniques indicate means for minimizing the addition time.

When two n-digit numbers are added together it is unlikely that a 'carry' propagation over n places will occur. At the Institute for Advanced Study, Princeton,\* it has been estimated that, on the average, the maximum length of a '0's' or a '1's' 'carry' sequence in a 40-digit addition of random numbers is only 5.6 stages. Thus, if the adder could be designed to terminate the addition period at the end of the longest 'carry' sequence, an average saving in 'carry' propagation time of a factor of seven could be achieved.
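The order of magnitude of this estimate can be explored with a short Monte Carlo experiment. The sketch below is an illustration only, not the Princeton calculation: the function names are hypothetical, and it counts only chains of '1's carry' (a stage that generates a 'carry' followed by the stages that pass it on), so its result need not agree exactly with the 5.6-stage figure, which covers both '0's' and '1's' sequences.

```python
import random


def longest_carry_chain(x, y, n):
    """Length, in stages, of the longest '1's carry' chain when adding x and y.

    A chain starts at a stage where both digits are 1 (a 'carry' is generated)
    and continues through every following stage whose digits differ
    (the 'carry' is passed on).
    """
    longest = 0
    current = 0  # length of the chain currently alive, 0 if none
    for k in range(n):
        xk = (x >> k) & 1
        yk = (y >> k) & 1
        if xk & yk:              # generate: a new chain starts here
            current = 1
        elif xk ^ yk:            # propagate: extend a live chain only
            current = current + 1 if current else 0
        else:                    # both digits 0: any chain ends
            current = 0
        longest = max(longest, current)
    return longest


def average_longest_chain(n=40, trials=2000, seed=0):
    """Average the longest chain over random n-digit operand pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += longest_carry_chain(rng.getrandbits(n), rng.getrandbits(n), n)
    return total / trials


if __name__ == "__main__":
    print(average_longest_chain())
```

The average comes out at a small fraction of the 40 stages, which is the property an adder with 'end of carry' detection would exploit.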

The addition time during the course of a problem can also be effectively reduced by storing the 'carry' in a separate register and not permitting it to propagate at each addition. Assimilation of the partial sum with the 'carry' need occur only when the complete answer has to be standardized, sent to another storage location or tested for sign, although in this latter case complete assimilation is not always necessary. The scheme saves time when more than two numbers have to be added, because for any group of numbers there need only be one 'carry' propagation once. This scheme is of particular use during multiplication when numerous subproducts have to be added together.
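The stored-'carry' scheme can be sketched in a few lines. This is a minimal illustration, assuming unbounded Python integers in place of fixed-length registers; the names `carry_save_step` and `add_many` are hypothetical.

```python
def carry_save_step(partial_sum, stored_carry, addend):
    """Add one number without letting any 'carry' propagate.

    Each digit position is handled independently: the new partial sum is the
    digit-wise sum without 'carry', and the 'carry' digits are kept in a
    separate register, shifted one place towards higher significance.
    """
    new_sum = partial_sum ^ stored_carry ^ addend
    new_carry = ((partial_sum & stored_carry)
                 | (partial_sum & addend)
                 | (stored_carry & addend)) << 1
    return new_sum, new_carry


def add_many(numbers):
    """Add a group of non-negative numbers with a single 'carry' propagation."""
    partial_sum, stored_carry = 0, 0
    for value in numbers:
        partial_sum, stored_carry = carry_save_step(partial_sum, stored_carry, value)
    # Assimilation: the only step in which a 'carry' propagates.
    return partial_sum + stored_carry


total = add_many([13, 7, 22, 5])
```

In a multiplier the subproducts would play the part of `numbers`, so that the slow assimilation is performed once per product and not once per subproduct.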

Even though it may be brief, the 'carry' propagation time in a single stage is very important, because its effect is cumulative. A small change in this time will alter the overall addition time appreciably, and therefore in an engineering design it is important to minimize the amount of equipment in the 'carry' path. There are two approaches to this problem, and these are identified by the manner in which the 'carry' to stage k is provided. In the first method a logical gating system provides the 'carry' in terms of all the less-significant x and y digits. As the stages become more significant the complexity of this gating increases. This scheme is expensive, but has been used by the National Bureau of Standards; and they achieve a complete addition over 52 stages in 1 microsec. In the second method the 'carry' is provided from the previous stage, k-1, and that stage in its turn requires its 'carry' from stage k-2, etc. The circuit to be described uses this method of operation.
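The two methods can be written out in Boolean form. The notation here is an assumption introduced for illustration and is not used in the paper: $c_k$ is the 'carry' into stage $k$, $g_i = x_i y_i$ marks a stage that generates a 'carry', $p_i = x_i \oplus y_i$ marks a stage that passes one on, '+' denotes logical 'or', and no 'carry' is supplied to stage 0.

$$
\text{First method:}\qquad c_k = g_{k-1} + p_{k-1}\,g_{k-2} + p_{k-1}p_{k-2}\,g_{k-3} + \dots + p_{k-1}p_{k-2}\cdots p_{1}\,g_{0}
$$

$$
\text{Second method:}\qquad c_k = g_{k-1} + p_{k-1}\,c_{k-1}
$$

The first expression has $k$ terms, the longest containing $k$ factors, which is why the gating grows with significance. The second needs the same small amount of equipment in every stage, but $c_k$ cannot settle until $c_{k-1}$ has done so.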

A simple logical diagram of three stages of the adder utilizing on/off switches is shown in Fig. 1. These switches are closed provided that the logical operations appropriate to each switch occur. It should be noticed that the logical operations involve only the x and y digits appropriate to that particular stage, and thus no switches have to be altered as a result of 'carry' propagation. In any one stage the switches $T_1$–$T_3$ are operated in a mutually exclusive manner so that normally there can be no interaction between the various sources which define the voltage level of the 'carry' path. If the switches are similar to relay contacts, the propagation time of a pulse through several switches in series corresponds merely to the time for a pulse to pass along a length of wire and is therefore for most practical purposes instantaneous. In this case the addition time is that appropriate to the setting time of the slowest switch.

\* GILCHRIST, B., POMERENE, J. H., and WONG, S. Y.: 'Fast Carry Logic for Digital Computers', Transactions of the Institute of Radio Engineers, 1955, EC-4, p. 133.

'On the Design of a Very-High-Speed Computer', Report No. 80, University of Illinois, Digital Computer Laboratory, October, 1957, p. 194.

WEINBERGER, A., and SMITH, J. L.: 'A One-Microsecond Adder using 1Mc/s Circuits', Transactions of the Institute of Radio Engineers, 1956, EC-5, p. 65.

Fig. 1.—Simple logical diagram of two adder stages using on/off switches.
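The switch arrangement can be modelled logically as follows. This is a sketch under assumptions: the text establishes that $T_1$ lies in series with the 'carry' path and that the three switches are mutually exclusive, but the assignment of $T_2$ to the 'carry' source and $T_3$ to the 'no carry' source is assumed here for illustration, and the function names are hypothetical.

```python
def switch_for_stage(x_bit, y_bit):
    """Select the one switch closed in a stage, using only that stage's digits."""
    if x_bit != y_bit:
        return "T1"                        # pass the incoming 'carry' straight through
    return "T2" if x_bit == 1 else "T3"    # assumed: T2 = 'carry', T3 = 'no carry'


def switched_add(x, y, n, carry_in=0):
    """Add two n-digit numbers by following the 'carry' path through the switches."""
    # All switches are set at once from x and y; none depends on the 'carry'.
    switches = [switch_for_stage((x >> k) & 1, (y >> k) & 1) for k in range(n)]
    carry = carry_in
    total = 0
    for k, switch in enumerate(switches):
        xk = (x >> k) & 1
        yk = (y >> k) & 1
        total |= (xk ^ yk ^ carry) << k    # 'sum' digit of stage k
        if switch == "T2":
            carry = 1                      # path tied to the 'carry' level
        elif switch == "T3":
            carry = 0                      # path tied to the 'no carry' level
        # T1: the level on the path is unchanged
    return total, carry


result = switched_add(0b1011, 0b0110, 4)   # 11 + 6 = 17
```

Here `result` holds the four 'sum' digits and the 'carry' out of the most significant stage. The list `switches` is computed before the loop to emphasize that the switch settings never wait for the 'carry'.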

In a high-speed computer, relays are, of course, far too slow and they must be replaced by some form of electronic switch. These switches, however, suffer from several defects as compared with a mechanical contact: there is a significant voltage drop across the switch and also a current drain, so that the source of the 'carry' indication is subject to a load which varies in accordance with the number of switches operated. Both or either of these effects may limit the maximum number of electronic switches which can be connected in series. Diode 'and' gates have been used as switches, but the number which may be connected in series is limited to three or four by the rapid change in d.c. level of the transmitted pulse due to voltage drop across the diodes. A diode switch of this type allows information to pass only in one direction.

A junction transistor can be made to resemble closely the 'off' or 'on' positions of a mechanical switch by operating it in the cut-off or saturated condition. In the saturated condition the impedance between the emitter and the collector is very low, and the voltage drop is also small because it is essentially the difference in voltage drop across two conducting diodes. In this condition information can be passed in either direction through the transistor, provided that it does not significantly disturb the transistor's saturated condition. The surface-barrier transistor type 2N240 is satisfactory as a switch, but the microalloy diffused-base transistor type 2N501 is even better because of its increased current gain, current and voltage ratings and frequency response.

The simple experiment shown in Fig. 2 causes the square wave produced at the collector of  $T_0$  to be transmitted through transistors  $T_1$ – $T_{10}$ , all in series and operated in the saturated condition. The 2-volt movement of the emitter, collector and base potentials of each of these saturated transistors alters the base current only by approximately 10%, and this does not significantly alter the saturated conditions of the transistors. The waveforms from the collectors of  $T_1$  and  $T_{10}$  occur simultaneously in time when viewed on an oscillograph range indicating 20 millimicrosec per centimetre.

A repetition of this simple experiment with 18 stages separated on individual plug-in boards results in a transmission delay of approximately 20 millimicrosec. It is thought that careful design of the boards to minimize the 5ft length of wire in the 'carry' path will reduce this delay probably by a factor of two.

The practical circuit of two adder stages is shown in Fig. 3.



Fig. 2.—Experimental circuit.

All 18-kilohm loads are connected to a common supply at -18 volts



Fig. 3.—Practical circuit of two adder stages.

The 'sum' logic is not indicated, but it is driven from the emitter follower $T_6$ which has been inserted to minimize loading on the 'carry' path. The logical operations which control the switching of transistors $T_1$–$T_3$ are carried out by simple diode logic using type OA47 diodes and both output phases of the flip-flops storing x and y. The diode system operates extremely rapidly and the gate outputs are emitter-followed prior to driving the bases of $T_1$–$T_3$. The amplitude of the function waveforms is from -2.5 to -5.5 volts. To ensure that the switch transistors remain under the control of the function waveforms, the 'carry' level must remain within these limits, and levels of -3 volts ('carry') and -4.5 volts (no 'carry') have been specified. Therefore a step 1.5 volts in amplitude is generated whenever a 'carry' occurs or disappears. When 'carry' propagates between stages it does so via the $T_1$ transistors, a large number of which therefore appear in series.

The 1 mA positive leak connected to the emitter of the $T_1$ transistors provides sufficient current for these to saturate without the need for any current to be provided along the 'carry' path. The transistor defining the 'carry' indication passes a large current only transiently, in order to charge the capacitance along the 'carry' path. The $T_1$ transistors bottom to a very low voltage, because their direct current is negligible; typically, the drop across ten $T_1$ transistors of either type is approximately 0.3 volt. In practice, after ten stages an emitter-follower is used to reconstitute the voltage level by using the voltage difference across the base-emitter junction, which will vary by a maximum of 30%. If the emitter-follower had raised the 'carry' level by exactly the correct amount, i.e. 0.3 volt, then in the next ten $T_1$ stages there is only the same 0.3-volt fall in voltage level, so that no cumulative loss occurs in subsequent groups of ten. It is hoped that the 'carry' in this way will be able to propagate along 40 stages without the need for reamplification.

Ideally, $T_1$–$T_3$ operate in a mutually exclusive manner, but under fault and transient conditions this need not apply. For example, if switches $T_2$ and $T_3$ were both closed, the -4.5-volt and the -3-volt lines would be connected together. To avoid excessive current the -3-volt line is limited in the amount of current it can supply.

When a 'carry' propagates along the 18 stages which have been constructed, the delay plus rise time to the eighteenth stage is approximately 80 millimicrosec, comprising the switching time of the transistors, including any delay due to generation of the switching waveforms, and the transmission delay of approximately 20 millimicrosec.

The latter is easily seen from the 'carry' indication at the second and eighteenth stages. When the number of stages is increased the transistor switching time will remain constant, but the transmission time will be increased. If the former time is 60 millimicrosec and the latter approximately 1 millimicrosec per stage, it would appear feasible to carry out a complete 40-stage parallel addition in 0.2 microsec.
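The estimate can be set out as a short calculation. The symbols are introduced here for illustration only: $t_{\mathrm{sw}}$ is the constant switching time, $t_{\mathrm{tr}}$ the transmission time per stage, and $n$ the number of stages; a simple linear model is assumed.

$$
t_{\mathrm{add}}(n) \approx t_{\mathrm{sw}} + n\,t_{\mathrm{tr}}
$$

$$
t_{\mathrm{add}}(40) \approx 60 + 40 \times 1 = 100\ \text{millimicrosec} = 0.1\ \text{microsec}
$$

On these figures a 40-stage addition falls within the 0.2 microsec quoted, with a margin of about a factor of two. For the 18 stages actually built, the measured values are 60 + 20 = 80 millimicrosec.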