Elementary Floating Point Cordic Function Processor And Shifter

Patent Grant 3766370

U.S. patent number 3,766,370 [Application Number 05/143,578] was granted by the patent office on 1973-10-16 for elementary floating point cordic function processor and shifter. This patent grant is currently assigned to Hewlett-Packard Company. Invention is credited to John S. Walther.

United States Patent	3,766,370
Walther	October 16, 1973

**Please see images for: ( Certificate of Correction ) **

ELEMENTARY FLOATING POINT CORDIC FUNCTION PROCESSOR AND SHIFTER

Abstract

Three arithmetic units including three shifters are operated in parallel and controlled by a microprogram stored in a read-only memory to provide an improved elementary function floating-point processor. The microprogram includes a set of routines for calculating 20 elementary functions including arithmetic, exponential, hyperbolic, logarithmic, square root, and trigonometric functions. Each shifter is capable of reading a fixed plural number of consecutive bits, beginning with any bit position, from an associated data storage register.

Inventors:	Walther; John S. (Sunnyvale, CA)
Assignee:	Hewlett-Packard Company (Palo Alto, CA)
Family ID:	22504668
Appl. No.:	05/143,578
Filed:	May 14, 1971

Current U.S. Class:	708/494; 708/230; 708/274; 708/277; 708/276
Current CPC Class:	G06F 17/10 (20130101); G06F 7/5446 (20130101)
Current International Class:	G06F 7/544 (20060101); G06F 7/48 (20060101); G06F 17/10 (20060101); G06f 007/00 (); G06f 007/38 ()
Field of Search:	;235/156,159,160,164,197 ;444/1 ;340/172.5

References Cited [Referenced By]

U.S. Patent Documents


3022006	February 1962	Alrich et al.
3134091	May 1964	Shugart
3553652	January 1971	Hanson

Other References

J. Volder, "The CORDIC Trigonometric Computing Technique," IRE Trans. on Electronic Computers, Sept. 1959, pp. 330-334.

Primary Examiner: Atkinson; Charles E.
Assistant Examiner: Malzahn; David H.

Claims

I claim:

1. A floating point CORDIC processor for calculating trigonometric, hyperbolic, and linear elementary functions, said floating point CORDIC processor comprising:

input means for receiving input information and input control signals;

output means for providing output information and output control signals;

first, second, and third arithmetic units coupled in parallel for performing floating point CORDIC calculations, each of said first, second, and third arithmetic units including an adder-subtractor, a data register, and a fixed plural-bit shifting unit;

coupling means for selectively intercoupling the adder-subtractors, the data registers, and the fixed plural-bit shifting units of the first, second, and third arithmetic units;

storage means for storing a plurality of floating point CORDIC routines and a plurality of tables of uniquely determined floating point CORDIC constants; and

control means coupled to the first, second, and third arithmetic units, to the storage means, and to the coupling means, said control means being responsive to the input control signals and to the input information for selecting different ones of the floating point CORDIC routines and associated floating point CORDIC constants stored in the storage means and for selectively enabling different portions of the coupling means.

2. A floating point CORDIC processor as in claim 1 wherein:

said tables of uniquely determined floating point CORDIC constants stored in the storage means include tables of plural-bit rotation and distortion constants for use in performing trigonometric and hyperbolic floating point CORDIC calculations;

said control means includes first logic means coupled to the first, second, and third arithmetic units for automatically reselecting a plural-bit rotation or distortion constant when, within the accuracy of the floating point CORDIC processor, the bits of that constant are identical to the bits of the next plural-bit rotation or distortion constant to be selected; and

said control means includes second logic means coupled to the first, second, and third arithmetic units for automatically reselecting a prescribed set of plural-bit distortion constants for converging hyperbolic floating point CORDIC routines.

3. A floating point CORDIC processor as in claim 1 wherein said coupling means comprises:

first coupling means for intercoupling the adder-subtractor of the first arithmetic unit with the data register and the fixed plural-bit shifting unit of the first arithmetic unit, with the fixed plural-bit shifting unit of the second arithmetic unit, with the adder-subtractor and the fixed plural-bit shifting unit of the third arithmetic unit for transmitting information and control signals therebetween;

second coupling means for intercoupling the adder-subtractor of the second arithmetic unit with the adder-subtractor and the fixed plural-bit shifting unit of the first arithmetic unit, with the data register and the fixed plural-bit shifting unit of the second arithmetic means, and with the adder-subtractor of the third arithmetic unit for transmitting information and control signals therebetween; and

third coupling means for intercoupling the adder-subtractor of the third arithmetic unit with the data register and the fixed plural-bit shifting unit of the third arithmetic unit for transmitting information and control signals therebetween.

4. Apparatus for shifting in parallel a fixed plural number of consecutive bits beginning with any bit position from an ordered set of bits, said apparatus comprising:

first logic means for defining a first plurality of overlapping groups of consecutive bits from the ordered set of bits, each of these groups comprising a predetermined number of consecutive bits beginning with a different bit position in the ordered set of bits, the number of bits in each of these groups being less than the number of bits in the ordered set of bits and being greater than the fixed plural number of consecutive bits to be shifted in parallel from the ordered set of bits;

first decoding means coupled to the first logic means for selecting any one of the first plurality of overlapping groups of consecutive bits from the ordered set of bits in response to an input control signal;

second logic means coupled to the first logic means for defining a second plurality of overlapping groups of consecutive bits from the group of consecutive bits selected by the first decoding means, each of these groups comprising a predetermined number of consecutive bits beginning with a different bit position in the group of consecutive bits selected by the first decoding means, the number of bits in each of these groups being less than the number of bits in the group of consecutive bits selected by the first decoding means and being equal to the fixed plural number of consecutive bits to be shifted in parallel from the ordered set of bits;

second decoding means coupled to the second logic means for selecting any one of the second plurality of overlapping groups of consecutive bits in response to an input control signal; and

output means coupled to the second logic means for outputting in parallel the group of consecutive bits selected by the second decoding means.

5. Apparatus for shifting in parallel a fixed plural number of consecutive bits beginning with any bit position from an ordered set of bits, said apparatus comprising:

logic means for defining a plurality of overlapping groups of consecutive bits from the ordered set of bits, each of these groups comprising a predetermined number of consecutive bits beginning with a different bit position in the ordered set of bits, the number of bits in each of these groups being less than the number of bits in the ordered set of bits and being equal to the fixed plural number of consecutive bits to be shifted in parallel from the ordered set of bits;

decoding means coupled to the logic means for selecting any one of the plurality of overlapping groups of consecutive bits from the ordered set of bits in response to an input control signal; and

output means coupled to the logic means for outputting in parallel the group of consecutive bits selected by the decoding means.

Description

BACKGROUND OF THE INVENTION

This invention relates to a coordinate rotational digital computer (CORDIC) floating-point processor for computing elementary functions and to a data shifter for use therein.

Conventional minicomputers without floating-point hardware, such as the Hewlett-Packard 2116B, typically have a floating-point software precision of about 24 bits of mantissa and 8 bits of exponent (i.e., double precision) and a processing time of about 500 microseconds for addition, subtraction, multiplication, and division and about 10,000 microseconds for trigonometric and hyperbolic functions. Some floating-point processors are available for increasing the precision, decreasing the processing time, and extending the capability of such minicomputers. However, these floating-point processors can only perform a few elementary functions, typically limited to addition, subtraction, multiplication, division, and, in some cases, square root. A CORDIC processor is available that can perform many additional elementary functions including trigonometric functions. However, this CORDIC processor is unable to handle floating-point arguments and is inaccurate for small arguments. In addition, it employs bit-serial shifters and adder-subtractors which makes the processing time slower than desired. Parallel shifters and adder-subtractors could be employed to increase its speed. However, to do so would substantially increase the cost of the CORDIC processor since parallel shifters and adder-subtractors are much more expensive than bit-serial shifters and adder-subtractors.

SUMMARY OF THE INVENTION

An object of this invention is to provide an improved processor for extending the hardware computing capability of a minicomputer.

Another object of this invention is to provide a faster and more accurate processor for performing an increased number of elementary functions economically.

Another object of this invention is to provide a CORDIC floating-point processor.

Another object of this invention is to provide a CORDIC floating-point processor for handling floating-point arguments and maintaining full accuracy for small arguments.

Still another object of this invention is to provide a shifter that is faster than conventional bit-serial shifters and less expensive than conventional parallel shifters.

Other and incidental objects of this invention will become apparent from a reading of this specification and an inspection of the accompanying drawings.

These objects are accomplished according to the preferred embodiment of this invention by employing three arithmetic units and three shifters operated in parallel, a read-only memory, a data storage register for storing constants read from the read-only memory, and control logic to provide a CORDIC floating-point processor that may be controlled by a microprogram stored in the read-only memory or, alternatively, by a tester. The microprogram includes a set of routines for performing the elementary functions of addition, subtraction, multiplication, division, absolute value, entier, complement, cosine, sine, tangent, arctangent, hyperbolic cosine, hyperbolic sine, hyperbolic tangent, archyperbolic tangent, exponential, natural logarithm, square root, round to nearest integer, and round to twenty-four bits. The routines for performing multiplication, division, the aforementioned trigonometric and hyperbolic functions, natural logarithm, exponential, and square root are based on a unified algorithm for performing all of these last-mentioned functions in order to simplify the control and hardware of the CORDIC floating-point processor. Each routine is optimized to perform one or more elementary functions. The control logic allows two levels of microprogrammed subroutines and permits conditional and imperative microinstructions to be executed simultaneously. Each arithmetic unit includes an adder-subtractor and an associated data storage register. The contents of these data storage registers are prescaled before performing selected elementary functions to provide an argument having a scale factor and a mantissa that falls within the domain of convergence of the unified algorithm. If the mantissa is small, it may also be prenormalized and maintained in a normalized form while the selected elementary function is performed. 
Certain steps of the unified algorithm for performing hyperbolic functions are repeated to maintain the convergence requirement while decreasing the processing time. Each shifter is capable of reading a fixed plural number of consecutive bits, beginning with any bit position, from an associated one of the data storage registers.

DESCRIPTION OF THE DRAWINGS

FIG. 1 represents the angle and radius of a vector in a coordinate system parameterized by m.

FIG. 2 illustrates the input-output functions for CORDIC modes.

FIG. 3 is a simplified schematic diagram of a floating point CORDIC processor according to the preferred embodiment of this invention.

FIG. 4 is a simplified flow chart of the microprogram control for the floating point CORDIC processor of FIG. 3.

FIG. 5 represents different data formats including the extended floating point data format employed by the floating point CORDIC processor of FIG. 3.

FIG. 6 illustrates the ranges of extended floating point numbers.

FIG. 7 is a simplified schematic diagram illustrating the unary function routines.

FIG. 8 is a simplified schematic diagram illustrating the binary function routines.

FIGS. 9A-D are detailed block diagrams of a floating point CORDIC processor according to the preferred embodiment of this invention.

FIG. 10 is a composite figure map of FIGS. 9A-D.

FIGS. 11A-E are logic diagrams of the read-only memory portions of FIGS. 9A-D.

FIG. 12 is a composite figure map of FIGS. 11A-E.

FIG. 13 is a wiring diagram of the read-only memory of FIGS. 11A-E.

FIGS. 14A-R are logic diagrams of the read-only memory addressing portions of FIGS. 9A-D.

FIG. 15 is a composite figure map of FIGS. 14A-R.

FIGS. 16A-N are logic diagrams of the A-adder of FIGS. 9A-D.

FIG. 17 is a composite figure map of FIGS. 16A-N.

FIGS. 18A-H are logic diagrams of the A-shifter of FIGS. 9A-D.

FIG. 19 is a composite figure map of FIGS. 18A-H.

FIGS. 20A-C are logic diagrams of the transfer gates for the A-adder of FIGS. 9A-D.

FIG. 21 is a composite figure map of FIGS. 20A-C.

FIGS. 22A-N are logic diagrams of the B-adder of FIGS. 9A-D.

FIG. 23 is a composite figure map of FIGS. 22A-N.

FIGS. 24A-H are logic diagrams of the B-shifter of FIGS. 9A-D.

FIG. 25 is a composite figure map of FIGS. 24A-H.

FIGS. 26A-C are logic diagrams of the transfer gates for the B-adder of FIGS. 9A-D.

FIG. 27 is a composite figure map of FIGS. 26A-C.

FIGS. 28A-N are logic diagrams of the C-adder of FIGS. 9A-D.

FIG. 29 is a composite figure map of FIGS. 28A-N.

FIGS. 30A-G are logic diagrams of the D-register of FIGS. 9A-D.

FIG. 31 is a composite figure map of FIGS. 30A-G.

FIGS. 32A-H are logic diagrams of the D-shifter of FIGS. 9A-D.

FIG. 33 is a composite figure map of FIGS. 32A-H.

FIGS. 34A-C are logic diagrams of the transfer gates for the C-adder of FIGS. 9A-D.

FIG. 35 is a composite figure map of FIGS. 34A-C.

FIGS. 36A-G are logic diagrams of an interface card for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 37 is a composite figure map of FIGS. 36A-G.

FIGS. 38A-I are logic diagrams of a tester for use with the floating point CORDIC processor of FIGS. 9A-D.

FIG. 39 is a composite figure map of FIGS. 38A-I.

FIGS. 40A-J are schematic diagrams of the power supply for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 41 is a composite figure map of FIGS. 40A-J.

FIGS. 42A-F are back plane wiring diagrams for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 43 is a composite figure map of FIGS. 42A-F.

FIGS. 44A-D are flow charts for the entry routine for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 45 is a composite figure map of FIGS. 44A-D.

FIGS. 46A-C are flow charts for the entier, fix, load, load I, and round routines for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 47 is a composite figure map of FIGS. 46A-C.

FIGS. 48A-D are flow charts for the addition, subtraction, absolute, and negative routines for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 49 is a composite figure map of FIGS. 48A-D.

FIGS. 50A-C are flow charts for the overflow and normalize routines for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 51 is a composite figure map of FIGS. 50A-C.

FIGS. 52A-B are flow charts for the multiply and initialize routines for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 53 is a composite figure map of FIGS. 52A-B.

FIGS. 54A-B are flow charts for the division routine for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 55 is a composite figure map of FIGS. 54A-B.

FIGS. 56A-D are flow charts for the sine, cosine, tangent, and hyperbolic prescale routines for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 57 is a composite figure map of FIGS. 56A-D.

FIGS. 58A-C are flow charts for the sine resolver routine for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 59 is a composite figure map of FIGS. 58A-C.

FIGS. 60A-C are flow charts for the arctangent routine for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 61 is a composite figure map of FIGS. 60A-C.

FIGS. 62A-D are flow charts for the hyperbolic prescale routine for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 63 is a composite figure map of FIGS. 62A-D.

FIGS. 64A-D are flow charts for the hyperbolic sine and hyperbolic cosine resolver routines.

FIG. 65 is a composite figure map of FIGS. 64A-D.

FIGS. 66A-D are flow charts for the natural log, hyperbolic arctangent and square root prescale routine for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 67 is a composite figure map of FIGS. 66A-D.

FIGS. 68A-C are flow charts for the arc hyper resolve routine for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 69 is a composite figure map of FIGS. 68A-C.

FIGS. 70A-B are flow charts for the diagnostic routine for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 71 is a composite figure map of FIGS. 70A-B.

FIG. 72 illustrates a clock timing diagram for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 73 illustrates the instruction execution timing diagram for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 74 represents the instruction coding of the read-only memory of the floating point CORDIC processor of FIGS. 9A-D.

FIG. 75 illustrates the addressable reading of a data storage register by an associated one of the shifters of the floating point CORDIC processor of FIGS. 9A-D.

FIG. 76 illustrates the A-shifter selection process of the floating point CORDIC processor of FIGS. 9A-D.

FIG. 77 is a block diagram of a power supply for the floating point CORDIC processor of FIGS. 9A-D.

FIG. 78 illustrates the voltage limit ranges of the power supply of FIG. 77.

FIG. 79 shows a partial filter power supply waveform used to detect power failure.

DESCRIPTION OF THE PREFERRED EMBODIMENT

INTRODUCTION

A floating-point processor (FPP) for extending the hardware computing capability of a minicomputer (COMP), such as the Hewlett-Packard 2116B, to include high-speed, extended-precision mathematical functions is described herein. The processing time of the FPP is in the range of 20 to 75 microseconds for floating-point addition, subtraction, multiplication, and division and in the range of 20 to 200 microseconds for the remaining mathematical functions. The precision of the FPP is 40 bits of mantissa and 8 bits of exponent (i.e., triple precision), which is equivalent to an accuracy of 12 decimal digits, and the accuracy is limited only by the truncation of input arguments. Triple-store and triple-load instructions are provided to transfer the triple-length quantities between an extended floating-point accumulator and three consecutive memory locations of the COMP. To provide compatibility with double-precision data, instructions are also provided for performing the necessary format conversions. Both integer and floating-point double-precision data may be used.

Functionally, the FPP can be regarded as a calculator under the control of the COMP. In the same way that a human operator manually uses a free-standing calculator, the COMP enters an argument into the FPP and then issues a command to calculate a selected function of the argument. The answer appears in much less time than the COMP would have taken to calculate it. Thus, the benefits derived from the FPP by the COMP are much the same as the benefits derived from the calculator by the human operator, namely, speed and efficiency.

A UNIFIED ALGORITHM FOR ELEMENTARY FUNCTIONS

The FPP includes a set of routines based on a single unified algorithm for the calculation of elementary functions including multiplication, division, sine, cosine, tangent, arctangent, hyperbolic sine, hyperbolic cosine, hyperbolic tangent, archyperbolic tangent, natural logarithm, exponential, and square root. The basis for the unified algorithm is coordinate rotation in a linear, circular, or hyperbolic coordinate system depending on which function is to be calculated. The only operations required are shifting, adding, subtracting, and the recall of prestored constants.

Referring to FIG. 1, the radius R and the angle A of the vector P = (x,y) are defined as follows in a coordinate system parameterized by m:

$$ R = \left[x^2 + m y^2\right]^{1/2}, \tag{1} $$

$$ A = m^{-1/2} \tan^{-1}\left[m^{1/2}\,(y/x)\right]. \tag{2} $$

It can be shown that R is the distance from the origin to the intersection of the curve of constant radius with the x axis, while A is twice the area enclosed by the vector, the x axis, and the curve of constant radius, divided by the radius squared. The curves of constant radius for the circular (m=1), linear (m=0), and hyperbolic (m=-1) coordinate systems are shown in FIG. 1.

Let a new vector $P_{i+1} = (x_{i+1}, y_{i+1})$ be obtained from $P_i = (x_i, y_i)$ according to

$$ x_{i+1} = x_i + m y_i \delta_i, \tag{3} $$

$$ y_{i+1} = y_i - x_i \delta_i, \tag{4} $$

where m is the parameter for the coordinate system, and $\delta_i$ is an arbitrary value. The angle and radius of the new vector in terms of the old are given by

$$ A_{i+1} = A_i - \alpha_i, \tag{5} $$

$$ R_{i+1} = R_i K_i, \tag{6} $$

where

$$ \alpha_i = m^{-1/2} \tan^{-1}\left[m^{1/2} \delta_i\right], \tag{7} $$

$$ K_i = \left[1 + m \delta_i^2\right]^{1/2}. \tag{8} $$

The angle and radius are modified by quantities which are independent of the coordinate values. Table 1 gives the equations for $\alpha_i$ and $K_i$ after applying identities A2 and A5 in Table 2.

TABLE 1

Angles and Radius Factors

Coordinate System m    Angle $\alpha_i$          Radius Factor $K_i$
1                      $\tan^{-1} \delta_i$      $(1 + \delta_i^2)^{1/2}$
0                      $\delta_i$                1
-1                     $\tanh^{-1} \delta_i$     $(1 - \delta_i^2)^{1/2}$

##SPC1##
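The relations (5) through (8) can be checked numerically. The following is a minimal sketch, assuming x > 0 and |y/x| < 1 so that every inverse function is defined; the helper names (`radius`, `angle`, `step`, `alpha_and_k`, `check_step`) are illustrative and do not appear in the specification.

```python
import math

def radius(x, y, m):
    """R from equation (1)."""
    return math.sqrt(x * x + m * y * y)

def angle(x, y, m):
    """A from equation (2), written out per coordinate system as in Table 1."""
    if m == 1:
        return math.atan(y / x)
    if m == 0:
        return y / x
    if m == -1:
        return math.atanh(y / x)
    raise ValueError("m must be 1, 0, or -1")

def step(x, y, m, delta):
    """One application of equations (3) and (4)."""
    return x + m * y * delta, y - x * delta

def alpha_and_k(m, delta):
    """Angle decrement and radius factor taken from Table 1."""
    if m == 1:
        return math.atan(delta), math.sqrt(1 + delta * delta)
    if m == 0:
        return delta, 1.0
    return math.atanh(delta), math.sqrt(1 - delta * delta)

def check_step(x, y, m, delta):
    """Return the residuals of equations (5) and (6) for one step."""
    x1, y1 = step(x, y, m, delta)
    a, k = alpha_and_k(m, delta)
    angle_residual = angle(x1, y1, m) - (angle(x, y, m) - a)
    radius_residual = radius(x1, y1, m) - radius(x, y, m) * k
    return angle_residual, radius_residual

for m in (1, 0, -1):
    print(m, check_step(1.0, 0.5, m, 0.25))
```

Both residuals should vanish to within rounding error for each of the three coordinate systems.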

For n iterations we find:

$$ A_n = A_0 - \alpha, \tag{9} $$

$$ R_n = R_0 K, \tag{10} $$

where ##SPC2## ##SPC3##

The total change in angle is just the sum of the incremental changes while the total change in radius is the product of the incremental changes.

If a third variable z is provided for the accumulation of the angle variations,

$$ z_{i+1} = z_i + \alpha_i \tag{13} $$

and the set of difference equations (3), (4), and (13) is solved for n iterations, we find:

$$ x_n = K\left\{x_0 \cos(\alpha\sqrt{m}) + y_0\, m^{1/2} \sin(\alpha\sqrt{m})\right\}, \tag{14} $$

$$ y_n = K\left\{y_0 \cos(\alpha\sqrt{m}) - x_0\, m^{-1/2} \sin(\alpha\sqrt{m})\right\}, \tag{15} $$

$$ z_n = z_0 + \alpha, \tag{16} $$

where $\alpha$ and K are as in equations (11) and (12). These relations are summarized in FIG. 2 for m=1, m=0 and m=-1 for the following special cases:

1. A is forced to zero: $y_n = 0$;

2. z is forced to zero: $z_n = 0$.

The initial values $x_0$, $y_0$, $z_0$ are shown on the left of each block in FIG. 2 while the final values $x_n$, $y_n$, $z_n$ are shown on the right. The identities given in Table 2 were used to simplify these results.
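The closed forms (14) through (16) can be compared against direct iteration of the difference equations (3), (4), and (13). The sketch below is illustrative only: it assumes that the total angle is the sum of the incremental angles and the total radius factor is the product of the incremental factors, as stated in the text, and it writes out the m = 0 and m = -1 cases as the corresponding limits and hyperbolic forms. The function names are hypothetical.

```python
import math

def iterate(x, y, z, m, deltas):
    """Apply equations (3), (4), and (13) once per entry of deltas."""
    alpha_total, k_total = 0.0, 1.0
    for d in deltas:
        if m == 1:
            a = math.atan(d)
        elif m == 0:
            a = d
        else:
            a = math.atanh(d)
        # Tuple assignment uses the old x and y on the right-hand side.
        x, y, z = x + m * y * d, y - x * d, z + a
        alpha_total += a
        k_total *= math.sqrt(1 + m * d * d)
    return x, y, z, alpha_total, k_total

def closed_form(x0, y0, z0, m, alpha, k):
    """Equations (14) through (16), specialised to each value of m."""
    if m == 1:
        c, s = math.cos(alpha), math.sin(alpha)
        return k * (x0 * c + y0 * s), k * (y0 * c - x0 * s), z0 + alpha
    if m == 0:
        return x0, y0 - x0 * alpha, z0 + alpha
    c, s = math.cosh(alpha), math.sinh(alpha)
    return k * (x0 * c - y0 * s), k * (y0 * c - x0 * s), z0 + alpha

for m in (1, 0, -1):
    xn, yn, zn, a, k = iterate(1.0, 0.3, 0.1, m, [0.5, -0.25, 0.125])
    print(m, (xn, yn, zn), closed_form(1.0, 0.3, 0.1, m, a, k))
```

The signed values in the list of deltas stand for rotations taken in either direction; the radius factor is unaffected by the signs.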

By the proper choice of the initial values the functions x·z, y/x, sin z, cos z, $\tan^{-1} z$, sinh z, cosh z, and $\tanh^{-1} z$ may be obtained. In addition the following functions may be generated:

$$ \tan z = \sin z / \cos z, \tag{17} $$

$$ \tanh z = \sinh z / \cosh z, \tag{18} $$

$$ \exp z = \sinh z + \cosh z, \tag{19} $$

$$ \ln w = 2 \tanh^{-1}[y/x], \text{ where } x = w + 1 \text{ and } y = w - 1, \tag{20} $$

$$ \sqrt{w} = \sqrt{x^2 - y^2}, \text{ where } x = w + 1/4 \text{ and } y = w - 1/4. \tag{21} $$
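Identities (19) through (21) can be confirmed with a few lines of arithmetic. In this sketch the library functions `math.atanh`, `math.sinh`, `math.cosh`, and `math.sqrt` merely stand in for the quantities the CORDIC iterations would produce; the function names are illustrative.

```python
import math

def exp_via_hyperbolic(z):
    """Equation (19): exp z = sinh z + cosh z."""
    return math.sinh(z) + math.cosh(z)

def ln_via_atanh(w):
    """Equation (20): ln w = 2 atanh(y/x) with x = w + 1, y = w - 1."""
    x, y = w + 1.0, w - 1.0
    return 2.0 * math.atanh(y / x)

def sqrt_via_radius(w):
    """Equation (21): sqrt(w) = sqrt(x^2 - y^2) with x = w + 1/4, y = w - 1/4."""
    x, y = w + 0.25, w - 0.25
    return math.sqrt(x * x - y * y)

print(exp_via_hyperbolic(1.0), math.exp(1.0))
print(ln_via_atanh(3.0), math.log(3.0))
print(sqrt_via_radius(2.0), math.sqrt(2.0))
```

Equation (21) works because (w + 1/4)^2 - (w - 1/4)^2 = w, so the hyperbolic radius of the vector (x, y) is the desired square root.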

The angle A of the vector P may be forced to zero by a converging sequence of rotations $\alpha_i$ which at each step brings the vector closer to the positive x axis. The magnitude of each element of the sequence may be predetermined, but the direction of rotation must be determined at each step such that

$$ |A_{i+1}| = \bigl|\, |A_i| - \alpha_i \,\bigr|. \tag{22} $$

The sum of the remaining rotations must at each step be sufficient to bring the angle to at least within $\alpha_{n-1}$ of zero, even in the extreme case where $A_i = 0$, $|A_{i+1}| = \alpha_i$. Thus, ##SPC4##

The domain of convergence is limited by the sum of the rotations: ##SPC5## ##SPC6##

To show that A converges to within $\alpha_{n-1}$ of zero within n steps we first prove the following theorem: ##SPC7##

which holds for $i \ge 0$.

We proceed by induction on i. The hypothesis (26) holds for i=0 by (24). We now show that if the hypothesis is true for i then it is also true for i+1. Subtracting $\alpha_i$ from (26) and applying (23) at the left side yields ##SPC8##

Application of (22) then yields ##SPC9##

as was to be shown. Therefore, by induction, the hypothesis holds for all $i \ge 0$.

In particular, the theorem is true for i=n so that

$$ |A_n| < \alpha_{n-1}. \tag{29} $$

The same scheme may be used to force the angle in z to zero. The proof of convergence proceeds exactly as before except that A is replaced by z in equations (22) through (29). By equation (25) z has the same domain of convergence as A,

$$ \max |z_0| = \max |A_0|. \tag{30} $$

Note that since K is a function of $\delta_i^2$, where $\delta_i = m^{-1/2} \tan[m^{1/2} \alpha_i]$, K is independent of the sequence of signs chosen for the $\alpha_i$. Thus, for a fixed sequence of $\alpha_i$ magnitudes the constant 1/K may be used as an initial value to counteract the factor K present in the final values.

The practical use of the algorithm is based on the use of shifters to effect the multiplication by $\delta_i$. If $\rho$ is the radix of the number system and $F_i$ is an array of integers, where $i \ge 0$, then a multiplication of x by

$$ \delta_i = \rho^{-F_i} \tag{31} $$

is simply a shift of x by $F_i$ places to the right. The integers $F_i$ must be chosen such that the angles

$$ \alpha_{m,F} = m^{-1/2} \tan^{-1}\left(m^{1/2} \rho^{-F}\right) \tag{32} $$

satisfy the convergence criterion (23). The domain of convergence is then given by (25).
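For a binary code the multiplication by a power of the radix reduces to a right shift, as the following minimal sketch shows. The 16-bit fraction format (`FRAC_BITS`) and the function name are assumptions chosen for illustration; they are not the register format of the processor.

```python
FRAC_BITS = 16  # hypothetical fixed-point format: value = integer / 2**16

def times_delta(x_fixed, f):
    """Multiply a fixed-point integer by 2**-f with a right shift of f places."""
    return x_fixed >> f

x_fixed = int(0.75 * (1 << FRAC_BITS))   # 0.75 in fixed point
shifted = times_delta(x_fixed, 3)        # 0.75 * 2**-3
print(shifted / (1 << FRAC_BITS))
```

Bits shifted off the right end are discarded, which is the truncation of intermediate results discussed later in the text.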

Table 3 shows some F sequences, convergence ranges, and radius factors for a binary code.

TABLE 3

Shift Sequences for a Binary Code

radix     coordinate    shift sequence              domain of convergence    radius factor
$\rho$    system m      $F_{m,i}$, $i \ge 0$        $\max |A_0|$             K
2         1             0, 1, 2, 3, 4, i, ...       ~1.74                    ~1.65
2         0             1, 2, 3, 4, 5, i+1, ...     1.0                      1.0
2         -1            1, 2, 3, 4, 4, 5, ...*      ~1.13                    ~0.80

*For m = -1 the following integers are repeated: {4, 13, 40, 121, ..., k, 3k + 1, ...}

The hyperbolic mode (m = -1) is somewhat complicated by the fact that for $\alpha_i = \tanh^{-1}(2^{-i})$ the convergence criterion (23) is not satisfied. However, it can be shown that ##SPC10##

and that, therefore, if the integers {4, 13, 40, 121, ..., k, 3k + 1, ...} in the $F_i$ sequence are repeated then (23) becomes true.
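The entries of Table 3 can be approximated by summing the rotation angles and multiplying the radius factors over each shift sequence. The sketch below is illustrative: it truncates every sequence at a shift of 40, takes the domain of convergence as the plain sum of the rotations (the small final term is neglected), and uses hypothetical function names.

```python
import math

def shift_sequence(m, max_shift):
    """Radix-2 shift sequence following Table 3, truncated at max_shift."""
    if m == 1:
        return list(range(0, max_shift + 1))
    if m == 0:
        return list(range(1, max_shift + 1))
    repeats, k = set(), 4
    while k <= max_shift:          # 4, 13, 40, 121, ... (k -> 3k + 1)
        repeats.add(k)
        k = 3 * k + 1
    seq = []
    for f in range(1, max_shift + 1):
        seq.append(f)
        if f in repeats:
            seq.append(f)          # repeated step for the hyperbolic mode
    return seq

def rotation(m, f):
    """Rotation angle for shift f, per equation (32) with radix 2."""
    d = 2.0 ** -f
    if m == 1:
        return math.atan(d)
    if m == 0:
        return d
    return math.atanh(d)

def domain_and_radius(m, max_shift=40):
    seq = shift_sequence(m, max_shift)
    domain = math.fsum(rotation(m, f) for f in seq)
    k = math.prod(math.sqrt(1 + m * 4.0 ** -f) for f in seq)
    return domain, k

for m in (1, 0, -1):
    print(m, domain_and_radius(m))
```

Under these assumptions the computed values come out near 1.743 and 1.647 for m = 1, 1.0 and 1.0 for m = 0, and 1.118 and 0.828 for m = -1, which agree with the rounded entries of Table 3 to within a few hundredths.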

The limited domain imposed by the convergence criterion (25) may be extended by means of the prescaling identities shown in Table 4. For example, to calculate the sine of a large argument, we first divide the argument by $\pi/2$, obtaining a quotient Q and a remainder D where $|D| < \pi/2$. The table shows that only sin D or cos D need be calculated and that $\pi/2$ is within the domain of convergence. Note that the sine and cosine can be generated simultaneously by the CORDIC algorithm and that the answer may then be chosen as plus or minus one of these according to Q mod 4. As a second example, to calculate the logarithm of a large argument we first shift the argument's binary point E places until it is just to the left of the most significant non-zero bit. The fraction M then satisfies $0.5 \le M < 1.0$ and, as shown in the table, therefore falls within the domain of convergence. The answer is calculated as $\log_e M + E \log_e 2$. ##SPC11##
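The two prescaling examples can be sketched as follows. This is an illustration under stated assumptions, not the microprogram: `math.sin`, `math.cos`, and `math.log` stand in for the CORDIC resolver, the quotient is taken by rounding to the nearest integer (one of several choices that keep |D| below π/2), and the function names are hypothetical.

```python
import math

def sine_prescaled(x):
    """sin x from sin D and cos D, where x = Q*(pi/2) + D."""
    q = round(x / (math.pi / 2))          # quotient Q
    d = x - q * (math.pi / 2)             # remainder D
    s, c = math.sin(d), math.cos(d)       # both available from one CORDIC pass
    return (s, c, -s, -c)[q % 4]          # selection according to Q mod 4

def log_prescaled(w):
    """ln w as ln M + E ln 2, where w = M * 2**E and 0.5 <= M < 1."""
    mantissa, exponent = math.frexp(w)
    return math.log(mantissa) + exponent * math.log(2.0)

print(sine_prescaled(100.0), math.sin(100.0))
print(log_prescaled(1000.0), math.log(1000.0))
```

In Python the expression `q % 4` is non-negative even for negative Q, so the same selection tuple serves arguments of either sign.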

The accuracy at the $n$th step is determined in theory by the size of the last of the converging sequence of rotations $\alpha_i$, and for large n is approximately equal in digits to $F_{n-1}$. The accuracy in digits may conveniently be made equal to L, the length of storage used for each variable, by choosing n such that $F_{n-1} = L$.

In practice the accuracy is limited by the finite length of storage. The truncation of input arguments performed to make them fit within the storage length gives rise to unavoidable error, the size of which depends on the sensitivity of the calculated function to small changes in the input argument. In a binary code, the truncation of intermediate results after each of L iterations gives rise to a total of at most $\log_2 L$ bits of error. This error can be rendered harmless by using $L + \log_2 L$ bits for the storage of intermediate results.
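As a worked instance of the storage rule, the sketch below evaluates L + log2 L for a 40-bit result. Rounding the logarithm up to a whole number of bits is an assumption made here for illustration, and the function name is hypothetical.

```python
import math

def intermediate_bits(result_bits):
    """Bits needed for intermediate results: L plus log2(L), rounded up."""
    return result_bits + math.ceil(math.log2(result_bits))

print(intermediate_bits(40))   # log2(40) is about 5.32, so 6 extra bits
```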

In a normalized floating point number system it is desirable that all L bits of the result be accurate, independent of the absolute size of the argument. To accomplish this for very small arguments it is necessary to keep each storage register in a normalized form; i.e., in a form where there are no leading zeros. It is possible to do this by transforming the iteration equations (3), (4), (13) to a normalized form according to the following substitutions:

$$ x \text{ becomes } x', \tag{34} $$

$$ y \text{ becomes } y' \cdot 2^{-E}, \tag{35} $$

$$ z \text{ becomes } z' \cdot 2^{-E}, \tag{36} $$

$$ \alpha_F \text{ becomes } \alpha_F' \cdot 2^{-F}, \tag{37} $$

where E, a positive integer, is chosen such that the initial argument, placed into either the y or z register, is normalized.

The result of the substitutions is:

$$ x' \to x' + m y'\, 2^{-(F+E)}, \tag{38} $$

$$ y' \to y' - x'\, 2^{-(F-E)}, \tag{39} $$

$$ z' \to z' + \alpha_F'\, 2^{-(F-E)}. \tag{40} $$
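The equivalence of the normalized iteration to the original one can be checked for a single step. The sketch below applies the substitutions (34) through (37) with radix 2 and compares the results; the function names and the sample values of E and F are illustrative assumptions, and the choice of rotation direction is left out.

```python
import math

def angle_constant(m, f):
    """alpha_{m,F} from equation (32) with radix 2."""
    d = 2.0 ** -f
    if m == 1:
        return math.atan(d)
    if m == 0:
        return d
    return math.atanh(d)

def plain_step(x, y, z, m, f):
    """Equations (3), (4), and (13) with delta = 2**-F."""
    d = 2.0 ** -f
    return x + m * y * d, y - x * d, z + angle_constant(m, f)

def normalized_step(xp, yp, zp, m, f, e):
    """Equations (38) through (40) acting on x', y', z'."""
    a_prime = angle_constant(m, f) * 2.0 ** f      # equation (37)
    return (xp + m * yp * 2.0 ** -(f + e),
            yp - xp * 2.0 ** -(f - e),
            zp + a_prime * 2.0 ** -(f - e))

x, y, z, e, f = 1.0, 3.0e-4, 2.0e-4, 11, 13
for m in (1, 0, -1):
    x1, y1, z1 = plain_step(x, y, z, m, f)
    xn, yn, zn = normalized_step(x, y * 2.0 ** e, z * 2.0 ** e, m, f, e)
    # Undo the scaling of y' and z' before comparing.
    print(m, x1 - xn, y1 - yn * 2.0 ** -e, z1 - zn * 2.0 ** -e)
```

Because the scale factors are powers of two, the primed quantities carry the same information as the unprimed ones while keeping the small argument in the leading bits of its register.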

For simplicity the subscripts i and i+1 have been dropped. Instead, $\alpha$ has been expressed as a function of F as in equation (32), and the replacement operator ($\to$) has been used. The value of i may be initialized such that $F_i = E$:

$$ i_{\text{initial}} \to \{ i \mid F_i = E \}. \tag{41} $$

The value of n may be chosen such that L significant bits are obtained:

$$ n \to \{ n \mid F_{n-1} - E = L \}. \tag{42} $$

Note that $n - i_{\text{initial}} \approx L$ and that therefore providing $L + \log_2 L$ bits for the storage of intermediate results is still adequate.

The radius factor K is now a function of i = i.sub.initial as well as m, ##SPC12##

Fortunately, not all the reciprocal constants 1/K.sub.m,i need to be stored since for large values of i

(1/K.sub.m,i) .apprxeq. 1 - m (2/3) 2.sup.-.sup.2i (44)

and therefore all the constants having i > L/2 are identical to within L significant bits. Therefore, only L/2 constants need to be stored for m = +1 and also for m = -1. For m=0 no constants need to be stored since K.sub.0,i = 1 for i .gtoreq. 1.
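Approximation (44) can be checked numerically with a short Python sketch. The sketch assumes the simple shift sequence F.sub.j = j and takes K as the product of the factors (1 + m 2.sup.-.sup.2j).sup.1/2 over the remaining iterations. Both are assumptions made for the example, since the defining equation is not reproduced in this text, and the function names are hypothetical.

```python
import math

def inv_k(m, i, n=64):
    """1/K for iterations i .. n-1, assuming F_j = j (illustrative only)."""
    k = 1.0
    for j in range(i, n):
        k *= math.sqrt(1.0 + m * 2.0 ** (-2 * j))
    return 1.0 / k

def inv_k_approx(m, i):
    """Right-hand side of approximation (44)."""
    return 1.0 - m * (2.0 / 3.0) * 2.0 ** (-2 * i)

def max_error(m, i):
    """Absolute difference between the product form and approximation (44)."""
    return abs(inv_k(m, i) - inv_k_approx(m, i))
```

Under these assumptions the difference stays below 2.sup.-.sup.4i for the values tried, so the correction term itself drops below 2.sup.-.sup.L once i exceeds L/2.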

A similar savings in storage can be made for the angle constants .alpha..sub.m,F since for large values of F

.alpha..sub.m,F ' .tbd. .alpha..sub.m,F * 2.sup.F .apprxeq. 1 - m(1/3) 2.sup.-.sup.2F (45)

and, thus, as for the K constants, only L/2 constants need to be stored for m = +1 and also for m = -1. For m=0 no constants need to be stored since .alpha..sub.0,F ' = 1 for F .gtoreq. 1.
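Approximation (45) can be checked in the same way. The Python sketch below assumes the usual unified CORDIC angle constants: the arctangent of 2.sup.-.sup.F for m = +1, the archyperbolic tangent of 2.sup.-.sup.F for m = -1, and 2.sup.-.sup.F itself for m = 0. The function names are hypothetical.

```python
import math

def alpha_normalized(m, F):
    """alpha' = alpha * 2^F for the assumed angle constants."""
    d = 2.0 ** -F
    if m == 1:
        return math.atan(d) / d
    if m == -1:
        return math.atanh(d) / d
    return 1.0          # m = 0: alpha = 2^-F, so alpha' is exactly 1

def alpha_approx(m, F):
    """Right-hand side of approximation (45)."""
    return 1.0 - m * (1.0 / 3.0) * 2.0 ** (-2 * F)

def max_error(m, F):
    """Absolute difference between the exact ratio and approximation (45)."""
    return abs(alpha_normalized(m, F) - alpha_approx(m, F))
```

For the values tried, the difference stays below 2.sup.-.sup.4F, which is consistent with storing only about L/2 angle constants for each nonzero m.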

GENERAL DESCRIPTION

As shown in FIG. 3, the FPP includes three identical arithmetic units 10, 12, and 14 operated in parallel. Each arithmetic unit contains a 64-bit register 16, 18, or 20, an 8-bit parallel adder/subtractor 22, 24, or 26, and an 8-out-of-48 multiplex shifter 28, 30, or 32. The assembly of arithmetic units is controlled by a microprogram stored in a read-only memory (ROM) 34, which also contains the angle and radius-correction constants. The ROM contains 512 words of 48 bits each and operates on a cycle time of 200 nanoseconds.

The essential aspects of the microprogram used to execute the unified CORDIC algorithm are shown in FIG. 4. The initial argument and correction constants are loaded into the three registers 16, 18, and 20, and m is set to one of the three values 1, 0, or -1. If the initial argument is small, it is normalized and E is set to minus the binary exponent of the result; otherwise, E is set to zero. Next, i is initialized to a value such that F.sub.m,i = E. A loop is then entered and is repeated until F.sub.m,i - E = L. In this loop the direction of rotation necessary to force either of the angles A or z to zero is chosen; the binary variable .sigma., used to control the three adder/subtractors 22, 24, and 26, is set to either +1 or -1; and the iteration equations in block 36 or 38 are executed.
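The loop described above can be sketched in software for one case. The following Python fragment is a simplified illustration, not the microprogram of FIG. 4. It covers only m = +1 with z forced to zero, uses floating-point arithmetic in place of the byte-serial adders and shifters, takes E = 0, and assumes the shift sequence F.sub.i = i starting at i = 0. The sign convention chosen for .sigma. and the function name are likewise assumptions.

```python
import math

def cordic_rotate(z0, n=40):
    """Circular (m = +1) CORDIC driving z to zero; returns about (cos z0, sin z0)."""
    # Radius factor for the assumed shift sequence F_i = i, i = 0 .. n-1.
    k = 1.0
    for i in range(n):
        k *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    # Pre-scale x by 1/K so that no correction is needed after the loop.
    x, y, z = 1.0 / k, 0.0, z0
    for i in range(n):
        sigma = 1 if z >= 0 else -1    # direction of rotation that reduces |z|
        d = 2.0 ** -i                  # the shift, written here as a multiply
        x, y = x - sigma * y * d, y + sigma * x * d
        z -= sigma * math.atan(d)      # angle constant for this iteration
    return x, y
```

With 40 iterations and an argument inside the convergence range (roughly 1.74 radians in magnitude for this shift sequence), the returned pair agrees with the cosine and sine of the argument to better than 1 part in 10.sup.9.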

Table 5 gives a breakdown of the maximum execution times for the routines performed by the FPP. The figures in the column marked "data transfers from computer" are the times for operand and operation code transfers between the FPP and the COMP and vary depending upon the COMP used and how it is interfaced with the FPP. The FPP retains the result of each executed function. Thus, the binary functions add, subtract, multiply and divide require only one additional operand to be supplied, and the unary functions do not require any operand transfers. The first operand is loaded via the LOAD instruction, and the final result is retrieved via the STORE instruction. ##SPC13##

PROGRAMMING INFORMATION

Table 6 lists the operation codes for each of the 20 FPP routines.

TABLE 6

Cordic Floating-Point Operation Codes

                                                        Bit
Group                                        Mnemonic   7 6 5 4 3 2 1 0
TRIPLE LOAD AND STORE                        LDX        0 0 1 0 1 0 0 1
                                             STX        0 0 1 0 0 0 0 1
TRIPLE PRECISION FLOATING POINT ARITHMETIC   ADX        0 0 0 0 0 0 0 1
                                             SBX        0 0 0 0 0 0 1 1
                                             MPX        0 0 0 0 0 1 0 1
                                             DVX        0 0 0 0 0 1 1 1
FUNCTIONS                                    ABX        0 0 1 0 0 0 1 1
                                             ENX        0 0 1 0 0 0 0 1
                                             CMX        0 0 0 0 1 0 0 1
                                             CSX        0 0 0 1 0 0 1 1
                                             SNX        0 0 0 1 0 0 0 1
                                             TNX        0 0 0 1 0 1 0 1
                                             ATX        0 0 0 0 1 0 1 1
                                             HCX        0 0 0 1 1 0 1 1
                                             HSX        0 0 0 1 1 0 0 1
                                             HTX        0 0 0 1 1 1 0 1
                                             AHT        0 0 0 0 1 1 0 1
                                             EXX        0 0 0 1 1 1 1 1
                                             LNX        0 0 0 0 1 1 1 1
                                             SRX        0 0 0 1 0 1 1 1
                                             FIX        0 0 1 0 0 1 1 1
                                             RNX        0 0 1 0 0 1 0 1

The FPP receives data from the COMP in a normalized or un-normalized extended floating-point format and returns data in a normalized extended floating-point format to the COMP. As shown in the upper box of FIG. 5, a typical traditional floating-point format employs 23 bits for the mantissa, 1 bit for the mantissa sign, 7 bits for the exponent, and 1 bit for the exponent sign. This requires two 16-bit words. Note that the second word is split, such that bits 8 through 15 are the eight least significant bits of the mantissa, and the remaining eight bits are the exponent and exponent sign. As shown in the middle box of FIG. 5, the only difference between the traditional double-word floating point format and the extended floating-point format employed by the FPP is the addition of one 16-bit word to the length of the mantissa.

As shown in the lower box of FIG. 5, the conversion from the traditional double-word floating-point format to the extended triple-word floating-point format is accomplished by splitting the second word of the double-word format between bits 8 and 7, and inserting 16 zeros as the least significant bits of the mantissa in the triple-word format. Note that 8 of these zeros are present in the second word of the triple-word format, and the remaining 8 are in the third word. The reverse conversion, from triple- to double-length format, consists simply of truncating the 16 least significant bits of the mantissa. This means removing bits 7 through 0 of the second word, and bits 15 through 8 of the third word.
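The two conversions just described amount to simple masking of 16-bit words. The following Python sketch illustrates them, with each word held as an integer in the range 0 through 65535. The function names are hypothetical, and the sketch follows only the word layout stated in the text.

```python
def double_to_triple(w1, w2):
    """Convert a double-word operand (w1, w2) to the triple-word format."""
    mantissa_low = w2 & 0xFF00   # bits 15-8 of word 2: eight low mantissa bits
    exponent = w2 & 0x00FF       # bits 7-0 of word 2: exponent and exponent sign
    # Word 2 gains eight zero mantissa bits below the old low mantissa byte;
    # word 3 carries eight more zero mantissa bits above the exponent byte.
    return (w1, mantissa_low, exponent)

def triple_to_double(w1, w2, w3):
    """Truncate the 16 least significant mantissa bits of a triple-word operand."""
    # Drop bits 7-0 of word 2 and bits 15-8 of word 3, then rejoin the halves.
    return (w1, (w2 & 0xFF00) | (w3 & 0x00FF))
```

For example, the double-word operand (0x4000, 0x1203) becomes (0x4000, 0x1200, 0x0003), and converting that result back restores the original two words.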

The conversions between double-word integer and extended floating point formats (not illustrated) are more complex. Briefly, the process is as follows. For conversion to double-word integer, the mantissa is arithmetically shifted right while the exponent value is correspondingly increased (one increment per shift) until the exponent equals +31. If bit 15 is a 1 (implying 1/2), the quantity in bits 16 through 47 is incremented by 1; this rounds the integer to the nearest whole number. Bits 8 through 15 are set to 0, and bits 16 through 47 comprise the integer value. The reverse conversion, double-word integer to extended floating point, consists of filling in zeros for bits 8 through 15 of the least significant word, setting the exponent to +31, and then normalizing the result.

From the standpoint of the COMP programmer, the FPP effectively adds an extended floating-point accumulator, designated the X register. This register can be loaded from the COMP memory, its contents manipulated, and its contents stored in the COMP memory. Each extended floating-point operand may be contained in three consecutive COMP memory locations, as assumed for the following definitions of the twenty FPP routines:

LDX (LOAD EXTENDED FLOATING POINT). Load the extended floating point number from addressed memory location m (and m + 1 and m + 2) into the X-register.

STX (STORE EXTENDED FLOATING POINT). Store the extended floating point number in the X-register in addressed memory location m (and m + 1 and m + 2).

ADX (ADD EXTENDED FLOATING POINT). Add the extended floating point number from addressed memory location m (and m + 1 and m + 2) to the current value in the X-register. The extended floating point result of the addition occupies the X-register on completion of the instruction.

SBX (SUBTRACT EXTENDED FLOATING POINT). Subtract the extended floating point number in addressed memory location m (and m + 1 and m + 2) from the current value in the X-register. The extended floating point result of the subtraction occupies the X-register on completion of the instruction.

MPX (MULTIPLY EXTENDED FLOATING POINT). Multiply the extended floating point number in the X-register by the extended floating point number in addressed memory location m (and m + 1 and m + 2). The extended floating point result of the multiplication occupies the X-register on completion of the instruction.

DVX (DIVIDE EXTENDED FLOATING POINT). Divide the extended floating point number in the X-register by the extended floating point number in addressed memory location m (and m + 1 and m + 2). The extended floating point result of the division occupies the X-register on completion of the instruction.

ABX (ABSOLUTE VALUE). Calculate the absolute value of the extended floating point value in the X-register; i.e., if the content of the X-register is negative, convert to positive.

ENX (ENTIER). Calculate the entier of the extended floating point value in the X-register. The calculated result replaces the original contents of the X-register.

CMX (COMPLEMENT). Convert the extended floating point value in the X-register to an extended floating point value with unchanged magnitude but opposite sign.

CSX (COSINE). Calculate the cosine of the value in the X-register, where the value is expressed in radians as an extended floating point number. The result of the cosine calculation replaces the original value in the X-register.

SNX (SINE). Calculate the sine of the value in the X-register, where the value is expressed in radians as an extended floating point number. The result of the sine calculation replaces the original value in the X-register.

TNX (TANGENT). Calculate the tangent of the value in the X-register, where the value is expressed in radians as an extended floating point number. The result of the tangent calculation replaces the original value in the X-register.

ATX (ARCTANGENT). Calculate the arctangent of the value in the X-register, where the result is expressed in radians as an extended floating point number in the X-register.

HCX (HYPERBOLIC COSINE). Calculate the hyperbolic cosine of the extended floating point number in the X-register. The result of the hyperbolic cosine calculation replaces the original value in the X-register.

HSX (HYPERBOLIC SINE). Calculate the hyperbolic sine of the extended floating point number in the X-register. The result of the hyperbolic sine calculation replaces the original value in the X-register.

HTX (HYPERBOLIC TANGENT). Calculate the hyperbolic tangent of the extended floating point number in the X-register. The result of the hyperbolic tangent calculation replaces the original value in the X-register.

AHT (ARCHYPERBOLIC TANGENT). Calculate the archyperbolic tangent of the extended floating point number in the X-register. The result of the archyperbolic tangent calculation replaces the original value in the X-register.

EXX (EXPONENTIAL). Calculate the exponential of the extended floating point number in the X-register. The result of the exponential calculation replaces the original value in the X-register.

LNX (NATURAL LOGARITHM). Calculate the natural logarithm of the extended floating point number in the X-register. The result of the logarithm calculation replaces the original value in the X-register.

SRX (SQUARE ROOT). Calculate the square root of the extended floating point number in the X-register. The result of the square root calculation replaces the original value in the X-register.

FIX (ROUND TO NEAREST INTEGER). Round off the extended floating point number in the X-register to the nearest integer value. The result remains an extended floating point number and replaces the original value in the X-register. The number of bits affected depends on the exponent value.

RNX (ROUND TO 24 BITS). Round off the extended floating point number in the X-register to 24 bits of precision.

Internally, the FPP processes all data as normalized extended floating point numbers and can convert the input data if the data is not already in this form. As shown in Table 7, normalization is accomplished by shifting the mantissa left to eliminate any zeros between the binary point and the first non-zero bit, while the exponent is correspondingly reduced by subtracting 1 for each shift. In the upper example of Table 7, there are two zeros between the binary point and the 1 bit. Therefore two left shifts are necessary, and the exponent is reduced from -121 to -123. In the lower example, the number is too small to be normalized, resulting in an underflow condition. The mantissa is shown shifted left seven positions, which reduces the exponent to its smallest possible value, -128. No further left shifts can therefore be made, and there is still one remaining zero between the binary point and the 1 bit. If a number is too large to be represented in the normalized extended floating point format, an overflow condition results.

TABLE 7

NORMALIZATION

Mantissa                     Binary Exponent
(Binary representation)      (Decimal representation)
+ .0010000000 . . . 000      -121
+ .1000000000 . . . 000      -123

UNDERFLOW (Cannot be normalized)

Mantissa                     Binary Exponent
(Binary representation)      (Decimal representation)
+ .0000000010 . . . 000      -121
+ .0100000000 . . . 000      -128
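The normalization procedure and the underflow case of Table 7 can be sketched as follows. This Python fragment is an illustration for positive mantissas only. The mantissa is held as an unsigned integer of `width` fraction bits, and the function name and the `width` parameter are assumptions made for the example.

```python
EXP_MIN = -128   # smallest representable exponent, per the text

def normalize(mantissa, exponent, width=39):
    """Shift a positive mantissa left until its leading fraction bit is 1.

    Returns (mantissa, exponent, underflow). The underflow flag is set when
    the exponent reaches EXP_MIN before the leading zeros are eliminated.
    """
    if mantissa == 0:
        return 0, 0, False           # zero is all zeros in both fields
    top = 1 << (width - 1)           # bit immediately right of the binary point
    while not (mantissa & top) and exponent > EXP_MIN:
        mantissa <<= 1               # one left shift ...
        exponent -= 1                # ... costs one count of exponent
    return mantissa, exponent, not (mantissa & top)
```

With a 10-bit fraction, the two examples of Table 7 are reproduced: .0010000000 at exponent -121 becomes .1000000000 at -123, while .0000000010 at -121 stops at .0100000000 with exponent -128 and the underflow flag set.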

FIG. 6 defines the ranges of valid binary numbers that can be processed by the FPP. The shaded areas define overflow and underflow ranges. In the VALUE column, the mantissa is enclosed in parentheses, followed by the exponent outside of the parentheses. Note that 0 (middle line of the figure) is represented by all zeros in both the mantissa and the exponent. Positive numbers are shown above this line, and negative numbers are shown below this line. In the far right column, the exponent values are shown increasing in both directions from the zero line, from the smallest representable value (-128) to the largest (+127). Nonrepresentable exponent values, smaller than -128, are also underflow conditions, but this situation is not considered in FIG. 6.

For each finite value of exponent, the mantissa is assumed to go through its complete cycle of valid values. In approximate terms, the positive mantissa cycles from +1/2 to +1, and the negative mantissa cycles from -1/2 to -1. More precisely, the positive range is from +1/2 to the largest possible fractional number under the value of 1. The negative range is from the largest possible fractional number below (more negative than) -1/2 to -1.

The significance of +1/2 and -1/2 in determining mantissa ranges is a result of the normalization requirement, which dictates that there will be a significant digit immediately to the right of the binary point. This automatically eliminates all numbers between +1/2 and -1/2, including the exact value of -1/2, but excluding 0 and +1/2.

If an error condition results during the execution of an FPP routine, the FPP provides an error code and an error control signal. Table 8 lists the possible types of error that can occur for each of the 20 routines of the FPP. ##SPC14##

1. x = Contents of X-register before execution. z = Contents of X-register after execution.

The FPP add, subtract, multiply, and divide routines will indicate an underflow error condition if the result cannot be normalized, i.e., the result falls in one of the shaded areas immediately adjacent to zero in FIG. 6. If the input data is not normalized, the FPP will normalize the operands before beginning the computation; if the numbers are too small to be successfully normalized, the FPP will attempt normalization as far as possible and then proceed with the computation. The answer will be correct except that the sign bit of the exponent will be incorrect (complemented). The FPP will indicate an underflow error condition for CMX if the pre-execution contents of the X-register (i.e., x) equal (1/2)2.sup.-.sup.128, for HCX or HSX if x is less than -88, and for EXX if x is less than -87.3. The SBX routine checks if the subtrahend from memory has the minimum allowed positive value (i.e., (1/2)2.sup.-.sup.128 in FIG. 6); this would produce an underflow when converted to negative form during execution. However, as in all underflow conditions, the computation will proceed, and only the sign bit of the exponent will be incorrect.

The FPP add, subtract, multiply, and divide routines will indicate an overflow error condition if the result exceeds the largest positive or negative number which can be represented; i.e., the result falls in either the top or bottom shaded areas in FIG. 6. Answers will be correct except for the sign bit of the exponent, which will be complemented. However, the divide routine DVX can produce one exception to this general rule. If the dividend exponent is +126 or +127 and the divisor exponent is -127 or -128, the resultant exponent will be incorrect. In this case, the exponent value will equal the original value of the dividend exponent, plus two. This produces a rollover to an apparent negative exponent of -128 (if the original was +126) or -127 (if the original was +127). The SBX routine also checks the subtrahend from memory before execution begins. If the subtrahend is at the maximum allowed negative value [i.e., (-1)2.sup.127 in FIG. 6], an overflow will result when the number is converted to positive form during execution. Similarly, the CMX and ABX instructions check the pre-execution contents of the X-register for the maximum allowed negative value; overflow will result when the number is converted. However, in the SBX, CMX, and ABX routines execution will proceed, and only the sign bit of the exponent will be incorrect (complemented). The RNX routine will indicate an overflow condition if the number is at the maximum allowed positive value and then is rounded upward. This will cause rollover to the maximum negative number. Overflows resulting from the FIX, HCX, HSX, and EXX routines produce results which are generally unpredictable, making it difficult or impossible to reconstruct correct answers.

The CSX, SNX, and TNX routines allow the arguments to include multiple rotations of the angle represented by the argument. These rotations use up part of the available floating point bits, leaving fewer bits to express the fractional part of the rotation for the computation. When the number of rotations reaches about 2.sup.38 /2.pi., there is insufficient resolution to express angles in increments smaller than 90 degrees. The no-resolution error code 3 indicates this condition.

If the divisor for the DVX routine is zero, the division will not be attempted, and the error code 4 will indicate this condition. In the TNX routine, odd numbers of quarter rotations (1/4, 3/4, etc.) result in a divide-by-zero condition (sin = 1, cos = 0). This is also indicated by the error code 4. The value 1 will remain in the X-register on exit from this routine.

Attempts to calculate the square root (SRX) of negative numbers, or the natural logarithm (LNX) of zero or negative numbers, will not be executed, and will leave the X-register unchanged. Attempts to calculate the archyperbolic tangent (AHT) of numbers which are .+-.1 or greater in magnitude also will not be executed and will leave the X-register unchanged with the exception that the attempt to calculate the archyperbolic tangent of -1 will result in clearing the X-register to zero.

Undefined operation codes given to the FPP will indicate error code 6 and will otherwise be ignored.

The ENX, RNX, and FIX routines are explained in the following paragraphs.

Entier (ENX) is simple truncation of the fractional part of a number. This is accomplished by noting the value of the exponent and saving that number of places in the most significant part of the mantissa. The remaining bits of the mantissa are cleared (to zeros). The effect for positive numbers is to reduce the X-register contents (if there is a fraction) to the nearest integer value and for negative numbers is to increase the X-register contents (even if there is no fraction) to the nearest negative integer value.

Rounding an extended floating point number to 24 bits of significance (RNX) implies that a specific number of bits (16) will be cleared. Before these bits are cleared, however, the most significant bit of those to be cleared is examined. If this bit is a 1, indicating for positive numbers that the part of the number to be cleared represents a value of 1/2 (or more) of the least significant bit of the saved part, the saved part is incremented by +1. Thus, if the cleared part of a positive number is 1/2 or greater, the value is rounded upward (not necessarily to an integer); otherwise, the value is rounded downward. If the cleared part of a negative number is -1/2 or more negative, the value is rounded in the negative direction; otherwise, the value is rounded in the positive direction. Note that if all 23 significant data bits were 1's before execution, rounding upward (incrementing by +1) would cause overflow; unless an overflow condition exists, the instruction will automatically renormalize the number.
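The RNX rounding rule can be sketched for positive numbers as follows. This Python fragment is an illustration only. It treats the mantissa as an unsigned integer of 39 magnitude bits (the 40-bit extended mantissa less its sign), handles neither negative mantissas nor the rollover that the FPP produces on overflow, and uses hypothetical names throughout.

```python
MANT_BITS = 39   # magnitude bits of a positive extended mantissa (assumed layout)
CLEARED = 16     # number of low-order bits cleared by RNX
EXP_MAX = 127    # largest representable exponent

def rnx_positive(mantissa, exponent):
    """Round a positive, normalized mantissa to 23 magnitude bits.

    Returns (mantissa, exponent, overflow).
    """
    saved = mantissa >> CLEARED
    if (mantissa >> (CLEARED - 1)) & 1:   # most significant bit to be cleared
        saved += 1                        # cleared part was 1/2 LSB or more
    rounded = saved << CLEARED
    if rounded >> MANT_BITS:              # carry out of the top: mantissa reached 1
        if exponent >= EXP_MAX:
            return rounded, exponent, True    # overflow; rollover not modelled
        rounded >>= 1                     # renormalize to 1/2 ...
        exponent += 1                     # ... and raise the exponent
    return rounded, exponent, False
```

When all 23 saved bits are 1's and the rounding bit is set, the carry produces a mantissa of 1, which the sketch renormalizes to 1/2 with the exponent raised by one, unless the exponent is already at its maximum.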

Rounding an extended floating point number to the nearest integer (FIX) involves two steps: conversion to integer, and rounding. First, the mantissa is shifted right while the exponent value is incremented, once per shift, until the exponent equals +31. Then the most significant bit of the eight bits to be cleared is examined. If this bit is a 1, the saved part is incremented by +1. For positive numbers, the 1 bit indicates a fraction of 1/2 or greater; for negative numbers, it indicates a fraction smaller than 1/2. After rounding, bits 8 through 15 are cleared. If the cleared part of a positive number is 1/2 or greater, the saved part is rounded to the next higher integer; otherwise the number is reduced to the integer-only value. If the cleared part of a negative number is -1/2 or more negative, the value is rounded to the next more negative integer; otherwise the number becomes the integer-only value.
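The two steps of FIX can be sketched for positive numbers in the same manner. In this Python illustration the mantissa is again an unsigned integer of 39 fraction bits, so that once the exponent has been brought to +31 the low eight bits hold the fraction and the remaining bits hold the integer. The function name is hypothetical, the sketch returns the integer value rather than a renormalized floating point number, and negative mantissas are not handled.

```python
def fix_positive(mantissa, exponent):
    """Round a positive extended number to the nearest integer value."""
    shift = 31 - exponent
    if shift < 0:
        raise OverflowError("exponent exceeds +31")
    shifted = mantissa >> shift   # exponent is now +31; low 8 bits are the fraction
    integer = shifted >> 8        # the part that is saved
    if (shifted >> 7) & 1:        # most significant of the eight bits to be cleared
        integer += 1              # fraction of 1/2 or greater: round upward
    return integer
```

For example, 2.5 is held as the mantissa 0.625 with exponent 2 and rounds to 3, while 2.25 (mantissa 0.5625, exponent 2) rounds to 2.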

THEORY OF OPERATION

The floating point processor has two basic modes of operation, one for binary function routines, and one for unary function routines. FIGS. 7 and 8 illustrate these two basic modes of operation.

In FIGS. 7 and 8, the three registers, A, B, C, together comprise what was earlier called, for simplicity, the X-register. Each of the three registers has 48 bits for data (mantissa), and also has 8 bits for an exponent byte (E) and 8 bits for a shift control byte (S). Since the C-register has no facilities for shifting, the C shift control byte is used to receive the operation code from the COMP. The shaded areas indicate the location of operands at the start of the operations.

Referring now to FIG. 7, an operand quantity x, previously loaded into the FPP, is assumed to exist in the FPP B-register. The COMP places an operation code on interface data lines 40 and issues an OPC command to the FPP via line 42. (An OPC command initiates the reception of an operation code through the 16 bit input port of the FPP and initiates the execution of that operation.) The FPP, which operates under control of firmware programs in the ROM, cycles in a wait mode as long as it is in the ready state, looking for an OPC command. When the ROM program detects the presence of OPC, it loads the operation code data into the S byte of the FPP C-register. Then the microprogram identifies the type of operation by decoding the operation code bits, and branches to the appropriate function routine. The operation code is now no longer needed.

As the microprogram proceeds, the value in the FPP B-register is manipulated according to the algorithm for the particular function, using all three registers. At the end of the routine, the final answer resides in the FPP B-register, and a FLG (Flag) signal is sent back to the COMP. The FLG signal indicates, if following an OPC command, that the FPP is ready for further commands, having completed the last issued command.

If an error occurs during the calculation, or if the operation code is improper, an ERR (Error) signal is sent back to the COMP with the FLG signal. It is the user's option to decide what to do about an error condition. In general, it may be said that the FPP will attempt the calculation, rather than abort, even if input values will result in an error. The FPP will provide the best answer it can, along with the error indication. This allows the programmer some flexibility to reconstruct correct answers from results which normally could not be represented. The exception to this generalization is that divisions by zero will not be attempted.

Referring now to FIG. 8, the binary function routines take two operands, one that was previously loaded into the FPP, and one that exists in the COMP memory, and operate on these operands. The operations include addition, subtraction, multiplication, and division.

Initially the quantity x is assumed to exist in the FPP B-register. It may have been left there as the result of a previous instruction, or it may have been loaded by a load instruction LDX. The COMP first fetches one word of a three-word operand (y) from memory. It then puts this data word on the interface data lines and issues an ENC command to the FPP unit. An ENC command initiates the reception of an operand word through the 16 bit input port and the transmission of a result word through the 16 bit output port.

As mentioned previously, the FPP unit, under control of the microprogram, continuously searches for ENC or OPC commands as long as it is in the ready state. When the microprogram detects the presence of ENC, it loads the data word (in two 8-bit bytes) into the high order third of the FPP C-register.

After the second byte has been loaded, the FPP unit sends a FLG signal back to the COMP, indicating readiness for the next word of y. The COMP fetches this next word from its memory and repeats the process: the word is placed on the interface data lines 40, an ENC command is given to the FPP to load these two bytes, and another FLG signal is returned to repeat the process for the third and final time.

At the end of the three-word transfer, the quantity x is in the FPP B-register and the quantity y is in the FPP C-register. The FPP unit now needs to be told what to do with these numbers. The entire process described above under function routines is now added on. In brief, the procedure is:

a. The COMP issues an operation code and OPC control signal.

b. The FPP unit loads the operation code into the FPP C-register.

c. The ROM program interprets the operation code and branches to the appropriate function routine (add, subtract, etc.).

d. The function routine calculates the answer, and leaves it in the FPP B-register.

e. A final FLG returned to the COMP tells it that the FPP unit is ready for further commands.

f. In case of error, an ERR signal indicates that an error code is present on the data lines to the COMP.

The load instruction LDX operates similarly to the above procedure, except that the operation code simply causes the loaded FPP C-register contents to move up into the FPP B-register.

The OPC command, or ENC command, should only be given (set to 1) to the FPP if FLG from FPP is 1. The ENC command should not go to 0 until FLG from FPP goes to 0. The ENC or OPC command should not go to 1 again until FLG goes to 1.

The following describes in detail the procedures described above, plus a description of the FPP power supply. All logic is positive-true. The high (or true) state ranges from +1.25 to +2.5 volts; the low (or false) state ranges from -0.5 to +0.5 volts.

The block diagrams of FIGS. 9A-D are referenced throughout this description. Tables 9 and 10 provide supporting information: the detailed coding of the instruction register (IR), definitions of the ROM instructions, and a list of tests used for branching decisions. These tables and FIGS. 9A-D should be referred to frequently, since definitions will not be given within the descriptive text. The logic diagrams for the FPP are given in FIGS. 11-40, a wiring diagram is given in FIGS. 42A-F, and flow charts for the FPP routines are given in FIGS. 44-71. Specific signal names are given on the block diagram, facilitating direct reference from a function on the block diagram to the comparable function on the logic diagrams, wiring lists, signal indexes, and signal transfers from board to board.

The clock generator for the floating point processor unit operates at a rate of 5 MHz (200-nanosecond period) supplying a 100-nanosecond clock signal, and its complement, to the FPP logic. The clock generator is located on the FPP D-register card. The complemented clock signal (Clock) allows the clock cycle to be split, so that an active bit may be loaded into a register in the first 50 nanoseconds and perform its function in the second 50 nanoseconds. This high-speed feature employs a master/slave pair of flip-flops for each bit. ##SPC15##

As shown in FIG. 72, Clock latches the master flip-flop, and Clock latches the slave flip-flop. The master flip-flop is loaded with data (Data 1) by Clock, and about 45 nanoseconds later Clock transfers the bit to the slave flip-flop. (There is a slight offset between Clock and Clock.) The output of the slave flip-flop can then be used in the logic without affecting the inputs (such as Data 2) that determined the setting of the master flip-flop.

The FPP ROM has nine input lines (RA0 through RA8), thus allowing 512 addresses (2.sup.9), and 48 output lines (R0 through R47), giving a word length of 48 bits.

The ROM constantly reads out whatever contents are enabled by the nine address lines. The ROM output lines are clocked either into the instruction register or, if the preceding instruction contained a constant call, into the D-register. All words in the ROM are either instructions or constants. For start-up purposes (power-on), the ROM is forced to start at address 0.

The FPP ROM contents are listed in Table 11 in the form of mnemonics and constants. Physically, the ROM is contained in 24 microcircuit packages on a single printed circuit card. Each package accepts 8 of the 9 address lines and has 4 of the 48 output lines. (See logic diagram, FIGS. 11A-E). The ninth address bit (RA8) selects either the high half or low half of ROM. The lower rank of 12 packages is enabled when RA8 is 0 and is therefore active for addresses 0 through 255 (decimal). The upper rank of 12 packages is enabled when RA8 is 1 and is therefore active for addresses 256 through 511. The outputs of the two ranks are "or"-tied together. ##SPC16## ##SPC17## ##SPC18## ##SPC19## ##SPC20## ##SPC21## ##SPC22## ##SPC23## ##SPC24## ##SPC25##
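The physical organization described above implies a simple mapping from a ROM address and output bit to a package. The Python sketch below illustrates that mapping. The assignment of four consecutive output lines to each package is an assumption made for the example (the actual assignment is given only in the logic diagrams), and the function name is hypothetical.

```python
def locate(address, bit):
    """Map a 9-bit ROM address and an output line number R0..R47 to
    (rank, package within rank, address within package, output pin)."""
    if not (0 <= address < 512 and 0 <= bit < 48):
        raise ValueError("address must be 0..511 and bit 0..47")
    rank = (address >> 8) & 1    # RA8: 0 selects the lower rank, 1 the upper
    local = address & 0xFF       # RA0-RA7 go to every package
    package = bit // 4           # ASSUMED: consecutive lines share a package
    pin = bit % 4
    return rank, package, local, pin
```

Under this assumption, address 300 and output line R47 fall in the upper rank, in the twelfth package, at package address 44.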

The instruction register is clocked to load the ROM output every 200 nanoseconds. The data word is loaded into the master latch of the instruction register by Clock and into the slave latch by Clock. At Clock the contents of the instruction register are used to control the logic (see block diagram). Loading the instruction register occupies one clock cycle (time intervals 1 and 2), as shown in FIG. 73.

As soon as the instruction is in the slave latch of the instruction register, execution begins. A typical execution might read a pair of byte operands, add them during Clock (2) and store the result during the next Clock (3) into the master latch of the specified register. The next Clock (4) would transfer the stored result from master to slave, where it may be used (read) by the next instruction. Notice that there is a time overlap, and the second instruction has already been loaded from ROM (3) into the master latch of the instruction register and execution has begun (4).

The detailed coding of the instruction register, plus definitions of the instruction fields, are given in FIG. 74. Physically, the register is split up and located on three separate cards: ROM address card, D-register card, and FPP interface card.

The floating point algorithms require considerable flexibility for branching from one area of ROM to another. The following paragraphs describe the various modes employed to specify the next address within a current instruction. The addressing modes are shown in Table 12.

Each microinstruction specifies where the next micro-instruction is to be obtained by selecting one of two addresses, dependent on a test condition. One of 16 conditions may be selected for the test by the TS field of the instruction. These conditions are numbered T0 through T15, as shown in Table 10.

TABLE 12. ROM ADDRESSING MODES

                                     NEXT ADDRESS
MODE                                 PAGE                           LINE
Unconditional Branching (TS = 0)     BP                             BL
Conditional Branching
  TS True                            MP (Current Page)              BL
  TS False                           MP (Current Page)              BP
Indirect and Constants
  IND                                CS (4-7), Bit 8 = 0            CS (0-3)
  CON                                CS (4-7), Bit 8 = AD9          CS (0-3)
JSB and Return
  JSB                                BP                             BL
                                     (If JAR = 0, save MP in FP)    (If JAR = 0, save TS in FL)
                                     (If JAR = 1, save MP in GP)    (If JAR = 1, save TS in GL)
  RTN                                If JAR = 0: GP                 If JAR = 0: GL
                                     If JAR = 1: FP                 If JAR = 1: FL
(JSB and RTN complement JAR after execution)

FIGS. 9A-D show the sources of many of the test inputs; for example, OPC and ENC from the computer, selected outputs of the A, B, C adders, and certain count values of the byte counter YC. These signals are applied as one input to a three-input gate in the conditional branching logic block, as shown in FIGS. 9A-D. (This gate represents a series of gates performing this function.) The second input to the gate is the decoded test number, and the third is the BRN signal (also decoded from the instruction register) which must be true for all branching instructions. If the test is true, bits 0 through 3 of the ROM address are taken from the BL field; if the test is false, these bits are taken from the BP field. Bits 4-8 of the ROM address are taken from the MP register, which contains the page number of the current microinstruction. Thus, conditional branches may be made only to microinstructions on the current page.

Unconditional branch microinstructions give the next ROM address, regardless of test conditions. The page and line of the next microinstruction are given by the BP and BL fields, respectively. The transfer is accomplished by coding BRN and TRU in the microinstruction. This enables the first of the three gates in the unconditional branching logic block, which in turn enables BP and BL onto the page and line address lines.
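The next-address selection for branch microinstructions can be sketched as follows. This is only an illustrative model of Table 12 and the two preceding paragraphs, not the gate-level logic. The function names are hypothetical, and the masking of the BP field to four bits when it supplies the line address is an assumption.

```python
def rom_address(page, line):
    """Combine a 5-bit page (address bits 4-8) and a 4-bit line (bits 0-3)."""
    return ((page & 0x1F) << 4) | (line & 0xF)


def next_address(mp, bp, bl, unconditional, test_true=False):
    """Model of next-address selection for branch microinstructions.

    mp: current page (MP register); bp, bl: BP and BL instruction fields.
    unconditional: True when BRN and TRU are coded (TS = 0).
    test_true: outcome of the condition selected by the TS field.
    """
    if unconditional:
        # Page and line both come from the instruction fields.
        return rom_address(bp, bl)
    # Conditional branch: stay on the current page, pick the line by the test.
    # Assumption: only the low four bits of BP are used as a line address.
    line = bl if test_true else bp
    return rom_address(mp, line)
```

For example, with MP = 3, BP = 7, and BL = 2, an unconditional branch goes to page 7, line 2, while a conditional branch stays on page 3 and goes to line 2 if the test is true or line 7 if it is false.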

The jump to subroutine instruction JSB is an unconditional branch with two special provisions: a return address is stored, allowing return from subroutine completion to this address, and register switching to allow one level of JSB nesting.

The decoded JSB signal gates BP and BL onto the page and line address lines in the unconditional branching logic block. The JSB signal also stores the current page value into either the GP or FP register, depending on the current state of the JAR flip-flop, and loads a return address line value from the TS field into either the GL or FL register, depending on the JAR flip-flop state. The TS field can be used to specify the return address line since conditional branches are not allowed with a JSB microinstruction. Thus, a return address is stored in either GP and GL or FP and FL. Since the stored page value is always the current page value, subroutine returns must be to the same page that contained the JSB call. Furthermore, since only four of the five page bits are stored, both the call and return addresses must be in the lower half of ROM (addresses 0 through 255). The subroutine itself, however, may be located on any page, specified in the BP field.

After the return address has been stored, the JAR flip-flop toggles to its complementary state. The initial state of JAR does not matter, as it is immaterial whether the G or F pair of registers is the first selected. The toggling of JAR after each occurrence of JSB insures that the remaining one of the F or G registers will be used to store a second return address. The return from subroutine instruction RTN retrieves a return address from either G or F depending on JAR and branches to that address. RTN also toggles the JAR flip-flop, insuring that a second return address will be retrieved from the remaining one of the F or G registers.
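The two-level return mechanism can be illustrated with a small behavioral model. It follows the JAR assignments given in Table 12 (JSB saves into F when JAR = 0 and into G when JAR = 1; RTN reads G when JAR = 0 and F when JAR = 1; both toggle JAR). The class and method names are hypothetical, and the model ignores timing.

```python
class ReturnRegisters:
    """Behavioral sketch of the F/G return registers and the JAR flip-flop."""

    def __init__(self, jar=0):
        self.jar = jar          # initial state is immaterial
        self.f = (0, 0)         # (FP, FL)
        self.g = (0, 0)         # (GP, GL)

    def jsb(self, mp, ts, bp, bl):
        """Save the return address, toggle JAR, return the subroutine address."""
        saved = (mp & 0xF, ts & 0xF)   # only four page bits are stored
        if self.jar == 0:
            self.f = saved
        else:
            self.g = saved
        self.jar ^= 1
        return ((bp & 0x1F) << 4) | (bl & 0xF)

    def rtn(self):
        """Fetch the most recently saved return address and toggle JAR."""
        page, line = self.g if self.jar == 0 else self.f
        self.jar ^= 1
        return (page << 4) | line
```

Because each JSB and each RTN toggles JAR, two nested calls unwind in last-in, first-out order regardless of the initial JAR state. A third nested JSB would overwrite the first return address, which is why only one level of nesting is available.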

Each word in ROM is either an instruction or a constant. Instructions are loaded into the instruction register, and constants are loaded into the D-register (48 bits). To obtain a constant, the ROM program must first load the address of the desired constant into the CS byte of the C-register, and then issue an instruction containing CON in the BC field. The ninth address bit is forced to a 1; thus, constants must be in the upper half of ROM (addresses 256 through 511).

The CON signal reads the CS byte onto the ROM address lines, disables Clock for one cycle, and enables a special clock TS that loads the addressed ROM contents into the D-register. See the D-register logic diagram. When the CON bit (R21) is detected, it is loaded by TS into the master CON latch. At Clock, the CON bit is transferred to the slave latch and inverted to disable Clock. The low input to pin 6 (input control code = 01) causes the master latch to clear, so that at the next Clock CON will go false. This, inverted to true by U92B, reenables Clock. The net result is that one Clock pulse has been inhibited, temporarily halting program execution while the constant is being read.

Note, however, that the TS clock has continued to run, and this clock, enabled during the interval that CON is high, loads ROM bits R0 through R47 (the constant) into the D-register.

If a series of constants is called, the CIC bit (R38) may be used to increment the constant address in CS. The purpose of the CKC flip-flop (which, like the CON flip-flop, is clocked by TS) is to assume the function of the CIC flip-flop (disabled in the absence of Clock) during the CON cycle. Note that this feature applies only for CS addresses below 24 (decimal). These 24 locations provide a table that results in a numerical convergence after 25 steps. The S24 signal inhibits CIC for addresses 24 or higher in order to conserve ROM space. The program may continue to call for constants and keep issuing the CIC command, but in effect the contents of location 24 will continue to be read on each further call.

IND reads the 8-bit CS byte onto the ROM address lines. For IND alone, the low half of ROM is addressable (0 to 255) since the ninth address bit (RA8) cannot be controlled by the 8-bit CS register. For CON, however, U61A forces the ninth bit to a 1 (since AD9 is normally 0), so that constants will always be read from the high half of ROM (256 to 511). For certain purposes (such as the ROM dump routine), AD9 can be made true, so that CON can also read the lower half of ROM.
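The address formation for IND and CON, together with the CIC increment limit described above, can be sketched as follows. The function names are hypothetical. The sketch takes the ninth address bit for CON as the complement of AD9, which is how the preceding paragraph describes U61A; it models behavior only, not the clocking.

```python
def indirect_address(cs):
    """IND: the 8-bit CS byte addresses the low half of ROM (0 to 255)."""
    return cs & 0xFF


def constant_address(cs, ad9=False):
    """CON: the ninth address bit is 1 unless AD9 is made true."""
    ninth_bit = 0 if ad9 else 0x100
    return (cs & 0xFF) | ninth_bit


def next_constant_pointer(cs, cic):
    """CIC increments CS, but S24 inhibits it for addresses 24 or higher."""
    if cic and cs < 24:
        return cs + 1
    return cs
```

With CS = 5, CON reads ROM location 261 in normal operation and location 5 when AD9 is true. Repeated CIC commands advance CS until it reaches 24 and then leave it there.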

The floating point processor contains three nearly identical arithmetic sections, each typically consisting of a register, a shifter, and an adder. The A-register/shifter/adder will be discussed in detail first; differences for the B and C sections will be described later.

The FPP A-register accommodates 48 bits of data in six separately controllable bytes (A0 through A5). In addition, there is an exponent byte (AE) and a shift byte (AS).

The A-shifter provides a means of selecting any eight consecutive data bits from the 48 bits stored in the FPP A-register starting at any bit position.

The A-adder adds an 8-bit byte, selected from the A-register, to another 8-bit byte, selected from the C register, the B-shifter, the D-shifter, or the A-shifter.

With reference to the block diagram of FIGS. 9A-D, the logic will be discussed left to right across the A-register/shifter/adder block. On the left side of the block is a series of four transfer mode gates (each representing eight separate gates for the complete byte). If transfer mode 1 is selected in the ROM instruction, one of the C-register bytes (on the C50 through C57 lines) will be transferred as an input to the A-adder. (The C byte number selected will be the same as the A byte number selected.) Similarly, if transfer mode 2, 3, or 4 is selected, the gates will transfer a shifted B byte (on the B70 through B77 lines), a shifted D byte (on the D70 through D77 lines) or a shifted A byte (on the A70 through A77 lines).

The output of the Transfer Mode gates is applied to a true/complement network consisting of an "and"/"nor" pair of gates for each bit. If the CPA instruction bit is true, each bit is complemented before being routed to the A-adder on the A90 through A97 lines.

The other input to the A-adder (lines A60 through A67) is enabled if the RRA bit of the instruction register is true. The input consists of one of the FPP A-register bytes on the A50 through A57 lines (byte selection described below).

The result of the addition appears on the A00 through A07 lines, with a possible carry saved in the CY bit register. The carry may be used (propagated) by a PCY instruction in the next cycle. It is also possible to inject a carry (actually an increment by one) by means of a CIA signal. PCY and CIA are functions of the Special field of the instruction word, as is BI8, which can force a one on the eighth bit (A97) of the transferred input to the adder.
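A byte-serial 48-bit addition built from the 8-bit adder can be sketched as follows. This is an illustration of the carry-save and carry-propagate idea only; the actual microinstruction sequence is not given in this section, so the loop structure, the function names, and the use of an injected carry with the complement to form a subtraction are assumptions.

```python
def add_byte(a, t, carry_in=0, complement=False):
    """One pass through the 8-bit adder.

    a: byte read from the A-register; t: transferred byte.
    complement: models CPA (complement the transferred byte).
    Returns (sum byte, carry out saved in CY).
    """
    if complement:
        t = ~t & 0xFF
    total = (a & 0xFF) + (t & 0xFF) + carry_in
    return total & 0xFF, total >> 8


def add_48(a_bytes, t_bytes, subtract=False):
    """Add two 6-byte values, least significant byte first."""
    result = []
    # For subtraction, inject a carry into the first byte (as CIA would)
    # and complement every transferred byte (two's complement).
    carry = 1 if subtract else 0
    for a, t in zip(a_bytes, t_bytes):
        s, carry = add_byte(a, t, carry, complement=subtract)
        result.append(s)          # carry is propagated to the next byte
    return result, carry
```

For example, adding 1 to hexadecimal 12345678FF ripples a carry out of byte 0 into byte 1 and yields 1234567900; subtracting 1 from that result restores the original value.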

Tests which can be made on the A00 through A07 output are: eighth bit true or false, eighth bit of A does or does not equal eighth bit of B, and adder output is zero or non-zero.

The adder output is applied to all eight byte positions of the A-register (the data is stored in a master register at Clock time). At Clock time, the data will be transferred into a byte position (slave register) which is selected by one of eight enabling signals: AY0 through AY5, AYE, or AYS. The enabling signal is derived from the SR, SY, and YC fields of the instruction register. The SR field specifies the A-register (SRA), and SY either specifies byte AS, AE, or A5, or else enables the YC field (byte counter) to select one of the six data bytes, A0 through A5. The byte counter produces an octal output on the ROM address card, consisting of signals Y0, Y1, Y2, which is decoded on the FPP interface card to produce the SY0 through SY5 signals. The decoder is not enabled if the SY field specifies SY5 (store in A5 byte), SYS (store in AS byte) or SYE (store in AE byte).

The bytes stored in the FPP A-register can be selectively read out by read signals derived from the RY and YC fields of the instruction register. The RY field either specifies byte AS, AE, or A5 (by RYS, RYE, or RY5 signals), or else enables the decoded byte count from the YC field to select one of the six data bytes (by the RY0 through RY5 signals). The selected byte is routed via the A50 through A57 lines to the A-adder.

In addition to the selective byte reading described in the preceding paragraph, provision is also made to read any adjacent eight bits in the data portion of the A-register. Byte boundaries are ignored, and the register is looked at as a 48-bit data register. Selection is accomplished by the A-shifter, under control of the shift byte (AS) in the A-register.

The shifter may be viewed as an addressable reader. (See FIG. 75.) The numerical value of the shift byte (decimal) points to the least significant bit of the desired 8-bit series. This bit and the next higher seven bits are read out to the transfer mode gates.

As shown in FIG. 75, special cases occur when the shift byte points to bit positions higher than 40. (Seven of the eight bits of AS are used for addressable reading, so AS can point to values as high as 127.) When the AS value is between 41 and 47 (inclusive), one or more bits selected at the high end of the series of eight will be nonexistent. These non-existent bits are referred to as phantom bits (P); provision is made to fill these bit positions on the output lines (A70 through A77) with either zeros or copies of the sign bit (bit 47). The desired choice is made by controlling bit 7 of the shift byte (AS7): if 0, signs will be copied (arithmetic shifting); if 1, zeros will be filled in (logical shifting). Note that when AS is 48 or higher, all of the selected bits will be phantom bits.

Details of the selection process are shown in FIG. 76. The six least significant bits of AS are decoded octally into two sets of selection signals, designated SW0 through SW7 and SV0 through SV7. (AS6, if true, would result in the all-phantom condition, so it is not decoded but is simply "or"-tied with SW6 and SW7; see next paragraph.) The SW0 through SW5 signals accomplish a preselection of 15 out of the 48 register bits, and the SV0 through SV5 signals select 8 out of the 15 preselected bits. These final eight bits are routed out on the A70 through A77 lines.

Refer to the A-shifter logic diagram for details on the generation of phantom bits. Note that U50C gates sign bit 47 to the higher order SW5 positions if AS7 is 0. If AS7 is a 1, the output of U50C is 0. (Final selection of one or more of these bits is made by the SV0 through SV7 signals.) For the all-phantom condition, the shifter network is ignored completely (all zeros on the A70-77 lines); instead, a true or false TSA signal is sent to the complementing networks. Gate U50D is enabled by SW6, SW7 or AS6, and will provide a true output if phantom signs are desired (AS7 = 0) and the sign bit happens to be a 1. Otherwise, if the sign bit is 0 or if phantom zeros are desired (AS7 = 1), TSA will be false. Depending on the transfer mode selected, TSA will affect one of the three complementers (A, B, or C) by inverting the existing all-zero output to all ones (TSA true) or will leave the data as all zeros (TSA false). The result is eight copies of the sign (1 or 0), or eight zeros.
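The addressable-reader behavior of the shifter, including phantom bits, can be modeled in a few lines. This is a behavioral sketch only; it does not reproduce the SW/SV two-stage selection or the TSA path through the complementers, and the function name is hypothetical.

```python
def shifter_read(register48, as_byte):
    """Read eight consecutive bits from a 48-bit register.

    Bits 0-6 of as_byte point to the least significant bit to read.
    Bit 7 (AS7) chooses the phantom fill: 0 copies the sign bit
    (arithmetic shifting), 1 fills with zeros (logical shifting).
    """
    pointer = as_byte & 0x7F
    zero_fill = bool(as_byte & 0x80)
    sign = (register48 >> 47) & 1
    phantom = 0 if zero_fill else sign

    out = 0
    for i in range(8):
        position = pointer + i
        if position <= 47:
            bit = (register48 >> position) & 1
        else:
            bit = phantom            # bit position does not exist
        out |= bit << i
    return out
```

With a negative register value (bit 47 set) and AS = 44, the low four output bits are register bits 44 through 47 and the high four are copies of the sign. With AS = 48 or higher, the output is eight copies of the sign, or eight zeros if AS7 is 1.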

For microprogramming purposes, it is advantageous to have the pointer in AS keep in step with the byte counter. This means that whenever the byte counter is incremented or decremented to enable the next higher or lower byte position, the shift pointer should also change value to enable the next higher or lower series of eight bits. The AY8 adder performs this function.

In order for the AS value to point to a new series of eight bits, its value must increase by 8 when YP1 increments the byte counter and must decrease by 8 when YM1 decrements the byte counter. Furthermore, when the byte counter rolls over from 5 to 0 (incrementing, modulo 6) or from 0 to 5 (decrementing), the AS value must change correspondingly: return to its original value or go to the original value plus 40 (i.e., 5 × 8), respectively.

Referring to the ROM address card logic diagram, FIGS. 14A-R, note that when YP1 increments the byte counter (via U35E), it also increments the AY8 adder. Since the AY8 adder operates on bits 3 through 6 of the AS register (rather than 0 through 3), each increment adds 8 to the contents of AS, via the AP3 through AP6 lines. Similarly, when YM1 decrements the byte counter (by adding all 1's via U35D/C, U33B, and U32A), it also decrements the AY8 adder, decrementing AS by 8 via the AP3-6 lines.

Gates U35E, U33A, and U32D cause the byte counter (and AY8 and BY8 adders) to act as modulo 6 counters when incrementing. When the count of 5 is detected by U24A and U15D, the next YP1 will inject a quantity which, when added to 5, will produce zero. For the 3-bit byte counter this quantity is 3 (via U35E and U33A). For the 4-bit AY8 and BY8 adders this quantity is 11 (all three gates).

To achieve modulo 6 when decrementing, gates U33B and U32A are disabled at the count of zero, and allow U35D and U35C to inject a quantity of 5. This reverts the byte counter to the count of 5 and adds 40 to the AS register via the AP3 through AP6 lines. (Incidentally, the AP3-6 lines are disabled when AS is originally loaded, by the SYSA signal.)

The method by which the byte counter is forced to zero (YFO) is to add the current value of Y to its complement (U35C, U33C, U16B) and inject a carry (U35A). For the 4-bit adders, U32B injects the necessary one-bit for the most significant bit position.
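The injected quantities described above can be checked with a short arithmetic model of the 3-bit byte counter and the 4-bit AY8 adder acting on bits 3 through 6 of AS. The function names are hypothetical and gate-level details are omitted; the force-to-zero function models only the 3-bit byte counter.

```python
AS_FIELD_MASK = 0x78   # bits 3 through 6 of the shift byte


def _add_to_as_field(as_byte, quantity):
    """Add a 4-bit quantity to AS bits 3-6, leaving the other bits alone."""
    field = ((as_byte >> 3) + quantity) & 0xF
    return (as_byte & ~AS_FIELD_MASK & 0xFF) | (field << 3)


def step_up(y, as_byte):
    """YP1: normally add 1; at count 5 inject 3 (counter) and 11 (AY8)."""
    inject_y, inject_as = (3, 11) if y == 5 else (1, 1)
    return (y + inject_y) & 0b111, _add_to_as_field(as_byte, inject_as)


def step_down(y, as_byte):
    """YM1: normally add all ones; at count 0 inject 5 to both adders."""
    inject_y, inject_as = (5, 5) if y == 0 else (0b111, 0b1111)
    return (y + inject_y) & 0b111, _add_to_as_field(as_byte, inject_as)


def force_zero(y):
    """Add the counter value to its complement and inject a carry."""
    return (y + (~y & 0b111) + 1) & 0b111
```

Starting from a byte count of 0 and AS = 3, five increments give a count of 5 and AS = 43; the sixth wraps the count to 0 and returns AS to 3. A decrement from a count of 0 gives a count of 5 and adds 40 to AS.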

The FPP B-register/shifter/adder is identical to the A section described above, with only signal nomenclature changes and a different assignment of inputs for the transfer modes.

The C-register/adder does not have an associated shifter. Instead, the third shifter is assigned to the D-register. The D-shifter is controlled in parallel with the A-shifter by the AS shift byte.

Since the CS byte is used for indirect addressing of ROM, the CS output is routed to the ROM address card. Also the S24 signal is made available to the conditional branching test logic.

On the C-adder card, the CSX line is left open (0), whereas on the A- and B-adder cards ASX and BSX are enabled by tying to +4.75 volts. This disables the CP input lines to the shift byte (CS), since these lines are not used in the C arithmetic section.

As mentioned above, data is transferred into or out of the FPP in three successive 16-bit words. It was also stated that the COMP sends 16 bits of data with every ENC, and the FPP returns 16 bits of data with every FLG, whether or not data is actually used at either end. Data to the FPP is loaded into the C-register (and transferred to the B-register if a load opcode follows), and is sent from the B-register. Referring to Table 13, the process is as follows:

TABLE 13. X REGISTER TRANSFER SEQUENCE

ENC      INPUT                                    OUTPUT

No. 1    IM0-7 → C5                               B5 → A4
         IL0-7 → C4                               A4 (FLG)

No. 2    IM0-7 → C3                               B3 → A2
         IL0-7 → C2                               B2 (FLG)

No. 3    IM0-7 → C1                               B1 → A0
         IL0-7 → Convert to FPP format → C0       B0 Convert to FPP format (FLG)

On the first ENC, the entry routine first loads the high order eight bits (IM0 through IM7) into C5 while, simultaneously, B5 is transferred to A4 and read out to the output lines (A50 through A57). Then the byte counter is decremented (pointing to byte 4). The low order input bits (IL0-7) are loaded into C4, and B4 is read out to the B50 through B57 lines. A FLG signal is issued to the COMP, telling it that it can store the 16 bits from A4 and B4.

On the second ENC, the byte counter decrements to 3, and IM0-7 is loaded into C3, while B3 is transferred to A2. Decrementing to count 2 allows C2 to be loaded, and A2 and B2 to be read out (with FLG).

On the third ENC, the byte counter decrements to 1, and IM0-7 is loaded into C1. Then, when the byte counter decrements to 0, a format conversion occurs which moves the exponent sign bit to the proper position. (Internally in the FPP, this bit is in the most significant bit position; externally in the computer, it is in the least significant bit position.) Byte B1 is now transferred to A0, and A0 and B0 are read out (with FLG).
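The input half of the transfer sequence can be sketched as a loop over three 16-bit words with a descending byte counter. The function and parameter names are hypothetical. The bit-level format conversion of the last byte is not modeled; it is represented by a caller-supplied placeholder that defaults to leaving the byte unchanged.

```python
def load_operand(words, convert=lambda byte: byte):
    """Load three 16-bit input words into C-register bytes C5..C0.

    words: the three words in the order they arrive (one per ENC).
    convert: placeholder for the format conversion applied to the last
             byte (assumption: identity unless the caller supplies one).
    Returns a list c where c[n] is byte Cn.
    """
    c = [0] * 6
    y = 5                              # byte counter starts at byte 5
    for word in words:
        im = (word >> 8) & 0xFF        # high order eight bits (IM0-7)
        il = word & 0xFF               # low order eight bits (IL0-7)
        c[y] = im
        y -= 1
        c[y] = convert(il) if y == 0 else il
        y -= 1
    return c
```

For the input words 1234, 5678, and 9ABC (hexadecimal), the bytes land as C5 = 12, C4 = 34, C3 = 56, C2 = 78, C1 = 9A, and C0 = BC before any format conversion.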

POWER SUPPLY

The power supply of the floating point processor generates two regulated dc supply voltages for all logic circuits in the unit: +4.75 volts and -2 volts. (A third dc voltage, +10 volts, is also generated, but this supply is used only within the power supply itself.)

FIG. 77 illustrates the power supply circuits in simplified form. The 115- or 230-volt ac input is stepped down to a nominal 12 volts ac and rectified by a pair of silicon-controlled rectifiers (SCR). The inductor/capacitor filtered output is 6.75 volts dc, referenced to ground such that the positive line is +4.75 volts and the negative line is -2 volts with respect to ground.

The full 35-ampere current capacity is available to the +4.75-volt load, and up to 35 amperes is available to the -2-volt load. Since the -2-volt load is less than the +4.75-volt load, the difference current is diverted through the -2-volt shunt regulator. This regulator acts in the same way as would a Zener diode. A variable amount of current is drawn through the shunt in order to maintain a constant -2-volt level.

The level of the +4.75 voltage is maintained constant by controlling the conduction time of the SCR. The +4.75-volt level is detected by a differential amplifier, which compares the voltage with a Zener diode reference. The difference output is used to control the slope of a ramp voltage, which is synchronized to the 120 Hz rectified line frequency. When the ramp reaches the trigger level of a unijunction transistor in the ramp generator, the ramp terminates, generating a positive pulse of about 10 volts amplitude and 20 microseconds duration. This pulse triggers the SCR's, which will then continue to conduct for the remainder of the half cycle. As shown in FIG. 77 (note examples of ramp slope and rectified sine wave), a variance of ramp slope has the net result of altering the conduction time (shaded area). Consequently, the energy delivered to the LC filter will be increased or reduced proportionately, thus providing the means of controlling the output dc level.
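The relation between SCR firing point and dc output can be illustrated with the textbook formula for an ideal full-wave phase-controlled rectifier feeding an LC filter with a freewheeling diode: the average output is (Vpeak / pi) x (1 + cos a), where a is the firing angle within each half cycle. This is an idealized sketch that ignores SCR and diode drops, transformer regulation, and ripple; the function names are hypothetical, and treating the nominal 12 volts ac as an rms value in the usage note is an assumption.

```python
import math


def average_output(v_peak, firing_angle):
    """Mean of a rectified sine that conducts from firing_angle to pi.

    firing_angle is in radians (0 = full conduction, pi = none).
    """
    return v_peak / math.pi * (1.0 + math.cos(firing_angle))


def firing_angle_for(v_peak, v_target):
    """Firing angle (radians) that gives the requested average output."""
    return math.acos(v_target * math.pi / v_peak - 1.0)
```

Under these assumptions, a call such as `firing_angle_for(12 * math.sqrt(2), 6.75)` estimates the firing angle at which an ideal circuit would deliver the 6.75-volt total. A steeper ramp fires the SCR earlier (smaller angle) and raises the output; a shallower ramp does the opposite.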

Referring now to FIGS. 40 A-J, input ac power is applied to power line assembly A1. This snap-in module contains the ac line connector, line fuse, rf interference filter, 115/230V line voltage switch, and terminals for connection of the front panel POWER switch, power-on indicator lamp (DS1), and power transformer. Relay K1 is inserted in series with the transformer primary, so that power will be turned off if either the computer loses power (-23.8V drops) or the ambient temperature in the FPP unit rises too high.

Sensing of the +4.75-volt level is made from a point on the backplane bus. Due to the high currents involved, bus resistance itself will drop the dc level slightly; power is therefore applied to the bus at two points, and the sense line is connected to a point that represents an average value.

The sensed +4.75 voltage is applied to a differential amplifier at Q1/Q2, which compares a divided sample (R20/R21) to a presettable reference level from resistor R25 (+4.75V ADJ). Any voltage difference between the bases of Q1 and Q2 is amplified and applied to Q3, altering the charging rate of ramp capacitor C30. When Q3 has charged C30 to the triggering level of unijunction transistor Q4, Q4 discharges C30 to the -2-volt clamping level. The sharp negative transition at the base of Q5 turns on Q5 for about 20 microseconds, dependent on circuit constants, and the resultant positive pulse is applied through emitter follower Q6 to the SCR trigger inputs (CR5, CR6). Diode CR22 limits the pulse amplitude to +10 volts and protects Q6; CR8 protects the SCR's (which are non-conducting before the pulse arrives).

The positive pulse turns on CR5 or CR6 (depending on the ac cycle polarity), charging filter capacitors C11, C12, and C13 through inductor L4. At the end of the half cycle, ac polarity reverses and the SCR ceases conduction. Since the other SCR will not begin its conduction until triggered, neither SCR is conducting at this time. The inductive field of L4 begins to collapse, building up a reverse voltage which could be destructive if protection were not provided. Diode CR7 provides this protection by coming into conduction when the reverse voltage exceeds the -2-volt level, and provides a current path back to the filter capacitors. Thus, even when both SCR's are off, the inductor still delivers current to the load. When the next SCR is triggered, it abruptly puts out a positive voltage to the inductor, and thus reverse biases CR7. In summary: CR7 conducts when the SCR's are not conducting.

As explained above, the timing of the SCR trigger accomplishes the voltage regulating function.

The primary purpose of Q7 is to synchronize the unijunction oscillator to twice the line frequency. A secondary function is to inhibit the triggering of unijunction transistor Q4 when the crowbar is on, thus reducing current delivered to the crowbar, CR80. When the input voltage (pulsating dc from the input to L4) is in excess of +9 volts, Q7 is saturated (on), providing a low impedance path for the Q3 collector current, diverting it from C30. Thus the unijunction oscillator is held in the off state. (Note that a positive input from the crowbar, via Q29, could permanently hold the oscillator in this off state.) Then, when the pulsating voltage drops below +8 volts, Q7 is cut off, and the current from the Q3 collector is shunted to ramp capacitor C30. This results in a voltage ramp on the emitter of Q4, the slope of which (as discussed earlier) is determined by the collector current of Q3. The start of the ramp is therefore determined by the on-to-off transition of Q7, which occurs twice for each cycle of the line.

The -2-volt sense voltage referred to above is applied through a presettable divider to the base of Q18. The bottom end of the divider is held constant by a Zener diode reference. The -2-volt adjustment resistor is set so that the Q18 base is at zero volts when the -2-volt output is at its nominal value. This zero-volt level is compared with the zero-volt ground at the emitter of Q19. Any difference is amplified by Q19, Q20, and Q21, altering the flow of shunt current through Q22. The direction of change (more or less current) is such as to maintain a fixed voltage value on the -2-volt sense line. As mentioned earlier (paragraph 4-94), the circuit acts like a Zener diode in maintaining a fixed voltage output. About 5 amperes is passed through Q22.

Transistors Q8 and Q9 are normally conducting. When an unusual current drain increases the dc voltage drop across inductor L4 to a specific level (determined by the selected values of R40 through R44), Q8 and Q9 will be biased off. Under this condition, CR35 clamps the unijunction input to a level that is below the trigger point. No pulses are therefore applied to the SCR's, and no further conduction occurs. Both +4.75-volt and -2-volt outputs are thus cut off.

There are three separate circuits involved in detecting and acting on out of limit dc voltage conditions. These three circuits (-2V limit sense, +4.75V limit sense, and crowbar) are discussed together under the current heading.

FIG. 78 illustrates the actions that occur when either the +4.75 or -2 voltages go out of limits. When the +4.75 voltage (applied to Q23/Q24 bases) rises too high, to a level set by R91, Q24 will conduct and activate the power fail circuit (discussed later under paragraph 4-110). Similarly, if the +4.75 voltage drops below a negative limit set by R92, Q23 will conduct and activate the power fail circuit. In the -2V limit sense circuit, if the -2 voltage (applied to the top of the divider) becomes too positive, to a level set by R61, Q14 will conduct and activate the power fail circuit. The negative limit sensing circuit uses a normally conducting emitter follower (Q13). When the -2 voltage becomes too negative, Q15 will conduct and activate the power fail circuit.

If the +4.75 voltage becomes excessively positive (above about +6 volts), or if the -2 voltage becomes excessively negative (more than about -3 volts), the crowbar circuit triggers and cuts off both supplies.

The crowbar circuit uses an SCR diode (CR80). When the -2-volt level goes more negative than the breakdown level of CR82 (normally an effective open circuit), CR82 causes Q30 (and Q31) to conduct. Or, if the +4.75-volt level goes more positive than the breakdown level of CR81, Q31 will again be caused to conduct. This is because both emitter and base voltages increase together as the +4.75 voltage rises; then CR81 breaks down and holds the base low. When, from either cause, Q31 conducts, SCR diode CR80 is triggered, effectively short-circuiting the +4.75V and -2V supplies together. This protects logic circuits from overvoltage damage. To prevent the rectifiers from delivering any more current to this short circuit, Q29 (which goes into conduction when the SCR triggers) inhibits the sync amplifier. Transistor Q7 is driven into saturation, thus preventing further trigger pulses to SCR rectifiers CR5 and CR6.

Diodes CR60 and CR61 rectify a sample of the transformer secondary output, and the resulting pulsating direct voltage is applied to two RC filters. One filter (R70, C50) has a short time constant, and the other (R71, R72, C51) has a long time constant. The filters are isolated from each other by CR63. As a result (see FIG. 79), a dc voltage representing the peak value of the rectified ac is present at the emitter of Q16, and a partially filtered waveform is present at the base. Normally (see half-cycle number 1), the exponential decay is not sufficient to cause conduction of Q16 before the next half-cycle restores the C50 charge. If, however, at least two half-cycles are missed (assume ac power is lost at the end of half-cycle number 2), the base voltage will drop to the point where conduction of Q16 will occur. With Q16 conducting, Q17 will also be turned on, thus activating the power fail circuit.

When any of the previously discussed voltage sensing circuits indicate a failure, Q26 is caused to conduct. (Note that four of the sources, Q14, Q15, Q17, and Q23, require inversion by Q25, whereas the Q24 source does not.) The conduction of Q26 in turn causes the other four transistors in the power fail circuit to conduct. The EPF signal (normally low) goes high to initiate a power fail interrupt in the computer. A few milliseconds later, EP0 (normally high) goes low; when power is restored, EP0 will go high again, initiating a restart sequence in computers which have the restart option installed and enabled.

Several circuits in the power supply require a +10-volt operating voltage. To supply this, the transformer secondary is rectified by CR40 and CR41, filtered by C45, and regulated by Q10. The control for Q10 is the differential amplifier consisting of Q11 and Q12. The reference voltage provided by CR42 is compared with a divided sample of the +10V output, and any difference is applied as a correction signal to the base of Q10.

INTERFACE INFORMATION

ENC: Initiates the reception of an operand word through the 16 bit input port and the transmission of a result word through the 16 bit output port.

FIRST OPERAND WORD: FPP receives through the 16 bit input port the 16 most significant magnitude bits (including sign) of a 48 bit floating point operand (reception initiated by the 1st, 4th, 7th, etc. ENC command after the last preceding OPC command).

SECOND OPERAND WORD: FPP receives through the 16 bit input port the 16 middle significant magnitude bits of a 48 bit floating point operand (reception initiated by the 2nd, 5th, 8th, etc. ENC command after the last preceding OPC command).

THIRD OPERAND WORD: FPP receives through the 16 bit input port the 8 least significant magnitude bits and the 8 exponent bits of a 48 bit floating point operand (reception initiated by the 3rd, 6th, 9th, etc., ENC command after the last preceding OPC command).

OPC: Initiates the reception of an operation code through the 16 bit input port and initiates the execution of that operation.

OPERATION CODE: FPP receives through the 16 bit input port an 8 bit operation code contained in bits 0-7 (reception initiated by the OPC command).

FLG: Indicates, if following an ENC command, that the reception of an operand word is complete and that the transmission of a result word has begun, or, if following an OPC command, that the execution of an operation is complete.

FIRST RESULT WORD: FPP transmits through the 16 bit output port the 16 most significant magnitude bits (including sign) of a 48 bit floating point result (transmission initiated by the 1st, 4th, 7th, etc. ENC command after the last preceding OPC command).

SECOND RESULT WORD: FPP transmits through the 16 bit output port the 16 middle significant magnitude bits of a 48 bit floating point result (transmission initiated by the 2nd, 5th, 8th, etc. ENC command after the last preceding OPC command).

THIRD RESULT WORD: FPP transmits through the 16 bit output port the 8 least significant magnitude bits and the 8 exponent bits of a 48 bit floating point result (transmission initiated by the 3rd, 6th, 9th, etc. ENC command after the last preceding OPC command).

ERR: Indicates that an error or special condition has been encountered during the execution of an operation and that the transmission of an error code has begun.

ERROR CODE: FPP transmits through the 16 bit output port an 8 bit error code contained in bits 8-15 (transmission initiated by an error or special condition encountered during the execution of an operation).
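The word-level conventions listed above can be summarized in a short sketch as seen from the computer side. The function names are hypothetical, and the sketch assumes that bit 0 is the least significant bit of the 16-bit port, which this section does not state explicitly.

```python
def operand_word_index(enc_count):
    """Which operand/result word an ENC selects.

    enc_count: number of ENC commands since the last OPC (1, 2, 3, ...).
    Returns 0 for the first (most significant) word, 1 for the second,
    and 2 for the third (least significant magnitude bits and exponent).
    """
    return (enc_count - 1) % 3


def operation_code(input_word):
    """Extract the 8-bit operation code from bits 0-7 (bit 0 assumed LSB)."""
    return input_word & 0xFF


def error_code(output_word):
    """Extract the 8-bit error code from bits 8-15 (bit 0 assumed LSB)."""
    return (output_word >> 8) & 0xFF
```

Under this numbering, the 1st, 4th, and 7th ENC commands after an OPC all address the first word, matching the sequence given for the operand and result words.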

* * * * *