U.S. patent application number 13/626886 was filed with the patent office on 2013-04-25 for method and apparatus for use in the design and manufacture of integrated circuits.
This patent application is currently assigned to IMAGINATION TECHNOLOGIES LIMITED. The applicant listed for this patent is Imagination Technologies Limited. Invention is credited to Wai-Chuen Cheung, Theo Alan Drane.
Application Number | 20130103733 13/626886 |
Document ID | / |
Family ID | 45035302 |
Filed Date | 2013-04-25 |
United States Patent
Application |
20130103733 |
Kind Code |
A1 |
Drane; Theo Alan ; et
al. |
April 25, 2013 |
METHOD AND APPARATUS FOR USE IN THE DESIGN AND MANUFACTURE OF
INTEGRATED CIRCUITS
Abstract
A method and apparatus are provided for manufacturing integrated
circuits performing invariant integer division x/d. A desired
rounding mode is provided and an integer triple (a,b,k) for this
rounding mode is derived. Furthermore, a set of conditions for the
rounding mode is derived. An RTL representation is then derived
using the integer triple. From this a hardware layout can be
derived and an integrated circuit manufactured with the derived
hardware layout. When the integer triple is derived a minimum value
of k for the desired rounding mode and set of conditions is also
derived.
Inventors: |
Drane; Theo Alan; (London,
GB) ; Cheung; Wai-Chuen; (London, GB) |
|
Applicant: |
Name |
City |
State |
Country |
Type |
Imagination Technologies Limited; |
Kings Langley |
|
GB |
|
|
Assignee: |
IMAGINATION TECHNOLOGIES
LIMITED
Kings Langley
GB
|
Family ID: |
45035302 |
Appl. No.: |
13/626886 |
Filed: |
September 26, 2012 |
Current U.S.
Class: |
708/650 |
Current CPC
Class: |
G06F 30/00 20200101;
G06F 7/38 20130101; G06F 7/535 20130101; G06F 30/30 20200101; G06F
30/327 20200101 |
Class at
Publication: |
708/650 |
International
Class: |
G06F 7/38 20060101
G06F007/38 |
Foreign Application Data
Date |
Code |
Application Number |
Oct 6, 2011 |
GB |
1117318.4 |
Claims
1. A method for manufacturing an integrated circuit for performing
invariant integer division x/d comprising: deriving an integer
triple (a,b,k) for a desired rounding mode and set of conditions
wherein $\mathrm{Round}(x/d)=(ax+b)/2^k$; deriving an RTL representation
of $(ax+b)/2^k$ using the integer triple; submitting the RTL
representation for derivation of a hardware layout from the RTL
representation, wherein the deriving of the integer triple (a, b,
k) comprises deriving a minimum value of k for the desired rounding
mode and set of conditions when deriving the integer triple.
2. A method according to claim 1 in which the rounding mode is
round towards zero.
3. A method according to claim 1 in which the rounding mode is
round to nearest.
4. A method according to claim 1 in which the rounding mode is
faithful rounding.
5. A method according to claim 1, further comprising deriving a
value for b for the previously derived values of k and a, which has
the smallest Hamming weight and in value.
6. An apparatus for manufacturing an integrated circuit for
performing invariant integer division x/d comprising: a module for
deriving an integer triple (a,b,k) for a desired rounding mode and
a set of conditions where $\mathrm{Round}(x/d)=(ax+b)/2^k$, where k is
derived to be a minimum value for the desired rounding mode and the
set of conditions; a module for deriving an RTL representation of
$(ax+b)/2^k$ using the integer triple and for submitting the RTL
representation to a module for deriving a hardware layout from the
RTL representation.
7. An apparatus according to claim 6, wherein the module for
deriving the integer triple also is operable to derive a value for
b from the values of a and k, wherein said value for b has a lowest
Hamming weight and value.
8. An apparatus according to claim 6, wherein the desired rounding
mode is round towards zero.
9. An apparatus according to claim 6, wherein the desired rounding
mode is round to nearest.
10. An apparatus according to claim 6, wherein the desired rounding
mode is faithful rounding.
11. A tangible computer readable medium storing computer executable
instructions for a method used in designing an integrated circuit
for performing invariant integer division x/d, the method
comprising: deriving an integer triple (a,b,k) for a desired
rounding mode and set of conditions wherein
$\mathrm{Round}(x/d)=(ax+b)/2^k$; deriving an RTL representation of
$(ax+b)/2^k$ using the integer triple; deriving a hardware layout
from the RTL representation; and deriving a minimum value of k for
the desired rounding mode and set of conditions when deriving the
integer triple.
12. A tangible computer readable medium according to claim 11,
wherein for the method, the rounding mode is round towards
zero.
13. A tangible computer readable medium according to claim 11,
wherein for the method, the rounding mode is round to nearest.
14. A tangible computer readable medium according to claim 11,
wherein for the method, the rounding mode is faithful rounding.
15. A tangible computer readable medium according to claim 11,
wherein the method further comprises deriving a value for b for the
previously derived values of k and a, which has the smallest
Hamming weight and in value.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority from GB App. No. 1117318.4,
filed on Oct. 6, 2011, and which is incorporated by reference in
its entirety herein.
BACKGROUND
[0002] 1. Field
[0003] The following relates to methods and apparatus for use in
the design and manufacture of integrated circuits, and particularly
to the design and manufacture of circuits that perform
divisions.
[0004] 2. Related Art
[0005] When designing and manufacturing ICs, sophisticated
synthesis tools such as Synopsys™ Design Compiler are used to
convert a desired function which must be implemented in the IC into
a set of logic gates to perform the functions. Functions which need
to be implemented include add, subtract, multiply and divide. The
synthesis tools seek to implement the desired functions in an
efficient manner in logic gates.
[0006] The tools operate by converting a function to be
implemented, such as divide by x, to what is known as register
transfer level (RTL), which defines a circuit's behavior in terms
of the flow of signals between hardware registers and the logical
operations performed on these signals. This is then used to
generate a high level representation of a circuit from which
appropriate gate level representations and the ultimate IC design
can be derived for manufacture, and an IC can then be made. If a
synthesis tool is presented with division by a constant, it will
invariably use RTL designed for non-constant division. A designer
could note that in the case of constant division an implementation
of the form $(ax+b)/2^k$ could potentially make smaller ICs. The
designer would then have to work out values for the triple (a,b,k)
which would perform the task of x/d.
[0007] Division is acknowledged to be an expensive operation to
perform in hardware. However in the case where the divisor is known
to be a constant, efficient hardware implementations can be
constructed. Consider the division of an unsigned n bit integer x
by a known invariant integer constant d:
$$\frac{x}{d}, \qquad x \in [0,\, 2^n - 1], \qquad d \in \mathbb{N}$$
[0008] For the purposes of the exposition we will assume that d is
an odd integer larger than 1; the following schemes can be easily
modified for even d by those skilled in the art. We consider an
implementation of the form:
$$\frac{x}{d} \approx \frac{ax + b}{2^k}$$
[0009] where a, b and k are non-negative integers. Note that
without loss of generality we can assume that a is odd. The prior
art in the case where the rounding used is round towards zero and d
is an unsigned m bit number comes from [1] and can be succinctly
summarised by setting:
$$\left\lfloor \frac{x}{d} \right\rfloor = \left\lfloor \frac{ax + b}{2^k} \right\rfloor$$
$$t = \left\lfloor \frac{2^{n+m-1}}{d} \right\rfloor$$
$$k = n + m - 1$$
$$a = \left( d(t+1) \bmod 2^n \le 2^{m-1} \right) \;?\; t + 1 : t$$
$$b = \left( d(t+1) \bmod 2^n \le 2^{m-1} \right) \;?\; 0 : t$$
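As an illustration of the prior-art scheme, the following Python sketch computes the triple and checks it exhaustively. It assumes the reading of the formulas above in which the modulus is 2^n and the threshold is 2^(m-1), with m taken as the bit length of d, and it assumes d is odd and 1 < d < 2^n. The function names `prior_art_triple` and `rtz_ok` are hypothetical and do not come from [1].

```python
def prior_art_triple(n: int, d: int) -> tuple:
    """Return (a, b, k) for round-towards-zero division of an n-bit x by d."""
    m = d.bit_length()                      # d treated as an unsigned m-bit number
    k = n + m - 1
    t = (1 << k) // d                       # t = floor(2^(n+m-1) / d)
    round_up = (d * (t + 1)) % (1 << n) <= (1 << (m - 1))
    return (t + 1, 0, k) if round_up else (t, t, k)


def rtz_ok(n: int, d: int, a: int, b: int, k: int) -> bool:
    """Compare (a*x + b) >> k with floor(x / d) for every n-bit x."""
    return all((a * x + b) >> k == x // d for x in range(1 << n))


if __name__ == "__main__":
    print(prior_art_triple(8, 3))   # (171, 0, 9)
    print(prior_art_triple(8, 7))   # (146, 146, 10)
```

Note that k is fixed at n + m - 1 in this scheme, so the final shift is not minimised.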
[0010] The second piece of prior art comes from [2] where the
rounding mode used is round to nearest, $d = 2^n - 1$ and x is the
result of a multiplication of two unsigned n bit numbers a and
b:
$$\left\lfloor \frac{ab}{2^n - 1} + \frac{1}{2} \right\rfloor = \left\lfloor \frac{(2^n + 1)(ab + 2^{n-1})}{2^{2n}} \right\rfloor$$
[0011] When a division is to be performed such as divide by d, the
integer triple discussed above is generated and provided to an RTL
generation unit, which produces the gate level circuits required as
an input to a synthesis tool which then generates the hardware
components required for manufacture.
SUMMARY
[0012] Aspects include methods and apparatus to design an
integrated circuit for performing invariant integer division for a
desired rounding mode such as round towards zero, round to nearest
and faithful rounding, and integrated circuits according to such
design.
[0013] In an example, the necessary and sufficient conditions for a
given integer triple of (a,b,k) to give the required answer for a
desired rounding mode are produced. In the application of a
hardware scheme an algorithm is presented which will fit into a
synthesis flow and produce the most efficient hardware. In
particular, we have appreciated that by representing integer
division in the form $(ax+b)/2^k$ and supplying this, rather
than the conventional x/d division, as the input to an RTL generator,
the division is implemented using a multiply-add implementation for
various rounding modes. Three rounding modes are described here but
the principle can be extended to any rounding mode. Using such an
approach results in a hardware implementation for the division
which can have up to a 50% decrease in integrated circuit area
required.
[0014] In accordance with one aspect, there is provided a method
for manufacturing an integrated circuit for performing invariant
integer division (x/d), the method comprising: deriving an integer
triple (a,b,k) for a desired rounding mode and set of conditions where
$x/d=(ax+b)/2^k$; deriving an RTL representation of the
$(ax+b)/2^k$ representation of the division using the integer
triple; deriving a minimum value of k for a desired rounding mode
and a set of conditions; deriving a hardware layout from the RTL
representation; and manufacturing an integrated circuit with the
derived hardware layout.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] FIG. 1 shows a block diagram of circuitry embodying an
example according to the disclosure.
[0016] FIG. 2 shows a method that can be implemented in examples
according to the disclosure.
[0017] FIG. 3 depicts apparatus that can be used to implement
aspects of the disclosure.
DETAILED DESCRIPTION
[0018] Exemplary aspects of the disclosure are described with
reference to three rounding modes. Other rounding modes may also be
implemented.
[0019] We first present the necessary and sufficient conditions for
a given triple of (a,b,k) to implement each of the three following
rounding schemes.
Round Towards Zero (RTZ)
[0020] In this case we require:
$$\left\lfloor \frac{x}{d} \right\rfloor = \left\lfloor \frac{ax + b}{2^k} \right\rfloor$$
$$0 \le \frac{ax + b}{2^k} - \frac{x - (x \bmod d)}{d} < 1$$
$$x\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^k} \le x \bmod d < x\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^k} + d$$
[0021] Now the sawtooth function x mod d is discontinuous in x with
peaks at $x = md - 1$ where $1 \le m \le \lfloor 2^n/d \rfloor$ and
troughs at $x = md$ for $0 \le m \le \lfloor 2^n/d \rfloor$. It suffices
to check that the upper bound error condition is met for $x = md - 1$ and
the lower bound error condition is met for $x = md$:
$$md\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^k} \le 0 \;\Longrightarrow\; m(2^k - ad) \le b \quad \text{where } 0 \le m \le \left\lfloor \frac{2^n}{d} \right\rfloor$$
$$d - 1 < (md - 1)\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^k} + d \;\Longrightarrow\; m(ad - 2^k) < a - b \quad \text{where } 0 < m \le \left\lfloor \frac{2^n}{d} \right\rfloor$$
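The reduction to peaks and troughs can be checked numerically. The following Python sketch evaluates the two integer inequalities for every m and compares the outcome with an exhaustive test over all n-bit x. It assumes d is odd with 1 < d < 2^n, a >= 1 and b >= 0; the function names are hypothetical and not part of the source.

```python
def rtz_peak_trough_check(n: int, d: int, a: int, b: int, k: int) -> bool:
    """Evaluate the trough (lower bound) and peak (upper bound) inequalities."""
    f = (1 << n) // d                       # floor(2^n / d)
    e = a * d - (1 << k)                    # ad - 2^k
    # Troughs x = m*d:      m(2^k - ad) <= b   for 0 <= m <= f
    troughs = all(m * -e <= b for m in range(0, f + 1))
    # Peaks   x = m*d - 1:  m(ad - 2^k) < a - b   for 0 < m <= f
    peaks = all(m * e < a - b for m in range(1, f + 1))
    return troughs and peaks


def rtz_exhaustive(n: int, d: int, a: int, b: int, k: int) -> bool:
    """Compare (a*x + b) >> k with floor(x / d) for every n-bit x."""
    return all((a * x + b) >> k == x // d for x in range(1 << n))


if __name__ == "__main__":
    # 8-bit division by 3 with the well-known multiplier 171 and shift 9.
    print(rtz_peak_trough_check(8, 3, 171, 0, 9), rtz_exhaustive(8, 3, 171, 0, 9))
```

Only 2*floor(2^n/d) + 1 inequalities are evaluated instead of 2^n divisions, which is what makes the closed-form conditions that follow possible.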
[0022] Now given that a and d are odd, $ad - 2^k \ne 0$.
Depending on the sign of $ad - 2^k$, different values of m will
stress these inequalities. It follows that the necessary and
sufficient conditions for the implementation of round towards zero
mode for the IC design are:
$$\left\lfloor \frac{x}{d} \right\rfloor = \left\lfloor \frac{ax + b}{2^k} \right\rfloor \;\; \forall x \in [0,\, 2^n - 1] \iff \begin{cases} -\dfrac{b + 1}{\lfloor 2^n/d \rfloor} < ad - 2^k < a - b & \text{if } ad - 2^k < 0 \\[2ex] ad - 2^k < \dfrac{a - b}{\lfloor 2^n/d \rfloor} & \text{if } ad - 2^k > 0 \end{cases}$$
Round To Nearest (RTN)
[0023] In this case we require:
$$\left\lfloor \frac{x}{d} + \frac{1}{2} \right\rfloor = \left\lfloor \frac{ax + b}{2^k} \right\rfloor$$
$$0 \le \frac{ax + b}{2^k} - \frac{(2x + d) - ((2x + d) \bmod 2d)}{2d} < 1$$
$$2x\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^{k-1}} + d \le (2x + d) \bmod 2d < 2x\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^{k-1}} + 3d$$
[0024] Now the sawtooth function (2x+d) mod 2d is discontinuous in
x with peaks at $x = md - (d+1)/2$ where
$0 < m \le \lfloor (2^{n+1}+d-1)/2d \rfloor$ and troughs at $x = md - (d-1)/2$
for $0 < m \le \lfloor (2^{n+1}+d-3)/2d \rfloor$. It suffices to check that
the upper bound error condition is met for the peaks and the lower
bound condition is met for the troughs:
$$2\left(md - \frac{d-1}{2}\right)\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^{k-1}} + d \le 1 \;\Longrightarrow\; 2m(2^k - ad) \le 2b - a(d-1) \quad \text{where } 0 < m \le \left\lfloor \frac{2^{n+1}+d-3}{2d} \right\rfloor$$
$$2d - 1 < 2\left(md - \frac{d+1}{2}\right)\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^{k-1}} + 3d \;\Longrightarrow\; 2m(ad - 2^k) < a(d+1) - 2b \quad \text{where } 0 < m \le \left\lfloor \frac{2^{n+1}+d-1}{2d} \right\rfloor$$
[0025] Now given that a and d are odd, $ad - 2^k \ne 0$.
Depending on the sign of $ad - 2^k$, different values of m will
stress these inequalities. It follows that the necessary and
sufficient conditions for implementation of round to nearest
for the IC design are:
$$\left\lfloor \frac{x}{d} + \frac{1}{2} \right\rfloor = \left\lfloor \frac{ax + b}{2^k} \right\rfloor \;\; \forall x \in [0,\, 2^n - 1] \iff \begin{cases} \dfrac{a(d-1) - 2b - 1}{2\lfloor (2^{n+1}+d-3)/2d \rfloor} < ad - 2^k < a\left(\dfrac{d+1}{2}\right) - b & \text{if } ad - 2^k < 0 \\[2ex] a\left(\dfrac{d-1}{2}\right) - b \le ad - 2^k < \dfrac{a(d+1) - 2b}{2\lfloor (2^{n+1}+d-1)/2d \rfloor} & \text{if } ad - 2^k > 0 \end{cases}$$
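As a worked instance of the round to nearest condition (the numbers are chosen here for illustration and do not appear in the source), take n = 8, d = 3 and the triple (a, b, k) = (85, 170, 8). Then ad - 2^k = 255 - 256 = -1 < 0, so the first case applies:

```latex
\[
\left\lfloor \frac{2^{n+1}+d-3}{2d} \right\rfloor = \left\lfloor \frac{512}{6} \right\rfloor = 85,
\qquad
\frac{a(d-1) - 2b - 1}{2 \cdot 85} = \frac{170 - 340 - 1}{170} = -\frac{171}{170},
\qquad
a\left(\frac{d+1}{2}\right) - b = 170 - 170 = 0
\]
\[
-\frac{171}{170} < -1 < 0
\quad\Longrightarrow\quad
\left\lfloor \frac{x}{3} + \frac{1}{2} \right\rfloor = \left\lfloor \frac{85x + 170}{256} \right\rfloor
\quad \text{for all } x \in [0, 255]
\]
```

For example, x = 254 gives (85 * 254 + 170) / 256 = 21760 / 256 = 85, which agrees with 254/3 = 84.67 rounded to nearest.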
Faithful Rounding (FR1)
[0026] In this case we can return either of the integers that lie on
either side of the true answer; if the true answer is an integer we must
return that integer:
Case $x = md$:
$$m = \left\lfloor \frac{amd + b}{2^k} \right\rfloor, \qquad 0 \le m \le \left\lfloor \frac{2^n}{d} \right\rfloor$$
$$0 \le \frac{amd + b}{2^k} - m < 1$$
$$-b \le m(ad - 2^k) < 2^k - b$$
Case $x \bmod d > 0$:
$$0 \le \frac{ax + b}{2^k} - \left\lfloor \frac{x}{d} \right\rfloor < 2$$
$$x\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^k} \le x \bmod d < x\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^k} + 2d$$
[0027] Now for the second case the sawtooth function x mod d is
discontinuous in x with peaks at $x = md - 1$ where
$0 < m \le \lfloor 2^n/d \rfloor$ and troughs at $x = md + 1$ (note we are
assuming $x \ne md$) for $0 \le m \le \lfloor 2^n/d \rfloor$. It
suffices to check that the upper bound error condition is met for
$x = md - 1$ and the lower bound error condition is met for $x = md + 1$:
$$(md + 1)\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^k} \le 1 \;\Longrightarrow\; m(2^k - ad) \le a + b \quad \text{where } 0 \le m \le \left\lfloor \frac{2^n}{d} \right\rfloor$$
$$d - 1 < (md - 1)\left(1 - \frac{ad}{2^k}\right) - \frac{bd}{2^k} + 2d \;\Longrightarrow\; m(ad - 2^k) < 2^k + a - b \quad \text{where } 0 < m \le \left\lfloor \frac{2^n}{d} \right\rfloor$$
[0028] Now given that a and d are odd, $ad - 2^k \ne 0$.
Depending on the sign of $ad - 2^k$, different values of m will
stress these inequalities in the two cases. It follows that the
necessary and sufficient conditions for implementation of faithful
rounding are:
$$\left\lfloor \frac{x}{d} \right\rfloor \text{ or } \left\lceil \frac{x}{d} \right\rceil = \left\lfloor \frac{ax + b}{2^k} \right\rfloor \;\; \forall x \in [0,\, 2^n - 1] \iff \begin{cases} \lfloor 2^n/d \rfloor (2^k - ad) \le b < 2^k & \text{if } ad - 2^k < 0 \\[1ex] \lfloor 2^n/d \rfloor (ad - 2^k) < 2^k - b & \text{if } ad - 2^k > 0 \end{cases}$$
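As a worked instance of the faithful rounding condition (numbers chosen here for illustration, not taken from the source), take n = 8, d = 3 and the triple (a, b, k) = (43, 0, 7). Then ad - 2^k = 129 - 128 = 1 > 0, so the second case applies:

```latex
\[
\left\lfloor \frac{2^n}{d} \right\rfloor (ad - 2^k) = 85 \cdot 1 = 85
\;<\; 2^k - b = 128
\]
\[
\left\lfloor \frac{43 \cdot 252}{128} \right\rfloor = \left\lfloor \frac{10836}{128} \right\rfloor = 84 = \frac{252}{3},
\qquad
\left\lfloor \frac{43 \cdot 254}{128} \right\rfloor = \left\lfloor \frac{10922}{128} \right\rfloor = 85 \in \{84, 85\}
\]
```

The multiple of d (x = 252) is returned exactly, while x = 254 (true value 84.67) returns 85 where round towards zero would return 84. Both neighbours are acceptable under faithful rounding, which is the freedom that allows a shift of only 7 here.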
Minimal Hardware Implementation Scheme
[0029] Minimal hardware implementations in the IC will result from
minimising the number of partial product bits in ax+b. The scheme
used achieves this as follows:
[0030] 1 Minimise k, producing kopt.
[0031] 2 For the range of acceptable values of a for a given kopt,
choose the one that results in the smallest constant multiplier.
This can be accomplished by choosing a value for a which has the
smallest number of non-zero elements in a Canonical Signed Digit
representation of a. This will result in aopt. Define this function
as minCSD(x).
[0032] 3 For the range of valid values for b having fixed kopt and
aopt, choose the one with the smallest Hamming weight, as this
minimises the number of partial product bits. If there is a range of
numbers that have the smallest Hamming weight, we choose the one
that has the smallest value as this will add 1s into the least
significant bits of the array where the height of the array is
smallest. Define the function which finds this value for numbers in
the interval [a, b] as minHamm(a, b). Note that the minHamm(a, b)
function can be computed as follows:
TABLE-US-00001
    Input  unsigned a[p-1:0], b[p-1:0]
    Output unsigned c[p-1:0]
    c = 0
    for i = p-1; i >= 0; i-- loop
        if a[i] == b[i] then
            c[i] = a[i]; a[i] = 0;
        else
            c += 2^ceil(log2(a)); break;
    endloop
    return c
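A minimal Python sketch of the minHamm computation follows. The name `min_hamm` and the argument names `lo` and `hi` are hypothetical. One assumption is made beyond the pseudocode: when the remaining low bits of the lower bound are all zero at the first differing bit, nothing is added, since log2(0) is undefined and the shared prefix is then already inside the interval.

```python
def min_hamm(lo: int, hi: int) -> int:
    """Smallest value among the minimum-Hamming-weight integers in [lo, hi]."""
    if not 0 <= lo <= hi:
        raise ValueError("expected 0 <= lo <= hi")
    c = 0
    for i in range(hi.bit_length() - 1, -1, -1):
        bit = (lo >> i) & 1
        if bit == (hi >> i) & 1:
            c |= bit << i                        # shared prefix bit is forced
        else:
            rem = lo & ((1 << i) - 1)            # bits of lo below the first difference
            if rem:                              # assumption: add nothing when rem == 0
                c += 1 << (rem - 1).bit_length() # 2^ceil(log2(rem))
            break
    return c


if __name__ == "__main__":
    print(min_hamm(36, 73))   # 64: a single set bit lies inside [36, 73]
    print(min_hamm(5, 6))     # 5: both candidates have weight 2, the smaller wins
```

Every integer in the interval shares the common high-order prefix of the two bounds, so at most one further set bit is needed, and the smallest power of two that reaches the lower bound is the cheapest choice.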
[0033] Now applying this scheme to the space of allowable (a,b,k)
as derived by the rounding mode and set of conditions we can
construct a minimal hardware implementation for each of the three
rounding schemes:
RTZ Minimal Hardware Implementation when $ad - 2^k > 0$
[0034] In this case we require:
$$ad - 2^k < \frac{a - b}{\lfloor 2^n/d \rfloor}$$
[0035] Now note that the right hand side is strictly decreasing in
b. So for any valid a, b and k we can always set b=0 and the
condition will still be met, plus it will cost less hardware to
implement. Hence a minimal hardware implementation will have b=0.
Thus our condition reduces to:
$$(ad - 2^k)\left\lfloor \frac{2^n}{d} \right\rfloor < a$$
$$\frac{2^k}{d} < a < \frac{2^k \lfloor 2^n/d \rfloor}{d \lfloor 2^n/d \rfloor - 1}$$
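[0036] Given that a must be an integer we have a formula for
kopt:
$$k_{opt} = \min\left(k : \frac{1}{2^k}\left\lceil \frac{2^k}{d} \right\rceil < \frac{\lfloor 2^n/d \rfloor}{d \lfloor 2^n/d \rfloor - 1}\right)$$
$$k_{opt} = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > d\left\lfloor \frac{2^n}{d} \right\rfloor - 1\right)$$
[0037] And kopt is the smallest such valid k hence:
$$\frac{1}{2^{k_{opt}-1}}\left\lceil \frac{2^{k_{opt}-1}}{d} \right\rceil \ge \frac{\lfloor 2^n/d \rfloor}{d \lfloor 2^n/d \rfloor - 1}$$
$$2\left\lceil \frac{2^{k_{opt}-1}}{d} \right\rceil \ge \frac{2^{k_{opt}} \lfloor 2^n/d \rfloor}{d \lfloor 2^n/d \rfloor - 1}$$
$$\left\lceil \frac{2^{k_{opt}}}{d} \right\rceil + 1 \ge \frac{2^{k_{opt}} \lfloor 2^n/d \rfloor}{d \lfloor 2^n/d \rfloor - 1}$$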
[0038] Hence $a = \lceil 2^{k_{opt}}/d \rceil$ is valid but
$a = \lceil 2^{k_{opt}}/d \rceil + 1$ is not valid. It follows that there
is only one valid value for a when k=kopt. We can now state that the
design which minimises k and satisfies $ad - 2^k > 0$ is unique
and is defined by:
$$k_{opt}^+ = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > d\left\lfloor \frac{2^n}{d} \right\rfloor - 1\right)$$
$$a_{opt}^+ = \left\lceil \frac{2^{k_{opt}^+}}{d} \right\rceil$$
$$b_{opt}^+ = 0$$
RTZ Minimal Hardware Implementation when $ad - 2^k < 0$
[0039] In this case we need:
$$-\frac{b + 1}{\lfloor 2^n/d \rfloor} < ad - 2^k < a - b$$
[0040] Hence b must necessarily be in the following interval:
$$b \in \left[ (2^k - ad)\left\lfloor \frac{2^n}{d} \right\rfloor,\; 2^k + a - ad - 1 \right]$$
[0041] This interval must be non-empty so:
$$2^k + a - ad > (2^k - ad)\left\lfloor \frac{2^n}{d} \right\rfloor$$
$$\frac{2^k}{d} > a > \frac{2^k \left(\lfloor 2^n/d \rfloor - 1\right)}{d \lfloor 2^n/d \rfloor - d + 1}$$
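[0042] Given that a must be an integer we have a formula for
kopt:
$$k_{opt} = \min\left(k : \frac{1}{2^k}\left\lfloor \frac{2^k}{d} \right\rfloor > \frac{\lfloor 2^n/d \rfloor - 1}{d \lfloor 2^n/d \rfloor - d + 1}\right)$$
$$k_{opt} = \min\left(k : \frac{2^k}{2^k \bmod d} > d\left\lfloor \frac{2^n}{d} \right\rfloor - d + 1\right)$$
[0043] Where kopt is the smallest such valid k hence:
$$\frac{1}{2^{k_{opt}-1}}\left\lfloor \frac{2^{k_{opt}-1}}{d} \right\rfloor \le \frac{\lfloor 2^n/d \rfloor - 1}{d \lfloor 2^n/d \rfloor - d + 1}$$
$$2\left\lfloor \frac{2^{k_{opt}-1}}{d} \right\rfloor \le \frac{2^{k_{opt}} \left(\lfloor 2^n/d \rfloor - 1\right)}{d \lfloor 2^n/d \rfloor - d + 1}$$
$$\left\lfloor \frac{2^{k_{opt}}}{d} \right\rfloor - 1 \le \frac{2^{k_{opt}} \left(\lfloor 2^n/d \rfloor - 1\right)}{d \lfloor 2^n/d \rfloor - d + 1}$$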
[0044] Hence $a = \lfloor 2^{k_{opt}}/d \rfloor$ is valid but
$a = \lfloor 2^{k_{opt}}/d \rfloor - 1$ is not valid. It follows that
there is only one valid value for a when k=kopt. We can now state that
the design which minimises k and satisfies $ad - 2^k < 0$ is
unique in k and a and is defined by:
$$k_{opt}^- = \min\left(k : \frac{2^k}{2^k \bmod d} > d\left\lfloor \frac{2^n}{d} \right\rfloor - d + 1\right)$$
$$a_{opt}^- = \left\lfloor \frac{2^{k_{opt}^-}}{d} \right\rfloor$$
$$b_{opt}^- = \mathrm{minHamm}\left( \left(2^{k_{opt}^-} - a_{opt}^- d\right)\left\lfloor \frac{2^n}{d} \right\rfloor,\; 2^{k_{opt}^-} - a_{opt}^-(d - 1) - 1 \right)$$
[0045] Where minHamm(a, b) returns the number of smallest value
from the numbers of smallest Hamming weight found within the
interval [a, b].
RTZ Minimal Hardware Design
[0046] Summarising the previous sections we have the following
algorithm:
$$k_{opt}^+ = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > d\left\lfloor \frac{2^n}{d} \right\rfloor - 1\right)$$
$$a_{opt}^+ = \left\lceil \frac{2^{k_{opt}^+}}{d} \right\rceil$$
$$b_{opt}^+ = 0$$
$$k_{opt}^- = \min\left(k : \frac{2^k}{2^k \bmod d} > d\left\lfloor \frac{2^n}{d} \right\rfloor - d + 1\right)$$
$$a_{opt}^- = \left\lfloor \frac{2^{k_{opt}^-}}{d} \right\rfloor$$
$$b_{opt}^- = \mathrm{minHamm}\left( \left(2^{k_{opt}^-} - a_{opt}^- d\right)\left\lfloor \frac{2^n}{d} \right\rfloor,\; 2^{k_{opt}^-} - a_{opt}^-(d - 1) - 1 \right)$$
$$\{k_{opt}, a_{opt}, b_{opt}\} = (k_{opt}^+ < k_{opt}^-) \;?\; \{k_{opt}^+, a_{opt}^+, b_{opt}^+\} : \{k_{opt}^-, a_{opt}^-, b_{opt}^-\}$$
[0047] Note that $k_{opt}^+$ is never equal to $k_{opt}^-$; otherwise,
if $k = k_{opt}^+ = k_{opt}^-$ then:
$$\frac{2^k}{(-2^k) \bmod d} > d\left\lfloor \frac{2^n}{d} \right\rfloor - 1 \ge \frac{2^{k-1}}{(-2^{k-1}) \bmod d}$$
$$\frac{2^k}{2^k \bmod d} > d\left\lfloor \frac{2^n}{d} \right\rfloor - d + 1 \ge \frac{2^{k-1}}{2^{k-1} \bmod d}$$
[0048] Simplifying these two conditions we get:
$$2\left((-2^{k-1}) \bmod d\right) > (-2^k) \bmod d$$
$$2\left(2^{k-1} \bmod d\right) > 2^k \bmod d$$
$$2\left(2^{k-1} \bmod d\right) > 2^k \bmod d > 2\left(2^{k-1} \bmod d\right) - d$$
[0049] This is a contradiction as $2^k \bmod d$ is equal to one of
these limits.
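The round towards zero algorithm summarised above can be sketched in Python as follows. This is an illustrative sketch that assumes d is odd with 1 < d < 2^n; the helper names `min_hamm`, `first_k` and `rtz_design` are hypothetical. The test 2^k / r > X is evaluated as 2^k > X * r to stay in integer arithmetic, which is valid because the residue r is positive for odd d > 1.

```python
def min_hamm(lo: int, hi: int) -> int:
    """Smallest value among the minimum-Hamming-weight integers in [lo, hi]."""
    c = 0
    for i in range(hi.bit_length() - 1, -1, -1):
        bit = (lo >> i) & 1
        if bit == (hi >> i) & 1:
            c |= bit << i
        else:
            rem = lo & ((1 << i) - 1)
            if rem:
                c += 1 << (rem - 1).bit_length()
            break
    return c


def first_k(threshold: int, residue) -> int:
    """Smallest k with 2^k / residue(k) > threshold."""
    k = 0
    while (1 << k) <= threshold * residue(k):
        k += 1
    return k


def rtz_design(n: int, d: int) -> tuple:
    """Return (a, b, k) such that (a*x + b) >> k == x // d for all n-bit x."""
    f = (1 << n) // d                                    # floor(2^n / d)
    k_pos = first_k(d * f - 1, lambda k: -(1 << k) % d)  # case ad - 2^k > 0
    k_neg = first_k(d * f - d + 1, lambda k: (1 << k) % d)
    if k_pos < k_neg:
        return -(-(1 << k_pos) // d), 0, k_pos           # a = ceil(2^k / d), b = 0
    a = (1 << k_neg) // d                                # a = floor(2^k / d)
    s = (1 << k_neg) - a * d                             # 2^k - a*d > 0
    return a, min_hamm(s * f, s + a - 1), k_neg


if __name__ == "__main__":
    print(rtz_design(8, 3))   # (85, 85, 8)
    print(rtz_design(8, 7))   # (73, 64, 9)
```

For n = 8 and d = 7 the valid interval for b is [36, 73], from which the single-bit constant 64 is selected.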
RTN Minimal Hardware Implementation when $ad - 2^k > 0$
[0050] In this case we need:
$$a\left(\frac{d-1}{2}\right) - b \le ad - 2^k < \frac{a(d+1) - 2b}{2\lfloor (2^{n+1}+d-1)/2d \rfloor}$$
[0051] Hence b must necessarily be in the following interval:
$$b \in \left[ a\left(\frac{d-1}{2}\right) + 2^k - ad,\; a\left(\frac{d+1}{2}\right) + (2^k - ad)\left\lfloor \frac{2^{n+1}+d-1}{2d} \right\rfloor - 1 \right]$$
[0052] This interval must be non-empty so:
$$a\left(\frac{d+1}{2}\right) + (2^k - ad)\left\lfloor \frac{2^{n+1}+d-1}{2d} \right\rfloor > a\left(\frac{d-1}{2}\right) + 2^k - ad$$
$$\frac{2^k}{d} < a < \frac{2^k \lfloor (2^{n+1}-d-1)/2d \rfloor}{d \lfloor (2^{n+1}-d-1)/2d \rfloor - 1}$$
[0053] Given that a must be an integer we have a formula for
kopt:
$$k_{opt} = \min\left(k : \frac{1}{2^k}\left\lceil \frac{2^k}{d} \right\rceil < \frac{\lfloor (2^{n+1}-d-1)/2d \rfloor}{d \lfloor (2^{n+1}-d-1)/2d \rfloor - 1}\right)$$
$$k_{opt} = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > d\left\lfloor \frac{2^{n+1}-d-1}{2d} \right\rfloor - 1\right)$$
[0054] Where kopt is the smallest such valid k hence:
$$\frac{1}{2^{k_{opt}-1}}\left\lceil \frac{2^{k_{opt}-1}}{d} \right\rceil \ge \frac{\lfloor (2^{n+1}-d-1)/2d \rfloor}{d \lfloor (2^{n+1}-d-1)/2d \rfloor - 1}$$
$$2\left\lceil \frac{2^{k_{opt}-1}}{d} \right\rceil \ge \frac{2^{k_{opt}} \lfloor (2^{n+1}-d-1)/2d \rfloor}{d \lfloor (2^{n+1}-d-1)/2d \rfloor - 1}$$
$$\left\lceil \frac{2^{k_{opt}}}{d} \right\rceil + 1 \ge \frac{2^{k_{opt}} \lfloor (2^{n+1}-d-1)/2d \rfloor}{d \lfloor (2^{n+1}-d-1)/2d \rfloor - 1}$$
[0055] Hence $a = \lceil 2^{k_{opt}}/d \rceil$ is valid but
$a = \lceil 2^{k_{opt}}/d \rceil + 1$ is not valid. It follows that there
is only one valid value for a when k=kopt. The design which minimises k
and satisfies $ad - 2^k > 0$ is unique in k and a and is defined by:
$$k_{opt}^+ = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > d\left\lfloor \frac{2^{n+1}-d-1}{2d} \right\rfloor - 1\right)$$
$$a_{opt}^+ = \left\lceil \frac{2^{k_{opt}^+}}{d} \right\rceil$$
$$b_{opt}^+ = \mathrm{minHamm}\left( a_{opt}^+\left(\frac{d-1}{2}\right) + 2^{k_{opt}^+} - a_{opt}^+ d,\; a_{opt}^+\left(\frac{d+1}{2}\right) + \left(2^{k_{opt}^+} - a_{opt}^+ d\right)\left\lfloor \frac{2^{n+1}+d-1}{2d} \right\rfloor - 1 \right)$$
[0056] Where minHamm(a, b) returns the number of smallest value
from the numbers of smallest Hamming weight found within the
interval [a, b].
RTN Minimal Hardware Implementation when $ad - 2^k < 0$
[0057] In this case we need:
$$\frac{a(d-1) - 2b - 1}{2\lfloor (2^{n+1}+d-3)/2d \rfloor} < ad - 2^k < a\left(\frac{d+1}{2}\right) - b$$
[0058] Hence b must necessarily be in the following interval:
$$b \in \left[ a\left(\frac{d-1}{2}\right) + (2^k - ad)\left\lfloor \frac{2^{n+1}+d-3}{2d} \right\rfloor,\; a\left(\frac{d+1}{2}\right) + 2^k - ad - 1 \right]$$
[0059] This interval must be non-empty so:
$$a\left(\frac{d+1}{2}\right) + 2^k - ad > a\left(\frac{d-1}{2}\right) + (2^k - ad)\left\lfloor \frac{2^{n+1}+d-3}{2d} \right\rfloor$$
$$\frac{2^k}{d} > a > \frac{2^k \lfloor (2^{n+1}-d-3)/2d \rfloor}{d \lfloor (2^{n+1}-d-3)/2d \rfloor + 1}$$
[0060] Given that a must be an integer we have a formula for
kopt:
$$k_{opt} = \min\left(k : \frac{1}{2^k}\left\lfloor \frac{2^k}{d} \right\rfloor > \frac{\lfloor (2^{n+1}-d-3)/2d \rfloor}{d \lfloor (2^{n+1}-d-3)/2d \rfloor + 1}\right)$$
$$k_{opt} = \min\left(k : \frac{2^k}{2^k \bmod d} > d\left\lfloor \frac{2^{n+1}-d-3}{2d} \right\rfloor + 1\right)$$
[0061] Where kopt is the smallest such valid k hence:
$$\frac{1}{2^{k_{opt}-1}}\left\lfloor \frac{2^{k_{opt}-1}}{d} \right\rfloor \le \frac{\lfloor (2^{n+1}-d-3)/2d \rfloor}{d \lfloor (2^{n+1}-d-3)/2d \rfloor + 1}$$
$$2\left\lfloor \frac{2^{k_{opt}-1}}{d} \right\rfloor \le \frac{2^{k_{opt}} \lfloor (2^{n+1}-d-3)/2d \rfloor}{d \lfloor (2^{n+1}-d-3)/2d \rfloor + 1}$$
$$\left\lfloor \frac{2^{k_{opt}}}{d} \right\rfloor - 1 \le \frac{2^{k_{opt}} \lfloor (2^{n+1}-d-3)/2d \rfloor}{d \lfloor (2^{n+1}-d-3)/2d \rfloor + 1}$$
[0062] Hence $a = \lfloor 2^{k_{opt}}/d \rfloor$ is valid but
$a = \lfloor 2^{k_{opt}}/d \rfloor - 1$ is not valid. It follows that
there is only one valid value for a when k=kopt. The design which
minimises k and satisfies $ad - 2^k < 0$ is unique in k and a and
is defined by:
$$k_{opt}^- = \min\left(k : \frac{2^k}{2^k \bmod d} > d\left\lfloor \frac{2^{n+1}-d-3}{2d} \right\rfloor + 1\right)$$
$$a_{opt}^- = \left\lfloor \frac{2^{k_{opt}^-}}{d} \right\rfloor$$
$$b_{opt}^- = \mathrm{minHamm}\left( a_{opt}^-\left(\frac{d-1}{2}\right) + \left(2^{k_{opt}^-} - a_{opt}^- d\right)\left\lfloor \frac{2^{n+1}+d-3}{2d} \right\rfloor,\; a_{opt}^-\left(\frac{d+1}{2}\right) + 2^{k_{opt}^-} - a_{opt}^- d - 1 \right)$$
[0063] Where minHamm(a, b) returns the number of smallest value
from the numbers of smallest Hamming weight found within the
interval [a, b].
RTN Minimal Hardware Design
[0064] Summarising the previous sections results in the following
algorithm:
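$$k_{opt}^+ = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > d\left\lfloor \frac{2^{n+1}-d-1}{2d} \right\rfloor - 1\right)$$
$$a_{opt}^+ = \left\lceil \frac{2^{k_{opt}^+}}{d} \right\rceil$$
$$b_{opt}^+ = \mathrm{minHamm}\left( a_{opt}^+\left(\frac{d-1}{2}\right) + 2^{k_{opt}^+} - a_{opt}^+ d,\; a_{opt}^+\left(\frac{d+1}{2}\right) + \left(2^{k_{opt}^+} - a_{opt}^+ d\right)\left\lfloor \frac{2^{n+1}+d-1}{2d} \right\rfloor - 1 \right)$$
$$k_{opt}^- = \min\left(k : \frac{2^k}{2^k \bmod d} > d\left\lfloor \frac{2^{n+1}-d-3}{2d} \right\rfloor + 1\right)$$
$$a_{opt}^- = \left\lfloor \frac{2^{k_{opt}^-}}{d} \right\rfloor$$
$$b_{opt}^- = \mathrm{minHamm}\left( a_{opt}^-\left(\frac{d-1}{2}\right) + \left(2^{k_{opt}^-} - a_{opt}^- d\right)\left\lfloor \frac{2^{n+1}+d-3}{2d} \right\rfloor,\; a_{opt}^-\left(\frac{d+1}{2}\right) + 2^{k_{opt}^-} - a_{opt}^- d - 1 \right)$$
$$\{k_{opt}, a_{opt}, b_{opt}\} = (k_{opt}^+ < k_{opt}^-) \;?\; \{k_{opt}^+, a_{opt}^+, b_{opt}^+\} : \{k_{opt}^-, a_{opt}^-, b_{opt}^-\}$$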
[0065] Note that $k_{opt}^+$ is never equal to $k_{opt}^-$; otherwise,
if $k = k_{opt}^+ = k_{opt}^-$ then:
$$\frac{2^k}{(-2^k) \bmod d} > d\left\lfloor \frac{2^{n+1}-d-1}{2d} \right\rfloor - 1 \ge \frac{2^{k-1}}{(-2^{k-1}) \bmod d}$$
$$\frac{2^k}{2^k \bmod d} > d\left\lfloor \frac{2^{n+1}-d-3}{2d} \right\rfloor + 1 \ge \frac{2^{k-1}}{2^{k-1} \bmod d}$$
[0066] Simplifying these two conditions we get:
$$2\left((-2^{k-1}) \bmod d\right) > (-2^k) \bmod d$$
$$2\left(2^{k-1} \bmod d\right) > 2^k \bmod d$$
$$2\left(2^{k-1} \bmod d\right) > 2^k \bmod d > 2\left(2^{k-1} \bmod d\right) - d$$
[0067] This is a contradiction as $2^k \bmod d$ is equal to one of
these limits.
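A Python sketch of the round to nearest algorithm is given below. It is illustrative only: the names `min_hamm`, `first_k` and `rtn_design` are hypothetical, d is assumed odd and greater than 1, and the sketch additionally assumes 3d < 2^(n+1) so that the input range contains at least two peaks of the sawtooth (this restriction is an assumption of the sketch, not a statement from the source). The lower end of the interval for b is clamped at zero because b is required to be non-negative.

```python
def min_hamm(lo: int, hi: int) -> int:
    """Smallest value among the minimum-Hamming-weight integers in [lo, hi]."""
    c = 0
    for i in range(hi.bit_length() - 1, -1, -1):
        bit = (lo >> i) & 1
        if bit == (hi >> i) & 1:
            c |= bit << i
        else:
            rem = lo & ((1 << i) - 1)
            if rem:
                c += 1 << (rem - 1).bit_length()
            break
    return c


def first_k(threshold: int, residue) -> int:
    """Smallest k with 2^k / residue(k) > threshold (residue is positive for odd d > 1)."""
    k = 0
    while (1 << k) <= threshold * residue(k):
        k += 1
    return k


def rtn_design(n: int, d: int) -> tuple:
    """Return (a, b, k) with (a*x + b) >> k == floor(x/d + 1/2) for all n-bit x."""
    top = 1 << (n + 1)
    peaks = (top + d - 1) // (2 * d)       # floor((2^(n+1) + d - 1) / 2d)
    troughs = (top + d - 3) // (2 * d)     # floor((2^(n+1) + d - 3) / 2d)
    k_pos = first_k(d * (peaks - 1) - 1, lambda k: -(1 << k) % d)
    k_neg = first_k(d * (troughs - 1) + 1, lambda k: (1 << k) % d)
    if k_pos < k_neg:                      # case ad - 2^k > 0
        k, a, m_lo, m_hi = k_pos, -(-(1 << k_pos) // d), 1, peaks
    else:                                  # case ad - 2^k < 0
        k, a, m_lo, m_hi = k_neg, (1 << k_neg) // d, troughs, 1
    s = (1 << k) - a * d                   # 2^k - a*d, negative in the first case
    lo = a * (d - 1) // 2 + s * m_lo
    hi = a * (d + 1) // 2 + s * m_hi - 1
    return a, min_hamm(max(lo, 0), hi), k


if __name__ == "__main__":
    print(rtn_design(8, 3))   # (85, 170, 8)
    print(rtn_design(8, 5))   # (51, 153, 8)
```

Both branches share the same interval expression for b; only the multiplier of 2^k - ad on each end changes with the sign of ad - 2^k.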
FR1 Minimal Hardware Implementation when $ad - 2^k > 0$
[0068] In this case we require:
$$\left\lfloor \frac{2^n}{d} \right\rfloor (ad - 2^k) < 2^k - b$$
[0069] Now note that the right hand side is strictly decreasing in
b. So for any valid a, b and k we can always set b=0 and the
condition will still be met, plus it will cost less hardware to
implement. Hence minimal hardware implementations will have b=0.
Thus our condition reduces to:
$$\left\lfloor \frac{2^n}{d} \right\rfloor (ad - 2^k) < 2^k$$
$$\frac{2^k}{d} < a < \frac{2^k \lceil 2^n/d \rceil}{d \lfloor 2^n/d \rfloor}$$
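[0070] Given that a must be an integer we have a formula for
kopt:
$$k_{opt} = \min\left(k : \frac{1}{2^k}\left\lceil \frac{2^k}{d} \right\rceil < \frac{\lceil 2^n/d \rceil}{d \lfloor 2^n/d \rfloor}\right)$$
$$k_{opt} = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > \left\lfloor \frac{2^n}{d} \right\rfloor\right)$$
[0071] Where kopt is the smallest such valid k hence:
$$\frac{1}{2^{k_{opt}-1}}\left\lceil \frac{2^{k_{opt}-1}}{d} \right\rceil \ge \frac{\lceil 2^n/d \rceil}{d \lfloor 2^n/d \rfloor}$$
$$2\left\lceil \frac{2^{k_{opt}-1}}{d} \right\rceil \ge \frac{2^{k_{opt}} \lceil 2^n/d \rceil}{d \lfloor 2^n/d \rfloor}$$
$$\left\lceil \frac{2^{k_{opt}}}{d} \right\rceil + 1 \ge \frac{2^{k_{opt}} \lceil 2^n/d \rceil}{d \lfloor 2^n/d \rfloor}$$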
[0072] Hence $a = \lceil 2^{k_{opt}}/d \rceil$ is valid but
$a = \lceil 2^{k_{opt}}/d \rceil + 1$ is not valid. It follows that there
is only one valid value for a when k=kopt. We can now state that the
design which minimises k and satisfies $ad - 2^k > 0$ is unique
and is defined by:
$$k_{opt}^+ = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > \left\lfloor \frac{2^n}{d} \right\rfloor\right)$$
$$a_{opt}^+ = \left\lceil \frac{2^{k_{opt}^+}}{d} \right\rceil$$
$$b_{opt}^+ = 0$$
FR1 Minimal Hardware Implementation when $ad - 2^k < 0$
[0073] In this case we need:
$$\left\lfloor \frac{2^n}{d} \right\rfloor (2^k - ad) \le b < 2^k$$
[0074] Now b must live in a non-empty interval so:
$$2^k > \left\lfloor \frac{2^n}{d} \right\rfloor (2^k - ad)$$
$$\frac{2^k}{d} > a > \frac{2^k \left(\lfloor 2^n/d \rfloor - 1\right)}{d \lfloor 2^n/d \rfloor}$$
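[0075] Given that a must be an integer we have a formula for
kopt:
$$k_{opt} = \min\left(k : \frac{1}{2^k}\left\lfloor \frac{2^k}{d} \right\rfloor > \frac{\lfloor 2^n/d \rfloor - 1}{d \lfloor 2^n/d \rfloor}\right)$$
$$k_{opt} = \min\left(k : \frac{2^k}{2^k \bmod d} > \left\lfloor \frac{2^n}{d} \right\rfloor\right)$$
[0076] Where kopt is the smallest such valid k hence:
$$\frac{1}{2^{k_{opt}-1}}\left\lfloor \frac{2^{k_{opt}-1}}{d} \right\rfloor \le \frac{\lfloor 2^n/d \rfloor - 1}{d \lfloor 2^n/d \rfloor}$$
$$2\left\lfloor \frac{2^{k_{opt}-1}}{d} \right\rfloor \le \frac{2^{k_{opt}} \left(\lfloor 2^n/d \rfloor - 1\right)}{d \lfloor 2^n/d \rfloor}$$
$$\left\lfloor \frac{2^{k_{opt}}}{d} \right\rfloor - 1 \le \frac{2^{k_{opt}} \left(\lfloor 2^n/d \rfloor - 1\right)}{d \lfloor 2^n/d \rfloor}$$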
[0077] Hence $a = \lfloor 2^{k_{opt}}/d \rfloor$ is valid but
$a = \lfloor 2^{k_{opt}}/d \rfloor - 1$ is not valid. It follows that
there is only one valid value for a when k=kopt. We can now state
that the design which minimises k and satisfies $ad - 2^k < 0$ is
unique in k and a and is defined by:
$$k_{opt}^- = \min\left(k : \frac{2^k}{2^k \bmod d} > \left\lfloor \frac{2^n}{d} \right\rfloor\right)$$
$$a_{opt}^- = \left\lfloor \frac{2^{k_{opt}^-}}{d} \right\rfloor$$
$$b_{opt}^- = \mathrm{minHamm}\left( \left(2^{k_{opt}^-} - a_{opt}^- d\right)\left\lfloor \frac{2^n}{d} \right\rfloor,\; 2^{k_{opt}^-} - 1 \right)$$
[0078] Where minHamm(a, b) returns the number of smallest value
from the numbers of smallest Hamming weight found within the
interval [a, b].
FR1 Minimal Hardware Design
[0079] Summarising the previous sections we have the following
algorithm:
$$k_{opt}^+ = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > \left\lfloor \frac{2^n}{d} \right\rfloor\right)$$
$$a_{opt}^+ = \left\lceil \frac{2^{k_{opt}^+}}{d} \right\rceil$$
$$b_{opt}^+ = 0$$
$$k_{opt}^- = \min\left(k : \frac{2^k}{2^k \bmod d} > \left\lfloor \frac{2^n}{d} \right\rfloor\right)$$
$$a_{opt}^- = \left\lfloor \frac{2^{k_{opt}^-}}{d} \right\rfloor$$
$$b_{opt}^- = \mathrm{minHamm}\left( \left(2^{k_{opt}^-} - a_{opt}^- d\right)\left\lfloor \frac{2^n}{d} \right\rfloor,\; 2^{k_{opt}^-} - 1 \right)$$
$$\{k_{opt}, a_{opt}, b_{opt}\} = (k_{opt}^+ < k_{opt}^-) \;?\; \{k_{opt}^+, a_{opt}^+, b_{opt}^+\} : \{k_{opt}^-, a_{opt}^-, b_{opt}^-\}$$
[0080] Note that $k_{opt}^+$ is never equal to $k_{opt}^-$; otherwise,
if $k = k_{opt}^+ = k_{opt}^-$ then:
$$\frac{2^k}{(-2^k) \bmod d} > \left\lfloor \frac{2^n}{d} \right\rfloor \ge \frac{2^{k-1}}{(-2^{k-1}) \bmod d}$$
$$\frac{2^k}{2^k \bmod d} > \left\lfloor \frac{2^n}{d} \right\rfloor \ge \frac{2^{k-1}}{2^{k-1} \bmod d}$$
[0081] Simplifying these two conditions we get:
$$2\left((-2^{k-1}) \bmod d\right) > (-2^k) \bmod d$$
$$2\left(2^{k-1} \bmod d\right) > 2^k \bmod d$$
$$2\left(2^{k-1} \bmod d\right) > 2^k \bmod d > 2\left(2^{k-1} \bmod d\right) - d$$
[0082] This is a contradiction as $2^k \bmod d$ is equal to one of
these limits.
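The faithful rounding algorithm can be sketched in Python along the same lines. The names `min_hamm`, `first_k` and `fr1_design` are hypothetical, and the sketch assumes d is odd with 1 < d < 2^n. The check in the usage example accepts either neighbour of x/d, which forces the exact quotient whenever x is a multiple of d.

```python
def min_hamm(lo: int, hi: int) -> int:
    """Smallest value among the minimum-Hamming-weight integers in [lo, hi]."""
    c = 0
    for i in range(hi.bit_length() - 1, -1, -1):
        bit = (lo >> i) & 1
        if bit == (hi >> i) & 1:
            c |= bit << i
        else:
            rem = lo & ((1 << i) - 1)
            if rem:
                c += 1 << (rem - 1).bit_length()
            break
    return c


def first_k(threshold: int, residue) -> int:
    """Smallest k with 2^k / residue(k) > threshold (residue is positive for odd d > 1)."""
    k = 0
    while (1 << k) <= threshold * residue(k):
        k += 1
    return k


def fr1_design(n: int, d: int) -> tuple:
    """Return (a, b, k) so that (a*x + b) >> k is floor(x/d) or ceil(x/d)."""
    f = (1 << n) // d                                  # floor(2^n / d)
    k_pos = first_k(f, lambda k: -(1 << k) % d)        # case ad - 2^k > 0
    k_neg = first_k(f, lambda k: (1 << k) % d)         # case ad - 2^k < 0
    if k_pos < k_neg:
        return -(-(1 << k_pos) // d), 0, k_pos         # a = ceil(2^k / d), b = 0
    a = (1 << k_neg) // d                              # a = floor(2^k / d)
    s = (1 << k_neg) - a * d
    return a, min_hamm(s * f, (1 << k_neg) - 1), k_neg


def is_faithful(n: int, d: int, a: int, b: int, k: int) -> bool:
    """True if every n-bit x maps to floor(x/d) or ceil(x/d)."""
    return all((a * x + b) >> k in (x // d, -(-x // d)) for x in range(1 << n))


if __name__ == "__main__":
    print(fr1_design(8, 3))   # (43, 0, 7)
```

For n = 8 and d = 3 the shift is 7, one bit less than the round towards zero design for the same inputs.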
Invariant Integer Division Synthesiser
[0083] Example structure of a synthesis apparatus according to the
disclosure that performs invariant integer division is depicted in
FIG. 1.
[0084] This shows a parameter creation unit 2 which has three
inputs n, d and rounding mode. n is the number of bits to be used
in the numerator of the division, d is the divisor, and the
rounding mode is a selection of one of a plurality of rounding
modes. Three examples are given here but others are possible.
[0085] The parameter creation unit 2 generates in dependence on the
inputs n, d, and rounding mode, the integer triple (a, b, k)
required by an RTL generator k to generate an appropriate RTL
representation of the circuitry for performing the division for the
said number of bits of n and rounding mode, and for additional
conditions provided to the RTL generation. The RTL generator is
computer controlled to generate an RTL representation of a division
for the integer triple using additional conditions such as
$ad - 2^k < 0$.
[0086] The RTL representation is then output to a synthesis tool 6
which generates the hardware circuits required to implement the
division on an appropriate part of an integrated circuit.
[0087] The algorithm in the parameter creation may be summarised
as:
$$\{k, a, b\} = (k^+ < k^-) \;?\; \{k^+, a^+, \mathrm{minHamm}(Y^+(k^+, a^+))\} : \{k^-, a^-, \mathrm{minHamm}(Y^-(k^-, a^-))\}$$
Where
$$k^+ = \min\left(k : \frac{2^k}{(-2^k) \bmod d} > X^+\right)$$
$$k^- = \min\left(k : \frac{2^k}{2^k \bmod d} > X^-\right)$$
$$a^+ = \left\lceil \frac{2^{k^+}}{d} \right\rceil$$
$$a^- = \left\lfloor \frac{2^{k^-}}{d} \right\rfloor$$
And
TABLE-US-00002
 | RTZ | RTN | FR1 |
$X^+$ | $d\lfloor 2^n/d \rfloor - 1$ | $d\lfloor (2^{n+1}-d-1)/2d \rfloor - 1$ | $\lfloor 2^n/d \rfloor$ |
$X^-$ | $d\lfloor 2^n/d \rfloor - d + 1$ | $d\lfloor (2^{n+1}-d-3)/2d \rfloor + 1$ | $\lfloor 2^n/d \rfloor$ |
$Y^+(k, a)$ | $(0,\ 0)$ | $\left(a\frac{d-1}{2} + 2^k - ad,\ a\frac{d+1}{2} + (2^k - ad)\lfloor (2^{n+1}+d-1)/2d \rfloor - 1\right)$ | $(0,\ 0)$ |
$Y^-(k, a)$ | $\left((2^k - ad)\lfloor 2^n/d \rfloor,\ 2^k - a(d-1) - 1\right)$ | $\left(a\frac{d-1}{2} + (2^k - ad)\lfloor (2^{n+1}+d-3)/2d \rfloor,\ a\frac{d+1}{2} + 2^k - ad - 1\right)$ | $\left((2^k - ad)\lfloor 2^n/d \rfloor,\ 2^k - 1\right)$ |
Specific Example of the Idea: Application to the Multiplication of
Normalised Numbers
[0088] An unsigned n bit normalised number x is interpreted as
holding the value $x/(2^n - 1)$. Multiplication of these numbers
thus involves computing the following:
$$\frac{y}{2^n - 1} \approx \frac{a}{2^n - 1} \cdot \frac{b}{2^n - 1}$$
$$y \approx \frac{ab}{2^n - 1}$$
[0089] We can apply the previously found results to implementing
this design for the three rounding modes. In this case $d = 2^n - 1$
and given that $ab \le (2^n - 1)^2$ then $2^n - 1$ in the
previous sections will be replaced by $(2^n - 1)^2$.
Substituting these values into the previous sections gives rise to
the following three rounding formulae:
$$RTZ\left(\frac{ab}{2^n - 1}\right) = \left\lfloor \frac{(2^n + 1)ab + 2^n}{2^{2n}} \right\rfloor$$
$$RTN\left(\frac{ab}{2^n - 1}\right) = \left\lfloor \frac{(2^n + 1)(ab + 2^{n-1})}{2^{2n}} \right\rfloor$$
$$FR1\left(\frac{ab}{2^n - 1}\right) = \left\lfloor \frac{ab + 2^n - 1}{2^n} \right\rfloor$$
[0090] Note that the RTN case gives a generalisation and proof of
the formula for such multiplication [2]. Note that the allowable
interval for the additive constant is $[2^n - 1,\ 2^n + 1]$ for RTZ
and $[2^{n-1}(2^n + 1) - 2,\ 2^{n-1}(2^n + 1)]$ for RTN, and there is no
freedom for the FR1 case.
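The three formulae for normalised multiplication can be written directly as shift-and-add expressions. The Python sketch below is illustrative; the function names are hypothetical, and a and b are assumed to be unsigned n bit values with n >= 2.

```python
def norm_mul_rtz(a: int, b: int, n: int) -> int:
    """floor(a*b / (2^n - 1)) via ((2^n + 1)*a*b + 2^n) >> 2n."""
    return (((1 << n) + 1) * a * b + (1 << n)) >> (2 * n)


def norm_mul_rtn(a: int, b: int, n: int) -> int:
    """floor(a*b / (2^n - 1) + 1/2) via ((2^n + 1)*(a*b + 2^(n-1))) >> 2n."""
    return (((1 << n) + 1) * (a * b + (1 << (n - 1)))) >> (2 * n)


def norm_mul_fr1(a: int, b: int, n: int) -> int:
    """Either neighbour of a*b / (2^n - 1) via (a*b + 2^n - 1) >> n."""
    return (a * b + (1 << n) - 1) >> n


if __name__ == "__main__":
    # 8-bit example: 255 represents 1.0, so 255 * 255 should give 255.
    print(norm_mul_rtz(255, 255, 8), norm_mul_rtn(255, 255, 8), norm_mul_fr1(255, 255, 8))
```

The faithful variant needs no constant multiplier at all, only an addition and an n bit shift, which illustrates how much hardware the weaker rounding guarantee can save.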
Alternative Implementations
[0091] Further implementations can be realized by those skilled in
the art based on the following disclosures, to deal with the
following situations:
[0092] 1) d is even. Note that if d is a power of 2 then we have the
trivial implementations:
$$RTZ\left(\frac{x}{d}\right) = FR1\left(\frac{x}{d}\right) = x \gg \log_2(d)$$
$$RTN\left(\frac{x}{d}\right) = (x + d/2) \gg \log_2(d)$$
[0093] 2) If x lives in the interval [0,max] and not
$[0,\ 2^n - 1]$ then it suffices to replace $2^n$ in the formulae
above with max+1.
[0094] 3) If x can take negative values, the formulae can
be reworked using the framework established above.
[0095] 4) Other rounding modes: the formulae can be reworked using
the framework established above.
[0096] 5) The hardware scheme has been chosen to minimize the size
of the final shift; a different scheme could be applied which would
result in different equations for optimal a, b and k.
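The power-of-two case in item 1) reduces to a plain shift. A minimal Python sketch follows; the function name `div_pow2` and the string mode labels are hypothetical.

```python
def div_pow2(x: int, d: int, mode: str = "rtz") -> int:
    """Divide a non-negative x by a power of two d using only add and shift."""
    if d <= 0 or d & (d - 1):
        raise ValueError("d must be a power of two")
    shift = d.bit_length() - 1              # log2(d)
    if mode in ("rtz", "fr1"):
        return x >> shift
    if mode == "rtn":
        return (x + d // 2) >> shift        # add half the divisor before shifting
    raise ValueError("unknown rounding mode")


if __name__ == "__main__":
    print(div_pow2(13, 4, "rtz"), div_pow2(14, 4, "rtn"))   # 3 4
```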
[0097] In summary of the above, FIG. 1 depicts an example where a
parameter creator inputs n, d, and a desired rounding mode, and
outputs integer triple a, b, and k. An RTL generator 11 receives
the a, b, and k. RTL generator 11 generates RTL for
(a*x+b)>>k according to the parameters. The generated RTL is
input to a synthesis tool 12, which outputs a hardware layout that
can be fabricated in a fabrication process.
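To make the RTL generation step concrete, the following Python sketch emits Verilog text for (a*x+b)>>k from a given triple. It is only an illustration of the data flow: the function name `emit_verilog`, the module name `const_div` and the port names are hypothetical, and an actual RTL generator may well expand the constant multiplication into shifts and adds rather than rely on the `*` operator.

```python
def emit_verilog(n: int, a: int, b: int, k: int, name: str = "const_div") -> str:
    """Return Verilog text computing q = (a*x + b) >> k for an n-bit unsigned x."""
    worst = a * ((1 << n) - 1) + b                 # largest value of a*x + b
    t_bits = max(worst.bit_length(), 1)            # width of the multiply-add result
    q_bits = max((worst >> k).bit_length(), 1)     # width of the quotient
    return "\n".join([
        f"module {name} (input wire [{n - 1}:0] x, output wire [{q_bits - 1}:0] q);",
        f"  wire [{t_bits - 1}:0] t = {t_bits}'d{a} * x + {t_bits}'d{b};",
        f"  assign q = t >> {k};",
        "endmodule",
    ])


if __name__ == "__main__":
    # Triple (85, 85, 8): round towards zero division of an 8-bit x by 3.
    print(emit_verilog(8, 85, 85, 8))
```

Sizing the intermediate wire from the largest possible value of a*x+b keeps the multiply-add from overflowing while avoiding wider arithmetic than the triple requires.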
[0098] FIG. 2 depicts an example method, where integer triple (a,
b, k) is derived 15 for a rounding mode and conditions. At 16, a
minimum value of k is derived (16 is depicted separately from 15,
for ease of explanation). At 17, an RTL representation of $(ax+b)/2^k$
is derived. At 18, a hardware layout is derived from the RTL. At
19, a circuit can be manufactured according to the hardware
layout.
[0099] FIG. 3 depicts exemplary apparatus in which methods can be
implemented. A processor 25 interfaces with a user interface 30 and
with a display 32. A RAM 26 and non-volatile storage 28 interface
with processor 25. These memories are tangible memories that can
store instructions for configuring processor 25 to perform aspects
of the disclosure.
* * * * *