U.S. patent application number 09/118225 was filed with the patent office on 1998-07-17 and published on 2002-01-24 for digital circuit layout techniques.
Invention is credited to WALLACE, DAVID E..
Application Number | 20020010899 09/118225 |
Document ID | / |
Family ID | 22377270 |
Filed Date | 1998-07-17 |
United States Patent
Application |
20020010899 |
Kind Code |
A1 |
WALLACE, DAVID E. |
January 24, 2002 |
DIGITAL CIRCUIT LAYOUT TECHNIQUES
Abstract
A technique for analyzing digital circuits to identify pin swaps
is provided for circuit layout and similar tasks in which the
circuit is first decomposed into fanout-free regions.
Quasi-canonical forms or models of the fanout-free regions are
created, from which a swap structure is built so that pin swap
groups can be identified.
Inventors: |
WALLACE, DAVID E.; (HAYWARD,
CA) |
Correspondence
Address: |
SOKOLOFF, TAYLOR, & ZAFMAN, LLP
12400 WILSHIRE BOULEVARD
7TH FLOOR
LOS ANGELES,
CA
90025
|
Family ID: |
22377270 |
Appl. No.: |
09/118225 |
Filed: |
July 17, 1998 |
Current U.S.
Class: |
716/103 |
Current CPC
Class: |
G06F 30/327
20200101 |
Class at
Publication: |
716/2 |
International
Class: |
G06F 017/50; G06F
017/60 |
Claims
What is claimed is:
1. A method of analyzing a digital circuit to identify input
equivalences comprising the steps of: decomposing the digital
circuit into one or more regions having a first set of specified
characteristics; creating a modified circuit structure from the
decomposed regions; and identifying pin swap groups within the
modified circuit structure.
2. The method of claim 1 wherein the step of decomposing the
digital circuit further comprises the step of: decomposing the
circuit into one or more fanout-free regions.
3. The method of claim 2 wherein the step of decomposing the
circuit into one or more fanout-free regions further comprises the
steps of: identifying the region of the circuit headed by a single
gate which may fanout to multiple places; identifying all other
gates to be members of that region if they have only one fanout
which is itself a member of the region; and including input nets
within the region which fanout to multiple input pins within the
region.
4. The method of claim 1 wherein the step of creating a modified
circuit structure from the decomposed regions further comprises the
steps of: building of a swap structure from the original circuit
which computes the same function as the original circuit and
contains points that are identified with the external pins of gates
in that circuit.
5. The method of claim 4 wherein the step of creating a modified
circuit structure from the decomposed regions further comprises the
step of: modifying the initial circuit structure to include large
symmetric functions such as AND, OR or XOR functions that were
present in the original circuit structure in the form of large
functional blocks.
6. The method of claim 5 wherein the step of modifying the initial
circuit structure to include large symmetric functions further
comprises the step of: avoiding input rotations to the inputs of
symmetric functions which affect genealogical constraints.
7. The method of claim 6 wherein the step of avoiding input
rotations which affect genealogical constraints further comprises
the step of: avoiding the swapping of input and output pins of the
same logic function.
8. The method of claim 5 wherein the step of modifying the initial
circuit structure to include large symmetric functions further
comprises the step of: expanding the pins of the original circuit
structure into a quasi-canonical form.
9. The method of claim 8 wherein the step of expanding the pins of
the original circuit further comprises the step of: building a data
table of pin properties by expanding the pins of each of the gates
in the original structure.
10. The method of claim 9 wherein the step of building the data
table of pin properties further comprises the step of: sequentially
expanding each gate in each fanout-free region from the head gate
providing the output of that region backwards towards input gates
in that region.
11. The method of claim 1 in which the step of identifying pin swap
groups within the modified circuit structure further comprises the
step of: extracting swap groups from the modified circuit
structure.
12. The method of claim 11 in which the step of extracting swap
groups from the modified structure further comprises the step of:
recognizing multi-gate input equivalences and/or multi-pin
swaps.
13. The method of claim 12 in which the step of recognizing
multi-gate single pin swaps further comprises the step of:
identifying two external pins which are both inputs to the same
functional block in the swap structure and which do not have
multiple fanouts.
14. The method of claim 13 wherein the step of identifying the
external pins further comprises the step of: avoiding pins which
are in an ancestor-descendant relationship with each other.
15. The method of claim 12 in which the step of recognizing
multi-pin swaps further comprises the steps of: identifying pins
input to a large functional block in the swap structure; expanding
the input function to this pin until it consists entirely of
external gate pins; then looking for a match among the input
functions to other pins in the same swap function.
16. A method of analyzing a digital circuit to identify input
equivalences comprising the steps of: decomposing the digital
circuit into one or more fanout-free regions; creating a
quasi-canonical circuit structure from the fanout-free regions; and
identifying pin swap groups within the circuit structure.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] This invention relates to techniques for converting
representations of digital circuits, such as logic diagrams or
schematics, into layouts for circuit implementation, and more
specifically to the identification of logic input equivalences for
controlling and optimizing circuit area and circuit delays during
the layout process.
[0003] 2. Description of the Prior Art
[0004] Many very sophisticated logic synthesis and layout tools
have been developed for producing circuit layouts from circuit and
logic diagrams. One recent article, "Effective Coupling between
Logic Synthesis and Layout Tools for Synthesis of Area and
Speed-Efficient Circuits", Chandrasekhar, McCharles and Wallace,
published in VLSI DESIGN, 1997, Vol. 5, No. 2, pp. 125-140,
co-authored by the inventor hereof, proposes coupling between logic
synthesis and layout tools to improve post-layout circuit
implementation.
[0005] As noted in that article, many circuits synthesized by
automatic logic synthesis or other methods contain internal nodes
at the outputs of logic gates that implement the same logic
function and are therefore logically equivalent. Such nodes are
considered output equivalent and techniques are provided for
exploiting output equivalents during the layout of digital
circuits. Similarly, input equivalence is demonstrated in logic
circuits in which the output of an output gate does not change even
if the circuit configuration is changed by interchanging the
drivers connected to input gates feeding that output gate, as shown
in FIG. 8 of that article.
[0006] Although techniques are provided for working with output
equivalence, what are needed are techniques for identifying and
exploiting input equivalences in the synthesis and layout of
digital logic circuits.
SUMMARY OF THE INVENTION
[0007] The present invention provides improved techniques for
identifying input equivalence in digital circuits for use, for
example, in swapping pins in order to modify circuit layout. These
techniques include the steps of decomposing the circuit into one or
more fanout-free regions (FFRs), generating quasi-canonical models
to create a swap structure, identifying pin swap groups and
swapping pins as desired for final layout or configuration.
[0008] The present invention proceeds by looking for extensible
symmetric logic functions (such as AND, OR, and XOR functions)
within and between gates in a logic circuit. Such functions are
grown backwards as long as they can continue to be extended and
then analyzed to identify input equivalences and thus permutable
pins.
[0009] The present invention may be conveniently implemented in a
computer by coding appropriate software on computer coded media or
by any other conventional means of programming a computer. The
required software may be written by a person of ordinary skill in
the art of developing programs for analyzing digital circuits for
layout and similar operations.
[0010] In a first aspect, the present invention provides a method
of analyzing a digital circuit to identify input equivalences by
decomposing the digital circuit into one or more regions having a
first set of specified characteristics, creating a modified circuit
structure from the decomposed regions, and identifying pin swap
groups within the modified circuit structure.
[0011] In another aspect, the present invention provides a method
of analyzing a digital circuit to identify input equivalences by
decomposing the digital circuit into one or more fanout-free
regions, creating a quasi-canonical circuit structure from the
fanout-free regions, and identifying pin swap groups within the
circuit structure.
[0012] These and other features and advantages of this invention
will become further apparent from the detailed description and
accompanying figures that follow. In the figures and description,
numerals indicate the various features of the invention, like
numerals referring to like features throughout both the drawings
and the description.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] FIG. 1 is a logic schematic of a simple circuit exhibiting
both input and output equivalence.
[0014] FIG. 2 is a logic schematic of a simple circuit
demonstrating multi-gate, single pin (MGSP) equivalence.
[0015] FIG. 3 is a logic schematic of a simple circuit
demonstrating multi-pin (MP) equivalence in which groups of two or
more pins are found to be equivalent when swapped as a group.
[0016] FIG. 4 is a simple flow chart of the major steps of the
operation of the present invention.
[0017] FIG. 5 is a logic schematic of a simple circuit illustrating
the properties of fanout-free regions.
[0018] FIG. 6 is a logic schematic of a more complicated circuit
which will be analyzed in accordance with the present invention to
permit pin swapping for region configuration of the final layout of
the circuit.
[0019] FIG. 7 is a swap structure of the circuit shown in FIG. 6
based on the pin properties shown in table 4.
[0020] FIG. 8 is a logic schematic of a decomposed swap structure
based on the swap structure of FIG. 6.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT(S)
[0021] Referring to FIG. 1, digital logic circuit 10 illustrates
examples of both input and output equivalence. In digital logic
circuit 10, two input XOR gates 12 and 14 are combined in two input
XOR gate 16, the output of which is buffered in output buffers 18
and 20. The inputs to XOR gates 12 and 14 are said to exhibit input
equivalence in that the output of two input XOR gate 16 does not
change if any of the inputs to two input XOR gates 12 and 14 are
interchanged. Similarly, the outputs of output buffers 18 and 20
are said to display output equivalence in that the outputs can be
exchanged.
[0022] In particular, logic equivalence can be said to identify
classes of permutable pins, such as input pins 22 and 24 of two
input XOR gate 12 and input pins 26 and 28 of two input XOR gate 14
as well as output pins 30 and 32 of output buffers 18 and 20,
respectively. That is, such pins can be swapped after placement
and/or routing to reduce wire length, improve circuit timing, or
reduce routing congestion.
[0023] In general, there are two types of logic equivalence: input
equivalence, in which inputs to a sub-circuit (such as the inputs
to a parity tree) are identified as permutable, and output
equivalence, in which output pins of a sub-circuit (such
as the outputs of a buffer tree) are identified as permutable. In
FIG. 1, input equivalence is said to be evidenced by permutable
input pins 22, 24, 26 and 28 while output equivalence is said to be
evidenced by permutable output pins 30 and 32.
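As a minimal sketch of the input equivalence described for FIG. 1, the following Python snippet models the three XOR gates and checks exhaustively that every permutation of the four input pins leaves the output of gate 16 unchanged. The function names `circuit_10` and `inputs_are_permutable`, and the ordering of the pin arguments, are illustrative assumptions and are not part of the original disclosure.

```python
from itertools import permutations, product

def circuit_10(p22, p24, p26, p28):
    """Output of XOR gate 16 in FIG. 1 (hypothetical model of the figure)."""
    g12 = p22 ^ p24      # two input XOR gate 12
    g14 = p26 ^ p28      # two input XOR gate 14
    return g12 ^ g14     # two input XOR gate 16

def inputs_are_permutable(f, n):
    """True if f gives the same output under every permutation of its n inputs."""
    for values in product((0, 1), repeat=n):
        expected = f(*values)
        for perm in permutations(values):
            if f(*perm) != expected:
                return False
    return True

all_swappable = inputs_are_permutable(circuit_10, 4)
```

Exhaustive enumeration is only practical for small cones of logic; it is used here purely to illustrate what "permutable" means for pins 22, 24, 26 and 28.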
[0024] Input equivalence is a property of pins, while output
equivalence can be expressed either as a property of the output
pins or of the nets that are driven by them. A given load driven by
such a net can equally well be driven by any net that is output
equivalent to it without changing the logical function of the
circuit, as long as timing and fanout constraints are observed.
[0025] The present invention addresses the problem of recognizing
input equivalence within a digital logic circuit such as digital
logic circuit 10. In general, three types of input equivalence are
considered:
[0026] 1) Single gate, single pin equivalences,
[0027] 2) Multi-gate, single pin equivalences, and
[0028] 3) Multi-pin equivalences (whether single or
multi-gate).
[0029] Single gate, single pin equivalence consists of symmetric
inputs to a single logic gate such as a three input NAND gate.
Recognition and exploitation of such equivalence has been
well-known for years.
[0030] Referring now to FIG. 2, circuit 34 is an example of
multi-gate, single pin equivalence (MGSP). Two input OR gate 36
receives inputs 38 and 40 to produce output 42. Three input NAND
Gate 44 receives inputs 46, 48 and 50 to produce output 52 which is
applied, together with output 42 and input 54, to three input NOR
gate 56 to produce output 58. Three input NAND gate produces output
62 from output 58 and inputs 64 and 66. In this circuit
configuration, pins 38, 40, 52 and 54 form a first group of pins
that are permutable, or swappable, while pins 46, 48, 50, 64 and 66
form a second group of permutable pins which may be freely
exchanged.
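The two swap groups of FIG. 2 can be checked with a short truth-table sketch. The Python model below is an assumption built from the gate description in the preceding paragraph: the dictionary encoding of pins and the function names are hypothetical. Pin 52 is driven internally by NAND gate 44, so the exhaustive check only covers the externally driven pins of each group.

```python
from itertools import product

# Externally driven pins of circuit 34 (pin 52 is driven by NAND gate 44).
EXTERNAL = (38, 40, 54, 46, 48, 50, 64, 66)

def circuit_34(pins):
    """pins maps a pin reference numeral to 0 or 1; returns output 62."""
    o42 = pins[38] | pins[40]                     # two input OR gate 36
    o52 = 1 - (pins[46] & pins[48] & pins[50])    # three input NAND gate 44
    o58 = 1 - (o42 | o52 | pins[54])              # three input NOR gate 56
    return 1 - (o58 & pins[64] & pins[66])        # final three input NAND gate

def swappable(a, b):
    """True if exchanging the signals on pins a and b never changes output 62."""
    for bits in product((0, 1), repeat=len(EXTERNAL)):
        pins = dict(zip(EXTERNAL, bits))
        swapped = dict(pins)
        swapped[a], swapped[b] = pins[b], pins[a]
        if circuit_34(pins) != circuit_34(swapped):
            return False
    return True
```

Under this model, `swappable(38, 54)` and `swappable(46, 64)` hold even though the pins sit on different gates, while `swappable(38, 46)` does not, because the two pins belong to different groups.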
[0031] There are some constraints which must be observed, one of
which can be called a "Genealogical constraint". For example, an
input and output of the same logic function should not be swapped.
Viewing a cone of logic as a family tree, a node must never be
swapped with one of its ancestors or descendants.
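One way to picture the genealogical constraint is as a reachability test along the single fanout path of each node in a cone of logic. The sketch below is an assumption for illustration only: the `feeds` map (node to the node it drives) is a hypothetical encoding of FIG. 2 by reference numeral, and the function names do not come from the original text.

```python
def downstream(node, feeds):
    """Yield every node reached by following the fanout path from node."""
    while node in feeds:
        node = feeds[node]
        yield node

def violates_genealogy(a, b, feeds):
    """True if a and b are in an ancestor/descendant relationship."""
    return a in downstream(b, feeds) or b in downstream(a, feeds)

# Hypothetical encoding of FIG. 2: each key drives the node it maps to.
feeds_fig2 = {
    38: 42, 40: 42,           # inputs of OR gate 36, producing output 42
    46: 52, 48: 52, 50: 52,   # inputs of NAND gate 44, producing output 52
    42: 58, 52: 58, 54: 58,   # inputs of NOR gate 56, producing output 58
    58: 62, 64: 62, 66: 62,   # inputs of the final NAND gate, producing output 62
}
```

With this encoding, swapping pin 46 with pin 52 is rejected because 52 lies on the path from 46 to the output, whereas pins 38 and 54 are unrelated and may be considered for a swap.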
[0032] Referring now to FIG. 3, circuit 62 is an example of
multi-pin (MP) equivalence in which groups of 2 or more pins are
found to be equivalent when swapped as a group. This may occur
either on a single gate or across gates. In circuit 62, two input
AND gates 64 and 66 receive inputs 68, 70, 72 and 74 and provide
outputs 76 and 78, respectively, as inputs to two input NOR gate
80. The output of two input NOR gate 80 is applied as input 82 to
two input NAND gate 84. In a preferred physical embodiment of this
circuit, gates 80, 64 and 66 are included in a single AND-OR-INVERT
gate, such as gate 81. Similarly, inputs 86 and 88 are applied to
two input NAND gate 90 to produce output 92 applied as an input to
two input NAND gate 84 to produce circuit output 94.
[0033] In this example, input pair 68 and 70 can be swapped with
input pair 72 and 74 as well as input pair 86 and 88. Note that
this equivalence is based on single-pin equivalences involving the
inaccessible or internal pins at the outputs of gates 64 and
66.
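The multi-pin equivalence of FIG. 3 can be illustrated with a small exhaustive check. The Python model below is a sketch based on the gate description in the two preceding paragraphs. The function names and the indexing of the input pairs (0 for pins 68 and 70, 1 for pins 72 and 74, 2 for pins 86 and 88) are assumptions made for illustration.

```python
from itertools import product

def circuit_62(p68, p70, p72, p74, p86, p88):
    """Circuit output 94 of FIG. 3 (hypothetical model of the figure)."""
    o76 = p68 & p70              # two input AND gate 64
    o78 = p72 & p74              # two input AND gate 66
    o82 = 1 - (o76 | o78)        # two input NOR gate 80
    o92 = 1 - (p86 & p88)        # two input NAND gate 90
    return 1 - (o82 & o92)       # two input NAND gate 84

def pair_swap_preserves_output(i, j):
    """Swap input pair i with input pair j as a group and compare outputs."""
    for bits in product((0, 1), repeat=6):
        pairs = [bits[0:2], bits[2:4], bits[4:6]]
        swapped = list(pairs)
        swapped[i], swapped[j] = pairs[j], pairs[i]
        flat = [b for pair in swapped for b in pair]
        if circuit_62(*bits) != circuit_62(*flat):
            return False
    return True
```

The circuit reduces to the OR of three two-input AND terms, which is why whole pairs can be exchanged while a single pin moved from one pair to another (for example pin 70 with pin 72) generally changes the function.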
[0034] Multiple-gate equivalences, such as those shown in FIGS. 2
and 3, offer greater optimization opportunities during physical
design than do single-gate equivalences. Properly exploited, such
multi-gate equivalences allow pins to be swapped during physical
design to relieve layout problems, for example by shortening wires,
sharing a single track to relieve density, and rerouting
connections. Thus multi-gate pin swaps offer
significantly greater opportunities to reduce wire lengths and
avoid congestion than the single-gate swaps do.
[0035] Multi-gate equivalences also offer greater opportunities to
fix timing problems, often discovered during physical design.
Single-gate swaps can improve timing by a small amount, by
switching a net from a slow pin to a slightly faster pin on the
same gate. But multi-gate equivalences can eliminate entire levels
of logic from the critical path.
[0036] For example, if a net attached to a particular pin was
discovered to be much slower than expected (due to placement and
routing effects), it could be swapped with a net attached to
another pin in order to reduce the levels of logic on the critical
path. The wire-length reductions that are possible due to
multi-gate swaps can also help fix timing problems, by reducing
wire capacitance along the critical path.
[0037] In order to exploit such equivalences, it is necessary to be
able to easily recognize input equivalence. Conventionally, input
symmetries in a combinational logic function can be detected in
several ways. To determine if a Boolean function
$f(\ldots, x, y, \ldots)$ of several variables is symmetric in x and
y, a canonical representation of the function (such as a truth table
or a binary decision diagram (BDD)) can be built to determine if
$f(\ldots, x, y, \ldots) = f(\ldots, y, x, \ldots)$. Similarly,
symmetries can be determined through analysis of the cofactors of
the function.
[0038] A Boolean function $f(\ldots, x, y, \ldots)$ of several
variables is symmetric in x and y if and only if the cofactor with
x=1 and y=0 is equal to the cofactor with x=0 and y=1. A variety
of heuristic approaches to minimize the effort required in such a
computation is described in the literature by Moller, Mohnke and
Weber in an article entitled "Detection of Symmetry of Boolean
Functions Represented by ROBDDs" published in Proceedings of the
IEEE/ACM International Conference on Computer Aided Design
(ICCAD93), Santa Clara, Calif. November, 1993, pp. 680-684.
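The cofactor criterion can be sketched directly on a truth-table representation. The Python code below is a minimal illustration, not the ROBDD-based method of the cited article: it enumerates the remaining inputs by brute force, and the names `symmetric_in`, `mux` and `majority` are assumptions introduced for the example.

```python
from itertools import product

def symmetric_in(f, n, i, j):
    """True if the n-input Boolean function f is symmetric in inputs i and j.

    Compares the cofactor with (x_i, x_j) = (1, 0) against the cofactor with
    (x_i, x_j) = (0, 1) for every setting of the remaining inputs.
    """
    others = [k for k in range(n) if k not in (i, j)]
    for rest in product((0, 1), repeat=len(others)):
        args = [0] * n
        for k, v in zip(others, rest):
            args[k] = v
        args[i], args[j] = 1, 0
        f10 = f(*args)
        args[i], args[j] = 0, 1
        f01 = f(*args)
        if f10 != f01:
            return False
    return True

def mux(a, b, s):
    """2:1 multiplexer: selects b when s is 1, otherwise a."""
    return b if s else a

def majority(a, b, c):
    """1 when at least two of the three inputs are 1."""
    return int(a + b + c >= 2)
```

For instance, `symmetric_in(majority, 3, 0, 1)` holds, while `symmetric_in(mux, 3, 0, 1)` does not, since the data inputs of a multiplexer cannot be exchanged without also inverting the select.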
[0039] What these approaches have in common is that they detect
symmetries along a fixed input boundary. This is perfectly
acceptable for applications where such a boundary is given, such as
detecting single-pin symmetries in a single gate, or detecting
symmetries among the primary inputs of a logic network. But such a
limitation makes it difficult to recognize the very desirable,
large multi-gate symmetries, such as those shown in FIGS. 2 and
3.
[0040] Such symmetries will be detected if an input boundary is
chosen that passes through the pins in question, but there are an
exponentially increasing number of such boundaries as the point of
analysis is moved backwards through a circuit.
[0041] In contrast, the present invention focuses on the
recognition of large multi-gate input equivalences in a
combinational logic circuit. Rather than examining a fixed input
boundary, the present invention proceeds by looking for extensible
symmetric logic functions (such as AND, OR, and XOR functions)
within and between gates in a logic circuit. Such functions are
grown backwards as long as they can continue to be extended.
[0042] A key idea in understanding the approach of the present
invention is that of a disjoint-support decomposition of a Boolean
function. This concept has recently been the subject of key
theoretical work by Damiani and Bertacco in the following two
articles: Damiani and Bertacco, "The disjunctive decomposition of
logic functions",
International Workshop on Logic Synthesis, Tahoe City, Calif., May
1997, Session 8 and Bertacco and Damiani, "Boolean function
representation based on disjoint-support decompositions,"
Proceedings of the IEEE International Conference on Computer
Design, October 1996.
[0043] The articles by Damiani and Bertacco teach that every
completely-specified Boolean function F has a ds-prime
decomposition (defined below) that is unique up to negation and
permutation of inputs to the various functions in the
decomposition. An effective method to compute a ds-prime
decomposition from a binary decision diagram (BDD) representing the
function is disclosed.
[0044] For example, let F be a completely-specified Boolean
function of several variables, $F(a_1, a_2, a_3, \ldots, a_n)$. The
support of F is the set of input variables on which F depends. Thus
$a_i$ is in the support of F if it is an input variable of F such
that there is some setting of the remaining variables for which
$F(a_1, \ldots, a_{i-1}, 0, a_{i+1}, \ldots, a_n) \neq F(a_1, \ldots, a_{i-1}, 1, a_{i+1}, \ldots, a_n)$.
[0045] The support of F may be written $S_F$. If F is decomposed
into sub-functions,
$F = G(A(a_1, a_2, \ldots, a_A), B(a_{A+1}, a_{A+2}, \ldots, a_B), \ldots)$,
then its support is generally taken to be the union of the supports
of all the sub-functions, assuming non-redundancy. Two functions are
disjoint-support if the intersection of their supports is the null
set. A decomposition $F = G(A, B, \ldots)$ is a disjoint-support
decomposition if all of the sub-functions A, B, ... are pair-wise
disjoint-support. These sub-functions may likewise be decomposed
into disjoint-support functions, etc.
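The definitions of support and disjoint support can be made concrete with a brute-force sketch. The Python code below assumes functions are given as callables over 0/1 arguments and identifies variables by zero-based index. The helper names and the example functions are hypothetical and chosen only to illustrate the definitions.

```python
from itertools import product

def support(f, n):
    """Indices of the inputs on which the n-input Boolean function f depends."""
    deps = set()
    for i in range(n):
        for rest in product((0, 1), repeat=n - 1):
            lo = rest[:i] + (0,) + rest[i:]   # input i forced to 0
            hi = rest[:i] + (1,) + rest[i:]   # input i forced to 1
            if f(*lo) != f(*hi):
                deps.add(i)
                break
    return deps

def disjoint_support(support_a, support_b):
    """True if the two supports share no variable."""
    return not (support_a & support_b)

# Example decomposition F = OR(A, B) with A = a1 AND a2 and B = a3 XOR a4.
def sub_a(a1, a2, a3, a4):
    return a1 & a2

def sub_b(a1, a2, a3, a4):
    return a3 ^ a4

def func_f(a1, a2, a3, a4):
    return sub_a(a1, a2, a3, a4) | sub_b(a1, a2, a3, a4)
```

Here the supports of `sub_a` and `sub_b` are {0, 1} and {2, 3}, so the decomposition is disjoint-support and the support of `func_f` is their union.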
[0046] A Boolean function F may be said to be ds-prime if no
disjoint-support decomposition into simpler functions exists.
Examples of ds-prime functions are MUX(a,b,s) and MAJORITY(a,b,c).
A ds-prime decomposition of a Boolean function F is a recursive
disjoint-support decomposition such that each function of two or
more variables in the decomposition is either a maximally-wide AND,
OR, or XOR, or a ds-prime function.
[0047] Referring now to FIG. 4, in accordance with the present
invention, input equivalence recognition is accomplished and used
in five primary steps. The circuit is first decomposed into
fanout-free regions in step 96, quasi-canonical models for the
logic gates are then generated in step 98, these quasi-canonical
models are composited to create a "swap structure" in step 100, the
swap structures are analyzed to identify pin swap groups in step
102 and the pins are then swapped in step 104.
[0048] Referring now to FIG. 5, circuit 96 is used as a simple
example to help define fanout-free regions, which are used to help
simplify the generation of input equivalences. Circuit 96 includes
three inputs, two separate and one common input applied to a pair
of input gates 106 and 108, the outputs of which are applied to
the inputs of a third gate 110, output 112 of which may be fanned
to a plurality of other logic gates and destinations.
[0049] A fanout-free region has the following properties:
[0050] 1. It is headed by a single gate in the circuit, which may
fan out to multiple places.
[0051] 2. All other gates are members of the region if they have
only one fanout, and this fanout is also a member of the
region.
[0052] 3. Input nets to the region may fan out to multiple input
pins in the region.
[0053] A logic circuit describable by logic circuit diagram 96
qualifies as a fanout-free region in that it is headed by a single
output gate, gate 110, the output 112 of which may fanout to
multiple places. The other gates in the region, gates 106 and 108,
have only one fanout or output which is applied to gate 110 within
the region. The input nets to the region may fan out to multiple
input pins in the region, that is, the four inputs to gates 106 and
108 are provided by three separate inputs.
[0054] These rules permit equivalences for gates within the region
to be generated without being concerned about the surrounding
circuitry. Because there is only one point of observability (at
output 112 of head gate 110 of circuit region 96), any
transformation of the netlist, that is, swapping of the pins, within
region 96 that leaves the logic function generated by the head gate
110 unchanged cannot affect the rest of the circuit. Each
input pin to the region 96 is treated as a distinct input variable,
even though multiple pins may be driven by a single input net to
the region.
[0055] It is therefore not necessary to consider logical
relationships between the input pins of the region. As a result of
this treatment, plus the restriction of internal gates to a single
fanout, gates 106, 108 and 110 of region 96 comprise a
decomposition of the function computed at the head gate 110.
[0056] A fanout-free region (FFR) is a set of one or more
combinational gates in a logic circuit, one of which may be
distinguished as the head gate. Each combinational gate in the
circuit is a member of exactly one fanout-free region. A
combinational gate in a logic circuit is the head of a fanout-free
region if either it has more than one fanout or it has a single
fanout that is a primary output of the circuit or a
non-combinational gate in the circuit.
[0057] Note that the number of fanouts is computed over all the
fanouts of a gate. Thus a combinational gate with more than one
output (such as a combined buffer-inverter cell) would have more
than one fanout, unless all but one output was left unconnected in
the circuit. A combinational gate that has a single fanout that is
also a combinational gate is a member of the same fanout-free
region as the gate that it fans out to.
[0058] A combinational gate with no fanouts is normally a circuit
error. Normally all such gates would be removed from the circuit
before fanout-free regions are generated. However, if it is
desirable to make the algorithm work even in the presence of such
errors, it is possible to treat such gates as belonging to a
special fanout-free region that is not otherwise processed for
logic equivalence. Such a region would not have a head gate, i.e.
head=NULL.
[0059] Under these definitions, the combinational gates of the
circuit are partitioned into the fanout-free regions, and each
region consists purely of combinational gates. Fanout-free regions
can be extended to include combinational sub-functions of
non-combinational gates. One example is a multiplexed flip-flop or
MUXDFF, which combines a combinational MUX with a sequential
flip-flop.
[0060] Two recursive algorithms for identifying fanout-free regions
follow naturally from these definitions. Algorithm no. 1 works
forwards in the circuit, labeling the gates as they are
encountered. The presently preferred algorithm is algorithm no. 2
which works backwards identifying gates that are the head of the
regions, and then collecting the remaining gates in that
region.
[0061] The following pseudo-code outlines are intended as examples
of the type of algorithms that can be prepared for analyzing
circuits to identify fanout free regions. Explanations of the
functions in each code sequence are provided following the
pseudo-code description for each algorithm.
[0062] Algorithm 1.
[0063] For each gate G in the circuit:
[0064] If G is combinational, add_gate_to_ffr(G, find_ffr1(G))
[0065] End
find_ffr1(G):
  If G is labeled, return label(G); EndIf
  If G has one fanout and fanout(G) is combinational,
    /* G is a member, but not the head of the FFR */
    Then region_label = find_ffr1(fanout(G))
  Else If G has no fanouts,
    /* Error case - Omit if circuit has no such gates */
    Then region_label = NULL,
  Else
    /* G is the head of a new region */
    region_label = name(G);
  EndIf
  Label G with region_label;
  Return region_label;
[0066] The following explanation is intended to further enhance the
largely self-explanatory function names in the pseudo-code
above.
[0067] "add_gate_to_ffr(gate, region_label)" adds the designated
gate to the fanout-free region designated by "region_label" in a
data structure identifying all of the gates in each region (either
a list of lists of gates, or a hash table indexed by the
"region_label", returning a list of gates in that region).
[0068] "label(G)" returns the label (if any) attached to gate
G.
[0069] "name(G)" returns the unique name of gate G (which will be
used as a region_label).
[0070] "fanout(G)" returns the unique gate that G fans out to
(assuming that G has only one fanout).
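Algorithm 1 might be rendered in Python roughly as follows. This is a sketch under stated assumptions, not the inventor's implementation. The circuit is assumed to be given as a `fanouts` dictionary (gate name to list of fanout names) plus a set of combinational gate names, the label table is an explicit `labels` dictionary, and the driver name `ffrs_forward` and the example gate names are hypothetical.

```python
from collections import defaultdict

def find_ffr1(g, fanouts, combinational, labels):
    """Return the region label of gate g, labeling gates as they are visited."""
    if g in labels:
        return labels[g]
    outs = fanouts.get(g, [])
    if len(outs) == 1 and outs[0] in combinational:
        # g is a member, but not the head, of its region
        region = find_ffr1(outs[0], fanouts, combinational, labels)
    elif not outs:
        region = None          # error case: gate with no fanouts
    else:
        region = g             # g is the head of a new region
    labels[g] = region
    return region

def ffrs_forward(fanouts, combinational):
    """Group every combinational gate under the label of its region."""
    labels = {}
    regions = defaultdict(list)
    for g in sorted(combinational):
        regions[find_ffr1(g, fanouts, combinational, labels)].append(g)
    return dict(regions)

# Hypothetical circuit loosely modeled on FIG. 5, plus one stand-alone gate.
fanouts = {
    "g106": ["g110"],
    "g108": ["g110"],
    "g110": ["ff1", "ff2"],   # head gate fans out to two non-combinational loads
    "g200": ["out"],          # single fanout to a primary output
}
combinational = {"g106", "g108", "g110", "g200"}
regions = ffrs_forward(fanouts, combinational)
```

Because the recursion follows one fanout per step, a very deep fanout-free chain could exceed Python's default recursion limit. An iterative walk would avoid this at the cost of departing from the pseudo-code.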
[0071] Algorithm 2:
[0072] For each gate G in the circuit:
[0073] If G is combinational
[0074] Then If G has >1 fanouts or fanout(G) is not
combinational
      /* G is head of new region */
      ffr = new_ffr(name(G));
      add_gate_to_ffr_with_fanins(G, ffr);
      add_ffr_to_ffrset(ffr);
    Else If G has no fanouts /* Error Case */
      ffr = new_ffr(NULL);
      add_gate_to_ffr_with_fanins(G, ffr);
      add_ffr_to_ffrset(ffr);
    EndIf /* Skip gates that are not heads of regions */
  EndIf
End

add_gate_to_ffr_with_fanins(G, ffr)
  add_gate_to_ffr2(G, ffr);
  For each fanin G1 of G;
    If G1 is combinational and has 1 fanout /* G */
      /* Gate is part of region (Def 3) */
      add_gate_to_ffr_with_fanins(G1, ffr);
    EndIf
  End
[0075] The following explanation is intended to further enhance the
largely self-explanatory function names in the pseudo-code
above.
[0076] "new_ffr" creates a new, empty FFR with the given name.
[0077] "add_gate_to_ffr2" adds the given gate to the specified
FFR.
[0078] "add_ffr_to_ffrset" adds the completed FFR to the set of all
FFRs (a hash table or list of lists). This corresponds to the data
structure built by add_gate_to_ffr in Algorithm 1.
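A Python sketch of Algorithm 2 is given here for illustration. It assumes the circuit is supplied as `fanins` and `fanouts` dictionaries (gate name to list of names) and a set of combinational gate names. All of these names, and the example circuit, are hypothetical. Two deliberate departures from the pseudo-code are assumptions of this sketch: the no-fanout error case is tested first so that `fanout(G)` is never consulted for a gate without fanouts, and all such gates are collected under a single `None` head.

```python
def ffrs_backward(fanins, fanouts, combinational):
    """Find region heads, then collect members by walking fanins backwards."""
    ffrset = {}

    def add_with_fanins(g, members):
        members.append(g)
        for g1 in fanins.get(g, []):
            # g1 joins the region if it is combinational and its only fanout is g
            if g1 in combinational and len(fanouts.get(g1, [])) == 1:
                add_with_fanins(g1, members)

    for g in sorted(combinational):
        outs = fanouts.get(g, [])
        if not outs:
            # error case: gate with no fanouts, kept under the NULL head
            add_with_fanins(g, ffrset.setdefault(None, []))
        elif len(outs) > 1 or outs[0] not in combinational:
            members = []                  # g is the head of a new region
            add_with_fanins(g, members)
            ffrset[g] = members
        # gates that are not heads are skipped; their head collects them
    return ffrset

# Hypothetical circuit loosely modeled on FIG. 5, plus one stand-alone gate.
fanouts = {
    "g106": ["g110"],
    "g108": ["g110"],
    "g110": ["ff1", "ff2"],
    "g200": ["out"],
}
fanins = {
    "g106": ["in_a", "in_b"],
    "g108": ["in_b", "in_c"],     # in_b is the common input net
    "g110": ["g106", "g108"],
    "g200": ["in_c"],
}
combinational = {"g106", "g108", "g110", "g200"}
ffrset = ffrs_backward(fanins, fanouts, combinational)
```

Each gate is visited once as a candidate head and at most once as a member, so the work is linear in the number of gate connections.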
[0079] The next task is to discover maximally large
symmetric functions (AND, OR, XOR) within this decomposition
because the inputs to such a symmetric function can then be rotated
freely without affecting the rest of the region. One exception to
this is that such rotations can affect the genealogical constraints
in some circumstances.
[0080] This approach does miss possible swaps that span fanout-free
regions or that depend on the logical relations between inputs to a
region. One example of this is an inverter/MUX combination that
implements an XOR function. Such swaps could be detected by
analyzing a ds-prime decomposition of the region's function,
generated by the algorithm described by Damiani and Bertacco as
discussed above.
[0081] Once a fanout-free region netlist, or other type of circuit
description, is generated, it is necessary to transform the netlist
into a quasi-canonical form, defined in more detail herein
below.
[0082] Many conventional synthesis tools, such as Autologic II from
Mentor Graphics, and Leonardo from Exemplar Logic, use netlists of
primitive gates to model the functionality of an ASIC library gate.
These primitive gates include functions such as AND, OR, XOR, NAND,
NOR, NOT, BUF, MUX, etc. The set of primitives is rich enough that
most ASIC library cells are typically represented with a netlist
that constitutes a recursive disjoint support decomposition of the
cell function, although this is not guaranteed. If such a
representation is not available (e.g., for FPGA lookup tables), it
could be generated using a ds-prime decomposition algorithm.
[0083] A fanout-free netlist of primitives (a netlist that is not
fanout-free can be made so by replicating sections that fan out)
can be transformed into a quasi-canonical form (QCF) by moving
inverters through AND/OR/XOR functional units, merging adjacent
AND/OR/XOR primitives of the same type, and sorting fan-ins. As the
name suggests, such a form is not guaranteed to be canonical, but
most disjoint-support representations of common ASIC gates will
lead to the same result. Indeed, the unique ds-prime decomposition
theorem suggests that starting from a disjoint support network, the
results will differ only in the ordering and negation of inputs to
nodes other than AND/OR/XOR, and in the possible decomposition of
such nodes if they are not ds-prime functions.
[0084] The QCFs for both the true and complemented versions of the
logic function are computed the first time any library cell is
encountered in an input-equivalence computation. These are then
stored on the cell for easy reference the next time that cell is
encountered.
[0085] The conversion of a netlist for a fanout-free region into
quasi-canonical form requires controlled inverter placement, the
merging of adjacent nodes, and the sorting of inputs.
[0086] To standardize the placement of inverters within a
quasi-canonical form, the following rules may be used:
[0087] 1. All instances of NAND, NOR, or XNOR primitives are
replaced by AND, OR, or XOR, respectively, followed by an
inverter.
[0088] 2. An inverter at the input of an XOR function is moved
forwards through the function to its output.
[0089] 3. An inverter at the output of an AND or OR function is
moved backwards through the function to all its inputs (and the
function is then converted to the opposite type), using DeMorgan's
Law. In the event that both this and the XOR movement rule apply,
the XOR rule takes priority.
[0090] 4. Two consecutive inverters cancel.
[0091] These rules for inverter placement are designed to ensure
that the movement of inverters is not ambiguous, and to maximize
the chances of being able to merge identical functions after
inverter movement.
[0092] Merging of adjacent nodes may be controlled as follows. If
two functions are the same function from {AND, OR, XOR} and one of
them is an input to the other, the two functions are merged into a
larger function.
[0093] The inputs to the functions AND, OR and XOR are sorted in
the following manner: sub-functions (including NOT) come before
pins. Pins are sorted alphabetically within themselves.
Sub-functions are represented as a list, with the first element
being the function name, and subsequent elements being the inputs
to the function (already sorted). Sub-functions are sorted
lexicographically, sorting first on the name of the function, and
then comparing element by element.
[0094] These steps should ensure that the resulting quasi-canonical
form is consistent for a wide variety of initial models. For
example, a 3-input NOR gate with inputs a, b, and c will generate a
QCF of: (AND (NOT a) (NOT b) (NOT c)), regardless of whether the
initial model was a single NOR primitive, an OR followed by an
inverter, two 2-input ORs followed by an inverter, an AND preceded
by input inverters, or many other forms.
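By way of illustration only, the inverter placement, merging, and sorting rules above can be sketched in Python. The tuple representation (a pin is a string, a function is a tuple of an operator name followed by its inputs) and the names qcf, peel, build and sort_key are assumptions made for this sketch and are not taken from the described embodiment. Primitives other than AND, OR and XOR (for example MUX) are simply left with their inputs in the given order.

```python
# Hypothetical sketch: a pin is a str, a function is a tuple (op, *inputs).
BASE = {"NAND": "AND", "NOR": "OR", "XNOR": "XOR"}   # rule 1
DUAL = {"AND": "OR", "OR": "AND"}                    # rule 3 (DeMorgan)

def sort_key(e):
    """Sub-functions (including NOT) sort before pins; pins sort by name."""
    if isinstance(e, str):
        return (1, e)
    return (0, e[0]) + tuple(sort_key(c) for c in e[1:])

def build(op, kids):
    """Merge children of the same type into the parent, then sort fan-ins."""
    flat = []
    for k in kids:
        if isinstance(k, tuple) and k[0] == op:
            flat.extend(k[1:])
        else:
            flat.append(k)
    return (op, *sorted(flat, key=sort_key))

def peel(e):
    """Strip leading inverters and buffers; return (core, inverted)."""
    inv = False
    while isinstance(e, tuple) and (e[0] in ("NOT", "BUF") or e[0] in BASE):
        if e[0] in BASE:                 # NAND/NOR/XNOR = base op + inverter
            e, inv = (BASE[e[0]], *e[1:]), not inv
        elif e[0] == "NOT":              # rule 4: inverters cancel pairwise
            e, inv = e[1], not inv
        else:                            # BUF is transparent
            e = e[1]
    return e, inv

def qcf(e, invert=False):
    """Return the quasi-canonical form of e (complemented if invert)."""
    e, inv = peel(e)
    invert ^= inv
    if isinstance(e, str):
        return ("NOT", e) if invert else e
    op, args = e[0], e[1:]
    if op in DUAL:                       # push inverter back through AND/OR
        if invert:
            op = DUAL[op]
        return build(op, [qcf(a, invert) for a in args])
    if op == "XOR":                      # pull input inverters to the output
        kids = []
        for a in args:
            core, inv = peel(a)          # XOR rule has priority over DeMorgan
            k = qcf(core)
            if isinstance(k, tuple) and k[0] == "NOT":
                k, inv = k[1], not inv
            invert ^= inv
            kids.append(k)
        node = build("XOR", kids)
    else:                                # other primitives: keep input order
        node = (op, *(qcf(a) for a in args))
    return ("NOT", node) if invert else node
```

Under these assumptions, qcf(("NOR", "a", "b", "c")) and qcf(("NOT", ("OR", ("OR", "b", "a"), "c"))) both yield ("AND", ("NOT", "a"), ("NOT", "b"), ("NOT", "c")), in agreement with the 3-input NOR example above. Calling qcf with invert=True gives the complemented form.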
[0095] QCFs are generated by recursive descent from each output
port of the library gate, for both true and complemented forms of
the output function. Although the above steps are presented
separately to explain them more clearly, the actual coding for a
working embodiment may conveniently overlap all three
operations.
[0096] The composition of QCFs from different logic gates follows
much the same rules as generation of the QCFs in the first place.
The goal is to propagate inversions through AND, OR, and XOR
functions in the composed structure, and to merge similar functions
together into a larger function. It is desirable to explicitly keep
track of gate pins, their polarities, and ancestry relationships,
so that the inherent symmetries of the composed "swap structure" can
be used to generate swap groups among the pins.
[0097] In this swap structure, several properties of the pins are
tracked:
[0098] 1. The polarity of the pin: i.e. whether inverted or
non-inverted,
[0099] 2. Whether the pin is an external gate pin or an internal
point within a single gate function,
[0100] 3. Whether the pin (if external) fans out to more than one
point within the QCF of its gate (only used points count here, so
unconnected output pins do not contribute to this fanout).
[0101] 4. The parent pin (if any) within the same level of the swap
structure.
[0102] 5. The input function, if not from the same level of the
swap structure.
[0103] 6. The gate and pin name cross references, for external
pins.
[0104] The swap structure is built up by a quasi-breadth-first
search from the root gate of the FFR. The non-inverted QCF of the
root gate is used; subsequent gates supply either the non-inverted
or inverted QCFs depending on the polarity of the external pin they
connect to within the swap structure. As each gate is processed,
the appropriate QCF is merged into the existing swap structure.
External pins without multiple fanouts that connect to AND/OR/XOR
functions in the swap structure are candidates for extension of
that function.
[0105] If the appropriate QCF of the driving gate has the same
function at its head, it will be merged with the function at that
level of the swap structure. If the QCF has a buffer or inverter at
the head, the inverter propagation rules will be applied before
merging. If the QCF is only a buffer/inverter, the input pin will
definitely be merged. Otherwise, the QCF will be incorporated as
one or more separate swap structure functions.
[0106] Referring now to FIG. 6, circuit 114 is shown in a schematic
form. The gate components and interconnection of the gates of
circuit 114 will be described first, the pin properties for an
initial swap structure will then be shown in Table 1, and the
resulting swap structure will then be described with reference
to FIG. 7.
[0107] Three input NAND gate G1 is the head gate of the fanout free
region shown in circuit 114 and provides circuit output 116. Input
A of gate G1 at pin P1 is provided by the output of two input NAND
gate G2. Input A of gate G2 is provided at pin P15. Input B of gate
G2 is provided at pin P16 by the output of three input AND gate G8.
Inputs A, B, and C of gate G8 are provided by pins P17, P18 and
P19, respectively.
[0108] Input B of gate G1 is provided at pin P2 by the output of
AND-OR-INVERT gate G3. This gate computes the function (NOT (OR
(AND A B) (AND C D))), where inputs A, B, C, and D are provided by
pins P20, P21, P22 and P23, respectively. For convenience, internal
points of the gate computing the sub-functions (AND A B) and (AND C
D) are labeled P4 and P5, respectively. These internal points need
not correspond to any actual points in the physical implementation
of gate G3.
[0109] Input C of gate G1 is provided at pin P3 by the output of
three input NOR gate G4. Input A of gate G4 is provided by the
output of two input NOR gate G5. Input A of gate G5 is provided at
pin P9 while input B of gate G5 is provided at pin P10. Input B of
gate G4 is provided at pin P7 by the output of inverting buffer G6,
the input of which is provided at P11 by the output of two input
multiplexer G9. Input A of gate G9 is provided at pin P24 while input
B of gate G9 is provided at input pin P25. Multiplexing input S is
provided at pin P26.
[0110] Input C of gate G4 is provided by the output of three input
NAND gate G7, inputs A, B and C of which are provided at pins P12,
P13 and P14 respectively.
[0111] The algorithm starts with gate G1. Since this is the initial
gate, the positive QCF: (OR (NOT A) (NOT B) (NOT C)) is used to
build the initial function of the swap structure, an OR with pins
1, 2, and 3. The initial values of the pin properties are shown in
the following Table.
TABLE 1. Pin Properties for Initial Swap Structure

Pin ID  Gate  Pin  Inv.?  Ext.?  Fanout  Parent  Input
Function 1: OR, non-inverted
1       G1    A    Y      Y              -1
2       G1    B    Y      Y              -1
3       G1    C    Y      Y              -1
[0112] Next the external pins of the function are expanded in
order, until there are no more pins to expand. Expansion consists
of looking at the driving cell for that pin (if part of the same
FFR) and incorporating the QCF of that cell into the swap
structure. In an actual implementation, pin IDs are typically only
unique within a single function of the swap structure. For ease of
explanation, every pin shown in these figures is given a unique
ID.
[0113] The first pin to be expanded is pin P1. Since this is an
inverted pin, the negative QCF of G2: (AND A B) is used. Since AND
is different from OR, no merge takes place, and the AND is
introduced as a separate function. The pin values now are:
TABLE 2. Pin Properties after expanding Pin 1

Pin ID  Gate  Pin  Inv.?  Ext.?  Fanout  Parent  Input
Function 1: OR, non-inverted
1       G1    A    Y      Y              -1      2
2       G1    B    Y      Y              -1
3       G1    C    Y      Y              -1
Function 2: AND, non-inverted
15      G2    A    N      Y              -1
16      G2    B    N      Y              -1
[0114] Because function 2 is a separate function, it is not
necessary to keep track of the parent pin ID for pins 15 or 16:
this is only used to track the genealogical relationships within a
single function in the swap structure. The next pin to be expanded
is pin 2. This too is an inverted pin, so the inverted QCF of G3:
(OR (AND A B) (AND C D)) is used. The OR function does match the
current function, so this top-level function is merged into the
current function, producing the pin properties shown in Table 3.
TABLE 3. Pin Properties after expanding Pin 2

Pin ID  Gate  Pin  Inv.?  Ext.?  Fanout  Parent  Input
Function 1: OR, non-inverted
1       G1    A    Y      Y              -1      2
2       G1    B    Y      Y              -1
3       G1    C    Y      Y              -1
4       G3         N      N              2       3
5       G3         N      N              2       4
Function 2: AND, non-inverted
15      G2    A    N      Y              -1
16      G2    B    N      Y              -1
Function 3: AND, non-inverted
20      G3    B    N      Y              -1
21      G3    A    N      Y              -1
Function 4: AND, non-inverted
22      G3    C    N      Y              -1
23      G3    D    N      Y              -1
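By way of illustration only, the expansion step walked through above (introduce a separate function when the head of the driving gate's QCF differs, merge when it matches) can be sketched in Python. The Pin record, the SwapStructure class and its method names are hypothetical, as is the tuple form of the QCFs. The sketch does not model multiple-fanout checks or QCFs headed by a buffer or inverter, and it numbers pins sequentially, so its pin IDs differ from those used in the tables.

```python
import itertools
from dataclasses import dataclass
from typing import Optional

@dataclass
class Pin:
    pin_id: int
    gate: str
    name: Optional[str]            # None for an internal point of a gate
    inverted: bool
    external: bool
    parent: int = -1               # -1: no parent within this function
    input_fn: Optional[int] = None # id of a separate input function, if any

class SwapStructure:
    def __init__(self):
        self.ops = {}              # function id -> operator name
        self.pins = {}             # function id -> list of Pin
        self._pin_ids = itertools.count(1)
        self._fn_ids = itertools.count(1)

    def new_function(self, gate, qcf):
        """Add the QCF as a separate function and return its id."""
        fn_id = next(self._fn_ids)
        self.ops[fn_id], self.pins[fn_id] = qcf[0], []
        self._add_inputs(fn_id, gate, qcf[1:], parent=-1)
        return fn_id

    def _add_inputs(self, fn_id, gate, args, parent):
        for arg in args:
            inverted = (isinstance(arg, tuple) and arg[0] == "NOT"
                        and isinstance(arg[1], str))
            if inverted:
                arg = arg[1]
            if isinstance(arg, str):   # external gate pin
                pin = Pin(next(self._pin_ids), gate, arg, inverted, True, parent)
                self.pins[fn_id].append(pin)
            else:                      # sub-function: internal point
                pin = Pin(next(self._pin_ids), gate, None, False, False, parent)
                self.pins[fn_id].append(pin)
                pin.input_fn = self.new_function(gate, arg)

    def expand(self, fn_id, pin, gate, qcf):
        """Incorporate the QCF of the gate driving `pin`."""
        if qcf[0] == self.ops[fn_id]:  # same head: merge into this function
            self._add_inputs(fn_id, gate, qcf[1:], parent=pin.pin_id)
        else:                          # different head: separate function
            pin.input_fn = self.new_function(gate, qcf)

# Replaying the first two expansions of the example circuit.
s = SwapStructure()
f1 = s.new_function("G1", ("OR", ("NOT", "A"), ("NOT", "B"), ("NOT", "C")))
p1, p2, p3 = s.pins[f1]
s.expand(f1, p1, "G2", ("AND", "A", "B"))                        # no merge
s.expand(f1, p2, "G3", ("OR", ("AND", "A", "B"), ("AND", "C", "D")))  # merge
```

After these calls the first function holds five pins, the last two being internal points whose parent is the second pin and whose input functions are two separate AND functions, which is the same shape as the structure obtained after expanding pin 2.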
[0115] Referring now to FIG. 7, after all the external pins have
been expanded for all functions, the resulting pin properties are
as shown in Table 4.
TABLE 4. Pin Properties after expanding all Pins

Pin ID  Gate  Pin  Inv.?  Ext.?  Fanout  Parent  Input
Function 1: OR, non-inverted
1       G1    A    Y      Y              -1      2
2       G1    B    Y      Y              -1
3       G1    C    Y      Y              -1
4       G3         N      N              2       3
5       G3         N      N              2       4
6       G4    A    N      Y              3
7       G4    B    N      Y              3
8       G4    C    N      Y              3
9       G5    A    N      Y              3
10      G5    B    N      Y              6
11      G6    A    Y      Y              7       5
12      G7    A    Y      Y              8
13      G7    B    Y      Y              8
14      G7    C    Y      Y              8
Function 2: AND, non-inverted
15      G2    A    N      Y              -1
16      G2    B    N      Y              -1
17      G8    A    N      Y              16
18      G8    B    N      Y              16
19      G8    C    N      Y              16
Function 3: AND, non-inverted
20      G3    B    N      Y              -1
21      G3    A    N      Y              -1
Function 4: AND, non-inverted
22      G3    C    N      Y              -1
23      G3    D    N      Y              -1
Function 5: MUX, non-inverted
24      G9    A    N      Y              -1
25      G9    B    N      Y              -1
26      G9    S    N      Y              -1
[0116] As shown in FIG. 7, the pin properties listed in Table 4
represent a swap structure or simplified circuit, such as swap
structure or circuit 120, which can then be analyzed further to
recognize pin swapping possibilities.
[0117] In particular, swap structure 120 provides output signal 116
as the output of 14 pin, non-inverting OR gate 118 which provides
function F1 listed in Table 4. The input to pin P1 of gate 118 is
provided by the output of five input, non-inverting AND gate 122,
the inputs of which are provided by pins P15, P16, P17, P18 and
P19. Gate 122 provides function F2 shown in Table 4. The input to
pin P4 of gate 118 is provided by the output of two input
non-inverting AND gate 124 which provides function F3 in Table 4.
The inputs to gate 124 are provided by pins P20 and P21. The input
to pin P5 of gate 118 is provided by the output of two input AND
gate 126 which provides the function F4 in Table 4. The inputs to
gate 126 are provided by pins P22 and P23.
[0118] In summary, the present technique recognizes multi-gate
input equivalences by building up a swap structure from the
original netlist:
[0119] 1. which computes the same function as the original
netlist,
[0120] 2. which contains points that are identified with the
external input pins of gates in that netlist, but
[0121] 3. which express the AND, OR, and XOR symmetries that were
present in the original netlist in the form of large functional
blocks.
[0122] The next step is to extract the swap groups from this swap
structure. It may be noted that there is a close relationship
between the swap structure as constructed above and a ds-prime
decomposition of the function. In fact, the example swap structure
is a ds-prime decomposition with some additional information added
(such as the gate pins that are only used internally to the AND or
OR functions). This will not always be the case, due to the limited
set of primitives and the occasional reconvergent fanout in the
internal logic model of a gate, but the two will generally be quite
close. It is expected that an extended ds-prime decomposition could
be used as the basis for an input equivalence recognition
algorithm.
[0123] The swap structure of Table 4 may be analyzed to identify
both multi-gate single-pin swaps (MGSP) and multi-pin swaps (MP).
The single-gate single-pin swaps (SGSP) are identified using
conventional BDD-based methods.
[0124] The MGSP swaps can be read almost directly out of the swap
structure based on the following observation. If two external pins
are both inputs to the same (OR, AND or XOR) functional block in
the swap structure, do not have multiple fanouts, have the same
parity, and are not in an ancestor-descendent relationship with
each other, then the inputs to those two pins may be swapped
without changing the function computed at the output of that
functional block.
[0125] In particular, if independent input cones to the two points
in question within the swap structure are identified by
decomposing those parts of the function in question that fan in to
the two points, these two input cones become identical (same
parity) inputs of a symmetric function.
[0126] Accordingly, if these two input cones are swapped, the
resultant swap structure computes the same function at the output
of the functional block, and therefore (since these input cones fan
out nowhere else), computes the same function at the output of the
entire FFR.
[0127] Referring now to FIG. 8, an example of a decomposed swap
structure is shown as circuit 130, justifying the swap of pin 2 of
FIG. 7 with any of pins 12-14. In particular, the outputs of AND
gate 124 and AND gate 126 at pins P4 and P5 are applied to two
input OR gate 132, the output of which is applied to pin P2.
[0128] Therefore the basic approach to generating MGSP equivalences
is to collect all the external pins of the same polarity that are
inputs to the same AND, OR, or XOR function in a swap structure,
and then filter out multiple fanout pins and those with possible
ancestor-descendent relationships. The manner in which the latter
must be done depends on the capabilities of the client for this
information. If the client is capable of recognizing
ancestor-descendent conflicts, then it is possible to output the
entire group of pins with the same polarity, and leave it up to the
client to deal with.
[0129] A more conservative approach is to only use those pins that
are not the parent of any other pin in the function output. This
ensures that there are no ancestor-descendent relationships among
the pins in the group that is output, and subsequent swaps within
this group will not change this fact. It is possible to suppress
the group if the remaining pins belong to only one gate, since such
a group contains only SGSP equivalences.
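By way of illustration only, the conservative MGSP filtering described above can be sketched in Python using the Function 1 rows of Table 4. The tuple layout and the name mgsp_groups are assumptions of this sketch. The Fanout column is blank in the table and is taken here to mean that no pin has multiple fanouts, and the parent values are transcribed as printed.

```python
from collections import defaultdict

# Function 1 (OR) of Table 4, transcribed as printed:
# (pin_id, gate, inverted, external, multi_fanout, parent); -1 = no parent.
# multi_fanout is assumed False everywhere (the Fanout column is blank).
FUNCTION_1 = [
    (1, "G1", True, True, False, -1),
    (2, "G1", True, True, False, -1),
    (3, "G1", True, True, False, -1),
    (4, "G3", False, False, False, 2),
    (5, "G3", False, False, False, 2),
    (6, "G4", False, True, False, 3),
    (7, "G4", False, True, False, 3),
    (8, "G4", False, True, False, 3),
    (9, "G5", False, True, False, 3),
    (10, "G5", False, True, False, 6),
    (11, "G6", True, True, False, 7),
    (12, "G7", True, True, False, 8),
    (13, "G7", True, True, False, 8),
    (14, "G7", True, True, False, 8),
]

def mgsp_groups(pins):
    """Collect same-polarity external pins that are not the parent of any pin."""
    parents = {p[5] for p in pins}
    by_polarity = defaultdict(list)
    for pin_id, gate, inverted, external, multi_fanout, _parent in pins:
        if external and not multi_fanout and pin_id not in parents:
            by_polarity[inverted].append((pin_id, gate))
    groups = []
    for members in by_polarity.values():
        # A group confined to one gate holds only SGSP swaps; suppress it.
        if len({gate for _, gate in members}) > 1:
            groups.append([pin_id for pin_id, _ in members])
    return groups

print(mgsp_groups(FUNCTION_1))   # [[1, 11, 12, 13, 14]]
```

Under these assumptions the inverted pins 1, 11, 12, 13 and 14 form one group, while the non-inverted pins 9 and 10 are suppressed because both belong to gate G5. Pins 2, 3, 6, 7 and 8 are excluded as parents of other pins, which is what makes the approach conservative.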
[0130] The above MGSP swaps do not include internal points such as
pins 4 and 5 as shown in FIG. 7 and Table 4, which are not the
external pins of some gate. Because all swaps must ultimately be
expressed as swaps of external pins, the input function is
evaluated to determine if it can be matched with the input function
of some other pin (internal or external).
[0131] Accordingly, MP swaps are generated by looking for internal
pins input to an AND, OR or XOR function in the swap structure,
expanding the input function to this pin until it consists entirely
of external gate pins, and then looking for a match among the input
functions to other pins in the same swap function.
[0132] In general, this process may need to be repeated
recursively, as an input pin of a potential match might itself be
an internal pin of some gate, in which case the input function to
this pin must be incorporated to look for a match with the original
function, etc. However, most of the potential MP matches are to the
internal point of an OR gate. This process can be simplified by
only looking for 1-level matches: if a potential match includes an
internal point, it is simply skipped. To facilitate this, a
"signature" for the top-level gate is recomputed in each swap
function, which includes the function, the number of inputs to the
top-level gate, and the inversion pattern of the inputs to this
gate. Matches are identified by matching this signature.
[0133] In the example swap structure 130 shown in FIG. 8, the MP
swap process will start with either pin P4 or P5, identify this pin
as an internal pin, and look at the signature of the top-level gate
of the input function. Next other inputs to the OR structure that
are fed by input functions with identical signatures are identified
to find pins P1, P4 and P5. This yields an MP swap group with
elements: (P15, P16), (P20, P21), and (P22, P23).
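By way of illustration only, the 1-level signature matching described above can be sketched in Python on the data of Table 4. The names signature and mp_groups and the dictionary layout are hypothetical. The top-level gate of each input function is assumed here to consist of the pins whose parent is -1, and the select input of the multiplexer is taken as pin 26 in accordance with the description of FIG. 6.

```python
from collections import defaultdict

# Top-level pins (parent == -1) of each input function: (pin_id, inverted).
TOP_LEVEL = {
    2: ("AND", [(15, False), (16, False)]),
    3: ("AND", [(20, False), (21, False)]),
    4: ("AND", [(22, False), (23, False)]),
    5: ("MUX", [(24, False), (25, False), (26, False)]),
}
# Pins of Function 1 (OR) that are fed by a separate input function.
INPUT_FN = {1: 2, 4: 3, 5: 4, 11: 5}
# Pins of Function 1 that are internal points of a gate.
INTERNAL = {4, 5}

def signature(top_level, fn_id):
    """Function name, input count, and inversion pattern of the top gate."""
    op, pins = top_level[fn_id]
    return (op, len(pins), tuple(inv for _, inv in pins))

def mp_groups(top_level, input_fn, internal):
    """Group pins whose input functions share a signature."""
    by_sig = defaultdict(list)
    for pin_id, fn_id in input_fn.items():
        by_sig[signature(top_level, fn_id)].append(pin_id)
    groups = []
    for pins in by_sig.values():
        # An MP swap needs a match and starts from at least one internal pin.
        if len(pins) > 1 and internal & set(pins):
            groups.append([tuple(p for p, _ in top_level[input_fn[pin]][1])
                           for pin in pins])
    return groups

print(mp_groups(TOP_LEVEL, INPUT_FN, INTERNAL))
# [[(15, 16), (20, 21), (22, 23)]]
```

Under these assumptions the input functions of pins P1, P4 and P5 share the signature (AND, two inputs, no inversions), giving the group (P15, P16), (P20, P21), (P22, P23) stated above. The multiplexer feeding pin 11 has no match and is dropped.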
[0134] Field-programmable Gate Arrays (FPGAs) could benefit from an
approach like this even more than Application Specific Integrated
Circuits (ASICs), since routing resources are often very restricted
in FPGAs. An approach tailored to FPGAs would take advantage of the
ability to modify the logic function of any particular gate more or
less freely, especially for lookup-table (LUT) based devices. The
following modifications are required:
[0135] 1. LUTs may not have pre-defined logic models in netlist
form. For example, the Damiani/Bertacco algorithm could be used to
generate a ds-prime decomposition for an arbitrary LUT used as the
starting point for QCF generation.
[0136] 2. In interpreting the swap structure, the technique is not
constrained to only swap between pins with the same polarity, since
the logic function of the gate can be freely changed to introduce
or delete inversions (at least for LUT-based FPGAs). Accordingly,
larger swap groups will be generated that include both polarities,
and require the client application performing the swaps to modify
the polarity of the input pins appropriately.
[0137] Having now described the invention in accordance with the
requirements of the patent statutes, those skilled in this art will
understand how to make changes and modifications in the present
invention to meet their specific requirements or conditions. Such
changes and modifications may be made without departing from the
scope and spirit of the invention as set forth in the following
claims. In particular, the characteristics of a fan-out free
region, the quasi-canonical form and the swap structure may be
varied from implementation to implementation without departing from
the spirit or scope of the present invention as long as they provide
for determining and exploiting input equivalence.
* * * * *