U.S. patent application number 14/605354 was filed with the patent office on 2015-01-26 and published on 2015-07-30 for testing of integrated circuits during at-speed mode of operation.
This patent application is currently assigned to TEXAS INSTRUMENTS INCORPORATED. The applicant listed for this patent is Texas Instruments Incorporated. Invention is credited to Khushboo Agarwal, Sanjay Krishna H V, Wilson Pradeep, Srivaths Ravi, Raashid Moin Shaikh, Rajesh Kumar Tiwari.
Application Number | 20150212152 14/605354 |
Family ID | 53678822 |
Filed Date | 2015-01-26 |
United States Patent
Application |
20150212152 |
Kind Code |
A1 |
Agarwal; Khushboo ; et
al. |
July 30, 2015 |
TESTING OF INTEGRATED CIRCUITS DURING AT-SPEED MODE OF
OPERATION
Abstract
Methods for testing an application specific integrated circuit
(ASIC). A set of representations is created that overlays power
density information and clock gate physical locations of a set of
clock gates in a critical sub-chip of the ASIC for test mode power
analysis. The set of representations are further grouped in the
sub-chip into various groups based on overlapping of the set of
representations. Then, a set of test control signals is generated
corresponding to each of the set of clock gates during at-speed
test mode of operation such that each clock gate with overlapping
representations receive different test control signals. Further,
patterns are generated using a virtual constraint function to
selectively enable the set of test control signals such that the
set of test control signals are not activated simultaneously.
Inventors: |
Agarwal; Khushboo;
(Bangalore, IN) ; H V; Sanjay Krishna; (Bangalore,
IN) ; Shaikh; Raashid Moin; (Bangalore, Karnataka,
IN) ; Ravi; Srivaths; (Bangalore, IN) ;
Pradeep; Wilson; (Bangalore, IN) ; Tiwari; Rajesh
Kumar; (Bangalore, IN) |
Applicant: |
Name | City | State | Country | Type |
Texas Instruments Incorporated | Dallas | TX | US | |
Assignee: |
TEXAS INSTRUMENTS
INCORPORATED
Dallas
TX
|
Family ID: |
53678822 |
Appl. No.: |
14/605354 |
Filed: |
January 26, 2015 |
Current U.S.
Class: |
714/731 |
Current CPC
Class: |
G01R 31/31707 20130101;
G01R 31/31727 20130101; G01R 31/31721 20130101 |
International
Class: |
G01R 31/3177 20060101
G01R031/3177 |
Foreign Application Data
Date | Code | Application Number |
Jan 24, 2014 | IN | 312/CHE/2014 |
Claims
1. A computer implemented method for testing an application
specific integrated circuit (ASIC), comprising: creating a set of
representations that overlays power density information and clock
gate physical locations of a set of clock gates in a critical
sub-chip of the ASIC for test mode power analysis; grouping the set
of representations in the sub-chip into various groups based on
overlapping of the set of representations; and generating a set of
test control signals corresponding to each of the set of clock
gates during at-speed test mode of operation such that each clock
gate with overlapping representations receive different test
control signals from the set of test control signals.
2. The method of claim 1, wherein prior to creating the set of
representations: identifying a critical sub-chip in the ASIC based
on power dissipation and IR sensitivity.
3. The method of claim 2, wherein identifying a critical sub-chip
in the ASIC comprises performing a functional mode analysis, and
performing a test mode IR drop analysis including a static IR drop
analysis and a dynamic IR drop analysis.
4. The method of claim 3, wherein performing a static IR drop
analysis comprises identifying a set of hot spots in the sub-chip
having a maximum average power hotspot and augmenting a power grid
of the set of hot spots having the maximum average power density;
and wherein performing a dynamic IR drop analysis comprises using
waveforms corresponding to a worst-case at-speed test mode pattern
exercising targeted portions of the ASIC.
5. The method of claim 1, wherein creating a set of representations
corresponding to each clock gate comprises: identifying eligible
clock gates of the set of clock gates; extracting location and load
information for each eligible clock gate of the set of clock gates;
and computing the representation for each eligible clock gate.
6. The method of claim 1, wherein grouping the set of
representations in each sub-chip into various groups based on
overlapping of representations comprises: extracting a set of
overlap graphs from the representations of the eligible clock
gates; and running a coloring scheme on each of the overlap graphs
to determine a mapping of a preliminary test control signal of the
set of test control signals to the set of clock gates.
7. The method of claim 6, wherein running a coloring scheme on each
of the overlap graphs comprises: creating a clock gate connectivity
matrix; and reassigning the color scheme incrementally across the
set of overlap graphs.
8. The method of claim 1 further comprising: generating patterns
using a virtual constraint function to selectively enable the set
of test control signals such that the set of test control signals
are not activated simultaneously.
9. The method of claim 8, wherein generating patterns comprises:
receiving a set of values of the set of test control signals;
checking for a contention corresponding to the set of values where
more than one of the set of values is a logic 1; and removing a
pattern associated with contention dynamically during pattern
generation.
10. The method of claim 1, wherein generating a set of test control
signals such that each clock gate with overlapping representations
have different test control signals avoids simultaneous switching
of clock gates and resulting IR drop hotspots thereby reducing
power consumption during at-speed test mode of operation.
11. A system for testing an integrated circuit, the system
comprising: a computer system having a test processor, the test
processor being coupled to the integrated circuit, the integrated
circuit having a set of sub-chips, a plurality of cores and a
cache, an I/O port, the test controller being configured to
activate a set of clock gates in the set of sub-chips in a
neighborhood of the integrated circuit in selective manner during
at-speed mode of operation.
12. The system of claim 11 wherein the test controller, in response
to a stimuli from the test processor, is configured to generate a
set of test control signals that activates the set of clock gates
by: creating a set of representations that overlays power density
information and clock gate physical locations of a set of clock
gates in a critical sub-chip of the integrated circuit for test
mode power analysis; grouping the set of representations in the
sub-chip into various groups based on overlapping of the set of
representations; generating a set of test control signals
corresponding to each of the set of clock gates during at-speed
test mode of operation such that each clock gate with overlapping
representations receive different test control signals.
13. A computer implemented method for testing an integrated
circuit, comprising: generating multiple test control signals to
activate a set of clock gates in the integrated circuit such that
simultaneous switching of the set of clock gates is avoided during
at-speed test mode of operation of the integrated circuit.
14. The method of claim 13, wherein generating multiple test
control signals comprising: creating a set of representations that
overlays power density information and clock gate physical
locations of a set of clock gates in a critical sub-chip of the
integrated circuit for test mode power analysis; grouping the set
of representations in the sub-chip into various groups based on
overlapping of the set of representations; and generating multiple
test control signals corresponding to each of the set of clock
gates during at-speed test mode of operation such that each clock
gate with overlapping representations receive different test
control signals.
15. A computer implemented method for testing an application
specific integrated circuit (ASIC), comprising: creating a set of
representations that overlays power density information and clock
gate physical locations of a set of clock gates in a critical
sub-chip of the ASIC for test mode power analysis; grouping the set
of representations in the sub-chip into various groups based on
overlapping of the set of representations; and generating a set of
test control signals corresponding to each of the set of clock
gates during at-speed test mode of operation such that each clock
gate with overlapping representations receive different test
control signals; and generating patterns using a virtual constraint
function to selectively enable the set of test control signals such
that the set of test control signals are not activated
simultaneously.
16. The method of claim 15, wherein creating a set of
representations corresponding to each clock gate comprises:
identifying eligible clock gates of the set of clock gates;
extracting location and load information for each eligible clock
gate of the set of clock gates; and computing the representation
for each eligible clock gate.
17. The method of claim 15, wherein grouping the set of
representations in each sub-chip into various groups based on
overlapping of representations comprises: extracting a set of
overlap graphs from the representations of the eligible clock
gates; and running a coloring scheme on each of the overlap graphs
to determine a mapping of a preliminary test control signal of the
set of test control signals to the set of clock gates.
18. The method of claim 15, wherein running a coloring scheme on
each of the overlap graphs comprises: creating a clock gate
connectivity matrix to encapsulate structural path statistics; and
reassigning the color scheme incrementally across the set of
overlap graphs to minimize pattern volume overhead and coverage
loss.
19. The method of claim 15 further comprising identifying
modifications to a functional enable of the set of clock gates that
eliminates simultaneous switching in the sub-chip during the
at-speed mode of operation.
20. The method of claim 15 whereby reducing pattern count during
the at-speed mode of operation.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority from India provisional
patent application 312/CHE/2014 filed on Jan. 24, 2014, which is
hereby incorporated by reference in its entirety.
TECHNICAL FIELD
[0002] The present disclosure relates to testing of integrated
circuits during at-speed mode of operation.
BACKGROUND
[0003] Power consumption during the test mode of circuit operation
is a major concern for scan based low power circuits. While there
are multiple design for testability (DFT) and automatic test
pattern generation (ATPG) techniques in the art for addressing both
shift and capture power reduction, most of the solutions are
coarse-grained in nature, in that they attempt to reduce power while
being agnostic of the local power density in the power grid and its
impact on the local dynamic IR Drop. Hence, such solutions can only
fortuitously alleviate any local IR drop issues in the power grid,
especially those arising from differences between functional and
test mode use case scenarios.
SUMMARY
[0004] One aspect provides a method for testing an application
specific integrated circuit (ASIC). A set of representations
(referred to as power equivalent polygons, PEPs) is created that
overlays power density information and clock gate physical
locations of a set of clock gates in a critical sub-chip of the
ASIC for test mode power analysis. The set of representations are
further grouped in the sub-chip into various groups based on
overlapping of the set of representations. Then, a set of test
control signals is generated corresponding to each of the set of
clock gates during at-speed test mode of operation such that each
clock gate with overlapping representations (PEPs) receive
different test control signals from the set of test control
signals. Further, patterns are generated using a virtual constraint
function to selectively enable the set of test control signals such
that the set of test control signals are not activated
simultaneously.
[0005] Another aspect provides a system for testing an IC. The
system includes a computer system having a test processor, the test
processor being coupled to the integrated circuit, the integrated
circuit having a set of sub-chips, a plurality of cores and a
cache, an I/O port. The test controller is configured to activate a
set of clock gates in the set of sub-chips in a neighborhood of the
integrated circuit in selective manner during at-speed mode of
operation.
[0006] Other aspects and example embodiments are provided in the
Drawings and the Detailed Description that follows.
BRIEF DESCRIPTION OF THE VIEWS OF DRAWINGS
[0007] FIG. 1a depicts an RTL code snippet of a common structure,
FIFO, found in most circuits;
[0008] FIG. 1b depicts a circuit schematic of FIG. 1a;
[0009] FIG. 2 depicts the impact of having such structures of FIG.
1a in a chip;
[0010] FIG. 3a depicts a baseline dynamic IR drop profile for a
single voltage, single clock domain block in a chip;
[0011] FIG. 3b depicts a corresponding dynamic IR drop profile with
one partition gated off in the chip;
[0012] FIG. 4 depicts a structure of a clock gate that has
integrated functional and test mode capabilities;
[0013] FIG. 5 depicts the hookup of test enable controls of clock
gates in a circuit;
[0014] FIG. 6a and FIG. 6b depict a layout cross-section of an
SoC;
[0015] FIG. 7 depicts a layout of an ASIC having multiple test
control signals to the clock gates according to an embodiment;
[0016] FIG. 8 depicts a method flow to test an ASIC during at speed
mode of operation according to an embodiment;
[0017] FIG. 9 depicts a method according to an embodiment for
testing an ASIC;
[0018] FIG. 10 depicts a method of creating the set of
representations according to an embodiment;
[0019] FIGS. 11a and 11b depict a coloring scheme according to an
embodiment; and
[0020] FIG. 12 depicts a virtual constraint function implementation
according to an embodiment.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0021] Power consumption is today a critical and often cost
defining dimension for a wide array of electronic and computing
systems ranging from small-scale embedded sensors and battery
operated mobile appliances to large-scale compute clusters and data
centers. Not surprisingly, these systems and transitively the
integrated circuits (ICs) used in them need to be designed and
tested in the face of ever shrinking power budgets.
[0022] While power consumption in application use modes of chip
operation has always received attention since it defines the
product specification, power consumption in the test mode of
operation had often been overlooked until various issues such as
burnt sockets due to thermal runaway during burn-in tests and
spurious yield issues due to elevated test mode IR drop during
at-speed transition fault tests (TFT) thrust test power to the
forefront. A wide range of hardware (DFT) and software (ATPG)
techniques have since been proposed to reduce the power consumed
during both shift and capture operations of scan based tests.
However, each technique is associated with its own tradeoffs in
power reduction effectiveness, test quality and test time,
applicability to compression, applicability to various test types
(e.g., launch-off-capture (LOC) vs. launch-off-shift (LOS)), usability
with commercial ATPG flows, etc. Invariably, a combination of
techniques is deployed in chips today to address target test power
reduction goals subject to design constraints.
[0023] Several techniques have been proposed in the art for
reducing shift mode power consumption. The techniques range from a
"simple" reduction of shift frequency to providing DFT support that
(a) reduces redundant toggling (e.g., partial or complete scan cell
gating) and/or (b) reduces concurrent chip switching (e.g., scan
segmentation or partitioning and staggered clocking). Additionally,
ATPG techniques can add switching constraints to conventional ATPG
or fill don't care bits in patterns in a power-aware manner to
generate low power pattern sets.
[0024] Relatively fewer techniques have focused on reducing capture
power during at-speed testing. On the ATPG front, some of the fill
techniques mentioned above
have also attempted to reduce capture power. In practice though,
fill techniques have only been moderately effective in industrial
designs and in the presence of test compression technologies. With
the recognition that clock gating is used de-facto in designs for
dynamic power reduction and ATPG patterns tend to turn on more
clock gates than functional use cases, ATPG tools today can
generate patterns subject to a user specified clock gate switching
threshold. This technology is associated with trade-offs in pattern
count and runtimes, and hence is used by DFT and test engineering
teams for the silicon debug of any Vmin or Fmax issues (if caused
by higher IR drop due to elevated at-speed switching) or for
production tests in highly power constrained designs. A generic
limitation of most existing at-speed clock gating enable/disable
solutions is that they attempt to reduce capture power as a whole
while being agnostic of the design's local power grid constraints.
More recently, some techniques propose the addition of DFT test
points to the functional enable of the clock gating logic to
provide more granular control and to ease the burden of the ATPG
tool to generate low power patterns. While these techniques do not
directly address local IR drop issues, they can be built upon to tackle the
problem. However, adding logic to the functional enable has
two-fold challenges: (i) re-used IPs in a system on chip (SoC) will
need to be updated and re-verified since the changes are intrusive
and affect functionality and (ii) it adds to timing criticality of
the half-cycle path to the clock gate latch. Therefore, it is
necessary to have solutions that can alleviate local power density
and dynamic IR drop hotspots in a non-intrusive manner.
[0025] Now, empirical data from two industrial chips is used to
show the problems faced from test power perspective during at-speed
mode of operation (also referred to as at-speed TFT capture). The
first example in FIG. 1a, FIG. 1b and FIG. 2 exemplifies how
circuit structures lend themselves to high test versus functional
dynamic power differential, in turn leading to local IR drop
hotspots. The second example in FIG. 3a and FIG. 3b uses one of the
popular industrial DFT techniques, scan partitioning, to show how
coarse-grained power reduction techniques can do little towards
addressing local IR drop problems.
[0026] FIG. 1a depicts the RTL code snippet of a common structure,
FIFO, found in most circuits. The 64 bit, 256 deep FIFO used in the
illustration is synthesized into a 16384 flip-flop netlist. The
flip-flops need to get a clock edge only when there is a valid
write. Hence, there would be one clock gate per row of storage
leading to 256 clock gates overall. The circuit schematic is
outlined in FIG. 1b, where clock gates are labelled as ICGs. The
functional and test mode operations of the circuit are now
analyzed. Functionally, address decoding will lead to only 64
flip-flops receiving clock at any clock cycle with a valid write.
On the other hand, conventional test mode control of clock gates
allows ATPG to bypass the functional clock gating enable altogether
and has a potential of switching all 256 ICGs together.
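The gap between functional and test mode clock activity in this FIFO can be checked with a few lines of arithmetic. The sketch below is only an illustration of the numbers quoted above; the variable names are assumptions and do not appear in the application.

```python
# Illustrative arithmetic for the FIFO of FIG. 1a (64 bits wide, 256 deep).
# All names here are hypothetical and chosen only for this sketch.
WIDTH_BITS = 64
DEPTH_ROWS = 256

total_flops = WIDTH_BITS * DEPTH_ROWS   # one flip-flop per stored bit
clock_gates = DEPTH_ROWS                # one ICG per row of storage

functional_clocked = WIDTH_BITS         # one row is written per valid write
test_clocked = total_flops              # a shared test enable can turn on every ICG

# How many times more flip-flops can be clocked in test mode than functionally.
activity_ratio = test_clocked // functional_clocked

print(total_flops, clock_gates, functional_clocked, test_clocked, activity_ratio)
```

Under these assumptions the test mode can clock 256 times as many flip-flops in a cycle as any functional write does, which is the differential that later shows up as a local hotspot.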
[0027] The impact of having such structures of FIG. 1b in a chip is
depicted in FIG. 2. FIG. 2 depicts the dynamic IR drop profile
corresponding to an at-speed LOC transition fault test (TFT)
pattern of a crypto processing subsystem in a 45 nm chip. The IR
drop profile bins regions based on the percentage IR drop relative
to supply voltage against design specific thresholds and are
colored accordingly. The area 205 indicates hotspots that need to
be further analyzed for use-case based waivers or design/power grid
fixes. The high IR drop section of the block called out in FIG. 2
corresponds to a register based FIFO structure, with the TFT
patterns enabling up to all clock gates at a time. Thus, a very low
functional dynamic power circuit has become an IR drop hotspot due
to excessive local (spatial and temporal) activity in the test
mode. Since the power grid in this section cannot support 100%
clock toggles, the implications are clear--a mechanism is needed to
lower the clock activity due to the clock gates in the area
205.
[0028] FIG. 3a depicts the baseline dynamic IR drop profile for a
single voltage, single clock domain block (290K flip-flops) in a 28
nm chip. The block also implements coarse-grained scan
partitioning, i.e., the circuit is divided into two scan partitions
as per layout and routing considerations to avoid simultaneous
shift and capture across the complete block. FIG. 3b depicts the
corresponding dynamic IR drop profile with one partition gated off.
The power consumption characteristics of the block are tabulated in
Table 1 below for both the baseline and coarse-grained scan
partitioning scenarios during at-speed TFT. While peak and average
power do come down with the use of partitioning, it is noted from
the IR drop profiles as well as the table that the worst IR drop
and hotspot locations remain more or less the same (the slight
difference results from the difference in patterns generated in
both scenarios). While coarse level partitioning can reduce average
and peak power, dynamic IR drop alleviation needs local activity
control.
TABLE 1
Metric | Baseline | Coarse partitioning
Peak Power | 1510 mW | 641.6 mW
Average Power | 861 mW | 361 mW
Worst Drop | 14.14% | 14.42%
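The contrast in Table 1 is easier to see as relative changes. The following sketch simply recomputes percentages from the tabulated values; the dictionary keys and the helper name reduction_pct are assumptions made for illustration.

```python
# Values copied from Table 1; key names are hypothetical.
baseline = {"peak_mw": 1510.0, "avg_mw": 861.0, "worst_drop_pct": 14.14}
partitioned = {"peak_mw": 641.6, "avg_mw": 361.0, "worst_drop_pct": 14.42}


def reduction_pct(before, after):
    """Relative reduction from 'before' to 'after', in percent."""
    return 100.0 * (before - after) / before


peak_reduction = reduction_pct(baseline["peak_mw"], partitioned["peak_mw"])
avg_reduction = reduction_pct(baseline["avg_mw"], partitioned["avg_mw"])
# Change in worst-case IR drop, in percentage points of supply voltage.
drop_change = partitioned["worst_drop_pct"] - baseline["worst_drop_pct"]

print(round(peak_reduction, 1), round(avg_reduction, 1), round(drop_change, 2))
```

Peak and average power fall by more than half, while the worst drop moves by only a fraction of a percentage point, consistent with the observation that coarse partitioning does not relieve the local hotspot.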
[0029] FIG. 4 depicts a structure of a clock gate that has
integrated functional and test mode capabilities. The clock gate
425 takes an input clock 405 (Clkin), functional and test enable
controls (FE (415) and TE (410)), and outputs the gated clock 420
(Clkout). The OR-ed value of the functional and test enables is
latched when Clkin 405 changes from its low (latch transparent
state) to high state, and is then used to qualify Clkin 405 to
create Clkout 420. In simple terms, the clock is gated when both FE
415 and TE 410 are low and active when either of the enables is
high. Most clock gates in a design (unless explicitly instantiated
in RTL as is done for root clock gates) are typically inferred from
the circuit behavior and inserted during synthesis. Further,
cloning of clock gates is done to aid clock tree synthesis through
clock gate fan-out guidelines. This practice also ensures that
clock gates and the flip-flops/logic they service end up locally
situated during placement. A natural observation that follows is
that solving local IR drop issues for a given circuit can be
reduced to developing solutions for independently managing clock
gates and their activity.
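The clock gate behavior described for FIG. 4 can be captured in a small behavioral model. This is a minimal sketch, assuming an active-low transparent latch and single-bit integer signals; the class and function names are hypothetical and are not taken from the application.

```python
class ClockGate:
    """Behavioral sketch of a latch-based clock gate with FE and TE inputs."""

    def __init__(self):
        self.latched_enable = 0  # value currently held by the enable latch

    def step(self, clk_in, fe, te):
        # The latch is transparent while Clkin is low, so it follows FE OR TE.
        if clk_in == 0:
            self.latched_enable = fe | te
        # While Clkin is high the latch holds; the held value qualifies Clkin.
        return clk_in & self.latched_enable


def run(gate, fe, te, cycles=2):
    """Drive the gate through low/high clock phases and collect Clkout."""
    out = []
    for _ in range(cycles):
        out.append(gate.step(0, fe, te))  # low phase: enable is sampled
        out.append(gate.step(1, fe, te))  # high phase: clock passes if enabled
    return out
```

With both enables low, run(ClockGate(), 0, 0) yields an output that never rises, while asserting either FE or TE lets the high phases through. An enable that arrives only while the clock is already high is ignored until the next low phase, which is the glitch protection the latch provides.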
[0030] FIG. 5 depicts the hookup of test enable controls of clock
gates in a circuit. During synthesis of the circuit, the test
enable of all clock gates 505 (C1, C2, C3, C4 and C5) are hooked up
to a single port (test enable, TE) at the circuit boundary level.
This pin is further connected to the circuit's scan enable pin or
more effectively to a scanned test register bit OR-ed with the scan
enable pin. The hookup for TE as above allows ATPG to set all the
clock gates in the design in one of two states, when TE=1 and when
TE=0. When TE=1, all clock gates are ON during capture, which
increases power dissipation and can create local hotspots. FIG. 6a
and FIG. 6b depict the layout cross-section of an SoC, where the
clock gates are all sharing the same TE (highlighted as black
bodied squares). For each clock gate, there is a representation
created (called power equivalent polygon or PEP hereinafter, 610)
that is indicative of the power and locality that the power grid
can service when the clock gate and its associated load switch. As
can be noted from the FIG. 6b, when the clock gates are turned ON
simultaneously with TE=1, there are many overlapping PEPs 610
leading to hotspots. When TE=0, ATPG now depends on the functional
enable to switch clock gates ON or OFF. Since the functional enable
is in turn dependent on the state of scan flip flops, it implies
justifying values at the relevant flops to realize the desired FE
values at the clock gates subject to resolution of any
inter-dependencies. As a result, relying on just this state can
potentially lead to both pattern inflation and coverage loss.
[0031] The approach according to an embodiment to alleviate the
issue highlighted in FIGS. 6a and 6b is depicted in FIG. 7.
Multiple test enable controls or test control signals (TE1 (705),
TE2 (710), TE3 (715) and TE4 (720)) are provided to hook up to
different clock gates in the circuit. Further, the hookup is done
using the methodology outlined in FIG. 8 such that clock gates with
overlapping PEPs have different test controls. By ensuring that
clock gates with overlapping PEPs have different test controls, the
chance of their simultaneous activation is lowered, meaning their
activation is now dependent on the simultaneous activation of the
respective functional enables. Thus, a locally aware multi-test
enable hookup provides an opportunity to eliminate or minimize
simultaneous switching in a local region and resulting IR drop
hotspots.
[0032] FIG. 8 depicts a method flow according to an embodiment to
test an ASIC during at speed mode of operation. The method flow
provides a low power physical design flow that has been augmented
with optimizations that enable layout-aware clock gate test
controls. Steps 805, 810, 815 and 820 include floor planning, power
routing, power-aware placement and timing optimizations, followed
by power-aware clock tree synthesis (CTS). IR drop analysis is then
done at step 825 to audit the design database at this stage from a
functional power goal perspective at step 830. Steps 805-820 are
then iterated as needed. After the design meets the functional
power and IR drop budgets, test mode IR drop analysis is done at
step 835. An embodiment is integrated in the flow at this stage at
step 840 as a layout aware ICG test enable hookup identification
engineering change order (ECO) implementation, and in turn consists
of two key sub-steps.
[0033] The first sub-step is design analysis, where the results of
dynamic IR drop analysis are analyzed to identify critical sub-chips
from a local instantaneous power hotspot perspective for at-speed
TFT capture scenarios (at speed mode of operation). The second
sub-step is a layout aware clock gate test control mapping
algorithm. For the critical sub-chips identified, the algorithm
provides multiple test enable controls for clock gates such that
the local hotspots can be minimized. Both the sub-steps are
explained in detail later in the specification.
[0034] The test enable control to clock gate mapping thus obtained
is taken in as an ECO to the database. The updated database is
taken into routing and post-route optimizations at step 850, and
final signoff closure at step 855. The mapping is also conveyed to
the ATPG engine to generate production test patterns. The ATPG
engine treats the mapping as constraints that can be used to
control test enable activation of clock gates during at-speed
capture cycles at step 845. The ATPG customizations are described
further in FIG. 11.
[0035] Referring now to FIG. 9, a method according to an embodiment
for testing an ASIC is depicted. At step 905, a critical sub-chip
in the ASIC based on power dissipation and IR sensitivity is
identified. For example, consider a floorplan of a 45 nm die. The
die is designed for a wire-bond package with power sources at the
periphery. Physical design closure of the chip proceeds through a
divide and conquer approach with say 12 subsystems physically
hardened independently (also called blocks). Static and dynamic IR
drop analyses of the chip are performed to understand the
robustness of the power distribution network in a chip for various
use-case scenarios against design/package specific thresholds.
Static IR drop analysis is first done to analyze the power grid
from a global IR drop contour perspective. For regions identified
as maximum average power hotspots, power grid augmentation is first
done against existing floorplan and IO periphery constraints. This
is followed up further by using existing DFT or ATPG techniques
that reduce power consumption in a coarse-grained manner. Dynamic
IR drop analysis is then done that calls out the locations that
have instantaneously high IR drop. For at-speed capture use case
scenario, the analysis is performed using waveforms corresponding
to worst-case at-speed capture TFT patterns exercising targeted
portions of the chip. Alternatively, this is done in a vectorless
manner with the test enable control of all clock gates allowed to
be ON simultaneously and associating representative toggle activity
to the scan flops.
[0036] At step 910, the set of representations (PEPs) is created
that overlays power density information and clock gate physical
locations of a set of clock gates in a critical sub-chip of the
ASIC for test mode power analysis. FIG. 10 depicts the method of
creating the set of representations (PEPs) according to the
embodiment. Steps 1005-1015 are performed to extract clock gates
and their physical and power grid characteristics. At step 1005,
all eligible clock gates are identified for a given test clock
group in a circuit. An ICG is considered eligible if it belongs to
the relevant test clock group and if it is a leaf clock gate whose
test enable can be re-connected without compromising
functionality.
[0037] For each eligible clock gate, the location and load
information from the circuit's physical design database is
extracted at step 1010. As mentioned earlier, the Power Equivalent
Polygon or PEP (the representation) for each eligible clock gate is
indicative of the power and locality that the power grid can
service when the clock gate and its associated load switch.
Therefore the PEP is abstracted as the locus around the clock gate
to which the switching power of the clock gate and its load can be
mapped.
[0038] Mathematically, the PEP is computed as follows at step 1015.
If P denotes the power limit per unit area (power grid design
constraint), A the area of the PEP, L the ICG Load, F the frequency
of operation, and V the voltage of operation,
$$P \cdot A = 0.5 \cdot L \cdot V^{2} \cdot F$$
[0039] Since the interconnect is typically Manhattan, the PEP is,
for example, modelled as a rhombus (since every point on the
rhombus will be equidistant in a Manhattan sense) or in a simpler
manner as a square. If E is the edge of the square, it follows
that
$$E = \left(\frac{0.5 \cdot V^{2} \cdot F}{P}\right)^{1/2} \cdot L^{1/2}$$
[0040] For example, if P=100 mW/mm^2, F=200 MHz, and V=1.1 V, a load
L of 50 fF gives an edge size of approximately 7.8 µm.
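The worked example can be reproduced numerically. The sketch below evaluates the square-PEP relation with explicit unit handling; the function name pep_edge_um and its parameter names are assumptions introduced for illustration.

```python
import math


def pep_edge_um(load_farads, voltage, freq_hz, power_limit_w_per_mm2):
    """Edge of a square PEP in micrometres, from P*A = 0.5*L*V^2*F."""
    switching_power_w = 0.5 * load_farads * voltage ** 2 * freq_hz
    area_mm2 = switching_power_w / power_limit_w_per_mm2
    return math.sqrt(area_mm2) * 1000.0  # convert mm to um


# P = 100 mW/mm^2 = 0.1 W/mm^2, F = 200 MHz, V = 1.1 V, L = 50 fF
edge = pep_edge_um(50e-15, 1.1, 200e6, 0.1)
print(round(edge, 1))
```

The switching power works out to about 6.05 µW, which at 0.1 W/mm^2 needs roughly 6.05e-5 mm^2 of grid, giving an edge close to 7.8 µm. Because the edge scales with the square root of the load, quadrupling L only doubles the edge.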
[0041] An example of PEPs extracted for one cross-section of a die
is depicted in FIG. 6. At step 915, the set of PEPs in the sub-chip
is grouped into various groups based on overlapping of the set of
PEPs. With the clock gates and their PEPs available, an overlap
graph G(V,E) is created where the vertices V correspond to the
clock gates and the edges E correspond to vertex pair (v1, v2) such
that v1 and v2 have overlapping PEPs at step 1020. The eligible
clock gate set can, thus, be mapped to multiple connected overlap
graphs.
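One way to picture the overlap graph extraction is with axis-aligned square PEPs. The following is a minimal sketch, assuming each PEP is given as a (center x, center y, edge) tuple in the same length unit; the function names and sample coordinates are hypothetical.

```python
from itertools import combinations


def squares_overlap(a, b):
    """True if two axis-aligned squares (cx, cy, edge) share interior area."""
    (xa, ya, ea), (xb, yb, eb) = a, b
    reach = (ea + eb) / 2.0
    return abs(xa - xb) < reach and abs(ya - yb) < reach


def build_overlap_graph(peps):
    """Vertices are clock gates; an edge joins gates with overlapping PEPs."""
    graph = {name: set() for name in peps}
    for u, v in combinations(peps, 2):
        if squares_overlap(peps[u], peps[v]):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def connected_components(graph):
    """Split the graph into its connected overlap graphs."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        components.append(comp)
    return components


# Hypothetical placement: three gates in a row and two in a far corner.
peps = {
    "C1": (0.0, 0.0, 8.0),
    "C2": (5.0, 0.0, 8.0),
    "C3": (10.0, 0.0, 8.0),
    "C4": (40.0, 40.0, 8.0),
    "C5": (44.0, 44.0, 8.0),
}
graph = build_overlap_graph(peps)
components = connected_components(graph)
```

In this sample, C1 and C3 do not overlap each other but both overlap C2, so the three gates form one connected overlap graph, and C4 and C5 form a second. The pairwise check is quadratic in the number of gates; a production flow would likely use a spatial index, which is outside the scope of this sketch.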
[0042] At step 1025, the problem of eliminating local IR drop hotspots
is reduced to finding unique vertex colors for the extracted overlap
graphs. A vertex coloring scheme is run on each of the overlap graphs
to determine a mapping of preliminary test control signals to the set
of clock gates. Finding a minimum set of
vertex colors will also ensure that the number of independent test
control ports at the block boundary is also kept to a minimum,
thereby minimizing the area of any JO limited blocks. There are
multiple vertex coloring algorithms in the literature that can be
used such as Brelaz's greedy heuristic algorithm, commercial
implementations of which are also available that can be easily used
with any existing physical design flow.
[0043] Using the vertex color mapping thus identified, it is
ensured that vertices with a common color are mapped to the same
test enable and conversely, vertices with different colors are
mapped to a different test enable. A one-hot activation of test
enables now ensures that the clock gates in two different test
enable groups are less likely to simultaneously switch (since they
are now dependent only on the functional enable), thereby
minimizing potential local IR drop hotspots.
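The coloring and the color-to-test-enable mapping can be sketched as below. This is a plain saturation-degree greedy heuristic in the spirit of Brelaz's algorithm, not a commercial implementation; the graph, the tie-breaking rule, and the "TE<n>" naming are assumptions for illustration.

```python
def dsatur_coloring(graph):
    """graph: dict vertex -> set of neighbours. Returns dict vertex -> colour."""
    colors = {}
    uncolored = set(graph)

    def priority(v):
        # Saturation = number of distinct colours already used by neighbours
        saturation = len({colors[n] for n in graph[v] if n in colors})
        return (saturation, len(graph[v]), str(v))

    while uncolored:
        v = max(uncolored, key=priority)
        used = {colors[n] for n in graph[v] if n in colors}
        color = 0
        while color in used:
            color += 1  # smallest colour not used by any neighbour
        colors[v] = color
        uncolored.remove(v)
    return colors


# Hypothetical overlap graph: cg0 - cg1 - cg2 chained, cg3 isolated
graph = {"cg0": {"cg1"}, "cg1": {"cg0", "cg2"}, "cg2": {"cg1"}, "cg3": set()}
colors = dsatur_coloring(graph)

# Vertices with a common colour share one test enable
test_enable = {cg: f"TE{c}" for cg, c in colors.items()}
print(sorted(test_enable.items()))
```

The number of distinct colors equals the number of test control ports needed at the block boundary, which is why a small palette is preferred.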
[0044] The vertex coloring scheme described above runs
independently per overlap graph and guarantees the color separation
of overlapping nodes. However, the initial vertex coloring solution
is agnostic of the fact that structural paths can exist between the
independent overlap graphs. Therefore, the vertex coloring
determined independently for two overlap graphs can create a
scenario where there are a number of structural paths between
flip-flops controlled by differently colored eligible clock gate
elements (referred to as "color crossing"). Given the one-hot
activation need from a local overlap graph perspective, the
algorithm may have created a scenario for potential coverage loss
or pattern inflation since downstream ATPG will now rely on the
functional enable values for coverage recovery. It is also noted
that a given overlap graph can be "recolored" so that overlapping
nodes remain color separated within, while ensuring that "color
crossings" are minimized. This is illustrated using FIGS. 11a and
11b.
[0045] Referring now to FIG. 11a, OG1 (1105) and OG2 (1110) are two
overlap graphs for a given eligible clock gate. Each graph has been
individually, satisfactorily colored with a minimum number of
colors. Consider now the dashed lines that represent structural
paths between the connected nodes. If every dashed connection is
assumed to represent 10 paths, and considering the fact that
differently colored nodes cannot switch together, this coloring
solution has created 50 color crossings. In FIG. 11b, the black and
white nodes of OG1 (1115) are swapped. This optimization, while
maintaining the coloring validity of the individual overlap graphs,
eliminates all color crossings between OG1 (1115) and OG2
(1120).
[0046] Therefore an iterative recoloring step is proposed according
to an embodiment that reduces the number of color crossings across
overlap graphs by a cost-driven color swap within overlap graphs.
The algorithm uses two data structures: (a) vertex-colored overlap
graphs and (b) vertex connectivity matrix that is indicative of the
number of paths between flip flops clocked by the clock gates.
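One way to picture the recoloring step is the sketch below. It assumes the connectivity matrix is stored as a dict of path counts keyed by clock gate pairs, and it tries every relabelling of the colors inside one overlap graph, which is a simplification of the cost-driven swap and only practical for small palettes. All names and numbers are hypothetical and do not reproduce FIGS. 11a and 11b.

```python
from itertools import permutations


def color_crossings(colors, paths):
    """Sum path counts over clock gate pairs whose colours differ."""
    return sum(n for (u, v), n in paths.items() if colors[u] != colors[v])


def recolor_component(component, colors, paths):
    """Relabel colours inside one overlap graph to minimise crossings.

    A relabelling is a bijection on the colours, so overlapping nodes
    inside the component stay colour separated.
    """
    palette = sorted({colors[v] for v in component})
    best, best_cost = dict(colors), color_crossings(colors, paths)
    for perm in permutations(palette):
        relabel = dict(zip(palette, perm))
        trial = dict(colors)
        for v in component:
            trial[v] = relabel[colors[v]]
        cost = color_crossings(trial, paths)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best, best_cost


# Two hypothetical overlap graphs {a0, a1} and {b0, b1}, each 2-coloured
colors = {"a0": 0, "a1": 1, "b0": 0, "b1": 1}
# Structural path counts between flip-flops clocked by each pair
paths = {("a0", "b1"): 10, ("a1", "b0"): 10}

print(color_crossings(colors, paths))  # 20
new_colors, cost = recolor_component({"a0", "a1"}, colors, paths)
print(cost)  # 0
```

Iterating this over every overlap graph until the cost stops improving gives one possible form of the iterative flow.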
[0047] It is noted that with a single test control, an ATPG tool
gives the best result in terms of test coverage and pattern count.
However, since a single test control offers no granular control over
the clock gates, it is virtually impossible to meet the requirements
of patterns with lower switching activity. In one embodiment, the
ATPG flow is
customized to generate low power patterns, while ensuring that
constraints related to the test mode activation of clock gates are
honored. The overall ATPG flow can additionally leverage the
inherent coarse-grained flip-flop switching throttling features
available in commercial ATPG tools--that is, ensure that each
pattern meets user specified maximum switching activity thresholds
in capture cycles during TFT pattern generation. Patterns are
generated using a virtual constraint function to selectively enable
the set of test control signals such that the set of test control
signals are not activated simultaneously. The virtual constraint
function is created by receiving a set of values of the set of test
control signals, then checking for a contention corresponding to
the set of values where more than one of the set of values is logic
1, and by removing a pattern associated with contention dynamically
during pattern generation. The virtual constraint function (VCF)
encapsulates the desired property related to activation of clock
gates. The test enable controls of clock gates in a circuit (ports
or scan flop outputs) need to be activated in a one-hot manner. An
example VCF is depicted in FIG. 12 where there are four test enable
ports (1205, 1210, 1215 and 1220) shown. The VCF 1225 functions as
a monitor that takes the values of the four test enable ports
(1205, 1210, 1215 and 1220) as inputs and has a floating output. If
more than one enable value becomes one, a contention arises in the
VCF. Pattern generation can now be performed in the presence of
VCF, with a directive to the ATPG tool to dynamically remove any
pattern creating a contention during capture. Thus, the patterns
generated will ensure that, for a given capture cycle, at most
one test-enable control is active. It is noted that the VCF
can be extended to encapsulate other power constraints including
those related to functional enables of clock gates. The generated
patterns can be further screened to ensure that they respect any
capture switching thresholds driven from functional use case data.
This optimization is additionally made an iterative flow in which
violating patterns are dropped and patterns are regenerated for the
corresponding faults. This is because some commercial ATPG tools do
not necessarily guarantee capture switching thresholds in their
pattern sets (with precedence given to test coverage over power in
their internal cost functions).
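The behavior of the virtual constraint function can be modelled in software as a simple monitor. In the described flow the ATPG tool removes contending patterns during generation; the sketch below instead screens a finished pattern list, and the pattern representation (a dict with a "te" tuple of capture-cycle test enable values) is an assumption for illustration.

```python
def has_contention(test_enables):
    """Return True when more than one test enable is at logic 1."""
    return sum(1 for bit in test_enables if bit == 1) > 1


def screen_patterns(patterns):
    """Split patterns into (kept, dropped) according to the monitor."""
    kept, dropped = [], []
    for pattern in patterns:
        (dropped if has_contention(pattern["te"]) else kept).append(pattern)
    return kept, dropped


# Four hypothetical patterns over four test enable ports
patterns = [
    {"id": 0, "te": (1, 0, 0, 0)},
    {"id": 1, "te": (0, 1, 1, 0)},  # two enables high: contention
    {"id": 2, "te": (0, 0, 0, 0)},  # no enable high is still allowed
    {"id": 3, "te": (0, 0, 0, 1)},
]
kept, dropped = screen_patterns(patterns)
print([p["id"] for p in kept])     # [0, 2, 3]
print([p["id"] for p in dropped])  # [1]
```

Faults targeted only by dropped patterns would then be passed back for regeneration, matching the iterative flow described above.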
[0048] Various embodiments are implemented in a system for testing
the IC. The system includes a computer system having a test
processor, the test processor being coupled to the integrated
circuit, the integrated circuit having a set of sub-chips, a
plurality of cores and a cache and an I/O port. The test controller
is configured to activate a set of clock gates in the set of
sub-chips in a neighborhood of the integrated circuit in a selective
manner during at-speed mode of operation. The test controller, in
response to a stimulus from the test processor, is configured to
generate a set of test control signals that activates the set of
clock gates by creating a set of representations that overlays
power density information and clock gate physical locations of a
set of clock gates in a critical sub-chip of the integrated circuit
for test mode power analysis, grouping the set of representations
in the sub-chip into various groups based on overlapping of the set
of representations, and by generating a set of test control signals
corresponding to each of the set of clock gates during at-speed
test mode of operation such that clock gates with overlapping
representations receive different test control signals.
[0049] Various embodiments provide test hooks in the form of
multiple test enable controls to clock gates in a circuit to
eliminate any modifications to the functional path and can hence be
a simple engineering change order (ECO) even late in the design
cycle. The DFT hooks can be used with any commercial ATPG flow
through the addition of virtual constraints that can enable clock
gates in a selective manner.
[0050] Processes and logic flows described herein may be performed
by one or more programmable processors executing one or more
computer programs to perform functions by operating on input data
and generating corresponding output. Processes and logic flows
described herein may be performed by, and apparatus can also be
implemented as, special purpose logic circuitry, e.g., an FPGA
device or an ASIC. The foregoing description sets forth numerous
specific details to convey a thorough understanding of the
invention. However, it will be apparent to one skilled in the art
that the invention may be practiced without these specific details.
Well-known features are sometimes not described in detail in order
to avoid obscuring the invention. Other variations and embodiments
are possible in light of above teachings, and it is thus intended
that the scope of invention not be limited by this Detailed
Description, but only by the following Claims.
* * * * *