U.S. patent number 10,776,559 [Application Number 16/383,683] was granted by the patent office on 2020-09-15 for defect detection method for multilayer daisy chain structure and system using the same.
This patent grant is currently assigned to I-SHOU UNIVERSITY. The grantee listed for this patent is I-SHOU UNIVERSITY. Invention is credited to Mei-Hui Guo, Yu-Jung Huang, Chung-Long Pan.
![](/patent/grant/10776559/US10776559-20200915-D00000.png)
![](/patent/grant/10776559/US10776559-20200915-D00001.png)
![](/patent/grant/10776559/US10776559-20200915-D00002.png)
![](/patent/grant/10776559/US10776559-20200915-D00003.png)
![](/patent/grant/10776559/US10776559-20200915-D00004.png)
![](/patent/grant/10776559/US10776559-20200915-D00005.png)
![](/patent/grant/10776559/US10776559-20200915-D00006.png)
![](/patent/grant/10776559/US10776559-20200915-D00007.png)
![](/patent/grant/10776559/US10776559-20200915-D00008.png)
![](/patent/grant/10776559/US10776559-20200915-D00009.png)
![](/patent/grant/10776559/US10776559-20200915-D00010.png)
United States Patent 10,776,559
Huang, et al.
September 15, 2020
Defect detection method for multilayer daisy chain structure and system using the same
Abstract
A defect detection method for a multilayer daisy chain
structure, including: generating a plurality of physical models
having a defect of at least one defect type based on the at least
one defect type of a daisy chain structure; generating a group of
training samples for each of the physical models; generating a
classifier model by applying a machine learning algorithm to
scattering parameter values of a training set; measuring an
error value by comparing scattering parameter values of a testing
set with the classifier model; using the classifier model as a
defect model of the defect type based on the error value; and
determining that the multilayer daisy chain structure has a defect
corresponding to the at least one defect type by comparing actual
measurements of scattering parameter values with the defect model.
Inventors: Huang; Yu-Jung (Kaohsiung, TW), Pan; Chung-Long (Kaohsiung, TW), Guo; Mei-Hui (Kaohsiung, TW)
Applicant: I-SHOU UNIVERSITY (Kaohsiung, TW)
Assignee: I-SHOU UNIVERSITY (Kaohsiung, TW)
Family ID: 1000005055664
Appl. No.: 16/383,683
Filed: April 15, 2019
Prior Publication Data
Document Identifier: US 20190236240 A1
Publication Date: Aug 1, 2019
Related U.S. Patent Documents
Application Number: 15604671
Filing Date: May 25, 2017
Patent Number: 10303823
Foreign Application Priority Data
Mar 30, 2017 [TW] 106110815 A
Current U.S. Class: 1/1
Current CPC Class: G06F 30/398 (20200101); G06N 20/00 (20190101); G06F 11/263 (20130101); G06F 11/2221 (20130101)
Current International Class: G06F 30/398 (20200101); G06F 11/263 (20060101); G06N 20/00 (20190101); G06F 11/22 (20060101)
Primary Examiner: Whitmore; Stacy
Attorney, Agent or Firm: JCIPRNET
Parent Case Text
CROSS-REFERENCE TO RELATED APPLICATION
This is a continuation-in-part application of and claims the
priority benefit of U.S. patent application Ser. No. 15/604,671,
filed on May 25, 2017, now allowed, which claims priority benefit
of Taiwan application Ser. No. 106110815, filed on Mar. 30, 2017.
The entirety of each of the above-mentioned patent applications is
hereby incorporated by reference herein and made a part of this
specification.
Claims
What is claimed is:
1. A defect detection method for a multilayer daisy chain
structure, comprising: generating a plurality of physical models
having a defect of at least one defect type based on the at least
one defect type of a daisy chain structure; generating a group of
training samples for each of the physical models to form a
training sample group and dividing the training sample group into a
training set and a testing set; obtaining a group of scattering
parameter values corresponding to the training set via the training
set; generating a classifier model via the group of scattering
parameter values; measuring an error value of the classifier model
using the testing set and using the classifier model as a defect model
of the at least one defect type based on the error value; and
inputting, to the classifier model, at least one scattering
parameter value of the multilayer daisy chain structure to be
detected so as to determine that the multilayer daisy chain
structure has a determined defect corresponding to the at least one
defect type, wherein the multilayer daisy chain structure has
had the determined defect corresponding to the at least one
defect type since being manufactured; and all steps of the defect
detection method are implemented by a processor.
2. The defect detection method of claim 1, wherein the step of
"generating a group of training samples for each of the physical
models to form a training sample group and dividing the training
sample group into a training set and a testing set" further
comprises: dividing the training sample group into an m number of
subsets, wherein an (m-1) number of the subsets are assigned as the
training set and the remaining 1 of the subsets is assigned as the
testing set, wherein m is any positive integer.
3. The defect detection method of claim 2, wherein the step of
"measuring an error value of the classifier model using the testing
set and using the classifier model as a defect model of the at
least one defect type based on the error value" further comprises:
reassigning the (m-1) number of the subsets as a new training set
and assigning the remaining 1 of the subsets as a new testing set
after the error value of the classifier model is measured, wherein
the subset forming the new testing set and the subset forming the
previous testing set are different; generating a new classifier
model and measuring a new error value of the new classifier model
based on the new training set and the new testing set; repeating
the step of generating the new classifier model and the step of
measuring the new error value (m-1) times until each of the subsets
in the m number of subsets has been the testing set; and
calculating an average error value of the measured m number of
error values and deciding whether to use the classifier model as
the defect model of the at least one defect type based on the
average error value.
4. The defect detection method of claim 1, wherein the step of
"generating a plurality of physical models having a defect of at
least one defect type based on the at least one defect type of a
daisy chain structure" comprises: generating the plurality of
physical models using a high frequency structure simulator (HFSS)
based on a structure parameter group, wherein the structure
parameter group comprises at least one parameter of a Through
Silicon Via (TSV) diameter, a TSV height, a bump diameter, a bump
height, a TSV pitch, a loss tangent of silicon, and a conductivity
of silicon.
5. The defect detection method of claim 4, wherein the group of
training samples is generated based on a same structure parameter
group as the physical model.
6. The defect detection method of claim 1, wherein the step of "obtaining a
group of scattering parameter values corresponding to the training
set via the training set" comprises: obtaining the group of
scattering parameter values corresponding to the training set via a
finite element method (FEM).
7. The defect detection method of claim 1, wherein the step of
"generating a classifier model via the group of scattering
parameter values" comprises: using one of a K nearest neighbor
(KNN) algorithm and a random forest (RF) algorithm on the group of scattering
parameter values to generate the classifier model.
8. The defect detection method of claim 1, wherein the at least one
defect type corresponds to at least one of an open circuit, a short
fault, and a fault occurring location.
9. A defect detection system for a multilayer daisy chain
structure, comprising: a processor configured to control: a
physical model generating module, the physical model generating
module generates a plurality of physical models having a defect of
at least one defect type based on the at least one defect type of a
daisy chain structure; a training sample generating module, the
training sample generating module generates a group of training
samples for each of the physical models to form a training sample
group and divides the training sample group into a training set and
a testing set; a scattering parameter value acquisition module, the
scattering parameter value acquisition module obtains a group of
scattering parameter values corresponding to the training set via
the training set; a classifier model generating module, the
classifier model generating module generates a classifier model via
the group of scattering parameter values; and a defect model
generating module, the defect model generating module measures an
error value of the classifier model using the testing set, uses the
classifier model as a defect model of the at least one defect type
based on the error value, and inputs, to the classifier model, at
least one scattering parameter value of the multilayer daisy chain
structure to be detected so as to determine that the multilayer
daisy chain structure has a determined defect corresponding to the
at least one defect type, wherein the multilayer daisy chain
structure has had the determined defect corresponding to
the at least one defect type since being manufactured.
10. The defect detection system of claim 9, wherein the training
sample generating module further divides the training sample group
into an m number of subsets, wherein an (m-1) number of the subsets
are assigned as the training set and the remaining 1 of the subsets
is assigned as the testing set, wherein m is any positive
integer.
11. The defect detection system of claim 10, wherein the defect
model generating module is further configured to: reassign the
(m-1) number of the subsets as a new training set and assign the
remaining 1 of the subsets as a new testing set after the error
value of the classifier model is measured, wherein the subset
forming the new testing set and the subset forming the previous
testing set are different; generate a new classifier model and
measure a new error value of the new classifier model based on the
new training set and the new testing set; repeat the step of
generating the new classifier model and the step of measuring the
new error value (m-1) times until each of the subsets in the m
number of subsets has been the testing set; and calculate an
average error value of the measured m number of error values and
decide whether to use the classifier model as the defect model of
the at least one defect type based on the average error value.
12. The defect detection system of claim 9, wherein the physical
model generating module further generates the plurality of physical
models using a high frequency structure simulator (HFSS) in the
physical model generating module based on a structure parameter
group, wherein the structure parameter group comprises at least one
parameter of a Through Silicon Via (TSV) diameter, a TSV height, a
bump diameter, a bump height, a TSV pitch, a loss tangent of
silicon, and a conductivity of silicon.
13. The defect detection system of claim 12, wherein the group of
training samples is generated by the training sample generating
module based on a same structure parameter group as the physical
model.
14. The defect detection system of claim 9, wherein the scattering
parameter value acquisition module further obtains the group of
scattering parameter values corresponding to the training set via a
finite element method (FEM).
15. The defect detection system of claim 9, wherein the classifier
model generating module further uses one of a K nearest neighbor
(KNN) algorithm and a random forest (RF) algorithm on the group of
scattering parameter values to generate the classifier model.
16. The defect detection system of claim 9, wherein the at least
one defect type corresponds to at least one of an open circuit, a
short fault, and a fault occurring location.
Description
BACKGROUND OF THE INVENTION
Field of the Invention
The invention relates to a detection method and a system, and more
particularly, to a defect detection method for a 3D chip and a
system using the same.
Description of Related Art
Compared to a traditional 2D chip, a three-dimensional (3D) chip
stacking technique has many advantages. For instance, adopting a 3D
stacking method can significantly increase system integration,
reduce package body size and weight, increase package density, and
reduce form factor, such that more components can be accommodated
in a unit volume of an integrated circuit. Moreover, via
heterogeneous integration, chips of various different processes and
operation characteristics can be stacked. For instance, modules of
various different functions such as analog, RF, and logic circuits
can be grouped together via a 3D stacking technique to
significantly increase system performance.
For 3D chips connected by a vertical Through Silicon Via (TSV)
technique, the TSV can significantly increase the number of
connections between two layers of chips. By achieving vertical
interconnection communication between layers, the connection length
in an integrated circuit can be effectively reduced. Since a shorter
connection has lower parasitic resistance and capacitance, the RC
time-constant signal delay is reduced, such that the signal
transmission rate and the data transmission bandwidth are increased,
and the past issue of limited data bandwidth is effectively
addressed.
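As a rough illustration of this scaling argument, assume a uniform connection of length \ell with resistance r and capacitance c per unit length (these symbols are introduced here for illustration and do not appear in the specification). The lumped time constant then scales with the square of the length:

```latex
R = r\,\ell, \qquad C = c\,\ell, \qquad \tau = R\,C = r\,c\,\ell^{2}
```

Under this simplified model, halving the connection length reduces the time constant to about one quarter of its original value.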
In a 3D stacking system, the interconnect is a very important
infrastructure dominating signal transmission quality, but as the
size of the 3D chip continues to shrink, the potential threat
of system failure caused by connection failure is also becoming
more significant. Therefore, the quality and cost of system
miniaturization depend on the detection method of the TSV itself,
which is an important link deserving attention. Many issues
and challenges of the 3D chip stacking process are related to test.
For instance, a large portion of the high manufacturing cost of the
TSV is spent on test, and not every manufactured TSV can be applied in
a chip circuit. If a defective TSV is not detected and is used in the
stacking with other chips, then the entire stacking system is
flawed. Based on the above, the development of a reliable,
high-efficiency TSV detection theory and method is very important
for the application of 3D chip stacking.
Since the 3D chip is formed by stacking many different dies, the 3D
chip is more complex than a 2D chip in the detection process, and
therefore new issues and challenges are raised in the detection
process. Neither the formation of bumps in the 3D structure nor the
TSV conductivity performance can be checked using a probe. TSV
techniques include via drilling, via wall
isolation, and via filling, and usually require processes such as a
thinning treatment and polishing, and TSV defects are readily
generated in these processes. Examples include a pinhole in the TSV
insulator or a void defect generated in a filling process.
Moreover, since bumps need to be stacked in the 3D structure in the
process of 3D chip bonding and stacking, the resulting
stress readily generates bump cracks. Moreover, in the stacking of
3D chips, the TSVs and the metal bumps carrying signals between
each layer of chips need to be accurately aligned, and if accurate
alignment and bonding cannot be achieved in 3D chips stacked from
a plurality of chip layers, then signal transmission is distorted.
The issues above, such as defects, bump cracks, or misaligned TSVs,
all affect the performance of the device and the overall system,
and can also cause failure of the entire system chip or system
package, and therefore the yield of 3D chip production is affected.
Therefore, effectively detecting whether the TSV is capable of
normal operation is an important object in 3D
integrated circuit design. Currently, the testing process for 3D
chips can be divided into a pre-bond test, a mid-bond test (partial
stack testing), and a post-bond test, wherein the post-bond test
includes a partial stacking test, a TSV complete stacking test, and
a TSV redistribution layer (RDL) test. Failures caused by short
circuits, open circuits, and defects need to be considered in the
pre-bond detection of the TSV, and circuit failures caused by, for
instance, defects generated in a post-bond process need to be
detected by a post-bond detection technique. The number of TSVs is
directly related to the reliability of the 3D chip. An excessive
number of TSVs readily causes transmission signals to interfere with
one another. Moreover, the area of each TSV plus the area of the
surrounding Design-For-Test circuit often prevents the overall chip
layout area from being optimized, such that costs are increased and
function is reduced.
For the test of TSV channel function, in addition to the test of
each single die chip, a testing method for the signals crossing
between each layer of chips also needs to be developed. Currently,
no specific standard method exists for testing TSV channel function
in a stacked die 3D chip. However, in response to the large
demand for defect detection of advanced ICs and for product quality
and reliability, in addition to achieving the
effectiveness of basic detection function, the testing theory needs
to be improved and the structure of the testing process needs to be
enhanced to reduce the costs of detection and of the chip
manufacturing process. Accordingly, the development of a TSV testing
technique is one of the important factors for a breakthrough in the
adoption of stacked die 3D applications.
For stacked die 3D chip test, dividing the circuit into a plurality
of layers makes testing difficult. For instance, each layer lacks a
complete testing function, and an effective method to perform
functional test for single-layer and multilayer TSVs still has not
been found. To address this issue, design-for-test methods such as a
logic built-in self-test (BIST) or a scan circuit are often applied
in stacked die 3D chip test, wherein the connection testing of
interlayer TSVs of the stacked die 3D chip is a great challenge. For
instance, when the circuit is divided into a plurality of layers, a
single chip itself does not have external input/output contacts, and
electrostatic discharge (ESD) protection is also absent, and
therefore many difficulties arise in test. Since each layer of chips
contains many TSVs, in particular certain chips with higher bandwidth
requirements, the number of interlayer TSVs is large and the
transmission path is complex. Therefore, how to generate a defect
model of the TSV and an effective TSV testing method to ensure the
normal operation of the TSVs is also a major issue in TSV test that
needs to be solved. Moreover, chips interconnected by a plurality of
TSV layers require a grinding treatment, but defects of the chips
caused during the grinding treatment are not readily observed, and
therefore chips having defects are often carried into a subsequent
stacking treatment by mistake. Based on the above, an interlayer
connection structure test that further explores the relationship
between signal transmission quality and defects in a multilayer
connection structure is an important research topic for improving
the function of a 3D chip integrated system.
On the other hand, the daisy chain multilayer structure is often
applied to test the interconnection reliability of the vertical via
(or TSV) structures. Failures are easily measured when the daisy
chain structure becomes an open or short circuit, but precise
locations of the cracked vias are difficult to detect. The testing
of vertical vias is important, as a single irreparable vertical via
can cause an entire multilayer system to fail. It is important to
understand how this failure mechanism forms and to adopt
strategies for identifying and avoiding it during the
manufacturing process.
SUMMARY OF THE INVENTION
The invention provides a defect detection method for a 3D chip (or
a multilayer daisy chain structure) and a system using the same
that can classify a 3D chip (or a multilayer daisy chain structure)
based on different defect types without designing an additional
Design-For-Test circuit and without preparing samples for
observing the presence of defects.
An embodiment of the invention provides a defect detection method
for a single die 3D chip and a stacked die 3D chip, the method
including: generating a plurality of physical models having a
defect of at least one defect type based on the at least one defect
type of a 3D chip; generating a group of training samples for each
of the physical models to form a training sample group and dividing
the training sample group into a training set and a testing set;
obtaining a group of scattering parameter values corresponding to
the training set via the training set; generating a classifier
model via the group of scattering parameter values; and measuring
an error value of the classifier model using the testing set and
using the classifier model as a defect model of the at least one
defect type based on the error value to determine that a Through
Silicon Via (TSV) of a single die 3D chip or a stacked die 3D chip
has a defect corresponding to the at least one defect type.
An embodiment of the invention provides a defect detection system
for a single die 3D chip and a stacked die 3D chip, the system
including: a physical model generating module, a training sample
generating module, a scattering parameter value acquisition module,
a classifier model generating module, and a defect model generating
module. The physical model generating module generates a plurality
of physical models having a defect of at least one defect type
based on the at least one defect type of a 3D chip. The training
sample generating module generates a group of training samples
corresponding to a physical model for each of the physical models
to form a training sample group and divides the training sample group
into a training set and a testing set. The scattering parameter
value acquisition module obtains a group of scattering parameter
values corresponding to the training set via the training set. The
classifier model generating module generates a classifier model via
the group of scattering parameter values. The defect model
generating module measures an error value of the classifier model
using the testing set and uses the classifier model as a defect
model of the at least one defect type based on the error value to
determine that a TSV of a single die 3D chip or a stacked die 3D
chip has a defect corresponding to the at least one defect
type.
An embodiment of the invention provides a defect detection method
for a multilayer daisy chain structure, the method including:
generating a plurality of physical models having a defect of at
least one defect type based on the at least one defect type of a
daisy chain structure; generating a group of training samples for
each of the physical models to form a training sample group and
dividing the training sample group into a training set and a
testing set; obtaining a group of scattering parameter values
corresponding to the training set via the training set; generating
a classifier model via the group of scattering parameter values;
measuring an error value of the classifier model using the testing set
and using the classifier model as a defect model of the at least
one defect type based on the error value; and inputting, to the
classifier model, at least one scattering parameter value of the
multilayer daisy chain structure to be detected so as to determine
that the multilayer daisy chain structure has a determined defect
corresponding to the at least one defect type, wherein the
multilayer daisy chain structure has had the determined defect
corresponding to the at least one defect type since being
manufactured; and all steps of the defect detection method are
implemented by a processor.
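The flow of this method can be sketched end to end in a few lines. The following is a minimal sketch under stated assumptions, not the implementation of the invention: the function `simulate_s_parameters` is a hypothetical stand-in for a field-solver run, the nominal dB values in `NOMINAL` are made-up placeholders rather than simulated or measured results, a 1-nearest-neighbor rule stands in for the classifier, and the threshold `MAX_ERROR` is an arbitrary illustrative choice.

```python
import math
import random

# Hypothetical nominal (|S11| dB, |S21| dB) per defect type. Placeholders only,
# not simulation or measurement results.
NOMINAL = {"normal": (-25.0, -0.5), "open": (-1.0, -40.0), "short": (-3.0, -20.0)}

def simulate_s_parameters(defect_type, rng):
    """Stand-in for a solver run: nominal values plus small random variation."""
    s11, s21 = NOMINAL[defect_type]
    return (s11 + rng.gauss(0, 0.3), s21 + rng.gauss(0, 0.3))

def build_samples(per_type, rng):
    """Generate labeled training samples for every physical model (defect type)."""
    return [(simulate_s_parameters(t, rng), t) for t in NOMINAL for _ in range(per_type)]

def classify(training_set, x):
    """1-nearest-neighbor rule: return the label of the closest training sample."""
    return min(training_set, key=lambda s: math.dist(s[0], x))[1]

def error_value(training_set, testing_set):
    """Fraction of testing samples that the classifier labels incorrectly."""
    wrong = sum(classify(training_set, x) != label for x, label in testing_set)
    return wrong / len(testing_set)

rng = random.Random(42)
samples = build_samples(30, rng)
rng.shuffle(samples)
split = int(0.8 * len(samples))
training_set, testing_set = samples[:split], samples[split:]

err = error_value(training_set, testing_set)
MAX_ERROR = 0.05  # illustrative acceptance threshold (assumption)

# Use the classifier as the defect model only if the error value is acceptable,
# then classify a hypothetical measurement of the structure under test.
measured = (-1.2, -39.0)
result = classify(training_set, measured) if err <= MAX_ERROR else None
print(err, result)
```

In practice the stand-in generator would be replaced by the scattering parameter values obtained for each physical model, and the measured tuple by actual measurements of the structure to be detected.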
An embodiment of the invention provides a defect detection system
for a multilayer daisy chain structure, the system including: a
processor configured to control: a physical model generating
module, a training sample generating module, a scattering parameter
value acquisition module, a classifier model generating module, and
a defect model generating module. The physical model generating
module generates a plurality of physical models having a defect of
at least one defect type based on the at least one defect type of a
daisy chain structure. The training sample generating module
generates a group of training samples corresponding to a physical
model for each of the physical models to form a training sample group
and divides the training sample group into a training set and a
testing set. The scattering parameter value acquisition module
obtains a group of scattering parameter values corresponding to the
training set via the training set. The classifier model generating
module generates a classifier model via the group of scattering
parameter values. The defect model generating module measures an
error value of the classifier model using the testing set, uses the
classifier model as a defect model of the at least one defect type
based on the error value, and inputs, to the classifier model, at
least one scattering parameter value of the multilayer daisy chain
structure to be detected so as to determine that the multilayer
daisy chain structure has a determined defect corresponding to the
at least one defect type, wherein the multilayer daisy chain
structure has had the determined defect corresponding to
the at least one defect type since being manufactured.
Based on the above, the invention provides a defect detection
method for a 3D chip and a system using the same that can improve
the defect detection process for a stacked
die 3D chip using a machine learning technique. Accordingly, in the
invention, a 3D chip can be classified based on different defect
types without designing an additional Design-For-Test circuit
and without preparing samples for observing the presence of
defects.
In order to make the aforementioned features and advantages of the
disclosure more comprehensible, embodiments accompanied with
figures are described in detail below.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings are included to provide a further
understanding of the invention, and are incorporated in and
constitute a part of this specification. The drawings illustrate
embodiments of the invention and, together with the description,
serve to explain the principles of the invention.
FIG. 1 shows a flowchart of a machine-learning technique-based 3D
chip TSV defect detection according to an embodiment of the
invention.
FIG. 2 shows a cross-sectional structure schematic diagram of a
single-layer cylindrical TSV.
FIGS. 3A, 3B, 3C, and 3D show schematic diagrams of common TSV
defects.
FIGS. 4A, 4B, 4C, and 4D show schematic diagrams of common TSV
defects of stacked die 3D chips.
FIG. 5 shows a defect detection system for a single die 3D chip and
a stacked die 3D chip according to an embodiment of the
invention.
FIG. 6 shows a schematic diagram of a K nearest neighbor
algorithm.
FIG. 7 shows a schematic diagram of a random forest algorithm.
FIGS. 8A and 8B respectively show frequency response graphs of S11
and S21 parameters of HFSS simulation of single die 3D chip TSV
void defect percentages according to an embodiment of the
invention.
FIGS. 8C and 8D respectively show frequency response graphs of S11
and S21 parameters of HFSS simulation of stacked die 3D chip TSV
void defect percentages according to an embodiment of the
invention.
FIG. 9 shows a schematic diagram of a performance comparison of
KNN and RF classifier models for stacked die 3D chip void
detection according to an embodiment of the invention.
FIG. 10 shows a defect detection method for a single die 3D chip
and a stacked die 3D chip according to an embodiment of the
invention.
FIG. 11 shows schematic diagrams of common defects of a multilayer
daisy chain structure.
FIG. 12 shows a defect detection system for a multilayer daisy
chain structure according to an embodiment of the invention.
FIG. 13 shows a defect detection method for a multilayer daisy
chain structure according to an embodiment of the invention.
DESCRIPTION OF THE EMBODIMENTS
Performing signal transmission using Through Silicon Via (TSV) in
the vertical direction is a key technique for achieving vertical
interconnection in a three-dimensional stacked integrated
circuit (3D-SIC), i.e., 3D chip. Therefore, a complete 3D-SIC
detection method and an effective detection design tool are
indispensable. In the invention, the defect detection and
detection process of the 3D-SIC are studied using a machine
learning algorithm to provide a corresponding detection
structure design solution for a 3D-SIC application. On the other
hand, in a 3D structure, whether stacked dies or a multilayer printed
circuit board (PCB), the daisy chain provides engineers with a
simple and cost-effective testing mechanism to glean valuable data
about process failures. Although failures are easily measured when
an open circuit or a short circuit occurs in the daisy chain, the
precise location at which the fault occurred is difficult to
detect.
As the size of transistors continues to shrink, nanometer-scale
defects become more difficult to detect. Such defects degrade the
performance of the device, distort signal transmission, and can
cause failure of the 3D chip, such that the reliability of the 3D
chip is affected. The concept of the invention is to perform an
electrical function simulation for a TSV defect model of 3D chips,
and then build a classifier model related to the TSV defect via a
machine learning algorithm by analyzing relevant simulation or
quantitative data collected for different TSV defect types, so as to
generate a detection and identification method for TSV defect types.
Simulation data of defect samples is used to train a machine
learning algorithm, and the data characteristics of the various
types of TSV defects are selected and analyzed to generate a
classifier model of the various TSV defects. The classifier model
classifies a TSV to be detected based on whether the TSV state
matches the defect characteristics, thereby detecting the TSV defect
state. The motivation for developing such a method is the early
diagnosis, prevention, and treatment of TSV defects and failures,
so that even the system function can be determined beforehand. As a
result, prediction and state evaluation of TSV defects are achieved
rapidly and accurately, and detection cost reduction and increased
yield of chip stacking for a future system can be achieved.
FIG. 1 shows a flowchart of a machine-learning technique-based 3D
chip TSV defect detection according to an embodiment of the
invention. When a machine learning technique is implemented, a
group of training samples, each formed by an input object (a vector
of classification characteristics) and an expected output value
(also referred to as a label), is first generated. Next, the
training samples are analyzed by a learning algorithm to generate
an inferred function. When the output of the inferred function is
discrete, the inferred function can be referred to as a classifier,
and when the output is continuous, the inferred function can be
referred to as a regression function.
Referring to FIG. 1, in step 1, data is collected based on the TSV
defect state of the 3D chip, and then data generated by various
potentially defective TSVs and the data of normal TSVs are inputted
into a machine learning algorithm to perform training of the
classifier model for classification. Then, in step 2, a trained
classifier model is generated. Lastly, in step 3, the accuracy of
TSV defect detection is verified using the trained classifier
model.
The classifier model training algorithm used by the machine
learning algorithm affects the accuracy of detection of the
generated classifier model, and the classification performance is
also affected by the classification characteristics of the selected
data. If the selected classification characteristics can classify a
group to be classified well, then the best classification
performance is achieved. Moreover, the selected classification
characteristics all need to be quantified into numeric values for
input. The selection of the classification model training algorithm
needs to adopt different classification model training algorithms
based on different selected characteristics data, and the use of
different classification model training algorithms can also affect
the performance of classification. Since the data characteristics
used in classifier model training and the algorithm used are
predicted to have considerable correlation with the accuracy of
detection, the 3D chip TSV defect detection method based on a
machine learning technique provided by the invention is mainly
related to the selection of the characteristic data of the TSV
defect state and the selection of the classification model
training algorithm.
FIG. 2 shows a cross-sectional structure schematic diagram of a
single-layer cylindrical TSV. The cylindrical TSV is the most
common TSV structure. The cross-sectional schematic diagram of
FIG. 2 is shown in a ground-signal-ground (GSG) configuration. In
the structure, through holes
are etched on a silicon substrate and metal is filled to form a
TSV, and a silicon dioxide is formed around the TSV to isolate the
silicon substrate. The TSV and the metal connect are connected by a
bump, and the bump is generally the same type as the TSV filler
metal. The resulting TSV product often has different types of
defects, and FIGS. 3A, 3B, 3C, and 3D show schematic diagrams of
common TSV defects. In the TSV manufacturing process, several types
of defects often occur. For instance, FIG. 3A shows the generation
of void defects due to incomplete deposition and filling of through
hole metal, and FIG. 3B shows the sidewall insulator of the TSV has
pinhole defects due to an effect from process parameters, such that
the TSV and the silicon substrate are connected to form wrong
paths, such that circuit failure occurs. The two defects above
readily cause leakage fault. Another case is that metal breakage or
void occurs during metal filling of the TSV, such as the bump crack
shown in FIG. 3C and the open defect shown in FIG. 3D. Such defects
result in the partial or complete blockage of the signal channel
passing through the TSV, thus causing circuit failure. The circuit
failure caused by such defects is referred to as open fault failure
of the TSV.
In addition to the generation of defects of the TSV of the single
die 3D chip during the manufacturing process, other types of
defects also occur when the chips of the single die 3D chip TSV are
stacked into stacked die 3D chip TSV. FIGS. 4A, 4B, 4C, and 4D show
schematic diagrams of common TSV defects of stacked die 3D chips.
Stacking TSV having defects and bumps results in failure or reduced
function of the entire 3D stacked chips. When the chips are
stacked, as the number of stacked chips is increased, the failure
of the TSV causes the yield loss of the chips to increase
exponentially. Therefore, to reduce stacking yield loss, a pre-bond
test needs to be performed on the TSV before the chips are stacked
to exclude defective TSV. Next, to increase the stacking yield of
the chips, a post-bond test needs to be performed on the TSV after
the chips are stacked. During the multilayer stacking of the TSV of
the 3D chips, a bump defect may be generated. Common bump defects
are, for instance, the bump crack shown in FIG. 4A, the bump
misalignment shown in FIG. 4B, the bump open shown in FIG. 4C, and
the bump short fault shown in FIG. 4D.
FIG. 5 shows a defect detection system 500 for a single die 3D chip
and a stacked die 3D chip according to an embodiment of the
invention. The defect detection system 500 can include a physical
model generating module 501, a training sample generating module
503, a scattering parameter value acquisition module 505, a
classifier model generating module 507, and a defect model
generating module 509. The physical model generating module 501,
the training sample generating module 503, the scattering parameter
value acquisition module 505, the classifier model generating
module 507, and the defect model generating module 509 can be, for
instance, disposed in a computer to execute the method disclosed by
the invention, and the invention is not limited in this regard.
The physical model generating module 501 can generate a plurality
of physical models having a defect of at least one defect type
based on the at least one defect type of a 3D chip. Specifically,
the physical model generating module 501 can generate a TSV
physical model having the defect of the defect type shown in FIGS.
3A to 3D and FIGS. 4A to 4D using a 3D full-wave solver such as a
high-frequency structure simulator (HFSS). The physical model
generating module 501 can generate a plurality of physical models
corresponding to a single defect type, and can also generate a
plurality of physical models corresponding to a plurality of
defect types.
In the case of the void defects and pinhole defects of a single die
3D chip TSV, if the user is to detect whether the single die 3D
chip has void defects or pinhole defects, or is to further detect
the degree to which the void defects or pinhole defects of the TSV
in the single die 3D chip occur, then the user can generate a
plurality of physical models having different degrees of void
defects and pinhole defects using HFSS, for example by generating a
plurality of physical models of the TSV respectively for 0%, 5%,
. . . , 60% of void defects and a plurality of physical models of
the TSV respectively for 0%, 5%, . . . , 60% of pinhole defects. In
the case of a void defect of the TSV, the degree to
which the void defect occurs can be determined by the following
formula:
$$e = \frac{V_{\mathrm{void}}}{V_{\mathrm{tsv}}}$$
wherein $e$ is the void defect ratio of the TSV,
$V_{\mathrm{void}}$ is the volume of the void, and
$V_{\mathrm{tsv}}$ is the volume of the TSV.
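As a minimal illustration of this ratio, the following sketch assumes that the void defect ratio is the void volume divided by the TSV volume and that the TSV is an idealized cylinder. The helper names and the dimensions are hypothetical and are not taken from the specification.

```python
import math

def cylinder_volume(diameter_um, depth_um):
    """Volume of an idealized cylindrical TSV in cubic micrometers."""
    radius = diameter_um / 2.0
    return math.pi * radius ** 2 * depth_um

def void_defect_ratio(v_void, v_tsv):
    """Void defect ratio e = V_void / V_tsv (0.0 means no void)."""
    if v_tsv <= 0:
        raise ValueError("TSV volume must be positive")
    if not 0 <= v_void <= v_tsv:
        raise ValueError("void volume must lie between 0 and the TSV volume")
    return v_void / v_tsv

# Hypothetical dimensions: 3 um diameter, 30 um depth,
# with a void occupying 5% of the TSV volume.
v_tsv = cylinder_volume(3.0, 30.0)
e = void_defect_ratio(0.05 * v_tsv, v_tsv)
print(f"void defect ratio: {e:.0%}")
```

A ratio computed this way can serve as the label of a physical model, for example one of the 0%, 5%, . . . , 60% void levels mentioned above.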
In an embodiment of the invention, a corresponding classifier model
can lastly be generated based on the plurality of physical models
to determine whether the 3D chips have the same type of defect as
the physical models, or the degree to which the defect occurs. It
should be mentioned that, when the physical models are generated,
the user can generate corresponding physical models for a single
defect type and can also generate corresponding physical models for
a plurality of defect types. Moreover, the defect type of the TSV
is also not limited to the defect types shown in FIGS. 3A to 3D and
FIG. 4A to FIG. 4D, and the user can decide what defect type of the
TSV the generated physical models have based on actual requirement,
and the invention is not limited in this regard. Moreover, the
physical models can have the configuration of a single die 3D chip
and can also have the configuration of a stacked die 3D chip.
When the physical model of the TSV corresponding to a specific
defect type is generated using HFSS, the physical model generating
module 501 can use HFSS and generate the physical model based on
the structure parameter group of HFSS, wherein the structure
parameter group can include one or a plurality of different
structural parameters. The structure parameter group can include at
least one parameter of a TSV diameter, a TSV pitch, a TSV depth, a
TSV aspect ratio, an underfill height, a relative permittivity of
an underfill, a bonding overlay accuracy, a contact pitch when
thermocompression is used, a contact pitch when a solder bump is
used, a bump diameter, a bump height, a number of die per stack, a
bottom silicon dioxide thickness, and a TSV insulator thickness.
The TSV can be designed and implemented with different filler
materials, such as copper and tungsten, and different structural
parameters,
and therefore the physical model of the TSV can also be generated
by collecting various types of corresponding signal transmission
characteristic data based on different transmission characteristic
requirements. Moreover, the selection of the structural parameters
can be based on the specification recommended by the ITRS
3D-SIC/3D-SOC Roadmap. Table 1 shows the recommended settings of
several structural parameter values. When the physical model of the
TSV corresponding to a specific defect type is generated using
HFSS, the user can decide how to set the structural parameters of
the HFSS based on actual requirement, and the invention is not
limited in this regard.
TABLE 1

| 2015-2018 TSV interconnect level | Global layer | Intermediate layer |
|---|---|---|
| Minimum TSV diameter | 2 µm to 3.5 µm | 0.5 µm to 2 µm |
| Minimum TSV pitch | 4 µm to 7 µm | 1 µm to 4 µm |
| Minimum TSV depth | 30 µm to 50 µm | 5 µm to 20 µm |
| Minimum TSV aspect ratio | 12:1-20:1 | 5:1-20:1 |
| Bonding overlay accuracy | 0.5 µm to 1.0 µm | 0.5 µm to 1.0 µm |
| Minimum contact pitch (thermocompression) | 5 µm | 2 µm to 3 µm |
| Minimum contact pitch (solder bump) | 10 µm | 2 µm to 3 µm |
| Number of die per stack | 2 to 8 | 8 to 16 (dynamic random access memory, or DRAM) |
After the physical model generating module 501 generates a
plurality of physical models, the training sample generating module
503 can generate a group of training samples for each of the
physical models to form a training sample group and divide the
training sample group into a training set t1 and a testing set t2.
Specifically, the training sample generating module 503 can combine
the group of training samples generated for each physical model
into a training sample group covering a plurality of types of
physical models, and can divide the training sample group into an m
number
of subsets, wherein an (m-1) number of the subsets are assigned as
the training set t1 and the remaining 1 of the subsets is assigned
as the testing set t2, and m can be any positive integer.
For instance, in the case that the training sample generating
module 503 respectively generates 1000 training samples for
physical models of TSVs having 0% void defects and 5% void defects,
the training sample generating module 503 can set the value of m to
10, i.e., divide 2000 training samples into 10 equal portions,
wherein 1800 training samples can be assigned as the training set
t1, and the remaining 200 training samples can be assigned as the
testing set t2. It should be mentioned that, the training sample is
generated by the training sample generating module 503 based on the
same structure parameter group as the corresponding physical model,
and therefore each training sample belonging to the same physical
model has the same structure parameter group. For instance, in the
case that the training sample generating module 503 generates 1000
training samples for the physical model of the TSV having 5% void
defects, the 1000 training samples are all TSV physical models
having 5% void defects.
The scattering parameter value acquisition module 505 can obtain a
group of scattering parameter (S parameter) values corresponding to
the training set t1 via the training set t1. Specifically, the
scattering parameter value acquisition module 505 can perform a
finite element method (FEM) on the training samples in the training
set t1 to extract the S parameter values of each training sample to
obtain a group of scattering parameter values corresponding to the
training set t1, and the S parameter values can be, for instance,
scattering parameters such as S11 parameters or S21 parameters.
After the scattering parameter value acquisition module 505 obtains
a group of scattering parameter values corresponding to the
training set t1, the classifier model generating module 507 can
generate a classifier model cm via the group of scattering
parameter values. Specifically, the classifier model generating
module 507 can use a machine learning algorithm for the group of
scattering parameter values to generate the classifier model cm
corresponding to the group of scattering parameter values. The
machine learning algorithm can be, for instance, K nearest neighbor
(KNN) algorithm or random forest (RF) algorithm, and can also be,
for instance, various types of machine learning algorithms
supporting a vector machine, a decision tree, or regression
analysis. Moreover, although in the present embodiment, a
classifier model is generated based on scattering parameter values,
the user can also generate the classifier model using different
parameters based on actual requirement, and the invention is not
limited in this regard. For instance, the user can also obtain an
impedance value of the HFSS structural parameters from the training
samples formed by the HFSS structure parameter group and extract a
phase angle with the impedance value to generate a corresponding
classifier model using the phase angle.
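As a brief sketch of the phase angle alternative, the phase of a complex impedance value can be computed as shown below. The impedance values are hypothetical stand-ins for simulated data, and the function name is illustrative.

```python
import cmath
import math

def impedance_phase_deg(z):
    """Phase angle of a complex impedance, in degrees."""
    return math.degrees(cmath.phase(z))

# Hypothetical impedance values in ohms: purely resistive,
# resistive-capacitive, and purely capacitive.
z_samples = [complex(50, 0), complex(50, -50), complex(0, -10)]
phases = [impedance_phase_deg(z) for z in z_samples]
print(phases)
```

The resulting phase angles could then be used as the input values of the classifier model in place of the S parameter values.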
FIG. 6 shows a schematic diagram of a K nearest neighbor algorithm.
The KNN algorithm is the most basic classification algorithm in
machine learning and can find a K number of training samples
nearest to the sample to be classified in the training set, and the
classification of the object to be classified is decided by the
type of the majority of training samples in the K number of nearest
neighbors. When K=1, the type of the object to be classified is the
type of the nearest training sample. When the KNN algorithm is
used, for each sample, the type of the sample can be represented by
the type of the K number of neighbors nearest to the sample. In
other words, if the majority of the K number of samples nearest to
a specific sample in the feature space is type 1, then the specific
sample is also type 1 and has the properties of the samples of
type 1.
As shown in FIG. 6, when K=3, in the 3 neighboring samples closest
to the object to be classified (the star in the middle), samples of
type B are the majority, and therefore the object to be classified
can be classified as type B. When K=5, in the 5 neighboring samples
closest to the object to be classified, samples of type A are the
majority, and therefore the object to be classified can be
classified as type A. When the difference between two data is
determined using KNN algorithm, a Euclidean distance is generally
adopted, and a smaller distance indicates a smaller difference
between the two data. Based on the above, the KNN algorithm finds
the nearest data to the data to be classified based on a currently
classified data set, and then determines or predicts the type of
the data to be classified based on the type of the nearest
data.
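The majority vote described above can be sketched as follows. This is a minimal sketch with hypothetical two-dimensional points chosen to reproduce the behavior of FIG. 6 (type B for K=3, type A for K=5); in the described system the coordinates would be scattering parameter values. Ties, which do not occur in this example, are resolved by the order in which labels are first counted.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(training, query, k):
    """Classify query by majority vote among its k nearest training samples.

    training is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled points around the object to be classified at (0, 0).
training = [
    ((1.0, 0.0), "B"), ((0.0, 1.2), "B"), ((3.0, 3.0), "B"),
    ((-1.5, 0.0), "A"), ((0.0, -2.0), "A"), ((2.5, 0.0), "A"),
]
query = (0.0, 0.0)
print(knn_classify(training, query, k=3))  # prints "B"
print(knn_classify(training, query, k=5))  # prints "A"
```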
In an embodiment of the invention, the classifier model generating
module 507 generates the classifier model cm using a KNN algorithm.
When the classifier model cm is generated, the classifier model
generating module 507 first presets a K value and generates the
classifier model cm corresponding to specific defect types based on
the preset K value and the training set t1, and the specific defect
types can be the defect types and the degrees to which the defect
occurs of each training sample in the training set t1. For
instance, if the training samples in the training set t1 have 2
different types of defects and each of the 2 types of defects has 5
different degrees, then the classifier model cm generated based on
the training set t1 can at most differentiate 2*5=10 different
defect types.
When a certain sample is classified using the classifier model cm,
the classifier model generating module 507 decides the defect type
corresponding to the sample to be classified using the K number of
training samples in the training set t1 closest to the sample to be
classified based on the K value. The method of measuring distance
can adopt the Euclidean distance above, and a smaller Euclidean
distance between the S parameters corresponding to two samples
indicates a smaller difference of two training samples. If each
sample is a coordinate point in the n-dimensional space, then the
discriminant function of the KNN method is as follows:
$$\mathrm{Classification}(T) = \min \lVert T - X_i^c \rVert$$
wherein $T$ is the sample to be classified, $X_i^c$ is the i-th
data with defect type c in the classified training sample (i.e.,
the training set t1), and i=1, 2, . . . , n (dimension of the
training sample). In
the KNN algorithm, the selection of the K value significantly
affects the result, and a large K value can reduce the effect of
noise on classification. The most preferred K value can generally
be selected using a cross validation method. In the present
specification, embodiments of cross validation are described
later.
FIG. 7 shows a schematic diagram of a random forest algorithm.
Random forest is a classifier containing a plurality of decision
trees as the basis and a combined classification model
$\{h(X, \theta_j);\ j = 1, \ldots\}$ formed by many decision tree
classification models, wherein $\{\theta_j\}$ is an independently
distributed random vector, and when X is inputted, each tree only
provides one vote for the most suitable classification. First, an n
number of subsets such as a subset D1, a subset D2, . . . , a
subset Dn shown in FIG. 7 are randomly chosen from an original
training data set D using bagging sample selection, and the sample
size of each subset is the same. Next, an n number of decision tree
models are respectively generated for an n number of subsets to
obtain an n number of classification results. Lastly, each record
is voted based on the n number of classification results to decide
the final classification. The final prediction result is decided by
the vote of the prediction result of each decision tree such that
the classification with the most votes is used as the final
prediction result.
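The bagging and voting steps can be sketched as follows. This is a simplified sketch, not a complete random forest: a one-level threshold rule on a single feature stands in for each decision tree, no random feature selection is performed, and the data values and function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_subsets(data, n, size, seed=0):
    """Draw n subsets of equal size from data, sampling with replacement (bagging)."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in range(size)] for _ in range(n)]

def train_stump(subset):
    """Stand-in for a decision tree: threshold halfway between the class means."""
    by_label = {}
    for x, label in subset:
        by_label.setdefault(label, []).append(x)
    means = sorted((sum(xs) / len(xs), label) for label, xs in by_label.items())
    low_label, high_label = means[0][1], means[-1][1]
    threshold = (means[0][0] + means[-1][0]) / 2.0
    return lambda x: low_label if x < threshold else high_label

def forest_predict(trees, x):
    """Each tree casts one vote; the label with the most votes wins."""
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical one-feature data set D: low values are normal, high values have voids.
data = [(0.1 * i, "normal") for i in range(10)] + [(2.0 + 0.1 * i, "void") for i in range(10)]
subsets = bootstrap_subsets(data, n=5, size=len(data))  # subsets D1 ... D5
trees = [train_stump(s) for s in subsets]
print(forest_predict(trees, 0.5), forest_predict(trees, 2.5))
```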
In an embodiment of the invention, the classifier model generating
module 507 generates the classifier model cm using a random forest
algorithm. When the classifier model cm is generated, the
classifier model generating module 507 first randomly selects an n
number of subsets in the training set t1, and the size of each
subset is the same, that is, the number of training samples in each
subset is the same. Next, the classifier model generating module
507 respectively generates corresponding decision trees for each
subset (i.e., a total of n number of decision trees), and then the
classifier model generating module 507 can generate the classifier
model cm corresponding to specific defect types via the n number of
decision trees, and the specific defect types can be the defect
types and the degrees to which the defect occurs of each training
sample in the training set t1. For instance, if the training
samples in the training set t1 have a total of 2 different types of
defects, and each of the 2 types of defects has 5 different
degrees, then the classifier model cm generated based on the
training set t1 can at most differentiate 2*5=10 different defect
types.
After the classifier model cm is generated, the defect model
generating module 509 can measure the error of the classifier model
cm using the testing set t2 to calculate an error value e1, and
decide based on the error value e1 whether to use the classifier
model cm as the defect model of the specific defect types to
determine whether the single die 3D chip or the stacked die 3D chip
has the defect corresponding to the above specific defect types.
Specifically, the defect model generating module 509 can input each
training sample in the testing set t2 into the classifier model cm,
and determines which types of defect each training sample
respectively has via the classifier model cm, or can further
determine the degrees of the defect in each training sample via the
classifier model cm. After each training sample in the testing set
t2 is classified by the classifier model cm, the defect model
generating module 509 can compare the classification result of each
training sample with the defect types or defect degrees actually
corresponding to the training sample and determine whether the
classification result of each training sample meets the defect
types actually corresponding to the training sample to measure the
error value e1 during the classification of the testing set t2
using the classifier model cm.
After the error value e1 is obtained, the defect model generating
module 509 can determine whether the classifier model cm is used as
a defect model (also referred to as a classifier) of specific
defect types based on the error value e1 to use the defect model to
determine whether the 3D chip has a defect of specific defect
types, wherein the specific defect types can be the defect types or
the degrees to which the defect occurs of each training sample in
the training set t1.
For instance, an error value e1 less than a preset threshold
indicates the accuracy of classifying the testing set t2 using the
classifier model cm reached the expected standard, and at this
point, the defect model generating module 509 can use the
classifier model cm as the defect model. After the defect model is
generated, whether the chip to be detected has a defect meeting the
specific defect type can be determined by the defect model just by
inputting the S parameter of a single die 3D chip or a stacked die
3D chip to be detected into the defect model.
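The acceptance decision can be sketched as follows. This is a minimal sketch that assumes the error value is the fraction of misclassified testing samples; the specification does not fix a particular error measure, and the labels, the threshold, and the function names are hypothetical.

```python
def error_value(predicted, actual):
    """Fraction of testing samples whose predicted defect type differs from the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

def accept_as_defect_model(e1, threshold):
    """Use the classifier model as the defect model only if the error is below the threshold."""
    return e1 < threshold

# Hypothetical classification results for five samples of the testing set t2.
actual = ["void_0", "void_5", "void_5", "void_0", "void_5"]
predicted = ["void_0", "void_5", "void_0", "void_0", "void_5"]
e1 = error_value(predicted, actual)
print(e1, accept_as_defect_model(e1, threshold=0.25))  # prints "0.2 True"
```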
To validate the accuracy of the defect model, the defect detection
system 500 can also repeatedly check the defect model using the
training samples generated by the training sample generating module
503. Specifically, after the defect model generating module 509
measures the error value e1 of the classifier model cm using the
testing set t2, the training sample generating module 503 can
select 1 subset from an m number of subsets of the training sample
again as a new testing set t2', and the remaining (m-1) number of
subsets are new training sets t1', wherein the subsets forming the
new testing set t2' and all of the subsets forming the previous
testing set (such as the testing set t2) are different. In other
words, the subsets that have been the testing set cannot be reused
as the new testing set.
After the training sample generating module 503 generates the new
training set t1' and the new testing set t2', the scattering
parameter value acquisition module 505 can obtain a group of
scattering parameter values corresponding to the training set t1'
via the training set t1'. Next, the classifier model generating
module 507 can generate a new classifier model cm' via the group of
scattering parameter values. After the classifier model generating
module 507 generates the new classifier model cm', the defect model
generating module 509 can measure the new classifier model cm'
using the new testing set t2' to obtain a new error value e2. The
detailed methods of obtaining a group of scattering parameter
values, generating a classifier model, and measuring and obtaining
an error value are all disclosed in the embodiments above and are
therefore not repeated herein.
After the steps above, the classifier model generating module 507
can respectively generate the classifier model cm and the
classifier model cm' using the same classifier model generating
method and using the same training sample but different training
sets t1 and t1', and the defect model generating module 509 can
respectively measure the error values e1 and e2 of the classifier
model cm and the classifier model cm' using the same training
sample but different testing sets t2 and t2'. In the present
embodiment, the defect detection system 500 can repeat the step of
generating a new classifier model and the step of measuring the new
error values (m-1) times until each subset in the m number of
subsets of the training sample has been the testing set.
Accordingly, the defect model generating module 509 can obtain a
total of an m number of error values.
After the defect model generating module 509 obtains the m number
of error values, the defect model generating module 509 can
calculate and measure the average error value of the m number of
error values, and the defect model generating module 509 can decide
whether to use the classifier model cm as the defect model of
specific defect types based on the average error value, wherein the
specific defect types can be the defect types or the degrees to
which the defect occurs of each training sample in the training set
t1. Specifically, after performing the step of generating a new
classifier model and the step of measuring the new error values
(m-1) times, the defect detection system 500 has generated the
remaining (m-1) number of classifier models using the same training
sample group and the same generating method as the classifier model
cm and has measured the error values (a total of m error values)
corresponding to each classifier model. If the average error value
calculated from an m number of error values is less than a preset
threshold, then the accuracy of the classifier model generated
using the same generating method as the classifier model cm can
reach the expected standard. Accordingly, the defect model
generating module 509 can decide to use the classifier model cm as
the defect model based on a low average error value.
Alternatively, the defect model generating module 509 can also
select a classifier model (such as the classifier model cm')
generated by the same generating method as the classifier model cm
as the defect model.
The method above of using the same training sample group to
generate and check the classifier model is referred to as m-fold
cross validation and has the following advantages. First, the
method can obtain as much effective information as possible from
limited learning data. Next, the cross validation method can
validate the classifier model from a plurality of directions to
effectively avoid falling into local minimum. Moreover, the issue
of overfitting can be avoided to a certain degree.
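The m-fold procedure can be sketched as follows. This is a minimal sketch under stated assumptions: a one-nearest-neighbor rule on a single hypothetical feature stands in for the classifier model, the error value is taken as the fraction of misclassified testing samples, and all names and data values are illustrative.

```python
def nearest_neighbor_predict(training, x):
    """Stand-in classifier: label of the training sample closest to x."""
    return min(training, key=lambda item: abs(item[0] - x))[1]

def m_fold_cross_validation(samples, m):
    """Let each of the m subsets be the testing set once; return the m errors and their mean."""
    size = len(samples) // m
    subsets = [samples[i * size:(i + 1) * size] for i in range(m)]
    errors = []
    for test_index in range(m):
        testing = subsets[test_index]
        training = [s for i, sub in enumerate(subsets) if i != test_index for s in sub]
        wrong = sum(nearest_neighbor_predict(training, x) != label for x, label in testing)
        errors.append(wrong / len(testing))
    return errors, sum(errors) / len(errors)

# Hypothetical, well-separated one-feature samples, interleaved so that
# every subset contains both defect types.
samples = []
for i in range(20):
    samples.append((i * 0.01, "normal"))
    samples.append((1.0 + i * 0.01, "void"))

errors, average_error = m_fold_cross_validation(samples, m=4)
print(errors, average_error)  # prints "[0.0, 0.0, 0.0, 0.0] 0.0"
```

The average error returned here corresponds to the value that is compared against the preset threshold when deciding whether to use the classifier model as the defect model.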
The results of the HFSS simulation of the S11 parameter and the S21
parameter frequency responses for different signal transmission
frequencies and 3D chip TSV void defect percentages are as shown in
FIGS. 8A and 8B and FIGS. 8C and 8D. First, FIGS. 8A and 8B
respectively show frequency response graphs of S11 and S21
parameters of HFSS simulation of single die 3D chip TSV void
percentages according to an embodiment of the invention. These
graphs reflect the frequency response of the single die 3D chip TSV
void defect percentage between the frequencies of 200 MHz and 20
GHz. The S11 parameter is used to measure the return loss of the
signal, and a lower S11 parameter value indicates a lower return
loss of the signal. The presence of void defects causes the
incident RF signal to be absorbed or scattered. As shown in FIG.
8A, the frequency response curve (curve labeled by squares) of a
single die 3D chip representing a void defect percentage of 0 has
the minimum S11 parameter value. Moreover, it can be known from
FIG. 8A that fewer void defects present in the TSV indicate
smaller return loss of the single die 3D chip. The S21 parameter is
used to measure the insertion loss of the signal, and a higher S21
parameter value indicates a smaller insertion loss of the signal.
As shown in FIG. 8B, the frequency response curve (curve labeled by
squares) of a single die 3D chip representing a void defect
percentage of 0 has the maximum S21 parameter value, and therefore
it can be known from FIG. 8B that fewer void defects present in
the TSV indicate smaller insertion loss of the single die 3D
chip.
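For reference, frequency response graphs of this kind are commonly plotted as S parameter magnitudes in decibels. The following sketch shows that conversion for a single frequency point; the complex values are hypothetical and are not taken from the simulation results of FIGS. 8A and 8B.

```python
import math

def magnitude_db(s):
    """Magnitude of a complex S parameter in decibels: 20*log10(|S|)."""
    mag = abs(s)
    if mag == 0:
        return float("-inf")
    return 20.0 * math.log10(mag)

# Hypothetical values at one frequency point.
s11 = complex(0.1, 0.0)  # small reflected fraction
s21 = complex(0.0, 0.9)  # large transmitted fraction
print(round(magnitude_db(s11), 2), round(magnitude_db(s21), 2))
```

Under this convention, a more negative S11 value means less of the signal is reflected, and an S21 value closer to 0 dB means less of the signal is lost in transmission.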
FIGS. 8C and 8D respectively show frequency response graphs of S11
and S21 parameters of HFSS simulation of stacked die 3D chip TSV
void percentages according to an embodiment of the invention. These
graphs reflect the frequency response of the stacked die 3D chip
TSV void defect percentage between the frequencies of 200 MHz and
20 GHz. Although FIG. 8C and FIG. 8D simulate the TSV void defect
of the stacked die 3D chip, in the present embodiment, the
simulation results of the single die 3D chip and the stacked die 3D
chip are substantially the same. As shown in FIG. 8C, the frequency
response curve (curve labeled by squares) of a stacked die 3D chip
representing a void defect percentage of 0 has the minimum S11
parameter value, and therefore it can be known from FIG. 8C that
fewer void defects present in the TSV indicate smaller return
loss of the stacked die 3D chip. Similarly, as shown in FIG. 8D,
the frequency response curve (curve labeled by squares) of a
stacked die 3D chip representing a void defect percentage of 0 has
the maximum S21 parameter value. Moreover, it can be known from
FIG. 8D that fewer void defects present in the TSV indicate
smaller insertion loss of the stacked die 3D chip.
FIG. 9 shows a schematic diagram of performance comparison of a
stacked die 3D chip void detection KNN and RF algorithms classifier
model according to an embodiment of the invention. In the
embodiment of FIG. 9, the performance of the KNN algorithm with K=5
is compared with that of the random forest algorithm. Since the TSV
is a device having capacitance characteristics, the TSV exhibits
different frequency responses for different frequency ranges. It
can be seen from FIGS. 8A to 8D that a higher frequency indicates
smaller
variance of the scattering parameters S11 and S21, which is caused
by reduced resistance of series capacitance triggered by
disconnection at a higher frequency.
The KNN and random forest (RF) algorithms are implemented with
m-fold cross validation to generate defect models and to evaluate
the performance of stacked die 3D chip void defect detection. The
results show that the RF classifier model can reach an accuracy of
80%. Since the KNN
algorithm is an ad-hoc classifier used to classify testing data
based on distance measurement, when the KNN algorithm decides the
classification, the KNN algorithm is only related to a very small
number of adjacent samples and adopts the same weights, and the
performance of the results of type differentiation of the currently
collected data characteristics is less than expected. For the
current training data, the KNN algorithm shows that an increase in
K value reduces the accuracy of classification. Specifically, the
random forest algorithm has the following features: can be
effectively performed on large data sets; can process input samples
having high-dimensional features without reducing dimension; can
evaluate the importance of each characteristic in classification;
and can obtain an unbiased estimate of internal generation errors
in the generation process. The experimental results also show that
the overall accuracy of the random forest algorithm is better than
the KNN algorithm. Moreover, the accuracy of the random forest
algorithm is increased with increased number of folds. Although the
accuracy is increased with the number of folds, the accuracy starts
to level at around 20 folds.
FIG. 10 shows a defect detection method for a single die 3D chip
and a stacked die 3D chip according to an embodiment of the
invention. This method is implemented by the defect detection
system 500 disclosed by the invention. In step S101, a plurality of
physical models having a defect of at least one defect type is
generated based on the at least one defect type of a 3D chip, and
then step S103 is performed. In step S103, a group of training
samples is generated for each of the physical models to form a
training sample group, and the training sample group is divided into an
m number of subsets, wherein an (m-1) number of the subsets are
assigned as the training set t1 and the remaining 1 of the subsets
is assigned as the testing set t2, and m can be any positive
integer. Next, step S105 is performed. In step S105, a group of
scattering parameter values corresponding to the training set t1 is
obtained via the training set t1. Next, step S107 is performed. In
step S107, the classifier model cm is generated via the group of
scattering parameter values. Next, step S109 is performed. In step
S109, the error value e1 of the classifier model cm is measured
using the testing set t2 and whether each subset in the m number of
subsets has been the testing set is determined. Next, step S111 is
performed.
In step S111, if the result is no, then there are still subsets in
the m number of subsets that have not been the testing set, and
then step S103 is repeated to reassign the (m-1) number of the
subsets as a new training set and assign the remaining 1 of the
subsets as a new testing set, wherein the subset forming the new
testing set and the subsets forming all of the previous testing
sets are different. If the result is yes, then each subset in the m
number of subsets has been the testing set. In other words, an m
number of classifier models is generated using an m number of
different training sets, and then step S113 is performed. In step
S113, the error values of the classifier model corresponding to the
testing set are measured using the testing set. It should be
mentioned that, the step of measuring the error values of each
classifier model can be performed at the same time after the m
number of classifier models are generated, and the error values of
the classifier model can also be instantly measured after each
classifier model is generated, and the invention is not limited in
this regard. Next, an average error value is calculated using an m
number of error values, and whether the classifier model cm is used
as a defect model of the at least one defect type is decided based
on the average error value, so as to determine whether a TSV of a
single die 3D chip or a stacked die 3D chip has a defect
corresponding to the at least one defect type.
FIG. 11 shows a schematic diagram of common defects of a multilayer
daisy chain structure 110. Taking a 3-layer PCB as an example, the
multilayer daisy chain structure 110 traverses the top layer (i.e.,
layer L3) and the bottom layer (i.e., layer L) of the PCB. In order
to analyze the electrical characteristics of the multilayer daisy
chain structure 110, a testing device can be coupled to, for
example, pin P1 and/or pin P2 of the multilayer daisy chain
structure 110. The testing device can easily detect defects (e.g.,
an open circuit or a short circuit) of the multilayer daisy chain
structure 110. However, the testing device cannot locate the precise
position at which the open circuit or short circuit occurs. For
example, if there is a short circuit at portion PA of the multilayer
daisy chain structure 110, the testing device can only determine
that there is a short circuit in the multilayer daisy chain
structure 110 by coupling to pin P1 and pin P2. The testing device
cannot determine that the short circuit is located at portion PA of
the multilayer daisy chain structure 110, or that the short circuit
is between bump 111 and bump 112. As another example, if there is an
open circuit at portion PB of the multilayer daisy chain structure
110, the testing device can only determine that there is an open
circuit in the multilayer daisy chain structure 110 by coupling to
pin P1 and pin P2. The testing device cannot determine that the open
circuit is located at portion PB of the multilayer daisy chain
structure 110, or that the open circuit is between TSV 113 and bump
114.
FIG. 12 shows a defect detection system 1200 for a multilayer daisy
chain structure according to an embodiment of the invention. The
defect detection system 1200 can include a physical model
generating module 121, a training sample generating module 123, a
scattering parameter value acquisition module 125, a classifier
model generating module 127, and a defect model generating module
129. The physical model generating module 121, the training sample
generating module 123, the scattering parameter value acquisition
module 125, the classifier model generating module 127, and the
defect model generating module 129 can be, for instance, disposed
in a computer to execute the method disclosed by the invention, and
the invention is not limited in this regard. The functions of the
physical model generating module 121, the training sample
generating module 123, the scattering parameter value acquisition
module 125, the classifier model generating module 127, and the
defect model generating module 129 are similar to the functions of
the physical model generating module 501, the training sample
generating module 503, the scattering parameter value acquisition
module 505, the classifier model generating module 507, and the
defect model generating module 509 respectively, except that the
defect detection system 500 is for a single die 3D chip or a
stacked die 3D chip while the defect detection system 1200 is for a
multilayer daisy chain structure.
The physical model generating module 121 can generate a plurality
of physical models having a defect of at least one defect type
based on the at least one defect type of a daisy chain structure.
Specifically, the physical model generating module 121 can generate
a physical model having the defect of the defect type shown in FIG.
11 using a 3D full-wave solver such as HFSS. The physical model
generating module 121 can generate, for a single defect type, a
plurality of physical models corresponding only to that defect type,
and can also generate, for a plurality of defect types, a plurality
of physical models corresponding to the plurality of defect types.
In an embodiment of the invention, a corresponding classifier model
can lastly be generated based on the plurality of physical models
to determine whether the multilayer daisy chain structure has the
same type of defect as the physical models, or the degree to which
the defect occurs. It should be mentioned that, when the physical
models are generated, the user can generate corresponding physical
models for a single defect type and can also generate corresponding
physical models for a plurality of defect types. Moreover, the
defect type of the multilayer daisy chain structure is also not
limited to the defect types shown in FIG. 11, and the user can
decide what defect type of the multilayer daisy chain structure the
generated physical models have based on actual requirement, and the
invention is not limited in this regard.
When the physical model corresponding to a specific defect type is
generated using HFSS, the physical model generating module 121 can
use HFSS and generate the physical model based on the structure
parameter group of HFSS, wherein the structure parameter group can
include one or a plurality of different structural parameters. The
structure parameter group can include at least one parameter of a
TSV diameter, a TSV height, a bump diameter, a bump height, a TSV
pitch, a loss tangent of silicon, and a conductivity of silicon, as
shown in Table 2.
TABLE 2
Parameter                  Value
TSV diameter               20 µm
TSV height                 81.5 µm
Bump diameter              50 µm
Bump height                9 µm
TSV pitch                  250 µm
Loss tangent of silicon    0
Conductivity of silicon    10 S/m
After the physical model generating module 121 generates a
plurality of physical models, the training sample generating module
123 can generate a group of training samples for each of the
physical models to form a training sample group and divide the
training sample group into a training set T1 and a testing set T2.
Specifically, the training sample generating module 123 can combine
the groups of training samples generated for each type of physical
model into a training sample group covering a plurality of types of
physical models, and can divide the training sample group into an m
number of subsets, wherein an (m-1) number of the subsets are
assigned as the training set T1 and the remaining 1 of the subsets
is assigned as the testing set T2, and m can be any positive
integer.
For instance, in the case that the training sample generating
module 123 respectively generates 1000 training samples for
physical models of multilayer daisy chain structure having short
circuit at portion PA and open circuit at portion PB as shown in
FIG. 11, the training sample generating module 123 can set the
value of m to 10, i.e., divide 2000 training samples into 10 equal
portions, wherein 1800 training samples can be assigned as the
training set T1, and the remaining 200 training samples can be
assigned as the testing set T2. It should be mentioned that, the
training sample is generated by the training sample generating
module 123 based on the same structure parameter group as the
corresponding physical model, and therefore each training sample
belonging to the same physical model has the same structure
parameter group. For instance, in the case that the training sample
generating module 123 generates 1000 training samples for the
physical model of the multilayer daisy chain structure having short
circuit at portion PA as shown in FIG. 11, the 1000 training
samples are all multilayer daisy chain structure physical models
having short circuit at portion PA.
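The division of 2000 training samples into 10 subsets described
above can be sketched as follows. This is a minimal illustration
rather than the implementation of the training sample generating
module 123; the function names, the shuffling step, the seed, and the
labels "short_PA" and "open_PB" are assumptions introduced here.

```python
import random

def split_into_folds(samples, m, seed=0):
    """Shuffle `samples` and divide them into m subsets of near-equal size."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment to subsets
    # Subset i takes every m-th element starting at index i.
    return [shuffled[i::m] for i in range(m)]

def train_test_for_fold(folds, test_index):
    """Use one subset as the testing set and the other m-1 subsets as the training set."""
    testing_set = folds[test_index]
    training_set = [s for i, fold in enumerate(folds) if i != test_index for s in fold]
    return training_set, testing_set

# 1000 samples for each of two defect models, tagged with hypothetical labels.
samples = [("short_PA", i) for i in range(1000)] + [("open_PB", i) for i in range(1000)]

folds = split_into_folds(samples, m=10)
T1, T2 = train_test_for_fold(folds, test_index=0)
print(len(T1), len(T2))  # 1800 200
```

Choosing a different test_index selects a different subset as the
testing set while the remaining nine subsets form the training set.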
The scattering parameter value acquisition module 125 can obtain a
group of scattering parameter (S parameter) values corresponding to
the training set T1 via the training set T1. Specifically, the
scattering parameter value acquisition module 125 can perform a
finite element method (FEM) analysis on the training samples in the
training set T1 to extract the S
parameter values of each training sample to obtain a group of
scattering parameter values corresponding to the training set T1,
and the S parameter values can be, for instance, scattering
parameters such as S11 parameters or S21 parameters. After the
scattering parameter value acquisition module 125 obtains a group
of scattering parameter values corresponding to the training set
T1, the classifier model generating module 127 can generate a
classifier model CM via the group of scattering parameter values.
Specifically, the classifier model generating module 127 can use a
machine learning algorithm for the group of scattering parameter
values to generate the classifier model CM corresponding to the
group of scattering parameter values. The machine learning
algorithm can be, for instance, a KNN algorithm or an RF algorithm,
and can also be, for instance, another type of machine learning
algorithm such as a support vector machine, a decision tree, or
regression analysis. Note that said machine learning algorithms
have been described in detail with reference to FIGS. 6-9, so the
details are not repeated herein. Moreover, although
in the present embodiment, a classifier model is generated based on
scattering parameter values, the user can also generate the
classifier model using different parameters based on actual
requirement, and the invention is not limited in this regard. For
instance, the user can also obtain an impedance value of the HFSS
structural parameters from the training samples formed by the HFSS
structure parameter group and extract a phase angle with the
impedance value to generate a corresponding classifier model using
the phase angle.
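As an illustration of extracting a phase angle from an impedance
value, the following minimal sketch converts a complex reflection
coefficient S11 into an input impedance and then takes its phase
angle. The embodiment does not specify how the impedance value is
obtained; the standard one-port relation Z = Z0(1 + S11)/(1 - S11),
the 50 ohm reference impedance Z0, the function names, and the sample
S11 value are all assumptions made for illustration.

```python
import cmath
import math

def input_impedance(s11, z0=50.0):
    """Convert a complex reflection coefficient S11 to an input impedance.

    Uses the standard one-port relation Z = Z0 * (1 + S11) / (1 - S11);
    z0 is an assumed reference impedance in ohms.
    """
    return z0 * (1 + s11) / (1 - s11)

def phase_angle_deg(z):
    """Phase angle of a complex impedance, in degrees."""
    return math.degrees(cmath.phase(z))

s11 = 0.5j                   # hypothetical S11 value at one frequency point
z = input_impedance(s11)     # 30 + 40j ohms for this input
angle = phase_angle_deg(z)   # about 53.13 degrees
print(z, angle)
```

A sequence of such phase angles over the swept frequency points
could then serve as the feature vector in place of the S parameter
values.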
After the classifier model CM is generated, the defect model
generating module 129 can measure the error of the classifier model
CM using the testing set T2 to calculate an error value E1 and
decide, based on the error value E1, whether to use the classifier
model CM as the defect model of the specific defect types to
determine whether the multilayer daisy chain structure has the
defect corresponding to the above specific defect types.
Specifically, the defect model generating module 129 can input each
training sample in the testing set T2 into the classifier model CM,
and determine which types of defect each training sample
respectively has via the
classifier model CM, wherein the defect type corresponds to at
least one of an open circuit, a short circuit, and a fault
occurring location. After each training sample in the testing set
T2 is classified by the classifier model CM, the defect model
generating module 129 can compare the classification result of each
training sample with the defect types actually corresponding to the
training sample and determine whether the classification result of
each training sample meets the defect types actually corresponding
to the training sample to measure the error value E1 during the
classification of the testing set T2 using the classifier model
CM.
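The comparison of classification results with the actual defect
types can be sketched as a misclassification rate. The embodiment
does not define the error value E1 numerically, so treating it as the
fraction of misclassified samples is an assumption, as are the
function names, the stand-in classifier, and the sample values.

```python
def error_value(classifier, testing_set):
    """Fraction of testing samples whose predicted defect type differs from the actual one."""
    wrong = sum(
        1 for features, actual in testing_set if classifier(features) != actual
    )
    return wrong / len(testing_set)

# Hypothetical stand-in for the classifier model CM: thresholds one made-up feature.
def toy_classifier(features):
    return "open" if features[0] < 0.5 else "short"

# Hypothetical testing set T2 of (features, actual defect type) pairs.
T2 = [
    ((0.10,), "open"),
    ((0.40,), "open"),
    ((0.60,), "short"),
    ((0.45,), "short"),  # will be misclassified as "open"
]

E1 = error_value(toy_classifier, T2)
print(E1)  # 0.25
```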
After the error value E1 is obtained, the defect model generating
module 129 can determine whether the classifier model CM is used as
a defect model (also referred to as a classifier) of specific
defect types based on the error value E1 to use the defect model to
determine whether the multilayer daisy chain structure has a defect
of specific defect types, wherein the specific defect types can be
the defect types to which the defect occurs of each training sample
in the training set T1.
For instance, an error value E1 less than a preset threshold
indicates the accuracy of classifying the testing set T2 using the
classifier model CM reached the expected standard, and at this
point, the defect model generating module 129 can use the
classifier model CM as the defect model. After the defect model is
generated, whether the multilayer daisy chain structure to be
detected has a defect meeting the specific defect type can be
determined by the defect model just by inputting the S parameter of
a multilayer daisy chain structure to be detected into the defect
model.
To validate the accuracy of the defect model, the defect detection
system 1200 can also repeatedly check the defect model using the
training samples generated by the training sample generating module
123. Specifically, after the defect model generating module 129
measures the error value E1 of the classifier model CM using the
testing set T2, the training sample generating module 123 can
select 1 subset from an m number of subsets of the training sample
again as a new testing set T2', and the remaining (m-1) number of
subsets are new training sets T1', wherein the subsets forming the
new testing set T2' and all of the subsets forming the previous
testing set (such as the testing set T2) are different. In other
words, the subsets that have been the testing set cannot be reused
as the new testing set.
After the training sample generating module 123 generates the new
training set T1' and the new testing set T2', the scattering
parameter value acquisition module 125 can obtain a group of
scattering parameter values corresponding to the training set T1'
via the training set T1'. Next, the classifier model generating
module 127 can generate a new classifier model CM' via the group of
scattering parameter values. After the classifier model generating
module 127 generates the new classifier model CM', the defect model
generating module 129 can measure the new classifier model CM'
using the new testing set T2' to obtain a new error value E2. The
detailed methods of obtaining a group of scattering parameter
values, generating a classifier model, and measuring and obtaining
an error value are all disclosed in the embodiments above and are
therefore not repeated herein.
After the steps above, the classifier model generating module 127
can respectively generate the classifier model CM and the
classifier model CM' using the same classifier model generating
method and using the same training sample but different training
sets T1 and T1', and the defect model generating module 129 can
respectively measure the error values E1 and E2 of the classifier
model CM and the classifier model CM' using the same training
sample but different testing sets T2 and T2'. In the present
embodiment, the defect detection system 1200 can repeat the step of
generating a new classifier model and the step of measuring the new
error values (m-1) times until each subset in the m number of
subsets of the training sample has been the testing set.
Accordingly, the defect model generating module 129 can obtain a
total of an m number of error values.
After the defect model generating module 129 obtains the m number
of error values, the defect model generating module 129 can
calculate and measure the average error value of the m number of
error values, and the defect model generating module 129 can decide
whether to use the classifier model CM as the defect model of
specific defect types based on the average error value, wherein the
specific defect types can be the defect types or the degrees to
which the defect occurs of each training sample in the training set
T1. Specifically, by performing the step of generating a new
classifier model and the step of measuring the new error values
(m-1) times, the defect detection system 1200 can generate the remaining
(m-1) number of classifier models using the same training sample
group and using the same generating method as the classifier model
CM and measure the error values (a total of m error values)
corresponding to each classifier model. If the average error value
calculated from an m number of error values is less than a preset
threshold, then the accuracy of the classifier model generated
using the same generating method as the classifier model CM can
reach the expected standard. Accordingly, the defect model
generating module 129 can decide to use the classifier model CM as
the defect model based on a low average error value. More
specifically, the defect model generating module 129 can also
select a classifier model (such as the classifier model CM')
generated by the same generating method as the classifier model CM
as the defect model.
The method above of using the same training sample group to
generate and check the classifier model is referred to as m-fold
cross validation and has the following advantages. First, the
method can obtain as much effective information as possible from
limited learning data. Next, the cross validation method can
validate the classifier model from a plurality of directions to
effectively avoid falling into local minimum. Moreover, the issue
of overfitting can be avoided to a certain degree.
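The m-fold cross validation procedure described above can be
summarized in a minimal sketch. It is not the implementation of the
defect detection system 1200: the nearest-centroid learner stands in
for the KNN or RF algorithm, and the one-dimensional features, the
labels, the seed, and the threshold value are hypothetical values
chosen only so that the example runs on its own.

```python
import random
from statistics import mean

def train_centroid_classifier(training_set):
    """Stand-in learner: store the mean feature value of each defect type."""
    by_label = {}
    for x, label in training_set:
        by_label.setdefault(label, []).append(x)
    centroids = {label: mean(xs) for label, xs in by_label.items()}
    # The returned model predicts the label whose centroid is closest to x.
    return lambda x: min(centroids, key=lambda label: abs(x - centroids[label]))

def m_fold_cross_validation(samples, m, train, seed=0):
    """Return m error values, letting each subset be the testing set exactly once."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::m] for i in range(m)]
    errors = []
    for k in range(m):
        testing_set = folds[k]
        training_set = [s for i, f in enumerate(folds) if i != k for s in f]
        model = train(training_set)  # a new classifier model per training set
        wrong = sum(1 for x, label in testing_set if model(x) != label)
        errors.append(wrong / len(testing_set))
    return errors

# Hypothetical, well-separated 1-D features for two defect types.
samples = [(0.2 + 0.001 * i, "short") for i in range(100)]
samples += [(0.8 - 0.001 * i, "open") for i in range(100)]

errors = m_fold_cross_validation(samples, m=10, train=train_centroid_classifier)
average_error = mean(errors)
THRESHOLD = 0.1  # hypothetical preset threshold
accept_as_defect_model = average_error < THRESHOLD
print(average_error, accept_as_defect_model)
```

Each of the m error values comes from a classifier model that never
saw its own testing set, which is why the average is a more reliable
indicator than a single error value.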
FIG. 13 shows a defect detection method for a multilayer daisy
chain structure according to an embodiment of the invention. This
method is implemented by the defect detection system 1200 disclosed
by the invention. In step S131, a plurality of physical models
having a defect of at least one defect type is generated based on
the at least one defect type of a daisy chain structure, and then
step S132 is performed. In step S132, a group of training samples
is generated for each of the physical models to form a training
sample group, and the training sample group is divided into an m number
of subsets, wherein an (m-1) number of the subsets are assigned as
the training set T1 and the remaining 1 of the subsets is assigned
as the testing set T2, and m can be any positive integer. Next,
step S133 is performed. In step S133, a group of scattering
parameter values corresponding to the training set T1 is obtained
via the training set T1. Next, step S134 is performed. In step
S134, the classifier model CM is generated via the group of
scattering parameter values. Next, step S135 is performed. In step
S135, the error value E1 of the classifier model CM is measured
using the testing set T2 and whether each subset in the m number of
subsets has been the testing set is determined. Next, step S136 is
performed.
In step S136, if the result is no, then there are still subsets in
the m number of subsets that have not been the testing set, and
then step S132 is repeated to reassign the (m-1) number of the
subsets as a new training set and assign the remaining 1 of the
subsets as a new testing set, wherein the subset forming the new
testing set and the subsets forming all of the previous testing
sets are different. If the result is yes, then each subset in the m
number of subsets has been the testing set. In other words, an m
number of classifier models is generated using an m number of
different training sets, and then step S137 is performed. In step
S137, the error values of the classifier model corresponding to the
testing set are measured using the testing set. It should be
mentioned that, the step of measuring the error values of each
classifier model can be performed at the same time after the m
number of classifier models are generated, and the error values of
the classifier model can also be instantly measured after each
classifier model is generated, and the invention is not limited in
this regard. Next, an average error value is calculated using an m
number of error values, and whether the classifier model CM is used
as a defect model of the at least one defect type is decided based
on the average error value, so as to determine whether a multilayer
daisy chain structure has a defect corresponding to the at least one
defect type.
Based on the above, the invention provides a defect detection
method for a 3D chip and a system using the same that can extract S
parameter values under a plurality of different physical models via
finite element analysis from generated physical models and use the
S parameter values corresponding to a single die TSV or a stacked
die TSV as the training set for machine learning before or after 3D
chip stacking. By comparing the S parameter values of the 3D chips
and the classifier models generated based on the training sets, in
the invention, a 3D chip can be classified based on different
defect types without performing an additional Design-For-Test
circuit design and without sample preparation for observing the
presence of defects. In a similar manner, a short circuit or an
open circuit in a multilayer daisy chain structure can be easily
located without performing an additional Design-For-Test circuit
design and without sample preparation for observing the presence of
defects. As a result, in addition to achieving the object of
non-destructive detection, the collection and measurement or
simulation of data plus training can further achieve the effects of
reduced detection time and costs, such that the quality of the 3D
chip or the multilayer daisy chain structure is ensured, and the
production process can be sped up.
Although the invention has been described with reference to the
above embodiments, it will be apparent to one of ordinary skill in
the art that modifications to the described embodiments may be made
without departing from the spirit of the invention. Accordingly,
the scope of the invention is defined by the attached claims, not by
the above detailed descriptions.
* * * * *