U.S. patent application number 14/586322 was filed with the patent office on 2014-12-30 and published on 2015-04-23 for semiconductor device predictive dynamic thermal management.
This patent application is currently assigned to Broadcom Corporation. The applicant listed for this patent is Broadcom Corporation. Invention is credited to Hwisung JUNG.
Application Number | 20150113303 14/586322 |
Family ID | 46798947 |
Filed Date | 2014-12-30 |
United States Patent Application | 20150113303 |
Kind Code | A1 |
JUNG; Hwisung | April 23, 2015 |
Semiconductor Device Predictive Dynamic Thermal Management
Abstract
A semiconductor device includes a memory storing a lookup table
including stored values associated with modes of operation of a
component of the semiconductor device. A monitor monitors an
operating parameter of the component in real-time, and reports a
calculated value associated with the same. A power manager
determines a change in the mode of operation of the component based
on a comparison of the calculated value with a corresponding stored
value, and adjusts a current mode of operation of the component in
real-time.
Inventors: | JUNG; Hwisung; (Irvine, CA) |
Applicant: |
Name | City | State | Country | Type |
Broadcom Corporation | Irvine | CA | US | |
Assignee: | Broadcom Corporation, Irvine, CA |
Family ID: | 46798947 |
Appl. No.: | 14/586322 |
Filed: | December 30, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13303882 | Nov 23, 2011 | 8930724 |
14586322 | | |
61524538 | Aug 17, 2011 | |
Current U.S.
Class: |
713/320 |
Current CPC
Class: |
H01L 2924/0002 20130101;
Y02D 10/126 20180101; G06F 1/324 20130101; H01L 23/34 20130101;
Y02D 10/00 20180101; Y02D 10/172 20180101; G06F 1/3296 20130101;
G06F 1/206 20130101; G06F 1/3206 20130101; H01L 2924/0002 20130101;
H01L 2924/00 20130101 |
Class at
Publication: |
713/320 |
International
Class: |
G06F 1/32 20060101
G06F001/32; G06F 1/20 20060101 G06F001/20 |
Claims
1. A semiconductor device, comprising: a monitor configured to
determine a value of an operating parameter of a component of the
semiconductor device during a current mode of operation in a
plurality of modes of operation; and a power manager configured to:
determine a calculated value corresponding to the value, and adjust
an operating voltage or an operating frequency of the component to
prevent the calculated value from exceeding a predetermined
threshold amount.
2. The semiconductor device of claim 1, wherein the power manager
is further configured to determine a change in the current mode of
operation.
3. The semiconductor device of claim 1, wherein the monitor
comprises: a ring-oscillator temperature monitor.
4. The semiconductor device of claim 1, wherein the calculated
value comprises: a future operating temperature of the component,
and wherein the power manager is further configured to: predict the
future operating temperature based on a previous temperature
measurement of the component.
5. The semiconductor device of claim 4, wherein the predetermined
threshold amount comprises: a predetermined temperature threshold
of the component, and wherein the power manager is further
configured to: adjust the operating voltage or the operating
frequency of the component based on a comparison of the predicted
future operating temperature with the predetermined temperature
threshold.
6. The semiconductor device of claim 1, wherein the power manager
is further configured to suspend operation of the component in the
current mode of operation for a predetermined period of time to
prevent the calculated value from exceeding the predetermined
threshold amount.
7. The semiconductor device of claim 1, wherein the monitor is
located near a hot spot of the component.
8. A semiconductor device, comprising: a plurality of monitors
configured to determine a plurality of values corresponding to a
component of the semiconductor device; and a power manager
configured to: determine a plurality of current values of the
plurality of values, determine, based on the plurality of current
values, a predicted future temperature of the component, compare
the predicted future temperature with a temperature threshold of
the component, and adjust an operating voltage or an operating
frequency of the component to prevent a temperature of the
component from reaching the temperature threshold.
9. The semiconductor device of claim 8, wherein the monitor
comprises: a ring-oscillator temperature monitor.
10. The semiconductor device of claim 9, further comprising: a
plurality of ring-oscillator temperature monitors including the
ring-oscillator temperature monitor.
11. The semiconductor device of claim 8, wherein the plurality of
values are classified according to a process corner, a supply
voltage, and the operating frequency of the component.
12. The semiconductor device of claim 8, further comprising: a
silicon performance monitor configured to identify a process corner
of the semiconductor device.
13. The semiconductor device of claim 8, wherein, upon determining
that a new application is executing using the semiconductor device,
the power manager is configured to: read the plurality of current
values; determine, based on the plurality of current values,
whether there is a change in operation of the component.
14. The semiconductor device of claim 8, wherein the power manager
is configured to adjust the operating voltage or the operating
frequency of the component based on a plurality of baseline values
of the plurality of values.
15. The semiconductor device of claim 8, wherein the plurality of
monitors are located next to a plurality of known hot spots of the
component.
16. The semiconductor device of claim 8, wherein the power manager
is configured to predict, based on the plurality of current values,
a location of a future hot spot of the component.
17. A method, comprising: determining, based on a plurality of
readings from a plurality of monitors, a plurality of current
values corresponding to a temperature of a component of a
semiconductor device; determining, based on the plurality of
current values, a predicted future temperature of the component;
comparing the predicted future temperature with a temperature
threshold of the component; and adjusting an operating voltage or
an operating frequency of the component to prevent a temperature of
the component from reaching the temperature threshold.
18. The method of claim 17, wherein the plurality of monitors are
located near a corresponding plurality of hot spots of the
component.
19. The method of claim 17, further comprising: determining, based
on the plurality of current values, whether there is a change in
operation of the component.
20. The method of claim 17, further comprising: predicting, based
on the plurality of current values, a location of a future hot spot
of the component.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This patent application is a continuation of U.S. patent
application Ser. No. 13/303,882, filed November 23, 2011, assigned
U.S. Pat. No. 8,930,724, which claims the benefit of U.S.
Provisional Patent Application No. 61/524,538, filed Aug. 17, 2011,
entitled "Power Management Unit," each of which is incorporated
herein by reference in its entirety.
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] The present invention is directed to a system and a method
for controlling temperature of semiconductor devices that use
system-on-chip (SOC) solutions. In particular, the present
invention is directed to the use of predictive and dynamic thermal
management techniques to control temperature of the semiconductor
devices.
[0004] 2. Background Art
[0005] Advances in designs of mobile application processors have
resulted in these processors operating at higher frequencies (>2
GHz). At higher frequencies, processors generate more heat, which
can damage semiconductor devices. Thus, thermal control, at these
higher operating frequencies, is a matter of serious concern.
Localized heating, in the form of hot spots, is observed in
processors operating at higher frequencies (higher switching
speeds). These hotspots increase the power density and the thermal
vulnerability of the SOC design of the processor. Further, the
hotspots cause thermal stress in components, leading to an increase in
the junction temperatures. The increased junction temperatures can
increase leakage power and can result in an undesirable power-thermal
loop. Conventional techniques employed to control temperature are
not optimum and there is a need for better temperature control
techniques.
[0006] One conventional technique is reactive (as opposed to
predictive) and relies on thermal throttling to control the
temperature. For example, in this reactive technique, a processor
is allowed to run at full capacity. When an operating temperature
is measured to exceed a thermal limit, the running capacity of the
processor is reactively curtailed to reduce the operating
temperature of the same. This reactive technique is not optimum
because it degrades the performance of the processor and provides a
limited time period to prevent a thermal runaway condition. This
reactive correction requires a throttling system that is
significantly and periodically calibrated.
[0007] Another known temperature control technique requires
determining a highest performance condition of the processor based
on application profile information of a given application, and
reactively re-configuring the hardware for thermal safety when the
highest performance condition is observed. This technique is not
optimum because it is specific to an application, and must be
duplicated for every application before being run on the processor.
Implementation of this technique during operation can be very
complex depending upon the processes required to be run by the
application.
[0008] As such, there is a need for a better technique for
controlling temperature of semiconductor devices that use SOC
solutions.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The accompanying drawings, which are incorporated herein and
form a part of the specification, illustrate the present invention
and, together with the description, further serve to explain the
principles of the invention and to enable a person skilled in the
pertinent art to make and use the invention.
[0010] FIGS. 1A and 1B illustrate early prediction of a hot spot
according to an embodiment of the present invention.
[0011] FIG. 2 illustrates the architecture of a SOC temperature
control solution according to an embodiment of the present
invention.
[0012] FIG. 3 is a flow chart of an exemplary method performed by
the semiconductor device according to an embodiment of the present
invention.
[0013] The present invention will be described with reference to
the accompanying drawings. The drawing in which an element first
appears is typically indicated by the leftmost digit(s) in the
corresponding reference number.
DETAILED DESCRIPTION OF THE INVENTION
[0014] In the following description, numerous specific details are
set forth in order to provide a thorough understanding of the
invention. However, it will be apparent to those skilled in the art
that the invention, including structures, systems, and methods, may
be practiced without these specific details. The description and
representation herein are the common means used by those
experienced or skilled in the art to most effectively convey the
substance of their work to others skilled in the art. In other
instances, well-known methods, procedures, components, and
circuitry have not been described in detail to avoid unnecessarily
obscuring aspects of the invention.
[0015] References in the specification to "one embodiment," "an
embodiment," "an example embodiment," etc., indicate that the
embodiment described may include a particular feature, structure,
or characteristic, but every embodiment may not necessarily include
the particular feature, structure, or characteristic. Moreover,
such phrases are not necessarily referring to the same embodiment.
Further, when a particular feature, structure, or characteristic is
described in connection with an embodiment, it is submitted that it
is within the knowledge of one skilled in the art to effect such
feature, structure, or characteristic in connection with other
embodiments whether or not explicitly described.
[0016] Known techniques used to control temperature of
semiconductor devices that use SOC solutions are not optimum.
Generally, the known techniques are reactive. In contrast, the
invention described herein is predictive. Applicant's predictive
method assists in minimizing power consumption while satisfying
performance constraints. Further, Applicant's predictive method can
be applied during operation of the SOC solution (i.e., the
processor) to maximize the performance capacity of the same.
[0017] In an embodiment, Applicant's technique provides early
prediction of possible hot spots and dynamic thermal management.
For example, early prediction of possible hot spots can be
accomplished by estimating, before and/or during operation, a
junction temperature of a component and/or a power state of the SOC
solution in advance based on previous junction temperature
measurements and/or previous power state measurements. Based on the
results of the estimating, the temperature of the semiconductor
device can be dynamically managed to maximize performance of the
same.
[0018] The early hot spot prediction technique will be discussed in
further detail. Early hot spot prediction includes predicting
locations of potential hot spots on the semiconductor device in
advance. Early hot spot prediction can be based on a previous power
state of the semiconductor device, on monitoring a temperature
associated with the semiconductor device, and/or on a measure of
utilization of the processor of the semiconductor device. The
measure of utilization could be a measure of time required by the
processor to complete a given task.
[0019] In an embodiment, time is tracked while the processor is
performing multiple tasks (multi-tasking). Time tracking is
important because the more time required to complete
a given task, the less time that can be devoted to
other tasks.
[0020] Hot spots can be predicted in the following way. Based on
the design of the processors in the SOC solution, hot spots can be
predicted by choosing the processors that are designed to carry out
processes/applications which require more energy. Ring-oscillator
based temperature monitors can be placed near such processors
designed to use more energy. In addition to placing
ring-oscillators near processors, ring-oscillator based temperature
monitors can also be placed near components of the semiconductor
device such as switching components, multi-media functional block
components, and the like, which are designed to expend high energy.
The ring-oscillator based temperature monitors can be connected to
each other via a ring structure, and can be controlled by a thermal
manager (FIG. 2). In an embodiment, the ring-oscillator based
temperature monitors are enabled only when certain
conditions/thresholds associated with a supply voltage and/or an
operating frequency, and/or processor utilization of the processor
are met.
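By way of illustration only, the conditional enabling of the
monitors described above might be sketched in Python as follows.
The function name, the threshold values, and the choice to enable a
monitor when any one condition is met are assumptions made for this
sketch; the specification leaves the exact conditions open.

```python
def monitor_enabled(supply_mv, freq_mhz, utilization,
                    min_mv=1000, min_mhz=1500, min_utilization=0.7):
    """Return True when a ring-oscillator temperature monitor should run.

    All thresholds are hypothetical. The monitor is enabled when any one
    of the three conditions is met (an assumed reading of "and/or").
    """
    return (supply_mv >= min_mv
            or freq_mhz >= min_mhz
            or utilization >= min_utilization)
```

Gating the monitors in this way would keep them idle while the
monitored block runs at a low voltage, a low frequency, and a low
utilization, where a hot spot is presumably unlikely.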
[0021] FIGS. 1A and 1B graphically illustrate early prediction of
a hot spot based on processor utilization of a processor according
to an embodiment of the present invention. In particular, FIG. 1A
is a graph of processor utilization over time, and FIG. 1B is a
graph of corresponding thermal conditions of the processor over
time. As the utilization of the processor varies, this variation is
monitored and captured. A moving average of this data is computed.
A higher temperature can be predicted as the moving average of
utilization increases because this means that the processor is
starting to expend more energy. This feature is illustrated in FIG.
1B. In particular, a future temperature associated with the
processor can be predicted based on a current variation in the
utilization of the same. This predicted future temperature can be
compared to a threshold temperature value, and the utilization of
the processor can be controlled in real time using a power manager
or a thermal manager when the future temperature is predicted to
exceed the threshold temperature value based on the result of the
comparison. In this way, the power manager or the thermal manager
can predictively prevent a processor from exceeding a critical
threshold temperature. In alternative embodiments, the moving
average can be determined based on data captured when monitoring
the variation of a supply voltage of the processor and/or an
operating frequency of the processor, and/or like parameters of the
processor.
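The moving-average prediction described above can be sketched in
Python as follows. This is a minimal illustration and not the
disclosed implementation: the class name, the window length, and the
linear relation between a rise in the moving average and the
temperature rise (degrees_per_unit) are all assumptions.

```python
from collections import deque


class UtilizationPredictor:
    """Tracks a moving average of utilization and extrapolates temperature."""

    def __init__(self, window=4, degrees_per_unit=40.0):
        # window and degrees_per_unit are hypothetical tuning values.
        self.samples = deque(maxlen=window)
        self.degrees_per_unit = degrees_per_unit
        self.previous_average = None

    def update(self, utilization, current_temp_c):
        """Record one utilization sample (0.0 to 1.0) and return the
        predicted future temperature in degrees Celsius."""
        self.samples.append(utilization)
        average = sum(self.samples) / len(self.samples)
        # Change in the moving average since the previous sample.
        if self.previous_average is None:
            delta = 0.0
        else:
            delta = average - self.previous_average
        self.previous_average = average
        # Assumed linear model: only a rising average raises the prediction.
        return current_temp_c + self.degrees_per_unit * max(delta, 0.0)


def should_throttle(predicted_temp_c, threshold_c):
    """Compare the predicted temperature with the threshold temperature."""
    return predicted_temp_c >= threshold_c
```

A caller would feed each new utilization sample to update() and
pass the result to should_throttle() to decide whether the power
manager or the thermal manager needs to act.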
[0022] Although early hot spot prediction is generally described
herein with respect to a processor, it will be appreciated that
early hot spot prediction can be carried out with respect to any
component of the semiconductor device. In case of components, a
variation in the switching speed of the same may be used to
determine the moving average. In case of multimedia functional
blocks, an amount of data to be processed and/or a type of data to
be processed may be used to determine the moving average. The
temperature of a processor or a component can also be measured in
real-time and used to predict the future temperature. In another
embodiment, the future temperature can also be predicted based on a
list of applications queued up to be executed by the processor and
respective processor utilization parameters related to the
execution of each of the applications.
[0023] FIG. 2 illustrates the architecture of an SOC solution
according to an embodiment of the present invention. The SOC
solution 200 includes a power manager 201, a thermal manager 202, a
software memory 203, a memory 205 including lookup tables 206, a
silicon performance monitor 207, a power domain 210 including
processors, labeled "CPU 0," 212, and "CPU 1," 214 with associated
ring-oscillator temperature monitors 204, a power domain 220
including processor "CPU 3," 222, and processor "CPU 2," 224, with
associated ring-oscillator temperature monitors 204, and a power
domain 230 including multimedia block 232 with associated
ring-oscillator temperature monitor 204. Processor 222 can
optionally be any component of the SOC solution. The different
power domains use, for example, different supply voltages and are
used to support different operating frequencies.
[0024] The ring-oscillator temperature monitors are placed near
recognized hot spots of respective devices, such as processors 212,
214, 222 and 224. In an idle mode, a counter value of each of the
ring-oscillator temperature monitors 204 is baselined. The counter
values of each of the ring-oscillator temperature monitors 204 with
respect to all modes of operations of the associated processors and
components (including an idle mode and an active mode) are then
pre-calculated and stored in look up tables 206 in memory 205. The
counter values are classified according to a process corner (ss,
tt, ff), a supply voltage, and an operating frequency associated
with each of the processors and components being monitored by the
respective ring-oscillator temperature monitors 204. These
pre-calculated and pre-stored values correspond to respective
operating temperatures of the monitored processors and components.
The baselining is based on Applicant's recognition that an increase in
temperature leads to an increase in leakage power. The increase in
temperature depends on the process corner within which the
processor or component operates. There are three widely used
process corners, ss-slow slow; tt-typical typical; and ff-fast
fast. Applicant has recognized that leakage power varies at
different supply voltages and at different operating frequencies
among the different process corners. As such, counter values for
each ring-oscillator temperature monitor 204 are pre-calculated
with respect to a process corner, a supply voltage, and an
operating frequency of the monitored component. These
pre-calculated values are stored in lookup tables 206.
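One possible layout for such a lookup table is sketched below in
Python. The key structure (monitor identifier, process corner,
supply voltage, operating frequency) follows the classification
described above, while the function names, the units, and any
counter values passed in are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Key: (monitor id, process corner, supply voltage in mV, frequency in MHz).
LookupKey = Tuple[int, str, int, int]

PROCESS_CORNERS = ("ss", "tt", "ff")


def build_lookup_table(
    entries: Iterable[Tuple[int, str, int, int, int]]
) -> Dict[LookupKey, int]:
    """Build a table of pre-calculated baseline counter values."""
    table: Dict[LookupKey, int] = {}
    for monitor_id, corner, supply_mv, freq_mhz, counter in entries:
        if corner not in PROCESS_CORNERS:
            raise ValueError(f"unknown process corner: {corner}")
        table[(monitor_id, corner, supply_mv, freq_mhz)] = counter
    return table


def baseline_counter(table: Dict[LookupKey, int], monitor_id: int,
                     corner: str, supply_mv: int, freq_mhz: int) -> int:
    """Look up the stored baseline counter value for one operating mode."""
    return table[(monitor_id, corner, supply_mv, freq_mhz)]
```

Keying on the full tuple keeps one baseline per monitor and per
operating mode, so a lookup needs only the process corner reported
at boot and the current voltage and frequency settings.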
[0025] Dynamic thermal management will now be discussed in further
detail. Upon booting up, the silicon performance monitor 207
identifies a process corner associated with each processor 212,
214, 224, and reports the same to the power manager 201. The
thermal manager 202 monitors and identifies operating parameters
including processor utilization, a switching speed, and/or an
amount of data to be processed. In particular, the thermal manager
202 reads the counter values reported by each of the
ring-oscillator temperature monitors 204, and converts the same in
terms of the above operating parameters. Finally, the power manager
201 reads the converted values from the thermal manager 202. The
power manager 201 may read these converted counter values every
time a new application runs on the processor, or do the same
periodically. Then, the power manager 201 checks whether there is a
change in operation of the processors and/or the components by
comparing the currently read converted values with previously read
converted values.
[0026] Alternatively, the power manager 201 may compare the
currently read converted values with corresponding pre-calculated
baseline counter values stored in the lookup tables 206 for each of
the ring-oscillator temperature monitors 204. If the result of the
comparison shows that there is a variation in the utilization of a
processor indicating that the temperature of the processor is
increasing, then the power manager 201 predicts a predicted future
temperature of that processor. The power manager 201 then compares
the predicted future temperature with a temperature threshold
associated with that processor. If the result of the comparison
indicates that the predicted future temperature is greater than or
equal to the temperature threshold value, then the power manager
201 controls the operation of the processor to avoid undesirable
conditions such as excessive leakage current and also thermal
runaway. Controlling the operation of the processor includes
the power manager 201 dynamically scaling the operating voltage
and/or the operating frequency of the processor. In particular, the
power manager 201 may scale the operating voltage and/or the
operating frequency based on the baseline values stored in the
lookup tables 206, thereby enabling the processor to operate within
a desired mode. Optionally, the power manager 201 may halt
operation of the processor permanently, or do the same for a given
period of time.
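The threshold comparison and the scaling step can be illustrated
with the following Python sketch. The ladder of operating points and
the rule of stepping down one point at a time are assumptions made
for the example; the specification states only that the operating
voltage and/or the operating frequency is scaled, optionally based
on the stored baseline values.

```python
# Hypothetical operating points (voltage in mV, frequency in MHz),
# ordered from highest performance to lowest.
OPERATING_POINTS = [(1100, 2000), (1000, 1500), (900, 1000), (800, 500)]


def next_operating_point(current_index, predicted_temp_c, threshold_c):
    """Return the index of the operating point to use next.

    Steps down one point when the predicted future temperature is
    greater than or equal to the threshold; otherwise keeps the
    current point. The lowest point is never exceeded.
    """
    if predicted_temp_c >= threshold_c:
        return min(current_index + 1, len(OPERATING_POINTS) - 1)
    return current_index
```

For example, a processor at index 0 with a predicted temperature at
or above the threshold would move to index 1, lowering both its
voltage and its frequency in this sketch.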
[0027] When the above architecture is applied with respect to a
component 222, the operating parameter monitored and identified
could be, for example, a switching speed of the component. When the
architecture is applied with respect to the multimedia block 232,
the sensed parameter could be, for example, an amount of data to be
processed and/or a type of data to be processed.
[0028] In this way, the future temperature associated with the
processors 212, 214, 224 and/or components 222, 232 can be
predicted. These predicted future temperatures can then be used to
control the operation of the processors 212, 214, 224 and/or the
components 222, 232 to prevent undesirable conditions, as discussed
above.
[0029] The comparison of the currently read converted values from
the thermal manager 202 with corresponding baseline values stored
in the lookup tables 206 will now be discussed in brief. As the
temperature of the monitored processor increases, the counter value
of the associated ring-oscillator temperature monitor 204
decreases. This is because, as the temperature increases, a dynamic
current associated with the processor (or a switching current
associated with the switching component) decreases, and
the counter value is directly proportional to the
dynamic current and inversely proportional to the
temperature. As such, when the currently read converted value is
smaller than the corresponding stored baseline value, then the
power manager 201 may decide to lower the operating voltage and/or
the operating frequency of the processor. One will appreciate that
the power manager 201 may dynamically adjust only the operating
voltage or only the operating frequency of the processor.
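A counter reading could be mapped to a temperature estimate along
the following lines. The sketch is in Python and assumes that a few
(counter value, temperature) calibration pairs are stored and that
linear interpolation between them is adequate; neither assumption
comes from the specification, which states only that the counter
value falls as the temperature rises.

```python
def estimate_temperature(counter, calibration):
    """Estimate a temperature in degrees Celsius from a counter value.

    calibration is a list of hypothetical (counter, temp_c) pairs in
    which a lower counter corresponds to a higher temperature.
    Readings outside the calibrated range are clamped to the nearest
    endpoint.
    """
    points = sorted(calibration, reverse=True)  # highest counter first
    if counter >= points[0][0]:
        return points[0][1]
    if counter <= points[-1][0]:
        return points[-1][1]
    for (c_hi, t_lo), (c_lo, t_hi) in zip(points, points[1:]):
        if c_lo <= counter <= c_hi:
            fraction = (c_hi - counter) / (c_hi - c_lo)
            return t_lo + fraction * (t_hi - t_lo)
    raise ValueError("calibration points do not cover the reading")


def below_baseline(current_counter, baseline_counter):
    """True when the reading is smaller than the stored baseline,
    which indicates a temperature above the baseline condition."""
    return current_counter < baseline_counter
```

With calibration pairs of (1000, 25.0), (900, 75.0), and
(800, 125.0), a reading of 950 falls halfway between the first two
pairs and yields an estimate of 50.0 degrees.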
[0030] FIG. 3 is a flow chart of a method 300 carried out by the
architecture shown in FIG. 2. In step 301, baseline counter values corresponding to
temperatures of each of the monitored processors 212, 214, 224 and
components 222, 232 are pre-calculated and stored. These baseline
counter values are classified according to a process corner, an
operating supply voltage, and an operating frequency associated
with each of the monitored processors 212, 214, 224 and the
components 222, 232. In step 302, a process corner within which
each of the processors 212, 214, 224 and the components 222, 232 is
operating is identified. In step 303, operating parameters
associated with each of the processors 212, 214, 224 and the
components 222, 232 are identified. These identified operating
parameters include processor utilization, a switching speed, and/or
an amount of data to be processed. In step 304, counter values from
each ring-oscillator temperature monitor associated with each of
the processors 212, 214, 224 and the components 222, 232 are
measured. In step 305, it is checked whether there is a change in
operation of the processors and/or the components based on a
comparison of the measured counter values with previously measured
counter values. The comparison may alternatively or optionally
include monitoring a variation in operation of the processors
and/or the components and capturing the same as a moving average,
as discussed with reference to FIGS. 1A and 1B. If the result of the comparison is "No,"
then the process moves to step 303. However, if the result of the
comparison is "Yes," then the process moves to step 306. In step
306, a future temperature of the processor and/or the component is
predicted. This prediction can be based on previous power states of
the processor and/or the component including historical
temperatures observed in relation to a voltage, frequency, or
utilization. In step 307, the predicted future temperature is
compared to a threshold temperature value. If the result of the
comparison indicates that the predicted future temperature is
greater than or equal to the threshold temperature value, then the
process moves to step 308. Otherwise, the process moves to step
303. In step 308, the operation of the processor and/or the
component is controlled to avoid undesirable conditions such as
excessive leakage current and also thermal runaway. Controlling
the operation of the processor includes the power manager 201
dynamically scaling the operating voltage and/or the operating
frequency of the processor. Optionally, the power manager 201 may
halt operation of the processor permanently, or do the same for a
given period of time.
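One pass through steps 304 to 308 of method 300 can be sketched in
Python as follows. The function and its callable parameters are
hypothetical stand-ins for the monitor, the prediction, and the
scaling hardware; a change in operation is detected here by a
simple inequality between the current and previous counter values,
which is an assumption of the sketch.

```python
def run_iteration(read_counter, previous_counter, predict_temperature,
                  threshold_c, scale_down):
    """Run steps 304 to 308 once and return the counter value to keep
    as the "previously measured" value for the next pass."""
    counter = read_counter()                  # step 304: measure counter
    if counter == previous_counter:           # step 305: no change detected
        return counter                        # caller returns to step 303
    predicted = predict_temperature(counter)  # step 306: predict temperature
    if predicted >= threshold_c:              # step 307: compare to threshold
        scale_down()                          # step 308: scale voltage/frequency
    return counter
```

A caller would invoke run_iteration() periodically, or each time a
new application starts, and pass the returned value back in as
previous_counter on the next call.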
[0032] In semiconductor manufacturing, a "process corner" refers to
a variation of fabrication parameters used in applying an
integrated circuit design to a semiconductor wafer. Process corners
represent the extremes of these parameter variations within which a
circuit that has been etched onto the wafer must function
correctly. A circuit running on devices fabricated at these process
corners may run slower or faster than specified and at lower or
higher temperatures and voltages, but if the circuit does not
function at all at any of these process extremes, the design is
considered to have inadequate design margin.
[0033] It is to be appreciated that the Detailed Description
section, and not the Summary and Abstract sections, is intended to
be used to interpret the claims. The Summary and Abstract sections
may set forth one or more but not all exemplary embodiments of the
present invention as contemplated by the inventor(s), and thus, are
not intended to limit the present invention and the appended claims
in any way.
[0034] The present invention has been described above with the aid
of functional building blocks illustrating the implementation of
specified functions and relationships thereof. The boundaries of
these functional building blocks have been arbitrarily defined
herein for the convenience of the description. Alternate boundaries
can be defined so long as the specified functions and relationships
thereof are appropriately performed.
[0035] The foregoing description of the specific embodiments will
so fully reveal the general nature of the invention that others
can, by applying knowledge within the skill of the art, readily
modify and/or adapt for various applications such specific
embodiments, without undue experimentation, without departing from
the general concept of the present invention. Therefore, such
adaptations and modifications are intended to be within the meaning
and range of equivalents of the disclosed embodiments, based on the
teaching and guidance presented herein. It is to be understood that
the phraseology or terminology herein is for the purpose of
description and not of limitation, such that the terminology or
phraseology of the present specification is to be interpreted by
the skilled artisan in light of the teachings and guidance.
[0036] It should be noted that any exemplary processes described
herein can be implemented in hardware, software, or any combination
thereof. For instance, the exemplary process can be implemented
using computer processors, computer logic, application specific
integrated circuits (ASICs), digital signal processors (DSP), etc.,
as will be understood by one of ordinary skill in the arts based on
the discussion herein.
[0037] Moreover, any exemplary processes discussed herein can be
embodied in computer program instructions executed by a computer processor or any one of the hardware devices
listed above. The computer program instructions cause the processor
to perform the processing functions described herein. The computer
program instructions (e.g., software) can be stored in a computer
useable medium, computer program medium, or any storage medium that
can be accessed by a computer or processor. Such media include a
memory device such as a computer disk or CD ROM, or the equivalent.
Accordingly, any computer storage medium having computer program
code that causes a processor to perform the processing functions
described herein is within the scope and spirit of the present
invention.
[0038] The breadth and scope of the present invention should not be
limited by any of the above-described exemplary embodiments, but
should be defined only in accordance with the following claims and
their equivalents.
* * * * *