U.S. patent application number 12/006251 was filed with the patent office on 2009-07-02 for latency based platform coordination.
Invention is credited to Barnes Cooper, Robert Gough, Jaya L. Jeyaseelan, Seh W. Kwa, Nilesh V. Shah, Neil Songer.
Application Number: 20090172434 / 12/006251
Family ID: 40800117
Filed Date: 2009-07-02

United States Patent Application 20090172434
Kind Code: A1
Kwa; Seh W.; et al.
July 2, 2009
Latency based platform coordination
Abstract
In some embodiments, an electronic apparatus comprises at least
one processor, a plurality of components, and a policy engine
comprising logic to receive latency data from one or more
components in the electronic device, compute a minimum latency
tolerance value from the latency data, and determine a power
management policy from the minimum latency tolerance value.
Inventors: Kwa; Seh W.; (San Jose, CA); Gough; Robert; (Cornelius, OR); Songer; Neil; (Santa Clara, CA); Jeyaseelan; Jaya L.; (Cupertino, CA); Cooper; Barnes; (Tigard, OR); Shah; Nilesh V.; (Folsom, CA)
Correspondence Address: Caven & Aghevli LLC; c/o CPA Global; P.O. Box 52050; Minneapolis, MN 55402, US
Family ID: 40800117
Appl. No.: 12/006251
Filed: December 31, 2007
Current U.S. Class: 713/320; 713/300
Current CPC Class: Y02D 10/24 20180101; G06F 1/3246 20130101; Y02D 10/00 20180101; G06F 1/329 20130101; G06F 1/3203 20130101
Class at Publication: 713/320; 713/300
International Class: G06F 1/26 20060101 G06F001/26; G06F 1/32 20060101 G06F001/32
Claims
1. A method to implement latency based platform coordination in an
electronic device, comprising: receiving, in a policy engine,
latency data from one or more components in the electronic device;
computing a minimum latency tolerance value from the latency data;
and determining a power management policy from the minimum latency
tolerance value.
2. The method of claim 1, wherein receiving, in a policy engine,
latency data from one or more components in the electronic device
comprises receiving a snoop latency tolerance and a non-snoop
latency tolerance from the one or more components.
3. The method of claim 2, wherein: the latency data from the one or
more components is transmitted via an intermediate bridge/switch device; the bridge has at least one delay value for data transmitted
via the bridge/switch device; and the bridge deducts the delay
value from the latency data.
4. The method of claim 3, wherein: the bridge comprises a first
delay value when the bridge is in a low power state and a second
delay value when the bridge is in an active power state; and the
bridge deducts one of the first delay value or the second delay
value from the latency data.
5. The method of claim 1, wherein computing a minimum latency
tolerance value from the latency data comprises: comparing a
plurality of latency values received from a plurality of
components; and selecting the lowest latency value from the
plurality of latency values.
6. The method of claim 1, wherein the policy engine monitors
latency values over time during operation of the electronic device
and updates power management policies as a function of changes in
the latency tolerance values.
7. The method of claim 1, wherein determining a power management
policy from the minimum latency tolerance value comprises selecting
a sleep state that permits the system to meet the minimum latency
tolerance value.
8. An electronic apparatus, comprising: at least one processor; a
plurality of components; and a policy engine comprising logic to:
receive latency data from one or more components in the electronic
device; compute a minimum latency tolerance value from the latency
data; and determine a power management policy from the minimum
latency tolerance value.
9. The electronic apparatus of claim 8, wherein the policy engine
further comprises logic to receive a snoop latency tolerance and a
non-snoop latency tolerance from the one or more components.
10. The electronic apparatus of claim 9, wherein: the latency data
from the one or more components is transmitted via an intermediate bridge/switch device; the bridge has at least one delay value for
data transmitted via the bridge/switch device; and the bridge
deducts the delay value from the latency data.
11. The electronic apparatus of claim 10, wherein: the bridge
comprises a first delay value when the bridge is in a low power
state and a second delay value when the bridge is in an active
power state; and the bridge deducts one of the first delay value or
the second delay value from the latency data.
12. The electronic apparatus of claim 8, wherein the policy engine
further comprises logic to: compare a plurality of latency values
received from a plurality of components; and select the lowest
latency value from the plurality of latency values.
13. The electronic apparatus of claim 8, wherein the policy engine
further comprises logic to monitor latency values over time during operation of the electronic device and update power management
policies as a function of changes in the latency tolerance
values.
14. The electronic apparatus of claim 8, wherein the policy engine
further comprises logic to select a sleep state that permits the
system to meet the minimum latency tolerance value.
Description
RELATED APPLICATIONS
[0001] None.
BACKGROUND
[0002] Power management of the interconnected devices is becoming
more of a concern as computers implement mobile system platforms
where the computers and devices are battery powered. One of the
biggest challenges of implementing aggressive platform power management for mobile PC client and handheld devices is the lack of awareness of device latency tolerance to main memory accesses (DMA) and of application latency dependency to facilitate power policy decisions. Deeper sleep states gain greater power savings, but at the cost of longer resume times. For example, deeper sleep states help microprocessors achieve very low power, but require up to 200 microseconds to resume versus keeping the processor in a "lighter" (shallower) sleep state. Platform phase-locked loop (PLL) shutdown
requires 20-50 microseconds to resume, versus 10's of nanoseconds
with clock gating.
[0003] Due to the lack of awareness in device latency tolerance,
some computing platforms maintain system resources in an available
state (especially data paths and system memory) even during idle
states. Maintaining these resources in an available state consumes
power.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] The detailed description is described with reference to the
accompanying figures, in which:
[0005] FIGS. 1A-1C are schematic block diagrams of portions of an
apparatus that supports latency based platform coordination,
according to some embodiments.
[0006] FIG. 2 is a flowchart illustrating operations in a method to
implement latency based platform coordination, according to some
embodiments.
[0007] FIG. 3 is a schematic timing diagram of an example of
latency reporting and policy engine coordination, according to some
embodiments.
[0008] FIG. 4 is a schematic illustration of a computer system, in
accordance with some embodiments.
DETAILED DESCRIPTION
[0009] Described herein are exemplary systems and methods for
implementing latency based platform coordination which, in some
embodiments, may be implemented in an electronic device such as,
e.g., a computer system. In the following description, numerous
specific details are set forth to provide a thorough understanding
of various embodiments. However, it will be understood by those
skilled in the art that the various embodiments may be practiced
without the specific details. In other instances, well-known
methods, procedures, components, and circuits have not been
illustrated or described in detail so as not to obscure the
particular embodiments.
[0010] Embodiments of systems which implement latency based
platform coordination will be explained with reference to FIGS.
1A-1C and FIG. 2. FIGS. 1A-1C are schematic block diagrams of
portions of an apparatus that supports latency based platform
coordination, according to some embodiments. FIG. 2 is a flowchart
illustrating operations in a method to implement latency based
platform coordination, according to some embodiments.
[0011] Referring first to FIG. 1A, a system to implement latency
based platform coordination comprises one or more processors 110
and a platform control hub (PCH) 115, which in combination are
sometimes referred to as the root complex. A policy engine 130 is
implemented in the system as an abstract device which comprises
logic to implement latency based platform coordination. In some
embodiments, the policy engine 130 may be implemented as logic
instructions stored on a computer readable medium which, when
executed by a processor configure the processor to implement
latency based platform coordination operations. In some
embodiments, the policy engine 130 may be reduced to logic, for
example in a programmable logic device such as a field programmable
gate array (FPGA) or may be reduced to hardwired circuit logic. The
policy engine 130 may be implemented as a single, discrete entity,
or may be distributed between multiple processing components in the
root complex.
[0012] The system further comprises a plurality of components 125
coupled to the policy engine 130 by a bridge/switching device 120.
In some embodiments, each of the plurality of components reports (operation 210) its snoop latency, alone or in combination with its non-snoop latency, to the policy engine 130. In the embodiment depicted in FIG. 1, the latency parameters may be reported as a tuple, in which the snoop latency parameter is represented by the symbol Sn, where n identifies the component, and in which the non-snoop latency is represented by the symbol NSn, where n identifies the component. Thus, Lat(S1, NS1)
represents the snoop and non-snoop latency parameters for the first
component. Similarly, Lat(S2, NS2) represents the snoop and
non-snoop latency parameters for the second component and Lat(S3,
NS3) represents the snoop and non-snoop latency parameters for the
third component. In practice, the system may comprise dozens or
even hundreds of components.
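The Lat(Sn, NSn) tuples described above can be modeled with a short sketch (Python here, since the disclosure contains no code; the class name and the microsecond values are illustrative assumptions, not taken from the patent):

```python
from typing import NamedTuple

class LatencyTuple(NamedTuple):
    """Lat(Sn, NSn): one component's snoop and non-snoop
    latency tolerances, expressed here in microseconds."""
    snoop: float
    non_snoop: float

# Hypothetical tolerances for three components (illustrative values only).
lat1 = LatencyTuple(snoop=100.0, non_snoop=150.0)  # Lat(S1, NS1)
lat2 = LatencyTuple(snoop=60.0, non_snoop=80.0)    # Lat(S2, NS2)
lat3 = LatencyTuple(snoop=250.0, non_snoop=300.0)  # Lat(S3, NS3)
```

In a real platform with dozens or hundreds of components, one such tuple would exist per reporting device.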
[0013] In the embodiment depicted in FIG. 1A, each of the
components 125 reports its latency parameters through the
bridge/switching device 120, which receives the parameters at
operation 215. In other embodiments, one or more of the components 125 may be coupled directly to one of the processors 110 or to the policy engine 130, such that the device could report its latency parameters directly to the policy engine 130. In some embodiments,
the bridge/switching device 120 has a characteristic delay
indicated in the drawings by the symbol .DELTA.. The delay,
.DELTA., associated with the bridge/switching device 120 may be
variable as a function of the switching/transmission capacity
associated with the bridge/switching device 120, the traffic
flowing through the bridge/switching device 120, and the power
state of the bridge/switching device 120. For example, a bridge/switching device that is in an inactive/idle state or sleep state would have a higher characteristic delay than a bridge/switching device 120 that is in an active state. Similarly, a
bridge/switching device 120 with a high traffic load would have a
higher characteristic delay than a bridge/switching device 120 with
a low traffic load.
[0014] In some embodiments, the bridge/switching device 120
comprises logic to selectively report latency parameters from the
components 125 coupled to the bridge/switching device 120. In
addition, in some embodiments the bridge/switching device 120
comprises logic to modify the reported latency parameters in order
to compensate for the delay, .DELTA., associated with the
bridge/switching device 120. In one embodiment, the
bridge/switching device implements logic to deduct the
characteristic delay, .DELTA., associated with the bridge/switching
device 120 from each of the latency parameters for each of the
components coupled to the bridge/switching device 120, at operation
220. The bridge/switching device 120 may further implement logic to
report the latency parameters to the policy engine 130, at
operation 225. For example, the bridge/switching device 120 may
report to the policy engine the MIN(Lat(S1-.DELTA., NS1-.DELTA.),
Lat(S2-.DELTA., NS2-.DELTA.), Lat(S3-.DELTA., NS3-.DELTA.)).
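One plausible reading of the deduction and reporting steps (operations 220 and 225) is sketched below; taking MIN element-wise over the adjusted tuples is an assumption, since the patent does not spell out how MIN is applied to tuples:

```python
def bridge_report(reported, delta):
    """Deduct the bridge's characteristic delay (delta) from each
    Lat(Sn, NSn) tuple, then forward the element-wise minimum of the
    adjusted tuples to the policy engine."""
    adjusted = [(s - delta, ns - delta) for (s, ns) in reported]
    return (min(s for s, _ in adjusted), min(ns for _, ns in adjusted))

# Three components' (snoop, non-snoop) tolerances in microseconds
# (hypothetical values), passed through a bridge with a 2 us delay.
result = bridge_report([(100.0, 150.0), (60.0, 80.0), (250.0, 300.0)], 2.0)
```

Here `result` is the single tuple the bridge would forward upstream in place of the three raw reports.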
[0015] The policy engine 130 receives the reported latency parameters from the bridge/switching device 120 at operation 230. In some embodiments, the policy engine 130 implements logic to compute a minimum latency tolerance value (operation 235) from the latency parameters reported into the policy engine 130. The policy engine 130 then uses the minimum latency tolerance value to determine a power management policy for the system.
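The minimum-tolerance computation (operation 235) and the sleep-state selection of claim 7 can be sketched as follows; the sleep-state table and its resume latencies are hypothetical, loosely echoing the 200-microsecond and 20-50 microsecond figures given in the background:

```python
# Hypothetical (state, resume_latency_us) table, deepest state first.
SLEEP_STATES = [("deep_sleep", 200.0), ("pll_off", 50.0), ("clock_gated", 0.05)]

def select_sleep_state(latency_reports):
    """Compute the minimum latency tolerance across all reported
    (snoop, non_snoop) tuples, then pick the deepest sleep state whose
    resume latency still meets that tolerance."""
    tolerance = min(min(pair) for pair in latency_reports)
    for state, resume_us in SLEEP_STATES:  # ordered deepest first
        if resume_us <= tolerance:
            return state
    return "active"  # no sleep state can meet the tolerance

state = select_sleep_state([(58.0, 78.0), (300.0, 400.0)])
```

With these example numbers the minimum tolerance is 58 microseconds, so the 200-microsecond deep sleep state is rejected and the PLL-off state is chosen instead.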
[0016] FIG. 1B is a schematic illustration of an example in which
the system is in an active mode. Each of the components 125 reports
their respective latency values into the bridge/switching device
120. In an active state, the bridge/switching device 120 has a
characteristic delay of 2 .mu.s. As described above, the
bridge/switching device receives the latency parameters from the
respective components 125 and deducts the characteristic delay of 2
.mu.s from the reported parameters. The bridge/switching device 120
then reports the minimum latency parameter tuple to the policy
engine 130.
[0017] FIG. 1C is a schematic illustration of an example in which the system is in a low power mode. Each of the components 125 reports its respective latency values into the bridge/switching device 120. In a low power state, the bridge/switching device 120 has a characteristic delay of 20 .mu.s. As described above, the bridge/switching device receives the latency parameters from the respective components 125 and deducts the characteristic delay of 20 .mu.s from the reported parameters. The bridge/switching device
120 then reports the minimum latency parameter tuple to the policy
engine 130.
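The contrast between FIGS. 1B and 1C can be worked through numerically; the component tolerances below are hypothetical, and only the 2 .mu.s and 20 .mu.s bridge delays come from the text:

```python
def adjust(reported, delta):
    """Deduct the bridge's characteristic delay from each (snoop, non-snoop) tuple."""
    return [(s - delta, ns - delta) for (s, ns) in reported]

reports = [(100.0, 150.0), (60.0, 80.0)]       # hypothetical tolerances, in us
active_view = adjust(reports, 2.0)             # FIG. 1B: active bridge, 2 us delay
low_power_view = adjust(reports, 20.0)         # FIG. 1C: low-power bridge, 20 us delay

min_active = min(min(pair) for pair in active_view)
min_low_power = min(min(pair) for pair in low_power_view)
```

The same component reports thus yield a tighter effective tolerance (40 us rather than 58 us here) when the bridge itself is in a slower, low power state, which is why the bridge's own delay must be deducted before the policy engine sees the numbers.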
[0018] FIG. 3 is a schematic timing diagram of an example of
latency reporting and policy engine coordination, according to some
embodiments. FIG. 3 illustrates the utilization of latency
reporting while two policy engines (PE1 and PE2) share latency
information and coordinate to steer the appropriate C-states for
microprocessors, memory controller power management and any other
platform PLL power management. A device exhibiting a bursty traffic pattern with intermittent low power states when active may thus report a low latency tolerance. In that case the microprocessor may resist entering deeper sleep states that would impact performance, such as those requiring it to flush its caches. However, when the device is idle and reports an extended latency tolerance, that information helps enhance utilization of deeper sleep states for the platform and microprocessor, with the knowledge that any visible performance degradation is unlikely.
[0019] FIG. 4 is a schematic illustration of an architecture of a
computer system which may implement latency based platform
coordination in accordance with some embodiments. Computer system
400 includes a computing device 402 and a power adapter 404 (e.g.,
to supply electrical power to the computing device 402). The
computing device 402 may be any suitable computing device such as a
laptop (or notebook) computer, a personal digital assistant, a
desktop computing device (e.g., a workstation or a desktop
computer), a rack-mounted computing device, and the like.
[0020] Electrical power may be provided to various components of
the computing device 402 (e.g., through a computing device power
supply 406) from one or more of the following sources: one or more
battery packs, an alternating current (AC) outlet (e.g., through a
transformer and/or adaptor such as a power adapter 404), automotive
power supplies, airplane power supplies, and the like. In one
embodiment, the power adapter 404 may transform the power supply
source output (e.g., the AC outlet voltage of about 110 VAC to 240
VAC) to a direct current (DC) voltage ranging between about 7 VDC
to 12.6 VDC. Accordingly, the power adapter 404 may be an AC/DC
adapter.
[0021] The computing device 402 may also include one or more
central processing unit(s) (CPUs) 408 coupled to a bus 410. In one
embodiment, the CPU 408 may be one or more processors in the
Pentium.RTM. family of processors including the Pentium.RTM. II
processor family, Pentium.RTM. III processors, Pentium.RTM. IV
processors, Core and Core2 processors available from Intel.RTM.
Corporation of Santa Clara, Calif. Alternatively, other CPUs may be
used, such as Intel's Itanium.RTM., XEON.TM., and Celeron.RTM.
processors. Also, one or more processors from other manufacturers
may be utilized. Moreover, the processors may have a single or
multi core design.
[0022] A chipset 412 may be coupled to the bus 410. The chipset 412
may include a memory control hub (MCH) 414. The MCH 414 may include
a memory controller 416 that is coupled to a main system memory
418. The main system memory 418 stores data and sequences of
instructions that are executed by the CPU 408, or any other device
included in the system 400. In some embodiments, the main system
memory 418 includes random access memory (RAM); however, the main
system memory 418 may be implemented using other memory types such
as dynamic RAM (DRAM), synchronous DRAM (SDRAM), and the like.
Additional devices may also be coupled to the bus 410, such as
multiple CPUs and/or multiple system memories.
[0023] In some embodiments, main memory 418 may include one or
more flash memory devices. For example, main memory 418 may include
either NAND or NOR flash memory devices, which may provide hundreds
of megabytes, or even many gigabytes of storage capacity.
[0024] The MCH 414 may also include a graphics interface 420
coupled to a graphics accelerator 422. In one embodiment, the
graphics interface 420 is coupled to the graphics accelerator 422
via an accelerated graphics port (AGP). In an embodiment, a display
(such as a flat panel display) 440 may be coupled to the graphics
interface 420 through, for example, a signal converter that
translates a digital representation of an image stored in a storage
device such as video memory or system memory into display signals
that are interpreted and displayed by the display. The signals produced for the display 440 may pass through various
control devices before being interpreted by and subsequently
displayed on the display.
[0025] A hub interface 424 couples the MCH 414 to an input/output
control hub (ICH) 426. The ICH 426 provides an interface to
input/output (I/O) devices coupled to the computer system 400. The
ICH 426 may be coupled to a peripheral component interconnect (PCI)
bus. Hence, the ICH 426 includes a PCI bridge 428 that provides an
interface to a PCI bus 430. The PCI bridge 428 provides a data path
between the CPU 408 and peripheral devices. Additionally, other
types of I/O interconnect topologies may be utilized such as the
PCI Express.TM. architecture, available through Intel.RTM.
Corporation of Santa Clara, Calif.
[0026] The PCI bus 430 may be coupled to a network interface card
(NIC) 432 and one or more disk drive(s) 434. Other devices may be
coupled to the PCI bus 430. In addition, the CPU 408 and the MCH
414 may be combined to form a single chip. Furthermore, the
graphics accelerator 422 may be included within the MCH 414 in
other embodiments.
[0027] Additionally, other peripherals coupled to the ICH 426 may
include, in various embodiments, integrated drive electronics (IDE)
or small computer system interface (SCSI) hard drive(s), universal
serial bus (USB) port(s), a keyboard, a mouse, parallel port(s),
serial port(s), floppy disk drive(s), digital output support (e.g.,
digital video interface (DVI)), and the like.
[0028] System 400 may further include a basic input/output system
(BIOS) 450 to manage, among other things, the boot-up operations of
computing system 400. BIOS 450 may be embodied as logic
instructions encoded on a memory module such as, e.g., a flash
memory module.
[0029] The terms "logic instructions" as referred to herein relates
to expressions which may be understood by one or more machines for
performing one or more logical operations. For example, logic
instructions may comprise instructions which are interpretable by a
processor compiler for executing one or more operations on one or
more data objects. However, this is merely an example of
machine-readable instructions and embodiments are not limited in
this respect.
[0030] The terms "computer readable medium" as referred to herein
relates to media capable of maintaining expressions which are
perceivable by one or more machines. For example, a computer
readable medium may comprise one or more storage devices for
storing computer readable instructions or data. Such storage
devices may comprise storage media such as, for example, optical,
magnetic or semiconductor storage media. However, this is merely an
example of a computer readable medium and embodiments are not
limited in this respect.
[0031] The term "logic" as referred to herein relates to structure
for performing one or more logical operations. For example, logic
may comprise circuitry which provides one or more output signals
based upon one or more input signals. Such circuitry may comprise a
finite state machine which receives a digital input and provides a
digital output, or circuitry which provides one or more analog
output signals in response to one or more analog input signals.
Such circuitry may be provided in an application specific
integrated circuit (ASIC) or field programmable gate array (FPGA).
Also, logic may comprise machine-readable instructions stored in a
memory in combination with processing circuitry to execute such
machine-readable instructions. However, these are merely examples
of structures which may provide logic and embodiments are not
limited in this respect.
[0032] Some of the methods described herein may be embodied as
logic instructions on a computer-readable medium. When executed on
a processor, the logic instructions cause a processor to be
programmed as a special-purpose machine that implements the
described methods. The processor, when configured by the logic
instructions to execute the methods described herein, constitutes
structure for performing the described methods. Alternatively, the
methods described herein may be reduced to logic on, e.g., a field
programmable gate array (FPGA), an application specific integrated
circuit (ASIC) or the like.
[0033] In the description and claims, the terms coupled and
connected, along with their derivatives, may be used. In particular
embodiments, connected may be used to indicate that two or more
elements are in direct physical or electrical contact with each
other. Coupled may mean that two or more elements are in direct
physical or electrical contact. However, coupled may also mean that
two or more elements may not be in direct contact with each other,
but yet may still cooperate or interact with each other.
[0034] Reference in the specification to "one embodiment" or "an
embodiment" means that a particular feature, structure, or
characteristic described in connection with the embodiment is
included in at least one implementation. The appearances of the
phrase "in one embodiment" in various places in the specification
may or may not be all referring to the same embodiment.
[0035] Although embodiments have been described in language
specific to structural features and/or methodological acts, it is
to be understood that claimed subject matter may not be limited to
the specific features or acts described. Rather, the specific
features and acts are disclosed as sample forms of implementing the
claimed subject matter.
* * * * *