U.S. patent application number 10/759941 was filed with the patent office on 2004-01-16 and published on 2005-07-21 for method and apparatus for preloading translation buffers.
This patent application is currently assigned to International Business Machines Corporation. Invention is credited to Day, Michael Norman, DeMent, Jonathan James, Johns, Charles Ray.
Application Number | 20050160229 10/759941 |
Family ID | 34749809 |
Filed Date | 2004-01-16 |
United States Patent
Application |
20050160229 |
Kind Code |
A1 |
Johns, Charles Ray ; et
al. |
July 21, 2005 |
Method and apparatus for preloading translation buffers
Abstract
A method and an apparatus are provided for efficiently managing
the operation of a translation buffer. A software and hardware
apparatus and method are utilized to pre-load a translation buffer
to prevent poor operation as a result of slow warming of a
cache.
Inventors: |
Johns, Charles Ray; (Austin,
TX) ; Day, Michael Norman; (Round Rock, TX) ;
DeMent, Jonathan James; (Austin, TX) |
Correspondence
Address: |
Gregory W. Carr
670 Founders Square
900 Jackson Street
Dallas
TX
75202
US
|
Assignee: |
International Business Machines
Corporation
Armonk
NY
|
Family ID: |
34749809 |
Appl. No.: |
10/759941 |
Filed: |
January 16, 2004 |
Current U.S.
Class: |
711/137 ;
711/206; 711/207; 711/E12.061 |
Current CPC
Class: |
G06F 12/1027 20130101;
G06F 2212/684 20130101; G06F 2212/654 20130101 |
Class at
Publication: |
711/137 ;
711/207; 711/206 |
International
Class: |
G06F 012/08 |
Claims
1. An apparatus for managing a translation mechanism in a processor
architecture comprising: an execution unit for generating an
effective address; a translator, wherein the translator at least
translates an effective address into a real address, wherein if a
translation is at least not available, then the real address is
unavailable; a miss manager, wherein the miss manager is at least
configured to manage unavailable real addresses from the means for
translating; means for pre-loading translation data; and a storage
means, wherein the storage means at least stores a plurality of
general data wherein the plurality of general data is at least
referenced by real addresses.
2. The apparatus of claim 1, wherein the storage means further
comprises a page table, wherein the page table is configured to at
least provide a plurality of references to the plurality of general
data.
3. The apparatus of claim 2, wherein the means for pre-loading
further comprises a communication channel between the page table
and the translator.
4. The apparatus of claim 3, wherein the translation mechanism
further comprises a software manager coupled to the translator.
5. The apparatus of claim 4, wherein the software manager further
comprises: a data port for transporting data therebetween; an
index table for supplying index data to the translator; and means
for providing management of the means for pre-loading.
6. The apparatus of claim 4, wherein the translation mechanism
further comprises a hardware manager coupled to the translator.
7. The apparatus of claim 6, wherein the hardware manager further
comprises: a data port for transporting data therebetween; an
index table for supplying index data to the means for translating;
and means for providing management of the means for
pre-loading.
8. The apparatus of claim 4, wherein the translation mechanism
further comprises a hardware manager coupled to the translator,
wherein the hardware manager further comprises a means for
providing management of the means for pre-loading.
9. The apparatus of claim 8, wherein the hardware manager further
comprises: a data port for transporting data therebetween; and an
index table for supplying index data to the translator.
10. The apparatus of claim 4, wherein the translation mechanism
further comprises a hardware manager coupled to the translator,
wherein the hardware manager further comprises a means for
providing at least partial management of the means for
pre-loading.
11. The apparatus of claim 10, wherein the hardware manager further
comprises: a data port for transporting data therebetween; an
index table for supplying index data to the means for translating;
and means for providing at least partial management of the means
for pre-loading.
12. A method for managing a translation mechanism in a processor
architecture comprising: generating an effective address;
translating an effective address into a real address, wherein if a
translation is at least not available, then the real address is
unavailable; managing unavailable real addresses from the step of
translating; pre-loading unavailable data; and accessing a
plurality of stored general data wherein the plurality of stored
general data is at least referenced by real addresses.
13. The method of claim 12, wherein the step of accessing further
comprises accessing a page table, wherein the page table is
configured to at least provide a plurality of references to the
plurality of general data.
14. The method of claim 13, wherein the step of pre-loading further
comprises utilizing a communication channel between the reference
table and a translator.
15. The method of claim 14, wherein the step of translation further
comprises at least utilizing a software manager coupled to the
translator.
16. The method of claim 15, wherein the step of at least utilizing
the software manager further comprises: transporting data through a
data port; supplying index data from an index table to the
translator; and providing management of the means for
pre-loading.
17. The method of claim 15, wherein the translation mechanism
method further comprises at least utilizing a hardware manager
coupled to the translator.
18. The method of claim 17, wherein the step of at least utilizing
software management means further comprises: transporting data
through a data port; supplying index data from an index table to
the translator; and providing management of the means for
pre-loading.
19. The method of claim 15, wherein the step of translation further
comprises at least utilizing a hardware manager coupled to the
translator, wherein the hardware management means further comprises
a means for providing management of the means for pre-loading.
20. The apparatus of claim 19, wherein the utilized hardware
management means further comprises: transporting data through a
data port; supplying index data from an index table to the
translator.
21. The method of claim 15, wherein the step of translating further
comprises: at least utilizing a hardware manager coupled to the
translator; and at least providing at least partial management of
the means for pre-loading.
22. The apparatus of claim 21, wherein the software management
means further comprises: transporting data through a data port;
supplying index data from an index table to the translator; and
providing management of the means for pre-loading.
23. A computer program product for managing a translation mechanism
in a processor architecture, the computer program product having a
medium with a computer program embodied thereon, the computer
program comprising: computer program code for transporting data
through a data port; computer program code for supplying index data
from an index table to the translator; and computer program code
for providing management of the means for pre-loading.
24. A processor for managing a translation mechanism, the processor
including a computer program comprising: computer program code for
transporting data through a data port; computer program code for
supplying index data from an index table to the translator; and
computer program code for providing management of the means for
pre-loading.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] The invention relates generally to translation mechanisms
in a computer architecture and, more particularly, to efficiently
managing a translation mechanism to prevent problems associated with
"warming" a translation cache.
[0003] 2. Description of the Related Art
[0004] Many of today's processor architectures provide a
translation mechanism for converting an effective address (EA) used
by an application into a real address (RA) used for referencing
real storage. One example of such a processor architecture is
PowerPC™. The translation process uses a translation table to
translate an EA to an RA. The translation table, or page table, is
typically stored in memory. For performance reasons, a typical
implementation of the translation mechanism uses a cache and/or
buffering structure to hold recently used translations. This
structure is referred to as a Translation Lookaside Buffer (TLB) in
PowerPC™. Each instruction using an EA causes a lookup in the
TLB. When a translation is not found in the TLB (for example, there
is a TLB demand miss), a hardware state machine or software routine
is invoked to load the requested translation.
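The lookup-then-load behavior described above can be sketched as a small model. This is only an illustration: the 4 KB page size, the class name `TLB`, and the dictionary-based page table are assumptions made for the example and are not taken from the application.

```python
PAGE_SIZE = 4096  # assumed page size; the application does not specify one


class TLB:
    """Toy translation buffer mapping effective page numbers to real page numbers."""

    def __init__(self, page_table):
        self.page_table = page_table  # full translation table held in "memory"
        self.entries = {}             # recently used translations
        self.demand_misses = 0

    def translate(self, ea):
        """Translate an effective address (EA) into a real address (RA)."""
        epn, offset = divmod(ea, PAGE_SIZE)
        if epn not in self.entries:
            # TLB demand miss: a miss handler loads the requested translation.
            self.demand_misses += 1
            self.entries[epn] = self.page_table[epn]
        return self.entries[epn] * PAGE_SIZE + offset


page_table = {0x10: 0x2A, 0x11: 0x07}   # hypothetical EA page -> RA page entries
tlb = TLB(page_table)
ra = tlb.translate(0x10 * PAGE_SIZE + 0x123)
ra_again = tlb.translate(0x10 * PAGE_SIZE + 0x456)  # same page: served from the TLB
```

Only the first access to a page takes the miss path; later accesses to the same page reuse the cached entry.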
[0005] As with any caching mechanism, latency and bandwidth suffer
when the cache does not contain a substantial amount of valid
information required by an application. This condition is referred
to as a "cold" cache. When a translation cache is cold, each access
to a new area in storage causes a hardware or software action to be
performed to load the requested translation. These demand misses
continue until the translation caches are loaded with the most
frequently used translations (for example, the translation cache is
"warmed"). The additional latency and bandwidth degradation caused
by the initial demand misses increase the runtime of an
application. This condition, commonly referred to as the startup
penalty, typically occurs when a program is first run or when the
processor swaps from one task to another. The startup penalty
results in differences between the runtime of an application when
executed on a "cold" versus a "warm" cache.
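The difference between a "cold" and a "warm" run can be made concrete with a rough cost model. The cycle counts below are hypothetical placeholders, not measurements, and the model ignores TLB capacity and eviction.

```python
PAGE_SIZE = 4096   # assumed page size
HIT_COST = 1       # hypothetical cost of a TLB hit, in cycles
MISS_COST = 100    # hypothetical cost of a demand miss, in cycles


def run(accesses, cached_pages):
    """Return the total translation cost for a sequence of effective addresses.

    `cached_pages` is the set of page numbers currently held in the
    translation cache; it is updated in place as misses are resolved.
    """
    cycles = 0
    for ea in accesses:
        page = ea // PAGE_SIZE
        if page in cached_pages:
            cycles += HIT_COST
        else:
            cycles += MISS_COST      # demand miss: load the translation
            cached_pages.add(page)
    return cycles


# Four accesses to each of eight pages.
accesses = [p * PAGE_SIZE + off for p in range(8) for off in (0, 64, 128, 192)]
cache = set()
cold = run(accesses, cache)   # first run: every page misses once
warm = run(accesses, cache)   # second run: every access hits
startup_penalty = cold - warm
```

Under these assumed costs, the whole gap between the two runs comes from the eight initial demand misses.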
[0006] The startup penalty can be acceptable for non real-time
applications. However, a real-time application should account for
the worst-case latencies and bandwidth to guarantee a task can be
completed in a specific amount of time (for example, a deadline).
Therefore, real-time applications should account for the
performance of a "cold" cache and, typically, cannot take full
advantage of the system performance. In addition, a real-time
application that does not properly account for the performance
differences between a "cold" and "warm" translation cache can miss
a deadline.
[0007] Therefore, there is a need for a method and/or apparatus for
avoiding the performance penalty of warming a cold cache that
addresses at least some of the problems associated with the
conventional demand miss methods and apparatuses for warming a cold
translation cache.
SUMMARY OF THE INVENTION
[0008] The present invention provides a computer program product
for managing a translation mechanism in a processor architecture,
the computer program product having a medium with a computer
program embodied thereon. A computer program code for transporting
data through a data port is provided. Also a computer program code
for supplying index data from an index table to the translator is
provided. There is also a computer program code for providing
management of the means for pre-loading.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] For a more complete understanding of the present invention
and the advantages thereof, reference is now made to the following
descriptions taken in conjunction with the accompanying drawings,
in which:
[0010] FIG. 1 is a block diagram depicting a conventional
software-controlled translation mechanism;
[0011] FIG. 2 is a block diagram depicting a conventional
hardware-controlled translation mechanism;
[0012] FIG. 3 is a block diagram depicting a Software-controlled
Pre-load Translation Mechanism; and
[0013] FIG. 4 is a block diagram depicting a Hardware-controlled
Pre-load Translation Mechanism.
DETAILED DESCRIPTION
[0014] In the following discussion, numerous specific details are
set forth to provide a thorough understanding of the present
invention. However, those skilled in the art will appreciate that
the present invention can be practiced without such specific
details. In other instances, well-known elements have been
illustrated in schematic or block diagram form in order not to
obscure the present invention in unnecessary detail. Additionally,
for the most part, details concerning network communications,
electro-magnetic signaling techniques, and the like, have been
omitted inasmuch as such details are not considered necessary to
obtain a complete understanding of the present invention, and are
considered to be within the understanding of persons of ordinary
skill in the relevant art.
[0015] It is further noted that, unless indicated otherwise, all
functions described herein can be performed in either hardware or
software, or some combinations thereof. In a preferred embodiment,
however, the functions are performed by a processor such as a
computer or an electronic data processor in accordance with code
such as computer program code, software, and/or integrated circuits
that are coded to perform such functions, unless indicated
otherwise.
[0016] Referring to FIG. 1 of the drawings, the reference numeral
100 generally designates a conventional software-controlled
translation mechanism implementation. The Translation Mechanism
Implementation 100 comprises Translation Mechanism 104 and a
Software TLB Management Interface 102. The Translation Mechanism
104 comprises an Execution Unit (EU) 110, a Translation Lookaside
Buffer (TLB) 112, a Software Miss Handler 114, and a Main Storage
116. The Main Storage 116 further includes a Page Table 118. In
addition, Main Storage 116 can also include memory mapped I/O
devices and registers. The Software TLB Management Interface 102
comprises a TLB Data Port 106 and a TLB Index 108.
[0017] Within the translation mechanism implementation 100, there
is a plurality of interconnected devices that each perform specific
tasks. The EU 110 executes instructions, such as instructions
contained in an executable file. Instructions using an Effective
Address (EA) to reference Main Storage 116 cause the EU 110 to
forward the EA to the TLB 112 for translation. The TLB 112 searches
the translation buffer or cache for a translation for the EA. If
no translation exists for the EA issued by the EU 110,
then the Software Miss Handler 114 searches the Page Table 118 for
the required translation by computing the RA of the Page Table
entry needed to translate the EA provided by the EU 110. The
Software Miss Handler 114 is typically executed in the EU 110 or
another processor in the system. Once the proper translation has
been found for the EA requested by the EU 110, the translation is
loaded into the TLB 112, utilizing the Software TLB Management
Interface 102. The translation can now be used
for future reference and the current EA is converted into a Real
Address (RA) based on the data found in the Page Table 118. If the
translation is not found in the Page Table 118, the Software Miss
Handler 114 typically invokes a separate software mechanism (not
shown) to resolve the translations missing in the Page Table 118.
Translations can be missing because certain portions of the Page
Table 118 have been swapped to a mass media device, such as a hard
drive, to make more efficient use of processor memory, typically
when the translation entries in the swapped portion of the Page
Table 118 have not been used for a lengthy period of time.
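The miss-handling chain in this paragraph (TLB, then Page Table, then a separate mechanism for swapped-out entries) can be sketched as follows. The function names, the `TranslationFault` exception, and the use of dictionaries for the TLB, Page Table, and swapped-out entries are all assumptions made for illustration.

```python
class TranslationFault(Exception):
    """Raised when no translation can be found anywhere (hypothetical name)."""


def software_miss_handler(epn, tlb, page_table, swapped_out):
    """Resolve a TLB miss for effective page number `epn`."""
    if epn not in page_table:
        # Translation missing from the Page Table: invoke a separate
        # mechanism, modeled here as restoring a swapped-out entry.
        if epn not in swapped_out:
            raise TranslationFault(hex(epn))
        page_table[epn] = swapped_out.pop(epn)
    tlb[epn] = page_table[epn]   # load the translation for future reference
    return tlb[epn]


def translate(epn, tlb, page_table, swapped_out):
    """Return the real page number for `epn`, handling a miss if needed."""
    if epn in tlb:
        return tlb[epn]
    return software_miss_handler(epn, tlb, page_table, swapped_out)


tlb = {}
page_table = {1: 10}
swapped_out = {2: 20}
rpn_resident = translate(1, tlb, page_table, swapped_out)  # found in Page Table
rpn_swapped = translate(2, tlb, page_table, swapped_out)   # restored, then loaded
```

A page that is in neither structure raises `TranslationFault` in this sketch; a real system would handle that case in its own way.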
[0018] Within the Translation Mechanism 104, there exist a variety
of connections to allow for the operation of the Mechanism 104 as
described. The EU 110 is coupled to the TLB 112 through a first
communication channel 126, wherein the first communication channel
126 transfers an EA to the TLB 112. The TLB 112 is coupled to the
Software TLB Management Interface 102 through a second
communication channel 120 and a third communication channel 122.
The second communication channel 120 and the third communication
channel 122 each provide control data to the TLB 112. Also, the
second communication channel 120 and the third communication
channel 122 are used by the Software Miss Handler 114 to load
translations found in the Page Table 118 into the TLB 112. The TLB
112 is further coupled to the Software Miss Handler 114 through a
fourth communication channel 128, wherein a TLB Miss is
communicated from the TLB 112 to the Software Miss Handler 114. TLB
112 is also coupled to the Main Storage 116 through a fifth
communication channel 132, wherein the translated RA for the EU 110 is
communicated from the TLB 112 to the Main Storage 116. The Software
Miss Handler 114 is coupled to the Page Table 118 through a sixth
communication channel 130. The sixth communication channel 130 is
used by the Software Miss Handler 114 to search the Page Table 118
for the translations missing in the TLB 112. Also, the EU 110 is
coupled to the Main Storage 116 through a seventh communication
channel 134, wherein data is intercommunicated between the EU 110 and
the Main Storage 116.
[0019] Within the Software TLB Management Interface 102, there
exist a variety of connections to allow for the operation of the
interface. The TLB Data Port 106 is coupled to the TLB 112 of the
Translation Mechanism 104 through the second communication channel
120, wherein translation data is transferred from the TLB Data Port
106 to the TLB 112. The TLB Data Port 106 provides a communication
port for delivering missing translations to the TLB 112. The TLB
Index 108 is coupled to the TLB 112 of the Translation Mechanism
104 through the third communication channel 122. Index data is
communicated from the TLB Index 108 to the TLB 112 through the
third communication channel 122. The TLB Index 108 contains the
buffer location in the TLB 112 for the missing translations
supplied by the TLB Data Port 106.
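One way to picture the pairing of a TLB Index with a TLB Data Port is as two registers: the index selects a buffer location, and a write to the data port stores a translation there. The class below is a hypothetical model of that idea; the method names and the slot layout are assumptions, not an interface defined in the application.

```python
class SoftwareTLBInterface:
    """Toy model of a software TLB management interface."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots  # each slot holds (epn, rpn) or None
        self.index = 0                   # TLB Index: slot targeted by the next write

    def write_index(self, slot):
        """Select the buffer location for the next translation."""
        if not 0 <= slot < len(self.slots):
            raise IndexError("slot out of range")
        self.index = slot

    def write_data_port(self, epn, rpn):
        """Deliver a translation to the slot selected by the index."""
        self.slots[self.index] = (epn, rpn)

    def lookup(self, epn):
        """Return the real page number for `epn`, or None on a miss."""
        for entry in self.slots:
            if entry is not None and entry[0] == epn:
                return entry[1]
        return None


iface = SoftwareTLBInterface(num_slots=4)
iface.write_index(2)
iface.write_data_port(0x10, 0x2A)
hit = iface.lookup(0x10)
miss = iface.lookup(0x11)
```

Separating the index from the data lets software choose which entry to overwrite, which is what gives a software-managed TLB control over its replacement policy.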
[0020] Now referring to FIG. 2 of the drawings, the reference
numeral 204 generally designates a conventional hardware-controlled
Translation Mechanism Implementation. The Translation Mechanism
Implementation 204 comprises an EU 210, a TLB 212, a Hardware Miss
Handler 214, and a Main Storage 216. The Main Storage 216 further
includes a Page Table 218. In addition, Main Storage 216 can also
include memory mapped I/O devices and registers.
[0021] Within the Translation Mechanism Implementation 200, there
is a plurality of interconnected devices that each performs
specific tasks. The EU 210 executes instructions such as those
contained in an executable file. Instructions using an EA to
reference Main Storage 216 cause the EU 210 to forward the EA to
the TLB 212 for translation. The TLB 212 searches the translation
buffers or cache for a translation for the EA. If there does not
exist a translation for the EA issued by the EU 210, then the
Hardware Miss Handler 214 searches for the unavailable, required
translation in the Page Table 218. Once the proper translation has
been found, the translation is loaded into the TLB 212 for future
reference and the current EA is converted into an RA. The RA is
then communicated to the Main Storage 216 through a fourth
communication channel 232. Once the RA has been transmitted, data
can be effectively transferred between the Main Storage 216 and the
EU 210. If the translation is not found in the Page Table 218, the
Hardware Miss Handler 214 typically invokes a software mechanism to
resolve translations missing in the Page Table 218.
[0022] Within the Translation Mechanism 204, there exist a variety
of connections to allow for the operation of the Mechanism 204. The
EU 210 is coupled to the TLB 212 through a first communication
channel 226, wherein the first communication channel 226 transfers
an EA to the TLB 212. The TLB 212 is coupled to the Page Table 218
through a second communication channel 224, wherein the second
communication channel 224 provides control data intercommunicated
between the TLB 212 and the Page Table 218. The second
communication channel 224 is used by the Hardware Miss Handler 214
to load translations found in the Page Table 218 into the TLB 212.
The TLB 212 is further coupled to the Hardware Miss Handler 214
through a third communication channel 228, wherein a TLB MISS is
communicated from the TLB 212 to the Hardware Miss Handler 214. TLB
212 is also coupled to the Main Storage 216 through the fourth
communication channel 232, wherein the translated RA for the EU 210 is
communicated from the TLB 212 to the Main Storage 216. The Hardware
Miss Handler 214 is coupled to the Page Table 218 through a fifth
communication channel 230. The fifth communication channel 230 is
used by the Hardware Miss Handler 214 to search the Page Table 218
for the translations missing in the TLB 212. Also, the EU 210 is
coupled to the Main Storage 216 through a sixth communication
channel 234, wherein data is intercommunicated between the EU 210
and the Main Storage 216.
[0023] Referring to FIG. 3 of the drawings, the reference numeral
300 generally designates a Software-controlled Pre-load Translation
Mechanism. The Software-controlled Pre-Load Translation Mechanism
300 is similar to the Software-controlled Translation Mechanism
Implementation 100 of FIG. 1, with the inclusion of an additional
Software Pre-Load Mechanism 301. The TLB Pre-Load Translation
Mechanism 300 comprises a Software Pre-Load Mechanism 301, a
Software-controlled Translation Mechanism 304, and a Software TLB
Management Interface 302. The configurations of Mechanism 304 and
of Software TLB Management Interface 302 are substantially similar
to the Mechanism 104 and Software TLB Management Interface 102 of
FIG. 1, respectively.
[0024] Within the Software TLB Management Interface 302, there
exist a variety of connections to allow for the operation of the
interface. The TLB Data Port 306 is coupled to the TLB (not shown
but substantially similar to TLB 112 of FIG. 1) of the Translation
Mechanism 304 through the first communication channel 320, wherein
translation data is transferred from the TLB Data Port 306 to the
Translation Mechanism 304. Also, the TLB Index 308 is coupled to
the Translation Mechanism 304 through a second communication
channel 322. Index data is communicated from the TLB Index 308 to
the Translation Mechanism 304 through the second communication
channel 322. The TLB Index 308 contains the buffer location for the
missing translations supplied by the TLB Data Port 306.
[0025] The Software Pre-load Mechanism 301 distinguishes the
Software-controlled Pre-load Translation Mechanism 300 of FIG. 3
from any other conventional Translation Mechanism Implementations,
such as the Translation Mechanism Implementation 100 of FIG. 1. The
Software Pre-Load Mechanism 301 is coupled to the Software TLB
Management Interface 302 through a third communication channel 311.
The Software Pre-load Mechanism 301 with an extension of the
Software TLB Management Interface 302 allows translations to be
pre-loaded into a TLB (not shown) from a Page Table (not shown)
prior to the running of an application. In addition, the extensions
allow for the state of the TLB (not shown) to be saved and restored
when swapping tasks running on the execution unit. Pre-loading and
restoring of the TLB provide for a reduction in the lag time by
warming the associated TLB (not shown). Furthermore, the
combination also allows for re-initializing the TLB when switching
the context of the processor as opposed to a simple save and
restore.
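Saving and restoring TLB state across a task swap can be sketched as below. The `Task` class, the `switch_task` function, and the list-of-slots representation are assumptions for illustration; the application does not define these structures.

```python
class Task:
    """Hypothetical task record holding a snapshot of its TLB contents."""

    def __init__(self, name):
        self.name = name
        self.saved_tlb = None   # filled in when the task is swapped out


def switch_task(tlb_slots, outgoing, incoming):
    """Save the outgoing task's TLB state and return the incoming task's.

    If the incoming task has no snapshot, it starts with an empty
    ("cold") TLB of the same size.
    """
    outgoing.saved_tlb = list(tlb_slots)
    if incoming.saved_tlb is not None:
        return list(incoming.saved_tlb)
    return [None] * len(tlb_slots)


task_a, task_b = Task("a"), Task("b")
slots = [(1, 10), (2, 20)]                   # task_a's warmed translations
slots = switch_task(slots, task_a, task_b)   # task_b has no snapshot: cold start
slots[0] = (5, 50)                           # task_b loads one translation
slots = switch_task(slots, task_b, task_a)   # task_a resumes with a warm TLB
```

Restoring a snapshot lets a resumed task skip the demand misses it would otherwise take while re-warming the TLB.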
[0026] The Software Pre-load Mechanism 301 provides the
applications with an interface for requesting the pre-load of
translations. The requested translations can also be used to
re-initialize the translations when switching the context of the
processor. The interface can be an extension of the memory advise
or "madvise" operating system call.
[0027] The "madvise" call includes effective address and region
size parameters, which define the start and size of an area in Main
Storage for which translations are needed by an application. When
receiving a "madvise" call, the Software Pre-load Mechanism 301
searches the Page Table (not shown) for the translations for the
memory area defined by the parameters. Once the translations are
found, the Software Pre-load Mechanism 301 loads the translations
into the TLB (not shown) using the Software TLB Management
Interface 302.
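The pre-load request described above can be sketched as a function that walks every page in the requested region and loads any translation it finds. This is a model only: `preload_translations` is a hypothetical name, the 4 KB page size is assumed, and the sketch does not call the real operating system "madvise" interface.

```python
PAGE_SIZE = 4096  # assumed page size


def preload_translations(ea, size, page_table, tlb):
    """Load translations for the region [ea, ea + size) into `tlb`.

    `page_table` and `tlb` map effective page numbers to real page
    numbers. Returns the number of translations newly loaded.
    """
    if size <= 0:
        return 0
    first_page = ea // PAGE_SIZE
    last_page = (ea + size - 1) // PAGE_SIZE
    loaded = 0
    for epn in range(first_page, last_page + 1):
        rpn = page_table.get(epn)          # search the Page Table
        if rpn is not None and epn not in tlb:
            tlb[epn] = rpn                 # load via the management interface
            loaded += 1
    return loaded


page_table = {4: 40, 5: 50, 6: 60}
tlb = {}
# A region that starts 100 bytes into page 4 and spans two pages' worth
# of bytes touches pages 4, 5, and 6.
count = preload_translations(4 * PAGE_SIZE + 100, 2 * PAGE_SIZE, page_table, tlb)
```

Because the region need not be page-aligned, the sketch rounds outward to cover every page the region touches.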
[0028] Referring to FIG. 4 of the drawings, the reference numeral
400 generally designates a Hardware-controlled Pre-Load Translation
Mechanism. The Hardware-controlled Pre-Load Translation Mechanism
400 is similar to the hardware-controlled Translation Mechanism
Implementation 204 of FIG. 2, with the inclusion of an additional
Software Pre-Load Mechanism 401 and a Software TLB Management
Interface 402.
[0029] The Hardware-controlled Translation Mechanism
Implementation 400 is distinguished from any other conventional
Hardware-controlled Translation Mechanism Implementations, such as
the Implementation 200 of FIG. 2. Included in the Implementation
400 are a Software TLB Management Interface 402 and a Software
Pre-Load Mechanism 401. The Hardware-controlled Translation
Mechanism Implementation 400 also comprises a Translation Mechanism
404. Moreover, the configuration of the Mechanism 404 is
substantially similar to the Mechanism 204 of FIG. 2.
[0030] The operation of the Software Pre-load Mechanism 401 in the
Implementation 400 is similar to the operation of the Software
Pre-Load Mechanism 301 of FIG. 3. However, to allow for
the Software Pre-load Mechanism to work in a hardware-controlled
mechanism, a Software TLB management interface is required. The
interface is typically not included in conventional
Hardware-controlled mechanisms since the TLB is managed by hardware
miss handlers.
[0031] Within the Software TLB Management Interface 402, there
exist a variety of connections to allow for the operation of the
interface. The TLB Data Port 406 is coupled to the TLB 412 (not
shown) of the Translation Mechanism 404 through the first
communication channel 420, wherein translation data is transferred
from the TLB Data Port 406 to the Translation Mechanism 404. The
TLB Data Port 406 provides a communication port for delivering
missing translations to the Translation Mechanism 404. The TLB
Index 408 is coupled to the Translation Mechanism 404 through a
second communication channel 422. Index data is communicated from
the TLB Index 408 to the Translation Mechanism 404 through the
second communication channel 422. The TLB Index 408 contains the
buffer location for the missing translations supplied by the TLB
Data Port 406.
[0032] Included with the Hardware-controlled Pre-Load Mechanism 400
is a Software Pre-Load Mechanism. The Software Pre-Load Mechanism
401 is coupled to the Software TLB Management Interface 402 through
a third communication channel 411. The Software Pre-load Mechanism
401 with an extension of the Software TLB Management Interface 402
allows translations to be pre-loaded into a TLB (not shown) from a
Page Table (not shown) prior to the running of an application. In
addition, the extensions allow for the state of the TLB (not shown)
to be saved and restored when swapping tasks running on the execution
unit. Pre-loading and restoring of the TLB (not shown) provide for
a reduction in the lag time by warming the associated TLB (not
shown). Furthermore, the combination also allows for
re-initializing the TLB when switching the context of the processor
as opposed to a simple save and restore.
[0033] The Software Pre-load Mechanism 401 provides the
applications with an interface for requesting the pre-load of
translations. The requested translations can also be used to
re-initialize the translations when switching the context of the
processor. The interface can be an extension of the memory advise
or "madvise" operating system call.
[0034] The "madvise" call includes effective address and region
size parameters, which define the start and size of an area in Main
Storage for which translations are needed by an application. When
receiving a "madvise" call, the Software Pre-load Mechanism 401
searches the Page Table (not shown) for the translations for the
memory area defined by the parameters. Once the translations are
found, the Software Pre-load Mechanism 401 loads the translations
into the TLB (not shown) using the Software TLB Management
Interface 402.
[0035] There are advantages and disadvantages to both hardware-managed
and software-managed TLBs (not shown). For example, the latency for
resolving a TLB miss is less in a hardware-managed TLB mechanism
than in a software-managed TLB mechanism. However, there is less
control of the Page Table structure and the translations contained
in the TLB of a hardware-controlled TLB mechanism. The
Hardware-controlled Pre-load Translation Mechanism 400 of FIG. 4
further includes a configurable Hardware Miss Handler (not shown),
which invokes a Software Miss Handler (not shown) when the
translation is not found in the TLB (not shown). The inclusion of a
configurable Hardware Miss Handler (not shown) allows the system
software to choose the best method for managing the translations
required by an application.
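A configurable miss handler of this kind might be modeled as a function selected by a mode setting. The mode names `"hardware"` and `"software"`, the factory function, and the fallback behavior are assumptions chosen for the sketch; the application does not specify how the configuration is expressed.

```python
def make_miss_handler(mode, page_table, software_handler):
    """Build a miss handler for the given mode (hypothetical configuration).

    "hardware": walk the Page Table directly and fall back to the
                software handler only if the entry is not found.
    "software": always hand the miss to the software handler.
    """
    if mode not in ("hardware", "software"):
        raise ValueError("unknown mode: " + repr(mode))

    def handle_miss(epn):
        if mode == "hardware":
            rpn = page_table.get(epn)
            if rpn is not None:
                return rpn
        return software_handler(epn)

    return handle_miss


software_calls = []


def software_handler(epn):
    """Stand-in software miss handler that records each invocation."""
    software_calls.append(epn)
    return 99   # placeholder real page number


page_table = {1: 10}
hw_handler = make_miss_handler("hardware", page_table, software_handler)
sw_handler = make_miss_handler("software", page_table, software_handler)
```

In this model, system software that wants low miss latency would pick the hardware mode, while software that needs full control over the translations would pick the software mode.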
[0036] From the foregoing description, it is understood that it is
also possible to have varying degrees of concurrent control and
management of a given Translation Lookaside Buffer. Hence, there
are multiple embodiments of the present invention that can
encompass varying degrees of control and/or management with respect
to software and hardware.
[0037] It will further be understood from the foregoing description
that various modifications and changes can be made in the preferred
embodiment of the present invention without departing from its true
spirit. This description is intended for purposes of illustration
only and should not be construed in a limiting sense. The scope of
this invention should be limited only by the language of the
following claims.
* * * * *