U.S. patent application number 10/446914 was filed with the patent office on 2003-05-27 and published on 2003-10-16 for system and method for translation buffer accommodating multiple page sizes.
This patent application is currently assigned to Intel Corporation. Invention is credited to Mathews, Gregory S.
Application Number | 20030196066 10/446914 |
Family ID | 28042179 |
Filed Date | 2003-05-27 |
United States Patent
Application |
20030196066 |
Kind Code |
A1 |
Mathews, Gregory S. |
October 16, 2003 |
System and method for translation buffer accommodating multiple
page sizes
Abstract
A translation buffer is described which can translate virtual
addresses to physical addresses wherein the virtual addresses have
varying page sizes. The translation buffer includes a decoder to
generate a hashed index, the index identifying an entry into two
arrays. The first of the two arrays identifies a corresponding
physical page address and the other array identifies a
corresponding variable page address that in comparison to a
variable portion of the virtual address, will indicate whether the
entry in the first array has a matching entry. If the first array
identifies a matching physical page address, then the physical page
address is combined with the offset of the virtual address to yield
a physical address translation of the virtual address.
Inventors: |
Mathews, Gregory S.; (Santa
Clara, CA) |
Correspondence
Address: |
SCHWEGMAN, LUNDBERG, WOESSNER & KLUTH, P.A.
P.O. BOX 2938
MINNEAPOLIS
MN
55402
US
|
Assignee: |
Intel Corporation
|
Family ID: |
28042179 |
Appl. No.: |
10/446914 |
Filed: |
May 27, 2003 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10446914 | May 27, 2003 | |
09475607 | Dec 30, 1999 | |
Current U.S.
Class: |
711/207 ;
711/205; 711/E12.061 |
Current CPC
Class: |
G06F 2212/652 20130101;
G06F 12/1027 20130101 |
Class at
Publication: |
711/207 ;
711/205 |
International
Class: |
G06F 009/34 |
Claims
What is claimed is:
1. An apparatus, comprising: a translation buffer to store a
portion of a physical page address associated with a physical
memory; and a page size bias input coupled to the translation
buffer to indicate a plurality of positions within a lower portion
of a tag of a virtual address, the plurality of positions
corresponding to a plurality of page sizes within the physical
memory.
2. The apparatus of claim 1, further comprising: a variable page
address input coupled to the translation buffer.
3. The apparatus of claim 1, further comprising: a decoder to
couple the page size bias input to the translation buffer.
4. The apparatus of claim 1, wherein the page size bias input is to
map a range of positions spanning a smallest page size to a largest
page size of the plurality of page sizes.
5. An apparatus, comprising: a physical memory having a plurality
of page sizes; and a translation buffer having a page size bias
input to indicate a plurality of positions within a lower portion
of a tag of a virtual address, the plurality of positions
corresponding to the plurality of page sizes within the physical
memory.
6. The apparatus of claim 1, further comprising: a variable page
address input coupled to the translation buffer.
7. The apparatus of claim 6, further comprising: a decoder coupled
to the page size bias input and the variable page address input,
wherein the decoder is to provide a first wordline to a first
wordline select output included in the decoder.
8. The apparatus of claim 7, further comprising: a physical memory
page address output coupled to the translation buffer; and a first
array coupled to the variable page address input and the first
wordline select output, wherein the first array is to store a
corresponding physical memory page address to be provided to the
physical memory page address output, and a virtual fixed page
address to be provided to a virtual fixed page address output.
9. The apparatus of claim 8 wherein the first array is a
direct-mapped array to store the virtual fixed page address, a page
mask, and the corresponding physical memory page address.
10. The apparatus of claim 7, further comprising: a direct-mapped
array coupled to the variable page address input and the first
wordline select output, the direct-mapped array to store an entry
including a virtual address tag, a flag indicating validity of the
entry, and a page mask, wherein the direct-mapped array is to
provide an indication of a translation lookaside buffer miss or a
translation lookaside buffer hit.
11. The apparatus of claim 10, further comprising: a virtual fixed
page address input coupled to the translation buffer; and a content
addressable array coupled to the virtual fixed page address input,
the content addressable array having a third wordline select output
coupled to the direct-mapped array, wherein the content addressable
array is to store bits to select the entry for invalidation.
12. A system comprising: a processor coupled to a physical memory
having a plurality of page sizes; a translation buffer to store a
portion of a physical page address associated with the physical
memory; and a page size bias input coupled to the translation
buffer to indicate a plurality of positions within a lower portion
of a tag of a virtual address, the plurality of positions
corresponding to the plurality of page sizes.
13. The system of claim 12, further comprising: a second processor
coupled to the physical memory.
14. The system of claim 12, further comprising: a page table
coupled to the translation buffer.
15. The system of claim 12, further comprising: a decoder coupled
to the page size bias input, wherein the decoder is to provide a
first wordline to a first wordline select output included in the
decoder.
Description
[0001] This application is a continuation of U.S. patent
application Ser. No. 09/475,607, filed Dec. 30, 1999, which is
incorporated herein by reference.
TECHNICAL FIELD OF THE INVENTION
[0002] The present invention relates generally to computer systems
having virtual memory addressing, and in particular the present
invention relates to such computer systems having a translation
lookaside buffer (TLB) or similar cache for use with virtual memory
addressing.
BACKGROUND OF THE INVENTION
[0003] Virtual memory addressing is a common strategy used to
permit computer systems to have more addressable memory than the
actual physical memory installed within a given computer system.
Data is stored on a storage device such as a hard disk drive and is
loaded into physical memory as needed typically on a memory
page-by-memory page basis, where a memory page is a predetermined
amount of contiguous memory. Computer systems having virtual memory
addressing must translate a given virtual memory address to a
physical memory address that temporarily corresponds to the virtual
address.
[0004] In many such computer systems, translation is accomplished
via a translation lookaside buffer (TLB), also known by those
skilled in the art as a TC (translation cache). The TLB is a cache,
preferably located near the processor of the computer system in
order to improve access speed, that holds the virtual
page-to-physical page mappings most recently used by the processor.
The TLB entries may be cached entries from a page table or
translations created and/or inserted by the operating system. The
translation of virtual to physical addresses is commonly a
critical path in computer performance. Conventional TLB
organizations well-known to those skilled in the art include
direct-mapping in which an entry can appear in the TLB in only one
position, fully associative mapping in which an entry can be placed
anywhere in the TLB, and set-associative in which an entry can be
placed in a restricted set of places in the TLB where a set is a
group of entries in the cache and an entry can be placed anywhere
within the set.
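The three organizations differ only in how many slots a given translation may occupy. The following is a minimal sketch of that difference, not a description of any particular TLB. The function name `candidate_slots`, the flat slot numbering, and the use of the low bits of the page number as the set index are all assumptions made for illustration.

```python
def candidate_slots(virtual_page_number: int, num_entries: int, ways: int) -> list[int]:
    """Return the TLB slots in which a translation for this page may reside.

    ways == 1            -> direct-mapped (exactly one slot)
    ways == num_entries  -> fully associative (any slot)
    otherwise            -> set-associative (any slot within one set)
    """
    if num_entries <= 0 or ways <= 0 or num_entries % ways:
        raise ValueError("num_entries must be a positive multiple of ways")
    num_sets = num_entries // ways
    # Assumption: the set is selected by the page number modulo the set count.
    set_index = virtual_page_number % num_sets
    # Slots are numbered so that each set occupies `ways` consecutive slots.
    return [set_index * ways + way for way in range(ways)]
```

With eight entries, `ways=1` yields a single candidate slot, `ways=8` yields all eight, and `ways=2` yields the two slots of one set.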
[0005] Fully associative TLBs conventionally include a Content
Addressable Memory (CAM) array and a Random Access Memory (RAM)
array. CAM, also known as "associative memory," is a kind of storage
device which includes comparison logic with each bit of storage. A
data value is broadcast to all words of storage and compared with
the values there. Words which match are flagged in some way.
Subsequent operations can then work on flagged words and/or data
linked to those flagged words, e.g. read them out one at a time or
write to certain bit positions in all of them.
[0006] Set-associative TLBs conventionally include decoders, RAM
arrays, and comparators. Part of the virtual address is used by the
decoder to determine which entries in the RAM array may contain a
corresponding physical address translation. The remainder of the
virtual address is typically used along with a tag stored in the
RAM array (each RAM array entry has a corresponding tag) by the
comparator to determine a specific entry to be used for
translation. Set-associative TLBs tend to be faster to access than
fully associative TLBs due to the use of decoders rather than CAM
arrays.
[0007] Conventional TLBs are designed to work with a fixed page
size, such as a 4K (1K=1024 bytes) page size, a 16K page size, or a
256K page size. This is less than optimal because memory space on
conventional personal computers (PCs) is designed in a manner
wherein different address ranges have differing page granularity
requirements. For example, on a PC, physical memory space between
addresses 640K and 1M (1M=2^20 bytes) needs
4K-8K granularity to support partitions for read-only memories
(ROMs), hard disk interfaces, graphics interfaces, etc., but
physical memory space below 640K and above 1M is random-access
memory (RAM), which would be more efficiently mapped with larger
page sizes.
[0008] A conventional solution is to use multiple TLBs in which at
least one TLB is implemented for each page size of addressable
memory space. For example, one TLB is implemented for memory space
that is addressed via 4K page sizes and another TLB is implemented
for memory space that is addressed via 16K page sizes. This is
problematic because all TLBs must be referenced for each virtual
address (slower than referencing a single TLB), the method allows
creation of multiple (overlapping) entries representing the same
virtual address, and the Operating System (OS) is limited to a
small set of possible page sizes.
[0009] Another conventional solution is to implement one TLB using
a page size of the smallest page size needed, such as 4K in the
above example of a conventional microprocessor. However, this is
problematic in that many more entries in the TLB will be needed to
describe the portions of memory that are addressed in larger page
sizes. For example, eight entries would be needed in a TLB to
describe every 32K page of memory if the TLB uses a page size of
4K. If the number of entries in the TLB is increased to accommodate
the requirement of more entries, this results in slower performance
because searching a larger TLB is slower than searching a smaller
TLB. If the number of entries in the TLB is not increased, then the
number of "misses" will increase (the case in which a given virtual
address has no corresponding entry in the TLB), thus causing
hardware or the OS to spend a significant number of cycles
retrieving the missing translation before program execution can
resume. Because the translation of virtual to physical addresses
is a bottleneck in the speed of computers, it is critical that
the translation be accomplished quickly.
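The entry-count arithmetic in the paragraph above can be checked with a short calculation. This is only an illustrative sketch; the helper name `entries_needed` is hypothetical.

```python
KIB = 1024  # 1K = 1024 bytes, as defined in the text


def entries_needed(region_bytes: int, tlb_page_bytes: int) -> int:
    """Number of fixed-size TLB entries needed to map a contiguous region."""
    if region_bytes <= 0 or tlb_page_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partially covered page still needs a full entry.
    return -(-region_bytes // tlb_page_bytes)
```

For a 32K page mapped by a TLB with a 4K page size, `entries_needed(32 * KIB, 4 * KIB)` gives the eight entries cited in the text.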
[0010] Therefore, a need exists for a single fast TLB that can
accommodate multiple page sizes quickly.
SUMMARY OF THE INVENTION
[0011] The system identifies virtual addresses as including three
portions: a virtual fixed page address in the upper bits of the
address word that is always used for identification of the page; an
offset address in the lower bits of the address word that is always
used for identification of the page offset; and a variable page
address between the virtual fixed page address and the offset, that
identifies either page address or offset address, depending on the
size of the page corresponding to the virtual address word.
[0012] In one embodiment of a method of the present invention, the
system receives a virtual address and page size bias for the
virtual address and outputs a corresponding physical address. The
page size bias is used in the look-up of the physical address.
During intermediate stages of the virtual to physical address
translation, according to the look-up of the virtual address and
page size bias, a page size mask and physical page address are
generated. The page size mask indicates what portion of the virtual
address describes the address of the virtual page in memory space,
and what portion of the address represents an offset within the
virtual page. Since the physical page size and virtual page size
are the same, the page size mask similarly indicates what portion
of the physical page address generated describes the translated
virtual page address and is to be used as physical address output
and what portion of the physical page address should be masked
(because it is not part of the page address) and replaced with the
virtual address offset within the page. The final physical address
consists of the unmasked portion of the physical page address
concatenated with the virtual address offset within the page (the
offset within the page is not translated).
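One way to picture the page size mask is as a bit pattern derived from the page size. The sketch below assumes a 64-bit address and assumes that the mask holds 1s over the page-address bits and 0s over the offset bits. The text does not fix the polarity in this paragraph, and the name `page_size_mask` is hypothetical.

```python
ADDRESS_BITS = 64  # address width assumed for illustration


def page_size_mask(page_bytes: int) -> int:
    """Return a mask with 1s over page-address bits and 0s over offset bits."""
    if page_bytes <= 0 or page_bytes & (page_bytes - 1):
        raise ValueError("page size must be a positive power of two")
    offset_mask = page_bytes - 1            # 1s over the offset bits
    all_ones = (1 << ADDRESS_BITS) - 1      # limit the result to 64 bits
    return all_ones & ~offset_mask
```

Under these assumptions a 4K page leaves the low twelve bits clear, and each doubling of the page size clears one more bit.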
[0013] In one embodiment of an apparatus, the present invention
generates a set of entry selects according to a virtual address and
page size bias supplied, generates a physical page address from an
entry selected by the entry selects in a first array, generates a
virtual address tag from an entry selected by the entry selects in
a first array, generates a page size mask from an entry selected by
the entry selects in a first array, and generates a match signal
from a comparison of the variable page address supplied with a
corresponding entry selected by the entry selects in a second array
(the match signal is also qualified with a valid bit contained
within the second array which indicates whether or not the
translation buffer entry selected is valid). A masked physical page
address is created by masking-off the lower bits of the generated
physical page address with the page size mask so that the address
bits which correspond to the portion of the address which
represents the offset within the page (as opposed to the portion of
the address which represents the address of the page within memory
space) are masked off. Then the offset address within the page is
created by masking the virtual address with the inverse of the page
size mask so that the address bits which correspond to the portion
of the address which represents the address of the page within
memory space (as opposed to the portion of the address which
represents the offset within the page) are masked off. The physical
address is then formed by combining the masked physical page
address with the offset address within the page.
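The masking and combining steps just described can be sketched as follows. This is a software illustration of the data flow only, assuming a 64-bit address and a mask with 1s over the page-address bits. The function and parameter names are hypothetical.

```python
ADDRESS_MASK = (1 << 64) - 1  # 64-bit address width assumed for illustration


def form_physical_address(physical_page_address: int,
                          virtual_address: int,
                          page_mask: int) -> int:
    """Combine a looked-up physical page address with the untranslated offset."""
    # Mask off the low bits of the physical page address that fall in the offset.
    masked_page = physical_page_address & page_mask
    # Keep only the offset bits of the virtual address (inverse of the mask).
    offset = virtual_address & ~page_mask & ADDRESS_MASK
    # The two parts occupy complementary bit positions, so OR concatenates them.
    return masked_page | offset
```

Because the same mask is applied to one operand and its inverse to the other, the two parts never overlap, whatever the page size of the selected entry.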
[0014] In another embodiment of an apparatus, a computer system
includes one or more processors, one or more physical memories
operating within the processor(s) in which the memories have more
than one page size identified to describe the corresponding
physical memory, and a translation buffer coupled to the physical
memory through an address bus in which the translation buffer
receives a virtual address and a page size bias and outputs a
physical memory address. The translation buffer includes a decoder
that receives the page size bias and a subset of the virtual
address input and outputs a set of entry selects. It also includes
an array that receives the entry selects from the decoder which
contains entries corresponding to those entry selects describing a
virtual fixed address tag, a page size mask, a physical memory page
address, in which the array outputs the physical address
corresponding to the virtual address supplied by combining
complementary portions of the physical page address and the virtual
page offset address. The array also outputs a virtual fixed address
tag which is compared to the virtual fixed address portion of the
virtual address supplied to generate a partial match signal.
Finally, the translation buffer includes a second array, which
contains a variable virtual address tag and a page size mask. The
second array inputs the variable page address portion of the
virtual address supplied and the entry selects. It then uses the
entry selects to select an entry and masks the variable page
address supplied with the page size mask of the entry selected such
that the portion of the variable page address which corresponds to
the offset address within the page is masked and compares this
result for equality with the variable virtual address tag of the
entry selected, similarly masked with the page size mask of the
entry selected, to generate a match signal (the match signal is also
qualified with a valid bit contained within the second array which
indicates whether or not the translation buffer entry selected is
valid). A translation match is indicated when both the partial
match signal from the first array and the match signal from the
second array are true. The translation can be performed in parallel
by one or more translation buffers to form a set-associative TLB in
which each of the translation buffers is one way of the TLB.
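The hit condition described above combines a full comparison of the fixed tag with a masked comparison of the variable bits, qualified by the valid bit. The sketch below models one selected entry. The class `Entry`, its field names, and the 16-bit variable field used in the usage note are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    fixed_tag: int      # virtual fixed address tag held in the first array
    variable_tag: int   # variable virtual address tag held in the second array
    variable_mask: int  # 1s where variable bits belong to the page address
    valid: bool         # valid bit held in the second array


def is_hit(entry: Entry, fixed_page_address: int, variable_page_address: int) -> bool:
    """Return True when the selected entry translates the supplied address."""
    # First array: the fixed portion is always compared in full.
    partial_match = entry.fixed_tag == fixed_page_address
    # Second array: mask away variable bits that act as offset for this entry.
    masked_input = variable_page_address & entry.variable_mask
    masked_tag = entry.variable_tag & entry.variable_mask
    match = entry.valid and masked_input == masked_tag
    return partial_match and match
```

For example, an entry with `variable_mask=0xFFF0` ignores the low four variable bits, so inputs `0x123A` and `0x1230` both match a tag of `0x1230`, while `0x124A` does not.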
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] FIG. 1 is a block diagram of a computer system of an
embodiment of the invention.
[0016] FIG. 2 is a block diagram of a virtual address word using a
4K page size.
[0017] FIG. 3 is a block diagram of a virtual address word using a
256M page size.
[0018] FIG. 4 is a block diagram of a virtual address word using a
variable page size ranging from a 4K page size to a 256M page
size.
[0019] FIG. 5 is a block diagram of a physical address word using a
4K page size.
[0020] FIG. 6 is a block diagram of one embodiment of the present
invention.
[0021] FIG. 7 is a block diagram of one embodiment of translation
buffer of the present invention.
[0022] FIG. 8 is a block diagram of a decoder of one embodiment of
the present invention.
[0023] FIG. 9 is a block diagram of a direct-mapped embodiment of
the present invention.
[0024] FIG. 10 is a block diagram of a set-associative embodiment
of the present invention.
[0025] FIG. 11 is a block diagram of one embodiment of a method of
translating virtual addresses of varying page sizes to physical
addresses.
[0026] FIG. 12 is a block diagram of another embodiment of a method
of translating virtual addresses of varying page sizes to physical
addresses.
[0027] FIG. 13 is a block diagram of one embodiment of a method of
generating a wordline selection in translating virtual addresses of
varying page sizes to physical addresses.
[0028] FIG. 14 is a block diagram of one embodiment of a method of
decoding a variable page address with a page size in generating a
wordline selection in translating virtual addresses of varying page
sizes to physical addresses.
[0029] FIG. 15 is a block diagram of one embodiment of a method of
generating a physical page address in translating virtual addresses
of varying page sizes to physical addresses.
[0030] FIG. 16 is a block diagram of one embodiment of a method of
generating an indication of a match in translating virtual
addresses of varying page sizes to physical addresses.
DETAILED DESCRIPTION OF THE INVENTION
[0031] In the following detailed description of the preferred
embodiments, reference is made to the accompanying drawings which
form a part hereof, and in which is shown by way of illustration
specific preferred embodiments in which the inventions may be
practiced. These embodiments are described in sufficient detail to
enable those skilled in the art to practice the invention, and it
is to be understood that other embodiments may be utilized and that
logical, mechanical and electrical changes may be made without
departing from the spirit and scope of the present invention. The
following detailed description is, therefore, not to be taken in a
limiting sense, and the scope of the present invention is defined
only by the claims.
[0032] The present invention describes a translation lookaside
buffer (TLB), or similar cache, with the ability to translate
addresses according to pages of varying sizes, for computer systems
having virtual memory addressing. The invention is not particularly
limited to a given computer system. Both scalar and vector computer
systems, as well-known within the art, can be used in the
invention.
[0033] Referring to FIG. 1, a block diagram of a computer system
100 of an embodiment of the invention is shown. There may be more
than one processor 110, as commonly found in parallel
architectures, for example. The computer system 100 employs virtual
memory addressing so that it has more addressable memory than the
actual physical memory installed. Processor 110 must therefore
translate a given virtual memory address 120 to a physical memory
address 125 within data space 130 that resides in physical memory
that temporarily corresponds to the virtual address 120. As known
within the art, in a virtual memory addressing architecture, data
may be stored on a storage device such as a hard disk drive (HDD)
140, and loaded into data space 130 located in physical memory as
needed.
[0034] Virtual to physical address translation is accomplished via
translation lookaside buffer (TLB) 150. TLB 150 is a cache located
preferably near, or in, processor 110 (in order to enhance access
speed) which holds translation table entries recently used by the
processor. The translation table entries map virtual memory pages
to physical memory pages. A memory page is defined herein as a
predetermined amount of contiguous memory space, therefore a given
memory address refers to a location within a particular memory
page. The translation table entries permit conversion of virtual
addresses such as virtual address 120 to physical addresses within
data space 130 that is located within physical memory. That is, a
virtual address 120 corresponding to a location within a virtual
page mapped to a physical page is convertible to a physical address
125 corresponding to a location within that physical page. The
invention can include other types of caches than TLB 150. For
purposes of this application, the term TLB is inclusive of all such
caches.
[0035] When the translation table entry required to translate
virtual address 120 is within TLB 150, execution by processor 110
of a computation utilizing address 120 proceeds very quickly. The
physical address 125 within data space 130 located within physical
memory corresponding to virtual address 120 is obtained via TLB
150, and the physical address within data space 130 located within
physical memory is accessed by processor 110.
[0036] However, when the translation table entry required to
translate the desired virtual address 120 is not found within TLB
150, execution by processor 110 of a computation utilizing the
address 120 slows considerably. TLB miss 160 is generated, and may
be used to cause an interrupt to the OS, or may be used to cause a
hardware page table search.
[0037] In the event that a hardware page table search is initiated,
hardware will search for the translation corresponding to the
virtual address 120 which missed the TLB 150, in the page table
170. If it finds the missing translation, it will install the
translation in the TLB 150, and the program will be resumed. If the
hardware fails to find a translation corresponding to the virtual
address 120 which missed the TLB 150 in the page table 170, then an
interrupt to the OS will be generated.
[0038] If an interrupt was sent to the OS either due to a TLB miss
or a failed hardware page table search, then the OS will be
required to provide the missing TLB entry. This may entail the OS
searching the page table 170 (if there was no hardware search and
the entry was contained within the page table 170), creating a new
entry in the page table 170 (if the page table 170 did not contain
the missing entry), and/or installing a new entry in the TLB (a new
TLB entry may be created/installed by the OS which is not placed in
the page table 170), before the program can resume.
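The miss-handling sequence in the preceding paragraphs can be summarized as a small decision procedure. The sketch below models the TLB and page table as plain dictionaries, which is an assumption made purely for illustration. The names `resolve_miss` and `os_make_translation` are hypothetical, and the sketch always records an OS-created translation in the page table, although the text notes that the OS may choose not to.

```python
def resolve_miss(virtual_page, tlb, page_table, os_make_translation,
                 hardware_search=True):
    """Install the missing translation in the TLB and report which path supplied it."""
    # Hardware page table search, when the system provides one.
    if hardware_search and virtual_page in page_table:
        tlb[virtual_page] = page_table[virtual_page]
        return "hardware"
    # Interrupt to the OS: no hardware search exists, or the search failed.
    if virtual_page in page_table:
        translation = page_table[virtual_page]
    else:
        translation = os_make_translation(virtual_page)
        # Simplification: always record the new entry in the page table.
        page_table[virtual_page] = translation
    tlb[virtual_page] = translation
    return "os"
```

In every path the TLB holds the translation on return, which corresponds to the point at which the program can resume.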
[0039] Due to the size of the page table 170, it may be
desirable/necessary to maintain only a subset of the total number
of entries within the physical memory, with the remainder stored
elsewhere (like in disk storage 140). Similarly, it may be
desirable/necessary to maintain only a subset of the physical
memory space in physical memory with the remainder stored elsewhere
(like in disk storage 140). In these cases, the OS will be required
to swap data from/to the disk/physical memory on an as-needed basis
and to mark the TLB entries and page table entries which correspond
to those physical pages with respect to their "presence" or
"non-presence" in physical memory.
[0040] As has been described, one computer system used in the
invention includes both a TLB and a page table. However, the
invention is not so limited. For example, a computer system may
only have a TLB, and no page table. The generation of a TLB miss
therefore always requires the operating system to provide
translations. Those of ordinary skill within the art will
appreciate, however, that embodiments of the invention, as will be
described, are most advantageous when all accessed pages (and their
corresponding page table entries) are present in physical
memory.
[0041] FIG. 2 is a block diagram of a virtual address word 200
using a 4K page size. The 64-bit word 200 contains a page offset
address 220 that is twelve bits in length to represent a 4K page.
The remaining portion of virtual address word 200 is the virtual
page address 240 that is 52 bits in length representing
2^52 virtual pages. In virtual to physical
address translation, the virtual page address 240 is translated to
a physical page address, and the page offset address 220 is
unchanged.
[0042] FIG. 3 is a block diagram of a virtual address word 300
using a 256M page size. The 64-bit word 300 contains a page offset
320 that is twenty-eight bits in length to represent a 256M page.
The remaining portion of virtual address word 300 is the virtual
page address 340 that is 36 bits in length
representing 2^36 virtual pages.
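The two fixed-page-size layouts of FIG. 2 and FIG. 3 differ only in where the address word is split. The following sketch shows the split for an arbitrary offset width. The function name `split_virtual_address` is hypothetical.

```python
def split_virtual_address(virtual_address: int, offset_bits: int) -> tuple[int, int]:
    """Split an address into (virtual page address, page offset)."""
    if offset_bits < 0:
        raise ValueError("offset_bits must be non-negative")
    # The low `offset_bits` bits locate the byte within the page.
    offset = virtual_address & ((1 << offset_bits) - 1)
    # The remaining upper bits identify the page.
    virtual_page = virtual_address >> offset_bits
    return virtual_page, offset
```

Calling it with `offset_bits=12` corresponds to the 4K layout and `offset_bits=28` to the 256M layout; the same address yields a different page number and offset in each case.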
[0043] FIG. 4 is a block diagram of a virtual address word 400
using a variable page size ranging from a 4K page size to a 256M
page size. The 64-bit word 400 contains a page offset address 420,
a variable page address 440, and a virtual fixed page address
430.
[0044] The page offset 420 is the portion of the virtual address
word 400 that describes the offset of the address within a page.
The size of the page offset 420 is the size of smallest page size
implemented. For example, between FIG. 2 and FIG. 3, the smallest
page size implemented is the 4K page in FIG. 2 in which the page
offset address 220 is described by bits 0 through 11. Therefore,
the page offset 420 is bits 0 through 11.
[0045] The virtual fixed page address 430 is a set of bits in the
upper portion of the virtual address that is invariably used to
describe the page address. This is the portion of the virtual
address 400 that will never be used to describe the offset, even
when the page size of the virtual address is the maximum size
implemented. For example, between FIG. 2 and FIG. 3, the maximum
page size implemented is a 256M page in FIG. 3 in which the virtual
page address 340 is described by bits 28 through 63. Therefore, the
virtual fixed page address 430 is bits 28 through 63.
[0046] The variable page address 440 is a set of bits that
describes the portion of the virtual address 400 that may be used
as the lower portion of the virtual page address, the page offset,
or a combination of both, depending on the page size of the virtual
address. In other words, the variable page address 440 describes
the portion of the virtual address 400 that spans the range from
the largest page size to the smallest page size implemented. More
specifically, when the virtual address 400 translates to a physical
address using the smallest page size that is implemented, all of
the bits of the variable page address 440, in conjunction with the
virtual fixed page address 430, describe the virtual page address.
When the virtual address 400 translates to a physical address using
the largest page size that is implemented, all of the bits of the
variable page address 440, in conjunction with the page offset
address 420, describe the offset within the page. For example,
between FIG. 2 and FIG. 3, the smallest page size implemented is
the 4K page in FIG. 2, in which the page offset address 220 is
described by bits 0 through 11 and the maximum page size
implemented is the 256M page in FIG. 3, in which the virtual page
address 340 is described by bits 28 through 63. Therefore, for the
case in which the smallest page size supported was 4K and the
largest page size supported was 256M, the variable page address 440
would be the portion of the virtual address 400 that describes
either page address or page offset depending upon the page size of
the virtual address being translated, or bits 12 through 27.
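The three-field view of the address word can be sketched as below, using the bit ranges given in the text (offset in bits 0 through 11, variable page address in bits 12 through 27, virtual fixed page address in bits 28 through 63). The constant and function names are hypothetical, and the second helper simply counts how many variable bits act as offset for a given page size.

```python
OFFSET_BITS = 12     # bits 0 through 11
VARIABLE_BITS = 16   # bits 12 through 27


def split_fields(virtual_address: int) -> tuple[int, int, int]:
    """Return (virtual fixed page address, variable page address, page offset)."""
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)
    variable = (virtual_address >> OFFSET_BITS) & ((1 << VARIABLE_BITS) - 1)
    fixed = virtual_address >> (OFFSET_BITS + VARIABLE_BITS)
    return fixed, variable, offset


def variable_bits_used_as_offset(page_bytes: int) -> int:
    """Count the low variable bits that belong to the offset for this page size."""
    if page_bytes <= 0 or page_bytes & (page_bytes - 1):
        raise ValueError("page size must be a positive power of two")
    extra = (page_bytes.bit_length() - 1) - OFFSET_BITS
    if not 0 <= extra <= VARIABLE_BITS:
        raise ValueError("page size outside the supported range")
    return extra
```

At the smallest page size no variable bits are used as offset, and at the largest page size all sixteen are, matching the two extremes described in the paragraph.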
[0047] FIG. 5 is a block diagram of a 44-bit physical address word
500 using a 4K page size. The 44-bit word contains a page offset 520 that
is twelve bits in length to represent a 4K page. The remaining
portion of physical address word 500 is the
physical page address 540 that is 32 bits in length representing
2^32 physical pages. In virtual to physical
address translation, the virtual page address is translated to
physical page address 540, and the physical page offset address 520
is derived unchanged from the virtual page offset address.
[0048] Conventionally, a TLB 150 in FIG. 1 in computer system 100
will use only one page size, such as a 4K page size as in FIG. 2 or
a 256M page size as in FIG. 3, to translate a virtual address word
200 as in FIG. 2 or virtual address word 300 as in FIG. 3 into a
physical address word 500 as in FIG. 5; otherwise, the system will
require a TLB for each page size supported. However, the inventive
system overcomes this problem by enabling a single TLB to implement
two or more page sizes.
[0049] FIG. 6 is a block diagram of one embodiment of the present
invention. System 600 includes a TLB 610 that accepts an input
virtual address 620 and an input indication of a page size bias
630. The TLB translates the input virtual address 620 into an
output physical address 640 if the input virtual address 620
matches an entry in the TLB 610; otherwise, an indication of a TLB
miss 650 is transmitted.
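The external behavior of TLB 610 (a physical address on a hit, a miss indication otherwise) can be modeled in software as shown below. This sketch deliberately replaces the indexed lookup driven by the page size bias with a simple linear search over entries that each carry their own page mask, so it illustrates only the input/output contract. The names `Translation` and `translate`, the 64-bit width, and the use of `None` as the miss indication are assumptions.

```python
from typing import NamedTuple, Optional

ADDRESS_MASK = (1 << 64) - 1  # 64-bit address width assumed for illustration


class Translation(NamedTuple):
    virtual_page_address: int   # full-width address of the virtual page
    physical_page_address: int  # full-width address of the physical page
    page_mask: int              # 1s over page-address bits, 0s over offset bits


def translate(entries: list[Translation], virtual_address: int) -> Optional[int]:
    """Return the physical address on a hit, or None to indicate a TLB miss."""
    for entry in entries:
        # Compare only the bits that identify the page for this entry's size.
        if (virtual_address & entry.page_mask) == (
                entry.virtual_page_address & entry.page_mask):
            offset = virtual_address & ~entry.page_mask & ADDRESS_MASK
            return (entry.physical_page_address & entry.page_mask) | offset
    return None
```

Entries of different page sizes can coexist in the same list because each comparison uses the mask stored with the entry.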
[0050] TLB 610 is described in more detail with
reference to FIGS. 7-10.
[0051] FIG. 7 is a block diagram of one embodiment of the
translation buffer 700 of the present invention. The decoder 710
receives the page size bias 720, which corresponds to TLB 610 that
accepts an input indication of a page size bias 630. The page size
bias 720 is a set of bits that is as wide as minimally necessary to
describe the range from the smallest page size to the largest page
size that the translation buffer will support. In one embodiment,
the page size bias will be 6 bits wide in order to describe seven
page sizes ranging from 4K as in virtual address word 200 in FIG. 2
to 256K as in virtual address word 300 in FIG. 3. In one embodiment
of the page size bias 720, a programmable register is implemented
to select the value of the bias. In another embodiment of the page
size bias 720, a set of programmable registers is implemented to
select the value of the bias based upon the current privilege level
(CPL) of the program.
[0052] The decoder 710 also receives the variable page address 730,
which corresponds to TLB 610 that accepts an input virtual address
620 (of which the variable page address would be a subset). The
variable page address 730 is described in detail in FIG. 4 as the
variable page address 440. The decoder 710 is discussed in detail
below in FIG. 8. The decoder 710 generates and then outputs a
wordline entry select (not labeled) that is input to the random
access memory (RAM) array 740 and the content addressable memory
(CAM) array 750.
[0053] The RAM array 740 and the CAM array 750 also both receive
the variable page address 730 in addition to the wordline entry
select from the decoder 710.
[0054] The RAM array 740 is a direct-mapped array that utilizes the
wordline entry selects from the decoder 710 to select an entry that
describes a physical page which may correspond to the virtual page
address. Each entry in the RAM array 740 contains a virtual address
tag 742, a page mask 744, and a physical page address 746.
[0055] The virtual address tag 742 selected by the entry selects,
is output on signals 770 and is then compared with the virtual
fixed page address 790 to provide a partial match indication for
the entry selected. The page mask 744, the physical page address
746, and the variable page address 730 are used to generate the
output physical page address 760 for the entry selected.
[0056] The CAM array 750 also utilizes the wordline entry selects
from the decoder 710 to select the "match" output from the CAM
array entry which corresponds to the RAM array entry selected.
Thus, the CAM array 750 is used to determine, in part, if the entry
selected in the RAM array 740 matches the virtual page. All of the
virtual page address that is required to describe the smallest
implemented page size is described between the RAM and the CAM
arrays.
[0057] The virtual address tag 742 describes the virtual fixed page
address 430 of FIG. 4, which is a set of bits in the upper portion
of the virtual address that is invariably used to describe the page
address. This is the portion of the virtual address 400 that will
never be used to describe the offset, even when the page size of
the virtual address is the maximum size implemented. For example,
between FIG. 2 and FIG. 3, the maximum page size implemented is a
256M page in FIG. 3 in which the virtual page address 340 is
described by bits 28 through 63. Therefore, if the maximum page
size implemented is 256M, the virtual fixed page address 430 is
bits 28 through 63 of the virtual address. The page mask 744
describes how the bits in the variable page address 730 and the
physical page address 746 will be used to generate the output
physical page address 760. In one embodiment, the width of the page
mask 744 will be equally as wide as the width of the variable page
address 730, and each bit in the page mask 744 will identify a
corresponding bit in the variable page address 730, that will be
used as part of the output physical page address 760 instead of a
bit from the physical page address 746 selected. More specifically,
in an embodiment in which the page size of the virtual address
described by the entry in the RAM array 740, is 4K, as in FIG. 2,
and the minimum page size supported is 4K, then each bit of page
mask 744 will be set to "0", indicating that all of the output
physical page address 760, would come from the physical page
address 746 selected. In an embodiment in which the page size of
the virtual address described by the entry in the RAM array 740, is
256M, as in FIG. 3, and the minimum page size supported is 4K as in
FIG. 4, then bits of the page mask 744 corresponding to virtual
address bits 12-27 will be set to "1", indicating that bits 12-27
of the output physical page address 760, would come from the
variable page address 730, and the remainder from physical page
address 746 selected.
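As a rough illustration of the page mask described above, the following Python sketch builds a mask for a given page size, assuming a 4K minimum and a 256M maximum page size. The function name `page_mask` and the two constants are hypothetical and do not appear in the embodiments; the polarity follows the embodiment in which a "1" marks a bit taken from the variable page address.

```python
MIN_PAGE_SHIFT = 12   # assumed 4K minimum page size (2**12 bytes)
MAX_PAGE_SHIFT = 28   # assumed 256M maximum page size (2**28 bytes)

def page_mask(page_size):
    """Return a page mask aligned to virtual address bit positions.

    A 1 bit marks a variable page address bit that lies inside the page,
    so that bit of the output physical page address is taken from the
    variable page address rather than from the stored physical page address.
    """
    shift = page_size.bit_length() - 1
    if page_size != 1 << shift:
        raise ValueError("page size must be a power of two")
    if not MIN_PAGE_SHIFT <= shift <= MAX_PAGE_SHIFT:
        raise ValueError("unsupported page size")
    # Set bits MIN_PAGE_SHIFT .. shift-1; no bits are set for a 4K page.
    return (1 << shift) - (1 << MIN_PAGE_SHIFT)
```

Under these assumptions a 4K entry yields an all-zero mask, and a 256M entry yields a mask with the sixteen bits 12-27 set.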
[0058] The output physical page address 760 is concatenated with
the virtual page offset as described in FIG. 4, to create the
complete physical address.
[0059] In another embodiment in which not all possible page sizes
between the smallest page size implemented and the largest page
size implemented are supported, the page mask bits 744 may be
reduced and have a many-to-1 correspondence with respect to the
variable page address 730, and the physical page address 746. For
example, in an embodiment in which the only page sizes of the
virtual address supported by the RAM array 740, are 4K and 256M,
then a single page mask bit 744 corresponding to virtual address
bits 12-27 may be used to indicate whether bits 12-27 of the output
physical page address 760, would come from the variable page
address 730 or the physical page address 746 selected.
[0060] In still another embodiment, the page mask bits have an
inverted polarity such that a "0" indicates output physical page
address 760 bits coming from the variable page address 730 and a
"1" indicating output physical page address 760 bits coming from
the physical page address 746 selected.
[0061] Each entry in the CAM array 750 includes a virtual address
tag 752, a page mask 754, and an indicator of validity of the entry
756. The page mask 754 is typically identical in structure and
content to the page mask 744 of the RAM array. The purpose of the
page mask 754 is to identify the bits in the virtual address tag
752 that will be masked during comparison to the variable page
address 730. The virtual address tag 752 does not contain the same
information as virtual address tag 742. Instead, virtual address
tag 752 describes the variable page address 440 of FIG. 4. If the
virtual address tag 752 selected via the decoder 710 entry selects
masked with the page mask 754 selected via the decoder 710 entry
selects compares equal to the variable page address 730 masked with
the page mask 754 selected via the decoder 710 entry selects, and
the valid bit 756 selected via the decoder 710 entry selects is
true, then a match signal 758 is set to its true value, otherwise
the match line 758 is set to its false value.
[0062] The translation buffer 700, also includes a purging CAM
array 780. The purging CAM array is used to identify entries in the
CAM array 750 for purging. Each entry in the purging CAM array 780
contains a virtual address tag. This virtual address tag contains
the same information as virtual address tag 742. During a purge,
the purging CAM array receives a virtual fixed page address 790 and
a page size or range of addresses to be purged (not shown). If the
virtual fixed page address 790 masked with the page size or range
supplied, matches an entry in the purging CAM array 780 masked with
the page size or range supplied, then a wordline is generated
corresponding to each entry matched. Simultaneously, during the
purge, the CAM array 750, receives a variable page address 730, and
a page size or range of addresses to be purged (not shown). For
each CAM array 750 entry for which a corresponding wordline is
generated from the purging CAM array 780, and the variable page
address 730 masked with the page mask 754 and masked with the page
size or range supplied matches the virtual address tag 752 masked
with the page mask 754 and masked with the page size or range
supplied; the valid bit 756 of that entry will be made false.
[0063] Translation buffer 700 can be implemented as a direct-mapped
TLB that includes one translation buffer 700 as described below
in FIG. 9, or implemented as a set-associative TLB that includes a
plurality of translation buffers 700 as described below in FIG.
10.
[0064] FIG. 8 is a block diagram of a decoder 800 of one embodiment
of the present invention. The decoder 800 includes input for the
page size bias 810 and input for the variable page address 820. The
page size bias 810 corresponds to a subset of the page size bias
720 of FIG. 7 and the variable page address 820 corresponds to a
subset of the variable page address 730 of FIG. 7. The decoder 800
uses the page size bias 810 and the variable page address 820 to
generate a set of entry selects. These entry selects correspond to
the entry selects shown in FIG. 7 used to index the RAM array 740
and CAM array 750. When in operation, first, the page size bias 810
is ANDed with (used to mask) the lower bits of the variable page
address 820. Then the AND gate output is XORed with the next higher
contiguous set of bits in the variable page address 820, in order
to hash the masked variable page address 820. Lastly, the hash
output is decoded, resulting in the generation of the entry
selects.
[0065] In one embodiment shown, the page size bias 810 contains six
bits supporting a range of page size biases from 4K to 256K. Page
size bias values of 111111, 111110, 111100, 111000, 110000, 100000,
and 000000, represent page size biases of 4K, 8K, 16K, 32K, 64K,
128K, and 256K respectively, where for a value of 111110, the AND
gate 845 receives a page size bias input of `0`. AND gates 840,
841, 842, 843, 844, and 845 mask the variable page address 820 bits
12-17 with the page size bias 810. The output of the AND gates 840,
841, 842, 843, 844, and 845, are exclusive-ORed by XOR gates 850,
851, 852, 853, 854, and 855 with the next six bits, 18-23, of the
variable page address 820 and then decoded via decoder 860, to
provide the entry selects into the RAM array 740 and CAM array 750
of FIG. 7.
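The mask-then-hash behavior of the decoder can be sketched in Python as follows. This is only an illustrative model of the six-bit embodiment, not a description of the circuit: the function names `entry_select` and `decode` are hypothetical, and the sketch assumes that the least significant bit of the page size bias is paired with virtual address bit 12.

```python
def entry_select(variable_page_address, page_size_bias):
    """Hash a virtual address into a 6-bit entry index (0..63).

    variable_page_address is given at its native bit positions, so
    virtual address bit 12 is bit 12 of this integer.
    """
    low = (variable_page_address >> 12) & 0x3F    # address bits 12-17
    high = (variable_page_address >> 18) & 0x3F   # address bits 18-23
    masked = low & (page_size_bias & 0x3F)        # role of AND gates 840-845
    return masked ^ high                          # role of XOR gates 850-855

def decode(index, entries=64):
    """Model decoder 860: a one-hot list of wordline entry selects."""
    return [1 if i == index else 0 for i in range(entries)]
```

With a bias of 111110 (8K), two addresses that differ only in bit 12 hash to the same entry, which is the intended effect of masking the bits that fall inside the page.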
[0066] In another embodiment, the page size bias supports a range
of page size biases other than 4K to 256K, where the number of page
size bias bits is one less than the number of page size biases
supported and the number of page size bias bits does not exceed the
number of decoder inputs.
[0067] In another embodiment, the page size bias bits do not
correspond to consecutive power of 2 page sizes.
[0068] In another embodiment, the page size bias values are derived
from a set of encoded bits.
[0069] In another embodiment, there are fewer page size bias bits
than inputs to the decoder 860, and only those variable page
address bits for which there is a 1:1 correspondence with page size
bias bits are masked with AND gates.
[0070] In another embodiment, the variable page address 820 bits
input to the masking AND gates are a consecutive series beginning
with the least significant bit of the variable page address and
providing a 1:1 correspondence of variable page address bit inputs
to AND gates.
[0071] In conjunction with the TLB look-up (translation of a
virtual page address to a physical page address), a cache tag array
930, will generate one or more physical address tags when given a
cache index address from address lines 920. One physical address
tag will be generated for each way of the cache, as is well known
by those skilled in the art. In a four-way embodiment of a cache,
cache tag 930 will generate four physical address tags 931, 932,
933, and 934, when accessed. Each physical address tag generated
from cache tag 930 is compared to the physical address generated by
each of the translation buffers of the TLB and masked with the TLB
match signals to determine which way of the cache was hit. For
example, in one embodiment in which the cache is a 16K four-way
cache, the cache tag array 930, will output four physical address
tags 931, 932, 933, and 934 corresponding to a look-up index.
Comparators 941, 942, 943, and 944, will compare each way's
physical address tag to the physical address output by the TLB 940
for equality. The outputs of the comparators are then ANDed with
the output of the virtual fixed page address comparator 914 via AND
gates 951, 952, 953, and 954, and ANDed with match line 913 via AND
gates 961, 962, 963, and 964, so that a way hit will not be
generated in the case that the physical address output of the TLB
940 is equal to one of the four physical address tags, but the
physical address output of the TLB is not a correct translation of
the virtual address input to the TLB. Note, that the way hit
signals must also be gated with a tag valid signal (not shown)
indicating whether each entry in the cache tag array 930 is valid.
The use-bypass signal 935 is used to block the generation of way
hit signals. Lastly, the way hit signals are ORed together using OR
gate 970, to generate a cache hit signal 990.
[0072] FIG. 9 is a block diagram of a direct-mapped embodiment of
the present invention. The data cache unit (DCU) 900 implements the
invention as a direct-mapped TLB 910, in comparison to FIG. 10
which shows the invention implemented as a set-associative TLB. TLB
910 includes one translation buffer 915 as in translation buffer
700 in FIG. 7. The TLB 910 is a direct-mapped TLB as a result of
the singular use of a translation buffer 915.
[0073] The DCU 900 uses a TLB to identify a physical page address
940. DCU 900 verifies that the translation buffer 915 of the TLB
910 has output the correct physical page address translation of the
virtual page address by verifying that the CAM of the translation
buffer indicates a match 913 and by verifying that the virtual
fixed page address identified by the RAM array matches the virtual
fixed page address. As shown, the TLB can operate in conjunction
with a cache 930 to determine if the cache contains data
corresponding to the physical address generated, although the TLB
can operate without the cache.
[0074] DCU 900 verifies that the virtual fixed page address
identified by the RAM array 917 matches the virtual fixed page
address 916 using comparator 914. More specifically, TLB 910
receives the variable page address 911 and 912 from an address bus
920, similar to TLB 610 in FIG. 6, that receives virtual address
620 in FIG. 6. Within TLB 910, variable page address 911 is
transmitted to the decoder 710 in FIG. 7 and variable page address
912 is transmitted to the RAM array 740 and CAM array 750 in FIG.
7. TLB 910 outputs the match from the CAM array 750 to match line
913. TLB 910 also outputs the virtual fixed page address 770 in
FIG. 7 from the RAM array 740 as the virtual fixed page address 917
that is compared to the virtual fixed page address 916, by
comparator 914. The comparator sets its output line to high or
true if the virtual fixed page address 916 from the address bus 920
is equal to the virtual fixed page address 917 output from the
TLB.
[0075] The translation buffer 915 has identified a correct virtual
to physical page address translation when the match
line 913 indicates a match and the comparison of the virtual fixed
page address performed by comparator 914 indicates equality.
Therefore, AND gate 950 is used to determine if match line 913 and
the output of comparator 914 are both set high or to true (TLB hit
is true). The output from AND gate 950 is transmitted to the
processor 110 in FIG. 1 on miss line 160 in FIG. 1.
[0076] The final outputs of DCU 900 consist of the Physical Page
Address 940, and the way hit signals (used by the cache data array
to select outputs not shown). Additionally, the DCU 900 outputs a
TLB hit (invert to get TLB miss) signal 980, and a cache hit
(invert to get cache miss) signal 990 whose use is described in
FIG. 1.
[0077] FIG. 10 is a block diagram of a set-associative embodiment
of the present invention. The data cache unit (DCU) 1000 implements
the invention as a set-associative TLB 1010, in comparison to FIG.
9 which shows the invention implemented as a direct-mapped TLB. TLB
1010 includes two translation buffers 1011 and 1012 as in
translation buffer 700 in FIG. 7, operably coupled in parallel. The
TLB 1010 is a set-associative TLB as a result of the use of a
plurality of translation buffers with orthogonal data sets. The
invention is not limited to a two-way set-associative TLB; the
invention can also be implemented as an n-way set-associative TLB as
is well-known to those skilled in the art, or a direct-mapped TLB
as in FIG. 9.
[0078] The DCU 1000 uses TLB 1010 to identify the physical page
address 1076 which corresponds to a virtual address supplied. DCU
1000 verifies that one of the translation buffers 1011 and 1012 of
TLB 1010 has output the correct physical page address translation
of the virtual page address by verifying that the CAM arrays of the
translation buffers have indicated a match and the corresponding
virtual fixed page address supplied by those translation buffers'
RAM arrays matches the virtual fixed page address supplied 1023. Note
that for a given virtual address, a maximum of one of the
translation buffers will contain a matching entry as is the case
for a typical set-associative device. In addition, for the DCU
shown, the TLB is being used in conjunction with cache tag array
1030 to determine if the cache has been hit and if so, which way
was hit by the address supplied 1020.
[0079] More specifically, TLB 1010 receives the variable page
addresses 1013 and 1014, and 1015 and 1016 from an address bus
1020, similar to TLB 610 in FIG. 6, that receives virtual address
620 in FIG. 6. Within TLB 1010, variable page addresses 1013 and
1016 are transmitted to the decoder, as per decoder 710 in FIG. 7,
the RAM array as per RAM array 740 in FIG. 7, and the CAM array as
per CAM array 750 in FIG. 7 in each of the translation buffers 1011
and 1012. TLB 1010 outputs the matches from each of the CAM arrays
as per match line 758 in FIG. 7 from translation buffers 1011 and
1012 to match lines 1017 and 1018. TLB 1010 also outputs the
virtual fixed page addresses 1021 and 1022 from the RAM arrays as
per the virtual address tag 770 in FIG. 7 that is compared to the
virtual fixed page address 1023, as in the virtual fixed page
address 430 in FIG. 4, by comparators 1024 and 1025. Comparators
1024 and 1025 set their output lines to high or true if the virtual
fixed page address 1023 from the address bus 1020 is equal to the
virtual fixed page addresses 1021 and 1022 output from the
translation buffers 1011 and 1012 in TLB 1010 respectively.
[0080] The translation buffers 1011 and 1012 have identified a
correct physical page address when the match lines 1017 and 1018
indicate a match and the corresponding comparison of the virtual
fixed page address performed by comparators 1024 and 1025 indicate
equality. Therefore, AND gates 1091 and 1092 are used to determine
if match lines 1017 and 1018 and the output of comparators 1024 and
1025 are both set high or to true. The outputs from AND gates 1091
and 1092 are OR'ed together by OR gate 1093 to determine if any of
the translation buffers 1011 and 1012 translated the virtual
address to a correct physical address, i.e. a TLB hit has occurred.
The TLB hit signal 1060 would then be inverted (to indicate TLB
miss) and sent to the processor core as for the miss signal 160
sent to processor 110 in FIG. 1. In addition to the TLB hit signal,
DCU 1000 also generates a physical address output 1076. This output
is obtained by selection of the correct physical address from
amongst the physical addresses output by each translation buffer
1040 and 1049 via multiplexer 1075.
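The hit and selection logic just described can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions: the `BufferOutput` type and the `tlb_lookup` function are hypothetical names introduced for illustration, and each translation buffer is reduced to the three outputs the paragraph refers to.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

class BufferOutput(NamedTuple):
    cam_match: bool      # match line, e.g. 1017 or 1018
    fixed_tag: int       # virtual fixed page address read from the RAM array
    physical_page: int   # physical page address output by the buffer

def tlb_lookup(outputs: Sequence[BufferOutput],
               fixed_page_in: int) -> Tuple[bool, Optional[int]]:
    """Return (TLB hit, selected physical page address or None)."""
    # Per buffer: comparator output ANDed with the match line.
    hits = [o.cam_match and o.fixed_tag == fixed_page_in for o in outputs]
    # OR of the per-buffer results gives the TLB hit signal.
    tlb_hit = any(hits)
    # Multiplexer: pick the output of the buffer that hit, if any.
    physical = next((o.physical_page for o, h in zip(outputs, hits) if h), None)
    return tlb_hit, physical
```

Because at most one translation buffer holds a matching entry for a given virtual address, taking the first hit is equivalent to the multiplexer selection in this model.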
[0081] In conjunction with the TLB look-up (translation of a
virtual page address to a physical page address), a cache tag array
1030, will generate one or more physical address tags when given a
cache index address from address lines 1020. One physical address
tag will be generated for each way of the cache, as is well known
by those skilled in the art. In a four-way embodiment of a cache,
cache tag 1030 will generate four physical address tags 1031, 1032,
1033, and 1034, when accessed. Each physical address tag generated
from cache tag 1030 is compared to the physical address generated
by each of the translation buffers of the TLB and masked with the
TLB match signals to determine which way of the cache was hit. For
example, in one embodiment in which the cache is a 16K four-way
cache, the cache tag array 1030, will output four physical address
tags 1031, 1032, 1033, and 1034 corresponding to a look-up index.
Comparators 1041, 1042, 1043, and 1044, will compare each way's
physical address tag to the physical address output by translation
buffer 1011 for equality. Similarly, comparators 1045, 1046, 1047,
and 1048, will compare each way's physical address tag to the
physical address output by translation buffer 1012 for equality.
The output of the comparators 1041, 1042, 1043, and 1044 are then
ANDed with the output of the virtual fixed page address comparator
1024 via AND gates 1051, 1052, 1053, and 1054, and ANDed with match
line 1017 via AND gates 1061, 1062, 1063, and 1064, so that a way
hit will not be generated in the case that the physical address
output of the translation buffer 1011 is equal to one of the four
physical address tags, but the physical address output of the
buffer is not a correct translation of the virtual address input to
the buffer. Simultaneously, the same function is applied with
respect to the second set (of associativity) of the TLB 1010. The
output of the comparators 1045, 1046, 1047, and 1048 are then ANDed
with the output of the virtual fixed page address comparator 1025
via AND gates 1055, 1056, 1057, and 1058, and ANDed with match line
1018 via AND gates 1065, 1066, 1067, and 1068, so that a way hit
will not be generated in the case that the physical address output
of the translation buffer 1012 is equal to one of the four physical
address tags, but the physical address output of the buffer is not
a correct translation of the virtual address input to the buffer.
If the physical address supplied by either translation buffer
matches the physical address tag of one of the cache sets or ways,
and the physical address supplied is a correct translation of the
virtual address supplied 1023, then a way hit is generated for that
way of the cache. This function is accomplished by OR'ing the
outputs of AND gates 1061, 1062, 1063, and 1064, with the outputs
of AND gates 1065, 1066, 1067, and 1068 via OR gates 1071, 1072,
1073, and 1074. Note that the way hit signals must also be gated
with a tag valid signal (not shown) indicating whether each entry
in the cache tag array 1030 is valid. The use-bypass signal 1035 is
used to block the generation of way hit signals. Lastly, the way
hit signals are ORed together using OR gate 1070, to generate a
cache hit signal 1090.
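The way hit gating above can be approximated by the following Python sketch. It is an assumption-laden model rather than the circuit itself: the function name `cache_way_hits` and its argument layout are hypothetical, and the use-bypass signal is assumed to block way hits when true.

```python
def cache_way_hits(cache_tags, tag_valid, buffers, use_bypass=False):
    """Return (list of way hit signals, cache hit signal).

    cache_tags: one physical address tag per cache way.
    tag_valid:  one valid flag per cache way.
    buffers:    one (physical_address, fixed_compare_equal, cam_match)
                tuple per translation buffer.
    """
    hits = []
    for tag, valid in zip(cache_tags, tag_valid):
        # Tag comparator ANDed with the fixed page comparator and the
        # match line, then ORed across the translation buffers.
        hit = any(phys == tag and fixed_equal and cam_match
                  for phys, fixed_equal, cam_match in buffers)
        # Gate with the tag valid signal and the use-bypass signal.
        hits.append(hit and valid and not use_bypass)
    # OR of all way hits gives the cache hit signal.
    return hits, any(hits)
```

In this model a translation buffer whose physical address happens to equal a cache tag, but whose CAM did not match, contributes no way hit, which is the false-hit case the gating is meant to exclude.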
[0082] The final outputs of DCU 1000 consist of the Physical Page
Address 1076, and the way hit signals (used by the cache data array
to select outputs not shown). Additionally, the DCU 1000 outputs a
TLB hit (invert to get TLB miss) signal 1060, and a cache hit
(invert to get cache miss) signal 1090 whose use is described in
FIG. 1.
[0083] FIG. 11 is a block diagram of one embodiment of a method
1100 of translating virtual addresses of varying page sizes to
physical addresses. Method 1100 begins and thereafter generates an
entry select 1110. The entry select is a pointer into two arrays
that identifies a set of corresponding entries (one entry in each
array), where the first array such as 740 in FIG. 7, maps the
virtual page address such as 430 and 440 in FIG. 4, to a physical
page address 1120, such as 540 in FIG. 5, and generates a virtual
address tag such as 770 in FIG. 7, and the second array, such as
750 in FIG. 7, generates a match indication such as match signal
758 in FIG. 7. The second array generates its match signal if the
entry in the second array selected by the entry select is valid (as
indicated by a valid bit such as 756 in FIG. 7), and the variable
page address such as 730 in FIG. 7 (after being masked by the page
size mask, such as 754 in FIG. 7, selected by the entry select)
matches the virtual address tag such as 752 in FIG. 7 selected by
the entry select (after being masked by the page size mask, such as
754 in FIG. 7, selected by the entry select). A match 1130 is
indicated when both the second array indicates a match, and the
virtual address tag from the first array is determined to be equal
to the virtual fixed page address input such as 790 in FIG. 7. If a
match is indicated, then the physical address is generated 1140 by
concatenating the physical page address such as 540 in FIG. 5 with
the offset from the virtual address, such as 520 in FIG. 5 and the
physical address is used to access physical memory, such as 130 in
FIG. 1, thereafter method 1100 ends.
[0084] FIG. 12 is a block diagram of another embodiment of a method
1200 of translating virtual addresses of varying page sizes to
physical addresses. Method 1200 performs the same function as
method 1100, except that the generation of a physical page address
1220 and the generation of an indication of a match occur in
parallel 1230. The method begins, and thereafter, an entry select
is generated 1210, as in action 1110 in FIG. 11. Thereafter, a
physical page address is generated 1220, and the physical address
is generated by concatenating the physical page address with the
offset from the virtual address as in actions 1120 and 1140 in FIG.
11. In parallel to the physical page address and physical address
generation, an indication of a match is generated as in action 1130
in FIG. 11. The physical address is used to access physical memory
such as 130 in FIG. 1, but if a match is not indicated, then the
physical memory access would have to be blocked or aborted
depending upon the type of memory being accessed and the type of
access being performed (a load from speculatable memory could be
started and aborted, a store or access to non-speculatable memory
would have to be blocked/not started). Thereafter method 1200
ends.
[0085] FIG. 13 is a block diagram of one embodiment of a method
1300 of generating an entry select as in action 1110 in FIG. 11, in
the translation of virtual addresses of varying page sizes to
physical addresses. Method 1300 begins and an indication of a page
size bias associated with the virtual address is received 1310.
Thereafter, the variable portion of the virtual page address, such
as 440 in FIG. 4, is taken from the virtual address, such as 400 in
FIG. 4, and received 1320. Afterward, the variable page address and
the page size bias are decoded 1330, resulting in the generation of
an entry select that will be used to identify a physical page
address, as in action 1120 of FIG. 11 or action 1220 in FIG. 12 and
the generation of an indication of a match as in action 1130 in
FIG. 11. Thereafter, method 1300 ends. In another embodiment,
action 1310 is performed after action 1320, but before action
1330.
[0086] FIG. 14 is a block diagram of one embodiment of a method
1400 of decoding a variable page address with a page size bias, as
in action 1330 of FIG. 13, in generating an entry select in
translating virtual addresses of varying page sizes to physical
addresses as in method 1100 in FIG. 11 and method 1200 in FIG. 12.
Method 1400 begins and thereafter the variable page address that is
received as in action 1320 in FIG. 13 is masked 1410 with the
corresponding bits of the page size bias received in action 1310 in
FIG. 13. In one embodiment, masking 1410 is accomplished by AND'ing
the values. For example, if the page size bias is six bits wide,
the page size bias will be AND'ed with the lower six bits of the
variable page address. Thereafter, the result of the masking in
action 1410 is hashed with the next set of upper bits in the
variable page address immediately adjacent to the bits in the
variable page address masked in action 1410. In one embodiment, the
hashing 1420 is an XOR function in which the result is subsequently
decoded into entry selects for accessing arrays. Thereafter, the
method 1400 ends. For other embodiments of this function, please
see the discussion in conjunction with FIG. 8.
[0087] FIG. 15 is a block diagram of one embodiment of a method
1500 of generating a physical page address as in action 1120 in
FIG. 11 and action 1220 in FIG. 12. In general, a physical page
address is generated by combining a portion of the physical page
address contained within the translation buffer with a portion of
the variable page address input as indicated by the page size
contained within the translation buffer. More specifically, the
lower portion of the physical page address contained within the
translation buffer is masked off according to the page size
contained within the translation buffer to the extent that those
bits which would be considered offset within the page (as opposed
to the address of the page within memory space) are masked. Those
masked bits are then replaced with the corresponding bits of the
variable page address input to generate the physical page address
output.
[0088] The method 1500 begins and thereafter, a masked physical
page address is generated 1510 by masking a translation buffer
entry physical page address with the corresponding translation
buffer page mask, where a page mask is a decoded version of the
page size which when used as a mask will cause address bits below
the indicated page size to be masked and have no effect upon
address bits above the indicated page size. The translation buffer
physical page address and page mask are identified using an entry
select as generated in action 1110 of FIG. 11, or action 1210 of
FIG. 12, or more specifically as generated in action 1330 of FIG.
13. Thereafter, a masked variable page address is generated 1520 by
masking the input variable page address with an inverted (bit wise)
form of the identified page mask. Actions 1510 and 1520 yield a set
of complementary page address bits such that the combination of
said address bits will yield a complete physical page address.
Therefore, in action 1530, the masked physical page address yielded
by action 1510 and the masked variable page address yielded by
action 1520 are added together to yield the physical page address.
Thereafter, the method 1500 ends. In another embodiment, action
1510 is performed after action 1520, but before action 1530.
[0089] In another embodiment the masked physical page address and
masked variable page address are combined not using an add but
using an OR function instead. In another embodiment, the
translation buffer physical page address and variable page address
are not masked and combined to form the physical page address, but
instead, the page mask is used to control a multiplexer such that
the portions of the translation buffer physical page address and
the variable page address which would not have been masked as
described earlier in the description of method 1500, are directly
combined to form the physical page address (with the same result as
if the previous embodiment were used).
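The three ways of combining the addresses (add, OR, and multiplexer) can be compared with a short Python sketch. The names `keep_mask`, `combine_add`, `combine_or`, and `combine_mux` are hypothetical, a 64-bit address is assumed, and both page addresses are assumed to be held at their native bit positions with bits 0-11 clear. Here a mask bit of 1 keeps the bit from the translation buffer entry, as in the description of method 1500.

```python
ADDRESS_BITS = 64      # assumed address width
MIN_PAGE_SHIFT = 12    # assumed 4K minimum page size

def keep_mask(page_shift):
    """Mask with 1s at and above the page-size bit (bits kept from the entry)."""
    return ((1 << ADDRESS_BITS) - 1) & ~((1 << page_shift) - 1)

def combine_add(entry_ppa, variable_page_address, mask):
    masked_ppa = entry_ppa & mask                 # action 1510
    masked_vpa = variable_page_address & ~mask    # action 1520
    return masked_ppa + masked_vpa                # action 1530

def combine_or(entry_ppa, variable_page_address, mask):
    return (entry_ppa & mask) | (variable_page_address & ~mask)

def combine_mux(entry_ppa, variable_page_address, mask):
    out = 0
    for bit in range(MIN_PAGE_SHIFT, ADDRESS_BITS):
        # The mask bit steers the multiplexer for this bit position.
        source = entry_ppa if (mask >> bit) & 1 else variable_page_address
        out |= source & (1 << bit)
    return out
```

The add and the OR give the same result because the two masked values never have a 1 in the same bit position, so no carry can occur.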
[0090] FIG. 16 is a block diagram of one embodiment of a method
1600 of generating an indication of a match as in action 1130 in
FIG. 11 and action 1230 in FIG. 12 in translating virtual addresses
of varying page sizes to physical addresses. Method 1600 begins and
thereafter, the entry in a second array is accessed using the entry
select generated in action 1110 of FIG. 11 or in action 1210 of
FIG. 12, or more specifically in action 1330 of FIG. 13. The
portion of the entry that indicates validity of the entry is
checked to verify that the entry is valid 1610. If the validity
indicator indicates no validity, then an indication of no match is
output 1620, and the method ends. Otherwise, if validity of the
entry is indicated, then a masked variable page address tag is
generated by masking the variable page address tag from the same
entry with the page mask from the same entry, and a masked variable
page address is generated by masking the variable page address
input with the page mask from the same entry as the valid bit and
variable page address tag selected 1630. If the comparison of the
masked variable page address tag and the masked variable page
address 1640 indicates inequality, then an indication of no match
is output 1620, and the method ends. Otherwise, an entry in a first
array is selected using the entry select generated in action 1110
of FIG. 11 or in action 1210 of FIG. 12, or more specifically in
action 1330 of FIG. 13 to obtain a virtual fixed address tag. If a
comparison of the virtual fixed address tag and the virtual fixed
address input 1650 indicates inequality, then an indication of no
match is output 1620, and the method ends. If none of the
decisions 1610, 1640, and 1650 causes an indication of no match to be
output, then a match output will be generated 1660, and the method
ends. In other embodiments, the decisions 1610, 1640, and 1650 may
be performed in other orders or in parallel (but action 1630 must
always take place before decision 1640).
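The decision sequence of method 1600 can be sketched in Python as shown below. This is a hedged illustration only: the `CamEntry` type and the `match_indication` function are hypothetical names, and the sketch assumes a page mask polarity in which a 1 bit marks a position that is ignored during the comparison.

```python
from dataclasses import dataclass

@dataclass
class CamEntry:
    valid: bool
    variable_tag: int   # variable page address tag stored in the entry
    page_mask: int      # 1 = bit ignored in the comparison (assumed polarity)

def match_indication(cam_entry, ram_fixed_tag,
                     variable_page_address, fixed_page_address):
    """Return True for a match output (1660), False for no match (1620)."""
    if not cam_entry.valid:                               # decision 1610
        return False
    compare_bits = ~cam_entry.page_mask
    masked_tag = cam_entry.variable_tag & compare_bits    # action 1630
    masked_vpa = variable_page_address & compare_bits
    if masked_tag != masked_vpa:                          # decision 1640
        return False
    if ram_fixed_tag != fixed_page_address:               # decision 1650
        return False
    return True                                           # match output 1660
```

The three checks are independent of one another, so a hardware implementation could evaluate them in parallel and AND the results, as the other embodiments mentioned above allow.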
Conclusion
[0091] A translation buffer has been described which can translate
virtual to physical addresses of varying page sizes quickly and
with few misses. The translation buffer described uses a decoder
which generates a hashed index into an array that maps a virtual
page address to a physical page address using a page mask and
maintains corresponding virtual fixed page address tags, and uses
the same hashed index to access a second array which performs match
comparisons using a variable page address tag, a page mask, and a
valid flag. Together, the two arrays contain the entire virtual
page address between the virtual fixed page address of the first
array and the variable page address in the second, thus ensuring
that the entire virtual page address will be used in determining
whether a correct virtual to physical translation has been
performed. Furthermore, both arrays contain the page mask of the
address to enable the address to be masked and combined properly in
accordance with the page size.
[0092] Although specific embodiments have been illustrated and
described herein, it will be appreciated by those of ordinary skill
in the art that any arrangement which is calculated to achieve the
same purpose may be substituted for the specific embodiments shown.
This application is intended to cover any adaptations or variations
of the present invention. More specifically, the present invention
has been described in terms of microprocessor terminology; however,
the present invention can be embodied in software.
* * * * *