U.S. patent application number 10/102827 was filed with the patent office on 2003-07-03 for register file in the register window system and controlling method thereof.
This patent application is currently assigned to FUJITSU LIMITED. Invention is credited to Okawa, Yasukichi, Yamashita, Hideo.
Application Number | 20030126415 10/102827 |
Document ID | / |
Family ID | 19189660 |
Filed Date | 2003-07-03 |
United States Patent
Application |
20030126415 |
Kind Code |
A1 |
Okawa, Yasukichi ; et
al. |
July 3, 2003 |
Register file in the register window system and controlling method
thereof
Abstract
In the structure of register files composed of a master register
file and a working register file, when data is read, the working
register file is accessed. When data is written, both the master
register file and the working register file are accessed. The
working register file stores the data of the current window and the
data of the windows immediately preceding and following it. Thus,
even if the SAVE instruction or the RESTORE instruction is executed
successively, instructions can be processed out of order. As a
result, processing efficiency is improved.
Inventors: |
Okawa, Yasukichi; (Kawasaki,
JP) ; Yamashita, Hideo; (Kawasaki, JP) |
Correspondence
Address: |
STAAS & HALSEY LLP
700 11TH STREET, NW
SUITE 500
WASHINGTON
DC
20001
US
|
Assignee: |
FUJITSU LIMITED
Kawasaki
JP
|
Family ID: |
19189660 |
Appl. No.: |
10/102827 |
Filed: |
March 22, 2002 |
Current U.S.
Class: |
712/228 ;
712/E9.027; 712/E9.032 |
Current CPC
Class: |
G06F 9/30043 20130101;
G06F 9/30127 20130101 |
Class at
Publication: |
712/228 |
International
Class: |
G06F 009/00 |
Foreign Application Data
Date |
Code |
Application Number |
Dec 28, 2001 |
JP |
2001-400704 |
Claims
What is claimed is:
1. A register file having a master register file and a working
register file, data used for a process for an instruction being
transferred from the master register file and stored in the working
register file, data used for a process for an instruction being
read from the working register file, comprising: a current window
pointer unit pointing a current window position for accessing the
master register file; a working register window current pointer
unit pointing a current window position for accessing working
register file; and a unit transferring data from the master
register file to the working register file and updating data of the
working register file when the current window pointer is changed so
that the master register file stores data of all register windows
of an information processing apparatus and the working register
file stores data of a window pointed by the current window pointer
and data of windows followed and preceded by the window pointed by
the current window pointer.
2. The register file as set forth in claim 1 wherein the updating
unit writes data of a window preceded by or followed by the window
of the master register file pointed by the current window pointer
unit to the working register file when the current window pointer
is varied.
3. The register file as set forth in claim 1, wherein after a
window of the master register file pointed by the current window
pointer unit is switched, before the next window switching
instruction is executed, when the window is switched, data of a
window necessary for the next window switching instruction is
pre-transferred from the master register file to the working
register file.
4. The register file as set forth in claim 1, wherein the master
register file and the working register file are window registers
corresponding to an overlap window system.
5. The register file as set forth in claim 4, wherein when data of
a window is transferred from the master register file to the
working register file, data that overlap is not transferred.
6. The register file as set forth in claim 1, wherein data is
transferred from the master register file to the working register
file in two phases.
7. The register file as set forth in claim 1, wherein the master
register file is structured in a ring shape as a logical
structure.
8. The register file as set forth in claim 1, wherein when data is
written to the register file, the data is written to the master
register file and the working register file at a time, and wherein
when data is read from the register file, the data is read from
only the working register file.
9. A method for controlling a register file having a master
register file and a working register file, data used for a process
for an instruction being transferred from the master register file
and stored in the working register file, data used for a process
for an instruction being read from the working register file,
comprising: providing a current window pointer for pointing a
current window position for accessing the master register file and
a working register window current pointer for pointing a current
window position for accessing working register file; and
transferring data from the master register file to the working
register file and updating data of the working register file when
the current window pointer is changed so that the master register
file stores data of all register windows of an information
processing apparatus and the working register file stores data of a
window pointed by the current window pointer and data of windows
followed and preceded by the window pointed by the current window
pointer.
10. The method as set forth in claim 9 wherein the updating step is
performed by writing data of a window preceded by or followed by
the window of the master register file pointed by the current window
pointer to the working register file when the current window
pointer is varied.
11. The method as set forth in claim 9, wherein after a window of
the master register file pointed by the current window pointer is
switched, before the next window switching instruction is executed,
when the window is switched, data of a window necessary for the
next window switching instruction is pre-transferred from the
master register file to the working register file.
12. The method as set forth in claim 9, wherein the master register
file and the working register file are window registers
corresponding to an overlap window system.
13. The method as set forth in claim 12, wherein when data of a
window is transferred from the master register file to the working
register file, data that overlap is not transferred.
14. The method as set forth in claim 9, wherein data is transferred
from the master register file to the working register file in two
phases.
15. The method as set forth in claim 9, wherein the master register
file is structured in a ring shape as a logical structure.
16. The method as set forth in claim 9, wherein when data is
written to the register file, the data is written to the master
register file and the working register file at a time, and wherein
when data is read from the register file, the data is read from
only the working register file.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] The present invention relates to a register access
processing method for use with an information processing apparatus
having an architecture of a register window and using an
out-of-order instruction execution system, the method allowing the
order of instructions to be changed so that an instruction goes
ahead of a register window switching instruction.
[0003] 2. Description of the Related Art
[0004] Some information processing apparatuses with an architecture
using a reduced instruction set have a plurality of register sets
(hereinafter referred to as register windows). In such an
apparatus, it is not necessary to save registers to, or restore
them from, a memory stack when a subroutine is called or
returns.
[0005] The register windows are connected in a ring shape and
managed by register window numbers (hereinafter referred to as
window numbers). For example, eight register windows are assigned
window numbers 0 to 7 and used in the order of 0, 1, 2, . . . , and
7. The window number of a register window that is being used is
stored by a register (hereinafter, this register is referred to as
current window pointer (CWP)).
[0006] FIG. 1 shows the structure of a ring-shaped register
file.
[0007] Each register window is composed of, for example, 32 64-bit
registers. Among these registers, eight registers are common to all
the register windows. As shown in FIG. 1, another eight registers
are shared with the immediately preceding register window, and a
further eight registers are shared with the immediately following
register window.
[0008] Register windows that share registers in this way are
referred to as overlap register windows.
There are two types of register window switching instructions that
are a SAVE instruction and a RESTORE instruction. The SAVE
instruction increments CWP. The RESTORE instruction decrements CWP.
Thus, in the following description, the register window switching
instructions are referred to as SAVE instruction and RESTORE
instruction.
[0009] FIG. 1 shows the case in which the number n of windows is
eight, giving a total of 136 registers: eight "local" registers ×
eight windows = 64 registers, eight overlapped in/out registers ×
eight windows = 64 registers, and eight global registers (not
shown). Data must be writable to and readable from all of these
registers.
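The register count above can be checked with a short sketch. The Python below is illustrative only: the register numbering (r0 to r7 global, r8 to r15 "out", r16 to r23 "local", r24 to r31 "in") is an assumed SPARC-style convention, and the function names and the slot layout are hypothetical rather than taken from the figures.

```python
# Illustrative model of an overlapping, ring-shaped register window file.
# Assumed numbering: r0-r7 global, r8-r15 out, r16-r23 local, r24-r31 in.
N_WINDOWS = 8
GROUP = 8  # registers per portion


def total_registers(n_windows=N_WINDOWS):
    local = GROUP * n_windows      # private to each window
    in_out = GROUP * n_windows     # each "out" doubles as the next window's "in"
    return local + in_out + GROUP  # plus the shared globals


def storage_slot(cwp, reg, n_windows=N_WINDOWS):
    """Map (window number, register number) to a unique slot (hypothetical layout)."""
    if reg < 8:                                   # global, shared by all windows
        return ("global", 0, reg)
    if reg < 16:                                  # "out" of window cwp
        return ("in/out", (cwp + 1) % n_windows, reg - 8)
    if reg < 24:                                  # "local" of window cwp
        return ("local", cwp % n_windows, reg - 16)
    return ("in/out", cwp % n_windows, reg - 24)  # "in" of window cwp


print(total_registers())  # 136
```

In this sketch the "out" registers of window 7 land on the same slots as the "in" registers of window 0, which is what makes the structure a ring.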
[0010] In the related art, there are problems with respect to the speed
and scale of a circuit that reads data from such a large register
file.
[0011] FIG. 2 is a schematic diagram showing the structure of a
register file composed of a master register file and a working
register file.
[0012] As the number of register windows becomes large, a very
large register file is required (when the number of register
windows is eight, 136 registers are required). As a result, it
becomes difficult to supply an operand to an arithmetic unit at
high speed. Thus, as shown in FIG. 2, in addition to a register
file that stores all windows (portion (1) shown in FIG. 2; this
register file is referred to as the master register file (MRF)), a
subset of the MRF (portion (2) shown in FIG. 2) is provided. The
subset stores a copy of the one window pointed by CWP in the MRF
(hereinafter, the subset is referred to as the working register
file (WRF)). The WRF supplies an operand to the arithmetic unit.
Since the WRF stores only the window pointed by CWP, the capacity
of the WRF is 32 entries, which is smaller than that of the MRF.
Thus, data can be read from the WRF at high speed.
[0013] However, in such a structure, since the WRF stores only
registers for one window, when the SAVE instruction or the RESTORE
instruction is executed, an operand required by an instruction that
will be executed after the SAVE instruction or the RESTORE
instruction cannot be supplied from the WRF.
[0014] Thus, when the SAVE instruction or the RESTORE instruction
is executed, the window of the WRF is replaced with a new window,
which requires a window transferring process from the MRF to the
WRF. While the process is taking place, the execution of the
instructions that follow is stalled.
[0015] In addition, an information processing apparatus that uses
an out-of-order instruction execution system changes the processing
order of instructions and executes whichever instructions can be
processed, regardless of the execution order of the program.
However, the apparatus cannot execute an instruction that follows
the SAVE instruction or the RESTORE instruction, even if the
apparatus can otherwise process the instruction, until a window is
transferred to the WRF after the SAVE instruction or the RESTORE
instruction is executed.
[0016] Such a restriction greatly degrades the performance of an
information processing apparatus that uses the out-of-order
instruction execution system, which handles a large number of
instructions at a time. Such an apparatus pre-reads many
instructions and pre-stores them in buffers. Stored instructions
that are executable are executed in an order different from that
designated by the program so that the throughput of the execution
of the instructions is improved. Thus, if the execution order of
instructions cannot be changed around the SAVE instruction or the
RESTORE instruction, the out-of-order processing mechanism does not
work whenever the SAVE instruction or the RESTORE instruction is
executed. As a result, the performance of the apparatus
deteriorates remarkably.
SUMMARY OF THE INVENTION
[0017] An object of the present invention is to improve the
performance of an information processing apparatus that uses an
out-of-order processing system.
[0018] In particular, the present invention allows the execution
order to be changed and a large number of instructions to be
executed at a time when the SAVE instruction or the RESTORE
instruction is executed in an information processing apparatus that
uses a register window and an out-of-order execution system.
[0019] A first aspect of the present invention is a register file
having a master register file and a working register file, data
used for a process for an instruction being transferred from the
master register file and stored in the working register file, data
used for a process for an instruction being read from the working
register file, comprising a current window pointer unit pointing a
current window position for accessing the master register file, a
working register window current pointer unit pointing a current
window position for accessing working register file, and a unit
transferring data from the master register file to the working
register file and updating data of the working register file when
the current window pointer is changed so that the master register
file stores data of all register windows of an information
processing apparatus and the working register file stores data of a
window pointed by the current window pointer and data of windows
followed and preceded by the window pointed by the current window
pointer.
[0020] A second aspect of the present invention is a method for
controlling a register file having a master register file and a
working register file, data used for a process for an instruction
being transferred from the master register file and stored in the
working register file, data used for a process for an instruction
being read from the working register file, comprising the steps of
providing a current window pointer for pointing a current window
position for accessing the master register file and a working
register window current pointer for pointing a current window
position for accessing working register file, and transferring data
from the master register file to the working register file and
updating data of the working register file when the current window
pointer is changed so that the master register file stores data of
all register windows of an information processing apparatus and the
working register file stores data of a window pointed by the
current window pointer and data of windows followed and preceded by
the window pointed by the current window pointer.
[0021] According to the present invention, data necessary for
reading or writing a plurality of instructions that include the
SAVE instruction or the RESTORE instruction is pre-stored in a
working register file. Thus, even if instructions are successively
read or written, they can be executed without need to wait until
data necessary upon execution of the SAVE instruction or the
RESTORE instruction is transferred from the MRF to the WRF.
[0022] These and other objects, features and advantages of the
present invention will become more apparent in light of the
following detailed description of a best mode embodiment thereof,
as illustrated in the accompanying drawings.
BRIEF DESCRIPTION OF DRAWINGS
[0023] FIG. 1 is a schematic diagram showing the structure of a
ring-shaped register file;
[0024] FIG. 2 is a schematic diagram showing the structure of a
register file composed of a master register file and a working
register file;
[0025] FIG. 3 is a schematic diagram for explaining a foreseeing
transfer (No. 1);
[0026] FIG. 4 is a schematic diagram for explaining a foreseeing
transfer (No. 2);
[0027] FIG. 5 is a schematic diagram for explaining a foreseeing
transfer (No. 3);
[0028] FIG. 6 is a schematic diagram showing the relation between a
CWP and a WP;
[0029] FIG. 7 is a schematic diagram showing a method for
converting a WCWP and a WP;
[0030] FIG. 8 is a schematic diagram for explaining a method for
assigning a physical address to a WRF;
[0031] FIG. 9 is a schematic diagram showing a method for
converting a register number into a physical address (No. 1);
[0032] FIG. 10 is a schematic diagram showing a method for
converting a register number into a physical address (No. 2);
[0033] FIG. 11 is a schematic diagram showing a method for
performing an updating operation for a WCWP;
[0034] FIG. 12 is a schematic diagram for explaining a process for
an instruction performed by a computer corresponding to
out-of-order instruction execution system according to an
embodiment of the present invention;
[0035] FIG. 13 is a schematic diagram showing the state in which
instructions are executed out of order;
[0036] FIG. 14 is a block diagram showing a window register file
composed of n windows according to an embodiment of the present
invention;
[0037] FIG. 15 is a schematic diagram showing a method for mapping
a physical address to a WRF;
[0038] FIG. 16 is a block diagram showing the structure of a
WRF;
[0039] FIG. 17 is a schematic diagram showing a method for
selecting a bank;
[0040] FIG. 18 is a schematic diagram showing a method for mapping
a physical address [5:0];
[0041] FIG. 19 is a schematic diagram showing a WCWP of a
destination;
[0042] FIG. 20 is a schematic diagram for explaining registers of a
WRF to which data is transferred corresponding to
MOVE_dest_addr;
[0043] FIG. 21 is a block diagram showing the structure of an MRF
according to an embodiment of the present invention;
[0044] FIG. 22 is a schematic diagram showing the relation between
registers and windows in the case that data is written to an
MRF;
[0045] FIG. 23 is a schematic diagram showing the relation between
registers and windows to be moved;
[0046] FIG. 24 is a block diagram showing a window register file in
which an MRF and a WRF are connected according to an embodiment of
the present invention;
[0047] FIG. 25 is a schematic diagram showing meanings of a CWP, a
WCWP, and a phy_addr;
[0048] FIG. 26 is a schematic diagram showing the state that a
foreseeing transfer is performed for a WRF;
[0049] FIG. 27 is a time chart of a process performed in the case
that the SAVE instruction is successively executed;
[0050] FIG. 28 is a schematic diagram for explaining a process for
writing data to a bank of a WRF in the case that a SAVE process is
performed; and
[0051] FIG. 29 is a schematic diagram for explaining a process for
writing data to a bank of a WRF in the case that a RESTORE process
is performed.
DESCRIPTION OF PREFERRED EMBODIMENTS
[0052] The present invention provides a register file and a
controlling system thereof for a structure that has two types of
files: a master register file that stores all window registers and
a working register file that stores the part of the window
registers that may be accessed. Because the working register file
stores registers corresponding to a plurality of windows, the
execution order of instructions can be changed across a window
switching instruction in an out-of-order execution system.
[0053] Since it is difficult for an information processing
apparatus that has a register file storing a large number of
windows to read an operand at high speed, as shown in FIG. 2, all
the windows are stored in an MRF (portion (1) shown in FIG. 2) and
a part of windows is stored in a WRF (portion (2) shown in FIG. 2).
An operand is read from only the WRF. An updating process is
performed so that the content of the WRF is always the same as the
content stored in the MRF. Since an operand is supplied from only
the small WRF, it can be read at high speed. When a window
switching instruction is executed, the latest value of the switched
window is transferred from the master register file.
[0054] Only the window of registers pointed by CWP is updated as
the result of the execution of an instruction. That window is
stored in the WRF. Thus, the updating process could be performed
for only the WRF. In such a controlling method, however, to keep
the data of the MRF consistent with the data of the WRF, it is
necessary to pre-transfer data from the WRF to the MRF when a
window is switched. According to the present invention, to omit the
process for transferring data from the WRF to the MRF, the WRF and
the MRF are updated at the same time.
[0055] In addition, when such a method is used, a register that can
be read as an operand is limited to one in a window pointed by CWP.
Thus, in an information processing apparatus corresponding to the
out-of-order execution type, the execution order of instructions
cannot be changed when an instruction is preceded by the SAVE
instruction or the RESTORE instruction.
[0056] To solve such a problem, according to an embodiment of the
present invention, in addition to the window pointed by CWP, the
windows pointed by CWP-1 and CWP+1 are stored in the WRF. As a
result, since the windows that precede and follow the window
pointed by CWP are stored in the WRF, instructions before and after
the SAVE instruction or the RESTORE instruction can read from and
write to the register file. Thus, instructions can be moved across
the SAVE instruction or the RESTORE instruction.
[0057] When such a WRF is used, one window has 32 registers. Among
them, eight registers are in common with all windows (these
registers are referred to as global registers). The other eight
registers are in common with the immediately following window. The
remaining eight registers are in common with the immediately
preceding window. Thus, 24 registers of the 32 registers are shared
by the other windows. Consequently, to have windows pointed by
CWP-1, CWP, and CWP+1, only 64 registers are required.
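The figure of 64 registers can be verified with a small calculation. The sketch below is an illustration, not part of the embodiment: the function name and its parameter are hypothetical, and the formula assumes the WRF holds fewer windows than the full ring, so that the windows at the two ends do not share registers with each other.

```python
GROUP = 8  # registers per portion (global, in, local, out)


def wrf_registers(extra_each_side=1):
    """Registers needed to hold the CWP window plus neighbours on each side."""
    windows = 2 * extra_each_side + 1
    local = GROUP * windows          # one private "local" portion per window
    in_out = GROUP * (windows + 1)   # adjacent windows share in/out portions
    return GROUP + local + in_out    # plus one copy of the globals


print(wrf_registers(1))  # 64
```

With one extra window on each side, the three windows need three "local" portions and four in/out portions in addition to the globals, which gives 8 + 24 + 32 = 64.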
[0058] Next, a WRF that has two extra windows, one preceding and
one following the window pointed by CWP, will be described.
However, it should be noted that the present invention can be
extended to the case that a WRF has 2n extra windows that precede
and follow the window pointed by CWP.
[0059] When the same window switching instruction is executed twice
in succession (the SAVE instruction followed by the SAVE
instruction, or the RESTORE instruction followed by the RESTORE
instruction), the WRF does not have the window pointed by CWP+2 or
CWP-2. An instruction that follows the second SAVE instruction or
the second RESTORE instruction reads and writes data from and to
that window. Thus, while the window is transferred from the MRF,
the execution of that instruction is stalled.
[0060] To prevent the execution of the instruction from being
stalled, it is assumed that when the first SAVE instruction or the
first RESTORE instruction is executed, a window necessary for
executing an instruction preceded by the second SAVE instruction or
the second RESTORE instruction is pre-transferred from the MRF.
After the SAVE instruction is executed, the window pointer becomes
CWP+1. When the SAVE instruction is followed by the RESTORE
instruction, the window pointer becomes CWP. On the other hand,
when the SAVE instruction is followed by the SAVE instruction, the
window pointer becomes CWP+2. In any case, a window pointed by
CWP-1 is not required. Thus, a window pointed by CWP+2 necessary
for the case that the SAVE instruction is followed by the SAVE
instruction is transferred from the MRF. The same applies to the
RESTORE instruction. Thus, when the SAVE instruction or the RESTORE
instruction is executed, the following transferring process is
performed.
[0061] When the SAVE instruction is executed, a window pointed by
CWP+2 of the MRF is transferred to a window pointed by CWP-1 of the
WRF.
[0062] When the RESTORE instruction is executed, a window pointed
by CWP-2 of the MRF is transferred to a window pointed by CWP+1 of
the WRF.
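The two transfer rules above can be modelled at the granularity of whole windows. The following sketch is a simplified illustration that tracks only which window numbers are resident in the WRF; the class and attribute names are hypothetical, and the eight-window ring is taken from the FIG. 1 example.

```python
N_WINDOWS = 8  # ring size used in the FIG. 1 example


class WorkingFile:
    """Tracks which MRF windows are resident in the WRF (window numbers only)."""

    def __init__(self, cwp=0):
        self.cwp = cwp
        self.resident = {(cwp + d) % N_WINDOWS for d in (-1, 0, 1)}

    def save(self):
        # Window CWP+2 of the MRF overwrites the slot that held CWP-1.
        self.resident.discard((self.cwp - 1) % N_WINDOWS)
        self.resident.add((self.cwp + 2) % N_WINDOWS)
        self.cwp = (self.cwp + 1) % N_WINDOWS

    def restore(self):
        # Window CWP-2 of the MRF overwrites the slot that held CWP+1.
        self.resident.discard((self.cwp + 1) % N_WINDOWS)
        self.resident.add((self.cwp - 2) % N_WINDOWS)
        self.cwp = (self.cwp - 1) % N_WINDOWS


wrf = WorkingFile(cwp=3)
wrf.save()
print(wrf.cwp, sorted(wrf.resident))  # 4 [3, 4, 5]
```

After either instruction, the WRF again holds the new current window and both of its neighbours, so a second SAVE or RESTORE finds its window already in place once the transfer has completed.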
[0063] When such a register window is foresee-transferred, even if
the SAVE instruction or the RESTORE instruction is successively
executed, the execution of an instruction preceded thereby can be
prevented from being stalled. However, there is an exception. In
other words, when an instruction preceded by the second SAVE
instruction or the second RESTORE instruction becomes executable
before the register window foresee-transferring process is
performed along with the first SAVE instruction or the first
RESTORE instruction, the executable instruction is stalled. In
other words, since a window necessary for the executable
instruction has not been transferred from the MRF, the instruction
is stalled until the window is transferred.
[0064] FIGS. 3 to 5 are schematic diagrams for explaining a
foreseeing transfer.
[0065] In the above-described foresee transfer, when the SAVE
instruction is executed, a window pointed by CWP+2 is transferred.
When the RESTORE instruction is executed, a window pointed by CWP-2
is transferred. However, since register windows are overlap
windows, in state-0 shown in FIG. 4, the MRF has an "in" portion
pointed by CWP+2 (or an "out" portion pointed by CWP-2) and a
"global" portion.
[0066] When the SAVE instruction or the RESTORE instruction is
executed, it is not necessary to transfer a whole window (32
entries) from the MRF to the WRF. In other words, when the SAVE
instruction is executed, only the "out" portion and the "local"
portion pointed by CWP+2 can be transferred. When the RESTORE
instruction is executed, only the "in" portion and the "local"
portion pointed by CWP-2 can be transferred. Thus, only 16 entries
that are a half window can be transferred. The foreseeing transfer
is performed as shown in FIG. 3.
[0067] As shown in FIG. 4, WPs are assigned to "in/out" portions
and "local" portions one after the other. To perform a foreseeing
transfer, a window should be transferred to a window two positions
ahead. Thus, a window is transferred to a window pointed by WP+2.
In FIG. 4, since WP starts at 1, "mod 7" should be followed by
"+1". Thus, for compensation, "-1" is added to "±4" in
parentheses.
[0068] Using such a method, although the amount of data transferred
from the MRF to the WRF can be decreased, whenever a window
switching instruction is executed, the positions of the "in"
portion, the "local" portion, and the "out" portion adversely
vary.
[0069] FIG. 4 is a schematic diagram showing the operation of a
WRF.
[0070] Frames assigned WP (Window Pointer)=1, . . . , and 7 are
composed of eight registers each. A WRF also has a set of registers
of a "global" portion (not shown).
[0071] FIG. 5 is a schematic diagram showing the state that windows
of an MRF are mapped to a WRF. In state-1, a frame of WP=4 stores a
"local" portion. A frame of WP=3 stores an "in" portion. A frame of
WP=5 stores an "out" portion. Likewise, a frame of WP=6 stores a
"local" portion pointed by CWP+1. A frame of WP=1 stores an "in"
portion pointed by CWP-1. A frame of WP=3 stores an "out" portion
pointed by CWP-1. Windows of the WRF are mapped from the MRF. In
other words, windows of the MRF are partly mapped to the WRF.
[0072] In FIG. 5, portion (1) represents the state that an MRF is
mapped to a WRF in the state-1.
[0073] When the SAVE instruction is executed, the state-1 advances
to state-2. Thus, CWP+1 in the state-1 becomes CWP in the state-2.
An "out" portion and a "local" portion pointed by CWP+1 in the
state-2 are foresee-transferred from the MRF. Thus, as portions (2)
and (3) shown in FIG. 5, frames of WP=1 and 2 are overwritten with
data transferred from the MRF. Frames of WP=5, 6, and 7 become
those pointed by CWP in the state-2.
[0074] When the RESTORE instruction is executed, the state-1
returns to the state-0. Thus, CWP-1 in the state-1 becomes CWP in
the state-0. An "in" portion and a "local" portion pointed by CWP-1
in the state-0 are foresee-transferred from the MRF. Thus, as
portions (4) and (5) shown in FIG. 5, frames of WP=6 and 7 are
overwritten with data transferred from the MRF. Frames of WP=1, 2,
and 3 become those pointed by CWP in the state-0.
[0075] FIG. 6 is a schematic diagram showing the relation between
CWP and WP.
[0076] FIG. 6 shows the case that the position of CWP of the WRF is
denoted by WP and that the "in" portion, the "local" portion, and
the "out" portion pointed by CWP are represented by a set of three
elements of WP when the SAVE instruction and the RESTORE
instruction are executed. When the SAVE instruction is executed
eight times in the state that CWP=0 and WP=[3, 4, 5] shown in FIG.
6, although CWP becomes 0, WP elements become [3, 2, 5]. Thus,
after the SAVE instruction is executed, the WP elements vary.
Consequently, with CWP alone, the WP elements of the "in" portion,
the "local" portion, and the "out" portion of the WRF cannot be
uniquely designated. Although registers of the MRF can be
designated with CWP and reg number, registers of the WRF cannot be
designated in that way for this reason.
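The drift of the WP elements can be reproduced with a small sketch. The stepping rule used here is an assumption inferred from the state transitions described for FIG. 5: in/out frames rotate through WP=1, 3, 5, 7 and "local" frames rotate through WP=2, 4, 6. The function and list names are hypothetical.

```python
IN_OUT_FRAMES = [1, 3, 5, 7]  # WP numbers that hold in/out portions
LOCAL_FRAMES = [2, 4, 6]      # WP numbers that hold local portions


def step(wp, direction):
    """Advance [in, local, out] by one SAVE (+1) or RESTORE (-1). Assumed rule."""
    wp_in, wp_local, _ = wp
    i = (IN_OUT_FRAMES.index(wp_in) + direction) % len(IN_OUT_FRAMES)
    l = (LOCAL_FRAMES.index(wp_local) + direction) % len(LOCAL_FRAMES)
    wp_out = IN_OUT_FRAMES[(i + 1) % len(IN_OUT_FRAMES)]
    return [IN_OUT_FRAMES[i], LOCAL_FRAMES[l], wp_out]


wp = [3, 4, 5]       # CWP=0 in the FIG. 6 example
for _ in range(8):   # eight SAVEs bring CWP back to 0 on an 8-window ring
    wp = step(wp, +1)
print(wp)  # [3, 2, 5]
```

Because the in/out frames cycle with period four and the "local" frames with period three, the two do not return to their starting frames together after eight steps, which is why CWP alone cannot identify the frames.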
[0077] Since the position of CWP in the WRF cannot be uniquely
designated, WP that represents the position of CWP in the WRF is
stored in a register. That is referred to as WCWP (Working Register
Current Window Pointer). WCWP is composed of four bits. WCWP [3:2]
represents WP of the position of the "local" portion pointed by
CWP. WCWP [1:0] represents WP of the position of the "in" portion
pointed by CWP. (WCWP [1:0]+1) mod 4 represents WP of the position
of the "out" portion pointed by CWP.
[0078] Firstly, WCWP represents the position of the current window
in the WRF since it cannot be uniquely obtained with CWP as shown
in FIG. 6. When register windows are disposed in the method
according to the embodiment, a WCWP register is disposed along with
a CWP register. When the SAVE instruction or the RESTORE
instruction is executed, an updating process is performed by
synchronizing them with CWP in the method that will be described
later.
[0079] In the related art, when the SAVE instruction or the RESTORE
instruction is executed, three portions of the "in" portion, the
"local" portion, and the "out" portion should be transferred to a
WRF. In contrast, according to the embodiment of the present
invention, since the "in" portion and the "out" portion overlap in
each window, when the SAVE instruction is executed, only the "out"
portion and the "local" portion are transferred. When the RESTORE
instruction is executed, only the "in" portion and the "local"
portion are transferred. Thus, the amount of data that is
transferred can be reduced to 2/3 of the amount of data that is
normally transferred. Thus, the positions of the "in" portion, the
"out" portion, and the "local" portion dynamically vary in the
WRF.
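The reduction in transferred data can be stated as a short calculation. The sketch below only restates the rule in the paragraph above; the function name is hypothetical.

```python
GROUP = 8  # registers per portion


def portions_to_transfer(instruction):
    """Portions fetched from the MRF when a window switching instruction runs."""
    if instruction == "SAVE":
        return ["out", "local"]   # the "in" portion overlaps a resident "out"
    if instruction == "RESTORE":
        return ["in", "local"]    # the "out" portion overlaps a resident "in"
    raise ValueError(instruction)


full = 3 * GROUP                                   # in + local + out
moved = len(portions_to_transfer("SAVE")) * GROUP  # two portions only
print(moved, full)  # 16 24
```

Sixteen registers are moved instead of 24, which is the 2/3 ratio mentioned above, and half of a 32-entry window when the globals are counted.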
[0080] Secondly, WCWP correctly points the current positions of the
"in" portion, the "out" portion, and the "local" portion that vary
in such a manner. When a pair of (WCWP and reg number) is given, a
means for obtaining the positions of the registers in the WRF is
accomplished.
[0081] FIG. 7 is a schematic diagram showing a method for
converting WCWP and WP.
[0082] As shown in FIG. 7, WCWP is converted into WP numbers of the
"in" portion, the "local" portion, and the "out" portion shown in
FIG. 4. In the example, WP numbers are assigned in the WRF so that
the "in/out" portions and the "local" portions alternate. Thus, the
WP numbers assigned to the "in/out" portions are odd numbers,
whereas the WP numbers assigned to the "local" portions are even
numbers. Thus, as shown in FIG. 7, the WP number assigned to the
"in" portion becomes WCWP [1:0]×2+1. The WP number assigned to the
"local" portion becomes WCWP [3:2]×2. The "out" portion is preceded
by the "in" portion. Thus, the WP number assigned to the "out"
portion becomes ((WCWP [1:0]+1) mod 4)×2+1.
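The conversion described for FIG. 7 can be sketched in Python as follows. This is only an illustrative sketch: the function name wp_numbers and the dictionary layout are hypothetical and do not appear in the application; the three expressions are the ones stated above.

```python
def wp_numbers(wcwp: int) -> dict:
    """Convert a 4-bit WCWP value into the WP numbers of the current window.

    Hypothetical helper; the expressions follow the text for FIG. 7.
    """
    local_field = (wcwp >> 2) & 0b11  # WCWP[3:2]
    inout_field = wcwp & 0b11         # WCWP[1:0]
    return {
        "in": inout_field * 2 + 1,
        "local": local_field * 2,
        "out": ((inout_field + 1) % 4) * 2 + 1,
    }


print(wp_numbers(0b0101))  # {'in': 3, 'local': 2, 'out': 5}
```

For WCWP="0101" the "out" portion lands on WP=5, which agrees with the value used in the conversion example for r10.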
[0083] In addition, the eight registers of each WP number of the
WRF are assigned addresses in the range of 0 to 63 as WP=1 (8, . . .
, 15), WP=2 (16, . . . , 23), WP=3 (24, . . . , 31), . . . , and so
on. These addresses are referred to as physical addresses. In
addition, the "global" portion is assigned to WP=0 (0, . . . ,
7).
[0084] FIG. 8 is a schematic diagram for explaining a method for
assigning physical addresses to a WRF. FIGS. 9 and 10 are schematic
diagrams showing a method for converting a register number into a
physical address.
[0085] Physical addresses are assigned to the WRF as shown in FIG.
8.
[0086] Normally, a register is accessed using CWP and reg number
(register number). However, as was described above, a WRF is
accessed using WCWP instead of CWP. When WCWP and reg number are
given, a physical address of the register of the WRF is obtained as
shown in FIG. 9. The calculation of a physical address shown in FIG.
9 is illustrated in FIG. 10. In FIG. 9, a term ". . . ×16"
means that WCWP [3:2] (or WCWP [1:0]) is shifted to the left by
four bits as shown in FIG. 10.
[0087] Next, an example in which a pair of (WCWP, reg number) is
converted into a physical address of a WRF will be described.
[0088] Conversion of (WCWP [1:0]="01", r10) into physical
address
[0089] Since r10 is in an "out" portion, which corresponds to the
second line of FIG. 9, the physical address becomes ((1+1) mod
4)×16+10=42.
[0090] Alternatively, with reference to FIGS. 7 and 8, since
WP=((1+1) mod 4)×2+1=5 (see FIG. 7), the physical address of r10 of
WP=5 is 42 (see FIG. 8).
[0091] An expression for obtaining a physical address shown in FIG.
9 can be obtained using the table shown in FIG. 7 as follows. In
this example, a physical address of a register of the "out" portion
is obtained. However, the same method applies to obtaining a
physical address of a register of the other portions.
Physical address=WP×8+reg number-8
From the table shown in FIG. 7, physical address=(((WCWP [1:0]+1)
mod 4)×2+1)×8+reg number-8=((WCWP [1:0]+1) mod
4)×16+reg number
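As a hedged illustration of how the same method may extend to the other portions, the derivation can be written out for all three windowed portions. It assumes the usual register numbering in which r8 to r15 are the "out" registers, r16 to r23 are the "local" registers, and r24 to r31 are the "in" registers, so that the first register number of each portion is subtracted. Here W[3:2] and W[1:0] abbreviate WCWP [3:2] and WCWP [1:0], and r is the reg number.

```latex
\begin{align*}
\text{out } (r8\text{--}r15):\quad
  & \bigl(((W_{1:0}+1) \bmod 4)\times 2+1\bigr)\times 8 + r - 8
    = ((W_{1:0}+1) \bmod 4)\times 16 + r \\
\text{local } (r16\text{--}r23):\quad
  & (W_{3:2}\times 2)\times 8 + r - 16
    = W_{3:2}\times 16 + r - 16 \\
\text{in } (r24\text{--}r31):\quad
  & (W_{1:0}\times 2+1)\times 8 + r - 24
    = W_{1:0}\times 16 + r - 16
\end{align*}
```

For WCWP="0101", these expressions give 17 for r17 and 28 for r28.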
[0092] This mapping operation is performed when an instruction for
accessing a register is decoded. A physical address of each
register of the WRF does not vary by the SAVE instruction or the
RESTORE instruction as shown in FIG. 4. Thus, when an instruction
is decoded, physical addresses of all registers that the
instruction accesses can be decided.
[0093] Although the position of the window pointed by CWP varies in
the WRF, the physical address that is obtained with WCWP for a
register accessed by an instruction does not vary with the
SAVE instruction or the RESTORE instruction. Thus, when an
instruction is decoded, the physical addresses of all registers
accessed by the instruction can be decided.
[0094] FIG. 11 is a schematic diagram showing a method for updating
WCWP.
[0095] Since the position of the window pointed by CWP varies in a
WRF, it is necessary to update WCWP that corresponds to CWP when the
SAVE instruction or the RESTORE instruction is executed. The
updating process is performed as shown in FIG. 11. In this example,
since WCWP [3:2]="00" has been assigned to a "global" portion, WCWP
[3:2] should vary from "01" to "10" to "11" to "01" and so on. Thus,
in the expression for WCWP [3:2] shown in FIG. 11, "mod 3" is
followed by "+1". To compensate for this "+1", "-1" is placed in the
parentheses on the left of "mod 3".
[0096] The updating process should be performed when the SAVE
instruction or the RESTORE instruction is fetched. This is because
when an instruction is fetched after the SAVE instruction or the
RESTORE instruction has been executed, WCWP that has been updated
is required.
[0097] In contrast, the foreseeing transfer from the MRF to the WRF
should not be performed until all instructions that are followed by
the SAVE instruction or the RESTORE instruction have been executed.
This is because in instructions followed by the SAVE instruction or
the RESTORE instruction, a branch, an interrupt, and so forth may
take place and thereby the control path may vary. Thus, there is a
possibility that the SAVE instruction or the RESTORE
instruction is not executed.
[0098] FIG. 12 is a schematic diagram for explaining a process for
an instruction by a computer corresponding to out-of-order
instruction execution system according to an embodiment of the
present invention.
[0099] N instructions are simultaneously fetched by a plurality of
instruction fetching mechanisms 2 from an instruction cache 1 and
stored to a reservation station 3. These processes are performed in
order. The reservation station 3 removes the dependency of the
instructions stored therein. Until calculation slots 4 become idle,
these instructions are stored in the reservation station 3. An
instruction that can be calculated is supplied to a calculation
slot 4. An operand is read from a register file 5. The instruction
is executed by an arithmetic unit 6. The instructions are supplied
from the reservation station 3 to the calculation slots 4 out of
order regardless of the order of the instructions of the original
program. After the calculation has been completed, the result is
stored in a result buffer 7. Thereafter, the calculated result
stored in the result buffer is written to a register file 8 in
order of the instructions of the original program.
[0100] In FIG. 12, the register file 5 is the same as the register
file 8.
[0101] When the present invention is applied to the computer that
has the out-of-order instruction processing mechanism shown in FIG.
12, a remarkable effect can be achieved.
[0102] FIG. 13 is a schematic diagram showing the state in which
instructions are executed out of order.
[0103] Next, an instruction sequence on the upper left of FIG. 13
will be exemplified. In this example, it is assumed that the number
m of instruction slots is 2.
[0104] In the instruction sequence shown in FIG. 13, there are
interferences of registers from instruction (1) to instruction (2),
from instruction (2) to instruction (4), and from instruction (5)
to instruction (6).
[0105] In addition, the instruction (3) is interlocked by the
preceding instructions (1) and (2). This is because when the SAVE
instruction (3) is executed, a window is transferred from the MRF
to the WRF. If the SAVE instruction (3) were executed before
the instructions (1) and (2), and one of them were trapped
later so that it became clear that the instruction (3) should not
be executed, the instruction (3) could not be cancelled.
[0106] Such a restriction is denoted by a solid line on the upper
right of FIG. 13. When such a restriction is satisfied and the
process is performed in the shortest time, the instructions are
executed in sequence (a) shown in FIG. 13.
[0107] In contrast, according to the related art in which the WRF
has only the one window pointed by CWP, until an instruction
followed by the SAVE instruction has been executed, data cannot be
transferred from the MRF to the WRF. Until data has been
transferred from the MRF to the WRF, an instruction preceded by the
SAVE instruction cannot be executed. As a result, interlocks of
{instruction followed by instruction (3)}→instruction (3)
and instruction (3)→{instruction preceded by instruction
(3)} take place. The instruction sequence shown on the upper left
of FIG. 13 has such interlocks as a restriction denoted by a curved
line on the upper right of FIG. 13.
[0108] However, since % o3 of an instruction followed by the SAVE
instruction overlaps with % i3 of an instruction preceded by the
SAVE instruction, when a technology for dynamically substituting
reg numbers of the "in/out" portions is used in the related art, no
interlocks take place. In this case, an interlock from the
instruction (3) to the instruction (4) does not take place.
[0109] When the restriction of the related art is satisfied and the
process is performed in the shortest time, the instructions are
executed in the sequence (b) shown in FIG. 13.
[0110] The latency of the sequence (a) shown in FIG. 13 is 11. In
contrast, the latency of the sequence (b) shown in FIG. 13 is 18.
Thus, according to the embodiment, the latency is improved by 7
against the related art.
[0111] CWP and WCWP of each instruction are shown on the right of
the instruction sequence on the upper left of FIG. 13. Before the
SAVE instruction (3) is executed, CWP and WCWP are 1 and 0101,
respectively. After the SAVE instruction has been executed, CWP and
WCWP vary to 2 and 1010, respectively.
[0112] The registers used in the instructions (1) and (2) followed
by the SAVE instruction are % i4, % o3, and % l1. On the other
hand, the registers used in the instructions (4), (5), and (6)
preceded by the SAVE instruction are % i3, % l3, and % l4. When the
registers are converted into physical addresses corresponding to
the table shown in FIG. 9, before the SAVE instruction is executed,
the physical addresses of the registers % i4, % o3, and % l1 become
28, 43, and 17, respectively. After the SAVE instruction has been
executed, the physical addresses of the registers % i3, % l3, and %
l4 become 43, 35, and 36, respectively.
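The address values quoted above can be reproduced with a short sketch. The function physical_address and the REG table are hypothetical names introduced for illustration. The expression for the "out" portion is the one given in the text; the expressions for the "local" and "in" portions are derived in the same way under the assumption that r16 to r23 are the "local" registers and r24 to r31 are the "in" registers.

```python
def physical_address(wcwp: int, reg: int) -> int:
    """Map (WCWP, reg number) to a 6-bit WRF physical address (sketch)."""
    local_field = (wcwp >> 2) & 0b11  # WCWP[3:2]
    inout_field = wcwp & 0b11         # WCWP[1:0]
    if 0 <= reg <= 7:                 # "global" portion, WP=0
        return reg
    if 8 <= reg <= 15:                # "out" portion
        return ((inout_field + 1) % 4) * 16 + reg
    if 16 <= reg <= 23:               # "local" portion (assumed numbering)
        return local_field * 16 + reg - 16
    if 24 <= reg <= 31:               # "in" portion (assumed numbering)
        return inout_field * 16 + reg - 16
    raise ValueError("reg number must be in 0..31")


# Register numbers of the registers named in the example.
REG = {"%o3": 11, "%l1": 17, "%l3": 19, "%l4": 20, "%i3": 27, "%i4": 28}

before = [physical_address(0b0101, REG[r]) for r in ("%i4", "%o3", "%l1")]
after = [physical_address(0b1010, REG[r]) for r in ("%i3", "%l3", "%l4")]
print(before)  # [28, 43, 17]
print(after)   # [43, 35, 36]
```

The register % o3 before the SAVE instruction and the register % i3 after it both resolve to the physical address 43.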
[0113] The registers of an instruction preceded and followed by the
SAVE instruction can be accessed by common physical addresses. The
positions of the registers in the WRF do not vary before and
after the SAVE instruction is executed. For example, since % o3 of
an instruction followed by the SAVE instruction overlaps with % i3
of an instruction preceded by the SAVE instruction, they can be
accessed with the same physical address (=43) in the WRF.
[0114] The fact that physical addresses do not vary in a WRF before
and after the SAVE instruction or the RESTORE instruction is
executed is important for executing instructions out of order.
[0115] Even if a WRF can have all registers necessary before and
after the execution of the SAVE instruction and an operand can be
supplied to an instruction preceded by the SAVE instruction without
need to wait until the next window is transferred from the MRF,
when physical addresses of the registers vary after and before the
SAVE instruction is executed, it is difficult to move an
instruction through the SAVE instruction.
[0116] In the method according to the embodiment, since the physical
addresses of registers do not vary, when the instruction (2) is
fetched as shown in FIG. 12, the physical addresses are calculated.
When the calculated physical addresses are registered to the
reservation station, the correct registers can be accessed even if
the instructions are executed in any order.
[0117] FIG. 14 is a block diagram showing a window register file
composed of n windows according to an embodiment of the present
invention.
[0118] In FIG. 14, reference numeral 10 is an MRF that has "local"
portions and "in/out" portions for n windows. In FIG. 14, reference
numeral 11 is a WRF. In the WRF, a "global" portion is composed of
eight entries, a "local" portion is composed of 24 entries, and an
"in/out" portion is composed of 32 entries. Thus, the WRF is
composed of a total of 64 entries.
[0119] An operand and stored data are supplied from only the WRF to
an execution unit and a memory unit designated by 12 in FIG. 14. In
contrast, a calculated result and/or a loaded result of the
execution unit and the memory unit are written to both the MRF and
the WRF. As a result, the content of the MRF becomes consistent
with the content of the WRF.
[0120] The WRF is accessed through a window pointer WCWP [3:0].
[0121] Register data of a total of 16 entries of one window (eight
entries) of the "local" portion and one window (eight entries) of
the "in/out" portion of the MRF is foresee-transferred to the WRF
through a transfer path.
[0122] After all instructions followed by the SAVE instruction or
the RESTORE instruction have been executed, when the SAVE
instruction is executed, the "out" portion pointed by CWP+2 (=the
"in" portion pointed by CWP+3) and the "local" portion pointed by
CWP+2 are transferred to unused windows of the WRF. In contrast,
when the RESTORE instruction is executed, the "in" portion and the
"local" portion pointed by CWP-2 are transferred to unused windows
of the WRF.
[0123] According to an embodiment of the present invention, a total
of 136 entries of registers ("in" portion, "out" portion, "local"
portion, and "global" portion) of eight windows are provided. Among
these registers, 128 registers of the "in" portion, the "out"
portion, and the "local" portion are disposed in the master
register file (MRF). These registers are always updated so that the
contents thereof store the latest values. In contrast, an operand
is supplied to the arithmetic unit from the working register file
(WRF) rather than the MRF. In addition, a "global" portion for
which a window is not switched is disposed in the WRF.
[0124] Working Register File (WRF)
[0125] The WRF accesses three windows pointed by CWP, CWP-1, and
CWP+1 (a total of 64 entries) through a six-bit physical
address.
[0126] A window pointed by CWP varies in the WRF. A four-bit WCWP
(Working Register Current Window Pointer) register is disposed so
as to designate the window pointed by CWP. For the "local" portion,
WCWP [3:2] is used instead of CWP.
[0127] When all registers are accessed, a six-bit physical address
obtained with a reg number and WCWP corresponding to the table
shown in FIG. 9 is used.
[0128] FIG. 15 is a schematic diagram showing a method for mapping
a physical address to a WRF.
[0129] 64 entries of a WRF are grouped as g, l1, l2, l3, io1, io2,
io3, and io4, each of which is composed of eight entries and mapped
to physical addresses as shown in FIG. 15.
[0130] FIG. 16 is a block diagram showing the structure of a
WRF.
[0131] In FIG. 16, 64 entries of registers are grouped as four
16-entry modules as shown in FIG. 16.
[0132] The WRF is operated by three types of operations: READ
(WRF→execution unit), WRITE (execution unit→WRF), and
MOVE (MRF→WRF). The MOVE operation is performed in
association with the SAVE instruction or the RESTORE instruction.
The READ operation and the WRITE operation are executed with
physical addresses denoted by (15) and (14) shown in FIG. 16. The
16-entry modules denoted by (1), (2), (3), and (4) shown in FIG. 16
are denoted by banks 1, 2, 3, and 4, respectively. The bank 1 has %
g0 to % g3 (% l0 to % l3); the bank 2 has % g4 to % g7 (% l4 to %
l7); the bank 3 has % i0 to % i3 (% o0 to % o3); and the bank 4 has
% i4 to % i7 (% o4 to % o7).
[0133] FIG. 17 is a schematic diagram showing a method for
selecting a bank. FIG. 18 is a schematic diagram showing a method
for mapping a physical address [5:0].
[0134] A bank to or from which data is written or read is decided
by bits [3:2] of a physical address corresponding to the table
shown in FIG. 17. An address in a bank is decided by bits [1:0] and
[5:4].
[0135] Each bit of a physical address in the table shown in FIG. 10
has the meaning shown in FIG. 18.
[0136] Thus, when a physical address is given, the WRF is accessed
in such a manner that a bank is decided by the bits [3:2] of the
physical address, a four-bit address of the bank is generated with
the bits [1:0] (as high order bits) and bits [5:4] (as low order
bits) of the physical address, and a register is accessed to the
bank with the generated bank address.
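The access path described above can be sketched as a small bit-manipulation routine. The function name bank_and_offset is hypothetical. The table of FIG. 17 is not reproduced in this text, so the mapping of bits [3:2] = "00", "01", "10", "11" to banks 1, 2, 3, and 4 is an assumption, chosen to agree with the bank contents listed for FIG. 16.

```python
def bank_and_offset(phys: int) -> tuple:
    """Split a 6-bit physical address into (bank number, 4-bit bank address)."""
    if not 0 <= phys <= 63:
        raise ValueError("physical address must be in 0..63")
    bank = ((phys >> 2) & 0b11) + 1  # bits [3:2]; bank numbering 1..4 assumed
    high = phys & 0b11               # bits [1:0] become the high-order bits
    low = (phys >> 4) & 0b11         # bits [5:4] become the low-order bits
    return bank, (high << 2) | low


print(bank_and_offset(43))  # (3, 14)
print(bank_and_offset(17))  # (1, 5)
```

Under this assumption, physical address 43 falls in bank 3 and physical address 17 falls in bank 1, and the 64 physical addresses map to 64 distinct (bank, bank address) pairs.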
[0137] FIG. 19 is a schematic diagram showing WCWP of the
destination. When the MOVE operation is performed, with
instructions denoted by (7), (8), (11), and (12) shown in FIG. 16,
an address in a bank to which data is transferred is designated.
WCWP [3:2] designates the destination for the "local" portion. WCWP
[1:0] designates the destination for the "in/out" portions. WCWP of
the destination is decided corresponding to the table shown in FIG.
19.
[0138] When a window is transferred from the MRF to the WRF, since
it is foresee-transferred, when the SAVE instruction is executed, a
window pointed by WCWP+2 is accessed. When the RESTORE instruction
is executed, a window pointed by WCWP-2 is accessed. In addition,
since WCWP [3:2]="00" is assigned to a window of the "global"
portion, it is necessary to cause WCWP [3:2] to vary from "01" to
"10" to "11" to "01" and so on. Thus, WCWP [3:2] shown in FIG. 19
is "mod 3" followed by "+1". Thus, in the calculation in the
parentheses on the left of "mod 3", "-1" is placed for a
compensation.
[0139] In contrast, when the SAVE instruction is executed, it is
necessary to transfer the "out" portion of a window that is two
positions ahead. However, since WCWP [1:0] is a pointer that points
to an "in" portion, it points to the "in" portion that is three
positions ahead and that overlaps with the "out" portion that is two
positions ahead. As a result, in this case, that "in" portion is
transferred.
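The destination fields described for FIG. 19 can be sketched as follows. The table itself is not reproduced in this text, so the expressions below are reconstructed from the description (two positions ahead or behind for the "local" field with the "mod 3" and "+1" adjustment, and three positions ahead for the "in" field on a SAVE instruction) and should be read as an assumption. The function name move_dest_fields is hypothetical.

```python
def move_dest_fields(wcwp: int, op: str) -> tuple:
    """Return (destination WCWP[3:2], destination WCWP[1:0]) for a MOVE."""
    local_field = (wcwp >> 2) & 0b11
    inout_field = wcwp & 0b11
    if op == "SAVE":
        # "local": two positions ahead, cycling 1 -> 2 -> 3 -> 1.
        dest_local = (local_field + 2 - 1) % 3 + 1
        # "out" two ahead is addressed as the "in" portion three ahead.
        dest_inout = (inout_field + 3) % 4
    elif op == "RESTORE":
        # Python's % yields a non-negative result for a positive modulus.
        dest_local = (local_field - 2 - 1) % 3 + 1
        dest_inout = (inout_field - 2) % 4
    else:
        raise ValueError("op must be 'SAVE' or 'RESTORE'")
    return dest_local, dest_inout


print(move_dest_fields(0b0100, "SAVE"))     # (3, 3)
print(move_dest_fields(0b1001, "RESTORE"))  # (3, 3)
```

Both results select the "local" field 3 and the "in/out" field 3, which under the WP numbering of FIG. 7 correspond to WP=6 and WP=7.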
[0140] In addition, when the MOVE operation is performed,
instructions denoted by (5), (6), (9), and (10) in FIG. 16 are
transferred in two phases so as to reduce the path width.
[0141] In phase=0, even reg numbers (l [0], l [2], l [4], l [6], io
[0], io [2], io [4], and io [6]) are transferred.
[0142] In phase=1, odd reg numbers (l [1], l [3], l [5], l [7], io
[1], io [3], io [5], and io [7]) are transferred.
[0143] A destination address, MOVE_dest_addr, that is necessary in
the WRF when the MOVE operation is performed is composed of a total
of four bits: the high order two bits designate the high/low of the
bank and represent the phase, and the low order two bits are WCWP
[3:2] (or WCWP [1:0]).
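A possible composition of MOVE_dest_addr is sketched below. The order of the two high-order bits (high/low first, then phase) is an assumption; it is chosen so that the result coincides with the four-bit bank address formed from bits [1:0] and bits [5:4] of a physical address. The function and parameter names are hypothetical.

```python
def move_dest_addr(high_half: int, phase: int, dest_field: int) -> int:
    """Compose a 4-bit in-bank destination address for the MOVE operation.

    high_half:  0 for the low half of the bank, 1 for the high half
    phase:      0 for even reg numbers, 1 for odd reg numbers
    dest_field: destination WCWP[3:2] ("local") or WCWP[1:0] ("in/out")
    """
    return ((high_half & 1) << 3) | ((phase & 1) << 2) | (dest_field & 0b11)


# Register offset 3 within a portion: high half, odd phase, destination field 3.
print(move_dest_addr(1, 1, 3))  # 15
```

Under this assumption the value equals the bank address that a READ or WRITE operation would generate for the same register.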
[0144] FIG. 20 is a schematic diagram for explaining registers of
WRF to which data is transferred corresponding to
MOVE_dest_addr.
[0145] MOVE_dest_addr is composed of a total of four instructions
that are two instructions denoted by (7) and (8) for accessing
"local" portions of banks 1 and 2 and two instructions denoted by
(11) and (12) for accessing "out" portions of banks 3 and 4. With
MOVE_dest_addr, registers are accessed corresponding to the table
shown in FIG. 20.
[0146] When the MOVE operation is performed, an instruction denoted
by (5) shown in FIG. 16 is written to a register represented by an
address in a bank designated by an instruction denoted by (7) shown
in FIG. 16 of bank 1, 2--low. An instruction denoted by (6) shown
in FIG. 16 is written to a register represented by an address in a
bank designated by an instruction denoted by (8) shown in FIG. 16
in bank 1, 2--high. An instruction denoted by (9) shown in FIG. 16
is written to a register represented by an address in a bank
represented by the instruction denoted by (11) shown in FIG. 16 of
bank 3, 4--low. An instruction denoted by (10) shown in FIG. 16 is
written to a register represented by an address in a bank
represented by the instruction denoted by (12) shown in FIG. 16 of
bank 3, 4--high.
[0147] Even if the registers are transferred in two phases,
since these operations can be pipelined as will be described later,
the process latency increases only by one.
[0148] Master Register File (MRF)
[0149] FIG. 21 is a block diagram showing the structure of an MRF
according to an embodiment of the present invention. FIG. 22 is a
schematic diagram showing the relation between registers and
windows in the case that data is written to the MRF.
[0150] The MRF is divided into two areas that are an area for
storing "in/out" portions of all windows (this area is denoted by
(1) in FIG. 21) and an area for storing "local" portions of all
windows (this area is denoted by (2) in FIG. 21). In the MRF, the
WRITE operation and the MOVE operation are performed. Unlike with a
WRF, in the MRF, a window position can be decided by CWP. Thus,
when a window is accessed in the MRF, CWP is used (with
instructions denoted by (5) and (8) shown in FIG. 21).
[0151] As shown in FIG. 22, the destination of data to be written
by the WRITE operation in the MRF depends on a reg number (an
instruction designated by (4) shown in FIG. 21) represented in a
dest reg field of an instruction for updating a register. This
process is accomplished by selecting dest_CWP or dest_CWP+1 with
instructions denoted by (6) and (7) shown in FIG. 21.
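The selection between dest_CWP and dest_CWP+1 can be sketched as follows. FIG. 22 is not reproduced in this text, so the rule below is an assumption: an "out" register of window dest_CWP is stored as the overlapping "in" register of window dest_CWP+1, whereas "in" and "local" registers are stored under dest_CWP. The wrap-around modulo the number of windows and the name mrf_write_target are also assumptions.

```python
def mrf_write_target(dest_cwp: int, reg: int, n_windows: int = 8):
    """Return (area, window, index) written in the MRF, or None for globals."""
    if 0 <= reg <= 7:
        return None  # the "global" portion is held only in the WRF
    if 8 <= reg <= 15:   # "out": overlaps the "in" portion of the next window
        return ("in/out", (dest_cwp + 1) % n_windows, reg - 8)
    if 16 <= reg <= 23:  # "local"
        return ("local", dest_cwp % n_windows, reg - 16)
    if 24 <= reg <= 31:  # "in"
        return ("in/out", dest_cwp % n_windows, reg - 24)
    raise ValueError("reg number must be in 0..31")


print(mrf_write_target(1, 11))  # ('in/out', 2, 3)
print(mrf_write_target(2, 27))  # ('in/out', 2, 3)
```

With this rule, r11 written with CWP=1 and r27 written with CWP=2 reach the same MRF entry, which reflects the overlap of the "out" and "in" portions.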
[0152] FIG. 23 is a schematic diagram showing the relation between
registers and windows in the case that data is read from the
MRF.
[0153] When the MOVE operation is performed, a register to be read
depends on whether the SAVE instruction or the RESTORE instruction
is executed as shown in FIG. 23. As was described above, since a
register window is foresee-transferred from the MRF to the WRF, a
window to be transferred is a window one position ahead of a window
switched by the SAVE instruction or the RESTORE instruction. When
the SAVE instruction is executed, a register window is transferred
from move_CWP+2 in the MRF. When the RESTORE instruction is
executed, a register window is transferred from move_CWP-2 in the
MRF. When the SAVE instruction is executed, it is
necessary to transfer the "out" portion of move_CWP+2. However,
since the "out" portion of move_CWP+2 overlaps with the "in"
portion of move_CWP+3, when expressed on the basis of the "in"
portion, the registers transferred when the SAVE instruction is
executed are those of move_CWP+3 as shown in FIG. 23.
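The choice of source windows for the MOVE operation can be sketched as follows. The offsets are those stated above; the wrap-around modulo the number of windows (eight in the embodiment) and the name move_source_windows are assumptions made for illustration.

```python
def move_source_windows(move_cwp: int, op: str, n_windows: int = 8) -> dict:
    """Return the MRF windows read for the "local" and "in" areas."""
    if op == "SAVE":
        return {
            "local": (move_cwp + 2) % n_windows,
            # "out" of move_CWP+2 is stored as "in" of move_CWP+3.
            "in": (move_cwp + 3) % n_windows,
        }
    if op == "RESTORE":
        return {
            "local": (move_cwp - 2) % n_windows,
            "in": (move_cwp - 2) % n_windows,
        }
    raise ValueError("op must be 'SAVE' or 'RESTORE'")


print(move_source_windows(1, "SAVE"))     # {'local': 3, 'in': 4}
print(move_source_windows(1, "RESTORE"))  # {'local': 7, 'in': 7}
```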
[0154] This process is accomplished by selecting move_CWP-2,
move_CWP+2, or move_CWP+3 with instructions denoted by (9), (10),
(11), and (12) shown in FIG. 21.
[0155] An instruction denoted by (13) shown in FIG. 21 causes
registers with even reg numbers to be read from the MRF in phase=0
and registers with odd reg numbers to be read from the MRF in
phase=1.
[0156] Eight registers are read from a read port denoted by (13)
shown in FIG. 21. The output of the read port is connected to
portions denoted by (5), (6), (9), and (10) shown in FIG. 16.
[0157] FIG. 24 is a block diagram showing the structure of a window
register file of which an MRF and a WRF are connected according to
an embodiment of the present invention.
[0158] In FIG. 24, a portion denoted by (1) represents an MRF and a
portion denoted by (2) represents a WRF. In FIG. 24, instructions
denoted by (3) and (4) designate a write reg number and CWP,
respectively. When data is written to the WRF, dest_phy_addr
denoted by (6) shown in FIG. 24 is used instead of a pair of
(dest_CWP, reg number).
[0159] When data is read from the WRF, it is accessed with
src_phy_addr denoted by (7) shown in FIG. 24.
[0160] When a SAVE instruction or a RESTORE instruction denoted by
(9) shown in FIG. 24 is executed, move_CWP or move_WCWP denoted by
(8) shown in FIG. 24 is designated.
[0161] A READ operation, a WRITE operation, and a MOVE operation
for such a register are processed out of order. Thus, different
values are used for dest/move_CWP denoted by (4) and (8) shown in
FIG. 24, move_WCWP denoted by (10) shown in FIG. 24, and
dest/src_phy_addr denoted by (6) and (7) shown in FIG. 24 depending
on each instruction to be executed. CWP, WCWP, and phy_addr for
each instruction are stored in the reservation station 3 shown in
FIG. 12 along with instructions that are queued. When an
instruction that is queued is executed, CWP, WCWP, and phy_addr are
read and used.
[0162] FIG. 25 is a schematic diagram showing meanings of CWP and
WCWP.
[0163] FIG. 25 tabulates the meanings of CWP and WCWP that have
been described.
[0164] SAVE/RESTORE PROCESS
[0165] FIG. 26 is a schematic diagram showing the state of a
foreseeing transfer performed for the WRF.
[0166] When the SAVE process or the RESTORE process is performed,
as shown in FIG. 26, the current window of the WRF is changed.
[0167] This operation is accomplished by changing WCWP in the
manner that will be described later.
[0168] When the SAVE process or the RESTORE process is performed
one time, only WCWP is changed. However, when the SAVE process or
the RESTORE process is successively performed, it is necessary to
transfer a new window from the MRF.
[0169] Thus, when the SAVE process or the RESTORE process is
performed, a new window is transferred from the MRF so that the
SAVE process or the RESTORE process can be performed next time.
[0170] For example, in the state denoted by (1) shown in FIG. 26,
WCWP="0100", "local"=l1, in=io1, and out=io2 are mapped. When the
SAVE instruction is executed in this state, the "local" portion
pointed by CWP+2 is transferred to l3. In addition, the "out"
portion pointed by CWP+2 is transferred to io4. As a result, the
state denoted by (2) shown in FIG. 26 takes place. In the state
denoted by (2), WCWP="1001", "local"=l2, in=io2, and out=io3 are
mapped.
[0171] In contrast, when the RESTORE instruction is executed in the
state denoted by (2) shown in FIG. 26, the "local" portion pointed
by CWP-2 is transferred to l3. In addition, the "in" portion
pointed by CWP-2 is transferred to io4. As a result, the state
denoted by (2) returns to the state denoted by (1) shown in FIG.
26.
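The two mappings described for FIG. 26 can be reproduced with a small sketch. The naming rule used here (the "local" field k selects group lk, and the "in/out" field k selects group io(k+1)) is an assumption inferred from the two states given in the text; the function name mapped_groups is hypothetical.

```python
def mapped_groups(wcwp: int) -> tuple:
    """Return the WRF groups used as ("local", "in", "out") for a WCWP value."""
    local_field = (wcwp >> 2) & 0b11  # WCWP[3:2], 1..3
    inout_field = wcwp & 0b11         # WCWP[1:0], 0..3
    local = f"l{local_field}"
    in_group = f"io{inout_field + 1}"
    out_group = f"io{(inout_field + 1) % 4 + 1}"
    return local, in_group, out_group


print(mapped_groups(0b0100))  # ('l1', 'io1', 'io2')
print(mapped_groups(0b1001))  # ('l2', 'io2', 'io3')
```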
[0172] When the SAVE instruction is executed, it is necessary to
transfer the "local" portion and the "out" portion pointed by CWP+2
from the MRF. When the RESTORE instruction is executed, it is
necessary to transfer the "local" portion and the "in" portion
pointed by CWP-2 from the MRF.
[0173] To do that, a 512-bit (eight bytes×eight words) MOVE
BUS is routed between the MRF and the WRF so as to transfer
register data from the MRF to the WRF. To transfer one window, it
is necessary to transfer 16 entries. In the example, the 16 entries
are transferred in two phases.
[0174] Since there is a latency for transferring a window, when the
SAVE instruction or the RESTORE instruction is successively
executed, there is an interlock between the later SAVE instruction
or the later RESTORE instruction and the MOVE process for the
earlier SAVE instruction or the earlier RESTORE instruction.
[0175] FIG. 27 is a schematic diagram showing a time chart of a
process performed when the SAVE instruction is successively
executed.
[0176] When the SAVE instruction is successively executed as with
the program shown in FIG. 27, SAVE (a) causes registers with even
reg numbers to be transferred in phase=0 and registers with odd reg
numbers to be transferred in phase=1. Since the phases can be
pipelined, the latency of the SAVE instruction or the RESTORE
instruction is the latency of the MOVE process plus "1". Thus, when
the SAVE instruction is followed by the SAVE instruction or when
the RESTORE instruction is followed by the RESTORE instruction, an
interlock of at least the latency of the MOVE process plus "1" is
required. On the other hand, the SAVE instruction can be followed
by the RESTORE instruction, and the RESTORE instruction can be
followed by the SAVE instruction, without such an interlock. In
phase=0, the contents of registers with even reg numbers (l0, l2,
l4, l6, i0 (o0), i2 (o2), i4 (o4), and i6 (o6)) are placed on the
MOVE BUS. In phase=1, the contents of registers with odd reg
numbers (l1, l3, l5, l7, i1 (o1), i3 (o3), i5 (o5), and i7 (o7))
are placed on the MOVE BUS.
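The division of one window into two phases can be sketched as follows. The function name move_bus_contents and the constant BUS_WIDTH_BITS are hypothetical; the register lists and the bus width (eight words of eight bytes each) follow the description.

```python
def move_bus_contents(phase: int) -> list:
    """Return the names of the eight registers placed on the MOVE BUS."""
    if phase not in (0, 1):
        raise ValueError("phase must be 0 or 1")
    offsets = range(phase, 8, 2)  # even offsets in phase 0, odd in phase 1
    return [f"l{k}" for k in offsets] + [f"i{k}" for k in offsets]


# Eight registers of eight bytes (64 bits) each per phase.
BUS_WIDTH_BITS = len(move_bus_contents(0)) * 8 * 8

print(move_bus_contents(0))  # ['l0', 'l2', 'l4', 'l6', 'i0', 'i2', 'i4', 'i6']
print(BUS_WIDTH_BITS)        # 512
```

The two phases together cover the 16 entries of one window.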
[0177] When the SAVE instruction is executed, a window pointed by
CWP+2 should be moved. When the RESTORE instruction is executed, a
window pointed by CWP-2 should be moved.
[0178] The processes performed by the SAVE instruction and the
RESTORE instruction are summarized as follows:
[0179] FIG. 28 is a schematic diagram for explaining a writing
process for a bank of the WRF when the SAVE instruction is
executed.
[0180] Process performed when SAVE instruction is executed:
[0181] The "local" portion and the "out" portion pointed by CWP+2
are transferred from the MRF and placed on the MOVE BUS.
[0182] Data on the MOVE BUS is written to addresses in individual
banks of the WRF corresponding to the table shown in FIG. 28.
[0183] WCWP is updated as follows.
new WCWP [3:2]=(WCWP [3:2]+1-1)mod 3+1
new WCWP [1:0]=(WCWP [1:0]+1)mod 4
[0184] FIG. 29 is a schematic diagram for explaining a writing
process for a bank of the WRF when the RESTORE instruction is
executed.
[0185] Process performed when RESTORE instruction is executed:
[0186] The "local" portion and the "in" portion pointed by CWP-2
are supplied from the MRF and placed on the MOVE BUS.
[0187] Data on the MOVE BUS is written in individual banks of the
WRF corresponding to the table shown in FIG. 29.
[0188] WCWP is updated as follows.
new WCWP [3:2]=(WCWP [3:2]-1-1)mod 3+1
new WCWP [1:0]=(WCWP [1:0]-1)mod 4
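The two update rules can be combined in one sketch. The function name update_wcwp is hypothetical; the arithmetic is taken from the expressions above, with "mod" understood as the mathematical modulo that always yields a non-negative result (which is how Python's % operator behaves for a positive modulus).

```python
def update_wcwp(wcwp: int, op: str) -> int:
    """Return the new 4-bit WCWP after a SAVE or RESTORE instruction."""
    step = {"SAVE": 1, "RESTORE": -1}[op]
    local_field = (wcwp >> 2) & 0b11  # WCWP[3:2], cycles through 1, 2, 3
    inout_field = wcwp & 0b11         # WCWP[1:0], cycles through 0, 1, 2, 3
    new_local = (local_field + step - 1) % 3 + 1
    new_inout = (inout_field + step) % 4
    return (new_local << 2) | new_inout


print(format(update_wcwp(0b0101, "SAVE"), "04b"))     # 1010
print(format(update_wcwp(0b0100, "SAVE"), "04b"))     # 1001
print(format(update_wcwp(0b1001, "RESTORE"), "04b"))  # 0100
```

The first result matches the change of WCWP from 0101 to 1010 given for FIG. 13, and the other two match the states (1) and (2) given for FIG. 26.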
[0189] According to the present invention, since working registers
for a plurality of windows are stored, the instruction execution
order can be changed before a window switching instruction is
executed. Thus, the process speed of an information processing
apparatus corresponding to the out-of-order instruction execution
system can be improved.
[0190] Although the present invention has been shown and described
with respect to a best mode embodiment thereof, it should be
understood by those skilled in the art that the foregoing and
various other changes, omissions, and additions in the form and
detail thereof may be made therein without departing from the
spirit and scope of the present invention.
* * * * *