U.S. patent application number 08/885380 was filed with the patent office on 2001-07-12 for command insertion and reordering at the storage controller.
Invention is credited to FITZGERALD, JOHN T., HALLIGAN, KENNETH A., OFER, EREZ.
Application Number | 20010008007 08/885380 |
Document ID | / |
Family ID | 25386780 |
Filed Date | 2001-07-12 |
United States Patent
Application |
20010008007 |
Kind Code |
A1 |
HALLIGAN, KENNETH A. ; et
al. |
July 12, 2001 |
COMMAND INSERTION AND REORDERING AT THE STORAGE CONTROLLER
Abstract
The invention relates to a method and apparatus for reordering,
at a disk drive controller, a received sequence of read and write
commands. The reordering can be system driven so that the system,
when a command stalls, examines the command queue for another
command to execute. Alternatively, the command queue itself can be
reordered depending upon the expected time it takes for various
commands to execute. Reordered commands can be flagged so that they
cannot be reordered again. The resulting increase in throughput and
efficiency occurs without reference to the particular disk drive or
disk drive array configuration to which the controller is
connected.
Inventors: |
HALLIGAN, KENNETH A.;
(LEOMINISTER, MA) ; OFER, EREZ; (BROOKLINE,
MA) ; FITZGERALD, JOHN T.; (MANSFIELD, MA) |
Correspondence
Address: |
CHRISTOPHER K. GAGNE, ESQ.
OFFICE OF THE GENERAL COUNSEL
EMC CORPORATION
35 PARKWOOD DRIVE
HOPKINTON
MA
01748-9103
US
|
Family ID: |
25386780 |
Appl. No.: |
08/885380 |
Filed: |
June 30, 1997 |
Current U.S.
Class: |
711/114 ;
711/113; 711/167 |
Current CPC
Class: |
G06F 3/0659 20130101;
G06F 3/0689 20130101; G06F 3/0613 20130101 |
Class at
Publication: |
711/114 ;
711/167; 711/113 |
International
Class: |
G06F 012/00 |
Claims
What is claimed is:
1. A method for reordering, at a disk drive controller, a received
sequence of read and write commands comprising the steps of
determining any time delays in completing an initiated read or
write command at the disk drive controller, initiating, if the time
delay meets specified criteria, a next selected command in the
received sequence, and completing said next selected command while
at least a portion of said initiated read or write command is
completing.
2. The method of claim 1 further wherein said specified criteria
includes an expected time to complete threshold.
3. The method of claim 2 further wherein said specified criteria
are satisfied whenever there is a cache miss during a read
command.
4. The method of claim 1 wherein said specified criteria include a
read miss.
5. The method of claim 1 wherein said specified criteria include a
failure to write data to memory for destaging.
6. A method for reordering, at a disk drive controller, a received
sequence of read and write commands comprising the steps of storing
in a command list said read and write commands, as they are
received at said controller, adding each newly received command to
the end of the command list, the command at the beginning of the
list being the next command to be executed, moving a selected
command from a position more distant from the beginning of the list
to a position less distant from the beginning of the list when that
selected command can be executed and completed without substantial
delay, and executing a command near the top of the list.
7. The method of claim 6 further comprising the steps of flagging
any read/write command which is moved to a position at the
beginning of said list, and preventing any flagged command from
moving further from the beginning of said list.
8. The method of claim 6 further comprising the steps of repeating
said moving step plural times, and wherein said executing step
comprises the step of executing said instruction closest to said
beginning of said list.
9. The method of claim 6 further wherein said executing step
comprises the step of executing the instruction closest to the
beginning of the list.
10. The method of claim 8 wherein said executing instruction is a
write instruction.
11. A disk drive controller for reordering a received sequence of
read and write commands comprising circuitry for determining any
time delays in completing an initiated read or write command,
circuitry for initiating, if the time delay meets specified
criteria, a next selected command in the received sequence, and
circuitry for completing the next selected command while at least a
portion of said initiated read or write command is completing.
12. The apparatus of claim 11, wherein said specified criteria
include a read miss.
13. The apparatus of claim 11 wherein said specified criteria
include a failure to write data to a memory for destaging.
14. A disk drive controller for reordering a received sequence of
read and write commands comprising memory for storing a command
list of read and write commands as they are received at the
controller, circuitry for adding newly received commands to the end
of the command list, the command at the beginning of the list being
the next command to be executed, position altering circuitry for
moving a selected command from a position more distant from the
beginning of the list to a position less distant from the beginning
of the list when that selected command can be executed and
completed without substantial delay, and execution circuitry for
executing a command near the top of the list.
15. The disk drive controller of claim 14, further comprising
circuitry for flagging any read/write command which is reordered to
a position at the beginning of said list, and circuitry for
preventing any flagged command from moving further from the
beginning of said list.
16. The apparatus of claim 14, further comprising circuitry for
repeatedly operating the moving circuitry a plural number of times,
and said executing circuitry executing the instruction closest to
the beginning of the list.
Description
BACKGROUND OF THE INVENTION
[0001] The invention relates generally to mass storage devices, and
in particular, to methods and apparatus for improving performance
by modifying the controller for a mass storage system.
[0002] Disk drive systems have grown enormously in both size and
sophistication in recent years. These systems can typically include
many large disk drive units controlled by a complex, multi-tasking,
disk drive controller such as the EMC Symmetrix disk drive
controller. A large scale disk drive system can typically receive
commands from a number of host computers and can control a number
of disk drive mass storage devices, each mass storage device
capable of storing in excess of several gigabits of data.
[0003] There is every reason to expect that both the sophistication
and the size of the disk drive systems will continue to increase.
As the systems increase in complexity, so does the user's reliance
upon the system, for fast and reliable recovery and storage of
data. Accordingly, the user typically uses data throughput and
speed of response as a primary criteria for evaluating performance
of the disk drive systems. As a result, mass storage devices and
the controllers which drive them have become quite sophisticated in
trying to improve command response time. Systems such as the EMC
Symmetrix disk drive controller system thus incorporate a large
cache memory, and other techniques to improve the system
throughput.
SUMMARY OF THE INVENTION
[0004] The invention relates to a method and apparatus for
reordering, at the disk drive controller, a received sequence of
read and write commands. In this manner, the data throughput will
advantageously be improved.
[0005] The method features the steps of determining any time delays
in completing an initiated command at the disk drive controller;
initiating, if the time delay exceeds allowed criteria, a next
selective command in the received sequence; and completing that
next selective command while waiting for at least a portion of the
originally initiated command to complete. In this manner, the
controller is not idly waiting for a command to complete, but in
fact will complete commands in a sequence other than that in which
they are received. As a result, throughput through the disk drive
controller is increased.
[0006] Among the criteria, which would result in taking commands
out of order, are the expected time to complete for a command,
whether there is a cache miss during a read command, and whether
write cache is full.
[0007] In another aspect, the method of reordering a received
sequence of read and write commands at a disk drive controller
features the steps of storing the received read and write commands
in a command list or queue in the order that they are received at
the controller; adding each newly received command to the end of
the command list, the command at the beginning of the list being
the next command to be executed; moving a selected command from a
position more distant from the beginning of the list to a position
less distant from the beginning of the list when that selected
command can be executed and completed without substantial delay;
and executing a command near the top of the list.
[0008] In this particular embodiment of the invention, the method
can further feature flagging any command which moves to a position
at the beginning of the list (as a result of another command or
commands being moved closer to the end of the list), and preventing
any flagged command from moving further from the beginning of the
list (for any reason). In another aspect, the method further
features repeating the moving step a plurality of times and wherein
the executing step features the step of executing the write
instruction closest to the beginning of the list.
[0009] The disk controller of the invention relates to reordering a
received sequence of read and write commands to increase their
throughput. The disk controller features circuitry for determining
any time delays in completing an initiated read or write command,
circuitry for initiating, if the time delay meets specified
criteria, a next selected command in the received sequence, and
circuitry for completing that next selected command while at least
a portion of the initiated read or write command is completing. In
particular embodiments, the specified criteria can be either an
expected time to complete threshold, a cache miss during a read
operation, or the inability to write data to destaging memory.
[0010] In yet another aspect of the invention, the disk drive
controller receives a sequence of read and write commands and
features storing the read and write commands in a command list in
the order in which they are received, circuitry for adding the
newly received commands to the end of the list, the command at the
beginning of the list being the next command to be executed,
circuitry for moving a selected command from a position more
distant from the beginning of the list to a position less distant
from the beginning of the list when that selected command can be
executed and completed without substantial delay, and circuitry for
executing a command near the top of the list.
[0011] In particular embodiments, the disk drive controller flags
any read or write command which is moved to a position at the
beginning of the list from a position more distant from the head of
the list and prevents any flagged command from being moved away
from the head of the list.
[0012] Accordingly, the invention provides improved and increased
throughput, through the disk drive controller by reordering the
read and write commands from the order in which they are received
by the disk drive controller. As a result, "down time" at the
controller is minimized, and disk drive instructions are using
those system resources which are available.
BRIEF DESCRIPTION OF THE DRAWING
[0013] Other features and advantages of the invention will be
apparent from the following description, taken together with the
drawings, in which:
[0014] FIG. 1 is a schematic block diagram of a system in
accordance with the invention;
[0015] FIG. 2 is a flow chart illustrating a first aspect of system
operation in accordance with the invention;
[0016] FIG. 3 is a flow chart illustrating a second aspect of
system operation in accordance with the invention; and
[0017] FIGS. 4A, 4B, 4C, and 4D illustrate a reordering sequence of
the command queue.
DESCRIPTION OF THE PARTICULAR PREFERRED EMBODIMENTS
[0018] Referring to FIG. 1, the invention relates to a computer
system 10 which has a plurality of host computers 12a, 12b, . . . ,
12n and a mass storage system 13, having a plurality of disk drive
elements 14a, 14b, . . . , 14k. Interconnecting the host computers
12 and the disk drive elements 14 is a disk drive controller 16,
for example that which is manufactured by EMC Corporation and known
as the Symmetrix controller. The disk drive controller 16 receives
memory commands, that is, for example, read and write commands,
from the various host computers over buses 18a, 18b, . . . , 18n,
respectively, for example buses connected and operating in
accordance with a SCSI protocol. The disk drive controller 16
delivers data associated with those commands to or from the
appropriate disk drive elements 14 over respective connected buses
20a, 20b, . . . , 20k. Buses 20 also preferably operate in
accordance with a SCSI protocol. Each of the disk drive elements 14
typically has an excess of 1 gigabit of memory and is logically
divided, in accordance with known techniques, into a plurality of
logical volumes. Each disk drive element 14 can thus have a
plurality of logical volumes, for example, four, eight, or more
logical volumes in a single physical disk drive element. In a
typical configuration, the controller system also connects to a
console PC 22 through a connecting bus 24. Console PC 22 is used
for maintenance and access to the controller and can be employed to
set parameters of and/or to initiate maintenance commands to the
controller, under user control, as is well known in the art.
[0019] As indicated above, it can often occur during operation,
that the system spends time waiting for system resources, such as
cache memory or physical hard drive accesses. As a result, in
accordance with the invention, system throughput is improved by
changing the sequence in which read and write commands in a command
queue are implemented.
[0020] To increase system performance, therefore, the disk drive
controller reorders and/or inserts SCSI read and write commands
from the "natural order" in which they are received from the host
or hosts, as presented in the controller's command queue. By
strategically reordering and executing media access commands, the
system attempts to diminish the time spent waiting for system
resources and uses that "waiting time" to execute other commands in
the command queue, thus providing an increase in performance and
overall throughput. In accordance with the invention, the
controller accomplishes this task as follows.
[0021] There are two methods which can be used to manipulate the
order of execution of the commands received from the host computers
and stored in the command queue of the controller. One method is
referred to herein as command insertion while the other method is
referred to command reordering. In accordance with command
insertion method of operation, the disk drive controller will, upon
receiving a command over the SCSI bus from a host computer, attempt
to promote this command from the end of the queue or list toward
its beginning. This method thus advances a read command until
either another read command is encountered, or until the read
command has reached the head or beginning of the queue. This
operation is performed so that a request will thus be expedited at
the expense of other slower executing commands.
[0022] The command reordering process provides that the disk drive
controller actually begins looking from the beginning of the queue
toward the end of the queue. The disk drive controller looks for an
alternate command to execute if the system resources are not
available to complete the current "active" command, that is, the
command currently at the head or beginning of the queue. This
command reordering process is also implemented immediately after
any command is received from a host computer over the SCSI input
bus.
[0023] Referring to FIG. 2, therefore, in the command insertion
process, the disk drive controller first waits for a previous
request for reading or writing data, and which has been initiated,
but not yet satisfied. At this moment, therefore, at 100, the
current input/output (I/O) operation has stalled (if, for example,
there is a cache miss or write cache is full). The disk drive
controller uses this opportunity, at 110, to determine if a command
exists in the command queue which is either a write command able to
be written to cache memory, or a read command, the data for which
is entirely in cache memory, and therefore available for immediate
execution and completion. If either of these conditions are met,
the selected candidate becomes the new command (at 120) if, in
accordance with the invention, it has a so-called queue tag
associated with it. The disk drive controller will then reorder the
current command that is waiting, for example, for read data, with
the command that has met the criteria identified above (step 130).
The command that is removed from the beginning of the queue becomes
inactive and the new command at the beginning of the queue begins
to execute and becomes the active command. (Step 140)
[0024] Once the new active command has completed execution, this
system then reexams the old "inactive" command, which according to
the invention preferably has previously been tagged so that it
cannot again be "reordered". The system also checks to determine
whether or not it has the data which it requires for proper
handling. For a write command, this can involve data being written
to a previously full write cache, and for a read command, it can
involve the receipt in cache of the required data. This is
indicated at 150. The now active command is then completed.
[0025] In other embodiments of the invention, however, it is
possible and desirable to "intelligently" reorder and change the
sequence in which commands are received over the bus and are
inserted into the command queue for execution and completion. Thus,
it is possible to execute more than one read or write command while
waiting for a previously initiated command to finish. In other
instances, it can be desirable to identify groups of commands which
access the same block or blocks in order to achieve some efficiency
based on a locality of reference. Other intelligent command queue
manipulation can involve taking commands not only out of their
natural sequence, but also out of what might be determined to be
the reordered sequence, based solely upon commands closest to the
beginning of the command queue or commands which are only read or
write commands, using a theory, for example, that the read commands
inherently require greater resource delays since the likelihood of
a cache miss is greater than, for example, the likelihood that the
write cache is full.
[0026] Referring to FIG. 3, consider a situation in which the disk
drive controller is in a state where a write command cannot be
immediately executed and completed because the cache slot required
for the write data associated with that command is locked and/or
otherwise unavailable. The disk drive controller, in accordance
with the invention, must then search down the command list or queue
to avoid stalling while waiting for the cache slot to become
available. This is indicated at 200. The operation of the remainder
of the flow diagram illustrated in FIG. 3 is the same as that noted
above in connection with FIG. 2 for a relatively simple command
reordering implementation. As noted above, however, other command
reordering implementations can also be effective with further
efficiencies by taking into account locality of reference, the
types of commands, and other well-known performance enhancing
methods.
[0027] As noted above, referring to FIGS. 4A, 4B, 4C, and 4D, it is
possible to approach the command reordering process by reordering
the commands in the command queue before the actual implementation
and execution of the command at the top of the queue takes place.
In this aspect of the invention, consider a command queue as
illustrated in FIG. 4A. The command queue has a sequence of read
and write commands labeled A, B, C, D, . . . , F where F is the
newest command to be added at the bottom or end of the list. The
commands in the command queue 250 can be reordered even before, and
without requiring, a so-called stalled command being in place.
Thus, in accordance with the invention, a command which can be
immediately completed, such as write commands where the cache slot
is available for immediate execution and storage of the write data
for later destaging to the disk drive or disk drive array, or read
commands where the data exists in cache memory, can be moved up the
list toward the beginning or top of the queue thereby replacing and
forcing down those commands at the top of the queue which cannot be
immediately implemented and/or completed. This process can result,
in a reordering of the queue as illustrated in FIGS. 4B and 4C as
command "D" moves upward.
[0028] It may occur, however, that in addition to the advantages
noted above and the efficiencies which are achieved in the use of
resources, that some commands once they are reordered to the top of
the queue encounter delays in their execution. Such delays may
occur, for example, because of a rewriting of cache, causing a
cache miss. Nevertheless, in accordance with the illustrated
embodiment of the invention, reordered commands which reach the
head of the command queue can be and preferably are tagged (as
indicated by the plus sign in FIG. 5D), so that they cannot be
reordered downward, that is, they cannot be pushed toward the
bottom of the queue, even by faster executing commands. Thus, in
accordance with one embodiment of the invention, once a command has
been reordered and reaches the beginning of the queue, it cannot be
reordered again. Thus, as noted in FIG. 4D, the command "D", in a
particular embodiment of the invention cannot be reordered after
reaching the head of the queue, no matter what commands enter the
queue. Commands A, B, and C will then be executed once command D is
executed.
[0029] Thus, in accordance with the invention, the disk drive
controller has the capability of recognizing stalled conditions due
to resource unavailability and to take commands out of order or to
reposition commands in the command queue in order to provide
greater throughput and efficiencies. In a particular implementation
of the EMC Symmetrix controller, such as that illustrated in FIG.
1, the command reordering capability is inserted into the SCSI
adapter (SA) and the availability of data in cache memory or the
availability of write cache are the primary criteria in determining
whether to take commands out of order.
[0030] It is thus important to note the two aspects in which
commands can be reordered. In one aspect, system driven, the system
determines whether the time delay, for example, is excessive or is
expected to be excessive, that is, to exceed a selected threshold
value, and if so, a command is rendered inactive while another
command is implemented. In the second embodiment of the invention,
the command queue itself is reordered prior to the determination of
possible delays so that faster executing instructions will "rise"
to the top of the queue, and be executed more quickly than slower
executing instructions which might be delayed depending upon
resource availability at a particular period of time. These two
approaches to command reordering may result in slightly different
command sequence execution, but both instances will result in
better data throughput, which is the prime importance of the
invention.
[0031] Additions, subtractions, and other modifications of the
disclosed preferred particular embodiments of the invention will be
apparent to those practicing the field and are within the scope of
the following claims.
* * * * *