U.S. patent application number 14/265173 was filed with the patent office on 2014-04-29 and published on 2014-12-11 under publication number 20140365539 for performing direct data manipulation on a storage device.
This patent application is currently assigned to NetApp, Inc. The applicant listed for this patent is NetApp, Inc. Invention is credited to Pratap Singh, Don Alvin Trimmer, and Sandeep Yadav.
Publication Number | 20140365539 |
Application Number | 14/265173 |
Document ID | / |
Family ID | 50982213 |
Publication Date | 2014-12-11 |
United States Patent Application | 20140365539 |
Kind Code | A1 |
Trimmer; Don Alvin; et al. |
December 11, 2014 |
PERFORMING DIRECT DATA MANIPULATION ON A STORAGE DEVICE
Abstract
A method and system for performing data manipulation on a
storage device is disclosed. A data manipulation command is created
on a computing device, wherein the computing device is separate
from the storage device. The computing device is a client or a
server that requests services of a storage system to store data on
a storage medium. The computing device and the storage device are
connected over a network. The computing device executes a host
application, and its data is stored on the medium. The computing
device issues a command to the storage device to be performed on
the data. The storage device executes the command and sends the
result to the computing device. As a result, the data is not sent
to the computing device for manipulation.
Inventors: |
Trimmer; Don Alvin; (Sunnyvale, CA); Yadav; Sandeep; (Sunnyvale, CA); Singh; Pratap; (Sunnyvale, CA) |

Applicant: |
Name | City | State | Country | Type |
NetApp, Inc. | Sunnyvale | CA | US | |

Assignee: |
NetApp, Inc. (Sunnyvale, CA) |

Family ID: |
50982213 |

Appl. No.: |
14/265173 |

Filed: |
April 29, 2014 |
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number |
11740471 | Apr 26, 2007 | 8768898 |
14265173 | | |
Current U.S. Class: | 707/825; 707/827 |
Current CPC Class: | G06F 16/1727 20190101; G06F 16/1827 20190101; G06F 16/16 20190101 |
Class at Publication: | 707/825; 707/827 |
International Class: | G06F 17/30 20060101 G06F017/30 |
Claims
1-30. (canceled)
31. A method, comprising: receiving, at a storage server, a command
in a network storage communication protocol from a client device,
the command comprising information identifying a source file stored
at the storage server and a list of file segments to include, the
source file comprising the file segments identified by the list;
executing, at the storage server, the command by copying the file
segments identified by the list from the source file to a
destination file stored at the storage server, without transferring
any portion of the source file to the client; and in response to
the command, sending a confirmation of execution of the command
from the storage server to the client device.
32. The method of claim 31, wherein the receiving comprises:
receiving, at the storage server, a command in a network storage
communication protocol from a client device, the command comprising
information identifying a source file stored at the storage server,
a list of file segments to include and an instruction to add a new
data segment, the source file comprising the file segments
identified by the list.
33. The method of claim 32, further comprising: retrieving, at the
storage server from the client device, data of the new data
segment; and wherein the executing comprises: executing the
command, at the storage server, by copying the file segments
identified by the list from the source file to a destination file
stored at the storage server, and by inserting the data of the new
data segment into the destination file, without transferring any portion
of the source file to the client.
34. The method of claim 31, wherein the receiving comprises:
receiving, at the storage server, a command in a network storage
communication protocol from a client device, the command comprising
information identifying a source file stored at the storage server,
a list of file segments to include and file offsets of the file
segments identified by the list within the destination file, the
source file comprising the file segments identified by the
list.
35. The method of claim 31, wherein the executing further
comprises: retrieving from the source file the file segments
identified by the list; and inserting data of the file segments
identified by the list to the destination file at the file offsets
indicated by the command.
36. The method of claim 35, wherein the executing further
comprises: in an event that the command includes an instruction to
remove a file segment of the destination file, removing the file
segment from the destination file stored in the storage server.
37. The method of claim 31, wherein the receiving comprises:
receiving, at the storage server, a command in a network storage
communication protocol from a client device, the command comprising
information identifying a source file and a destination file stored
at the storage server and a list of file segments to include in the
destination file, the source file comprising the file segments
identified by the list.
38. The method of claim 37, further comprising: avoiding copying
file segments of the source file that are not identified by the
list to the destination file.
39. The method of claim 37, wherein the receiving comprises:
receiving, at the storage server, a command in a network storage
communication protocol from a client device, the command comprising
information identifying a source file and a destination file stored
at the storage server, a list of file segments to include in the
destination file, and a reorder instruction specifying the order in
which the file segments to include appear in the destination file, the
source file comprising the file segments identified by the list;
and wherein the executing comprises: executing the command, at the
storage server, by copying the file segments to include identified
by the list from the source file to a destination file stored at
the storage server, without transferring any portion of the source
file to the client, and by reordering the file segments to include
in the destination file based on the reorder instruction.
40. A non-transitory machine readable medium having stored thereon
instructions for performing a method of manipulating data files,
comprising machine executable code which when executed by at least
one machine, causes the machine to: receive at a storage server a
command from a client device over a network, the command comprising
information identifying a source file stored at the storage server
and a destination file stored at the storage server, the command
further comprising an exclusion list identifying file segments of
the source file that are to be excluded from the destination file;
and copy file segments of the source file that are not identified
by the exclusion list from the source file to the destination file
at the storage server, without transferring any portion of the
source file to the client device.
41. The non-transitory machine readable medium of claim 40, wherein
the machine executable code which when executed by at least one
machine, further causes the machine to: send a message notifying a
result of executing the command from the storage server to the
client device, in response to the command.
42. The non-transitory machine readable medium of claim 40, wherein
the command further comprises new data to be inserted into the
destination file; and wherein the machine executable code which
when executed by at least one machine, further causes the machine
to: insert the new data into the destination file.
43. The non-transitory machine readable medium of claim 40, wherein
the command comprises multiple file system operations in an order;
and wherein the machine executable code which when executed by at
least one machine, further causes the machine to: determine a
reordering of the file system operations of the command to improve
the performance of their execution by the storage server, prior to
executing the file system operations according to the
reordering.
44. A computing device, comprising: a memory containing machine
readable medium comprising machine executable code having stored
thereon instructions for performing a method of manipulating data
sets; a processor coupled to the memory, the processor configured
to execute the machine executable code to: receive from a client a
repacking command identifying a source data set comprising multiple
segments and a destination data set stored at the computing device,
the command including information identifying at least a segment of
the multiple segments of the source data set; and manipulate the
destination data set based on the repacking command by using at
least the segment of the source data set identified by the command,
without transferring data of the destination data set or the source
data set to the client.
45. The computing device of claim 44, wherein the source data set
is a database table, and at least one segment of the database table
comprises a database record marked as deleted.
46. The computing device of claim 45, wherein the repacking command
instructs the computing device to copy database records of the
database table to the destination data set, except database records
of the database table that are marked as deleted.
47. The computing device of claim 45, wherein the processor is
further configured to execute the machine executable code to update
a database index to point to the destination data set and to remove
the source data set from the storage media, after the repacking
command has been executed.
48. The computing device of claim 44, wherein the source data set
is a file storing a folder of electronic mails, and at least one
segment of the file comprises electronic mails that are
marked as deleted.
49. The computing device of claim 48, wherein the repacking command
instructs the computing device to copy data of the electronic mails
stored in the source data set to the destination data set, except
electronic mails that are marked as deleted.
50. The computing device of claim 48, wherein the repacking command
comprises a list of file offsets identifying locations at which the
electronic mails are stored in the file.
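Claim 31 above describes the storage server copying listed file segments locally and returning only a confirmation to the client. The following is a minimal sketch of that server-side execution, with an in-memory dictionary standing in for the server's file system; the names (`Segment`, `SegmentCopyCommand`, `execute`) are illustrative and not taken from the application:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    offset: int   # byte offset of the segment within the source file
    length: int   # number of bytes in the segment

@dataclass
class SegmentCopyCommand:
    source: str        # path of the source file on the server
    destination: str   # path of the destination file on the server
    segments: list     # file segments to include, in order

def execute(command, files):
    """Copy the listed segments server-side; 'files' stands in for the
    server's file system (path -> bytes). The file data never crosses
    the network; only this confirmation is returned to the client."""
    src = files[command.source]
    out = bytearray()
    for seg in command.segments:
        out += src[seg.offset:seg.offset + seg.length]
    files[command.destination] = bytes(out)
    return "OK"  # confirmation of execution, per the claim
```

A client would send only the small command and receive the confirmation, which is the point of the claim: the source file's bytes stay on the server.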
Description
PRIORITY CLAIM
[0001] This application is a continuation of U.S. patent
application Ser. No. 11/740,471 entitled "PERFORMING DIRECT DATA
MANIPULATION ON A STORAGE DEVICE" and filed on Apr. 26, 2007, which
is expressly incorporated by reference herein.
FIELD OF INVENTION
[0002] The present invention generally relates to networked
storage, and more particularly, to a method and system for directly
manipulating data on a storage device.
BACKGROUND
[0003] A data storage system is a computer and related storage
medium that enables storage or backup of large amounts of data.
Storage systems, also known as storage appliances or storage
servers, may support a network attached storage (NAS) computing
environment. A NAS is a computing environment where file-based
access is provided through a network, typically in a client/server
configuration. A storage server can provide clients with
block-level access to data stored in a set of mass storage devices,
such as magnetic or optical storage disks.
[0004] A file server (also known as a "filer") is a computer that
provides file services relating to the organization of information
on storage devices, such as disks. The filer includes a storage
operating system that implements a file system to logically
organize the information as a hierarchical structure of directories
and files on the disks. Each "on-disk" file may be implemented as a
set of disk blocks configured to store information, whereas the
directory may be implemented as a specially-formatted file in which
information about other files and directories are stored. A filer
may be configured to operate according to a client/server model of
information delivery to allow many clients to access files stored
on the filer. In this model, the client may include an application,
such as a file system protocol, executing on a computer that
connects to the filer over a computer network. The computer network
can include, for example, a point-to-point link, a shared local
area network (LAN), a wide area network (WAN), or a virtual private
network (VPN) implemented over a public network such as the
Internet. Each client may request filer services by issuing file
system protocol messages (in the form of packets) to the filer over
the network.
[0005] A common file system type is a "write in-place" file system,
in which the locations of the data structures (such as inodes and
data blocks) on a disk are typically fixed. An inode is a data
structure used to store information, such as metadata, about a
file, whereas the data blocks are structures used to store the
actual data for the file. The information contained in an inode may
include information relating to ownership of the file, access
permissions for the file, the size of the file, the file type, and
references to locations on disk of the data blocks for the file.
The references to the locations of the file data are provided by
pointers, which may further reference indirect blocks. Indirect
blocks, in turn, reference the data blocks, depending upon the
quantity of data in the file. Changes to the inodes and data blocks
are made "in-place" in accordance with the write in-place file
system. If an update to a file extends the quantity of data for the
file, an additional data block is allocated and the appropriate
inode is updated to reference that data block.
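The inode layout described in this paragraph can be modeled in a few lines. This is a toy sketch, not the on-disk format of any particular file system; the field names and the flat indirect-block table are assumptions for illustration:

```python
class Inode:
    """Toy inode: metadata plus pointers to data blocks, where some
    pointers may go through indirect blocks that list further data blocks."""
    def __init__(self, owner, size, direct=None, indirect=None):
        self.owner = owner              # ownership information
        self.size = size                # file size in bytes
        self.direct = direct or []      # data-block numbers referenced directly
        self.indirect = indirect or []  # numbers of indirect blocks

def data_blocks(inode, indirect_blocks):
    """Resolve every data-block number for a file, following indirect
    blocks; 'indirect_blocks' maps an indirect-block number to the list
    of data-block numbers it references."""
    blocks = list(inode.direct)
    for b in inode.indirect:
        blocks.extend(indirect_blocks[b])
    return blocks
```

Extending a file past its direct pointers, as the paragraph notes, means allocating a new data block and updating the inode (here, appending to `direct` or to an indirect block's list) to reference it.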
[0006] Another file system type is a write-anywhere file system
that does not overwrite data on disks. If a data block on a disk is
read from the disk into memory and "dirtied" with new data, the
data block is written to a new location on the disk to optimize
write performance. A write-anywhere file system may initially
assume an optimal layout, such that the data is substantially
contiguously arranged on the disks. The optimal disk layout results
in efficient access operations, particularly for sequential read
operations. A particular example of a write-anywhere file system is
the Write Anywhere File Layout (WAFL.RTM.) file system available
from Network Appliance, Inc. The WAFL file system is implemented
within a microkernel as part of the overall protocol stack of the
filer and associated disk storage. This microkernel is supplied as
part of Network Appliance's Data ONTAP.RTM. storage operating
system, residing on the filer that processes file service requests
from network-attached clients.
[0007] As used herein, the term "storage operating system"
generally refers to the computer-executable code operable on a
storage system that manages data access. The storage operating
system may, in case of a filer, implement file system semantics,
such as Data ONTAP.RTM. storage operating system. The storage
operating system can also be implemented as an application program
operating on a general-purpose operating system, such as UNIX.RTM.
or Windows.RTM., or as a general-purpose operating system with
configurable functionality, which is configured for storage
applications as described herein.
[0008] Disk storage is typically implemented as one or more storage
"volumes" that comprise physical storage disks, defining an overall
logical arrangement of storage space. Currently available filer
implementations can serve a large number of discrete volumes.
[0009] The disks within a volume can be organized as a Redundant
Array of Independent (or Inexpensive) Disks (RAID). RAID
implementations enhance the reliability and integrity of data
storage through the writing of data "stripes" across a given number
of physical disks in the RAID group, and the appropriate storing of
parity information with respect to the striped data. In the example
of a WAFL.RTM. file system, a RAID-4 implementation is
advantageously employed, which entails striping data across a group
of disks, and storing the parity within a separate disk of the RAID
group. As described herein, a volume typically comprises at least
one data disk and one associated parity disk (or possibly data/parity
partitions within a single disk) arranged according to a
RAID-4, or equivalent high-reliability, implementation.
[0010] NAS devices provide access to stored data using standard
protocols, e.g., Network File System (NFS), Common Internet File
System (CIFS), Internet Small Computer System Interface (iSCSI),
etc. To manipulate the data stored on these devices, clients have
to fetch the data using an access protocol, modify the data, and
then write back the resulting modified data. Bulk data processing
sometimes requires small manipulations of the data that need to be
processed as fast as possible. This process (fetch-modify-write) is
inefficient for bulk data processing, as it wastes processor time
on protocol and network processing and increases network
utilization. The closer the processing is to the stored data, the
less time the data processing will take.
[0011] Traditional file systems are not particularly adept at
handling large numbers (e.g., more than one million) of small
objects (e.g., one kilobyte (KB) files). The typical way of
addressing this problem is to use a container to hold several of
the small objects. However, this solution leads to the problems of
how to manage the containers and how to manage the objects within
the container. Managing the containers simply reintroduces the typical
file system problems at a higher level.
[0012] In applications that use files for storing a list of
records, a deleted record is often marked as "deleted" instead of
being physically removed from the file. The file is periodically
repacked to purge all of the deleted records and to reclaim space.
This process is traditionally carried out by reading the file by an
application via NFS, for example; packing the records by the
application; and writing the file back to storage via NFS, for
example. Again, this process uses the typical fetch-modify-write
pattern, which makes the entire repacking process inefficient for
the storage device.
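The repacking described above, run directly where the data lives, amounts to dropping records flagged as deleted while copying live records into a fresh file. A sketch under assumed conventions (a one-byte live/deleted flag and a fixed record size, neither specified by the application):

```python
LIVE, DELETED = b"L", b"D"
REC = 8  # assumed record size: one flag byte plus seven payload bytes

def repack(data):
    """Return the repacked file contents: only records whose flag byte
    marks them live. Records marked deleted are purged, reclaiming
    their space, without the file ever leaving the storage device."""
    out = bytearray()
    for i in range(0, len(data), REC):
        rec = data[i:i + REC]
        if rec[:1] == LIVE:
            out += rec
    return bytes(out)
```

Executed on the storage device, this replaces the fetch-modify-write round trip with a single command and a confirmation.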
[0013] Another example of this type of IO-intensive task is reading
a file and rewriting the data to another file, with the data being
relocated within the destination file. In addition to using
resources on the NAS device, this task also incurs a load on the
network (sending the file back and forth) and a load on the client
that is processing the data.
[0014] FIG. 1 is a flow diagram of an existing fetch-modify-write
method 100 for manipulating data stored on a storage device. The
method 100 operates between a server 102 and a data storage media
104. The server 102 and the data storage media 104 communicate over
a network connection. The server 102 requests data to be
manipulated from the storage media 104 (step 110). The storage
media 104 retrieves the data and sends the data over the network to
the server 102 (step 112). The server 102 manipulates the requested
data (step 114) and sends the manipulated data back over the
network to the storage media 104 (step 116).
[0015] As can be seen from FIG. 1, the method 100 requires that the
data be sent over the network twice--once from the storage media
104 to the server 102 (step 112) and a second time from the server
102 to the storage media 104 (step 116).
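The cost asymmetry in FIG. 1 can be put in back-of-the-envelope terms: the fetch-modify-write pattern moves the file across the network twice, while a direct manipulation command moves only the small command and its confirmation. The sizes below are illustrative, not measurements:

```python
def fetch_modify_write_bytes(file_size):
    """Network bytes for the method of FIG. 1: the file crosses the
    network once on fetch (step 112) and once on write-back (step 116)."""
    return file_size * 2

def direct_command_bytes(command_size, reply_size):
    """Network bytes for a direct manipulation command: only the command
    and the result/confirmation cross the network; the data stays put."""
    return command_size + reply_size
```

For a 1 MB file and a few hundred bytes of command traffic, the difference is roughly four orders of magnitude in network utilization.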
[0016] Accordingly, there is a need for a technique for
manipulating data on a storage device that avoids the limitations
of the prior art solutions.
SUMMARY
[0017] The present invention describes a method and system for
performing data manipulation on a storage device. A data
manipulation command is created on a computing device, wherein the
computing device is separate from the storage device. The computing
device is a client or a server that requests services of a storage
system to store data on a storage medium. The computing device and
the storage device are connected over a network. The computing
device executes a host application, and its data is stored on the
medium. The computing device issues a command to the storage device
to be performed on the data. The storage device executes the
command and sends the result to the computing device.
[0018] The present invention provides advantages over existing
solutions. Several of these advantages are described below by way
of example. First, data manipulation performance is accelerated by
moving the command execution as close to the data as possible.
Second, because all of the data remains on the storage device,
there is no network utilization in transmitting the data to and
from the computer that requested the manipulation. Third, the
requesting computer is not required to expend processing power to
manipulate the data.
[0019] The present invention describes a set of high level commands
that can be built for data manipulation and a mechanism to send the
commands to the storage device. An exemplary command set may
include input/output (IO) instructions (e.g., relocate, remove,
etc.) that can be executed on the storage device. Each instruction
has its own descriptor and a set of parameters that are relevant to
it, e.g., a relocate instruction requires the following inputs: a
range of data to relocate, the source of the data, and the destination
for the data. A logical data manipulation event can be composed of
many such instructions. The instructions are composed and packed by
the initiator of the event and sent to the target storage device
over the network. The target storage device unpacks the
instructions and executes the instructions in a data optimized
manner to arrive at the final result. The set of commands is
evaluated for correctness, the commands are executed, and the
results are returned.
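The composing and packing of instructions described above might look like the following sketch. The application specifies no wire format, so the JSON framing, field names, and instruction descriptors here are all assumptions:

```python
import json

def relocate(src, dst, offset, length):
    # a relocate instruction carries its own descriptor ("op") plus the
    # inputs named in the text: the range of data, its source, and its
    # destination
    return {"op": "relocate", "src": src, "dst": dst,
            "offset": offset, "length": length}

def remove(path, offset, length):
    # a remove instruction with its own descriptor and relevant parameters
    return {"op": "remove", "path": path, "offset": offset, "length": length}

def pack_event(instructions):
    """Initiator side: compose many instructions into one logical data
    manipulation event for transmission to the target storage device."""
    return json.dumps({"event": instructions}).encode()

def unpack_event(payload):
    """Target side: unpack the event back into an instruction list, to be
    evaluated for correctness and then executed in a data-optimized manner."""
    return json.loads(payload.decode())["event"]
```

The target is free to reorder the unpacked instructions before execution, as the detailed description notes, so long as the final result is the same.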
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] A more detailed understanding of the invention may be had
from the following description of preferred embodiments, given by
way of example, and to be understood in conjunction with the
accompanying drawings, wherein:
[0021] FIG. 1 is a flow diagram of an existing method for
manipulating data stored on a storage device;
[0022] FIG. 2 is a block diagram of a network environment in which
the present invention can be implemented;
[0023] FIG. 3 is a block diagram of the file server shown in FIG.
2;
[0024] FIG. 4 is a block diagram of the storage operating system
shown in FIG. 3;
[0025] FIG. 5 is a flow diagram of a method for directly
manipulating data on a storage device;
[0026] FIG. 6 is a flow diagram of a method for file repacking to
be performed on a storage device; and
[0027] FIG. 7 is a diagram of a data file that is repacked
according to the method shown in FIG. 6.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Network Environment
[0028] FIG. 2 is a block diagram of an exemplary network
environment 200 in which the principles of the present invention
are implemented. The environment 200 includes a number of clients
204 connected to a file server 206 over a network 202. The network
202 can be a local area network (LAN), a wide area network (WAN), a
virtual private network (VPN) using communication links over the
Internet, for example, or any combination of the three network
types. For the purposes of this description, the term "network"
includes any acceptable network architecture.
[0029] The file server 206, described further below, is configured
to control storage of data and access to data that is located on a
set 208 of interconnected storage volumes or disks 210. It is noted
that the terms "storage volumes" and "disks" can be used
interchangeably herein, without limiting the term "storage volumes"
to disks. The term "storage volumes" can include any type of
storage media, such as tapes or non-volatile memory.
[0030] Each of the devices attached to the network 202 includes an
appropriate conventional network interface connection (not shown)
for communicating over the network 202 using a communication
protocol, such as Transmission Control Protocol/Internet Protocol
(TCP/IP), User Datagram Protocol (UDP), Hypertext Transfer
Protocol (HTTP), Simple Network Management Protocol (SNMP), or
Virtual Interface (VI) connections.
[0031] File Server
[0032] FIG. 3 is a detailed block diagram of an exemplary file
server ("filer") 206. It will be understood by one skilled in the
art that the inventive concepts described herein apply to any type
of file server, wherever implemented, including on a
special-purpose computer, a general-purpose computer, or a
standalone computer.
[0033] The file server 206 includes a processor 302, a memory 304,
a network adapter 306, a nonvolatile random access memory (NVRAM)
308, and a storage adapter 310, all of which are interconnected by
a system bus 312. Contained within the memory 304 is a storage
operating system 314 that implements a file system to logically
organize the information as a hierarchical structure of directories
and files on the disks 210. In an exemplary embodiment, the memory
304 is addressable by the processor 302 and the adapters 306, 310
for storing software program code. The operating system 314,
portions of which are typically resident in the memory 304 and
executed by the processing elements, functionally organizes the
filer by invoking storage operations in support of a file service
implemented by the filer.
[0034] The network adapter 306 includes mechanical, electrical, and
signaling circuitry needed to connect the filer 206 to clients 204
over the network 202. The clients 204 may be general-purpose
computers configured to execute applications, such as database
applications. Moreover, the clients 204 may interact with the filer
206 in accordance with a client/server information delivery model.
That is, the client 204 requests the services of the filer 206, and
the filer 206 returns the results of the services requested by the
client 204 by exchanging packets defined by an appropriate
networking protocol.
[0035] The storage adapter 310 interoperates with the storage
operating system 314 and the disks 210 of the set of storage
volumes 208 to access information requested by the client 204. The
storage adapter 310 includes input/output (I/O) interface circuitry
that couples to the disks 210 over an I/O interconnect arrangement,
such as a Fibre Channel link. The information is retrieved by the
storage adapter 310 and, if necessary, is processed by the
processor 302 (or the adapter 310 itself) prior to being forwarded
over the system bus 312 to the network adapter 306, where the
information is formatted into appropriate packets and returned to
the client 204.
[0036] In one exemplary implementation, the filer 206 includes a
non-volatile random access memory (NVRAM) 308 that provides
fault-tolerant backup of data, enabling the integrity of filer
transactions to survive a service interruption based upon a power
failure or other fault.
[0037] Storage Operating System
[0038] To facilitate the generalized access to the disks 210, the
storage operating system 314 implements a write-anywhere file
system that logically organizes the information as a hierarchical
structure of directories and files on the disks. As noted above, in
an exemplary embodiment described herein, the storage operating
system 314 is the NetApp.RTM. Data ONTAP.RTM. operating system
available from Network Appliance, Inc., that implements the
WAFL.RTM. file system. It is noted that any other appropriate file
system can be used, and as such, where the terms "WAFL.RTM." or
"file system" are used, those terms should be interpreted broadly
to refer to any file system that is adaptable to the teachings of
this invention.
[0039] Referring now to FIG. 4, the storage operating system 314
includes a series of software layers, including a media access
layer 402 of network drivers (e.g., an Ethernet driver). The
storage operating system 314 further includes network protocol
layers, such as an Internet Protocol (IP) layer 404 and its
supporting transport mechanisms, a Transmission Control Protocol (TCP)
layer 406 and a User Datagram Protocol (UDP) layer 408.
[0040] A file system protocol layer 410 provides multi-protocol
data access and includes support for the Network File System (NFS)
protocol 412, the Common Internet File System (CIFS) protocol 414,
and the Hyper Text Transfer Protocol (HTTP) 416. In addition, the
storage operating system 314 includes a disk storage layer 420 that
implements a disk storage protocol, such as a redundant array of
independent disks (RAID) protocol, and a disk driver layer 422 that
implements a disk access protocol such as, e.g., a Small Computer
System Interface (SCSI) protocol.
[0041] Bridging the disk software layers 420-422 with the network
and file system protocol layers 402-416 is a file system layer 430.
Generally, the file system layer 430 implements a file system
having an on-disk format representation that is block-based using
data blocks and inodes to describe the files.
In the storage operating system 314, a data request follows path 432
between the network 202 and the disk 210 through the various layers
of the operating system. In response to a transaction
request, the file system layer 430 generates an operation to
retrieve the requested data from the disks 210 if the data is not
resident in the filer's memory 304. If the data is not in the
memory 304, then the file system layer 430 indexes into an inode
file using the inode number to access an appropriate entry and
retrieve a logical volume block number. The file system layer 430
then passes the logical volume block number to the disk storage
layer 420. The disk storage layer 420 maps the logical number to a
disk block number and sends the disk block number to an appropriate
driver (for example, an encapsulation of SCSI implemented on a
Fibre Channel disk interconnection) in the disk driver layer 422.
The disk driver accesses the disk block number on the disks 210 and
loads the requested data in the memory 304 for processing by the
filer 206. Upon completing the request, the filer 206 (and storage
operating system 314) returns a reply, e.g., an acknowledgement
packet defined by the CIFS specification, to the client 204 over
the network 202.
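The lookup chain in the paragraph above (inode number to logical volume block number in the file system layer, then logical number to disk block number in the disk storage layer) can be sketched with toy lookup tables standing in for the inode file and the RAID mapping:

```python
def resolve_disk_block(inode_number, inode_file, block_map):
    """Follow the request path of paragraph [0042]: the file system layer
    indexes the inode file by inode number to get a logical volume block
    number, and the disk storage layer maps that logical number to the
    disk block number handed to the disk driver."""
    logical_block = inode_file[inode_number]   # file system layer 430
    return block_map[logical_block]            # disk storage layer 420
```

The driver then accesses that disk block number on the disks and loads the data into memory, after which the reply goes back to the client.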
[0043] It is noted that the storage access request data path 432
through the storage operating system layers described above may be
implemented in hardware, software, or a combination of hardware and
software. In an alternate embodiment of this invention, the storage
access request data path 432 may be implemented as logic circuitry
embodied within a field programmable gate array (FPGA) or in an
application specific integrated circuit (ASIC). This type of
hardware implementation increases the performance of the file
services provided by the filer 206 in response to a file system
request issued by a client 204.
[0044] By way of introduction, the present invention provides
advantages over existing solutions. Several of these advantages are
described below by way of example. First, data manipulation
performance is accelerated by moving the command execution as close
to where the data is stored as possible. Second, because all of the
data remains on the storage device, there is no network utilization
in transmitting the data to and from the computer that requested
the manipulation. Third, the requesting computer is not required to
expend processing power to manipulate the data.
[0045] The present invention describes a set of high level commands
that can be built for data manipulation and a mechanism to send the
commands to the storage device. An exemplary command set may
include IO instructions (e.g., relocate, remove, etc.) that can be
executed on the storage device. Each instruction has its own
descriptor and a set of parameters that are relevant to it, e.g., a
relocate instruction requires the following inputs: a range of data
to relocate, the source of the data, and the destination for the data.
A logical data manipulation event can be composed of many such
instructions. The instructions are composed and packed by the
initiator of the event and sent to the target storage device over
the network. The target storage device unpacks the instructions and
executes the instructions in a data optimized manner to arrive at
the final result. As discussed in greater detail below, the concept
of a "data optimized manner" can include the storage device
reordering the instructions to improve (e.g., speed up) the
performance of the instructions. The set of commands is evaluated
for correctness, the commands are executed, and the results are
returned.
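The descriptor-and-parameters structure described above can be sketched as follows. This is a minimal illustration, not the patented implementation; the class name, field names, and the use of JSON as the packing format are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class RelocateInstruction:
    """One instruction in a data manipulation command (hypothetical
    layout): a descriptor plus the parameters relevant to it."""
    opcode: str       # instruction descriptor, e.g. "relocate"
    source: str       # source file containing the data
    destination: str  # destination for the relocated data
    offset: int       # start of the range of data to relocate
    length: int       # size of the range in bytes

def pack_command(instructions):
    """Compose and pack a list of instructions into a single
    command payload, as the initiator of the event would before
    sending it to the target storage device over the network."""
    return json.dumps([asdict(i) for i in instructions]).encode()

# a logical event composed of several instructions
cmd = pack_command([
    RelocateInstruction("relocate", "mail.mbox", "mail.new", 0, 4096),
    RelocateInstruction("relocate", "mail.mbox", "mail.new", 8192, 2048),
])
unpacked = json.loads(cmd)  # the target storage device unpacks it
```

Only the packed command travels over the network; the data itself stays on the storage device.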
[0046] In an exemplary embodiment, the present invention is
implemented as an application executing on a computer operating
system. For example, the storage device can include the
NearStore.RTM. storage system running the NetApp.RTM. Data
ONTAP.RTM. operating system available from Network Appliance, Inc.
It is noted that the principles of the present invention are
applicable to any type of storage device running any type of
operating system.
[0047] FIG. 5 shows a flow diagram of a method 500 for directly
manipulating data on a storage device. The method 500 utilizes a
client 502 and a storage device 504, which communicate with each
other over a network connection. A person of ordinary skill in the
art would understand that the client 502 can be a client 204 as
shown in FIG. 2 and that the storage device 504 can be a file
server 206 shown in FIG. 2.
[0048] While the method 500 is described as using a client 502, any
suitable computing device capable of communicating with the storage
device 504 may be used. Client 502 utilizes services of the storage
device 504 to store and manage data, such as, for example, files on
a storage media 508, which can be a set of mass storage devices,
such as magnetic or optical storage based disks or tapes. As used
herein, the term "file" encompasses a container, an object, or any
other storage entity. Interaction between the client 502 and the
storage device 504 can enable the provision of storage services.
That is, the client 502 may request the services of the storage
device 504 and the storage device 504 may return the results of the
services requested by the client 502, by exchanging packets over
the connection system (not shown in FIG. 5). The client 502 may
issue packets using file-based access protocols, such as the Common
Internet File System (CIFS) protocol or the Network File System
(NFS) protocol, over the Transmission Control Protocol/Internet
Protocol (TCP/IP) when accessing information in the form of files
and directories. Alternatively, the client 502 may issue packets
including block-based access protocols, such as the Small Computer
Systems Interface (SCSI) protocol encapsulated over TCP (iSCSI) and
SCSI encapsulated over Fibre Channel (FCP), when accessing
information in the form of blocks. The client 502 executes one or
more host applications (not shown in FIG. 5).
[0049] The storage device 504 includes a storage manager 506 and a
data storage media 508. In a preferred embodiment, the storage
manager 506 is the file system layer 430 of the storage operating
system 314 as shown in FIG. 4. Referring back to FIG. 5, a command
with associated inputs (e.g., the source file name) to manipulate
data is created at the client 502 (step 510) and the client 502
sends the command over the network to the storage manager 506 in
the storage device 504 (step 512). The storage manager 506 unpacks
the command and the associated inputs (step 514) and requests a
source file from the storage media 508 that contains the data to be
manipulated (step 516). The storage media 508 retrieves the source
file (step 518) and the storage manager 506 reads the source file
(step 520). The storage manager 506 manipulates the requested data
from the source file as specified by the instructions in the
command (step 522) and writes the manipulated data back to the
storage media 508 (step 524). The storage manager 506 then sends
the manipulation result back over the network to the client 502
(step 526).
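The storage-device side of the method 500 (steps 514 through 526) can be sketched as below. This is an illustrative model, assuming a JSON-packed command and an in-memory dictionary standing in for the storage media 508; the command fields and the "remove" operation are hypothetical.

```python
import json

def handle_command(packed_cmd, storage):
    """Model of the storage manager in method 500: unpack the
    command, read the source file, manipulate the data as the
    command specifies, write it back, and return only a small
    result; the data itself never leaves the storage device."""
    cmd = json.loads(packed_cmd)          # step 514: unpack command
    data = storage[cmd["source"]]         # steps 516-520: read source
    if cmd["op"] == "remove":             # step 522: manipulate data
        lo = cmd["offset"]
        data = data[:lo] + data[lo + cmd["length"]:]
    storage[cmd["source"]] = data         # step 524: write back
    return {"status": "ok", "size": len(data)}  # step 526: result only

storage = {"table.db": b"AAAABBBBCCCC"}
reply = handle_command(
    json.dumps({"op": "remove", "source": "table.db",
                "offset": 4, "length": 4}),
    storage)
```

Note that the reply carries only a status and a size, not the manipulated data.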
[0050] One specific example of using the method 500 is in
connection with Internet-based electronic mail. In this scenario,
electronic mail folders are often stored as single files, where
each file contains all of the concatenated mail messages. When a
user deletes a message, it is simply marked as "deleted" in the
file. At a later time, the files need to be "repacked" in order to
reclaim the space freed by the deleted messages.
[0051] A file repacking method 600 (described in connection with
FIG. 6) can assist in repacking these files. The mail repacking
application (not shown in FIG. 6), which is executed at a client
602, sends a list of valid offsets, the source file name, and
destination file name to operate on to a storage device 604. The
storage device 604 reads the list of offsets from the source file
and copies the data from those offsets to the specified destination
file. Once the entire list of offsets is processed, the storage
device 604 returns an indication of success or failure to the
client 602. The mail application can then update its internal
system for tracking individual mail messages to reflect the newly
packed file and delete the old file. Those of skill in the art
would understand that client 602 can correspond to client 502 shown
in FIG. 5; that storage device 604 can correspond to storage device
504; that data storage media 608 can correspond to data storage
media 508; and that storage manager 606 can correspond to storage
manager 506.
[0052] Sending a list of commands to the storage device 604 to
repack the data directly on the storage device 604 provides the
following advantages: the amount of data sent to the storage device
604 is small (only the commands are sent, and not the data), the
storage device 604 can optimize the set of instructions and execute
them in an efficient manner, and no protocol overhead is needed
because the data is never moved off the storage device 604.
[0053] FIG. 6 is a flow diagram of a method 600 for file repacking
to be performed on the storage device 604. The method 600 utilizes
the client 602 and the storage device 604, which communicate with
each other over a network connection. While the method 600 is
described as using a client 602, any suitable computing device
capable of communicating with the storage device 604 may be used.
The storage device 604 includes a storage manager 606 and a data
storage media 608. The client 602 identifies the source file to be
repacked, the destination file, a list of segments in the source
file to copy to the destination file, data to be inserted into the
destination file (this inserted data is optional), and regions to
skip in the destination file (holes) (step 610). It is noted that
in the example of the mail repacking application, the client 602
identifies this information by using information already maintained
by the mail repacking application. As noted above, a benefit of the
method 600 is to transfer processing from the client 602 to the
storage device 604; the basic operation of the underlying mail
repacking application is not changed. One reason that a user may
want to leave holes in the destination file is to leave space to
write metafile information, such as the number of records, the time
of the repacking, and similar information which would be useful at
a later time. The client 602 then packs the information into a
command (step 612).
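The information identified in step 610 and packed in step 612 can be sketched as a single self-describing payload. The field names and the JSON/base64 encoding are assumptions for illustration only; the patent does not specify a wire format.

```python
import base64
import json

def pack_repack_command(source, destination, segments,
                        insert_data=b"", holes=()):
    """Client side of steps 610-612: bundle the source file, the
    destination file, the list of segments to copy, any optional
    data to insert, and any regions to skip (holes) into one
    command for the storage device."""
    return json.dumps({
        "op": "repack",
        "source": source,
        "destination": destination,
        "segments": segments,   # (offset, length) pairs to copy
        "insert": base64.b64encode(insert_data).decode(),  # optional
        "holes": list(holes),   # regions to skip in the destination
    }).encode()

cmd = pack_repack_command("mail.mbox", "mail.new",
                          segments=[(0, 100), (250, 80)],
                          insert_data=b"repacked")
decoded = json.loads(cmd)
```

A single call like this carries everything the storage device needs, so only one network round trip is required per repack.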
[0054] Each command executed by the method 600 may consist of a
single call to the storage device 604 that contains all the details
to repack a file, such as the list of segments to be copied, any
data to be inserted into the destination file, and any regions to
skip in the destination file. It is noted that the list of segments
to be copied from the source file to the destination file could
alternatively be a list of segments from the source file that are
not to be copied to the destination file, wherein all other
segments of the source file are to be copied to the destination
file. The choice is implementation-specific and does not affect the
general operation of the method 600. Whether the list of segments
indicates segments to include or segments to exclude from the
destination file can be indicated by a flag, for example. One
skilled in the art can readily identify other types of indicators
for identifying these segments; all such indicators are within the
scope of the present invention.
[0055] Furthermore, if the list of segments indicates a list of
segments to be included in the destination file, a particular
ordering for the inclusion list could be specified, whereby the
segments in the destination file would be reordered from how the
segments appear in the source file. For purposes of discussion of
the method 600, the list of segments includes a list of segments to
copy from the source file to the destination file.
[0056] The client 602 sends the packed command over the network to
the storage manager 606 in the storage device 604 (step 614). The
storage manager 606 unpacks the command (step 616) and requests a
source file from the storage media 608 (step 618). The storage
media 608 retrieves the source file (step 620) and the storage
manager 606 reads the source file (step 622). The storage manager
606 copies the segments from the list of segments of the source
file to the destination file (step 624).
[0057] The storage manager 606 can choose to reorder and optimize
the set of instructions in the command. Whether the instructions
are reordered depends on the implementation and the layout of the
data. For example, data can be pre-fetched for the next instruction
while the current instruction is in progress. The storage manager
606 knows best how to execute the instructions. The client 602 does
not know where the data is physically located in the storage media
608. However, the storage manager 606 knows where the data is
located in the storage media 608, and can use this information to
accelerate the method 600. For example, the storage manager 606
could read blocks out of the order specified in the file repacking
command in order to obtain better performance from the storage
device 604.
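The reordering described in paragraph [0057] can be sketched as sorting the segment reads by physical location, which only the storage manager can do because only it knows the on-media layout. The layout mapping below is entirely hypothetical.

```python
def optimized_read_order(segments, physical_block_of):
    """Return the indices of the segments in an order sorted by
    physical location on the media, so reads (or prefetches) sweep
    the media sequentially. The results can still be assembled in
    the order the repacking command specified."""
    return sorted(range(len(segments)),
                  key=lambda i: physical_block_of(segments[i]))

# hypothetical layout: (offset, length) segment -> physical block
layout = {(0, 100): 900, (100, 50): 12, (150, 60): 400}
segments = list(layout)
order = optimized_read_order(segments, lambda seg: layout[seg])
```

Here the segment listed first in the command sits at the highest physical block, so it is read last; the client never sees this reordering, only the final result.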
[0058] If additional data was provided (in step 610) to be inserted
into the destination file, the storage manager 606 inserts the data
(step 626; this optional step is shown in dashed outline). The
storage manager 606 writes the destination file to the storage
media 608 (step 628). The storage manager 606 then sends the result
of the file repacking command back over the network to the client
602 (step 630).
[0059] FIG. 7 is a diagram of a data file that is repacked
according to the method shown in FIG. 6. A source file 702 includes
a plurality of data segments 710, 712, 716, 718, 722, and several
regions to be deleted 714, 720, 724. The method 600 copies the data
segments 710, 712, 716, 718, 722 to a destination file 704, removes
the regions to be deleted 714, 720, 724, and adds new data 730 to
the destination file 704.
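The FIG. 7 transformation can be worked through concretely. The byte values and segment boundaries below are invented for illustration; the kept segments play the roles of 710, 712, 716, 718, and 722, and the appended bytes play the role of the new data 730.

```python
def repack(source, keep_segments, new_data=b""):
    """Copy the kept (offset, length) segments of the source file
    to the destination, dropping the deleted regions, then append
    the new data at the end."""
    dest = b"".join(source[off:off + ln] for off, ln in keep_segments)
    return dest + new_data

# "xx" and "yy" stand for regions marked deleted (714, 720 in FIG. 7)
src = b"AA" + b"BB" + b"xx" + b"CC" + b"yy" + b"DD"
dst = repack(src, [(0, 2), (2, 2), (6, 2), (10, 2)], new_data=b"NEW")
```

The destination file is both smaller than the source (the deleted regions are gone) and extended with the newly added data.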
[0060] Another example of using the method 500 is in connection
with database table repacking. In one implementation, the client
502 may execute a database management system, such as Microsoft.TM.
SQL Server, by Microsoft Corporation of Redmond, Wash. Databases
within database management systems maintain tables as files which
contain fixed-size records. When a record is deleted, it is simply
marked as "deleted" and is removed from the table index.
Periodically, databases repack the table file to improve
performance and to free up space held by deleted records. In
particular, the repacking method 600 can also be used for repacking
database tables. As described using the components shown in FIG. 6,
the database management system generates a range of valid offsets
in each table file (step 610) and sends the range of offsets to the
storage device 604 (step 614). The storage device 604 uses the list
of offsets to repack the table file (steps 616-624). Once
completed, the database can update its indices and delete or
archive the old table file.
[0061] While the method 600 was described in connection with
repacking a file, other IO commands can be performed using a
similar method, as generally shown by the method 500. The other IO
commands can include, but are not limited to, the commands shown in
Table 1.
TABLE-US-00001
TABLE 1
IO Commands
Command     Corresponding Inputs
read        file, offset, length of read
write       file, offset, length of write
resize      file, offset
delete      file
rename      file1, file2
relocate    file1, offset, length of read, file2, offset, length of
            write
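A storage device could route commands like those in Table 1 through a simple dispatch table. This is a sketch only; the handler set, signatures, and in-memory storage model are all assumptions, and a real device would validate each command for correctness before executing it.

```python
def dispatch(storage, cmd, **kw):
    """Route an IO command to its handler; each handler takes the
    corresponding inputs listed in Table 1 (illustrative subset)."""
    handlers = {
        "read": lambda: storage[kw["file"]][
            kw["offset"]:kw["offset"] + kw["length"]],
        "delete": lambda: storage.pop(kw["file"]),
        "rename": lambda: storage.update(
            {kw["file2"]: storage.pop(kw["file1"])}),
    }
    return handlers[cmd]()

storage = {"a.txt": b"hello world"}
chunk = dispatch(storage, "read", file="a.txt", offset=6, length=5)
dispatch(storage, "rename", file1="a.txt", file2="b.txt")
```

Each command executes entirely on the storage device; only the small result (here, a five-byte slice or nothing at all) would return to the client.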
[0062] The present invention can be implemented in a computer
program tangibly embodied in a computer-readable storage medium
containing a set of instructions for execution by a processor or a
general purpose computer; and method steps of the invention can be
performed by a processor executing a program of instructions to
perform functions of the invention by operating on input data and
generating output data. Suitable processors include, by way of
example, both general and special purpose processors. Typically, a
processor will receive instructions and data from a ROM, a random
access memory (RAM), and/or a storage device. Storage devices
suitable for embodying computer program instructions and data
include all forms of non-volatile memory, including by way of
example semiconductor memory devices, magnetic media such as
internal hard disks and removable disks, magneto-optical media, and
optical media such as CD-ROM disks and digital versatile disks
(DVDs). In addition, while the illustrative embodiments may be
implemented in computer software, the functions within the
illustrative embodiments may alternatively be embodied in part or
in whole using hardware components such as Application Specific
Integrated Circuits (ASICs), Field Programmable Gate Arrays
(FPGAs), or other hardware, or in some combination of hardware
components and software components.
[0063] While specific embodiments of the present invention have
been shown and described, many modifications and variations could
be made by one skilled in the art without departing from the scope
of the invention. The above description serves to illustrate and
not limit the particular invention in any way.
* * * * *