U.S. patent application number 14/265173 was filed with the patent office on 2014-04-29 and published on 2014-12-11 under publication number 20140365539 for performing direct data manipulation on a storage device.
This patent application is currently assigned to NetApp, Inc. The applicant listed for this patent is NetApp, Inc. Invention is credited to Pratap Singh, Don Alvin Trimmer, and Sandeep Yadav.
Publication Number | 20140365539 |
Application Number | 14/265173 |
Document ID | / |
Family ID | 50982213 |
Publication Date | 2014-12-11 |
United States Patent Application | 20140365539 |
Kind Code | A1 |
Trimmer; Don Alvin; et al. |
December 11, 2014 |
PERFORMING DIRECT DATA MANIPULATION ON A STORAGE DEVICE
Abstract
A method and system for performing data manipulation on a
storage device is disclosed. A data manipulation command is created
on a computing device, wherein the computing device is separate
from the storage device. The computing device is a client or a
server that requests services of a storage system to store data on
a storage medium. The computing device and the storage device are
connected over a network. The computing device executes a host
application, and its data is stored on the medium. The computing
device issues a command to the storage device to be performed on
the data. The storage device executes the command and sends the
result to the computing device. As a result, the data is not sent
to the computing device for manipulation.
Inventors: |
Trimmer; Don Alvin; (Sunnyvale, CA); Yadav; Sandeep; (Sunnyvale, CA); Singh; Pratap; (Sunnyvale, CA) |

Applicant: |
Name | City | State | Country | Type |
NetApp, Inc. | Sunnyvale | CA | US | |

Assignee: |
NetApp, Inc. (Sunnyvale, CA) |

Family ID: |
50982213 |

Appl. No.: |
14/265173 |

Filed: |
April 29, 2014 |
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number |
11740471 | Apr 26, 2007 | 8768898 |
14265173 | | |
Current U.S. Class: | 707/825; 707/827 |
Current CPC Class: | G06F 16/1727 20190101; G06F 16/1827 20190101; G06F 16/16 20190101 |
Class at Publication: | 707/825; 707/827 |
International Class: | G06F 17/30 20060101 G06F017/30 |
Claims
1-30. (canceled)
31. A method, comprising: receiving, at a storage server, a command
in a network storage communication protocol from a client device,
the command comprising information identifying a source file stored
at the storage server and a list of file segments to include, the
source file comprising the file segments identified by the list;
executing, at the storage server, the command by copying the file
segments identified by the list from the source file to a
destination file stored at the storage server, without transferring
any portion of the source file to the client; and in response to
the command, sending a confirmation of execution of the command
from the storage server to the client device.
32. The method of claim 31, wherein the receiving comprises:
receiving, at the storage server, a command in a network storage
communication protocol from a client device, the command comprising
information identifying a source file stored at the storage server,
a list of file segments to include and an instruction to add a new
data segment, the source file comprising the file segments
identified by the list.
33. The method of claim 32, further comprising: retrieving, at the
storage server from the client device, data of the new data
segment; and wherein the executing comprises: executing the
command, at the storage server, by copying the file segments
identified by the list from the source file to a destination file
stored at the storage server, and by inserting the data of the new
data segment into the destination file, without transferring any portion
of the source file to the client.
34. The method of claim 31, wherein the receiving comprises:
receiving, at the storage server, a command in a network storage
communication protocol from a client device, the command comprising
information identifying a source file stored at the storage server,
a list of file segments to include and file offsets of the file
segments identified by the list within the destination file, the
source file comprising the file segments identified by the
list.
35. The method of claim 31, wherein the executing further
comprises: retrieving from the source file the file segments
identified by the list; and inserting data of the file segments
identified by the list to the destination file at the file offsets
indicated by the command.
36. The method of claim 35, wherein the executing further
comprises: in an event that the command includes an instruction to
remove a file segment of the destination file, removing the file
segment from the destination file stored in the storage server.
37. The method of claim 31, wherein the receiving comprises:
receiving, at the storage server, a command in a network storage
communication protocol from a client device, the command comprising
information identifying a source file and a destination file stored
at the storage server and a list of file segments to include in the
destination file, the source file comprising the file segments
identified by the list.
38. The method of claim 37, further comprising: avoiding copying
file segments of the source file that are not identified by the
list to the destination file.
39. The method of claim 37, wherein the receiving comprises:
receiving, at the storage server, a command in a network storage
communication protocol from a client device, the command comprising
information identifying a source file and a destination file stored
at the storage server, a list of file segments to include in the
destination file, and a reorder instruction specifying the order in
which the file segments to include appear in the destination file, the
source file comprising the file segments identified by the list;
and wherein the executing comprises: executing the command, at the
storage server, by copying the file segments to include identified
by the list from the source file to a destination file stored at
the storage server, without transferring any portion of the source
file to the client, and by reordering the file segments to include
in the destination file based on the reorder instruction.
40. A non-transitory machine readable medium having stored thereon
instructions for performing a method of manipulating data files,
comprising machine executable code which when executed by at least
one machine, causes the machine to: receive at a storage server a
command from a client device over a network, the command comprising
information identifying a source file stored at the storage server
and a destination file stored at the storage server, the command
further comprising an exclusion list identifying file segments of
the source file that are to be excluded from the destination file;
and copy file segments of the source file that are not identified
by the exclusion list from the source file to the destination file
at the storage server, without transferring any portion of the
source file to the client device.
41. The non-transitory machine readable medium of claim 40, wherein
the machine executable code which when executed by at least one
machine, further causes the machine to: send a message notifying a
result of executing the command from the storage server to the
client device, in response to the command.
42. The non-transitory machine readable medium of claim 40, wherein
the command further comprises new data to be inserted into the
destination file; and wherein the machine executable code which
when executed by at least one machine, further causes the machine
to: insert the new data into the destination file.
43. The non-transitory machine readable medium of claim 40, wherein
the command comprises multiple file system operations in an order;
and wherein the machine executable code which when executed by at
least one machine, further causes the machine to: determine a
reordering of the file system operations of the command to improve
the performance of their execution by the storage server, prior to
executing the file system operations according to the
reordering.
44. A computing device, comprising: a memory containing machine
readable medium comprising machine executable code having stored
thereon instructions for performing a method of manipulating data
sets; a processor coupled to the memory, the processor configured
to execute the machine executable code to: receive from a client a
repacking command identifying a source data set comprising multiple
segments and a destination data set stored at the computing device,
the command including information identifying at least a segment of
the multiple segments of the source data set; and manipulate the
destination data set based on the repacking command by using at
least the segment of the source data set identified by the command,
without transferring data of the destination data set or the source
data set to the client.
45. The computing device of claim 44, wherein the source data set
is a database table, and at least one segment of the database table
comprises a database record marked as deleted.
46. The computing device of claim 45, wherein the repacking command
instructs the computing device to copy database records of the
database table to the destination data set, except database records
of the database table that are marked as deleted.
47. The computing device of claim 45, wherein the processor is
further configured to execute the machine executable code to update
a database index to point to the destination data set and to remove
the source data set from the storage media, after the repacking
command has been executed.
48. The computing device of claim 44, wherein the source data set
is a file storing a folder of electronic mails, and at least one
segment of the file comprises electronic mails that are
marked as deleted.
49. The computing device of claim 48, wherein the repacking command
instructs the computing device to copy data of the electronic mails
stored in the source data set to the destination data set, except
electronic mails that are marked as deleted.
50. The computing device of claim 48, wherein the repacking command
comprises a list of file offsets identifying locations at which the
electronic mails are stored in the file.
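Claim 31 above describes the storage server copying listed file segments locally and returning only a confirmation to the client. The following is a minimal sketch of that server-side execution, with an in-memory dictionary standing in for the server's file system; the names (`Segment`, `SegmentCopyCommand`, `execute`) are illustrative and not taken from the application:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    offset: int   # byte offset of the segment within the source file
    length: int   # number of bytes in the segment

@dataclass
class SegmentCopyCommand:
    source: str        # path of the source file on the server
    destination: str   # path of the destination file on the server
    segments: list     # file segments to include, in order

def execute(command, files):
    """Copy the listed segments server-side; 'files' stands in for the
    server's file system (path -> bytes). The file data never crosses
    the network; only this confirmation is returned to the client."""
    src = files[command.source]
    out = bytearray()
    for seg in command.segments:
        out += src[seg.offset:seg.offset + seg.length]
    files[command.destination] = bytes(out)
    return "OK"  # confirmation of execution, per the claim
```

A client would send only the small command and receive the confirmation, which is the point of the claim: the source file's bytes stay on the server.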
Description
PRIORITY CLAIM
[0001] This application is a continuation of U.S. patent
application Ser. No. 11/740,471 entitled "PERFORMING DIRECT DATA
MANIPULATION ON A STORAGE DEVICE" and filed on Apr. 26, 2007, which
is expressly incorporated by reference herein.
FIELD OF INVENTION
[0002] The present invention generally relates to networked
storage, and more particularly, to a method and system for directly
manipulating data on a storage device.
BACKGROUND
[0003] A data storage system is a computer and related storage
medium that enables storage or backup of large amounts of data.
Storage systems, also known as storage appliances or storage
servers, may support a network attached storage (NAS) computing
environment. A NAS is a computing environment where file-based
access is provided through a network, typically in a client/server
configuration. A storage server can provide clients with
block-level access to data stored in a set of mass storage devices,
such as magnetic or optical storage disks.
[0004] A file server (also known as a "filer") is a computer that
provides file services relating to the organization of information
on storage devices, such as disks. The filer includes a storage
operating system that implements a file system to logically
organize the information as a hierarchical structure of directories
and files on the disks. Each "on-disk" file may be implemented as a
set of disk blocks configured to store information, whereas the
directory may be implemented as a specially-formatted file in which
information about other files and directories are stored. A filer
may be configured to operate according to a client/server model of
information delivery to allow many clients to access files stored
on the filer. In this model, the client may include an application,
such as a file system protocol, executing on a computer that
connects to the filer over a computer network. The computer network
can include, for example, a point-to-point link, a shared local
area network (LAN), a wide area network (WAN), or a virtual private
network (VPN) implemented over a public network such as the
Internet. Each client may request filer services by issuing file
system protocol messages (in the form of packets) to the filer over
the network.
[0005] A common file system type is a "write in-place" file system,
in which the locations of the data structures (such as inodes and
data blocks) on a disk are typically fixed. An inode is a data
structure used to store information, such as metadata, about a
file, whereas the data blocks are structures used to store the
actual data for the file. The information contained in an inode may
include information relating to ownership of the file, access
permissions for the file, the size of the file, the file type, and
references to locations on disk of the data blocks for the file.
The references to the locations of the file data are provided by
pointers, which may further reference indirect blocks. Indirect
blocks, in turn, reference the data blocks, depending upon the
quantity of data in the file. Changes to the inodes and data blocks
are made "in-place" in accordance with the write in-place file
system. If an update to a file extends the quantity of data for the
file, an additional data block is allocated and the appropriate
inode is updated to reference that data block.
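The inode layout described in this paragraph can be modeled in a few lines. This is a toy sketch, not the on-disk format of any particular file system; the field names and the flat indirect-block table are assumptions for illustration:

```python
class Inode:
    """Toy inode: metadata plus pointers to data blocks, where some
    pointers may go through indirect blocks that list further data blocks."""
    def __init__(self, owner, size, direct=None, indirect=None):
        self.owner = owner              # ownership information
        self.size = size                # file size in bytes
        self.direct = direct or []      # data-block numbers referenced directly
        self.indirect = indirect or []  # numbers of indirect blocks

def data_blocks(inode, indirect_blocks):
    """Resolve every data-block number for a file, following indirect
    blocks; 'indirect_blocks' maps an indirect-block number to the list
    of data-block numbers it references."""
    blocks = list(inode.direct)
    for b in inode.indirect:
        blocks.extend(indirect_blocks[b])
    return blocks
```

Extending a file past its direct pointers, as the paragraph notes, means allocating a new data block and updating the inode (here, appending to `direct` or to an indirect block's list) to reference it.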
[0006] Another file system type is a write-anywhere file system
that does not overwrite data on disks. If a data block on a disk is
read from the disk into memory and "dirtied" with new data, the
data block is written to a new location on the disk to optimize
write performance. A write-anywhere file system may initially
assume an optimal layout, such that the data is substantially
contiguously arranged on the disks. The optimal disk layout results
in efficient access operations, particularly for sequential read
operations. A particular example of a write-anywhere file system is
the Write Anywhere File Layout (WAFL.RTM.) file system available
from Network Appliance, Inc. The WAFL file system is implemented
within a microkernel as part of the overall protocol stack of the
filer and associated disk storage. This microkernel is supplied as
part of Network Appliance's Data ONTAP.RTM. storage operating
system, residing on the filer that processes file service requests
from network-attached clients.
[0007] As used herein, the term "storage operating system"
generally refers to the computer-executable code operable on a
storage system that manages data access. The storage operating
system may, in case of a filer, implement file system semantics,
such as Data ONTAP.RTM. storage operating system. The storage
operating system can also be implemented as an application program
operating on a general-purpose operating system, such as UNIX.RTM.
or Windows.RTM., or as a general-purpose operating system with
configurable functionality, which is configured for storage
applications as described herein.
[0008] Disk storage is typically implemented as one or more storage
"volumes" that comprise physical storage disks, defining an overall
logical arrangement of storage space. Currently available filer
implementations can serve a large number of discrete volumes.
[0009] The disks within a volume can be organized as a Redundant
Array of Independent (or Inexpensive) Disks (RAID). RAID
implementations enhance the reliability and integrity of data
storage through the writing of data "stripes" across a given number
of physical disks in the RAID group, and the appropriate storing of
parity information with respect to the striped data. In the example
of a WAFL.RTM. file system, a RAID-4 implementation is
advantageously employed, which entails striping data across a group
of disks, and storing the parity within a separate disk of the RAID
group. As described herein, a volume typically comprises at least
one data disk and one associated parity disk (or possibly data/parity
partitions within a single disk) arranged according to a
RAID-4, or equivalent high-reliability, implementation.
[0010] NAS devices provide access to stored data using standard
protocols, e.g., Network File System (NFS), Common Internet File
System (CIFS), Internet Small Computer System Interface (iSCSI),
etc. To manipulate the data stored on these devices, clients have
to fetch the data using an access protocol, modify the data, and
then write back the resulting modified data. Bulk data processing
sometimes requires small manipulations of the data that need to be
processed as fast as possible. This process (fetch-modify-write) is
inefficient for bulk data processing, as it wastes processor time
on protocol and network processing and increases network
utilization. The closer the processing is to the stored data, the
less time the data processing will take.
[0011] Traditional file systems are not particularly adept at
handling large numbers (e.g., more than one million) of small
objects (e.g., one kilobyte (KB) files). The typical way of
addressing this problem is to use a container to hold several of
the small objects. However, this solution leads to the problems of
how to manage the containers and how to manage the objects within
the container. Managing the containers simply reintroduces the typical
file system problems at a higher level.
[0012] In applications that use files for storing a list of
records, a deleted record is often marked as "deleted" instead of
being physically removed from the file. The file is periodically
repacked to purge all of the deleted records and to reclaim space.
This process is traditionally carried out by reading the file by an
application via NFS, for example; packing the records by the
application; and writing the file back to storage via NFS, for
example. Again, this process uses the typical fetch-modify-write
pattern, which makes the entire repacking process inefficient for
the storage device.
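The repacking described above, run directly where the data lives, amounts to dropping records flagged as deleted while copying live records into a fresh file. A sketch under assumed conventions (a one-byte live/deleted flag and a fixed record size, neither specified by the application):

```python
LIVE, DELETED = b"L", b"D"
REC = 8  # assumed record size: one flag byte plus seven payload bytes

def repack(data):
    """Return the repacked file contents: only records whose flag byte
    marks them live. Records marked deleted are purged, reclaiming
    their space, without the file ever leaving the storage device."""
    out = bytearray()
    for i in range(0, len(data), REC):
        rec = data[i:i + REC]
        if rec[:1] == LIVE:
            out += rec
    return bytes(out)
```

Executed on the storage device, this replaces the fetch-modify-write round trip with a single command and a confirmation.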
[0013] Another example of this type of IO-intensive task is reading
a file and rewriting the data to another file, with the data being
relocated within the destination file. In addition to using
resources on the NAS device, this task also incurs a load on the
network (sending the file back and forth) and a load on the client
that is processing the data.
[0014] FIG. 1 is a flow diagram of an existing fetch-modify-write
method 100 for manipulating data stored on a storage device. The
method 100 operates between a server 102 and a data storage media
104. The server 102 and the data storage media 104 communicate over
a network connection. The server 102 requests data to be
manipulated from the storage media 104 (step 110). The storage
media 104 retrieves the data and sends the data over the network to
the server 102 (step 112). The server 102 manipulates the requested
data (step 114) and sends the manipulated data back over the
network to the storage media 104 (step 116).
[0015] As can be seen from FIG. 1, the method 100 requires that the
data be sent over the network twice--once from the storage media
104 to the server 102 (step 112) and a second time from the server
102 to the storage media 104 (step 116).
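The cost asymmetry in FIG. 1 can be put in back-of-the-envelope terms: the fetch-modify-write pattern moves the file across the network twice, while a direct manipulation command moves only the small command and its confirmation. The sizes below are illustrative, not measurements:

```python
def fetch_modify_write_bytes(file_size):
    """Network bytes for the method of FIG. 1: the file crosses the
    network once on fetch (step 112) and once on write-back (step 116)."""
    return file_size * 2

def direct_command_bytes(command_size, reply_size):
    """Network bytes for a direct manipulation command: only the command
    and the result/confirmation cross the network; the data stays put."""
    return command_size + reply_size
```

For a 1 MB file and a few hundred bytes of command traffic, the difference is roughly four orders of magnitude in network utilization.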
[0016] Accordingly, there is a need for a technique for
manipulating data on a storage device that avoids the limitations
of the prior art solutions.
SUMMARY
[0017] The present invention describes a method and system for
performing data manipulation on a storage device. A data
manipulation command is created on a computing device, wherein the
computing device is separate from the storage device. The computing
device is a client or a server that requests services of a storage
system to store data on a storage medium. The computing device and
the storage device are connected over a network. The computing
device executes a host application, and its data is stored on the
medium. The computing device issues a command to the storage device
to be performed on the data. The storage device executes the
command and sends the result to the computing device.
[0018] The present invention provides advantages over existing
solutions. Several of these advantages are described below by way
of example. First, data manipulation performance is accelerated by
moving the command execution as close to the data as possible.
Second, because all of the data remains on the storage device,
there is no network utilization in transmitting the data to and
from the computer that requested the manipulation. Third, the
requesting computer is not required to expend processing power to
manipulate the data.
[0019] The present invention describes a set of high level commands
that can be built for data manipulation and a mechanism to send the
commands to the storage device. An exemplary command set may
include input/output (IO) instructions (e.g., relocate, remove,
etc.) that can be executed on the storage device. Each instruction
has its own descriptor and a set of parameters that are relevant to
it, e.g., a relocate instruction requires the following inputs: a
range of data to relocate, the source of the data, and the destination
for the data. A logical data manipulation event can be composed of
many such instructions. The instructions are composed and packed by
the initiator of the event and sent to the target storage device
over the network. The target storage device unpacks the
instructions and executes the instructions in a data optimized
manner to arrive at the final result. The set of commands is
evaluated for correctness, the commands are executed, and the
results are returned.
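The composing and packing of instructions described above might look like the following sketch. The application specifies no wire format, so the JSON framing, field names, and instruction descriptors here are all assumptions:

```python
import json

def relocate(src, dst, offset, length):
    # a relocate instruction carries its own descriptor ("op") plus the
    # inputs named in the text: the range of data, its source, and its
    # destination
    return {"op": "relocate", "src": src, "dst": dst,
            "offset": offset, "length": length}

def remove(path, offset, length):
    # a remove instruction with its own descriptor and relevant parameters
    return {"op": "remove", "path": path, "offset": offset, "length": length}

def pack_event(instructions):
    """Initiator side: compose many instructions into one logical data
    manipulation event for transmission to the target storage device."""
    return json.dumps({"event": instructions}).encode()

def unpack_event(payload):
    """Target side: unpack the event back into an instruction list, to be
    evaluated for correctness and then executed in a data-optimized manner."""
    return json.loads(payload.decode())["event"]
```

The target is free to reorder the unpacked instructions before execution, as the detailed description notes, so long as the final result is the same.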
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] A more detailed understanding of the invention may be had
from the following description of preferred embodiments, given by
way of example, and to be understood in conjunction with the
accompanying drawings, wherein:
[0021] FIG. 1 is a flow diagram of an existing method for
manipulating data stored on a storage device;
[0022] FIG. 2 is a block diagram of a network environment in which
the present invention can be implemented;
[0023] FIG. 3 is a block diagram of the file server shown in FIG.
2;
[0024] FIG. 4 is a block diagram of the storage operating system
shown in FIG. 3;
[0025] FIG. 5 is a flow diagram of a method for directly
manipulating data on a storage device;
[0026] FIG. 6 is a flow diagram of a method for file repacking to
be performed on a storage device; and
[0027] FIG. 7 is a diagram of a data file that is repacked
according to the method shown in FIG. 6.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Network Environment
[0028] FIG. 2 is a block diagram of an exemplary network
environment 200 in which the principles of the present invention
are implemented. The environment 200 includes a number of clients
204 connected to a file server 206 over a network 202. The network
202 can be a local area network (LAN), a wide area network (WAN), a
virtual private network (VPN) using communication links over the
Internet, for example, or any combination of the three network
types. For the purposes of this description, the term "network"
includes any acceptable network architecture.
[0029] The file server 206, described further below, is configured
to control storage of data and access to data that is located on a
set 208 of interconnected storage volumes or disks 210. It is noted
that the terms "storage volumes" and "disks" can be used
interchangeably herein, without limiting the term "storage volumes"
to disks. The term "storage volumes" can include any type of
storage media, such as tapes or non-volatile memory.
[0030] Each of the devices attached to the network 202 includes an
appropriate conventional network interface connection (not shown)
for communicating over the network 202 using a communication
protocol, such as Transmission Control Protocol/Internet Protocol
(TCP/IP), User Datagram Protocol (UDP), Hypertext Transfer
Protocol (HTTP), Simple Network Management Protocol (SNMP), or
Virtual Interface (VI) connections.
[0031] File Server
[0032] FIG. 3 is a detailed block diagram of an exemplary file
server ("filer") 206. It will be understood by one skilled in the
art that the inventive concepts described herein apply to any type
of file server, wherever implemented, including on a
special-purpose computer, a general-purpose computer, or a
standalone computer.
[0033] The file server 206 includes a processor 302, a memory 304,
a network adapter 306, a nonvolatile random access memory (NVRAM)
308, and a storage adapter 310, all of which are interconnected by
a system bus 312. Contained within the memory 304 is a storage
operating system 314 that implements a file system to logically
organize the information as a hierarchical structure of directories
and files on the disks 210. In an exemplary embodiment, the memory
304 is addressable by the processor 302 and the adapters 306, 310
for storing software program code. The operating system 314,
portions of which are typically resident in the memory 304 and
executed by the processing elements, functionally organizes the
filer by invoking storage operations in support of a file service
implemented by the filer.
[0034] The network adapter 306 includes mechanical, electrical, and
signaling circuitry needed to connect the filer 206 to clients 204
over the network 202. The clients 204 may be general-purpose
computers configured to execute applications, such as database
applications. Moreover, the clients 204 may interact with the filer
206 in accordance with a client/server information delivery model.
That is, the client 204 requests the services of the filer 206, and
the filer 206 returns the results of the services requested by the
client 204 by exchanging packets defined by an appropriate
networking protocol.
[0035] The storage adapter 310 interoperates with the storage
operating system 314 and the disks 210 of the set of storage
volumes 208 to access information requested by the client 204. The
storage adapter 310 includes input/output (I/O) interface circuitry
that couples to the disks 210 over an I/O interconnect arrangement,
such as a Fibre Channel link. The information is retrieved by the
storage adapter 310 and, if necessary, is processed by the
processor 302 (or the adapter 310 itself) prior to being forwarded
over the system bus 312 to the network adapter 306, where the
information is formatted into appropriate packets and returned to
the client 204.
[0036] In one exemplary implementation, the filer 206 includes a
non-volatile random access memory (NVRAM) 308 that provides
fault-tolerant backup of data, enabling the integrity of filer
transactions to survive a service interruption based upon a power
failure or other fault.
[0037] Storage Operating System
[0038] To facilitate the generalized access to the disks 210, the
storage operating system 314 implements a write-anywhere file
system that logically organizes the information as a hierarchical
structure of directories and files on the disks. As noted above, in
an exemplary embodiment described herein, the storage operating
system 314 is the NetApp.RTM. Data ONTAP.RTM. operating system
available from Network Appliance, Inc., that implements the
WAFL.RTM. file system. It is noted that any other appropriate file
system can be used, and as such, where the terms "WAFL.RTM." or
"file system" are used, those terms should be interpreted broadly
to refer to any file system that is adaptable to the teachings of
this invention.
[0039] Referring now to FIG. 4, the storage operating system 314
includes a series of software layers, including a media access
layer 402 of network drivers (e.g., an Ethernet driver). The
storage operating system 314 further includes network protocol
layers, such as an Internet Protocol (IP) layer 404 and its
supporting transport mechanisms, a Transmission Control Protocol (TCP)
layer 406 and a User Datagram Protocol (UDP) layer 408.
[0040] A file system protocol layer 410 provides multi-protocol
data access and includes support for the Network File System (NFS)
protocol 412, the Common Internet File System (CIFS) protocol 414,
and the Hyper Text Transfer Protocol (HTTP) 416. In addition, the
storage operating system 314 includes a disk storage layer 420 that
implements a disk storage protocol, such as a redundant array of
independent disks (RAID) protocol, and a disk driver layer 422 that
implements a disk access protocol such as, e.g., a Small Computer
System Interface (SCSI) protocol.
[0041] Bridging the disk software layers 420-422 with the network
and file system protocol layers 402-416 is a file system layer 430.
Generally, the file system layer 430 implements a file system
having an on-disk format representation that is block-based using
data blocks and inodes to describe the files.
In the storage operating system 314, a data request follows path 432
between the network 202 and the disk 210 through the various layers
of the operating system. In response to a transaction
request, the file system layer 430 generates an operation to
retrieve the requested data from the disks 210 if the data is not
resident in the filer's memory 304. If the data is not in the
memory 304, then the file system layer 430 indexes into an inode
file using the inode number to access an appropriate entry and
retrieve a logical volume block number. The file system layer 430
then passes the logical volume block number to the disk storage
layer 420. The disk storage layer 420 maps the logical number to a
disk block number and sends the disk block number to an appropriate
driver (for example, an encapsulation of SCSI implemented on a
Fibre Channel disk interconnection) in the disk driver layer 422.
The disk driver accesses the disk block number on the disks 210 and
loads the requested data in the memory 304 for processing by the
filer 206. Upon completing the request, the filer 206 (and storage
operating system 314) returns a reply, e.g., an acknowledgement
packet defined by the CIFS specification, to the client 204 over
the network 202.
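The lookup chain in the paragraph above (inode number to logical volume block number in the file system layer, then logical number to disk block number in the disk storage layer) can be sketched with toy lookup tables standing in for the inode file and the RAID mapping:

```python
def resolve_disk_block(inode_number, inode_file, block_map):
    """Follow the request path of paragraph [0042]: the file system layer
    indexes the inode file by inode number to get a logical volume block
    number, and the disk storage layer maps that logical number to the
    disk block number handed to the disk driver."""
    logical_block = inode_file[inode_number]   # file system layer 430
    return block_map[logical_block]            # disk storage layer 420
```

The driver then accesses that disk block number on the disks and loads the data into memory, after which the reply goes back to the client.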
[0043] It is noted that the storage access request data path 432
through the storage operating system layers described above may be
implemented in hardware, software, or a combination of hardware and
software. In an alternate embodiment of this invention, the storage
access request data path 432 may be implemented as logic circuitry
embodied within a field programmable gate array (FPGA) or in an
application specific integrated circuit (ASIC). This type of
hardware implementation increases the performance of the file
services provided by the filer 206 in response to a file system
request issued by a client 204.
[0044] By way of introduction, the present invention provides
advantages over existing solutions. Several of these advantages are
described below by way of example. First, data manipulation
performance is accelerated by moving the command execution as close
to where the data is stored as possible. Second, because all of the
data remains on the storage device, there is no network utilization
in transmitting the data to and from the computer that requested
the manipulation. Third, the requesting computer is not required to
expend processing power to manipulate the data.
[0045] The present invention describes a set of high level commands
that can be built for data manipulation and a mechanism to send the
commands to the storage device. An exemplary command set may
include IO instructions (e.g., relocate, remove, etc.) that can be
executed on the storage device. Each instruction has its own
descriptor and a set of parameters that are relevant to it, e.g., a
relocate instruction requires the following inputs: a range of data
to relocate, the source of the data, and the destination for the data.
A logical data manipulation event can be composed of many such
instructions. The instructions are composed and packed by the
initiator of the event and sent to the target storage device over
the network. The target storage device unpacks the instructions and
executes the instructions in a data optimized manner to arrive at
the final result. As discussed in greater detail below, the concept
of a "data optimized manner" can include the storage device
reordering the instructions to improve (e.g., speed up) the
performance of the instructions. The set of commands is evaluated
for correctness, the commands are executed, and the results are
returned.
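The descriptor-and-parameters structure described above can be sketched as follows. This is a minimal illustration, not the patented implementation; the class name, field names, and the use of JSON as the packing format are all assumptions made for the example.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class RelocateInstruction:
    """One instruction in a data manipulation command (hypothetical
    layout): a descriptor plus the parameters relevant to it."""
    opcode: str       # instruction descriptor, e.g. "relocate"
    source: str       # source file containing the data
    destination: str  # destination for the relocated data
    offset: int       # start of the range of data to relocate
    length: int       # size of the range in bytes

def pack_command(instructions):
    """Compose and pack a list of instructions into a single
    command payload, as the initiator of the event would before
    sending it to the target storage device over the network."""
    return json.dumps([asdict(i) for i in instructions]).encode()

# a logical event composed of several instructions
cmd = pack_command([
    RelocateInstruction("relocate", "mail.mbox", "mail.new", 0, 4096),
    RelocateInstruction("relocate", "mail.mbox", "mail.new", 8192, 2048),
])
unpacked = json.loads(cmd)  # the target storage device unpacks it
```

Only the packed command travels over the network; the data itself stays on the storage device.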
[0046] In an exemplary embodiment, the present invention is
implemented as an application executing on a computer operating
system. For example, the storage device can include the
NearStore.RTM. storage system running the NetApp.RTM. Data
ONTAP.RTM. operating system available from Network Appliance, Inc.
It is noted that the principles of the present invention are
applicable to any type of storage device running any type of
operating system.
[0047] FIG. 5 shows a flow diagram of a method 500 for directly
manipulating data on a storage device. The method 500 utilizes a
client 502 and a storage device 504, which communicate with each
other over a network connection. A person of ordinary skill in the
art would understand that the client 502 can be a client 204 as
shown in FIG. 2 and that the storage device 504 can be a file
server 206 shown in FIG. 2.
[0048] While the method 500 is described as using a client 502, any
suitable computing device capable of communicating with the storage
device 504 may be used. Client 502 utilizes services of the storage
device 504 to store and manage data, such as, for example, files on
a storage media 508, which can be a set of mass storage devices,
such as magnetic or optical storage based disks or tapes. As used
herein, the term "file" encompasses a container, an object, or any
other storage entity. Interaction between the client 502 and the
storage device 504 can enable the provision of storage services.
That is, the client 502 may request the services of the storage
device 504 and the storage device 504 may return the results of the
services requested by the client 502, by exchanging packets over
the connection system (not shown in FIG. 5). The client 502 may
issue packets using file-based access protocols, such as the Common
Internet File System (CIFS) protocol or the Network File System
(NFS) protocol, over the Transmission Control Protocol/Internet
Protocol (TCP/IP) when accessing information in the form of files
and directories. Alternatively, the client 502 may issue packets
including block-based access protocols, such as the Small Computer
Systems Interface (SCSI) protocol encapsulated over TCP (iSCSI) and
SCSI encapsulated over Fibre Channel (FCP), when accessing
information in the form of blocks. The client 502 executes one or
more host applications (not shown in FIG. 5).
[0049] The storage device 504 includes a storage manager 506 and a
data storage media 508. In a preferred embodiment, the storage
manager 506 is the file system layer 430 of the storage operating
system 314 as shown in FIG. 4. Referring back to FIG. 5, a command
with associated inputs (e.g., the source file name) to manipulate
data is created at the client 502 (step 510) and the client 502
sends the command over the network to the storage manager 506 in
the storage device 504 (step 512). The storage manager 506 unpacks
the command and the associated inputs (step 514) and requests a
source file from the storage media 508 that contains the data to be
manipulated (step 516). The storage media 508 retrieves the source
file (step 518) and the storage manager 506 reads the source file
(step 520). The storage manager 506 manipulates the requested data
from the source file as specified by the instructions in the
command (step 522) and writes the manipulated data back to the
storage media 508 (step 524). The storage manager 506 then sends
the manipulation result back over the network to the client 502
(step 526).
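The storage-device side of the method 500 (steps 514 through 526) can be sketched as below. This is an illustrative model, assuming a JSON-packed command and an in-memory dictionary standing in for the storage media 508; the command fields and the "remove" operation are hypothetical.

```python
import json

def handle_command(packed_cmd, storage):
    """Model of the storage manager in method 500: unpack the
    command, read the source file, manipulate the data as the
    command specifies, write it back, and return only a small
    result; the data itself never leaves the storage device."""
    cmd = json.loads(packed_cmd)          # step 514: unpack command
    data = storage[cmd["source"]]         # steps 516-520: read source
    if cmd["op"] == "remove":             # step 522: manipulate data
        lo = cmd["offset"]
        data = data[:lo] + data[lo + cmd["length"]:]
    storage[cmd["source"]] = data         # step 524: write back
    return {"status": "ok", "size": len(data)}  # step 526: result only

storage = {"table.db": b"AAAABBBBCCCC"}
reply = handle_command(
    json.dumps({"op": "remove", "source": "table.db",
                "offset": 4, "length": 4}),
    storage)
```

Note that the reply carries only a status and a size, not the manipulated data.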
[0050] One specific example of using the method 500 is in
connection with Internet-based electronic mail. In this scenario,
electronic mail folders are often stored as single files, where
each file contains all of the concatenated mail messages. When a
user deletes a message, it is simply marked as "deleted" in the
file. At a later time, the files need to be "repacked" in order to
reclaim the space freed by the deleted messages.
[0051] A file repacking method 600 (described in connection with
FIG. 6) can assist in repacking these files. The mail repacking
application (not shown in FIG. 6), which is executed at a client
602, sends a list of valid offsets, the source file name, and
destination file name to operate on to a storage device 604. The
storage device 604 reads the list of offsets from the source file
and copies the data from those offsets to the specified destination
file. Once the entire list of offsets is processed, the storage
device 604 returns an indication of success or failure to the
client 602. The mail application can then update its internal
system for tracking individual mail messages to reflect the newly
packed file and delete the old file. Those of skill in the art
would understand that client 602 can correspond to client 502 shown
in FIG. 5; that storage device 604 can correspond to storage device
504; that data storage media 608 can correspond to data storage
media 508; and that storage manager 606 can correspond to storage
manager 506.
[0052] Sending a list of commands to the storage device 604 to
repack the data directly on the storage device 604 provides the
following advantages: the amount of data sent to the storage device
604 is small (only the commands are sent, and not the data), the
storage device 604 can optimize the set of instructions and execute
them in an efficient manner, and no protocol overhead is needed
because the data is never moved off the storage device 604.
[0053] FIG. 6 is a flow diagram of a method 600 for file repacking
to be performed on the storage device 604. The method 600 utilizes
the client 602 and the storage device 604, which communicate with
each other over a network connection. While the method 600 is
described as using a client 602, any suitable computing device
capable of communicating with the storage device 604 may be used.
The storage device 604 includes a storage manager 606 and a data
storage media 608. The client 602 identifies the source file to be
repacked, the destination file, a list of segments in the source
file to copy to the destination file, data to be inserted into the
destination file (this inserted data is optional), and regions to
skip in the destination file (holes) (step 610). It is noted that
in the example of the mail repacking application, the client 602
identifies this information by using information already maintained
by the mail repacking application. As noted above, a benefit of the
method 600 is to transfer processing from the client 602 to the
storage device 604; the basic operation of the underlying mail
repacking application is not changed. One reason that a user may
want to leave holes in the destination file is to leave space to
write metafile information, such as the number of records, the time
of the repacking, and similar information which would be useful at
a later time. The client 602 then packs the information into a
command (step 612).
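The information identified in step 610 and packed in step 612 can be sketched as a single self-describing payload. The field names and the JSON/base64 encoding are assumptions for illustration only; the patent does not specify a wire format.

```python
import base64
import json

def pack_repack_command(source, destination, segments,
                        insert_data=b"", holes=()):
    """Client side of steps 610-612: bundle the source file, the
    destination file, the list of segments to copy, any optional
    data to insert, and any regions to skip (holes) into one
    command for the storage device."""
    return json.dumps({
        "op": "repack",
        "source": source,
        "destination": destination,
        "segments": segments,   # (offset, length) pairs to copy
        "insert": base64.b64encode(insert_data).decode(),  # optional
        "holes": list(holes),   # regions to skip in the destination
    }).encode()

cmd = pack_repack_command("mail.mbox", "mail.new",
                          segments=[(0, 100), (250, 80)],
                          insert_data=b"repacked")
decoded = json.loads(cmd)
```

A single call like this carries everything the storage device needs, so only one network round trip is required per repack.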
[0054] Each command executed by the method 600 may consist of a
single call to the storage device 604 that contains all the details
to repack a file, such as the list of segments to be copied, any
data to be inserted into the destination file, and any regions to
skip in the destination file. It is noted that the list of segments
to be copied from the source file to the destination file could
alternatively be a list of segments from the source file that are
not to be copied to the destination file, wherein all other
segments of the source file are to be copied to the destination
file. The choice is implementation-specific and does not affect the
general operation of the method 600. Whether the list of segments
indicates segments to include or segments to exclude from the
destination file can be indicated by a flag, for example. One
skilled in the art can readily identify other types of indicators
for identifying these segments; all such indicators are within the
scope of the present invention.
[0055] Furthermore, if the list of segments indicates a list of
segments to be included in the destination file, a particular
ordering for the inclusion list could be specified, whereby the
segments in the destination file would be reordered from how the
segments appear in the source file. For purposes of discussion of
the method 600, the list of segments includes a list of segments to
copy from the source file to the destination file.
[0056] The client 602 sends the packed command over the network to
the storage manager 606 in the storage device 604 (step 614). The
storage manager 606 unpacks the command (step 616) and requests a
source file from the storage media 608 (step 618). The storage
media 608 retrieves the source file (step 620) and the storage
manager 606 reads the source file (step 622). The storage manager
606 copies the segments from the list of segments of the source
file to the destination file (step 624).
[0057] The storage manager 606 can choose to reorder and optimize
the set of instructions in the command. Whether the instructions
are reordered depends on the implementation and the layout of the
data. For example, data can be pre-fetched for the next instruction
while the current instruction is in progress. The storage manager
606 knows best how to execute the instructions. The client 602 does
not know where the data is physically located in the storage media
608. However, the storage manager 606 knows where the data is
located in the storage media 608, and can use this information to
accelerate the method 600. For example, the storage manager 606
could read blocks out of the order specified in the file repacking
command in order to obtain better performance from the storage
device 604.
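The reordering described in paragraph [0057] can be sketched as sorting the segment reads by physical location, which only the storage manager can do because only it knows the on-media layout. The layout mapping below is entirely hypothetical.

```python
def optimized_read_order(segments, physical_block_of):
    """Return the indices of the segments in an order sorted by
    physical location on the media, so reads (or prefetches) sweep
    the media sequentially. The results can still be assembled in
    the order the repacking command specified."""
    return sorted(range(len(segments)),
                  key=lambda i: physical_block_of(segments[i]))

# hypothetical layout: (offset, length) segment -> physical block
layout = {(0, 100): 900, (100, 50): 12, (150, 60): 400}
segments = list(layout)
order = optimized_read_order(segments, lambda seg: layout[seg])
```

Here the segment listed first in the command sits at the highest physical block, so it is read last; the client never sees this reordering, only the final result.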
[0058] If additional data was provided (in step 610) to be inserted
into the destination file, the storage manager 606 inserts the data
(step 626; this optional step is shown in dashed outline). The
storage manager 606 writes the destination file to the storage
media 608 (step 628). The storage manager 606 then sends the result
of the file repacking command back over the network to the client
602 (step 630).
[0059] FIG. 7 is a diagram of a data file that is repacked
according to the method shown in FIG. 6. A source file 702 includes
a plurality of data segments 710, 712, 716, 718, 722, and several
regions to be deleted 714, 720, 724. The method 600 copies the data
segments 710, 712, 716, 718, 722 to a destination file 704, removes
the regions to be deleted 714, 720, 724, and adds new data 730 to
the destination file 704.
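The FIG. 7 transformation can be worked through concretely. The byte values and segment boundaries below are invented for illustration; the kept segments play the roles of 710, 712, 716, 718, and 722, and the appended bytes play the role of the new data 730.

```python
def repack(source, keep_segments, new_data=b""):
    """Copy the kept (offset, length) segments of the source file
    to the destination, dropping the deleted regions, then append
    the new data at the end."""
    dest = b"".join(source[off:off + ln] for off, ln in keep_segments)
    return dest + new_data

# "xx" and "yy" stand for regions marked deleted (714, 720 in FIG. 7)
src = b"AA" + b"BB" + b"xx" + b"CC" + b"yy" + b"DD"
dst = repack(src, [(0, 2), (2, 2), (6, 2), (10, 2)], new_data=b"NEW")
```

The destination file is both smaller than the source (the deleted regions are gone) and extended with the newly added data.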
[0060] Another example of using the method 500 is in connection
with database table repacking. In one implementation, the client
502 may execute a database management system, such as Microsoft.TM.
SQL Server, by Microsoft Corporation of Redmond, Wash. Databases
within database management systems maintain tables as files which
contain fixed-size records. When a record is deleted, it is simply
marked as "deleted" and is removed from the table index.
Periodically, databases repack the table file to improve
performance and to free up space held by deleted records. In
particular, the repacking method 600 can also be used for repacking
database tables. As described using the components shown in FIG. 6,
the database management system generates a range of valid offsets
in each table file (step 610) and sends the range of offsets to the
storage device 604 (step 614). The storage device 604 uses the list
of offsets to repack the table file (steps 616-624). Once
completed, the database can update its indices and delete or
archive the old table file.
[0061] While the method 600 was described in connection with
repacking a file, other IO commands can be performed using a
similar method, as generally shown by the method 500. The other IO
commands can include, but are not limited to, the commands shown in
Table 1.
TABLE-US-00001
TABLE 1
IO Commands
Command     Corresponding Inputs
read        file, offset, length of read
write       file, offset, length of write
resize      file, offset
delete      file
rename      file1, file2
relocate    file1, offset, length of read, file2, offset, length of
            write
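A storage device could route commands like those in Table 1 through a simple dispatch table. This is a sketch only; the handler set, signatures, and in-memory storage model are all assumptions, and a real device would validate each command for correctness before executing it.

```python
def dispatch(storage, cmd, **kw):
    """Route an IO command to its handler; each handler takes the
    corresponding inputs listed in Table 1 (illustrative subset)."""
    handlers = {
        "read": lambda: storage[kw["file"]][
            kw["offset"]:kw["offset"] + kw["length"]],
        "delete": lambda: storage.pop(kw["file"]),
        "rename": lambda: storage.update(
            {kw["file2"]: storage.pop(kw["file1"])}),
    }
    return handlers[cmd]()

storage = {"a.txt": b"hello world"}
chunk = dispatch(storage, "read", file="a.txt", offset=6, length=5)
dispatch(storage, "rename", file1="a.txt", file2="b.txt")
```

Each command executes entirely on the storage device; only the small result (here, a five-byte slice or nothing at all) would return to the client.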
[0062] The present invention can be implemented in a computer
program tangibly embodied in a computer-readable storage medium
containing a set of instructions for execution by a processor or a
general purpose computer; and method steps of the invention can be
performed by a processor executing a program of instructions to
perform functions of the invention by operating on input data and
generating output data. Suitable processors include, by way of
example, both general and special purpose processors. Typically, a
processor will receive instructions and data from a ROM, a random
access memory (RAM), and/or a storage device. Storage devices
suitable for embodying computer program instructions and data
include all forms of non-volatile memory, including by way of
example semiconductor memory devices, magnetic media such as
internal hard disks and removable disks, magneto-optical media, and
optical media such as CD-ROM disks and digital versatile disks
(DVDs). In addition, while the illustrative embodiments may be
implemented in computer software, the functions within the
illustrative embodiments may alternatively be embodied in part or
in whole using hardware components such as Application Specific
Integrated Circuits (ASICs), Field Programmable Gate Arrays
(FPGAs), or other hardware, or in some combination of hardware
components and software components.
[0063] While specific embodiments of the present invention have
been shown and described, many modifications and variations could
be made by one skilled in the art without departing from the scope
of the invention. The above description serves to illustrate and
not limit the particular invention in any way.
* * * * *