Distributed and Deduplicating Data Storage System and Methods of Use Parab; Nitin ; et al. [Axcient, Inc.]

Patent Application Summary

U.S. patent application number 14/864850 was filed with the patent office on 2015-09-24 and published on 2017-03-30 for distributed and deduplicating data storage system and methods of use. The applicant listed for this patent is Axcient, Inc. Invention is credited to Aaron Brown, Sagar Dixit, Nitin Parab, Dane Van Dyck.

Application Number	20170090786 14/864850
Document ID	/
Family ID	58409284
Filed Date	2015-09-24

United States Patent Application	20170090786
Kind Code	A1
Parab; Nitin ; et al.	March 30, 2017

Distributed and Deduplicating Data Storage System and Methods of Use

Abstract

Systems and methods for distributing and deduplicating a data store are provided herein. An exemplary method for distributing and deduplicating stored data may include receiving an input data stream, segmenting the input data stream into chunks, creating a signature for each of the chunks, distributing each chunk to one of a plurality of containers, each container having a container identifier, and creating an index that includes a mapping of a chunk signature and a container identifier.

Inventors:

Parab; Nitin; (Palo Alto, CA) ; Brown; Aaron; (Sunnyvale, CA) ; Van Dyck; Dane; (Redwood City, CA) ; Dixit; Sagar; (Sunnyvale, CA)

Applicant:

Name	City	State	Country	Type
Axcient, Inc.	Mountain View	CA	US

Family ID:

58409284

Appl. No.:

14/864850

Filed:

September 24, 2015

Current U.S. Class:	1/1
Current CPC Class:	G06F 11/14 20130101; G06F 16/128 20190101; G06F 16/2358 20190101; G06F 3/067 20130101; H04L 29/0854 20130101; G06F 3/0641 20130101; G06F 11/1464 20130101; H04L 67/1095 20130101; G06F 11/1453 20130101; G06F 16/2365 20190101; G06F 16/162 20190101; G06F 11/1471 20130101; G06F 16/13 20190101; G06F 2201/84 20130101; G06F 11/1446 20130101; G06F 3/0619 20130101
International Class:	G06F 3/06 20060101 G06F003/06

Claims

1. A method, comprising: generating an input signature for at least a portion of an input data stream from a client, the input signature including a representation of data included in the input data stream; comparing the input signature to stored signatures of data included in a deduplicated backup data store; selecting a stored signature based upon the step of comparing the input signature to the stored signatures of data included in a deduplicated backup data store; comparing data associated with the selected stored signature to the at least a portion of the input data stream to determine unique data included in the at least a portion of the input data stream; and distributing the unique data to the deduplicated backup data store.

2. The method according to claim 1, wherein generating an input signature comprises applying a facial recognition algorithm to the at least a portion of an input data stream.

3. The method according to claim 1, wherein comparing data associated with the selected stored signature to the at least a portion of the input data stream comprises: performing an exact comparison between data of the selected stored signature to data of the at least a portion of the input data stream; ignoring data of the at least a portion of the input data stream that is an exact match to data of the selected stored signature; and storing in the deduplicated backup data store, data of the at least a portion of the input data stream that is not an exact match to data of the selected stored signature.

4. A method comprising: receiving an input data stream; segmenting the input data stream into chunks; creating a signature for each of the chunks; distributing each chunk to one of a plurality of containers, each container comprising a container identifier; and creating a locality index that includes a mapping of a chunk signature and a container identifier, wherein the chunk signatures and container identifiers for each of the chunks are related to one another because they were created from the input stream.

5. The method according to claim 4, further comprising creating a container index that includes an offset and a length for each chunk included in a container.

6. The method according to claim 4, wherein the signatures of the chunks each comprise a cryptographic hash value.

7. The method according to claim 5, further comprising: receiving a request for a file associated with one of the chunks; pre-fetching remaining ones of the chunks or associated files; and providing the requested file to the client.

8. The method according to claim 7, further comprising providing one or more of the remaining ones of the chunks or associated files from the pre-fetch when requested by the client.

9. A system, comprising: a processor; logic encoded in one or more tangible media for execution by the processor and when executed operable to perform operations comprising: generating an input signature for at least a portion of an input data stream from a client, the input signature including a representation of data included in the input data stream; comparing the input signature to stored signatures of data included in a deduplicated backup data store; selecting a stored signature based upon the step of comparing the input signature to the stored signatures of data included in a deduplicated backup data store; comparing data associated with the selected stored signature to the at least a portion of the input data stream to determine unique data included in the at least a portion of the input data stream; and distributing the unique data to the deduplicated backup data store.

10. The system according to claim 9, wherein the deduplicated backup data store resides within a cloud.

11. The system according to claim 9, wherein generating an input signature includes the processor further executing the logic to perform operations of applying a facial recognition algorithm to the at least a portion of an input data stream.

12. The system according to claim 9, wherein comparing data of the selected stored signature to the at least a portion of the input data stream includes the processor further executing the logic to perform operations of: performing an exact comparison between data of the selected stored signature to data of the at least a portion of the input data stream; ignoring data of the at least a portion of the input data stream that exactly matches data of the selected stored signature; and storing in the deduplicated backup data store, data of the at least a portion of the input data stream that does not exactly match data of the selected stored signature.

13. The system according to claim 9, wherein the processor further executes the logic to perform operations of: receiving the input data stream; segmenting the input data stream into chunks; creating an extent from sequential chunks; and hashing each chunk to create a signature, each signature comprising a hash value for data included in the chunk.

14. The system according to claim 13, wherein the processor further executes the logic to perform operations of: distributing unique chunks to the backup data store in proximity to one another; and creating an index that includes a location of each of the unique distributed chunks.

15. The system according to claim 14, wherein the processor further executes the logic to perform operations of: creating a distributed hash table link for the unique distributed chunks; and combining distributed hash table links into a localized distributed hash table.

16. The system according to claim 9, wherein the processor further executes the logic to perform operations of selecting a stored signature based upon information indicative of an object to which the input data stream belongs.

17. A method, comprising: receiving an input data stream; separating the input data stream into chunks; performing one or more of an exact and an approximate matching of the chunks of the input data stream to chunks stored in a deduplicated backup data store to determine unique chunks; determining one or more locations in the deduplicated backup data store for the unique chunks; updating an index to include the unique chunks with their locations; and distributing the unique chunks to the deduplicated backup data store according to the index.

18. A method, comprising: receiving a first input data stream at a first point in time, the first point in time being associated with a first file modification operation for a first set of files occurring on a client; segmenting the first input data stream into chunks; creating a signature for each of the chunks; distributing each chunk to one of a first plurality of containers, each container comprising a container identifier, the first plurality of containers being proximate to one another on a backup data store; creating a locality index that includes a mapping of a chunk signature and a container identifier, wherein the chunk signatures and container identifiers for each of the chunks are related to one another because they were created from the first input data stream; receiving a second input data stream at a second point in time, the second point in time being associated with a second file modification operation for a second set of files occurring on a client; segmenting the second input data stream into chunks; creating a signature for each of the chunks; distributing each chunk to one of a second plurality of containers, each container comprising a container identifier, the second plurality of containers being proximate to one another on a backup data store; and creating a locality index that includes a mapping of a chunk signature and a container identifier, wherein the chunk signatures and container identifiers for each of the chunks are related to one another because they were created from the second input data stream.

Description

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This non-provisional U.S. patent application is related to non-provisional U.S. patent application Ser. No. 13/889,164, filed on May 7, 2013, entitled "Cloud Storage Using Merkle Trees," which is hereby incorporated by reference herein in its entirety.

FIELD OF THE INVENTION

[0002] The present technology may be generally described as providing systems and methods for distributing and deduplicating data storage.

BACKGROUND

[0003] Creating large backup data stores that are efficient in terms of data storage and data retrieval is a complex process, especially for systems that store petabytes of data or greater. Additional complexities are introduced when these large backup data stores use deduplication, such as when only unique data blocks are stored. Additionally, backup data stores that use deduplication are not currently suitable for storing data using, for example, distributed hash tables ("DHT") as the DHT may destroy the locality of the data and the index used to track the data as it is distributed to the data store.

SUMMARY OF THE PRESENT TECHNOLOGY

[0004] According to some embodiments, the present technology may be directed to methods that comprise: (a) generating a signature for at least a portion of an input data stream, the signature including a representation of data included in the input data stream; (b) comparing the signature to signatures of data included in a deduplicated backup data store; (c) selecting a signature based upon the step of comparing the signature to signatures of data included in a deduplicated backup data store; (d) comparing data associated with the selected signature to the at least a portion of the input data stream to determine unique data included in the at least a portion of the input data stream; and (e) distributing the unique data to the deduplicated backup data store.

[0005] According to some embodiments, the present technology may be directed to methods that comprise: (a) receiving an input data stream; (b) segmenting the input data stream into chunks; (c) creating a signature for each of the chunks; (d) distributing each chunk to one of a plurality of containers, each container comprising a container identifier; and (e) creating a locality index that includes a mapping of a chunk signature and a container identifier.

[0006] According to some embodiments, the present technology may be directed to systems that comprise: (a) a processor; (b) logic encoded in one or more tangible media for execution by the processor and when executed operable to perform operations comprising: (i) generating a signature for at least a portion of an input data stream, the signature including a representation of data included in the input data stream; (ii) comparing the signature to signatures of data included in a deduplicated backup data store; (iii) selecting a signature based upon the step of comparing the signature to signatures of data included in a deduplicated backup data store; (iv) comparing data associated with the selected signature to the at least a portion of the input data stream to determine unique data included in the at least a portion of the input data stream; and (v) distributing the unique data to the deduplicated backup data store.

[0007] According to some embodiments, the present technology may be directed to a non-transitory machine-readable storage medium having embodied thereon a program. In some embodiments the program may be executed by a machine to perform a method. The method may comprise: (a) generating a signature for at least a portion of an input data stream, the signature including a representation of data included in the input data stream; (b) comparing the signature to signatures of data included in a deduplicated backup data store; (c) selecting a signature based upon the step of comparing the signature to signatures of data included in a deduplicated backup data store; (d) comparing data associated with the selected signature to the at least a portion of the input data stream to determine unique data included in the at least a portion of the input data stream; and (e) distributing the unique data to the deduplicated backup data store.

[0008] According to some embodiments, the present technology may be directed to methods that comprise: (a) receiving an input data stream; (b) separating the input data stream into chunks; (c) performing one or more of an exact and an approximate matching of the chunks of the input data stream to chunks stored in a deduplicated backup data store to determine unique chunks; (d) determining one or more locations in the deduplicated backup data store for the unique chunks; (e) updating an index to include the unique chunks with their locations; and (f) distributing the unique chunks to the deduplicated backup data store according to the index.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] Certain embodiments of the present technology are illustrated by the accompanying figures. It will be understood that the figures are not necessarily to scale and that details not necessary for an understanding of the technology or that render other details difficult to perceive may be omitted. It will be understood that the technology is not necessarily limited to the particular embodiments illustrated herein.

[0010] FIG. 1 is a block diagram of an exemplary architecture in which embodiments of the present technology may be practiced;

[0011] FIG. 2 is a flowchart of an exemplary method of exact matching of chunks of data to determine unique chunks;

[0012] FIG. 3 is a flowchart of an exemplary method for providing a distributed and deduplicated data store; and

[0013] FIG. 4 is a flowchart of an example method of the present technology.

[0014] FIG. 5 is another example method of the present technology for storing input streams from two separate file modification operations of a client.

[0015] FIG. 6 illustrates an exemplary computing system that may be used to implement embodiments according to the present technology.

DESCRIPTION OF EXEMPLARY EMBODIMENTS

[0016] While this technology is susceptible of embodiment in many different forms, there is shown in the drawings and will herein be described in detail several specific embodiments with the understanding that the present disclosure is to be considered as an exemplification of the principles of the technology and is not intended to limit the technology to the embodiments illustrated.

[0017] It will be understood that like or analogous elements and/or components, referred to herein, may be identified throughout the drawings with like reference characters. It will be further understood that several of the figures are merely schematic representations of the present technology. As such, some of the components may have been distorted from their actual scale for pictorial clarity.

[0018] Generally speaking, building large data storage systems that allow for efficient storage and retrieval of data is complex. In general, when data is received, it may be separated into chunks and the chunks may then be transmitted to a storage system. Some data storage systems create an index for all chunks that are received and distributed. A metadata server may maintain the indexes and perform operations on the chunks. Thus, a malfunction of the metadata server may result in a loss of the chunks stored in the storage system, either actual loss of the data or a loss in the ability to track the location of the data in the storage system.

[0019] Additionally, some storage systems may deduplicate block storage, where only unique data blocks are stored. This allows the system to reduce the overall amount of data blocks stored compared to systems that store complete data sets. When deduplication is not utilized, each backup (e.g., snapshot or mirror) taken of a physical system must be stored in order to allow the physical system to be restored back to a given point in time in the past, as described above.

[0020] While the use of distributed hash tables ("DHT") to store data is known, the use of DHTs is currently incompatible with systems that deduplicate data blocks. Advantageously, DHTs allow load balancing within storage systems, where chunks may be distributed into a data storage cloud. In one embodiment, each block of data may be hashed to form the index key for a DHT and the data itself is stored as the value of the key. The combination of data blocks and hash values are used to create a DHT. While the effectiveness of the methods and systems described herein may be advantageously leveraged within systems or processes that use DHTs, the present technology is not limited to these types of systems and processes. Thus, descriptions of DHTs included herein are merely provided as an exemplary use of the present technology.

[0021] While storage of data using DHTs can be effective in load balancing IO load across distributed nodes, unfortunately, when a DHT is used the temporal locality of the data is not maintained spatially on the disk. This is, in part, due to the fact that DHTs use the hash of the data to determine the location of the data and cryptographic hashes are by design random. For example, when multiple snapshots of a physical system are taken over time, random operations are performed on the snapshots when DHTs are used. These random operations are inefficient when compared to sequential operations. In short, DHTs are less than optimal for building deduplicated storage systems. That is, deduplicated storage systems rely on maintaining the temporal locality of the data spatially on the disk.
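The loss of locality described above can be illustrated with a minimal sketch. It assumes a hypothetical cluster of eight nodes and a simplified placement rule (SHA-1 digest modulo the node count) standing in for a real DHT; the names `dht_node` and `NUM_NODES` are illustrative only and do not appear in the present disclosure.

```python
import hashlib

NUM_NODES = 8  # hypothetical cluster size, chosen only for illustration


def dht_node(chunk: bytes, num_nodes: int = NUM_NODES) -> int:
    """Pick a node from the chunk's SHA-1 digest (simplified DHT placement)."""
    digest = hashlib.sha1(chunk).digest()
    return int.from_bytes(digest, "big") % num_nodes


# Sixteen chunks that arrive back to back in a single input stream.
chunks = [f"chunk-{i}".encode() for i in range(16)]

# Node chosen for each chunk, listed in arrival order.
placement = [dht_node(c) for c in chunks]
```

Because the digest carries no information about arrival order, neighbouring entries in `placement` are typically unrelated, so reading the stream back tends to require random rather than sequential operations.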

[0022] To be sure, as described herein, locality can be described in terms of temporality or space. For example, if a user modifies multiple files at the same time, it will be understood or assumed that the modification of these files is related to one another. By way of example, the user could be updating multiple spreadsheets within a given period of time. These spreadsheets may all be related to the same project or task that the user is working on. These file changes can be transmitted over the network efficiently in an input stream. The present technology will store these changes spatially together on the backup store, but their spatial proximity to one another on the backup store is due to their temporal adjacency relating to how they are used.

[0023] If these changes are stored in close spatial proximity on the backup store, context (the fact that they were modified together) is maintained. When the user requests this data from the backup store, the replication or retrieval process can be executed efficiently because all changes to the files were stored in close proximity to one another on the backup store. In contrast, a DHT may randomly distribute the changes to the files anywhere in the backup store, which increases data fragmentation and slows down retrieval.

[0024] In some embodiments, when one file is requested from the backup store, the backup store will automatically pre-fetch the files that were determined to be changed at the same time as the requested file. Again, this benefit is possible because temporal locality (context) is determined and maintained. Even if the user does not utilize the additional files, the likelihood that they may be utilized is sufficient to justify pre-fetching the files in anticipation of use. Advantageously, these processes greatly improve file retrieval and replication methods of backup stores.
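The pre-fetch behavior can be sketched as follows. This is a toy model under stated assumptions: files that were modified together are assumed to have been written to the same container, and the class `BackupStore`, its methods, and the file names are hypothetical.

```python
from collections import defaultdict


class BackupStore:
    """Toy backup store that groups files by the container they were written to."""

    def __init__(self):
        self.container_of = {}             # file name -> container id
        self.files_in = defaultdict(list)  # container id -> file names, in arrival order
        self.data = {}                     # file name -> content (stands in for disk)
        self.cache = {}                    # file name -> content (pre-fetched)

    def put(self, container_id, name, content):
        self.container_of[name] = container_id
        self.files_in[container_id].append(name)
        self.data[name] = content

    def get(self, name):
        if name not in self.cache:
            # Cache miss: read the whole container so that files modified
            # together with the requested one are pre-fetched as well.
            for sibling in self.files_in[self.container_of[name]]:
                self.cache[sibling] = self.data[sibling]
        return self.cache[name]


store = BackupStore()
store.put(1, "budget.xlsx", b"v2")    # modified in the same session...
store.put(1, "forecast.xlsx", b"v5")  # ...as this file
store.put(2, "photo.jpg", b"raw")     # unrelated, different container
first = store.get("budget.xlsx")
```

After the single `get` call, `forecast.xlsx` is already in the cache while `photo.jpg` is not, which mirrors the idea that only temporally related files are fetched in anticipation of use.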

[0025] The index created for the blocks of the changed files also maintains context and locality due to the manner in which it is created. The updates to the index occur temporally when changes are transferred to the backup store.

[0026] These and other advantages of the present technology will be discussed in greater detail herein.

[0027] Referring now to the drawings, and more particularly to FIG. 1, a schematic diagram of an exemplary architecture 100 for practicing the present invention is shown. Architecture 100 may include a deduplicated backup data store 105, hereinafter "data store 105." In some instances, the data store 105 may be implemented within a cloud-based computing environment. In general, a cloud-based computing environment is a resource that typically combines the computational power of a large grouping of processors and/or that combines the storage capacity of a large grouping of computer memories or storage devices. For example, systems that provide a cloud resource may be utilized exclusively by their owners; or such systems may be accessible to outside users who deploy applications within the computing infrastructure to obtain the benefit of large computational or storage resources.

[0028] The cloud may be formed, for example, by a network of servers, with each server (or at least a plurality thereof) providing processor and/or storage resources. These servers may manage workloads provided by multiple users (e.g., cloud resource consumers or other users). Typically, each user places workload demands upon the cloud that vary in real-time, sometimes dramatically. The nature and extent of these variations typically depend on the type of business associated with the user.

[0029] In some instances the data store 105 may include a block store 115 that stores unique blocks of data for one or more objects, such as a file, a group of files, or an entire disk. For example, the block store 115 may comprise a plurality of containers 120a-n, which are utilized to store data chunks that are separated from the input data stream, as will be described in greater detail below. The term "container" may also be referred to as an "extent."

[0030] In some instances, objects written to the block store 115 are immutable. When the present technology updates an existing object to generate a new object, a new object identifier may be generated and provided back to the object owner.

[0031] In some instances, the responsibility of implementing a traditional interface where object identifiers do not change on update is facilitated by the application/client. In other embodiments, the data store 105 may provide `mutable` metadata storage where the client/application can manage immutable objects which are mapped to mutable object identifiers and other application specific metadata.

[0032] According to some embodiments, the block store 115 may include immutable object addressable block storage. The block store 115 may form an underlying storage foundation that allows for the storing of blocks of objects. The identifiers of the blocks are a unique representation of the object, generated for example by using an SHA1 hash function. The present technology may also use other cryptographic hash functions that would be known to one of ordinary skill in the art with the present disclosure before them.

[0033] The architecture 100 may include a deduplication system, hereinafter referred to as system 125 that provides distributed and deduplicated data storage.

[0034] In some instances, the system 125 receives input data streams from a client device 130. For example, an input data stream may include a snapshot or an incremental file for the client device 130. The client device may include an end user computing system, an appliance, such as a backup appliance, a server, or any other computing device that may include objects such as files, directories, disks, and so forth.

[0035] In some instances, an application programming interface ("API") may encapsulate messages and their respective operations, allowing for efficient writing of objects over a network, such as network 135. In some instances, the network 135 may comprise a local area network ("LAN"), a wide area network ("WAN"), or any other private or public network, such as the Internet.

[0036] The system 125 may divide or separate an input data stream into a plurality of chunks, also referred to as blocks, segments, pieces, and so forth. Any method for separating the input data stream into chunks that would be known to one of ordinary skill in the art may likewise be utilized in accordance with the present technology. As each chunk is received (or created), it is passed to containers 120a-n, which may also be referred to as blobs. Containers 120a-n may be filled with chunks that are received sequentially around the same time, thus maintaining temporal locality as spatial locality within the same container. Additionally, each of the chunks may be encrypted or otherwise hashed so as to create a unique identifier for the chunk of data. For example, a chunk may be hashed using SHA1 to produce a SHA1 key value for the chunk. In some instances, the input data stream may arrive at the system 125 in an already-chunked manner. Optionally, each of the hashed chunk values may be incorporated by the system 125 into Merkle nodes and the Merkle nodes may be arranged into a Merkle tree at the data store 105.
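By way of a non-limiting illustration, the chunking and container-filling steps might be sketched as below. Fixed-size chunking is assumed purely for simplicity (the disclosure permits any chunking method), and `CHUNK_SIZE`, `CONTAINER_CAPACITY`, `Container`, `segment`, and `fill_containers` are hypothetical names with deliberately tiny values.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 4          # bytes per chunk; tiny so the example is easy to follow
CONTAINER_CAPACITY = 3  # chunks per container; also an illustrative value


@dataclass
class Container:
    container_id: int
    chunks: list = field(default_factory=list)  # (signature, data) in arrival order


def segment(stream: bytes, size: int = CHUNK_SIZE):
    """Split the stream into fixed-size chunks (the last one may be shorter)."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def fill_containers(stream: bytes):
    """Place chunks into containers in the order they appear in the stream."""
    containers = [Container(0)]
    for chunk in segment(stream):
        if len(containers[-1].chunks) >= CONTAINER_CAPACITY:
            containers.append(Container(len(containers)))  # open the next container
        signature = hashlib.sha1(chunk).hexdigest()         # SHA-1 key for the chunk
        containers[-1].chunks.append((signature, chunk))
    return containers


containers = fill_containers(b"abcdefghijklmnopqrst")
```

Chunks that are adjacent in the input stream end up adjacent in the same container (or in consecutive containers), so concatenating the stored chunks in container order reproduces the original stream.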

[0037] Additional details regarding the creation of Merkle trees and the transmission of data over a network using such Merkle trees can be found in co-pending non-provisional U.S. patent application Ser. No. 13/889,164, filed on May 7, 2013, entitled "Cloud Storage Using Merkle Trees," which is hereby incorporated by reference herein in its entirety.
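For orientation only, a generic Merkle root computation over chunk hashes is sketched below. This is a textbook construction and is not represented as the method of the referenced application; the function name `merkle_root` and the rule that an unpaired node is carried up unchanged are assumptions made for this example.

```python
import hashlib


def merkle_root(leaf_hashes):
    """Combine leaf digests pairwise until a single root digest remains."""
    if not leaf_hashes:
        raise ValueError("at least one leaf hash is required")
    level = list(leaf_hashes)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                # Parent node: hash of the two child digests concatenated.
                next_level.append(hashlib.sha1(pair[0] + pair[1]).digest())
            else:
                # Odd node out: carried up to the next level unchanged.
                next_level.append(pair[0])
        level = next_level
    return level[0]


leaves = [hashlib.sha1(c).digest() for c in (b"chunk-a", b"chunk-b", b"chunk-c")]
root = merkle_root(leaves)
```

A change to any single chunk changes its leaf digest and therefore the root, which is why comparing root values is a cheap first test for whether two sets of chunks differ.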

[0038] According to some embodiments, the system 125 may generate a signature for each extent using technologies other than cryptographic hashing functions. The signature is a representation of the data included in the extent. In some instances, to generate the signature, the system 125 may apply an algorithm that is similar to an algorithm used for facial recognition. For example, in facial recognition, a signature for a face of an individual included in an image file may be generated. This signature may be compared to facial signatures in other image files to determine if facial signatures included in these additional image files correspond to the facial signature of the individual. Thus, the "signature" is a mathematical representation of the unique facial features of the individual. These unique facial features convert into unique mathematical values that may be used to locate the individual in other image files.

[0039] Similarly, extents include data chunks that can be distinguished from other chunks on the basis of unique data features. A signature for an extent would include mathematical representations of these unique features such that comparing a signature for the extent to other signatures of other extents may allow for the system 125 to determine similar or dissimilar extents.
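The disclosure does not specify a particular similarity algorithm. As a stand-in, the sketch below summarizes an extent by the k smallest SHA-1 chunk hashes it contains and scores two extents by the overlap of those summaries. This is a crude, swapped-in technique chosen only to make the idea concrete; `extent_signature`, `similarity`, and the parameter `k` are hypothetical.

```python
import hashlib


def extent_signature(chunks, k=4):
    """Summarize an extent by the k smallest distinct chunk hashes it contains."""
    hashes = sorted({hashlib.sha1(c).hexdigest() for c in chunks})
    return frozenset(hashes[:k])


def similarity(sig_a, sig_b):
    """Overlap ratio of two signatures: 1.0 means identical, 0.0 means disjoint."""
    if not sig_a and not sig_b:
        return 1.0
    return len(sig_a & sig_b) / len(sig_a | sig_b)


old_extent = [b"c1", b"c2", b"c3"]
new_extent = [b"c1", b"c2", b"c4"]  # one chunk differs from the old extent
score = similarity(extent_signature(old_extent, 8), extent_signature(new_extent, 8))
```

With `k` at least as large as the number of chunks, the score is the exact fraction of shared chunk hashes (here two shared out of four distinct). Smaller values of `k` trade accuracy for a more compact signature.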

[0040] Because chunks are placed sequentially (in order received relative to the input stream) into containers 120a-n and each chunk is provided with a unique identifier, such as a hash value, locality of the chunks may be maintained. A locality index may be managed by the system 125 that maps each chunk to its corresponding container based upon the chunk identifier. Thus, locality of data chunks is a function of the order in which the chunks are received, as well as the chunk identifiers used to distinguish chunks from one another.
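A locality index of this kind might be sketched as a simple mapping from chunk signature to container identifier. The function name `build_locality_index`, the in-memory dictionary, and the fixed container capacity are assumptions made for illustration.

```python
import hashlib


def build_locality_index(chunks, container_capacity=3):
    """Map each unique chunk signature to the container that holds the chunk.

    Containers are filled in arrival order, so signatures that map to the same
    container identifier belong to chunks that arrived close together in time.
    """
    index = {}
    stored = 0  # number of unique chunks written so far
    for chunk in chunks:
        signature = hashlib.sha1(chunk).hexdigest()
        if signature in index:
            continue  # duplicate chunk: already stored, nothing new to index
        index[signature] = stored // container_capacity
        stored += 1
    return index


stream_chunks = [b"a", b"b", b"a", b"c", b"d"]
locality_index = build_locality_index(stream_chunks)
```

In this example the first three unique chunks share container 0 and the fourth opens container 1, while the repeated chunk `b"a"` adds no new entry.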

[0041] According to some embodiments, the locality index may comprise a sparse index when the locality index becomes too large and cumbersome to maintain in memory. For example, the sparse index may map only the chunk signature with a container identifier. Also, in some instances, the system 125 may split the locality index into chunks and these chunks may also be stored in the containers, along with the chunks created from the input stream.

[0042] In addition to the locality index, the system 125 may also manage a container index for each container that provides an exact or approximate location for each chunk within the container. For example, the index may specify the offset and length of each chunk within the container.
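As a minimal sketch, a container index that records the offset and length of each chunk could look like the following. The container is modeled as a single byte string and the names `pack_container` and `read_chunk` are hypothetical; chunks within one container are assumed to be unique.

```python
import hashlib


def pack_container(chunks):
    """Append chunks to one blob and record (offset, length) for each signature."""
    blob = bytearray()
    container_index = {}
    for chunk in chunks:
        signature = hashlib.sha1(chunk).hexdigest()
        container_index[signature] = (len(blob), len(chunk))  # where the chunk starts
        blob.extend(chunk)
    return bytes(blob), container_index


def read_chunk(blob, container_index, signature):
    """Slice a single chunk out of the container without scanning it."""
    offset, length = container_index[signature]
    return blob[offset:offset + length]


blob, container_index = pack_container([b"alpha", b"be", b"gamma!"])
middle = read_chunk(blob, container_index, hashlib.sha1(b"be").hexdigest())
```

Together with a locality index, a lookup then takes two steps: signature to container identifier, followed by signature to offset and length inside that container.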

[0043] In some instances, when the system 125 receives subsequent input data streams (e.g., subsequent snapshots) for the client device 130, the system may also separate the subsequent input streams into chunks and generate signatures for these chunks. When signatures for chunks of a subsequent input data stream are compared to signatures for chunks of a previous input data stream, differences deduced by the system 125 in these signatures may indicate that data in a particular chunk has changed. Thus, the system 125 may then obtain these changed chunks and store data from these changed chunks in the data store 105. The ability for the system 125 to recognize changed data allows the system 125 to store only unique data in the data store 105 (e.g., deduplicated data).
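The comparison between successive input data streams can be sketched as follows, assuming fixed-size chunks that are aligned at the same positions in both snapshots. The helper names `signatures` and `changed_chunks` and the four-byte chunk size are illustrative assumptions.

```python
import hashlib


def signatures(stream: bytes, size: int = 4):
    """Return the SHA-1 signature of each fixed-size chunk, in stream order."""
    return [hashlib.sha1(stream[i:i + size]).hexdigest()
            for i in range(0, len(stream), size)]


def changed_chunks(previous: bytes, current: bytes, size: int = 4):
    """Return positions of chunks in `current` that are new or differ from `previous`."""
    old = signatures(previous, size)
    new = signatures(current, size)
    return [i for i, sig in enumerate(new) if i >= len(old) or sig != old[i]]


first_snapshot = b"aaaabbbbcccc"
second_snapshot = b"aaaaXbbbcccc"  # one byte changed inside the second chunk
changed = changed_chunks(first_snapshot, second_snapshot)
```

Only the chunk positions listed in `changed` would need to be stored for the second snapshot. Data inserted mid-stream would shift every later fixed-size chunk, which is one reason other chunking methods may be preferred in practice.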

[0044] When comparing signatures and/or data between an input data stream and deduplicated data that is stored in the data store 105, the system 125 may employ either exact or approximated deduplication methods. In some instances, the system 125 may also use approximated deduplication methods initially, followed by a more robust exact matching deduplication method at a later time, as a means of verification.

[0045] With regard to approximate deduplication methods, the system 125 may compare the signature of an extent to signatures of similar extents stored in the data store 105. Any difference in signatures between similar extents for the same object, such as a file, indicates that the data of the object has changed.

[0046] In some instances, the system 125 may establish rules that allow the system 125 to quickly process input data streams to determine if unique data blocks exist in the input data stream. If the comparison between signatures indicates that the input data stream is not likely to include unique data, the system 125 may ignore the input data stream. Conversely, if the comparison between signatures indicates that the input data stream is likely to include unique data, the system 125 may further examine the input data stream to determine which chunks of data have changed.

[0047] For example, if the signature of an input data stream is determined by the system 125 to be sufficiently different from a signature of an extent for the same object stored in the data store 105, the system 125 may also process the input data stream using the exact deduplication method described below.

[0048] With regard to exact match deduplication methods, the system 125 may compare signatures of chunks of an input data stream to node signatures of similar chunks stored in the data store 105. The system 125 may check matches at the chunk or extent level using hash values associated with chunks. That is, each block or chunk of data included in an extent may be associated with its own signature or identifier. The chunk may include a unique hash value of the data included in a particular chunk of data. Any change in data of a chunk will change the hash value of the chunk. The system 125 can use the comparison of the signatures of the chunks to determine if data has changed in a chunk.

[0049] It will be understood that examining and comparing data streams at the block level via signature comparison allows exact matching, not simply because the comparison is being performed at a more granular level but also because any change in data for the same data block will produce different chunks having different hash values relative to one another.
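
The two-stage comparison described above (an approximate check at the extent level followed by an exact check at the chunk level) might be sketched as follows. This is an illustrative sketch only: it assumes that the extent signature is a SHA-1 hash of the whole extent and that chunks are fixed in size, neither of which is required by the description, and the function names are hypothetical.

```python
import hashlib


def sha1(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def split(extent: bytes, chunk_size: int) -> list:
    return [extent[i:i + chunk_size] for i in range(0, len(extent), chunk_size)]


def unique_chunks(incoming: bytes, stored: bytes, chunk_size: int = 4) -> list:
    """Return chunks of `incoming` that are not present in the `stored` extent."""
    # Approximate pass: a single signature comparison for the whole extent.
    if sha1(incoming) == sha1(stored):
        return []  # signatures match, so the extent can be ignored
    # Exact pass: compare per-chunk hash values to find what actually changed.
    stored_signatures = {sha1(chunk) for chunk in split(stored, chunk_size)}
    return [
        chunk for chunk in split(incoming, chunk_size)
        if sha1(chunk) not in stored_signatures
    ]


print(unique_chunks(b"AAAABBBB", b"AAAABBBB"))  # []
print(unique_chunks(b"AAAAZZZZ", b"AAAABBBB"))  # [b'ZZZZ']
```

The extent-level check is cheap because it needs one comparison; the chunk-level check runs only when that first comparison reports a difference.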

[0050] According to some embodiments, the system 125 may load the input data stream and selected data from the data store 105 into cache memory. Processing the input data stream and the selected data from the data store 105 in cache memory may allow for faster and more efficient data analysis by the system 125.

[0051] In some embodiments, the system 125 may utilize information indicative of the client device or object stored on the client device to "warm up" the data loaded into the cache. That is, instead of examining an entire input data stream, the system 125 may understand that the input data stream came from a particular customer or client device. Additionally, the system 125 may know that the input data stream refers to a particular object. Thus, the system 125 may not need to compare signatures for each block (e.g., chunk) of a client device to determine unique blocks. The system 125, in effect, narrows the comparison down to the most likely candidate chunks or extents stored in the data store 105. In some instances, the system 125 may select extents by comparing root (or head) signatures for a chunk of an input data stream to root (or head) signatures of extents stored in the data store 105. Extents that have matching signatures may be ignored as the blocks corresponding thereto are already present. This process is known as deduplication. That is, only unique data need be transmitted and stored after its identification.
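
As a rough illustration of the cache "warm up" described above, the following sketch narrows the stored extents to those belonging to the same client device and object before comparing root signatures. The record layout, the identifiers, and the placeholder signature strings are all hypothetical and are used only to make the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredExtent:
    client_id: str       # hypothetical identifier of the client device
    object_id: str       # hypothetical identifier of the object, such as a file name
    root_signature: str  # root (or head) signature of the extent


def warm_cache(store, client_id, object_id):
    """Select only the extents most likely to match the incoming stream."""
    return [
        extent for extent in store
        if extent.client_id == client_id and extent.object_id == object_id
    ]


def is_already_stored(incoming_root_signature, cached):
    """True when a cached extent has the same root signature as the incoming data."""
    return any(e.root_signature == incoming_root_signature for e in cached)


store = [
    StoredExtent("client-130", "report.xlsx", "sig-a"),
    StoredExtent("client-130", "notes.docx", "sig-b"),
    StoredExtent("client-131", "report.xlsx", "sig-c"),
]
cached = warm_cache(store, "client-130", "report.xlsx")
print(len(cached))                         # 1
print(is_already_stored("sig-a", cached))  # True
print(is_already_stored("sig-c", cached))  # False
```

In this sketch, an incoming extent whose root signature matches a cached extent would be ignored, while any other extent would be examined further.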

[0052] After unique blocks have been determined from the input data stream, the system 125 may determine an appropriate location for the unique block(s) in the data store 105 and update an index to include metadata indicative of a location of the unique block(s). The unique block(s) may then be distributed by the system 125 to the data store 105 according to the locations recorded in the index.

[0053] In some instances, the system 125 may store links to multiple containers into a single index. This single index may be referred to as a locality sensitive index. The locality sensitive index is an index that allows various local indices to be tied together into a single index, thus preserving the locality of the individual indices while allowing for interrelation of the same. Thus, the system 125 allows for the use of chunks while preserving the index and locality required for the deduplicated backup data store, as described in greater detail above.

[0054] FIG. 2 illustrates an exemplary method for maintaining locality of an input stream of data. The method may comprise an initial step 205 of receiving an input stream, such as a backup of a local machine. The method may comprise a step 210 of splitting the input stream into a plurality of chunks, according to any desired process. The method may comprise an optional step 215 of creating an identifier for each chunk. As mentioned above, this identifier may comprise a signature or a cryptographic hash value. As the input stream is chunked, the method may comprise a step 220 of placing each of the chunks into a container in a sequential manner.

[0055] Each container may be assigned a size and when the container is full, additional chunks may be placed into an open container. Thus, containers may be filled sequentially. As chunks are placed into containers, the method may include a step 225 of generating a locality index that maps the container in which a chunk is placed. Again, this locality is based on the temporal adjacency of the chunks in the input stream due to their association with a particular file modification process occurring on the client. In sum, chunk "locality" within the system is a function of both the order in which the chunk is received relative to the input stream, as well as a container location of the chunk after placement into a container. Locality preservation is enhanced by tracking chunks using their calculated, created, or assigned identifier. For example, a SHA1 key value for a chunk may be linked to the container in which the chunk has been placed.

[0056] Additionally, the method may comprise a step 230 of generating a container index that includes a location of the chunks within their respective containers. As mentioned previously, the container index may include an offset and a length for each chunk in the container.
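
The steps of FIG. 2 can be sketched as below. The sketch assumes byte-capacity containers, SHA-1 chunk identifiers, container identifiers that equal the container's position in a list, and chunks that each fit within a single container; these choices, and all names used, are illustrative assumptions rather than details drawn from the description.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: int
    capacity: int  # maximum number of bytes (hypothetical sizing rule)
    data: bytearray = field(default_factory=bytearray)
    # Container index: chunk signature -> (offset, length) inside this container.
    index: dict = field(default_factory=dict)


def place_chunks(chunks, capacity):
    """Fill containers sequentially; return (containers, locality_index)."""
    containers = [Container(0, capacity)]
    locality_index = {}  # chunk signature -> container identifier
    for chunk in chunks:
        signature = hashlib.sha1(chunk).hexdigest()
        if signature in locality_index:
            continue  # assumption: a repeated chunk is stored only once
        current = containers[-1]
        if len(current.data) + len(chunk) > capacity:
            # The open container is full, so open the next one in sequence.
            current = Container(len(containers), capacity)
            containers.append(current)
        current.index[signature] = (len(current.data), len(chunk))
        current.data.extend(chunk)
        locality_index[signature] = current.container_id
    return containers, locality_index


def read_chunk(signature, containers, locality_index):
    """Locate a chunk through the locality index, then the container index."""
    container = containers[locality_index[signature]]
    offset, length = container.index[signature]
    return bytes(container.data[offset:offset + length])


chunks = [b"aaaa", b"bbbb", b"cccc"]
containers, locality_index = place_chunks(chunks, capacity=8)
key = hashlib.sha1(b"cccc").hexdigest()
print(len(containers))                              # 2
print(locality_index[key])                          # 1
print(read_chunk(key, containers, locality_index))  # b'cccc'
```

Because chunks are appended in arrival order, chunks that were adjacent in the input stream end up adjacent in the same or the next container, which is the locality property the two indices are meant to preserve.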

[0057] FIG. 3 is a flowchart of an exemplary method for managing a deduplicated backup data store. The method may comprise a step 305 of storing an initial backup of a client device such as an end user computing system. The initial backup may comprise not only blocks of data but also associated Merkle nodes, which when combined with the blocks of data comprise a distributed hash table. Again, the Merkle node is a representation or hash value of the names of the individual data blocks that comprise the files of the client.
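
One possible reading of the Merkle node described above is sketched here: each data block is named by its hash value, and the node is a hash over the concatenated names. The use of SHA-1 and the simple concatenation are assumptions made for illustration; the description does not specify how the names are combined.

```python
import hashlib


def block_name(block: bytes) -> str:
    """Name a data block by its SHA-1 hash value (assumed naming scheme)."""
    return hashlib.sha1(block).hexdigest()


def merkle_node(child_names) -> str:
    """Hash the ordered names of the child blocks into a single node value."""
    return hashlib.sha1("".join(child_names).encode("ascii")).hexdigest()


original = [block_name(b"block-1"), block_name(b"block-2")]
modified = [block_name(b"block-1"), block_name(b"block-2 edited")]

print(merkle_node(original) == merkle_node(list(original)))  # True
print(merkle_node(original) == merkle_node(modified))        # False
```

A change to any block changes that block's name and therefore the node value, so comparing node values is enough to tell whether any of the underlying blocks differ.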

[0058] The method may then comprise a step 310 of receiving an input data stream from the client device. In some embodiments, the method may separate the input data stream into chunks in step 315. Once separated into chunks the method may then include a step 320 of hashing the chunks to create a key to index the data block. According to some embodiments, the index may include not only the hashes of data blocks, but also hashes of Merkle nodes. As mentioned previously, sequential chunks may be combined into an extent to maintain their temporal relatedness (which enables and enhances pre-fetching as needed). The extent itself may also be hashed.

[0059] In some instances, the method may include a step 325 of approximating deduplication of the chunks (or extent) by generating a signature for the input data stream. This signature may be compared against the signatures of other extents stored in the deduplicated backup data store. Again, the comparison of signatures may be performed at the chunk level or alternatively at the extent level.

[0060] Next, the method may comprise a step 330 of selecting a signature based upon the step of comparing the signature to signatures of extents. After selection of a signature, the method may comprise a step 335 of comparing data associated with the selected signature to the at least a portion of the input data stream to determine unique blocks included in the at least a portion of the input data stream. This delineation between unique and non-unique data chunks is used in deduplicating the input data stream to ensure that only unique chunks (e.g., changed data) are stored in the deduplicated backup data store.

[0061] In some instances, the method may comprise a step 340 of updating an index to reflect the inclusion of the new unique chunks in the deduplicated backup data store. The index provides a location of the unique blocks, which have been distributed to the deduplicated backup data store in a step 345. According to some embodiments, step 345 may also include a plurality of DHTs which are linked together using a locality sensitive index that preserves locality and index of each DHT.

[0062] Referring now to FIG. 4, an example method for storing an input data stream in a de-duplicated manner is illustrated. For context, the input data stream is created when a user performs a file modification process to one or more files. For example, the user may edit several spreadsheets at the same time (or in close temporal proximity, such as within a few seconds or minutes of one another). To be sure, the plurality of files need not be the same type. For example, the user can edit a spreadsheet and a word processing document together. The changes to these files would be assembled and streamed as an input data stream. In other embodiments, as illustrated in FIG. 4, the input data stream can be checked against the stored signature for the client to determine what parts of the input data stream need be stored in the backup store.

[0063] The input data stream can be transmitted as the file modifications occur or only after a signature comparison has been completed. For example, a prior signature of a backup for the client may have been taken at an earlier point in time. A comparison of a new signature for the client against the old signature stored on the file replication store (e.g., backup store) would indicate that the files were modified. The changed data would then be transmitted over the network to the file replication store.

[0064] Once the input data stream is received, the method of FIG. 4 is executed.

[0065] The method includes a step of generating 405 an input signature for at least a portion of an input data stream from a client. To be sure, the input signature is a representation of data included in the input data stream.

[0066] The method also includes a step of comparing 410 the input signature to stored signatures of data included in a deduplicated backup data store. This process allows the system to find the signature of the client that was previously stored on the backup store.

[0067] The method includes the system selecting 415 a stored signature based upon the step of comparing the input signature to the stored signatures of data included in a deduplicated backup data store.

[0068] To ensure that only changed data that has not already been stored on the backup data store is transmitted to the backup data store, the method includes comparing 420 data associated with the selected stored signature to the at least a portion of the input data stream to determine unique data included in the at least a portion of the input data stream.

[0069] Next, the method includes distributing the unique data to the deduplicated backup data store. Advantageously, only the unique data that has not been stored previously is transmitted over the network to the backup data store. This method provides a network optimization technique, ensuring that only new, unique data is transmitted over the network for any given backup or replication procedure.
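
The steps 405 through 420 of FIG. 4 might be sketched as follows. The description does not state how a stored signature is selected, so this sketch makes a loud assumption: a signature is modeled as the set of SHA-1 chunk hashes, and the stored signature sharing the most hashes with the input signature is selected. The function names and the backup labels are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional, Set


def chunk_map(data: bytes, chunk_size: int = 4) -> Dict[str, bytes]:
    """Map each chunk's SHA-1 digest to the chunk, preserving stream order."""
    return {
        hashlib.sha1(data[i:i + chunk_size]).hexdigest(): data[i:i + chunk_size]
        for i in range(0, len(data), chunk_size)
    }


def select_stored_signature(
    input_signature: Set[str], stored: Dict[str, Set[str]]
) -> Optional[str]:
    """Pick the stored signature with the largest overlap (assumed selection rule)."""
    best_name, best_overlap = None, 0
    for name, signature in stored.items():
        overlap = len(input_signature & signature)
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name


def unique_data(stream: bytes, stored: Dict[str, Set[str]]) -> List[bytes]:
    """Return only the chunks that the selected stored signature does not cover."""
    chunks = chunk_map(stream)                                # generate input signature
    selected = select_stored_signature(set(chunks), stored)   # compare and select
    known = stored[selected] if selected is not None else set()
    return [chunk for sig, chunk in chunks.items() if sig not in known]


stored = {
    "backup-1": set(chunk_map(b"AAAABBBB")),
    "backup-2": set(chunk_map(b"QQQQ")),
}
print(unique_data(b"AAAABBBBCCCC", stored))  # [b'CCCC']
```

Only the returned chunks would be transmitted over the network to the backup data store; the chunks already covered by the selected signature stay on the client.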

[0070] As mentioned above, input data streams are transmitted to the backup data store only when a file modification process occurs on the client. Thus, as each file modification process occurs at the client, a new input data stream is created and transmitted for storage.

[0071] FIG. 5 illustrates an example method for storing input data streams of multiple file modification operations that occur on a client. For purposes of this example, a first file modification process occurs at a first point in time. This first file modification process occurs for a first set of files. At a second point in time, a second file modification process occurs for a second set of files. Temporal context and locality can be maintained for each of these file modification processes by storing the data in the input data streams in their own extents (e.g., containers).

[0072] Thus, the method can begin with a step of receiving 505 a first input data stream at a first point in time. The first point in time is associated with a first file modification operation for a first set of files occurring on a client. Next, the method includes segmenting 510 the first input data stream into chunks, as well as creating 515 a signature for each of the chunks. Indeed, this could include creating a SHA1 hash value, as an example.

[0073] Next, the method includes distributing 520 each chunk to one of a first plurality of containers. Each container comprises a container identifier and the first plurality of containers is proximate to one another on a backup data store. Thus, the temporal locality of the chunks in the input data stream is represented as spatial locality on the backup data store.

[0074] Next, the method includes creating 525 a locality index that includes a mapping of a chunk signature and a container identifier. To be sure, the chunk signatures and container identifiers for each of the chunks are related to one another because they were created from the first input data stream.

[0075] After this process is complete, a second file modification process occurs on the client. Thus, a second de-duplicating replication process for this new file modification process ensues.

[0076] The method includes receiving 530 a second input data stream at a second point in time. The second and first points in time are different from one another because they are associated with different file modification processes.

[0077] To be sure, the second point in time is associated with a second file modification operation for a second set of files occurring on a client. Next, the method includes segmenting 535 the second input data stream into chunks, and creating 540 a signature for each of the chunks.

[0078] Next, the method comprises distributing 545 each chunk to one of a second plurality of containers. As mentioned above, each container comprises a container identifier. The second plurality of containers is proximate to one another on a backup data store for ease of retrieval and pre-fetching as described above.

[0079] The method also includes creating 550 a locality index that includes a mapping of a chunk signature and a container identifier. Again, the chunk signatures and container identifiers for each of the chunks are related to one another because they were created from the second input data stream.
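
The method of FIG. 5 can be sketched with a small in-memory store in which each input data stream is written to its own run of newly opened containers. The class name, the fixed chunk size, the per-container chunk limit, and the rule that an already-indexed chunk is skipped are all hypothetical choices made for illustration.

```python
import hashlib


class BackupStore:
    """Toy backup store: one new run of containers per input data stream."""

    def __init__(self, chunk_size: int = 4, chunks_per_container: int = 2):
        self.chunk_size = chunk_size
        self.chunks_per_container = chunks_per_container
        self.containers = []      # container identifier = position in this list
        self.locality_index = {}  # chunk signature -> container identifier

    def ingest(self, stream: bytes) -> list:
        """Store one stream and return the identifiers of the containers it used."""
        used = []
        current = None  # never reuse a container left open by an earlier stream
        for i in range(0, len(stream), self.chunk_size):
            chunk = stream[i:i + self.chunk_size]
            signature = hashlib.sha1(chunk).hexdigest()
            if signature in self.locality_index:
                continue  # chunk is already stored, so it is not stored again
            if (current is None
                    or len(self.containers[current]) >= self.chunks_per_container):
                self.containers.append([])
                current = len(self.containers) - 1
                used.append(current)
            self.containers[current].append(chunk)
            self.locality_index[signature] = current
        return used


store = BackupStore()
first = store.ingest(b"AAAABBBBCCCC")   # first file modification process
second = store.ingest(b"DDDDAAAAEEEE")  # second file modification process
print(first)   # [0, 1]
print(second)  # [2]
print(store.locality_index[hashlib.sha1(b"EEEE").hexdigest()])  # 2
```

Because the second stream starts in a fresh container, the chunks of each file modification process remain adjacent to one another, and the shared locality index still resolves any chunk signature to its container identifier.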

[0080] FIG. 6 illustrates an exemplary computing system 600 that may be used to implement an embodiment of the present technology. The computing system 600 of FIG. 6 includes one or more processors 610 and main memory 620. Main memory 620 stores, in part, instructions and data for execution by processor 610. Main memory 620 can store the executable code when the system 600 is in operation. The system 600 of FIG. 6 may further include a mass storage device 630, portable storage device(s) 640, output devices 650, user input devices 660, a graphics display 670, and other peripheral devices 680. The system 600 may also comprise network storage 645.

[0081] The components shown in FIG. 6 are depicted as being connected via a single bus 690. The components may be connected through one or more data transport means. Processor unit 610 and main memory 620 may be connected via a local microprocessor bus, and the mass storage device 630, peripheral device(s) 680, portable storage device 640, and graphics display 670 may be connected via one or more input/output (I/O) buses.

[0082] Mass storage device 630, which may be implemented with a magnetic disk drive or an optical disk drive, is a non-volatile storage device for storing data and instructions for use by processor unit 610. Mass storage device 630 can store the system software for implementing embodiments of the present technology for purposes of loading that software into main memory 620.

[0083] Portable storage device 640 operates in conjunction with a portable non-volatile storage medium, such as a floppy disk, compact disk or digital video disc, to input and output data and code to and from the computing system 600 of FIG. 6. The system software for implementing embodiments of the present technology may be stored on such a portable medium and input to the computing system 600 via the portable storage device 640.

[0084] Input devices 660 provide a portion of a user interface. Input devices 660 may include an alphanumeric keypad, such as a keyboard, for inputting alphanumeric and other information, or a pointing device, such as a mouse, a trackball, stylus, or cursor direction keys. Additionally, the system 600 as shown in FIG. 6 includes output devices 650. Suitable output devices include speakers, printers, network interfaces, and monitors.

[0085] Graphics display 670 may include a liquid crystal display (LCD) or other suitable display device. Graphics display 670 receives textual and graphical information, and processes the information for output to the display device.

[0086] Peripherals 680 may include any type of computer support device to add additional functionality to the computing system. Peripheral device(s) 680 may include a modem or a router.

[0087] The components contained in the computing system 600 of FIG. 6 are those typically found in computing systems that may be suitable for use with embodiments of the present technology and are intended to represent a broad category of such computer components that are well known in the art. Thus, the computing system 600 of FIG. 6 can be a personal computer, hand held computing system, telephone, mobile computing system, workstation, server, minicomputer, mainframe computer, or any other computing system. The computer can also include different bus configurations, networked platforms, multi-processor platforms, etc. Various operating systems can be used including UNIX, Linux, Windows, Macintosh OS, Palm OS, and other suitable operating systems.

[0088] Some of the above-described functions may be composed of instructions that are stored on storage media (e.g., computer-readable medium). The instructions may be retrieved and executed by the processor. Some examples of storage media are memory devices, tapes, disks, and the like. The instructions are operational when executed by the processor to direct the processor to operate in accord with the technology. Those skilled in the art are familiar with instructions, processor(s), and storage media.

[0089] It is noteworthy that any hardware platform suitable for performing the processing described herein is suitable for use with the technology. The terms "computer-readable storage medium" and "computer-readable storage media" as used herein refer to any medium or media that participate in providing instructions to a CPU for execution. Such media can take many forms, including, but not limited to, non-volatile media, volatile media and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as a fixed disk. Volatile media include dynamic memory, such as system RAM. Transmission media include coaxial cables, copper wire and fiber optics, among others, including the wires that comprise one embodiment of a bus. Transmission media can also take the form of acoustic or light waves, such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, digital video disk (DVD), any other optical medium, any other physical medium with patterns of marks or holes, a RAM, a PROM, an EPROM, an EEPROM, a FLASH EPROM, any other memory chip or data exchange adapter, a carrier wave, or any other medium from which a computer can read.

[0090] Various forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to a CPU for execution. A bus carries the data to system RAM, from which a CPU retrieves and executes the instructions. The instructions received by system RAM can optionally be stored on a fixed disk either before or after execution by a CPU.

[0091] Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

[0092] The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. Exemplary embodiments were chosen and described in order to best explain the principles of the present technology and its practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

[0093] Aspects of the present invention are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

[0094] These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

[0095] The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

[0096] The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

[0097] While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. The descriptions are not intended to limit the scope of the technology to the particular forms set forth herein. Thus, the breadth and scope of a preferred embodiment should not be limited by any of the above-described exemplary embodiments. It should be understood that the above description is illustrative and not restrictive. To the contrary, the present descriptions are intended to cover such alternatives, modifications, and equivalents as may be included within the spirit and scope of the technology as defined by the appended claims and otherwise appreciated by one of ordinary skill in the art. The scope of the technology should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims along with their full scope of equivalents.

* * * * *