U.S. patent application number 14/864850 was filed with the patent office on 2015-09-24 and published on 2017-03-30 for distributed and deduplicating data storage system and methods of use.
The applicant listed for this patent is Axcient, Inc. Invention is credited to Aaron Brown, Sagar Dixit, Nitin Parab, Dane Van Dyck.
Application Number | 20170090786 14/864850 |
Family ID | 58409284 |
Filed Date | 2015-09-24 |
United States Patent Application | 20170090786
Kind Code | A1
Parab; Nitin; et al. | March 30, 2017
Distributed and Deduplicating Data Storage System and Methods of Use
Abstract
Systems and methods for distributing and deduplicating a data
store are provided herein. An exemplary method for distributing and
deduplicating stored data may include receiving an input data
stream, segmenting the input data stream into chunks, creating a
signature for each of the chunks, distributing each chunk to one of
a plurality of containers, each container having a container
identifier, and creating an index that includes a mapping of a
chunk signature and a container identifier.
Inventors: Parab; Nitin (Palo Alto, CA); Brown; Aaron (Sunnyvale,
CA); Van Dyck; Dane (Redwood City, CA); Dixit; Sagar (Sunnyvale, CA)
Applicant:
Name | City | State | Country | Type
Axcient, Inc. | Mountain View | CA | US |
Family ID: 58409284
Appl. No.: 14/864850
Filed: September 24, 2015
Current U.S. Class: 1/1
Current CPC Class: G06F 11/14 20130101;
G06F 16/128 20190101; G06F 16/2358 20190101; G06F 3/067 20130101;
H04L 29/0854 20130101; G06F 3/0641 20130101; G06F 11/1464 20130101;
H04L 67/1095 20130101; G06F 11/1453 20130101; G06F 16/2365
20190101; G06F 16/162 20190101; G06F 11/1471 20130101; G06F 16/13
20190101; G06F 2201/84 20130101; G06F 11/1446 20130101; G06F 3/0619
20130101
International Class: G06F 3/06 20060101 G06F003/06
Claims
1. A method, comprising: generating an input signature for at least
a portion of an input data stream from a client, the input
signature including a representation of data included in the input
data stream; comparing the input signature to stored signatures of
data included in a deduplicated backup data store; selecting a
stored signature based upon the step of comparing the input
signature to the stored signatures of data included in a
deduplicated backup data store; comparing data associated with the
selected stored signature to the at least a portion of the input
data stream to determine unique data included in the at least a
portion of the input data stream; and distributing the unique data
to the deduplicated backup data store.
2. The method according to claim 1, wherein generating an input
signature comprises applying a facial recognition algorithm to the
at least a portion of an input data stream.
3. The method according to claim 1, wherein comparing data
associated with the selected stored signature to the at least a
portion of the input data stream comprises: performing an exact
comparison between data of the selected stored signature to data of
the at least a portion of the input data stream; ignoring data of
the at least a portion of the input data stream that is an exact
match to data of the selected stored signature; and storing in the
deduplicated backup data store, data of the at least a portion of
the input data stream that is not an exact match to data of the
selected stored signature.
4. A method comprising: receiving an input data stream; segmenting
the input data stream into chunks; creating a signature for each of
the chunks; distributing each chunk to one of a plurality of
containers, each container comprising a container identifier; and
creating a locality index that includes a mapping of a chunk
signature and a container identifier, wherein the chunk signatures
and container identifiers for each of the chunks are related to one
another because they were created from the input stream.
5. The method according to claim 4, further comprising creating a
container index that includes an offset and a length for each chunk
included in a container.
6. The method according to claim 4, wherein the signatures of the
chunks each comprise a cryptographic hash value.
7. The method according to claim 5, further comprising: receiving a
request for a file associated with one of the chunks; pre-fetching
remaining ones of the chunks or associated files; and providing the
requested file to the client.
8. The method according to claim 7, further comprising providing
one or more of the remaining ones of the chunks or associated files
from the pre-fetch when requested by the client.
9. A system, comprising: a processor; logic encoded in one or more
tangible media for execution by the processor and when executed
operable to perform operations comprising: generating an input
signature for at least a portion of an input data stream from a
client, the input signature including a representation of data
included in the input data stream; comparing the input signature to
stored signatures of data included in a deduplicated backup data
store; selecting a stored signature based upon the step of
comparing the input signature to the stored signatures of data
included in a deduplicated backup data store; comparing data
associated with the selected stored signature to the at least a
portion of the input data stream to determine unique data included
in the at least a portion of the input data stream; and
distributing the unique data to the deduplicated backup data
store.
10. The system according to claim 9, wherein the deduplicated
backup data store resides within a cloud.
11. The system according to claim 9, wherein generating an input
signature includes the processor further executing the logic to
perform operations of applying a facial recognition algorithm to
the at least a portion of an input data stream.
12. The system according to claim 9, wherein comparing data of the
selected stored signature to the at least a portion of the input
data stream includes the processor further executing the logic to
perform operations of: performing an exact comparison
between data of the selected stored signature to data of the at
least a portion of the input data stream; ignoring data of the at
least a portion of the input data stream that exactly matches
data of the selected stored signature; and storing in the
deduplicated backup data store, data of the at least a portion of
the input data stream that does not exactly match data of the
selected stored signature.
13. The system according to claim 9, wherein the processor further
executes the logic to perform operations of: receiving the input
data stream; segmenting the input data stream into chunks; creating
an extent from sequential chunks; and hashing each chunk to create
a signature, each signature comprising a hash value for data
included in the chunk.
14. The system according to claim 13, wherein the processor further
executes the logic to perform operations of: distributing unique
chunks to the backup data store in proximity to one another; and
creating an index that includes a location of each of the unique
distributed chunks.
15. The system according to claim 14, wherein the processor further
executes the logic to perform operations of: creating a distributed
hash table link for the unique distributed chunks; and combining
distributed hash table links into a localized distributed hash
table.
16. The system according to claim 9, wherein the processor further
executes the logic to perform operations of selecting a stored
signature based upon information indicative of an object to which
the input data stream belongs.
17. A method, comprising: receiving an input data stream;
separating the input data stream into chunks; performing one or
more of an exact and an approximate matching of the chunks of the
input data stream to chunks stored in a deduplicated backup data
store to determine unique chunks; determining one or more locations
in the deduplicated backup data store for the unique chunks;
updating an index to include the unique chunks with their
locations; and distributing the unique chunks to the deduplicated
backup data store according to the index.
18. A method, comprising: receiving a first input data stream at a
first point in time, the first point in time being associated with
a first file modification operation for a first set of files
occurring on a client; segmenting the first input data stream into
chunks; creating a signature for each of the chunks; distributing
each chunk to one of a first plurality of containers, each
container comprising a container identifier, the first plurality of
containers being proximate to one another on a backup data store;
creating a locality index that includes a mapping of a chunk
signature and a container identifier, wherein the chunk signatures
and container identifiers for each of the chunks are related to one
another because they were created from the first input data stream;
receiving a second input data stream at a second point in time, the
second point in time being associated with a second file
modification operation for a second set of files occurring on a
client; segmenting the second input data stream into chunks;
creating a signature for each of the chunks; distributing each
chunk to one of a second plurality of containers, each container
comprising a container identifier, the second plurality of
containers being proximate to one another on a backup data store;
and creating a locality index that includes a mapping of a chunk
signature and a container identifier, wherein the chunk signatures
and container identifiers for each of the chunks are related to one
another because they were created from the second input data
stream.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This non-provisional U.S. patent application is related to
non-provisional U.S. patent application Ser. No. 13/889,164, filed
on May 7, 2013, entitled "Cloud Storage Using Merkle Trees," which
is hereby incorporated by reference herein in its entirety.
FIELD OF THE INVENTION
[0002] The present technology may be generally described as
providing systems and methods for distributing and deduplicating
data storage.
BACKGROUND
[0003] Creating large backup data stores that are efficient in
terms of data storage and data retrieval is a complex process,
especially for systems that store petabytes of data or greater.
Additional complexities are introduced when these large backup data
stores use deduplication, such as when only unique data blocks are
stored. Additionally, backup data stores that use deduplication are
not currently suitable for storing data using, for example,
distributed hash tables ("DHT") as the DHT may destroy the locality
of the data and the index used to track the data as it is
distributed to the data store.
SUMMARY OF THE PRESENT TECHNOLOGY
[0004] According to some embodiments, the present technology may be
directed to methods that comprise: (a) generating a signature for
at least a portion of an input data stream, the signature including
a representation of data included in the input data stream; (b)
comparing the signature to signatures of data included in a
deduplicated backup data store; (c) selecting a signature based
upon the step of comparing the signature to signatures of data
included in a deduplicated backup data store; (d) comparing data
associated with the selected signature to the at least a portion of
the input data stream to determine unique data included in the at
least a portion of the input data stream; and (e) distributing the
unique data to the deduplicated backup data store.
[0005] According to some embodiments, the present technology may be
directed to methods that comprise: (a) receiving an input data
stream; (b) segmenting the input data stream into chunks; (c)
creating a signature for each of the chunks; (d) distributing each
chunk to one of a plurality of containers, each container
comprising a container identifier; and (e) creating a locality
index that includes a mapping of a chunk signature and a container
identifier.
[0006] According to some embodiments, the present technology may be
directed to systems that comprise: (a) a processor; (b) logic
encoded in one or more tangible media for execution by the
processor and when executed operable to perform operations
comprising: (i) generating a signature for at least a portion of an
input data stream, the signature including a representation of data
included in the input data stream; (ii) comparing the signature to
signatures of data included in a deduplicated backup data store;
(iii) selecting a signature based upon the step of comparing the
signature to signatures of data included in a deduplicated backup
data store; (iv) comparing data associated with the selected
signature to the at least a portion of the input data stream to
determine unique data included in the at least a portion of the
input data stream; and (v) distributing the unique data to the
deduplicated backup data store.
[0007] According to some embodiments, the present technology may be
directed to a non-transitory machine-readable storage medium having
embodied thereon a program. In some embodiments the program may be
executed by a machine to perform a method. The method may comprise:
(a) generating a signature for at least a portion of an input data
stream, the signature including a representation of data included
in the input data stream; (b) comparing the signature to signatures
of data included in a deduplicated backup data store; (c) selecting
a signature based upon the step of comparing the signature to
signatures of data included in a deduplicated backup data store;
(d) comparing data associated with the selected signature to the at
least a portion of the input data stream to determine unique data
included in the at least a portion of the input data stream; and
(e) distributing the unique data to the deduplicated backup data
store.
[0008] According to some embodiments, the present technology may be
directed to methods that comprise: (a) receiving an input data
stream; (b) separating the input data stream into chunks; (c)
performing one or more of an exact and an approximate matching of
the chunks of the input data stream to chunks stored in a
deduplicated backup data store to determine unique chunks; (d)
determining one or more locations in the deduplicated backup data
store for the unique chunks; (e) updating an index to include the
unique chunks with their locations; and (f) distributing the unique
chunks to the deduplicated backup data store according to the
index.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] Certain embodiments of the present technology are
illustrated by the accompanying figures. It will be understood that
the figures are not necessarily to scale and that details not
necessary for an understanding of the technology or that render
other details difficult to perceive may be omitted. It will be
understood that the technology is not necessarily limited to the
particular embodiments illustrated herein.
[0010] FIG. 1 is a block diagram of an exemplary architecture in
which embodiments of the present technology may be practiced;
[0011] FIG. 2 is a flowchart of an exemplary method of exact
matching of chunks of data to determine unique chunks;
[0012] FIG. 3 is a flowchart of an exemplary method for providing a
distributed and deduplicated data store; and
[0013] FIG. 4 is a flowchart of an example method of the present
technology.
[0014] FIG. 5 is another example method of the present technology
for storing input streams from two separate file modification
operations of a client.
[0015] FIG. 6 illustrates an exemplary computing system that may be
used to implement embodiments according to the present
technology.
DESCRIPTION OF EXEMPLARY EMBODIMENTS
[0016] While this technology is susceptible of embodiment in many
different forms, there is shown in the drawings and will herein be
described in detail several specific embodiments with the
understanding that the present disclosure is to be considered as an
exemplification of the principles of the technology and is not
intended to limit the technology to the embodiments
illustrated.
[0017] It will be understood that like or analogous elements and/or
components, referred to herein, may be identified throughout the
drawings with like reference characters. It will be further
understood that several of the figures are merely schematic
representations of the present technology. As such, some of the
components may have been distorted from their actual scale for
pictorial clarity.
[0018] Generally speaking, building large data storage systems that
allow for efficient storage and retrieval of data is complex. In
general, when data is received, it may be separated into chunks and
the chunks may then be transmitted to a storage system. In some
systems these data storage systems create an index for all chunks
that are received and distributed. A metadata server may maintain
the indexes and perform operations on the chunks. Thus, a
malfunction of the metadata server may result in a loss of the
chunks stored in the storage system, either actual loss of the data
or a loss in the ability to track the location of the data in the
storage system.
[0019] Additionally, some storage systems may deduplicate block
storage, where only unique data blocks are stored. This allows the
system to reduce the overall amount of data blocks stored compared
to systems that store complete data sets. When deduplication is not
utilized, each backup (e.g., snapshot or mirror) taken of a
physical system must be stored in order to allow the physical
system to be restored back to a given point in time in the past, as
described above.
[0020] While the use of distributed hash tables ("DHT") to store
data is known, the use of DHTs is currently incompatible with
systems that deduplicate data blocks. Advantageously, DHTs allow
load balancing within storage systems, where chunks may be
distributed into a data storage cloud. In one embodiment, each
block of data may be hashed to form the index key for a DHT and the
data itself is stored as the value of the key. The combination of
data blocks and hash values is used to create a DHT. While the
effectiveness of the methods and systems described herein may be
advantageously leveraged within systems or processes that use DHTs,
the present technology is not limited to these types of systems and
processes. Thus, descriptions of DHTs included herein are merely
provided as an exemplary use of the present technology.
[0021] While storage of data using DHTs can be effective in load
balancing IO load across distributed nodes, unfortunately, when a
DHT is used the temporal locality of the data is not maintained
spatially on the disk. This is, in part, due to the fact that DHTs
use the hash of the data to determine the location of the data and
cryptographic hashes are by design random. For example, when
multiple snapshots of a physical system are taken over time, random
operations are performed on the snapshots when DHTs are used. These
random operations are inefficient when compared to sequential
operations. In short, DHTs are less than optimal for building
deduplicated storage systems. That is, deduplicated storage systems
rely on maintaining the temporal locality of the data spatially on
the disk.
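By way of illustration only, the loss of locality described above
can be sketched as follows. The sketch assumes a hypothetical
placement rule (the SHA-1 digest of a chunk, taken modulo the number
of nodes); the function name and the node count are not part of the
present disclosure.

```python
import hashlib

def dht_node_for(chunk: bytes, node_count: int) -> int:
    """Pick a node from the SHA-1 digest of the chunk (assumed placement rule)."""
    digest = hashlib.sha1(chunk).digest()
    return int.from_bytes(digest, "big") % node_count

# Eight chunks that arrived back to back in a single input stream.
chunks = [f"chunk-{i}".encode() for i in range(8)]

# The node for each chunk depends only on its hash, not on arrival order.
placement = [dht_node_for(c, node_count=4) for c in chunks]
print(placement)
```

Because the node is derived solely from the hash, chunks that were
adjacent in the input stream will generally not be adjacent in
storage.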
[0022] To be sure, as described herein, locality can be described
in terms of temporality or space. For example, if a user modifies
multiple files at the same time, it will be understood or assumed
that the modifications to these files are related to one another. By
way of example, the user could be updating multiple spreadsheets
within a given period of time. These spreadsheets may all be
related to the same project or task that the user is working on.
These file changes can be transmitted over the network efficiently
in an input stream. The present technology will store these changes
spatially together on the backup store, but their spatial proximity
to one another on the backup store is due to their temporal
adjacency relating to how they are used.
[0023] If these changes are stored in close spatial proximity on
the backup store, context (the fact that they were modified
together) is maintained. When the user requests this data from the
backup store, the replication or retrieval process can be executed
efficiently because all changes to the files were stored in close
proximity to one another on the backup store. In contrast, a DHT
may randomly distribute the changes to the files anywhere in the
backup store, which increases data fragmentation and slows down
retrieval.
[0024] In some embodiments, when one file is requested from the
backup store, the backup store will automatically pre-fetch the
files that were determined to be changed at the same time as the
requested file. Again, this benefit is possible because temporal
locality (context) is determined and maintained. Even if the user
does not utilize the additional files, the likelihood that they may
be utilized is sufficient to justify pre-fetching the files in
anticipation of use. Advantageously, these processes greatly
improve file retrieval and replication methods of backup
stores.
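A minimal sketch of such pre-fetching is given below, assuming that
files changed together were written to the same container. The
class name, method names, container identifiers, and file names are
hypothetical and serve only to illustrate the idea.

```python
class BackupStore:
    """Toy backup store that pre-fetches files sharing a container."""

    def __init__(self):
        self.container_of = {}  # file name -> container identifier
        self.files_in = {}      # container identifier -> {file name: data}
        self.cache = {}         # pre-fetched files, keyed by file name

    def write(self, container_id, name, data):
        self.container_of[name] = container_id
        self.files_in.setdefault(container_id, {})[name] = data

    def read(self, name):
        if name in self.cache:
            return self.cache[name]
        container_id = self.container_of[name]
        # Pre-fetch every file stored alongside the requested file.
        self.cache.update(self.files_in[container_id])
        return self.cache[name]

store = BackupStore()
store.write("c7", "budget.xlsx", b"v2")    # changed together ...
store.write("c7", "forecast.xlsx", b"v5")  # ... so stored together
store.write("c9", "notes.txt", b"v1")      # changed at another time
store.read("budget.xlsx")
```

After the single read, the file that shares a container with the
requested file is already cached, whereas the file stored in a
different container is not.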
[0025] The index created for the blocks of the changed files also
maintains context and locality due to the manner in which it is
created. The updates to the index occur temporally when changes are
transferred to the backup store.
[0026] These and other advantages of the present technology will be
discussed in greater detail herein.
[0027] Referring now to the drawings, and more particularly to
FIG. 1, a schematic diagram of an exemplary architecture 100 for
practicing the present invention is shown. Architecture
100 may include a deduplicated backup data store 105; hereinafter
"data store 105." In some instances, the data store 105 may be
implemented within a cloud-based computing environment. In general,
a cloud-based computing environment is a resource that typically
combines the computational power of a large group of processors
and/or that combines the storage capacity of a large group of
computer memories or storage devices. For example, systems that
provide a cloud resource may be utilized exclusively by their
owners; or such systems may be accessible to outside users who
deploy applications within the computing infrastructure to obtain
the benefit of large computational or storage resources.
[0028] The cloud may be formed, for example, by a network of
servers, with each server (or at least a plurality thereof)
providing processor and/or storage resources. These servers may
manage workloads provided by multiple users (e.g., cloud resource
consumers or other users). Typically, each user places workload
demands upon the cloud that vary in real-time, sometimes
dramatically. The nature and extent of these variations typically
depend on the type of business associated with the user.
[0029] In some instances the data store 105 may include a block
store 115 that stores unique blocks of data for one or more
objects, such as a file, a group of files, or an entire disk. For
example, the block store 115 may comprise a plurality of containers
120a-n, which are utilized to store data chunks that are separated
from the input data stream, as will be described in greater detail
below. The term "container" may also be referred to as an
"extent."
[0030] In some instances, objects written to the block store 115
are immutable. When the present technology updates an existing
object to generate a new object, a new object identifier may be
generated and provided back to the object owner.
[0031] In some instances, the responsibility of implementing a
traditional interface where object identifiers do not change on
update is facilitated by the application/client. In other
embodiments, the data store 105 may provide `mutable` metadata
storage where the client/application can manage immutable objects
which are mapped to mutable object identifiers and other
application specific metadata.
[0032] According to some embodiments, the block store 115 may
include immutable object addressable block storage. The block store
115 may form an underlying storage foundation that allows for the
storing of blocks of objects. The identifiers of the blocks are a
unique representation of the object, generated for example by using
an SHA1 hash function. The present technology may also use other
cryptographic hash functions that would be known to one of ordinary
skill in the art with the present disclosure before them.
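The immutable, object-addressable behavior described above can be
sketched as follows. The sketch assumes that the identifier of an
object is the SHA-1 digest of its contents; the class and method
names are hypothetical.

```python
import hashlib

class BlockStore:
    """Toy immutable store: the identifier is derived from the content."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        object_id = hashlib.sha1(data).hexdigest()
        # Existing objects are never overwritten; identical data is kept once.
        self._objects.setdefault(object_id, data)
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]

store = BlockStore()
old_id = store.put(b"report v1")
# "Updating" an object writes a new object and yields a new identifier.
new_id = store.put(b"report v1" + b" plus edits")
```

In this sketch the original object remains retrievable under its
original identifier, and the owner receives a different identifier
for the updated object.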
[0033] The architecture 100 may include a deduplication system,
hereinafter referred to as system 125 that provides distributed and
deduplicated data storage.
[0034] In some instances, the system 125 receives input data
streams from a client device 130. For example, an input data stream
may include a snapshot or an incremental file for the client device
130. The client device may include an end user computing system, an
appliance, such as a backup appliance, a server, or any other
computing device that may include objects such as files,
directories, disks, and so forth.
[0035] In some instances an application programming interface
("API") may encapsulate messages and their
respective operations, allowing for efficient writing of objects
over a network, such as network 135. In some instances, the network
135 may comprise a local area network ("LAN"), a wide area network
("WAN"), or any other private or public network, such as the
Internet.
[0036] The system 125 may divide or separate an input data stream
into a plurality of chunks, also referred to as blocks, segments,
pieces, and so forth. Any method for separating the input data
stream into chunks that would be known to one of ordinary skill in
the art may also likewise be utilized in accordance with the
present technology. As each chunk is received (or created), the
chunks are passed to containers 120a-n, which may also be referred
to as blobs. Containers 120a-n may be filled with chunks that are
received sequentially around the same time, thus preserving
temporal locality as spatial locality within the same container.
Additionally, each of the chunks may be encrypted or otherwise
hashed so as to create a unique identifier for the chunk of data.
For example, a chunk may be hashed using SHA1 to produce a SHA1 key
value for the chunk. In some instances, the input data stream may
arrive at the system 125 in an already-chunked manner. Optionally,
each of the hashed chunk values may be incorporated by the system
125 into Merkle nodes and the Merkle nodes may be arranged into a
Merkle tree at the data store 105.
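By way of non-limiting example, the segmenting, hashing, and
sequential container filling described above may be sketched as
follows. Fixed-size chunking, the chunk size, the container
capacity, and integer container identifiers are assumptions made
only to keep the example short.

```python
import hashlib

CHUNK_SIZE = 4          # bytes per chunk (assumed, unrealistically small)
CONTAINER_CAPACITY = 3  # chunks per container (assumed)

def segment(stream, size=CHUNK_SIZE):
    """Split the input stream into fixed-size chunks, in arrival order."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]

def fill_containers(chunks, capacity=CONTAINER_CAPACITY):
    """Place chunks into containers sequentially, so neighbors stay together."""
    containers = {}
    for position, chunk in enumerate(chunks):
        container_id = position // capacity
        signature = hashlib.sha1(chunk).hexdigest()
        containers.setdefault(container_id, []).append((signature, chunk))
    return containers

stream = b"abcdefghijklmnopqrst"  # 20 bytes, i.e. five chunks
containers = fill_containers(segment(stream))
```

Here the first three chunks share container 0 and the remaining two
share container 1, so the arrival order of the chunks is reflected
in where they are stored.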
[0037] Additional details regarding the creation of Merkle trees and
the transmission of data over a network using such Merkle trees can
be found in co-pending non-provisional U.S. patent application Ser.
No. 13/889,164, filed on May 7, 2013, entitled "Cloud Storage Using
Merkle Trees," which is hereby incorporated by reference herein in
its entirety.
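For orientation only, a generic Merkle root computation over chunk
hashes is sketched below. It does not reproduce the construction of
the referenced application; the pairing rule (duplicating the last
node of an odd-sized level) and the function names are assumptions.

```python
import hashlib

def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def merkle_root(chunks):
    """Hash each chunk, then hash pairs of nodes until one root remains."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [sha1(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumed rule for odd-sized levels
        level = [sha1(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

root_before = merkle_root([b"a", b"b", b"c"])
root_after = merkle_root([b"a", b"b", b"d"])  # one chunk changed
```

A change to any single chunk changes the root, which is what allows
two trees to be compared from the top down.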
[0038] According to some embodiments, the system 125 may generate a
signature for each extent using other technologies than
cryptographic hashing functions. The signature is a representation
of the data included in the extent. In some instances, to generate
the signature, the system 125 may apply an algorithm that is
similar to an algorithm used for facial recognition. For example,
in facial recognition, a signature for a face of an individual
included in an image file may be generated. This signature may be
compared to facial signatures in other image files to determine if
facial signatures included in these additional image files correspond
to the facial signature of the individual. Thus, the "signature" is
a mathematical representation of the unique facial features of the
individual. These unique facial features convert into unique
mathematical values that may be used to locate the individual in
other image files.
[0039] Similarly, extents include data chunks that can be
distinguished from other chunks on the basis of unique data
features. A signature for an extent would include mathematical
representations of these unique features such that comparing a
signature for the extent to other signatures of other extents may
allow for the system 125 to determine similar or dissimilar
extents.
[0040] Because chunks are placed sequentially (in order received
relative to the input stream) into containers 120a-n and each chunk
is provided with a unique identifier, such as a hash value,
locality of the chunks may be maintained. A locality index may be
managed by the system 125 that maps each chunk to its corresponding
container based upon the chunk identifier. Thus, locality of data
chunks is a function of the order in which the chunks are received,
as well as the chunk identifiers used to distinguish chunks from
one another.
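A locality index of this kind can be sketched as a mapping from
chunk signature to container identifier. The sketch below assumes
SHA-1 signatures and string container identifiers; the function
names are hypothetical.

```python
import hashlib

def build_locality_index(containers):
    """Map each chunk signature to the container that holds the chunk."""
    index = {}
    for container_id, chunks in containers.items():
        for chunk in chunks:
            signature = hashlib.sha1(chunk).hexdigest()
            # Keep the first container seen for a given signature.
            index.setdefault(signature, container_id)
    return index

containers = {"c0": [b"alpha", b"beta"], "c1": [b"gamma"]}
locality_index = build_locality_index(containers)

def locate(chunk):
    """Return the container identifier for a chunk, or None if unknown."""
    return locality_index.get(hashlib.sha1(chunk).hexdigest())
```

A lookup for a chunk that was never stored returns None, which in a
deduplicating system would mark the chunk as unique.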
[0041] According to some embodiments, the locality index may
comprise a sparse index when the locality index becomes too large
and cumbersome to maintain in memory. For example, the sparse index
may map only the chunk signature with a container identifier. Also,
in some instances, the system 125 may split the locality index into
chunks and these chunks may also be stored in the containers, along
with the chunks created from the input stream.
[0042] In addition to the locality index, the system 125 may also
manage a container index for each container that provides an exact
or approximate location for each chunk within the container. For
example, the index may specify the offset and length of each chunk
within the container.
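The container index can be illustrated with the following sketch,
which assumes that a container is a simple concatenation of its
chunks and that each chunk within the container is unique. The
function names are hypothetical.

```python
import hashlib

def pack_container(chunks):
    """Concatenate chunks and record (offset, length) per chunk signature."""
    payload = bytearray()
    container_index = {}
    for chunk in chunks:
        signature = hashlib.sha1(chunk).hexdigest()
        container_index[signature] = (len(payload), len(chunk))
        payload += chunk
    return bytes(payload), container_index

def read_chunk(payload, container_index, signature):
    """Slice one chunk out of the container using its offset and length."""
    offset, length = container_index[signature]
    return payload[offset:offset + length]

payload, container_index = pack_container([b"hello", b"world!", b"abc"])
wanted = hashlib.sha1(b"world!").hexdigest()
print(read_chunk(payload, container_index, wanted))  # b'world!'
```

Together with the locality index, this gives a two-step lookup: the
signature identifies the container, and the container index gives
the position of the chunk inside it.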
[0043] In some instances, when the system 125 receives subsequent
input data streams (e.g., subsequent snapshots) for the client
device 130, the system may also separate the subsequent input
streams into chunks and generate signatures for these chunks. When
signatures for chunks of a subsequent input data stream are
compared to signatures for chunks of a previous input data stream,
differences deduced by the system 125 in these signatures may
indicate that data in a particular chunk has changed. Thus, the
system 125 may then obtain these changed chunks and store data from
these changed chunks in the data store 105. The ability for the
system 125 to recognize changed data allows the system 125 to store
only unique data in the data store 105 (e.g., deduplicated
data).
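The comparison of a subsequent input data stream against a previous
one can be sketched as follows, assuming fixed-size chunks and SHA-1
signatures. The function names and the sample data are hypothetical.

```python
import hashlib

def signatures(stream, size=4):
    """Return (signature, chunk) pairs for fixed-size chunks of a stream."""
    chunks = [stream[i:i + size] for i in range(0, len(stream), size)]
    return [(hashlib.sha1(c).hexdigest(), c) for c in chunks]

def changed_chunks(previous, subsequent, size=4):
    """Return chunks of the subsequent stream whose signatures are new."""
    known = {sig for sig, _ in signatures(previous, size)}
    return [c for sig, c in signatures(subsequent, size) if sig not in known]

first_snapshot = b"AAAABBBBCCCC"
second_snapshot = b"AAAAXXXXCCCC"
print(changed_chunks(first_snapshot, second_snapshot))  # [b'XXXX']
```

Only the chunk whose signature is absent from the earlier stream
would need to be written to the data store.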
[0044] When comparing signatures and/or data between an input data
stream and deduplicated data that is stored in the data store 105,
the system 125 may employ either exact or approximated
deduplication methods. In some instances, the system 125 may also
use approximated deduplication methods initially, followed by a
more robust exact matching deduplication method at a later time, as
a means of verification.
[0045] With regard to approximate deduplication methods, the system
125 may compare the signature of an extent to signatures of similar
extents stored in the data store 105. Any difference in signatures
between similar extents for the same object, such as a file,
indicates that the data of the object has changed.
[0046] In some instances, the system 125 may establish rules that
allow the system 125 to quickly process input data streams to
determine if unique data blocks exist in the input data stream. If
the comparison between signatures indicates that the input data
stream is not likely to include unique data, the system 125 may
ignore the input data stream. Conversely, if the comparison between
signatures indicates that the input data stream is likely to
include unique data, the system 125 may further examine the input
data stream to determine which chunks of data have changed.
[0047] For example, if the signature of an input data stream is
determined by the system 125 to be sufficiently different from a
signature of an extent for the same object stored in the data store
105, the system 125 may also process the input data stream using
the exact deduplication method described below.
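The approximate screening step can be sketched as shown below. The
present disclosure does not specify how the approximate signature is
computed; a byte-frequency histogram is used here purely as a
stand-in, and the distance measure, the threshold, and all function
names are assumptions.

```python
from collections import Counter

def extent_signature(extent: bytes) -> Counter:
    """Stand-in approximate signature: a histogram of byte values."""
    return Counter(extent)

def signature_distance(a: Counter, b: Counter) -> float:
    """Fraction of byte counts that differ between two signatures."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    differing = sum(abs(a[k] - b[k]) for k in set(a) | set(b))
    return differing / total

def needs_exact_pass(incoming: bytes, stored: bytes,
                     threshold: float = 0.0) -> bool:
    """True when the signatures differ enough to justify exact matching."""
    distance = signature_distance(extent_signature(incoming),
                                  extent_signature(stored))
    return distance > threshold

print(needs_exact_pass(b"aaaabbbb", b"aaaabbbb"))  # False
print(needs_exact_pass(b"aaaabbbb", b"aaaacccc"))  # True
```

Note that this stand-in signature cannot distinguish data that has
merely been reordered, which is one reason an exact matching pass
may still be run later as verification.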
[0048] With regard to exact match deduplication methods, the system
125 may compare signatures of chunks of an input data stream to
node signatures of similar chunks stored in the data store 105. The
system 125 may check matches at the chunk or extent level using
hash values associated with chunks. That is, each block or chunk of
data included in an extent may be associated with its own signature
or identifier. The chunk may include a unique hash value of the
data included in a particular chunk of data. Any change in data of
a chunk will change the hash value of the chunk. The system 125 can
use the comparison of the signatures of the chunks to determine if
data has changed in a chunk.
[0049] It will be understood that examining and comparing data
streams at the block level via signature comparison allows exact
matching, not simply because the comparison is being performed at a
more granular level but also because any change in data for the
same data block will produce different chunks having different hash
values relative to one another.
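As a minimal sketch of exact matching at the chunk level, the
following example treats the data store as a set of chunk
signatures and keeps only those incoming chunks whose signature is
not yet present. The use of SHA-1 and the names chunk_signature,
stored, incoming, and unique are assumptions made for illustration.

```python
import hashlib


def chunk_signature(chunk: bytes) -> str:
    """Hash value that identifies the data of one chunk."""
    return hashlib.sha1(chunk).hexdigest()


# Signatures of chunks already held in the (hypothetical) data store.
stored = {chunk_signature(c) for c in (b"alpha", b"bravo", b"charlie")}

# The second chunk differs from the stored "bravo" by a single byte.
incoming = [b"alpha", b"bravO", b"charlie"]

# Only chunks with an unseen signature are unique and need to be stored.
unique = [c for c in incoming if chunk_signature(c) not in stored]
```

Because a change to any byte of a chunk yields a different hash
value, the modified chunk is the only one reported as unique.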
[0050] According to some embodiments, the system 125 may load the
input data stream and selected data from the data store 105 into
cache memory. Processing the input data stream and selected data
from the data store 105 in cache memory may allow for faster and
more efficient data analysis by the system 125.
[0051] In some embodiments, the system 125 may utilize information
indicative of the client device or object stored on the client
device to "warm up" the data loaded into the cache. That is,
instead of examining an entire input data stream, the system 125
may understand that the input data stream came from a particular
customer or client device. Additionally, the system 125 may know
that the input data stream refers to a particular object. Thus, the
system 125 may not need to compare signatures for each block (e.g.,
chunk) of a client device to determine unique blocks. The system
125, in effect, narrows the comparison down to the most likely
candidate chunks or extents stored in the data store 105. In some
instances, the system 125 may select extents by comparing root (or
head) signatures for a chunk of an input data stream to root (or
head) signatures of extents stored in the data store 105. Extents
that have matching signatures may be ignored as the blocks
corresponding thereto are already present. This process is known as
deduplication. That is, only unique data need be transmitted and
stored after its identification.
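One possible way to narrow the comparison, offered only as a sketch,
is to load into the cache just those extents that belong to the
same client and object and to key them by root signature. The
Extent class, its fields, and the functions warm_cache and
is_duplicate are hypothetical names introduced for this example, and
the root signature is assumed to be a SHA-1 hash of the extent data.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    client_id: str
    object_name: str
    data: bytes

    @property
    def root_signature(self) -> str:
        # Assumption: the root (head) signature is a hash of the extent data.
        return hashlib.sha1(self.data).hexdigest()


def warm_cache(store: list[Extent], client_id: str, object_name: str) -> dict[str, Extent]:
    """Load only the extents for one client and object, keyed by root signature."""
    return {
        e.root_signature: e
        for e in store
        if e.client_id == client_id and e.object_name == object_name
    }


def is_duplicate(cache: dict[str, Extent], incoming: bytes) -> bool:
    """True when the incoming data matches the root signature of a cached extent."""
    return hashlib.sha1(incoming).hexdigest() in cache
```

With a cache warmed this way, an incoming extent is compared against
a small set of likely candidates rather than against every extent
in the data store.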
[0052] After unique blocks have been determined from the input data
stream, the system 125 may determine an appropriate location for
the unique block(s) in the data store 105 and update an index to
include metadata indicative of a location of the unique block(s).
The unique block(s) may then be distributed by the system 125 to
the data store 105 according to the locations recorded in the
index.
[0053] In some instances, the system 125 may store links to
multiple containers into a single index. This single index may be
referred to as a locality sensitive index. The locality sensitive
index is an index that allows various local indices to be tied
together into a single index, thus preserving the locality of the
individual indices while allowing for interrelation of the same.
Thus, the system 125 allows for the use of chunks while preserving
the index and locality required for the deduplicated backup data
store, as described in greater detail above.
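The following sketch shows one way such a single index could tie
local indices together. It assumes that each container keeps a local
index mapping a chunk signature to an offset and a length, and that
the combined index records which container holds each signature. The
class name LocalitySensitiveIndex and its methods are hypothetical.

```python
from typing import Optional

LocalIndex = dict[str, tuple[int, int]]  # signature -> (offset, length)


class LocalitySensitiveIndex:
    """Links per-container indices without flattening them into one table."""

    def __init__(self) -> None:
        self._containers: dict[str, LocalIndex] = {}   # container id -> local index
        self._sig_to_container: dict[str, str] = {}    # signature -> container id

    def link(self, container_id: str, local_index: LocalIndex) -> None:
        """Attach a local index; the first container seen for a signature wins."""
        self._containers[container_id] = local_index
        for sig in local_index:
            self._sig_to_container.setdefault(sig, container_id)

    def locate(self, sig: str) -> Optional[tuple[str, int, int]]:
        """Return (container id, offset, length) for a signature, if known."""
        cid = self._sig_to_container.get(sig)
        if cid is None:
            return None
        offset, length = self._containers[cid][sig]
        return cid, offset, length

    def neighbours(self, sig: str) -> list[str]:
        """Signatures stored in the same container, e.g. for pre-fetching."""
        cid = self._sig_to_container.get(sig)
        return list(self._containers[cid]) if cid is not None else []
```

Keeping each local index intact is what allows a lookup for one
chunk to also reveal the chunks stored alongside it.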
[0054] FIG. 2 illustrates an exemplary method for maintaining
locality of an input stream of data. The method may comprise an
initial step 205 of receiving an input stream, such as a backup of
a local machine. The method may comprise a step 210 of splitting
the input stream into a plurality of chunks, according to any
desired process. The method may comprise an optional step 215 of
creating an identifier for each chunk. As mentioned above, this
identifier may comprise a signature or a cryptographic hash value.
As the input stream is chunked, the method may comprise a step 220
of placing each of the chunks into a container in a sequential
manner.
[0055] Each container may be assigned a size and when the container
is full, additional chunks may be placed into an open container.
Thus, containers may be filled sequentially. As chunks are placed
into containers, the method may include a step 225 of generating a
locality index that maps each chunk to the container in which it is
placed.
Again, this locality is based on the temporal adjacency of the
chunks in the input stream due to their association with a
particular file modification process occurring on the client. In
sum, chunk "locality" within the system is a function of both the
order in which the chunk is received relative to the input stream,
as well as a container location of the chunk after placement into a
container. Locality preservation is enhanced by tracking chunks
using their calculated, created, or assigned identifier. For
example, a SHA1 key value for a chunk may be linked to the
container in which the chunk has been placed.
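Steps 210 through 225 can be sketched as follows, under the
assumptions that chunks have a fixed size, that container capacity
is counted in chunks, and that containers are numbered in the order
in which they are opened. The function name place_chunks and its
parameters are hypothetical.

```python
import hashlib


def place_chunks(stream: bytes, chunk_size: int, container_capacity: int):
    """Fill containers sequentially and map each chunk's SHA1 key to its container."""
    containers: list[list[bytes]] = [[]]
    locality_index: dict[str, int] = {}
    for start in range(0, len(stream), chunk_size):
        chunk = stream[start:start + chunk_size]
        # When the current container is full, open the next one.
        if len(containers[-1]) >= container_capacity:
            containers.append([])
        containers[-1].append(chunk)
        key = hashlib.sha1(chunk).hexdigest()
        # Record the first container in which this chunk was placed.
        locality_index.setdefault(key, len(containers) - 1)
    return containers, locality_index
```

Because chunks are placed in arrival order, chunks that were
adjacent in the input stream end up in the same or neighbouring
containers.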
[0056] Additionally, the method may comprise a step 230 of
generating a container index that includes a location of the chunks
within their respective containers. As mentioned previously, the
container index may include an offset and a length for each chunk
in the container.
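A container index of this kind can be sketched as a mapping from a
chunk signature to an (offset, length) pair within the container
payload. The functions build_container and read_chunk are
hypothetical names, and SHA-1 is assumed as the signature.

```python
import hashlib


def build_container(chunks: list[bytes]) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Concatenate chunks into one payload and index each by offset and length."""
    payload = bytearray()
    index: dict[str, tuple[int, int]] = {}
    for chunk in chunks:
        sig = hashlib.sha1(chunk).hexdigest()
        if sig in index:
            continue  # identical chunk already present in this container
        index[sig] = (len(payload), len(chunk))
        payload += chunk
    return bytes(payload), index


def read_chunk(payload: bytes, index: dict[str, tuple[int, int]], sig: str) -> bytes:
    """Retrieve one chunk from the payload using its recorded offset and length."""
    offset, length = index[sig]
    return payload[offset:offset + length]
```

The offset and length allow a single chunk to be read back without
scanning the rest of the container.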
[0057] FIG. 3 is a flowchart of an exemplary method for managing a
deduplicated backup data store. The method may comprise a step 305
of storing an initial backup of a client device such as an end user
computing system. The initial backup may comprise not only blocks
of data but also associated Merkle nodes, which when combined with
the blocks of data comprise a distributed hash table. Again, the
Merkle node is a representation or hash value of the names of the
individual data blocks that comprise the files of the client.
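As an illustrative sketch only, a Merkle node may be modelled as the
hash of the ordered block names of a file, where each block is named
by the hash of its data and both blocks and nodes are kept in one
hash table. The functions block_name, merkle_node, and backup_file,
and the use of a Python dictionary as the table, are assumptions of
this example.

```python
import hashlib


def block_name(block: bytes) -> str:
    """Name a data block by the hash of its contents."""
    return hashlib.sha1(block).hexdigest()


def merkle_node(block_names: list[str]) -> str:
    """Hash of the ordered block names that make up a file."""
    return hashlib.sha1("".join(block_names).encode("ascii")).hexdigest()


def backup_file(blocks: list[bytes], table: dict[str, object]) -> str:
    """Store blocks and their Merkle node in `table`; return the node's key."""
    names = []
    for block in blocks:
        name = block_name(block)
        table[name] = block      # block data, keyed by its name
        names.append(name)
    node = merkle_node(names)
    table[node] = names          # the node lists the names of its blocks
    return node
```

Since the node is derived from the block names, a change to any
block produces a different node, which makes a changed file
detectable from the node alone.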
[0058] The method may then comprise a step 310 of receiving an
input data stream from the client device. In some embodiments, the
method may separate the input data stream into chunks in step 315.
Once separated into chunks the method may then include a step 320
of hashing the chunks to create a key to index the data block.
According to some embodiments, the index may include not only the
hashes of data blocks, but also hashes of Merkle nodes. As
mentioned previously, sequential chunks may be combined into an
extent to maintain their temporal relatedness (which enables and
enhances pre-fetching as needed). The extent itself may also be
hashed.
[0059] In some instances, the method may include a step 325 of
approximating deduplication of the chunks (or extent) by generating
a signature for the input data stream. This signature may be
compared against the signatures of other extents stored in the
deduplicated backup data store. Again, the comparison of signatures
may be performed at the chunk level or alternatively at the extent
level.
[0060] Next, the method may comprise a step 330 of selecting a
signature based upon the step of comparing the signature to
signatures of extents. After selection of a signature, the method
may comprise a step 335 of comparing data associated with the
selected signature to the at least a portion of the input data
stream to determine unique blocks included in the at least a
portion of the input data stream. This delineation between unique
and non-unique data chunks is used in deduplicating the input data
stream to ensure that only unique chunks (e.g., changed data) are
stored in deduplicated backup data store.
[0061] In some instances, the method may comprise a step 340 of
updating an index to reflect the inclusion of the new unique chunks
in the deduplicated backup data store. The index provides a
location of the unique blocks, which have been distributed to the
deduplicated backup data store in a step 345. According to some
embodiments, step 345 may also include linking a plurality of DHTs
together using a locality sensitive index that preserves the
locality and index of each DHT.
[0062] Referring now to FIG. 4, an example method for storing an
input data stream in a de-duplicated manner is illustrated. For
context, the input data stream is created when a user performs a
file modification process to one or more files. For example, the
user may edit several spreadsheets at the same time (or in close
temporal proximity, such as within a few seconds or minutes of one
another). To be sure, the plurality of files need not be the same
type. For example, the user can edit a spreadsheet and a word
processing document together. The changes to these files would be
assembled and streamed as an input data stream. In other
embodiments, as illustrated in FIG. 4, the input data stream can be
checked against the stored signature for the client to determine
what parts of the input data stream need be stored in the backup
store.
[0063] The input data stream can be transmitted as the file
modifications occur or only after a signature comparison has been
completed. For example, a prior signature of a backup for the
client may have been taken at an earlier point in time. A
comparison of a new signature for the client against the old
signature stored on the file replication store (e.g., backup store)
would indicate that the files were modified. The changed data would
then be transmitted over the network to the file replication
store.
[0064] Once the input data stream is received, the method of FIG. 4
is executed.
[0065] The method includes a step of generating 405 an input
signature for at least a portion of an input data stream from a
client. To be sure, the input signature is a representation of data
included in the input data stream.
[0066] The method also includes a step of comparing 410 the input
signature to stored signatures of data included in a deduplicated
backup data store. This process allows the system to find the
signature of the client that was previously stored on the backup
store.
[0067] The method includes the system selecting 415 a stored
signature based upon the step of comparing the input signature to
the stored signatures of data included in a deduplicated backup
data store.
[0068] To ensure that only changed data that has not already been
stored on the backup data store is transmitted to the backup data
store, the method includes comparing 420 data associated with the
selected stored signature to the at least a portion of the input
data stream to determine unique data included in the at least a
portion of the input data stream.
[0069] Next, the method includes distributing the unique data to
the deduplicated backup data store. Advantageously, only the unique
data that has not been stored previously is transmitted over the
network to the backup data store. This method provides a network
optimization technique, ensuring that only new, unique data is
transmitted over the network for any given backup or replication
procedure.
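The steps of FIG. 4 can be sketched as follows. The description does
not prescribe how a stored signature is selected, so this example
assumes that a signature is the set of SHA-1 hashes of fixed-size
chunks and that the stored signature sharing the most chunk hashes
with the input signature is selected. The function names and the
backup labels used in the usage note are hypothetical.

```python
import hashlib
from typing import Optional


def chunk_hashes(data: bytes, size: int = 4) -> frozenset[str]:
    """Signature of a stream: the set of its chunk hashes (an assumption)."""
    return frozenset(
        hashlib.sha1(data[i:i + size]).hexdigest() for i in range(0, len(data), size)
    )


def select_stored(input_sig: frozenset[str],
                  stored: dict[str, frozenset[str]]) -> Optional[str]:
    """Select the stored signature that overlaps most with the input signature."""
    return max(stored, key=lambda name: len(input_sig & stored[name]), default=None)


def unique_chunks(stream: bytes, selected_sig: frozenset[str], size: int = 4) -> list[bytes]:
    """Return the chunks of `stream` that are absent from the selected signature."""
    chunks = (stream[i:i + size] for i in range(0, len(stream), size))
    return [c for c in chunks if hashlib.sha1(c).hexdigest() not in selected_sig]
```

For instance, given stored signatures for backups "backup-A"
(b"aaaabbbbcccc") and "backup-B" (b"xxxxyyyyzzzz"), an input stream
b"aaaabbbbdddd" selects "backup-A", and only the chunk b"dddd" would
be transmitted.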
[0070] As mentioned above, input data streams are transmitted to
the backup data store only when a file modification process occurs
on the client. Thus, as each file
modification process occurs at the client, a new input data stream
is created and transmitted for storage.
[0071] FIG. 5 illustrates an example method for storing input data
streams of multiple file modification operations that occur on a
client. For purposes of this example, a first file modification
process occurs at a first point in time. This first file
modification process occurs for a first set of files. At a second
point in time, a second file modification process occurs for a
second set of files. Temporal context and locality can be
maintained for each of these file modification processes by storing
the data in the input data streams in their own extents (e.g.,
containers).
[0072] Thus, the method can begin with a step of receiving 505 a
first input data stream at a first point in time. The first point
in time is associated with a first file modification operation for
a first set of files occurring on a client. Next, the method
includes segmenting 510 the first input data stream into chunks, as
well as creating 515 a signature for each of the chunks. Indeed,
this could include creating a SHA1 hash value, as an example.
[0073] Next, the method includes distributing 520 each chunk to one
of a first plurality of containers. Each container comprises a
container identifier and the first plurality of containers is
proximate to one another on a backup data store. Thus, the temporal
locality of the chunks in the input data stream is represented as
spatial locality on the backup data store.
[0074] Next, the method includes creating 525 a locality index that
includes a mapping of a chunk signature and a container identifier.
To be sure, the chunk signatures and container identifiers for each
of the chunks are related to one another because they were created
from the first input data stream.
[0075] After this process is complete, a second file modification
process occurs on the client. Thus, a second de-duplicating
replication process for this new file modification process
ensues.
[0076] The method includes receiving 530 a second input data stream
at a second point in time. The second and first points in time are
different from one another because they are associated with
different file modification processes.
[0077] To be sure, the second point in time is associated with a
second file modification operation for a second set of files
occurring on a client. Next, the method includes segmenting 535 the
second input data stream into chunks, and creating 540 a signature
for each of the chunks.
[0078] Next, the method comprises distributing 545 each chunk to
one of a second plurality of containers. As mentioned above, each
container comprises a container identifier. The second plurality of
containers is proximate to one another on a backup data store for
ease of retrieval and pre-fetching as described above.
[0079] The method also includes creating 550 a locality index that
includes a mapping of a chunk signature and a container identifier.
Again, the chunk signatures and container identifiers for each of
the chunks are related to one another because they were created
from the second input data stream.
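The method of FIG. 5 can be sketched as follows, under the
assumptions that each input data stream is written to freshly
opened containers, that consecutive container identifiers stand in
for physical proximity on the backup data store, and that chunks
already present in the locality index are not stored again. The
class BackupStore and its members are hypothetical.

```python
import hashlib
from itertools import count


class BackupStore:
    def __init__(self, chunk_size: int = 4, chunks_per_container: int = 2) -> None:
        self.chunk_size = chunk_size
        self.chunks_per_container = chunks_per_container
        self.containers: dict[int, list[bytes]] = {}  # container id -> chunks
        self.locality_index: dict[str, int] = {}      # chunk signature -> container id
        self._next_id = count()

    def ingest(self, stream: bytes) -> list[int]:
        """Store one input stream in its own containers; return their identifiers."""
        used: list[int] = []
        current = None
        for start in range(0, len(stream), self.chunk_size):
            chunk = stream[start:start + self.chunk_size]
            sig = hashlib.sha1(chunk).hexdigest()
            if sig in self.locality_index:
                continue  # chunk already stored by an earlier stream
            if current is None or len(self.containers[current]) >= self.chunks_per_container:
                current = next(self._next_id)  # open the next adjacent container
                self.containers[current] = []
                used.append(current)
            self.containers[current].append(chunk)
            self.locality_index[sig] = current
        return used
```

Calling ingest once per file modification process keeps the chunks
of each process in their own run of containers, so that the
temporal grouping of the streams is reflected in where the chunks
are stored.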
[0080] FIG. 6 illustrates an exemplary computing system 600 that
may be used to implement an embodiment of the present technology.
The computing system 600 of FIG. 6 includes one or more processors
610 and main memory 620. Main memory 620 stores, in part, instructions
and data for execution by processor 610. Main memory 620 can store
the executable code when the system 600 is in operation. The system
600 of FIG. 6 may further include a mass storage device 630,
portable storage medium drive(s) 640, output devices 650, user
input devices 660, a graphics display 670, and other peripheral
devices 680. The system 600 may also comprise network storage
645.
[0081] The components shown in FIG. 6 are depicted as being
connected via a single bus 690. The components may be connected
through one or more data transport means. Processor unit 610 and
main memory 620 may be connected via a local microprocessor bus,
and the mass storage device 630, peripheral device(s) 680, portable
storage device 640, and graphics display 670 may be connected via
one or more input/output (I/O) buses.
[0082] Mass storage device 630, which may be implemented with a
magnetic disk drive or an optical disk drive, is a non-volatile
storage device for storing data and instructions for use by
processor unit 610. Mass storage device 630 can store the system
software for implementing embodiments of the present technology for
purposes of loading that software into main memory 620.
[0083] Portable storage device 640 operates in conjunction with a
portable non-volatile storage medium, such as a floppy disk,
compact disk or digital video disc, to input and output data and
code to and from the computing system 600 of FIG. 6. The system
software for implementing embodiments of the present technology may
be stored on such a portable medium and input to the computing
system 600 via the portable storage device 640.
[0084] Input devices 660 provide a portion of a user interface.
Input devices 660 may include an alphanumeric keypad, such as a
keyboard, for inputting alphanumeric and other information, or a
pointing device, such as a mouse, a trackball, stylus, or cursor
direction keys. Additionally, the system 600 as shown in FIG. 6
includes output devices 650. Suitable output devices include
speakers, printers, network interfaces, and monitors.
[0085] Graphics display 670 may include a liquid crystal display
(LCD) or other suitable display device. Graphics display 670
receives textual and graphical information, and processes the
information for output to the display device.
[0086] Peripherals 680 may include any type of computer support
device to add additional functionality to the computing system.
Peripheral device(s) 680 may include a modem or a router.
[0087] The components contained in the computing system 600 of FIG.
6 are those typically found in computing systems that may be
suitable for use with embodiments of the present technology and are
intended to represent a broad category of such computer components
that are well known in the art. Thus, the computing system 600 of
FIG. 6 can be a personal computer, hand held computing system,
telephone, mobile computing system, workstation, server,
minicomputer, mainframe computer, or any other computing system.
The computer can also include different bus configurations,
networked platforms, multi-processor platforms, etc. Various
operating systems can be used including UNIX, Linux, Windows,
Macintosh OS, Palm OS, and other suitable operating systems.
[0088] Some of the above-described functions may be composed of
instructions that are stored on storage media (e.g.,
computer-readable medium). The instructions may be retrieved and
executed by the processor. Some examples of storage media are
memory devices, tapes, disks, and the like. The instructions are
operational when executed by the processor to direct the processor
to operate in accord with the technology. Those skilled in the art
are familiar with instructions, processor(s), and storage
media.
[0089] It is noteworthy that any hardware platform suitable for
performing the processing described herein is suitable for use with
the technology. The terms "computer-readable storage medium" and
"computer-readable storage media" as used herein refer to any
medium or media that participate in providing instructions to a CPU
for execution. Such media can take many forms, including, but not
limited to, non-volatile media, volatile media and transmission
media. Non-volatile media include, for example, optical or magnetic
disks, such as a fixed disk. Volatile media include dynamic memory,
such as system RAM. Transmission media include coaxial cables,
copper wire and fiber optics, among others, including the wires
that comprise one embodiment of a bus. Transmission media can also
take the form of acoustic or light waves, such as those generated
during radio frequency (RF) and infrared (IR) data communications.
Common forms of computer-readable media include, for example, a
floppy disk, a flexible disk, a hard disk, magnetic tape, any other
magnetic medium, a CD-ROM disk, digital video disk (DVD), any other
optical medium, any other physical medium with patterns of marks or
holes, a RAM, a PROM, an EPROM, an EEPROM, a FLASH EPROM, any other
memory chip or data exchange adapter, a carrier wave, or any other
medium from which a computer can read.
[0090] Various forms of computer-readable media may be involved in
carrying one or more sequences of one or more instructions to a CPU
for execution. A bus carries the data to system RAM, from which a
CPU retrieves and executes the instructions. The instructions
received by system RAM can optionally be stored on a fixed disk
either before or after execution by a CPU.
[0091] Computer program code for carrying out operations for
aspects of the present invention may be written in any combination
of one or more programming languages, including an object oriented
programming language such as Java, Smalltalk, C++ or the like and
conventional procedural programming languages, such as the "C"
programming language or similar programming languages. The program
code may execute entirely on the user's computer, partly on the
user's computer, as a stand-alone software package, partly on the
user's computer and partly on a remote computer or entirely on the
remote computer or server. In the latter scenario, the remote
computer may be connected to the user's computer through any type
of network, including a local area network (LAN) or a wide area
network (WAN), or the connection may be made to an external
computer (for example, through the Internet using an Internet
Service Provider).
[0092] The corresponding structures, materials, acts, and
equivalents of all means or step plus function elements in the
claims below are intended to include any structure, material, or
act for performing the function in combination with other claimed
elements as specifically claimed. The description of the present
invention has been presented for purposes of illustration and
description, but is not intended to be exhaustive or limited to the
invention in the form disclosed. Many modifications and variations
will be apparent to those of ordinary skill in the art without
departing from the scope and spirit of the invention. Exemplary
embodiments were chosen and described in order to best explain the
principles of the present technology and its practical application,
and to enable others of ordinary skill in the art to understand the
invention for various embodiments with various modifications as are
suited to the particular use contemplated.
[0093] Aspects of the present invention are described above with
reference to flowchart illustrations and/or block diagrams of
methods, apparatus (systems) and computer program products
according to embodiments of the invention. It will be understood
that each block of the flowchart illustrations and/or block
diagrams, and combinations of blocks in the flowchart illustrations
and/or block diagrams, can be implemented by computer program
instructions. These computer program instructions may be provided
to a processor of a general purpose computer, special purpose
computer, or other programmable data processing apparatus to
produce a machine, such that the instructions, which execute via
the processor of the computer or other programmable data processing
apparatus, create means for implementing the functions/acts
specified in the flowchart and/or block diagram block or
blocks.
[0094] These computer program instructions may also be stored in a
computer readable medium that can direct a computer, other
programmable data processing apparatus, or other devices to
function in a particular manner, such that the instructions stored
in the computer readable medium produce an article of manufacture
including instructions which implement the function/act specified
in the flowchart and/or block diagram block or blocks.
[0095] The computer program instructions may also be loaded onto a
computer, other programmable data processing apparatus, or other
devices to cause a series of operational steps to be performed on
the computer, other programmable apparatus or other devices to
produce a computer implemented process such that the instructions
which execute on the computer or other programmable apparatus
provide processes for implementing the functions/acts specified in
the flowchart and/or block diagram block or blocks.
[0096] The flowchart and block diagrams in the Figures illustrate
the architecture, functionality, and operation of possible
implementations of systems, methods and computer program products
according to various embodiments of the present invention. In this
regard, each block in the flowchart or block diagrams may represent
a module, segment, or portion of code, which comprises one or more
executable instructions for implementing the specified logical
function(s). It should also be noted that, in some alternative
implementations, the functions noted in the block may occur out of
the order noted in the figures. For example, two blocks shown in
succession may, in fact, be executed substantially concurrently, or
the blocks may sometimes be executed in the reverse order,
depending upon the functionality involved. It will also be noted
that each block of the block diagrams and/or flowchart
illustration, and combinations of blocks in the block diagrams
and/or flowchart illustration, can be implemented by special
purpose hardware-based systems that perform the specified functions
or acts, or combinations of special purpose hardware and computer
instructions.
[0097] While various embodiments have been described above, it
should be understood that they have been presented by way of
example only, and not limitation. The descriptions are not intended
to limit the scope of the technology to the particular forms set
forth herein. Thus, the breadth and scope of a preferred embodiment
should not be limited by any of the above-described exemplary
embodiments. It should be understood that the above description is
illustrative and not restrictive. To the contrary, the present
descriptions are intended to cover such alternatives,
modifications, and equivalents as may be included within the spirit
and scope of the technology as defined by the appended claims and
otherwise appreciated by one of ordinary skill in the art. The
scope of the technology should, therefore, be determined not with
reference to the above description, but instead should be
determined with reference to the appended claims along with their
full scope of equivalents.