U.S. patent application number 14/977614 was filed with the patent office on 2015-12-21 and published on 2019-04-11 for computing device replication using file system change detection methods and systems.
The applicant listed for this patent is eFolder, Inc. Invention is credited to Aaron Brown, Nitin Parab, Joshua Schwartz, Shashi Sharma.
Application Number | 20190108103 14/977614 |
Document ID | / |
Family ID | 58409284 |
Filed Date | 2015-12-21 |
![](/patent/app/20190108103/US20190108103A9-20190411-D00000.png)
![](/patent/app/20190108103/US20190108103A9-20190411-D00001.png)
![](/patent/app/20190108103/US20190108103A9-20190411-D00002.png)
![](/patent/app/20190108103/US20190108103A9-20190411-D00003.png)
![](/patent/app/20190108103/US20190108103A9-20190411-D00004.png)
![](/patent/app/20190108103/US20190108103A9-20190411-D00005.png)
![](/patent/app/20190108103/US20190108103A9-20190411-D00006.png)
![](/patent/app/20190108103/US20190108103A9-20190411-D00007.png)
![](/patent/app/20190108103/US20190108103A9-20190411-D00008.png)
United States Patent Application | 20190108103 |
Kind Code | A9 |
Parab; Nitin; et al. | April 11, 2019 |
COMPUTING DEVICE REPLICATION USING FILE SYSTEM CHANGE DETECTION
METHODS AND SYSTEMS
Abstract
Computing device file system change detection and change
replication methods and systems are described herein. In one
embodiment, the change detection is accomplished by comparing two
master file tables of two different versions of the same file
system and inferring file system name space change, file system
metadata change and file data change. The method includes creating
a file system name space change log and file data change log for
any objects associated with the changes to the file system. The
method also includes replicating these change logs to the
replication server and applying the change logs to the data store
on replication server.
Inventors: | Parab; Nitin (Palo Alto, CA); Schwartz; Joshua (Saratoga, CA); Sharma; Shashi (San Jose, CA); Brown; Aaron (Sunnyvale, CA) |
Applicant: |
Name | City | State | Country | Type |
eFolder, Inc. | Atlanta | GA | US | |
Prior Publication: |
Document Identifier | Publication Date |
US 20170177452 A1 | June 22, 2017 |
Family ID: | 58409284 |
Appl. No.: | 14/977614 |
Filed: | December 21, 2015 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
14864850 | Sep 24, 2015 | |
14977614 | | |
13889164 | May 7, 2013 | 9705730 |
14864850 | | |
Current U.S. Class: | 1/1 |
Current CPC Class: | G06F 11/1446 20130101; G06F 3/067 20130101;
G06F 16/2358 20190101; G06F 16/2365 20190101; H04L 29/0854 20130101;
G06F 3/0641 20130101; G06F 16/128 20190101; G06F 11/1453 20130101;
G06F 16/162 20190101; G06F 16/13 20190101; G06F 3/0619 20130101;
G06F 2201/84 20130101; G06F 11/1464 20130101; G06F 11/1471 20130101;
G06F 11/14 20130101; H04L 67/1095 20130101 |
International Class: | G06F 11/14 20060101 G06F011/14;
G06F 17/30 20060101 G06F017/30; H04L 29/08 20060101 H04L029/08 |
Claims
1. A method, comprising: a replication agent obtaining a first
master file table for a client computing device; obtaining a second
master file table for the client computing device; comparing the
first master file table with the second master file table to
determine changes to the first master file table relative to the
second master file table that are indicative of changes to a file
system of the client computing device; and creating a file system
name space change log and file data change log for the file system,
comprising directories and files.
2. The method according to claim 1, further comprising replicating
the changed inodes on the replication appliance to a cloud data
store.
3. The method according to claim 1, wherein detecting changed
inodes comprises detecting a change in an inode generation number
of an inode of the first set of inodes that is indicative of reuse
of the inode for a new object.
4. The method according to claim 3, wherein if reuse is detected,
the method further comprises creating an entry in a log that
records creation of the new object and deletion of an old object
which was replaced by the new object.
5. The method according to claim 1, wherein the changes are
determined from examining log sequence numbers within the first and
second master file tables.
6. The method according to claim 1, wherein detecting changed
inodes comprises detecting a modified time entry of an inode of the
first set of inodes which is indicative of a change in an object
associated with the inode.
7. The method according to claim 1, wherein detecting changed
inodes comprises detecting a new name for an inode of the first set
of inodes, and further logging the new name in a log.
8. The method according to claim 1, wherein detecting changed
inodes comprises detecting a change to a parent identifier for an
inode of the first set of inodes which is indicative of movement of
the inode within the file system.
9. The method according to claim 1, further comprising copying
metadata and namespace changes for the file system by copying only
the disk blocks that store the metadata and namespace information
(cloned shell volume) to the replication appliance.
10. The method according to claim 1, wherein translating the
changed inodes into namespace locations comprises obtaining file
identifiers of the changed inodes and performing file path
identification to obtain a file path for each of the changed inodes
based on the file identifiers.
11. The method according to claim 1, further comprising converting
the file data change log to a refined log by: performing a tree
walk of a directory in a file system for the client computing
device for a directory rename operation of a directory or file move
operations of a file; adding a create record for child files of the
directory or the object; delete any modified file record if the
file; and wherein the refined log comprises file change records
that are an incremental change or a full backup of the client
computing device.
12. A system, comprising: a cloud data store; and a replication
appliance associated with a client computing device, the
replication appliance comprising a processor and memory, wherein
the processor executes logic stored in memory to: obtain a first
master file table of a file system for a client computing device;
obtain a second master file table of the file system for the client
computing device; compare the first master file table with the
second master file table to determine changes to the first master
file table relative to the second master file table that are
indicative of changes to a file system of the client computing
device; create a file data change log for any objects associated
with the changes to the file system; and replicate blocks
associated with the objects on the replication appliance to the
cloud data store.
13. The system according to claim 12, wherein the processor further
executes the logic to replicate metadata of the file system onto
the replication appliance.
14. The system according to claim 13, wherein the processor further
executes the logic to convert the file data change log to a refined
log by: performing a tree walk of a directory in a file system for
the client computing device for a directory rename operation of a
directory or file move operations of a file; adding a record for
child files of the directory or the object; delete any modified
file record if the file; and wherein the refined log comprises file
change records that are an incremental change or a full backup of
the client computing device.
15. The system according to claim 12, wherein the changes include
file name changes, parent identifier changes, sequence number
changes, time signature changes, log sequence changes, and any
combinations thereof.
16. A method, comprising: obtaining pairs of master file table
(MFT) entries in file identifier numerical order, the pairs of MFT
entries being obtained from a first snapshot of a client computing
device and a second snapshot of the client computing device;
creating a record for a second MFT entry of a pair of MFT entries
if a first of the pair of MFT entries is invalid and the second MFT
entry is valid; deleting a record for a first MFT entry of the pair
of MFT entries if the first of the pair of MFT entries is valid and
the second MFT entry is invalid; wherein if both the first MFT
entry and the second MFT entry are valid and generation numbers of
the first and second MFT entries refer to a same object, the method
further comprises obtaining name attributes for each of the first
and second MFT entries; further wherein: if name attributes from
the first MFT entry do not exist in the second MFT entry, an
unlink record is generated; and if name attributes from the second
MFT entry do not exist in the first MFT entry, a link record
is generated.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is related to U.S. patent application Ser.
No. 14/864,850, filed Sep. 24, 2015, entitled "Distributed and
Deduplicating Data Storage System and Methods of Use," U.S. patent
application Ser. No. 13/889,164, filed May 7, 2013, entitled "Cloud
Storage Using Merkle Trees," U.S. patent application Ser. No.
14/522,527, filed Oct. 23, 2014, entitled "Systems and Methods for
Restoring a File," and U.S. patent application Ser. No. ______,
filed Dec. 21, 2015, entitled "Cloud Storage Using Merkle Trees,"
all of which are hereby incorporated by reference herein in their
entireties including all references and appendices cited
therein.
FIELD OF THE INVENTION
[0002] The present technology may be generally described as
providing systems and methods for efficiently detecting both
metadata and file data changes in file systems, and efficiently
replicating the file system metadata changes from a client system to
a backup server system by accessing the on-disk data structures
of the client system's file system directly and comparing two
point-in-time copies of these data structures.
SUMMARY OF THE PRESENT TECHNOLOGY
[0003] Generally speaking, the present technology provides
efficient methods and systems for determining file system changes
on a client computing device. The solution leverages knowledge
obtained from master file tables used by the new technology file
system (NTFS), which records changes to objects in the file system
such as directories and files. The present technology can identify
changes in a file system over time without having to resort to
walking the entire file system, such as with a file system scan
operation.
[0004] For context, (a) file system information other than the user
file data is often referred to as file system metadata and (b) in
file systems the types of changes include (but are not limited to)
namespace changes, such as file or directory
create/delete/rename/link operations; file/directory metadata
changes, such as access permission changes; and file data changes.
[0005] According to some embodiments, the present technology may be
directed to methods for detecting the list of files that changed and
the file data changes, the methods comprising: (a) reading a disk of
a client computing device by a replication agent of a replication
server by sequentially walking the inode table of a file system of
the client computing device for two point-in-time copies of the file
system; (b) comparing a first table of inodes of the file system
of the client computing device to a second table of inodes of the
file system of the client computing device; (c) detecting changed
inodes between the first set of inodes and the second set of
inodes; (d) translating the changed inodes' data stream offsets into
on-disk locations; (e) determining blocks changed between two versions
of the data stream by reading and comparing; (f) generating a file
data change log; and (g) generating a disk image of the client
computing device that is capable of being mounted by a hypervisor so
as to create a virtual machine using the file system backup and the
blocks of changed data.
[0006] For context, most file systems allocate inodes for each
object (file/directory) in a simple indexed table (like the master
file table, or MFT, in NTFS), giving each inode a number, referred to
as a file number. On deletion of an inode the index number is freed
and can be used for a new object (file/directory) later. According to
some embodiments, the present technology may be directed to walking
through all the inodes, starting from index zero up to the maximum
allocated index, and comparing the two inodes at the same index in
two different versions of the file system to identify changes in a
file system name space over time. This method is much faster than
performing a full directory walk of two versions of the file system
to compare directory listings of each directory.
[0007] According to some embodiments, the present technology may be
directed to methods that comprise: (a) obtaining pairs of master
file table (MFT) entries in file identifier numerical order, the
pairs of MFT entries being obtained from a first snapshot of a
client computing device and a second snapshot of the client
computing device; (b) creating a record for a second MFT entry of a
pair of MFT entries if a first of the pair of MFT entries is
invalid and the second MFT entry is valid; (c) deleting a record
for a first MFT entry of the pair of MFT entries if the first of
the pair of MFT entries is valid and the second MFT entry is
invalid; (d) wherein if both the first MFT entry and the second MFT
entry are valid and generation numbers of the first and second MFT
entries refer to a same object, the method further comprises
obtaining name attributes for each of the first and second MFT
entries; (e) further wherein: (i) if name attributes from the first
MFT entry do not exist in the second MFT entry, an unlink record is
generated; and (ii) if name attributes from the second MFT entry do
not exist in the first MFT entry, a link record is generated.
[0008] According to some embodiments, the present technology may be
directed to logging the namespace changes detected, as described
previously, in a metadata log and replaying each operation in the log
on the backup server. According to some embodiments, the backup
server could be a server or a cloud data store.
[0009] In some embodiments, no metadata log is required or applied
to the backup server. The metadata and namespace changes are
replicated by copying all the disk blocks of the file system that are
used to store the file system metadata and namespace information. For
context, most file systems support cloning of just the file system
namespace and metadata without the file data. The replication
client can be configured to use existing file system cloning tools
(like ntfsclone for NTFS) to replicate the full file system namespace
and metadata from the client computing device to the replication
appliance. By using ntfsclone, an NTFS shell volume is generated and
saved on the replication appliance, thus keeping the on-disk format
the same.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] Certain embodiments of the present technology are
illustrated by the accompanying figures. It will be understood that
the figures are not necessarily to scale and that details not
necessary for an understanding of the technology or that render
other details difficult to perceive may be omitted. It will be
understood that the technology is not necessarily limited to the
particular embodiments illustrated herein.
[0011] FIG. 1 is a flowchart of a method for determining changes in
a file system, as well as generating a change record with
attributes.
[0012] FIG. 2 is a flowchart of a method for transmitting changed
data to a cloud data center (store) using master file table changes
and SHA1 signature maps.
[0013] FIG. 3 is a flowchart of an example method for determining
changes in a secure data stream of an NTFS file system.
[0014] FIG. 4 is a flowchart of a method of walking inodes of
snapshots, transmitting changed data and metadata.
[0015] FIG. 5 is a flowchart of a method of using master file
tables for determining changes in a file system, as well as
creating a change log indicative of the changes.
[0016] FIG. 6 is a flowchart of a method for determining changes in
a file system and using the determined changes to locate changed
blocks on a disk, as well as creating a bootable disk image.
[0017] FIG. 7 is a block diagram of an exemplary architecture in
which embodiments of the present technology may be practiced.
[0018] FIG. 8 illustrates an exemplary computing system that may be
used to implement embodiments according to the present
technology.
DESCRIPTION OF EXEMPLARY EMBODIMENTS
[0019] While this technology is susceptible of embodiment in many
different forms, there is shown in the drawings and will herein be
described in detail several specific embodiments with the
understanding that the present disclosure is to be considered as an
exemplification of the principles of the technology and is not
intended to limit the technology to the embodiments
illustrated.
[0020] It will be understood that like or analogous elements and/or
components, referred to herein, may be identified throughout the
drawings with like reference characters. It will be further
understood that several of the figures are merely schematic
representations of the present technology. As such, some of the
components may have been distorted from their actual scale for
pictorial clarity.
[0021] To be sure, each of the changes results in changes to blocks
on the disk of the client computing device, which are referred to
as "changed blocks". Once file system changes are deduced, the
present technology can translate these file system changes to disk
locations, allowing changed blocks of data to be obtained from the
client computing device. The changed blocks can be replicated to a
cloud data store over a network.
[0022] For context, the NTFS on-disk format is designed to be a
collection of streams/files. Even the metadata about each stream,
referred to as an "inode," is stored in a special stream/file
called the MFT (Master File Table) as a series of entries, with the
first entry being a master file table entry (MFTEntry) for the MFT
file itself. An MFTEntry can also be referred to herein as an "inode".
The first few MFTEntries in the MFT are reserved for special
system files. The MFTEntry for a file contains one or more name
attributes which specify a file name, parent identifier (such as an
inode number or file identifier), a time entry, a sequence number,
and a log sequence number, to name a few. A time entry is a
time stamp associated with a change in an inode which is recorded
by the NTFS file system. A sequence number can include a numerical
identifier that corresponds to a reuse of an inode. For example, if
the object associated with an inode no longer exists (such as if it
is deleted), the NTFS file system can update the sequence number of
the inode if it is reused for a new object. The log sequence number
can refer to log entries written when the NTFS file system logs
anticipated file system operations on objects, the actual changes,
and a notation that the operation was completed. Each of these
attributes provides clues as to what objects within the file system
have actually changed between any two snapshots.
[0023] It is noteworthy to mention that an NTFS MFTEntry contains the
name of the object and also the file identifier (MFTEntry number)
of the parent directory. Secondly, it is noteworthy to mention that
NTFS does not allow hard links to directories. Using these two
properties of NTFS, given a file identifier it is possible to do a
reverse lookup of the paths of associated files and/or directories
(referred to generally as "objects") including hard links by simply
walking up the parent identifiers for the inodes. It will be
understood that a hard link is a directory entry that associates a
file name with a file on a file system. An NTFS replication client
of the present technology can exploit this feature to replicate
changes in file system namespace without a top down scan, which is
a scan starting from a root directory.
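The reverse lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `Entry` type, the `paths_for` helper, and the in-memory `mft` dictionary are hypothetical names introduced here, and the sketch assumes the parent chain is acyclic and ends at the NTFS root directory (MFT record 5).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

ROOT_ID = 5  # in NTFS, MFT record 5 is the root directory


@dataclass(frozen=True)
class Entry:
    # Hypothetical in-memory view of an MFTEntry: one (parent_id, name)
    # pair per name attribute. A file with hard links has several pairs.
    names: Tuple[Tuple[int, str], ...]


def paths_for(file_id: int, mft: Dict[int, Entry]) -> List[str]:
    """Return every path of file_id by walking parent identifiers upward."""
    if file_id == ROOT_ID:
        return [""]  # the root contributes an empty prefix
    result = []
    for parent_id, name in mft[file_id].names:
        # Directories cannot be hard linked, so each parent has one path,
        # but the file itself may appear under several parents.
        for parent_path in paths_for(parent_id, mft):
            result.append(parent_path + "/" + name)
    return result


mft = {
    5: Entry(((5, "."),)),
    64: Entry(((5, "foo"),)),
    65: Entry(((64, "bar.txt"), (5, "alias.txt"))),  # two hard links
}
print(paths_for(65, mft))  # ['/foo/bar.txt', '/alias.txt']
```

No directory listing is read at any point; only the entries on the path from the object to the root are touched.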
[0024] File creation events can be determined by detecting the
existence of a new MFTEntry in an MFT table, and the entire path of
the new file and/or directory can be located without scanning the
whole file system namespace, a scan that usually proceeds in a
top-down manner.
[0025] The cloud data store (or replication appliance) of the
present technology uses file system metadata to allow pure
inode-based operations without ordering requirements. A new file
and/or directory create operation can be executed by simply creating
two entries with key value pairs of <dir-inode-number+entry-name,
child-inode-num> and <child-inode-num, inode-attr> without
requiring the corresponding entries for the full path of the new
file and/or directory. Thus a full path like /foo/bar.txt can be
created in reverse order (unlike traditional UNIX file systems) or
in any order. As long as the create operations are applied for all
the parents, the file system will eventually be consistent.
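To make the order independence concrete, the following Python sketch models the two key-value entries with plain dictionaries. The `NamespaceStore` class, its method names, and the inode numbers are assumptions made for illustration; the actual data store interface is not described in this document.

```python
from typing import Dict, Optional, Tuple


class NamespaceStore:
    """Toy key-value namespace; a stand-in for the cloud data store."""

    def __init__(self) -> None:
        # <dir-inode-number+entry-name, child-inode-num>
        self.dirents: Dict[Tuple[int, str], int] = {}
        # <child-inode-num, inode-attr>
        self.inodes: Dict[int, dict] = {}

    def create(self, parent: int, name: str, child: int, attrs: dict) -> None:
        # Neither write checks that the parent already exists.
        self.dirents[(parent, name)] = child
        self.inodes[child] = attrs

    def resolve(self, path: str, root: int = 5) -> Optional[int]:
        """Top-down lookup; returns None while any parent is still missing."""
        node = root
        for part in path.strip("/").split("/"):
            if not part:
                continue
            found = self.dirents.get((node, part))
            if found is None:
                return None
            node = found
        return node


store = NamespaceStore()
store.create(64, "bar.txt", 65, {"type": "file"})  # child first
print(store.resolve("/foo/bar.txt"))  # None: /foo not yet created
store.create(5, "foo", 64, {"type": "dir"})  # parent arrives later
print(store.resolve("/foo/bar.txt"))  # 65
```

The path becomes resolvable only once every parent entry has been applied, which is the eventual consistency the paragraph describes.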
[0026] It is noteworthy to mention that because the cloud data store
(or a replication appliance) can accept namespace operations in any
order, and because each MFTEntry of an NTFS file system contains
all file names and parent identifiers, it is possible to replicate
the NTFS namespace to a cloud data store (or a replication
appliance) by walking the MFT entries in file identifier order.
[0027] In some embodiments, the replication client walks through
the MFT file one MFT entry at a time. For each MFT entry with a
create time and/or a modified time after the previous replication, the
system will send the corresponding create/setattr (set attributes)
operation with the parent identifier to the cloud data store (or a
replication appliance). It is also possible for the system to
determine whether an MFT entry has gained a new hard link, because its
link count increases and a new name attribute will exist.
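A minimal Python sketch of this time-based walk follows. The `MftEntry` fields and the `ops_since` function are hypothetical names, timestamps are plain numbers for simplicity, and the choice to emit `create` rather than `setattr` when both times are newer than the last replication is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class MftEntry:
    # Hypothetical, simplified view of one MFT entry.
    file_id: int
    parent_id: int
    name: str
    create_time: float
    modify_time: float


def ops_since(
    entries: Iterable[MftEntry], last_replication: float
) -> Iterator[Tuple[str, int, int, str]]:
    """Yield (operation, file_id, parent_id, name) in file identifier order."""
    for e in sorted(entries, key=lambda e: e.file_id):
        if e.create_time > last_replication:
            yield ("create", e.file_id, e.parent_id, e.name)
        elif e.modify_time > last_replication:
            yield ("setattr", e.file_id, e.parent_id, e.name)


entries = [
    MftEntry(66, 5, "new.txt", 200, 200),
    MftEntry(64, 5, "old.txt", 10, 150),
    MftEntry(65, 5, "same.txt", 10, 20),
]
for op in ops_since(entries, last_replication=100):
    print(op)
```

Each operation carries the parent identifier, so the receiver can apply it without knowing the full path.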
[0028] FIG. 1 illustrates an example method
for detecting changes to a file system of a client computing
device. In some embodiments, the method will comprise a replication
client obtaining 102 two master file tables for a client computing
device. The first master file table will correspond with a first
snapshot and the second master file table will correspond with a
second snapshot. These snapshots are obtained at two different
points in time. In some instances, the second snapshot can include
a current version of a file system for the client computing device,
while the first snapshot is a backup of the client computing device
taken at some point in the past.
[0029] The method will include the replication client obtaining a
master file table entry (MFTEntry) for a given file identifier from
the first snapshot and the second snapshot. In more detail, in some
embodiments this step includes obtaining the MFT tables from each
of the first and second snapshots.
[0030] To be sure, the replication client reads the MFT tables
together as pairs, where the first MFT table is of the previous
version of the file system and the second MFT table is of the current
version of the file system. For example, both MFT tables have a file
identifier of "file 1" which is the first entry in each of
respective MFT tables. The replication client will examine each of
these pairs in sequential order such as "file 2" from the first MFT
table and "file 2" from the second MFT table, and so on. Thus, the
replication client need only walk the MFT tables to deduce changes.
The exact details of how these changes can be detected are provided
below.
[0031] An MFT table is comprised of an ordered list of entries
referred to as "inodes".
[0032] In some embodiments, changes to MFT table entries can
correspond to changes in a file name, a parent identifier, a time
entry, sequence number, log sequence number, and any combinations
thereof. These attributes are described in greater detail
above.
[0033] In some embodiments, if an MFTEntry-1 is invalid in the
first MFT table but an MFTEntry-2 is valid in the second MFT table
then the method includes the replication client generating 104 a
create record for MFTEntry-2.
[0034] Similarly, if an MFTEntry-1 is valid in the first MFT table
but MFTEntry-2 in the second MFT table is invalid then the method
includes the replication client generating 106 a delete record for
MFTEntry-1.
[0035] In some embodiments, if both MFTEntries are valid and the
generation numbers are the same, the replication client determines
that both MFTEntries refer to the same file. In these instances where
it is determined that both MFTEntries refer to the same file, the
method can include the replication client inspecting the name
attributes in both MFTEntries. The method can include the replication
client generating an unlink record 108 for every NameAttribute from
MFTEntry-1 that does not exist in MFTEntry-2. Similarly, the method
can include the replication client generating 110 a link record for
every NameAttribute from MFTEntry-2 that does not exist in
MFTEntry-1.
[0036] Similarly, in the instances where the replication client has
determined that both MFTEntries refer to the same file, the method
can include the replication client inspecting all other attributes
in both MFTEntries. If some attributes are not the same between
MFTEntry-1 and MFTEntry-2, the method can include the replication
client generating 112 a set-attribute record where attributes are
set.
[0037] In some embodiments, the method includes the replication
client determining 114 that generation numbers between MFTEntry-1
and MFTEntry-2 are different from one another. In these instances,
the method can include the replication client generating a link record
110 for every NameAttribute from MFTEntry-2 illustrated in FIG.
2.
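The pairwise comparison of FIG. 1 can be sketched as follows. This is an illustrative Python sketch under stated assumptions: `MftEntry`, `diff_pair`, and `diff_tables` are hypothetical names, an invalid entry is modeled as `None`, and the handling of a generation mismatch (a delete of the old object, a create of the new one, then a link record per name) combines the description above with claim 4 rather than reproducing a confirmed implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Name = Tuple[int, str]  # (parent file identifier, file name)


@dataclass
class MftEntry:
    # Hypothetical, simplified MFT entry; None stands for an invalid entry.
    generation: int
    names: Set[Name]
    attrs: Dict[str, object] = field(default_factory=dict)


def diff_pair(file_id: int, old: Optional[MftEntry],
              new: Optional[MftEntry]) -> List[tuple]:
    """Compare the entries at one index in two MFT versions."""
    if old is None and new is None:
        return []
    if old is None:
        return [("create", file_id)]
    if new is None:
        return [("delete", file_id)]
    if old.generation != new.generation:
        # The index was reused for a different object (assumed handling).
        recs: List[tuple] = [("delete", file_id), ("create", file_id)]
        recs += [("link", file_id, p, n) for p, n in sorted(new.names)]
        return recs
    # Same object: diff the name attributes, then the other attributes.
    recs = [("unlink", file_id, p, n) for p, n in sorted(old.names - new.names)]
    recs += [("link", file_id, p, n) for p, n in sorted(new.names - old.names)]
    if old.attrs != new.attrs:
        recs.append(("setattr", file_id, new.attrs))
    return recs


def diff_tables(old: Dict[int, MftEntry],
                new: Dict[int, MftEntry]) -> List[tuple]:
    """Walk both tables in file identifier order and build a change log."""
    log: List[tuple] = []
    for file_id in sorted(old.keys() | new.keys()):
        log.extend(diff_pair(file_id, old.get(file_id), new.get(file_id)))
    return log
```

In this model a rename shows up as an unlink of the old name plus a link of the new one, and a move shows up the same way with a different parent identifier.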
[0038] FIG. 2 illustrates an example method of data stream (e.g.,
file) replication, which occurs once differences are detected
between MFTEntry-1 and MFTEntry-2 which indicate that an object
change (e.g., file or directory) has occurred. For example, if a
file name has changed the replication client can determine that a
file name property has changed (e.g., a user changed the file name)
or that the original file has been replaced by a new file with a
different name.
[0039] For context, the method refers also to replication using
SHA1 (e.g., secure hashing) maps of the client computing device to
replicate changes over a network connection, as is discussed in
U.S. patent application Ser. No. ______, (Attorney Docket No.
PA7073US) filed on Dec. 21, 2015, entitled "Cloud Storage using
Merkle Trees."
[0040] In some embodiments, the method comprises the replication
client determining 202 MFT entries with a create time or a modify
time which is temporally subsequent to a prior replication (e.g.,
snapshot).
[0041] According to some embodiments, the method includes obtaining
204 a SHA1 signature map for the previous version of the file from
a cloud data store, as well as computing 206 a SHA1 signature map
for the current version of the file.
[0042] The method can comprise detecting 208 changed blocks of data
using the SHA1 signature maps. Once the changed blocks of data are
determined the method can comprise the replication client
transmitting 210 all new blocks to the cloud data store. For
example, the replication client can use BULK_PUT_BLOCK application
programming interface (API).
[0043] In some embodiments, the method includes transmitting 212
the new SHA1 signature map for the file to the cloud data center by
the replication client. Note that the actual data blocks are
replicated to the cloud data store in an operation that is separate
from the process used to transmit the SHA1 signature map.
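The signature-map comparison can be sketched with Python's standard `hashlib` module. The fixed 4096-byte block size and the representation of a signature map as a list of per-block SHA1 digests are assumptions of this sketch; the document does not specify either, and the transfer API itself is not modeled.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size, not specified in the document


def signature_map(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one SHA1 hex digest per fixed-size block of data."""
    return [
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_map: List[str], new_map: List[str]) -> List[int]:
    """Return indices of blocks in the new version that must be sent."""
    return [
        i for i, sig in enumerate(new_map)
        if i >= len(old_map) or old_map[i] != sig
    ]


old = b"a" * 8192
new = b"a" * 4096 + b"b" * 4096 + b"c" * 10
print(changed_blocks(signature_map(old), signature_map(new)))  # [1, 2]
```

Only the blocks at the returned indices need to cross the network; the new signature map is then sent separately, as the paragraph above notes.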
[0044] One of the reserved MFTEntries is used for access control
list (ACL) information and is called the secure file in an NTFS
formatted file system. All ACLs are stored in a data stream in the
secure file in a sequential manner. The ACLs in the data stream are
indexed by two indices stored in the secure file: one index,
called SDH, maps a hash of the ACL to the location of the ACL in the
data stream, and the other index, called SSI, contains a mapping from
a SecurityID (security identifier) to the offset into the secure data
where an ACL is stored. This SecurityID is stored in the
STANDARD_INFORMATION of the MFTEntry of all files. Note that NTFS
stores an ACL only once: every time a new file is created with
an ACL, if the ACL is already stored in the secure data stream then
the new MFTEntry gets the same SecurityID. Only if a new ACL which
does not already exist in the data stream is created by the system
is it stored; it is then appended to the data stream and
is assigned a new SecurityID.
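The store-once behavior can be modeled with a short Python sketch. It is a simplification, not the NTFS on-disk layout: the `SecureStore` class is hypothetical, SHA1 stands in for whatever hash NTFS uses, the hash index here maps directly to a SecurityID instead of a stream location, and SecurityIDs simply count up from 1.

```python
import hashlib
from typing import Dict, Tuple


class SecureStore:
    """Toy model of the secure file: one data stream plus two indices."""

    def __init__(self) -> None:
        self.stream = bytearray()  # ACLs appended sequentially
        # Simplified stand-in for the SDH index: ACL hash -> SecurityID.
        self.by_hash: Dict[bytes, int] = {}
        # Simplified stand-in for the SSI index: SecurityID -> (offset, length).
        self.by_id: Dict[int, Tuple[int, int]] = {}
        self.next_id = 1  # arbitrary starting value for this sketch

    def security_id_for(self, acl: bytes) -> int:
        """Return the existing SecurityID for acl, or store it once."""
        digest = hashlib.sha1(acl).digest()
        if digest in self.by_hash:
            return self.by_hash[digest]
        sid = self.next_id
        self.next_id += 1
        self.by_id[sid] = (len(self.stream), len(acl))
        self.stream += acl
        self.by_hash[digest] = sid
        return sid

    def acl_for(self, sid: int) -> bytes:
        offset, length = self.by_id[sid]
        return bytes(self.stream[offset:offset + length])


store = SecureStore()
first = store.security_id_for(b"acl-A")
again = store.security_id_for(b"acl-A")  # reuses the stored ACL
print(first == again, len(store.stream))  # True 5
```

Because the stream only grows when a genuinely new ACL appears, a change in its size is a cheap signal that there is something to replicate.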
[0045] In some embodiments, the replication client will detect a
change in the size and/or update time of the secure file and
replicate the changes. In some embodiments, the replication client
will utilize metadata that will maintain the same SecurityID as in
the source NTFS volume. In some embodiments, the metadata is
provided with a new key-value pair as <SecurityID, ACL> and
the SecurityID can be stored in the inode attributes just the way
NTFS stores it in MFTEntry.
[0046] An example ACL replication process is illustrated in FIG. 3.
The method can comprise a step of determining 302 if a $Secure data
stream has been modified. If so, the method comprises reading 304
the $Secure data stream and obtaining 306 all new ACLs added to the
$Secure data stream.
[0047] Next, the method can comprise reading and parsing 308 $SSI,
as well as determining 310 the SecurityID of each of the new ACLs in
the $SSI.
[0048] The method includes transmitting 312 to the cloud data store
a change record, which is a vector of <SecurityID, ACL>
pairs.
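The steps of FIG. 3 can be sketched as a single Python function. The function name, the `ssi` dictionary mapping each SecurityID to an (offset, length) pair, and the use of the previously replicated stream length to decide which ACLs are new are all assumptions of this sketch; it relies only on the stated property that ACLs are appended sequentially.

```python
from typing import Dict, List, Tuple


def acl_change_record(
    stream: bytes,
    ssi: Dict[int, Tuple[int, int]],
    replicated_length: int,
) -> List[Tuple[int, bytes]]:
    """Build the vector of <SecurityID, ACL> pairs added since last time.

    stream: current contents of the secure data stream.
    ssi: hypothetical parsed index, SecurityID -> (offset, length).
    replicated_length: stream length at the previous replication.
    """
    if len(stream) <= replicated_length:
        return []  # the stream was not modified
    # ACLs are appended sequentially, so new ones start at or after the
    # previously replicated length.
    return [
        (sid, stream[offset:offset + length])
        for sid, (offset, length) in sorted(ssi.items())
        if offset >= replicated_length
    ]


stream = b"AAAABBBBBB"
ssi = {1: (0, 4), 2: (4, 6)}
print(acl_change_record(stream, ssi, replicated_length=4))  # [(2, b'BBBBBB')]
```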
[0049] In some embodiments, the present technology can be applied
to systems and methods that utilize snapshots and differential
files for replication of the client computing device.
[0050] FIG. 4 illustrates an example method for replicating
metadata changes for a client computing device.
[0051] In some embodiments, the method includes a scanner of a
replication client performing 402 a top-down walk of an entire file
system tree (taken from a snapshot) and comparing 404 the file
system tree with a previous snapshot to determine namespace
changes. The method includes generating 406 a metadata log with the
namespace changes. The method comprises the scanner comparing 408
the file modified time for inode pairs to determine list of changed
files. Next, the method includes evaluating 410 the file system
metadata and namespace (directory hierarchy) in two locations on
the appliance. The method further comprises generating 412 a shell
NTFS volume image that has all file system metadata (ACLs and even
file block map) and namespace (directory hierarchy). No file data
is obtained in this step, in some embodiments.
[0052] In one embodiment, the method comprises obtaining
differential metadata which stores only the file system name space
(directory hierarchy).
[0053] In some embodiments, the method includes the scanner
transferring 414 the metadata log to the replication appliance
where it is applied to the NTFS shell volume. According to some
embodiments, the method includes performing 416 another top-down
walk of the NTFS shell volume (via a FUSE mount, for example) and
comparing 418 with a previous version of differential metadata. In
some embodiments the method includes generating 420 new
differential metadata based on the comparison.
[0054] For context, the method refers to U.S. patent application
Ser. No. 14/864,850, filed Sep. 24, 2015, entitled "Distributed and
Deduplicating Data Storage System and Methods of Use."
[0055] In some embodiments, no metadata log is required or applied
to backup server. The metadata and namespace changes are replicated
by copying all the disk blocks of the file system that are used to
store the file system metadata and namespace information. For
context, most file systems support cloning of just the file system
namespace and metadata without the file data. The replication
client can be configured to use existing file system cloning tools
(like ntfsclone for NTFS) to replicate full file system namespace
and metadata from the client computing device to the replication
appliance. By using ntfsclone, an NTFS shell volume is generated and
saved on the replication appliance, thus keeping the on-disk format
the same.
[0056] Optimized Namespace Change Tracking
[0057] As mentioned above, most file systems allocate inodes for
each object (file/directory) in a simple indexed table (e.g., the MFT),
giving each inode a number, referred to as a file number. On deletion of
an inode, the index number is freed and can be used for a new object
(file/directory) later. Using file system tools, it is possible to
walk through all the inodes starting from index zero to the maximum
allocated index. Instead of performing a full directory walk of two
versions of the file system to compare metadata of two versions of
files (like modified time or other attributes) to determine if a
file has changed, the replication client can directly compare two
versions of the file inodes. Because this approach reads the inodes
directly, one by one, the replication client can avoid having to read
the directories, which is required in a top-down namespace walk.
[0058] The replication client uses one or more of the following
properties/attributes of an NTFS on-disk inode (called an MFT entry). In
one embodiment, NTFS stores a generation number in the inode that
makes it possible to detect reuse of an inode for a new object
(file/directory). In another embodiment, NTFS stores the object
(file/directory) name in the inode itself, which makes it possible
to detect "rename" operations on inodes. NTFS also stores parent
file identifiers of an object (file/directory) in the inode itself,
which makes it possible to detect "move" operations on inodes. In
NTFS, all the inodes are stored as a simple on-disk array in a
special file called the MFT table. Two versions of the MFT table are
compared for each MFT record, and a change log is generated as
follows:
[0059] FIG. 5 illustrates an example method that is executed by the
replication client using the rules set forth above. If the two MFT records
are the same, the replication client does nothing. Again, the comparison
includes comparing pairs of MFT entries to one another in a
sequential manner (e.g., following inode file identifiers in
numerical order from top to bottom).
[0060] If the MFT entry is unused in the older MFT table and used in
the newer MFT table, a create entry is entered 502 into a log record. If
the MFT entry is unused in the newer MFT table and used in the older MFT
table, a log entry is created 504 that reflects a delete
operation.
[0061] If the generation number is different, the inode has been reused:
a log entry is created 506 for the new object (file/dir) in the newer MFT
entry, and the object in the older MFT entry is logged as deleted.
[0062] If the MFT entry in both versions is identical except for a
modified time then a log entry is created 508 that indicates that
the file/directory has changed. In some embodiments, the method
ignores directory changes and log entries are created only for
files. This is due to directories being replicated in other
processes, such as when metadata is replicated.
[0063] If the MFT record has the same generation number but a different
name (or an additional name), then a log entry is created 510 that is
indicative of a rename.
[0064] If the MFT record has the same generation number but a different
parent identifier, then a log entry is created 512 that indicates a
move.
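By way of a hedged illustration, the record-by-record comparison of FIG. 5 can be sketched as below. The MftRecord fields and the compare_mft function are hypothetical stand-ins for parsed MFT entries, not the on-disk NTFS layout, and a reused inode is assumed to be logged as a delete followed by a create.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MftRecord:
    # Hypothetical parsed view of one MFT entry.
    in_use: bool
    generation: int = 0
    name: str = ""
    parent_id: int = 0
    mtime: float = 0.0
    is_dir: bool = False


def compare_mft(old: List[MftRecord], new: List[MftRecord]) -> List[Tuple[str, int]]:
    """Compare two MFT versions entry by entry, in file-number order."""
    log: List[Tuple[str, int]] = []
    unused = MftRecord(in_use=False)
    for file_id in range(max(len(old), len(new))):
        o = old[file_id] if file_id < len(old) else unused
        n = new[file_id] if file_id < len(new) else unused
        if o == n or (not o.in_use and not n.in_use):
            continue                              # nothing changed
        if not o.in_use:
            log.append(("create", file_id))       # step 502
        elif not n.in_use:
            log.append(("delete", file_id))       # step 504
        elif o.generation != n.generation:
            log.append(("delete", file_id))       # step 506: inode reused
            log.append(("create", file_id))
        else:
            if o.name != n.name:
                log.append(("rename", file_id))   # step 510
            if o.parent_id != n.parent_id:
                log.append(("move", file_id))     # step 512
            if o.mtime != n.mtime and not n.is_dir:
                log.append(("modify", file_id))   # step 508, files only
    return log
```

Because both tables are indexed by file number, the comparison is a single sequential pass and never needs to read a directory.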
[0065] In some embodiments the backup server can require that
change records are addressed by a full file path and not only by
inode numbers. However, the file change log addresses files using
file identifiers. The backup-sender converts the file identifier to
a namespace path by opening the file using the file identifier and
then performing a getFileInformation operation on the handle to get
the full path of the file.
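The conversion from a file identifier to a namespace path can also be illustrated without an open file handle. The sketch below is a different technique from the handle-based lookup described above: it assumes an in-memory table that maps each file identifier to a (name, parent identifier) pair, and it follows parent identifiers up to the root. The function name resolve_path and the table layout are hypothetical; the default root identifier of 5 reflects the NTFS convention that the root directory occupies MFT record 5.

```python
from typing import Dict, List, Set, Tuple


def resolve_path(file_id: int,
                 table: Dict[int, Tuple[str, int]],
                 root_id: int = 5) -> str:
    """Build a full path by following parent identifiers up to the root."""
    parts: List[str] = []
    seen: Set[int] = set()
    current = file_id
    while current != root_id:
        if current in seen:
            raise ValueError("cycle in parent identifiers")
        seen.add(current)
        name, parent = table[current]   # KeyError if the identifier is unknown
        parts.append(name)
        current = parent
    return "\\" + "\\".join(reversed(parts))
```

A lookup of this kind only works when every ancestor record is available in the table; the handle-based approach delegates that bookkeeping to the file system.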
[0066] A change log for this process can be converted or
transformed into a "refined log" as follows. For every directory
rename or move record, the replication client is configured to
perform a full tree walk of a directory and add a "create file"
record for all the children files within the directory. If one of
the child files already had a "modified file" record, that record
is replaced with "create file" record. For every file rename or
move record the replication client can add a "create file" record
for that file and delete any "modified file" record for that
file.
[0067] A refined log contains only "file data change" records which
are either "incremental change" (modify) or "full backup" (create).
This refined log is used by the replication client to send FCRs to
the appliance.
[0068] Following is an example list of pseudo-code to generate the
FILE_CHANGE_LOG and DIR_CHANGE_LOG:
[0069] Example code for generating the refined version of the
FILE_CHANGE_LOG is provided below. The refined log
generation is done by a separate program whose input is the
DIR_CHANGE_LOG and FILE_CHANGE_LOG and output is
REFINED_FILE_CHANGE_LOG. In one embodiment, the code specifies:
TABLE-US-00001
HashMap FILE_LOG = read(FILE_CHANGE_LOG)
List DIR_LOG = read(DIR_CHANGE_LOG)
for dir in DIR_LOG do
    //Walk the directory pointed by record
    for file in dir do
        //Remove any record that exist in FILE_LOG for this file
        FILE_LOG.remove(file)
        //Add file as create record in FILE_LOG
        FILE_LOG.add(file, "Created")
    done
done
Flush(FILE_LOG)
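A runnable rendering of the pseudo-code above, offered only as a sketch, is given below in Python. It also covers the file rename or move records described earlier. The names refine_log, walk_files and the "create"/"modify" labels are assumptions made for illustration, and the logs are modeled as in-memory collections rather than files on disk.

```python
import os
from typing import Callable, Dict, Iterable, Iterator


def walk_files(directory: str) -> Iterator[str]:
    """Yield every file below a directory (full tree walk)."""
    for dirpath, _dirnames, filenames in os.walk(directory):
        for filename in filenames:
            yield os.path.join(dirpath, filename)


def refine_log(file_log: Dict[str, str],
               dir_log: Iterable[str],
               file_moves: Iterable[str],
               list_files: Callable[[str], Iterable[str]] = walk_files) -> Dict[str, str]:
    """Return a refined log containing only "create" and "modify" records."""
    refined = dict(file_log)              # leave the caller's log untouched
    for directory in dir_log:             # renamed or moved directories
        for path in list_files(directory):
            refined[path] = "create"      # replaces any "modify" record
    for path in file_moves:               # renamed or moved files
        refined[path] = "create"
    return refined
```

Passing the directory walker as a parameter keeps the sketch testable against an in-memory tree; with the default, the walk reads the mounted file system.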
[0070] FIG. 6 illustrates an example method for determining changes
to a file system and using the changes to create a disk image that
can be booted by a hypervisor (for example) to instantiate a
virtual machine. The disk image can also be booted by a physical
device (e.g., bare metal hardware).
[0071] In some embodiments, the method comprises reading 602 a disk
of a client computing device by a replication agent of a
replication appliance. In one example, reading comprises walking a
first set of inodes of a file system of the client computing device
at a first point in time. Again, these inodes belong to an MFT
table for the client computing device. A second set of inodes of a
file system backup is also read for the client computing
device.
[0072] Next, the method includes comparing 604 the first set of
inodes of the file system of the client computing device to the
second set of inodes of the file system backup of the client
computing device.
[0073] In some embodiments, the method comprises detecting 606
changed inodes between the first set of inodes and the second set
of inodes and translating 608 the changed inodes into namespace
locations. Again, this can be accomplished by using file
identifiers, which point to locations on the disk where objects
within the file system are located.
[0074] Next, the method can comprise obtaining 610 blocks of
changed data using the namespace locations.
[0075] The method can also include generating 612 a disk image of
the client computing device that is capable of being mounted by a
hypervisor so as to create a virtual machine using the file system
backup and the blocks of changed data.
[0076] Referring now to the drawings, and more particularly to
FIG. 7, a schematic diagram of an exemplary architecture 700 for
practicing the present invention is shown. Architecture
700 may include a cloud data store 705, which can comprise a block
store 710 for storing blocks of data for a client computing device
715.
[0077] In some embodiments the client computing device 715 is
coupled to a replication appliance 720 that employs a replication
client 725 (NTFS based) and a scanner 730. The replication client
725 and scanner 730 are described in many of the embodiments
above.
[0078] In general, a cloud-based computing environment is a resource
that typically combines the computational power of a large group of
processors and/or that combines the storage capacity of a large group of
computer memories or storage devices. For example, systems that
provide a cloud resource may be utilized exclusively by their
owners, such as Google.TM. or Yahoo!.TM.; or such systems may be
accessible to outside users who deploy applications within the
computing infrastructure to obtain the benefit of large
computational or storage resources.
[0079] The cloud may be formed, for example, by a network of
servers, with each server (or at least a plurality thereof)
providing processor and/or storage resources. These servers may
manage workloads provided by multiple users (e.g., cloud resource
consumers or other users). Typically, each user places workload
demands upon the cloud that vary in real-time, sometimes
dramatically. The nature and extent of these variations typically
depend on the type of business associated with the user.
[0080] In some instances the block store 710 may include a
deduplicating block store that stores blocks of data for one or
more objects, such as a file, a group of files, or an entire disk.
Additionally the block store 710 may comprise Merkle trees that
include hash-type representations of objects within the
deduplicating block store. That is, for each object (or group of
blocks), a Merkle tree exists that represents the blocks of the
object.
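As a hedged illustration of a hash-type representation of an object, the sketch below computes a Merkle root over the blocks of one object. SHA-256 and the rule of duplicating the last hash on odd-sized levels are assumptions chosen for the example; the description does not specify either, and merkle_root is a hypothetical name.

```python
import hashlib
from typing import List


def merkle_root(blocks: List[bytes]) -> str:
    """Return the hex Merkle root of an object's blocks."""
    if not blocks:
        return hashlib.sha256(b"").hexdigest()
    # Leaves are the hashes of the individual blocks.
    level = [hashlib.sha256(block).digest() for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])       # pad odd levels (assumed rule)
        # Each parent is the hash of its two concatenated children.
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()
```

Two versions of an object that share a root are identical block for block, so a comparison can stop at the root when nothing has changed.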
[0081] In other embodiments, the block store 710 can implement
storage for differential backups instead of Merkle trees.
[0082] According to some embodiments, the deduplicating block store
may include immutable object addressable block storage. The
deduplicating block store may form an underlying storage foundation
that allows for the storing of blocks of objects. The identifiers
of the blocks are a unique representation of the block, generated
for example by using a uniform hash function.
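The idea of immutable, object addressable block storage can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: SHA-256 stands in for the uniform hash function, the 4096-byte block size is arbitrary, and BlockStore and store_object are hypothetical names.

```python
import hashlib
from typing import Dict, List


class BlockStore:
    """In-memory store that addresses each block by the hash of its content."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        block_id = hashlib.sha256(data).hexdigest()
        # An identifier is never rebound, so stored blocks are immutable and
        # identical content is stored only once.
        self._blocks.setdefault(block_id, data)
        return block_id

    def get(self, block_id: str) -> bytes:
        return self._blocks[block_id]

    def __len__(self) -> int:
        return len(self._blocks)


def store_object(store: BlockStore, data: bytes, block_size: int = 4096) -> List[str]:
    """Split an object into fixed-size blocks and return their identifiers."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]
```

Storing an object whose first two blocks are identical yields two equal identifiers and a single stored copy, which is the deduplication effect described above.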
[0083] FIG. 8 illustrates an exemplary computing system 800 that
may be used to implement an embodiment of the present technology.
The computing system 800 of FIG. 8 includes one or more processors
810 and memory 820. Main memory 820 stores, in part, instructions
and data for execution by processor 810. Main memory 820 can store
the executable code when the system 800 is in operation. The system
800 of FIG. 8 may further include a mass storage device 830,
portable storage medium drive(s) 840, output devices 850, user
input devices 860, a graphics display 870, and other peripheral
devices 880. The system 800 may also comprise network storage
845.
[0084] The components shown in FIG. 8 are depicted as being
connected via a single bus 890. The components may be connected
through one or more data transport means. Processor unit 810 and
main memory 820 may be connected via a local microprocessor bus,
and the mass storage device 830, peripheral device(s) 880, portable
storage device 840, and graphics display 870 may be connected via
one or more input/output (I/O) buses.
[0085] Mass storage device 830, which may be implemented with a
magnetic disk drive or an optical disk drive, is a non-volatile
storage device for storing data and instructions for use by
processor unit 810. Mass storage device 830 can store the system
software for implementing embodiments of the present technology for
purposes of loading that software into main memory 820.
[0086] Portable storage device 840 operates in conjunction with a
portable non-volatile storage medium, such as a floppy disk,
compact disk or digital video disc, to input and output data and
code to and from the computing system 800 of FIG. 8. The system
software for implementing embodiments of the present technology may
be stored on such a portable medium and input to the computing
system 800 via the portable storage device 840.
[0087] Input devices 860 provide a portion of a user interface.
Input devices 860 may include an alphanumeric keypad, such as a
keyboard, for inputting alphanumeric and other information, or a
pointing device, such as a mouse, a trackball, stylus, or cursor
direction keys. Additionally, the system 800 as shown in FIG. 8
includes output devices 850. Suitable output devices include
speakers, printers, network interfaces, and monitors.
[0088] Graphics display 870 may include a liquid crystal display
(LCD) or other suitable display device. Graphics display 870
receives textual and graphical information, and processes the
information for output to the display device.
[0089] Peripherals 880 may include any type of computer support
device to add additional functionality to the computing system.
Peripheral device(s) 880 may include a modem or a router.
[0090] The components contained in the computing system 800 of FIG.
8 are those typically found in computing systems that may be
suitable for use with embodiments of the present technology and are
intended to represent a broad category of such computer components
that are well known in the art. Thus, the computing system 800 can
be a personal computer, hand held computing system, telephone,
mobile computing system, workstation, server, minicomputer,
mainframe computer, or any other computing system. The computer can
also include different bus configurations, networked platforms,
multi-processor platforms, etc. Various operating systems can be
used including UNIX, Linux, Windows, Macintosh OS, Palm OS, and
other suitable operating systems.
[0091] As used herein, the term "module" may also refer to any of
an application-specific integrated circuit ("ASIC"), an electronic
circuit, a processor (shared, dedicated, or group) that executes
one or more software or firmware programs, a combinational logic
circuit, and/or other suitable components that provide the
described functionality. In other embodiments, individual modules
may include separately configured web servers.
[0092] Some of the above-described functions may be composed of
instructions that are stored on storage media (e.g.,
computer-readable medium). The instructions may be retrieved and
executed by the processor. Some examples of storage media are
memory devices, tapes, disks, and the like. The instructions are
operational when executed by the processor to direct the processor
to operate in accord with the technology. Those skilled in the art
are familiar with instructions, processor(s), and storage
media.
[0093] It is noteworthy that any hardware platform suitable for
performing the processing described herein is suitable for use with
the technology. The terms "computer-readable storage medium" and
"computer-readable storage media" as used herein refer to any
medium or media that participate in providing instructions to a CPU
for execution. Such media can take many forms, including, but not
limited to, non-volatile media, volatile media and transmission
media. Non-volatile media include, for example, optical or magnetic
disks, such as a fixed disk. Volatile media include dynamic memory,
such as system RAM. Transmission media include coaxial cables,
copper wire and fiber optics, among others, including the wires
that comprise one embodiment of a bus. Transmission media can also
take the form of acoustic or light waves, such as those generated
during radio frequency (RF) and infrared (IR) data communications.
Common forms of computer-readable media include, for example, a
floppy disk, a flexible disk, a hard disk, magnetic tape, any other
magnetic medium, a CD-ROM disk, digital video disk (DVD), any other
optical medium, any other physical medium with patterns of marks or
holes, a RAM, a PROM, an EPROM, an EEPROM, a FLASH EPROM, any other
memory chip or data exchange adapter, a carrier wave, or any other
medium from which a computer can read.
[0094] Various forms of computer-readable media may be involved in
carrying one or more sequences of one or more instructions to a CPU
for execution. A bus carries the data to system RAM, from which a
CPU retrieves and executes the instructions. The instructions
received by system RAM can optionally be stored on a fixed disk
either before or after execution by a CPU.
[0095] Computer program code for carrying out operations for
aspects of the present invention may be written in any combination
of one or more programming languages, including an object oriented
programming language such as Java, Smalltalk, C++ or the like and
conventional procedural programming languages, such as the "C"
programming language or similar programming languages. The program
code may execute entirely on the user's computer, partly on the
user's computer, as a stand-alone software package, partly on the
user's computer and partly on a remote computer or entirely on the
remote computer or server. In the latter scenario, the remote
computer may be connected to the user's computer through any type
of network, including a local area network (LAN) or a wide area
network (WAN), or the connection may be made to an external
computer (for example, through the Internet using an Internet
Service Provider).
[0096] The corresponding structures, materials, acts, and
equivalents of all means or step plus function elements in the
claims below are intended to include any structure, material, or
act for performing the function in combination with other claimed
elements as specifically claimed. The description of the present
invention has been presented for purposes of illustration and
description, but is not intended to be exhaustive or limited to the
invention in the form disclosed. Many modifications and variations
will be apparent to those of ordinary skill in the art without
departing from the scope and spirit of the invention. Exemplary
embodiments were chosen and described in order to best explain the
principles of the present technology and its practical application,
and to enable others of ordinary skill in the art to understand the
invention for various embodiments with various modifications as are
suited to the particular use contemplated.
[0097] Aspects of the present invention are described above with
reference to flowchart illustrations and/or block diagrams of
methods, apparatus (systems) and computer program products
according to embodiments of the invention. It will be understood
that each block of the flowchart illustrations and/or block
diagrams, and combinations of blocks in the flowchart illustrations
and/or block diagrams, can be implemented by computer program
instructions. These computer program instructions may be provided
to a processor of a general purpose computer, special purpose
computer, or other programmable data processing apparatus to
produce a machine, such that the instructions, which execute via
the processor of the computer or other programmable data processing
apparatus, create means for implementing the functions/acts
specified in the flowchart and/or block diagram block or
blocks.
[0098] These computer program instructions may also be stored in a
computer readable medium that can direct a computer, other
programmable data processing apparatus, or other devices to
function in a particular manner, such that the instructions stored
in the computer readable medium produce an article of manufacture
including instructions which implement the function/act specified
in the flowchart and/or block diagram block or blocks.
[0099] The computer program instructions may also be loaded onto a
computer, other programmable data processing apparatus, or other
devices to cause a series of operational steps to be performed on
the computer, other programmable apparatus or other devices to
produce a computer implemented process such that the instructions
which execute on the computer or other programmable apparatus
provide processes for implementing the functions/acts specified in
the flowchart and/or block diagram block or blocks.
[0100] The flowchart and block diagrams in the Figures illustrate
the architecture, functionality, and operation of possible
implementations of systems, methods and computer program products
according to various embodiments of the present invention. In this
regard, each block in the flowchart or block diagrams may represent
a module, segment, or portion of code, which comprises one or more
executable instructions for implementing the specified logical
function(s). It should also be noted that, in some alternative
implementations, the functions noted in the block may occur out of
the order noted in the figures. For example, two blocks shown in
succession may, in fact, be executed substantially concurrently, or
the blocks may sometimes be executed in the reverse order,
depending upon the functionality involved. It will also be noted
that each block of the block diagrams and/or flowchart
illustration, and combinations of blocks in the block diagrams
and/or flowchart illustration, can be implemented by special
purpose hardware-based systems that perform the specified functions
or acts, or combinations of special purpose hardware and computer
instructions.
[0101] While various embodiments have been described above, it
should be understood that they have been presented by way of
example only, and not limitation. The descriptions are not intended
to limit the scope of the technology to the particular forms set
forth herein. Thus, the breadth and scope of a preferred embodiment
should not be limited by any of the above-described exemplary
embodiments. It should be understood that the above description is
illustrative and not restrictive. To the contrary, the present
descriptions are intended to cover such alternatives,
modifications, and equivalents as may be included within the spirit
and scope of the technology as defined by the appended claims and
otherwise appreciated by one of ordinary skill in the art. The
scope of the technology should, therefore, be determined not with
reference to the above description, but instead should be
determined with reference to the appended claims along with their
full scope of equivalents.
* * * * *