U.S. patent application number 17/480891 was filed with the patent office on 2021-09-21 and published on 2022-05-12 for system and method for synchronizing file systems with large namespaces.
This patent application is currently assigned to Egnyte, Inc. The applicant listed for this patent is Egnyte, Inc. Invention is credited to Hakan Ancin, Amrit Jassal, Manish Marathe, Shishir Sharma, Dmitry Tisnek.
Application Number: 17/480891
Publication Number: 20220147488
Filed Date: 2021-09-21
Publication Date: 2022-05-12
United States Patent Application 20220147488
Kind Code: A1
Sharma; Shishir; et al.
May 12, 2022
System And Method For Synchronizing File Systems With Large
Namespaces
Abstract
A method for facilitating synchronization of a remote file
system (RFS) and a local file system (LFS) includes maintaining a
global revision identifier associated with the RFS, where the
global revision identifier has a variable state, and responsive to
the RFS being modified, updating metadata associated with one or
more of the file system objects of the RFS based on a state of the
global revision identifier. More particularly, the method includes
updating the global revision identifier to a new state in response
to the RFS modification and then updating the metadata of the
folder(s) associated with the RFS modification. The RFS metadata is
partitioned by folder and file to facilitate efficient searching
based on a requested namespace view. Another method utilizes prior
revision identifiers stored in the LFS to limit the RFS metadata
returned to the local cloud during a rescan, by excluding the data
set that has not changed, which improves the efficiency of the
rescan synchronization.
Inventors: Sharma; Shishir (Mountain View, CA); Marathe; Manish (San Jose, CA);
Jassal; Amrit (Morgan Hill, CA); Ancin; Hakan (Los Gatos, CA);
Tisnek; Dmitry (Poznan, PL)
Applicant: Egnyte, Inc., Mountain View, CA, US
Assignee: Egnyte, Inc., Mountain View, CA
Appl. No.: 17/480891
Filed: September 21, 2021
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
15179459 | Jun 10, 2016 | 11144510
17480891 | |
62174099 | Jun 11, 2015 |
International Class: G06F 16/178 20060101 G06F016/178
Claims
1. In a remote file storage system, a method for facilitating
synchronization of a remote file system (RFS) and a local file
system (LFS), said RFS being located remotely from said LFS, said
method comprising: maintaining a global revision identifier
associated with said RFS, said global revision identifier having a
variable state; establishing a connection with a user; providing
access to said RFS to said user, said RFS including a plurality of
file system objects, each of said file system objects being
associated with metadata; receiving an instruction to modify said
RFS from said user; and responsive to said RFS being modified,
updating said metadata associated with one or more of said file
system objects of said RFS based on a state of said global revision
identifier.
2. The method of claim 1, wherein said step of updating said
metadata includes altering said metadata of said one or more file
system objects to reflect said state of said global revision
identifier.
3. The method of claim 2, further comprising updating said state of
said global revision identifier to a new state prior to said step
of updating said metadata based on said state of said global
revision identifier.
4. The method of claim 1, wherein said step of updating said
metadata associated with said one or more file system objects
includes updating metadata associated with one or more folders of
said RFS.
5. The method of claim 4, wherein said instruction to modify said
RFS comprises an instruction involving a file located within said
one or more of said folders having said metadata associated
therewith updated.
6. The method of claim 1, wherein: said metadata associated with
said file system objects of said RFS are partitioned into a folders
portion and a files portion; said folders portion includes a
plurality of folder records each storing folder metadata associated
with a folder of said RFS; said files portion includes a plurality
of file records each storing file metadata associated with a file
of said RFS; said folder metadata for each of said folder records
includes revision data indicative of said state of said global
revision identifier at some point in time; and said step of
updating said metadata of said one or more file system objects
based on said state of said global revision identifier includes
updating said revision data of said folder metadata.
7. The method of claim 1, further comprising: establishing a second
connection with a local file storage system associated with said
LFS; receiving a synchronization request from said local file
storage system; and providing metadata associated with at least
some of said file system objects of said RFS to said local file
storage system.
8. The method of claim 7, wherein: said synchronization request
includes data indicative of at least one requested folder within
said RFS; and said step of providing metadata includes providing
folder metadata associated with one or more identified folders of
said RFS based on said at least one requested folder of said RFS,
said folder metadata including revision data being indicative of a
state of said global revision identifier the last time an
identified folder of said RFS was modified.
9. The method of claim 8, further comprising: receiving a file
listing request from said local file storage system for at least
some of said plurality of folders identified by said folder
metadata provided to said local file storage system; and providing
file metadata for files stored in each folder identified in said
file listing request to said local file storage system.
10. The method of claim 7, further comprising providing a current
state of said global revision identifier to said local file storage
system.
11-21. (canceled)
22. A method for synchronizing a local file system (LFS) with an
associated remote file system (RFS), said LFS being located
remotely from said RFS, said method comprising: establishing a
network connection with a remote file storage system associated
with said RFS; initiating a synchronization process with said
remote file storage system; providing at least one namespace
identifier to said remote file storage system via said network
connection, said namespace identifier being indicative of at least
a portion of a subscribed namespace synchronized on said RFS and
said LFS; and providing at least one prior revision identifier to
said remote file storage system, said prior revision identifier
defining a state associated with said RFS when said at least said
portion of said subscribed namespace was last synchronized between
said RFS and said LFS.
23. The method of claim 22, further comprising receiving RFS
metadata from said remote file storage system, said RFS metadata
being associated with file system objects in said at least said
portion of said subscribed namespace that have been modified since
said prior synchronization.
24. The method of claim 23, wherein: said RFS metadata comprises
folder metadata associated with said one or more folders of said
subscribed namespace; and said method further comprises querying
said remote file storage system for file metadata associated with
files stored in at least some of said one or more folders.
25. The method of claim 22, further comprising: receiving a current
revision identifier from said remote file storage system, said
current revision identifier defining a current state associated
with said RFS; and storing said current revision identifier in LFS
metadata associated with at least one folder of said portion of
said subscribed namespace on said LFS associated with said
namespace identifier.
26. A local file storage system storing a local file system (LFS)
that is synchronized with an associated remote file system (RFS)
stored remotely from said LFS, said local file storage system
comprising: memory for storing data, said data including said LFS
including a plurality of file system objects, each of said file
system objects including LFS metadata, and a plurality of prior
revision identifiers, each prior revision identifier defining a
state of said RFS at a time when an associated folder of a
subscribed namespace between said LFS and said RFS was last
synchronized; a remote cloud interface configured to establish a
connection with said remote file storage system; and a synchronizer
configured to initiate a synchronization process with said remote
file storage system, provide at least one namespace identifier to
said remote file storage system, said namespace identifier being
indicative of at least a portion of said subscribed namespace, and
provide at least one prior revision identifier associated with said
namespace identifier to said remote file storage system.
27. The system of claim 26, wherein said remote cloud interface
receives RFS metadata from said remote file storage system, said
RFS metadata being associated with file system objects in at least
said portion of said subscribed namespace that have been modified
since said prior synchronization.
28. The system of claim 27, wherein: said RFS metadata comprises
folder metadata associated with said one or more folders of said
subscribed namespace; and said synchronizer is configured to query
said remote file storage system for file metadata associated with
files stored in at least some of said one or more folders.
29. The system of claim 26, wherein: said remote cloud interface is
configured to receive a current revision identifier from said
remote file storage system, said current revision identifier
defining a current state associated with said RFS; and said
synchronizer is configured to store said current revision
identifier in LFS metadata associated with at least one folder of
said portion of said subscribed namespace on said LFS associated
with said namespace identifier.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of co-pending U.S. patent
application Ser. No. 15/179,459, filed on Jun. 10, 2016 by the same
inventors, which claims the benefit of U.S. Provisional Patent
Application Ser. No. 62/174,099, filed on Jun. 11, 2015 by the same
inventors and entitled "System and Method for Synchronizing File
Systems with Large Namespaces", both of which are incorporated
herein by reference in their respective entireties.
BACKGROUND OF THE INVENTION
Field of the Invention
[0002] This invention relates generally to computer systems, and
more particularly to cloud file storage systems. Even more
particularly, this invention relates to synchronizing a remote file
system and a local file system having a large synchronized
namespace.
Description of the Background Art
[0003] Cloud computing systems are known. In cloud computing
systems, computing and storage services are provided to remote
clients over a wide area network such as the Internet. In the case
of storage, the client's file system will exist both in the
client's local storage device (e.g., a hard drive, network attached
storage (NAS) device, etc.) and in the cloud. For example, the
client might store a backup copy of its local file system in the
cloud for both data protection and remote access. However, these
benefits are negated as the remote (cloud) file system becomes out
of date as more and more changes are made to the local file system
and/or the remote file system.
[0004] The client's local and remote file systems can be
resynchronized. Rescan synchronization is one method that can be
used to resynchronize the remote and local file systems. In a
rescan synchronization, the namespaces of the local file system and
the remote file system are compared on an object-by-object basis to
identify differences between the local and remote file systems, and
then file system operations are applied to the local and remote
file systems to bring them back into synchronization. A full rescan
synchronization (FRS) is used to resynchronize the entire
synchronized file system, whereas a limited rescan synchronization
(LRS) is used to resynchronize a smaller portion of the
synchronized file system. Because FRS and LRS operate similarly,
just on different amounts of the synchronized file system, they can
be referred to generally as a rescan synchronization (RS).
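By way of a non-limiting illustration, the object-by-object comparison performed during a rescan synchronization might be sketched as follows. The snapshot layout (a mapping from path to a content fingerprint) and the names `RescanResult` and `rescan_compare` are assumptions made for this sketch only and are not taken from the disclosure.

```python
from typing import Dict, List, NamedTuple


class RescanResult(NamedTuple):
    only_remote: List[str]  # present on the RFS, missing from the LFS
    only_local: List[str]   # present on the LFS, missing from the RFS
    differing: List[str]    # present on both, but the metadata disagrees


def rescan_compare(remote: Dict[str, str], local: Dict[str, str]) -> RescanResult:
    """Compare two namespace snapshots object by object.

    Each snapshot maps a path to a fingerprint string (hypothetical metadata).
    """
    only_remote = sorted(p for p in remote if p not in local)
    only_local = sorted(p for p in local if p not in remote)
    differing = sorted(p for p in remote if p in local and remote[p] != local[p])
    return RescanResult(only_remote, only_local, differing)


# Hypothetical snapshots of the same synchronized namespace.
remote_snapshot = {"/docs/a.txt": "h1", "/docs/b.txt": "h2", "/docs/c.txt": "h3"}
local_snapshot = {"/docs/a.txt": "h1", "/docs/b.txt": "h9", "/docs/d.txt": "h4"}
result = rescan_compare(remote_snapshot, local_snapshot)
```

Every object in both snapshots is visited, including unchanged ones such as "/docs/a.txt", so the cost of the comparison grows with the size of the namespace rather than with the number of changes.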
[0005] Rescan synchronizations, however, suffer from serious
drawbacks. First, rescan synchronizations are very memory
intensive, especially for file systems with extremely large
namespaces. For example, a metadata snapshot of the entire
synchronized file system can be extremely large, and this snapshot
will need to be communicated from the cloud to the client's local
device, or vice-versa. For a synchronized file system containing
hundreds of millions of objects, this file transfer will use a
large amount of communication bandwidth (e.g., over the Internet).
Even a snapshot associated with an LRS on a large namespace can be
very bandwidth intensive. Additionally, the actual object-by-object
comparison between the remote and local file systems still needs to
occur, and it is again very time consuming and processor intensive.
Moreover, oftentimes the majority of the file system objects that
have been synchronized have not changed on either file system since
the last synchronization, and so system resources spent performing
the object-by-object comparison on the unchanged items are wasted.
The above limitations become even more problematic if portions of
the remote and/or local file systems are unavailable during the
rescan synchronization. The above limitations also make it
difficult for the cloud to serve multiple customers and their
client devices efficiently.
[0006] What is needed, therefore, is a system and method that
facilitates efficiently resynchronizing a client's remote and local
file systems, even where the client's remote file system is very
large. What is also needed is a system and method for
resynchronizing remote and local file systems that is more scalable
so as to serve multiple clients having different sizes of
synchronized file systems.
SUMMARY
[0007] The present invention overcomes the problems associated with
the prior art by providing a system and method for facilitating
efficient rescan synchronizations between a remote cloud server and
a local cloud server. In particular, the invention optimizes the
metadata transfer and comparison processes that are part of the
rescan by limiting the remote dataset that needs to be transferred
to the local cloud to the dataset that may have actually changed
since the last synchronization. Additionally, metadata associated
with the remote file system is partitioned by folders and files for
efficient searching. The invention maintains a global revision
identifier having a variable state and associates revision data
with at least some of the remote file system's metadata. Remote
file system objects that may have changed can then be quickly
identified by examining the revision data stored in the metadata.
Portions of the synchronized namespace that have not changed can
then be excluded based on the revision data and not transferred to
the local cloud for comparison. Therefore, rescan synchronizations
according to the invention are more efficient and easier on system
resources.
[0008] In a remote file storage system, a method for facilitating
synchronization of a remote file system (RFS) and a local file
system (LFS) is disclosed, where the RFS includes a plurality of
file system objects each associated with metadata. The method
includes maintaining a global revision identifier associated with
the RFS, where the global revision identifier has a variable state,
establishing a connection with a user, providing access to the RFS
to the user, receiving an instruction to modify the RFS from the
user, and responsive to the RFS being modified, updating the
metadata associated with one or more of the file system objects of
the RFS based on a state of the global revision identifier. The
step of updating the metadata can include altering the metadata of
the one or more file system objects (folders) to reflect the state
of the global revision identifier. More particularly, when a file
is updated, the metadata of the folder in which the file is located
is updated. A particular method includes updating the state of the
global revision identifier to a new state prior to updating the
metadata based on the state of the global revision identifier.
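By way of a non-limiting illustration, the ordering described above (first advance the global revision identifier, then record its new state in the metadata of the containing folder) might be sketched as follows. Representing the variable state as an increasing integer, and the names `RemoteFileSystem` and `modify_file`, are assumptions made for this sketch only.

```python
import posixpath
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RemoteFileSystem:
    # Assumption: the "variable state" is modeled as an increasing integer.
    global_revision: int = 0
    # Folder path -> state of the global revision identifier last stamped on it.
    folder_revision: Dict[str, int] = field(default_factory=dict)

    def modify_file(self, file_path: str) -> int:
        """Record a change to a file by stamping the folder that contains it."""
        folder = posixpath.dirname(file_path)
        if folder not in self.folder_revision:
            raise KeyError(f"unknown folder: {folder}")
        # Step 1: update the global revision identifier to a new state.
        self.global_revision += 1
        # Step 2: update the folder metadata to reflect that new state.
        self.folder_revision[folder] = self.global_revision
        return self.global_revision


rfs = RemoteFileSystem(folder_revision={"/docs": 0, "/media": 0})
rfs.modify_file("/docs/a.txt")
rfs.modify_file("/docs/b.txt")
```

In this sketch the untouched folder "/media" keeps its earlier revision data, which is what later allows it to be recognized as unchanged.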
[0009] The metadata of the RFS can also be partitioned into a
folders portion and a files portion, where the folders portion
includes a plurality of folder records each storing folder metadata
associated with a folder of the RFS, the files portion includes a
plurality of file records each storing file metadata associated
with a file of the RFS, and the folder metadata for each of the
folder records includes revision data indicative of the state of
the global revision identifier at some point in time. Accordingly,
the step of updating the metadata of the one or more file system
objects includes updating the revision data of the folder
metadata.
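By way of a non-limiting illustration, metadata partitioned into a folders portion and a files portion might be sketched with two tables in an in-memory SQLite database. The table names, column names, and sample paths are assumptions made for this sketch only; the disclosure's own exemplary data structure is the one shown in FIG. 4.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE folders (            -- folders portion: one record per folder
        folder_id INTEGER PRIMARY KEY,
        path      TEXT UNIQUE NOT NULL,
        revision  INTEGER NOT NULL    -- revision data stamped on the folder
    );
    CREATE TABLE files (              -- files portion: one record per file
        file_id   INTEGER PRIMARY KEY,
        folder_id INTEGER NOT NULL REFERENCES folders(folder_id),
        name      TEXT NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO folders (folder_id, path, revision) VALUES (?, ?, ?)",
    [(1, "/Chicago_Domain", 3), (2, "/Chicago_Domain/reports", 7), (3, "/Other", 5)],
)
conn.executemany(
    "INSERT INTO files (folder_id, name) VALUES (?, ?)",
    [(2, "q1.pdf"), (2, "q2.pdf"), (3, "plan.doc")],
)


def folders_in_view(db: sqlite3.Connection, root: str) -> List[Tuple[str, int]]:
    """Return (path, revision) for the root folder and all of its subfolders."""
    prefix = root + "/"
    rows = db.execute(
        "SELECT path, revision FROM folders "
        "WHERE path = ? OR substr(path, 1, ?) = ? ORDER BY path",
        (root, len(prefix), prefix),
    )
    return rows.fetchall()


def files_in_folder(db: sqlite3.Connection, folder_path: str) -> List[str]:
    """Return the names of the files stored directly in one folder."""
    rows = db.execute(
        "SELECT f.name FROM files f JOIN folders d ON d.folder_id = f.folder_id "
        "WHERE d.path = ? ORDER BY f.name",
        (folder_path,),
    )
    return [name for (name,) in rows]
```

Because the folder records are stored separately, a namespace view can be searched without touching the (typically much larger) set of file records.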
[0010] The invention also provides RFS metadata to a local file
storage system (the local cloud) for use during a rescan
synchronization. A particular method includes establishing a second
connection with a local file storage system associated with the
LFS, receiving a synchronization request from the local file
storage system, and providing metadata associated with at least
some of the file system objects of the RFS to the local file
storage system. Additionally, the synchronization request from the
local cloud server can include data indicative of at least one
requested folder within the RFS, and the step of providing metadata
can include providing folder metadata associated with one or more
identified folders of the RFS to the local cloud server. In such a
case, the folder metadata includes revision data that is indicative
of a state of the global revision identifier the last time an
identified folder of the RFS was modified. Even more particularly,
the method can include the steps of receiving a file listing
request from the local file storage system for at least some of the
plurality of folders identified by the folder metadata provided to
the local cloud, and providing file metadata for files stored in
each folder identified in the file listing request to the local
file storage system. Any of the above methods can also include
providing a current state of the global revision identifier to the
local file storage system.
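By way of a non-limiting illustration, the two kinds of requests described above (a synchronization request answered with folder metadata, and a later file listing request answered with file metadata) might be sketched on the remote side as follows. The class name `RemoteSynchronizer`, its method names, and the in-memory dictionaries are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RemoteSynchronizer:
    global_revision: int                # current state of the revision identifier
    folder_revision: Dict[str, int]     # folder path -> revision data
    folder_files: Dict[str, List[str]]  # folder path -> file names in that folder

    def handle_sync_request(self, requested_folder: str) -> Tuple[int, Dict[str, int]]:
        """Return the current revision and folder metadata for the requested subtree."""
        prefix = requested_folder.rstrip("/") + "/"
        identified = {
            path: rev
            for path, rev in self.folder_revision.items()
            if path == requested_folder or path.startswith(prefix)
        }
        return self.global_revision, identified

    def handle_file_listing_request(self, folders: List[str]) -> Dict[str, List[str]]:
        """Return file metadata (here, just names) for each known folder requested."""
        return {
            folder: sorted(self.folder_files.get(folder, []))
            for folder in folders
            if folder in self.folder_revision
        }


sync = RemoteSynchronizer(
    global_revision=12,
    folder_revision={
        "/Chicago_Domain": 3,
        "/Chicago_Domain/hr": 8,
        "/Chicago_Domain/sales": 12,
        "/Other": 5,
    },
    folder_files={
        "/Chicago_Domain/hr": ["roster.xlsx"],
        "/Chicago_Domain/sales": ["leads.csv", "forecast.xlsx"],
    },
)
current, folders = sync.handle_sync_request("/Chicago_Domain")
listing = sync.handle_file_listing_request(["/Chicago_Domain/sales"])
```

Splitting the exchange in two lets the local side ask for file metadata only for the folders it decides are worth examining.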
[0011] A remote file storage system is also disclosed. The remote
file storage system includes memory for storing data, a user
interface operative to establish a connection with a user of the
client and to provide access to the RFS to the user, and an RFS
updater. The stored data includes an RFS including metadata and a
global revision identifier having a variable state. The RFS updater
is configured to update the metadata associated with one or more of
the file system objects based on a state of the global revision
identifier, responsive to the RFS being modified by the user. In a
particular embodiment, the RFS updater is configured to alter the
metadata of the one or more file system objects (one or more
folders) to reflect the state of the global revision identifier. If
the alteration is to a file, then the RFS updater is configured to
update the metadata associated with the folder containing the file.
In another particular embodiment, the RFS updater is configured to
update the state of the global revision identifier to a new state
prior to updating the metadata based on the global revision
identifier.
[0012] The metadata associated with the RFS can also be partitioned
into a folders portion and a files portion as described above. In
such a case, the RFS updater is configured to update the revision
data of the folder metadata responsive to the RFS being modified by
the user.
[0013] In another particular embodiment, the remote file storage
system includes a remote synchronizer, and the client interface is
configured to establish a second connection with a local file
storage system associated with an LFS and to receive a
synchronization request from the local file storage system. The
remote synchronizer is also configured to provide metadata
associated with at least some of the file system objects of the RFS
to the local file storage system. More particularly, the
synchronization request from the local cloud can include at least
one requested folder within the RFS, and the remote synchronizer is
configured to provide folder metadata associated with one or more
identified folders of the RFS to the local file storage system,
where the provided folder metadata includes revision data
indicative of the state of the global revision identifier the last
time an identified folder of the RFS was modified. The client
interface can also be configured to receive a file listing request
from the local file storage system for at least some of the
plurality of folders associated with the folder metadata provided
to the local file storage system, and the remote synchronizer is
further configured to provide file metadata for files stored in
each folder identified in the file listing request. The remote
synchronizer is also configured to provide a current state of the
global revision identifier to the local file storage system.
[0014] Importantly, the remote file storage system of the
invention includes a means for tracking a revision state associated
with the RFS and a means for updating the metadata associated with
one or more of the file system objects based on the revision state
responsive to the RFS being modified by the user.
[0015] A method for synchronizing an LFS with a remotely-located
RFS is also disclosed and includes the steps of establishing a
network connection with a remote file storage system, initiating a
synchronization process with the remote file storage system,
providing at least one namespace identifier to the remote file
storage system, and providing at least one prior revision
identifier to the remote file storage system. Here, the namespace
identifier is indicative of at least a portion of a subscribed
namespace synchronized on the RFS and the LFS, and the prior
revision identifier defines a state associated with the RFS when
the portion of the subscribed namespace was last synchronized. A
more particular method includes the step of receiving RFS metadata
from the remote file storage system, where the RFS metadata is
associated with file system objects in the associated portion of
the subscribed namespace that have been modified since the prior
synchronization. In a still more particular method, the RFS
metadata includes folder metadata associated with the one or more
folders of the subscribed namespace, and the method further
includes querying the remote file storage system for file metadata
associated with files stored in at least some of the one or more
folders. Still another particular method includes the steps of
receiving a current revision identifier from the remote file
storage system, where the current revision identifier defines a
current state associated with the RFS, and storing the current
revision identifier in LFS metadata associated with at least one
folder of the portion of the subscribed namespace on the LFS
associated with the namespace identifier.
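By way of a non-limiting illustration, the local-side method described above might be sketched as follows, with a small in-memory stand-in for the remote file storage system. Treating revision identifiers as increasing integers that can be compared with `>`, and all of the names and sample data below, are assumptions made for this sketch only.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the remote file storage system.
REMOTE_GLOBAL_REVISION = 9
REMOTE_FOLDERS: Dict[str, int] = {
    "/Chicago_Domain": 2,
    "/Chicago_Domain/hr": 4,
    "/Chicago_Domain/sales": 9,
}
REMOTE_FILES: Dict[str, List[str]] = {
    "/Chicago_Domain/hr": ["roster.xlsx"],
    "/Chicago_Domain/sales": ["leads.csv"],
}


def remote_changed_folders(namespace_id: str, prior_revision: int) -> Tuple[int, Dict[str, int]]:
    """Return the current revision and only the folders stamped after prior_revision."""
    prefix = namespace_id + "/"
    changed = {
        path: rev
        for path, rev in REMOTE_FOLDERS.items()
        if (path == namespace_id or path.startswith(prefix)) and rev > prior_revision
    }
    return REMOTE_GLOBAL_REVISION, changed


def remote_list_files(folder: str) -> List[str]:
    """Return file metadata (here, just names) for one folder."""
    return list(REMOTE_FILES.get(folder, []))


def rescan(namespace_id: str, prior_revisions: Dict[str, int]) -> Dict[str, List[str]]:
    """Provide the namespace and prior revision, then fetch files for changed folders."""
    prior = prior_revisions.get(namespace_id, 0)
    current, changed = remote_changed_folders(namespace_id, prior)
    listings = {folder: remote_list_files(folder) for folder in sorted(changed)}
    # Store the current revision identifier for use as the next prior revision.
    prior_revisions[namespace_id] = current
    return listings


stored_revisions = {"/Chicago_Domain": 4}  # state at the last synchronization
first = rescan("/Chicago_Domain", stored_revisions)
second = rescan("/Chicago_Domain", stored_revisions)
```

In this sketch only the folder modified after revision 4 is returned on the first pass, and an immediate second pass returns nothing, because the stored revision identifier has caught up with the remote state.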
[0016] A local file storage system storing an LFS that is
synchronized with a remotely-stored RFS includes memory for storing
data, a remote cloud interface configured to establish a connection
with the remote file storage system, and a synchronizer. The data
includes the LFS, which includes a plurality of file system objects
and associated LFS metadata. The data also includes a plurality of
prior revision identifiers, where each prior revision identifier
defines a state of the RFS at a time when an associated folder of a
subscribed namespace between the LFS and the RFS was last
synchronized. The synchronizer of the system is configured to
initiate a synchronization process with the remote file storage
system, provide at least one namespace identifier to the remote
file storage system, where the namespace identifier is indicative
of at least a portion of the subscribed namespace, and provide at
least one prior revision identifier associated with the namespace
identifier to the remote file storage system. In a particular
embodiment, the remote cloud interface receives RFS metadata from
the remote file storage system, where the RFS metadata is
associated with file system objects in at least the portion of the
subscribed namespace that have been modified since the prior
synchronization. More particularly, the RFS metadata includes
folder metadata associated with the one or more folders of the
subscribed namespace, and the synchronizer is configured to query
the remote file storage system for file metadata associated with
files stored in at least some of the one or more folders. In still
another particular embodiment, the remote cloud interface is
configured to receive a current revision identifier from the remote
file storage system, where the current revision identifier defines
a current state associated with the RFS, and the synchronizer is
configured to store the current revision identifier in LFS metadata
associated with at least one folder of the portion of the
subscribed namespace on the LFS associated with the namespace
identifier.
[0017] Non-transitory, electronically-readable media having code
embodied therein for causing an electronic device to perform the
methods of the present invention are also disclosed.
BRIEF DESCRIPTION OF THE DRAWINGS
[0018] The present invention is described with reference to the
following drawings, wherein like reference numbers denote
substantially similar elements:
[0019] FIG. 1 is a diagram of a cloud computing system;
[0020] FIG. 2 is a diagram of a global namespace for a client's
remote file system stored on the remote cloud server of FIG. 1;
[0021] FIG. 3A is a diagram summarizing a process for synchronizing
a subscribed namespace of a remote file system and a local file
system for the first time according to the present invention;
[0022] FIG. 3B is a diagram summarizing a process for
resynchronizing all or parts of the subscribed namespace according
to the present invention;
[0023] FIG. 4 shows an exemplary data structure for storing
metadata representative of the client's remote file system on the
remote cloud server of FIG. 1;
[0024] FIG. 5 shows an exemplary data structure for storing
metadata representative of the client's local file system on the
local cloud server of FIG. 1;
[0025] FIG. 6 shows an exemplary data structure for storing records
in a last valid sync (LVS) database according to the present
invention;
[0026] FIG. 7 is a block diagram of the remote cloud server of FIG.
1;
[0027] FIG. 8 is a relational diagram showing the functional
aspects of the remote cloud server of FIG. 7;
[0028] FIG. 9 is a table indicating how metadata of the remote file
system is updated based on the state of a revision identifier in
response to changes to the remote file system;
[0029] FIG. 10 is a block diagram of the local cloud server of FIG.
1;
[0030] FIG. 11 is a relational diagram showing the functional
aspects of the local cloud server of FIG. 10;
[0031] FIG. 12A shows a table illustrating part of a full rescan
synchronization (FRS) according to the present invention;
[0032] FIG. 12B shows a table illustrating part of a limited rescan
synchronization (LRS) according to the present invention;
[0033] FIG. 13 is a flowchart summarizing one method of
facilitating synchronization of remote and local file systems
according to the present invention;
[0034] FIG. 14 is a flowchart summarizing one method of performing
the seventh step of the method of FIG. 13;
[0035] FIG. 15 is a flowchart summarizing one method of
synchronizing remote and local file systems according to the
present invention;
[0036] FIG. 16 is a flowchart summarizing a first method of
performing the second step of the method of FIG. 15 according to
the invention;
[0037] FIG. 17 is a flowchart summarizing a method of performing
the fourth step of the method of FIG. 16 according to the
invention; and
[0038] FIG. 18 is a flowchart summarizing an alternative method of
performing the second step of the method of FIG. 15 according to
the invention.
DETAILED DESCRIPTION
[0039] The present invention overcomes the problems associated with
the prior art, by providing a system and method for efficiently
synchronizing file systems having large namespaces between the
cloud and the client's local device. In the following description,
numerous specific details are set forth (e.g., particular
functional modules, particular metadata examples, particular API
examples, etc.) in order to provide a thorough understanding of the
invention. Those skilled in the art will recognize, however, that
the invention may be practiced apart from these specific details.
In other instances, details of well-known cloud computing practices
and components have been omitted, so as not to unnecessarily
obscure the present invention.
[0040] FIG. 1 shows a cloud computing system 100 that includes a
remote cloud server 102, a local cloud server 104, and a local
cloud server 106, which communicate and are synchronized via the
Internet 108. Local cloud server 104 can be hosted, for example, by
a file server in an office 110 and is, therefore, sometimes
referred to as an office local cloud (OLC). In this example, the
client's office 110 is located in Chicago, Ill. A local file system
(e.g., namespace and file data) stored on local cloud server 104 is
synchronized with remote cloud server 102 to provide local and
remote data access and remote data security. In this embodiment, at
least a portion of the local file system stored on local cloud
server 104 is bi-directionally synchronized with remote cloud
server 102, although one-way synchronization of the local and
remote file systems is also possible. Similarly, the client's local
file system (e.g., namespace and file data) stored on local cloud
server 106, which is located in the client's office in Taipei,
Taiwan, is also synchronized with remote cloud server 102.
[0041] Local users 112 of the client can access local file system
objects stored on local cloud 104 via a local network 114.
Optionally, local cloud 104 can extend access for local users 112
to the client's remote file system stored on remote cloud server
102 via Internet 108. Similarly, local cloud server 106 also
provides local file system access to its own local users (not
shown) of the client and optionally extends access for those local
users to the remote file system on remote cloud server 102. Remote
users 116 of the client can access the client's remote file system,
including the client's global namespace and associated data files,
by accessing remote cloud 102 via Internet 108 or some other
connection 118 with remote cloud server 102.
[0042] In the present embodiment, the client associated with local
cloud servers 104 and 106 is a customer of a cloud service provider
operating remote cloud server 102. It will be understood,
therefore, that remote cloud server 102 can store and synchronize
file systems associated with other customers as well, for example,
on a subscription basis. Additionally, the present invention will
be primarily described hereinafter with reference to local cloud
server 104 and remote cloud server 102, but the present invention
can be equally applied between local cloud server 106 and remote
cloud server 102.
[0043] FIG. 2 shows a global namespace 200 for a remote file system
(RFS) of a particular client 202, which is stored on remote cloud
server 102. Global namespace 200 includes a plurality of subscribed
namespaces, including a subscribed namespace 204 and a subscribed
namespace 206. Subscribed namespace 204 identifies the portions of
the client's RFS (e.g., "Chicago_Domain" folder 210 and its
subtrees) that are synchronized with local cloud server 104, which
is physically located in the Chicago office of client 202.
Accordingly, the file system corresponding to subscribed namespace
204 is stored on remote cloud server 102 as part of the RFS and
will be stored on local cloud server 104 as all or part of its
local file system. Similarly, subscribed namespace 206 includes the
portions of the client's RFS (e.g., "Taipei Domain" folder 212 and
its subtrees) that are synchronized with local cloud server 106,
which is physically located in the Taipei, Taiwan office of client
202.
[0044] Global namespace 200 is exemplary in nature. Indeed, the
subscribed namespace for a particular local cloud server can be
customized as desired. For example, it could be that client 202
desires that local cloud server 104 be subscribed to all folders of
global namespace 200 (e.g., namespaces 204 and 206), whereas local
cloud server 106 only be subscribed to namespace 206. Additionally,
global namespace 200 can include a remote-cloud-only namespace
portion stored only on remote cloud server 102, subscribed
namespaces of other local cloud servers, etc.
[0045] As changes are made to global namespace 200 (e.g., by remote
users 116, extended local users 112 from local cloud 104, etc.),
the synchronized portions of the global namespace 200 (and
associated file system objects) will become out of date with their
respective local cloud servers 104 and 106. Accordingly, it is
desirable to re-synchronize the subscribed namespaces of RFS with
the corresponding local file system (LFS) stored on a local cloud
server, such that the subscribed namespace and its associated data
objects of the RFS and LFS become identical, or at least nearly
identical, as of some particular time. This means that the file
system objects (metadata and data files) stored in the LFS will
also be stored in the RFS, and vice versa.
[0046] In summary, while client 202 might want its users to have
access to the entire global namespace 200 and remote file system
via remote cloud server 102, the client 202 has restricted
synchronization between the RFS and the LFS on local clouds 104 and
106 to their respective subscribed namespaces 204 and 206. Even
though each of subscribed namespaces 204 and 206 represents only a
portion of the global namespace 200 and the RFS, the subscribed
namespaces 204 and 206 themselves can still be extremely large.
[0047] As indicated above, known rescan synchronization methods are
extremely CPU, bandwidth, and memory intensive, especially for file
systems with very large namespaces. The present invention overcomes
these problems by providing a system and method for efficiently
performing snapshot-based synchronizations, particularly FRS and
LRS, even for very large namespaces consisting of hundreds of
millions of objects or more. The invention makes rescan
synchronizations faster, more efficient, and scalable by
identifying the data set of the subscribed namespace 204 that may
have changed (e.g., that changed itself or is associated with a
change) since the last synchronization. The invention then limits
the metadata snapshot that is transferred to local cloud server 104
to this mutated data set so that file system operations can be
generated and applied to bring the mutated portions of the
subscribed namespace 204 back into synchronization. The invention
provides these advantages by (A) effectively partitioning the
global namespace 200 and (B) identifying and efficiently
synchronizing the minimum data set.
[0048] FIG. 3A graphically illustrates a method 300A for
synchronizing subscribed namespace 204 (and associated data files)
between an RFS 302 and an LFS 306 for the first time according to
the present invention. As shown therein, remote cloud server 102
stores an RFS 302, which includes RFS metadata 304 and associated
data files (not shown), whereas local cloud server 104 stores an
LFS 306, which includes LFS metadata 308 and associated data files
(not shown). Time extends down the page in the direction of the
dashed arrows.
[0049] RFS metadata 304 defines the global namespace 200 associated
with client 202. According to the invention, RFS metadata 304 is
partitioned between files and folders, such that folder metadata is
maintained separately (e.g., in a different table) from file
metadata, as will be described in more detail below.
[0050] Remote cloud server 102 also maintains a global revision
identifier 310 that indicates a current revision state of global
namespace 200. In particular, the current state of the global
revision identifier indicates the largest state (value) of the
revision identifier that is associated with any folder within
global namespace 200 at the current time. The state of global
revision identifier 310 is changed in a predetermined way (e.g.,
incremented, etc.) in response to file system actions being made to
RFS 302 by users. In this embodiment, remote cloud server 102
maintains one global revision identifier 310 per client 202 (e.g.,
one global revision identifier per customer), and all local cloud
servers 104 and 106 associated with that client 202 utilize the
same global revision identifier 310 as will be described in more
detail below. Remote cloud server 102 can maintain a separate
global revision identifier for each customer it serves.
[0051] As actions are performed against global namespace 200 by
users accessing remote cloud server 102, RFS metadata 304 will be
modified according to the user's file system actions. Remote cloud
server 102 implements operations to expose these file system
changes to those accessing remote cloud server 102 and global
namespace 200. Remote cloud 102 also keeps track of the changes
made to the global namespace 200 by these file system actions by
including revision data in the folder record of RFS metadata 304
for each folder in global namespace 200. A folder's revision data
then is updated based on the revision identifier when a file system
action related to that folder is made by the user. In other words,
the state of the revision identifier stored in the folder metadata,
relative to the current state of the global revision identifier
310, indicates the last time an action was performed on the
associated folder that may have changed its contents. A list of
file system actions and their effects on the RFS metadata 304 and
revision identifiers will be discussed below with reference to FIG.
9.
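The relationship between the global revision identifier and the per-folder revision data can be sketched as follows. This is a minimal illustration only: the class and method names are hypothetical, the update rule is assumed to be a simple increment, and the sketch stamps only the folder named in the action (which folders are actually affected by each file system action is the subject of FIG. 9).

```python
from dataclasses import dataclass, field


@dataclass
class FolderRecord:
    path: str
    revision_id: int = 0  # plays the role of Revision_ID field 416


@dataclass
class RemoteNamespace:
    """Hypothetical in-memory stand-in for the folder metadata of one client."""
    global_revision: int = 0  # one counter per client, like identifier 310
    folders: dict = field(default_factory=dict)

    def add_folder(self, path: str) -> None:
        # Creating a folder is itself a file system action, so it advances
        # the global revision and stamps the new folder with it.
        self.global_revision += 1
        self.folders[path] = FolderRecord(path, self.global_revision)

    def touch(self, folder_path: str) -> int:
        """Record a file system action that affects folder_path."""
        self.global_revision += 1  # assumed update rule: increment by one
        self.folders[folder_path].revision_id = self.global_revision
        return self.global_revision


rfs = RemoteNamespace()
rfs.add_folder("/Chicago_Domain/")
rfs.add_folder("/Chicago_Domain/Accounting/")
rfs.touch("/Chicago_Domain/Accounting/")  # e.g., a file was added there
```

Under these assumptions the global revision always equals the largest revision stored in any folder record, which is the property described for global revision identifier 310.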
[0052] FIG. 4 shows an exemplary data structure for RFS metadata
304. As shown in FIG. 4, RFS metadata 304 is partitioned into two
tables: a Folders table 402 and a Files table 404. As a result, RFS
metadata 304 is queryable in an efficient manner, because folders
normally constitute a small percentage (commonly 10-12%) of the
global namespace 200 relative to files.
[0053] Each record of Folders table 402 represents a folder in the
global namespace 200 of client 202 and includes a Folder ID field
406, a Canonical Path field 408, a Path field 410, a Parent_ID
field 412, a Lstmtime field 414, and a Revision_ID field 416.
Folder ID field 406 is the key field that uniquely identifies the
particular folder record. Canonical Path field 408 includes a
standardized path name. Path field 410 includes the local display
path. Parent_ID field 412 includes the Folder ID value of the
parent folder of the folder represented by the current record.
Lstmtime field 414 includes data indicative of the last time the
associated folder was modified. Revision_ID field 416 includes data
indicative of a state of the global revision identifier 310 when
the corresponding folder, or a file system object associated with
that folder (e.g., a file object contained within that folder), was
last modified by user action.
[0054] Each record of Files table 404 represents a file in the
global namespace 200 of client 202 and includes a File ID field
418, a Folder ID field 420, a Canonical Name field 422, a Name
field 424, a Total Versions field 426, a Lstmtime field 428, a
Checksum field 430, a Size field 432, and a Version_ID field 434.
File ID field 418 is a key field and uniquely identifies the
particular file record. Folder ID field 420 includes data
identifying a record in Folders table 402 (the folder in which the
file resides). Canonical Name field 422 includes a standardized
file name. Name field 424 includes data indicative of the local
display name. Total Versions field 426 includes data indicative of
how many versions of the file are kept in RFS 302. Lstmtime field
428 includes data indicative of the last time the associated file
was modified. Checksum field 430 includes data indicative of a
checksum (e.g., Sha512, etc.) of the associated file. Size field
432 includes data indicative of the size of the associated data
file stored in mass data storage of RFS 302. Version_ID field 434
includes data indicative of the current version of the associated
file. It should be noted that the tables provided in FIG. 4 are
exemplary in nature. Each folder and file record in the tables
could include additional or fewer fields as needed. Additionally,
additional tables can be included in RFS metadata 304 if desirable.
For example, a table containing version metadata for each version
of a data file stored in RFS 302 could be linked to the file
records in files table 404. These and other embodiments are
possible.
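As one possible illustration of the partitioning shown in FIG. 4, the two tables could be expressed in a relational store. The sketch below uses Python's built-in sqlite3 module; the column types, the index, the lowercase canonical form, and the sample rows are assumptions made for the example and are not taken from the figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE folders (
    folder_id      INTEGER PRIMARY KEY,
    canonical_path TEXT NOT NULL UNIQUE,
    path           TEXT NOT NULL,
    parent_id      INTEGER REFERENCES folders(folder_id),
    lstmtime       INTEGER,
    revision_id    INTEGER NOT NULL
);
CREATE TABLE files (
    file_id        INTEGER PRIMARY KEY,
    folder_id      INTEGER NOT NULL REFERENCES folders(folder_id),
    canonical_name TEXT NOT NULL,
    name           TEXT NOT NULL,
    total_versions INTEGER,
    lstmtime       INTEGER,
    checksum       TEXT,
    size           INTEGER,
    version_id     INTEGER
);
CREATE INDEX files_by_folder ON files(folder_id);
""")

# Sample rows (hypothetical values). The canonical form is assumed to be lowercase.
conn.execute(
    "INSERT INTO folders VALUES (1, '/chicago_domain/', '/Chicago_Domain/', NULL, 0, 7)"
)
conn.execute(
    "INSERT INTO files VALUES (1, 1, 'budget.xlsx', 'Budget.xlsx', 1, 0, 'abc', 1024, 1)"
)

# A folder listing reads only the (much smaller) folders table.
folder_rows = conn.execute(
    "SELECT folder_id, canonical_path, revision_id FROM folders"
).fetchall()

# A file listing for one folder follows the folder_id reference.
file_rows = conn.execute(
    "SELECT name FROM files WHERE folder_id = ?", (1,)
).fetchall()
```

Keeping folder records in their own table means a folder listing never has to scan file records, which is the efficiency benefit attributed to the partitioning.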
[0055] Returning to FIG. 3A, at (1), local cloud server 104
initiates the synchronization with remote cloud server 102 by
providing an initial synchronization request to remote cloud server
102. Here the initial synchronization request comprises one or more
folder listing queries to remote cloud server 102, requesting the
folder metadata of RFS metadata 304 for the entire subscribed
namespace 204. Local cloud server 104 queries remote cloud server
102 using one or more subscribed namespace identifiers 312(1-n)
associated with subscribed namespace 204. Each subscribed namespace
identifier 312(1-n) comprises a path of a top-most folder in the
subscribed namespace 204. Here, because folder 210
("/Chicago_Domain/") is the only top-most folder in subscribed
namespace 204, local cloud server 104 makes a single folder listing
query using a subscribed namespace identifier 312 corresponding to
the path of folder 210. Alternatively, the subscribed namespace
identifier 312 could comprise a unique identifier that remote cloud
server 102 could use to look up the subscribed namespace 204
associated with local cloud server 104.
[0056] Stated another way, subscribed namespace identifiers
312(1-n) represent a requested "namespace view" 315 of subscribed
namespace 204 that is currently of interest to local cloud server
104. Because this is the initial synchronization of subscribed
namespace 204 between local cloud server 104 and remote cloud
server 102, the requested namespace view 315 corresponds to the
entire subscribed namespace 204 stored in RFS 302.
[0057] Remote cloud server 102 receives the folder listing query,
including the subscribed namespace identifier 312 (corresponding to
folder 210) and then retrieves folder metadata 316 for each of the
folders in subscribed namespace 204. Because, in this embodiment,
remote cloud server 102 provides folder metadata 316 for the folder
specified in the subscribed namespace identifier 312 and any
sub-folders of that specified folder, a subscribed namespace
identifier 312 that identifies top-most folder 210 will cause
remote cloud 102 to return folder metadata for every sub-folder in
subscribed namespace 204.
[0058] Accordingly, at (2), remote cloud server 102 provides folder
metadata 316 for each folder in subscribed namespace 204 to local
cloud server 104. Remote cloud server 102 also provides the current
state of global revision identifier 310 to local cloud server 104,
before or after the folder metadata 316. Local cloud server 104
then receives and saves the folder metadata 316 and the current
state of global revision identifier 310 in LFS 306. The global
revision identifier 310 denotes the current state (version) of the
folder portion of global namespace 200 at this point in time on
remote cloud server 102.
[0059] Thereafter, at (3), local cloud server 104 makes one or more
file listing requests (queries) to remote cloud server 102 for file
metadata 318 associated with each folder defined by the returned
folder metadata 316. In this embodiment, local cloud server 104
queries remote cloud server 102 for file metadata on a
folder-by-folder basis, according to the paths contained in the
returned folder metadata 316.
[0060] Remote cloud server 102 receives the folder-specific file
listing queries, and then searches RFS metadata 304 for file
metadata 318 for the requested folders. Then at (4), remote cloud
server 102 provides the file metadata 318 for each specified folder
to local cloud server 104. In the present embodiment, local cloud
server 104 makes a separate query for file metadata for each folder
identified by folder metadata 316. This enables local cloud server
104 to better control the synchronization process, e.g., for
bandwidth and resource reasons. However, other file listing queries
are possible (e.g. requesting folders as a batch).
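Steps (1) through (4) of FIG. 3A can be sketched as a two-phase listing: one folder listing query per top-most folder, followed by one file listing query per returned folder. The data, function names, and the use of a path-prefix match to find sub-folders are assumptions made for illustration; they are not the query interface of remote cloud server 102.

```python
# Hypothetical RFS metadata: folder path -> Revision_ID, folder path -> file names.
RFS_FOLDERS = {
    "/Chicago_Domain/": 4,
    "/Chicago_Domain/Accounting/": 9,
    "/Taipei Domain/": 6,
}
RFS_FILES = {
    "/Chicago_Domain/": ["readme.txt"],
    "/Chicago_Domain/Accounting/": ["q1.xlsx", "q2.xlsx"],
    "/Taipei Domain/": ["plan.doc"],
}
GLOBAL_REVISION = 9


def folder_listing(top_path):
    """Steps (1)-(2): the folder at top_path and all of its sub-folders,
    together with the current state of the global revision identifier."""
    folders = {p: r for p, r in RFS_FOLDERS.items() if p.startswith(top_path)}
    return folders, GLOBAL_REVISION


def file_listing(folder_path):
    """Steps (3)-(4): file metadata for a single folder."""
    return list(RFS_FILES.get(folder_path, []))


def initial_snapshot(namespace_ids):
    """Build an RFS metadata snapshot for the requested namespace view."""
    folders, revision = {}, None
    for top in namespace_ids:
        listed, revision = folder_listing(top)
        folders.update(listed)
    # One file listing query per folder, as in the present embodiment.
    files = {path: file_listing(path) for path in folders}
    return folders, files, revision


folders, files, revision = initial_snapshot(["/Chicago_Domain/"])
```

Issuing the file listing queries one folder at a time lets the caller pace or batch them as it sees fit, which corresponds to the control over bandwidth and resources noted above.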
[0061] The returned folder metadata 316 and file metadata 318
together represent a RFS metadata snapshot 320 of the subscribed
namespace 204 stored on RFS 302 at a time represented by the state
of global revision identifier 310 provided from remote cloud server
102. At (5), local cloud server 104 compares (e.g., on a
path-by-path basis) the RFS metadata snapshot 320 with an LFS
metadata snapshot 322 taken of LFS metadata 308 associated with the
subscribed namespace 204. Based on this comparison, local cloud
server 104 generates, at (6), RFS operations 324 and LFS operations
326 and communicates the RFS operations 324 to remote cloud server
102 where they are applied to RFS 302. Local cloud server 104 also
applies LFS operations 326 to LFS 306. After the file system
operations 324 and 326 are applied, the folders and files (both
metadata and file data) of subscribed namespace 204 will be
synchronized in both RFS 302 and LFS 306 at the time associated
with global revision identifier 310.
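A path-by-path comparison of the two snapshots might look like the following sketch. The rules used here are deliberately simplified assumptions: an object present on only one side is created on the other, and an object present on both sides with different checksums is merely flagged. The actual operation-generation rules are not specified in this passage.

```python
def compare_snapshots(rfs_snapshot, lfs_snapshot):
    """Compare two {path: checksum} snapshots.

    Returns (rfs_ops, lfs_ops, conflicts), where rfs_ops are applied to the
    RFS, lfs_ops are applied to the LFS, and conflicts need further handling.
    """
    rfs_ops, lfs_ops, conflicts = [], [], []
    for path in sorted(set(rfs_snapshot) | set(lfs_snapshot)):
        in_rfs = path in rfs_snapshot
        in_lfs = path in lfs_snapshot
        if in_rfs and not in_lfs:
            lfs_ops.append(("create", path))   # pull the object to the LFS
        elif in_lfs and not in_rfs:
            rfs_ops.append(("create", path))   # push the object to the RFS
        elif rfs_snapshot[path] != lfs_snapshot[path]:
            conflicts.append(path)             # same path, different content
    return rfs_ops, lfs_ops, conflicts


rfs_snap = {"/a/x.txt": "c1", "/a/y.txt": "c2", "/a/z.txt": "c3"}
lfs_snap = {"/a/y.txt": "c2", "/a/z.txt": "zz", "/a/w.txt": "c4"}
rfs_ops, lfs_ops, conflicts = compare_snapshots(rfs_snap, lfs_snap)
```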
[0062] As file system objects of subscribed namespace 204 are
synchronized, local cloud server 104 performs several other
actions. First, as those folders on LFS 306 are synchronized, local
cloud server 104 associates the state of global revision identifier
310 provided by remote cloud server 102 with each folder record in
LFS metadata 308 associated with the namespace view 315 defined by
subscribed namespace identifier(s) 312(1-n). Here, assuming the
entire subscribed namespace 204 is successfully synchronized, the
state of global revision identifier 310 will be stored as a prior
revision identifier in each folder record of LFS 306 that is part
of subscribed namespace 204. This way, for a specific folder record
in LFS metadata 308 that is part of subscribed namespace 204, local
cloud server 104 knows that the specific folder record and its
contents in LFS 306 match the state of the corresponding folder
record and its contents in RFS 302 as of the time associated with
the state of the global revision identifier 310 provided at (2) by
remote cloud server 102. Additionally, it should be noted that each
particular folder record in LFS metadata 308 is updated with the
state of global revision identifier 310 after that folder and its
contents have been successfully synchronized. Otherwise, if the
prior revision identifier for that folder record in LFS metadata
308 were updated before the synchronization completed and the
synchronization process later failed or was interrupted, the
synchronization state of that folder would be incomplete.
[0063] Second, as part of the initial synchronization shown in FIG.
3A, local cloud server 104 initializes and populates LVS database
328 as file system objects are successfully synchronized as
discussed above. LVS database 328 stores extended attributes about
every file system object (e.g., files and folders) that has been
successfully synchronized between RFS 302 and LFS 306. For
subsequent rescan synchronizations, LVS database 328 is used to
help determine which file system objects are already in sync and
prevent them from being synchronized again.
[0064] FIG. 5 shows an exemplary data structure for storing LFS
metadata 308. Like RFS metadata 304, LFS metadata 308 is
partitioned into two tables: a Folders table 502 and a Files table
504. As a result, LFS metadata 308 is also queryable in an
efficient manner.
[0065] Each record of Folders table 502 represents a folder of LFS
306 that is associated with subscribed namespace 204. Folders table
502 might also include folder records for folders that are private
and not synchronized. Each record in Folders table 502 includes a
Folder ID field 506, a Canonical Path field 508, a Path field 510,
a Parent_ID field 512, an RFS_Folder_ID field 514, a Lstmtime field
516, a status field 518, a synctime field 520, a Version_ID field
522, and a Prior_Revision_ID field 524. Folder ID field 506 is the
key field that uniquely identifies the particular folder record.
Canonical Path field 508 includes a standardized path name. Path
field 510 includes the local display path. Parent_ID field 512
includes the Folder ID value of the parent folder of the folder
represented by the current record. RFS_Folder_ID field 514 includes
the Folder_ID of the corresponding folder of RFS metadata 304.
Lstmtime field 516 includes data indicative of the last time the
associated folder was modified. Status field 518 includes
information indicative of the current synchronization status (e.g.,
synced, skipped, pending, etc.) of the folder represented by the
current record. Synctime field 520 includes data indicative of the
last time the current folder was successfully synchronized.
Version_ID field 522 includes data indicative of the current
version of the associated folder. Prior_Revision_ID field 524
includes data indicative of the state of the global revision
identifier 310 that was provided to local cloud server 104 the last
time the associated folder was successfully synchronized.
[0066] Each record of Files table 504 represents a file stored in
LFS 306 that is associated with subscribed namespace 204. Each
record in Files table 504 includes a File ID field 526, a Folder ID
field 528, a Canonical Name field 530, a Name field 532, an
RFS_File_ID field 534, a Lstmtime field 536, a Status field 538, a
Checksum field 540, a Synctime field 542, and a Version_ID field
544. File ID field 526 is the key field that uniquely identifies
the particular record. Folder ID field 528 includes data
identifying a record in Folders table 502 (the folder in which the
file resides). Canonical Name field 530 includes a standardized
file name. Name field 532 includes data indicative of the local
display name. RFS_File_ID field 534 includes the File ID of the
corresponding file stored in RFS 302. Lstmtime field 536 includes
data indicative of the last time the associated file was modified.
Status field 538 includes information indicative of the current
synchronization status (e.g., synced, skipped, pending, etc.) of the
file represented by the current record. Checksum field 540 includes
data indicative of a checksum (e.g., Sha512, etc.) of the
associated file. Synctime field 542 includes data indicative of the
last time the associated file was successfully synchronized.
Version_ID field 544 includes data indicative of the current
version of the associated file.
[0067] The tables provided in FIG. 5 are again exemplary in nature.
Each folder and file record in the tables could include additional
or fewer fields as needed. Additional tables can be included in LFS
metadata 308, for example, as described above.
[0068] FIG. 6 shows an exemplary data structure for storing data in
LVS database 328. LVS database 328 contains the extended attributes
related to file synchronization for each file and folder that is
successfully synchronized in subscribed namespace 204. Thus, each
synchronized object in subscribed namespace 204 will have an entry
in LVS database 328. Local cloud server 104 updates LVS database
328 after each successful synchronization by creating records,
deleting records, modifying records, and/or populating records with
the particular attribute information. LVS database 328 includes a
folders table 602 and files table 604 and is, therefore, searchable
in an efficient manner. The fields in folders table 602 and files
table 604 are similar to those described above in FIGS. 4 and 5 and
will, therefore, not be discussed in detail. As above, the tables
and fields provided in FIG. 6 are exemplary in nature. Each folder
and file record in tables 602 and 604 could include additional
fields (e.g., fields for unique file system identifiers, etc.) or
fewer fields as desired. Additional tables could also be
incorporated if desirable (e.g., multiple folders and files tables
specific to RFS 302 and LFS 306, respectively; tables for one-way
vs. two-way synchronizations; etc.).
[0069] Returning to FIG. 3B, following the initial synchronization
between remote cloud server 102 and local cloud server 104
described in FIG. 3A, users accessing the subscribed namespace 204
on remote cloud server 102 will modify file system objects therein.
Accordingly, the Revision_ID fields 416 of the folder record(s) 402
in RFS metadata 304 associated with those modifications will be
updated based on a state of the global revision identifier 310.
Associating revision identifier information with the folder records
402 enables the local cloud server 104 (or in alternative
embodiments the remote cloud server 102) to quickly identify the
data set of RFS 302 that may have changed with respect to a prior
revision state of LFS 306 for a rescan synchronization (FRS or
LRS).
[0070] FIG. 3B graphically illustrates a method 300B for
resynchronizing all or a portion of subscribed namespace 204 (and
the associated data files) between RFS 302 and LFS 306. Method 300B
is performed in response to a request for an FRS or LRS from local
cloud server 104 or remote cloud server 102. In response to a call
for an FRS or LRS, local cloud server 104 can use method 300B to
resynchronize the entire subscribed namespace 204 in the case of an
FRS or only portions of interest of subscribed namespace 204 in the
case of an LRS. Regardless of whether an FRS or LRS is desired, the
invention applies equally to either and provides the advantage that
it limits the data set that needs to be transferred to local cloud
server 104 from remote cloud server 102 to the portions of the
requested namespace view that may have changed (mutated) since they
were last synchronized. File system objects that remote cloud
server 102 knows have not changed can, therefore, be excluded from
the data set transferred to local cloud server 104. Limiting this
dataset, therefore, also optimizes the comparison process of local
and remote file systems during the rescan synchronization.
[0071] As before, remote cloud server 102 implements folder listing
queries to look up portions of global namespace 200 by specific
folder paths, which enables local cloud server 104 to search for
and download information about the folder subtrees and associated
data files of subscribed namespace 204 that are of interest to it
for the particular rescan synchronization. The queries also enable
local cloud server 104 to download content for only those portions
of subscribed namespace 204 that may have changed in some way in
RFS 302 since they were last synchronized. Because this mutated data
set is usually much smaller than the data set for the entire
portion of interest of subscribed namespace 204, the invention
greatly reduces the amount of data that needs to be transferred
between remote cloud server 102 and local cloud server 104 during a
rescan synchronization. This, in turn, reduces the synchronization
bandwidth requirements between remote cloud server 102 and local
cloud server 104, speeds up the comparison process, because local
cloud server 104 only processes a small portion of the RFS metadata
304, and enables remote cloud server 102 to serve many more clients
with the same server-side resource capacity (e.g., processor
capacity, memory capacity, bandwidth, etc.).
[0072] As shown at (1) in FIG. 3B, when local cloud server 104
initializes a rescan synchronization process, it queries remote
cloud server 102 using one or more subscribed namespace identifiers
312(1-n) and the prior revision identifiers 314(1-n) associated
with those subscribed namespace identifiers 312(1-n). In the
present embodiment, each subscribed namespace identifier 312(1-n)
identifies a folder (e.g., by canonical folder path from Canonical
Path field 508 of an associated record 502, by RFS_Folder_ID 514,
etc.) in subscribed namespace 204. The associated prior revision
identifiers 314(1-n), therefore, include the information stored in
the Prior_Revision_ID fields 524 of the same folder records 502 of
the requested namespace, respectively. Each prior revision
identifier 314(1-n) corresponds to the state of a global revision
identifier 310 communicated from remote cloud server 102 the last
time the associated folder was successfully synchronized.
[0073] As an aside, the folder listing queries implemented at (1)
by local cloud server 104 in methods 300A and 300B can be the same.
For example, while providing prior revision identifiers 314(1-n) at
(1) in method 300B can have useful advantages and enable
alternative methods that will be discussed below, providing these
prior revision identifiers 314(1-n) is optional.
[0074] The folders represented by subscribed namespace identifiers
312(1-n) represent a namespace view 315 requested by local cloud
server 104 for rescan synchronization. This namespace view 315
could comprise one identifier 312 for top-most folder 210 in
subscribed namespace 204 in the case of an FRS, or it could
comprise multiple identifiers 312(1-n) each identifying a folder of
subscribed namespace 204 for rescan synchronization in the case of
an LRS.
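The difference between an FRS request and an LRS request can be illustrated by how the query is assembled from the LFS folder records. In this sketch the dictionary layout, the "/Chicago_Domain/Sales/" folder, and the function name are hypothetical; each request entry pairs a subscribed namespace identifier (here, a folder path) with the prior revision identifier stored for that folder.

```python
# Hypothetical LFS folder records keyed by canonical path.
LFS_FOLDERS = {
    "/Chicago_Domain/": {"prior_revision_id": 9},
    "/Chicago_Domain/Accounting/": {"prior_revision_id": 9},
    "/Chicago_Domain/Sales/": {"prior_revision_id": 7},
}


def build_rescan_request(view_paths):
    """Pair each folder in the requested namespace view with its
    Prior_Revision_ID from the corresponding LFS folder record."""
    return [(path, LFS_FOLDERS[path]["prior_revision_id"]) for path in view_paths]


# FRS: a single identifier for the top-most folder covers the whole namespace.
frs_request = build_rescan_request(["/Chicago_Domain/"])

# LRS: several identifiers, one per folder of interest.
lrs_request = build_rescan_request(
    ["/Chicago_Domain/Accounting/", "/Chicago_Domain/Sales/"]
)
```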
[0075] Remote cloud server 102 receives the subscribed namespace
identifiers 312(1-n) and associated prior revision identifiers
314(1-n) and, at (2), searches for folder records in table 402 of
RFS metadata 304 for the requested namespace view 315 to identify
folder records 402 of folders contained therein. The folder records
402 that are searched by remote cloud server 102 include the folder
identified by each subscribed namespace identifier 312 and any of
its sub-folders. This feature advantageously limits the number of
folder listing queries that must be made.
[0076] Remote cloud server 102 then supplies the identified folder
metadata 316, along with a current state of global revision
identifier 310, to local cloud server 104. The folder metadata 316
provided by remote cloud server 102 represents each folder that is
currently part of the requested namespace view 315 in RFS 302. As
part of this folder metadata 316, remote cloud server 102 returns
the revision identifier stored in Revision_ID field 416 for each
identified folder. The local cloud server 104 saves the current
state of global revision identifier 310 and the folder metadata 316
in LFS 306.
[0077] Then, at (3), local cloud server 104 utilizes file listing
queries to request file metadata 318 for at least some of the
folders identified by folder metadata 316 returned by remote cloud
server 102. For each folder identified in the folder metadata 316,
local cloud server 104 compares the provided revision identifier of
that folder on RFS with the prior revision identifier stored in
Prior_Revision_ID field 524 of the corresponding folder record 502
in LFS metadata 308. If the Revision_ID 416 for an RFS folder is
greater than the Prior_Revision_ID 524 for the LFS folder, local
cloud server 104 knows that the folder on RFS has changed. As such,
local cloud server 104 generates a file listing query for that folder
and provides it to remote cloud server 102. If, however, the
Revision_ID 416 for the RFS folder is the same as the
Prior_Revision_ID 524 for the LFS folder, then local cloud server
104 knows that folder on RFS has not changed. As such, local cloud
server 104 does not need to generate a file listing query for that
folder, because the folder and its contents have not changed on RFS
302 and, therefore, the file metadata for that folder does not need
to be transferred to local cloud server 104.
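The folder-by-folder decision described above reduces to a comparison of two integers per folder. The sketch below assumes revision identifiers are monotonically increasing integers, treats any Revision_ID that is not greater than the stored Prior_Revision_ID as unchanged, and assumes that a folder with no LFS record yet must always be listed; the function and variable names are hypothetical.

```python
def folders_needing_file_listing(rfs_folder_revisions, lfs_prior_revisions):
    """Return the folder paths whose file metadata must be requested.

    rfs_folder_revisions: {path: Revision_ID} from the returned folder metadata
    lfs_prior_revisions:  {path: Prior_Revision_ID} from the LFS folder records
    """
    changed = []
    for path, revision_id in rfs_folder_revisions.items():
        prior = lfs_prior_revisions.get(path)
        # New on the RFS (no LFS record) or modified since the last sync.
        if prior is None or revision_id > prior:
            changed.append(path)
    return changed


rfs_revisions = {
    "/Chicago_Domain/": 4,               # untouched since before the last sync
    "/Chicago_Domain/Accounting/": 12,   # modified after the last sync
    "/Chicago_Domain/New/": 11,          # created after the last sync
}
lfs_priors = {
    "/Chicago_Domain/": 9,
    "/Chicago_Domain/Accounting/": 9,
}
to_list = folders_needing_file_listing(rfs_revisions, lfs_priors)
```

Only the two folders in `to_list` would generate file listing queries; file metadata for "/Chicago_Domain/" is never transferred.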
[0078] Local cloud server 104 performs the above process on a
folder-by-folder basis for all paths contained in the returned
folder metadata 316. Remote cloud server 102 receives each
folder-specific file listing query, identifies its associated
folder record 402, and then searches the records in files table 404
for file metadata 318 for files contained in that folder. Then at
(4), remote cloud server 102 provides the file metadata 318 for the
specified folder to local cloud server 104.
[0079] The folder metadata 316 and file metadata 318 represent a
limited RFS metadata snapshot 330 associated with RFS 302 that
excludes metadata for file system objects in the requested
namespace view 315 that have not changed since the last time the
namespace view 315 was synchronized. Local cloud server 104 then,
at (5), uses the limited RFS snapshot 330, a current LFS metadata
snapshot 322, and the information in LVS database 328 to complete
the FRS or LRS file system comparisons and to generate file system
operations 324 and 326. Local cloud server 104 then, at (6),
provides RFS operations 324 to remote cloud server 102 for
application to RFS 302 there, and applies LFS operations
326 to LFS 306. Application of RFS operations 324 and LFS
operations 326 to RFS 302 and LFS 306, respectively, brings the
portions (metadata and file data) of RFS 302 and LFS 306 associated
with the requested namespace view 315 back into
synchronization.
[0080] As portions of the subscribed namespace 204 are successfully
synchronized between RFS 302 and LFS 306, local cloud server 104
again updates LVS database 328 as appropriate. Local cloud server
104 also updates the folder records 502 of LFS metadata 308 that
were successfully synchronized. Assuming the FRS or LRS is
successful, each folder record 502 associated with one of the
folders that was fully resynchronized after a file listing query
was made for it will have its Prior_Revision_ID field 524 updated
to reflect the state of the global revision identifier 310 provided
by remote cloud server 102. If there is an error with the FRS or
LRS on some folder or associated file system object (e.g., a file
in the folder), then the Prior_Revision_ID field 524 will not be
updated in LFS metadata 308, to ensure that portion of the
subscribed namespace 204 will be synchronized again in a later FRS
or LRS.
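The update rule for the Prior_Revision_ID field can be sketched as follows. The function name and the shape of the per-folder result map are assumptions; the point illustrated is that only folders that synchronized without error are stamped with the new global revision state, so failed folders retain their older value and are picked up again by a later rescan.

```python
def record_sync_results(lfs_prior_revisions, results, global_revision):
    """Update Prior_Revision_ID values after a rescan synchronization.

    results: {folder_path: True if the folder and its contents were
              synchronized without error, False otherwise}
    """
    for path, succeeded in results.items():
        if succeeded:
            lfs_prior_revisions[path] = global_revision
        # On failure the old value is kept, so the folder's Revision_ID on
        # the RFS will still compare as greater during the next rescan.
    return lfs_prior_revisions


priors = {"/Chicago_Domain/Accounting/": 9, "/Chicago_Domain/Sales/": 9}
record_sync_results(
    priors,
    {"/Chicago_Domain/Accounting/": True, "/Chicago_Domain/Sales/": False},
    global_revision=15,
)
```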
[0081] Local cloud server 104 controls and coordinates the
synchronization processes between local cloud server 104 and remote
cloud server 102, and it can use the synchronization methods
discussed above in a variety of ways. For example, the invention
can be used to initially synchronize the subscribed namespace 204
between RFS 302 and LFS 306. After the initial synchronization, the
invention facilitates re-synchronization of all or portions of the
subscribed namespace, for example, at regular synchronization
intervals. For example, the invention enables an FRS of the full
subscribed namespace 204 to be repeatedly performed at regular,
short intervals (e.g., every 5 minutes), because the invention can
quickly identify and sync only the mutated data set. This mutated
data set is typically very small for a short synchronization
interval. Accordingly, the full subscribed namespace 204 can be
regularly and repeatedly re-synchronized with RFS 302. Such
repeated rescan synchronizations are not known to be possible in
the prior art. As another example, an FRS or LRS can be quickly
completed at the special request of local cloud server 104 or
remote cloud server 102. This might occur when an FRS or LRS is
triggered as part of another synchronization process (e.g., a
steady state sync process) to resolve conflicts or due to
uncertainties in the state of all or a portion of the subscribed
namespace 204. Thus, rescan synchronizations are performed very
quickly and efficiently, because metadata for the contents of
folders that have not changed does not need to be provided to or
evaluated again by local cloud server 104.
[0082] As still another example, the present invention can be used
to ready a portion of subscribed namespace 204 for particular
users. Typically, different users (e.g., local users 112, remote
users 116, etc.) will be assigned permissions that allow them to
access different parts of RFS 302 and LFS 306. For example, a local
user 112 might only have permission to access the folder " . . .
\Accounting\" 214 and its contents in subscribed namespace 204. The
present invention, therefore, could be used to quickly
resynchronize the "\Accounting\" folder on LFS 306 when that local
user 112 logs in, to provide the local user 112 with an up-to-date piece
of subscribed namespace 204 that is specific to that user. Accordingly,
the present invention can be used to "prime" subscribed namespace
204 for users based on those users' permissions. As yet another
example, the present invention can be used to quickly and
immediately re-synchronize the contents of a folder in subscribed
namespace 204 at the explicit request of a user or system
administrator. These and other advantages of the invention will be
apparent from this disclosure.
[0083] As mentioned above, the prior revision identifiers 314(1-n)
provided to remote cloud server 102 can provide important
advantages. For example, remote cloud server 102 could receive the
requested namespace view 315 and discover, based on a comparison of
a prior revision identifier 314 and revision data (Revision_ID 416)
for each folder in the requested namespace view 315, that none of
the folders in the requested namespace view 315 have changed since
their last synchronization. In such a case, remote cloud server 102
could notify local cloud server 104 that the requested namespace
view 315 is already synchronized and then end the synchronization
process without sending any, or after sending only very limited, folder metadata
316.
[0084] As yet another example, remote cloud server 102 could use
the prior revision identifier(s) 314(1-n) of the requested
namespace view 315 and a current state of the global revision
identifier 310 to define one or more ranges of interest. In such a
case, remote cloud server 102 could return folder metadata 316 only
for a folder that corresponds to, or is a sub-folder of, a
subscribed namespace identifier 312(1-n) and that has revision data
(e.g., a revision state in Revision_ID field 416) that is greater
than the prior revision identifier 314(1-n) provided for that
subscribed namespace identifier 312(1-n) and less than the current
state of global revision identifier 310. This process would provide
an advantage, because the amount of folder metadata 316 provided to
local cloud server 104 would be limited to particular folder
records of the requested namespace view 315 that remote cloud
server 102 knows have changed relative to the provided prior
revision identifier(s) 314(1-n). Furthermore, in such an
embodiment, local cloud server 104 would automatically know that it
will need to obtain file metadata 318 for each folder identified in
folder metadata 316, which would save the local cloud server 104
from having to perform the revision data comparison process
described above. Accordingly, remote cloud server 102 could further
provide the file metadata 318 automatically for any folder
identified as having changed in folder metadata 316.
[0085] FIG. 7 is a block diagram of remote cloud server 102. Remote
cloud server 102 includes a wide-area network adapter 702, one or
more processing units 704, working memory 706, one or more user
interface devices 708, a local network adapter 710, a remote cloud
services component 712, and non-volatile memory 714, all
intercommunicating via an internal bus 716. Processing unit(s) 704
impart functionality to remote cloud server 102 by executing code
stored in any or all of non-volatile memory 714, working memory
706, and remote cloud services 712. Remote cloud services 712
represents hardware, software, firmware, or some combination
thereof, that provides the synchronization functionality described
herein.
[0086] Wide-area network adapter 702 provides a means for remote
cloud server 102 to communicate with remote users 116, local cloud
server 104, and local cloud server 106 via Internet 108. Local
network adapter 710 provides a means for accessing a plurality of
data storage devices 722(1-n), via a private network 720. Clients'
files are stored in and retrieved from data storage devices
722(1-n) as needed. Additional data storage devices 722(n+) can be
added as needed to provide additional storage capacity. In this
example embodiment, data storage devices 722(1-n) are network
attached storage (NAS) devices, but any suitable type of storage
device can be used.
[0087] Cloud-based object storage infrastructures are further
described in U.S. Publication No. 2014/0149794 A1, dated May 29,
2014 by Shetty et al. and entitled "System And Method Of
Implementing An Object Storage Infrastructure For Cloud-Based
Services", which is incorporated herein by reference in its
entirety.
[0088] FIG. 8 is a relational diagram showing the functional
aspects of remote cloud server 102 in greater detail. In the
illustrated embodiment, the functional aspects are provided by
remote cloud services 712 but could be distributed across other
service modules or even other machines.
[0089] Remote user 116 is a device and/or process used to access
the global namespace 200 of RFS 302 via an RFS handler 802. Remote
user 116 can connect with RFS handler 802 either via the Internet
108 or via connection 118 (FIG. 1). RFS handler 802 represents an
interface/protocol by which remote user 116 can access and modify
RFS 302. For example, RFS handler 802 can implement HTTP, WebUI,
WebDAV, RESTful application program interfaces (APIs) and/or FTP,
an interface compatible with a mobile application (e.g., an
application running on a smartphone, tablet, etc.), etc. Responsive
to remote user 116, RFS handler 802 calls remote virtual file
system (VFS) module 804.
[0090] Remote VFS module 804 provides remote user 116 with file and
folder access to RFS 302. Remote VFS module 804 intercepts the file
system calls coming from remote user 116 via RFS handler 802 and
calls permissions enforcement module 806. Permissions enforcement
module 806 enforces cloud permissions on the requested file system
access utilizing permissions information stored in permissions
database 808. Permissions enforcement module 806 and permissions
database 808 are discussed only generally herein. However,
permission management frameworks for cloud servers are described in
detail in U.S. Publication No. 2014/0149461 A1, dated May 29, 2014
by Wijayaratne et al. and entitled "Flexible Permission Management
Framework For Cloud Attached File Systems", which is incorporated
herein by reference in its entirety.
[0091] In accordance with the user's permissions, remote VFS module
804 accesses RFS metadata 304 of RFS 302 to provide remote user 116
with a hierarchical virtual file system (e.g., a directory tree
view of folders and files) of global namespace 200 either via the
Internet 108 or via connection 118 (FIG. 1). The remote user 116
can then interact with the virtual file system and make changes to
file system objects. When a data file needs to be uploaded to,
downloaded from, or deleted from client data storage devices
722(1-n), remote VFS module 804 utilizes RFS object I/O module 810
to facilitate the data file transfer to or from client data storage
722(1-n).
[0092] RFS object I/O module 810 manages the I/O subsystem for
organized data file storage and access on data storage devices
722(1-n). Responsive to VFS module 804 and metadata 304, RFS object
I/O module 810 downloads associated data files from, uploads
associated data files to, and deletes associated data files from
data storage devices 722(1-n). I/O module 810 also provides and
receives data files to and from VFS module 804, which in turn,
provides data files to, and retrieves data files from, remote user
116 as needed via RFS handler 802.
[0093] RFS 302 includes both the RFS metadata 304 and the data
files stored on data storage devices 722(1-n). Metadata 304 is
stored as described in FIG. 4 and describes a hierarchical, virtual
file system that remote user 116 can access. Data storage devices
722(1-n) store data files that are associated with the virtual file
system objects defined by the RFS metadata 304. RFS metadata 304
can also be stored in data storage devices 722(1-n) or non-volatile
memory 714.
[0094] Remote cloud server 102 also defines and maintains global
revision identifier 310 that changes in a predetermined way as
alterations are made to global namespace 200. In the present
embodiment, global revision identifier 310 is a long variable, but
could be in other formats such as a date and time stamp, etc. In
this embodiment, for each change remote VFS module 804 makes to
global namespace 200 in accordance with user actions, remote VFS
module 804 increments the current state of global revision
identifier 310 to a new state and then updates the Revision_ID
field 416 in the appropriate folder record(s) 402 of RFS metadata
304 based on the new state of the global revision identifier 310,
as will be explained in further detail below.
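As a rough sketch of the increment-then-stamp behavior described above, the following models global revision identifier 310 as an integer counter. The `RevisionCounter` class, the `record_change` function, and the dictionary of folder revisions are hypothetical stand-ins introduced for illustration; the disclosure does not name them.

```python
from typing import Dict, Iterable


class RevisionCounter:
    """Hypothetical model of global revision identifier 310 as an integer."""

    def __init__(self, start: int = 0) -> None:
        self.state = start

    def advance(self) -> int:
        """Move to a new state and return it."""
        self.state += 1
        return self.state


def record_change(counter: RevisionCounter,
                  folder_revisions: Dict[str, int],
                  affected_folders: Iterable[str]) -> int:
    """Increment first, then write the new state into each affected folder."""
    new_state = counter.advance()
    for path in affected_folders:
        # Overwrites whatever older revision state the folder carried.
        folder_revisions[path] = new_state
    return new_state


# Example: one change lands in a single folder (assumed values).
counter = RevisionCounter(start=99)
folder_revisions = {"/Chicago_Domain/Accounting": 42}
new_state = record_change(counter, folder_revisions, ["/Chicago_Domain/Accounting"])
```

A monotonically increasing integer is only one option; the paragraph above notes that other formats, such as a date and time stamp, could serve the same purpose as long as later states compare greater than earlier ones.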
[0095] Remote cloud server 102 also includes a remote synchronizer
812, which communicates with remote VFS module 804 and with local
cloud server 104 (or local cloud server 106) via a local cloud
interface 814. Local cloud interface 814 is a means (e.g., network
protocol(s), a set of APIs, etc.) by which remote cloud server 102
can intercommunicate with local cloud server 104 as needed. In a
particular embodiment, local cloud interface 814 maintains an open
(always on) connection with local cloud server 104 for efficient
synchronization.
[0096] Remote synchronizer 812 controls and coordinates the various
aspects of the synchronization process that remote cloud server 102
is responsible for between remote cloud server 102 and local cloud
server 104. In particular, remote synchronizer 812 can receive
commands from local cloud server 104 to initiate synchronization,
for example, responsive to receiving one or more folder listing
queries, including subscribed namespace identifiers 312(1-n)
defining a requested namespace view 315, from the local cloud
server 104. Remote synchronizer 812 is also operative to receive
and store any prior revision identifier(s) 314(1-n), and can
provide the subscribed namespace identifier(s) 312(1-n) and prior
revision identifier(s) 314(1-n) to remote VFS module 804 for
further action.
[0097] Remote synchronizer 812 utilizes remote VFS module 804 to
search the folder records 402 of RFS metadata 304 based on the
subscribed namespace identifier(s) 312(1-n) and prior revision
identifiers 314(1-n) provided by local cloud server 104. Remote VFS
module 804 carries out the requested search of the RFS metadata 304
and returns folder metadata 316 for each folder of the requested
namespace view 315. Remote synchronizer 812 provides the folder
metadata 316 returned by remote VFS module 804 to local cloud
server 104 along with the current state of global revision
identifier 310. In the alternative embodiment discussed above where
remote cloud server 102 makes a comparison of revision data, remote
VFS module 804 can be configured to return and provide folder
metadata 316 only for folders that have revision data (Revision_ID
field 416) that is greater than an associated prior revision
identifier 314(1-n).
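The folder search described above might look like the following sketch, which assumes the folder records 402 are available as an in-memory list and that canonical paths use "/" as a separator. `FolderRecord`, `is_in_view`, and `list_folders` are hypothetical names. Passing a prior revision identifier enables the alternative embodiment in which only changed folders are returned.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class FolderRecord:
    """Hypothetical subset of the fields of a folder record 402."""
    folder_id: int
    canonical_path: str
    lstmtime: int
    revision_id: int


def is_in_view(path: str, namespace_root: str) -> bool:
    """True if path is the subscribed folder itself or one of its sub-folders."""
    root = namespace_root.rstrip("/")
    return path == root or path.startswith(root + "/")


def list_folders(records: Iterable[FolderRecord],
                 namespace_root: str,
                 prior_revision_id: Optional[int] = None) -> List[FolderRecord]:
    """Return folder metadata for the requested view, optionally only changed folders."""
    selected = []
    for record in records:
        if not is_in_view(record.canonical_path, namespace_root):
            continue
        if prior_revision_id is not None and record.revision_id <= prior_revision_id:
            continue  # unchanged since the caller's last synchronization
        selected.append(record)
    return selected


# Example records (assumed values).
records = [
    FolderRecord(1, "/Chicago_Domain", 0, 98),
    FolderRecord(2, "/Chicago_Domain/Accounting", 0, 100),
    FolderRecord(3, "/Chicago_Domain_Archive", 0, 100),
    FolderRecord(4, "/Other", 0, 100),
]
full_view = list_folders(records, "/Chicago_Domain")
changed_only = list_folders(records, "/Chicago_Domain", prior_revision_id=98)
```

Matching on the root followed by a separator, rather than on a bare string prefix, keeps a sibling such as "/Chicago_Domain_Archive" out of the view.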
[0098] If the synchronization is the initial synchronization of the
subscribed namespace 204 (e.g., no prior revision identifier 314
provided for top-most folder 210), then VFS module 804 returns
folder metadata 316 for the entire subscribed namespace 204 to
remote synchronizer 812. Remote synchronizer 812 then provides this
folder metadata 316 for each subscribed folder to local cloud
server 104 along with the current state of global revision
identifier 310.
[0099] Remote synchronizer 812 is also operative to receive file
metadata queries from local cloud server 104 and, in response,
instruct remote VFS module 804 to search RFS metadata database 304
for file metadata 318 associated with each file stored in a folder
requested by local cloud server 104. In response, VFS module 804
performs the requested search of RFS metadata 304, and provides the
returned file metadata 318 to remote synchronizer 812, which in
turn, provides the file metadata 318 to local cloud server 104.
Alternatively, remote synchronizer 812 can automatically cause the
file metadata 318 to be retrieved and transferred to local cloud
server 104 for each folder returned in folder metadata 316.
[0100] Remote synchronizer 812 also functions to receive and apply
RFS operations 324 from local cloud server 104 via local cloud
interface 814. When a stream of RFS operations 324 is received,
remote synchronizer 812 provides the RFS operations 324 to RFS
handler 802 (or alternatively directly to remote VFS module 804) so
that they can be applied to RFS 302. RFS operations 324 can include
any file system operations that are recognized by remote cloud
server 102 and can include, but are not limited to, upload, download,
create, delete, update, etc. For example, RFS operations 324 can
cause a file or folder to be created, deleted, etc. in RFS metadata
304. As another example, RFS operations 324 can cause a data
file to be uploaded to, downloaded from, deleted from, or updated
in client data storage 722(1-n). Other RFS operations 324
(e.g., attribute modifications, etc.) can also be implemented as
desired. In response to RFS operations 324, remote VFS module 804
determines and causes the necessary file system actions to be
applied to RFS metadata 304 and/or client data storage 722(1-n) in
accordance with the RFS operations 324 such that the associated
portions of subscribed namespace 204 are synchronized with local
cloud server 104.
[0101] When remote VFS module 804 receives actions on the global
namespace 200 from a user that modify RFS 302, remote VFS module
804 modifies RFS metadata 304, including updating Revision_ID field
416 in one or more folder records 402, as appropriate based on the
user's actions. FIG. 9 shows a table 900 including a first column
902, identifying the file system actions made by a user on global
namespace 200, and a second column 904 identifying the effects of
such actions on the Revision_ID field 416 of the associated folder
record(s) 402.
[0102] When a user adds a file to RFS 302 (and therefore global
namespace 200), remote VFS module 804 creates a new file record in
Files table 404 and fills the fields 418-434. Additionally, remote
VFS module 804 increments global revision identifier 310 to a new
state and then stores the new state in the Revision_ID field 416 of
the folder record 402 associated with the folder in which the file
was created (see column 904). Storing the new state overwrites
an older revision state stored in Revision_ID field 416.
[0103] When a user adds a folder to RFS 302, remote VFS module 804
creates a new folder record in Folders table 402 and fills the
other metadata fields as appropriate. Additionally, remote VFS
module 804 increments global revision identifier 310 to a new state
and then stores the new state in Revision_ID field 416 of the new
folder record 406.
[0104] When a user copies a file within RFS 302, remote VFS module
804 creates a new file record 418 in Files table 404 for the target
file and fills the appropriate metadata fields 418-434.
Additionally, remote VFS module 804 increments global revision
identifier 310 to a new state and then stores the new state in
Revision_ID field 416 of the folder record containing the target
file.
[0105] When a user copies one or more folder(s) within RFS 302,
remote VFS module 804 creates one or more new folder record(s) in
Folders table 402 for the target folder hierarchy and fills the
other metadata fields in each new folder record. New file record(s)
might also be created in table 404 for any files in the copied
folders. Additionally, remote VFS module 804 increments global
revision identifier 310 to a new state and then stores the new
state in the Revision_ID field 416 of each new folder record 402
created in the target folder hierarchy, including the topmost
target folder record.
[0106] When the metadata of a file changes (e.g., when a user
changes the file name, if the checksum 430 changes, etc.), remote
VFS module 804 modifies the metadata in the associated file record
in Files table 404. Additionally, remote VFS module 804 increments
global revision identifier 310 to a new state and then updates the
Revision_ID field 416 of the folder record 402 of the folder
containing the file with the new state.
[0107] When a user moves a file within RFS 302, remote VFS module
804 modifies the associated file record accordingly (e.g., updates
the folder ID field 420). Additionally, remote VFS module 804
increments global revision identifier 310 to a new state and then
updates the Revision_ID field 416 of the source and target folder
records 402 associated with the file move with the new state.
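A file move touches two folders with a single new revision state. The sketch below illustrates this under simplifying assumptions: files are tracked in a dictionary mapping a file name to its containing folder (standing in for folder ID field 420), and the global revision identifier is kept in a small dictionary. The function name `move_file` and all data values are hypothetical.

```python
from typing import Dict


def move_file(file_folder: Dict[str, str],
              folder_revisions: Dict[str, int],
              state: Dict[str, int],
              file_name: str,
              target_folder: str) -> int:
    """Move a file and stamp both the source and target folders."""
    source_folder = file_folder[file_name]
    file_folder[file_name] = target_folder  # models updating folder ID field 420

    # One increment per action; both folders receive the same new state.
    state["global_revision_id"] += 1
    new_state = state["global_revision_id"]
    folder_revisions[source_folder] = new_state
    folder_revisions[target_folder] = new_state
    return new_state


# Example data (assumed values).
file_folder = {"report.xlsx": "/A"}
folder_revisions = {"/A": 7, "/B": 9}
state = {"global_revision_id": 20}
new_state = move_file(file_folder, folder_revisions, state, "report.xlsx", "/B")
```

Stamping the source folder matters here because its file listing has lost an entry, so the local cloud must re-list it to notice that the file is gone.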
[0108] When a user moves a folder within RFS 302, remote VFS module
804 modifies the metadata (e.g., canonical path 408, parent ID 412,
etc.) in the folder record(s) in Folders table 402 that are
associated with the moved (target) folder hierarchy. Additionally,
remote VFS module 804 increments global revision identifier 310 to
a new state and then updates the Revision_ID field 416 of each
folder record of the target folder hierarchy, including the topmost
target folder, with the new state.
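By contrast, a folder move stamps every folder in the moved hierarchy and nothing else. The following sketch assumes folder records are keyed by canonical path with "/" separators; `move_folder` is a hypothetical name, and only the path and revision data are modeled.

```python
from typing import Dict


def move_folder(folder_revisions: Dict[str, int],
                state: Dict[str, int],
                source: str,
                target: str) -> int:
    """Re-path a folder hierarchy and stamp each moved folder with the new state."""
    state["global_revision_id"] += 1
    new_state = state["global_revision_id"]

    moved = {}
    for path in list(folder_revisions):  # copy keys; the dict is mutated below
        if path == source or path.startswith(source + "/"):
            new_path = target + path[len(source):]
            del folder_revisions[path]
            moved[new_path] = new_state
    folder_revisions.update(moved)
    return new_state


# Example data (assumed values): move "/A/X" and its sub-folder under "/B".
folder_revisions = {"/A": 5, "/A/X": 7, "/A/X/Y": 8, "/B": 6}
state = {"global_revision_id": 10}
new_state = move_folder(folder_revisions, state, "/A/X", "/B/X")
```

In this sketch neither "/A" nor "/B" is stamped, which follows the text: only the records of the moved (target) hierarchy receive the new state.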
[0109] When a user removes a file from RFS 302, remote VFS module
804 updates the associated file record in Files table 404 to
indicate the file is deleted (e.g., moves the file to a trash table
(not shown), updates status field 442 such that the file record
will be moved to a trash table, etc.). Additionally, remote VFS
module 804 increments global revision identifier 310 to a new state
and then stores the new state of the global revision identifier 310
in Revision_ID field 416 of the folder record 402 of the folder
containing the deleted file.
[0110] When a user removes a folder from RFS 302, remote VFS module
804 updates the associated folder record in Folders table 402 to
indicate the folder is deleted (e.g., updates the path field 408
with a trash folder path, moves the folder record to a trash table
(not shown), etc.). Otherwise, remote VFS module 804 takes no
action with respect to Revision_ID field 416 of the folder record
associated with the deleted folder. This is because the folder
ceases to exist in the global namespace 200. Notice of this
folder's deletion will be propagated to local cloud server 104, for
example, when local cloud server 104 pulls a new folder listing
where the (previously-synchronized) deleted folder is now absent.
This is also the reason why the revision identifier of a source
folder does not need to be updated when a folder is moved from it.
Notice could also be provided to the local cloud via a steady-state
event synchronization process such as that described in U.S. Pub.
No. 2014/0040196 A1, dated Feb. 6, 2014 by Wijayaratne et al. and
entitled "System and Method for Event-Based Synchronization of
Remote and Local File Systems", which is incorporated by reference
herein in its entirety.
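Detecting a deleted or moved-away folder by its absence from a new folder listing amounts to a set difference. The sketch below is illustrative only; `folders_removed_remotely` is a hypothetical name, and it assumes the listing is complete for the requested namespace view (that is, not filtered down to changed folders only).

```python
from typing import Iterable, List


def folders_removed_remotely(previously_synced: Iterable[str],
                             listed_now: Iterable[str]) -> List[str]:
    """Folders known locally from a prior sync that no longer appear in the listing."""
    return sorted(set(previously_synced) - set(listed_now))


# Example data (assumed values).
previously_synced = ["/D", "/D/Old", "/D/Keep"]
listed_now = ["/D", "/D/Keep", "/D/New"]
removed = folders_removed_remotely(previously_synced, listed_now)
```

Because absence alone carries the signal, no revision stamp is needed on the deleted folder's record or on the source folder of a folder move.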
[0111] When a user restores one or more folder(s) and/or file(s) to
RFS 302 from the trash, remote VFS module 804 updates the
associated folder record(s) and file record(s) (e.g., returns them
to Folders table 402 and Files table 404 from respective trash
tables, etc.). Additionally, for a folder restore, remote VFS
module 804 increments global revision identifier 310 to a new state
and then stores the new state in the Revision_ID field 416 of each
folder record of the restored folder hierarchy. For synchronization
purposes, these restored folder records represent the folders that
may have different contents from the subscribed namespace 204 on LFS 306.
In the case of a trash restore or a version restore of a file,
remote VFS module 804 increments global revision identifier 310 to
a new state and then stores the new state in the Revision_ID field
416 of the folder record of the folder containing the restored or
reverted file.
[0112] In summary, each action described in FIG. 9 is associated
with one or more "events" that occur to RFS 302. Folders affected
by an action have their associated folder records marked with a new
state of global revision identifier 310, which is incremented in
response to the action occurring. As an alternative, the folder
records could be updated with the current state of the global
revision identifier 310 and global revision identifier 310 could
then be incremented to a new state to use for the next action. The
achieved result would be the same in either case, depending on the
starting value of the global revision identifier 310. Additionally,
because each action to RFS 302 denotes an event, the revision
identifier can also be used as an event identifier and defines a
point in time in the event stream for a domain. Because of these
properties, global revision identifier 310 can also be used to
version folders.
[0113] FIG. 10 is a block diagram showing local cloud server 104 in
greater detail. (Note the elements of FIG. 10 apply equally to
local cloud server 106.) In this particular embodiment, local cloud
server 104 is an enhanced network attached storage (NAS) device
that includes one or more processing units 1004, working memory
1006, one or more user interface devices 1008, a local network
adapter 1010, a local cloud services component 1012, and
non-volatile memory 1014, all intercommunicating via an internal
bus 1016. Processing unit(s) 1004 impart functionality to local
cloud server 104 by executing code stored in any or all of
non-volatile memory 1014, working memory 1006, and local cloud
services 1012. A wide-area network adapter 1018 facilitates
communication with remote cloud server 102 via local network 114
and the Internet 108.
[0114] Non-volatile memory 1014 also provides local file storage
for client files/objects. By way of example, the nonvolatile memory
1014 is shown to include (in addition to other types of memory) a
set of hard drives arranged in a RAID configuration. The client's
file system on the RAID drives can be accessed by local users 112
via local network 114, as is known in the art.
[0115] Local cloud services 1012 represents hardware, software,
firmware, or some combination thereof, that provides the
synchronization functionality described herein. Local cloud
services 1012 also provide file storage and retrieval services to
local users 112. The file storage functionality of local cloud
services 1012 will not be described in detail herein, except to the
extent it relates to the synchronization aspects, so as not to
unnecessarily complicate this disclosure.
[0116] FIG. 11 is a relational diagram of the functional aspects of
local cloud server 104 or, alternatively, local cloud server 106 in
greater detail. In this illustrated embodiment, the functional
aspects are provided by local cloud services 1012 but can be
distributed across other service modules or even other
machines.
[0117] LFS handler 1102 receives requests for access (e.g., logins,
read requests, write requests, etc.) from local users 112. In this
particular example, local users 112 are WINDOWS.RTM. clients, and
LFS handler 1102 includes a server application implementing Samba,
which allows local cloud server 104 to interact with the local
users 112. Local user 112 includes a device/process used to access
the objects in LFS 306 that are hosted by local cloud server 104,
including the synchronized files associated with subscribed
namespace 204. A user maps the "share" that is exported, in
accordance with the user's permissions, by LFS handler 1102 (e.g.,
via Common Internet File System (CIFS), SMB, etc.) and then
accesses the files and folders within the exported share.
[0118] A local VFS module 1104 monitors the actions of local user
112 to detect changes being made to LFS 306 by local user 112. When
local VFS module 1104 detects a change to LFS 306, local VFS module
1104 modifies the LFS metadata 308 to record the change. LFS object
I/O module 1108 manages the I/O subsystem for organized storage and
access of data files stored in local file storage 1014 on LFS 306.
Local VFS module 1104 conveys data objects between LFS handler 1102
and LFS Object I/O module 1108 when data files need to be
transferred between local file storage system 1014 and local user
112.
[0119] A local synchronizer 1110 is responsible for driving the
synchronization processes on local cloud server 104.
Synchronization settings 1112 are configuration settings that
define the subscribed namespace 204 associated with local cloud
server 104 and optionally define a synchronization interval (e.g.,
every 3 hours, every 30 minutes, every 5 minutes, etc.) which
triggers local synchronizer 1110 to resynchronize all of, or
particular portions of, subscribed namespace 204. In this
embodiment, synchronization settings 1112, including subscribed
namespace 204, are determined when service begins with remote cloud
server 102 and can be updated (e.g., to redefine subscribed
namespace 204) by an administrator as desired.
[0120] Local synchronizer 1110 can also be triggered into
initiating a rescan synchronization for various other reasons. For
example, local cloud server 104 might implement a second type of
synchronization process (e.g., steady-state synchronization (SSS),
etc.) in addition to rescan synchronization. The second
synchronization process might trigger an FRS or LRS if it
crashes, experiences conflicts, etc. Other types of synchronization
processes are described in U.S. Pub. No. 2014/0040196 A1, dated
Feb. 6, 2014 by Wijayaratne et al. and entitled "System and Method
for Event-Based Synchronization of Remote and Local File Systems",
which is incorporated by reference herein in its entirety. Another
event-based synchronization system and method is described in U.S.
Pub. No. 2016/0019233 A1, published Jan. 21, 2016 by Wijayaratne et
al. and entitled "System And Method For Policy Based
Synchronization Of Remote And Local File Systems", which is also
incorporated by reference herein in its entirety.
[0121] Local synchronizer 1110 initializes synchronization by
establishing a connection (if none already exists) with remote
cloud server 102 via a remote cloud interface 1114. Local
synchronizer 1110 communicates with remote cloud server 102 via
APIs and is operative to query remote cloud 102 with one or more
subscribed namespace identifier(s) 312(1-n), corresponding to a
requested namespace view 315 that it wishes to synchronize. Local
synchronizer 1110 is also operative to query remote cloud server
102 with prior revision identifier(s) 314 if one or more have been
previously stored. Local synchronizer 1110 obtains the subscribed
namespace identifier(s) 312(1-n) and prior revision identifier(s)
314(1-n) by retrieving the data from the folder records 502 of LFS
metadata 308 (e.g., the data in canonical path field 508, prior
revision identifier 524, etc.).
[0122] In response to each folder listing query and subscribed
namespace identifier 312 from local synchronizer 1110, remote cloud
server 102 returns the folder metadata 316 for all folders of RFS
302 that are part of the requested namespace view 315, along with
the current state of global revision identifier 310, to local cloud
server 104. The returned folder metadata 316 includes a revision
identifier (the revision state contained in Revision_ID field 416)
for each folder identified therein. Local synchronizer 1110
receives and stores the current global revision identifier 310.
Local synchronizer 1110 also stores each of the records returned in
the folder metadata 316 as part of an RFS metadata snapshot 320 or
limited RFS metadata snapshot 330 in LFS 306.
[0123] Subsequently, local synchronizer 1110 determines which
folders identified in the folder metadata 316 have changed since
they were last synchronized. Local synchronizer 1110 accomplishes
this by, for each folder identified in folder metadata 316,
comparing a revision identifier returned for that folder with a
prior revision identifier (Prior_Revision_ID field 524) stored for
that folder in LFS 306. If this comparison reveals that a folder
has been modified (e.g., if the revision state of the folder in RFS
302 is greater than the prior revision state of the folder in LFS 306),
then local synchronizer 1110 makes a file listing query for file
metadata 318 contained in that folder. In contrast, if the
comparison reveals that the folder has not changed (e.g., if the
revision state of the folder in RFS 302 is the same as the prior
revision state of the folder in LFS 306), then local synchronizer
1110 does not make a file listing query for that folder and does
not download file metadata 318 for that folder. In this embodiment,
local synchronizer 1110 requests file metadata 318 on a
folder-by-folder basis, as indicated above. When the file metadata 318 is
received, local synchronizer 1110 stores the file metadata 318 as
part of RFS metadata snapshot 320 or limited RFS metadata snapshot
330. The RFS metadata snapshot 320 and limited RFS metadata
snapshot 330 can be partitioned by folder and file similar to RFS
metadata 304 for efficient searching.
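The comparison performed by local synchronizer 1110 can be sketched as below. The function name `folders_needing_file_listing` and the dictionary inputs are hypothetical, and treating a folder with no stored prior revision identifier as changed is an assumption made for the example (it corresponds to a folder that has never been synchronized).

```python
from typing import Dict, List, Optional


def folders_needing_file_listing(remote_revisions: Dict[str, int],
                                 prior_revisions: Dict[str, Optional[int]]) -> List[str]:
    """Return the folders whose file metadata should be requested."""
    changed = []
    for path, revision_id in remote_revisions.items():
        prior = prior_revisions.get(path)
        # No stored prior value: assumed to mean the folder was never synchronized.
        if prior is None or revision_id > prior:
            changed.append(path)
    return changed


# Example data (assumed values).
remote_revisions = {"/D": 98, "/D/Accounting": 100, "/D/New": 100}
prior_revisions = {"/D": 98, "/D/Accounting": 95}
to_query = folders_needing_file_listing(remote_revisions, prior_revisions)
```

Only the folders returned here trigger a file listing query, so the file metadata for "/D" is never downloaded in this example.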
[0124] An exemplary folder listing API employed by local
synchronizer 1110 according to the present invention will now be
discussed. The folder listing API obtains and outputs the folder
metadata 316 for each folder of RFS 302 that is part of the
requested namespace view 315 defined by the subscribed namespace
identifiers 312(1-n). The returned list of folders can be acquired
and/or filtered based on the permissions set in the remote cloud
102. (In the alternative embodiments discussed above where remote
cloud server 102 makes a comparison of the RFS folder revision data
and the prior revision identifier(s) 314(1-n), the folder listing
output would only list folders that might have changed since their
last synchronizations).
[0125] Initially, local synchronizer 1110 sends one or more folder
listing queries to remote cloud server 102. Each folder listing
query includes a subscribed namespace identifier 312, which
identifies a folder of interest in subscribed namespace 204 (e.g.,
by the path in canonical path field 508 of the folder record of
interest). Each folder listing query can also include a prior
revision identifier 314 (if available) corresponding to the prior
state of global revision identifier 310 stored for the folder of
interest in Prior_Revision_ID field 524.
[0126] Each folder listing query results in the output shown below.
The "#start_folder_meta" and "#start_folder_list" are different
sections of the data returned by remote cloud 102.
TABLE-US-00001
#start_folder_meta
#start_folder_list
Folder_ID1\tCanonical_Path1\tLstmtime1\tRevision_ID1\n
Folder_ID2\tCanonical_Path2\tLstmtime2\tRevision_ID2\n
...
#end_folder_list
global_rev_id\t100\n
#end_folder_meta
[0127] The #start_folder_list section of the output lists folders
and associated metadata for each folder in RFS 302 on remote cloud
server 102 that corresponds to the folder identified by the
subscribed namespace identifier 312 or is one of its sub-folders.
The metadata returned for each folder includes, but is not limited
to, the metadata in Folder ID field 406, Canonical_Path field 408,
Lstmtime field 414, and Revision_ID field 416 of each associated
folder record 402. Local synchronizer 1110 saves the returned list
of folders and associated folder metadata 316 in LFS 306 as part of
RFS metadata snapshot 320 or limited RFS metadata snapshot 330.
Here, the amount of returned folder metadata for each folder is
limited to conserve bandwidth.
[0128] The #start_folder_meta section of the output returns the
current state (e.g., "100") of global revision identifier 310 as
seen on the remote cloud server 102 and defines a tab spacing
("\t") in the output. Thereafter, the folder listing API ends and
local synchronizer 1110 saves the provided state of the global
revision identifier 310 for updating Prior_Revision_ID fields 524
of the synchronized folders in LFS 306 when synchronization is
complete. The output of the folder listing query is provided in
flat-file format. Each "\t" in the listing translates to a tab in
the output, whereas each "\n" translates to a new line in the
output.
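By way of illustration only, the folder listing output described above could be consumed with a parser along the following lines. This is a minimal Python sketch, not the implementation used by local synchronizer 1110. The names FolderEntry and parse_folder_listing are hypothetical, the sketch assumes each record occupies one tab-separated line, and the folder identifiers and timestamps in the sample are invented (the revision values are taken from the FIG. 12A discussion).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FolderEntry:
    """One row of the #start_folder_list section (hypothetical container)."""
    folder_id: str
    canonical_path: str
    lstmtime: str
    revision_id: int


def parse_folder_listing(text: str) -> Tuple[List[FolderEntry], Optional[int]]:
    """Return the listed folders and the reported global revision state."""
    folders: List[FolderEntry] = []
    global_rev_id: Optional[int] = None
    in_folder_list = False
    for line in text.splitlines():
        marker = line.strip()
        if not marker:
            continue
        if marker == "#start_folder_list":
            in_folder_list = True
        elif marker == "#end_folder_list":
            in_folder_list = False
        elif marker.startswith("#"):
            continue  # other section markers, e.g. #start_folder_meta
        elif in_folder_list:
            # Assumed layout: Folder_ID \t Canonical_Path \t Lstmtime \t Revision_ID
            folder_id, path, lstmtime, revision_id = line.split("\t")
            folders.append(FolderEntry(folder_id, path, lstmtime, int(revision_id)))
        else:
            # Key/value lines outside the folder list, e.g. "global_rev_id\t100"
            key, _, value = line.partition("\t")
            if key == "global_rev_id":
                global_rev_id = int(value)
    return folders, global_rev_id


# Sample input: folder IDs and timestamps are invented for illustration.
sample = (
    "#start_folder_meta\n"
    "#start_folder_list\n"
    "F1\t/chicago_domain/\t1650000000\t357899\n"
    "F2\t/chicago_domain/accounting/\t1650000500\t495632\n"
    "#end_folder_list\n"
    "global_rev_id\t589638\n"
    "#end_folder_meta\n"
)
folders, global_rev_id = parse_folder_listing(sample)
```

The returned global revision state would be held until synchronization completes, as described above, rather than written to LFS metadata immediately.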
[0129] For an initial synchronization or an FRS, local synchronizer
1110 can make a folder listing query for "Chicago_Domain" folder
210. In response, remote cloud server 102 provides a folder listing
consisting of the identified folder 210 as well as each of its
sub-folders. Folder 210 is the top-most folder in the hierarchy of
subscribed namespace 204, and therefore, providing its path is an
efficient means by which to obtain a folder listing of the entire
subscribed namespace 204. If there are multiple top-level folders
in the subscribed namespace, then local synchronizer 1110 would
make a folder listing query that includes all top-level
folders.
[0130] For an LRS, a folder listing query can be defined and made
for each folder of interest by local synchronizer 1110. For
example, an LRS of folders 216 and 220 in subscribed namespace 204
might be needed. Accordingly, local synchronizer 1110 can make a
first folder listing query for folder 216 comprising a subscribed
namespace identifier 312(1) including the path " . . .
/chicago_domain/accounting/billing/" from Canonical Path field 508
of the corresponding record 502 and a prior revision identifier
314(1) stored in Prior_Revision_ID field 524 of the same folder
record 502. A second folder listing query could be made for folder
220 comprising a subscribed namespace identifier 312(2) including
the path " . . . /chicago_domain/us_teams/" from Canonical Path
field 508 of the corresponding record 502 and a prior revision
identifier 314(2) from Prior_Revision_ID field 524 of the same
folder record 502. Note the prior revision identifiers 314(1-2)
might be different, depending on when the folders 216 and 220 were
last synchronized.
[0131] Subscribed namespace identifiers 312 that identify lower
folders (e.g., folder 216, folder 220, etc.) in subscribed
namespace 204 will also cause those lower folders to be treated as
the top folder for that query. Accordingly, remote cloud server 102
will return folder metadata 316 for the requested folder and any
sub-folders. This method of returning a listing of all sub-folders
of a requested folder conserves bandwidth during the folder listing
query.
[0132] An exemplary file listing API employed by the present
invention will now be discussed. Based on the folders returned by
the folder listing API and, in particular embodiments, a comparison
of their revision identifiers with the prior revision identifiers
for those folders stored in LFS metadata 308, the local
synchronizer 1110 is able to identify the folders that have
changed, and then obtain the file metadata 318 for files in those
folders. Accordingly, local synchronizer 1110 makes file listing
calls to remote cloud 102 based on the folders returned by the
folder listing API. Remote cloud server 102 responds with the
following output for the file listing API.
TABLE-US-00002
#start_file_list
#start_folder_file_meta
folder_path\tstatus_code\tnum_files\tlatest_mtime\n
File_ID1\t\Name1\tLstmtime1\tSize1\tChecksum1\n
File_ID2\t\Name2\tLstmtime2\tSize2\tChecksum2\n
...
#end_folder_file_meta
#start_folder_file_meta
...
#end_folder_file_meta
#start_unprocessed_query_folders
folder_path_1
folder_path_2
#end_unprocessed_query_folders
#end_file_list
[0133] For each folder path provided, the query returns some
additional RFS metadata for that folder followed by a list of files
and metadata for those files stored within each requested folder.
The metadata returned for the requested folder includes the
folder's (canonical) path, a status code ("status_code"), the
number of files ("num_files") contained in the folder to be listed,
and the last time the folder was modified. The status code is used
by remote cloud server 102 to indicate the status of the folder
(e.g., if it is still available, if access is authorized, etc.).
The metadata returned for each file in the requested folder can
include, but is not limited to, the information in File ID field
418, Name field 424, Lstmtime field 428, Size field 432, and
Checksum field 430 from the file records 404 of RFS metadata
304.
[0134] The output for each returned folder is delineated by
"#start_folder_file_meta" and "#end_folder_file_meta" and is
provided in flat-file format. As before, "\t" translates to a tab
in the output and "\n" translates to a new line.
[0135] The output for #start_unprocessed_query_folders indicates
any requested folders that remote cloud server 102 could not
process for file list output, for example, due to temporary
capacity issues. Local synchronizer 1110 could make note of these
paths and take appropriate action (e.g., logging the unprocessed
folders, making a new file listing request for the unprocessed
folders, not updating the Prior_Revision_ID field 524 for those
folders so they will be synchronized again in the future, etc.).
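As a further illustration, the file listing output could be parsed as sketched below. This is a hedged Python sketch under stated assumptions: the first line of each folder section is assumed to be the folder header (path, status code, file count, latest modification time), each subsequent line a tab-separated file record, and each unprocessed folder path is assumed to occupy its own line. The names FileEntry, FolderFiles, and parse_file_listing, the status code "0", and the file identifier, size, and checksum in the sample are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FileEntry:
    file_id: str
    name: str
    lstmtime: str
    size: int
    checksum: str


@dataclass
class FolderFiles:
    folder_path: str
    status_code: str
    num_files: int
    latest_mtime: str
    files: List[FileEntry] = field(default_factory=list)


def parse_file_listing(text: str) -> Tuple[List[FolderFiles], List[str]]:
    """Return per-folder file listings and the unprocessed folder paths."""
    folders: List[FolderFiles] = []
    unprocessed: List[str] = []
    section: Optional[str] = None
    current: Optional[FolderFiles] = None
    for line in text.splitlines():
        marker = line.strip()
        if not marker:
            continue
        if marker.startswith("#start_"):
            section, current = marker[len("#start_"):], None
        elif marker.startswith("#end_"):
            section = None
        elif section == "folder_file_meta":
            fields = line.split("\t")
            if current is None:
                # First line of the section: the folder header
                path, status, num_files, mtime = fields
                current = FolderFiles(path, status, int(num_files), mtime)
                folders.append(current)
            else:
                file_id, name, lstmtime, size, checksum = fields
                current.files.append(
                    FileEntry(file_id, name, lstmtime, int(size), checksum)
                )
        elif section == "unprocessed_query_folders":
            unprocessed.append(marker)
    return folders, unprocessed


# Sample input: identifiers, sizes, checksums, and status code are invented.
sample = (
    "#start_file_list\n"
    "#start_folder_file_meta\n"
    "/chicago_domain/accounting/\t0\t1\t1650000500\n"
    "X1\trevenue.xlsx\t1650000400\t2048\tabc123\n"
    "#end_folder_file_meta\n"
    "#start_unprocessed_query_folders\n"
    "/chicago_domain/us_teams/\n"
    "#end_unprocessed_query_folders\n"
    "#end_file_list\n"
)
listed, unprocessed = parse_file_listing(sample)
```

A caller could retain the unprocessed paths and re-query them later, which corresponds to one of the actions suggested above.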
[0136] Together, the folder metadata 316 and file metadata 318
returned by remote cloud server 102 form a limited RFS snapshot 330
in the case of a rescan synchronization. Local synchronizer 1110
compares the data in limited RFS snapshot 330, the data in an LFS
metadata snapshot 322 of the corresponding portions of LFS metadata
308, and the data in LVS database 328 to determine differences
between the subscribed namespace on RFS 302 and LFS 306. Based on
these differences, local synchronizer 1110 determines and stores
records of remote RS events and local RS events in a remote RS
events database 1116 and a local RS events database 1118,
respectively. Remote RS events correspond to changes made to RFS
302 since the last synchronization, whereas local RS events 1118
correspond to changes made to LFS 306 since the last
synchronization.
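The comparison described above can be sketched, for folder paths only, as a set comparison. The following Python sketch is a simplification offered under stated assumptions and does not reproduce the full event-generation logic: it treats a path present in the RFS snapshot but absent from both the LVS database and the LFS snapshot as created remotely, and a path present in both the LVS database and the LFS snapshot but absent from the RFS snapshot as deleted remotely. The function name, event labels, and the "/chicago_domain/archive/" path are hypothetical.

```python
from typing import List, Set, Tuple


def remote_folder_events(
    rfs_paths: Set[str], lvs_paths: Set[str], lfs_paths: Set[str]
) -> List[Tuple[str, str]]:
    """Classify folder paths into simplified remote RS events."""
    events: List[Tuple[str, str]] = []
    # In RFS but never synchronized and not present locally: created remotely.
    for path in sorted(rfs_paths - lvs_paths - lfs_paths):
        events.append(("remote_create", path))
    # Previously synchronized and still present locally, but gone from RFS:
    # deleted remotely.
    for path in sorted((lvs_paths & lfs_paths) - rfs_paths):
        events.append(("remote_delete", path))
    return events


rfs = {
    "/chicago_domain/",
    "/chicago_domain/accounting/",
    "/chicago_domain/accounting/billing/",
}
lvs = {
    "/chicago_domain/",
    "/chicago_domain/accounting/",
    "/chicago_domain/archive/",  # hypothetical path of a folder removed from RFS
}
lfs = set(lvs)
events = remote_folder_events(rfs, lvs, lfs)
```

This folder-level shortcut would not be safe for files in a limited snapshot, because files of unchanged folders are intentionally omitted there and their absence does not indicate deletion.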
[0137] Local synchronizer 1110 is also configured to instruct an
event processor 1120 to begin processing the remote and local RS
event records in databases 1116 and 1118. Event processor 1120
consumes the remote and local RS event records in databases 1116
and 1118 and generates RFS operations 324 and LFS operations 326
based on the RS events that are needed to synchronize the requested
namespace view 315 associated with the synchronization. Event
processor 1120 outputs the operations to sync server APIs 1122,
which in turn cause the operations to be applied to LFS 306 or RFS
302.
[0138] As the file system operations 324 and 326 are applied,
portions of RFS 302 and LFS 306 become synchronized. As each folder
and its contents are fully synchronized, local synchronizer 1110
updates (overwrites) the Prior_Revision_ID field 524 for that
folder in LFS metadata 308 with the state of global revision
identifier 310 provided by remote cloud server 102 in response to
the folder listing query.
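The bookkeeping described in this paragraph can be sketched as follows. This is an illustrative Python sketch only: LFS folder records are modeled as plain dictionaries keyed by canonical path, and the function name record_synchronized and the "prior_revision_id" key are hypothetical stand-ins for Prior_Revision_ID field 524. Folders reported as unprocessed are deliberately skipped so that they are picked up by a later synchronization.

```python
from typing import Dict, Iterable, List


def record_synchronized(
    folder_records: Dict[str, dict],
    synced_paths: Iterable[str],
    global_rev_id: int,
    unprocessed_paths: Iterable[str] = (),
) -> List[str]:
    """Overwrite the stored prior revision state of fully synchronized folders."""
    skipped = set(unprocessed_paths)
    updated: List[str] = []
    for path in synced_paths:
        if path in skipped or path not in folder_records:
            continue  # leave the old state so the folder is rescanned later
        folder_records[path]["prior_revision_id"] = global_rev_id
        updated.append(path)
    return updated


records = {
    "/chicago_domain/accounting/": {"prior_revision_id": 396658},
    "/chicago_domain/us_teams/": {"prior_revision_id": 512375},
}
updated = record_synchronized(
    records,
    synced_paths=["/chicago_domain/accounting/", "/chicago_domain/us_teams/"],
    global_rev_id=589638,
    unprocessed_paths=["/chicago_domain/us_teams/"],
)
```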
[0139] The invention provides advantages as described above because
the mutated portions of the requested namespace view 315 on remote
cloud server 102 can be quickly identified and downloaded to local
cloud server 104, while un-altered portions of the requested
namespace view 315 can be excluded. This conserves processor and
memory resources on the remote and local cloud servers, as well as
transmission time and bandwidth. The downloaded folder and file
information can then be compared with corresponding LFS metadata
308 to synchronize the namespaces and associated file data. The
invention can also provide an advantage by complementing or
replacing the snapshot-based "rescan synchronization" processes in
the cloud-based applications as described above.
[0140] Each of FIGS. 12A and 12B contains a table summarizing how
the invention provides advantages when performing a rescan
synchronization. Table 1202 of FIG. 12A illustrates a full rescan
synchronization (FRS), whereas table 1204 of FIG. 12B illustrates a
limited rescan synchronization (LRS).
[0141] In FIG. 12A, the namespace view 315 requested by local cloud
server 104 for synchronization is defined by subscribed namespace
identifier 312(1). Subscribed namespace identifier 312(1)
identifies the canonical path of "/chicago_domain/" folder 210.
Additionally, local cloud server 104 has provided a prior revision
identifier 314(1) for the folder 210, which corresponds to the
revision identifier stored in Prior_Revision_ID field 524 of the
associated folder record 502 in LFS metadata 308. In this case, the
prior revision identifier is 357899 and represents the state of
global revision identifier 310 the last time folder 210 was
synchronized. Folder 210 is the top-most folder in subscribed
namespace 204 and, therefore, folder 210 and each sub-folder in
subscribed namespace 204 will be re-synchronized in an FRS.
However, table 1202 shows only a small portion of the subscribed
namespace 204.
[0142] After local synchronizer 1110 submits a folder listing query
with the requested namespace view 315, remote cloud server 102
returns the current state of its global revision identifier 310
(589638 in this
example) along with folder metadata 316 and file metadata 318, as
discussed above. This folder metadata 316 and file metadata 318
defines the limited RFS snapshot 330, a portion of which is shown
in a first column 1206 of table 1202. The paths contained in LVS
database 328 and LFS snapshot 322 (if any) are shown in second and
third columns 1208 and 1210, respectively. During the FRS, the data
in the first three columns 1206, 1208, and 1210 is compared by
local synchronizer 1110, which generates records of determined
remote RS events and local RS events (not shown) and stores them in
databases 1116 and 1118, respectively. A fourth column 1212 shows
the remote events determined by local synchronizer 1110 based on
its comparison of the first through third columns 1206-1210.
Finally, a fifth column 1214 indicates file system operations that
will be applied in this example based on the remote RS events in
column 1212.
[0143] Because the top-most folder 210 ("/chicago_domain/") was
requested as part of this FRS, limited RFS snapshot 330 includes a
listing of each folder in subscribed namespace 204. In a first row
1216 of table 1202, a comparison between columns 1206 and 1210
indicates that folder 210 has not changed in RFS 302 since it was
last synchronized with LFS 306. This is because the last revision
state (357899) of folder 210 on RFS 302 is the same as the prior
revision state (357899) on LFS 306. Accordingly, limited RFS
snapshot 330 does not include a file listing for folder 210 and, in
this example, no file system operations are generated to
synchronize folder 210. (Note, however, that if folder 210 were
changed in LFS 306, then resynchronization of folder 210 might
still occur due to the changes in LFS 306.)
[0144] In a second row 1218, a comparison between columns 1206 and
1210 indicates that the revision state of " . . . /accounting/"
folder 214 has changed in RFS 302, because its last revision state
(495632) on RFS 302 is greater than the prior revision state
(396658) on LFS 306. Accordingly, a remote RS event and associated
file system operations will be generated to resynchronize the
metadata associated with folder 214 in RFS 302 and LFS 306.
Additionally, because folder 214 is associated with a file system
change, limited RFS snapshot 330 includes a file listing for folder
214, which in this example is represented as file 222
("revenue.xlsx").
[0145] In a third row 1220 of column 1206, no RFS metadata 316 is
provided in limited RFS snapshot 330 for folder 218, indicating
that folder 218 was deleted from RFS 302. Because folder 218 used to
be present in RFS 302, as determined by a comparison with the
entries in LVS database 328 and LFS snapshot 322, file system
operations are generated to delete folder 218 from LFS 306.
[0146] In a fourth row 1222, RFS metadata is provided for folder
216. However, a corresponding path does not exist in LVS database
328 or LFS snapshot 322. Accordingly, local synchronizer 1110
determines that folder 216 was created in RFS 302 and creates a
corresponding remote RS event in remote RS events database 1116.
When this remote RS event is processed, LFS operations 326 will be
generated to create a corresponding folder in LFS 306 and LVS
database 328 will be updated to reflect the synchronized folder
216. Because folder 216 needs to be synchronized, limited RFS
snapshot 330 will also include a file listing (not shown) for the
files contained in folder 216.
[0147] In a fifth row 1224 of table 1202, a comparison between
columns 1206 and 1210 indicates that the revision state of folder
220 (" . . . /us_teams") has not changed in RFS 302 since folder
220 was last fully synchronized with LFS 306, because its last
revision state (512375) on RFS 302 is the same as its prior
revision state (512375) on LFS 306. Thus, folder 220 does not need
to be resynchronized. Accordingly, limited RFS snapshot 330 does
not include a file listing for folder 220.
[0148] Limited RFS snapshot 330 also includes file metadata 318 for
files in the returned folder paths in limited RFS snapshot 330 that
are determined to have changed. Accordingly, limited RFS snapshot
330 will include file metadata 318 for each file in folders 214 and
216. However, no file metadata 318 is included in limited RFS
snapshot 330 for folders 210 and 220, because these folders (or the
files contained therein) did not mutate since their last
synchronization. Accordingly, this illustrates how limited RFS
snapshot 330 can be much smaller and can conserve bandwidth and
system resources over prior art rescan techniques.
[0149] A sixth row 1226 shows file metadata 318 for a
"revenue.xlsx" file 222 stored in "accounting" folder 214. While it
is expected that limited RFS snapshot 330 would contain metadata
318 for many files, only this one example is shown in FIG. 12A for
simplicity. As illustrated by columns 1206, 1208, and 1210, an
entry for file 222 is not contained in LVS database 328 or LFS
snapshot 322 and, therefore, local synchronizer 1110 determines
that the file 222 was created in RFS 302 since the "accounting"
folder 214 was last synchronized. Accordingly, local synchronizer
1110 records a remote RS event in database 1116 such that file
system operations will be generated and applied to create a
corresponding file in LFS 306.
[0150] It should be noted that the above description discusses
changes made to the subscribed namespace 204 from the point of view
of RFS 302. Other changes may have been made to LFS 306 and,
accordingly, local synchronizer 1110 would also generate and record
local RS events in local RS events database 1118 to ensure that the
changes to the version of subscribed namespace 204 on LFS 306 were
also propagated to RFS 302.
[0151] Additionally, when a folder is successfully synchronized,
local synchronizer 1110 records the state (589638) of the global
revision identifier 310 in the Prior_Revision_ID field 524 of the
corresponding folder record 502 in LFS metadata 308. In the example
FRS in FIG. 12A, the state 589638 will be stored in field 524 of
the folder record 502 for each of folders 214 and 216. Storing this
prior revision identifier indicates that the associated folders and
their file contents were synchronized with RFS 302 as of the time
the global revision identifier 310 on remote cloud server 102 had
the state/value of 589638.
[0152] Table 1204 of FIG. 12B illustrates an exemplary limited
rescan synchronization (LRS) according to the invention that occurs
at some time after the FRS described in FIG. 12A. For an LRS, local
synchronizer 1110 requests synchronization of only limited portions
of the subscribed namespace 204. In the example shown in FIG. 12B,
the namespace view 315 requested by local synchronizer 1110 for
synchronization is defined by subscribed namespace identifier
312(1) and subscribed namespace identifier 312(2). Subscribed
namespace identifier 312(1) identifies the canonical path for
folder 216 (" . . . /billing/") of subscribed namespace 204. Local
synchronizer 1110 also supplies the prior revision identifier
314(1) for the folder 216 from Prior_Revision_ID field 524 of the
corresponding folder record 502 of LFS metadata 308. Here, the
prior revision identifier 314(1) is 589638, which is the state
corresponding with the last FRS in FIG. 12A.
[0153] The namespace view requested by local synchronizer 1110 also
includes the canonical path for folder 220 (" . . . /us_teams/") of
subscribed namespace 204, as identified by subscribed namespace
identifier 312(2) along with the associated prior revision
identifier 314(2) from Prior_Revision_ID field 524 of the
corresponding folder record 502. Here, the prior revision
identifier 314(2) corresponds to a state of 613589, which means
that folder 220 was synchronized at some time between the FRS
described in FIG. 12A and the current LRS.
[0154] After local synchronizer 1110 submits the folder listing
queries with the requested namespace view 315, remote cloud server
102 returns the current state of its global revision identifier 310
(728683 in this
example) along with folder metadata 316 and file metadata 318 for
the mutated portions of RFS metadata 304 associated with the
requested namespace view 315. This folder metadata 316 and file
metadata 318 defines the limited RFS snapshot 330, a portion of
which is shown in a first column 1206 of table 1204. The contents
of LVS database 328 and LFS snapshot 322 are again shown in second
and third columns 1208 and 1210, respectively.
[0155] Limited RFS snapshot 330 includes metadata 316 for path
"/chicago_domain/billing/" (folder 216) and for path
"/chicago_domain/us_teams/" (folder 220). Limited RFS snapshot 330
would also contain folder metadata 316 for any sub-folders of
folders 216 and 220 that may have mutated since the last
synchronization.
[0156] In a first row 1228 of table 1204, a comparison between
columns 1206 and 1210 indicates that the revision state of folder
216 has not changed in RFS 302 since the last FRS, because the
revision state (589638) of folder 216 on RFS 302 is the same as the
prior revision state (589638) on LFS 306. Accordingly, folder 216
is not resynchronized, and limited RFS snapshot 330 does not
include a file listing for folder 216.
[0157] In a second row 1230, the comparison between columns 1206
and 1210 indicates that folder 220 has changed, because the
revision identifier (669853) associated with folder 220 in RFS 302
is greater than the prior revision identifier (613589) for folder
220 on LFS 306. Accordingly, local synchronizer 1110 records this
remote RS event and generates file system operations to synchronize
the metadata for folder 220 on RFS 302 and LFS 306. Because folder
220 has mutated since its last synchronization, limited RFS
snapshot 330 will also include file metadata 318 for each file
stored in folder 220.
[0158] A third row 1232 of column 1206 shows file metadata 318 for
a "teama.docx" file 224 stored in "/us_teams/" folder 220. While it
is expected that limited RFS snapshot 330 would contain metadata
318 for many files associated with the returned folder list, only
this one example is shown in FIG. 12B for simplicity. As
illustrated by columns 1206, 1208, and 1210, an entry for file 224
is not contained in LVS database 328 or LFS snapshot 322 and,
therefore, local synchronizer 1110 determines that the file 224 was
created in RFS 302 since folder 220 was last synchronized. (File
224 is therefore shown dashed in FIG. 2.) Accordingly, local
synchronizer 1110 records a remote RS event that file 224 was
created in RFS 302 such that file system operations will be
generated and applied to create a corresponding file in LFS
306.
[0159] When a folder is successfully synchronized, local
synchronizer 1110 records the state of the global revision
identifier 310 in the Prior_Revision_ID field 524 of the associated
folder record 502 in LFS metadata 308. In this example, the state
728683 will be stored in field 524 of the folder record 502, which
indicates that folder 220 and its contents were synchronized on RFS
302 and LFS 306 as of the time the global revision identifier 310
had the state/value of 728683.
[0160] Some methods of the invention will now be described with
reference to FIGS. 13-17. For the sake of clear explanation, these
methods might be described with reference to particular elements
discussed herein that perform particular functions. However, it
should be noted that other elements, whether explicitly described
herein or created in view of the present disclosure, could be
substituted for those cited without departing from the scope of the
present invention. Therefore, it should be understood that the
methods of the present invention are not limited to any particular
element(s) that perform(s) any particular function(s). Further,
some steps of the methods presented need not necessarily occur in
the order shown. For example, in some cases two or more method
steps may occur simultaneously. These and other variations of the
methods disclosed herein will be readily apparent, especially in
view of the description of the present invention provided
previously herein, and are considered to be within the full scope
of the invention.
[0161] FIG. 13 is a flowchart summarizing a method 1300, in a
remote cloud server 102, for facilitating synchronization of a
subscribed namespace 204 between an RFS 302 and an LFS 306. In a
first step 1302, the remote cloud server 102 maintains a global
revision identifier 310 associated with the RFS 302, where the
global revision identifier 310 has a variable state. In a second
step 1304, remote cloud server 102 establishes a connection with a
user (e.g., remote user 116) and, in a third step 1306, provides
access to RFS 302 to the user. In a fourth step 1308, remote cloud
server 102 receives instruction(s) from the user to modify a
portion of the subscribed namespace 204 on the RFS 302. Responsive
to RFS 302 being modified, in a fifth step 1310, a remote
synchronizer 812 of remote cloud server 102 updates metadata
associated with one or more of the file system objects in the RFS
302 based on a state of the global revision identifier 310. In a
particular method, the Revision_ID field 416 of folder metadata
record(s) 402 is updated according to table 900 shown in FIG. 9,
depending on the modification made to RFS 302. In a sixth step
1312, remote cloud server 102 receives a synchronization request
from local cloud server 104, and in a seventh step 1314, remote
synchronizer 812 provides metadata associated with at least some of
the file system objects of RFS 302 to the local cloud server 104.
In an eighth step 1316, remote synchronizer 812 receives file
system operations from local cloud server 104 and applies them to
RFS 302 to synchronize portions of subscribed namespace 204.
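Steps 1302 and 1310 can be illustrated with the following Python sketch. It is offered under explicit assumptions and is not the update rule of table 900, which is not reproduced here: the global revision identifier is modeled as a monotonically increasing integer, and only the folder that directly contains the modified object is stamped with the new state. The class and method names are hypothetical.

```python
import posixpath
from typing import Dict


class RemoteRevisionTracker:
    """Simplified model of a global revision identifier and folder revisions."""

    def __init__(self, start: int = 0) -> None:
        self.global_revision_id = start
        self.folder_revision: Dict[str, int] = {}

    def record_modification(self, object_path: str) -> int:
        """Advance the global state, then stamp the affected folder with it."""
        self.global_revision_id += 1
        if object_path.endswith("/"):
            folder = object_path  # the modified object is itself a folder
        else:
            folder = posixpath.dirname(object_path) + "/"
        self.folder_revision[folder] = self.global_revision_id
        return self.global_revision_id


tracker = RemoteRevisionTracker(start=100)
new_state = tracker.record_modification("/chicago_domain/accounting/revenue.xlsx")
```

Because the global state only advances, a folder whose stored revision exceeds a previously saved state must have been modified after that state was recorded, which is the property the later comparisons rely on.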
[0162] FIG. 14 is a flowchart summarizing a method of performing
step 1314 (provide RFS metadata) of method 1300. In a first step
1402, remote synchronizer 812 receives one or more subscribed
namespace identifiers 312(1-n) associated with portion(s) of
subscribed namespace 204 from local cloud server 104. Then, in a
second step 1404, remote synchronizer 812 receives one or more
prior revision identifiers 314(1-n) from local cloud server 104
that are associated with the requested subscribed namespace
portions. (Second step 1404 is optional as discussed above.) In a
third step 1406, remote synchronizer 812 provides the current state
of global revision identifier 310 to local cloud server 104. In a
fourth step 1408, remote synchronizer 812 searches the portions of
RFS 302 corresponding to the namespace view 315 associated with
subscribed namespace identifiers 312(1-n) and returns folder
metadata 316 for each folder contained therein. (In alternative
methods, remote synchronizer 812 can provide folder metadata 316
only for folders in the requested namespace view 315 that have a
revision identifier (Revision_ID field 416) between an associated
prior revision identifier 314 and the current state of global
revision identifier 310 (e.g., within a desired window or range).)
Then in a fifth step 1410, remote synchronizer 812 receives file
metadata request(s) from local cloud server 104 for at least some
of the folders returned in folder metadata 316. Accordingly, in a
sixth step 1412, remote synchronizer 812 obtains the file metadata
318 (a file listing) for each requested folder of RFS 302 and
provides the file metadata 318 to local cloud server 104.
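The folder search of step 1408, including the alternative in which only folders within a revision window are returned, might be sketched as follows. This Python sketch assumes RFS folder metadata is available as a mapping from canonical path to revision identifier and that canonical paths end with "/", so that a prefix test selects a folder and its sub-folders. The function name and the mapping are hypothetical. The revision values come from the FIG. 12B discussion.

```python
from typing import Dict, List, Optional, Tuple


def list_folder_metadata(
    rfs_folders: Dict[str, int],
    namespace_path: str,
    prior_revision_id: Optional[int] = None,
    global_revision_id: Optional[int] = None,
) -> List[Tuple[str, int]]:
    """List folders at or under namespace_path, optionally within a revision window."""
    listing: List[Tuple[str, int]] = []
    for path in sorted(rfs_folders):
        if not path.startswith(namespace_path):
            continue  # outside the requested namespace view
        revision_id = rfs_folders[path]
        if prior_revision_id is not None and revision_id <= prior_revision_id:
            continue  # not modified since the prior state
        if global_revision_id is not None and revision_id > global_revision_id:
            continue  # modified after the state reported to the client
        listing.append((path, revision_id))
    return listing


rfs_folders = {
    "/chicago_domain/accounting/billing/": 589638,
    "/chicago_domain/us_teams/": 669853,
}
changed = list_folder_metadata(
    rfs_folders, "/chicago_domain/us_teams/", 613589, 728683
)
unchanged = list_folder_metadata(
    rfs_folders, "/chicago_domain/accounting/billing/", 589638, 728683
)
everything = list_folder_metadata(rfs_folders, "/chicago_domain/")
```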
[0163] FIG. 15 is a flowchart summarizing one method 1500 for
synchronizing all or some of a subscribed namespace 204 between an
RFS 302 and an LFS 306 according to the present invention. In a
first step 1502, local synchronizer 1110 of local cloud server 104
initiates a synchronization process with remote synchronizer 812 of
remote cloud server 102. In a second step 1504, local synchronizer
1110 obtains an RFS snapshot 320 or limited RFS snapshot 330 of RFS
metadata 304 from remote cloud server 102 and stores the RFS
snapshot 320 or limited RFS snapshot 330 in LFS 306. Local
synchronizer 1110 then obtains an LFS snapshot 322 of LFS metadata
308 and stores the LFS snapshot 322 in LFS 306 in a third step
1506. Then, in a fourth step 1508, local synchronizer 1110 compares
the RFS snapshot 320 or limited RFS snapshot 330, the LFS snapshot
322, and data in LVS database 328 (if any exists yet). In a fifth
step 1510, local synchronizer 1110 identifies remote and local RS
events based on the comparison and causes file system operations to
be applied to RFS 302 and LFS 306 to synchronize portions of
subscribed namespace 204. In a sixth step 1512, local synchronizer
1110 updates the Prior_Revision_ID field 524 in each folder record
502 of LFS metadata 308 that was synchronized on RFS 302 and LFS
306.
[0164] FIG. 16 is a flowchart summarizing a first method 1504A of
performing the second step (obtain RFS snapshot) of the method 1500
according to the present invention. In a first step 1602, local
cloud server 104 provides one or more subscribed namespace
identifiers 312(1-n) (e.g., the paths for the top-most folders of
the requested namespace view 315) to remote cloud server 102.
Optionally, local cloud server 104 can also provide one or more
prior revision identifier(s) 314(1-n) to remote cloud server 102.
Then, in a second step 1604, local cloud server 104 downloads
folder metadata 316 (e.g., a list of folder records 402 and their
associated revision identifiers 416) associated with each folder in
the requested namespace view 315 of the subscribed namespace 204
from remote cloud server 102. Additionally, in a third step 1606,
local cloud server 104 downloads and saves the current state of
global revision identifier 310 from remote cloud server 102. Then,
in a fourth step 1608, local cloud server 104 downloads and saves
file metadata 318 (e.g., a list of file records 404) for the files
stored in at least some of the folders identified in the returned
folder metadata 316.
[0165] FIG. 17 is a flowchart summarizing a method of performing
the fourth step 1608 (download file metadata) of the method 1504A of
FIG. 16. In a first step 1702, local synchronizer 1110 determines
the revision state of a first folder of RFS 302 that is provided
(identified) in the folder metadata 316 received from remote cloud
server 102. The revision state corresponds to a state stored in
Revision_ID field 416 of the RFS metadata provided as part of the
folder metadata 316 for the identified folder. Then, in a second
step 1704, local synchronizer 1110 determines the prior revision
state of a corresponding folder in LFS 306. The prior revision
state corresponds to the state stored in Prior_Revision_ID field
524 for the corresponding folder in the LFS metadata. Then, in a
third step 1706, local synchronizer 1110 compares the revision
state for the identified folder in RFS 302 with the prior revision
state of the corresponding folder in LFS 306 to determine if the
folder has changed (e.g., if the revision state of the identified
folder in RFS 302 is greater than the prior revision state for the
corresponding folder in LFS 306). If so, local synchronizer
1110 determines that the identified folder has mutated and, in a
fourth step 1708, makes a file listing query for the identified
folder to remote cloud server 102. Then, in a fifth step 1710,
local synchronizer 1110 receives the file metadata 318 for the
folder. Thereafter, in a sixth step 1712, local synchronizer 1110
determines if more folders are identified in the folder metadata
316 returned by remote cloud server 102. If so, the method returns
to first step 1702. If not, the method ends.
[0166] However, if in third step 1706, local synchronizer 1110
determines that the identified folder in RFS 302 has not changed,
then the method proceeds to sixth step 1712 without downloading
file metadata for the identified folder. In this particular method,
local synchronizer 1110 determines that the identified folder has
not changed if the revision state of the identified folder in RFS
302 is equal to the prior revision state of the corresponding
folder in LFS 306. Accordingly, the metadata retrieved by local
cloud server 104 for the requested namespace view is limited, which
facilitates the advantages of the invention discussed above.
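The loop of FIG. 17 can be sketched in Python as follows. This is an illustrative sketch, not the claimed method: folder metadata is modeled as (canonical path, revision state) pairs, prior revision states as a mapping keyed by path, and a folder with no stored prior state is assumed to need a file listing because it has never been synchronized. The function name is hypothetical, and the revision value shown for the billing folder is invented. The other values follow the FIG. 12A discussion.

```python
from typing import Dict, Iterable, List, Tuple


def folders_needing_file_listing(
    folder_metadata: Iterable[Tuple[str, int]],
    prior_revisions: Dict[str, int],
) -> List[str]:
    """Return the paths for which a file listing query should be made."""
    changed: List[str] = []
    for path, revision_id in folder_metadata:      # steps 1702 and 1712
        prior = prior_revisions.get(path)          # step 1704
        # Step 1706: changed if the RFS revision state exceeds the prior
        # state; a folder unknown locally is assumed to be newly created.
        if prior is None or revision_id > prior:
            changed.append(path)                   # step 1708 would query here
    return changed


folder_metadata = [
    ("/chicago_domain/", 357899),
    ("/chicago_domain/accounting/", 495632),
    ("/chicago_domain/accounting/billing/", 575000),  # invented revision value
    ("/chicago_domain/us_teams/", 512375),
]
prior_revisions = {
    "/chicago_domain/": 357899,
    "/chicago_domain/accounting/": 396658,
    "/chicago_domain/us_teams/": 512375,
}
to_query = folders_needing_file_listing(folder_metadata, prior_revisions)
```

Only the two changed folders are queried, so file metadata for the unchanged folders is never transferred.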
[0167] FIG. 18 is a flowchart summarizing an alternative method
1504B for performing the second step (obtain RFS snapshot) that
facilitates the continued resynchronization of subscribed namespace
204 between LFS 306 and RFS 302. In a first step 1802, local cloud
server 104 provides one or more subscribed namespace identifiers
312(1-n) (e.g., paths for the top-most folder(s) of the desired
namespace view 315 of subscribed namespace 204) to remote cloud
server 102. In a second step 1804 (optionally occurring
simultaneously with step 1802), local cloud server 104 provides
prior revision identifier(s) 314(1-n) for the requested namespace
view 315 to remote cloud server 102. The prior revision
identifier(s) 314(1-n) correspond to the prior revision identifiers
stored in Prior_Revision_ID field 524 of the folder records 502 identified
by the subscribed namespace identifiers 312(1-n). Then, in a third
step 1806, local cloud server 104 downloads folder metadata 316
(e.g., a list of folder records) from remote cloud server 102
corresponding to each folder that is part of the requested
namespace view 315 on RFS 302 that has a revision identifier 416
greater than a corresponding prior revision identifier 314 provided
by local cloud server 104 for that part of the namespace. In a
fourth step 1808, local cloud server 104 downloads and saves the
current state of global revision identifier 310 from remote cloud
server 102. In a fifth step 1810, local cloud server 104 downloads
and saves file metadata 318 (e.g., a list of file records) for the
files contained in at least some (e.g., each) of the folders
identified in the folder metadata 316 received in step 1806.
[0168] The description of particular embodiments of the present
invention is now complete. Many of the described features may be
substituted, altered or omitted without departing from the scope of
the invention. For example, the folder and file metadata might
include additional or fewer fields. As another example, alternative
formats of the global revision identifier can be used. These and
other deviations from the particular embodiments shown will be
apparent to those skilled in the art, particularly in view of the
foregoing disclosure.
* * * * *