System And Method For Guaranteeing Consistent Data Synchronization From A Volatile Data Source Obusek; Ernest [Obusek; Ernest]

System And Method For Guaranteeing Consistent Data Synchronization From A Volatile Data Source

Obusek; Ernest

Patent Application Summary

U.S. patent application number 13/364977 was filed with the patent office on 2013-08-08 for system and method for guaranteeing consistent data synchronization from a volatile data source. This patent application is currently assigned to NetApp, Inc.. The applicant listed for this patent is Ernest Obusek. Invention is credited to Ernest Obusek.

Application Number	20130204841 13/364977
Document ID	/
Family ID	48903811
Filed Date	2013-08-08

United States Patent Application	20130204841
Kind Code	A1
Obusek; Ernest	August 8, 2013

SYSTEM AND METHOD FOR GUARANTEEING CONSISTENT DATA SYNCHRONIZATION FROM A VOLATILE DATA SOURCE

Abstract

Systems and methods for, among other things, updating a destination data set of hierarchical data in relation to a source set of hierarchical data. The method, in certain embodiments, includes receiving an indication that the source data set has one or more changes, initiating a comparison between the source data set and the destination data set, identifying differences and related hierarchical relationships, and altering the destination data set by performing changes in an order that preserves the hierarchical relationships. The method may use the change notifications as an indicator to start the comparison and restart the comparison upon the receipt of a new notification. By using this method, the two data sets can be kept synchronized while preserving hierarchical relationships between the data elements in an environment where the source data set experiences unpredictable changes and cannot be locked.

Inventors:

Obusek; Ernest; (Monroeville, PA)

Applicant:

Name	City	State	Country	Type
Obusek; Ernest	Monroeville	PA	US

Assignee:

NetApp, Inc.
Sunnyvale
CA

Family ID:

48903811

Appl. No.:

13/364977

Filed:

February 2, 2012

Current U.S. Class:	707/624 ; 707/E17.005
Current CPC Class:	G06F 16/273 20190101; G06F 11/14 20130101; G06F 11/2097 20130101; G06F 11/2094 20130101; G06F 2201/82 20130101
Class at Publication:	707/624 ; 707/E17.005
International Class:	G06F 17/30 20060101 G06F017/30

Claims

1. A method for updating a destination data set of hierarchical data in relation to a source data set of hierarchical data, comprising receiving an indication that the source data set has one or more changes, in response to the indication, initiating a comparison between the source data set and the destination data set, identifying differences between the source data set and the destination data set and identifying a hierarchical characteristic for an identified difference, and altering the destination data set by performing changes in an order determined by the hierarchical characteristic of the identified difference.

2. The method of claim 1, wherein the source data and destination data include tables of data elements and wherein initiating a comparison comprises comparing respective data elements within the corresponding tables.

3. The method of claim 1, wherein the source data and destination data include tables of data elements and wherein initiating a comparison comprises comparing a data element within the corresponding tables representative of a version number of the table.

4. The method of claim 1, wherein identifying hierarchical characteristics comprises identifying meta-data associated with a data source and representative of hierarchical relationships between data elements in the data source.

5. The method of claim 1, further comprising generating a list of the identified differences and the hierarchical relationship of data elements associated with the identified differences.

6. The method of claim 1, wherein altering the destination data set comprises, synchronizing the destination data set to the source data set by processing a list of identified differences and related hierarchical relationships and causing changes to the destination data set in a sequence determined by the related hierarchical relationships.

7. The method of claim 1, wherein altering the destination data set includes adding data elements having a parent and child hierarchical relationship to the destination data set in a sequence of additions that add parent data elements prior to associated child data elements.

8. The method of claim 1, wherein altering the destination data set includes deleting from the destination data set, data elements having a parent and child hierarchical relationship through a sequence of operations that delete a parent element in the destination data set subsequent to the deletion of an associated child element.

9. The method of claim 1 further comprising receiving, during the comparison between the source data set and the destination data set, a second indication that the source data set has one or more changes, and in response to the second indication, restarting the comparison between the source data set and the destination data set.

10. The method of claim 9 further comprising waiting a predetermined amount of time after the second indication is received prior to restarting the comparison between the source data set and the destination data set.

11. The method of claim 1, wherein the destination data set includes at least a portion of the data from the source data set.

12. A method for updating a destination data set of hierarchical data in relation to a source data set of hierarchical data, comprising receiving an indication that the source data set has one or more changes, in response to the indication, initiating a comparison between the source data set and the destination data set, identifying a destination data set element to be deleted, identifying child elements that are hierarchically dependent on the identified destination data set element, adding the identified elements having child elements to a list of parent elements, deleting from the destination data set the identified elements lacking child elements, and deleting the elements on the list of parent elements from the destination data set in reverse order from which the elements were added to the list.

13. The method of claim 12 further comprising receiving, during the comparison between the source data set and the destination data set, a second indication that the source data set has one or more changes, and in response to the second indication, deleting the list of parent elements and restarting the comparison between the source data set and the destination data set.

14. The method of claim 13 further comprising waiting a predetermined amount of time after the second indication is received prior to deleting the list of parent elements and restarting the comparison between the source data set and the destination data set.

15. The method of claim 12, wherein the destination data set contains a subset of the data from the source data set.

16. A system for synchronizing a destination data set of hierarchical data with a source data set of hierarchical data, comprising a communications network configured to transmit and receive data and change notifications, a memory for storing a destination data set, a processor connected to the communications network and configured to receive an indication that the source data set has one or more changes, initiate a comparison between the source data set and the destination data set, identify differences between the source data set and the destination data set and related hierarchical relationships between data elements associated with the identified changes, and alter the destination data set by performing changes in an order that is determined as a function of the hierarchical relationships between the elements of the destination data set.

17. The system of claim 16 wherein the source data set comprises a data set stored in a data memory controlled by a node connected to the communication network.

18. The system of claim 16 wherein the source data set comprises a logical data set representative of a reference data set against which other data sets are to be synchronized.

19. The system of claim 16, wherein the processor further comprises a notification processor for detecting a user-generated change to be made to the destination data set and for generating a notification for transmission over the communications network and representative of the change to be made to the destination data set.

20. The system of claim 16, further comprising a comparator for identifying meta-data associated with a data source and representative of hierarchical relationships between data elements in the data source.

21. The system of claim 16, wherein the comparator further comprises a list generator for generating a list of the identified differences and the hierarchical relationship of data elements associated with the identified differences.

22. The system of claim 16, comprising a driver for altering the destination data set to synchronize the destination data set to the source data set by processing a list of identified differences and related hierarchical relationships and to perform a sequence of changes to the destination data.

23. The system of claim 16, wherein the processor is further configured to store a data set listing parent data elements.

24. The system of claim 23, wherein the processor is further configured to receive, during a comparison between the source data set and the destination data set, a second indication that the source data set has one or more changes, and in response, delete the data set listing parent elements and restart the comparison between the source data set and the destination data set.

25. The system of claim 24, wherein the processor is further configured to wait for a predetermined amount of time after the second indication is received prior to restarting the comparison between the source data set and the destination data set.

Description

FIELD OF THE INVENTION

[0001] The systems and methods described herein relate to storage systems, and in particular, to systems and methods for synchronizing a data set and its copies in an environment that does not support file locking.

BACKGROUND

[0002] Today, software systems are used to manage large collections of data to make that data more easily and quickly available. To this end, software systems may replicate some or all of the data set being managed. Replication of the stored data set can improve availability of the data set, as well as fault tolerance. For example, a database management system may replicate a large data set across multiple locations, where each location provides storage for the local copy of the data set and support for processes that access and use the local data set copy. A user at such a location, typically referred to as a node, accesses its local copy of the data set to avoid the bottlenecks that appear when all users are accessing a single master copy, and thereby achieve high availability. Thus, reading data from the database can be done much more quickly when each node has a local copy. Moreover, in the event of a node failure, the data set of the failed node can be replaced or repaired by accessing a data set stored on another node.

[0003] The advantages of using replicated data sets come at the expense of increased system complexity. Although read operations make no changes to the data set, edits and deletions will change the stored data. Replication of a data set requires the system to synchronize duplicated copies of the data set so data integrity is maintained. Maintaining data integrity typically means that each user perceives a single logical data set instead of perceiving a system of multiple independent copies that contain different data.

[0004] To maintain data integrity across multiple nodes, the software system typically designates one data set to be the master copy, and designates the other nodes as copies of the master. As the distributed nodes operate on the different data sets, the operations are monitored by the node storing the master copy. In one system, the master node monitors the other nodes to log the changes the nodes propose to make to their respective local data sets. In this system, a mechanism synchronizes the data sets by coordinating the actions of the separate nodes. As the different nodes are independently making changes to their local data set, some mechanism is to be employed to synchronize the local data sets with the master data set. This synchronization mechanism may, for example, log the proposed changes, make the changes first to the master copy, then publish all the changes made to the master copy to the other nodes. The published updates are made by the other nodes to their local copies, and the nodes then confirm the updates by sending an acknowledgement to the master node. Typically, the master data set publishes the updates as the updates are made. The copies then make the changes as updates are published.

[0005] Although these systems can work well, relying on a master copy to control updates can create a bottleneck that slows overall system performance. To address this, some software systems allow multiple nodes to publish the changes made to their respective local data set. Each node responds to these published changes and coordinates the changes in a way that seeks to maintain the integrity of each local data set relative to the other data sets in the system.

[0006] Although such systems can provide improved performance, the asynchronous character of the data set updates published by multiple nodes can cause data integrity to suffer between data set copies. To address this, some systems employ a file locking process that locks local data set copies during update processes. This ensures that updates are consistent across copies. Although this lock process can work well, it can reduce data set availability.

[0007] As such, there is a need for systems that allow multiple data sets to maintain synchronization through processes that provide data integrity and high availability.

SUMMARY

[0008] In certain embodiments, the system and methods described herein relate to synchronizing data sets, including systems that maintain a consistent view of plural data set copies, as the source data set, which may be a physical master data copy, a logical master data copy or a system abstraction of a master copy, is changing. In one embodiment, the systems and methods described herein identify the hierarchical relationships of a data element being changed within the data set and use the identified hierarchical relationships to control the order of changes to a data set. Hierarchically dependent data elements refer to data elements which are ranked, ordered, or graded into successive levels such that the elements are represented as being above, such as a parent element, below, such as a child element, or at the same level as other data elements.

[0009] It is one realization of the systems and methods described herein, that synchronization algorithms that update data sets as, at least in part, a function of the hierarchical characteristics of the data elements in the data set can improve the internal consistency of the hierarchical relationships, including within environments where changes to the data set are being issued in unpredictable ways. These systems and processes, among other things, reduce transient inconsistencies between copies of a data set. For example, the systems and methods described herein may be employed for use with a distributed data set, where a data set may be distributed amongst one or more nodes, and changes to the data set may come from any node in the network. In such cases, changes to the data set may issue in an order which contradicts the hierarchical relationships between the elements of the data set. For example, child data elements may be added to data copies before the parent data elements. The systems and methods described herein employ the identified hierarchical relationships of the data being changed to avoid creating inconsistencies in local data set copies.

[0010] The systems and methods described herein include, in certain embodiments and practices, methods for receiving an indication that the source data set has one or more changes, initiating a comparison between the source data set and the destination data set, identifying differences and related hierarchical relationships, and altering the destination data set by performing changes in an order that preserve the hierarchical relationships. The method may use the change notifications as an indicator to start the comparison and restart the comparison upon the receipt of a new notification. By using this method, the two data sets can be kept synchronized while preserving hierarchical relationships between the data elements in an environment where the source data set experiences unpredictable changes and cannot be locked.

[0011] In some embodiments, a single node may keep a master copy of the data set, i.e., the source data set, with every other node maintaining a copy of the master data set, i.e., the destination data sets. In alternate embodiments, the data set may be distributed among several nodes, with a respective node storing a subset of the data set. In yet other embodiments, the source data set may be a logical data set that represents a master state for the data set, being the state that the data set copies in the system should synchronize against to achieve coherency. This logical data set may be maintained by a state process that monitors or tracks changes being made to the data set copies and provides a reference state representative of the state that the data set copies will synchronize against. In either case, the source data set could be distributed amongst several nodes, with changes to a destination data set being issued from multiple nodes. In such embodiments, it may not be possible to pause, interrupt, or lock the changes coming in from the source data set. Furthermore, changes from the source data set may be issued out of order, for example due to network latency, or issued in ways which violate the hierarchical relationships between the data elements of the destination data set. For example, changes can issue from the source data set which cause a child element to be left without a corresponding parent element. In such a dynamic environment, the destination data set is to be updated in an order related to the hierarchical relationships of the data elements.

[0012] In some embodiments, a destination node employs a received change notification as an indication to start a comparison between the source and destination data set. Although the change notification may contain information regarding the location and nature of the change, the destination node may initiate the comparison at the beginning of the data set and compares the data elements in sequential order. In some embodiments, the data elements may be grouped into tables with version numbers. In this case, the method may first compare table version numbers to identify a modified table, and compare data elements within the modified table in sequential order, and typically comparing respective data elements within corresponding tables.

[0013] A difference may be identified between the source and destination data set which indicates a change to be made to the destination data set. Optionally, identifying differences may include identifying hierarchical characteristics associated with the differences and may comprise identifying meta-data associated with a data source and representative of hierarchical relationships between data elements in the data source. Further optionally, the method may generate a list of the identified differences and the hierarchical relationship of data elements associated with the identified differences. In one practice, changes are made to synchronize the destination data set to the source data set by processing a list of identified differences and related hierarchical relationships to direct a sequence of changes to the destination data set by the related hierarchical relationships. For example, in the case of an addition of a data element, a parent element is added before its corresponding children elements. In the case of a modification of a data element, the modification may change the maximum number of children that can be assigned to a parent element, which may result in children elements being removed prior to modification of the parent element.

[0014] The deletion of a data element may include steps to provide hierarchical consistency of the destination data set. In particular, the deletion of a parent element may be implemented subsequent to the deletion of its corresponding children elements. To achieve this, the comparison process in one process identifies the corresponding children elements that are hierarchically dependent on the parent element to be deleted. Of the identified data elements, the elements that are themselves parent elements, i.e., elements that have hierarchically dependent children elements, are added to a list of parent elements to be processed, optionally at a later time. The elements that lack hierarchically dependent children elements may be deleted from the destination data set. The elements on the list of parent elements may be deleted in an order that avoids leaving a child element without a corresponding parent element.

[0015] At unscheduled times during the data synchronization process, an additional change notification may be received indicating that the source data set has been modified. Upon receipt of the change notification, the comparison may restart from the beginning of the data set. Although not all changes may have been implemented to the destination data set when the additional change notification is received, the destination data set has been modified in an order that avoids the production of a child element lacking a parent element, preserving hierarchical consistency. In some embodiments and practices, the method is restarted after a delay sufficient to allow more change notifications to be issued before another comparison is initiated.

[0016] The systems and methods described herein allow data synchronization of two data sets to be performed while preserving hierarchical relationships between data elements in an environment that does not support file locking. Other objects, features, and advantages of the present invention will become apparent upon examining the following detailed description of an embodiment thereof, taken in conjunction with the attached drawings.

BRIEF DESCRIPTION OF THE FIGURES

[0017] The systems and methods described herein are set forth in the appended claims. However, for purpose of explanation, several illustrative embodiments are set forth in the following figures.

[0018] FIGS. 1A and 1B are schematic block diagrams of exemplary storage environments in which some embodiments may operate.

[0019] FIG. 2 is a schematic block diagram of a node for use in the distributed data set environment of FIG. 1.

[0020] FIG. 3 shows a conceptual diagram of an exemplary data set with hierarchical data elements.

[0021] FIG. 4 depicts an illustrative example of a modification update from a source data set to a destination data set.

[0022] FIG. 5 depicts an illustrative example of an addition update from a source data set to a destination data set.

[0023] FIG. 6 depicts an illustrative example of a deletion update from a source data set to a destination data set.

[0024] FIG. 7 is a flowchart of one method for performing an addition or modification update from a source data set to a destination data set.

[0025] FIG. 8 is a flowchart of a method for performing a deletion update from a source data set to a destination data set.

DETAILED DESCRIPTION OF CERTAIN ILLUSTRATED EMBODIMENTS

[0026] To provide an overall understanding of the system and methods described herein, certain illustrative embodiments will now be described, including systems and methods for synchronizing data sets with hierarchical data elements. However, it will be understood by one of ordinary skill in the art that the methods and systems described herein may be adapted and modified as is appropriate for other applications and uses and that the system and methods described herein may be modified as suited to address such other uses, and that such additions and modifications will not depart from the scope hereof.

[0027] In one embodiment, systems and methods described herein update a destination data set, which typically is a copy of a larger data set, of hierarchical data in relation to a source set of hierarchical data, where the source data set is typically the data set to which other data sets will be synchronized. The method, in certain embodiments, includes receiving an indication that the source data set has one or more changes, initiating a comparison between the source data set and the destination data set, identifying differences between the source data set and the destination data set and identifying related hierarchical characteristics of the identified differences. The process alters the destination data set by performing changes in an order that is set, at least in part, by the hierarchical relationships of the data set that is being updated. For example, a change to the source data set, which may be in certain embodiments the master data set, may involve the addition of data into the data set. The added data may be hierarchical data with one datum characterized as parent data and a related datum characterized as child data. The systems and methods described herein may alter the source data set and the destination data set, which in certain embodiments may be the local data set copy, through a sequence of operations that add the parent datum and, in a subsequent operation add the child datum.

[0028] This order of operations that is determined, at least in part, by the hierarchical relationship of the data, reduces the likelihood that an intervening read operation of the destination data set will result in the production of child data lacking parent data, where the parent data exists in the source data set. Thus, by using these methods, the two data sets may be synchronized through a process that provides increased data integrity by reducing logical inconsistencies, and preserves hierarchical relationships between the data elements in environments where the source data set experiences unpredictable changes and, typically, cannot or will not be locked.

[0029] FIGS. 1A and 1B are schematic block diagrams of an exemplary data storage environment in which some embodiments may operate. In FIG. 1A, the depicted data storage system 100 includes nodes 102a-d having respective memories 110a-d, network 120, and network links 122a-d. The storage system 100 could be any suitable system for distributing information amongst the depicted plurality of nodes. Typically, the storage system 100 is a computer application that manages a corpus of data and system storage operations, including storing data, retrieving data and organizing the data corpus according to some logical structure, which may for example include files, tables or some other organizational framework. Further, the storage system 100 is only an example of the type of data storage application that can be supported by the systems and methods described herein. The storage system 100 alternatively may be any suitable data storage application, including a file storage system, such as the commercially available Data ONTAP data management environment developed by NetApp, Inc., the assignee hereof, a database application, or any other storage application. As such, those of skill in the art will recognize that the systems and methods described herein can work with any storage system that stores replicated data sets or portions of replicated data sets, such as database systems, storage operating systems, cloud storage systems, data filers with replicated data storage, RAID storage systems, or any other storage system or application having replicated data.

[0030] The nodes 102a-d may be computer systems that implement services of the storage system 100 to store and manage data. To that end, the nodes 102a-d may have and execute one or more applications that submit access and modify requests to the storage system 100, or an application executing on the storage system 100 such as a database application, to access and/or modify the data set maintained by the storage system 100. The nodes 102a-102d may consist of a hardware platform that may be any suitable computing system that can support storing and processing data sets. For example, the nodes 102a-d can be a commercially available network appliance, such as a file server appliance, or maybe a conventional data processing platform such as an IBM.RTM. PC-compatible computer running the Windows.RTM. operating system, or a SUN.RTM. workstation running a UNIX operating system. Alternatively, the nodes 102a-d can comprise a dedicated processing system that includes an embedded programmable data processing system such as a single board computer. The nodes 102a-d also include a memory 110a-d which can be any suitable data memory, including a hard disk drive, RAID system, tape drive system, flash memory, magnetic disk, or any other suitable memory. Additionally, the memory may be real, virtual or a combination of real and virtual. The depicted memories 110a-d store a local data set, which may be a full copy or a partial copy of a master data set.

[0031] The nodes 102a-d in the storage system 100 are connected to the network 120 through a plurality of network links 122a-d. The network 120 can be any suitable connection system for connecting the nodes 102a-d and exchanging data and/or commands. Typically, the network 120 is a computer network such as a Local Area Network (LAN), a Wide Area Network (WAN), a Metropolitan Area Network (MAN), the Internet or any other type of network or communication system and may comprise wired links, wireless links, or a combination of wired and wireless links.

[0032] FIG. 1B depicts a network data storage environment, which can represent a more detailed view of the environment in FIG. 1A. The environment 150 includes a plurality of client systems 154 (154.1-154.M), a clustered storage server system 152, and a computer network 156 connecting the client systems 154 and the clustered storage server system 152. As shown in FIG. 1B, the clustered storage server system 152 includes a plurality of server nodes 158 (158.1-158.N), a cluster switching fabric 160, and a plurality of mass storage devices 162 (162.1-162.N), which can be disks, as henceforth assumed here to facilitate description. Alternatively, some or all of the mass storage devices 162 can be other types of storage, such as flash memory, SSDs, tape storage, etc.

[0033] Each of the nodes 158 is configured to include several modules, including an N-module 164, a D-module 166, and an M-host 168 (each of which may be implemented by using a separate software module) and an instance of, for example, a replicated database (RDB) 170. Specifically, node 158.1 includes an N-module 164.1, a D-module 166.1, and an M-host 168.1; node 158.N includes an N-module 164.N, a D-module 166.N, and an M-host 168.N; and so forth. The N-modules 164.1-164.M include functionality that enables nodes 158.1-158.N, respectively, to connect to one or more of the client systems 154 over the network 156, while the D-modules 166.1-166.N provide access to the data stored on the disks 162.1-162.N, respectively. The M-hosts 168 provide management functions for the clustered storage server system 152. Accordingly, each of the server nodes 158 in the clustered storage server arrangement provides the functionality of a storage server.

[0034] FIG. 1B illustrates that the RDB 170 is a database that is replicated throughout the cluster, i.e., each node 158 includes an instance of the RDB 170. The various instances of the RDB 170 are updated regularly to bring them into synchronization with each other. The RDB 170 provides cluster-wide storage of various information used by all of the nodes 158, including a volume location database (VLDB) (not shown). The VLDB is a database that indicates the location within the cluster of each volume in the cluster (i.e., the owning D-module 166 for each volume) and is used by the N-modules 164 to identify the appropriate D-module 166 for any given volume to which access is requested.

[0035] The nodes 158 are interconnected by a cluster switching fabric 160, which can be embodied as a Gigabit Ethernet switch, for example. The N-modules 164 and D-modules 166 cooperate to provide a highly-scalable, distributed storage system architecture of a clustered computing environment implementing exemplary embodiments of the present invention. Note that while there is shown an equal number of N-modules and D-modules in FIG. 1B, there may be differing numbers of N-modules and/or D-modules in accordance with various embodiments of the technique described here. For example, there need not be a one-to-one correspondence between the N-modules and D-modules. As such, the description of a node 158 comprising one N-module and one D-module should be understood to be illustrative only. Further, it will be understood that the client systems 154 (154.1-154.M) can also act as nodes and include data memory for storing some or all of the data set being maintained by the storage system.

[0036] In systems described herein, a single node may keep a master copy of the data set, which may be the source data set, with every other node maintaining a copy of the master data set, typically the destination data sets. For example, the node 102a may store the master copy, with nodes 102b-d synchronizing their data sets to the master copy stored in the node 102a. When a change to the master data set occurs at node 102a, for example by a user modification of the data set through a user interface associated with, for example, a database application running on node 102a, the node 102a sends a notification to the other nodes 102b-d that a change has been made to the source data set. In some embodiments, the change notification may be one or more data packets suitable for transmission over the network 120 and having data that includes a list of data elements that have been modified in the source data set. For example, the change notification may include data representative of a list such as: [0037] "table 1, element 1, [0038] element 2, . . . element N," where the listed elements are the elements in a table 1 of the tables in the database that have been changed in the source data set. In alternate embodiments, the change notification may be network data packets carrying an alert flag that signals that the source data set has been changed. In this practice the elements changed are not specified and the system is to determine the changes through another process. Upon receipt of this change notification, the nodes 102b-d synchronize their data sets to the source data set of the node 102a.

[0039] In some embodiments, the node storing the master copy may change to a different node. For example, if communication link 122a becomes unavailable, node 102b may be elected or voted by the nodes 102c and 102d to keep the master copy of the data set, with nodes 102c and 102d synchronizing their data sets to the data set kept by node 102b. In alternate embodiments, node 102b may be predetermined to act as the master node should node 102a become unavailable. Other methods for reassigning the master node will be known to those skilled in the art and any suitable technique may be used. In other alternate embodiments, the data set may be distributed among several nodes 102a-d in the storage system 100, with respective nodes storing a subset of the data set. In this embodiment, the source data set could be as a system level logical model or abstraction with the actual data distributed amongst several nodes, and with changes to a destination data set being issued from multiple nodes.

[0040] In any case, the above described system may not be possible to pause, interrupt, or lock the changes coming in from the source data set. Furthermore, changes from the source data set may be issued out of order or in ways which break the hierarchical consistency of the destination data set. For example, a parent data element may be added to the source data set at the node 102b and a child element may be added to the source data set at node 102c. If these changes were to arrive out of order at node 102a, e.g., due to network latency, then a child element could be added to the destination data set before its corresponding parent element, which may violate the hierarchical consistency of the destination data set and cause a transient data set inconsistency. In such a dynamic environment, the systems and methods described herein update the destination data set in an order dictated by the hierarchical relationships such that the destination data set maintains hierarchical consistency.

[0041] FIG. 2 is a schematic block diagram of a single node 202 as used in the storage system 100 of FIG. 1A. The node 202 includes a processor 204 having a comparator 206 and driver 208, a communications device 210 configured to receive source data set 220, and a memory 212 configured to store a destination data set 222 and a list of parent elements 224. As described above, the node 202 can be any suitable computing device for storing and altering data sets with hierarchically dependent elements.

[0042] The processor 204 of node 202 is configured to update a destination data set 222 stored in memory 212 to be consistent with the source data set 220, accessible through the communications device 210. The processor 204 includes at least a comparator 206 and a driver 208. The comparator 206 compares the source data set 220 and the destination data set 222, identifies differences between the data sets, and identifies hierarchical relationships related to those differences. The driver 208 is configured to alter the destination data set 222 by performing changes in an order that is determined by the hierarchical relationships. The processor 204 can take the form of a general purpose processor, a microprocessor such as an application-specific integrated circuit (ASIC), a plurality of microprocessors, a field programmable gate array (FPGA), an embedded processor such as the ARM processor, or any other suitable processor for use in a computing system.

[0043] The communications device 210 of node 202 is configured to receive a notification that a change has been made to the source data set 220 from other nodes in the network 120. The communication device 210 can respond to the notification by accessing all or part of the source data set 220, which in this embodiment, represents the data set to which other data sets are to be synchronized. In one practice, the source data set 220 is passed to processor 204 for comparison with the destination data set 222 stored in memory 212. The communications device 210 may also transmit the destination data set 222 stored in memory 212 as well as a change notification indicating that a change has been made to the destination data set 222 to other nodes in the network 120. The communications device 210 could take the form of a wireless transmitter, a network interface card, a network switch, a network bridge, or any other suitable device for forwarding data through a network.

[0044] The memory 212 of node 202 is configured to store at least a destination data set 222 and a list of parent elements 224. Processor 204 accesses the destination data set 222 and compares it to the source data set 210 received by the communications device 210. Optionally, the processor may temporarily store a list of parent elements in memory, as will be described in more detail below. The memory 212, as described above in relation to FIG. 1A, can be any suitable memory device for storing a data set with hierarchically dependent elements.

[0045] In one illustrative example of a synchronization update, a user at a node, such as the node 102c in FIG. 1A, edits its local copy of the data set. The node 102c issues a change notification that may be transmitted to each other node maintaining a copy of the data set. Returning to FIG. 2, the node 202 is an example of a node that receives a change notification from node 102c, and in this example will also receive the source data set 220 from the node 102c. The data set of node 102c will be treated as the source data set. The processor 204 employs the change notification to initiate a comparison between the source data set 220 and the destination data set 222 stored in memory 212 of node 202. The comparator 206 identifies differences between the source data set 220 and the destination data set 222 as well as one or more data elements to be changed in the destination data set 222. The comparator 206 may use any suitable technique for comparing data sets to identify differences, and comparison operations are typically features of most database management systems. These and other techniques for comparing data sets may be employed. The comparator 206 may also identify the parent-child relationships of the data elements to be changed by revisions of the table relationship data, or using any other suitable method. For example, each element may store a list of its corresponding parent and children elements and an indication of its hierarchical relationships with other elements. The driver 208 may implement the changes in an order that is determined, at least in part, by these hierarchical relationships.

[0046] FIG. 3 depicts a conceptual diagram of an exemplary data set 300 with hierarchically dependent data elements 310-340. The data elements 310-340 may be optionally grouped into table 302 and table 304, labeled Table 1 and Table 2 respectively in FIG. 3. The data elements 310-340 are depicted visually as rows, but it will be appreciated by one skilled in the art that the data elements could be stored in any suitable arrangement and the arrangement selected will depend in part on the application at hand. The data elements 310-340 are ranked into successive levels such that the elements are represented as being above, i.e., a parent element, below, i.e., a child element, or at the same level as other data elements. A parent data element may have multiple children elements, and the parent-child relationships may go to an arbitrary depth to form a family tree. For example, the element 310 has the children elements 312 and 320, listed below the element 310. Further, the element 312 has the children elements 314, 316, and 318. Although not depicted in FIG. 3, children elements may also have multiple parent elements. In some embodiments, the data elements 310-340 may additionally be grouped into tables 302 and 304 with version number 306 or 308, which is incremented whenever a change is made to the corresponding table.

[0047] In some embodiments, the hierarchy may have a strict ordering. For example, the hierarchy may be ordered such that an element cannot be both a parent and a child of another element. Also, the hierarchy may be ordered such that children elements should not exist without their corresponding parent elements. Some embodiments may list the data elements in hierarchical order such that the parent elements are listed sequentially before their corresponding children elements. The data elements may have links, pointers, or metadata such as the links 350 to indicate their hierarchical relationships with other elements.

[0048] The data elements 310-340 of FIG. 1A may represent data structures which include, without limitation, user profile information, user preferences, security information, or log files. In some embodiments, the data elements may also include one or more attributes which describe the hierarchical relationships of the data element with other elements. For example, a parent element may include an attribute which determines the maximum number of children that may be hierarchically dependent upon the parent element.

[0049] Methods for synchronizing a destination data set in relation to a source data set while maintaining hierarchical consistency will now be discussed in relation to the illustrative examples depicted in FIGS. 4-6. For the purposes of discussion, the source node will be node 102a in network 100, while the destination node will be 102b, although it will be appreciated by those skilled in the art that other combinations of source and destination nodes may be possible as described above.

[0050] FIG. 4 depicts an illustrative example of a modification update from a source data set in node 102a to a destination data set in node 102b. The source data set 402 includes table 404 comprising version number 424 and data elements 406-412, and table 414 comprising version number 426 and data elements 416-422. The destination data set 432 includes table 434 comprising version number 454 and data elements 436-442, and table 444 comprising version number 456 and data elements 446-452.

[0051] When a change occurs to data element 422 in the source data set 402, a change notification is issued over network 120 by node 102a. As described above, if the changes are processed in the order that they are received, the hierarchical relationships between the data elements of the destination data set 432 could be violated. As a result, although the change notification may contain information indicating the location and nature of the change, the comparator 206 of destination node 102b uses the change notification primarily as a trigger to initiate a comparison process between the source data set 402 and the destination data set 432.

[0052] In the example depicted in FIG. 4, the data element 422 has been modified in the source data set 402. Upon receipt of the change notification, the comparator 206 of destination node 102b begins a comparison to identify any differences between the source data set 402 and the destination data set 432. In some embodiments, the comparator 206 begins with comparing the first data element 406 in the source data set 402 with the first data element 436 in the destination data set 432. The comparator 206 continues to compare the data elements in sequential order, starting from the top of the data set, until a difference between the source data set 402 and destination data set 432 is found. In alternate embodiments, the comparator 206 begins by comparing the version number 424 of the first table 404 in the source data set 402 with the version number 454 of the first table 434 in the destination data set 432. If the version numbers of the first tables are the same, the comparator 206 continues to the next table until a difference in the version numbers is found. In this way, the comparator 206 can progress through tables which have not been modified and more efficiently narrow the search to altered portions of the data set. For example, the version number 426 denotes table version v1.4 in the source data set 402 while version number 456 denotes table version v1.3 in the destination data set 432. The comparator 206 proceeds to compare the data elements within the tables 414 and 444 in sequential order to identify a difference.

[0053] Comparator 206 may identify a data element to be modified in the destination data set 432 as well as its hierarchical relationships. In FIG. 4, the data element 422 has been modified, and the driver 208 updates the corresponding data element 452 in the destination data set 432. This modification does not change any hierarchical relationships, so the modification to element 452 is performed.

[0054] In some cases, the modification of a data element may change the hierarchical relationships with other data elements. For example, a modification to a parent element could reduce the number of children elements that can be hierarchically dependent on the parent element. In such a case, the driver 208 deletes the children elements prior to the modification to the parent element. In FIG. 4, a modification to element 416 in the source data set 402 may reduce the number of children that can be hierarchically dependent on element 416 from two child elements to one, resulting in element 422 being deleted from the source data set 402. In the synchronization process, the driver 208 deletes element 452 from the destination data set 432 prior to the modification of element 446.

[0055] At unscheduled times during the comparison process, another change notification can be issued by the source node 102a. Upon receipt of the change notification, the comparator 206 of destination node 102b restarts the comparison process from the beginning of the data set. The driver 208 maintains hierarchical consistency by modifying the destination data set 432 in an order that avoids the production of a child element lacking a parent element. Furthermore, this order of operations reduces the likelihood that an intervening read operation of the destination data set 432 will result in a read of a child element that has already been deleted from the source data set 402, thereby maintaining data integrity between the source data set 402 and the destination data set 432. As such, the comparator 206 of destination node 102b may restart the comparison process from the beginning of the data set at unscheduled times while maintaining data integrity. In some embodiments, the comparison process is restarted after a delay to allow more changes to occur to source data set 402 before the update process begins again.

[0056] FIG. 5 depicts an illustrative example of an addition update from a source data set 502 to a destination data set 532. The source data set 502 includes table 504 comprising version number 524 and data elements 506-512, and table 514 comprising version number 526 and data elements 516-523. The destination data set 532 includes table 534 comprising version number 554 and data elements 536-542, and table 544 comprising version number 556 and data elements 546-552.

[0057] When the element 523 is added to the source data set 502, source node 102a issues a change notification over network 120. In a process similar to the method described in relation to FIG. 4, the comparator 206 of destination node 102b uses the change notification primarily as an indication to start a comparison process between the source data set 502 and the destination data set 532. The comparator 206 begins at the top of the data set and may compare the elements in sequential order to determine a difference between the source data set 502 and destination data set 532. In alternate embodiments, the comparator 206 first compares table version numbers 524 and 554 to determine which tables have been modified. Upon finding a modified table, the comparator 206 continues the comparison of the data elements within the modified table in sequential order to determine a difference between the source data set 502 and the destination data set 532.

[0058] In FIG. 5, the destination node 102b determines that the table 514 is a modified table by comparing the table version numbers 526 and 556, and identifies the element 523 as a difference between the source data set and the destination data set. In this case the identified difference is an addition to the source data set 502. In the case of an addition, the driver 208 alters the destination set and to that end performs an addition operation such that parent elements are added before corresponding children elements. By listing the elements of the data set in the order of their hierarchical order, the parent element will be listed before its corresponding children elements, and the driver 208, operating from this list and moving through the list in order, will add the elements in an order determined by the identified hierarchical characteristic of these identified differences.

[0059] FIG. 6 depicts an illustrative example of a deletion update from a source data set 602 to a destination data set 632. The source data set 602 includes table 604 comprising version number 624 and data elements 606-612, and table 614 comprising version number 626 and data elements 616-622. The destination data set 632 includes table 634 comprising version number 654 and data elements 636-642, and table 644 comprising version number 656 and data elements 646-652. The list of parent elements 670 includes depicted parent elements 646 and 648.

[0060] When the elements 616, 618, and 620 are deleted from the source data set 602, node 102a issues a change notification over network 120. In a process similar to the method described in relation to FIGS. 4 and 5, the comparator 206 of destination node 102b uses the change notification as an indication to start a comparison process between the source data set 602 and destination data set 632. In one practice, the comparator 206 begins at the top of the data set or at any point that represents logically the beginning of the data set. For example, this may mean starting at the first row and first column of the first table. However, the starting point used by the comparator 206 and the process applied will depend upon the organization of the data set and other factors. Thus, the process may compare the elements in sequential order to determine a difference between the source data set 602 and destination data set 632. In alternate practices, the comparator 206 first compares table version numbers 624 and 654 to determine which tables have been modified. Upon finding a modified table, the comparator 206 continues the comparison of the data elements within the modified table to determine a difference between the source data set 602 and the destination data set 632.

[0061] In contrast to the modification and addition changes described with reference to FIGS. 4 and 5, a deletion change may include additional steps to maintain hierarchical consistency of the destination data set 632. In particular, the driver 208 may perform the deletion of data elements from the destination data set 632 in an order which deletes children elements prior to the deletion of their corresponding parent elements.

[0062] In the example depicted in FIG. 6, the comparator 206 of destination node 102b identifies element 616 as having been deleted from source data set 602. As a result, hierarchically dependent child elements 618, 620, and 622 are also deleted from source data set 602. The comparator 206 identifies element 646 to be deleted from the destination data set 632. However, instead of deleting element 646, the driver 208 of node 102b identifies the child elements with no other dependent elements in the family tree for element 646, in this case elements 650 and 652. Driver 208 deletes elements 650 and 652 from destination data set 632, as deletion of those elements will not leave any child elements without parent elements. The remainder of the elements in the family tree, in this case elements 646 and 648, are added to a list of parent elements 670 to be processed at a later time. The comparator 206 then continues to identify further differences between the source data set 602 and the destination data set 632.

[0063] The driver 208 of destination node 102b deletes the elements in the list of parent elements 670 from the destination data set 632 beginning with elements in the list 670 which lack hierarchically dependent elements. In some embodiments, the driver 208 identifies which elements in the list lack hierarchically dependent elements, deletes the identified elements from the list 670, and repeats the process until all elements from the list 670 have been deleted. In alternate embodiments, the driver 208 deletes the elements from the list 670 in reverse order from the order in which they were added to the list. Other embodiments may use other techniques for deleting elements from the list 670 in an order that maintains data integrity between data set copies.

[0064] In the example depicted in FIG. 6, the driver 208 deletes element 648 from the destination data set 632. The driver subsequently deletes element 646 from the destination data set 632. Deleting the parent elements in this order preserves the hierarchical consistency of the destination data set 632 by deleting child elements prior to their corresponding parent elements. For example, if parent element 646 was deleted first, and a change notification arrived before elements 648 and 650 were deleted, then the comparison process would be restarted while leaving child elements 648 and 650 without their parent element 646, violating the hierarchical consistency of the destination data set 632.

[0065] FIG. 7 is a flowchart of a method 700 for performing an addition or modification update from a source data set to a destination data set. The method 700 includes receiving a change notification at a destination node at step 702, initiating a comparison between the source and destination data set at step 704, identifying differences between the source and destination data set decision at step 706, identifying hierarchical characteristics of the identified differences at step 708, and altering the destination data based at least in part on the hierarchical relationships at step 710. The method 700 further includes detecting, at decision step 712, whether an additional change notification has been issued and initiating the comparison after an optional delay at step 714. When there are no more differences detected and no further change notifications have been received, the method 700 terminates at step 718.

[0066] As set forth above, a first step, step 702, in performing an addition or modification update is to receive, at a destination node, a change notification from a source node. The change notification may include information regarding the location and the nature of the change. Upon receipt of the change notification, the method 700 initiates a comparison between the source and destination data set at step 704. The comparison can be performed using any suitable technique for identifying differences between two data sets, including a modification, addition, or deletion of a data element. In some embodiments, the comparison begins at the beginning of the data set and compares the data elements in the sequential order. In alternate embodiments, the comparison begins by first identifying modified tables and comparing the elements in the modified tables in sequential order.

[0067] A difference between the source and destination data set is identified in step 706, and the hierarchical relationships of the data elements to be changed in the destination data set is determined at step 708. Using this information, the method 700 alters the destination data set based, at least in part, on the hierarchical relationships to maintain the hierarchical consistency of the destination data set. For example, as discussed above, a modification update to a parent element could affect related children elements, in which case the modification is performed in such a way that the child elements are modified before their corresponding parent element. Similarly, the addition of a child element is made subsequent to the addition of its corresponding parent element.

[0068] At any point, the method 700 determines whether an additional change notification has been received at decision 712. If an additional change notification has not been received, the method 700 continues to identify differences between the source and destination data sets. When all differences between the data sets have been identified and resolved, the synchronization process is complete, and the method 700 terminates at step 718. If an additional change notification has been received at step 712, then the comparison is initiated at the beginning of the data set at step 704. In some embodiments, the method 700 is restarted after a delay at step 714 to allow more change notifications to be issued before another comparison is initiated.

[0069] FIG. 8 is a flowchart of a method for performing a deletion update from a source data set to a destination data set. The method 800 includes receiving a change notification at step 802, initiating a comparison between the source data set and destination data set at step 804, identifying differences between the source and destination data set at decision 806, identifying hierarchical characteristics of the identified differences at step 808, adding parent elements to a list of parent elements at step 810, and deleting children elements that are not parent elements at step 812. The method 800 further comprises detecting whether an additional change notification has been received at decision 814, and upon determining that an additional change notification has been received, purging the parent list at step 816 and initiating the comparison after an optional delay at step 818. When there are no more differences detected and no further change notifications have been received, the method 800 deletes parents from the list of parent elements in reverse order at step 820 and terminates at step 822.

[0070] Similar to the method 700, the method 800 begins at step 802 upon receiving, at a destination node, a change notification from a source node. The change notification may be a series of data packets carried over the network 120 and have information regarding the location and the nature of the change. Upon receipt of the change notification, the method 800 initiates a comparison between the source and destination data set at step 804. The comparison can be performed using any suitable technique for identifying differences between two data sets, including a modification, addition, or deletion of a data element. In some embodiments, the comparison begins at the beginning of the data set and compares the data elements in the sequential order. In alternate embodiments, the comparison begins by first identifying modified tables and comparing the elements in the modified tables in sequential order.

[0071] A difference between the source and destination data set is identified at step 806, and the hierarchical relationships of the data elements to be changed in the destination data set is determined at step 808. In the case of a deletion of a data element, identifying the affected hierarchical relationships includes identifying any children elements that are hierarchically dependent on the element to be deleted. From the identified children elements, the elements which are themselves parent elements are added to a list of parent elements at step 810. The remaining identified children elements, which do not have any hierarchically dependent elements, are deleted from the destination data set at step 812.

[0072] At any point, the method 800 determines whether an additional change notification has been received at decision 814. If an additional change notification has not been received, the method 800 continues to identify differences between the source and destination data sets. When all differences have been identified, the method 800 deletes the data elements from the list of parent elements in reverse order from which the elements were added to the list at step 820. In this way, children elements are deleted before their corresponding parent elements, and the hierarchical consistency of the destination data set is preserved. When the parent elements have been deleted from the destination data set, the synchronization is complete and the method 800 terminates at step 822.

[0073] If an additional change notification is received at any point during the method 800, the list of parent elements is purged, i.e., the elements are deleted from the list at step 816. The method 800 then restarts the comparison at the beginning of the data set at step 804. In some embodiments, the method is restarted after a delay at step 818 to allow more change notifications to be issued before another comparison is initiated.

[0074] Some embodiments of the above described may be conveniently implemented using a conventional general purpose digital computer or server that has been programmed to carry out the methods described herein. Some embodiments may also be implemented by the preparation of application-specific integrated circuits or by interconnecting an appropriate network of conventional component circuits, as will be readily apparent to those skilled in the art. Those of skill in the art would understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, requests, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.

[0075] Some embodiments include a computer program product comprising a computer readable medium having instructions stored thereon/in and, when executed, e.g., by a processor, perform methods, techniques, or embodiments described herein, the computer readable medium comprising sets of instructions for performing various steps of the methods, techniques, or embodiments described herein. The computer readable medium may comprise a storage medium having instructions stored thereon/in which may be used to control, or cause, a computer to perform any of the processes of an embodiment. The storage medium may include, without limitation, any type of disk including floppy disks, mini disks, optical disks, DVDs, CD-ROMs, micro-drives, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, flash memory devices including flash cards, magnetic or optical cards, nanosystems including molecular memory ICs, RAID devices, remote data storage/archive/warehousing, or any other type of media or device suitable for storing instructions and/or data thereon/in.

[0076] Additionally, the systems and methods described herein may be applied to any storage application that includes data set copies which are to be synchronized. These systems can work with any storage medium, including discs, RAM, and hybrid systems that store data across different types of media, such as flash media and disc media. Optionally, the different media may be organized into a hybrid storage aggregate. In some embodiments different media types may be prioritized over other media types, such as the flash media may be prioritized to store data or supply data ahead of hard disk storage media or different workloads may be supported by different media types, optionally based on characteristics of the respective workloads. Additionally, the system may be organized into modules and supported on blades configured to carry out the storage operations described herein. The term "storage system" should, therefore, be taken broadly to include such arrangements.

[0077] Stored on any one of the computer readable medium, some embodiments include software instructions for controlling both the hardware of the general purpose or specialized computer or microprocessor, and for enabling the computer or microprocessor to interact with a human user and/or other mechanism using the results of an embodiment. Such software may include without limitation device drivers, operating systems, and user applications. Ultimately, such computer readable media further includes software instructions for performing embodiments described herein. Included in the programming software of the general-purpose/specialized computer or microprocessor are software modules for implementing some embodiments.

[0078] The method can be realized as a software component operating on a conventional data processing system such as a Unix workstation. In that embodiment, the synchronization method can be implemented as a C language computer program, or a computer program written in any high level language including C++, Fortran, Java or BASIC. See The C++ Programming Language, 2nd Ed., Stroustrup Addision-Wesley. Additionally, in an embodiment where microcontrollers or DSPs are employed, the synchronization method can be realized as a computer program written in microcode or written in a high level language and compiled down to microcode that can be executed on the platform employed.

[0079] It will be apparent to those skilled in the art that such embodiments are provided by way of example only. It should be understood that numerous variation, alternatives, changes, and substitutions may be employed by those skilled in the art in practicing the invention. Accordingly, it will be understood that the invention is not to be limited to the embodiments disclosed herein, but is to be understood from the following claims, which are to be interpreted as broadly as allowed under the law.

* * * * *