U.S. patent application number 11/933155 was filed with the patent office on 2007-10-31 and published on 2008-05-01 for system for automatically shadowing encrypted data and file directory structures for a plurality of network-connected computers using a network-attached memory with single instance storage.
This patent application is currently assigned to Rebit, Inc. Invention is credited to Dennis B. Batchelor, Nathan Probst, David Schwaab.
Application Number | 11/933155 |
Publication Number | 20080104146 |
Family ID | 40595707 |
Publication Date | 2008-05-01 |
United States Patent Application | 20080104146 |
Kind Code | A1 |
Schwaab; David ; et al. | May 1, 2008 |
SYSTEM FOR AUTOMATICALLY SHADOWING ENCRYPTED DATA AND FILE
DIRECTORY STRUCTURES FOR A PLURALITY OF NETWORK-CONNECTED COMPUTERS
USING A NETWORK-ATTACHED MEMORY WITH SINGLE INSTANCE STORAGE
Abstract
The Secure Networked Data Shadowing System is connected to a
plurality of monitored computer systems via an existing
communication medium to store the shadowed data. The data is
encrypted by the monitored computer system using a cryptokey, and
the data file is processed using a hash function prior to
encryption, so the contents of this file are uniquely identified.
Thus, the encrypted file is stored in its encrypted form and the
hash index is used to identify the encrypted file. A "data
de-duplication" process avoids storing multiple copies of the same
files by identifying instances of duplication via the hash index.
Files that have the same hash index can be reduced to a single copy
without any loss of data as long as the file structure information
for each instance of the file is maintained.
Inventors: |
Schwaab; David; (Fort Collins, CO) ;
Probst; Nathan; (Fort Collins, CO) ;
Batchelor; Dennis B.; (Longmont, CO) |
Correspondence
Address: |
PATTON BOGGS LLP
1801 CALIFORNIA STREET, SUITE 4900
DENVER
CO
80202
US
|
Assignee: |
Rebit, Inc.
Longmont
CO
|
Family ID: |
40595707 |
Appl. No.: |
11/933155 |
Filed: |
October 31, 2007 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60863665 | Oct 31, 2006 | |
Current U.S.
Class: |
1/1 ;
707/999.204; 707/E17.007; 707/E17.01 |
Current CPC
Class: |
G06F 11/1471 20130101;
G06F 16/1756 20190101; G06F 21/6218 20130101; G06F 11/1453
20130101; G06F 2201/84 20130101; G06F 11/1451 20130101; G06F
11/1458 20130101; G06F 16/1734 20190101 |
Class at
Publication: |
707/204 ;
707/E17.007 |
International
Class: |
G06F 17/30 20060101
G06F017/30 |
Claims
1. A data shadowing system for the automatic backup storage of data
that is written in the memories of a plurality of monitored
computer systems and the selective retrieval of this data for
restoration to said memories, comprising: memory module means,
connected to a plurality of monitored computer systems via a
communication medium, for storing a shadow copy of the data which
is written in the memories of the plurality of monitored computer
systems; customer data file management means, resident in each of
said plurality of monitored computer systems, comprising: customer
data file mapping means for mapping file directory structures
associated with the customer data written in the memory of the
monitored computer system into file tree information which
identifies original names of customer data files as stored on said
monitored computer system, and a location in said memory module
means which stores said customer data; customer data file hash
means for processing each of said customer data files to generate
an associated hash value of each of said customer data files;
customer data encryption means for encrypting each of said customer
data files using a cryptokey having public and private components;
encrypted customer data file transmission means for transmitting
said encrypted customer file, said hash value and a public
encryption key to said memory module means; and customer data file
storage means, located at said memory module means, for storing
each of said encrypted customer data files, said associated hash
value, and said public encryption key received from each of said
monitored computer systems via said communication medium, in said
memory module means.
2. The data shadowing system of claim 1, further comprising:
duplication means, responsive to a plurality of said encrypted
customer data files having the same hash value, for storing a
single instance of said encrypted customer data file in said memory
module means.
3. The data shadowing system of claim 2 wherein said customer data
file storage means additionally stores said file tree information
as well as said associated hash value and public cryptokey for each
of said plurality of encrypted customer data files having the same
hash value with said single instance of said customer data file in
said memory module means.
4. The data shadowing system of claim 2, further comprising: system
file management means for automatically mapping system files
written in the memory of the monitored computer system into file
tree information which identifies original names of system files as
stored on said monitored computer system and a location in said
memory module means which stores said system files.
5. The data shadowing system of claim 4 wherein said system files
comprise: computer files, exclusive of customer provided data,
including at least one of: master boot record, partition table, and
programs.
6. The data shadowing system of claim 4, further comprising: system
file hash means for processing each of said system files to
generate an associated hash value of each of said system files;
wherein said memory module means further comprises: system file
storage means for storing each of said system files and said
associated hash value in said memory module means; and wherein said
mapping database means stores said hash value of each of said
customer data files in said database.
7. The data shadowing system of claim 6, further comprising:
duplication means, responsive to a plurality of said system files
having the same hash value, for storing a single instance of said
system file in said memory module means.
8. The data shadowing system of claim 7 wherein said system file
storage means additionally stores said file tree information for
each of said plurality of system files having the same hash value
and said associated hash value with said single instance of said
system file in said memory module means.
9. The data shadowing system of claim 1 wherein said customer data
file management means comprises: change journal means for
automatically tracking changes to said customer data files,
including: customer data file creation, customer data file
movement, customer data file content changes, and customer data
file renaming.
10. The data shadowing system of claim 9, further comprising: file
history means, responsive to a modification of a customer data
file, for generating file change data indicative of a difference
between said customer data file and said modified customer data
file; file tracking means for storing said file change data in said
memory module means; and wherein said customer data file management
means stores said file change data in said memory module means and
substitutes said modified customer data file for said customer data
file in said memory module means.
11. A method for the autonomous encrypted backup storage of data
that is written in the memories of a plurality of monitored
computer systems and the selective retrieval of this encrypted data
for restoration to said memories, comprising: storing, in a memory
module which is connected to a plurality of monitored computer
systems via a communication medium, a shadow copy of the data which
is written in the memories of the plurality of monitored computer
systems; managing customer data files, resident in each of said
plurality of monitored computer systems, comprising: mapping file
directory structures associated with the customer data written in
the memory of the monitored computer system into file tree
information which identifies original names of customer data files
as stored on said monitored computer system and a location in said
memory module means which stores said customer data; processing
each of said customer data files to generate an associated hash
value of each of said customer data files; encrypting each of said
customer data files using a cryptokey having public and private
components; transmitting said encrypted customer file, said hash
value and a public encryption key to said memory module; and
storing customer data files, located at said memory module, each of
said encrypted customer data files, said associated hash value, and
said public encryption key received from each of said monitored
computer systems via said communication medium, in said memory
module.
12. The method for the autonomous encrypted backup storage of data
of claim 11, further comprising: storing, in response to a
plurality of said encrypted customer data files having the same
hash value, a single instance of said encrypted customer data file
in said memory module.
13. The method for the autonomous encrypted backup storage of data
of claim 12 wherein said step of storing customer data files
additionally stores said file tree information as well as said
associated hash value and public cryptokey for each of said
plurality of encrypted customer data files having the same hash
value with said single instance of said customer data file in said
memory module.
14. The method for the autonomous encrypted backup storage of data
of claim 12, further comprising: automatically mapping system files
written in the memory of the monitored computer system into file
tree information which identifies original names of system files as
stored on said monitored computer system and a location in said
memory module which stores said system files.
15. The method for the autonomous encrypted backup storage of data
of claim 14 wherein said system files comprise: computer files,
exclusive of customer provided data, including at least one of:
master boot record, partition table, and programs.
16. The method for the autonomous encrypted backup storage of data
of claim 14, further comprising: processing each of said system
files to generate an associated hash value of each of said system
files; wherein said step of storing a shadow copy of data files
further comprises: storing each of said system files and said
associated hash value in said memory module; and wherein said step
of mapping customer data files stores said hash value of each of
said customer data files in said database.
17. The method for the autonomous encrypted backup storage of data
of claim 16, further comprising: storing, in response to a
plurality of said system files having the same hash value, a single
instance of said system file in said memory module.
18. The method for the autonomous encrypted backup storage of data
of claim 17 wherein said system file storage means additionally
stores said file tree information for each of said plurality of
system files having the same hash value and said associated hash
value with said single instance of said system file in said memory
module means.
19. The method for the autonomous encrypted backup storage of data
of claim 21 wherein said customer data file management means
comprises: change journal means for automatically tracking changes
to said customer data files, including: customer data file
creation, customer data file movement, customer data file content
changes, and customer data file renaming.
20. The method for the autonomous encrypted backup storage of data
of claim 19, further comprising: file history means, responsive to
a modification of a customer data file, for generating file change
data indicative of a difference between said customer data file and
said modified customer data file; file tracking means for storing
said file change data in said memory module means; and wherein said
customer data file management means stores said file change data in
said memory module means and substitutes said modified customer
data file for said customer data file in said memory module means.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a utility filing which claims priority
to U.S. Provisional Application No. 60/863,665, titled "Data Backup
Systems And Methods" and filed on 31 Oct. 2006. This application is
also related to the following U.S. Patent Applications filed on the
same date as the present application: System For Automatically
Shadowing Data And File Directory Structures That Are Recorded On A
Computer Memory; System For Automatically Recovering A Computer
Memory Using Shadowed Data And File Directory Structures; and
System For Automatically Shadowing Data And File Directory
Structures For A Plurality Of Network-Connected Computers Using A
Network-Attached Memory With Single Instance Storage.
FIELD OF THE INVENTION
[0002] This invention relates to systems that are used to provide
data backup for individual computer systems.
BACKGROUND OF THE INVENTION
[0003] It is a problem both to safeguard data that is stored on a
computer system and to restore all or portions of this data that
are lost or corrupted. Many computer systems have no protection
systems in place and the loss of data from these computer systems
is irrevocable. Other computer systems make use of attached data
backup systems to store a copy of the data that is stored in the
computer memory and updates thereto for eventual retrieval to
restore data that is lost from or corrupted in the computer system
memory. However, the use of these existing data backup systems is
laborious and can be confusing to the casual user. Furthermore, in
the instance of network connected computer systems, the use of a
common shared data backup system wastes a significant amount of
memory by storing multiple copies of the same data and raises the
issue of security of the data that is stored on the shared data
backup system. The shared data backup system must not only be
secure from intrusion from outside sources, but the files stored on
the shared data backup system must also be secure from access by
other individuals who are authorized to store their files on the
shared data backup system. Presently, there does not exist a shared
data backup system that properly addresses these issues.
[0004] In information technology, backup refers to making copies of
data so that these additional copies may be used to restore the
original after a data loss event. These additional copies are
typically called "backups." Backups are useful primarily for two
purposes. The first is to restore a computer to an operational
state following a disaster (called disaster recovery). The second
is to restore one or more files after they have been accidentally
deleted or corrupted. Backups are typically the last line of
defense against data loss, and consequently the least granular and
the least convenient to use.
[0005] Since a data backup system contains at least one copy of all
data worth saving, the data storage requirements are considerable,
and they can be exacerbated when the method used to perform the
data backup tracks changes in a way that is wasteful of memory.
Organizing this storage space and managing the backup
process is a complicated undertaking. A data repository model can
be used to provide structure to the data storage device for the
management of the data that is backed up. In the modern era of
computing, there are many different types of data storage devices
that are useful for making backups. There are also many different
ways in which these data backup devices can be arranged to provide
geographic redundancy, data security, and portability.
[0006] Before data is ever sent to its data backup storage
location, it is selected, extracted, and manipulated. Many
different techniques have been developed to optimize the backup
procedure. These include optimizations for dealing with open files
and live data sources as well as compression, encryption, and
de-duplication, among others. Many organizations and individuals
require that they have some confidence that the backup process is
working as expected and work to define measurements and validation
techniques to confirm the integrity of the backup process. It is
also important to recognize the limitations and human factors
involved in any backup scheme.
[0007] Due to a considerable overlap in technology, backups and
data backup systems are frequently confused with archives and
fault-tolerant systems. Backups differ from archives in the sense
that archives are the primary copy of data and backups are a
secondary copy of data. Data backup systems differ from
fault-tolerant systems in the sense that data backup systems assume
that a fault will cause a data loss event and fault-tolerant
systems assume a fault will not cause a data loss event.
Data Repository Models
[0008] Any backup strategy starts with the concept of a data
repository. The backup data needs to be stored somehow and probably
should be organized to a degree. It can be as simple as a manual
process which uses a sheet of paper with a list of all backup tapes
and the dates they were written or a more sophisticated automated
setup with a computerized index, catalog, or relational database.
Different repository models have different advantages. This is
closely related to choosing a backup rotation scheme. The following
paragraphs summarize the various existing backup models presently
in use.
Unstructured
[0009] An unstructured repository may simply be writable media,
for example a stack of floppy disks or CD-R discs, with minimal
information about what data from the computer system was backed up
onto this writable media and when the backup(s) occurred. This is
the easiest backup method to implement, but
probably the least likely to achieve a high level of recoverability
due to the dearth of indexing information that is associated with
the data that is backed up.
Full+Incremental
[0010] A Full+Incremental data backup model aims to make storing
several copies of the source data more feasible. At first, a full
backup of all files from the computer system is taken. After that
full backup is completed, an incremental backup of only the files
that have changed since the previous full or incremental backup is
taken. Restoring the whole computer system to a certain point in
time requires locating not only the full backup taken previous to
that certain point in time but also all the incremental backups
taken between that full backup and the particular point in time to
which the system is supposed to be restored. The full backup
version of the data is then processed, using the set of incremental
changes, to recreate the data as of that designated point in time.
This data backup model offers a high level of assurance that
selected data can be restored to a chosen state, and it can be used
with removable media such as tapes and optical disks. The downside
of this data backup process
is dealing with a long series of incremental changes and the high
storage requirements entailed in this data backup process, since a
copy of every changed file in each incremental backup is stored in
memory.
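The restore procedure described above can be sketched as follows. This is a minimal illustration only, assuming each backup is modeled as a timestamp plus a dictionary from file path to contents; the `Backup` type and the `restore_incremental` function are hypothetical names introduced here, and file deletions are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    taken_at: int   # time the backup was taken (any increasing unit)
    kind: str       # "full" or "incremental"
    files: dict     # path -> contents captured by this backup


def restore_incremental(backups, point_in_time):
    """Rebuild the file set as of point_in_time from a full+incremental chain."""
    eligible = sorted((b for b in backups if b.taken_at <= point_in_time),
                      key=lambda b: b.taken_at)
    # Locate the most recent full backup taken at or before the target time.
    # (Raises ValueError if no full backup is available.)
    full_idx = max(i for i, b in enumerate(eligible) if b.kind == "full")
    state = dict(eligible[full_idx].files)
    # Replay every later incremental backup, oldest first, over the full backup.
    for b in eligible[full_idx + 1:]:
        state.update(b.files)
    return state


backups = [
    Backup(0, "full", {"a.txt": "v1", "b.txt": "v1"}),
    Backup(1, "incremental", {"a.txt": "v2"}),
    Backup(2, "incremental", {"b.txt": "v2"}),
    Backup(3, "incremental", {"a.txt": "v3"}),
]
restored = restore_incremental(backups, point_in_time=2)
print(restored)
```

Note that every incremental backup between the full backup and the target time has to be present and applied in order, which is the bookkeeping burden the paragraph above refers to.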
Full+Differential
[0011] A Full+Differential data backup model differs from a
Full+Incremental data backup model in that after the full backup is
taken of all files on the computer system, each differential backup
captures all files created or changed since the full backup, even
though some may have been included in a previous differential
backup. The advantage of this data backup model is that
restoring the whole computer system to a certain point in time
involves recovering only the last full backup and then overlaying
it with the last differential backup.
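A minimal sketch of this restore path, assuming the full backup and each differential backup are represented as a `(time, files)` pair in which `files` maps a path to its contents. The function name `restore_differential` is hypothetical and file deletions are not modeled.

```python
def restore_differential(full, differentials, point_in_time):
    """Restore from one full backup plus the latest applicable differential.

    full:          (time, files) for the full backup
    differentials: list of (time, files); each holds every file created or
                   changed since the full backup
    """
    full_time, full_files = full
    state = dict(full_files)
    # Only the latest differential at or before the target time is needed,
    # because it already contains every change made since the full backup.
    candidates = [d for d in differentials if full_time < d[0] <= point_in_time]
    if candidates:
        _, latest_files = max(candidates, key=lambda d: d[0])
        state.update(latest_files)
    return state


full = (0, {"a.txt": "v1", "b.txt": "v1"})
differentials = [
    (1, {"a.txt": "v2"}),
    (2, {"a.txt": "v2", "b.txt": "v2"}),  # repeats a.txt from the earlier differential
    (3, {"a.txt": "v3", "b.txt": "v2"}),
]
restored_diff = restore_differential(full, differentials, point_in_time=2)
print(restored_diff)
```

The repeated `a.txt` entry in the second differential shows the trade-off: restoration touches only two backups, but each differential stores again what earlier differentials already captured.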
Mirror+Reverse Incremental
[0012] A Mirror+Reverse Incremental data backup model is similar to
a Full+Incremental data backup model. The difference is that
instead of an aging full data backup followed by a series of
incremental data backups, this model offers a mirror that reflects
the state of the computer system as of the last data backup and a
history of reverse incremental data backups. One benefit of this
data backup method is that it only requires an initial full data
backup. Each incremental data backup is immediately applied to the
mirror and the files it replaces are moved to a reverse
incremental backup. This data backup model is not suited to the use
of removable media since every data backup must be done in
comparison to the data backup mirror version of the data. This
process, when used to restore the whole computer system to a
certain point in time, is also intensive in its use of memory.
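The mirror and its reverse incrementals can be illustrated with a short sketch. The `MirrorBackup` class is a hypothetical model introduced for this example; it assumes file contents are never `None`, so that `None` can mark a file that did not exist before a given backup.

```python
class MirrorBackup:
    def __init__(self, initial_files):
        self.mirror = dict(initial_files)  # state as of the last backup
        self.reverse = []                  # oldest first: {path: previous contents or None}

    def apply(self, changed_files):
        # Save what each changed file looked like before this backup...
        undo = {path: self.mirror.get(path) for path in changed_files}
        self.reverse.append(undo)
        # ...then bring the mirror up to date.
        self.mirror.update(changed_files)

    def restore(self, steps_back):
        """Return the file set as it was steps_back backups ago."""
        state = dict(self.mirror)
        start = max(0, len(self.reverse) - steps_back)
        # Walk the reverse incrementals from newest to oldest.
        for undo in reversed(self.reverse[start:]):
            for path, old in undo.items():
                if old is None:
                    state.pop(path, None)  # file did not exist before that backup
                else:
                    state[path] = old
        return state


m = MirrorBackup({"a.txt": "v1"})
m.apply({"a.txt": "v2", "b.txt": "v1"})
m.apply({"a.txt": "v3"})
print(m.restore(0))
print(m.restore(2))
```

The latest state is available directly from the mirror, while older states require walking back through the reverse incrementals, which is the opposite cost profile from a Full+Incremental chain.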
Continuous Data Protection
[0013] This data backup model takes the data backup process a step
further and instead of scheduling periodic data backups, the data
backup system immediately logs every change made on the computer
system. This is generally done by saving byte or block-level
differences rather than file-level differences. It differs from
simple disk mirroring in that it enables a roll-back of the log and
thus can restore an old image of data. Restoring the whole computer
system to a certain point in time using this method requires that
the original version of the data be processed to incorporate, in
order, every logged change up to that point in time.
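As a rough sketch of replaying such a log, assume each log entry is a `(time, offset, new_bytes)` tuple describing a byte-range write and that the log is ordered by time. The `replay` function and this entry layout are assumptions made for illustration, not a description of any particular product.

```python
def replay(original, log, point_in_time):
    """Apply logged byte-range writes, in order, up to point_in_time."""
    data = bytearray(original)
    for when, offset, new_bytes in log:
        if when > point_in_time:
            break  # the log is ordered by time, so nothing later applies
        end = offset + len(new_bytes)
        if end > len(data):
            # Grow the file when a write extends past its current end.
            data.extend(b"\x00" * (end - len(data)))
        data[offset:end] = new_bytes
    return bytes(data)


original = b"hello world"
log = [
    (1, 0, b"H"),    # overwrite the first byte
    (2, 6, b"W"),    # overwrite the seventh byte
    (3, 11, b"!"),   # append one byte
]
print(replay(original, log, point_in_time=2))
```

Stopping the replay at an earlier time yields an older image of the data, which is the roll-back capability that distinguishes this model from simple disk mirroring.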
Problems
[0014] In spite of all of these various methods of data backup,
existing data backup systems (including both hardware and software)
fail to ensure that the user can simply plug in to the computer
system to "back-up" the data stored therein, and also enable
recovery of a revision of a file from a point-in-time, and enable
all of the hard disk(s) in the computer system to be restored to a
point-in-time. Existing data backup systems fail to efficiently
track and store the state of multiple file systems over time, while
allowing for correct disk-level and file-level restoration, to a
point-in-time, without storing a significant amount of redundant
data. These data backup systems require the user to learn new
technology, understand the file system of the computer system,
learn how to schedule data backup sessions, and learn new controls
that must be used for this new functionality. Furthermore, the
restoration of lost files is difficult using these data backup
systems and there is a failure to address the security of the data
that is being saved on the backup system.
BRIEF SUMMARY OF THE INVENTION
[0015] The above-described problems are solved and a technical
advance achieved by the present System For Automatically Shadowing
Encrypted Data And File Directory Structures For A Plurality Of
Network Connected Computers Using A Network Attached Memory With
Single Instance Storage (termed "Secure Networked Data Shadowing
System" herein), which comprises a memory module that is connected
to a plurality of monitored computer systems via an existing
communication medium, such as a Local Area Network, a data
communication medium (Internet), and an input/output port of the
monitored computer system to store the shadowed data. The memory
module includes one or more memory devices for data storage as well
as software, including a control software component that is
automatically installed on the monitored computer systems when the
monitored computer system is first connected to the memory module,
as well as associated module software for maintaining a record of
the data stored on the memory devices and controlling the operation
of the memory devices.
[0016] The Secure Networked Data Shadowing System automatically
stores the data which is retrieved from the memory of each of the
monitored computer systems onto the memory devices located in the
memory module in a single format, while representing it in a data
management database in two formats: disk sectors and files. The
Secure Networked Data Shadowing System thereby efficiently tracks
and stores the state of multiple file systems resident on a
plurality of monitored computer systems over time, while allowing
for correct disk-level and file-level restoration, to a
point-in-time, without storing redundant data.
[0017] Security of the data that is stored on the shared Secure
Networked Data Shadowing System is effected by a plurality of
measures which collectively provide multiple dimensions of
security. In particular, the data that is transferred between the
monitored computer systems and the Secure Networked Data Shadowing
System is encrypted by the monitored computer system, using a
cryptokey that has public and private components. The data file
that is stored on the shared Secure Networked Data Shadowing System
is processed using a hash function prior to encryption, so the
contents of this file are uniquely identified. Thus, when the
encrypted file is received by the shared Secure Networked Data
Shadowing System, it is stored in its encrypted form in memory and
the hash index is used to identify the encrypted file.
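The ordering described above, hashing first and encrypting second, can be sketched as follows. Several assumptions are made for illustration: SHA-256 is used as the hash function (the text does not name one), the function names are hypothetical, and `toy_encrypt` is a deliberately simple keystream XOR that stands in for the public/private-key encryption. It is not real cryptography and must not be used to protect data.

```python
import hashlib


def hash_index(plaintext: bytes) -> str:
    # Assumption: SHA-256; the text does not specify the hash function.
    return hashlib.sha256(plaintext).hexdigest()


def toy_encrypt(plaintext: bytes, key: bytes) -> bytes:
    # NOT real cryptography: a SHA-256-derived keystream XORed with the data,
    # standing in for the public/private-key encryption described in the text.
    stream = bytearray()
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def prepare_for_shadowing(plaintext: bytes, key: bytes):
    index = hash_index(plaintext)             # hash first, on the unencrypted contents
    ciphertext = toy_encrypt(plaintext, key)  # then encrypt for transmission and storage
    return index, ciphertext


contents = b"quarterly report"
idx_a, enc_a = prepare_for_shadowing(contents, key=b"key-of-computer-A")
idx_b, enc_b = prepare_for_shadowing(contents, key=b"key-of-computer-B")
print(idx_a == idx_b, enc_a == enc_b)
```

Because the hash is taken before encryption, two computers holding the same file produce the same hash index even though their encrypted copies differ, so the stored file can be identified without being decrypted.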
[0018] The Secure Networked Data Shadowing System achieves a
significant amount of data storage efficiency by implementing a
"data de-duplication" process to avoid storing multiple copies of
the same files. The Secure Networked Data Shadowing System
identifies instances of duplication via the hash index that is
created for each file. Identical hash indexes are indicative of
identical files. Therefore, files that have the same hash index can
be reduced to a single copy without any loss of data as long as the
file structure information for each instance of the file is
maintained. Thus, a plurality of the monitored computer systems can
all reference the same file stored in the Secure Networked Data
Shadowing System, since the local identification of this file is
computer-centric, but the memory location in the Secure Networked
Data Shadowing System is the same for all instances.
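A minimal sketch of single instance storage keyed by hash index is shown below. The `SingleInstanceStore` class, its method names, the computer identifiers, and the use of SHA-256 are all assumptions made for this example; the stored contents are treated as opaque bytes, and identical hash indexes are taken to mean identical files, as stated above.

```python
import hashlib


class SingleInstanceStore:
    def __init__(self):
        self.blobs = {}       # hash index -> stored file contents (one copy)
        self.file_trees = {}  # (computer id, original path) -> hash index

    def put(self, computer_id, path, hash_idx, blob):
        # Store the contents only the first time this hash index is seen.
        is_new = hash_idx not in self.blobs
        if is_new:
            self.blobs[hash_idx] = blob
        # Always keep the per-computer file structure information.
        self.file_trees[(computer_id, path)] = hash_idx
        return is_new

    def get(self, computer_id, path):
        return self.blobs[self.file_trees[(computer_id, path)]]


store = SingleInstanceStore()
doc = b"shared manual"
h = hashlib.sha256(doc).hexdigest()
first = store.put("pc-110", "C:/docs/manual.pdf", h, doc)
second = store.put("pc-110A", "D:/archive/manual-copy.pdf", h, doc)
print(first, second, len(store.blobs), len(store.file_trees))
```

Each computer keeps its own name and location for the file in the file tree information, while both entries resolve to the same stored copy.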
[0019] The Secure Networked Data Shadowing System operates
autonomously, freeing the user from needing to interact with the
Secure Networked Data Shadowing System to have the memory of their
monitored computer system backed up. The backup is nearly always up
to date so long as the Secure Networked Data Shadowing System is
connected to the monitored computer system. The Secure Networked
Data Shadowing System incorporates database technology to optimize
the data storage and retrieval for normal operations, and the
database of file directory information itself resides on the
monitored computer system hard drive, while a backup copy of the
database is written periodically to the Secure Networked Data
Shadowing System.
[0020] In addition, the file changes, creations, relocations, and
deletions are tracked through time, with the Secure Networked Data
Shadowing System enabling point-in-time restoration of individual
files as well as file systems on any of the monitored computer
systems, including monitored computer systems that are remotely
located. The full system restore capability enables the
reconstruction of the entire memory of the monitored computer
system, including: operating system, applications, and data files
for a given point in time without requiring the intervention of the
user.
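Point-in-time restoration of an individual file can be illustrated with a small sketch that looks up the last recorded change at or before a requested time. The `FileHistory` class is a hypothetical model; it assumes changes are recorded in increasing time order and uses `None` to mark a deletion.

```python
from bisect import bisect_right


class FileHistory:
    def __init__(self):
        self.times = []     # change times, in increasing order
        self.versions = []  # contents after each change; None marks a deletion

    def record(self, when, contents):
        # Assumes changes are recorded in increasing time order.
        self.times.append(when)
        self.versions.append(contents)

    def as_of(self, when):
        # Find the last change made at or before the requested time.
        i = bisect_right(self.times, when)
        return self.versions[i - 1] if i else None


history = FileHistory()
history.record(10, "draft")
history.record(20, "final")
history.record(30, None)  # file deleted
print(history.as_of(25))
```

In this model a result of `None` means the file did not exist at the requested time, either because it had not yet been created or because it had been deleted.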
[0021] If the monitored computer system is disconnected from the
Secure Networked Data Shadowing System memory module for any length
of time, the control software component that executes on the
monitored computer system tracks the appropriate file changes
occurring through time and then performs normal backup activities
once the Secure Networked Data Shadowing System memory module is
reconnected to the monitored computer system via the network.
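The disconnected behavior described above amounts to queuing changes locally and flushing them on reconnection. The sketch below is illustrative only: `ShadowClient`, its method names, and the `sent` list (which stands in for data delivered to the memory module) are hypothetical.

```python
class ShadowClient:
    def __init__(self):
        self.connected = False
        self.pending = []  # changes observed while the memory module is unreachable
        self.sent = []     # stands in for data delivered to the memory module

    def on_file_change(self, path, kind):
        change = (path, kind)
        if self.connected:
            self.sent.append(change)
        else:
            self.pending.append(change)

    def on_disconnect(self):
        self.connected = False

    def on_reconnect(self):
        self.connected = True
        # Flush the backlog in the order the changes occurred, then resume normally.
        self.sent.extend(self.pending)
        self.pending.clear()


client = ShadowClient()
client.on_file_change("a.txt", "create")
client.on_file_change("a.txt", "modify")
backlog = len(client.pending)
client.on_reconnect()
client.on_file_change("b.txt", "create")
print(backlog, client.sent)
```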
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] FIG. 1 illustrates the basic architecture of the present
Secure Networked Data Shadowing System operational in a typical
network environment;
[0023] FIGS. 2A and 2B illustrate, in flow diagram form, the
operation of the present Secure Networked Data Shadowing System
during the initial installation of the Secure Networked Data
Shadowing System on a monitored computer system;
[0024] FIG. 3 illustrates, in flowchart form, the operation of the
present Secure Networked Data Shadowing System to store a copy of
the data that is presently added to the monitored computer system's
memory;
[0025] FIG. 4 illustrates, in flowchart form, the operation of the
present Secure Networked Data Shadowing System to create and store
an integrity point to benchmark changes in the monitored computer
system's memory;
[0026] FIG. 5 illustrates, in flow diagram form, the operation of
the present Secure Networked Data Shadowing System to retrieve data
stored therein for restoration of a file in the memory of the
monitored computer system; and
[0027] FIG. 6 illustrates, in flow diagram form, the operation of
the present Secure Networked Data Shadowing System to retrieve data
stored therein for restoration of the entirety of the memory of the
monitored computer system.
DETAILED DESCRIPTION OF THE INVENTION
Definitions
[0028] The following terms as used herein have the following
meanings.
[0029] "File system"--the system utilized by the computer operating
system to organize, store, and access information contained in the
computer system memory.
[0030] "File navigation system"--the textual, hierarchical
navigation interface used by the computer operating system to
provide a user with an organized manner of storing, identifying,
locating, and operating on files for user operations contained in
the computer system memory.
[0031] "Change journal"--a computer operating system provided
system to identify and track any file changes, creations,
deletions, or relocations.
[0032] "Meta file"--an indirect means of storing information about
a related file (e.g., file size and creation date for a data
file).
[0033] "Page file"--a computer operating system defined and created
file which is specific to the present session running on the
computer system; the page file represents short-lived data that is
not valid or meaningful to a subsequent session and is therefore of
no value to retain.
[0034] "Integrity Point"--a collection of files and file references
which exist at a particular time to represent the files that were
current and valid for that time; restoration of an integrity point
ensures that files are consistent and meaningful to the computer
operating system and applications that may require multiple files
to be self-consistent.
[0035] "File Reference Number" or FRN--a unique identifier for a
given file or folder entry in the file system file table.
[0036] "NTFS"--Acronym associated with the file system for a
computer operating system. The file system provides an important
feature known as journaling, which creates a queue of file changes,
creations, deletions, or relocations.
System Architecture
[0037] FIG. 1 illustrates the basic architecture of the present
Secure Networked Data Shadowing System operational in a typical
network environment. There are a plurality of monitored computer
systems 110-110B and each typically includes a processor 112-112B,
memory 113-113B (such as a disk drive, although any form of
read/write memory can be used, and the term "memory" is used herein
to describe this element), and a data communication medium link,
such as an input/output port 111-111B, or wireless interface and
the like. The Secure Networked Data Shadowing System comprises a
memory module 101 that is connected to the monitored computer
systems 110-110B via existing data communication mediums 150, 151
to store the shadowed data. For the sake of example, the data
communication medium illustrated herein includes a Local Area
Network 150 which serves to interconnect monitored computer systems
110, 110A to the memory module 101. A media interface 152 can also
be used to interconnect the Local Area Network 150 to a data
communication medium 151, such as the Internet, to enable remotely
located monitored computer systems 110B to access the memory module
101. However, any data communication medium can be used, whether
wired or wireless and regardless of the data communication protocol
used.
[0038] The memory module 101 includes a memory device 102 and its
associated memory module software 104 and database 105 for managing
the data storage as well as a control software component 103 that
is automatically installed on each monitored computer system
110-110B when the memory module 101 is first connected to the
monitored computer systems 110-110B.
[0039] In this networked environment, the memory module 101 can
include a plurality of memory devices 102 or even a large capacity
mass storage system which is equipped with a server to execute the
memory module software 104 and store the control software component
103 for installation on each of the monitored computer systems 110
as they are initially included in the extended network illustrated
in FIG. 1. The memory module software 104 maintains a listing of
the various monitored computer systems 110-110B that are served by
the memory module 101 and manages the allocation of memory in
memory device 102 to ensure that the data residing therein for one
monitored computer system 110 is not accessible by another
monitored computer system 110B. Furthermore, the memory module
software 104 manages the communications with the plurality of
monitored computer systems 110-110B to ensure the timely storage of
data received from the monitored computer systems 110-110B on to
the memory device 102.
[0040] The monitored computer systems 110-110B may be devices that
are all part of the distributed computer network of a single
entity, with the data communication LAN 150 being an internal LAN.
Alternatively, there can be a division of the monitored computer
systems 110-110B such that they are part of at least two entities
and there can also be multiple LANs (only one shown in FIG. 1)
serving these various collections of monitored computer systems. In
any case, the Secure Networked Data Shadowing System can address
these different environments and is capable of creating and
managing virtual partitions between the different entities or
different groups within a single entity.
[0041] However, in order to simplify the following description of
operation of the Secure Networked Data Shadowing System, the
description focuses on the operation of a single monitored computer
system 110 as it is initialized, stores the baseline image of its
disk(s) on memory device 102, and processes data file updates as
well as file restoration. These various operations are typically
executed by the various monitored computer systems 110-110B as
needed and autonomously, with the user of each monitored computer
system 110-110B not being required to manage the operation of the
Secure Networked Data Shadowing System and the monitored computer
systems 110-110B operating independent of each other.
[0042] The simplicity and ease-of-use of the Secure Networked Data
Shadowing System requires minimal user interaction, and the
"Autorun" feature of the resident operating system can be used, for
example, to support an automatic installation of the Secure
Networked Data Shadowing System software component 103. Thus, upon
the first connection of the memory module 101 of the Secure
Networked Data Shadowing System to the monitored computer system
110, the Secure Networked Data Shadowing System calls the "Autorun"
software resident on the operating system of the monitored computer
system 110 to initiate the installation application portion of the
control software component 103 which is stored on the memory module
101 of the Secure Networked Data Shadowing System. (Alternatively,
a mountable media can be used to initiate installation of the
control software component 103 from the monitored computer system
110.) The installation application then identifies that this is an
initial installation of the Secure Networked Data Shadowing System
with the monitored computer system 110. The memory module software
104 requests system information from the operating system of the
monitored computer system 110 and stores this system information in
a database 105. This system information is subsequently used to
determine if the Secure Networked Data Shadowing System has been
previously connected to monitored computer system 110. If the
Secure Networked Data Shadowing System has already been installed,
the monitored computer system 110 activates memory module 101 and
begins communicating with it.
Initialization
[0043] FIGS. 2A and 2B illustrate, in flow diagram form, the
operation of the present Secure Networked Data Shadowing System
during the initial installation of the Secure Networked Data
Shadowing System on a monitored computer system 110, where the
Secure Networked Data Shadowing System is linked to this monitored
computer system 110 and an initial shadow copy of the contents of
the monitored computer system's memory is created in the memory
module 101 of the Secure Networked Data Shadowing System.
[0044] The Secure Networked Data Shadowing System optionally
self-authenticates at step 201 when it is first attached to the
monitored computer system 110 by ensuring that the serial number
encoded into the memory device 102 of the Secure Networked Data
Shadowing System memory module 101 matches the serial number entry
inserted into the control software component 103. During
manufacturing, the serial number is queried from the memory device
102, inserted into the control software component 103, and stored
onto the Secure Networked Data Shadowing System in a manner to
circumvent unauthorized replication of the Secure Networked Data
Shadowing System software onto additional memory devices.
[0045] The Secure Networked Data Shadowing System then begins
installation and initialization of the Secure Networked Data
Shadowing System for the monitored computer system 110 at step 202.
In place of the traditional software installation process whereby
the user is required to insert a mountable media into a selected
drive of the monitored computer system 110 in order to install
software, the Secure Networked Data Shadowing System can utilize
the simple "Autorun" feature of the resident operating system. The
control software component 103 of the Secure Networked Data
Shadowing System is loaded on to the monitored computer system 110
at step 202 and at step 203, the monitored computer system 110 is
interrogated by the control software component 103 of the Secure
Networked Data Shadowing System to obtain data which defines the
hardware topology and device signatures of the monitored computer
system 110. This signature information is used to "pair" the Secure
Networked Data Shadowing System to the monitored computer system
110 and is stored in memory module software 104 at step 204.
[0046] The Secure Networked Data Shadowing System displays a simple
dialog box to the user at step 205 via the display screen of the
monitored computer system 110 so that the user can indicate
agreement to the Secure Networked Data Shadowing System user license
agreement. This
simplified user agreement dialog is required to ensure that the
user is agreeable with the terms set forth in the end user license
agreement. If the user did not intend to install the Secure
Networked Data Shadowing System, or is dissatisfied with the end
user license agreement, nothing remains on the monitored computer
system 110 pertaining to the Secure Networked Data Shadowing
System.
[0047] Upon successful installation of the Secure Networked Data
Shadowing System, the user is not required to take further action
to ensure the protection and backup of the data that is presently
stored and subsequently added, deleted or modified on the memory
113 of their monitored computer system 110. The user is required to
leave the monitored computer system 110 attached to the memory
module 101 of the Secure Networked Data Shadowing System for an
initial period of time in order to have an initial valid backup of
their data files and directory structures from the monitored
computer system 110 to the memory module 101 of the Secure
Networked Data Shadowing System at step 206, but attaching
monitored computer system 110 to the memory module 101 of the
Secure Networked Data Shadowing System is the only action step
required of the user. The control software component 103
concurrently monitors the ongoing memory activity of the monitored
computer system 110 while the initial data backup is being executed
without requiring the modification of the monitored computer system
110 or the use of complex interconnection processes.
[0048] The Secure Networked Data Shadowing System efficiently
stores the data retrieved from the memory 113 of the monitored
computer system 110 in a single format, while representing it
internally in two formats: disk sectors and files. The Secure
Networked Data Shadowing System also efficiently tracks and stores
the state of multiple file systems that are resident on the
monitored computer system 110 over time, while allowing for correct
disk-level and file-level restoration, to a point-in-time, without
storing redundant data. A Meta File System may be implemented in
the Secure Networked Data Shadowing System to describe the state of
each active file system and the underlying physical disk or disks,
at a point-in-time, with integrity. The Meta File System is an
internally consistent, related-in-time, collection of critical data
and metadata from the file systems and physical disks under its
protection. The Meta File System may collect certain data, and do
so in a way that correctness is ensured.
[0049] Typical Meta-File data that is collected may include: [0050]
A baseline image of the non-NTFS sectors which are formatted on
each physical disk installed in the monitored computer system 110.
[0051] A complete indexing of the file systems contained on each
physical disk for a designated point-in-time. This index includes
the mapping of file objects to their location on the physical disk.
[0052] A serialized journal of file system changes over time.
[0053] Copies of the file object contents resulting from file
system changes over time. [0054] Multiple self-consistent
"snapshots" of the on-disk metadata for each active file system at
a point-in-time.
[0055] The challenge of creating a consistent-in-time view of
multiple active file systems is met by combining the collected data
into a single database and organizing and accessing it via data
management algorithms resident in the Secure Networked Data
Shadowing System.
Memory Indexing
[0056] The first step in this initial data transfer process is to
generate a master index of all contents of the monitored computer
system's memory 113 at step 206. The control software component 103
discovers each storage device (memory 113) on the monitored
computer system 110 and creates a corresponding Object Model for
each Storage Device (TRStorageDevice). The Storage Device objects
are children of the monitored computer system 110. While they all
share some base level attributes, they can specialize for different
aspects of the physical device.
[0057] For each TRStorageDevice, the control software component 103
identifies all of the unique disk regions that it contains, and
creates an object model for each (TRDiskRegion). While all
TRDiskRegions share some basic traits, they specialize themselves
according to the type of Region they describe. For instance,
examples of unique disk regions include the Master Boot Record
(MBR), the partition table, a file system region (NTFS or FAT32
partition), a hidden OEM recovery partition, and seemingly unused
"slices" that are the leftovers between formal partitions. Secure
Networked Data Shadowing System identifies and accounts for every
single sector on a physical storage device and creates an
appropriate TRDiskRegion object to manage and index them.
[0058] TRDiskRegions that do not have a recognizable file system
are treated as "Block Regions." Block regions comprise a span of
disk sectors (start, from sector zero, and length), and are simply
archived as a block range onto the Secure Networked Data Shadowing
System memory device 102.
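The sector accounting described above can be illustrated with a minimal sketch. It assumes the recognized partitions are already known as non-overlapping (start, length) pairs; the names Region and cover_disk and the kind labels are hypothetical and do not come from the control software component 103.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    kind: str    # "file_system" or "block" (illustrative labels)
    start: int   # first sector of the region
    length: int  # number of sectors in the region


def cover_disk(total_sectors, partitions):
    """Return regions that account for every sector of a disk.

    `partitions` is a list of non-overlapping (start, length) tuples for
    recognized file system partitions. Every gap between them, including
    leading and trailing space, becomes a "block" region.
    """
    regions = []
    next_sector = 0
    for start, length in sorted(partitions):
        if start > next_sector:
            # Leftover "slice" before this partition: archive as a block range.
            regions.append(Region("block", next_sector, start - next_sector))
        regions.append(Region("file_system", start, length))
        next_sector = start + length
    if next_sector < total_sectors:
        regions.append(Region("block", next_sector, total_sectors - next_sector))
    return regions
```

Because the regions are contiguous and non-overlapping, the sum of their lengths equals the total sector count, which is one way to check that no sector has been missed.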
[0059] This master index includes processing the master boot record
and file system at step 207 to generate an index of every
partition, file and folder on the monitored computer system 110,
and this index data for each partition, file and folder is entered
into a database 114 residing on the monitored computer system
memory 113 as well as optionally a database 105 in the memory
module 101.
[0060] The master boot record contains information about the
arrangement of data on the monitored computer system memory 113.
These contents may be arranged with subsets of data such that there
is a primary bootable partition and alternate, non-bootable
partitions. An entry in the master boot record determines the
status of these partitions, as well as size and binary offset
values for each partition. Capturing and processing this
information permits the Secure Networked Data Shadowing System to
automatically reconstruct the entire contents of the monitored
computer system memory 113. The database exists largely to
facilitate a (faster) way to search and retrieve file history and
revisioning. The method used to lay down the "copy/backup" of the
file system of the monitored computer system 110, enables
recreation of the data contained in the database 114 from the
Secure Networked Data Shadowing System itself. In the case of
Secure Networked Data Shadowing System, most of the Object Models
that model a feature or attribute of the monitored computer system
110 are persisted to the Secure Networked Data Shadowing System
memory module 101 as file system streams in a directory structure
that matches or emulates the physical hierarchy from which they
came.
[0061] After processing the master boot record, the file system for
the primary bootable partition is processed at step 208 to record
each file and folder entry, placing records into the database 114
residing on the monitored computer system memory 113. This database
114 contains information about each file and folder and is accessed
primarily during file retrieval requests and is also updated with
changes to individual files and folders to create a chronological
record of changes. This same database 114 is mirrored (database
105) onto the Secure Networked Data Shadowing System memory module
101 whenever the monitored computer system 110 is connected to the
memory module 101 of the Secure Networked Data Shadowing System.
The mirrored database 105 is used primarily during full-system
restoration where the monitored computer system memory 113 may have
failed and the mirrored database 105 contains records of each file
and folder residing in the binary data copied to the Secure
Networked Data Shadowing System memory device 102. TRDiskRegions
that do have a recognized file system create an Object Model for
the file system "Volume" (TRVolume). A Volume understands the
concepts and navigation of its contained file system, and the
concept of its associated mount point.
Memory Copy
[0062] Upon completion of processing the master boot record and
file system, the Secure Networked Data Shadowing System begins the
second step of this process by copying the binary information from
the monitored computer system memory 113 with the exception of a
subset of the memory 113. The exception subset consists of: areas
not allocated, or identified as in use, by any of the partitions as
well as areas identified as temporary information by the operating
system. An example of the temporary information is the operating
system page file, which is useful only during the current session
and is meaningless to a subsequent session.
[0063] The copy process identifies a Secure Networked Data
Shadowing System storage device and writes the non-NTFS file
objects onto the Secure Networked Data Shadowing System memory
device 102 at step 211. Once all of these objects are written into
memory device 102, the Secure Networked Data Shadowing System
writes all of the NTFS files on to memory device 102 at step 212 in
a directory hierarchy that mimics their physical and logical
relationships on the monitored computer system 110. Below is a
simple base directory tree of a Secure Networked Data Shadowing
System (depth of the contained file systems has been omitted).
[0064] R:\data\REBITDV05\072CE3A9
[0065] R:\data\REBITDV05\19F418B5
[0066] R:\data\REBITDV05\647931C9
[0067] R:\data\REBITDV05\647931D6
[0068] R:\data\REBITDV05\072CE3A9\Region0
[0069] R:\data\REBITDV05\072CE3A9\Region1
[0070] R:\data\REBITDV05\072CE3A9\Region2
[0071] R:\data\REBITDV05\072CE3A9\Region1\{ddffc3ed-7035-11dc-9485-000c29fddfb0}
[0072] R:\data\REBITDV05\072CE3A9\Region2\{ddffc3f3-7035-11dc-9485-000c29fddfb0}
[0073] R:\data\REBITDV05\19F418B5\Region0
[0074] R:\data\REBITDV05\19F418B5\Region1
[0075] R:\data\REBITDV05\19F418B5\Region1\{732534f9-cb5a-11db-befe-806e6f6e6963}
[0076] R:\data\REBITDV05\647931C9\Region0
[0077] R:\data\REBITDV05\647931C9\Region1
[0078] R:\data\REBITDV05\647931C9\Region1\{a93586cc-cb5f-11db-b097-000c29e897d0}
[0079] R:\data\REBITDV05\647931D6\Region0
[0080] R:\data\REBITDV05\647931D6\Region1
[0081] R:\data\REBITDV05\647931D6\Region2
[0082] R:\data\REBITDV05\647931D6\Region1\{a93586d2-cb5f-11db-b097-000c29e897d0}
[0083] To interpret this listing, the control software component 103
knows that the Secure Networked Data Shadowing System memory device
102 was mounted on drive "R:" and all archiving operations are going
to directory "data" which is located in the memory module 101. The
next element in this string is the name of the monitored computer
system 110 that provided the content "REBITDV05", then the
physical disk signature (e.g., 072CE3A9, 19F418B5, etc.). If the
disk drive has data that is to be archived, it is then organized
into Region objects that are simply sequentially numbered (Region0,
Region1, etc.). If a region contains an understood file
system/volume, its volume identifier is used in the persistent
storage to map its path. In the case of
R:\data\REBITDV05\072CE3A9\Region1\{ddffc3ed-7035-11dc-9485-000c29fddfb0},
on this system, it happens that this is an NTFS volume, and a
full mirror of the file system for drive "C:" of the monitored
computer system 110.
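The path layout described above (mount point, "data" directory, machine name, disk signature, numbered region, optional volume identifier) can be sketched as follows. The function name archive_path and its parameters are hypothetical and serve only to make the ordering of the path elements concrete.

```python
from pathlib import PureWindowsPath


def archive_path(mount, machine, disk_signature, region_index, volume_id=None):
    """Compose an archive directory path from its elements.

    mount          -- drive root where the memory device is mounted, e.g. "R:\\"
    machine        -- name of the monitored computer system
    disk_signature -- physical disk signature, e.g. "072CE3A9"
    region_index   -- sequential region number on that disk
    volume_id      -- volume identifier, only for regions with a
                      recognized file system
    """
    path = (
        PureWindowsPath(mount)
        / "data"
        / machine
        / disk_signature
        / f"Region{region_index}"
    )
    if volume_id is not None:
        path = path / volume_id
    return str(path)
```

For example, archive_path("R:\\", "REBITDV05", "19F418B5", 0) yields the Region0 entry shown in the listing for that disk.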
[0084] A key point here is that the Object Models for each element
of the monitored computer system 110 are themselves stored in file
system streams on the Secure Networked Data Shadowing System memory
device 102. For example, the TRMachine object is "saved" as a
hidden stream inside of the R:\data\REBITDV05\ directory entry,
and the volume object for
R:\data\REBITDV05\072CE3A9\Region1\{ddffc3ed-7035-11dc-9485-000c29fddfb0}
is saved as a hidden stream on that directory entry.
[0085] What this means is, from the Secure Networked Data Shadowing
System file system alone, all of the object relationships and their
Meta data can be reconstructed with no database. Further, when a
file is eventually archived to the Secure Networked Data Shadowing
System, all of its associated history and Meta data are stored as
hidden streams in the file entry itself. The database 114 can be
completely reconstructed from the Secure Networked Data Shadowing
System storage file system itself.
[0086] In addition, in the Secure Networked Data Shadowing System
storage architecture, the files are not actually stored with the
name they had on the monitored computer system 110. Rather, they
are stored with a file name that is a unique hash value of the
contents of that file. A file system "soft link" then is used in
the directory structure above to point to the data of the hash
value named "blob" of data that is the file from the monitored
computer system 110. The user only sees the soft link. Secure
Networked Data Shadowing System stores the hashed value named file.
If any two files hash to the same value (meaning they are binary
identical), only one copy need be hosted in storage, and the
symbolic links for both host copies point to the same stored
content. This attribute of functionality is the first level of
intrinsic data de-duplication.
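The hash-named storage and soft-link arrangement described above can be sketched in a few lines. This is an in-memory illustration only: the class SingleInstanceStore is hypothetical, dictionaries stand in for the storage file system and its soft links, and SHA-256 is an assumed choice because the hash function is not named here.

```python
import hashlib


class SingleInstanceStore:
    """Toy model: contents stored once under a hash name, paths link to it."""

    def __init__(self):
        self.blobs = {}  # hash name -> file contents (one copy per hash)
        self.links = {}  # original path -> hash name (stands in for a soft link)

    def put(self, path, contents):
        # SHA-256 is an assumption; any collision-resistant hash would serve.
        name = hashlib.sha256(contents).hexdigest()
        # Identical contents hash to the same name, so nothing new is stored.
        self.blobs.setdefault(name, contents)
        self.links[path] = name
        return name

    def get(self, path):
        return self.blobs[self.links[path]]
```

Storing two binary-identical files under different paths leaves a single entry in blobs with two links pointing at it, which is the de-duplication effect described above.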
[0087] To continue, when a file is modified on the monitored
computer system 110, the new data is hash-named and stored on the
monitored computer system 110, and the old version of the file is
removed, and replaced with only a description of its binary
differences to the new version (Reverse X-Delta). This strategy
allows for Secure Networked Data Shadowing System to keep pristine
copies of all current files, while being able to regenerate
previous versions at all times while minimizing data storage space
requirements on the Secure Networked Data Shadowing System
itself.
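The reverse-delta idea can be illustrated with a minimal sketch: the new version is kept whole, and the old version is reduced to instructions for rebuilding it from the new one. The sketch uses Python's difflib as a stand-in for the binary differencing step; it is not the Reverse X-Delta encoding itself, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def make_reverse_delta(new, old):
    """Describe `old` as operations against `new`.

    Each operation is either ("copy", start, end), meaning reuse
    new[start:end], or ("data", literal_bytes), meaning bytes that
    exist only in the old version.
    """
    ops = []
    matcher = SequenceMatcher(None, new, old, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif j2 > j1:
            # "replace" or "insert": keep only the bytes unique to `old`.
            ops.append(("data", old[j1:j2]))
        # "delete" contributes nothing to the old version.
    return ops


def apply_reverse_delta(new, ops):
    """Regenerate the old version from the new version and its delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += new[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Only the current file and the delta need to be retained; applying the delta to the current file regenerates the previous version on demand.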
[0088] Because of the time required to read the memory 113 of the
monitored computer system 110, and because it contains an active
file system, the Secure Networked Data Shadowing System enables
Journaling at step 209 for the active file systems residing on the
physical disk being imaged. In addition, the Secure Networked Data
Shadowing System at step 210 sets the flag in the database
indicating an Integrity Point is desired by creating a set of
cursors against the active file system journals, which set of
cursors are termed the "Start Cursors". The Journal process begins
identifying and queuing files to act upon. Once the cursors are
created, the Secure Networked Data Shadowing System at step 211
creates and compresses an image of the active file systems into the
memory device 102 of the memory module 101 of the Secure Networked
Data Shadowing System. To save memory space, the active file
systems are queried for their allocated regions of the physical
disk and only allocated regions are read and compressed.
[0089] At step 212, the Secure Networked Data Shadowing System
indexes the active file systems to extract relevant metadata for
every file object in the file system and records it in a database.
The Secure Networked Data Shadowing System identifies and indexes
all directories contained within the file navigation system by File
Reference Number, or FRN, and identifies and inserts entries into
the database for each cluster run representing the file. The Secure
Networked Data Shadowing System initializes the baseline by
inserting entries in the database signifying completion of the
initialization. Once the image and index are complete, the Secure
Networked Data Shadowing System at step 213 creates a second set of
cursors against the active file system journals, termed the "Most
Recent Entries".
[0090] At step 214, the Secure Networked Data Shadowing System
enables Change Tracking, and at step 215, the journals for the
active file systems are processed from the Start Cursor to the Most
Recent Entry, to enter records of changes in the database,
including file object contents. Upon reaching a point in time where
no files remain in the queue to process, the appropriate actions
are taken to insert an Integrity Point entry into the database.
[0091] Finally, at step 216, the Secure Networked Data Shadowing
System records an Integrity Point in the database to which the
baseline image and file object change records are related. This is
the data required to allow a self-consistent Disk Recovery at the
point-in-time which the Integrity Point represents. Thus, the full
disk copy and the file changes, creations, deletions, or
relocations that occurred during the full disk copy are collected
into a set to represent a fully restorable point called the
Integrity Point.
Change Tracking
[0092] FIG. 3 illustrates, in flowchart form, the operation of the
present Secure Networked Data Shadowing System to store a copy of
data that are newly added to the monitored computer system's memory
113. The Secure Networked Data Shadowing System process registers
with the operating system change journal in order to receive
notification of changes occurring to files and folders residing on
the monitored computer system memory 113. The change journal then
dynamically notifies the Secure Networked Data Shadowing System of
changes, permitting the Secure Networked Data Shadowing System to
determine the appropriate action to take. File creation, movement,
content changes, and renaming are all events requiring action, and
each action is entered into an action queue for processing.
[0093] The Secure Networked Data Shadowing System action queue is
utilized for periods where the monitored computer system 110 is
connected to or disconnected from the memory module 101 of the Secure
Networked Data Shadowing System. If the monitored computer system
110 is connected to the memory module 101 of the Secure Networked
Data Shadowing System, the Secure Networked Data Shadowing System
processes each action queue entry, updating the entry in the
database 114 and, if necessary, compressing and transferring the
file binary contents to the Secure Networked Data Shadowing System
memory module 101.
[0094] During periods of time that the monitored computer system
110 is not connected to the memory module 101 of the Secure
Networked Data Shadowing System, the action queue is utilized for
recording actions that are to be performed once the monitored
computer system 110 is connected to the memory module 101 of the
Secure Networked Data Shadowing System. This recording process
permits the Secure Networked Data Shadowing System to prioritize
the actions to be performed, selecting the files of highest
importance to be processed before lower priority files. This is the
continuous process of maintaining the data required to assemble a
consistent-in-time view of the file systems. The process of change
tracking begins immediately after the Initialization and Indexing
is complete, as described above.
Journal Processing
[0095] Journal processing is continuous and occurs whether or not
the monitored computer system 110 is connected to the memory module
101 of the Secure Networked Data Shadowing System. The control
software component 103 of the Secure Networked Data Shadowing
System at step 301 queries the file system journals for any more
recent changes, starting from the last entry previously processed.
The control software component 103 at step 302 then creates a
change record in the action queue in database 114 and increments
the journal cursor for each relevant journal entry. When the
journal entries are exhausted
(up-to-date), the control software component 103 watches for new
entries.
Data Synchronization
[0096] Data Synchronization is intermittent and occurs only when
the monitored computer system 110 is connected to the memory module
101 of the Secure Networked Data Shadowing System. When the
monitored computer system 110 is connected to the memory module 101
of the Secure Networked Data Shadowing System, the control software
component 103 starts processing at step 304 from the first
unprocessed change record in the action queue in database 114. For
the oldest change record, and all related, unprocessed change
records, the control software component 103 at step 305 determines
if each is still relevant (for example, if the file was created and
is already deleted, it is not relevant). The control software
component 103 at step 306 removes all non-relevant change records
from the action queue in database 114. Alternatively, at step 307,
the control software component 103 takes the appropriate action for
each relevant change record. If the file was created, the control
software component 103 stores new file and file-version records in
the action queue in database 114 and copies the file-version's
contents to the Secure Networked Data Shadowing System memory
module 101 at step 308. If the file was moved or renamed, the
control software component 103 creates a new file record in the
action queue in database 114, relates all file-versions from the
old file record with the new file record, and marks the old file
record as deleted at step 309. If the file was deleted, the control
software component 103 marks the file record in the action queue in
database 114 as deleted at step 310. If a directory was created,
the control software component 103 stores a new directory record in
the action queue in database 114. If a directory was moved or
renamed, the control software component 103 creates a new directory
record in the action queue in database 114, relates all file
records from the old directory record with the new directory
record, and marks the old directory record as deleted at step 312.
If a directory was deleted, the control software component 103
marks the directory record in the action queue in database 114 as
deleted at step 313. Finally, at step 314, the control software
component 103 removes the change record from the action queue in
database 114 and processing returns to step 305.
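The relevance check and per-record handling above can be sketched as follows. This is a simplified illustration covering only file creation, modification, and deletion; the record layout, the function names, and the rule used to detect a created-then-deleted file are assumptions rather than the behavior of the control software component 103.

```python
def prune_irrelevant(change_records):
    """Drop records for files created and then deleted within the queue.

    `change_records` is an oldest-first list of (action, path) tuples.
    """
    first, last = {}, {}
    for action, path in change_records:
        first.setdefault(path, action)
        last[path] = action
    transient = {
        path for path in first
        if first[path] == "created" and last[path] == "deleted"
    }
    return [(a, p) for a, p in change_records if p not in transient]


def synchronize(change_records, file_records):
    """Apply the relevant change records to a dict of file records."""
    for action, path in prune_irrelevant(change_records):
        record = file_records.setdefault(path, {"deleted": False, "versions": 0})
        if action in ("created", "modified"):
            # A new file version would be copied to the memory module here.
            record["versions"] += 1
            record["deleted"] = False
        elif action == "deleted":
            record["deleted"] = True
    return file_records
```

A temporary file that appears and disappears between two connections therefore never reaches the memory module, while files that still exist are versioned as usual.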
Create An Integrity Point
[0097] FIG. 4 illustrates, in flowchart form, the operation of the
present Secure Networked Data Shadowing System to create and store
an integrity point to benchmark changes in the monitored computer
system's memory 113.
[0098] This is the operation required to store the information
necessary to execute a Disk Recovery for a point-in-time. The
process of creating an Integrity Point requires reading and storing
a self-consistent "snapshot" of the meta files maintained on-disk
by the active file systems. This requires monitoring these file
systems for changes occurring while the snapshot is created and
deciding if they invalidate the snapshot, requiring another
attempt. Exemplary operations include the following.
[0099] Before attempting to create an Integrity Point, Journal
Processing and Data Synchronization must be up-to-date. Each active
file system is queried (or directly parsed) by the control software
component 103 at step 401 to determine the physical locations
on-disk that it has allocated for its own use (File System
Regions). These File System Regions contain the data structures
that define a consistent state of the file system and must be
self-consistent. The control software component 103 then queries
each active file system's journal at step 402 for its next record
index and this value is kept as a cursor. The control software
component 103 instructs the operating system to flush all active
file systems to memory 113 at step 403 and the File System Regions
for each active file system are read directly from disk 113 at the
sector level and stored in an archive on the Secure Networked Data
Shadowing System at step 404.
[0100] The control software component 103 again queries each active
file system's journal at step 405 for its next record index and
this value is compared with the previously recorded cursor. If the
cursors match, then at step 406 the Integrity Point is "confirmed"
and marked as such in the database 114. If the cursors do not
match, the offending journal is queried for the inter-cursor
entries at step 407. The entries are examined by the control
software component 103 at step 408 and a decision is made whether
or not they invalidate the snapshot. If so, the process is repeated
from step 401 until a valid snapshot is achieved. If the snapshot
is valid, then at step 410, all file objects that resulted from
change records occurring between the previous Integrity Point and
this one are related to this Integrity Point record in the database
114 and the Integrity Point is marked as "sealed." The database
application is instructed at step 411 to perform a backup
operation, resulting in the placement of a compressed
representation of the database 114 onto the memory module 101 of
the Secure Networked Data Shadowing System.
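The cursor comparison in steps 402 through 408 amounts to an optimistic read with retry, which can be sketched as below. The Journal class is a toy stand-in for a file system change journal, the callables read_regions and invalidates are hypothetical, and the max_attempts limit is an added assumption, since the process above simply repeats until a valid snapshot is achieved.

```python
class Journal:
    """Toy stand-in for a file system change journal."""

    def __init__(self):
        self.entries = []

    def next_index(self):
        return len(self.entries)

    def entries_between(self, start, end):
        return self.entries[start:end]


def create_integrity_point(journal, read_regions, invalidates, max_attempts=5):
    """Return (snapshot, attempts) once a self-consistent snapshot is read.

    read_regions -- callable that reads and returns the File System Regions
    invalidates  -- callable deciding whether a journal entry spoils a snapshot
    """
    for attempt in range(1, max_attempts + 1):
        cursor = journal.next_index()   # step 402: remember the journal position
        snapshot = read_regions()       # steps 403-404: flush, read, archive
        after = journal.next_index()    # step 405: query the journal again
        if after == cursor:
            return snapshot, attempt    # step 406: cursors match, confirmed
        changes = journal.entries_between(cursor, after)   # step 407
        if not any(invalidates(entry) for entry in changes):  # step 408
            return snapshot, attempt
        # Otherwise the snapshot is invalid; repeat the process.
    raise RuntimeError("no valid snapshot within the attempt limit")
```

The design trades occasional repeated reads for not having to block file system activity while the snapshot is taken.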
Network Data De-Duplication
[0101] The Secure Networked Data Shadowing System achieves a
significant amount of data storage efficiency by implementing a
"data de-duplication" process to avoid storing multiple copies of
the same files. For example, the plurality of monitored computer
systems 110-110B typically all have the same programs resident in
the memory 113-113B of these systems. The programs occupy a
significant amount of data storage capacity and replicating a copy
of each program for each monitored computer system 110-110B wastes
a significant amount of memory capacity in memory device 102.
Similarly, the users at the plurality of monitored computer systems
110-110B are typically employed by the same organization and many
of the customer data files resident on the monitored computer
systems 110-110B may be duplicates of each other.
[0102] The memory module software 104 identifies instances of
duplication via the hash index that is created for each file.
Identical hash indexes are indicative of identical files.
Therefore, files that have the same hash index can be reduced to a
single copy without any loss of data as long as the file structure
information for each instance of the file is maintained. Thus, a
plurality of the monitored computer systems 110-110B can all
reference the same file stored in the memory device 102 of the
memory module 101, since the local identification of this file is
computer-centric, but the memory location in memory device 102 is
the same for all instances. When a file is revised by one of the
monitored computer systems, the new version of this file is written
in its entirety into the memory device 102 of the memory module 101
and a new hash index is created for this version of the file. This
does not impact the other monitored computer systems, since their
file structures still point to the prior version of the file, which
remains in the memory device 102 of the memory module 101.
Thus, the use of the hash index for identifying files stored in
memory module 101 has the added benefits of enabling the
identification of duplicate copies of a file as well as
anonymizing the file, since this new moniker for the file fails to
reveal its source or place in the file system structure of the
originating monitored computer system.
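The single instance storage described above can be illustrated with
a small content-addressed store. This is a sketch under stated
assumptions: SHA-256 is used only as an example hash function, the
class and field names are hypothetical, and the encryption performed
by the monitored computer system is omitted for brevity.

```python
import hashlib


class SingleInstanceStore:
    """One stored copy per unique hash index, many references to it."""

    def __init__(self):
        self.blobs = {}     # hash index -> file contents, stored once
        self.catalog = {}   # (computer_id, path) -> hash index

    def put(self, computer_id: str, path: str, data: bytes) -> str:
        # The hash index is derived from the contents alone, so it
        # reveals nothing about the originating computer or path.
        index = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(index, data)          # keep only the first copy
        self.catalog[(computer_id, path)] = index   # per-computer file structure
        return index

    def get(self, computer_id: str, path: str) -> bytes:
        return self.blobs[self.catalog[(computer_id, path)]]
```

For example, if two computers store the same program under
different paths, blobs holds one entry and catalog holds two. When
one computer later revises its copy, a second blob is added and the
other computer's catalog entry still resolves to the prior version.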
File Version Retrieval
[0103] FIG. 5 illustrates, in flow diagram form, the operation of
the present Secure Networked Data Shadowing System to retrieve data
stored therein for restoration of a file in the memory 113 of the
monitored computer system 110. This is the operation to
"reconstitute" the contents of a file at a point-in-time. This
file-version may reside in the baseline disk image stored to the
Secure Networked Data Shadowing System during initialization or in
a file-version archive on the Secure Networked Data Shadowing
System.
[0104] The database 114 contains records of each file that has been
stored on the Secure Networked Data Shadowing System, including the
files captured during initialization. Over the course of time, data
which enables the restoration of multiple versions of a given file
may be stored on the Secure Networked Data Shadowing System,
creating the ability to retrieve a version of a file from one of
several points-in-time. When a file is modified on the monitored
computer system 110, the new data is hashed, named, and stored on the
monitored computer system 110, and the old version of the file is
removed and replaced with only a description of its binary
differences to the new version (Reverse X-Delta). This strategy
allows the Secure Networked Data Shadowing System to keep pristine
copies of all current files, while being able to regenerate
previous versions at all times while minimizing data storage space
requirements on the Secure Networked Data Shadowing System
itself.
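The reverse-delta idea can be sketched as shown below. The sketch
does not implement the X-Delta format; it substitutes Python's
standard difflib.SequenceMatcher to find the common byte ranges, and
the function names and the delta layout are assumptions made for
illustration.

```python
from difflib import SequenceMatcher


def make_reverse_delta(new: bytes, old: bytes) -> list:
    """Describe how to rebuild the old version from the new one."""
    ops = []
    matcher = SequenceMatcher(None, new, old, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse bytes of the new version
        elif tag in ("replace", "insert"):
            ops.append(("data", old[j1:j2]))    # bytes found only in the old version
        # "delete": bytes found only in the new version are skipped
    return ops


def apply_reverse_delta(new: bytes, delta: list) -> bytes:
    """Regenerate the old version from the new version and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += new[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Only the new version and the delta need to be retained; the old
version is recovered by calling apply_reverse_delta(new, delta).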
[0105] The process of retrieving a file from the database and
related location of the Secure Networked Data Shadowing System
begins at step 501 where the user opens the user interface and
navigates through the hierarchical file and folder system to locate
the desired file or folder. The user selects the desired file or
folder at step 502 and uses "drag and drop" functionality to move
the selected file or folder to another folder location (e.g.,
`Desktop` or `My Documents`) on the monitored computer system. Upon
releasing the mouse button, the operating system at step 503
generates a request from the Secure Networked Data Shadowing System
for data related to the source file identified by the user
interface. The database then is queried at step 504 to locate the
present version of the selected file and its binary differences to
the new version, traced back to the point-in-time selected by the
user.
[0106] If the user selects a present version of the file, at step
505, the Secure Networked Data Shadowing System retrieves the
pristine copy of the current file and delivers the file to the
user. Otherwise, the Secure Networked Data Shadowing System at step
506 uses the collection of binary differences to trace the selected
file backwards in time to recreate the selected version of the file
as indicated by the user and then delivers the reconstructed file
to the user. The user reads and seeks on the data stream interface
at step 507 and processes the contents as desired.
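Steps 504 through 506 can be sketched as a walk backwards through a
chain of reverse deltas. The history layout (a list of
(change_time, reverse_delta) pairs sorted oldest first), the
function names, and the toy splice delta are all assumptions of this
sketch and are not taken from the system described above.

```python
from bisect import bisect_right


def splice(data: bytes, delta: tuple) -> bytes:
    """Toy reverse delta: replace data[start:end] with the older bytes."""
    start, end, older = delta
    return data[:start] + older + data[end:]


def retrieve(current: bytes, history: list, when: float, apply_delta=splice) -> bytes:
    """Return the file contents in effect at point-in-time `when`.

    Each history entry (t, d) means: at time t the file changed, and
    d turns the version written at t back into the version before it.
    """
    times = [t for t, _ in history]
    keep = bisect_right(times, when)   # changes at or before `when` stay applied
    data = current                     # step 505: start from the pristine copy
    for _, delta in reversed(history[keep:]):
        data = apply_delta(data, delta)    # step 506: undo newer changes in turn
    return data
```

When the selected point-in-time is at or after the latest change,
no deltas are applied and the pristine current copy is returned
unchanged.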
Disk Recovery
[0107] FIG. 6 illustrates, in flow diagram form, the operation of
the present Secure Networked Data Shadowing System to retrieve data
stored therein for restoration of the entirety of the memory 113 of
the monitored computer system 110. This is the operation required
to restore the complete state of a physical disk 113 of the
monitored computer system 110 at a point-in-time. The available
points-in-time are defined by previously stored Integrity Points.
The goal of a Disk Recovery is to "reconstitute" a self-consistent
image of the subject physical disk 113 to the sector level and
write this to a hard disk 113 on the monitored computer system
110.
[0108] In order to write to the physical system disk 113, it is
necessary to boot the monitored computer system 110 from an
alternative medium and ensure that the file systems on hard disk(s)
113 are not in use at step 601. At step 602, the user must ensure
that the environment is acceptable. The Secure Networked Data
Shadowing System is connected to the monitored computer system 110
at step 603 and must be accessible. At step 604, the subject hard
disk(s) 113 must be available and large enough to receive the
restoration disk image. The subject hard disk(s) 113 do not need to
be formatted, but can be formatted if desired. At step 605, any
file systems present on the subject hard disk(s) are unmounted and
the user selects an Integrity Point to restore onto the subject
hard disk(s) 113 at step 606.
[0109] The baseline non-NTFS disk image(s) stored on the Secure
Networked Data Shadowing System is written directly to the subject
hard disk(s) 113 sector-by-sector at step 607. The database 114 is
queried at step 608 for the snapshot corresponding to the closest
file system image to the selected baseline. At step 609, the
snapshot is written to the subject hard disk(s) and, for each file
object, the database 114 is queried at step 610 for the file
object's storage location. The file object's contents are written
directly to its disk location at step 611. The subject drive(s) 113
are now ready for use and the monitored computer system 110 may be
rebooted at step 612.
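The write ordering of steps 607 through 611 can be sketched with an
in-memory byte array standing in for the subject hard disk. The
512-byte sector size, the function names, and the representation of
snapshot regions and file objects as (sector, bytes) pairs are
assumptions made for illustration only.

```python
SECTOR = 512  # assumed sector size in bytes


def _write(disk: bytearray, sector: int, data: bytes) -> None:
    """Write data at a sector offset, refusing to run past the disk end."""
    start = sector * SECTOR
    if start + len(data) > len(disk):
        raise ValueError("write extends past end of disk")
    disk[start:start + len(data)] = data


def recover_disk(disk_size: int, baseline: bytes,
                 snapshot_regions: list, file_objects: list) -> bytes:
    """Rebuild a disk image: baseline first, then the file system
    snapshot, then the contents of each file object."""
    if len(baseline) > disk_size:
        raise ValueError("target disk too small")   # step 604 precondition
    disk = bytearray(disk_size)
    disk[:len(baseline)] = baseline                  # step 607: sector-by-sector
    for sector, data in snapshot_regions:            # step 609: snapshot
        _write(disk, sector, data)
    for sector, data in file_objects:                # steps 610-611: file contents
        _write(disk, sector, data)
    return bytes(disk)
```

Later writes overwrite earlier ones, which is why the ordering
matters: file object contents are written last so that they take
precedence over whatever the baseline image held at those sectors.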
Guest PC/Portable File Access
[0110] The Secure Networked Data Shadowing System makes note of
distinguishing features of the monitored computer system 110 such
that the connection of the Secure Networked Data Shadowing System
memory module 101 to a second, non-host computer system is quickly
identified. In this alternative connection condition, the Secure
Networked Data Shadowing System "Autorun" initialization
application asks the user if they want access to the files stored
within the Secure Networked Data Shadowing System memory module 101
or if they wish to re-initialize the Secure Networked Data
Shadowing System to pair with the newly connected computer system.
If the user wishes to re-initialize with the newly connected
computer system, all backup data from the previous monitored
computer system 110 is eliminated, and a message indicating the
same is displayed. If the user wishes to access files contained on
the memory module 101, the Secure Networked Data Shadowing System
initializes a limited application permitting the user to utilize
the same graphical user interface as before. The user may then
locate and drag-and-drop files onto the newly connected computer
system hard disk drive.
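The host-versus-guest decision can be sketched as a comparison of a
stored fingerprint against one computed from the newly connected
computer. The text does not say which distinguishing features are
recorded; the feature names, the SHA-256 fingerprint, and the
returned action labels below are hypothetical.

```python
import hashlib


def fingerprint(features: dict) -> str:
    """Reduce a set of distinguishing features to a single value."""
    canonical = "\n".join(f"{key}={features[key]}" for key in sorted(features))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def on_connect(paired_fingerprint: str, features: dict, user_choice=None) -> str:
    """Decide what to do when the memory module is connected."""
    if fingerprint(features) == paired_fingerprint:
        return "resume_shadowing"        # same host as before
    if user_choice == "reinitialize":
        return "erase_and_pair"          # previous backup data is eliminated
    if user_choice == "access":
        return "guest_file_access"       # limited application, same interface
    return "prompt_user"                 # guest detected, ask the user
```

Sorting the feature keys makes the fingerprint independent of the
order in which the features were collected.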
[0111] The operating system on the monitored computer system
recognizes specific files contained in the base directory of a disk
drive or Secure Networked Data Shadowing System newly connected to
the monitored computer system 110. The file of type `autorun.inf`
alerts the operating system to the presence of a sequence of
operations to be performed, as defined within the file. The Secure
Networked Data Shadowing System, upon successful installation onto
the monitored computer system 110, alters this `autorun.inf` file
to behave differently if plugged into a subsequent, or guest,
computer system. This altered `autorun.inf` file instructs the
Secure Networked Data Shadowing System to make available the
contents of the drive by reconstructing and interrogating the
duplicated database. Through this method, user files of interest
may be identified for copying onto the guest computer system.
Therefore, the monitored computer system's files, such as digital
photographs and music files, may be transferred from the Secure
Networked Data Shadowing System onto the guest computer system for
display or sharing as desired.
[0112] In order to make the access to files on the guest computer
system as seamless as the access on the monitored computer system,
the file explorer system of the guest computer system is utilized.
By registering with, and making calls to, the file explorer system,
the display of the contents of the Secure Networked Data Shadowing
System mimics the display of the contents of the user's typical
computer system.
Summary
[0113] The Secure Networked Data Shadowing System automatically
stores the data on the memory module in a single format, while
representing it in a data management database in two formats: disk
sectors, and files. The Secure Networked Data Shadowing System
thereby efficiently tracks and stores the state of multiple file
systems over time, while allowing for correct disk-level and
file-level restoration, to a point-in-time, without storing
redundant data.
* * * * *