U.S. patent application number 14/281858 was filed with the patent office on 2014-05-19 and published on 2014-11-13 for systems and methods for collapsing a derivative version of a primary storage volume.
This patent application is currently assigned to Quantum Corporation. The applicant listed for this patent is J. Mitchell Haile, Gregory L. Wade. Invention is credited to J. Mitchell Haile, Gregory L. Wade.
Application Number | 20140337594 14/281858 |
Document ID | / |
Family ID | 43648552 |
Filed Date | 2014-05-19 |
United States Patent Application | 20140337594 |
Kind Code | A1 |
Wade; Gregory L. ; et al. | November 13, 2014 |
SYSTEMS AND METHODS FOR COLLAPSING A DERIVATIVE VERSION OF A
PRIMARY STORAGE VOLUME
Abstract
Disclosed are systems, methods, and computer readable media for
restoring virtual machines. In a particular embodiment, a
non-transitory computer readable medium is provided having
instructions stored thereon that, when executed by a computer
system, cause the computer system to perform a method for restoring
virtual machines. The method comprises generating a snapshot of a
storage volume representing a virtual machine in a virtual machine
environment and storing the snapshot in the virtual machine
environment which tracks changes to the snapshot that occur since
the snapshot was generated. Based on the changes, the method
provides merging differences between the storage volume and the
snapshot.
Inventors: | Wade; Gregory L.; (San Jose, CA) ; Haile; J. Mitchell; (Somerville, MA) |
Applicant: |
Name | City | State | Country | Type |
Wade; Gregory L. | San Jose | CA | US | |
Haile; J. Mitchell | Somerville | MA | US | |
Assignee: | Quantum Corporation, San Jose, CA |
Family ID: | 43648552 |
Appl. No.: | 14/281858 |
Filed: | May 19, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
12848863 | Nov 17, 2010 | 8732427 |
14281858 | | |
61230892 | Aug 3, 2009 | |
Current U.S. Class: | 711/162 |
Current CPC Class: | G06F 16/1787 20190101; G06F 16/188 20190101; G06F 16/128 20190101 |
Class at Publication: | 711/162 |
International Class: | G06F 17/30 20060101 G06F017/30 |
Claims
1. A non-transitory computer readable medium having instructions
stored thereon that, when executed by a computer system, cause the
computer system to perform a method for restoring virtual machines,
the method comprising: generating a snapshot of a storage volume
representing a virtual machine in a virtual machine environment;
storing the snapshot in the virtual machine environment which
tracks changes to the snapshot that occur since the snapshot was
generated; and based on the changes, merging differences between
the storage volume and the snapshot.
2. The non-transitory computer readable medium of claim 1, wherein
the storage volume comprises a v-disk file.
3. The non-transitory computer readable medium of claim 1, wherein
the storage volume comprises a plurality of data blocks.
4. The non-transitory computer readable medium of claim 3, wherein
the changes comprise blocks of the plurality of data blocks that
are changed, allocated, and non-transient.
5. The non-transitory computer readable medium of claim 1, wherein
merging the differences comprises copying only the changes to the
storage volume.
6. The non-transitory computer readable medium of claim 1, wherein
the changes are identified based on metadata.
7. The non-transitory computer readable medium of claim 1, wherein
the method further comprises removing the snapshot after merging
the differences between the storage volume and the snapshot.
8. A non-transitory computer readable medium having instructions
stored thereon that, when executed by a computer system, cause the
computer system to perform a method for controlling storage volume
snapshots, the method comprising: generating a copy of a storage
volume in a networked environment, wherein the storage volume
comprises a plurality of data blocks; generating a version of the
copy by modifying at least one block of the plurality of data
blocks; and identifying the at least one block and storing the at
least one block in the networked environment.
9. The non-transitory computer readable medium of claim 8, wherein
the storage volume comprises a v-disk file.
10. The non-transitory computer readable medium of claim 8, wherein
the at least one block comprises one or more blocks of the
plurality of data blocks that are changed, allocated, and
non-transient.
11. The non-transitory computer readable medium of claim 8, wherein
the method further comprises merging the at least one block into
the storage volume.
12. The non-transitory computer readable medium of claim 11,
wherein the method further comprises removing the version after
merging the at least one block.
13. The non-transitory computer readable medium of claim 8, wherein
the at least one block is identified based on metadata.
14. A system for controlling a virtual storage volume in a
networked computing environment, comprising: a processor configured
to generate a snapshot of a storage volume representing a virtual
machine in a virtual machine environment; a storage medium
configured to store the snapshot in the virtual machine environment
which tracks changes to the snapshot that occur since the snapshot
was generated; and the processor further configured to merge
differences between the storage volume and the snapshot based on
the changes.
15. The system of claim 14, wherein the storage volume comprises a
v-disk file.
16. The system of claim 14, wherein the storage volume comprises a
plurality of data blocks.
17. The system of claim 16, wherein the changes comprise blocks of
the plurality of data blocks that are changed, allocated, and
non-transient.
18. The system of claim 14, wherein the processor configured to
merge the differences comprises the processor configured to copy
only the changes to the storage volume.
19. The system of claim 14, wherein the changes are identified
based on metadata.
20. The system of claim 14, wherein the storage system is further
configured to remove the snapshot after the differences are merged
between the storage volume and the snapshot.
Description
RELATED APPLICATIONS
[0001] This application is a continuation of U.S. patent
application Ser. No. 12/848,863, entitled "SYSTEMS AND METHODS FOR
COLLAPSING A DERIVATIVE VERSION OF A PRIMARY STORAGE VOLUME," filed
on Aug. 2, 2010; which is related to and claims priority to U.S.
Provisional Patent Application No. 61/230,892, entitled "A Method
for Optimizing Copy-On-Write Snapshot Collapsing using Filesystem
Meta Data," filed on Aug. 3, 2009, and which are both hereby
incorporated by reference in their entirety.
TECHNICAL BACKGROUND
[0002] In the field of computer hardware and software technology, a
virtual machine is a software implementation of a machine
(computer) that executes program instructions like a real machine.
Virtual machine technology allows multiple virtual machines to
share the physical resources underlying those virtual machines.
[0003] A technique known as copy-on-write allows multiple
applications or processes to request access to the same resource.
Once one of the processes attempts to modify the resource, a
duplicate resource is created.
[0004] In virtual machine environments, storage volumes within the
virtual machines contain data items that need to be accessed.
Further complicating matters, a virtual machine environment
utilizing copy-on-write may require access to one or more
duplicate storage volumes.
[0005] Unfortunately, accessing the underlying contents of a
storage volume and/or a duplicate storage volume can be very
resource intensive, reducing the performance of a virtual machine
and other operations within a virtual machine environment.
OVERVIEW
[0006] Disclosed are systems, methods, and computer readable media
for restoring virtual machines. In a particular embodiment, a
non-transitory computer readable medium is provided having
instructions stored thereon that, when executed by a computer
system, cause the computer system to perform a method for restoring
virtual machines. The method comprises generating a snapshot of a
storage volume representing a virtual machine in a virtual machine
environment and storing the snapshot in the virtual machine
environment which tracks changes to the snapshot that occur since
the snapshot was generated. Based on the changes, the method
provides merging differences between the storage volume and the
snapshot.
[0007] In some embodiments, the storage volume comprises a v-disk
file.
[0008] In some embodiments, the storage volume comprises a
plurality of data blocks.
[0009] In some embodiments, the changes comprise blocks of the
plurality of data blocks that are changed, allocated, and
non-transient.
[0010] In some embodiments, merging the differences comprises
copying only the changes to the storage volume.
[0011] In some embodiments, the changes are identified based on
metadata.
[0012] In some embodiments, the method further provides removing
the snapshot after merging the differences between the storage
volume and the snapshot.
[0013] In a further embodiment, a non-transitory computer readable
medium is provided having instructions stored thereon that, when
executed by a computer system, cause the computer system to perform
a method for controlling storage volume snapshots. The method
provides generating a copy of a storage volume in a networked
environment, wherein the storage volume comprises a plurality of
data blocks. The method further provides generating a version of
the copy by modifying at least one block of the plurality of data
blocks and identifying the at least one block and storing the at
least one block in the networked environment.
[0014] In some embodiments, the storage volume comprises a v-disk
file.
[0015] In some embodiments, the at least one block comprises one or
more blocks of the plurality of data blocks that are changed,
allocated, and non-transient.
[0016] In some embodiments, the method further comprises merging
the at least one block into the storage volume.
[0017] In some embodiments, the method further comprises removing
the version after merging the at least one block.
[0018] In some embodiments, the at least one block is identified
based on metadata.
[0019] In yet another embodiment, a system is provided for
controlling a virtual storage volume in a networked computing
environment. The system includes a processor configured to generate
a snapshot of a storage volume representing a virtual machine in a
virtual machine environment. The system further includes a storage
medium configured to store the snapshot in the virtual machine
environment which tracks changes to the snapshot that occur since
the snapshot was generated. The processor is further configured to
merge differences between the storage volume and the snapshot based
on the changes.
[0020] In some embodiments, the storage volume comprises a v-disk
file.
[0021] In some embodiments, the storage volume comprises a
plurality of data blocks.
[0022] In some embodiments, the changes comprise blocks of the
plurality of data blocks that are changed, allocated, and
non-transient.
[0023] In some embodiments, the processor configured to merge the
differences comprises the processor configured to copy only the
changes to the storage volume.
[0024] In some embodiments, the changes are identified based on
metadata.
[0025] In some embodiments, the storage system is further
configured to remove the snapshot after the differences are merged
between the storage volume and the snapshot.
BRIEF DESCRIPTION OF THE DRAWINGS
[0026] FIG. 1 illustrates a data control system according to an
embodiment.
[0027] FIG. 2 illustrates the operation of a data control system
according to an embodiment.
[0028] FIG. 3 illustrates a data control system according to an
embodiment.
[0029] FIGS. 4A and 4B illustrate the operation of a data control
system according to an embodiment.
[0030] FIG. 5 illustrates a data control system environment
according to an embodiment.
[0031] FIG. 6 illustrates a data control system environment
according to an embodiment.
DETAILED DESCRIPTION
[0032] The following description and associated figures teach the
best mode of the invention. For the purpose of teaching inventive
principles, some conventional aspects of the best mode may be
simplified or omitted. The following claims specify the scope of
the invention. Note that some aspects of the best mode may not fall
within the scope of the invention as specified by the claims. Thus,
those skilled in the art will appreciate variations from the best
mode that fall within the scope of the invention. Those skilled in
the art will appreciate that the features described below can be
combined in various ways to form multiple variations of the
invention. As a result, the invention is not limited to the
specific examples described below, but only by the claims and their
equivalents.
[0033] In virtual machine environments, accessing the underlying
contents of a storage volume can be very resource intensive,
reducing the performance of a virtual machine and other operations
within a virtual machine environment.
[0034] Some virtual machine environments use an optimization
strategy known as copy-on-write. Copy-on-write allows multiple
processes to request access to the same resource. Once one of the
processes attempts to modify the resource, a derivative version of
the resource is created. Over time the derivative version of the
resource grows as the process modifies the underlying blocks.
Further complicating matters, those skilled in the art will
appreciate that derivative versions of a resource may themselves
have derivative versions, creating a chain of derivatives.
Eventually, the derivative version(s) of the resource must be
collapsed or merged back into the resource by copying the modified
or changed blocks back into the resource.
[0035] Advantageously, the number of blocks that need to be copied
in order to collapse the derivative version of the resource back
into the resource can be reduced by copying only those blocks that
changed and remain allocated in the resource.
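The collapse described above can be sketched in Python under some
loudly stated assumptions: each derivative version is modeled as a
dictionary from block number to changed contents, the chain is
ordered oldest first, and the set of still-allocated block numbers
is supplied by the caller. The names collapse_chain, base, chain,
and allocated are illustrative only and do not appear in this
application.

```python
def collapse_chain(base, chain, allocated):
    """Merge a chain of derivative versions back into the base resource.

    base      -- dict mapping block number -> contents (the resource)
    chain     -- list of dicts, oldest first; each holds only the blocks
                 changed in that derivative version (assumed model)
    allocated -- set of block numbers that remain allocated
    Returns the number of blocks actually copied into the base.
    """
    # A later derivative overrides an earlier one for the same block.
    merged = {}
    for delta in chain:
        merged.update(delta)

    copied = 0
    for block, data in merged.items():
        if block in allocated:  # skip changed blocks that are now free
            base[block] = data
            copied += 1
    return copied


base = {0: b"a0", 1: b"b0", 2: b"c0", 3: b"d0"}
chain = [{0: b"a1", 2: b"c1"}, {0: b"a2", 3: b"d2"}]
copied = collapse_chain(base, chain, allocated={0, 1})
print(copied, base[0])
```

In this toy run three distinct blocks changed across the chain, but
only one of them is still allocated, so only one block is copied.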
[0036] Referring now to FIG. 1, data control environment 100 is
illustrated in an embodiment whereby data control system 101 is
implemented in the data control environment 100 in order to
generate a derivative version of a primary storage volume and then
collapse the derivative version back into the primary storage
volume.
[0037] As shown, data control environment 100 includes data control
system 101, primary storage volume 113, and primary derivative
volume 123. Primary storage volume 113 is comprised of blocks 114.
Primary storage volume 113 includes secondary storage volume 115.
Secondary storage volume 115 includes data items 116.
[0038] Primary derivative volume 123 is a derivative version of
primary storage volume 113. Primary derivative volume 123 is
comprised of blocks 124. Primary derivative volume 123 includes
secondary derivative volume 125. Secondary derivative volume 125
includes data items 126.
[0039] Primary storage volume 113, primary derivative volume 123,
secondary storage volume 115, and secondary derivative volume 125
may be any storage volumes capable of storing a volume of data. As
discussed, primary storage volume 113 is comprised of blocks 114
and primary derivative volume 123 is comprised of blocks 124. Each
block comprises a section of the primary volume that corresponds to
one or more data items in the secondary volume.
[0040] Data items 116 and 126 comprise the volume of data in
secondary storage volume 115 and secondary derivative volume 125,
respectively. Each data item 116 corresponds to one or more blocks
114 of primary storage volume 113. Similarly, each data item 126
corresponds to one or more blocks 124 of primary derivative
volume 123.
[0041] Data control system 101 comprises any system or collection
of systems capable of generating primary derivative volume 123 and
then collapsing primary derivative volume 123 back into primary
storage volume 113. Data control system 101 may be a
micro-processor, an application specific integrated circuit, a
general purpose computer, a server computer, or any combination or
variation thereof.
[0042] Data control system 101 may control access (i.e., reads and
writes) to the contents of a virtual drive (e.g., to data items 116
of secondary storage volume 115 and/or to blocks 114 of primary
storage volume 113). Data control system 101 may allow multiple
processes to read the contents of the virtual drive. However, in
operation, when one of the processes attempts to write or modify
the contents of the virtual drive, data control system 101
generates a derivative version of the virtual drive so that other
processes reading the virtual drive are not interrupted.
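A minimal Python sketch of this copy-on-write behavior follows. It
assumes a sparse overlay in which the derivative stores only the
blocks that were written, which is one possible realization and not
necessarily the one used by data control system 101. The class name
DerivativeVolume and its methods are hypothetical.

```python
class DerivativeVolume:
    """Copy-on-write view over a primary volume (illustrative only)."""

    def __init__(self, primary):
        self.primary = primary  # list of block contents; never written here
        self.delta = {}         # block index -> modified contents

    def read(self, index):
        # Unmodified blocks are served from the primary volume.
        return self.delta.get(index, self.primary[index])

    def write(self, index, data):
        # Writes land only in the derivative, so readers of the
        # primary volume are not interrupted.
        self.delta[index] = data

    def changed_blocks(self):
        # The keys of the overlay double as a changed block list.
        return sorted(self.delta)


primary = [b"A", b"B", b"C", b"D"]
derivative = DerivativeVolume(primary)
derivative.write(2, b"C'")
print(derivative.read(2), primary[2], derivative.changed_blocks())
```

Keeping the overlay sparse means the changed blocks are known without
comparing the two volumes block by block.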
[0043] In operation, data control system 101 generates primary
derivative volume 123. Primary derivative volume 123 is a
derivative version of primary storage volume 113 which may
initially be an individual copy of primary storage volume 113 which
is accessible to one or more processes. Those skilled in the art
will appreciate that primary derivative volume 123 may not be an
exact copy of primary storage volume 113.
[0044] Once generated, the process requesting the write has an
individual version of primary storage volume 113 (i.e., primary
derivative volume 123) which may be modified and/or otherwise
changed. Typically, primary derivative volume 123 grows over time
as the data items 126 and/or blocks 124 are changed by the process
that requested the write. As blocks 124 of primary derivative
volume 123 are changed, data control system 101, primary derivative
volume 123, and/or primary storage volume 113 may track those
changed blocks. In addition, data control system 101 tracks those
blocks in the primary storage volume 113 (the ancestor disk) that
remain allocated or free.
[0045] Data control system 101 may receive an instruction, request,
or other indication that the process no longer needs access to
primary derivative volume 123. At this point, data control system
101 collapses primary derivative volume 123 back into primary
storage volume 113 by copying the modified or changed blocks from
primary derivative volume 123 to primary storage volume 113.
[0046] Prior to copying all the changed blocks from primary
derivative volume 123 to primary storage volume 113, data control
system 101 first identifies which of the changed blocks are free
blocks (or unallocated blocks). Identifying the changed blocks that
remain allocated allows data control system 101 to copy only those
changed blocks that also remain allocated. Consequently, data
control system 101 does not have to read the contents of changed
and unallocated blocks from primary derivative volume 123, which
optimizes the I/O cost of collapsing primary derivative volume 123
back into the primary storage volume 113.
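The I/O saving can be illustrated with a short Python sketch that
counts how many blocks are read from the derivative during a
collapse. The volumes are modeled as plain lists and the free blocks
as a set of indices; these data structures and the function name
collapse are assumptions made for illustration.

```python
def collapse(primary, derivative, changed, free):
    """Copy changed, still-allocated blocks into the primary volume.

    Returns the number of blocks read from the derivative.
    """
    reads = 0
    for index in sorted(changed):
        if index in free:
            continue  # changed but unallocated: contents are never read
        primary[index] = derivative[index]
        reads += 1
    return reads


primary = ["a", "b", "c", "d"]
derivative = ["a2", "b", "c2", "d2"]
changed = {0, 2, 3}
reads = collapse(primary, derivative, changed, free={2, 3})
print(f"naive reads: {len(changed)}, optimized reads: {reads}")
print(primary)
```

A naive collapse would read all three changed blocks, while the
filtered collapse reads one.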
[0047] FIG. 2 illustrates process 200 describing the operation of
data control system 101 in data control environment 100. To begin,
a volume of data is generated and stored on a primary storage
volume. Data control system 101 includes a processor that generates
a derivative version of a primary storage volume comprised of
blocks that contain data items stored in a secondary storage volume
(Step 202). For example, the processor in data control system 101
may generate primary derivative volume 123 which includes data
items 126 stored in secondary derivative volume 125. In this
example primary derivative volume 123 is a derivative version of
primary storage volume 113.
[0048] As discussed, primary storage volume 113 is comprised of
blocks 114 and includes secondary storage volume 115, which
comprises data items 116. In some examples, derivative versions of
blocks 114, secondary storage volume 115, and data items 116 are
also created when the processor in data control system 101
generates a derivative version of primary storage volume 113.
[0049] Primary derivative volume 123 may be generated as a result
of, or in response to, a number of events. For example, data control
system 101 may receive a request, instruction, or other indication
from a process attempting to write to primary storage volume 113.
The processor in data control system 101 may generate primary
derivative volume 123 in response to the request, instruction,
and/or other indication.
[0050] Once generated, the processor in data control system 101
identifies changed blocks of the derivative version of primary
storage volume 113 (Step 204). The processor may determine the
changed blocks using a changed block list which tracks the blocks
that change. The changed block list may be maintained by data
control system 101, primary derivative volume 123, and/or primary
storage volume 113.
[0051] The processor in data control system 101 then identifies
which changed blocks on the derivative version of the primary
storage volume remain allocated (Step 206). The processor in data
control system 101 may identify the allocated blocks by determining
which blocks on primary storage volume 113 are free blocks (or
unallocated blocks). In one example, the processor in data control
system 101 may determine the allocation status of the blocks based
on a volume meta data bitmap. Once the free blocks
are determined, the processor in data control system 101 can then
copy those changed blocks 124 that remain allocated to primary
storage volume 113.
[0052] Lastly, the processor in data control system 101 collapses
the derivative version of the primary storage volume into the
primary storage volume by copying those blocks identified as
changed and allocated to the primary storage volume (Step 208). For
example, the processor in data control system 101 copies blocks 124
that have changed and that are still allocated in primary storage
volume 113 back to primary storage volume 113.
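Steps 202 through 208 of process 200 can be sketched as four small
Python functions. This is a simplified model, not the implementation
of data control system 101: volumes are lists of block contents, the
changed block list is a list of indices, and the volume meta data
bitmap is a list of booleans where True means allocated. All function
and variable names are hypothetical.

```python
def generate_derivative(primary_blocks):
    # Step 202: start the derivative as a copy of the primary blocks.
    return list(primary_blocks)


def identify_changed(changed_block_list):
    # Step 204: the changed block list records which blocks changed.
    return set(changed_block_list)


def identify_allocated(changed, allocation_bitmap):
    # Step 206: keep changed blocks that the bitmap marks as allocated.
    return {i for i in changed if allocation_bitmap[i]}


def collapse(primary_blocks, derivative_blocks, to_copy):
    # Step 208: copy only changed-and-allocated blocks back.
    for i in sorted(to_copy):
        primary_blocks[i] = derivative_blocks[i]


primary = ["A", "B", "C", "D"]
derivative = generate_derivative(primary)

# Simulate a process writing to the derivative while changes are tracked.
changed_block_list = []
for i, data in [(0, "A'"), (2, "C'"), (3, "D'")]:
    derivative[i] = data
    changed_block_list.append(i)

bitmap = [True, True, False, False]  # blocks 2 and 3 are free
to_copy = identify_allocated(identify_changed(changed_block_list), bitmap)
collapse(primary, derivative, to_copy)
print(primary)
```

Only block 0 passes both filters, so it is the only block written
back to the primary volume.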
[0053] FIG. 3 illustrates data control system 301 according to an
embodiment. Data control system 301 includes communication
interface 311, user interface 312, processing system 315, storage
system 316, and software 313.
[0054] Processing system 315 is linked to communication interface
311 and user interface 312. Processing system 315 includes
processing circuitry and storage system 316 that stores software
313. Data control system 301 may include other well-known
components such as a power system and enclosure that are not shown
for clarity.
[0055] Communication interface 311 comprises a network card,
network interface, port, or interface circuitry that allows data
control system 301 to communicate with other elements of a data
control environment. Communication interface 311 may also include a
memory device, software, processing circuitry, or some other
communication device. Communication interface 311 may use various
protocols, such as host bus adapters (HBA), SCSI, SATA, Fibre
Channel, iSCSI, WiFi, Ethernet, TCP/IP, or the like to
communicate.
[0056] User interface 312 comprises components that interact with a
user to receive user inputs and to present media and/or
information. User interface 312 may include a speaker, microphone,
buttons, lights, display screen, mouse, keyboard, or some other
user input/output apparatus--including combinations thereof. User
interface 312 may be omitted in some examples.
[0057] Processing system 315 may comprise a microprocessor and
other circuitry that retrieves and executes software 313 from
storage system 316. Storage system 316 comprises a disk drive,
flash drive, data storage circuitry, or some other memory
apparatus. Processing system 315 is typically mounted on a circuit
board that may also hold storage system 316 and portions of
communication interface 311 and user interface 312.
[0058] Software 313 comprises computer programs, firmware, or some
other form of machine-readable processing instructions. Software
313 may include an operating system, utilities, drivers, network
interfaces, applications, or some other type of software. When
executed by processing system 315, software 313 directs processing
system 315 to operate data control system 301 as described
herein.
[0059] FIGS. 4A-4B illustrate a sequence of operations in data
control environment 400 according to an embodiment. FIG. 4A
illustrates generation of a derivative version of a storage
volume. As shown in this example, data control
environment 400 includes data control system 401, primary storage
volume 413, and primary derivative volume 423.
[0060] Primary storage volume 413 comprises blocks 414. Blocks 414
comprise block A, block B, block C, and block D. Primary storage
volume 413 includes secondary storage volume 415 which comprises
data items 416. Data items 416 include data item X, data item Y,
and data item Z.
[0061] Primary derivative volume 423 comprises a derivative version
of primary storage volume 413 which is generated by data control
system 401 responsive to a Generate Request or other indication.
Primary derivative volume 423 comprises blocks 424. Blocks 424
comprise block A, block B, block C, and block D. Primary derivative
volume 423 includes secondary derivative volume 425 which comprises
data items 426. Data items 426 include data item X', data item Y',
and data item Z'.
[0062] As discussed, in operation data control system 401 receives
a Generate Request and responsively generates a derivative version
of primary storage volume 413 (i.e., primary derivative volume
423). Those skilled in the art will appreciate that a Generate
Request may be generated within data control system 401 in response
to some event or state. Moreover, those skilled in the art will
also appreciate that the Generate Request may not be a request but
some other indication. For example, in some embodiments the
Generate Request may simply be generated by data control system 401
in response to receiving a write request from a process attempting
to change one or more data items 416 on secondary storage volume
415.
[0063] FIG. 4B illustrates changes in the derivative version of a
storage volume being collapsed back into the storage volume. In
this example, a derivative version of
storage volume 413 has been generated and data items 426 are
changed.
[0064] In operation, data control system 401 first identifies data
item X' and data item Z' of secondary derivative volume 425 as
changed in response to a Collapse Request. Those skilled in the art
will appreciate that a Collapse Request may be generated within
data control system 401 in response to some event or state.
Moreover, those skilled in the art will also appreciate that the
Collapse Request may not be a request but some other indication.
For example, in some embodiments, a Collapse Request may simply be
generated by data control system 401 in response to receiving a
file release message from a process that previously issued a write
request to change one or more of the data items 416 on secondary
storage volume 415.
[0065] Data control system 401 then determines the changed blocks
of blocks 424 that correspond to the identified changed data items
X' and Z'. In this example, block A' corresponds to changed data
item X' and blocks C' and D' correspond to changed data item Z'.
Those skilled
in the art will appreciate that multiple data items may correspond
to a single block. Similarly, multiple blocks may correspond to a
single data item.
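The lookup from changed data items to changed blocks can be sketched
in Python as follows. The mapping for data items X and Z follows this
example; the mapping of data item Y to block B is an assumption added
only to complete the table, and the names item_to_blocks and
blocks_for_items are hypothetical.

```python
# Data item -> blocks that hold it. X and Z follow the example above;
# the entry for Y is assumed for illustration.
item_to_blocks = {"X": ["A"], "Y": ["B"], "Z": ["C", "D"]}


def blocks_for_items(changed_items, item_to_blocks):
    """Return the sorted set of blocks backing the changed data items."""
    blocks = set()
    for item in changed_items:
        blocks.update(item_to_blocks[item])
    return sorted(blocks)


print(blocks_for_items(["X", "Z"], item_to_blocks))  # ['A', 'C', 'D']
```

Using a set handles the case where several data items share a block,
since each block is reported once.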
[0066] Once the changed blocks have been identified, data control
system 401 then identifies whether the identified changed blocks
are still allocated or free. The allocation status of the
identified changed blocks may be read from a volume meta data
bitmap, which may be located on the primary storage volume or the
derivative version of the primary storage volume.
[0067] In this example, blocks A', C' and D' have been identified
by data control system 401 as changed blocks. Data control system
401 determines the allocation status of blocks A', C', and D' in
primary derivative volume 423 by examining the allocation status of
blocks A, C, and D in primary storage volume 413. In this example,
blocks A and B are allocated and blocks C and D are not allocated
in primary storage volume 413.
[0068] The allocation status may be determined by accessing a
volume meta data bitmap (not shown) of primary storage volume 413
and/or primary derivative volume 423. The volume meta data bitmap
may be located on primary storage volume 413. In other embodiments,
the volume meta data bitmap may be located elsewhere including
within data control system 401. Those skilled in the art will
appreciate that the allocation status of blocks A', C', and D' in
primary derivative volume 423 may alternatively and/or additionally
be determined by accessing a derivative volume meta data bitmap of
primary derivative volume 423.
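Reading an allocation status from a bitmap can be sketched in Python
as below. The layout is an assumption: one bit per block, least
significant bit first within each byte, with a set bit meaning
allocated. Real file systems define their own bitmap layouts, and the
function name is_allocated is hypothetical.

```python
def is_allocated(bitmap: bytes, block: int) -> bool:
    """Return True if the bit for `block` is set (1 = allocated, assumed)."""
    byte_index, bit_index = divmod(block, 8)
    return bool((bitmap[byte_index] >> bit_index) & 1)


# Block numbers 0..3 stand in for blocks A..D; bits 0 and 1 are set,
# matching the example in which A and B are allocated and C and D are not.
bitmap = bytes([0b00000011])
print([is_allocated(bitmap, i) for i in range(4)])
```

Checking a bit in the bitmap avoids reading the block contents
themselves, which is where the I/O saving comes from.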
[0069] Data control system 401 then collapses the derivative
version of the primary storage volume back into the primary storage
volume by copying those blocks identified as changed and allocated
to the primary storage volume. In this example, block A' is
identified as changed and allocated in the derivative version of
the primary storage volume. Consequently, data control system 401
does not have to read the contents of blocks C' and D' from primary
derivative volume 423. Rather, only block A' needs to be copied from
primary derivative volume 423 to primary storage volume 413, which
optimizes the I/O cost of collapsing primary derivative volume 423
back into primary storage volume 413. As a result, only block A
is updated in primary storage volume 413.
[0070] FIG. 5 illustrates a data control environment 500 according
to an embodiment. In this example, data control system 501 is
implemented in a virtual machine (VM) environment to generate and
collapse a snapshot of a primary storage volume in response to
input from one or more processes and/or one or more VM guest
operating systems.
[0071] As shown in this example, VM environment 510 includes data
control system 501, primary storage volume 513, and primary
derivative volume 523. Elements of VM environment 510 may include,
for example, virtual machines, hypervisors, server machines, and
other underlying virtual files. Other elements are also possible
although not shown for simplicity.
[0072] Primary storage volume 513 is comprised of blocks 514 and
includes secondary storage volume 515. Secondary storage volume 515
includes data items 516. In this example, primary derivative volume
523 comprises a snapshot of primary storage volume 513 which
includes data items 526. A snapshot is a read-only copy of a data
set frozen at a point in time. The snapshot allows applications or
processes to write (or modify) their data sets without interruption
to other applications or processes which may be concurrently
accessing the same data sets.
[0073] Data control system 501 comprises any system or collection
of systems capable of generating a snapshot of primary storage
volume 513 (i.e., primary derivative volume 523) and then
collapsing the snapshot of primary storage volume 513 back into
primary storage volume 513. Data control system 501 may be a
micro-processor, an application specific integrated circuit, a
general purpose computer, a server computer, or any combination or
variation thereof. In this example, data control system 501 is
shown within VM environment 510. Those skilled in the art will
appreciate that in some embodiments data control system 501 may be
located outside VM environment 510.
[0074] Primary storage volume 513 and secondary storage volume 515
may be any storage volumes capable of storing a volume of data. As
discussed, primary storage volume 513 is comprised of blocks 514.
Each block of blocks 514 comprises a section of primary storage
volume 513 that corresponds to one or more data items 516 in
secondary storage volume 515. Data items 516 comprise the volume of
data in secondary storage volume 515 and each data item 516
corresponds to one or more blocks 514.
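The two-way correspondence between blocks and data items can be
pictured with a minimal sketch. The item names, block numbers, and
the invert helper below are hypothetical and chosen only for
illustration; the description does not specify how the mapping is
stored.

```python
from collections import defaultdict

# Hypothetical allocation table: data item -> blocks that hold it.
# Names and block numbers are made up for illustration.
item_to_blocks = {
    "report.doc": [0, 1],
    "photo.jpg": [2, 3, 4],
    "notes.txt": [4],  # shares block 4 with photo.jpg
}


def invert(item_map):
    """Build the reverse view: block -> data items stored in that block."""
    block_map = defaultdict(list)
    for item, blocks in item_map.items():
        for block in blocks:
            block_map[block].append(item)
    return dict(block_map)


block_to_items = invert(item_to_blocks)
```

Here one data item spans several blocks and one block holds parts of
more than one data item, which is the "one or more" relationship in
both directions.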
[0075] In this example, primary storage volume 513 comprises a
v-disk file representing a virtual machine and secondary storage
volume 515 comprises a virtual storage volume or drive. Secondary
storage volume 515 includes data items which comprise the virtual
storage contents of the virtual storage volume. The virtual storage
contents of the virtual storage volume may be, for example, data
files on the virtual storage volume.
[0076] In operation, data control system 501 controls access (i.e.,
reads and writes) to the contents of a virtual drive (e.g., to data
items 516 of secondary storage volume 515 and/or to blocks 514 of
primary storage volume 513). For example, data control system 501
may allow a process or a VM guest operating system (OS) to read the
contents of the virtual drive. However, when the process or the
VM guest OS attempts to write or modify the contents of the virtual
drive, data control system 501 responsively generates a snapshot of
the virtual drive so that other processes reading the virtual drive
are not disturbed.
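A minimal sketch of this access pattern follows, assuming the
snapshot starts as a full copy of the primary blocks. The DataControl
class and its method names are hypothetical and are not taken from
the described system.

```python
class DataControl:
    """Toy access controller: reads use the primary, the first write
    attempt generates a separate writable version."""

    def __init__(self, primary_blocks):
        self.primary = list(primary_blocks)  # stands in for blocks 514
        self.derivative = None               # stands in for volume 523

    def read(self, index):
        # Readers of the primary volume see its unmodified blocks.
        return self.primary[index]

    def write(self, index, data):
        # The first write attempt triggers snapshot generation
        # (assumed here to be a full copy of the primary blocks).
        if self.derivative is None:
            self.derivative = list(self.primary)
        self.derivative[index] = data
```

Because writes land only in the derivative, a concurrent reader of
the primary volume is unaffected by them.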
[0077] In this example, data control system 501 generates primary
derivative volume 523. Primary derivative volume 523 is a snapshot
of primary storage volume 513 which is accessible to one or more
processes and/or one or more VM guest operating systems. Primary
derivative volume 523 may initially be an individual copy of
primary storage volume 513. Those skilled in the art will
appreciate that primary derivative volume 523 may not be an exact
copy of primary storage volume 513.
[0078] Once primary derivative volume 523 is generated, the process
or the VM guest OS requesting to write primary storage volume 513
has an individual version of primary storage volume 513 which may
be modified and/or otherwise changed. Typically, this individual
version (i.e., primary derivative volume 523) grows over time as
the data items 526 and/or blocks 524 are changed by the process or
the VM guest OS. Data control system 501, primary derivative volume
523, and/or primary storage volume 513 may track blocks 524 of the
data volume in primary derivative volume 523 that have changed. In
addition, data control system 501 tracks those blocks in the
primary storage volume 513 (the ancestor disk) that remain
allocated or free.
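One simple way to track changed blocks is to record the index of
every block written after the derivative is generated. The sketch
below assumes a set of indices; the DerivativeVolume class is
hypothetical, and the description does not say which data structure
is actually used.

```python
class DerivativeVolume:
    """Toy writable version of a primary volume that records which
    blocks have been modified since it was generated."""

    def __init__(self, primary_blocks):
        self.blocks = list(primary_blocks)
        self.changed = set()  # indices of blocks written so far

    def write(self, index, data):
        self.blocks[index] = data
        self.changed.add(index)  # repeated writes leave one entry
```

A set keeps the record small when the same block is rewritten many
times, since only the fact that it changed matters.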
[0079] Data control system 501 may receive an instruction, request,
or other indication from the process or the VM guest OS that
primary derivative volume 523 is no longer needed. At this point
data control system 501 collapses primary derivative volume 523
back into primary storage volume 513 by copying the modified or
changed blocks from primary derivative volume 523 to primary
storage volume 513.
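The collapse step can be sketched as a loop over the changed block
indices. This is an illustration under the assumption that volumes
are plain lists of blocks and that changed blocks are tracked as a
set of indices; the collapse function is hypothetical.

```python
def collapse(primary, derivative, changed):
    """Copy only the changed blocks from the derivative back into
    the primary, then forget the change record."""
    for index in sorted(changed):
        primary[index] = derivative[index]
    changed.clear()  # the derivative is no longer needed after the merge
    return primary
```

Blocks that were never modified are not touched, so the cost of the
collapse scales with the number of changed blocks rather than with
the size of the volume.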
[0080] FIG. 6 illustrates a data control environment 600 according
to an embodiment. In this example, data control system 601 is
implemented in a virtual machine (VM) environment to generate and
collapse a snapshot of a primary storage volume in response to input
from one or more data utilities and/or one or more VM guest
operating systems (OS).
[0081] As shown in this example, data control environment 600
includes VM environment 610, agent system 620, and data utilities
630 and 640. VM environment 610 includes elements similar to
elements of VM environment 510 of FIG. 5. In this example, data
control system 601 is implemented in a virtual machine (VM)
environment to generate and collapse a snapshot of a primary
storage volume in response to input from one or more processes
and/or one or more VM guest operating systems.
[0082] As shown in this example, VM environment 610 includes data
control system 601, primary storage volume 613, and primary
derivative volume 623. Elements of VM environment 610 may include,
for example, virtual machines, hypervisors, server machines, and
other underlying virtual files. Other elements are also possible
although not shown for simplicity.
[0083] Primary storage volume 613 is comprised of blocks 614 and
includes secondary storage volume 615. Secondary storage volume 615
includes data items 616. In this example, primary derivative volume
623 comprises a snapshot of primary storage volume 613 which
includes data items 626. In this example, a snapshot is a read-only
copy of a data set frozen at a point in time. The snapshot allows
applications or processes to write (or modify) their data sets
without interruption to other applications or processes which may
be concurrently accessing the same data sets.
[0084] Data control system 601 comprises any system or collection
of systems capable of generating primary derivative volume 623 (a
snapshot of primary storage volume 613) and then collapsing primary
derivative volume 623 back into primary storage volume 613. Data
control system 601 may be a micro-processor, an application
specific integrated circuit, a general purpose computer, a server
computer, or any combination or variation thereof. In this example,
data control system 601 is shown within VM environment 610. Those
skilled in the art will appreciate that in some embodiments data
control system 601 may be located outside VM environment 610.
[0085] Primary storage volume 613 and secondary storage volume 615
may be any storage volumes capable of storing a volume of data. As
discussed, primary storage volume 613 is comprised of blocks 614.
Each block of blocks 614 comprises a section of primary storage
volume 613 that corresponds to one or more data items 616 in
secondary storage volume 615. Data items 616 comprise the volume of
data in secondary storage volume 615 and each data item 616
corresponds to one or more blocks 614.
[0086] In this example, primary storage volume 613 comprises a
v-disk file representing a virtual machine and secondary storage
volume 615 comprises a virtual storage volume or drive. Secondary
storage volume 615 includes data items which comprise the virtual
storage contents of the virtual storage volume. The virtual storage
contents of the virtual storage volume may be, for example, data
files on the virtual storage volume.
[0087] Agent system 620 may be any computer system, group of
computer systems, custom hardware, or other device configured to
communicate with VM environment 610 and data utilities 630 and 640.
For example, agent system 620 may communicate with data utilities
630 and/or 640 to create generation and collapse requests for data
control system 601.
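As an illustration only, a generation or collapse request might
carry little more than an action, a target volume, and the identity
of the requesting data utility. The Request type, its field names,
and the make_request helper below are assumptions; the description
does not define a request format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    action: str     # "generate" or "collapse"
    volume: str     # hypothetical volume identifier
    requester: str  # hypothetical name of the data utility


def make_request(action, volume, requester):
    """Build a request the agent could pass to the data control system."""
    if action not in ("generate", "collapse"):
        raise ValueError(f"unsupported action: {action}")
    return Request(action, volume, requester)
```

For example, make_request("generate", "volume-613",
"data-utility-630") would describe a backup utility asking for a
snapshot before it starts reading.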
[0088] A data utility (e.g., data utility 630 or data utility 640)
may be, for example, a PC based backup system that needs to access
the contents of primary storage volume 613 in order to replicate the
data items 616 or virus scanning software that needs to access the
contents of primary storage volume 613 in order to replicate the data
items 616. Other examples are also possible.
[0089] In operation, data control system 601 controls access (i.e.,
reads and writes) to the contents of a virtual drive (e.g., to data
items 616 of secondary storage volume 615 and/or to blocks 614 of
primary storage volume 613). For example, data control system 601
may allow a data utility (through agent system 620) or a VM guest
operating system (OS) access to read or write the contents of the
virtual drive. When the data utility or the VM guest OS attempts to
write or modify the contents of the virtual drive, data control
system 601 responsively generates a snapshot of the virtual drive
so that other processes reading the virtual drive are not
disturbed.
[0090] Once primary derivative volume 623 is generated, the data
utility or the VM guest OS requesting to write primary storage
volume 613 has an individual version of primary storage volume 613
which may be modified and/or otherwise changed. Typically, this
individual version (i.e., primary derivative volume 623) grows over
time as the data items 626 and/or blocks 624 are changed by the
data utility or the VM guest OS. Data control system 601, primary
derivative volume 623, and/or primary storage volume 613 may track
blocks 624 of the data volume in primary derivative volume 623 that
have changed. In addition, data control system 601 tracks those
blocks in the primary storage volume 613 (the ancestor disk) that
remain allocated or free.
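The description does not state how the allocated-or-free information
is used. One plausible use, assumed here purely for illustration, is
to restrict later copying to changed blocks that are still
allocated. The blocks_to_copy helper is hypothetical.

```python
def blocks_to_copy(changed, allocated):
    """Return the changed block indices that are still allocated,
    in ascending order."""
    return sorted(set(changed) & set(allocated))
```

With changed blocks {1, 3, 5} and allocated blocks {0, 1, 5}, only
blocks 1 and 5 would be selected under this assumption.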
[0091] Data control system 601 may receive an instruction, request,
or other indication from agent system 620 or the VM guest OS
indicating that primary derivative volume 623 is no longer needed.
At this point data control system 601 collapses primary derivative
volume 623 back into primary storage volume 613 by copying the
modified or changed blocks from primary derivative volume 623 to
primary storage volume 613.
[0092] The above description and associated figures teach the best
mode of the invention. The following claims specify the scope of
the invention. Note that some aspects of the best mode may not fall
within the scope of the invention as specified by the claims. Those
skilled in the art will appreciate that the features described
above can be combined in various ways to form multiple variations
of the invention. As a result, the invention is not limited to the
specific embodiments described above, but only by the following
claims and their equivalents.
* * * * *