U.S. patent application number 12/970536 was filed with the patent office on 2010-12-16 and published on 2011-04-21 for systems and methods for recovering electronic information from a storage medium.
Invention is credited to Parag Gokhale, Rajiv Kottomtharayil, Jun Lu, Yanhui Lu, Yu Wang.
Application Number | 20110093672 12/970536 |
Document ID | / |
Family ID | 36337097 |
Filed Date | 2010-12-16 |
United States Patent Application | 20110093672 |
Kind Code | A1 |
Gokhale; Parag ; et al. | April 21, 2011 |
SYSTEMS AND METHODS FOR RECOVERING ELECTRONIC INFORMATION FROM A
STORAGE MEDIUM
Abstract
In one embodiment of the invention, a method is provided for
retrieving certain electronic information previously stored on
certain storage media after a threshold set in the storage
retention criteria has been exceeded, in an electronic information
storage system that stores electronic information on storage media
in accordance with storage retention criteria. The
method includes storing a record in a memory associated with a
system manager that assigns the storage retention criteria to the
certain electronic data, designating the storage media available
for overwrite after the threshold set in the storage retention
policy has been exceeded, identifying the certain storage media
available for overwrite, and retrieving information from the
certain media after the threshold set in the storage retention
policy has been exceeded.
Inventors: | Gokhale; Parag (Ocean, NJ); Lu; Jun (Ocean, NJ); Lu; Yanhui (Acton, MA); Wang; Yu (Edison, NJ); Kottomtharayil; Rajiv (Marlboro, NJ) |
Family ID: | 36337097 |
Appl. No.: | 12/970536 |
Filed: | December 16, 2010 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
12276868 | Nov 24, 2008 | 7873802 |
12970536 | | |
11269515 | Nov 7, 2005 | 7472238 |
12276868 | | |
60626076 | Nov 8, 2004 | |
60625746 | Nov 5, 2004 | |
Current U.S. Class: | 711/159; 711/E12.002 |
Current CPC Class: | G06F 12/00 20130101; G06F 3/0683 20130101;
H04L 41/22 20130101; G06F 12/02 20130101; H04L 67/1097 20130101;
G06F 3/0605 20130101; G06F 3/0631 20130101; G06F 11/1453 20130101;
G06F 2201/815 20130101; G06F 3/0619 20130101; G06F 3/0604 20130101;
G06F 3/0608 20130101; G06F 3/0686 20130101; H04L 41/046 20130101;
G06F 3/065 20130101; G06F 3/0665 20130101; G06F 3/064 20130101 |
Class at Publication: | 711/159; 711/E12.002 |
International Class: | G06F 12/02 20060101 G06F012/02 |
Claims
1. A computer-implemented method for selecting a spare storage
medium to be used in a data storage operation, wherein the spare
storage medium includes data designated to be overwritten, the
method comprising: accessing at least one index that stores-- first
index information regarding a first spare storage medium, and
second index information regarding a second spare storage medium;
identifying the first spare storage medium associated with the
first index information, wherein the first index information
identifies data stored on the first spare storage medium, and
wherein the first spare storage medium includes data designated to
be overwritten; identifying the second spare storage medium
associated with the second index information, wherein the second
index information identifies data stored on the second spare
storage medium, and wherein the second spare storage medium
includes data designated to be overwritten; and selecting the first
spare storage medium for use in a data storage operation based on
the accessing of the first index information and the second index
information.
2. The method of claim 1, wherein the first index information
identifies data having a lower priority of preservation than the
data identified by the second index information.
3. The method of claim 1, wherein the first index information
identifies data older than the data identified by the second index
information.
4. The method of claim 1, further comprising: overwriting the data
stored on the first storage medium during the data storage
operation; and deleting the first index information only after the
data stored on the first storage medium is partially
overwritten.
5. The method of claim 1, further comprising: overwriting the data
stored on the first storage medium during the data storage
operation; and deleting the first index information only after the
data stored on the first storage medium is substantially
overwritten.
6. A data storage system, comprising: at least one client computing
device; and at least one data storage device coupled to the client
computing device via a network, wherein the data storage device
includes at least a first spare storage medium and a second spare
storage medium; wherein the client computing device is programmed
to: access at least one index having-- first index information
regarding the first spare storage medium of the data storage
device, and second index information regarding the second spare
storage medium of the data storage device; identifying the first
spare storage medium associated with the first index information,
wherein the first index information identifies data stored on the
first spare storage medium, and wherein the first spare storage
medium includes data designated to be overwritten; identifying the
second spare storage medium associated with the second index
information, wherein the second index information identifies data
stored on the second spare storage medium, and wherein the second
spare storage medium includes data designated to be overwritten;
and selecting the first spare storage medium for use in a data
storage operation based on the accessing of the first index
information and the second index information.
7. The data storage system of claim 6, wherein the first index
information identifies data having a lower priority of preservation
than the data identified by the second index information.
8. The data storage system of claim 6, wherein the first index
information identifies data older than the data identified by the
second index information.
9. The data storage system of claim 6, further comprising:
overwriting the data stored on the first storage medium during the
data storage operation; and deleting the first index information
only after the data stored on the first storage medium is partially
overwritten.
10. The data storage system of claim 6, further comprising:
overwriting the data stored on the first storage medium during the
data storage operation; and deleting the first index information
only after the data stored on the first storage medium is
substantially overwritten.
11. A method for selecting a spare storage medium to be used in a
data storage operation, wherein the spare storage medium includes
data designated to be overwritten, the method comprising:
identifying a first spare storage medium associated with first
index information that identifies data stored on the first spare
storage medium; identifying a second spare storage medium
associated with second index information that identifies data
stored on the second spare storage medium; selecting the first
spare storage medium for use in a data storage operation based on a
review of the first index information and the second index
information, wherein the first spare storage medium includes data
designated to be overwritten; retrieving a storage medium;
verifying that the retrieved storage medium is the first spare
storage medium, wherein the verifying includes automatically
reading data from the retrieved storage medium; and, when the
retrieved storage medium is verified to be the first spare storage
medium, then overwriting, with new data, the data designated to be
overwritten on the first spare storage medium.
12. The method of claim 11, wherein the first index information
identifies data having a lower priority of preservation than the
data identified by the second index information.
13. The method of claim 11, wherein the first index information
identifies data older than the data identified by the second index
information.
14. The method of claim 11, further comprising: overwriting the
data stored on the first storage medium during the data storage
operation; and deleting the first index information only after the
data stored on the first storage medium is partially
overwritten.
15. The method of claim 11, further comprising: overwriting the
data stored on the first storage medium during the data storage
operation; and deleting the first index information only after the
data stored on the first storage medium is substantially
overwritten.
16. The method of claim 11, wherein the automatically reading data
from the retrieved storage medium includes reading an on media
label (OML) on the retrieved storage medium.
17. The method of claim 11, wherein the automatically reading data
from the retrieved storage medium includes reading a header file on
the retrieved storage medium.
18. The method of claim 11, further comprising systematically
searching media libraries to locate the first spare storage medium
when the retrieved storage medium is verified not to be the first
spare storage medium.
Description
CROSS-REFERENCE TO RELATED APPLICATION(S)
Priority Claim
[0001] This application is a divisional of U.S. application Ser.
No. 12/276,868 titled SYSTEMS AND METHODS FOR RECOVERING ELECTRONIC
INFORMATION FROM A STORAGE MEDIUM, filed Nov. 24, 2008, which is a
continuation of U.S. application Ser. No. 11/269,515 titled SYSTEMS
AND METHODS FOR RECOVERING ELECTRONIC INFORMATION FROM A STORAGE
MEDIUM, filed Nov. 7, 2005, now U.S. Pat. No. 7,472,238, which
claims the benefit of U.S. Provisional Application No. 60/626,076
titled SYSTEM AND METHOD FOR PERFORMING STORAGE OPERATIONS IN A
COMPUTER NETWORK, filed Nov. 8, 2004, and U.S. Provisional
Application No. 60/625,746 titled STORAGE MANAGEMENT SYSTEM filed
Nov. 5, 2004, each of which is incorporated herein by reference in
its entirety.
RELATED APPLICATIONS
[0002] This application is also related to the following patents
and pending applications, each of which is hereby incorporated by
reference in its entirety:
[0003] U.S. Pat. No. 6,418,478, titled PIPELINED HIGH SPEED DATA
TRANSFER MECHANISM, issued Jul. 9, 2002;
[0004] application Ser. No. 09/610,738, titled MODULAR BACKUP AND
RETRIEVAL SYSTEM USED IN CONJUNCTION WITH A STORAGE AREA NETWORK,
filed Jul. 6, 2000, now U.S. Pat. No. 7,035,880;
[0005] application Ser. No. 09/774,268, titled LOGICAL VIEW AND
ACCESS TO PHYSICAL STORAGE IN MODULAR DATA AND STORAGE MANAGEMENT
SYSTEM, filed Jan. 30, 2001, now U.S. Pat. No. 6,542,972;
[0006] application Ser. No. 60/409,183, titled DYNAMIC STORAGE
DEVICE POOLING IN A COMPUTER SYSTEM, filed Sep. 9, 2002;
[0007] application Ser. No. 11/269,520, titled SYSTEM AND METHOD
FOR PERFORMING MULTISTREAM STORAGE OPERATIONS, filed Nov. 7, 2005;
[0008] application Ser. No. 11/269,512, titled SYSTEM AND METHOD TO
SUPPORT SINGLE INSTANCE STORAGE OPERATIONS, filed Nov. 7, 2005;
[0009] application Ser. No. 11/269,514, titled METHOD AND SYSTEM OF
POOLING STORAGE DEVICES, filed Nov. 7, 2005 now U.S. Pat. No.
7,809,914;
[0010] application Ser. No. 11/269,521, titled METHOD AND SYSTEM
FOR SELECTIVELY DELETING STORED DATA, filed Nov. 7, 2005, now U.S.
Pat. No. 7,765,369;
[0011] application Ser. No. 11/269,519, titled METHOD AND SYSTEM
FOR GROUPING STORAGE SYSTEM COMPONENTS, filed Nov. 7, 2005, now
U.S. Pat. No. 7,500,053; and
[0012] application Ser. No. 11/269,513, titled METHOD AND SYSTEM
FOR MONITORING A STORAGE NETWORK, filed Nov. 7, 2005.
COPYRIGHT NOTICE
[0013] A portion of the disclosure of this patent document contains
material that is subject to copyright protection. The copyright
owner has no objection to the facsimile reproduction by anyone of
the patent document or the patent disclosures, as it appears in the
Patent and Trademark Office patent files or records, but otherwise
expressly reserves all other rights to copyright protection.
BACKGROUND
[0014] The present invention generally relates to the storage and
retrieval of electronic data used in computer systems. More
particularly, the present invention relates to systems and methods
for managing the storage of electronic data on a recordable medium
in a way that extends the amount of time the electronic data may be
retrieved from the recordable medium before the medium is reused in
another storage application.
[0015] The storage of electronic data has evolved over time. During
the early development of the computer, storage of electronic data
was limited to individual computers. Electronic data was stored in
the Random Access Memory (RAM) or some other storage medium such as
a magnetic tape or hard drive that was a part of the computer
itself.
[0016] Later, with the advent of network computing, the storage of
electronic data gradually migrated from the individual computer to
stand-alone storage devices accessible via a network. These
individual network storage devices soon evolved into networked tape
drives, optical libraries, Redundant Arrays of Inexpensive Disks
(RAID), CD-ROM jukeboxes, and other devices. Common architectures
included drive pools, which generally are logical collections of
drives with associated media groups including magnetic tapes or
other storage media used by a given drive pool.
[0017] Storage systems, such as some of the systems described
above, typically employ certain high capacity data storage mediums,
which may include magnetic tapes, optical disks and the like to
store electronic information. At some point in time, however, it is
often no longer necessary or desirable to retain the electronic
information stored on these media. When this point is reached, the
media on which such electronic information is stored may be reused
or recycled by the system for use in other storage jobs rather than
simply discarding the media or maintaining the information in
perpetuity.
[0018] For example, in a tape-based system, a storage tape with
unwanted or outdated information may be designated within the
storage management system for reuse in a subsequent storage
operation in a spare media pool. Such a spare media pool may
contain media that is available for storage use in subsequent
storage operations and may include new media or media designated
for reuse within the storage system. When storage media are
assigned to the spare media pool, any information in the storage
management system regarding the old data on the tape may be
discarded, erased or designated for overwrite and replaced with a
simple designation indicating that the tape is available for use in
another storage operation. For example, an index entry used by the
storage management system that includes information about the old
data may be overwritten or renamed after the data retention
period has expired.
[0019] In many storage systems, however, the reused storage media
continues to contain the data from the previous storage operation,
which typically remains on the media until it is overwritten by a
new storage process. Thus, in many storage systems, the media
designated for reuse continues to contain old information for a
significant period of time past any established retention date.
Nevertheless, because records are not typically retained or
retrievable by storage management systems regarding the media
designated for reuse (and any old information contained thereon),
it is difficult to recover or restore any of this old information,
absent the use of cumbersome, uncommon restore procedures, despite
the fact that such information still exists on media designated
for reuse within the system.
[0020] Accordingly, what is needed are systems and methods that
overcome this and other deficiencies.
SUMMARY
[0021] In one embodiment of the invention, a method is provided for
retrieving certain electronic information previously stored on
certain storage media after a threshold set in the storage
retention criteria has been exceeded, in an electronic information
storage system that stores electronic information on storage media
in accordance with storage retention criteria. The
method includes storing a record in a memory associated with a
system manager that assigns the storage retention criteria to the
certain electronic data, designating the storage media available
for overwrite after the threshold set in the storage retention
policy has been exceeded, identifying the certain storage media
available for overwrite, and retrieving information from the
certain media after the threshold set in the storage retention
policy has been exceeded.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] The above and other objects and advantages of the present
invention will be apparent upon consideration of the following
detailed description, taken in conjunction with the accompanying
drawings, in which like reference characters refer to like parts
throughout, and in which:
[0023] FIG. 1 is a block diagram of a network architecture for a
system that performs storage and retrieval operations on electronic
data in a computer network in accordance with the principles of the
present invention;
[0024] FIG. 2 is a block diagram of an exemplary media library
storage device for a system to perform storage and retrieval
operations in accordance with an embodiment of the invention;
[0025] FIG. 3 is a flow chart illustrating some of the steps for
performing storage and retrieval operations on electronic data in a
computer network according to an embodiment of the invention;
and
[0026] FIG. 4 is a flow chart illustrating some of the steps
involved with selecting media for reuse in accordance with an
embodiment of the invention.
DETAILED DESCRIPTION
[0027] An embodiment of the system constructed in accordance with
the principles of the present invention is shown in FIG. 1. As
shown, the system may include a client 50, a data agent 60, a data
store 70, a storage management component (SMC) 80, a storage
manager index 90, one or more media management components 100
(sometimes referred to as media agents), one or more media
management component indexes 110, and one or more storage devices
120. Although FIG. 1 depicts a system having two media management
components 100, there may be one media management component, or a
plurality of media management components providing communication
between the client 50, storage manager 80 and the storage devices
120. In addition, the system can include one or a plurality of
storage devices 120. Moreover, in some embodiments, media
management components 100 may be removed, omitted or otherwise
bypassed, with storage manager 80 directly controlling storage
devices 120.
[0028] Client 50 can be any networked client 50 and may include at
least one attached data store 70. Data store 70 may be any memory
device or local data storage device known in the art, such as a
hard drive, CD-ROM drive, tape drive, RAM, or other types of
magnetic, optical, digital and/or analog local storage. In some
embodiments of the invention, client 50 includes at least one data
agent 60, which is a software module that is generally responsible
for storing, archiving, migrating, and recovering data of a client
50 stored in data store 70 or other memory location.
[0029] Storage operations may include, but are not limited to,
creation, storage, retrieval, migration, deletion, and tracking of
primary or production volume data, secondary volume data, primary
copies, secondary copies, auxiliary copies, snapshot copies, backup
copies, incremental copies, differential copies, synthetic copies,
HSM copies, archive copies, Information Lifecycle Management
("ILM") copies, and other types of copies and versions of
electronic data.
[0030] In some embodiments of the invention, the system of FIG. 1
provides at least one, and typically a plurality of, data agents 60
for each client. Each data agent 60 is intended to store, back up,
migrate, and recover data associated with a different application.
For example, a client 50 may have different individual data agents
60 designed to handle Microsoft Exchange data, Lotus Notes data,
Microsoft Windows file system data, Microsoft Active Directory
Objects data, and other types of data known in the art.
[0031] Storage manager 80 is generally a software module or
application that coordinates and controls the system. For example,
storage manager 80 may manage and control storage operations
performed by the system shown in FIG. 1. Storage manager 80 may
communicate with some or all components of the system, including
clients 50, data agents 60, media management components 100, and
storage devices 120 to initiate and manage storage operations.
Storage manager 80 may include an index 90 for storing data related
to storage operations (described in more detail below). Generally
speaking, storage manager 80 communicates with storage devices 120
via a media management component 100. In some embodiments, storage
manager 80 may communicate directly with storage devices 120.
[0032] The system shown in FIG. 1 may include one or more media
management components, such as media management component 100.
Media management component 100 may be implemented as a software
module that conveys data, as directed by the storage manager 80,
between the client 50 and one or more storage devices 120, which
can be storage devices such as a tape library, a hard drive, a
magnetic media storage device, an optical media storage device, or
other storage device. Media management component 100 is
communicatively coupled with and may control storage device 120.
For example, media management component 100 might instruct a
storage device 120 to store, archive, migrate, or restore
application specific data. Media management component 100 generally
communicates with the storage device 120 via a local bus such as a
SCSI adaptor or a host bus adaptor (HBA).
[0033] Each media management component 100 may maintain an index
cache 110 which stores index data that the system generates during
storage operations. For example, storage operations for Microsoft
Exchange data generate index data. Index data may include, for
example, information regarding the location of the stored data on a
particular media (e.g., a location offset value), information
regarding the content of the data stored such as file names, sizes,
creation dates, formats, application types, and other file-related
criteria, information regarding one or more clients associated with
the data stored, information regarding one or more storage
policies, storage criteria, or storage preferences associated with
the data stored, compression information, retention-related
information, encryption-related information, stream-related
information, and other types of information. Index data thus
provides the system with an efficient mechanism for performing
storage operations including locating user files for recovery
operations and for managing and tracking stored data.
[0034] The system of FIG. 1 may maintain multiple copies of the
index data regarding particular stored data. A first copy may be
stored with the data copied to a storage device 120. Thus, a tape
may contain the stored data as well as index information related to
the stored data. In the event of a system restore, the index data
stored with the stored data can be used to rebuild a media
management component index 110 or other index useful in performing
storage operations. In addition, the media management component 100
that controls the storage operation also may generally write an
additional copy of the index data to its index cache 110. The data
in the media management component index cache 110 is generally
stored on faster media, such as magnetic media, and is thus readily
available to the system for use in storage operations and other
activities without having to be first retrieved from the storage
device 120.
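The two index copies described in paragraph [0034] can be illustrated with a minimal Python sketch. It assumes a tape carries its own copy of the index data and that a media management component index cache can be rebuilt from that copy after a system restore. The names `IndexEntry`, `Tape`, and `rebuild_index_cache` and the chosen fields are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class IndexEntry:
    """One index record; the fields are illustrative assumptions."""
    file_name: str
    offset: int  # location offset of the stored data on the media
    size: int


@dataclass
class Tape:
    """A medium holding stored data plus a first copy of its index data."""
    media_id: str
    on_media_index: List[IndexEntry] = field(default_factory=list)


def rebuild_index_cache(tapes: List[Tape]) -> Dict[str, List[IndexEntry]]:
    """Rebuild an index cache from the index copies stored on the media."""
    # Copy each list so later cache updates do not alter the on-media copy.
    return {tape.media_id: list(tape.on_media_index) for tape in tapes}


tape = Tape("T001", [IndexEntry("mail.edb", offset=0, size=4096)])
cache = rebuild_index_cache([tape])
```

In this sketch the cache is a separate copy, which mirrors the idea that the faster cache and the on-media index are maintained independently.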
[0035] Storage manager 80 may also maintain an index cache 90.
Storage manager index data may be used to indicate, track, and
associate logical relationships and associations between components
of the system, user preferences, management tasks, and other useful
data. For example, storage manager 80 might use its index cache 90
to track logical associations between media management components
100 and storage devices 120. Storage manager 80 may also use its
index cache 90 to track the status of storage operations to be
performed, storage patterns associated with the system components
such as media use, storage growth, network bandwidth, service level
agreement ("SLA") compliance levels, data protection levels,
storage policy information, storage criteria associated with user
preferences, retention criteria, storage operation preferences, and
other storage-related information.
[0036] Index caches 90 and 110 typically reside on their
corresponding storage component's hard disk or other fixed storage
device. For example, jobs agent 85 of a storage manager component
80 may retrieve storage manager index 90 data regarding a storage
policy and storage operation to be performed or scheduled for a
particular client 50. Jobs agent 85, either directly or via another
system module, may communicate with data agent 60 regarding the
storage operation. In some embodiments, jobs agent 85 may also
retrieve from index cache 90 a storage policy associated with the
client 50 and use information from the storage policy to
communicate to data agent 60 one or more media management
components 100 associated with performing storage operations for
that particular client 50 as well as other information regarding
the storage operation to be performed such as retention criteria,
encryption criteria, streaming criteria, etc. Data agent 60 then
packages or otherwise manipulates the client data stored in data
store 70 in accordance with the storage policy information
and/or according to a user preference, and communicates this client
data to the appropriate media management component(s) 100 for
processing. Media management component(s) 100 store the data
according to storage preferences associated with the storage policy
including storing the generated index data with the stored data, as
well as storing a copy of the generated index data in the media
management component index cache 110. Data may be stored in
accordance with any suitable storage policy or preference including
those disclosed in U.S. patent application Ser. No. 10/818,749,
which is hereby incorporated by reference in its entirety.
[0037] In some embodiments, components of the system may reside and
execute on the same computer. In some embodiments, a client
component such as a data agent 60, a media management component
100, or a storage manager 80 coordinates and directs local
archiving, migration, and retrieval application functions as
further described in U.S. patent application Ser. No. 09/610,738,
which is hereby incorporated by reference in its entirety. These
client components can function independently or together with other
similar client components.
[0038] Storage device 120 may be any conventional storage device
capable of storing data. Some storage devices 120 may include a
robotic arm (not shown) that may be used to insert and remove
storage media 145 contained in the storage device. The type of
storage media used in storage device 120 is not critical and can be
a magnetic tape or optical disk, such as that generally depicted in
FIG. 2. For example, storage device 120 may include any suitable
storage media such as storage tapes 145, but some embodiments may
also include other optical and magnetic media such as CDRW, DVDRW,
etc., (not shown). Storage device 120 may also include drives 125,
130, and 135 for reading information from and writing information
to such media. Tapes 145 may store electronic data containing
backups of application data, user preferences, system information,
and other useful information known in the art.
[0039] In operation, the system shown in FIG. 1 may store
electronic data on storage media 145 as described above. Generally
speaking, the information stored on media 145 may be maintained in
accordance with a particular storage policy or retention preference
that may be predefined or updated periodically. Such policies may
be user defined or may be one of several available predefined
default settings (e.g., as directed by a storage manager index). A
storage policy is generally a data structure or other information
that includes a set of preferences and other storage criteria for
performing a storage operation. The preferences and storage
criteria may include, but are not limited to: a storage location,
relationships between system components, network pathway to
utilize, retention policies, data characteristics, compression or
encryption requirements, preferred system components to utilize in
a storage operation, and other criteria relating to a storage
operation. A storage policy may be stored to a storage manager
index, to archive media as metadata for use in restore operations
or other storage operations, or to other locations or components of
the system.
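One way to picture the storage policy of paragraph [0039] is as a small record of preferences with a predefined default. The following Python sketch is only an illustration under that assumption; `StoragePolicy`, `DEFAULT_POLICY`, `policy_for`, and every field name and value are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class StoragePolicy:
    """A set of preferences and criteria for a storage operation."""
    name: str
    storage_location: str          # e.g. a tape library identifier
    retention_days: int            # retention criterion
    compress: bool = False
    encrypt: bool = False
    preferred_media_agent: Optional[str] = None


# A predefined default setting, used when no user-defined policy exists.
DEFAULT_POLICY = StoragePolicy(
    name="default", storage_location="library-1", retention_days=30
)


def policy_for(client_id: str, user_policies: Dict[str, StoragePolicy]) -> StoragePolicy:
    """Return the user-defined policy for a client, else the default."""
    return user_policies.get(client_id, DEFAULT_POLICY)


custom = StoragePolicy("exchange", "library-2", retention_days=365, encrypt=True)
chosen = policy_for("client-50", {"client-50": custom})
```

Making the record immutable (`frozen=True`) is a design choice of the sketch: a policy change is then modeled as a new record rather than an in-place update.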
[0040] In the case where information is retrieved from media 145,
storage manager 80 and/or media management components 100 may
cooperate with one another and interact with storage device 120 to
locate a particular media 145 and retrieve the desired data. Media
145 may be located using any suitable means including index
information that specifies the physical location of the media
within storage device 120 and may also utilize external or internal
labels or other indicia identifying the media and data stored
thereon. Such media identifiers may include on media labels (OMLs),
bar codes, RFIDs, etc.
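The lookup in paragraph [0040] can be sketched as an index-guided search that confirms the result against a label read from the media. This Python example is a simplified assumption, not the disclosed implementation: the slot index, the `Media` record, and the fallback scan of the library (which loosely echoes the systematic search recited in claim 18) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Media:
    barcode: str  # external label on the cartridge
    oml: str      # on-media label read from the media itself


def locate_media(
    media_id: str, slot_index: Dict[str, int], library: Dict[int, Media]
) -> Optional[Media]:
    """Find a medium via the index, verifying it by its on-media label."""
    slot = slot_index.get(media_id)
    candidate = library.get(slot) if slot is not None else None
    if candidate is not None and candidate.oml == media_id:
        return candidate
    # Index missing or stale: fall back to scanning every slot.
    for media in library.values():
        if media.oml == media_id:
            return media
    return None


library = {1: Media("B1", "T001"), 2: Media("B2", "T002")}
found = locate_media("T002", {"T002": 2}, library)
```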
[0041] Furthermore, during normal operation, storage device 120 may
reuse or recycle storage media 145 as appropriate to provide the
system with the storage resources necessary to perform future
storage operations and to promote the efficient use of spare media
within the system. One benefit of this type of reuse is that it
reduces the amount of media required by the storage system, thereby
eliminating the need for large amounts of unnecessary storage media
145.
[0042] For example, storage manager 80 and/or media management
component 100 may monitor the retention preferences or storage
policies of data stored on media 145. When certain data exceeds one
or more predetermined thresholds (e.g., exceeds an age, size or
other specified parameter), storage manager 80 and/or media
management component 100 may designate the media on which that data
is stored as available for current use (i.e., it may be
overwritten). This allows storage device 120 to use that media,
which still contains old data that has passed its retention period,
for new storage tasks. For example, after certain data on a
particular media 145 has exceeded a threshold parameter, media
management component 100 may designate that media for reuse. The
information regarding the old
data, however, still exists and may be retained (e.g., in an index
or backup index). This information, which may include descriptive
metadata, is useful in future restore operations where spare media
tape 145 has not yet been overwritten and it is desired to retrieve
some of that data stored thereon. Such information may be retained
until the media 145 is completely overwritten with new data. After
media 145 is overwritten with new data in a subsequent storage
operation, the old data previously stored on the overwritten
portion of the media is usually unrecoverable.
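The behavior described in paragraph [0042] can be sketched in Python as follows: a medium is marked reusable once its retention threshold is exceeded, its index metadata is kept, and restores remain possible until the medium is overwritten. The sketch assumes an age-based threshold and treats an overwrite as complete; `MediaRecord`, `expire`, `overwrite`, and `can_restore` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List


@dataclass
class MediaRecord:
    media_id: str
    written_on: date
    retention_days: int
    files: List[str] = field(default_factory=list)  # retained index metadata
    reusable: bool = False


def expire(record: MediaRecord, today: date) -> None:
    """Designate the media for reuse once the retention period has passed."""
    if today > record.written_on + timedelta(days=record.retention_days):
        record.reusable = True  # the index metadata is deliberately kept


def overwrite(record: MediaRecord, new_files: List[str], today: date) -> None:
    """Reuse the media; only now is the old index information replaced."""
    if not record.reusable:
        raise ValueError("media is still under retention")
    record.files = list(new_files)
    record.written_on = today
    record.reusable = False


def can_restore(record: MediaRecord, file_name: str) -> bool:
    """Old data stays restorable while its index entry still exists."""
    return file_name in record.files


record = MediaRecord("T001", date(2005, 1, 1), retention_days=30, files=["a.txt"])
expire(record, date(2005, 3, 1))
restorable_after_expiry = can_restore(record, "a.txt")
overwrite(record, ["b.txt"], date(2005, 3, 2))
restorable_after_overwrite = can_restore(record, "a.txt")
```

The point of the sketch is the ordering: expiry changes only the reuse flag, while the index entry for the old data is replaced at overwrite time.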
[0043] In some embodiments, media 145 may be managed by assigning
the media to one or more "media pools." Media 145 may be assigned
to a particular media pool by storage manager 80 based on certain
attributes of the data stored on the media. For example, one type
of media pool may be referred to as a "save pool." Media assigned
to a save pool may be designated by storage manager 80 and/or media
management components 100 as "write protected" or "unavailable" or
"in storage." Certain media 145 may be assigned to such a save pool
in the case where the data stored therein is to remain in storage
and accessible pursuant to a storage policy and therefore cannot be
overwritten or reused at this point in time. Storage manager 80 may
retain records and other information relating to the data stored on
each media 145 in the save pool such as its physical location and
the relationship between the data, media ID, and storage policy in
order to coordinate access and management of storage resources and
the stored data.
[0044] Another type of media pool may be referred to as a "scratch
pool." Media assigned to a scratch pool may be designated by
storage manager 80 as "writeable" or "available" or "spare" or
"spare media pool." Media assigned to the scratch pool is generally
available for storage operations and is generally not write
protected or otherwise restricted from use within a storage device.
Thus, when the system of FIG. 1 requires additional media 145 for
new storage operations, the spare media pool is where such media
may be located and made available to the system. Moreover, media
145 may be assigned to such a scratch pool in the case where the
media is newly added to the system or where the data stored on a
previously used media no longer needs to be retained or has
exceeded a limitation set forth in its storage policy and therefore
may be overwritten or reused at this point in time. For example,
certain data may have exceeded its age criteria. In this case, the
media 145 on which that data is stored may be designated for reuse
and assigned to the spare media pool. The metadata and other
information describing the data may also be retained in a spare
media index that tracks such information (not shown). The spare
media index may be substantially similar to or the same as
index 90 used to track media 145 and may be stored in or be part of
index 90.
[0045] Thus, the system of FIG. 1 has the ability to keep track of
what previously used media is available for new storage operations
by consulting an index of data records that indicate which media
145 are members of which media pool. In other embodiments, the
status of a particular media may be determined by consulting
records that are maintained on a media-by-media basis. For example,
when a certain media is available for new storage operations, a
flag may be set in that media's profile record. With this system,
storage manager 80 may quickly determine system capacity,
availability and degree of utilization of spare media.
[0046] Moreover, in some embodiments, data may be overwritten on
spare media (and the media may be reused) based on a classification
scheme or according to certain preferences. For example, data may
be assigned to various retention levels and may be overwritten
based on those retention levels, with the highest priority data
being overwritten last. Thus, for example, low priority data may be
overwritten first, intermediate priority data may be overwritten
next, and high priority data overwritten last. Such a hierarchy
extends the lifecycle of data on a sliding scale, providing
additional flexibility in retrieving data based on retention level,
while making storage media available within the system.
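The retention-level ordering described above can be illustrated with a short sketch. The three level names and the `overwrite_order` function are assumptions made for illustration; the specification does not fix the number of levels or how they are encoded.

```python
from enum import IntEnum


class RetentionLevel(IntEnum):
    # Hypothetical levels; a lower value is overwritten earlier.
    LOW = 0
    INTERMEDIATE = 1
    HIGH = 2


def overwrite_order(items):
    """Return data names in the order they may be overwritten.

    `items` is a list of (name, RetentionLevel) pairs. Python's sort is
    stable, so items at the same level keep their original order.
    """
    return [name for name, _level in sorted(items, key=lambda it: it[1])]


items = [
    ("finance", RetentionLevel.HIGH),
    ("scratch-notes", RetentionLevel.LOW),
    ("email", RetentionLevel.INTERMEDIATE),
    ("temp", RetentionLevel.LOW),
]
order = overwrite_order(items)
```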
[0047] Unlike prior art systems, a preferred embodiment of the
present invention continues to retain records and other information
relating to media assigned to the scratch pool (or simply for the
media designated for reuse in general irrespective of whether a
scratch pool or save pool concept is actually implemented). For
example, media management component 100 and/or storage manager 80
may store or retain records relating to each media 145 in the
scratch pool including its physical location within storage device
120, the data stored on that media, as well as information useful
in indexing that data, media identification information and storage
policy, etc. (e.g., in a spare media pool index). This allows the
present invention to identify and retrieve previously stored
information from scratch pool media that has exceeded its retention
date, thus accommodating the need for the reuse of storage media
and promoting system efficiency while succeeding in extending the
storage period of previously stored data past its retention date by
leveraging description data already present within the system. This
ability represents an improvement over prior art systems which
typically cannot access old information from recycled media despite
the fact such information continues to remain on spare media within
the storage system prior to overwrite.
[0048] In some embodiments, the index or other information retained
for the scratch pool media (i.e., spare media pool index) may be
the same as or substantially similar to the information retained
for save pool media. In this case, when a media 145 is assigned to
the scratch pool from the save pool, the associated records may be
simply copied or redesignated as scratch pool records. Using this
approach, little or no additional processing of existing media
management information need be performed to obtain detailed and
accurate information regarding scratch pool media. The redesignated
information may be used by storage manager 80 or other management
systems (not shown) to retrieve old information that remains on
scratch pool media (prior to reuse).
[0049] Although media may be reassigned from one storage pool to
the other as described above, it will be understood that this does
not necessarily require any physical movement of the storage media
from one location to another. Rather, media may remain at one
location with that media being reassigned to the scratch pool
within management software resident on storage manager 80.
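The reassignment between pools can be sketched as a purely logical operation on index records. The `MediaIndex` class, the `Pool` enumeration, and the record fields below are hypothetical names chosen for illustration, not structures defined by the specification.

```python
from enum import Enum


class Pool(Enum):
    SAVE = "save"
    SCRATCH = "scratch"


class MediaIndex:
    """Hypothetical index mapping each media ID to its record."""

    def __init__(self):
        self._records = {}

    def add(self, media_id, location, contents, pool=Pool.SAVE):
        self._records[media_id] = {
            "location": location,
            "contents": list(contents),
            "pool": pool,
        }

    def reassign(self, media_id, pool):
        # Only the pool label changes; location and contents are kept,
        # so nothing is moved physically and no metadata is rebuilt.
        self._records[media_id]["pool"] = pool

    def members(self, pool):
        return sorted(m for m, r in self._records.items() if r["pool"] is pool)

    def record(self, media_id):
        return self._records[media_id]


index = MediaIndex()
index.add("T001", location="library A, slot 4", contents=["payroll.db"])
index.add("T002", location="library A, slot 5", contents=["mail.pst"])
index.reassign("T001", Pool.SCRATCH)
```

Because the record is redesignated rather than recreated, the information needed to find old data on scratch pool media is available with no extra processing.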
[0050] Furthermore, in some embodiments of the invention, storage
manager 80 may monitor the reuse of media from the scratch pool
such that the system keeps track of the storage space and/or data
overwritten by subsequent storage operations. This may involve
updating the scratch pool media records so that the records reflect
how much of the old data remains on that particular media. For
example, a certain media 145 designated for reuse may be partially
overwritten such that it includes both new data and old data. This
may involve keeping track of certain files, chunks, and/or blocks
of data including any location offset on the media or any
description. Storage manager 80 may update the records associated
with that media so it may be readily determined how much old
information still may be recovered. Such updating may be automated
and triggered by reuse of a previously used media 145 and/or in
accordance with any classification or retention scheme such that
description information or metadata may be updated, deleted or
otherwise modified when corresponding portions of data are
overwritten on media 145. Any suitable data monitoring and updating
procedure or program may be used to achieve this objective. This
feature permits the present invention to identify and retrieve (or
partially retrieve) old information from media already in
reuse.
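One way to track how much old data survives a partial overwrite is to record each old item as an offset and length and to prune the records as new data is written. The sketch below assumes, for simplicity, that new data is written sequentially from the start of the media; the `Extent` type and `surviving_extents` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    # Hypothetical description of one old file, chunk, or block.
    name: str
    offset: int  # starting byte on the media
    length: int  # size in bytes


def surviving_extents(old_extents, overwritten_upto):
    """Return the old extents that are still fully intact.

    Assumes new data occupies bytes [0, overwritten_upto). An extent
    that starts before that point is at least partly destroyed and is
    therefore dropped from the records.
    """
    return [e for e in old_extents if e.offset >= overwritten_upto]


old = [Extent("a", 0, 100), Extent("b", 100, 50), Extent("c", 150, 200)]
remaining = surviving_extents(old, overwritten_upto=120)
recoverable_bytes = sum(e.length for e in remaining)
```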
[0051] Another aspect of the invention involves the management,
organization and display of save pool and scratch pool information.
In some embodiments, both save pool and scratch pool information
may be organized and displayed using a graphical user interface
with familiar pull down menus and a folder/file organization
structure. For example, a user may browse information in either
pool by merely clicking on a particular folder (such as save or
scratch) and select a particular media (which may be represented as
a file within the folder) to view the information stored on that
particular media. This allows a user media level access to the
information stored in the system. In other embodiments, browse
features associated with the system may locate and display for a
user a graphical view of all save pool media in one display and a
different display that shows all the scratch pool media. For
example, by searching for all available spare pool media, the
system of FIG. 1 may populate a table, list or other graphical
display showing the available spare pool media, the records of the
data stored on that media, and any other useful information (e.g.,
a spare pool media display). The same or similar may also be done
for save pool media. In some embodiments, access to such
information may be password protected within the system and
available to only users with the appropriate privileges. For
example, a user may only have privileges to the save pool and not
the scratch pool, or may have access to high level data such as
the available or used media, but not to any index information,
etc.
[0052] Additionally, management software may include a search
engine and command functions that allow the user to quickly search
system media to determine if particular data exists or to
observe the status of certain media. For example, if a user wants
to determine if certain data which has passed its retention cycle
still exists on media within the scratch pool, a boolean word
search or other searching method may determine whether that data
still exists or not. Moreover, the system may generate summaries
that include general information such as listing the oldest data in
the scratch pool, the current contents of the scratch pool,
remaining unused system storage capacity, etc. Command functions may
allow users to modify or otherwise direct manipulation of media
outside of normal automated operation. These summary, command, and
search functions may be user configurable and arranged according to
the needs or desires of a particular user.
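A boolean word search over the retained scratch pool records might look like the sketch below. The index layout (a mapping from media ID to descriptive strings) and the `search` function with its `all_of` and `none_of` parameters are assumptions for illustration only.

```python
def search(index, all_of=(), none_of=()):
    """Find index entries containing every word in `all_of` and no
    word in `none_of` (a simple AND / NOT word search).

    `index` maps a media ID to a list of descriptive strings.
    Returns a dict of media ID -> matching entries.
    """
    required = {w.lower() for w in all_of}
    excluded = {w.lower() for w in none_of}
    hits = {}
    for media_id, entries in index.items():
        matched = []
        for entry in entries:
            words = set(entry.lower().split())
            if required <= words and not (excluded & words):
                matched.append(entry)
        if matched:
            hits[media_id] = matched
    return hits


scratch_index = {
    "T001": ["payroll report 2003", "marketing plan 2003"],
    "T002": ["payroll report 2004"],
}
hits = search(scratch_index, all_of=("payroll",), none_of=("2004",))
```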
[0053] The system of FIG. 1 may select media for reuse employing a
number of different selection criteria. For example, media that
contains data past its retention cycle may be designated for reuse
immediately after (or some time after) the retention period
expires. However, the order in which those media are overwritten
may vary according to default or user-specified preferences. For
example, a default preference may specify that the media containing
the oldest data be overwritten first. Other default scenarios may
include specifying reuse preference based on data type. For
example, all marketing data may be overwritten before any financial
data is overwritten, system backup files may have priority over
email backups, etc. System users may customize their system with
reuse procedures and policies that best reflect the needs of a
particular business or enterprise. Nonetheless, it will be
understood that any suitable reuse or recycle policy may be used if
desired.
[0054] Some of the steps involved in recovering electronic
information from a storage medium in accordance with the present
invention are illustrated in flow chart 300 shown in FIG. 3. As
shown, at step 302, the system of FIG. 1 may determine what data is
to be stored, and which retention policy should govern the storage
operation. This step is preferably automated and may be
accomplished, at least in part, by system management software
resident on storage manager 80 which oversees the storage of
information on a particular media 145. At this point, a certain
storage media may be assigned to a save pool and be designated as
restricted. After these decisions have been made and the data
stored, a record may be created at step 304 that may be maintained
in index 90 and/or media management component 100. Next, at step
306, jobs agent 85 or other management agent monitors the retention
policies of data stored within the save pool.
[0055] At step 308, when certain data exceeds its retention
threshold, jobs agent 85 or other selection logic may selectively
"prune" or remove certain media from the save pool by releasing its
associated index entry and designating it available (i.e., placing
it in the scratch pool) while retaining its record profile. Next,
at step 310, storage manager 80 and/or media management component
100 may select a media 145 for overwrite based on default or other
criteria described above (the "reused media") and update that
media's record profile accordingly. At step 312, a user may
optionally search for and retrieve information from the reused
media assigned to the scratch pool using the indexing and location
information stored at step 308. This may be accomplished, for
example, by invoking the media pool display screen described above,
and populating that display with the desired information. A user
may then retrieve or otherwise access data stored on the identified
media. Afterwards, at step 314, reused media may be partially
overwritten in a new storage operation. At this point, the media in
use may have record profiles that belong to both the scratch and
save pools and both new and old data may be retrieved from the
media. For example, a certain media 145 may have index entries in
both the save and scratch pool with offset data defining the
location of the old or new data on that media. Furthermore, media
145 used in this type of dual role may be organized in any suitable
way, as desired, such as by overwriting large contiguous sections,
or by selectively overwriting old data of lesser importance, etc.
Media 145 containing both old and new information may sometimes be
referred to as hybrid media.
[0056] Next, at step 316, media management component 100 and/or
storage manager 80 may update the record profile or index entry
associated with the reused media to reflect the extent to which the
reused media has been overwritten and to indicate how much old data
still remains. The record profile may also be updated to reflect
the newly added information. At step 318, a user may optionally
retrieve any old data remaining on the reused media, and finally, at
step 320 the reused media may be completely overwritten and its
associated record profile may be updated to reflect this change. At
this point, the reused media may be assigned back to the save pool,
and the records in the scratch pool regarding this media may be
deleted.
[0057] Although the steps shown above are illustrative of a general
embodiment of the invention, it will be understood these steps are
not intended to be comprehensive or necessarily performed in the
order shown. For example, steps 314 to 318 may be performed on an
iterative basis until the media in use is completely overwritten or
designated to the save pool. For example, steps 314 to 318 may be
performed until a threshold is reached, such as media capacity, in
which case the index data may be deleted.
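The media lifecycle walked through in flow chart 300 can be summarized as a small state machine. The state and event names below are labels chosen for this sketch (only "hybrid" comes from the text), and the mapping to step numbers in the comments is an interpretation rather than a definition.

```python
from enum import Enum


class State(Enum):
    SAVE = "save"        # retained under its storage policy
    SCRATCH = "scratch"  # released for reuse, old data still intact
    HYBRID = "hybrid"    # partly overwritten: old and new data coexist


# Hypothetical event names mapped to the transitions described above.
TRANSITIONS = {
    (State.SAVE, "retention_exceeded"): State.SCRATCH,   # step 308
    (State.SCRATCH, "partial_overwrite"): State.HYBRID,  # step 314
    (State.HYBRID, "partial_overwrite"): State.HYBRID,   # steps 314-318 repeated
    (State.SCRATCH, "full_overwrite"): State.SAVE,       # step 320
    (State.HYBRID, "full_overwrite"): State.SAVE,        # step 320
}


def advance(state, event):
    """Return the next state, or raise ValueError if not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}")


state = State.SAVE
history = [state]
for event in ("retention_exceeded", "partial_overwrite",
              "partial_overwrite", "full_overwrite"):
    state = advance(state, event)
    history.append(state)
```

Old data remains retrievable in the SCRATCH and HYBRID states; once the media returns to SAVE after a full overwrite, the scratch pool records for it may be deleted.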
[0058] Some of the steps involved in selecting media assigned to
the scratch pool for overwrite in accordance with the present
invention are illustrated in flow chart 400 shown in FIG. 4. As
shown, at step 402, the system of FIG. 1 may determine what media
is to be overwritten first according to certain defined or default
criteria as described above. Available media may be tracked in a
data structure by storage manager 80 and/or media management
component 100 such that the most appropriate media is readily
identifiable and available when the need arises for spare media.
For example, a data structure representing a virtual queue or other
arrangement may be used to track and order media according to
retention criteria or other preferences such as first in, first
out, by data type or subject matter, etc.
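A virtual queue of this kind could be kept as a priority heap ordered first by a data-type preference and then by age. The `SpareMediaQueue` class and the `type_rank` mapping are hypothetical; the sketch uses Python's standard `heapq` module.

```python
import heapq
from datetime import date


class SpareMediaQueue:
    """Hypothetical queue of spare media ordered for overwrite."""

    def __init__(self, type_rank):
        # type_rank maps a data type to its overwrite rank; a lower
        # rank is overwritten earlier. Unknown types go last.
        self._type_rank = type_rank
        self._heap = []

    def push(self, media_id, data_type, stored_on):
        rank = self._type_rank.get(data_type, len(self._type_rank))
        # Tuples compare by rank, then by date (oldest first), then ID.
        heapq.heappush(self._heap, (rank, stored_on, media_id))

    def pop(self):
        """Remove and return the ID of the next media to overwrite."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = SpareMediaQueue({"marketing": 0, "financial": 1})
queue.push("T1", "financial", date(2003, 1, 1))
queue.push("T2", "marketing", date(2004, 5, 1))
queue.push("T3", "marketing", date(2004, 1, 1))
order = [queue.pop() for _ in range(len(queue))]
```

Here marketing media are taken before financial media, and within a data type the oldest media is taken first, which corresponds to a first in, first out preference.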
[0059] Next, at step 404, a certain media identified in the data
structure may be retrieved for an overwrite operation, which would
overwrite portions of data previously stored on that media. Storage
device 120 may retrieve this media and confirm it is the correct
one by verifying its identity via an OML, a header file, or other
marking indicia at step 406 to ensure the correct media has been
selected for overwrite. If the media identity is verified, the
media may be overwritten at step 408 and tracked according to the
applicable retention preference or policy at step 410. If the media
identity is not verified, the system of FIG. 1 may perform the
appropriate discovery steps to locate the media in question at step
412 (e.g., by systematically searching through various media
libraries). If the media is located through discovery, the
verification procedure may be performed from step 404 going
forward. If found, and the media is the correct one, it can
continue on to step 408. If not, step 412 may be repeated several
times, and if the media is still not found, it may be determined as
lost at step 414 (e.g., by setting a flag in the media index
profile, or assigning the media to a "lost pool").
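The verify, discover, and mark-as-lost logic of steps 406 to 414 can be sketched as below. The function and parameter names, the retry limit of three, and the `FakeDrive` stand-in for storage device 120 are all assumptions made so the example is self-contained.

```python
def select_for_overwrite(expected_id, read_label, discover, max_attempts=3):
    """Return "overwrite" if the expected media is verified, else "lost".

    read_label() returns the label of the media currently loaded.
    discover(media_id) searches one more location and returns True if
    it found and loaded the media. Both are supplied by the caller.
    """
    if read_label() == expected_id:        # step 406: verify identity
        return "overwrite"                 # step 408
    for _ in range(max_attempts):          # step 412: discovery, repeated
        if discover(expected_id) and read_label() == expected_id:
            return "overwrite"
    return "lost"                          # step 414


class FakeDrive:
    """Stand-in for a storage device searching a list of libraries."""

    def __init__(self, loaded, libraries):
        self.loaded = loaded
        self.libraries = [list(lib) for lib in libraries]

    def read_label(self):
        return self.loaded

    def discover(self, media_id):
        if not self.libraries:
            return False
        library = self.libraries.pop(0)  # search the next library
        if media_id in library:
            self.loaded = media_id
            return True
        return False


found = FakeDrive("T009", [["T001"], ["T002", "T007"]])
result_found = select_for_overwrite("T007", found.read_label, found.discover)

missing = FakeDrive("T009", [["T001"]])
result_missing = select_for_overwrite("T007", missing.read_label, missing.discover)
```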
[0060] Although the steps shown above are illustrative of a general
embodiment of the invention, it will be understood these steps are
not intended to be comprehensive or necessarily performed in the
order shown.
[0061] Thus, systems and methods for recovering electronic
information from a storage medium are provided. It will be
understood that the foregoing is merely illustrative of the
principles of the present invention and that various modifications
can be made by those skilled in the art without departing from the
scope and spirit of the invention. Accordingly, such embodiments
will be recognized as within the scope of the present
invention.
[0062] Systems and modules described herein may comprise software,
firmware, hardware, or any combination(s) of software, firmware, or
hardware suitable for the purposes described herein. Software and
other modules may reside on servers, workstations, personal
computers, computerized tablets, PDAs, and other devices suitable
for the purposes described herein. Software and other modules may
be accessible via local memory, via a network, via a browser or
other application in an ASP context, or via other means suitable
for the purposes described herein. Data structures described herein
may comprise computer files, variables, programming arrays,
programming structures, or any electronic information storage
schemes or methods, or any combinations thereof, suitable for the
purposes described herein. User interface elements described herein
may comprise elements from graphical user interfaces, command line
interfaces, and other interfaces suitable for the purposes
described herein. Screenshots presented and described herein can be
displayed differently as known in the art to input, access, change,
manipulate, modify, alter, and work with information.
[0063] While the invention has been described and illustrated in
connection with preferred embodiments, many variations and
modifications as will be evident to those skilled in this art may
be made without departing from the spirit and scope of the
invention, and the invention is thus not to be limited to the
precise details of methodology or construction set forth above as
such variations and modifications are intended to be included within
the scope of the invention.
[0064] Persons skilled in the art will appreciate that the present
invention can be practiced by other than the described embodiments,
which are presented for purposes of illustration rather than of
limitation and that the present invention is limited only by the
claims that follow.
* * * * *