U.S. patent application number 14/080420 was filed with the patent office on 2013-11-14 and published on 2014-05-15 for a method and system for managing metadata in a storage environment.
This patent application is currently assigned to NETAPP, INC. The applicant listed for this patent is NETAPP, INC. Invention is credited to Gaurav Agarwal, Manish M. Agarwal, Anant Chaudhary, Sridher Jeyachandran, Varun Jobanputra, Sloan Johnson, Vikram Shukla.
Publication Number | 20140136483
Application Number | 14/080420
Family ID | 49596765
Publication Date | 2014-05-15
United States Patent Application | 20140136483
Kind Code | A1
Chaudhary; Anant; et al. | May 15, 2014
METHOD AND SYSTEM FOR MANAGING METADATA IN A STORAGE
ENVIRONMENT
Abstract
A method and system are provided for managing metadata for a
plurality of data containers that are stored at one or more storage
volumes in a storage system. The metadata is collected from one or
more storage volumes and then provided to a catalog module. The
catalog module pre-processes the metadata and then generates a
searchable data structure. The searchable data structure may then
be used to respond to a user request for information regarding the
storage system.
Inventors:
Chaudhary; Anant (Fremont, CA)
Agarwal; Gaurav (Menlo Park, CA)
Johnson; Sloan (San Francisco, CA)
Agarwal; Manish M. (Santa Clara, CA)
Jobanputra; Varun (San Jose, CA)
Shukla; Vikram (Fremont, CA)
Jeyachandran; Sridher (Santa Clara, CA)
Applicant:
Name | City | State | Country | Type
NETAPP, INC. | Sunnyvale | CA | US |
Assignee: NETAPP, INC., Sunnyvale, CA
Family ID: 49596765
Appl. No.: 14/080420
Filed: November 14, 2013
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
12706974 | Feb 17, 2010 | 8595237
14080420 | |
Current U.S. Class: 707/639
Current CPC Class: G06F 16/316 20190101; G06F 16/14 20190101
Class at Publication: 707/639
International Class: G06F 17/30 20060101
Claims
1-36. (canceled)
37. A machine implemented method, comprising: configuring a storage
volume as a catalog volume for storing metadata associated with a
plurality of data containers stored at a plurality of storage
volumes managed by a plurality of storage system nodes of a storage
system, each node executing a storage operating system for reading
and writing the plurality of data containers at storage space
associated with the plurality of storage volumes; collecting
metadata for the plurality of data containers; wherein after a
first snapshot is generated for the plurality of data storage
volume, metadata is collected only for data containers that were
changed after the first snapshot was generated; and storing
pre-processed metadata in a searchable data structure for the
plurality of data containers; wherein the searchable data structure
stores metadata for a plurality of directory entries and for
non-directory based data containers stored at the plurality of
storage volumes; and wherein each entry for the non-directory based
data containers stores a reference to a parent directory entry that
stores a storage path for a corresponding non-directory based data
container such that the storage path for each data container can be
obtained using the reference to the parent directory entry without
having to store individual storage paths for each non-directory
based data containers.
38. The method of claim 37, wherein the searchable data structure
includes a first searchable segment that stores attributes for the
plurality of directory entries including a unique directory
identifier, a directory name, a size of a directory, a permission
associated with a directory, a user identifier identifying a user
that uses a directory, a group identifier identifying a group
associated with a directory, an access time when a directory was
accessed, a creation time when a directory was created, a
modification time when a directory was modified and an indicator
indicating if a directory was created, modified or deleted.
39. The method of claim 38, wherein the searchable data structure
includes a second searchable segment that stores attributes for the
plurality of data containers, including a unique identifier for
identifying a data container, an identifier that associates a data
container to a directory entry, a data container name, a size of a
data container, a permission associated with a data container, a
user identifier identifying a user for a data container, a group
identifier identifying a group associated with a data container, an
access time when a data container was accessed, a creation time
when a data container was created, a modification time when a data
container was modified and an indicator indicating if a data
container was created, modified or deleted.
40. The method of claim 37, wherein the searchable data structure
includes a snapshot identifier for identifying a snapshot when the
plurality of data containers were replicated, a time stamp when the
snapshot was taken and a snapshot version indicator indicating the
snapshot when a change in status for the data container was
discovered.
41. The method of claim 39, wherein in response to a user request
for data container information, the first searchable segment and
the second searchable segment are used to obtain the requested
information.
42. The method of claim 37, wherein the searchable data structure
stores metadata associated with an active file system.
43. The method of claim 37, wherein the metadata is collected based
on an event, a schedule or a user request.
44. A non-transitory, machine readable storage medium storing
executable instructions, which when executed by a machine, causes
the machine to perform a method, the method comprising: configuring
a storage volume as a catalog volume for storing metadata
associated with a plurality of data containers stored at a
plurality of storage volumes managed by a plurality of storage
system nodes of a storage system, each node executing a storage
operating system for reading and writing the plurality of data
containers at storage space associated with the plurality of
storage volumes; collecting metadata for the plurality of data
containers; wherein after a first snapshot is generated for the
plurality of data storage volume, metadata is collected only for
data containers that were changed after the first snapshot was
generated; and storing pre-processed metadata in a searchable data
structure for the plurality of data containers; wherein the
searchable data structure stores metadata for a plurality of
directory entries and for non-directory based data containers
stored at the plurality of storage volumes; and wherein each entry
for the non-directory based data containers stores a reference to a
parent directory entry that stores a storage path for a
corresponding non-directory based data container such that the
storage path for each data container can be obtained using the
reference to the parent directory entry without having to store
individual storage paths for each non-directory based data
containers.
45. The storage medium of claim 44, wherein the searchable data
structure includes a first searchable segment that stores
attributes for the plurality of directory entries including a
unique directory identifier, a directory name, a size of a
directory, a permission associated with a directory, a user
identifier identifying a user that uses a directory, a group
identifier identifying a group associated with a directory, an
access time when a directory was accessed, a creation time when a
directory was created, a modification time when a directory was
modified and an indicator indicating if a directory was created,
modified or deleted.
46. The storage medium of claim 45, wherein the searchable data
structure includes a second searchable segment that stores
attributes for the plurality of data containers, including a unique
identifier for identifying a data container, an identifier that
associates a data container to a directory entry, a data container
name, a size of a data container, a permission associated with a
data container, a user identifier identifying a user for a data
container, a group identifier identifying a group associated with a
data container, an access time when a data container was accessed,
a creation time when a data container was created, a modification
time when a data container was modified and an indicator indicating
if a data container was created, modified or deleted.
47. The storage medium of claim 44, wherein the searchable data
structure includes a snapshot identifier for identifying a snapshot
when the plurality of data containers were replicated and a
snapshot version indicator indicating the snapshot when a change in
status for the data container was discovered.
48. The storage medium of claim 46, wherein in response to a user
request for data container information, the first searchable
segment and the second searchable segment are used to obtain the
requested information.
49. The storage medium of claim 44, wherein the searchable data
structure stores metadata associated with an active file
system.
50. The storage medium of claim 44, wherein the metadata is
collected based on an event, a schedule or a user request.
51. A system, comprising: a plurality of storage volumes managed by
a plurality of storage system nodes of a storage system, each node
having a processor for executing a storage operating system for
reading and writing a plurality of data containers at storage space
associated with the plurality of storage volumes; wherein a storage
volume is configured as a catalog volume for storing metadata
associated with the plurality of data containers; and a processor
executing instructions out of memory for: collecting metadata for
the plurality of data containers; wherein after a first snapshot is
generated for the plurality of data storage volume, metadata is
collected only for data containers that were changed after the
first snapshot was generated; and storing pre-processed metadata in
a searchable data structure for the plurality of data containers;
wherein the searchable data structure stores metadata for a
plurality of directory entries and for non-directory based data
containers stored at the plurality of storage volumes; and wherein
each entry for the non-directory based data containers stores a
reference to a parent directory entry that stores a storage path
for a corresponding non-directory based data container such that
the storage path for each data container can be obtained using the
reference to the parent directory entry without having to store
individual storage paths for each non-directory based data
containers.
52. The system of claim 51, wherein the searchable data structure
includes a first searchable segment that stores attributes for the
plurality of directory entries including a unique directory
identifier, a directory name, a size of a directory, a permission
associated with a directory, a user identifier identifying a user
that uses a directory, a group identifier identifying a group
associated with a directory, an access time when a directory was
accessed, a creation time when a directory was created, a
modification time when a directory was modified and an indicator
indicating if a directory was created, modified or deleted.
53. The system of claim 52, wherein the searchable data structure
includes a second searchable segment that stores attributes for the
plurality of data containers, including a unique identifier for
identifying a data container, an identifier that associates a data
container to a directory entry, a data container name, a size of a
data container, a permission associated with a data container, a
user identifier identifying a user for a data container, a group
identifier identifying a group associated with a data container, an
access time when a data container was accessed, a creation time
when a data container was created, a modification time when a data
container was modified and an indicator indicating if a data
container was created, modified or deleted.
54. The system of claim 51, wherein the searchable data structure
includes a snapshot identifier for identifying a snapshot when the
plurality of data containers were replicated and a snapshot version
indicator indicating the snapshot when a change in status for the
data container was discovered.
55. The system of claim 53, wherein in response to a user request
for data container information, the first searchable segment and
the second searchable segment are used to obtain the requested
information.
56. The system of claim 51, wherein the searchable data structure
stores metadata associated with an active file system.
57. The system of claim 51, wherein the metadata is collected based
on an event, a schedule or a user request.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This patent application is related to U.S. patent
application, entitled "METHOD AND SYSTEM FOR MANAGING METADATA IN A
CLUSTER BASED STORAGE ENVIRONMENT"; Docket No. P01-6210, Ser. No.
______, filed on even date herewith and the disclosure of which is
incorporated herein by reference in its entirety.
TECHNICAL FIELD
[0002] The present disclosure relates to storage systems.
BACKGROUND
[0003] Various forms of storage systems are used today. These forms
include direct attached storage (DAS), network attached storage
(NAS) systems, storage area networks (SANs), and others. Network
storage systems are commonly used for a variety of purposes, such
as providing multiple users with access to shared data, backing up
data and others.
[0004] A storage system typically includes at least one computing
system executing a storage operating system for storing and
retrieving data on behalf of one or more client processing systems
("clients"). The storage operating system stores and manages shared
data containers in a set of mass storage devices, such as magnetic
or optical disks or tapes.
[0005] In traditional storage environments, the operating system is
typically geared towards handling access to one object at a time.
Access to a group of data containers within a file system is
difficult because the operating system layout is such that metadata
for data containers, for example, a file name, attributes, access
control lists, and information regarding an owner of the data
container may not be stored contiguously at a storage device and
may be stored at different locations. Therefore, it is difficult
for an operating system to respond to user queries for information
regarding a data container or a group of data containers because
one typically has to traverse through a namespace and perform an
extensive directory search. The term namespace refers to a virtual
hierarchical collection of unique volume names or identifiers and
directory paths to the volumes, in which each volume represents a
virtualized container storing a portion of the namespace descending
from a single root directory. This is inefficient because metadata
information is stored at various locations and also a directory may
have a large number of files within a namespace. Continuous efforts
are being made to integrate managing data containers and the
metadata for the data containers.
SUMMARY
[0006] In one embodiment, a method and system are provided for
managing metadata for a plurality of data containers that are
stored at one or more storage volumes in a storage system. The
metadata is collected from one or more storage volumes and then
provided to a catalog module. The catalog module pre-processes the
metadata and then generates a searchable data structure. The
searchable data structure may then be used to respond to a user
request for information regarding the storage system.
[0007] In another embodiment, a machine implemented method for a
storage system is provided. The method includes configuring a data
storage volume for collecting metadata for a plurality of data
containers stored at the data storage volume. The metadata includes
at least an attribute that is associated with the plurality of data
containers. A storage volume is configured to operate as a catalog
volume for storing metadata associated with the plurality of data
containers. The metadata for the plurality of data containers is
collected and pre-processed by extracting one or more fields. The
pre-processed metadata is stored in a searchable data structure at
the catalog volume for responding to a user query requesting
information regarding the plurality of data containers.
[0008] In yet another embodiment, a machine implemented method for
a storage system for storing a plurality of data containers at one
or more storage volumes is provided. The method includes
pre-processing metadata associated with the plurality of data
containers where the metadata includes an attribute that is
associated with the plurality of data containers. A searchable data
structure is then generated by indexing the pre-processed metadata
such that information related to the plurality of data containers
is obtained regardless of a storage volume location.
[0009] In another embodiment, a machine implemented method for a
storage system for storing a plurality of data containers at one or
more storage volumes is provided. The method includes indexing
metadata associated with the plurality of data containers where the
metadata includes an attribute that is associated with the
plurality of data containers and the metadata is collected from at
least one storage volume. The indexed metadata is then stored in a
searchable data structure which may be used for obtaining
information regarding the plurality of data containers. The
searchable data structure stores a snapshot table identifier for
identifying a snapshot when the plurality of data containers were
replicated and a time stamp when the snapshot was taken.
[0010] This brief summary has been provided so that the nature of
this disclosure may be understood quickly. A more complete
understanding of the disclosure can be obtained by reference to the
following detailed description of the various embodiments thereof
in connection with the attached drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The foregoing features and other features will now be
described with reference to the drawings of the various
embodiments. In the drawings, the same components have the same
reference numerals. The illustrated embodiments are intended to
illustrate, but not to limit the present disclosure. The drawings
include the following Figures:
[0012] FIG. 1A shows a block diagram of a storage environment,
managed according to one embodiment;
[0013] FIG. 1B shows an example of a management application used
for managing the storage environment of FIG. 1A, according to one
embodiment;
[0014] FIG. 2 shows an example of a storage environment with a
cluster system, managed according to one embodiment;
[0015] FIGS. 3A and 3B show examples of a storage operating system,
used according to one embodiment;
[0016] FIG. 3C shows an example of an aggregate, used according to
one embodiment;
[0017] FIG. 3D shows an example of a namespace used according to
one embodiment;
[0018] FIG. 4A shows an example of a catalog system, according to
one embodiment;
[0019] FIG. 4B shows an example of the catalog system used in a
clustered storage environment, according to one embodiment;
[0020] FIGS. 4C-4F show examples of different data structures used
by the catalog system, according to one embodiment;
[0021] FIGS. 5A-5C show process flow diagrams, according to the
various embodiments of the present disclosure;
[0022] FIG. 6 shows an example of a node used in a cluster system,
according to one embodiment; and
[0023] FIG. 7 shows an example of a computing system for
implementing the process steps of the present disclosure.
DETAILED DESCRIPTION
Definitions
The following definitions are provided as they are typically (but
not exclusively) used in the computing/storage environment,
implementing the various adaptive embodiments described herein.
[0024] "Aggregate" is a logical aggregation of physical storage,
i.e., a logical container for a pool of storage, combining one or
more physical mass storage devices (e.g., disks) or parts thereof
into a single logical storage object, which includes or provides
storage for one or more other logical data sets at a higher level
of abstraction (e.g., volumes).
[0025] "CIFS" means the Common Internet File System Protocol, an
access protocol that client systems use to request file access
services from storage systems over a network.
[0026] "Data Container" means a block, a file, a logical unit of
data or any other information.
[0027] "FC" means Fibre Channel, a high-speed network technology
primarily used for storage networking. Fibre Channel Protocol (FCP)
is a transport protocol (similar to Transmission Control Protocol
(TCP) used in Internet Protocol ("IP") networks) which
predominantly transports SCSI commands over Fibre Channel
networks.
[0028] "iSCSI" means the Internet Small Computer System Interface,
an IP based storage networking standard for linking data storage
facilities. The standard allows carrying SCSI commands over IP
networks. iSCSI may be used to transmit data over local area
networks (LANs), wide area networks (WANs), or the Internet and can
enable location-independent data storage and retrieval.
[0029] "Metadata" refers to one or more attributes for a data
container, for example, a directory or data file. The attributes
include (a) a unique data container identifier, for example, an
inode number; (b) a data container type, i.e., whether the data
container is a directory, a file or another type; (c) information
regarding whether the data container was created, modified or
deleted; (d) a data container name (for example, NFS file name and
CIFS file name) and path; (e) an owner identifier, for example, an
NFS user identifier or a CIFS owner identifier; (f) a group
identifier, for example, an NFS group identifier (GID); (g) a data
container size; (h) permissions associated with the data container,
for example, NFS permission bits that provide information regarding
permissions associated with the data container; (i) time the data
container was accessed (access time); (j) time the data container
was modified (modification time); (k) time the data container was
created (creation time), when applicable; and (l) any other custom
fields that may be specified by a user or a storage system, for
example, access control lists (ACLs) or a named stream which is a
CIFS level feature that connects a file to a directory or any other
attribute.
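By way of illustration only, the attributes (a) through (l) listed above could be carried in a single record type. The following Python sketch is an assumption made for explanation; the class name, field names and types are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class ChangeType(Enum):
    """Attribute (c): whether the data container was created, modified or deleted."""
    CREATED = "created"
    MODIFIED = "modified"
    DELETED = "deleted"


@dataclass
class ContainerMetadata:
    """Hypothetical record holding attributes (a)-(l) for one data container."""
    container_id: int          # (a) unique identifier, e.g., an inode number
    container_type: str        # (b) "directory", "file", or another type
    change: ChangeType         # (c) created, modified or deleted
    name: str                  # (d) data container name
    path: str                  # (d) path to the data container
    owner_id: int              # (e) owner identifier
    group_id: int              # (f) group identifier
    size: int                  # (g) size in bytes (unit is an assumption)
    permissions: int           # (h) e.g., NFS permission bits
    access_time: float         # (i) access time
    modification_time: float   # (j) modification time
    creation_time: Optional[float] = None                 # (k) when applicable
    custom: Dict[str, str] = field(default_factory=dict)  # (l) custom fields


# Example record; all values are made up for illustration.
record = ContainerMetadata(
    container_id=96, container_type="file", change=ChangeType.CREATED,
    name="report.txt", path="/vol1/docs", owner_id=1001, group_id=100,
    size=4096, permissions=0o644, access_time=1.0, modification_time=1.0,
)
```

The creation time is optional because the definition states it is recorded only "when applicable", and the custom fields are kept in an open-ended mapping so that ACLs or similar attributes can be attached without changing the record layout.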
[0030] "Namespace" refers to a virtual hierarchical collection of
unique volume names or identifiers and directory paths to the
volumes, in which each volume represents a virtualized container
storing a portion of the namespace descending from a single root
directory. For example, each volume associated with a namespace can
be configured to store one or more data files, scripts, word
processing documents, executable programs and others. In a typical
storage system, the names or identifiers of the volumes stored on a
storage server can be linked into a namespace for that storage
server. The term global namespace refers to a virtual hierarchical
collection of unique volume names or identifiers and directory
paths to the volumes, in which the volumes are stored on multiple
server nodes within a clustered storage server system. The term
virtual in this context means a logical representation of an
entity.
[0031] "NFS" means Network File System, a protocol that allows a
user to access storage over a network.
[0032] "Snapshot" (without derogation to any trademark rights of
NetApp, Inc.) means a point in time copy of a storage file system.
A snapshot is a persistent point in time image of an active file
system that enables quick recovery of data after data has been
corrupted, lost, or altered. Snapshots can be created by copying
the data at each predetermined point in time to form a consistent
image, or virtually by using a pointer to form the image of the
data.
[0033] "Volume" is a logical data set which is an abstraction of
physical storage, combining one or more physical mass storage
devices (e.g., disks) or parts thereof into a single logical
storage object, and which is managed as a single administrative
unit, such as a single file system. A volume is typically defined
from a larger group of available storage, such as an aggregate.
[0034] As used in this disclosure, the terms "component", "module",
"system," and the like are intended to refer to a computer-related
entity, either a software-executing general purpose processor,
hardware, firmware or a combination thereof. For example, a
component may be, but is not limited to being, a process running on
a processor, a processor, an object, an executable, a thread of
execution, a program, and/or a computer.
[0035] By way of illustration, both an application running on a
server and the server can be a component. One or more components
may reside within a process and/or thread of execution, and a
component may be localized on one computer and/or distributed
between two or more computers. Also, these components can execute
from various computer readable media having various data structures
stored thereon. The components may communicate via local and/or
remote processes such as in accordance with a signal having one or
more data packets (e.g., data from one component interacting with
another component in a local system, distributed system, and/or
across a network such as the Internet with other systems via the
signal).
[0036] Computer executable components can be stored, for example,
on computer readable media including, but not limited to, an ASIC
(application specific integrated circuit), CD (compact disc), DVD
(digital video disk), ROM (read only memory), floppy disk, hard
disk, EEPROM (electrically erasable programmable read only memory),
memory stick or any other storage device, in accordance with the
claimed subject matter.
[0037] Storage Environment 100:
[0038] FIG. 1A shows an example of a non-cluster based storage
environment 100 where the various embodiments disclosed herein may
be implemented. Storage environment 100 is used to store a
plurality of data containers across a plurality of storage devices.
The embodiments disclosed herein provide a catalog system that
collects metadata for the plurality of data containers,
pre-processes the collected metadata and stores the pre-processed
information in one or more searchable data structures, for example,
a relational database. The searchable data structure may then be
used to search for information regarding the plurality of data
containers and respond to user queries with respect to the stored
data containers.
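By way of illustration only, a searchable data structure of this kind could be sketched as two relational tables, one for directory entries and one for non-directory data containers, where each file entry stores a reference to its parent directory entry (as recited in the claims) so that a storage path is kept once per directory rather than once per file. The sketch below uses Python's built-in sqlite3 module; the table names, column names and sample rows are hypothetical and are not taken from the disclosure.

```python
import sqlite3

# In-memory database standing in for the searchable data structure.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE directories (
        dir_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        path   TEXT NOT NULL          -- storage path stored once per directory
    );
    CREATE TABLE files (
        file_id INTEGER PRIMARY KEY,
        dir_id  INTEGER NOT NULL REFERENCES directories(dir_id),
        name    TEXT NOT NULL,
        size    INTEGER NOT NULL,
        owner   INTEGER NOT NULL
    );
""")

# Sample rows (made up for illustration).
conn.execute("INSERT INTO directories VALUES (1, 'docs', '/vol1/docs')")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?, ?)",
    [(10, 1, "a.txt", 100, 1001), (11, 1, "b.txt", 5000, 1002)],
)


def find_files_larger_than(min_size):
    """Return full paths of files above min_size, resolved via the parent directory."""
    rows = conn.execute(
        "SELECT d.path || '/' || f.name FROM files f "
        "JOIN directories d ON d.dir_id = f.dir_id "
        "WHERE f.size > ? ORDER BY f.file_id",
        (min_size,),
    ).fetchall()
    return [row[0] for row in rows]
```

A query such as `find_files_larger_than(1000)` is answered from the two tables alone, without traversing the namespace of the underlying storage volumes.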
[0039] Storage environment 100 may include a plurality of storage
systems 108, each coupled to a storage subsystem 111. A storage
subsystem 111 may include multiple mass storage devices 112a-112n
(may also be referred to as 112) that may be used to store a
plurality of data containers (for example, directory files and data
files) as well as the searchable data structure, as described
below. The mass storage devices in each storage subsystem 111 may
be, for example, conventional magnetic disks, optical disks such as
CD-ROM or DVD based storage, magneto-optical (MO) storage, flash
memory storage device or any other type of non-volatile storage
devices suitable for storing data.
[0040] Each storage subsystem 111 is managed by a corresponding
storage system 108. The storage devices in each storage subsystem
111 can be organized into one or more redundant array of
inexpensive disks ("RAID") groups, in which case the corresponding
storage system 108 accesses the storage subsystem 111 using an
appropriate RAID protocol.
[0041] Each storage system 108 may operate as a NAS based file
server, a block-based storage server such as used in a storage area
network (SAN), or a combination thereof, or a node in a clustered
environment described below with respect to FIG. 2, or any other
type of storage server. Note that certain storage systems from
NetApp Inc. in Sunnyvale, Calif., are capable of providing clients
with both file-level data access and block-level data access.
[0042] Storage environment 100 may also include a plurality of
client systems 104.1-104.2 (may also be referred to as 104), a
management console 120 executing a catalog module 119 and at least
one network 106 communicably connecting the client systems
104.1-104.2, storage system 108 and management console 120. The
client systems 104.1-104.2 may be connected to the storage systems
108 via the computer network 106, such as a packet-switched
network.
[0043] Clients 104.1-104.2 may be general purpose computers having
a plurality of components. These components may include a central
processing unit (CPU), main memory, I/O devices, and storage
devices (for example, flash memory, hard drives and others). The
main memory may be coupled to the CPU via a system bus or a local
memory bus. The main memory may be used to provide the CPU access
to data and/or program information that is stored in main memory at
execution time. Typically, the main memory is composed of random
access memory (RAM) circuits. A computer system with the CPU and
main memory is often referred to as a host system.
[0044] Processors executing instructions in storage system 108 and
client systems 104.1-104.2 communicate according to well-known
protocols, such as the NFS protocol or the CIFS protocol, to make
data stored on disk 112 appear to users and/or application programs
as though the data were stored locally on the client systems
104.1-104.2. The storage system 108 can present or export data
stored on disks 112 as a volume, or one or more qtree sub-volume
units, to each of the client systems 104.1-104.2. Each volume may
be configured to store data files, scripts, word processing
documents, executable programs, and the like. As described below in
more detail, a volume may be configured to operate as a "catalog
volume" that stores a searchable data structure with metadata
information regarding directories and data files stored on disks
112.
[0045] From the perspective of one of the client systems
104.1-104.2, each volume can appear to be a single disk drive.
However, each volume can represent the storage space in one disk,
an aggregate of some or all of the storage space in multiple disks,
a RAID group, or any other suitable set of storage space.
[0046] Specifically, each volume can include a number of
individually addressable files. For example, in a NAS
configuration, the files of a volume are addressable over the
computer network 106 for file-based access. In addition, an
aggregate is a fixed-sized volume built on top of a number of RAID
groups containing one or more virtual volumes or FlexVol®
flexible volumes.
[0047] In a typical mode of operation, one of the client systems
104.1-104.2 transmits one or more input/output commands, such as an
NFS or CIFS request, over the computer network 106 to the storage
system 108. Storage system 108 receives the request, issues one or
more I/O commands to storage device 112 to read or write the data
on behalf of the client system 104.1-104.2, and issues an NFS or
CIFS response containing the requested data over the network 106 to
the respective client system.
[0048] The management console 120 that executes storage management
application (may also be referred to as management application) 118
may be, for example, a conventional PC, workstation, or the like.
In another embodiment, management application 118 may also be
executed by storage system 108. The management application 118 may
be a module with executable instructions, typically used by a
storage network administrator to manage a pool of storage devices.
Management application 118 enables the administrator to perform
various operations, such as monitoring and allocating storage space
in the storage pool, creating and deleting volumes, directories and
others.
[0049] In one embodiment, management application 118 includes a catalog
module 119 that interfaces with storage system 108 for receiving
metadata, pre-processes the collected metadata and then stores it
in a searchable structure, for example, a relational database 115.
Although catalog module 119 is shown as a part of management
application 118, it may operate as a standalone application or may
also be integrated with the operating system of storage system 108.
Furthermore, although catalog module 119 is shown in the context of
a NAS in FIG. 1A, it can be used effectively in a direct attached
storage system (not shown) as well.
[0050] Communication between the storage management application
118, clients 104 and storage systems 108 may be accomplished using
any of the various conventional communication protocols and/or
application programming interfaces (APIs), the details of which are
not germane to the technique being introduced here. This
communication can be implemented through the network 106 or it can
be via a direct link (not shown) between the management console 120
and one or more of the storage systems 108.
[0051] One or more other storage-related applications may also be
operatively coupled to the network 106, residing and executing in
one or more other computer systems 121. Examples of such other
applications include data backup software, snapshot management
software and others. It is noteworthy that these applications may
also be running at storage system 108.
[0052] Storage Management Application 118:
[0053] FIG. 1B shows a block diagram of storage management
application 118 having catalog module 119, according to one
embodiment. In the illustrated embodiment, the storage management
application 118 may also include a graphical user interface (GUI)
module 122 to generate a GUI (e.g., for use by a storage
administrator); an Operations Manager 124 for managing storage
system 108, according to one embodiment; one or more other
management modules 126 to perform various other storage management
related functions; and a communication module 128.
[0054] The communication module 128 implements one or more
conventional communication protocols and/or APIs to enable the
storage management application 118 to communicate with the storage
system 108 and cluster system 114.
[0055] The storage management application 118 may also maintain
policies 130, a list 132 of all volumes in a storage pool as well
as a table 140 of all free space (on a per-disk basis) in a storage
pool. Policies 130 may be used to store configuration information,
based on which metadata is collected, pre-processed, indexed and
then stored in database 115. Details regarding database 115 are
provided below.
[0056] Clustered System:
[0057] The following describes a cluster based storage system (may
also be referred to as "clustered storage system" or "cluster
storage system") in a storage environment 200 of FIG. 2. The
clustered system is a scalable, distributed architecture that
stores data containers at different storage devices that are
managed by a plurality of nodes. When configured, metadata for each
node is collected and provided to an instance of catalog module 119
executed at each node. The metadata is pre-processed and then
stored in a searchable format. More details regarding processing of
metadata are provided below.
[0058] Storage environment 200 may include a plurality of client
systems 204.1-204.2 (may also be referred to as 204), a cluster
storage system 202, management console 120 and at least one
computer network 206 (similar to network 106) communicably
connecting the client systems 204.1-204.2 and a clustered storage
system 202.
[0059] The clustered storage system 202 includes a plurality of
nodes 208.1-208.3 (may also be referred to as 208), a cluster
switching fabric 210, and a plurality of mass storage devices such
as disks 212.1-212.3 (may also be referred to as disks 212, similar
to storage 112). Each of the plurality of nodes 208.1-208.3 in the
clustered storage system 202 provides the functionality of a
storage server. Clustered storage systems like the clustered
storage system 202 are available from NetApp, Inc.
[0060] Each of the plurality of nodes 208.1-208.3 may be configured
to include an N-module, a D-module, and an M-host, each of which
can be implemented as a separate software module. Specifically,
node 208.1 includes an N-module 214.1, a D-module 216.1, and an
M-host 218.1; node 208.2 includes an N-module 214.2, a D-module
216.2, and an M-host 218.2; and node 208.3 includes an N-module
214.3, a D-module 216.3, and an M-host 218.3.
[0061] The N-modules 214.1-214.3 (may also be referred to as 214)
include functionality that enables the respective nodes 208.1-208.3
to connect to one or more of the client systems 204.1-204.2 over
the computer network 206, while the D-modules 216.1-216.3 (may also
be referred to as 216) connect to one or more of the disks
212.1-212.3. The D-modules interface with a metadata collection
module (See FIG. 4B, 416) and provide metadata for a plurality of
data containers stored at one or more of disks 212.
[0062] The M-hosts 218.1-218.3 (may also be referred to as 218)
provide management functions for the clustered storage server
system 202. In one embodiment, each M-host 218 includes or
interfaces with an instance of catalog module 119 (similar to 410,
FIG. 4A) for receiving collected metadata, pre-processing the
collected metadata and then storing the information in a searchable
data structure.
[0063] A switched virtualization layer including a plurality of
virtual interfaces (VIFs) (may also be referred to as logical
interfaces (LIFs)) 220 is provided between the respective N-modules
214.1-214.3 and the client systems 204.1-204.2, allowing the disks
212.1-212.3 associated with the nodes 208.1-208.3 to be presented
to the client systems 204.1-204.2 as a single shared storage
pool.
[0064] In one embodiment, the clustered storage system 202 can be
organized into any suitable number of virtual servers (also
referred to as "vservers"), in which each vserver represents a
single storage system namespace with separate network access. Each
vserver has a user domain and a security domain that are separate
from the user and security domains of other vservers. Moreover,
each vserver is associated with one or more VIFs and can span one
or more physical nodes, each of which can hold one or more VIFs and
storage associated with one or more vservers. Client systems can
access the data on a vserver from any node of the clustered system,
but only through the VIFs associated with that vserver. The
interaction between a vserver and catalog module 119 is described
below with respect to FIG. 4B.
[0065] Each of the nodes 208.1-208.3 is defined as a computer
adapted to provide application services to one or more of the
client systems 204.1-204.2. In this context, a vserver is an
instance of an application service provided to a client system. The
nodes 208.1-208.3 are interconnected by the switching fabric 210,
which, for example, may be embodied as a Gigabit Ethernet switch.
Although FIG. 2 depicts an equal number (i.e., 3) of the N-modules
214.1-214.3, the D-modules 216.1-216.3, and the M-Hosts
218.1-218.3, any other suitable number of N-modules, D-modules, and
M-Hosts may be provided. There may also be different numbers of
N-modules, D-modules, and/or M-Hosts within the clustered storage
server system 202. For example, in alternative embodiments, the
clustered storage server system 202 may include a plurality of
N-modules and a plurality of D-modules interconnected in a
configuration that does not reflect a one-to-one correspondence
between the N-modules and D-modules.
[0066] The clustered storage server system 202 can include the
NETAPP.RTM. DATA ONTAP.RTM. storage operating system, available
from NetApp, Inc., that implements the WAFL.RTM. storage system, or
any other suitable storage operating system.
[0067] The client systems 204.1-204.2 of FIG. 2 may be implemented
as general-purpose computers configured to interact with the
respective nodes 208.1-208.3 in accordance with a client/server
model of information delivery.
[0068] Each client system 204.1, 204.2 may request the services of
one of the respective nodes 208.1, 208.2, 208.3, and that node may
return the results of the services requested by the client system
by exchanging packets over the computer network 206, which may be
wire-based, optical fiber, wireless, or any other suitable
combination thereof. The client systems 204.1-204.2 may issue
packets according to file-based access protocols, such as the NFS
protocol or the CIFS protocol, when accessing information in the
form of files and directories.
[0069] In a typical mode of operation, one of the client systems
204.1-204.2 transmits an NFS or CIFS request for data to one of the
nodes 208.1-208.3 within the clustered storage server system 202,
and the VIF 220 associated with the respective node receives the
client request. It is noted that each VIF 220 within the clustered
system 202 is a network endpoint having an associated IP address,
and that each VIF can migrate from N-module to N-module. The client
request typically includes a file handle for a data file stored in
a specified volume on one or more of the disks 212.1-212.3.
[0070] Specifically, each volume comprises a storage system subtree
that includes an index node file (an inode file) having a root
inode, and a set of directories and files contained under the root
inode. Each inode is a data structure allocated for a respective
data file to store metadata that describes the data file. For
example, an inode can contain data and pointers for use in
facilitating access to blocks of data within the data file, and
each root inode can contain pointers to a number of inodes.
[0071] Before describing the details of catalog module 119 and how
it interfaces with various components of storage environment 100
and 200, the following provides a description of a storage
operating system that may be used in storage environment 100 and
200, according to one embodiment.
[0072] Operating System:
[0073] FIG. 3A illustrates a generic example of an operating system
300 executed by a node 208.1 (and/or storage system 108), according
to one embodiment of the present disclosure. Operating system 300
interfaces with catalog module 119 via an interface 301. As
described below in more detail, operating system 300 provides
metadata to catalog module 119 to build a searchable data
structure.
[0074] In one example, operating system 300 may include several
modules, or "layers" executed by one or both of N-Module 214 and
D-Module 216. These layers include a file system manager 302 that
keeps track of a directory structure (hierarchy) of the data stored
in storage devices and manages read/write operations, i.e. executes
read/write operations on disks in response to client 204
requests.
[0075] Operating system 300 may also include a protocol layer 304
and an associated network access layer 308, to allow node 208.1 to
communicate over a network with other systems, such as clients 204
and storage management application 118. Protocol layer 304 may
implement one or more of various higher-level network protocols,
such as NFS, CIFS, Hypertext Transfer Protocol (HTTP), TCP/IP and
others, as described below.
[0076] Network access layer 308 may include one or more drivers,
which implement one or more lower-level protocols to communicate
over the network, such as Ethernet. Interactions between clients
104 and mass storage devices 112 are illustrated schematically as a
path, which illustrates the flow of data through operating system
300.
[0077] The operating system 300 may also include a storage access
layer 306 and an associated storage driver layer 310 to allow
D-module 216 to communicate with a storage device. The storage
access layer 306 may implement a higher-level disk storage
protocol, such as RAID while the storage driver layer 310 may
implement a lower-level storage device access protocol, such as FC
or SCSI. In one embodiment, the storage access layer 306 may
implement the RAID protocol, such as RAID-4 or RAID-DP.TM. (RAID
double parity for data protection provided by NetApp, Inc., the
assignee of the present disclosure).
[0078] In one embodiment, storage access layer 306 obtains metadata
for various data containers that may be stored in a data volume and
provides that information to catalog module 119. The information is
processed and then stored in a searchable data structure, as
described below.
[0079] FIG. 3B shows a detailed block diagram of the storage
operating system 300 that may be advantageously used with the
present invention. In this example, the storage operating system
comprises a series of processor executable layers organized to form
an integrated network protocol stack or, more generally, a
multi-protocol engine 325 that provides data paths for clients to
access information stored on the node using block and file access
protocols. In addition, the storage operating system includes a
series of processor executable layers organized to form a storage
server 365 that provides data paths for accessing information
stored on the disks 212.1 of the node 208.1. Both the
multi-protocol engine 325 and storage server 365 interface with the
storage management application 118 such that metadata for data
containers stored at disks 212 can be collected, processed and
searched, according to one embodiment.
[0080] N-blade 214 and D-blade 216 may interface with each other
using CF protocol 341. Both blades may also include interfaces 340a
and 340b to communicate with other nodes and systems.
[0081] The multi-protocol engine includes a media access layer 312
(part of layer 308, FIG. 3A) of network drivers (e.g., Gigabit
Ethernet drivers) that interfaces to network protocol layers (part
of layer 304, FIG. 3A), such as the IP layer 314 and its supporting
transport mechanisms, the TCP layer 316 and the User Datagram
Protocol (UDP) layer 315.
[0082] A file system protocol layer provides multi-protocol file
access and, to that end, includes support for the Direct Access
File System (DAFS) protocol 318, the NFS protocol 320, the CIFS
protocol 322 and the HTTP protocol 324.
[0083] A virtual interface ("VI") layer 326 implements the VI
architecture to provide direct access transport (DAT) capabilities,
such as RDMA (Remote Direct Memory Access), as required by the DAFS
protocol 318. An iSCSI driver layer 328 provides block protocol
access over the TCP/IP network protocol layers, while a FC driver
layer 330 receives and transmits block access requests and
responses to and from the node. The FC and iSCSI drivers provide
FC-specific and iSCSI-specific access control to the blocks and,
thus, manage exports of LUNS to either iSCSI or FCP or,
alternatively, to both iSCSI and FCP when accessing the blocks on
the node 208.1.
[0084] The storage server 365 includes a file system module 302 in
cooperating relation with a volume striping module (VSM) 370, a
RAID system module 380 and a disk driver system module 390.
[0085] The VSM 370 illustratively implements a striped volume set
(SVS). The VSM 370 cooperates with the file system 302 to enable
storage server 365 to service a volume of the SVS.
[0086] The RAID system 380 manages the storage and retrieval of
information to and from the volumes/disks in accordance with I/O
operations, while the disk driver system 390 implements a disk
access protocol such as, e.g., the SCSI protocol.
[0087] The file system 302 implements a virtualization system of
the storage operating system 300 through the interaction with one
or more virtualization modules illustratively embodied as, e.g., a
virtual disk (vdisk) module (not shown) and a SCSI target module
335. The SCSI target module 335 is generally disposed between the
FC and iSCSI drivers 330, 328 and the file system 302 to provide a
translation layer of the virtualization system between the block
(lun) space and the file system space, where luns are represented
as blocks.
[0088] The file system 302 is illustratively a message-based system
that provides logical volume management capabilities for use in
access to the information stored on the storage devices, such as
disks.
[0089] The file system 302 illustratively may implement a
write-anywhere file system having an on-disk format representation
that is block-based using, e.g., 4 kilobyte (KB) blocks and using
index nodes (inodes) to identify data containers and metadata for
the data container (such as creation time, access permissions, size
and others). The file system uses data containers to store metadata
describing the layout of its file system; these metadata data
containers include, among others, an inode data container. A data
container handle, i.e., an identifier that includes an inode number
(inum), may be used to retrieve an inode from disk.
[0090] Typically, the metadata as handled by file system 302 may
not be stored contiguously and may be spread out among different
storage volumes. This makes it difficult for the file system to
provide user requested information that can be derived from the
metadata. Hence, as described below in more detail, the present
catalog module 119 is being introduced to manage, organize and use
the metadata for the data containers.
[0091] Broadly stated, all inodes of the write-anywhere file system
are organized into the inode data container. A file system (fs)
info block specifies the layout of information in the file system
and includes an inode of a data container that includes all other
inodes of the file system. Each logical volume (file system) has an
fsinfo block that is preferably stored at a fixed location within,
e.g., a RAID group. The inode of the inode data container may
directly reference (point to) data blocks of the inode data
container or may reference indirect blocks of the inode data
container that, in turn, reference data blocks of the inode data
container. Within each data block of the inode data container are
embedded inodes, each of which may reference indirect blocks that,
in turn, reference data blocks of a data container.
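By way of illustration only, the direct and indirect block references described above may be sketched as follows. The sketch is written in Python; the function name data_blocks, the read_block callback and the notion of a numeric indirection level are assumptions made for the example and are not taken from the disclosure.

```python
def data_blocks(block_ptrs, level, read_block):
    """Yield the data block numbers reachable from a set of block pointers.

    level 0: the pointers reference data blocks directly.
    level n: the pointers reference indirect blocks, each of which holds
             pointers of level n - 1.
    read_block(ptr) is a hypothetical callback returning the list of
    pointers stored in the indirect block at ptr.
    """
    for ptr in block_ptrs:
        if level == 0:
            yield ptr
        else:
            # Follow the indirect block and descend one level.
            yield from data_blocks(read_block(ptr), level - 1, read_block)
```

An inode that references its data blocks directly corresponds to level 0; an inode that references one layer of indirect blocks corresponds to level 1.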
[0092] Operationally, a request from the client 204 is forwarded as
a packet over the computer network 206 and onto the node 208.1. A
network driver processes the packet and, if appropriate, passes it
on to a network protocol and file access layer for additional
processing prior to forwarding to the write-anywhere file system
302. Here, the file system generates operations to load (retrieve)
the requested data from disk 212 if it is not resident "in core",
i.e., in memory 604 (FIG. 6).
[0093] If the information is not in memory, the file system 302
indexes into the inode data container using the inode number (inum)
to access an appropriate entry and retrieve a logical vbn. The file
system then passes a message structure including the logical vbn to
the RAID system 380; the logical vbn is mapped to a disk identifier
and disk block number (disk,dbn) and sent to an appropriate driver
(e.g., SCSI) of the disk driver system 390. The disk driver 390
accesses the dbn from the specified disk 212 and loads the
requested data block(s) in memory for processing by the node. Upon
completion of the request, the node (and operating system) returns
a reply to the client 204.
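The read path described above may be sketched, purely as a non-limiting example, in Python. The class names, the in-memory dictionaries standing in for disks, and the round-robin mapping from a logical vbn to a (disk, dbn) pair are assumptions made for the example; the disclosure does not specify how the RAID system performs that mapping.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Inode:
    inum: int
    vbns: List[int]  # logical volume block numbers of the container's data


class RaidSystem:
    """Maps a logical vbn to (disk, dbn); simple striping is assumed here."""

    def __init__(self, disks):
        self.disks = disks  # one dict per disk: dbn -> bytes

    def map_vbn(self, vbn):
        return vbn % len(self.disks), vbn // len(self.disks)

    def read(self, vbn):
        disk, dbn = self.map_vbn(vbn)
        return self.disks[disk][dbn]


class FileSystem:
    def __init__(self, inode_container, raid):
        self.inode_container = inode_container  # inum -> Inode
        self.raid = raid
        self.cache = {}  # blocks already resident "in core", keyed by vbn

    def read(self, inum):
        inode = self.inode_container[inum]  # index by inode number
        blocks = []
        for vbn in inode.vbns:
            if vbn not in self.cache:       # not in memory: go to disk
                self.cache[vbn] = self.raid.read(vbn)
            blocks.append(self.cache[vbn])
        return b"".join(blocks)
```

A second read of the same data container is served from the cache without calling the RAID layer, which mirrors the "in core" check described above.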
[0094] It should be noted that the software "path" through the
operating system layers described above needed to perform data
storage access for a client request received at node 208.1 may
alternatively be implemented in hardware. That is, in an alternate
embodiment of the disclosure, the storage access request data path
may be implemented as logic circuitry embodied within a field
programmable gate array (FPGA) or an ASIC. This type of hardware
implementation increases the performance of the file service
provided by node 208.1 in response to a file system request issued
by client 204.
[0095] As used herein, the term "storage operating system"
generally refers to the computer-executable code operable on a
computer to perform a storage function that manages data access and
may, in the case of a node 208.1, implement data access semantics
of a general purpose operating system. The storage operating system
can also be implemented as a microkernel, an application program
operating over a general-purpose operating system, such as
UNIX.RTM. or Windows XP.RTM., or as a general-purpose operating
system with configurable functionality, which is configured for
storage applications as described herein.
[0096] In addition, it will be understood by those skilled in the
art that the invention described herein may apply to any type of
special-purpose (e.g., file server, filer or storage serving
appliance) or general-purpose computer, including a standalone
computer or portion thereof, embodied as or including a storage
system. Moreover, the teachings of this disclosure can be adapted
to a variety of storage system architectures including, but not
limited to, a network-attached storage environment, a storage area
network and a disk assembly directly-attached to a client or host
computer. The term "storage system" should therefore be taken
broadly to include such arrangements in addition to any subsystems
configured to perform a storage function and associated with other
equipment or systems. It should be noted that while this
description is written in terms of a write-anywhere file system,
the teachings of the present invention may be utilized with any
suitable file system, including a write-in-place file system.
[0097] FIG. 3C depicts three exemplary aggregates 392A, 392B, 392C,
which can be stored on one or more of the disks 212.1-212.3 of the
clustered storage server system 202 (see FIG. 2). As shown in FIG.
3C, each of the aggregates 392A, 392B, 392C contains two
representative volumes, in which each volume comprises a storage
system subtree. Specifically, the aggregate 392A contains two
volumes vol1, vol2; the aggregate 392B contains two volumes RT,
vol3; and the aggregate 392C contains two volumes vol4, vol5. In
the clustered storage server system 202, the names of the volumes
from the plurality of nodes 208.1-208.3 are linked into a global
namespace, allowing the client systems 204.1-204.2 to mount the
volumes from one of the nodes 208.1-208.3 with a high level of
flexibility.
[0098] FIG. 3D depicts an exemplary global namespace 394 composed
of the volumes RT, vol1, vol2, vol3, vol4, vol5. The global
namespace 394 may be maintained by storage operating system and may
be used in a cluster environment, for example, 200, FIG. 2. In the
global namespace 394, each volume RT, vol1-vol5 represents a
virtualized container storing a portion of the global namespace 394
descending from a single root directory. The volumes RT, vol1-vol5
are linked together in the global namespace 394 through a number of
junctions. A junction is an internal mount point which, to a
client, resolves to a directory (which would be the root directory
of the target volume). Such a junction can appear anywhere in a
volume, and can link a volume to the root directory of another
volume. For example, in the clustered system 202, a junction in the
volume vol3 associated with the D-module 216.2 links that volume to
the root directory of the volume vol4, which is associated with the
D-module 216.3. A junction can therefore link a volume on one of
the D-modules 216.1-216.3 to another volume on a different one of
the D-modules 216.1-216.3.
[0099] As shown in FIG. 3D, the global namespace 394 includes the
volume RT (i.e., the root volume), which has three junctions
linking the volume RT to the volumes vol1, vol2, vol3. The global
namespace 394 further includes the volume vol3, which has two
junctions linking the volume vol3 to the volumes vol4, vol5.
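As one hedged illustration, resolution of a client path through the junctions of FIG. 3D may be sketched in Python as follows. The volume names and links follow the figure; the junction directory names (for example "/vol1") and the function name resolve are assumptions made for the example, since the disclosure gives only the links and not the paths at which the junctions appear.

```python
# Junction table: volume -> {directory inside that volume: target volume}.
# The links follow FIG. 3D; the directory names are hypothetical.
JUNCTIONS = {
    "RT": {"/vol1": "vol1", "/vol2": "vol2", "/vol3": "vol3"},
    "vol3": {"/vol4": "vol4", "/vol5": "vol5"},
}


def resolve(path, root="RT"):
    """Return (volume, path inside that volume) for a global-namespace path."""
    volume, local = root, "/"
    for part in [p for p in path.split("/") if p]:
        candidate = local.rstrip("/") + "/" + part
        target = JUNCTIONS.get(volume, {}).get(candidate)
        if target is not None:
            # Crossed a junction: continue from the root of the target volume.
            volume, local = target, "/"
        else:
            local = candidate
    return volume, local
```

Under these assumptions, the path /vol3/vol4/reports/q1.doc crosses two junctions and resolves to /reports/q1.doc inside vol4, while a path that crosses no junction stays in the root volume RT.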
[0100] As shown in FIGS. 3C and 3D, data containers and the
metadata associated with the data containers may be spread out
among various volumes. In order to get information regarding data
containers, storage usage and other user queries that rely on
metadata information, one has to traverse the namespace and
evaluate individual directory entries. Catalog module 119, as
described below in detail, efficiently organizes the metadata in a
searchable data structure such that metadata can be easily searched
and hence utilized to process user requests.
[0101] Catalog System:
[0102] FIG. 4A shows an example of a catalog system 400 (may also
be referred to as system 400) that collects metadata for a
plurality of data containers, organizes the metadata (jointly
referred to as "cataloging"), and then provides user requested
information pertaining to the data containers, according to one
embodiment. As described below, system 400 may include various
modules, some of which may be executed by the management console,
by the storage system, or at the client level.
[0103] System 400 includes a catalog module 401 (similar to catalog
module 119, FIG. 1B) that may be executed by or integrated with
M-host 218 for a clustered environment 200 (FIG. 2) or operates as
a module (for example, 119, FIG. 1B) of management application 118
for storage environment 100 (FIG. 1A). Catalog module 401 includes
a catalog controller module (also referred to as "catalog
controller") 410 that interfaces with various modules and
implements various cataloging related process steps, as described
below.
[0104] Catalog controller 410 interfaces with a configuration
module 408 that stores configuration information regarding
cataloging metadata for a plurality of data containers at one or
more data volumes. Configuration information may include
information regarding how often metadata may be collected,
frequency and manner for indexing the collected data as well as
details regarding any actions/reports that a user may seek based on
the collected metadata. Configuration module 408 may be a memory
module that is accessible by catalog controller 410.
[0105] Catalog module 401 may also include a catalog scheduler 406
that interfaces with catalog controller 410 and schedules
cataloging jobs. The cataloging jobs may include collecting
metadata, arranging or indexing the collected metadata, generating
reports based on collected and indexed metadata, performing a
search based on a user request, as well as taking an action based
on the search results.
[0106] In one embodiment, catalog scheduler 406 receives a client
request or may create a job request based on configuration
information stored at configuration module 408. The job request may
be for collecting metadata, arranging or indexing the collected
metadata, generating reports based on the collected and indexed
metadata and performing a search based on a user request.
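A scheduler of the kind described above may be sketched as follows, as a minimal and non-limiting Python example. The class name CatalogScheduler, the configuration layout (job type mapped to a positive interval in seconds) and the method names submit and due are assumptions made for the example and do not appear in the disclosure.

```python
import heapq


class CatalogScheduler:
    def __init__(self, config):
        # config: job type -> interval in seconds (hypothetical layout;
        # intervals are assumed to be positive).
        self.config = config
        self.queue = []  # min-heap of (due_time, job_type)
        for job_type, interval in config.items():
            heapq.heappush(self.queue, (interval, job_type))

    def submit(self, job_type, now):
        """Queue a one-off job, for example a search requested by a client."""
        heapq.heappush(self.queue, (now, job_type))

    def due(self, now):
        """Return the job types that should run at time `now`."""
        jobs = []
        while self.queue and self.queue[0][0] <= now:
            _, job_type = heapq.heappop(self.queue)
            jobs.append(job_type)
            if job_type in self.config:
                # Recurring job created from configuration: reschedule it.
                heapq.heappush(self.queue, (now + self.config[job_type], job_type))
        return jobs
```

Jobs created from configuration recur at their configured interval, whereas a job submitted on behalf of a client runs once.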
[0107] System 400 may further include metadata collection module
416 (may also be referred to as a metadata collector module), a
pre-processing module 412 and a database engine 411. Metadata
collection module 416 is used to collect metadata from operating
system 300 for a plurality of data containers stored at a data
volume, for example, 418. The structure and operation of metadata
collection module 416 depends on the storage environment. For
example, in one embodiment, in storage environment 100, an instance
of metadata collection module 416 may be a part of storage system
108. In this example, metadata collection module 416 interfaces
with the file system 302 and obtains metadata regarding a plurality
of data containers stored within volume 418.
[0108] In another embodiment, for storage environment 200, metadata
collection module 416 may be executed at each node 208. In this
example, metadata collection module 416 interfaces with each
D-blade 216 to collect metadata for a plurality of data containers
that may be stored within a volume accessible to each node 208.
[0109] The information collected by metadata collector 416 depends
on user needs and how system 400 is configured. An example of the
type of information that is collected is provided below.
[0110] In one embodiment, metadata is collected for an initial
version of the plurality of data containers. This may be referred
to as "baseline" metadata information (or baseline image). Storage
environments typically maintain a snapshot of the file system and
the associated data containers. A file system manager (302, FIG.
3A) or any other module may take the actual snapshot and
communicate it to catalog controller 410. A snapshot, being a
point-in-time copy of the file system, may be used to restore a
storage file system to its state when the snapshot was taken.
[0111] A first snapshot for a data volume operates as a starting
point and once that is created, metadata for data containers that
may have changed after the first snapshot is collected and
processed. One process that may be used to obtain differential
information is called "SnapDiff", which is provided by NetApp, Inc.,
the assignee of the present application. Metadata collection module
416 may use the SnapDiff process to first obtain baseline metadata
information for the plurality of data containers that may be stored
in data volume 418. Once the baseline is established, metadata
collection module 416 may only collect information for data
containers that may have been created, modified or deleted from the
baseline snapshot. If there are no changes to data containers after
the baseline image, then metadata for those data containers is not
collected. It is noteworthy that system 400 may establish any
snapshot to be a baseline and then collect incremental metadata for
data containers that are modified or created after the baseline is
established.
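The baseline-plus-incremental collection described above may be illustrated by the following Python sketch. It is a generic comparison of two snapshot listings and is not the SnapDiff process itself, whose interface is not described here; the function name diff_snapshots and the representation of a snapshot as a dictionary keyed by inode number are assumptions made for the example.

```python
def diff_snapshots(baseline, current):
    """Compare two snapshots, each a dict of inode number -> attribute tuple.

    Returns a sorted list of (inum, change), where change is "created",
    "modified" or "deleted". Unchanged data containers are omitted, so no
    metadata is collected for them.
    """
    changes = []
    for inum, attrs in current.items():
        if inum not in baseline:
            changes.append((inum, "created"))
        elif baseline[inum] != attrs:
            changes.append((inum, "modified"))
    for inum in baseline:
        if inum not in current:
            changes.append((inum, "deleted"))
    return sorted(changes)
```

Comparing an empty listing against the first snapshot reports every data container as created, which corresponds to collecting the baseline; later comparisons report only the differences.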
[0112] Metadata collection module 416 provides the collected
metadata to catalog controller 410 via an interface 409 (similar to
interface 301, FIG. 3A). The collected metadata is initially
handled by pre-processing module 412 that receives the metadata and
stores it in an intermediate data structure (may also be referred
to as staging table or intermediate table) 413. Information from
the intermediate table 413 is then used by database engine 411 for
populating database 414 in a catalog volume 415. It is noteworthy
that although pre-processing module 412 is shown as a separate
module, it could be implemented as part of database engine 411.
[0113] The following provides an example of what information is
collected, pre-processed and stored in intermediate table 413 and
then stored in database 414 as a searchable data structure.
[0114] Metadata collection module 416 may collect the following
information from file system 302:
[0115] (a) Unique data container identifier, for example, an inode
number; (b) a data container type, i.e. if the data container is a
directory, file and others; (c) information regarding whether the
data container was accessed, created, modified or deleted; (d) a
data container name (for example, NFS file name and CIFS file name)
and path; (e) an owner identifier, for example, an NFS user
identifier (UID) or a CIFS owner identifier; (f) a group
identifier, for example, an NFS group identifier (GID); (g) a data
container size; (h) permissions associated with the data container,
for example, NFS permission bits that provide information regarding
permissions associated with the data container; (i) time the data
container was accessed (access time); (j) time the data container
was modified (modification time); (k) time the data container was
created (creation time), when applicable; and (l) any other user
specified fields.
[0116] The pre-processing module 412 takes the foregoing
information, extracts a plurality of fields and populates them in
intermediate data structure 413. For example, pre-processing module
412 extracts the unique identifier value, the NFS and CIFS
accessible path where the data container resides, data container
name (i.e. NFS and CIFS accessible name), and extension of the data
container that identifies a property of the container, for example,
if a data container is a data file xyz.doc, then the pre-processing
module extracts the ".doc".
[0117] The pre-processing module 412 also extracts information to
identify the data container type, i.e. a file, directory or others,
creation time of the data container, last time it was accessed and
modified, if applicable. The pre-processing module 412 separates
UID, GID, permission bits and the size of the data container.
[0118] In case the data container is a part of a directory and a
snapshot, the pre-processing module 412 generates a unique
identifier that identifies the snapshot. The pre-processing module
412 also generates a flag that identifies whether the data
container was created, modified or deleted.
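The field extraction described in the preceding paragraphs can be sketched as follows. This is only an illustrative sketch: the record layout, the dictionary keys and the function name preprocess are assumptions made for the example and are not taken from the disclosure.

```python
import posixpath

def preprocess(record, snapshot_id, change_flag):
    """Split one raw metadata record into the fields of a staging row.

    The input keys ("inode", "path", ...) are hypothetical; a real
    collector would supply whatever the file system reports.
    """
    directory, name = posixpath.split(record["path"])
    _, extension = posixpath.splitext(name)
    return {
        "id": record["inode"],          # unique data container identifier
        "path": directory,              # path where the container resides
        "name": name,                   # data container name
        "extension": extension,         # e.g. ".doc" for xyz.doc
        "type": record["type"],         # file, directory or other
        "uid": record["uid"],
        "gid": record["gid"],
        "mode": record["mode"],         # permission bits
        "size": record["size"],
        "atime": record["atime"],
        "mtime": record["mtime"],
        "ctime": record.get("ctime"),   # creation time, when applicable
        "flag": change_flag,            # "A", "M" or "D"
        "snapshot_id": snapshot_id,
    }

raw = {
    "inode": 42, "path": "/a/d/xyz.doc", "type": "file",
    "uid": 1001, "gid": 100, "mode": 0o644, "size": 2048,
    "atime": 1262304000, "mtime": 1262304000,
}
row = preprocess(raw, snapshot_id=1, change_flag="A")
```

Each resulting row corresponds to one entry that could be placed in the intermediate table before the database engine consumes it.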
[0119] Once the intermediate table 413 is populated, database
engine 411 takes that information and then either creates database
414, if one does not exist, or modifies an existing database 414. In
one embodiment, database 414 may be a relational database that
includes one or more components. Database 414 may include a
plurality of searchable segments that are described below in
detail. A user may request information regarding data containers
and catalog module 401 provides user requested information using
database 414.
[0120] A reporting module 407 is also provided such that user
requested information may be compiled into reports. The layout and
structure of the report will depend on the user needs and the user
preferences. The user may set these reporting preferences using
management application 118 via a user interface.
[0121] Before describing the details of database 414, the following
provides an example of using catalog system 400 in storage
environment 429 (similar to 200) as shown in FIG. 4B, according to
one embodiment. Each node in storage environment 429 may execute an
instance of catalog module 401 that is described above with respect
to FIG. 4A. Each node may also execute an instance of metadata
collection module 416 (shown as C 416.1-416.n) to collect metadata
from D-blades 216.1-216.n.
[0122] Storage environment 429 includes a plurality of volumes,
namely 430a-430g. Volumes 430a and 430b are managed by D1 216.1,
volumes 430c-430e are managed by D2 216.2 and volumes 430f and 430g
are managed by Dn 216.n. The volumes in storage environment 429 may
be provided to different virtual servers via VIFs 431, 433, 435 and
437. For example, VIF 431 provides access to volume 430d, VIF 433
provides access to volume 430c, VIF 435 provides access to volume
430d and VIF 437 provides access to volume 430f. Catalog module 401
manages metadata for the various vservers as if they were
individual nodes.
[0123] Metadata collection module 416.1 collects metadata for
volumes 430a and 430b. Catalog module 401 at node 208.1 then
preprocesses the metadata and stores it at catalog volume 432a.
Metadata collection module 416.2 collects metadata for volumes
430c-430e. The collected and pre-processed metadata is then stored
at catalog volume 434. Similarly, metadata collection module 416.n
collects metadata for volumes 430f and 430g, which is then stored at
catalog volume 432b.
[0124] In one embodiment, a query involving metadata stored at
different catalog volumes (for example, 432a, 432b and 434) may be
generated. The catalog module at the node where the query is
generated, gathers metadata from different catalog volumes and then
the results are aggregated together and presented, as requested by
the query. For example, when catalog module at node 208.2 receives
a request for information regarding data containers stored at
volumes 430a-430g, then catalog module 401 gathers information from
catalog volumes 432a, 432b and 434 and presents the aggregated
information to the user.
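The gather-and-aggregate step described above can be sketched as follows. This is a hedged illustration only: each catalog volume is modeled as a separate in-memory SQLite database, and the table layout, the sample rows and the helper names make_catalog and query_all are assumptions made for the example.

```python
import sqlite3

def make_catalog(rows):
    """Create a stand-in for the database held on one catalog volume."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (id INTEGER, name TEXT, size INTEGER)")
    db.executemany("INSERT INTO files VALUES (?, ?, ?)", rows)
    return db

# Hypothetical contents for catalog volumes 432a, 432b and 434.
catalogs = {
    "432a": make_catalog([(31, "f.txt", 100), (41, "g.txt", 200)]),
    "432b": make_catalog([(51, "e.jpeg", 4000)]),
    "434": make_catalog([(61, "k.txt", 50)]),
}

def query_all(catalogs, sql, params=()):
    """Run the same query on every catalog volume and merge the rows."""
    results = []
    for volume, db in sorted(catalogs.items()):
        for row in db.execute(sql, params):
            results.append((volume,) + tuple(row))
    return results

txt_files = query_all(
    catalogs,
    "SELECT name, size FROM files WHERE name LIKE ? ORDER BY name",
    ("%.txt",),
)
```

Tagging each row with the catalog volume it came from keeps the aggregated result traceable to its source.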
[0125] It is noteworthy that the systems disclosed herein, for
example, 429 are scalable. Based on storage space utilization and
overall performance, one can assign any volume to operate as a
catalog volume. One can also add new catalog volumes to store
metadata. Furthermore, a same volume may be configured to store
data containers and metadata.
[0126] FIG. 4C shows an example of a data structure 440 (may also
be referred to as snapshot table 440) having a plurality of columns
that may be used by database engine 411 to index metadata for
database tables 414. Snapshot table 440 may be a stand-alone table
or integrated with database 414. Table 440 may also be stored on a
per-volume basis on each catalog volume, for example, 432a-b and
434, as shown in FIG. 4B.
[0127] Snapshot table 440 may include a plurality of fields
440A-440F. Field 440A (ID) may be used to identify a snapshot
itself. Field 440B (NAME) may be used to name the snapshot. Field
440C (Creation Time) may be used to store the time when the
snapshot was taken. Field 440D (Index_Start_Time) may be used to
store the time when indexing of the metadata collected for a
particular snapshot started. Field 440E (Index_End_Time) may be
used to store the time when indexing of that metadata ended.
[0128] The metadata for a particular snapshot may be indexed based
on a schedule that may be established by a user during storage
system configuration, based on a request generated by the user, or
when initiated by a management application depending on whether the
overall storage system is busy with other tasks or idle. The
indexing itself can be optimized such that it does not negatively
impact the overall performance of the storage environment.
[0129] Field 440F (ATTR) may be used to store attribute information
regarding a snapshot. For example, field 440F may include a
snapshot version indicator indicating a snapshot when a change in
status for the data container was discovered.
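One possible relational layout for snapshot table 440 is sketched below using SQLite. The column names and types, the sample rows, and the convention that the index end time stays empty until indexing completes are assumptions made for illustration only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE snapshots (
        id INTEGER PRIMARY KEY,      -- field 440A (ID)
        name TEXT NOT NULL,          -- field 440B (NAME)
        creation_time INTEGER,       -- field 440C (Creation Time)
        index_start_time INTEGER,    -- field 440D (Index_Start_Time)
        index_end_time INTEGER,      -- field 440E (Index_End_Time)
        attr TEXT                    -- field 440F (ATTR)
    )
""")
# Hypothetical rows: Snap0 is fully indexed, Snap1 is still in progress.
db.execute("INSERT INTO snapshots VALUES (0, 'Snap0', 1000, 1010, 1025, NULL)")
db.execute("INSERT INTO snapshots VALUES (1, 'Snap1', 2000, 2005, NULL, NULL)")

# Snapshots whose indexing has started but not yet finished.
pending = [row[0] for row in db.execute(
    "SELECT name FROM snapshots "
    "WHERE index_start_time IS NOT NULL AND index_end_time IS NULL")]
```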
[0130] Besides the fields shown in FIG. 4C, other fields may also
be added. One such field may be referred to as a "tag". A tag is a
user defined field that one can add, for example, a user may want
to identify all files that are labeled as "confidential" by using a
"confidential" tag. The systems and processes described herein
allow one to search for metadata based on the tags.
[0131] Table 440 may be used to determine if there are any
snapshots. The snapshots themselves may be taken by the file system
manager 302 (or any other module) and communicated to the catalog
module via catalog interface 301 (see FIG. 3A). In one embodiment,
whenever a snapshot is taken, file system manager 302 may send a
notification to the catalog module.
[0132] When the first snapshot is taken, then metadata collected
for that snapshot may be used as a baseline image for database
table 414 (FIG. 4A), as described below. As more snapshots are
taken, metadata for data containers that were created, modified or
deleted from the initial snapshot is collected and indexed, as
described below. If there is no change in the data containers after
the initial snapshot, then no metadata is collected for the
unchanged data containers.
[0133] FIGS. 4D and 4E show examples of data structures of database
414 generated by database engine 411 of catalog module 401,
according to one embodiment. Database 414 may be a relational
database having a plurality of searchable segments that logically
interface with each other. For example, database 414 may include a
directories table 450 and a data container table 452. The first
searchable segment is directory table 450, which may include
information regarding all the directories for a data volume that is
configured to be cataloged, for example, 418 (FIG. 4A). Directory
table 450 may include a plurality of fields 450A-450M that are
described below.
[0134] Field 450A (Identifier) may be a unique identifier to
identify a directory, for example, an inode number, an inode
generation number or both. Field 450B (Parent) identifies a
"parent" for the directory. The parent in this case is an upper
level directory to which the directory identified by 450A may
belong. Field 450C provides a directory path.
[0135] Field 450D provides a name for the directory. Field 450E
provides a directory size. Field 450F (Mode) provides the
permissions associated with a directory. The permissions indicate
what level of authority a user has with respect to a particular
directory. Permissions may range from being able to read the
directory entry to be able to create, modify or delete the entry
and other permission types.
[0136] Field 450G identifies the owner of the directory, shown as
"uid". Field 450H identifies a group to which the directory may
belong, shown as "Gid". In an enterprise having different
business groups, for example, engineering, sales, marketing, legal
and others, a storage system may be divided among different
entities. Field 450H identifies the group to which a particular
directory belongs.
[0137] Field 450I (Atime) provides a time when the directory was
last accessed, while field 450J (Ctime) provides a time when the
directory was created. Field 450K (Mtime) includes a time when the
directory was modified. Field 450L includes a flag that indicates
whether an entry was added (by using a flag "A"), modified (by
using a flag "M") or deleted (by using a flag "D").
[0138] Field 450M identifies a snapshot to which the directory may
belong. This may be similar to field 440A shown in FIG. 4C.
[0139] Database 414 may also include a second searchable segment,
for example, a data container table that may store metadata
information regarding a plurality of data containers. FIG. 4E shows
an example of data container table 452 that stores information
regarding a plurality of data containers, for example, files. Each
file in the data container table 452 is associated with an entry in
the directories table 450. This allows one to include a path for a
file only once in the directories table and one does not have to
copy the path in data container table 452 every time the metadata
for the file is indexed.
[0140] Data container table 452 may include various fields
452A-452L. Field 452A identifies the file with a unique identifier,
for example, an inode number, an inode generation number or both.
Field 452B associates a parent to the data container identified by
field 452A. This field maps to an entry in the directory table 450.
Because of this cross reference to the directory table, one does
not have to enter the data container path for all individual data
container entries. This saves memory space and processing time. For
example, if there are one million files in a storage system, if one
tried to save the paths for all one million files, it would take
space and processing time. Instead, in one embodiment, field 452B
cross references to a directory entry in data structure 450 where
the path for each entry in data structure 452 is located.
[0141] In another embodiment, the structure of cross-referencing
files to directory entry also reduces processing time when a
directory is renamed. For example, if each file had an entry that
provided the storage path and directory name, then one would have
to go and change the entries for each individual file. Using the
foregoing scheme, one only has to update directory names and
individual path entries do not need to be updated.
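The rename case can be illustrated with a small SQLite sketch. The table and column names and the new directory name "docs" are hypothetical; the sketch also assumes the renamed directory has no sub-directories whose own path entries would need to change.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE directories (id INTEGER PRIMARY KEY, parent INTEGER,
                              path TEXT, name TEXT);
    CREATE TABLE files (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT);
    INSERT INTO directories VALUES (40, 10, '/a/', 'd');
    INSERT INTO files VALUES (41, 40, 'g.txt'), (42, 40, 'h.doc'),
                             (43, 40, 'i.cpp');
""")

def paths():
    """Rebuild every file path from the parent directory entry."""
    sql = ("SELECT d.path || d.name || '/' || f.name FROM files AS f "
           "JOIN directories AS d ON f.parent = d.id ORDER BY f.id")
    return [row[0] for row in db.execute(sql)]

before = paths()
# Renaming the directory touches one row in the directories table only.
cursor = db.execute("UPDATE directories SET name = ? WHERE id = ?",
                    ("docs", 40))
rows_touched = cursor.rowcount
after = paths()
```

No row in the files table is written, yet every derived file path reflects the new directory name.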
[0142] Field 452C includes a data container name, for example, a
file name, while field 452D includes a size of the data container.
Field 452E (Mode) identifies the permissions that may be associated
with the data container. This includes whether a user is permitted
to simply read the data container content, modify it or delete
it.
[0143] Field 452F (UID) identifies the owner of the data container,
while field 452G (GID) identifies the group to which the data
container belongs.
[0144] Field 452H (Atime) identifies the time the data container was
last accessed, field 452I (Ctime) identifies the time it was created,
while field 452J (Mtime) identifies the time the data container was
modified, if applicable. Field 452K is a flag that indicates
whether the data container was created (A), modified (M) or deleted
(D). Field 452L identifies the snapshot, if applicable, to which the
file belongs. This identifier is similar to 450M in table 450.
[0145] The following example explains the various entries of FIGS.
4D and 4E: Directories "a" and "b" are identified as 10 and 20 by
identifier 450A in FIG. 4D. Directories "a" and "b" are parent
directories as shown by directory path 450C entry "/". Directories
"c" and "d" are identified as 30 and 40 and are sub-directories
under parent directory "a".
[0146] File f.txt as identified by file name 452C (FIG. 4E) is
stored at "/a/c". The path can be obtained by using the cross
referenced parent directory entry 30 under 452B (FIG. 4E). Files
g.txt, h.doc, i.cpp and j.pdf as identified by file name 452C are
stored at "/a/d" as shown by the parent identifier 40. File e.jpeg
is stored under sub-directory "b" based on parent identifier
20.
[0147] It is noteworthy that although FIGS. 4D and 4E show examples
of different database tables 450 and 452, the adaptive embodiments
are not limited to having separate tables. In one embodiment, the
files and directory tables 450 and 452 may be included in a single
table but differentiated by an identifier, for example, a Snapshot
identifier.
[0148] FIG. 4F shows an example of populating directory and data
container tables at time t0 and time t1. The directory table at
time t0 identifies the inodes 10, 20, 30 and 40 under field 450A.
The parent fields are specified as 0, 0, 10 and 10 under field
450B. The directory path is shown as /, /, /a/ and /a/ under field
450C. The names of the directories are provided as "a", "b", "c"
and "d" under field 450D.
[0149] The files or data containers at time t0 are also shown in
the data container table labeled as Files0. For example, field 452A
provides the inode numbers 31, 41, 42, 43, 44 and 51 for files
f.txt, g.txt, h.doc, i.cpp, j.pdf and e.jpeg, respectively. Each
file is associated with a parent under field 452B, i.e. 30, 40, 40,
40, 40 and 20 respectively.
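The time t0 entries listed above can be loaded into two tables and joined to recover a full path, as in the following sketch. SQLite, the table and column names and the helper full_path are assumptions chosen for illustration; the row values are the ones given in the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE directories (id INTEGER PRIMARY KEY, parent INTEGER,
                              path TEXT, name TEXT);
    CREATE TABLE files (id INTEGER PRIMARY KEY,
                        parent INTEGER REFERENCES directories(id),
                        name TEXT);
""")
# Fields 450A (id), 450B (parent), 450C (path) and 450D (name) at time t0.
db.executemany("INSERT INTO directories VALUES (?, ?, ?, ?)", [
    (10, 0, "/", "a"), (20, 0, "/", "b"),
    (30, 10, "/a/", "c"), (40, 10, "/a/", "d"),
])
# Fields 452A (id), 452B (parent) and 452C (name) at time t0.
db.executemany("INSERT INTO files VALUES (?, ?, ?)", [
    (31, 30, "f.txt"), (41, 40, "g.txt"), (42, 40, "h.doc"),
    (43, 40, "i.cpp"), (44, 40, "j.pdf"), (51, 20, "e.jpeg"),
])

def full_path(file_name):
    """Resolve a file's path through its parent directory entry."""
    row = db.execute(
        "SELECT d.path || d.name || '/' || f.name "
        "FROM files AS f JOIN directories AS d ON f.parent = d.id "
        "WHERE f.name = ?", (file_name,)).fetchone()
    return row[0] if row else None
```

The files table stores only a parent identifier per file; the path string is assembled at query time from the single directory row.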
[0150] At time t1, another snapshot is taken and metadata for the
snapshot at time t1 (may be referred to as Snap1) is shown as 450'
and 452'. Under Snap1, directory z gets created under /b/ as
indicated by the flag "A" which means added, directory c is moved
from /a/c to /b/c and directory /a/d is modified.
[0151] In the Files1 table, at Snap1, file y.txt is created under
/b/z, file j.pdf is modified and file h.doc is deleted.
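The added, modified and deleted flags for the Files1 table can be derived as sketched below. This is an assumption-laden illustration: it compares two complete listings, whereas an actual system may obtain the changes directly from the file system. The inode number 61 for y.txt, the parent number 60 for directory z and the modification times are hypothetical.

```python
def diff_snapshots(previous, current):
    """Return {inode: flag} with "A" (added), "M" (modified), "D" (deleted).

    Unchanged data containers produce no entry, mirroring incremental
    collection.
    """
    changes = {}
    for inode, meta in current.items():
        if inode not in previous:
            changes[inode] = "A"
        elif meta != previous[inode]:
            changes[inode] = "M"
    for inode in previous:
        if inode not in current:
            changes[inode] = "D"
    return changes

# inode -> (name, parent, mtime); the mtime values are made up.
files0 = {
    31: ("f.txt", 30, 100), 41: ("g.txt", 40, 100), 42: ("h.doc", 40, 100),
    43: ("i.cpp", 40, 100), 44: ("j.pdf", 40, 100), 51: ("e.jpeg", 20, 100),
}
files1 = dict(files0)
del files1[42]                      # h.doc is deleted
files1[44] = ("j.pdf", 40, 200)     # j.pdf is modified
files1[61] = ("y.txt", 60, 200)     # y.txt is created under /b/z

flags = diff_snapshots(files0, files1)
```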
[0152] In one embodiment, database segments 450 and 452 may be used
efficiently to respond to user queries for information regarding
data containers that can be obtained by searching metadata
information. Since metadata fields are organized in a relational
database, one can search through the database to provide user
requested information. The information type of course may vary
based on a user request.
[0153] As shown above, database 414 is split into multiple logical
tables 450 and 452. This is efficient and saves storage space
because the data container tables (or file tables) do not include
the path for every file entry. Instead, each
data container (for example, a file) is associated with a parent
(or directory) identifier in a directories table. To access a data
container, one simply has to look at the parent entry and ascertain
the path where the data container is stored.
[0154] Process Flow:
[0155] FIG. 5A shows a block diagram for using system 400 for
collecting metadata, pre-processing and indexing the pre-processed
metadata to build database 414, according to one embodiment. The
process begins in block S500 when a storage volume is configured to
operate as a catalog volume. A storage administrator having
appropriate permissions and using management application 118
configures the storage volume as a catalog volume (for example, 415,
FIG. 4A) to store database 414.
[0156] The storage administrator may also configure one or more
data volumes, for example, 418 (FIG. 4A) or 430a-430g (FIG. 4B)
such that metadata for the data containers stored at the data
volumes can be collected, indexed and then stored at the catalog
volume. The storage administrator may associate one or more data
volumes to a particular catalog volume. The storage administrator
may specify a collection frequency which determines how often the
metadata is collected. The storage administrator may also specify
certain events based on which the metadata may be collected. For
example, the storage administrator may specify that when a new
snapshot is taken, metadata should be collected for the data
containers that may have changed from a previous snapshot of the
same data volume.
[0157] In block S502, metadata is collected by metadata collection
module 416. In one embodiment, metadata is collected based on a
user specified schedule as described above. In another instance,
metadata may be collected based on an event, for example, a
snapshot. In yet another embodiment, a user may send a request to
collect metadata for a data volume.
[0158] The metadata that is collected by metadata collection module
416 may be for a baseline snapshot. This means that metadata is
collected for all the data containers stored at the data volume.
When there are changes to the data containers and a snapshot is
taken at a later instance, then metadata is collected for only the
changed data containers. Incremental metadata collection is
efficient because one does not have to repeat the metadata
collection step for all the data containers including data
containers that may not have changed from a previous instance.
[0159] In one embodiment, for a clustered environment, the metadata
collection module 416 is executed at one or more nodes and collects
metadata associated with data volumes that are accessible to the
node. The metadata may be collected from operating system 300 that
maintains information regarding all the data containers at the
selected data volume.
[0160] After the metadata is collected, it is pre-processed and
placed at intermediate table 413 in block S504. One reason for
pre-processing the metadata is because the metadata received from
the operating system may be of a different format and one may have
to extract one or more fields so that the information can be placed
in database 414 and used efficiently to respond to user requests as
described below. An example of how fields are extracted from the
collected metadata and placed at intermediate table 413 has been
described above.
[0161] After the metadata is pre-processed, the information from
intermediate table 413 is indexed. The indexing is based on one or
more fields that have been described above with respect to the
database 414 tables.
[0162] The indexing in block S506 may be based on a policy that is
set up by a user and stored in configuration module 408 (FIG. 4A).
The policy allows a user to set indexing of metadata collected
after each snapshot. The indexing may be "on-demand" i.e. based on
when a user or storage administrator sends a request to start
indexing. In another embodiment, indexing may be time based such
that catalog controller 410 starts indexing based on a set
schedule. The indexing policy settings make the system and process
flexible because users in different storage environments may use
different policies for indexing metadata based on user needs.
[0163] After the pre-processed metadata is indexed in block S506,
it is stored in database 414. In one embodiment, the stored
metadata is placed in a searchable relational database 414. An
example of searchable database 414 is described above with respect
to FIGS. 4C-4F.
[0164] In one embodiment, for a clustered environment, database 414
may be stored at one or more volumes that may be referred to as
catalog volumes. Metadata collected from different nodes may be
stored at the catalog volumes. Catalog controller 410 can access a
volume locator database (VLDB) 403 (FIG. 4A) (or 220, FIG. 2A) that
identifies different volumes and their locations. This allows the
catalog controller to cross reference the volume identifiers with
the collected metadata.
[0165] FIG. 5B shows a process flow diagram for handling query
requests using database 414, according to one embodiment. The
process begins in block S508 when a user request is received by
catalog module 401. The user request may be received via a user
interface that is provided by client 402 (FIG. 4A). The request is
received by catalog interface 404 and forwarded to scheduler 406.
Scheduler 406 may maintain one or more queues for receiving user
requests. The user request is then forwarded to catalog controller
410. In another embodiment, the query may be scheduled by the user
based on a specified duration or an event, as described above.
[0166] In block S510, the query is forwarded to database engine 411
so that user requested information can be obtained from database
414. Catalog controller 410 parses the user request to ascertain
what fields in database 414 may need to be searched. For example,
if the user wants to know how many ".pdf" files belong to a
particular group, then catalog controller will search file name
452C and group identifier 452G to respond to the query.
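The ".pdf" example above could translate into a query like the one sketched here. SQLite, the table layout, the sample rows and the function name count_by_extension_and_group are assumptions made for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, "
           "uid INTEGER, gid INTEGER)")
# Hypothetical entries; name corresponds to field 452C, gid to field 452G.
db.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", [
    (44, "j.pdf", 1001, 100),
    (45, "k.pdf", 1002, 100),
    (46, "l.pdf", 1003, 200),
    (42, "h.doc", 1001, 100),
])

def count_by_extension_and_group(extension, gid):
    """Count files whose name ends in the extension and belong to gid."""
    pattern = "%" + extension
    row = db.execute(
        "SELECT COUNT(*) FROM files WHERE name LIKE ? AND gid = ?",
        (pattern, gid)).fetchone()
    return row[0]
```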
[0167] In block S512, the user requested information is presented
to the user. The information may be displayed in a user interface
on a display device. The information may be presented as a report
by reporting module 407.
[0168] In block S514, an action that may need to be taken, based on
the search results is performed. The nature and action type may be
based on user request. For example, a user request may be to obtain
information regarding certain file types for example, video files.
The action associated with the file type may be to move the certain
file type from one volume to another volume. Catalog controller 410
obtains the file types by searching database 414 that stores
information regarding file types. Thereafter, catalog controller
communicates with operating system 300 to move the files from the
first location to one or more locations. This example is provided
to illustrate the adaptive nature of the various embodiments and
not to limit the various embodiments shown herein.
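The search-then-act pattern in this example can be sketched as follows. The extension list, the catalog entries, the target volume name "430c" and the helper names are hypothetical, and the move itself is simulated rather than delegated to an operating system.

```python
VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov")  # assumed set of video types

def find_by_extension(catalog, extensions):
    """Select catalog entries whose name ends with one of the extensions."""
    return [entry for entry in catalog
            if entry["name"].lower().endswith(extensions)]

def run_action(entries, action):
    """Apply the requested action to every search result."""
    return [action(entry) for entry in entries]

def move_to(target_volume):
    """Build an action that records the new volume for an entry.

    A real system would ask the storage operating system to move the
    data; here a copy of the entry is returned with the volume changed.
    """
    def action(entry):
        return {**entry, "volume": target_volume}
    return action

catalog = [
    {"id": 1, "name": "demo.MP4", "volume": "430a"},
    {"id": 2, "name": "notes.txt", "volume": "430a"},
    {"id": 3, "name": "clip.mov", "volume": "430b"},
]

matches = find_by_extension(catalog, VIDEO_EXTENSIONS)
moved = run_action(matches, move_to("430c"))
```

Keeping the search and the action as separate steps lets the same query results feed a report, a move or any other user-specified action.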
[0169] FIG. 5C shows a process flow diagram for collecting metadata
and then processing user requests in a clustered system, according
to one embodiment. The process begins in block S516 when a storage
volume is configured for collecting metadata. Referring back to
FIG. 4B, the different volumes 430a-430g associated with different
virtual servers may be configured to collect metadata.
[0170] In block S518, metadata is collected from a plurality of
nodes. The metadata is collected by metadata collection module 416
executed by the plurality of nodes and then stored at one or more
catalog volumes (432a, 432b and 434, FIG. 4B). An example of this
is shown in FIG. 4B, where metadata collection modules 416.1-416.n
are executed at each node and collect metadata for data volumes
that are configured in block S516.
[0171] In block S520, metadata collected from different volumes and
controlled by different nodes is pre-processed and stored in
database 414 at catalog volume 434 (FIG. 4B). The pre-processing is
performed so that information from the collected metadata can be
used to populate database 414. The collected metadata may arrive in
an order determined by the storage operating system. The collected
metadata may include more information than what may be needed by
catalog module 401. The pre-processing is performed such that
catalog module can extract the relevant fields and values that are
used in database 414. Details regarding pre-processing and database
414 are provided above with respect to FIGS. 4A-4F and FIGS.
5A-5B.
[0172] In block S522, a user query for information regarding a
plurality of data structures that may be stored at different
volumes and controlled by different nodes is received. The user
query is received by catalog module 401 via a user interface
provided by management application 118. A user may request
different information types for the plurality of data structures.
The type of user query and the nature of information that the user
may seek depends on how a user is using storage environment
200.
[0173] In block S524, database 414 is used to search for
information requested by the user. Searching database 414 is faster
and less taxing on computing resources vis-a-vis performing a
directory "walk" analyzing metadata for millions of files. For
example, to determine how many files were accessed within a certain
duration, one only has to search using field 452H and ascertain the
number of files within the specified duration. One is able to do
that because of the way database 414 is structured and built.
[0174] In some instances, an action may be associated with a search
query. When an action is associated with a search query, then the
requested action associated with the search results is performed in
block S526. For example, a user may configure a volume such that
after every snapshot, certain file types may be moved to another
location. To accommodate this action, after every snapshot, first
database 414 is searched to ascertain the file types and then
operating system 300 is notified to move the file types.
[0175] In one embodiment, using catalog system 400 and the process
steps described above, one can efficiently search metadata for data
containers stored at one or more data volumes both in a clustered
environment 200 and non-cluster environment 100. In traditional
storage environments, the operating system is typically geared
towards handling access to one object at a time. Access to a group
of files within a file system is difficult. Furthermore, the
operating system layout is such that metadata for a data container,
for example, a file name, attributes, access control lists,
information regarding the owner may not be stored contiguously at
the storage devices. Therefore, to access information regarding a
data container or a group of data containers, one has to traverse
through a namespace and perform a directory search.
[0176] The embodiments disclosed herein efficiently search for data
containers using relational database 414 and its associated tables.
For example, one can search for "all files greater than size 1 MB
that were not accessed within the last year" by searching data
structure 452. One can use the size field 452D and access time
field 452H to filter all files that may be greater than 1 MB and
were not accessed within one year, without having to do an
extensive namespace based directory search.
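The size and access time filter described above can be expressed as a single query, as sketched here. SQLite, the table layout, the sample rows, the fixed reference time NOW and the choices of 1 MB as 1024 * 1024 bytes and one year as 365 days are assumptions made for illustration.

```python
import sqlite3

ONE_MB = 1024 * 1024
ONE_YEAR = 365 * 24 * 60 * 60      # seconds
NOW = 1_700_000_000                # fixed reference time for the example

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, "
           "size INTEGER, atime INTEGER)")
# size corresponds to field 452D and atime to field 452H.
db.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", [
    (1, "old_big.iso", 5 * ONE_MB, NOW - 2 * ONE_YEAR),
    (2, "new_big.iso", 5 * ONE_MB, NOW - 3600),
    (3, "old_small.txt", 10, NOW - 2 * ONE_YEAR),
    (4, "edge.bin", ONE_MB, NOW - 2 * ONE_YEAR),
])

def large_stale_files(min_size, now, max_idle):
    """Names of files larger than min_size not accessed within max_idle."""
    sql = ("SELECT name FROM files WHERE size > ? AND atime < ? "
           "ORDER BY name")
    return [row[0] for row in db.execute(sql, (min_size, now - max_idle))]

stale = large_stale_files(ONE_MB, NOW, ONE_YEAR)
```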
[0177] In one embodiment, catalog system 400 integrates metadata
management related operations as well as data container related
operations within a storage environment. In conventional systems,
typically, one vendor provides an operating system 300 and a
different vendor provides a separate system for handling metadata
related operations. Catalog system 400 is integrated with operating
system 300 and management application 118. Hence, one does not need
to use another third party module for handling metadata related
operations.
[0178] In one embodiment, metadata related operations are executed
efficiently because catalog system 400 is integrated with operating
system 300. This allows one to use operating system 300's ability
to collect metadata efficiently. If one were to use an external,
third party system, then one will have to scan an entire file
system using other techniques, compared to the techniques that are
integrated with the operating system.
[0179] In one embodiment, because metadata is handled efficiently,
one can provide useful reports to users such that users can
efficiently use the storage space. The reports are provided by
reporting module 407 and management application 118 via a user
interface. The data for the reports is provided by catalog module
401 and formatted and presented by management application 118.
[0180] Reports can be configured based on user specified
parameters, for example, users may want to know what different
types of files are being used, for example, media files, ".doc"
files and others. In conventional systems, to gather that
information, one will have to traverse through a
namespace/directory that may include millions of files. In the
embodiments disclosed herein, one can obtain this information from
database 414 by searching field 452C, the data container name,
whose extension indicates the file type.
This is faster and more efficient than searching through a
directory that may include millions of files.
[0181] The embodiments herein also allow a user to generate reports
based on different users that use the storage space. For example,
by searching database 414 using fields 452C and owner
identification field 452F, one can ascertain which users are using
a certain file type. One can also view usage of storage space based
on groups, by using the group identifier 452G. One can do this
efficiently because of the manner in which the relational database
414 is structured.
[0182] In another embodiment, reports can be generated based on
volumes that are spread out in a clustered environment 200. Because
metadata is collected for different nodes and efficiently cataloged
at one or more catalog volumes (for example, 434, FIG. 4B), one is
able to obtain an overall view of the clustered system, as well as
node based view. A storage administrator can issue cluster wide
requests and catalog module can obtain information regarding the
entire cluster or for specific volumes. One can obtain all this
information without having to perform an entire file system search
that can be resource intensive and inefficient.
[0183] In yet another embodiment, one can not only generate reports
and perform fast queries but also perform actions that may be
related to the search results. For example, a user may want to know
how many files of a certain type, for example, .mp3, are saved in
the storage system and then move the files to a different storage
environment. One can conduct an efficient search using database 414
and then perform the appropriate action. This allows a user to
efficiently use storage space. Continuing with the foregoing
example, if the .mp3 files are not being accessed or used
frequently and the user has access to secondary storage that is
also used infrequently, then the user can move the files to that
secondary storage.
[0184] This allows a user to efficiently manage and use storage
resources. The user can obtain storage system usage views
efficiently by using database 414 and based on user needs perform
the appropriate actions for moving information around.
[0185] The embodiments disclosed herein allow a user to search for
data containers based on a data container owner, name of the data
container, modification time, access time and type of data
container and other fields. The search may be performed by
combining different fields. For example, a user can search which
owners and groups use the highest amount of storage as well as the
least amount of storage. One can then apportion storage cost to
individuals, teams and business units.
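A per-owner or per-group usage report of the kind described above could be produced with a grouped query, as in this sketch. SQLite, the table layout, the sample rows and the function name usage_by are assumptions made for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, "
           "size INTEGER, uid INTEGER, gid INTEGER)")
# Hypothetical entries; uid corresponds to field 452F, gid to field 452G.
db.executemany("INSERT INTO files VALUES (?, ?, ?, ?, ?)", [
    (1, "a.bin", 500, 1001, 100),
    (2, "b.bin", 300, 1002, 100),
    (3, "c.bin", 1000, 1003, 200),
    (4, "d.bin", 100, 1001, 100),
])

def usage_by(column):
    """Total bytes per owner ("uid") or group ("gid"), largest first."""
    if column not in ("uid", "gid"):
        raise ValueError("column must be 'uid' or 'gid'")
    sql = (f"SELECT {column}, SUM(size) AS total FROM files "
           f"GROUP BY {column} ORDER BY total DESC")
    return db.execute(sql).fetchall()

by_owner = usage_by("uid")
by_group = usage_by("gid")
```

The first and last rows of each result identify the heaviest and lightest consumers, which is the information needed to apportion storage cost.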
[0186] Since metadata is collected incrementally for different
snapshots, one can look at the growth of storage between snapshots.
This will allow storage administrators to plan better for upgrading
or downgrading storage space, based on business need.
[0187] It is noteworthy that the systems and processes described
herein are not limited to collecting metadata for Snapshots but
instead catalog module may catalog metadata for an active file
system.
[0188] Storage System Node:
[0189] FIG. 6 is a block diagram of a node 208.1 (FIG. 2) that is
illustratively embodied as a storage system comprising a
plurality of processors 602A and 602B, a memory 604, a network
adapter 610, a cluster access adapter 612, a storage adapter 616
and local storage 613 interconnected by a system bus 608. The local
storage 613 comprises one or more storage devices, such as disks,
utilized by the node to locally store configuration information
(e.g., in a configuration table 614).
[0190] The cluster access adapter 612 comprises a plurality of
ports adapted to couple node 208.1 to other nodes of cluster 202.
In the illustrative embodiment, Ethernet may be used as the
clustering protocol and interconnect media, although it will be
apparent to those skilled in the art that other types of protocols
and interconnects may be utilized within the cluster architecture
described herein. In alternate embodiments where the N-modules and
D-modules are implemented on separate storage systems or computers,
the cluster access adapter 612 is utilized by the N/D-module for
communicating with other N/D-modules in the cluster 202.
[0191] Each node 208.1 is illustratively embodied as a dual
processor storage system executing a storage operating system 606
that preferably implements a high-level module, such as a file
system, to logically organize the information as a hierarchical
structure of named directories, files and special types of files
called virtual disks (hereinafter generally "blocks") on disks
212.1. However, it will be apparent to those of ordinary skill in
the art that the node 208.1 may alternatively comprise a
single-processor system or a system with more than two processors.
Illustratively, one processor 602A
executes the functions of the N-module 214.1 on the node, while the
other processor 602B executes the functions of the D-module
216.1.
[0192] The memory 604 illustratively comprises storage locations
that are addressable by the processors and adapters for storing
programmable instructions and data structures. The processors and
adapters may, in turn, comprise processing elements and/or logic
circuitry configured to execute the programmable instructions and
manipulate the data structures. It will be apparent to those
skilled in the art that other processing and memory means,
including various computer readable media, may be used for storing
and executing program instructions pertaining to the invention
described herein.
[0193] The storage operating system 300, portions of which are
typically resident in memory and executed by the processing
elements, functionally organizes the node 208.1 by, inter alia,
invoking storage operations in support of the storage service
implemented by the node. An example of operating system 300 is the
DATA ONTAP® (registered trademark of NetApp, Inc.) operating
system available from NetApp, Inc. that implements a Write Anywhere
File Layout (WAFL® (registered trademark of NetApp, Inc.)) file
system. However, it is expressly contemplated that any appropriate
storage operating system may be enhanced for use in accordance with
the inventive principles described herein. As such, where the term
"ONTAP" is employed, it should be taken broadly to refer to any
storage operating system that is otherwise adaptable to the
teachings of this invention.
[0194] The network adapter 610 comprises a plurality of ports
adapted to couple the node 208.1 to one or more clients 204.1/204.2
over point-to-point links, wide area networks, virtual private
networks implemented over a public network (Internet) or a shared
local area network. The network adapter 610 thus may comprise the
mechanical, electrical and signaling circuitry needed to connect
the node to the network. Illustratively, the computer network 106
may be embodied as an Ethernet network or a Fibre Channel (FC)
network. Each client 204.1/204.2 may communicate with the node over
network 106 by exchanging discrete frames or packets of data
according to pre-defined protocols, such as TCP/IP.
[0195] The storage adapter 616 cooperates with the storage
operating system 300 executing on the node 208.1 to access
information requested by the clients. The information may be stored
on any type of attached array of writable storage device media such
as video tape, optical, DVD, magnetic tape, bubble memory,
electronic random access memory, micro-electro mechanical and any
other similar media adapted to store information, including data
and parity information. However, as illustratively described
herein, the information is preferably stored on disks 212.1. The
storage adapter 616 comprises a plurality of ports having
input/output (I/O) interface circuitry that couples to the disks
over an I/O interconnect arrangement, such as a conventional
high-performance, FC link topology.
[0196] Storage of information on each array 212.1 is preferably
implemented as one or more storage volumes that comprise a
collection of physical storage disks 212.1 cooperating to define an
overall logical arrangement of volume block number (vbn) space on
the volume(s). Each logical volume is generally, although not
necessarily, associated with its own file system. The disks within
a logical volume/file system are typically organized as one or more
groups, wherein each group may be operated as a RAID. Most RAID
implementations, such as a RAID-4 level implementation, enhance the
reliability/integrity of data storage through the redundant writing
of data "stripes" across a given number of physical disks in the
RAID group, and the appropriate storing of parity information with
respect to the striped data. An illustrative example of a RAID
implementation is a RAID-4 level implementation, although it should
be understood that other types and levels of RAID implementations
may be used in accordance with the inventive principles described
herein.
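The parity idea behind a RAID-4 style layout can be sketched with byte-wise XOR: the parity block is the XOR of the data blocks in a stripe, so any single lost block can be rebuilt from the survivors and the parity. This is a generic illustration of XOR parity, not code from the storage operating system described here; the block contents and the helper name `xor_blocks` are assumptions.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))

# One stripe: three data blocks, each held on a separate data disk.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block would be stored on the dedicated parity disk.
parity = xor_blocks(data)

# Simulate losing the second data disk and rebuilding its block
# from the surviving data blocks plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, the same function serves for computing parity and for reconstruction.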
[0197] Processing System:
[0198] FIG. 7 is a high-level block diagram showing an example of
the architecture of a processing system in which
the executable instructions described above can be implemented. The
processing system 700 can represent management console 120, for
example. Note that certain standard and well-known components which
are not germane to the present invention are not shown in FIG.
7.
[0199] The processing system 700 includes one or more processors
702 and memory 704, coupled to a bus system 705. The bus system 705
shown in FIG. 7 is an abstraction that represents any one or more
separate physical buses and/or point-to-point connections,
connected by appropriate bridges, adapters and/or controllers. The
bus system 705, therefore, may include, for example, a system bus,
a Peripheral Component Interconnect (PCI) bus, a HyperTransport or
industry standard architecture (ISA) bus, a small computer system
interface (SCSI) bus, a universal serial bus (USB), or an Institute
of Electrical and Electronics Engineers (IEEE) standard 1394 bus
(sometimes referred to as "Firewire").
[0200] The processors 702 are the central processing units (CPUs)
of the processing system 700 and, thus, control its overall
operation. In certain embodiments, the processors 702 accomplish
this by executing executable instructions 706 stored in memory 704.
A processor 702 may be, or may include, one or more programmable
general-purpose or special-purpose microprocessors, digital signal
processors (DSPs), programmable controllers, application specific
integrated circuits (ASICs), programmable logic devices (PLDs), or
the like, or a combination of such devices.
[0201] Memory 704 represents any form of random access memory
(RAM), read-only memory (ROM), flash memory, or the like, or a
combination of such devices. Memory 704 includes the main memory of
the processing system 700. Instructions 706, which may be used to
implement the techniques introduced above (e.g., catalog module 401),
may reside in and be executed (by processors 702) from memory 704.
[0202] Also connected to the processors 702 through the bus system
705 are one or more internal mass storage devices 710, and a
network adapter 712. Internal mass storage devices 710 may be or
may include any conventional medium for storing large volumes of
data in a non-volatile manner, such as one or more magnetic or
optical based disks. The network adapter 712 provides the
processing system 700 with the ability to communicate with remote
devices (e.g., storage servers 202) over a network and may be, for
example, an Ethernet adapter, a Fibre Channel adapter, or the like.
The processing system 700 also includes one or more input/output
(I/O) devices 708 coupled to the bus system 705. The I/O devices
708 may include, for example, a display device, a keyboard, a
mouse, etc.
[0203] Thus, a method and apparatus for managing metadata for data
containers have been described. Note that references throughout
this specification to "one embodiment" or "an embodiment" mean
that a particular feature, structure or characteristic described in
connection with the embodiment is included in at least one
embodiment of the present invention. Therefore, it is emphasized
and should be appreciated that two or more references to "an
embodiment" or "one embodiment" or "an alternative embodiment" in
various portions of this specification are not necessarily all
referring to the same embodiment. Furthermore, the particular
features, structures or characteristics being referred to may be
combined as suitable in one or more embodiments of the invention,
as will be recognized by those of ordinary skill in the art.
[0204] While the present disclosure is described above with respect
to what is currently considered its preferred embodiments, it is to
be understood that the disclosure is not limited to that described
above. To the contrary, the disclosure is intended to cover various
modifications and equivalent arrangements within the spirit and
scope of the appended claims.
* * * * *