U.S. patent application number 10/387188, for a method and apparatus for seamless management for disaster recovery, was filed with the patent office on 2003-03-11 and published on 2004-09-16.
This patent application is currently assigned to Hitachi, Ltd. Invention is credited to Fujibayashi, Akira.
Application Number | 20040181707 10/387188 |
Family ID | 32961845 |
Filed Date | 2003-03-11 |
United States Patent
Application |
20040181707 |
Kind Code |
A1 |
Fujibayashi, Akira |
September 16, 2004 |
Method and apparatus for seamless management for disaster
recovery
Abstract
A method, apparatus, article of manufacture, and system are
presented for establishing redundant computer resources. According
to one embodiment, in a system including a plurality of processor
devices and a plurality of storage devices, the processor devices,
the storage devices and the management server being connected via a
network, the method comprises storing device information relating
to the processor devices and the storage devices and topology
information relating to topology of the network, identifying at
least one primary computer resource, selecting at least one
secondary computer resource suitable to serve as a redundant
resource corresponding to the at least one primary computer
resource based on the device information and the topology
information, and assigning the at least one secondary computer
resource as a redundant resource corresponding to the at least one
primary computer resource.
Inventors: |
Fujibayashi, Akira;
(Sunnyvale, CA) |
Correspondence
Address: |
TOWNSEND AND TOWNSEND AND CREW, LLP
TWO EMBARCADERO CENTER
EIGHTH FLOOR
SAN FRANCISCO
CA
94111-3834
US
|
Assignee: |
Hitachi, Ltd.
Tokyo
JP
|
Family ID: |
32961845 |
Appl. No.: |
10/387188 |
Filed: |
March 11, 2003 |
Current U.S.
Class: |
714/6.12 |
Current CPC
Class: |
G06F 11/008 20130101;
G06F 11/2069 20130101; G06F 11/20 20130101; G06F 16/275 20190101;
Y10S 707/99953 20130101; G06F 11/2094 20130101; H04L 1/22
20130101 |
Class at
Publication: |
714/006 |
International
Class: |
H04L 001/22 |
Claims
What is claimed is:
1. A method for a management server to establish redundant computer
resources in a system including a plurality of processor devices
and a plurality of storage devices, the processor devices, the
storage devices and the management server being connected via a
network, said method comprising: storing device information
relating to the processor devices and the storage devices and
topology information relating to topology of the network, the
device information including at least software information thereon;
identifying at least one primary computer resource, said at least
one primary computer resource including at least one primary
processor device and at least one portion of storage implemented in
at least one primary storage device; selecting at least one
secondary computer resource suitable to serve as a redundant
resource corresponding to said at least one primary computer
resource based on the device information and the topology
information, said at least one secondary computer resource
including at least one secondary processor device and at least one
portion of storage implemented in at least one secondary storage
device, said at least one secondary processor device being a member
of a cluster which said at least one primary processor device is a
member of, said at least one portion of storage implemented in said
at least one secondary storage device being accessible from said at
least one secondary processor device; and assigning said at least
one secondary computer resource as a redundant resource
corresponding to said at least one primary computer resource.
2. The method of claim 1 wherein if the at least one primary
storage device has storage-based remote mirroring function, the at
least one secondary computer resource is selected such that the at
least one secondary storage device also has storage-based remote
mirroring function and is accessible from the at least one primary
storage device.
3. The method of claim 1 wherein said at least one secondary
computer resource is selected based on at least one user-specified
policy.
4. The method of claim 2 wherein said at least one user-specified
policy includes performance of said at least one secondary computer
resource.
5. The method of claim 2 wherein said at least one user-specified
policy includes reliability of said at least one secondary computer
resource.
6. The method of claim 2 wherein said at least one user-specified
policy includes cost of said at least one secondary computer
resource.
7. The method of claim 1 wherein said step for selecting said at
least one secondary computer resource comprises the steps of:
selecting at least one candidate suitable to serve as a redundant
resource corresponding to said at least one primary computer
resource; presenting said at least one candidate to a user; and
receiving input from said user indicating selection, from said at
least one candidate, of said at least one secondary computer
resource.
8. The method of claim 7 wherein said at least one candidate is
selected based on at least one user-specified policy.
9. The method of claim 1 wherein said at least one primary computer
resource includes a first network interface device for the network;
and wherein said at least one secondary computer resource includes
a second network interface device for the network.
10. A method for a user to accomplish establishing redundant
computer resources in a system including a plurality of processor
devices and a plurality of storage devices, the processor devices
and the storage devices being connected via a network, said method
comprising: issuing a command to begin establishing redundant
computer resources for at least one primary computer resource which
includes at least one primary processor and at least one portion of
storage implemented in at least one primary storage device; and
specifying at least one policy to influence selection of at least a
secondary computer resource suitable to serve as a redundant
resource corresponding to the at least one primary computer
resource, said selection based on device information relating to
the processor devices and the storage devices and topology
information relating to topology of the network.
11. A method for a user to accomplish establishing redundant
computer resources in a system including a plurality of processor
devices and a plurality of storage devices, the processor devices
and the storage devices being connected via a network, said method
comprising: issuing a command to begin establishing redundant
computer resources for at least one primary computer resource which
includes at least one primary processor and at least one portion of
storage implemented in at least one primary storage device;
reviewing at least one candidate suitable to serve as a redundant
resource corresponding to the at least one primary computer
resource, said at least one candidate being selected based on
device information relating to the processor devices and the
storage devices and topology information relating to topology of
the network; and selecting from said at least one candidate at
least one secondary computer resource.
12. The method of claim 11 further comprising the step of
specifying at least one policy to influence selection of said at
least one candidate.
13. An apparatus for establishing redundant computer resources in a
system including a plurality of processor devices and a plurality
of storage devices, the processor devices and the storage devices
being connected via a network, the apparatus comprising: a
management server connected via the network and adapted to store
device information relating to the processor devices and the
storage devices and topology information relating to topology of
the network; at least one management agent communicatively coupled
to the management server, the at least one management agent adapted
to collect the device information and topology information; wherein
the management server is further adapted to select at least one
secondary computer resource suitable to serve as a redundant
resource corresponding to at least one primary computer resource
based on the device information and topology information, the at
least one primary computer resource including at least one primary
processor device and at least one portion of storage implemented in
at least one primary storage device, the at least one secondary
computer resource including at least one secondary processor device
and at least one portion of storage implemented in at least one
secondary storage device.
14. The apparatus of claim 13 wherein said management server is
adapted to select said at least one secondary computer resource
based on at least one user-specified policy.
15. The apparatus of claim 14 wherein said at least one
user-specified policy includes performance of said at least one
secondary computer resource.
16. The apparatus of claim 14 wherein said at least one
user-specified policy includes reliability of said at least one
secondary computer resource.
17. The apparatus of claim 14 wherein said at least one
user-specified policy includes cost of said at least one secondary
computer resource.
18. The apparatus of claim 13 wherein said management server is
further adapted to select at least one candidate, to be presented
to a user, suitable to serve as a redundant resource corresponding
to said at least one primary computer resource, and wherein said
server is further adapted to receive indication of said user's
selection, from said at least one candidate, of said at least one
secondary computer resource.
19. The apparatus of claim 18 wherein said at least one candidate
is selected based on at least one user-specified policy.
20. The apparatus of claim 13 wherein said at least one primary
computer resource includes a first network interface device for the
network; and wherein said at least one secondary computer resource
includes a second network interface device for the network.
21. An article of manufacture comprising: a computer usable medium
having computer readable program code means embodied therein for a
management server to establish redundant computer resources in a
system including a plurality of processor devices and a plurality
of storage devices, the processor devices, the storage devices and
the management server being connected via a network, the computer
readable program code means in said article of manufacture
comprising: computer readable program code means for storing device
information relating to the processor devices and the storage
devices and topology information relating to topology of the
network, the device information including at least software
information thereon; computer readable program code means for
identifying at least one primary computer resource, said at least
one primary computer resource including at least one primary
processor device and at least one portion of storage implemented in
at least one primary storage device; computer readable program code
means for selecting at least one secondary computer resource
suitable to serve as a redundant resource corresponding to said at
least one primary computer resource based on the device information
and the topology information, said at least one secondary computer
resource including at least one secondary processor device and at
least one portion of storage implemented in at least one secondary
storage device; and computer readable program code means for
assigning said at least one secondary computer resource as a
redundant resource corresponding to said at least one primary
computer resource.
22. A system for a management server to establish redundant
computer resources in a system including a plurality of processor
devices and a plurality of storage devices, the processor devices,
the storage devices and the management server being connected via a
network, said system comprising: means for storing device
information relating to the processor devices and the storage
devices and topology information relating to topology of the
network, the device information including at least software
information thereon; means for identifying at least one primary
computer resource, said at least one primary computer resource
including at least one primary processor device and at least one
portion of storage implemented in at least one primary storage
device; means for selecting at least one secondary computer
resource suitable to serve as a redundant resource corresponding to
said at least one primary computer resource based on the device
information and the topology information, said at least one
secondary computer resource including at least one secondary
processor device and at least one portion of storage implemented in
at least one secondary storage device; and means for assigning said
at least one secondary computer resource as a redundant resource
corresponding to said at least one primary computer resource.
23. A method for a management server to establish redundant
computer resources in a system including a plurality of processor
devices and a plurality of storage devices, the processor devices,
the storage devices and the management server being connected via a
network, said method comprising: storing device information
relating to the processor devices and the storage devices and
topology information relating to topology of the network;
identifying at least one primary computer resource, said at least
one primary computer resource including at least one primary
processor device and at least one portion of storage implemented in
at least one primary storage device, a cluster software being
installed in the at least one primary processor device; selecting
at least one secondary computer resource suitable to serve as a
redundant resource corresponding to said at least one primary
computer resource based on at least one user-specified policy, the
device information and the topology information, said at least one
secondary computer resource including at least one secondary
processor device and at least one portion of storage implemented in
at least one secondary storage device; if the cluster software is
not installed in the at least one secondary processor device,
installing the cluster software in the at least one secondary
processor device; turning the at least one secondary processor
device into a member of a cluster which said at least one primary
processor device is a member of; and assigning said at least one
secondary computer resource as a redundant resource corresponding
to said at least one primary computer resource.
Description
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] Not applicable
BACKGROUND OF THE INVENTION
[0002] This invention is generally related to the field of
clustering systems and remote mirroring technology.
[0003] The use of clustering systems to accomplish fault-tolerance
and/or load-balancing is becoming increasingly popular. Generally
speaking, a clustering system may provide redundant resources so
that if one portion of the system experiences failure, another
portion can take over affected tasks or otherwise provide recovery
from the failure. Also, a clustering system may use its redundant
resources to process tasks in a more distributed manner, allowing
different portions of the system to work in parallel in
accomplishing tasks.
[0004] A typical clustering system may be made up of two or more
nodes, each having its own processing and storage capabilities. In
one particular use of a clustering system, a primary node may
comprise a server and associated storage devices, while a
secondary node may comprise another server and associated
storage devices. The secondary node may be created to be similar to
the primary node, in terms of processing, storage, and other
capabilities. Here, the clustering system may maintain exact
correspondence between the data storage of the primary node and the
data storage of the secondary node, such that any write or read to
data storage at the primary node is replicated at the secondary
node. If the primary node fails as it performs its various tasks,
the secondary node may take over the tasks performed by the primary
node. For example, if a web server that is configured as a primary
node in a clustering system fails for some reason, a secondary node
may take over and serve web server functions in place of the failed
primary node. A web site supported by such a system thus continues
to operate with little or no down time. Web site visitors may
continue to visit the associated web site as if no failure had
occurred. In this example, providing a primary and a secondary node
of similar capabilities allows the secondary node to be capable of
taking over the tasks previously performed by the primary node.
[0005] In other situations, the secondary node may have lesser
capabilities than the primary node. For example, if the secondary
node is only designed to temporarily take over the tasks of the
primary node, or if the secondary node is only designed to record
periodic snapshots of the data storage of the primary node, it may
be sufficient to create the secondary node with lesser
capabilities. This may be especially true if the cost associated
with creating a similarly capable secondary node is to be avoided,
or if failure of the primary node is not expected to extend beyond
a certain amount of time. Thus, depending on the situation, the
required capabilities of the secondary node may vary.
[0006] The correspondence between the data storage of a primary
node and the data storage of a secondary node may also be
referred to as remote mirroring. This is especially the case if the
data storage of the primary node is at a geographically distant
location from the data storage of the secondary node. Remote
mirroring may be carried out by different portions of a system. For
example, in host-based remote mirroring, a host, such as a server,
may be principally responsible for maintaining the correspondence
between the data storage of the primary node and the data storage
of the secondary node. In storage-based remote mirroring, a storage
system, such as a storage area network (SAN), may be principally
responsible for maintaining such correspondence. Depending on the
implementation, remote mirroring may require separate software and
equipment installation and/or configuration, in addition to that
required by other parts of the clustering system.
[0007] Currently, in order to realize the many advantages of a
clustering system, the multiple nodes of a clustering system must
be established by a system administrator. For example, in a
clustering system having a primary and a secondary node, the system
administrator must decide exactly what should be the processing,
storage, and other capabilities of the secondary node, install or
identify available resources meeting those capabilities, install
required software, and perform necessary configurations to set up
the clustering system. These steps involve factors that can be
overwhelmingly complex and difficult to analyze for the system
administrator, even if that person is an expert. Thus, the
administrator may only be able to make a rough guess, in an ad hoc
manner, as to what storage capability is needed for the secondary
node. As discussed above, the required storage capability of the
secondary node may vary from situation to situation, and it may not
always be ideal to simply mimic the storage capability of the
primary node.
[0008] Furthermore, after the desired processing, storage, and
other capabilities of the secondary node are decided, the
administrator must go about looking for existing equipment in the
system that fits the description, or install such equipment. In a
large system having many different components, it may be extremely
difficult and time-consuming for an administrator to search through
all available resources in order to find the appropriate equipment.
Finally, after the appropriate resources are decided and located,
software installation and configuration may take additional time
and effort. Thus, while clustering systems provide important
fault-tolerance and/or load-balancing capabilities, the deployment
of clustering systems remains largely a difficult and imprecise
undertaking.
BRIEF SUMMARY OF THE INVENTION
[0009] The present invention provides a method, apparatus, article
of manufacture, and system for establishing redundant computer
resources. According to one embodiment, in a system including a
plurality of processor devices, a plurality of storage devices, and a
management server connected via a network, the method comprises
storing device information relating to the processor devices and
the storage devices and topology information relating to topology
of the network, identifying at least one primary computer resource,
the at least one primary computer resource including at least one
primary processor device and at least one portion of storage
implemented in at least one primary storage device, selecting at
least one secondary computer resource suitable to serve as a
redundant resource corresponding to the at least one primary
computer resource based on the device information and the topology
information, the at least one secondary computer resource including
at least one secondary processor device and at least one portion of
storage implemented in at least one secondary storage device, and
assigning the at least one secondary computer resource as a
redundant resource corresponding to the at least one primary
computer resource.
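The four steps of this embodiment (storing, identifying, selecting, assigning) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The class names, the "site" attribute, and the rule that a secondary server must sit at a different site and be linked to its storage are assumptions made only for illustration; the device names borrow the reference numerals of FIG. 1.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Device:
    name: str
    kind: str   # "server" or "storage" (hypothetical classification)
    site: str   # hypothetical location label


@dataclass(frozen=True)
class Resource:
    """A processor device paired with storage on a storage device."""
    server: str
    storage: str


@dataclass
class ManagementServer:
    devices: dict = field(default_factory=dict)    # device information
    links: set = field(default_factory=set)        # topology information
    redundant: dict = field(default_factory=dict)  # primary -> secondary

    def store_device(self, device, connected_to=()):
        # Step 1: store device information and topology information.
        self.devices[device.name] = device
        for other in connected_to:
            self.links.add(frozenset((device.name, other)))

    def select_secondary(self, primary) -> Optional[Resource]:
        # Step 3 (assumed rule): pick a server at another site together
        # with a storage device that the topology shows it can reach.
        primary_site = self.devices[primary.server].site
        for server in self.devices.values():
            if server.kind != "server" or server.site == primary_site:
                continue
            for storage in self.devices.values():
                linked = frozenset((server.name, storage.name)) in self.links
                if storage.kind == "storage" and linked:
                    return Resource(server.name, storage.name)
        return None

    def assign(self, primary, secondary):
        # Step 4: record the secondary as the redundant resource.
        self.redundant[primary] = secondary


ms = ManagementServer()
ms.store_device(Device("storage-115", "storage", "location-102"))
ms.store_device(Device("server-108", "server", "location-102"),
                connected_to=["storage-115"])
ms.store_device(Device("storage-144", "storage", "location-104"))
ms.store_device(Device("server-140", "server", "location-104"),
                connected_to=["storage-144"])

primary = Resource("server-108", "storage-115")   # Step 2: identify primary
secondary = ms.select_secondary(primary)
ms.assign(primary, secondary)
```

In this sketch the selection rule is deliberately simple; a fuller version would also weigh capacity, software, and user policy.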
[0010] If the at least one primary storage device has storage-based
remote mirroring function, the at least one secondary computer
resource may be selected such that the at least one secondary
storage device also has storage-based remote mirroring function and
is accessible from the at least one primary storage device.
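The mirroring constraint just described can be sketched as a filter over candidate storage devices. This is a hedged illustration in Python; the StorageDevice type, its fields, and the function name are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageDevice:
    name: str
    remote_mirroring: bool                 # has storage-based remote mirroring
    reachable: frozenset = frozenset()     # names of storage devices it can access


def mirroring_compatible(primary, candidates):
    """Keep candidates usable as a mirror target for the primary storage.

    If the primary has no storage-based remote mirroring function, the
    constraint does not apply and every candidate is kept.
    """
    if not primary.remote_mirroring:
        return list(candidates)
    return [
        c for c in candidates
        if c.remote_mirroring and c.name in primary.reachable
    ]
```

A candidate is dropped either because it lacks the mirroring function or because the primary storage device cannot access it.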
[0011] In one embodiment, the at least one secondary computer
resource is selected based on at least one user-specified policy,
which may include performance of the at least one secondary
computer resource, reliability of the at least one secondary
computer resource, and/or cost of the at least one secondary
computer resource.
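One way such a policy could drive selection is a weighted ranking of candidates, sketched below in Python. The scoring formula, the normalized attribute values, and the dictionary keys are assumptions for illustration only; the specification does not prescribe how a policy is evaluated.

```python
def rank_by_policy(candidates, weights):
    """Order candidates from most to least preferred.

    candidates: list of dicts with numeric "performance", "reliability",
                and "cost" entries (hypothetical, assumed comparable scale).
    weights:    dict giving the importance of each policy term; a missing
                term counts as zero. Cost lowers the score.
    """
    def score(candidate):
        return (
            weights.get("performance", 0) * candidate["performance"]
            + weights.get("reliability", 0) * candidate["reliability"]
            - weights.get("cost", 0) * candidate["cost"]
        )

    return sorted(candidates, key=score, reverse=True)
```

A cost-only policy would pass weights such as {"cost": 1}, while a policy favoring performance over cost might pass {"performance": 2, "cost": 1}.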
[0012] In another embodiment, the step for selecting the at least
one secondary computer resource comprises the steps of selecting at
least one candidate suitable to serve as a redundant resource
corresponding to the at least one primary computer resource,
presenting the at least one candidate to a user, and receiving
input from the user indicating selection, from the at least one
candidate, of the at least one secondary computer resource.
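The candidate-based variant can be sketched as a function that filters suitable candidates, presents them, and accepts the user's choice. This Python sketch is illustrative only; the function name and the two callbacks (is_suitable, ask_user) are hypothetical stand-ins for whatever suitability test and user interface an implementation provides.

```python
def choose_secondary(resources, is_suitable, ask_user):
    """Semi-automatic selection of a secondary computer resource.

    resources:   iterable of available resources
    is_suitable: callable returning True for resources that could serve
                 as a redundant resource for the primary
    ask_user:    callable that is shown the candidate list and returns
                 the candidate the user picked
    """
    # Select at least one candidate.
    candidates = [r for r in resources if is_suitable(r)]
    if not candidates:
        raise ValueError("no suitable candidate found")

    # Present the candidates and receive the user's selection.
    chosen = ask_user(candidates)
    if chosen not in candidates:
        raise ValueError("selection is not one of the presented candidates")
    return chosen
```

Passing a callback that always returns the first candidate turns the same function into a fully automatic selection.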
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] FIG. 1 is a block diagram of a clustering system in
accordance with at least one embodiment of the present
invention.
[0014] FIG. 2 is an illustration of a mapping table.
[0015] FIG. 3 is an illustration of a logical unit number (LUN)
binding table.
[0016] FIG. 4A is an illustration of a discovery list.
[0017] FIG. 4B is an illustration of a functional discovery list
that may be maintained in addition to or in place of the discovery
list shown in FIG. 4A.
[0018] FIG. 5 is an illustration of a topology table.
[0019] FIG. 6A illustrates a fibre channel switch (FC-SW) zoning
configuration table.
[0020] FIG. 6B illustrates a different FC-SW zoning configuration
table.
[0021] FIG. 6C illustrates a storage-based replication
configuration table.
[0022] FIG. 6D illustrates a host-based replication configuration
table.
[0023] FIG. 6E illustrates a cluster configuration table.
[0024] FIG. 6F illustrates a cluster resource group configuration
table.
[0025] FIG. 6G illustrates a heartbeat configuration table.
[0026] FIG. 7 is a flow chart summarizing the general steps
involved in automatic configuration and semi-automatic
configuration of a clustering system in accordance with at least
one embodiment of the present invention.
[0027] FIG. 8 depicts a visual configuration diagram that may be
presented to the user.
DETAILED DESCRIPTION OF THE INVENTION
[0028] Clustering System
[0029] FIG. 1 is a block diagram of a clustering system 100 in
accordance with at least one embodiment of the present invention.
Here, clustering system 100 is comprised of equipment found in at
least two geographically distinct locations 102 and 104. For
example, location 102 may be a metropolitan area such as San Diego,
Calif., and location 104 may be a different metropolitan area such
as San Francisco, Calif. At location 102, a management server 106
is responsible for monitoring, configuring, and otherwise managing
servers 108 and 110, network equipment 112, and storage equipment
113, 114, and 115. Management server 106, servers 108 and 110,
network equipment 112, and storage equipment 113, 114, and 115
communicate through a local network 116, forming a local SAN.
[0030] As shown, management server 106 includes a SAN manager 118
that includes a configuration engine 120 and a topology repository
122. SAN manager 118 also maintains a discovery list 124, a
configuration table 126, a topology table 128, and a mapping table
130, which are discussed in further detail below. SAN manager 118
maintains this information by communicating with various management
agents located in servers 108 and 110, network equipment 112, and
storage equipment 113, 114, and 115. SAN manager 118 and the
various management agents may be implemented in software.
[0031] Server 108 may include one or more application programs.
These application programs may be server level applications such as
Web server applications, network file sharing applications, and
others. As FIG. 1 illustrates, server 108 may also include
clustering software for maintaining a clustering system, a
management agent, and a number of host ports. Server 110 is
similarly arranged and may also include one or more application
programs, clustering software, a management agent, and a number of
host ports.
[0032] Network equipment 112 is illustrated in FIG. 1 as a switch
having a number of switch ports. Network equipment 112 also
includes a management agent. Network equipment 112 facilitates
communication through local network 116. As shown, network
equipment 112 provides communication between servers 108 and 110
and storage equipment 115.
[0033] Storage equipment 115 may include a number of disk ports, a
number of logical volumes 132, 134, and 136, and a management
agent. Here, each of the logical volumes 132, 134, and 136 may be
implemented in different ways, such as by use of various types of
redundant array of independent disks (RAID). Each of logical
volumes 132, 134, 136 may be implemented on a single physical disk
(not shown), across multiple physical disks (not shown) within a
disk group (not shown), across disks in multiple disk groups, or in
some other arrangement.
[0034] Here, server 108, network equipment 112, and storage
equipment 115 may represent a primary node in a clustering system.
For example, server 108 may be executing a database application,
using storage equipment 115 to store the associated databases and
communicating data to and from storage equipment 115 through
network equipment 112. Fault-tolerance for this database service
may be realized by creating a secondary node corresponding to the
primary node. Use of equipment located at a geographically distinct
location, such as location 104, would provide effective
fault-tolerance because if a catastrophic local event damages
equipment at location 102, redundant equipment at location 104
would be able to provide effective recovery.
[0035] At location 104, a management server 138 is responsible for
monitoring, configuring, and otherwise managing a server 140,
network equipment 142, and storage equipment 144. Management server
138, server 140, network equipment 142, and storage equipment 144
communicate through a local network 146, forming a local SAN.
Local SANs at locations 102 and 104, and perhaps other local SANs,
may together form a wide area SAN by communicating over one or more
wide area networks 148.
[0036] As shown, management server 138 includes a SAN manager 150
that includes a configuration engine 152 and a topology repository
154. SAN manager 150 also maintains a discovery list 156, a
configuration table 158, a topology table 160, and a mapping table
162, which are discussed in further detail below. SAN manager 150
maintains this information by communicating with various management
agents located in server 140, network equipment 142, and storage
equipment 144. SAN manager 150 and the various management agents
may be implemented in software.
[0037] Server 140 may include one or more application programs,
clustering software for maintaining a clustering system, a
management agent, and a number of host ports. Network equipment 142
is illustrated in FIG. 1 as a switch having a number of switch
ports. Network equipment 142 also includes a management agent.
Network equipment 142 facilitates communication through local
network 146. As shown, network equipment 142 provides communication
between server 140 and storage equipment 144.
[0038] Storage equipment 144 may include a number of disk ports, a
pool 164 of logical volumes, from which logical volumes 166, 168,
and 170 may be selected, and a management agent. Here, each of the
logical volumes in logical volume pool 164, including logical
volumes 166, 168, and 170, may be implemented in different ways,
such as by use of various types of redundant array of independent
disks (RAID). Thus, each of the logical volumes may be implemented
on a single physical disk (not shown), across multiple physical
disks (not shown) within a disk group (not shown), across disks in
multiple disk groups, or in some other arrangement.
[0039] Here, server 140, network equipment 142, and storage
equipment 144 may be used to form a secondary node associated with
the previously discussed primary node in the clustering system. For
example, if the clustering system is designed to provide a
secondary node having similar processing, storage, and other
capabilities as those of the primary node, it would be desirable to
identify a secondary node having similar equipment as the primary
node. Server 140, network equipment 142, and storage equipment 144
may fit such requirements. The present invention allows equipment
such as server 140, network equipment 142, and storage equipment
144 to be identified as resources that may be used to form the
secondary node.
[0040] Servers 108, 110, and 140 are examples of processor devices,
storage equipment 115 and 144 are examples of storage devices, and
network equipment 112 and 142 are examples of network interface
devices.
[0041] Information Maintained at Management Server and
Elsewhere
[0042] FIG. 2 is an illustration of mapping table 130 maintained in
management server 106 of FIG. 1. Mapping table 130 is illustrated
here as an example. Other mapping tables, such as mapping table 162
maintained in management server 138, may have similar formats. As
shown in FIG. 2, mapping table 130 provides a mapping between
application programs being executed and the location(s) of data
storage being utilized by such application programs. For instance,
an application program executing in server 108 may utilize logical
volumes 132, 134, and 136 in storage equipment 115, and mapping
table 130 would register such utilization in detail. Different
methods may be used to identify the various application programs
executing in a particular server. One such method involves using
the Common Information Model (CIM) standard, which allows
application programs executing in a server to communicate with one
another. For example, the management agent in server 108 may use
the CIM standard to communicate with, and thereby identify, the
various application programs executing in server 108. Another
method involves using repository information maintained by the
operating system of the server. For example, the management agent
in server 108 may retrieve data from the repository information of
the operating system of server 108 to identify various application
programs executing in server 108.
[0043] Mapping table 130 is shown to include the following
categories of information: ID 202, Server 204, Application 206,
Related Mount Point 208, Related Volume ID 210, Disk Group (DG) ID
212, Block Device 214, Logical Unit (LU) Binding ID 216, Small
Computer System Interface (SCSI) ID 218, and SCSI Logical Unit
Number (LUN) 220. Here, table 130 indicates that a database (DB)
application is executing in Server A (server 108). Table 130
further indicates that this DB application is utilizing logical
volumes Vol1, Vol2, and Vol3 (logical volumes 132, 134, and 136).
For each of these three logical volumes, table 130 provides
additional information. Taking Vol1 just as an example, table 130
indicates the mount point (/u01) at which Vol1 is associated with,
or "mounted" to, the system executing the DB application. Table 130
also indicates the physical disk group (0) and block device
(c2t2d1) in which Vol1 is implemented. In this example, logical
volumes are also associated with SCSI IDs, as well as LUNs within
particular SCSI IDs. Here, Vol1 is shown to be associated with a
particular SCSI ID (2) and a particular SCSI LUN (1).
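As an illustrative sketch only, one row of a mapping table such as
table 130 might be modeled as a record with one field per category
named above. The Python field names, the volumes_used_by helper, and
the entry ID and binding ID values are assumptions made for
illustration; only the Vol1 values quoted in the text (mount point
/u01, disk group 0, block device c2t2d1, SCSI ID 2, LUN 1) come from
the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MappingEntry:
    """One row of a mapping table; field names are illustrative."""
    entry_id: int        # ID 202
    server: str          # Server 204
    application: str     # Application 206
    mount_point: str     # Related Mount Point 208
    volume_id: str       # Related Volume ID 210
    disk_group_id: int   # Disk Group (DG) ID 212
    block_device: str    # Block Device 214
    lu_binding_id: int   # LU Binding ID 216
    scsi_id: int         # SCSI ID 218
    scsi_lun: int        # SCSI LUN 220


def volumes_used_by(table: List[MappingEntry], server: str,
                    application: str) -> List[str]:
    """Return the volume IDs that an application on a server utilizes."""
    return [e.volume_id for e in table
            if e.server == server and e.application == application]


# Row for Vol1. entry_id and lu_binding_id are placeholder values.
mapping_table = [
    MappingEntry(entry_id=1, server="Server A", application="DB",
                 mount_point="/u01", volume_id="Vol1", disk_group_id=0,
                 block_device="c2t2d1", lu_binding_id=1,
                 scsi_id=2, scsi_lun=1),
]
```

Rows for Vol2 and Vol3 would follow the same shape; their values are
not quoted in the text and are therefore omitted here.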
[0044] FIG. 3 is an illustration of a LUN binding table 300
maintained in server 108 of FIG. 1. LUN binding table 300 is
illustrated here as an example. Other LUN binding tables maintained
in other servers, such as servers 110 and 140, may have similar
formats. LUN binding table 300 indicates the SCSI ID assignment and
LUN assignment associated with location(s) of data storage being
utilized by application programs executing in server 108. LUN
binding table 300 is shown to include the following categories of
information: Binding ID 302, SCSI ID 304, LUN 306, and Inquiry
Information 308. Each Binding ID 302 indicates a particular
location of storage and is associated with a particular SCSI ID 304
and a particular LUN 306. Also, each Binding ID 302 further
indicates Inquiry Information 308, which can provide additional
data such as vendor, storage type, and logical volume information.
Binding table 300 may be maintained as a part of the operation of
the management agent in server 108. Thus, individual binding tables
maintained at various servers, such as servers 108 and 110, may be
used to form the mapping table 130 shown in FIG. 2.
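The aggregation described above, in which per-server binding tables
feed a central mapping table, could be sketched roughly as follows.
The dictionary keys, the merge_binding_tables function, and the
sample rows are hypothetical; the description does not specify how
the management server combines the tables.

```python
from typing import Dict, List


def merge_binding_tables(tables_by_server: Dict[str, List[dict]]) -> List[dict]:
    """Flatten per-server LUN binding tables into one list tagged by server."""
    merged = []
    for server, rows in tables_by_server.items():
        for row in rows:
            # Build a new dict so the server's own table is left unmodified.
            merged.append({"server": server, **row})
    return merged


# Hypothetical binding rows reported by the management agents of two servers.
binding_tables = {
    "server 108": [{"binding_id": 1, "scsi_id": 2, "lun": 1,
                    "inquiry": {"vendor": "VendorX", "volume": "Vol1"}}],
    "server 110": [{"binding_id": 1, "scsi_id": 3, "lun": 0,
                    "inquiry": {"vendor": "VendorY", "volume": "Vol7"}}],
}
merged_rows = merge_binding_tables(binding_tables)
```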
[0045] FIG. 4A is an illustration of discovery list 124 maintained
in management server 106 of FIG. 1. Discovery list 124 is
illustrated here as an example. Other discovery lists, such as
discovery list 156 maintained in management server 138, may have
similar formats. As shown in FIG. 4A, discovery list 124 provides a
listing of devices available at various locations, such as
locations 102 and 104. Discovery list 124 shows the following
categories of information for each device: Local SAN ID 402,
Discovery ID 404, Device Type 406, Device Information 408, IP
address 410, and Area/Global Position 412. Local SAN ID 402
identifies the local SAN to which the device belongs. Discovery ID
404 identifies a numerical order for the device within its local
SAN. Device Information 408 may indicate various information
relating to the device, such as vendor and device type. IP address
410 indicates the IP address assigned to the device. Area/Global
Position 412 provides information relating to the location of the
device, such as name of metropolitan area, longitude, and latitude.
Thus, discovery list 124 allows management server 106 to identify
available devices at various locations, including distant
locations, that may be potential resources suitable to serve as
part of a secondary node corresponding to a primary node in a
clustering system.
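A minimal sketch of how a discovery list like the one above might be
searched for secondary-node candidates is given below. The record
layout mirrors the listed categories, but the field names, the
find_candidates helper, and every sample value (including the
documentation-range IP addresses) are assumptions for illustration,
not contents of FIG. 4A.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DiscoveredDevice:
    """One discovery-list entry; field names are illustrative."""
    local_san_id: str   # Local SAN ID 402
    discovery_id: int   # Discovery ID 404
    device_type: str    # Device Type 406
    device_info: str    # Device Information 408
    ip_address: str     # IP address 410
    area: str           # Area/Global Position 412


def find_candidates(discovery_list: List[DiscoveredDevice],
                    device_type: str,
                    primary_area: str) -> List[DiscoveredDevice]:
    """Devices of the requested type located outside the primary's area."""
    return [d for d in discovery_list
            if d.device_type == device_type and d.area != primary_area]


# Hypothetical entries; none of these values are taken from the figures.
discovery_list = [
    DiscoveredDevice("SAN-1", 1, "server", "VendorX", "192.0.2.10", "SF"),
    DiscoveredDevice("SAN-1", 2, "storage", "VendorY", "192.0.2.11", "SF"),
    DiscoveredDevice("SAN-2", 1, "server", "VendorX", "198.51.100.10", "SD"),
    DiscoveredDevice("SAN-2", 2, "storage", "VendorY", "198.51.100.11", "SD"),
]
remote_servers = find_candidates(discovery_list, "server", primary_area="SF")
```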
[0046] FIG. 4B is an illustration of a functional discovery list
440 that may be maintained in management server 106 of FIG. 1, in
addition to or in place of discovery list 124. Functional discovery
list 440 is illustrated here as an example. Other discovery lists
maintained in other management servers may have similar formats. As
shown in FIG. 4B, functional discovery list 440 provides a listing
of devices available at various locations, such as locations 102
and 104. Functional discovery list 440 shows the following
categories of information for each device: Local SAN ID 442,
Discovery ID 444, Function Type 446, and Device Information 448.
Local SAN ID 442 identifies the local SAN to which the device
belongs. Discovery ID 444 identifies a numerical order for the
device within its local SAN. Function Type 446 provides information
on the possible function of the device, such as use in host-based
remote mirroring or storage-based remote mirroring. Device
Information 448 may indicate various information relating to the
device, such as vendor, device type, and device class. Functional
discovery list 440 allows management server 106 to identify
available devices at various locations, including distant
locations, that may be potential resources suitable to serve as
part of a secondary node corresponding to a primary node in a
clustering system.
[0047] FIG. 5 is an illustration of topology table 128 maintained
in management server 106 of FIG. 1. Topology table 128 is
illustrated here as an example. Other topology tables, such as
topology table 160 maintained in management server 138, may have
similar formats. As shown in FIG. 5, topology table 128 provides a
summary of interconnections over which data may be sent in system
100. Topology table 128 shows the following categories of
information: server information 502, first local network
information 504, interconnect information 506, second local network
information 508, and storage information 510. Topology table 128
depicts the manner by which various networking and storage
equipment are linked, including local and wide area network
connections. Here, topology table 128 is shown to be focused on
storage network topology for purposes of illustration. Other types
of topology information may be included as well.
[0048] FIGS. 6A-6G show various configuration tables that may be
implemented, individually or in combination, as the contents of
configuration table 126 maintained in management server 106 of FIG.
1. The contents of configuration table 126 are illustrated here as
examples. Other configuration tables, such as configuration table
158 maintained in management server 138, may have similar
formats.
[0049] FIG. 6A illustrates a fibre channel switch (FC-SW) zoning
configuration table 600. This table contains categories of
information including Zone ID 602 and Switch Port ID List 604. Zone
ID 602 identifies different zones, or groupings of devices, such
that devices within a common zone may readily communicate with one
another. Switch Port ID List 604 identifies the different network
ports which belong to the identified zone. FIG. 6B illustrates a
different FC-SW zoning configuration table 606, similar in
structure to table 600. Zoning configuration tables 600 and 606
allow convenient separation of groups of devices. Here, tables 600
and 606 are described as fibre channel switch zoning configuration
tables for purposes of illustration; other types of equipment may
also be organized in similar zoning tables.
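The zoning tables described above lend themselves to a simple
membership check: two ports may readily communicate when some zone
lists both of them. The sketch below assumes a dictionary from Zone
ID to a set of switch port IDs; the zone names, port names, and the
share_zone helper are hypothetical.

```python
from typing import Dict, Set


def share_zone(zoning_table: Dict[str, Set[str]],
               port_a: str, port_b: str) -> bool:
    """True if at least one zone contains both switch ports."""
    return any(port_a in ports and port_b in ports
               for ports in zoning_table.values())


# Hypothetical zoning table: Zone ID -> Switch Port ID List.
zoning_table = {
    "zone1": {"sw1-p1", "sw1-p2"},
    "zone2": {"sw1-p3", "sw1-p4"},
}
```

For instance, share_zone(zoning_table, "sw1-p1", "sw1-p2") holds,
while ports drawn from zone1 and zone2 are kept separate.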
[0050] FIG. 6C illustrates a storage-based replication
configuration table 608. This table identifies the configuration of
storage-based data replication from a set of primary storage
locations to a corresponding set of secondary storage locations.
Here, the storage system is responsible for maintaining the proper
replication of data. Table 608 shows the following categories of
information: ID 610, Group ID 612, Group Name 614, primary storage
information 616, secondary storage information 618, and Cluster
Config ID 620. ID 610 is an entry identifier. Group ID 612 and
Group Name 614 relate to the identification number and name for
each group of storage resources, such as a group of volumes,
representing a storage location. The primary and secondary storage
information 616 and 618 each identifies the host and volume
information associated with the relevant storage location. Cluster
Config ID 620 identifies a label for the cluster corresponding to
the primary and secondary storage locations.
[0051] FIG. 6D illustrates a host-based replication configuration
table 622. This table identifies the configuration of host-based
data replication from a set of primary storage locations to a
corresponding set of secondary storage locations. Here, the host
system is responsible for maintaining the proper replication of
data. Table 622 shows the following categories of information: ID
624, Valid 626, Group ID 628, Group Name 630, primary storage
location information 632, secondary storage location information
634, and Cluster Config ID 636. Valid 626 relates to whether the
particular replication configuration is available. Also, primary
and secondary storage location information 632 and 634 are each
shown to also include information for identifying the corresponding
disk group and block device. Other information in table 622 is
similar to information shown in table 608 of FIG. 6C.
[0052] FIG. 6E illustrates a cluster configuration table 638. This
table identifies the arrangement of various clusters in the system,
which may include the configuration of physical devices being
controlled by cluster software. Table 638 shows the following
categories of information: ID 640, Valid 642, Cluster ID/Name 644,
Cluster Type/Vendor 646, Member Node List 648, Heartbeat List 650,
Heartbeat Configuration ID List 652, Replication Type List 654, and
Replication Configuration ID List 656. ID 640 identifies a numeric
label for each entry. Valid 642 relates to whether the particular
cluster is available. Cluster ID/Name 644 provides a number
identifier and a name identifier for each cluster presented.
Cluster Type/Vendor 646 identifies the classification of the
cluster and vendor of the associated equipment. Member Node List
648 identifies the nodes that are members of the particular
cluster. Heartbeat List 650 and Heartbeat Configuration ID List 652
relate to the arrangement of the heartbeat, which provides a signal
that may be used to indicate whether a node, or a particular
resource at a node, is active. Replication Type List 654 and
Replication
Configuration ID List 656 relate to the type of replication
available and the associated configuration label.
[0053] FIG. 6F illustrates a cluster resource group configuration
table 658. This table identifies the various resources available at
different clusters, which may include the configuration of the
logical resource group for each node in each cluster. Such
resources may be processing, communication, storage, or other types
of resources. Table 658 shows the following categories of
information: ID 660, Valid 662, Cluster Type ID 664, Resource Group
ID 666, Resource Group Name 668, Member Node List 670, Resource
List 672, Replication Type 674, and Replication Configuration ID
676. ID 660 provides a numerical label for each entry. Valid 662
relates to whether the particular cluster is available. Cluster
Type ID 664 provides an identifier for the cluster and indicates
the type and vendor of equipment associated with the cluster.
Resource Group ID 666 and Resource Group Name 668 provide a number
identifier and a name identifier for each collection of resources
associated with the cluster. Resource List 672 identifies the
particular resources available within the identified resource
group. Replication Type 674 and Replication Configuration ID 676
relate to the type of replication available and the associated
configuration label.
[0054] FIG. 6G illustrates a heartbeat configuration table 678.
This table provides further detail on the arrangement of
the heartbeat for each cluster. Table 678 shows the following
categories of information: ID 680, Valid 682, Cluster Type ID 684,
Heartbeat Type ID 686, Heartbeat Name 688, Member Node List 690,
NIC List 692, and Storage List 694. ID 680 provides a numerical
label for each entry. Valid 682 relates to whether the cluster is
available. Cluster Type ID 684 provides an identifier for the
cluster and indicates the type and vendor of equipment associated
with the cluster. Heartbeat Type ID 686 and Heartbeat Name 688
identify the classification and name of the heartbeat utilized. For
example, the heartbeat may be host-based or storage-based. Member
Node List 690 identifies the nodes that are members of the
particular cluster. NIC List 692 identifies NICs which correspond
to a particular host-based heartbeat. Storage List 694 identifies
storage systems which correspond to a particular storage-based
heartbeat.
[0055] The information maintained at each management server may be
communicated to other management servers. For example, although
management servers 106 and 138 are situated at geographically
distinct locations 102 and 104, respectively, they may exchange
some or all of the information that is contained in various tables
such as those discussed above.
[0056] Automatic Configuration
[0057] FIG. 7 is a flow chart summarizing the general steps
involved in automatic configuration and semi-automatic
configuration of a clustering system in accordance with at least
one embodiment of the present invention. The steps shown may be
implemented as an integrated routine that allows the selection of
either automatic configuration or semi-automatic configuration.
Alternatively, the steps shown may be implemented as two separate
routines. That is, a system may employ only automatic
configuration, or only semi-automatic configuration. For purposes
of illustration, FIG. 7 shows the establishment of a clustering
system through the formation of a secondary node corresponding to a
primary node. Different steps shown in FIG. 7 may be accomplished
with use of a user interface, such as an interactive graphical user
interface (GUI). Also, the GUI can be situated at any location, as
long as the relevant information can be passed to the system. For
example, the information submitted through the GUI by the user may
be sent to the management server 106, or to the management server
138.
[0058] Under automatic configuration, establishment of a clustering
system begins with step 702, in which the primary node of the
planned clustering system is identified. This may involve
identification, by the user, of the name of one or more target
applications and the name of the target server corresponding to the
primary node. Alternatively, a more automated process may be
employed. For example, the main application executing in a target
server may be selected.
[0059] Next, in step 704, policies for creating the clustering
system, including remote mirroring features, may be specified. This
step may involve specification by the user of general policies to
follow in establishing the clustering system and importance
assigned to such policies. For example, the user may be presented
with three potential policies: (1) performance, (2) reliability,
and (3) cost.
[0060] Performance may relate to the effectiveness of the data
transfer between the data storage of the primary node and the data
storage of the secondary node, which may involve measures of
bandwidth, distance, and network usage. Illustrative measures of
bandwidth and network usage in a wide area SAN covering the
metropolitan areas of San Francisco (SF) and San Diego (SD) are
provided in the table below:

Network Type          Total      Usage
SD Local              2 Gbps     50%
SF-SD Interconnect    48 Gbps    10%
SF Local              2 Gbps     8%
[0061] Illustrative measures of bandwidth, distance, and network
usage in the same wide area SAN, but from the perspective of the
San Diego (SD) metropolitan area, are provided in the table
below:

Network Type       Tested Throughput    Distance     Total      Usage
SF interconnect    500 Mbps             1000 mile    48 Gbps    10%
[0062] Thus, if a user places emphasis on performance, the
secondary node may be chosen to have performance equal to that of
the primary node, in terms of processing capability (server type),
storage capability (throughput, cache size, RAID level, etc.), and
network interface capability (number and performance of host bus
adaptors). Also, if there are two or more options for interconnects
between the primary device and secondary device, the interconnect
that has more available throughput capacity may be chosen. For
example, assume there are two interconnects: interconnect A, which
has 48 Gbps total throughput capacity and 10% average usage rate
(43.2 Gbps available throughput capacity), and interconnect B,
which has 128 Gbps total throughput capacity and 80% average usage
rate (25.6 Gbps available throughput capacity). Here, interconnect
A has more available throughput capacity than interconnect B, so
interconnect A may be chosen.
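The arithmetic in this example can be checked with a short sketch.
It assumes, as the figures above imply, that available throughput
capacity equals total capacity multiplied by one minus the average
usage rate; the function and variable names are illustrative.

```python
from typing import Dict, Tuple


def available_gbps(total_gbps: float, usage_rate: float) -> float:
    """Unused capacity: total capacity times the unused fraction."""
    return total_gbps * (1.0 - usage_rate)


def pick_interconnect(interconnects: Dict[str, Tuple[float, float]]) -> str:
    """Name of the interconnect with the most available capacity."""
    return max(interconnects,
               key=lambda name: available_gbps(*interconnects[name]))


# (total capacity in Gbps, average usage rate) from the example above.
interconnects = {
    "A": (48.0, 0.10),    # about 43.2 Gbps available
    "B": (128.0, 0.80),   # about 25.6 Gbps available
}
chosen = pick_interconnect(interconnects)
```

Here chosen is "A": the smaller but lightly loaded interconnect has
more headroom than the larger, heavily used one.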
[0063] Reliability may relate to the level of confidence with which
the data storage of the secondary node replicates data in the data
storage of the primary node. If a user places emphasis on
reliability, the secondary node may be chosen to have redundant host
bus adaptors and highly reliable, enterprise level storage, such as
RAID level 1. Cost may relate to the cost of using equipment, such
as maintenance costs. Cost may also relate to the cost of acquiring
currently unavailable equipment. If a user places emphasis on cost,
the secondary node may be chosen to have much lower performance
than the primary node, in terms of processing capability (server
type), storage capability (throughput, cache size, RAID level,
etc.), and network interface capability (number and performance of
host bus adaptors). For example, storage equipment of RAID level 5
may be chosen.
[0064] Thus, by specifying general policies such as (1)
performance, (2) reliability, and (3) cost, to follow in
establishing the clustering system, the user is able to control the
design of the clustering system, without being required to decipher
the detailed considerations relating to technical specifications of
related equipment and software. The user may be presented with
various general policies from which to choose. The user may specify
policies by simply identifying particular policies as important.
The user may also specify policies by assigning importance, or
weight, to particular policies. This may be done in different ways,
such as by user input of ratings, ratios, percentages, or other
measures for different policies.
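One way such weights might be applied is a normalized weighted sum
over per-policy ratings, sketched below. This is an assumed scoring
scheme, not one stated in the description: the ratings (0 to 1,
higher is better, so a high cost rating means low cost), the
candidate names, and both helper functions are hypothetical.

```python
from typing import Dict


def weighted_score(ratings: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Weighted average of per-policy ratings."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one policy needs a positive weight")
    return sum(w * ratings[p] for p, w in weights.items()) / total_weight


def pick_candidate(candidates: Dict[str, Dict[str, float]],
                   weights: Dict[str, float]) -> str:
    """Name of the candidate with the highest weighted score."""
    return max(candidates,
               key=lambda name: weighted_score(candidates[name], weights))


# Hypothetical ratings for two candidate secondary nodes.
candidates = {
    "site X": {"performance": 0.9, "reliability": 0.5, "cost": 0.2},
    "site Y": {"performance": 0.4, "reliability": 0.6, "cost": 0.9},
}
performance_first = {"performance": 3, "reliability": 1, "cost": 1}
cost_first = {"performance": 1, "reliability": 1, "cost": 3}
```

With performance_first the faster "site X" scores higher; with
cost_first the cheaper "site Y" does, so the same candidates can
yield different selections as the user shifts emphasis.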
[0065] The next step under automatic configuration is step 706, in
which information on the current system is gathered. Such
information may include the contents of mapping tables, discovery
lists, topology tables, and configuration tables. This information
provides a detailed picture of the various aspects of the current
system, including the mapping from applications to resources they
utilize, available resources and their configurations, and so
on.
[0066] In step 708, the information on the current system gathered
in step 706 is analyzed to select the most appropriate resources
and/or arrangements to be used for creating the secondary node.
This is done in view of the various policies, and possibly weights
assigned to those policies, as defined by the user in step 704. In
step 710, the selected resources and/or arrangements are presented
to the user, and the user is given the opportunity to confirm the
selection of resources and/or arrangements. If the user confirms
the selection, the process continues with step 712, discussed
below. If the user does not confirm the selection, the process
loops back to step 704.
[0067] In step 712, the selected resources and/or arrangements are
used to create the secondary node. If the selected resources need
additional software installation or configuration in order to
function properly as the secondary node, such installation or
configuration may be performed. Alternatively, the automatic
configuration routine or semi-automatic configuration routine may
re-select from resources that do not require additional software
installation or configuration. Default resources that do not
require additional software installation or configuration may also
be selected in order to avoid such installation or configuration of
software. Finally, in step 714, the configuration table(s) are
updated to include information on the secondary node just
created.
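The automatic configuration flow of steps 702 through 714 can be
summarized as a control loop. In the sketch below each step is passed
in as a callable, so nothing is assumed about how a step is actually
performed; the function name, the parameter names, and the max_rounds
safety limit are assumptions added for illustration and do not appear
in FIG. 7.

```python
from typing import Any, Callable, Optional


def automatic_configuration(
    identify_primary: Callable[[], Any],
    get_policies: Callable[[], Any],
    gather_info: Callable[[], Any],
    select: Callable[[Any, Any, Any], Any],
    confirm: Callable[[Any], bool],
    create_secondary: Callable[[Any], Any],
    update_tables: Callable[[Any], None],
    max_rounds: int = 10,  # assumed limit; the described loop has none
) -> Optional[Any]:
    """Run steps 702-714; return the secondary node, or None if never confirmed."""
    primary = identify_primary()                      # step 702
    for _ in range(max_rounds):
        policies = get_policies()                     # step 704
        info = gather_info()                          # step 706
        selection = select(primary, policies, info)   # step 708
        if confirm(selection):                        # step 710
            secondary = create_secondary(selection)   # step 712
            update_tables(secondary)                  # step 714
            return secondary
        # Not confirmed: loop back to step 704 for new policies.
    return None
```

Passing the steps as callables keeps the sketch independent of any
particular user interface or management server implementation.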
[0068] Semi-Automatic Configuration
[0069] Under semi-automatic configuration, establishment of a
clustering system also begins with step 702, which has been
discussed previously. Next, in step 716, information on the current
system is gathered. This step is similar to step 706 discussed
above. In step 718, one or more potential selections of appropriate
equipment and/or arrangements to be used for creating the secondary
node are presented to the user. The user is given the opportunity to
select the various equipment and/or arrangements to be used in
creating the secondary node. In step 710, the user's selection is
received and presented back to the user for confirmation. Here, a
visual topology diagram such as the one shown in FIG. 8 may be
presented to the user. FIG. 8 may also represent a simplified
version of the block diagram shown in FIG. 1. If the user confirms
the selection, the process continues with step 712, which has been
described previously. If the user does not confirm the selection,
the process loops back to step 718.
[0070] In addition, semi-automatic configuration may also take into
account user-defined policies, as is done in the case of automatic
configuration. Here, such policies may allow potential selections
of equipment and/or arrangements presented to be narrowed, so that
the user may be presented with a more focused set of potential
equipment and/or arrangements from which to make a selection. Other
features discussed above in relation to automatic configuration may
be adopted for use with semi-automatic configuration, and vice
versa. For example, the visual confirmation diagram discussed in
relation to semi-automatic configuration may also be used with
automatic configuration, in order to present the automatically
selected equipment and/or arrangement to the user for confirmation.
Further, variations on the different steps shown in FIG. 7 may also
be adopted.
[0071] FIG. 1 is a block diagram of a clustering system 100 in
accordance with at least one embodiment of the present invention.
Such a diagram would allow the user to visually inspect a proposed
configuration for a clustering system. This provides an efficient
way to present a proposed configuration to the user for
confirmation.
[0072] Although the present invention has been described in terms
of specific embodiments, it should be apparent to those skilled in
the art that the scope of the present invention is not limited to
the described specific embodiments. The specification and drawings
are, accordingly, to be regarded in an illustrative rather than a
restrictive sense. It will, however, be evident that additions,
subtractions, substitutions, and other modifications may be made
without departing from the broader spirit and scope of the
invention as set forth in the claims.
* * * * *