U.S. patent application number 10/509133 was filed with the patent office on 2006-01-12 for network management system.
Invention is credited to Niraj Agrawal, Elke Jahn.
Application Number | 20060010441 10/509133 |
Family ID | 28051740 |
Filed Date | 2006-01-12 |
United States Patent
Application |
20060010441 |
Kind Code |
A1 |
Jahn; Elke ; et al. |
January 12, 2006 |
Network management system
Abstract
There is provided a network management system and a method of
managing a network, especially an optical network, that includes a
plurality of nodes that are interconnected in an arbitrary topology
so as to be capable of carrying traffic between selected nodes. The
method includes the steps of providing a supervisory network by
means of supervisory channels between the nodes, providing a node
manager including one or more software modules in each node,
establishing supervisory connections over one or more of the
supervisory channels between selected nodes through which the node
manager communicates with other node managers in other nodes,
providing a node module in each node manager that provides an
interface to the hardware settings of the node, providing a master
module in at least one node manager, establishing supervisory
connections over one or more supervisory channels between selected
nodes through which the master module communicates with the node
modules, and amending and/or monitoring hardware settings in
selected nodes with the respective node module of the node. Controlling
the amendments carried out by the node modules and/or processing of
the monitored hardware settings is carried out by the master
module.
Inventors: |
Jahn; Elke; (Hochberg,
DE) ; Agrawal; Niraj; (Hochberg, DE) |
Correspondence
Address: |
OHLANDT, GREELEY, RUGGIERO & PERLE, LLP
ONE LANDMARK SQUARE, 10TH FLOOR
STAMFORD
CT
06901
US
|
Family ID: |
28051740 |
Appl. No.: |
10/509133 |
Filed: |
March 27, 2003 |
PCT Filed: |
March 27, 2003 |
PCT NO: |
PCT/EP03/03201 |
371 Date: |
August 24, 2005 |
Current U.S.
Class: |
718/100 |
Current CPC
Class: |
H04Q 2011/0088 20130101;
H04J 14/0284 20130101; H04J 14/0283 20130101; H04J 14/0297
20130101; H04J 14/02 20130101; H04Q 11/0062 20130101; H04L 41/044
20130101; H04L 41/042 20130101; H04Q 2011/0081 20130101; H04Q
2011/0077 20130101 |
Class at
Publication: |
718/100 |
International
Class: |
G06F 9/46 20060101
G06F009/46 |
Foreign Application Data
Date |
Code |
Application Number |
Mar 27, 2002 |
EP |
020070082 |
Mar 14, 2003 |
WO |
PCT/EP03/02704 |
Claims
1. A method of managing a network, that includes a plurality of
nodes that are interconnected in an arbitrary topology so as to be
capable of carrying traffic between said plurality of nodes, the
method comprising the steps of: providing a supervisory network by
means of supervisory channels between the nodes of said plurality
of nodes; providing a node manager which is one or more software
modules in each one of said plurality of nodes; establishing
supervisory connections over one or more of the supervisory
channels between selected nodes of said plurality of nodes through
which the node manager communicates with other node managers in
other nodes of said plurality of nodes; providing a node module in
each node manager that provides an interface to hardware settings
of each of said plurality of nodes that is associated with the node
module; providing a master module in at least one node manager;
establishing supervisory connections over one or more supervisory
channels between the selected nodes of said plurality of nodes,
said supervisory connections providing communication between the
master module and the node modules; and performing a function
selected from the group consisting of amending hardware settings in
the selected nodes, monitoring hardware settings in the selected
nodes, and a combination thereof, with the node module of each of
the selected nodes, wherein controlling the amendments carried out
by the node modules and processing the monitored hardware settings
is carried out by the master module.
2. The method of managing a network according to claim 1,
comprising the further steps of: providing a master module in each
of at least two node managers, wherein each master module is in a
state selected from the group consisting of an active state and a
passive state; and setting a first of the at least two master
modules to the active state and maintaining or setting the other of
the at least two master modules to the passive state, wherein
controlling the amendments carried out by the node modules and
processing the monitored hardware settings is carried out only by
the first master module.
3. The method of managing a network according to claim 2, wherein
the setting of the state of the at least two master modules is done
automatically.
4. The method of managing a network according to claim 3, further
comprising the steps of: periodically generating heartbeat messages
in each node of said plurality of nodes and exchanging these
messages among all of said plurality of nodes, wherein each
heartbeat message contains information about the state of the
master module of a respective node of said plurality of nodes; and
processing the received heartbeat message in each node of said
plurality of nodes and setting the state of the master module in
the respective node depending on information in the received
messages, so that a single master module of all of said plurality
of nodes is always in the active state.
5. The method of managing a network according to claim 4, further
comprising the step of providing each master module with an initial
passive state when the node manager of the respective node of said
plurality of nodes is initialized, and wherein changing the state
of the master module in the respective node of said plurality of
nodes is made according to a decision selected from the group
consisting of: if the master module of the respective node of said
plurality of nodes is in the passive state and the respective node
of said plurality of nodes receives at least one heartbeat message
that contains information about a master module of another node of
said plurality of nodes being in the active state, the master
module of the respective node of said plurality of nodes remains in
the passive state; and if the master module of the respective node
of said plurality of nodes is in the passive state and the
respective node of said plurality of nodes receives no heartbeat
message that contains information about a master module of another
node of said plurality of nodes being in the active state within a
predetermined time interval, the master module of the respective
node of said plurality of nodes changes into the active state.
6. The method of managing a network according to claim 4, wherein
each heartbeat message generated in each node of said plurality of
nodes further contains a node ID of the respective node of said
plurality of nodes in which the message is generated, and wherein
changing of the state of the master module in the respective node
of said plurality of nodes is made according to a decision selected
from the group consisting of: if the master module of the
respective node of said plurality of nodes is in the passive state
and the respective node of said plurality of nodes receives at
least one heartbeat message that contains information about a
master module of another node of said plurality of nodes being in
the active state, the master module of the respective node of said
plurality of nodes remains in the passive state; if the master
module of the respective node of said plurality of nodes is in the
passive state and the respective node of said plurality of nodes
receives no heartbeat message that contains information about a
master module of another of said plurality of nodes being in the
active state within a predetermined time, the respective node of
said plurality of nodes compares the node ID with other received
node IDs using a predetermined procedure, and depending on the
result of this procedure, especially if the node ID is smaller than
the other received node IDs, the master module of the respective
node of said plurality of nodes changes into the active state; if
the master module of the respective node of said plurality of nodes
is in the active state and the node receives no heartbeat message
that contains information about a master module of another of said
plurality of nodes being in the active state within a predetermined
time, the master module of the respective node of said plurality of
nodes remains in the active state; if the master module of the
respective node of said plurality of nodes is in the active state
and the respective node of said plurality of nodes receives at
least one heartbeat message that contains information about a
master module of another of said plurality of nodes being in the
active state, the respective node of said plurality of nodes
compares the node ID of the node of said plurality of nodes with
other received node IDs using a predetermined procedure and
depending on the result of this procedure, especially if the node
ID is not smaller than the other received node IDs, the master
module of the respective node of said plurality of nodes changes
into the passive state.
7. The method of managing a network according to claim 1,
comprising the further steps of: communicating between the node
module in each node of said plurality of nodes and the master
module through a set of supervisory connections selected from the
group consisting of a full set of supervisory connections and a
reduced set of supervisory connections, wherein in the full set of
supervisory connections, each node module communicates with all of
the master modules present in one or more nodes of said plurality
of nodes, especially whether in the active state or passive state,
and wherein in the reduced set of supervisory connections, each
node module communicates only with a single master module present
in one of said plurality of nodes.
8. The method of managing a network according to claim 4,
comprising the further step of: providing a master controller
module in each node of said plurality of nodes which is connected
to the master module of the respective node, wherein master
controller modules of different nodes of said plurality of nodes
generate, exchange and process the heartbeat messages and control
the state of the master module of the respective node.
9. The method of managing a network according to claim 8, wherein
the node module in each node of said plurality of nodes
communicates only with the master module in the active state, and
in the case of changing the state of the master module to the
active state and a further master module to the passive state, the
supervisory connections through which the communication takes place
are reconfigured.
10. The method of managing a network according to claim 9, wherein
the master controller module of the node of said plurality of nodes
having the further master module that has been changed to the
active state sends a reconfigure message to each node of the
plurality of nodes that contains the node ID of the node of said
plurality of nodes having the further master module.
11. The method of managing a network according to claim 2,
comprising the further steps of: providing a database containing
information relating to a hardware state of each node of said
plurality of nodes and local and global network management
activities in each node of said plurality of nodes; synchronizing
the database in each node of said plurality of nodes according to
the following steps: before the first master module is set to the
active state, a first node of said plurality of nodes, that is
associated with the first master module and includes a current
state of the database, sends the current state of the database to
all other nodes of said plurality of nodes, the receiving nodes of
said plurality of nodes that receive the current state of the
database, synchronize the database in each receiving node with the
current state of the database.
12. The method of managing a network according to claim 11,
comprising the further steps of: the master module in each
receiving node of said plurality of nodes informs a master
controller in each receiving node of said plurality of nodes of any
changes in the database of the receiving node of said plurality of
nodes; the master controller sends the changes in the database of
the receiving node of the plurality of nodes to all other master
controllers in all other nodes of the plurality of nodes; when one
of the plurality of nodes comes up after a failure the master
controller in the one of the plurality of nodes that comes up after
a failure requests the current state of the database from the
master controller of the first node of said plurality of nodes to
synchronize the database of the one node that comes up after a
failure with the database of the first node of said plurality of
nodes.
13. A network management system of a network including a plurality
of nodes which are interconnected in an arbitrary topology so as to
be capable of carrying traffic between said plurality of nodes,
comprising: a supervisory network interconnecting the plurality of
nodes, that is provided by supervisory channels between the
plurality of nodes; a node manager associated with each one of said
plurality of nodes that communicates with other node managers
through a supervisory connection established over one or more
supervisory channels between selected nodes of said plurality of
nodes; a node module associated with each node manager that
provides an interface to the hardware of the node of said plurality
of nodes that is associated with the node module and allows for
amending and monitoring of amendments of hardware settings of the
node of said plurality of nodes that is associated with the node
module; and a master module associated with at least one node
manager that is connected to the various node modules through the
supervisory connections established over the one or more
supervisory channels between selected nodes, wherein the master
module provides functionality for controlling the node modules and
amending the hardware settings and for processing the hardware
settings monitored by the node modules.
14. The network management system according to claim 13, further
comprising an interface associated with the master module to
support one or more Graphical User Interfaces located in one or
more nodes of the plurality of nodes.
15. The network management system according to claim 13, further
comprising one or more software modules included in the master
module for global and local network management.
16. The network management system according to claim 13, wherein at
least one node manager has the master module, and wherein each
master module can be set to a passive state or to an active state,
wherein only in the active state the master module has the
functionality for controlling the node modules and amending the
hardware settings and for processing the hardware settings
monitored by the node modules, and wherein in the passive state the
master module has functionality for performing database
synchronization.
17. The network management system according to claim 16, further
comprising a master controller module associated with each node of
said plurality of nodes for setting the state of the master
module.
18. A network management system of a network including a plurality
of nodes which are interconnected in an arbitrary topology so as to
be capable of carrying traffic between selected nodes, comprising:
a supervisory network interconnecting the plurality of nodes, that
is provided by supervisory channels between the plurality of nodes;
a node manager associated with each one of said plurality of nodes
that communicates with other node managers through a supervisory
connection established over one or more supervisory channels
between the selected nodes of said plurality of nodes; a node
module associated with each node manager that provides an interface
to the hardware of the node of said plurality of nodes that is
associated with the node module and allows for amending and
monitoring of amendments of hardware settings of the node of said
plurality of nodes that is associated with the node module; and a
master module associated with at least one node manager that is
connected to the various node modules through the supervisory
connections established over the one or more supervisory channels
between selected nodes, wherein the master module provides
functionality for controlling the node modules and amending the
hardware settings and for processing the hardware settings
monitored by the node modules, and according to one of claims 13 to
17, wherein the network management system is managed by a method
according to claim 1.
19. The method of managing a network according to claim 7, wherein
each node module communicates only with a single master module in
an active state present in one node in the reduced set of
supervisory connections.
20. The network management system according to claim 15, further
comprising one or more software modules in the master module for
database related tasks and features for a database containing
information relating to a hardware state of each node and local and
global network management activities in each node.
Description
[0001] The present invention belongs to the field of communication
systems, especially of optical communication networks, more
particularly, to dense wavelength division multiplexed optical
networks with arbitrary topology, e.g., point-to-point, ring,
mesh, etc.
[0002] The soaring demand for virtual private networks, storage
area networking, and other new high speed services is driving
bandwidth requirements that test the limits of today's optical
communications systems. In an optical network, a node is physically
linked to another using one or more optical fibres (cf. FIG. 1).
Each of the fibres can carry as many as one hundred or more
communication channels, i.e., wavelengths in WDM (Wavelength
Division Multiplex) or Dense WDM (DWDM) systems. Thus, for example,
for a node with three neighbours as many as three hundred or more
wavelength signals originate or terminate or pass through a given
node. Each of the wavelengths may carry signals with data rates up
to 10 Gbit/s or even higher. Thus each fibre is carrying several
terabits of information. This is a tremendous amount of bandwidth
and information that must be managed automatically, reliably,
rapidly, and efficiently. It is evident that a large amount of
bandwidth needs to be provisioned. Fast and automatic provisioning
enables network bandwidth to be managed on demand in a flexible,
dynamic, and efficient manner. Another very important feature of
such DWDM networks is reliability or survivability in presence of a
failure such as an inadvertent fibre-cut, various types of hardware
and software faults, etc. In such networks, in case of a failure,
the user data is automatically rerouted to its destination via an
alternate or restoration path.
[0003] In general, such networks are managed by a network
management system which is adapted especially for a single existing
network. However, when the existing network, especially its
topology, is changed, the network management system must be
reconfigured by manually adapting the hardware and software of
several nodes. This is expensive and time-consuming work,
especially in the case of meshed networks. Furthermore, the known
network management systems are not able to be implemented in
networks with an arbitrary topology without manual adaptation of
the network management system.
[0004] It is an object of the present invention to overcome the
disadvantages of the state of the art and especially to provide a
network management system that could be implemented in a network
with an arbitrary topology, and which provides a highly flexible
and reliable managing of the network.
[0005] The object of the invention is realized by a method
according to claim 1 and a network management system according to
claim 11. The sub-claims provide preferable embodiments of the
present invention.
[0006] In the network, especially the optical network, which is
managed by the method according to the present invention, multiple
nodes are interconnected in an arbitrary topology. The management
system is able to manage the whole network and provides
intelligence for efficient and optimal use of network resources.
The management system comprises preferably various software modules
in each node. One software module is a node manager, for example,
which takes care of the network management activities. A node
manager in each node communicates with other node managers in other
nodes through the supervisory network. The supervisory network is
formed with the help of supervisory channels between the various
nodes of the network. A physical supervisory channel between two
nodes in the network might be carried over optical fibre or other
types of transport media. Node managers in different nodes might
communicate over logical supervisory connections established over
one or more physical supervisory channels between various nodes.
These logical supervisory connections might be configured manually
or with the help of software modules in one or more network nodes.
In a preferred embodiment this is done by using a software module
called NetProc, which is described in the application
PCT/EP03/02704, which was filed on 14 Mar. 2003 by the same
applicant, and which is incorporated by reference into the present
application. The NetProc provides the following supervisory network
features:
[0007] 1) Supervisory connection establishment between two network
nodes. Each node can have one or more NetProcs. This architecture
allows establishment of a direct logical supervisory connection
between any arbitrary pair of nodes interconnected by the
supervisory channel. Fault-tolerant or redundant connections are
provided through two or more paths. In a preferred embodiment these
paths are node and link disjoint, as will be described in more
detail. The management system uses NetProc's services to exchange
messages with other nodes. Any supervisory data is sent through one
or several or all of the available redundant connections. Each
message is given a sequence number. On the receiving end the
duplicate messages are discarded and only one of the arriving
messages, for example the first, is passed on to the supervisory
management layer.
[0008] 2) Hardware fault and software error detection on all paths
of the supervisory channel and the associated auto-recovery to
re-establish the supervisory channel. Error checking in the data
transmission is done by using sequence numbers on the messages. The
status of each connection is monitored by sending keep-alive
messages at regular intervals. In the event that a reply to a
keep-alive message is not received within a specified time, the
connection is explicitly closed and the two nodes try to
re-establish the connection between themselves. The closing of
connection(s) and the attempts to re-establish them are done
automatically.
[0009] 3) Relaying information reliably to one or more network
managers running on one or more network nodes or other work
stations.
[0010] 4) The management of the network is carried out by a node
manager present in each node or at one or more nodes or other
centralized locations. The various node managers communicate using
the NetProc.
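The sequence-number and keep-alive behaviour described for NetProc can be illustrated with a minimal Python sketch. This is not the NetProc implementation; the class names `DuplicateFilter` and `KeepAliveMonitor`, the message tuple layout, and the explicit `now` time argument are all assumptions made for illustration.

```python
class DuplicateFilter:
    """Receiver side: deliver only the first copy of each numbered message."""

    def __init__(self):
        # (sender_id, seq) pairs already delivered; unbounded here for brevity.
        self._seen = set()

    def accept(self, sender_id, seq, payload):
        """Return the payload for a first arrival, or None for a duplicate."""
        key = (sender_id, seq)
        if key in self._seen:
            return None
        self._seen.add(key)
        return payload


class KeepAliveMonitor:
    """Tracks one connection; time is passed in explicitly (an assumption)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_reply = 0.0
        self.is_open = True

    def on_reply(self, now):
        """Record the arrival time of a keep-alive reply."""
        self.last_reply = now

    def check(self, now):
        """Close the connection if no reply arrived within the timeout."""
        if self.is_open and now - self.last_reply > self.timeout:
            self.is_open = False  # the two nodes would now try to reconnect
        return self.is_open


# The same message (sequence number 7) arrives over two redundant paths.
rx = DuplicateFilter()
arrivals = [(1, 7, "set-gain"), (1, 7, "set-gain"), (1, 8, "read-power")]
delivered = [rx.accept(*m) for m in arrivals]
# delivered == ["set-gain", None, "read-power"]
```

Keying the filter on the sender as well as the sequence number is a design choice of this sketch, so that two nodes may reuse the same numbers without their messages being mistaken for duplicates.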
[0011] A preferred supervisory network has the flexibility to be
configured by standard protocols like OSPF, MPLS or by using
NetProc. The following features apply:
[0012] The supervisory network topology is automatically discovered
with the help of OSPF. Each node manager executes a single OSPF and
the OSPF in each node is configured to talk with neighbouring
nodes.
[0013] The nodes discover their neighbours and exchange Link State
Advertisements. Once the Link State adjacencies are formed and the
OSPF converges on the topology, each node possesses the routing
table and is able to reach other nodes over the supervisory
channel.
[0014] The status of the supervisory channel is monitored by OSPF
and in the event of link failure the alternate routes are
configured. Fault-tolerant connections are set up using two or more
Label Switched Paths over two or more disjoint paths to each
destination. Thus a signalling message sent to a node travels
through multiple Label Switched Paths and reaches its appropriate
destination.
[0015] According to the present invention a node module is provided
in each node manager. The module can be implemented in software,
hardware, or both. The node module in each node provides an
interface to the hardware of the corresponding node. Each node
module allows the hardware settings of its node to be amended
and/or monitored.
[0016] At least one node manager is provided with a master module.
The master module can likewise be implemented in software and/or
hardware. The master module communicates through supervisory
connections with the various node modules and controls the various
amendments carried out by the different node modules and/or
processes the hardware settings of the different nodes monitored by
the corresponding node modules.
[0017] Preferably, not only one but several or all of the node
managers in the different nodes comprise a master module.
Preferably, in this case the master module has an active state and
a passive state which the master module might be set to. Further
preferably, at a given time only one master module is allowed to be
set to the active state. Such a master module might be called the
Master and all the other master modules, which are in a passive
state, might be called Deputy Master (DM). Only the master module
that is in the active state (Master) controls the different
amendments of hardware settings carried out by the node modules and
processes the hardware settings monitored by the node modules.
[0018] Preferable embodiments of the present invention will be
described in the following with reference to the accompanying
drawings, in which
[0019] FIG. 1 shows a preferable first architecture of a node
manager;
[0020] FIG. 2 shows the established supervisory connections between
corresponding different nodes;
[0021] FIG. 3 shows a second preferable architecture of a node
manager with an attached master controller;
[0022] FIGS. 4 and 5 show reduced supervisory connections used in
the shown second architecture.
[0023] The functions of a node manager (1) according to the
embodiment shown in FIG. 1 are separated into two main modules. The
node module (2) takes care of the activities local to a node. Every
node has a node module (2), which connects to one or more master
modules (3) located at the same node or other nodes using the
supervisory channel. Among other things, the node module (2)
provides an interface to the hardware and allows the master module
(3) to make any changes, or informs the master module (3) of any
changes in the hardware properties. The second module, called the
master module (3), is present in one or several or all nodes. The
master module (3) includes MasterProc (5) for global and local
network management, DBProc for database related tasks and features,
and an Interface to GUI (4) to support the hardware element
management and local and global network management. This is shown
in FIG. 1. Here, the term "Proc" denotes one or more software
modules with predetermined functionality.
[0024] In addition to the node manager (1), there is a Graphical
User Interface (GUI), which is used to input (or enter), output (or
view), and modify various parameters and/or properties related to
the node hardware. The GUI is also used to input (or enter), output
(or view), and modify various parameters and/or properties related
to the local and/or global network management. The GUI is connected
to the master module (3) (cf. FIG. 1).
[0025] The functions of a master module (3) include:
[0026] Receiving/sending node information from/to one or more
nodes, reading, writing, and updating the database (DB) and
providing an interface to the GUI.
[0027] Accepting user and/or hardware commands for modifying and/or
updating node properties and sending them to the relevant nodes.
Such commands may also be received from other nodes.
[0028] Processing network management related commands and messages,
e.g., demand information from the user, which includes creation of
demand, selection of one or more demand-paths, starting and
stopping traffic for a demand, etc.
[0029] Monitoring the status of demands and providing protection or
restoration actions in the event of one or more faults and/or
errors in a demand.
[0030] Exchange of heartbeat messages and related processing
[0031] Database synchronization
[0032] The master module (3) according to the shown embodiment
provides the following interfaces:
[0033] Interface to the node module (2) in one or several or all
nodes
[0034] Interface to the database
[0035] Interface to the GUI (4) in one or several or all nodes
[0036] Although there are several master modules (3) located in
several network nodes, at a given time only one master module (3)
may be active. Such a master module (3) might be designated as the
Master and all the other master modules (3) as a Deputy Master
(DM). Further, a master module (3) performs the tasks of the Master
or a Deputy Master depending on the configuration. Such a
configuration can be done statically or dynamically. It may also be
done manually or automatically.
[0037] The node module (2) in each node needs a connection to the
master module (3) and vice-versa. This connection is set up over
the supervisory channel using NetProc or equivalent software
modules.
[0038] The Master located in a particular node coordinates all the
network management activities. The Master is an essential part of
the network management and needs to be functional all the time. It
therefore becomes important to make sure that there is a backup or
standby module, which takes over when the Master fails for some
reason. For this purpose one or more Deputy Masters are designated
as the backup or standby to the Master. These Deputy Masters take
over the functions of the master module (3) when the Master fails.
The master module (3) has different functionality based on whether
it is the Master or a Deputy Master. The nodes where the Master and
a Deputy Master are located are termed as the master node and a DM
node, respectively. Finally, a full set of supervisory connections
between all pairs of nodes which contain a master module (3) is
required in order to manage the redundancy and fault-tolerance with
respect to the Master functionality. A full set of supervisory
connections implies a supervisory connection between all pairs of
nodes. A reduced set of supervisory connections is defined as the
set of those connections between a pair of nodes in which one of
the nodes is the master node.
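The difference between the full and the reduced set of supervisory connections can be made concrete with a short Python sketch. The function names `full_set` and `reduced_set` and the use of integer node IDs are assumptions for illustration only; a connection is represented as an unordered pair written with the smaller ID first.

```python
from itertools import combinations


def full_set(node_ids):
    """One supervisory connection between every pair of the given nodes."""
    return set(combinations(sorted(node_ids), 2))


def reduced_set(node_ids, master_id):
    """Only those connections in which one endpoint is the master node."""
    return {
        tuple(sorted((master_id, node)))
        for node in node_ids
        if node != master_id
    }


nodes = [1, 2, 3, 4]
full = full_set(nodes)                       # 6 connections for 4 nodes
reduced = reduced_set(nodes, master_id=2)    # {(1, 2), (2, 3), (2, 4)}
```

For n nodes the full set holds n(n-1)/2 connections while the reduced set holds n-1, and the reduced set is always a subset of the full set.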
[0039] As the node manager software first comes up, a node
preferably is always initialised to be a Deputy Master node. The
following protocol is used to determine which node acts as the
Master at a given time:
1) All nodes periodically exchange Heartbeat messages with each
other, the contents of which are used to determine which node is
the master node and also to allow the various Deputy Master nodes
to monitor the status of the master node.
2) A Heartbeat message contains the node ID of the sender node as
well as its status, either Master or Deputy-Master.
3) The receiving node first examines the status in all the
Heartbeat messages received within a certain time interval. If it
receives a Master status in any of the received Heartbeat messages,
it remains in the same state as before without altering its status.
If it does not receive a Master status in any of the received
Heartbeat messages, it compares its ID with the other received IDs.
If its ID is smaller than the received IDs, it assumes the role of
Master; otherwise it remains in the same state as before without
altering its status. As an alternative, if on start-up a node does
not receive a Heartbeat message from other nodes after sending a
configurable number of Heartbeat messages, it assumes the role of
the Master.
4) If and only if the existing Master fails, the new Master
election process takes place. Master election is done by processing
Heartbeat messages as discussed above.
5) In case two nodes assume for any reason the unintended role of a
master node, the conflict is resolved using the following protocol:
among the different master nodes, the node with the lowest ID
number retains the role of Master, and all other master nodes
revert their role to being a Deputy Master node.
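The decision taken by a node at the end of each Heartbeat interval
can be sketched as follows. This is a minimal Python illustration
under stated assumptions, not the implementation of the described
system: the names Heartbeat, next_status and startup_limit_reached
are hypothetical, node IDs are assumed to be integers, and the caller
is assumed to collect the Heartbeat messages of one interval before
invoking the function.

```python
from dataclasses import dataclass

MASTER = "Master"
DEPUTY = "Deputy-Master"


@dataclass(frozen=True)
class Heartbeat:
    sender_id: int   # node ID of the sender node
    status: str      # MASTER or DEPUTY


def next_status(own_id, own_status, received, startup_limit_reached=False):
    """Return the status of this node after one Heartbeat interval."""
    masters = [hb.sender_id for hb in received if hb.status == MASTER]
    if own_status == MASTER:
        # Step 5: among several Masters only the lowest ID keeps the role.
        return MASTER if all(own_id < m for m in masters) else DEPUTY
    if masters:
        # Step 3: a Master status was received, so nothing changes.
        return own_status
    if not received:
        # Alternative in step 3: nothing heard at all after the configured
        # number of own Heartbeat messages (flag supplied by the caller).
        return MASTER if startup_limit_reached else own_status
    # Step 3: no Master heard, so the node with the smallest ID takes over.
    if all(own_id < hb.sender_id for hb in received):
        return MASTER
    return own_status
```

For example, a Deputy Master with ID 1 that hears only Deputy Masters
2 and 3 becomes the Master, while node 2 in the same situation keeps
its status.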
[0040] Based on the contents of heartbeat messages there may be
other procedures for selecting which master module acts as the
Master, for example selecting the master module in the node with
the largest ID.
[0041] After the election is over, the master module (3) in the
master node takes over the operations of the network and performs
the network management functions. The change of role of a
particular node from a Deputy Master node to a master node should
be performed as quickly and as seamlessly as possible to have
minimum disruption in network operation. The master node and Deputy
Master nodes perform additional functions for fault-tolerance.
These include, among other functions, database synchronization
between the master node and the Deputy Master nodes.
[0042] In the following sections two architectures for handling
redundancy and fault-tolerance are presented.
[0043] The node manager corresponding to a first architecture is
shown in FIG. 1. The master node and all the Deputy Master nodes
are connected through the supervisory channel configured by NetProc
or an equivalent software module. Using such supervisory
connections (10) between each pair of nodes, each node module in
each node sends all node-related information to the master node and
to all the Deputy Master nodes as shown in FIG. 2, e.g., for a four
node network. Exchange of heartbeat messages and related processing
is done as discussed previously in this document.
[0044] The databases in the master node and the Deputy Master nodes
need to be synchronized at all times. This ensures correct
operation when the master node fails and a new master node is
elected. After a new master node is elected, it sends the current
dump (state) of the database to all other Deputy Master nodes
before resuming its duty as a master node. This makes sure that the
databases in all nodes are synchronized before the nodes begin their
management function. During normal operation, both the master node
and all Deputy Master nodes receive messages from node modules in
all nodes. Thus, the master module in each node updates the
database located in that particular node. The difference in the
functionality of Master versus Deputy Master is that a node acting
as Deputy Master does not send any message to other nodes but only
receives all node-related messages. The primary function of a
Deputy Master node in this architecture is to perform the database
synchronization. When a node comes up again after a failure and a
master node already exists, the restored node requests the current
dump of the database from the master node.
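The synchronization behaviour of the first architecture can be
sketched in Python as follows. The class and method names, the
dictionary used as database and the example setting "laser" are
assumptions made for illustration; the specification does not define
a data model.

```python
import copy


class MasterModule:
    """Hypothetical master module holding the local database of a node."""

    def __init__(self, node_id, is_master=False):
        self.node_id = node_id
        self.is_master = is_master
        self.database = {}

    def on_node_message(self, source_node, settings):
        # Master and Deputy Masters both record every node-related message.
        self.database[source_node] = dict(settings)

    def dump(self):
        # Current state of the database, as sent after an election or
        # on request of a restored node.
        return copy.deepcopy(self.database)

    def load_dump(self, dump):
        # Replace the local database with the received dump.
        self.database = copy.deepcopy(dump)


master = MasterModule(1, is_master=True)
deputy = MasterModule(2)
# The node module of node 3 sends its information to both of them.
for module in (master, deputy):
    module.on_node_message(3, {"laser": "on"})

# Node 4 comes up again after a failure and requests the dump.
restored = MasterModule(4)
restored.load_dump(master.dump())
```

Because every Deputy Master receives the same node-related messages
as the Master, its database stays synchronized without any message
being sent by the Deputy Master itself.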
[0045] In the second architecture, there is an additional software
module running at a node, namely a master controller, as shown in
FIG. 3. The so-called master controller (7) is a module which
could be implemented by software and/or hardware.
[0046] The Node Module (2) and master controller (7) are active in
all nodes of the network. However, the master module (3) is active
only in the master node. In this architecture, it is the master
controller (7) which takes part in master-election and role-change
related steps, e.g., database synchronization. When the nodes come
up for the first time, the Node Module (2) and master controller
(7) are started in each node. The master module (3) is not started
initially. The master controllers (7) in the various nodes elect a
particular node as the master node by exchanging and processing
heartbeat messages among each other. Thereafter, the master module
(3) is started only in the master node (cf. FIG. 3).
[0047] The master controller (7) in each node is connected to all
other master controllers (7) in other nodes through the supervisory
channel. The Node Module in different nodes is connected only to
the master module (3) as shown in FIG. 4 through a reduced set of
supervisory connections (10).
[0048] When the master node changes, e.g., from node 1 to node 2,
the master controller (7) in that node dynamically and
automatically re-configures the connections between the node
modules (2) and the new master module (3) as shown in FIG. 5.
[0049] This dynamic reconfiguration is done using NetProc or other
similar software modules and the master controller present in each
node. The master controller sends a re-configure message to NetProc
in each node, with the node ID of the new master node. The NetProc
in each node, on receiving the message, re-configures the
connections so that all the nodes have a logical supervisory
connection to the new master node. The nodes can also be statically
connected as in architecture 1, in which case the dynamic
reconfiguration step can be avoided.
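The reconfiguration step can be sketched in Python as follows. The
class NetProcStub is a hypothetical stand-in, since the interface of
NetProc is not described here; the method and attribute names are
likewise assumptions. The sketch also assumes that the master node
needs no supervisory connection to itself.

```python
class NetProcStub:
    """Hypothetical stand-in for NetProc in one node."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.master_link = None   # ID of the node this node is connected to

    def on_reconfigure(self, new_master_id):
        # Re-point the logical supervisory connection to the new master.
        if new_master_id == self.node_id:
            self.master_link = None
        else:
            self.master_link = new_master_id


def announce_new_master(netprocs, new_master_id):
    """Send the re-configure message to every node and return the
    resulting reduced set of supervisory connections."""
    for netproc in netprocs:
        netproc.on_reconfigure(new_master_id)
    return {
        tuple(sorted((n.node_id, n.master_link)))
        for n in netprocs
        if n.master_link is not None
    }


netprocs = [NetProcStub(i) for i in (1, 2, 3, 4)]
before = announce_new_master(netprocs, 1)   # node 1 is the master node
after = announce_new_master(netprocs, 2)    # master changes to node 2
```

After the change, every connection of the reduced set has node 2 as
one of its ends.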
[0050] Exchange of heartbeat messages and related processing is
done as discussed previously in this document.
[0051] The master controller (7) does the database synchronization
between a pair of nodes. After a new master node is elected, the
master controller (7) sends the current dump (state) of the
database to all the master controllers (7) in Deputy Master nodes
before starting the master module processes in the master node.
This makes sure that the databases in all nodes are synchronized
before the nodes begin the management function. The master module
(3) informs the master controller (7) of any changes in the
database, and these changes are sent to all other master
controllers (7) in other nodes in the network. The master
controller (7) in each Deputy Master node, on receiving the changes
from the master node, updates its local database. This keeps the
database synchronized with the master node. When a node comes up
again after a failure and a master node already exists, the
restored node requests the current dump of the database from the
master node.
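The forwarding of database changes by the master controller can be
sketched in Python as follows. The class, its method names and the
key/value form of a change are assumptions made for illustration
only; direct method calls stand in for messages sent over the
supervisory channel.

```python
class MasterController:
    """Hypothetical master controller holding the local database copy."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.database = {}
        self.peers = []   # master controllers in the other nodes

    def send_dump(self):
        # After an election: send the full current state to all peers.
        for peer in self.peers:
            peer.database = dict(self.database)

    def on_local_change(self, key, value):
        # Called by the master module in the master node for each change;
        # the change is applied locally and forwarded to all peers.
        self.database[key] = value
        for peer in self.peers:
            peer.on_remote_change(key, value)

    def on_remote_change(self, key, value):
        # Deputy Master side: update the local database.
        self.database[key] = value


master = MasterController(1)
deputy_2 = MasterController(2)
deputy_3 = MasterController(3)
master.peers = [deputy_2, deputy_3]

master.database = {"a": 1}
master.send_dump()                # full state after the election
master.on_local_change("b", 2)    # incremental change afterwards
```

In contrast to the first architecture, the Deputy Master nodes here
learn of changes from the master node rather than from the node
modules directly.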
* * * * *