System and Method for Suppressing Redundant Alarms Naedele; Martin ; et al. [ABB RESEARCH LTD]

Patent Application Summary

U.S. patent application number 11/630953 was filed with the patent office on 2008-12-25 for system and method for suppressing redundant alarms. This patent application is currently assigned to ABB RESEARCH LTD. Invention is credited to Alexander Fay, Christian Frei, Martin Naedele, Patrick Sager.

Application Number	20080316015 11/630953
Family ID	34957774
Filed Date	2008-12-25

United States Patent Application	20080316015
Kind Code	A1
Naedele; Martin ; et al.	December 25, 2008

System and Method for Suppressing Redundant Alarms

Abstract

The present invention is concerned with an alarm system and a method for suppressing redundant alarms in a monitored system. The alarm system comprises a process control unit, an FDI unit, a filter, and an alarm display unit. The process control unit detects anomalies in the monitored system and generates alarms corresponding to them. The FDI unit diagnoses failures in the monitored system and generates a failure-cause-effect graph comprising a list of generated alarms, along with the alarms corresponding to the causes and effects of each listed alarm. The FDI unit dynamically updates the failure-cause-effect graph concurrently with the generation of alarms by the process control unit and the diagnosis of failures by the FDI unit. The filter receives the generated alarms and identifies redundant and not-yet-classified alarms from the received alarms, by using the failure-cause-effect graph. The filter suppresses the identified redundant alarms and passes on non-redundant and the not-yet-classified alarms to the alarm display unit, for display.

Inventors:	Naedele; Martin; (Zurich, CH) ; Frei; Christian; (Fislisbach, CH) ; Sager; Patrick; (Oberageri, CH) ; Fay; Alexander; (Hamburg, DE)
Correspondence Address:	BUCHANAN, INGERSOLL & ROONEY PC POST OFFICE BOX 1404 ALEXANDRIA VA 22313-1404 US
Assignee:	ABB RESEARCH LTD ZURICH CH
Family ID:	34957774
Appl. No.:	11/630953
Filed:	June 28, 2004
PCT Filed:	June 28, 2004
PCT NO:	PCT/CH04/00403
371 Date:	July 25, 2008

Current U.S. Class:	340/506
Current CPC Class:	H04L 41/0613 20130101; H04L 41/065 20130101
Class at Publication:	340/506
International Class:	G08B 29/00 20060101 G08B029/00

Claims

1. An alarm system capable of suppressing redundant alarms, the alarm system comprising: at least one process control unit for detecting system anomalies and generating alarms corresponding to the detected system anomalies; a Fault Diagnosis and Isolation (FDI) unit for diagnosing a plurality of system failures, the FDI unit capable of generating a failure-cause-effect graph comprising alarms corresponding to at least one cause and at least one effect of each of a plurality of diagnosed system failures; a filter coupled to the FDI unit and the process control unit, the filter managing alarms, the alarm management being based upon the identification of redundant alarms and not-yet-classified alarms by using the generated failure-cause-effect graph, wherein an alarm corresponding to a detected anomaly is identified as a not-yet-classified alarm if the alarm corresponding to the detected anomaly is not listed in the failure-cause-effect graph; and at least one alarm display unit coupled to the filter, the alarm display unit displaying the identified not-yet-classified alarms and non-redundant alarms.

2. The alarm system according to claim 1, wherein the FDI unit dynamically updates the failure-cause-effect graph concurrently with the generation of alarms by the process control unit and the diagnosis of failures by the FDI unit.

3. The alarm system according to claim 1, wherein the identification of redundant alarms by using the generated failure-cause-effect graph comprises a comparison or logical classification of an alarm corresponding to a detected anomaly with respect to the alarms corresponding to at least one cause and at least one effect of a diagnosed system failure.

4. The alarm system according to claim 3, wherein a previously displayed alarm is identified as a redundant alarm if a new alarm corresponding to a detected anomaly is diagnosed as a cause of the previously displayed alarm.

5. The alarm system according to claim 3, wherein an alarm corresponding to a detected anomaly is identified as a redundant alarm if it is one of the determined effects of a previously displayed alarm.

6. The alarm system according to claim 3, wherein an alarm previously identified as a redundant or a not-yet-classified alarm is re-identified as a redundant or a non-redundant alarm when the failure-cause-effect graph is updated.

7. The alarm system according to claim 1, wherein the alarm display unit displays at least one reason for suppressing the redundant alarms.

8. A method of dynamically suppressing redundant alarms in a system, the method using a failure-cause-effect graph comprising alarms corresponding to at least one cause and at least one effect of each of a plurality of failures diagnosed in the system, the failure-cause-effect graph being dynamically updated concurrently with the generation of alarms and the diagnosis of failures, the method comprising the steps of: detecting at least one anomaly in the system; diagnosing at least one system failure; identifying an alarm corresponding to a detected anomaly as a not-yet-classified alarm if the alarm corresponding to the detected anomaly is not listed in the failure-cause-effect graph; identifying an alarm corresponding to a detected anomaly as a redundant alarm by comparison or logical classification of the alarm with respect to alarms corresponding to at least one cause and at least one effect of a diagnosed system failure listed in the failure-cause-effect graph; suppressing the identified redundant alarm; and displaying a non-redundant alarm and the identified not-yet-classified alarm.

9. The method according to claim 8, wherein an alarm corresponding to a detected anomaly is identified as a redundant alarm if it is one of the determined effects of a previously displayed alarm.

10. The method according to claim 8, wherein a previously displayed alarm is identified as a redundant alarm if a new alarm corresponding to a detected anomaly is diagnosed as a cause of the previously displayed alarm.

11. A computer program product, comprising a computer usable medium having a computer readable program code embodied therein, for dynamically suppressing redundant alarms in a system, by using a failure-cause-effect graph comprising alarms corresponding to at least one cause and at least one effect of each of a plurality of failures diagnosed in the system, the failure-cause-effect graph being dynamically updated concurrently with the generation of alarms and the diagnosis of failures, the computer program product comprising: program instruction means for detecting at least one anomaly in the system; program instruction means for diagnosing at least one system failure; program instruction means for identifying an alarm corresponding to a detected anomaly as a not-yet-classified alarm if the alarm corresponding to the detected anomaly is not listed in the failure-cause-effect graph; program instruction means for identifying an alarm corresponding to a detected anomaly as a redundant alarm by comparison or logical classification of the alarm with respect to alarms corresponding to at least one cause and at least one effect of a diagnosed system failure listed in the failure-cause-effect graph; program instruction means for suppressing the identified redundant alarm; and program instruction means for displaying a non-redundant alarm and the identified not-yet-classified alarm.

Description

FIELD OF THE INVENTION

[0001] The invention relates to the field of failure diagnosis and alarm systems. More specifically, the invention relates to the suppression of redundant alarms corresponding to failures occurring in a system being monitored.

BACKGROUND OF THE INVENTION

[0002] All large as well as small-scale systems, such as manufacturing units, processing units, nuclear plants, computer networks, etc., are generally provided with a failure monitoring unit. The failure monitoring unit ensures the safe operation of the system being monitored, by generating alarms corresponding to each failure detected by it. The failure monitoring unit is coupled to an alarm system, which displays the generated alarms, thereby informing a user about the occurrence of a failure in the monitored system.

[0003] The failure monitoring unit generates alarms corresponding to failures in the monitored system by using pre-defined criteria, such as a process variable exceeding its threshold value. Process variables include the parameters upon which the safe working of the monitored system depends. For example, in a steel manufacturing unit, the process variables include the temperature and pressure inside a reactor; in a computer network, the variables may include a time-out parameter applicable to the network. Very often, alarms are generated due to a slight deviation of the process variables from a desired value, and do not correspond to a critical system failure. These alarms are termed nuisance alarms. Most conventional alarm systems display even the nuisance alarms.

[0004] In other situations, a critical failure, which affects many parts of the monitored system, leads to the generation of a very large number of alarms. Most of these alarms are generated due to the same root failure and correspond to the various degrees of the failure and the process variables involved in the failure. Hence, generally, the alarms generated by the failure monitoring unit corresponding to failures in the monitored system are related. This is because a particular alarm is either a cause or an effect of a previously generated alarm, thereby, making most of the generated alarms redundant.

[0005] For example, consider that at an instant a first alarm is generated corresponding to the failure of a node in a LAN network, and at the next instant a second alarm is generated corresponding to the failure of a peripheral device attached to the node. In this case the second alarm corresponds to an effect of the previously generated first alarm because the failure of the peripheral device is a direct consequence of the failure of the node. Hence, the second alarm is a redundant alarm. In other words, the first alarm is a cause of the second alarm, thereby making the two alarms related to each other.

[0006] Hence, the failure-monitoring unit generates a large number of alarms corresponding to a sequence of failures in the monitored system. If the alarm system displays all the generated alarms, it becomes difficult for a user managing the monitored system to experimentally judge and locate the important alarms corresponding to a root failure from the large number of alarms being displayed at a particular instant. Hence, the user is unable to react to the root failure that is causing the alarms being displayed, and the alarm system is rendered useless.

[0007] Conventionally, various alarm systems have been designed with the objective of suppressing redundant alarms, thereby preventing them from being displayed.

[0008] U.S. Pat. No. 5,581,242 titled `Automatic alarm display processing system in plant` relates to a method for automatically selecting important alarms from the alarms generated in a plant operation monitoring system. The method involves suppression of an alarm if the conditions for alarm suppression are met. The conditions for alarm suppression are stored in a causal table.

[0009] U.S. Pat. No. 6,594,236 titled `Alarm suppressing method for optical transmission apparatus` relates to an alarm suppressing method for an optical transmission apparatus. The method involves an analysis of the relation between a root alarm and the alarms generated subsequent to the root alarm, which are referred to as propagation alarms.

[0010] There are certain limitations associated with the prior art alarm systems and methods. Some of these alarm systems use hard coded alarm suppression rules. The alarm systems use these rules to decide which of the related alarms generated by the failure monitoring units are redundant. The alarm systems then suppress the redundant alarms and display only the non-redundant ones. These alarm suppression rules are generally based on an alarm source containment hierarchy. The alarm source containment hierarchy is a previously made list of all the possible failures that can occur in the monitored system along with the causes and effects of the failures. In accordance with the alarm suppression rules, the alarm systems suppress the alarms corresponding to the effects of a failure. Hence, these alarm systems succeed in preventing a large number of related alarms from being displayed. However, the hard coded alarm suppression rules may hide the concurrent occurrence of multiple failures in the monitored system. This may lead to no alarm being displayed, corresponding to some of the concurrently occurring unrelated failures. In addition, the operation of conventional alarm systems is not flexible, and setting them up and maintaining them is cumbersome.

[0011] Another method used to suppress redundant alarms involves the suppression of the alarms on the basis of the `time stamps` associated with them. All the alarms occurring at a later time, with respect to a base alarm, are suppressed. However, the alarm systems employing this method of alarm suppression require a very high time resolution and highly synchronised system clocks. In many situations, this method may also hide the concurrent occurrence of multiple failures in the monitored system. In general, these alarm systems are prone to synchronisation inaccuracies and delays. In addition, it is difficult to conclude causality from the time sequence of alarms generated in the monitored system.
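
The limitation of time-stamp-based suppression can be illustrated with a minimal Python sketch. It is not taken from any cited system: the `Alarm` record, the alarm names, and the `window` parameter (how long after the base alarm later alarms are suppressed) are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    name: str
    timestamp: float  # seconds; hypothetical clock reading


def suppress_by_timestamp(alarms, window=5.0):
    """Show the earliest (base) alarm and suppress every later alarm
    that falls inside the assumed suppression window."""
    ordered = sorted(alarms, key=lambda a: a.timestamp)
    if not ordered:
        return []
    base = ordered[0]
    shown = [base]
    for alarm in ordered[1:]:
        if alarm.timestamp - base.timestamp > window:
            shown.append(alarm)
    return shown


alarms = [
    Alarm("node_down", 0.0),
    Alarm("peripheral_down", 0.4),  # effect of node_down
    Alarm("switch_b_down", 0.9),    # unrelated, concurrent failure
]
shown = [a.name for a in suppress_by_timestamp(alarms)]
print(shown)  # ['node_down']
```

In this sketch the unrelated `switch_b_down` alarm is hidden only because it arrived shortly after the base alarm, which is the kind of masking of concurrent failures described above.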

[0012] Hence, the existing alarm systems are not capable of dynamically suppressing redundant alarms concurrently with their generation in the monitored system. These alarm systems are also not capable of displaying non-redundant alarms without a significant time lag between the generation of the alarm and the presentation of the alarm. In addition, the existing alarm systems are not capable of preventing the suppression of an alarm, corresponding to a critical failure in a situation where multiple failures occur concurrently in the monitored system.

DESCRIPTION OF THE INVENTION

[0013] It is therefore an objective of the invention to provide a system and method for suppressing redundant alarms and displaying non-redundant and not-yet-classified alarms. This objective is achieved by an alarm system according to claim 1, a method of suppressing redundant alarms according to claim 8, and a computer program product for suppressing redundant alarms according to claim 11. Further preferred embodiments are evident from the dependent patent claims.

[0014] In accordance with an embodiment of the invention, an alarm system capable of suppressing redundant alarms is provided. The alarm system comprises at least one process control unit for detecting anomalies in a system being monitored and generating alarms corresponding to the detected anomalies; a fault diagnosis and isolation (FDI) unit for diagnosing failures in the monitored system; a filter coupled to the FDI unit and the process control unit, the filter managing alarms; and at least one alarm display unit coupled to the filter for displaying the identified not-yet-classified alarms and non-redundant alarms. In accordance with another embodiment of the present invention, the filter is coupled to the FDI unit and, via the alarm display unit, to the process control unit.

[0015] The FDI unit generates a failure-cause-effect graph comprising alarms corresponding to at least one cause and at least one effect of each alarm corresponding to a diagnosed failure. The FDI unit dynamically updates the failure-cause-effect graph concurrently with the generation of alarms by the process control unit and the diagnosis of failures by the FDI unit. The filter performs alarm management by identifying redundant alarms and not-yet-classified alarms by using the generated failure-cause-effect graph. An alarm corresponding to a detected anomaly is identified as a not-yet-classified alarm if it is not listed in the failure-cause-effect graph.

[0016] In accordance with a first preferred variant of the invention, the filter identifies the redundant alarms by performing a comparison or logical classification of an alarm corresponding to a detected anomaly with respect to the alarms corresponding to at least one cause and at least one effect of a diagnosed system failure listed in the failure-cause-effect graph.

[0017] A previously displayed alarm, which corresponds to a previously diagnosed failure or a previously detected anomaly, is identified as a redundant alarm if a new alarm corresponding to a detected anomaly is diagnosed as a cause of the previously displayed alarm. In other terms, the previously displayed alarm is identified as a redundant alarm if the failure that led to its generation, or that was diagnosed as its cause, is identified as one of the effects of the failure leading to the generation of the new alarm. Conversely, a new alarm corresponding to a detected anomaly is identified as a redundant alarm if it is one of the determined effects of a previously displayed alarm. In other terms, the new alarm is identified as a redundant alarm if the failure that led to the generation of a previously displayed alarm, or that was diagnosed as its cause, is identified as one of the causes of the failure leading to the generation of the new alarm.

[0018] In a second preferred variant of the invention, an alarm previously identified as a redundant or a not-yet-classified alarm is re-identified as a redundant or a non-redundant alarm when the failure-cause-effect graph is updated.

[0019] In a third preferred variant of the invention, the alarm display unit also displays the reasons for suppressing the redundant alarms.

[0020] In accordance with another embodiment of the invention, a method of dynamically suppressing redundant alarms in a system being monitored is provided. The method uses the failure-cause-effect graph for suppressing the redundant alarms. The method comprises detecting at least one anomaly in the monitored system; diagnosing at least one failure in the monitored system; identifying an alarm corresponding to a detected anomaly as a not-yet-classified alarm if the alarm corresponding to the detected anomaly is not listed in the failure-cause-effect graph; identifying an alarm corresponding to a diagnosed failure as a redundant alarm by comparison or logical classification of the alarm with respect to alarms corresponding to at least one cause and at least one effect of a diagnosed system failure listed in the failure-cause-effect graph; suppressing the identified redundant alarm and displaying a non-redundant alarm and the identified not-yet-classified alarm.

BRIEF DESCRIPTION OF THE DRAWINGS

[0021] The subject matter of the invention will be explained in more detail in the following text with reference to preferred exemplary embodiments illustrated in the attached drawings, of which:

[0022] FIG. 1a depicts an alarm system, in accordance with an exemplary embodiment of the invention;

[0023] FIG. 1b depicts an alarm system, in accordance with another embodiment of the invention;

[0024] FIG. 2 depicts a flow chart for a method for dynamically suppressing redundant alarms, in accordance with the exemplary embodiment of the invention illustrated in FIG. 1a;

[0025] FIG. 3 depicts a flow chart for a method for dynamically suppressing redundant alarms, in accordance with another exemplary embodiment of the invention illustrated in FIG. 1b; and

[0026] FIG. 4 depicts an exemplary failure-cause-effect graph, in accordance with an exemplary embodiment of the invention.

[0027] The reference symbols used in the drawings, and their meanings, are listed in summary form in the list of reference symbols. In principle, identical parts are provided with the same reference symbols in the figures.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0028] FIG. 1a depicts an alarm system 100 in an exemplary embodiment of the present invention. Alarm system 100 comprises a process control unit 101, a fault diagnosis and isolation (FDI) unit 103, a filter 105, and an alarm display unit 107.

[0029] Alarm system 100 manages the alarms generated corresponding to anomalies and failures occurring in a system being monitored and suppresses the redundant alarms. Process control unit 101 and FDI unit 103 receive inputs from sensors (not shown in FIG. 1) fitted in the monitored system. Process control unit 101 detects anomalies in the monitored system and generates alarms corresponding to the detected anomalies. FDI unit 103 diagnoses failures occurring in the monitored system and dynamically generates a failure-cause-effect graph corresponding to the diagnosed failures. Filter 105 looks up each alarm generated by process control unit 101 in the failure-cause-effect graph and identifies redundant and not-yet-classified alarms. The identified redundant alarms are suppressed, whereas the non-redundant and the not-yet-classified ones are displayed.

[0030] The alarm system provided by the invention may be employed in any system containing a failure monitoring mechanism to detect system failures. Examples of such systems include large scale or small-scale plants, or networks. Large scale or small-scale plants include processing plants, nuclear plants and manufacturing plants. A network includes a wireless network, or a computer network, such as Ethernet LAN, WAN and the Internet.

[0031] Process control unit 101 detects anomalies in the monitored system. An anomaly is an undesirable state or behaviour of the monitored system, which can be indicated by the monitored system's process variables exceeding a threshold value. An anomaly is an indicator of a failure in some cases. In other situations, anomalies represent short-term process deviations in the monitored system and do not indicate any critical failures. Examples of anomalies detected in a processing plant include boiler temperature exceeding its threshold value, pipe pressure becoming low, etc.

[0032] In an embodiment of the present invention, process control unit 101 can be a Supervisory Control and Data Acquisition (SCADA) unit. SCADA is a process control application that collects data from the sensors located on a system shop floor or in remote locations and sends the collected data to a central processor for management and control. The SCADA unit resides on the central processor, which receives information from sensors, determines the control requirements of the monitored system and sends commands to control actuators.

[0033] In an embodiment of the present invention, process control unit 101 receives input regarding the states of the process variables via sensors fitted in different parts of the monitored system. Process control unit 101 generates an alarm corresponding to each detected anomaly.
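
Threshold-based anomaly detection of the kind attributed to process control unit 101 might be sketched as follows in Python. This is an illustrative assumption rather than the disclosed implementation: the function name `detect_anomalies`, the variable names, the limit values, and the `_high`/`_low` alarm naming are all hypothetical.

```python
def detect_anomalies(readings, limits):
    """Return one alarm name per process variable outside its limits.

    readings: dict mapping a process variable to its current value.
    limits:   dict mapping a process variable to a (low, high) pair.
    """
    alarms = []
    for variable, value in readings.items():
        low, high = limits[variable]
        if value > high:
            alarms.append(f"{variable}_high")
        elif value < low:
            alarms.append(f"{variable}_low")
    return alarms


# Hypothetical sensor readings for a processing plant.
readings = {"boiler_temperature": 412.0, "pipe_pressure": 1.2, "tank_level": 3.0}
limits = {
    "boiler_temperature": (0.0, 400.0),
    "pipe_pressure": (2.0, 10.0),
    "tank_level": (1.0, 5.0),
}
generated = detect_anomalies(readings, limits)
print(generated)  # ['boiler_temperature_high', 'pipe_pressure_low']
```

Every alarm produced this way is passed on regardless of whether it is a root failure or a consequence of one, which is why a downstream filter is needed.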

[0034] FDI unit 103 performs the diagnosis and isolation of the failures occurring in the monitored system. Failure diagnosis and isolation is performed by using methods including expert systems, theorem provers, mathematical and control theoretical models, neural networks, fuzzy logic, and simulators, or combinations thereof, collaborating for example, in the form of a multi-agent system. An exemplary method for fault diagnosis and isolation is described in the IEEE paper titled `Dynamic functional-link neural networks genetically evolved applied to fault diagnosis`, authored by T. Marcu, B. Koppen-Seliger, P. M. Frank and S. X. Ding and presented at the 7th European Control Conference ECC'03, Sep. 1-4, 2003, University of Cambridge, UK.

[0035] FDI unit 103 produces hypotheses regarding the diagnosed failure type and its location, based on a variety of inputs. The input data comprises current and historical measured values of the process variables and simulation results obtained via sensors fitted in different parts of the system.

[0036] In an exemplary embodiment of the invention, FDI unit 103 dynamically generates the failure-cause-effect graph, which comprises a list of alarms corresponding to each diagnosed failure, along with the alarms corresponding to the possible causes and effects of the diagnosed failure. It is to be noted that an alarm listed in the failure-cause-effect graph corresponds to a diagnosed failure, i.e., `a first alarm is a cause or an effect of a second alarm` is equivalently expressed, as `a first failure corresponding to a first alarm is a cause or an effect of a second failure corresponding to a second alarm`.
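
One possible in-memory representation of such a failure-cause-effect graph is a directed graph whose edges point from a cause alarm to its effect alarms. The Python sketch below is a hedged illustration only: the class name, its methods, and the alarm names are hypothetical, and treating indirect (transitive) effects as effects is an assumption that the text does not spell out.

```python
from collections import defaultdict


class FailureCauseEffectGraph:
    """Directed graph: an edge cause -> effect between alarm names."""

    def __init__(self):
        self._effects = defaultdict(set)  # alarm -> alarms it directly causes
        self._alarms = set()              # every alarm listed in the graph

    def add_relation(self, cause, effect):
        self._effects[cause].add(effect)
        self._alarms.update((cause, effect))

    def is_listed(self, alarm):
        return alarm in self._alarms

    def effects_of(self, alarm):
        """All direct and indirect effects of `alarm` (assumed transitive)."""
        seen, stack = set(), [alarm]
        while stack:
            for nxt in self._effects.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def is_cause_of(self, first, second):
        return second in self.effects_of(first)


graph = FailureCauseEffectGraph()
graph.add_relation("node_down", "peripheral_down")
graph.add_relation("peripheral_down", "print_job_failed")  # hypothetical alarm

print(graph.is_cause_of("node_down", "print_job_failed"))  # True
print(graph.is_listed("fan_noise"))                        # False
```

An alarm for which `is_listed` returns `False` would be the kind of alarm the description calls not-yet-classified.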

[0037] In an alternate embodiment of the invention, FDI unit 103 dynamically generates the failure-cause-effect graph, which comprises a list of diagnosed failures, along with the possible causes and effects of the diagnosed failure.

[0038] In alternate embodiments, FDI unit 103 can generate a data structure such as a table or a chart for listing the diagnosed failures along with their causes and effects. FDI unit 103 updates this graph concurrently with the generation of alarms by process control unit 101 and the diagnosis of failures by FDI unit 103. In particular, the graph is updated whenever a new hypothesis of failure becomes available to FDI unit 103. The new hypothesis of failure is based upon additional information from the sensors providing inputs to FDI unit 103.
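
The effect of a graph update on alarms that have already been classified can be sketched in Python as below. The dictionary layout of the graph (`effects`, mapping a cause alarm to the set of its effect alarms), the function `classify`, and the alarm names are assumptions made for this illustration; the sketch only shows how a not-yet-classified alarm may be re-identified once a new failure hypothesis is added.

```python
def classify(alarm, displayed, effects):
    """Classify `alarm` against the other displayed alarms.

    effects: dict mapping a cause alarm to the set of alarms it causes
             (hypothetical layout of the failure-cause-effect graph).
    """
    listed = set(effects) | {e for es in effects.values() for e in es}
    if alarm not in listed:
        return "not-yet-classified"
    for other in displayed:
        if other != alarm and alarm in effects.get(other, set()):
            return "redundant"
    return "non-redundant"


effects = {}                          # graph is empty: nothing diagnosed yet
displayed = ["valve_stuck", "flow_low"]
before = [classify(a, displayed, effects) for a in displayed]

# A new failure hypothesis becomes available and the graph is updated.
effects["valve_stuck"] = {"flow_low"}
after = [classify(a, displayed, effects) for a in displayed]

print(before)  # ['not-yet-classified', 'not-yet-classified']
print(after)   # ['non-redundant', 'redundant']
```

Re-running the classification after each graph update is one simple way to obtain the re-identification behaviour; an incremental update limited to the affected alarms would be an alternative design.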

[0039] Filter 105 receives the alarms generated by process control unit 101 and the failure-cause-effect graph generated by FDI unit 103. Filter 105 then identifies the redundant and the not-yet-classified alarms from the received alarms by using the failure-cause-effect graph. Next, filter 105 suppresses the redundant alarms and passes on the non-redundant and the not-yet-classified alarms to alarm display unit 107. Hence, filter 105 prevents a large number of redundant alarms from being displayed to a user managing the monitored system. Only the alarms identified as non-redundant or not-yet-classified are displayed.

[0040] In an embodiment of the present invention, filter 105 is implemented in hardware, software, or in a combination of hardware and software, by a computing system. The computing system includes a processor and a memory for storing computer-readable instructions. The processor includes one or more general or special purpose processors, such as a Pentium®, Centrino®, Power PC®, digital signal processor (DSP) etc. The memory includes hard disk variants, floppy/compact disk variants, digital versatile disk (DVD) variants, smart cards, partially or fully hardened removable media, read only memory, random access memory, cache memory etc., in accordance with the requirements of a particular application. Various programming languages or other tools can also be utilized for implementing filter 105, such as those compatible with C variants (e.g., C++, C#), the Java 2 Platform, Enterprise Edition (J2EE) or other programming languages, in accordance with the requirements of a particular application.

[0041] Alarm display unit 107 receives the non-redundant and the not-yet-classified alarms from filter 105 and displays such alarms. In an embodiment of the present invention, the alarms are displayed to the user using an audio as well as a visual display system. In another embodiment, alarm display unit 107 also displays the redundant alarms suppressed by filter 105 and the reasons for suppressing them, for example, using visual representations of the failure-cause-effect graph.
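
A textual form of the suppression reasons could be derived directly from the failure-cause-effect graph, as in the following Python sketch. The function `suppression_reasons`, the wording of the reason string, and the dictionary layout of the graph are hypothetical choices for illustration; the description itself only mentions visual representations of the graph as an example.

```python
def suppression_reasons(displayed, suppressed, effects):
    """Map each suppressed alarm to a human-readable reason.

    effects: dict mapping a cause alarm to the set of alarms it causes
             (hypothetical layout of the failure-cause-effect graph).
    """
    reasons = {}
    for alarm in suppressed:
        causes = [shown for shown in displayed
                  if alarm in effects.get(shown, set())]
        if causes:
            reasons[alarm] = "effect of displayed alarm(s): " + ", ".join(causes)
    return reasons


effects = {"node_down": {"peripheral_down"}}
reasons = suppression_reasons(
    displayed=["node_down"],
    suppressed=["peripheral_down"],
    effects=effects,
)
print(reasons)  # {'peripheral_down': 'effect of displayed alarm(s): node_down'}
```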

[0042] FIG. 1b depicts an alarm system in accordance with an alternate embodiment of the invention. In this embodiment, process control unit 101 directly sends the generated alarms to alarm display unit 107. Alarm display unit 107 displays the received alarms. In addition, alarm display unit 107 makes the received alarms accessible to filter 105 for the suppression of the redundant alarms and the identification of the non-redundant and the not-yet-classified alarms. Filter 105 suppresses the redundant alarms from the alarms being displayed, and only the identified non-redundant and not-yet-classified alarms continue to be displayed by alarm display unit 107. Filter 105 works in parallel with alarm display unit 107 and therefore, does not create additional latency in displaying the non-redundant and the not-yet-classified alarms.

[0043] FIG. 2 depicts a flow chart for a method for dynamically suppressing redundant alarms, in accordance with the exemplary embodiment of the invention illustrated in FIG. 1a. In accordance with the exemplary embodiment, alarms are generated corresponding to each detected anomaly in the monitored system. However, only the identified non-redundant and not-yet-classified alarms are displayed, the rest being suppressed. In other words, the alarms identified as redundant alarms from amongst the generated alarms, are not displayed.

[0044] Process control unit 101 detects a multitude of anomalies and FDI unit 103 diagnoses a multitude of failures occurring in the monitored system. However, for the purpose of illustration, FIGS. 2 and 3 have been described assuming a single anomaly and a single failure. It will be apparent to a person skilled in the art that the method described in FIGS. 2 and 3 can be extended to a scenario in which there is more than one anomaly and more than one failure.

[0045] At step 201, process control unit 101 detects an anomaly, and an alarm corresponding to the detected anomaly is generated. At step 203, FDI unit 103 diagnoses a failure in the monitored system and generates a failure-cause-effect graph. Steps 201 and 203 are performed concurrently.

[0046] Next, at step 205, it is determined whether the alarm corresponding to the detected anomaly is listed in the failure-cause-effect graph or not. In order to accomplish this, filter 105 looks up the failure-cause-effect graph to determine whether the alarm is listed in the graph. If the alarm is not listed in the graph, then at step 207, the alarm is identified as a not-yet-classified alarm and displayed via alarm display unit 107.

[0047] If the alarm corresponding to the detected anomaly is listed in the failure-cause-effect graph, then filter 105 identifies whether the alarm is a redundant alarm or not. In accordance with one embodiment of the invention, filter 105 identifies the redundant alarms by performing a comparison or logical classification of an alarm corresponding to a detected anomaly. The classification determines whether the alarm corresponds to a cause or to an effect of a diagnosed system failure listed in the failure-cause-effect graph.

[0048] At step 209, it is determined whether the alarm corresponding to the detected anomaly is an effect of a previously displayed alarm or not. The previously displayed alarm corresponds to a previously detected anomaly or a previously diagnosed failure, wherein the previously detected anomaly can be a symptom of a later diagnosed failure. Therefore, it is determined whether the detected anomaly is an effect of a previously detected anomaly or a previously diagnosed failure or not. If the alarm is an effect of a previously displayed alarm, then the alarm is identified as a redundant alarm at step 211. At step 217, the identified redundant alarm is suppressed, and hence, is prevented from being displayed.

[0049] If the alarm corresponding to the detected anomaly is not an effect of a previously displayed alarm, step 213 is performed. At step 213, it is determined whether the alarm corresponding to the detected anomaly is a cause of a previously displayed alarm. In other words, it is determined whether the detected anomaly is a cause of a previously detected anomaly or a previously diagnosed failure. If the alarm is a cause of any previously displayed alarm, the previously displayed alarm, which corresponds to a previously diagnosed failure, is identified as a redundant alarm at step 215. At step 217, the identified redundant alarm is suppressed and hence is prevented from being displayed.

[0050] If the alarm corresponding to the detected anomaly is neither a cause nor an effect of a previously displayed alarm, then at step 219 the alarm is identified as a non-redundant alarm and is displayed via alarm display unit 107.
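By way of illustration only, the decision flow of steps 205 to 219 can be sketched in a few lines of Python. The sketch is not part of the disclosed embodiment. It assumes, hypothetically, that the failure-cause-effect graph is held as a dictionary mapping each alarm to the set of alarms listed as its possible effects, and that the alarms currently on display are held in a set. The names `Graph`, `is_listed` and `filter_alarm`, and the returned labels, are introduced here for illustration. The sketch also assumes that an alarm found to be a cause of a displayed alarm is itself displayed, as in the FIG. 4 example of the description.

```python
from typing import Dict, Set

# Hypothetical representation: alarm -> alarms listed as its possible effects.
Graph = Dict[str, Set[str]]


def is_listed(graph: Graph, alarm: str) -> bool:
    """Step 205: an alarm is listed if it appears as a cause or as an effect."""
    return alarm in graph or any(alarm in effects for effects in graph.values())


def filter_alarm(graph: Graph, displayed: Set[str], alarm: str) -> str:
    """Classify one newly generated alarm and update the displayed set in place."""
    if not is_listed(graph, alarm):
        displayed.add(alarm)                    # step 207: shown as not-yet-classified
        return "not-yet-classified"

    # Step 209: is the new alarm an effect of an alarm already on display?
    if any(alarm in graph.get(prev, set()) for prev in displayed):
        return "redundant"                      # steps 211 and 217: never shown

    # Step 213: is the new alarm a cause of alarms already on display?
    superseded = graph.get(alarm, set()) & displayed
    displayed.difference_update(superseded)     # steps 215 and 217
    displayed.add(alarm)                        # assumption: the cause itself is shown
    return "cause-of-displayed" if superseded else "non-redundant"  # step 219
```

In this sketch only direct cause-effect links are consulted; whether indirect (multi-hop) links should also be followed is left open here.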

[0051] FIG. 3 depicts a flow chart for a method for dynamically suppressing redundant alarms, in accordance with another exemplary embodiment of the invention illustrated in FIG. 1b. In accordance with this exemplary embodiment, the alarms generated at step 301 for each detected anomaly are displayed at step 303. Next, not-yet-classified and redundant alarms are identified at steps 307 and 311, respectively, using the failure-cause-effect graph generated at step 305. At step 307, it is determined whether the alarm being displayed is listed in the failure-cause-effect graph. If the alarm being displayed is not listed in the failure-cause-effect graph, it is identified as a not-yet-classified alarm at step 309 and continues to be displayed. At steps 311 and 315, it is determined whether the alarm being displayed is an effect or a cause, respectively, of a previously displayed alarm. If the alarm is an effect of a previously displayed alarm, it is identified as a redundant alarm at step 313. If the alarm is determined to be a cause of a previously displayed alarm, the previously displayed alarm is identified as a redundant alarm at step 317. At step 319, the identified redundant alarms are suppressed. Therefore, in accordance with this embodiment of the present invention, the alarms identified as redundant from amongst the alarms being displayed are suppressed, while the rest continue to be displayed.
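By way of illustration only, the display-first variant of FIG. 3 can be sketched as a pass over the alarms already on display. The sketch is not part of the disclosed embodiment. The function name `prune_display`, the dictionary representation of the graph (alarm mapped to its listed possible effects), and the oldest-first ordering of the display list are assumptions made for this sketch. It further assumes that each alarm is compared only against earlier alarms that are still on display.

```python
from typing import Dict, List, Set


def prune_display(graph: Dict[str, Set[str]], displayed: List[str]) -> List[str]:
    """Return the alarms that remain on display, in their original order."""
    # Every alarm that appears in the graph as a cause or as an effect.
    listed = set(graph) | {e for effects in graph.values() for e in effects}

    kept: List[str] = []
    for alarm in displayed:                      # oldest alarm first (assumption)
        if alarm not in listed:                  # steps 307 and 309
            kept.append(alarm)                   # not-yet-classified: stays shown
            continue
        # Step 311: effect of an earlier alarm that is still displayed?
        if any(alarm in graph.get(prev, set()) for prev in kept):
            continue                             # steps 313 and 319: suppressed
        # Step 315: cause of earlier alarms that are still displayed?
        effects = graph.get(alarm, set())
        kept = [prev for prev in kept if prev not in effects]  # steps 317 and 319
        kept.append(alarm)
    return kept
```

The difference from the FIG. 2 flow is only in timing: here every alarm reaches the display first, and redundancy is resolved afterwards by removing entries.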

[0052] It will also be appreciated that one or more of the elements and steps, depicted in the drawings/figures, can be implemented in a more separated or integrated manner, or even removed or rendered as inoperable in certain cases, as is useful, in accordance with a particular application. Additionally, any signal arrows in the drawings/figures should be considered only as exemplary, and not limiting, unless otherwise specifically noted. Furthermore, the particular features, structures, or characteristics of any specific embodiment of the present invention may be combined in any suitable manner with one or more other embodiments. It is to be understood that other variations and modifications of the embodiments of the present invention described and illustrated herein are possible in light of the teachings herein and are to be considered as part of the spirit and scope of the present invention.

[0053] FIG. 4 depicts an exemplary failure-cause-effect graph, in accordance with an embodiment of the invention. Alarms generated, corresponding to each diagnosed failure (failure alarms) and their possible causes and effects are listed in the failure-cause-effect graph 400. In FIG. 4, arrows point from a cause of a failure alarm to a possible effect of the failure alarm. A failure alarm with more than one outbound arrow signifies that there are several possible effects of the failure alarm. A failure alarm with more than one inbound arrow signifies that there are several possible causes of the failure alarm.
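By way of illustration only, a graph of this kind can be held as two adjacency maps, one for outbound arrows (possible effects) and one for inbound arrows (possible causes). The sketch is not part of the disclosed embodiment. The class name `FailureCauseEffectGraph` and its methods are hypothetical. The arrows loaded into `fig4` are only those that the description states for failure alarms 401 to 415; FIG. 4 itself may contain further arrows that are not reproduced here.

```python
from collections import defaultdict


class FailureCauseEffectGraph:
    """Directed graph in which an arrow points from a cause to a possible effect."""

    def __init__(self):
        self._effects = defaultdict(set)   # outbound arrows: alarm -> possible effects
        self._causes = defaultdict(set)    # inbound arrows: alarm -> possible causes

    def add_arrow(self, cause, effect):
        self._effects[cause].add(effect)
        self._causes[effect].add(cause)

    def possible_effects(self, alarm):
        return set(self._effects.get(alarm, ()))

    def possible_causes(self, alarm):
        return set(self._causes.get(alarm, ()))

    def lists(self, alarm):
        """True if the alarm appears anywhere in the graph."""
        return alarm in self._effects or alarm in self._causes


# Arrows taken from the textual description of FIG. 4 (reference numerals as labels).
fig4 = FailureCauseEffectGraph()
for cause, effect in [(407, 401), (409, 401), (411, 401),
                      (401, 403), (401, 405),
                      (413, 407), (407, 415)]:
    fig4.add_arrow(cause, effect)
```

Keeping both maps makes the two lookups needed by the filter (is this alarm an effect of a displayed alarm, is it a cause of one) equally cheap, at the cost of storing each arrow twice.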

[0054] For the purpose of illustration, it is assumed that process control unit 101 detects an anomaly, and failure alarm 401 is displayed, corresponding to the detected anomaly at a particular instant. Failure alarms 403 and 405 correspond to failures, which are direct consequences of the detected anomaly, and are identified as the possible effects of failure alarm 401. Failure alarms 407, 409 and 411 correspond to failures, which are the possible causes of the detected anomaly and are identified as the possible causes of failure alarm 401 from failure-cause-effect graph 400.

[0055] As a result, if process control unit 101 detects an anomaly, which corresponds to failure alarm 403 at a later instant, then failure alarm 403 is identified as a redundant alarm and is suppressed. This is because failure alarm 403 corresponds to a failure, which is a direct consequence of the anomaly, detected at the earlier instant. In other words, failure alarm 403 is an effect of the previously displayed failure alarm 401, and hence, is identified as a redundant alarm.

[0056] On the other hand, if process control unit 101 detects an anomaly, which corresponds to failure alarm 407 at a later instant, failure alarm 401 is identified as a redundant alarm and is suppressed. Failure alarm 407 corresponds to a failure, which is a cause of the anomaly detected at the earlier instant. Failure alarm 401 is therefore identified as a redundant alarm because it is an effect of the newly displayed failure alarm 407. As a result, failure alarm 407 is displayed via alarm display unit 107, whereas failure alarm 401 is suppressed.

[0057] Next, failure alarms 413 and 415 are determined to be a cause and an effect of failure alarm 407, respectively, and failure-cause-effect graph 400 is used, as described above, for determining the redundant alarms from the subsequently generated alarms.
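By way of illustration only, the scenarios described for FIG. 4 can be replayed with a short trace. The sketch is not part of the disclosed embodiment. The set `ARROWS` of (cause, effect) pairs and the function `replay` are hypothetical names, and the arrows are limited to those stated in the description. Because every alarm in these scenarios is listed in the graph, the not-yet-classified check is omitted from the sketch.

```python
# (cause, effect) pairs stated in the description of FIG. 4.
ARROWS = {
    (401, 403), (401, 405),               # possible effects of failure alarm 401
    (407, 401), (409, 401), (411, 401),   # possible causes of failure alarm 401
    (413, 407), (407, 415),               # cause and effect of failure alarm 407
}


def replay(arrivals):
    """Return [(alarm, display_after_arrival), ...] for a sequence of alarms."""
    display = []
    trace = []
    for alarm in arrivals:
        if any((shown, alarm) in ARROWS for shown in display):
            pass  # effect of a displayed alarm: suppressed, display unchanged
        else:
            # Displayed alarms that are effects of the new alarm become redundant.
            display = [shown for shown in display if (alarm, shown) not in ARROWS]
            display.append(alarm)
        trace.append((alarm, list(display)))
    return trace
```

Under these assumptions, `replay([401, 403])` leaves only 401 on display, and `replay([401, 407])` ends with only 407 on display, matching the two cases described above.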

[0058] Therefore, the present invention suppresses the redundant alarms in the monitored system and prevents a large number of alarms from being displayed to a user managing the monitored system. The invention makes use of the failure-cause-effect graph generated by FDI unit 103 to suppress the redundant alarms instead of relying on purely mechanistic suppression schemes. For identifying the redundant alarms, both the causes and effects of a generated alarm are identified. A previously generated alarm is suppressed if a newly generated alarm is closer to the root cause of the failure causing the previously generated alarm. The invention also makes use of the alarm prediction made by FDI unit 103, so that, if a newly generated alarm is already expected as an effect of a previously generated alarm, it is directly suppressed.

[0059] As a result, only the most relevant alarms to which the user needs to react with counter-actions are displayed. The invention also reduces the risk of ignoring concurrent anomalies and failures occurring in the monitored system.

LIST OF DESIGNATIONS

[0060] 101 Process Control Unit
[0061] 103 FDI
[0062] 105 Filter
[0063] 107 Alarm Display Unit
[0064] 201 System anomaly detection and alarm generation step
[0065] 203 System failure diagnosis and failure-cause-effect graph generation step
[0066] 205 Not-yet-classified alarm determination step
[0067] 207 Display of not-yet-classified alarm step
[0068] 209 Determination of an alarm being an effect of a previously generated alarm step
[0069] 211 Identification of the alarm as a redundant alarm step
[0070] 213 Determination of an alarm being a cause of a previously generated alarm step
[0071] 215 Identification of the previously generated alarm as a redundant alarm step
[0072] 217 Suppression of the redundant alarm step
[0073] 219 Display of non-redundant alarm step
[0074] 301 System anomaly detection step
[0075] 303 Alarm display step
[0076] 305 System failure diagnosis and failure-cause-effect graph generation step
[0077] 307 Not-yet-classified alarm determination step
[0078] 309 Display of not-yet-classified alarm step
[0079] 311 Determination of an alarm being an effect of a previously generated alarm step
[0080] 313 Identification of the alarm as a redundant alarm step
[0081] 315 Determination of an alarm being a cause of a previously generated alarm step
[0082] 317 Identification of the previously generated alarm as a redundant alarm step
[0083] 319 Suppression of the redundant alarm step
[0084] 401 Failure alarm 6
[0085] 403 Failure alarm 2
[0086] 405 Failure alarm 3
[0087] 407 Failure alarm 9
[0088] 409 Failure alarm 10
[0089] 411 Failure alarm 11
[0090] 413 Failure alarm 13

* * * * *