U.S. patent application number 13/222169 was filed with the patent office on 2011-08-31 and published on 2012-03-01 for method of configuring a sensor-based detection device and a corresponding computer program and adaptive device.
This patent application is currently assigned to COMMISSARIAT A L'ENERGIE ATOMIQUE ET AUX ENE ALT. Invention is credited to Pierre JALLON.
Application Number | 20120054133 13/222169 |
Family ID | 44169109 |
Filed Date | 2011-08-31 |
United States Patent Application | 20120054133 |
Kind Code | A1 |
JALLON; Pierre | March 1, 2012 |
METHOD OF CONFIGURING A SENSOR-BASED DETECTION DEVICE AND A
CORRESPONDING COMPUTER PROGRAM AND ADAPTIVE DEVICE
Abstract
This method of configuring a device for detecting a situation
from among a set of situations in which it is possible to find a
physical system observed by at least one sensor, comprises the
following steps: receiving (102) a training sequence corresponding
to a determined situation of the physical system; determining (118)
parameters of a statistical hidden Markov model recorded on the
detection device and related to the determined situation, based on
a prior initialization (104-116) of these parameters. The prior
initialization (104-116) comprises the following steps: determining
(104, 106) multiple probability distributions from the training
sequence; distributing (108-114) the determined probability
distributions between the hidden states of the statistical model
being used; and initializing the parameters of the statistical
model being used from representative probability distributions
determined for each hidden state of the statistical model being
used.
Inventors: | JALLON; Pierre (Grenoble, FR) |
Assignee: | COMMISSARIAT A L'ENERGIE ATOMIQUE ET AUX ENE ALT (Paris, FR) |
Family ID: | 44169109 |
Appl. No.: | 13/222169 |
Filed: | August 31, 2011 |
Current U.S. Class: | 706/12 |
Current CPC Class: | G06K 9/6297 20130101; G06N 5/00 20130101; G06N 5/045 20130101; G06N 7/005 20130101 |
Class at Publication: | 706/12 |
International Class: | G06F 15/18 20060101 G06F015/18 |
Foreign Application Data
Date | Code | Application Number |
Aug 31, 2010 | FR | 10 56894 |
Claims
1. A method of configuring a device (10) for detecting a situation
from among a set of situations (S-1, . . . , S-N) in which it is
possible to find a physical system (40) observed by at least one
sensor (18), comprising the following steps: receiving (102) a
sequence of observation data of the physical system, called a
training sequence (L-1, . . . , L-N), provided by the sensor and
corresponding to a determined situation of the physical system,
determining, from the training sequence, the parameters of a
statistical hidden Markov model (HMM-1, . . . , HMM-N) recorded
onto the detection device's storage media (20) and relating to the
determined situation, by prior initializing (104-116) these
parameters, then updating (118) these initialized parameters,
characterized in that this prior initialization (104-116) comprises
the following steps: with the statistical model being used having a
given number of hidden states, determining (104, 106) multiple
probability distributions from the training sequence, by dividing
the sequence into sub-sequences and assigning to each sub-sequence
a probability distribution statistically modeling it, the number of
determined probability distributions being greater than the number
of hidden states in the statistical model being used, distributing
(106-114) said determined probability distributions between the
hidden states of the statistical model being used, determining
(110), for each hidden state in the statistical model being used
and, from the probability distributions assigned to said hidden
state, a single probability distribution that is representative of
said hidden state, and initializing (116) the parameters of the
statistical model being used from the determined representative
probability distributions, and in that the method also includes a
configuration step (122) for the detection device such that the
statistical model being used includes the parameters determined by
said prior initialization (104-116) and then said update (118).
2. A configuration method according to claim 1, in which the
distribution (106-114) comprises the execution (110-114) of an
iterative K-Means algorithm on a number of classes equal to the
number of hidden states in the statistical model being used (HMM-1,
. . . , HMM-N), this iterative algorithm comprising, at each
iteration: an estimate (112) of distances between probability
distributions using the Kullback-Leibler divergence, and the
calculation (110), for each class, of a probability distribution
representing its center.
3. A configuration method according to claim 2, in which the
distribution (106-114) comprises an initialization (106, 108, 110)
of the iterative K-Means algorithm, consisting of: sorting (106)
the probability distributions in ascending order of one of the
parameters of said distributions, distributing (108) the sorted
probability distributions into the classes in this ascending order,
from the first to the last class, for each class initialized in
such a way, determining (110) a probability distribution that
represents its center.
4. A configuration method according to claim 3, in which, each
probability distribution being a normal distribution, the sorting
(106) of the probability distributions during the initialization of
the iterative K-Means algorithm involves sorting an expectation
component of said normal distributions.
5. A configuration method according to any one of claims 2 to 4, in
which, each probability distribution being a normal distribution,
the probability distribution representing the center of a class Ki
is a normal distribution determined by the calculation (110) of its
expectation $\mu_i$ and its variance $\Sigma_i$ based on the
expectations $\mu_{i,j}$ and variances $\Sigma_{i,j}$ of all
probability distributions of this class Ki, as follows:
$$\mu_i = \frac{1}{\mathrm{Card}(K_i)} \sum_{j \in K_i} \mu_{i,j} \quad \text{and} \quad \Sigma_i = \frac{1}{\mathrm{Card}(K_i)} \sum_{j \in K_i} \left( \Sigma_{i,j} + \mu_{i,j}^{H} \mu_{i,j} \right) - \mu_i^{H} \mu_i,$$
where Card is the "Cardinal" function and H is the Hermitian operator.
6. A configuration method according to any one of claims 1 to 5, in
which the update (118) of the parameters of the statistical model
being used (HMM-1, . . . , HMM-N) includes the execution of the
Baum-Welch algorithm on the training sequence (L-1, . . . ,
L-N).
7. A configuration method according to any one of claims 1 to 6, in
which the prior initialization (104-116) of the parameters of the
statistical model being used (HMM-1, . . . , HMM-N) also comprises:
the initialization of the initial probabilities of each hidden
state at a common value of equiprobability, and the initialization
of the matrix of transitions from each hidden state to each other
at a matrix whose diagonal coefficients are equal to a first near
value of 1, specifically between 0.8 and 1, and whose other
coefficients are equal to a second near value of 0, specifically
between 0 and 0.2.
8. A computer program that can be downloaded from a communication
network and/or saved on a computer-readable medium and/or executed
by a processor, characterized in that it comprises instructions for
executing the steps of a configuration method according to any one
of claims 1 to 7, when said program is executed on a computer.
9. An adaptive device (10) for detecting a situation from among a
set of situations (S-1, . . . , S-N) in which it is possible to
find a physical system (40) observed by at least one sensor, from
observation data of the physical system provided by the sensor,
comprising: at least one sensor (18) for providing a sequence of
observation data of the physical system, means of storage (20), for
each situation (S-1, . . . , S-N) in the set of situations, of a
statistical hidden Markov model (HMM-1, . . . , HMM-N), a computer
(22), connected to the sensor and to the storage means, programmed
(28) to select one of the situations by comparing probabilities of
these situations, knowing the observation data sequence, the
probabilities being estimated based on stored statistical models,
in which the computer is also programmed (32) to execute the steps
of a configuration method according to any one of claims 1 to 7,
upon receiving a sequence identified as a training sequence
corresponding to a determined situation of the physical system.
10. An adaptive device (10) according to claim 9, in which the
sensor (18) includes at least one of the elements of the set
comprised of a movement sensor with at least one measurement axis,
a pressure monitor, a heart rate monitor, and a glucose monitor.
Description
[0001] This invention relates to a method of configuring a device
for detecting a situation from among a set of situations in which
it is possible to find a physical system observed by at least one
sensor. It also relates to a corresponding computer program and an
adaptive device for detecting a situation in which there is a
physical system observed by at least one sensor.
[0002] The term "physical system" means any system producing a
physical output that can be observed by a sensor, the system being
assumed a priori to be able to be found in a predetermined number
of situations modeled by the detection device.
[0003] The observed physical system may for example be an inanimate
object, such as a structure whose state is to be monitored in order
to detect possible anomalies or deformations using one or more
sensors.
[0004] It can also be an animate system, such as a person or an
animal, for example one suffering from a chronic disease with crisis
situations that can be detected using a sensor. Depending on the
sensor(s) used, there are various detectable situations and many
applications.
[0005] In particular, one promising application is covered in the
paper by P. Jallon et al., entitled "Detection system of motor
epileptic seizures through motion analysis with 3D accelerometers,"
published at the IEEE EMBC 2009 conference. In this paper, an
epileptic seizure detection device using movement sensors,
specifically 3D accelerometers, is based on statistical hidden
Markov models, each modeling as closely as possible, for a given
situation, the statistical properties of the observation sequences
that the sensors are expected to provide in this situation.
Specifically, each statistical hidden Markov model in this device
corresponds to a predetermined possible situation for a person
subject to epileptic seizures, including, for example, a first
crisis situation, a second crisis situation that is different from
the first one, and a situation of no crisis. The detection principle
then consists of selecting one of the possible situations by
comparing the probabilities of the situations given a sequence of
observations provided by at least one accelerometer, the
probabilities being calculated based on each of the statistical
hidden Markov models in the device.
[0006] The problem with such a detection device is that it is not
adaptive. The parameters of the statistical models are
predetermined, specifically saved once and for all in the device,
and must nevertheless remain relevant when the detection device is
used by different people. However, because each person reacts
differently in epileptic crisis situations or in a situation of no
crisis, a detection device that performs well for one person will
not necessarily do so for another.
[0007] More generally, it is known to configure or reconfigure a
statistical hidden Markov model when one has at least one training
sequence considered as representative of the assumed situation
modeled by this statistical model.
[0008] Thus, the invention more specifically applies to a method of
configuration comprising the following steps: [0009] receiving a
sequence of observation data of the physical system, called a
training sequence and corresponding to a determined situation of
the physical system, [0010] determining, from the training
sequence, the parameters of a statistical hidden Markov model
relating to the determined situation, by first initializing these
parameters, then updating these initialized parameters.
[0011] Such a method of configuration is for example proposed in
the paper by L. Rabiner, titled "A tutorial on Hidden Markov Models
and selected applications in speech recognition," Proceedings of
the IEEE, vol. 77, no. 2, pp. 257-286, February 1989. In this
paper, the update is performed by an iterative
expectation-maximization algorithm, specifically the Baum-Welch
algorithm. But like any iterative optimization algorithm, it
is particularly sensitive to the prior initialization of the
parameters to be optimized. In fact, if it is improperly
initialized, the result it provides, although numerically
stable, may be largely sub-optimal, for example because it converges
toward a local maximum of the cost function it optimizes.
At worst, it may not even converge numerically and may provide
aberrant output parameters.
[0012] In practice, the parameters to be initialized, and then
possibly updated, for a statistical hidden Markov model, are:
[0013] C, the number of hidden states in the statistical model that
is used,
[0014] $\pi_1, \ldots, \pi_C$, the C initial probabilities,
independent of any observation, of each hidden state of the
statistical model that is used,
[0015] $(a_{i,j})_{1 \le i,j \le C}$, the matrix of probabilities
for transition from each hidden state i to each other hidden state
j in the statistical model that is used, and
[0016] for each hidden state, the parameters of a probability
distribution of the observation provided at each instant by the
sensor, this observation being considered as a random variable.
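As an illustration only, the parameter set listed above can be held in a small container. The sketch below assumes one normal distribution per hidden state; the class name `GaussianHMM`, its field names, and the `validate` helper are hypothetical and do not come from the original text.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class GaussianHMM:
    """Hypothetical container for the parameters of one hidden Markov model."""

    pi: np.ndarray     # shape (C,): initial probability of each hidden state
    A: np.ndarray      # shape (C, C): A[i, j] = P(state j at t+1 | state i at t)
    means: np.ndarray  # shape (C, d): expectation of each state's distribution
    covs: np.ndarray   # shape (C, d, d): variance of each state's distribution

    @property
    def C(self) -> int:
        """Number of hidden states (fixed before initialization)."""
        return self.pi.shape[0]

    def validate(self) -> None:
        """Raise ValueError if shapes or probability constraints are violated."""
        C, d = self.C, self.means.shape[1]
        if self.A.shape != (C, C):
            raise ValueError("transition matrix must be C x C")
        if self.means.shape != (C, d) or self.covs.shape != (C, d, d):
            raise ValueError("means and covs must have one entry per state")
        if not np.isclose(self.pi.sum(), 1.0):
            raise ValueError("initial probabilities must sum to 1")
        if not np.allclose(self.A.sum(axis=1), 1.0):
            raise ValueError("each row of the transition matrix must sum to 1")
```

A model with C = 2 states and three-axis observations would then carry a 2-vector, a 2x2 matrix, a 2x3 array of expectations, and a 2x3x3 array of variances.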
[0017] Note that the Baum-Welch algorithm or any other known
expectation-maximization algorithm does not allow the number C of
hidden states to be updated, said number being considered a
constant. C must therefore be set prior to the initialization, and
it is not updated by the algorithm.
[0018] Also note that the probability distribution for each hidden
state of the statistical model being used may be multidimensional
if the observation is multidimensional, meaning that data provided
by the sensor (or set of sensors) to the detection device contains
multiple values. For example, if the probability distribution is
chosen as being a normal distribution, the sufficient parameters
for defining it are its expectation and its variance, which may be
scalars when the probability distribution is one-dimensional, or
respectively a vector and a matrix when the probability
distribution is multidimensional.
[0019] Finally, note that the parameters of the statistical model
being used can be determined based on one or more training
sequences, knowing that it is generally recommended to use multiple
sequences to statistically improve the adaptation of the
statistical model being used based on the reality of observation
sequences in the situation it is supposed to model. For a single
training sequence, the cost function to optimize by updating the
parameters in the statistical model being used corresponds to the
probability of observing the training sequence under this model. For
multiple training sequences, the cost function becomes the product
of the probabilities of observing the training sequences, still
under this same model.
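Purely as an illustrative restatement, the two cost functions described above can be written in symbols. The notation is an assumption introduced here and does not appear in the original: $\theta$ stands for the full set of model parameters, $O = (o_1, \ldots, o_T)$ for one training sequence, $q_t$ for the hidden state at instant t, $b_i$ for the probability density attached to hidden state i, and $O^{(1)}, \ldots, O^{(K)}$ for K training sequences.

```latex
% Single training sequence: sum over all hidden state paths
\mathcal{L}(\theta) = P(O \mid \theta)
  = \sum_{q_1, \ldots, q_T} \pi_{q_1}\, b_{q_1}(o_1)
    \prod_{t=2}^{T} a_{q_{t-1}, q_t}\, b_{q_t}(o_t)

% K training sequences: product of the individual probabilities
\mathcal{L}(\theta) = \prod_{k=1}^{K} P\big(O^{(k)} \mid \theta\big),
\qquad
\log \mathcal{L}(\theta) = \sum_{k=1}^{K} \log P\big(O^{(k)} \mid \theta\big)
```

In practice the logarithmic form is usually the one evaluated, since a product of many small probabilities underflows quickly in floating-point arithmetic.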
[0020] To overcome the shortcomings of the algorithm for updating
the initialized parameters, a well-known solution consists of
providing multiple sets of initial parameters, executing the
expectation-maximization algorithm on each set of initial
parameters, and finally selecting the one that provides the best
value for the optimized cost function. This solution reduces the
risk of ending up in an unfavorable case of executing the
algorithm, but it does not solve the problem of initialization and
greatly increases the processing involved with the training
sequence.
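The multi-start strategy described above can be sketched generically. In the sketch below, `best_of_restarts`, `refine`, and `score` are hypothetical names: `refine` stands for any update procedure (for example an expectation-maximization run) and `score` for the cost function to maximize. Neither is implemented here, which also makes the cost of the approach visible: one full refinement per set of initial parameters.

```python
from typing import Callable, Iterable, Tuple, TypeVar

P = TypeVar("P")  # type of a parameter set


def best_of_restarts(
    initial_sets: Iterable[P],
    refine: Callable[[P], P],
    score: Callable[[P], float],
) -> Tuple[P, float]:
    """Refine every initial parameter set and keep the best-scoring result."""
    best_params = None
    best_score = float("-inf")
    found = False
    for init in initial_sets:
        params = refine(init)      # one full run of the update algorithm
        value = score(params)      # value of the optimized cost function
        if not found or value > best_score:
            best_params, best_score, found = params, value, True
    if not found:
        raise ValueError("at least one set of initial parameters is required")
    return best_params, best_score
```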
[0021] Other solutions include trying to directly improve the prior
initialization step.
[0022] A method of initializing hidden Markov models is for example
described in the paper by K. Nathan et al, titled "Initialization
of hidden Markov models for unconstrained on-line handwriting
recognition", published during the ICASSP conference, 1996. In this
paper, each hidden state in a Markov model has multiple summed
normal distributions whose parameters are obtained by an upfront
classification of the observations. These normal distributions are
common to all of the states, these states being differentiated only
by weight coefficients. The initialization actually involves
determining these weight coefficients. However, this method is
specific to a model that is very specifically tailored to
handwriting recognition. It cannot be generalized to all hidden
Markov models.
[0023] In the P. Smyth paper, titled "Clustering sequences with
hidden Markov models", published in Advances in Neural Information
Processing Systems, 1996, the authors group the training sequences
according to some measure of similarity. For each of these groups,
a model is learned, and the model computed for the initialization
of the Baum-Welch algorithm is the concatenation of these different
models. The disadvantage of this method is multiplying the number
of hidden states in the final model by the concatenation operation.
Consequently, the final model over-describes the signals of the
training sequence, which in addition to increasing the complexity
of the processing, can significantly disrupt the performance of the
detection device.
[0024] It may therefore be desirable to provide a method of
configuration that can overcome at least some of the above problems
and constraints.
[0025] The invention therefore relates to a method of configuring a
device for detecting a situation from among a set of situations in
which it is possible to find a physical system observed by at least
one sensor, comprising the following steps: [0026] receiving a
sequence of observation data of the physical system, called a
training sequence, provided by the sensor and corresponding to a
determined situation of the physical system, [0027] determining,
from the training sequence, the parameters of a statistical hidden
Markov model recorded onto the detection device's storage media and
relating to the determined situation, by first initializing these
parameters, then updating these initialized parameters, [0028]
configuring the detection device so that the statistical model
being used incorporates the determined parameters, the prior
initialization comprising the following steps: [0029] with the
statistical model being used having a given number of hidden
states, determining multiple probability distributions from the
training sequence, by dividing the sequence into sub-sequences and
assigning to each sub-sequence a probability distribution
statistically modeling it, the number of determined probability
distributions being greater than the number of hidden states in the
statistical model being used, [0030] distributing said determined
probability distributions between the hidden states of the
statistical model being used, [0031] determining, for each hidden
state in the statistical model being used and, from the probability
distributions assigned to said hidden state, a single probability
distribution that is representative of said hidden state, and
[0032] initializing the parameters of the statistical model being
used from the determined representative probability distributions,
wherein the method also includes a configuration step for the
detection device such that the statistical model being used
includes the parameters determined by said prior initialization and
then said update.
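The first step of the prior initialization, dividing the training sequence into sub-sequences and modeling each one by a probability distribution, might look as follows for normal distributions. This is a minimal sketch under stated assumptions: the sub-sequences are taken as contiguous chunks of nearly equal length, each is modeled by its empirical mean and variance, and a small `reg` term is added to keep variances invertible. None of these choices, nor the function name `fit_subsequence_gaussians`, is specified in the original text.

```python
import numpy as np


def fit_subsequence_gaussians(sequence, n_sub, reg=1e-6):
    """Split a training sequence into n_sub contiguous chunks and fit one
    normal distribution (mean, covariance) to each chunk.

    n_sub is expected to be larger than the number of hidden states.
    Returns (means, covs) with shapes (n_sub, d) and (n_sub, d, d).
    """
    seq = np.asarray(sequence, dtype=float)
    if seq.ndim == 1:
        seq = seq[:, None]                      # treat scalars as 1-D vectors
    chunks = np.array_split(seq, n_sub, axis=0)
    if any(len(chunk) < 2 for chunk in chunks):
        raise ValueError("each sub-sequence needs at least two observations")
    d = seq.shape[1]
    means, covs = [], []
    for chunk in chunks:
        mu = chunk.mean(axis=0)
        centered = chunk - mu
        # Maximum-likelihood covariance, regularized (assumption) for stability.
        cov = centered.T @ centered / len(chunk) + reg * np.eye(d)
        means.append(mu)
        covs.append(cov)
    return np.stack(means), np.stack(covs)
```

The returned distributions are the ones that are subsequently distributed among the hidden states of the model being used.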
[0033] Therefore, the initialization of the parameters of any one
of the statistical models of the detection device can be performed
on the basis of another very fine model applied to the training
sequence, this other very fine model being able to have a much
higher number of probability distributions than the number of
hidden states in the model being used. The reduction of this very
fine model, by distributing its probability distributions among the
hidden states in the model being used, then using this distribution
to determine the representative probability distributions of the
hidden states, makes it possible to finely initialize the model
being used, even if it has a limited number of hidden states.
Updating these parameters by known methods then produces a globally
optimal result. Consequently, the adaptation of the detection
device to the physical system observed is improved.
[0034] Optionally, the distribution comprises the execution of an
iterative K-Means algorithm on a number of classes equal to the
number of hidden states in the statistical model being used, this
iterative algorithm comprising, at each iteration: [0035] an
estimate of distances between probability distributions using the
Kullback-Leibler divergence, and [0036] the calculation, for each
class, of a probability distribution representing its center.
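A minimal sketch of such a K-Means iteration over normal distributions is given below. Several points are assumptions made for the illustration and not statements from the original: the divergence is evaluated from each distribution toward the class center (the direction is not specified), the center is obtained by matching the first two moments of the class members, real-valued data are assumed, and the initial class labels are supplied by the caller. The function names are hypothetical.

```python
import numpy as np


def kl_gaussian(m0, S0, m1, S1):
    """Kullback-Leibler divergence KL(N(m0, S0) || N(m1, S1))."""
    d = m0.shape[0]
    diff = m1 - m0
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return 0.5 * (np.trace(np.linalg.solve(S1, S0))
                  + diff @ np.linalg.solve(S1, diff)
                  - d + logdet1 - logdet0)


def class_center(means, covs):
    """Normal distribution matching the first two moments of the members."""
    mu = means.mean(axis=0)
    second = (covs + np.einsum("ni,nj->nij", means, means)).mean(axis=0)
    return mu, second - np.outer(mu, mu)


def kmeans_gaussians(means, covs, labels, n_classes, max_iter=100):
    """Iterate center update and KL-based reassignment until labels are stable.

    Returns (labels, centers); centers[k] is a (mean, covariance) pair.
    If max_iter is reached first, centers lag the labels by one step.
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    labels = np.asarray(labels, dtype=int).copy()
    centers = [None] * n_classes
    for _ in range(max_iter):
        # Center of each class; an emptied class keeps its previous center.
        for k in range(n_classes):
            members = labels == k
            if members.any():
                centers[k] = class_center(means[members], covs[members])
        # Reassign every distribution to the closest center.
        new_labels = labels.copy()
        for n, (m, S) in enumerate(zip(means, covs)):
            dists = [np.inf if c is None else kl_gaussian(m, S, c[0], c[1])
                     for c in centers]
            new_labels[n] = int(np.argmin(dists))
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centers
```

With `n_classes` set to the number of hidden states, `centers[k]` provides a candidate representative distribution for hidden state k.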
[0037] Also optionally, the distribution comprises an
initialization of the iterative K-Means algorithm, consisting of:
[0038] sorting the probability distributions in ascending order of
one of the parameters of said distributions, [0039] distributing
the sorted probability distributions into the classes in this
ascending order, from the first to the last class, [0040] for each
class initialized in such a way, determining a probability
distribution that represents its center.
[0041] Also optionally, with each probability distribution being a
normal distribution, the sorting of the probability distributions
during the initialization of the iterative K-Means algorithm
involves sorting an expectation component of said normal
distributions.
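The sorting-based initialization of the classes can be sketched as follows. The original states only that the sorted distributions are distributed into the classes in ascending order; the sketch assumes groups of nearly equal size and sorts on one chosen component of the expectation. The function name `sorted_initial_labels` and its `component` parameter are hypothetical.

```python
import numpy as np


def sorted_initial_labels(means, n_classes, component=0):
    """Assign distributions to classes by ascending expectation.

    means: array of shape (n, d) or (n,), expectations of the distributions.
    Returns an integer array of n class labels in the range [0, n_classes).
    """
    means = np.asarray(means, dtype=float)
    if means.ndim == 1:
        means = means[:, None]
    n = means.shape[0]
    if n < n_classes:
        raise ValueError("need at least as many distributions as classes")
    # Indices of the distributions in ascending order of the chosen component.
    order = np.argsort(means[:, component], kind="stable")
    labels = np.empty(n, dtype=int)
    # First group of sorted distributions goes to class 0, next to class 1, ...
    for k, idx in enumerate(np.array_split(order, n_classes)):
        labels[idx] = k
    return labels
```

The resulting labels can serve as the starting point of the iterative algorithm, each class center then being computed from the members assigned to it.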
[0042] Also optionally, with each probability distribution being a
normal distribution, the probability distribution representing the
center of a class Ki is a normal distribution determined by the
calculation of its expectation $\mu_i$ and its variance $\Sigma_i$,
based on the expectations $\mu_{i,j}$ and variances
$\Sigma_{i,j}$ of all probability distributions of this class Ki,
as follows:
$$\mu_i = \frac{1}{\mathrm{Card}(K_i)} \sum_{j \in K_i} \mu_{i,j} \quad \text{and} \quad \Sigma_i = \frac{1}{\mathrm{Card}(K_i)} \sum_{j \in K_i} \left( \Sigma_{i,j} + \mu_{i,j}^{H} \mu_{i,j} \right) - \mu_i^{H} \mu_i,$$
where Card is the "Cardinal" function and H is the Hermitian
operator.
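One way to read this formula, offered here as an interpretation rather than as a statement from the original, is as moment matching: the center is the normal distribution whose first two moments equal those of an equal-weight mixture of the distributions of class Ki. The derivation below assumes that expectations are written as row vectors, so that $\mu^{H} \mu$ is a square matrix of the same size as $\Sigma$. The random variable $X$ and the count $n = \mathrm{Card}(K_i)$ are notation introduced for this illustration.

```latex
% X is drawn from the mixture (1/n) \sum_{j \in K_i} N(\mu_{i,j}, \Sigma_{i,j})
\mathbb{E}[X] = \frac{1}{n} \sum_{j \in K_i} \mu_{i,j} = \mu_i

% Each component contributes its own second moment
\mathbb{E}[X^{H} X] = \frac{1}{n} \sum_{j \in K_i}
    \left( \Sigma_{i,j} + \mu_{i,j}^{H} \mu_{i,j} \right)

% Variance = second moment minus outer product of the mean
\Sigma_i = \mathbb{E}[X^{H} X] - \mathbb{E}[X]^{H}\, \mathbb{E}[X]
  = \frac{1}{n} \sum_{j \in K_i}
    \left( \Sigma_{i,j} + \mu_{i,j}^{H} \mu_{i,j} \right) - \mu_i^{H} \mu_i
```

Under this reading, the variance of the center accounts both for the spread inside each member distribution and for the spread of the member expectations around $\mu_i$.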
[0043] Also optionally, the update for the parameters of the
statistical model being used includes the execution of the
Baum-Welch algorithm on the training sequence.
[0044] Also optionally, the prior initialization of the parameters
of the statistical model being used also comprises: [0045] the
initialization of the initial probabilities of each hidden state at
a common value of equiprobability, and [0046] the initialization of
the matrix of transitions from each hidden state to each other at a
matrix whose diagonal coefficients are equal to a first near value
of 1, specifically between 0.8 and 1, and whose other coefficients
are equal to a second near value of 0, specifically between 0 and
0.2.
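The initialization of the initial probabilities and of the transition matrix can be sketched as below. The original gives the ranges for the diagonal and off-diagonal coefficients; the sketch additionally assumes that the off-diagonal value is chosen as (1 - stay) / (C - 1) so that every row sums to 1, which keeps it within the stated range of 0 to 0.2. The function name and the `stay` parameter are hypothetical.

```python
import numpy as np


def init_pi_and_transitions(n_states, stay=0.9):
    """Equiprobable initial probabilities and a near-diagonal transition matrix.

    stay: diagonal coefficient, expected between 0.8 and 1 (exclusive of 1
    here, so that off-diagonal coefficients remain strictly positive).
    """
    if n_states < 2:
        raise ValueError("at least two hidden states are required")
    if not 0.8 <= stay < 1.0:
        raise ValueError("stay must lie in [0.8, 1)")
    pi = np.full(n_states, 1.0 / n_states)       # common equiprobable value
    off = (1.0 - stay) / (n_states - 1)          # assumption: rows sum to 1
    A = np.full((n_states, n_states), off)
    np.fill_diagonal(A, stay)
    return pi, A
```

For three hidden states and `stay=0.9`, each off-diagonal coefficient is 0.05.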
[0047] The invention also relates to a computer program that can be
downloaded from a communication network and/or saved on a
computer-readable medium and/or executed by a processor, comprising
instructions for executing the steps of a configuration method such
as defined above, when said program is executed on a computer.
[0048] The invention also relates to an adaptive device for
detecting a situation from among a set of situations in which it is
possible to find a physical system observed by at least one sensor,
from observation data of the physical system provided by the
sensor, comprising: [0049] at least one sensor for providing a
sequence of observation data of the physical system, [0050] means
of storage, for each situation in the set of situations, of a
statistical hidden Markov model, [0051] a computer, connected to
the sensor and to the storage means, programmed to select one of
the situations by comparing probabilities of these situations,
knowing the observation data sequence, the probabilities being
estimated based on stored statistical models, in which the computer
is also programmed to execute the steps of a configuration method,
as defined above, upon receiving a sequence identified as a
training sequence corresponding to a determined situation of the
physical system.
[0052] Optionally, the sensor includes at least one of the elements
of the set comprised of a movement sensor with at least one
measurement axis, a pressure monitor, a heart rate monitor, and a
glucose monitor.
[0053] The invention will be better understood using the following
description, given purely by way of example and referring to the
accompanying drawings, in which:
[0054] FIG. 1 schematically shows the general structure of a
detection device according to an embodiment of the invention,
[0055] FIG. 2 illustrates a particular use of the detection device
in FIG. 1,
[0056] FIG. 3 illustrates the successive steps of a configuration
method, for example for the device in FIG. 1, according to an
embodiment of the invention, and
[0057] FIGS. 4A to 4D illustrate, using diagrams, the intermediary
results of a distribution step in the configuration method in FIG.
3.
[0058] The device 10 shown in FIG. 1 is an adaptive device for
detecting a situation from among a set of situations in which it is
possible to find a physical system observed by at least one sensor.
For this, it includes an observation module 12, a processing module
14, and an interface module 16.
[0059] The observation module 12 includes one or more sensors,
represented by the single reference 18, for the observation of the
physical system.
[0060] Some non-limiting examples of sensors and situations that
can be observed using these sensors are given below:
[0061] the sensor 18 may, for example, include a movement sensor
with one, two, or three measurement axes, in particular a 3D
accelerometer worn by an individual, for determining an epileptic
seizure or the absence of an epileptic seizure in the individual,
[0062] more generally, it may include a movement sensor for
determining the activity of a mobile system in a set of
predetermined activities,
[0063] it may include a heart rate monitor for determining an
activity of the individual,
[0064] it may include a sensor that monitors glucose in an
individual or animal suffering from diabetes for determining a
crisis situation or the absence of a crisis,
[0065] it may include a pressure monitor to determine the operating
situation (normal, borderline, abnormal) of an installation under
pressure,
[0066] etc.
[0067] The sensor 18 may also include multiple sensors, each
providing observations that, combined, can make it possible to
detect more complex situations.
[0068] The sensor 18 takes measurements on the physical system to
provide at least one observation signal, transmitted in the form of
sequences of observation data to the processing module 14. The
observation data can come directly from a sampling of the
observation signal or be obtained after one or more rounds of
processing, including one or more filters, of this signal. The
observation data is understood to contain one or more values,
including when there is only one sensor 18.
[0069] The processing module 14 is an electronic circuit board,
such as in a computer. It includes means of storage 20, such as
RAM, ROM, or other memory, where the parameters of statistical
hidden Markov models are stored.
[0070] Each situation S-1, . . . , S-N intended to be detectable by
the detection device 10 using the sensor 18 is modeled by a
corresponding statistical hidden Markov model, denoted HMM-1, . . .
, HMM-N.
[0071] Any one of the stored statistical hidden Markov models,
denoted HMM-n and modeling the situation S-n, is defined by the
following parameters:
[0072] Cn, the number of hidden states in this model HMM-n,
[0073] $\pi_1, \ldots, \pi_{Cn}$, the Cn initial probabilities,
independent of any observation, for each hidden state of this model
HMM-n,
[0074] $(a_{i,j})_{1 \le i,j \le Cn}$, the matrix of probabilities
for transition from each hidden state i to each other hidden state
j in this model HMM-n, and
[0075] for each hidden state, the parameters of a probability
distribution for the observation provided at each instant by the
sensor.
[0076] As a non-limiting example, and to simplify the notations, the
probability distribution for each hidden state i in the model HMM-n
can be chosen from the family of normal distributions. In this
case, it is defined by its expectation $\mu n_i$ and its variance
$\Sigma n_i$. When the data provided by the sensor 18 has
multiple values, $\mu n_i$ is a vector comprising as many
components and $\Sigma n_i$ is a matrix comprising as many rows
and columns as there are values provided at each instant.
[0077] The memory 20 can also store, in association with each model
HMM-n, one or more training sequences L-n. Each training sequence
for the model HMM-n is actually an observation sequence provided by
the sensor 18, but a priori known to be extracted from the
observation of the physical system while it was in the situation
S-n. It can therefore be processed upon receipt by the processing
module 14, or even stored in memory 20 in relation with the model
HMM-n for future processing, for configuration or reconfiguration
of the detection device 10 by updating the parameters of the model
HMM-n, as will be detailed with reference to FIG. 3.
[0078] The processing module 14 also includes a computer 22, for
example a computer's central processing unit, equipped with a
microprocessor 24 and a storage space for at least one computer
program 26. This computer 22, and more specifically the
microprocessor 24, is connected to the sensor 18 and to the memory
20.
[0079] The computer program 26 fulfills three main functions,
illustrated by modules 28, 30, and 32 in FIG. 1.
[0080] The first function, performed by the detection module 28,
for example in the form of an instruction loop, is a function for
detecting a situation in which the physical system is found, upon
receipt of an observation sequence provided by the sensor 18. More
specifically, the detection module 28 is programmed to select one
of the situations S-1, . . . , S-N by comparing the probabilities
of these situations, knowing the observation sequence, the
probabilities being estimated based on the stored statistical
models HMM-1, . . . , HMM-N. Solving this selection problem using
statistical hidden Markov models is well known and is one of the
three major categories of problems resolved by hidden Markov
models, as described in the L. Rabiner paper mentioned above. The
method used will therefore not be detailed.
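Although the method is not detailed in the text, one common way to carry out such a selection is to evaluate the log-likelihood of the observation sequence under each stored model with the forward algorithm and keep the highest. The sketch below assumes normal distributions for the hidden states and equal prior probabilities for all situations, so that comparing likelihoods is equivalent to comparing the probabilities of the situations given the observations. The function names and the dictionary layout of `models` are hypothetical.

```python
import numpy as np


def log_gaussian(x, mean, cov):
    """Log-density of a d-dimensional normal distribution at x."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(cov, diff))


def log_likelihood(obs, pi, A, means, covs):
    """log P(obs | model), computed by the forward algorithm in log domain."""
    obs = np.asarray(obs, dtype=float)
    if obs.ndim == 1:
        obs = obs[:, None]
    C = len(pi)
    # log_b[t, i] = log-density of observation t under hidden state i
    log_b = np.array([[log_gaussian(x, means[i], covs[i]) for i in range(C)]
                      for x in obs])
    with np.errstate(divide="ignore"):           # allow log(0) = -inf
        log_pi, log_A = np.log(pi), np.log(A)
    log_alpha = log_pi + log_b[0]
    for t in range(1, len(obs)):
        # Sum over previous states i of alpha_i * a_ij, then emit o_t.
        log_alpha = (np.logaddexp.reduce(log_alpha[:, None] + log_A, axis=0)
                     + log_b[t])
    return np.logaddexp.reduce(log_alpha)


def detect(obs, models):
    """Return the name of the most likely situation and all scores.

    models: dict mapping a situation name to a tuple (pi, A, means, covs).
    """
    scores = {name: log_likelihood(obs, *params)
              for name, params in models.items()}
    return max(scores, key=scores.get), scores
```

If the situations are not equally likely a priori, the logarithm of each prior probability can be added to the corresponding score before taking the maximum.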
[0081] The second function, performed by the recording module 30,
for example in the form of an instruction loop, is a function for
recording, in the memory 20, an observation sequence in relation to
one of the situations S-1, . . . , S-N. This observation sequence
then becomes a training sequence to be used to configure or
reconfigure the detection device 10.
[0082] The third function, performed by the configuration module
32, for example in the form of an instruction loop, is a function
for reconfiguring the detection device 10 by updating the
parameters of at least one statistical model HMM-n stored in memory
20 using a training sequence or a corresponding set of training
sequences L-n. This function will be detailed with reference to
FIG. 3.
[0083] To select which function the processing module 14 must
perform, the interface module 16 may include a mode selector 34
controlled by a user, specifically the individual wearing the
detection device 10, when the observed physical system is an
individual.
[0084] In a simple embodiment, the detection device 10 may be
considered to work by default in detection mode, thus
executing the detection module 28. Because one of the advantages of
the detection device 10 is detecting at least one critical
situation from among a set of possible situations, such as an
epileptic seizure in the wearer of the device subject to this type
of situation, the interface module 16 may also include an alert
trigger 36. This trigger may, for example, include a screen (to
display a warning message), a speaker (to emit an audio signal), or
a transmitter (to transmit a signal to a remote alarm).
[0085] At the request of the operator via the mode selector 34, the
detection device 10 may temporarily switch to recording mode, when
an observation sequence associated with a known situation in the
observed physical system is provided by the sensor 18 and must be
recorded as a training sequence in the memory 20. The detection
device may then include a recording interface 38, by which the
operator defines the observation sequence (for example by marking
its start and end) and associates it with one of the possible
situations. The recording interface 38 may conventionally include
a screen and/or means of input.
[0086] At the request of the operator via the mode selector 34
also, the detection device 10 may temporarily switch to
configuration mode, when the operator believes that there are
sufficient training sequences in memory 20 to improve the
adaptation of the detection device 10 to the observed physical
system.
[0087] Note that the observation module 12, processing module 14,
and interface module 16 are structurally separable. Therefore, the
detection device 10 can be designed as one piece or as several
distinct hardware elements connected together by means of wired or
wireless data transmission. Specifically, the processing module 14
and possibly the interface module 16 can be implemented by a
computer. Only the observation module 12 is required to be in the
vicinity of, or in contact with, the physical system being observed,
since it includes the sensor(s).
[0088] In FIG. 2, a particularly compact embodiment is illustrated,
for an application for monitoring an individual 40. According to
this embodiment, the detection device 10 is entirely embedded in a
box 42 worn by the individual. The sensor is, for example, a 3D
accelerometer, and the observed situations are, for example,
twofold, such as an epileptic seizure modeled by a statistical
model HMM-1 and a situation of no epileptic seizure modeled by a
statistical model HMM-2. For this application, the box 42 is, for
example, firmly held to an arm belonging to the individual 40 by
means of a strap 44, such that the detection device 10 is worn like
a watch.
[0089] The operation of the configuration module 32 will now be
detailed with reference to FIG. 3 using the example of a
configuration of the detection device 10 by updating the parameters
of any one (HMM-n) of the statistical models stored in the memory
20. The execution of the configuration module 32 by the
microprocessor 24 produces the sequence of steps illustrated in
this figure.
[0090] During a first step 100, the number of hidden states of the
model HMM-n is set to a value Cn chosen within a range of possible
values. An example of such a range is [3;10]. Initially, Cn can take
the first value in this range.
[0091] During a step 102, a set L-n of training sequences related
to the situation S-n modeled by the statistical hidden Markov model
HMM-n is received by the microprocessor 24 for processing by the
configuration module 32. It can be received directly from the
sensor 18, but more commonly, it is extracted from the memory 20 in
which the training sequences may have been recorded at very
different times, particularly during different occurrences of the
situation S-n. Specifically, for an application for detecting
epileptic seizures, knowing that the observation sequences
transmitted by the sensor 18 may be processed by the detection
module 28 in sliding windows of observations of, for example, 45
seconds, at a rate of 25 samples per second, each training sequence
may represent several minutes of operation. Therefore, in total, a
set of training sequences may last several minutes, even an hour or
more.
[0092] During the next steps 104 and 106, multiple probability
distributions are determined from the training sequence, the number
Ln of determined probability distributions being greater than, or
much greater than, Cn.
[0093] More specifically, during the step 104, the number Ln of
probability distributions to be determined may optionally be
obtained by dividing all of the training sequences into
sub-sequences of one second each. In the above example, this
results in sub-sequences of 25 samples. In general, a sub-sequence
of 25 pieces of data with one or more values may be enough to
determine a probability distribution, particularly a normal
distribution, statistically modeling this sub-sequence correctly.
Furthermore, the division of the training sequence into
sub-sequences can be performed with or without overlapping between
successive sub-sequences.
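The division of step 104 can be sketched as follows in Python, purely as an illustration. The function name `split_into_subsequences` and its parameters `length` and `step` are hypothetical; the default of 25 samples corresponds to one second at the 25 samples per second given in the example above, and discarding an incomplete trailing sub-sequence is an assumption of this sketch.

```python
def split_into_subsequences(sequence, length=25, step=None):
    """Split a training sequence into fixed-length sub-sequences.

    step == length (the default) gives no overlap; step < length gives
    overlapping sub-sequences. A trailing remainder shorter than
    `length` is dropped (assumption of this sketch).
    """
    if step is None:
        step = length
    return [
        sequence[start:start + length]
        for start in range(0, len(sequence) - length + 1, step)
    ]
```

For instance, a 45-second window at 25 samples per second contains 1125 samples and therefore yields 45 non-overlapping one-second sub-sequences.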
[0094] Therefore, during the step 106, each sub-sequence is
associated with a corresponding probability distribution, for
example a normal distribution of parameters .mu.n.sub.l
(expectation) and .SIGMA.n.sub.l (variance). At this stage of the
method, the determination of the Ln distributions, and thus of their
parameters .mu.n.sub.l and .SIGMA.n.sub.l, is simple: it suffices to
calculate the mean and variance of each sub-sequence, which are taken
as estimators of .mu.n.sub.l and .SIGMA.n.sub.l.
[0095] During this same step, the Ln probability distributions are
sorted in ascending order of the first component of the expectation
parameters .mu.n.sub.l. In the specific case of one-dimensional
training sequences, the sorting is done in ascending order of the
expectations. This results, for example, in a distribution D as
illustrated in FIGS. 4A to 4D.
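A minimal Python sketch of step 106 is given below for the one-dimensional case. The name `fit_and_sort` is hypothetical, and the use of the population variance (rather than the unbiased sample variance) is an assumption, since the description only refers to "the variance" of each sub-sequence.

```python
from statistics import fmean, pvariance


def fit_and_sort(subsequences):
    """Estimate one normal distribution per sub-sequence, then sort.

    Each distribution is represented by a (mean, variance) pair, and the
    list is returned in ascending order of the means.
    """
    distributions = [(fmean(s), pvariance(s)) for s in subsequences]
    return sorted(distributions, key=lambda d: d[0])
```

In the multi-dimensional case the same sorting would use only the first component of each mean vector, as stated above.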
[0096] During the next step 108, an initial distribution of the
determined probability distributions is performed between the Cn
hidden states of the statistical model HMM-n. This distribution is
done based on the previous sorting. For example, if Ln is a
multiple of Cn, i.e., if there exists an integer k such that Ln=kCn,
we can assign the first k probability distributions to a first hidden
state, the next k to a
second state, and so on until the last hidden state. If Ln is not a
multiple of Cn, the distribution can be done on the same basis, for
example by ignoring the last sub-sequences. This step corresponds
to an initial classification of the Ln probability distributions
into Cn classes by equal division, each class corresponding to a
hidden state.
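The equal division of step 108 can be illustrated by the following Python sketch. The name `initial_classes` is hypothetical. The function works on the indices of the sorted distributions and, when Ln is not a multiple of Cn, leaves the last indices unassigned, which is one possible reading of "ignoring the last sub-sequences". It assumes that Ln is at least equal to Cn.

```python
def initial_classes(n_distributions, n_states):
    """Assign sorted distribution indices to classes by equal division.

    Returns one list of indices per hidden state. Indices beyond
    k * n_states, with k = n_distributions // n_states, are ignored.
    """
    k = n_distributions // n_states
    return [list(range(i * k, (i + 1) * k)) for i in range(n_states)]
```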
[0097] During the next step 110, for each class Ki
(1.ltoreq.i.ltoreq.Cn) and based on a calculated average of the
probability distributions assigned to this class Ki, a probability
distribution is determined that represents its center. Let Y be a
random variable that follows the distribution of this center. If
the center must represent the average of the probability
distributions of the class Ki, then we can write:
$$Y = \sum_{l \in Ki} \mathbf{1}(X = l)\, Z_l,$$
where Z.sub.l is a random variable that follows the normal
distribution of index l and parameters .mu.n.sub.i,l and
.SIGMA.n.sub.i,l of the class Ki, and where X is a random variable
that is equal to l if Y follows the same probability distribution as
Z.sub.l.
[0098] The distribution of the center of the class Ki is a sum of
the normal distributions that can be estimated, but it is also
possible to approach it simply by using a normal distribution of
parameters .mu.n.sub.i and .SIGMA.n.sub.i. We then have:
$$\mu n_i = E_{X,Z}(Y) = E_{X,Z}\Big(\sum_{l \in Ki} \mathbf{1}(X = l)\, Z_l\Big) = \sum_{l \in Ki} E_{X,Z}\big(\mathbf{1}(X = l)\big)\, E_{X,Z}(Z_l),$$
then
$$\mu n_i = \frac{1}{Card(Ki)} \sum_{l \in Ki} \mu n_{i,l}, \qquad (1)$$
where Card is the "Cardinal" function, and
$$\Sigma n_i = E_{X,Z}\big((Y - E_{X,Z}(Y))^H (Y - E_{X,Z}(Y))\big) = E_{X,Z}(Y^H Y) - E_{X,Z}(Y)^H E_{X,Z}(Y),$$
$$\Sigma n_i = E_{X,Z}\Big(\Big(\sum_{l \in Ki} \mathbf{1}(X = l)\, Z_l\Big)^H \Big(\sum_{m \in Ki} \mathbf{1}(X = m)\, Z_m\Big)\Big) - \mu n_i^H \mu n_i,$$
$$\Sigma n_i = \sum_{l, m \in Ki} E_{X,Z}\big(\mathbf{1}(X = l)\, \mathbf{1}(X = m)\, Z_l^H Z_m\big) - \mu n_i^H \mu n_i,$$
$$\Sigma n_i = \frac{1}{Card(Ki)} \sum_{l \in Ki} E_{X,Z}(Z_l^H Z_l) - \mu n_i^H \mu n_i,$$
then
$$\Sigma n_i = \frac{1}{Card(Ki)} \sum_{l \in Ki} \big(\Sigma n_{i,l} + \mu n_{i,l}^H \mu n_{i,l}\big) - \mu n_i^H \mu n_i, \qquad (2)$$
where H is the Hermitian operator.
[0099] Equations (1) and (2) show that, as the center of any class
Ki is defined, it is possible to simply calculate its parameters of
normal distribution .mu.n.sub.i and .SIGMA.n.sub.i from the
parameters .mu.n.sub.i,l and .SIGMA.n.sub.i,l of the normal
distributions of class Ki.
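For one-dimensional observations, Equations (1) and (2) reduce to scalar averages, which the following Python sketch illustrates. The name `class_center` and the representation of each normal distribution as a (mean, variance) pair are assumptions made for this example only.

```python
def class_center(members):
    """Center of a class of univariate normal distributions.

    members is a non-empty list of (mean, variance) pairs. The center
    mean is the average of the means (Equation (1)); the center
    variance is the average second moment minus the squared center
    mean (Equation (2)).
    """
    n = len(members)
    mean = sum(m for m, _ in members) / n
    second_moment = sum(v + m * m for m, v in members) / n
    return mean, second_moment - mean * mean
```

For two distributions of means 0 and 2 and of unit variance, the center has mean 1 and variance 2: the unit variance of each member plus the spread of the member means around the center mean.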
[0100] During a next step 112, based on the Cn centers determined
in the previous step, a new distribution of the Ln probability
distributions determined in step 106 is made using a function that
finds the "distance" between normal probability distributions. More
specifically, for each probability distribution determined in step
106, its "distance" is calculated with respect to each of the
centers, and it is then assigned to the class Ki with the nearest
center.
[0101] For this, we define a "distance" function between normal
distributions based on the Kullback-Leibler divergence. Because
this divergence is not symmetric, it is not strictly a distance,
but it can still serve as a measure that can be used at each
classification step. Recall that the Kullback-Leibler divergence
is written as follows for two probability distributions p and
q:
$$D_{KL}(p \,\|\, q) = \int \log\left(\frac{p(u)}{q(u)}\right) p(u)\, du.$$
[0102] For normal distributions pn.sub.l and pn.sub.k with
respective parameters .mu.n.sub.l, .SIGMA.n.sub.l, and .mu.n.sub.k,
.SIGMA.n.sub.k, it takes the following form:
$$D_{KL}(pn_l \,\|\, pn_k) = \frac{1}{2}\left(\log\left(\frac{|\Sigma n_k|}{|\Sigma n_l|}\right) + Tr\big(\Sigma n_k^{-1}\, \Sigma n_l\big) + (\mu n_l - \mu n_k)^H\, \Sigma n_k^{-1}\, (\mu n_l - \mu n_k) - N\right),$$
where |.SIGMA.| is the absolute value of the determinant of the
matrix .SIGMA., Tr is the trace function, and N is the number of
components in the vector .mu.n.sub.l or .mu.n.sub.k.
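In the one-dimensional case (N=1), the determinant and the trace both reduce to the variances themselves, and the divergence can be computed as in the following Python sketch. The name `kl_normal` and its argument order are hypothetical, and strictly positive variances are assumed.

```python
import math


def kl_normal(mean_l, var_l, mean_k, var_k):
    """Kullback-Leibler divergence D_KL(p_l || p_k) between two
    univariate normal distributions with strictly positive variances."""
    return 0.5 * (
        math.log(var_k / var_l)
        + var_l / var_k
        + (mean_l - mean_k) ** 2 / var_k
        - 1.0
    )
```

Swapping the two distributions generally changes the result, which is why the divergence is not strictly a distance.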
[0103] Following the steps 110 and 112, we move to a step 114
during which a stop criterion is tested, comprising at least one of
the following two conditions:
[0104] the new distribution, obtained from the step 112, of the Ln
probability distributions determined in the step 106 is unchanged
from the previous distribution (i.e., the initial distribution in
the step 108 or the distribution obtained during a previous
execution of the step 112),
[0105] the steps 110 and 112 have been repeated a number Nmax of
times, Nmax being a predetermined constant.
[0106] If the stop criterion is not satisfied, the configuration
module 32 returns to the step 110 for another execution of the
steps 110 and 112. Otherwise, it goes to a step 116 to initialize
the parameters of the statistical model HMM-n using the result of
the loop from steps 110 to 114.
[0107] Note that the loop from steps 110 to 114 includes an
implementation of the K-Means algorithm for the unsupervised
automatic classification of the Ln normal distributions in Cn
classes corresponding to the Cn hidden states of the model HMM-n.
The result of this application of the K-Means algorithm to the Ln
probability distributions determined in the step 106 is an
optimized distribution of these probability distributions between
the Cn hidden states of the statistical model HMM-n. In addition,
each center of parameters .mu.n.sub.i and .SIGMA.n.sub.i calculated
at the last execution of the step 110 constitutes a single
probability distribution representative of the class (i.e. the
hidden state) of which it is the center.
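The loop of steps 108 to 114 can be sketched as follows in Python for one-dimensional observations. This is an illustrative sketch rather than the implementation of the configuration module 32: the names `kl_normal`, `class_center`, and `cluster_distributions`, the default value of `n_max`, the direction in which the divergence is evaluated (from each distribution to each center), and the choice of keeping the previous center when a class becomes empty are all assumptions.

```python
import math


def kl_normal(p, q):
    """D_KL(p || q) for univariate normals given as (mean, variance)."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * (math.log(var_q / var_p) + var_p / var_q
                  + (mean_p - mean_q) ** 2 / var_q - 1.0)


def class_center(members):
    """Moment-matched center of a list of (mean, variance) pairs."""
    n = len(members)
    mean = sum(m for m, _ in members) / n
    return mean, sum(v + m * m for m, v in members) / n - mean * mean


def cluster_distributions(dists, n_states, n_max=50):
    """K-Means-style classification of normal distributions.

    dists must already be sorted by ascending mean (step 106) and must
    contain at least n_states entries with positive variances.
    Returns (labels, centers).
    """
    # Step 108: equal division; leftover distributions start unassigned.
    k = len(dists) // n_states
    labels = [i // k if i < k * n_states else None for i in range(len(dists))]
    centers = []
    for _ in range(n_max):
        # Step 110: recompute the center of each class.
        new_centers = []
        for c in range(n_states):
            members = [d for d, lab in zip(dists, labels) if lab == c]
            # Assumption: an emptied class keeps its previous center.
            new_centers.append(class_center(members) if members else centers[c])
        centers = new_centers
        # Step 112: assign each distribution to the nearest center.
        new_labels = [
            min(range(n_states), key=lambda c: kl_normal(d, centers[c]))
            for d in dists
        ]
        # Step 114: stop when the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers
```

When the loop ends because the assignment is unchanged, the returned centers are exactly those of the returned classes; when it ends because `n_max` is reached, the centers are those calculated at the last execution of step 110.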
[0108] The initialization 116 of the parameters of the statistical
model HMM-n is done, based on the previously mentioned result, as
follows:
[0109] the number of hidden states of the initialized model HMM-n is
set to the value Cn,
[0110] the Cn initial probabilities .pi..sub.1, . . . , .pi..sub.Cn
of the model HMM-n are initialized to a common equiprobable value of
1/Cn,
[0111] the matrix of transition probabilities
(a.sub.i,j).sub.1.ltoreq.i,j.ltoreq.Cn for the model HMM-n is
initialized to a matrix whose diagonal coefficients are equal to a
first value close to 1, specifically between 0.8 and 1, and whose
other coefficients are equal to a second value close to 0,
specifically between 0 and 0.2, and
[0112] the parameters of the probability distribution of the
observation provided at each instant by the sensor 18 for the hidden
state Ki are initialized to those, .mu.n.sub.i and .SIGMA.n.sub.i, of
the center calculated during the last execution of the step 110 for
this hidden state.
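The initialization 116 can be sketched as follows in Python for one-dimensional observations. The name `initialize_hmm` and the default diagonal value of 0.9 are hypothetical; spreading the remaining probability evenly over the off-diagonal coefficients, so that each row of the transition matrix sums to 1, is an assumption of this sketch that remains within the ranges given above.

```python
def initialize_hmm(centers, self_transition=0.9):
    """Build initial HMM parameters from the class centers.

    centers is a list of (mean, variance) pairs, one per hidden state.
    Returns (initial, transitions, means, variances).
    """
    n_states = len(centers)
    if n_states == 1:
        # A single state can only transition to itself.
        self_transition, off_diagonal = 1.0, 0.0
    else:
        off_diagonal = (1.0 - self_transition) / (n_states - 1)
    # Equiprobable initial probabilities 1/Cn.
    initial = [1.0 / n_states] * n_states
    # Diagonal close to 1, other coefficients close to 0.
    transitions = [
        [self_transition if i == j else off_diagonal for j in range(n_states)]
        for i in range(n_states)
    ]
    means = [m for m, _ in centers]
    variances = [v for _, v in centers]
    return initial, transitions, means, variances
```

With Cn=5 and a diagonal value of 0.9, each off-diagonal coefficient is 0.025.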
[0113] Following this initialization step 116, we move to a step
118 for updating, from the set of training sequences, the
initialized parameters of the model HMM-n. This update is
performed, as previously indicated, by the execution of an
iterative expectation-maximization algorithm, specifically the
Baum-Welch algorithm, on the set of training sequences. Given the
relevance of the initialization described previously, this step
provides the parameters of the model HMM-n globally optimized with
respect to the set of the training sequences, for a given number Cn
of hidden states.
[0114] During the next step 120, a test is performed to find out
whether the series of steps 108 to 118 must be executed again for a
new value of Cn. Cn is, for example, incremented by one unit, and
if it remains within the mentioned range of possible values, the
method moves to the step 108. Otherwise, it goes to a last step 122
for configuring the detection device 10 so that the statistical
model HMM-n includes the parameters that were ultimately
determined.
[0115] More specifically, during this last step 122, multiple sets
of parameters compete, corresponding to multiple values of Cn. This
step consists of selecting one of them. The configuration module 32
may, for example, retain the set that gives the best value of the
cost function used in the execution 118 of the Baum-Welch algorithm.
Then, this set of parameters that is ultimately determined for the
statistical model HMM-n is recorded in the memory 20.
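The outer loop of steps 120 and 122 can be sketched as follows in Python. The name `configure` and the callable `train` are hypothetical: `train(cn)` is assumed to run steps 108 to 118 for a given number Cn of hidden states and to return the resulting parameters together with a score for which higher is better, for example the final log-likelihood of the Baum-Welch algorithm. The default range corresponds to the example range [3;10].

```python
def configure(train, cn_values=range(3, 11)):
    """Try each candidate number of hidden states and keep the best.

    train(cn) must return a (parameters, score) pair, higher scores
    being better. Returns (best_cn, best_parameters, best_score).
    """
    best = None
    for cn in cn_values:
        params, score = train(cn)
        # Keep the first candidate, then any strictly better one.
        if best is None or score > best[2]:
            best = (cn, params, score)
    return best
```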
[0116] FIG. 4A illustrates, using a diagram, the result of the step
108 and of the first execution of the step 110 on a set of Ln
normal distributions, consistent with what is actually obtained
from a training sequence for an epileptic seizure situation,
previously sorted for Cn=5. The five centers of the five classes,
in which the Ln normal distributions are equally distributed, are
shown using thick lines.
[0117] FIGS. 4B, 4C, and 4D respectively illustrate what happens to
these five centers after the first, second, and third iterations of
the loop of steps 112-114-110. Assuming that FIG. 4D illustrates
the result used in the step 116, note that the five centers that
are ultimately obtained are highly representative of the set of Ln
probability distributions extracted from the set of training
sequences. In any case, they are much more representative of the Ln
probability distributions than the five initial centers in FIG.
4A.
[0118] With respect to the distribution D of the Ln normal
distributions provided as an example, we can easily imagine that this
value 5 of Cn will certainly provide the best result in the step
118 and will be used in the step 122.
[0119] It clearly appears that a detection device like that
described previously allows for precise reconfigurations as
frequently as the user wants. It is therefore easy to adjust the
detection device to the physical system being observed and even to
the changes to this physical system over time, since the
statistical hidden Markov models used for its detection are not
fixed.
[0120] Also note that the invention is not limited to the
embodiment described previously.
[0121] Specifically, the detection device may be designed in many
forms since its observation module 12, processing module 14, and
interface module 16 are separable. Its design can therefore adjust
to the planned application and to the physical system being
observed.
[0122] In addition, an algorithm other than the Baum-Welch
algorithm may be used to execute the step 118 if it is equivalent in
terms of optimizing the parameters of a statistical hidden Markov
model; an algorithm other than the K-Means algorithm may be used to
execute the steps 108 to 114 if it is equivalent in terms of
unsupervised classification with an a priori known number of
classes; and other metrics or methods of calculating centers can be
used to execute the steps 110 and 112.
[0123] More generally, as is known to those skilled in the art,
there are various modifications that can be made to the embodiment
described above, with respect to the teaching that has been
disclosed. In the following claims, the terms used should not be
interpreted as limiting the claims to the embodiment presented in
this description, but should be interpreted to include all of the
equivalents that the claims intend to cover by their formulation
and that those skilled in the art can foresee by applying their
general knowledge to the teaching that has just been
disclosed.