U.S. patent application number 10/890134 was filed with the patent office on 2004-07-14 for "System and method for estimating probabilities of events". Invention is credited to Ronald Holzloehner, Amitkumar Mahadevan, Curtis R. Menyuk, Joel M. Morris, and John W. Zweck.
United States Patent Application 20060193400
Kind Code: A1
Morris; Joel M.; et al.
August 31, 2006
System and method for estimating probabilities of events
Abstract
A dual adaptive importance-sampling system and method is
provided that can estimate the probability of events by combining
dual complementary importance-sampling simulations. The present
invention exploits the ability to determine an optimal biased pdf
using an iterative procedure that requires relatively little a
priori knowledge of how to bias. Hence, the present invention is
particularly suited for evaluating the BERs and/or WERs of coded
communication and storage systems, and is generally applicable to
arbitrarily chosen codes. When applied to coded communication and
storage systems, the present invention provides a versatile
technique for the fast and accurate estimation of BERs and WERs of
FEC codes down to values of 10^-20 or lower.
Inventors: Morris; Joel M. (Columbia, MD); Holzloehner; Ronald (Muenchen, DE); Mahadevan; Amitkumar (Baltimore, MD); Menyuk; Curtis R. (Silver Spring, MD); Zweck; John W. (Columbia, MD)
Correspondence Address: FLESHNER & KIM, LLP, P.O. BOX 221200, CHANTILLY, VA 20153, US
Family ID: 36931917
Appl. No.: 10/890134
Filed: July 14, 2004
Related U.S. Patent Documents
Application Number: 60/486,970; Filing Date: Jul 14, 2003
Current U.S. Class: 375/316
Current CPC Class: H04L 1/203 20130101
Class at Publication: 375/316
International Class: H04L 27/06 20060101 H04L027/06
Government Interests
GOVERNMENT RIGHTS
[0002] This invention was made with government support under Grant
Nos. MDA904-01-C0940/Z949001 and MDA904-02-C0428/Z956801 awarded by
the Department of Defense. The government has certain rights in
this invention.
Claims
1. A method of estimating the probability of occurrence of an event (E), comprising: performing an adaptive unconstrained estimation of an optimal biased distribution of a multi-dimensional random variable (z) defined on a sample space (Γ); performing an importance-sampling (IS) simulation using the optimal biased distribution for z to yield a first result; performing an adaptive constrained estimation of an optimal biased distribution for z over regions of Γ where E occurs; performing an IS simulation using the optimal biased distribution for z, over regions of Γ where E occurs, to yield a second result; and estimating the probability of the occurrence of E based on the first and second results.
2. The method of claim 1, wherein the constrained and unconstrained
estimations are performed iteratively.
3. The method of claim 1, wherein the event comprises a word error
rate for a forward error correcting (FEC) code.
4. The method of claim 3, wherein the FEC code comprises an LDPC code.
5. The method of claim 1, wherein the event comprises a bit error
rate for a forward error correcting (FEC) code.
6. The method of claim 5, wherein the FEC code comprises an LDPC code.
7. The method of claim 1, wherein the steps of performing an adaptive unconstrained estimation of an optimal biased distribution for z and performing an IS simulation using the optimal biased distribution for z to yield a first result comprise: performing adaptive unconstrained Metropolis simulations to iteratively estimate the optimal biased distribution for z; and performing an IS simulation using the optimal biased distribution for z and an unconstrained Metropolis random walk to yield the first result.
8. The method of claim 7, wherein the steps of performing an adaptive constrained estimation of an optimal biased distribution for z over regions of Γ where E occurs and performing an IS simulation using the optimal biased distribution for z over regions of Γ where E occurs to yield a second result comprise: performing adaptive constrained Metropolis simulations to iteratively estimate the optimal biased distribution for z over regions of Γ where E occurs; and performing an IS simulation using the optimal biased distribution for z over regions of Γ where E occurs and a constrained Metropolis random walk to yield the second result.
9. The method of claim 1, wherein the probability of the occurrence
of E based on the first and second results is estimated by: scaling
the second result to fit the first result to yield a scaling
factor; and estimating the probability of the occurrence of E based
on the scaling factor.
10. A method of estimating the probability of occurrence of an event (E), given a known probability distribution (ρ(z)) of a multi-dimensional random variable (z) defined on a sample space (Γ), comprising: determining a scalar mapping (V) from the multi-dimensional sample space to a single-dimensional space; defining bins that partition a first range of values of V (Γ_V^u), such that values of V not in the range of values Γ_V^u have a negligible probability of occurrence; performing adaptive unconstrained Metropolis simulations to iteratively estimate an optimal biased distribution for z; performing an importance-sampling (IS) simulation using the optimal biased distribution for z and an unconstrained Metropolis random walk to yield a first result; defining bins that partition a second range of values of V (Γ_V^c), such that values of V not in the range of values Γ_V^c have a negligible contribution to the probability of occurrence of E; performing adaptive constrained Metropolis simulations to iteratively estimate an optimal biased distribution for z over regions of Γ where E occurs; performing an IS simulation using the optimal biased distribution for z over regions of Γ where E occurs and a constrained Metropolis random walk to yield a second result; and estimating the probability of occurrence of E based on the first and second results.
11. The method of claim 10, wherein the first result comprises
estimates of probability distributions p(V) and p(V, E), and the
second result comprises an estimate of the probability distribution
p(V|E).
12. The method of claim 11, wherein the probability of occurrence
of E (P(E)) is estimated by scaling the estimate of p(V|E) to fit
the estimate of p(V, E) over a range of values of V where the
estimates of p(V|E) and p(V, E) have a predetermined reliability,
to yield a scaling factor (SF), wherein SF comprises a first
estimate of P(E).
13. The method of claim 12, further comprising integrating [p(V|E)*SF] over Γ_V^c to yield a second estimate of P(E), wherein the second estimate is more accurate than the first estimate.
14. The method of claim 10, wherein the adaptive unconstrained Metropolis simulations and the adaptive constrained Metropolis simulations are performed by: (a) setting an initial distribution for V over all bins of Γ_V^u and Γ_V^c for adaptive unconstrained and adaptive constrained simulations, respectively; (b) generating a random walk in the sample space of z using a Metropolis algorithm; (c) generating a histogram (H_k^j, k = 1, ..., M) of V = f(z) over all the bins of Γ_V^u or Γ_V^c based on the values of z generated by the Metropolis random walk; (d) obtaining a new estimate for the distribution of V as P_k^(j+1) = φ(P_k^j, P_(k-1)^j, P_(k-1)^(j-1), H_k^l, H_(k-1)^l; l = 0, 1, ..., j), where φ denotes a predefined function; (e) incrementing counter j and repeating steps (a)-(d) if the histogram of V is not approximately uniform; and (f) setting P_k^∞ = P_k^(j+1), ∀ k ∈ {1, ..., M} and ρ*(z) = ρ(z)/(c·P_k^∞) as an optimal biased distribution of z for all z such that f(z) falls in the k-th bin of Γ_V^u or Γ_V^c, and wherein c is chosen such that ∫ρ*(z)dz = 1.
15. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for estimating the probability of occurrence of an event (E), the method steps comprising: performing an adaptive unconstrained estimation of an optimal biased distribution for a multi-dimensional random variable (z) defined on a sample space (Γ); performing an importance-sampling (IS) simulation using the optimal biased distribution for z to yield a first result; performing an adaptive constrained estimation of an optimal biased distribution for z over regions of Γ where E occurs; performing an IS simulation using the optimal biased distribution for z, over regions of Γ where E occurs, to yield a second result; and estimating the probability of the occurrence of E based on the first and second results.
16. The program storage device of claim 15, wherein the constrained
and unconstrained estimations are performed iteratively.
17. The program storage device of claim 15, wherein the event
comprises a word error rate for a forward error correcting (FEC)
code.
18. The program storage device of claim 17, wherein the FEC code comprises an LDPC code.
19. The program storage device of claim 15, wherein the event
comprises a bit error rate for a forward error correcting (FEC)
code.
20. The program storage device of claim 19, wherein the FEC code comprises an LDPC code.
21. The program storage device of claim 15, wherein the steps of performing an adaptive unconstrained estimation of an optimal biased distribution for z and performing an IS simulation using the optimal biased distribution for z to yield a first result comprise: performing adaptive unconstrained Metropolis simulations to iteratively estimate the optimal biased distribution for z; and performing an IS simulation using the optimal biased distribution for z and an unconstrained Metropolis random walk to yield the first result.
22. The program storage device of claim 21, wherein the steps of performing an adaptive constrained estimation of an optimal biased distribution for z over regions of Γ where E occurs and performing an IS simulation using the optimal biased distribution for z over regions of Γ where E occurs to yield a second result comprise: performing adaptive constrained Metropolis simulations to iteratively estimate the optimal biased distribution for z over regions of Γ where E occurs; and performing an IS simulation using the optimal biased distribution for z over regions of Γ where E occurs and a constrained Metropolis random walk to yield the second result.
23. The program storage device of claim 15, wherein the probability
of the occurrence of E based on the first and second results is
estimated by: scaling the second result to fit the first result to
yield a scaling factor; and estimating the probability of the
occurrence of E based on the scaling factor.
24. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for estimating the probability of occurrence of an event (E), given a known probability distribution (ρ(z)) of a multi-dimensional random variable (z) defined on a sample space (Γ), the method steps comprising: determining a scalar mapping (V) from the multi-dimensional sample space to a single-dimensional space; defining bins that partition a first range of values of V (Γ_V^u), such that values of V not in the range of values Γ_V^u have a negligible probability of occurrence; performing adaptive unconstrained Metropolis simulations to iteratively estimate an optimal biased distribution for z; performing an IS simulation using the optimal biased distribution for z and an unconstrained Metropolis random walk to yield a first result; defining bins that partition a second range of values of V (Γ_V^c), such that values of V not in the range of values Γ_V^c have a negligible contribution to the probability of occurrence of E; performing adaptive constrained Metropolis simulations to iteratively estimate an optimal biased distribution for z over regions of Γ where E occurs; performing an IS simulation using the optimal biased distribution for z over regions of Γ where E occurs and a constrained Metropolis random walk to yield a second result; and estimating the probability of occurrence of E based on the first and second results.
25. The program storage device of claim 24, wherein the first
result comprises estimates of probability distributions p(V) and
p(V, E), and the second result comprises an estimate of the
probability distribution p(V|E).
26. The program storage device of claim 25, wherein the probability of occurrence of E (P(E)) is estimated by scaling the estimate of p(V|E) to fit the estimate of p(V, E) over a range of values of V where the estimates of p(V|E) and p(V, E) have a predetermined reliability, to yield a scaling factor (SF), wherein SF comprises a first estimate of P(E).
27. The program storage device of claim 25, further comprising integrating [p(V|E)*SF] over Γ_V^c to yield a second estimate of P(E), wherein the second estimate is more accurate than the first estimate.
28. The program storage device of claim 24, wherein the adaptive unconstrained Metropolis simulations and the adaptive constrained Metropolis simulations are performed by: (a) setting an initial distribution for V over all partition bins of Γ_V^u and Γ_V^c for adaptive unconstrained and adaptive constrained simulations, respectively; (b) generating a random walk in the sample space Γ using a Metropolis algorithm; (c) generating a histogram (H_k^j, k = 1, ..., M) of V = f(z) over all partition bins of Γ_V^u or Γ_V^c based on the values of z generated by the Metropolis random walk; (d) obtaining a new estimate for the distribution of V as P_k^(j+1) = φ(P_k^j, P_(k-1)^j, P_(k-1)^(j-1), H_k^l, H_(k-1)^l; l = 0, 1, ..., j), where φ denotes a predefined function; (e) incrementing counter j and repeating steps (a)-(d) if the histogram of V is not approximately uniform; and (f) setting P_k^∞ = P_k^(j+1), ∀ k ∈ {1, ..., M} and ρ*(z) = ρ(z)/(c·P_k^∞) as an optimal biased distribution of z for all z such that f(z) falls in the k-th bin of Γ_V^u or Γ_V^c, and wherein c is chosen such that ∫ρ*(z)dz = 1.
29. A system, comprising: a processor programmed with computer readable program code for: performing an adaptive unconstrained estimation of an optimal biased distribution for a multi-dimensional random variable (z) defined on a sample space (Γ), performing an importance-sampling (IS) simulation using the optimal biased distribution for z to yield a first result, performing an adaptive constrained estimation of an optimal biased distribution for z over regions of Γ where E occurs, performing an IS simulation using the optimal biased distribution for z, over regions of Γ where E occurs, to yield a second result, and estimating the probability of the occurrence of E based on the first and second results; and a user interface in communication with the processor.
30. A method of combining a first result from an unconstrained
importance-sampling (IS) simulation with a second result from a
constrained IS simulation, comprising: receiving the first result;
receiving the second result; and scaling the second result to fit
the first result to yield a scaling factor.
31. The method of claim 30, wherein the first and second results
comprise estimates of first and second probability
distributions.
32. The method of claim 30, further comprising determining
information about an event based on the scaling factor.
33. The method of claim 32, wherein the information about the event
comprises an estimate of a probability of occurrence of the event.
Description
REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of provisional U.S.
Patent Application No. 60/486,970, filed Jul. 14, 2003.
BACKGROUND OF THE INVENTION
[0003] 1. Field of the Invention
[0004] The present invention relates to estimating the
probabilities of events and, more particularly, to a dual adaptive
importance-sampling system and method.
[0005] 2. Background of the Related Art
[0006] In the field of communications engineering, the performance
of forward error correction (FEC) codes in digital communication
systems is almost exclusively studied and compared on the basis of
performance curves--plots of bit-error rate (BER) and/or word-error
rate (WER) versus the information signal power to noise power ratio
(SNR).
[0007] The exact analytical evaluation of FEC code performance
curves under a specific decoding scheme is extremely difficult to
perform and computationally intractable, even for moderately long
codes, say with more than 30 to 40 information bits per
codeword.
[0008] Approximate performance of FEC codes under certain decoding
schemes, in the form of lower and upper bounds, may be obtained
analytically, provided knowledge of the weight enumerator function
(WEF) of the code is available; partial knowledge of the WEF in the
form of the first non-trivial term may also suffice. In general,
these bounds serve as very good approximations to the actual code
performance at high SNR values. The quality of the approximations,
however, is not very good at relatively moderate-to-low values of
SNR. To make matters worse, unless the code belongs to one of the
few classes of codes whose WEF is already known, such as the RS and
BCH codes that still abound in practical FEC schemes, the
computation of the code WEF or even the first non-trivial term is
itself intractable, in a practical sense, for codes of practical
size.
[0009] Thus, in general, for a large set of codes with parameters
in the realm of current practical applicability (hundreds and
thousands of bits long), it is impossible for FEC code researchers
and developers to analytically obtain performance curves, either
exact or in the form of bounds. Consequently, for some time now, the
preferred methodology employed by communications engineers and
researchers to study code performance has been, and continues to be,
based on performing Monte Carlo computer simulations or physical
system experiments. For the simpler and less powerful codes,
simulation software packages, such as MathWorks' MATLAB or Elanix's
SystemView, provide communications-oriented simulation tools (e.g.,
MATLAB's Communications Toolbox and Simulink) that perform straight
Monte Carlo simulation of BERs.
[0010] Monte Carlo simulations and physical system experiments,
which are collectively referred to herein as "standard Monte Carlo
techniques", have been quite effective in obtaining performance
curves down to BER values as low as approximately 10^-7. These
BERs are consistent with performance requirements for such
communication systems as ADSL and wireless phone systems.
Attempting to obtain performance curves for lower values of BER,
however, continues to be a challenge even with current computing
capabilities because of the inordinate amounts of time required--on
the order of months or years--to obtain meaningful results at such
low BER values.
[0011] Recent technological advancements have drastically altered
the scenario in the communications industry. The industry has
witnessed an explosion in communications speeds and capacity of
storage media, in addition to improvements in the overall quality
of communications. This has led service providers to impose more
stringent requirements on the error rate of communication systems,
e.g., current optical fiber communication (OFC) systems operate at
design BERs in the range of 10^-12 to 10^-15.
Unfortunately, standard Monte Carlo techniques prove to be
inadequate for evaluating FEC codes and decoders at such low BER
values.
[0012] In an attempt to extend the range of standard Monte Carlo
techniques and provide some assistance in the evaluation of higher
performing systems, researchers have called upon the services of
supercomputers or system experiments. Both of these approaches are
characterized by unduly long performance time requirements and,
even then, the best reliable BERs that have been achieved are on
the order of 10^-10 or 10^-11 and are acquired over time
spans measured in months. This is very expensive to industry and
researchers in terms of person hours and computer time.
[0013] The evaluation of BERs for an actual physical system, e.g.,
an exploratory development prototype, invariably involves the use
of a bit-error-rate tester (BERT). BERTs designed to operate at the
extremely high speeds of 40-48 giga-bits per second (Gbps) have
been recently introduced in the market. They are capable of
operating at "full line rate", i.e., these BERTs can count each and
every transmitted (and received) bit at speeds of up to 40-48 Gbps.
With such devices, one can easily transmit and receive about
10^15 bits in a period of about 2.5×10^4 seconds
(i.e., <7 hours), thus enabling the evaluation of BERs reliably
at least down to 10^-13 (100/total number of bits) within a
relatively short period of time. However, such BERTs are extremely
expensive, costing on the order of $500,000.
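The BERT throughput arithmetic quoted above can be checked directly. The sketch below uses only figures from the text (a 40 Gbps line rate, 2.5×10^4 seconds, and the stated rule of thumb that roughly 100 observed errors are needed for a reliable BER estimate):

```python
# Arithmetic behind the full-line-rate BERT example in the text above.
line_rate_bps = 40e9                      # 40 Gbps line rate
duration_s = 2.5e4                        # about 7 hours of transmission
total_bits = line_rate_bps * duration_s   # = 1e15 bits transmitted/received
# Reliability rule of thumb from the text: the lowest reliably measurable
# BER is roughly 100 / (total number of bits).
ber_floor = 100 / total_bits              # = 1e-13
print(f"bits={total_bits:.0e}, BER floor={ber_floor:.0e}")
```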
[0014] In coded communication systems with long codes, the range of
word-error rates (WERs) that can be reliably evaluated is less than
the uncoded system by a factor equal to the length of the code. For
example, consider a code of length n = 1.1×10^3, with
k = 10^3 information bits and a rate of r = 0.909, employed on a
communication system operating at an information transfer rate of
40 Gbps (i.e., a line rate of 40/r = 44 Gbps). Transmission and
reception of 10^15 information bits would require about
2.5×10^4 seconds. However, 10^15 information bits
correspond to only 10^12 codewords. Thus, a reliable estimate
of WER only down to 10^-10 is possible. Further, if one assumes
that every word that is not correctly decoded results in about 10
information bits in error (the average number of information bits
in error per codeword in error is dependent on the minimum distance
of the code and other code properties), one realizes that the
transmission of 10^15 information bits is capable of providing
us with reliable BERs only down to 10^-12, instead of the
10^-13 that was possible in the uncoded system. The situation
is further worsened as the code length increases. In addition to
the above problem, the use of high-speed BERTs in conjunction with
system experiments for evaluating the performance of coded
communication systems is severely limited by a host of other
constraints.
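The coded-system penalty described in this paragraph is easy to reproduce numerically. The sketch below reruns the n = 1100, k = 1000 example with the same 100-error reliability rule and the paragraph's assumption of about 10 information-bit errors per word error:

```python
# Reliability floors for the coded-system example above.
n, k = 1100, 1000            # codeword length and information bits per word
info_bits = 1e15             # information bits sent in ~2.5e4 s at 40 Gbps
codewords = info_bits / k    # only 1e12 codewords in the same time
wer_floor = 100 / codewords  # ~100 word errors needed => WER down to 1e-10
errors_per_bad_word = 10     # assumption stated in the text
# 100 word errors carry ~1000 information-bit errors, so the BER floor is:
ber_floor = 100 * errors_per_bad_word / info_bits   # 1e-12, vs 1e-13 uncoded
```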
[0015] First, evaluation of a code at such high speeds requires
that encoder/decoder hardware implementations capable of operating
at the above-mentioned speeds be available. The fastest
encoder/decoders available today operate around 1 Gbps.
Consequently, practical coded communications systems that operate
at speeds in the range of 10-40 Gbps require the
multiplexing/demultiplexing of FEC encoder/decoders at the
transmitter and receiver, thus increasing the complexity of the
system. Further, fabricating encoder/decoders for any code that
needs to be evaluated is generally not economically feasible
because of the cost of manufacturing high-speed VLSI devices and
the almost universal approach of software implementation for
initial design and evaluation. To complicate matters even more, the
new class of decoders uses soft information to perform soft-decision
decoding. This information is obtained by
multi-threshold-decisioning on the received signal, and is
difficult to extract--commercially impractical--at 10-40 Gbps
operating speeds. This virtually eliminates the possibility of
evaluating soft-decision decoding performance of FEC codes using
high-speed BERTs.
[0016] In addition, the high-speed BERTs are themselves quite
expensive, e.g., the cost of the Ixia 2.4 Gbps BERT is greater than
$10,000.00. BERTs capable of operating at higher speeds are even
more expensive (e.g., about $500,000 for the 40-48 Gbps model). The
costs of the BERT, of fabricating the encoder/decoders, of the
auxiliary equipment, and of setting up the test-bench to evaluate
the code become prohibitive, making it practically impossible for
researchers and small companies to evaluate their codes to the
desired low values of BER using a physical system setup employing
BERTs.
[0017] A variety of simulation techniques have been developed to
evaluate FEC codes to progressively smaller BER values. One of the
most promising areas in this context has been the study of
importance sampling (IS) techniques. Importance sampling is based
on increasing the frequency of occurrence of very low (important)
error events defined by the channel noise probability density
function (pdf) by generating these important events more frequently
from a related biased pdf. However, for importance sampling to be
effective for evaluating FEC codes and decoders, one must choose a
biased pdf using some knowledge of which noise realizations are
most likely to generate errors at each desired low BER.
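To make the idea concrete, here is a minimal, generic importance-sampling sketch (not the dual adaptive scheme of the invention): the rare event is a Gaussian tail, the biased pdf is the same Gaussian shifted into the "important" region, and each biased sample is weighted by the likelihood ratio of the true pdf to the biased pdf:

```python
import math, random

random.seed(1)

t = 4.0                                   # rare event: Z > t for Z ~ N(0, 1)

def normal_pdf(x, mu=0.0):
    """Unit-variance normal density centered at mu."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

N = 100_000
acc = 0.0
for _ in range(N):
    x = random.gauss(t, 1.0)              # draw from biased pdf centered on t
    if x > t:                             # indicator of the rare event
        acc += normal_pdf(x) / normal_pdf(x, t)   # likelihood-ratio weight
estimate = acc / N

exact = 0.5 * math.erfc(t / math.sqrt(2.0))       # true tail probability
# estimate lands close to exact (~3.2e-5) with only 1e5 samples; a plain
# Monte Carlo run of the same size would typically observe zero events.
```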
[0018] The task of determining an appropriate biased pdf is FEC
code specific. Most of the proposed techniques for evaluating FEC
codes via importance sampling rely on using some code-property to
obtain a rough estimate of the region in noise space that is a
major contribution to the BER. For example, some techniques exploit
the knowledge of the minimum-weight codewords to determine a
suitable biased pdf. However, as discussed above, this knowledge is
practically unobtainable for codes in general.
[0019] One technique, described in T. Yamane and Y. Katayama, "Bit
Error Rate Analysis on Iterative Two-Stage Decoding of
Two-Dimensional Codes by Importance Sampling", Proc. 2003 IEEE
Int'l. Conf. Commun., 11-15 May 2003, Anchorage, AK, USA, uses
knowledge of the error-correcting capability of codes (a code
parameter that may not be known in general) in order to obtain a
suitable biased pdf for IS. This technique is restricted to
hard-decision decoding, and is particularly suited only for
obtaining performance curves for serially concatenated coding
schemes (product codes), where it is possible to easily
characterize the error correcting capability of the constituent
codes.
[0020] Another technique, described in B. Xia and W. E. Ryan, "On
importance sampling for linear block codes", Proc. 2003 IEEE Int'l.
Conf. Commun., 11-15 May 2003, Anchorage, AK, USA, uses the regular
structure of the bipartite graphs of certain codes to determine the
biased pdf. For every codeword that is received, the BER evaluation
using this IS technique is performed for only a single bit-position
in the codeword. Standard Monte Carlo techniques, on the other
hand, evaluate the BER for all n bit-positions for every received
codeword of length n.
[0021] Hence, for the transmission and decoding of the same number
of codewords (which can be assumed to take the same time for both
IS as well as standard Monte Carlo techniques), the standard Monte
Carlo technique evaluates n times as many bits as this IS
technique. This is referred to as the "divide-by-n" problem. This
divide-by-n problem is likely to affect the efficiency of this IS
technique relative to the standard Monte Carlo techniques for codes
where n becomes large. In addition, this IS technique is predicated
on the regular structure in the bipartite graph of codes, which
justifies the analysis of a single bit-position in the code to
estimate the BER of the entire code. A regular bipartite graph
structure is obtained from regular low-density parity-check (LDPC)
codes, but is usually not consistent with irregular LDPC codes or
codes in general.
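The divide-by-n penalty can be quantified with toy numbers (invented here purely for illustration): if both methods decode the same number of codewords in the same time, the single-bit IS technique examines n times fewer bit decisions:

```python
# Toy illustration of the divide-by-n effect described above.
n = 1000                           # codeword length (invented example value)
codewords = 10**6                  # same number decoded by both methods
bits_checked_mc = codewords * n    # standard MC evaluates all n bit positions
bits_checked_is = codewords        # this IS scheme evaluates one bit per word
penalty = bits_checked_mc / bits_checked_is   # factor of n
```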
[0022] Accordingly, the IS techniques described above are
inefficient and/or impractical for very-low-BER-requirement
evaluation of arbitrarily chosen codes. Thus, there is clearly a
technological need for a more time-efficient simulation tool for
evaluating the BERs/WERs of coded communication and storage
systems, as well as a tool for evaluating the probabilities of
other rare events.
SUMMARY OF THE INVENTION
[0023] An object of the invention is to solve at least the above
problems and/or disadvantages and to provide at least the
advantages described hereinafter.
[0024] Therefore, an object of the present invention is to provide
a system and method for evaluating very rare events.
[0025] Another object of the present invention is to provide a
system and method for estimating the probabilities of very rare
events.
[0026] Another object of the present invention is to provide a
system and method for estimating the probabilities of very rare
events in coded communications systems.
[0027] Another object of the present invention is to provide a
system and method for evaluating very low BERs/WERs of FEC
codes.
[0028] Another object of the present invention is to provide a
multicanonical Monte Carlo simulation system and method for
evaluating very low BERs/WERs of FEC codes.
[0029] Another object of the present invention is to provide a dual
adaptive importance sampling system and method for evaluating very
low BERs/WERs of FEC codes.
[0030] Another object of the present invention is to provide a
system and method for combining the results of constrained and
unconstrained importance-sampling simulations using a scaling
technique.
[0031] To achieve at least the above objects, in whole or in part,
there is provided a method of estimating the probability of
occurrence of an event (E), including performing an adaptive
unconstrained estimation of an optimal biased distribution for a
multi-dimensional random variable (z) defined on a sample space
(Γ), performing an importance-sampling (IS) simulation using
the optimal biased distribution for z to yield a first result,
performing an adaptive constrained estimation of an optimal biased
distribution for z over regions of Γ where E occurs,
performing an IS simulation using the optimal biased distribution
for z, over regions of Γ where E occurs, to yield a second
result, and estimating the probability of the occurrence of E based
on the first and second results.
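The final step of this method, combining the two results, is elsewhere described as a scaling fit (claims 9 and 30): the unconstrained run estimates p(V, E) reliably only over a window of V, the constrained run estimates the shape p(V|E), and fitting one to the other yields a scaling factor SF that estimates P(E). The sketch below shows only the mechanics; all numbers are invented for illustration:

```python
import math

# Hypothetical data: a constrained-run shape estimate p(V|E) over 10 bins,
# and an unconstrained-run estimate of p(V, E) = P(E) * p(V|E).
true_PE = 1e-9                                   # invented event probability
shape = [math.exp(-0.8 * i) for i in range(10)]
total = sum(shape)
p_V_given_E = [s / total for s in shape]         # constrained-run estimate
p_V_and_E = [true_PE * p for p in p_V_given_E]   # unconstrained-run estimate

# Fit over the bins where both estimates are deemed reliable (bins 2..5):
reliable = range(2, 6)
SF = (sum(p_V_and_E[i] for i in reliable)
      / sum(p_V_given_E[i] for i in reliable))   # scaling factor ~ P(E)

# A refined estimate integrates SF * p(V|E) over all constrained bins:
PE_refined = SF * sum(p_V_given_E)
```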
[0032] To achieve at least the above objects, in whole or in part,
there is also provided a method of estimating the probability of
occurrence of an event (E), given a known probability distribution
(ρ(z)) of a multi-dimensional random variable (z) defined on a
sample space (Γ), including determining a scalar mapping (V)
from the multi-dimensional sample space (Γ) to a
single-dimensional space, defining bins that partition a first
range of values of V (Γ_V^u), such that values of V
not in the range of values Γ_V^u have a negligible
probability of occurrence, performing adaptive unconstrained
Metropolis simulations to iteratively estimate an optimal biased
distribution for z, performing an IS simulation using the optimal
biased distribution for z and an unconstrained Metropolis random
walk to yield a first result, defining bins that partition a second
range of values of V (Γ_V^c), such that values of V
not in the range of values Γ_V^c have a negligible
contribution to the probability of occurrence of E, performing
adaptive constrained Metropolis simulations to iteratively estimate
an optimal biased distribution for z over regions of Γ where
E occurs, performing an IS simulation using the optimal biased
distribution for z over regions of Γ where E occurs and a
constrained Metropolis random walk to yield a second result, and
estimating the probability of occurrence of E based on the first
and second results.
[0033] To achieve at least the above objects, in whole or in part,
there is also provided a program storage device readable by a
machine, tangibly embodying a program of instructions executable by
the machine to perform method steps for estimating the probability
of occurrence of an event (E), the method steps comprising
performing an adaptive unconstrained estimation of an optimal
biased distribution for a multi-dimensional random variable (z)
defined on a sample space (.GAMMA.), performing an
importance-sampling (IS) simulation using the optimal biased
distribution for z to yield a first result, performing an adaptive
constrained estimation of an optimal biased distribution for z over
regions of .GAMMA. where E occurs, performing an IS simulation
using the optimal biased distribution for z, over regions of
.GAMMA. where E occurs, to yield a second result, and estimating
the probability of the occurrence of E based on the first and
second results.
[0034] To achieve at least the above objects, in whole or in part,
there is also provided a program storage device readable by a
machine, tangibly embodying a program of instructions executable by
the machine to perform method steps for estimating the probability
of occurrence of an event (E), given a known probability
distribution (.rho.(z)) of a multi-dimensional random variable (z)
defined on a sample space (.GAMMA.), the method steps comprising
determining a scalar mapping (V) from the multi-dimensional sample
space (.GAMMA.) to a single-dimensional space, defining bins that
partition a first range of values of V (.GAMMA..sub.V.sup.u), such
that values of V not in the range of values .GAMMA..sub.V.sup.u
have a negligible probability of occurrence, performing adaptive
unconstrained Metropolis simulations to iteratively estimate an
optimal biased distribution for z, performing an IS simulation
using the optimal biased distribution for z and an unconstrained
Metropolis random walk to yield a first result, defining bins that
partition a second range of values of V (.GAMMA..sub.V.sup.c), such
that values of V not in the range of values .GAMMA..sub.V.sup.c
have a negligible contribution to the probability of occurrence of
E, performing adaptive constrained Metropolis simulations to
iteratively estimate an optimal biased distribution for z over
regions of .GAMMA. where E occurs, performing an IS simulation
using the optimal biased distribution for z over regions of .GAMMA.
where E occurs and a constrained Metropolis random walk to yield a
second result, and estimating the probability of occurrence of E
based on the first and second results.
[0035] To achieve at least the above objects, in whole or in part,
there is also provided a system, comprising a processor programmed
with computer readable program code for performing an adaptive
unconstrained estimation of an optimal biased distribution for a
multi-dimensional random variable (z) defined on a sample space
(.GAMMA.), performing an importance-sampling (IS) simulation using
the optimal biased distribution for z to yield a first result,
performing an adaptive constrained estimation of an optimal biased
distribution for z over regions of .GAMMA. where E occurs,
performing an IS simulation using the optimal biased distribution
for z, over regions of .GAMMA. where E occurs, to yield a second
result, and estimating the probability of the occurrence of E based
on the first and second results; and a user interface in
communication with the processor.
[0036] To achieve at least the above objects, in whole or in part,
there is also provided a method of combining a first result from an
unconstrained importance-sampling (IS) simulation with a second
result from a constrained IS simulation, comprising receiving the
first result, receiving the second result, and scaling the second
result to fit the first result to yield a scaling factor.
[0037] Additional advantages, objects, and features of the
invention will be set forth in part in the description which
follows and in part will become apparent to those having ordinary
skill in the art upon examination of the following or may be
learned from practice of the invention. The objects and advantages
of the invention may be realized and attained as particularly
pointed out in the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0038] The invention will be described in detail with reference to
the following drawings in which like reference numerals refer to
like elements wherein:
[0039] FIG. 1 is a flowchart showing the main steps of a dual
adaptive importance sampling (DAIS) method of estimating the
probability of the occurrence of an event, in accordance with one
embodiment of the present invention;
[0040] FIG. 2 is a flowchart showing a more detailed DAIS method of
estimating the probability of the occurrence of an event, in
accordance with a preferred embodiment of the present
invention;
[0041] FIG. 3 is a flowchart of steps in one preferred method for
performing adaptive unconstrained and constrained Metropolis
simulations, in accordance with the present invention;
[0042] FIG. 4a is a plot showing an example of combining unconstrained and constrained (before scaling) simulation results to obtain the scaling factor P(E) and the estimate p(V|E) P(E) of p(V, E) (constrained and after scaling), in accordance with the present invention;
[0043] FIG. 4b is a histogram G.sub.k, k=1, . . . , M, of
accumulated values of V belonging to .GAMMA..sub.V.sup.u.andgate.E
generated during the unconstrained simulation for the example of
FIG. 4a, in accordance with the present invention;
[0044] FIG. 5a is a plot showing a pdf of a scalar control quantity
V, in accordance with the present invention;
[0045] FIG. 5b is a plot of BER and WER estimates obtained using
the methods of the present invention; and
[0046] FIG. 6 is a block diagram of a system for estimating the
probability of occurrence of events, in accordance with one
embodiment of the invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
[0047] The system and method of the present invention can
automatically determine the biased pdf by using an iterative
procedure that requires relatively little a priori knowledge of how
to bias. Hence, the present invention is particularly suited for
evaluating the BERs and/or WERs of coded communication and storage
systems, and is generally applicable to arbitrarily chosen codes.
When applied to coded communication and storage systems, the system
and method of the present invention provides a versatile technique
for the fast and accurate estimation of BERs and WERs of FEC codes
down to values of 10.sup.-20 or lower. However, it should be
appreciated that the system and method of the present invention can
also be applied to the evaluation of other types of rare
events.
[0048] The system and method of the present invention utilizes a
technique hereinafter referred to as a "dual adaptive
importance-sampling" (DAIS). The DAIS technique is based on the
multicanonical Monte Carlo (MMC) technique, and is used to
iteratively obtain approximate estimates of the optimal biased
distribution on the underlying probability sample space (henceforth
referred to as "sample space") for a coded communications system in
order to accurately compute, via dual complementary IS Metropolis
random-walk simulations, the probability of occurrence of coded
communication system events, such as received or decoder error,
down to extremely low values.
[0049] FIG. 1 is a flowchart showing the main steps of a DAIS
method of estimating the probability of the occurrence of an event,
in accordance with one embodiment of the present invention. The
method starts at step 100, where an adaptive unconstrained
estimation of the optimal biased distribution of a
multi-dimensional random variable (z) defined on the underlying
sample space for the problem (.GAMMA.) is performed. The estimation
is "unconstrained", in that it is not constrained to specific
regions of the sample space. Then, at step 110, an IS simulation is
performed using the optimal biased distribution for z to yield a
first result.
[0050] At step 120, an adaptive constrained estimation of the
optimal biased distribution for z is performed. This estimation is
"constrained", in that it is constrained to regions of sample space
where the rare event (E) is guaranteed to occur. Then, at step 130,
an IS simulation is performed using the optimal biased distribution
for z, given the occurrence of E, to yield a second result. At step
140, the first and second results are used to estimate the
probability of the occurrence of E.
[0051] Although in the method of FIG. 1, the unconstrained
estimation 100 and simulation 110 are shown as being performed
before the constrained estimation 120 and simulation 130, it should
be appreciated that the constrained estimation 120 and simulation
130 could be performed before unconstrained estimation 100 and
simulation 110, while still falling within the scope of the present
invention.
[0052] FIG. 2 is a flowchart showing a more detailed DAIS method of
estimating the probability of the occurrence of an event, in
accordance with a preferred embodiment of the present invention.
The method starts at step 200, where the probability distribution (.rho.(z)) of a multi-dimensional random variable (z) defined on a sample space (.GAMMA.), and the event E whose probability is to be estimated, are specified.
[0053] As discussed above, z represents the multi-dimensional random variable defined on the underlying sample space for the problem, with known probability distribution (pdf) .rho.(z). The occurrence of E is a function of the value of z and, perhaps, other factors. However, the specific regions in sample space where E occurs are unknown a priori.
[0054] At step 210, a scalar mapping V=f(z) from the
multi-dimensional sample space to a single-dimensional space (such
as the real line or its subset) is chosen that is problem-specific
and may often represent some form of distance measure on the sample
space (such as the l.sub.2-norm) or modifications of such measures.
A desirable mapping represents an ordering of easy-to-define
regions in the sample space based on the relative probability of
occurrence of event E.
[0055] At step 215, .GAMMA..sub.V.sup.u is set as the
problem-specific significant range of V for the unconstrained
simulation, such that values of V not in the range of values
.GAMMA..sub.V.sup.u have a negligible probability of occurrence. At
step 215, the region .GAMMA..sub.V.sup.u is also partitioned into a
large number (M) of equal-width bins.
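The equal-width binning of step 215 can be sketched in code as follows. This is an illustrative sketch only; the range [0, 1.5] and M=300 are hypothetical placeholders for the problem-specific values, and the helper names are not part of the disclosure.

```python
# Sketch of step 215: partition the significant range of V into M
# equal-width bins. The range and M below are illustrative assumptions.
def make_bins(v_min, v_max, M):
    """Return the M+1 edges of the equal-width bins on [v_min, v_max]."""
    dv = (v_max - v_min) / M
    return [v_min + k * dv for k in range(M + 1)]

def bin_index(v, v_min, v_max, M):
    """0-based index of the bin containing v, clamped to the end bins."""
    dv = (v_max - v_min) / M
    return max(0, min(M - 1, int((v - v_min) // dv)))

edges = make_bins(0.0, 1.5, 300)   # e.g. Gamma_V^u = [0, 1.5], M = 300
```

The same helper would serve the constrained range .GAMMA..sub.V.sup.c of step 235, with its own (trial-and-error) endpoints.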
[0056] At step 220, adaptive unconstrained Metropolis simulations
are performed to iteratively estimate the optimal biased
distribution for z, i.e., .rho..sup.j(z).fwdarw..rho.*(z). In this
step, the Metropolis random-walk at each iteration is not
constrained to specific regions of sample space (and equivalently
regions of .GAMMA..sub.V.sup.u). Then, at step 230, .rho.*(z) is
used to perform an IS simulation using the unconstrained Metropolis
random-walk to obtain estimates of the distributions p(V) and p(V,
E) from histograms H.sub.k and G.sub.k, k=1, . . . , M, generated
from accumulated values of V belonging to .GAMMA..sub.V.sup.u and
.GAMMA..sub.V.sup.u.andgate.E, respectively.
[0057] At step 235, .GAMMA..sub.V.sup.c is set as the
problem-specific significant range of V for the constrained
simulation, such that values of V not in the range of values
.GAMMA..sub.V.sup.c have a negligible contribution to the
probability of occurrence of E. Since P(E) is unknown a priori,
.GAMMA..sub.V.sup.c has to be set in an ad-hoc fashion, and may
have to be modified by trial and error. At step 235, the region
.GAMMA..sub.V.sup.c is also partitioned into a large number (M) of
equal-width bins. Then, at step 240, adaptive constrained
simulations are performed to iteratively estimate the optimal
biased distribution for z given E, i.e.,
.rho..sup.j(z|E).fwdarw..rho.*(z|E). In this step, the Metropolis
random-walk at each iteration is constrained to regions of sample
space (and equivalently regions of .GAMMA..sub.V.sup.c) where event
E is guaranteed to occur. Then, at step 250, .rho.*(z|E) is used to
perform an IS simulation using the constrained Metropolis
random-walk to obtain an estimate of the distribution p(V|E) from
the histogram H.sub.k, k=1, . . . , M, generated from accumulated
constrained values of V belonging to .GAMMA..sub.V.sup.c.
[0058] The results of the constrained and unconstrained IS
Metropolis simulations are then combined at step 260 by scaling the
estimate of p(V|E), obtained from the constrained simulation (step
250), to fit the estimate of p(V, E), obtained from the
unconstrained simulation (step 230), over the range of values of V
where both estimates are sufficiently reliable. This range is
problem-specific, but is typically defined by those bins of
.GAMMA..sub.V.sup.u that have a large fraction of accumulated
values of V belonging to the event E during the unconstrained
simulation.
[0059] The scaling factor then provides the initial estimate of
P(E) via p(V, E)=p(V|E)P(E) over the range of values of V where
both estimates are sufficiently reliable. Finally, at step 270, the
product p(V|E)P(E) is used as a more reliable estimate of p(V, E)
for all V, and one integrates p(V|E)P(E) over .GAMMA..sub.V.sup.c
to obtain the better estimate of P(E).
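The combination in steps 260-270 can be sketched numerically as follows. The least-squares fit used for the scaling factor is one plausible realization of "scaling to fit over the reliable bins" and is an assumption of this sketch, as are the function names.

```python
# Sketch of steps 260-270: scale p(V|E) from the constrained simulation to
# fit p(V, E) from the unconstrained one over the reliable bins, then
# integrate the scaled density. The fitting rule is an assumed choice.
def estimate_event_probability(p_joint, p_cond, reliable, dv):
    """p_joint[k] ~ p(V, E); p_cond[k] ~ p(V|E); 'reliable' lists bins where
    both estimates are trustworthy; dv is the bin width. Returns the scaling
    factor (initial P(E)) and the refined P(E) from integrating P(E) p(V|E)."""
    # Least-squares scaling factor s minimizing sum (p_joint - s * p_cond)^2
    num = sum(p_joint[k] * p_cond[k] for k in reliable)
    den = sum(p_cond[k] ** 2 for k in reliable)
    scale = num / den                    # initial estimate of P(E)
    refined = scale * sum(p_cond) * dv   # integrate P(E) p(V|E) over all bins
    return scale, refined
```

Integrating the scaled conditional density, rather than the raw unconstrained p(V, E), is what recovers the contribution of bins that the unconstrained simulation sampled poorly.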
[0060] As shown in FIGS. 1 and 2, and as described above, the DAIS
method of the present invention performs the adaptive Metropolis
simulations twice, under unconstrained and constrained assumptions
on the generated simulation values for z, then combines both
results to estimate the probability of the desired event.
[0061] FIG. 3 is a flowchart of steps in one preferred method for
performing steps 220 and 240 of FIG. 2 (the adaptive unconstrained
and constrained Metropolis simulations, respectively). The method
starts at step 300, where an initial distribution for V over all
the bins of .GAMMA..sub.V.sup.u or .GAMMA..sub.V.sup.c, for
unconstrained or constrained simulations, respectively, is set. A
natural choice is the uniform distribution, i.e.,
P.sub.k.sup.1=M.sup.-1, k=1, . . . , M, where the subscript
represents the bin number and the superscript represents the
iteration number.
[0062] Then, at step 310, a Metropolis algorithm is used to
generate a random-walk in the sample space .GAMMA. according to the
distribution .rho..sup.j(z).varies..rho.(z)/P.sub.k.sup.j for all z
such that f(z) falls in the k.sup.th bin of .GAMMA..sub.V.sup.u or
.GAMMA..sub.V.sup.c. The method then proceeds to step 320, where a
histogram is generated of V=f(z), i.e., H.sub.k.sup.j, k=1, . . . ,
M, over all the bins of .GAMMA..sub.V.sup.u or .GAMMA..sub.V.sup.c
based on the values of z generated by the Metropolis random-walk.
At step 330, a new estimate for the distribution of V is obtained as P.sub.k.sup.j+1=.phi.(P.sub.k.sup.j, P.sub.k-1.sup.j, P.sub.k-1.sup.j-1, H.sub.k.sup.l, H.sub.k-1.sup.l; l=0, 1, . . . , j), where .phi. denotes a suitably defined function.
[0063] At step 340, it is determined whether the histogram of V at the j.sup.th iteration is approximately uniform, i.e., H.sub.k.sup.j.apprxeq.H.sub.m.sup.j, .A-inverted. k,m.epsilon.{1, . . . , M}, or equivalently, whether P.sub.k.sup.j+1.apprxeq.P.sub.k.sup.j, .A-inverted. k.epsilon.{1, . . . , M}. If the histogram is approximately uniform, the method proceeds to step 350. Otherwise, the method jumps to step 370.
[0064] At step 350, P.sub.k.sup..infin.=P.sub.k.sup.j+1,
.A-inverted. k.epsilon.{1, . . . , M} and
.rho.*(z)=.rho.(z)/(cP.sub.k.sup..infin.) is set as the optimal
biased distribution of z for all z such that f(z) falls in the
k.sup.th bin of .GAMMA..sub.V.sup.u or .GAMMA..sub.V.sup.c, where c
is chosen such that .intg..rho.*(z)dz=1. Specific examples of the
update rule in step 330 are:

(a) $P_k^{j+1} = P_{k-1}^{j+1}\,\frac{P_k^j}{P_{k-1}^j}\left(\frac{H_k^j}{H_{k-1}^j}\right)^{\hat{g}_{k-1}^j}$, where $\hat{g}_{k-1}^j = \frac{g_{k-1}^j}{\sum_{l=1}^{j} g_{k-1}^l}$ and $g_{k-1}^l = \frac{H_k^l\,H_{k-1}^l}{H_k^l + H_{k-1}^l}$; and

(b) $P_k^{j+1} \propto P_k^j\,H_k^j$.

The method then ends at step 360.
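The recursion in paragraph [0064] can be transcribed into code roughly as follows. This is a sketch under stated assumptions: the first bin anchors the recursion, guards for empty histogram bins are added, and the result is renormalized to sum to one.

```python
# Sketch of update rule (a) of paragraph [0064]: build the new weights
# P_k^{j+1} from the previous weights and all histograms accumulated so far.
def mmc_update(P, H_hist):
    """P: current weights P_k^j. H_hist: histograms H_k^l for l = 1..j
    (H_hist[-1] is the latest). Returns normalized weights P_k^{j+1}."""
    j = len(H_hist) - 1
    M = len(P)
    Pn = [0.0] * M
    Pn[0] = P[0]                           # assumed anchor for the recursion
    for k in range(1, M):
        # g_{k-1}^l = H_k^l H_{k-1}^l / (H_k^l + H_{k-1}^l); 0 if both empty
        gs = []
        for H in H_hist:
            s = H[k] + H[k - 1]
            gs.append(H[k] * H[k - 1] / s if s > 0 else 0.0)
        gtot = sum(gs)
        ghat = gs[j] / gtot if gtot > 0 else 0.0       # \hat g_{k-1}^j
        ratio = H_hist[j][k] / H_hist[j][k - 1] if H_hist[j][k - 1] > 0 else 1.0
        Pn[k] = Pn[k - 1] * (P[k] / P[k - 1]) * ratio ** ghat
    total = sum(Pn)
    return [p / total for p in Pn]         # renormalize: sum_k P_k^{j+1} = 1
```

The exponent ĝ damps the update in bins whose histogram statistics are still poor, which is what allows the recursion to reuse the data of all earlier iterations.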
[0065] At step 370, the iteration counter j is incremented, and
then steps 300-340 are repeated.
[0066] FIG. 4a is a plot showing an example of combining the
results of unconstrained and constrained (before scaling)
simulation results to obtain the scaling factor P(E) and the
estimate p(V|E) P(E) of p(V, E) (constrained and after scaling).
This data corresponds to the estimation of word error rate (WER)
for a LDPC error correcting code and sum-product algorithm
decoding. FIG. 4b displays the histogram G.sub.k, k=1, . . . , M,
of accumulated values of V belonging to
.GAMMA..sub.V.sup.u.andgate.E generated during the unconstrained
simulation for this example.
Estimating BER of a Specific LDPC Error Correcting Code
[0067] As an illustrative example of one application of the present
invention, the performance of a regular (96, 50) LDPC code with a
code rate of r=50/96 in an additive white Gaussian noise (AWGN)
channel with binary phase-shift-keying (BPSK) was studied using the
DAIS method of the present invention. The parity-check matrix (H)
of this code can be found in D. J. C. MacKay, "Encyclopedia of
Sparse Graph Codes", which is incorporated by reference herein in
its entirety
(http://www.inference.phy.cam.ac.uk/mackay/codes/EN/C/96.3.963).
[0068] This code was chosen because: (1) it is the same code
studied in B. Xia and W. E. Ryan, "On importance sampling for
linear block codes," Proc. IEEE International Conference on
Communications 2003 (ICC '03), pp. 2904-2908, Anchorage, Ak., 2003,
which is incorporated by reference herein in its entirety; and (2)
it is possible to exactly compute the first two non-trivial
coefficients of the code's weight enumerator function (3 and 24
codewords at Hamming weights 6 and 8, respectively), which allows
one to compare the simulation results with the code's union-bound
performance. Note that the MacKay and Xia, et al., references noted
above refer to this code as a (96, 48) code, but 2 of the 48 rows
of H are linearly dependent.
[0069] Sum-product decoding (SPD) was implemented employing the
log-likelihood modification described in H. Futaki and T.
Ohtsuki, "Performance of low-density parity-check (LDPC) coded OFDM
systems," Proc. ICC '02, vol. 3, pp. 1696-1700, which is
incorporated by reference herein in its entirety, and symmetric
signal levels of +1 and -1 for logical 1s and 0s, respectively. The
pdf .rho..sub.l of the noise in the lth bit at the receiver is zero
mean Gaussian with variance .sigma..sup.2=1/(2r E.sub.b/N.sub.0).
It suffices to transmit the all-zeros codeword, since the code is
linear and the noise is symmetric.
[0070] Let .GAMMA. be the n-dimensional probability space of the noise in the n bits of a codeword. The noise vector z=(z.sub.1, . . . , z.sub.n) is multivariate Gaussian with joint pdf

$$\rho(z) = \prod_{l=1}^{n} \rho_l(z_l).$$

The MMC algorithm is controlled by a scalar control quantity V, defined here as

$$V(z) = \left\{\frac{1}{n}\sum_{l=1}^{n}\bigl[H(q_l z_l)\,z_l\bigr]^2\right\}^{1/2},$$

where q.sub.l=(-1).sup.b.sup.l, with b.sub.l being the transmitted bit in the lth position, and H(x)=1 if x>0 and H(x)=0 otherwise. V(z) is constructed so that a noise component z.sub.l contributes to V only if it may produce a bit-error at the input to the decoder. A received word with a noise realization z is characterized as generating an error if the LDPC decoder cannot decode it to the transmitted codeword within 50 iterations.
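The noise model and control quantity of paragraphs [0069]-[0070] can be sketched as follows; `noise_sigma` and `control_V` are hypothetical helper names, not identifiers from the disclosure.

```python
import math

# Sketch of paragraphs [0069]-[0070]: per-bit noise std for the AWGN/BPSK
# model, sigma^2 = 1/(2 r Eb/N0), and the control quantity
# V(z) = { (1/n) sum_l [ H(q_l z_l) z_l ]^2 }^(1/2), with q_l = (-1)^b_l.
def noise_sigma(rate, ebn0_db):
    """Noise standard deviation for code rate r and Eb/N0 given in dB."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return math.sqrt(1.0 / (2.0 * rate * ebn0))

def control_V(z, bits):
    """Only noise components that could cause a bit error at the decoder
    input (q_l z_l > 0, i.e. H(q_l z_l) = 1) contribute to V."""
    s = 0.0
    for zl, bl in zip(z, bits):
        ql = -1.0 if bl else 1.0          # q_l = (-1)^b_l
        if ql * zl > 0.0:
            s += zl * zl
    return math.sqrt(s / len(z))
```

For the all-zeros codeword (b.sub.l=0 throughout), only positive noise components contribute, matching the simulation setup described in the text.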
[0071] Given a range [V.sub.min, V.sub.max] for V, .GAMMA. is
partitioned into M subsets
.GAMMA..sub.k={z.epsilon..GAMMA.|V.sub.k-1.ltoreq.V(z)<V.sub.k},
where V.sub.k=V.sub.min+k.DELTA.V, 1.ltoreq.k.ltoreq.M, and
.DELTA.V=V.sub.k-V.sub.k-1=(V.sub.max-V.sub.min)/M is the width of
each bin in the partition of [V.sub.min, V.sub.max]. Let P.sub.k be
the probability of selecting a realization z from .rho. so that
z.epsilon..GAMMA..sub.k. Then

$$P_k = \int_{\Gamma} \chi_k(z)\,\frac{\rho(z)}{\rho^*(z)}\,\rho^*(z)\,dz \approx \frac{1}{N}\sum_{i=1}^{N} \chi_k(z^{*,i})\,\frac{\rho(z^{*,i})}{\rho^*(z^{*,i})}, \qquad (1)$$

where .rho.*(z) is a positive biasing pdf, .chi..sub.k(z)=1 if z.epsilon..GAMMA..sub.k and .chi..sub.k(z)=0 otherwise, and the z.sup.*,i are N random sample points in .GAMMA., selected according to the pdf .rho.*(z). The variance of the estimate in (1) is zero if the optimal biasing pdf

$$\rho^*_{\mathrm{opt}}(z) = \frac{\chi_k(z)\,\rho(z)}{P_k}$$

is used. However, .rho.*.sub.opt(z) depends on P.sub.k, which is initially unknown. In standard IS, one uses physical intuition to guess a biasing pdf that is close to .rho.*.sub.opt.
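Estimator (1) can be sketched as follows; the densities, samples, and bin mapping are caller-supplied, and the names are illustrative.

```python
# Sketch of estimator (1): P_k as the average likelihood ratio rho/rho*
# over N samples drawn from the biasing pdf rho*, restricted to bin k.
def is_bin_probabilities(samples, rho, rho_star, bin_of, M):
    """samples: z^{*,i} drawn from rho_star; bin_of(z) -> bin index (or None
    for out-of-range z). Returns the IS estimates of P_k for k = 0..M-1."""
    N = len(samples)
    P = [0.0] * M
    for z in samples:
        k = bin_of(z)
        if k is not None:                  # chi_k(z) = 1 for exactly one k
            P[k] += rho(z) / rho_star(z)   # likelihood-ratio weight
    return [p / N for p in P]
```

With rho = rho* this degenerates to an ordinary histogram estimate; the variance reduction comes entirely from a good choice of rho*.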
[0072] The MMC algorithm instead iterates over a sequence of biasing pdfs .rho..sup.*,j that approach .rho.*.sub.opt. .rho..sup.*,j is defined for the jth iteration by .rho..sup.*,j(z)=.rho.(z)/(c.sup.jP.sub.k.sup.j), where k is such that z.epsilon..GAMMA..sub.k is satisfied. The quantities P.sub.k.sup.j satisfy P.sub.k.sup.j>0 and

$$\sum_{k=1}^{M} P_k^j = 1,$$

and c.sup.j is an unknown constant that ensures that .rho..sup.*,j integrates to one. The vector P.sub.k.sup.j, k=1, . . . , M, is the key quantity in the MMC algorithm and completely determines the bias. At the first MMC iteration, P.sub.k.sup.1 is usually set to 1/M, .A-inverted.k=1, . . . , M.
[0073] Within each MMC iteration j, the Metropolis algorithm is
employed to produce a random walk of samples z.sup.*,i whose pdf
equals .rho..sup.*,j(z). The Metropolis algorithm is described in
N. Metropolis, et al., "Equation of state calculations by fast
computing machines," J. Chem. Phys., vol. 21, pp. 1087-1092, 1953,
and is hereby incorporated by reference in its entirety.
[0074] A Markov chain of transitions consisting of small steps in the noise space is considered. Each transition goes from z.sup.*,i=z*.sub.a.epsilon..GAMMA..sub.k.sub.a to z*.sub.b=(z*.sub.a+.epsilon..sup.j.DELTA.z).epsilon..GAMMA..sub.k.sub.b, where .DELTA.z is random and symmetric, i.e., it does not favor any direction in .GAMMA., and the transition is accepted with probability .pi..sub.ab. If a transition from z.sup.*,i to z*.sub.b is accepted, z.sup.*,i+1=z*.sub.b is set; else z.sup.*,i+1=z.sup.*,i=z*.sub.a is set. The ratio .pi..sub.ab/.pi..sub.ba equals .rho..sup.*,j(z*.sub.b)/.rho..sup.*,j(z*.sub.a), which is the detailed balance condition that ensures that the limiting (stationary) pdf for infinitely many steps of this random walk is .rho..sup.*,j.
[0075] The perturbation of the noise component in each bit z*.sub.a,l of z*.sub.a is considered separately, and it is accepted or rejected independently with the probability

$$\min\left[\frac{\rho_l(z^*_{b,l})}{\rho_l(z^*_{a,l})},\,1\right].$$

Each perturbation .DELTA.z.sub.l is picked from a zero mean symmetric pdf. A trial state z*.sub.b is obtained in which only some of the components are different from their previous values in z*.sub.a. Next, k.sub.b, the bin corresponding to z*.sub.b, is computed, and the step from z*.sub.a to z*.sub.b is accepted with the probability

$$\min\left[\frac{P_{k_a}^j}{P_{k_b}^j},\,1\right].$$
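One Metropolis transition of paragraphs [0074]-[0075] can be sketched as follows. The Gaussian component pdf, the uniform trial perturbation, and the placement of the constrained-rejection test are assumptions of this sketch, not specifics of the disclosure.

```python
import math, random

# Sketch of one Metropolis transition (paragraphs [0074]-[0075]): perturb
# each noise component and accept it independently against the pdf ratio,
# then accept or reject the whole step against P_{k_a}^j / P_{k_b}^j.
def metropolis_step(z, eps, sigma, P, bin_of, constrained=False, in_error=None):
    rho = lambda x: math.exp(-x * x / (2.0 * sigma * sigma))  # up to a constant
    trial = list(z)
    for l in range(len(z)):
        dz = eps * random.uniform(-1.0, 1.0)   # symmetric perturbation
        # component accepted with prob min[rho_l(z*_{b,l}) / rho_l(z*_{a,l}), 1]
        if random.random() < min(rho(z[l] + dz) / rho(z[l]), 1.0):
            trial[l] = z[l] + dz
    if constrained and not in_error(trial):
        return z            # constrained walk: pi_ab = 0 outside the error set
    ka, kb = bin_of(z), bin_of(trial)
    # whole step accepted with prob min[P_{k_a}^j / P_{k_b}^j, 1]
    if random.random() < min(P[ka] / P[kb], 1.0):
        return trial
    return z
```

In use, eps would be retuned after each iteration toward the acceptance-ratio target of about 0.3 discussed in paragraph [0076].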
[0076] In each iteration, the perturbation coefficient
.epsilon..sup.j is constant for all samples. After each iteration,
.epsilon..sup.j is adjusted so that the acceptance ratio .alpha.=(number of accepted steps)/(total number of steps, N) is close to 0.3 (empirically chosen based on experience from previous experiments).
from previous experiments). The minimum required N for this random
walk depends on the average step size
.alpha..epsilon..sup.j<|.DELTA.z|> and hence is
system-dependent. The noise realizations are recorded in the histogram H.sup.*,j, where
[0077] H.sub.k.sup.*,j is the number of the z.sup.*,i in iteration j that fall into .GAMMA..sub.k. The P.sub.k.sup.j are updated after each MMC
iteration using the recursion relations given in B. A. Berg,
"Algorithmic aspects of multicanonical Monte Carlo simulations,"
Nucl. Phys. Proc. Suppl., vol. 63, pp. 982-984, 1998, which is
hereby incorporated by reference in its entirety, based on the
histogram H.sup.*,j. As j increases, the expected number of samples
<H.sub.k.sup.*,j> becomes independent of the bin number k,
which implies that P.sub.k.sup.j.fwdarw.P.sub.k.
[0078] Let P.sub.err be the probability that a received word with
noise realization z selected according to pdf .rho. leads to an
error, and P.sub.err,k the probability that z leads to an error and
falls into bin k. Then

$$P_{\mathrm{err},k} = P_{\mathrm{err}|k}\,P_k = P_{k|\mathrm{err}}\,P_{\mathrm{err}}, \qquad (2a)$$

$$P_{\mathrm{err}} = \sum_{k=1}^{M} P_{\mathrm{err},k}, \qquad (2b)$$

where P.sub.err|k and P.sub.k|err are the conditional probabilities of an error given that z falls into bin k, and vice versa. P.sub.err can be computed by first running an MMC simulation as described above, where the errors in bin k are also counted to produce a histogram G.sub.k.sup.*,j. One can then approximate

$$P_{\mathrm{err}|k} \approx \frac{\sum_{j=1}^{j_{\max}} G_k^{*,j}}{\sum_{j=1}^{j_{\max}} H_k^{*,j}}$$

after j.sub.max MMC iterations. Summing over all MMC iterations is valid since the biasing pdf at any MMC iteration only affects the total number of hits in a bin, but not the behavior of error hits relative to the total hits within a bin. Finally, one can use the left equation of (2a), and equation (2b), to get P.sub.err.
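The error-probability reconstruction of paragraphs [0077]-[0078] can be sketched as follows; the histogram contents in any usage are illustrative.

```python
# Sketch of paragraphs [0077]-[0078]: accumulate total-hit (H) and
# error-hit (G) histograms over all MMC iterations, estimate P_err|k as
# their per-bin ratio, and combine with P_k via (2a)-(2b).
def p_err_from_histograms(H_iters, G_iters, P):
    """H_iters[j][k]: total hits; G_iters[j][k]: error hits; P[k]: P_k."""
    M = len(P)
    H = [sum(Hj[k] for Hj in H_iters) for k in range(M)]
    G = [sum(Gj[k] for Gj in G_iters) for k in range(M)]
    # P_err|k ~ G_k / H_k (summing over iterations is valid, per the text)
    p_err_given_k = [G[k] / H[k] if H[k] > 0 else 0.0 for k in range(M)]
    # (2a): P_err,k = P_err|k P_k ; (2b): P_err = sum_k P_err,k
    return sum(pe * pk for pe, pk in zip(p_err_given_k, P))
```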
[0079] In FIG. 5a, the approximation of P.sub.err|k is shown with dots for E.sub.b/N.sub.0=11 dB. The dashed line shows P.sub.k/.DELTA.V, and the circles show the sum of the error histograms over all iterations. The number of sampled errors rapidly decreases to 0 as V decreases towards 0.4, which is where P.sub.err,k tends to be largest. Consequently, the approximation of P.sub.err|k converges very slowly as the iteration number j increases. The reason is that in this unconstrained MMC simulation, not enough of the higher-probability smaller-noise realizations that generate errors have been sampled.
[0080] One efficient method to overcome this undersampling problem
is to run a second, constrained MMC simulation (hence the term
"dual" in DAIS), in which one only accepts Metropolis steps that
lead to errors. If a trial realization z*.sub.b does not yield an
error in this simulation, we set .pi..sub.ab to zero. The
constrained simulation hence takes its samples from

$$\tilde{\rho}(z) = \frac{\chi_{\mathrm{err}}(z)\,\rho(z)}{P_{\mathrm{err}}},$$

where .chi..sub.err(z)=1 if
z produces an error and .chi..sub.err(z)=0 otherwise. Note that
{tilde over (.rho.)}(z) is proportional to .rho.(z) wherever
.chi..sub.err(z)=1. If the Metropolis random walk is ergodic in the
error subset of .GAMMA., the constrained MMC simulation
approximates P.sub.k|err. Since the P.sub.err,k and P.sub.k|err
estimates obtained using the two simulations are both smooth for
large k, using (2a) we can obtain P.sub.err=P.sub.err,k/P.sub.k|err
from the data where k is large.
[0081] In FIG. 5a, the dash-dot line shows P.sub.k|err/.DELTA.V
obtained from the constrained simulation, while the solid line
shows the resulting P.sub.err,k/.DELTA.V obtained by scaling
P.sub.k|err/.DELTA.V to fit P.sub.k/.DELTA.V from the unconstrained
simulation for 0.55<V<0.6. Since MMC yields a similar number
of samples in each bin, the relative statistical sampling error of
P.sub.k|err in the constrained simulation is smaller at small V
than in the unconstrained simulation. A significant advantage of
running separate unconstrained and constrained simulations is that
the algorithm optimizes the perturbation coefficients
.epsilon..sup.j of the two simulations independently. The values of
.epsilon..sup.j tend to differ strongly between the two
simulations.
[0082] In these simulations, M=300. In the first iteration, N.sup.j=5000 samples were used in the unconstrained case and N.sup.j=10000 in the constrained case, and the number of samples is increased after each iteration so that N.sup.j+1=1.3N.sup.j. In each case, P.sub.k.sup.1=1/M, k=1, . . . , M, and it is assumed that the simulation has sufficiently converged when max|(P.sub.k.sup.j-P.sub.k.sup.j+1)/P.sub.k.sup.j+1|<0.1. This convergence requires .apprxeq.10.sup.6 to 10.sup.8 samples in total, with the number of samples increasing on average with increasing E.sub.b/N.sub.0. Also, in both cases, each MMC iteration is initialized with a z that gives a decoder error.
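The stopping rule and sample-count schedule of paragraph [0082] amount to:

```python
# Sketch of paragraph [0082]: convergence when the maximum relative change
# of the weights falls below 0.1, with N^{j+1} = 1.3 N^j samples per pass.
def converged(P_old, P_new, tol=0.1):
    return max(abs((po - pn) / pn) for po, pn in zip(P_old, P_new)) < tol

def next_sample_count(N):
    return int(1.3 * N)
```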
[0083] In FIG. 5b, the x and + symbols denote the decoder output
BER and WER estimates, respectively, obtained via Monte Carlo (MC).
The dashed curve with .quadrature. and dash-dot curve with
.smallcircle. denote the decoder output BER and WER estimates,
respectively, obtained using DAIS. Finally, the solid curve and
dotted curve denote the BER and WER union bounds, respectively.
[0084] The union bound can be closely approximated at high
E.sub.b/N.sub.o by the contribution of low Hamming weight (6 and 8
in this case) codewords. The SPD for LDPC codes approximates the ML
decoder. Hence, one would expect the SPD to perform worse than the
union bound on ML decoding at high E.sub.b/N.sub.o. The results
from DAIS are consistent with this expectation and indicate that
DAIS can simulate WER and BER performance of codes at very low
values. Excellent agreement is also observed between the results
obtained by DAIS and MC, wherever MC results are available (DAIS
falls within the 99% error bars for MC), which further validates
DAIS.
[0085] The assertion that the true code performance should be close
to the union bound at high E.sub.b/N.sub.o is further bolstered by
the observation that for MC simulations, as E.sub.b/N.sub.o
increases, the contribution of the probability of decoding-to-wrong
codewords progressively dominates the WER. For example, at
E.sub.b/N.sub.o=4 dB, 216 of 1888 word errors recorded were due to
decoding to wrong codewords (the rest were decoder failures),
whereas at E.sub.b/N.sub.o=7 dB, the corresponding numbers were 40
of 52. Note that the BER results in the B. Xia and W. E. Ryan
reference are farther away from the union bound than the present
results (by about 0.4 dB at BER=10.sup.-9), which may be attributed
to their use of .ltoreq.5 iterations for the SPD, and possibly a
different decoder implementation.
[0086] It is noted that the present BER data points do not show a
waterfall region since they correspond to large E.sub.b/N.sub.o
relative to the Shannon limit (.apprxeq.0 dB for the present code),
and since the code is not very long. BER estimates down to
10.sup.-39 for a smaller (20, 7) code have also been obtained.
[0087] A measure of DAIS's improvements over MC is given by the
ratio of the number of samples (codewords) required to achieve a
given WER at a given E.sub.b/N.sub.o, e.g., at E.sub.b/N.sub.o=10
dB, WER=10.sup.-14 is obtained by DAIS using 8.times.10.sup.7
codewords (unconstrained)+3.times.10.sup.7 codewords
(constrained)=11.times.10.sup.7 codewords (total), whereas MC would
require .gtoreq.10.sup.15 codewords (assuming .gtoreq.10 word error
events). Thus the gain is $\frac{10^{15}}{11\times 10^{7}} \approx 9\times 10^{6}$. It is believed that DAIS's gain increases with
decreasing WER.
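The quoted gain is straightforward arithmetic on the sample counts above:

```python
# Arithmetic behind the gain figure in paragraph [0087].
dais_samples = 8e7 + 3e7          # unconstrained + constrained codewords
mc_samples = 1e15                 # codewords plain MC would need
gain = mc_samples / dais_samples  # about 9 x 10^6
```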
[0088] As code length increases, the dimensionality of .GAMMA. and
its partitions that map to bins of V increases. Hence, maintaining
a given level of statistical accuracy in sampling each partition of
.GAMMA. requires more samples for the longer code.
System for Estimating Probability of Occurrence of Events
[0089] FIG. 6 is a block diagram of a system for estimating the
probability of occurrence of events, in accordance with one
embodiment of the invention. The system includes a DAIS processor
500 and a user interface 600. The DAIS processor 500 is preferably
programmed with code for implementing the methods discussed above,
and illustrated in FIGS. 1-5. The code can be written in any
processor language known in the art; however, it is preferably
written in ANSI C.
[0090] The code that is used to implement the methods of the
present invention can be stored on a computer readable medium 700,
such as a hard drive, read-only memory (ROM), random access memory
(RAM), or any other type of electronic memory or storage device
using any type of media, such as magnetic, optical or other media.
Although the computer readable medium 700 in FIG. 6 is shown as
separate from the DAIS processor 500, it can be incorporated into
the DAIS processor 500.
[0091] The DAIS processor 500 can be a general purpose computer.
However, the DAIS processor 500 can also be a special purpose
computer, programmed microprocessor or microcontroller and
peripheral integrated circuit elements, ASICs or other integrated
circuits, hardwired electronic or logic circuits such as discrete
element circuits, or programmable logic devices such as an FPGA,
PLD, PLA, or PAL. In general, any device on which resides a finite
state machine capable of implementing the flowcharts of FIGS. 1-3
can be used to implement the DAIS processor 500.
[0092] The user interface 600 is preferably a graphical user
interface (GUI) with associated parameter error-checking in order
to prevent improper operation of the code due to inappropriate
parameter initialization, and to make the code user-friendly. The
user interface 600 preferably accepts code inputs and
initializations, provides clear and useful messages to the user in
the event that the input parameters are illegal, handles the data
processing once the simulations have been completed, and presents
the results of the code in a predefined and easily understandable
format. The user interface 600 is preferably informative, easy to
understand, and capable of providing the user with frequent
messages regarding the status of the simulation and guiding the
user with step-by-step instructions, wherever possible.
[0093] The system and methods of the present invention can be
applied to several general classes of problems: [0094] (1)
Estimation of very low probability events of the form
E={z|g(z).epsilon.G}, where g(z) is known analytically or
algorithmically, or is representative of an actual system under
test, and G is an observable/measurable event; [0095] (2)
Evaluation of expectations E{g(z)}, with z.about..rho.(z),
where the largest contributors to E{g(z)} come from regions
z.epsilon.E such that P(E) is very small; and [0096] (3) Evaluation
of very small integrals .intg..sub.Ag(z) dz where z is non-random
and .intg..sub.Adz is also very small.
[0097] Problem class (1) includes the application addressed above
(estimation of BERs/WERs of FEC codes), as well as other
communication system simulation problems where extremely small
probability events need to be evaluated or simulation time is too
long using previous Monte-Carlo-based simulation techniques. It
also includes the case where one is testing an actual system (e.g.,
electromechanical or electro-optical) by both controlling its input
z and processing the output data via the methods of the present
invention.
[0098] Class (1) also includes the case where the algorithmic
description of the system or operation g(z) consists of several
interconnected subsystems, and requires a large amount of
computation time to produce an output for a given input. Problem
class (3) is addressed by the present invention by using
.rho.(z).about.uniform on the sample space for z.
[0099] The foregoing embodiments and advantages are merely
exemplary, and are not to be construed as limiting the present
invention. The present teaching can be readily applied to other
types of apparatuses. The description of the present invention is
intended to be illustrative, and not to limit the scope of the
claims. Many alternatives, modifications, and variations will be
apparent to those skilled in the art. Various changes may be made
without departing from the spirit and scope of the invention, as
defined in the following claims.
* * * * *
References