U.S. patent application number 09/030488 was filed with the patent office on 1998-02-25 and published on 2001-08-23 for multiple description transform coding using optimal transforms of arbitrary dimension.
Invention is credited to GOYAL, VIVEK K., KOVACEVIC, JELENA.
Application Number | 20010016079 09/030488 |
Family ID | 21854439 |
Filed Date | 1998-02-25 |
United States Patent
Application |
20010016079 |
Kind Code |
A1 |
GOYAL, VIVEK K. ; et
al. |
August 23, 2001 |
MULTIPLE DESCRIPTION TRANSFORM CODING USING OPTIMAL TRANSFORMS OF
ARBITRARY DIMENSION
Abstract
A multiple description (MD) joint source-channel (JSC) encoder
in accordance with the invention encodes n components of a signal
for transmission over m channels of a communication medium. In
illustrative embodiments, the invention provides optimal or
near-optimal transforms for applications in which at least one of n
and m is greater than two, and applications in which the failure
probabilities of the m channels are non-independent and
non-equivalent. The signal to be encoded may be a data signal, a
speech signal, an audio signal, an image signal, a video signal or
other type of signal, and each of the m channels may correspond to
a packet or a group of packets to be transmitted over the medium. A
given n.times.m transform implemented by the MD JSC encoder may be
in the form of a cascade structure of several transforms each
having dimension less than n.times.m. The transform may also be
configured to provide a substantially equivalent rate for each of
the m channels.
Inventors: |
GOYAL, VIVEK K.; (BERKELEY,
CA) ; KOVACEVIC, JELENA; (NEW YORK, NY) |
Correspondence
Address: |
JOSEPH B RYAN
RYAN, MASON & LEWIS, LLP
90 Forest Avenue
LOCUST VALLEY
NY
11560
US
|
Family ID: |
21854439 |
Appl. No.: |
09/030488 |
Filed: |
February 25, 1998 |
Current U.S.
Class: |
382/251 |
Current CPC
Class: |
H04S 1/00 20130101 |
Class at
Publication: |
382/251 |
International
Class: |
G06K 009/36; G06K
009/38; G06K 009/46 |
Claims
What is claimed is:
1. A method of encoding a signal for transmission, comprising the
steps of: encoding n components of the signal in a multiple
description joint source-channel encoder for transmission over m
channels, wherein at least one of n and m is greater than two; and
transmitting the encoded components of the signal.
2. The method of claim 1 wherein the signal includes at least one
of a data signal, a speech signal, an audio signal, an image signal
and a video signal.
3. The method of claim 1 wherein each of the channels corresponds
to at least one packet.
4. The method of claim 1 wherein at least a subset of the m
channels have probabilities of failure which are not independent of
one another.
5. The method of claim 1 wherein at least a subset of the m
channels have non-equivalent probabilities of failure.
6. The method of claim 1 wherein the encoding step includes
encoding the n components for transmission over the m channels
using a transform of dimension n.times.m.
7. The method of claim 1 wherein the encoding step includes
encoding the n components for transmission over the m channels
using a transform which is in the form of a cascade structure of a
plurality of transforms each having dimension less than
n.times.m.
8. The method of claim 1 wherein the encoding step includes
encoding the n components for transmission over the m channels
using a transform which is configured to provide a substantially
equivalent rate for each of the channels.
9. The method of claim 1 wherein the encoding step includes
encoding the n components for transmission over the m channels in a
multiple description joint source-channel encoder which includes a
series combination of N multiple description encoders followed by
an entropy coder, wherein each of the N multiple description
encoders includes a parallel arrangement of M multiple description
encoders.
10. The method of claim 9 wherein each of the M multiple
description encoders implements one of: (i) a quantizer block
followed by a transform block, (ii) a transform block followed by a
quantizer block, (iii) a quantizer block with no transform block,
and (iv) an identity function.
11. An apparatus for encoding a signal for transmission,
comprising: a processor for processing the signal to form
components thereof; and a multiple description joint source-channel
encoder for encoding n components of the signal for transmission
over m channels, wherein at least one of n and m is greater than
two.
12. The apparatus of claim 11 wherein the signal includes at least
one of a data signal, a speech signal, an audio signal, an image
signal and a video signal.
13. The apparatus of claim 11 wherein each of the channels
corresponds to at least one packet.
14. The apparatus of claim 11 wherein at least a subset of the m
channels have probabilities of failure which are not independent of
one another.
15. The apparatus of claim 11 wherein at least a subset of the m
channels have non-equivalent probabilities of failure.
16. The apparatus of claim 11 wherein the multiple description
joint source-channel encoder is operative to encode the n
components for transmission over the m channels using a transform
of dimension n.times.m.
17. The apparatus of claim 11 wherein the multiple description
joint source-channel encoder is operative to encode the n
components for transmission over the m channels using a transform
which is in the form of a cascade structure of a plurality of
transforms each having dimension less than n.times.m.
18. The apparatus of claim 11 wherein the multiple description
joint source-channel encoder is operative to encode the n
components for transmission over the m channels using a transform
which is configured to provide a substantially equivalent rate for
each of the channels.
19. The apparatus of claim 11 wherein the multiple description
joint source-channel encoder further includes a series combination
of N multiple description encoders followed by an entropy coder,
wherein each of the N multiple description encoders includes a
parallel arrangement of M multiple description encoders.
20. The apparatus of claim 19 wherein each of the M multiple
description encoders implements one of: (i) a quantizer block
followed by a transform block, (ii) a transform block followed by a
quantizer block, (iii) a quantizer block with no transform block,
and (iv) an identity function.
21. A method of decoding a signal received over a communication
medium, comprising the steps of: receiving encoded components of
the signal over m channels of the medium; and decoding n of the
components of the signal in a multiple description joint
source-channel decoder, wherein at least one of n and m is greater
than two.
22. An apparatus for decoding a signal received over a
communication medium, comprising: a multiple description joint
source-channel decoder for decoding n components of the signal
received over m channels of the medium, wherein at least one of n
and m is greater than two.
23. A method of encoding a signal for transmission, comprising the
steps of: encoding n components of the signal in a multiple
description joint source-channel encoder for transmission over m
channels, wherein at least a subset of the m channels have
probabilities of failure which are not independent of one another;
and transmitting the encoded components of the signal.
24. An apparatus for encoding a signal for transmission,
comprising: a processor for processing the signal to form
components thereof; and a multiple description joint source-channel
encoder for encoding n components of the signal for transmission
over m channels, wherein at least a subset of the m channels have
probabilities of failure which are not independent of one another.
Description
FIELD OF THE INVENTION
[0001] The present invention relates generally to multiple
description transform coding (MDTC) of data, speech, audio, images,
video and other types of signals for transmission over a network or
other type of communication medium.
BACKGROUND OF THE INVENTION
[0002] Multiple description transform coding (MDTC) is a type of
joint source-channel coding (JSC) designed for transmission
channels which are subject to failure or "erasure." The objective
of MDTC is to ensure that a decoder which receives an arbitrary
subset of the channels can produce a useful reconstruction of the
original signal. A distinguishing characteristic of MDTC is the
introduction of correlation between transmitted coefficients in a
known, controlled manner so that lost coefficients can be
statistically estimated from received coefficients. This
correlation is used at the decoder at the coefficient level, as
opposed to the bit level, so it is fundamentally different than
techniques that use information about the transmitted data to
produce likelihood information for the channel decoder. The latter
is a common element in other types of JSC coding systems, as shown,
for example, in P. G. Sherwood and K. Zeger, "Error Protection of
Wavelet Coded Images Using Residual Source Redundancy," Proc. of
the 31.sup.st Asilomar Conference on Signals, Systems and
Computers, November 1997.
[0003] A known MDTC technique for coding pairs of independent
Gaussian random variables is described in M. T. Orchard et al.,
"Redundancy Rate-Distortion Analysis of Multiple Description Coding
Using Pairwise Correlating Transforms," Proc. IEEE Int. Conf. Image
Proc., Santa Barbara, Calif., October 1997. This MDTC technique
provides optimal 2.times.2 transforms for coding pairs of signals
for transmission over two channels. However, this technique as well
as other conventional techniques fail to provide optimal
generalized n.times.m transforms for coding any n signal components
for transmission over any m channels. Moreover, the optimality of
the 2.times.2 transforms in the M. T. Orchard et al. reference
requires that the channel failures be independent and have equal
probabilities. The conventional techniques thus generally do not
provide optimal transforms for applications in which, for example,
channel failures either are dependent or have unequal
probabilities, or both. This inability of conventional techniques
to provide suitable transforms for arbitrary dimensions and
different types of channel failure probabilities unduly restricts
the flexibility of MDTC, thereby preventing its effective
implementation in many important applications.
SUMMARY OF THE INVENTION
[0004] The invention provides MDTC techniques which can be used to
implement optimal or near-optimal n.times.m transforms for coding
any number n of signal components for transmission over any number
m of channels. A multiple description (MD) joint source-channel
(JSC) encoder in accordance with an illustrative embodiment of the
invention encodes n components of a signal for transmission over m
channels of a communication medium, in applications in which at
least one of n and m may be greater than two, and in which the
failure probabilities of the m channels may be non-independent and
non-equivalent. An n.times.m transform implemented by the MD JSC
encoder may be in the form of a cascade structure of several
transforms each having dimension less than n.times.m. An exemplary
transform in accordance with the invention may include an
additional degree of freedom not found in conventional MDTC
transforms. This additional degree of freedom provides considerable
improvement in design flexibility, and may be used, for example, to
partition a total available rate among the m channels such that
each channel has substantially the same rate.
[0005] In accordance with another aspect of the invention, an MD
JSC encoder may include a series combination of N "macro" MD
encoders followed by an entropy coder, and each of the N macro MD
encoders includes a parallel arrangement of M "micro" MD encoders.
Each of the M micro MD encoders implements one of: (i) a quantizer
block followed by a transform block, (ii) a transform block
followed by a quantizer block, (iii) a quantizer block with no
transform block, and (iv) an identity function. This general MD JSC
encoder structure allows the encoder to implement any desired
n.times.m transform while also minimizing design complexity.
[0006] The MDTC techniques of the invention do not require
independent or equivalent channel failure probabilities. As a
result, the invention allows MDTC to be implemented effectively in
a much wider range of applications than has heretofore been
possible using conventional techniques. The MDTC techniques of the
invention are suitable for use in conjunction with signal
transmission over many different types of channels, including lossy
packet networks such as the Internet as well as broadband ATM
networks, and may be used with data, speech, audio, images, video
and other types of signals.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] FIG. 1 shows an exemplary communication system in accordance
with the invention.
[0008] FIG. 2 shows a multiple description (MD) joint
source-channel (JSC) encoder in accordance with the invention.
[0009] FIG. 3 shows an exemplary macro MD encoder for use in the MD
JSC encoder of FIG. 2.
[0010] FIG. 4 shows an entropy encoder for use in the MD JSC
encoder of FIG. 2.
[0011] FIGS. 5A through 5D show exemplary micro MD encoders for use
in the macro MD encoder of FIG. 3.
[0012] FIGS. 6A, 6B and 6C show respective audio encoder, image
encoder and video encoder embodiments of the invention, each
including the MD JSC encoder of FIG. 2.
[0013] FIG. 7A shows a relationship between redundancy and channel
distortion in an exemplary embodiment of the invention.
[0014] FIG. 7B shows relationships between distortion when both of
two channels are received and distortion when one of the two
channels is lost, for various rates, in an exemplary embodiment of
the invention.
[0015] FIG. 8 illustrates an exemplary 4.times.4 cascade structure
which may be used in an MD JSC encoder in accordance with the
invention.
DETAILED DESCRIPTION OF THE INVENTION
[0016] The invention will be illustrated below in conjunction with
exemplary MDTC systems. The techniques described may be applied to
transmission of a wide variety of different types of signals,
including data signals, speech signals, audio signals, image
signals, and video signals, in either compressed or uncompressed
formats. The term "channel" as used herein refers generally to any
type of communication medium for conveying a portion of an encoded
signal, and is intended to include a packet or a group of packets.
The term "packet" is intended to include any portion of an encoded
signal suitable for transmission as a unit over a network or other
type of communication medium.
[0017] FIG. 1 shows a communication system 10 configured in
accordance with an illustrative embodiment of the invention. A
discrete-time signal is applied to a pre-processor 12. The
discrete-time signal may represent, for example, a data signal, a
speech signal, an audio signal, an image signal or a video signal,
as well as various combinations of these and other types of
signals. The operations performed by the pre-processor 12 will
generally vary depending upon the application. The output of the
pre-processor 12 is a source sequence {x.sub.k} which is applied to a
multiple description (MD) joint source-channel (JSC) encoder 14.
The encoder 14 encodes n different components of the source
sequence {x.sub.k} for transmission over m channels, using
transform, quantization and entropy coding operations. Each of the
m channels may represent, for example, a packet or a group of
packets. The m channels are passed through a network 15 or other
suitable communication medium to an MD JSC decoder 16. The decoder
16 reconstructs the original source sequence {x.sub.k} from the
received channels. The MD coding implemented in encoder 14 operates
to ensure optimal reconstruction of the source sequence in the
event that one or more of the m channels are lost in transmission
through the network 15. The output of the MD JSC decoder 16 is
further processed in a post processor 18 in order to generate a
reconstructed version of the original discrete-time signal.
[0018] FIG. 2 illustrates the MD JSC encoder 14 in greater detail.
The encoder 14 includes a series arrangement of N macro MD.sub.i
encoders MD.sub.1, . . . MD.sub.N corresponding to reference
designators 20-1, . . . 20-N. An output of the final macro MD.sub.N
encoder 20-N is applied to an entropy coder 22. FIG. 3 shows the
structure of each of the macro MD.sub.i encoders 20-i. Each of the
macro MD.sub.i encoders 20-i receives as an input an r-tuple, where
r is an integer. Each of the elements of the r-tuple is applied to
one of M micro MD.sub.j encoders MD.sub.1, . . . MD.sub.M
corresponding to reference designators 30-1, . . . 30-M. The output
of each of the macro MD.sub.i encoders 20-i is an s-tuple, where s
is an integer greater than or equal to r.
[0019] FIG. 4 indicates that the entropy coder 22 of FIG. 2
receives an r-tuple as an input, and generates as outputs the m
channels for transmission over the network 15. In accordance with
the invention, the m channels may have any distribution of
dependent or independent failure probabilities. More specifically,
given that a channel i is in a state S.sub.i.epsilon.{0, 1}, where
S.sub.i=0 indicates that the channel has failed while S.sub.i=1
indicates that the channel is working, the overall state S of the
system is given by the cartesian product of the channel states
S.sub.i over m, and the individual channel probabilities may be
configured so as to provide any probability distribution function
which can be defined on the overall state S.
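As a minimal illustration of this state model, the following Python sketch enumerates the overall state S for m channels and attaches a joint probability to each state. The function names and the example probabilities are hypothetical and are chosen only to show channel failures that are neither independent nor equally likely.

```python
from itertools import product

def enumerate_states(m):
    """All overall states S = (S_1, ..., S_m), with S_i = 1 (working) or 0 (failed)."""
    return list(product((0, 1), repeat=m))

def marginal_failure(pmf, i):
    """P(S_i = 0): total probability of the states in which channel i has failed."""
    return sum(p for state, p in pmf.items() if state[i] == 0)

# Hypothetical joint distribution for m = 2 (values are illustrative only).
pmf = {(1, 1): 0.85, (0, 1): 0.08, (1, 0): 0.02, (0, 0): 0.05}

states = enumerate_states(2)
p_fail_1 = marginal_failure(pmf, 0)   # about 0.13 for channel 1
p_fail_2 = marginal_failure(pmf, 1)   # about 0.07 for channel 2
# The failures are dependent here: P(both fail) = 0.05 differs from 0.13 * 0.07.
```

Any assignment of non-negative probabilities that sums to one over the 2.sup.m states is admissible in this model.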
[0020] FIGS. 5A through 5D illustrate a number of possible
embodiments for each of the micro MD.sub.j encoders 30-j. FIG. 5A
shows an embodiment in which a micro MD.sub.j encoder 30-j includes
a quantizer (Q) block 50 followed by a transform (T) block 51. The
Q block 50 receives an r-tuple as input and generates a
corresponding quantized r-tuple as an output. The T block 51
receives the r-tuple from the Q block 50, and generates a
transformed r-tuple as an output. FIG. 5B shows an embodiment in
which a micro MD.sub.j encoder 30-j includes a T block 52 followed
by a Q block 53. The T block 52 receives an r-tuple as input and
generates a corresponding transformed s-tuple as an output. The Q
block 53 receives the s-tuple from the T block 52, and generates a
quantized s-tuple as an output, where s is greater than or equal to
r. FIG. 5C shows an embodiment in which a micro MD.sub.j encoder
30-j includes only a Q block 54. The Q block 54 receives an r-tuple
as input and generates a quantized s-tuple as an output, where s is
greater than or equal to r. FIG. 5D shows another possible
embodiment, in which a micro MD.sub.j encoder 30-j does not include
a Q block or a T block but instead implements an identity function,
simply passing an r-tuple at its input through to its output. The
micro MD.sub.j encoders 30-j of FIG. 3 may each include a different
one of the structures shown in FIGS. 5A through 5D.
[0021] FIGS. 6A through 6C illustrate the manner in which the MD
JSC encoder 14 of FIG. 2 can be implemented in a variety of
different encoding applications. In each of the embodiments shown
in FIGS. 6A through 6C, the MD JSC encoder 14 is used to implement
the quantization, transform and entropy coding operations typically
associated with the corresponding encoding application. FIG. 6A
shows an audio coder 60 which includes an MD JSC encoder 14
configured to receive input from a conventional psychoacoustics
processor 61. FIG. 6B shows an image coder 62 which includes an MD
JSC encoder 14 configured to interact with an element 63 providing
preprocessing functions and perceptual table specifications. FIG.
6C shows a video coder 64 which includes first and second MD JSC
encoders 14-1 and 14-2. The encoder 14-1 receives input from a
conventional motion compensation element 66, while the second
encoder receives input from a conventional motion estimation
element 68. The encoders 14-1 and 14-2 are interconnected as shown.
It should be noted that these are only examples of applications of
an MD JSC encoder in accordance with the invention. It will be
apparent to those skilled in the art that numerous alternate
configurations may also be used, in audio, image, video and other
applications.
[0022] A general model for analyzing MDTC techniques in accordance
with the invention will now be described. Assume that a source
sequence {x.sub.k} is input to an MD JSC encoder, which outputs m
streams at rates R.sub.1, R.sub.2, . . . R.sub.m. These streams are
transmitted on m separate channels. One version of the model may be
viewed as including many receivers, each of which receives a subset
of the channels and uses a decoding algorithm based on which
channels it receives. More specifically, there may be 2.sup.m - 1
receivers, one for each distinct subset of streams except for the
empty set, and each experiences some distortion. An equivalent
version of this model includes a single receiver for which each channel
may have failed or not failed, and the status of each channel is
known to the receiver decoder but not to the encoder. Both versions
of the model provide reasonable approximations of behavior in a
lossy packet network. As previously noted, each channel may
correspond to a packet or a set of packets. Some packets may be
lost in transmission, but because of header information it is known
which packets are lost. An appropriate objective in a system which
can be characterized in this manner is to minimize a weighted sum
of the distortions subject to a constraint on a total rate R. For
m=2, this minimization problem is related to a problem from
information theory called the multiple description problem.
D.sub.0, D.sub.1 and D.sub.2 denote the distortions when both
channels are received, only channel 1 is received, and only channel
2 is received, respectively. The multiple description problem
involves determining the achievable (R.sub.1, R.sub.2, D.sub.0,
D.sub.1, D.sub.2)-tuples. A complete characterization for an
independent, identically-distributed (i.i.d.) Gaussian source and
squared-error distortion is described in L. Ozarow, "On a
source-coding problem with two channels and three receivers," Bell
Syst. Tech. J., 59(8):1417-1426, 1980. It should be noted that the
solution described in the L. Ozarow reference is non-constructive,
as are other achievability results from the information theory
literature.
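The many-receiver version of the model can be made concrete with a short Python sketch that lists one receiver per non-empty subset of the m channels. The function name is hypothetical; for m=2 the three subsets correspond to the distortions D.sub.1, D.sub.2 and D.sub.0 named above.

```python
from itertools import combinations

def receiver_subsets(m):
    """Every non-empty subset of channels {1, ..., m}: one receiver per subset."""
    channels = range(1, m + 1)
    return [set(c) for size in range(1, m + 1) for c in combinations(channels, size)]

two_channel = receiver_subsets(2)     # [{1}, {2}, {1, 2}]
three_channel = receiver_subsets(3)   # 2**3 - 1 = 7 receivers
```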
[0023] An MDTC coding structure for implementation in the MD JSC
encoder 14 of FIG. 2 in accordance with the invention will now be
described. In this illustrative embodiment, it will be assumed for
simplicity that the source sequence {x.sub.k} input to the encoder
is an i.i.d. sequence of zero-mean jointly Gaussian vectors with a
known correlation matrix R.sub.x=E[x.sub.kx.sub.k.sup.T]. The
vectors can be obtained by blocking a scalar Gaussian source. The
distortion will be measured in terms of mean-squared error (MSE).
Since the source in this example is jointly Gaussian, it can also
be assumed without loss of generality that the components are
independent. If the components are not independent, one can use a
Karhunen-Loeve transform of the source at the encoder and the
inverse at each decoder. This embodiment of the invention utilizes
the following steps for implementing MDTC of a given source vector
x:
[0024] 1. The source vector x is quantized using a uniform scalar
quantizer with step size .DELTA.: x.sub.qi=[x.sub.i].sub..DELTA.,
where [.multidot.].sub..DELTA. denotes rounding to the nearest
multiple of .DELTA..
[0025] 2. The vector x.sub.q=[x.sub.q1, x.sub.q2, . . .
x.sub.qn].sup.T is transformed with an invertible, discrete
transform {circumflex over (T)}:
.DELTA.Z.sup.n.fwdarw..DELTA.Z.sup.n, y={circumflex over
(T)}(x.sub.q). The design and implementation of {circumflex over
(T)} are described in greater detail below.
[0026] 3. The components of y are independently entropy coded.
[0027] 4. If m<n, the components of y are grouped to be sent
over the m channels.
[0028] When all of the components of y are received, the
reconstruction process is to exactly invert the transform
{circumflex over (T)} to get {circumflex over (x)}=x.sub.q. The
distortion is the quantization error from Step 1 above. If some
components of y are lost, these components are estimated from the
received components using the statistical correlation introduced by
the transform {circumflex over (T)}. The estimate {circumflex over
(x)} is then generated by inverting the transform as before.
[0029] Starting with a linear transform T with a determinant of
one, the first step in deriving a discrete version {circumflex over
(T)} is to factor T into "lifting" steps. This means that T is
factored into a product of lower and upper triangular matrices with
unit diagonals T=T.sub.1T.sub.2 . . . T.sub.k. The discrete version
of the transform is then given by:
$$\hat{T}(x_q) = \left[ T_1 \left[ T_2 \cdots \left[ T_k x_q \right]_\Delta \right]_\Delta \right]_\Delta. \qquad (1)$$
[0030] The lifting structure ensures that the inverse of
{circumflex over (T)} can be implemented by reversing the
calculations in (1):
$$\hat{T}^{-1}(y) = \left[ T_k^{-1} \cdots \left[ T_2^{-1} \left[ T_1^{-1} y \right]_\Delta \right]_\Delta \right]_\Delta.$$
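The lifting construction can be sketched in Python for the 2.times.2 case. This is an illustration under stated assumptions rather than the patented implementation: the function names are hypothetical, the particular lower-upper-lower factorization is only one of the possible factorizations, and the code works on integer lattice indices k (with x.sub.q=k.DELTA.), so that rounding to a multiple of .DELTA. becomes rounding to an integer. Because each step has a unit diagonal, only the off-diagonal contribution needs to be rounded, and the inverse subtracts exactly the same rounded term.

```python
def quantize_index(x, delta):
    """Step 1: index k such that k * delta is the multiple of delta nearest to x."""
    return round(x / delta)

def lifting_steps(a, b, c, d):
    """Factor T = [[a, b], [c, d]] (det T = 1, b != 0) as T1 * T2 * T3 with
    T1 = [[1, 0], [(d-1)/b, 1]], T2 = [[1, b], [0, 1]], T3 = [[1, 0], [(a-1)/b, 1]].
    This is one possible factorization; it is not unique."""
    if b == 0 or abs(a * d - b * c - 1.0) > 1e-9:
        raise ValueError("requires det T = 1 and b != 0")
    return [("lower", (d - 1.0) / b), ("upper", b), ("lower", (a - 1.0) / b)]

def forward(k, steps):
    """Discrete transform on lattice indices: apply T3, then T2, then T1,
    rounding the off-diagonal contribution after each step."""
    k1, k2 = k
    for kind, s in reversed(steps):
        if kind == "lower":
            k2 += round(s * k1)
        else:
            k1 += round(s * k2)
    return k1, k2

def inverse(y, steps):
    """Undo the steps in the opposite order by subtracting the same rounded terms."""
    k1, k2 = y
    for kind, s in steps:
        if kind == "lower":
            k2 -= round(s * k1)
        else:
            k1 -= round(s * k2)
    return k1, k2

# Hypothetical transform with determinant 1.25 * 1.0 - 0.5 * 0.5 = 1.
steps = lifting_steps(1.25, 0.5, 0.5, 1.0)
k = (quantize_index(0.73, 0.1), quantize_index(-0.21, 0.1))
y = forward(k, steps)
k_back = inverse(y, steps)   # recovers k exactly
```

The exact recovery does not depend on the rounding rule, only on the forward and inverse passes computing each rounded term from the same unchanged component.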
[0031] The factorization of T is not unique. Different
factorizations yield different discrete transforms, except in the
limit as .DELTA. approaches zero. The above-described coding
structure is a generalization of a 2.times.2 structure described in
the above-cited M. T. Orchard et al. reference. As previously
noted, this reference considered only a subset of the possible
2.times.2 transforms; namely, those implementable in two lifting
steps.
[0032] It is important to note that the illustrative embodiment of
the invention described above first quantizes and then applies a
discrete transform. If one were to instead apply a continuous
transform first and then quantize, the use of a nonorthogonal
transform could lead to non-cubic partition cells, which are
inherently suboptimal among the class of partition cells obtainable
with scalar quantization. See, for example, A. Gersho and R. M.
Gray, "Vector Quantization and Signal Compression," Kluwer Acad.
Pub., Boston, Mass., 1992. The above embodiment permits the use of
discrete transforms derived from nonorthogonal linear transforms,
resulting in improved performance.
[0033] An analysis of an exemplary MDTC system in accordance with
the invention will now be described. This analysis is based on a
number of fine quantization approximations which are generally
valid for small .DELTA.. First, it is assumed that the scalar
entropy of y={circumflex over (T)}([x].sub..DELTA.) is the same as
that of [Tx].sub..DELTA.. Second, it is assumed that the
correlation structure of y is unaffected by the quantization.
Finally, when at least one component of y is lost, it is assumed
that the distortion is dominated by the effect of the erasure, such
that quantization can be ignored. The variances of the components
of x are denoted by .sigma..sub.1.sup.2, .sigma..sub.2.sup.2 . . .
.sigma..sub.n.sup.2 and the correlation matrix of x is denoted by
R.sub.x, where R.sub.x=diag (.sigma..sub.1.sup.2,
.sigma..sub.2.sup.2 . . . .sigma..sub.n.sup.2). Let
R.sub.y=TR.sub.xT.sup.T. In the absence of quantization, R.sub.y
would correspond to the correlation matrix of y. Under the
above-noted fine quantization approximations, R.sub.y will be used
in the estimation of rates and distortions.
[0034] The rate can be estimated as follows. Since the quantization
is fine, y.sub.i is approximately the same as
[(Tx).sub.i].sub..DELTA., i.e., a uniformly quantized Gaussian
random variable. If y.sub.i is treated as a Gaussian random
variable with power .sigma..sub.yi.sup.2=(R.sub.y).sub.ii
quantized with step size .DELTA., the entropy of the quantized
coefficient is given by:
$$H(y_i) \approx \frac{1}{2}\log 2\pi e\,\sigma_{y_i}^2 - \log\Delta = \frac{1}{2}\log\sigma_{y_i}^2 + \frac{1}{2}\log 2\pi e - \log\Delta = \frac{1}{2}\log\sigma_{y_i}^2 + k_\Delta,$$
[0035] where k.sub..DELTA. is defined as (log 2.pi.e)/2-log .DELTA. and all
logarithms are base two. Notice that k.sub..DELTA. depends only on
.DELTA.. The total rate R can therefore be estimated as:
$$R = \sum_{i=1}^{n} H(y_i) = n k_\Delta + \frac{1}{2}\log\prod_{i=1}^{n}\sigma_{y_i}^2. \qquad (2)$$
[0036] The minimum rate occurs when the product from i=1 to n of
.sigma..sub.yi.sup.2 is equal to the product from i=1 to n of
.sigma..sub.i.sup.2, and at this rate the components of y are
uncorrelated. It should be noted that T=I is not the only transform
which achieves the minimum rate. In fact, it will be shown below
that an arbitrary split of the total rate among the different
components of y is possible. This provides a justification for
using a total rate constraint in subsequent analysis.
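The rate estimate of equation (2) can be evaluated numerically with a short Python sketch. The function names, the example variances and the example transform below are hypothetical; the sketch assumes a diagonal R.sub.x, as in the analysis above, so that the i-th diagonal entry of R.sub.y is the sum over j of T.sub.ij.sup.2.sigma..sub.j.sup.2.

```python
import math

def transformed_variances(T, variances):
    """Diagonal of R_y = T R_x T^T for R_x = diag(variances)."""
    return [sum(t * t * v for t, v in zip(row, variances)) for row in T]

def estimated_rate(T, variances, delta):
    """Equation (2): R = n * k_delta + 0.5 * log2(product of the variances of y)."""
    n = len(variances)
    k_delta = 0.5 * math.log2(2 * math.pi * math.e) - math.log2(delta)
    log_prod = sum(math.log2(v) for v in transformed_variances(T, variances))
    return n * k_delta + 0.5 * log_prod

variances = [4.0, 1.0]                    # hypothetical sigma_1^2, sigma_2^2
identity = [[1.0, 0.0], [0.0, 1.0]]
correlating = [[1.0, 0.5], [0.0, 1.0]]    # determinant 1, introduces correlation
r_min = estimated_rate(identity, variances, delta=0.1)
extra_rate = estimated_rate(correlating, variances, delta=0.1) - r_min
```

In this example the correlating transform raises the estimated rate above the minimum by 0.5 log(4.25/4) bits, which is the cost of the correlation it introduces.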
[0037] The distortion will now be estimated, considering first the
average distortion due only to quantization. Since the quantization
noise is approximately uniform, the distortion is .DELTA..sup.2/12
for each component. Thus the distortion when no components are lost
is given by:
$$D_0 = \frac{n\Delta^2}{12} \qquad (3)$$
[0038] and is independent of T.
[0039] The case when l>0 components are lost will now be
considered. It first must be determined how the reconstruction will
proceed. By renumbering the components if necessary, assume that
y.sub.1, y.sub.2, . . . y.sub.n-l are received and y.sub.n-l+1, . .
. y.sub.n are lost. First partition y into "received" and "not
received" portions as y=[y.sub.r, y.sub.nr] where y.sub.r=[y.sub.1,
y.sub.2, . . . y.sub.n-l].sup.T and y.sub.nr=[y.sub.n-l+1, . . .
y.sub.n].sup.T. The minimum MSE estimate {circumflex over (x)} of x
given y.sub.r is E[x.vertline.y.sub.r], which has a simple closed
form because in this example x is a jointly Gaussian vector. Using
the linearity of the expectation operator gives the following
sequence of calculations:
[0040]
$$\hat{x} = E[x \mid y_r] = E[T^{-1}Tx \mid y_r] = T^{-1}E[Tx \mid y_r] = T^{-1}E\left[\begin{bmatrix} y_r \\ y_{nr} \end{bmatrix} \,\middle|\, y_r\right] = T^{-1}\begin{bmatrix} y_r \\ E[y_{nr} \mid y_r] \end{bmatrix}. \qquad (4)$$
[0041] If the correlation matrix of y is partitioned in a way
compatible with the partition of y as:
$$R_y = T R_x T^T = \begin{bmatrix} R_1 & B \\ B^T & R_2 \end{bmatrix},$$
[0042] then it can be shown that the conditional signal
y.sub.nr.vertline.y.sub.r is Gaussian with mean
B.sup.TR.sub.1.sup.-1y.sub.r and correlation matrix A, where A is
defined as A=R.sub.2-B.sup.TR.sub.1.sup.-1B. Thus,
E[y.sub.nr.vertline.y.sub.r]=B.sup.TR.sub.1.sup.-1y.sub.r, and
.eta., defined as .eta.=y.sub.nr-E[y.sub.nr.vertline.y.sub.r], is
Gaussian with zero mean and correlation matrix A. The variable
.eta. denotes the error in predicting y.sub.nr from y.sub.r and
hence is the error caused by the erasure. However, because a
nonorthogonal transform has been used in this example, T.sup.-1 is
used to return to the original coordinates before computing the
distortion. Substituting y.sub.nr-.eta. for
E[y.sub.nr.vertline.y.sub.r] in (4) above gives the following
expression for {circumflex over (x)}:
$$\hat{x} = T^{-1}\begin{bmatrix} y_r \\ y_{nr} - \eta \end{bmatrix} = x + T^{-1}\begin{bmatrix} 0 \\ -\eta \end{bmatrix},$$
[0043] such that .vertline..vertline.x-{circumflex over
(x)}.vertline..vertline..sup.2 is given by:
$$\left\| T^{-1}\begin{bmatrix} 0 \\ \eta \end{bmatrix} \right\|^2 = \eta^T U^T U \eta,$$
[0044] where U consists of the last l columns of T.sup.-1. The
expected value E[.vertline..vertline.x-{circumflex over
(x)}.vertline..vertline..sup.2] is then given by:
$$\sum_{i=1}^{l}\sum_{j=1}^{l} (U^T U)_{ij} A_{ij}. \qquad (5)$$
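For n=2 and a single erasure (l=1), the quantities in equation (5) reduce to scalars, which the following Python sketch evaluates directly. The function name and the numerical values are hypothetical, and the sketch assumes that T has a determinant of one and that the component y.sub.2 is the one lost.

```python
def one_erasure_distortion(T, variances):
    """Expected squared error from equation (5) for n = 2 when y_2 is lost (l = 1).
    T = [[a, b], [c, d]] is assumed to have determinant 1."""
    (a, b), (c, d) = T
    s1, s2 = variances                      # sigma_1^2, sigma_2^2
    r1 = a * a * s1 + b * b * s2            # R_1: variance of the received y_1
    r2 = c * c * s1 + d * d * s2            # R_2: variance of the lost y_2
    cross = a * c * s1 + b * d * s2         # B: correlation between y_1 and y_2
    pred_err = r2 - cross * cross / r1      # A = R_2 - B^T R_1^{-1} B
    # U is the last column of T^{-1} = [[d, -b], [-c, a]], so U^T U = a^2 + b^2.
    return (a * a + b * b) * pred_err

variances = [4.0, 1.0]                      # hypothetical sigma_1^2, sigma_2^2
h = 0.5 ** 0.5
d_identity = one_erasure_distortion([[1.0, 0.0], [0.0, 1.0]], variances)
d_rotation = one_erasure_distortion([[h, -h], [h, h]], variances)
```

With the identity transform the lost component cannot be predicted, so the distortion equals .sigma..sub.2.sup.2=1. With the 45-degree rotation the two outputs are correlated and the distortion from losing y.sub.2 is 1.6, compared with the 2.5 that would result from zeroing the lost coefficient without any estimation.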
[0045] The distortion with l erasures is denoted by D.sub.l. To
determine D.sub.l, (5) above is averaged over all possible
combinations of erasures of l out of n components, weighted by
their probabilities if the probabilities are non-equivalent. An
additional distortion criterion is a weighted sum {overscore (D)} of
the distortions incurred with different numbers of channels
available, where {overscore (D)} is given by:
$$\bar{D} = \sum_{l=1}^{n} \alpha_l D_l.$$
[0046] For a case in which each channel has a failure probability
of p and the channel failures are independent, the weighting
$$\alpha_l = \binom{n}{l} p^l (1-p)^{n-l}$$
[0047] makes the weighted sum $\bar{D}$ the overall expected
MSE. Other choices of weighting could be used in alternative
embodiments. Consider an image coding example in which an image is
split over ten packets. One might want acceptable image quality as
long as eight or more packets are received. In this case, one could
set $\alpha_3 = \alpha_4 = \cdots = \alpha_{10} = 0$.
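The weighting described above can be sketched in a few lines of
Python. The function names are illustrative, the failure probability
0.05 in the ten-packet example is a hypothetical value chosen only for
demonstration, and `math.comb` assumes Python 3.8 or later. The sum
runs over l = 1, ..., n, following the expression for the weighted sum.

```python
from math import comb


def binomial_weights(n, p):
    """alpha_l = C(n, l) p^l (1 - p)^(n - l) for l = 1, ..., n."""
    return [comb(n, l) * p ** l * (1 - p) ** (n - l) for l in range(1, n + 1)]


def weighted_distortion(alphas, distortions):
    """Weighted sum of D_l over l = 1, ..., n; both lists start at l = 1."""
    return sum(a * d for a, d in zip(alphas, distortions))


# Ten-packet image example: only one or two lost packets are weighted.
alphas_image = binomial_weights(10, 0.05)   # p = 0.05 is a hypothetical value
alphas_image[2:] = [0.0] * 8                # alpha_3 = ... = alpha_10 = 0
```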
[0048] The above expressions may be used to determine optimal
transforms which minimize the weighted sum $\bar{D}$ for a
given rate R. Analytical solutions to this minimization problem are
possible in many applications. For example, an analytical solution
is possible for the general case in which n=2 components are sent
over m=2 channels, where the channel failures have unequal
probabilities and may be dependent. Assume that the channel failure
probabilities in this general case are as given in the following
table.
                          Channel 1
                          no failure                  failure
Channel 2   failure       $1 - p_0 - p_1 - p_2$       $p_1$
            no failure    $p_2$                       $p_0$
[0049] If the transform T is given by:
$$T = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$
[0050] minimizing (2) over transforms with a determinant of one
gives a minimum possible rate of:
$$R^* = 2k_\Delta + \log \sigma_1 \sigma_2.$$
[0051] The difference $\rho = R - R^*$ is referred to as the redundancy,
i.e., the price that is paid to reduce the distortion in the
presence of erasures. Applying the above expressions for rate and
distortion to this example, and assuming that
$\sigma_1 > \sigma_2$, it can be shown that the optimal
transform will satisfy the following expression:
$$a = \frac{\sigma_2}{2c\sigma_1}\left[\sqrt{2^{2\rho} - 1} + \sqrt{2^{2\rho} - 1 - 4bc(bc + 1)}\right].$$
[0052] The optimal value of bc is then given by:
$$(bc)_{\text{optimal}} = -\frac{1}{2} + \frac{1}{2}\left(\frac{p_1}{p_2} - 1\right)\left[\left(\frac{p_1}{p_2} + 1\right)^2 - 4\left(\frac{p_1}{p_2}\right)2^{-2\rho}\right]^{-1/2}.$$
[0053] The value of $(bc)_{\text{optimal}}$ ranges from -1 to 0 as
$p_1/p_2$ ranges from 0 to $\infty$. The limiting behavior can
be explained as follows: Suppose $p_1 \gg p_2$, i.e.,
channel 1 is much more reliable than channel 2. Since
$(bc)_{\text{optimal}}$ approaches 0 and the determinant $ad - bc$
is one, ad must approach 1, and hence one
optimally sends $x_1$ (the larger variance component) over
channel 1 (the more reliable channel) and vice-versa.
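The limiting behavior can be checked numerically. The following
Python sketch evaluates the expression for the optimal product bc as a
function of the ratio of p_1 to p_2 and the redundancy; the function
name and the requirement that the redundancy be strictly positive (so
that the bracketed term stays positive when the ratio is one) are
assumptions of this illustration.

```python
def bc_optimal(p1_over_p2, rho):
    """Optimal product b*c for a given p1/p2 ratio and redundancy rho > 0."""
    r = p1_over_p2
    inner = (r + 1.0) ** 2 - 4.0 * r * 2.0 ** (-2.0 * rho)
    return -0.5 + 0.5 * (r - 1.0) * inner ** -0.5


# The product moves from about -1 to about 0 as p1/p2 grows.
for ratio in (1e-6, 0.5, 1.0, 2.0, 1e6):
    print(ratio, bc_optimal(ratio, rho=1.0))
```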
[0054] If $p_1 = p_2$ in the above example, then
$(bc)_{\text{optimal}} = -1/2$, independent of $\rho$. The optimal set of
transforms is then given by: $a \neq 0$ (but otherwise arbitrary),
$c = -1/(2b)$, $d = 1/(2a)$ and
$$b = \pm\left(2^{\rho} - \sqrt{2^{2\rho} - 1}\right)\sigma_1 a / \sigma_2.$$
[0055] Using a transform from this set gives:
$$D_1 = \frac{1}{2}(D_{1,1} + D_{1,2}) = \sigma_1^2 - \frac{1}{2 \cdot 2^{\rho}\left(2^{\rho} - \sqrt{2^{2\rho} - 1}\right)}\left(\sigma_1^2 - \sigma_2^2\right). \qquad (6)$$
[0056] This relationship is plotted in FIG. 7A for values of
$\sigma_1 = 1$ and $\sigma_2 = 0.5$. As expected, $D_1$ starts
at a maximum value of $(\sigma_1^2 + \sigma_2^2)/2$
and asymptotically approaches a minimum value of
$\sigma_2^2$. By combining (2), (3) and (6), one can find
the relationship between R, $D_0$ and $D_1$. FIG. 7B shows a
number of plots illustrating the trade-off between $D_0$ and
$D_1$ for various values of R. It should be noted that the
optimal set of transforms given above for this example provides an
"extra" degree of freedom, after fixing $\rho$, that does not affect
the $\rho$ vs. $D_1$ performance. This extra degree of freedom can
be used, for example, to control the partitioning of the total rate
between the channels, or to simplify the implementation.
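The endpoints quoted above can be reproduced with a short Python
sketch of relationship (6), read here as D_1 = sigma1^2 - (sigma1^2 -
sigma2^2) / (2 * 2^rho * (2^rho - sqrt(2^(2 rho) - 1))). The function
name is illustrative, and the redundancy values in the loop are
arbitrary sample points rather than values taken from FIG. 7A.

```python
from math import sqrt


def d1_closed_form(rho, sigma1, sigma2):
    """One-erasure distortion D_1 from (6) for redundancy rho >= 0."""
    k = 2.0 ** rho - sqrt(2.0 ** (2.0 * rho) - 1.0)
    return sigma1 ** 2 - (sigma1 ** 2 - sigma2 ** 2) / (2.0 * 2.0 ** rho * k)


# sigma1 = 1 and sigma2 = 0.5 as in FIG. 7A; D_1 falls from 0.625 toward 0.25.
for rho in (0.0, 0.5, 1.0, 2.0):
    print(rho, d1_closed_form(rho, 1.0, 0.5))
```

Note that the transform coefficient a does not appear in the function,
which reflects the extra degree of freedom mentioned above.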
[0057] Although the conventional 2×2 transforms described in
the above-cited M. T. Orchard et al. reference can be shown to fall
within the optimal set of transforms described herein when channel
failures are independent and equally likely, the conventional
transforms fail to provide the above-noted extra degree of freedom,
and are therefore unduly limited in terms of design flexibility.
Moreover, the conventional transforms in the M. T. Orchard et al.
reference do not provide channels with equal rate (or,
equivalently, equal power). The extra degree of freedom in the
above example can be used to ensure that the channels have equal
rate, i.e., that $R_1 = R_2$, by implementing the transform
such that $|a| = |c|$ and
$|b| = |d|$. This type of rate
equalization would generally not be possible using conventional
techniques without rendering the resulting transform
suboptimal.
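As a hedged illustration of this rate equalization, the Python sketch
below picks one member of the optimal set for equal failure
probabilities (c = -1/(2b), d = 1/(2a), b proportional to a) that also
satisfies |a| = |c| and |b| = |d|. The closed form for a is derived
here by substituting the expressions for b and c into |a| = |c|; it is
not stated in the text, the positive sign of b is an arbitrary choice,
and the function name is illustrative.

```python
from math import sqrt


def equal_rate_transform(rho, sigma1, sigma2):
    """Member of the optimal set (p1 = p2) with |a| = |c| and |b| = |d|."""
    k = 2.0 ** rho - sqrt(2.0 ** (2.0 * rho) - 1.0)
    # |a| = |c| with c = -1/(2b) and b = k*sigma1*a/sigma2
    # gives a^2 = sigma2 / (2*k*sigma1).
    a = sqrt(sigma2 / (2.0 * k * sigma1))
    b = k * sigma1 * a / sigma2
    c = -1.0 / (2.0 * b)
    d = 1.0 / (2.0 * a)
    return (a, b), (c, d)


(a, b), (c, d) = equal_rate_transform(1.0, 1.0, 0.5)
print(a * d - b * c)   # determinant, equal to one up to rounding
```

With this choice the two transform outputs have the same variance,
a^2 sigma1^2 + b^2 sigma2^2 = c^2 sigma1^2 + d^2 sigma2^2.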
[0058] As previously noted, the invention may be applied to any
number of components and any number of channels. For example, the
above-described analysis of rate and distortion may be applied to
transmission of n=3 components over m=3 channels. Although it
becomes more complicated to obtain a closed form solution, various
simplifications can be made in order to obtain a near-optimal
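solution. If it is assumed in this example that
$\sigma_1 > \sigma_2 > \sigma_3$, and that the
channel failure probabilities are equal and small, a set of
transforms that gives near-optimal performance is given by:
$$\begin{bmatrix} a & -\dfrac{\sqrt{3}\,\sigma_1 a}{\sigma_2} & -\dfrac{\sigma_2}{6\sqrt{3}\,\sigma_1^2 a^2} \\ 2a & 0 & \dfrac{\sigma_2}{6\sqrt{3}\,\sigma_1^2 a^2} \\ a & \dfrac{\sqrt{3}\,\sigma_1 a}{\sigma_2} & -\dfrac{\sigma_2}{6\sqrt{3}\,\sigma_1^2 a^2} \end{bmatrix}.$$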
[0059] Optimal or near-optimal transforms can be generated in a
similar manner for any desired number of components and number of
channels.
[0060] FIG. 8 illustrates one possible way in which the MDTC
techniques described above can be extended to an arbitrary number
of channels, while maintaining reasonable ease of transform design.
This 4×4 transform embodiment utilizes a cascade structure of
2×2 transforms, which simplifies the transform design, as
well as the encoding and decoding processes (both with and without
erasures), when compared to use of a general 4×4 transform.
In this embodiment, a 2×2 transform $T_\alpha$ is applied
to components $x_1$ and $x_2$, and a 2×2 transform
$T_\beta$ is applied to components $x_3$ and $x_4$. The
outputs of the transforms $T_\alpha$ and $T_\beta$ are routed
to inputs of two 2×2 transforms $T_\gamma$ as shown. The
outputs of the two 2×2 transforms $T_\gamma$ correspond to
the four channels $y_1$ through $y_4$. This type of cascade
structure can provide substantial performance improvements as
compared to the simple pairing of coefficients in conventional
techniques, which generally cannot be expected to be near optimal
for values of m larger than two. Moreover, the failure
probabilities of the channels $y_1$ through $y_4$ need not have
any particular distribution or relationship. FIGS. 2, 3, 4 and
5A-5D above illustrate more general extensions of the MDTC
techniques of the invention to any number of signal components and
channels.
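The cascade of 2×2 transforms can be sketched in Python as follows.
The routing between the two stages is shown only in FIG. 8, so the
wiring used here (each second-stage transform receives one output of
the first pair and one output of the second pair) is an assumption of
this sketch, as are the function names. Quantization is omitted.

```python
def apply_2x2(T, u, v):
    """Multiply the 2x2 matrix T = ((a, b), (c, d)) by the vector (u, v)."""
    (a, b), (c, d) = T
    return a * u + b * v, c * u + d * v


def cascade_4x4(T_alpha, T_beta, T_gamma, x):
    """Two-stage cascade of 2x2 transforms applied to x = (x1, x2, x3, x4)."""
    # First stage: pair (x1, x2) and (x3, x4).
    u1, u2 = apply_2x2(T_alpha, x[0], x[1])
    v1, v2 = apply_2x2(T_beta, x[2], x[3])
    # Second stage (routing assumed): each T_gamma mixes one output of
    # T_alpha with one output of T_beta.
    y1, y2 = apply_2x2(T_gamma, u1, v1)
    y3, y4 = apply_2x2(T_gamma, u2, v2)
    return y1, y2, y3, y4
```

Only three 2×2 matrices need to be designed in this arrangement,
compared with sixteen free entries for a general 4×4 transform.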
[0061] The above-described embodiments of the invention are
intended to be illustrative only. It should be noted that a
complementary decoder structure corresponding to the encoder
structure of FIGS. 2, 3, 4 and 5A-5D may be implemented in the MD
JSC decoder 16 of FIG. 1. Alternative embodiments of the invention
may utilize other coding structures and arrangements. Moreover, the
invention may be used for a wide variety of different types of
compressed and uncompressed signals, and in numerous coding
applications other than those described herein. These and numerous
other alternative embodiments within the scope of the following
claims will be apparent to those skilled in the art.
* * * * *