U.S. patent application number 09/179861 was filed with the patent office on 2003-01-09 for video signal coding method.
Invention is credited to RYU, CHUL.
Application Number | 20030007563 09/179861 |
Document ID | / |
Family ID | 22658294 |
Filed Date | 2003-01-09 |
United States Patent
Application |
20030007563 |
Kind Code |
A1 |
RYU, CHUL |
January 9, 2003 |
VIDEO SIGNAL CODING METHOD
Abstract
A video signal coding method is provided which finds proper
decision curves according to characteristics of input frames and
encodes the optimal macroblock by using the decision curves instead
of a fixed motion/no-motion compensation curve and intra/inter
coding curve. The optimal mode is selected for each macroblock of
input frame and it is determined through a step of judging whether
the input frame is intra mode using a given function, a step of
judging whether the input frame is inter mode when it is not intra
mode using a given function, a step of controlling quantizer using
a predetermined critical value when it is not inter mode, and step
of performing skip when the quantizer controlling step is not
carried out.
Inventors: |
RYU, CHUL; (KYUNGKI-DO,
KR) |
Correspondence
Address: |
DANIEL Y J KIM
LAW OFFICES OF FLESHNER & KIM
PO BOX 221200
CHANTILLY
VA
201531200
|
Family ID: |
22658294 |
Appl. No.: |
09/179861 |
Filed: |
October 28, 1998 |
Current U.S.
Class: |
375/240.13 ;
375/E7.13; 375/E7.134; 375/E7.135; 375/E7.139; 375/E7.145;
375/E7.146; 375/E7.157; 375/E7.159; 375/E7.176; 375/E7.181;
375/E7.211; 375/E7.219; 375/E7.254 |
Current CPC
Class: |
H04N 19/132 20141101;
H04N 19/149 20141101; H04N 19/61 20141101; H04N 19/172 20141101;
H04N 19/587 20141101; H04N 19/117 20141101; H04N 19/152 20141101;
H04N 19/115 20141101; H04N 19/124 20141101; H04N 19/176 20141101;
H04N 19/192 20141101; H04N 19/103 20141101 |
Class at
Publication: |
375/240.13 |
International
Class: |
H04N 007/12 |
Claims
What is claimed is:
1. A video signal coding method, which variably changes decision
curve which determines the mode of a specific block based on
input/output restrictive condition.
2. The video signal coding method as claimed in claim 1, wherein
the decision curve corresponds to intra/inter mode decision
curve.
3. The video signal coding method as claimed in claim 1, wherein
the decision curve corresponds to motion/no-motion compensation
mode decision curve.
4. The video signal coding method as claimed in claim 1, wherein
the input/output restrictive condition is output target bitrate of
encoder.
5. The video signal coding method as claimed in claim 4, wherein
the target bitrate is determined according to channel rate.
6. The video signal coding method as claimed in claim 1, wherein
the input/output restrictive condition depends on data
characteristic of the block.
7. The video signal coding method as claimed in claim 6, wherein
the data characteristic of the block shows the degree of variation
between current block and previous block during encoding using the
intra/inter mode decision curve.
8. The video signal coding method as claimed in claim 7, wherein
the degree of variation is calculated with regard to motion
compensation vector between blocks.
9. The video signal coding method as claimed in claim 1, wherein
the input/output restrictive condition is output buffer
fullness.
10. The video signal coding method as claimed in claim 1, wherein
the variation in the decision curve occurs for every group of
blocks, which is constructed of specific number of blocks.
11. The video signal coding method as claimed in claim 10, wherein
the group of blocks is unit frame.
12. The video signal coding method as claimed in claim 10, wherein
each of the blocks is macroblock constructed of at least one unit
block.
13. The video signal coding method, wherein decision curve which
determines the mode of a specific block is variably changed for
every group of blocks which is constructed of at least one unit
block.
14. The video signal coding method as claimed in claim 13, wherein
the decision curve corresponds to intra/inter mode decision
curve.
15. The video signal coding method as claimed in claim 13, wherein
the decision curve corresponds to motion/no-motion compensation
mode decision curve.
16. The video signal coding method as claimed in claim 13, wherein
the block is macroblock constructed of at least one unit block.
17. The video signal coding method as claimed in claim 13, wherein
the group of blocks is unit frame.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] The present invention relates to a video signal coding
method, and more particularly, to a video signal coding method
which is able to select the optimal modes for macroblocks when a
block based video signal is coded.
[0003] 2. Discussion of Related Art
[0004] When video signals output from a video coding application
program are transmitted over the public switched telephone network
(PSTN), it is preferable to maintain a fixed bitrate because of the
simple configuration or fixed bandwidth of the network.
Accordingly, to transmit the video signals at a fixed bitrate,
variation in bitrate is reduced using a buffer placed between the
output terminal of the encoder and the channel. If the encoder
controls the bitrate during transmission of video signals, the
output bitrate of the encoder can be kept uniform, with
correspondingly variable coding quality. However, obtaining the
optimal bitrate requires using the two methods simultaneously, that
is, employing a buffer and controlling the bitrate in the encoder.
To this end, various techniques have been proposed which are
compatible with the standard decoder and maximize the visual
quality at the set channel bitrate.
[0005] MPEG-1 and H.261, which are the standard models in video
coding, deal only with the quantization parameter based on the
buffer fullness. Approaches proposed for MPEG-1 and H.261 use
previous bit count values as predicted values of the bit count for
the current macroblock or sub-group-of-blocks, and each
quantization level is controlled accordingly. Another approach
predicts the number of bits for the current macroblock with an
approximate value obtained from training sequences under a
stationarity assumption, and controls each quantization level
accordingly. Another technique formulates a rate-constrained
product code to optimize the combination of quantization selection
items.
[0006] In general, the more frequently quantization is controlled,
the smaller the variation in the output bitrate and the smaller the
required buffer size. On the other hand, a larger buffer is needed
when control of quantization depends on the time required for
extracting the predicted value of the buffer. A modelling technique
is widely used when the quantization level is determined based on
the buffer fullness. For example, analysis models relating buffer
fullness and quantization are employed to determine the magnitude
of the quantization level based on the buffer fullness. In MPEG-1
and H.261, there are fixed functions which determine modes for
macroblocks. The encoder has to make several decisions for each
macroblock: how to determine the best motion vectors to use,
whether to code each macroblock in intra or predicted mode, and how
to set the quantizer scale.
[0007] As a conventional mode selection method, there has been
proposed a simple suboptimal method which performs calculation more
easily using a computer. This suboptimal method which makes a
series of decision options for bitrate control is carried out
through the following steps. First of all, motion compensation or
no-motion compensation step is performed, which determines whether
motion vector is transmitted or processed as `0`. The next step is
to determine whether the mode of macroblock is intra or inter mode
using motion vector which was found in the motion compensation or
no-motion compensation step. In the case of inter coding, a step is
implemented, which determines whether residual error is large
enough to be coded using discrete cosine transform. The final step
is to determine whether the quantizer scale is satisfactory or
required to be changed. In each of the steps, functions or rules
are used for effective decision. For example, decisions for
motion/no-motion compensation and intra/inter coding use fixed
functions but decision for code/no-code is determined according to
the difference between the magnitudes of error signals.
[0008] Furthermore, quantization parameters are determined based on
the buffer fullness. The modes for macroblocks and quantizers are
determined and macroblocks are coded accordingly through a series
of decision procedures. However, when the video coder transmits the
video signals, most of the conventional approaches for bitrate
control focus on the decision step of determining whether the
quantizer scale is satisfactory or needs to be changed. For
example, the quantizer level is adjusted based on the buffer
fullness. The proposed algorithm extends the coding decision
options for rate control to motion/no-motion compensation as well
as inter/intra decisions. The coded output bitrate is sensitive to
the shapes of the decision curves. In contrast, when the bitrate is
controlled using the quantizer, quantization error directly affects
visual quality, producing various problems.
SUMMARY OF THE INVENTION
[0009] Accordingly, the present invention is directed to a video
signal coding method that substantially obviates one or more of the
problems due to limitations and disadvantages of the related
art.
[0010] An object of the present invention is to provide a video
signal coding method which finds proper decision curves according
to characteristics of input frames and encodes the optimal
macroblock by using the decision curves instead of a fixed
motion/no-motion compensation curve and intra/inter coding
curve.
[0011] According to an embodiment of the present invention to
accomplish the object, the optimal mode is selected for each
macroblock of input frame when video signals are coded in a video
coder. Preferably, the optimal mode is determined through a step of
judging whether the input frame is intra mode using a given
function, a step of judging whether the input frame is inter mode
when it is not intra mode using a given function, a step of
controlling quantizer using a predetermined critical value when it
is not inter mode, and step of performing skip when the quantizer
controlling step is not carried out.
[0012] According to the present invention, it is possible to select
the optimal modes to control bitrate, so that a video coder is
obtained whose visual quality variation is smaller than with
conventional bitrate control by a simple quantizer. In a block
based video codec, the present invention presents an effective
algorithm for selecting the optimal modes for macroblocks.
Accordingly, the method proposed by the present invention differs
from the previous ones in that it does not manipulate the quantizer
to meet the target bitrate. Instead it finds the optimal macroblock
modes which yield consistent visual quality while meeting the
target bitrate.
[0013] It is to be understood that both the foregoing general
description and the following detailed description are exemplary
and explanatory and are intended to provide further explanation of
the invention as claimed.
BRIEF DESCRIPTION OF THE ATTACHED DRAWINGS
[0014] The accompanying drawings, which are included to provide a
further understanding of the invention and are incorporated in and
constitute a part of this specification, illustrate embodiments of
the invention and together with the description serve to explain
the principles of the invention:
[0015] In the drawings:
[0016] FIG. 1 is a block diagram of a conventional AMS video
coder;
[0017] FIG. 2 is a simplified structure of the video coder of FIG.
1;
[0018] FIG. 3 is a decision curve which divides the region into A
and B for motion and no-motion compensation;
[0019] FIG. 4 shows predetermined bounds to alert when buffer
reaches underflow or overflow;
[0020] FIGS. 5A and 5B are graphs showing PSNR and buffer fullness
for various forms of .lambda.;
[0021] FIG. 6 is a flow diagram showing AMS algorithm according to
the present invention;
[0022] FIGS. 7A and 7B are graphs showing the relationship between
PSNR and decision curve orders (from 2 to 5) of motion/no-motion
compensation;
[0023] FIGS. 8A and 8B are graphs showing the relationship between
PSNR and frame encoding rates in the case of encoding with HAMS and
AMS in Claire;
[0024] FIGS. 9A and 9B are graphs showing the relationship between
PSNR and frame encoding rates in the case of encoding with HAMS in
Claire;
[0025] FIGS. 10A and 10B are graphs showing PSNR, buffer contents
and effect of buffer overflow when frames are coded with 160 bps at
30 fps;
[0026] FIGS. 11A and 11B are graphs showing decision curves for
motion/no-motion compensation and intra/inter frame coding;
[0027] FIGS. 12A to 12D are graphs showing the relationship between
PSNR and encoded frame rates in the case of encoding with 160 kbps
and 179 kbps using HAMS in Claire;
[0028] FIGS. 13A to 13D are graphs showing the relationship between
PSNR and frame rates in the case of encoding with 128 kbps and 192
kbps using HAMS in Miss America sequences in CIF;
[0029] FIGS. 14A to 14D show PSNR, relation with frame rate, buffer
fullness, and relation with MQUANT in the case of encoding with 339
kbps using HAMS in Salesperson sequences in CIF; and
[0030] FIGS. 15A to 15D show PSNR, relation with frame rate,
relation with buffer fullness, and relation with MQUANT in the case
of encoding with 352 kbps using HAMS in Salesperson sequences in
CIF.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENT
[0031] Reference will now be made in detail to the preferred
embodiments of the present invention, examples of which are
illustrated in the accompanying drawings.
[0032] FIG. 1 is a block diagram of a conventional adaptive mode
selection (AMS) video coder. Referring to FIG. 1, the AMS video
coder consists of a mode controller 100 and encoder 200. Mode
controller 100 includes a mode selector 11 for determining coding
modes for macroblocks of input frame, a buffer 12 for storing video
signals outputted from mode selector 11 to observe if overflow or
underflow occurs, determining whether to skip corresponding block,
and an annealing optimization module 13 for obtaining the optimal
decision curve coefficient and .lambda. from the output of the
buffer using a given mode. Encoder 200 includes a discrete cosine
transform (DCT) module 21 for transforming a spatial domain value
into a transform domain value, a quantizer 22 for dividing the
output of DCT module 21 into various levels to specify them as
predetermined values, a variable length coder (VLC) 23 for coding
the quantized values to meet a predetermined mode, a buffer 24 for
observing the amount of signal outputted from VLC 23 to determine
whether to transmit it or to control the quantization level and
code it, an inverse quantizer 25 for inverse-quantizing signals
outputted from quantizer 22 to predict it, an inverse discrete
cosine transform (IDCT) module 26 for transforming a transform
domain value into a spatial domain value to predict signals
outputted from inverse quantizer 25, and a predictive module 27 for
predicting the motion of signal outputted from IDCT module 26 to
carry out motion compensation when the frame mode is inter
mode.
[0033] The present invention presents an effective algorithm for
selecting the optimal modes for macroblocks in a block based video
codec. These modes are selected by optimal decision curves which
are sequentially determined by the relation of target bitrate and
distortion. Characteristic decision curves for frames are applied
to all macroblocks. These decision curves are determined by
repeatedly comparing the target bitrate which minimizes encoding
distortion with output bitrate. This repeated procedure having no
relation with the channel buffer fullness ensures that the decision
curves are optimal using simulated annealing optimization
techniques. Upon determination of the optimal decision curves, the
optimal modes for macroblocks are selected based on the decision
curves which minimize the overall distortion for a given bitrate.
An embodiment of the present invention optimizes two different
decision curves, motion/no-motion compensation decision curve and
intra/inter coding decision curve.
[0034] Let $F$ and $\bar{F}$ be the input and reconstructed frames
of the video codec. $F$ and $\bar{F}$ can be partitioned into
groups of macroblocks as shown in the following expression (1).
$$F = (X_0, X_1, X_2, \ldots, X_{L-1})$$
$$\bar{F} = (\bar{X}_0, \bar{X}_1, \bar{X}_2, \ldots, \bar{X}_{L-1}) \qquad (1)$$
[0035] In this expression, each macroblock $X_{i,j}$ in $F_j$ can
be coded using only one of N possible modes given by the set S
shown in the following expression (2).
$$S = \{M_0, M_1, M_2, \ldots, M_{N-1}\} \qquad (2)$$
[0036] Let $M_k^i \in S$, where k=0,1, ..., N-1, be the mode
selected to code a macroblock $X_{i,j}$. The numbers i and j
represent the index of the macroblock and of the frame,
respectively. Let $y_{i,j}$ be the data of a macroblock to be
coded, generated by processing with the selected mode $M_k^i$, and
let $\bar{X}_{i,j}$ be the output after decoding of the
corresponding block. In general, $y_{i,j}$ can be represented as
follows, depending on which mode it is assigned.
$$y_{i,j} = \begin{cases} 0, & \text{if skipped} \\ X_{i,j}, & \text{if intra coded} \\ X_{i,j} - X_{i|\Delta,j-1}, & \text{if inter coded} \end{cases} \qquad (3)$$
[0037] $X_{i|\Delta,j-1}$ represents the ith macroblock of the
(j-1)th frame with motion vector $\Delta$, where
$$\Delta = \begin{cases} 0, & \text{if no MC} \\ \delta_{x,y}, & \text{if MC} \end{cases} \qquad (4)$$
[0038] $\delta_{x,y}$ is the result of the motion compensation
algorithm. The decision curve for motion/no-motion compensation can
be represented as a polynomial of order P-1,
$$g(x) = \sum_{k=0}^{P-1} a_k x^k \qquad (5)$$
[0039] Equation (5) determines whether the macroblock is encoded
using motion or no-motion compensation through a curve which
divides the first quadrant of the plane into two regions. FIG. 3
shows an example of a decision curve which divides the region into
A and B for motion and no-motion compensation. The X-axis in FIG. 3
is determined by the sum of absolute differences between pixel
values in the $X_{i,j}$ and $X_{i,j-1}$ blocks for a given i and j
as follows
$$x(X) = |X_{i,j} - X_{i,j-1}|_{i,j} = \frac{1}{256}\sum_{m}\sum_{n} |X_{i,j}(m,n) - X_{i,j-1}(m,n)| \qquad (6)$$
[0040] where m and n are pixel indexes in the macroblock. Variable
y is defined as the sum of absolute differences between pixel
values of $X_{i,j}$ and $X_{i|\Delta,j-1}$
$$y(X) = |X_{i,j} - X_{i|\Delta,j-1}|_{i,j} = \frac{1}{256}\sum_{m}\sum_{n} |X_{i,j}(m,n) - X_{i|\Delta,j-1}(m,n)| \qquad (7)$$
[0041] The motion vector $\Delta$ is determined through the curve
$g(x)$ as follows.
$$\Delta = \begin{cases} 0, & \text{if } y \ge g(x) \\ \delta_{x,y}, & \text{if } y < g(x) \end{cases} \qquad (8)$$
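The decision of equations (5) to (8) can be illustrated with a short sketch. This is only a minimal illustration, assuming 16x16 macroblocks held as nested lists of luminance samples; the function names and the example coefficients are hypothetical and do not come from the specification.

```python
from typing import List, Sequence

Block = List[List[int]]  # assumed layout: 16 rows of 16 luminance samples


def mean_abs_diff(cur: Block, ref: Block) -> float:
    """Equations (6)/(7): absolute pixel differences summed and divided
    by the number of pixels (256 for a 16x16 macroblock)."""
    total = sum(abs(c - r)
                for row_c, row_r in zip(cur, ref)
                for c, r in zip(row_c, row_r))
    count = sum(len(row) for row in cur)
    return total / count


def g(x: float, a: Sequence[float]) -> float:
    """Equation (5): polynomial of order P-1 with coefficients a[0..P-1]."""
    return sum(a_k * x ** k for k, a_k in enumerate(a))


def use_motion_compensation(cur: Block, prev_same_pos: Block,
                            prev_displaced: Block,
                            a: Sequence[float]) -> bool:
    """Equation (8): motion compensation is used only when y < g(x)."""
    x = mean_abs_diff(cur, prev_same_pos)    # no-motion difference, eq. (6)
    y = mean_abs_diff(cur, prev_displaced)   # displaced difference, eq. (7)
    return y < g(x, a)


def flat_block(value: int) -> Block:
    """Helper for the example: a 16x16 block filled with one value."""
    return [[value] * 16 for _ in range(16)]
```

For instance, with the hypothetical curve g(x) = 0.5x, a block whose no-motion difference is x = 10 is motion compensated when the displaced difference is y = 2 but not when it is y = 7.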
[0042] Using the sum of absolute differences between $X_{i,j}$ and
$X_{i,j-1}$, and between $X_{i,j}$ and $X_{i|\Delta,j-1}$, the mode
(motion or no-motion compensation) is determined by where the point
(x,y) lies relative to $g(x)$ for a given $X_{i,j}$ and
$X_{i,j-1}$. Therefore, the objective is to find the coefficients
$a_k$ of the polynomial which minimize the global coding distortion
$D = D_{F_j}(a_k)$ of the jth frame at a given bitrate R.
$$\arg\left\{\min_{a_k} D_{F_j}(a_k)\right\} \quad \text{with} \quad R_{F_j}(a_k) \le R \qquad (9)$$
[0043] As described above, it is possible to obtain the solution
of the unconstrained optimization problem expressed in the
following equation (10) instead of the constrained optimization
problem of expression (9).
$$\arg\left\{\min_{a_k} J_{F_j}(a_k)\right\}, \quad \text{where} \quad J_{F_j}(a_k) = D_{F_j}(a_k) + \lambda R_{F_j}(a_k) \qquad (10)$$
[0044] In this case, however, the aim is to find the parameter
$a_k$, not the parameter $\lambda$ of the quantizer. $\lambda$ must
be found to solve equation (10), but the problem can be simplified
using the fact that $\lambda$ is the slope of the R-D curve at the
selected optimal point which meets the constraint, together with
the convexity of the R-D curve. That is, if the R-D curve is
available, $\lambda$ is obtained as the slope at the point (x1,y1)
on the R-D curve, where x1 is the desired bitrate and y1 is the
unknown minimum distortion. Therefore, the unconstrained problem
can be stated for a fixed $\lambda$, and the solution $a_k^*$ which
minimizes the Lagrangian cost function can be obtained for a given
$\lambda>0$ using the following equation (11).
$$J_{F_j}(a_k^*) = \min_{a_k}\left\{D_{F_j}(a_k) + \lambda R_{F_j}(a_k)\right\} \qquad (11)$$
[0045] The constrained problem corresponding to equation (9) can be
represented as follows.
$$D_{F_j}(a_k^*) = \min_{a_k} D_{F_j}(a_k) \quad \text{such that} \quad R_{F_j}(a_k^*) \le R \qquad (12)$$
[0046] It is clear that the constrained problem of equation (12)
has the same solution as the unconstrained problem of equation
(11), and this holds when $R = R_{F_j}(a_k^*)$ because $\lambda$ is
selected to satisfy the constrained optimal value.
[0047] It should be noted that, depending on the R-D
characteristic, an optimal solution may not exist for $\lambda>0$.
For every non-negative $\lambda$, the solution of the constrained
problem is identical to that of the corresponding unconstrained
problem. However, it is important that $a_k^*$ becomes the solution
of the constrained problem when $R_{F_j}(a_k^*)$ is R for a given
$\lambda$. In other words, $\lambda$ can be obtained when R is
given, and the unconstrained problem can be solved accordingly,
giving the desired solution $a_k^*$ which meets $R_{F_j} \le R$.
Here, R can be obtained from the point on the R-D curve whose slope
is $\lambda$. However, it is difficult to actually obtain the R-D
curve for various decision curve parameters (with the quantizer
held constant), because doing so takes a long time. An N×M
dimensional search is needed for each data point of the R-D curve,
where N and M are the numbers of coefficients of the two decision
curves (for example, the motion/no-motion compensation curve and
the intra/non-intra coding decision curve). To reduce the
computation, values of $\lambda$ from 0 (minimum distortion,
maximum rate) to infinity (maximum distortion, minimum rate) are
searched and the optimal coefficient $a_k^*$ is obtained.
Subsequently, the bitrate $R_{F_j}(a_k^*)$ is calculated to check
whether it approaches the desired bitrate. When the bitrate
obtained is a sufficiently close approximation, the optimal
decision curves are defined by the parameters $a_k$ used during the
computation.
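The search over $\lambda$ can be sketched as follows. This is a minimal illustration under stated assumptions: each candidate coefficient set is assumed to have been evaluated already, so that its (distortion, rate) pair is known, and bisection is used as the search strategy. The specification does not say how the $\lambda$ values are stepped, so the bisection, the function names and the bounds are hypothetical.

```python
from typing import List, Tuple

# (distortion D, rate R) measured for one candidate coefficient set a_k
Point = Tuple[float, float]


def best_for_lambda(points: List[Point], lam: float) -> int:
    """Index of the candidate minimizing the Lagrangian cost D + lam * R."""
    return min(range(len(points)),
               key=lambda i: points[i][0] + lam * points[i][1])


def search_lambda(points: List[Point], target_rate: float,
                  lo: float = 0.0, hi: float = 1e6,
                  iters: int = 50) -> Tuple[int, float]:
    """Bisect on lambda for the smallest value whose minimizer meets
    R <= target_rate. Returns (candidate index, lambda).

    If no candidate meets the target, the lowest-rate choice found at
    the upper bound `hi` is returned unchanged.
    """
    best = best_for_lambda(points, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        idx = best_for_lambda(points, mid)
        if points[idx][1] <= target_rate:
            best, hi = idx, mid      # feasible: try a smaller lambda
        else:
            lo = mid                 # rate too high: penalize rate more
    return best, hi
```

A small $\lambda$ favors low distortion at a high rate and a large $\lambda$ favors a low rate, so the smallest feasible $\lambda$ gives the lowest distortion among the minimizers that satisfy the rate constraint.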
[0048] To solve equation (11), $\lambda$ as well as $a_k$ must be
searched to obtain the constrained minimum cost function value.
When the picture variation between frames is consistent, the value
of $\lambda$ is not expected to vary severely between frames. This
suggests that the variation in coded bitrate between frames is
small. Therefore, instead of obtaining $\lambda$ for each frame,
reusing the previous $\lambda$ for successive frames reduces the
computational complexity.
[0049] Since $\lambda$ can be interpreted as the quality index of
coded frames, it is important to obtain the optimal $\lambda$ for
the period starting from encoding of the first frame until a
picture change occurs (for example, a scene cut). The distortion of
the jth frame can be expanded over macroblocks as follows
$$D_{F_j}(a_k) = E\left\{|F_j - \bar{F}_j|^2\right\} = \frac{1}{L}\sum_{i=0}^{L-1}\left|X_{i,j} - \left(X_{i|\Delta,j-1} + \bar{X}_{i,j}\right)\right|^2 \qquad (13)$$
[0050] Therefore, the Lagrangian cost function can be changed to
the following equation (14).
$$J_{F_j}(a_k^*) = \min_{a_k}\left\{D_{F_j}(a_k) + \lambda R_{F_j}(a_k)\right\} = \min_{a_k}\left\{\frac{1}{L}\sum_{i=0}^{L-1}\left|X_{i,j} - \left(X_{i|\Delta,j-1} + \bar{X}_{i,j}\right)\right|^2 + \lambda R_{F_j}(a_k)\right\} \qquad (14)$$
where $\bar{X}_{i,j} = D^{-1}Q^{-1}QD\,(X_{i,j} - X_{i|\Delta,j-1})$,
$F_j$ is frame j ($= \bigcup_i X_{i,j}$), and $\bar{F}_j$ is
reconstructed frame j ($= \bigcup_i \bar{X}_{i,j}$).
[0051] Here, $D$ and $D^{-1}$ denote DCT and IDCT, and $Q$ and
$Q^{-1}$ denote quantization and inverse quantization,
respectively. In the case of intra and inter frame coding,
equations (14), (6) and (7) are replaced by the following equations
(15) and (16).
$$x(X) = \mathrm{var}(X_{i,j}, X_{i,j-1})_{i,j} = \frac{1}{256}\sum_{m}\sum_{n}\left(X_{i,j}(m,n) - X_{i,j-1}(m,n)\right)^2 \qquad (15)$$
$$y(X) = \mathrm{var}(X_{i,j}, X_{i|\Delta,j-1})_{i,j} = \frac{1}{256}\sum_{m}\sum_{n}\left(X_{i,j}(m,n) - X_{i|\Delta,j-1}(m,n)\right)^2 \qquad (16)$$
[0052] These are the unconstrained optimization formulations of the
proposed problem, and a stochastic annealing optimization algorithm
is employed to find the coefficients $a_k$ in equation (5) which
minimize the distortion in equation (14).
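A generic simulated annealing loop over the coefficients $a_k$ might look like the sketch below. It is not the annealing schedule of the specification, which gives no such details: the Gaussian proposal, the geometric cooling and all parameter values are assumptions. The `cost` callback is a hypothetical stand-in for encoding a frame with the candidate decision curve and evaluating the cost of equation (14).

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def anneal(cost: Callable[[Sequence[float]], float],
           a0: Sequence[float],
           steps: int = 2000,
           t0: float = 1.0,
           cooling: float = 0.995,
           step_size: float = 0.1,
           seed: int = 0) -> Tuple[List[float], float]:
    """Search coefficient vectors a_k for a low value of `cost`.

    Returns the best coefficients seen and their cost.
    """
    rng = random.Random(seed)            # seeded for reproducible runs
    cur = list(a0)
    cur_cost = cost(cur)
    best, best_cost = list(cur), cur_cost
    t = t0
    for _ in range(steps):
        # Propose a random perturbation of every coefficient.
        cand = [a + rng.gauss(0.0, step_size) for a in cur]
        c = cost(cand)
        # Always accept improvements; accept a worse candidate with
        # probability exp(-(c - cur_cost) / t) to escape local minima.
        if c <= cur_cost or rng.random() < math.exp((cur_cost - c) / t):
            cur, cur_cost = cand, c
            if c < best_cost:
                best, best_cost = list(cand), c
        t *= cooling                     # geometric cooling (assumed)
    return best, best_cost


def toy_cost(a: Sequence[float]) -> float:
    """Hypothetical smooth cost used only to exercise the loop."""
    return (a[0] - 1.0) ** 2 + (a[1] + 2.0) ** 2
```

In a real encoder the cost evaluation dominates the run time, since every candidate requires a trial encoding of the frame; this is consistent with the virtual coding step described for FIG. 6.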
Bitrate Control by Adaptive Mode Selection
[0053] In applications for H.261 encoding, the Lagrangian
multiplier of the unconstrained cost function must be controlled so
that the decision curves generate modes which satisfy the bit count
value of the current frame. The content of the buffer is predicted
using the bit count of previously coded frames according to input
encoding parameters such as frame rate, channel rate and buffer
size.
[0054] Buffer control is required to adjust the average bitrate to
the desired one. In general, the state of the buffer is fed back to
the encoder, which selects quantizers so that the buffer neither
overflows nor underflows. However, the conventional bitrate control
method, which uses a mapping between quantizers and the buffer
state, does not produce a satisfactory encoding result because the
number of coded bits generated from frames of various modes (I, P
and B) varies widely. Accordingly, a method of setting overflow and
underflow alert states of the buffer according to an embodiment of
the present invention is explained below. Let $R_f$ and $R_c$
denote the frame rate (fps) and the channel rate (bps),
respectively, and let $B$ denote the buffer size, expressed as k
msec of the channel rate ($kR_c$). Let the input rate to the buffer
be represented by the following equation (17).
$$R_{in} = R_1 + R_2 + R_3 + \cdots + R_n = R_f r_1 + R_f r_2 + R_f r_3 + \cdots + R_f r_n \qquad (17)$$
[0055] assuming there are n frames coming into the encoder and the
unit of $R_i$ (i=1, ..., n) is bps. Using the above notation, the
contents of the buffer can be represented as follows,
$$B_L(t = n^-) = \sum_{i=0}^{n-1} R_f r_i - n R_c$$
$$B_L(t = n^+) = \sum_{i=0}^{n-1} R_f r_i - (n+1) R_c \qquad (18)$$
[0056] The upper part and lower part of equation (18) represent the
state of the buffer just after the nth frame is coded and
temporarily stored in the buffer, and just before the (n+1)th frame
is temporarily stored in the buffer, respectively. To prevent the
buffer from overflowing or underflowing, the buffer level $B_L$ is
controlled as follows
$$\alpha B \le B_L < \beta B \qquad (19)$$
[0057] where
$$\alpha + \beta = 1, \quad \alpha \ge 0, \quad \alpha \le \beta$$
[0058] The buffer can avoid overflow or underflow if the terms in
equation (18) meet equation (19). Accordingly, the following
equation (20) is obtained.
$$\alpha B \le \sum_{i=0}^{n-1} R_f r_i - (n+1) R_c \le \sum_{i=0}^{n-1} R_f r_i - n R_c < \beta B \qquad (20)$$
[0059] Here, since $B$ is expressed in terms of the channel rate
($B = kR_c$), the above equation (20) can be rewritten as follows
$$\alpha k \le \sum_{i=0}^{n-1}\left(\frac{R_f}{R_c}\right) r_i - (n+1) \le \sum_{i=0}^{n-1}\left(\frac{R_f}{R_c}\right) r_i - n < \beta k \qquad (21)$$
[0060] By varying .alpha. and .beta. the bounds to alert when
buffer reaches underflow or overflow can be preset. FIG. 4 shows an
example of the bounds, upper and lower bounds.
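The buffer bookkeeping of equations (18) and (19) can be sketched as below. This is a literal transcription under stated assumptions: the per-frame input terms $R_f r_i$ are passed in directly as a list, and the function names and the state labels are hypothetical.

```python
from typing import Sequence, Tuple


def buffer_levels(frame_terms: Sequence[float],
                  r_c: float) -> Tuple[float, float]:
    """Equation (18): the two buffer levels around frame n.

    frame_terms holds the n input terms R_f * r_i; r_c is the channel
    rate R_c. Returns (B_L(t = n-), B_L(t = n+)).
    """
    n = len(frame_terms)
    total_in = sum(frame_terms)
    return total_in - n * r_c, total_in - (n + 1) * r_c


def buffer_state(level: float, size: float,
                 alpha: float, beta: float) -> str:
    """Equation (19): classify a level against alpha*B <= B_L < beta*B."""
    if level < alpha * size:
        return "underflow_alert"
    if level >= beta * size:
        return "overflow_alert"
    return "ok"
```

With three input terms of 100, 120 and 90 and a channel rate of 64, the two levels are 118 and 54. For a buffer of size 200 with $\alpha$ = 0.2 and $\beta$ = 0.8 the bounds are 40 and 160, so both levels are within bounds.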
[0061] Bitrate control according to quality coefficient .lambda. is
explained below. The bitrate in the approach proposed by the
present invention is controlled by quality coefficient .lambda. not
by conventional quantizer, and Lagrangian multiplier is adjusted
depending on the state of buffer. When the content of buffer is in
underflow alert state (for example, 0<B.sub.L<.alpha.B),
Lagrangian multiplier .lambda. in equation (14) is decreased to
provide distortion constraint rather than bitrate constraint. When
the content of buffer reaches upper bound, .beta.B, .lambda. is
increased to allow the modes of macroblocks to use skipping
mode.
[0062] By increasing .lambda., algorithm gives more favor to
bitrate constraint than distortion constraint. Once the optimal
coefficients of the decision curve are found, each macroblock is
coded according to the mode based on the decision curve found.
FIGS. 5A and 5B show Peak Signal-to-Noise Ratio (PSNR) and buffer
fullness for various forms of .lambda.. As shown in FIGS. 5A and
5B, the small increment in .lambda. reduces PSNR and buffer state.
The decrement in .lambda. increases the buffer state but decreases
PSNR before overflow occurs. Accordingly, coded bitrate can be
controlled by adjusting .lambda. according to the buffer state not
to allow overflow or underflow to occur.
[0063] Obtaining the optimal $\lambda$ whenever overflow or
underflow occurs in a frame takes time. The procedure becomes
practical if the parameter $\lambda$ is applied per frame. A method
employed in a simulation in order to prevent the buffer from
overflowing is expressed as follows
$$\lambda_{j+1} = \lambda_j + \lambda_j\left(\frac{R_{F_j}}{R - R_{F_j}}\right) = \lambda_j\left(\frac{R}{R - R_{F_j}}\right) \qquad (22)$$
[0064] where the increment in $\lambda$ depends on the coded
bitrate of the previous frame. Equation (22) means that the
increment in $\lambda$ is small as long as the difference between
the desired bitrate and the coded bitrate of the jth frame is
large, and the increment becomes large when the difference is
small. When the content of the buffer reaches the underflow state,
$\lambda$ can be decreased as follows.
$$\lambda_{j+1} = \lambda_j - \lambda_j\left(\frac{R_{F_j}}{R - R_{F_j}}\right) = \lambda_j\left(\frac{R - 2R_{F_j}}{R - R_{F_j}}\right) \qquad (23)$$
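The two update rules of equations (22) and (23) reduce to one-line functions. The sketch below is only an illustration: the function names are hypothetical, and the check for $R = R_{F_j}$ is an added safeguard that the specification does not mention.

```python
def raise_lambda(lam: float, r_coded: float, r_target: float) -> float:
    """Equation (22): increase lambda when the buffer nears overflow.

    lam is lambda_j, r_coded is R_Fj (coded bitrate of frame j) and
    r_target is the desired bitrate R.
    """
    if r_target == r_coded:
        raise ValueError("update undefined when R equals R_Fj")
    return lam * r_target / (r_target - r_coded)


def lower_lambda(lam: float, r_coded: float, r_target: float) -> float:
    """Equation (23): decrease lambda when the buffer nears underflow."""
    if r_target == r_coded:
        raise ValueError("update undefined when R equals R_Fj")
    return lam * (r_target - 2.0 * r_coded) / (r_target - r_coded)
```

For example, with $\lambda_j$ = 2, $R_{F_j}$ = 25 and $R$ = 100, equation (22) gives 2 x 100/75, about 2.67, and equation (23) gives 2 x 50/75, about 1.33. Note that the result of equation (23) stays positive only while $R_{F_j} < R/2$.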
[0065] FIG. 6 is a flow diagram showing the AMS algorithm according
to the present invention. Referring to FIG. 6, when a frame is
input, mode controller 100 of the present invention judges whether
the input frame is new (S1). When the input frame is new,
coefficient $a_k$ of the decision curve is initialized (S2) and
then replaced by a new value (S3); when the input frame is not new,
the coefficient is replaced without initialization. Subsequently,
it is judged whether skip mode transmission applies, according to
an alert for a predetermined maximum value (overflow) or minimum
value (underflow) (S4). When it is not skip mode transmission in
step S4, the procedure goes to the next group of blocks (GOB) (S5).
Here, a GOB consists of thirty-three macroblocks. Then, the mode of
the macroblock is determined (S6). Thereafter, it is determined
whether the buffer level is within the overflow level (S7).
[0066] Subsequently, when it is skip mode transmission in step S4
or the buffer level is within the overflow level in step S7, the
values are sequentially compared with predetermined critical values
(S8, S9), and skip mode is selected if they all correspond to the
values (S11). Virtual coding is performed (S12) when the buffer
level is not within the overflow level in step S7 or after skip
mode is selected in step S11. Here, it is judged whether the buffer
level is in the underflow state (S12), and a dummy bit is inserted
when it is (S13). When the buffer level is not in the underflow
state in step S12, it is judged whether the current GOB is the
final GOB (S14), and the procedure returns to step S5 when it is
not.
[0067] When it is the final GOB, it is judged whether the group of
optimal blocks has been found (S15). When the group of optimal
blocks has not been found, it is judged whether the mean square
error obtained using the a.sub.k given in step S3 is smaller than
that for the previous a.sub.k (S16). If it is smaller, the smaller
value replaces the previous mean square error value (S17). When the
mean square error obtained using the a.sub.k given in step S3 is
not smaller than that for the previous a.sub.k, or after execution
of step S17, it is judged whether the group of optimal blocks has
been found or the given loop has been repeated (S18). When the
group of optimal blocks has not been found and the given loop has
not been repeated, a new a.sub.k is generated (S19). On the other
hand, when the condition of step S18 is met, the group of optimal
blocks has been found, and thus actual coding is carried out using
those a.sub.k and .lambda. (S20). When the group of optimal blocks
has been found in step S15, it is judged whether the current frame
is the final frame (S21); when it is not, the procedure goes to the
next frame, initializing the value of the group of optimal blocks
and a.sub.k (S22). When it is the final frame in step S21, the
procedure is completed.
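The search over a.sub.k in steps S15 to S20 amounts to keeping the
coefficient set with the lowest mean square error over a bounded
number of loops. A minimal Python sketch follows. The names
search_coefficients and virtual_encode are hypothetical, and the
random perturbation used to generate a new a.sub.k is an assumption,
since the text does not state how step S19 produces it.

```python
import random


def search_coefficients(virtual_encode, initial, max_loops=50,
                        step=0.1, seed=0):
    """Return (best_coeffs, best_mse) after a bounded search.

    virtual_encode(coeffs) must return the mean square error of a
    trial (virtual) encoding with those decision-curve coefficients.
    """
    rng = random.Random(seed)           # seeded for repeatability
    best = list(initial)
    best_mse = virtual_encode(best)
    for _ in range(max_loops):          # "given loop" of step S18
        # Assumed stand-in for step S19: perturb the best set so far.
        candidate = [a + rng.uniform(-step, step) for a in best]
        mse = virtual_encode(candidate)
        if mse < best_mse:              # steps S16 and S17
            best, best_mse = candidate, mse
    return best, best_mse
```

Actual coding (step S20) would then use the returned coefficients.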
[0068] FIGS. 7A and 7B are graphs showing the relationship between
PSNR and decision curve order (from 2 to 5) of motion/no-motion
compensation. Referring to FIGS. 7A and 7B, to test the proposed
approach, simulations were performed using various sequences in CIF
format (352 pixels.times.240 lines) for polynomial orders from P=2
to 9 in equation (5). For this experiment, the frame rate is held
constant at 30 fps and the average bitrate ranges from 128 kbps to
352 kbps. As a part of the encoding process, the modes are selected
using the procedure described above, that is, the process of
obtaining the optimal motion/no-motion compensation coding curve
and then obtaining the optimal intra/inter coding curve. The
encoding results are compared with
coded sequences generated by the video codec test model RM 8.
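For reference, the PSNR figure of merit and the per-frame bit budget
implied by these settings can be computed as in the following Python
sketch. The helper names psnr and bits_per_frame are illustrative,
and 8-bit samples (peak value 255) are an assumption not stated in
the text.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two sample sequences."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf                 # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def bits_per_frame(bitrate_bps, fps):
    """Average number of bits available per frame."""
    return bitrate_bps / fps
```

At 128 kbps and 30 fps, for example, the average budget is roughly
4267 bits per frame.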
[0069] FIGS. 8A and 8B are graphs which show the relationship
between PSNR and encoding rates in the case of encoding the Claire
sequence with Hierarchical Search Based Adaptive Mode Selection
(HAMS) and with AMS. Referring to FIGS. 8A and 8B, the experiment
showed that the order of the polynomial was optimal at P=3 for the
motion/no-motion compensation decision curve. For orders below 3,
satisfactory optimal coefficients were not obtained. For orders
above 3, PSNR increased only slightly while the computation
increased significantly. The experiment also showed that the order
of the polynomial was optimal at P=2 for the intra/inter coding
decision curve. The above optimal orders of the polynomials are
shown in FIGS. 7A and 7B. Hierarchical search motion estimation is
employed in order to reduce the complexity of obtaining motion
vectors and to increase encoding performance. Because the motion
vectors are obtained by hierarchical search motion estimation, they
affect the modes of the macroblocks. The modes obtained using HAMS
therefore differ from the modes selected according to adaptive mode
selection (AMS). This is because the absolute difference or
deviation between the corresponding macroblock and the current
macroblock changes when the motion vectors from the previous frames
differ from each other. FIGS. 9A and 9B show that the HAMS approach
performs better than the AMS approach.
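To illustrate how a polynomial decision curve of order P might be
evaluated for one macroblock, the following Python sketch uses
Horner's rule. Equation (5) is not reproduced in this passage, so
the form y = a.sub.0 + a.sub.1 x + ... + a.sub.P x.sup.P, the helper
names and the coefficient values are assumptions. Which side of the
curve selects which mode is deliberately left open.

```python
def curve_value(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) with Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def is_above_curve(coeffs, x, y):
    """True if the measured point (x, y) lies above the decision curve."""
    return y > curve_value(coeffs, x)


# Hypothetical third-order curve (P=3), i.e. four coefficients.
example_coeffs = [0.5, 1.0, 0.0, 0.01]
```

A third-order curve needs four coefficients and a second-order curve
three, so each additional order adds one coefficient to the search.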
[0070] FIGS. 9A and 9B are graphs showing the relationship between
PSNR and encoding rates in the case of encoding the Claire sequence
with HAMS. Two models of finding the optimal decision curve using
HAMS are compared with RM 8: finding the optimal decision curve for
motion/no-motion (HAMS-P) and for inter/intra (HAMS-I). The
proposed models provide consistent visual quality within and
between frames, as shown in FIGS. 8A and 8B. In order to check the
effect of overflow and underflow of the buffer, relatively high and
low bitrates are applied while the other parameters are kept
identical.
[0071] FIGS. 10A and 10B show the corresponding PSNR, the contents
of the buffer and the effect of buffer overflow and underflow when
frames are coded at 30 fps and 160 kbps. The proposed approaches
(HAMS-P
and HAMS-I) are compared with RM 8. HAMS-P and HAMS-I denote the
output PSNR and frame rates obtained when the optimal mode is
selected for the motion/no-motion decision function and for the
intra/non-intra coding function, respectively. FIGS. 10A and 10B
are graphs showing the relationship between PSNR and buffer
fullness in the case of encoding the Claire sequence with HAMS.
Referring to FIGS. 10A and 10B, in the experiment with RM 8, the
buffer begins to fill significantly after frame 13, which forces
RM 8 to repeat the previous frames (shown in FIG. 10B). With the
proposed scheme, however, the buffer does not reach the overflow
state in either model, because macroblock modes are selected based
on the optimal decision curves. Furthermore, up to frame 13, before
the buffer reaches overflow, the visual quality of the proposed
approach is better than that of the fixed model. In the case of
buffer underflow, the visual quality of the proposed approach and
that of RM 8 are similar to those in the overflow state because the
smallest quantization parameters are used.
[0072] FIGS. 11A and 11B show motion/no-motion compensation curves
and intra/inter coding decision curves. It can be seen in FIG. 11A
that if the error magnitude is low (about 2.5 in the HAMS case
compared to 1.0 in FIGS. 11A and 11B), then no motion compensation
is used. HAMS thus favors zero motion compensation, which can be
explained by the Lagrangian multiplier in the cost function. The
intra/inter decision curves obtained using HAMS favor intra frame
encoding, as shown in FIG. 11B. This explains why the optimal order
of the polynomial in the intra/inter decision curve is 2.
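The influence of the Lagrangian multiplier can be pictured with a
small Python sketch. It assumes the common rate-distortion form
J = D + .lambda.R, which is not reproduced in this passage and may
differ from the cost function of the specification. The mode names
and the distortion and rate figures are invented for illustration.

```python
def select_mode(candidates, lam):
    """Return the mode whose cost D + lam * R is smallest.

    candidates maps a mode name to a (distortion, rate_bits) pair.
    """
    return min(candidates,
               key=lambda m: candidates[m][0] + lam * candidates[m][1])


# Hypothetical per-macroblock measurements: (distortion, bits).
example = {
    "motion_compensated": (10.0, 100),
    "no_motion": (12.0, 40),
    "skip": (30.0, 1),
}
```

With these numbers, a multiplier of 0 selects the lowest-distortion
mode, while a larger multiplier shifts the choice toward modes that
spend fewer bits, such as coding without motion compensation.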
[0073] FIGS. 11A and 11B are graphs showing the motion/no-motion
compensation and intra/inter coding decision curves found in the
case of encoding with HAMS, and FIGS. 12A to 12D are graphs showing
the relationship between PSNR and encoded frame rates in the case
of encoding the Claire sequence using HAMS at 160 kbps and 179
kbps. Furthermore, FIGS. 12 to 15 show the results of numerous
simulations using other bitrates and various sequences. It should
be noted that, with conventional fixed encoding, the perceptible
difference in quality is quite large when buffer overflow occurs,
especially at low bitrates, because frames are repeated. The visual
quality is improved by 10 dB at the lower bitrate when the proposed
approach is used (shown in FIGS. 13A to 13D).
[0074] Moreover, the proposed approach provides uniform visual
quality across frames. FIGS. 13A to 13D are graphs showing the
relationship between PSNR and frame rates in the case of coding the
Miss America sequence using HAMS at 128 kbps and 192 kbps, and
FIGS. 14A, 14B, 14C and 14D show PSNR, relation with frame rate,
buffer fullness and relation with MQUANT, respectively, in the case
of encoding the Salesperson sequence using HAMS at 339 kbps. FIGS.
15A, 15B, 15C and 15D show PSNR, relation with frame rate, buffer
fullness and relation with MQUANT, respectively, in the case of
encoding the Salesperson sequence using HAMS at 352 kbps.
[0075] According to the present invention, the proposed method can
obtain higher PSNR and better visual quality than the standard
quantizer-feedback-based bitrate control approach. Furthermore, the
present invention can be applied to a video compressor, a video
conference apparatus or a communication terminal to select optimal
modes. Accordingly, it is possible to construct a video coder whose
visual quality variation is smaller than that obtained with
conventional simple quantizer control.
[0076] It will be apparent to those skilled in the art that various
modifications and variations can be made in the video signal coding
method of the present invention without departing from the spirit
or scope of the invention. Thus, it is intended that the present
invention cover the modifications and variations of this invention
provided they come within the scope of the appended claims and
their equivalents.
* * * * *