U.S. patent application number 16/577153, filed on 2019-09-20, was published on 2020-01-09 for method and apparatus for rate control for constant-bit-rate-finite-buffer-size video encoder.
The applicant listed for this patent is ATI Technologies ULC. Invention is credited to Stefan Eckart.
Application Number | 20200014938 16/577153 |
Document ID | / |
Family ID | 38193698 |
Filed Date | 2019-09-20 |
United States Patent Application | 20200014938 |
Kind Code | A1 |
Eckart; Stefan | January 9, 2020 |
METHOD AND APPARATUS FOR RATE CONTROL FOR
CONSTANT-BIT-RATE-FINITE-BUFFER-SIZE VIDEO ENCODER
Abstract
A method and apparatus for rate control for a constant-bit-rate
finite-buffer-size video encoder is described. Rate control is
provided by adjusting the size of non-intra frames based on the
size of intra frames. A sliding window approach is implemented to
avoid excessive adjustment of non-intra frames located near the end
of a group of pictures. A measurement of "power" based on a sum of
absolute values of pixel values is used. The "power" measurement is
used to adjust a global complexity value, which is used to adjust
the sizes of frames. The global complexity value responds to scene
changes. An embodiment of the invention calculates and uses L1
distances and pixel block complexities to provide rate control. An
embodiment of the invention implements a number-of-bit predictor
block. Predictions may be performed at a group-of-pictures level,
at a picture level, and at a pixel block level. An embodiment of
the invention resets a global complexity parameter when a scene
change occurs.
Inventors: | Eckart; Stefan; (Mountain View, CA) |
Applicant: |
Name | City | State | Country | Type |
ATI Technologies ULC | Markham | | CA | |
Family ID: | 38193698 |
Appl. No.: | 16/577153 |
Filed: | September 20, 2019 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
15074227 | Mar 18, 2016 | 10462473 |
16577153 | | |
11681492 | Mar 2, 2007 | 9414078 |
15074227 | | |
09552761 | Apr 18, 2000 | 7277483 |
11681492 | | |
Current U.S.
Class: |
1/1 |
Current CPC
Class: |
H04N 19/149 20141101;
H04N 19/124 20141101; H04N 19/182 20141101; H04N 19/65 20141101;
H04N 19/14 20141101; H04N 19/33 20141101; H04N 19/142 20141101;
H04N 19/423 20141101; H04N 19/176 20141101; H04N 19/593 20141101;
H04N 19/115 20141101; H04N 19/177 20141101; H04N 19/152 20141101;
H04N 19/12 20141101; H04N 19/172 20141101 |
International
Class: |
H04N 19/177 20060101
H04N019/177; H04N 19/593 20060101 H04N019/593; H04N 19/423 20060101
H04N019/423; H04N 19/12 20060101 H04N019/12; H04N 19/65 20060101
H04N019/65; H04N 19/33 20060101 H04N019/33; H04N 19/176 20060101
H04N019/176; H04N 19/172 20060101 H04N019/172; H04N 19/149 20060101
H04N019/149; H04N 19/115 20060101 H04N019/115; H04N 19/124 20060101
H04N019/124; H04N 19/14 20060101 H04N019/14; H04N 19/142 20060101
H04N019/142; H04N 19/152 20060101 H04N019/152; H04N 19/182 20060101
H04N019/182 |
Claims
1. Apparatus for rate control for a constant-bit-rate
finite-buffer-size video encoder comprising: a preprocessing stage
for determining a power value based on pixel values of only a first
frame; and a group-of-pictures-level rate control block operatively
coupled to the preprocessing stage to receive the power value and
to provide a target quantizer step size used to provide rate
control for the video encoder; wherein the group-of-pictures-level
rate control block causes an adjustment of sizes of non-intra
frames based on expected sizes of future intra frames.
2. The apparatus of claim 1 wherein the preprocessing stage updates
the power value for each subsequent picture being encoded.
3. Apparatus for rate control for a constant-bit-rate
finite-buffer-size video encoder comprising: a prediction error
image block to determine L1 distances according to sums of absolute
differences; a picture-level rate control block operatively coupled
to the prediction error image block to receive the L1 distances and
to produce a target quantizer step size for a pixel block; and a
complexity estimator block operatively coupled to the prediction
error image block to determine non-intra pixel block complexity
values and intra pixel block complexity values.
4. A method for rate control for a constant-bit-rate
finite-buffer-size video encoder comprising: obtaining a prediction
error frame including a plurality of pixel-level error values;
calculating a sum of absolute values of the pixel-level error
values for a pixel block; calculating an expected number of bits
for the pixel block based on the sum of the absolute values, and
using the expected number of bits for the pixel block to obtain
constant-bit-rate video encoding.
5. The method of claim 4 wherein using the expected number of bits
for the pixel block to obtain constant-bit-rate video encoding
further comprises: calculating an expected number of bits for a
frame in which the pixel block is located; and using the expected
number of bits for the frame to obtain constant-bit-rate video
encoding.
6. The method of claim 5 wherein calculating the expected number of
bits for the frame further comprises: summing the expected number
of bits for the pixel block for all pixel blocks in the frame.
7. A method for rate control for a constant-bit-rate
finite-buffer-size video encoder comprising: predicting a
relationship between a quantizer scale factor and a number of
encoded bits of a pixel block based on a known relationship in
previous pixel blocks of a same type; and using the quantizer scale
factor to control a pixel block level rate of the video
encoder.
8. The method of claim 7 wherein using the quantizer scale factor
to control the pixel block level rate of the video encoder further
comprises: using the quantizer scale factor together with a sum of
absolute values of pixel-level error values to control the pixel
block level rate of the video encoder.
9. The method of claim 7 wherein predicting the relationship
between the quantizer scale factor and the number of encoded bits
of the pixel block further comprises: predicting a first
relationship between the quantizer scale factor and a first number
of encoded bits of a first type of pixel block based on a first
known relationship in previous pixel blocks of the first type; and
predicting a second relationship between the quantizer scale factor
and a second number of encoded bits of a second type of pixel block
based on a second known relationship in previous pixel blocks of
the second type.
10. A method for rate control for a constant-bit-rate
finite-buffer-size video encoder comprising: calculating a
prediction for a number of bits encoded for a pixel block based on
a sum of absolute values of pixel-level error values, a pixel block
complexity, and a quantizer scale factor; using the prediction for
adjusting the quantizer scale factor to meet a targeted
picture-level number of bits.
Description
RELATED APPLICATIONS
[0001] This application is a divisional of U.S. application Ser.
No. 15/074,227, filed Mar. 18, 2016, which is a divisional of U.S.
application Ser. No. 11/681,492 (now U.S. Pat. No. 9,414,078),
which is a continuation of U.S. application Ser. No. 09/552,761
(now U.S. Pat. No. 7,277,483), filed Apr. 18, 2000, entitled "Method
and Apparatus for Rate Control for Constant-Bit-Rate
Finite-Buffer-Size Video Encoder", having as inventor Stefan
Eckart, owned by the instant assignee, and incorporated herein by
reference.
TECHNICAL FIELD OF THE INVENTION
[0002] The invention relates generally to video encoding and more
specifically to a method and apparatus for rate control for a
constant-bit-rate finite-buffer-size video encoder.
BACKGROUND OF THE INVENTION
[0003] Much technology has been developed to facilitate
communication of images over media of finite bandwidth. It is
generally desirable to communicate the highest quality of images
possible over a medium of a given bandwidth. Thus, techniques such
as video compression (e.g., compression according to a Moving
Picture Experts Group (MPEG) format) have been developed to reduce
the amount of data required to represent images. An MPEG format
includes various types of frames, including intra frames and
non-intra frames. Intra frames contain sufficient information to
reconstruct an uncompressed video frame without the need to
reference information in other MPEG frames. Non-intra frames
contain less information, allowing reconstruction of an
uncompressed video frame when combined with information from other
MPEG frames.
[0004] To increase the efficiency of the compression, the
relationship between the intra frames and the non-intra frames
varies depending on the nature of the video stream being encoded.
For example, if a video stream includes frames that differ very
little from one to the next, non-intra frames containing little
information can accurately represent uncompressed video frames.
However, if, for example, the frames of the video stream differ
substantially from one another, more information is needed to
accurately convey the video stream. As an example, during a scene
change when the video stream changes from portraying one scene to a
completely different scene, the image of the new scene generally
bears no relationship to the image of the previous scene. Thus, an
intra frame is usually used to provide information about the new
scene.
[0005] As can be readily appreciated, the relationship between the
size of the intra frames and the non-intra frames, and even the
frequency of the intra frames relative to the non-intra frames,
cannot easily be predicted. Added complication arises when the
compressed frames are to be communicated over a medium of finite
bandwidth. While circumstances such as a scene change may
necessitate communication of more information, the available
bandwidth does not expand to accommodate the additional
information. The buffers used to store information from the
compressed video stream during processing are of finite size. Thus,
variations in a compressed video stream can lead to buffer overflow
and underflow conditions, disrupting the reproduction of the video
stream. To accommodate the finite bandwidth of the medium, it is
desirable to produce a compressed video stream that occurs at a
constant, or substantially constant, bit rate.
[0006] The visual quality of compressed video encoded by a
constant-bit-rate finite-buffer-size video encoder depends
substantially on the characteristics of the underlying rate-control
technique. To operate efficiently, the rate-control technique makes
assumptions regarding the compression properties of future frames
(i.e., frames that have not yet been compressed). These assumptions
can be based on analyzing the compression properties of future
frames in advance. While this leads to high quality and stable
operation, it also causes an increase in computational and storage
demands that is not always economic. Also, the overall system delay
increases significantly because a frame can only be encoded after
the future frames needed for encoding this frame have become
available. Thus, it is desirable to avoid these disadvantages.
[0007] In addition to the accurate prediction of the compression
properties of future frames, it is desirable for a rate-control
algorithm to ensure that the number of actually generated
bits for the current frame closely matches the target number of
bits allocated to the current frame. Since the functional
relationship between the primary control variable (e.g., the
quantization step size) and the resulting number of bits is highly
non-linear, iteratively encoding the frame at different
quantization step sizes is used to exactly arrive at a given number
of bits per frame. This is computationally expensive. Thus, it is
desirable to avoid this computational expense and complexity.
[0008] Furthermore, it is desirable for rate-control to be robust.
Whenever the assumptions (e.g., the predicted compression
properties of future frames or the number of bits generated for the
current frame) turn out to be inaccurate, finite buffer-size
constraints still have to be dealt with, preferably in a manner
that does not greatly affect visual quality. Thus, it is desirable
to provide such robustness so as to ensure that constraints are
met, and visual quality is maintained.
[0009] Thus, a technique is needed to provide rate control for a
constant-bit-rate finite-buffer-size video encoder that provides
the desired features while avoiding the disadvantages.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] FIG. 1 is a block diagram illustrating a portion of an
apparatus for rate control for a constant-bit-rate
finite-buffer-size video encoder in accordance with an embodiment
of the invention.
[0011] FIG. 2 is a block diagram illustrating a portion of an
apparatus for rate control for a constant-bit-rate
finite-buffer-size video encoder in accordance with an embodiment
of the invention.
[0012] FIG. 3 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention.
[0013] FIG. 4 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention.
[0014] FIG. 5 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention.
[0015] FIG. 6 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention.
[0016] FIG. 7 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention.
[0017] FIG. 8 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention.
[0018] FIG. 9 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention.
[0019] FIG. 10 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT
[0020] A method and apparatus for rate control for a
constant-bit-rate finite-buffer-size video encoder is described.
Rate control is provided by adjusting the size of non-intra frames
based on the expected size of future intra frames. Here, the size
of a frame is the number of bits in the encoded, or compressed,
frame. A sliding window approach is implemented to avoid excessive
adjustment of non-intra frames located near the end of a group of
pictures. A measurement of "power" based on a sum of absolute
values of pixel values is used. The "power" measurement is used to
adjust a global complexity value, which is used to adjust the sizes
of frames. The global complexity value responds to scene
changes.
[0021] An embodiment of the invention calculates and uses L1
distances and pixel block complexities to provide rate control. An
embodiment of the invention implements a number-of-bit predictor
block. Predictions may be performed at a group-of-pictures level,
at a picture level, and at a pixel block level. An embodiment of
the invention resets a global complexity parameter when a scene
change occurs.
[0022] Video data is organized as a sequence of frames. A frame
represents an instantaneous image. Thus, the video data may be
thought of as being divided in time into frames. The frames may be
divided in space into smaller elements of the frames. As an
example, the frames may be divided into an array of pixels. Frames
may also be divided into groups of pixels, referred to as
macroblocks or pixel blocks. One example of a macroblock or pixel
block is a 16×16 array of pixels.
[0023] The present invention is capable of advantageously using
compression properties from past frames (frames that already have
been compressed) and, possibly, the current frame, rather than
requiring compression properties of future frames. High quality of
compressed video is provided in accordance with accurate prediction
of compression properties of future frames based on the available
compression properties of past frames.
[0024] The rate-control features in accordance with an embodiment
of the invention generate an accurate approximation of the desired
number of bits in a single pass without iterations. Additionally,
the present invention affords robust rate control.
[0025] FIG. 1 is a block diagram illustrating a portion of an
apparatus for rate control for a constant-bit-rate
finite-buffer-size video encoder in accordance with an embodiment
of the invention. FIG. 1 includes reference frame block 101, motion
estimation block 102, motion compensated prediction block 103,
uncompressed video frame block 104, adder 105, prediction error
image block 106, preprocessing stage 107, discrete cosine transform
(DCT) block 108, quantization block 109, variable length coding
(VLC) block 110, video buffer verifier (VBV) 111, rate control 112,
and complexity estimator 113.
[0026] Reference frame block 101 provides reference frames 114 and
115 to motion estimation block 102. Uncompressed video frame block
104 provides uncompressed video frames 118, 119, and 120 to motion
estimation block 102, to adder 105, and to preprocessing stage 107.
Preprocessing stage 107 determines a power value 121 and a local
activity value 122. In one embodiment, the preprocessing stage 107
updates the power value for each subsequent picture or frame being
encoded.
[0027] Motion estimation block 102 provides a motion estimate 116
to motion compensated prediction block 103. Motion compensated
prediction block 103 provides a pixel block type indication 117.
Motion compensated prediction block 103 also provides a motion
compensated prediction frame 134 as a negative input to adder 105.
Adder 105 subtracts the motion compensated prediction frame 134
from the uncompressed video frame 119 and provides the result 123
to prediction error image block 106.
[0028] Prediction error image block 106 provides a prediction error
image 124 to DCT block 108. Prediction error image block 106 also
determines when a scene change occurs and provides a scene change
indication 125 to complexity estimator 113. Prediction error image block
106 further provides L1 distances 126. The L1 distances represent a
power measurement at the pixel block level that may be obtained by
summing the absolute differences within a pixel block.
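As an illustration of the L1 distances described above, the following minimal sketch sums the absolute values of a prediction error image within each pixel block. The function name `l1_distances`, the list-of-rows image representation, and the handling of partial blocks at the image border are assumptions made for this example and are not taken from the described apparatus.

```python
def l1_distances(error_image, block=16):
    """Return per-block sums of absolute prediction errors.

    error_image: list of rows of signed integers (hypothetical
                 representation of the prediction error image).
    block:       block edge length in pixels; 16 matches the
                 16x16 macroblock example in the text.
    """
    height, width = len(error_image), len(error_image[0])
    distances = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            total = 0
            # Border blocks are clipped to the image size (an assumption).
            for y in range(top, min(top + block, height)):
                for x in range(left, min(left + block, width)):
                    total += abs(error_image[y][x])
            row.append(total)
        distances.append(row)
    return distances
```

For example, calling `l1_distances([[1, -2, 3, 0], [0, -1, -4, 5]], block=2)` yields one row of two block distances, one per 2×2 block.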
[0029] DCT block 108 provides a DCT result 127 to quantization
block 109. Quantization block 109 performs quantization according
to a quantizer step size, referred to as mquant, and provides a
result 128 to VLC block 110. VLC block 110 provides an MPEG bit
stream 129, which is fed back to complexity estimator 113 and VBV
111.
[0030] VBV 111 provides a VBV fullness output 130 to rate control
block 112. Rate control block 112 provides quantizer step size 131
to quantization block 109 and to complexity estimator 113.
Complexity estimator 113 is coupled to the prediction error image
block 106 and provides a global complexity 132 and pixel block
complexities 133. The pixel block complexities 133 include
non-intra pixel block complexity values and intra pixel block
complexity values. The complexity estimator 113 resets a global
complexity value upon receipt of the scene change indication 125.
[0031] FIG. 2 is a block diagram illustrating a portion of an
apparatus for rate control for a constant-bit-rate
finite-buffer-size video encoder in accordance with an embodiment
of the invention. FIG. 2 includes group-of-pictures-level
(GOP-level) rate control 201, picture-level rate control 202,
pixel-block-level rate control 203, adder 204, and number-of-bit
predictor 205. GOP-level rate control 201 is operatively coupled to
the preprocessing stage to receive the power value 121 and global
complexity 132 and provides a target quantizer step size 206 used
to provide rate control for the video encoder to picture-level rate
control 202. The group-of-pictures-level rate control block causes
an adjustment of sizes of non-intra frames based on the expected
sizes of future intra frames.
[0032] The picture-level rate control block 202 is operatively
coupled to the prediction error image block to receive the L1
distances 126. The picture-level rate control block 202 also
receives VBV fullness output 130, pixel block complexities 133, and
pixel block type 117 and provides a target quantizer step size for
a pixel block to pixel-block-level rate control block 203 and to
number-of-bit predictor block 205.
[0033] Number-of-bit predictor block 205 receives L1 distances 126,
pixel block complexities 133, and pixel block type 117, as well as
picture-level rate control output 207. The number-of-bit predictor
predicts a number of bits generated by the video encoder.
Number-of-bit predictor block 205 provides a number-of-bit
prediction output 208 to adder 204. MPEG bit stream 129 is provided to a
number-of-bit counter 210. The number-of-bit counter 210 provides
an output 211 that is received by adder 204 as a negative input.
Adder 204 subtracts output 211 from number-of-bit prediction output
208 and provides the result 209 to pixel-block-level rate control
block 203. Pixel-block-level rate control block 203 receives local
activity 122. Pixel-block-level rate control block 203 also
receives L1 distances 126. Pixel-block-level rate control block 203
provides quantizer step size 131.
[0034] FIG. 3 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention. A sliding window
approach is used with respect to the GOP being processed. The
sliding window approach avoids wide variations in rate control
adjustments dependent upon the location of a frame (or picture) in
a GOP.
[0035] The method begins in step 301 and continues to step 302. In
step 302, a first quantizer step size is calculated such that a
first number of bits generated at an output of the
constant-bit-rate finite-buffer-size video encoder is constant over
a first given number of frames (e.g., GOP) starting at a current
frame. In step 303, the current frame is incremented. In step 304,
a second quantizer step size is calculated such that a second
number of bits generated at the output of the constant-bit-rate
finite-buffer-size video encoder is constant over a second given
number of frames starting at the incremented current frame. Thus, a
full GOP is considered for each frame processed, rather than
considering only those frames remaining in a static GOP or waiting
until a second static GOP following the first static GOP is
processed.
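The sliding window of FIG. 3 can be sketched as follows. This is only an illustrative simplification: it assumes an inverse model in which the bits for a frame equal its complexity divided by the quantizer step size, a single step size shared by all picture types, and a GOP pattern that repeats periodically. The names `window_step_size`, `pattern`, `complexity`, and `bits_per_frame` are hypothetical.

```python
def window_step_size(pattern, complexity, start, bits_per_frame):
    """Step size that spends the full window budget starting at `start`.

    pattern:        GOP picture types in coding order, e.g. "IBBPBB".
    complexity:     per-type complexity estimates, e.g. {"I": 600, ...}.
    start:          index of the current frame within the pattern.
    bits_per_frame: average bit budget per frame.
    """
    n = len(pattern)
    # The window always covers a full GOP length, wrapping into the next GOP.
    window_types = [pattern[(start + k) % n] for k in range(n)]
    total_complexity = sum(complexity[t] for t in window_types)
    # Assumed model: sum(X_t / q) over the window == n * bits_per_frame.
    return total_complexity / (n * bits_per_frame)
```

Because the window always spans a full GOP length, the computed step size in this sketch does not depend on where the current frame sits within the GOP, which is the behavior the sliding window is intended to provide.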
[0036] FIG. 4 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention. The method begins
in step 401. In step 402, a power value is calculated by
calculating the sum of absolute values of pixel values over a first
frame. Step 402 may include steps 403, 404, and 405. In step 403,
an average value of the pixel values in each of a plurality of
pixel blocks (e.g., macroblocks) within the first frame is
calculated. In step 404, for each of the plurality of pixel blocks,
a sum of absolute differences between the pixel values in the
respective pixel block and the average value is calculated. This
step may be repeated for all pixel blocks in the picture (e.g.,
frame). In step 405, the sums of the absolute differences for
the plurality of pixel blocks within the first frame are added
to obtain a power value for the first frame.
[0037] From step 402, the method continues to step 406. In step
406, a number of bits in a second frame is adjusted based on the
sum of the absolute values of pixel values. The method ends in step
407.
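Steps 403 through 405 can be sketched as follows. The function name `frame_power`, the list-of-rows frame representation, and the clipping of partial blocks at the frame border are assumptions made for illustration.

```python
def frame_power(frame, block=16):
    """Power value of a frame per steps 403-405 (illustrative sketch).

    frame: list of rows of pixel values (hypothetical representation).
    block: block edge length in pixels (16 for a macroblock).
    """
    height, width = len(frame), len(frame[0])
    power = 0.0
    for top in range(0, height, block):
        for left in range(0, width, block):
            pixels = [frame[y][x]
                      for y in range(top, min(top + block, height))
                      for x in range(left, min(left + block, width))]
            # Step 403: average pixel value of the block.
            mean = sum(pixels) / len(pixels)
            # Step 404: sum of absolute differences from the average.
            # Step 405: accumulate the per-block sums over the frame.
            power += sum(abs(p - mean) for p in pixels)
    return power
```

A perfectly flat block contributes nothing to the power value in this sketch, so the result grows with the amount of spatial detail in the frame.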
[0038] FIG. 5 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention. A power value may
be used to adjust a global complexity, which may be expressed as
Xi. The method begins in step 501. In step 502, a reference global
complexity is calculated for each intra frame encoded. In step 503,
a reference power value is calculated for each intra frame
encoded.
[0039] In step 504, a power value is calculated for subsequent
frames. In step 505, a global complexity is calculated by
multiplying the reference global complexity by the power value and
dividing by the reference power value. In step 506, the global
complexity is used to adjust a frame size. The method ends in step
507.
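Step 505 can be restated compactly as a formula. The symbol names are chosen here for illustration and do not appear in the figures: $X_{\mathrm{ref}}$ and $P_{\mathrm{ref}}$ denote the reference global complexity and reference power value calculated for an intra frame, and $P$ denotes the power value calculated for a subsequent frame.

```latex
X = X_{\mathrm{ref}} \cdot \frac{P}{P_{\mathrm{ref}}}
```

With hypothetical values $X_{\mathrm{ref}} = 400{,}000$, $P_{\mathrm{ref}} = 2{,}000{,}000$, and $P = 2{,}500{,}000$, the global complexity becomes $X = 500{,}000$, that is, a frame with 25% more power than the reference is treated as 25% more complex.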
[0040] FIG. 6 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention. The method begins
in step 601. In step 602, a prediction error frame including a
plurality of pixel-level error values is obtained. In step 603, a
sum of absolute values of the pixel-level error values for a pixel
block is calculated.
[0041] In step 604, an expected number of bits for the pixel block
is calculated based on the sum of the absolute values, which may be
expressed as pmb. Step 604 may include steps 605 and 607 and/or
step 608. In step 605, an expected number of bits for a frame in
which the pixel block is located is calculated. Step 605 may also
include step 606. In step 606, the expected number of bits for the
pixel block are summed for all pixel blocks in the frame. In step
608, for each pixel block in the frame, a pixel block complexity
value is multiplied by the sum of the absolute values of the
pixel-level error values for the pixel block and divided by a
target quantizer step size for the frame. In step 607, the expected
number of bits for the frame is used to obtain constant-bit-rate
video encoding. In step 609, the expected number of bits for the
pixel block is used to obtain constant-bit-rate video encoding. The
process ends in step 610. L1 distances may be usefully employed in
accordance with the method set forth above.
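The calculations of steps 606 and 608 can be sketched as follows. The function names and the representation of a frame as a list of `(complexity, sad)` pairs are assumptions made for this example.

```python
def expected_block_bits(complexity, sad, target_step):
    """Step 608: complexity times the sum of absolute error values,
    divided by the target quantizer step size for the frame."""
    return complexity * sad / target_step


def expected_frame_bits(blocks, target_step):
    """Step 606: sum the per-block expectations over the frame.

    blocks: iterable of (complexity, sad) pairs, one per pixel block
            (a hypothetical representation).
    """
    return sum(expected_block_bits(c, s, target_step) for c, s in blocks)
```

For instance, two blocks with hypothetical complexities 0.5 and 0.25 and sums of absolute values 400 and 800, at a target step size of 8, would each be expected to produce 25 bits, for a frame total of 50 bits.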
[0042] FIG. 7 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention. The method starts
in step 701. In step 702, a relationship between a quantizer scale
factor and a number of encoded bits of a pixel block is predicted
based on a known relationship in previous pixel blocks of a same
type. Step 702 may also include steps 703 and 704. In step 703, a
first relationship between the quantizer scale factor and a first
number of encoded bits of a first type of pixel block is predicted
based on a first known relationship in previous pixel blocks of the
first type. In step 704, a second relationship between the
quantizer scale factor and a second number of encoded bits of a
second type of pixel block is predicted based on a second known
relationship in previous pixel blocks of the second type. As an
example, these relationships may be pixel block complexities. As
can be seen, separate pixel block complexities may be determined
for intra frame pixel blocks and for non-intra frame pixel
blocks.
[0043] From step 702, the process continues to step 705. In step
705, the quantizer scale factor is used to control a pixel block
level rate of the video encoder. Step 705 may include step 706. In
step 706, the quantizer scale factor is used together with L1
distances to control the pixel block level rate of the video
encoder. In step 707, the method ends.
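One possible way to maintain separate pixel block complexities for intra and non-intra pixel blocks is sketched below. The text does not specify how the known relationship in previous pixel blocks is tracked, so the update rule here is an assumption: the observed complexity of a coded block is taken as bits times step size divided by L1 distance, and is blended into the running estimate with an exponential average. The class name, the `smoothing` parameter, and the initial value are hypothetical.

```python
class BlockComplexityTracker:
    """Per-type pixel block complexity estimates (illustrative sketch)."""

    def __init__(self, smoothing=0.5, initial=1.0):
        self.smoothing = smoothing  # assumed blending weight in [0, 1]
        self.complexity = {"intra": initial, "non_intra": initial}

    def update(self, block_type, bits, step, l1):
        """Fold one already-coded block of the given type into the estimate."""
        if l1 == 0:
            return  # no error energy; nothing can be learned from this block
        observed = bits * step / l1
        old = self.complexity[block_type]
        self.complexity[block_type] = (
            (1 - self.smoothing) * old + self.smoothing * observed
        )

    def predict_bits(self, block_type, step, l1):
        """Predicted bits for a block of the same type at a given step size."""
        return self.complexity[block_type] * l1 / step
```

Keeping one estimate per block type means that a run of intra blocks does not distort the prediction for non-intra blocks, and vice versa.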
[0044] FIG. 8 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention. The method begins
in step 801. In step 802, a group-of-pictures-level prediction for
a number of bits encoded for a group of pictures is calculated.
Step 802 may include step 803. In step 803, the
group-of-pictures-level prediction for the number of bits encoded
for the group-of-pictures is calculated based on a global
complexity value.
[0045] From step 802, the method continues in step 804. In step
804, a picture-level prediction for a number of bits encoded for a
picture is calculated. Step 804 may include step 805. In step 805,
the picture-level prediction for the number of bits encoded for the
picture is calculated based on a pixel block type, an L1 distance,
and a pixel block complexity.
[0046] From step 804, the method continues to step 806. In step
806, a pixel-block-level prediction for a number of bits encoded
for a pixel block is calculated. Step 806 may include step 807. In
step 807, the pixel-block-level prediction for the number of bits
encoded for the pixel block is calculated based on a local activity
value.
[0047] From step 806, the method continues to step 808. In step
808, the group-of-pictures-level prediction, the picture-level
prediction, and the pixel-block-level prediction are used to adjust
a quantizer scale factor to provide the rate control for the video
encoder. The method ends in step 809. Thus, the method utilizes
prediction of a number of bits at the GOP level, the picture (e.g.,
frame) level, and the pixel block (e.g., macroblock) level to
achieve higher accuracy in prediction and more effective rate
control.
[0048] FIG. 9 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention. The method begins
in step 901. In step 902, a scene change indication is obtained
from a prediction error image. This may be done, for example, by
looking at the ratio between intra and non-intra coded macroblocks.
From step 902, the method continues to step 903. In step 903, the
scene change indication is used to reset a global complexity
history (e.g., Xipb). From step 903, the method continues to step
904. In step 904, the global complexity history is used to provide
the rate control for the video encoder.
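Steps 902 and 903 can be sketched as follows. The text mentions the ratio between intra and non-intra coded macroblocks but gives no threshold, so the sketch uses the fraction of intra-coded blocks with an assumed threshold of 0.5. The function names and the list-based history are also hypothetical.

```python
def is_scene_change(block_types, threshold=0.5):
    """Flag a scene change when intra-coded blocks dominate the picture.

    block_types: list of "intra" / "non_intra" labels, one per macroblock.
    threshold:   assumed fraction of intra blocks above which a scene
                 change is declared.
    """
    if not block_types:
        return False
    intra = sum(1 for t in block_types if t == "intra")
    return intra / len(block_types) > threshold


def update_history(history, block_types, new_value, threshold=0.5):
    """Step 903: discard the complexity history on a scene change,
    then append the newest complexity value."""
    if is_scene_change(block_types, threshold):
        history = []
    return history + [new_value]
```

Resetting the history prevents complexity values measured on the previous scene from influencing the rate control of the new scene.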
[0049] FIG. 10 is a flow diagram illustrating a method for rate
control for a constant-bit-rate finite-buffer-size video encoder in
accordance with an embodiment of the invention. The method begins
in step 1001. In step 1002, a prediction for a number of bits
encoded for a pixel block is calculated based on an L1 distance, a
pixel block complexity, and a quantizer scale factor. In step 1003,
the prediction is used for adjusting the quantizer scale factor
(e.g., mquant) to meet a targeted picture-level number of bits. The
method ends in step 1004.
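Step 1003 can be illustrated by inverting a per-block bit prediction. The sketch assumes the prediction takes the form complexity times L1 distance divided by quantizer scale factor, so that a single scale factor meeting a picture-level target can be solved for directly. The function name and the clamping bounds `min_step` and `max_step` are assumptions made for this example.

```python
def step_for_target(blocks, target_bits, min_step=1.0, max_step=31.0):
    """Quantizer scale factor predicted to hit a picture-level bit target.

    blocks:      iterable of (complexity, l1_distance) pairs, one per
                 pixel block (hypothetical representation).
    target_bits: targeted picture-level number of bits (must be > 0).
    """
    if target_bits <= 0:
        raise ValueError("target_bits must be positive")
    # Assumed model: predicted bits = sum(c * d) / step.
    weighted = sum(c * d for c, d in blocks)
    step = weighted / target_bits
    # Keep the result inside the allowed quantizer range (assumed bounds).
    return max(min_step, min(max_step, step))
```

Because the solution is in closed form, no iterative re-encoding is needed in this sketch; the same calculation could be repeated for the remaining blocks and remaining bits as a picture is encoded.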
[0050] In accordance with an embodiment of the invention, the rate
control process is organized hierarchically as follows:
[0051] GOP level: distributes bits to I, P and B pictures based on the
GOP structure (IBP pattern) and the statistical properties of the
individual picture types
[0052] picture level: calculates the target bit allocation and mquant
for the next picture based on statistical properties of that
particular picture
[0053] macroblock level: adjusts mquant to meet the target bit
allocation (optional)
In addition, the rate control handles the following tasks:
[0054] VBV compliance (bitrate adjustment, emergency quant mode, bit
stuffing)
[0055] psychovisual masking (spatial activity based mquant modulation)
[0056] estimation of various rate control parameters (adaptive rate
control algorithm)
[0057] GOP Level Rate Control
[0058] The proportion of bits allocated to a picture depends on its
picture type (I, P, or B). The allocation is based on the goal of
achieving fixed mquant ratios as follows:
$$\mathrm{mquant}_I : \mathrm{mquant}_P : \mathrm{mquant}_B = K_I : K_P : K_B \qquad (1)$$
or, equivalently:
$$\frac{\mathrm{mquant}_{ipb}}{K_{ipb}} = \frac{1}{c} = \mathrm{const} \qquad (2)$$
[0059] Throughout this document an index of ipb can have one of the
values I, P, or B and indicates a picture type specific entity. In
(2), c is a constant that depends on the bitrate and frame
statistics.
[0060] The relationship between the mquant (or quantiser_scale)
value used for encoding a frame and the resulting number of bits is
complex and the only way to accurately calculate it is by actually
encoding the frame at the given mquant value. For the purpose of
rate control a highly simplified model is used instead as
follows:
$$R_{ipb} = X_{ipb} \frac{1}{\mathit{mquant}_{ipb}} \qquad (3)$$
[0061] An inverse proportional relationship is assumed between
mquant and R.sub.ipb, the number of bits per frame (or bitrate). In
this document, all bitrates are expressed as bits per frame instead
of bits per second, therefore the terms bits per frame and bitrate
are used interchangeably.
[0062] X.sub.ipb denotes global (coding) complexity and
characterizes the encoding process and its dependency on the frame
content. In practice, X.sub.ipb is a function of mquant but rate
control assumes it to be constant. X.sub.ipb is determined by
parameter estimation as described later (cf. (27) and (31)).
[0063] Combining (2) and (3) results in
$$R_{ipb} = c \frac{X_{ipb}}{K_{ipb}} \equiv c X'_{ipb} \qquad (4)$$
where X' is a short notation for X/K (normalized complexity). The
average bitrate R for an entire GOP can be calculated as
$$R = \frac{\sum_{ipb} N_{ipb} R_{ipb}}{\sum_{ipb} N_{ipb}} = \frac{\sum_{ipb} N_{ipb} R_{ipb}}{N} \qquad (5)$$
where N.sub.ipb is the number of frames of a particular type in a
GOP, and N is the total number of frames per GOP. For example, the
typical case of IBBPBBPBBPBBPBB corresponds to N=15, N.sub.I=1,
N.sub.P=4, N.sub.B=10.
[0064] Substituting (4) into (5):
$$R = \frac{c \sum_{ipb} N_{ipb} X'_{ipb}}{N} \qquad (6)$$
and solving for c:
$$c = \frac{N R}{\sum_{ipb} N_{ipb} X'_{ipb}} \qquad (7)$$
allows the individual R.sub.ipb values to be calculated (using (4))
as a function of the complexities X and the average bitrate R:
$$R_{ipb} = \frac{N R X'_{ipb}}{\sum_{ipb} N_{ipb} X'_{ipb}} \qquad (8)$$
(Note that if
$$\frac{\sum_{ipb} N_{ipb} X'_{ipb}}{N}$$
is interpreted as average GOP complexity X.sub.GOP', (8) simplifies
to
$$\frac{R_{ipb}}{R} = \frac{X'_{ipb}}{X'_{GOP}}.$$)
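By way of illustration only, the GOP level allocation of equation (8) may be sketched as follows. The function name, the dictionary layout and the numeric values are hypothetical and are not taken from the specification; counting the letters of the IBBPBBPBBPBBPBB pattern gives one I, four P and ten B frames.

```python
def gop_bit_allocation(x, k, n, r_avg):
    """Per-picture-type bit targets R_ipb following equation (8).

    x: global complexities X_ipb, k: mquant ratio constants K_ipb,
    n: number of frames of each type per GOP, r_avg: average bits per
    frame R. All dicts are keyed by picture type 'I', 'P', 'B'.
    """
    x_norm = {t: x[t] / k[t] for t in x}          # X' = X / K
    weighted = sum(n[t] * x_norm[t] for t in x)   # sum of N_ipb * X'_ipb
    n_total = sum(n.values())                     # N
    return {t: n_total * r_avg * x_norm[t] / weighted for t in x}


# Hypothetical complexities and ratios, chosen only to exercise the formula.
x = {"I": 4.0e6, "P": 2.0e6, "B": 1.4e6}
k = {"I": 1.0, "P": 1.0, "B": 1.4}
n = {"I": 1, "P": 4, "B": 10}
r = gop_bit_allocation(x, k, n, r_avg=200_000.0)
```

With these inputs the per-type targets, weighted by the frame counts, add up to N times R, which is the property that equation (5) requires.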
[0065] Normalized VBV Fullness
[0066] As a prerequisite for deriving the GOP level bitrate control
equation, this section defines the concept of an actual and a
normalized VBV fullness and their relationship. This is based on
the observation that the difference between an expected actual VBV
fullness and the current actual VBV fullness has a component that
depends on the complexities, GOP structure, bitrate differences and
position in the GOP pattern, which is undesirable. Introducing the
concept of a normalized VBV fullness removes these
dependencies.
[0067] The normalized VBV fullness is defined as the number of bits
in the VBV if every frame had been allocated the average number of
bits per frame R, whereas the actual VBV fullness is based on
allocating bits according to (8). The actual VBV fullness for the
M'th frame (note that this M is not the I/P frame distance used in
defining GOP patterns) in a GOP can be expressed as:
$$E_{R,M} = E_{R,0} + M R_0 - \sum_{k=0}^{M-1} R_{ipb(k)} = E_{R,0} + M R_0 - \sum_{ipb} M_{ipb} R_{ipb} \qquad (9)$$
Here, E.sub.R,0 is the VBV fullness at the start of the GOP, R.sub.0
is the constant bitrate of the VBV buffer model (i.e. the bit_rate
parameter in the sequence header of the MPEG stream, converted from
bits per second to bits per frame), M.sub.ipb is the number of I,
P, and B frames, respectively, in the current GOP up to, but not
including the current (M'th) frame, and ipb(k) is the picture type
of the k'th frame. The normalized VBV fullness is simply:
$$\bar{E}_{R,M} = E_{R,0} + M (R_0 - R) \qquad (10)$$
It increases or decreases linearly over time and is constant if the
average bitrate matches the nominal bitrate of the stream.
[0068] Subtracting (9) from (10) allows conversion between actual
and normalized buffer fullness:
$$\bar{E}_{R,M} = E_{R,M} + \sum_{ipb} M_{ipb} R_{ipb} - M R \qquad (11)$$
Introducing the fraction of bits per GOP spent up to, but not
including, the M'th frame, .sigma..sub.M:
$$\sigma_M = \frac{\sum_{ipb} M_{ipb} R_{ipb}}{N R} \qquad (12)$$
and the normalized difference between the actual and normalized
allocation, .delta..sub.M:
$$\delta_M = \frac{\sum_{ipb} M_{ipb} R_{ipb} - M R}{N R} = \sigma_M - \frac{M}{N} \qquad (13)$$
equation (11) can be rewritten as:
$$\bar{E}_{R,M} = E_{R,M} + N R \delta_M \qquad (11a)$$
[0069] For the special case of R=R.sub.0 (nominal bitrate), and
E.sub.R.sub.0.sub.,0=E.sub.0 (nominal VBV fullness), equations (9)
and (10) become
$$E_{R_0,M} = E_0 + M R_0 - \sum_{ipb} M_{ipb} R_{ipb} = E_0 - N R_0 \delta_M \qquad (9a)$$
$$\bar{E}_{R_0,M} = E_0 \qquad (10a)$$
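As a non-limiting sketch, the actual and normalized VBV fullness of equations (9) and (10) and the conversion of equation (11) may be checked numerically as shown below. The function names and the sample frame sizes are hypothetical.

```python
def actual_fullness(e_start, r0, frame_bits):
    """Equation (9): start level, plus R_0 per frame, minus bits removed."""
    return e_start + len(frame_bits) * r0 - sum(frame_bits)


def normalized_fullness(e_start, r0, r_avg, m):
    """Equation (10): level if every frame had used the average R."""
    return e_start + m * (r0 - r_avg)


def to_normalized(e_actual, frame_bits, r_avg):
    """Equation (11): convert an actual level to a normalized level."""
    return e_actual + sum(frame_bits) - len(frame_bits) * r_avg


# Hypothetical sizes for the first three frames (I, B, B) of a GOP.
bits = [545_000.0, 136_000.0, 136_000.0]
e_act = actual_fullness(1_000_000.0, 200_000.0, bits)
e_norm = normalized_fullness(1_000_000.0, 200_000.0, 200_000.0, len(bits))
```

Here the actual level dips after the large I frame while the normalized level stays at the starting value, which illustrates the GOP position dependency that the normalization removes.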
[0070] GOP Level Rate Control Equation
[0071] GOP level rate control adjusts the average bitrate R to
ensure VBV compliance, which indirectly results in constant bitrate
operation. Essentially it changes R proportionally to the deviation
of the actual from the expected VBV fullness. (Note that this only
guarantees that there is no long term drift between VBV and encoder
but does not prevent temporary VBV underflow or overflow; this is
handled separately). The control equation is expressed as
follows:
$$\bar{E}_{R_0,M+N_t} = \bar{E}_{R,M+N_t} \qquad (14)$$
(i.e., the bitrate R is set such that the expected normalized VBV
fullness reaches the nominal normalized VBV fullness after N.sub.t
frames).
[0072] The remaining step is to convert (14) into an explicit
equation for R. Using (10a) and (10), (14) becomes:
$$E_0 = \bar{E}_{R,M} + N_t (R_0 - R) \qquad (14a)$$
Substituting (9a), (10a), and (11a) into (14a) results in:
$$E_{R_0,M} + N R_0 \delta_M = E_{R,M} + N R \delta_M + N_t (R_0 - R) \qquad (14b)$$
Solving (14b) for R:
[0073]
$$R = R_0 + \frac{E_{R,M} - E_{R_0,M}}{N_t - N \delta_M} \qquad (15)$$
[0074] As expected, the rate is adjusted proportionally to the
difference between current and expected VBV fullness. The term
-N.delta..sub.M in the denominator stems from the conversion from
actual to normalized VBV levels, removing GOP position dependencies
from the equation.
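For illustration, the GOP level control equation (15) together with the offset of equation (13) may be sketched as follows. The function names are hypothetical, and the check that the denominator is positive is an added assumption rather than something stated in the specification.

```python
def allocation_offset(m_counts, r_ipb, n, r_avg):
    """delta_M from equation (13).

    m_counts: frames of each type already coded in the current GOP,
    r_ipb: per-type bit targets, n: frames per GOP, r_avg: average R.
    """
    spent = sum(m_counts[t] * r_ipb[t] for t in m_counts)
    m = sum(m_counts.values())
    return (spent - m * r_avg) / (n * r_avg)


def adjusted_average_rate(r0, e_actual, e_nominal, n_t, n, delta_m):
    """Average bitrate R from equation (15), in bits per frame."""
    denom = n_t - n * delta_m
    if denom <= 0:
        # Assumption: treat a non-positive denominator as a usage error.
        raise ValueError("N_t - N * delta_M must be positive")
    return r0 + (e_actual - e_nominal) / denom


# Buffer is 100000 bits fuller than expected; recover over 30 frames.
r = adjusted_average_rate(200_000.0, 1_100_000.0, 1_000_000.0, 30, 15, 0.0)
```

When the actual fullness equals the expected fullness the function returns R.sub.0 unchanged; a fuller than expected buffer raises R so that more bits are spent.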
[0075] Picture Level Bit Allocation
[0076] At the GOP level the bit allocation for pictures is
determined by (4). As discussed below, the complexities X used in
this equation are a posteriori estimates optimized to provide an
accurate long term estimate of the bitrate versus mquant
relationship.
[0077] Bit allocation for the current picture is improved by using
a priori knowledge of its statistical properties provided by the
motion estimator. In addition, picture level bit allocation is
responsible for preventing VBV underflows.
[0078] Picture level bit allocation models the relationship between
the target mquant for the current picture, d, and target bit
allocation for the current picture, T, by an equation similar to
(3):
$$T = K_{ipb} \hat{X}' \frac{1}{d} \qquad (3a)$$
where {circumflex over (X)}' is the a priori knowledge based
normalized complexity of the current frame. Computation of
{circumflex over (X)}' is discussed below (cf. equation (32)); it
is based on L1 distances for the individual macroblocks, and local
complexity estimates for intra and non-intra macroblocks.
[0079] Having two different estimates for the complexity of the
current frame ({circumflex over (X)}.sub.ipb', the `typical`
complexity derived as a long-term average based on a posteriori
knowledge about previously coded frames, and {circumflex over
(X)}', the `actual` complexity based on a priori knowledge about
the current, not yet encoded frame) leads to a variety of possible
bit allocation schemes for the current frame. The two corner cases
are as follows:
[0080] mquant preserving mode: use the mquant as determined by GOP
level rate control (d=K.sub.ipb/c, cf. (2)); the resulting number
of bits may not match the number predicted by GOP level rate
control; this mode keeps quality constant but may cause significant
spikes in the allocation for frames that are more complex than
anticipated at the GOP level
[0081] bitrate preserving mode: try to encode the frame with a
number of bits as close as possible to the number of bits allocated
at the GOP level by adjusting the value of mquant; this mode
results in higher stability (no unpredicted excursions in the VBV
level), but may result in very large mquant values at scene changes
(resulting in noticeable blockiness) and unnecessarily low mquant
values for repeated frames (large mquant fluctuations for 3:2
pulldown material)
[0082] These corner cases, and all the intermediate ones, can be
described using the notion of an effective complexity X'' in (4) as
follows:
T=cX'' (4a)
Mquant preserving mode corresponds to setting X''={circumflex over
(X)}', while bitrate preserving mode corresponds to X''={circumflex
over (X)}.sub.ipb'.
[0083] One embodiment of the invention uses the following equation
to determine the effective complexity X'':
$$X'' = \begin{cases} X'_I & \text{scene change} \\ \min\left\{ \frac{X'_{ipb} + \hat{X}'}{2},\ \hat{X}' \right\} & \text{otherwise} \end{cases} \qquad (16)$$
In (16), X.sub.I' is the normalized complexity of I frames,
X.sub.ipb' is the normalized complexity of frames of the type of
the current frame (these are the same complexities as used by the
GOP level rate control), and {circumflex over (X)}' is the a priori
knowledge based normalized complexity of the current frame.
[0084] By default (16) uses the average of {circumflex over (X)}'
and X.sub.ipb' to achieve a compromise between the constant quality
of mquant preserving mode and the higher stability of bitrate
preserving mode. The default mode of (16) is augmented by several
experimentally determined heuristics that improve behavior at
certain highly non-stationary events as follows:
[0085] repeated frames (including dropped frames and 3:2 pulldown)
[0086] scene changes
[0087] Repeated frames coded as P or B pictures tend to have very
low complexity since they can be very accurately predicted from
their reference frame(s). With default mode bit allocation, too
many bits are allocated to these frames, and mquant drops to a very
low value. To avoid this behavior, (16) uses the minimum of
{circumflex over (X)}' and the average of {circumflex over (X)}'
and X.sub.ipb'. Whenever the (a priori) actual complexity of the
current frame is lower than the long term average complexity, (16)
goes into mquant preserving mode, reducing the number of allocated
bits below the one predicted at the GOP level.
[0088] P and B frames across scene changes are mostly coded using
intra macroblocks and their encoding behaves similarly to that of I
frames. Their complexity is usually much higher than that of
regular P and B frames. The default mode underestimates the
complexity of such a frame and therefore causes allocation of too
few bits at an undesirably high mquant. On the other hand, choosing
the obvious alternative, mquant preserving mode, can lead to
extremely high bit allocation. This happens on scene changes from a
low complexity to a high complexity scene because mquant then is
still based on complexity values from the previous scene. Instead,
(16) uses X.sub.I', the I frame complexity. This provides improved
performance based on the following:
[0089] 1. P and B frames across a scene change behave like an I
frame (mostly intra coded macroblocks)
[0090] 2. as discussed below, X.sub.I' is adjusted for every
picture (not just for I frames) based on the L1 variance of the
current frame, and therefore already takes the changed complexity
of the new scene into account
[0091] Experiments have confirmed that explicitly using the I frame
complexity X.sub.I' at scene changes results in an allocation that
avoids huge mquant spikes and also avoids bit allocations that are
much higher than the I frame bit allocation. Only if scene changes
are not properly detected (which happens when they occur
immediately before an I frame) are B frames encoded with a higher
than optimal mquant.
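The selection rule of equation (16) may be sketched, purely as an example, as follows. The function and argument names are hypothetical; the three inputs correspond to X.sub.I', X.sub.ipb' and {circumflex over (X)}'.

```python
def effective_complexity(x_i_norm, x_type_norm, x_apriori_norm, scene_change):
    """Effective complexity X'' following equation (16)."""
    if scene_change:
        # Across a scene change the frame is treated like an I frame.
        return x_i_norm
    # Default: average of long-term and a priori complexity, but never
    # more than the a priori value (repeated-frame heuristic).
    return min((x_type_norm + x_apriori_norm) / 2.0, x_apriori_norm)
```

For a repeated frame with a very small a priori complexity the minimum selects the a priori value, which corresponds to the mquant preserving behavior described above.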
[0092] VBV Compliance
[0093] Using the target bit allocation T given in equation (4a)
results in a bitstream that has constant average bitrate R.sub.0,
but does not guarantee VBV compliance, i.e. occasional VBV
underflows or overflows may occur. Therefore, T is adjusted based on
the restrictions imposed by the VBV model:
T'=max{T,T.sub.min}
T''=f.sub.lim(T',T.sub.max) (17)
T.sub.min is a lower boundary for the number of bits required to
avoid VBV overflow:
$$T_{min} = \lceil R_0 - (\mathit{vbv\_buffer\_size} - E_{R,M}) \rfloor \qquad (18)$$
Here R.sub.0 is the nominal bitrate, vbv_buffer_size the value
encoded in the sequence header, and E.sub.R,M the VBV fullness
before encoding the current frame. f.sub.lim is a soft limiter
defined by the following equation:
$$f_{lim}(x, x_{max}) = \begin{cases} x & x < \frac{x_{max}}{2} \\ \frac{x_{max}}{2} + \left(x - \frac{x_{max}}{2}\right) \frac{x_{max}}{2x} & x \geq \frac{x_{max}}{2} \end{cases} \qquad (19)$$
For large x, this function asymptotically converges to x.sub.max.
The final value for the target mquant is obtained by inserting T''
in (3a):
$$d'' = \frac{K_{ipb} \hat{X}'}{T''} \qquad (3b)$$
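One possible sketch of the VBV adjustment is given below. The function names are hypothetical. The sketch assumes that x.sub.max is positive and that T.sub.min, being described as a lower boundary, is applied by raising the target to at least that value; the bracket operation of equation (18) is not modeled, and T.sub.min and T.sub.max are simply passed in.

```python
def soft_limit(x, x_max):
    """Soft limiter f_lim of equation (19); assumes x_max > 0."""
    half = x_max / 2.0
    if x < half:
        return x
    # Continuous at x = x_max / 2 and approaching x_max for large x.
    return half + (x - half) * x_max / (2.0 * x)


def vbv_constrained_target(t, t_min, t_max):
    """Adjusted target T'' from a raw target T."""
    t1 = max(t, t_min)  # assumption: T_min acts as a lower bound
    return soft_limit(t1, t_max)


def target_mquant(k_ipb, x_apriori_norm, t2):
    """Target mquant d'' from equation (3b)."""
    return k_ipb * x_apriori_norm / t2
```

Targets below half of the maximum pass through unchanged, so the limiter only affects unusually large allocations.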
[0094] Macroblock Level Rate Control
[0095] Based on the target mquant d'', macroblock level rate
control determines the actual mquant for each macroblock in the
picture taking the following aspects into account:
[0096] psychovisual masking by local activity modulation
[0097] adaptation of mquant to meet target bit allocation (T'') by
using feedback
[0098] support of fractional mquant values by using dithering
[0099] Psychovisual Masking
[0100] A preprocessing stage computes the relative local activity
act.sub.mb of every macroblock as
$$\bar{u}(mb,b) = \frac{1}{64} \sum_{i,j=0}^{7} u_{i,j}(mb,b)$$
$$act'_{mb} = \min_{b=0}^{3} \sum_{i,j=0}^{7} \left| u_{i,j}(mb,b) - \bar{u}(mb,b) \right|$$
$$\overline{act'} = \frac{1}{n_{mb}} \sum_{mb=0}^{n_{mb}-1} act'_{mb}$$
$$act_{mb} = \frac{act'_{mb}}{\overline{act'}} \qquad (20)$$
Here u.sub.i,j(mb,b) is the pixel value of the i,j-th pixel in
block b of macroblock mb, {overscore (u)}(mb,b) is the average pixel
value of block b of macroblock mb, act.sub.mb' is the activity of
macroblock mb, {overscore (act')} is the average activity of the
picture, act.sub.mb is the relative activity of macroblock mb, and
n.sub.mb is the total number of macroblocks in the picture.
[0101] The relative activity is mapped to an activity scaling
factor .alpha..sub.act,mb using the following non-linear
relation:
$$\alpha_{act,mb} = \frac{m_{act}\, act_{mb} + 1}{act_{mb} + m_{act}} \qquad (21)$$
The parameter m.sub.act determines the degree of activity
modulation. mquant is multiplied with this scaling factor:
mquant.sub.mb'=.alpha..sub.act,mbd'' (22)
where d'' is the value from (3b).
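The activity computation of (20) and the modulation of (21) and (22) may be sketched as follows, by way of example only. The function names are hypothetical, the default value of m.sub.act is an arbitrary placeholder, and the handling of a picture with zero average activity is an added assumption.

```python
def block_activity(block):
    """Sum of absolute deviations from the mean for one block of pixels."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


def macroblock_activity(blocks):
    """act'_mb of (20): minimum activity over the blocks b = 0..3."""
    return min(block_activity(b) for b in blocks)


def activity_scale(act_rel, m_act):
    """Activity scaling factor of equation (21)."""
    return (m_act * act_rel + 1.0) / (act_rel + m_act)


def modulated_mquants(activities, d_target, m_act=2.0):
    """mquant'_mb of (22) for every macroblock; m_act=2.0 is a placeholder."""
    avg = sum(activities) / len(activities)
    if avg == 0:
        # Assumption: a completely flat picture is left unmodulated.
        return [d_target for _ in activities]
    return [activity_scale(a / avg, m_act) * d_target for a in activities]
```

A macroblock of average activity has a relative activity of 1 and a scaling factor of 1, so its mquant equals d''; the factor ranges between 1/m.sub.act and m.sub.act.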
[0102] Macroblock Level Control Loop
[0103] In order to reduce the mismatch between the target bit
allocation T'' and the actual number of bits generated for the
current image, which is caused by the limited accuracy of the
complexity model (3), a control loop adjusts mquant at the
macroblock level based on the accumulated mismatch from the start
of the picture up to the current macroblock. This improves the rate
control stability. Too strong feedback, however, can result in
large spatial variations of mquant due to local complexity changes
in the image. The following control equation is used:
mquant.sub.mb''=mquant.sub.mb'+kmb(S.sub.mb-{circumflex over
(S)}.sub.mb) (23)
[0104] S.sub.mb is the number of generated bits up to, but not
including, macroblock number mb. {circumflex over (S)}.sub.mb is the
expected value of the same quantity. It is calculated as:
$$\hat{S}_{mb} = \frac{1}{d''} \sum_{n=0}^{mb-1} \hat{X}_n \qquad (24)$$
where {circumflex over (X)}.sub.n is the estimated macroblock
complexity of the n-th macroblock (cf. equation (33)). kmb
determines the loop gain of the first order loop. It is related to
nmb, the number of macroblocks the (linearized) system requires to
reduce a mismatch to 1/e of its original value (`time constant` of
the control loop) as follows:
$$\mathit{kmb} = \frac{d''\, n_{mb}}{T''\, \mathit{nmb}} \qquad (25)$$
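As an illustrative sketch, the macroblock level control loop of (23) to (25) may be written as follows. The function names and the sample numbers are hypothetical.

```python
def loop_gain(d_target, n_mb, t_target, time_constant_mb):
    """Loop gain kmb of equation (25); time_constant_mb plays the role of nmb."""
    return d_target * n_mb / (t_target * time_constant_mb)


def expected_bits(x_hat, mb, d_target):
    """Expected bits before macroblock mb, equation (24)."""
    return sum(x_hat[:mb]) / d_target


def macroblock_mquant(mquant_mod, k_mb, bits_so_far, expected_so_far):
    """Feedback-corrected mquant of equation (23)."""
    return mquant_mod + k_mb * (bits_so_far - expected_so_far)


# Hypothetical picture: 1000 macroblocks, 400000 target bits, d'' = 8,
# and a time constant of 100 macroblocks.
k = loop_gain(8.0, 1000, 400_000.0, 100)
q = macroblock_mquant(8.0, k, bits_so_far=5000.0, expected_so_far=4000.0)
```

In this example the encoder is 1000 bits over budget, so the corrected mquant is slightly above the modulated value; setting the gain to zero disables the loop.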
[0105] Fractional Mquant Support
[0106] The target mquant, d'', is a real valued number, while the
actual mquant used by the encoder is an integer. For small mquant,
rounding d'' to the nearest integer can result in a significant
mismatch in the generated number of bits. Usually, this mismatch is
compensated by the macroblock level control loop. If the latter is
deactivated (kmb=0), the mquant values are dithered to approximate
the real valued target value on average. A simple, one-dimensional,
1 tap error diffusion filter is used for this purpose.
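One way to realize a one-dimensional, one tap error diffusion is to carry the rounding error of each macroblock into the next one, as sketched below. This is an assumed realization, not necessarily the filter used in any embodiment; the function name and the clamping range of 1 to 31 are hypothetical choices.

```python
import math


def dither_mquant(targets, lo=1, hi=31):
    """Round real-valued mquant targets to integers with error carry-over.

    lo and hi are assumed limits for the integer mquant.
    """
    out = []
    err = 0.0
    for d in targets:
        wanted = d + err                      # add error left by the previous macroblock
        q = math.floor(wanted + 0.5)          # round to nearest, ties upward
        q = min(hi, max(lo, q))               # clamp to the assumed range
        err = wanted - q                      # single tap: pass the whole error on
        out.append(q)
    return out


# A constant target of 2.5 alternates between 3 and 2.
values = dither_mquant([2.5, 2.5, 2.5, 2.5])
```

The integer sequence averages to the real-valued target as long as no clamping occurs.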
[0107] Parameter Estimation
[0108] This section describes how various parameters used in the
rate control algorithm are estimated from the actual content of the
video sequence being encoded.
[0109] Global Complexities
[0110] X.sub.ipb, introduced in (3), is estimated from the
relationship between mquant and generated number of bits of
previously encoded pictures. At the end of each frame, the frame
complexity {tilde over (X)} of this frame is calculated as
follows:
$$\tilde{X} = \begin{cases} S\, d''\, \frac{n_{mb}}{n_{valid,mb}} & n_{valid,mb} > 0 \\ 0 & n_{valid,mb} = 0 \end{cases} \qquad (26)$$
S is the number of bits generated for the frame, d'' is the target
mquant from (3b), n.sub.mb is the total number of macroblocks in
the frame, and n.sub.valid,mb is the number of macroblocks in the
frame not encoded in `emergency quantization mode`. Emergency
quantization mode is entered if the number of bits in a partially
encoded frame exceeds a threshold that indicates potential VBV
buffer underflow. In this mode almost no bits are generated for the
remaining macroblocks (only DC/(0,0) coefficients are encoded),
independently of d''.
[0111] For P and B frames, {tilde over (X)} can vary noticeably
from frame to frame. It is highly dependent on the efficiency of
motion compensation, which in turn depends on the scene content. To
reduce the effect of content dependency, a scene-change adaptive
low-pass filter is applied to {tilde over (X)} to produce
X.sub.ipb:
X.sub.ipb,k=(1-.alpha..sub.sc,ipb)X.sub.ipb,k-1+.alpha..sub.sc,ipb{tilde
over (X)}, for ipb=P,B (27)
k denotes sequential numbers for frames of the same type.
.alpha..sub.sc,ipb depends on the picture type (P or B) and whether
or not a scene change was detected. .alpha..sub.sc,ipb is set
according to the following table:
TABLE-US-00001
.alpha..sub.sc,ipb   no scene change   scene change
P                    0.75              0.5
B                    0.5               0.25
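The frame complexity of (26) and the scene-change adaptive filter of (27) may be sketched as follows. The function and constant names are hypothetical; the filter coefficients are those of the table above.

```python
# Filter coefficients per picture type: (no scene change, scene change).
ALPHA_SC = {
    "P": (0.75, 0.5),
    "B": (0.5, 0.25),
}


def frame_complexity(bits, d_target, n_mb, n_valid_mb):
    """Frame complexity of equation (26)."""
    if n_valid_mb == 0:
        return 0.0
    return bits * d_target * n_mb / n_valid_mb


def update_global_complexity(x_prev, x_frame, picture_type, scene_change):
    """Low-pass filtered global complexity X_ipb of equation (27)."""
    alpha = ALPHA_SC[picture_type][1 if scene_change else 0]
    return (1.0 - alpha) * x_prev + alpha * x_frame
```

For example, a P frame without a scene change moves the stored complexity three quarters of the way toward the newly measured value.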
[0112] The same scheme could be applied to I frames as well. There
are two drawbacks, however. First of all, the current scene
detection scheme does not work for I frames (it is based on the
intra vs. non-intra macroblock ratio). This would result in a
non-adaptive .alpha. with a value close to 1.0. Secondly, I frames
can be spaced considerably far apart (e.g. 15 frames) resulting in
long intervals without new estimates for X.sub.I. This is
undesirable because X.sub.I not only affects bit allocation for I
frames but indirectly also the allocation of P and B frames (i.e.
an increased X.sub.I reduces the number of bits allocated to P and B
frames in anticipation of higher allocation requirements for the
next I frame). Therefore an updated X.sub.I is provided for every
frame. To this end, the global I frame complexity is modeled as
X.sub.I=X.sub.0P.sub.intra (28)
where X.sub.0 is a constant and P.sub.intra is the total intra
energy (or power) of the frame. P.sub.intra is calculated as
$$P_{intra} = \sum_{mb=0}^{n_{mb}-1} p_{intra,mb} \qquad (29)$$
p.sub.intra,mb is the intra energy of macroblock number mb as
defined in (34) below. Note that p.sub.intra,mb is calculated at
the same time as act.sub.mb' (cf. (20)) without significant
additional computational overhead.
[0113] An estimate for X.sub.0 is obtained from the most recent I
frame k:
$$\hat{X}_{0,k} = \frac{\tilde{X}_k}{P_{intra,k}} \qquad (30)$$
with {tilde over (X)} from (26) and P.sub.intra from (29). The
index k denotes that these values are those of the k-th I frame.
For all frames m between the k-th (inclusive) and k+1-th
(exclusive) I frame, X.sub.I,m is calculated from (28):
X.sub.I,m={circumflex over (X)}.sub.0,kP.sub.intra,m (31)
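Equations (30) and (31) together amount to scaling the complexity of the most recent I frame by the ratio of intra energies, which may be sketched as follows. The function names and sample values are hypothetical, and a non-zero intra energy for the I frame is assumed.

```python
def estimate_x0(x_frame_i, p_intra_i):
    """Estimate of X_0 from the most recent I frame, equation (30).

    Assumes p_intra_i is non-zero.
    """
    return x_frame_i / p_intra_i


def i_frame_complexity(x0_hat, p_intra_current):
    """Per-frame I complexity X_I,m of equation (31)."""
    return x0_hat * p_intra_current


# Hypothetical values: the current frame has a third more intra energy
# than the last I frame, so X_I grows by the same factor.
x0 = estimate_x0(900_000.0, 300_000.0)
x_i = i_frame_complexity(x0, 400_000.0)
```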
[0114] A-Priori Complexity
[0115] The normalized a-priori complexity for the current frame
{circumflex over (X)}' used in (3a) ff. is obtained from a-priori
knowledge of the current frame before actually encoding it, in
contrast to the `a-posteriori` global complexity described in the
previous section which is derived from values available only after
actually encoding the frame.
$$\hat{X}' = \frac{1}{K_{ipb}} \sum_{mb=0}^{n_{mb}-1} \hat{X}_{mb} \qquad (32)$$
{circumflex over (X)}.sub.mb is a macroblock complexity estimate
which depends on the coding type of the macroblock:
$$\hat{X}_{mb} = \begin{cases}
\frac{x_{intra}\, p_{intra,mb}}{\alpha_{act,mb}} & \text{intra coded macroblocks (I, P, B)} \\
\frac{x_{nonintra}\, p_{zeromv,mb}}{\alpha_{act,mb}} & \text{zero motion vector macroblocks (P)} \\
\frac{x_{nonintra,p}\, p_{nonintra,mb}}{\alpha_{act,mb}} & \text{non-intra coded macroblocks (P)} \\
\frac{x_{nonintra,b}\, p_{nonintra,mb}}{\alpha_{act,mb}} & \text{non-intra coded macroblocks (B)}
\end{cases} \qquad (33)$$
.alpha..sub.act,mb from (21) in the denominator of (33) accounts
for the mquant modulation in (22). x.sub.intra, x.sub.nonintra,p
and x.sub.nonintra,b are the macroblock complexities for intra
coded macroblocks, non-intra coded macroblocks in P frames, and
non-intra coded macroblocks in B frames, respectively.
p.sub.intra,mb, p.sub.zeromv,mb and p.sub.nonintra,mb are the
macroblock energies (or power) of intra coded, zero-motion vector
coded, and non-intra coded macroblocks, respectively:
$$p_{intra,mb} = \sum_{b=0}^{3} \sum_{i,j=0}^{7} \left| u_{i,j}(mb,b) - \bar{u}(mb,b) \right|$$
$$p_{zeromv,mb} = \sum_{b=0}^{3} \sum_{i,j=0}^{7} \left| v_{0,i,j}(mb,b) \right|$$
$$p_{nonintra,mb} = \sum_{b=0}^{3} \sum_{i,j=0}^{7} \left| v_{i,j}(mb,b) \right| \qquad (34)$$
Here u.sub.i,j(mb,b), v.sub.0,i,j(mb,b), and v.sub.i,j(mb,b) are
the pixel value, the zero motion vector prediction error, and the
motion-compensated prediction error of the i,j-th pixel in block b
of macroblock mb, respectively. {overscore (u)}(mb,b) is the average
pixel value of block b of macroblock mb, defined in (20).
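The energies of (34) and the a priori complexity of (32) and (33) may be sketched as follows, as an example only. The function names are hypothetical; blocks are passed as nested lists of either pixel values (with the mean removed) or prediction errors (without).

```python
def l1_energy(blocks, subtract_mean=False):
    """Sum of absolute values over all blocks of a macroblock, as in (34).

    subtract_mean=True removes each block's mean first (intra energy);
    subtract_mean=False sums the values as given (prediction errors).
    """
    total = 0.0
    for block in blocks:
        values = [v for row in block for v in row]
        mean = sum(values) / len(values) if subtract_mean else 0.0
        total += sum(abs(v - mean) for v in values)
    return total


def macroblock_complexity(x_type, energy, alpha_act):
    """One branch of equation (33): x * p / alpha_act."""
    return x_type * energy / alpha_act


def apriori_complexity(mb_complexities, k_ipb):
    """Normalized a priori frame complexity of equation (32)."""
    return sum(mb_complexities) / k_ipb
```

The caller selects x and the energy according to the coding type of each macroblock and then sums the results over the frame.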
[0116] Intra/Non-Intra Macroblock Complexities
[0117] x.sub.intra, x.sub.nonintra,p and x.sub.nonintra,b are
a-posteriori estimates of the complexity of macroblocks of a
particular type. They differ from the global complexities by being
normalized with the macroblock energy (similar to X.sub.0 in (30),
but at the macroblock level). The underlying model for the number
of bits generated for the current macroblock, s.sub.mb, is:
$$s_{mb} = \frac{x\, p_{mb}}{\mathit{mquant}''_{mb}} \qquad (35)$$
with x and p chosen according to the current macroblock coding type
and picture type.
[0118] Estimates for x.sub.intra, x.sub.nonintra,p and
x.sub.nonintra,b are obtained from previous macroblocks of the same
type.
$$x = \frac{\bar{s}_n}{\bar{p}_n}, \quad \text{with}$$
$$\bar{s}_n = (1 - \alpha_x)\, \bar{s}_{n-1} + \alpha_x (s_{mb} + s_0)$$
$$\bar{p}_n = (1 - \alpha_x)\, \bar{p}_{n-1} + \alpha_x \left( \frac{p_{mb}}{\mathit{mquant}''_{mb}} + p_0 \right) \qquad (36)$$
[0119] Equation (36) is evaluated independently for all 3 variants
of x (intra, nonintra,p, nonintra,b). {overscore (s)}.sub.n and
{overscore (p)}.sub.n are updated whenever a macroblock of matching
type has been encoded
(skipped macroblocks are excluded). x is recalculated before
starting a new picture. .alpha..sub.x determines the amount of
low-pass filtering. It is preferably set to 10.sup.-3. s.sub.0 and
p.sub.0 are constants that stabilize x in case of low bitrate/low
energy macroblocks. For x.sub.intra, s.sub.0 is preferably set to
75, and p.sub.0 is preferably set to 50, otherwise s.sub.0 is
preferably set to 50, and p.sub.0 is preferably set to 25. This
results in asymptotic values of 1.5 for x.sub.intra, and of 2.0 for
x.sub.nonintra,p and x.sub.nonintra,b. These constants have been
determined by experiment. Thus, other values may be substituted, if
desired, to obtain other results.
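The estimator of equation (36) may be sketched as a small class, one instance per macroblock type. The class and method names are hypothetical, and starting both filters at their offsets s.sub.0 and p.sub.0 is an assumption made so that the sketch begins at the asymptotic value.

```python
class ComplexityEstimator:
    """Running estimate of x for one macroblock type, equation (36)."""

    def __init__(self, s0, p0, alpha=1e-3):
        self.s0, self.p0, self.alpha = s0, p0, alpha
        self.s_bar = s0        # assumption: initial filter state
        self.p_bar = p0        # assumption: initial filter state
        self.x = s0 / p0

    def update(self, bits, energy, mquant):
        """Call after each encoded (non-skipped) macroblock of this type."""
        a = self.alpha
        self.s_bar = (1 - a) * self.s_bar + a * (bits + self.s0)
        self.p_bar = (1 - a) * self.p_bar + a * (energy / mquant + self.p0)

    def start_picture(self):
        """Recalculate x before starting a new picture."""
        self.x = self.s_bar / self.p_bar
        return self.x


intra = ComplexityEstimator(s0=75.0, p0=50.0)       # tends to 1.5 at low energy
nonintra_p = ComplexityEstimator(s0=50.0, p0=25.0)  # tends to 2.0 at low energy
```

With macroblocks that produce almost no bits and carry almost no energy, the ratio stays at s.sub.0/p.sub.0, which reproduces the asymptotic values of 1.5 and 2.0 stated above.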
[0120] It should be understood that the implementation of other
variations and modifications of the invention in its various
aspects will be apparent to those of ordinary skill in the art, and
that the invention is not limited by the specific embodiments
described. For example, the specific type of stream being encoded
may be varied. As another example, various aspects of the invention
may be implemented without implementing other aspects. It is
therefore contemplated to cover by the present invention, any and
all modifications, variations, or equivalents that fall within the
spirit and scope of the basic underlying principles disclosed and
claimed herein.
* * * * *