U.S. patent application number 11/988653 was published by the patent office on 2009-03-26 for a method and apparatus for weighted prediction for scalable video coding.
This patent application is currently assigned to Thomson Licensing. Invention is credited to Jill MacDonald Boyce, Purvin Bibhas Pandit, Peng Yin.
United States Patent Application: 20090080535
Kind Code: A1
Yin; Peng; et al.
March 26, 2009

Method and apparatus for weighted prediction for scalable video coding
Abstract
There are provided scalable video encoder and decoders, and
corresponding scalable video encoding and decoding methods. A
scalable video encoder includes an encoder for encoding a block in
an enhancement layer of a picture by applying a same weighting
parameter to an enhancement layer reference picture as that applied
to a lower layer reference picture used for encoding a block in a
lower layer of the picture. The block in the enhancement layer
corresponds to the block in the lower layer, and the enhancement
layer reference picture corresponds to the lower layer reference
picture. The scalable video decoder includes a decoder for decoding
a block in an enhancement layer of a picture by applying a same
weighting parameter to an enhancement layer reference picture as
that applied to a lower layer reference picture used for decoding a
block in a lower layer of the picture. The block in the enhancement
layer corresponds to the block in the lower layer, and the
enhancement layer reference picture corresponds to the lower layer
reference picture.
Inventors: Yin; Peng (West Windsor, NJ); Boyce; Jill MacDonald (Manalapan, NJ); Pandit; Purvin Bibhas (Franklin Park, NJ)
Correspondence Address: Robert D. Shedd; Thomson Licensing LLC, PO Box 5312, Princeton, NJ 08543-5312, US
Assignee: Thomson Licensing, Boulogne-Billancourt, FR
Family ID: 36969701
Appl. No.: 11/988653
Filed: May 19, 2006
PCT Filed: May 19, 2006
PCT No.: PCT/US2006/019520
371 Date: January 11, 2008
Related U.S. Patent Documents: Application Number 60701464, filed Jul 21, 2005
Current U.S. Class: 375/240.26; 375/E7.078
Current CPC Class: H04N 19/615 20141101; H04N 19/13 20141101; H04N 19/33 20141101; H04N 19/63 20141101; H04N 19/31 20141101; H04N 19/577 20141101; H04N 19/70 20141101; H04N 19/36 20141101; H04N 19/61 20141101; H04N 19/51 20141101
Class at Publication: 375/240.26; 375/E07.078
International Class: H04N 7/26 20060101 H04N007/26
Claims
1. An apparatus comprising: an encoder for encoding a block in an
enhancement layer of a picture by applying a same weighting
parameter to an enhancement layer reference picture as that applied
to a lower layer reference picture used for encoding a block in a
lower layer of the picture, wherein the block in the enhancement
layer corresponds to the block in the lower layer, and the
enhancement layer reference picture corresponds to the lower layer
reference picture.
2. The apparatus of claim 1, wherein said encoder encodes the block
in the enhancement layer by selecting between an explicit weighting
parameter mode and an implicit weighting parameter mode.
3. The apparatus of claim 1, wherein said encoder imposes a
constraint that the same weighting parameter is always applied to
the enhancement layer reference picture as that applied to the
lower layer reference picture, when the block in the enhancement
layer corresponds to the block in the lower layer, and the
enhancement layer reference picture corresponds to the lower layer
reference picture.
4. The apparatus of claim 3, wherein the constraint is defined as a
profile and/or a level constraint, and/or is signaled in a sequence
or picture parameter set.
5. The apparatus of claim 1, wherein said encoder adds a syntax in
a slice header, for a slice in the enhancement layer, to
selectively apply the same weighting parameter to the enhancement
layer reference picture or a different weighting parameter.
6. The apparatus of claim 1, wherein said encoder performs a
remapping of a pred_weight_table( ) syntax from the lower layer to
a pred_weight_table( ) syntax for the enhancement layer.
7. The apparatus of claim 6, wherein said encoder uses a picture
order count to remap weighting parameters from the lower layer to a
corresponding reference picture index in the enhancement layer.
8. The apparatus of claim 7, wherein the weighting parameters with
a smallest reference picture index are remapped first.
9. The apparatus of claim 6, wherein said encoder sets a
weighted_prediction_flag field to zero for a reference picture used
in the enhancement layer that is unavailable in the lower
layer.
10. The apparatus of claim 6, wherein said encoder sends, in a
slice header, weighting parameters for a reference picture index
corresponding to a reference picture used in the enhancement layer,
when the reference picture used in the enhancement layer is without
a match in the lower layer.
11. The apparatus of claim 6, wherein said encoder performs the
remapping on a slice basis when the picture has a same slice
partitioning in both the enhancement layer and the lower layer, and
said encoder performs the remapping on a macroblock basis when the
picture has a different slice partitioning in the enhancement layer
with respect to the lower layer.
12. The apparatus of claim 1, wherein said encoder performs a
remapping of a pred_weight_table( ) syntax from the lower layer to
a pred_weight_table( ) syntax for the enhancement layer, when said
encoder applies the same weighting parameter to the enhancement
layer reference picture as that applied to the lower layer
reference picture.
13. The apparatus of claim 1, wherein said encoder skips performing
weighting parameters estimation, when said encoder applies the same
weighting parameter to the enhancement layer reference picture as
that applied to the lower layer reference picture.
14. The apparatus of claim 1, wherein said encoder stores only one
set of weighting parameters for each reference picture index, when
said encoder applies the same weighting parameter to the
enhancement layer reference picture as that applied to the lower
layer reference picture.
15. The apparatus of claim 1, wherein said encoder estimates the
weighting parameters, when said encoder applies a different
weighting parameter or the enhancement layer is without the lower
layer.
16. A method for scalable video encoding, comprising: encoding a
block in an enhancement layer of a picture by applying a same
weighting parameter to an enhancement layer reference picture as
that applied to a lower layer reference picture used for encoding a
block in a lower layer of the picture, wherein the block in the
enhancement layer corresponds to the block in the lower layer, and
the enhancement layer reference picture corresponds to the lower
layer reference picture.
17. The method of claim 16, wherein said encoding step encodes the
block in the enhancement layer by selecting between an explicit
weighting parameter mode and an implicit weighting parameter
mode.
18. The method of claim 16, wherein said encoding step comprises
imposing a constraint that the same weighting parameter is always
applied to the enhancement layer reference picture as that applied
to the lower layer reference picture, when the block in the
enhancement layer corresponds to the block in the lower layer, and
the enhancement layer reference picture corresponds to the lower
layer reference picture.
19. The method of claim 18, wherein the constraint is defined as a
profile and/or a level constraint, and/or is signaled in a sequence
or picture parameter set.
20. The method of claim 16, wherein said encoding step comprises
adding a syntax in a slice header, for a slice in the enhancement
layer, to selectively apply the same weighting parameter to the
enhancement layer reference picture or a different weighting
parameter.
21. The method of claim 16, wherein said encoding step comprises
performing a remapping of a pred_weight_table( ) syntax from the
lower layer to a pred_weight_table( ) syntax for the enhancement
layer.
22. The method of claim 21, wherein said performing step uses a
picture order count to remap weighting parameters from the lower
layer to a corresponding reference picture index in the enhancement
layer.
23. The method of claim 22, wherein the weighting parameters with a
smallest reference picture index are remapped first.
24. The method of claim 21, wherein said encoding step comprises
setting a weighted_prediction_flag field to zero for a reference
picture used in the enhancement layer that is unavailable in the
lower layer.
25. The method of claim 21, wherein said encoding step comprises
sending, in a slice header, weighting parameters for a reference
picture index corresponding to a reference picture used in the
enhancement layer, when the reference picture used in the
enhancement layer is without a match in the lower layer.
26. The method of claim 21, wherein the remapping is performed on a
slice basis when the picture has a same slice partitioning in both
the enhancement layer and the lower layer, and said remapping step
is performed on a macroblock basis when the picture has a different
slice partitioning in the enhancement layer with respect to the
lower layer.
27. The method of claim 16, wherein said encoding step comprises
performing a remapping of a pred_weight_table( ) syntax from the
lower layer to a pred_weight_table( ) syntax for the enhancement
layer, when said encoding step applies the same weighting parameter
to the enhancement layer reference picture as that applied to the
lower layer reference picture.
28. The method of claim 16, wherein said encoding step comprises
skipping weighting parameters estimation, when said encoding step
applies the same weighting parameter to the enhancement layer
reference picture as that applied to the lower layer reference
picture.
29. The method of claim 16, wherein said encoding step comprises
storing only one set of weighting parameters for each reference
picture index, when said encoding step applies the same weighting
parameter to the enhancement layer reference picture as that
applied to the lower layer reference picture.
30. The method of claim 16, wherein said encoding step comprises
estimating the weighting parameters, when said encoding step
applies a different weighting parameter or the enhancement layer is
without the lower layer.
31. A video signal structure for scalable video encoding
comprising: a block encoded in an enhancement layer of a picture
generated by applying a same weighting parameter to an enhancement
layer reference picture as that applied to a lower layer reference
picture used for encoding a block in a lower layer of the picture,
wherein the block in the enhancement layer corresponds to the block
in the lower layer, and the enhancement layer reference picture
corresponds to the lower layer reference picture.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional
Application Ser. No. 60/701,464, filed Jul. 21, 2005 and entitled
"METHOD AND APPARATUS FOR WEIGHTED PREDICTION FOR SCALABLE VIDEO
CODING," which is incorporated by reference herein in its
entirety.
FIELD OF THE INVENTION
[0002] The present invention relates generally to video encoding
and decoding and, more particularly, to methods and apparatus for
weighted prediction for scalable video encoding and decoding.
BACKGROUND OF THE INVENTION
[0003] The International Organization for
Standardization/International Electrotechnical Commission (ISO/IEC)
Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video
Coding (AVC) standard/International Telecommunication Union,
Telecommunication Sector (ITU-T) H.264 standard (hereinafter the
"MPEG4/H.264 standard" or simply the "H.264 standard") is the first
international video coding standard to include a Weighted
Prediction (WP) tool. Weighted Prediction was adopted to improve
coding efficiency. The scalable video coding (SVC) standard,
developed as an amendment of the H.264 standard, also adopts
weighted prediction. However, the SVC standard does not explicitly
specify the relationship of weights among a base layer and its
enhancement layers.
[0004] Weighted Prediction is supported in the Main, Extended, and
High profiles of the H.264 standard. The use of WP is indicated in
the picture parameter set for P and SP slices using the
weighted_pred_flag field, and for B slices using the
weighted_bipred_idc field. There are two WP modes, an explicit mode
and an implicit mode. The explicit mode is supported in P, SP,
and B slices. The implicit mode is supported only in B slices.
[0005] A single weighting factor and offset are associated with
each reference picture index for each color component in each
slice. In explicit mode, these WP parameters may be coded in the
slice header. In implicit mode, these parameters are derived based
on the relative distance of the current picture and its reference
pictures.
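The implicit-mode derivation described above can be sketched in C. This follows, in simplified form, the H.264 DistScaleFactor computation from POC distances; the function name is illustrative, and the spec's special cases (long-term references, among others) are collapsed into a fallback to the default 32/32 weights, so this is a sketch rather than the full normative procedure:

```c
#include <stdlib.h>

static int clip3(int lo, int hi, int v) { return v < lo ? lo : v > hi ? hi : v; }

/* Derive implicit B-slice weights from POC distances: nothing is coded
 * in the slice header; the weights follow from where the current picture
 * sits between its two reference pictures. */
void implicit_weights(int poc_cur, int poc_ref0, int poc_ref1, int *w0, int *w1)
{
    int td = clip3(-128, 127, poc_ref1 - poc_ref0);  /* distance between refs */
    int tb = clip3(-128, 127, poc_cur - poc_ref0);   /* distance to list-0 ref */
    if (td == 0) { *w0 = 32; *w1 = 32; return; }     /* co-located refs: defaults */
    int tx  = (16384 + abs(td / 2)) / td;
    int dsf = clip3(-1024, 1023, (tb * tx + 32) >> 6);  /* DistScaleFactor */
    if ((dsf >> 2) < -64 || (dsf >> 2) > 128) { *w0 = 32; *w1 = 32; return; }
    *w1 = dsf >> 2;          /* list-1 weight */
    *w0 = 64 - *w1;          /* list-0 weight; weights sum to 64 */
}
```

A current picture equidistant from both references gets the default 32/32 split; a picture closer to its list-0 reference weights that reference more heavily.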
[0006] For each macroblock or macroblock partition, the weighting
parameters applied are based on a reference picture index (or
indices in the case of bi-prediction) of the current macroblock or
macroblock partition. The reference picture indices are either
coded in the bitstream or may be derived, e.g., for skipped or
direct mode macroblocks. The use of the reference picture index to
signal which weighting parameters to apply is bitrate efficient, as
compared to requiring a weighting parameter index in the bitstream,
since the reference picture index is already available based on the
other required bitstream fields.
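The mechanism in the two paragraphs above — one (weight, offset) pair per reference picture index, selected by the macroblock's ref_idx — can be sketched as follows. The struct and function names are hypothetical; the sample formula is the H.264-style uni-directional weighted prediction with a log2 weight denominator:

```c
/* One entry per reference picture index in the slice's weight table;
 * the macroblock's ref_idx selects the entry, so no separate weight
 * index needs to be coded in the bitstream. */
typedef struct { int weight; int offset; } WpParam;

static int clip255(int v) { return v < 0 ? 0 : v > 255 ? 255 : v; }

/* Apply explicit weighted prediction to one predicted sample
 * (logWD = log2 of the weight denominator, typically 5 or 6). */
int weighted_sample(int pred, WpParam wp, int logWD)
{
    if (logWD >= 1)
        return clip255(((pred * wp.weight + (1 << (logWD - 1))) >> logWD)
                       + wp.offset);
    return clip255(pred * wp.weight + wp.offset);
}
```

With weight 64 and offset 0 at logWD = 6 the sample passes through unchanged, which is the default (non-weighted) case.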
[0007] Many different methods of scalability have been widely
studied and standardized, including SNR scalability, spatial
scalability, temporal scalability, and fine grain scalability, in
scalability profiles of the MPEG-2 and H.264 standards, or are
currently being developed as an amendment of the H.264
standard.
[0008] For spatial, temporal and SNR scalability, a large degree of
inter-layer prediction is incorporated. Intra and inter macroblocks
can be predicted using the corresponding signals of previous
layers. Moreover, the motion description of each layer can be used
for a prediction of the motion description for following
enhancement layers. These techniques fall into three categories:
inter-layer intra texture prediction, inter-layer motion prediction
and inter-layer residue prediction.
[0009] In Joint Scalable Video Model (JSVM) 2.0, an enhancement
layer macroblock can exploit inter-layer prediction using scaled
base layer motion data, using either "BASE_LAYER_MODE" or
"QPEL_REFINEMENT_MODE", as in the case of dyadic spatial
scalability (where the spatial resolution changes by a factor of
two between layers). When inter-layer motion prediction is used,
the motion vector (including its reference picture index and
associated weighting parameters) of the corresponding (upsampled)
macroblock in the previous layer is used for motion prediction. If
the enhancement layer and its previous layer have different
pred_weight_table( ) values, different sets of weighting parameters
must be stored for the same reference picture in the enhancement
layer.
SUMMARY OF THE INVENTION
[0010] These and other drawbacks and disadvantages of the prior art
are addressed by the present invention, which is directed to
methods and apparatus for weighted prediction for scalable video
encoding and decoding.
[0011] According to an aspect of the present invention, there is
provided a scalable video encoder. The scalable video encoder
includes an encoder for encoding a block in an enhancement layer of
a picture by applying a same weighting parameter to an enhancement
layer reference picture as that applied to a lower layer reference
picture used for encoding a block in a lower layer of the picture.
The block in the enhancement layer corresponds to the block in the
lower layer, and the enhancement layer reference picture
corresponds to the lower layer reference picture.
[0012] According to another aspect of the present invention, there
is provided a method for scalable video encoding. The method
includes encoding a block in an enhancement layer of a picture by
applying a same weighting parameter to an enhancement layer
reference picture as that applied to a lower layer reference
picture used for encoding a block in a lower layer of the picture.
The block in the enhancement layer corresponds to the block in the
lower layer, and the enhancement layer reference picture
corresponds to the lower layer reference picture.
[0013] According to yet another aspect of the present invention,
there is provided a video signal structure for scalable video
encoding including a block encoded in an enhancement layer of a
picture generated by applying a same weighting parameter to an
enhancement layer reference picture as that applied to a lower
layer reference picture used for encoding a block in a lower layer
of the picture. The block in the enhancement layer corresponds to
the block in the lower layer, and the enhancement layer reference
picture corresponds to the lower layer reference picture.
[0014] These and other aspects, features and advantages of the
present invention will become apparent from the following detailed
description of exemplary embodiments, which is to be read in
connection with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The present invention may be better understood in accordance
with the following exemplary figures, in which:
[0016] FIG. 1 shows a block diagram for an exemplary Joint Scalable
Video Model (JSVM) 2.0 encoder to which the present principles may
be applied;
[0017] FIG. 2 shows a block diagram for an exemplary decoder to
which the present principles may be applied;
[0018] FIG. 3 is a flow diagram for an exemplary method for
scalable video encoding of an image block using weighted prediction
in accordance with an exemplary embodiment of the present
principles;
[0019] FIG. 4 is a flow diagram for an exemplary method for
scalable video decoding of an image block using weighted prediction
in accordance with an exemplary embodiment of the present
principles;
[0020] FIG. 5 is a flow diagram for an exemplary method for
decoding level_idc and profile_idc syntaxes in accordance with an
exemplary embodiment of the present principles; and
[0021] FIG. 6 is a flow diagram for an exemplary method for
decoding a weighted prediction constraint for an enhancement layer
in accordance with an exemplary embodiment of the present
principles.
DETAILED DESCRIPTION
[0022] The present invention is directed to methods and apparatus
for weighted prediction for scalable video encoding and
decoding.
[0023] In accordance with the principles of the present invention,
methods and apparatus are disclosed which re-use the base layer
weighting parameters for enhancement layer weighted prediction.
Advantageously, embodiments in accordance with the present
principles can save on memory and/or complexity for both the
encoder and decoder. Moreover, embodiments in accordance with the
present principles can also save bits at very low bitrates.
[0024] The present description illustrates the principles of the
present invention. It will thus be appreciated that those skilled
in the art will be able to devise various arrangements that,
although not explicitly described or shown herein, embody the
principles of the invention and are included within its spirit and
scope.
[0025] All examples and conditional language recited herein are
intended for pedagogical purposes to aid the reader in
understanding the principles of the invention and the concepts
contributed by the inventor to furthering the art, and are to be
construed as being without limitation to such specifically recited
examples and conditions.
[0026] Moreover, all statements herein reciting principles,
aspects, and embodiments of the invention, as well as specific
examples thereof, are intended to encompass both structural and
functional equivalents thereof. Additionally, it is intended that
such equivalents include both currently known equivalents as well
as equivalents developed in the future, i.e., any elements
developed that perform the same function, regardless of
structure.
[0027] Thus, for example, it will be appreciated by those skilled
in the art that the block diagrams presented herein represent
conceptual views of illustrative circuitry embodying the principles
of the invention. Similarly, it will be appreciated that any flow
charts, flow diagrams, state transition diagrams, pseudocode, and
the like represent various processes which may be substantially
represented in computer readable media and so executed by a
computer or processor, whether or not such computer or processor is
explicitly shown.
[0028] The functions of the various elements shown in the figures
may be provided through the use of dedicated hardware as well as
hardware capable of executing software in association with
appropriate software. When provided by a processor, the functions
may be provided by a single dedicated processor, by a single shared
processor, or by a plurality of individual processors, some of
which may be shared. Moreover, explicit use of the term "processor"
or "controller" should not be construed to refer exclusively to
hardware capable of executing software, and may implicitly include,
without limitation, digital signal processor ("DSP") hardware,
read-only memory ("ROM") for storing software, random access memory
("RAM"), and non-volatile storage.
[0029] Other hardware, conventional and/or custom, may also be
included. Similarly, any switches shown in the figures are
conceptual only. Their function may be carried out through the
operation of program logic, through dedicated logic, through the
interaction of program control and dedicated logic, or even
manually, the particular technique being selectable by the
implementer as more specifically understood from the context.
[0030] In the claims hereof, any element expressed as a means for
performing a specified function is intended to encompass any way of
performing that function including, for example, a) a combination
of circuit elements that performs that function or b) software in
any form, including, therefore, firmware, microcode or the like,
combined with appropriate circuitry for executing that software to
perform the function. The invention as defined by such claims
resides in the fact that the functionalities provided by the
various recited means are combined and brought together in the
manner which the claims call for. It is thus regarded that any
means that can provide those functionalities are equivalent to
those shown herein.
[0031] In accordance with embodiments of the present principles, a
method and apparatus are disclosed which re-use the base layer
weighting parameters for the enhancement layer. Since the base
layer is simply the downsampled version of the enhancement layer,
it is beneficial if the enhancement layer and the base layer have
the same set of weighting parameters for the same reference
picture.
[0032] In addition, other advantages/features are provided by the
present principles. One advantage/feature is that only one set of
weighting parameters needs to be stored for each enhancement layer,
which can save memory usage. In addition, when inter-layer motion
prediction is used, the decoder needs to know which set of
weighting parameters is used. A look up table may be utilized to
store the necessary information.
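The look-up described above — matching base-layer weighting parameters to the enhancement layer's reference indices — can be sketched as follows. The structures, default values, and function name are hypothetical; the POC matching and the smallest-reference-index-first order are taken from the claims (claims 7 and 8), and unmatched references are flagged so their weights can be signaled explicitly instead (claim 10):

```c
/* Base-layer weight table entries keyed by picture order count (POC),
 * and the remapped per-index entries for the enhancement layer. */
typedef struct { int poc; int weight; int offset; } BaseWpEntry;
typedef struct { int weight; int offset; int from_base; } EnhWpEntry;

/* Remap base-layer weighting parameters onto the enhancement layer's
 * reference list, smallest reference index first. Enhancement-layer
 * references without a POC match in the base layer keep default
 * weights and are marked from_base = 0. */
void remap_pred_weight_table(const BaseWpEntry *base, int n_base,
                             const int *enh_ref_poc, int n_enh,
                             EnhWpEntry *enh)
{
    for (int i = 0; i < n_enh; i++) {   /* ascending reference index */
        enh[i].weight = 64;              /* default weight (logWD = 6) */
        enh[i].offset = 0;
        enh[i].from_base = 0;
        for (int j = 0; j < n_base; j++) {
            if (base[j].poc == enh_ref_poc[i]) {
                enh[i].weight = base[j].weight;
                enh[i].offset = base[j].offset;
                enh[i].from_base = 1;
                break;
            }
        }
    }
}
```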
[0033] Another advantage/feature is a reduction in complexity at
both the encoder and decoder. At the decoder, embodiments of the
present principles can reduce the complexity of parsing and of the
table lookup needed to locate the right set of weighting
parameters. At the encoder, embodiments of the present principles
can reduce the complexity of running different algorithms and
making decisions for weighting parameter estimation. When an update
step is used and prediction weights are taken into consideration,
having multiple weighting parameters for the same reference picture
index makes the derivation of motion information in the
inverse-update step at the decoder and the update step at the
encoder more complicated.
[0034] Yet another advantage/feature is that, at very low bitrates,
embodiments of the present principles can also provide a slight
coding efficiency advantage, since weighting parameters are not
explicitly transmitted in the slice header for the enhancement
layer.
[0035] Turning to FIG. 1, an exemplary Joint Scalable Video Model
Version 2.0 (hereinafter "JSVM2.0") encoder to which the present
invention may be applied is indicated generally by the reference
numeral 100. The JSVM2.0 encoder 100 uses three spatial layers and
motion compensated temporal filtering. The JSVM encoder 100
includes a two-dimensional (2D) decimator 104, a 2D decimator 106,
and a motion compensated temporal filtering (MCTF) module 108, each
having an input for receiving video signal data 102.
[0036] An output of the 2D decimator 106 is connected in signal
communication with an input of a MCTF module 110. A first output of
the MCTF module 110 is connected in signal communication with an
input of a motion coder 112, and a second output of the MCTF module
110 is connected in signal communication with an input of a
prediction module 116. A first output of the motion coder 112 is
connected in signal communication with a first input of a
multiplexer 114. A second output of the motion coder 112 is
connected in signal communication with a first input of a motion
coder 124. A first output of the prediction module 116 is connected
in signal communication with an input of a spatial transformer 118.
An output of the spatial transformer 118 is connected in signal
communication with a second input of the multiplexer 114. A second
output of the prediction module 116 is connected in signal
communication with an input of an interpolator 120. An output of
the interpolator 120 is connected in signal communication with a first
input of a prediction module 122. A first output of the prediction
module 122 is connected in signal communication with an input of a
spatial transformer 126. An output of the spatial transformer 126
is connected in signal communication with the second input of the
multiplexer 114. A second output of the prediction module 122 is
connected in signal communication with an input of an interpolator
130. An output of the interpolator 130 is connected in signal
communication with a first input of a prediction module 134. An
output of the prediction module 134 is connected in signal
communication with an input of a spatial transformer 136. An output
of the spatial transformer 136 is connected in signal communication
with the second input of the multiplexer 114.
[0037] An output of the 2D decimator 104 is connected in signal
communication with an input of a MCTF module 128. A first output of
the MCTF module 128 is connected in signal communication with a
second input of the motion coder 124. A first output of the motion
coder 124 is connected in signal communication with the first input
of the multiplexer 114. A second output of the motion coder 124 is
connected in signal communication with a first input of a motion
coder 132. A second output of the MCTF module 128 is connected in
signal communication with a second input of the prediction module
122.
[0038] A first output of the MCTF module 108 is connected in signal
communication with a second input of the motion coder 132. An
output of the motion coder 132 is connected in signal communication
with the first input of the multiplexer 114. A second output of the
MCTF module 108 is connected in signal communication with a second
input of the prediction module 134. An output of the multiplexer
114 provides an output bitstream 138.
[0039] For each spatial layer, a motion compensated temporal
decomposition is performed. This decomposition provides temporal
scalability. Motion information from lower spatial layers can be
used for prediction of motion on the higher layers. For texture
encoding, spatial prediction between successive spatial layers can
be applied to remove redundancy. The residual signal resulting from
intra-prediction or motion compensated inter prediction is
transform coded. A quality base layer residual provides minimum
reconstruction quality at each spatial layer. This quality base
layer can be encoded into an H.264 standard compliant stream if no
inter-layer prediction is applied. For quality scalability, quality
enhancement layers are additionally encoded. These enhancement
layers can be chosen to either provide coarse or fine grain quality
(SNR) scalability.
[0040] Turning to FIG. 2, an exemplary scalable video decoder to
which the present invention may be applied is indicated generally
by the reference numeral 200. An input of a demultiplexer 202 is
available as an input to the scalable video decoder 200, for
receiving a scalable bitstream. A first output of the demultiplexer
202 is connected in signal communication with an input of a spatial
inverse transform SNR scalable entropy decoder 204. A first output
of the spatial inverse transform SNR scalable entropy decoder 204
is connected in signal communication with a first input of a
prediction module 206. An output of the prediction module 206 is
connected in signal communication with a first input of an inverse
MCTF module 208.
[0041] A second output of the spatial inverse transform SNR
scalable entropy decoder 204 is connected in signal communication
with a first input of a motion vector (MV) decoder 210. An output
of the MV decoder 210 is connected in signal communication with a
second input of the inverse MCTF module 208.
[0042] A second output of the demultiplexer 202 is connected in
signal communication with an input of a spatial inverse transform
SNR scalable entropy decoder 212. A first output of the spatial
inverse transform SNR scalable entropy decoder 212 is connected in
signal communication with a first input of a prediction module 214.
A first output of the prediction module 214 is connected in signal
communication with an input of an interpolation module 216. An
output of the interpolation module 216 is connected in signal
communication with a second input of the prediction module 206. A
second output of the prediction module 214 is connected in signal
communication with a first input of an inverse MCTF module 218.
[0043] A second output of the spatial inverse transform SNR
scalable entropy decoder 212 is connected in signal communication
with a first input of an MV decoder 220. A first output of the MV
decoder 220 is connected in signal communication with a second
input of the MV decoder 210. A second output of the MV decoder 220
is connected in signal communication with a second input of the
inverse MCTF module 218.
[0044] A third output of the demultiplexer 202 is connected in
signal communication with an input of a spatial inverse transform
SNR scalable entropy decoder 222. A first output of the spatial
inverse transform SNR scalable entropy decoder 222 is connected in
signal communication with an input of a prediction module 224. A
first output of the prediction module 224 is connected in signal
communication with an input of an interpolation module 226. An
output of the interpolation module 226 is connected in signal
communication with a second input of the prediction module 214.
[0045] A second output of the prediction module 224 is connected in
signal communication with a first input of an inverse MCTF module
228. A second output of the spatial inverse transform SNR scalable
entropy decoder 222 is connected in signal communication with an
input of an MV decoder 230. A first output of the MV decoder 230 is
connected in signal communication with a second input of the MV
decoder 220. A second output of the MV decoder 230 is connected in
signal communication with a second input of the inverse MCTF module
228.
[0046] An output of the inverse MCTF module 228 is available as an
output of the decoder 200, for outputting a layer 0 signal. An
output of the inverse MCTF module 218 is available as an output of
the decoder 200, for outputting a layer 1 signal. An output of the
inverse MCTF module 208 is available as an output of the decoder
200, for outputting a layer 2 signal.
[0047] In a first exemplary embodiment in accordance with the
present principles, new syntax is not used. In this first exemplary
embodiment, the enhancement layer re-uses the base layer weights.
The first exemplary embodiment may be implemented, e.g., as a
profile or level constraint. The requirement can also be indicated
in the sequence or picture parameter sets.
[0048] In a second exemplary embodiment in accordance with the
present principles, one syntax element,
base_pred_weight_table_flag, is introduced in the slice header
syntax in the scalable extension as shown in Table 1, so that the
encoder can adaptively select which mode is used for weighted
prediction on a slice basis. When base_pred_weight_table_flag is
not present, base_pred_weight_table_flag shall be inferred to be
equal to 0. When base_pred_weight_table_flag is equal to 1, this
indicates that the enhancement layer re-uses pred_weight_table( )
from its previous layer.
[0049] Table 1 illustrates syntax for weighted prediction for
scalable video coding.
TABLE 1

slice_header_in_scalable_extension( ) {                               C    Descriptor
    first_mb_in_slice                                                 2    ue(v)
    slice_type                                                        2    ue(v)
    pic_parameter_set_id                                              2    ue(v)
    if( slice_type = = PR ) {
        num_mbs_in_slice_minus1                                       2    ue(v)
        luma_chroma_sep_flag                                          2    u(1)
    }
    frame_num                                                         2    u(v)
    if( !frame_mbs_only_flag ) {
        field_pic_flag                                                2    u(1)
        if( field_pic_flag )
            bottom_field_flag                                         2    u(1)
    }
    if( nal_unit_type = = 21 )
        idr_pic_id                                                    2    ue(v)
    if( pic_order_cnt_type = = 0 ) {
        pic_order_cnt_lsb                                             2    u(v)
        if( pic_order_present_flag && !field_pic_flag )
            delta_pic_order_cnt_bottom                                2    se(v)
    }
    if( pic_order_cnt_type = = 1 && !delta_pic_order_always_zero_flag ) {
        delta_pic_order_cnt[ 0 ]                                      2    se(v)
        if( pic_order_present_flag && !field_pic_flag )
            delta_pic_order_cnt[ 1 ]                                  2    se(v)
    }
    if( slice_type != PR ) {
        if( redundant_pic_cnt_present_flag )
            redundant_pic_cnt                                         2    ue(v)
        if( slice_type = = EB )
            direct_spatial_mv_pred_flag                               2    u(1)
        key_picture_flag                                              2    u(1)
        decomposition_stages                                          2    ue(v)
        base_id_plus1                                                 2    ue(v)
        if( base_id_plus1 != 0 ) {
            adaptive_prediction_flag                                  2    u(1)
        }
        if( slice_type = = EP || slice_type = = EB ) {
            num_ref_idx_active_override_flag                          2    u(1)
            if( num_ref_idx_active_override_flag ) {
                num_ref_idx_l0_active_minus1                          2    ue(v)
                if( slice_type = = EB )
                    num_ref_idx_l1_active_minus1                      2    ue(v)
            }
        }
        ref_pic_list_reordering( )                                    2
        for( decLvl = temporal_level; decLvl < decomposition_stages; decLvl++ ) {
            num_ref_idx_update_l0_active[ decLvl + 1 ]                2    ue(v)
            num_ref_idx_update_l1_active[ decLvl + 1 ]                2    ue(v)
        }
        if( ( weighted_pred_flag && slice_type = = EP ) ||
            ( weighted_bipred_idc = = 1 && slice_type = = EB ) ) {
            if( ( base_id_plus1 != 0 ) && ( adaptive_prediction_flag = = 1 ) )
                base_pred_weight_table_flag                           2    u(1)
            if( base_pred_weight_table_flag = = 0 )
                pred_weight_table( )                                  2
        }
        if( nal_ref_idc != 0 )
            dec_ref_pic_marking( )                                    2
        if( entropy_coding_mode_flag && slice_type != EI )
            cabac_init_idc                                            2    ue(v)
    }
    slice_qp_delta                                                    2    se(v)
    if( deblocking_filter_control_present_flag ) {
        disable_deblocking_filter_idc                                 2    ue(v)
        if( disable_deblocking_filter_idc != 1 ) {
            slice_alpha_c0_offset_div2                                2    se(v)
            slice_beta_offset_div2                                    2    se(v)
        }
    }
    if( slice_type != PR )
        if( num_slice_groups_minus1 > 0 && slice_group_map_type >= 3 &&
            slice_group_map_type <= 5 )
            slice_group_change_cycle                                  2    u(v)
    if( slice_type != PR && extended_spatial_scalability > 0 ) {
        if( chroma_format_idc > 0 ) {
            base_chroma_phase_x_plus1                                 2    u(2)
            base_chroma_phase_y_plus1                                 2    u(2)
        }
        if( extended_spatial_scalability = = 2 ) {
            scaled_base_left_offset                                   2    se(v)
            scaled_base_top_offset                                    2    se(v)
            scaled_base_right_offset                                  2    se(v)
            scaled_base_bottom_offset                                 2    se(v)
        }
    }
    SpatialScalabilityType = spatial_scalability_type( )
}
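For illustration only, the presence condition for base_pred_weight_table_flag in Table 1 can be sketched as a decoder-side parsing rule. Python is used here purely as executable pseudocode; read_u1 is a hypothetical stand-in for reading one bit from the bitstream, and the syntax element names follow the table:

```python
def parse_weight_table_flags(slice_type, weighted_pred_flag, weighted_bipred_idc,
                             base_id_plus1, adaptive_prediction_flag, read_u1):
    """Sketch of when base_pred_weight_table_flag is read (per Table 1).

    read_u1 is a hypothetical callback reading one u(1) bit.
    Returns (base_pred_weight_table_flag, pred_weight_table_present).
    """
    # Weighted prediction is active for EP slices (weighted_pred_flag) or
    # EB slices (weighted_bipred_idc equal to 1).
    wp_active = ((weighted_pred_flag and slice_type == "EP") or
                 (weighted_bipred_idc == 1 and slice_type == "EB"))
    if not wp_active:
        return 0, False
    flag = 0  # when not present, the flag is inferred to be equal to 0
    if base_id_plus1 != 0 and adaptive_prediction_flag == 1:
        flag = read_u1()
    # pred_weight_table( ) is sent explicitly only when the flag is 0
    return flag, (flag == 0)
```

Note that with this structure an enhancement layer that has no base layer (base_id_plus1 equal to 0) always falls back to an explicitly transmitted pred_weight_table( ).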
[0050] At the decoder, when the enhancement layer is to re-use the
weights from the base layer, a remapping of pred_weight_table( ) is
performed from the base (or previous) layer to pred_weight_table( )
in the current enhancement layer. This process is utilized for the
following cases: in a first case, the same reference picture index
in the base layer and the enhancement layer indicates a different
reference picture; or in a second case, the reference picture used
in the enhancement layer does not have a corresponding match in the
base layer. For the first case, the picture order count (POC)
number is used to map the weighting parameters from the base layer
to the right reference picture index in the enhancement layer. If
multiple weighting parameters are used in the base layer, the
weighting parameters with the smallest reference picture index are
preferably, but not necessarily, mapped first. For the second case,
it is presumed that base_pred_weight_table_flag is set to 0 for the
reference picture which is not available in the base layer.
The remapping of pred_weight_table( ) from the base (or previous)
layer to pred_weight_table( ) in the current enhancement layer is
derived as follows. The process is referred to as an inheritance
process for pred_weight_table( ). In particular, this inheritance
process is invoked when base_pred_weight_table_flag is equal to 1.
Outputs of this process are as follows:
[0051] luma_weight_LX[ ] (with X being 0 or 1)
[0052] luma_offset_LX[ ] (with X being 0 or 1)
[0053] chroma_weight_LX[ ] (with X being 0 or 1)
[0054] chroma_offset_LX[ ] (with X being 0 or 1)
[0055] luma_log2_weight_denom
[0056] chroma_log2_weight_denom
[0057] The derivation process for the base pictures is invoked with
basePic as output. For X being replaced by either 0 or 1, the
following applies:
[0058] Let base_luma_weight_LX[ ] be the value of the syntax element
luma_weight_LX[ ] of the base picture basePic.
[0059] Let base_luma_offset_LX[ ] be the value of the syntax element
luma_offset_LX[ ] of the base picture basePic.
[0060] Let base_chroma_weight_LX[ ] be the value of the syntax
element chroma_weight_LX[ ] of the base picture basePic.
[0061] Let base_chroma_offset_LX[ ] be the value of the syntax
element chroma_offset_LX[ ] of the base picture basePic.
[0062] Let base_luma_log2_weight_denom be the value of the syntax
element luma_log2_weight_denom of the base picture basePic.
[0063] Let base_chroma_log2_weight_denom be the value of the syntax
element chroma_log2_weight_denom of the base picture basePic.
[0064] Let BaseRefPicListX be the reference index list RefPicListX
of the base picture basePic.
[0065] For each reference index refIdxLX in the current slice
reference index list RefPicListX (loop from 0 to
num_ref_idx_lX_active_minus1), its associated weighting parameters
in the current slice are inherited as follows:
[0066] Let refPic be the picture that is referenced by refIdxLX.
[0067] Let refPicBase, the reference picture of the corresponding
base layer, be considered to exist if there is a picture for which
all of the following conditions are true:
[0068] The syntax element dependency_id for the picture refPicBase
is equal to the variable DependencyIdBase of the picture refPic.
[0069] The syntax element quality_level for the picture refPicBase
is equal to the variable QualityLevelBase of the picture refPic.
[0070] The syntax element fragment_order for the picture refPicBase
is equal to the variable FragmentOrderBase of the picture refPic.
[0071] The value of PicOrderCnt(refPic) is equal to the value of
PicOrderCnt(refPicBase).
[0072] There is an index baseRefIdxLX equal to the lowest valued
available reference index in the corresponding base layer reference
index list BaseRefPicListX that references refPicBase.
[0073] If a refPicBase was found to exist, the following applies:
[0074] baseRefIdxLX is marked as unavailable for subsequent steps
of the process.
[0074] luma_log2_weight_denom = base_luma_log2_weight_denom (1)
chroma_log2_weight_denom = base_chroma_log2_weight_denom (2)
luma_weight_LX[ refIdxLX ] = base_luma_weight_LX[ baseRefIdxLX ] (3)
luma_offset_LX[ refIdxLX ] = base_luma_offset_LX[ baseRefIdxLX ] (4)
chroma_weight_LX[ refIdxLX ][ 0 ] = base_chroma_weight_LX[ baseRefIdxLX ][ 0 ] (5)
chroma_offset_LX[ refIdxLX ][ 0 ] = base_chroma_offset_LX[ baseRefIdxLX ][ 0 ] (6)
chroma_weight_LX[ refIdxLX ][ 1 ] = base_chroma_weight_LX[ baseRefIdxLX ][ 1 ] (7)
chroma_offset_LX[ refIdxLX ][ 1 ] = base_chroma_offset_LX[ baseRefIdxLX ][ 1 ] (8)
[0075] Otherwise,
[0075] luma_log2_weight_denom = base_luma_log2_weight_denom (9)
chroma_log2_weight_denom = base_chroma_log2_weight_denom (10)
luma_weight_LX[ refIdxLX ] = 1 << luma_log2_weight_denom (11)
luma_offset_LX[ refIdxLX ] = 0 (12)
chroma_weight_LX[ refIdxLX ][ 0 ] = 1 << chroma_log2_weight_denom (13)
chroma_offset_LX[ refIdxLX ][ 0 ] = 0 (14)
chroma_weight_LX[ refIdxLX ][ 1 ] = 1 << chroma_log2_weight_denom (15)
chroma_offset_LX[ refIdxLX ][ 1 ] = 0 (16)
[0076] The following is one exemplary method to implement the
inheritance process:
for( baseRefIdxLX = 0; baseRefIdxLX <= base_num_ref_idx_lX_active_minus1; baseRefIdxLX++ )
    base_ref_avail[ baseRefIdxLX ] = 1
for( refIdxLX = 0; refIdxLX <= num_ref_idx_lX_active_minus1; refIdxLX++ ) {
    base_weights_avail_flag[ refIdxLX ] = 0
    for( baseRefIdxLX = 0; baseRefIdxLX <= base_num_ref_idx_lX_active_minus1; baseRefIdxLX++ ) {
        if( base_ref_avail[ baseRefIdxLX ] &&
            ( PicOrderCnt( RefPicListX[ refIdxLX ] ) = =
              PicOrderCnt( BaseRefPicListX[ baseRefIdxLX ] ) ) ) {
            apply equations (1) to (8)
            base_ref_avail[ baseRefIdxLX ] = 0
            base_weights_avail_flag[ refIdxLX ] = 1
            break;
        }
    }
    if( base_weights_avail_flag[ refIdxLX ] = = 0 ) {
        apply equations (9) to (16)
    }
}    (17)
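For illustration, the exemplary method above can be rendered as an executable sketch. Python is used here as executable pseudocode; the per-index weight records and the PicOrderCnt values are simplified stand-ins (plain lists of numbers and (weight, offset) pairs), not the normative data structures:

```python
def inherit_pred_weight_table(ref_poc, base_ref_poc, base_weights, luma_log2_wd):
    """Remap base-layer weights to enhancement-layer reference indices by POC.

    ref_poc[i]       -- PicOrderCnt of enhancement-layer reference index i
    base_ref_poc[j]  -- PicOrderCnt of base-layer reference index j
    base_weights[j]  -- simplified (luma_weight, luma_offset) pair for index j
    luma_log2_wd     -- luma_log2_weight_denom inherited from the base layer

    Enhancement indices without an available POC match get the default
    weights of equations (9)-(16): weight 1 << log2_weight_denom, offset 0.
    """
    base_ref_avail = [True] * len(base_ref_poc)
    out = []
    for poc in ref_poc:
        inherited = None
        for j, base_poc in enumerate(base_ref_poc):
            if base_ref_avail[j] and poc == base_poc:
                inherited = base_weights[j]   # equations (1)-(8)
                base_ref_avail[j] = False     # mark base index unavailable
                break
        if inherited is None:                 # equations (9)-(16)
            inherited = (1 << luma_log2_wd, 0)
        out.append(inherited)
    return out
```

Because each base reference index is marked unavailable once matched, two enhancement-layer indices with the same POC cannot both inherit the same base-layer entry, mirroring the availability flags in the pseudocode above.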
[0077] If the enhancement layer picture and the base layer picture
have the same slice partitioning, the remapping of
pred_weight_table( ) from the base (or lower) layer to
pred_weight_table( ) in the current enhancement layer can be
performed on a slice basis. However, if the enhancement layer and
the base layer have a different slice partitioning, the remapping
of pred_weight_table( ) from the base (or lower) layer to
pred_weight_table( ) in the current enhancement layer needs to be
performed on a macroblock basis. For example, when the base layer and
the enhancement layer have the same two slice partitions, the
inheritance process can be called once per slice. In contrast, if
the base layer has two partitions and the enhancement layer has
three partitions, then the inheritance process is called on a
macroblock basis.
[0078] Turning to FIG. 3, an exemplary method for scalable video
encoding of an image block using weighted prediction is indicated
generally by the reference numeral 300.
[0079] A start block 305 starts encoding a current enhancement
layer (EL) picture, and passes control to a decision block 310. The
decision block 310 determines whether or not a base layer (BL)
picture is present for the current EL picture. If so, then control
is passed to a function block 315. Otherwise, control is passed to
a function block 350.
[0080] The function block 315 obtains the weights from the BL
picture, and passes control to a function block 320. The function
block 320 remaps pred_weight_table( ) of the BL to
pred_weight_table( ) of the enhancement layer, and passes control
to a function block 325. The function block 325 sets
base_pred_weight_table_flag equal to true, and passes control to a
function block 330. The function block 330 weights the reference
picture with the obtained weights, and passes control to a function
block 335. The function block 335 writes
base_pred_weight_table_flag in the slice header, and passes control
to a decision block 340. The decision block 340 determines whether
or not the base_pred_weight_table_flag is equal to true. If so,
then control is passed to a function block 345. Otherwise, control
is passed to a function block 360.
[0081] The function block 350 calculates the weights for the EL
picture, and passes control to a function block 355. The function
block 355 sets base_pred_weight_table_flag equal to false, and
passes control to the function block 330.
[0082] The function block 345 encodes the EL picture using the
weighted reference picture, and passes control to an end block
365.
[0083] The function block 360 writes the weights in the slice
header, and passes control to the function block 345.
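The encoder-side decision of FIG. 3 reduces to a short sketch. Python is used as executable pseudocode; estimate_weights is a hypothetical callback standing in for the encoder's weight estimation, and weight remapping and entropy coding are elided:

```python
def choose_el_weights(bl_weights, estimate_weights):
    """Sketch of the FIG. 3 decision for an enhancement-layer (EL) slice.

    bl_weights       -- weights of the base-layer (BL) picture, or None
                        when no BL picture is present for the current EL picture
    estimate_weights -- hypothetical callback computing new EL weights
    Returns (weights, base_pred_weight_table_flag).
    """
    if bl_weights is not None:
        # Blocks 315-325: obtain the BL weights, remap pred_weight_table( )
        # of the BL to the EL, and set base_pred_weight_table_flag to true;
        # only the flag is written in the slice header (block 335).
        return bl_weights, 1
    # Blocks 350-355: no BL picture, so estimate weights for the EL picture
    # and set the flag to false; the weights themselves are written in the
    # slice header (block 360).
    return estimate_weights(), 0
```

This also shows the bit savings the flag buys: when the flag is 1, the weighting parameters never appear in the enhancement-layer slice header.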
[0084] Turning to FIG. 4, an exemplary method for scalable video
decoding of an image block using weighted prediction is indicated
generally by the reference numeral 400.
[0085] A start block 405 starts decoding a current enhancement
layer (EL) picture, and passes control to a function block 410. The
function block 410 parses base_pred_weight_table_flag in the slice
header, and passes control to a decision block 415. The decision
block 415 determines whether or not base_pred_weight_table_flag is
equal to one. If so, then control is passed to a function block
420. Otherwise, control is passed to a function block 435.
[0086] The function block 420 copies weights from the corresponding
base layer (BL) picture to the EL picture, and passes control to a
function block 425. The function block 425 remaps
pred_weight_table( ) of the BL picture to pred_weight_table( ) of
the EL picture, and passes control to a function block 430. The
function block 430 decodes the EL picture with the obtained
weights, and passes control to an end block 440.
[0087] The function block 435 parses the weighting parameters, and
passes control to the function block 430.
[0088] Turning to FIG. 5, an exemplary method for decoding
level_idc and profile_idc syntaxes is indicated generally by the
reference numeral 500.
[0089] A start block 505 passes control to a function block 510.
The function block 510 parses level_idc and profile_idc syntaxes,
and passes control to a function block 515. The function block 515
determines the weighted prediction constraint for the enhancement
layer based on the parsing performed by function block 510, and
passes control to an end block 520.
[0090] Turning to FIG. 6, an exemplary method for decoding a
weighted prediction constraint for an enhancement layer is
indicated generally by the reference numeral 600.
[0091] A start block 605 passes control to a function block 610.
The function block 610 parses syntax for weighted prediction for
the enhancement layer, and passes control to an end block 615.
[0092] A description will now be given of some of the many
attendant advantages/features of the present invention, some of
which have been mentioned above. For example, one advantage/feature
is a scalable video encoder that includes an encoder for encoding
a block in an enhancement layer of a picture by applying a same
weighting parameter to an enhancement layer reference picture as
that applied to a particular lower layer reference picture used for
encoding a block in a lower layer of the picture, wherein the block
in the enhancement layer corresponds to the block in the lower
layer, and the enhancement layer reference picture corresponds to
the particular lower layer reference picture. Another
advantage/feature is the scalable video encoder as described above,
wherein the encoder encodes the block in the enhancement layer by
selecting between an explicit weighting parameter mode and an
implicit weighting parameter mode. Yet another advantage/feature is
the scalable video encoder as described above, wherein the encoder
imposes a constraint that the same weighting parameter is always
applied to the enhancement layer reference picture as that applied
to the particular lower layer reference picture, when the block in
the enhancement layer corresponds to the block in the lower layer,
and the enhancement layer reference picture corresponds to the
particular lower layer reference picture. Moreover, another
advantage/feature is the scalable video encoder having the
constraint as described above, wherein the constraint is defined as
a profile or a level constraint, or is signaled in a sequence or
picture parameter set. Further, another advantage/feature is the
scalable video encoder as described above, wherein the encoder adds
a syntax in a slice header, for a slice in the enhancement layer,
to selectively apply the same weighting parameter to the
enhancement layer reference picture or a different weighting
parameter. Also, another advantage/feature is the scalable video
encoder as described above, wherein the encoder performs a
remapping of a pred_weight_table( ) syntax from the lower layer to
a pred_weight_table( ) syntax for the enhancement layer.
Additionally, another advantage/feature is the scalable video
encoder with the remapping as described above, wherein the encoder
uses a picture order count to remap weighting parameters from the
lower layer to a corresponding reference picture index in the
enhancement layer. Moreover, another advantage/feature is the
scalable video encoder with the remapping using the picture order
count as described above, wherein the weighting parameters with a
smallest reference picture index are remapped first. Further,
another advantage/feature is the scalable video encoder with the
remapping as described above, wherein the encoder sets a
base_pred_weight_table_flag field to zero for a reference picture used
in the enhancement layer that is unavailable in the lower layer.
Also, another advantage/feature is the scalable video encoder with
the remapping as described above, wherein the encoder sends, in a
slice header, weighting parameters for a reference picture index
corresponding to a reference picture used in the enhancement layer,
when the reference picture used in the enhancement layer is without
a match in the lower layer. Moreover, another advantage/feature is
the scalable video encoder with the remapping as described above,
wherein the encoder performs the remapping on a slice basis when
the picture has a same slice partitioning in both the enhancement
layer and the lower layer, and the encoder performs the remapping
on a macroblock basis when the picture has a different slice
partitioning in the enhancement layer with respect to the lower
layer. Further, another advantage/feature is the scalable video
encoder as described above, wherein the encoder performs a
remapping of a pred_weight_table( ) syntax from the lower layer to
a pred_weight_table( ) syntax for the enhancement layer, when the
encoder applies the same weighting parameter to the enhancement
layer reference picture as that applied to the particular lower
layer reference picture. Also, another advantage/feature is the
scalable video encoder as described above, wherein the encoder
skips performing weighting parameters estimation, when the encoder
applies the same weighting parameter to the enhancement layer
reference picture as that applied to the particular lower layer
reference picture. Additionally, another advantage/feature is the
scalable video encoder as described above, wherein the encoder
stores only one set of weighting parameters for each reference
picture index, when the encoder applies the same weighting
parameter to the enhancement layer reference picture as that
applied to the particular lower layer reference picture. Moreover,
another advantage/feature is the scalable video encoder as
described above, wherein the encoder estimates the weighting
parameters, when the encoder applies a different weighting
parameter or when the enhancement layer is without a corresponding
lower layer.
[0093] These and other features and advantages of the present
invention may be readily ascertained by one of ordinary skill in
the pertinent art based on the teachings herein. It is to be
understood that the teachings of the present invention may be
implemented in various forms of hardware, software, firmware,
special purpose processors, or combinations thereof.
[0094] Most preferably, the teachings of the present invention are
implemented as a combination of hardware and software. Moreover,
the software may be implemented as an application program tangibly
embodied on a program storage unit. The application program may be
uploaded to, and executed by, a machine comprising any suitable
architecture. Preferably, the machine is implemented on a computer
platform having hardware such as one or more central processing
units ("CPU"), a random access memory ("RAM"), and input/output
("I/O") interfaces. The computer platform may also include an
operating system and microinstruction code. The various processes
and functions described herein may be either part of the
microinstruction code or part of the application program, or any
combination thereof, which may be executed by a CPU. In addition,
various other peripheral units may be connected to the computer
platform such as an additional data storage unit and a printing
unit.
[0095] It is to be further understood that, because some of the
constituent system components and methods depicted in the
accompanying drawings are preferably implemented in software, the
actual connections between the system components or the process
function blocks may differ depending upon the manner in which the
present invention is programmed. Given the teachings herein, one of
ordinary skill in the pertinent art will be able to contemplate
these and similar implementations or configurations of the present
invention.
[0096] Although the illustrative embodiments have been described
herein with reference to the accompanying drawings, it is to be
understood that the present invention is not limited to those
precise embodiments, and that various changes and modifications may
be effected therein by one of ordinary skill in the pertinent art
without departing from the scope or spirit of the present
invention. All such changes and modifications are intended to be
included within the scope of the present invention as set forth in
the appended claims.
* * * * *