U.S. patent application number 13/659006 was filed with the patent office on 2013-05-02 for scalable video coding method and apparatus using inter prediction mode.
This patent application is currently assigned to INTELLECTUAL DISCOVERY CO., LTD. The applicants listed for this patent are Hyo Min Choi, Hyun Ho Jo, Jung Hak Nam, and Dong Gyu Sim. Invention is credited to Hyo Min Choi, Hyun Ho Jo, Jung Hak Nam, Dong Gyu Sim.
Application Number: 20130107962 13/659006
Family ID: 48172431
Filed Date: 2013-05-02

United States Patent Application 20130107962
Kind Code: A1
Sim; Dong Gyu; et al.
May 2, 2013

SCALABLE VIDEO CODING METHOD AND APPARATUS USING INTER PREDICTION MODE
Abstract
The present invention relates to a scalable video coding method
and apparatus using inter prediction mode. A decoding method
includes determining motion information prediction mode on a target
decoding block of an enhancement layer, predicting motion
information on the target decoding block of the enhancement layer
using motion information on the neighboring blocks of the
enhancement layer, if the determined motion information prediction
mode is a first mode, and predicting the motion information on the
target decoding block of the enhancement layer using motion
information on a corresponding block of a reference layer, if the
determined motion information prediction mode is a second mode.
Inventors: Sim; Dong Gyu; (Seoul, KR); Nam; Jung Hak; (Seoul, KR); Jo; Hyun Ho; (Seoul, KR); Choi; Hyo Min; (Seoul, KR)

Applicant:
Name            City   State  Country  Type
Sim; Dong Gyu   Seoul         KR
Nam; Jung Hak   Seoul         KR
Jo; Hyun Ho     Seoul         KR
Choi; Hyo Min   Seoul         KR

Assignee: INTELLECTUAL DISCOVERY CO., LTD. (Seoul, KR)
Family ID: 48172431
Appl. No.: 13/659006
Filed: October 24, 2012

Related U.S. Patent Documents

Application Number   Filing Date    Patent Number
61551442             Oct 26, 2011

Current U.S. Class: 375/240.16; 375/E7.263
Current CPC Class: H04N 19/105 20141101; H04N 19/52 20141101; H04N 19/176 20141101; H04N 19/159 20141101; H04N 19/30 20141101; H04N 19/53 20141101
Class at Publication: 375/240.16; 375/E07.263
International Class: H04N 7/36 20060101 H04N007/36

Foreign Application Data

Date          Code   Application Number
Dec 8, 2011   KR     10-2011-0131156
Claims
1. A scalable video decoding method based on multiple layers,
comprising: determining motion information prediction mode on a
target decoding block of an enhancement layer; predicting motion
information on the target decoding block of the enhancement layer
using motion information on neighboring blocks of the enhancement
layer, if the determined motion information prediction mode is a
first mode; and predicting the motion information on the target
decoding block of the enhancement layer using motion information on
a corresponding block of a reference layer, if the determined
motion information prediction mode is a second mode.
2. The scalable video decoding method of claim 1, wherein: the
motion information prediction mode is determined based on
information signaled by a coding apparatus, and the signaling
information comprises a flag indicating whether inter-layer inter
coding is performed or not.
3. The scalable video decoding method of claim 1, wherein
determining the motion information prediction mode is performed in
a Coding Unit (CU).
4. The scalable video decoding method of claim 1, wherein
predicting the motion information in the first mode comprises:
decoding a motion merging candidate index if the first mode is not
a skip mode; and selecting any one motion merging candidate from a
motion merging candidate list, in which the motion information on the neighboring blocks is combined, as a motion vector for the target decoding block of the enhancement layer using the decoded motion merging candidate index.
5. The scalable video decoding method of claim 1, wherein
predicting motion information in the first mode comprises: decoding
a motion prediction candidate index if the first mode is not a skip
mode; selecting any one motion prediction candidate from the motion
prediction candidate list, comprising the motion information on the
neighboring blocks, using the decoded motion prediction candidate
index; and generating a motion vector for the target decoding block
of the enhancement layer by merging a Motion Vector Difference
(MVD) with the selected motion prediction candidate.
6. The scalable video decoding method of claim 1, wherein
predicting the motion information in the second mode comprises
scaling the motion information on the corresponding block of the
reference layer according to a difference between resolutions of
the reference layer and the enhancement layer.
7. The scalable video decoding method of claim 6, wherein
predicting the motion information in the second mode comprises
generating a motion vector for the target decoding block of the
enhancement layer by merging an MVD with the scaled motion
information, if the second mode is a skip mode.
8. The scalable video decoding method of claim 1, wherein the
reference layer is a base layer.
9. A scalable video decoding apparatus based on multiple layers,
comprising: a first motion prediction module configured to predict
motion information on a target decoding block of an enhancement
layer using motion information on neighboring blocks; and a second
motion prediction module configured to predict motion information
on a target decoding block of the enhancement layer using motion
information on a corresponding block of a reference layer, wherein
any one of the first and the second motion prediction units is used
to predict the motion information on the target decoding block of
the enhancement layer according to motion information prediction
mode signaled by a coding apparatus.
10. The scalable video decoding apparatus of claim 9, wherein the
first motion prediction module comprises a motion merging module
for selecting any one motion merging candidate from a motion merging candidate list, in which the motion information on the neighboring blocks is combined, as a motion vector for the target decoding block of the enhancement layer using a motion merging candidate index, if the mode is not a skip mode.
11. The scalable video decoding apparatus of claim 9, wherein the
first motion prediction module comprises a motion vector prediction
module for selecting any one motion prediction candidate from a
motion prediction candidate list, comprising the motion information
on the neighboring blocks, using a motion prediction candidate
index, if the mode is not a skip mode, and configuring a motion vector
for the target decoding block of the enhancement layer by merging a
Motion Vector Difference (MVD) with the selected motion prediction
candidate.
12. The scalable video decoding apparatus of claim 9, wherein the
second motion prediction module comprises: a scaling unit for
scaling the motion information on the corresponding block of the
reference layer according to a difference between resolutions of
the reference layer and the enhancement layer; and a motion vector
generating unit for generating a motion vector for the target
decoding block of the enhancement layer by merging an MVD with the
scaled motion information.
Description
[0001] Priority is claimed to U.S. provisional application No. 61/551,442, filed on Oct. 26, 2011, and Korean patent application No. 10-2011-0131156, filed on Dec. 8, 2011, the entire disclosures of which are incorporated by reference herein.
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] The present invention relates to video processing technology
and, more particularly, to a scalable video coding method and
apparatus for coding/decoding a video.
[0004] 2. Discussion of the Related Art
[0005] As broadcasting services with High Definition (1280×720 or 1920×1080) resolution expand domestically and globally, many users have become accustomed to high-resolution, high-quality pictures, and many organizations are accordingly spurring the development of next-generation picture devices. Furthermore, as interest in Ultra High Definition (UHD), which has four times the resolution of HDTV, grows alongside HDTV, moving picture standardization organizations have recognized the need for compression technology for pictures of higher resolution and higher picture quality. There is also a need for a new standard that can provide the same picture quality as existing coding methods while offering substantial advantages in terms of frequency band and storage, through compression efficiency higher than that of H.264/Advanced Video Coding (AVC), the moving picture compression coding standard now used in HDTV and mobile phones. The Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) are jointly standardizing High Efficiency Video Coding (HEVC), the next-generation video codec. A key objective of HEVC is to code video, including UHD images, with compression efficiency twice that of H.264/AVC. HEVC can provide not only HD and UHD images but also high-quality images at a frequency lower than current frequencies, even over 3D broadcasting and mobile communication networks.
[0006] In HEVC, a prediction picture can be generated by performing
prediction on a picture spatially or temporally, and a difference
between an original picture and the predicted picture can be coded.
Picture coding efficiency can be improved by this prediction
coding.
SUMMARY OF THE INVENTION
[0007] An object of the present invention is to provide a scalable
video coding method and apparatus which can improve coding/decoding
efficiency.
[0008] In accordance with an embodiment of the present invention, a
scalable video decoding method includes determining motion
information prediction mode on a target decoding block of an
enhancement layer; predicting motion information on the target
decoding block of the enhancement layer using motion information on
the neighboring blocks of the enhancement layer, if the determined
motion information prediction mode is a first mode; and predicting
the motion information on the target decoding block of the
enhancement layer using motion information on a corresponding block
of a reference layer, if the determined motion information
prediction mode is a second mode.
[0009] In accordance with an embodiment of the present invention, a
scalable video decoding apparatus includes a first motion
prediction module configured to predict motion information on a
target decoding block of an enhancement layer using motion
information on neighboring blocks and a second motion prediction
module configured to predict motion information on a target
decoding block of the enhancement layer using motion information on
a corresponding block of a reference layer, wherein any one of the
first and the second motion prediction units is used to predict the
motion information on the target decoding block of the enhancement
layer according to motion information prediction mode signaled by a
coding apparatus.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The accompanying drawings, which are included to provide a further understanding of this document and are incorporated in and constitute a part of this specification, illustrate embodiments of this document and, together with the description, serve to explain the principles of this document.
[0011] FIG. 1 is a block diagram showing a configuration according
to an embodiment of a video coding apparatus to which the present
invention is applied;
[0012] FIG. 2 is a block diagram showing a configuration according to an embodiment of a video decoding apparatus to which the present invention is applied;
[0013] FIG. 3 is a conceptual diagram showing a concept of a
picture and a block which are used in an embodiment of the present
invention;
[0014] FIG. 4 is a conceptual diagram schematically showing an
embodiment of a scalable video coding structure based on multiple
layers;
[0015] FIG. 5 is a block diagram showing an embodiment of the
configuration of a motion compensation unit shown in FIG. 2;
[0016] FIG. 6 is a block diagram showing an embodiment of the
configuration of a second motion prediction module shown in FIG.
5;
[0017] FIG. 7 is a flowchart illustrating a scalable video coding
method in accordance with a first embodiment of the present
invention;
[0018] FIG. 8 is a diagram illustrating an embodiment of an
inter-layer inter coding method;
[0019] FIG. 9 is a flowchart illustrating a scalable video coding
method in accordance with a second embodiment of the present
invention; and
[0020] FIG. 10 is a flowchart illustrating a scalable video coding
method in accordance with a third embodiment of the present
invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0021] Hereinafter, exemplary embodiments are described in detail
with reference to the accompanying drawings. In describing the
embodiments of the present invention, a detailed description of the
known functions and constructions will be omitted if it is deemed
to make the gist of the present invention unnecessarily vague.
[0022] When it is said that one element is "connected" or "coupled"
to the other element, the one element may be directly connected or
coupled to the other element, but it should be understood that a
third element may exist between the two elements. Furthermore, in
the present invention, the contents describing that a specific
element is "included (or comprised)" does not mean that elements
other than the specific element are excluded, but means that
additional elements may be included in the implementation of the
present invention or in the scope of technical spirit of the
present invention.
[0023] Terms, such as the first and the second, may be used to
describe various elements, but the elements should not be
restricted by the terms. The terms are used to only distinguish one
element and the other element from each other. For example, a first
element may be named a second element without departing from the
scope of the present invention. Likewise, a second element may also
be named a first element.
[0024] Furthermore, the elements described in the embodiments of the present invention are shown independently in order to indicate different, characteristic functions; this does not mean that each element consists of a separate piece of hardware or a separate software unit. That is, the elements are separated for convenience of description, and at least two of the elements may be combined into one element, or one element may be divided into a plurality of elements that perform its functions. An embodiment in which the elements are combined, or in which an element is divided, is included in the scope of the present invention without departing from the essence of the present invention.
[0025] Furthermore, in the present invention, some elements may not be essential elements for performing essential functions but may be optional elements for merely improving performance. The present invention may be embodied using only the elements essential to implementing its essence, excluding the elements used merely to improve performance, and a structure including only the essential elements, excluding the optional elements used merely to improve performance, is also included in the scope of the present invention.
[0026] FIG. 1 is a block diagram showing a configuration according
to an embodiment of a video coding apparatus to which the present
invention is applied.
[0027] Referring to FIG. 1, the video coding apparatus 100 includes a motion prediction unit 111, a motion compensation unit 112, an intra prediction unit 120, a switch 115, a subtractor 125, a transform unit 130, a quantization unit 140, an entropy coding unit 150, an inverse quantization unit 160, an inverse transform unit 170, an adder 175, a filter unit 180, and a reference picture buffer 190.
[0028] The video coding apparatus 100 performs coding on an input video in intra mode or inter mode and outputs a bitstream. Intra prediction means intra-picture prediction, and inter prediction means inter-picture prediction. In the case of intra mode, the switch 115 is switched to intra mode, and in the case of inter mode, the switch 115 is switched to inter mode. The video coding apparatus 100 generates a prediction block for the input block of the input picture and codes the difference between the input block and the prediction block.
[0029] In the case of the intra mode, the intra prediction unit 120
generates the prediction block by performing spatial prediction
using the pixel values of coded neighboring blocks.
[0030] In the motion prediction process, in the case of inter mode, the motion prediction unit 111 searches a reference picture, stored in the reference picture buffer 190, for the region that best matches the input block and calculates a motion vector based on the retrieved reference picture. The motion compensation unit 112 generates a prediction block by performing motion compensation using the motion vector.
[0031] The subtractor 125 generates a residual block based on a
difference between the input block and the generated prediction
block. The transform unit 130 transforms the residual block and
outputs transform coefficients. Furthermore, the quantization unit
140 quantizes the input transform coefficients based on
quantization parameters and outputs quantized coefficients. The
entropy coding unit 150 performs entropy coding on the input
quantized coefficient based on a probability distribution and
outputs a bitstream.
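The forward path just described (subtract, transform, quantize) can be sketched as follows. This is a minimal editorial illustration, not the apparatus's actual implementation: the orthonormal 2-D DCT, the 4×4 block size, the uniform quantization step, and all function names are assumptions.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix, a common choice of block transform.
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def code_block(input_block, prediction_block, qstep):
    # Subtractor 125: residual = input block - prediction block.
    residual = input_block - prediction_block
    # Transform unit 130: separable 2-D transform of the residual block.
    d = dct_matrix(residual.shape[0])
    coeffs = d @ residual @ d.T
    # Quantization unit 140: uniform quantization by step size qstep
    # (a stand-in for quantization-parameter-driven scaling).
    return np.round(coeffs / qstep).astype(int)

input_block = np.full((4, 4), 120.0)
prediction = np.full((4, 4), 118.0)
quantized = code_block(input_block, prediction, qstep=1.0)
# A constant residual of 2 ends up entirely in the DC coefficient.
```

The entropy coding step is omitted; only the quantized coefficients it would consume are produced here.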
[0032] In HEVC, a current coded picture needs to be decoded and
stored in order to be used as a reference picture because inter
prediction coding, that is, inter-picture prediction coding, is
performed. Accordingly, a quantized coefficient is dequantized by
the inverse quantization (dequantization) unit 160 and inversely
transformed by the inverse transform unit 170. Dequantized and
inversely transformed coefficients are added to a prediction block
by the adder 175, so that a reconstruction block is generated.
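The inverse path (dequantize, inversely transform, add) can be sketched the same way; the orthonormal transform and uniform quantizer below are illustrative assumptions, and the unit numbers in the comments refer to FIG. 1.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix (illustrative block transform).
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def reconstruct(quantized, prediction_block, qstep):
    # Inverse quantization unit 160: scale the coefficients back by qstep.
    coeffs = quantized * float(qstep)
    # Inverse transform unit 170: inverse of the orthonormal 2-D transform.
    d = dct_matrix(quantized.shape[0])
    residual = d.T @ coeffs @ d
    # Adder 175: reconstruction block = prediction block + decoded residual.
    return prediction_block + residual

quantized = np.zeros((4, 4))
quantized[0, 0] = 8.0                 # DC-only residual, as an example
prediction = np.full((4, 4), 118.0)
recon = reconstruct(quantized, prediction, qstep=1.0)
# The DC coefficient 8 decodes to a constant residual of 2, giving 120.
```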
[0033] The reconstruction block is input to the filter unit 180.
The filter unit 180 may apply at least one of a deblocking filter,
a Sample Adaptive Offset (SAO), and an Adaptive Loop Filter (ALF)
to a reconstruction block or a reconstructed picture. The filter
unit 180 may also be called an adaptive in-loop filter. The
deblocking filter can remove block distortion that occurs at the
boundary between blocks. The SAO can add a proper offset value to a
pixel value in order to compensate for a coding error. The ALF can
perform filtering based on a value obtained by comparing a
reconstructed picture with an original picture, and the filtering
may be performed only when high efficiency is applied. The
reconstruction block output from the filter unit 180 is stored in
the reference picture buffer 190.
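The SAO step mentioned above can be illustrated with a band-offset-style sketch. Real SAO in HEVC classifies samples by band or edge category with signalled parameters, so the band mapping, the offset table, and the names here are simplifying assumptions.

```python
import numpy as np

def sample_adaptive_offset(recon, offsets, shift=6):
    # Band-offset-style sketch: map each 8-bit sample to one of
    # 256 >> shift bands, add that band's offset to compensate for
    # coding error, then clip to the valid sample range.
    bands = recon >> shift
    return np.clip(recon + np.take(offsets, bands), 0, 255)

recon = np.array([[10, 70], [130, 200]])
offsets = np.array([2, -1, 0, 3])      # one (hypothetical) offset per band
filtered = sample_adaptive_offset(recon, offsets)
# [[10+2, 70-1], [130+0, 200+3]] = [[12, 69], [130, 203]]
```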
[0034] FIG. 2 is a block diagram showing a configuration according
to an embodiment of a video decoding apparatus to which the present
invention is applied.
[0035] Referring to FIG. 2, the video decoding apparatus 200
includes an entropy decoding unit 210, an inverse quantization unit
220, an inverse transform unit 230, an intra prediction unit 240, a
motion compensation unit 250, a filter unit 260, and a reference
picture buffer 270.
[0036] The video decoding apparatus 200 receives a bitstream from a
coder, decodes the bitstream in intra mode or inter mode, and
outputs a reconfigured picture according to the decoding, that is,
a reconstruction picture. A switch is switched to intra mode in the
case of intra mode and to inter mode in the case of inter mode. The
video decoding apparatus 200 obtains a residual block from an input
bitstream, generates a prediction block and generates a block
configured by adding the residual block and the prediction block,
that is, a reconstruction block.
[0037] The entropy decoding unit 210 performs entropy decoding on the input bitstream according to a probability distribution and outputs quantized coefficients. The quantized coefficients are dequantized by the inverse quantization (dequantization) unit 220 and then inversely transformed by the inverse transform unit 230. The inverse transform unit 230 outputs a residual block.
[0038] In the case of intra mode, the intra prediction unit 240
generates a prediction block by performing spatial prediction using
the pixel values of coded blocks that are neighboring to a current
block.
[0039] In the case of inter mode, the motion compensation unit 250
generates a prediction block by performing motion compensation
using a motion vector and a reference picture stored in the
reference picture buffer 270.
[0040] The residual block and the prediction block are added by an
adder 255. The added block is input into the filter unit 260. The
filter unit 260 may apply at least one of a deblocking filter, an
SAO, and an ALF to a reconstruction block or a reconstruction
picture. The filter unit 260 outputs a reconfigured picture, that
is, a reconstruction picture. The reconstruction picture can be
stored in the reference picture buffer 270 and used in
inter-picture prediction.
[0041] A method of improving the prediction performance of
coding/decoding apparatuses includes a method of improving the
accuracy of an interpolation picture and a method of predicting a
difference signal. Here, the difference signal is a signal
indicating a difference between an original picture and a
prediction picture. In the present invention, a "difference signal"
may be replaced with a "differential signal", a "residual block",
or a "differential block" depending on context, and a person having
ordinary skill in the art will distinguish them within a range that
does not affect the spirit and essence of the invention.
[0042] Even if the accuracy of an interpolation picture is improved, a difference signal inevitably occurs. In order to improve coding performance, it is necessary to reduce the difference signal to be coded as much as possible by improving the prediction performance for the difference signal.
[0043] A filtering method using a fixed filter coefficient may be
used as a method of predicting a difference signal. However, the
prediction performance of this filtering method is limited because
the filter coefficient cannot be adaptively used according to
picture characteristics. Accordingly, it is necessary to improve
the accuracy of prediction in such a manner that filtering is
performed for each prediction block according to its
characteristics.
[0044] FIG. 3 is a conceptual diagram showing a concept of a
picture and a block which are used in an embodiment of the present
invention.
[0045] Referring to FIG. 3, a target coding block is a set of pixels that are spatially coupled within a current target coding picture. The target coding block is a unit on which coding and decoding are performed, and it may have a quadrangular or another specific shape. A neighboring reconstruction block is a block on which coding and decoding have been performed, before the current target coding block is coded, within the current target coding picture.
[0046] A prediction picture is a picture including the collection of prediction blocks used to code the respective target coding blocks, from the first target coding block to the current target coding block, within a current target coding picture. Here, a prediction block refers to a block having a prediction signal used to code each target coding block within the current target coding picture. That is, a prediction block refers to each of the blocks within a prediction picture.
[0047] A neighboring block refers to a neighboring reconstruction
block of a current target coding block and a neighboring prediction
block, that is, the prediction block of each neighboring
reconstruction block. That is, a neighboring block refers to both a
neighboring reconstruction block and a neighboring prediction
block.
[0048] The prediction block of a current target coding block may be
a prediction block that is generated by the motion compensation
unit 112 or the intra prediction unit 120 according to the
embodiment of FIG. 1. In this case, after a prediction block filtering process is performed on the prediction block generated by the motion compensation unit 112 or the intra prediction unit 120, the subtractor 125 may subtract the filtered final prediction block from the original block.
[0049] A neighboring block may be a block stored in the reference
picture buffer 190 according to the embodiment of FIG. 1 or a block
stored in additional memory. Furthermore, a neighboring
reconstruction block or a neighboring prediction block generated
during a picture coding process may be used as a neighboring
block.
[0050] FIG. 4 is a conceptual diagram schematically showing an embodiment of a scalable video coding structure based on multiple layers. In FIG. 4, a Group Of Pictures (GOP) indicates a group of pictures.
[0051] A transmission medium is necessary to transmit video data,
and a transmission medium has different performance depending on a
variety of network environments. A scalable video coding method can
be provided for the purpose of an application to a variety of
transmission media or network environments.
[0052] The scalable video coding method is a coding method of
improving coding/decoding performance by removing redundancy
between layers using texture information, motion information, and a
residual signal between layers. The scalable video coding method
can provide a variety of scalabilities from spatial, temporal, and
picture quality points of view depending on surrounding conditions,
such as a transfer bit rate, a transfer error rate, and system
resources.
[0053] Scalable video coding can be performed using a multi-layer
structure so that a bitstream applicable to a variety of network
situations can be provided. For example, a scalable video coding
structure may include a base layer for performing compression and
processing on picture data using a common picture coding method and
an enhancement layer for performing compression and processing on
picture data using both information on the coding of the base layer
and a common picture coding method.
[0054] Here, a layer means a set of pictures and bitstreams that are distinguished from one another according to criteria such as space (e.g., picture size), time (e.g., coding order and picture output order), picture quality, and complexity. Furthermore, multiple layers may have mutual dependency.
[0055] Referring to FIG. 4, for example, a base layer may be
defined to have a Quarter Common Intermediate Format (QCIF), a
frame rate of 15 Hz, and a bit rate of 3 Mbps. A first enhancement
layer may be defined to have a Common Intermediate Format (CIF), a
frame rate of 30 Hz, and a bit rate of 0.7 Mbps. A second
enhancement layer may be defined to have Standard Definition (SD),
a frame rate of 60 Hz, and a bit rate of 0.19 Mbps. The formats,
the frame rates, and the bit rates are only illustrative and may be
differently determined as occasion demands. Furthermore, the number
of layers is not limited to that of the present embodiment, but may
be differently determined according to situations.
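The example layer definitions above can be held in a small table; the values simply restate those given in the paragraph, and the dictionary field names are an editorial convention, not part of any specification.

```python
# Layer definitions from the embodiment above (illustrative only).
layers = [
    {"layer": "base",               "format": "QCIF", "frame_rate_hz": 15, "bit_rate_mbps": 3.0},
    {"layer": "first enhancement",  "format": "CIF",  "frame_rate_hz": 30, "bit_rate_mbps": 0.7},
    {"layer": "second enhancement", "format": "SD",   "frame_rate_hz": 60, "bit_rate_mbps": 0.19},
]

def find_layer(fmt):
    # Look up a layer definition by its picture format.
    return next(l for l in layers if l["format"] == fmt)

cif_layer = find_layer("CIF")
# cif_layer["frame_rate_hz"] is 30
```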
[0056] If a bitstream having a CIF and 0.5 Mbps is necessary, a
bitstream may be segmented and transmitted in the first enhancement
layer so that the bitstream has the bit rate of 0.5 Mbps. A
scalable video coding method can provide temporal, spatial, and
picture quality scalabilities through the method described in
connection with the embodiment of FIG. 3.
[0057] Hereinafter, a target layer, a target picture, a target slice, a target unit, a target block, a target symbol, and a target bin mean a layer, a picture, a slice, a unit, a block, a symbol, and a bin, respectively, which are now being coded or decoded. Accordingly, a target layer may be, for example, the layer to which a target symbol belongs. Furthermore, other layers are the layers except the target layer, that is, the layers that the target layer can refer to. That is, other layers may be used to perform decoding in a target layer. The layers that a target layer can use may include, for example, temporal, spatial, and picture quality lower layers.
[0058] Furthermore, a corresponding layer, a corresponding picture,
a corresponding slice, a corresponding unit, a corresponding block,
a corresponding symbol, and a corresponding bin hereinafter mean a
layer, a picture, a slice, a unit, a block, a symbol, and a bin,
respectively, corresponding to a target layer, a target picture, a
target slice, a target unit, a target block, a target symbol, and a
target bin. A corresponding picture refers to a picture of another
layer that is placed in the same time axis as that of a target
picture. If a picture within a target layer has the same display
order as a picture within another layer, it can be said that the
picture within the target layer and the picture within another
layer are placed in the same time axis. Whether pictures are placed
in the same time axis or not can be checked using a coding
parameter, such as a Picture Order Count (POC). A corresponding
slice refers to a slice placed at a position that is spatially the
same as or similar to that of the target slice of a target picture
within a corresponding picture. A corresponding unit refers to a
unit placed at a position that is spatially the same as or similar
to that of the target unit of a target picture within a
corresponding picture. A corresponding block refers to a block
placed at a position that is spatially the same as or similar to
that of the target block of a target picture within a corresponding
picture.
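The time-axis check described above can be sketched as a POC comparison; the picture representation below (a dict with a "poc" field) is an assumption made for illustration.

```python
def corresponding_picture(target_poc, reference_layer_pictures):
    # A corresponding picture is the reference-layer picture placed on the
    # same time axis as the target picture, identified here by an equal
    # Picture Order Count (POC).
    for pic in reference_layer_pictures:
        if pic["poc"] == target_poc:
            return pic
    return None

reference_layer = [{"poc": 0}, {"poc": 1}, {"poc": 2}]
match = corresponding_picture(1, reference_layer)
# match is the reference-layer picture with poc == 1
```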
[0059] Furthermore, a slice, which indicates a unit into which a picture is split, is hereinafter used in a sense that refers generally to a partition unit, such as a tile or an entropy slice. Independent picture coding and decoding are possible between partition units.
[0060] Furthermore, a block hereinafter means a unit of picture
coding and decoding. When a picture is coded and decoded, a coding
or decoding unit refers to a partition unit when splitting one
picture into partition units and coding or decoding the partition
units. Thus, the coding or decoding unit may also be called a macro
block, a Coding Unit (CU), a Prediction Unit (PU), a Transform Unit
(TU), or a transform block. One block may be further split
into smaller lower blocks.
[0061] Inter-layer intra prediction, inter-layer inter prediction, or inter-layer differential signal prediction can be performed in order to remove redundancy between layers, taking into consideration the characteristics of scalable video coding, such as those described above.
[0062] The inter-layer inter prediction is a method of using motion
information on the corresponding block of a reference layer in an
enhancement layer. This is described in detail later.
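A minimal sketch of reusing a reference-layer motion vector in the enhancement layer, scaled by the resolution ratio between the layers (cf. claim 6); the rounding policy and function names are illustrative assumptions, not the method's actual derivation.

```python
def scale_motion_vector(mv_ref, ref_size, enh_size):
    # Scale the reference layer's motion vector by the ratio of the
    # enhancement-layer resolution to the reference-layer resolution.
    sx = enh_size[0] / ref_size[0]
    sy = enh_size[1] / ref_size[1]
    return (round(mv_ref[0] * sx), round(mv_ref[1] * sy))

# QCIF (176x144) reference layer, CIF (352x288) enhancement layer:
mv = scale_motion_vector((3, -2), ref_size=(176, 144), enh_size=(352, 288))
# Doubling the resolution doubles the vector: (6, -4)
```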
[0063] A scalable video coding method in accordance with an
embodiment of the present invention is described in detail below
with reference to FIGS. 5 to 10. Meanwhile, a method of coding an
enhancement layer, such as that described with reference to FIG. 4,
is described below.
[0064] FIG. 5 is a schematic block diagram showing the
configuration of the decoding apparatus in accordance with an
embodiment of the present invention. FIG. 5 shows a detailed
configuration of the motion compensation unit 250 shown in FIG.
2.
[0065] Referring to FIG. 5, the motion compensation unit 250
predicts motion information (e.g., a motion vector) using a
plurality of motion prediction methods. To this end, the motion
compensation unit 250 may include a first motion prediction module 251 and a second motion prediction module 255 configured to predict motion information on the target decoding block of an enhancement layer using different methods.
[0066] The first motion prediction module 251 may use motion
information on neighboring blocks within an enhancement layer in
order to predict motion information on the target decoding block of
the enhancement layer.
[0067] For example, the motion merging module 252 of the first
motion prediction module 251 may use motion information on
neighboring candidate blocks as the motion information on the target
decoding block, thereby predicting motion information on the target
decoding block of the enhancement layer using, for example, a motion
merging method defined in HEVC.
[0068] More particularly, in the motion merging method, a coding
apparatus can select any one motion merging candidate from a motion
merging candidate list, in which motion information on neighboring
blocks is collected, and signal an index for the selected motion
merging candidate.
[0069] Meanwhile, the decoding apparatus can select, from a
previously constructed motion merging candidate list, the motion
merging candidate indicated by the index signaled by the coding
apparatus and use it as the motion vector for the target decoding
block.
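The decoder-side merge selection described above can be sketched as follows. This is a minimal illustration, not the normative HEVC derivation: the function names, the duplicate-pruning rule, and the candidate limit are assumptions made for the example.

```python
# Hypothetical sketch of merge-mode motion vector selection on the
# decoder side. Candidate-list construction order and pruning are
# illustrative, not the normative HEVC process.

def build_merge_candidate_list(neighbor_mvs, max_candidates=5):
    """Collect motion vectors of available neighboring blocks into a
    merge candidate list, skipping unavailable blocks and duplicates."""
    candidates = []
    for mv in neighbor_mvs:
        if mv is not None and mv not in candidates:
            candidates.append(mv)
        if len(candidates) == max_candidates:
            break
    return candidates

def select_merge_mv(candidate_list, signaled_index):
    """The decoder simply picks the candidate indicated by the index
    signaled by the coding apparatus."""
    return candidate_list[signaled_index]

# Two of four neighbors are available and distinct.
neighbors = [(4, -2), None, (4, -2), (0, 3)]
merge_list = build_merge_candidate_list(neighbors)   # [(4, -2), (0, 3)]
mv = select_merge_mv(merge_list, 1)                  # (0, 3)
```

No motion vector difference is transmitted in this path; the selected candidate is used as the motion vector of the target decoding block as-is.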
[0070] The motion vector prediction module 253 can use the candidate
having optimum rate-distortion performance from among the motion
information on neighboring candidate blocks, and predict motion
information on the target decoding block of the enhancement layer
using, for example, an Advanced Motion Vector Prediction (AMVP)
method defined in HEVC.
[0071] Particularly, a coding apparatus can compare rate-distortion
cost values for the candidates in an AMVP candidate list including
motion information on neighboring blocks, select any one motion
prediction candidate based on a result of the comparison, and signal
an index for the selected motion prediction candidate.
[0072] Meanwhile, the decoding apparatus can select, from the
previously constructed motion prediction candidate list, the motion
prediction candidate indicated by the index signaled by the coding
apparatus, and generate a motion vector for the target decoding
block of the enhancement layer by adding a Motion Vector Difference
(MVD) to the selected motion prediction candidate.
[0073] Meanwhile, the second motion prediction module 255 can
predict motion information on the target decoding block of the
enhancement layer using motion information on the corresponding
block of the reference layer.
[0074] Referring to FIG. 6, the second motion prediction module 255
may include a scaling unit 256 and a motion vector generating unit
257.
[0075] The scaling unit 256 can adaptively scale the motion
information on the corresponding block of the reference layer
depending on a difference between the resolutions of the layers.
The motion vector generating unit 257 can generate the motion
vector for the target decoding block of the enhancement layer by
adding the MVD to the scaled motion information.
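The two operations of the scaling unit 256 and the motion vector generating unit 257 can be sketched as follows. This is a sketch under stated assumptions: the dyadic 2:1 resolution ratio, the rounding rule, and all names are illustrative, not taken from the specification.

```python
# Hypothetical sketch of the second motion prediction path: scale the
# reference-layer motion vector by the resolution ratio between the
# layers (scaling unit 256), then add the decoded MVD (motion vector
# generating unit 257).

def scale_mv(ref_mv, ref_size, enh_size):
    """Scale a reference-layer MV by the spatial resolution ratio."""
    sx = enh_size[0] / ref_size[0]
    sy = enh_size[1] / ref_size[1]
    return (round(ref_mv[0] * sx), round(ref_mv[1] * sy))

def generate_enh_mv(ref_mv, ref_size, enh_size, mvd):
    """Motion vector for the enhancement-layer block: scaled MV + MVD."""
    scaled = scale_mv(ref_mv, ref_size, enh_size)
    return (scaled[0] + mvd[0], scaled[1] + mvd[1])

# Base layer 960x540, enhancement layer 1920x1080 (2x spatial scaling):
mv = generate_enh_mv((3, -5), (960, 540), (1920, 1080), (1, 0))
# the scaled MV is (6, -10); adding the MVD gives (7, -10)
```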
[0076] For example, the reference layer may be a base layer.
[0077] In the decoding apparatus in accordance with an embodiment of
the present invention, a module selected according to the motion
prediction mode from among the motion merging module 252, the motion
vector prediction module 253, and the second motion prediction
module 255, such as those described above, can construct the motion
vector for the target decoding block of the enhancement layer to be
used in the motion compensation unit 250 based on motion information
transferred from the entropy coding unit 150 or motion information
derived from the reference layer.
[0078] FIG. 7 is a flowchart illustrating a scalable video coding
method in accordance with a first embodiment of the present
invention. The illustrated video coding method is described in
connection with the block diagrams showing the configuration of the
decoding apparatus of FIGS. 5 and 6 in accordance with an
embodiment of the present invention.
[0079] Referring to FIG. 7, whether or not the motion information
prediction mode for a current target decoding block of an
enhancement layer is a mode in which inter-layer inter coding will
be performed is determined at step S300.
[0080] For example, the motion information prediction mode is
determined based on information signaled by a coding apparatus.
Particularly, the signaled information may include a flag
indicating whether inter-layer inter coding will be performed or
not.
[0081] Furthermore, the step S300 of determining the motion
information prediction mode and the series of steps thereafter may
be performed in units of a CU.
[0082] If it is determined that the motion information prediction
mode is a first mode in which inter-layer inter coding is not
performed, the first motion prediction module 251 obtains motion
information on the neighboring blocks of an enhancement layer at
step S310 and predicts motion information on the target decoding
block of the enhancement layer using the obtained motion
information on the neighboring blocks at step S330.
[0083] If, however, it is determined that the motion information
prediction mode is a second mode in which inter-layer inter coding
is performed, the second motion prediction module 255 obtains
motion information on the corresponding block of a reference layer
at step S320 and predicts motion information on the target decoding
block of the enhancement layer using the obtained motion
information on the corresponding block of the reference layer at
step S330.
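The two-way decision of steps S300 through S330 can be condensed into a small dispatch function. The predictor bodies below are placeholders standing in for modules 251 and 255 (the first-mode path simply takes the first available neighbor), so only the branching structure, not the prediction logic, reflects FIG. 7.

```python
# Minimal sketch of the FIG. 7 mode decision, assuming a signaled
# flag selects between the two prediction paths.

def predict_motion_info(inter_layer_flag, neighbor_mvs, ref_layer_mv):
    if not inter_layer_flag:
        # First mode (S310, S330): use motion information on the
        # neighboring blocks of the enhancement layer; here, the
        # first available neighbor is a stand-in for module 251.
        return next(mv for mv in neighbor_mvs if mv is not None)
    # Second mode (S320, S330): use motion information on the
    # corresponding block of the reference layer (module 255).
    return ref_layer_mv

print(predict_motion_info(False, [None, (2, 1)], (9, 9)))  # (2, 1)
print(predict_motion_info(True, [None, (2, 1)], (9, 9)))   # (9, 9)
```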
[0084] Referring to FIG. 8, the scaling unit 256 of the second
motion prediction module 255 can generate a motion vector for the
target decoding block B1 by scaling a motion vector for the
corresponding block B2 of the base layer based on a difference
between the resolutions of the enhancement layer and the base layer,
in order to predict the motion information on the target decoding
block B1 of the enhancement layer.
[0085] For example, the corresponding block B2 of the base layer may
be the block, among blocks existing in the base layer, that best
matches the target decoding block B1 of the enhancement layer, or
may be a co-located block whose position corresponds to that of the
target decoding block B1 of the enhancement layer.
[0086] Furthermore, if a current target decoding block is included
in the base layer, not in the enhancement layer, the first motion
prediction module 251 can predict motion information on the target
decoding block of the base layer using motion information on
neighboring blocks.
[0087] Meanwhile, regarding the generation of a prediction picture
for video coding, in skip mode motion information can be derived
from neighboring blocks and a prediction picture or block can be
generated based on the derived motion information, but the motion
information or residual picture information may not be coded or
decoded.
[0088] A video coding method in accordance with an embodiment of
the present invention may be differently performed depending on
whether the skip mode is used or not.
[0089] FIG. 9 is a flowchart illustrating a scalable video coding
method in accordance with a second embodiment of the present
invention. FIG. 9 shows an example of a video decoding method when
the skip mode is not used.
[0090] Referring to FIG. 9, first, whether inter-layer inter coding
is performed on a current target decoding block of an enhancement
layer or not is determined at step S400.
[0091] If it is determined that the inter-layer inter coding is not
performed, the motion merging module 252 decodes a motion merging
candidate index signaled by a coding apparatus at step S410 and, at
step S420, selects from a previously constructed motion merging
candidate list the motion merging candidate indicated by the decoded
index as the motion vector for the target decoding block of the
enhancement layer.
[0092] If it is determined that the inter-layer inter coding is
performed, the second motion prediction module 255 derives a motion
vector for a corresponding block of a reference layer at step S430
and scales the derived motion vector according to resolution at
step S440.
[0093] For example, the scaling unit 256 of the second motion
prediction module 255 scales the motion vector for the
corresponding block of the reference layer according to a
difference between the resolutions of the reference layer and the
enhancement layer. If the reference layer and the enhancement layer
have the same resolution, the step S440 may be omitted and the
motion vector for the corresponding block of the reference layer
may be used as the motion vector for the target decoding block of
the enhancement layer.
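The conditional omission of step S440 can be sketched as a single guard clause. This is an illustrative assumption about how the bypass might be expressed; the names and the rounding rule are hypothetical.

```python
# Sketch of the conditional scaling of step S440: when the reference
# layer and the enhancement layer share the same resolution, scaling
# is omitted and the reference-layer MV is reused directly.

def maybe_scale_mv(ref_mv, ref_size, enh_size):
    if ref_size == enh_size:
        # Same resolution: step S440 is omitted.
        return ref_mv
    sx = enh_size[0] / ref_size[0]
    sy = enh_size[1] / ref_size[1]
    return (round(ref_mv[0] * sx), round(ref_mv[1] * sy))

print(maybe_scale_mv((3, -5), (1920, 1080), (1920, 1080)))  # (3, -5)
print(maybe_scale_mv((3, -5), (960, 540), (1920, 1080)))    # (6, -10)
```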
[0094] FIG. 10 is a flowchart illustrating a scalable video coding
method in accordance with a third embodiment of the present
invention. FIG. 10 shows an example of a video decoding method when
the skip mode is used.
[0095] Referring to FIG. 10, first, whether inter-layer inter
coding is performed on a current target decoding block of an
enhancement layer or not is determined at step S500.
[0096] If it is determined that the inter-layer inter coding is not
performed, the motion vector prediction module 253 decodes a motion
prediction candidate index signaled by a coding apparatus at step
S510 and, at step S520, selects a motion vector from a previously
constructed motion prediction candidate list using the decoded
motion prediction candidate index.
[0097] Next, the motion vector prediction module 253 decodes a
Motion Vector Difference (MVD) signaled by the coding apparatus at
step S530 and generates a motion vector for a target decoding block
of the enhancement layer by adding the decoded MVD to the motion
vector selected at step S520 at step S540.
[0098] If it is determined that the inter-layer inter coding is
performed, the second motion prediction module 255 derives a motion
vector for a corresponding block of a reference layer at step S550
and the scaling unit 256 of the second motion prediction module 255
scales the derived motion vector according to a difference between
the resolutions of the reference layer and the enhancement layer at
step S560.
[0099] Next, the motion vector generating unit 257 decodes an MVD
signaled by the coding apparatus at step S570 and generates a
motion vector for the target decoding block of the enhancement
layer by adding the decoded MVD to the scaled motion vector at step
S580.
[0100] The scalable video coding methods and apparatuses in
accordance with some embodiments of the present invention have been
described above on the basis of a video decoding method and
apparatus, but the scalable video coding method in accordance with
an embodiment of the present invention may be embodied by performing
a series of steps corresponding to the decoding method described
with reference to FIGS. 5 to 10.
[0101] More particularly, in accordance with the scalable video
coding methods and apparatuses according to the embodiments of the
present invention, a motion information prediction mode for a target
coding block of an enhancement layer can be selected and a
prediction signal can be generated according to the selected mode by
performing motion prediction having the same construction as that of
the decoding method and apparatus described with reference to FIGS.
5 to 10.
[0102] In accordance with an embodiment of the present invention,
in scalable video coding based on multiple layers, in order to
predict motion information on an enhancement layer, motion
information on neighboring blocks and motion information on a
corresponding block of a base layer are selectively used.
Accordingly, coding efficiency can be improved because the number
of bits necessary for coding and decoding is reduced, and thus
improved picture quality can be provided at the same bit rate.
[0103] In the above exemplary systems, although the methods have
been described on the basis of the flowcharts using a series of the
steps or blocks, the present invention is not limited to the
sequence of the steps, and some of the steps may be performed in an
order different from that of the remaining steps or may be
performed simultaneously with the remaining steps. Furthermore,
those skilled in the art will understand that the steps shown in
the flowcharts are not exclusive and they may include other steps
or one or more steps of the flowchart may be deleted without
affecting the scope of the present invention.
[0104] The above embodiments include various aspects of examples.
Although all possible combinations for describing the various
aspects may not be described, those skilled in the art may
appreciate that other combinations are possible. Accordingly, the
present invention should be construed as including all other
replacements, modifications, and changes which fall within the
scope of the claims.
* * * * *