U.S. patent application number 13/391517 was filed with the patent office on 2012-12-13 for method and apparatus for interpolating reference picture and method and apparatus for encoding/decoding image using same.
This patent application is currently assigned to SK TELECOM CO., LTD. Invention is credited to Jaehoon Choi, Jongki Han, Byeungwoo Jeon, Dongwon Kim, Haekwang Kim, Sunyeon Kim, Gyumin Lee, Juock Lee, Yunglyul Lee, Jeongyeon Lim, Joohee Moon.
Application Number | 20120314771 13/391517 |
Family ID | 43929560 |
Filed Date | 2012-12-13 |
United States Patent Application | 20120314771 |
Kind Code | A1 |
Lim; Jeongyeon; et al. | December 13, 2012 |
METHOD AND APPARATUS FOR INTERPOLATING REFERENCE PICTURE AND METHOD
AND APPARATUS FOR ENCODING/DECODING IMAGE USING SAME
Abstract
The present disclosure relates to a method and apparatus for
interpolating a reference picture and a method and apparatus for
encoding/decoding a video using the same. The apparatus for
interpolating the reference picture selects a plurality of filters
for interpolating the reference picture and generates a reference
picture having a target precision through a multi-stage filtering
of the reference picture by using a plurality of filters. The
compression efficiency of the video may be improved by
interpolating a reference picture through the determination of a
filter or a filter coefficient for interpolating the reference
picture according to characteristics of the video and interpolating
the reference picture through a multi-stage filtering or adaptively
changing resolutions of motion vectors in the unit of predetermined
areas.
Inventors: |
Lim; Jeongyeon (Gyeonggi-do, KR)
Kim; Sunyeon (Seoul, KR)
Lee; Gyumin (Gyeonggi-do, KR)
Choi; Jaehoon (Gyeonggi-do, KR)
Moon; Joohee (Seoul, KR)
Lee; Yunglyul (Seoul, KR)
Kim; Haekwang (Seoul, KR)
Jeon; Byeungwoo (Gyeonggi-do, KR)
Han; Jongki (Seoul, KR)
Lee; Juock (Seoul, KR)
Kim; Dongwon (Seoul, KR) |
Assignee: |
SK TELECOM CO., LTD.
Seoul
KR
|
Family ID: |
43929560 |
Appl. No.: |
13/391517 |
Filed: |
August 21, 2010 |
PCT Filed: |
August 21, 2010 |
PCT NO: |
PCT/KR2010/005569 |
371 Date: |
April 28, 2012 |
Current U.S.
Class: |
375/240.16 ;
375/240.12; 375/E7.123; 375/E7.243; 382/232; 382/233; 382/300 |
Current CPC
Class: |
H04N 19/117 20141101;
H04N 19/463 20141101; H04N 19/80 20141101; H04N 19/523
20141101 |
Class at
Publication: |
375/240.16 ;
375/240.12; 382/300; 382/232; 382/233; 375/E07.123;
375/E07.243 |
International
Class: |
H04N 7/32 20060101
H04N007/32; G06K 9/32 20060101 G06K009/32 |
Foreign Application Data
Date |
Code |
Application Number |
Aug 21, 2009 |
KR |
10-2009-0077452 |
Mar 3, 2010 |
KR |
10-2010-0019208 |
Aug 20, 2010 |
KR |
10-2010-0081097 |
Claims
1. An apparatus for encoding/decoding a video, comprising: a video
encoder for interpolating a reference picture to have a target
precision through a multi-stage filtering of the reference picture
by using a plurality of filters and performing an inter prediction
encoding of the video by using an interpolated reference picture
having the target precision; and a video decoder for interpolating
a reference picture to have a target precision through a multi-stage
filtering of the reference picture by using the plurality of
filters identified by reconstructed information from decoding a
bitstream and reconstructing the video through performing an inter
prediction decoding of the bitstream by using the interpolated
reference picture having the target precision.
2. An apparatus for encoding a video, comprising: a reference
picture interpolator for interpolating a reference picture to have
a target precision through a multi-stage filtering of the reference
picture by using a plurality of filters; and an inter prediction
encoder for performing an inter prediction encoding of the video by
using the interpolated reference picture having the target
precision.
3. The apparatus of claim 2, further comprising: a resolution
determiner for determining a motion vector resolution for each area
or motion vector, wherein the reference picture interpolator
determines a filter tap by using a determined motion vector
resolution.
4. The apparatus of claim 2, further comprising: a resolution
conversion flag generator for generating a resolution conversion
flag for informing changes from a resolution of a previous block or
a surrounding resolution of an area in which a current resolution
is to be encoded, wherein the reference picture interpolator
determines a filter tap by using a motion vector resolution
determined by using the resolution conversion flag.
5. The apparatus of claim 2, further comprising: a resolution
appointment flag generator for generating a resolution appointment
flag for appointing resolution sets differently for each motion
vector or area of a video, wherein the reference picture
interpolator determines a filter tap according to a single
resolution when the resolution appointment flag means the single
resolution.
6. The apparatus of claim 2, wherein the reference picture
interpolator sets types of filter taps for each resolution of a
picture and selects a filter from the types of the filter taps,
which has a minimum difference from a current picture as a result
of an interpolation, as an optimum filter.
7. The apparatus of claim 2, wherein the reference picture
interpolator selects a filter tap according to a motion vector
resolution.
8. The apparatus of claim 2, wherein the reference picture
interpolator selects an optimum filter tap for each resolution in
the unit of predetermined areas within a picture or a slice.
9. An apparatus for decoding a video, comprising: a reference
picture interpolator for interpolating a reference picture to have
a target precision through a multi-stage filtering of the reference
picture by using a plurality of filters identified by information
on the plurality of filters reconstructed through decoding a
bitstream; and an inter prediction decoder for reconstructing a
video through an inter prediction decoding of a bitstream by using
an interpolated reference picture having the target precision.
10. The apparatus of claim 9, further comprising: a resolution
decoder for extracting a motion vector resolution for each area or
motion vector, wherein the reference picture interpolator
determines a filter tap by using the extracted motion vector
resolution.
11. The apparatus of claim 9, further comprising: a resolution
conversion flag extractor for extracting a resolution conversion
flag for informing changes from a resolution of a previous block or
a surrounding resolution of an area in which a current resolution
is encoded, wherein the reference picture interpolator determines a
filter tap by using a motion vector resolution determined by using
an extracted resolution conversion flag.
12. The apparatus of claim 9, further comprising: a resolution
appointment flag generator for extracting a resolution appointment
flag for appointing resolution sets differently for each motion
vector or area of a video, wherein the reference picture
interpolator determines a filter tap according to a single
resolution when an extracted resolution appointment flag means the
single resolution.
13. The apparatus of claim 9, wherein the reference picture
interpolator sets types of filter taps for each resolution of a
picture and selects a filter from the types of the filter taps,
which has a minimum difference from a current picture as a result
of an interpolation, as an optimum filter.
14. The apparatus of claim 9, wherein the reference picture
interpolator selects a filter tap according to a motion vector
resolution.
15. The apparatus of claim 9, wherein the reference picture
interpolator selects an optimum filter tap for each resolution in
the unit of predetermined areas within a picture or a slice.
16. An apparatus for interpolating a reference picture, comprising:
a filter selector for selecting a plurality of filters for
interpolating the reference picture into an interpolated reference
picture; and a filter for generating a reference picture having a
target precision through a multi-stage filtering of the reference
picture by using the plurality of filters.
17. The apparatus of claim 16, wherein the filter selector selects
a first filter for interpolating a sub-pixel by using an integer
pixel of the reference picture and a second filter for
interpolating a sub-pixel of a target precision by using the
integer pixel and an interpolated sub-pixel.
18. The apparatus of claim 17, wherein the filter interpolates the
reference picture by using the first filter and interpolates an
interpolated reference picture by using the second filter.
19. The apparatus of claim 17, wherein the filter selector selects
a filter from a plurality of filters having a fixed filter
coefficient, which has a minimum difference between a current
picture and the interpolated reference picture, as the first filter
when the sub-pixel is interpolated by using the integer pixel of
the reference picture.
20. The apparatus of claim 17, wherein the filter selector
calculates a filter coefficient which has a minimum difference
between a current picture and the interpolated reference picture,
as a filter coefficient of the first filter when the sub-pixel is
interpolated by using the integer pixel of the reference
picture.
21. The apparatus of claim 17, wherein the filter selector selects
a filter from a plurality of filters having a fixed filter
coefficient, which has a minimum difference between a current
picture and a re-interpolated reference picture, as the second
filter when the sub-pixel of the target precision is interpolated
by using the interpolated sub-pixel and the integer pixel of the
reference picture.
22. The apparatus of claim 17, wherein the filter selector
calculates a filter coefficient which has a minimum difference
between a current picture and a re-interpolated reference picture,
as a filter coefficient of the second filter when the sub-pixel of
the target precision is interpolated by using the interpolated
sub-pixel and the integer pixel of the reference picture.
23. The apparatus of claim 17, further comprising: a filter
information encoder for encoding information on the first filter
and information on the second filter.
24. An apparatus for interpolating a reference picture, comprising:
a filter information decoder for reconstructing information on a
plurality of filters through decoding a bitstream; and a filter
for generating a reference picture having a target precision
through a multi-stage filtering of the reference picture by using a
plurality of filters identified by a reconstructed information on
the plurality of filters.
25. The apparatus of claim 24, wherein the filter interpolates the
reference picture by using a first filter identified by information
on the first filter reconstructed through decoding the bitstream
and interpolates an interpolated reference picture by using a
second filter identified by information on the second filter
reconstructed through decoding the bitstream.
26. The apparatus of claim 25, wherein the information on the first
filter and the information on the second filter contain information
on filter coefficients or information on types of selected filters
from the plurality of filters.
27. The apparatus of claim 25, wherein the filter interpolates a
sub-pixel of the reference picture by using the first filter based
on an integer pixel of the reference picture.
28. The apparatus of claim 25, wherein the filter interpolates a
sub-pixel of the target precision based on the integer pixel of the
reference picture and an interpolated sub-pixel of the reference
picture.
29. A method of encoding/decoding a video, comprising:
interpolating a reference picture to have a target precision
through a multi-stage filtering of the reference picture by using a
plurality of filters; performing an inter prediction encoding of
the video by using an interpolated reference picture having the
target precision; interpolating a reference picture to have a
target precision through a multi-stage filtering of the reference
picture by using the plurality of filters identified by information
reconstructed through a decoding of a bitstream; and reconstructing
the video through an inter prediction decoding of the bitstream by
using the interpolated reference picture having the target
precision.
30. A method of encoding a video, comprising: interpolating a
reference picture to have a target precision through a multi-stage
filtering of the reference picture by using a plurality of filters;
and performing an inter prediction encoding of the video by using
an interpolated reference picture having the target precision.
31. The method of claim 30, wherein the step of interpolating the
reference picture to have the target precision is performed through
an iterative process in which the reference picture is interpolated
through a filtering by using one of the plurality of filters and
the interpolated reference picture is interpolated through another
filtering by using another one of the plurality of filters.
32. The method of claim 30, further comprising: determining a
motion vector resolution for each area or motion vector, wherein
the step of interpolating the reference picture to have the target
precision comprises determining a filter tap by using a determined
motion vector resolution.
33. The method of claim 30, further comprising: generating a
resolution conversion flag for informing changes from a resolution
of a previous block or a surrounding resolution of an area in which
a current resolution is to be encoded, wherein the step of
interpolating the reference picture to have the target precision
comprises determining a filter tap by using a motion vector
resolution determined by using the resolution conversion flag.
34. The method of claim 30, further comprising: generating a
resolution appointment flag for appointing resolution sets
differently for each area or motion vector, wherein the step of
interpolating the reference picture to have the target precision
comprises determining a filter tap according to a single resolution
when the resolution appointment flag means the single
resolution.
35. A method of decoding a video, comprising: interpolating a
reference picture to have a target precision through a multi-stage
filtering of the reference picture by using a plurality of filters
identified by information reconstructed through decoding a
bitstream; and reconstructing the video through an inter prediction
decoding of the bitstream by using an interpolated reference
picture having the target precision.
36. The method of claim 35, wherein the step of interpolating the
reference picture to have the target precision is performed through
an iterative process in which the reference picture is interpolated
through a filtering by using one of the plurality of filters and
the interpolated reference picture is interpolated through another
filtering by using another one of the plurality of filters.
37. The method of claim 35, wherein the step of interpolating the
reference picture to have the target precision sets types of filter
taps for each resolution of a picture and selects a filter from the
types of the filter taps, which has a minimum difference from a
current picture as a result of an interpolation, as an optimum
filter.
38. The method of claim 35, wherein the step of interpolating the
reference picture to have the target precision selects a filter tap
according to a motion vector resolution.
39. The method of claim 35, wherein the step of interpolating the
reference picture to have the target precision selects an optimum
filter for each resolution in the unit of predetermined areas
within a picture or a slice, and sets types of filter taps for each
resolution of a picture and selects a filter from the types of the
filter taps, which has a minimum difference from a current picture
as a result of an interpolation, as an optimum filter.
40. The method of claim 35, wherein the step of interpolating the
reference picture to have the target precision selects a filter tap
according to a motion vector resolution.
41. The method of claim 35, wherein the step of interpolating the
reference picture to have the target precision selects an optimum
filter tap for each resolution in the unit of predetermined areas
within a picture or a slice.
42. A method of interpolating a reference picture, comprising:
selecting a first filter for interpolating a sub-pixel by using an
integer pixel of the reference picture; interpolating the reference
picture by using the first filter; selecting a second filter for
interpolating a sub-pixel of a target precision by using the
integer pixel and an interpolated sub-pixel; and interpolating an
interpolated reference picture by using the second filter.
43. The method of claim 42, wherein the step of selecting the first
filter selects a filter from a plurality of filters having a fixed
filter coefficient, which has a minimum difference between a
current picture and the interpolated reference picture, as the
first filter when the sub-pixel is interpolated by using the
integer pixel of the reference picture.
44. The method of claim 42, wherein the step of selecting the first
filter calculates a filter coefficient which has a minimum
difference between a current picture and the interpolated reference
picture, as a filter coefficient of the first filter when the
sub-pixel is interpolated by using the integer pixel of the
reference picture.
45. The method of claim 42, wherein the step of selecting the
second filter selects a filter from a plurality of filters having a
fixed filter coefficient, which has a minimum difference between a
current picture and a re-interpolated reference picture, as the
second filter when the sub-pixel of the target precision is
interpolated by using the interpolated sub-pixel and the integer
pixel of the reference picture.
46. The method of claim 42, wherein the step of selecting the
second filter calculates a filter coefficient which has a minimum
difference between a current picture and a re-interpolated
reference picture, as a filter coefficient of the second filter
when the sub-pixel of the target precision is interpolated by using
the interpolated sub-pixel and the integer pixel of the reference
picture.
47. The method of claim 42, further comprising: encoding
information on the first filter and information on the second
filter.
48. A method of interpolating a reference picture, comprising:
reconstructing information on a first filter and information on a
second filter through decoding a bitstream; interpolating the
reference picture by using the first filter identified by the
information on the first filter; and interpolating an interpolated
reference picture by using the second filter identified by the
information on the second filter.
49. The method of claim 48, wherein the information on the first
filter and the information on the second filter contain information
on filter coefficients or information on types of selected filters
from a plurality of filters.
50. The method of claim 48, wherein the step of interpolating the
reference picture by using the first filter interpolates a
sub-pixel of the reference picture by using the first filter based
on an integer pixel of the reference picture.
51. The method of claim 48, wherein the step of interpolating the
interpolated reference picture by using the second filter
interpolates a sub-pixel of the target precision based on the
integer pixel of the reference picture and an interpolated
sub-pixel of the reference picture.
Description
TECHNICAL FIELD
[0001] The present disclosure relates to a method and an apparatus
for interpolating a reference picture and a method and an apparatus
for encoding/decoding a video using the same. More particularly,
the present disclosure relates to a method and an apparatus for
improving the encoding efficiency by interpolating a reference
picture through a determination of a filter or a filter coefficient
for interpolating the reference picture according to
characteristics of a video and interpolating the reference picture
through a multi-stage filtering or adaptively changing the
resolution of a motion vector in the inter prediction encoding and
inter prediction decoding of a video.
BACKGROUND ART
[0002] The statements in this section merely provide background
information related to the present disclosure and may not
constitute the prior art.
[0003] Encoding of data for a video includes an intra prediction
encoding and an inter prediction encoding. The intra prediction
encoding and the inter prediction encoding are effective methods
capable of reducing the correlation existing between multiple
pieces of data, which are widely used in various data compressions.
Especially, in the inter prediction encoding, since a motion vector
of a current block determined through estimation of the motion of
the current block to be currently encoded has a close relation with
motion vectors of surrounding blocks, a predicted motion vector
(PMV) for the motion vector of the current block is first
calculated from the motion vectors of the surrounding blocks and
only a differential motion vector (DMV) for the PMV is encoded
instead of encoding the motion vector of the current block itself,
so as to considerably reduce the quantity of bits to be encoded and
thus improve the encoding efficiency.
[0004] That is, in the case of performing the inter prediction
encoding, an encoder encodes and transmits only a DMV corresponding
to a differential value between the current motion vector and a PMV
determined through estimation of the motion of the current block in
a reference frame, which has been reconstructed through previous
encoding and decoding. Also, a decoder reconstructs the current
motion vector by adding the PMV and the DMV transmitted based on a
prediction of the motion vector of the current block using the
motion vectors of the surrounding blocks decoded in advance.
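The relationship between the motion vector, the PMV, and the DMV described in paragraphs [0003] and [0004] can be sketched as follows. This is only an illustration under stated assumptions: the text says the PMV is calculated from the motion vectors of the surrounding blocks but does not fix the rule here, so the component-wise median is an assumed predictor, and the names `predict_mv`, `encode_dmv`, and `decode_mv` are hypothetical.

```python
from statistics import median

def predict_mv(neighbors):
    """Assumed predictor: component-wise median of the neighbor motion vectors."""
    xs = [mv[0] for mv in neighbors]
    ys = [mv[1] for mv in neighbors]
    return (median(xs), median(ys))

def encode_dmv(mv, neighbors):
    """Encoder side: only the difference from the PMV is passed on for coding."""
    pmv = predict_mv(neighbors)
    return (mv[0] - pmv[0], mv[1] - pmv[1])

def decode_mv(dmv, neighbors):
    """Decoder side: the same PMV is rebuilt from already decoded neighbors."""
    pmv = predict_mv(neighbors)
    return (pmv[0] + dmv[0], pmv[1] + dmv[1])

# Three hypothetical neighbor motion vectors, in arbitrary integer units.
neighbors = [(4, 2), (6, -2), (5, 0)]
dmv = encode_dmv((7, 1), neighbors)
mv = decode_mv(dmv, neighbors)
```

Because encoder and decoder derive the PMV from the same previously decoded neighbors, the decoder recovers the motion vector exactly from the DMV alone.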
[0005] Further, at the time of performing the inter prediction
encoding, the resolution may be collectively enhanced through
interpolation of the reference frame, and a DMV corresponding to a
differential value between the current motion vector and a PMV
determined through estimation of the motion of the current block
may be then encoded and transmitted. In this event, the enhancement
of the resolution of a reference video (i.e. the video of the
reference frame) enables a more exact inter prediction and thus
reduces the quantity of bits generated by the encoding of the
residual signal between the original video and a predicted video.
However, the enhancement of the resolution of the reference video
also causes an enhancement of the resolution of the motion vector,
which increases the quantity of bits generated by encoding of the
DMV. In contrast, although a decrease of the resolution of the
reference video increases the quantity of bits generated by the
encoding of the residual signal, the decrease of the resolution of
the reference video decreases the resolution of the motion vector,
which also decreases the quantity of bits generated by encoding of
the DMV.
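The trade-off in paragraph [0005] can be made concrete with a small bit-count sketch. It assumes, purely for illustration, that each DMV component is coded with a signed 0-th order Exp-Golomb code; the paragraph itself does not specify the entropy code, and the names `se_golomb_bits` and `dmv_bits` are hypothetical.

```python
from fractions import Fraction

def se_golomb_bits(v):
    """Bit length of a signed 0-th order Exp-Golomb codeword for integer v."""
    # Signed-to-unsigned mapping: 0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ...
    code_num = 2 * abs(v) - (1 if v > 0 else 0)
    # An unsigned codeword for code_num has 2 * floor(log2(code_num + 1)) + 1 bits.
    return 2 * (code_num + 1).bit_length() - 1

def dmv_bits(displacement, resolution):
    """Bits for one DMV component given in pixels, at a given MV resolution."""
    units = displacement / resolution
    if units.denominator != 1:
        raise ValueError("displacement is not representable at this resolution")
    return se_golomb_bits(int(units))

# The same 1.5-pixel difference expressed at 1/2, 1/4 and 1/8 pixel resolution.
costs = {
    str(res): dmv_bits(Fraction(3, 2), res)
    for res in (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8))
}
```

Under this assumed code, the same physical displacement costs more bits as the motion vector resolution becomes finer, which is the DMV side of the trade-off; the residual side depends on the video content and is not modelled here.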
[0006] As described above, since the conventional inter prediction
encoding uses motion vectors of the same resolution obtained by
interpolating all video encoding units, such as blocks, slices, and
pictures, of a video with the same resolution, it is difficult for
the conventional inter prediction encoding to achieve efficient
encoding, which may degrade the compression efficiency.
[0007] Further, since the conventional inter prediction decoding
operates in correspondence with the inter prediction encoding, it is
difficult to expect to improve the efficiency of the inter
prediction decoding in a state where the compression efficiency of
the inter prediction encoding is deteriorated.
DISCLOSURE
Technical Problem
[0008] Therefore, the present disclosure has been made in view of
the above mentioned problems in the inter prediction encoding and
inter prediction decoding of a video to improve the encoding
efficiency by interpolating a reference picture through a
determination of a filter or a filter coefficient for interpolating
the reference picture according to characteristics of the video and
interpolating the reference picture through a multi-stage filtering
or adaptively changing the resolution of a motion vector.
Technical Solution
[0009] An aspect of the present disclosure provides an apparatus
for interpolating a reference picture, including: a filter selector
for selecting a plurality of filters for interpolating the
reference picture into an interpolated reference picture; and a
filter for generating a reference picture having a target precision
through a multi-stage filtering of the reference picture by using
the plurality of filters.
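One way a filter selector of this kind could choose among fixed-coefficient filters is to keep the candidate whose interpolation result differs least from the current picture, as recited in the claims. The sketch below is a one-dimensional illustration under assumptions: the candidate taps (a bilinear filter and the six-tap filter (1, -5, 20, 20, -5, 1) known from H.264), the sum of absolute differences as the difference measure, the edge clamping, and the names `interpolate_half` and `select_filter` are all illustrative choices, not taken from the disclosure.

```python
def interpolate_half(ref, taps):
    """Half-sample values between ref[i] and ref[i + 1], with clamped edges."""
    half = len(taps) // 2
    total = sum(taps)
    last = len(ref) - 1
    out = []
    for i in range(last):
        acc = 0
        for k, c in enumerate(taps):
            j = min(max(i - half + 1 + k, 0), last)  # clamp to picture bounds
            acc += c * ref[j]
        out.append(acc / total)
    return out

def select_filter(ref, current_half, candidates):
    """Return the candidate name with the smallest SAD against current_half."""
    def sad(name):
        predicted = interpolate_half(ref, candidates[name])
        return sum(abs(p - c) for p, c in zip(predicted, current_half))
    return min(candidates, key=sad)

# Assumed candidate set of fixed-coefficient filters.
CANDIDATES = {
    "bilinear": [1, 1],
    "six_tap": [1, -5, 20, 20, -5, 1],
}

# A linear ramp is reproduced exactly by the bilinear candidate.
ramp_choice = select_filter([0, 10, 20, 30, 40], [5, 15, 25, 35], CANDIDATES)
```

In a real encoder the comparison would be made on motion-compensated two-dimensional blocks; the one-dimensional form only shows the selection rule.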
[0010] Another aspect of the present disclosure provides an
apparatus for interpolating a reference picture, including: a
filter information decoder for reconstructing information on a
plurality of filters through decoding a bitstream;
[0011] and a filter for generating a reference picture having a
target precision through a multi-stage filtering of the reference
picture by using a plurality of filters identified by a
reconstructed information on the plurality of filters.
[0012] Yet another aspect of the present disclosure provides an
apparatus for encoding a video, including: a reference picture
interpolator for interpolating a reference picture to have a target
precision through a multi-stage filtering of the reference picture
by using a plurality of filters; and an inter prediction encoder
for performing an inter prediction encoding of the video by using
the interpolated reference picture having the target precision.
[0013] Yet another aspect of the present disclosure provides an
apparatus for decoding a video, including: a reference picture
interpolator for interpolating a reference picture to have a target
precision through a multi-stage filtering of the reference picture
by using a plurality of filters identified by information on the
plurality of filters reconstructed through decoding a bitstream;
and an inter prediction decoder for reconstructing a video through
an inter prediction decoding of a bitstream by using an
interpolated reference picture having the target precision.
[0014] Yet another aspect of the present disclosure provides a
method of interpolating a reference picture, including: selecting a
first filter for interpolating a sub-pixel by using an integer
pixel of the reference picture; interpolating the reference picture
by using the first filter; selecting a second filter for
interpolating a sub-pixel of a target precision by using the
integer pixel and an interpolated sub-pixel; and interpolating an
interpolated reference picture by using the second filter.
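The two-stage method of paragraph [0014] can be sketched in one dimension as follows. The first stage derives sub-pixels from integer pixels only; the second stage filters the already interpolated picture, so it uses both integer pixels and interpolated sub-pixels. The specific taps, the edge clamping, and the names `upsample_once` and `multi_stage_interpolate` are assumptions made for the example.

```python
def upsample_once(samples, taps):
    """Insert one filtered sample between each pair of neighbouring samples."""
    half = len(taps) // 2
    total = sum(taps)
    last = len(samples) - 1
    out = []
    for i in range(last):
        out.append(samples[i])  # existing sample is kept unchanged
        acc = sum(
            c * samples[min(max(i - half + 1 + k, 0), last)]  # clamped edges
            for k, c in enumerate(taps)
        )
        out.append(acc / total)
    out.append(samples[last])
    return out

def multi_stage_interpolate(ref, filters):
    """Apply one filter per stage; each stage doubles the sample precision."""
    picture = list(ref)
    for taps in filters:
        picture = upsample_once(picture, taps)
    return picture

# Stage 1: assumed six-tap filter on integer pixels (1/2 precision).
# Stage 2: assumed bilinear filter on integer and half pixels (1/4 precision).
reference = [0, 8, 16, 24]
quarter_pel = multi_stage_interpolate(reference, [[1, -5, 20, 20, -5, 1], [1, 1]])
```

With two stages, every fourth sample of the result is an original integer pixel, and the samples in between are the interpolated sub-pixels at quarter precision.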
[0015] Yet another aspect of the present disclosure provides a
method of interpolating a reference picture, including:
reconstructing information on a first filter and information on a
second filter through decoding a bitstream; interpolating the
reference picture by using the first filter identified by the
information on the first filter; and interpolating an interpolated
reference picture by using the second filter identified by the
information on the second filter.
[0016] Yet another aspect of the present disclosure provides a
method of encoding a video, including: interpolating a reference
picture to have a target precision through a multi-stage filtering
of the reference picture by using a plurality of filters; and
performing an inter prediction encoding of the video by using an
interpolated reference picture having the target precision.
[0017] Yet another aspect of the present disclosure provides a
method of decoding a video, including: interpolating a reference
picture to have a target precision through a multi-stage filtering
of the reference picture by using a plurality of filters identified
by information reconstructed through decoding a bitstream; and
reconstructing the video through an inter prediction decoding of
the bitstream by using an interpolated reference picture having the
target precision.
Advantageous Effects
[0018] According to the present disclosure as described above, a
video can be efficiently encoded through an inter prediction
encoding of the video by interpolating a reference picture through
a determination of a filter or a filter coefficient for
interpolating the reference picture according to characteristics of
the video and interpolating the reference picture through a
multi-stage filtering or adaptively changing the resolution of a
motion vector in the unit of predetermined areas.
DESCRIPTION OF DRAWINGS
[0019] FIG. 1 is a schematic diagram showing a video encoding
apparatus,
[0020] FIG. 2 is an exemplary diagram for illustrating the process
of interpolating a reference picture,
[0021] FIG. 3 is an exemplary diagram for illustrating the process
of determining a predicted motion vector,
[0022] FIG. 4 shows an example of truncated unary codes wherein a
maximum value T thereof is 10,
[0023] FIG. 5 shows an example of the 0-th, first, and second order
Exp-Golomb codes,
[0024] FIG. 6 shows an example of the Concatenated Truncated
Unary/K-th Order Exp-Golomb Code wherein the maximum value T is 9
and K is 3,
[0025] FIG. 7 shows an example of a sequence of Zigzag
scanning,
[0026] FIG. 8 is a schematic block diagram of a video decoding
apparatus,
[0027] FIG. 9 is a block diagram illustrating a video encoding
apparatus according to the first aspect of the present
disclosure,
[0028] FIGS. 10A to 10C are exemplary diagrams for illustrating
motion vector resolutions hierarchically expressed by a Quadtree
structure according to an aspect of the present disclosure,
[0029] FIG. 11 illustrates a hierarchically expressed result of
encoded motion vector resolutions in a Quadtree structure according
to an aspect of the present disclosure,
[0030] FIG. 12 illustrates motion vector resolutions of areas
determined according to an aspect of the present disclosure,
[0031] FIG. 13 illustrates an example of motion vector resolution
hierarchically expressed in a tag tree structure according to an
aspect of the present disclosure,
[0032] FIG. 14 illustrates a result of the encoding of the motion
vector resolutions hierarchically expressed in a tag tree structure
according to an aspect of the present disclosure,
[0033] FIG. 15 illustrates an example of a process for determining
a motion vector resolution by using surrounding pixels of an area
according to an aspect of the present disclosure,
[0034] FIG. 16 is a view for illustrating the process of predicting
a predicted motion vector according to an aspect of the present
disclosure,
[0035] FIG. 17 is a flowchart for describing a method for encoding
a video by using an adaptive motion vector resolution according to
an aspect of the present disclosure,
[0036] FIG. 18 is a schematic block diagram illustrating a video
decoding apparatus using an adaptive motion vector according to an
aspect of the present disclosure,
[0037] FIG. 19 is a flowchart of a method for decoding a video by
using an adaptive motion vector resolution according to an aspect
of the present disclosure,
[0038] FIG. 20 illustrates another scheme of dividing a node into
lower layers according to an aspect of the present disclosure,
[0039] FIG. 21 illustrates an example of a bit string allocated to
each of symbols depending on the motion vector resolutions
according to an aspect of the present disclosure,
[0040] FIG. 22 illustrates optimum motion vectors of a current
block and surrounding blocks in order to describe a process of
determining a resolution of a motion vector,
[0041] FIG. 23 illustrates a table showing conversion formulas
according to the motion vector resolution,
[0042] FIG. 24 illustrates a table showing the resolutions of
motion vectors of surrounding blocks converted based on block X to
be currently encoded,
[0043] FIG. 25 illustrates a code number table of a differential
motion vector according to the motion vector resolutions,
[0044] FIG. 26 illustrates optimum motion vectors of a current
block and surrounding blocks in order to describe a process of
determining a resolution of a differential motion vector by the
resolution determiner,
[0045] FIG. 27 illustrates a code number table of differential
motion vectors according to the differential motion vector
resolutions,
[0046] FIG. 28 illustrates a motion vector of the current block and
a reference motion vector of surrounding blocks,
[0047] FIG. 29 illustrates a code number table of a differential
reference motion vector according to the differential reference
motion vector resolution,
[0048] FIG. 30 illustrates an example of indexing and encoding a
reference picture based on a distance between a current picture and
a reference picture,
[0049] FIG. 31 is a table illustrating an example of reference
picture indexes according to reference picture numbers and
resolutions,
[0050] FIG. 32 is a schematic block diagram illustrating a video
encoding apparatus 3200 using an adaptive motion vector according
to the second aspect of the present disclosure,
[0051] FIG. 33 illustrates resolution identification flags in the
case in which the appointed resolutions are 1/2 and 1/4,
[0052] FIG. 34 illustrates a current block and its surrounding
blocks,
[0053] FIG. 35 illustrates a context model according to the
conditions,
[0054] FIG. 36 illustrates resolution identification flags in the
case in which the appointed resolutions are 1/2, 1/4, and 1/8,
[0055] FIG. 37 illustrates a context model according to the
conditions,
[0056] FIGS. 38 and 39 illustrate examples of adaptability degrees
according to distances between the current picture and reference
pictures,
[0057] FIG. 40 illustrates an example of storing different
reference picture index numbers according to predetermined
resolution sets,
[0058] FIG. 41 illustrates an example of a structure for encoding
of reference pictures,
[0059] FIG. 42 illustrates an example of resolution sets of
reference pictures when the resolution sets are appointed to 1/2
and 1/4,
[0060] FIG. 43 illustrates an example of a resolution of a current
block and resolutions of surrounding blocks,
[0061] FIG. 44 illustrates another example of a resolution of a
current block and resolutions of surrounding blocks,
[0062] FIG. 45 illustrates resolution identification flags
according to resolutions,
[0063] FIG. 46 illustrates an example of the resolution of the
current block and the resolutions of surrounding blocks,
[0064] FIG. 47 is a flowchart illustrating a video encoding method
using an adaptive motion vector resolution according to the second
aspect of the present disclosure,
[0065] FIG. 48 is a schematic block diagram illustrating a video
decoding apparatus using an adaptive motion vector according to the
second aspect of the present disclosure,
[0066] FIG. 49 illustrates an example of surrounding motion vectors
of a current block,
[0067] FIG. 50 illustrates an example of converted values of
surrounding motion vectors according to the current resolution,
[0068] FIG. 51 is a flowchart illustrating a video decoding method
using an adaptive motion vector resolution according to the second
aspect of the present disclosure,
[0069] FIG. 52 is a schematic block diagram of a video decoding
apparatus according to a third aspect of the present
disclosure,
[0070] FIG. 53 is a schematic block diagram of a reference picture
interpolating apparatus for a video encoding according to an aspect
of the present disclosure,
[0071] FIGS. 54 and 55 show examples of filters used in a
multi-stage filtering according to an aspect of the present
disclosure and FIG. 56 shows an example for describing a process of
the multi-stage filtering according to an aspect of the present
disclosure,
[0072] FIG. 57 is a flowchart of a reference picture interpolating
method for a video encoding according to an aspect of the present
disclosure,
[0073] FIG. 58 is a flowchart of a video encoding method according
to the third aspect of the present disclosure,
[0074] FIG. 59 is a schematic block diagram of a video decoding
apparatus according to the third aspect of the present
disclosure,
[0075] FIG. 60 is a schematic block diagram of a reference picture
interpolating apparatus for a video decoding according to an aspect
of the present disclosure,
[0076] FIG. 61 is a flowchart of a reference picture interpolating
method for a video decoding according to an aspect of the present
disclosure,
[0077] FIG. 62 is a flowchart of a video decoding method according
to the third aspect of the present disclosure,
[0078] FIG. 63 illustrates a table of a filter tap according to the
resolution,
[0079] FIG. 64 illustrates a table indicating types of a filter tap
varying according to the resolution of a motion vector,
[0080] FIG. 65 illustrates another table indicating types of a
filter tap varying according to the resolution of a motion
vector,
[0081] FIG. 66 illustrates a table in the case in which an optimum
position is found using 1/2 and 1/4 resolution,
[0082] FIG. 67 shows an example of a video encoding apparatus
according to a fourth aspect of the present disclosure,
[0083] FIG. 68 illustrates yet another table indicating types of a
filter tap depending on the resolution of a motion vector,
[0084] FIG. 69 shows an example of resolution identification flags
according to the resolution,
[0085] FIG. 70 shows another example of resolution identification
flags according to the resolution,
[0086] FIG. 71 illustrates a video encoding method according to the
fourth aspect of the present disclosure,
[0087] FIG. 72 is a schematic block diagram of a video decoding
apparatus according to the fourth aspect of the present disclosure,
and
[0088] FIG. 73 illustrates a video decoding method according to the
fourth aspect of the present disclosure.
MODE FOR INVENTION
[0089] Hereinafter, aspects of the present disclosure will be
described in detail with reference to the accompanying drawings. In
the following description, the same elements will be designated by
the same reference numerals although they are shown in different
drawings. Further, in the following description of the present
disclosure, a detailed description of known functions and
configurations incorporated herein will be omitted when it may make
the subject matter of the present disclosure rather unclear.
[0090] Additionally, in describing the components of the present
disclosure, there may be terms used like first, second, A, B, (a),
and (b). These are solely for the purpose of differentiating one
component from another and do not imply or suggest the
substance, order, or sequence of the components. If a component
is described as `connected`, `coupled`, or `linked` to another
component, the components may be not only directly
`connected`, `coupled`, or `linked` but also indirectly
`connected`, `coupled`, or `linked` via a third component.
[0091] A video encoding apparatus or video decoding apparatus
described hereinafter may be a personal computer or PC, notebook
or laptop computer, personal digital assistant or PDA, portable
multimedia player or PMP, PlayStation Portable or PSP, or mobile
communication terminal, smart phone or such devices, and represent
a variety of apparatuses equipped with, for example, a
communication device such as a modem for carrying out communication
between various devices or wired/wireless communication networks, a
memory for storing various programs for encoding videos and related
data, and a microprocessor for executing the programs to effect
operations and controls.
[0092] In addition, the video encoded into a bitstream by the video
encoding apparatus may be transmitted in real time or non-real-time
to the video decoding apparatus for decoding the same where it is
reconstructed and reproduced into the video after being transmitted
via a wired/wireless communication network including the Internet,
a short range wireless communication network, a wireless LAN
network, a WiBro (Wireless Broadband) also known as WiMax network,
and a mobile communication network or a communication interface
such as cable or USB (universal serial bus).
[0093] In addition, the video encoding apparatus and the video
decoding apparatus may be equipped with functions for performing
intra prediction as well as inter prediction. Because intra
prediction is not directly related to the aspects of the present
disclosure, a detailed description thereof will not be provided, to
avoid confusion.
[0094] A video typically includes a series of pictures, each of
which is divided into predetermined areas, such as blocks. When
each picture is divided into blocks, each of the blocks is
classified as an intra block or an inter block depending on how it
is encoded. An intra block is a block encoded through intra
prediction coding, which generates a predicted block by predicting
the current block from pixels of blocks within the current picture
that were previously encoded, decoded, and reconstructed, and then
encodes the differential value between the predicted block and the
pixels of the current block. An inter block is a block encoded
through inter prediction coding, which generates the predicted
block by predicting the current block in the current picture with
reference to one or more past pictures or future pictures, and then
encodes the differential value between the predicted block and the
current block. Here, the picture that is referenced in encoding or
decoding the current picture is called a reference picture.
[0095] The following description discusses apparatuses for encoding
and decoding a video by blocks through examples shown in FIGS. 1 to
8, wherein the block may be a macroblock having a size of M×N
or a subblock having a size of O×P. However, the encoding or
decoding of a video by blocks is just an example and a video may be
encoded or decoded not only by standardized areas, such as blocks,
but also by non-standardized areas.
[0096] FIG. 1 is a schematic diagram showing a video encoding
apparatus.
[0097] The video encoding apparatus 100 may include a predictor
110, a subtracter 120, a transformer 130, a quantizer 140, an
encoder 150, an inverse quantizer 160, an inverse transformer 170,
an adder 180, and a memory 190.
[0098] The predictor 110 generates a predicted block by performing
inter prediction on the current block. In other words, in response
to an input of a block to be currently encoded, i.e. a current
block, the predictor 110 predicts original pixel values of pixels
of the current block by using motion vectors of the current block
determined through motion estimation, to generate and output the
predicted block having predicted pixel values.
[0099] The subtracter 120 generates a residual block of the current
block by subtracting the predicted block from the current block.
Here, the outputted residual block includes a residual signal which
has a value obtained by subtracting the predicted pixel value of
the predicted block from the original pixel value of the current
block.
[0100] The transformer 130 generates a transformed block by
transforming the residual block. Specifically, the transformer 130
transforms a residual signal of the residual block outputted from
the subtracter 120 into the frequency domain to generate and
output the transformed block having a transform coefficient. Here,
the method used for transforming the residual signal into the
frequency domain may be the discrete cosine transform (DCT), the
Hadamard transform, or any of various other transform techniques,
including improved or modified versions of the DCT, whereby the
residual signal is transformed into transform coefficients in the
frequency domain.
[0101] The quantizer 140 quantizes the transformed block to
generate a transformed and quantized block. Specifically, the
quantizer 140 quantizes the transform coefficient of the
transformed block outputted from the transformer 130 to generate
and output the transformed and quantized block having a quantized
transform coefficient. Here, the quantizing method used may be
dead zone uniform threshold quantization (DZUTQ), a quantization
weighted matrix, or an improved variant of these.
[0102] The encoder 150 encodes the transformed and quantized block
to output a bitstream. In particular, the encoder 150 scans the
quantized transform coefficients of the transformed and quantized
block outputted from the quantizer 140 by zigzag scanning or
another scanning method, encodes the resulting frequency
coefficient string by using an encoding technique such as entropy
encoding, and generates and outputs the bitstream, which also
carries additional information needed to decode the block, such as
prediction mode information, a quantization parameter, and a motion
vector.
[0103] The inverse quantizer 160 carries out the inverse process of
quantization with respect to the transformed and quantized block.
Specifically, the inverse quantizer 160 inversely quantizes and
outputs the quantized transform coefficients of the transformed and
quantized block outputted from the quantizer 140.
[0104] The inverse transformer 170 carries out the inverse process
of transformation with respect to the transformed and inversely
quantized block. Specifically, the inverse transformer 170
inversely transforms the inversely quantized transform
coefficients from the inverse quantizer 160 to reconstruct the
residual block having the reconstructed residual coefficients.
[0105] The adder 180 adds the inversely transformed and
reconstructed residual block from the inverse transformer 170 to
the predicted block from the predictor 110 to reconstruct the
current block. The reconstructed current block is stored in the
memory 190 and may be accumulated by blocks or by pictures and then
transferred in units of pictures to the predictor 110 for possible
use in predicting other blocks including the next block or the next
picture.
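The loop formed by the subtracter 120 through the adder 180 can be sketched as follows. This is only an illustration under stated assumptions: a 4×4 block, an orthonormal floating-point DCT, and plain uniform rounding standing in for DZUTQ. The function names `encode_block` and `reconstruct_block` are hypothetical and do not appear in the disclosure.

```python
import math

N = 4  # block size chosen for this sketch (assumption)


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


C = dct_matrix(N)
CT = [list(col) for col in zip(*C)]  # transpose of C


def encode_block(current, predicted, qstep):
    """Subtracter 120 -> transformer 130 -> quantizer 140."""
    residual = [[c - p for c, p in zip(cr, pr)]
                for cr, pr in zip(current, predicted)]
    coeffs = matmul(matmul(C, residual), CT)
    # Uniform rounding quantizer (assumption; not DZUTQ).
    return [[int(round(v / qstep)) for v in row] for row in coeffs]


def reconstruct_block(levels, predicted, qstep):
    """Inverse quantizer 160 -> inverse transformer 170 -> adder 180."""
    coeffs = [[lv * qstep for lv in row] for row in levels]
    residual = matmul(matmul(CT, coeffs), C)
    return [[p + int(round(r)) for p, r in zip(pr, rr)]
            for pr, rr in zip(predicted, residual)]
```

Because the encoder reconstructs the block from the quantized levels rather than from the original pixels, the picture it stores in the memory 190 matches what a decoder would obtain from the same bitstream.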
[0106] Meanwhile, the predictor 110 determines the motion vector of
the current block by estimating the motion of the current block by
using the reference picture stored in the memory 190, and may
perform the motion estimation after enhancing the resolution of the
reference picture by interpolating the reference picture stored in
the memory 190.
[0107] FIG. 2 is a view for illustrating the process of
interpolating a reference picture.
[0108] FIG. 2 shows pixels of the reference picture stored in the
memory 190 and sub-pixels interpolated between them. Sub-pixels
a to s can be generated by interpolating previously
reconstructed pixels A to U of the reference picture by using an
interpolation filter, and the sub-pixels a to s interpolated
between the previously reconstructed pixels can increase the
resolution of the reference picture fourfold or more.
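A one-dimensional sketch of such sub-pixel interpolation is given below. The disclosure does not fix the filter at this point, so the 6-tap coefficients (1, -5, 20, 20, -5, 1)/32 for half-pel samples and the rounded averaging for quarter-pel samples are assumptions borrowed from the H.264 luma interpolation; the function names are hypothetical.

```python
# Assumption: H.264-style 6-tap half-pel filter, not taken from the disclosure.
HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)


def clip8(v):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, v))


def half_pel(row, i):
    """Half-pel sample between row[i] and row[i + 1]; edges are clamped."""
    acc = 0
    for t, tap in enumerate(HALF_PEL_TAPS):
        idx = min(max(i - 2 + t, 0), len(row) - 1)
        acc += tap * row[idx]
    return clip8((acc + 16) >> 5)  # divide by 32 with rounding offset


def interpolate_row_quarter(row):
    """Return the row resampled at quarter-pel spacing."""
    out = []
    for i in range(len(row) - 1):
        a, b = row[i], row[i + 1]
        h = half_pel(row, i)
        # integer pel, quarter pel, half pel, three-quarter pel
        out += [a, (a + h + 1) >> 1, h, (h + b + 1) >> 1]
    out.append(row[-1])
    return out
```

A row of n integer pixels yields 4(n - 1) + 1 samples, which corresponds to the fourfold increase in resolution mentioned above.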
[0109] The motion estimation refers to a process of finding a part
of an interpolated reference picture which is most similar to the
current block and outputting a block of the part and a motion
vector indicating the part. A predicted block found in this process
is subtracted from the current block by the subtracter 120, so as
to produce a residual block having a residual signal. Further, the
motion vector is encoded by the encoder 150.
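The search described above can be sketched as an exhaustive block-matching loop. The matching criterion (sum of absolute differences) and the full-search strategy are assumptions made for illustration, since the text does not prescribe them; the sketch also works at integer-pixel positions only, whereas a sub-pixel search would run the same loop over the interpolated reference picture.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n current block at
    (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(cur, ref, bx, by, n, search_range):
    """Return ((dx, dy), cost) of the best match within +/- search_range."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference picture.
            if not (0 <= by + dy and by + dy + n <= h
                    and 0 <= bx + dx and bx + dx + n <= w):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]
```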
[0110] When encoding the motion vector, the encoder 150 may predict
the motion vector of the current block by using motion vectors of
blocks adjacent to the current block and may encode the motion
vector by using the predicted motion vector.
[0111] FIG. 3 is a view for illustrating the process of determining
a predicted motion vector.
[0112] Referring to FIG. 3, based on an assumption that the current
block is X, a motion vector of an adjacent block A located at the
left side of the current block is MV_A (x component: MVx_A, y
component: MVy_A), a motion vector of an adjacent block B located
at the upper side of the current block is MV_B (x component: MVx_B,
y component: MVy_B), and a motion vector of an adjacent block C
located at the right upper side of the current block is MV_C (x
component: MVx_C, y component: MVy_C), each component of the
predicted motion vector MV_pred_X (x component: MVx_pred_X, y
component: MVy_pred_X) of the current block X may be determined as
a median value of each component of the motion vector of an
adjacent block of the current block as shown in Equation 1 below.
Meanwhile, the method of predicting a motion vector according to
the present disclosure is not limited to the method introduced
herein.
MVx_pred_X = median(MVx_A, MVx_B, MVx_C)
MVy_pred_X = median(MVy_A, MVy_B, MVy_C)
Equation 1
[0113] The encoder 150 may encode a differential vector having a
differential value between a motion vector and a predicted motion
vector. Various entropy encoding schemes, such as a Universal
Variable Length Coding (UVLC) scheme and a Context-Adaptive Binary
Arithmetic Coding (CABAC) scheme, may be used for encoding the
differential vector. Meanwhile, in the present disclosure, the
encoding method by the encoder 150 is not limited to the method
described herein.
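The median prediction of Equation 1 and the differential vector can be sketched as follows. Motion vectors are represented as (x, y) tuples, and the function names are hypothetical; the last function shows the corresponding reconstruction on the decoding side.

```python
def median3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]


def predict_mv(mv_a, mv_b, mv_c):
    """Component-wise median of the motion vectors of blocks A, B, and C."""
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))


def differential_mv(mv, mv_pred):
    """Differential vector to be encoded: motion vector minus prediction."""
    return (mv[0] - mv_pred[0], mv[1] - mv_pred[1])


def reconstruct_mv(mvd, mv_pred):
    """Decoder side: prediction plus decoded differential vector."""
    return (mvd[0] + mv_pred[0], mvd[1] + mv_pred[1])
```

For example, with MV_A = (4, -2), MV_B = (1, 3), and MV_C = (7, 0), the predicted motion vector is (4, 0), so a motion vector of (5, 1) is encoded as the differential vector (1, 1).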
[0114] In the case of encoding the differential vector by using the
UVLC, the differential vector may be encoded by using the K-th
order Exp-Golomb code. In this event, K may have a value of "0" or
another value. The prefix of the K-th order Exp-Golomb code has a
truncated unary code corresponding to
$l(x)=\lfloor \log_2(x/2^k+1) \rfloor$, and a suffix thereof may be
expressed by a binary-coded bit string of a value of
$x+2^k(1-2^{l(x)})$ having a length of $k+l(x)$, where k is the
order K.
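The prefix and suffix rule of the K-th order Exp-Golomb code can be sketched as follows for a non-negative integer x. The bit polarity of the unary prefix (zeros terminated by a one) is an assumption, since the figures are not reproduced here, and the function name is hypothetical.

```python
def exp_golomb_k(x, k=0):
    """K-th order Exp-Golomb code word of a non-negative integer x,
    returned as a string of '0' and '1' characters."""
    # l(x) = floor(log2(x / 2^k + 1)), computed with integer arithmetic.
    l = ((x >> k) + 1).bit_length() - 1
    # Unary prefix of length l(x) + 1 (polarity is an assumption).
    prefix = "0" * l + "1"
    # Suffix: x + 2^k * (1 - 2^l(x)), written with k + l(x) bits.
    suffix_val = x + (1 << k) * (1 - (1 << l))
    suffix_len = k + l
    suffix = format(suffix_val, "b").zfill(suffix_len) if suffix_len else ""
    return prefix + suffix
```

With k = 0 the values 0, 1, 2, and 3 map to `1`, `010`, `011`, and `00100`; with k = 1 the values 0 and 2 map to `10` and `0100`.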
[0115] FIG. 4 shows an example of truncated unary codes wherein a
maximum value T thereof is 10, and FIG. 5 shows an example of
the 0-th, first, and second order Exp-Golomb codes.
[0116] Further, when the differential vector is encoded using the
CABAC, the differential vector may be encoded using code bits of
the Concatenated Truncated Unary/K-th Order Exp-Golomb Code.
[0117] In the Concatenated Truncated Unary/K-th Order Exp-Golomb
Code, the maximum value T is 9 and K may be 3. FIG. 6 shows an
example of the Concatenated Truncated Unary/K-th Order Exp-Golomb
Code wherein the maximum value T is 9 and K is 3.
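A possible binarization with T = 9 and K = 3 is sketched below. It follows the UEGk scheme used for motion vector differences in H.264 CABAC, which is an assumption here: the bit polarity, the escape suffix, and the trailing sign bit may differ from what FIG. 6 shows. The function names are hypothetical.

```python
def truncated_unary(v, t):
    """v ones followed by a terminating zero, omitted when v equals t."""
    return "1" * v + ("0" if v < t else "")


def ueg_k(value, t=9, k=3):
    """Concatenated truncated unary / K-th order Exp-Golomb bin string
    of a signed integer (H.264-style UEGk, used here as an assumption)."""
    mag = abs(value)
    bits = truncated_unary(min(mag, t), t)
    if mag >= t:
        # Escape: Exp-Golomb suffix for the part of |value| beyond t.
        rem = mag - t
        while rem >= (1 << k):
            bits += "1"
            rem -= 1 << k
            k += 1
        bits += "0"
        if k:
            bits += format(rem, "b").zfill(k)
    if value != 0:
        bits += "1" if value < 0 else "0"  # sign bit
    return bits
```

Small magnitudes, which are the most frequent for differential vectors, use only the short unary prefix, while the Exp-Golomb suffix keeps the code length of large magnitudes logarithmic.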
[0118] FIG. 7 shows an example of a sequence of Zigzag
scanning.
[0119] The quantized frequency coefficients quantized by the
quantizer 140 may be scanned and encoded into a quantized frequency
coefficient string by the encoder 150. Block type quantized
frequency coefficients may be scanned according to not only the
zigzag sequence as shown in FIG. 7 but also various other
sequences.
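One common zigzag sequence can be generated as sketched below. The starting direction (first step to the right, then down-left along each anti-diagonal) is an assumption matching the conventional 4×4 zigzag scan and may differ from the sequence drawn in FIG. 7; the function names are hypothetical.

```python
def zigzag_order(n):
    """(row, col) positions of an n x n block in zigzag order."""
    coords = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal; odd diagonals run from top-right to
    # bottom-left (row ascending), even diagonals run the other way.
    return sorted(
        coords,
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square block of quantized coefficients into a string
    (list) of coefficients in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```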
[0120] FIG. 8 is a schematic block diagram of a video decoding
apparatus.
[0121] The video decoding apparatus 800 may include a decoder 810,
an inverse quantizer 820, an inverse transformer 830, an adder 840,
a predictor 850, and a memory 860.
[0122] The decoder 810 decodes a bitstream to extract the
transformed and quantized block. Specifically, the decoder 810
decodes a bit string extracted from the bitstream received and
inversely scans the result to reconstruct the transformed and
quantized block having a quantized transform coefficient. For this
purpose, the decoder 810 performs decoding that corresponds to the
encoding technique, such as entropy encoding, used by the encoder
150 of the video encoding apparatus 100.
[0123] Further, the decoder 810 may extract and decode an encoded
differential vector from the bitstream to reconstruct the
differential vector, and may predict a motion vector of the current
block and then add the predicted motion vector to the
reconstructed differential vector to reconstruct the motion vector
of the current block.
[0124] The inverse quantizer 820 inversely quantizes the
transformed and quantized block. Specifically, the inverse
quantizer 820 inversely quantizes the quantized transform
coefficient of the transformed and quantized block from the decoder
810. At this time, the inverse quantizer 820 in its operation
performs a reversal of the quantization technique used in the
quantizer 140 of the video encoding apparatus 100.
[0125] The inverse transformer 830 inversely transforms the
transformed and inversely quantized block to reconstruct the
residual block. Specifically, the inverse transformer 830 inversely
transforms the inversely quantized transform coefficient of the
transformed and inversely quantized block from the inverse
quantizer 820, wherein the inverse transformer 830 in its operation
performs a reversal of the transform technique used in the
transformer 130 of the video encoding apparatus 100.
[0126] The predictor 850 generates a predicted block by predicting
the current block by using the reconstructed motion vector of the
current block extracted and decoded from the bitstream.
[0127] The adder 840 adds the reconstructed residual block to the
predicted block to reconstruct the current block. Specifically, the
adder 840 adds a reconstructed residual signal of the reconstructed
residual block outputted from the inverse transformer 830 to the
predicted pixel values of the predicted block outputted from the
predictor 850 to calculate the reconstructed pixel values of the
current block, thereby reconstructing the current block.
[0128] The current block reconstructed by the adder 840 is stored
in the memory 860. The current blocks may be stored as reference
pictures by blocks or by pictures for use in the prediction of a
next block by the predictor 850.
[0129] As described above with reference to FIGS. 1 to 8, the video
encoding apparatus 100 and the video decoding apparatus 800 can
perform the inter prediction encoding and inter prediction
decoding after enhancing the resolution of the motion vector and
the reference picture by interpolating the reference picture in the
unit of sub-pixels. Specifically, they can enhance the resolution
of the motion vector by interpolating the reference picture with
the same resolution in the unit of pictures or picture groups.
[0130] An inter prediction with an enhanced resolution of
a reference picture enables a more precise inter prediction and
thus reduces the quantity of bits generated by the encoding of the
residual signal. However, the enhancement of the resolution of the
reference picture also inevitably enhances the resolution of the
motion vector, which increases the quantity of bits generated by
encoding of the motion vector. As a result, even the inter
prediction with an enhanced resolution of a reference picture may
fail to significantly increase the encoding efficiency or may even
degrade the encoding efficiency depending on the images.
[0131] The following description discusses a method and an
apparatus for inter prediction encoding and inter prediction
decoding, which can adaptively enhance the resolution of a
reference picture in the unit of areas having predetermined regular
or irregular sizes, such as pictures, slices, and blocks of images
according to the characteristics of the images, so that an area
having a relatively complex image or smaller movements is inter
prediction encoded and decoded with an enhanced resolution while an
area having a relatively simple image or larger movements is inter
prediction encoded and decoded with a lowered resolution.
[0132] FIG. 9 is a block diagram illustrating a video encoding
apparatus according to the first aspect of the present
disclosure.
[0133] A video encoding apparatus 900 using an adaptive motion
vector according to the first aspect of the present disclosure may
include an inter prediction encoder 910, a resolution change flag
generator 920, a resolution determiner 930, a resolution encoder
940, and a differential vector encoder 950. Meanwhile, the
resolution change flag generator 920, the resolution encoder 940,
and the differential vector encoder 950 are optional, and each of
them may be selectively included in the video encoding apparatus
900.
[0134] The inter prediction encoder 910 performs an inter
prediction encoding of a video in the unit of areas of the image by
using a motion vector according to a motion vector resolution
determined for each motion vector or each area of the video. The
inter prediction encoder 910 can be implemented by the video
encoding apparatus 100 described above with reference to FIG. 1. In
this event, when one or both of the resolution encoder 940 and the
differential vector encoder 950 of FIG. 9 are additionally included
and the function of the additionally included component or
components overlaps with the function of the encoder 150 within the
inter prediction encoder 910, the overlapping function may be
omitted from the encoder 150. Further, if the function of the
predictor 110 within the inter prediction encoder 910 overlaps with
the function of the resolution determiner 930, the overlapping
function may be omitted from the predictor 110.
[0135] Further, one or both of the resolution
encoder 940 and the differential vector encoder 950 may be
configured either as a component separate from the inter prediction
encoder 910 as shown in FIG. 9 or as a component integrally formed
with the encoder 150 within the inter prediction encoder 910.
Further, the flag information generated in the resolution change
flag generator 920 may be transformed into a bitstream either by
the resolution change flag generator 920 or by the encoder 150
within the inter prediction encoder 910.
[0136] However, although the above description with reference to
FIG. 1 discusses encoding of a video in the unit of blocks by the
video encoding apparatus 100, the inter prediction encoder 910 may
divide the video into areas with various shapes or sizes, such as
blocks including macroblocks or subblocks, slices, or pictures, and
perform the encoding in the unit of areas each having a
predetermined size. Such a predetermined area may be not only a
macroblock having a size of 16×16 but also blocks with
various shapes or sizes, such as a block having a size of
64×64 and a block having a size of 32×16.
[0137] Further, although the video encoding apparatus 100 described
above with reference to FIG. 1 performs an inter prediction
encoding using motion vectors having the same motion vector
resolution for all the blocks of an image, the inter prediction
encoder 910 may perform an inter prediction encoding by using
motion vectors having motion vector resolutions differently
determined according to the video areas. The video areas according
to which the motion vector resolutions may be differently
determined may be pictures (frames or fields), slices, or image
blocks each having a predetermined size.
[0138] That is, in the inter prediction encoding of an area, the
inter prediction encoder 910 performs a motion estimation after
enhancing the resolution of the area by interpolating a reference
picture which has been previously encoded, decoded, and
reconstructed. For the interpolation of the reference picture,
various interpolation filters, such as a Wiener filter, a bilinear
filter, and a Kalman filter, may be used, and the resolution may be
set in units of various integer pixels or fractional pixels, such
as 2/1 pixel, 1/1 pixel, 1/2 pixel, 1/4 pixel, and 1/8 pixel.
Further, according to such various resolutions, there may be
different filter coefficients or different numbers of filter
coefficients to be used.
[0139] For example, a Wiener filter may be used for the
interpolation when the resolution corresponds to the 1/2 pixel unit
and a Kalman filter may be used for the interpolation when the
resolution corresponds to the 1/4 pixel unit. Moreover, different
numbers of filter taps may be used for the interpolation of the
respective resolutions. For example, an 8-tap Wiener filter may be
used for the interpolation when the resolution corresponds to the
1/2 pixel unit and a 6-tap Wiener filter may be used for the
interpolation when the resolution corresponds to the 1/4 pixel
unit.
[0140] Further, the inter prediction encoder 910 may determine an
optimum filter coefficient, which has minimum errors between a
picture to be currently encoded and a reference picture, for each
motion vector resolution and then encode the filter coefficient. In
this event, any of the Wiener filter, Kalman filter, etc. may be
used with an arbitrary number of filter taps, and each resolution
may use a distinct number of filters and filter taps.
[0141] In addition, the inter prediction encoder 910 may perform an
inter prediction by using reference pictures interpolated using
different filters depending on the resolutions of motion vectors
or areas. For example, as noted from Equation 2 below, in order to
calculate an optimum filter coefficient, which has a minimum Sum of
Squared Difference (SSD) between a picture to be currently encoded
and a reference picture, the Wiener filter may be used for
calculating an optimum filter tap for each resolution.
$$(e^{sp})^2 = \sum_{x}\sum_{y}\left(S_{x,y} - \sum_{i}\sum_{j} h^{sp}_{i,j}\,P_{\tilde{x}-i,\,\tilde{y}-j}\right)^2$$
Equation 2
$S_{x,y}$: current frame, $P_{x,y}$: reference frame
($\tilde{x}$ and $\tilde{y}$ indicate positions at which
the motion vectors are applied)
[0142] In Equation 2, S indicates a pixel of the current picture,
$h^{sp}$ indicates a filter coefficient of the pixel domain, P
indicates a pixel of a reference picture, $e^{sp}$ indicates an
error, and x and y indicate locations of the current pixel.
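The minimization in Equation 2 leads to a set of linear (Wiener-Hopf) equations in the filter coefficients. The sketch below solves a one-dimensional version of that problem; reducing the two-dimensional sums to one dimension, ignoring the motion displacement, and the function names are all simplifying assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def wiener_taps(cur, ref, taps):
    """Coefficients h minimising sum_x (cur[x] - sum_i h[i] * ref[x - i])^2."""
    xs = range(taps - 1, len(cur))
    # Autocorrelation matrix of the reference samples.
    r = [[sum(ref[x - i] * ref[x - j] for x in xs) for j in range(taps)]
         for i in range(taps)]
    # Cross-correlation between current and reference samples.
    p = [sum(cur[x] * ref[x - i] for x in xs) for i in range(taps)]
    return solve_linear(r, p)
```

If the current samples are exactly the average of two neighbouring reference samples, the solver recovers the coefficients 0.5 and 0.5, that is, the filter that makes the squared error zero.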
[0143] That is, the inter prediction encoder 910 may calculate the
filter coefficient for each resolution by using a Wiener-Hopf
Equation like Equation 2, encode an optimum filter coefficient for
each resolution, and include the encoded filter coefficient in a
bitstream. Then, the inter prediction encoder 910 may perform an
interpolation filtering for the reference picture and then generate
and encode a reference picture for each resolution. In this event,
a filter coefficient of a 6-tap Wiener filter may be calculated and
encoded for the 1/2 resolution, a filter coefficient of an 8-tap
Kalman filter for the 1/4 resolution, and a filter coefficient of a
linear filter for the 1/8 resolution. The encoded filter
coefficients are included in the bitstream, and the reference
picture for each resolution may then be interpolated and encoded.
In the encoding,
the inter prediction encoder 910 may use the reference picture
interpolated by the 6-tap Wiener filter when the resolution of the
current area or motion vector is the 1/2 resolution, and may use a
reference picture interpolated by the 8-tap Kalman filter when the
resolution of the current area or motion vector is the 1/4
resolution.
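The selection of an interpolated reference picture by resolution can be sketched as a simple lookup. The table reproduces only the example pairing given above; the tap count of the linear filter is not stated in the text and is therefore left as `None`, and all names are hypothetical.

```python
from fractions import Fraction

# Resolution -> (filter family, tap count), following the example in the
# text. The tap count of the linear filter is not stated, hence None.
FILTER_BY_RESOLUTION = {
    Fraction(1, 2): ("wiener", 6),
    Fraction(1, 4): ("kalman", 8),
    Fraction(1, 8): ("linear", None),
}


def select_reference(resolution, interpolated_refs):
    """Return the reference picture interpolated with the filter assigned
    to the resolution of the current area or motion vector.

    interpolated_refs maps (filter family, tap count) to a picture.
    """
    return interpolated_refs[FILTER_BY_RESOLUTION[resolution]]
```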
[0144] The resolution change flag generator 920 may generate a
resolution change flag into the bitstream, which indicates whether
to define a motion vector resolution and/or a resolution of a
differential motion vector with respect to each area of an image or
each motion vector. The area for the change of a motion is vector
resolution and/or a resolution of a differential motion vector by
the resolution change flag may be a block, a macroblock, a group of
blocks, a group of macroblocks, or an area having a predetermined
size, such as M.times.N. Therefore, the resolution change flag
generator 920 may generate the resolution change flag into the
bitstream, which indicates whether to perform the inter prediction
encoding by using motion vectors having a fixed motion vector
resolution for sub-areas within a part of or all of areas of a
video or whether to determine a motion vector resolution of each
area (or motion vector), perform an inter prediction encoding by
using a motion vector having the determined motion vector
resolution, and generate a differential motion vector having a
fixed resolution. Such a resolution change flag may be determined
and generated either according to configuration information input
by a user or according to a preset determination criterion based on
an analysis of the video to be encoded. The resolution change flag
may be included in a bitstream header such as a picture parameter
set, a sequence parameter set, or a slice header.
[0145] When the resolution change flag generated by the resolution
change flag generator 920 indicates fixation of the motion vector
resolution and/or resolution of the differential motion vectors,
the inter prediction encoder 910 performs an inter prediction
encoding of each of the sub-areas defined in the header by using
motion vectors of the sub-areas having the fixed motion vector
resolution. For example, when a resolution change flag included in
a slice header of a slice indicates that the motion vector
resolution is fixed, the inter prediction encoder 910 may determine
a motion vector resolution having the lowest rate-distortion cost
for an image of the slice and then perform an inter prediction
encoding for all areas of the slice by using motion vectors of the
areas having the determined motion vector resolution.
[0146] Further, when the resolution change flag indicates that the
resolutions of the motion vectors and/or differential motion
vectors are adaptively changing for each area or motion vector, the
inter prediction encoder 910 performs an inter prediction encoding
of each area by using a motion vector of each area having a motion
vector resolution determined by the resolution determiner 930. For
example, when a resolution change flag included in a slice header
of a slice indicates that the resolutions of the motion vector
and/or differential motion vector adaptively change for each area
or motion vector, the inter prediction encoder 910 may perform an
inter prediction encoding of each area within the slice by using a
motion vector of the area having a motion vector resolution
determined by the resolution determiner 930. As another example,
when a resolution change flag included in a slice header of a slice
indicates that the motion vector resolution of the motion vector
and/or differential motion vector adaptively changes for each
motion vector, the inter prediction encoder 910 may perform an
inter prediction encoding of each motion vector within the slice by
using a motion vector resolution determined for the motion vector
by the resolution determiner 930.
[0147] When a resolution change flag indicating that the motion
vector resolution of the motion vectors and/or differential motion
vectors adaptively changes for each area or motion vector is
generated by the resolution change flag generator 920, the
resolution determiner 930 determines an optimum motion vector
resolution and/or differential motion vector resolution of each
motion vector and/or differential motion vector through changing
the motion vector resolution and/or differential motion vector
resolution by using a predetermined cost function, such as a
rate-distortion cost (RD cost). In this event, the optimum motion
vector resolution and/or differential motion vector resolution
simply refers to a resolution of a motion vector and/or
differential motion vector determined by using a predetermined cost
function and does not imply that the determined optimum motion
vector resolution and/or differential motion vector resolution
always has an optimum performance. When the predetermined cost
function is a rate-distortion cost, a motion vector resolution
and/or differential motion vector resolution having the lowest
rate-distortion cost may be the optimum motion vector resolution
and/or differential motion vector resolution.
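The selection described in this paragraph can be sketched as follows. This is only a minimal illustration, assuming the common Lagrangian form J = D + lambda * R for the rate-distortion cost; the function name, the lambda value, and the distortion and rate figures are hypothetical and are not taken from the disclosure.

```python
from fractions import Fraction

def select_resolution(candidates, lam):
    """Return the candidate resolution with the lowest RD cost.

    candidates maps a resolution to a (distortion, rate_bits) pair
    obtained by trial encoding; lam is the Lagrange multiplier.
    """
    def rd_cost(res):
        distortion, rate = candidates[res]
        return distortion + lam * rate  # assumed cost form J = D + lam * R
    return min(candidates, key=rd_cost)

# Hypothetical trial-encoding measurements for one area.
trial = {
    Fraction(1, 2): (1200.0, 10),
    Fraction(1, 4): (900.0, 14),
    Fraction(1, 8): (880.0, 20),
}
best = select_resolution(trial, lam=20.0)
```

With these made-up figures the 1/4 resolution has the lowest cost, although it has neither the lowest distortion nor the lowest rate, which is why the "optimum" resolution is defined only relative to the chosen cost function.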
[0148] The resolution encoder 940 may encode the optimum motion
vector resolution and/or differential motion vector resolution
determined for each area or motion vector. That is, the resolution
encoder 940 may encode a motion vector resolution identification
flag for indicating a motion vector resolution and/or a
differential motion vector resolution identification flag
indicating a differential motion vector resolution of each area
determined by the resolution determiner 930 and then include the
encoded resolution identification flag in a bitstream. There may be
various ways for implementing the motion vector resolution
identification flag or differential motion vector resolution
identification flag. The resolution indicated by the resolution
identification flag may be adopted by either only one or both of a
motion vector resolution and a differential motion vector
resolution.
[0149] The differential vector encoder 950 may encode a
differential motion vector corresponding to a difference between a
predicted motion vector and a motion vector according to a motion
vector resolution determined for each motion vector or area. The
differential motion vector may be differentially encoded according
to the differential motion vector resolution.
[0150] A resolution identification flag indicating a motion vector
resolution may indicate either one of the resolutions of x
component and y component of a motion vector for motion estimation
or both. That is, when a camera taking an image moves or when an
object within a video moves, the resolution determiner 930 may
separately determine the resolutions of the x component and the y
component of the motion vector. For example, the resolution
determiner may determine a resolution in 1/8 pixel unit for an x
component of a motion vector of a certain area as it determines a
resolution in 1/2 pixel unit for a y component of the motion
vector. Then, the inter prediction encoder 910 may determine the
motion vector of the corresponding area in different resolutions
for the x component and the y component and perform motion
estimation and motion compensation by using the determined motion
vector, so as to perform an inter prediction encoding of the
area.
[0151] FIG. 22 illustrates optimum motion vectors of a current
block and surrounding blocks in order to describe a process of
determining the resolution of a motion vector by the resolution
determiner 930.
[0152] When a flag which indicates that the motion vector
resolution and/or differential motion vector resolution adaptively
changes according to the area or motion vector, is generated by the
resolution change flag generator 920 (in the second aspect, a
resolution appointment flag generated by a resolution appointment
flag generator 3220 enables setting of whether to change or fix the
motion vector resolution and/or differential motion vector
resolution), it is assumed that the kinds of resolutions of the
current block and surrounding blocks are 1/2, 1/4, and 1/8 and an
optimum resolution has been determined as shown in FIG. 22. On this
assumption, block A has a resolution of 1/2 and a motion vector of
(4/2, -8/2), block B has a resolution of 1/4 and a motion vector of
(36/4, -28/4), block C has a resolution of 1/8 and a motion vector
of (136/8, -104/8), and the current block has a resolution of 1/4
and a motion vector of (16/4, 20/4). In this event, the predicted
motion vector may follow the resolution of the current motion
vector. Then, in order to calculate the predicted motion vector, a
resolution conversion process may be carried out to equalize the
resolution of the surrounding motion vectors to the resolution of
the current motion vector.
[0153] FIG. 23 illustrates a table showing conversion formulas
according to the motion vector resolutions, and FIG. 24 illustrates
a table showing the resolutions of motion vectors of surrounding
blocks converted based on block X to be currently encoded.
[0154] The predicted motion vector may be obtained by using
surrounding motion vectors. If the surrounding motion vectors have
been stored according to their respective resolutions and are
different from the current motion vector, the conversion can be
made using a multiplication and a division. Further, in this event,
the resolution conversion process may be performed at the time of
obtaining a predicted motion vector. Otherwise, if the surrounding
motion vectors have been stored based on the best resolution and
the resolution of the current motion vector is not the best
resolution, the conversion can be made using a division. Further,
in this event, when the resolution conversion process finds an
encoded motion vector which is in less than the highest resolution,
it may carry out a resolution conversion into the highest resolution.
Otherwise, if the surrounding motion vectors have been stored based
on a certain reference resolution and the resolution of the current
motion vector is different from the reference resolution in which
the surrounding motion vectors are stored, the conversion can be
made using a multiplication and a division. Further, in this event,
when the resolution conversion process finds an encoded motion
vector which is stored in a resolution different from the reference
resolution, it may carry out a resolution conversion into the
reference resolution. In the case of performing the division,
rounding may be used, including a round-off, a round-up, and a
round-down. In the aspect shown in FIGS. 23 and 24, a round-off is
used. Further, in the shown aspect, surrounding motion vectors are
stored according to their respective resolutions.
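A resolution conversion by multiplication and division might look like the sketch below. It is not the table of FIG. 23, which is not reproduced here. It assumes that motion vector components are held as integer numerators over a denominator of 2, 4, or 8, and that "round-off" means rounding halves away from zero. All function names are hypothetical.

```python
def round_half_away(num, den):
    """Divide num by den (den > 0), rounding halves away from zero."""
    sign = -1 if num < 0 else 1
    return sign * ((2 * abs(num) + den) // (2 * den))

def convert_component(value, from_den, to_den):
    """Convert a numerator in 1/from_den pel units to 1/to_den pel units."""
    # Multiply by the target denominator, then divide by the source one.
    return round_half_away(value * to_den, from_den)

def convert_mv(mv, from_den, to_den):
    """Convert both components of a motion vector."""
    return tuple(convert_component(c, from_den, to_den) for c in mv)

# Coarser to finer needs only a multiplication: (3/2, -1/2) -> (6/4, -2/4).
finer = convert_mv((3, -1), 2, 4)
# Finer to coarser needs a division with rounding: (5/8, -3/8) -> (3/4, -2/4).
coarser = convert_mv((5, -3), 8, 4)
```

A round-up or round-down variant would change only `round_half_away`.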
[0155] A predicted motion vector may be obtained by referring to
the table shown in FIG. 23. In FIG. 23, the predicted motion vector
can be obtained by using a median function, and a median value can
be obtained for each component.
MVPx=median(16/4,36/4,32/4)=32/4
MVPy=median(-32/4,-28/4,-28/4)=-28/4
[0156] As a result, the predicted motion vector has a value of
(32/4, -28/4). Then, a differential motion vector is obtained by
using the obtained predicted motion vector. The differential motion
vector can be obtained by using the difference between the motion
vector and the predicted motion vector as noted from Equation 3
below.
MVD(-16/4,48/4)=MV(16/4,20/4)-MVP(32/4,-28/4) Equation 3
[0157] Therefore, the differential motion vector has a value of
(-16/4, 48/4), which is equal to (-4, 12).
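The arithmetic of paragraphs [0155] to [0157] can be checked with a short sketch. Components are written as integer numerators over 4. The pairing of the listed x and y values into neighbor vectors is an assumption, but it does not affect a component-wise median. The function names are hypothetical.

```python
from statistics import median

def predict_mv(neighbors):
    """Component-wise median of the neighboring motion vectors."""
    xs, ys = zip(*neighbors)
    return (median(xs), median(ys))

def differential_mv(mv, pmv):
    """Differential motion vector: actual vector minus predicted vector."""
    return (mv[0] - pmv[0], mv[1] - pmv[1])

# Converted neighbor values as listed in the text, in 1/4 pel units.
neighbors_q = [(16, -32), (36, -28), (32, -28)]
current_q = (16, 20)

pmv_q = predict_mv(neighbors_q)               # numerators of (32/4, -28/4)
mvd_q = differential_mv(current_q, pmv_q)     # numerators of (-16/4, 48/4)
```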
[0158] FIG. 25 illustrates a code number table of a differential
motion vector according to the motion vector resolutions.
[0159] The differential vector encoder 950 may use the code number
table of differential motion vectors according to the motion vector
resolutions as shown in FIG. 25 in encoding the differential motion
vectors with respect to motion vector values of respective
resolutions.
[0160] Further, the predicted motion vector may be obtained as
follows by using the example shown in FIG. 22. In this event,
instead of converting the surrounding motion vectors according
to the current resolution, it is possible to first obtain medians
of individual components of each surrounding motion vector.
MVPx=median(4/2,36/4,136/8)=36/4
MVPy=median(-8/2,-28/4,-104/8)=-104/8
[0161] As a result, the predicted motion vector has a value of
(36/4, -104/8). The differential motion vector is obtained using
the predicted motion vector obtained in the way as described
above. The differential motion vector can be obtained using the
difference between the motion vector and the predicted motion
vectors as noted from Equation 4 below.
MVD(-20/4,72/4)=MV(16/4,20/4)-MVP(36/4,-104/8) Equation 4
[0162] As a result, the differential motion vector has a value of
(-20/4, 72/4), which is equal to (-5, 18).
[0163] The differential vector encoder 950 may use the code number
table of differential motion vectors according to the motion vector
resolutions as shown in FIG. 25 in encoding the differential motion
vectors with respect to motion vector values for each of the
resolutions.
[0164] Further, the predicted motion vectors may be obtained as
follows by using the example shown in FIG. 22. In this event,
converting the surrounding motion vectors according to the current
resolution may be performed only after medians of individual
components of each surrounding motion vector are obtained.
MVPx=median(4/2,36/4,136/8)=36/4
MVPy=median(-8/2,-28/4,-104/8)=-104/8
[0165] As a result, the predicted motion vector has a value of
(36/4, -52/4) with reference to FIG. 23. The differential motion
vector is obtained using the predicted motion vector obtained in
the way as described above. The differential motion vector can
be obtained using the difference between the motion vector and the
predicted motion vectors as noted from Equation 5 below.
MVD(-20/4,72/4)=MV(16/4,20/4)-MVP(36/4,-52/4) Equation 5
[0166] As a result, the differential motion vector has a value of
(-20/4, 72/4) which is equal to (-5, 18).
[0167] The differential vector encoder 950 may use the code number
table of differential motion vectors according to the motion vector
resolutions as shown in FIG. 25 in encoding the differential motion
vectors with respect to motion vector values for each of the
resolutions.
[0168] Further, the predicted motion vectors may be obtained as
follows by using the example shown in FIG. 22. The median can be
obtained using only surrounding motion vector or vectors having the
same resolution as that of the current motion vector. In FIG. 22,
since only block B corresponds to the surrounding motion vector
having the same resolution as that of the current motion vector,
the predicted motion vector has a value of (36/4, -28/4). The
differential motion vector is obtained using the predicted motion
vector obtained in the way as described above. The differential
motion vector can be obtained using the difference between the
motion vector and the predicted motion vectors as noted from
Equation 6 below.
MVD(-20/4,48/4)=MV(16/4,20/4)-MVP(36/4,-28/4) Equation 6
[0169] As a result, the differential motion vector has a value of
(-20/4, 48/4) which is equal to (-5, 12).
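The variant that uses only same-resolution neighbors can be sketched as follows, using the block values of FIG. 22 as quoted in the text. Two points are assumptions made for illustration: the zero-vector fallback when no neighbor matches, and the use of `median_low` to avoid averaging when an even number of neighbors match. The function name is hypothetical.

```python
from statistics import median_low

def predict_from_same_resolution(neighbors, cur_den):
    """Median over neighbors whose resolution denominator equals cur_den.

    neighbors is a list of (denominator, (x_num, y_num)) entries.
    """
    same = [mv for den, mv in neighbors if den == cur_den]
    if not same:
        return (0, 0)  # assumed fallback, not specified in the text
    xs, ys = zip(*same)
    return (median_low(xs), median_low(ys))

# Blocks A, B, and C of FIG. 22 as (denominator, numerators).
neighbors = [(2, (4, -8)), (4, (36, -28)), (8, (136, -104))]
current = (16, 20)                       # current block, 1/4 resolution

pmv = predict_from_same_resolution(neighbors, 4)
mvd = (current[0] - pmv[0], current[1] - pmv[1])
```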
[0170] The differential vector encoder 950 may use the code number
table of differential motion vectors according to the motion vector
resolutions as shown in FIG. 25 in encoding the differential motion
vectors with respect to motion vector values for each of the
resolutions.
[0171] Further, if the surrounding motion vectors have been stored
based on a resolution of 1/8, the predicted motion vector may be
obtained in the way as described below by using the example shown
in FIG. 22. Referring to FIGS. 23 and 24, the predicted motion
vector has a value of (32/4, -28/4). The differential motion vector
is obtained using the predicted motion vector obtained in the way
as described above. The differential motion vector can be obtained
by using the difference between the motion vector and the predicted
motion vectors as noted from Equation 3. As a result, the
differential motion vector has a value of (-16/4, 48/4) which is
equal to (-4, 12).
[0172] Meanwhile, the resolution encoder 940 may encode the kinds
of resolutions and the resolution change flag (the resolution
appointment flag in the second aspect) into the header. In this
event, the resolution encoder 940 may encode the resolution
identification flag, which has been determined as the optimum flag,
to 1/4, and the differential vector encoder 950 may encode the
differential motion vector obtained by using a predicted motion
vector calculated by using the surrounding motion vectors converted
according to the resolution determined by the resolution determiner
930.
[0173] FIG. 26 illustrates optimum motion vectors of a current
block and surrounding blocks in order to describe a process of
determining a resolution of a differential motion vector by the
resolution determiner 930.
[0174] As noted from FIG. 26, if the motion vector resolution of
the current block and surrounding blocks is 1/8, the predicted
motion vector may be calculated by Equation 7 below.
PMVx=median(7/8,1/8,2/8)=2/8
PMVy=median(-6/8,1/8,-2/8)=-2/8 Equation 7
[0175] As a result, PMV=(2/8, -2/8)=(1/4, -1/4). The differential
motion vector can be obtained by Equation 8 below.
MVD(-1/8,-2/8)=MV(1/8,-4/8)-PMV(1/4,-1/4) Equation 8
[0176] Therefore, the differential motion vector identification
flag MVDx may be encoded to 1/8 and the differential motion vector
identification flag MVDy may be encoded to 1/4.
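The per-component choice of differential motion vector resolution can be illustrated as below. The sketch assumes that the resolution of a component is the coarsest of 1/2, 1/4, and 1/8 that represents the value exactly, so that an integer or zero value falls back to 1/2. That rule and the function name are assumptions for illustration only.

```python
from fractions import Fraction

ALLOWED_DENOMINATORS = (2, 4, 8)  # resolutions 1/2, 1/4, 1/8, coarsest first

def component_resolution(value):
    """Coarsest allowed resolution that represents value exactly."""
    for den in ALLOWED_DENOMINATORS:
        if (value * den).denominator == 1:
            return Fraction(1, den)
    raise ValueError("value is finer than 1/8 pel")

# Values of Equations 7 and 8.
mv = (Fraction(1, 8), Fraction(-4, 8))
pmv = (Fraction(2, 8), Fraction(-2, 8))
mvd = (mv[0] - pmv[0], mv[1] - pmv[1])             # (-1/8, -2/8)
resolutions = tuple(component_resolution(c) for c in mvd)
```

Here the x component needs 1/8 precision while the y component, -2/8, is already expressible at 1/4, which matches the flags stated above.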
[0177] FIG. 27 illustrates a code number table of differential
motion vectors according to the differential motion vector
resolutions.
[0178] As noted from FIG. 27, the code number of the differential
motion vector is (1, 1) according to the code number table of the
differential motion vector. Therefore, the resolution encoder 940
may encode x and y components of the differential motion vector
resolution identification flag to (1/8, 1/4), encode the code
number of the differential motion vector to (1, 1), and separately
encode signs of the x and y components of the differential motion
vector.
[0179] Meanwhile, when the differential vector encoder 950 encodes
the differential motion vector, it determines a reference
resolution or converts a motion vector having a resolution, other
than a reference resolution to one with the reference resolution,
and calculates a differential motion vector by using a reference
predicted motion vector obtained from a reference motion vector of
surrounding blocks. If a motion vector has a resolution other than
the reference resolution, there is a method of additionally
encoding a reference resolution flag. The reference resolution flag
may include data indicating whether the motion vector has the same
resolution as the reference resolution and data indicating a
location of the actual motion vector.
[0180] The reference resolution may be defined in a header, such as
a picture parameter set, a sequence parameter set, or a slice
header.
[0181] FIG. 28 illustrates a motion vector of the current block X
and a reference motion vector of surrounding blocks.
[0182] When the resolution change flag (the resolution appointment
flag in the second aspect) indicates multiple resolutions, the
kinds of the resolutions include 1/2, 1/4, and 1/8, the reference
resolution is 1/4, and the optimum resolution has been
determined as shown in FIG. 28, the current motion vector (4/8,
5/8) is converted by using the reference resolution, 1/4, to a
reference motion vector by Equation 9 below.
Ref_MVx=2/4
Ref_MVy=3/4 Equation 9
[0183] If the resolution of the current motion vector is different
from the reference resolution, it may be converted by using a
multiplication and a division. In the case of using the division,
rounding may be used including a round-off, a round-up, and a
round-down. The current aspect uses a round-off. Therefore, the
reference motion vector has a value of (2/4, 3/4), and the location
of the actual motion vector having a resolution other than the
reference resolution can be expressed using the reference
resolution flag. In this event, the difference between the motion
vector of the current block and the reference motion vector is (0,
1/8), and the value of the reference resolution flag may have, for
example, location information, such as (0, 1). In the example of
the location information, (0, 1), "0" indicates that the reference
motion vector is equal to the motion vector and "1" indicates a
motion vector that is smaller by 1/8 than the reference motion
vector.
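The conversion to the reference resolution and the derivation of the location information can be sketched together. The sketch makes three assumptions: "round-off" rounds halves away from zero, all denominators are powers of two, and the location value of each component is the reference motion vector minus the actual motion vector, counted in units of the finer of the two resolutions. The last assumption reproduces the (0, 1) example, but the actual flag syntax is not specified here. All names are hypothetical.

```python
def round_half_away(num, den):
    """Divide num by den (den > 0), rounding halves away from zero."""
    sign = -1 if num < 0 else 1
    return sign * ((2 * abs(num) + den) // (2 * den))

def to_reference(mv, mv_den, ref_den):
    """Return (reference_mv, location) for a motion vector.

    mv holds numerators over mv_den; reference_mv holds numerators
    over ref_den; location is counted in 1/max(mv_den, ref_den) units.
    """
    ref = tuple(round_half_away(c * ref_den, mv_den) for c in mv)
    common = max(mv_den, ref_den)
    location = tuple(
        r * (common // ref_den) - c * (common // mv_den)
        for r, c in zip(ref, mv)
    )
    return ref, location

# Current motion vector (4/8, 5/8) with a reference resolution of 1/4.
ref_mv, location = to_reference((4, 5), 8, 4)
```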
[0184] In the meantime, the differential vector of the reference
motion vector is calculated using a predicted reference motion
vector, which corresponds to a median value of the reference motion
vector of the surrounding blocks. The predicted reference motion
vector may be obtained by Equation 10.
Ref_PMVx=median(9/4,1,2/4)=1
Ref_PMVy=median(-7/4,-1,-1)=-1 Equation 10
[0185] Therefore, the predicted reference motion vector Ref_PMV has
a value of (1, -1). Then, by applying (Ref_MV(2/4, 3/4)-Ref_PMV(1,
-1)), the differential reference motion vector Ref_MVD has a value
of (-2/4, 7/4). Therefore, the encoder encodes the Ref_MVD (-2/4,
7/4) and encodes the reference resolution flag (0, 1).
[0186] FIG. 29 illustrates a code number table of a differential
reference motion vector according to the differential reference
motion vector resolution.
[0187] Referring to FIG. 29, the code number is 2 when the
reference resolution is 1/4 and the value of the differential
reference motion vector is 2/4, and the code number is 3 when the
reference resolution is 1/4 and the value of the differential
reference motion vector is 3/4, and the code number of each
component of the differential reference motion vector is included
in the reference resolution flag.
[0188] The resolution encoder 940 can encode in various ways the
motion vector resolution and/or differential motion vector
resolution determined according to each motion vector or area. The
following description with reference to FIGS. 10 to 14 discusses
various examples of the encoding of the motion vector resolution or
differential motion vector resolution. Although the following
description deals with only the examples of the encoding of the
motion vector resolution, the differential motion vector resolution
can also be encoded in the same way as that for the motion vector
resolution, which is omitted in the description.
[0189] The resolution encoder 940 may integrate the motion vector
resolutions and/or differential motion vector resolutions of
adjacent areas having the same motion vector resolution with each
other, and then generate a resolution identification flag for each
integrated area. For example, the resolution encoder 940 may
hierarchically generate the resolution identification flags with a
Quadtree structure. In this event, the resolution encoder 940
may encode an identifier, which represents the maximum number of
the Quadtree layers and the size of the area indicated by the
lowest node of the Quadtree layers, and then include the encoded
identifier in a header of a corresponding area of a bitstream.
[0190] FIGS. 10A to 10C illustrate an example of motion vector
resolutions hierarchically expressed by a Quadtree structure
according to an aspect of the present disclosure.
[0191] FIG. 10A illustrates areas having various motion vector
resolutions within one picture. In FIG. 10A, each area may be a
macroblock having a size of 16.times.16 and the number in each area
indicates a motion vector resolution of the area. FIG. 10B
illustrates grouping of the areas shown in FIG. 10A into grouped
areas, each of which includes areas having the same motion vector
resolution. FIG. 10C hierarchically illustrates the motion vector
resolutions of the grouped areas shown in FIG. 10B in a Quadtree
structure. As noted from FIG. 10C, the area indicated by the lowest
node corresponds to a macroblock having a size of 16.times.16 and
the maximum number of the layers of the Quadtree structure is 4.
Therefore, this information is encoded and is included in a header
for the corresponding area.
[0192] FIG. 11 illustrates a hierarchically expressed result of
encoded motion vector resolutions in a Quadtree structure according
to an aspect of the present disclosure.
[0193] The final bits as shown in FIG. 11 can be obtained by
encoding the motion vector resolutions in the Quadtree structure
shown in FIG. 10C. One encoded bit may indicate whether a node has
been divided. For example, a bit value of "1" may indicate that a
corresponding node has been divided into lower nodes and a bit
value of "0" may indicate that a corresponding node has not been
divided into the lower layers.
[0194] In FIG. 10C, since the node of level 0 has been divided into
lower layers, it is encoded to a bit value of "1". Since the first
node of divided level 1 has a resolution of 1/2 and has not been
divided any more, it is encoded to a bit value of "0" while the
motion vector resolution of 1/2 is encoded. Since the second node
of level 1 has been divided into lower layers, it is encoded to a
bit value of "1". Since the third node of level 1 has not been
divided into lower layers, it is encoded to a bit value of "0"
while the motion vector resolution 1/4 is encoded. Since the final
fourth node of level 1 has been divided into lower layers, it is
encoded to a bit value of "1". Nodes of level 2 are encoded in the
same way. In level 3, only the motion vector resolutions are
encoded, because the maximum number of layers has been determined
as 3 in the header, which tells that there are no more layers lower
than level 3. The final bits generated by hierarchically encoding
the various motion vector resolutions of the areas shown in FIG.
10A in a Quadtree structure may have the structure as shown in FIG.
11.
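The split-bit scheme described above can be sketched with a small recursive routine. The tree below is a made-up two-level example and not the tree of FIG. 10C. The depth-first order of the output is an assumption, since the exact ordering of the final bits in FIG. 11 is not reproduced here. Resolutions are kept as symbolic strings rather than bit patterns, and the function name is hypothetical.

```python
def encode_quadtree(node, max_level, level=0, out=None):
    """Append split bits and leaf resolutions for node to out.

    A node is either a resolution string (undivided) or a list of
    four child nodes (divided).
    """
    if out is None:
        out = []
    if level == max_level:
        # Lowest layer: no split bit is needed, only the resolution.
        out.append(node)
    elif isinstance(node, list):
        out.append("1")                  # node is divided into lower nodes
        for child in node:
            encode_quadtree(child, max_level, level + 1, out)
    else:
        out.append("0")                  # node is not divided
        out.append(node)                 # its resolution follows
    return out

# Hypothetical tree: the second node of level 1 is divided once more.
tree = ["1/2", ["1/4", "1/4", "1/8", "1/8"], "1/4", "1/2"]
symbols = encode_quadtree(tree, max_level=2)
```

Because the maximum number of layers is signalled in the header, the nodes at the lowest level carry no split bit, which is the saving the paragraph points out.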
[0195] The motion vector resolutions of 1/2, 1/4, and 1/8
identified in the final bits imply the encoding result of using
their representative bits, although the bit values are not
represented for the convenience of description. The motion vector
resolutions may be expressed by bit values in various ways
according to the implementation methods. For example, if there are
two types of available motion vector resolutions, they can be
indicated by a 1-bit flag. Further, if there are four or fewer types
of available motion vector resolutions, they can be indicated by a
2-bit flag.
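The relationship between the number of available resolutions and the flag width can be written compactly. This sketch assumes a plain fixed-length binary code; the function names and the index assignment are hypothetical.

```python
def flag_bits(num_resolutions):
    """Number of bits of a fixed-length flag for num_resolutions choices."""
    return max(1, (num_resolutions - 1).bit_length())

def resolution_code(index, num_resolutions):
    """Fixed-length binary string for the resolution with the given index."""
    width = flag_bits(num_resolutions)
    return format(index, "0{}b".format(width))

# Hypothetical indexing: 0 -> 1/2, 1 -> 1/4, 2 -> 1/8.
code_for_eighth = resolution_code(2, 3)
```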
[0196] If the maximum number of layers and the size of the area
indicated by the lowest node are defined in a slice header, the
resolution identification flag generated as described above may be
included in the field of the slice data. A video decoding
apparatus, which will be described later, can extract and decode a
resolution identification flag from a bitstream, so as to
reconstruct the motion vector resolution of each area.
[0197] Further, the aspect shown in FIGS. 10A to 10C discusses only
two alternative cases in which a node is either divided into lower
layers (i.e. four areas) or undivided, although there may be
various divisions as shown in FIG. 20, including the nondivision of
the node, its divisions into two transversely lengthy areas, two
longitudinally lengthy areas, or four areas.
[0198] Further, the resolution encoder 940 may generate a
resolution identification flag by encoding the motion vector
resolution of each area or motion vector by using a predicted
motion vector resolution predicted by motion vector resolutions of
surrounding areas of that area. For example, based on an assumption
that an area corresponds to a block having a size of 64.times.64,
the motion vector resolution of the area may be predicted by using
motion vector resolutions of areas at the left side and upper side
of the area. When the predicted motion vector resolution of an
area is identical to the motion vector resolution of the area, the
resolution encoder 940 may encode a resolution identification flag
of the area to a bit value of "1". Otherwise, when the predicted
motion vector resolution of an area is not identical to the motion
vector resolution of the area, the resolution encoder 940 may
encode a resolution identification flag of the area to a bit value
of "0" and a bit value indicating a motion vector resolution of the
area. For example, if each of the resolutions of the upper area and
the left area of an area is 1/2 and the resolution of the area is
also 1/2, the resolution encoder 940 may encode the resolution
identification flag of the area to a bit value of "1" and does not
encode the motion vector resolution of the area. If each of the
resolutions of the upper area and the left area of an area is 1/2
and the resolution of the area is 1/4, the resolution encoder 940
may encode the resolution identification flag of the area to a bit
value of "0" and may additionally encode the motion vector
resolution of the area.
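The predictive encoding of the resolution identification flag can be sketched as follows. The text gives examples only for the case where the left and upper areas agree. Taking the coarser of the two when they differ is therefore an assumption for illustration, as are the 2-bit resolution codes and the function names. Resolutions are represented by their denominators.

```python
# Hypothetical 2-bit codes, keyed by resolution denominator.
RES_CODES = {2: "00", 4: "01", 8: "10"}

def predict_resolution(left, upper):
    """Predicted resolution denominator from the left and upper areas."""
    # Assumption: when the neighbors disagree, use the coarser one.
    return min(left, upper)

def encode_resolution_flag(current, left, upper):
    """Return '1' on a correct prediction, else '0' plus the resolution code."""
    if current == predict_resolution(left, upper):
        return "1"
    return "0" + RES_CODES[current]

hit = encode_resolution_flag(2, left=2, upper=2)    # resolution 1/2 predicted
miss = encode_resolution_flag(4, left=2, upper=2)   # resolution 1/4 not predicted
```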
[0199] Further, the resolution encoder 940 may generate a
resolution identification flag by encoding the motion vector
resolution of each area or motion vector by using the run and
length of the motion vector resolution of each area or motion
vector.
[0200] FIG. 12 illustrates motion vector resolutions of areas
determined according to an aspect of the present disclosure.
[0201] In FIG. 12, areas within one picture correspond to
macroblocks each having a size of 16.times.16 and a motion vector
resolution of each area is expressed in each area. Hereinafter, an
example of encoding the motion vector resolutions of the areas
shown in FIG. 12 by using the runs and lengths thereof will be
described. When the motion vector resolutions of the areas shown in
FIG. 12 are ordered in a raster scan direction, the motion vector
resolution of 1/2 occurs four times in a row, the motion vector
resolution of 1/4 once, the motion vector resolution of 1/8 twice
in a row, and the motion vector resolution of 1/2 four times in
a row (the motion vector resolutions thereafter are omitted). As a
result, using the runs and lengths, those motion vector resolutions
can be expressed as (1/2, 4), (1/4, 1), (1/8, 2), (1/2, 4), . . . .
Therefore, the resolution encoder 940 can generate a resolution
identification flag by encoding the motion vector resolution of
each area expressed using the run and length and expressing the
resolution by a bit value.
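The run and length representation of this example can be reproduced with a few lines. The sketch keeps resolutions as strings and leaves out the final mapping to bit values. The function name is hypothetical.

```python
from itertools import groupby

def run_length_pairs(resolutions):
    """Collapse a raster-scan sequence into (resolution, length) pairs."""
    return [(res, len(list(group))) for res, group in groupby(resolutions)]

# First eleven areas of the example, in raster scan order.
scan = ["1/2"] * 4 + ["1/4"] + ["1/8"] * 2 + ["1/2"] * 4
pairs = run_length_pairs(scan)
```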
[0202] Further, the resolution encoder 940 may generate a
resolution identification flag by hierarchically encoding the
motion vector resolutions of each area or motion vector by using a
tag tree. In this event, the resolution encoder 940 may include an
identifier which indicates the maximum number of the tag tree
layers and the size of the area indicated by the lowest node, in a
header.
[0203] FIG. 13 illustrates an example of motion vector resolution
hierarchically expressed in a tag tree structure according to an
aspect of the present disclosure.
[0204] In particular, FIG. 13 shows the hierarchical tag tree
structure of the motion vector resolution respectively determined
for the individual areas within a section of an image. It is
assumed that each of the areas corresponds to a macroblock having a
size of 16.times.16.
[0205] In FIG. 13, since the minimum value is 1/2 among the motion
vector resolutions of the first four areas of level 3, the motion
vector resolution of the first area is 1/2. The areas are
hierarchically grouped in this way as many times as the number of
layers, and coded bits are then generated from each upper layer to
its lower layer to complete the encoding stage thereof.
[0206] FIG. 14 illustrates a result of the encoding of the motion
vector resolutions hierarchically expressed in a tag tree structure
according to an aspect of the present disclosure.
[0207] In a method of generating a coded bit of each area,
subtracted values between the motion vector resolution number
designations in current layers and their higher layers from the
root to end nodes of the tree are expressed by a series of "0"
finished with the last bit value of "1". In this event, in the case
of the highest layer, based on an assumption that a motion vector
resolution of its higher layer is designated "0", a motion vector
resolution of 1/2 is "1", a motion vector resolution of 1/4 is "2",
and a motion vector resolution of 1/8 is "3", a resolution
identification flag may be generated as shown in FIG. 14 by
hierarchically encoding the motion vector resolutions of the areas
as shown in FIG. 13 with the tag tree structure. In this event, the
number assigned to each motion vector resolution may be
changed.
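The coding rule of this paragraph, a run of "0" terminated by "1" for each difference between a layer and its higher layer, can be sketched as below. The path values are hypothetical and are not read from FIG. 13 or FIG. 14. The sketch also assumes that each upper node holds the minimum of its lower nodes, so that every difference is non-negative. Function names are hypothetical.

```python
# Number designations stated in the text: 1/2 -> 1, 1/4 -> 2, 1/8 -> 3.
RES_NUMBER = {"1/2": 1, "1/4": 2, "1/8": 3}

def difference_bits(diff):
    """Encode a non-negative difference as diff zeros followed by a one."""
    if diff < 0:
        raise ValueError("a lower layer cannot be below its higher layer")
    return "0" * diff + "1"

def encode_path(numbers, parent_number=0):
    """Encode the numbers along a root-to-leaf path, layer by layer."""
    bits = []
    for number in numbers:
        bits.append(difference_bits(number - parent_number))
        parent_number = number
    return "".join(bits)

# Hypothetical path of three layers that all hold the 1/2 resolution.
path = [RES_NUMBER["1/2"]] * 3
bits_full = encode_path(path)
# A layer whose higher layer was already coded: only its own difference.
bits_tail = encode_path([RES_NUMBER["1/4"]], parent_number=RES_NUMBER["1/2"])
```

Layers already covered by an earlier area are skipped by passing the number of the last coded layer as `parent_number`, so later areas that share upper nodes need fewer bits.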
[0208] In FIG. 14, pairs of numbers (0,0), (0,1), etc. expressed in
the respective areas correspond to reference numbers identifying
the areas, and numerals "0111", "01", etc. correspond to bit values
of resolution identification flags obtained by encoding the motion
vector resolutions of the areas.
[0209] In the case of the resolution identification flag of the
area identified by (0,0), the higher layer of Level 0 is assumed to
have a motion vector resolution numbered "0", and Level 0 has the
motion vector resolution of 1/2, numbered "1", so that the
subtracted value between the Level 0 number and the number of its
higher layer is "1", which is converted to a coded bit of "01".
Again, Level 1 has a difference
from its higher layer (Level 0) in their motion vector resolution
numbers by subtracted value "0" which turns to a coded bit of "1".
Yet again, Level 2 has a difference from its upper layer (Level 1)
in their motion vector resolution numbers by subtracted value "0"
which turns to a coded bit of "1". Furthermore, in Level 3, since
the difference between the numbers of the motion vector resolutions
of Level 3 and the higher layer (Level 2) is "0", an encoded bit of
"1" is obtained. As a result, "0111" is finally obtained as encoded
bits of the motion vector resolution of the area identified by
(0,0).
[0210] In the case of the resolution identification flag of the
area identified by (0,1), Level 0, Level 1, and Level 2 are already
reflected in the resolution identification flag identified by
(0,0). Therefore, only in Level 3, "1", which is the difference
between the numbers of the motion vector resolutions of Level 3 and
the higher layer (Level 2), is encoded, so as to obtain an encoded
bit of "01". As a result, only "01" is finally obtained as a
resolution identification flag of the area identified by
(0,1).
[0211] In the case of the resolution identification flag of the
area identified by (0,4), Level 0 is already reflected in the
resolution identification flag identified by (0,0). Therefore, only
Level 1, Level 2, and Level 3 are subjected to an encoding in the
way described above, so that "0111" is finally obtained as the
encoded bits.
[0212] Further, the resolution encoder 940 may generate a
resolution identification flag by changing and encoding the number
of bits allocated to the motion vector resolution according to the
frequency of the motion vector resolution determined for each
motion vector or area. To this end, the resolution encoder 940 may
change and encode the number of bits allocated to the motion vector
resolution of a corresponding area according to the occurrence
frequency of the motion vector resolution up to the just previous
area in the unit of area, or may change and encode the number of
bits allocated to the motion vector resolution of a corresponding
section, which includes a plurality of areas, according to the
occurrence frequency of the motion vector resolution up to the just
previous section or the occurrence frequency of the motion vector
resolution of the just previous section in the unit of sections. To
this end, the resolution encoder 940 may encode the motion vector
resolution of each area by calculating the frequency of the motion
vector resolution in the unit of areas or sections, allocating
numbers to the motion vector resolutions in a sequence causing the
smaller number to be allocated to a motion vector resolution having
the larger frequency, and allocating the smaller number of bits to
the motion vector resolutions allocated the smaller numbers.
[0213] For example, in the case where the resolution encoder 940
changes the bit numbers according to the occurrence frequency of
the motion vector resolution up to the previous area in the unit of
areas, if the motion vector resolution of 1/2 has occurred 10
times, the motion vector resolution of 1/4 has occurred 15 times,
and the motion vector resolution of 1/8 has occurred 8 times in all
areas up to the previous area, the resolution encoder 940 allocates
the smallest number (e.g. No. 1) to the motion vector resolution of
1/4, the next smallest number (e.g. No. 2) to the motion vector
resolution of 1/2, and the largest number (e.g. No. 3) to the
motion vector resolution of 1/8, and allocates a smaller number of
bits to the motion vector resolutions in a sequence from the
smaller number to the larger number. Then, if the motion vector
resolution of the area, for which the motion vector resolution is
to be encoded, corresponds to the 1/4 pixel unit, the resolution
encoder 940 may allocate the smallest bits to the motion vector
resolution, so as to encode the motion vector resolution of 1/4 for
the area.
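As a hedged illustration of the example above, the following sketch ranks the resolutions by their occurrence counts and gives shorter codewords to smaller numbers. The helper names rank_by_frequency and codeword are hypothetical, and the truncated unary codewords are only an assumed choice, since the text does not fix the actual bit patterns.

```python
def rank_by_frequency(counts):
    """Return {resolution: number}, with number 1 for the most frequent one."""
    ordered = sorted(counts, key=lambda res: -counts[res])
    return {res: rank for rank, res in enumerate(ordered, start=1)}


def codeword(number, alphabet_size):
    """Truncated unary codeword (assumed): smaller numbers get fewer bits."""
    if number < alphabet_size:
        return "0" * (number - 1) + "1"
    return "0" * (alphabet_size - 1)  # the last symbol needs no closing "1"


# Occurrence counts up to the previous area, taken from the example above.
counts = {"1/2": 10, "1/4": 15, "1/8": 8}
numbers = rank_by_frequency(counts)
codes = {res: codeword(n, len(counts)) for res, n in numbers.items()}

print(numbers)  # {'1/4': 1, '1/2': 2, '1/8': 3}
print(codes)    # {'1/4': '1', '1/2': '01', '1/8': '00'}
```

A decoder that keeps the same counts can rebuild the same table before parsing each area, so no table needs to be transmitted under this assumption.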
[0214] Further, in the case where the resolution encoder 940
changes and encodes the bit numbers according to the frequency of
occurrences of the motion vector resolution up to the previous area
group in the unit of area groups, the resolution encoder 940 may
encode the motion vector resolution of each area of the area group,
for which the motion vector resolution is to be encoded, by
updating the occurrence frequency of the motion vector resolution
of each area up to the previous area group, allocating numbers to
the motion vector resolutions in a sequence causing the smaller
number to be allocated to a motion vector resolution having the
larger frequency, and allocating the smaller number of bits to the
motion vector resolutions allocated the smaller numbers. The area
group may be a Quadtree, a Quadtree bundle, a tag tree, a tag tree
bundle, a macroblock, a macroblock bundle, or an area in a
predetermined size. For example, when the area group is appointed
as including two macroblocks, it is possible to update the
frequency of occurrence of the motion vector resolution for every
two macroblocks and allocate a bit number of the motion vector
resolution to the updated frequency. Otherwise, when the area group
is appointed as including four Quadtrees, it is possible to update
the frequency probability of the motion vector resolution for every
four Quadtrees and allocate a bit number of the motion vector
resolution to the updated frequency.
[0215] Further, the resolution encoder 940 may use different
methods for encoding a resolution identification flag according to
the distribution of the motion vector resolutions of surrounding
areas of each area with respect to the motion vector resolution
determined according to each area or motion vector. That is, the
smallest bit number is allocated to a resolution having the highest
probability that the resolution may be the resolution of a
corresponding area according to the distribution of the motion
vector resolutions of surrounding areas or area groups. For
example, if a left side area of an area has a motion vector
resolution of 1/2 and an upper side area of the area has a motion
vector resolution of 1/2, it is most probable that the area has a
motion vector resolution of 1/2, and the smallest bit number is
thus allocated to the motion vector resolution of 1/2, which is
then encoded. As another example, if a left side area of an area
has a motion vector resolution of 1/4, a left upper side area of
the area has a motion vector resolution of 1/2, an upper side area
of the area has a motion vector resolution of 1/2, and a right
upper side area of the area has a motion vector resolution of 1/2,
the bit numbers are allocated to the motion vector resolutions in a
sequence causing the smaller bit number to be allocated to a motion
vector resolution having the higher probability, such as in a
sequence of 1/2, 1/4, 1/8, . . . , and the motion vector
resolutions are then encoded.
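The ordering in the two examples above can be sketched as follows. This is only an illustration: it assumes that the probability of a resolution is estimated by counting how many surrounding areas use it, and that ties are broken by a fixed default order (the hypothetical DEFAULT_ORDER); neither detail is specified in the text.

```python
from collections import Counter

# Assumed tie-break order for resolutions with equal support.
DEFAULT_ORDER = ["1/2", "1/4", "1/8"]


def order_by_neighbors(neighbor_resolutions):
    """Return the resolutions from most to least probable.

    The first entry would receive the smallest bit number.
    """
    votes = Counter(neighbor_resolutions)  # missing resolutions count as 0
    return sorted(DEFAULT_ORDER,
                  key=lambda res: (-votes[res], DEFAULT_ORDER.index(res)))


# Left 1/4, upper-left 1/2, upper 1/2, upper-right 1/2.
print(order_by_neighbors(["1/4", "1/2", "1/2", "1/2"]))  # ['1/2', '1/4', '1/8']
# Left 1/2 and upper 1/2 only.
print(order_by_neighbors(["1/2", "1/2"]))                # ['1/2', '1/4', '1/8']
```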
[0216] Further, in performing the entropy encoding by an arithmetic
encoding, the resolution encoder 940 uses different methods of
generating a bit string of a resolution identification flag
according to the distribution of the motion vector resolutions of
the surrounding areas of each area for the motion vector resolution
determined according to each motion vector or area and applies
different context models according to the distribution of the
motion vector resolutions of the surrounding areas and the
probabilities of the motion vector resolution having occurred up to
the present for the arithmetic encoding and probability update.
[0217] Referring to FIG. 21 as an example, based on an assumption
that an entropy encoding is performed using only three motion
vector resolutions including 1/2, 1/4, and 1/8 by the CABAC, if a
left side area of a pertinent area has a motion vector resolution
of 1/2 and an upper side area of the area has a motion vector
resolution of 1/2, the shortest bit string is allocated to the
motion vector resolution of 1/2 and the other bit strings are
allocated to the motion vector resolutions in a sequence causing
the smaller bit number to be allocated to a motion vector
resolution having the higher probability. Specifically, if the
motion vector resolution of 1/8 has the higher occurrence
probability than that of the motion vector resolution of 1/4,
the bitstream of "00" is allocated to the motion vector resolution
of 1/8 and the bitstream of "01" is allocated to the motion vector
resolution of 1/4 for the arithmetic encoding.
[0218] Further, in encoding the first bit string, four different
context models may be used, which include: a first context model in
which the resolution of the left side area is equal to the
resolution of the upper side area, which is equal to the resolution
of the highest probability up to the present; a second context
model in which the resolution of the left side area is equal to the
resolution of the upper side area, which is different from the
resolution of the highest probability up to the present; a third
context model in which the resolutions of the left side area and
the upper side area are different from each other and at least one
of the resolutions of the left side area and the upper side area is
equal to the resolution of the highest probability up to the
present; and a fourth context model in which the resolutions of the
left side area and the upper side area are different from each
other and neither of them is equal to the resolution of the highest
probability up to the present. In encoding the second bit string,
two different context models may be used, which include: a first
context model in which the resolutions of the left side area and
the upper side area are different from each other and at least one
of the resolutions of the left side area and the upper side area is
equal to the resolution of the highest probability up to the
present; and a second context model in which the resolutions of the
left side area and the upper side area are different from each
other and neither of them is equal to the resolution of the highest
probability up to the present.
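The selection among the four context models for the first bit string can be sketched as below. The function name first_bin_context and the indices 0 to 3 (standing for the first to fourth context models) are hypothetical labels introduced only for this illustration.

```python
def first_bin_context(left, up, most_probable):
    """Pick the context model index for the first bit string.

    left, up: motion vector resolutions of the left and upper side areas.
    most_probable: resolution of the highest probability up to the present.
    """
    if left == up:
        # First model if the shared resolution is the most probable one,
        # second model otherwise.
        return 0 if left == most_probable else 1
    # The neighbors differ: third model if either one matches the most
    # probable resolution, fourth model if neither does.
    return 2 if most_probable in (left, up) else 3


print(first_bin_context("1/2", "1/2", "1/2"))  # 0
print(first_bin_context("1/2", "1/2", "1/4"))  # 1
print(first_bin_context("1/2", "1/4", "1/4"))  # 2
print(first_bin_context("1/2", "1/8", "1/4"))  # 3
```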
[0219] As another example, based on an assumption that an entropy
encoding is performed using only three motion vector resolutions
including 1/2, 1/4, and 1/8 by the CABAC and the highest motion
vector resolution up to the present is 1/4, "1", which is the
shortest bitstream, is allocated to the motion vector resolution of
1/4 and "00" and "01" are then allocated to the other motion vector
resolutions of 1/2 and 1/8, respectively. Further, in encoding the
first bit string, three different context models may be used, which
include: a first context model in which each of the resolutions of
the left side area and the upper side area of a corresponding area
is equal to the resolution of the highest probability up to the
present; a second context model in which only one of the
resolutions of the left side area and the upper side area of a
corresponding area is equal to the resolution of the highest
probability up to the present; and a third context model in which
neither of the resolutions of the left side area and the upper side
area of a corresponding area is equal to the resolution of the
highest probability up to the present. In encoding the second
bit string, six different context models may be used, which
include: a first context model in which each of the resolution of
the left side area and the resolution of the upper side area of a
corresponding area corresponds to a motion vector resolution of
1/8; a second context model in which each of the resolutions of the
left side area and the upper side area of a corresponding area
corresponds to a motion vector resolution of 1/2; a third context
model in which each of the resolutions of the left side area and
the upper side area of a corresponding area corresponds to a motion
vector resolution of 1/4; a fourth context model in which one of
the resolutions of the left side area and the upper side area of a
corresponding area corresponds to a motion vector resolution of 1/8
and the other resolution corresponds to a motion vector resolution
of 1/4; a fifth context model in which one of the resolutions of
the left side area and the upper side area of a corresponding area
corresponds to a motion vector resolution of 1/2 and the other
resolution corresponds to a motion vector resolution of 1/4; and a
sixth context model in which one of the resolutions of the left
side area and the upper side area of a corresponding area
corresponds to a motion vector resolution of 1/8 and the other
resolution corresponds to a motion vector resolution of 1/2. The
resolution of the highest probability up to the present may be
determined from the probability of the resolutions encoded up to
the previous area or from the probability in a certain area, or may
be a predetermined fixed resolution.
[0220] Further, the resolution encoder 940 may determine whether a
video decoding apparatus can estimate a motion vector resolution of
each motion vector or area according to a prearranged estimation
scheme. Then, for an area having an estimable motion vector
resolution, the resolution encoder 940 may encode a positive
identifier, which indicates that it can be estimated, so as to
generate a resolution identification flag. In contrast, for an area
having an inestimable motion vector resolution, the resolution
encoder 940 may encode a negative identifier, which indicates that
it cannot be estimated, and a motion vector resolution of a
corresponding area, so as to generate a resolution identification
flag.
[0221] That is, in order to encode a motion vector resolution of
each motion vector or area, the resolution encoder 940 calculates a
motion vector and a predicted motion vector of the area with
multiple motion vector resolutions applied, encodes a differential
motion vector between them, decodes the differential motion vector,
and decodes the motion vector for each resolution by using the
reconstruction of the decoded differential motion vector based on
an assumption that each resolution is the optimum resolution. Then,
the resolution encoder 940 determines the motion vector resolution
that has the lowest cost according to a predetermined cost function
when motions of surrounding pixels of the corresponding area are
compensated by using the motion vector reconstructed based on the
assumption that each resolution is the optimum resolution. When the motion
vector resolution determined in the way described above is equal to
a motion vector resolution of a corresponding area originally
desired to be encoded (i.e. a motion vector resolution determined
as an optimum motion vector resolution of the corresponding area,
on condition that the optimum motion vector resolution does not
imply that it always exhibits the optimum performance and simply
refers to a motion vector resolution determined as optimum under
the conditions for determining the motion vector resolution), the
resolution encoder 940 may generate an identifier (e.g. "1"),
indicating that the video decoding apparatus can estimate the
motion vector resolution of the corresponding area, as a resolution
identification flag of the corresponding area. In this event, the
motion vector resolution of the corresponding area is not encoded.
When the determined motion vector resolution is not equal to the
motion vector resolution of the corresponding area intended to be
encoded, the resolution encoder 940 may encode an identifier (e.g.
"0"), indicating that the video decoding apparatus cannot estimate
the motion vector resolution of the corresponding area, and the
original motion vector resolution of the corresponding area, so as
to generate a resolution identification flag of the corresponding
area. In this event, various distortion functions, such as Mean
Square Error (MSE) or Sum of Absolute Transformed Differences
(SATD), may be used as the predetermined cost function.
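The decision between the two identifiers can be sketched as follows, assuming the cost of each candidate resolution has already been measured on the surrounding pixels in the same way the video decoding apparatus would measure it. The function name resolution_flag, the cost values, and the codewords in resolution_code are hypothetical and serve only as an illustration.

```python
def resolution_flag(costs, optimum, resolution_code):
    """Build the resolution identification flag of one area.

    costs: {resolution: cost} for each candidate resolution.
    optimum: resolution actually chosen for the area by the encoder.
    resolution_code: {resolution: bits} used when the resolution must
        be sent explicitly (assumed codewords).
    """
    estimated = min(costs, key=costs.get)  # what the decoder would pick
    if estimated == optimum:
        return "1"                         # estimable: send nothing more
    return "0" + resolution_code[optimum]  # inestimable: send the resolution


code = {"1/2": "1", "1/4": "01", "1/8": "00"}
costs = {"1/2": 120, "1/4": 95, "1/8": 140}

print(resolution_flag(costs, "1/4", code))  # 1
print(resolution_flag(costs, "1/8", code))  # 000
```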
[0222] Further, when each component of the differential motion
vector is "0", the resolution encoder 940 may dispense with
encoding the resolution of the motion vector or area. When each
component of the differential motion vector is "0", the predicted
motion vector is used as the motion vector as it is, which makes it
unnecessary to encode the motion vector resolution.
[0223] FIG. 15 illustrates an example of a process for determining
a motion vector resolution by using surrounding pixels of an area
according to an aspect of the present disclosure.
[0224] Referring to FIG. 15, if the optimum motion vector
resolution determined as a result of the motion estimation for an
area, the motion vector resolution of which is to be encoded by the
resolution encoder 940, is a motion vector resolution of 1/2, a
motion vector is (4, 10), and a predicted motion vector is (2, 7),
the differential motion vector is (2, 3). In this event, based on
an assumption that a video decoding apparatus can decode and
reconstruct only a differential vector, the resolution encoder 940
may change the motion vector resolution into various motion vector
resolutions, predict a predicted motion vector according to each
motion vector resolution, reconstruct a motion vector according to
each motion vector resolution, and determine a motion vector
resolution having a least distortion between surrounding pixels of
a current area and surrounding pixels of an area indicated by a
motion vector according to each reconstructed motion vector
resolution.
[0225] If the motion vector resolution corresponds to the 1/4 pixel
unit and the predicted motion vector is (3, 14), the differential
motion vector reconstructed by the video decoding apparatus is (2,
3) and the motion vector of the corresponding reconstructed area is
thus (5, 17). Further, if the motion vector resolution corresponds
to a 1/2 pixel unit and the predicted motion vector is (2, 7), the
differential motion vector reconstructed by the video decoding
apparatus is (2, 3) and the motion vector of the corresponding
reconstructed area is thus (4, 10). In the same way as described
above, a motion vector of a corresponding area reconstructed by the
video decoding apparatus is also calculated in the case where the
motion vector resolution corresponds to the 1/8 pixel unit.
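The reconstruction arithmetic of this example can be checked with a short sketch. The helper name reconstruct is hypothetical, and only the two predicted motion vectors stated in the text are used; the predicted motion vector for the 1/8 pixel unit is not given and is therefore left out.

```python
def reconstruct(pmv, dmv):
    """Motion vector = predicted motion vector + differential motion vector."""
    return (pmv[0] + dmv[0], pmv[1] + dmv[1])


dmv = (2, 3)  # the only value the decoder is assumed to receive
predicted = {"1/2": (2, 7), "1/4": (3, 14)}

candidates = {res: reconstruct(pmv, dmv) for res, pmv in predicted.items()}
print(candidates)  # {'1/2': (4, 10), '1/4': (5, 17)}
```

Each candidate would then be evaluated on the surrounding pixels, and the resolution with the least distortion would be compared with the optimum resolution.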
[0226] When the motion vector resolution having a least distortion
between surrounding pixels of a corresponding area and surrounding
pixels of an area having been motion-compensated in a reference
picture by using a motion vector of a corresponding area
reconstructed according to each motion vector resolution is equal
to an optimum motion vector resolution determined in advance, the
resolution encoder 940 encodes only an identifier, indicating that
the video decoding apparatus can estimate the motion vector
resolution, so as to generate a resolution identification flag of
the corresponding area, and does not encode the motion vector
resolution of the corresponding area.
[0227] When the size of a predicted motion vector or differential
motion vector of a motion vector according to a motion vector
resolution determined for each area or motion vector is larger than
a threshold, the resolution determiner 930 may determine a
predetermined value as the motion vector resolution of each area or
motion vector. For example, when the size of a differential motion
vector or the size of a predicted motion vector of an area or a
motion vector is larger than a threshold, the resolution determiner
930 may determine a predetermined value as a motion vector
resolution of the area or the motion vector without encoding the
motion vector resolution of the area. Further, when the size of a
motion vector of a surrounding area of an area or a motion vector,
or the size of a motion vector of the area itself, is larger than
a threshold, the resolution determiner 930 may determine a
predetermined value as a motion vector resolution of the area
without encoding the motion vector resolution of the area. In this
event, the motion vector resolution of the area or motion vector
can be changed to a predetermined resolution even without a flag.
The threshold may be a pre-appointed value or any inputted values,
or may be calculated from a motion vector of a surrounding
block.
[0228] When the resolution of the current block is identifiable
with a reference picture index, the resolution determiner 930 may
encode information on the resolution by encoding the reference
picture index without generating a resolution identification
flag.
[0229] For example, based on the distance between the current
picture and the reference picture as shown in FIG. 30, the
resolution determiner 930 may index and encode the reference
picture. For example, based on an assumption that four reference
pictures are used, candidates of reference pictures, which can be
indexed when the current picture is No. 5, can be indexed as shown
in FIG. 31.
[0230] FIG. 31 is a table illustrating an example of reference
picture indexes according to reference picture numbers and
resolutions.
[0231] With resolutions 1/4 and 1/8 being used, in the event
illustrated in FIG. 31 where the optimal reference picture is
numbered 3 and has the resolution of 1/8, the reference picture
index may be encoded into 3. The decoding apparatus, after
extracting the reference picture index of 3 from the bitstream,
will then know that the reference picture number is 3 and that the
resolution is 1/8 by using the same table as is used by the
encoder.
[0232] The differential vector encoder 950 may differently encode
differential vectors depending on the motion vector resolutions.
That is, as the motion vector resolution increases, the size of the
motion vector also increases and the required bit quantity thus
increases. Therefore, by encoding differential vectors in different
ways according to the motion vector resolutions, the differential
vector encoder 950 can reduce the bit quantity.
[0233] For example, when the differential vector encoder 950
encodes the differential vector by using the UVLC, the differential
vector encoder 950 may use the K-th order Exp-Golomb code in the
encoding. In this event, the differential vector encoder 950 may
change the degree of order (K) of the Exp-Golomb code according to
the motion vector resolution determined for each area. For example,
in the case of encoding the differential vector by using the UVLC,
the degree of order (K) of the Exp-Golomb code can be set to "0"
when the motion vector resolution corresponds to the 1/4 pixel unit
and the degree of order (K) of the Exp-Golomb code can be set to
"1" when the motion vector resolution corresponds to the 1/8 pixel
unit.
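A K-th order Exp-Golomb code whose order follows the motion vector resolution can be sketched as below. The orders 0 and 1 are the ones given in the example above; the mapping of signed components to non-negative indices follows the common signed Exp-Golomb convention and is an assumption here, as are the names exp_golomb, signed_to_index, and encode_component.

```python
def exp_golomb(value, k):
    """K-th order Exp-Golomb codeword of a non-negative integer."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    shifted = value + (1 << k)
    prefix_len = shifted.bit_length() - k - 1  # number of leading zeros
    return "0" * prefix_len + format(shifted, "b")


def signed_to_index(v):
    """Map 0, 1, -1, 2, -2, ... to 0, 1, 2, 3, 4, ... (assumed mapping)."""
    return 2 * v - 1 if v > 0 else -2 * v


# Order K per motion vector resolution, as in the example above.
ORDER_BY_RESOLUTION = {"1/4": 0, "1/8": 1}


def encode_component(v, resolution):
    """Encode one differential vector component for the given resolution."""
    return exp_golomb(signed_to_index(v), ORDER_BY_RESOLUTION[resolution])


print(encode_component(3, "1/4"))  # 00110
print(encode_component(3, "1/8"))  # 0111
```

A higher order spends more bits on small values but fewer on large ones, which suits the larger differential vectors expected at the finer resolution.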
[0234] Further, when the differential vector encoder 950 encodes
the differential vector by using the CABAC, the differential vector
encoder 950 may use the Concatenated Truncated Unary/K-th Order
Exp-Golomb Code in the encoding. In the encoding, the differential
vector encoder 950 may change the degree of order (K) and the
maximum value (T) of the Concatenated Truncated Unary/K-th Order
Exp-Golomb Code according to the motion vector resolution
determined for each area. For example, in the case of encoding the
differential vector by using the CABAC, the degree of order (K) of
the code may be set to "3" and the maximum value (T) of the code
may be set to "6" when the motion vector resolution corresponds to
the 1/4 pixel unit, and the degree of order (K) of the code may be
set to "5" and the maximum value (T) of the code may be set to "12"
when the motion vector resolution corresponds to the 1/8 pixel
unit.
[0235] In addition, when the differential vector encoder 950
encodes the differential vector by using the CABAC, the
differential vector encoder 950 may differently calculate the
accumulation probability according to the motion vector resolution
determined for each area. For example, whenever encoding the
differential vectors of the areas, the differential vector encoder
950 may update each context model according to the motion vector
resolution determined for each area, and may use the updated
context model according to each motion vector resolution when
encoding a differential vector of another area. That is, when a
motion vector resolution of an area corresponds to the 1/2 pixel
unit, the differential vector encoder 950 may encode the
differential vector by using the context model of the 1/2 pixel
unit and update the context model of the 1/2 pixel unit. Further,
when a motion vector resolution of an area corresponds to the 1/8
pixel unit, the differential vector encoder 950 may encode the
differential vector by using the context model of the 1/8 pixel
unit and update the context model of the 1/8 pixel unit.
[0236] Further, in order to calculate the differential vector of
each area, the differential vector encoder 950 may predict a
predicted motion vector for each area or motion vector by using
motion vectors of surrounding areas of each area or motion vector.
In this event, when the motion vector resolution of each area is
not equal to the motion vector resolution of surrounding areas, the
differential vector encoder 950 may convert the motion vector
resolution of the surrounding areas to the motion vector resolution
of said each area for the prediction. For the converting of the
motion vector resolution, it is possible to use a round-off, a
round-up, or a round-down. In this event, it should be understood
that the surrounding areas include adjacent areas.
[0237] FIG. 16 is a view for illustrating the process of predicting
a predicted motion vector according to an aspect of the present
disclosure.
[0238] Referring to the example shown in FIG. 16, if motion vectors
of surrounding areas of an area of a predicted motion vector to be
predicted are (4, 5), (10, 7), and (5, 10) and a round-off is used
for the converting, the predicted motion vector may be (5, 5) when
the resolution of the motion vector of the area to be predicted is
1/4, and the predicted motion vector may be (10, 10) when the
resolution of the motion vector of the area to be predicted is
1/8.
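The numbers of this example can be reproduced by the sketch below, under assumptions that are not stated in the text: the three surrounding motion vectors are taken to have the resolutions 1/2, 1/8, and 1/8 respectively (FIG. 16 is not reproduced here), the predictor is the component-wise median, and the round-off rounds halves upward. The names convert and predict are hypothetical.

```python
def convert(value, src_den, dst_den):
    """Convert a component from 1/src_den to 1/dst_den pixel units.

    Integer round-off with halves rounded upward (assumed rounding rule).
    """
    return (2 * value * dst_den + src_den) // (2 * src_den)


def predict(neighbors, target_den):
    """Component-wise median of the converted neighbor motion vectors.

    neighbors: list of ((x, y), denominator) pairs, e.g. ((4, 5), 2)
        for the vector (4, 5) expressed in 1/2 pixel units.
    """
    xs = sorted(convert(mv[0], den, target_den) for mv, den in neighbors)
    ys = sorted(convert(mv[1], den, target_den) for mv, den in neighbors)
    return (xs[len(xs) // 2], ys[len(ys) // 2])


# Assumed resolutions of the surrounding areas: 1/2, 1/8, 1/8.
neighbors = [((4, 5), 2), ((10, 7), 8), ((5, 10), 8)]

print(predict(neighbors, 4))  # (5, 5)
print(predict(neighbors, 8))  # (10, 10)
```

For instance, at the 1/4 pixel unit the three vectors become (8, 10), (5, 4), and (3, 5), whose component-wise median is (5, 5).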
[0239] Further, when the block mode of one or more areas among the
areas is a skip mode, the differential vector encoder 950 may
convert the motion vector resolution of the area of the motion
vector to be predicted to the highest resolution among the motion
vector resolutions of surrounding areas of the area and then
perform the prediction. Referring to the example shown in FIG. 16,
when the area to be predicted is in the skip mode, since the
highest resolution among the motion vector resolutions of the
surrounding areas is 1/8, a predicted motion vector of (10, 10) is
obtained based on an assumption that the resolution of the area to
be predicted is 1/8.
[0240] Moreover, in predicting a predicted motion vector of an area
to be predicted by using motion vectors of surrounding areas of the
area, the differential vector encoder 950 may convert the motion
vectors of the surrounding areas to a predetermined resolution. In
this event, when a predetermined motion vector resolution and the
motion vector resolution of the area to be predicted are not equal
to each other, the differential vector encoder 950 may convert the
predetermined motion vector resolution to the motion vector
resolution of the area of the predicted motion vector to be
predicted, so as to obtain a final predicted motion vector.
Referring to the example shown in FIG. 16, the predicted motion
vector is converted to (3, 3) when the predetermined motion vector
resolution corresponds to the 1/2 pixel unit. Further, when the
motion vector resolution of the area to be predicted corresponds to
the 1/8 pixel unit, which is not equal to the predetermined motion
vector resolution, the predicted motion vector of (3, 3) is
converted to the 1/8 pixel unit, so as to obtain a final
predicted motion vector of (12, 12). In the same way, when the
motion vector resolution of the area to be predicted corresponds to
the 1/4 pixel unit, it is possible to obtain a final predicted
motion vector of (6, 6).
[0241] FIG. 17 is a flowchart for describing a method for encoding
a video by using an adaptive motion vector resolution according to
a first aspect of the present disclosure.
[0242] In a method for encoding a video by using an adaptive motion
vector resolution according to a first aspect of the present
disclosure, a motion vector resolution is first determined for each
area or motion vector, and an inter prediction encoding of a video
is performed in the unit of areas by using a motion vector
according to the motion vector resolution determined for each area
or motion vector. To this end, a video encoding apparatus 900 using
an adaptive motion vector resolution according to a first aspect of
the present disclosure determines whether the motion vector
resolution changes according to each area or motion vector of a
video (step S1710). When the motion vector resolution changes
according to each area or motion vector, the video encoding
apparatus 900 determines the motion vector resolution of each area
or motion vector (step S1720). Then, the video encoding apparatus
900 performs an inter prediction encoding of the video in the unit
of areas by using a motion vector according to the motion vector
resolution determined for each area or motion vector (step S1730).
In contrast, when the motion vector resolution does not change but
is fixed regardless of the area or motion vector, the video
encoding apparatus 900 performs an inter prediction encoding of the
video in the unit of areas by using a motion vector according to
the fixed motion vector resolution for lower areas within some
areas or all areas of the video (step S1740).
[0243] In this event, the motion vector resolution determined for
each area may have different values for an x component and a y
component of the area.
[0244] Further, the video encoding apparatus 900 may generate a
resolution identification flag, which indicates whether to
determine the motion vector resolution, according to each area or
motion vector. For example, when it is determined in step S1710
that the motion vector resolution changes according to each area or
motion vector, the video encoding apparatus 900 may generate a
resolution identification flag (e.g. "1") indicating that the
motion vector resolution changes according to each area or motion
vector. Further, when it is determined in step S1710 that the
motion vector resolution does not change but is fixed regardless of
the area or motion vector, the video encoding apparatus 900 may
generate a resolution identification flag (e.g. "0") indicating
that the motion vector resolution does not change but is fixed
regardless of the area or motion vector. Alternatively, the video
encoding apparatus 900 may generate a resolution identification
flag according to the set information input from a user or an
exterior, and may determine whether the motion vector resolution is
determined for each area as in step S1710 based on the bit value of
the generated resolution identification flag.
[0245] Further, the video encoding apparatus 900 may encode a
motion vector resolution determined for each area or motion vector.
For example, the video encoding apparatus 900 may hierarchically
encode the motion vector resolutions determined for respective
areas or motion vectors in a Quadtree structure by grouping areas
having the same motion vector resolution together, may encode the
motion vector resolution determined for each area or motion vector
by using a motion vector resolution predicted using motion vector
resolutions of surrounding areas of each area, may encode the
motion vector resolution determined for each area or motion vector
by using the run and length or may hierarchically encode the motion
vector resolutions by using a tag tree, or may perform the encoding
while changing the number of bits allocated to the motion vector
resolution according to the frequency of the motion vector
resolution determined for each area or motion vector. Also, the
video encoding apparatus 900 may determine whether a video decoding
apparatus can estimate the motion vector resolution determined for
each area or motion vector according to a pre-promised estimation
scheme, and then encode an identifier indicating the capability
of estimation for an area having a motion vector resolution that
can be estimated or encode an identifier indicating the
incapability of estimation for an area having a motion vector
resolution that cannot be estimated. In the case where the video
encoding apparatus 900 hierarchically encodes the motion vector
resolutions in a Quadtree structure or by using a tag tree, the
video encoding apparatus 900 may encode an identifier, which
indicates the size of an area indicated by the lowest node of the
tag tree layers and the maximum number of the tag tree layers or
the size of an area indicated by the lowest node of the Quadtree
layers and the maximum number of the Quadtree layers, and then
include the encoded identifier in a header.
[0246] Further, when the size of the differential motion vector or
predicted motion vectors of the motion vector according to the
motion vector resolution determined for each area is larger than a
threshold, the video encoding apparatus 900 may determine a
predetermined value or a certain value as the motion vector
resolution determined for each area. Further, when each component
of the differential motion vector is "0", the video encoding
apparatus 900 may dispense with encoding the resolution of the
motion vector or area.
[0247] Further, the video encoding apparatus 900 may encode a
differential motion vector corresponding to a difference between a
predicted motion vector and a motion vector according to the motion
vector resolution determined for each area or motion vector. In
this event, the video encoding apparatus 900 may differently encode
the differential motion vector depending on the motion vector
resolution. To this end, when the video encoding apparatus 900
encodes the differential vector by using the UVLC, the video
encoding apparatus 900 may use the K-th order Exp-Golomb code in
the encoding. In this event, the video encoding apparatus 900 may
change the degree of order (K) of the Exp-Golomb code according to
the motion vector resolution determined for each area. Further,
when the video encoding apparatus 900 encodes the differential
vector by using the CABAC, the video encoding apparatus 900 may use
the Concatenated Truncated Unary/K-th Order Exp-Golomb Code in the
encoding. In the encoding, the video encoding apparatus 900 may
change the degree of order (K) and the maximum value (T) of the
Concatenated Truncated Unary/K-th Order Exp-Golomb Code
according to the motion vector resolution determined for each area.
In addition, when the video encoding apparatus 900 encodes the
differential vector by using the CABAC, the video encoding
apparatus 900 may differently calculate the accumulation
probability according to the motion vector resolution determined
for each area.
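By way of a non-limiting illustration, the resolution-dependent choice of the order (K) for the UVLC case may be sketched as follows. The table ORDER_BY_RESOLUTION, the signed-to-unsigned mapping, and the function names are assumptions introduced only for this sketch; the disclosure states that K may change with the motion vector resolution but does not fix particular values.

```python
from fractions import Fraction

# Hypothetical mapping from motion vector resolution to Exp-Golomb order K.
# The disclosure only says K may depend on the resolution; these values are assumed.
ORDER_BY_RESOLUTION = {Fraction(1, 2): 0, Fraction(1, 4): 1, Fraction(1, 8): 2}


def signed_to_unsigned(value):
    """Map a signed component to a non-negative code number (assumed mapping)."""
    return 2 * value - 1 if value > 0 else -2 * value


def exp_golomb(code_num, k):
    """Return the K-th order Exp-Golomb bit string for a non-negative integer."""
    shifted = code_num + (1 << k)
    prefix_len = shifted.bit_length() - k - 1  # number of leading zeros
    return "0" * prefix_len + format(shifted, "b")


def encode_mvd_component(component, resolution):
    """Encode one differential motion vector component with a resolution-dependent K."""
    k = ORDER_BY_RESOLUTION[resolution]
    return exp_golomb(signed_to_unsigned(component), k)
```

In this sketch a finer resolution is given a larger K on the assumption that the same displacement produces larger integer magnitudes at finer resolutions; other assignments are equally possible.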
[0248] Further, the video encoding apparatus 900 may predict a
predicted motion vector for a motion vector of each area by using
motion vectors of surrounding areas of each area. In this event,
when the motion vector resolution of each area is not equal to the
motion vector resolution of surrounding areas, the video encoding
apparatus 900 may perform the prediction after converting the
motion vector resolution of the surrounding areas to the motion
vector resolution of said each area.
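The conversion of surrounding motion vectors to the resolution of the current area before prediction may be sketched as follows. The component-wise median predictor and the round-half-up rule are assumptions of this sketch and are not prescribed by the disclosure; motion vectors are assumed to be stored as integer multiples of their resolution.

```python
from fractions import Fraction
from math import floor


def convert_component(value, src_res, dst_res):
    """Rescale an integer component from src_res units to dst_res units.

    Rounding half up is an assumed rule for conversions to a coarser resolution.
    """
    scaled = Fraction(value) * src_res / dst_res
    return floor(scaled + Fraction(1, 2))


def convert_mv(mv, src_res, dst_res):
    """Convert both components of a motion vector to the destination resolution."""
    return tuple(convert_component(c, src_res, dst_res) for c in mv)


def predict_mv(neighbors, current_res):
    """Predict a motion vector from (mv, resolution) pairs of surrounding areas.

    Neighbors are first converted to current_res; the component-wise median
    (assumed predictor, intended for an odd number of neighbors) is returned.
    """
    converted = [convert_mv(mv, res, current_res) for mv, res in neighbors]
    xs = sorted(mv[0] for mv in converted)
    ys = sorted(mv[1] for mv in converted)
    mid = len(converted) // 2
    return (xs[mid], ys[mid])
```

For example, a surrounding vector (8, 4) held at 1/8 resolution becomes (4, 2) when the current area uses 1/4 resolution, so that all candidates are compared in the same units.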
[0249] In addition, the video encoding apparatus 900 may use
different methods of encoding a resolution identification flag
according to the distribution of the motion vector resolutions of
surrounding areas of each area with respect to the motion vector
resolution determined according to each area or motion vector.
[0250] Further, in performing the entropy encoding by an arithmetic
encoding, the video encoding apparatus 900 uses different methods
of generating a bit string of a resolution identification flag
according to the distribution of the motion vector resolutions of
the surrounding areas of each area and applies different context
models according to the distribution of the motion vector
resolutions of the surrounding areas and the probabilities of the
motion vector resolution having occurred up to the present, for the
arithmetic encoding and probability update. Also, the video
encoding apparatus 900 uses different context models according to
the bit position for the arithmetic encoding and context model
update.
[0251] Moreover, when the block mode of one or more areas among the
areas is a skip mode, the video encoding apparatus 900 may convert
the motion vector resolution of the area of the motion vector to be
predicted to the highest resolution among the motion vector
resolutions of surrounding areas of the area and then perform the
prediction.
[0252] FIG. 32 is a block diagram illustrating a video encoding
apparatus 3200 using an adaptive motion vector according to the
second aspect of the present disclosure.
[0253] A video encoding apparatus 3200 using an adaptive motion
vector according to the second aspect of the present disclosure
includes an inter prediction encoder 3210, a resolution
appointment flag generator 3220, a resolution determiner 3230, a
resolution encoder 3240, a differential vector encoder 3250, and a
resolution conversion flag generator 3260. Meanwhile, the resolution
appointment flag generator 3220, the resolution encoder 3240, the
differential vector encoder 3250, and the resolution conversion flag
generator 3260 need not all be included in the video encoding
apparatus 3200; they may be selectively included in the video
encoding apparatus 3200.
[0254] The inter prediction encoder 3210 performs an inter
prediction encoding of a video in the unit of areas of the image by
using a motion vector according to a motion vector resolution
determined for each motion vector or each area of the video. The
inter prediction encoder 3210 can be implemented by the video
encoding apparatus 100 described above with reference to FIG. 1. In
this event, when one or more elements between the resolution
encoder 3240 and the differential vector encoder 3250 of FIG. 32
are additionally included and the function of the additionally
included element or elements overlaps with the function of the
encoder 150 within the inter prediction encoder 3210, the
overlapping function may be omitted in the encoder 150. Further, if
there is an overlapping area between the function of the predictor
110 within the inter prediction encoder 3210 and the function of
the resolution determiner 3230, the overlapping function may be
omitted in the predictor 110.
[0255] Further, one or more elements between the resolution encoder
3240 and the differential vector encoder 3250 may be configured
either as an element separate from the inter prediction encoder
3210 as shown in FIG. 32 or as an element integrally formed with
the encoder 150 within the inter prediction encoder 3210. Further,
the flag information generated in the resolution appointment flag
generator 3220 or the resolution conversion flag generator 3260 may
be transformed into a bitstream either by the resolution
appointment flag generator 3220 or the resolution conversion flag
generator 3260 or by the encoder 150 within the inter prediction
encoder 3210.
[0256] Meanwhile, the functions of the inter prediction encoder
3210, the resolution encoder 3240, and the differential vector
encoder 3250 may be equal or similar to those of the inter
prediction encoder 910, the resolution encoder 940, and the
differential vector encoder 950 in FIG. 9. Therefore, a detailed
description on the inter prediction encoder 3210, the resolution
encoder 3240, and the differential vector encoder 3250 is omitted
here.
[0257] The resolution appointment flag generator 3220 may
differently appoint the adaptability degree of the resolution
according to each area or motion vector of a video. The resolution
appointment flag generator 3220 may generate a resolution
appointment flag appointing a set of motion vector resolutions
and/or differential motion vector resolutions to each area or
motion vector of a video, and then include the generated resolution
appointment flag in a bitstream. The area using the resolution
appointment flag to indicate a motion vector resolution and/or
differential motion vector resolution may be a block, a macroblock,
a group of blocks, a group of macroblocks, or an area having a
predetermined size, such as M×N. That is, the resolution
appointment flag generator 3220 may generate a resolution
appointment flag indicating a resolution available for lower areas
within some areas of a video or all areas of the video, and then
include the generated resolution appointment flag in a bitstream.
Such a resolution appointment flag may be determined and generated
either according to configuration information input by a user or
according to a predetermined determination criterion based on an
analysis of the video to be encoded. The resolution appointment
flag may be included in a header of a bitstream, such as a picture
parameter set, a sequence parameter set, or a slice header.
[0258] If the resolution appointment flag appoints 1/2 and 1/4 as
the resolution options, the optimum resolution determined by the
resolution determiner 3230 and the resolution identification flag
encoded by the resolution encoder 3240 are selected from the
resolutions of 1/2 and 1/4 and the resolution identification flag
may be encoded according to a predetermined method. FIG. 33
illustrates resolution identification flags in the case in which
the appointed resolutions are 1/2 and 1/4.
[0259] Further, the resolution identification flag may be encoded
using a unary coding, a CABAC, or a Quadtree coding. For example,
in the case of using the CABAC, a bit string may be first generated
using the table shown in FIG. 33 and then subjected to an
arithmetic and probability encoding. For example, according to the
motion vector resolutions of surrounding motion vectors or blocks,
the context models may be divided into three cases. FIG. 34
illustrates current block X and its surrounding blocks A, B, and C,
and FIG. 35 illustrates a context model according to the
conditions.
[0260] If the resolution appointment flag appoints 1/2, 1/4, and
1/8 as the resolution options, the encoded resolution
identification flag may be selected from the resolutions of 1/2,
1/4, and 1/8 and the resolution identification flag may be encoded
according to a predetermined method. FIG. 36 illustrates resolution
identification flags in the case in which the appointed resolutions
are 1/2, 1/4, and 1/8. Referring to FIG. 36, the resolution
identification flag may be 0, 10, or 11.
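The bit strings 0, 10, and 11 form a prefix code, so a sequence of resolution identification flags can be concatenated and parsed back without separators. The following is a minimal sketch; the assignment of each bit string to a particular resolution is defined by FIG. 36 and is assumed here (in listed order) purely for illustration.

```python
# Assumed assignment of bit strings to resolutions; the actual table is FIG. 36.
FLAG_TABLE = {"1/2": "0", "1/4": "10", "1/8": "11"}


def encode_flags(resolutions):
    """Concatenate the bit strings of the given resolutions."""
    return "".join(FLAG_TABLE[r] for r in resolutions)


def decode_flags(bits):
    """Parse a concatenated bit string back into a list of resolutions."""
    inverse = {code: res for res, code in FLAG_TABLE.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix code: first match is the codeword
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("truncated bit string")
    return decoded
```

In the CABAC case, each bit of such a string would be passed to the arithmetic coder as one bin rather than written directly.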
[0261] The resolution identification flag may be encoded using a
unary coding, a CABAC, or a Quadtree coding. For example, in the
case of using the CABAC, a bit string may be first generated using
the table shown in FIG. 36 and then subjected to an arithmetic and
probability encoding. For example, using the index of bin string
and the motion vector resolutions of surrounding motion vectors or
blocks, the context models may be divided into a total of six
cases. In this event, based on FIG. 34 illustrating current block X
and its surrounding blocks A, B, and C, FIG. 37 illustrates a
context model according to the conditions.
[0262] In the meantime, the resolution appointment flag generated
by the resolution appointment flag generator 3220 may indicate a
single resolution. For example, in the case of fixing the
resolution to 1/2 instead of adaptively applying the resolution,
the resolution identification flag may be encoded to indicate that
the resolution of the corresponding area is fixed to the resolution
of 1/2.
[0263] Further, in the case of using multiple reference pictures,
the adaptability degrees (i.e. resolution set) of the resolution
may be set to be different according to the reference picture based
on a predetermined criterion without encoding the resolution
identification flag. For example, different adaptability degrees of
the resolution may be employed according to the distance between
the current picture and reference pictures.
[0264] FIGS. 38 and 39 illustrate examples of adaptability degrees
according to distances between the current picture and reference
pictures. As noted from FIG. 38, when the distance between the
current picture and a reference picture is nearest (i.e. smallest)
among the distances between the current picture and the multiple
reference pictures, an optimum resolution may be selected from the
resolution set including 1/1, 1/2, 1/4, and 1/8 and a resolution
identification flag may be encoded. When the distance between the
current picture and a reference picture is farthest (i.e. largest)
among the distances between the current picture and the multiple
reference pictures, an optimum resolution may be selected from the
resolution set including 1/2 and 1/4 and a resolution
identification flag may be encoded. When the distance between the
current picture and a reference picture is neither nearest (i.e.
smallest) nor farthest (i.e. largest) among the distances between
the current picture and the multiple reference pictures, an optimum
resolution may be selected from the resolution set including 1/2,
1/4, and 1/8 and a resolution identification flag may be encoded.
It is noted from FIG. 39 that it is possible to use a single
resolution.
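The distance-dependent selection described with reference to FIG. 38 may be sketched as follows. The function name and the handling of a single reference picture (treated as nearest) are assumptions of this sketch; the three resolution sets are those stated above.

```python
def resolution_set_for(ref_distance, all_distances):
    """Return the resolution set for a reference picture at ref_distance.

    all_distances holds the distances between the current picture and every
    reference picture. A lone reference picture is treated as nearest (assumed).
    """
    if ref_distance == min(all_distances):
        return ["1/1", "1/2", "1/4", "1/8"]  # nearest reference picture
    if ref_distance == max(all_distances):
        return ["1/2", "1/4"]                # farthest reference picture
    return ["1/2", "1/4", "1/8"]             # neither nearest nor farthest
```

Because both the encoder and the decoder can evaluate the same rule from the picture distances, the set itself need not be signalled; only the resolution identification flag within the set is encoded.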
[0265] Further, at the time of generating reference pictures,
different adaptability degrees of the resolution may be employed
using an error measurement means, such as a Sum of Squared
Difference (SSD) between resolutions. For example, if usable
resolutions are 1/1, 1/2, 1/4, and 1/8, in interpolating a
reference picture, it is possible to set the resolution of 1/2 to
be used only when an error value obtained using an error
measurement means, such as an SSD, for the resolutions of 1/1 and
1/2 exceeds a predetermined threshold while setting the resolution
of 1/2 not to be used when the error value does not exceed the
predetermined threshold. Further, when it has been set that the
resolution of 1/2 should not be used, it is determined whether an
error value obtained using an error measurement means, such as an
SSD, for the resolutions of 1/1 and 1/4 exceeds a predetermined
threshold. When the error value for the resolutions of 1/1 and 1/4
does not exceed the predetermined threshold, the resolution of 1/4
is set not to be used. In contrast, when the error value for the
resolutions of 1/1 and 1/4 exceeds the predetermined threshold, the
resolutions of both 1/1 and 1/4 are set to be used. Also, when the
resolution of 1/4 has been set to be used, it is determined whether
an error value obtained using an error measurement means, such as
an SSD, for the resolutions of 1/4 and 1/8 exceeds a predetermined
threshold. When the error value for the resolutions of 1/4 and 1/8
does not exceed the predetermined threshold, the resolution of 1/8
is set not to be used. In contrast, when the error value for the
resolutions of 1/4 and 1/8 exceeds the predetermined threshold, all
the resolutions of 1/1, 1/4, and 1/8 are set to be used. The
threshold may be different according to the resolutions or
quantized parameters, or may be the same.
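The threshold test described above may be sketched as a single pass over the candidate resolutions, from coarsest to finest, in which each candidate is compared with the most recently retained resolution. This generalization is an assumption: the disclosure spells out only some of the branches (for example, comparing 1/1 with 1/4 when 1/2 is excluded), and the error values are supplied by a caller-provided function rather than computed from pictures here.

```python
def select_resolutions(candidates, pair_error, threshold):
    """Return the resolutions to be used.

    candidates : resolutions ordered from coarsest to finest, e.g. 1/1 first
    pair_error : callable (kept_res, candidate_res) -> error value such as an SSD
    threshold  : a candidate is retained only if its error exceeds this value
    """
    kept = [candidates[0]]  # the coarsest resolution is always retained
    for res in candidates[1:]:
        if pair_error(kept[-1], res) > threshold:
            kept.append(res)
    return kept
```

A per-resolution or per-quantization-parameter threshold, as mentioned above, could be supported by passing a callable instead of a single number.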
[0266] Further, it is possible to encode the employment of
different adaptability degrees of the resolution according to the
reference pictures. For example, in the case of using three
reference pictures, it is possible to store different index numbers
(resolution set indexes in FIG. 9, which may be reference picture
numbers) according to predetermined resolution sets in a header and
then transmit them to a decoding apparatus.
[0267] FIG. 41 illustrates an example of a structure for encoding
of reference pictures.
[0268] Meanwhile, the resolution appointment flag generator 3220
may use different resolution sets for a picture to be used as a
reference picture and a picture not to be used as a reference
picture, respectively. For example, it is assumed that reference
pictures have been encoded with the structure as shown in FIG. 41,
and pictures of time layers TL0, TL1, and TL2 correspond to
pictures used as reference pictures while pictures of time layer
TL3 correspond to pictures not used as reference pictures. In this
event, when the resolution set has been appointed to 1/2 and 1/4 by
the resolution appointment flag generator 3220, the resolution sets
according to the reference pictures may be arranged as shown in
FIG. 42.
[0269] Referring to FIG. 42, the resolution sets at the time of
encoding picture No. 6 are 1/2 and 1/4, and the resolution
identification flag or resolution appointment flag may be
encoded in the unit of areas or motion vectors for the resolution
sets determined as described above. At the time of encoding picture
No. 9, the resolution is a single resolution and it is not required
to encode the resolution identification flag or resolution
appointment flag.
[0270] Meanwhile, the resolution appointment flag generator 3220
may include all functions of the resolution change flag generator
920 as described above with reference to FIG. 9.
[0271] The resolution conversion flag generator 3260 generates a
resolution conversion flag, which indicates a change (or
difference) between a resolution of an area to be currently encoded
and a resolution of surrounding areas or a previous resolution.
[0272] FIG. 43 illustrates an example of a resolution of a current
block and resolutions of surrounding blocks.
[0273] For example, when a resolution set includes 1/2, 1/4, and
1/8 and resolutions of surrounding blocks and a current optimum
resolution have values as shown in FIG. 43, the resolutions of the
surrounding blocks are (1/8, 1/4, 1/4, and 1/4) and the resolution
having the highest frequency is 1/4. Further, since the optimum
resolution of current block X is also 1/4, the resolution
conversion flag is encoded to "0". In this event, the decoder can
extract the resolution conversion flag from a bitstream. Also, when
the resolution conversion flag is 0, the decoder can obtain
information that the resolution having the highest frequency, 1/4,
is the resolution of current block X.
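The example of FIG. 43 may be sketched as follows. The function names are hypothetical, and the tie-breaking rule when two resolutions occur equally often (first encountered wins) is an assumption not addressed by the disclosure.

```python
from collections import Counter


def most_frequent(resolutions):
    """Return the resolution occurring most often among the surrounding blocks."""
    return Counter(resolutions).most_common(1)[0][0]


def conversion_flag(neighbor_resolutions, current):
    """Return 0 if the current resolution equals the most frequent one, else 1."""
    return 0 if current == most_frequent(neighbor_resolutions) else 1


def decode_resolution(flag, neighbor_resolutions, signalled=None):
    """Decoder side: flag 0 means the most frequent surrounding resolution.

    For flag 1, 'signalled' stands for the resolution recovered from a
    separately decoded resolution identification flag.
    """
    if flag == 0:
        return most_frequent(neighbor_resolutions)
    return signalled
```

With surrounding resolutions (1/8, 1/4, 1/4, 1/4) and a current resolution of 1/4, the flag is 0 and the decoder recovers 1/4 without any further information.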
[0274] FIG. 44 illustrates another example of a resolution of a
current block and resolutions of surrounding blocks, and FIG. 45
illustrates resolution identification flags according to
resolutions.
[0275] In FIG. 44, since the optimum resolution of current block X
is not 1/4, which is the resolution of the surrounding block having
the highest frequency among the resolutions of the surrounding
blocks, the resolution conversion flag is encoded to 1 so as to
indicate that it is a resolution different from those of the
surrounding blocks, and the resolution identification flag of the
resolution of current block X is encoded to 1 by using the table
shown in FIG. 45. Since there is no possibility that 1/4 is
selected as the converted resolution when the current resolution is
1/4, a resolution identification flag is not provided in the case
of the resolution of 1/4.
[0276] FIG. 46 illustrates an example of the resolution of the
current block and the resolutions of surrounding blocks.
[0277] For example, when a resolution set includes 1/2 and 1/4 and
the encoding has been performed as shown in FIG. 46, since a
previous block of current block X is A, the resolution conversion
flag may indicate whether the resolution of block A and the
resolution of current block X are identical to each other.
Therefore, in the case described above, the resolution of block A
and the resolution of current block X are not identical to each
other, and the resolution conversion flag may thus have a value of
1. Further, since the resolution set includes 1/2 and 1/4, it is
possible to understand that the resolution of the current block is
1/4, even with only the resolution conversion flag without
additionally encoding the resolution identification flag.
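When the resolution set has exactly two members, the resolution conversion flag relative to the previous block is sufficient on its own, as the following sketch illustrates. The function names are hypothetical.

```python
def encode_conversion_flag(previous, current):
    """Return 0 if the current block keeps the previous block's resolution, else 1."""
    return 0 if previous == current else 1


def decode_resolution(flag, previous, resolution_set):
    """Recover the current resolution from the flag and the previous resolution."""
    if flag == 0:
        return previous
    others = [r for r in resolution_set if r != previous]
    if len(others) != 1:
        # With more than two resolutions, a resolution identification flag
        # would also have to be decoded to pick among the alternatives.
        raise ValueError("resolution identification flag required")
    return others[0]
```

In the example of FIG. 46, block A has a resolution of 1/2 and the flag is 1, so the decoder infers 1/4 for current block X.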
[0278] FIG. 47 is a flowchart illustrating a video encoding method
using an adaptive motion vector resolution according to the second
aspect of the present disclosure.
[0279] As shown in FIG. 47, the video encoding method using an
adaptive motion vector resolution according to the second aspect of
the present disclosure includes: a resolution appointment flag
generating step (S4702), a resolution determining step (S4704), an
inter prediction encoding step (S4706), a differential vector
encoding step (S4708), a resolution encoding step (S4710), and a
resolution conversion flag generating step (S4712).
[0280] The resolution appointment flag generating step (S4702)
corresponds to the operation of the resolution appointment flag
generator 3220, a resolution determining step (S4704) corresponds
to the operation of the resolution determiner 3230, an inter
prediction encoding step (S4706) corresponds to the operation of
the inter prediction encoder 3210, a differential vector encoding
step (S4708) corresponds to the operation of the differential
vector encoder 3250, a resolution encoding step (S4710) corresponds
to the operation of the resolution encoder 3240, and a resolution
conversion flag generating step (S4712) corresponds to the
operation of the resolution conversion flag generator 3260.
Therefore, a detailed description on the process in each step is
omitted here.
[0281] Further, the steps described above may include a step or
steps, which can be omitted, depending on the existence or absence
of each element of the video encoding apparatus 3200, from the
method of encoding a video using an adaptive motion vector
resolution according to the second aspect of the present
disclosure.
[0282] FIG. 52 is a schematic block diagram of a video encoding
apparatus according to a third aspect of the present
disclosure.
[0283] The video encoding apparatus 5200 according to the third
aspect of the present disclosure is an apparatus for encoding a
video and may include a reference picture interpolator 5210 and an
inter prediction encoder 5220.
[0284] The reference picture interpolator 5210 can adaptively
determine a type of a filter or a filter coefficient according to
the target precision set in the unit of predetermined areas.
Accordingly, the reference picture interpolator 5210 can select and
determine one filter, which corresponds to an optimum filter in a
corresponding area among a plurality of filters having a fixed
filter coefficient, or calculate and determine a preset filter
coefficient or an optimum filter coefficient in a corresponding
area of a fixed filter. Information on the filter or filter
coefficient determined through the selection or the calculation can
be encoded and included in a bitstream.
[0285] Further, the reference picture interpolator 5210
interpolates the reference picture to have the target precision by
filtering the reference picture stage by stage using a plurality of
filters. That is, when the reference picture having the target
precision is generated by interpolating the reference picture, the
reference picture can be interpolated to have the target precision
through the multi-stage filtering of the reference picture by using
a plurality of filters or filter coefficients instead of one step
filtering using one filter or filter coefficient.
[0286] Here, the filter for interpolating the reference picture may
include a Wiener filter, a Bilinear filter, a Kalman filter, etc.
The target precision refers to the precision aimed at when a motion of
an area which is desired to be encoded by the video encoding
apparatus 5200 is estimated, and various precisions such as single
precision, double precision, quadruple precision, and octuple
precision may be used. A detailed description of the reference
picture interpolator 5210 will be discussed later with reference to
FIG. 53.
[0287] The inter prediction encoder 5220 performs an inter
prediction encoding of a video by using the interpolated reference
picture having the target precision. That is, the inter prediction
encoder 5220 estimates and compensates a motion of a predetermined
area of a video such as a block which is desired to be encoded
using the interpolated reference picture having the target
precision by the reference picture interpolator 5210, so that the
inter prediction encoding of the video of the corresponding area is
performed and a bitstream is generated. The bitstream generated as
described above includes information on the encoding by the
reference picture interpolator 5210. The inter prediction encoder
5220 may be implemented as the video encoding apparatus 100
described with reference to FIG. 1.
[0288] Although the video encoding apparatus 100 described with
reference to FIG. 1 encodes a video in the unit of blocks, the
inter prediction encoder 5220 can encode a video in the unit of
areas having a predetermined size by dividing the video into areas
having various types and sizes, such as a block including a macro
block or a sub block, a slice, or a picture. The predetermined area
may be a macro block of a 16×16 size, but the present invention is
not limited thereto, and the area may be a block having various
types and sizes, such as a block of a 64×64 size or a block of a
32×16 size.
[0289] Further, the video encoding apparatus 100 described through
FIG. 1 performs inter prediction encodings of all blocks of a video
into motion vectors having the same motion vector precision and
determines a motion vector by interpolating a reference picture
with the same precision to estimate the motion. However, the inter
prediction encoder 5220 determines a motion vector and performs an
inter prediction encoding by estimating the motion by using a
reference picture interpolated with different precisions for each
predetermined area through a filter or a filter coefficient
determined for each predetermined area by the reference picture
interpolator 5210.
[0290] FIG. 53 is a schematic block diagram of a reference picture
interpolating apparatus for a video encoding according to an aspect
of the present disclosure.
[0291] A reference picture interpolating apparatus according to an
aspect of the present disclosure may be implemented as the
reference picture interpolator 5210 in the video encoding apparatus
5200 according to the third aspect of the present disclosure
described with reference to FIG. 52. Hereinafter, for the
convenience of description, the reference picture interpolating
apparatus according to the aspect of the present disclosure will be
referred to as the reference picture interpolator 5210.
[0292] The reference picture interpolator 5210 may include a filter
selector 5310, a filter 5320, and a filter information encoder
5330.
[0293] The filter selector 5310 adaptively determines types of
filters or filter coefficients according to the target precision
determined in the unit of predetermined areas. Accordingly, the
filter selector 5310 can select and determine one optimum filter in
a corresponding area from a filter set including a plurality of
filters having a fixed coefficient. That is, when the filter
selector 5310 interpolates a reference picture by filtering the
reference picture by using a plurality of filters having a fixed
filter coefficient, the filter selector 5310 can select one filter,
which has the minimum difference between the interpolated reference
picture and a current picture, as the one optimum filter. Further,
the filter selector 5310 calculates a filter coefficient, which has
the minimum difference between an interpolated reference picture
and a current picture for a certain filter, and can determine the
calculated filter coefficient as the optimum filter coefficient.
Here, the difference between the interpolated reference picture and
the current picture may be calculated by a difference function such
as SAD (Sum of Absolute difference) or SSD (Sum of Squared
Difference), but the present invention is not limited thereto and
may be calculated by various methods.
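The selection of one optimum filter from a set of fixed-coefficient filters may be sketched in one dimension as follows. The two candidate filters, the edge clamping, and the simplification that the current row is directly comparable with the half-pixel samples are all assumptions of this sketch.

```python
# Assumed candidate filter set with fixed integer coefficients.
FILTERS = {
    "bilinear": [1, 1],
    "six_tap": [1, -5, 20, 20, -5, 1],
}


def interpolate_half_pel(row, taps):
    """Return the half-pixel samples between neighbouring pixels of a row."""
    half = len(taps) // 2
    norm = sum(taps)
    last = len(row) - 1
    out = []
    for i in range(last):
        acc = 0
        for t, coeff in enumerate(taps):
            idx = min(max(i - half + 1 + t, 0), last)  # clamp at the row edges
            acc += coeff * row[idx]
        out.append(acc / norm)
    return out


def ssd(a, b):
    """Sum of Squared Difference."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sad(a, b):
    """Sum of Absolute Difference."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_filter(reference_row, current_row, filters, cost=ssd):
    """Return the name of the filter whose interpolation is closest to current_row."""
    return min(
        filters,
        key=lambda name: cost(interpolate_half_pel(reference_row, filters[name]), current_row),
    )
```

Passing cost=sad switches the difference function without changing the selection logic.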
[0294] The filter 5320 generates an interpolated reference picture
having the target precision by filtering a reference picture by
using a filter or a filter coefficient determined by the filter
selector 5310. That is, when the filter selector 5310 selects one
filter from a plurality of filters, the filter 5320 interpolates
the reference picture by filtering the reference picture by using
the selected filter. When the filter selector 5310 calculates a
filter coefficient, the filter 5320 interpolates the reference
picture by filtering the reference picture by using a filter having
the calculated filter coefficient. The reference picture as
interpolated above becomes a reference picture of the target
precision having a pixel (an integer pixel and/or a sub-pixel) of
the target precision and is used as a reference picture when the
inter prediction encoder 5220 determines a motion vector by
estimating a motion of a predetermined area. That is, the inter
prediction encoder 5220 estimates the motion by using the
interpolated reference picture with the target precision by the
filter 5320.
[0295] The filter information encoder 5330 encodes information on a
filter coefficient or information on a filter determined by the
filter selector 5310. The encoded information on the filter and the
filter coefficient may be included in a bitstream.
[0296] The filter selector 5310 can differently determine filter
coefficients according to the target precision for an
interpolation. For example, the filter selector 5310 can calculate
an optimum filter tap for each precision by using a Wiener filter
in order to calculate an optimum filter coefficient, which has the
minimum SSD (Sum of Squared Difference) between a reference picture
and a current picture to be currently encoded, as shown in FIG. 2.
[0297] The filter selector 5310 can calculate filter coefficients
for each precision by using a Wiener-Hopf Equation like Equation
(2). For example, a filter coefficient of a 6-tap Wiener filter may
be calculated for double precision, a filter coefficient of an
8-tap Kalman filter may be calculated for quadruple precision, a
filter coefficient of a linear filter may be calculated for octuple
precision, and the calculated filter coefficients may be encoded
and included in the bitstream. In this event, the filter 5320
encodes the filter coefficients by using a reference picture
interpolated by a 6-tap Wiener filter when the precision of the
current area or motion vector is the double precision, and encodes
the filter coefficients by using a reference picture interpolated
by an 8-tap Kalman filter when the precision of the current area or
motion vector is the quadruple precision.
[0298] Further, the filter selector 5310 can determine a plurality
of filters or filter coefficients for one area. That is, the filter
selector 5310 determines an optimum filter or filter coefficient
for interpolating the reference picture and can further determine a
filter or a filter coefficient for interpolating an interpolated
reference picture again by using the determined filters or filter
coefficients. That is, when the reference picture interpolator 5210
interpolates a reference picture of the target precision by
filtering the reference picture stage-by-stage, the filter selector
5310 selects a filter, which has the minimum difference between the
interpolated reference picture and the current picture, or
determines a filter coefficient and then can additionally select a
filter, which has the minimum difference between the interpolated
reference picture and the current picture, or additionally
calculate a filter coefficient. The filter 5320 can interpolate the
reference picture by filtering the interpolated reference picture
by using the additionally selected filter or interpolate the
reference picture having the target precision by filtering the
interpolated reference picture by using a filter having the
additionally calculated filter coefficient.
[0299] Further, in the calculation of the filter coefficient, the
filter selector 5310 can reduce the number of filter coefficients
to be encoded based on an assumption that filter coefficients in
similar positions are symmetrical. For example, as shown in
Equation (11), a filter coefficient applied to a C1 pixel among
filters used for an interpolation of a sub-pixel S.sub.02, a filter
coefficient applied to a pixel A3 among filters used for an
interpolation of a sub-pixel S.sub.20, a filter coefficient applied
to a pixel C6 among filters used for an interpolation of a
sub-pixel S.sub.06, and a filter coefficient applied to a pixel F3
among filters used for an interpolation of a sub-pixel S.sub.60 may
be assumed to have the same value. This assumption reduces the
number of interpolation filter coefficients that the filter
information encoder 5330 must encode, and thus improves the
compression efficiency.
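\[
\begin{aligned}
h^{S_{02}}_{C1} &= h^{S_{20}}_{A3} = h^{S_{06}}_{C6} = h^{S_{60}}_{F3} \\
h^{S_{02}}_{C2} &= h^{S_{20}}_{B3} = h^{S_{06}}_{C5} = h^{S_{60}}_{E3} \\
h^{S_{02}}_{C3} &= h^{S_{20}}_{C3} = h^{S_{06}}_{C4} = h^{S_{60}}_{D3}
\end{aligned}
\qquad \text{Equation (11)}
\]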
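The coefficient sharing of Equation (11) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the names SYMMETRY_GROUPS and expand_shared are hypothetical, the numeric values are made up, and the S.sub.06 entry of the third row is read as C4 on the basis of the symmetry of the other rows.

```python
# Hypothetical illustration of the coefficient sharing in Equation (11).
# Each group lists (sub-pixel, integer pixel) pairs assumed to share one value.
# The S06 entry of the third group is taken to be C4 (assumption by symmetry).
SYMMETRY_GROUPS = [
    [("S02", "C1"), ("S20", "A3"), ("S06", "C6"), ("S60", "F3")],
    [("S02", "C2"), ("S20", "B3"), ("S06", "C5"), ("S60", "E3")],
    [("S02", "C3"), ("S20", "C3"), ("S06", "C4"), ("S60", "D3")],
]


def expand_shared(shared_values):
    """Expand one transmitted value per group into every tied coefficient."""
    if len(shared_values) != len(SYMMETRY_GROUPS):
        raise ValueError("one value is expected per symmetry group")
    table = {}
    for value, group in zip(shared_values, SYMMETRY_GROUPS):
        for key in group:
            table[key] = value
    return table


# Three values are sent instead of twelve (the numbers are made up).
coefficients = expand_shared([0.03, -0.10, 0.57])
```

In this sketch only three values would need to be encoded for the twelve coefficients covered by the three rows of Equation (11).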
[0300] Further, when a filter coefficient is calculated by using a
Wiener filter, the filter selector 1010 can calculate a filter
coefficient h.sup.SP, which has a minimum square of an error
e.sup.SP, as shown in Equation (2).
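As a minimal sketch of such a least-squares calculation, the following code solves the normal equations for the taps that minimize the summed squared error between target samples and filtered neighborhoods. Equation (2) itself is not reproduced here, so the normal-equation form, the function names solve_linear and wiener_coefficients, and the sample data are assumptions made for illustration only.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def wiener_coefficients(neighborhoods, targets):
    """Return taps h minimizing sum((target - h . neighborhood) ** 2)."""
    taps = len(neighborhoods[0])
    # Autocorrelation matrix of the reference samples.
    r = [[sum(x[i] * x[j] for x in neighborhoods) for j in range(taps)]
         for i in range(taps)]
    # Cross-correlation between reference samples and target samples.
    p = [sum(x[i] * t for x, t in zip(neighborhoods, targets))
         for i in range(taps)]
    return solve_linear(r, p)


# Toy data generated with taps (0.25, 0.75); the solver should recover them.
h = wiener_coefficients([[1, 0], [0, 1], [1, 1], [2, 1]],
                        [0.25, 0.75, 1.0, 1.25])
```

A real encoder would gather the neighborhoods from the reference picture and the targets from the current picture for each sub-pixel position; the two-tap toy data above only exercises the arithmetic.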
[0301] Hereinafter, a process will be described in which the
reference picture interpolator 5210 interpolates a reference
picture of the target precision through a multi-stage filtering of
the reference picture by using a plurality of filters or a
plurality of filter coefficients with reference to FIGS. 54 to
56.
[0302] FIGS. 54 and 55 show examples of filters used in a
multi-stage filtering according to an aspect of the present
disclosure and FIG. 56 shows an example for describing a process of
the multi-stage filtering according to an aspect of the present
disclosure.
[0303] FIG. 54 shows an example of a 6.times.6 tap filter and FIG.
55 shows an example of a 4.times.4 tap filter. Pixels shown in FIG.
56 correspond to pixels included in a shaded part in FIGS. 54 and
55 and refer to pixels interpolated by using filters shown in FIGS.
54 and 55.
[0304] In an example, the reference picture interpolator 5210 can
interpolate sub-pixels included in a shaded part based on already
reconstructed integer pixels A1 to F6 within the reference picture
by using a 6.times.6 tap filter shown in FIG. 54. Further, when the
reference picture interpolator 5210 uses only a 6-tap filter of the
6.times.6 tap filter shown in FIG. 54, the reference picture
interpolator 5210 can interpolate sub-pixels S.sub.11, S.sub.22,
S.sub.33, S.sub.44, S.sub.55, and S.sub.66 shown in FIG. 56 based
on already reconstructed integer pixels A1, B2, C3, D4, E5, and F6
within the reference picture or sub-pixels S.sub.01, S.sub.02,
S.sub.03, S.sub.04, S.sub.05, S.sub.06, and S.sub.07 based on
already reconstructed integer pixels C1, C2, C3, C4, C5, and C6.
Further, the reference picture interpolator 5210 can interpolate
sub-pixels included in a shaded part shown in FIG. 55 by using a
4.times.4 tap filter shown in FIG. 55 in a similar way as that
described above.
[0305] At this time, when resolutions of a width and a length are
interpolated 8 times, respectively, the reference picture
interpolator 5210 can encode 63 filter coefficient sets of the
6.times.6 tap filter for interpolating 63 sub-pixels S.sub.01 to
S.sub.77. Alternatively, the reference picture interpolator 5210
can interpolate a sub-pixel of a 1/4 or a 1/2 pixel position by
using a filter having a calculated optimum filter coefficient based
on integer pixels in a first stage and interpolate a sub-pixel of a
1/8 pixel position by using an optimum interpolation filter or a
Bilinear interpolation filter based on an integer pixel and a
sub-pixel of a 1/4 or a 1/2 pixel position in a second stage.
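The two-stage alternative can be sketched in one dimension as follows. This is a reduced illustration rather than the disclosed implementation: the taps (1, -5, 20, 20, -5, 1)/32 are a stand-in fixed filter (the half-sample luma taps of H.264) used in place of the calculated optimum coefficients, the names half_pel and interpolate_two_stage are hypothetical, edge samples are clamped, and the sketch stops at one bilinear refinement, which could be repeated to reach the 1/8 pixel position.

```python
# Stand-in fixed 6-tap filter (assumption); the taps sum to 32.
HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)


def half_pel(row, i):
    """Half-pel sample between row[i] and row[i + 1]; edges are clamped."""
    acc = 0
    for k, tap in enumerate(HALF_PEL_TAPS):
        idx = min(max(i - 2 + k, 0), len(row) - 1)
        acc += tap * row[idx]
    return acc / 32.0


def interpolate_two_stage(row):
    """Stage 1: 6-tap half-pel samples. Stage 2: bilinear midpoints."""
    # First stage uses integer pixels only.
    stage1 = []
    for i in range(len(row) - 1):
        stage1.append(float(row[i]))
        stage1.append(half_pel(row, i))
    stage1.append(float(row[-1]))
    # Second stage uses integer pixels and first-stage sub-pixels.
    stage2 = []
    for a, b in zip(stage1, stage1[1:]):
        stage2.append(a)
        stage2.append((a + b) / 2.0)
    stage2.append(stage1[-1])
    return stage2


flat = interpolate_two_stage([10, 10, 10, 10])
mid_sample = half_pel([0, 8, 16, 24, 32, 40], 2)
```

The second stage never revisits the integer pixels with a long filter, which is the saving this alternative offers compared with encoding a separate 6.times.6 coefficient set for every sub-pixel.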
[0306] In another example, when resolutions of a width and a length
of the reference picture are interpolated 8 times, respectively,
the reference picture interpolator 5210 interpolates sub-pixels
S.sub.02, S.sub.04, and S.sub.06 by using the 6-tap filter based on
integer pixels C1, C2, C3, C4, C5, and C6, interpolates sub-pixels
S.sub.20, S.sub.40, and S.sub.60 by using the 6-tap filter based on
integer pixels A3, B3, C3, D3, E3, and F3, and interpolates
sub-pixels S.sub.22, S.sub.24, S.sub.26, S.sub.42, S.sub.44,
S.sub.46, S.sub.62, S.sub.64, and S.sub.66 of a 1/2 and a 1/4 pixel
position by using the 6.times.6 tap filter in a first stage. The
reference picture interpolator 5210 can interpolate again
sub-pixels of the 1/8 pixel position by using the 4-tap filter,
6-tap filter, and the 6.times.6 tap filter based on the integer
pixels and the sub-pixels interpolated in the first stage in a
second stage.
[0307] FIG. 67 shows an example of a video encoding apparatus 6700
according to a fourth aspect of the present disclosure.
[0308] As shown in FIG. 67, the video encoding apparatus 6700
according to a fourth aspect of the present disclosure may include
the inter prediction encoder 3210, the resolution appointment flag
encoder 3220, the resolution determiner 3230, the differential
vector encoder 3250, the resolution conversion flag generator 3260,
and a reference picture interpolator 6710. Meanwhile, not all of
the resolution appointment flag encoder 3220, the resolution
determiner 3230, the differential vector encoder 3250, and the
resolution conversion flag generator 3260 have to be included in
the aspect shown in FIG. 67; they may be selectively included
according to an implementation method. Further, when the reference
picture interpolator 6710 or 5210 and the inter prediction encoder
3210 of the present disclosure have the same operation, so that the
operation would be performed twice, the operation performed in the
inter prediction encoder may be omitted.
[0309] Further, an operation of the filter information encoder
within the reference picture interpolator 6710 or 5210 may be
implemented integrally with the encoder 150 within the inter
prediction encoder 3210.
[0310] Further, operations of the resolution appointment flag
encoder 3220, the resolution determiner 3230, the differential
vector encoder 3250, and the resolution conversion flag generator
3260 shown in FIG. 67 may be the same as or similar to operations
of the resolution appointment flag encoder 3220, the resolution
determiner 3230, the differential vector encoder 3250, and the
resolution conversion flag generator 3260 shown in FIG. 32,
respectively. The operations of the resolution appointment flag
encoder 3220, the resolution determiner 3230, the differential
vector encoder 3250, and the resolution conversion flag generator
3260 may be the same as or similar to functions in FIG. 32 except
for a function of generating a resolution appointment flag, a
resolution identification flag, and a resolution conversion flag to
transmit them to the reference picture interpolator 6710.
Accordingly, a more detailed description of the operations of the
resolution appointment flag encoder 3220, the resolution determiner
3230, the differential vector encoder 3250, and the resolution
conversion flag generator 3260 in FIG. 67 is omitted. Further, a
function of the inter prediction encoder 3210 in FIG. 67 may be the
same as or similar to a function of the inter prediction encoder
3210 in FIG. 32 or the inter prediction encoder 5220, so a more
detailed description for an operation of the inter prediction
encoder 3210 is omitted in FIG. 67.
[0311] The reference picture interpolator 6710 may include a
function of the reference picture interpolator 5210 in FIG. 52,
include a function of the reference picture interpolator 6710,
which will be described hereinafter, or include both
functions.
[0312] The function of the reference picture interpolator 6710 will
be described with reference to FIGS. 63 to 65.
[0313] A filter may be selected through the similarity with a
current picture in the generation of the reference picture.
Further, different filters may be used for each resolution. In this
event, a filter tap may be applied in the unit of pictures or
slices. Accordingly, filter information according to the resolution
may be included in a bitstream in the unit of pictures or slices.
The filter information may be informed by a filter flag when a
fixed filter coefficient is used, and the filter information may
contain a filter coefficient when an adaptive filter coefficient is
used.
[0314] For example, when the reference picture is interpolated to
have a resolution of 1/8, the interpolation may be differently
performed for each resolution by using a table of FIG. 63.
[0315] An optimum filter is selected for each resolution with
reference to a table of a filter tap in accordance with the
resolution in FIG. 63. The filter whose result is most similar to
the current picture is selected. When a filter for a resolution of
1/2 is selected, the differences between the current picture and
the pixels at a 1/2 pixel position interpolated using the 8-tap
Wiener filter are compared with the differences between the current
picture and the pixels at a 1/2 pixel position interpolated using
the 8-tap Kalman filter, and the filter having the minimum
difference is selected as the optimum filter. The difference can be
calculated by SAD (Sum of Absolute Differences) or SSD (Sum of
Squared Differences). When a
filter of a resolution of 1/4 is selected, the pixel interpolated
using the optimum filter at a resolution of 1/2 may be used or may
not be used.
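The selection rule described above can be sketched as follows, assuming that each candidate filter has already produced its interpolated samples. The function names, the candidate labels, and the sample values are hypothetical and serve only to show the SAD and SSD comparison.

```python
def sad(a, b):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of squared differences between two equally long sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_filter(candidates, current, cost=sad):
    """Return (name, cost) of the candidate closest to the current picture.

    candidates maps a filter name to the samples it interpolated.
    """
    best = min(candidates, key=lambda name: cost(candidates[name], current))
    return best, cost(candidates[best], current)


# Made-up half-pel samples produced by two candidate filters.
candidates = {"wiener8": [10, 12, 14], "kalman8": [11, 12, 16]}
current = [10, 13, 14]
choice_sad = select_filter(candidates, current)
choice_ssd = select_filter(candidates, current, cost=ssd)
```

Passing the cost function as a parameter lets the same selection be run with either measure.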
[0316] When a filter having a resolution of 1/4 is selected, if the
pixel interpolated using the optimum filter at a resolution of 1/2
is not used, the resolution of 1/2 for making the resolution of 1/4
may be interpolated using another filter. In this event, the
resolution of 1/2 for making the resolution of 1/4 can be equally
appointed between the reference picture interpolator 6710 and a
reference picture interpolator 7110 of the video decoding
apparatus, which will be described later. Further, when the filter
having the resolution of 1/4 is selected, the difference from the
current picture is calculated for each filter and a filter, which
has the minimum difference, is selected as the optimum filter in
the same way of the resolution of 1/2. When the filter of a
resolution of 1/8 is selected, it can be selected using pixels
interpolated using the optimum filter at the resolutions of 1/2 and
1/4 or resolutions of 1/2 and 1/4 are made and used for the
resolution of 1/8. When the resolutions of 1/2 and 1/4 are made for
the resolution of 1/8, they can be equally appointed between the
reference picture interpolator 6710 and the reference picture
interpolator 7110 of the video decoding apparatus, which will be
described later.
[0317] FIG. 64 illustrates a table indicating types of a filter tap
varying according to the resolution of a motion vector.
[0318] When the reference picture is interpolated using a 1/4 pixel
position, a 1/8 pixel position, or a pixel position smaller than
the 1/4 or 1/8 pixel position, the reference picture is
interpolated using only the 1/1 pixel position without using the
pixel position of the previous resolution and different filters may
be used depending on the resolutions of motion vectors. For
example, when the reference picture is interpolated to have a
resolution of 1/16, the interpolation may be differently performed
for each resolution by using a table shown in FIG. 64 and a
differential motion vector may be encoded based on the resolution
of 1/16.
[0319] FIG. 65 illustrates another table indicating types of a
filter tap varying according to the resolution of a motion
vector.
[0320] As shown in FIG. 65, resolutions up to the resolution of 1/2
are interpolated by using the 8-tap Wiener filter. Then, the
resolutions smaller than the resolution of 1/2 may be interpolated
based on the resolutions of 1/1 and 1/2, with the only difference
in that different filters are used. In this event, a differential
motion vector may be encoded based on the resolution of 1/16.
[0321] FIG. 66 illustrates a table in the case in which an optimum
position is found using resolutions of 1/2 and 1/4.
[0322] Meanwhile, when the reference picture is interpolated, the
filter is not applied in the unit of pictures or slices and an
optimum filter may be found for each resolution in the unit of
predetermined areas.
[0323] The predetermined area may be 16.times.16, 32.times.32,
64.times.64, 128.times.128 or the unit of motion vectors. In this
event, an optimum filter is encoded for each resolution in the unit
of predetermined areas or the reference picture interpolator 6710
and a reference picture interpolator 7110 of the video decoding
apparatus, which will be described later, can equally appoint a
filter for each resolution in advance. For example, when
resolutions up to a resolution of 1/4 are used and an optimum filter
is determined for each resolution in the unit of motion vectors, a
motion vector is found by a 1/1 pixel position and surroundings of
the 1/1 pixel position may use different filters according to the
resolutions. In this event, a motion vector is first found by the
1/1 pixel position and an optimum position may be found by
interpolating resolutions of 1/2 and 1/4 by using the table shown
in FIG. 66. In this event, a differential motion vector may be
encoded based on the resolution of 1/4. The encoder generates a
reconstructed video according to positions of motion vectors with
reference to FIG. 66. Alternatively, when a motion vector
resolution identification flag is transmitted, the differential
motion vector encoder can perform an encoding by using the code
number table of differential motion vectors according to the motion
vector resolutions as shown in FIG. 25. The decoder decodes the
motion vector resolution identification flag, and the differential
motion vector decoder performs a decoding by using the code number
table of differential motion vectors according to the motion vector
resolutions as shown in FIG. 25 and generates a reconstructed video
according to the motion vector resolution.
[0324] When the reference picture interpolator 6710 receives a
resolution identification flag from the resolution determiner 3230,
the reference picture interpolator 6710 can use different filters
through the motion vector resolutions and select an optimum filter
for each resolution from a plurality of filters.
[0325] For example, when the reference picture is interpolated
using a 1/4 pixel position, a 1/8 pixel position, or a pixel
position smaller than the 1/4 or 1/8 pixel position, the reference
picture is interpolated using only the 1/1 pixel position without
using the pixel position of the previous resolution and different
filters may be used depending on the resolutions of motion
vectors. For example, when the resolution appointment flags
indicate (1/2, 1/4, 1/8), the interpolation may be differently
performed by selecting an optimum filter for each resolution by
using the table shown in FIG. 63.
[0326] When an optimum filter is selected for each resolution with
reference to the table shown in FIG. 63, the filter whose result is
most similar to the current picture is selected. When a filter for
a resolution of 1/2 is selected, the differences between the
current picture and the pixels at a 1/2 pixel position interpolated
using the 8-tap Wiener filter are compared with the differences
between the current picture and the pixels at a 1/2 pixel position
interpolated using the 8-tap Kalman filter, and the filter having
the minimum difference is selected as the optimum filter. The
difference can be calculated by SAD (Sum of Absolute Differences)
or SSD (Sum of Squared Differences). When a
filter having a resolution of 1/4 is selected, the pixel
interpolated using the optimum filter at a resolution of 1/2 may be
used or may not be used. When a filter having a resolution of 1/4
is selected, if the pixel interpolated using the optimum filter at
a resolution of 1/2 is not used, the resolution of 1/2 for making
the resolution of 1/4 may be interpolated using another filter. In
this event, the resolution of 1/2 for making the resolution of 1/4 can
be equally appointed between the reference picture interpolator
6710 and a reference picture interpolator 7110 of the video
decoding apparatus, which will be described later. Further, when
the filter having the resolution of 1/4 is selected, the difference
from the current picture is calculated for each filter and a
filter, which has the minimum difference, is selected as the
optimum filter in the same way of the resolution of 1/2. When the
filter of a resolution of 1/8 is selected, it can be selected using
pixels interpolated using the optimum filter at the resolutions of
1/2 and 1/4 or resolutions of 1/2 and 1/4 are made and used for the
resolution of 1/8. When the resolutions of 1/2 and 1/4 are made for
the resolution of 1/8, they can be equally appointed between the
reference picture interpolator 6710 and the reference picture
interpolator 7110 of the video decoding apparatus, which will be
described later.
[0327] In this event, if the optimum resolution is the resolution
of 1/4, the reference picture interpolator 6710 encodes optimum
filter information (filter flag or filter coefficient) for each
resolution in the unit of predetermined areas and can encode the
resolution identification flag into 1/4 in the unit of motion
vectors.
[0328] If the optimum resolution is the resolution of 1/4 and
optimum filter information and the motion vector resolutions are
encoded for each resolution in the unit of motion vectors, the
resolution identification flag is encoded into 1/4 in the unit of
motion vectors and the reference picture interpolator 6710 can
encode optimum filter information (filter flag or filter
coefficient) of the resolution of 1/4 in the unit of motion
vectors. In this event, the decoder decodes the resolution
identification flag in the unit of motion vectors and decodes
optimum filter information (filter flag or filter coefficient) of
the corresponding resolution to generate a reconstructed video by
using the filter information.
[0329] Alternatively, when optimum filter information is encoded
for each resolution in the unit of pictures or slices and the
motion vector resolution identification flag is encoded in the unit of motion
vectors, the reference picture interpolator 6710 encodes optimum
filter information (filter flag or filter coefficient) for each
resolution and the motion vector resolution identification flag is
encoded in the unit of motion vectors. In this event, the decoder
decodes optimum filter information for each resolution in the unit
of pictures or slices and decodes the motion vector resolution
identification flag in the unit of motion vectors to generate a
reconstructed video by using filter information of the
corresponding motion vector resolution.
[0330] Meanwhile, when the reference picture interpolator 6710
receives the resolution identification flag from the resolution
determiner 3230, the reference picture interpolator 6710 can
perform a filtering by using a single filter for each resolution
and different filters may be used depending on the resolutions of
motion vectors.
[0331] When the reference picture is interpolated using a 1/4 pixel
position, a 1/8 pixel position, or a pixel position smaller than
the 1/4 or 1/8 pixel position, the reference picture is
interpolated using only the 1/1 pixel position without using the
pixel position of the previous resolution and different filters may
be used depending on the resolutions of motion vectors. For
example, when the resolution appointment flags indicate (1/2, 1/4,
1/8, 1/16), the interpolation may be differently performed for each
resolution by using the table in FIG. 64. In this event, the
differential motion vector is encoded based on a resolution of 1/1,
and the resolution identification flag is encoded into 1/8 when the
motion vector is at a 1/8 pixel position.
[0332] Alternatively, the resolutions up to the resolution of 1/2
are interpolated by using the 8-tap Wiener filter. Then, the
resolutions smaller than the resolution of 1/2 may be interpolated
based on the resolutions of 1/1 and 1/2, with the only difference
in that different filters are used as shown in the table of FIG.
65. In this event, the differential motion vector is encoded based
on the resolution of 1/2 and the resolution identification flag is
encoded into 1/4 when the motion vector is at a 1/4 pixel
position.
[0333] Meanwhile, when the reference picture interpolator 6710
receives the resolution identification flag from the resolution
determiner 3230, a filter is not applied in the unit of pictures or
slices and optimum filters may be selected for each resolution in
the unit of predetermined areas in the interpolation of the
reference picture.
[0334] FIG. 68 illustrates yet another table indicating types of a
filter tap depending on the resolution of a motion vector.
[0335] The predetermined area may be 16.times.16, 32.times.32,
64.times.64, 128.times.128 or the unit of the motion vector. In
this case, an optimum filter for each resolution is encoded in the
unit of predetermined areas or the reference picture interpolator
6710 and a reference picture interpolator 7110 of the video
decoding apparatus, which will be described later, can equally
appoint a filter for each resolution in advance. For example, when
resolutions up to a resolution of 1/4 are used and an optimum filter
is determined for each resolution in the unit of motion vectors, a
motion vector is found by a 1/1 pixel position and surroundings of
the 1/1 pixel position may use different filters according to the
resolutions. In this event, a motion vector is first found by the
1/1 pixel position and an optimum position may be found by
interpolating resolutions of 1/2 and 1/4 by using the table shown
in FIG. 68. In this event, the differential motion vector is
encoded based on a resolution of 1/1 and the resolution
identification flag is encoded into 1/4 when the motion vector is
at a 1/4 pixel position.
[0336] When the reference picture interpolator 6710 receives the
resolution identification flag from the resolution determiner 3230,
the reference picture interpolator 6710 uses different filters
through the motion vector resolutions and the optimum filter for
each resolution is selected from a plurality of filters. In this
event, when the reference picture is interpolated using a 1/4 pixel
position, a 1/8 pixel position, or a pixel position smaller than
the 1/4 or 1/8 pixel position, the reference picture is
interpolated using only the 1/1 pixel position without using the
pixel position of the previous resolution and different filters may
be used depending on the resolutions of motion vectors. For
example, when the resolution appointment flags indicate (1/2, 1/4,
and 1/8), the interpolation may be differently performed by
selecting an optimum filter for each resolution by using the table
of FIG. 63.
[0337] The optimum filter for each resolution is selected with
reference to the table of FIG. 63, and the filter whose result is
most similar to the current picture is selected. When a filter for
a resolution of 1/2 is selected, the differences between the
current picture and the pixels at a 1/2 pixel position interpolated
using the 8-tap Wiener filter are compared with the differences
between the current picture and the pixels at a 1/2 pixel position
interpolated using the 8-tap Kalman filter, and the filter having
the minimum difference is selected as the optimum filter. The
difference can be calculated by SAD (Sum of Absolute Differences)
or SSD (Sum of Squared Differences). When a
filter of a resolution of 1/4 is selected, the pixel interpolated
using the optimum filter at a resolution of 1/2 may be used or may
not be used. When a filter having a resolution of 1/4 is selected,
if the pixel interpolated using the optimum filter at a resolution
of 1/2 is not used, the resolution of 1/2 for making the resolution
of 1/4 may be interpolated using another filter. In this event, the
resolution of 1/2 for making the resolution of 1/4 can be equally
appointed between the reference picture interpolator 6710 and a
reference picture interpolator 7110 of the video decoding
apparatus, which will be described later. Further, when the filter
having the resolution of 1/4 is selected, the difference from the
current picture is calculated for each filter and a filter, which
has the minimum difference, is selected as the optimum filter in
the same way of the resolution of 1/2. When the filter of a
resolution of 1/8 is selected, it can be selected using pixels
interpolated using the optimum filter at the resolutions of 1/2 and
1/4 or resolutions of 1/2 and 1/4 are made and used for the
resolution of 1/8. When the resolutions of 1/2 and 1/4 are made for
the resolution of 1/8, they can be equally appointed between the
reference picture interpolator 6710 and the reference picture
interpolator 7110 of the video decoding apparatus, which will be
described later.
[0338] FIG. 69 shows an example of resolution identification flags
according to the resolution.
[0339] When the optimum resolution is a resolution of 1/4 and the
previously encoded resolution is a resolution of 1/8, the reference
picture interpolator 6710 can encode optimum filter information
(filter flag or filter coefficient) for each resolution in the unit
of predetermined areas, encode the resolution conversion flag into
1 in the unit of motion vectors, and encode the resolution
identification flag into 1/4 by using the table in FIG. 69.
[0340] Meanwhile, when the reference picture interpolator 6710
receives the resolution conversion flag from the resolution
conversion flag generator 3260, the reference picture interpolator
6710 can perform a filtering by using a single filter for each
resolution and different filters may be used depending on the
resolutions of motion vectors.
[0341] In this event, when the reference picture is interpolated
using a 1/4 pixel position, a 1/8 pixel position, or a pixel
position smaller than the 1/4 or 1/8 pixel position, the reference
picture is interpolated using only the 1/1 pixel position without
using the pixel position of the previous resolution and different
filters may be used depending on the resolutions of motion vectors.
For example, when the resolution appointment flags indicate (1/2,
1/4, 1/8, 1/16), the interpolation may be differently performed by
selecting an optimum filter for each resolution by using the table
of FIG. 64.
[0342] The reference picture interpolator 6710 encodes the
differential motion vector based on a resolution of 1/1 and can
encode the resolution conversion flag into 0 when the previous
resolution is a resolution of 1/8 and the currently optimum
resolution is the resolution of 1/8.
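The two signalling cases described above (a changed resolution and an unchanged resolution) can be sketched as follows. The function name and the dictionary layout are hypothetical, and the actual code words of the tables in FIGS. 69 and 70 are not modeled; the sketch only shows which syntax elements would be produced, under the assumption that the identification flag is sent only when the resolution changes.

```python
from fractions import Fraction


def signal_resolution(previous, current):
    """Return the syntax elements signalling the current MV resolution.

    Assumption: the resolution identification flag accompanies the
    conversion flag only when the resolution differs from the previous one.
    """
    if current == previous:
        return {"conversion_flag": 0}
    return {"conversion_flag": 1, "identification": current}


# Previous 1/8, optimum 1/4: conversion flag 1 plus identification of 1/4.
changed = signal_resolution(Fraction(1, 8), Fraction(1, 4))
# Previous 1/8, optimum 1/8: conversion flag 0 only.
unchanged = signal_resolution(Fraction(1, 8), Fraction(1, 8))
```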
[0343] FIG. 70 shows another example of resolution identification
flags according to the resolution.
[0344] The resolutions up to the resolution of 1/2 are interpolated
by using the 8-tap Wiener filter. Then, the resolutions smaller
than the resolution of 1/2 may be interpolated based on the
resolutions of 1/1 and 1/2, with the only difference in that
different filters are used as shown in the table of FIG. 65. The
reference picture interpolator 6710 encodes the differential motion
vector based on the resolution of 1/2, and can encode the
resolution conversion flag into 1 and the resolution identification
flag into 1/4 using the table in FIG. 70 when the previous
resolution is a resolution of 1/8 and the current resolution is a
resolution of 1/4.
[0345] Meanwhile, when the reference picture interpolator 6710
receives the resolution conversion flag from the resolution
conversion flag generator 3260, a filter is not applied in the unit
of pictures or slices and an optimum filter may be selected for
each resolution in the unit of predetermined areas in the
interpolation of the reference picture.
[0346] The predetermined area may be 16.times.16, 32.times.32,
64.times.64, 128.times.128 or the unit of the motion vector. In
this case, an optimum filter for each resolution is encoded in the
unit of predetermined areas or the reference picture interpolator
6710 and a reference picture interpolator 7110 of the video
decoding apparatus, which will be described later, can equally
appoint a filter for each resolution in advance. For example, when
the resolution appointment flags indicate (1/2, 1/4) and an optimum
filter is determined for each resolution in the unit of motion
vectors, the motion vector is found by a 1/1 pixel position and
different filters may be used in surroundings of the 1/1 pixel
position according to the resolutions. In this case, the motion
vector is first found by the 1/1 pixel position and resolutions 1/2
and 1/4 are interpolated using the table in FIG. 68, and thus the
optimum position may be found.
[0347] At this time, the reference picture interpolator 6710
encodes the differential motion vector based on a resolution of 1/1
and the resolution conversion flag may be encoded into 0 when
the previous resolution is a resolution of 1/4 and the currently
optimum resolution is the resolution of 1/4.
[0348] Meanwhile, when the resolution appointment flag received
from the resolution appointment flag encoder 3220 by the
reference picture interpolator 6710 is a flag indicating the single
resolution, different filters are used for each resolution and the
reference picture interpolator 6710 and the reference picture
interpolator 7110 of the video decoding apparatus, which will be
described later, can equally appoint the filter. For example, when
the resolution appointment flag designates the resolution of 1/4 as
the single resolution and uses an optimum filter from a plurality
of filters for each resolution, the reference picture interpolator
6710 selects an optimum filter for each resolution with reference
to the difference from the current picture in the interpolation of
the reference picture. The optimum filter may be applied in the
unit of predetermined areas such as a slice, a picture, and a
video. In this case, optimum filter information (filter flag or
filter coefficient) for each resolution may be encoded in the unit
of predetermined areas.
[0349] FIG. 57 is a flowchart of a reference picture interpolating
method for a video encoding according to an aspect of the present
disclosure.
[0350] According to the reference picture interpolating method in
accordance with the aspect of the present invention, the reference
picture interpolator 5210 selects a first filter for interpolating
a sub-pixel by using an integer pixel of the reference picture in
step S5710, interpolates the reference picture by using the first
filter in step S5720, selects a second filter for interpolating a
sub-pixel of the target precision by using the integer pixel and
interpolated sub-pixel in step S5730, and interpolates the
interpolated reference picture by using the second filter in step
S5740.
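The order of steps S5710 to S5740 can be sketched as follows. This is a toy under stated assumptions: the filters are plain callables on sample lists, the same current-picture samples serve as the comparison target in both stages, SAD is used as the difference measure, and the names pick_filter and two_stage are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pick_filter(filters, source, current):
    """Apply every candidate to source and keep the one closest to current."""
    outputs = {name: apply(source) for name, apply in filters.items()}
    best = min(outputs, key=lambda name: sad(outputs[name], current))
    return best, outputs[best]


def two_stage(reference, current, first_filters, second_filters):
    """Select and apply a first filter, then a second filter on its output."""
    # S5710 (select first filter) and S5720 (interpolate with it).
    first, stage1 = pick_filter(first_filters, reference, current)
    # S5730 (select second filter) and S5740 (interpolate again).
    second, stage2 = pick_filter(second_filters, stage1, current)
    return first, second, stage2


# Toy candidate filters standing in for real interpolation filters.
first_filters = {"double": lambda s: [2 * v for v in s],
                 "same": lambda s: list(s)}
second_filters = {"plus1": lambda s: [v + 1 for v in s],
                  "minus1": lambda s: [v - 1 for v in s]}
result = two_stage([1, 2, 3], [3, 5, 7], first_filters, second_filters)
```

The names of the two selected filters correspond to the information on the first filter and the second filter that would be encoded into the bitstream.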
[0351] In step S5710, when the reference picture interpolator 5210
interpolates the sub-pixel by using the integer pixel of the
reference picture, the reference picture interpolator 5210 can
select one filter, which has the minimum difference between the
interpolated reference picture and the current picture among a
plurality of filters having a fixed filter coefficient, as the
first filter.
[0352] In step S5710, when the reference picture interpolator 5210
interpolates the sub-pixel by using the integer pixel of the
reference picture, the reference picture interpolator 5210 can
calculate a filter coefficient, which has the minimum difference
between the interpolated reference picture and the current picture,
as a first filter coefficient.
[0353] In step S5730, when the reference picture interpolator 5210
interpolates the sub-pixel of the target precision by using the
interpolated sub-pixel and the integer pixel of the reference
picture, the reference picture interpolator 5210 can select one
filter, which has the minimum difference between the
re-interpolated reference picture and the current picture among a
plurality of filters having a fixed filter coefficient, as the
second filter.
[0354] In step S5730, when the reference picture interpolator 5210
interpolates the sub-pixel of the target precision by using the
interpolated sub-pixel and the integer pixel of the reference
picture, the reference picture interpolator 5210 can calculate a
filter coefficient, which has the minimum difference between the
re-interpolated reference picture and the current picture, as a
second filter coefficient.
[0355] Further, the reference picture interpolator 5210 can encode
information on the first filter and information on the second
filter. The encoded information on the first filter and the second
filter is included in a bitstream.
[0356] FIG. 58 is a flowchart of a video encoding method according
to the third aspect of the present disclosure.
[0357] According to the video encoding method in accordance with
the third aspect of the present disclosure, the video encoding
apparatus 5200 interpolates the reference picture to have the
target precision through a stage-by-stage filtering of the
reference picture by using a plurality of filters in step S5810 and
performs an inter prediction encoding by using the interpolated
reference picture to have the target precision in step S5820.
[0358] In step S5810, the video encoding apparatus 5200 can obtain
an interpolated reference picture having the target precision
through an iterative process of interpolating the reference picture
by filtering the reference picture by using one filter among a
plurality of filters and interpolating the interpolated reference
picture by filtering the interpolated reference picture by using
another filter. That is, in order to interpolate the reference
picture to have the target precision, the video encoding apparatus
5200 selects a filter or a filter coefficient, which has the
minimum difference between the current picture and the interpolated
reference picture, and interpolates the reference picture by using
the selected filter or filter coefficient, which corresponds to a
reference picture interpolation performed in a first stage. The
video encoding apparatus 5200 selects a filter or a filter
coefficient, which has the minimum difference between the current
picture and the re-interpolated reference picture, and
re-interpolates the interpolated reference picture by using the
selected filter or filter coefficient, which corresponds to a
reference picture interpolation performed in a second stage. The
reference picture is then interpolated in a third stage and a
fourth stage, so that the reference picture having the target
precision may be generated. At this time, the filter selected in
each stage may be one of a plurality of filters having a fixed
filter coefficient or an optimum filter coefficient of a determined
filter may be calculated.
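The stage-by-stage filtering described above can be illustrated with a minimal one-dimensional sketch. It is not the implementation of the disclosure: the candidate filter set FILTERS, the helper names, the border clamping, and the use of the sum of absolute differences as the "difference" measure are all assumptions made for illustration. Each stage doubles the precision, and the filter with the smallest difference from the current signal is chosen in each stage.

```python
# Hypothetical fixed-coefficient candidates (assumed for this sketch only).
FILTERS = {
    "bilinear": [0.5, 0.5],
    "4tap": [-0.125, 0.625, 0.625, -0.125],
}

def interpolate_2x(samples, taps):
    """Insert one filtered sample between each pair of neighbouring samples."""
    n = len(samples)
    half = len(taps) // 2
    out = []
    for i in range(n - 1):
        out.append(samples[i])
        acc = 0.0
        for k, t in enumerate(taps):
            idx = min(max(i - half + 1 + k, 0), n - 1)  # clamp at the borders
            acc += t * samples[idx]
        out.append(acc)
    out.append(samples[-1])
    return out

def sad(a, b):
    """Sum of absolute differences, used here as the difference measure."""
    return sum(abs(x - y) for x, y in zip(a, b))

def select_filter(samples, target, candidates):
    """Pick the candidate whose interpolation is closest to the target."""
    return min(candidates,
               key=lambda name: sad(interpolate_2x(samples, candidates[name]), target))

def multistage_interpolate(samples, targets, candidates):
    """Run one interpolation stage per entry in targets, choosing a filter per stage."""
    chosen = []
    for target in targets:
        name = select_filter(samples, target, candidates)
        samples = interpolate_2x(samples, candidates[name])
        chosen.append(name)
    return samples, chosen

reference = [0, 4, 8]
current_half = [0, 2, 4, 6, 8]                    # current signal on the 1/2 grid
current_quarter = [0, 1, 2, 3, 4, 5, 6, 7, 8]     # current signal on the 1/4 grid
result, chosen = multistage_interpolate(reference, [current_half, current_quarter], FILTERS)
```

The list of chosen filter names corresponds to the per-stage filter information that would be encoded; calculating an optimum coefficient instead of selecting from a fixed set is not shown.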
[0359] FIG. 71 illustrates a video encoding method according to the
fourth aspect of the present disclosure.
[0360] As shown in FIG. 71, the video encoding method according to
the fourth aspect of the present disclosure includes step S7102 of
generating a resolution appointment flag, step S7104 of determining
a resolution, step S7106 of interpolating a reference picture, step
S7108 of performing an inter prediction encoding, step S7110 of
encoding a differential vector, step S7112 of encoding a
resolution, and step S7114 of generating a resolution conversion
flag.
[0361] Here, step S7102 of generating the resolution appointment
flag corresponds to the operation of the resolution appointment
flag generator 3220 of the video encoding apparatus 6700 according
to the fourth aspect of the present disclosure, step S7104 of
determining the resolution corresponds to the operation of the
resolution determiner of the video encoding apparatus 6700
according to the fourth aspect of the present disclosure, step
S7106 of interpolating the reference picture corresponds to the
operation of the reference picture interpolator 6710, step S7108 of
performing the inter prediction encoding corresponds to the
operation of the inter prediction encoder 3210 of the video
encoding apparatus 6700 according to the fourth aspect of the
present disclosure, step S7110 of encoding the differential vector
corresponds to the operation of the differential vector encoder
3250 of the video encoding apparatus 6700 according to the fourth
aspect of the present disclosure, step S7112 of encoding the
resolution corresponds to the operation of the resolution encoder
3240 of the video encoding apparatus 6700 according to the fourth
aspect of the present disclosure, and step S7114 of generating the
resolution conversion flag corresponds to the operation of the
resolution conversion flag generator 3260 of the video encoding
apparatus 6700 according to the fourth aspect of the present
disclosure, so detailed descriptions are omitted.
[0362] Further, the steps described above may include a step or
steps, which can be omitted, depending on the existence or absence
of each element of the video encoding apparatus 6700, from the
video encoding method according to the fourth aspect of the present
disclosure.
[0363] FIG. 18 is a block diagram illustrating a video decoding
apparatus using an adaptive motion vector according to the first
aspect of the present disclosure.
[0364] The video decoding apparatus 1800 using an adaptive motion
vector according to the first aspect of the present disclosure
includes a resolution change flag extractor 1810, a resolution
decoder 1820, a differential vector decoder 1830, and an inter
prediction decoder 1840.
[0365] The resolution change flag extractor 1810 extracts a
resolution change flag from a bitstream. That is, the resolution
change flag extractor 1810 extracts a resolution change flag, which
indicates whether the motion vector resolution is fixed or changes
according to each area, from a header of a bitstream. When the
resolution change flag indicates that the motion vector resolution
is fixed, the resolution change flag extractor 1810 extracts an
encoded motion vector resolution from the bitstream and then
decodes the extracted motion vector resolution, so as to make the
inter prediction decoder 1840 perform an inter prediction decoding
of all lower areas defined in the header with the reconstructed
fixed motion vector resolution or a preset motion vector resolution
and make the differential vector decoder 1830 reconstruct a motion
vector of each area with the fixed motion vector. When the
resolution change flag indicates that the motion vector resolution
changes according to each area or motion vector, the resolution
change flag extractor 1810 causes the resolution decoder 1820 to
reconstruct a motion vector resolution of each lower area or motion
vector defined in the header, causes the inter prediction decoder
1840 to perform an inter prediction decoding of each lower area or
motion vector defined in the header with the reconstructed motion
vector resolution, and causes the differential vector decoder 1830
to reconstruct a motion vector of each area with the reconstructed
motion vector.
[0366] Further, when the size of a predicted motion vector or
differential motion vector of a motion vector according to a motion
vector resolution determined for each area or motion vector is
larger than a threshold, the resolution change flag extractor 1810
may determine a predetermined value as the motion vector resolution
of each area or motion vector. For example, when the size of a
differential motion vector or the size of a predicted motion vector
of an area or a motion vector is larger than a threshold, the
resolution change flag extractor 1810 may determine a predetermined
value as a motion vector resolution of the area or the motion
vector without decoding the motion vector resolution of the area.
Further, when the size of a motion vector of a surrounding area of
an area or a motion vector, or the size of a motion vector of the
area itself, is larger than a threshold, the resolution change flag
extractor 1810 may determine a predetermined value as a motion
vector resolution of the area without decoding the motion vector
resolution of the area. In this event, the motion vector
resolution of the area or motion vector can be changed to a
predetermined resolution even without a flag. The threshold may be
a pre-appointed value or a certain input value, or may be
calculated from a motion vector of a surrounding block.
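The threshold rule above can be sketched as follows. This is an illustrative reading only: the function names are hypothetical, and measuring the "size" of a vector as its largest absolute component is an assumption, since the text does not define the measure.

```python
def mv_size(mv):
    """Size of a vector, assumed here to be its largest absolute component."""
    return max(abs(mv[0]), abs(mv[1]))

def resolution_for_area(pred_mv, diff_mv, threshold, preset, decode_from_bitstream):
    """Return the preset resolution when a vector exceeds the threshold.

    decode_from_bitstream is a hypothetical callable that reads and decodes
    the resolution; it is only invoked when no vector exceeds the threshold.
    """
    if mv_size(pred_mv) > threshold or mv_size(diff_mv) > threshold:
        return preset
    return decode_from_bitstream()
```

Because the decoder applies the same test as the encoder, no flag needs to be transmitted for areas that exceed the threshold.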
[0367] The resolution decoder 1820 extracts an encoded resolution
identification flag from a bitstream according to a resolution
change flag extracted by the resolution change flag extractor 1810
and decodes the extracted resolution identification flag, so as to
reconstruct the motion vector resolution of each area. Meanwhile,
the decoding of a motion vector resolution by the resolution
decoder 1820, described simply as such for convenience in the
following discussion, may actually include a decoding of one or
both of a motion vector resolution and a differential motion
vector resolution. Therefore, the
resolution indicated by the resolution identification flag may be
either a resolution of a motion vector or a resolution of a
differential motion vector, or may indicate both a resolution of a
motion vector and a resolution of a differential motion vector.
[0368] To this end, the resolution change flag extractor 1810 may
reconstruct a motion vector resolution of each area or motion
vector by decoding a resolution identification flag hierarchically
encoded in a Quadtree structure by grouping areas having the same
motion vector resolution together.
[0369] Referring to FIGS. 10 to 12, the resolution decoder 1820
reconstructs the motion vector resolutions by decoding the
resolution identification flags with a Quadtree structure as shown
in FIG. 10 according to the areas as shown in FIG. 12. For example,
in the case of decoding the resolution identification flag
generated through the encoding as shown in FIG. 11, the first bit
has a value of "1", which implies a division into sub layers, and
the second bit has a value of "0", which implies that the first
node of level 1 has not been divided into sub layers. Therefore, by
decoding the next bits, a motion vector resolution of 1/2 is
reconstructed. In the same manner as described above, the
resolution identification flags for level 1 and level 2 are decoded
in the same manner as, but in a reverse order to, the encoding
method as described above with reference to FIGS. 10 and 11, so as
to reconstruct the resolution identification flags of the
corresponding areas or motion vectors. Further, since an identifier
indicating the size of an area indicated by the lowest node and
the maximum number of layers included in a header defines that the
maximum number of layers in level 3 should be 3, the resolution
decoder 1820 determines that there are no more layers lower than
level 3, and then reconstructs only the motion vector resolution of
each area. To this end, the resolution decoder 1820 decodes an
identifier, which indicates the size of the area indicated by the
lowest node of Quadtree layers and the maximum number of Quadtree
layers and is included in a header of a bitstream.
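A minimal sketch of the Quadtree decoding described above is given below. The split bit semantics ("1" for division into sub layers, "0" for no division) and the rule that no split bit is read at the maximum layer follow the text; the two-bit resolution codes in RES_CODES and the function name are hypothetical and do not reproduce FIG. 11.

```python
# Hypothetical fixed-length resolution codes (not taken from FIG. 11).
RES_CODES = {"00": "1/2", "01": "1/4", "10": "1/8"}

def decode_quadtree(bits, max_depth):
    """Decode a string of '0'/'1' characters into a nested list of resolutions.

    A node is either a resolution string (leaf) or a list of four child nodes.
    Returns the tree and the number of bits consumed.
    """
    pos = 0

    def read(n):
        nonlocal pos
        chunk = bits[pos:pos + n]
        pos += n
        return chunk

    def node(depth):
        # A split bit is only present above the maximum layer.
        if depth < max_depth and read(1) == "1":
            return [node(depth + 1) for _ in range(4)]
        return RES_CODES[read(2)]

    tree = node(0)
    return tree, pos
```

For example, with a maximum of two layers, the string "1" + "1" + "00000101" + "010" + "010" + "010" decodes to one divided node holding four leaves followed by three undivided nodes.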
[0370] Although the above description discusses only two examples
including an example in which a node is divided into lower layers
(i.e. four areas) and another example in which a node is not
divided into lower layers, there may be various cases as shown in
FIG. 20, including a case in which a node is not divided into lower
layers and cases in which a node is divided into lower layers in
various ways, for example, a node may be divided into two
transversely lengthy areas, two longitudinally lengthy areas, or
four areas.
[0371] Further, the resolution decoder 1820 may reconstruct the
motion vector resolution of each area or motion vector by decoding
the resolution identification flag encoded using a predicted motion
vector resolution predicted by motion vector resolutions of
surrounding areas of the area or motion vector. For example, when
the resolution identification flag extracted for each area or
motion vector from a bitstream indicates that its resolution is
identical to a motion vector resolution predicted using motion
vector resolutions of surrounding areas (e.g. when the bit value of
the resolution identification flag is "1"), the resolution decoder
1820 may reconstruct the motion vector resolution predicted using
motion vector resolutions of surrounding areas without reading the
next resolution identification flag from the bitstream. In
contrast, when the resolution identification flag indicates that
its resolution is not identical to the motion vector resolution
predicted using motion vector resolutions of surrounding areas
(e.g. when the bit value of the resolution identification flag is
"0"), the resolution decoder 1820 may reconstruct the motion vector
resolution by reading the next resolution identification flag from
the bitstream and decoding the next resolution identification
flag.
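The flag handling in this paragraph can be sketched as follows. The bit meanings ("1" for identical to the predicted resolution, "0" otherwise) follow the example values in the text; the explicit two-bit code table EXPLICIT and the helper names are assumptions, and how the predicted resolution is derived from the surrounding areas is left to the caller.

```python
# Hypothetical code table for the explicitly signalled resolution.
EXPLICIT = {"00": "1/2", "01": "1/4", "10": "1/8"}

def read_explicit(bits, pos):
    """Read a hypothetical two-bit resolution code starting at pos."""
    return EXPLICIT[bits[pos:pos + 2]], pos + 2

def decode_resolution(bits, pos, predicted):
    """Return (resolution, next position) for one area.

    A leading '1' means the resolution equals the predicted one, so no
    further bits are read; a leading '0' means the next flag is read.
    """
    flag = bits[pos]
    pos += 1
    if flag == "1":
        return predicted, pos
    return read_explicit(bits, pos)
```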
[0372] In addition, the resolution decoder 1820 may reconstruct the
motion vector resolution of each area or motion vector by decoding
the resolution identification flag of the motion vector resolution
having an encoded run and length. For example, the resolution
decoder 1820 may reconstruct the run and length of the motion
vector resolution by decoding the encoded resolution
identification flag of the differential motion vector resolutions
and/or motion vector resolutions of a part of multiple areas, or
may reconstruct the motion vector resolutions of the areas as shown
in FIG. 12 by using the reconstructed run and length of the motion
vector resolution.
[0373] Moreover, the resolution decoder 1820 may reconstruct the
motion vector resolution of each area or motion vector by decoding
the resolution identification flag hierarchically encoded using a
tag tree. Referring to FIGS. 13 and 14 as an example, since one can
see from the bits of the first area shown in FIG. 14, which are
"0111", that the bits corresponding to level 0 are "01" and it is
assumed that the number of the motion vector resolution of a higher
level of level 0 is "0", the resolution decoder 1820 may
reconstruct the motion vector resolution of 1/2, which has a
resolution number difference value of 1 from a higher level.
Further, since the next bit is "1", which has a resolution number
difference value of "0" from a higher layer, 1/2 is reconstructed
as the motion vector resolution in level 1 also. Also, since each
of the following bits is also "1", 1/2 is reconstructed as the
motion vector resolution in level 2 and level 3 also, respectively.
Since the bits of the second area in FIG. 14 are "01" and the
motion vector resolutions of level 0, level 1, and level 2 in the
first area have been already decoded, a decoding of the motion
vector resolution of only level 3 is required. In level 3, since
the resolution number difference value from a higher layer is "1",
it is possible to reconstruct a motion vector resolution of 1/4. In
the same manner, motion vector resolutions of the other areas can
be reconstructed.
[0374] Further, the resolution decoder 1820 may change and decode
the number of bits allocated to the resolution identification flag
according to the occurrence frequency of the motion vector
resolution determined for each motion vector or area. For example,
the resolution decoder 1820 may calculate the occurrence frequency
of the reconstructed motion vector resolution up to the just
previous area, provide numbers to motion vector resolutions
according to the calculated occurrence frequency, and allocate bit
numbers according to the provided numbers, so as to decode the
motion vector resolutions.
[0375] The area group may be a Quadtree, a Quadtree bundle, a tag
tree, a tag tree bundle, a macroblock, a macroblock bundle, or
an area with a predetermined size. For example, when the area group
is appointed as including two macroblocks, it is possible to update
the occurrence frequency of the motion vector resolution for every
two macroblocks and allocate a bit number of the motion vector
resolution to the updated frequency, for the decoding. Otherwise,
when the area group is appointed as including four Quadtrees, it is
possible to update the occurrence frequency of the motion vector
resolution for every four Quadtrees and allocate a bit number of
the motion vector resolution to the updated frequency, for the
decoding.
[0376] Further, the resolution decoder 1820 may use different
methods for decoding a resolution identification flag according to
the distribution of the motion vector resolutions of surrounding
areas of each area with respect to the motion vector resolution
determined according to each area or motion vector. That is, the
smallest bit number is allocated to a resolution having the highest
probability that the resolution may be the resolution of a
corresponding area according to the distribution of the motion
vector resolutions of surrounding areas or area groups. For
example, if a left side area of the corresponding area has a motion
vector resolution of 1/2 and an upper side area of the area has a
motion vector resolution of 1/2, it is most probable that the area
may have a motion vector resolution of 1/2, and the smallest bit
number is thus allocated to the motion vector resolution of 1/2,
which is then decoded. As another example, if a left side area of
the corresponding area has a motion vector resolution of 1/4, a
left upper side area of the area has a motion vector resolution of
1/2, an upper side area of the area has a motion vector resolution
of 1/2, and a right upper side area of the area has a motion vector
resolution of 1/2, the bit numbers are allocated to the motion
vector resolutions in a sequence causing the smaller bit number to
be allocated to a motion vector resolution having the higher
probability, for example, in a sequence of 1/2, 1/4, 1/8, . . . ,
and the motion vector resolutions are then decoded.
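The ordering in the second example can be reproduced with a small sketch. Counting how often each resolution occurs among the surrounding areas and breaking ties with a default order are assumptions made for illustration, as are the names DEFAULT_ORDER and rank_resolutions; the text only requires that a more probable resolution receives a smaller bit number.

```python
from collections import Counter

# Assumed default order used to break ties.
DEFAULT_ORDER = ["1/2", "1/4", "1/8"]

def rank_resolutions(neighbours):
    """Order resolutions from most to least frequent among the neighbours."""
    counts = Counter(neighbours)
    return sorted(DEFAULT_ORDER,
                  key=lambda r: (-counts[r], DEFAULT_ORDER.index(r)))

def bit_numbers(neighbours):
    """Map each resolution to its rank; rank 0 gets the smallest bit number."""
    return {r: rank for rank, r in enumerate(rank_resolutions(neighbours))}

# Left, upper-left, upper, and upper-right areas of the second example.
order = rank_resolutions(["1/4", "1/2", "1/2", "1/2"])
```

The decoder derives the same ordering from already reconstructed surrounding areas, so the mapping itself does not have to be transmitted.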
[0377] Further, in performing the entropy decoding by an arithmetic
decoding, the resolution decoder 1820 uses different methods of
generating a bit string of a resolution identification flag
according to the distribution of the motion vector resolutions
of the surrounding areas of each area for the motion vector
resolution determined according to each motion vector or area and
applies different context models according to the distribution of
the motion vector resolutions of the surrounding areas and the
probabilities of the motion vector resolution having occurred up to
the present, for the arithmetic decoding and probability update.
Further, in the arithmetic decoding and probability update, the
resolution decoder 1820 may use different context models according
to the positions of bits. For example, based on an assumption that
an entropy decoding is performed using only three motion vector
resolutions including 1/2, 1/4, and 1/8 by the CABAC, if a left
side area of a pertinent area has a motion vector resolution of 1/2
and an upper side area of the area has a motion vector resolution
of 1/2, the shortest bit string ("0" in FIG. 21) is allocated to
the motion vector resolution of 1/2 and the other bit strings are
allocated to the other motion vector resolutions, i.e. 1/4 and 1/8,
in a sequence causing the smaller bit number to be allocated to a
motion vector resolution having the higher probability.
[0378] In this event, if the motion vector resolution of 1/8 has
the higher occurrence probability up to the present than that of
the motion vector resolution of 1/4, the bit string of "00" is
allocated to the motion vector resolution of 1/8 and the bit string
of "01" is allocated to the motion vector resolution of 1/4.
Further, in decoding the first bit string, four different context
models may be used, which include: a first context model in which
the resolution of the left side area is equal to the resolution of
the upper side area, which is equal to the resolution of the
highest probability up to the present; a second context model in
which the resolution of the left side area is equal to the
resolution of the upper side area, which is different from the
resolution of the highest probability up to the present; a third
context model in which the resolutions of the left side area and
the upper side area are different from each other and at least one
of the resolutions of the left side area and the upper side area is
equal to the resolution of the highest probability up to the
present; and a fourth context model in which the resolutions of the
left side area and the upper side area are different from each
other and neither of them is equal to the resolution of the highest
probability up to the present. In decoding the second bit string,
two different context models may be used, which include: a first
context model in which the resolutions of the left side area and
the upper side area are different from each other and at least one
of the resolutions of the left side area and the upper side area is
equal to the resolution of the highest probability up to the
present; and a second context model in which the resolutions of the
left side area and the upper side area are different from each
other and neither of them is equal to the resolution of the highest
probability up to the present.
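The choice among the four context models for the first bit string can be written as a small selection function. The indices 0 to 3 and the function name are hypothetical labels for the first to fourth context models; the conditions follow the text.

```python
def first_bin_context(left, upper, most_probable):
    """Return a context index (0..3) for decoding the first bit string.

    left and upper are the resolutions of the left and upper side areas;
    most_probable is the resolution of the highest probability so far.
    """
    if left == upper:
        # First model: both equal the most probable resolution.
        # Second model: both are equal but differ from it.
        return 0 if left == most_probable else 1
    # Third model: they differ and at least one equals the most probable.
    # Fourth model: they differ and neither equals it.
    return 2 if most_probable in (left, upper) else 3
```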
[0379] As another example, based on an assumption that an entropy
decoding is performed using only three motion vector resolutions
including 1/2, 1/4, and 1/8 by the CABAC and the motion vector
resolution of the highest probability up to the present is 1/4,
"1", which is the shortest bit string, is allocated to the motion
vector resolution of 1/4, and "00" and "01" are then allocated to
the other motion vector resolutions of 1/2 and 1/8, respectively.
[0380] Further, in decoding the first bit string, three different
context models may be used, which include: a first context model in
which each of the resolutions of the left side area and the upper
side area of a corresponding area is equal to the resolution of the
highest probability up to the present; a second context model in
which only one of the resolutions of the left side area and the
upper side area of a corresponding area is equal to the resolution
of the highest probability up to the present; and a third context
model in which neither of the resolutions of the left side area and
the upper side area of a corresponding area is equal to the
resolution of the highest probability up to the present. In
decoding the second bit string, six different context models may be
used, which include: a first context model in which each of the
resolution of the left side area and the resolution of the upper
side area of a corresponding area corresponds to a motion vector
resolution of 1/8; a second context model in which each of the
resolutions of the left side area and the upper side area of a
corresponding area corresponds to a motion vector resolution of
1/2; a third context model in which each of the resolutions of the
left side area and the upper side area of a corresponding area
corresponds to a motion vector resolution of 1/4; a fourth context
model in which one of the resolutions of the left side area and the
upper side area of a corresponding area corresponds to a motion
vector resolution of 1/8 and the other resolution corresponds to a
motion vector resolution of 1/4; a fifth context model in which one
of the resolutions of the left side area and the upper side area of
a corresponding area corresponds to a motion vector resolution of
1/2 and the other resolution corresponds to a motion vector
resolution of 1/4; and a sixth context model in which one of the
resolutions of the left side area and the upper side area of a
corresponding area corresponds to a motion vector resolution of 1/8
and the other resolution corresponds to a motion vector resolution
of 1/2. The resolution of the highest probability up to the present
may be a probability of a resolution encoded up to the previous
area, a probability of a certain area, or a predetermined fixed
resolution.
[0381] Further, when the resolution identification flag decoded for
each area or motion vector is a flag indicating the capability of
estimation, the resolution decoder 1820 may estimate a motion
vector resolution according to a pre-promised estimation scheme, so
as to reconstruct the estimated motion vector resolution as a
motion vector resolution of the area or motion vector. In contrast,
when the resolution identification flag decoded for each area or
motion vector is a flag indicating the incapability of estimation,
the resolution decoder 1820 may reconstruct the motion vector
resolution indicated by the decoded resolution identification flag
as the motion vector resolution of the area.
[0382] For example, when the resolution identification flag decoded
for each area or motion vector indicates the capability of
estimation, the resolution decoder 1820 predicts a predicted motion
vector by changing each decoded motion vector resolution in a
method equal or similar to the method of the video encoding
apparatus 900, and reconstructs a motion vector by using the
predicted motion vector and a differential motion vector
reconstructed by the differential vector decoder 1830. First, based
on an assumption that a motion vector resolution of a predetermined
area corresponds to a 1/4 pixel unit, when the predicted motion
vector is (3, 14), the differential motion vector is (2, 3) and the
reconstructed motion vector of the predetermined area is thus (5,
17). Based on an assumption that a motion vector resolution of a
predetermined area corresponds to a 1/2 pixel unit, when the
predicted motion vector is (2, 7), the reconstructed
differential motion vector is (2, 3) and the reconstructed motion
vector of the predetermined area is thus (4, 10). A resolution
having the least distortion between surrounding pixels of a
pertinent area and surrounding pixels of an area motion-compensated
using a reconstructed motion vector of each resolution in a
reference picture is an optimum motion vector resolution.
Therefore, when surrounding pixels of an area motion-compensated in
the unit of 1/2 pixels have the least distortion, the motion vector
resolution of 1/2 is the optimum motion vector resolution.
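The arithmetic in this example can be checked with a short sketch. The motion vector values are those given in the text; the distortion values in placeholder_costs are invented placeholders used only to exercise the selection and do not come from the disclosure. In practice the distortion would be measured between the surrounding pixels of the area and those of the motion-compensated area, which is represented here by a hypothetical callable.

```python
def reconstruct_mv(pred, diff):
    """Add the differential motion vector to the predicted motion vector."""
    return (pred[0] + diff[0], pred[1] + diff[1])

def estimate_resolution(candidates, distortion):
    """Pick the candidate resolution with the least distortion.

    candidates maps a resolution to (predicted mv, differential mv);
    distortion is a hypothetical callable taking (resolution, mv).
    """
    best = None
    for resolution, (pred, diff) in candidates.items():
        mv = reconstruct_mv(pred, diff)
        cost = distortion(resolution, mv)
        if best is None or cost < best[0]:
            best = (cost, resolution, mv)
    return best[1], best[2]

candidates = {
    "1/4": ((3, 14), (2, 3)),   # reconstructs to (5, 17)
    "1/2": ((2, 7), (2, 3)),    # reconstructs to (4, 10)
}
placeholder_costs = {"1/4": 120, "1/2": 95}   # invented values, illustration only
resolution, mv = estimate_resolution(
    candidates, lambda res, vec: placeholder_costs[res])
```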
[0383] Further, when the resolution identification flag decoded for
each area or motion vector indicates the capability of estimation,
the resolution decoder 1820 may reconstruct the motion vector
resolution of the pertinent area or motion vector by additionally
decoding the motion vector resolution in the resolution
identification flag.
[0384] Further, the resolution decoder 1820 can reconstruct the
motion vector resolution of each area or motion vector only when
each component of the differential motion vector is not "0". That
is, when a component of a differential motion vector of a
particular area is "0", the resolution decoder 1820 may decode a
predicted motion vector into a motion vector without reconstructing
the motion vector resolution of the particular area.
[0385] The differential vector decoder 1830 extracts an encoded
differential motion vector from a bitstream and decodes the
extracted differential motion vector. Specifically, the
differential vector decoder 1830 reconstructs the differential
motion vector of each area or motion vector by performing the
decoding according to the motion vector resolution of each
reconstructed area or motion vector. Additionally, the inter
prediction decoder 1840 may predict a predicted motion vector of
each area and reconstruct a motion vector of each area by using the
predicted motion vector and the reconstructed differential motion
vector.
[0386] To this end, the differential vector decoder 1830 may use
UVLC in decoding the differential motion vector. In this event, the
differential vector decoder 1830 may use the K-th order Exp-Golomb
code in the decoding and may change the degree of order (K) of
the Exp-Golomb code according to the motion vector resolution
determined for each reconstructed area. Further, the differential
vector decoder 1830 may decode the differential vector by using the
CABAC. In this event, the differential vector decoder 1830 may use
the Concatenated Truncated Unary/K-th Order Exp-Golomb Code in the
decoding and may change the degree of order (K) and the maximum
value (T) of the Concatenated Truncated Unary/K-th Order Exp-Golomb
Code according to the motion vector resolution determined for each
reconstructed area or motion vector. In addition, when the
differential vector decoder 1830 decodes the differential vector by
using the CABAC, the differential vector decoder 1830 may
differently calculate the accumulation probability according to the
motion vector resolution determined for each reconstructed area or
motion vector.
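As an illustration of changing the degree of order (K) with the motion vector resolution, the sketch below decodes an unsigned K-th order Exp-Golomb codeword in its common zero-prefix form. The mapping ORDER_BY_RESOLUTION from resolution to K is purely hypothetical, since the text does not state which order is used for which resolution, and the mapping of decoded values to signed vector components is not shown.

```python
# Hypothetical mapping: a finer resolution tends to give larger values.
ORDER_BY_RESOLUTION = {"1/2": 0, "1/4": 1, "1/8": 2}

def decode_exp_golomb(bits, k, pos=0):
    """Decode one unsigned k-th order Exp-Golomb codeword.

    bits is a string of '0'/'1' characters. Returns (value, next position).
    """
    zeros = 0
    while bits[pos] == "0":       # count the zero prefix
        zeros += 1
        pos += 1
    width = zeros + k + 1         # info bits, including the leading '1'
    value = int(bits[pos:pos + width], 2) - (1 << k)
    return value, pos + width

def decode_component(bits, resolution, pos=0):
    """Decode one codeword with the order chosen from the resolution."""
    return decode_exp_golomb(bits, ORDER_BY_RESOLUTION[resolution], pos)
```

With K = 0 the codeword "010" decodes to 1, whereas with K = 1 the codeword "0100" decodes to 2; a larger K shortens the codewords of large values at the cost of longer codewords for small ones.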
[0387] Further, the differential vector decoder 1830 may predict a
predicted motion vector for each area or motion vector by using
motion vectors of surrounding areas of each area or motion vector.
In this event, when the motion vector resolution of each area is
not equal to the motion vector resolution of surrounding areas, the
differential vector decoder 1830 may convert the motion vector
resolution of the surrounding areas to the motion vector resolution
of said each area for the prediction. The predicted motion vector
can be obtained in the same manner by the video encoding apparatus
and the video decoding apparatus. Therefore, various aspects for
the motion vector resolution conversion and for obtaining a
predicted motion vector by a video encoding apparatus as described
above with reference to FIGS. 22 to 26 can also be applied to a
video decoding apparatus according to the following aspects of the
present disclosure.
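A minimal sketch of converting the motion vectors of surrounding areas to the resolution of the current area before prediction is given below. It does not reproduce the schemes of FIGS. 22 to 26: rounding half up after scaling and the component-wise median of three neighbours are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from fractions import Fraction

def convert_mv(mv, from_res, to_res):
    """Re-express a vector given in units of from_res pixels in units of to_res."""
    scale = Fraction(from_res) / Fraction(to_res)
    # Rounding half up is an assumed rule.
    return tuple(math.floor(c * scale + Fraction(1, 2)) for c in mv)

def predict_mv(neighbours, target_res):
    """Component-wise median of the neighbours after resolution conversion.

    neighbours is a list of (mv, resolution) pairs; the median predictor
    is an assumption of this sketch.
    """
    converted = [convert_mv(mv, res, target_res) for mv, res in neighbours]
    xs = sorted(c[0] for c in converted)
    ys = sorted(c[1] for c in converted)
    return xs[len(xs) // 2], ys[len(ys) // 2]

neighbours = [
    ((3, 14), Fraction(1, 4)),
    ((2, 7), Fraction(1, 2)),
    ((8, 28), Fraction(1, 8)),
]
predicted = predict_mv(neighbours, Fraction(1, 4))
```

Since the encoder and the decoder must obtain the same predicted motion vector, whichever conversion and rounding rule is used has to be identical on both sides.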
[0388] Further, when at least one area among the areas is a block
and the block mode of the block is a skip mode, the differential
vector decoder 1830 may convert motion vector resolutions of
surrounding areas of the area to the highest resolution among the
motion vector resolutions of the surrounding areas and then perform
the prediction.
[0389] Moreover, the resolution identification flag indicating the
motion vector resolution decoded by the resolution decoder 1820 may
indicate either both or each of the resolutions of an x
component and a y component of a motion vector. That is, when a
camera taking a video moves or when an object within a video moves,
the resolution decoder 1820 may perform the decoding with different
resolutions for the x component and the y component of a motion
vector for motion estimation. For example, the resolution decoder
1820 may perform the decoding with a resolution of 1/8 pixel unit
for the x component of a motion vector of a certain area while
performing the decoding with a resolution of 1/2 pixel unit for the
y component of the motion vector. Then, the inter prediction
decoder 1840 may perform an inter prediction decoding of a
pertinent area by performing a motion estimation and a motion
compensation of a motion vector of the pertinent area by using
different resolutions for the x component and the y component of
the motion vector.
[0390] The inter prediction decoder 1840 performs an inter
prediction decoding of each area by using a motion vector of each
area according to the motion vector resolution of each
reconstructed area or motion vector. The inter prediction decoder
1840 may be implemented by the video decoding apparatus 800
described above with reference to FIG. 8. When functions of the
resolution change flag extractor 1810, the resolution decoder 1820,
and the differential vector decoder 1830 in FIG. 18 overlap with
the function of the decoder 810 of the video decoding apparatus
800, the overlapping functions may be omitted in the decoder 810.
Further, when the operation of the resolution decoder 1820 overlaps
with the operation of the predictor 850, the overlapping operation
may be omitted in the predictor 850.
[0391] Also, the resolution change flag extractor 1810, the
resolution decoder 1820, and the differential vector decoder 1830
may be constructed either separately from the inter prediction
decoder 1840 as shown in FIG. 18 or integrally with the decoder 810
within the video decoding apparatus 1800.
[0392] However, although the above description with reference to
FIG. 8 discusses decoding of a video in the unit of blocks by the
video decoding apparatus 800, the inter prediction decoder 1840 may
divide the video into areas with various shapes or sizes, such as
blocks including macroblocks or subblocks, slices, or pictures, and
perform the decoding in the unit of areas each having a
predetermined size. Such a predetermined area may be not only a
macroblock having a size of 16.times.16 but also blocks with
various shapes or sizes, such as a block having a size of
64.times.64 and a block having a size of 32.times.16.
[0393] Further, although the video decoding apparatus 800 described
above with reference to FIG. 8 performs an inter prediction
decoding using motion vectors having the same motion vector
resolution for all blocks of a video, the inter prediction decoder
1840 may perform an inter prediction decoding using motion vectors
having motion vector resolutions differently determined according
to each area or motion vector. That is, in the inter prediction
decoding of an area, the inter prediction decoder 1840 first
enhances the resolution of an area by interpolating a reference
picture having been already encoded, decoded, and reconstructed
according to a motion vector resolution and/or a differential
motion vector resolution of each area or motion vector
reconstructed by the resolution decoder 1820, and then performs a
motion estimation by using a motion vector and/or a differential
motion vector according to the motion vector resolution and/or the
differential motion vector resolution of the pertinent area or
motion vector reconstructed by the differential vector decoder
1830. For the interpolation of the reference picture, it is
possible to use various interpolation filters, such as a Wiener
filter, a bilinear filter, and a Kalman filter and to apply
resolutions in the unit of various integer pixels or fraction
pixels, such as 2 pixel unit, 1 pixel unit, 2/1 pixel unit, 1/1
pixel unit, 1/2 pixel unit, 1/4 pixel unit, and 1/8 pixel unit.
Further, according to such various resolutions, it is possible to
use different filter coefficients or different numbers of filter
coefficients. For example, a Wiener filter may be used for the
interpolation when the resolution corresponds to the 1/2 pixel unit
and a Kalman filter may be used for the interpolation when the
resolution corresponds to the 1/4 pixel unit. Moreover, different
numbers of taps may be used for the interpolation of the respective
resolutions. For example, an 8-tap Wiener filter may be used for
the interpolation when the resolution corresponds to the 1/2 pixel
unit and a 6-tap Wiener filter may be used for the interpolation
when the resolution corresponds to the 1/4 pixel unit.
[0394] Further, the inter prediction decoder 1840 may decode the
filter coefficient of each motion vector resolution and then
interpolate a reference picture with an optimum filter
coefficient for each motion vector resolution. In this event, it is
possible to use various filters including a Wiener filter and a
Kalman filter and to employ various numbers of filter taps.
Further, it is possible to employ different numbers of filters or
different numbers of filter taps according to the resolutions of
the motion vectors. Moreover, the inter prediction decoder 1840 may
perform the inter prediction decoding by using reference pictures
interpolated using different filters according to the motion vector
resolution of each area or motion vector. For example, a filter
coefficient of a 6-tap Wiener filter may be decoded for the 1/2
resolution, a filter coefficient of an 8-tap Kalman filter may be
decoded for the 1/4 resolution, a filter coefficient of a linear
filter may be decoded for the 1/8 resolution, and the reference
picture for each resolution may be then interpolated and decoded.
In the decoding, the inter prediction decoder 1840 may use a
reference picture interpolated by a 6-tap Wiener filter when the
resolution of the current area or motion vector is a 1/2
resolution, and may use a reference picture interpolated by an 8-tap
Kalman filter when the resolution of the current area or motion
vector is a 1/4 resolution.
[0395] FIG. 48 is a block diagram illustrating a video decoding
apparatus using an adaptive motion vector according to the second
aspect of the present disclosure.
[0396] The video decoding apparatus 4800 using an adaptive motion
vector according to the second aspect of the present disclosure
includes a resolution appointment flag extractor 4810, a resolution
decoder 4820, a differential vector decoder 4830, an inter
prediction decoder 4840, and a resolution conversion flag extractor
4850. In this event, all of the resolution appointment flag
extractor 4810, the resolution decoder 4820, the differential
vector decoder 4830, and the resolution conversion flag extractor
4850 are not necessarily included in the video decoding apparatus
4800 and may be selectively included in the video decoding
apparatus 4800 according to the encoding scheme of a video encoding
apparatus for generating an encoded bitstream.
[0397] Further, the inter prediction decoder 4840 performs an inter
prediction decoding of each area by using a motion vector of
each area according to the motion vector resolution of each
reconstructed area or motion vector. The inter prediction decoder
4840 may be implemented by the video decoding apparatus 800
described above with reference to FIG. 8. When one or more
functions of the resolution appointment flag extractor 4810, the
resolution decoder 4820, the differential vector decoder 4830, and
the resolution conversion flag extractor 4850 in FIG. 48 overlap
with the function of the decoder 810 within the video decoding
apparatus 4800, the overlapping functions may be omitted in the
decoder 810. Further, when the operation of the resolution decoder
4820 overlaps with the operation of the predictor 850 within the
inter prediction decoder 4840, the overlapping operation may be
omitted in the predictor 850.
[0398] Also, the resolution appointment flag extractor 4810, the
resolution decoder 4820, the differential vector decoder 4830, and
the resolution conversion flag extractor 4850 may be constructed
either separately from the inter prediction decoder 4840 as shown
in FIG. 48 or integrally with the decoder 810 within the video
decoding apparatus 4800.
[0399] The resolution appointment flag extractor 4810 extracts a
resolution appointment flag from an input bitstream. The resolution
appointment flag indicates whether the motion vector resolution is
fixed to a single resolution or is selected from a resolution set
including multiple resolutions.
[0400] The resolution appointment flag extractor 4810 extracts a
resolution appointment flag from a bitstream. That is, the
resolution appointment flag extractor 4810 extracts a resolution
appointment flag, which indicates whether the motion vector
resolution is fixed to a predetermined value or corresponds to a
resolution set including different resolutions according to areas,
from a header of a bitstream. When the resolution appointment flag
indicates that the motion vector resolution and/or differential
motion vector resolution is fixed to a predetermined resolution,
the resolution appointment flag extractor 4810 transmits the fixed
resolution indicated by the resolution appointment flag to the
inter prediction decoder 4840 and the differential vector decoder
4830, and the differential vector decoder 4830 then decodes a
differential motion vector by using the received resolution and
then transmits the decoded differential motion vector to the inter
prediction decoder 4840. Then, the inter prediction decoder 4840
performs an inter prediction decoding by using the received
differential motion vector, the resolution received from the
resolution appointment flag extractor 4810, and the received
bitstream.
[0401] When the resolution appointment flag corresponds to a
predetermined resolution set, the resolution appointment flag extractor
4810 causes the resolution decoder 4820 to reconstruct a motion
vector resolution and/or differential motion vector resolution of
each lower area or motion vector defined in the header, causes the
inter prediction decoder 4840 to perform an inter prediction
decoding of each lower area or motion vector defined in the header
with the reconstructed motion vector resolution, and causes the
differential vector decoder 4830 to reconstruct a motion vector of
each area with the reconstructed motion vector resolution.
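The two cases signalled by the resolution appointment flag can be sketched as follows. This is an illustration under stated assumptions only: the tuple layout of `appointment`, the name `resolutions_for_areas`, and the use of list indexes as per-area resolution identification flags are hypothetical and are not the bitstream syntax of the present disclosure.

```python
from fractions import Fraction

def resolutions_for_areas(appointment, identification_flags):
    """Return one motion vector resolution per area.

    appointment: ("fixed", resolution) or ("set", [resolution, ...]);
        this layout is hypothetical.
    identification_flags: one index per area, used only in the "set" case.
    """
    kind, value = appointment
    if kind == "fixed":
        # A single resolution applies to every area; no per-area flag is read.
        return [value] * len(identification_flags)
    if kind == "set":
        # Each area selects a member of the resolution set by its index.
        return [value[flag] for flag in identification_flags]
    raise ValueError("unknown appointment kind: " + str(kind))

half, quarter, eighth = Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)
fixed = resolutions_for_areas(("fixed", quarter), [0, 0, 0])
adaptive = resolutions_for_areas(("set", [half, quarter, eighth]), [2, 0, 1])
```

In the fixed case all three areas use 1/4, whereas in the set case the areas use 1/8, 1/2, and 1/4 in turn.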
[0402] Further, in the case of using multiple reference pictures,
an adaptability degree (i.e. resolution set) of the resolution may
be calculated for each reference picture based on a predetermined
criterion when the resolution appointment flag is not extracted
from a bitstream. For example, different adaptability degrees of
the resolution may be employed according to the distance between
the current picture and reference pictures. This configuration has
been already described above with reference to FIGS. 38 and 39, so
a detailed description thereof is omitted here.
[0403] Further, at the time of generating reference pictures, the
resolution set may be calculated using an error measurement means,
such as a Sum of Squared Difference (SSD) between resolutions. For
example, if usable resolutions are 1/1, 1/2, 1/4, and 1/8, in
interpolating a reference picture, it is possible to set the
resolution of 1/2 to be used only when an error value obtained
using an error measurement means, such as an SSD, for the
resolutions of 1/1 and 1/2 exceeds a predetermined threshold while
setting the resolution of 1/2 not to be used when the error value
does not exceed the predetermined threshold. Further, when it has
been set that the resolution of 1/2 should not be used, it is
determined whether an error value obtained using an error
measurement means, such as an SSD, for the resolutions of 1/1 and
1/4 exceeds a predetermined threshold. When the error value for the
resolutions of 1/1 and 1/4 does not exceed the predetermined
threshold, the resolution of 1/4 is set not to be used. In
contrast, when the error value for the resolutions of 1/1 and 1/4
exceeds the predetermined threshold, the resolutions of both 1/1
and 1/4 are set to be used. Also, when the resolution of 1/4 has
been set to be used, it is determined whether an error value
obtained using an error measurement means, such as an SSD, for the
resolutions of 1/4 and 1/8 exceeds a predetermined threshold. When
the error value for the resolutions of 1/4 and 1/8 does not exceed
the predetermined threshold, the resolution of 1/8 is set not to be
used. In contrast, when the error value for the resolutions of 1/4
and 1/8 exceeds the predetermined threshold, all the resolutions of
1/1, 1/4, and 1/8 are set to be used. The threshold may be
different according to the resolutions or quantization parameters, or
may be the same.
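The threshold test described above can be sketched as a loop that compares each finer candidate resolution with the most recently accepted one. The function `select_resolution_set`, the callable `error_between`, and the toy error values are assumptions for illustration; in particular, comparing a candidate against the last accepted resolution when 1/2 has been accepted is a generalization not spelled out in the text, and a single threshold is used although the threshold may differ per resolution.

```python
from fractions import Fraction

def select_resolution_set(candidates, error_between, threshold):
    """Pick the resolutions to use, coarsest candidate first.

    candidates: resolutions ordered from coarse to fine.
    error_between: hypothetical callable returning an error value
        (for example an SSD) between two resolutions.
    """
    selected = [candidates[0]]  # the coarsest resolution is always kept
    for cand in candidates[1:]:
        # Keep a finer resolution only if it differs enough from the
        # last resolution already accepted.
        if error_between(selected[-1], cand) > threshold:
            selected.append(cand)
    return selected

full, half, quarter, eighth = (Fraction(1, d) for d in (1, 2, 4, 8))
toy_errors = {(full, half): 5, (full, quarter): 40, (quarter, eighth): 50}
chosen = select_resolution_set(
    [full, half, quarter, eighth],
    lambda a, b: toy_errors[(a, b)],
    threshold=10,
)
```

With the toy error values, 1/2 is rejected and the resolutions 1/1, 1/4, and 1/8 are selected, which matches the outcome described in the example above.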
[0404] Further, in the case of encoding the employment of different
adaptability degrees of the resolution according to the reference
pictures, the resolution appointment flag extractor 4810 may
extract a resolution set by extracting a reference picture index
number instead of the resolution appointment flag from the
bitstream and then storing and referring to the reference picture
index number corresponding to each predetermined resolution set as
shown in FIG. 40.
[0405] Also, when the resolution appointment flag indicates a
resolution set, it is possible to set an actually usable resolution
set according to the use or non-use of a reference picture by
setting different resolution sets for a picture to be used as a
reference picture and a picture not to be used as a reference
picture, respectively. Therefore, the video decoding apparatus 4800
may also store a table as shown in FIG. 42 to be referred to by the
resolution decoder 4820 in decoding the resolution. This
configuration also has been already described above with reference
to FIGS. 41 and 42, so a detailed description thereof is omitted
here.
[0406] Further, when the size of a predicted motion vector or
differential motion vector of a motion vector according to a motion
vector resolution and/or differential motion vector resolution
determined for each area or motion vector is larger than a
threshold, the resolution appointment flag extractor 4810 may
determine a predetermined value as the motion vector resolution
and/or differential motion vector resolution determined for each
area or motion vector. This configuration also has been already
described above in the discussion relating to the resolution change
flag extractor 1810 of the video decoding apparatus 1800 according
to the first aspect, so a detailed description thereof is omitted
here.
[0407] The resolution decoder 4820 extracts an encoded resolution
identification flag from a bitstream according to a resolution
appointment flag extracted by the resolution appointment flag
extractor 4810 and decodes the extracted resolution identification
flag, so as to reconstruct the motion vector resolution of each
area.
[0408] To this end, the resolution appointment flag extractor 4810
may reconstruct a motion vector resolution of each area or motion
vector by decoding a resolution identification flag hierarchically
encoded in a Quadtree structure by grouping areas having the same
motion vector resolution together. This configuration also has been
already described above in the discussion relating to the
resolution change flag extractor 1810 of the video decoding
apparatus 1800 according to the first aspect, so a detailed
description thereof is omitted here.
[0409] Further, the resolution decoder 4820 may reconstruct the
motion vector resolution of each area or motion vector by decoding
the resolution identification flag encoded using a predicted motion
vector resolution predicted using motion vector resolutions of
surrounding areas of the area or motion vector. This configuration
also has been already described above in the discussion relating to
the resolution change flag extractor 1810 of the video decoding
apparatus 1800 according to the first aspect, so a detailed
description thereof is omitted here.
[0410] In addition, the resolution decoder 4820 may reconstruct the
motion vector resolution of each area or motion vector by decoding
the resolution identification flag of the motion vector resolution
having an encoded run and length for each area or motion vector.
This configuration also has been already described above in the
discussion relating to the resolution change flag extractor 1810 of
the video decoding apparatus 1800 according to the first aspect, so
a detailed description thereof is omitted here.
[0411] Moreover, the resolution decoder 4820 may reconstruct the
motion vector resolution of each area or motion vector by decoding
the resolution identification flag hierarchically encoded using a
tag tree. This configuration also has been already described above
in the discussion relating to the resolution change flag extractor
1810 of the video decoding apparatus 1800 according to the first
aspect, so a detailed description thereof is omitted here.
[0412] Further, the resolution decoder 4820 may change and decode
the number of bits allocated to the resolution identification flag
according to the occurrence frequency of the motion vector
resolution determined for each motion vector or area. For example,
the resolution decoder 4820 may calculate the occurrence frequency
of the reconstructed motion vector resolution up to the just
previous area, provide numbers to motion vector resolutions
according to the calculated occurrence frequency, and allocate bit
numbers according to the provided numbers, so as to decode the
motion vector resolutions. This configuration also has been already
described above in the discussion relating to the resolution change
flag extractor 1810 of the video decoding apparatus 1800 according
to the first aspect, so a detailed description thereof is omitted
here.
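A minimal sketch of the frequency-based bit allocation is given below. It ranks the resolutions by how often they occurred up to the previous area and gives shorter codewords to higher ranks. The truncated unary code and the names `rank_resolutions`, `truncated_unary`, and `code_table` are assumptions chosen for illustration; the present disclosure does not fix a particular codeword table here.

```python
from collections import Counter

def rank_resolutions(history, resolution_set):
    """Order resolutions from most to least frequent in the history."""
    counts = Counter(history)
    # sorted() is stable, so ties keep the order of resolution_set.
    return sorted(resolution_set, key=lambda r: -counts[r])

def truncated_unary(index, size):
    """Codeword for a rank: index ones, then a zero except for the last rank."""
    return "1" * index + ("0" if index < size - 1 else "")

def code_table(history, resolution_set):
    """Map each resolution to a codeword, shortest for the most frequent."""
    ranked = rank_resolutions(history, resolution_set)
    return {r: truncated_unary(i, len(ranked)) for i, r in enumerate(ranked)}

# Resolutions reconstructed up to the previous area.
history = ["1/4", "1/4", "1/8", "1/4", "1/2", "1/8"]
table = code_table(history, ["1/2", "1/4", "1/8"])
```

Because the decoder rebuilds the same table from already reconstructed resolutions, no table needs to be transmitted under this scheme.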
[0413] Further, the resolution decoder 4820 may use different
methods for decoding a resolution identification flag according to
the distribution of the motion vector resolutions of surrounding
areas of each area with respect to the motion vector resolution
determined according to each area or motion vector. That is, the
smallest bit number is allocated to a resolution having the highest
probability that the resolution may be the resolution of a
corresponding area according to the distribution of the motion
vector resolutions of surrounding areas or area groups. This
configuration also has been already described above in the
discussion relating to the resolution change flag extractor 1810 of
the video decoding apparatus 1800 according to the first aspect, so
a detailed description thereof is omitted here.
[0414] Further, in performing the entropy decoding by an arithmetic
decoding, the resolution decoder 4820 may use different methods of
generating a bit string of a resolution identification flag
according to the distribution of the motion vector resolutions of
the surrounding areas of each area for the motion vector resolution
determined according to each motion vector or area and may apply
different context models according to the distribution of the
motion vector resolutions of the surrounding areas and the
probabilities of the motion vector resolution having occurred up to
the present, for the arithmetic decoding and probability update.
Further, in the arithmetic decoding and probability update, the
resolution decoder 4820 may use different context models according
to the positions of bits. This configuration also has been already
described above in the discussion relating to the resolution change
flag extractor 1810 of the video decoding apparatus 1800 according
to the first aspect, so a detailed description thereof is omitted
here.
[0415] Further, when the resolution identification flag decoded for
each area or motion vector is a flag indicating the capability of
estimation, the resolution decoder 4820 may estimate a motion
vector resolution according to a pre-promised estimation scheme, so
as to reconstruct the estimated motion vector resolution as a
motion vector resolution of the area or motion vector. In contrast,
when the resolution identification flag decoded for each area or
motion vector is a flag indicating the incapability of estimation,
the resolution decoder 4820 may reconstruct the motion vector
resolution indicated by the decoded resolution identification flag
as the motion vector resolution of the area. This configuration also has been
already described above in the discussion relating to the
resolution change flag extractor 1810 of the video decoding
apparatus 1800 according to the first aspect, so a detailed
description thereof is omitted here.
[0416] Further, when the resolution identification flag decoded for
each area or motion vector indicates the capability of estimation,
the resolution decoder 4820 may reconstruct the motion vector
resolution of the pertinent area or motion vector by additionally
decoding the motion vector resolution in the resolution
identification flag. Further, the resolution decoder 4820 can
reconstruct the motion vector resolution of each area or motion
vector only when each component of the differential motion vector
is not "0". That is, when a component of a differential motion
vector of a particular area is "0", the resolution decoder 4820 may
decode a predicted motion vector into a motion vector without
reconstructing the motion vector resolution of the particular area.
This configuration also has been already described above in the
discussion relating to the resolution change flag extractor 1810 of
the video decoding apparatus 1800 according to the first aspect, so
a detailed description thereof is omitted here.
[0417] Further, the resolution decoder 4820 extracts a resolution
identification flag according to the kind of the resolution of the
resolution change flag decoded after being extracted from a header.
Further, by using the extracted resolution identification flag, the
differential vector decoder 4830 extracts a value of a differential
motion vector corresponding to a pertinent resolution by referring
to a code number extracted from a code number table of a
differential motion vector according to the motion vector
resolutions as shown in FIG. 25 stored by a video encoding
apparatus. When the decoded resolution identification flag is 1/4,
the motion vector may be decoded using a differential motion vector
extracted from a bitstream and a predicted motion vector obtained
through a conversion into the resolution (i.e. 1/4) of surrounding
motion vectors. The predicted motion vector may be obtained by
taking a median of surrounding motion vectors converted using
multiplication and division like the encoder, without limiting the
present disclosure to this construction.
[0418] FIG. 49 illustrates an example of surrounding motion vectors
of current block X, and FIG. 50 illustrates an example of converted
values of surrounding motion vectors according to the current
resolution.
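The conversion of surrounding motion vectors to the current resolution followed by a median can be sketched as follows. The input layout `((x_units, y_units), denominator)`, the function names, and the rounding rule (nearest, halves away from zero) are assumptions for illustration; the actual converted values are those shown in FIG. 50.

```python
def convert_units(units, src_den, dst_den):
    """Rescale integer MV units from 1/src_den pixel to 1/dst_den pixel."""
    num = units * dst_den
    # Round to nearest, halves away from zero (an assumed rounding rule).
    q, r = divmod(abs(num), src_den)
    if 2 * r >= src_den:
        q += 1
    return q if num >= 0 else -q

def median3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]

def predict_mv(neighbors, dst_den):
    """Component-wise median of neighbors converted to 1/dst_den units."""
    xs = [convert_units(x, den, dst_den) for (x, _), den in neighbors]
    ys = [convert_units(y, den, dst_den) for (_, y), den in neighbors]
    return (median3(*xs), median3(*ys))

# Three surrounding vectors at 1/4, 1/2, and 1/8 pixel resolution.
neighbors = [((3, -2), 4), ((1, 1), 2), ((5, -3), 8)]
pmv = predict_mv(neighbors, 4)  # predicted vector in 1/4 pixel units
```

Here the converted neighbors are (3, -2), (2, 2), and (3, -2) in 1/4 pixel units, so the predicted motion vector is (3, -2), that is, (3/4, -2/4) pixel.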
[0419] Further, the resolution decoder 4820 may calculate the
resolution by using reference picture indexes without extracting a
resolution identification flag. This configuration has been already
described above in the discussion relating to the video encoding
apparatus 3200 with reference to FIGS. 30 and 31, so a detailed
description thereof is omitted here.
[0420] The inter prediction decoder 4840 may obtain and decode a
motion vector by using a differential motion vector calculated by
the differential vector decoder 4830 and a predicted motion vector
obtained using a table as shown in FIG. 50.
[0421] When the differential motion vector resolution has been
encoded and then transmitted through a bitstream as described above
with reference to FIGS. 26 and 27, the resolution decoder 4820
extracts and decodes a resolution identification flag of a
differential motion vector from the bitstream, and decodes a sign
of a differential motion vector and a code number of the
differential motion vector. When a code number extracted from a
bitstream including a code number table of differential motion
vectors according to the differential motion vector resolutions as
shown in FIG. 27 is (1, 1), it is possible to obtain a differential
motion vector of (-1/8, -1/4) by referring to the code number
extracted from a bitstream including a code number table of
differential motion vectors according to the differential motion
vector resolutions of FIG. 27. In this event, the inter prediction
decoder 4840 may calculate a predicted motion vector in the same
way as that of the video encoding apparatus as described below.
PMVx = median(7/8, 1/8, 2/8) = 2/8
PMVy = median(-6/8, 1/8, -2/8) = -2/8
[0422] As a result, PMV = (2/8, -2/8) = (1/4, -1/4).
[0423] Therefore, MV(1/8, -4/8) = MVD(-1/8, -1/4) + PMV(1/4, -1/4),
so that (1/8, -4/8) is obtained as the decoded motion vector.
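The arithmetic of this example can be checked with exact fractions, taking the decoded motion vector as the sum of the differential motion vector and the predicted motion vector. The variable names are illustrative only.

```python
from fractions import Fraction as F
from statistics import median

# Components of the surrounding motion vectors given in the example.
xs = [F(7, 8), F(1, 8), F(2, 8)]
ys = [F(-6, 8), F(1, 8), F(-2, 8)]

# Predicted motion vector: component-wise median.
pmv = (median(xs), median(ys))

# Differential motion vector read from the code number table.
mvd = (F(-1, 8), F(-1, 4))

# Decoded motion vector: differential plus prediction.
mv = (mvd[0] + pmv[0], mvd[1] + pmv[1])
```

The result is PMV = (1/4, -1/4) and MV = (1/8, -1/2), and -1/2 is the same value as -4/8.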
[0424] Meanwhile, when the differential vector decoder 1830
receives a reference resolution flag, the differential vector
decoder 1830 reconstructs a differential motion vector and decodes
the reference resolution. In this event, the differential vector
decoder 1830 extracts a code number included in the reference
resolution flag, decodes the differential reference motion vector
by referring to the code number table according to the differential
reference motion vector resolution as shown in FIG. 29, and then
reconstructs the differential motion vector by using the
location information included in the reference resolution flag.
That is, in decoding the differential motion vector, the
differential vector decoder 1830 extracts a reference resolution
from a bitstream and calculates the differential reference motion
vector by referring to the code number table according to the
differential reference motion vector resolution as shown in FIG.
29. In this event, the differential vector decoder 1830 may extract
a reference resolution flag, which indicates a reference resolution
and location coordinates of a motion vector, from a bitstream and
then decode a differential motion vector by using the extracted
reference resolution flag.
[0425] If a motion vector has a resolution other than the reference
resolution, it is possible to employ a method of additionally
encoding a reference resolution flag. The reference resolution flag
may include data indicating whether the motion vector has the same
resolution as the reference resolution and data indicating a
location of an actual motion vector.
[0426] Meanwhile, the differential vector decoder 4830 may have
another function corresponding to the function of the differential
vector decoder 1830 of the video decoding apparatus 1800 according
to the first aspect as described above. This function has been
already described above in the discussion relating to the
differential vector decoder 1830 according to the first aspect, so
a detailed description thereof is omitted here.
[0427] According to the value (e.g. 1) of the resolution change
flag extracted from a bitstream, a resolution identification flag
is decoded after being extracted from the bitstream by using the
same resolution identification flag table as used in the encoder
except for the resolution having the highest frequency among the
surrounding resolutions.
[0428] Meanwhile, the resolution conversion flag extractor 4850
extracts a resolution conversion flag from a bitstream. Also,
according to the value (e.g. 1) of the resolution change flag
extracted from a bitstream, the resolution conversion flag
extractor 4850 may decode a resolution identification flag after
extracting the resolution identification flag from the bitstream by
a resolution identification flag table (see the table shown in FIG.
45) except for the resolution having the highest frequency among
the surrounding resolutions. Further, in the case of using the
resolutions of surrounding blocks, the resolution conversion flag
extractor 4850 may decode the resolution conversion flag into 0
when the resolution of the current block is equal to the lowest
resolution among the resolutions of A and B, and may decode the
resolution conversion flag into 1 when the resolution of the
current block is equal to the highest resolution. When the
resolution conversion flag is encoded into 1, the encoded flag may
be excluded from the resolution identification flag.
[0429] Meanwhile, the resolution conversion flag extractor 4850 may
obtain the resolution of the current block by extracting the value
(i.e. 1) of the resolution conversion flag from a bitstream so as
to enable the difference between the resolution of the current
block and the resolution of previous block A to be understood. For
example, when the resolution set includes 1/2 and 1/4 and the
resolution of the previous block is 1/2, it is possible to
understand that the converted resolution is 1/4.
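For a resolution set with two members, the behavior described above can be sketched as a toggle. The function name `apply_conversion_flag` and the reading of a flag value of 0 as "resolution unchanged" are assumptions for illustration.

```python
def apply_conversion_flag(previous, flag, resolution_set):
    """Return the current resolution from the previous one and the flag.

    resolution_set: exactly two resolutions, e.g. ("1/2", "1/4").
    flag: 1 means the resolution differs from the previous block;
        0 is assumed to mean it is unchanged.
    """
    if len(resolution_set) != 2:
        raise ValueError("this sketch handles a two-member set only")
    if previous not in resolution_set:
        raise ValueError("previous resolution is not in the set")
    if flag == 0:
        return previous
    first, second = resolution_set
    # With two members, "different from previous" identifies the other one.
    return second if previous == first else first

current = apply_conversion_flag("1/2", 1, ("1/2", "1/4"))
```

With a previous resolution of 1/2 and a flag value of 1, the converted resolution is 1/4, as in the example above. A set with more than two members would need further information to identify the new resolution.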
[0430] FIG. 59 is a schematic block diagram of a video decoding
apparatus according to the third aspect of the present
disclosure.
[0431] The video decoding apparatus 5900 according to the third
aspect of the present disclosure may include the reference picture
interpolator 5910 and the inter prediction decoder 5920.
[0432] The reference picture interpolator 5910 interpolates the
reference picture to have the target precision through a
multi-stage filtering of the reference picture by using multiple
filters identified by information on the multiple filters
reconstructed by a bitstream decoding. That is, the reference
picture interpolator 5910 interpolates the reference picture to
have the target precision by reconstructing the information on the
plurality of filters through the bitstream decoding and filtering
the reference picture stage-by-stage by using multiple filters or
filter coefficients of multiple filters identified by information
on the reconstructed multiple filters.
[0433] For example, when the information on the multiple filters
reconstructed by the bitstream decoding indicates types of two
filters, the reference picture interpolator 5910 interpolates the
reference picture by using one of the two filters and
re-interpolates the interpolated reference picture by using the
other of the two filters. As a result, the interpolated
reference picture having the target precision is obtained.
Further, when the information on the multiple filters reconstructed
by the bitstream decoding indicates two filter coefficients for one
filter, the reference picture interpolator 5910 interpolates the
reference picture by using the filter having one of the two filter
coefficients and re-interpolates the interpolated reference picture
by using the filter having the other of the two filter
coefficients.
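The multi-stage filtering can be sketched in one dimension as repeated upsampling by two, with one filter per stage. The function names and the integer rounding are assumptions for illustration. The 6-tap kernel (1, -5, 20, 20, -5, 1) is the fixed half-sample luma filter of H.264/AVC and the 2-tap kernel (1, 1) is a bilinear average; both stand in for the filters that would be identified by the decoded filter information, and clipping to the sample range is omitted.

```python
def upsample2(samples, taps):
    """Insert one interpolated sample between each pair of neighbors."""
    n, norm = len(taps), sum(taps)
    left = n // 2 - 1          # taps lying to the left of position i
    last = len(samples) - 1
    out = []
    for i in range(last):
        # Weighted sum around the gap between i and i + 1; indexes are
        # clamped at the borders.
        acc = sum(c * samples[min(max(i - left + k, 0), last)]
                  for k, c in enumerate(taps))
        out += [samples[i], (acc + norm // 2) // norm]
    out.append(samples[last])
    return out

def interpolate_multistage(samples, stages):
    """Apply one upsampling filter per stage, doubling precision each time."""
    for taps in stages:
        samples = upsample2(samples, taps)
    return samples

SIX_TAP = (1, -5, 20, 20, -5, 1)
BILINEAR = (1, 1)

# Stage 1 reaches 1/2 precision, stage 2 reaches 1/4 precision.
quarter_bilinear = interpolate_multistage([0, 8], [BILINEAR, BILINEAR])
quarter_mixed = interpolate_multistage([50, 50, 50, 50], [SIX_TAP, BILINEAR])
```

Two bilinear stages turn the samples 0 and 8 into 0, 2, 4, 6, 8, and a flat row of four samples stays flat across its thirteen quarter-precision positions.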
[0434] The reference picture interpolator 5910 will be discussed in
the following description with reference to FIG. 60 in detail.
However, the information on the multiple filters reconstructed
through the bitstream decoding does not necessarily have to be
reconstructed by the reference picture interpolator 5910 and the
reconstructed information on the multiple filters may be
transmitted from the inter prediction decoder 5920 when the inter
prediction decoder 5920 reconstructs the information on the
multiple filters through the bitstream decoding.
[0435] The inter prediction decoder 5920 reconstructs a video
through the inter prediction decoding of the bitstream using the
interpolated reference picture having the target precision. The
inter prediction decoder 5920 may be implemented as the video
decoding apparatus 800 described with reference to FIG. 8. However,
although it has been described that the video decoding apparatus
800 described with reference to FIG. 8 decodes the video in the
unit of blocks, the inter prediction decoder 5920 can decode the
video in the unit of areas having a predetermined size by dividing
the video into areas having various types and sizes such as a macro
block, a block, a sub-block, a slice, and a picture. The
predetermined area may be a macro block of a 16.times.16 size, but
the present disclosure is not limited thereto and may be blocks
having various types and sizes such as a block of a 64.times.64
size or a block of a 32.times.16 size.
[0436] Further, the video decoding apparatus 800 described with
reference to FIG. 8 performs the inter prediction decoding by the
motion vectors having the same motion vector precision for all
blocks of the video, but the inter prediction decoder 5920 can
perform the inter prediction decoding of each area by the motion
vectors having motion vector precisions differently determined for
each motion vector or each area.
[0437] FIG. 72 is a schematic block diagram of a video decoding
apparatus according to the fourth aspect of the present
disclosure.
[0438] The video decoding apparatus 7200 according to the fourth
aspect of the present disclosure may include the resolution
appointment flag extractor 4810, the resolution decoder 4820, the
differential vector decoder 4830, the inter prediction decoder
4840, and the resolution conversion flag extractor 4850. In this
event, not all of the resolution appointment flag extractor 4810,
the resolution decoder 4820, the differential vector decoder 4830,
and the resolution conversion flag extractor 4850 are necessarily
included in the video decoding apparatus 7200; they may be
selectively included in the video decoding apparatus 7200 according
to the encoding scheme of a video encoding apparatus for generating
an encoded bitstream. Here, the resolution appointment flag
extractor 4810, the resolution decoder 4820, and the resolution
conversion flag extractor 4850 are the same as or similar to the
resolution appointment flag extractor 4810, the resolution decoder
4820, and the resolution conversion flag extractor 4850 shown in
FIG. 48, respectively, except that the resolution appointment flag
extractor 4810, the resolution decoder 4820, and the resolution
conversion flag extractor 4850 included in the video decoding
apparatus 7200 extract the resolution appointment flag, the
resolution identification flag, and the resolution conversion flag,
respectively, and transmit them to the reference picture
interpolator 7210, so detailed descriptions are omitted.
[0439] Further, the inter prediction decoder 4840 in FIG. 72
performs an inter prediction decoding of each area by using a
motion vector of each area according to the motion vector
resolution of each reconstructed area or motion vector. The inter
prediction decoder 4840 may be implemented by the video decoding
apparatus 800 described above with reference to FIG. 8. When one or
more functions of the resolution appointment flag extractor 4810,
the resolution decoder 4820, the differential vector decoder 4830,
and the resolution conversion flag extractor 4850 in FIG. 72
overlap with the function of the decoder 810 within the video
decoding apparatus 800, the overlapping functions may be omitted in
the decoder 810. Also, the resolution appointment flag extractor
4810, the resolution decoder 4820, the differential vector decoder
4830, and the resolution conversion flag extractor 4850 may be
constructed either separately from the inter prediction decoder
4840 as shown in FIG. 72 or integrally with the decoder 810 within
the video decoding apparatus 800.
[0440] The reference picture interpolator 7210 in FIG. 72 may
include the function of the reference picture interpolator 5910 in
FIG. 59, include a function of the reference picture interpolator
7210, which will be discussed in the following description, or
include both functions.
[0441] When the resolution decoder 4820 extracts the resolution
identification flag from the bitstream, the reference picture
interpolator 7210 determines a filter tap by using the motion
vector resolution for the extracted area or motion vector and
interpolates the reference picture.
[0442] When the resolution conversion flag extractor 4850 extracts
the resolution conversion flag, which indicates a change from the
resolution of a previous block or resolutions of neighboring areas
to encode the current resolution, the reference picture
interpolator 7210 determines a filter tap by using the motion
vector resolution determined using the extracted resolution
conversion flag and interpolates the reference picture.
[0443] When the resolution appointment flag extractor 4810 extracts
resolution appointment flags, which appoint different resolution
sets for each motion vector or area of the video, the reference
picture interpolator 7210 determines a filter tap according to a
single resolution and interpolates the reference picture when the
extracted resolution appointment flag indicates the single
resolution.
[0444] Further, the reference picture interpolator 7210 can
interpolate the reference picture by setting types of the filter
taps for each resolution of the picture and selecting a filter,
which has the minimum difference from the current picture, as an
optimum filter from the types of the filter taps.
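As a rough illustration of the selection described in paragraph
[0444], the following Python sketch picks, from a set of candidate
filters, the one whose interpolated samples differ least from a set
of target samples. The sum of absolute differences (SAD) as the
difference measure, the two candidate filters, the replication of
edge pixels, and all names used here are assumptions made for
illustration; the disclosure does not fix any of them.

```python
# Hypothetical candidate set: name -> (coefficients, right shift).
# Both filters are normalized so that the coefficients sum to 2**shift.
CANDIDATES = {
    "bilinear": ((1, 1), 1),
    "six_tap": ((1, -5, 20, 20, -5, 1), 5),
}


def apply_filter(row, coeffs, shift):
    """Return the half-pel samples lying between neighbouring pixels of row."""
    n = len(row)
    left = len(coeffs) // 2 - 1  # number of taps to the left of pixel i
    out = []
    for i in range(n - 1):
        acc = 0
        for k, c in enumerate(coeffs):
            idx = min(max(i - left + k, 0), n - 1)  # replicate edge pixels
            acc += row[idx] * c
        rounded = (acc + (1 << (shift - 1))) >> shift
        out.append(max(0, min(255, rounded)))  # clip to the 8-bit range
    return out


def sad(a, b):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_filter(reference_row, target_samples, candidates):
    """Return the name of the candidate filter with the smallest SAD."""
    return min(
        candidates,
        key=lambda name: sad(
            apply_filter(reference_row, *candidates[name]), target_samples
        ),
    )
```

In a real codec the comparison would be made on motion-compensated
blocks of the current picture rather than on a single row; the
one-dimensional form is used here only to keep the sketch short.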
[0445] Further, the reference picture interpolator 7210 can
interpolate the reference picture by selecting filter taps
according to the motion vector resolutions.
[0446] Furthermore, the reference picture interpolator 7210 can
interpolate the reference picture by selecting an optimum filter
tap for each resolution in the unit of predetermined areas within a
picture or a slice.
[0447] An operation of the reference picture interpolator 7210 in
the video decoding apparatus 7200 according to the fourth aspect of
the present disclosure may be the same as or similar to the
operation of the reference picture interpolator 6710 in the video
decoding apparatus 6700 according to the fourth aspect of the
present disclosure, and thus a more detailed description will be
omitted.
[0448] FIG. 60 is a schematic block diagram of a reference picture
interpolating apparatus for a video decoding according to an aspect
of the present disclosure.
[0449] The reference picture interpolating apparatus for the video
decoding according to the aspect of the present disclosure may be
implemented as the reference picture interpolator 5910 in the video
decoding apparatus 5900 according to the third aspect of the
present disclosure described with reference to FIG. 59.
Hereinafter, for convenience of description, the reference picture
interpolating apparatus for the video decoding according to the
aspect of the present disclosure is referred to as the reference
picture interpolator 5910.
[0450] The reference picture interpolator 5910 may include a filter
information decoder 6010 and a filter 6020.
[0451] The filter information decoder 6010 reconstructs information
on a plurality of filters by decoding a bitstream. That is, the
filter information decoder 6010 extracts, from the bitstream, the
data in which the information on the plurality of filters was
encoded and decodes the extracted data. The decoded information on
the plurality of filters may be information on types of filters. In
this case, the information may identify filters selected from a
filter set having a fixed filter coefficient and, in a case where a
multi-stage filtering is used, can represent a plurality of filters
selected from the filter set. Further, the reconstructed
information on the plurality of filters may be information on
filter coefficients for determined filters and may be information
on a plurality of filter coefficients in a case where a multi-stage
filtering is used.
[0452] Filters used in the interpolation may include various
filters such as a Wiener filter, a Bilinear filter, and a Kalman
filter.
[0453] The filter 6020 interpolates the reference picture by using
the information on the filter reconstructed by the filter
information decoder 6010. When the filter 6020 interpolates the
reference picture through the filtering, the filter 6020 can
interpolate the reference picture through a multi-stage filtering.
In this case, the filter 6020 can interpolate the reference picture
to have the target precision through the multi-stage filtering of
the reference picture by using the information on the plurality of
filters reconstructed by the filter information decoder 6010.
[0454] Hereinafter, a process in which the reference picture
interpolator 5910 interpolates the reference picture will be
described with reference to FIGS. 54 to 56.
[0455] When information on the filter reconstructed by the filter
information decoder 6010 is a filter coefficient of a 6.times.6 tap
filter, the filter 6020 interpolates a sub-pixel by using the
6.times.6 tap filter having a reconstructed filter coefficient
based on an already reconstructed integer pixel. Further, when the
information on the filter reconstructed by the filter information
decoder 6010 is a filter coefficient of a 6-tap filter, the filter
6020 interpolates sub-pixels S.sub.11, S.sub.22, S.sub.33,
S.sub.44, S.sub.55, and S.sub.66 by using the 6-tap filter having a
reconstructed filter coefficient based on integer pixels A1, B2,
C3, D4, E5, and F6 or interpolates sub-pixels S.sub.01, S.sub.02,
S.sub.03, S.sub.04, S.sub.05, S.sub.06, and S.sub.07 by using
integer pixels C1, C2, C3, C4, C5, and C6.
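The one-dimensional case of the preceding paragraph, in which one
sub-pixel is computed from six integer pixels such as C1 to C6, can
be sketched in Python as follows. In the disclosure the filter
coefficients are reconstructed from the bitstream; the coefficients
(1, -5, 20, 20, -5, 1) with a normalization by 32, which are the
half-pel coefficients of H.264, are used here purely as an assumed
example, and the function name and the 8-bit clipping are likewise
assumptions.

```python
# Assumed example coefficients (H.264 half-pel filter); they sum to 32.
EXAMPLE_6TAP = (1, -5, 20, 20, -5, 1)


def interpolate_6tap(pixels, coeffs, shift=5):
    """Filter six integer pixels into one sub-pixel value.

    pixels -- the six integer pixels surrounding the sub-pixel position
    coeffs -- six integer filter coefficients summing to 2**shift
    """
    if len(pixels) != 6 or len(coeffs) != 6:
        raise ValueError("a 6-tap filter needs six pixels and six coefficients")
    acc = sum(p * c for p, c in zip(pixels, coeffs))
    rounded = (acc + (1 << (shift - 1))) >> shift  # add half, then normalize
    return max(0, min(255, rounded))               # clip to the 8-bit range


# Integer pixels C1..C6 of one row (made-up sample values).
row_c = [10, 20, 30, 40, 50, 60]
sub_pixel = interpolate_6tap(row_c, EXAMPLE_6TAP)  # 35, midway between C3 and C4
```

A different sub-pixel position such as S.sub.01 or S.sub.03 would
use a different coefficient set with the same routine.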
[0456] The reference picture interpolator 5910 can interpolate the
reference picture by using the multi-stage filtering. For example,
when horizontal and vertical precisions of the reference picture
are interpolated 8 times, the filter information decoder 6010 can
reconstruct 63 filter coefficient sets of the 6.times.6 tap filter
and the filter 6020 can interpolate the reference picture by using
the reconstructed 63 filter coefficient sets. Alternatively, when
horizontal and vertical precisions of the reference picture are
interpolated 8 times, the filter 6020 can interpolate a sub-pixel
of a 1/4 or a 1/2 pixel by using a filter having a filter
coefficient identified by information on a plurality of filters
reconstructed by the filter information decoder 6010 based on
integer pixels in a first stage and interpolate again a sub-pixel
of a 1/8 pixel by using the filter coefficient identified by
information on the plurality of filters reconstructed by the filter
information decoder 6010 based on an integer pixel and the
sub-pixel of the 1/4 or the 1/2 pixel in a second stage.
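The figure of 63 coefficient sets is consistent with one set per
non-integer position when both directions are interpolated 8 times
(8 x 8 - 1 = 63). That reading is an inference, not a statement of
the disclosure. Under that assumption, the short Python sketch below
enumerates the sub-pixel positions and shows how they divide between
the two stages; the function names are hypothetical.

```python
def subpixel_positions(factor):
    """All (dy, dx) offsets in 1/factor-pel units except the integer pixel."""
    return [
        (dy, dx)
        for dy in range(factor)
        for dx in range(factor)
        if (dy, dx) != (0, 0)
    ]


def positions_added_by_stage(prev_factor, new_factor):
    """Positions on the 1/new_factor grid that the coarser grid lacks."""
    if new_factor % prev_factor:
        raise ValueError("new_factor must be a multiple of prev_factor")
    step = new_factor // prev_factor
    return [
        (dy, dx)
        for dy, dx in subpixel_positions(new_factor)
        if dy % step or dx % step
    ]


single_stage = len(subpixel_positions(8))          # 63 positions at 1/8 pel
first_stage = len(subpixel_positions(4))           # 15 positions up to 1/4 pel
second_stage = len(positions_added_by_stage(4, 8)) # 48 remaining 1/8 positions
```

With a first stage that stops at the 1/2 pixel instead, the split is
3 positions in the first stage and 60 in the second.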
[0457] For another example, when horizontal and vertical precisions
of the reference picture are interpolated 8 times, the filter 6020,
in a first stage, interpolates sub-pixels S.sub.02, S.sub.04, and
S.sub.06 by using the 6-tap filter identified by the information on
the plurality of filters reconstructed by the filter information
decoder 6010 based on integer pixels C1, C2, C3, C4, C5, and C6,
interpolates sub-pixels S.sub.20, S.sub.40, and S.sub.60 by using
the 6-tap filter identified by the information on the plurality of
filters reconstructed by the filter information decoder 6010 based
on integer pixels A3, B3, C3, D3, E3, and F3, and interpolates
sub-pixels up to the 1/2 or the 1/4 pixel of the reference picture
by using the 6-tap filter identified by the information on the
plurality of filters reconstructed by the filter information
decoder 6010. In a second stage, the filter 6020 interpolates again
a sub-pixel of a 1/8 pixel from the sub-pixels interpolated in the
first stage and the integer pixels of the reference picture by
using the 4.times.4 tap filter, the 4-tap filter, the 6-tap filter,
or the 6.times.6 tap filter identified by the information on the
plurality of filters reconstructed by the filter information
decoder 6010.
[0458] Meanwhile, the video encoding/decoding apparatus according
to an aspect of the present disclosure may be implemented by the
connection of a bitstream output terminal of the video encoding
apparatus of FIG. 9, 32, 52, or 64 and a bitstream input terminal
of the video decoding apparatus of FIG. 18, 48, 59, or 72.
[0459] The video encoding/decoding apparatus according to an aspect
of the present disclosure includes a video encoder and a video
decoder. The video encoder interpolates a reference picture to have
the target precision through a multi-stage filtering of the
reference picture by using a plurality of filters and performs an
inter prediction encoding of the video by using the interpolated
reference picture having the target precision. The video decoder
interpolates a reference picture to have the target precision
through a multi-stage filtering of the reference picture by using
the plurality of filters identified by information reconstructed by
a decoding of a bitstream and reconstructs a video by performing an
inter prediction decoding of the bitstream by using the
interpolated reference picture having the target precision.
[0460] FIG. 19 is a flowchart of a method for decoding a video by
using an adaptive motion vector resolution according to the first
aspect of the present disclosure.
[0461] In the method for decoding a video by using an adaptive
motion vector resolution according to the first aspect of the
present disclosure, a resolution change flag is extracted from a
bitstream, an encoded resolution identification flag is extracted
from a bitstream according to the extracted resolution change flag
and is then decoded so that a motion vector resolution of each area
or motion vector is reconstructed, and an inter prediction decoding
of each area is performed using a motion vector of each area
according to the motion vector resolution of each area or motion
vector.
[0462] To this end, the video decoding apparatus 1800 extracts a
resolution change flag from a bitstream (step S1910), determines if
the extracted resolution change flag indicates that the motion
vector resolution changes according to each area or motion vector
(step S1920), reconstructs a motion vector resolution of each area
or motion vector by extracting a resolution identification flag
from a bitstream and decoding the extracted resolution
identification flag when the resolution change flag indicates that
the motion vector resolution changes according to each area or
motion vector (step S1930), and reconstructs a motion vector of
each area according to the reconstructed motion vector resolution
and then performs an inter prediction decoding by using the
reconstructed motion vector (step S1940). Further, when the
resolution change flag indicates that the motion vector resolution
does not change according to each area or motion vector but is
fixed, the video decoding apparatus 1800 reconstructs a motion
vector resolution by extracting the resolution identification flag
from a bitstream and decoding the extracted resolution
identification flag (step S1950), and reconstructs a motion vector
according to the fixed motion vector resolution for lower areas
defined in a header according to the reconstructed motion vector
resolution and then performs an inter prediction decoding of each
area by using the reconstructed motion vector (step S1960). In this
event, the motion vector resolution decoded for each area or motion
vector may have different values for an x component and a y
component of the motion vector.
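The branch between steps S1930 and S1950 can be sketched in Python
as below. The bit layout is entirely hypothetical: a 1-bit
resolution change flag followed by 2-bit resolution identification
flags that index a table of 1/2, 1/4, and 1/8 pixel resolutions. The
BitReader class and the function name are illustrative helpers, not
elements of the disclosure.

```python
from fractions import Fraction

# Hypothetical mapping from a 2-bit identification flag to a resolution.
RESOLUTION_TABLE = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8))


class BitReader:
    """Minimal MSB-first reader over a string of '0' and '1' characters."""

    def __init__(self, bits):
        self._bits = [int(b) for b in bits]
        self._pos = 0

    def read(self, n):
        if self._pos + n > len(self._bits):
            raise EOFError("bitstream exhausted")
        value = 0
        for b in self._bits[self._pos:self._pos + n]:
            value = (value << 1) | b
        self._pos += n
        return value


def decode_area_resolutions(reader, num_areas):
    """Return one motion vector resolution per area."""
    change_flag = reader.read(1)                  # step S1910
    if change_flag:                               # step S1920: adaptive
        # Step S1930: one identification flag per area.
        return [RESOLUTION_TABLE[reader.read(2)] for _ in range(num_areas)]
    # Step S1950: a single identification flag fixes the resolution.
    fixed = RESOLUTION_TABLE[reader.read(2)]
    return [fixed] * num_areas


adaptive = decode_area_resolutions(BitReader("1" "00" "10" "01"), 3)
fixed = decode_area_resolutions(BitReader("0" "01"), 3)
```

The motion vector reconstruction and inter prediction of steps S1940
and S1960 would then use the returned resolutions area by area.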
[0463] The video decoding apparatus 1800 may reconstruct the motion
vector resolution of each area or motion vector by decoding a
resolution identification flag hierarchically encoded in a Quadtree
structure by grouping areas having the same motion vector
resolution together, may reconstruct the motion vector resolution
of each area by decoding a resolution identification flag
hierarchically encoded using a motion vector resolution predicted
using motion vector resolutions of surrounding areas of each area,
may reconstruct the motion vector resolution of each area or motion
vector by decoding a resolution identification flag in which the
run and length of a motion vector resolution of each area or motion
vector have been encoded, may reconstruct the motion vector
resolution of each area or motion vector by decoding a resolution
identification flag hierarchically encoded using a tag tree, may
reconstruct the motion vector resolution of each area or motion
vector by decoding a resolution identification flag with a changing
number of bits allocated to the resolution identification flag
according to the frequency of the motion vector resolution of each
area or motion vector, may estimate a motion vector resolution
according to a pre-promised estimation scheme and reconstruct the
estimated motion vector resolution as a motion vector resolution of
the corresponding area or motion vector when the resolution
identification flag decoded for each area or motion vector
corresponds to a flag indicating the capability of estimation, or
may reconstruct a motion vector resolution indicated by the decoded
resolution identification flag when the resolution
identification flag decoded for each area or motion vector
corresponds to a flag indicating the incapability of estimation. In
this event, the video decoding apparatus 1800 may decode and
reconstruct an identifier, which indicates the size of an area
indicated by the lowest node of the Quadtree layers and the maximum
number of the Quadtree layers or the size of an area indicated by
the lowest node of the tag tree layers and the maximum number of
the tag tree layers, from a header of a bitstream.
[0464] Further, the video decoding apparatus 1800 may extract and
decode an encoded differential motion vector from a bitstream. In
this event, the video decoding apparatus 1800 may decode and
reconstruct a differential motion vector of each area or motion
vector according to a motion vector resolution of each
reconstructed area or motion vector. Additionally, the video
decoding apparatus 1800 may predict a predicted motion vector of
each area or motion vector and then reconstruct a motion vector of
each area by using the reconstructed differential motion vector and
the predicted motion vector.
[0465] To this end, the video decoding apparatus 1800 may decode
the differential vector by using the UVLC. In this event, the video
decoding apparatus 1800 may use the K-th order Exp-Golomb code in
the decoding, and may change the degree of order (K) of the
Exp-Golomb code according to the motion vector resolution
determined for each area. The video decoding apparatus 1800 may
decode the differential motion vector by using a context-based binary
arithmetic coding. In the decoding, the video decoding apparatus
1800 may use the Concatenated Truncated Unary/K-th Order Exp-Golomb
Code and may change the degree of order (K) and the maximum value
(T) of the Concatenated Truncated Unary/K-th Order Exp-Golomb Code
according to the motion vector resolution of each reconstructed
area or motion vector. When the video decoding apparatus 1800
decodes the differential vector by using the CABAC, the video
decoding apparatus 1800 may differently calculate the accumulation
probability according to the motion vector resolution of each
reconstructed area or motion vector.
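For reference, a K-th order Exp-Golomb codeword can be decoded as in
the following Python sketch, which also includes a matching encoder
so that the round trip can be checked. The codeword construction is
the standard one (an order-0 prefix for the value shifted right by K
bits, followed by the K low-order bits). The table that ties K to
the motion vector resolution is hypothetical; the disclosure states
only that K may change with the resolution. Only non-negative values
are handled, so the mapping of signed differential vector components
to code numbers is left out.

```python
# Hypothetical choice of K per resolution denominator (2 -> 1/2 pel, ...).
K_BY_DENOMINATOR = {2: 0, 4: 1, 8: 2}


def encode_exp_golomb(value, k):
    """Return the K-th order Exp-Golomb codeword of a non-negative integer."""
    quotient = (value >> k) + 1
    prefix = "0" * (quotient.bit_length() - 1) + format(quotient, "b")
    suffix = format(value & ((1 << k) - 1), "0{}b".format(k)) if k else ""
    return prefix + suffix


def decode_exp_golomb(bits, k):
    """Decode one codeword from a '0'/'1' string.

    Returns (value, number_of_bits_consumed).
    """
    pos = 0
    zeros = 0
    while bits[pos] == "0":          # count the leading zeros of the prefix
        zeros += 1
        pos += 1
    quotient = int(bits[pos:pos + zeros + 1], 2)  # includes the marker '1'
    pos += zeros + 1
    remainder = int(bits[pos:pos + k], 2) if k else 0
    pos += k
    return ((quotient - 1) << k) | remainder, pos


k = K_BY_DENOMINATOR[4]              # 1/4-pel area under the assumed table
codeword = encode_exp_golomb(5, k)   # "0111"
value, used = decode_exp_golomb(codeword + "1010", k)  # trailing bits ignored
```

A larger K shortens the codewords of large values at the cost of
longer codewords for small ones, which is why tying K to the
resolution can pay off when finer resolutions produce larger
differential values.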
[0466] Further, the video decoding apparatus 1800 may predict a
predicted motion vector for a motion vector of each area by using
motion vectors of surrounding areas of each area. In this event,
when the motion vector resolution of each area is not equal to the
motion vector resolution of surrounding areas, the video decoding
apparatus 1800 may perform the prediction after converting the
motion vector resolution of the surrounding areas to the motion
vector resolution of said each area. The predicted motion vector
may be obtained by the same method in the video encoding apparatus
and the video decoding apparatus. Therefore, various aspects of
deriving a predicted motion vector by a video encoding apparatus
can be also implemented in a video decoding apparatus according to
an aspect of the present disclosure.
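The conversion of the surrounding motion vectors to the resolution
of the current area, followed by prediction and reconstruction, can
be sketched in Python as follows. Motion vectors are stored as
integer pairs in units of 1/denominator pixel. The component-wise
median predictor and the rounding rule (half away from zero) are
assumptions borrowed from common practice; the disclosure requires
only that the encoder and the decoder use the same method.

```python
def convert_mv(mv, from_denom, to_denom):
    """Rescale an (x, y) vector from 1/from_denom-pel to 1/to_denom-pel units."""
    def scale(component):
        numerator = component * to_denom
        quotient, remainder = divmod(abs(numerator), from_denom)
        if 2 * remainder >= from_denom:   # round half away from zero
            quotient += 1
        return quotient if numerator >= 0 else -quotient

    return (scale(mv[0]), scale(mv[1]))


def predict_mv(neighbours, current_denom):
    """Component-wise median of the neighbours after resolution conversion.

    neighbours -- list of ((x, y), denominator) pairs
    """
    converted = [convert_mv(mv, d, current_denom) for mv, d in neighbours]
    xs = sorted(v[0] for v in converted)
    ys = sorted(v[1] for v in converted)
    middle = len(converted) // 2
    return (xs[middle], ys[middle])


def reconstruct_mv(predicted, differential):
    """Motion vector = predicted motion vector + differential motion vector."""
    return (predicted[0] + differential[0], predicted[1] + differential[1])


# Three neighbours at 1/4, 1/8 and 1/2 pel; the current area uses 1/8 pel.
neighbours = [((4, -2), 4), ((3, 1), 8), ((2, 0), 2)]
pmv = predict_mv(neighbours, 8)          # (8, 0) in 1/8-pel units
mv = reconstruct_mv(pmv, (-1, 2))        # (7, 2) in 1/8-pel units
```

Converting before taking the median keeps all candidates on one
scale; without it a vector of 2 at 1/2 pel would be compared
directly with a vector of 3 at 1/8 pel.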
[0467] In addition, the video decoding apparatus 1800 may use
different methods of decoding a resolution identification flag
according to the distribution of the motion vector resolutions of
surrounding areas of each area with respect to the motion vector
resolution determined according to each area or motion vector.
[0468] Further, in performing the entropy decoding by an arithmetic
decoding, the video decoding apparatus 1800 may use different
methods of generating a bit string of a resolution identification
flag according to the distribution of the motion vector resolutions
of the surrounding areas of each area and may apply different
context models according to the distribution of the motion vector
resolutions of the surrounding areas and the probabilities of the
motion vector resolution having occurred up to the present, for the
arithmetic decoding and probability update. Also, the video
decoding apparatus 1800 may use different context models according
to the bit positions for the arithmetic decoding and probability
update.
[0469] Moreover, when one or more of the areas is a block and the
block mode of the block is a skip mode, the video decoding
apparatus 1800 may convert the motion vector resolution of the area
of the motion vector to be predicted to the highest resolution
among the motion vector resolutions of surrounding areas of the
area and then perform the prediction.
[0470] FIG. 51 is a flowchart illustrating a video decoding method
using an adaptive motion vector resolution according to the second
aspect of the present disclosure.
[0471] The video decoding method using an adaptive motion vector
resolution according to the second aspect of the present disclosure
includes: a resolution appointment flag extracting step (S5102), a
resolution decoding step (S5104), a differential vector decoding
step (S5106), an inter prediction decoding step (S5108), and a
resolution conversion flag generating step (S5110).
[0472] The resolution appointment flag extracting step (S5102)
corresponds to the operation of the resolution appointment flag
extractor 4810, the resolution decoding step (S5104) corresponds to
the operation of the resolution decoder 4820, the differential
vector decoding step (S5106) corresponds to the operation of the
differential vector decoder 4830, the inter prediction decoding
step (S5108) corresponds to the operation of the inter prediction
decoder 4840, and the resolution conversion flag generating step
(S5110) corresponds to the operation of the resolution conversion
flag extractor 4850. Therefore, a detailed description on each step
is omitted here.
[0473] FIG. 61 is a flowchart of a reference picture interpolating
method for a video decoding according to an aspect of the present
disclosure.
[0474] According to the reference picture interpolating method for
the video decoding according to the aspect of the present
disclosure, the reference picture interpolator 5910 reconstructs
information on a first filter and information on a second filter by
decoding a bitstream in step S6110, interpolates the reference
picture by using the first filter identified by the information on
the first filter in step S6120, and interpolates the reference
picture by using the second filter identified by the information on
the second filter in step S6130.
[0475] Here, the information on the first filter and the
information on the second filter may contain information on filter
coefficients or information on types of filters selected from a
plurality of filters.
[0476] In step S6120, the reference picture interpolator 5910 can
interpolate a sub-pixel of the reference picture by using the first
filter based on an integer pixel of the reference picture.
[0477] In step S6130, the reference picture interpolator 5910 can
interpolate the sub-pixel to have the target precision based on the
integer pixel of the reference picture and the interpolated
sub-pixel of the reference picture.
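Steps S6120 and S6130 can be illustrated in one dimension by the
Python sketch below. The first stage derives 1/2-pixel samples from
integer pixels with a 6-tap filter, and the second stage derives
finer samples from both the integer pixels and the first-stage
sub-pixels. The fixed coefficients, the 2-tap averaging used in the
second stage, the replication of edge pixels, and the target
precision of 1/4 pixel are all assumptions chosen for brevity; in
the disclosure both filters are identified by information decoded
from the bitstream in step S6110.

```python
# Assumed first-stage coefficients; they sum to 32, hence the shift by 5.
HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)


def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def stage1_half_pel(row, taps=HALF_PEL_TAPS):
    """Step S6120: insert one 6-tap half-pel sample between integer pixels."""
    n = len(row)
    out = []
    for i in range(n):
        out.append(row[i])
        if i < n - 1:
            # Pixels i-2 .. i+3, with edge pixels replicated.
            window = [row[min(max(i + k, 0), n - 1)] for k in range(-2, 4)]
            acc = sum(p * c for p, c in zip(window, taps))
            out.append(clip8((acc + 16) >> 5))
    return out


def stage2_refine(samples):
    """Step S6130: insert the rounded average of each pair of neighbours."""
    out = []
    for i, sample in enumerate(samples):
        out.append(sample)
        if i < len(samples) - 1:
            out.append((sample + samples[i + 1] + 1) >> 1)
    return out


def interpolate_two_stage(row):
    """Integer-pel row in, 1/4-pel row out (4 * len(row) - 3 samples)."""
    return stage2_refine(stage1_half_pel(row))


quarter_pel_row = interpolate_two_stage([0, 32, 64, 96])
```

Applying the second stage once more would bring the row to 1/8-pixel
precision, with a row of n integer pixels yielding 8n - 7 samples.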
[0478] FIG. 62 is a flowchart of a video decoding method according
to the third aspect of the present disclosure.
[0479] According to the video decoding method according to the
third aspect of the present disclosure, the video decoding
apparatus 5900 interpolates the reference picture to have the
target precision through a multi-stage filtering of the reference
picture by using a plurality of filters identified by information
on the plurality of filters reconstructed through a bitstream
decoding in step S6210 and reconstructs the video by performing an
inter prediction decoding of the bitstream by using the
interpolated reference picture having the target precision in step
S6220.
[0480] In step S6210, the video decoding apparatus 5900 can
interpolate the reference picture through a filtering using one
filter among a plurality of filters and interpolate the reference
picture to have the target precision by repeating a process of
interpolating the reference picture through a filtering using
another filter among the plurality of filters.
[0481] FIG. 73 illustrates a video decoding method according to the
fourth aspect of the present disclosure.
[0482] As shown in FIG. 73, the video decoding method according to
the fourth aspect of the present disclosure includes extracting the
resolution appointment flag in step S7302, decoding the resolution
in step S7304, interpolating the reference picture in step S7306,
decoding the differential vector in step S7308, performing the
inter prediction decoding in step S7310, and extracting the
resolution conversion flag in step S7312.
[0483] Here, step S7302 of extracting the resolution appointment
flag corresponds to the operation of the resolution appointment
flag extractor 4810 of the video decoding apparatus 7200 according
to the fourth aspect of the present disclosure, step S7304 of
decoding the resolution corresponds to the operation of the
resolution decoder 4820 of the video decoding apparatus 7200
according to the fourth aspect of the present disclosure, step
S7306 of interpolating the reference picture corresponds to the
operation of the reference picture interpolator 7210, step S7308 of
decoding the differential vector corresponds to the operation of
the differential vector decoder 4830 of the video decoding apparatus 7200
according to the fourth aspect of the present disclosure, step
S7310 of performing the inter prediction decoding corresponds to
the operation of the inter prediction decoder 4840 of the video
decoding apparatus 7200 according to the fourth aspect of the
present disclosure, and step S7312 of extracting the resolution
conversion flag corresponds to the operation of the resolution
conversion flag extractor 4850 of the video decoding apparatus 7200
according to the fourth aspect of the present disclosure, and thus
a detailed description will be omitted.
[0484] Further, the steps described above may include a step or
steps, which can be omitted, depending on the existence or absence
of each element of the video decoding apparatus 7200, from the
video decoding method according to the fourth aspect of the present
disclosure.
[0485] As described above, an aspect of the present disclosure can
adaptively determine the precision of the reference picture in
every area, which is the unit of a predetermined video encoding
such as a block, a macro block, a slice, a picture, or a picture
group, and encode the reference picture by changing the motion
vector resolution and also adaptively determine an optimum
interpolation filter or filter coefficient for an area to be
encoded by interpolating the reference picture through the
selection of the interpolation filter to change the precision of
the reference picture from a filter set having a fixed filter
coefficient or through the adaptive calculation of the optimum
filter coefficient for the determined filter.
[0486] Further, an aspect of the present disclosure can interpolate
the reference picture to have the target precision through a
multi-stage filtering of the reference picture, so that a reference
picture having the more precise resolution can be generated and the
precision of the motion estimation can be increased. As a result,
the prediction accuracy is increased and thus the compression
efficiency and the reconstruction efficiency can be improved.
[0487] Meanwhile, the video encoding/decoding method according to
an aspect of the present disclosure may be implemented by the
combination of one method of the video encoding methods according
to the first aspect to the fourth aspect and one method of the
video decoding methods according to the first aspect to the fourth
aspect.
[0488] The video encoding/decoding method according to an aspect of
the present disclosure includes encoding a video by determining the
motion vector resolution for each area or motion vector and
performing the inter prediction encoding by using a motion vector
according to the motion vector resolution determined for each area
or motion vector; and decoding a video by reconstructing the
resolution by extracting resolution information from a bitstream
and performing the inter prediction decoding by using a motion
vector according to the motion vector resolution of each
reconstructed area or motion vector.
[0489] As described above, according to aspects of the present
disclosure, it is possible to determine a motion vector resolution
in the unit of motion vectors or areas having a predetermined size
of a video according to the characteristics of the video (e.g. the
degree of complexity or the degree of movement of the video) and
then perform an inter prediction encoding by using a motion vector
having an adaptive motion vector resolution. Therefore, the present
disclosure can improve the quality of the video while reducing the
quantity of bits according to the encoding, so as to enhance the
compression efficiency. For example, an area (i.e. first area) in a
certain picture of a video may have a large complexity and a small
degree of movement while another area (i.e. second area) in the
certain picture of the video may have a small complexity and a
large degree of movement. In this event, for the first area, an
inter prediction encoding may be performed after enhancing the
motion vector resolution of the first area, so as to increase the
exactness of the inter prediction, which can reduce residual
signals and the quantity of encoded bits. Moreover, due to the
small degree of movement of the first area, even the increase of
the resolution in the first area does not largely increase the
quantity of bits, which can improve the video quality while
reducing the encoded bit quantity. Further, in relation to the
second area, even an inter prediction encoding with a lower motion
vector resolution does not largely degrade the video quality of the
second area, and the second area can allow a low motion vector
resolution, which can reduce the quantity of encoded bits of the
motion vector. As a result, without largely degrading the video
quality of the second area, it is possible to reduce the entire
quantity of encoded bits, which can improve the compression
efficiency.
[0490] Further, according to an aspect of the present disclosure,
it is possible to adaptively determine the precision of the
reference picture in every area, which is the unit of a
predetermined video encoding such as a block, a macro block, a
slice, a picture, or a picture group, and encode the reference
picture by changing the motion vector resolution and also
adaptively determine an optimum interpolation filter or filter
coefficient for an area to be encoded by interpolating the
reference picture through the selection of the interpolation filter
to change the precision of the reference picture from a filter set
having a fixed filter coefficient or through the adaptive
calculation of the optimum filter coefficient for the determined
filter.
[0491] Further, an aspect of the present disclosure can interpolate
the reference picture to have the target precision through a
multi-stage filtering of the reference picture, so that a reference
picture having the more precise resolution can be generated and the
precision of the motion estimation can be increased to improve the
prediction accuracy and thus the compression efficiency and the
reconstruction efficiency.
[0492] In the description above, although all of the components of
the aspects of the present disclosure may have been explained as
being assembled or operatively connected as a unit, the present
disclosure is not intended to limit itself to such aspects. Rather,
within the objective scope of the present disclosure, the
respective components may be selectively and operatively combined
in any numbers. Every one of the components may be also implemented
by itself in hardware while the respective ones can be combined in
part or as a whole selectively and implemented in a computer
program having program modules for executing functions of the
hardware equivalents. Codes or code segments to constitute such a
program may be easily deduced by a person skilled in the art.
The computer program may be stored in computer readable media,
which in operation can realize the aspects of the present
disclosure. As the computer readable media, the candidates include
magnetic recording media, optical recording media, and carrier wave
media.
[0493] In addition, terms like `include`, `comprise`, and `have`
should be interpreted by default as inclusive or open rather than
exclusive or closed unless expressly defined to the contrary. All
terms, whether technical, scientific or otherwise, agree with
the meanings as understood by a person skilled in the art unless
defined to the contrary. Common terms as found in dictionaries
should be interpreted in the context of the related technical
writings, neither too ideally nor impractically, unless the present
disclosure expressly defines them so.
[0494] Although exemplary aspects of the present disclosure have
been described for illustrative purposes, those skilled in the art
will appreciate that various modifications, additions and
substitutions are possible, without departing from essential
characteristics of the disclosure. Therefore, exemplary aspects of
the present disclosure have not been described for limiting
purposes.
[0495] Accordingly, the scope of the disclosure is not to be
limited by the above aspects but by the claims and the equivalents
thereof.
INDUSTRIAL APPLICABILITY
[0496] As described above, the present disclosure is highly useful
for application in the fields of compressing a video, interpolating
the reference picture through the determination of a filter or a
filter coefficient to interpolate the reference picture according
to characteristics of a video and interpolating the reference
picture through a multi-stage filtering of the reference picture or
performing an inter prediction encoding by adaptively changing a
motion vector resolution in the unit of predetermined areas, and
thereby efficiently encoding a video.
CROSS-REFERENCE TO RELATED APPLICATION
[0497] If applicable, this application claims priorities under 35
U.S.C. §119(a) of Patent Application No. 10-2009-0077452,
filed on Aug. 21, 2009; Patent Application No. 10-2010-0019208,
filed on Mar. 3, 2010 and Patent Application No. 10-2010-0081097,
filed on Aug. 20, 2010 in Korea, the entire contents of which are
incorporated herein by reference. In addition, this non-provisional
application claims priorities in countries other than the U.S.
for the same reason, based on the Korean Patent Applications, the
entire contents of which are hereby incorporated by reference.
* * * * *