U.S. patent application number 13/415901 was filed with the patent office on 2012-03-09 and published on 2012-09-13 for interpolation filter selection using prediction index.
This patent application is currently assigned to GENERAL INSTRUMENT CORPORATION. Invention is credited to David M. Baylon, Jian Lou, Koohyar Minoo, Krit Panusopone, Limin Wang.
Application Number: 13/415901
Publication Number: 20120230407
Family ID: 46795573
Published: 2012-09-13

United States Patent Application 20120230407
Kind Code: A1
Minoo; Koohyar; et al.
September 13, 2012
Interpolation Filter Selection Using Prediction Index
Abstract
In one embodiment, a method for encoding or decoding video
content is provided. The method includes determining a set of
interpolation filters for use in interpolating sub-pel pixel values
and a mapping between interpolation filters in the set of
interpolation filters and different prediction indexes of the video
content. A unit of video content is received and a prediction index
is determined in a plurality of prediction indexes that are used to
determine a prediction block for the unit of video content. The
method then determines an interpolation filter in the set of
interpolation filters based on a mapping between the interpolation
filter and the prediction index to interpolate a sub-pel pixel
value for use in a temporal prediction process for the unit of
video content.
Inventors: Minoo; Koohyar (San Diego, CA); Baylon; David M. (San Diego, CA); Lou; Jian (San Diego, CA); Panusopone; Krit (San Diego, CA); Wang; Limin (San Diego, CA)
Assignee: GENERAL INSTRUMENT CORPORATION, Horsham, PA
Family ID: 46795573
Appl. No.: 13/415901
Filed: March 9, 2012
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
61451827 | Mar 11, 2011 |
61453278 | Mar 16, 2011 |
Current U.S. Class: 375/240.14; 375/E7.26
Current CPC Class: H04N 19/523 20141101; H04N 19/117 20141101; H04N 19/46 20141101
Class at Publication: 375/240.14; 375/E07.26
International Class: H04N 7/36 20060101 H04N007/36
Claims
1. A method for encoding or decoding video content, the method
comprising: determining a set of interpolation filters for use in
interpolating sub-pel pixel values and a mapping between
interpolation filters in the set of interpolation filters and
different prediction indexes of the video content; receiving a unit
of video content; determining a prediction index in a plurality of
prediction indexes that are used to determine a prediction block
for the unit of video content; and determining an interpolation
filter in the set of interpolation filters based on a mapping
between the interpolation filter and the prediction index to
interpolate a sub-pel pixel value for use in a temporal prediction
process for the unit of video content.
2. The method of claim 1, wherein the mapping was received prior to
receiving the unit of video content.
3. The method of claim 1, wherein the prediction index comprises a
prediction combination index that is used to determine one or more
prediction blocks for the unit of video content.
4. The method of claim 1, wherein when a uni-prediction mode is
used, a first prediction index is associated with a first
interpolation filter that is used if the first prediction index is
determined and a second prediction index is associated with a
second interpolation filter that is used if the second prediction
index is determined.
5. The method of claim 1, wherein when a bi-prediction mode is
used: the prediction block comprises a first prediction block used
in the bi-prediction, a second prediction block is used in the
bi-prediction, and a first prediction index is associated with a
first interpolation filter that is used for the first prediction
block and a second prediction index is associated with a second
interpolation filter that is used for the second prediction
block.
6. The method of claim 1, wherein an encoder does not signal to
a decoder which interpolation filter was determined.
7. The method of claim 1, further comprising: determining when to
send a signal including information regarding the mapping; and
sending the signal from an encoder to a decoder for use in decoding
video content when it is determined the signal should be sent,
wherein the decoder uses the information to decode a unit of the
video content when determining an interpolation filter in the set
of interpolation filters based on a mapping between the
interpolation filter and a prediction index.
8. The method of claim 7, wherein the information comprises a new
mapping between interpolation filters in the set of interpolation
filters and prediction indexes.
9. The method of claim 7, wherein the information comprises a
change to a mapping between an interpolation filter in the set of
interpolation filters and a prediction index, wherein the mapping
changed was previously stored at the decoder.
10. The method of claim 7, wherein the mapping is used to select a
group of interpolation filters and the information is used to
select the interpolation filter from within the group.
11. The method of claim 7, wherein the information is used to
select a group of interpolation filters from the set of
interpolation filters.
12. An apparatus configured to encode or decode video content, the
apparatus comprising: one or more computer processors; and a
computer-readable storage medium comprising instructions for
controlling the one or more computer processors to be operable to:
determine a set of interpolation filters for use in interpolating
sub-pel pixel values and a mapping between interpolation filters in
the set of interpolation filters and different prediction indexes
of the video content; receive a unit of video content; determine a
prediction index in a plurality of prediction indexes that are used
to determine a prediction block for the unit of video content; and
determine an interpolation filter in the set of interpolation
filters based on a mapping between the interpolation filter and the
prediction index to interpolate a sub-pel pixel value for use in a
temporal prediction process for the unit of video content.
13. The apparatus of claim 12, wherein the mapping was received
prior to receiving the unit of video content.
14. The apparatus of claim 12, wherein the prediction index
comprises a prediction combination index that is used to determine
one or more prediction blocks for the unit of video content.
15. The apparatus of claim 12, wherein when a uni-prediction mode
is used, a first prediction index is associated with a first
interpolation filter that is used if the first prediction index is
determined and a second prediction index is associated with a
second interpolation filter that is used if the second prediction
index is determined.
16. The apparatus of claim 12, wherein when a bi-prediction mode is
used: the prediction block comprises a first prediction block used
in the bi-prediction, a second prediction block is used in the
bi-prediction, and a first prediction index is associated with a
first interpolation filter that is used for the first prediction
block and a second prediction index is associated with a second
interpolation filter that is used for the second prediction
block.
17. The apparatus of claim 12, further operable to: determine when
to send a signal including information regarding the mapping; and
send the signal from an encoder to a decoder for use in decoding
video content when it is determined the signal should be sent,
wherein the decoder uses the information to decode a unit of the
video content when determining an interpolation filter in the set
of interpolation filters based on a mapping between the
interpolation filter and a prediction index.
18. The apparatus of claim 17, wherein the information comprises a
new mapping between interpolation filters in the set of
interpolation filters and prediction indexes.
19. The apparatus of claim 17, wherein the information comprises a
change to a mapping between an interpolation filter in the set of
interpolation filters and a prediction index, wherein the mapping
changed was previously stored at the decoder.
20. A non-transitory computer-readable storage medium comprising
instructions for encoding or decoding video content, the
instructions for controlling one or more computer processors to
be operable to: determine a set of interpolation filters for use in
interpolating sub-pel pixel values and a mapping between
interpolation filters in the set of interpolation filters and
different prediction indexes of the video content; receive a unit
of video content; determine a prediction index in a plurality of
prediction indexes that are used to determine a prediction block
for the unit of video content; and determine an interpolation
filter in the set of interpolation filters based on a mapping
between the interpolation filter and the prediction index to
interpolate a sub-pel pixel value for use in a temporal prediction
process for the unit of video content.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] The present application claims priority to U.S. Provisional
App. No. 61/451,827 for "Adaptive Interpolation with Implicit
Signaling of the Interpolation Choice" filed Mar. 11, 2011 and U.S.
Provisional App. No. 61/453,278 for "High Level Adaptive
Interpolation Filter Signaling" filed Mar. 16, 2011, the contents
of which are incorporated herein by reference in their entirety.
[0002] The present application is related to U.S. application Ser.
No. ______ for "Interpolation Filter Selection Using PU Size" filed
concurrently, the contents of which are incorporated herein by
reference in their entirety.
BACKGROUND
[0003] Particular embodiments generally relate to video
compression.
[0004] High-efficiency video coding (HEVC) is a block-based hybrid
spatial and temporal predictive coding scheme. Similar to other
video coding standards, such as Moving Picture Experts Group
(MPEG)-1, MPEG-2, and MPEG-4, HEVC supports intra-picture coding,
such as I pictures, and inter-picture coding, such as B pictures.
In HEVC, P and B
pictures are consolidated into a general B picture that can be used
as a reference picture.
[0005] An intra-picture is coded without referring to any other
pictures. Thus, only spatial prediction is allowed for a coding
unit (CU)/prediction unit (PU) inside an intra-picture. An
inter-picture, however, supports both intra- and inter-prediction.
A CU/PU in an inter-picture may be either spatially or temporally
predictive coded. Temporal predictive coding may reference pictures
that were previously coded.
[0006] Temporal motion prediction is an effective method to
increase the coding efficiency and provides high compression. HEVC
uses a translational model for motion prediction. According to the
translational model, a prediction signal for a given block in a
current picture is generated from a corresponding block in a
reference picture. The coordinates of the reference block are given
by a motion vector that describes the translational motion along
horizontal (x) and vertical (y) directions that would be
added/subtracted to/from the coordinates of the current block. A
decoder needs the motion vector to decode the compressed video.
[0007] The pixels in the reference frame are used as the
prediction. In one example, the motion may be captured in integer
pixels. However, not all objects move with the spacing of integer
pixels. For example, since an object motion is completely unrelated
to the sampling grid, sometimes the object motion is more like a
fractional-pel motion than a full-pel one. Thus, HEVC allows for
motion vectors with sub-pel (fractional) pixel accuracy.
[0008] In order to estimate and compensate sub-pel displacements,
the image signal on these sub-pel positions is generated by an
interpolation process. In HEVC, sub-pel pixel interpolation is
performed using finite impulse response (FIR) filters. Generally,
the filter may have 8 taps to determine the sub-pel pixel values
for sub-pel pixel positions, such as half-pel and quarter-pel
positions. The taps of an interpolation filter weight the integer
pixels with coefficient values to generate the sub-pel signals.
Different coefficients may produce different compression
performance in signal distortion and noise.
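As a concrete illustration of this tap weighting, the following sketch applies an 8-tap FIR filter to a line of full-pel pixels to produce a half-pel value. The tap values follow the HEVC-style half-pel luma filter; the exact coefficients, rounding offset, and 8-bit clipping here should be read as illustrative assumptions rather than normative values.

```python
# Sketch of 8-tap FIR half-pel interpolation. The taps below are
# HEVC-style half-pel luma coefficients (sum = 64); the rounding
# offset and 8-bit clipping are illustrative assumptions.

HALF_PEL_TAPS = [-1, 4, -11, 40, 40, -11, 4, -1]

def interpolate_half_pel(pixels, taps=HALF_PEL_TAPS):
    """Weight eight neighboring full-pel pixels to produce the
    half-pel value between the two center pixels."""
    assert len(pixels) == len(taps)
    acc = sum(c * p for c, p in zip(taps, pixels))
    # Round, normalize by the tap sum (64 = 2**6), clip to [0, 255].
    return max(0, min(255, (acc + 32) >> 6))

# A flat run of pixels interpolates to the same value:
print(interpolate_half_pel([100] * 8))  # 100
```

Because the taps sum to 64, a constant input region reproduces itself exactly, which is the unity-DC-gain property discussed later in the interpolation filter design section of this disclosure.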
[0009] In one example, the coefficients for the filter are fixed
and applicable to compression of all sequences. In another example,
the filter choice may vary from sequence to sequence, within a
sequence, from picture to picture, from reference to reference, or
within a picture, from PU to PU. This is referred to as an adaptive
interpolation filter (AIF). To use an adaptive interpolation
filter, the choice of which adaptive filter to use needs to be
communicated to the decoder explicitly by sending the filter
coefficients or by sending information indicating the preferred
filter to be used. The explicit signaling increases the overhead as
the information needs to be encoded and sent to the decoder for
every PU that is being compressed and is costly because it
increases the required bit rate in temporally predictive
pictures.
SUMMARY
[0010] In one embodiment, a method for encoding or decoding video
content is provided. The method includes determining a set of
interpolation filters for use in interpolating sub-pel pixel values
and a mapping between interpolation filters in the set of
interpolation filters and different prediction indexes of the video
content. A unit of video content is received and a prediction index
is determined in a plurality of prediction indexes that are used to
determine a prediction block for the unit of video content. The
method then determines an interpolation filter in the set of
interpolation filters based on a mapping between the interpolation
filter and the prediction index to interpolate a sub-pel pixel
value for use in a temporal prediction process for the unit of
video content.
[0011] In one embodiment, an apparatus configured to encode or
decode video content is provided. The apparatus includes: one or
more computer processors; and a computer-readable storage medium
comprising instructions for controlling the one or more computer
processors to be operable to: determine a set of interpolation
filters for use in interpolating sub-pel pixel values and a mapping
between interpolation filters in the set of interpolation filters
and different prediction indexes of the video content; receive a
unit of video content; determine a prediction index in a plurality
of prediction indexes that are used to determine a prediction block
for the unit of video content; and determine an interpolation
filter in the set of interpolation filters based on a mapping
between the interpolation filter and the prediction index to
interpolate a sub-pel pixel value for use in a temporal prediction
process for the unit of video content.
[0012] In one embodiment, a non-transitory computer-readable
storage medium is provided including instructions for encoding or
decoding video content. The instructions are for controlling the
one or more computer processors to be operable to: determine a set
of interpolation filters for use in interpolating sub-pel pixel
values and a mapping between interpolation filters in the set of
interpolation filters and different prediction indexes of the video
content; receive a unit of video content; determine a prediction index in a
plurality of prediction indexes that are used to determine a
prediction block for the unit of video content; and determine an
interpolation filter in the set of interpolation filters based on a
mapping between the interpolation filter and the prediction index
to interpolate a sub-pel pixel value for use in a temporal
prediction process for the unit of video content.
[0013] The following detailed description and accompanying drawings
provide a more detailed understanding of the nature and advantages
of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] FIG. 1 depicts an example of a system for encoding and
decoding video content according to one embodiment.
[0015] FIG. 2A depicts an example of an encoder according to one
embodiment.
[0016] FIG. 2B depicts an example of a decoder according to one
embodiment.
[0017] FIG. 3 depicts positions of half-pel and quarter-pel pixels
between full-pel pixels along a pixel line within an image
according to one embodiment.
[0018] FIG. 4 shows a more detailed example of a filter determiner
according to one embodiment.
[0019] FIG. 5 shows a simplified flowchart of a method for
determining an interpolation filter using PU size according to one
embodiment.
[0020] FIG. 6 depicts a simplified flowchart of a method for
determining an interpolation filter using a prediction index
according to one embodiment.
[0021] FIG. 7 depicts a simplified flowchart of a method for
performing adaptive interpolation filter signaling according to one
embodiment.
DETAILED DESCRIPTION
[0022] Described herein are techniques for a video compression
system. In the following description, for purposes of explanation,
numerous examples and specific details are set forth in order to
provide a thorough understanding of embodiments of the present
invention. Particular embodiments as defined by the claims may
include some or all of the features in these examples alone or in
combination with other features described below, and may further
include modifications and equivalents of the features and concepts
described herein.
Overview
[0023] FIG. 1 depicts an example of a system 100 for encoding and
decoding video content according to one embodiment. System 100
includes an encoder 102 and a decoder 104, both of which will be
described in more detail below. Encoder 102 and decoder 104 perform
temporal prediction through motion estimation and motion
compensation. The temporal prediction searches for a best match
prediction for a current prediction unit (PU) over reference
pictures. The best match prediction is described by a motion vector
(MV) and associated reference picture ID. Also, a PU in a B picture
may have up to two motion vectors.
[0024] The temporal prediction allows for fractional (sub-pel)
picture accuracy. Sub-pel pixel prediction is used because motion
during two instances of time (the current and reference frames'
capture times) can correspond to a sub-pel pixel position in pixel
coordinates, and generating different prediction data
corresponding to each sub-pel pixel position allows for the
possibility of conditioning the prediction signal to better match
the signal in the current PU.
[0025] The temporal prediction may use adaptive sub-pel pixel
interpolation for the PU. In this case, different interpolation
filters 108 may be used to determine the sub-pel pixel values.
Interpolation filters 108 include taps that weight full-pel pixel
values with coefficient values that are used to determine the
sub-pel pixel values for different sub-pel pixel positions, such as
half-pel and quarter pel positions. When a different interpolation
filter 108 is used, the interpolation filter may use different
values for coefficients and/or a different number of taps.
[0026] Encoder 102 and decoder 104 need to know which interpolation
filter 108 to use in encoding and decoding a unit of video content,
such as a PU. Particular embodiments may use an implicit signaling
method for determining which interpolation filter 108 to use to
interpolate sub-pel pixel values. In one embodiment, information
already available to both encoder 102 and decoder 104 is used to
determine which interpolation filter 108 to use. For example, a
filter determiner 106 in either encoder 102 or decoder 104 receives
a set of interpolation filters with mappings between the
interpolation filters and a coding parameter used in the
compression process. For example, the set of interpolation filters
may be installed or stored in memory of encoder 102 and decoder
104. Filter determiner 106 uses the coding parameter used in the
compression process to determine an interpolation filter 108 based
on the mapping. For example, filter determiner 106 may use PU size
or a prediction index (for uni-prediction or bi-prediction mode) to
determine which interpolation filter 108 to use. The mappings may
be already known to encoder 102 and decoder 104 before the encoding
of PU or decoding of bits for the PU. Because encoder 102 and
decoder 104 use information already known to encoder 102 or decoder
104 to determine the appropriate interpolation filter 108, the
interpolation filter decision is implicitly determined without
requiring explicit communication between encoder 102 and decoder
104 for encoding and decoding the unit of video content.
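The implicit selection described above can be pictured as a lookup into a table that both sides hold in advance. In the hypothetical Python sketch below, the mapping, filter names, and tap values are all illustrative assumptions; the point is only that encoder 102 and decoder 104 evaluate the same deterministic lookup from already-decoded state, so no filter choice is transmitted.

```python
# Hypothetical sketch of implicit filter selection: encoder and
# decoder hold the same prediction-index -> filter mapping, so the
# choice never needs to be signaled per PU. All names and tap values
# here are illustrative, not taken from any standard.

FILTER_A = (-1, 4, -11, 40, 40, -11, 4, -1)  # e.g., a wider-band filter
FILTER_B = (1, -5, 20, 20, -5, 1)            # e.g., a shorter filter

PREDICTION_INDEX_TO_FILTER = {
    0: FILTER_A,  # e.g., uni-prediction from reference list 0
    1: FILTER_B,  # e.g., uni-prediction from reference list 1
}

def select_filter(prediction_index, mapping=PREDICTION_INDEX_TO_FILTER):
    # Deterministic: the same decoded prediction index yields the
    # same filter on both sides, with no explicit signaling.
    return mapping[prediction_index]

assert select_filter(0) is FILTER_A
assert select_filter(1) is FILTER_B
```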
[0027] Although the implicit signaling method may be used, it may
be useful to explicitly have communication between encoder 102 and
decoder 104 at certain times. For example, an explicit
communication from encoder 102 to decoder 104 is used to determine
which interpolation filter 108 to use. In one example, a filter
signaling block 110-1 in encoder 102 may communicate a filter
signaling block 110-2 in decoder 104. The communication between
encoder 102 and decoder 104 may vary, such as the communications
may be the mappings themselves, an update to the mappings, or
information for use in determining which interpolation filter to
use based on the mappings. The explicit signaling may be
communicated using high level syntax(es) in a sequence, picture, or
slice header. The signaling may be performed during an effective
period of the high level syntax(es).
Encoder and Decoder Examples
[0028] FIG. 2A depicts an example of an encoder 102 according to
one embodiment. A general operation of encoder 102 will now be
described. It will be understood that variations on the encoding
process described will be appreciated by a person skilled in the
art based on the disclosure and teachings herein.
[0029] For a current PU, x, a prediction PU, x', is obtained
through either spatial prediction or temporal prediction. The
prediction PU is then subtracted from the current PU, resulting in
a residual PU, e. A spatial prediction block 204 may include
different spatial prediction directions per PU, such as horizontal,
vertical, 45-degree diagonal, 135-degree diagonal, DC (flat
averaging), and planar.
[0030] A temporal prediction block 206 performs temporal prediction
through a motion estimation and motion compensation operation. The
motion estimation operation searches for a best match prediction
for the current PU over reference pictures. The best match
prediction is described by a motion vector (MV) and associated
reference picture (refIdx). The motion vector and associated
reference picture are included in the coded bit stream.
[0031] Transform block 207 performs a transform operation with the
residual PU, e. Transform block 207 outputs the residual PU in a
transform domain, E.
[0032] A quantizer 208 then quantizes the transform coefficients of
the residual PU, E. Quantizer 208 converts the transform
coefficients into a finite number of possible values. Entropy
coding block 210 entropy encodes the quantized coefficients, which
results in final compression bits to be transmitted. Different
entropy coding methods may be used, such as context-adaptive
variable length coding (CAVLC) or context-adaptive binary
arithmetic coding (CABAC).
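The quantization step described above can be pictured as a uniform scalar quantizer. The sketch below is a simplification (real codecs use quantization matrices and rate-distortion-tuned rounding), and the step size is chosen arbitrarily for illustration.

```python
# Simplified uniform scalar quantizer for transform coefficients.
# Real encoders use more elaborate scaling and rounding; this only
# illustrates mapping coefficients onto a finite set of levels.

def quantize(coeffs, step):
    # Each coefficient collapses to the nearest integer level.
    return [int(round(c / step)) for c in coeffs]

def dequantize(levels, step):
    # The decoder can only recover the level centers, which is
    # where the (lossy) quantization error is introduced.
    return [level * step for level in levels]

coeffs = [103.0, -47.0, 8.0, 2.0]
levels = quantize(coeffs, step=10)
print(levels)                  # [10, -5, 1, 0]
print(dequantize(levels, 10))  # [100, -50, 10, 0]
```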
[0033] Also, in a decoding process within encoder 102, a
de-quantizer 212 de-quantizes the quantized transform coefficients
of the residual PU. De-quantizer 212 then outputs the de-quantized
transform coefficients, E'. An inverse transform block 214 receives
the de-quantized transform coefficients, which are then inverse
transformed resulting in a reconstructed residual PU, e'. The
reconstructed residual PU, e', is then added to the corresponding
prediction, x', either spatial or temporal, to form the new
reconstructed PU, x''. A loop filter 216 performs de-blocking on
the reconstructed PU, x'', to reduce blocking artifacts.
Additionally, loop filter 216 may perform a sample adaptive offset
process after the completion of the de-blocking filter process for
the decoded picture, which compensates for a pixel value offset
between reconstructed pixels and original pixels. Also, loop filter
216 may perform adaptive filtering over the reconstructed PU, which
minimizes coding distortion between the input and output pictures.
Additionally, if the reconstructed pictures are reference pictures,
the reference pictures are stored in a reference buffer 218 for
future temporal prediction.
[0034] Interpolation filter 108 interpolates sub-pel pixel values
for temporal prediction block 206. Filter determiner 106 implicitly
determines interpolation filter 108 as will be described below.
Also, filter signaling block 110-1 explicitly signals information
for use in determining interpolation filter 108 at certain times.
Temporal prediction block 206 then uses the sub-pel pixel values
outputted by interpolation filter 108 to generate a prediction of a
current PU.
[0035] FIG. 2B depicts an example of decoder 104 according to one
embodiment. A general operation of decoder 104 will now be
described. It will be understood that variations on the decoding
process described will be appreciated by a person skilled in the
art based on the disclosure and teachings herein. Decoder 104
receives input bits from encoder 102 for compressed video
content.
[0036] An entropy decoding block 230 performs entropy decoding on
input bits corresponding to quantized transform coefficients of a
residual PU. A de-quantizer 232 de-quantizes the quantized
transform coefficients of the residual PU. De-quantizer 232 then
outputs the de-quantized transform coefficients of the residual PU,
E'. An inverse transform block 234 receives the de-quantized
transform coefficients, which are then inverse transformed
resulting in a reconstructed residual PU, e'.
[0037] The reconstructed residual PU, e', is then added to the
corresponding prediction, x', either spatial or temporal, to form
the new reconstructed PU, x''. A loop filter 236 performs
de-blocking on the
reconstructed PU, x'', to reduce blocking artifacts. Additionally,
loop filter 236 may perform a sample adaptive offset process after
the completion of the de-blocking filter process for the decoded
picture, which compensates for a pixel value offset between
reconstructed pixels and original pixels. Also, loop filter 236 may
perform an adaptive loop filter over the reconstructed PU, which
minimizes coding distortion between the input and output pictures.
Additionally, if the reconstructed pictures are reference pictures,
the reference pictures are stored in a reference buffer 238 for
future temporal prediction.
[0038] The prediction PU, x', is obtained through either spatial
prediction or temporal prediction. A spatial prediction block 240
may receive decoded spatial prediction directions per PU, such as
horizontal, vertical, 45-degree diagonal, 135-degree diagonal, DC
(flat averaging), and planar. The spatial prediction directions are
used to determine the prediction PU, x'.
[0039] Interpolation filter 108 interpolates sub-pel pixel values
for input into a temporal prediction block 242. Filter determiner
106 implicitly determines interpolation filter 108 as will be
described below. Also, filter signaling block 110-2 receives
signaled information for use in determining interpolation filter 108
at certain times. Temporal prediction block 242 performs temporal
prediction using decoded motion vector information and interpolated
sub-pel pixel values outputted by interpolation filter 108 in a
motion compensation operation. Temporal prediction block 242
outputs the prediction PU, x'.
Interpolation Filter Selection
[0040] To estimate and compensate fractional-pel (sub-pel)
displacements, an image signal at each sub-pel position is generated
by an interpolation process. FIG. 3 depicts positions of half-pel and
quarter-pel pixels between full-pel pixels along a pixel line
within an image according to one embodiment. For example, the pixel
line may be along a row or column of an image. Full-pel pixels are
represented by integer pixels and are shown in FIG. 3 as pixels L3,
L2, L1, L0, R0, R1, R2, and R3. H is a half-pel pixel between
full-pel pixels L0 and R0. FL is a sub-pel pixel (quarter-pel
pixel) between full-pel pixels L0 and H and FR is a sub-pel pixel
between half-pel pixel H and full-pel pixel R0.
[0041] The quarter-pel and half-pel pixels may be interpolated
using the values of spatial neighboring full-pel pixels. For
example, the half-pel pixel H may be interpolated using the values
of full-pel pixels L3, L2, L1, L0, R0, R1, R2, and R3. Different
coefficients may also be used to weight the values of the
neighboring pixels and provide different characteristics of
filtering.
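Using the labels from FIG. 3, the quarter-pel values FL and FR can be produced the same way as H, just with asymmetric tap sets (FR's taps mirroring FL's). The coefficients below are HEVC-style quarter-pel values and, like the rounding and clipping, should be read as an illustrative assumption rather than a normative design.

```python
# Quarter-pel interpolation along the FIG. 3 pixel line
# [L3, L2, L1, L0, R0, R1, R2, R3]. Tap values are HEVC-style
# quarter-pel coefficients (sum = 64), used here illustratively.

FL_TAPS = [-1, 4, -10, 58, 17, -5, 1, 0]  # FL: closer to L0
FR_TAPS = list(reversed(FL_TAPS))         # FR mirrors FL

def interpolate(line, taps):
    acc = sum(c * p for c, p in zip(taps, line))
    return max(0, min(255, (acc + 32) >> 6))  # normalize by 64, clip

line = [100] * 8                   # a flat pixel line
print(interpolate(line, FL_TAPS))  # 100: flat input stays flat
print(interpolate(line, FR_TAPS))  # 100
```

Note how the asymmetric taps weight L0 (coefficient 58) much more heavily than R0 (coefficient 17) for FL, matching FL's position closer to L0 on the pixel line.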
[0042] The number of taps and coefficient values may be varied. For
example, to determine an ideal sub-pixel value, interpolation
filters 108 may be linear in phase with a unity gain. The constant
(unity) gain is required to make sure the filters do not distort
the signal. However, in practice, it may not be possible to have a
constant unity gain for all frequencies. Thus, one of the goals in
designing interpolation filters 108 is a linear-phase filter (with
an appropriate phase slope, or group delay) whose gain response is
as flat and as wide as possible. Different
trade-offs between flatness and wideness of a frequency response
result in different sub-pixel interpolation filters. For example,
since for natural images most of the signal is concentrated at low
and middle frequencies, it may be preferred to have an
interpolation filter 108 that is as flat as possible in the low and
middle frequencies, while the high frequencies may have more
fluctuations. For noise cancellation, it may be preferable that the
sub-pixel interpolation filters 108 attenuate signals where the
noise is present. The shape of the noise may depend on the picture
content and the amount of compression. For example, compression
noise for low quantization regimes tends to be flatter. Due to the
low frequency nature of natural scenes and the possibility of
having high-frequency compression noise, it may be desirable to
have a low-pass filter with a gain response to minimize the signal
distortion due to compression noise and signal attenuation. A
balance between the flatness and wideness of the sub-pixel
interpolation filters 108 is desired in the interpolation filter
design. Thus, different interpolation filters 108 may be used for
different PUs based on localized characteristics of each PU.
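The flatness/wideness trade-off described above can be inspected numerically by evaluating a candidate filter's frequency response. The sketch below computes the normalized gain |H(e^{jω})| of an 8-tap filter at two frequencies; the tap values are the same illustrative HEVC-style half-pel set used earlier, not a prescribed design.

```python
import cmath

# Evaluate the gain response |H(e^{j*omega})| of an FIR interpolation
# filter, normalized so the DC gain is 1.0. A response that stays flat
# over low and middle frequencies causes little signal distortion
# there, while attenuation at high frequencies suppresses noise.

def gain(taps, omega):
    h = sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(taps))
    return abs(h) / sum(taps)

taps = [-1, 4, -11, 40, 40, -11, 4, -1]  # illustrative half-pel taps

print(round(gain(taps, 0.0), 3))       # 1.0 (unity gain at DC)
print(round(gain(taps, cmath.pi), 3))  # 0.0 (null at Nyquist)
```

Sweeping omega between these endpoints would show how quickly the gain rolls off, which is exactly the flat-versus-wide trade-off at stake when mapping different filters to different PUs.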
[0043] As discussed above, particular embodiments use implicit
signaling to determine which interpolation filter 108 to use. FIG.
4 shows a more detailed example of filter determiner 106 according
to one embodiment. As shown, filter determiner 106 receives a
mapping between the interpolation filters 108 and coding parameters
associated with the compression process. For example, the mapping
may be between interpolation filters 108 and a PU size or a
prediction index. Filter determiner 106 then implicitly decides
which interpolation filter 108 to use based on the mapping between
the interpolation filter 108 and the characteristic. The determined
interpolation filter 108 is then output and used to interpolate
sub-pel pixel values.
Interpolation Filter Selection Using PU Size
[0044] As discussed above, the coding parameter used may be PU size
or a prediction index. FIG. 5 shows a simplified flowchart 500 of a
method for determining an interpolation filter 108 using PU size
according to one embodiment. At 502, filter determiner 106
determines a set of interpolation filters 108. For example, a
specific set of interpolation filters 108 may be provided for
half-pel pixel interpolation and another specific set of
interpolation filters 108 is provided for quarter-pel pixel
interpolation. Each interpolation filter 108 may use different
interpolation filter settings; for example, the number of taps and
the values of the coefficients may vary. When different interpolation
filters 108 are referred to, this may mean that different settings
are used in interpolation filters 108.
[0045] At 504, filter determiner 106 determines mappings between
interpolation filters 108 and different PU sizes. The PU size may
be the size of the PU being encoded or decoded. Although a PU is
discussed, other units of video content may be used (e.g., a coding
unit size). The mapping may map, for each sub-pel pixel position,
an interpolation filter 108 to a PU size. In one example, the
number of interpolation filters 108 used may be as large as the
number of possible PU sizes. In one example, an interpolation
filter 108 may be mapped to a PU size of a width or height of the
PU that is less than or equal to 8 pixels. An example of the
mapping will be described in more detail below.
[0046] At 506, filter determiner 106 determines a PU size of a
current PU being encoded or decoded. For example, decoder 104 may
determine the PU size from the input bitstream. Also, encoder 102
may choose the PU size based on an evaluation of performance, video
content, or some other characteristic. Using the PU size, at 508,
filter determiner 106 determines the interpolation filter
associated with the PU size in the mapping, for example, the
coefficient values used to interpolate a sub-pel pixel
value.
[0047] Particular embodiments may use the size of a current PU to
determine the interpolation filter 108 because PU size may be an
indication of the characteristics of the reference block when the
quality of coded pictures is consistent for consecutive frames. In
other words, the compression quality from frame to frame may not
generally change much. In this case, smaller PU sizes may use
wider-band low pass interpolation filters 108 and larger PU sizes
may use narrower-band low pass interpolation filters 108. One
reason the wider-band low pass interpolation filter 108 is used
for smaller PU sizes is that a smaller PU size is typically
selected for particular reasons. For example, a bit budget may be
high, so it is justifiable to send extra prediction mode
information for smaller PU sizes. Using the high bit-budget
assumption, the
quantization noise of the reference block may be negligible given
the consistent quality of consecutive frames. In these cases, there
is not much noise due to quantization meaning that there will not
be much benefit from narrower band interpolation filters 108. In
this case, a wider band interpolation filter may be preferred to
preserve the possible high frequency content of the signal. Also,
when smaller PU sizes are selected, the image may contain
considerable contrast that goes through inconsistent motion.
Because high contrast usually implies high frequency content, a
wider interpolation filter 108 may be suitable in this case. The
opposite argument can be used to justify the use of a narrower band
interpolation filter 108 for larger PU sizes. Although the above
justifications are described, it will be understood that other
justifications may be used to select different interpolation
filters 108.
[0048] One example of different interpolation filters 108 to use
based on PU size is described. The coefficients are used to weight
the values of integer pixels depicted in FIG. 3. In one example
using PU size, if PU width is no larger than 8 pixels or PU height
is no larger than 8 pixels, then sub-pel pixels FL and FR are
interpolated using values of spatially neighboring full-pel pixels
L3, L2, L1, L0, R0, R1, R2, and R3 as follows:
FL=(-1*L3+4*L2-10*L1+57*L0+19*R0-7*R1+3*R2-1*R3+32)>>6;
FR=(-1*L3+3*L2-7*L1+19*L0+57*R0-10*R1+4*R2-1*R3+32)>>6;
The quarter-pel positions (FL and FR) are calculated by applying
coefficient values to the full-pel pixel values (L3, L2, L1, L0,
R0, R1, R2, and R3). For example, a full-pel pixel value is
multiplied by a corresponding coefficient value. Then, the results
of the multiplied values are added together. A value of "32" is
added and the result is right shifted by "6". Adding "32" and
shifting to the right (6 bits) is equivalent to adding 32 and then
dividing by 64 with truncation, which rounds the quotient to the
nearest integer. Other operations are
also contemplated. For example, a different truncating operation
may be performed.
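The arithmetic above can be sketched in a few lines. This is an
illustrative sketch, not code from the source; the function name and
the list ordering [L3, L2, L1, L0, R0, R1, R2, R3] are assumptions
made for the example.

```python
# Quarter-pel coefficient sets from the equations above
# (PU width <= 8 or PU height <= 8).
FL_COEFFS = [-1, 4, -10, 57, 19, -7, 3, -1]   # sub-pel left of half-pel
FR_COEFFS = [-1, 3, -7, 19, 57, -10, 4, -1]   # sub-pel right of half-pel

def interpolate(pixels, coeffs):
    """Weight the eight full-pel values, add 32, and right-shift by 6,
    i.e., divide by 64 with rounding, as in the equations above."""
    acc = sum(c * p for c, p in zip(coeffs, pixels))
    return (acc + 32) >> 6

# On a flat region every sub-pel value equals the full-pel value,
# because each coefficient set sums to 64 (unity gain after the shift).
flat = [100] * 8
print(interpolate(flat, FL_COEFFS))  # 100
print(interpolate(flat, FR_COEFFS))  # 100
```

Note that the coefficients in each set sum to 64, so the add-32 and
shift-by-6 step normalizes the weighted sum back to the pixel range.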
[0049] If the PU width is larger than 8 pixels and PU height is
larger than 8 pixels, sub-pel pixels FL and FR are interpolated
using the following interpolation filter values:
FL=(-1*L3+3*L2-9*L1+57*L0+18*R0-6*R1+2*R2-0*R3+32)>>6;
FR=(-0*L3+2*L2-6*L1+18*L0+57*R0-9*R1+3*R2-1*R3+32)>>6;
[0050] Table 1 summarizes the filter coefficients used as
follows:
TABLE-US-00001

TABLE 1
CONDITION                      POSITION  COEFFICIENTS
PUWIDTH <= 8 OR PUHEIGHT <= 8  FL        {-1, 4, -10, 57, 19, -7, 3, -1}
PUWIDTH <= 8 OR PUHEIGHT <= 8  FR        {-1, 3, -7, 19, 57, -10, 4, -1}
PUWIDTH > 8 AND PUHEIGHT > 8   FL        {-1, 3, -9, 57, 18, -6, 2, 0}
PUWIDTH > 8 AND PUHEIGHT > 8   FR        {0, 2, -6, 18, 57, -9, 3, -1}
[0051] In one implementation, the operations in which a coefficient
is 0, such as "0*R3" or "0*L3", may be skipped. That means only a
7-tap interpolation filter is needed if PU width is larger than 8
pixels and PU height is larger than 8 pixels because one of the
taps has a coefficient of 0 for both of these conditions.
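The PU-size-based selection can be sketched as follows. This is a
minimal illustration of the conditions described above; the dictionary
layout and function names are assumptions, not from the source.

```python
# Coefficient sets keyed by sub-pel position, per the two PU-size cases.
SMALL_PU = {"FL": [-1, 4, -10, 57, 19, -7, 3, -1],
            "FR": [-1, 3, -7, 19, 57, -10, 4, -1]}
LARGE_PU = {"FL": [-1, 3, -9, 57, 18, -6, 2, 0],
            "FR": [0, 2, -6, 18, 57, -9, 3, -1]}

def select_filter(pu_width, pu_height, position):
    """Pick the coefficient set implied by the PU dimensions."""
    if pu_width <= 8 or pu_height <= 8:
        return SMALL_PU[position]
    return LARGE_PU[position]

def effective_taps(coeffs):
    """Count taps with a nonzero coefficient; zero-coefficient taps can
    be skipped, so the large-PU filters need only 7 multiplications."""
    return sum(1 for c in coeffs if c != 0)

print(effective_taps(select_filter(16, 16, "FL")))  # 7
print(effective_taps(select_filter(8, 16, "FL")))   # 8
```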
Interpolation Filter Selection Using a Prediction Index
[0052] In another embodiment, interpolation filter 108 may be
determined based on a prediction index. The prediction index may
correspond to a reference list. For prediction of a current PU, a
number of prediction
blocks may be used. The prediction blocks may differ and may be
signaled by motion vectors in reference to a reference picture. The
weighted averaging of these prediction blocks may be used to
determine a final prediction for a current PU. The different
choices for the prediction block may be specified in an indexing
scheme. For example, with n prediction block choices, any of the
2.sup.n-1 possible combinations of prediction blocks may be used.
Each combination may be defined by a prediction combination index
(PCI).
In one embodiment, filter determiner 106 may use the prediction
combination index to determine interpolation filter 108. Different
interpolation filters 108 may be selected based on the prediction
combination index determined, and also for different sub-pel pixel
positions. Although the prediction combination index is described,
other information based on the prediction index may be used.
[0053] FIG. 6 depicts a simplified flowchart 600 of a method for
determining an interpolation filter 108 using a prediction index
according to one embodiment. At 602, filter determiner 106
determines a set of interpolation filters 108. Also, at 604, filter
determiner 106 determines a mapping between interpolation filters
108 and different prediction index combinations. For example, each
prediction combination index may be associated with a different
interpolation filter 108.
[0054] At 606, filter determiner 106 determines a PCI associated
with a current PU. For example, the PCI may be defined for a
current PU being encoded or decoded. At 608, filter determiner 106
determines an interpolation filter 108 for the PCI. For example,
the mapping from each PCI to an interpolation filter 108 is used to
determine the interpolation filter. In one example, uni-prediction
mode may have a choice between two prediction indexes. Based on
which of a first index and a second index is selected, different
interpolation filters 108 may be determined. Also, with
bi-prediction, two prediction indices may be selected. For a third
index, a first interpolation filter may be selected, and for the
fourth index, a second interpolation filter may be selected.
Different interpolation filters for the first, second, third, and
fourth indices in the uni-prediction and the bi-prediction may be
used.
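The mapping from a PCI to an interpolation filter might be sketched
as a simple table lookup. The indices and filter names below are
hypothetical placeholders for illustration; the source does not
specify concrete values here.

```python
# Hypothetical mapping from a prediction combination index (PCI) to an
# interpolation filter per sub-pel position, as the flowchart describes.
PCI_TO_FILTER = {
    0: {"FL": "filter_a", "FR": "filter_b"},   # e.g., uni, first index
    1: {"FL": "filter_c", "FR": "filter_d"},   # e.g., uni, second index
    2: {"FL": "filter_e", "FR": "filter_f"},   # e.g., bi-prediction
}

def determine_filter(pci, position):
    """Look up the interpolation filter for a PCI and sub-pel position."""
    return PCI_TO_FILTER[pci][position]

print(determine_filter(1, "FL"))  # filter_c
```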
[0055] One example of using the prediction index to determine
interpolation filter 108 will now be described, but other examples
may also be appreciated. In one embodiment, in HEVC, there may be
up to two reference blocks for each PU (bi-prediction mode). In one
embodiment, the reference blocks would be found on two different
prediction index lists. For example, a first prediction index is
used to index reference blocks located before the PU in time and a
second prediction index is used to index reference blocks located
after the PU in time. If there are two reference blocks, then an
interpolation filter 108 with a wider pass-band may be used to
include more of the signal high frequencies. If there are two
reference blocks, then it is also possible to have different
interpolation filters 108 for each reference block. Although wider
pass-band interpolation filters 108 pass more coding noise due to
the averaging of the prediction block to form the final prediction
for the current PU, the noise would be attenuated while the signal
would be unchanged after averaging. In another case, if there is
only one prediction block for the current PU (uni-prediction mode)
and that prediction block comes from a first reference list (e.g.,
a reference list that includes pictures that are located before the
PU in time), then filter determiner 106 applies an interpolation
filter 108 that is narrower than the above wider-pass band
interpolation filters 108 to cancel more noise. In a third case, if
there is only one prediction block for a current PU and that
prediction block comes from a second reference list (e.g., a
reference list that includes pictures that are located after the PU
in time), then filter determiner 106 applies another interpolation
filter 108 that is even narrower than in the second case to cancel
even more noise.
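The three cases above can be summarized in a short sketch: two
prediction blocks (bi-prediction) use the widest pass-band, a single
block from the first reference list a narrower one, and a single
block from the second reference list the narrowest. The band labels
and function signature are illustrative assumptions.

```python
def select_band(num_prediction_blocks, reference_list=None):
    """Choose a filter bandwidth from the prediction configuration.
    Averaging two predictions attenuates noise, so a wider band can be
    used; single-list prediction progressively narrows the band."""
    if num_prediction_blocks == 2:   # bi-prediction: averaging cancels noise
        return "wide"
    if reference_list == 0:          # uni-prediction, first reference list
        return "narrow"
    return "narrower"                # uni-prediction, second reference list

print(select_band(2))      # wide
print(select_band(1, 0))   # narrow
print(select_band(1, 1))   # narrower
```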
[0056] One example of determining interpolation filters 108 based
on the prediction index is described. For example, the mode of
uni-prediction or bi-prediction is used to determine interpolation
filter 108. If a bi-prediction mode is used, then a certain
interpolation filter is used. However, if a uni-prediction mode is
used for the PU, then a choice of prediction indexes may be made.
Depending on which prediction index is selected, different
interpolation filters 108 are used.
[0057] Accordingly, implicit signaling is used to determine the
interpolation filter. Thus, explicit signaling between an encoder
102 and decoder 104 is not needed to signal the interpolation
filter that should be used. Rather, encoder 102 and decoder 104 use
information that is already available to the encoder and decoder to
make a determination on the interpolation filter that is used.
High Level Adaptive Interpolation Filter Signaling
[0058] Even though interpolation filter 108 may be derived
implicitly, particular embodiments use explicit signaling at certain
times. The explicit signaling may be used in conjunction with the
implicit derivation and/or without the implicit derivation. FIG. 7
depicts a simplified flowchart 700 of a method for explicitly
performing adaptive interpolation filter signaling according to one
embodiment. At 702, a mapping between interpolation filters 108 and
characteristics of the compression process is received. For
example, both encoder 102 and decoder 104 receive the mapping. This
mapping may be received for the implicit determination of
interpolation filter 108. As described above, the encoding and
decoding of video content is performed using the mappings.
[0059] At 704, filter signaling block 110-1 determines when to send
a signal including information regarding the mapping. In one
embodiment, filter signaling block 110-1 may send the signal for
each PU, CU, picture, or part of picture that is being encoded.
Also, at other times, such as when the mappings are updated or when
additional information is required for decoding the PU, filter
signaling block 110-1 may send the signal.
[0060] At 706, filter signaling block 110-1 sends the signal to
filter signaling block 110-2 in decoder 104 for use in decoding
video content when it is determined the signal should be sent.
Decoder 104 uses the information to decode a block of video content
by determining a coding parameter, such as PU size or a prediction
index, used in the compression process and determining an
interpolation filter 108 in the set of interpolation filters based
on a mapping between the interpolation filter and the coding
parameter. The information received from filter signaling block
110-1 is used to determine interpolation filter 108. For example,
the selection of the mapping uses information in the signal to
determine which interpolation filter 108 to use. The information
may be used for one unit or may be used for multiple units.
[0061] In one example, the information includes a change to the
mapping between interpolation filters 108 in the set of
interpolation filters 108 and coding parameters used in the
compression process. In this case, the mapping that was previously
stored at encoder 102 and decoder 104 is replaced. The new mappings
will then be used thereafter.
[0062] In another example, the information indicates which
interpolation filter 108 in a group of interpolation filters 108 to
select. For example, decoder 104 uses a mapping to determine a
filter family (multiple interpolation filters 108). Then, the
information explicitly sent indicates which interpolation filter
108 in the filter family should be used. In another example, the
information is used to select the family of interpolation
filters.
[0063] Particular embodiments may be implemented in a
non-transitory computer-readable storage medium for use by or in
connection with the instruction execution system, apparatus,
system, or machine. The computer-readable storage medium contains
instructions for controlling a computer system to perform a method
described by particular embodiments. The instructions, when
executed by one or more computer processors, may be operable to
perform that which is described in particular embodiments.
[0064] As used in the description herein and throughout the claims
that follow, "a", "an", and "the" includes plural references unless
the context clearly dictates otherwise. Also, as used in the
description herein and throughout the claims that follow, the
meaning of "in" includes "in" and "on" unless the context clearly
dictates otherwise.
[0065] The above description illustrates various embodiments of the
present invention along with examples of how aspects of the present
invention may be implemented. The above examples and embodiments
should not be deemed to be the only embodiments, and are presented
to illustrate the flexibility and advantages of the present
invention as defined by the following claims. Based on the above
disclosure and the following claims, other arrangements,
embodiments, implementations and equivalents may be employed
without departing from the scope of the invention as defined by the
claims.
* * * * *