U.S. patent application number 11/251917, published on 2006-04-20 as Publication No. 20060083300, concerns video coding and decoding methods using interlayer filtering and a video encoder and decoder using the same. This patent application is currently assigned to SAMSUNG ELECTRONICS CO., LTD. Invention is credited to Sang-chang Cha, Ho-jin Ha, Woo-jin Han, Bae-keun Lee, Jae-young Lee, and Kyo-hyuk Lee.
Publication Number | 20060083300 |
Application Number | 11/251917 |
Document ID | / |
Family ID | 36748181 |
Publication Date | 2006-04-20 |
United States Patent Application | 20060083300 |
Kind Code | A1 |
Han; Woo-jin; et al. | April 20, 2006 |
Video coding and decoding methods using interlayer filtering and
video encoder and decoder using the same
Abstract
Video coding and decoding methods using interlayer filtering and
a video encoder and decoder using the same are provided.
Multi-layer video coding employing a plurality of video coding
algorithms can improve video coding efficiency by using interlayer
filtering. The video coding method includes encoding video frames
using a first video coding scheme, performing interlayer filtering
on the frames encoded by the first video coding scheme, encoding
the video frames using a second video coding scheme by referring to
the frames subjected to the interlayer filtering, and generating a
bitstream containing the frames encoded by the first and second
video coding schemes.
Inventors: | Han; Woo-jin; (Suwon-si, KR); Lee; Bae-keun; (Bucheon-si, KR); Lee; Jae-young; (Suwon-si, KR); Cha; Sang-chang; (Hwaseong-si, KR); Ha; Ho-jin; (Seoul, KR); Lee; Kyo-hyuk; (Seoul, KR) |
Correspondence Address: |
SUGHRUE MION, PLLC
2100 PENNSYLVANIA AVENUE, N.W.
SUITE 800
WASHINGTON, DC 20037
US |
Assignee: | SAMSUNG ELECTRONICS CO., LTD. |
Family ID: | 36748181 |
Appl. No.: | 11/251917 |
Filed: | October 18, 2005 |
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number |
60619023 | Oct 18, 2004 | |
Current U.S. Class: | 375/240.03; 375/240.12; 375/240.19; 375/240.21; 375/240.25; 375/E7.031; 375/E7.09; 375/E7.135; 375/E7.161; 375/E7.186; 375/E7.19; 375/E7.211; 375/E7.252; 375/E7.253 |
Current CPC Class: | H04N 19/172 20141101; H04N 19/70 20141101; H04N 19/115 20141101; H04N 19/164 20141101; H04N 19/36 20141101; H04N 19/13 20141101; H04N 19/15 20141101; H04N 19/59 20141101; H04N 19/187 20141101; H04N 19/136 20141101; H04N 19/149 20141101; H04N 19/177 20141101; H04N 19/61 20141101; H04N 19/132 20141101; H04N 19/34 20141101; H04N 19/577 20141101; H04N 19/517 20141101; H04N 19/31 20141101; H04N 19/162 20141101; H04N 19/46 20141101; H04N 19/63 20141101; H04N 19/587 20141101; H04N 19/154 20141101; H04N 19/86 20141101; H04N 19/615 20141101; H04N 19/117 20141101; H04N 19/152 20141101; H04N 19/127 20141101 |
Class at Publication: | 375/240.03; 375/240.19; 375/240.12; 375/240.21; 375/240.25 |
International Class: | H04N 11/04 20060101 H04N011/04; H04N 7/12 20060101 H04N007/12; H04B 1/66 20060101 H04B001/66; H04N 11/02 20060101 H04N011/02 |
Foreign Application Data

Date | Code | Application Number |
Nov 18, 2004 | KR | 10-2004-0094597 |
Claims
1. A video coding method comprising: encoding video frames using a
first video coding scheme; performing interlayer filtering on the
frames encoded by the first video coding scheme; encoding the video
frames using a second video coding scheme by referring to the
frames subjected to the interlayer filtering; and generating a
bitstream containing the frames encoded by the first and second
video coding schemes.
2. The video coding method of claim 1, wherein the first video
coding scheme is an Advanced Video Coding (AVC) coding scheme and
the second video coding scheme is a wavelet coding scheme.
3. The video coding method of claim 2, wherein the interlayer
filtering includes upsampling the AVC-coded frames using a wavelet
filter and downsampling the upsampled frames using an MPEG
filter.
4. The video coding method of claim 1, wherein the encoding of the
video frames using the second video coding scheme comprises:
performing Motion Compensation Temporal Filtering (MCTF) on the
video frames by referring to the interlayer filtered frames;
performing wavelet transform on the temporally filtered frames; and
quantizing the wavelet-transformed frames.
5. The video coding method of claim 4, wherein the performing of
MCTF comprises: generating prediction frames for the video frames
by referring to the interlayer filtered frames; smoothing the
prediction frames; generating residual frames for the video frames
using the smoothed prediction frames; and updating the video frames
using the residual frames.
6. A video coding method comprising: downsampling video frames to
generate video frames of low resolution; encoding the
low-resolution video frames using a first video coding scheme;
upsampling the frames encoded by the first video coding scheme to
the resolution of the video frames; performing interlayer filtering
on the upsampled frames; encoding the video frames using a second
video coding scheme by referring to the frames subjected to the
interlayer filtering; and generating a bitstream containing the
frames encoded by the first and second video coding schemes.
7. The video coding method of claim 6, wherein the first video
coding scheme is an Advanced Video Coding (AVC) coding scheme and
the second video coding scheme is a wavelet coding scheme.
8. The video coding method of claim 7, wherein the interlayer
filtering includes upsampling the AVC-coded frames using a wavelet
filter and downsampling the upsampled frames using an MPEG
filter.
9. The video coding method of claim 6, wherein the encoding of the
video frames using the second video coding scheme comprises:
performing Motion Compensation Temporal Filtering (MCTF) on the
video frames by referring to the interlayer filtered frames;
performing wavelet transform on the temporally filtered frames; and
quantizing the wavelet-transformed frames.
10. The video coding method of claim 9, wherein the performing of
MCTF comprises: generating prediction frames for the video frames
by referring to the interlayer filtered frames; smoothing the
prediction frames; generating residual frames for the video frames
using the smoothed prediction frames; and updating the video frames
using the residual frames.
11. A video encoder comprising: a first video coding unit which
encodes video frames using a first video coding scheme; an
interlayer filter which performs interlayer filtering on the frames
encoded by the first video coding scheme; a second video coding
unit which encodes the video frames using a second video coding
scheme by referring to the interlayer filtered frames; and a
bitstream generator which generates a bitstream containing the
frames encoded by the first and second video coding schemes.
12. The video encoder of claim 11, wherein the first video coding
scheme is an Advanced Video Coding (AVC) coding scheme and the
second video coding scheme is a wavelet coding scheme.
13. The video encoder of claim 12, wherein the interlayer filtering
includes upsampling the AVC-coded frames using a wavelet filter and
downsampling the upsampled frames using an MPEG filter.
14. The video encoder of claim 11, wherein the second video coding
unit comprises: a temporal filter which performs Motion
Compensation Temporal Filtering (MCTF) on the video frames by
referring to the interlayer filtered frames; a wavelet transformer
which performs a wavelet transform on the temporally filtered
frames; and a quantizer which quantizes the wavelet-transformed
frames.
15. The video encoder of claim 14, wherein the temporal filter
comprises: a prediction frame generator which generates prediction
frames for the video frames by referring to the interlayer filtered
frames; a prediction frame smoother which smoothes the prediction
frames; a residual frame generator which generates residual frames
for the video frames using the smoothed prediction frames; and an
updater which updates the video frames using the residual
frames.
16. A video encoder comprising: a downsampler which downsamples
video frames to generate video frames of a low resolution; a first
video coding unit which encodes the low-resolution video frames
using a first video coding scheme; an upsampler which upsamples the
frames encoded by the first video coding scheme; an interlayer
filter which performs interlayer filtering on the upsampled frames;
a second video coding unit which encodes the video frames using a
second video coding scheme by referring to the interlayer filtered
frames; and a bitstream generator which generates a bitstream
containing the frames encoded by the first and second video coding
schemes.
17. The video encoder of claim 16, wherein the first video coding
scheme is an Advanced Video Coding (AVC) coding scheme and the
second video coding scheme is a wavelet coding scheme.
18. The video encoder of claim 17, wherein the interlayer filtering
includes upsampling the AVC-coded frames using a wavelet filter and
downsampling the upsampled frames using an MPEG filter.
19. The video encoder of claim 16, wherein the second video coding
unit comprises: a temporal filter which performs Motion
Compensation Temporal Filtering (MCTF) on the video frames by
referring to the interlayer filtered frames; a wavelet transformer
which performs wavelet transform on the temporally filtered frames;
and a quantizer which quantizes the wavelet-transformed frames.
20. The video encoder of claim 19, wherein the temporal filter
comprises: a prediction frame generator which generates prediction
frames for the video frames by referring to the interlayer filtered
frames; a prediction frame smoother which smoothes the prediction
frames; a residual frame generator which generates residual frames
for the video frames using the smoothed prediction frames; and an
updater which updates the video frames using the residual
frames.
21. A video decoding method comprising: extracting frames encoded
by first and second video coding schemes from a bitstream; decoding
the frames encoded by the first video coding scheme using a first
video decoding scheme and reconstructing first layer frames;
performing interlayer filtering on the reconstructed first layer
frames; and decoding the frames encoded by the second video coding
scheme using a second video decoding scheme by referring to the
interlayer filtered first layer frames and reconstructing second
layer frames.
22. The video decoding method of claim 21, wherein the first video
coding and decoding scheme is the Advanced Video Coding (AVC)
scheme and the second video coding and decoding scheme is the
wavelet scheme.
23. The video decoding method of claim 22, wherein the interlayer
filtering includes upsampling the AVC-decoded frames using a
wavelet filter and downsampling the upsampled frames using an MPEG
filter.
24. The video decoding method of claim 21, wherein the decoding of
the frames using the second video decoding scheme comprises:
inversely quantizing the frames encoded by the second video coding
scheme; performing an inverse wavelet transform on the inversely
quantized frames; and performing inverse temporal filtering on the
inversely wavelet-transformed frames using Motion Compensation
Temporal Filtering (MCTF) by referring to the interlayer filtered
first layer frames.
25. The video decoding method of claim 24, wherein the performing
of inverse temporal filtering on the inversely wavelet-transformed
frames using MCTF comprises: inversely updating the inversely
wavelet-transformed frames; generating prediction frames by
referring to the inversely updated frames and the interlayer
filtered first layer frames; smoothing the prediction frames; and
reconstructing the second layer frames by referring to the
inversely updated frames and the smoothed prediction frames.
26. A video decoding method comprising: extracting frames encoded
by first and second video coding schemes from a bitstream;
decoding the frames encoded by the first video coding scheme using
a first video decoding scheme and reconstructing first layer
frames; upsampling the reconstructed first layer frames; performing
interlayer filtering on the upsampled first layer frames; and
decoding the frames encoded by the second video coding scheme using
a second video decoding scheme by referring to the interlayer
filtered first layer frames and reconstructing second layer
frames.
27. The video decoding method of claim 26, wherein the first video
coding and decoding scheme is the Advanced Video Coding (AVC)
scheme and the second video coding and decoding scheme is the
wavelet scheme.
28. The video decoding method of claim 27, wherein the interlayer
filtering includes upsampling the AVC-decoded frames using a
wavelet filter and downsampling the upsampled frames using an MPEG
filter.
29. The video decoding method of claim 26, wherein the decoding of
the frames encoded by the second video coding scheme comprises:
inversely quantizing the frames encoded by the second video coding
scheme; performing an inverse wavelet transform on the inversely
quantized frames; and performing inverse temporal filtering on the
inversely wavelet-transformed frames using Motion Compensation
Temporal Filtering (MCTF) by referring to the interlayer filtered
first layer frames.
30. The video decoding method of claim 29, wherein the performing
of inverse temporal filtering on the inversely wavelet-transformed
frames using MCTF comprises: inversely updating the inversely
wavelet-transformed frames; generating prediction frames by
referring to the inversely updated frames and the interlayer
filtered first layer frames; smoothing the prediction frames; and
reconstructing the second layer frames by referring to the
inversely updated frames and the smoothed prediction frames.
31. A video decoder comprising: a bitstream interpreter which
extracts from a bitstream frames encoded by first and second video
coding schemes; a first video decoding unit which decodes the
frames encoded by the first video coding scheme using a first video
decoding scheme and reconstructs first layer frames; an interlayer
filter which performs interlayer filtering on the reconstructed
first layer frames; and a second video decoding unit which decodes
the frames encoded by the second video coding scheme using a second
video decoding scheme by referring to the interlayer filtered first
layer frames and reconstructs second layer frames.
32. The video decoder of claim 31, wherein the first decoding unit
reconstructs the first layer frames using an Advanced Video Coding
(AVC) decoding scheme and the second decoding unit reconstructs the
second layer frames using a wavelet decoding scheme.
33. The video decoder of claim 32, wherein the interlayer filter
upsamples the frames decoded by the AVC decoding scheme using a
wavelet filter and downsamples the upsampled frames using an MPEG
filter.
34. The video decoder of claim 31, wherein the second video
decoding unit comprises: an inverse quantizer which inversely
quantizes the frames encoded by the second video coding scheme; an
inverse wavelet transform unit which performs an inverse wavelet
transform on the inversely quantized frames; and an inverse
temporal filtering unit which performs inverse temporal filtering
on the inversely wavelet-transformed frames by referring to the
interlayer filtered first layer frames using Motion Compensation
Temporal Filtering (MCTF).
35. The video decoder of claim 34, wherein the inverse temporal
filtering unit comprises: an inverse updater which inversely
updates the inversely wavelet-transformed frames; a prediction
frame generator which generates prediction frames by referring to
the inversely updated frames and the interlayer filtered first
layer frames; a prediction frame smoother which smoothes the
prediction frames; and a frame reconstructor which reconstructs the
second layer frames by referring to the inversely updated frames
and the smoothed prediction frames.
36. A video decoder comprising: a bitstream interpreter which
extracts frames encoded by first and second video coding schemes
from a bitstream; a first video decoding unit which decodes the
frames encoded by the first video coding scheme using a first video
decoding scheme and reconstructs first layer frames; an upsampler
which upsamples the reconstructed first layer frames; an interlayer
filter which performs interlayer filtering on the upsampled first
layer frames; and a second video decoding unit which decodes the
frames encoded by the second video coding scheme using a second
video decoding scheme by referring to the interlayer filtered first
layer frames and reconstructs second layer frames.
37. The video decoder of claim 36, wherein the first decoding unit
reconstructs the first layer frames using an Advanced Video Coding
(AVC) decoding scheme and the second decoding unit reconstructs the
second layer frames using a wavelet decoding scheme.
38. The video decoder of claim 37, wherein the interlayer filter
upsamples the upsampled first layer frames using a wavelet filter
and downsamples the upsampled frames using an MPEG filter.
39. The video decoder of claim 36, wherein the second video
decoding unit comprises: an inverse quantizer which inversely
quantizes the frames encoded by the second video coding scheme; an
inverse wavelet transform unit which performs inverse wavelet
transform on the inversely quantized frames; and an inverse
temporal filtering unit which performs inverse temporal filtering
on the inversely wavelet-transformed frames by referring to the
interlayer filtered first layer frames using Motion Compensation
Temporal Filtering (MCTF).
40. The video decoder of claim 39, wherein the inverse temporal
filtering unit comprises: an inverse updater which inversely
updates the inversely wavelet-transformed frames; a prediction
frame generator which generates prediction frames by referring to
the inversely updated frames and the interlayer filtered first
layer frames; a prediction frame smoother which smoothes the
prediction frames; and a frame reconstructor which reconstructs the
second layer frames by referring to the inversely updated frames
and the smoothed prediction frames.
41. A recording medium having a computer readable program recorded
therein, the program for executing a video coding method
comprising: encoding video frames using a first video coding
scheme; performing interlayer filtering on the frames encoded by
the first video coding scheme; encoding the video frames using a
second video coding scheme by referring to the frames subjected to
the interlayer filtering; and generating a bitstream containing the
frames encoded by the first and second video coding schemes.
42. A recording medium having a computer readable program recorded
therein, the program for executing a video coding method
comprising: downsampling video frames to generate video frames of
low resolution; encoding the low-resolution video frames using a
first video coding scheme; upsampling the frames encoded by the
first video coding scheme to the resolution of the video frames;
performing interlayer filtering on the upsampled frames; encoding
the video frames using a second video coding scheme by referring to
the frames subjected to the interlayer filtering; and generating a
bitstream containing the frames encoded by the first and second
video coding schemes.
43. A recording medium having a computer readable program recorded
therein, the program for executing a video decoding method
comprising: extracting frames encoded by first and second video
coding schemes from a bitstream; decoding the frames encoded by the
first video coding scheme using a first video decoding scheme and
reconstructing first layer frames; performing interlayer filtering
on the reconstructed first layer frames; and decoding the frames
encoded by the second video coding scheme using a second video
decoding scheme by referring to the interlayer filtered first layer
frames and reconstructing second layer frames.
44. A recording medium having a computer readable program recorded
therein, the program for executing a video decoding method
comprising: extracting frames encoded by first and second video
coding schemes from a bitstream; decoding the frames encoded by the
first video coding scheme using a first video decoding scheme and
reconstructing first layer frames; upsampling the reconstructed
first layer frames; performing interlayer filtering on the
upsampled first layer frames; and decoding the frames encoded by
the second video coding scheme using a second video decoding scheme
by referring to the interlayer filtered first layer frames and
reconstructing second layer frames.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority from Korean Patent
Application No. 10-2004-0094597 filed on Nov. 18, 2004 in the
Korean Intellectual Property Office, and U.S. Provisional Patent
Application No. 60/619,023 filed on Oct. 18, 2004 in the United
States Patent and Trademark Office, the disclosures of which are
incorporated herein by reference in their entirety.
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] Apparatuses and methods consistent with the present
invention relate to a multi-layer video coding technique, and more
particularly, to a video coding technique using interlayer
filtering.
[0004] 2. Description of the Related Art
[0005] With the development of communication technology, including
the Internet, video communication as well as text and voice
communication has increased. Conventional text communication cannot
satisfy the various demands of users, and thus multimedia services
that can provide various types of information such as text,
pictures, and music have increased. Multimedia data requires a
large capacity storage medium and a wide bandwidth for transmission
since the amount of data is usually large. For example, a 24-bit
true color image having a resolution of 640*480 needs a capacity of
640*480*24 bits, i.e., data of about 7.37 Mbits per frame. When a
video composed of such images is transmitted at a speed of 30
frames per second, a bandwidth of 221 Mbits/sec is required. When a
90-minute movie based on such an image is stored, a storage space
of about 1200 Gbits is required. Accordingly, a compression coding
method is essential for transmitting multimedia data.
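The arithmetic above can be checked with a few lines (an illustrative sketch; the frame size, frame rate, and running time are the values given in the paragraph):

```python
# Reproduce the bandwidth/storage figures from the paragraph above.
WIDTH, HEIGHT, BITS_PER_PIXEL = 640, 480, 24  # 24-bit true color frame
FPS = 30                                      # frames per second
MOVIE_SECONDS = 90 * 60                       # 90-minute movie

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL   # about 7.37 Mbits
bits_per_second = bits_per_frame * FPS             # about 221 Mbits/sec
movie_bits = bits_per_second * MOVIE_SECONDS       # about 1200 Gbits

print(bits_per_frame / 1e6)   # ~7.37
print(bits_per_second / 1e6)  # ~221
print(movie_bits / 1e9)       # ~1194
```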
[0006] A basic principle of multimedia data compression is removing
data redundancy. In other words, video data can be compressed by
removing spatial, temporal and visual redundancy. Spatial
redundancy occurs when the same color or object is repeated in an
image. Temporal redundancy occurs when there is little change
between adjacent frames in a moving image or the same sound is
repeated in audio. The removal of visual redundancy takes into
account the limitations of human eyesight and the limited
perception of high frequencies.
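The temporal-redundancy idea can be illustrated with a toy example (the pixel values below are made up for illustration and are not part of the application):

```python
# Two adjacent frames of a slowly moving scene differ in only a few
# samples, so the residual (difference) frame is mostly zero and
# compresses far better than either frame alone.
frame1 = [10, 10, 10, 50, 50, 10, 10, 10]
frame2 = [10, 10, 10, 10, 50, 50, 10, 10]  # "object" shifted right by one

residual = [b - a for a, b in zip(frame1, frame2)]
nonzero = sum(1 for r in residual if r != 0)
print(residual)  # [0, 0, 0, -40, 0, 40, 0, 0]
print(nonzero)   # only 2 of 8 samples carry information
```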
[0007] FIG. 1 shows an environment in which video compression is
applied.
[0008] Video data is compressed by a video encoder 110. Well-known
Discrete Cosine Transform (DCT)-based video compression algorithms
include MPEG-2, MPEG-4, H.263, and H.264. In recent years,
research into wavelet-based scalable video coding has been actively
conducted. Compressed video data is sent to a video decoder 130 via
a network 120. The video decoder 130 decodes the compressed video
data to reconstruct the original video data.
[0009] The video encoder 110 compresses the original video data so
it does not exceed the available bandwidth of the network 120.
However, communication bandwidth may vary depending on the type of
the network 120. For example, the available communication bandwidth
of Ethernet is different from that of a wireless local area network
(WLAN). A cellular communication network may have a very narrow
bandwidth. Thus, research is being actively conducted into methods
of generating video data at various bit-rates from the same
compressed video data, in particular, scalable video
coding.
[0010] Scalable video coding is a video compression technique that
enables video data to provide scalability. Scalability is the
ability to generate video sequences at different resolutions, frame
rates, and qualities from the same compressed bitstream. Temporal
scalability can be provided using a Motion Compensation Temporal
Filtering (MCTF), Unconstrained MCTF (UMCTF), or Successive
Temporal Approximation and Referencing (STAR) algorithm. Spatial
scalability can be achieved by a wavelet transform algorithm or
multi-layer coding, which have been actively studied in recent
years. Signal-to-Noise Ratio (SNR) scalability can be obtained
using Embedded ZeroTrees Wavelet (EZW), Set Partitioning in
Hierarchical Trees (SPIHT), Embedded ZeroBlock Coding (EZBC), or
Embedded Block Coding with Optimized Truncation (EBCOT).
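As an illustration of how a wavelet transform yields spatial scalability, here is one level of a 1-D Haar transform. The Haar filter is only the simplest possible stand-in for the wavelet filters named above; practical codecs use longer filters such as the 9/7 pair:

```python
# One level of a 1-D Haar wavelet transform. The low-pass band is a
# half-resolution approximation of the signal, which is what gives
# wavelet coding its spatial scalability: a decoder can stop at the
# low band to obtain a smaller picture.
def haar_forward(signal):
    low = [(a + b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
    high = [(a - b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
    return low, high

def haar_inverse(low, high):
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out

row = [8, 10, 12, 14, 20, 18, 6, 4]
low, high = haar_forward(row)
print(low)                              # half-resolution approximation
assert haar_inverse(low, high) == row   # perfect reconstruction
```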
[0011] Multi-layer video coding algorithms have recently been
adopted for scalable video coding. While conventional multi-layer
video coding usually uses a single video coding algorithm,
increasing attention has recently been directed to multi-layer
video coding using a plurality of video coding algorithms.
[0012] FIGS. 2 and 3 illustrate examples of bitstreams generated by
multi-layer video coding.
[0013] Referring to FIG. 2, a video encoder uses both an MPEG-4
Advanced Video Coding (AVC) algorithm offering excellent coding
efficiency and a wavelet coding technique providing excellent
scalability. When a video encoder performs encoding using only
wavelet coding, video quality tends to be significantly degraded at
a low resolution. Thus, the bitstream shown in FIG. 2 contains
AVC-coded lowest-resolution layer frames and highest-resolution
layer frames wavelet-coded using the AVC-coded lowest-layer frames.
The frame used as a reference during encoding is a frame
reconstructed by decoding a frame encoded by AVC coding.
[0014] Referring to FIG. 3, a video encoder uses both wavelet
coding offering excellent scalability and AVC coding providing
high coding efficiency. While the bitstream shown in FIG. 2 has
only two wavelet-coded and AVC-coded layers, the bitstream shown in
FIG. 3 has a wavelet-coded layer and an AVC-coded layer for each
resolution.
[0015] Multi-layer video coding has a problem in that the coding
efficiency of an enhancement layer tends to be low due to
quantization noise in the previously encoded layer (base layer). In
particular, the problem is more severe when multi-layer video
coding uses a plurality of video coding algorithms having different
characteristics. For example, using both DCT-based AVC coding and
wavelet-based coding as shown in FIG. 2 or 3 may degrade the coding
efficiency in a wavelet layer.
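The effect described above can be sketched numerically. The 3-tap average below is only a hypothetical stand-in for the wavelet-upsampling/MPEG-downsampling chain described later in the claims, not the actual filter pair:

```python
# Why filtering the base-layer reference can help the enhancement layer.
original = [float(i) for i in range(16)]   # smooth source signal
# Base-layer reconstruction = source plus alternating quantization noise
base = [x + (1.5 if i % 2 == 0 else -1.5) for i, x in enumerate(original)]

def smooth(sig):
    """3-tap moving average, edges padded by repetition (a stand-in
    for the interlayer filter)."""
    padded = [sig[0]] + sig + [sig[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(sig))]

def residual_energy(ref):
    """Energy left for the enhancement layer to encode."""
    return sum((x - r) ** 2 for x, r in zip(original, ref))

# Predicting from the filtered base leaves a much smaller residual
# than predicting from the noisy base directly.
print(residual_energy(base), residual_energy(smooth(base)))
```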
SUMMARY OF THE INVENTION
[0016] The present invention provides video coding and decoding
methods using interlayer filtering, designed to improve the
efficiency of multi-layer video coding, as well as video encoders
and decoders using the same.
[0017] The above-stated object, as well as other objects, features,
and advantages of the present invention, will become clear to those
skilled in the art upon review of the following description.
[0018] According to an aspect of the present invention, there is
provided a video coding method including encoding video frames
using a first video coding scheme, performing interlayer filtering
on the frames encoded by the first video coding scheme, encoding
the video frames using a second video coding scheme by referring to
the frames subjected to the interlayer filtering, and generating a
bitstream containing the frames encoded by the first and second
video coding schemes.
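For illustration only (not the claimed methods themselves), the two-layer flow above can be sketched end to end; the coarse quantizer and the [1, 2, 1]/4 smoothing kernel are hypothetical stand-ins for the first coding scheme (e.g., AVC), the interlayer filter, and the second coding scheme (e.g., wavelet):

```python
def encode_base(frame, step=8):
    # Stand-in "first coding scheme": coarse quantization (base layer)
    return [round(x / step) * step for x in frame]

def interlayer_filter(frame):
    # Stand-in interlayer filter: [1, 2, 1]/4 smoothing kernel,
    # edges padded by repetition
    padded = [frame[0]] + frame + [frame[-1]]
    return [(padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4
            for i in range(len(frame))]

def encode_enhancement(frame, reference):
    # Stand-in "second coding scheme": residual against the
    # interlayer-filtered reference
    return [x - r for x, r in zip(frame, reference)]

frame = [3, 9, 14, 22, 27, 35, 40, 48]
base = encode_base(frame)
enhancement = encode_enhancement(frame, interlayer_filter(base))
bitstream = {"base": base, "enhancement": enhancement}  # both layers

# Decoder side: repeat the filtering on the decoded base layer, then
# add the enhancement residual back.
reconstructed = [r + e for r, e in
                 zip(interlayer_filter(bitstream["base"]),
                     bitstream["enhancement"])]
assert reconstructed == frame  # residual coding is lossless in this toy
```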
[0019] According to another aspect of the present invention, there
is provided a video coding method including downsampling video
frames to generate video frames of low resolution, encoding the
low-resolution video frames using a first video coding scheme,
upsampling the frames encoded by the first video coding scheme to
the resolution of the video frames, performing interlayer filtering
on the upsampled frames, encoding the video frames using a second
video coding scheme by referring to the frames subjected to the
interlayer filtering, and generating a bitstream containing the
frames encoded by the first and second video coding schemes.
[0020] According to still another aspect of the present invention,
there is provided a video encoder including a first video coding
unit encoding video frames using a first video coding scheme, an
interlayer filter performing interlayer filtering on the frames
encoded by the first video coding scheme, a second video coding
unit encoding the video frames using a second video coding scheme
by referring to the interlayer filtered frames, and a bitstream
generator generating a bitstream containing the frames encoded by
the first and second video coding schemes.
[0021] According to a further aspect of the present invention,
there is provided a video encoder including a downsampler
downsampling video frames to generate video frames of low
resolution, a first video coding unit encoding the low-resolution
video frames using a first video coding scheme, an upsampler
upsampling the frames encoded by the first video coding scheme, an
interlayer filter performing interlayer filtering on the upsampled
frames, a second video coding unit encoding the video frames using
a second video coding scheme by referring to the interlayer filtered
frames, and a bitstream generator generating a bitstream containing
the frames encoded by the first and second video coding
schemes.
[0022] According to yet another aspect of the present invention,
there is provided a video decoding method including extracting
frames encoded by first and second video coding schemes from a
bitstream, decoding the frames encoded by the first video coding
scheme using a first video decoding scheme and reconstructing first
layer frames, performing interlayer filtering on the reconstructed
first layer frames, and decoding the frames encoded by the second
video coding scheme using a second video decoding scheme by
referring to the interlayer filtered first layer frames and
reconstructing second layer frames.
[0023] According to yet another aspect of the present invention,
there is provided a video decoding method including extracting
frames encoded by first and second video coding schemes from a
bitstream, decoding the frames encoded by the first video coding
scheme using a first video decoding scheme and reconstructing first
layer frames, upsampling the reconstructed first layer frames,
performing interlayer filtering on the upsampled first layer
frames, and decoding the frames encoded by the second video coding
scheme using a second video decoding scheme by referring to the
interlayer filtered first layer frames and reconstructing second
layer frames.
[0024] According to yet another aspect of the present invention,
there is provided a video decoder including a bitstream interpreter
extracting frames encoded by first and second video coding schemes
from a bitstream, a first video decoding unit decoding the frames
encoded by the first video coding scheme using a first video
decoding scheme and reconstructing first layer frames, an
interlayer filter performing interlayer filtering on the
reconstructed first layer frames, and a second video decoding unit
decoding the frames encoded by the second video coding scheme using
a second video decoding scheme by referring to the interlayer filtered
first layer frames and reconstructing second layer frames.
[0025] According to yet another aspect of the present invention,
there is provided a video decoder including a bitstream interpreter
for extracting frames encoded by
first and second video coding schemes from a bitstream, a first
video decoding unit for decoding the frames encoded by the first
video coding scheme using a first video decoding scheme and
reconstructing first layer frames, an upsampler for upsampling the
reconstructed first layer frames, an interlayer filter for
performing interlayer filtering on the upsampled first layer
frames, and a second video decoding unit for decoding the frames
encoded by the second video coding scheme using a second video
decoding scheme by referring to the interlayer filtered first layer
frames and reconstructing second layer frames.
BRIEF DESCRIPTION OF THE DRAWINGS
[0026] The above and other aspects of the present invention will
become more apparent by describing in detail exemplary embodiments
thereof with reference to the attached drawings in which:
[0027] FIG. 1 shows an environment in which video compression is
applied;
[0028] FIGS. 2 and 3 show example structures of bitstreams
generated by multi-layer video coding;
[0029] FIG. 4 is a block diagram of a video encoder according to a
first exemplary embodiment of the present invention;
[0030] FIG. 5 is a block diagram of a video encoder according to a
second exemplary embodiment of the present invention;
[0031] FIG. 6 is a block diagram of a temporal filter according to
an exemplary embodiment of the present invention;
[0032] FIG. 7 is a flowchart illustrating a video coding process
according to an exemplary embodiment of the present invention;
[0033] FIG. 8 is a flowchart illustrating a detailed process of
encoding a second layer according to an exemplary embodiment of the
present invention;
[0034] FIG. 9 is a block diagram of a video decoder according to a
first exemplary embodiment of the present invention;
[0035] FIG. 10 is a block diagram of a video decoder according to a
second exemplary embodiment of the present invention;
[0036] FIG. 11 is a block diagram of an inverse temporal filter
according to an exemplary embodiment of the present invention;
[0037] FIG. 12 is a flowchart illustrating a video decoding process
according to an exemplary embodiment of the present invention;
and
[0038] FIG. 13 is a flowchart illustrating a detailed process of
performing inverse temporal filtering on a second layer according
to an exemplary embodiment of the present invention.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS OF THE INVENTION
[0039] Aspects of the present invention and methods of
accomplishing the same may be understood more readily by reference
to the following detailed description of exemplary embodiments and
the accompanying drawings. The present invention may, however, be
embodied in many different forms and should not be construed as
being limited to the exemplary embodiments set forth herein.
Rather, these exemplary embodiments are provided so that this
disclosure will be thorough and complete and will fully convey the
concept of the invention to those skilled in the art, and the
present invention will only be defined by the appended claims. Like
reference numerals refer to like elements throughout the
specification.
[0040] The present invention will now be described more fully with
reference to the accompanying drawings, in which exemplary
embodiments of the invention are shown. For convenience of
explanation, it is assumed that a video encoder has two coding
units for two layers.
[0041] FIG. 4 is a block diagram of a video encoder according to a
first exemplary embodiment of the present invention.
[0042] Referring to FIG. 4, a video encoder according to a first
exemplary embodiment of the present invention includes a first
video coding unit 410, a second video coding unit 420, a bitstream
generator 430, and an interlayer filter 440.
[0043] The first video coding unit 410 includes a temporal filter 411, a
Discrete Cosine Transform (DCT) transformer 412, and a quantizer
413 and encodes a video frame using advanced video coding
(AVC).
[0044] The temporal filter 411 receives a video frame 400 and
removes temporal redundancy that the video frame 400 has with
adjacent frames. The temporal filter 411 may use a Motion
Compensated Temporal Filtering (MCTF) algorithm to remove temporal
redundancy between frames. The MCTF algorithm supporting temporal
scalability removes temporal redundancy between adjacent frames. A
5/3 filter is widely used for MCTF. Other temporal filtering
algorithms supporting temporal scalability, such as Unconstrained
MCTF (UMCTF) or Successive Temporal Approximation and Referencing
(STAR), may be used.
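The predict/update structure of the 5/3 lifting filter mentioned above can be sketched as follows. This is a minimal illustration only: it omits the motion compensation that MCTF performs and treats frames as co-located arrays, and the function name is hypothetical rather than part of the disclosed embodiment.

```python
import numpy as np

def mctf_53_level(frames):
    """One temporal level of 5/3 lifting, without motion compensation.

    Odd-indexed frames become high-pass (residual) frames via the
    predict step; even-indexed frames become low-pass frames via the
    update step, mirroring at the GOP boundaries.
    """
    f = [x.astype(np.float64) for x in frames]
    n = len(f)
    # Predict: residual = odd frame minus the mean of its even neighbors.
    high = []
    for i in range(1, n, 2):
        right = f[i + 1] if i + 1 < n else f[i - 1]  # mirror at GOP edge
        high.append(f[i] - (f[i - 1] + right) / 2.0)
    # Update: even frame plus a quarter of its neighboring residuals.
    low = []
    for j, i in enumerate(range(0, n, 2)):
        h_left = high[j - 1] if j > 0 else high[j]   # mirror at GOP edge
        h_right = high[j] if j < len(high) else h_left
        low.append(f[i] + (h_left + h_right) / 4.0)
    return low, high
```

For a constant input sequence the residual frames are zero and the low-pass frames equal the input, which is the expected behavior of the lifting steps.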
[0045] The DCT transformer 412 performs DCT on the temporally
filtered frame. The DCT is performed for each block of a
predetermined size (8×8 or 4×4). The entropy of a block subjected
to DCT is reduced compared with that of a block before DCT.
[0046] The quantizer 413 quantizes the DCT-transformed frame. In
AVC, quantization is determined based on a quantization parameter
(Qp). The quantized frame is inserted into a bitstream after being
subjected to scanning and entropy coding.
[0047] The second video coding unit 420 includes a temporal filter
421, a wavelet transformer 422, and a quantizer 423 and encodes a
video frame using wavelet coding.
[0048] The temporal filter 421 receives a video frame 400 and
removes temporal redundancy that the video frame 400 has with
adjacent frames. The temporal filter 421 may use an MCTF algorithm
to remove the temporal redundancy between frames. The MCTF
algorithm supporting temporal scalability removes temporal
redundancy between adjacent frames. Other temporal filtering
algorithms supporting temporal scalability, such as UMCTF or STAR,
may be used.
[0049] The wavelet transformer 422 performs a wavelet transform on
the temporally filtered frame on a frame-by-frame basis. The
wavelet transform algorithm supporting spatial scalability
decomposes a frame into one low-pass subband (LL) and three
high-pass subbands (LH, HL, and HH). The LL subband is a quarter of
the size of and an approximation of the original frame before being
subjected to the wavelet transform. The wavelet transform is again
performed to decompose the LL subband into one low-pass subband
(LLLL) and three high-pass subbands (LLLH, LLHL, and LLHH). The
LLLL subband is a quarter of the size of and an approximation of
the LL subband. A 9/7 filter is commonly used for the wavelet
transform.
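The one-level decomposition into LL, LH, HL, and HH subbands described above can be illustrated with the simple averaging/differencing Haar filter (standing in for the 9/7 filter named in the text; the function name and the unnormalized filter are illustrative assumptions, and subband naming conventions vary):

```python
import numpy as np

def dwt2_haar(frame):
    """One-level 2-D wavelet decomposition with an unnormalized Haar filter.

    Returns the LL approximation plus three detail subbands, each a
    quarter the size of the input (which must have even dimensions).
    """
    f = frame.astype(np.float64)
    # Filter rows: low-pass = pairwise average, high-pass = pairwise difference.
    lo_r = (f[:, 0::2] + f[:, 1::2]) / 2.0
    hi_r = (f[:, 0::2] - f[:, 1::2]) / 2.0
    # Filter columns of each intermediate result to form the four subbands.
    ll = (lo_r[0::2, :] + lo_r[1::2, :]) / 2.0
    lh = (lo_r[0::2, :] - lo_r[1::2, :]) / 2.0
    hl = (hi_r[0::2, :] + hi_r[1::2, :]) / 2.0
    hh = (hi_r[0::2, :] - hi_r[1::2, :]) / 2.0
    return ll, lh, hl, hh
```

Applying the same decomposition again to the returned LL subband yields the LLLL, LLLH, LLHL, and LLHH subbands described in the text.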
[0050] The quantizer 423 quantizes the wavelet-transformed frame.
The quantization is performed using an embedded quantization
algorithm such as Embedded ZeroTrees Wavelet (EZW), Set
Partitioning in Hierarchical Trees (SPIHT), Embedded ZeroBlock
Coding (EZBC), or Embedded Block Coding with Optimized Truncation
(EBCOT) offering Signal-to-Noise Ratio (SNR) scalability.
[0051] The temporal filter 421 removes temporal redundancy that
exists in the video frame 400 using adjacent frames or the frame
encoded by the first video coding unit 410 as a reference.
Block-based DCT, performed before quantization, may cause block
artifacts, which degrade the efficiency of wavelet coding performed
by the second video coding unit 420. That is, a frame having block
artifacts may degrade the wavelet coding efficiency since block
artifacts such as noise propagate across the entire frame when the
frame is subjected to a wavelet transform.
[0052] Thus, the video encoder of FIG. 4 further includes the
interlayer filter 440 to eliminate noise generated between layers.
The interlayer filter 440 performs filtering in such a way that a
frame encoded by a first video coding method is suitably used as a
reference for a second video coding method. Interlayer filtering is
needed when different video coding schemes are used for each layer
as shown in the video encoder of FIG. 4.
[0053] The interlayer filter 440 performs filtering on a frame
encoded by DCT-based AVC coding such that it can be suitably used
as a reference for wavelet coding. To accomplish this, the
interlayer filter 440 upsamples the AVC-coded frame with a wavelet
filter and then downsamples it with an MPEG (or other) filter;
however, this is merely exemplary. Alternatively, the downsampling
filter in the interlayer filter 440 may be a low-pass filter with a
steep roll-off at its cut-off frequency. The interlayer filter 440
may be any single filter, or combination of filters, designed such
that the frame subjected to interlayer filtering can be suitably
used as a reference by the second video coding unit 420.
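The upsample-then-downsample idea behind the interlayer filtering can be sketched in one dimension as follows. The short symmetric kernels used here are illustrative assumptions standing in for the wavelet and MPEG filters named in the text, and the function names are hypothetical:

```python
import numpy as np

def upsample_1d(x, kernel=(0.5, 1.0, 0.5)):
    """Zero-insertion upsampling by 2 followed by an interpolation filter."""
    up = np.zeros(2 * len(x))
    up[0::2] = x
    return np.convolve(up, kernel, mode="same")

def downsample_1d(x, kernel=(0.25, 0.5, 0.25)):
    """Low-pass filtering followed by decimation by 2."""
    return np.convolve(x, kernel, mode="same")[0::2]

def interlayer_filter_1d(row):
    """Upsample with one filter, then downsample with a different one.

    The mismatched filter pair acts as a low-pass operation that
    attenuates the high-frequency block-boundary artifacts.
    """
    return downsample_1d(upsample_1d(row))
```

A smooth signal passes through almost unchanged (away from the signal boundaries), while sharp block edges are softened by the round trip.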
[0054] The bitstream generator 430 generates a bitstream containing
an AVC-coded frame 431, a wavelet-coded frame 432, motion vectors,
and other necessary information.
[0055] FIG. 5 is a block diagram of a video encoder according to a
second exemplary embodiment of the present invention.
[0056] Referring to FIG. 5, the video encoder includes a first
video coding unit 510, a second video coding unit 520, a bitstream
generator 530, an interlayer filter 540, an upsampler 550, and a
downsampler 560. The video encoder encodes a video frame and a
low-resolution video frame using different video coding schemes.
Specifically, the downsampler 560 downsamples a video frame 500 to
generate a low-resolution video frame.
[0057] The first video coding unit 510 includes a temporal filter
511, a DCT transformer 512, and a quantizer 513 and encodes the
low-resolution video frame using AVC coding.
[0058] The temporal filter 511 receives the low-resolution video
frame and removes temporal redundancy that the low-resolution frame
has with adjacent low-resolution frames. The temporal filter 511
uses an MCTF algorithm, which supports temporal scalability, to
remove temporal redundancy between low-resolution frames. A 5/3
filter is widely used for MCTF, but other temporal filtering
algorithms supporting temporal scalability such as UMCTF or STAR
may be used.
[0059] The DCT transformer 512 performs DCT on the temporally
filtered frame. The DCT is performed for each block of a
predetermined size (8×8 or 4×4). The entropy of a block subjected
to DCT is reduced compared with that of a block before DCT.
[0060] The quantizer 513 quantizes the DCT-transformed frame. In
AVC, quantization is determined based on a quantization parameter
(Qp). The quantized frame is inserted into a bitstream after being
subjected to reordering and entropy coding.
[0061] The upsampler 550 upsamples the AVC-coded frame to the
resolution of the frame 500.
[0062] The interlayer filter 540 performs filtering in such a
manner that an upsampled version of a frame can be suitably used as
a reference for wavelet coding. In an exemplary embodiment of the
present invention, the upsampled version of the frame may be
upsampled with a wavelet filter, followed by downsampling using an
MPEG filter; however, this is merely exemplary. Alternatively, the
downsampling filter in the interlayer filter 540 may be a low-pass
filter with a steep roll-off at its cut-off frequency. The
interlayer filter 540 may be any single filter, or combination of
filters, designed such that the frame subjected to interlayer
filtering can be suitably used as a reference by the second video
coding unit 520.
[0063] The second video coding unit 520 includes a temporal filter
521, a wavelet transformer 522, and a quantizer 523, and it encodes
a video frame using wavelet coding.
[0064] The temporal filter 521 receives the video frame 500 and
removes temporal redundancy that the video frame 500 has with
adjacent frames. In an exemplary embodiment of the present
invention, the temporal filter 521 uses an MCTF algorithm to remove
temporal redundancy between low-resolution frames. The MCTF
algorithm supporting temporal scalability removes temporal
redundancy between adjacent low-resolution frames. Other temporal
filtering algorithms supporting temporal scalability such as UMCTF
or STAR may be used.
[0065] The wavelet transformer 522 performs a wavelet transform on
the temporally filtered frame. Unlike the DCT transform that is
performed in units of blocks, the wavelet transform is performed in
units of frames. The wavelet transform algorithm supporting spatial
scalability decomposes a frame into one low-pass subband (LL) and
three high-pass subbands (LH, HL, and HH). The LL subband is a
quarter of the size of and an approximation of the original frame
before being subjected to wavelet transform. The wavelet transform
is again performed to decompose the LL subband into one low-pass
subband (LLLL) and three high-pass subbands (LLLH, LLHL, and LLHH).
The LLLL subband is a quarter of the size of and an approximation
of the LL subband. A 9/7 filter is commonly used for the wavelet
transform.
[0066] The quantizer 523 quantizes the wavelet-transformed frame.
The quantization may be an embedded quantization algorithm, which
provides SNR scalability, such as EZW, SPIHT, EZBC, or EBCOT.
[0067] The temporal filter 521 removes temporal redundancy that
exists in the video frame 500 using adjacent frames or the frame
encoded by the first video coding unit 510 as a reference. The
frame encoded by the first video coding unit is upsampled and
subjected to interlayer filtering before being sent to the temporal
filter 521.
[0068] The bitstream generator 530 generates a bitstream containing
an AVC-coded frame 531, a wavelet-coded frame 532, motion vectors,
and other necessary information.
[0069] The temporal filter will be described in greater detail with
reference to FIG. 6.
[0070] FIG. 6 is a block diagram of a temporal filter 600 according
to an exemplary embodiment of the present invention.
[0071] While FIGS. 4 and 5 show that the first and second video
coding units 410 (510) and 420 (520) include temporal filters 411
(511) and 421 (521), respectively, for convenience of explanation,
it is assumed in an exemplary embodiment that the temporal filter
600 is employed in the second video coding unit 420.
[0072] The temporal filter 600 removes temporal redundancy between
video frames using MCTF on a group-of-picture (GOP)-by-GOP basis.
To accomplish this function, the temporal filter 600 includes a
prediction frame generator 610 for generating a prediction frame, a
prediction frame smoother 620 for smoothing the prediction frame, a
residual frame generator 630 for generating a residual frame by
comparing a smoothed prediction frame with a video frame, and an
updater 640 for updating the video frame using the residual
frame.
[0073] The prediction frame generator 610 generates a prediction
frame, which will be compared with a video frame to produce a
residual frame, using the video frames adjacent to that video frame
and a frame subjected to interlayer filtering as references. The
prediction frame generator 610 finds a matching block for each
block in the video frame either within the reference frames
(adjacent video frames and a frame subjected to interlayer
filtering) (intercoding) or from other blocks within the same video
frame (intracoding).
[0074] The prediction frame smoother 620 smoothes the prediction
frame, since blocking artifacts are introduced at the block
boundaries of a prediction frame assembled from blocks
corresponding to blocks in the video frame. To accomplish this, the prediction frame
smoother 620 may perform de-blocking on pixels at block boundaries
in the prediction frame. Since a de-blocking algorithm is commonly
used in the H.264 video coding scheme and is well known in the art,
a detailed explanation thereof will not be given.
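The boundary smoothing performed by the prediction frame smoother can be illustrated with the following fixed averaging across block boundaries. This is only a sketch under stated assumptions: real H.264-style de-blocking adapts its filter strength to local gradients and quantization parameters, and the function name is hypothetical:

```python
import numpy as np

def smooth_block_boundaries(frame, block=8):
    """Average the two pixels on either side of each block boundary.

    A crude stand-in for adaptive de-blocking: it removes the
    discontinuity at block edges while leaving block interiors intact.
    """
    f = frame.astype(np.float64).copy()
    h, w = f.shape
    for x in range(block, w, block):      # vertical boundaries
        avg = (f[:, x - 1] + f[:, x]) / 2.0
        f[:, x - 1] = avg
        f[:, x] = avg
    for y in range(block, h, block):      # horizontal boundaries
        avg = (f[y - 1, :] + f[y, :]) / 2.0
        f[y - 1, :] = avg
        f[y, :] = avg
    return f
```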
[0075] The residual frame generator 630 compares the smoothed
prediction frame with the video frame and generates a residual
frame in which temporal redundancy has been removed.
[0076] The updater 640 uses the residual frame to update other
video frames. The updated video frames are then provided to the
prediction frame generator 610.
[0077] For example, when each GOP consists of eight video frames,
the temporal filter 600 removes temporal redundancy in frames 1, 3,
5, and 7 to generate residual frames 1, 3, 5, and 7, respectively.
The residual frames 1, 3, 5, and 7 are used to update frames 0, 2,
4, and 6. The temporal filter 600 removes the temporal redundancy
from updated frames 2 and 6 to generate residual frames 2 and 6.
The residual frames 2 and 6 are used to update frames 0 and 4.
Then, the temporal filter 600 removes temporal redundancy in the
updated frame 4 to generate a residual frame 4. The residual frame
4 is used to update the frame 0. Through the above process, the
temporal filter 600 performs temporal filtering on the eight video
frames to obtain one low-pass frame (updated frame 0) and seven
high-pass frames (residual frames 1 through 7).
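The dyadic filtering order of the 8-frame example above can be expressed as a simple schedule. The helper below (a hypothetical name, not part of the disclosed encoder) lists, per temporal level, which frames become residual (high-pass) frames and which frames are updated:

```python
def mctf_schedule(gop_size):
    """Per-level (residual, updated) frame indices for dyadic MCTF.

    At each level the spacing between processed frames doubles, until
    a single low-pass frame (frame 0) remains.
    """
    levels = []
    step = 1
    while step < gop_size:
        residuals = list(range(step, gop_size, 2 * step))
        updated = list(range(0, gop_size, 2 * step))
        levels.append((residuals, updated))
        step *= 2
    return levels
```

For a GOP of eight frames this reproduces the sequence in the text: frames 1, 3, 5, and 7 become residuals first, then 2 and 6, and finally 4, leaving frame 0 as the single low-pass frame.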
[0078] A video coding process and a temporal filtering will be
described with reference to FIGS. 7 and 8. It is assumed that video
coding is performed on two layers.
[0079] Referring first to FIG. 7, a video encoder receives a video
frame in step S710.
[0080] In step S720, the video encoder encodes the input video
frame using AVC coding. In the present exemplary embodiment, a
first layer is encoded using AVC coding because the AVC coding
offers the highest coding efficiency currently available. However,
the first layer may be encoded using another video coding
algorithm.
[0081] After the first layer is encoded, in step S730 the video
encoder performs interlayer filtering on the AVC-coded frame so
that it can be suitably used as a reference for encoding a second
layer. The interlayer filtering involves upsampling the AVC-coded
frame using a wavelet filter and downsampling an upsampled version
of AVC-coded frame using an MPEG filter.
[0082] Following the interlayer filtering, in step S740 the video
encoder performs wavelet coding on the video frame using the frame
subjected to interlayer filtering as a reference.
[0083] After the wavelet coding is finished, in step S750 the video
encoder generates a bitstream containing the AVC-coded frame and
the wavelet-coded frame. When the first and second layers have
different resolutions, the video encoder uses a low-resolution
frame obtained by downsampling the input video frame during the AVC
coding. After the AVC coding is finished, the video encoder changes
the resolution of the AVC-coded frame. For example, when the
resolution of the first layer is lower than that of the second
layer, the video encoder upsamples the AVC-coded frame to the
resolution of the second layer. Then, the video encoder performs
interlayer filtering on the upsampled version of the AVC-coded
frame.
[0084] FIG. 8 is a flowchart illustrating a detailed process of
encoding a second layer according to an exemplary embodiment of the
present invention.
[0085] Referring to FIG. 8, in step S810 a second video coding unit
receives an encoded first layer video frame that has been subjected
to interlayer filtering.
[0086] In step S820, upon receipt of the video frame and the frame
subjected to interlayer filtering, the second video coding unit
performs motion estimation in order to generate a prediction frame
that will be used in removing temporal redundancy in the video
frame. Various well-known algorithms such as Block Matching and
Hierarchical Variable Size Block Matching (HVSBM) may be used for
motion estimation.
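Of the motion estimation algorithms mentioned above, exhaustive block matching is the simplest and can be sketched as follows; the function name, the SAD cost, and the search-window handling are illustrative assumptions rather than the claimed method:

```python
import numpy as np

def full_search_block_match(cur_block, ref, cy, cx, radius=4):
    """Exhaustive block matching within a +/- radius search window.

    Returns the motion vector (dy, dx) that minimizes the sum of
    absolute differences (SAD) against the reference frame, together
    with the minimal SAD value.
    """
    bs = cur_block.shape[0]
    best, best_sad = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = cy + dy, cx + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + bs > ref.shape[0] or x + bs > ref.shape[1]:
                continue
            sad = np.abs(cur_block - ref[y:y + bs, x:x + bs]).sum()
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best, best_sad
```

HVSBM extends this idea by splitting blocks hierarchically into variable sizes and selecting the partition that minimizes the overall cost.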
[0087] In step S830, after performing the motion estimation, the
second video coding unit uses motion vectors obtained as a result
of the motion estimation to generate the prediction frame.
[0088] In step S840, the second video coding unit smoothes the
prediction frame in order to reduce block artifacts in a residual
frame. This is because an apparent block boundary degrades the
coding efficiency during wavelet transform and quantization.
[0089] In step S850, the second video coding unit compares the
prediction frame with the video frame to generate a residual frame.
The residual frame corresponds to a high-pass frame (H frame)
generated through MCTF.
[0090] In step S860, the temporal filter uses the residual frame to
update another video frame. The updated version of the video frame
corresponds to a low-pass frame (L frame).
[0091] After an L frame and H frame are generated on a GOP basis
through steps S820 through S860, the second video coding unit
performs wavelet transform on the temporally filtered frames (L and
H frames) in step S870. While a 9/7 filter is commonly used for
wavelet transform, an 11/9 or 13/11 filter may also be used.
[0092] In step S880, the second video coding unit quantizes the
wavelet-transformed frames using EZW, SPIHT, EZBC, or EBCOT.
[0093] Next, a video decoder and a decoding process will be
described. While the decoding process is basically performed in
reverse order to the encoding process, layers are encoded and
decoded in the same order. For example, when a video encoder
sequentially encodes first and second layers, a video decoder
decodes the first and second layers in the same order. For
convenience of explanation, it is assumed that a video frame is
reconstructed from a bitstream having two layers.
[0094] FIG. 9 is a block diagram of a video decoder according to a
first exemplary embodiment of the present invention used when first
and second layers have the same resolution.
[0095] Referring to FIG. 9, the video decoder includes a bitstream
interpreter 900, a first video decoding unit 910, a second video
decoding unit 920, and an interlayer filter 940.
[0096] The bitstream interpreter 900 interprets an input bitstream
and extracts frames encoded by a first video coding and frames
encoded by second video coding. The frames encoded by the first
video coding are then provided to the first video decoding unit 910
while the frames encoded by the second video coding are provided to
the second video decoding unit 920.
[0097] The first video decoding unit 910 includes an inverse
quantizer 911, an inverse DCT transformer 912, and an inverse
temporal filter 913. The inverse quantizer 911 inversely quantizes
the frames encoded by the first video coding. The inverse
quantization may involve entropy decoding, inverse scanning, and a
process of reconstructing DCT-transformed frames using a
quantization table.
[0098] The inverse DCT transformer 912 performs an inverse DCT
transform on the inversely quantized frames.
[0099] The inverse temporal filter 913 reconstructs first layer
video frames from the inversely DCT-transformed frames and outputs
a decoded frame (931). The reconstructed first layer video frames
are obtained by encoding original video frames at a low bit-rate
and decoding the encoded frames.
[0100] The interlayer filter 940 performs interlayer filtering on
the reconstructed first layer frame. The interlayer filtering may
be performed using a deblocking algorithm.
[0101] The second video decoding unit 920 includes an inverse
quantizer 921, an inverse wavelet transformer 922, and an inverse
temporal filter 923.
[0102] The inverse quantizer 921 applies inverse quantization to
the frames encoded by the second video coding. The inverse
quantization may involve entropy decoding, inverse scanning, and a
process of reconstructing wavelet-transformed frames using a
quantization table.
[0103] The inverse wavelet transformer 922 performs an inverse
wavelet transform on the inversely quantized frames.
[0104] The inverse temporal filter 923 reconstructs second layer
video frames from the inversely wavelet-transformed frames using
frames subjected to the interlayer filtering as a reference and
outputs a decoded frame (932). The reconstructed second layer video
frames are obtained by encoding original video frames at a high
bit-rate and decoding the encoded frames.
[0105] FIG. 10 is a block diagram of a video decoder according to a
second exemplary embodiment of the present invention used when a
first layer has lower resolution than a second layer. Referring to
FIG. 10, the video decoder includes a bitstream interpreter 1000, a
first video decoding unit 1010, a second video decoding unit 1020,
an interlayer filter 1040, and an upsampler 1050.
[0106] The bitstream interpreter 1000 interprets an input bitstream
and extracts frames encoded by a first video coding and frames
encoded by second video coding. The frames encoded by the first
video coding have lower resolution than those encoded by the second
video coding. The former is then provided to the first video
decoding unit 1010 while the latter is provided to the second video
decoding unit 1020.
[0107] The first video decoding unit 1010 includes an inverse
quantizer 1011, an inverse DCT transformer 1012, and an inverse
temporal filter 1013.
[0108] The inverse quantizer 1011 inversely quantizes the frames
encoded by the first video coding. The inverse quantization may
involve entropy decoding, inverse scanning, and a process of
reconstructing DCT-transformed frames using a quantization
table.
[0109] The inverse DCT transformer 1012 performs an inverse DCT
transform on the inversely quantized frames.
[0110] The inverse temporal filter 1013 reconstructs first layer
video frames from the inversely DCT-transformed frames and outputs
a decoded frame (1031). The reconstructed first layer video frame
is obtained by downsampling and encoding an original video frame
and decoding the encoded frame.
[0111] The upsampler 1050 upsamples the first layer frame to the
resolution of a reconstructed second layer frame.
[0112] The interlayer filter 1040 performs interlayer filtering on
an upsampled version of the first layer frame. The interlayer
filtering may be performed using a de-blocking algorithm.
[0113] The second video decoding unit 1020 includes an inverse
quantizer 1021, an inverse wavelet transformer 1022, and an inverse
temporal filter 1023.
[0114] The inverse quantizer 1021 inversely quantizes the frames
encoded by the second video coding. The inverse quantization may
involve entropy decoding, inverse scanning, and a process of
reconstructing wavelet-transformed frames using a quantization
table.
[0115] The inverse wavelet transformer 1022 performs an inverse
wavelet transform on the inversely quantized frames.
[0116] The inverse temporal filter 1023 reconstructs second layer
video frames from the inversely wavelet-transformed frames using
frames that have been subjected to the interlayer filtering as a
reference and outputs a decoded frame (1032). The reconstructed
second layer video frames are obtained by encoding original video
frames and decoding the encoded frames.
[0117] FIG. 11 is a block diagram of an inverse temporal filter
1100 according to an exemplary embodiment of the present
invention.
[0118] While FIGS. 9 and 10 show that the first and second video
decoding units 910 (1010) and 920 (1020) include the inverse
temporal filters 913 (1013) and 923 (1023), respectively, for
convenience of explanation, it is assumed in an exemplary
embodiment that the inverse temporal filter 1100 is employed in the
second video decoding unit 920 shown in FIG. 9.
[0119] The inverse temporal filter 1100 reconstructs video frames
from inversely wavelet-transformed frames on a GOP-by-GOP basis
using MCTF. The inversely wavelet-transformed frames fed into the
inverse temporal filter 1100 consist of low- and high-pass frames
in which temporal redundancies have been removed during video
coding. For example, when each GOP is made up of eight frames, the
inversely wavelet-transformed frames may include one low-pass frame
(updated frame 0) and seven high-pass frames (residual frames 1
through 7) obtained as a result of video coding.
[0120] To accomplish this function, the inverse temporal filter
1100 includes an inverse updater 1110, a prediction frame generator
1120, a prediction frame smoother 1130, and a frame reconstructor
1140.
[0121] The inverse updater 1110 updates the inversely
wavelet-transformed frames in the reverse of the order in which
video coding was performed.
[0122] The prediction frame generator 1120 generates a prediction
frame that will be used in reconstructing a low-pass frame or video
frame from a residual frame using an interlayer filtered frame.
[0123] The prediction frame smoother 1130 smoothes a prediction
frame.
[0124] The frame reconstructor 1140 reconstructs a low-pass frame
or video frame from a high-pass frame using a smoothed prediction
frame.
[0125] For example, when each GOP is made up of eight video frames,
the inverse temporal filter 1100 uses residual frame 4 and
interlayer filtered frame 0 to inversely update low-pass frame
0.
[0126] Then, the low-pass frame 0 is used to generate a prediction
frame for the residual frame 4, and the prediction frame is used to
obtain low-pass frame 4 from the residual frame 4.
[0127] Then, the inverse temporal filter 1100 uses residual frames
2 and 6 and interlayer filtered frames 0 and 4 to inversely update
low-pass frames 0 and 4, and then uses the low-pass frames 0 and 4
and interlayer filtered frames 2 and 6 to generate prediction
frames for the residual frames 2 and 6. The prediction frames are
then used to obtain low-pass frames 2 and 6 from the residual
frames 2 and 6.
[0128] Subsequently, the inverse temporal filter 1100 uses residual
frames 1, 3, 5, and 7 and interlayer filtered frames 0, 2, 4, and 6
to inversely update the low-pass frames 0, 2, 4, and 6, thereby
reconstructing video frames 0, 2, 4, and 6.
[0129] Lastly, the inverse temporal filter 1100 uses the
reconstructed frames 0, 2, 4, and 6 and interlayer filtered frames
1, 3, 5, and 7 to generate prediction frames for the residual
frames 1, 3, 5, and 7. The prediction frames are then used to
reconstruct video frames 1, 3, 5, and 7.
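The inverse updating and prediction described above can be sketched for a single temporal level of 5/3 lifting. As with the forward process, this illustration omits motion compensation and the smoothing step, and the function name is a hypothetical label:

```python
import numpy as np

def inverse_mctf_53_level(low, high):
    """Invert one temporal level of 5/3 lifting (no motion compensation).

    The update step is undone first to recover the even-indexed
    frames, then the predict step is undone to recover the
    odd-indexed frames, mirroring at the GOP boundaries.
    """
    n = len(low) + len(high)
    f = [None] * n
    # Inverse update: recover the even-indexed frames.
    for j, i in enumerate(range(0, n, 2)):
        h_left = high[j - 1] if j > 0 else high[j]
        h_right = high[j] if j < len(high) else h_left
        f[i] = low[j] - (h_left + h_right) / 4.0
    # Inverse predict: recover the odd-indexed frames.
    for j, i in enumerate(range(1, n, 2)):
        right = f[i + 1] if i + 1 < n else f[i - 1]
        f[i] = high[j] + (f[i - 1] + right) / 2.0
    return f
```

Repeating this inversion level by level, from the coarsest temporal level to the finest, reconstructs the full GOP in the order described in the text.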
[0130] A video decoding process will now be described with
reference to FIGS. 12 and 13.
[0131] FIG. 12 is a flowchart illustrating a video decoding process
according to an exemplary embodiment of the present invention.
[0132] In step S1210, a video decoder receives a bitstream and
extracts frames encoded by a first video coding and frames encoded
by a second video coding. For example, the first and second video
coding schemes may be AVC coding and wavelet coding,
respectively.
[0133] In step S1220, the video decoder performs AVC decoding on
the extracted AVC-coded frames.
[0134] In step S1230, the video decoder performs interlayer
filtering on the frames decoded by the AVC decoding.
[0135] In step S1240, the video decoder performs wavelet decoding
on the wavelet-coded frames by referring to the interlayer filtered
frames.
[0136] In step S1250, after wavelet decoding is finished, the video
decoder uses reconstructed video frames to generate a video signal.
That is, the video decoder converts luminance (Y) and chrominance
(UV) color components of a reconstructed pixel into red (R), green
(G), and blue (B) color components.
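The YUV-to-RGB conversion in step S1250 can be sketched with the standard BT.601 full-range equations; the patent does not specify which conversion matrix is used, so the coefficients below are an assumption:

```python
import numpy as np

def yuv_to_rgb(y, u, v):
    """Convert 8-bit Y, U (Cb), V (Cr) planes to an RGB image.

    Uses the BT.601 full-range equations, with chroma centered at 128
    and results clipped back into [0, 255].
    """
    y = y.astype(np.float64)
    u = u.astype(np.float64) - 128.0
    v = v.astype(np.float64) - 128.0
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    rgb = np.stack([r, g, b], axis=-1)
    return np.clip(np.rint(rgb), 0, 255).astype(np.uint8)
```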
[0137] FIG. 13 is a flowchart illustrating a detailed process of
performing inverse temporal filtering on a second layer according
to an exemplary embodiment of the present invention.
[0138] Referring to FIG. 13, in step S1310, an inverse temporal
filter receives inversely wavelet-transformed frames and interlayer
filtered frames. The inversely wavelet-transformed frames are
reconstructed on a GOP basis and consist of low- and high-pass
frames.
[0139] In step S1320, the inverse temporal filter uses a high-pass
frame among the inversely wavelet-transformed frames and an
interlayer filtered frame to inversely update a low-pass frame.
[0140] In step S1330, the inverse temporal filter uses an inversely
updated low-pass frame and an interlayer filtered frame to
generate a prediction frame.
[0141] In step S1340, the inverse temporal filter smoothes the
prediction frame.
[0142] In step S1350, after smoothing of the prediction frame is
finished, the inverse temporal filter uses the smoothed prediction
frame and a high-pass frame to reconstruct a low-pass frame or a
video frame.
[0143] The present invention provides multi-layer video coding and
decoding methods employing a plurality of video coding algorithms
that can improve the coding efficiency using interlayer
filtering.
[0144] In particular, the present invention can prevent the
degradation of coding efficiency that may occur when a video coding
scheme with a block-based transform algorithm and a video coding
scheme with a frame-based transform algorithm are used
together.
[0145] While the present invention has been particularly shown and
described with reference to exemplary embodiments thereof, it will
be understood by those of ordinary skill in the art that various
changes in form and details may be made therein without departing
from the spirit and scope of the present invention as defined by
the following claims. For example, while a video encoder that
employs an AVC coding method and a wavelet coding method has been
described, the video encoder may employ other coding methods.
Therefore, the disclosed exemplary embodiments of the invention are
used in a generic and descriptive sense only and not for purposes
of limitation.
* * * * *