U.S. patent application number 11/221372 was filed with the patent office on 2005-09-06 and published on 2006-07-06 as publication number 20060146187, for an adaptive interlace-to-progressive scan conversion algorithm.
Invention is credited to Benitius M. Handjojo, Wenhua Li.
Publication Number | 20060146187
Application Number | 11/221372
Family ID | 26752573
Filed Date | 2005-09-06
United States Patent Application 20060146187
Kind Code: A1
Handjojo; Benitius M.; et al.
July 6, 2006
Adaptive interlace-to-progressive scan conversion algorithm
Abstract
An interlace-to-progressive scan conversion system comprises: a
spatial line averaging prefilter; a motion estimator; a three-stage
adaptive recursive filter. The motion estimator comprises: a 3-D
recursive search sub-component having a bilinear interpolator; a
motion correction sub-component having an error function including
penalties related to the difference between a given candidate
vector and a plurality of neighboring vectors; a block erosion
sub-component. The motion estimator assumes that motion is constant
between fields. The three-stage adaptive recursive filter
comprises: a first stage that selects between using static pixels
data and moving pixels data from a next field; a second stage that
selects a more valid set of data between motion compensated data
from a previous field and the pixels selected by the first stage; a
third stage that combines an intra-field interpolation with the
more valid set of data selected by the second stage.
Inventors: Handjojo; Benitius M.; (Indianapolis, IN); Li; Wenhua; (Crawfordsville, IN)
Correspondence Address: O'Shea, Getz & Kosakowski, P.C.; 1500 Main Street, Suite 912; Springfield, MA 01115; US
Family ID: 26752573
Appl. No.: 11/221372
Filed: September 6, 2005
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
10071711 | Feb 8, 2002 | 6940557
11221372 | Sep 6, 2005 |
60267356 | Feb 8, 2001 |
Current U.S. Class: 348/448; 348/441; 348/E5.066; 348/E7.013
Current CPC Class: H04N 5/145 20130101; H04N 7/012 20130101; H04N 7/014 20130101
Class at Publication: 348/448; 348/441
International Class: H04N 11/20 20060101 H04N011/20; H04N 7/01 20060101 H04N007/01
Claims
1. An interlace-to-progressive scan conversion system, comprising:
a prefilter having a prefiltered signal as an output; a motion
estimator having the prefiltered signal as input and a
motion-corrected signal as an output; an adaptive filter having the
prefiltered signal and the motion-corrected signal as inputs.
2. The interlace-to-progressive scan conversion system of claim 1,
wherein the prefilter is a line averaging filter.
3. The interlace-to-progressive scan conversion system of claim 2,
wherein the prefilter is a spatial line averaging filter.
4. The interlace-to-progressive scan conversion system of claim 1,
wherein the motion estimator is adapted to perform a 3-D recursive
search.
5. The interlace-to-progressive scan conversion system of claim 1,
wherein the motion estimator is adapted to perform motion vector
correction.
6. The interlace-to-progressive scan conversion system of claim 1,
wherein the motion estimator is adapted to perform a block erosion
process.
7. The interlace-to-progressive scan conversion system of claim 1,
wherein the adaptive filter comprises a median filter.
8. The interlace-to-progressive scan conversion system of claim 1,
wherein the adaptive filter comprises a line averaging filter.
9. The interlace-to-progressive scan conversion system of claim 1,
wherein the adaptive filter comprises an adaptive recursive
filter.
10. The interlace-to-progressive scan conversion system of claim 1,
wherein the adaptive filter comprises a time recursive filter.
11. The interlace-to-progressive scan conversion system of claim 1,
wherein: the adaptive filter comprises a three-stage adaptive
recursive filter, wherein: a first stage comprises a function that
selects between using static pixels data and moving pixels data
from a next field; a second stage comprises a function that selects
a more valid set of data between motion compensated data from a
previous field and the pixels selected by the first stage; and a
third stage comprises a function that combines an intra-field
interpolation with the more valid set of data selected by the
second stage.
12. The interlace-to-progressive scan conversion system of claim
11, wherein the prefilter comprises a spatial line average
filter.
13. The interlace-to-progressive scan conversion system of claim
11, wherein the motion estimator comprises a 3-D recursive search
sub-component.
14. The interlace-to-progressive scan conversion system of claim
11, wherein the motion estimator comprises a motion vector
correction sub-component.
15. The interlace-to-progressive scan conversion system of claim
11, wherein the motion estimator comprises a block erosion
sub-component.
16. An interlace-to-progressive scan conversion system, comprising:
a spatial line averaging prefilter having a prefiltered signal as
an output; a motion estimator having the prefiltered signal as
input and a motion-corrected signal as an output, the motion
estimator comprising: a 3-D recursive search sub-component; a
motion vector correction sub-component; a block erosion
sub-component; a three-stage adaptive recursive filter, wherein: a
first stage comprises a function that selects between using static
pixels data and moving pixels data from a next field; a second
stage comprises a function that selects a more valid set of data
between motion compensated data from a previous field and the
pixels selected by the first stage; and a third stage comprises a
function that combines an intra-field interpolation with the more
valid set of data selected by the second stage.
17. The interlace-to-progressive scan conversion system of claim
16, wherein the 3-D recursive search sub-component resolves motion
vectors to at least quarter-pixel accuracy.
18. The interlace-to-progressive scan conversion system of claim 17, wherein the look-up table consists of:

$$US_n = \left\{ \begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}0\\-1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}-1\\0\end{pmatrix}, \begin{pmatrix}0\\2\end{pmatrix}, \begin{pmatrix}0\\-2\end{pmatrix}, \begin{pmatrix}3\\0\end{pmatrix}, \begin{pmatrix}-3\\0\end{pmatrix}, \begin{pmatrix}0\\\tfrac{1}{4}\end{pmatrix}, \begin{pmatrix}0\\-\tfrac{1}{4}\end{pmatrix}, \begin{pmatrix}\tfrac{1}{4}\\0\end{pmatrix}, \begin{pmatrix}-\tfrac{1}{4}\\0\end{pmatrix} \right\}$$
19. The interlace-to-progressive scan conversion system of claim
16, wherein the motion estimator includes a bilinear
interpolator.
20. The interlace-to-progressive scan conversion system of claim 19, wherein a value of a first estimator is set to a value of a second estimator if:

$$e(\overline{MV}_a,\ \overline{X} - \overline{SMV}_a,\ t) > e(\overline{MV}_b,\ \overline{X} - \overline{SMV}_b,\ t) + Th$$

and wherein the value of the second estimator is set to the value of the first estimator if:

$$e(\overline{MV}_b,\ \overline{X} - \overline{SMV}_b,\ t) > e(\overline{MV}_a,\ \overline{X} - \overline{SMV}_a,\ t) + Th$$

where Th is a fixed threshold.
21. The interlace-to-progressive scan conversion system of claim
16, wherein an error function of the motion estimator includes
penalties related to a length of the difference vector between a
given candidate vector and a plurality of neighboring vectors.
22. The interlace-to-progressive scan conversion system of claim 21, wherein the error function is defined by:

$$e(\overline{C}, x, y, t) = \sum_{x \in B(x,y,t)} \left| F(x,y,t) - F(x - C_x,\ y - C_y,\ t - T) \right| + \alpha\,\overline{U}(x,y,t)$$
23. The interlace-to-progressive scan conversion system of claim
21, wherein the motion estimator assumes that a motion vector for
an object between a previous field and a current field is the same
as a motion vector for the object between the current field and a
next field.
24. The interlace-to-progressive scan conversion system of claim 23, wherein a motion vector error correction function is defined by:

$$\overline{MV}(x,y,t) = \begin{cases} \begin{pmatrix}0\\0\end{pmatrix}, & (e_m(x,y,t) \ge e_s(x,y,t)) \\ \overline{MV}(x,y,t), & (e_m(x,y,t) < e_s(x,y,t)) \end{cases}$$

where:

$$e_m(x,y,t) = \frac{\sum_{x \in X} |F(X) - F(C)| + \sum_{x \in X} |F(X) - F(D)|}{2}$$

$$e_s(x,y,t) = \frac{\sum_{x \in X} |F(X) - F(A)| + \sum_{x \in X} |F(X) - F(B)|}{2}$$

and where A, B, C, D, and X are blocks containing ends of candidate motion vectors, X being in the current field, A and C being in the previous field, and B and D being in the next field.
25. The interlace-to-progressive scan conversion system of claim 23, wherein a motion vector error correction function is defined by:

$$\overline{MV}(x,y,t) = \begin{cases} \begin{pmatrix}0\\0\end{pmatrix}, & (e_m(x,y,t) \ge e_s(x,y,t)) \\ \overline{MV}(x,y,t), & (e_m(x,y,t) < e_s(x,y,t)) \end{cases}$$

where:

$$e_m(x,y,t) = \sum |F(C) - F(D)| \qquad e_s(x,y,t) = \sum |F(A) - F(B)|$$

and where A, B, C, D, and X are blocks containing ends of candidate motion vectors, X being in the current field, A and C being in the previous field, and B and D being in the next field.
26. The interlace-to-progressive scan conversion system of claim 16, wherein a cost function is defined by:

∀F(x,y,t) ∈ B(x,y,t):

D = |F(x,y,t) - F(x - MV_x, y - MV_y, t-1)|
TD = TD + D
Diff = D - EstErr
EstErr = EstErr + (δ · Diff)
Dev = Dev + δ(|Diff| - Dev)
27. The interlace-to-progressive scan conversion system of claim 16, wherein the block erosion sub-component divides each block, defined by:

$$B(x,y,t) = \left\{(x,y)\ \middle|\ X_x - \tfrac{X}{2} \le x \le X_x + \tfrac{X}{2},\ X_y - \tfrac{Y}{2} \le y \le X_y + \tfrac{Y}{2}\right\}$$

to which a vector $\overline{MV}(x,y,t)$ is assigned, into four sub-blocks $B_{i,j}(x,y,t)$:

$$B_{i,j}(x,y,t) = \left\{(x,y)\ \middle|\ X_x - \frac{(1-i)X}{4} \le x \le X_x + \frac{(1+i)X}{4},\ X_y - \frac{(1-j)Y}{4} \le y \le X_y + \frac{(1+j)Y}{4}\right\}$$

wherein the variables i and j take the values +1 and -1; wherein a vector $\overline{MV}_{i,j}(x,y,t)$ is assigned to the pixels of each of the sub-blocks $B_{i,j}(x,y,t)$:

$$\forall (x,y) \in B_{i,j}(x,y,t):\quad \overline{MV}_{i,j}(x,y,t) = \overline{MV}_{i,j}(\overline{X}, t)$$

wherein:

$$\overline{MV}_{i,j}(\overline{X}, t) = \operatorname{med}\left[\overline{MV}(x + iX, y, t),\ \overline{MV}(\overline{X}, t),\ \overline{MV}(x, y + jY, t)\right]$$

wherein the median function is a median on the x and y vector components separately; and wherein a resulting vector is replaced by an original motion vector unless the resulting vector is equal to one of the three input vectors.
28. The interlace-to-progressive scan conversion system of claim 16, wherein the first stage selection function is given by:

$$F_n(x,y,t) = \begin{cases} F(x + MV_x(x,y,t),\ y + MV_y(x,y,t),\ t+1), & (D_m < D_s) \\ F(x, y, t+1), & (D_m \ge D_s) \end{cases}$$

where:

$$D_s = \sum_{k=-2}^{2} C_v(k)\,\left|F(x, y+k, t) - F(x, y+k, t+1)\right|$$

$$D_m = \sum_{k=-2}^{2} C_v(k)\,\left|F(x, y+k, t) - F(x - MV_x(x,y,t),\ y - MV_y(x,y,t) + k,\ t+1)\right|$$
29. The interlace-to-progressive scan conversion system of claim 16, wherein the third stage combining function is given by:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ c_i F_i(x,y,t) + (1 - c_i)\left(c_p F_p(x,y,t) + (1 - c_p) F_n(x,y,t)\right), & (\text{otherwise}) \end{cases}$$

wherein c_i and c_p are adaptive coefficients ranging from 0 to 1; F_n is given by:

$$F_n(x,y,t) = \begin{cases} F(x + MV_x(x,y,t),\ y + MV_y(x,y,t),\ t+1), & (D_m < D_s) \\ F(x, y, t+1), & (D_m \ge D_s) \end{cases}$$

wherein intra-field interpolation is given by:

$$F_i(x,y,t) = \frac{F(x, y-1, t) + F(x, y+1, t)}{2}$$

and wherein backward data prediction is given by:

$$F_p(x,y,t) = F(x - MV_x(x,y,t),\ y - MV_y(x,y,t),\ t-1)$$
30. An interlace-to-progressive scan conversion system, comprising:
a spatial line averaging prefilter having a prefiltered signal as
an output; a motion estimator having the prefiltered signal as
input and a motion-corrected signal as an output, the motion
estimator comprising: a 3-D recursive search sub-component having a
bilinear interpolator; a motion vector correction sub-component
having an error function, the error function including penalties
related to a length of the difference vector between a given
candidate vector and a plurality of neighboring vectors; a block
erosion sub-component; wherein the motion estimator assumes that a
motion vector for an object between a previous field and a current
field is the same as a motion vector for the object between the
current field and a next field; and a three-stage adaptive recursive filter having the prefiltered output and the motion-corrected output as inputs, the three stages comprising: a first stage that
comprises a function that selects between using static pixels data
and moving pixels data from a next field; a second stage that
comprises a function that selects a more valid set of data between
motion compensated data from a previous field and the pixels
selected by the first stage; and a third stage that comprises a
function that combines an intra-field interpolation with the more
valid set of data selected by the second stage.
31. An interlace-to-progressive scan conversion system, comprising: a spatial line averaging prefilter having a prefiltered signal as an output; a motion estimator having the prefiltered signal as input and a motion-corrected signal as an output, the motion estimator comprising: a 3-D recursive search sub-component; a motion vector correction sub-component; and a block erosion sub-component; wherein: the 3-D recursive search sub-component includes a bilinear interpolator defined by:

$$F(x,y,t) = yf \cdot xf \cdot F(xi, yi, t) + yf\,(1 - xf)\,F(xi+1, yi, t) + (1 - yf)\,xf\,F(xi, yi+1, t) + (1 - yf)(1 - xf)\,F(xi+1, yi+1, t)$$

where:

$$xf = x - \lfloor x \rfloor, \qquad yf = y - \lfloor y \rfloor$$

and:

$$xi = \lfloor x \rfloor, \qquad yi = \lfloor y \rfloor$$

and wherein a value of a first estimator is set to a value of a second estimator if:

$$e(\overline{MV}_a,\ \overline{X} - \overline{SMV}_a,\ t) > e(\overline{MV}_b,\ \overline{X} - \overline{SMV}_b,\ t) + Th$$

and wherein the value of the second estimator is set to the value of the first estimator if:

$$e(\overline{MV}_b,\ \overline{X} - \overline{SMV}_b,\ t) > e(\overline{MV}_a,\ \overline{X} - \overline{SMV}_a,\ t) + Th$$

where Th is a fixed threshold; the 3-D recursive search sub-component has a look-up table consisting of:

$$US_n = \left\{ \begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}0\\-1\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}-1\\0\end{pmatrix}, \begin{pmatrix}0\\2\end{pmatrix}, \begin{pmatrix}0\\-2\end{pmatrix}, \begin{pmatrix}3\\0\end{pmatrix}, \begin{pmatrix}-3\\0\end{pmatrix}, \begin{pmatrix}0\\\tfrac{1}{4}\end{pmatrix}, \begin{pmatrix}0\\-\tfrac{1}{4}\end{pmatrix}, \begin{pmatrix}\tfrac{1}{4}\\0\end{pmatrix}, \begin{pmatrix}-\tfrac{1}{4}\\0\end{pmatrix} \right\}$$

the motion vector correction sub-component has a motion vector error correction function defined by:

$$\overline{MV}(x,y,t) = \begin{cases} \begin{pmatrix}0\\0\end{pmatrix}, & (e_m(x,y,t) \ge e_s(x,y,t)) \\ \overline{MV}(x,y,t), & (e_m(x,y,t) < e_s(x,y,t)) \end{cases}$$

where:

$$e_m(x,y,t) = \sum |F(C) - F(D)| \qquad e_s(x,y,t) = \sum |F(A) - F(B)|$$

and where A, B, C, D, and X are blocks containing ends of candidate motion vectors, X being in the current field, A and C being in the previous field, and B and D being in the next field; the block erosion sub-component divides each block, defined by:

$$B(x,y,t) = \left\{(x,y)\ \middle|\ X_x - \tfrac{X}{2} \le x \le X_x + \tfrac{X}{2},\ X_y - \tfrac{Y}{2} \le y \le X_y + \tfrac{Y}{2}\right\}$$

to which a vector $\overline{MV}(x,y,t)$ is assigned, into four sub-blocks $B_{i,j}(x,y,t)$:

$$B_{i,j}(x,y,t) = \left\{(x,y)\ \middle|\ X_x - \frac{(1-i)X}{4} \le x \le X_x + \frac{(1+i)X}{4},\ X_y - \frac{(1-j)Y}{4} \le y \le X_y + \frac{(1+j)Y}{4}\right\}$$

wherein the variables i and j take the values +1 and -1; wherein a vector $\overline{MV}_{i,j}(x,y,t)$ is assigned to the pixels of each of the sub-blocks $B_{i,j}(x,y,t)$:

$$\forall (x,y) \in B_{i,j}(x,y,t):\quad \overline{MV}_{i,j}(x,y,t) = \overline{MV}_{i,j}(\overline{X}, t)$$

wherein:

$$\overline{MV}_{i,j}(\overline{X}, t) = \operatorname{med}\left[\overline{MV}(x + iX, y, t),\ \overline{MV}(\overline{X}, t),\ \overline{MV}(x, y + jY, t)\right]$$

wherein the median function is a median on the x and y vector components separately, and wherein a resulting vector is replaced by an original motion vector unless the resulting vector is equal to one of the three input vectors; and a three-stage adaptive recursive filter having the prefiltered signal and the motion-corrected signal as inputs, the three stages comprising: a first stage that comprises a function that selects between using static pixels data and moving pixels data from a next field according to the function:

$$F_n(x,y,t) = \begin{cases} F(x + MV_x(x,y,t),\ y + MV_y(x,y,t),\ t+1), & (D_m < D_s) \\ F(x, y, t+1), & (D_m \ge D_s) \end{cases}$$

where:

$$D_s = \sum_{k=-2}^{2} C_v(k)\,\left|F(x, y+k, t) - F(x, y+k, t+1)\right|$$

$$D_m = \sum_{k=-2}^{2} C_v(k)\,\left|F(x, y+k, t) - F(x - MV_x(x,y,t),\ y - MV_y(x,y,t) + k,\ t+1)\right|$$

a second stage that comprises a function that selects a more valid set of data between motion compensated data from a previous field and the pixels selected by the first stage; and a third stage that comprises a function that combines an intra-field interpolation with the more valid set of data selected by the second stage according to the function:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ c_i F_i(x,y,t) + (1 - c_i)\left(c_p F_p(x,y,t) + (1 - c_p) F_n(x,y,t)\right), & (\text{otherwise}) \end{cases}$$

wherein c_i and c_p are adaptive coefficients ranging from 0 to 1; F_n is given by:

$$F_n(x,y,t) = \begin{cases} F(x + MV_x(x,y,t),\ y + MV_y(x,y,t),\ t+1), & (D_m < D_s) \\ F(x, y, t+1), & (D_m \ge D_s) \end{cases}$$

wherein intra-field interpolation is given by:

$$F_i(x,y,t) = \frac{F(x, y-1, t) + F(x, y+1, t)}{2}$$

and wherein backward data prediction is given by:

$$F_p(x,y,t) = F(x - MV_x(x,y,t),\ y - MV_y(x,y,t),\ t-1)$$
32. A method for converting an interlaced image to a progressive
scan image, the method comprising: providing an input signal
corresponding to an image; prefiltering the input signal with a
spatial line averaging prefilter; estimating motion in the image
by: performing a 3-D recursive search; performing a motion vector
correction; performing a block erosion to reduce blockiness in the
progressive scan image; filtering the signal in three stages: in
the first stage selecting between using static pixels data and
moving pixels data from a next field; in the second stage selecting
a more valid set of data between motion compensated data from a
previous field and the pixels selected by the first stage; and in
the third stage combining an intra-field interpolation with the
more valid set of data selected by the second stage.
33. A method for converting an interlaced image to a progressive
scan image, the method comprising: providing an input signal
corresponding to an image; prefiltering the input signal with a
spatial line averaging prefilter; estimating motion in the image
by: assuming that a motion vector for an object between a previous
field and a current field is the same as a motion vector for the
object between the current field and a next field; performing a 3-D
recursive search; performing a motion vector correction in which
the error function penalizes a candidate vector based on a length
of a difference vector between the candidate vector and a plurality
of neighboring vectors; performing a block erosion to reduce
blockiness in the progressive scan image; filtering the signal in
three stages: in the first stage selecting between using static
pixels data and moving pixels data from a next field; in the second
stage selecting a more valid set of data between motion compensated
data from a previous field and the pixels selected by the first
stage; and in the third stage combining an intra-field
interpolation with the more valid set of data selected by the
second stage.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims priority from U.S. Provisional
Application No. 60/267,356, filed Feb. 8, 2001.
BACKGROUND
[0002] To interlace or not to interlace has been a contentious issue between the television and computer communities. To the general public, there appears to be little difference between the television set and the computer monitor. However, those skilled in the art appreciate the fundamental difference between the video data formats used by televisions and by computer monitors. Current television technology uses an interlaced scanning method. In this approach images are divided into several frames, and each frame is handled like a two-dimensional matrix; for US NTSC signals each frame has 525 lines. At each consecutive time instant, only one half of the lines are drawn, skipping every other line. Then the remaining lines are drawn, interlaced with the previous set. Computer monitors, on the other hand, use a progressive scanning approach that scans all the lines in order from top to bottom in a single frame.
[0003] At first, interlaced scanning was used because of its technological and psychophysical advantages. Interlacing was an efficient method to reduce bandwidth when TV frame memories were expensive and TV broadcast bandwidth was limited. Interlacing also takes advantage of psychophysical properties of the human visual system. For example, the human visual system is less sensitive to flickering details than to large-area flicker. Doubling the scanning frequency reduces the large-area flickering. By transmitting only half of the information at a time, a higher scanning frequency can be achieved using the same bandwidth, which is one of interlacing's principal advantages. However, in addition to the loss of vertical resolution, interlacing results in many well-known artifacts such as line flicker. Line flicker happens when high vertical spectrum components are present in static images. Interlacing also produces vertical-temporal aliasing in moving images if there is no appropriate vertical band limitation. Another major flaw of interlacing is that it complicates many image processing tasks, especially scanning format conversion.
[0004] Even though it seems that interlaced scanning and progressive scanning each have a well-established domain, with the advancement of technology, especially in the multimedia area, the demand for television video and personal computer video to converge is becoming irresistible. With the emergence of the new High Definition Television (HDTV) technology, a good algorithm for interlace-to-progressive scan conversion is becoming even more important, since many of the HDTV proposals involve either transmission of interlaced video or high spatial frequency information at a reduced temporal rate. The consumer wants to be able to view a standard NTSC signal from broadcast or VCR on the new HDTV, but because of the nature of HDTV, the artifacts in a standard NTSC signal become more visible and annoying when displayed on a high-definition television. On the other hand, consumers also want to use their HDTVs to their maximum potential. A good interlace-to-progressive scan algorithm is needed to convert a standard NTSC signal to an HDTV signal.
[0005] Interlace-to-progressive scan conversion (which is sometimes
called deinterlacing) can be described as interpolating the missing
lines of an interlaced sequence. It can also be seen as a
resolution enhancement technique, which can use either linear or
non-linear interpolation (or both), or as a process to recover the
alias component. If the interlacing process is seen as a form of
spatio-temporal sub-sampling then interlace-to-progressive scan
conversion is the reverse operation aiming at the removal of the
sub-sampling artifact. From the mathematical perspective, the
process of interlace-to-progressive scan conversion is a problem in
linear up-sampling conversion.
[0006] A number of different interlace-to-progressive conversion algorithms have been proposed in the last few years. These algorithms range from simple spatial, temporal, or spatio-temporal conversion algorithms or filtering, to more advanced motion adaptive filtering, to the most advanced adaptive conversion algorithms with motion estimation. Despite years of research, most of the algorithms are only suitable for specific image characteristics. In each case, the spectral content of the video data is different and hence requires a different approach. The challenge, therefore, is to implement an algorithm that can be adapted to various image characteristics. Unfortunately, this is not a simple problem, since interlace-to-progressive scan conversion suffers from some fundamental problems. For example, though a few algorithms can adapt to various image characteristics, most of them are too complicated to implement in real applications.
[0007] In interlaced scanning, each frame is divided into two fields, normally a top field containing the odd lines and a bottom field containing the even lines. These two fields are transmitted alternately. For the purposes of this description, the terms top and bottom field are used in normal discussion, while odd and even fields with starting line equal to 1 are used whenever the relation between the frame/field number and the line within the frame/field needs to be emphasized.
[0008] FIG. 1 illustrates the interlace-to-progressive scan conversion, or deinterlacing, task. The input video fields, containing samples of either the top or bottom vertical grid positions (lines) of an image, have to be converted to frames. These frames represent the same image as the corresponding input field but contain the samples of all lines. Formally, the output frame F_o(x,y,t) can be defined as:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ F_i(x,y,t), & (\text{otherwise}) \end{cases} \quad (1.0)$$

[0009] where: [0010] F(x,y,t) denotes the pixels from the original lines of the input field, [0011] x, y are coordinates in the spatial plane, [0012] t is the coordinate in the temporal domain, and [0013] F_i(x,y,t) denotes the interpolated pixels.
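As a minimal sketch of equation (1.0) in Python: the full-height field representation (missing lines present in the array but undefined) and the pluggable interpolator are illustrative assumptions, not details from the application.

```python
import numpy as np

def deinterlace_frame(field, t, interpolate_line):
    """Assemble a progressive frame per equation (1.0): keep original
    lines where y mod 2 == t mod 2 and interpolate the rest.

    `field` is a full-height array in which only lines of the
    transmitted parity hold valid samples (an illustrative layout)."""
    height, width = field.shape
    frame = np.empty((height, width), dtype=field.dtype)
    for y in range(height):
        if y % 2 == t % 2:
            frame[y] = field[y]                    # original line F(x, y, t)
        else:
            frame[y] = interpolate_line(field, y)  # interpolated line F_i(x, y, t)
    return frame
```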
[0014] The interlace-to-progressive scan conversion process doubles the sampling frequency in the vertical dimension. The process removes the first repeated spectrum caused by the interlaced sampling of the video. At first, interlace-to-progressive scan conversion seems like an easy task. However, due to the lack of prefiltering in the interlacing process, the interlace-to-progressive scan conversion process is not as simple as it looks.
[0015] There are two major problems confronting interlace-to-progressive scan conversion processes. The first is that TV signals do not fulfill the demands of the sampling theorem, i.e., they do not satisfy the Nyquist criterion. Interlaced scanning introduces aliasing unless the moving image is properly pre-filtered with a low-pass filter. In actual image capturing devices, temporal filtering is performed by camera time integration, and is performed independently of spatial filtering, resulting in separable prefiltering. In most practical systems, there is no prefiltering to suppress the higher frequencies prior to sampling in TV signals. Hence, some of the information is lost during the interlaced sampling process. From a frequency-domain point of view, some of the higher frequencies lie beyond half the sampling frequency and thus cause aliasing.
[0016] The second major problem is that the temporal frequencies at the retina of an observer have an unknown relation to the scene content. Results from psychophysical experiments show that temporal filtering blurs moving objects due to eye tracking. High frequencies due to object motion are mapped to zero frequency (DC) at the retina if the observer tracks the object. Consequently, suppression of such apparently high but less relevant frequencies results in significant blurring for this viewer. Due to this complication, the apparent quality of interlaced video is best if it comes from progressive video by dropping half the lines with motion adaptive prefiltering. In most cases, however, motion adaptive prefiltering is not feasible, and interlacing is done without any prefiltering.
[0017] FIG. 2a shows the vertical-temporal (VT) video spectrum of a static scene. This spectrum includes the baseband and spectral replicas due to the interlaced sampling. The sampling lattice results in a quincunx pattern of the centers of the spectral replicas. The vertical detail of the scene determines the extent of the VT spectrum support, while vertical motion changes its orientation, as illustrated in FIG. 2b. FIG. 3a illustrates the general spectrum of an interlaced signal with motion, and FIG. 3b shows the ideal spectrum resulting from an interlace-to-progressive scan conversion process. Clearly, interlace-to-progressive scan conversion is a spatio-temporal problem.
[0018] Over the last 30 years researchers have proposed many different algorithms for interlace-to-progressive scan conversion. So far no one has discovered an algorithm that can perfectly deinterlace every image sequence. One major division among these algorithms is whether or not they use motion compensation.
[0019] Most of the algorithms introduced before 1990 used non-motion-compensated methods. They fall generally into the categories of linear and non-linear methods. Linear methods are the best methods in the absence of motion. These methods are considered outdated within the TV-product community. However, they are still widely used in the computer community, especially for multimedia products.
[0020] Linear methods are well known for their low cost of implementation. All linear methods, whether using spatial, temporal, or spatio-temporal filtering, can be defined by:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ \sum_k \sum_n F(x, y+k, t+n)\, h(k,n), & (\text{otherwise}) \end{cases} \quad (1.1)$$

[0021] where: k, n ∈ {…, -1, 0, 1, …} [0022] h(k,n) is the impulse response of the filter in the VT domain. The choice of h(k,n) depends upon whether it is a spatial, temporal, or spatio-temporal filter.
[0023] Spatial linear interlace-to-progressive scan conversion uses the correlation between vertically neighboring pixels to interpolate the missing pixels. It has the characteristic of passing all temporal frequencies, which guarantees the absence of motion artifacts. Defects occur only with high vertical frequencies. It is easy to implement and has the lowest hardware requirement, since it normally requires only a few line buffers instead of a field buffer. It also does not require complex computation to execute its filtering algorithm.
[0024] The simplest form of the spatial progressive scan conversion algorithm is line repetition. This algorithm doubles every line in the original fields. The frequency response of this interpolator is given by:

$$H_y(f_y) = |\cos(\pi f_y)|$$

[0025] where: [0026] f_y is the vertical frequency (normalized to the vertical sampling frequency) and [0027] H_y(f_y) is the frequency response in the vertical direction. This frequency characteristic has no steep roll-off. As a consequence, the first spectral replica is not much suppressed, while the baseband is partly suppressed. This causes aliasing and blur in the output signal.
[0028] Line averaging is the most popular and commonly used spatial filtering algorithm. It can be defined by equation (1.1) above, with h(k,0)=0.5 for k=±1 and h(k,n)=0 otherwise. The frequency response:

$$H_y(f_y) = \frac{1}{2} + \frac{1}{2}\cos(2\pi f_y) \quad (1.2)$$

indicates a higher alias suppression. However, this suppresses the higher part of the baseband spectrum as well, which causes the output signal to be blurred. In general, purely spatial filters cannot discriminate between the baseband and the repeated spectrum, regardless of their length.
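For concreteness, a short Python sketch of both spatial methods follows; the full-height field representation and the border clamping are illustrative assumptions.

```python
import numpy as np

def line_repetition(field, t):
    """Fill each missing line by repeating the nearest original line."""
    out = field.copy()
    for y in range(field.shape[0]):
        if y % 2 != t % 2:                        # missing line
            src = y - 1 if y - 1 >= 0 else y + 1  # nearest original line
            out[y] = field[src]
    return out

def line_averaging(field, t):
    """h(k, 0) = 0.5 for k = +/-1: average the original lines directly
    above and below each missing line, clamping at the borders."""
    out = field.astype(np.float32)
    height = field.shape[0]
    for y in range(height):
        if y % 2 != t % 2:
            above = y - 1 if y - 1 >= 0 else y + 1
            below = y + 1 if y + 1 < height else y - 1
            out[y] = 0.5 * (out[above] + out[below])
    return out
```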
[0029] Temporal interlace-to-progressive scan conversion uses the
correlation in the time domain. Pure temporal interpolation has the
characteristic of passing all the spatial frequencies.
Consequently, there is no degradation in stationary images.
[0030] The most popular temporal filtering algorithm is field insertion. The scan conversion is done by inserting the lines from the previous field to replace the missing lines. The formal definition is given by equation (1.1) with h(0,-1)=1 and h(k,n)=0 otherwise. The frequency characteristic is analogous to that of line repetition, the only difference being that f_y is replaced with f_t.
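A sketch of field insertion under the same illustrative field layout as above:

```python
def field_insertion(cur_field, prev_field, t):
    """'Weave' (h(0, -1) = 1): take each missing line from the same
    position in the previous, opposite-parity field."""
    out = cur_field.copy()
    for y in range(cur_field.shape[0]):
        if y % 2 != t % 2:           # line absent from the current field
            out[y] = prev_field[y]   # present in the previous field
    return out
```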
[0031] Field insertion, also called "weave" in the PC world, provides an all-pass characteristic in the vertical frequency domain. It is the best solution for still images, as all vertical frequencies are preserved. However, moving objects are not shown at the same position in the odd and even lines of a single output frame. This causes serration of moving edges, which is a very annoying artifact.
[0032] Longer temporal finite impulse response (FIR) filters require multiple fields of storage. Unlike signal processing for audio signals, this increases the storage requirement significantly. Therefore it is economically unattractive. Furthermore, such filters still cannot discriminate between the baseband and the repeated spectra.
[0033] A spatio-temporal interpolation filter would theoretically
solve the interlace-to-progressive scan conversion problem if the
signal were band-limited prior to interlacing. The required
pre-filter would be similar to the up-conversion filter. The
required frequency characteristic is shown in FIG. 4.
[0034] Although the pre-filter is missing, and there are problems with motion-tracking viewers, FIG. 4 illustrates that the spatio-temporal filter is certainly the best linear approach in that it prevents both alias and blur in stationary images. The vertical detail is gradually reduced with increasing temporal frequencies. Such a loss of resolution with motion is not unnatural.
[0035] The filter is usually designed such that the contribution from the neighboring fields is limited to the higher vertical frequencies. As a consequence, motion artifacts are absent for objects without vertical detail that move horizontally. Early versions of spatio-temporal filtering reduced the filter to a purely two-dimensional (vertical-temporal) filter. The version that gives the best result, however, utilizes other spatially neighboring pixels and can be defined as:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ \dfrac{1}{18}\sum_k \sum_n F(x, y+k, t+n)\, h(k,n), & (\text{otherwise}) \end{cases}$$

$$h(k,n) = \begin{cases} 1, 8, 8, 1, & (k = -3, -1, 1, 3)\ (n = 0) \\ -5, 10, -5, & (k = -2, 0, 2)\ (n = -1) \\ 0, & (\text{otherwise}) \end{cases} \quad (1.3)$$
[0036] Linear temporal interpolators are perfect in the absence of motion. Linear spatial methods have no artifacts when no vertical detail occurs. It seems logical, therefore, to adapt the interpolation strategy to motion and/or vertical detail. Many such systems have been proposed, mainly in the 1980s. The basic concept of these methods is that they include some kind of motion detection, implemented either implicitly or explicitly. The motion detector is used to decide whether the algorithm will do inter-field interpolation or intra-field interpolation. Inter-field interpolation is used in static situations, while intra-field interpolation is used with motion.
[0037] Non-linear algorithms consist primarily of implicitly adaptive nonlinear algorithms and some explicitly adaptive nonlinear algorithms with motion detector (MD) algorithms. The implicitly adaptive nonlinear algorithms provided the best affordable interlace-to-progressive scan conversion method for TV receivers until the 1990s, when single-chip motion compensated methods became feasible. Implicitly adaptive nonlinear algorithms are still widely used even now, especially in the computer community.
[0038] Median filtering is by far the most popular example of implicitly adaptive methods. The simplest version is a three-tap VT median filter. The interpolated samples are found as the median luminance value of the vertical neighbors and the temporal neighbor in the previous field. The formal definition of this filter is given by:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ \operatorname{med}\left(F(x,y-1,t),\ F(x,y+1,t),\ F(x,y,t-1)\right), & (\text{otherwise}) \end{cases} \quad (1.4)$$

[0039] where med(A, B, C) is defined by:

$$\operatorname{med}(A,B,C) = \begin{cases} A, & (B < A < C) \lor (C < A < B) \\ B, & (A \le B \le C) \lor (C \le B \le A) \\ C, & (\text{otherwise}) \end{cases} \quad (1.5)$$

(The formulae above can be generalized to any number of input values.)
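Equation (1.4) translates directly into a short sketch (same illustrative field layout as above):

```python
import numpy as np

def vt_median_deinterlace(cur_field, prev_field, t):
    """Three-tap VT median of equation (1.4): each missing pixel is the
    median of its two vertical neighbors and the temporal neighbor in
    the previous field."""
    out = cur_field.copy()
    height = cur_field.shape[0]
    for y in range(height):
        if y % 2 == t % 2:
            continue                    # original line, keep as-is
        up = cur_field[y - 1] if y - 1 >= 0 else cur_field[y + 1]
        down = cur_field[y + 1] if y + 1 < height else cur_field[y - 1]
        out[y] = np.median(np.stack([up, down, prev_field[y]]), axis=0)
    return out
```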
[0040] One of the attractive features of the median filter is its ability to preserve signal edges while suppressing impulse noise quite well. This is done by implicitly adapting to motion or edges. The underlying assumption is that, in the case of stationarity, F(x, y, t) is likely to have a value between those of its vertical neighbors; hence inter-field (temporal) filtering is utilized. In the case of motion, intra-field interpolation often results, since the correlation between the samples in the current field is likely to be the highest. Thus the median filter automatically realizes intra-/inter-field switching on a per-pixel basis.
[0041] The median filter, however, exhibits some undesirable
performance such as edge and plateau jitter for non-constant signal
plus impulsive noise. Near the edge, the median filter allows bias
error depending on the noise power and the height of signal edge.
Applying smoothing prior to median filtering can limit this
flaw.
[0042] The major drawback of median filtering is that it distorts vertical details and introduces alias. Hsu and Chen proposed a 2D adaptive separable median filter to reduce the alias (blocking) effect. This method is based on a 1D adaptive median, which can be defined as:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ \dfrac{\sum_{i=M-d}^{M+d} \hat{x}_i}{2d+1}, & (\text{otherwise}) \end{cases} \quad \text{where } M = \frac{N-1}{2},\quad d = \begin{cases} 0, & (l > M) \\ M - l, & (l \le M) \end{cases} \quad (1.6)$$

[0043] where l is the distance between the position of the filter and that of the blocking effects.
[0044] This algorithm was originally used for removing blocking effects in block-based image coding, but it can be easily adapted by giving l any specific value and combining the algorithm with a motion/edge detector. Despite all the disadvantages of the median filter, its superior properties at vertical edges and its low hardware cost have made it very successful.
[0045] Motion-adaptive methods use a motion detector algorithm to detect any movement inside the image. Based on the motion detector result, a filtering algorithm is used to convert the image from interlaced to progressive. To detect the motion, the difference between two consecutive images is calculated. Normally this calculation is done only on the luminance data stream (the Y stream in the YUV format). Unfortunately, due to noise, the difference signal does not become zero in all parts of the picture that lack motion. Some systems have additional problems; for example, chrominance streams cause nonstationarities in colored regions, interlace causes nonstationarities in vertically detailed parts, and timing jitter of the sampling clock is particularly harmful in horizontally detailed areas.
[0046] To overcome these problems, it is desirable that the motion detector produce a multilevel output signal rather than a simple binary one. The multilevel signal can be used to give more information about the motion characteristics. Because of all these difficulties with motion detection, providing a practical motion detector is not trivial. Assumptions are necessary to realize a practical motion detector that yields adequate performance in most cases. Common assumptions to improve the detector include:
[0047] 1. Noise is small and signal is large
[0048] 2. The spectrum part of the chrominance streams carries no
motion information
[0049] 3. The low frequency energy in signal is larger than in
noise and alias
[0050] 4. Objects are large compared to a pixel.
[0051] A good motion detector must switch, or preferably fade,
between two processing modes: one optimal for stationarity and the
other for motion. An important aspect of designing a good
motion-adaptive algorithm is determining the switching threshold or
the fading function. Even with an adaptive switching/fading
function, it is still difficult to make a function that can adapt
to any kind of image.
[0052] Those skilled in the art will appreciate that temporal and vertical filters may be combined to reject alias components and preserve the frequency domain by applying motion adaptive fading. Fading between an interpolator optimized for static image parts and one for moving image parts can be achieved with the following function:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ \alpha\, F_{st}(x,y,t) + (1 - \alpha)\, F_{mot}(x,y,t), & (\text{otherwise}) \end{cases} \quad (1.7)$$

with F_st the result of interpolation for static image parts and F_mot the result for moving image parts. The motion detector determines the mixing factor α.
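A sketch of equation (1.7) follows; the exponential mapping used for the detector below is our own illustrative choice, not one taken from the text.

```python
import numpy as np

def fade(f_static, f_motion, alpha):
    """Equation (1.7): per-pixel fade between static-optimized and
    motion-optimized interpolations; alpha = 1 means fully static."""
    alpha = np.clip(alpha, 0.0, 1.0)
    return alpha * f_static + (1.0 - alpha) * f_motion

def motion_alpha(cur_field, prev_same_parity_field, k=0.05):
    """A crude multilevel motion detector (illustrative assumption):
    map the same-parity field difference into [0, 1]."""
    diff = np.abs(cur_field.astype(np.float32)
                  - prev_same_parity_field.astype(np.float32))
    return np.exp(-k * diff)   # small difference -> alpha near 1 (static)
```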
[0053] It has been suggested by some that a well-defined VT filter can perform as well as the best motion adaptive filter at a lower price. The idea is that, in order to prevent switching artifacts, the fading must result in something very similar to VT filtering, which needs no motion detector.
[0054] Others have suggested a fade between more than two interpolators. For example, in certain interpolators the high frequency information for the interpolated line is extracted from the previous line, while a motion adaptive interpolator determines the low frequency information:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ F_{HF}(x, y-1, t) + \alpha\, F_{av}(x,y,t) + (1-\alpha)\, F_{LF}(x, y, t-1), & (\text{otherwise}) \end{cases} \quad (1.8)$$

with F_HF and F_LF being the high-pass and low-pass filtered versions of the input signal F. F_av is defined by:

$$F_{av} = \frac{F_{LF}(x, y-1, t) + F_{LF}(x, y+1, t)}{2} \quad (1.9)$$

with α controlled by the motion detector.
[0055] Another kind of motion detector is known as a "Mouse's Teeth Detector," schematically illustrated in FIGS. 5 and 6. This motion detector uses a spatial offset in the vertical direction to detect motion. The computational complexity of the algorithm is very low, and it requires memory for only one field instead of an entire frame. The output of the detector has 8 levels that can be used to better characterize the motion between the fields.
[0056] Another strategy for motion detection involves edge detection. Certain edge dependent methods use a larger neighborhood of samples in order to capture information about the edge orientation. If an intra-field interpolation is necessary because of motion, then the interpolation should preferably preserve the baseband spectrum. After determining the least harmful filter orientation, the signal is interpolated in that direction. As shown in FIG. 7, the interpolated sample X is determined by a luminance gradient indication calculated from its immediate neighborhood. The formal definition is given by:

$$X = \begin{cases} X_A, & (|A - F| < |C - D|) \land (|A - F| < |B - E|) \\ X_C, & (|C - D| < |A - F|) \land (|C - D| < |B - E|) \\ X_B, & (\text{otherwise}) \end{cases} \quad (1.10)$$

[0057] where X_A, X_B, X_C are defined by:

$$X_A = \frac{A + F}{2}, \quad X_B = \frac{B + E}{2}, \quad X_C = \frac{C + D}{2}$$

[0058] and where the pixels A, B, C, D, E, and F are those indicated in FIG. 7, formally defined by:

A = F(x-1, y-1, t)   E = F(x, y+1, t)
B = F(x, y-1, t)     F = F(x+1, y+1, t)
C = F(x+1, y-1, t)   G = F(x, y-3, t)
D = F(x-1, y+1, t)   H = F(x, y+3, t)

In certain methods, X_B is replaced by a VT median filter.
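Equation (1.10) is the classic edge-based line average; a per-pixel sketch (interior pixels only, neighborhood per FIG. 7) might look like:

```python
def ela_pixel(F, x, y):
    """Edge-dependent interpolation of equation (1.10) for one missing
    interior pixel; F is the current field as a 2-D array."""
    A = float(F[y - 1, x - 1]); B = float(F[y - 1, x]); C = float(F[y - 1, x + 1])
    D = float(F[y + 1, x - 1]); E = float(F[y + 1, x]); Fv = float(F[y + 1, x + 1])
    d_af, d_cd, d_be = abs(A - Fv), abs(C - D), abs(B - E)
    if d_af < d_cd and d_af < d_be:
        return (A + Fv) / 2.0   # interpolate along the A-F diagonal
    if d_cd < d_af and d_cd < d_be:
        return (C + D) / 2.0    # interpolate along the C-D diagonal
    return (B + E) / 2.0        # default: vertical average
```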
[0059] It is uncertain whether a zero difference between pairs of neighboring samples indicates the spatial direction in which the signal is stationary. For example, noise (or, more fundamentally, alias) can negatively influence the decision. An edge detector can be applied to switch or fade between at least two processing modes, each of them optimal for interpolation for a certain orientation of the edge.
[0060] It is possible to increase the edge detection consistency by also checking the edge orientation at neighboring pixels. In certain methods, directional edge-detection operators are defined. For example, the error measurement for a vertical orientation is defined by:

$$\text{angle}_{90°} = |B - E| + |C - F| \quad (1.11)$$

and for an edge under 116 degrees:

$$\text{angle}_{116°} = |A - E| + |B - F| \quad (1.12)$$
[0061] Edge consistency information is further increased by looking for a dominating main direction in a near neighborhood. The problem of alias, however, still remains.
[0062] Other methods of interpolation are hybrid methods, which mix
linear and nonlinear methods. For example, in FIR Median hybrids,
schematically illustrated in FIG. 8, first an 8-Tap VT filter is
used. The output of the FIR filter is fed as one of the inputs of a
five point median filter. The remaining four inputs are the nearest
neighbors on the VT sampling grid.
[0063] Another kind of hybrid, the 9-point weighted median, extends the aperture of the median filter in the horizontal domain to enable implicit edge adaptation. It draws on seven distinct sample points, and the output of the median is defined by:

$$F_o(x,y,t) = \operatorname{med}\left(A, B, C, D, E, F, \frac{B+E}{2}, F(x,y,t-1), F(x,y,t-1)\right) \quad (1.13)$$

where A, B, C, D, E, and F are the pixels indicated in FIG. 7 and defined in equation (1.10).
[0064] Other methods extend this concept with a motion detector. For example, instead of using nine fixed points, certain methods make the coefficients of (B+E)/2 and F(x, y, t-1) adaptive. The motion detector controls the importance, or "weight," of these individual pixels at the input of the median filter. The output of the deinterlacer is defined by:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ \operatorname{med}\left(A, B, C, D, E, F,\ \alpha F(x,y,t-1),\ \beta\,\frac{B+E}{2}\right), & (\text{otherwise}) \end{cases} \quad (1.14)$$

where α and β are integer weights; αA indicates the number of A's that occur in equation (1.14). For example, 3A means A, A, A. A large value of α increases the probability of field insertion, whereas a large β increases the probability of line averaging at the output.
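Since the weights simply replicate samples in the median's input list, a sketch of equation (1.14) for one pixel reduces to:

```python
import numpy as np

def weighted_median_pixel(A, B, C, D, E, F, prev_pixel, alpha, beta):
    """Equation (1.14): an integer weight w contributes w copies of its
    sample (e.g. alpha = 3 adds F(x, y, t-1) three times)."""
    samples = [A, B, C, D, E, F]
    samples += [prev_pixel] * alpha      # alpha copies of F(x, y, t-1)
    samples += [(B + E) / 2.0] * beta    # beta copies of the line average
    return float(np.median(samples))
```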
[0065] Another combination of implicit/explicit edge and motion adaptivity uses a hierarchical three-level motion detector that provides indications of static, slow, and fast motion. Based on this analysis, one of three different interpolators is selected. In the case of static images, a temporal FIR filter is selected. In the case of slow motion, the so-called weighted hybrid median filter (WHMF) is used. And in the case of fast motion, a spatial FIR filter is used as the interpolator. Applying the definitions of FIG. 7 yields:

$$F_o = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ \frac{1}{2}\left(F(x,y,t-1) + F(x,y,t+1)\right), & (\text{static}) \\ \operatorname{med}\left(\alpha_0\,\frac{A+F}{2},\ \alpha_1\,\frac{B+E}{2},\ \alpha_2\,\frac{C+D}{2},\ \alpha_3\,\frac{G+H}{2}\right), & (\text{slow motion}) \\ c_0 B + c_1 E + c_2 G + c_3 H, & (\text{fast motion}) \end{cases} \quad (1.15)$$

The coefficients α_i are calculated according to Weber's Law: the eye is more sensitive to small luminance differences in dark areas than in bright areas.
[0066] Motion compensated methods are the most advanced interlace-to-progressive scan conversion algorithms available. Like many of the algorithms discussed above, motion compensated methods try to interpolate in the direction with the highest correlation. With motion vectors available, this is an interpolation along the trajectory of motion. Using motion compensation, a moving sequence can be virtually converted into a stationary one. Thus, methods that perform better for static image parts will profit from motion compensation.
[0067] It is very easy to add motion compensation to any of the algorithms described above. However, in the following paragraphs, attention will be given to new algorithms that cannot be deduced directly from the non-motion compensated algorithms. The common feature of these methods is that they provide a solution to the fundamental problem of motion compensating sub-sampled data. This problem arises if the motion vector used to modify the coordinates of pixels in a neighboring field does not point to a pixel on the interlaced sampling grid. In the horizontal domain this causes no serious problem, with the application of sampling rate conversion theory. In the vertical domain, however, the demands for applying the sampling theorem are not satisfied, prohibiting correct interpolation.
[0068] A first approximation to cope with this fundamental problem
is to perform a spatial interpolation whenever the motion vector
points at a nonexisting sample, or even to round to the nearest
pixel.
[0069] Certain more sophisticated methods depart from this approximation. Before actually performing an intra-field interpolation, the motion vector is extended into the previous fields to check whether this extended vector arrives in the vicinity of an existing pixel. The formal definition is given by:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ F(x - MV_x(x,y,t),\ y - MV_y(x,y,t) - \epsilon_y,\ t-1), & ((y - MV_y - \epsilon_y) \bmod 2 = t \bmod 2) \\ F(x - 2 MV_x(x,y,t),\ y - 2 MV_y(x,y,t) - \epsilon_y,\ t-2), & ((y - 2 MV_y - \epsilon_y) \bmod 2 = t \bmod 2) \\ F(x - MV_x(x,y,t),\ y - MV_y(x,y,t),\ t-1), & (\text{otherwise}) \end{cases} \quad (1.16)$$

where ε_y is the small error resulting from rounding to the nearest grid position. ε_y has to be smaller than a threshold. If no motion compensated pixels appear in the vicinity of the required position, it should be possible to find one even further backward in time. This is not recommended, however, as the motion vector loses validity when extended too far.
[0070] The algorithm implicitly assumes uniform motion over a two-field period, which is a drawback. Furthermore, the robustness to incorrect motion vectors is poor, since no protection is provided.
[0071] The motion-compensated time recursive algorithm came from the generalization of the fact that recursive filters have a lower implementation complexity than FIR filters. Even first-order linear recursive filters have an infinite impulse response and produce output depending on the whole history of the input. The motion-compensated time recursive algorithm uses a previously deinterlaced frame instead of the previous field. Once a perfectly deinterlaced image is available, and the motion vectors are accurate, sampling rate conversion theory can be used to interpolate the samples required to deinterlace the current field. The formal definition is given by:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ F_o(x - MV_x(x,y,t),\ y - MV_y(x,y,t),\ t-1), & (\text{otherwise}) \end{cases} \quad (1.17)$$

The initial condition F_o(x, y, 0) is equal to $\hat{F}$(x, y, 0), where $\hat{F}$(x, y, t) is the output of a linear spatial interpolation.
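A sketch of the recursion in equation (1.17), assuming integer per-pixel vectors (fractional vectors would require the interpolation the text mentions):

```python
def time_recursive_step(cur_field, prev_frame, mv, t):
    """Equation (1.17): fetch each missing pixel from the previously
    deinterlaced frame along its motion vector; mv[y, x] holds an
    integer (MV_x, MV_y) pair."""
    height, width = cur_field.shape
    out = cur_field.copy()
    for y in range(height):
        if y % 2 == t % 2:
            continue                              # original line
        for x in range(width):
            dx, dy = mv[y, x]
            sx = min(max(x - dx, 0), width - 1)   # clamp to the frame
            sy = min(max(y - dy, 0), height - 1)
            out[y, x] = prev_frame[sy, sx]        # F_o(x-MV_x, y-MV_y, t-1)
    return out
```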
[0072] As can be seen in FIG. 9, the interpolated samples generally
depend on previous original samples as well as previously
interpolated samples. Thus errors originating from one output frame
can propagate into subsequent output frames. This is inherent to
the recursive approach and is the worst drawback of this
approach.
[0073] To prevent serious errors from propagating, the following algorithm has been proposed:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\ (1 - c)\,\hat{F}(x,y,t) + c\,F_o(x - MV_x(x,y,t),\ y - MV_y(x,y,t),\ t-1), & (\text{otherwise}) \end{cases} \quad (1.18)$$
[0074] Aliasing at the output of the deinterlacer results in nonstationarity along the motion trajectory. Such nonstationarities can be suppressed using a filter. Cost effective filtering in the spatial, temporal, or spatio-temporal domain can best be realized with a recursive filter.
[0075] Certain methods extended the idea of the time recursive algorithm and proposed a motion-compensated first-order recursive temporal filter given by:

$$F_o(x,y,t) = \begin{cases} k\,F(x,y,t) + (1-k)\,F_o(x - MV_x(x,y,t),\ y - MV_y(x,y,t),\ t-1), & (y \bmod 2 = t \bmod 2) \\ p\,F_i(x,y,t) + (1-p)\,F_o(x - MV_x(x,y,t),\ y - MV_y(x,y,t),\ t-1), & (\text{otherwise}) \end{cases} \quad (1.19)$$

where p and k are adaptive parameters and F_i is the output of any initial interlace-to-progressive conversion algorithm. Preferably a simple method is used, such as line averaging. The derivation of k is fairly straightforward and is comparable to what we see in edge preserving recursive filters, which are used for motion-adaptive noise reduction.
[0076] A similar derivation for p is not obvious, since the difference would heavily depend upon the quality of the initial deinterlacer. To solve this problem, the factor p is selected such that the nonstationarity along the motion trajectory of the resulting output for interpolated pixels equals that of the vertically neighboring original pixels. This assumption leads to:

$$p(x,y,t)=\frac{A+B+\delta}{2\,\bigl|F_i(x,y,t)-F_o\bigl(x-MV_x(x,y,t),\, y-MV_y(x,y,t),\, t-1\bigr)\bigr|+\delta} \tag{1.20}$$

where:

$$A=\bigl|F_o(x,\,y-1,\,t)-F_o\bigl(x-MV_x(x,y,t),\, y-MV_y(x,y,t)-1,\, t-1\bigr)\bigr|$$
$$B=\bigl|F_o(x,\,y+1,\,t)-F_o\bigl(x-MV_x(x,y,t),\, y-MV_y(x,y,t)+1,\, t-1\bigr)\bigr|$$

[0077] and where δ is a small constant, to prevent division by zero.
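A compact sketch of how p from equation 1.20 drives the interpolation of a single missing pixel is given below. This is an illustrative reading of the equations, not the patented implementation; the function and argument names, and the clamping of p to 1, are assumptions added for the example.

```python
def adaptive_p(F_i, F_o_prev, A, B, delta=1.0):
    """Blend factor p (eq. 1.20): the nonstationarity of the interpolated
    pixel is matched to that of its vertical neighbors A and B;
    delta guards against division by zero."""
    return (A + B + delta) / (2.0 * abs(F_i - F_o_prev) + delta)

def interpolate_missing_pixel(F_i, F_o_prev, A, B, delta=1.0):
    """Second branch of eq. 1.19: mix the initial (e.g. line-averaged)
    estimate F_i with the motion-compensated previous output F_o_prev."""
    p = min(1.0, adaptive_p(F_i, F_o_prev, A, B, delta))  # clamp: sketch only
    return p * F_i + (1.0 - p) * F_o_prev
```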
[0078] The recursion is an essential ingredient of the concept.
Consequently, the adaptive-recursive approach, similar to the
time-recursive approach, has the risk of error propagation as its
main disadvantage.
[0079] As can be seen from the discussion above, motion estimation is used to improve the accuracy of the prediction of lines in interlace-to-progressive scan conversion. (Motion estimation also has various applications in image processing, video processing, and computer vision or robotics.) Linear or temporal interpolators are perfect in the absence of motion, but in the presence of motion, especially multiple motions in one frame, motion estimation is essential in order to have a good prediction of the missing lines.
[0080] In general, motion estimation can be divided into three categories: (1) pixel-by-pixel motion estimation (sometimes called "pel-recursive algorithms," or "PRAs"); (2) block-by-block motion estimation (commonly called "block matching algorithms," or "BMAs"); and (3) advanced motion estimation methods.
[0081] Pel-recursive algorithms have rarely been used because they
are inherently complex and quite difficult to implement. Another
problem with PRAs is that the motion estimation algorithms
sometimes run into convergence problems.
[0082] One well-known PRA is gradient matching. FIG. 10 illustrates
the principle of gradient matching. At a given point in a picture,
the function of brightness with respect to distance across the
screen will have a certain slope, known as the spatial luminance
gradient. If the associated picture area is moving, the slope will
traverse a fixed point on the screen and the result will be that
the brightness now changes with respect to time. For a given
spatial gradient, the temporal gradient becomes steeper as the
speed of movement increases. Thus motion speed can be estimated
from the ratio of the spatial and temporal gradients.
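As a rough illustration of the gradient ratio, consider the following one-dimensional NumPy sketch. It is only a toy demonstration of the principle (velocity ≈ −temporal gradient / spatial gradient); the helper names and the ramp test signal are assumptions, standing in for whatever circuitry would compute the gradients in practice.

```python
import numpy as np

def gradient_motion_1d(prev_line, curr_line, eps=1e-6):
    """Estimate per-pixel horizontal velocity from the ratio of the
    temporal to the spatial luminance gradient."""
    spatial = np.gradient(curr_line)        # dF/dx
    temporal = curr_line - prev_line        # dF/dt (one field apart)
    return -temporal / (spatial + eps)      # v = -(dF/dt)/(dF/dx)

# Toy example: a luminance ramp shifted right by 2 pixels between fields.
x = np.arange(64, dtype=float)
prev_line = x.copy()
curr_line = x - 2.0                         # same ramp, moved by +2
print(gradient_motion_1d(prev_line, curr_line)[10])  # ~2.0
```

The failure modes described next (occlusions, illumination changes) correspond to the spatial or temporal gradient changing for reasons other than motion, which corrupts this ratio.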
[0083] In practice this is difficult because there are numerous processes which can change the luminance gradient. When an object moves so as to obscure or reveal the background, the spatial gradient will change from field to field even if the motion is constant. Variations in illumination, such as when an object moves into shade, also cause difficulty. The process can be assisted by recursion, in which the motion is estimated over a larger number of fields, but this will result in problems directly after a scene change.
[0084] Phase correlation is another kind of PRA. A block diagram of
a basic phase correlator is provided in FIG. 11. A phase correlator
works by performing a discrete Fourier transform on two successive
fields and then subtracting all of the phases of the spectral
components. The phase differences are then subject to a reverse
transform which directly reveals peaks whose positions correspond
to motions between the fields. The nature of the transform domain
means that if the distance and the direction of the motion are
measured accurately, the area of the screen in which they took
place is not. Thus in practical systems the phase correlation stage
is followed by a matching stage not dissimilar to the block
matching process. However, the matching process is steered by the
motions from the phase correlation, and so there is no need to
attempt to match at all possible motions. From a practical perspective, the similarities between the two cause some people to think of phase correlation as another branch of block matching.
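A minimal sketch of the basic phase correlator follows, using NumPy FFTs. It shows only the core transform-subtract-inverse idea (the peak position of the inverse transform gives the displacement); the windowing, field handling, and the follow-up matching stage described above are omitted, and all names are illustrative.

```python
import numpy as np

def phase_correlate(f1, f2, eps=1e-9):
    """Return the (dy, dx) shift between two equally sized images by
    normalizing the cross-power spectrum (phase only) and locating the
    peak of its inverse transform."""
    F1 = np.fft.fft2(f1)
    F2 = np.fft.fft2(f2)
    cross = F1 * np.conj(F2)
    cross /= (np.abs(cross) + eps)          # discard amplitude, keep phase
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peaks beyond the midpoint to negative shifts (circular FFT).
    h, w = corr.shape
    if dy > h // 2: dy -= h
    if dx > w // 2: dx -= w
    return dy, dx

# Toy usage: shift an image by (3, 5) and recover the motion.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
shifted = np.roll(img, (3, 5), axis=(0, 1))
print(phase_correlate(shifted, img))        # (3, 5)
```

Note that the amplitude normalization in the sketch is precisely the elimination of amplitude information mentioned below, which keeps the estimator working through fades and flashgun firings.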
[0085] One way of considering phase correlation is to think of the
Fourier transform as breaking the picture into its constituent
spatial frequencies. The hierarchical structure of block matching
at various resolutions is in fact performed in parallel. In this
way small objects are not missed because they will generate
high-frequency components in the transform.
[0086] Although the matching process is simplified by adopting phase correlation, the Fourier transforms themselves require complex calculations. The high performance of phase correlation would remain academic, because it is too difficult to implement, were it not for an important assumption about the range of motion speeds. When realistic values are used for the motion speed, the computation required by block matching actually exceeds that required for phase correlation. FIG. 12 provides a block diagram of a practical phase-correlated motion estimator.
[0087] The elimination of amplitude information from the phase
correlation process ensures that motion estimation continues to
work in the case of fades, objects moving into shade, or flashgun
firings.
[0088] Block matching is the simplest approach to motion compensation. Even though it is not optimal, it has been widely used, and is the preferred technique in inter-frame motion compensated (MC) hybrid coding, interlace-to-progressive scan conversion, and other video/image processing areas. The reason for this is ease of implementation, because it does not require complicated circuitry. The idea of block matching, as illustrated in FIG. 13, is to calculate the motion of a block of pixels by comparing it with blocks in another frame or field. Normally the search is constrained to a specific window.
[0089] In block matching motion estimation algorithms a displacement/motion vector (MV) is assigned to the center of a block of pixels B(x, y, t) in the current field t. Assuming that the block is M×N pixels in size, B(x, y, t) can be described as:

$$B(x,y,t)=\bigl[\{(x,y)\mid X_x-N/2\le x\le X_x+N/2,\; X_y-M/2\le y\le X_y+M/2\},\,t\bigr]$$

where X = (X_x, X_y)^T is the center of B(x, y, t).
[0090] The motion vector MV (x,y,t) is determined by comparing the
present field block with the previous field. The goal is to find
the best match or least distorted block from the previous field.
The best matched block has a center, which is shifted with respect
to X over the motion MV (x,y,t).
[0091] It is desirable to compare all the possible positions to get the optimal MV. However, this is impractical and requires a great deal of processing overhead. In order to make the search practical to implement, it is constrained to a specific window, which is centered at X.
[0092] The window can be specified as:

$$W(x,y,t-1)=\Bigl[\bigl\{(x,y)\,\big|\,|x|\le \tfrac{N}{2}+n_1,\; |y|\le \tfrac{M}{2}+m_2\bigr\},\,t-1\Bigr] \tag{1.21}$$
[0093] The window is illustrated in FIG. 14.
[0094] In most cases it is nearly impossible to find an identical block in the previous field. The motion vector MV(x,y,t) resulting from the block-matching process is the candidate vector C̄ which yields the minimum value of an error function e(C̄, X̄, t). S is defined as the set of all possible candidate vectors C̄ within the window W(x,y,t−1):

$$S=\{\bar{C}\mid |C_x|\le n_1,\; |C_y|\le m_2\} \tag{1.22}$$
$$\overline{MV}(x,y,t)\in\{\bar{C}\in S\mid e(\bar{C},\bar{X},t)\le e(\bar{F},\bar{X},t)\;\forall\,\bar{F}\in S\}$$
[0095] Assuming that all the pixels in B(x, y, t) have the same motion, the MV(x,y,t) with the smallest matching error is assigned to all pixel positions of B(x, y, t):

$$\forall (x,y)\in B(x,y,t):\ \overline{MV}(x,y,t)\in\{\bar{C}\in S\mid e(\bar{C},\bar{X},t)\le e(\bar{F},\bar{X},t)\;\forall\,\bar{F}\in S\} \tag{1.23}$$
[0096] The error value for a given candidate vector C̄ is a function of the luminance values of the pixels in the current block B(x,y,t) and those of the shifted block from the previous field, summed over the block B(x,y,t). Methods to calculate e(C̄, X̄, t) include:

[0097] Mean Absolute Error (MAE), or Sum of Absolute Differences (SAD):

$$M_1(i,j)=\frac{1}{MN}\sum_{p=1}^{N}\sum_{q=1}^{M}\bigl|X(p,q,n)-X(p+i,\,q+j,\,n-1)\bigr|,\quad |i|\le n_1,\; |j|\le m_2 \tag{1.24}$$

[0098] Mean Square Error (MSE):

$$M_2(i,j)=\frac{1}{MN}\sum_{p=1}^{N}\sum_{q=1}^{M}\bigl(X(p,q,n)-X(p+i,\,q+j,\,n-1)\bigr)^2,\quad |i|\le n_1,\; |j|\le m_2 \tag{1.25}$$

[0099] Cross-correlation function:

$$M_3(i,j)=\frac{\sum_{p=1}^{N}\sum_{q=1}^{M}X(p,q,n)\,X(p+i,\,q+j,\,n-1)}{\Bigl[\sum_{p=1}^{N}\sum_{q=1}^{M}X(p,q,n)^2\Bigr]^{1/2}\Bigl[\sum_{p=1}^{N}\sum_{q=1}^{M}X(p+i,\,q+j,\,n-1)^2\Bigr]^{1/2}},\quad |i|\le n_1,\; |j|\le m_2 \tag{1.26}$$

Mean Absolute Error is presently the most commonly implemented method on ICs, since it permits the simplest circuitry. In several simulations Mean Absolute Error performs as well as Mean Square Error. Cross-correlation is the best method in the sense that it produces less error, but it requires far more computation, which makes it impractical to implement.
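The sketch below illustrates full-search block matching with the SAD criterion of equation 1.24 over a small window. It is a didactic NumPy rendering, not the hardware algorithm; the array layout and names are assumptions, and the caller is expected to keep the block and window inside the image.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences (eq. 1.24, without the 1/MN scale)."""
    return np.sum(np.abs(block_a.astype(float) - block_b.astype(float)))

def full_search(curr, prev, cx, cy, n=8, win=4):
    """Find the (dy, dx) displacement of the n-by-n block centered at
    (cy, cx) in `curr` that minimizes SAD against `prev`, searching a
    (2*win+1)^2 window as in eq. 1.21."""
    half = n // 2
    ref = curr[cy-half:cy+half, cx-half:cx+half]
    best, best_mv = None, (0, 0)
    for dy in range(-win, win + 1):
        for dx in range(-win, win + 1):
            cand = prev[cy-half+dy:cy+half+dy, cx-half+dx:cx+half+dx]
            err = sad(ref, cand)
            if best is None or err < best:
                best, best_mv = err, (dy, dx)
    return best_mv
```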
[0100] Block Matching Algorithm techniques depend upon several assumptions:
[0101] (1) No rotational motion occurs inside the block;
[0102] (2) Objects have inertia; and
[0103] (3) The window is sufficiently large to capture the motion from frame to frame.
[0104] The consequence of assumptions 1 and 2 is that the MV(x,y,t) that is assigned to X is applied to all the pixels in the block. Also, B(x,y,t) should be small enough so that, in cases where there is rotational motion of an object in a frame, it can be translated into a straight vector MV(x,y,t) at the block level.
[0105] Another thing that should be considered is the window size. If the window is not big enough, there is a chance that the MV(x,y,t) obtained is not optimal, especially in the case of very fast moving objects.
[0106] Conceptually, the simplest approach to block matching is brute-force, or full search, block matching. This approach involves searching every possible position. This gives the global optimum, but at the expense of extensive computation. The magnitude of the computational load is exacerbated by the need to extend motion estimation to sub-pixel accuracy. As shown in FIG. 15, even half-pixel accuracy quadruples the number of possible solutions that must be searched. A motion vector resolution of 1/4 pixel accuracy is normally considered a near true-motion vector field.
[0107] Though a brute-force search BMA gives a globally optimal result, it requires more complex circuitry or more time to process. Most motion estimation ICs at present implement a full search algorithm. The complex circuitry makes the price of these ICs impractical for most applications. It is desirable to have an affordable consumer IC for motion estimation. In some applications, a locally optimal solution is sufficient. This has led to the development of more efficient motion estimation approaches, which test only a subset of candidate vectors.
[0108] One of these approaches is conjugate direction searching
(CDS). One-at-a-time searching (OTS) is a simplified version of
conjugate direction search. OTS tracks the motion alternately
horizontally and vertically, as shown in FIG. 16. A modified and
improved version of this approach, one-dimensional full search
motion estimation, has recently been developed.
[0109] Another block searching strategy is logarithmic searching.
Logarithmic searching was the first simplified search strategy
published. The logarithmic search tracks block motion along the
direction of minimum distortion, as illustrated in FIG. 17.
[0110] Yet another block searching strategy is three-step searching, illustrated in FIG. 18. This is a fine-coarse search mechanism. At each step, the algorithm calculates and compares 9 points. Assuming that the center for the first step is X = (X_x, X_y)^T, each step can be described as:

$$(\bar{X}',t)=\{\bar{C}\in S\mid e(\bar{C},t)\le e(\bar{F},t)\;\forall\,\bar{F}\in S\} \tag{1.27}$$

where (X̄′, t) is the new center point and C̄ is a candidate point ∈ {X̄ + a·Ū_i}, with

$$\bar{U}_i\in\left\{\binom{0}{0},\binom{0}{1},\binom{1}{1},\binom{1}{0},\binom{1}{-1},\binom{0}{-1},\binom{-1}{-1},\binom{-1}{0},\binom{-1}{1}\right\},\qquad a=\begin{cases}4; & \text{for step I}\\ 2; & \text{for step II}\\ 1; & \text{for step III}\end{cases}$$

[0111] e(C̄, t) and e(F̄, t) are the errors at the corresponding points. The motion vector is:

$$\overline{MV}(x,y,t)=(\bar{X}''',t)-(\bar{X},t)$$

where X̄‴ is the center point after the third step.
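A compact sketch of the three-step search follows; the step sizes 4, 2, 1 and the nine offsets mirror the description above, while the SAD-style error function and the array handling are illustrative assumptions (a block-edge guard is omitted for brevity).

```python
import numpy as np

OFFSETS = [(0, 0), (0, 1), (1, 1), (1, 0), (1, -1),
           (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def block_error(curr, prev, cy, cx, dy, dx, n=8):
    """SAD between the current block and the displaced previous block."""
    h = n // 2
    a = curr[cy-h:cy+h, cx-h:cx+h].astype(float)
    b = prev[cy-h+dy:cy+h+dy, cx-h+dx:cx+h+dx].astype(float)
    return np.sum(np.abs(a - b))

def three_step_search(curr, prev, cy, cx, n=8):
    """Refine the motion vector around (cy, cx) with steps a = 4, 2, 1,
    testing the nine candidate offsets at each scale (eq. 1.27)."""
    dy = dx = 0
    for a in (4, 2, 1):                     # steps I, II, III
        cands = [(dy + a * oy, dx + a * ox) for oy, ox in OFFSETS]
        dy, dx = min(cands,
                     key=lambda d: block_error(curr, prev, cy, cx, *d, n=n))
    return dy, dx
```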
[0112] Still another strategy for block searching is hierarchical searching. A block diagram of a hierarchical searching algorithm is illustrated in FIG. 19. This process involves decimating (sub-sampling) the present image and the reference image successively, both horizontally and vertically. The search process starts with the lowest resolution images, using a small block size. The motion vector estimated at the first stage is used as the starting point for motion estimation at the next stage. Note that the block size is then doubled along both directions. This process is repeated until the original resolution images are reached. The HDTV codecs proposed by Zenith and AT&T use hierarchical searching for motion estimation.
[0113] Another block search strategy is 3D recursive search motion
estimation. The concept of this method is to store all the
information of the motion vectors (MV) from the previous field and
use that as a comparison to predict the new motion vector from the
current field. At the top of the field, the prediction only
involves the temporal neighboring motion vector. After the top
motion vectors of the current field have been found, the prediction
involves both the spatial and temporal neighboring motion
vectors.
[0114] This algorithm tries to overcome the problems of one-dimensional recursive searching and focuses on the smoothness of the motion vectors. In one-dimensional recursive searching, such as one-at-a-time searching, the resulting smoothness is insufficient. This is assumed to be caused by the large number of evaluated candidate vectors located around the spatial or temporal prediction value. This can cause strong deviations from the prediction, like inconsistencies in the velocity field, as the vector selection criterion applied in block matching (minimum match error) cannot guarantee returning true motion vectors. The fundamental difficulty with a one-dimensionally recursive algorithm is that it cannot cope with discontinuities in the velocity plane.
[0115] Certain 3D recursive search algorithms make the assumption
that the discontinuities in the velocity plane are spaced at a
distance that enables convergence of the recursive block matcher
between two discontinuities. The recursive block matcher yields the
correct vector value at the first side of the object boundary and
starts converging at the opposite side. The convergence direction
here points from side one to side two. Either side of the contour
can be estimated correctly, depending on the convergence direction
chosen, though not both simultaneously. Based on this, two
estimators are applied concurrently, as indicated in FIG. 20, with
opposite convergence directions. A mean absolute error criterion is
used to decide which of these two estimators yields the correct
displacement vector at the output.
[0116] This bi-directional convergence is hereinafter referred to
as 2-D Convergence. The process is formally defined by:
$$\forall (x,y)\in B(x,y,t):\ \overline{MV}(x,y,t)=\begin{cases}\overline{MV}_a(x,y,t), & \bigl(e(\overline{MV}_a,\bar{X},t)\le e(\overline{MV}_b,\bar{X},t)\bigr)\\ \overline{MV}_b(x,y,t), & \bigl(e(\overline{MV}_a,\bar{X},t)\ge e(\overline{MV}_b,\bar{X},t)\bigr)\end{cases} \tag{1.28}$$

where:

$$e(\overline{MV}_a,\bar{X},t)=\sum_{x\in B(x,y,t)}\bigl|F(x,y,t)-F(x-MV_{ax},\,y-MV_{ay},\,t-T)\bigr|$$
$$e(\overline{MV}_b,\bar{X},t)=\sum_{x\in B(x,y,t)}\bigl|F(x,y,t)-F(x-MV_{bx},\,y-MV_{by},\,t-T)\bigr|$$

[0117] MV_a and MV_b are found in a spatial recursive process and can be calculated using equations 1.22 and 1.23. The updating prediction vectors are given by:

$$\bar{S}_a(x,y,t)=\overline{MV}_a(\bar{X}-\overline{SMV}_a,\,t) \tag{1.29}$$
$$\bar{S}_b(x,y,t)=\overline{MV}_b(\bar{X}-\overline{SMV}_b,\,t) \tag{1.30}$$

where:

$$\overline{SMV}_a\ne\overline{SMV}_b \tag{1.31}$$

and where SMV points from the center of the block from which the prediction vector is taken to the center of the current block.
[0118] As indicated in condition 1.31, the two estimators have
unequal spatial recursion vectors. If the two convergence
directions are opposite (or at least different), the 2-D
Convergence solves the run-in problem at the boundaries of moving
objects. This is because one of the estimators will have converged
already at the position where the other is yet to do so. Hence the
concept combines the consistent velocity field of a recursive
process with the fast step response as required at the contours of
moving objects. The attractiveness of the possible convergence directions varies significantly for hardware. Referring to FIG. 21, predictions taken from blocks 1, 2, or 3 are convenient for hardware, while blocks 6, 7, and 8 are totally unattractive.
[0119] The 3D Recursive approach extends the concept of the 2-D
Convergence by adding convergence accelerators (CA), which are
taken from the temporal neighboring prediction vectors. The spatial
predictions are selected to yield two perpendicular diagonal
convergence axes, as given by the following equations, and
illustrated in FIG. 22:

$$\bar{S}_a(x,y,t)=\overline{MV}_a\Bigl(\bar{X}-\binom{X}{Y},\,t\Bigr),\qquad \bar{S}_b(x,y,t)=\overline{MV}_b\Bigl(\bar{X}-\binom{-X}{Y},\,t\Bigr) \tag{1.32}$$
[0120] The Convergence Accelerator (CA) is another estimator that is selected along the convergence direction of each original estimator. To cope with causality, instead of introducing new estimators c and d, it uses a temporal neighboring motion vector from the previous field (T_a and T_b for estimators a and b respectively). The concept is that the new candidate in each original estimator accelerates the convergence of the individual estimator by introducing a look-ahead in the convergence direction. These convergence accelerators are not taken from the corresponding block in the previous field, but from a block shifted diagonally over r blocks, opposite to the blocks from which the spatial predictions S̄_a and S̄_b are taken:

$$\bar{T}_a(x,y,t)=\overline{MV}(x+rX,\,y+rY,\,t-T),\qquad \bar{T}_b(x,y,t)=\overline{MV}(x-rX,\,y+rY,\,t-T) \tag{1.33}$$
[0121] Increasing r implies a larger look-ahead, but the reliability of the prediction decreases correspondingly, as the correlation between the vectors in a velocity plane can be expected to drop with increasing distance. r = 2 has been experimentally found to be the best for a block size of 8×8 pixels. The resulting relative positions are drawn in FIG. 23.
[0122] For the resulting 3D RS block matching algorithm, the motion vector MV(x, y, t) is calculated according to equation 1.28, where MV_a(x,y,t) and MV_b(x,y,t) result from estimators a and b respectively, and are taken from a candidate set CS. The motion vector range is limited to CS^max (the search window defined in 1.28), and the proposed candidate set CS_a(x,y,t) for estimator a applying this updating strategy, hereinafter referred to as asynchronous cyclic search (ACS), is defined as:

$$CS_a(x,y,t)=\bigl\{\bar{C}\in CS^{max}\mid \bar{C}=\overline{MV}_a(x-X,\,y-Y,\,t)+\bar{U}_a(x,y,t)\bigr\}\cup\bigl\{\overline{MV}(x+2X,\,y+2Y,\,t-T),\,\bar{0}\bigr\} \tag{1.34}$$

[0123] where Ū_a(x,y,t) ∈ {0̄, lut(N_bl(x,y,t) mod p)}, [0124] N_bl is the output of a block counter, lut is a look-up table function, and p is a number [0125] which is not a factor of the number of blocks in the picture (preferably a prime number). The candidate set for b is given by:

$$CS_b(x,y,t)=\bigl\{\bar{C}\in CS^{max}\mid \bar{C}=\overline{MV}_b(x-X,\,y+Y,\,t)+\bar{U}_b(x,y,t)\bigr\}\cup\bigl\{\overline{MV}(x-2X,\,y+2Y,\,t-T),\,\bar{0}\bigr\} \tag{1.35}$$

[0126] where Ū_b(x,y,t) ∈ {0̄, lut((N_bl(x,y,t)+offset) mod p)}.
[0127] U_b(x,y,t) differs from U_a(x,y,t) due to an integer offset added to the value of the block rate counter. The estimators a and b are chosen from the candidate set to minimize the matching error:

$$e(\bar{C},x,y,t)=\sum_{x\in B(x,y,t)}\bigl|F(x,y,t)-F(x-C_x,\,y-C_y,\,t-T)\bigr| \tag{1.36}$$

[0128] where the matching error is summed over a block B(x,y,t), defined as:

$$B(x,y,t)=\{(x,y)\mid X_x-X/2\le x\le X_x+X/2,\; X_y-Y/2\le y\le X_y+Y/2\} \tag{1.37}$$

[0129] The best of the two vectors resulting from estimators a and b is selected in the output multiplexer and assigned to all pixels in B(x,y,t). Good results are obtained from estimators using the ACS strategy where the lut (look-up table) contains the following updates:

$$\bar{U}_i\in\left\{\binom{0}{0},\binom{0}{1},\binom{0}{-1},\binom{0}{2},\binom{0}{-2},\binom{1}{0},\binom{-1}{0},\binom{3}{0},\binom{-3}{0}\right\} \tag{1.38}$$
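The candidate-set construction of equations 1.34–1.38 can be sketched as follows. This is an interpretive rendering, not the patented circuit: the vector containers, the block-counter argument, and the clipping to the search window CS^max are assumptions added to make the fragment self-contained. The LUT entries are written as (dy, dx) pairs corresponding to the column vectors of equation 1.38.

```python
# Cyclic update table for the ACS strategy (eq. 1.38), as (dy, dx) pairs.
LUT = [(0, 0), (1, 0), (-1, 0), (2, 0), (-2, 0),
       (0, 1), (0, -1), (0, 3), (0, -3)]
P = 7  # not a factor of the block count; preferably prime

def candidates_a(mv_prev_spatial, mv_temporal, n_bl, win=(8, 8)):
    """Candidate set CS_a (eq. 1.34): the spatial prediction, the same
    prediction plus a cyclic LUT update, the temporal convergence
    accelerator, and the zero vector, clipped to the window CS^max."""
    uy, ux = LUT[n_bl % P]
    sy, sx = mv_prev_spatial
    raw = [(sy, sx), (sy + uy, sx + ux), mv_temporal, (0, 0)]
    wy, wx = win
    return [(max(-wy, min(wy, y)), max(-wx, min(wx, x))) for y, x in raw]
```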
[0130] Thus, a new system and method for interlace-to-progressive scan conversion are needed, which implement a new algorithm in a consumer electronic Video Display Processor chip. The present invention is directed towards meeting this need, among others.
SUMMARY OF THE INVENTION
[0131] A first embodiment interlace-to-progressive scan conversion
system according to the present invention comprises: a prefilter
having a prefiltered signal as an output; a motion estimator having
the prefiltered signal as input and a motion-corrected signal as an
output; and an adaptive filter having the prefiltered signal and
the motion-corrected signal as inputs.
[0132] A second embodiment interlace-to-progressive scan conversion
system according to the present invention comprises: a spatial line
averaging prefilter having a prefiltered signal as an output; a
motion estimator; and a three-stage adaptive recursive filter. The
motion estimator has the prefiltered signal as input and a
motion-corrected signal as an output. The motion estimator
comprises: a 3-D recursive search sub-component; a motion vector
correction sub-component; and a block erosion sub-component. The
three-stage adaptive recursive filter has the prefiltered output
and the motion corrected output as inputs. The three-stage adaptive
recursive filter comprises: a first stage comprising a function that selects between using static pixels data and moving pixels data from a next field; a second stage comprising a function that selects a more valid set of data between motion compensated data from a previous field and the pixels selected by the first stage; and a third stage comprising a function that combines an intra-field interpolation with the more valid set of data selected by the second stage.
[0133] A third embodiment interlace-to-progressive scan conversion
system according to the present invention comprises: a spatial line
averaging prefilter having a prefiltered signal as an output; a
motion estimator having the prefiltered signal as input and a
motion-corrected signal as an output; and a three-stage adaptive
recursive filter having the prefiltered signal and the
motion-corrected signal as inputs. The motion estimator comprises:
a 3-D recursive search sub-component having a bilinear
interpolator; a motion vector correction sub-component having an
error function, the error function including penalties related to a
length of the difference vector between a given candidate vector
and a plurality of neighboring vectors; and a block erosion
sub-component. The motion estimator assumes that a motion vector
for an object between a previous field and a current field is the
same as a motion vector for the object between the current field
and a next field. The three-stage adaptive recursive filter
comprises: a first stage comprising a function that selects between using static pixels data and moving pixels data from a next field; a second stage comprising a function that selects a more valid set of data between motion compensated data from a previous field and the pixels selected by the first stage; and a third stage comprising a function that combines an intra-field interpolation with the more valid set of data selected by the second stage.
[0134] In a fourth embodiment, the invention provides a method for
converting an interlaced image to a progressive scan image, the
method comprising: providing an input signal corresponding to an
image; prefiltering the input signal with a spatial line averaging
prefilter; estimating motion in the image by performing a 3-D
recursive search, performing a motion vector correction, and
performing a block erosion to reduce blockiness in the progressive
scan image; and filtering the signal in three stages: in the first stage, selecting between using static pixels data and moving pixels data from a next field; in the second stage, selecting a more valid set of data between motion compensated data from a previous field and the pixels selected by the first stage; and in the third stage, combining an intra-field interpolation with the more valid set of data selected by the second stage.
[0135] In a fifth embodiment, the invention provides a method for
converting an interlaced image to a progressive scan image, the
method comprising: providing an input signal corresponding to an
image; prefiltering the input signal with a spatial line averaging
prefilter; estimating motion in the image; and filtering the signal
in three stages. The estimation of motion is performed by: assuming
that a motion vector for an object between a previous field and a
current field is the same as a motion vector for the object between
the current field and a next field; performing a 3-D recursive
search; performing a motion vector correction in which the error
function penalizes a candidate vector based on a length of a
difference vector between the candidate vector and a plurality of
neighboring vectors; performing a block erosion to reduce
blockiness in the progressive scan image. The three-stage filtering
is performed by: in the first stage selecting between using static
pixels data and moving pixels data from a next field; in the second
stage selecting a more valid set of data between motion compensated
data from a previous field and the pixels selected by the first
stage; and in the third stage combining an intra-field
interpolation with the more valid set of data selected by the
second stage.
BRIEF DESCRIPTION OF THE DRAWINGS
[0136] FIG. 1 is an illustration of the interlace-to-progressive
scan conversion, or deinterlacing, task.
[0137] FIGS. 2a and 2b illustrate the change in orientation of the
vertical-temporal video spectrum caused by vertical motion.
[0138] FIG. 3a illustrates the general spectrum for an interlaced signal with motion.
[0139] FIG. 3b illustrates the ideal spectrum resulting from an interlace-to-progressive scan conversion process.
[0140] FIG. 4 is a graph of the required frequency characteristic
in the video spectrum for an interlace-to-progressive scan
conversion.
[0141] FIG. 5 is a block diagram of a Mouse's Teeth detector.
[0142] FIG. 6 is a diagram of the coefficients of the vertical high pass filters used with a Mouse's Teeth detector.
[0143] FIG. 7 is a diagram of pixels in an image to be converted
from interlace to progressive scan.
[0144] FIG. 8 is a block diagram of an FIR Median Hybrid
filter.
[0145] FIG. 9 is a diagram of a time recursive function.
[0146] FIG. 10 is an illustration of a gradient matching
process.
[0147] FIG. 11 is a block diagram of a phase correlator.
[0148] FIG. 12 is a block diagram of a phase correlated motion
estimator.
[0149] FIG. 13 is an illustration of a block-matched motion
estimation process.
[0150] FIG. 14 is a diagram of the search window used in a motion
estimation process.
[0151] FIG. 15 is a diagram of the location of "fractional pixels"
in a half-pixel accuracy motion estimation process.
[0152] FIG. 16 is an illustration of a one-at-a-time search
process.
[0153] FIG. 17 is an illustration of a logarithmic search
process.
[0154] FIG. 18 is an illustration of a three-step search
process.
[0155] FIG. 19 is a block diagram of a hierarchical block matching
process.
[0156] FIG. 20 is an illustration of the 2-D convergence
principle.
[0157] FIG. 21 is an illustration of convergence directions for a
2-D convergence process.
[0158] FIG. 22 is an illustration of convergence directions for a
3-D convergence process.
[0159] FIG. 23 is a diagram of the relative positions of blocks in
a 3-D convergence process used in the preferred embodiment
system.
[0160] FIG. 24 is a block diagram of a preferred embodiment
adaptive interlace-to-progressive scan conversion system.
[0161] FIG. 25 is an illustration of certain features of a motion
vector correction process according to the present invention.
[0162] FIG. 26 is an illustration of a block erosion process
suitable for use in a system according to the present
invention.
[0163] FIG. 27 is a diagram of the prediction process of a three-stage adaptive filter suitable for use in the preferred embodiment system according to the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
[0164] For the purposes of promoting an understanding of the
principles of the invention, reference will now be made to
preferred embodiments and specific language will be used to
describe the same. It will nevertheless be understood that no
limitation of the scope of the invention is thereby intended. Such
alterations and further modifications in the invention, and such further applications of the principles of the invention as described herein as would normally occur to one skilled in the art to which the invention pertains, are contemplated, and desired to be protected.
[0165] FIG. 24 is a basic block diagram of an adaptive
interlace-to-progressive scan conversion system according to the
present invention, indicated generally at 240. The system comprises
three general parts: a prefilter 242, a motion estimator 244, and
an adaptive filter 246. Each of these parts is discussed in greater
detail hereinbelow.
[0166] In most prior art interlace-to-progressive scan conversion algorithms, both the filter and the motion estimator typically use the interlaced fields as the input to produce the output. Since the field parity differs between consecutive fields, the motion estimator always compares the lines from the current field (e.g. top field) with different lines (e.g. bottom field) from the previous field. This can sometimes lead the motion estimator to give defective motion vectors. To overcome this problem, prefiltering is often applied to convert the interlaced input to a progressive one. The filter should be able to do a simple deinterlacing process and also provide enough information for the motion estimator to better predict the motion vector. Simulation shows that prefiltering also increases the performance of the "real" filter, as explained hereinbelow. At first it is tempting to use a sophisticated deinterlacing filter, but this defeats the purpose of prefiltering, and will impose a substantial hardware cost.
[0167] Thus, the prefilter is preferably a linear or an implicit motion adaptive filter. Implementing a motion detector is not feasible. Field insertion is a tempting choice, based on the assumption that there is normally little movement in the first few images of a sequence; hence a perfect progressive scan image could be reconstructed. This can boost the "real" filter performance (that is, the performance of the adaptive filter 246), as the general sampling theories can be applied directly. However, after a few fields/frames, field insertion causes annoying serration line artifacts, especially when the image sequence changes to a new sequence. Thus, field insertion is a poor choice for the prefilter 242.
[0168] The preferred embodiment system 240 uses a spatial line average (SLA) filter for the prefilter 242. An SLA filter 242 passes all the temporal frequencies, which guarantees no motion artifacts. The SLA filter 242 also gives greater alias suppression. The principal drawback of using an SLA for the prefilter 242 is that the higher-frequency portion of the baseband is also suppressed, which causes the image to be blurred. However, since the human eye is less sensitive to large-area noise than to small-area but strong noise, this is acceptable. From simulations it has been determined that the SLA filter 242 gives even stronger temporal filtering than a motion-compensated median filter. This makes it the ideal choice for prefiltering.
[0169] The formal definition of a spatial line average filter is given by:

$$F_o(x,y,t)=\begin{cases}F(x,y,t), & (y \bmod 2 = t \bmod 2)\\ \dfrac{F(x,y+1,t)+F(x,y-1,t)}{2}, & (\text{otherwise})\end{cases} \tag{2.1}$$

An SLA filter can also be easily adapted to hardware, and has a very low hardware requirement: it needs only a few lines of buffer, a full adder, and a right-shift logic operator.
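Equation 2.1 reduces to a few lines of code. The sketch below is a straightforward NumPy reading of it (the averaging of the two vertical neighbors maps to the adder plus right shift mentioned above); the boundary handling by clipping and the field layout are assumptions of the example.

```python
import numpy as np

def sla_prefilter(field, t):
    """Spatial line average (eq. 2.1): keep lines of parity t mod 2,
    interpolate the others as the mean of their vertical neighbors.
    `field` is a full-height array whose valid lines have parity t."""
    h, w = field.shape
    out = field.astype(float)
    for y in range(h):
        if y % 2 != t % 2:
            up = field[max(y - 1, 0), :]        # clip at the borders
            dn = field[min(y + 1, h - 1), :]
            out[y, :] = (up + dn) / 2.0         # adder + right shift in HW
    return out
```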
[0170] Turning now to the motion estimator 244, it will be
appreciated that most motion estimator hardware that is available
today is used in image compression. In image compression, the
motion estimator is used to find a motion vector that produces the
minimum error or difference. The smaller the difference, the less
data needs to be sent, and thus the higher the compression ratio
that can be achieved.
[0171] In interlace-to-progressive scan conversion, the goal is to find the "true motion" of the image. Experiments show that the smallest difference does not always give the most correct motion vector. A full search motion estimator, with the mean absolute error cost function, is good for image compression because it guarantees finding the smallest-error motion vector. But this algorithm is not suitable for the deinterlacing process. It often gives incorrect motion vectors, which then result in a strong local distortion in the interpolated image. In certain types of image sequences the output of a motion compensated median filter using a full search motion estimator is even worse than the non-compensated image, due to motion vector errors.
[0172] One explanation for this behavior is that some of the information in the current and previous fields is missing. A simple but somewhat extreme illustration: assume there is a motionless American flag in the image, with red and white stripes, in which each stripe occupies a single line. Thus, in the current field there are only red lines and in the previous field there are only white lines. Based on its criteria, the motion estimator tries to find any location that contains primarily red pixels, and operates as though the flag has moved from its previous location to this new location between the two fields. Clearly, this represents a substantial error in the motion estimation, and motion prediction based on this flawed estimate will likely be equally inaccurate. Note that normally even the smallest line in the picture will occupy more than one pixel line; hence the error typically introduced by motion estimation is not this extreme.
[0173] Prefiltering can be used to reduce the probability of prediction error. If the filter can perfectly deinterlace the image, theoretically this will solve the problem of the error caused by the missing information, as in the case discussed above.
[0174] Prediction error can also occur because the block size is too small. Reducing the number of candidate vectors is also known to reduce the inconsistency of motion prediction.
[0175] Experiments also show that viewers perceive the artifacts caused by an incorrect motion vector to be worse than blur or global degradation, despite the fact that this type of error yields a significantly lower mean square error. Since it is difficult to determine the true motion vector in the image sequence, most of the time all that can be done is to minimize the error. Therefore, the primary criterion in the selection of the motion estimator design should instead be to assure that the generated velocity field is smooth. Accuracy, at least initially, could be considered to be of secondary importance. The assumption here is that, though the motion estimator still cannot give a true motion vector, at least it will not produce a motion vector that will cause strong local distortion. Based on this criterion, the motion estimator algorithm must contain elements that impose a smoothness constraint on the resulting motion vectors. At the same time, the additional constraints should not lead to a computationally complex algorithm.
[0176] The preferred embodiment system 240 uses a 3D recursive search as the basis for the motion estimator 244. This algorithm has several desirable characteristics that make it the best choice. It is block-based and thus requires far less computation than a full search. It requires only 8 candidate vectors. The small number of candidate vectors reduces the chance of motion vector error and greatly reduces the complexity of computation. For example, a full search algorithm using an 8×8 block and a 32×32 search window will need (32−8+1)² = 625 iterations to complete. The corresponding 3D recursive search requires only 8 iterations. The motion vector for each block in a 3D recursive search is initially based on the neighboring motion vectors in both the spatial and temporal regions. This inherent smoothness constraint yields very coherent vector fields that closely correspond to the true motion of the objects. This makes it particularly suitable for the motion predictor 244 of the deinterlacing system 240.
[0177] In order to improve the performance of the 3D recursive search, the accuracy of the motion predictor 244 is carried to sub-pixel accuracy. This is done by adding elements to the lut (look-up table) candidate set. All that is required is an extension of the update set with some update vectors having at least one non-integer component. The new lut used in the motion predictor 244 of the preferred embodiment system 240 consists of:

$$US_n=\left\{\binom{0}{0},\binom{0}{1},\binom{0}{-1},\binom{1}{0},\binom{-1}{0},\binom{0}{2},\binom{0}{-2},\binom{3}{0},\binom{-3}{0},\binom{0}{\frac{1}{4}},\binom{0}{-\frac{1}{4}},\binom{\frac{1}{4}}{0},\binom{-\frac{1}{4}}{0}\right\} \tag{2.2}$$
[0178] Quarter pixel accuracy is normally considered to yield the
"true motion" of the object and will increase the performance of
the deinterlacing filter. The principal drawback of introducing
sub-pixel accuracy is that the 3D recursive search block matcher
has a somewhat slower convergence. It is clear that additional
small fractional update will reduce the appearance of the larger
update vectors. This, however, is relatively unimportant compared
to the improved smoothness of the estimated velocity field caused
by sub-pixel accuracy.
[0179] A bilinear interpolation is used to calculate the sub-pixel value. Assuming the coordinates of the pixel are given by (x, y), the formal definition of the bilinear interpolation is:

$$\begin{aligned}F(x,y,t)=\;&(1-x_f)(1-y_f)\,F(x_i,y_i,t)+x_f(1-y_f)\,F(x_i+1,y_i,t)\\ &+(1-x_f)\,y_f\,F(x_i,y_i+1,t)+x_f\,y_f\,F(x_i+1,y_i+1,t)\end{aligned} \tag{2.3}$$

[0180] where x_i = ⌊x⌋ and y_i = ⌊y⌋ are the integer parts, and x_f = x − ⌊x⌋ and y_f = y − ⌊y⌋ are the fractional parts.
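A direct sketch of equation 2.3 follows; it assumes a 2-D NumPy luminance array and clips at the borders, both assumptions of the example rather than the hardware design.

```python
import numpy as np

def bilinear(frame, x, y):
    """Bilinear sub-pixel fetch (eq. 2.3): blend the four surrounding
    integer-grid pixels by the fractional parts of (x, y)."""
    h, w = frame.shape
    xi, yi = int(np.floor(x)), int(np.floor(y))
    xf, yf = x - xi, y - yi
    x1, y1 = min(xi + 1, w - 1), min(yi + 1, h - 1)   # clip at borders
    return ((1 - xf) * (1 - yf) * frame[yi, xi]
            + xf * (1 - yf) * frame[yi, x1]
            + (1 - xf) * yf * frame[y1, xi]
            + xf * yf * frame[y1, x1])
```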
[0181] It sometimes happens that blocks shifted over very different
vectors with respect to the current block contain the same
information. This is a particular problem with periodic structures.
When confronted with this situation, a block matcher will randomly
select one of these vectors based on small differences in the
matching error caused by noise in the picture. If the estimate is
used for temporal interpolation, very disturbing artifacts will be
generated in the periodic picture part. For the 3D Recursive Search
block matcher, the spatial consistency could guarantee that after
reaching a converged situation at the boundary of a moving object,
no other vectors will be selected. This, however, functions only if
none of the other candidate vectors that yield an equally good
matching error are ever generated. A number of risks jeopardize
this constraint: [0182] 1. An element of the update sets
US.sub.b(x,y,t) and US.sub.b(x,y,t) may equal a multiple of the
basic period of the structure. [0183] 2. The other estimator may
not be converged, or may be converged to a value that doesn't
correspond to the actual displacement. [0184] 3. Directly after a
scene change, it is possible that one of the convergence
accelerators T.sub.a(x,y,t) or T.sub.b(x,y,t) yields the
threatening candidate.
[0185] It is possible to minimize the risks mentioned under 1 and 3 above by adding penalties to the error function related to the length of the difference vector between the candidate to be evaluated and some neighboring vectors. For the 3D recursive search block matcher, a very simple implementation is realized with a penalty depending on the length of the update:

$$e(\bar{C},x,y,t)=\sum_{x\in B(x,y,t)}\bigl|F(x,y,t)-F(x-C_x,\,y-C_y,\,t-T)\bigr|+\alpha\,\bigl\lVert\bar{U}(x,y,t)\bigr\rVert \tag{2.4}$$
[0186] It has been experimentally determined that fixed penalties
for all updates can be applied. Optimization led to values for
these fixed penalties (.alpha.) of, respectively, 0.4%, 0.8% and
1.6% of the maximum error value, for the cyclic update, the
convergence accelerator and the fixed 0 candidate vector. This last
candidate especially requires a large penalty, as it introduces the
risk of convergence interruption in flat areas.
[0187] The risk described in point 2 above, however, is not reduced
with these update penalties. This situation typically occurs if a
periodic part of an object enters the picture from the blanking or
appears from behind another object in the image. In this situation
one of the two estimators can converge to a wrong vector value
since there is no boundary moving along with the periodic picture
part to prevent this. Therefore an attempt to cope with this
problem is made by linking the two estimators. S_a(x,y,t) is set to the value of S_b(x,y,t) if:

$$e(\overline{MV}_a,\,\bar{X}-\overline{SMV}_a,\,t)>e(\overline{MV}_b,\,\bar{X}-\overline{SMV}_b,\,t)+Th \tag{2.5}$$

where Th is a fixed threshold, and, conversely, S_b(x,y,t) is set to the value of S_a(x,y,t) if:

$$e(\overline{MV}_b,\,\bar{X}-\overline{SMV}_b,\,t)>e(\overline{MV}_a,\,\bar{X}-\overline{SMV}_a,\,t)+Th \tag{2.6}$$

The threshold Th above turns out to be useful, as the advantage of two independent estimators would otherwise be lost.
[0188] Even though several techniques have been applied to increase the performance of the 3D recursive search, it can still sometimes give an incorrect motion prediction. For moving parts of the image this is not a serious problem, since in most circumstances the adaptive filter will do an intra-field interpolation as compensation for motion vector error, as is discussed further hereinbelow. Further, since the image is moving, slight degradation of the image quality is not generally noticeable to the human eye.
[0189] However, with respect to static portions of the image, it has been observed experimentally that incorrect motion predictions will cause annoying artifacts such as flickering on the edge boundaries. The effect is worse where there is a repeated pattern in a static, background part of the image. All artifacts in the static portions of the image are normally noticeable to the human eye. To cope with these artifacts, further emphasis should be given to the static part of the image. The best way to deal with these artifacts is to use both the motion estimator/predictor 244 and the adaptive filter 246.
[0190] The motion predictor can be used to improve performance in static portions of the image by implementing a motion correction algorithm. One simple approach would be to calculate the motion vector between the current and the next fields and use that motion vector information to help correct the motion vectors. The principal problem with this approach is that it requires more memory to save both the previous and next motion vectors' information. The amount of extra memory required will generally make this approach infeasible. A more efficient solution, in terms of hardware requirements, is based on a simple and quite powerful assumption. In the presently preferred embodiment, it is assumed that the momentum of the object motion between a small number of consecutive fields (in this case 3 fields) is constant. In other words, it is assumed that the motion vector for an object between the previous and the current fields is the same as the motion vector between the current and the next fields.
[0191] A first method of motion vector error correction based on this assumption that was experimentally tested is illustrated in FIG. 25 and is formally defined by:

$$\overline{MV}(x,y,t)=\begin{cases}\binom{0}{0}, & \bigl(e_m(x,y,t)\ge e_s(x,y,t)\bigr)\\ \overline{MV}(x,y,t), & \bigl(e_m(x,y,t)<e_s(x,y,t)\bigr)\end{cases} \tag{2.7}$$

where:

$$e_m(x,y,t)=\frac{\sum_{x\in X}|F(X)-F(C)|+\sum_{x\in X}|F(X)-F(D)|}{2} \tag{2.8}$$
$$e_s(x,y,t)=\frac{\sum_{x\in X}|F(X)-F(A)|+\sum_{x\in X}|F(X)-F(B)|}{2} \tag{2.9}$$

[0192] and where A, B, C, D, and X are blocks as shown in FIG. 25.
[0193] Though this algorithm makes significant improvement in the
static areas of the image, it fails to correct the motion if the
difference in the static parts between the current field and the
previous field is large (as, for example, with the American flag
example above).
[0194] A second, preferred method of motion vector error correction based on the assumption of negligible change in velocity that was experimentally tested is formally defined by equation 2.7 above, but replaces conditions 2.8 and 2.9 with the following:

$$e_m(x,y,t)=\sum|F(C)-F(D)| \tag{2.10}$$
$$e_s(x,y,t)=\sum|F(A)-F(B)| \tag{2.11}$$

Note that this provides an even simpler algorithm for motion error correction.
[0195] Criteria 2.10 and 2.11 are based on the assumption that if
the difference between the previous and the next fields in a given
part of the image is small, it is safe to assume that the given
part is static. Because this method does not calculate the error based on the current field, the problem of the first method is
solved. Furthermore, because the previous and the next fields
initially contain the same information (since both are either top
field or bottom field), this method performs better.
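A sketch of the preferred correction rule (equations 2.7, 2.10, and 2.11) is given below. The block geometry of FIG. 25 is not reproduced here; `blk_a`, `blk_b`, `blk_c`, and `blk_d` stand for the blocks labeled A, B, C, and D in that figure, and all names are assumptions of the example.

```python
import numpy as np

def correct_motion_vector(mv, blk_c, blk_d, blk_a, blk_b):
    """Motion vector correction (eqs. 2.7, 2.10, 2.11): if the 'moving'
    evidence e_m is no smaller than the 'static' evidence e_s, declare
    the block static and zero the vector."""
    e_m = np.sum(np.abs(blk_c.astype(float) - blk_d.astype(float)))
    e_s = np.sum(np.abs(blk_a.astype(float) - blk_b.astype(float)))
    return (0, 0) if e_m >= e_s else mv
```

As the text notes, only a subtraction block and a comparator are needed, which is why the hardware implementation is simple.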
[0196] Note that, in the case where a large difference between the previous and current fields is in fact caused by a fast moving object (fast enough that the object does not appear in the search window in two consecutive fields), it is still safe to assume that the block is static. The motion vector is out of the range of the search window, so instead of pointing somewhere randomly, it is better if the motion vector is equal to 0. The moving object part of the block will be handled by the intra-field interpolation of the adaptive filter, while the static background in the block can be deinterlaced perfectly.
[0197] The principal disadvantage of this method is that it might introduce some inconsistency into the motion vector smoothness. Most of the time an image is divided into several static and moving areas, so normally the neighbors of a static block are also static blocks and the neighbors of a moving block are also moving blocks. Based on this fact, typically if the motion corrector decides to produce a static motion vector output for a block, its neighboring blocks should also be static.
[0198] Another assumption that can be made is that most of the time the search window is large enough to contain most of the motion vectors. Based on the combination of these two assumptions, the inconsistency introduced by the method of motion error correction defined in equations 2.7, 2.10, and 2.11 should be insignificant. This motion corrector only needs a subtraction block to calculate the difference and a comparator, so its hardware implementation is simple.
[0199] In nearly every prior art practical application the mean absolute error criterion is used as the cost/criterion function to determine the motion vector. The mean square error is too complex to practically implement in hardware for most applications, since it requires multiplication and power operations. On the other hand, mean absolute error can be implemented with a simple subtraction block, XOR, and shifting logic. The principal drawback of mean absolute error is that its output does not contain any deviation information.
[0200] In the preferred embodiment the cost function is given by:

$$\forall F(x,y,t)\in B(x,y,t):\quad\begin{aligned}D&=|F(x,y,t)-F(x-MV_x,\,y-MV_y,\,t-1)|\\ TD&=TD+D\\ \mathit{Diff}&=D-\mathit{EstErr}\\ \mathit{EstErr}&=\mathit{EstErr}+\delta\cdot \mathit{Diff}\\ \mathit{Dev}&=\mathit{Dev}+\delta\,(|\mathit{Diff}|-\mathit{Dev})\end{aligned} \tag{2.12}$$

[0201] where the initial values for TD, EstErr, and Dev are all equal to zero.
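Read as running updates over the pixels of a block, equation 2.12 accumulates the total difference while tracking an exponential estimate of the error and of its deviation. The sketch below is one plausible reading of the garbled original (in particular, the EstErr update is taken to be EstErr + δ·Diff, matching the form of the Dev update); treat it as an interpretation, not the patented circuit.

```python
def block_cost(curr_block, comp_block, delta=0.125):
    """Eq. 2.12 per block (NumPy arrays assumed): TD accumulates the
    absolute differences while EstErr/Dev track a running mean and a
    running mean deviation of the error."""
    td = est_err = dev = 0.0          # initial values are zero
    for a, b in zip(curr_block.flat, comp_block.flat):
        d = abs(float(a) - float(b))
        td += d
        diff = d - est_err
        est_err += delta * diff       # exponential running average
        dev += delta * (abs(diff) - dev)
    return td, est_err, dev
```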
[0202] If the motion information is limited to one vector per block of pixels, motion compensation sometimes creates visible block structures in the interpolated picture. The block sizes commonly used in block matching are in a range that gives rise to very visible artifacts. A post-filter on the vector field can overcome this problem, but with the drawback that discontinuities in the vector field are blurred as well. For this reason, a preferred embodiment system according to the present invention employs a post-operation that eliminates fixed block boundaries from the vector field without blurring the contours. The post-operation also prevents vectors that do not result from the estimator from being generated. This is especially useful where algorithms can yield vectors that have poor relation to the actual object velocities. In the case of a velocity field for which one vector per block is available, the preferred embodiment divides each block B(x,y,t):

$$B(x,y,t)=\{(x,y)\mid X_x-X/2\le x\le X_x+X/2,\; X_y-Y/2\le y\le X_y+Y/2\} \tag{2.13}$$

to which a vector MV(x,y,t) is assigned, into four sub-blocks B_{i,j}(x,y,t):

$$B_{i,j}(x,y,t)=\Bigl\{(x,y)\,\Big|\,X_x-(1-i)\tfrac{X}{4}\le x\le X_x+(1+i)\tfrac{X}{4},\; X_y-(1-j)\tfrac{Y}{4}\le y\le X_y+(1+j)\tfrac{Y}{4}\Bigr\} \tag{2.14}$$

where the variables i and j take the values +1 and −1. To the pixels in each of the four sub-blocks B_{i,j}(x,y,t) a vector MV_{i,j}(x,y,t) is assigned:

$$\forall (x,y)\in B_{i,j}(x,y,t):\ \overline{MV}_{i,j}(x,y,t)=\overline{MV}_{i,j}(\bar{X},t) \tag{2.15}$$

[0203] where:

$$\overline{MV}_{i,j}(\bar{X},t)=\mathrm{med}\bigl[\overline{MV}(x+iX,\,y,\,t),\ \overline{MV}(\bar{X},t),\ \overline{MV}(x,\,y+jY,\,t)\bigr] \tag{2.16}$$
[0204] The median function is a median on the x and y vector components separately:

$$\mathrm{med}(\bar{X},\bar{Y},\bar{Z})=\binom{\mathrm{median}(X_x,Y_x,Z_x)}{\mathrm{median}(X_y,Y_y,Z_y)} \tag{2.17}$$
[0205] Because of this separate operation, a new vector that was
neither in the block itself nor in the neighboring blocks can be
created. To prevent this, the preferred embodiment checks whether the new vector is equal to one of the three input vectors, and if it is not, the original motion vector is applied. FIG. 26
illustrates the process. The motion vectors are taken from the
neighboring shaded areas. To calculate the result for sub-block H
the neighboring blocks E, G, and H itself are used.
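The sub-block median of equations 2.14–2.17, together with the validity check just described, can be sketched as follows. The per-quadrant neighbor selection is our reading of FIG. 26, and the data layout (one (vy, vx) tuple per block in a 2-D grid) is an assumption of the example.

```python
def erode_block(mv_grid, by, bx, i, j):
    """Block erosion for one quadrant (i, j in {+1, -1}) of block
    (by, bx): component-wise median of the block's own vector and its
    horizontal/vertical neighbors on the quadrant side (eq. 2.16),
    falling back to the original vector when the separate-component
    median invents a vector not among the inputs."""
    def clamp(v, lo, hi):
        return max(lo, min(hi, v))

    h, w = len(mv_grid), len(mv_grid[0])
    own = mv_grid[by][bx]
    horz = mv_grid[by][clamp(bx + i, 0, w - 1)]
    vert = mv_grid[clamp(by + j, 0, h - 1)][bx]
    med = tuple(sorted(c)[1] for c in zip(own, horz, vert))
    # Separate x/y medians can create a brand-new vector; reject it.
    return med if med in (own, horz, vert) else own
```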
[0206] A number of adaptive filters suitable for use as the adaptive filter 246 will now be discussed. In certain embodiments the adaptive filter 246 is a median filter. The principal shortcoming of using a median filter for the adaptive filter 246 is that it introduces aliasing. In certain other embodiments a line averaging filter is used. The chief limitation of line averaging filters, when used for the adaptive filter 246, is that they suppress the higher baseband frequencies, resulting in image blurring.
[0207] In certain other embodiments the adaptive filter 246 is an adaptive recursive filter. An adaptive recursive filter 246 reduces most of the artifacts that are caused by a conventional filter. Nevertheless, there are a few limitations with this type of filter. In this method, the intra-field line average interpolation is used for interpolating a fast moving object in the image where the motion vector cannot cope with the motion, and also for protection when the motion predictor gives a wrong vector. For a fast moving object, this interpolation works well because human eyes are not very sensitive to artifacts in fast moving objects.
[0208] However, the intra-field interpolation will introduce blur in the case of scene changes, when a new object or background appears from behind a moving object, or both. This is expected, and is the best this method can do, since in both cases there is no information available from the previous field. The artifacts will be noticeable to the human eye because most of the blurred parts are now static. They will remain in the next few fields until the motion estimator produces a correct motion vector. The effect will be even worse if the new object or scene that appears happens to contain a periodic pattern. Sometimes the filter even gives a wrong prediction and does not use an intra-field interpolation. Instead, it uses the pixel values (that is, those pointed to by the motion vector) from the previous field, resulting in a strong noticeable artifact and breakdown on the edges of the moving object. Another problem that was found during simulations is that the intra-field interpolation that is used as protection from incorrect motion vectors sometimes produces an annoying artifact. In certain experimental images having periodic static parts with very small patterns, incorrect motion predictions were generated, even using a standard 3D recursive search. The deinterlaced image therefore had a blurred static background.
[0209] In certain other embodiments an 11-tap weighted median time
recursive filter is used for the adaptive filter 246. This method
is based on the time recursive filter. The input to this filter 246 is prefiltered using an SLA filter 242. Rather than simply using the motion-compensated pixel data from the previous field, this filter 246 uses an 11-tap weighted median to decide the interpolated pixel. The formal definition of this filter 246 is
given by:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\[4pt] \operatorname{med}\Bigl(A,\,B,\,C,\,D,\,E,\,F,\,G,\,G,\,H,\,H,\,\dfrac{B+E}{2}\Bigr), & \text{(otherwise)} \end{cases} \tag{2.18}$$

[0210] where:

$$\begin{aligned} A &= F(x-1,\,y-1,\,t) & \quad E &= F(x,\,y+1,\,t) \\ B &= F(x,\,y-1,\,t) & \quad F &= F(x+1,\,y+1,\,t) \\ C &= F(x+1,\,y-1,\,t) & \quad G &= F(x,\,y,\,t-1) \\ D &= F(x-1,\,y+1,\,t) & \quad H &= F(x-MV_x,\,y-MV_y,\,t-1) \end{aligned} \tag{2.19}$$
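A hedged sketch of this interpolation follows, assuming fields are stored as full-height 2-D arrays in which only lines of the field's parity carry valid samples, and ignoring border handling; all names are illustrative:

```python
import numpy as np

def weighted_median_tap11(F_cur, F_prev, mv, x, y):
    """Interpolate one missing pixel (x, y) of the current field using
    the 11-tap weighted median of equation 2.18.
    F_cur, F_prev: 2-D arrays (current and previous fields);
    mv: (MV_x, MV_y) motion vector for this pixel's block."""
    A = F_cur[y - 1, x - 1]
    B = F_cur[y - 1, x]
    C = F_cur[y - 1, x + 1]
    D = F_cur[y + 1, x - 1]
    E = F_cur[y + 1, x]
    Fv = F_cur[y + 1, x + 1]
    G = F_prev[y, x]                      # same-position temporal tap
    H = F_prev[y - mv[1], x - mv[0]]      # motion-compensated tap
    # G and H appear twice, doubling their weight for stronger temporal
    # filtering; (B + E) / 2 is the intra-field line average.
    taps = [A, B, C, D, E, Fv, G, G, H, H, 0.5 * (B + E)]
    return np.median(taps)
```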
[0211] From the definition it can be seen that this algorithm gives double protection in the case of an incorrect motion vector. It implicitly detects static image parts, even when the motion vector is incorrect, by including the G factor in the weighted median. In the case of fast moving objects, or an incorrect motion vector, the intra-field interpolation is selected from A, B, C, D, E, F, and 1/2(B+E). The weights of the G and H factors are doubled to give stronger temporal filtering. The results of this method are quite good. Nevertheless, the algorithm does not completely resolve the problems that occur with an adaptive recursive filter 246: in experiments with images having static periodic portions with very small patterns, those portions were still blurred in the deinterlaced image. This situation is close to the extreme example of the American flag problem described above; it is nearly a worst-case scenario.
[0212] The preferred embodiment employs a 3-stage adaptive recursive filter to cope with such worst-case scenarios. It should be noted that the problems found with the adaptive recursive filter actually occur in other filter algorithms as well; the different algorithms merely adapt to them in different ways. The principal problems fall into two categories: missing information, and static portions of the image with small periodic patterns.
[0213] In the case of missing information, such as occurs directly after a scene change or when a new object or background appears from behind a moving object, the missing pixels can be interpolated better by using information from the next field. Assuming that at least the direction of the object's motion does not change (even if its momentum is not constant), the next field should contain information about the newly appearing object, so that part of the image can be interpolated better. The same approach can be applied to scene changes.
[0214] For static periodic parts of the image, a simple field insertion filter can solve the problem. It will be appreciated that field insertion filters give their best results in the absence of motion. Care must be taken, however, at the boundaries between the static background and the moving object, to prevent serration of the moving edge. Regarded this way, the problem is deciding when and where to apply a field insertion filter.
[0215] The novel three-stage adaptive recursive filter of the preferred embodiment is illustrated in FIG. 27 and defined in the following equations. In the first stage, the algorithm decides whether to use static pixel data or moving pixel data from the next field. The process of obtaining the moving pixel data rests on the assumption that the motion vector applied to the object is constant across the three fields involved. The selection is made based on the differences between these two data sets and the pixel data in the current field. The selection process is formally
defined by:

$$F_n(x,y,t) = \begin{cases} F\bigl(x + MV_x(x,y,t),\; y + MV_y(x,y,t),\; t+1\bigr), & (D_m < D_s) \\[4pt] F(x,\,y,\,t+1), & (D_m \geq D_s) \end{cases} \tag{2.20}$$

where:

$$D_s = \sum_{k=-2}^{2} C_v(k)\,\bigl|F(x,\,y+k,\,t) - F(x,\,y+k,\,t+1)\bigr| \tag{2.21}$$

$$D_m = \sum_{k=-2}^{2} C_v(k)\,\bigl|F(x,\,y+k,\,t) - F\bigl(x - MV_x(x,y,t),\; y - MV_y(x,y,t) + k,\; t+1\bigr)\bigr| \tag{2.22}$$
[0216] $D_m$ and $D_s$ are the differences between the interpolated pixels in the current field and, respectively, the moving and static pixels in the next field. $C_v(k)$ is the coefficient of the vertical low-pass filter, and $k$ is the vertical shift within the field.
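A minimal sketch of this first-stage selection follows. The excerpt does not give the coefficients $C_v(k)$, so a placeholder 5-tap low-pass kernel is assumed, and border handling is ignored:

```python
import numpy as np

C_V = np.array([1, 2, 2, 2, 1]) / 8.0   # placeholder C_v(k), k = -2..2

def stage1_forward(F_cur, F_next, mv, x, y):
    """Select static or moving pixel data from the next field
    (equations 2.20-2.22). mv is the (MV_x, MV_y) motion vector."""
    ks = np.arange(-2, 3)
    cur = np.array([F_cur[y + k, x] for k in ks])
    # D_s: difference against the same (static) position in the next field.
    stat = np.array([F_next[y + k, x] for k in ks])
    D_s = np.sum(C_V * np.abs(cur - stat))
    # D_m: difference against the motion-shifted position in the next field.
    mov = np.array([F_next[y - mv[1] + k, x - mv[0]] for k in ks])
    D_m = np.sum(C_V * np.abs(cur - mov))
    if D_m < D_s:
        return F_next[y + mv[1], x + mv[0]]   # moving pixel data
    return F_next[y, x]                       # static pixel data
```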
[0217] Information from the next field is important for enhancing the performance of the filter. In the absence of motion, the static pixel data provide the missing information that cannot be obtained from the previous field when a new object or background appears from behind a moving object, or directly after a scene change. Information from the next field also increases the temporal coherence of both the static and the moving interpolation.
[0218] In the second stage of the 3-stage process, the motion compensated data from the previous field and the data obtained from the first stage are used to determine which data set is more valid, or should have the greater influence. An adaptive fade is applied between the two data sets, based on the correlation of each with the pixel data in the current field.
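The excerpt does not give an explicit formula for the fade coefficient, so the following is only one plausible realization of the idea, not the patent's method: each prediction is weighted inversely to its absolute difference from the line average of the current field:

```python
def fade_coefficient(F_cur, x, y, f_p, f_n, eps=1e-6):
    """Return a fade coefficient c_p in [0, 1]; larger when the backward
    prediction f_p agrees better with the current field than the forward
    prediction f_n. F_cur is a 2-D array (current field)."""
    local = 0.5 * (F_cur[y - 1, x] + F_cur[y + 1, x])   # line average
    err_p = abs(f_p - local)
    err_n = abs(f_n - local)
    # Normalized so that a smaller backward error yields a larger c_p.
    return (err_n + eps) / (err_p + err_n + 2 * eps)
```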
[0219] The last stage of the filter combines the result of the second-stage process with an intra-field interpolation. The intra-field interpolation is used to predict fast moving objects and also to ensure the robustness of the filter. The formal
definition of the filter can be given by:

$$F_o(x,y,t) = \begin{cases} F(x,y,t), & (y \bmod 2 = t \bmod 2) \\[4pt] c_i\,F_i(x,y,t) + (1-c_i)\bigl(c_p\,F_p(x,y,t) + (1-c_p)\,F_n(x,y,t)\bigr), & \text{(otherwise)} \end{cases} \tag{2.23}$$

where:

$$F_i(x,y,t) = \frac{F(x,\,y-1,\,t) + F(x,\,y+1,\,t)}{2} \tag{2.24}$$

[0220] is the intra-field interpolation;

$$F_p(x,y,t) = F\bigl(x - MV_x(x,y,t),\; y - MV_y(x,y,t),\; t-1\bigr) \tag{2.25}$$

[0221] is the backward data prediction; [0222] $F_n(x,y,t)$ is the forward data prediction defined in equation 2.20; and [0223] $c_i$ and $c_p$ are adaptive coefficients ranging from 0 to 1.
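A short sketch of this final combination for one missing pixel, with the adaptive coefficients $c_i$ and $c_p$ supplied by the caller (their derivation is adaptive and not fully specified in this excerpt):

```python
def combine_stage3(F_cur, x, y, f_p, f_n, c_i, c_p):
    """Combine the three predictions per equation 2.23.
    F_cur: 2-D array (current field); f_p: backward prediction (eq. 2.25);
    f_n: forward prediction (eq. 2.20); c_i, c_p: coefficients in [0, 1]."""
    f_i = 0.5 * (F_cur[y - 1, x] + F_cur[y + 1, x])     # eq. 2.24
    return c_i * f_i + (1 - c_i) * (c_p * f_p + (1 - c_p) * f_n)
```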
[0224] While the invention has been illustrated and described in
detail in the drawings and foregoing description, the description
is to be considered as illustrative and not restrictive in
character. Only the preferred embodiments, and such alternative
embodiments deemed helpful in further illuminating the preferred
embodiment, have been shown and described. All changes and
modifications that come within the spirit of the invention are
desired to be protected.
* * * * *