U.S. patent application number 11/980474 was filed with the patent office on 2007-10-31 and published on 2008-06-26 for moving picture prediction system.
Invention is credited to Kohtaro Asai, Yuri Hasegawa, Yoshimi Isu, Shinichi Kuroda, Tokumichi Murakami, Hirofumi Nishikawa, Shunichi Sekiguchi.
Application Number | 20080152007 11/980474 |
Family ID | 12260195 |
Filed Date | 2007-10-31 |
United States Patent
Application |
20080152007 |
Kind Code |
A1 |
Sekiguchi; Shunichi ; et
al. |
June 26, 2008 |
Moving picture prediction system
Abstract
To achieve an encoding system including a highly efficient
prediction performed in response to the content of a scene, a
significance, and a motion characteristic of a moving picture and
the like, memories a, b, c, motion compensator 5 responsive to an
arbitrary transform parameter representing the motion of a
prediction picture segment for generating a predicted picture by
using arbitrary data stored in the memories a, b, c based upon the
transform parameter, and memory update unit 15 for allowing the
content of one or more of the memories to be updated at an
arbitrary period of time, are provided.
Inventors: |
Sekiguchi; Shunichi; (Tokyo,
JP) ; Asai; Kohtaro; (Tokyo, JP) ; Murakami;
Tokumichi; (Tokyo, JP) ; Nishikawa; Hirofumi;
(Tokyo, JP) ; Kuroda; Shinichi; (Tokyo, JP)
; Isu; Yoshimi; (Tokyo, JP) ; Hasegawa; Yuri;
(Tokyo, JP) |
Correspondence
Address: |
BIRCH STEWART KOLASCH & BIRCH
PO BOX 747
FALLS CHURCH
VA
22040-0747
US
|
Family ID: |
12260195 |
Appl. No.: |
11/980474 |
Filed: |
October 31, 2007 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10642508 | Aug 18, 2003 | |
11980474 | | |
Current U.S.
Class: |
375/240.14 ;
375/E7.081; 375/E7.085; 375/E7.086; 375/E7.094; 375/E7.141;
375/E7.163; 375/E7.165; 375/E7.166; 375/E7.211; 375/E7.243 |
Current CPC
Class: |
H04N 19/105 20141101;
H04N 19/137 20141101; H04N 19/503 20141101; H04N 19/463 20141101;
H04N 19/196 20141101; H04N 19/23 20141101; H04N 19/179 20141101;
H04N 19/44 20141101; H04N 19/573 20141101; H04N 19/46 20141101;
H04N 7/24 20130101; H04N 19/61 20141101; H04N 19/127 20141101; H04N
19/58 20141101; H04N 19/537 20141101; H04N 19/186 20141101; H04N
19/87 20141101; H04N 19/423 20141101; H04N 19/142 20141101 |
Class at
Publication: |
375/240.14 ;
375/E07.243 |
International
Class: |
H04N 7/32 20060101
H04N007/32 |
Foreign Application Data
Date | Code | Application Number |
Feb 13, 1997 | JP | HEI-9-28862 |
Claims
1. A moving picture decoding apparatus, comprising: a memory for
storing previously decoded image as reference image data used for
generating a prediction picture; a prediction picture generation
section for receiving prediction information, for each macroblock,
indicating one of a plurality of prediction modes and a motion
parameter extracted from a bit stream, wherein the plurality of
prediction modes includes a global prediction and a local
prediction, and for generating the prediction picture using the
reference image data and the prediction modes indicated by the
prediction information; and a decoding section for decoding a
texture from the bit stream, and adding the texture to the
prediction picture generated by the prediction picture generation
section so as to obtain a decoded image.
2. The moving picture decoding apparatus according to claim 1,
wherein if the prediction information indicates the global
prediction as the prediction modes, the prediction picture
generation section generates the prediction picture using a global
prediction picture data as the reference image data and a global
motion parameter as the motion parameter.
3. The moving picture decoding apparatus according to claim 1,
wherein if the prediction information indicates the local
prediction as the prediction modes, the prediction picture
generation section generates the prediction picture using a
corresponding macroblock data in the reference picture and a motion
vector as the motion parameter.
4. A moving picture prediction system for predicting a moving
picture to be implemented in a decoder, the moving picture
prediction system comprising: a memory for storing previously
decoded image as reference image data used for generating a
prediction picture; and a prediction picture generation section for
receiving, for each macroblock, prediction information and a motion
parameter extracted from a bit stream, the prediction information
being used for determining whether the prediction picture is
predicted using the reference image data of full decode image or
the prediction picture is predicted using the corresponding
macroblock in the reference image, and for generating the
prediction picture using the reference image data and the
prediction information.
5. A moving picture decoding method, comprising the steps of:
receiving prediction information, for each macroblock, indicating
one of a plurality of prediction modes and a motion parameter
extracted from a bit stream, wherein the plurality of prediction
modes includes a global prediction and a local prediction;
generating a prediction picture using a reference image data of
previously decoded image stored in a memory and the prediction
modes indicated by the prediction information; and decoding a
texture from the bit stream, and adding the texture to the
prediction picture generated by the prediction picture generation
section so as to obtain a decoded image.
6. The moving picture decoding method according to claim 5, wherein
if the prediction information indicates the global prediction as
the prediction modes, the prediction picture generation section
generates the prediction picture using a global prediction picture
data as the reference image data and a global motion parameter as
the motion parameter.
7. The moving picture decoding method according to claim 5, wherein
if the prediction information indicates the local prediction as the
prediction modes, the prediction picture generation section
generates the prediction picture using a corresponding macroblock
data in the reference picture and a motion vector as the motion
parameter.
8. A moving picture prediction method for predicting a moving
picture to be implemented in a decoder, the moving picture
prediction method comprising the steps of: receiving, for each
macroblock, prediction information and a motion parameter extracted
from a bit stream, the prediction information being used for
determining whether the prediction picture is predicted using the
reference image data of full decode image or the prediction picture
is predicted using the corresponding macroblock in the reference
image; and generating a prediction picture using a reference image
data of previously decoded image stored in a memory and the
prediction information.
9. A moving picture prediction system for predicting a moving
picture to be implemented in at least one of an encoder and a
decoder, the moving picture prediction system comprising: a
plurality of reference picture memory areas, each area for storing
picture data of a reference picture to be used for prediction; and
a prediction picture generation section including a motion
compensator for receiving a parameter representing a motion
between an image to be predicted and a reference picture stored in
a reference memory area, and a parameter representing a reference
memory area to be used for prediction, and for generating a
predicted image by using the reference picture data stored in the
plurality of the reference picture memory areas and indicated by
said parameter representing a reference memory area to be used for
prediction, wherein each reference picture memory area is allocated
unique reference number by which the prediction picture generation
section identifies the reference picture memory area storing the
reference picture to be used for prediction.
10. The moving picture prediction system according to claim 9,
wherein each reference picture memory area is allocated unique
reference number based on frequency in use for prediction.
11. The moving picture prediction system according to claim 9,
wherein each reference number is dynamically updated.
12. A moving picture prediction method of predicting a moving
picture to be implemented in at least one of an encoder and a
decoder, the moving picture prediction method comprising the steps
of: receiving a parameter representing a motion between an image
to be predicted and a reference picture stored in a reference
memory area, an indication information being used for identifying a
reference picture data in the reference picture memory areas and an
updating information indicating a timing of updating the reference
picture memory, each memory area for storing picture data of one
reference picture each to be used for prediction; generating a
predicted image data by using the reference picture data identified
by the indication information; and controlling dynamically the
period of time of each reference picture data stored in the
reference picture memory areas based on the updating information.
Description
[0001] This application is a Divisional of co-pending application
Ser. No. 10/642,508 filed on Aug. 18, 2003 and for which priority
is claimed under 35 U.S.C. § 120. The entire contents of the
above-identified application are hereby incorporated by
reference.
TECHNICAL FIELD
[0002] The present invention relates to the prediction of a moving
picture implemented, for example, in
[0003] a moving picture encoder/decoder used in a
portable/stationary video communication device and the like for
visual communications in a video telephone system, a video
conference system or the like,
[0004] a moving picture encoder/decoder used in a picture
storage/recording apparatus such as a digital VTR and a video
server, and
[0005] a moving picture encoding/decoding program implemented in
the form of stand-alone software or as firmware for a Digital Signal
Processor (DSP).
BACKGROUND ART
[0006] MPEG-4 (Moving Picture Experts Group Phase-4) Video
Encoding/Decoding Verification Model (hereinafter referred to by
the initials VM) whose standardization is in progress by ISO/IEC
JTC1/SC29/WG11 may be introduced as a conventional type of
predictive encoding/decoding in an encoding/decoding system of
moving pictures. The VM continues to revise its contents according
to the progress being made in standardization of MPEG-4. Here,
Version 5.0 of the VM is designated to represent the VM and will be
simply referred to as VM hereinafter.
[0007] The VM is a system for encoding/decoding each video object
as one unit in view of a moving picture sequence being an aggregate
of video objects changing their shapes time-/space-wise
arbitrarily. FIG. 29 shows a VM video data structure. According to
the VM, a time-based moving picture object is called a Video Object
(VO), and picture data representing each time instance of the VO,
as an encoding unit, is called a Video Object Plane (VOP). If the
VO is layered in time/space, a special unit called a Video Object
Layer (VOL) is provided between the VO and the VOP for representing
a layered VO structure. Each VOP includes shape information and
texture information to be separated. If the moving picture sequence
includes a single VO, then the VOP is equated to a frame. There is
no shape information included, in this case, and the texture
information alone is then to be encoded/decoded.
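The VO / VOL / VOP hierarchy described above can be sketched as a small set of data classes. This is only an illustrative sketch: the class and field names (`VideoObject`, `VideoObjectLayer`, `VideoObjectPlane`, `is_frame_sequence`) are assumptions made for this example and are not taken from the VM.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VideoObjectPlane:
    """Picture data of a video object at one time instance (VOP)."""
    time: int
    texture: List[List[int]]                 # texture samples, row-major
    shape: Optional[List[List[int]]] = None  # alpha data; None if no shape information


@dataclass
class VideoObjectLayer:
    """Optional layer (VOL) placed between a VO and its VOPs."""
    layer_id: int
    vops: List[VideoObjectPlane] = field(default_factory=list)


@dataclass
class VideoObject:
    """Time-based moving picture object (VO)."""
    vo_id: int
    layers: List[VideoObjectLayer] = field(default_factory=list)


def is_frame_sequence(objects: List[VideoObject]) -> bool:
    """True when the sequence holds a single VO without shape data,
    i.e. when each VOP can be treated as an ordinary frame."""
    if len(objects) != 1:
        return False
    return all(vop.shape is None
               for vol in objects[0].layers
               for vop in vol.vops)
```

In this sketch a VOP without shape data plays the role of a frame, matching the single-VO case in which only texture information is encoded and decoded.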
[0008] The VOP includes alpha data representing the shape
information and texture data representing the texture information,
as illustrated in FIG. 30. Each kind of data is defined as an
aggregate of blocks (alphablocks/macroblocks), and each block in the
aggregate is composed of 16×16 samples. Each alphablock sample is
represented in eight bits. A macroblock includes the chrominance
signals associated with the 16×16-sample luminance signal. VOP data
are obtained from a moving picture sequence processed outside of the
encoder.
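As a rough illustration of how a plane of samples might be decomposed into 16×16 blocks, the following sketch splits a two-dimensional list into blocks. The function name `split_into_blocks` and the requirement that the plane dimensions be multiples of the block size are assumptions made for this example only.

```python
BLOCK = 16  # block size stated in the text (16x16 samples)


def split_into_blocks(plane, block=BLOCK):
    """Map (block_row, block_col) to a block-by-block list of samples.

    Assumes (for this sketch) that the plane height and width are
    multiples of the block size.
    """
    height, width = len(plane), len(plane[0])
    if height % block or width % block:
        raise ValueError("plane dimensions must be multiples of the block size")
    blocks = {}
    for top in range(0, height, block):
        for left in range(0, width, block):
            blocks[(top // block, left // block)] = [
                row[left:left + block] for row in plane[top:top + block]
            ]
    return blocks
```

The same routine could be applied to the alpha data (yielding alphablocks) and to the luminance texture data (yielding the luminance part of macroblocks).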
[0009] FIG. 31 is a diagram showing the configuration of a VOP
encoder according to the VM encoding system. The diagram includes
original VOP data P1 to be inputted, an alphablock P2 representing
the shape information of the VOP, a switch P3a for passing the
shape information, if there is any, of the inputted original VOP
data, a shape encoder P4 for compressing and encoding the
alphablock, compressed alphablock data P5, a locally decoded
alphablock P6, texture data (a macroblock) P7, a motion detector
P8, a motion parameter P9, a motion compensator P10, a predicted
picture candidate P11, a prediction mode selector P12, a prediction
mode P13, a predicted picture P14, a prediction error signal P15, a
texture encoder P16, texture encoding information P17, a locally
decoded prediction error signal P18, a locally decoded macroblock
P19, a sprite memory update unit P20, a VOP memory P21, a sprite
memory P22, a variable-length encoder/multiplexer P23, a buffer
P24, and an encoded bitstream P25.
[0010] FIG. 32 shows a flowchart outlining an operation of the
encoder.
[0011] Referring to the encoder of FIG. 31, the original VOP data
P1 are decomposed into the alphablocks P2 and the macroblocks P7
(Steps PS2 and PS3). The alphablocks P2 and the macroblocks P7 are
transferred to the shape encoder P4 and the motion detector P8,
respectively. The shape encoder P4 is a processing block for data
compression of the alphablock P2 (step PS4), the process of which
is not discussed here further in detail because the compression
method of shape information is not particularly relevant to the
present invention.
[0012] The shape encoder P4 outputs the compressed alphablock data
P5 which is transferred to the variable-length encoder/multiplexer
P23, and the locally decoded alpha data P6 which is transferred
sequentially to the motion detector P8, the motion compensator P10,
the prediction mode selector P12, and the texture encoder P16.
[0013] The motion detector P8, upon reception of the macroblock P7,
detects a local-motion vector on a macroblock basis using reference
picture data stored in the VOP memory P21 and the locally decoded
alphablock P6 (step PS5). Here, the motion vector is one example of
a motion parameter. The VOP memory P21 stores the locally decoded
picture of a previously encoded VOP. The content of the VOP memory
P21 is sequentially updated with the locally decoded picture of a
macroblock whenever the macroblock is encoded. In addition, the
motion detector P8 detects a global warping parameter, upon
reception of the full texture data of the original VOP, by using
reference picture data stored in the sprite memory P22 and locally
decoded alpha data. The sprite memory P22 will be discussed later
in detail.
[0014] The motion compensator P10 generates the predicted picture
candidate P11 by using the motion parameter P9, which is detected
in the motion detector P8, and the locally decoded alphablock P6
(step PS6). Then, the prediction mode selector P12 determines the
final predicted picture P14 and the corresponding prediction
mode P13 of the macroblock by using a prediction error signal power
and an original signal power (step PS7). In addition, the
prediction mode selector P12 judges whether the data are to be coded
by intra-frame coding or inter-frame coding.
[0015] The texture encoder P16 processes the prediction error
signal P15 or the original macroblock through Discrete Cosine
Transformation (DCT) and quantization to obtain a quantized DCT
coefficient based upon the prediction mode P13. An obtained
quantized DCT coefficient is transferred, directly or after
prediction, to the variable-length encoder/multiplexer P23 to be
encoded (steps PS8 and PS9). The variable-length
encoder/multiplexer P23 converts the received data into a bitstream
and multiplexes the data based upon predetermined syntaxes and
variable-length codes (step PS10). The quantized DCT coefficient is
subject to dequantization and inverse DCT to obtain the locally
decoded prediction error signal P18, which is added to the
predicted picture P14, and the locally decoded macroblock P19 (step
PS11) is obtained. The locally decoded macroblock P19 is written
into the VOP memory P21 and the sprite memory P22 to be used for a
later VOP prediction (step PS12).
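The local decoding loop described above (prediction error, quantization, dequantization, and addition of the predicted picture) can be sketched as follows. This is a simplified sketch under stated assumptions: the DCT and inverse DCT are omitted, the quantizer is a plain uniform scalar quantizer, and all function names are hypothetical.

```python
def quantize(values, step):
    """Uniform scalar quantization (an assumption; the VM quantizer is not reproduced)."""
    return [int(round(v / step)) for v in values]


def dequantize(levels, step):
    """Inverse of the uniform quantizer above."""
    return [level * step for level in levels]


def encode_and_locally_decode(original, predicted, step):
    """Return (levels, locally_decoded) for one row of samples.

    error         ~ prediction error signal P15
    levels        ~ quantized data sent to the variable-length encoder
    decoded_error ~ locally decoded prediction error signal P18
    local         ~ locally decoded samples (P19), written back to memory
    """
    error = [o - p for o, p in zip(original, predicted)]
    levels = quantize(error, step)
    decoded_error = dequantize(levels, step)
    local = [p + e for p, e in zip(predicted, decoded_error)]
    return levels, local
```

The point of the loop is that the encoder stores the same reconstructed samples a decoder would obtain, so that later predictions in the encoder and the decoder stay in step.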
[0016] Dominant portions of prediction including a prediction
method, a motion compensation, and the update control of the sprite
memory P22 and the VOP memory P21 will be discussed below in
detail.
(1) Prediction Method in the VM
[0017] Normally, four different types of VOP encoding shown in FIG.
33 are processed in the VM. Each encoding type is associated with a
prediction type or method marked by a circle on a macroblock basis.
With an I-VOP, intra-frame coding is used singly involving no
prediction. With a P-VOP, past VOP data can be used for prediction.
With a B-VOP, both past and future VOP data can be used for
prediction.
[0018] All the aforementioned prediction types are motion vector
based. On the other hand, with a Sprite-VOP, a sprite memory can be
used for prediction. The sprite is a picture space generated
through a step-by-step mixing process of VOPs based upon a warping
parameter set
$\vec{\alpha} = (a, b, c, d, e, f, g, h)$
detected on a VOP basis (an arrow over a symbol denotes a vector
hereinafter). The warping parameter set is determined by the
following parametric equations.
$$x' = \frac{ax + by + c}{gx + hy + 1}$$
$$y' = \frac{dx + ey + f}{gx + hy + 1}$$
The sprite is stored in the sprite memory P22.
[0019] Referring to the parametric equations, (x, y) represents the
pixel position of an original VOP in a two-dimensional coordinate
system. (x', y') represents a pixel position in the sprite memory
corresponding to (x, y) based upon a warping parameter. With the
Sprite-VOP, the warping parameter set can be used uniformly with
each macroblock to determine (x', y') in the sprite memory for
prediction to generate a predicted picture. In a strict sense, the
sprite includes "Dynamic Sprite" used for prediction and "Static
Sprite" used for prediction as well as for another purpose of an
approximate representation of VOP at a decoding station. In FIGS.
34 through 37 below, "sprite" stands for Dynamic Sprite.
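The parametric equations that map a VOP pixel position (x, y) to a sprite memory position (x', y') can be evaluated directly. The sketch below is a minimal illustration; the function name `warp_point` and the explicit check for a zero denominator are assumptions added for this example.

```python
def warp_point(x, y, params):
    """Map (x, y) in the VOP to (x', y') in the sprite memory.

    params is the warping parameter set (a, b, c, d, e, f, g, h):
        x' = (a*x + b*y + c) / (g*x + h*y + 1)
        y' = (d*x + e*y + f) / (g*x + h*y + 1)
    """
    a, b, c, d, e, f, g, h = params
    denom = g * x + h * y + 1.0
    if denom == 0:
        raise ZeroDivisionError("warping is undefined where g*x + h*y + 1 == 0")
    return (a * x + b * y + c) / denom, (d * x + e * y + f) / denom
```

With g = h = 0 the mapping reduces to an affine transform, and with a = e = 1 and b = d = g = h = 0 it reduces to a pure translation by (c, f).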
[0020] The motion detector P8 detects the motion vector and the
warping parameter to be used for the aforementioned prediction
types. The motion vectors and the warping parameters are
generically called the motion parameter P9 hereinafter.
(2) Motion Compensation
[0021] FIG. 34 is a diagram showing the configuration of the motion
compensator P10 in detail. In the figure, a warping parameter P26,
a motion vector P27, a global-motion compensator P28, a
local-motion compensator P29, a warping-parameter based predicted
picture candidate P30, and a motion-vector based predicted picture
candidate P31 are shown. The warping-parameter and motion-vector
based predicted picture candidates P30, P31 are generically called
the predicted picture candidates P11 hereinafter.
[0022] FIG. 35 shows a flowchart outlining the operation of the
motion compensator P10 including steps PS14 through PS21.
[0023] The motion compensator P10 generates the predicted picture
candidate P11 using the warping parameter P26 of a full VOP
detected on a macroblock P7 basis in the motion detector P8 or a
macroblock based motion vector P27. The global-motion compensator
P28 performs a motion compensation using the warping parameter P26,
and the local-motion compensator P29 performs a motion compensation
using the motion vector P27.
[0024] With the I-VOP, the motion compensator P10 does not operate.
(The operating step proceeds to step PS21 from step PS14.) With a
VOP other than the I-VOP, the local-motion compensator P29 reads
out a predicted picture candidate PR1 from the locally decoded
picture of a past VOP stored in the VOP memory P21 by using the
motion vector P27 (step PS15). With the P-VOP, only the predicted
picture candidate PR1 is available.
[0025] When the B-VOP is identified in step PS16, the local-motion
compensator P29 further reads out a predicted picture candidate PR2
from the locally decoded picture of a future VOP stored in the VOP
memory P21 by using the motion vector P27 (step PS17). In addition,
an arithmetic mean of the predicted picture candidates PR1, PR2
obtained from the past and future VOP locally decoded pictures is
calculated to obtain a predicted picture candidate PR3 (step PS18).
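The arithmetic mean of the past and future candidates can be sketched as a per-sample average. The rounding rule (round half up on integer samples) and the function name `average_prediction` are assumptions for this example; the text does not specify the rounding.

```python
def average_prediction(pr1, pr2):
    """Per-sample mean of two predicted picture candidates (PR1, PR2 -> PR3).

    Integer samples are averaged with round-half-up, which is an
    assumption of this sketch.
    """
    return [[(a + b + 1) // 2 for a, b in zip(row1, row2)]
            for row1, row2 in zip(pr1, pr2)]
```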
[0026] A predicted picture candidate PR4 is generated also through
Direct Prediction (step PS19). (Direct Prediction is based upon a
prediction method corresponding to B-Frame in an encoding method
H.263, Recommendation ITU-T. A vector for B-Frame is produced based
upon a group of P-VOP vectors, which is not discussed further here
in detail.) In FIG. 34, the motion-vector based predicted picture
candidates P31 is a generic term for all or part of the predicted
picture candidates PR1 through PR4.
[0027] If a VOP is neither an I-VOP nor a B-VOP, then the VOP is a
Sprite-VOP. With the Sprite-VOP, the predicted picture candidate
PR1 is read out from the VOP memory based upon the motion vector.
In addition, the global-motion compensator P28 reads out the
predicted picture candidate P30 from the sprite memory P22 based
upon the warping parameter P26 in step PS20.
[0028] The global-motion compensator P28 calculates the address of
a predicted picture candidate in the sprite memory P22 based upon
the warping parameter P26, and reads out the predicted picture
candidate P30 from the sprite memory P22 to be outputted based upon
a resultant address. The local-motion compensator P29 calculates
the address of a predicted picture candidate in the VOP memory P21
based upon the motion vector P27 and reads out the predicted
picture candidate P31 to be outputted based upon a resultant
address.
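The address calculation of the local-motion compensator can be illustrated by reading a block from a reference picture at a position displaced by the motion vector. This sketch assumes integer-sample motion vectors and clamps coordinates at the picture border; both choices, and the name `read_block`, are assumptions made for illustration.

```python
def read_block(reference, top, left, motion_vector, size=16):
    """Read a size-by-size predicted block from the reference picture.

    (top, left) is the position of the current block and motion_vector
    is (dy, dx) in whole samples. Coordinates outside the picture are
    clamped to the nearest border sample (an assumption of this sketch).
    """
    dy, dx = motion_vector
    height, width = len(reference), len(reference[0])
    block = []
    for y in range(size):
        ry = min(max(top + dy + y, 0), height - 1)
        row = []
        for x in range(size):
            rx = min(max(left + dx + x, 0), width - 1)
            row.append(reference[ry][rx])
        block.append(row)
    return block
```

A global-motion compensator would differ only in how the source address is computed: from the warping parameter rather than from a per-macroblock motion vector.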
[0029] These predicted picture candidates P11 are evaluated along
with an intra-frame coding signal of the texture data P7 in the
prediction mode selector P12, which selects a predicted picture
candidate having the least power of a prediction error signal along
with a prediction mode.
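The selection of the candidate with the least prediction error power can be sketched as below. The treatment of the intra-frame alternative (its cost is taken as the power of the original signal itself) is an assumption for this example, as are the function names and the mode labels.

```python
def error_power(original, predicted):
    """Sum of squared differences between the original and a candidate."""
    return sum((o - p) ** 2
               for row_o, row_p in zip(original, predicted)
               for o, p in zip(row_o, row_p))


def select_prediction_mode(original, candidates):
    """Return (mode, predicted_block) with the least error power.

    candidates maps a mode label to a predicted block. "intra" is
    returned with no predicted block when no candidate beats the power
    of the original signal (an assumption of this sketch).
    """
    best_mode = "intra"
    best_power = sum(o * o for row in original for o in row)
    best_block = None
    for mode, predicted in candidates.items():
        power = error_power(original, predicted)
        if power < best_power:
            best_mode, best_power, best_block = mode, power, predicted
    return best_mode, best_block
```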
(3) Updating of Memories
[0030] The memory update unit P20 controls the VOP memory P21 and
sprite memory P22 to be updated (step PS12). The contents of these
memories are updated regardless of the prediction mode P13 selected
on a macroblock basis.
[0031] FIG. 36 is a diagram showing the configuration of the memory
update unit P20. FIG. 37 shows a flowchart including steps PS22
through PS28 illustrating the operation of the memory update unit
P20.
[0032] In FIG. 36, an externally supplied VOP encoding type P32, an
externally supplied sprite prediction identification flag P33 for
indicating the use of the sprite memory for prediction, an
externally supplied blend factor P34 used for prediction with the
sprite memory, switches P35, P36, a sprite blender P37, a sprite
transformer P38, a VOP memory update signal P39, and a sprite
update signal P40 are shown.
[0033] Firstly, it is examined whether the use of the sprite with
the current VO or VOL is designated by the sprite prediction
identification flag P33 (step PS22). If the use of the sprite is not
designated, it is examined whether the data are a B-VOP (step PS27).
With the B-VOP, no updating is performed on the VOP memory
P21. With either the I-VOP or the P-VOP, the VOP memory P21 is
overwritten with the locally decoded macroblock P19 on a
macroblock basis (step PS28).
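The update decision can be summarized as a small function of the VOP encoding type and the sprite prediction identification flag. This is one possible reading of the flowchart, not a definitive implementation: the function name, the string labels for VOP types, and the assumption that the sprite memory is updated whenever the flag designates sprite use are all hypothetical.

```python
def plan_memory_update(vop_type, sprite_in_use):
    """Decide which memories are updated after a macroblock is decoded locally.

    vop_type is one of "I", "P", "B", "Sprite" (labels assumed here).
    The VOP memory is left untouched for a B-VOP; the sprite memory is
    updated only when the sprite flag designates its use.
    """
    if vop_type not in ("I", "P", "B", "Sprite"):
        raise ValueError("unknown VOP type: %r" % (vop_type,))
    return {
        "vop_memory": vop_type != "B",
        "sprite_memory": bool(sprite_in_use),
    }
```

Note that the decision does not depend on the prediction mode chosen for the macroblock, which is the behavior the conventional system is later criticized for.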
[0034] With the use of the sprite designated in step PS22, then the
VOP memory P21 is updated in the same manner as above (steps PS23,
PS24), and in addition, the sprite memory P22 is updated through
the following procedure.
a) Sprite Warping (Step PS25)
[0035] In the sprite transformer P38, an area
$M(\vec{R}, t-1)$
in the sprite memory P22 ($M(\vec{R}, t-1)$ is an area
having the same size as that of a VOP having the origin of the
coordinates at a position in the sprite memory P22 with the VOP at
a time t) is subject to warping (transformation) based upon a
warping parameter
$\vec{\alpha} = (a, b, c, d, e, f, g, h)$.
b) Sprite Blending (Step PS26)
[0036] By using a resultant warped picture from a) above, a new
sprite memory area is calculated in the sprite blender P37
according to the following expression,
$$M(\vec{R}, t) = (1 - \alpha)\, W_b[M(\vec{R}, t-1), \vec{\alpha}] + \alpha\, VO(\vec{r}, t),$$
where $\alpha$ is the blend factor P34, $W_b[M, \vec{\alpha}]$ is the
resultant warped picture, and $VO(\vec{r}, t)$ is a pixel value of a
locally decoded VOP with a location $\vec{r}$ and a time t.
[0037] With a non-VOP area in a locally decoded macroblock, it is
assumed that
$VO(\vec{r}, t) = 0$.
As the blend factor .alpha. is assigned on a VOP basis, a locally
decoded VOP is collectively blended into the sprite memory P22
based upon a weight .alpha., regardless of the contents of a VOP
area.
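The sprite blending step is a per-sample weighted sum of the warped sprite area and the locally decoded VOP. The sketch below assumes both inputs are already aligned two-dimensional lists of the same size; the function name `blend_sprite` and the range check on the blend factor are assumptions for this example.

```python
def blend_sprite(warped_area, decoded_vop, alpha):
    """Blend a locally decoded VOP into the warped sprite area.

    Implements M(R, t) = (1 - alpha) * W_b[M(R, t-1), alpha_vec] + alpha * VO(r, t)
    per sample, where `alpha` is the scalar blend factor and `warped_area`
    is the already warped sprite area. Samples outside the VOP are expected
    to be 0 in `decoded_vop`.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("blend factor must lie in [0, 1]")
    return [[(1 - alpha) * w + alpha * v for w, v in zip(row_w, row_v)]
            for row_w, row_v in zip(warped_area, decoded_vop)]
```

Because a single blend factor applies to the whole VOP, a sample where the decoded VOP is 0 (a non-VOP area) still pulls the stored sprite value toward 0, which illustrates the content-independent blending noted in the text.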
[0038] According to the aforementioned prediction system in the
conventional encoding system, the video object is predicted by
using the memory designed to be used for detecting the motion
vector alone and the memory designed to be used for detecting the
warping parameter alone, each of which is structurally limited to
holding a single screen at most. Thus, only a limited set of
reference pictures is available for prediction, which hinders a
sufficient improvement in prediction efficiency.
[0039] Further, in such a system where two or more video objects
are encoded concurrently, these memories only include a reference
picture representing the past record of a video object to be
predicted alone, which limits the variation of a reference picture
and precludes the utilization of a correlation among video objects
for prediction.
[0040] Further, the memories are updated regardless of such items
as the internal structure, a characteristic, and the past record of
the video object. As a result, the memories may fail to retain
information that is significant for predicting a video object,
thereby posing a problem of failing to enhance prediction
efficiency.
[0041] The present invention is directed to solving the
aforementioned problems. An objective of this invention is to
provide the prediction system for encoding/decoding of picture data
where two or more memories are provided to store the past record of
the moving picture sequence effectively in consideration of the
internal structure and characteristic of the moving picture
sequence, thereby achieving a highly efficient prediction as well
as encoding/decoding. In addition, the prediction system provides a
sophisticated inter-video object prediction performed among two or
more video objects.
DISCLOSURE OF THE INVENTION
[0042] According to the present invention, a moving picture
prediction system, for predicting a moving picture to be
implemented in at least one of an encoder and a decoder, includes a
plurality of memories for storing picture data for reference to be
used for prediction, the plurality of memories being corresponding
to different transform methods, respectively, and a prediction
picture generation section for receiving a parameter representing a
motion of a picture segment to be predicted, and for generating a
predicted picture using the picture data stored in one of the
plurality of memories used for the picture segment to be predicted
based upon the parameter and one of the transform methods
corresponding to the one of the plurality of memories.
[0043] The encoder generates a prediction memory indication
information signal indicating the one of the plurality of memories
used for generating the predicted picture and transmits the
prediction memory indication information signal and the parameter
to a decoding station so as to generate the predicted picture using
the picture data stored in the one of the plurality of memories
based upon the one of the transform methods corresponding to the
one of the plurality of memories in the decoding station.
[0044] The decoder receives the parameter and a prediction memory
indication information signal indicating the one of the plurality
of memories used for generating the predicted picture from an
encoding station, wherein the prediction picture generation section
generates the predicted picture using the picture data stored in
the one of the plurality of memories based upon the parameter and
the one of the transform methods corresponding to the one of the
plurality of memories.
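The idea that each memory is tied to its own transform method, and that a memory indication selects both the stored picture and the transform, can be sketched as follows. This is an illustrative sketch only: the two transforms shown (translation and affine), the nearest-sample lookup with border clamping, and all class and function names are assumptions rather than details taken from the text.

```python
def translation(x, y, params):
    """Transform for a memory addressed by a motion vector (dx, dy)."""
    dx, dy = params
    return x + dx, y + dy


def affine(x, y, params):
    """Transform for a memory addressed by six affine parameters."""
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f


class ReferenceMemory:
    """A stored reference picture together with its transform method."""

    def __init__(self, picture, transform):
        self.picture = picture
        self.transform = transform

    def sample(self, x, y, params):
        # Nearest-sample lookup with clamping at the picture border.
        sx, sy = self.transform(x, y, params)
        row = min(max(int(round(sy)), 0), len(self.picture) - 1)
        col = min(max(int(round(sx)), 0), len(self.picture[0]) - 1)
        return self.picture[row][col]


def predict_sample(memories, memory_index, x, y, params):
    """Predict one sample using the memory named by the indication signal."""
    return memories[memory_index].sample(x, y, params)
```

Here `memory_index` stands in for the prediction memory indication information signal: the encoder would transmit it together with the parameter, and the decoder would use it to pick the same memory and transform.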
[0045] Further, according to the present invention, a moving
picture prediction system, for predicting a moving picture to be
implemented in at least one of an encoder and a decoder, includes
a plurality of memories for storing picture data for reference to
be used for prediction, the plurality of memories being assigned to
different parameter effective value ranges, respectively, and a
prediction picture generation section for receiving a parameter
representing a motion of a picture segment to be predicted, for
selecting one of the plurality of memories assigned to one of the
parameter effective value ranges including a value of the
parameter, and for generating a predicted picture using the picture
data stored in a selected memory.
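Selecting a memory from the value of the motion parameter can be sketched as a lookup over value ranges. The half-open ranges, the use of a single scalar magnitude, and the names below are assumptions for illustration; the text only states that each memory is assigned a parameter effective value range.

```python
def select_memory(parameter_value, ranges):
    """Return the memory whose effective value range contains the parameter.

    ranges is a list of (low, high, memory_name) tuples interpreted as
    half-open intervals [low, high); this convention is an assumption.
    """
    for low, high, memory_name in ranges:
        if low <= parameter_value < high:
            return memory_name
    raise ValueError("no memory is assigned to value %r" % (parameter_value,))
```

For example, with hypothetical ranges `[(0, 4, "memory_a"), (4, 16, "memory_b")]`, a motion magnitude of 5 selects `"memory_b"` without any explicit memory index having to be transmitted.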
[0046] Still further, according to the present invention, a moving
picture prediction system, for predicting a moving picture to be
implemented in at least one of an encoder and a decoder, includes
a plurality of memories for storing picture data for reference to
be used for prediction and a prediction picture generation section
including a motion compensator for receiving a parameter
representing a motion of a picture segment to be predicted, and for
generating a predicted picture by using the picture data stored in
the plurality of memories based upon the parameter, and a memory
update unit for updating the picture data stored in at least one of
the plurality of memories at an arbitrary timing.
[0047] The moving picture prediction system predicts the moving
picture in a moving picture sequence having first and second video
objects, wherein the plurality of memories includes separate first
and second pluralities of memories corresponding to the first and
second video objects, respectively, and the prediction picture
generation section includes separate first and second generators,
respectively, corresponding to the first and second video objects,
wherein the first generator uses the picture data stored in at
least one of the first and second pluralities of memories to
generate the predicted picture when predicting the first object,
and generates information indicating a use of the second plurality
of memories for predicting the first object, the information being
added to the predicted picture.
[0048] The prediction picture generation section generates the
predicted picture through a change of either one of a number and a
size of the plurality of memories in response to a change in the
moving picture at each time instance.
[0049] The prediction picture generation section generates the
predicted picture in a limited use of memories for prediction in
response to a change in the moving picture at each time
instance.
[0050] The prediction picture generation section generates the
predicted picture by calculating a plurality of the predicted
pictures generated by using the respective picture data stored in
the plurality of memories.
[0051] The moving picture prediction system further includes a
significance detector for detecting a feature parameter
representing a significance of the picture segment to be predicted,
wherein the prediction picture generation section generates the
predicted picture by selecting at least one of choices of at least
one of a plurality of prediction methods, the plurality of
memories, and a plurality of memory update methods.
[0052] The moving picture prediction system further includes a
significance detector for detecting a parameter representing at
least one of an amount of bits available for coding the picture
segment to be predicted, an amount of change of the picture segment
at each time instance, and a significance of the picture segment,
wherein the prediction picture generation section generates the
predicted picture by selecting at least one of choices of at least
one of a plurality of prediction methods, the plurality of
memories, and a plurality of memory update methods.
[0053] The moving picture prediction system predicts the moving
picture on a video object basis, wherein the moving picture
prediction system further includes a significance detector for
detecting a parameter representing at least one of an amount of
bits available for coding a video object to be predicted, an amount
of change in the video object at each time instance, and a
significance of the video object, wherein the prediction picture
generation section generates the predicted picture by selecting at
least one of choices of at least one of a plurality of prediction
methods, the plurality of memories, and a plurality of memory
update methods.
[0054] The moving picture prediction system further includes a
prediction information encoder for encoding prediction relating
information of the moving picture, wherein the prediction picture
generation section counts the number of times each memory is used
for prediction and determines a rank of the plurality of memories
based upon the counted number of times, wherein the prediction
information encoder allocates a code length to the prediction
relating information to be encoded based upon the rank of the
memory used for prediction.
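The rank-based code allocation described above can be sketched as follows. This is only an illustrative sketch: the function names, the tie-breaking rule, and the truncated-unary code table are assumptions made for the example and are not specified in this description.

```python
from collections import Counter

def rank_memories(usage_history, memory_ids):
    """Rank memories by how often each was used for prediction (most used first).

    Ties are broken by the order of memory_ids (an assumption of this sketch).
    """
    counts = Counter(usage_history)
    return sorted(memory_ids, key=lambda m: (-counts[m], memory_ids.index(m)))

def assign_codes(ranked_ids):
    """Give shorter codes to higher-ranked memories (hypothetical truncated-unary code).

    With three memories: rank 0 -> '0', rank 1 -> '10', rank 2 -> '11'.
    """
    codes = {}
    last = len(ranked_ids) - 1
    for rank, mem in enumerate(ranked_ids):
        codes[mem] = "1" * rank + ("0" if rank < last else "")
    return codes

# Memory ids used for the macroblocks predicted so far (illustrative data).
history = ["c", "c", "a", "c", "b", "a"]
ranked = rank_memories(history, ["a", "b", "c"])
codes = assign_codes(ranked)
```

In this sketch the most frequently used memory receives a one-bit code, so the prediction relating information costs fewer bits when one memory dominates.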
[0055] The plurality of memories includes at least a frame memory
for storing the picture data on a frame basis and a sprite memory
for storing a sprite picture.
[0056] The sprite memory includes at least one of a dynamic sprite
memory involving a regular updating, and a static sprite memory not
involving the regular updating.
[0057] The one of the transform methods corresponding to the one of
the plurality of memories is at least one of a parallel
translation, an affine transformation, and a perspective
transformation in an interchangeable manner.
[0058] Still further, according to the present invention, a method
for predicting a moving picture to be implemented in at least one
of an encoding and a decoding, includes the steps of storing picture
data for reference to be used for prediction in a plurality of
memories, associating different transform methods with the
plurality of memories, respectively, receiving a parameter
representing a motion of a picture segment to be predicted, and
generating a predicted picture using the picture data stored in one
of the plurality of memories used for predicting the picture
segment based upon the parameter and one of the transform methods
corresponding to the one of the plurality of memories.
[0059] The method for predicting a moving picture further includes
the steps of generating a prediction memory indication information
signal indicating the one of the plurality of memories used for the
picture segment to be predicted, and transmitting the prediction
memory indication information signal and the parameter to a
decoding station.
[0060] The method for predicting a moving picture is implemented in
the decoding, and further includes the step of receiving a
prediction memory indication information signal indicating the one
of the plurality of memories used for generating the predicted
picture and the parameter representing a motion of the picture
segment to be predicted from an encoding station.
[0061] Still further, according to the present invention, a method,
for predicting a moving picture to be implemented in at least one
of an encoding and a decoding, includes the steps of storing
picture data for reference to be used for prediction in a plurality
of memories, assigning separate parameter effective value ranges to
the plurality of memories, respectively, receiving a parameter
representing a motion of a picture segment to be predicted,
selecting one of the plurality of memories assigned to one of the
parameter effective value ranges including a value of the
parameter, and generating a predicted picture using the picture
data stored in a selected memory.
[0062] Still further, according to the present invention, a method,
for predicting a moving picture to be implemented in at least one
of an encoding and a decoding, includes the steps of storing
picture data for reference to be used for prediction in a plurality
of memories, receiving a parameter representing a motion of a
picture segment to be predicted, generating a predicted picture
using the picture data stored in the plurality of memories based
upon the parameter, and updating the picture data stored in at
least one of the plurality of memories at an arbitrary timing.
BRIEF DESCRIPTION OF DRAWINGS
[0063] FIG. 1 is a structural diagram of a moving picture encoder
according to an embodiment of this invention.
[0064] FIG. 2 is a flowchart illustrating an operation of the
moving picture encoder according to the embodiment of this
invention.
[0065] FIG. 3 is a structural diagram illustrating the
configuration of a motion compensator of the moving picture encoder
of the embodiment of this invention.
[0066] FIG. 4 is a flowchart illustrating an operation of the
motion compensator.
[0067] FIG. 5 is a structural diagram illustrating the structure of
a memory update unit of the moving picture encoder of the
embodiment of this invention.
[0068] FIG. 6 is a flowchart illustrating an operation of the
memory update unit.
[0069] FIG. 7 is a structural diagram illustrating the
configuration of a motion compensator of a moving picture encoder
according to another embodiment of this invention.
[0070] FIG. 8 is a flowchart illustrating an operation of the
motion compensator of FIG. 7.
[0071] FIG. 9 is a structural diagram illustrating the
configuration of a motion compensator of a moving picture encoder
according to another embodiment of this invention.
[0072] FIG. 10 is a flowchart illustrating an operation of the
motion compensator of FIG. 9.
[0073] FIG. 11 is a structural diagram of a moving picture encoder
according to another embodiment of this invention.
[0074] FIG. 12 is a structural diagram showing the configuration of
a motion compensator of the moving picture encoder according to the
embodiment of this invention.
[0075] FIG. 13 is a flowchart illustrating an operation of the
motion compensator of FIG. 12.
[0076] FIG. 14 is a structural diagram illustrating the
configuration of a memory update unit of a moving picture encoder
according to another embodiment of this invention.
[0077] FIG. 15 is a flowchart illustrating an operation of the
memory update unit of FIG. 14.
[0078] FIG. 16 is a structural diagram of a moving picture encoder
according to another embodiment of this invention.
[0079] FIG. 17 is a structural diagram of a moving picture encoder
according to another embodiment of this invention.
[0080] FIG. 18 is a structural diagram of a moving picture encoder
according to another embodiment of this invention.
[0081] FIG. 19 is a structural diagram of a moving picture encoder
according to another embodiment of this invention.
[0082] FIG. 20 is a diagram showing bit stream 21 according to the
first embodiment of this invention.
[0083] FIG. 21 is a diagram showing bit stream 21 according to the
second embodiment of this invention.
[0084] FIG. 22 is a diagram showing bit stream 21 according to the
third embodiment of this invention.
[0085] FIG. 23 is a diagram showing bit stream 21 according to the
sixth embodiment of this invention.
[0086] FIG. 24 is a structural diagram of a moving picture decoder
according to another embodiment of this invention.
[0087] FIG. 25 is a structural diagram illustrating the
configuration of a motion compensator of the moving picture decoder
according to the embodiment of this invention.
[0088] FIG. 26 is a flowchart illustrating an operation of the
motion compensator.
[0089] FIG. 27 is an exemplary diagram of interpolation.
[0090] FIG. 28 is a flowchart illustrating an operation of a memory
update unit of the moving picture decoder according to the
embodiment of this invention.
[0091] FIG. 29 is an exemplary diagram of a video data
configuration according to the VM encoding system.
[0092] FIG. 30 is an exemplary diagram of a VOP data structure.
[0093] FIG. 31 is a structural diagram illustrating the
configuration of a VM encoder.
[0094] FIG. 32 is a flowchart illustrating an operation of the
encoder of FIG. 31.
[0095] FIG. 33 is an exemplary diagram of VOP encoded types and
corresponding prediction types.
[0096] FIG. 34 is a structural diagram illustrating the
configuration of a motion compensator of the encoder of FIG.
31.
[0097] FIG. 35 is a flowchart illustrating an operation of the
motion compensator of FIG. 34.
[0098] FIG. 36 is a structural diagram illustrating the
configuration of a memory update unit of the encoder of FIG.
31.
[0099] FIG. 37 is a flowchart illustrating an operation of the
memory update unit of FIG. 36.
BEST MODE FOR CARRYING OUT THE INVENTION
Embodiment 1
[0100] FIG. 1 is a block diagram showing the configuration of an
encoder according to a first embodiment and the following
embodiments. The diagram illustrates an input moving picture signal
1, texture data 2, a motion detector 3, a motion parameter 4, a
motion compensator 5, a predicted picture candidate 6, a prediction
mode selector 7, a prediction mode 8, a predicted picture 9, a
prediction error picture 10, a texture encoder 11, a quantized DCT
coefficient 12, a locally decoded prediction error picture 13, a
locally decoded picture 14, a memory update unit 15, a memory-a 16,
a memory-b 17, a memory-c 18, a variable-length encoder/multiplexer
19, a transmission buffer 20, a bitstream 21, a scene-change
detector 80, and a timer 81. In particular, the motion compensator 5
and the memory update unit 15 form a prediction picture generation
section 100 which implements a prediction system. Memories a, b, c
form a memory area 200. In the figure, portions not mentioned in
this embodiment will be discussed in the following embodiments.
FIG. 2 shows a flowchart illustrating an operating flow of the
encoder.
[0101] This embodiment is based upon the assumption that a plural
number, e.g. three, of memories are used adaptively according to
the significance of an input moving picture based upon such
characteristics of motion as an amount of motion and an intensity
of color. It is also assumed that the content of an arbitrary
memory (area), the memory-a for example, is updated at an
arbitrary period of time and a moving picture sequence is received
on a frame basis.
(1) Input Signal
[0102] As aforementioned, the encoder inputs a frame representing a
picture at each time instance of the moving picture sequence and
decomposes the frame into the encoding units of macroblocks which
are one example of picture segments subject to prediction (step
S1).
(2) Adaptive Use of Memories
[0103] The memories store previously decoded pictures or previously
provided fixed pictures. In this embodiment, the three memories are
used adaptively according to the significance of a picture segment
in a frame as follows.
[0104] The memory-a stores a least significant picture segment
(i.e., a background-like picture segment whose motion is static or
flat and the texture is flat.)
[0105] The memory-b stores a less significant picture segment
(i.e., a picture segment of an object whose motion is relatively
small.)
[0106] The memory-c stores a most significant picture segment
(i.e., a picture segment of an object whose motion is complicated
or drastic.)
[0107] The least significant picture segment to be stored in the
memory-a may be a background picture segment in a video conference
scene or the like. The least significant picture segment also
corresponds to a background segment in a camera-work relating flat
motion scene of a full screen including slightly moving objects.
With this type of motion, it is efficient to obtain a frame based
amount of a motion to substitute for a macroblock based motion,
rather than to obtain a macroblock based amount of a motion.
Specifically, a transform parameter corresponding to the sprite
warping parameter discussed in the conventional art is obtained and
the transform parameter of a full frame is then used as the motion
parameter of a macroblock in the frame. The motion parameter may be
selected from among a simple parallel translation parameter (=a
motion vector), an affine motion parameter involving
transformation, and a perspective motion parameter involving
transformation. Here, a motion vector is one example of the motion
parameter.
[0108] The less significant picture segment to be stored in the
memory-b may be a picture segment of a moving figure who is not a
speaker in a video conference scene or the like. This type of
object segment may be considered to attract less attention in the
scene. The most significant picture segment to be stored in the
memory-c may be a segment of an object attracting the most
attention in the video conference scene, such as a speaker.
[0109] A picture segment stored in the memory-b or the memory-c
representing a unique type of motion of an object should have a
macroblock based unique motion parameter. The motion parameter of
this case may be selected from among the simple parallel
translation parameter (=a motion vector), the affine motion
parameter involving transformation, the perspective motion
parameter involving transformation, etc.
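The three kinds of motion parameter named above can be illustrated by the coordinate mappings they imply. The following is a minimal sketch under commonly used parameterizations (two values for the parallel translation, six for the affine transformation, eight for the perspective transformation); the function names and the parameter ordering are assumptions of this sketch and are not defined in this description.

```python
def translate(x, y, mv):
    """Parallel translation: shift the point (x, y) by the motion vector (dx, dy)."""
    dx, dy = mv
    return x + dx, y + dy

def affine(x, y, p):
    """Affine transformation with six parameters (a, b, c, d, e, f).

    Covers rotation, expansion/contraction and translation.
    """
    a, b, c, d, e, f = p
    return a * x + b * y + c, d * x + e * y + f

def perspective(x, y, p):
    """Perspective transformation with eight parameters (a, ..., h).

    Reduces to the affine case when g = h = 0.
    """
    a, b, c, d, e, f, g, h = p
    w = g * x + h * y + 1.0
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w
```

Each function maps a pixel position in the picture segment to be predicted onto a position in the reference picture, which is where the predicted sample would be read.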
(3) Motion Detection (Step S2)
[0110] The motion detector 3 of this embodiment is designed to
detect an arbitrary transform parameter of the respective three
memories on a macroblock basis, involving no distinction between a
motion vector and a warping parameter in the conventional art. The
motion detector 3 is provided with additional functions of a
global-motion parameter detection for detecting a frame based
transform parameter using the memory-a and a local-motion parameter
detection for detecting a macroblock based transform parameter
using the memories a through c.
(4) Motion Compensation (Step S3)
[0111] FIG. 3 shows the configuration of the motion compensator 5
of this embodiment in detail. In the figure, a prediction picture
memory address calculator 22, a prediction picture memory address
23, a memory reader 24, and a reference memory indicator signal 25
which is supplied externally are shown. In this embodiment, the
reference memory indicator signal 25 indicates which of the
memories a, b, c is used. FIG. 4 shows a flowchart including steps S11
through S16 illustrating an operation of the motion compensator
5.
[0112] Initially, with an I(Intra)-frame, no motion compensation is
performed (step S11). With a frame other than the I-frame,
predicted picture candidates are generated based upon the
global-motion and local-motion parameters corresponding to the
respective memories detected in the motion detector 3 (steps S12
through S15). Specifically, the prediction picture memory address
calculator 22 calculates the prediction picture memory address 23
of a predicted picture candidate in a memory identified by the
reference memory indicator signal 25 based upon the motion
parameter 4. Upon reception of the prediction picture memory
address 23, the memory reader 24 reads out the predicted picture
candidate 6 from a corresponding memory to be outputted.
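The address calculation and read-out described above can be sketched for the simplest case of a parallel translation. The sketch assumes that each memory is a two-dimensional list of samples, that the motion parameter is an integer motion vector, and that addresses falling outside the reference picture are clamped to its border; the function name and these conventions are assumptions for illustration only.

```python
def read_predicted_candidate(memories, memory_id, top, left, mv, size=16):
    """Read a size x size predicted picture candidate from the indicated memory.

    memories  : dict mapping a memory id ('a', 'b', 'c') to a 2-D list of samples
    memory_id : plays the role of the reference memory indicator signal
    (top, left): position of the macroblock in the current frame
    mv        : (dx, dy) integer motion vector (parallel translation only)
    """
    ref = memories[memory_id]
    height, width = len(ref), len(ref[0])
    dx, dy = mv
    block = []
    for r in range(size):
        row = []
        for c in range(size):
            # Address calculation, clamped to the reference picture border.
            y = min(max(top + r + dy, 0), height - 1)
            x = min(max(left + c + dx, 0), width - 1)
            row.append(ref[y][x])
        block.append(row)
    return block

# A tiny 4 x 4 reference picture whose sample value encodes its position.
reference = [[10 * r + c for c in range(4)] for r in range(4)]
candidate = read_predicted_candidate({"a": reference}, "a", 1, 1, (1, 0), size=2)
```

An encoder would call such a routine once per memory to obtain the predicted picture candidates, whereas a decoder would call it once for the memory signalled in the bitstream.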
[0113] In this embodiment, the global-motion and local-motion
parameters are obtained through the same transform method, thereby
allowing the motion compensator 5 of FIG. 3 to be shared by both
global-motion and local-motion parameter based approaches of
generating a predicted picture. When generating the predicted
picture candidate 6 through the global-motion parameter, the
memory-a is always used as a reference memory (step S15).
(5) Prediction Mode Selection (Step S4)
[0114] The prediction mode of this embodiment is assumed to include
the following.
(a) a mode for using the memory-a,
(b) a mode for using the memory-b,
(c) a mode for using the memory-c, and
(d) a mode for using an intra-frame coding signal.
Similarly to the discussion in the conventional art, the prediction
mode selector 7 selects the predicted picture candidate 6 having
the least power (amplitude) of a prediction error signal, for
example, from among all the predicted picture candidates 6
generated in the motion compensator 5 along with an intra-frame
coding signal, and outputs the selected one of the predicted
picture candidates 6 as the predicted picture 9 together with the
corresponding prediction mode 8. The prediction mode 8 includes
memory selection information indicating the memory used for
predicting the selected predicted picture 9. The prediction mode 8
is transferred to the variable-length encoder/multiplexer 19 to be
encoded with an allocated length of code in the bitstream 21 as
prediction memory indication information 800.
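The selection by least prediction error power can be sketched as follows. The sum of squared differences is used here as the error power, and the cost of the intra-frame mode is approximated by the deviation of the block from its own mean; both measures, and all names in the sketch, are assumptions for illustration rather than the measure prescribed by this embodiment.

```python
def error_power(block, predicted):
    """Sum of squared differences between the input block and a candidate."""
    return sum(
        (b - p) ** 2
        for block_row, pred_row in zip(block, predicted)
        for b, p in zip(block_row, pred_row)
    )

def intra_power(block):
    """Hypothetical intra cost: squared deviation of the block from its mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels)

def select_prediction_mode(block, candidates):
    """Return (mode, predicted picture) with the least error power.

    candidates maps a mode name ('a', 'b', 'c') to its predicted picture
    candidate; the mode 'intra' has no predicted picture (None).
    """
    costs = {mode: error_power(block, pred) for mode, pred in candidates.items()}
    costs["intra"] = intra_power(block)
    best = min(costs, key=costs.get)
    return best, candidates.get(best)

block = [[10, 12], [14, 16]]
candidates = {
    "a": [[0, 0], [0, 0]],
    "b": [[10, 12], [14, 15]],
    "c": [[8, 12], [14, 16]],
}
mode, predicted = select_prediction_mode(block, candidates)
```

The returned mode name stands in for the prediction mode 8, and is what would be encoded as the prediction memory indication information.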
(6) Memory Updating
[0115] The memory update unit 15 controls the memories to be
updated. FIG. 5 shows the configuration of the memory update unit
15 of this embodiment in detail. In the figure, an activity 26 used
for updating the memory-a (which will be discussed later), a
memory-a update judger 27, a reference memory selector 28, switches
29, 30, picture data 31 for updating the memory-a, picture data 32
for updating the memory-b, picture data 33 for updating the
memory-c, and global prediction picture data 34 for updating the
memory-a are shown. FIG. 6 shows a flow of a memory updating
operation.
[0116] The memory updating operation of this embodiment has the
following procedure. Upon reception of the locally decoded picture
14, the memory update unit 15 judges the necessity of updating the
memory-a with a frame including the locally decoded picture 14 in
the memory-a update judger 27 (step S17). The reference memory
selector 28 selects a memory used for predicting the locally
decoded picture based upon the prediction mode 8 (steps S18, S19).
Then, a reference picture stored in a selected memory is updated
with one of the picture data 31, 32, 33 for updating the memory-a,
the memory-b, the memory-c, respectively, and the global prediction
picture data 34 for updating the memory-a of the locally decoded
picture 14 based upon the following rule. A memory is assumed to be
updated each frame on a prediction unit (macroblock) basis.
(1) Frame Based Regular Updating of the Memory-b and the Memory-c
(Steps S20, S21):
[0117] The locally decoded picture 14 is written into either the
memory-b or the memory-c used for predicting the picture.
(2) Frame Based Adaptive Updating of the Memory-a (Steps S22,
S23):
[0118] The locally decoded picture 14 is written into the memory-a
used for predicting the picture for an arbitrary frame only or at
an arbitrary period of time based upon a memory-a update judgement
1000 obtained in step S17.
[0119] The content of memory-a is the past record of a
time-unvarying picture segment such as a background picture. This
removes the necessity of regularly updating the content of the
memory unless a complicated or drastic movement involving the full
screen, such as a scene change, occurs to cause a drastic change in
the content of a picture segment.
[0120] As aforementioned, a frame based regular updating is
performed with a complicated or drastic area of an object, whereas
a longer-term based updating is performed with the content of the
memory-a, thereby achieving an effective prediction with a
background picture half visible among moving objects.
[0121] Viewed in this light, the memory-a is updated at an
arbitrary period of time in this embodiment. Specifically, possible
arbitrary update criteria are as follows.
a. A full-screen content is updated all at once with a
global-motion parameter indicating a complicated or drastic motion,
whereas no updating operation is performed with the parameter
indicating a rather static motion.
b. A full-screen content is updated all at once on a predetermined
period basis, regardless of a frame based period of time.
c. A full-screen content is updated all at once only with a frame
immediately after a scene change is detected.
[0122] In this embodiment, the data serving as the arbitrary update
criteria are generically called the activity 26 used for updating
the memory-a. Initially, the memory-a update judger 27 judges whether
to update the content of memory-a based upon the activity 26 (step
S17). Specifically, the activity 26 corresponds to a value of the
global-motion parameter detected in the motion detector 3 with the
arbitrary update criterion-a, a time stamp of the current frame
from the timer 81 with the arbitrary update criterion-b, and a flag
indicating a scene change detection outputted from the scene-change
detector 80 with the arbitrary update criterion-c.
[0123] When the content of memory-a is judged to be updated, the
content of the locally decoded picture 14 is outputted as the
global prediction picture data 34 for updating the content of the
memory-a (step S23). When no updating is judged with the content of
the memory-a, then no updating is performed with the memory-a.
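The judgement made by the memory-a update judger 27 under the three update criteria can be sketched as follows. The threshold on the global-motion magnitude and the update period are hypothetical tuning values introduced only for this sketch, as are the function and argument names.

```python
def judge_memory_a_update(criterion, activity, motion_threshold=8, period=30):
    """Decide whether the memory-a should be updated with the current frame.

    criterion 'a': activity is the magnitude of the global-motion parameter;
                   update only when the motion is drastic (above a threshold).
    criterion 'b': activity is the time stamp (frame count) of the current
                   frame; update once every `period` frames.
    criterion 'c': activity is the scene-change detection flag.
    motion_threshold and period are assumed values, not taken from the text.
    """
    if criterion == "a":
        return activity > motion_threshold
    if criterion == "b":
        return activity % period == 0
    if criterion == "c":
        return bool(activity)
    raise ValueError("unknown update criterion: %r" % (criterion,))

def update_memory_a(memory_a, locally_decoded_frame, judgement):
    """Overwrite the memory-a content only when the judgement is positive."""
    return locally_decoded_frame if judgement else memory_a
```

The boolean returned by the judger corresponds to the memory-a update judgement 1000, so a decoder that receives the same value can repeat the same update.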
[0124] The memory-a update judgement 1000 of a frame is multiplexed
in the bitstream 21 to be transmitted to a decoding station so that
the same updating of the memory-a can be performed with the frame
in the decoding station.
[0125] FIG. 20 is a diagram illustrating the bitstream 21 of this
embodiment.
[0126] FIG. 20 is a conceptual diagram showing how frame data are
encoded in sequence to be transmitted. Each frame data is provided
with header information at the front as a frame based additional
information. The header information has the memory-a update
judgement 1000 multiplexed to be transmitted to a decoding station.
The header information is followed by the component macroblock data
of the frame. The macroblock data include the prediction memory
indication information 800 indicating a memory used for predicting
the macroblock data. In a counterpart decoder, a memory for
predicting a predicted picture is specified based upon the
prediction memory indication information 800 of macroblock data to
generate the predicted picture.
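The layout of FIG. 20 as described above can be modelled, purely for illustration, with two small record types: a frame carrying the memory-a update judgement in its header, followed by macroblock data each carrying its prediction memory indication information. The class and field names are hypothetical, and no actual bit-level syntax is implied.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MacroblockData:
    prediction_memory: str      # prediction memory indication information
    payload: bytes = b""        # remaining coded data (left opaque here)

@dataclass
class FrameData:
    memory_a_update: bool       # memory-a update judgement in the frame header
    macroblocks: List[MacroblockData] = field(default_factory=list)

def plan_decoding(frames: List[FrameData]) -> List[Tuple[int, bool, List[str]]]:
    """List, per frame, the memory-a update flag and the memory for each macroblock."""
    plan = []
    for index, frame in enumerate(frames):
        memories = [mb.prediction_memory for mb in frame.macroblocks]
        plan.append((index, frame.memory_a_update, memories))
    return plan

sequence = [
    FrameData(True, [MacroblockData("a"), MacroblockData("c")]),
    FrameData(False, [MacroblockData("b")]),
]
plan = plan_decoding(sequence)
```

A decoder following this sketch would read the header flag once per frame and then select the reference memory macroblock by macroblock.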
[0127] Although not shown in the figures, the memory-b update
information and/or the memory-c update information may be
transmitted to the decoding station along with, or alternatively
to, the memory-a update judgement 1000.
[0128] The aforementioned encoder thus provides the adaptive and
efficient use of two or more memories in response to the content of
a moving picture sequence, thereby enhancing prediction efficiency.
Specifically, a moving picture sequence is predicted based upon an
arbitrary transform parameter through an adaptive use of two or
more memories in response to the content and characteristic of a
moving picture sequence. This enables an efficient prediction of a
moving picture in response to a local characteristic of a picture
by even covering complicated motion. With the enhanced prediction
efficiency, the encoder is allowed to reduce an amount of encoded
data without deteriorating encoded picture quality. The same
prediction system may be employed by the counterpart decoder for
decoding a bitstream encoded through the prediction system of this
invention.
[0129] This embodiment has thus disclosed the encoder performing on
a frame basis. The same effect may be expected with an alternative
encoder performing on an arbitrary shaped video object (VOP)
basis.
[0130] Further, this embodiment has thus disclosed the encoder
performing on a macroblock basis, with the macroblock being the
picture segment subjected to prediction. The same effect can be
expected with an alternative encoder for encoding a picture on such
a picture segment basis as an arbitrary shaped picture segment and
a variable shaped block including fixed-size block components.
[0131] Further, this embodiment has thus disclosed the
global-motion parameter detection using the memory-a.
Alternatively, a single use of the local-motion parameter detection
is of course applicable involving no global-motion parameter
detection. With no global-motion detection, no necessity occurs for
transmitting prediction information indicating a global/local
prediction as the prediction mode.
[0132] Further, this embodiment may include a special memory for
prediction which stores reference picture data previously generated
based upon the content of the moving picture sequence. The special
memory is not updated during an encoding operation.
[0133] Further, this embodiment has thus disclosed the case that
memories a, b, c store a picture segment each time and the memory
update unit 15 updates one of the memories a, b, c each time. If
two or all of the memories a, b, c store the same picture in part
or in full, then the memory update unit 15 updates the two or all of
the memories a, b, c. In the case of the memory-a being a frame
memory for storing a frame of reference picture data, the memory-b
being a static sprite memory involving the adaptive updating, and
the memory-c being a dynamic sprite memory involving the regular
updating, the memory update unit 15 does not update the memory-b as
the static sprite memory for storing previously fixed reference
picture data but updates the memory-a and the memory-c concurrently
when the memories store the same reference picture segment. Thus,
if a duplicated storage of the reference picture data occurs with
the memories a, b, c, then the memory update unit 15 updates a
duplicated segment stored in each memory.
[0134] The aforementioned can also be applied to the following
embodiments.
[0135] Further, this embodiment has thus disclosed the use of three
memories a, b, c, but alternatively, two of the memories may be
utilized.
[0136] Further, a counterpart decoder may be provided with the
prediction picture generation section 100 including the same
components as the motion compensator 5 and the memory update unit
15 discussed in this embodiment. A motion compensator provided in
the decoder, having no necessity of generating all the three
predicted picture candidates, generates a single predicted picture
alone based upon a decoded motion parameter.
Embodiment 2
[0137] A second embodiment shows an encoder with a single
replacement of the motion compensator 5 of the encoder shown in
FIG. 1. The configuration and operation of a motion compensator 5a
of the second embodiment are now described.
[0138] FIG. 7 shows the configuration of the motion compensator 5a
of this embodiment in detail. The figure includes a reference
memory determiner 35. FIG. 8 shows a flowchart illustrating a
detailed operation of the motion compensator 5a.
[0139] Initially, with the I-frame, no compensation is performed
(step S24). With a frame other than the I-frame, the reference
memory determiner 35 determines a reference memory based upon a
value of the motion parameter 4 (step S25). The reference memory
determiner 35 holds effective motion parameter value ranges (which
will be discussed later in detail) allocated, respectively, to the
memories a, b, c. The reference memory determiner 35 compares the
respective effective motion parameter value ranges with the value
of the motion parameter 4 to judge which memory is designated by
the motion parameter 4 and outputs a reference memory indicator
signal 25a for identifying the respective memories a, b, c.
[0140] The effective motion parameter value ranges are effective
search ranges allocated to the respective memories for detecting a
motion vector, for example. Specifically, if ±15 pixels are
assumed to be given for a total search value range, then the
memory-a is chosen to be used for prediction in a range of ±0 to
3 pixels, the memory-b is used in a range of ±4 to 8 pixels, and
the memory-c is used in a range of ±9 to 15 pixels, for example.
Here, the reference memory determiner 35 operates only when the
local-motion parameter is used for prediction, because the memory-a
is exclusively used as a reference memory when the predicted
picture candidate is generated based upon the global-motion
parameter. This motion-vector value based approach of identifying a
memory to be used for prediction is based upon the assumption that
a background picture should include a static motion and a most
attractive picture should include a complicated or drastic motion.
This motion-vector value based approach of identifying a memory for
prediction involves no necessity of encoding the prediction mode to
be transmitted.
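The operation of the reference memory determiner 35 with the example ranges above can be sketched as follows. The sketch assumes that the value compared against the ranges is the larger absolute component of the motion vector; that choice, and the names used, are assumptions for illustration.

```python
# Effective motion parameter value ranges from the example: (lowest, highest)
# absolute displacement in pixels allocated to each memory.
EFFECTIVE_RANGES = {"a": (0, 3), "b": (4, 8), "c": (9, 15)}

def determine_reference_memory(mv, ranges=EFFECTIVE_RANGES):
    """Return the memory whose effective range contains the motion vector value.

    mv is an integer (dx, dy) motion vector; its magnitude is taken here as
    max(|dx|, |dy|), which is an assumption of this sketch.
    """
    magnitude = max(abs(mv[0]), abs(mv[1]))
    for memory_id, (lowest, highest) in ranges.items():
        if lowest <= magnitude <= highest:
            return memory_id
    raise ValueError("motion vector %r is outside the total search range" % (mv,))
```

Because the memory follows from the motion vector alone, a decoder holding the same ranges can repeat this determination without any separately transmitted memory selection.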
[0141] Next, the predicted picture candidate 6 is generated based
upon the reference memory indicator signal 25a of a selected memory
(steps S26 through S30). Specifically, the prediction picture
memory address calculator 22 calculates the prediction picture
memory address 23 of the predicted picture candidate 6 in the
memory identified by the reference memory indicator signal 25a
based upon the motion parameter 4. Based upon the prediction
picture memory address 23, the memory reader 24 reads out the
predicted picture candidate 6 from the memory to be outputted.
[0142] As the global-motion and local-motion parameters of this
embodiment are based upon the same transform method, both parameter
based approaches can share the motion compensator 5a of FIG. 7 for
generating the predicted picture candidate. When reading out the
predicted picture candidate 6 based upon the global-motion
parameter (step S31), the memory-a is always used as a reference
memory.
[0143] The effective motion parameter value ranges may be fixed on
a moving picture sequence basis, and alternatively, changed on a
frame basis, for example. With the frame based changes, the
effective motion parameter value ranges assigned to the respective
memories of the frame are multiplexed in a bitstream to be
transmitted to a decoding station to perform the same memory
selection.
[0144] FIG. 21 is a diagram showing the bitstream 21 of this
embodiment.
[0145] The bitstream is provided with header information added at
the front on a moving picture sequence basis. The header
information includes effective motion parameter value range
indication information of the respective memories. By thus
designating the effective motion parameter value range indication
information at the front of a moving picture sequence, the moving
picture sequence is predicted with the fixed effective motion
parameter value ranges in a counterpart decoder.
[0146] When varying the effective motion parameter value ranges
each frame, the effective motion parameter value range indication
information is to be included in the header information added on a
frame basis.
[0147] Thus, the efficient and adaptive use of the memories in
response to the magnitude of a local motion of a frame can be
provided by the encoder including the motion compensator 5a,
thereby enhancing prediction efficiency.
[0148] This embodiment has thus disclosed the encoder performing on
a frame basis. The same effect may be expected with an alternative
encoder performing on an arbitrary shaped video object (VOP)
basis.
[0149] Further, this embodiment has thus disclosed the encoder
performing on a macroblock basis. The same effect may be expected
with an alternative encoder for encoding a picture on such a
picture segment basis as an arbitrary shaped picture segment and a
variable shaped block including fixed-size block components.
[0150] Further, this embodiment has thus disclosed the
global-motion parameter detection using the memory-a.
Alternatively, a single use of the local-motion parameter detection
is of course applicable involving no global-motion parameter
detection. With no global-motion detection, there is no necessity
for transmitting information indicating a global/local prediction
as the prediction mode.
[0151] Further, a counterpart decoder may be provided with the
prediction picture generation section 100 including the same
component as the motion compensator 5a discussed in this embodiment.
In the decoder, a motion compensator only generates a single
predicted picture based upon a decoded motion parameter.
Embodiment 3
[0152] Another embodiment shows an encoder having a single
replacement of the motion compensator 5 of the encoder of FIG. 1.
The configuration and operation of a motion compensator 5b are now
described. A motion detector 3a employed in this embodiment is
assumed to output an amount of the parallel translation, the affine
parameter, and the perspective parameter as motion parameters
4a.
[0153] Further, the memory-a of this embodiment is assumed to be a
frame memory for storing a reference picture frame, the memory-b is
assumed to be a static sprite memory, and the memory-c is assumed
to be a dynamic sprite memory.
[0154] FIG. 9 shows the configuration of the motion compensator 5b
of this embodiment in detail. In the figure, a parallel translation
amount 36 (i.e., a motion vector), an affine parameter 37, a
perspective parameter 38, a parallel-translation based prediction
picture memory address calculator 39, an affine parameter based
prediction picture memory address calculator 40, and a perspective
parameter based prediction picture memory address calculator 41 are
shown. FIG. 10 is a flowchart illustrating the operation of the
motion compensator 5b in detail.
[0155] Initially, with the I-frame, no prediction is performed
(step S33). With a frame other than the I-frame, the prediction
picture memory address calculators 39 through 41 calculate the
respective prediction picture memory addresses 23 based upon the
respective values of the motion parameters 4a (step S34).
[0156] The memory address calculators 39, 40, 41 calculate
addresses based upon picture transform methods assigned,
respectively, to the corresponding memories. In this embodiment,
the parallel translation is assigned to the memory-a, the affine
parameter involving such a simple transformation as a rotation and
an expansion/contraction is assigned to the memory-b, and the
perspective parameter involving a three-dimensional complicated
motion is assigned to the memory-c. These transform methods may be
expressed by the following transform expressions.
[Parallel Translation]
[0157] Amount of parallel translation (a, b):
x'=x+a
y'=y+b
[Affine Transform]
[0158] Affine parameter (a, b, c, θ):
x'=a(cos θ)x+a(sin θ)y+b
y'=a(-sin θ)x+a(cos θ)y+c
[Perspective Transform]
[0159] Perspective parameter (a,b,c,d,e,f,g,h):
x'=(ax+by+c)/(gx+hy+1)
y'=(dx+ey+f)/(gx+hy+1)
[0160] Here, (x, y) in a two-dimensional coordinate system
represents a pixel location of an original macroblock. (x', y')
represents a pixel location in a memory corresponding to (x, y)
based upon each of the parameters. That is, a location (x', y') in
a memory is calculated based upon these parameters. Through this
mechanism, a memory most suitable for the characteristic of a
motion can be chosen to be used for prediction on a macroblock
basis. With calculated prediction picture memory addresses 23 based
upon the respective motion parameters 36, 37, 38, the memory reader
24 reads out the predicted picture candidates 6 from corresponding
memories to be outputted (steps S35 through S39).
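The three transform expressions above can be sketched in code as follows. This is a minimal illustration only, not the implementation of the address calculators 39 through 41: the function names, the 16x16 macroblock size, and the absence of any rounding or boundary clipping are assumptions made for the example. The perspective function takes the eight coefficients a through h that appear in the expressions.

```python
import math

def translate(x, y, a, b):
    """Parallel translation: x' = x + a, y' = y + b."""
    return x + a, y + b

def affine(x, y, a, b, c, theta):
    """Affine transform as written in the text (scale a, rotation theta, offsets b, c)."""
    xp = a * math.cos(theta) * x + a * math.sin(theta) * y + b
    yp = -a * math.sin(theta) * x + a * math.cos(theta) * y + c
    return xp, yp

def perspective(x, y, a, b, c, d, e, f, g, h):
    """Perspective transform with the shared denominator g*x + h*y + 1."""
    denom = g * x + h * y + 1.0
    if denom == 0.0:
        raise ValueError("degenerate perspective mapping at this pixel")
    return (a * x + b * y + c) / denom, (d * x + e * y + f) / denom

def macroblock_addresses(transform, params, x0, y0, size=16):
    """Map every pixel (x, y) of a macroblock to its memory location (x', y').

    'size' is a hypothetical macroblock width/height used for illustration.
    """
    return [transform(x, y, *params)
            for y in range(y0, y0 + size)
            for x in range(x0, x0 + size)]
```

For example, macroblock_addresses(translate, (1, 1), 0, 0) yields 256 locations, the first of which is (1, 1). A practical address calculator would additionally round or interpolate the non-integer locations produced by the affine and perspective cases.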
[0161] The transform methods assigned to the respective memories of
the frame are multiplexed in the bitstream 21 to be transmitted to
a decoding station as a motion detection method indication
information so that the same motion compensation can be performed
in the decoding station.
[0162] FIG. 22 is a diagram showing the bitstream 21 of this
embodiment.
[0163] Header information added at the front of a moving picture
sequence includes the motion detection method indication
information. In the encoder, the transformation types to be used in
the respective memories are interchangeable, and thus the motion
detection method indication information indicating a
memory-transform method relation is to be transmitted to the
counterpart decoder as the header information of the moving picture
sequence. Thus, transformation types assigned to be used with the
respective memories can be identified in the decoder.
[0164] In the decoder, the identified transformation types are
dynamically assigned to the respective memories.
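The signalling of the memory-transform method relation may be pictured with the following sketch. The actual syntax of the motion detection method indication information is not specified in this description, so the two-bit code per memory, the order of the memories, and the function names below are purely hypothetical.

```python
# Hypothetical code table: index in this tuple is the 2-bit code value.
TRANSFORMS = ("translation", "affine", "perspective")

def pack_assignment(assignment, memories=("a", "b", "c")):
    """Encoder side: write one 2-bit code per memory, in a fixed memory order."""
    return "".join(format(TRANSFORMS.index(assignment[m]), "02b") for m in memories)

def unpack_assignment(bits, memories=("a", "b", "c")):
    """Decoder side: recover which transform type is assigned to each memory."""
    return {m: TRANSFORMS[int(bits[2 * i:2 * i + 2], 2)]
            for i, m in enumerate(memories)}
```

Because the decoder recovers the same mapping from the sequence header, the assignment can be interchanged in the encoder without any change to the decoder logic.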
[0165] Thus, the efficient and adaptive use of the memories in
response to the characteristic of a local motion of a frame is
provided by the encoder including the motion compensator 5b,
thereby enhancing prediction efficiency.
[0166] This embodiment has thus disclosed the encoder performing on
a frame basis. The same effect may be expected with an alternative
encoder performing on an arbitrary shaped video object (VOP)
basis.
[0167] Further, this embodiment has thus disclosed the encoder
performing on a macroblock basis. The same effect may be expected
with an alternative encoder for encoding a picture on such a
picture segment basis as an arbitrary shaped picture segment and a
variable shaped block including fixed-size block components.
[0168] Further, this embodiment has thus disclosed the
global-motion parameter detection using the memory-a.
Alternatively, a single use of the local-motion parameter detection
is of course applicable involving no global-motion parameter
detection. No global-motion detection involves no necessity of
transmitting information of a global/local prediction as the
prediction mode.
[0169] Further, this embodiment has thus disclosed the use of the
memories a, b, and c. Alternatively, the use of memories a and b
alone, memories a and c alone, or memories b and c alone, is also
applicable.
[0170] Further, a decoder may be provided with the prediction
picture generating section 100 including the same component as the
motion compensator 5b discussed in this embodiment. A motion
compensator in a decoder only generates a single predicted picture
based upon a decoded motion parameter.
Embodiment 4
[0171] Another embodiment shows an encoder which receives a plural
number, two for example, of different video objects, having shape
information, intermingled in a moving picture sequence for a
collective encoding. FIG. 11 shows the configuration of the encoder
of this embodiment.
[0172] In the figure, an input picture frame 42, an object
separator 43, object data 44a, 44b, shape blocks 45a, 45b, switches
46a, 46b, shape encoders 47a, 47b, compressed shape block data 48a,
48b, locally decoded shape blocks 49a, 49b, texture data
(macroblocks) 50a,50b, motion detectors 51a, 51b, motion parameters
52a, 52b, motion compensators 53a, 53b, predicted picture
candidates 54a, 54b, prediction mode selectors 55a, 55b, prediction
mode information 56a, 56b, predicted pictures 57a, 57b, prediction
error signals 58a, 58b, texture encoders 59a, 59b, compressed
texture data 60a, 60b, locally decoded prediction error signals
61a, 61b, locally decoded macroblocks 62a, 62b, a memory update
unit 63, memory-a 64, memory-b 65, memory-c 66, memory-d 67,
memory-e 68, memory-f 69, variable-length encoders 70a, 70b, a
multiplexer 71, a buffer 72, a bitstream 73, a memory section 94,
an object-A encoder 88a for encoding an object-A, and an object-B
encoder 88b for encoding an object-B are shown. The object encoders
88a, 88b are structurally identical to each other with the
identical components.
[0173] This encoder inputs the picture frame 42, which is
decomposed into the encoding units of objects in the object
separator 43. The object separator 43 may employ an arbitrary
processing method.
[0174] The shape information of an object is transferred to the
shape encoder 47a, 47b in a form of the shape block 45a, 45b to be
encoded, and then transferred to the variable-length encoder 70a,
70b as the compressed shape block data 48a, 48b.
[0175] The motion detector 51a, 51b detects a motion parameter
based upon the locally decoded shape block 49a, 49b in the same
manner as that of the VM encoding system. A motion parameter can be
detected on a macroblock basis by using all the memories a through
f.
[0176] As a rule, however, the memories a through c are designed to
be used for an object-A to be encoded in the object-A encoder 88a,
and the memories d through f are designed to be used for an
object-B to be encoded in the object-B encoder 88b.
[0177] Also, as for a motion type, an arbitrary transform parameter
is assumed to be detected on a macroblock basis with all the
memories in the memory section 94, involving no distinction between
the motion vector and the warping parameter.
[0178] The motion compensator 53a, 53b generates all the predicted
picture candidates 54a, 54b based upon the respective motion
parameters 52a, 52b. Then, in the prediction mode selector 55a,
55b, the predicted picture 57a, 57b is obtained along with the
prediction mode information 56a, 56b. The predicted picture 57a,
57b is then subtracted from the original signal, i.e., the texture
data 50a, 50b, to obtain the prediction error signal 58a, 58b, which
is encoded in the texture encoder 59a, 59b to be transmitted to the
variable-length encoder 70a, 70b. The locally decoded prediction
error signal 61a, 61b is added to the predicted picture 57a, 57b to
obtain the locally decoded macroblock 62a, 62b to be stored into
the memories a through f in accordance with an indication by the
memory update unit.
[0179] Object A/B data when encoded in the object-A/B encoder 88a,
88b are multiplexed in the bitstream 73 at the multiplexer 71 to be
transmitted via the buffer 72.
[0180] The prediction of this embodiment is discussed below
focusing on the motion compensator 53a, 53b playing a primary role
in the prediction.
[0181] The motion compensator 53a, 53b of this embodiment generates
a predicted picture candidate based upon the motion parameter 52a,
52b detected in the motion detector 51a, 51b. FIG. 12 shows the
configuration of the motion compensator 53a in detail. FIG. 13
shows a flowchart illustrating the operation of the motion
compensator 53a in the object-A encoder 88a.
[0182] In FIG. 12, an object-B reference judger 74a and an object-B
reference indicator flag 75a are shown.
[0183] The motion parameter 52a includes memory information used
for detection. A predicted picture candidate is generated based
upon a parameter value through the prediction picture memory
address calculator 22a and a memory reader 24a in the same manner
as that stated in the first embodiment (step S44 through step S49).
The object-B reference judger 74a judges if the memories assigned
to object-B are used for predicting the current macroblock based
upon the reference memory information included in the motion
parameter 52a (step S43).
[0184] The object-B reference judger 74a outputs a judged result as
the object-B reference indicator flag 75a, which is multiplexed in
the bitstream 73 to be transmitted to a decoding station, where it
is used for deciding whether the object can be reproduced by using
only the object's own memories a, b, c. To guarantee that the
object can be reproduced from its own memories alone in the
decoding station, prediction can be restricted to the object's own
memories (a, b, c alone) by an externally supplied signal 85a at
the time of encoding the object.
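The role of the object-B reference judger 74a may be sketched as below. The sketch assumes that the reference memory information carried in the motion parameter 52a is simply a memory label, and that the externally supplied signal 85a acts as a boolean restriction; both are simplifications introduced for illustration, as are the function names.

```python
OBJECT_A_MEMORIES = ("a", "b", "c")   # memories assigned to object-A
OBJECT_B_MEMORIES = ("d", "e", "f")   # memories assigned to object-B

def object_b_reference_flag(reference_memory):
    """True if the current macroblock of object-A is predicted from an object-B memory."""
    return reference_memory in OBJECT_B_MEMORIES

def searchable_memories(restrict_to_own):
    """Memories the object-A encoder may search (restrict_to_own models signal 85a)."""
    if restrict_to_own:
        return OBJECT_A_MEMORIES
    return OBJECT_A_MEMORIES + OBJECT_B_MEMORIES

def reproducible_alone(flags):
    """Decoder side: object-A can be reproduced by itself only if no flag was set."""
    return not any(flags)
```

With the restriction enabled, every flag is necessarily false, so the decoder can reproduce the object without the memories of the other object.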
[0185] Thus, the efficient and adaptive use of the memories in
response to the characteristic of a local motion of a frame is
provided by the encoder including the motion compensator 53a, 53b,
thereby achieving an efficient prediction.
[0186] This embodiment has thus disclosed the encoder for encoding
an object on a macroblock basis. The same effect may be expected
with an alternative encoder for encoding a picture on such a
picture segment basis as an arbitrary shaped picture segment and a
variable shaped block including fixed-size block components.
[0187] Further, a decoder may be provided with the same components
as the motion compensator 53a, 53b of this embodiment. A motion
compensator 53 of the decoder only generates a single predicted
picture based upon a decoded motion parameter. Further, if the
decoder is structured so as to acknowledge whether a decoding
object can be reproduced by itself by way of decoding a bit
corresponding to object reference indicator flag 75a, 75b of the
other object in a bitstream, then an error-free secured
reproduction of decoded object data can be achieved.
Embodiment 5
[0188] Another embodiment shows an encoder where the number of
memories or the size of a memory can be varied flexibly in response
to a change in a video object at each time instance. The encoder of
a fifth embodiment modifies the encoder of FIG. 1 with a
replacement of the memory update unit 15.
[0189] FIG. 14 shows the configuration of a memory update unit 15a
of this embodiment in detail. In the figure, a memory expansion
judger 76, a memory expansion indicator signal 77, a memory
contraction judger 78, and a memory contraction indicator signal 79
are shown. FIG. 15 shows an operating flow (step S51 through S63)
of the memory update unit 15a.
[0190] A picture substantially different from the past record of a
moving picture sequence stored in the memories may occur due to a
scene change or the like. This may cause a deterioration of
prediction efficiency after the scene change if reference pictures
stored in the existing memories are the only ones available. For such an
occasion, the scene-change detector 80 detects a scene change, a
frame appearing immediately after the detected scene change is
subject to intra-frame coding or the like, and resultant
intra-frame coded data are stored additionally in a memory as new
reference data, thereby enhancing prediction efficiency
thereafter.
[0191] Further, in consideration of the physical limitation of
storage capacity, a flexible approach of contracting the portions
which are rarely used for prediction of the reference pictures
stored in the memories is introduced. Specifically, the frequency
in use of memory areas for prediction of the respective memories a,
b, c is examined in the memory update unit 15a based upon the
prediction mode 8. Consequently, the memory update unit releases a
memory area found to be used infrequently from the area in use. For
example, with a software based implementation of this encoder,
limited RAM resources may be used effectively.
[0192] Viewed in this light, the memory update unit 15a of this
embodiment is provided with a function of expanding a memory area
in response to each time instance of a time-varying moving picture
sequence and contracting a memory area including a reference
picture rarely used for prediction.
[0193] As in the first embodiment, the memory-a update judger 27
judges whether the memory-a is to be updated (step S50).
When updating the memory-a, the locally decoded picture 14 is
written into the memory-a (steps S56, S57). The locally decoded
picture 14 is written into the other memories as well in accordance
with the prediction mode 8 (step S51 through S55).
[0194] The updating of the contents of the memories involves the
judgement of memory expansion/contraction. The memory expansion
judger 76 judges whether to expand the size of the memory-a (or the
memory-b, or the memory-c) based upon the activity 26 used for
updating the memory-a (steps S58 through S60). When a positive
judgement is made due to a scene change or the like, the expansion
of the memory is indicated by the memory expansion indicator signal
77. The memory contraction judger 78 counts the number of times
each memory area is used for prediction based upon the prediction
mode 8. When a memory area has been used for prediction fewer than
a predetermined number of times, the contraction of the memory area
is indicated by the memory contraction indicator signal 79 (steps
S61 through S63).
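The two judgements described above may be sketched as follows. The comparison of the activity against a threshold, the threshold values, and the representation of memory areas as string labels are assumptions made for this example; the description only states that expansion is based upon the activity 26 and contraction upon a usage count derived from the prediction mode 8.

```python
from collections import Counter

def judge_expansion(activity, threshold):
    """Memory expansion judger (hypothetical rule): expand when activity exceeds a threshold."""
    return activity > threshold

def judge_contraction(use_counts, memory_areas, min_uses):
    """Memory contraction judger: list the areas used fewer than min_uses times."""
    return [area for area in memory_areas if use_counts.get(area, 0) < min_uses]

# Usage counts accumulated from the prediction mode of each macroblock.
history = ["a", "a", "b", "a"]
counts = Counter(history)
to_release = judge_contraction(counts, ["a", "b", "c"], min_uses=2)
```

In this usage, the areas "b" and "c" fall below the predetermined number and would be indicated for contraction, while "a" is kept.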
[0195] Thus, a highly efficient prediction can be achieved in
response to each time instance of a time-varying moving picture
sequence by the encoder including the memory update unit 15a. In
addition, the dynamic allocation of memory areas required for
prediction contributes to the enhancement of prediction efficiency
and the effective use of memory resources.
[0196] This embodiment has thus disclosed the encoder performing on
a frame basis. The same effect may be expected with an alternative
encoder performing on an arbitrary shaped video object (VOP)
basis.
[0197] Further, this embodiment has thus disclosed the encoder for
encoding a frame on a macroblock basis. The same effect can be
expected with an alternative encoder for encoding a picture on such
a picture segment basis as an arbitrary shaped picture segment and
a variable shaped block including fixed-size block components.
[0198] Further, a counterpart decoder may be provided with the same
component as the memory update unit 15a discussed in this
embodiment.
Embodiment 6
[0199] With reference to the respective previous embodiments, the
memories to be used for prediction are changed on a macroblock
basis. Alternatively, the memories to be used for prediction can be
changed on a frame or a video object basis. In that case, the
memory-related information only needs to be encoded on a frame or a
video object basis, which eliminates the necessity of encoding the
memory selection information (which is included in the prediction
mode 8) on a macroblock basis, thereby achieving an efficient
encoding.
[0200] With reference to the encoder of FIG. 1 of the first
embodiment, for example, the macroblock based changes of the
memories used for prediction create the necessity of transmitting
additional information identifying a memory used for prediction on
a macroblock basis. According to this embodiment, the changing unit
of the memories to be used for prediction is limited to a frame or
a video object, thereby eliminating the additional information to
be transmitted on a macroblock basis effectively. FIG. 23 shows a
difference of the transmission bitstream 21 of this embodiment from
the transmission bitstream 21 of FIG. 20 of the first embodiment.
The bitstream of FIG. 23 represents a frame based change of the
memories to be used for prediction with the prediction memory
indication information 800 included in frame based header
information. The bitstream of FIG. 23 may be effective, for
example, in the case that the picture characteristic of a moving
picture sequence changes infrequently including little changes
locally on a macroblock level. Further, a decoder may be provided
so as to decode the thus encoded bitstream to reproduce a frame or
a video object.
Embodiment 7
[0201] With reference to the previous embodiments, predicted
picture candidates read out from an arbitrary plural number of the
memories, for example two (e.g., memories a and b), may be subject
to an arithmetic mean to obtain a picture as a member of the
predicted picture candidates 6 or as the predicted picture 9.
Further, a decoder may be provided so as to decode the thus encoded
bitstream to reproduce a frame or a video object.
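The arithmetic mean of two predicted picture candidates may be sketched as follows. The blocks are represented as lists of rows of integer pixel values, and the round-half-up integer rounding is an assumption of this example rather than something stated in the description.

```python
def average_candidates(block_a, block_b):
    """Pixel-wise arithmetic mean of two equally sized candidate blocks.

    The '+ 1' implements round-half-up for non-negative integer pixels
    (an assumed rounding rule, chosen only for illustration).
    """
    if len(block_a) != len(block_b) or any(
            len(ra) != len(rb) for ra, rb in zip(block_a, block_b)):
        raise ValueError("candidate blocks must have the same dimensions")
    return [[(pa + pb + 1) // 2 for pa, pb in zip(ra, rb)]
            for ra, rb in zip(block_a, block_b)]
```

The averaged block can then be treated like any other candidate when the prediction mode is selected.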
Embodiment 8
[0202] With reference to the encoders of the previous embodiments,
a previously detected feature parameter representing the spatial
complexity, perceptual significance and the like of a picture
segment as a prediction unit may be utilized as tools for deciding
a prediction mode and for judging the updating of the memories.
[0203] For example, a moving picture is assumed to include a motion
too complicated to encode data in an acceptable quality within a
given amount of encoding. In this case, significance is examined on
a prediction picture segment (e.g., a macroblock, an arbitrary
shaped picture segment, an arbitrary shaped block) basis.
Consequently, a low quality encoding is assigned to some extent to
a less significant segment in order to save some amount of encoding
for a more significant segment, thereby improving an overall
picture quality. With the encoders of this invention where two or
more memories are switched interchangeably at an arbitrary timing
to be used for prediction, a more adaptive prediction can be
achieved in response to the characteristic of a picture, through
detecting a feature parameter representing the significance of a
prediction picture segment and then determining the use of the
memories dynamically based upon a detected feature parameter. For
example, as shown in FIG. 16, a segment-significance detector 95 is
provided for detecting the feature parameter on a segment basis to
determine the significance of the segment. The segment-significance
detector 95 transfers a segment-significance to a prediction mode
selector 7a and a quantization parameter based upon the
segment-significance to a texture encoder 11a. With a segment
judged more significant in the segment-significance detector 95, a
most complex mode among two or more prediction modes available is
used for prediction. Specifically, reference pictures from the
respective memories a, b, c are used to obtain the motion
parameters and the predicted pictures, respectively, based upon a
complex motion model. In the prediction mode selector 7a, a
prediction mode having the highest prediction efficiency is
selected from among modes including an arbitrary combination (e.g.,
an arithmetic mean) of the predicted pictures. At the same time,
reference pictures of all the memories used for prediction are
updated. The texture encoder 11a performs an encoding using a
quantization parameter having a smaller quantization step size.
With a less significant segment, a simplified prediction mode
(i.e., a parallel translation amount detection using a single
memory) is employed for prediction and a quantization parameter
having a larger quantization step size is utilized for encoding,
regardless of the amplitude of an obtained prediction error signal,
so that the amount of encoding is reduced. Through this control, a
less significant segment reduces its picture quality to some extent
and a more significant segment maintains its quality through a
high-quality prediction, thereby improving an overall quality
within a given amount of encoding.
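The control described in this embodiment may be sketched as follows. The numeric step sizes, the dictionary layout, and the use of the sum of absolute differences (SAD) as the measure of prediction efficiency are assumptions introduced for illustration; the description itself does not fix any of them.

```python
def select_coding_tools(is_significant, base_step=16, step_delta=4):
    """Choose prediction tools and quantization step from the segment significance."""
    if is_significant:
        return {
            "memories": ("a", "b", "c"),        # all memories may be referenced
            "motion_model": "complex",
            "allow_combination": True,          # e.g., arithmetic mean of candidates
            "quant_step": max(1, base_step - step_delta),
        }
    return {
        "memories": ("a",),                     # single memory only
        "motion_model": "parallel_translation",
        "allow_combination": False,
        "quant_step": base_step + step_delta,
    }

def pick_best_mode(original, candidates):
    """Return the name of the candidate with the smallest SAD (assumed efficiency measure)."""
    def sad(pred):
        return sum(abs(o - p) for o, p in zip(original, pred))
    return min(candidates, key=lambda name: sad(candidates[name]))
```

A more significant segment thus receives a smaller quantization step and the full set of prediction modes, while a less significant segment is encoded with a single memory and a larger step.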
Embodiment 9
[0204] In an encoder where a moving picture sequence is predicted
and encoded by using two or more memories, a parameter representing
an amount of encoding available for the moving picture sequence at
each time instance, an amount of a change in a scene at a certain
time instance (e.g., a scene change detection), or the feature
parameter or significance of a prediction picture segment described
in the eighth embodiment may be detected previously. The values of
these parameters may be used for predicting a picture at a
particular time instance in a prediction system. Alternatively,
these values may be utilized as judging tools for selecting a
reference memory area in a prediction system. A frame-significance
detector 96 may be provided for determining the significance on a
frame basis as shown in FIG. 17. The frame-significance detector 96
detects, for example, an amount of a change in a motion between the
current and the previous frame (e.g., a scene change detection by
the scene change detector 80), the appearance of a new object or
the disappearance of an object or the like. A final significance of
the current frame is determined in consideration of an amount of
encoding available for the current frame informed by the
transmission buffer 20. Based upon the final significance, a more
significant frame may be predicted by using all the prediction
methods and reference memory areas available for the maximum
possible improvement of prediction efficiency, whereas a less
significant frame is predicted in a limited use of the prediction
methods and the reference memory areas for a simplified encoding so
as to reduce the throughput. An alternative encoder for performing
intra-frame coding alone involving no prediction at a scene change
may be possible. In addition, a more sophisticated quality control
may be achieved with a joint use of the segment-significance
detector 95 discussed in the eighth embodiment. Through this
control, a less significant frame reduces its quality to some
extent and a more significant frame maintains its quality by a
high-quality prediction, thereby improving an overall picture
quality within a given amount of encoding.
[0205] The idea of this embodiment is also applicable to a software
based encoding being associated with unsteady transaction processes
and unsteady size of available storage in order to achieve an
efficient encoding in the maximum use of available resources. This
reduces a throughput with a less significant frame, thereby
accelerating a general processing speed.
Embodiment 10
[0206] In an encoder where a moving picture sequence including two
or more video objects is predicted and encoded by using two or more
memories, as shown in FIG. 11, a parameter representing a gross
amount of encoding available for the sequence, an available amount
of encoding of a video object at each time instance, an amount of a
change in a video object at each particular time instance (e.g.,
the appearance/disappearance of an object), a level of
significance/attention of a video object in a particular scene, or
the feature parameter or significance of a prediction picture
segment discussed in the eighth and ninth embodiments may be
detected previously. The values of these parameters may be utilized
for predicting a video object at each particular time instance.
Alternatively, these values may be utilized as judging tools for
selecting a reference memory area.
[0207] For example, as shown in FIG. 18, significance detectors 97a
through 97n responsive, respectively, to objects 1 through n may be
provided for detecting a parameter representing an amount of a
change in an object at each time instance, or the
appearance/disappearance of an object. In addition, the
significance of an object at each time instance is determined in
consideration of the occupational proportion of a buffer 72x for
storing the encoded data of all the objects and the occupational
proportion of virtual buffers 72a through 72n for the respective
objects. When a new type of segment appears as a result of an
object having another object overlapped in part, for example, then
the following control may be applied to this type of segment
because this type of segment has a great influence on prediction
efficiency thereafter. A higher significance may be assigned to
this type of segment to obtain an encoded picture in high quality
even without enough space available for storage in the
corresponding virtual buffer of the object. Significance detected
in the significance detectors 97a through 97n is transferred to
object 1 through n encoders 98a through 98n, where a full use of
the prediction methods and the reference memory areas available is
allowed to a more significant object so as to improve prediction
efficiency to the maximum, whereas a limited use of the prediction
methods and the reference memory areas is assigned to a less
significant object so as to simplify the encoding, thereby reducing
the throughput. Further, with an encoder for encoding objects
decomposed from a frame through a real time separation, when a
considerable amount of a change occurs in the contents of the
object due to the appearance of a new object or the disappearance
of an existing object, the object may be subject to intra-frame
coding alone with no prediction involved. A more sophisticated
quality control may be achieved on a prediction segment basis of an
object in a joint use of the object 1 through n encoders 98a
through 98n and the segment-significance detector 95 discussed in
the eighth embodiment. Through this control, a less significant
object is reduced in quality to some extent and a more significant
object manages to maintain its quality through a sophisticated
prediction, thereby improving an overall quality within a given
amount of encoding.
Embodiment 11
[0208] An alternative encoder may be provided with a prediction
information encoder 91 for allocating a code (encoding) to
prediction relating encoding information (e.g., a reference memory
number) as shown in FIG. 19.
[0209] In the encoder where a moving picture sequence or a video
object is predicted and encoded by using the memories a, b, c, the
memories may be ranked based upon frequency in use for prediction
with ranks being updated dynamically during an encoding operation.
Consequently, a code allocation is performed to the prediction
relating encoding information (e.g., a reference memory number)
based upon the ranks of the respective memories used for prediction
in the prediction information encoder 91.
[0210] For example, in the encoder of FIG. 19, the memory update
unit 15b may be provided with a counter 92 which counts the number
of times each of the memories a, b, c is used for prediction, ranks
the memories a, b, c based upon the counted values, and outputs
resultant ranking information 90. This ranking may be performed on
a picture (VOP) basis at a particular time instance of a frame or a
video object, and alternatively, on a smaller unit basis of a
prediction picture segment (e.g., a macroblock, an arbitrary shaped
segment, and an arbitrary shaped block).
[0211] This shows how often the respective memories are used for
prediction. A memory in frequent use for prediction is the most
significant for prediction, and thus a high frequency in use for
reference corresponds to a high rank.
[0212] When encoding information on the frequency in use for
prediction of the memories on a prediction picture segment basis, a
memory in frequent use for reference (i.e., a high-rank memory)
is allocated a short code to enhance encoding efficiency.
[0213] In addition, if the motion parameter detected on a
prediction picture segment basis is allocated a code length in
response to the rank of a memory used for reference, then a shorter
code may be assigned to a motion parameter value generated
frequently, thereby achieving an efficient encoding of the
prediction information. This may be materialized with an
alternative encoder where the prediction information encoder 91 in
the variable-length encoder/multiplexer 19 receives the ranks of
the respective memories from the counter 92 in the memory update
unit 15b and encodes the prediction information using a
variable-length code based upon the ranking information 90.
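The ranking and code allocation may be sketched as follows. The truncated unary code used here is only one possible variable-length code and is chosen for illustration; the description does not specify the code table used by the prediction information encoder 91, and the tie-breaking rule is likewise an assumption.

```python
from collections import Counter

def rank_memories(usage_history, memories=("a", "b", "c")):
    """Counter 92 (sketch): rank memories by how often they were used for prediction.

    Ties are broken by memory name, an arbitrary rule assumed for determinism.
    """
    counts = Counter(usage_history)
    return sorted(memories, key=lambda m: (-counts[m], m))

def allocate_codes(ranking):
    """Assign a truncated unary code so that higher ranks get shorter codes."""
    last = len(ranking) - 1
    return {mem: "1" * rank + ("0" if rank < last else "")
            for rank, mem in enumerate(ranking)}

ranking = rank_memories(["b", "b", "a", "b", "c", "a"])
codes = allocate_codes(ranking)
```

Here memory-b, used most often, receives the one-bit code "0", while memories a and c receive "10" and "11". Since the ranks are updated dynamically, the decoder would have to maintain the same counts to reconstruct the same table.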
Embodiment 12
[0214] FIG. 24 shows the configuration of a picture decoder where
a digital picture encoded through compression is reproduced
through expansion according to another embodiment. In the figure,
the encoded bitstream 21, a variable-length decoder 119, the
quantized DCT coefficient 12, a quantization orthogonal transform
coefficient 12a, a quantization step 12b, a texture decoder 111, a
dequantizer 111a, an inverse orthogonal transformer 111b, a
decoding adder 190, a decoded picture 101, a display controller
191, the prediction mode 8, memory-b update information 1001,
memory-c update information 1002, the motion vector 4 (a motion
parameter), the prediction memory indication information 800, an
in-screen location 195 of a prediction picture segment, a motion
compensator 105, a memory-a 116, a memory-b 117, a memory-c 118, a
memory update unit 115, and a predicted picture 106 are shown. The
motion compensator 105 and the memory update unit 115 form a
prediction picture generation section 100a. The memories a, b, c
form a memory area 200a.
[0215] According to this embodiment, the memory-a is assumed to be
a frame memory designed to store a frame of picture data, the
memory-b is assumed to be a static sprite memory, and the memory-c
is assumed to be a dynamic sprite memory. The decoder of this
embodiment is assumed to receive the bitstream 21 of FIG. 22.
Although not shown in FIG. 22, the memory-b update information 1001
and the memory-c update information 1002 are assumed to be
transmitted in the bitstream. The memory-b update information 1001
is assumed to include an update indication for a full updating of
the static sprite memory and picture data for the full updating.
Similarly, the memory-c update information 1002 is assumed to
include an update indication for a full updating of the dynamic
sprite memory and picture data for the full updating.
[0216] The operation of the thus configured decoder is described
below. The variable-length decoder 119 analyzes the bitstream 21
and decomposes it into separate encoded data. The quantization
orthogonal transform coefficient 12a is transferred to the
dequantizer 111a to be dequantized by using the quantization step
12b. The dequantized result is subjected to inverse orthogonal
transformation in the inverse orthogonal transformer 111b to obtain
a decoded texture, which is transferred to the decoding adder 190.
The orthogonal transformation employed here is the same as that
employed in the encoding station, for example the Discrete Cosine
Transform (DCT).
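The texture decoding path described above may be sketched as follows. This is a minimal illustration only: the 8x8 block size, the uniform scalar dequantization by a single step, the orthonormal DCT scaling, and the function names dequantize and idct_2d are assumptions made for the example and are not taken from the embodiment.

```python
import math

N = 8  # block size assumed for illustration


def dequantize(levels, step):
    """Scale quantized coefficient levels back by a uniform quantization step."""
    return [[lv * step for lv in row] for row in levels]


def idct_2d(coeff):
    """Orthonormal 2-D inverse DCT of an N x N coefficient block."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            s = 0.0
            for v in range(N):
                for u in range(N):
                    s += (c(u) * c(v) * coeff[v][u]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[y][x] = s
    return out


# A block holding only a DC level decodes to a flat texture.
levels = [[0] * N for _ in range(N)]
levels[0][0] = 4
texture = idct_2d(dequantize(levels, step=2))
```

With the orthonormal scaling assumed here, the DC coefficient 8 spreads evenly over the block, so every sample of the decoded texture equals 1.0.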
[0217] The motion compensator 105 inputs the motion vector 4, the
prediction memory indication information 800, and information
indicating the in-screen location 195 of a prediction picture
segment included in the bitstream 21, all of which are decoded in
the variable-length decoder 119. The motion compensator 105 reads
out the appropriate predicted picture from the reference pictures stored in the
memories a, b, c based upon these three kinds of information. The
in-screen location 195 of a prediction picture segment can
alternatively be obtained by counting the number of macroblocks, rather than from
information included in the bitstream. The process of generating a
predicted picture will be discussed in a later section for
describing the operation of the motion compensator 105 in
detail.
[0218] Based upon the prediction mode 8, the decoding adder 190
outputs the output of the inverse orthogonal transformer 111b
directly as the decoded picture 101 for an intra-frame coded block
and, for an inter-frame coded block, adds the output of the inverse
orthogonal transformer 111b to the predicted picture 106 and
outputs the sum as the decoded picture 101. The decoded picture 101
is transferred to the display
controller 191 to be outputted to a display device and also
transferred to the memories a, b, c to be stored as a reference
picture for a later use in decoding. A memory writing operation is
controlled by the memory update unit 115 based upon the prediction
mode 8.
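The behavior of the decoding adder 190 may be sketched as follows. The function name decoding_adder, the boolean flag intra, and the representation of a block as a list of rows are assumptions made for the example. Clipping to a valid sample range is not described in this paragraph and is therefore omitted.

```python
def decoding_adder(residual, predicted, intra):
    """Return the decoded block for one prediction unit.

    residual  : output of the inverse orthogonal transformer (list of rows)
    predicted : predicted picture block of the same size (ignored when intra)
    intra     : True for an intra-frame coded block, False for inter-frame
    """
    if intra:
        # Intra-frame coded block: the decoded texture is the decoded picture.
        return [row[:] for row in residual]
    # Inter-frame coded block: add the prediction error to the predicted picture.
    return [[r + p for r, p in zip(res_row, pred_row)]
            for res_row, pred_row in zip(residual, predicted)]


decoded = decoding_adder([[1, -2], [0, 3]], [[100, 100], [50, 50]], intra=False)
```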
[0219] A predicted picture generation performed in the motion
compensator 105 in the prediction picture generation section 100a
is now discussed. According to this embodiment, the prediction
method of a picture is determined based upon the prediction memory
indication information 800. The decoder of this embodiment
generates a predicted picture using a reference picture through
predetermined coordinate transformation and interpolation based
upon the motion vector 4 and the prediction memory indication
information 800. Coordinate transform methods are assigned
previously to the respective memories to be used for prediction.
For example, the following approaches are possible similar to the
picture transform methods described in the third embodiment.
(1) The Memory-a Used for Prediction (with the Prediction Memory
Indication Information 800 Indicating the Use of the
Memory-a)
[0220] The coordinates of each pixel of a prediction segment are
translated based upon the motion vector and picture data at a
corresponding location in the memory-a is read out as the predicted
picture.
(2) The Memory-b Used for Prediction (with the Prediction Memory
Indication Information 800 Indicating the Use of the
Memory-b)
[0221] An affine transform expression is found based upon the
motion vector, the coordinates of each pixel of a prediction
segment are displaced based upon the transform expression, and
picture data at a corresponding location in the memory-b is read
out as a predicted picture.
(3) The Memory-c Used for Prediction (with the Prediction Memory
Indication Information 800 Indicating the Use of the
Memory-c)
[0222] A perspective transform expression is found based upon the
motion vector, the coordinates of each pixel of a prediction
segment are displaced based upon the transform expression, and
picture data at a corresponding location in the memory-c is read
out as a predicted picture.
[0223] FIG. 25 shows the configuration of the motion compensator
105 in detail. In the figure, a switch 161, a corresponding point
determiner 162 for the memory-a, a corresponding point determiner
163 for the memory-b, a corresponding point determiner 164 for the
memory-c, a memory read address generator 165, a switch 166, and an
interpolator 167 are shown. FIG. 26 is a flowchart illustrating the
operation of the motion compensator 105.
[0224] The operation of the motion compensator 105 of this
embodiment is described below with reference to FIGS. 25 and
26.
1) Determining a Corresponding Point
[0225] Initially, the corresponding point determiner for the
indicated memory is selected by the switch 161 based upon the
prediction memory indication information 800. The motion vector 4 is
then inputted to the selected corresponding point determiner. In this
step, a predicted picture location corresponding to each memory
is calculated, as explained below for each memory.
1-1) The Memory-a Indicated by the Prediction Memory Indication
Information 800 (Step S100)
[0226] A predicted picture location is calculated through parallel
translation based upon a motion vector (step S101). Specifically, a
predicted picture location (x',y') corresponding to a pixel at a
prediction picture segment location (x,y) is determined based upon
a motion vector (a, b) according to the following expression.
x'=x+a
y'=y+b
A determined predicted picture location is outputted to the memory
read address generator 165.
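The parallel translation of step S101 may be sketched as follows. The function name translate_point and the use of a tuple for the motion vector are assumptions made for the example. A motion vector with fractional components is assumed to be allowed, giving a real number precision location.

```python
def translate_point(x, y, motion_vector):
    """Map a segment pixel (x, y) to its predicted picture location (x', y')."""
    a, b = motion_vector
    return x + a, y + b


# A half-pixel horizontal displacement yields a real number precision location.
location = translate_point(10, 5, (1.5, -2))
```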
1-2) The Memory-b Indicated by the Prediction Memory Indication
Information 800 (Step S103)
[0227] An affine transform expression is determined based upon the
motion vector 4. Specifically, an affine parameter (a, b, c,
θ) of the following expression is determined by using the
motion vector of a vertex of a rectangular area enclosing a
prediction picture segment,
x'=a(cos θ)x+a(sin θ)y+b
y'=a(-sin θ)x+a(cos θ)y+c
thereby obtaining the predicted picture location (x',y')
corresponding to a pixel at the location (x,y) of a prediction
picture segment to be outputted to the memory read address
generator 165 (step S104).
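The affine mapping of step S104 may be sketched as follows, with the parameters given directly. How the parameters are derived from the vertex motion vectors is not detailed in this paragraph and is not modeled here. The function name affine_point is an assumption made for the example, and the angle is taken to be in radians.

```python
import math


def affine_point(x, y, a, b, c, theta):
    """Apply the affine expression: scale a, rotation theta, translation (b, c)."""
    xp = a * math.cos(theta) * x + a * math.sin(theta) * y + b
    yp = -a * math.sin(theta) * x + a * math.cos(theta) * y + c
    return xp, yp


# With unit scale and zero rotation the mapping reduces to a translation.
shifted = affine_point(3, 4, a=1.0, b=2.0, c=-1.0, theta=0.0)
```

With a = 1 and a rotation angle of zero, the expression reduces to the parallel translation used for the memory-a.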
1-3) The Memory-c Indicated by the Prediction Memory Indication
Information 800 (Step S106)
[0228] A perspective transform expression is determined based upon
a motion vector. Specifically, a perspective parameter (a, b, c, d,
e, f, g, h) of the following expression is determined by using the motion
vector of a vertex of a rectangular area enclosing a prediction
picture segment,
x'=(ax+by+c)/(gx+hy+1)
y'=(dx+ey+f)/(gx+hy+1)
thereby obtaining the predicted picture location (x',y')
corresponding to a pixel at the location (x,y) of a prediction
picture segment to be outputted to the memory read address
generator 165 (step S107).
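The perspective mapping of step S107 may be sketched as follows, using the eight coefficients that appear in the expression. The function name perspective_point and the handling of a zero denominator are assumptions made for the example. Derivation of the coefficients from the vertex motion vectors is not modeled.

```python
def perspective_point(x, y, a, b, c, d, e, f, g, h):
    """Apply the perspective expression to a segment pixel (x, y)."""
    w = g * x + h * y + 1
    if w == 0:
        # Assumed behavior: the expression is undefined for this pixel.
        raise ValueError("perspective denominator is zero")
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w


# With g = h = 0 the denominator is 1 and the mapping is purely linear.
same = perspective_point(2, 3, 1, 0, 0, 0, 1, 0, 0, 0)
# A non-zero g scales the location by the pixel-dependent denominator.
scaled = perspective_point(2, 0, 1, 0, 0, 0, 1, 0, 0.5, 0)
```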
2) Reading Out Data for Generating a Predicted Picture
[0229] Based upon the predicted picture location (x', y') outputted
from a selected corresponding point determiner, the memory read
address generator 165 generates a memory address for specifying the
location of picture data required for generating a predicted
picture in a reference picture stored in a memory, and reads out
the data for generating a predicted picture (steps S102, S105,
S108).
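One possible form of the address generation is sketched below. The raster-order memory layout (address = y * width + x), the clamping of locations to the picture boundary, and the function name read_addresses are all assumptions made for the example. The embodiment does not specify the address format.

```python
import math


def read_addresses(xp, yp, width, height):
    """Return linear addresses of the reference pixels needed for (xp, yp).

    An integer location needs one pixel; a real number precision location
    needs the surrounding integer pixels for later interpolation.
    """
    x0, y0 = math.floor(xp), math.floor(yp)
    xs = [x0] if xp == x0 else [x0, x0 + 1]
    ys = [y0] if yp == y0 else [y0, y0 + 1]

    def clamp(v, size):
        # Assumed boundary handling: repeat the edge pixel.
        return min(max(v, 0), size - 1)

    return [clamp(y, height) * width + clamp(x, width) for y in ys for x in xs]


single = read_addresses(2, 1, width=4, height=4)
neighbors = read_addresses(1.5, 0.5, width=4, height=4)
```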
3) Generating a Predicted Picture
[0230] Among the component pixels of a predicted picture, for a
pixel at an integer pixel location, the data for generating a
predicted picture is used directly as a component pixel of the
predicted picture, and for a pixel at a real number precision
pixel location, the data for generating a predicted picture is
subjected to interpolation in the interpolator 167 to generate an
interpolated pixel value (steps S109, S110, S111). FIG. 27
illustrates an interpolated pixel value generation. In FIG. 27,
(i_p, j_p) denotes an integer pixel location, (i', j')
denotes a real number precision pixel location, and w
denotes a weight.
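The interpolation may be sketched as follows. Bilinear weighting of the four surrounding integer pixels is assumed here as one common choice. The exact weighting is defined in the figure and is not reproduced in the text. The function name interpolate and the edge clamping are likewise assumptions made for the example.

```python
import math


def interpolate(ref, xp, yp):
    """Return the predicted pixel value at location (xp, yp) in picture ref."""
    height, width = len(ref), len(ref[0])
    x0, y0 = math.floor(xp), math.floor(yp)
    wx, wy = xp - x0, yp - y0  # fractional weights in [0, 1)

    def px(x, y):
        # Assumed boundary handling: repeat the edge pixel.
        return ref[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    if wx == 0 and wy == 0:
        # Integer pixel location: the stored data is used directly.
        return px(x0, y0)
    top = (1 - wx) * px(x0, y0) + wx * px(x0 + 1, y0)
    bottom = (1 - wx) * px(x0, y0 + 1) + wx * px(x0 + 1, y0 + 1)
    return (1 - wy) * top + wy * bottom


ref = [[10, 20],
       [30, 40]]
center = interpolate(ref, 0.5, 0.5)
```

At the center of the four pixels each weight is 0.25, so the interpolated value is their mean, 25.0.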
4) Updating a Memory (a Reference Picture)
[0231] FIG. 28 shows a flowchart illustrating a control operation
of the memory update unit 115. The memory update unit 115 controls
an updating of the respective memories on a readout unit basis
(e.g., a macroblock) of a predicted picture based upon the
prediction mode 8 (or the prediction memory indication information
800). With the memory-a used for prediction (step S112), the
contents of the memory-a and the memory-c are updated regularly
with the decoded picture 101 (step S113). With the memory-b used
for prediction (step S114), the reference picture of the memory-b
is not updated on a readout unit basis of a predicted picture
because of the memory-b being a static sprite memory, whereas the
contents of the memory-a and the memory-c are updated regularly
with the decoded picture 101 (step S115). When the update
indication is received in the memory-b update information 1001, the
memory update unit 115 updates the full content of the memory-b with
the received picture data included in the memory-b update
information 1001 (step S116). With the memory-c used for prediction
(step S117), the contents of the memory-a and the memory-c are
updated regularly by using the decoded picture 101 (step S118). When
the update indication is received in the memory-c update information
1002, the memory update unit 115 updates the content of the memory-c
with the received picture data included in the memory-c update
information 1002 (step S119).
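The control flow of the memory update unit 115 may be sketched as follows. Each memory is modeled as a dictionary from a readout-unit location to picture data. That model, the function name update_memories, and the parameters b_update and c_update (standing for received full-update picture data, or None when no update indication is present) are assumptions made for the example. The full updates are placed under the branch for the corresponding memory, following the step order described above.

```python
def update_memories(memories, used, decoded_block, location,
                    b_update=None, c_update=None):
    """Update reference memories after decoding one readout unit.

    memories : {"a": {...}, "b": {...}, "c": {...}}, location -> picture data
    used     : "a", "b" or "c", the memory used for prediction
    """
    # Steps S113, S115, S118: the frame memory and the dynamic sprite memory
    # are updated regularly with the decoded picture.
    for name in ("a", "c"):
        memories[name][location] = decoded_block
    # Step S116: the static sprite memory changes only on a full update.
    if used == "b" and b_update is not None:
        memories["b"] = dict(b_update)
    # Step S119: full update of the dynamic sprite memory on indication.
    if used == "c" and c_update is not None:
        memories["c"] = dict(c_update)
    return memories


mem = {"a": {}, "b": {(0, 0): "sprite"}, "c": {}}
update_memories(mem, "a", "blk1", (1, 2))
```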
[0232] The use of the three memories a, b, c of this embodiment may
be replaced by the use of two memories thereof, for example, with
the memories a and b, that is, a frame memory and a static sprite
memory. Alternatively, the memories a and c, that is, a frame
memory and a dynamic sprite memory, may be used.
[0233] As aforementioned, according to the decoder of this
embodiment, the bitstream 21 encoded through an efficient
prediction using the various kinds of motion parameters in response
to the motion of a picture can be decoded. In addition, the decoder
supports arbitrary updating of the contents of
a reference picture at a timing determined in the encoding station,
thereby achieving a more adaptive decoding in response to the
characteristic of a picture.
[0234] According to this embodiment, if the bitstream includes a
prediction error signal encoded by a method other than
orthogonal transform encoding, the same effect may be obtained by
replacing only the component for decoding the prediction error
signal, leaving the motion compensator and the memory update unit
unchanged.
[0235] Further, this embodiment may be applied not only to a
decoder for decoding data on a fixed-size block basis, e.g., for
decoding a normal television signal on a frame basis, but also to a
decoder for decoding an arbitrary shaped video object (e.g., a
Video Object Plane disclosed in ISO/IEC JTC1/SC29/WG11/N1902) as a
unit without limiting a prediction segment to a fixed-size
block.
INDUSTRIAL FEASIBILITY
[0236] As discussed above, the memory areas provided for storing
reference pictures according to this embodiment enable the
adaptive use of the memories for storing data based upon the
characteristic of the moving picture sequence. In addition, the
contents of one or more of the memory areas can be updated at an
arbitrary timing, so that the content of a time-unvarying picture,
such as a background picture, is controlled to be updated on a
longer-term basis, and the contents of a locally changing picture
segment are controlled to be updated on a regular or sequential
basis. This achieves an efficient prediction by reflecting the past
record of the moving picture sequence.
[0237] Further, the transform parameter value ranges are assigned
to the memory areas for making the respective memory areas
effective, and the memory areas are switched to be used for
prediction among them based upon the value of the transform
parameter of a prediction picture segment, thereby achieving an
efficient prediction in response to the magnitude of a local/global
motion of the moving picture sequence. At the same time, the motion
parameters to be encoded on a prediction picture segment basis can
be encoded efficiently within the effective motion parameter value
ranges of the reference memory areas.
[0238] Further, the transform methods that are effective for the
respective memory areas are assigned to those memory areas,
and the memories are switched to be used for prediction among them
in response to the type of the transform parameter of a prediction
picture segment, thereby achieving an efficient prediction in
response to the complexity of a local/global motion of the moving
picture sequence. At the same time, the transform method can be
selected adaptively in response to the characteristic of a
prediction picture segment, thereby achieving an efficient encoding
of the motion parameter.
* * * * *