U.S. patent application number 08/949875 was filed with the patent office on 2001-11-15 for recordable dvd disk with video compression software included in a read-only sector.
This patent application is currently assigned to LSI Logic Corporation. Invention is credited to DIERKE, GREGG.
Application Number | 20010041054 08/949875 |
Family ID | 25489614 |
Filed Date | 2001-11-15 |
United States Patent
Application |
20010041054 |
Kind Code |
A1 |
DIERKE, GREGG |
November 15, 2001 |
RECORDABLE DVD DISK WITH VIDEO COMPRESSION SOFTWARE INCLUDED IN A
READ-ONLY SECTOR
Abstract
A recordable DVD disk which includes a read-only sector for
specifying a video encoding algorithm to be used for compressing
video programs that are to be recorded in the recordable region of
the recordable DVD disk. Such disks may be used in a digital video
recording system having a programmable video encoder. In one
embodiment, the system accepts the recordable DVD disks having a
read-only sector for storing customized video encoding algorithms
and programs the programmable video encoder with the customized
video encoding algorithms prior to encoding and recording a video
signal on the disk. By designing the video encoding algorithms to
optimize one or more of a number of desirable attributes, the DVD
media vendors can then create "classes" of recordable DVD disks,
e.g. high capacity, high quality, high speed, high image detail,
high color resolution, variable frame rate, etc. One programmable
video encoder for this embodiment would include an instruction
memory for storing the customized video algorithms, a video buffer
for buffering the video signal, and a CPU which encodes the video
signal according to the customized video algorithms.
Inventors: |
DIERKE, GREGG; (SAN JOSE,
CA) |
Correspondence
Address: |
LSI Logic Corporation
1551 McCarthy Blvd.
M/S: D-106 Patent Department
Milpitas
CA
95035
US
|
Assignee: |
LSI Logic Corporation
|
Family ID: |
25489614 |
Appl. No.: |
08/949875 |
Filed: |
October 14, 1997 |
Current U.S.
Class: |
386/248 ;
375/E7.211; 386/E9.013 |
Current CPC
Class: |
H04N 5/85 20130101; H04N
19/61 20141101; H04N 9/8205 20130101; H04N 9/8042 20130101 |
Class at
Publication: |
386/95 ; 386/111;
386/125 |
International
Class: |
H04N 005/76; H04N
005/781 |
Claims
What is claimed is:
1. A digital information storage medium for use in a programmable
digital video recording system having a read head, a video encoder
operatively coupled to the read head to receive a video encoding
algorithm for encoding a video signal to produce an encoded digital
signal, and a record head operatively coupled to the video encoder
to receive the encoded digital signal, wherein the medium includes:
a read-only sector containing said video encoding algorithm and
configured to communicate said video encoding algorithm when
engaged in an operative relationship with the read head; and a
recordable sector configured to receive and store the encoded
digital signal when engaged in an operative relationship with the
record head.
2. The digital information storage medium of claim 1, wherein the
medium comprises a plurality of tracks on a disc, wherein the
read-only sector comprises a subset of the plurality of tracks, and
wherein the recordable sector comprises a second disjoint subset of
the plurality of tracks.
3. The digital information storage medium of claim 1, wherein the
read-only sector also contains an audio encoding algorithm and is
configured to communicate said audio encoding algorithm when
engaged in an operative relationship with said read head.
4. A recordable DVD disk for storing video bitstreams having an
optimized attribute, wherein the recordable DVD disk includes: an
algorithm sector containing an MPEG-2 video encoding algorithm for
encoding video signals into video bitstreams having said optimized
attribute; and a recordable sector configurable to record said
video bitstreams.
5. The recordable DVD disk of claim 4, wherein said optimized
attribute is image quality optimized relative to a predetermined
disk capacity.
6. The recordable DVD disk of claim 4, wherein said optimized
attribute is disk capacity optimized relative to a predetermined
image quality.
7. The recordable DVD disk of claim 4, wherein the algorithm sector
is read-only.
8. A recordable DVD disk for storing audio bitstreams having an
optimized attribute, wherein the recordable DVD disk includes: an
algorithm sector containing an audio encoding algorithm for
encoding audio signals into audio bitstreams having said optimized
attribute; and a recordable sector configurable to record said
audio bitstreams.
9. The recordable DVD disk of claim 8, wherein said optimized
attribute is sound quality optimized relative to a predetermined
disk capacity.
10. The recordable DVD disk of claim 8, wherein said optimized
attribute is disk capacity optimized relative to a predetermined
audio quality.
11. The recordable DVD disk of claim 8, wherein the algorithm
sector is read-only.
12. A method for recording a video program on a recordable DVD
disk, wherein the method comprises: designing a MPEG-2 video
encoding algorithm for encoding video signals into video bitstreams
having an optimized attribute; creating a recordable DVD disk with
a read-only sector and a recordable sector, wherein the read only
sector contains the video encoding algorithm, and wherein the
recordable sector is configurable to store said video bitstreams;
and placing the recordable DVD disk in a programmable video
recording system.
13. A digital information storage device which comprises a base
portion having an algorithm region containing a video compression
algorithm and a recordable data region for storing a video
bitstream created by the video compression algorithm.
14. The digital information storage device of claim 13, wherein the
algorithm region is a read-only region.
15. The digital information storage device of claim 13, wherein the
base portion is a DVD disk substrate.
16. A digital information storage device which comprises a base
portion having an algorithm region containing an audio compression
algorithm and a recordable data region for storing a video
bitstream created by the audio compression algorithm.
17. The digital information storage device of claim 16, wherein the
algorithm region is a read-only region.
18. The digital information storage device of claim 16, wherein the
base portion is a DVD disk substrate.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] This invention relates to the field of video compression
systems, and in particular to programmable DVD video encoders.
[0003] 2. Description of the Related Art
[0004] A video program signal is converted to a digital format, and
then compressed and encoded in accordance with one of several known
compression algorithms or methodologies. This compressed digital
system signal, or bitstream, which includes a video portion, an
audio portion, and other informational portions, is then transmitted
to a receiver. Transmission may be over existing television
channels, cable television channels, satellite communications
channels, and the like. A decoder is then typically employed at the
receiver to decompress and decode the received system signal in
accordance with the same compression algorithm used to encode the
signal. The decoded video information may then be output to a
display device, such as a television (TV) monitor.
[0005] Video compression and encoding is typically performed by a
video encoder. The video encoder normally produces a compressed
digital system signal that conforms to a recognized standard or
specification agreed to among the senders and receivers of digital
video signals. One such standard is DVD. It includes audio and
video compression technologies, as well as provisions for other
information streams. The video compression standard adopted by DVD
was developed by the Moving Picture Experts Group (MPEG). The MPEG
standard concerns high-quality coding of possibly interlaced video,
including high definition television (HDTV). A wide range of
applications, bit rates, resolutions, signal qualities and services
are addressed, including all forms of digital storage media, TV
broadcasting and communications.
[0006] The MPEG standard, although it details the structure and
syntax of the compressed bitstreams, does not provide complete
system specifications. A nearly infinite number of bitstreams can
be generated to represent an image sequence while conforming to the
MPEG standard. Design considerations such as image preprocessing,
motion estimation methods, the order of compressed frame types,
bit-rate management, implementation complexity, coded image size,
color space sampling, and field interleaving, all lead to different
representations of the same image. It should be recognized that the
different representations may have varying degrees of quality, both
in terms of compression and accuracy, but they all conform to the
MPEG standard. A somewhat lesser degree of freedom nevertheless
exists in the way a bitstream is decoded. For example,
note that some video degradation might be an acceptable tradeoff
for reduced implementation complexity, or that the coded image
characteristics (size, frame rate) might be incompatible with the
display device and require some adjustments (scaling, pulldown).
The MPEG standard carefully avoids addressing issues such as these,
preferring instead to allow industries to "customize" encoder and
decoder implementations to their best advantage.
[0007] Currently, the customization of the encoder and decoder
implementations is performed by the system designers primarily
using a combination of hardware and proprietary microcode, and the
displayed image quality is thereby determined. One drawback of this
approach is that the digital media vendors cannot easily "improve"
image quality to distinguish themselves from their competitors, nor
can they easily upgrade their systems to incorporate the latest
compression advances. It is desirable to provide a method for DVD
media vendors to participate in the encoder and decoder
customization process to produce media classes that provide for
optimized trade-offs (e.g. capacity vs. image quality), and in so
doing provide better performances for specific applications.
SUMMARY OF THE INVENTION
[0008] The problems outlined above are in large part solved by a
recordable DVD disk which includes a read-only sector for
specifying a video encoding algorithm to be used for compressing
video programs that are to be recorded in the recordable region of
the recordable DVD disk. Such disks may be used in a digital video
recording system having a programmable video encoder. In one
embodiment, the system accepts the recordable DVD disks having a
read-only sector for storing customized video encoding algorithms
and programs the programmable video encoder with the customized
video encoding algorithms prior to encoding and recording a video
signal on the disk. By designing the video encoding algorithms to
optimize one or more of a number of desirable attributes, the DVD
media vendors can then create "classes" of recordable DVD disks,
e.g. high capacity, high quality, high speed, high image detail,
high color resolution, variable frame rate, etc. One programmable
video encoder for this embodiment would include an instruction
memory for storing the customized video algorithms, a video buffer
for buffering the video signal, and a CPU which encodes the video
signal according to the customized video algorithms.
[0009] Broadly speaking, the present invention contemplates a
recordable digital information storage medium for use in a
programmable digital video recording system having a read head, a
video encoder, and a record head. The video encoder is operatively
coupled to the read head to receive a video encoding algorithm for
encoding a video signal. The record head is operatively coupled to
the video encoder to receive the encoded digital signal. The
storage medium includes a read-only sector and a recordable sector.
The read-only sector stores the video encoding algorithm and
communicates the video encoding algorithm to the read head when the
two are engaged in an operative relationship. The recordable sector
receives and stores the encoded digital signal from the record head
when they are engaged in an operative relationship.
[0010] The present invention further contemplates a recordable DVD
disk for storing video bitstreams having an optimized attribute.
The recordable DVD disk includes an algorithm sector and a
recordable sector. The algorithm sector stores an MPEG-2 video
encoding algorithm for encoding video signals into video bitstreams
having the optimized attribute. The recordable sector records the
video bitstreams. The optimized attribute may be image quality
optimized relative to a predetermined disk capacity and/or record
time.
[0011] The present invention still further contemplates a method
for recording a video program on a recordable DVD disk. The method
comprises: (i) designing a MPEG-2 video encoding algorithm for
encoding video signals into video bitstreams having an optimized
attribute; (ii) creating a recordable DVD disk with a read-only
sector and a recordable sector; and (iii) placing the recordable
DVD disk in a programmable video recording system. The read only
sector stores the video encoding algorithm, and the recordable
sector stores the video bitstreams.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] Other objects and advantages of the invention will become
apparent upon reading the following detailed description and upon
reference to the accompanying drawings in which:
[0013] FIG. 1 is a block diagram of a programmable digital video
record and playback system; and
[0014] FIG. 2 is a block diagram of a programmable digital video
encoder.
[0015] While the invention is susceptible to various modifications
and alternative forms, specific embodiments thereof are shown by
way of example in the drawings and will herein be described in
detail. It should be understood, however, that the drawings and
detailed description thereto are not intended to limit the
invention to the particular form disclosed, but on the contrary,
the intention is to cover all modifications, equivalents and
alternatives falling within the spirit and scope of the present
invention as defined by the appended claims.
DETAILED DESCRIPTION OF THE INVENTION
[0016] Turning now to the figures, FIG. 1 provides a block diagram
of a programmable digital video record and playback system which
can accept recordable DVD disks having a read-only sector with a
video encoding algorithm. The system is configured to read the
video encoding algorithm and execute the algorithm when encoding
video signals for storage on the recordable DVD disks. This system
allows for the use of various classes of recordable DVD disks in
which each class is targeted for recording digital video signals of
a given type (e.g. long play, high resolution, superior color,
etc.).
[0017] The system receives analog video and audio signals which are
converted to digital signals by analog-to-digital (A/D) converters
102 and 104. The digital signals are encoded by a DVD encoder 106
which uses synchronous dynamic random access memory (SDRAM) 108 as
a frame store buffer. The encoded digital signals are processed by
an error correction encoder 110 then converted to a modulated
digital signal by modulator 112. The modulated digital signal is
coupled to a digital signal processor (DSP) 114 and from there to a
power amplifier 116. Amplified signals are coupled to drive motors
118 to spin a recordable DVD disk 120 and a record head 122 to
store the modulated digital signal on the recordable DVD disk 120.
Stored data can be read from the recordable DVD disk 120 by read
head 124 which sends a read signal to DSP 114 for filtering. The
filtered signal is coupled to channel control buffer 126 for rate
control, then demodulated by demodulator 128. An error correction
code decoder 130 converts the demodulated signal into either a
video encoding algorithm for DVD encoder 106, or an encoded video
signal which is then decoded by DVD decoder 134 and converted to
analog audio and video signals by digital-to-analog (D/A)
converters 136 and 138. A microcontroller 132 coordinates the
operations of the system components and loads the video encoding
algorithm into the DVD encoder 106.
[0018] A/D converter 102 operates to convert an analog video signal
into a digital video signal. This is likely to be the case for an
analog video camera or videocassette playback recording. Digital
cameras and computer generated graphics produce digital video
signals. If a digital video signal is being received, the A/D
converter 102 may be bypassed. Similarly, A/D converter 104
operates to convert an analog audio signal into a digital audio
signal. Microphones and analog TV broadcasts are sources of analog
audio signals. Examples of digital audio sources are compact disks
and digital audio cassettes. For digital audio sources, the A/D
converter 104 may be bypassed.
[0019] DVD encoder 106 operates to encode the digital audio and
video signals to produce an encoded digital signal. DVD encoder 106
is preferably a programmable encoder able to execute software video
encoding algorithms. The operation and structure of the DVD encoder
106 is discussed further below.
[0020] Error correction encoder 110 and modulator 112 operate to
provide channel coding and modulation for the encoded digital
signal. Error correction encoder 110 may be a Reed-Solomon block
code encoder, which provides protection against errors in the read
signal. The modulator 112 converts the error correction coded
output into a modulated signal suitable for recording on DVD disk
120.
[0021] DSP 114 serves multiple functions. It provides filtering
operations for write and read signals, and it acts as a controller
for the read/write components of the system. The modulated signal
provided by modulator 112 provides an "ideal" which the read signal
should approximate. In order to most closely approximate this
ideal, certain nonlinear characteristics of the recording process
must often be compensated. The DSP 114 may accomplish this
compensation by pre-processing the modulated signal and/or
post-processing the read signal. The DSP 114 controls the drive
motors 118 and the record head 122 via the power amplifier 116 to
record the modulated signal on the DVD disk 120. The DSP 114 also
controls the drive motors 118 and uses the read head 124 to scan
the DVD disk 120 and produce a read signal.
[0022] The channel control buffer 126 provides buffering of the
read signal, while demodulator 128 demodulates the read signal and
error correction code decoder 130 decodes the demodulated signal.
After decoding the demodulated signal, the error correction decoder
130 forwards the decoded signal in response to the microcontroller
132. If the microcontroller 132 indicates that the output is a
video encoding algorithm, the output is forwarded to the
microcontroller 132 for use in the DVD encoder 106. Otherwise the
output is assumed to be an encoded digital signal and is forwarded
to DVD decoder 134.
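By way of illustration only, the routing decision described in
paragraph [0022] might be sketched as follows. The names OutputKind,
route_decoded_output, and the two callback parameters are
hypothetical and do not appear in the application; the sketch shows
only the two-way branch, not the hardware.

```python
from enum import Enum, auto


class OutputKind(Enum):
    """What the microcontroller says the ECC decoder output holds (hypothetical)."""
    ENCODING_ALGORITHM = auto()
    ENCODED_SIGNAL = auto()


def route_decoded_output(kind, payload, load_into_encoder, send_to_decoder):
    """Forward ECC-decoded data toward the encoder or toward the DVD decoder."""
    if kind is OutputKind.ENCODING_ALGORITHM:
        # Algorithm data goes to the microcontroller for loading into the encoder.
        load_into_encoder(payload)
        return "encoder"
    # Anything else is assumed to be an encoded digital signal.
    send_to_decoder(payload)
    return "decoder"
```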
[0023] DVD decoder 134 operates to decode the encoded digital
signal to produce digital audio and video signals. The operation
and structure of DVD decoder 134 are discussed further below. The
digital audio signal may be converted to an analog audio signal by
D/A converter 136, and the digital video signal may be converted to
an analog video signal by D/A converter 138. One specific instance
of D/A converter 138 is an NTSC (National Television System
Committee) standard or a PAL (Phase Alternation Line) standard
encoder which converts the digital video signal into a raster scan
signal for display on a monitor.
[0024] Turning now to FIG. 2, a block diagram of a programmable DVD
video encoder 106 is shown. The digital audio signal may be
compressed by one of three audio compression blocks: a MUSICAM
encoder 202, a Dolby digital (AC3) encoder 204, or a Linear PCM
encoder 206. A special purpose RISC CPU 210 accesses the
appropriate audio encoder hardware to choose the audio compression
method. The RISC CPU 210 operates on software stored in instruction
memory 212, which can be loaded by microcontroller 132 via
instruction interface 211. A video buffer 214 buffers the digital
video signal while the RISC CPU 210 operates to convert the digital
video signal into a compressed video signal to be combined with the
compressed audio signal to form the encoded digital signal
bitstream. A bitstream buffer 216 is used to buffer the encoded
digital signal until it can be processed by the error correction
encoder 110. The video compression process requires the use of
frame buffers, and RISC CPU 210 uses SDRAM 108 via SDRAM interface
218 for this purpose.
[0025] In one embodiment, RISC CPU 210 relies on supporting video
encoding hardware 208 to perform the low-level steps of the video
encoding algorithm such as motion compensation and discrete cosine
transform of macroblocks. The video encoding hardware 208 may be
programmable via configuration registers to set the desired method
of operation. The audio encoders 202, 204, and 206 may also be
programmable and used in a supporting role. In this case, the RISC
CPU 210 can also provide the high-level aspects of the audio
compression algorithm. The steps of forming the bitstream syntax
and interleaving the audio and video programs are performed by the
RISC CPU 210.
[0026] RISC CPU 210, perhaps along with the supporting audio and
video encoding hardware 208, compresses the audio and video signals
using a software algorithm stored in instruction memory 212. The
software algorithm may come from a variety of sources, including a
system EEPROM. In particular, the software algorithm may be
provided in a read-only sector on a recordable DVD disk 120. In one
embodiment, the insertion of a recordable DVD disk 120 initiates a
load sequence in which the software algorithm is read from the
read-only sector of the recordable DVD disk 120 and stored in
instruction memory 212. A subsequent initiation of a record
sequence causes the system to use the loaded software algorithm for
compressing any audio/video program it records on the recordable
DVD disk 120.
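The load and record sequences of paragraph [0026] might be modeled,
purely as a sketch, as shown below. The class RecordingSystem, the
dictionary keys, and the use of a Python callable to stand in for
the software algorithm are all assumptions made for illustration.

```python
class RecordingSystem:
    """Toy model of the load-then-record sequence (all names hypothetical)."""

    def __init__(self, default_algorithm):
        # Stand-in for an algorithm supplied by, e.g., a system EEPROM.
        self.instruction_memory = default_algorithm

    def on_disk_inserted(self, disk):
        """Load sequence: copy the algorithm from the read-only sector, if present."""
        algorithm = disk.get("read_only_sector")
        if algorithm is not None:
            self.instruction_memory = algorithm

    def record(self, disk, program):
        """Record sequence: encode with whichever algorithm is currently loaded."""
        encoded = self.instruction_memory(program)
        disk.setdefault("recordable_sector", []).append(encoded)
        return encoded
```

In this sketch a disk without a read-only sector simply leaves the
previously loaded algorithm in place; the application does not state
what a real system would do in that case.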
[0027] To highlight customizable aspects of the video compression
algorithm which the RISC CPU 210 executes, a general discussion of
the steps which might be performed by a video MPEG encoder is now
provided.
[0028] In order to compress a video signal, it is typically
necessary to sample the analog data and represent this data with
digital values of luminance and color difference. Video input is
typically sampled at 4:2:2, where a red color difference signal
(Cr) and a blue color difference signal (Cb) are sub-sampled 2-to-1
with respect to a luminance (Y) signal. The MPEG standard suggests
that the luminance component Y of a video signal may be sampled
with respect to the color difference signals Cr, Cb by a ratio of
4-to-1. That is, for every four samples of the luminance component
Y, there is one sub-sample each of the color difference components
Cr and Cb. A 4-to-1 sampling ratio is generally considered
acceptable because the human eye is much more sensitive to
luminance (brightness) components than to color components. For end
users, video sub-sampling typically is performed 2-to-1 in both the
vertical and horizontal directions (known as 4:2:0). However, the
MPEG standard allows the use of other sampling ratios, and
commercial studio-quality systems will often not vertically
sub-sample the color difference components at all, i.e. maintain
2-to-1 horizontal only sampling ratios. A 3-to-2 sampling ratio has
also been discussed for use in MPEG video compression.
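As a rough illustration of the sub-sampling ratios discussed in
paragraph [0028], the sketch below reduces a full-resolution color
difference plane by averaging neighboring samples. Averaging is an
assumption chosen for simplicity (practical systems typically use
filters), and the function names are hypothetical.

```python
def subsample_chroma(plane, horizontal=2, vertical=1):
    """Average each horizontal-by-vertical group of chroma samples into one."""
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, vertical):
        row = []
        for x in range(0, width, horizontal):
            group = [
                plane[yy][xx]
                for yy in range(y, min(y + vertical, height))
                for xx in range(x, min(x + horizontal, width))
            ]
            row.append(sum(group) // len(group))
        out.append(row)
    return out


def to_422(plane):
    """2-to-1 horizontal sub-sampling only."""
    return subsample_chroma(plane, horizontal=2, vertical=1)


def to_420(plane):
    """2-to-1 sub-sampling in both directions (one chroma sample per four luma)."""
    return subsample_chroma(plane, horizontal=2, vertical=2)
```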
[0029] Once the video signal is sampled, it is typically formatted
into a non-interlaced signal that contains all of the picture
content. More particularly, the video signal includes a plurality
of pictures or frames, where each frame includes a plurality of
horizontal scan lines for display. An interlaced signal, in
contrast, is one that contains only part of the picture content for
each complete display scan. In an interlaced signal, each frame is
divided into two fields. The two fields are often referred to as
the even and odd or the top and bottom fields. Each field spans the
length of the frame, but only includes every other scan line. The
purpose for such field division is that most TVs today display the
video information in interlaced format, by displaying one field
first, such as the entire top field, then displaying the entire
bottom field. Note that although the non-interlaced frame format is
common, interlaced field encoding is permitted under the MPEG-2
standard. It is possible that the interlaced field encoding may be
more suitable for use with some low-cost video cameras.
[0030] After a video signal is sampled and formatted, the encoder
may process it further by converting it to a different resolution
in accordance with the image area to be displayed. A wide variety
of picture resolutions are available, but a higher encoded
resolution typically implies a higher bit rate and hence a smaller
disk capacity.
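The trade-off in paragraph [0030] between bit rate and recording
capacity is simple arithmetic: recording time equals capacity in
bits divided by the total bit rate. The sketch below works this out;
the capacity figure and the bit rates used in the usage note are
assumed round numbers for illustration and are not taken from the
application.

```python
ASSUMED_CAPACITY_BYTES = 4_700_000_000  # illustrative round number (assumption)


def recording_minutes(capacity_bytes, video_bits_per_second, audio_bits_per_second=0):
    """Minutes of program that fit on a disk at a constant total bit rate."""
    total_rate = video_bits_per_second + audio_bits_per_second
    if total_rate <= 0:
        raise ValueError("total bit rate must be positive")
    return capacity_bytes * 8 / total_rate / 60
```

Under these assumptions, a 5 Mbit/s stream yields roughly 125
minutes, and halving the bit rate doubles the recording time.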
[0031] The video encoder must next determine how to encode each
picture. A picture may be considered as corresponding to a single
frame of motion video, or to a single frame of a movie film.
Different encoding schemes may be employed for each picture. The
most prevalent picture coding types are: I-pictures (intra-coded
pictures) which are coded without reference to any other pictures
and are often referred to as anchor frames; P-pictures
(predictive-coded pictures) which are coded using
motion-compensated prediction from the past I- or P-reference
picture, and may also be considered anchor frames; and B-pictures
(bidirectionally predictive-coded pictures) which are coded using
motion compensation from a previous and a future I- or P-picture.
These picture types will be referred to as I, P or B frames.
[0032] A typical coding scheme may employ a mixture of I, P, and B
frames. Typically, an I frame may occur every half a second, with
two B frames inserted between each pair of I or P frames. I frames
provide random access points within the coded sequence of pictures
where decoding can begin, but are coded with only a moderate degree
of compression. P frames are coded more efficiently using motion
compensated prediction from a past I or P frame and are generally
used as a reference for further prediction. B frames provide the
highest degree of compression but require both past and future
reference pictures for motion compensation. B frames are not used
as references for prediction. The organization of the three picture
types in a particular video sequence is very flexible. A fourth
picture type is defined by the MPEG standard as a D-picture, or
DC-picture, which is provided to allow a simple, but limited
quality, Fast-Forward mode. Note that the mixture of frames is
customizable, and that some higher compression may be attained by
reducing the number of random-entry points in the bitstream.
Conversely, better slow and fast, forward and reverse motion
effects may be provided by sacrificing some compression and
increasing the number of random entry points in the bitstream.
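The frame mixture described in paragraph [0032] can be illustrated
with a short sketch that generates the picture types for one group
of pictures in display order. The function name and parameters are
hypothetical, and the default length of 15 assumes a frame rate of
30 frames per second so that an I frame occurs every half second.

```python
def gop_pattern(gop_length=15, b_frames=2):
    """Return display-order picture types for one group of pictures."""
    pattern = []
    for i in range(gop_length):
        if i == 0:
            pattern.append("I")  # random access point
        elif i % (b_frames + 1) == 0:
            pattern.append("P")  # anchor predicted from the previous anchor
        else:
            pattern.append("B")  # predicted from surrounding anchors
    return "".join(pattern)
```

A longer gop_length yields fewer random entry points and higher
compression; setting b_frames to zero yields a stream of I and P
frames only.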
[0033] Once the picture types have been defined, the encoder may
estimate motion vectors for each 16 by 16 macroblock in a picture.
A macroblock (MB) is the basic coding unit for the MPEG standard. A
macroblock consists of a 16-pixel by 16-line portion, or four
8-pixel by 8-line blocks, of luminance components (Y) and several
spatially corresponding 8 by 8 blocks of chrominance components Cr
and Cb. The number of blocks of chrominance values depends upon
which particular format is used. Common color space sampling
schemes include 4:4:4 for maximum quality but relatively low
compression, 4:2:2 including two Cb chrominance blocks and two Cr
chrominance blocks, and 4:2:0 including one Cb chrominance block
and one Cr chrominance block. A plurality of such macroblocks form
a horizontal slice within a frame, where the slice is the basic
processing unit in an MPEG coding scheme. A plurality of such
slices form each picture or frame, which is the basic unit of
display. As described previously, however, each frame is typically
interlaced and displayed as two separate fields.
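The block counts given in paragraph [0033] can be tabulated as in
the following sketch. The function names are hypothetical, and the
picture dimensions in the usage note are assumed examples rather
than values from the application.

```python
import math

# (Cb blocks, Cr blocks) per macroblock for each color sampling format.
CHROMA_BLOCKS = {"4:2:0": (1, 1), "4:2:2": (2, 2), "4:4:4": (4, 4)}


def blocks_per_macroblock(sampling_format):
    """Four 8x8 luminance blocks plus the chrominance blocks for the format."""
    cb_blocks, cr_blocks = CHROMA_BLOCKS[sampling_format]
    return 4 + cb_blocks + cr_blocks


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks needed to cover a frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)
```

For an assumed 720 by 480 picture this gives 1350 macroblocks, each
carrying 6 blocks in 4:2:0 format.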
[0034] Motion vectors provide displacement information between a
current picture and a previously stored picture. P frames use
motion compensation to exploit temporal redundancy, or lack of
substantial changes, between picture frames in the video. Apparent
motion between sequential pictures is caused by pixels in a
previous picture occupying different positions with respect to the
pixels in a current macroblock. This displacement between pixels in
a previous and a current macroblock is represented by motion
vectors encoded in the MPEG bitstream. Typically, the encoder
chooses which picture type is to be used for each given frame.
Having defined the picture type, the encoder then estimates motion
vectors for each macroblock in the picture. Typically in P frames,
one vector is employed for each macroblock, and in B frames, one or
two vectors are used. Note that the algorithm for determining
motion vectors is completely customizable, and that "good" motion
vector estimation is largely a matter of judgment. Different motion
estimation techniques may be more suitable (i.e. result in better
compression) for different film types (e.g. cartoons vs. action
movies).
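Because paragraph [0034] leaves the motion estimation method open,
the following is only one possible approach: an exhaustive block
matching search that minimizes the sum of absolute differences
(SAD). This technique is an assumption chosen for illustration and
is not identified in the application; the function names and the
search range are likewise hypothetical.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a current and a reference block."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def full_search(cur, ref, cx, cy, size=16, search_range=4):
    """Return (dx, dy, cost) for the best-matching block in the reference picture."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

A cheaper search pattern or a different cost measure could be
substituted without changing the bitstream syntax, which is the kind
of customization the paragraph above describes.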
[0035] When the encoder processes B frames, it usually re-orders
the picture sequence so that a video decoder receiving the digital
video signal operates properly. Since B frames are usually coded
using motion compensation based on previously sent I or P frames,
the B frames can only be decoded after the subsequent anchor
pictures (an I or P frame) have been received and decoded. Thus,
the sequence of the series of pictures may be re-ordered by the
encoder so that the pictures arrive at the decoder in a proper
sequence for decoding of the video signal. The decoder may then
re-order the pictures in proper sequence for viewing.
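The re-ordering described in paragraph [0035] may be sketched as
follows. The function name is hypothetical, and the handling of
trailing B frames is a simplification: a practical encoder would
hold them until the next anchor picture arrives.

```python
def display_to_coding_order(picture_types):
    """Reorder pictures so each B picture follows both anchors it depends on.

    Returns a list of (display_index, picture_type) pairs in coding order.
    """
    coded, pending_b = [], []
    for index, kind in enumerate(picture_types):
        if kind == "B":
            pending_b.append((index, kind))  # wait for the future anchor
        else:
            coded.append((index, kind))      # anchor (I or P) is sent first
            coded.extend(pending_b)          # then the B pictures it closes
            pending_b = []
    coded.extend(pending_b)  # simplification: flush B pictures with no later anchor
    return coded
```

A decoder can restore display order by sorting the decoded pictures
on the display index.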
[0036] For a given macroblock of video data, the encoder is
programmed to select a coding mode depending on the picture type,
the effectiveness of motion compensation in the particular region
of the picture, and the nature of the signal within the block. The
criteria for making this selection are customizable. A coding
method is selected, and the encoder begins to process the
macroblocks accordingly. For I frames, the encoder performs a
Discrete Cosine Transform (DCT) on the current macroblock. For P
and B frames, the encoder first performs a motion-compensated
prediction of the block contents based on past and/or future
reference pictures. The encoder then produces an error signal by
subtracting the prediction from the actual data in the current
macroblock. The error signal is similarly separated into 8 by 8
blocks (four luminance blocks and two chrominance blocks for 4:2:0
encoding). A DCT is then performed on each block to achieve further
compression. The DCT operation converts an 8 by 8 block of pixel
values to an 8 by 8 matrix of horizontal and vertical coefficients
of spatial frequency. An 8 by 8 block of pixel values can
subsequently be reconstructed by a video decoder performing an
Inverse DCT (IDCT) on the spatial frequency coefficients.
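For illustration, a direct (unoptimized) form of the 8 by 8 DCT and
its inverse referred to in paragraph [0036] is sketched below using
the orthonormal DCT-II. The function names are hypothetical, and a
practical encoder would use a fast algorithm or dedicated hardware
rather than this quadruple loop.

```python
import math

N = 8


def _scale(k):
    """Normalization factor for frequency index k."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(n, k):
    """Cosine basis value for sample position n and frequency index k."""
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Convert an 8x8 block of pixel values to coefficients indexed [v][u]."""
    return [
        [
            _scale(u) * _scale(v) * sum(
                block[y][x] * _basis(x, u) * _basis(y, v)
                for y in range(N) for x in range(N)
            )
            for u in range(N)
        ]
        for v in range(N)
    ]


def idct_2d(coeffs):
    """Reconstruct an 8x8 block of pixel values from its coefficients."""
    return [
        [
            sum(
                _scale(u) * _scale(v) * coeffs[v][u] * _basis(x, u) * _basis(y, v)
                for v in range(N) for u in range(N)
            )
            for x in range(N)
        ]
        for y in range(N)
    ]
```

A uniform block concentrates all of its energy in the single DC
coefficient at index [0][0], which is why smooth image regions
compress well.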
[0037] In addition to the signal compression that is achieved by
the encoding process itself, a substantial degree of intentional
(but lossy) signal compression can be achieved by a process of
selecting a quantization step size, where the quantization
intervals or steps are identified by an index. Considerable freedom
is permitted in making the selection of quantization step sizes.
For frequency coefficients corresponding to the higher spatial
frequencies, the quantization step size is chosen, based on the
human visual perception system, to favor the creation of
coefficient values of zero. In particular, the
step size is chosen so that the human visual perception system is
unlikely to notice the loss of a particular spatial frequency
unless the coefficient value for that spatial frequency rises above
the particular quantization level. The statistical encoding of the
resulting runs of consecutive zero-valued coefficients
corresponding to the higher-order coefficients accounts for
considerable compression gain. Higher quantization values allow
greater compression at the expense of quality, while lower values
allow higher video quality at the expense of more bits.
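The trade-off described above may be illustrated with a short Python
sketch. The step-size matrix produced by make_step_sizes is purely
hypothetical (step sizes that simply grow with spatial frequency) and
is not a matrix defined by any standard; the names quantize and
dequantize are likewise assumptions made for illustration.

```python
def make_step_sizes(base=8, slope=4, n=8):
    """Hypothetical step-size matrix: larger steps at higher frequencies."""
    return [[base + slope * (u + v) for v in range(n)] for u in range(n)]


def quantize(coeffs, step_sizes):
    """Map each coefficient to the index of its quantization interval.

    Note: Python's round() rounds exact halves to the nearest even
    integer; a real encoder may use a different rounding rule.
    """
    return [[int(round(c / q)) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, step_sizes)]


def dequantize(levels, step_sizes):
    """Approximate reconstruction: index times step size (lossy)."""
    return [[lv * q for lv, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, step_sizes)]
```

In this sketch a coefficient of 30 quantized with a step size of 8
yields index 4 and is reconstructed as 32, while the same coefficient
quantized with a step size of 64 becomes zero. Doubling base and slope
produces more zero-valued indices, i.e. greater compression at the
expense of quality.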
[0038] In order to cluster non-zero coefficients early in the
series and to encode as many zero coefficients as possible
following the last non-zero coefficient in the ordering, the
coefficient sequence is often organized in a specified orientation
termed zigzag ordering. More than one method of zigzag ordering is
possible. Zigzag ordering concentrates the highest (and least
common) spatial frequencies at the end of the series. Once the
zigzag ordering has been performed, the encoder performs
"run-length coding" on the AC coefficients. This process reduces
each 8 by 8 block of DCT coefficients to a number of events, each
represented by a non-zero coefficient and the number of preceding
zero coefficients. Because the high-frequency coefficients are more
likely to be zero, the combination of zigzagging and run-length
coding results in additional video compression.
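A minimal Python sketch of zigzag scanning followed by run-length
coding is given below. It assumes the conventional diagonal scan
starting at the DC coefficient, which is only one of the possible
orderings mentioned above, and it assumes that zeros following the
last non-zero coefficient are simply dropped (a real bitstream would
signal this with an end-of-block code). The function names are
hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag order."""
    def key(rc):
        r, c = rc
        d = r + c  # index of the anti-diagonal
        # Odd diagonals run top-to-bottom, even diagonals bottom-to-top.
        return (d, r if d % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_length_encode(levels):
    """Return (dc, events) for a block of quantized coefficients.

    Each event is (run, level): the number of zero AC coefficients
    preceding a non-zero AC coefficient, and that coefficient's value.
    Trailing zeros produce no event.
    """
    scan = [levels[r][c] for r, c in zigzag_order(len(levels))]
    dc, ac = scan[0], scan[1:]
    events, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            events.append((run, value))
            run = 0
    return dc, events
```

For a block whose only non-zero values are 12 at (0, 0), 3 at (0, 1)
and -1 at (2, 0), run_length_encode returns a DC value of 12 and the
two events (0, 3) and (1, -1); the remaining 60 zero coefficients
are not represented by any event.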
[0039] The video encoder then performs variable-length coding (VLC)
on the resulting data. VLC is a reversible lossless procedure for
coding data that assigns shorter code words to frequent events and
longer code words to less frequent events, thereby achieving
additional video compression. Huffman encoding is a particularly
well-known form of VLC that reduces the number of bits necessary to
represent a data set without losing any information.
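The principle may be sketched in Python by building a Huffman code
table from the observed frequency of each event. This is an
illustrative assumption only: deployed codecs commonly rely on
predefined code tables rather than tables derived from the data being
coded. The names huffman_table, vlc_encode and vlc_decode are
hypothetical.

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Build {symbol: bit string} with shorter codes for frequent symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # two least frequent subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def vlc_encode(symbols, table):
    """Concatenate the code words of all symbols into one bit string."""
    return "".join(table[s] for s in symbols)


def vlc_decode(bits, table):
    """Recover the symbols; valid because the code is prefix-free."""
    inverse = {code: s for s, code in table.items()}
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            out.append(inverse[word])
            word = ""
    return out
```

Because no code word is a prefix of another, vlc_decode recovers
exactly the sequence passed to vlc_encode, which illustrates the
reversible, lossless character of the procedure.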
[0040] The final compressed video data is then ready to be
transmitted to a storage device or over a transmission medium for
reception and decompression by a remotely located decoder. The MPEG
standard specifies a particular syntax for a compressed bitstream,
and bitstreams which adhere to this standard can be decoded by
MPEG-compliant decoders.
[0041] The video decoding process is generally the inverse of the
video encoding process and is employed to reconstruct a motion
picture sequence from a compressed and encoded bitstream. The data
in the bitstream is decoded according to a syntax that is itself
defined by the data compression algorithm. The decoder must first
identify the beginning of a coded picture, identify the type of
picture, then decode each individual macroblock within a particular
picture. If there are motion vectors and macroblock types (each of
the frame types I, P, and B has its own macroblock types)
present in the bitstream, they can be used to construct a
prediction of the current macroblock based on past and future
reference frames that the decoder has already stored. The
Huffman-coded data is decoded and inverse zigzagged back into
coefficient data. The coefficient data is then inverse quantized
and operated on by an IDCT process so as to transform the
macroblock data from the frequency domain to data in the time and
space domain.
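The inverse zigzag and inverse quantization steps may be sketched in
Python as follows. The sketch assumes that the variable-length
decoding has already produced a DC value and a list of (run, level)
events for one 8 by 8 block, that the conventional diagonal zigzag
scan was used by the encoder, and that inverse quantization is a
plain multiplication by a step size. The function names and the
step-size matrix are hypothetical. The resulting coefficients would
then be passed to an IDCT.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag order."""
    def key(rc):
        r, c = rc
        d = r + c  # index of the anti-diagonal
        return (d, r if d % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def rebuild_coefficients(dc, events, step_sizes):
    """Expand (run, level) events and inverse quantize one block."""
    n = len(step_sizes)
    scan = [dc]
    for run, level in events:
        scan.extend([0] * run)  # zeros preceding the coefficient
        scan.append(level)
    # Coefficients after the last event are implicitly zero.
    scan.extend([0] * (n * n - len(scan)))
    coeffs = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(scan, zigzag_order(n)):
        coeffs[r][c] = value * step_sizes[r][c]  # inverse quantization
    return coeffs
```

For example, with step sizes of 8 + 4(u + v), a DC index of 12
followed by the events (0, 3) and (1, -1) yields coefficients of 96
at (0, 0), 36 at (0, 1) and -16 at (2, 0), with all other
coefficients equal to zero.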
[0042] After all of the macroblocks have been processed by the
decoder, the picture reconstruction is complete. If a reconstructed
frame is a reference or anchor frame, such as an I or a P frame, it
replaces the oldest stored anchor frame and is used as the new
anchor for subsequent frames. As noted above, the frames may also
need to be re-ordered before they are displayed in accordance with
their display order instead of their coding order. After the frames
are re-ordered, they may then be displayed on an appropriate
display device.
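The anchor-frame handling and display re-ordering described above may
be sketched in Python as follows. The sketch is deliberately
simplified: frames are represented only by hypothetical string labels
such as "P3", the two-entry anchor store is modeled with a bounded
deque, and no actual picture data is decoded.

```python
from collections import deque


def display_order(coded):
    """Re-order frame labels from coding order to display order.

    Each label is a string whose first character is the frame type
    ("I", "P" or "B"); this labelling is an assumption of the sketch.
    """
    # Two stored anchors; appending a third drops the oldest one.
    anchors = deque(maxlen=2)
    out = []
    for label in coded:
        if label[0] in ("I", "P"):
            # A new anchor has arrived, so the previous anchor can
            # now be displayed; the new one is held back.
            if anchors:
                out.append(anchors[-1])
            anchors.append(label)
        else:
            # B frames are displayed as soon as they are decoded.
            out.append(label)
    if anchors:
        out.append(anchors[-1])  # flush the last held anchor
    return out
```

Given the coding order I0 P3 B1 B2 P6 B4 B5, the sketch returns the
display order I0 B1 B2 P3 B4 B5 P6.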
[0043] Numerous variations and modifications will become apparent
to those skilled in the art once the above disclosure is fully
appreciated. It is intended that the following claims be
interpreted to embrace all such variations and modifications.
* * * * *