Method and apparatus for reducing breathing artifacts in compressed video Patent Grant Florencio , et al. September 26, 2 [Motorola, Inc.]

Patent Grant 6125147

U.S. patent number 6,125,147 [Application Number 09/127,451] was granted by the patent office on 2000-09-26 for method and apparatus for reducing breathing artifacts in compressed video. This patent grant is currently assigned to Motorola, Inc. Invention is credited to Dinei Afonso Ferreira Florencio, Shipeng Li.

United States Patent	6,125,147
Florencio , et al.	September 26, 2000

**Please see images for: ( Certificate of Correction ) **

Method and apparatus for reducing breathing artifacts in compressed video

Abstract

A method and apparatus for reducing breathing artifacts in MPEG-like video information streams employing group of pictures (GOP) or group of frames (GOF) structures.

Inventors:	Florencio; Dinei Afonso Ferreira (Plainsboro, NJ), Li; Shipeng (Princeton, NJ)
Assignee:	Motorola, Inc. (Schaumburg, IL)
Family ID:	26771221
Appl. No.:	09/127,451
Filed:	July 31, 1998

Current U.S. Class:	375/240.29; 375/240.26; 375/E7.211; 375/E7.256; 375/E7.252; 375/E7.207; 375/E7.206; 375/E7.098; 375/E7.051; 375/E7.075; 375/E7.05; 375/E7.094; 375/E7.036; 375/E7.171; 375/E7.166; 375/E7.176; 375/E7.145
Current CPC Class:	H04N 19/16 (20141101); H04N 19/90 (20141101); H04N 19/51 (20141101); H04N 19/48 (20141101); H04N 19/176 (20141101); H04N 19/428 (20141101); G06T 3/4084 (20130101); H04N 19/186 (20141101); H04N 19/645 (20141101); H04N 19/59 (20141101); H04N 19/423 (20141101); H04N 19/61 (20141101); H04N 19/63 (20141101); H04N 19/132 (20141101); H04N 19/115 (20141101); H04N 19/10 (20141101); H04N 19/146 (20141101)
Current International Class:	G06T 9/00 (20060101); G06T 3/40 (20060101); H04N 7/46 (20060101); H04N 7/36 (20060101); H04N 7/26 (20060101); H04N 7/50 (20060101); H04N 007/36 ()
Field of Search:	;375/240.26,240.29 ;348/413,416,845,845.1

References Cited

U.S. Patent Documents


4851906	July 1989	Koga et al.
5229864	July 1993	Moronaga et al.
5231484	July 1993	Gonzales et al.
5361105	November 1994	Iu
5479211	December 1995	Fukuda
5734755	March 1998	Ramchandran et al.
5844614	December 1998	Chong et al.

Foreign Patent Documents


0 721 286 A 2	Oct 1996	EP

Other References

Bellifemine F et al: "Statistical Analysis of the 2D-DCT Coefficients of the Differential Signal for Images" Signal Processing. Image Communication, vol. 4, No. 6, Nov. 1, 1992, pp. 477-488, XP000322089 ISSN: 0923-5965.
PCT International Search Report PCT/US99/1024 corresponding to 09/127,451.

Primary Examiner: Britton; Howard
Attorney, Agent or Firm: Burke; William J.

Parent Case Text

This application claims the benefit of U.S. Provisional Application No. 60/084,632, filed May 7, 1998.

Claims

What is claimed is:

1. In a system for processing a sequence of compressed images arranged according to a group of pictures (GOP) information structure, a method for reducing an inter-GOP visual artifact comprising the steps of:

determining, for a first type of information frame within said GOP information structure, a respective level of fidelity degradation;

determining, for a second type of information frame within said GOP information structure, a respective level of fidelity degradation;

computing, using said determined levels of fidelity degradation, a degradation equalization parameter;

processing, using said computed degradation equalization parameter, one of said first and second types of information frames such that said first and second types of information frames have levels of fidelity degradation within a predetermined range.

2. The method of claim 1, wherein said first type of information frame comprises an intra-coded information frame (I-frame), and said second type of information frame comprises a predictive coded information frame (P-frame or B-frame).

3. The method of claim 1, wherein said first type of information frame comprises a forward predicted information frame (P-frame), and said second type of information frame comprises a bi-directionally predicted information frame (B-frame).

4. The method of claim 1, wherein said level of fidelity degradation is determined with respect to a level of quantization.

5. The method of claim 1, wherein said first type of information frame has a higher fidelity level than said second type of information frame, and said step of processing comprises the steps of:

characterizing a quantization noise parameter associated with said second type of information frame;

scaling, according to said degradation equalization parameter, said characterized quantization noise parameter; and

injecting, into said first information frame, quantization noise approximating said scaled quantization noise parameter.

6. The method of claim 5, wherein said noise is characterized with respect to a power density function.

7. The method of claim 1, further comprising the steps of:

determining, for a third type of information frame within said GOP information structure, a respective level of fidelity degradation; and

processing, using said computed degradation equalization parameter, said third type of information frames such that said first, second and third types of information frames have levels of fidelity degradation within a predetermined range.

8. The method of claim 1, wherein said step of processing comprises the step of filtering, on a block by block basis, those macroblocks having relatively low fidelity and including motion vectors.

9. In a system for decoding a sequence of compressed video information frames arranged according to a group of pictures (GOP) information structure, a method of reducing visual artifacts comprising the steps of:

determining a relative fidelity degradation level of a plurality of said compressed video information frames within a GOP;

computing, using said determined relative fidelity degradation level, a fidelity degradation equalization parameter; and

applying, to one or more compressed video information frames within said GOP, said computed fidelity degradation equalization parameter such that said relative fidelity degradation level is reduced to a threshold level.

10. The method of claim 9, wherein said relative fidelity degradation level is determined with respect to a quantization level.

11. The method of claim 9, wherein said relative fidelity degradation level is determined with respect to a compressed video information frame type.

12. The method of claim 9, wherein said relative fidelity degradation level is determined with respect to a relative position of said compressed video information frames within said GOP.

13. The method of claim 9, wherein said relative fidelity degradation level is determined with respect to a spectral information distribution of said compressed video information frames within said GOP.

14. The method of claim 9, wherein said relative fidelity degradation level is determined with respect to a sharpness level of said compressed video information frames within said GOP.

15. The method of claim 13, further comprising the steps of:

comparing said spectral information distribution to an expected spectral distribution;

calculating a spectral distribution equalization parameter; and

applying said spectral distribution equalization parameter to said GOP such that said spectral information distribution and said expected spectral distribution are within a threshold level.

16. In a system for processing a sequence of compressed images arranged according to a group of pictures (GOP) information structure, a method for reducing an inter-GOP visual artifact comprising the steps of:

selecting, for each GOP, one compressed image as a reference image;

determining, for each selected reference image, a level of fidelity degradation;

adapting, according to said determined level of fidelity degradation, image parameters of one or more compressed images such that a fidelity degradation level of said one or more compressed images

determining, for a second type of information frame within said GOP information structure, a respective level of fidelity degradation;

computing, using said determined levels of fidelity degradation, a degradation equalization parameter;

processing, using said computed degradation equalization parameter, one of said first and second types of information frames such that said first and second types of information frames have levels of fidelity degradation within a predetermined range.

17. The method of claim 16, wherein said selected reference frame comprises a first P-frame within said GOP.

18. The method of claim 16, wherein each GOP is processed according to a reference frame within a preceding GOP.

19. The method of claim 16, wherein said step of processing comprises the step of filtering, on a block by block basis, those macroblocks having relatively low fidelity and including motion vectors.

20. In a system for decoding a sequence of compressed video information frames arranged according to a group of pictures (GOP) information structure, apparatus comprising:

a video processor, for processing video information frames in response to a control signal; and

a controller, for generating said control signal in response to indicia of fidelity of said video information frames;

said controller determining a relative fidelity degradation level of a plurality of said compressed video information frames within a GOP, and computing, using said determined relative fidelity degradation level, a fidelity degradation equalization parameter;

said video processor applying, to one or more video information frames within said GOP, said computed fidelity degradation equalization parameter such that said relative fidelity degradation level is reduced to a threshold level.

21. The apparatus of claim 20, wherein said video processor reduces fidelity levels of relatively high fidelity video information frames by injecting quantization noise into said relatively high fidelity video information frames.

Description

The invention relates to communications systems generally and, more particularly, the invention relates to a method and concomitant apparatus for reducing a "breathing" artifact within an encoded information stream.

BACKGROUND OF THE DISCLOSURE

In several communications systems the data to be transmitted is compressed so that the available bandwidth is used more efficiently. For example, the Moving Pictures Experts Group (MPEG) has promulgated several standards relating to digital data delivery systems. The first, known as MPEG-1, refers to ISO/IEC standards 11172 and is incorporated herein by reference. The second, known as MPEG-2, refers to ISO/IEC standards 13818 and is incorporated herein by reference. A compressed digital video system is described in the Advanced Television Systems Committee (ATSC) digital television standard document A/53, and is incorporated herein by reference.

The above-referenced standards describe data processing and manipulation techniques that are well suited to the compression and delivery of video, audio and other information using fixed or variable length digital communications systems. In particular, the above-referenced standards, and other "MPEG-like" standards and techniques, compress, illustratively, video information using intra-frame coding techniques (such as run-length coding, Huffman coding and the like) and inter-frame coding techniques (such as forward and backward predictive coding, motion compensation and the like). Specifically, in the case of video processing systems, MPEG and MPEG-like video processing systems are characterized by prediction-based compression encoding of video frames with or without intra- and/or inter-frame motion compensation encoding.

In a typical MPEG encoder, a received video stream comprising a plurality of video frames is encoded according to a predefined group of pictures (GOP) structure. That is, the received video stream is encoded to produce a GOP comprising, e.g., an intra-coded frame (I-frame), followed by one or more forward predicted coded frames (P-frames) and bi-directional (i.e., forward and backward) predicted frames (B-frames).

Unfortunately, since motion compensation techniques are not perfect, predicted frames or pictures may suffer from undesirable visual artifacts such as error propagation, reduced editing granularity and the like. Moreover, GOP structures including such predicted frames also suffer from such artifacts. Since the fidelity of individual frames within a GOP structure tends to degrade after an initial high quality I-frame, the display of a sequence of GOPs produces a cyclical visual artifact known as "breathing," where the cyclical nature of the artifact is related to the length of the GOPs displayed.

Therefore, it is seen to be desirable to provide a method and concomitant apparatus that reduce the above-described "breathing" artifact.

SUMMARY OF THE INVENTION

The invention comprises a method and apparatus for reducing breathing artifacts in MPEG-like video information streams employing group of pictures (GOP) or group of frames (GOF) structures. Specifically, in a system for processing a sequence of compressed images arranged according to a group of pictures (GOP) information structure, a method according to the invention for reducing an inter-GOP visual artifact comprises the steps of: determining, for a first type of information frame within the GOP information structure, a respective level of fidelity degradation; determining, for a second type of information frame within the GOP information structure, a respective level of fidelity degradation; computing, using the determined levels of fidelity degradation, a degradation equalization parameter; and processing, using the computed degradation equalization parameter, one of the first and second types of information frames such that the first and second types of information frames have levels of fidelity degradation within a predetermined range.

BRIEF DESCRIPTION OF THE DRAWINGS

The teachings of the present invention can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:

FIGS. 1A through 1C depict graphical representations of relative frame by frame degradation of one or more parameters associated with video fidelity within a group of pictures (GOP);

FIG. 2 depicts an embodiment of an MPEG-like decoder according to the invention;

FIG. 3 depicts a degradation equalization routine according to the invention for reducing breathing artifacts in a video signal;

FIG. 4 depicts a degradation equalization routine 400 according to the invention;

FIG. 5 depicts a flow diagram of a GOP equalization routine according to the invention;

FIG. 6 depicts a flow diagram of a degradation equalization routine according to the invention;

FIG. 7 depicts a flow diagram of a degradation equalization routine according to the invention and suitable for reducing breathing artifacts within the context of a memory-constrained decoder; and

FIG. 8 depicts a flow diagram of a block level correction control routine according to the invention.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.

DETAILED DESCRIPTION

The invention will be described within the context of a video decoder, illustratively an MPEG-2 video decoder, within a digital television (DTV) receiver, illustratively an ATSC television receiver. However, it will be apparent to those skilled in the art that the invention is applicable to any video processing system, including those systems adapted to DVB, MPEG-1 and other information streams.

Specifically, the invention will be primarily described within the context of an MPEG-like decoding system that receives and decodes a compressed video information stream IN to produce a video output stream OUT. The invention operates to reduce a "breathing" artifact that, in the case of a video or image processing system utilizing a group of pictures (GOP) information structure, is caused by intra-GOP changing of visual fidelity levels resulting in a modulation of visual fidelity at the GOP presentation rate. However, it will be apparent to those skilled in the art that the invention has broad applicability to other information systems having different inter-frame fidelity levels within a group of pictures (GOP) or group of frames (GOF) information structure.

FIGS. 1A through 1C depict graphical representations of relative frame by frame degradation of one or more parameters associated with video fidelity within a group of pictures (GOP). To the extent that such frame by frame degradation within a GOP results in a periodic increase and/or decrease in visual fidelity having a period approximating the GOP period, the aforementioned breathing artifact is produced. If the periodic increase and/or decrease in visual fidelity is sufficiently large, the breathing artifact becomes noticeable to a viewer when the video sequence including the GOP is presented on a display device.

For example, consider the case of a 30 frames per second (fps) presentation/display system responsive to a video information stream comprising MPEG-like video information arranged according to a 15 frame GOP exhibiting significant intra-GOP changes in visual fidelity (e.g., large quantization differences between I- and B-frames). Since the GOP comprises a single I-frame and many B-frames, each half second (15 frame GOP/30 fps) of displayed video begins with a high fidelity I-frame that is followed by a number of relatively low fidelity B-frames (and, typically, P-frames). This displayed sequence of images may exhibit a pulsating (i.e., breathing) characteristic having a period of 1/2 second due to the inter-frame fidelity variations within the GOP.
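The half-second period in the example above is simply the GOP length divided by the display rate. A minimal sketch of that arithmetic follows; the function name `breathing_period_seconds` is illustrative only and does not appear in the disclosure, and the sketch assumes one I-frame per GOP and a constant display rate.

```python
from fractions import Fraction


def breathing_period_seconds(gop_length_frames: int, display_fps: int) -> Fraction:
    """Period of the fidelity modulation, assuming one I-frame per GOP."""
    if gop_length_frames <= 0 or display_fps <= 0:
        raise ValueError("GOP length and display rate must be positive")
    # One high-fidelity I-frame is displayed every gop_length_frames frames.
    return Fraction(gop_length_frames, display_fps)


# The 15 frame GOP at 30 fps discussed above.
print(breathing_period_seconds(15, 30))  # 1/2
```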

The inventors have determined that the primary causes of breathing artifacts are 1) inter-frame differences in quantization levels within a GOP; 2) propagation of half pel motion estimation errors within a GOP; and 3) fidelity anomalies introduced by (or characteristic of) a decoder processing a GOP. Embodiments of the invention addressing all three causes of breathing artifacts will be disclosed.

FIG. 1A depicts a graphical representation of a frame by frame degradation of video "sharpness" within a GOP comprising an I-frame followed by ten P-frames. Sharpness is a function of high frequency response, which is critical to the accurate presentation of edges and other high contrast luminance transitions. Specifically, FIG. 1A depicts a sequence of three GOPs (denoted as 101) and a corresponding diagrammatic representation of a sharpness level of each frame in the GOPs (denoted by reference 102). It can be seen that the sharpness of a video image is at a maximum for each I-frame in the GOPs, with a corresponding decrease in sharpness for each successive P-frame. A breathing artifact is produced due to the gradual degradation and rapid increase in sharpness level at a rate corresponding to the GOP display rate.

In FIG. 1A, the corresponding decrease in sharpness for each successive P-frame is caused by half pel interpolation within the motion compensation circuitry of a decoder processing the GOPs. The half pel interpolator operates to low pass filter the predicted macroblocks or pel groupings. The relatively linear degradation of sharpness level depicted in FIG. 1A is due to the use of P-frames within the GOPs and an assumption that the half pel interpolation error is approximately constant. The shaded portion of the diagrammatic representation comprises a statistical band of errors within which each P-frame falls. It must be noted that while each I-frame (I.sub.1, I.sub.2 and I.sub.3) is shown as having the same maximum level of sharpness, the sharpness level of the I-frames is initially determined by an encoder's rate controller based on many factors, including a bit-budget allocated to a GOP. Thus, the level of sharpness and general visual fidelity of an I-frame varies as necessary depending on requirements at the encoder.

FIG. 1B depicts a graphical representation of a frame by frame degradation of video sharpness within a 12 frame GOP comprising an initial I-frame followed by a repeating sequence of three B-frames and one P-frame. Since each P-frame is predicted using the immediately preceding anchor frame (I- or P-frame), a relatively linear degradation in sharpness level for the P-frames is shown. However, since each B-frame is predicted using the anchor frames (I-frame or P-frame) surrounding the B-frame, two half pel interpolations are associated with each B-frame. Therefore, each B-frame is shown as having a sharpness level that is a half pel lower than the least accurate reference frame used to predict the B-frame. It can be seen that the sharpness of a video image is at a maximum for each I-frame in the GOPs, with a varying amount of degradation in sharpness for each successive frame in a GOP. A breathing artifact is produced due to the varying amount of degradation and rapid increase in sharpness level at a rate corresponding to the GOP display rate.

In FIG. 1B the first I-frame (I.sub.1) is depicted as having a sharpness level of maximum (max). The first P-frame following that I-frame is depicted as having a sharpness level of one half pel lower than the maximum. The three B-frames predicted using the first I-frame and the first P-frame are depicted as having sharpness levels of one half pel below the sharpness level of the P-frame. Similarly, the second P-frame after the first I-frame has a sharpness level one half pel below the sharpness level of the first P-frame, from which it is predicted. Thus, the B-frames predicted using the first P-frame and the second P-frame have sharpness levels that are one half pel below the second P-frame sharpness level.
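The sharpness bookkeeping described for FIGS. 1A and 1B can be sketched as a count of half pel interpolation steps below the I-frame maximum. This is only an illustration of the simplified model in the text (constant error per interpolation); the function name `half_pel_steps` and the string encoding of the GOP in display order are assumptions, the GOP is assumed to begin with an I-frame, and trailing B-frames are assumed to reference the next GOP's I-frame at zero steps.

```python
def half_pel_steps(gop: str) -> list[int]:
    """Half pel steps below maximum sharpness, per frame in display order."""
    steps = [0] * len(gop)
    last_anchor = 0
    # Anchor frames: an I-frame resets the count, a P-frame adds one step
    # to the anchor frame it is predicted from.
    for i, frame_type in enumerate(gop):
        if frame_type == "I":
            last_anchor = 0
            steps[i] = 0
        elif frame_type == "P":
            last_anchor += 1
            steps[i] = last_anchor
    # B-frames: one step below the least accurate surrounding anchor frame.
    for i, frame_type in enumerate(gop):
        if frame_type == "B":
            prev_anchor = next(
                (steps[j] for j in range(i - 1, -1, -1) if gop[j] != "B"), 0)
            next_anchor = next(
                (steps[j] for j in range(i + 1, len(gop)) if gop[j] != "B"), 0)
            steps[i] = max(prev_anchor, next_anchor) + 1
    return steps


# 12 frame GOP of FIG. 1B: I-frame, then repeating three B-frames and one P-frame.
print(half_pel_steps("IBBBPBBBPBBB"))  # [0, 2, 2, 2, 1, 3, 3, 3, 2, 3, 3, 3]
```

Under the same model, the all-P GOP of FIG. 1A yields the linear ramp 0, 1, 2, and so on.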

FIG. 1C depicts a graphical representation of a frame by frame quantization level (i.e., granularity) within a 12 frame GOP comprising an initial I-frame followed by a repeating sequence of three B-frames and one P-frame. Since I-frames are typically encoded with a finer quantization level than P-frames, and since P-frames are typically encoded with a finer quantization level than B-frames, FIG. 1C depicts the change in quantization level of video images produced using the GOP structure. It can be seen by inspection that the quantization level starts at a fine level for the initial I-frame (I.sub.1) and degrades to a coarser level for the B-frames and to a less coarse level for the P-frames. Thus, a video image produced using the GOP structure depicted in FIG. 1C, including the corresponding quantization granularity levels, will have an increase and decrease in fidelity based upon the frame-type within the GOP structure used to produce an individual video frame. This change or modulation of fidelity also contributes to the aforementioned breathing artifact.

FIG. 2 depicts an embodiment of an MPEG-like decoder 200 according to the invention. Specifically, the decoder 200 of FIG. 2 receives and decodes a compressed video information stream IN to produce a video output stream OUT. The video output stream OUT is suitable for coupling to, e.g., a display driver circuit within a presentation device (not shown). The MPEG-like decoder 200 examines fidelity indicative parameters associated with the information frames (e.g., video or image frames) within the compressed video information stream IN to determine if a resulting decompressed information stream (e.g., a video stream) will likely exhibit the above-described breathing artifact. In response to a determination that a breathing artifact is likely to be present, the MPEG-like decoder 200 modifies a decoded information stream in a manner tending to attenuate the breathing artifact. Advantageously, the invention can operate without any knowledge regarding the behavior of the encoder forming the compressed video stream IN (other than the information within the compressed video stream IN).

The MPEG-like decoder 200 of FIG. 2 comprises a video decoder 210, a video processor 220 and a controller 230. Video decoder 210 operates in a relatively standard manner to receive and decode the compressed video information stream IN to produce a decompressed video stream V1. The decompressed video stream V1 is susceptible to breathing artifacts caused by inter-frame fidelity variations within GOPs and anomalies (or characteristics) of the video decoder 210 itself, as described above. The decompressed video stream V1 is coupled to video processor 220 and, optionally, controller 230.

Video processor 220, in response to a degradation control signal C produced by controller 230, adapts a degradation level of one or more frames within the decoded video stream V1 to produce the video output stream OUT, illustratively a baseband digital video stream.

The video processor 220 operates in one of several degradation modes, as will be discussed in more detail below. Briefly, in a noise injection mode of operation, the video processor 220 injects an amount of noise (determined by the controller 230) into a higher fidelity image frame (e.g., an I-frame) calculated to reduce an effective quantization level of the higher fidelity image frame. In this mode of operation, breathing artifacts induced by large differences in inter-frame quantization levels (e.g., per FIG. 1C) are reduced by reducing inter-frame quantization level differences to a threshold difference level. In a filter mode of operation, the video processor selectively applies low pass filtering (LPF) and high pass filtering (HPF) to one or more image frames or image frame types within a GOP. In this mode of operation, breathing artifacts induced by the propagation of prediction errors are reduced by filtering each image frame in a manner tending to equalize the amount of sharpness degradation among all the image frames within a GOP. Other modes of operation will be discussed below.
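The noise injection mode can be illustrated with a small sketch. The text does not specify the noise distribution or how the controller 230 sizes it, so the zero-mean uniform noise, the `amplitude` parameter, the 8-bit pixel range and the function name `inject_noise` below are all assumptions made for illustration only.

```python
import random


def inject_noise(pixels, amplitude, rng):
    """Add zero-mean uniform noise in [-amplitude, +amplitude] to 8-bit pixels.

    `amplitude` stands in for the amount of noise chosen by the controller;
    the uniform distribution is an assumption, not taken from the text.
    """
    noisy = []
    for value in pixels:
        noise = rng.uniform(-amplitude, amplitude)
        # Round to an integer sample and clamp to the 8-bit range.
        noisy.append(min(255, max(0, round(value + noise))))
    return noisy


rng = random.Random(0)  # seeded only so the example is repeatable
i_frame_row = [16, 64, 128, 200, 235]
degraded_row = inject_noise(i_frame_row, amplitude=4, rng=rng)
```

Each output sample stays within `amplitude` of its input, so a larger amplitude lowers the fidelity of the higher fidelity frame toward that of its more coarsely quantized neighbors.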

Controller 230 receives a fidelity indicative parameter signal DATA from video decoder 210 indicative of, e.g., quantization levels, frame types, GOP positions and other information regarding video frames within the compressed video input stream IN. Optionally, the controller 230 receives the decompressed video stream V1. In addition, controller 230 optionally receives a signal SOURCE ERROR indicative of errors imparted to the compressed video information stream IN at the source (i.e., at the encoder).

Controller 230 processes the fidelity indicative parameter signal DATA and, optionally, the decompressed video stream V1 to determine a level of visual fidelity degradation associated with one or more image frames or image frame types within one or more GOPs comprising the compressed video information stream IN.

The controller 230 then determines if the variations in image frame or image frame type degradation levels are such that a breathing artifact is likely to be produced by a subsequent presentation of the decompressed video stream V1. If the identified degradation variations are likely to produce a breathing artifact (e.g., exceed a threshold difference level), then the controller 230 calculates one or more degradation equalization parameters, which are then coupled to the video processor 220 as degradation control signal C. The operation of the controller 230 will be discussed in more detail below with respect to FIG. 3.

In the exemplary embodiment of the MPEG-like decoder 200 of FIG. 2, video decoder 210 comprises an input buffer memory module 211, a variable length decoder (VLD) module 212, an inverse quantizer (IQ) module 213, an inverse discrete cosine transform (IDCT) module 214, a summer 215, a motion compensation module 216, an output buffer module 218 and an anchor frame memory module 217.

The input buffer memory module 211 receives the compressed video stream IN, illustratively a variable length encoded bitstream representing, e.g., a high definition television signal (HDTV) or standard definition television signal (SDTV) output from a transport demultiplexer/decoder circuit (not shown). The input buffer memory module 211 is used to temporarily store the received compressed video stream IN until the variable length decoder module 212 is ready to accept the video data for processing. The VLD 212 has an input coupled to a data output of the input buffer memory module 211 to retrieve, e.g., the stored variable length encoded video data as data stream S1.

The VLD 212 decodes the retrieved data to produce a constant length bit stream S2 comprising quantized prediction error DCT coefficients, and a motion vector stream MV. The IQ module 213 performs an inverse quantization operation upon constant length bit stream S2 to produce a bit stream S3 comprising quantized prediction error coefficients in standard form. The IDCT module 214 performs an inverse discrete cosine transform operation upon bit stream S3 to produce a bit stream S4 comprising pixel-by-pixel prediction errors. These prediction errors (and associated image degradation) are caused by encoder-side quantization of the pixel information, as discussed above with respect to FIG. 1C.

The summer 215 adds the pixel-by-pixel prediction error stream S4 to a motion compensated predicted pixel value stream S6 produced by the motion compensation module 216. The motion compensated predicted pixel value stream S6 will comprise a prediction error component (and associated sharpness degradation) that is related to the number of prior predictions used to produce the present pixel value, as discussed above with respect to FIGS. 1A-1B. Thus, the output of summer 215 is, in the exemplary embodiment, a video stream S5 comprising reconstructed pixel values that are degraded by both quantization errors and by cumulative prediction errors, which are discussed above with respect to FIGS. 1A-1C.
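The operation of the summer 215 can be sketched as an element-wise addition of the prediction error stream S4 and the predicted pixel value stream S6. The clamping to an 8-bit sample range and the function name `summer` are assumptions made for illustration; the text states only that the two streams are added to form S5.

```python
def summer(prediction_error, predicted):
    """Form S5 by adding pixel-by-pixel prediction errors (S4) to
    motion compensated predicted pixel values (S6)."""
    if len(prediction_error) != len(predicted):
        raise ValueError("S4 and S6 must cover the same pixels")
    # Clamp to the 8-bit range (an assumption about the sample format).
    return [min(255, max(0, error + pixel))
            for error, pixel in zip(prediction_error, predicted)]


s4 = [-3, 0, 10]      # hypothetical prediction errors
s6 = [100, 128, 250]  # hypothetical predicted pixel values
print(summer(s4, s6))  # [97, 128, 255]
```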

The video stream S5 (including quantization and prediction errors) produced by summer 215 is coupled to the anchor frame memory module 217 and the output buffer module 218. The anchor frame memory module 217 is accessed by the motion compensation module 216 via signal path S7. The motion compensation module 216 utilizes one or more stored anchor frames (e.g., the last I-frame or P-frame of video produced at the output of the summer 215), and the motion vector signal MV received from the VLD 212, to calculate the values for the motion compensated predicted pixel value stream S6.

The video decoder 210 described above also produces a fidelity indicative parameter signal DATA indicative of, e.g., quantization levels, frame types, GOP positions and other information regarding video frames within the compressed video input stream IN. Specifically, these and other fidelity indicative parameters are extracted from the compressed video information stream IN by the VLD 212 in a standard manner. For example, if the compressed video information stream IN comprises an MPEG video stream, the VLD 212 examines the information contained in one or more of the video sequence header, GOP header, picture header, slice header and macroblock header. The extracted information is coupled to the controller 230 as the fidelity indicative parameter signal DATA.

In the exemplary embodiment of the MPEG-like decoder 200 of FIG. 2, the controller 230 comprises a microprocessor 234 as well as memory 238 for storing at least one of a degradation equalization routine 300, a quantization equalization routine 400 and a frequency equalization routine 500. The operation of these routines will be discussed in detail below with respect to FIGS. 3-5. The microprocessor 234 cooperates with conventional support circuitry 236 such as power supplies, clock circuits, cache memory and the like as well as circuits that assist in executing the software routines. As such, it is contemplated that some of the process steps discussed herein as software processes may be implemented within hardware, e.g., as circuitry that cooperates with the microprocessor 234 to perform various steps. The controller 230 also contains input/output circuitry 232 that forms an interface between the microprocessor 234, the video decoder 210 and the video processor 220. Although the controller 230 is depicted as a general purpose computer that is programmed to perform specific control functions in accordance with the present invention, the invention can be implemented in hardware as an application specific integrated circuit (ASIC). Moreover, the controller 230 may be functionally combined with one or both of the video processor 220 and the video decoder 210. In fact, it is contemplated by the inventors that substantially all the functionality of the MPEG-like decoder 200 of FIG. 2 will be implemented as a single integrated circuit.

FIG. 3 depicts a degradation equalization routine 300 according to the invention for reducing breathing artifacts in a video signal. The degradation equalization routine 300 may be implemented as a control routine within the controller 230 or, in the case of controller 230 not comprising a general purpose computing device as depicted, as a logical function between cooperating modules of the MPEG-like decoder 200 of FIG. 2. The degradation equalization routine 300 is entered at step 302 when, e.g., the video decoder 210 of FIG. 2 begins to receive the compressed video information stream IN. The routine 300 then proceeds to step 304.

At step 304 the relative degradation levels of received video frames or video frame types are determined. The relative degradation levels are determined by evaluating one or more parameters associated with each video frame or video frame type. Specifically, one or more of various image or video frame parameters such as frame quantization level (e.g., average of frame macroblocks), frame type, frame position within GOP, frame spectrum and the like may be analyzed to determine a degradation level associated with a particular frame. The degradation level associated with a particular video or image frame is then compared to, e.g., a nominal or average degradation level associated with the frames comprising a GOP or a sub-set of the frames comprising a GOP. In this manner a degradation variance of a particular frame from a GOP or sub-GOP mean is determined. The routine 300 then proceeds to step 306.

At step 306 the degradation level and/or degradation variance of some or all of the frames evaluated in step 304 is used to compute one or more degradation equalization parameters for one or more frames in the GOP. For example, those frames within the GOP having relatively high fidelity (i.e., relatively low degradation) are degraded in terms of the one or more frame parameters analyzed in step 304 such that the inter-frame variance in fidelity degradation within the GOP is constrained to a level at or below a level resulting in a noticeable breathing artifact. The routine 300 then proceeds to step 308.

At step 308 the degradation equalization parameters computed at step 306 are applied to one or more appropriate frames within the GOP structure such that the inter-frame degradation variance within the GOP structure is appropriately constrained. Since degrading the fidelity of a video or image frame is typically easier than enhancing it (in terms of processing complexity and the like), appropriate frames will typically comprise those frames having a relatively high fidelity level. However, in the case of a fidelity parameter that is susceptible to enhancement, appropriate frames may comprise those frames having a relatively low fidelity level. The routine 300 then proceeds to step 310.

At step 310, a query is made as to whether more frames are to be processed. If the query at step 310 is answered affirmatively, then the routine 300 proceeds to step 304. If the query at step 310 is answered negatively, then the routine 300 proceeds to step 312 where it is exited.
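By way of illustration only, steps 304 through 308 may be sketched as follows. The sketch is not the claimed implementation. It assumes a single scalar degradation level per frame (e.g., an average quantizer scale), uses the spread between the most and least degraded frames as a stand-in for the inter-frame degradation variance, and equalizes only by degrading. The function names and sample values are hypothetical.

```python
from statistics import mean


def degradation_deviation(levels):
    """Step 304 (sketch): deviation of each frame's degradation level from the GOP mean."""
    gop_mean = mean(levels)
    return [level - gop_mean for level in levels]


def equalization_targets(levels, max_spread):
    """Step 306 (sketch): target level per frame so that the spread is at most max_spread.

    Only high fidelity (low degradation) frames are moved, and only toward more
    degradation, mirroring the preference stated for step 308.
    """
    floor = max(levels) - max_spread
    return [max(level, floor) for level in levels]


# Hypothetical degradation levels for a GOP in display order I B B P B B P.
gop_levels = [4.0, 12.0, 12.0, 8.0, 13.0, 13.0, 9.0]
deviations = degradation_deviation(gop_levels)
targets = equalization_targets(gop_levels, max_spread=3.0)
print(targets)  # [10.0, 12.0, 12.0, 10.0, 13.0, 13.0, 10.0]
```

In this sketch the I-frame and the two P-frames are the only frames whose target differs from their measured level, which corresponds to step 308 being applied to the relatively high fidelity frames.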

It will be noted that one technique used to equalize degradation levels is to further degrade one or more relatively high fidelity video or image frames or frame types. It has been determined by the inventors that the decrease in fidelity of, e.g., an I-frame in a GOP, is much less objectionable to a viewer than the breathing artifact effectively reduced or removed by the operation of the invention. Thus, to the extent that a tradeoff is made in terms of shifting or selectively reducing visual, aural or other information fidelity within one or more frames of a GOP or GOF, the resulting processed information stream (e.g., video output stream OUT) provides a superior presentation to the information consumer.

The invention contemplates the use of one or more methods for determining degradation levels (step 304), computing degradation equalization parameters (step 306) and applying degradation equalization parameters (step 308). Specifically, in one embodiment of the invention, the degradation equalization routine 300 of FIG. 3 is adapted to reduce inter-frame variations in quantization levels within a GOP to substantially a threshold quantization variation level. The quantization equalization embodiment is especially well suited for reducing breathing artifacts related to quantization level variations between different types (i.e., I-frame, P-frame or B-frame) of video or image frames, and will be described below with respect to FIG. 4.

In another embodiment of the invention, the degradation equalization routine 300 of FIG. 3 is adapted to reduce inter-frame variations in the frequency domain within a GOP to below a threshold spectrum variation level. The frequency response equalization embodiment is especially well suited for reducing breathing artifacts related to the low pass filtering (i.e., reduction in picture sharpness) imparted to predicted video or image frames (i.e., P-frames or B-frames) due to the propagation of motion estimation errors, and will be described below with respect to FIG. 5.

A first method involves examining a quantization level parameter associated with each encoded video frame. Typically, video encoders such as MPEG video encoders produce a GOP structure having a relatively high fidelity I-frame followed by one or more P-frames and B-frames of diminishing fidelity. It is important to encode the initial I-frame of a GOP structure at a high fidelity level since this frame is used to predict subsequent frames within the GOP structure. Thus, a fine quantization level is typically used by the quantizer in the encoder. Since P-frames are used as anchor frames within a GOP structure (i.e., P-frames are used to predict subsequent P-frames and immediately subsequent and prior B-frames), P-frames are quantized at a level typically finer than the level used for B-frames. As previously noted, variations in quantization level within a GOP contribute to the undesirable breathing artifact. Thus, to equalize the quantization levels within a GOP structure it is necessary to increase the effective quantization level of, e.g., the I-frame and/or one or more of the P-frames within a GOP. An example of quantization degradation equalization will now be described with respect to FIG. 4.

FIG. 4 depicts a degradation equalization routine 400 according to the invention. Specifically, the degradation equalization routine 400 of FIG. 4 is directed to reducing breathing artifacts in a video signal related to quantization level variations between an I-frame within a GOP and a plurality of B-frames within the GOP. The routine 400 adapts quantization levels within a GOP structure such that the fidelity of an I-frame within a GOP is reduced to within a threshold fidelity level of, illustratively, the average fidelity level of a plurality of B-frames within the GOP. The degradation equalization routine 400 may be implemented as a control routine within the controller 230 or, in the case of controller 230 not comprising a general purpose computing device as depicted, as a logical function between cooperating modules of the MPEG-like decoder 200 of FIG. 2.

The degradation equalization routine 400 is entered at step 402 when, e.g., the video decoder 210 of FIG. 2 begins to receive the compressed video information stream IN. The routine 400 then proceeds to step 404.

At step 404, a quantization parameter associated with an I-frame (Q.sub.I) is determined by, e.g., averaging the quantization level of each macroblock within the I-frame in the GOP being processed. Additionally, a quantization parameter associated with the B-frames (Q.sub.B) is determined by, e.g., averaging the quantization level of each macroblock within one or more B-frames within the GOP being processed. The determination may be made with respect to, e.g., the MPEG variable quantizer_scale, which may be found within the slice layer header of an MPEG compliant video stream and which has an integer value between 1 and 31 (assuming a constant quantizer table such as the default table is used). It must be noted that quantizer_scale is also adjustable at the macroblock layer.
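The averaging of step 404 may be illustrated by the following minimal sketch. It assumes that the per-macroblock quantizer scale values (integers between 1 and 31) have already been extracted by the decoder. The function names and the sample values are hypothetical.

```python
def frame_quantization_parameter(macroblock_scales):
    """Average the quantizer scale values of the macroblocks of one frame."""
    if not macroblock_scales:
        raise ValueError("frame has no macroblocks")
    if any(not 1 <= q <= 31 for q in macroblock_scales):
        raise ValueError("quantizer scale must lie between 1 and 31")
    return sum(macroblock_scales) / len(macroblock_scales)


def b_frame_quantization_parameter(b_frames):
    """Average over all macroblocks of one or more B-frames of the GOP."""
    all_scales = [q for frame in b_frames for q in frame]
    return frame_quantization_parameter(all_scales)


# Hypothetical values: one I-frame and two B-frames with a few macroblocks each.
q_i = frame_quantization_parameter([4, 4, 6, 6])
q_b = b_frame_quantization_parameter([[12, 14], [16, 14]])
print(q_i, q_b)  # 5.0 14.0
```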

The routine 400 then proceeds to step 406, where a query is made as to whether the quantization parameter associated with the B-frame Q.sub.B is greater than the sum of the quantization parameter associated with the I-frame Q.sub.I plus a quantization parameter threshold amount Q.sub.TH. The quantization parameter threshold level Q.sub.TH represents a maximal quantization parameter deviation between the B-frame quantization parameter Q.sub.B and the I-frame quantization parameter Q.sub.I that does not result in a noticeable breathing artifact attributable to the quantization difference.

If the query at step 406 is answered affirmatively, then the routine 400 proceeds to step 408. If the query at step 406 is answered negatively, then the routine proceeds to step 414, where a query is made as to whether more frames are to be processed. If the query at step 414 is answered affirmatively, then the routine 400 proceeds to step 404. If the query at step 414 is answered negatively, then the routine proceeds to step 416 and exits.
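The query of step 406 and the resulting branch may be sketched as follows. The sketch assumes that one pair of parameters (Q.sub.I, Q.sub.B) has been determined per GOP. The function names and the sample values are hypothetical.

```python
def exceeds_threshold(q_i, q_b, q_th):
    """Step 406 (sketch): True when Q_B is greater than Q_I plus Q_TH."""
    return q_b > q_i + q_th


def gops_needing_equalization(gop_parameters, q_th):
    """Indices of the GOPs for which the routine would continue to step 408."""
    return [index for index, (q_i, q_b) in enumerate(gop_parameters)
            if exceeds_threshold(q_i, q_b, q_th)]


# Hypothetical (Q_I, Q_B) pairs for three consecutive GOPs.
gop_parameters = [(5.0, 14.0), (6.0, 8.0), (4.0, 8.0)]
selected = gops_needing_equalization(gop_parameters, q_th=4.0)
print(selected)  # [0]
```

The third GOP is not selected because its B-frame parameter equals, but does not exceed, the sum Q.sub.I plus Q.sub.TH.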

At step 408 the noise within the I-frame and the B-frames is characterized. That is, the amount of noise expected to be found within an I-frame having a quantization parameter Q.sub.I and within a B-frame having a quantization parameter Q.sub.B is determined with respect to a quantization level (i.e., step size) producing a noise distribution.

One method for determining the amount of additional noise expected to be found within a B-frame is by estimating a probability density function (pdf) of the quantized AC coefficients of the B-frame DCT coefficients, and then subtracting a similarly obtained pdf for the reference frame (e.g., the anchor frame used by the motion compensation module). In the exemplary embodiment a pdf is kept for each of 63 AC coefficients within the DCT. Since the Laplacian distribution is a good approximation of the actual pdf for an AC coefficient, the exemplary embodiment uses such a Laplacian distribution as the preferred pdf for fitting the data. However, it will be recognized by those skilled in the art that other distribution functions may also be employed in practicing the invention. Moreover, any one of several methods may be used to fit the data into the below-described Laplacian pdf.

As illustrated in equation 1, the pdf (i.e., pdf(x)) is estimated by computing the .alpha. that minimizes the difference between the actual probability of each quantization level (computed from the received data) and the probability of that quantization level if the pdf were the estimated pdf. ##EQU1##

That is, as illustrated in equation 2, the difference is minimized with respect to .alpha., where the summation is over all quantization bins B.sub.i, p(B.sub.i) is the relative frequency of that bin in the received signal, and B.sub.ilow and B.sub.ihigh are the lower and upper limits of the quantization bin (i.e., the lowest and highest values that would be quantized to that value). ##EQU2##
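The fitting described above may be sketched as follows. This is a hedged illustration and not a reproduction of the equations of the specification: it assumes the common Laplacian form pdf(x) = (alpha/2) exp(-alpha |x|), measures the mismatch as a sum of absolute differences over the bins, and searches a fixed grid of candidate values. The function names, the bin layout and the grid are hypothetical.

```python
import math


def laplacian_cdf(x, alpha):
    """Cumulative distribution of the assumed pdf (alpha / 2) * exp(-alpha * |x|)."""
    if x < 0:
        return 0.5 * math.exp(alpha * x)
    return 1.0 - 0.5 * math.exp(-alpha * x)


def bin_probability(low, high, alpha):
    """Probability that a Laplacian sample falls between the bin limits."""
    return laplacian_cdf(high, alpha) - laplacian_cdf(low, alpha)


def fit_alpha(bins, frequencies, candidates):
    """Pick the candidate alpha whose bin probabilities best match the observed frequencies."""
    def mismatch(alpha):
        return sum(abs(p - bin_probability(low, high, alpha))
                   for (low, high), p in zip(bins, frequencies))
    return min(candidates, key=mismatch)


# Hypothetical uniform quantizer with step 4 and bins centered on multiples of the step.
step = 4.0
bins = [((k - 0.5) * step, (k + 0.5) * step) for k in range(-5, 6)]
# Synthetic "received" relative frequencies generated from a known alpha.
true_alpha = 0.25
frequencies = [bin_probability(low, high, true_alpha) for low, high in bins]
candidates = [i / 100 for i in range(1, 201)]
alpha_hat = fit_alpha(bins, frequencies, candidates)
```

With the synthetic frequencies used here the search recovers the value from which they were generated. With real data a finer search or a closed-form estimate may be preferable.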

Assuming the original (unquantized) pdf follows the estimated distribution, the pdf of the error is computed by adding the pdf of the difference between the original and the reconstruction level for each quantization bin, as illustrated in equation 3. ##EQU3## where ##EQU4## and where rec(B.sub.i) is the reconstruction level for bin B.sub.i.

The error_pdf is then compared to a (similarly obtained) error pdf for the reference frame, and an appropriate amount of noise is added to the low noise frame to reduce the breathing artifact. The added noise has a similar (or smaller) variance, and may also have the same pdf as the error pdf.

At step 410 the characterized B-frame noise of step 408 is scaled to an appropriate level depending upon the quantization parameter associated with the I-frame Q.sub.I. That is, the previously determined pdf is scaled according to the ratio of Q.sub.B /Q.sub.I to produce a scaled pdf.

The routine 400 then proceeds to step 412, where the scaled characterized noise is injected into the I-frame. In one embodiment of the invention this injection of noise is realized by injecting a random number bounded by the scaled quantization level determined at step 410. In this manner the effective quantization parameter Q.sub.I associated with the I-frame is raised to a value within Q.sub.TH of the quantization parameter associated with the B-frames Q.sub.B.
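The noise injection of step 412 may be sketched as follows. The sketch makes several assumptions that are not stated in the specification: the noise is uniform, it is applied directly to 8-bit samples, and the bound is a hypothetical base amplitude multiplied by the ratio Q.sub.B /Q.sub.I. All names and values are illustrative only.

```python
import random


def inject_noise(samples, bound, rng):
    """Add uniform noise in [-bound, +bound] to each sample, clamped to 0..255."""
    noisy = []
    for sample in samples:
        value = sample + rng.uniform(-bound, bound)
        noisy.append(min(255, max(0, round(value))))
    return noisy


q_i, q_b = 5.0, 14.0
base_amplitude = 1.0                   # hypothetical amplitude characterized at step 408
bound = base_amplitude * (q_b / q_i)   # scaled by the ratio Q_B / Q_I as in step 410
rng = random.Random(0)                 # seeded so that the sketch is repeatable
i_frame_row = [0, 16, 128, 200, 255]
noisy_row = inject_noise(i_frame_row, bound, rng)
```

Because each injected value is bounded, no sample of the I-frame moves by more than the scaled bound (plus rounding), which keeps the added degradation tied to the quantization difference being equalized.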

It must be noted that the routine 400 of FIG. 4 may be adapted to simultaneously or individually address quantization level variations between I-frames and P-frames and P-frames and B-frames. Moreover, though the routine 400 is depicted as operating on each GOP individually, the inventors contemplate that the routine 400 may be modified to operate on a plurality of GOPs. For example, in the case of a constant bitrate compressed video stream IN, the average bit budget for each GOP formed by an encoder is substantially the same. However, in the case of a variable bitrate information stream, the bit budget for each GOP formed by the encoder may vary as more or less bandwidth becomes available for use by the encoder. In the variable bitrate case, an additional variable is used to ensure that the average quantization parameter of a GOP (Q.sub.GOP) does not vary from the average quantization parameter of one or more preceding and succeeding GOPs by more than a threshold amount.
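The additional constraint contemplated for the variable bitrate case may be sketched as follows. The sketch assumes that the average quantization parameters of the neighboring GOPs are available and that the constraint is applied as a simple clamp around their mean. The function name and the values are hypothetical.

```python
def constrain_gop_parameter(q_gop, neighbour_parameters, threshold):
    """Keep Q_GOP within the threshold of the mean of the neighboring GOP parameters."""
    reference = sum(neighbour_parameters) / len(neighbour_parameters)
    return min(max(q_gop, reference - threshold), reference + threshold)


# Hypothetical neighbors with average parameters 10 and 12, threshold 3.
print(constrain_gop_parameter(20.0, [10.0, 12.0], 3.0))  # 14.0
print(constrain_gop_parameter(11.0, [10.0, 12.0], 3.0))  # 11.0
```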

FIG. 5 depicts a flow diagram of a GOP equalization routine 500 according to the invention. Specifically, the GOP equalization routine 500 of FIG. 5 is directed toward reducing breathing artifacts in a video signal caused by successive half pel interpolations of predicted frames within a GOP, as described above with respect to FIG. 1.

The routine 500 is entered at step 502, when, e.g., the video decoder 210 of FIG. 2 begins to receive the compressed video information stream IN. The routine 500 then proceeds to step 504, where a sharpness function is determined using the GOP structure. That is, an estimation of the propagated degradation in image sharpness is made based upon the types of frames within the GOP and the number of frames within the GOP. For example, a GOP comprising all I-frames will suffer no sharpness degradation due to half pel interpolations (since no predictions are made). Similarly, a GOP comprising a single I-frame followed by a plurality of P-frames and B-frames will suffer some level of sharpness degradation due to half pel interpolations. The sharpness function determined at step 504 is an estimate of the amount of such sharpness degradation as a function of frame position and/or frame type within the GOP. The routine 500 then proceeds to step 506.
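One possible form of the sharpness function of step 504 is sketched below. The model is an assumption made purely for illustration: each chained prediction is taken to multiply the relative sharpness by a constant factor, a P-frame is taken to be predicted from the preceding anchor frame, and a B-frame is taken to lie one prediction beyond the deeper of its surrounding anchor frames. The function names and the retention factor are hypothetical.

```python
def prediction_depths(gop):
    """Number of chained predictions behind each frame of a display-order GOP string."""
    depths = [None] * len(gop)
    anchor_depth = None
    for index, frame_type in enumerate(gop):
        if frame_type == "I":
            anchor_depth = 0
            depths[index] = 0
        elif frame_type == "P":
            if anchor_depth is None:
                raise ValueError("P-frame without a preceding anchor frame")
            anchor_depth += 1
            depths[index] = anchor_depth
    for index, frame_type in enumerate(gop):
        if frame_type == "B":
            before = [depths[j] for j in range(index) if gop[j] in "IP"]
            after = [depths[j] for j in range(index + 1, len(gop)) if gop[j] in "IP"]
            neighbours = before[-1:] + after[:1]
            if not neighbours:
                raise ValueError("B-frame without any anchor frame")
            depths[index] = max(neighbours) + 1
    return depths


def sharpness_function(gop, retention):
    """Relative sharpness per frame, assuming a fixed retention factor per prediction."""
    return [retention ** depth for depth in prediction_depths(gop)]


print(prediction_depths("IBBPBBP"))    # [0, 2, 2, 1, 3, 3, 2]
print(sharpness_function("III", 0.9))  # [1.0, 1.0, 1.0]
```

Consistent with the text above, a GOP comprising only I-frames shows no sharpness loss in this model, while frames later in a predicted GOP show progressively more.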

At step 506 equalizing filter parameters are calculated. The equalizing filter parameters comprise low pass and/or high pass filter parameters associated with one or more frames within the GOP. That is, a baseline level of sharpness is selected (e.g., a minimum sharpness level as defined by the most severely sharpness degraded frame, a median sharpness level defined by an average sharpness level of all the frames, or a particular frame type, within the GOP and the like). The equalizing filter parameters are determined for each frame within the GOP based on the position of the frame and/or frame type such that the relative difference in sharpness between frames within the GOP is reduced to below a threshold level, as previously described with respect to FIG. 1. The routine 500 then proceeds to step 508.

At step 508 the calculated equalizing filter parameters are used to equalize, as necessary, one or more of the frames within the GOP. The routine 500 then proceeds to step 510 where a query is made as to whether additional GOPs are to be processed. If the query at step 510 is answered negatively, then the routine 500 proceeds to step 512 and exits. If the query at step 510 is answered affirmatively, then the routine 500 proceeds to step 504, where a sharpness function related to the next GOP is calculated. That is, the routine 500 is repeated for each subsequent GOP. In a case of a fixed GOP structure, the sharpness function determined at step 504 and the equalizing filter parameters calculated at step 506 may be reused for each subsequent GOP.

In one embodiment of the invention "training sequences" are utilized to characterize the errors imparted to a sequence of decoded images by the decoder itself. For example, if a distortion within a decoded GOP is introduced by a known factor (e.g., by the motion interpolation filter in the receiver or decoder), a pre-emphasis filter is computed to compensate for the known factor. To identify the distortion, a spectrum analysis function is utilized. That is, the actual spectral characteristic(s) of a training sequence comprising one or more "typical" decoded video frames are compared to an "expected" spectral characteristic(s). The spectral characteristics of subsequently decoded video frames are then "corrected" based on filters obtained from the training sequences. It should be noted that the "expected" spectral characteristics may be determined with respect to a second decoder.

The transfer function of the spectral correction filter is determined as the ratio of the expected spectrum and actual spectrum as a function of frequency. This is the desired frequency response of the signal. The filter may be implemented as, e.g., a finite impulse response (FIR) or infinite impulse response (IIR) filter. For example, in one embodiment of the invention, coefficients for an FIR filter substantially satisfying the determined transfer function are calculated, inverted and windowed to a desired number of taps. The resulting FIR filter is used to process each decoded frame such that the spectral characteristics of the decoded frame approach the previously-described "expected" spectral characteristics, thereby equalizing the decoded video output signal.
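The derivation of an FIR correction filter from the ratio of spectra may be sketched as follows. The sketch works in one dimension, reads "inverted" as an inverse discrete Fourier transform, and uses a Hamming window. Each of these is an assumption made for illustration, as are the function name and the sample spectra.

```python
import math


def correction_fir(expected, actual, taps):
    """FIR taps approximating the response expected[k] / actual[k] (zero-phase sketch)."""
    n = len(expected)
    if len(actual) != n:
        raise ValueError("spectra must have the same number of bins")
    if taps < 3 or taps % 2 == 0 or taps > n:
        raise ValueError("taps must be odd, at least 3 and at most the number of bins")
    if any(a <= 0 for a in actual):
        raise ValueError("actual spectrum must be strictly positive")
    response = [e / a for e, a in zip(expected, actual)]
    # Inverse DFT restricted to the cosine terms, which yields a real, symmetric impulse response.
    impulse = [sum(response[k] * math.cos(2 * math.pi * k * m / n) for k in range(n)) / n
               for m in range(n)]
    half = taps // 2
    centered = [impulse[m % n] for m in range(-half, half + 1)]
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (taps - 1)) for i in range(taps)]
    return [c * w for c, w in zip(centered, window)]


# Hypothetical magnitude spectra on 16 bins; the actual spectrum is attenuated at high frequencies.
expected = [1.0] * 16
actual = [1.0, 1.0, 1.0, 1.0, 0.8, 0.8, 0.5, 0.5, 0.5, 0.5, 0.5, 0.8, 0.8, 1.0, 1.0, 1.0]
fir = correction_fir(expected, actual, taps=7)
```

When the actual spectrum already equals the expected spectrum, the sketch returns a filter that is a unit impulse to within rounding error, i.e., the decoded frame is left unchanged.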

FIG. 6 depicts a flow diagram of a degradation equalization routine 600 according to the invention. Specifically, the degradation equalization routine 600 of FIG. 6 is directed towards reducing artifacts imparted to a sequence of decoded images by, e.g., anomalies within the decoder itself. The routine first characterizes the decoder using a training sequence. The training sequence comprises a "gold code" sequence of encoded images that have previously been processed using a "known good" decoder to produce an output image sequence having an associated spectral characteristic. This associated spectral characteristic of the known good decoded gold code is denoted as an "expected spectrum."

The routine 600 is entered at step 602, and proceeds to step 604 where a training sequence is run. The running of the training sequence comprises the application to the decoder input (via, e.g., a switch or multiplex unit) of the known "gold code" sequence of encoded images (e.g., an MPEG-like compressed video information stream) such that a decoded training information stream is produced. The routine 600 then proceeds to step 606, where the decoded training information stream is spectrally analyzed to produce a training sequence spectrum. The routine 600 then proceeds to step 608.

At step 608 the training sequence spectrum computed at step 606 is compared to the expected spectrum (i.e., the previously calculated spectrum of the "gold code" video training sequence as produced by a known good decoder). Since, broadly speaking, the differences between the training sequence spectrum and the expected spectrum are primarily caused by differences in decoder behavior, spectral anomalies introduced into the decoded video information stream by the decoder may be isolated from spectral anomalies introduced into the decoded video stream by virtue of half pel interpolation, or other non-decoder error sources. The routine 600 then proceeds to step 610.

At step 610 a query is made as to whether the GOP structure of the video information stream to be decoded is known (i.e., the non-training compressed video sequence). If the query at step 610 is answered affirmatively, then the routine 600 proceeds to step 612 where a GOP equalizing filter is computed based upon the spectral variances between the training sequence GOP and a spectrally analyzed GOP from the compressed video information stream to be decoded. That is, spectral variations between the "gold code" GOP and the received GOP are compared and used to compute an equalizing filter such that the spectral variances between these GOPs are reduced to a level below a threshold level. The routine 600 then proceeds to step 614, where the computed GOP equalizing filter is applied to the received GOP.

The routine 600 then proceeds to step 616, where a query is made as to whether more GOPs are to be processed. If the query at step 616 is answered affirmatively, then the routine proceeds to step 614 where the computed filter is applied to the next GOP. Optionally, the routine may proceed to step 612, where a spectral analysis and comparison function is performed on the next GOP to be processed, resulting in a new GOP equalizing filter calculation. If the query at step 616 is answered negatively, the routine 600 proceeds to step 618 and exits.

If the query at step 610 is answered negatively, then the routine 600 proceeds to step 620, where spectral comparisons between frame types within the "gold code" training sequence and the received compressed video information stream are made. That is, spectral variations between, e.g., P-frames within the gold code training sequence and P-frames within the received video information stream are compared. Similarly, spectral variations between I-frames and B-frames may also be compared. The routine 600 then proceeds to step 622.

At step 622 one or more frame-type equalizing filters based upon frame spectral variances are computed. That is, spectral variations between, e.g., P-frames of the gold code training sequence and P-frames of the received information stream are used to compute an equalizing filter which, when applied to the P-frames within the received information stream, will reduce the spectral variations to below a threshold level. The filter selection made for a particular frame is also related to the number of I-frames and P-frames preceding the particular frame within the GOP. For example, since the magnitude of a prediction error increases as the prediction error is propagated to later frames within a GOP, the filter utilized for later frames within the GOP is selected to impart a correspondingly greater amount of change to the underlying signal. The routine 600 then proceeds to step 624.

At step 624, the frame-type filters computed at step 622 are applied to each appropriate frame within a received GOP. The routine 600 then proceeds to step 626, where a query is made as to whether more frames are to be processed. If the query at step 626 is answered negatively, then the routine proceeds to step 630 and exits. If the query at step 626 is answered affirmatively, then the routine proceeds to step 628, where the next frame is selected for processing, and to step 624, where the selected frame is filtered according to the appropriate frame-type filter computed at step 622. That is, the next frame that is selected (i.e., an I-frame, P-frame or B-frame) is filtered according to an I-frame spectral equalizing filter, a P-frame spectral equalizing filter or a B-frame spectral equalizing filter, as previously computed at step 622.

The above-described embodiments of the invention are directed toward decoders having sufficient memory resources to store an entire GOP for processing. However, where memory constraints prevent the storage and processing of an entire GOP structure, the inventors contemplate several methods for reducing the aforementioned breathing artifact.

In a first method according to the invention and directed to memory constrained decoders, a single frame within the GOP is selected as the "fidelity reference" frame. The other frames within the GOP are degraded or enhanced as necessary to conform to a level of fidelity within a threshold level of the fidelity reference frame. This method will be described in more detail below with respect to FIG. 7.

In a second method according to the invention and directed to memory constrained decoders, the characteristics of a previously decoded GOP are used to adapt a GOP presently being decoded. That is, the characteristics of a first GOP are determined and stored as that first GOP is processed. For example, the GOP structure itself, quantization levels utilized in the GOP and other such fidelity-related parameters are stored. A second GOP, following the first GOP, is processed in a manner utilizing the stored parameters determined with respect to the first GOP. Since these parameters typically do not change much between, e.g., GOPs including images representing a common scene, the assumptions are useful. Moreover, even in the event of a scene change, any fidelity differences between the last GOP of a first scene and the first GOP of a second scene will not be noticeable in view of the actual scene change (i.e., the large visual discontinuity caused by the scene change will tend to mask any breathing artifact due to inaccurate GOP parameter predictions). These methods according to the invention and directed to memory constrained decoders provide an advantage in that there is no need to wait for a full GOP to be received prior to decoding the GOP.

FIG. 7 depicts a flow diagram of a degradation equalization routine 700 according to the invention and suitable for reducing breathing artifacts within the context of a memory-constrained decoder. The routine 700 is entered at step 702, when, e.g., the video decoder 210 of FIG. 2 begins to receive the compressed video information stream IN. The routine 700 then proceeds to step 704, where a portion of the GOP is stored in memory. Specifically, a portion of the GOP including at least a predetermined reference frame (e.g., the first anchor frame after an initial I-frame) is stored in memory. Alternatively, only the predetermined reference frame is stored in memory. The routine 700 then proceeds to step 706.

At step 706 the stored reference frame is characterized in the manner previously described by noting, e.g., the quantization step size, position within the GOP, frame type and other parameters. The reference frame characterization results in an approximate baseline level of fidelity that other frames within the GOP will be enhanced and/or degraded to achieve. That is, in the case of an I-frame having significantly greater fidelity than the selected reference frame, the I-frame will be processed in a manner that degrades the fidelity somewhat such that fidelity differences between the I-frame and the selected reference frame are reduced to below a threshold level. Similarly, if the fidelity level of the selected reference frame is significantly higher than, e.g., one or more B-frames within the GOP, the B-frames will be filtered in a manner tending to result in an apparent increase in fidelity. For example, the B-frames may be subjected to high-frequency enhancement to help offset a low pass filtering induced by half pel interpolation within the decoder. The routine 700 then proceeds to step 708.

At step 708 the frames within the GOP are processed according to the characterized parameters of the reference frame. Specifically, in step 708 any frames stored along with the reference frame are selectively enhanced or degraded according to the reference frame characteristics determined at step 706. The routine 700 then proceeds to step 710, where a query is made as to whether more frames exist within the GOP to be processed.

If the query at step 710 is answered affirmatively, then the routine proceeds to step 716, where the next frame within the GOP is processed per the reference frame characteristics determined at step 706. The routine 700 then proceeds to step 710.

If the query at step 710 is answered negatively, then the routine proceeds to step 712, where a query is made as to whether more GOPs are to be processed. If the query at step 712 is answered negatively, then the routine 700 proceeds to step 714 and exits. If the query at step 712 is answered affirmatively, then the routine 700 proceeds to step 704, where the routine 700 is repeated for the next GOP. It must be noted that since GOP to GOP characteristics change relatively slowly (except for, e.g., scene cuts), the next GOP may be optionally processed using the characterized reference frame from the preceding GOP.
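The per-frame decision of routine 700 may be sketched as follows. The sketch assumes that fidelity is represented by an average quantizer scale (a lower value meaning higher fidelity) and that the reference frame is the first anchor frame after the I-frame. The function name, the action labels and the values are hypothetical.

```python
def plan_action(frame_q, reference_q, threshold):
    """Decide how a frame is treated relative to the fidelity reference frame."""
    if frame_q < reference_q - threshold:
        return "degrade"   # frame is noticeably finer than the reference
    if frame_q > reference_q + threshold:
        return "enhance"   # frame is noticeably coarser than the reference
    return "keep"


# Hypothetical GOP in display order with an average quantizer scale per frame.
gop = [("I", 4.0), ("B", 14.0), ("B", 15.0), ("P", 8.0), ("B", 9.0)]
reference_q = 8.0          # the P-frame serves as the fidelity reference
plan = [(frame_type, plan_action(q, reference_q, threshold=2.0)) for frame_type, q in gop]
print(plan)
# [('I', 'degrade'), ('B', 'enhance'), ('B', 'enhance'), ('P', 'keep'), ('B', 'keep')]
```

Only the reference value needs to be retained while the remaining frames are processed one at a time, which is the property relied upon for the memory constrained decoder.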

In another embodiment of the invention, each GOP is processed according to an average statistical measure of the previous GOP. That is, each GOP is characterized using one or more of the above-described techniques as the GOP is being processed. The results of that characterization are used to process the next GOP. In this manner, the amount of memory required is greatly reduced. Moreover, since the potential inter-GOP fidelity error will be corrected within one GOP, there will be little, if any, breathing artifacts between the last GOP of a scene and the first GOP of a following scene.
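The use of statistics from the previous GOP may be sketched as follows. The sketch assumes that the statistical measure is the mean of the per-frame quantization parameters and that no target is available for the first GOP. The function name and the values are hypothetical.

```python
def targets_from_previous_gop(gops, initial_target=None):
    """For each GOP, return the target derived from the mean of the preceding GOP."""
    targets = []
    previous_mean = initial_target
    for gop in gops:
        targets.append(previous_mean)            # processing uses only the stored statistic
        previous_mean = sum(gop) / len(gop)      # characterize this GOP for the next one
    return targets


# Hypothetical per-frame quantization parameters for three consecutive GOPs.
print(targets_from_previous_gop([[4, 8], [6, 10], [5, 5]]))  # [None, 6.0, 8.0]
```

Only a single number is carried from one GOP to the next in this sketch, which illustrates why the memory requirement is small.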

In another embodiment of the invention, error correction and/or equalization is imparted to the decoded video stream on a block by block basis. Advantageously, block level processing provides greatly enhanced control of the equalization process. For example, high pass filtering is appropriately applied to blocks where half pel interpolation was used, but not to blocks having integer motion vectors. Blocks having integer motion vectors typically do not include the previously discussed half pel error component and, therefore, are not degraded in the previously discussed low pass filtering manner such that an equalizing high pass filtering is warranted. Additionally, block by block processing provides greater control of the equalization process such that, e.g., the controller can ensure that an equalization or correction imparted to a block never exceeds the quantization step, thereby avoiding the introduction of excessive distortion into a resulting image.
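The block level selection described above may be sketched as follows. The sketch assumes that motion vectors are expressed in half pel units, so that an odd component indicates half pel interpolation, and that a correction is limited by clamping it to one quantization step. The function names and the sample vectors are hypothetical.

```python
def uses_half_pel(motion_vector):
    """True when either component, expressed in half pel units, is odd."""
    dx, dy = motion_vector
    return dx % 2 != 0 or dy % 2 != 0


def blocks_to_sharpen(motion_vectors):
    """Indices of the blocks that would receive the equalizing high pass filtering."""
    return [index for index, mv in enumerate(motion_vectors) if uses_half_pel(mv)]


def limit_correction(correction, quant_step):
    """Clamp a correction so that it never exceeds the quantization step."""
    return max(-quant_step, min(quant_step, correction))


# Hypothetical motion vectors in half pel units, one per block.
motion_vectors = [(0, 0), (3, 2), (-4, 6), (2, -1)]
print(blocks_to_sharpen(motion_vectors))  # [1, 3]
print(limit_correction(10, 4))            # 4
```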

Finally, it should be noted that a "copy" of the original reference frame, as decoded by the decoder, may be stored and used as a reference to limit drift to the desired maximal level, such as one quantizer step.

Thus, in one embodiment of the invention, an "enhanced" reference frame (i.e., a reference frame comprising blocks having a maximal level of fidelity) is formed by adding correction information to the "original" or "nominal" reference frame. The enhanced reference frame is then stored along with the original reference frame. Each time a correction is made to avoid the breathing artifact, it is desirable to examine the correction so that the correction itself does not introduce a new artifact. This may be done by limiting the difference between the enhanced reference frame and the original reference frame to less than one quantization step when the DCT of the difference is computed. Thus, in one embodiment of the invention, a DCT of the difference between a block within the enhanced reference frame (i.e., a "corrected" block) and a corresponding block within the original frame is computed. If this difference exceeds a threshold level, illustratively one quantization step, then the original block is used or, optionally, the corrected macroblock is reprocessed. Additionally, this process of controlling the correction imparted to enhanced blocks is used for non-reference frames.

FIG. 8 depicts a flow diagram of a block level correction control routine 800 according to the invention. The routine 800 is entered at step 802, and proceeds to step 804, where a block or macroblock is corrected. The routine 800 then proceeds to step 806, where the corrected macroblock or block is compared to a corresponding uncorrected macroblock or block, to produce a difference quantity. The routine 800 then proceeds to step 808, where a discrete cosine transform (DCT) is performed on the difference between the corrected and uncorrected macroblock or block. The routine 800 then proceeds to step 810, where a query is made as to whether the difference, as expressed by the DCT performed at step 808, is greater than one quantizer step. If the query at step 810 is answered negatively, then the routine proceeds to step 812 and exits. If the query at step 810 is answered affirmatively, then the routine proceeds to step 814, where a query is made as to whether the block or macroblock should be reprocessed.

If the query at step 814 is answered affirmatively, then the routine 800 proceeds to step 804, where the block or macroblock is reprocessed or re-corrected. The reprocessing or re-correction at step 804 utilizes the additional information of the DCT quantizer step size and DCT information formed at step 808.

If the query at step 814 is answered negatively, then the routine 800 proceeds to step 816, where the uncorrected block or macroblock is used instead of the corrected block or macroblock. The routine 800 then proceeds to step 812 and exits.
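The control flow of routine 800 may be illustrated by the following minimal sketch, which is offered for explanation only and is not the implementation of the invention. Several details are assumptions not found in the description above: the block is 8x8, a single scalar `quant_step` stands in for the quantizer step of every DCT coefficient, the correction is supplied by a hypothetical caller-provided function `correct(block, strength)`, the reprocessing query of step 814 is modeled by a pass limit `max_passes`, and reprocessing simply halves the correction strength.

```python
import math

N = 8  # assumed block size (8x8), chosen for illustration


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * acc
    return out


def exceeds_one_step(corrected, original, quant_step):
    """Steps 806-810: form the difference, transform it, test the threshold."""
    diff = [[c - o for c, o in zip(crow, orow)]
            for crow, orow in zip(corrected, original)]
    coeffs = dct2(diff)
    # Assumption: one scalar step size applies to every coefficient.
    return any(abs(c) > quant_step for row in coeffs for c in row)


def control_correction(original, correct, quant_step, max_passes=2):
    """Return the block to use after the FIG. 8 style check.

    `correct` is a hypothetical function (block, strength) -> corrected block.
    """
    strength = 1.0
    for _ in range(max_passes):
        corrected = correct(original, strength)          # step 804
        if not exceeds_one_step(corrected, original, quant_step):
            return corrected                             # step 810 "no" -> 812
        strength *= 0.5      # step 814 "yes": reprocess (assumed: weaker pass)
    return original                                      # step 816: fall back
```

In this sketch, a correction that adds a uniform offset to a flat block changes only the DC coefficient of the difference, so the correction is accepted, weakened, or discarded depending solely on how that one coefficient compares with `quant_step`.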

Another block-based embodiment of the invention is similar to that described above with respect to FIG. 3 or FIG. 4, except that the relative degradation levels are determined with respect to individual blocks, and the individual blocks are processed in a manner addressing quantization and/or half pel prediction degradations.

Another block-based embodiment of the invention is similar to that described above with respect to FIG. 7, except that the blocks are processed to conform to, e.g., the average fidelity levels of the blocks in a particular frame type or intra-GOP frame position.

In one embodiment of the invention, the correction filter is computed in an encoder and transmitted to the decoder along with the encoded video stream. The correction filter may comprise an auxiliary stream, an enhancement layer stream or be included within a user data field. The decoder receives and utilizes the correction filter data to filter and enhance the picture in a manner similar to that previously described. However, since in this embodiment the encoder is responsible for computing a correction factor, the correction factor will be much more accurate. It should be noted that in the case of a decoder not utilizing the encoder-provided correction factor, the amount of correction is relatively small (though sufficient to attenuate breathing artifacts) and the non-utilizing decoder will still be able to decode the received compressed video stream.

The present invention can be embodied in the form of computer-implemented processes and apparatuses for practicing those processes. The present invention also can be embodied in the form of computer program code embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other computer readable storage medium, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. The present invention can also be embodied in the form of computer program code, for example, whether stored in a storage medium, loaded into and/or executed by a computer, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. When implemented on a general-purpose microprocessor, the computer program code segments configure the microprocessor to create specific logic circuits.

Although various embodiments which incorporate the teachings of the present invention have been shown and described in detail herein, those skilled in the art can readily devise many other varied embodiments that still incorporate these teachings.

* * * * *