Image processing apparatus, image processing method, image processing program, and recording medium Sugiyama, Akira ; et al. [Sugiyama, Akira]

Patent Application Summary

U.S. patent application number 10/473441 was filed with the patent office on 2004-07-08 for image processing apparatus, image processing method, image processing program, and recording medium. Invention is credited to Sugiyama, Akira, Togashi, Haruo.

Application Number	20040131116 10/473441
Family ID	26612508
Filed Date	2004-07-08

United States Patent Application	20040131116
Kind Code	A1
Sugiyama, Akira ; et al.	July 8, 2004

Image processing apparatus, image processing method, image processing program, and recording medium

Abstract

In the case that the amount of code generated in one frame is controlled and the picture quality is optimized using adaptive quantization, the adaptive quantization can be accurately performed even if a scene change occurs between two frames. When data for eight lines that compose a macro block of only a first field of an N-th picture is stored in a buffer, the calculation of activities of macro blocks of the first field is started. From the activities of the first field, an average activity is calculated. The average activity of the first field is applied to the first field and the second field of the N-th picture. As a result, a normalized activity of the N-th picture is calculated. From the normalized activity, a quantizer scale mquant that takes a visual characteristic into consideration is calculated. Thus, the N-th picture is adaptively quantized. Since the normalized activity is calculated from the picture's own data, the picture quality of the following picture does not deteriorate even if a scene change occurs. In addition, since only the first field is used for the calculation, the system delay can be kept small and the capacity of the buffer memory can be reduced.

Inventors:	Sugiyama, Akira; (Kanagawa, JP) ; Togashi, Haruo; (Kanagawa, JP)
Correspondence Address:	William S Frommer Frommer Lawrence & Haug 745 Fifth Avenue New York NY 10151 US
Family ID:	26612508
Appl. No.:	10/473441
Filed:	September 26, 2003
PCT Filed:	March 28, 2002
PCT NO:	PCT/JP02/03060

Current U.S. Class:	375/240.03 ; 348/439.1; 375/240.24; 375/E7.129; 375/E7.139; 375/E7.14; 375/E7.143; 375/E7.15; 375/E7.156; 375/E7.163; 375/E7.176; 375/E7.181; 375/E7.199; 375/E7.211; 375/E7.217; 375/E7.274
Current CPC Class:	H04N 19/172 20141101; H04N 19/112 20141101; H04N 19/126 20141101; H04N 21/23602 20130101; H04N 19/46 20141101; H04N 19/15 20141101; H04N 19/124 20141101; H04N 21/4342 20130101; H04N 19/176 20141101; H04N 19/61 20141101; H04N 19/70 20141101; H04N 19/122 20141101; H04N 19/137 20141101
Class at Publication:	375/240.03 ; 348/439.1; 375/240.24
International Class:	H04N 007/12

Foreign Application Data

Date	Code	Application Number
Mar 29, 2001	JP	2001-95298
May 25, 2001	JP	2001-156818

Claims

1. A picture processing apparatus, comprising: average activity calculating means for calculating an average activity with a first field of picture data; normalized activity calculating means for applying the average activity calculated with the first field by the average activity calculating means to the first field and a second field of the same frame as the first frame so as to calculate a normalized activity; and quantizing means for quantizing the first field and the second field with the normalized activity calculated by the normalized activity calculating means.

2. The picture processing apparatus as set forth in claim 1, wherein the average activity calculating means starts calculating the activity when minimum picture data for contracting one of blocks into which the picture data is divided is input, cumulates the activity in the first field, and calculates the average activity.

3. The picture processing apparatus as set forth in claim 1, wherein the activity calculating means calculates the activity with data corresponding to a macro block of only the first field, the macro block being dealt in the MPEG.

4. A picture processing method, comprising the steps of: calculating an average activity with a first field of picture data; applying the average activity calculated with the first field at the average activity calculating step to the first field and a second field of the same frame as the first frame so as to calculate a normalized activity; and quantizing the first field and the second field with the normalized activity calculated at the normalized activity calculating step.

5. A picture processing program for causing a computer apparatus to execute a picture processing method for quantizing picture data, the picture processing method, comprising the steps of: calculating an average activity with a first field of picture data; applying the average activity calculated with the first field at the average activity calculating step to the first field and a second field of the same frame as the first frame so as to calculate a normalized activity; and quantizing the first field and the second field with the normalized activity calculated at the normalized activity calculating step.

6. A recording medium on which a picture processing program has been recorded, the picture processing program for causing a computer apparatus to execute a picture processing method for quantizing picture data, the picture processing method, comprising the steps of: calculating an average activity with a first field of picture data; applying the average activity calculated with the first field at the average activity calculating step to the first field and a second field of the same frame as the first frame so as to calculate a normalized activity; and quantizing the first field and the second field with the normalized activity calculated at the normalized activity calculating step.

Description

TECHNICAL FIELD

[0001] The present invention relates to a picture processing apparatus, a picture processing method, a picture processing program, and a recording medium for controlling a generated code amount in compression encoding for a picture signal using quantization of each block of the picture signal so that the code amount of each frame does not exceed a predetermined amount.

BACKGROUND ART

[0002] In a conventional picture data compression-encoding system, picture data is quantized in the unit of one block that is composed of a predetermined number of pixels. For example, in the MPEG2 (Moving Pictures Experts Group 2), such a compression-encoding system is used. In the MPEG2, picture data is transformed in the unit of one block that is composed of a plurality of pixels by the DCT (Discrete Cosine Transform) method. The obtained DCT coefficients are quantized. As a result, the picture data has been compression-encoded. In the MPEG2, a quantizer step is designated with a quantizer scale.

[0003] In a conventional picture quality optimizing method using the compression-encoding of the MPEG2, an activity that is a coefficient that represents the complexity and smoothness of a picture to be compressed is calculated. With an adaptive quantization based on the activity, the picture quality is optimized.

[0004] This method is performed as follows. In a simple and smooth picture region, when the compressing process is performed, the deterioration of the picture quality is remarkable. In this region (referred to as plane region), the picture data is finely quantized with a quantizer scale whose quantizer step is small. In contrast, in a complicated picture region, when the compressing process is performed, the deterioration of the picture quality is not remarkable. In this region, the picture data is coarsely quantized with a quantizer scale whose quantizer step is large. Thus, with a limited code amount, the picture quality can be effectively optimized.

[0005] When picture data is compressed, as described above, each picture region is divided into pixel blocks each having a predetermined size. The picture data is DCT-transformed and quantized in the unit of one pixel block. The MPEG2 standard prescribes a block of eight pixels × eight lines as the minimum process unit. In addition, the MPEG2 standard prescribes that each block of eight pixels × eight lines should be DCT-transformed and that the resultant DCT coefficients are quantized in the unit of a macro block of 16 pixels × 16 lines.

[0006] On the other hand, although the MPEG2 standard does not clearly prescribe the unit for calculating the foregoing activity, the MPEG2 TM5 (Test Model 5) has proposed that the activity should be processed in the unit of one sub block of eight pixels × eight lines, which is the same as one DCT block.

[0007] Next, the activity calculating method in "adaptive quantization considering visual characteristic" that has been proposed in the MPEG2 TM5 will be described.

[0008] The adaptive quantization is to vary a quantizer scale Qj that depends on the state of a picture with an activity of each macro block so as to control the generated code amount of for example one frame and improve the picture quality. The quantizer scale Qj is varied with the activity so that in a plane region, where the deterioration of the picture quality is remarkable, the picture data is quantized with the quantizer scale Qj whose quantizer step is small and in a complicated picture region, where the deterioration of the picture quality is not remarkable, the picture data is quantized with the quantizer scale Qj whose quantizer step is large.

[0009] The activity is obtained from pixel values of a luminance signal of an original picture rather than from a prediction error. For example, the activity act_j of the j-th macro block is obtained by calculating the following Formulas (1) to (3) in the reverse order, namely Formula (3), Formula (2), and then Formula (1), over eight sub blocks: four blocks of the frame DCT encoding mode and four blocks of the field DCT encoding mode.

$$\mathrm{act}_j = 1 + \min_{sblk = 1, \ldots, 8}(\mathrm{var\_sblk}) \tag{1}$$

$$\mathrm{var\_sblk} = \frac{1}{64} \sum_{k=1}^{64} (P_k - P_{avg})^2 \tag{2}$$

$$P_{avg} = \frac{1}{64} \sum_{k=1}^{64} P_k \tag{3}$$

[0010] where P_k is a pixel value of a block of a luminance signal of the original picture. In Formula (3), the 64 pixel values of a block of 8 × 8 are summed and the result is divided by 64. As a result, the average value Pavg of the pixel values P_k of the block is obtained. Next, in Formula (2), the difference between each pixel value P_k and the average value Pavg is squared, and the squared differences are averaged over the block. As a result, the average difference value var_sblk of the block of 8 × 8 is calculated. In Formula (1), with the minimum value of the average difference values var_sblk of the eight sub blocks, the activity act_j of the j-th macro block is obtained. The minimum value is used because even if only a part of the macro block contains a plane portion, it is necessary to finely quantize the macro block.
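As an illustration of Formulas (1) to (3), the following minimal Python sketch computes the activity of one macro block. The function names and the exact layout of the eight sub blocks (four quadrants of the macro block, plus four blocks formed after separating even and odd lines) are assumptions made for the example, not details taken from the text.

```python
def block_variance(block):
    """Formulas (3) and (2): mean, then mean squared difference, of an 8x8 block."""
    pixels = [p for row in block for p in row]
    p_avg = sum(pixels) / 64.0                              # Formula (3)
    return sum((p - p_avg) ** 2 for p in pixels) / 64.0     # Formula (2)


def sub_blocks(macro_block):
    """Yield eight 8x8 luminance sub blocks of a 16x16 macro block.

    Assumed layout: four blocks taken from the macro block as it is, and
    four blocks taken after separating even lines from odd lines.
    """
    for top in (0, 8):                       # frame DCT mode: four quadrants
        for left in (0, 8):
            yield [row[left:left + 8] for row in macro_block[top:top + 8]]
    for parity in (0, 1):                    # field DCT mode: lines of one field
        field_rows = macro_block[parity::2]
        for left in (0, 8):
            yield [row[left:left + 8] for row in field_rows]


def activity(macro_block):
    """Formula (1): one plus the smallest sub block variance."""
    return 1.0 + min(block_variance(b) for b in sub_blocks(macro_block))


flat = [[128] * 16 for _ in range(16)]
stripes = [[100 * (col % 2) for col in range(16)] for _ in range(16)]
print(activity(flat), activity(stripes))
```

A completely flat macro block yields the minimum activity of 1.0, whereas the vertically striped macro block has the same large variance in every sub block and therefore a large activity.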

[0011] In the MPEG2 TM5, a normalized activity Nact_j that has values in the range from "0.5" to "2.0" is obtained from the activities act_j of macro blocks according to the following Formula (4).

$$\mathrm{Nact}_j = \frac{2 \times \mathrm{act}_j + \mathrm{avg\_act}}{\mathrm{act}_j + 2 \times \mathrm{avg\_act}} \tag{4}$$

[0012] where "avg_act" represents an average value (average activity) of the activities act_j of the encoded frame that immediately precedes the frame (picture) that is currently being processed.
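The range of Formula (4) can be checked with a short Python sketch. The function name is chosen for the example only; the sample values are arbitrary.

```python
def normalized_activity(act_j, avg_act):
    """Formula (4): normalize a macro block activity against the average activity."""
    return (2.0 * act_j + avg_act) / (act_j + 2.0 * avg_act)


# A macro block whose activity equals the average is left unscaled.
print(normalized_activity(100.0, 100.0))

# Very flat and very complicated macro blocks approach the two limits.
print(normalized_activity(1.0, 1e9))
print(normalized_activity(1e9, 1.0))
```

An activity equal to the average gives exactly 1.0, an activity far below the average approaches 0.5, and an activity far above the average approaches 2.0.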

[0013] A quantizer scale mquant_j that considers a visual characteristic is given by the following Formula (5) corresponding to a quantizer scale Q_j that is obtained for controlling a generated code amount of one frame.

$$\mathrm{mquant}_j = Q_j \times \mathrm{Nact}_j \tag{5}$$

[0014] When each macro block is quantized with such a quantizer scale mquant_j, while the code amount of one whole frame is kept in a predetermined range, each macro block is optimally quantized corresponding to flatness and complexity of a picture of the frame. As a result, while a limited code amount is effectively used, the picture is effectively compressed with the deterioration of the picture quality largely suppressed.
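The effect of Formula (5) can be seen in a small worked example. The sketch below is an illustration only: the quantizer scale Q_j = 10 and the activity values are hypothetical, and any rounding or clipping that a real encoder applies to the result is omitted.

```python
def mquant(q_j, act_j, avg_act):
    """Formulas (4) and (5): scale the rate-control quantizer by the normalized activity."""
    nact_j = (2.0 * act_j + avg_act) / (act_j + 2.0 * avg_act)
    return q_j * nact_j


Q_J = 10.0          # hypothetical quantizer scale from rate control
AVG_ACT = 400.0     # hypothetical average activity

flat_block = mquant(Q_J, 1.0, AVG_ACT)        # plane region
busy_block = mquant(Q_J, 1600.0, AVG_ACT)     # complicated region
print(flat_block, busy_block)
```

With these values the plane macro block receives a quantizer scale close to 5 (fine quantization) and the complicated macro block receives 15 (coarse quantization), which matches the behavior described in paragraph [0008].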

[0015] However, if there is a scene change between two frames, pictures largely change before and after it. Thus, the correlation between the frames is lost.

[0016] FIG. 1A, FIG. 1B, FIG. 1C, and FIG. 1D show an example of a process for obtaining a normalized activity Nact_j according to related art. FIG. 1A shows a frame signal of which one frame is composed of one set of a low region and a high region. FIG. 1B shows a picture that is input. One picture corresponds to one frame. In the interlace scanning, one frame is composed of two fields that are a first field as a first half portion and a second field as a second half portion.

[0017] Next, the case that a normalized activity Nact_j is obtained for an input picture will be described. As was described above, in the MPEG2, to process a picture signal in the unit of one block, an input picture signal that is input in the unit of one line should be converted into blocks so that each block has a predetermined pixel size. In the case that one block is composed of 16 pixels × 16 lines, a first block of the frame can be formed only when data of the first eight lines of the second field of the frame has been input, because the 16 lines of the block are taken alternately from the two fields. Thus, to obtain a normalized activity Nact_j for an input picture, as shown in FIG. 1C, a delay of at least 1 field (0.5 frame) + 8 lines occurs.
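The delay of one field plus eight lines can be reproduced by simulating the order in which interlaced lines arrive. The following Python sketch is only an illustration; the function name and the field height of 240 lines are hypothetical.

```python
def lines_until_first_macro_block_row(lines_per_field, mb_height=16):
    """Count input lines received before frame lines 0..mb_height-1 are all available."""
    # Interlaced arrival order: the first field carries frame lines 0, 2, 4, ...
    # and the second field carries frame lines 1, 3, 5, ...
    arrival = [2 * i for i in range(lines_per_field)]
    arrival += [2 * i + 1 for i in range(lines_per_field)]
    needed = set(range(mb_height))
    for count, frame_line in enumerate(arrival, start=1):
        needed.discard(frame_line)
        if not needed:
            return count
    return None  # the picture is shorter than one macro block


LINES_PER_FIELD = 240  # hypothetical field height
print(lines_until_first_macro_block_row(LINES_PER_FIELD))
```

For a field of 240 lines the result is 248, that is, one whole field followed by the first eight lines of the second field.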

[0018] With the delay of 0.5 frame + 8 lines from the input picture, an average activity avg_act of an (N-1)-th picture is calculated with one frame, namely a first field and a second field, of the (N-1)-th picture. A quantizer scale mquant_j according to the foregoing Formula (5) is calculated with the normalized activity Nact_j obtained with the average activity avg_act of the (N-1)-th picture. With the quantizer scale mquant_j, the next N-th picture is quantized. As a result, an output picture is generated (see FIG. 1D).

[0019] Now, it is considered that a scene change has occurred when an N-th picture has changed to an (N+1)-th picture of input pictures. In this case, as shown in FIG. 1D, the (N+1)-th picture, which has been input after the scene change, is quantized with the quantizer scale mquant_j obtained with the normalized activity Nact_j obtained with the average activity avg_act of the picture (N-th picture), which has been output before the scene change. The resultant quantized picture is output.

[0020] In the method described in the section of the Background Art, to obtain the normalized activity Nact_j, the average activity avg_act of the immediately preceding frame that has been encoded is used. When the correlation between two frames is lost due to for example a scene change, an activity of a frame that has been input after the scene change is normalized with the average activity avg_act obtained with a frame that was input before the scene change. The average activity avg_act used for the normalization therefore differs from the average activity of the frame that is actually normalized. Thus, the activity of the frame that has been input after the scene change cannot be properly normalized. As a result, the picture quality deteriorates.
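The problem can be illustrated numerically. In the following Python sketch the activity values of the two scenes are hypothetical; it compares the normalization of a frame after a scene change with the stale average activity of the preceding frame against the normalization with the frame's own average activity.

```python
def normalized_activity(act_j, avg_act):
    """Formula (4)."""
    return (2.0 * act_j + avg_act) / (act_j + 2.0 * avg_act)


previous_scene = [20.0, 50.0, 80.0]       # hypothetical activities, flat scene
new_scene = [400.0, 800.0, 1600.0]        # hypothetical activities, busy scene

stale_avg = sum(previous_scene) / len(previous_scene)
own_avg = sum(new_scene) / len(new_scene)

with_stale = [normalized_activity(a, stale_avg) for a in new_scene]
with_own = [normalized_activity(a, own_avg) for a in new_scene]

print([round(n, 3) for n in with_stale])
print([round(n, 3) for n in with_own])
```

With the stale average, every macro block of the new scene is pushed toward the upper limit of 2.0, so the relative differences between macro blocks are largely lost. With the frame's own average, the normalized activities are spread around 1.0 as intended.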

[0021] Therefore, an object of the present invention is to provide a picture processing apparatus, a picture processing method, a picture processing program, and a recording medium that allow pictures to be accurately adaptively quantized so as to control a generated code amount of one frame even if a scene change occurs between frames and optimize the picture quality using the adaptive quantization.

DISCLOSURE OF THE INVENTION

[0022] To solve the forgoing problem, the present invention is a picture processing apparatus, comprising average activity calculating means for calculating an average activity with a first field of picture data; normalized activity calculating means for applying the average activity calculated with the first field by the average activity calculating means to the first field and a second field of the same frame as the first frame so as to calculate a normalized activity; and quantizing means for quantizing the first field and the second field with the normalized activity calculated by the normalized activity calculating means.

[0023] In addition, the present invention is a picture processing method, comprising the steps of calculating an average activity with a first field of picture data; applying the average activity calculated with the first field at the average activity calculating step to the first field and a second field of the same frame as the first frame so as to calculate a normalized activity; and quantizing the first field and the second field with the normalized activity calculated at the normalized activity calculating step.

[0024] In addition, the present invention is a picture processing program for causing a computer apparatus to execute a picture processing method for quantizing picture data, the picture processing method, comprising the steps of calculating an average activity with a first field of picture data; applying the average activity calculated with the first field at the average activity calculating step to the first field and a second field of the same frame as the first frame so as to calculate a normalized activity; and quantizing the first field and the second field with the normalized activity calculated at the normalized activity calculating step.

[0025] The present invention is a recording medium on which a picture processing program has been recorded, the picture processing program for causing a computer apparatus to execute a picture processing method for quantizing picture data, the picture processing method, comprising the steps of calculating an average activity with a first field of picture data; applying the average activity calculated with the first field at the average activity calculating step to the first field and a second field of the same frame as the first frame so as to calculate a normalized activity; and quantizing the first field and the second field with the normalized activity calculated at the normalized activity calculating step.

[0026] As was described above, according to the present invention, an average activity calculated with a first field of picture data is applied to the first field and a second field of the same frame as the first field so as to calculate a normalized activity, and the first field and the second field are quantized with the normalized activity. Thus, picture data of one frame can be quantized with a normalized activity of its own picture data with a small system delay.

BRIEF DESCRIPTION OF DRAWINGS

[0027] FIG. 1A, FIG. 1B, FIG. 1C, and FIG. 1D show a time chart that represents an example of a process for obtaining a normalized activity Nact_j according to background art;

[0028] FIG. 2A, FIG. 2B, FIG. 2C, FIG. 2D, FIG. 2E, and FIG. 2F show a time chart that represents an example of a process for obtaining a normalized activity Nact_j according to a first embodiment of the present invention;

[0029] FIG. 3A, FIG. 3B, FIG. 3C, FIG. 3D, FIG. 3E, and FIG. 3F show a time chart that represents an example of a process for pre-calculating an average activity avg_act of a frame to be normalized and normalizing an activity of the frame;

[0030] FIG. 4A and FIG. 4B show a block diagram that represents an example of the structure of a digital VTR according to an embodiment of the present invention;

[0031] FIG. 5A, FIG. 5B, and FIG. 5C show a block diagram that represents a practical example of the structure of an MPEG encoder;

[0032] FIG. 6A, FIG. 6B, and FIG. 6C show schematic diagrams that represent examples of the structures of streams transferred by each portion of the MPEG encoder;

[0033] FIG. 7A, FIG. 7B, and FIG. 7C show schematic diagrams that represent examples of the structures of streams transferred by each portion of the MPEG encoder;

[0034] FIG. 8A and FIG. 8B show schematic diagrams that represent examples of the structures of streams transferred by each portion of the MPEG encoder;

[0035] FIG. 9A and FIG. 9B show schematic diagrams that represent examples of the structures of streams transferred by each portion of the MPEG encoder;

[0036] FIG. 10A, FIG. 10B, and FIG. 10C show schematic diagrams that represent examples of the structures of streams transferred by each portion of the MPEG encoder;

[0037] FIG. 11A, FIG. 11B, and FIG. 11C show schematic diagrams that represent examples of the structures of streams transferred by each portion of the MPEG encoder;

[0038] FIG. 12 shows a schematic diagram that represents an example of the structure of a stream transferred by each portion of the MPEG encoder;

[0039] FIG. 13 shows a flow chart that represents an example in which a process of the MPEG encoder is implemented by software; and

[0040] FIG. 14 shows a schematic diagram for describing a block segmentation for calculating an activity according to an embodiment of the present invention.

BEST MODES FOR CARRYING OUT THE INVENTION

[0041] Next, with reference to the accompanying drawings, the present invention will be described. According to the present invention, a normalized activity Nact_j of a frame is calculated with an average activity avg_act obtained from picture data of a first field of the frame. According to an embodiment of the present invention, the average activity avg_act and the normalized activity Nact_j can also be calculated in the method corresponding to Formula (1) to Formula (5) described in the section of the Background Art.

[0042] FIG. 2A, FIG. 2B, FIG. 2C, FIG. 2D, FIG. 2E, and FIG. 2F show an example of a process for obtaining a normalized activity Nact_j according to a first embodiment of the present invention. FIG. 2A shows a frame signal of which one frame is composed of one set of a low region and a high region. FIG. 2B shows input pictures. One picture corresponds to one frame. In the interlace scanning, one frame is composed of two fields that are a first field as a first half portion and a second field as a second half portion. Now, a process for an N-th picture will be considered. In addition, it is assumed that an average activity avg_act is calculated in the unit of one macro block of 16 pixels × 16 lines.

[0043] Since the average activity avg_act is calculated with data of only a first field, when first eight lines of the first field have been input and stored in a buffer, data of a macro block of the first field is obtained. At that point, the calculation of the average activity avg_act can be started. Thus, the delay after the N-th picture is input until the calculation is started is at least eight lines as shown in FIG. 2C.

[0044] When all data of a first field of an N-th picture has been input and then an average activity avg_act of the first field has been calculated, the calculated result is applied to the first and second fields. As a result, a normalized activity Nact_j of the N-th picture is obtained (see FIG. 2E). With the normalized activity Nact_j and a quantizer scale Q_j obtained in a method that will be described later, the calculation corresponding to Formula (5) is performed. As a result, a quantizer scale mquant_j that considers a visual characteristic is calculated. With the quantizer scale mquant_j, the N-th picture is quantized.
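The flow of this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the per-macro-block activities of each field are assumed to be already available as lists, and the sample values are arbitrary.

```python
def normalize_frame(first_field_acts, second_field_acts):
    """Average the first field only, then normalize both fields with that average."""
    avg_act = sum(first_field_acts) / len(first_field_acts)

    def nact(act_j):
        # Formula (4), using the average activity of the first field
        return (2.0 * act_j + avg_act) / (act_j + 2.0 * avg_act)

    return (avg_act,
            [nact(a) for a in first_field_acts],
            [nact(a) for a in second_field_acts])


avg, nact_first, nact_second = normalize_frame([100.0, 300.0], [200.0, 200.0])
print(avg, nact_first, nact_second)
```

The average activity is taken from the first field only (200.0 in this example), yet it normalizes the macro blocks of both fields of the same picture, so no data of a preceding picture is involved.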

[0045] Since the normalized activity Nact_j of the N-th picture is obtained only from the average activity avg_act of the first field of the N-th picture, as shown in FIG. 2F, the delay after the N-th picture is input until the N-th picture is adaptively quantized with the quantizer scale mquant_j, which considers the visual characteristic, becomes 0.5 frame + α.

[0046] Now, it is supposed that a scene change has occurred when the N-th picture has changed to the (N+1)-th picture. According to the present invention, as was described above, with the average activity avg_act calculated with the first field of the N-th picture, the normalized activity Nact_j of the N-th picture is obtained. In other words, unlike the method described in the section of the Background Art, according to the present invention, since an average activity of the immediately preceding frame is not used when the normalized activity Nact_j is obtained, the normalized activity Nact_j can be properly calculated even if a scene change occurs between two frames. Thus, the picture quality of a picture that is input immediately after a scene change can be prevented from deteriorating. Consequently, when pictures are edited for example frame by frame, a good picture quality is obtained.

[0047] Alternatively, an average activity avg_act of a frame to be normalized may be pre-calculated. With the pre-calculated average activity avg_act, the activity of the frame may be normalized. In this method, regardless of occurrence of a scene change between two frames, with the optimum average activity avg_act, the activity can be normalized.

[0048] FIG. 3A, FIG. 3B, FIG. 3C, FIG. 3D, FIG. 3E, and FIG. 3F show an example of a process for pre-calculating an average activity avg_act of a frame to be normalized and then normalizing an activity of the frame with the pre-calculated average activity avg_act. The process shown in FIG. 3A and FIG. 3B is the same as the process shown in FIG. 1A and FIG. 1B. In the example shown in FIG. 3, as shown in FIG. 3D, when all data of a picture (for example, an N-th picture) to be quantized, namely a first field and a second field of the N-th picture, has been stored in a buffer, with the data stored in the buffer, the average activity avg_act of the picture is calculated.

[0049] In this case, with a delay of at least 0.5 frame + 8 lines after the picture (N-th picture) has been input, the average activity avg_act is calculated (see FIG. 3C and FIG. 3D). With the obtained average activity avg_act of the picture, the normalized activity Nact_j of the picture itself is obtained.

[0050] Since the normalized activity Nact_j of the picture is obtained with the average activity avg_act calculated with its own data, even if a scene change occurs immediately after for example the N-th picture, the normalized activity Nact_j can be properly calculated (see FIG. 3E).

[0051] However, in this case, after the average activity avg_act has been calculated with all data of the first and second fields of the picture, the normalization of the activity of the picture is started. Thus, a delay (system delay) after the average activity avg_act is calculated until the activity of the picture is normalized becomes large.

[0052] In the example shown in FIG. 3A, FIG. 3B, FIG. 3C, FIG. 3D, FIG. 3E, and FIG. 3F, the calculation of the average activity avg_act is started with a delay for 0.5 frame + 8 lines after the N-th picture is input. In addition, after the average activity avg_act of all the first and second fields of the N-th picture has been calculated, since the activity is normalized, a delay for at least one frame is added. In other words, as shown in FIG. 3F, the total delay amount (system delay) becomes at least 1.5 frames + α.

[0053] A large system delay is not suitable for a real-time editing operation, and it is likely to adversely affect video equipment used in a broadcasting station. In addition, since the picture is normalized after the average activity avg_act for one picture has been calculated, a buffer memory that stores data of the first and second fields of the picture is required. As a result, the capacity of the required buffer memory becomes large.

[0054] In contrast, in the method according to the present invention described with reference to FIG. 2A, FIG. 2B, FIG. 2C, FIG. 2D, FIG. 2E, and FIG. 2F, since an average activity is calculated only with a first field, the calculation time becomes shorter than that for the case that an average activity is calculated with a frame (two fields) whose average activity is to be calculated. As a result, the system delay can be shortened. Consequently, the time period necessary for the picture compressing process can be shortened. Thus, the time restriction for each operation and each process for a VTR, an encoder, and so forth can be alleviated.
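The two system delays can be compared with simple arithmetic. In the following Python sketch the frame rate of 30 frames per second is a hypothetical value chosen for illustration, and the unspecified term α is left out of both figures.

```python
FRAME_RATE_HZ = 30.0   # hypothetical frame rate, not taken from the text


def delay_ms(frames):
    """Convert a delay expressed in frames into milliseconds."""
    return frames * 1000.0 / FRAME_RATE_HZ


first_field_delay = 0.5          # FIG. 2F: 0.5 frame (+ alpha, omitted here)
whole_frame_delay = 0.5 + 1.0    # FIG. 3F: 0.5 frame plus one frame (+ alpha)

print(delay_ms(first_field_delay), delay_ms(whole_frame_delay))
print(whole_frame_delay - first_field_delay)   # saving, in frames
```

Under these assumptions, averaging only the first field saves one full frame of system delay compared with averaging both fields of the picture before normalizing it.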

[0055] Unless a special editing process is performed, a scene change in a VTR or the like occurs in the unit of one frame, namely on a frame boundary rather than between the two fields of a frame. Thus, when a frame is processed with the first field thereof, the picture quality can be prevented from being affected by a scene change.

[0056] Next, an embodiment of the present invention will be described. The embodiment that follows is a preferred embodiment of the present invention. Although the embodiment contains many technically preferred limitations, it should be noted that the scope of the present invention is not limited to such an embodiment unless otherwise specified.

[0057] FIG. 4A and FIG. 4B show an example of a structure of a digital VTR according to an embodiment of the present invention. The digital VTR can directly record a digital video signal that has been compressed and encoded corresponding to the MPEG system onto a recording medium.

[0058] First of all, the structure and the processing operation of a recording system of the digital VTR will be described. Signals that are input from the outside to the recording system are two types of serial digital interface signals that are an SDI (Serial Digital Interface) signal and an SDTI (Serial Data Transport Interface) signal, an analog interface signal, and an external reference signal REF that is a control signal.

[0059] The SDI is an interface prescribed by the SMPTE so as to transmit a (4:2:2) component video signal, a digital audio signal, and additional data. The SDTI is an interface through which an MPEG elementary stream (referred to as MPEG ES), namely a stream in which a digital video signal has been compression-encoded according to the MPEG system, is transmitted. The ES is 4:2:2 components. As described above, the ES is a stream of all I pictures having the relation of 1 GOP = 1 picture. In the SDTI-CP (Content Package) format, the MPEG ES is separated into access units and packed into packets in the unit of one frame. The SDTI-CP uses a sufficient transmission band (clock rate: 27 MHz or 36 MHz, or stream bit rate: 270 Mbps or 360 Mbps). In one frame period, the ES can be transmitted as a burst.

[0060] The SDI signal transmitted through the SDI is input to an SDI input portion 101. An analog input signal as an analog video signal is input to an analog input portion 120. The analog input portion 120 converts the analog input signal into a digital signal, maps the digital signal to for example the aforementioned SDI format, and outputs the resultant SDI signal. The SDI signal, where the analog input signal has been converted and mapped to the SDI format, is supplied to the SDI input portion 101.

[0061] The SDI input portion 101 converts the supplied SDI signal as a serial signal into a parallel signal. In addition, the SDI input portion 101 extracts an input synchronous signal as an input phase reference from the SDI signal and outputs the extracted input synchronous signal to a timing generator TG 102.

[0062] In addition, the SDI input portion 101 separates a video signal and an audio signal from the converted parallel signal. The separated video input signal and audio signal are output to an MPEG encoder 103 and a delay circuit 104, respectively.

[0063] The timing generator TG 102 extracts a reference synchronous signal from an input external reference signal REF. In synchronization with a designated one of the reference synchronous signal or the input synchronous signal, which has been supplied from the SDI input portion 101, the timing generator TG 102 generates a timing signal necessary for the digital VTR and supplies it as timing pulses to each block.

[0064] The MPEG encoder 103 converts the input video signal into coefficient data according to the DCT method, quantizes the coefficient data, and encodes the quantized data with a variable length code. Variable-length-coded (VLC) data, which has been output from the MPEG encoder 103, is an elementary stream (ES) according to the MPEG2. The elementary stream, which has been output from the MPEG encoder 103, is supplied to one of input terminals of a recording side multi-format converter (hereinafter referred to as recording side MFC).

[0065] The delay circuit 104 functions as a delay line that delays the input audio signal as non-compressed signal corresponding to the delay of the video signal of the MPEG encoder 103. The audio signal, which has been delayed by the delay circuit 104, is output to an ECC encoder 107. This is because the digital VTR according to the embodiment of the present invention treats the audio signal as a non-compressed signal.

[0066] The SDTI signal supplied from the outside through the SDTI is input to an SDTI input portion 105. The SDTI input portion 105 detects the synchronization of the SDTI signal. The SDTI signal is temporarily buffered and the elementary stream is extracted therefrom. The extracted elementary stream is supplied to the other input terminal of the recording side MFC 106. The synchronous signal, which has been synchronously detected, is supplied to the timing generator TG 102 (not shown).

[0067] The SDTI input portion 105 extracts a digital audio signal from the input SDTI signal. The extracted digital audio signal is supplied to the ECC encoder 107.

[0068] In the digital VTR according to the present invention, an MPEG ES can be directly input independent from the base band video signal, which has been input from the SDI input portion 101.

[0069] The recording side MFC 106 has a stream converter and a selector. The recording side MFC 106 selects one of the MPEG ES supplied from the MPEG encoder 103 and the MPEG ES supplied from the SDTI input portion 105, and collects the DCT coefficients of the DCT blocks of one macro block so that they are rearranged in the order of frequency components. The resultant stream, of which the coefficients of the MPEG ES have been rearranged, is referred to as converted elementary stream. Since the MPEG ES is rearranged, when a search reproduction is performed, as many DC coefficients and low order AC coefficients as possible can be collected, which contributes to the improvement of the quality of the search picture. The converted elementary stream is supplied to the ECC encoder 107.
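The rearrangement performed by the stream converter can be illustrated with a minimal sketch. It assumes, for simplicity, that each DCT block is a fixed-length list of coefficients already in scan order; the actual stream carries variable-length-coded data, and the function names below are hypothetical.

```python
def rearrange_by_frequency(dct_blocks):
    """Interleave coefficients so that the k-th coefficients of all DCT
    blocks of one macro block are adjacent (DC first, then low-order AC)."""
    length = len(dct_blocks[0])
    if any(len(block) != length for block in dct_blocks):
        raise ValueError("all DCT blocks must have the same length")
    return [block[k] for k in range(length) for block in dct_blocks]


def restore_block_order(stream, num_blocks):
    """Reverse process (reproducing side): back to DCT-block order."""
    return [stream[b::num_blocks] for b in range(num_blocks)]


# Three toy "DCT blocks" of three coefficients each (assumed sizes).
blocks = [[10, 1, 0], [20, 2, 0], [30, 3, 1]]
converted = rearrange_by_frequency(blocks)
print(converted)                          # DC terms 10, 20, 30 come first
print(restore_block_order(converted, 3))  # original block order
```

Because the DC coefficients lead the converted stream, a reader that recovers only the beginning of each macro block during search reproduction still obtains the most significant coefficients of every DCT block.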

[0070] A main memory (not shown) having a large storage capacity is connected to the ECC encoder 107. The ECC encoder 107 has a packing and shuffling portion, an audio outer code encoder, a video outer code encoder, an inner code encoder, an audio shuffling portion, a video shuffling portion, and so forth. In addition, the ECC encoder 107 has a sync block ID adding circuit and a synchronous signal adding circuit. According to the first embodiment of the present invention, as an error correction code for the video signal and audio signal, a product code is used. In the product code, a data symbol is dually encoded in such a manner that the two-dimensional array of the video signal or audio signal is encoded in the vertical direction with an outer code and that the two-dimensional array is encoded in the horizontal direction with an inner code. As the outer code and inner code, the Reed-Solomon code can be used.

[0071] The converted elementary stream, which has been output from the recording side MFC 106, is supplied to the ECC encoder 107. In addition, the audio signals, which are output from the SDTI input portion 105 and the delay circuit 104, are supplied to the ECC encoder 107. The ECC encoder 107 shuffles the converted elementary stream and audio signals, encodes them with an error correction code, adds IDs and a synchronous signal to sync blocks, and outputs the resultant signal as record data.

[0072] The record data, which has been output from the ECC encoder 107, is converted into a record RF signal by the equalizer EQ 108 that has a recording amplifier. The record RF signal is supplied to a rotating drum 109. The rotating drum 109 records the record signal on a magnetic tape 110. The rotating drum 109 has a rotating head disposed in a predetermined manner. In reality, a plurality of magnetic heads whose azimuths are different from each other and that form adjacent tracks are disposed.

[0073] When necessary, the record data may be scrambled. When data is recorded, it may be digitally modulated. In addition, the partial response class 4 and Viterbi code may be used. The equalizer EQ 108 has both a recording side structure and a reproducing side structure.

[0074] Next, the structure and the processing operation of the reproducing system of the digital VTR will be described. In the reproduction mode, a reproduction signal that is reproduced from the magnetic tape 110 by the rotating drum 109 is supplied to the reproducing side structure of the equalizer EQ 108 that has a reproducing amplifier and so forth. The equalizer EQ 108 equalizes the reproduction signal and trims the wave shape thereof. When necessary, the equalizer EQ 108 demodulates the digital modulation and decodes the Viterbi code. An output of the equalizer EQ 108 is supplied to an ECC decoder 111.

[0075] The ECC decoder 111 performs the reverse process of the ECC encoder 107. The ECC decoder 111 has a main memory having a large storage capacity, an inner code decoder, an audio deshuffling portion, a video deshuffling portion, and an outer code decoder. In addition, the ECC decoder 111 has a deshuffling and depacking portion and a data interpolating portion for video data. Likewise, the ECC decoder 111 has an audio AUX separating portion and a data interpolating portion for audio data.

[0076] The ECC decoder 111 detects the synchronization of reproduction data. In other words, the ECC decoder 111 detects a synchronous signal added at the beginning of a sync block and extracts the sync block from the reproduction signal. The ECC decoder 111 corrects an error of each sync block of the reproduction data with an inner code. Thereafter, the ECC decoder 111 performs an ID interpolating process for each sync block. The ECC decoder 111 separates video data and audio data from the reproduction data, where IDs have been interpolated. The ECC decoder 111 deshuffles the video data and audio data so that they are restored to the original data. The ECC decoder 111 corrects an error of the deshuffled data with an outer code.

[0077] When the ECC decoder 111 cannot correct an error of data, which exceeds its error correcting performance, the ECC decoder 111 sets an error flag to the data. For an error of video data, the ECC decoder 111 outputs a signal ERR that represents that data has an error.

[0078] The reproduction audio data, whose error has been corrected, is supplied to an SDTI output portion 115. A delay circuit 114 delays the reproduction audio data for a predetermined amount and supplies the delayed reproduction audio data to an SDI output portion 116. The delay circuit 114 absorbs the delay of the video data processed in an MPEG decoder 113 that will be described later.

[0079] On the other hand, the video data, whose error has been corrected, is supplied as converted reproduction elementary stream to a reproducing side MFC circuit 112. The aforementioned signal ERR is also supplied to the reproducing side MFC circuit 112. The reproducing side MFC circuit 112 performs the reverse process of the recording side MFC 106. The reproducing side MFC circuit 112 has a stream converter. The stream converter performs the reverse process of the recording side stream converter. In other words, the stream converter rearranges DCT coefficients arranged in the order of frequency components to those arranged in the order of DCT blocks. As a result, the reproduction signal is converted into an elementary stream according to the MPEG2. At that point, when the signal ERR is supplied from the ECC decoder 111 to the reproducing side MFC circuit 112, it replaces the corresponding data with a signal that perfectly complies with the MPEG2.

[0080] The MPEG ES, which has been output from the reproducing side MFC circuit 112, is supplied to the MPEG decoder 113 and the SDTI output portion 115. The MPEG decoder 113 decodes the supplied MPEG ES so as to restore it to the original video signal, which has not been compressed. In other words, the MPEG decoder 113 performs an inversely quantizing process and an inverse DCT process for the supplied MPEG ES. The decoded video signal is supplied to an SDI output portion 116.

[0081] As described above, the audio data, which had been separated from the video data by the ECC decoder 111, has been supplied to the SDI output portion 116 through the delay circuit 114. The SDI output portion 116 maps the supplied video data and audio data in the SDI format so as to convert them into the SDI signal having the SDI data structure. The SDI signal is output to the outside.

[0082] On the other hand, as described above, the audio data, which had been separated from the video data by the ECC decoder 111, has been supplied to the SDTI output portion 115. The SDTI output portion 115 maps the video data and audio data as the supplied elementary stream in the SDTI format so as to convert them into the SDTI signal having the SDTI data structure. The SDTI signal is output to the outside.

[0083] A system controller 117 (abbreviated as sys-con 117 in FIG. 4A and FIG. 4B) is composed of for example a microcomputer. The system controller 117 communicates with each block using a digital signal SY_IO so as to control the entire operation of the digital VTR. A servo 118 communicates with the system controller 117 using a signal SY_SV. Using the signal SV_IO, the servo 118 controls the traveling of the magnetic tape 110 and drives and controls the rotating drum 109.

[0084] FIG. 5A, FIG. 5B, and FIG. 5C more practically show the structure of the foregoing example of the MPEG encoder. FIG. 6A, FIG. 6B, FIG. 6C, FIG. 7A, FIG. 7B, FIG. 7C, FIG. 8A, FIG. 8B, FIG. 9A, FIG. 9B, FIG. 10A, FIG. 10B, FIG. 10C, FIG. 11A, FIG. 11B, FIG. 11C, and FIG. 12 show examples of the structures of streams transferred in the individual portions of FIG. 5A, FIG. 5B, and FIG. 5C.

[0085] The MPEG encoder 103 is composed of an input field activity averaging process portion 103A, a pre-encoding process portion 103B, and an encode portion 103C. The input field activity averaging process portion 103A obtains the average value of activities of the input video data and supplies the obtained average value to the pre-encoding process portion 103B. The pre-encoding process portion 103B estimates the generated code amount of quantized input video data with the average value of the activities. According to the estimated result, while controlling the code amount, the encode portion 103C actually quantizes the input video data, encodes the quantized video data with a variable length code, and outputs the resultant data as an MPEG ES.

[0086] A timing generator TG 220 generates a timing signal necessary for the MPEG encoder 103 with a horizontal synchronous signal HD, a vertical synchronous signal VD, and a field synchronous signal FLD that have been supplied from for example the timing generator TG 102 shown in FIG. 4A and FIG. 4B. A CPU I/F block 221 is an interface with the system controller 117 shown in FIG. 4A and FIG. 4B. With a control signal and data transferred through the CPU I/F block 221, the operation of the MPEG encoder 103 is controlled.

[0087] First of all, the process of the input field activity averaging process portion 103A will be described. Video data, which has been output from the SDI input portion 101 and input to the MPEG encoder 103, is supplied to an input portion 201. The input portion 201 converts the video data into data that can be stored in a main memory 203 and checks a parity for the video data. The video data, which has been output from the input portion 201, is supplied to a header creating portion 202. Using a vertical blanking region or the like, headers according to the MPEG, for example sequence_header, quantizer_matrix, and gop_header, are extracted. The extracted headers are stored in the main memory 203. These headers are designated mainly by the CPU I/F block 221. Outside the vertical blanking region, the header creating portion 202 stores the video data supplied from the input portion 201 in the main memory 203.

[0088] The main memory 203 is a frame memory for a picture. The main memory 203 rearranges video data and absorbs the system delay. The video data is rearranged by for example an address controller (not shown) that controls read addresses of the main memory 203. In FIG. 5A, FIG. 5B, and FIG. 5C, 8 lines, 0.5 frame, and 1 frame in the block of the main memory 203 represent delay values as read timings of the main memory 203. They are properly controlled corresponding to a command issued from the timing generator TG 220.

[0089] A raster scan/block scan converting portion 204 extracts macro blocks according to the MPEG from each line of the video data, which has been stored in the main memory 203, and supplies the extracted macro blocks to an activity portion 205 disposed downstream thereof. According to the embodiment, with only the first field, activities are calculated. Thus, the macro blocks that are output from the raster scan/block scan converting portion 204 are composed of video data of the first field.

[0090] As shown in FIG. 6A, at the beginning of the stream that is output from the raster scan/block scan converting portion 204, address information of the vertical and horizontal directions of the macro block are placed. The address information is followed by a blank area having a predetermined size, followed by picture data for one macro block.

[0091] The stream has a data length of 576 words each of which is composed of for example eight bits. The last 512 words (referred to as data portion) are assigned a picture data area for one macro block. The first 64 words (referred to as header portion) contain the aforementioned address information of the macro block. The other portion is a blank area for data and flags embedded by each portion disposed downstream of the raster scan/block scan converting portion 204.
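The 576-word layout described above can be sketched as follows. The sizes (64-word header portion, 512-word data portion, eight-bit words) are taken from the text; the byte offsets used for the macro block address inside the header and the helper names are assumptions made only for illustration. The 512-word data portion is consistent with one 4:2:2 macro block at eight bits per sample (256 luminance samples plus 2 × 128 chrominance samples).

```python
HEADER_WORDS = 64                          # header portion (address, flags, blank area)
DATA_WORDS = 512                           # data portion (picture data of one macro block)
STREAM_WORDS = HEADER_WORDS + DATA_WORDS   # 576 words in total


def make_stream(mb_row, mb_col, picture_data):
    """Build one macro-block stream; offsets 0 and 1 for the address are assumed."""
    if len(picture_data) != DATA_WORDS:
        raise ValueError("data portion must hold exactly 512 words")
    header = bytearray(HEADER_WORDS)       # remaining words stay blank (zero)
    header[0] = mb_row                     # vertical address (hypothetical position)
    header[1] = mb_col                     # horizontal address (hypothetical position)
    return bytes(header) + bytes(picture_data)


def split_stream(stream):
    """Return (header portion, data portion) of a 576-word stream."""
    if len(stream) != STREAM_WORDS:
        raise ValueError("stream must be 576 words long")
    return stream[:HEADER_WORDS], stream[HEADER_WORDS:]


stream = make_stream(2, 5, bytes(range(256)) * 2)
header, data = split_stream(stream)
print(len(stream), header[0], header[1], len(data))
```

Downstream portions can then write flags such as the DCT mode type or the normalized activity into the blank words of the header without touching the data portion.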

[0092] A macro block according to the MPEG is a matrix of 16 pixels × 16 lines. However, as described with reference to FIG. 3A, FIG. 3B, FIG. 3C, FIG. 3D, FIG. 3E and FIG. 3F, the MPEG encoder 103 performs the process for obtaining activities with only the first field. Thus, when eight lines of the first field have been stored in the main memory 203, the process can be started. In reality, corresponding to a command issued from the timing generator TG 220, the process is properly started.

[0093] The activity portion 205 calculates an activity of each macro block. In the MPEG encoder 103, an activity of each macro block of only the first field is calculated. The calculated result is output as a field activity signal field_act. The signal field_act is supplied to the averaging portion 206, where it is cumulated for one field. As a result, an average value avg_act is obtained. The average value avg_act is supplied to the activity portion 209 of the pre-encoding process portion 103B. The activity portion 209 performs a pre-encoding process in which the average value avg_act obtained from the first field is applied to both the first and second fields.

[0094] Thus, after the average value avg_act of the activities of the first field has been obtained, a pre-encoding process can be performed with the average value in consideration of adaptive quantization.

[0095] Next, the pre-encoding process portion 103B will be described. The raster scan/block scan converting portion 207A performs basically the same process as the foregoing raster scan/block scan converting portion 204. However, since the raster scan/block scan converting portion 207A is to perform a pre-encoding process for estimating a code amount, the raster scan/block scan converting portion 207A requires video data of both the first field and the second field. Thus, when eight lines of the second field have been stored in the main memory 203, the raster scan/block scan converting portion 207A can form a macro block having a size of 16 pixels × 16 lines, which is dealt with in the MPEG. At that point, the raster scan/block scan converting portion 207A can start the process. In reality, the raster scan/block scan converting portion 207A properly starts the process corresponding to a command received from the timing generator TG 220.

[0096] Video data that is output from the raster scan/block scan converting portion 207A is supplied to the DCT mode portion 208. The DCT mode portion 208 decides to select a field DCT encoding mode or a frame DCT encoding mode to encode video data.

[0097] In this case, instead of actually encoding video data, the DCT mode portion 208 calculates the sum of absolute values of difference values of vertically adjacent pixels in both the field DCT encoding mode and the frame DCT encoding mode, compares them, and selects the mode whose calculated sum is smaller than the other. The selected result is temporarily placed as DCT mode type data that is a flag in the stream. The flag is transferred to each portion downstream of the DCT mode portion 208. As shown in FIG. 6B, the DCT mode type data dct_type is placed on the rear end side of a blank area of the header portion.
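The mode decision described above can be sketched as follows. This is a minimal illustration, assuming a macro block given as 16 lines of 16 luminance values, a frame arrangement in which the lines are taken as stored, and a field arrangement in which even and odd lines are separated. Which pixel pairs are counted and how a tie is resolved are not stated in the text and are assumptions here; the function names are hypothetical.

```python
def vertical_sad(lines):
    """Sum of absolute differences between vertically adjacent pixels."""
    return sum(abs(a - b)
               for upper, lower in zip(lines, lines[1:])
               for a, b in zip(upper, lower))


def choose_dct_mode(macro_block):
    """Return 'field' or 'frame', whichever gives the smaller sum."""
    frame_cost = vertical_sad(macro_block)
    first_field = macro_block[0::2]    # even lines
    second_field = macro_block[1::2]   # odd lines
    field_cost = vertical_sad(first_field) + vertical_sad(second_field)
    # Assumption: a tie falls back to the frame DCT encoding mode.
    return "field" if field_cost < frame_cost else "frame"


# Strong difference between the two fields (e.g. motion): field mode wins.
interlaced = [[0] * 16 if r % 2 == 0 else [100] * 16 for r in range(16)]
# Smooth vertical gradient without inter-field difference: frame mode wins.
gradient = [[r] * 16 for r in range(16)]
print(choose_dct_mode(interlaced), choose_dct_mode(gradient))
```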

[0098] The activity portion 209 performs basically the same process as the foregoing activity portion 205. However, as was described above, the activity portion 209 is to perform the pre-encoding process. Thus, the activity portion 209 calculates an activity of each macro block using data of both the first field and the second field. The activity portion 209 first obtains an activity act and places it after a macro block address of the header portion as shown in FIG. 6C. Thereafter, with the activity act and the average value avg_act of the field activity obtained from the foregoing averaging portion 206, the activity portion 209 obtains a normalized activity Nact.
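The normalization formula itself is not restated in this passage. As a hedged sketch, the following assumes the normalization used in the MPEG-2 Test Model 5, Nact = (2 × act + avg_act) / (act + 2 × avg_act), with avg_act taken from the first field of the same picture. Both the choice of this formula and the handling of the all-zero case are assumptions.

```python
def normalized_activity(act, avg_act):
    """TM5-style normalized activity; the result lies between 0.5 and 2.0.

    act     : activity of the macro block (both fields)
    avg_act : average activity of the first field of the same picture
    """
    denominator = act + 2.0 * avg_act
    if denominator == 0:
        return 1.0  # assumption: a completely flat picture is left unscaled
    return (2.0 * act + avg_act) / denominator


print(normalized_activity(10, 10))   # average macro block
print(normalized_activity(0, 10))    # flat macro block, finer quantization
print(normalized_activity(1e9, 1))   # very busy macro block, coarser quantization
```

Because avg_act comes from the picture being encoded rather than from the preceding picture, a scene change between two pictures does not distort the normalization.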

[0099] The obtained normalized activity Nact is temporarily placed as normalized activity data norm_act that is a flag in the header portion of the stream as shown in FIG. 7A. The flag is transferred to the individual portions downstream of the activity portion 209. In the stream, the activity act is overwritten with the normalized activity data norm_act.

[0100] An output of the activity portion 209 is supplied to a DCT portion 210A. The DCT portion 210A divides the supplied macro block into DCT blocks each of which is composed of eight pixels × eight pixels, performs the two-dimensional DCT for each DCT block, and generates DCT coefficients. As shown in FIG. 7B, the DCT coefficients are placed in the data portion of the stream and supplied to a quantizer table portion 211A.

[0101] The quantizer table portion 211A quantizes the DCT coefficients, which have been transformed by the DCT portion 210A, with a quantizer matrix (quantizer_matrix). As shown in FIG. 7C, the DCT coefficients, which have been quantized by the quantizer table portion 211A, are placed in the data portion of the stream and then output. An output of the quantizer table portion 211A is supplied to a multi-staged quantizer portion composed of a plurality of Q_n (quantizer) portions 212, 212, . . . , VLC portions 213, 213, . . . , cumulating portions Σ 214, 214, . . . , and cumulating portions Σ 215, 215, . . . . The DCT coefficients quantized by the quantizer table portion 211A are quantized on multiple stages of the quantizer portions.

[0102] The Q_n portions 212, 212, . . . quantize the DCT coefficients with different quantizer scales (quantizer_scale) Q. The values of the quantizer scales Q are prescribed in for example the MPEG2 standard. The Q_n portions 212, 212, . . . are composed of for example 31 quantizers according to the standard. At that point, since n=31, the Q_n portions 212, 212, . . . are a Q_1 portion, a Q_2 portion, . . . , and a Q_31 portion. The Q_n portions 212 quantize the DCT coefficients with the quantizer scales Qn assigned thereto at a total of 31 steps. Hereinafter, the quantizer scale values of the Q_n portions 212, 212, . . . are denoted by the quantizer scale Qn values.

[0103] The Q_n portions 212, 212, . . . quantize DCT coefficients with their quantizer scale Qn values. At that point, adaptive quantization is performed with the quantizer scale mquant, which takes the visual characteristic into consideration and which is obtained by the following Formula (6) from the normalized activity data norm_act obtained by the activity portion 209.

mquant = Q_n × norm_act (6)

[0104] The DCT coefficients adaptively quantized by the Q_n portions 212, 212, . . . with the quantizer scales Qn are placed in the data portion of the stream as shown in FIG. 8A and supplied to the VLC portions 213, 213, . . . . The VLC portions 213, 213, . . . scan the DCT coefficients for the individual quantizer scales Qn according to for example the zigzag scanning method and encode them with a variable length code with reference to a VLC table according to for example the two-dimensional Huffman code.

[0105] The data, which has been encoded with the variable length code by the VLC portions 213, 213, . . . , is placed in the data portion of the stream as shown in FIG. 8B and then output. Outputs of the VLC portions 213, 213, . . . are supplied to the corresponding cumulating portions Σ 214, 214, . . . .

[0106] The cumulating portions Σ 214, 214, . . . cumulate the generated code amounts for each macro block. As described above, when 31 types of quantizing devices are used, 31 types of generated code amounts are obtained for each macro block. As shown in FIG. 9A, the generated code amounts cumulated by the cumulating portions Σ 214, 214, . . . are placed in the header portion of the stream. In other words, the generated code amounts resulting from the quantization by the Q_1 portion 212 to the Q_n portion 212 for each macro block are placed in the header portion of the stream. The data portion of the stream is deleted. The stream of each macro block is supplied to the main memory 203.

[0107] The generated code amounts for each macro block, which have been output from the cumulating portions Σ 214, 214, . . . , are supplied to the respective cumulating portions Σ 215, 215, . . . . The cumulating portions Σ 215, 215, . . . select generated code amounts for each macro block quantized with quantizer_scale (=mquant), in which the foregoing visual characteristic has been considered, from those obtained by the cumulating portions Σ 214 and cumulate them for one frame.

[0108] The generated code amounts (frame data rates) cumulated by the cumulating portions Σ 215, 215, . . . for the quantizer scales Qn are supplied as an n-word stream as shown in FIG. 9B to a rate controlling portion 217. When 31 types of quantizing devices are used as in the foregoing example, corresponding 31 types of generated code amounts for each frame are obtained.

[0109] Next, a method for obtaining a generated code amount will be described more practically. For example, the "generated code amount of the Q_4 portion 212" can be obtained as follows.

[0110] For example, in the case

[0111] norm_act [1]=1.3

[0112] norm_act [2]=1.5

[0113] norm_act [3]=0.8

[0114] norm_act [4]=1.0

[0115] . . . ,

[0116] mquant [1]=4 × 1.3=5.2

[0117] : The generated code amount of the Q_5 portion 212 is obtained from the header portion of FIG. 9A.

[0118] mquant [2]=4 × 1.5=6.0

[0119] : The generated code amount of the Q_6 portion 212 is obtained from the header portion of FIG. 9A.

[0120] mquant [3]=4 × 0.8=3.2

[0121] : The generated code amount of the Q_3 portion 212 is obtained from the header portion of FIG. 9A.

[0122] mquant [4]=4 × 1.0=4.0

[0123] : The generated code amount of the Q_4 portion 212 is obtained from the header portion of FIG. 9A.

[0124] . . .

[0125] These generated code amounts are cumulated for one frame. This operation is performed for each of the Q_1 portion 212 to the Q_n portion 212. As a result, the generated code amount for one frame is obtained for each quantizer scale.
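The worked example above can be reproduced with a short sketch. It assumes that mquant is rounded to the nearest integer and limited to the range 1 to 31 in order to pick the Q_n portion whose generated code amount is read from the header portion; the example (5.2 selects Q_5, 3.2 selects Q_3) is consistent with rounding, but the rule itself is an assumption. The per-macro-block code amount table is invented purely for illustration.

```python
def mquant_index(q_n, norm_act, n_max=31):
    """Index of the Q_n portion matching mquant = Q_n x norm_act (Formula (6))."""
    index = int(q_n * norm_act + 0.5)     # assumed: round to nearest
    return max(1, min(n_max, index))      # assumed: clamp to 1..31


def frame_code_amount(q_n, norm_acts, mb_code_amounts):
    """Cumulate, for one frame, the code amount each macro block generates
    at the quantizer scale selected by its own normalized activity."""
    total = 0
    for norm_act, amounts in zip(norm_acts, mb_code_amounts):
        total += amounts[mquant_index(q_n, norm_act)]
    return total


norm_acts = [1.3, 1.5, 0.8, 1.0]
# Hypothetical code amounts per macro block, indexed by quantizer 1..31.
table = {k: 3000 // k for k in range(1, 32)}
indices = [mquant_index(4, a) for a in norm_acts]
total = frame_code_amount(4, norm_acts, [table] * len(norm_acts))
print(indices, total)
```

Repeating frame_code_amount for q_n = 1 to 31 yields the 31 frame data rates that are passed to the rate controlling portion 217.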

[0126] Next, the encoding process portion 103C will be described. The encoding process portion 103C performs the final encoding process. As described above, the pre-encoding process portion 103B estimates the generated code amounts for one frame in various quantizing operations. The encoding process portion 103C encodes data corresponding to the generated code amount, which has been estimated for one frame so that the generated code amount does not exceed the pre-designated target generated code amount and outputs an MPEG ES.

[0127] The data used in the encoding process portion 103C has been stored in the main memory 203. However, as described above, when the generated code amounts for one frame have been estimated in various quantizing operations by the pre-encoding process portion 103B, the encoding process portion 103C can start the process. As described above, the process of each portion of the encoding process portion 103C can be properly started corresponding to a command issued from the timing generator TG 220.

[0128] Video data that has been read from the main memory 203 is supplied to a raster scan/block scan converting portion 207B. The raster scan/block scan converting portion 207B performs a process similar to the process of the raster scan/block scan converting portion 207A and extracts a macro block of 16 pixels × 16 lines from the video data. As shown in FIG. 10A, the extracted macro block is placed in a data portion corresponding to the header portion shown in FIG. 9A and supplied to a DCT mode portion 216.

[0129] Like the DCT mode portion 208, the DCT mode portion 216 decides to use the field DCT encoding mode or the frame DCT encoding mode to encode data. At that point, the DCT mode portion 208 has already decided the encoding mode and temporarily placed the result as DCT type data dct_type in the stream (see FIG. 10A). The DCT mode portion 216 detects the DCT type data dct_type from the stream and switches to the field encoding mode or the frame encoding mode corresponding to the detected DCT type data dct_type. An output of the DCT mode portion 216 is shown in FIG. 10B.

[0130] A macro block that has been output from the DCT mode portion 216 is supplied to a DCT portion 210B. Like the DCT portion 210A, the DCT portion 210B two-dimensionally transforms the macro block into DCT coefficients in the unit of one DCT block of eight pixels × eight pixels. As shown in FIG. 10C, the DCT coefficients, into which the macro block has been two-dimensionally transformed corresponding to the two-dimensional DCT method, are placed in the data portion of the stream and then output from the DCT portion 210B.

[0131] A quantizer table portion 211B can be structured in the same manner as the foregoing quantizer table portion 211A. The quantizer table portion 211B quantizes the DCT coefficients transformed by the DCT portion 210B with a quantizer matrix. As shown in FIG. 11A, the DCT coefficients quantized by the quantizer table portion 211B are placed in the data portion of the stream and supplied to a rate controlling portion 217.

[0132] The rate controlling portion 217 selects one from the frame data rates, which have been obtained by the cumulating portions Σ 215, 215, . . . of the pre-encoding process portion 103B for each quantizer scale Qn so that the selected one does not exceed the maximum generated code amount per frame designated by the system controller 117 and is the closest to the designated value. The quantizer scale (mquant) for each macro block used in the quantizing device corresponding to the selected frame data rate is obtained from the normalized activity data norm_act placed in the stream and supplied to a quantizing portion 218.
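The selection made by the rate controlling portion 217 can be sketched as follows. The sketch assumes the frame data rates are available as a mapping from quantizer scale Qn to estimated code amount; the function name, the tie-breaking rule, and the fallback to the coarsest quantizer when no candidate fits are assumptions not stated in the text.

```python
def select_quantizer_scale(frame_code_amounts, max_code_amount):
    """Pick the Qn whose estimated frame code amount is the largest one
    that does not exceed the designated maximum.

    frame_code_amounts : dict mapping Qn (1..31) to estimated code amount
    max_code_amount    : maximum generated code amount per frame
    """
    candidates = {q: amount for q, amount in frame_code_amounts.items()
                  if amount <= max_code_amount}
    if not candidates:
        # Assumption: if nothing fits, fall back to the coarsest quantizer.
        return max(frame_code_amounts)
    # Closest to the limit first; on a tie prefer the finer (smaller) Qn.
    return max(candidates, key=lambda q: (candidates[q], -q))


# Hypothetical estimates for four quantizer scales.
estimates = {1: 900, 2: 700, 3: 500, 4: 300}
print(select_quantizer_scale(estimates, 600))   # 3
print(select_quantizer_scale(estimates, 1000))  # 1
print(select_quantizer_scale(estimates, 100))   # 4 (fallback)
```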

[0133] The quantizer scale for each macro block is placed as quantizer_scale on the rear end side of the header portion of the stream as shown in FIG. 11B and then sent to the quantizing portion 218.

[0134] The maximum generated code amount per frame is designated by for example the system controller 117 and supplied to the rate controlling portion 217 through the CPU I/F block 221.

[0135] At that point, the value of the quantizer scale (mquant) for each macro block can be decreased by one step in the range that does not exceed the difference between the maximum generated code amount per frame, which has been designated by the system controller 117 and transferred through the CPU I/F block 221, and the generated code amount corresponding to the quantizer scale (mquant) for each macro block, which has been obtained from the normalized activity data norm_act placed in the stream. Thus, since a code amount close to the maximum generated code amount designated by the system controller 117 and transferred through the CPU I/F block 221 is obtained, high picture quality can be accomplished.

[0136] The quantizing portion 218 extracts the quantizer scale (quantizer_scale) designated by the rate controlling portion 217 in the foregoing manner from the stream and, corresponding to the extracted quantizer scale, quantizes the DCT coefficients that have been quantized by the quantizer table portion 211B. At that point, since the quantizer scale supplied from the rate controlling portion 217 is the value of the quantizer scale (mquant) obtained from the normalized activity data norm_act, adaptive quantization is performed in consideration of the visual characteristic.

[0137] The DCT coefficients quantized by the quantizing portion 218 are placed in the data portion of the stream as shown in FIG. 11C and supplied to a VLC portion 219. The DCT coefficients supplied to the VLC portion 219 are scanned corresponding to for example the zigzag scanning method. The resultant DCT coefficients are encoded with a variable length code with reference to a VLC table according to the two-dimensional Huffman code. The variable length code is bit-shifted so that it is byte-aligned and then output as an MPEG ES.
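The zigzag scanning mentioned above can be illustrated with a minimal sketch that generates the conventional zigzag order for an 8 × 8 DCT block, running from the DC coefficient towards the highest-frequency coefficient. The function names are hypothetical, and the variable length coding and byte alignment that follow the scan are not shown.

```python
def zigzag_order(size=8):
    """(row, column) positions of a size x size block in zigzag order."""
    def key(position):
        row, col = position
        diagonal = row + col
        # Odd anti-diagonals run top-right to bottom-left (row increasing),
        # even anti-diagonals run bottom-left to top-right (row decreasing).
        return (diagonal, row if diagonal % 2 else -row)
    return sorted(((r, c) for r in range(size) for c in range(size)), key=key)


def zigzag_scan(block):
    """Read a square block of coefficients in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order()
print(order[:6])   # starts at the DC position (0, 0)
block = [[r * 8 + c for c in range(8)] for r in range(8)]
print(zigzag_scan(block)[:6])
```

Scanning in this order tends to group the non-zero low-frequency coefficients at the start and the zero-valued high-frequency coefficients at the end, which suits run-length based variable length coding.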

[0138] At that point, as shown in FIG. 12, the header portion as the first half portion of the stream is replaced with the MPEG header portion in which the MPEG header information of the slice layer or below is placed. The variable length code is placed in the data portion on the second half side of the stream.

[0139] The foregoing example shows that the process of the MPEG encoder 103 is implemented by hardware. However, according to the present invention, the process of the MPEG encoder 103 is not limited to such an example. In other words, the process of the MPEG encoder 103 can be implemented by software. In this case, for example, a computer apparatus is provided with analog and digital interfaces for a video signal. Software installed on the computer is executed with a CPU and a memory. In the foregoing digital VTR structure, the CPU and memory may be substituted for the MPEG encoder 103.

[0140] The software is recorded as program data on a recording medium such as a CD-ROM (Compact Disc--Read Only Memory). The recording medium, on which the software has been recorded, is loaded to the computer apparatus. With a predetermined operation, the software is installed onto the computer apparatus. As a result, the process of the software can be executed. Since the structure of the computer apparatus is well known, its description will be omitted in the following.

[0141] FIG. 13 is a flow chart showing an example of the process of the MPEG encoder 103 implemented by software. Since the process of the flow chart is the same as the process implemented by the foregoing hardware, the process of the flow chart will be briefly described in consideration of the process implemented by hardware. Steps S1 to S7 correspond to the process of the foregoing input field activity averaging process portion 103A. Steps S11 to S21 correspond to the process of the foregoing pre-encoding process portion 103B. Steps S31 to S38 correspond to the process of the foregoing encoding process portion 103C.

[0142] At step S1, the first step, video data is captured. At step S2, the next step, each MPEG header is extracted from the captured video data in the vertical blanking region and stored in a memory. Outside the vertical blanking region, the captured video data is stored in the memory.

[0143] At step S3, video data is converted from raster scan data into block scan data. As a result, a macro block is extracted. This operation is performed by controlling read addresses of the video data stored in the memory. At step S4, an activity of the extracted macro block of the first field of the video data is calculated. At step S5, the calculated result Activity (act) is cumulated and stored as a cumulated value sum in a memory. The process from step S3 to step S5 is repeated until it has been determined that the last macro block of the first field has been processed at step S6. In other words, the cumulated value sum becomes the sum of activities of macro blocks for one field.

[0144] When it has been determined that the last macro block of one field has been processed at step S6, the cumulated value sum stored in the memory is divided by the number of macro blocks for one field at step S7. As a result, the average value Activity (avg_act) of the field activities for one field is obtained and stored in the memory.
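Steps S3 to S7 can be illustrated with the following minimal sketch. It assumes that the memory holds only the lines of the first field in raster order, that one macro block of the field is 16 pixels × 8 lines, and that the activity calculation is passed in as a function. The names read_macro_block, average_activity, and activity are hypothetical and do not appear in the embodiment.

```python
def read_macro_block(memory, width, mb_x, mb_y, mb_w=16, mb_h=8):
    """Step S3: read one macro block from a raster-ordered buffer.

    The conversion to block scan order is done only by computing read
    addresses; the stored data itself is not rearranged.
    """
    block = []
    for line in range(mb_h):
        start = (mb_y * mb_h + line) * width + mb_x * mb_w
        block.append(memory[start:start + mb_w])
    return block


def average_activity(memory, width, height, activity, mb_w=16, mb_h=8):
    """Steps S3 to S7: cumulate act over the field, then divide."""
    total = 0.0  # cumulated value "sum" of step S5
    count = 0    # number of macro blocks processed so far
    for mb_y in range(height // mb_h):
        for mb_x in range(width // mb_w):
            mb = read_macro_block(memory, width, mb_x, mb_y, mb_w, mb_h)
            total += activity(mb)  # steps S4 and S5
            count += 1
    return total / count  # step S7: avg_act
```

Passing the activity calculation as a parameter keeps the sketch independent of the particular activity measure that is used.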

[0145] When the average value Activity (avg_act) of the field activities has been obtained, the flow advances to step S11. Like at step S3, at step S11, the video data stored in the memory is converted from raster scan data into block scan data. As a result, a macro block is extracted.

[0146] At step S12, the field DCT encoding mode or the frame DCT encoding mode is selected so as to process each DCT. The selected result is stored as DCT mode type data dct_typ in the memory. At step S13, with the first and second fields, an activity of each macro block is calculated. With the average value Activity (avg_act) of the field activities obtained and stored in the memory at step S7, a normalized activity Activity (norm_act) is obtained. The obtained normalized activity Activity (norm_act) is stored in the memory.

[0147] At step S14, the macro block extracted from the video data at step S11 is divided into DCT blocks each of which is composed of eight pixels × eight pixels. The DCT blocks are two-dimensionally transformed in the two-dimensional DCT method. As a result, DCT coefficients are obtained. At step S15, the DCT coefficients are quantized with a quantizer table (quantizer_table). Thereafter, the flow advances to step S16.
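As an illustration of steps S14 and S15, the following sketch transforms one DCT block of eight pixels × eight pixels and divides the coefficients by a quantizer table. The orthonormal scaling of the transform, the rounding rule, and the function names dct_matrix, dct2, and quantize are assumptions made for this example; the embodiment does not specify them.

```python
import math


def dct_matrix(n=8):
    """Return the orthonormal DCT-II basis matrix (n x n)."""
    rows = []
    for u in range(n):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """Step S14: two-dimensional DCT computed as C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def quantize(coeffs, quantizer_table):
    """Step S15: divide each coefficient by its table entry and round.

    The rounding rule is an assumption of this sketch.
    """
    return [[int(round(coeffs[v][u] / quantizer_table[v][u]))
             for u in range(len(coeffs[0]))] for v in range(len(coeffs))]
```

For a block in which all pixels have the same value, only the DC coefficient is non-zero, so the quantized block contains a single non-zero entry.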

[0148] The process from steps S16 to S20 is repeated with each quantizer scale (quantizer_scale) Qn value. As a result, the processes corresponding to the foregoing Q_n portions 212, 212, . . . , the foregoing VLC portions 213, 213, . . . , the foregoing cumulating portions Σ 214, 214, . . . , and the foregoing cumulating portions Σ 215, 215, . . . are performed. In other words, at step S16, the DCT coefficients are quantized with quantizer scale Q=1. At step S17, the DCT coefficients quantized with reference to the VLC table are encoded with a variable length code. At step S18, the generated code amount of the macro block with the variable length code is calculated. At step S19, the generated code amount of each macro block for one frame is cumulated. At step S20, it is determined whether or not there is another quantizer scale Qn. When it has been determined that there is another quantizer scale Qn, the flow returns to step S16. At step S16, the process for another quantizer scale Qn is performed. The generated code amounts for one frame corresponding to the individual quantizer scales Qn are stored in the memory.

[0149] When it has been determined that the cumulated value of the generated code amounts for the frame corresponding to all the quantizer scale values Qn has been obtained at step S20, the flow advances to step S21. At step S21, it is determined whether or not the last macro block (MB) of one frame has been processed. When the last macro block has not been processed, the flow returns to step S11. When the last macro block has been processed and the generated code amount for one frame has been estimated, the flow advances to step S31. At step S31, the real encoding process is performed.

[0150] Like at S11, at step S31, the video data stored in the memory is converted from raster scan data into block scan data. As a result, a macro block is extracted. At step S32, corresponding to the DCT mode type data dct_typ stored in the memory at step S12, the DCT encoding mode is designated.

[0151] At step S33, the macro block extracted from the video data at step S31 is divided into DCT blocks each of which is composed of eight pixels × eight pixels. The DCT blocks are two-dimensionally transformed in the two-dimensional DCT method. As a result, DCT coefficients are obtained. The DCT coefficients are quantized with the quantizer table (quantizer_table) at step S34. Thereafter, the flow advances to step S35.

[0152] At step S35, corresponding to the generated code amounts for one frame corresponding to the quantizer scales Qn, which have been estimated at steps S11 to S21, the quantizer scales Qn used at step S36 are designated for each macro block so as to control the generated code amount in the real encoding process.
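The relation between the pre-encoding loop of steps S16 to S21 and the designation at step S35 can be sketched as follows. The sketch is simplified: it designates a single quantizer scale for the whole frame, whereas the embodiment designates the quantizer scales for each macro block. The function code_amount is a hypothetical stand-in for quantizing a macro block, encoding it with the variable length code, and counting the generated bits. The names estimate_code_amounts and select_quantizer_scale are likewise hypothetical.

```python
def estimate_code_amounts(macro_blocks, quantizer_scales, code_amount):
    """Steps S16 to S21: cumulate the frame code amount per quantizer scale.

    code_amount(mb, q) is a hypothetical callable that returns the number
    of bits generated for macro block mb with quantizer scale q.
    """
    totals = {q: 0 for q in quantizer_scales}
    for mb in macro_blocks:             # loop closed by step S21
        for q in quantizer_scales:      # loop closed by step S20
            totals[q] += code_amount(mb, q)  # steps S16 to S19
    return totals


def select_quantizer_scale(totals, target):
    """Step S35 (simplified): finest scale whose frame total fits target.

    If no scale fits, the coarsest scale is returned as a fallback.
    This selection rule is an assumption of the sketch.
    """
    fitting = [q for q, bits in totals.items() if bits <= target]
    return min(fitting) if fitting else max(totals)
```

Because the totals for all quantizer scales are available before the real encoding process starts, the code amount of the frame can be kept at or below the target without encoding the frame repeatedly.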

[0153] Thereafter, the flow advances to step S36. With the quantizer scales Qn designated at step S35, the DCT coefficients quantized with the quantizer table at step S34 are quantized. With reference to the VLC table, the DCT coefficients quantized at step S36 are encoded with a variable length code at step S37. At step S38, it is determined whether or not the last macro block of one frame has been processed. When it has been determined that the last macro block of one frame has not been processed, the flow returns to step S31. From step S31, the quantizing process and the variable length code encoding process are performed for the next macro block. In contrast, when it has been determined at step S38 that the last macro block of one frame has been processed, the encoding process for one frame is completed.

[0154] The foregoing example shows that the pre-encoding process of steps S11 to S21 is separate from the encoding process of steps S31 to S38. However, it should be noted that the present invention is not limited to such an example. For example, the generated code amounts estimated at steps S11 to S21 are stored in the memory. Data that is obtained in the real encoding process is selected from the stored data. As a result, the process from steps S31 to S38 can be contained as a loop in the process from steps S11 to S21.

[0155] FIG. 14 is a schematic diagram for describing a block segmentation for calculating an activity according to an embodiment of the present invention. According to the embodiment of the present invention, when activities are calculated for only one field, they are calculated in the unit of one sub block composed of eight pixels × four lines.

[0156] One macro block composed of 16 pixels × 16 lines is shown on the left of FIG. 14. In the macro block of one frame shown on the left side of FIG. 14, hatched lines represent lines of a second field. The other lines represent lines of a first field. The macro block is divided into four DCT blocks each of which is composed of eight pixels × eight lines. As shown on the right of FIG. 14, components (lines) that compose the first field are extracted. As a result, four sub blocks each of which is composed of eight pixels × four lines are obtained. A field activity field_act of the sub blocks composed of data of the first field is calculated.

[0157] As was described above, according to the embodiment of the present invention, when data for eight lines of the first field has been stored in the main memory 203, the process is started. Thus, actually, sub blocks each of which is composed of eight pixels × four lines are not extracted from a macro block composed of 16 pixels × 16 lines. Instead, when data for eight lines of the first field has been stored in the main memory 203, four sub blocks each of which is composed of eight pixels × four lines are obtained by the raster scan/block scan converting portion 204.
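The block segmentation of FIG. 14 can be sketched as follows. The sketch assumes that the lines of the first field are the even-numbered lines of the frame macro block, counted from zero, which is an assumption about line parity and is not stated in the embodiment. The function names first_field_lines and split_sub_blocks are hypothetical. The second function corresponds to the practical case in which eight lines of the first field are already available.

```python
def first_field_lines(macro_block, first_field_parity=0):
    """Pick the eight first-field lines out of a 16-line frame macro block.

    first_field_parity=0 assumes the first field occupies even lines.
    """
    return [line for i, line in enumerate(macro_block)
            if i % 2 == first_field_parity]


def split_sub_blocks(field_lines, width=8, height=4):
    """Split 8 lines x 16 pixels into four sub blocks of 8 pixels x 4 lines.

    Sub blocks are returned in the order upper left, upper right,
    lower left, lower right.
    """
    subs = []
    for top in range(0, len(field_lines), height):
        for left in range(0, len(field_lines[0]), width):
            subs.append([row[left:left + width]
                         for row in field_lines[top:top + height]])
    return subs
```

Each of the four sub blocks corresponds to the first-field lines of one of the four DCT blocks of the macro block.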

[0158] An average value (P) of luminance level values of individual pixels of each sub block composed of eight pixels × four lines is obtained according to Formula (7).

$$P = \frac{1}{32} \sum_{k=1}^{32} Y_k \tag{7}$$

[0159] In other words, 32 luminance level values (Yk) of eight pixels × four lines of each sub block are summed and then divided by "32". As a result, an average value (P) is obtained.

[0160] Next, the difference between the luminance level value (Yk) of each pixel of the sub block of eight pixels × four lines and the average value (P) is squared. An average difference value (var_sblk) of the squared difference values is obtained according to Formula (8).

$$\mathrm{var\_sblk} = \frac{1}{32} \sum_{k=1}^{32} (Y_k - P)^2 \tag{8}$$

[0161] Since one macro block is composed of four sub blocks, the minimum value is obtained from the four average difference values (var_sblk). The minimum value, incremented by 1, is used as a field activity field_act of the macro block (see Formula (9)).

$$\mathrm{field\_act} = 1 + \min_{sblk=1,\ldots,4}(\mathrm{var\_sblk}) \tag{9}$$

[0162] The foregoing calculating process is repeated for each macro block. In such a manner, an average value (field_avg_act) of the field activities field_act of the first field is obtained (see Formula (10)).

$$\mathrm{field\_avg\_act} = \frac{1}{MBnum} \sum_{m=1}^{MBnum} \mathrm{field\_act}[m] \tag{10}$$

[0163] where the value MBnum represents the total number of macro blocks of one frame.
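The calculation of P, var_sblk, field_act, and field_avg_act described above can be written as the following minimal sketch. Each sub block is assumed to be given as a list of four lines of eight luminance values. The function names are hypothetical.

```python
def sub_block_variance(sub_block):
    """Return var_sblk: the mean squared difference from the average P."""
    pixels = [y for row in sub_block for y in row]
    p = sum(pixels) / len(pixels)  # average value P
    return sum((y - p) ** 2 for y in pixels) / len(pixels)


def field_activity(sub_blocks):
    """Return field_act: 1 plus the minimum var_sblk of the sub blocks."""
    return 1 + min(sub_block_variance(s) for s in sub_blocks)


def field_average_activity(activities):
    """Return field_avg_act: the mean of field_act over all macro blocks."""
    return sum(activities) / len(activities)
```

Because the minimum of the four values is taken, a macro block that contains at least one flat sub block obtains a low field activity even if its other sub blocks are detailed.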

[0164] Thereafter, calculations corresponding to Formula (1) to Formula (5) described in the section of the Related Art are performed so as to perform an adaptive quantization in consideration of an activity of each macro block.
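Formula (1) to Formula (5) are not reproduced in this section. The following sketch therefore assumes the widely used normalization of MPEG-2 Test Model 5, in which the normalized activity lies between 0.5 and 2 and scales the quantizer scale of the macro block. The rounding, the clipping range of 1 to 31, and the function names are assumptions of this sketch and may differ from the formulas of the Related Art section.

```python
def normalized_activity(act, avg_act):
    """Normalize a macro block activity with the average activity.

    Assumed Test Model 5 form: the result lies between 0.5 and 2.
    """
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def mquant(q_scale, norm_act, lo=1, hi=31):
    """Scale the quantizer scale by the normalized activity and clip it.

    The clipping range and the rounding are assumptions of this sketch.
    """
    return max(lo, min(hi, int(round(q_scale * norm_act))))
```

With this form, a macro block whose activity equals the average activity keeps its quantizer scale, a flat macro block is quantized more finely, and a detailed macro block is quantized more coarsely.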

[0165] The calculating method for the foregoing field activity field_act is just an example. In other words, the field activity field_act may be calculated by another method.

[0166] To allow the MPEG encoder 103 according to the present invention to be suitable for a VTR used in a broadcasting station, the MPEG encoder 103 deals with only intra pictures and controls a code amount so that the data amount per frame does not exceed a predetermined value. Thus, since the pre-encoding process portion 103B performs the pre-encoding process, a delay for one frame is added to the system delay shown in FIG. 3F. However, this delay for one frame results from the code amount control in the pre-encoding process. It does not result from the adaptive quantization of the present invention, which considers the activity of each macro block with the average activity obtained from the first field of the picture. In other words, when the pre-encoding process is performed, the delay for one frame is added not only at the timing according to the present invention as shown in FIG. 2B, FIG. 2C, FIG. 2D, FIG. 2E, and FIG. 2F, but also at the timing according to the related art as shown in FIG. 1A, FIG. 1B, FIG. 1C, and FIG. 1D and at the timing in the other method as shown in FIG. 3A, FIG. 3B, FIG. 3C, FIG. 3D, FIG. 3E, and FIG. 3F. As a result, the advantage of the embodiment is not lost.

[0167] In the foregoing example, the MPEG2 system, an encoding system in which a picture is divided into blocks and each block is DCT-transformed, is used as the picture data encoding system. However, the present invention is not limited to such an example. As long as a picture is encoded in the unit of one block, another encoding system can be used.

[0168] In the foregoing example of the present invention, a digital VTR has been described. However, the present invention is not limited to such an example. The present invention can be applied to a picture processing apparatus that compression-encodes picture data in a block encoding method, for example a picture transmitting apparatus that compression-encodes picture data and transmits the encoded picture data.

[0169] As was described above, according to the embodiment of the present invention, the normalized activity of a frame is calculated with an average activity obtained from picture data of the first field of that frame. Since the average activity of the immediately preceding frame is not used, the normalized activity can be properly calculated even if a scene change occurs between two frames. As a result, the picture quality does not deteriorate.

[0170] In addition, according to the present invention, the calculation time can be shortened and the system delay can be shortened in comparison with the method for calculating the average activity with the entire frame (two fields) and obtaining the normalized activity.

[0171] In addition, since a buffer for one frame is not necessary for obtaining a normalized activity, the memory capacity can be reduced.

[0172] Thus, when the present invention is applied to a VTR for an editing operation that has a restriction of a system delay, the picture quality can be optimized in consideration of the visual characteristic.

* * * * *