U.S. patent application number 13/141499 was filed with the patent office on 2011-10-27 for method for the encoding by segmentation of a picture.
This patent application is currently assigned to SAGEMCOM BROADBAND SAS. Invention is credited to Jean-Pierre Morard, Olivier Pietquin, Stephane Vialle.
Application Number | 20110262051 13/141499 |
Document ID | / |
Family ID | 41111033 |
Filed Date | 2011-10-27 |
United States Patent Application | 20110262051 |
Kind Code | A1 |
Morard; Jean-Pierre ; et al. | October 27, 2011 |
METHOD FOR THE ENCODING BY SEGMENTATION OF A PICTURE
Abstract
A method for encoding an image, the encoding being a mixed
encoding with the possibility of using a first lossless compression
type, and a second lossy compression type, the method including:
dividing the image into a plurality of elementary blocks;
determining which elementary blocks have a high level of detail;
allocating the first type of compression to each elementary block
that has a high level of detail; allocating the second compression
type to each elementary block that does not have a high level of
detail; applying the first type of compression to each elementary
block to which the first compression type has been allocated;
applying the first compression type to each elementary block
directly surrounded by two elementary blocks to which the first
compression type has been allocated.
Inventors: | Morard; Jean-Pierre; (Rueil Malmaison, FR) ; Vialle; Stephane; (Rueil Malmaison, FR) ; Pietquin; Olivier; (Rueil Malmaison, FR) |
Assignee: | SAGEMCOM BROADBAND SAS, Rueil Malmaison, FR |
Family ID: | 41111033 |
Appl. No.: | 13/141499 |
Filed: | December 23, 2009 |
PCT Filed: | December 23, 2009 |
PCT NO: | PCT/FR09/52681 |
371 Date: | July 13, 2011 |
Current U.S. Class: | 382/244 |
Current CPC Class: | H04N 19/85 20141101; H04N 19/186 20141101; H04N 19/48 20141101; H04N 1/41 20130101; H04N 19/12 20141101; H04N 19/14 20141101; H04N 19/61 20141101; H04N 19/80 20141101; H04N 19/176 20141101 |
Class at Publication: | 382/244 |
International Class: | G06K 9/36 20060101 G06K009/36 |
Foreign Application Data
Date | Code | Application Number |
Dec 23, 2008 | FR | 0859018 |
Claims
1. A method for encoding an image, said encoding being a mixed
encoding with the possibility of using a first lossless compression
type, and a second lossy compression type, said method comprising:
dividing the image into a plurality of elementary blocks;
determining which elementary blocks have a high level of detail;
allocating the first type of compression to each elementary block
that has a high level of detail; allocating the second compression
type to each elementary block that does not have a high level of
detail; applying the first type of compression to each elementary
block to which said first compression type has been allocated;
applying the first compression type to each elementary block
directly surrounded by two elementary blocks to which the first
compression type has been allocated.
2. The encoding method according to claim 1, wherein determining
the elementary blocks having a high level of detail comprises, for
each elementary block under consideration: performing spatial
filtering to obtain a frequency representation; measuring a
high-frequency component level of the frequency representation; if
the high-frequency component level is greater than a previously
determined threshold, identifying the elementary block under
consideration as an elementary block presenting a high level of
detail.
3. The encoding method according to claim 2, wherein determining
the elementary blocks having a high level of detail is carried out
for each of the color planes of the image under consideration, the
elementary block under consideration being identified as an
elementary block having a high level of detail if, for one of the
color planes under consideration, the high-frequency component
level is greater than a specific, previously determined threshold,
each color plane being associated with a specific threshold.
4. The method according to claim 3, wherein the specific thresholds
of each color plane have the same value.
5. The method according to claim 4, the method comprising: applying
the first compression type to all elementary blocks of the
homogeneous elementary block type.
6. The method according to claim 1, wherein encoding is of the H264
encoding type.
Description
TECHNICAL FIELD OF THE INVENTION
[0001] The object of the present invention is a method of encoding
by video image segmentation. In particular, the purpose of the
invention is to improve the rendering quality of an image that had
previously been subject to a compression operation in order to
limit the data rate necessary for storing and/or transmitting data
relative to the image under consideration once it is encoded. The
encoding operation according to the invention is carried out in
particular by ensuring that maximum detail can be restored for the
image zones known as MMI (Man-Machine Interface) zones, as compared
with other zones of the image under consideration corresponding to
photo, video, etc., content.
[0002] The field of the invention is, in general, that of video
image compression. By way of example, without limiting the scope of
the object of the invention, the field of the invention will be
more particularly detailed in a context essentially involving the
compression standard known as H264, without this aspect being
limiting with regard to the scope of the invention. In fact, other
compression standards, for example of the VC1 or DivX type, even if
they are less advantageous in certain contexts, may be utilized in
the embodiment of the method according to the invention.
[0003] The different video coding standards are all based on the
same major principles. On the one hand, they are based on the
redundancy of temporal or spatial data, in order to reduce the
quantity of data, without necessarily introducing losses. On the
other hand, some data or details are eliminated, which introduces
losses in the restored image, said losses generally being targeted
with relation to the psychovisual properties. In fact, some image
details are perceived by the eye very little or not at all and may
therefore be omitted. In this way a compressed video stream is
obtained. The main steps of video compression are thus as follows:
[0004] Coding the image to be encoded in luminance-chrominance;
[0005] Division of the image into macro-blocks, or elementary
blocks, that are rectangular regions with a size of between 4*4 and
16*16 pixels;
[0006] Motion estimation;
[0007] Motion compensation;
[0008] Frequency transform: a DCT (Discrete Cosine Transform) is
applied to each elementary block. Such a transform enables a
frequency representation of the image to be obtained;
[0009] Quantization: data from the DCT are quantized by being coded
on a limited number of bits. This is where the loss of data takes
place;
[0010] Entropy coding: in such coding, the more often a given value
appears, the fewer bits are used to code it.
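By way of illustration only, the frequency transform and
quantization steps listed above can be sketched in Python as
follows. This is a minimal sketch using a textbook floating-point
DCT-II and a single uniform quantization step. The function names
dct_2d and quantize and the step value of 16 are assumptions made
for this example; the transforms and quantization tables of actual
compression standards differ.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization (hypothetical): divide by step and round.

    The rounding is where information is discarded.
    """
    return [[round(c / step) for c in row] for row in coeffs]


# A flat 4*4 block: every pixel has the same value.
flat = [[100] * 4 for _ in range(4)]
levels = quantize(dct_2d(flat), step=16)
# levels[0][0] is 25 and every other level is 0 for this flat block.
```

For a flat block, all of the energy lands in the single
lowest-frequency coefficient, which is why smooth regions are cheap
to code, whereas blocks with text or fine detail spread energy into
the high-frequency coefficients that quantization degrades.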
[0011] The context in which the present invention will be described
is that of the deployment of media center type applications, which
consist of the remote utilization of a computer located in a
residence from various points of said residence, so that various
services may be used through workstations, for example a digital
television decoder, distributed throughout the home. For this
purpose, it is necessary to transfer various data, particularly
video images, across the network formed by the computer, the
clients and the connections between them. Image compression is
thus a necessity for ensuring good operation of the media center
type application distributed over a network.
TECHNOLOGICAL BACKGROUND OF THE INVENTION
[0012] Standard H264 provides two types of compression, illustrated
in FIG. 1. A first compression type 100, known as lossless
compression mode or lossless compression, obtains, from an original
image 103, a restored image 104 after a compression phase 105 that
does not lead to any loss in the restored image 104. A second type
of compression 101, known as lossy compression mode or lossy
compression, obtains, from an original image 106, a restored image
107 after a compression phase 108 that leads to a loss of data in
the restored image 107 with relation to the original image 106, a
data loss that manifests in a reduction in image quality, notably
in terms of sharpness.
[0013] Standard H264 is preferred for the transmission of video
across the network created. But this standard is, as explained
previously, likely to produce data losses during compression
operations occurring during video data encoding, in particular.
Basically, these losses are considered to be not really discernable
to the human eye; this was the case, in particular, when the video
data to be encoded and transmitted were only of the photo or
television broadcast image types, for example. However, in some
cases, the defects introduced by these data losses may become very
visible. This is the case in particular with images such as
buttons, menus, or any other element containing text and many
details.
[0014] More generally, when an image is compressed to save
transmission time or storage space, a certain loss rate is accepted
in order to obtain a better compression rate. These losses, which
do not pose problems for video viewing, are nevertheless a drawback
for image renderings of the graphic MMI type. More particularly,
MMI renderings must be produced more carefully since the image is
very often static or only slightly animated, so that defects
therein are perceptible. Moreover, when an end user moves from a PC
type station to an application on the television, his assessment of
the identical content is more critical.
GENERAL DESCRIPTION OF THE INVENTION
[0015] The method according to the invention proposes a solution to
the problems and disadvantages that have just been stated. In the
invention, a solution to improve the rendering quality of the image
to be restored is proposed. For this purpose, in the invention, one
seeks in particular to distinguish image zones corresponding to
MMI, whose restoration quality must be optimized, and image zones
corresponding to photo, video, image, etc., type content, for which
lossy compression may be accepted. Depending on the nature of the
zones distinguished, either a lossless compression mode, or a lossy
compression mode is then applied.
[0016] The invention thus essentially relates to a method for
encoding an image, said encoding being a mixed encoding with the
possibility of using a first lossless compression type, and a
second lossy compression type, said method comprising the operation
consisting of dividing the image into a plurality of elementary
blocks;
[0017] characterized in that said method comprises different
additional steps consisting of:
[0018] determining the elementary blocks having a high level of
detail;
[0019] allocating the first compression type to each elementary
block that has a high level of detail;
[0020] allocating the second compression type to each elementary
block that does not have a high level of detail.
[0021] The method according to the invention may comprise, in
addition to the main steps that have just been mentioned in the
previous paragraph, one or more additional characteristics from
among the following:
[0022] the step consisting of determining the elementary blocks
having a high level of detail comprises different operations
consisting of, for each elementary block under consideration:
[0023] performing spatial filtering to obtain a frequency
representation;
[0024] measuring a high-frequency component level of the frequency
representation;
[0025] if the high-frequency component level is greater than a
previously determined threshold, then identify the elementary block
under consideration as an elementary block presenting a high level
of detail;
[0026] the different operations of the step consisting of
determining the elementary blocks having a high level of detail are
carried out for each of the color planes of the image under
consideration, the elementary block under consideration being
identified as an elementary block having a high level of detail if,
for one of the color planes under consideration, the high-frequency
component level is greater than a specific, previously determined
threshold, each color plane being associated with a specific
threshold;
[0027] the specific thresholds of each color plane have the same
value;
[0028] the method comprises the additional steps consisting of:
[0029] applying a first type of compression to each elementary
block to which said first compression type has been allocated;
[0030] applying the first compression type to each elementary block
directly surrounded by two elementary blocks to which the first
compression type has been allocated; the expression "directly
surrounded" refers to the fact that the elementary block under
consideration is adjacent to at least two lossless type elementary
blocks, the two elementary blocks being situated either to the left
and to the right of the elementary block under consideration, or
above and below the elementary block under consideration;
[0031] the method comprises the additional step consisting of
applying the first compression type to all elementary blocks of the
homogeneous elementary block type;
[0032] the encoding is of the H264 encoding type.
[0033] The different additional characteristics of the method
according to the invention, insofar as they are not mutually
exclusive, are combined according to all combination possibilities
to result in different examples of embodiment of the invention.
[0034] The invention and its various applications will be better
understood upon reading the following description and examining the
accompanying figures.
BRIEF DESCRIPTION OF THE FIGURES
[0035] The figures are presented for indicative purposes and in no
way limit the invention.
[0036] FIG. 1, already described, schematically illustrates the
operation of two different compression modes;
[0037] FIG. 2 schematically illustrates the encoding method
according to the invention;
[0038] FIG. 3 illustrates an example of embodiment of the method
according to the invention in which certain elementary blocks of
the image to be compressed are compressed according to a lossless
compression mode after application of particular criteria for
determining the compression mode;
[0039] FIG. 4 illustrates an example of an image having undergone
encoding by an example of embodiment of the method according to the
invention.
DESCRIPTION OF PREFERRED FORMS OF EMBODIMENT OF THE INVENTION
[0040] Unless otherwise stated, the elements appearing in different
figures will have retained the same references.
[0041] In the invention, one seeks to segment an image to be
encoded by utilizing particular criteria to determine if each
elementary block under consideration should be encoded according to
a lossless compression mode or according to a lossy compression
mode. The criteria defined aim to distinguish the MMI elements
(buttons, menus, etc.) from the rest of the image. One essential
criterion is whether or not a large amount of text is present in
each elementary block under consideration.
[0042] The invention proposes, first, the analysis of the spectral
content of each elementary block. Such a step is justified by the
fact that the text elements, in an image, are characterized by the
high number of abrupt transitions in luminosity and/or chrominance.
Thus, the invention proposes measuring the high frequency component
level present in each elementary block for each of the three color
components of the image under consideration. If the amplitude of
the frequency components situated beyond a certain frequency
exceeds a given threshold, the elementary block under consideration
is marked as a lossless zone.
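Purely as an illustration, the per-component decision can be
sketched in Python as follows, assuming that a high-frequency level
has already been measured for each color plane. The function name
is_high_detail, the plane labels "Y", "U" and "V" and the numeric
values are hypothetical and chosen only for the example.

```python
def is_high_detail(levels, thresholds):
    """Return True if, for at least one color plane, the measured
    high-frequency level exceeds that plane's own threshold.

    levels and thresholds are dicts keyed by plane name (hypothetical layout).
    """
    return any(level > thresholds[plane] for plane, level in levels.items())


# Each color plane is associated with its own threshold; here the
# three thresholds happen to share the same value.
thresholds = {"Y": 10, "U": 10, "V": 10}
text_block = {"Y": 12, "U": 3, "V": 2}    # exceeds the Y threshold
photo_block = {"Y": 5, "U": 3, "V": 2}    # exceeds no threshold
```

With these assumed values, text_block would be marked as a lossless
zone and photo_block would not, since a single plane exceeding its
threshold is sufficient.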
[0043] FIG. 2 illustrates such a principle. In this figure, an
elementary block 201 containing a button type graphic element 202
is represented. First, the elementary block undergoes filtering
203, equivalent to edge detection. This is high-pass filtering
allowing a high-frequency elementary block 204 to be obtained. Such
filtering amplifies the abrupt variations in the image contained in
the elementary block under consideration, and reduces the smooth
parts, without details, of the source image. Thus, in the
high-frequency elementary block, a high quantity of very bright
pixels is found at the locations where there is text, or many
details.
[0044] High-frequency image 204 may be obtained by a
differentiation filter such as the Laplacian filter.
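By way of example, such a differentiation filter may be sketched in
Python as follows for a single color plane. This is a minimal
sketch assuming the common 4-neighbor Laplacian kernel, the
absolute value of the response, and replication of edge pixels at
the block border; none of these choices is imposed by the present
description, and the name laplacian is hypothetical.

```python
def laplacian(plane):
    """Absolute response of the 4-neighbor Laplacian on one color plane.

    plane is a list of rows of pixel values. Pixels outside the block
    are taken equal to the nearest border pixel (an assumption).
    """
    h, w = len(plane), len(plane[0])

    def px(y, x):
        # Clamp coordinates so border pixels are replicated.
        return plane[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[abs(px(y - 1, x) + px(y + 1, x)
                 + px(y, x - 1) + px(y, x + 1)
                 - 4 * px(y, x))
             for x in range(w)]
            for y in range(h)]


# A 4*4 block containing one sharp vertical edge.
edge = [[0, 0, 255, 255] for _ in range(4)]
high_freq = laplacian(edge)
# Each row of high_freq is [0, 255, 255, 0]: bright only along the edge.
```

The smooth parts on either side of the edge produce zero response,
while the abrupt transition produces bright pixels, which is the
behavior relied upon by the thresholding that follows.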
[0045] Secondly, a thresholding operation 205 is carried out in
order to determine if the compression of the elementary block
should be of the lossless or lossy type. Thus, once the
high-frequency image has been calculated for the elementary block
under consideration, it is necessary to mark said elementary block
as lossless or lossy.
[0046] Thus, for example, the following different steps are
planned:
[0047] A step of thresholding applied to the high-frequency image,
where the value "1" is assigned to a pixel whose frequency value is
greater than a threshold that was previously determined,
advantageously empirically;
[0048] A step of counting in which the number of pixels is counted
in the elementary block under consideration that were assigned the
value 1 in the previous step;
[0049] A decision step: If the number obtained in the previous step
is greater than a given value, determined empirically, for example,
then the elementary block under consideration is marked as
lossless. If not, the elementary block under consideration is
marked as lossy.
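The three steps above can be sketched in Python as follows. This
is an illustrative sketch only: the function name mark_block, the
parameter names and the numeric thresholds are assumptions, the
description stating only that the thresholds are determined
beforehand, for example empirically.

```python
def mark_block(high_freq, pixel_threshold, count_threshold):
    """Mark one elementary block as "lossless" or "lossy".

    high_freq is the high-frequency image of the block (list of rows).
    """
    # Thresholding: assign 1 to pixels above the pixel threshold.
    binary = [[1 if value > pixel_threshold else 0 for value in row]
              for row in high_freq]
    # Counting: number of pixels that were assigned the value 1.
    count = sum(sum(row) for row in binary)
    # Decision: enough high-frequency pixels means a lossless block.
    return "lossless" if count > count_threshold else "lossy"


detailed = [[0, 255, 255, 0] for _ in range(4)]   # 8 bright pixels
smooth = [[0, 3, 2, 0] for _ in range(4)]         # no bright pixels
```

With an assumed pixel threshold of 128 and count threshold of 4,
mark_block(detailed, 128, 4) yields "lossless" and
mark_block(smooth, 128, 4) yields "lossy".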
[0050] As shown in FIG. 3, for an image 300 composed, by way of
example, of 16 elementary blocks, a plurality of elementary blocks
marked lossless are thus obtained, represented hatched, the other
elementary blocks being marked lossy, represented unhatched.
[0051] According to a first advantageous embodiment of the
invention, if an elementary block 301 is marked lossy after the
thresholding operation 205, but is surrounded by a first elementary
block 302 marked lossless and by a second elementary block 303 also
marked lossless, elementary block 301 under consideration is
ultimately marked as lossless. Thus, it will be subject to lossless
compression. Such an operation improves the rendering of the image
that will later be restored, by preventing too many transitions
between the elementary blocks compressed in a lossless manner and
the elementary blocks compressed in a lossy manner.
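As an illustration, this re-marking can be sketched in Python as
follows, the marks being held in a grid of booleans in which True
denotes a block marked lossless. The sketch assumes a single pass
evaluated against the original marks and treats positions outside
the image as not lossless; the function name promote_surrounded and
these choices are assumptions made for the example.

```python
def promote_surrounded(marks):
    """Return a new grid in which each lossy block lying directly
    between two lossless blocks (left/right or above/below) is
    re-marked as lossless. marks[r][c] is True for a lossless block.
    """
    rows, cols = len(marks), len(marks[0])

    def lossless(r, c):
        # Positions outside the image count as not lossless (assumption).
        return 0 <= r < rows and 0 <= c < cols and marks[r][c]

    result = [row[:] for row in marks]
    for r in range(rows):
        for c in range(cols):
            if not marks[r][c]:
                horizontal = lossless(r, c - 1) and lossless(r, c + 1)
                vertical = lossless(r - 1, c) and lossless(r + 1, c)
                if horizontal or vertical:
                    result[r][c] = True
    return result


marks = [[True, False, True],
         [False, False, False],
         [False, False, False]]
promoted = promote_surrounded(marks)
# Only the middle block of the first row is promoted.
```

Because a newly promoted block may itself enclose another lossy
block, an implementation wishing to guarantee that no lossy block
remains between two lossless blocks could repeat the pass until the
grid no longer changes.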
[0052] According to another advantageous embodiment of the method
according to the invention, the homogeneous elementary blocks that
have been marked lossy after the thresholding operation 205 are
transformed into elementary blocks marked lossless. A homogeneous
elementary block, also known as a flat zone, is a zero-gradient
zone. Such zones are characterized, for the three color components
under consideration, by a null vectorial derivative along two
perpendicular axes of the image under consideration.
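By way of illustration, a homogeneity test of this kind can be
sketched in Python as follows. The sketch assumes that the
derivative is approximated by differences between horizontally and
vertically adjacent pixels within the block; the function name
is_homogeneous and the data layout are hypothetical.

```python
def is_homogeneous(planes):
    """Return True if every color plane of the block has a zero
    gradient along both the horizontal and the vertical axis.

    planes is a list of color planes, each a list of rows of pixels.
    """
    for plane in planes:
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                # Horizontal difference with the pixel to the right.
                if x + 1 < len(row) and row[x + 1] != value:
                    return False
                # Vertical difference with the pixel below.
                if y + 1 < len(plane) and plane[y + 1][x] != value:
                    return False
    return True


flat_block = [[[50] * 4 for _ in range(4)] for _ in range(3)]
textured_block = [[[50] * 4 for _ in range(4)] for _ in range(3)]
textured_block[2][1][3] = 51   # one differing pixel in one plane
```

Under these assumptions, flat_block is homogeneous, whereas a single
differing pixel in any one of the three planes is enough for
textured_block not to be.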
[0053] Such a mode of embodiment also improves the rendering of the
restored image, since the text zones, which are thus encoded in
lossless format, are very often directly surrounded by flat zones.
In addition, this embodiment is not costly in terms of the required
bandwidth, since homogeneous elementary blocks compress well even
when encoded in lossless format.
[0054] FIG. 4 shows an image 402, composed of a first window 400
and a second window 401. Elementary blocks 403 having undergone
lossless compression, represented hatched, and elementary blocks
404 having undergone lossy compression, represented unhatched, are
illustrated in this figure. Image 402 has undergone the encoding
method according to the invention, with the application of the
first advantageous embodiment that has just been described. Thus,
no lossy elementary block is disposed directly between two lossless
elementary blocks.