U.S. patent application number 11/515198 was filed with the patent office on 2007-09-06 for information processing apparatus, method of computer control, computer readable medium, and computer data signal.
This patent application is currently assigned to FUJI XEROX CO., LTD.. Invention is credited to Hitoshi Ikeda, Junichi Takeda.
Application Number | 20070206862 / 11/515198 |
Family ID | 38471548 |
Filed Date | 2007-09-06 |
United States Patent Application | 20070206862 |
Kind Code | A1 |
Takeda; Junichi ; et al. | September 6, 2007 |
Information processing apparatus, method of computer control, computer readable medium, and computer data signal
Abstract
An image processing apparatus includes: an image pickup unit
that images an image of a face; a first extraction unit that
extracts a first image pattern as a correct solution pattern based
on a sample image of the face; a second extraction unit that
extracts a second image pattern as a counterexample pattern based
on the sample image; a learning unit that learns a pattern
recognition of the target part based on the first image pattern and
the second image pattern; an identification unit that identifies a
face area from the image of the face, the face area being an area
where the face is shown; and a detection unit that detects a
position of the target part from the face area based on the pattern
recognition of the target part.
Inventors: |
Takeda; Junichi; (Kanagawa,
JP) ; Ikeda; Hitoshi; (Kanagawa, JP) |
Correspondence
Address: |
OLIFF & BERRIDGE, PLC
P.O. BOX 19928
ALEXANDRIA
VA
22320
US
|
Assignee: |
FUJI XEROX CO., LTD.
Tokyo
JP
|
Family ID: |
38471548 |
Appl. No.: |
11/515198 |
Filed: |
September 5, 2006 |
Current U.S.
Class: |
382/209 ;
382/165; 382/195 |
Current CPC
Class: |
G06K 9/6254 20130101;
G06K 9/00248 20130101 |
Class at
Publication: |
382/209 ;
382/195; 382/165 |
International
Class: |
G06K 9/62 20060101
G06K009/62; G06K 9/00 20060101 G06K009/00; G06K 9/46 20060101
G06K009/46 |
Foreign Application Data
Date |
Code |
Application Number |
Mar 2, 2006 |
JP |
2006-056970 |
Claims
1. An image processing apparatus comprising: an image pickup unit
that images an image of a face; a first extraction unit that
extracts a first image pattern as a correct solution pattern based
on a sample image of the face, the first image pattern being an
image pattern of a target part of the face; a second extraction
unit that extracts a second image pattern as a counterexample
pattern based on the sample image, the second image pattern being
an image pattern of the target part in an adjacent area of the
correct solution pattern; a learning unit that learns a pattern
recognition of the target part based on the first image pattern and
the second image pattern; an identification unit that identifies a
face area from the image of the face, the face area being an area
where the face is shown; and a detection unit that detects a
position of the target part from the face area based on the pattern
recognition of the target part.
2. The image processing apparatus as claimed in claim 1, which
comprises: a determination unit that determines a search area based
on color information of the face, the search area being an area
where a search is made for the target part, wherein the detection
unit detects the position of the target part from the search area
based on the pattern recognition of the target part.
3. The image processing apparatus as claimed in claim 2, wherein
the target part is a mouth corner, and the determination unit
determines the search area based on a red component of complexion
of the face area.
4. The image processing apparatus as claimed in claim 2, wherein
the target part is a nose, and the determination unit determines
the search area based on luminance of complexion of the face
area.
5. A control method of an image processing apparatus, which
comprises: imaging an image of a face; first extracting a first
image pattern as a correct solution pattern based on a sample image
of the face, the first image pattern being an image pattern of a
target part of the face; second extracting a second image pattern
as a counterexample pattern based on the sample image, the second
image pattern being an image pattern of the target part in an
adjacent area of the correct solution pattern; learning a pattern
recognition of the target part based on the first image pattern and
the second image pattern; identifying a face area from the image of
the face, the face area being an area where the face is shown; and
detecting a position of the target part from the face area based on
the pattern recognition of the target part.
6. A computer readable medium storing a program causing a computer
to execute a process for controlling an image processing apparatus,
which comprises: imaging an image of a face; first extracting a
first image pattern as a correct solution pattern based on a sample
image of the face, the first image pattern being an image pattern
of a target part of the face; second extracting a second image
pattern as a counterexample pattern based on the sample image, the
second image pattern being an image pattern of the target part in
an adjacent area of the correct solution pattern; learning a
pattern recognition of the target part based on the first image
pattern and the second image pattern; identifying a face area from
the image of the face, the face area being an area where the face
is shown; and detecting a position of the target part from the face
area based on the pattern recognition of the target part.
7. A computer data signal embodied in a carrier wave for enabling a
computer to perform a process for controlling an image processing
apparatus, which comprises: imaging an image of a face; first
extracting a first image pattern as a correct solution pattern
based on a sample image of the face, the first image pattern being
an image pattern of a target part of the face; second extracting a
second image pattern as a counterexample pattern based on the
sample image, the second image pattern being an image pattern of
the target part in an adjacent area of the correct solution
pattern; learning a pattern recognition of the target part based on
the first image pattern and the second image pattern; identifying a
face area from the image of the face, the face area being an area
where the face is shown; and detecting a position of the target
part from the face area based on the pattern recognition of the
target part.
Description
BACKGROUND
[0001] 1. Technical Field
[0002] This invention relates to image processing and particularly
to an art of detecting a part of a face from a picked-up image of
the face.
[0003] 2. Related Art
[0004] Hitherto, a detecting technique according to template
matching of the feature amount independent of colors, such as
luminance information has been proposed as an art of detecting
parts (an eye, a nose, a mouth corner, etc.,) of a face. In the
technique, a partial image is extracted from an image and the
matching degree between the extracted partial image and a template
is calculated. The most matched partial image is adopted as the
face part of the detection target. Aside from the arts, a technique
of detecting the area of each part using the color feature amounts
of the face parts is also proposed.
[0005] It is therefore an object of the invention to provide an
information processing apparatus and a computer control method and
program capable of detecting a part of a face with accuracy while
lightening processing load.
SUMMARY
[0006] According to an aspect of the invention, an image processing
apparatus includes: an image pickup unit that images an image of a
face; a first extraction unit that extracts a first image pattern
as a correct solution pattern based on a sample image of the face,
the first image pattern being an image pattern of a target part of
the face; a second extraction unit that extracts a second image
pattern as a counterexample pattern based on the sample image, the
second image pattern being an image pattern of the target part in
an adjacent area of the correct solution pattern; a learning unit
that learns a pattern recognition of the target part based on the
first image pattern and the second image pattern; an identification
unit that identifies a face area from the image of the face, the
face area being an area where the face is shown; and a detection
unit that detects a position of the target part from the face area
based on the pattern recognition of the target part.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] An exemplary embodiment of the present invention will be
described in detail based on the following figures, wherein:
[0008] FIG. 1 is a block diagram that illustrates the hardware
configuration of an image processing apparatus;
[0009] FIG. 2 is a functional block diagram of the image processing
apparatus;
[0010] FIG. 3 is a drawing that illustrates an example of a sample
image to learn image recognition;
[0011] FIG. 4 is an enlarged drawing of a part of the sample
image;
[0012] FIG. 5 is a flowchart of learning processing of the image
processing apparatus; and
[0013] FIG. 6 is a flowchart of image recognition processing of the
image processing apparatus.
DETAILED DESCRIPTION
[0014] Referring now to the accompanying drawings (FIGS. 1 to 6),
there is illustrated an exemplary embodiment of the invention.
[0015] To begin with, the configuration of an image processing
apparatus 10 according to the invention will be discussed.
[0016] FIG. 1 is a block diagram to show the hardware configuration
of the image processing apparatus 10. As illustrated in the figure,
the image processing apparatus 10 includes a processor 100, memory
102, an input/output interface 104, a graphic interface 106, and
storage of a hard disk 108, etc., as the physical configuration.
The components are connected so that they can communicate with each
other via a bus 110.
[0017] The processor 100 controls the components of the image
processing apparatus 10 based on an operating system and programs
stored in the storage of the hard disk 108, the memory 102, etc. A
program and data are written into the memory 102 as required and
the memory 102 is also used as work memory of the processor
100.
[0018] The input/output interface 104 is hardware for controlling
input/output of data signals to/from the image processing apparatus
10. In the embodiment, a camera 20 is connected to the input/output
interface 104. The input/output interface 104 may be based on the
standard of a serial interface such as USB.
[0019] The graphic interface 106 includes video memory. It outputs
an image to a connected display 30 in accordance with image data
stored in the video memory in order.
[0020] FIG. 2 is a functional block diagram of the image processing
apparatus 10. As illustrated in the figure, the image processing
apparatus 10 includes an image pickup section 200, a picked-up
image input section 202, an image output section 204, a detection
target part learning section 206, a face area determination section
(serving as identification unit) 208, a search area determination
section (serving as determination unit) 210, and a target part
detection section 212 as the functional configuration. The sections
will be discussed in detail below:
[0021] The image pickup section 200 has a function of picking up an
image. The image pickup section 200 is a function provided by the
camera 20 connected to the image processing apparatus 10. The
camera 20 may be a WEB camera, etc., including a CCD camera and may
be connected to the image processing apparatus 10 through a serial
interface such as USB. The camera 20 may have a function of picking
up an image in order at predetermined time intervals (for example,
1/60 seconds).
[0022] The picked-up image input section 202 has a function of
accepting input of image data picked up in the image pickup section
200. The picked-up image input section 202 is a function provided
by the input/output interface 104 of the image processing apparatus
10. The input/output interface 104 may be an interface such as
USB.
[0023] The image output section 204 has a function of displaying an
image based on the image data input to the picked-up image input
section 202. The image output section 204 includes the display 30
connected to the image processing apparatus 10. The image output
section 204 may add predetermined image processing to the input
image data before outputting the resultant image.
[0024] The detection target part learning section 206 has a
function of learning an image pattern of the target part of a face
to be detected based on a sample image of a picked-up face image.
The detection target part learning section 206 includes a correct
solution pattern extraction section 206A and a counterexample
pattern extraction section 206B. The sections perform processing of
extracting an image pattern to learn image recognition of the
target part from the sample image, as described later in detail.
The detection target part learning section 206 inputs the image
patterns extracted by the correct solution pattern extraction
section 206A and the counterexample pattern extraction section 206B
to a support vector machine for learning the image pattern of the
target part.
[0025] FIG. 3 shows an example of a sample image input to the image
processing apparatus 10. As illustrated in the figure, a sample
image 300 contains a user face area 302. In the embodiment, the
sample image 300 is a picked-up image of the whole face, but need
not cover the whole face and may be a picked-up image of a mouth
proximity area 310 containing the target parts to be learnt (in the
embodiment, mouth corners 304A and 304B).
[0026] The correct solution pattern extraction section 206A has a
function of extracting the image pattern of the target part of the
face to be detected as a correct solution pattern based on a sample
image of a picked-up face image. In the embodiment, mouth corners
are adopted as the target parts. The correct solution pattern
extraction section 206A extracts the image pattern of a picked-up
image of the target mouth corner as the correct solution pattern
based on the mouth corner contained in the sample image and the
position coordinates of the mouth corner. If the position
coordinates of the mouth corner are not previously known, the user
may be allowed to specify the mouth corner position with a pointing
device such as a mouse for acquiring the position coordinates of
the mouth corner.
[0027] The counterexample pattern extraction section 206B has a
function of extracting the image pattern within a predetermined
range in the proximity of the correct solution pattern as a
counterexample pattern of the target part based on the sample image
of the picked-up face image. In the embodiment, the counterexample
pattern extraction section 206B determines based on the position
coordinates of the mouth corner contained in the sample image that
the position at a predetermined offset distance from the position
coordinates is the extraction position of the image pattern. The
counterexample pattern extraction section 206B extracts the image
pattern of a predetermined size as the counterexample pattern.
Plural partial images within a predetermined range from the
position of the target part may be extracted as the counterexample
pattern.
[0028] The correct solution pattern and the counterexample pattern
will be discussed specifically with reference to FIG. 4. FIG. 4 is
an enlarged drawing of the mouth proximity area 310 of the face in
FIG. 3. As illustrated in FIG. 4, correct solution patterns 320A
and 320B are partial images containing mouth corners 304A and 304B.
The correct solution pattern 320A corresponds to the mouth corner
304A and the correct solution pattern 320B corresponds to the mouth
corner 304B. Partial images illustrated in the proximity of the
correct solution patterns 320A and 320B are counterexample patterns
330A and 330B of the mouth corners. The counterexample pattern 330A
corresponds to the mouth corner 304A and the counterexample pattern
330B corresponds to the mouth corner 304B. Of course, the number
and the positions of the counterexample patterns to be extracted
are not limited to those illustrated in FIG. 4 and may be different
therefrom.
[0029] The detection target part learning section 206 adds "+1" as
label data to the image pattern extracted in the correct solution
pattern extraction section 206A and inputs the image pattern to the
support vector machine. The detection target part learning section
206 adds "-1" as label data to the image pattern extracted in the
counterexample pattern extraction section 206B and inputs the image
pattern to the support vector machine. When all image patterns have
been input to the support vector machine, image recognition
learning is executed. The parameters provided by performing the
learning processing are stored in the memory 102.
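The preparation of labeled training data described in paragraphs [0026] to [0029] can be sketched as follows. This is a minimal illustration, not the patented implementation: the grayscale image is a list of rows, the patch size and counterexample offsets are assumed illustrative values, and the names (`extract_patch`, `build_training_set`, `PATCH`, `OFFSETS`) are hypothetical.

```python
PATCH = 8          # assumed patch size in pixels (square)
OFFSETS = [(-6, 0), (6, 0), (0, -6), (0, 6)]  # assumed counterexample offsets

def extract_patch(image, cx, cy, size=PATCH):
    """Extract a size x size patch centered at (cx, cy), clamped to the image."""
    h, w = len(image), len(image[0])
    half = size // 2
    x0 = min(max(cx - half, 0), w - size)
    y0 = min(max(cy - half, 0), h - size)
    return [row[x0:x0 + size] for row in image[y0:y0 + size]]

def build_training_set(image, corner_xy):
    """Return (patterns, labels): label +1 for the correct solution pattern
    at the mouth-corner coordinate, -1 for each counterexample patch at a
    predetermined offset from that coordinate."""
    cx, cy = corner_xy
    patterns = [extract_patch(image, cx, cy)]
    labels = [+1]
    for dx, dy in OFFSETS:
        patterns.append(extract_patch(image, cx + dx, cy + dy))
        labels.append(-1)
    return patterns, labels

# usage on a synthetic 32x32 grayscale image with a known corner coordinate
img = [[(x * y) % 256 for x in range(32)] for y in range(32)]
X, y = build_training_set(img, (16, 16))
```

The resulting `(X, y)` pairs would then be fed to a support vector machine trainer, which is outside this sketch.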
[0030] The face area determination section 208 has a function of
determining the area where the user face is shown from the
picked-up image. The face area determination section 208 determines
the face area from the whole of the picked-up image in FIG. 3, for
example. Various face area determination techniques have been
proposed. In the invention, known face area determination techniques
may be used. As an example of the face area determination technique,
"technique of detection apparatus of face of person" described in
patent document 3 may be used. The face area determination section
208 is a function provided when a program created according to an
algorithm based on the face area determination technique is read
into the memory 102 and the processor 100 operates in accordance
with the read program.
[0031] The search area determination section 210 has a function of
determining a search area where a search is made for the target
part based on color information of the face area determined by the
face area determination section 208. The search area determination
section 210 includes a color information processing section 210A.
The color information processing section 210A has a function of
calculating the threshold value of mouth corner search based on the
color information of the determined face area. In the embodiment,
processing of determining the search area when the target part is a
mouth corner will be discussed below:
[0032] To begin with, the processor 100 of the image processing
apparatus 10 estimates the position of the center point of the face
area from the coordinate data of the determined face area. The
processor 100 calculates the average value of red components of the
complexion from the horizontal line passing through the estimated
center point. The processor 100 calculates the maximum value of red
components of the complexion from the vertical line passing through
the center point of the determined face area. Next, the processor
100 further calculates the average value of the average value and
the maximum value of the red components previously calculated. The
value is adopted as the threshold value in mouth corner search. The
processor 100 determines that the area having a red component equal
to or greater than the calculated threshold value in the determined
face area is the search area.
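The threshold computation of paragraph [0032] can be sketched as below, assuming the face area is an RGB raster stored as rows of `(r, g, b)` tuples and taking the red channel directly as the "red component of the complexion"; the function name and the toy data are illustrative, not from the patent.

```python
def mouth_search_area(face, center_x, center_y):
    """Threshold = average of (mean red along the horizontal line through
    the center point, max red along the vertical line through it); the
    search area is every pixel whose red component is >= that threshold."""
    horizontal = [px[0] for px in face[center_y]]
    vertical = [row[center_x][0] for row in face]
    avg_red = sum(horizontal) / len(horizontal)
    max_red = max(vertical)
    threshold = (avg_red + max_red) / 2
    mask = [[px[0] >= threshold for px in row] for row in face]
    return threshold, mask

# usage: a 4x4 face area where a single "lip" pixel is strongly red
face = [[(100, 80, 80)] * 4 for _ in range(4)]
face[2][1] = (220, 60, 60)
t, mask = mouth_search_area(face, 2, 2)
```

Only the strongly red pixel survives the threshold, so the later template-matching pass scans far fewer positions than a whole-face search.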
[0033] The target part detection section 212 has a function of
detecting the position of the target part according to pattern
recognition learnt by the detection target part learning section
206 from within the face area determined by the face area
determination section 208. The target part detection section 212
extracts a partial image in order at predetermined intervals from
within the search area determined by the search area determination
section 210. Pattern matching as to whether or not each extracted
partial image matches the image pattern of the target part is
performed in order. The pattern matching is performed using the
support vector machine that has previously learnt image recognition of the
target part. As the pattern matching is performed, the position of
the partial image most matching the target part is detected as the
position of the target part.
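The detection loop of paragraph [0033] can be sketched as a sliding-window scan that keeps the best-scoring position. In the patent the score is the distance from the support vector machine's decision boundary; here a hypothetical stand-in scorer (mean window intensity) is injected so the sketch is self-contained, and `STEP` and `SIZE` are assumed illustrative values.

```python
STEP = 2   # assumed scan interval in pixels
SIZE = 4   # assumed partial-image size

def detect_target(image, score):
    """Slide a SIZE x SIZE window over the image in STEP-pixel strides and
    return the top-left position whose window maximizes score(window)."""
    h, w = len(image), len(image[0])
    best_pos, best_val = None, float("-inf")
    for y in range(0, h - SIZE + 1, STEP):
        for x in range(0, w - SIZE + 1, STEP):
            window = [row[x:x + SIZE] for row in image[y:y + SIZE]]
            val = score(window)
            if val > best_val:
                best_pos, best_val = (x, y), val
    return best_pos

# usage with the stand-in scorer on a 12x12 image with one bright block
img = [[0] * 12 for _ in range(12)]
for yy in range(6, 10):
    for xx in range(6, 10):
        img[yy][xx] = 255
mean_score = lambda win: sum(sum(r) for r in win) / (SIZE * SIZE)
pos = detect_target(img, mean_score)
```

Swapping `mean_score` for a trained classifier's decision-function value recovers the structure of the patented loop: the window whose score is largest is reported as the target-part position.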
[0034] A processing flow of the image processing apparatus 10
according to the embodiment of the invention will be discussed
below:
[0035] FIG. 5 is a flowchart of learning processing of learning
image recognition of a target part by the image processing
apparatus 10. As illustrated in the figure, the image processing
apparatus 10 accepts input of a gray scale image containing a mouth
corner for face recognition learning (S101). The image processing
apparatus 10 extracts the mouth corner image pattern based on the
coordinate information indicating the position of the mouth corner
in the input image data (S102). The coordinate information may be
specified by the user with a pointing device such as a mouse. In
addition to the mouth corner image pattern, the image processing
apparatus 10 extracts an image pattern in a predetermined range in
the proximity of the extracted image pattern as a counterexample
pattern (S103). Plural counterexample patterns may be extracted at
S103.
[0036] Next, label data of "+1" is added to the image pattern
extracted at S102 and label data of "-1" is added to the image
pattern extracted at S103 before input to the support
vector machine for learning the mouth corner image pattern (S104).
As the counterexample pattern is also input, a support vector
machine that indicates a negative reaction to a pattern where the
mouth corner position shifts can be generated. Thus, among the
patterns in the proximity of the mouth corner, the support vector
machine indicates a positive reaction only to the pattern where the
mouth corner is at the center, and it is made possible to detect
the precise mouth corner position. The parameter obtained as the
result of learning at S104 is stored in the memory 102 (S105) so
that the parameter can be used when a mouth corner is detected for
an image picked up by the camera 20.
[0037] FIG. 6 is a flowchart of processing of detecting a target
part of a face by the image processing apparatus 10. As illustrated
in the figure, the image processing apparatus 10 accepts input of
an image picked up by the camera 20 (S201). The image processing
apparatus 10 determines the face area where the face of the user is
shown from the input image (S202). It determines the threshold
value of the color of the red component based on the red component
of the complexion for the determined face area. The image
processing apparatus 10 determines that the area having a red
component equal to or greater than the calculated threshold value
is the search area (S203). The image processing apparatus 10
extracts a partial image in order at predetermined intervals from
the search area. It calculates the distance from the determination
boundary of the support vector machine previously learnt about the
feature amount calculated from the extracted partial image
(hereinafter, the processing will be referred to as template
matching processing (S204)). The image processing apparatus 10
applies the template matching processing to the extracted partial
images in order and detects the extraction position of the partial
image having the largest distance value as the mouth corner
position (S205) and terminates the processing.
[0038] The image processing apparatus 10 according to the
embodiment of the invention described above can detect a face part
with accuracy based on image recognition learnt according to the
correct solution pattern and the counterexample pattern of the
detection target part. The image processing apparatus 10 according
to the invention can decrease the number of iterations of template
matching in detecting the face part, so that the processing load
can be lightened.
[0039] The invention is not limited to the embodiment described
above.
[0040] For example, the image processing apparatus 10 according to
the embodiment adopts the mouth corner as the detection target, but
is not limited to it and can be applied to various parts of a nose,
etc. To adopt a nose as the detection target according to the
invention, the search area may be determined as follows:
[0041] The processor 100 of the image processing apparatus 10
calculates the average value of luminance components of the
complexion from the horizontal line passing through the center
point of the detected face. The processor 100 calculates the
minimum value of luminance components of the complexion from the
vertical line passing through the center point of the detected
face. Next, the processor 100 further calculates the average value
of the calculated average value and the calculated minimum value
and adopts the value as the threshold value in nose search. The
processor 100 determines that the area having a luminance component
equal to or less than the previously calculated threshold value in
the detected face area is the search area.
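The nose variant of the search-area computation in paragraph [0041] mirrors the mouth-corner case with luminance in place of the red component and a minimum in place of a maximum. A minimal sketch, assuming the face area is already a grayscale (luminance) raster and using illustrative names and toy data:

```python
def nose_search_area(luma, center_x, center_y):
    """Threshold = average of (mean luminance along the horizontal line
    through the center point, minimum luminance along the vertical line
    through it); the search area keeps pixels at or below the threshold."""
    horizontal = luma[center_y]
    vertical = [row[center_x] for row in luma]
    avg_l = sum(horizontal) / len(horizontal)
    min_l = min(vertical)
    threshold = (avg_l + min_l) / 2
    mask = [[v <= threshold for v in row] for row in luma]
    return threshold, mask

# usage: a bright 4x4 face with one dark "nostril" pixel on the center column
luma = [[200] * 4 for _ in range(4)]
luma[1][2] = 40
t, mask = nose_search_area(luma, 2, 2)
```

Because nostrils and the shadow under the nose are darker than the surrounding complexion, the `<=` comparison isolates them, just as `>=` on the red component isolated the lips.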
[0042] FIG. 5 [0043] S101 INPUT SAMPLE IMAGE [0044] S102 EXTRACT
IMAGE PATTERN OF TARGET PART [0045] S103 EXTRACT IMAGE PATTERN IN
PROXIMITY OF TARGET PART [0046] S104 LEARN IMAGE RECOGNITION OF
TARGET PART USING SVM [0047] S105 STORE PARAMETER
[0048] FIG. 6 [0049] S201 INPUT PICKED-UP IMAGE [0050] S202
DETERMINE FACE AREA [0051] S203 DETERMINE SEARCH AREA BASED ON
COLOR INFORMATION OF FACE AREA [0052] S204 PERFORM TEMPLATE
MATCHING PROCESSING IN SEARCH AREA [0053] S205 DETECT TARGET
PART
* * * * *