U.S. patent application number 08/885823 was filed with the patent office on 1997-06-30 and published on 2001-08-16 as publication number 20010014171, for a three-dimensional information processing apparatus and method.
This patent application is currently assigned to CANON KABUSHIKI KAISHA. Invention is credited to IIJIMA, KATSUMI, ISHIKAWA, MOTOHIRO, KATAYAMA, TATSUSHI, KURAHASHI, SUNAO, MATSUGU, MASAKAZU, MORI, KATSUHIKO, OKAUCHI, SHIGEKI, SEKINE, MASAYOSHI, YANO, KOTARO.
Application Number: 08/885823
Publication Number: 20010014171
Family ID: 27326257
Publication Date: 2001-08-16
United States Patent Application: 20010014171
Kind Code: A1
Inventors: IIJIMA, KATSUMI; et al.
Published: August 16, 2001
THREE-DIMENSIONAL INFORMATION PROCESSING APPARATUS AND METHOD
Abstract
A three-dimensional information processing apparatus for obtaining three-dimensional information from an object having a three-dimensional shape, and performing predetermined information processing, comprises a camera for sensing images of the object from a plurality of coordinate positions using an image sensing system having one or a plurality of optical systems. A plurality of depth information are extracted from image sensing related information sensed by the camera at the plurality of coordinate positions, and the plurality of extracted depth information are converted and unified into depth information expressed by a unified coordinate system.
Inventors: IIJIMA, KATSUMI (TOKYO, JP); OKAUCHI, SHIGEKI (TOKYO, JP); MATSUGU, MASAKAZU (CHIBA-SHI, JP); SEKINE, MASAYOSHI (TOKYO, JP); YANO, KOTARO (YOKOHAMA-SHI, JP); KURAHASHI, SUNAO (KAWASAKI-SHI, JP); KATAYAMA, TATSUSHI (TOKYO, JP); MORI, KATSUHIKO (KAWASAKI-SHI, JP); ISHIKAWA, MOTOHIRO (YOKOHAMA-SHI, JP)
Correspondence Address:
MORGAN AND FINNEGAN
345 PARK AVENUE
NEW YORK, NY 10154
Assignee: CANON KABUSHIKI KAISHA
Family ID: 27326257
Appl. No.: 08/885823
Filed: June 30, 1997
Current U.S. Class: 382/154; 348/E13.005; 348/E13.008; 348/E13.014; 348/E13.018; 348/E13.025; 348/E13.071
Current CPC Class: H04N 2013/0081 20130101; H04N 13/128 20180501; G06T 7/55 20170101; G06T 2207/10012 20130101; H04N 13/296 20180501; H04N 13/254 20180501; H04N 13/194 20180501; H04N 13/239 20180501; H04N 13/221 20180501; H04N 13/207 20180501; H04N 13/189 20180501
Class at Publication: 382/154
International Class: G06K 009/00
Foreign Application Data

Date        | Code | Application Number
Jul 1, 1996 | JP   | 8-189972
Jul 4, 1996 | JP   | 8-192727
Jul 5, 1996 | JP   | 8-194102
Claims
What is claimed is:
1. A three-dimensional information processing apparatus for
obtaining three-dimensional information from an object having a
three-dimensional shape, and performing predetermined information
processing, comprising: image sensing means for sensing images of
the object from a plurality of coordinate positions using an image
sensing system having one or a plurality of optical systems;
information extraction means for extracting a plurality of depth
information from image sensing related information obtained by said
image sensing means at the plurality of coordinate positions; and
conversion/unification means for converting and unifying the plurality of depth information extracted by said information extraction means into depth information expressed by a unified coordinate system.
2. The apparatus according to claim 1, wherein said
conversion/unification means obtains image information of the
object from said image sensing means, detects a displacement
between coordinate systems of the plurality of depth information on
the basis of the obtained image information, and converts and
unifies the plurality of depth information onto the unified
coordinate system.
3. The apparatus according to claim 1, wherein the unified
coordinate system has five different projection planes.
4. The apparatus according to claim 2, wherein the image
information includes luminance information of the object, and said
conversion/unification means detects the displacement between the
coordinate systems on the basis of the luminance information.
5. A three-dimensional information processing method for obtaining
three-dimensional information from an object having a
three-dimensional shape, and performing predetermined information
processing, comprising: the first step of sensing images of the
object from a plurality of coordinate positions using an image
sensing system having one or a plurality of optical systems; the
second step of extracting a plurality of depth information from
image sensing related information sensed at the plurality of
coordinate positions in the first step; and the third step of converting and unifying the plurality of depth information extracted in the second step into depth information expressed by a unified coordinate system.
6. The method according to claim 5, wherein the third step has the
step of obtaining image information of the object obtained in the
first step, detecting a displacement between coordinate systems of
the plurality of depth information on the basis of the obtained
image information, and converting and unifying the plurality of
depth information onto the unified coordinate system.
7. The method according to claim 5, wherein the unified coordinate
system has five different projection planes.
8. The method according to claim 6, wherein the image information
includes luminance information of the object, and the displacement
between the coordinate systems is detected on the basis of the
luminance information.
9. A three-dimensional information processing apparatus for
obtaining three-dimensional information from an object having a
three-dimensional shape, and performing predetermined information
processing, comprising: image sensing means for sensing images of
the object using an image sensing system having one or a plurality
of optical systems; three-dimensional shape extraction means for
extracting three-dimensional shape information of the object from
image sensing related information obtained by said image sensing
means; and reliability determination means for determining
reliability of the three-dimensional shape information extracted by
said three-dimensional shape extraction means.
10. The apparatus according to claim 9, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an angle of the
object with respect to an image sensing plane.
11. The apparatus according to claim 9, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a distance
between said image sensing means and the object.
12. The apparatus according to claim 9, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an angle a pad
that places the object thereon makes with an image sensing plane of
said image sensing means.
13. The apparatus according to claim 9, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an area ratio
of a pad that places the object thereon to an image sensing
region.
14. The apparatus according to claim 9, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a position of a
pad that places the object thereon.
15. The apparatus according to claim 9, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of reflected light
information reflected by the object.
16. The apparatus according to claim 9, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a degree of
correspondence of pixels between a plurality of image sensing
related data sensed by said image sensing means.
17. A three-dimensional information processing apparatus for
obtaining three-dimensional information from an object having a
three-dimensional shape, and performing predetermined information
processing, comprising: image sensing means for sensing images of
the object using an image sensing system having one or a plurality
of optical systems; three-dimensional shape extraction means for
extracting three-dimensional shape information of the object from
image sensing related information sensed by said image sensing
means; reliability determination means for determining reliability
of the three-dimensional shape information extracted by said
three-dimensional shape extraction means; and informing means for
informing a reliability determination result of said reliability
determination means.
18. The apparatus according to claim 17, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an angle of the
object with respect to an image sensing plane.
19. The apparatus according to claim 17, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a distance
between said image sensing means and the object.
20. The apparatus according to claim 17, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an angle a pad
that places the object thereon makes with an image sensing plane of
said image sensing means.
21. The apparatus according to claim 17, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an area ratio
of a pad that places the object thereon to an image sensing
region.
22. The apparatus according to claim 17, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a position of a
pad that places the object thereon.
23. The apparatus according to claim 17, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of reflected light
information reflected by the object.
24. The apparatus according to claim 17, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a degree of
correspondence of pixels between a plurality of image sensing
related data sensed by said image sensing means.
25. A three-dimensional information processing apparatus for
obtaining three-dimensional information from an object having a
three-dimensional shape, and performing predetermined information
processing, comprising: image sensing means for sensing images of
the object using an image sensing system having one or a plurality
of optical systems; three-dimensional shape extraction means for
extracting three-dimensional shape information of the object from
image sensing related information sensed by said image sensing
means; reliability determination means for determining reliability
of the three-dimensional shape information extracted by said
three-dimensional shape extraction means; and display means for
processing the three-dimensional shape information in accordance
with a reliability determination result of said reliability
determination means, and displaying the processed three-dimensional
shape information.
26. The apparatus according to claim 25, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an angle of the
object with respect to an image sensing plane.
27. The apparatus according to claim 25, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a distance
between said image sensing means and the object.
28. The apparatus according to claim 25, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an angle a pad
that places the object thereon makes with an image sensing plane of
said image sensing means.
29. The apparatus according to claim 25, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of an area ratio
of a pad that places the object thereon to an image sensing
region.
30. The apparatus according to claim 25, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a position of a
pad that places the object thereon.
31. The apparatus according to claim 25, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of reflected light
information reflected by the object.
32. The apparatus according to claim 25, wherein said reliability
determination means determines the reliability of the
three-dimensional shape information on the basis of a degree of
correspondence of pixels between a plurality of image sensing
related data sensed by said image sensing means.
33. An image sensing method comprising: the image sensing step of
sensing images of an object; the storage step of storing image
information of the object; the image sensing condition detection
step of detecting a relative relationship between the object and an
image sensing apparatus main body; and the control step of
controlling a storage operation of the image information, wherein
the control step includes the step of controlling the storage
operation in the storage step in accordance with a detection result
of the image sensing condition detection step.
34. The method according to claim 33, wherein the control step
includes the step of controlling to store information associated
with the relative relationship between the object and the image
sensing apparatus main body together with sensed images sensed in
the image sensing step in the storage step.
35. The method according to claim 33, wherein the image sensing
condition detection step includes the step of detecting the
relative relationship using a sensor for detecting an angle and
translation movement of the image sensing apparatus main body.
36. The method according to claim 33, wherein the image sensing
condition detection step includes the step of analyzing an object
image and images around the object sensed by the image sensing
apparatus main body, and detecting an angle and translation
movement of the image sensing apparatus main body on the basis of
changes in state of sensed images sensed in the image sensing
step.
37. The method according to claim 33, wherein the image sensing
condition detection step includes the step of analyzing an object
image and images around the object sensed by the image sensing
apparatus main body, and detecting changes in relative position
relationship between the object and the image sensing apparatus
main body on the basis of an error signal generated upon analyzing
the images.
38. The method according to claim 33, wherein the image sensing
condition detection step includes the step of analyzing an object
image sensed by the image sensing apparatus main body, and
detecting changes in occlusion state of the object.
39. The method according to claim 33, wherein the image sensing
condition detection step includes the step of analyzing an object
image sensed by the image sensing apparatus main body, and
detecting an overlapping region area between time-serial object
images.
40. The method according to claim 33, wherein the image sensing
condition detection step includes the step of analyzing an object
image sensed by the image sensing apparatus main body, and
detecting changes in distance image of the object.
41. An image sensing method comprising: the image sensing step of
sensing images of an object; the analysis step of analyzing image
information obtained in the image sensing step; the image sensing
condition detection step of detecting a relative relationship
between the object and an image sensing apparatus main body; and
the control step of controlling an image analysis operation in the
analysis step, wherein the control step includes the step of
controlling the image analysis operation in accordance with a
detection result of the image sensing condition detection step.
42. The method according to claim 41, wherein the analysis step
includes the step of performing an analysis calculation for
acquiring a three-dimensional shape and a surface image of the
object using a plurality of images.
43. The method according to claim 41, wherein the image sensing
condition detection step uses a sensor for detecting an angle and
translation movement of the image sensing apparatus main body.
44. The method according to claim 41, wherein the image sensing
condition detection step includes the step of analyzing an object
image and images around the object sensed by the image sensing
apparatus main body, and detecting an angle and translation
movement of the image sensing apparatus main body on the basis of
changes in state of sensed images sensed in the image sensing
step.
45. The method according to claim 41, wherein the image sensing
condition detection step includes the step of analyzing an object
image and images around the object sensed by the image sensing
apparatus main body, and detecting changes in relative position
relationship between the object and the image sensing apparatus
main body on the basis of an error signal generated upon analyzing
the images.
46. The method according to claim 41, wherein the image sensing
condition detection step includes the step of analyzing an object
image sensed by the image sensing apparatus main body, and
detecting changes in occlusion state of the object.
47. The method according to claim 41, wherein the image sensing
condition detection step includes the step of analyzing an object
image sensed by the image sensing apparatus main body, and
detecting an overlapping region area between time-serial object
images.
48. The method according to claim 41, wherein the image sensing
condition detection step includes the step of analyzing an object
image sensed by the image sensing apparatus main body, and
detecting changes in distance image of the object.
49. The method according to claim 41, wherein the image sensing
condition detection step includes the step of stopping the image
sensing step and the analysis step during a period in which neither
storage processing nor analysis processing are performed.
50. An image sensing apparatus comprising: image sensing means for
sensing images of an object; storage means for storing image
information of the object; image sensing condition detection means
for detecting a relative relationship between the object and an
image sensing apparatus main body; and control means for
controlling said storage means, wherein the control means controls
said storage means in accordance with an output from said image
sensing condition detection means.
51. The apparatus according to claim 50, wherein said control means
controls said storage means to store information associated with
the relative relationship between the object and the image sensing
apparatus main body together with sensed images sensed by said
image sensing means.
52. The apparatus according to claim 50, wherein said image sensing
condition detection means comprises a sensor for detecting an angle
and translation movement of the image sensing apparatus main
body.
53. The apparatus according to claim 50, wherein said image sensing
condition detection means analyzes an object image and images
around the object sensed by the image sensing apparatus main body,
and detects an angle and translation movement of the image sensing
apparatus main body on the basis of changes in state of sensed
images sensed by said image sensing means.
54. The apparatus according to claim 50, wherein said image sensing
condition detection means analyzes an object image and images
around the object sensed by the image sensing apparatus main body,
and detects changes in relative position relationship between the
object and the image sensing apparatus main body on the basis of an
error signal generated upon analyzing the images.
55. The apparatus according to claim 50, wherein said image sensing
condition detection means analyzes an object image sensed by the
image sensing apparatus main body, and detects changes in occlusion
state of the object.
56. The apparatus according to claim 50, wherein said image sensing
condition detection means analyzes an object image sensed by the
image sensing apparatus main body, and detects an overlapping
region area between time-serial object images.
57. The apparatus according to claim 50, wherein said image sensing
condition detection means analyzes an object image sensed by the
image sensing apparatus main body, and detects changes in distance
image of the object.
58. An image sensing apparatus comprising: image sensing means for
sensing images of an object; image analysis means for analyzing
image information sensed by said image sensing means; image sensing
condition detection means for detecting a relative relationship
between the object and an image sensing apparatus main body; and
control means for controlling said image analysis means, wherein
said control means controls said image analysis means in accordance
with an output from said image sensing condition detection
means.
59. The apparatus according to claim 58, wherein said image
analysis means performs an analysis calculation for acquiring a
three-dimensional shape and a surface image of the object using a
plurality of images.
60. The apparatus according to claim 58, wherein said image sensing
condition detection means comprises a sensor for detecting an angle
and translation movement of the image sensing apparatus main
body.
61. The apparatus according to claim 58, wherein said image sensing
condition detection means analyzes an object image and images
around the object sensed by the image sensing apparatus main body,
and detects an angle and translation movement of the image sensing
apparatus main body on the basis of changes in state of sensed
images sensed by said image sensing means.
62. The apparatus according to claim 58, wherein said image sensing
condition detection means analyzes an object image and images
around the object sensed by the image sensing apparatus main body,
and detects changes in relative position relationship between the
object and the image sensing apparatus main body on the basis of an
error signal generated upon analyzing the images.
63. The apparatus according to claim 58, wherein said image sensing
condition detection means analyzes an object image sensed by the
image sensing apparatus main body, and detects changes in occlusion
state of the object.
64. The apparatus according to claim 58, wherein said image sensing
condition detection means analyzes an object image sensed by the
image sensing apparatus main body, and detects an overlapping
region area between time-serial object images.
65. The apparatus according to claim 58, wherein said image sensing
condition detection means analyzes an object image sensed by the
image sensing apparatus main body, and detects changes in distance
image of the object.
66. The apparatus according to claim 58, wherein said image sensing
condition detection means stops operations of said image sensing
means and said image analysis means during a period in which
neither storage processing nor analysis processing are
performed.
67. The method according to claim 33, wherein the image analysis
step includes the step of performing an analysis calculation for
acquiring a three-dimensional shape and a surface image of the
object using a plurality of images.
Description
BACKGROUND OF THE INVENTION
[0001] The present invention relates to a three-dimensional
information processing apparatus and method for extracting
three-dimensional information, that can be used in CG, CAD, and the
like, from an object having a three-dimensional shape.
[0002] As a conventional technique for obtaining the three-dimensional shape of an object, for example, "Stereoscopic matching using a plurality of base line distances" (Transactions of the Institute of Electronics, Information and Communication Engineers D-II, Vol. J75-D-II, No. 8, pp. 1317-1327, August 1992) is known. Generally, conventional methods of acquiring a three-dimensional shape can be roughly classified into passive and active methods.
[0003] One typical passive method is the stereoscopic image method, which performs triangulation using two cameras. In this method, the positions of images of an identical object are detected in the right and left images taken by the cameras, and the three-dimensional position of the object is measured based on the displacement amount (disparity) between the detected positions.
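The relation underlying this triangulation is Z = f * B / d, where f is the focal length, B the baseline between the two cameras, and d the detected displacement (disparity). The sketch below illustrates only this textbook relation under an idealized rectified geometry; the function and parameter names are illustrative, not taken from the patent.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a scene point from its stereo disparity, assuming a
    pinhole model and rectified cameras with parallel optical axes:
    Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity_px

# Example: f = 800 px, baseline B = 6 cm, disparity d = 12 px  ->  Z = 4.0 m
print(depth_from_disparity(800.0, 0.06, 12.0))
```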
[0004] As typical active methods, an optical radar type range finder, which obtains the distance by measuring the time until light projected toward the object returns after being reflected, and a slit light projection method, which projects a slit-shaped light pattern onto the object and measures the three-dimensional shape on the basis of the displacement of the pattern shape formed on the object, are known.
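For the optical radar type range finder, the distance follows from the round-trip time of the light as d = c * t / 2. A minimal illustrative sketch of that relation (not part of the patent text):

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def optical_radar_range(round_trip_seconds: float) -> float:
    """Range measured by an optical radar: the light travels out and
    back, so the one-way distance is half the round-trip path."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A 20 ns round trip corresponds to roughly 3 m.
print(optical_radar_range(20e-9))
```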
[0005] Note that the three-dimensional data of the object obtained
by the above-mentioned methods can be reproduced and displayed on,
e.g., a two-dimensional display.
[0006] However, the stereoscopic image method has as its major
objective to calculate the distance information from a specific
position where the cameras are set to the object, and does not
measure the three-dimensional shape itself of a certain object. In
the active methods, since a laser beam or the like must be
irradiated onto the object, it is cumbersome to use such
methods.
[0007] For this reason, such methods cannot flexibly cope with a dynamic image sensing environment, i.e., image sensing while moving around a certain object, and hence none of the conventional methods can extract depth information in such a dynamic image sensing environment.
[0008] Images normally used in an office are often finally output onto paper sheets, and the image types to be used include both natural images and line images that express objects by edge lines alone. More specifically, in an office or the like, it is common practice to process image information for various purposes.
[0009] In contrast to this, since the principal object of the
above-mentioned prior art is to calculate the three-dimensional
shape data of the object from certain specific setting positions of
the cameras and to faithfully display the calculated data on a
two-dimensional display, the above-mentioned methods cannot cope
with various kinds of image processing required in, e.g., an
office.
[0010] Accordingly, the present invention is addressed to a three-dimensional information extraction apparatus which can easily be applied to a dynamic image sensing environment in which the image sensing position changes, and which can process acquired three-dimensional information into various forms.
[0011] Some stereoscopic image processing apparatuses use three or
more images in place of two images, and form three-dimensional
shapes by unifying shape information obtained from such images.
[0012] Upon judging the reliability of the obtained three-dimensional shape, the above-mentioned stereoscopic image method, for example, uses, in place of an explicit reliability judgment, the residuals or correlation values obtained when the position displacement amount is calculated by corresponding point extraction based on luminance values.
[0013] However, in the above-mentioned prior art, in the case of, e.g., the stereoscopic image method, even when the residual is small or the correlation is high, if the angle the object makes with the image sensing plane is large or the distance from the apparatus to the object is large, calculation errors due to minute errors in the corresponding point extraction results become large, and the obtained three-dimensional shape has low reliability. Nevertheless, the obtained three-dimensional shape is displayed without this low reliability being taken into account.
[0014] That is, the present invention is also addressed to
improvement of reliability in three-dimensional information
processing.
[0015] On the other hand, the present invention is addressed to
storage of image information in the dynamic image sensing
environment. Problems associated with storage of image information
in the dynamic image sensing environment will be discussed
below.
[0016] In a certain prior art associated with the dynamic image
sensing environment, a single image sensing unit placed on a rail
is translated to sense a plurality of images, and shape analysis is
made using the correlation among the sensed images.
[0017] In addition, Japanese Patent Publication No. 7-9673 is known
as the technique of analyzing the shape of a stereoscopic object
using the correlation among two pairs of parallax images sensed at
the same time using a compound-eye image sensing device which is
made up of a plurality of image sensing units. In this prior art,
the image sensing device is fixed to a robot arm, and is moved as
instructed to sense images.
[0018] A conventional image sensing apparatus which allows the
photographer to freely carry the image sensing apparatus main body
and can analyze the shape of an arbitrary object will be described
below.
[0019] FIG. 1 is a block diagram showing the arrangement of a
conventional portable automatic image sensing apparatus and the
principle of its use state.
[0020] In FIG. 1, reference numeral 1101 denotes an object to be
sensed (a cup in this example), which is placed on a pad 1102,
and a case will be explained below wherein this object 1101 is to
be sensed. A plurality of bright point marks 1103a, 1103b, and
1103c are printed on the pad 1102, and their position relationship
is known and is pre-stored in an image sensing apparatus 1900 (to
be described below).
[0021] Reference numeral 1900 denotes a portable image sensing
apparatus, which comprises photographing lenses 1110 and 1111,
shutters 1112 and 1113 which also serve as iris diaphragms, image
sensing elements 1114 and 1115 for performing photoelectric
conversion, control circuits 1116 and 1117 for controlling the
image sensing elements 1114 and 1115, image signal processing
circuits 1118 and 1119 for processing signals obtained from the
image sensing elements 1114 and 1115, image signal storage circuits
1120 and 1121 for storing image signals output from the image
signal processing circuits 1118 and 1119, a corresponding point
extraction circuit 1122, an image sensing parameter detection
circuit 1123, a ROM (read-only memory) 1124 that stores the (known)
position relationship among the bright points on the pad, a
unifying circuit 1125 for unifying three-dimensional information,
and buffer circuits 1126 and 1127 for temporarily storing the
three-dimensional information unified by the three-dimensional
information unifying circuit 1125.
[0022] This image sensing apparatus 1900 extracts corresponding
points from the obtained two image signals by the corresponding
point extraction circuit 1122 to obtain distance images at the
individual timings, and at the same time, obtains image sensing
parameters (the position relationship between the pad and the image
sensing apparatus 1900 obtained based on the bright point
coordinate positions, accurate focal length, and the like) using
the image sensing parameter detection circuit 1123 and the ROM
1124. The three-dimensional information unifying circuit 1125
calculates three-dimensional shape data and texture image data of
the object 1101 on the basis of these distance images, image
sensing parameters, and change information that expresses their
time-series changes, and stores them in the buffer circuits 1126
and 1127.
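Corresponding point extraction of the kind performed by the circuit 1122 is commonly implemented as block (template) matching along a scanline. The sketch below illustrates one such textbook scheme, assuming rectified left/right images so corresponding points share a row; the block size, search range, and SAD cost are illustrative choices, not the circuit's actual design:

```python
import numpy as np

def disparity_at(left: np.ndarray, right: np.ndarray, y: int, x: int,
                 half: int = 8, max_disp: int = 64) -> int:
    """Horizontal disparity of the block centred at (y, x) in the left
    image, found by scanning the same row of the right image and
    minimising the sum of absolute differences (SAD)."""
    tpl = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d - half < 0:
            break  # candidate block would fall outside the right image
        cand = right[y - half:y + half + 1,
                     x - d - half:x - d + half + 1].astype(np.float32)
        cost = float(np.abs(tpl - cand).sum())
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```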
[0023] In FIG. 1, reference numeral 1140 denotes numerical value
data of the three-dimensional shape of the object 1101 output from
the image sensing apparatus 1900; and 1141, developed image data of
the surface texture of the object 1101. These output data are
transferred to a personal computer or the like, which performs
texture mapping to display the input data as a stereoscopic CG
(computer graphics) image. The display angle, size, and the like of
the CG image can be instantaneously changed, and the image can also
be deformed and processed. Two CG images which have slightly
different view points are generated, and are output to a
stereoscopic display, thus allowing the user to observe a
stereoscopic image. In this case, since the stereoscopic image can be freely rotated and deformed, the user can experience a greater sense of reality.
[0024] In the image sensing apparatus 1900, the corresponding point extraction circuit 1122 and the three-dimensional information unifying circuit 1125 require the most complicated, time-consuming processing and, hence, require a very large circuit scale and power consumption. The image sensing apparatus 1900 has a
sequential processing mode in which such complicated processing is
sequentially executed while sensing images, and a simultaneous
processing mode in which the required sensed images are stored in
the image signal storage circuits 1120 and 1121, and thereafter,
the processing is executed simultaneously. On the other hand, the
image sensing apparatus 1900 allows the photographer to freely
carry the image sensing apparatus 1900 without requiring any
large-scale positioning device unlike in the above-mentioned prior
art, and can easily analyze the shape of the object 1101 without
requiring any special preparation processes.
[0025] However, the prior art shown in FIG. 1 suffers from the following problems.
[0026] More specifically, in general, the photographer cannot achieve accurate positioning at a constant speed, unlike the above-mentioned conventional positioning device. For example, when images are stored in the image storage circuit at fixed time intervals and are subjected to image processing, redundant information increases in a portion sensed while the apparatus moves at an excessively low speed, a very large image memory capacity is required, and the shape analysis time becomes long. Furthermore, the analyzed three-dimensional shape data becomes excessively fine, and the subsequent CG generation requires extra processing time and storage capacity. Conversely, when the photographer moves the image sensing apparatus at high speed, the information required for analyzing the three-dimensional shape becomes insufficient, and the analysis precision is impaired. In the worst case, if an image of a specific side surface of the object cannot be acquired, the shape information of that portion is lost.
SUMMARY OF THE INVENTION
[0027] The present invention has been made in consideration of the
above situation, and has as its object to provide a
three-dimensional information processing apparatus and method,
which can flexibly cope with dynamic image sensing, and can process
the obtained three-dimensional information into various forms.
[0028] In order to achieve the above object, according to the
present invention, there is provided a three-dimensional
information processing apparatus for obtaining three-dimensional
information from an object having a three-dimensional shape, and
performing predetermined information processing, comprising:
[0029] image sensing means for sensing images of the object from a
plurality of coordinate positions using an image sensing system
having one or a plurality of optical systems;
[0030] information extraction means for extracting a plurality of
depth information from image sensing related information sensed by
the image sensing means at the plurality of coordinate positions;
and
[0031] conversion/unification means for converting and unifying the plurality of depth information extracted by the information extraction means into depth information expressed by a unified coordinate system.
[0032] Also, in order to achieve the above object, according to the
present invention, there is provided a three-dimensional
information processing method for obtaining three-dimensional
information from an object having a three-dimensional shape, and
performing predetermined information processing, comprising:
[0033] the first step of sensing images of the object from a
plurality of coordinate positions using an image sensing system
having one or a plurality of optical systems;
[0034] the second step of extracting a plurality of depth
information from image sensing related information sensed at the
plurality of coordinate positions in the first step; and
[0035] the third step of converting and unifying the plurality of depth information extracted in the second step into depth information expressed by a unified coordinate system.
[0036] According to the apparatus and method with the above
arrangement, upon unifying depth information, since a plurality of
depth information are converted into depth information expressed by
a unified coordinate system on the basis of, e.g., the luminance
information of the object and displacement information of distance
information, the present invention can flexibly cope with dynamic
image sensing in which image sensing is done while moving the
apparatus around a certain object, and can easily process the
obtained information into various image forms.
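To illustrate the conversion step: once the displacement between the coordinate system of one depth measurement and the unified coordinate system is known as a rotation R and translation t, each measured point can be mapped into the unified system and the results merged. A minimal sketch under that assumption, with placeholder poses and points:

```python
import numpy as np

def to_unified(points_cam: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Map Nx3 points measured in one viewpoint's coordinate system into
    the unified coordinate system, given that viewpoint's pose (R, t)."""
    return points_cam @ R.T + t

# Placeholder data: two small point sets and poses (identity; 90 deg about Z).
pts0 = np.array([[0.1, 0.0, 1.0], [0.2, 0.1, 1.1]])
pts1 = np.array([[0.0, 0.1, 0.9]])
R0, t0 = np.eye(3), np.zeros(3)
R1 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t1 = np.array([0.0, 0.0, 0.05])

# Merge the depth points from both viewpoints into one unified point set.
unified = np.vstack([to_unified(pts0, R0, t0), to_unified(pts1, R1, t1)])
print(unified)
```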
[0037] According to one preferred aspect of the present invention,
a displacement between coordinate systems of the plurality of depth
information is detected on the basis of the image information of
the object.
[0038] According to one preferred aspect of the present invention,
the unified coordinate system has five different projection
planes.
[0039] According to one preferred aspect of the present invention,
the image information includes luminance information of the object,
and the displacement between the coordinate systems is detected on
the basis of the luminance information.
[0040] In order to achieve the above object, according to the
present invention, there is provided a three-dimensional
information processing apparatus for obtaining three-dimensional
information from an object having a three-dimensional shape, and
performing predetermined information processing, comprising:
[0041] image sensing means for sensing images of the object using
an image sensing system having one or a plurality of optical
systems;
[0042] three-dimensional shape extraction means for extracting
three-dimensional shape information of the object from image
sensing related information sensed by the image sensing means;
and
[0043] reliability determination means for determining reliability
of the three-dimensional shape information extracted by the
three-dimensional shape extraction means.
[0044] It is another object of the present invention to provide a
three-dimensional information processing apparatus and method,
which can notify the discrimination result of reliability.
[0045] It is still another object of the present invention to
provide a three-dimensional information processing apparatus and
method, which can process three-dimensional shape information in
accordance with the discrimination result of reliability, and can
display the processed three-dimensional shape information.
[0046] According to one preferred aspect of the present invention,
the reliability of the three-dimensional shape information is
determined on the basis of an angle of the object with respect to
an image sensing plane.
[0047] According to one preferred aspect of the present invention,
the reliability of the three-dimensional shape information is
determined on the basis of a distance between the image sensing
means and the object.
[0048] According to one preferred aspect of the present invention,
the reliability of the three-dimensional shape information is
determined on the basis of an angle a pad that places the object
thereon makes with an image sensing plane of the image sensing
means.
[0049] According to one preferred aspect of the present invention,
the reliability of the three-dimensional shape information is
determined on the basis of an area ratio of a pad that places the
object thereon to an image sensing region.
[0050] According to one preferred aspect of the present invention,
the reliability of the three-dimensional shape information is
determined on the basis of a position of a pad that places the
object thereon.
[0051] According to one preferred aspect of the present invention,
the reliability of the three-dimensional shape information is
determined on the basis of reflected light information reflected by
the object.
[0052] According to one preferred aspect of the present invention,
the reliability of the three-dimensional shape information is
determined on the basis of a degree of correspondence of pixels
between a plurality of image sensing related data sensed by the
image sensing means.
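Several of the cues listed above, such as the angle to the image sensing plane and the distance to the object, could be combined into a single score. The following toy measure merely illustrates the idea; it is not the determination method of the patent:

```python
import math

def reliability_score(angle_deg: float, distance_m: float,
                      max_distance_m: float = 2.0) -> float:
    """Toy reliability in [0, 1]: falls as the object surface tilts away
    from the image sensing plane and as the object moves farther from
    the camera. The weighting is illustrative only."""
    angle_term = max(0.0, math.cos(math.radians(angle_deg)))
    distance_term = max(0.0, 1.0 - distance_m / max_distance_m)
    return angle_term * distance_term

print(reliability_score(20.0, 0.5))   # near, almost frontal  -> high
print(reliability_score(70.0, 1.8))   # far, strongly tilted  -> low
```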
[0053] It is still another object of the present invention to
provide an image sensing method and apparatus, which can minimize
the storage capacity of storage means that stores images, can
shorten the time required for processing images, and can avoid any
errors upon executing processing or display after image
sensing.
[0054] In order to achieve the above object, according to the
present invention, there is provided an image sensing method
comprising:
[0055] the image sensing step of sensing images of an object;
[0056] the storage step of storing image information of the
object;
[0057] the image sensing condition detection step of detecting a
relative relationship between the object and an image sensing
apparatus main body; and
[0058] the control step of controlling a storage operation of the
image information,
[0059] wherein the control step includes the step of controlling
the storage operation in the storage step in accordance with a
detection result of the image sensing condition detection step.
[0060] Also, in order to achieve the above object, according to the
present invention, there is provided an image sensing apparatus
comprising:
[0061] image sensing means for sensing images of an object;
[0062] storage means for storing image information of the
object;
[0063] image sensing condition detection means for detecting a
relative relationship between the object and an image sensing
apparatus main body; and
[0064] control means for controlling the storage means,
[0065] wherein the control means controls the storage means in
accordance with an output from the image sensing condition
detection means.
[0066] According to the method or apparatus with the above arrangement, since only the required minimum set of images used in image display and three-dimensional shape analysis is stored, the storage capacity of the storage means can be reduced, and the operation time of the three-dimensional shape analysis processing means can be shortened, thereby realizing a size reduction and a cost reduction of the overall apparatus.
[0067] In order to achieve the above object, according to the
present invention, there is provided an image sensing method
comprising:
[0068] the image sensing step of sensing images of an object;
[0069] the analysis step of analyzing image information obtained in
the image sensing step;
[0070] the image sensing condition detection step of detecting a
relative relationship between the object and an image sensing
apparatus main body; and
[0071] the control step of controlling an image analysis operation
in the analysis step,
[0072] wherein the control step includes the step of controlling
the image analysis operation in accordance with a detection result
of the image sensing condition detection step.
[0073] Also, in order to achieve the above object, according to the
present invention, there is provided an image sensing apparatus
comprising:
[0074] image sensing means for sensing images of an object;
[0075] image analysis means for analyzing image information sensed
by the image sensing means;
[0076] image sensing condition detection means for detecting a
relative relationship between the object and an image sensing
apparatus main body; and
[0077] control means for controlling the image analysis means,
[0078] wherein the control means controls the image analysis means
in accordance with an output from the image sensing condition
detection means.
[0079] According to the image sensing method and apparatus with the above arrangement, since only the required minimum images are subjected to three-dimensional shape analysis processing, the operation time of the three-dimensional analysis can be shortened, and loss of required images can be avoided, thus realizing a size reduction and a cost reduction of the overall apparatus.
[0080] According to one preferred aspect of the present invention,
control is made to store information associated with the relative
relationship between the object and the image sensing apparatus
main body together with sensed images sensed in the image sensing
step in the storage step. The stored information can be easily
compared with desired observation direction information input by
the observer upon reproduction of an image, and an appropriate
image can be instantaneously displayed.
[0081] According to one preferred aspect of the present invention, the image sensing condition is detected using a sensor for detecting an angle and translation movement of the image sensing apparatus main body. Sampling positions can be assigned in space at nearly equal intervals by a simple apparatus arrangement.
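One way to obtain such near-equal-interval sampling is to store a frame only when the pose reported by the angle/translation sensor has changed sufficiently since the last stored frame. A minimal sketch of this control logic, with illustrative thresholds:

```python
def should_store(rot_since_last_deg: float, trans_since_last_m: float,
                 rot_step_deg: float = 5.0, trans_step_m: float = 0.05) -> bool:
    """Trigger storage only when the apparatus has rotated or translated
    by at least one sampling step since the last stored frame, so stored
    viewpoints end up roughly evenly spaced; thresholds are illustrative."""
    return (rot_since_last_deg >= rot_step_deg
            or trans_since_last_m >= trans_step_m)
```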
[0082] According to one preferred aspect of the present invention,
the image sensing condition detection includes the step of
analyzing an object image and images around the object sensed by
the image sensing apparatus main body, and detecting an angle and
translation movement of the image sensing apparatus main body on
the basis of changes in state of sensed images sensed in the image
sensing step. The sampling interval of images can be appropriately
changed in correspondence with the complexity of the object
structure.
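The image-based variant can be sketched in the same spirit: a large change between consecutive sensed images suggests fast apparatus motion or a complex object structure and hence calls for a shorter sampling interval. An illustrative measure, not the patent's analysis:

```python
import numpy as np

def mean_frame_change(prev: np.ndarray, cur: np.ndarray) -> float:
    """Mean absolute luminance change between consecutive frames; large
    values call for denser sampling, small values for sparser sampling."""
    return float(np.abs(cur.astype(np.float32) - prev.astype(np.float32)).mean())
```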
[0083] According to one preferred aspect of the present invention,
the image sensing condition detection includes the step of
analyzing an object image and images around the object sensed by
the image sensing apparatus main body, and detecting changes in
relative position relationship between the object and the image
sensing apparatus main body on the basis of an error signal
generated upon analyzing the images. Since the shape information of
the object region that could not be analyzed at a certain time can
be compensated for using information obtained by analyzing an image
at a different time, accurate three-dimensional shape data can
always be output.
[0084] According to one preferred aspect of the present invention,
the image sensing condition detection includes the step of
analyzing an object image sensed by the image sensing apparatus
main body, and detecting changes in occlusion state of the object.
Even for an object with a complicated shape, regions that cannot be
analyzed are few, and accurate information can be output as a
whole.
[0085] According to one preferred aspect of the present invention,
the image sensing condition detection includes the step of
analyzing an object image sensed by the image sensing apparatus
main body, and detecting an overlapping region area between
time-serial object images. In particular, when high-magnification
image sensing is done, joint analysis between images can be
performed from images with predetermined precision, and loss of
required images can be avoided.
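The overlapping region area can be estimated from the object regions segmented in consecutive frames; if the overlap ratio falls below a required level, the current frame must be kept so that joint analysis between images remains possible. A sketch assuming binary object masks are available (an assumption, not stated in the patent):

```python
import numpy as np

def overlap_ratio(mask_prev: np.ndarray, mask_cur: np.ndarray) -> float:
    """Fraction of the previous frame's object region still visible in
    the current frame (both inputs are boolean masks)."""
    prev_area = int(mask_prev.sum())
    if prev_area == 0:
        return 0.0
    return int(np.logical_and(mask_prev, mask_cur).sum()) / prev_area
```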
[0086] According to one preferred aspect of the present invention,
the image sensing condition detection includes the step of
analyzing an object image sensed by the image sensing apparatus
main body, and detecting changes in distance image of the object.
In the object region corresponding to a complicated
three-dimensional shape, the number of times of sampling can be
increased, and high-precision three-dimensional shape data can be
output.
[0088] According to one preferred aspect of the present invention, the image sensing condition detection includes the step of stopping the image sensing step and the analysis step during a period in which neither storage processing nor analysis processing is performed. Since the image sensing means and the image analysis means, which consume relatively large power, cease to operate during the period that requires neither image storage nor processing, power consumption can be greatly reduced.
[0089] According to one preferred aspect of the present invention, the image analysis step includes the step of performing an analysis calculation for acquiring a three-dimensional shape and a surface image of the object using a plurality of images. Accordingly, since an object image is generated using texture mapping or the like in computer graphics, the observer can freely select the observation direction and distance and can observe the three-dimensional shape and surface state.
[0090] Other features and advantages of the present invention will
be apparent from the following description taken in conjunction
with the accompanying drawings, in which like reference characters
designate the same or similar parts throughout the figures
thereof.
BRIEF DESCRIPTION OF THE DRAWINGS
[0091] FIG. 1 is a schematic block diagram showing the arrangement
of a conventional three-dimensional information processing
apparatus;
[0092] FIG. 2 is a schematic block diagram showing the arrangement
of a three-dimensional information processing apparatus according
to the first embodiment of the present invention;
[0093] FIGS. 3A and 3B are block diagrams showing the arrangement
of a three-dimensional shape extractor 12 in detail;
[0094] FIG. 4 is a block diagram showing the arrangement of a
system controller 210 in detail;
[0095] FIG. 5 is a block diagram showing the portion associated
with extraction of depth information;
[0096] FIG. 6 is a block diagram showing the portion associated
with unification of depth information;
[0097] FIG. 7 is an explanatory view of template matching;
[0098] FIGS. 8A and 8B are explanatory views for explaining the
procedure of unifying depth information;
[0099] FIGS. 9A and 9B are explanatory views for explaining the
procedure of unifying depth information;
[0100] FIGS. 10A and 10B are explanatory views for explaining the
procedure of unifying depth information;
[0101] FIG. 11 is a schematic view showing the intermediate point
interpolation method;
[0102] FIG. 12 is a view showing the method of converting depth
information into one expressed by a unified coordinate system;
[0103] FIG. 13 is a view showing the method of converting depth
information into one expressed by the unified coordinate
system;
[0104] FIG. 14 is a block diagram showing the arrangement
associated with extraction of distance information according to the
second embodiment of the present invention;
[0105] FIG. 15 is a block diagram showing the arrangement
associated with unification of distance information;
[0106] FIG. 16 is a flow chart showing the operation of an image
sensing head device 1;
[0107] FIG. 17 is an explanatory view of zoom adjustment;
[0108] FIG. 18 is an explanatory view of zoom adjustment;
[0109] FIG. 19 is an explanatory view of reliability
discrimination;
[0110] FIG. 20 is an explanatory view of reliability
discrimination;
[0111] FIG. 21 is an explanatory view of a three-dimensional
information processing apparatus according to the third
modification of the second embodiment;
[0112] FIG. 22 is a block diagram showing the arrangement of a
three-dimensional shape extractor 12 according to the third
modification in detail;
[0113] FIG. 23 is a flow chart showing the operation of an image
sensing head device 1 according to the third modification;
[0114] FIG. 24 is an explanatory view of reliability discrimination
according to the third modification;
[0115] FIG. 25 is a block diagram showing the arrangement and use
state of an image sensing apparatus according to the third
embodiment of the present invention;
[0116] FIG. 26 is a diagram showing the arrangement of a posture
sensor according to the third embodiment;
[0117] FIG. 27 is a diagram showing the arrangement of acceleration
sensors that make up the position sensor of the third
embodiment;
[0118] FIG. 28 is a view showing an example of the image input
timings of the third embodiment;
[0119] FIGS. 29A to 29C show examples of sensed images in the third
embodiment;
[0120] FIGS. 30A to 30C show examples of sensed images in the third
embodiment;
[0121] FIG. 31 is a block diagram showing the arrangement and use
state of an image sensing apparatus according to the fourth
modification; and
[0122] FIG. 32 is a block diagram showing the arrangement of a
stereoscopic image display means in the fourth modification.
DETAILED DESCRIPTION OF THE INVENTION
[0123] The preferred embodiments of the present invention will be
described in detail hereinafter with reference to the accompanying
drawings.
[0124] <First Embodiment>
[0125] The first embodiment of the present invention will be
described below.
[0126] FIG. 2 is a schematic block diagram showing the arrangement
of a three-dimensional information processing apparatus according
to the first embodiment of the present invention.
[0127] Arrangement
[0128] A three-dimensional information processing system according to the first embodiment comprises an image sensing head device 1, a three-dimensional shape extractor 12 for extracting the three-dimensional shape from images sensed by the head device 1, a text editor 1001 for creating text data, a data combining unit (program) 1000 for combining the image data extracted by the extractor 12 with the text data created by the text editor 1001, a monitor 8 for displaying two-dimensional image data of an object 2 and text data, a printer 9 for printing the two-dimensional data of the object 2 and text data on a paper sheet or the like, and an operation unit 11 for moving the view point of the object 2, changing the display format of the object 2, and performing combining and editing of data using the data combining unit 1000.
[0129] The image sensing head device 1 senses images of the object
2 having a three-dimensional shape, which is present in front of a
background plane 3. The three-dimensional shape extractor 12
comprises an image sensing processor 13 for executing various kinds
of image processing for images sensed by the image sensing head
device 1.
[0130] In the first embodiment, the user can select one of a
plurality of display formats of the object 2. More specifically,
the display formats include, e.g., a natural image, a line image
that expresses the edges of the object 2 by lines, a polygon image
that expresses the surface of the object 2 as contiguous planes
each having a predetermined size, and the like.
[0131] The image sensing head device 1 comprises an image sensing
lens 100R located on the right side when viewed from the apparatus,
an image sensing lens 100L located on the left side when viewed
from the apparatus, and an illumination unit 200 that outputs
illumination light in correspondence with an image sensing
environment. In FIG. 2, 10L represents the image sensing range of
the left image sensing lens 100L, and 10R the image sensing range
of the right image sensing lens 100R. The image sensing head device
1 senses images of the object 2 while moving to arbitrary positions
within the range from an image sensing start position A.sub.0 to an
image sensing end position A.sub.n. Note that the position
information of the image sensing head device 1 at each image sensing
position between A.sub.0 and A.sub.n is output to a posture
detector 4 (to be described later).
[0132] The image sensing processor 13 comprises the posture
detector 4, an image memory 5, a 3D image processor 6, and a 2D
image processor 7.
[0133] The posture detector 4 of the image sensing processor 13 has
a position detector comprising a unit for calculating the position
information of the image sensing head device 1 by image processing
on the basis of information obtained from the background plane 3,
and a unit for calculating the position information of the image
sensing head device 1 by a sensor such as a gyro or the like. With
this detector, the position of the image sensing head device 1 with
respect to the background plane 3 can be determined.
[0134] The image memory 5 stores image data obtained by the image
sensing head device 1, and the position information of the image
sensing head device 1 obtained by the posture detector 4, and
comprises an image memory 5R for right images, and an image memory
5L for left images.
[0135] The 3D image processor 6 calculates the three-dimensional
shape (depth information, i.e., distance information) of the object
2 on the basis of the image data stored in the image memory 5 and
the corresponding position information of the image sensing head
device 1.
[0136] The 2D image processor 7 calculates two-dimensional image
data of the object 2 viewed from an arbitrary view point in the
image format designated by the user on the basis of the
stereoscopic image data of the object 2 obtained by the 3D image
processor 6.
[0137] With the three-dimensional information processing apparatus
having the above-mentioned arrangement, when the user directs the
image sensing head device 1 toward the object 2, and operates a
release button (not shown), images of the object 2 are sensed, and
the first image data are stored in the image memory 5.
[0138] Subsequently, when the user moves the image sensing head
device 1 from an arbitrary position A_0 to a position A_n so as to
keep the object 2 at the center, the posture detector 4 detects
that the position and direction of the image sensing head device 1
have changed by a predetermined amount from the initial position
A_0 during movement from the position A_0 to the position A_n.
After such detection is done by the posture detector 4, second
image sensing is made at a position A_1, and thereafter, image
sensing is repeated n times in turn.
[0139] At this time, the image data and the displacement amounts
from the initial image sensing position and direction of the image
sensing head device 1 obtained by the posture detector 4 are stored
in the image memory 5. When the posture detector 4 detects that at
least one of the moving amount of the image sensing head device 1
and the direction change amount has largely exceeded a
predetermined value, an alarm unit (to be described later) produces
an alarm.
[0140] Thereafter, this operation is repeated several times. After
the image data sufficient for calculating the depth information of
the object 2 are obtained, an image sensing end information unit
(not shown) informs the user of the end of image sensing, thus
ending the image sensing processing.
[0141] Upon completion of the image sensing processing, the 3D
image processor 6 calculates stereoscopic image data of the object
2 on the basis of the image data of the object 2 and the position
information of the image sensing head device 1 corresponding to the
image data, which are stored in the image memory 5. The 2D image
processor 7 calculates two-dimensional image data viewed from the
initial image sensing position (the position A_0) of the object
2, and outputs it to the monitor 8. The image format of the image
to be output to the monitor 8 can be selected by the operation unit
11.
[0142] The user can display an object image viewed from an
arbitrary view point on the monitor 8 by operating the operation
unit 11. For this purpose, the 2D image processor 7 generates the
object image viewed from the designated view point by performing
predetermined calculations of the stereoscopic image data in
correspondence with the user's operation on the operation unit 11.
Also, the user can change the image format of the object 2
displayed on the monitor 8 to other formats (natural image, polygon
image, and the like) by operating the operation unit 11.
[0143] The user can output the sensed image of the object 2 to the
printer 9 after he or she changes the view point and the image
format in correspondence with his or her purpose. Furthermore, the
user can combine and edit text data created in advance and the
object image data calculated by the 2D image processor 7 using the
data combining unit 1000 while displaying them on the monitor 8. At
that time, the user can also change the image format and view point
of the object 2 by operating the operation unit 11.
[0144] The detailed arrangement of the three-dimensional shape
extractor 12 will be described below.
[0145] FIG. 3 shows, in detail, the arrangement of the
three-dimensional shape extractor 12, i.e., the arrangement of the
image sensing head device 1 and the image sensing processor 13.
[0146] As shown in FIG. 3, the three-dimensional shape extractor 12
comprises the above-mentioned posture detector 4, image memories
73R and 73L for storing images which are being sensed currently,
image memories 75R and 75L for storing images sensed at the
immediately preceding image sensing timing, an overlapping portion
detector 92 for detecting the overlapping portion of the sensed
images, a sound generator 97 for informing the setting state of
various image sensing parameters such as an exposure condition and
the like by means of a sound, the image sensing lenses 100R and
100L each consisting of a zoom lens, iris diaphragms 101R and 101L
for adjusting the amounts of light coming from the image sensing
lenses 100R and 100L, image sensors 102R and 102L made up of CCDs,
and the like, A/D converters 103R and 103L for analog-to-digital
converting signals from the image sensors 102R and 102L, image
signal processors 104R and 104L for converting the signals from the
image sensors 102R and 102L into image signals, image separators
105R and 105L for separating an object, from which
three-dimensional information (depth information) is to be
extracted, from the background plane 3, zoom controllers 106R and
106L for adjusting the focal lengths of the image sensing lenses
100R and 100L, focus controllers 107R and 107L for adjusting the
focal point positions, iris diaphragm controllers 108R and 108L for
adjusting the aperture values, a system controller 210 for
controlling the overall three-dimensional shape extractor 12, an
image processor 220 including the image memory 5, the 3D image
processor 6, and the 2D image processor 7 shown in FIG. 2, a
release button 230 which is operated at the beginning of image
sensing, an EVF (electronic view finder) 240 for displaying the
setting state of various image sensing parameters such as an
exposure condition and the like, a recorder 250 which is connected
to the image processor 220 to record predetermined image data and
the like, an R-L difference discriminator 260 for detecting signals
required for R-L difference correction, a focusing state detector
270 for detecting the focusing state, image sensor drivers 280R and
280L for controlling driving of the image sensors 102R and 102L,
and an I/F 760 to external devices, which allows connections with
the external devices.
[0147] As shown in FIG. 4, the system controller 210 comprises a
microcomputer 900 for mainly performing the overall control, a
memory 910 which stores a program required for the overall control,
sensed image data, and the like, and an image processing section
920 for performing predetermined calculation processing for the
image data and the like stored in the memory 910 and the like.
[0148] The image processor 220 extracts three-dimensional
information of the object 2 from image signals obtained from the
image sensing lenses 100R and 100L, and unifies and outputs a
plurality of extracted three-dimensional information (depth
information) of the object 2 at the individual image sensing
positions on the basis of a plurality of posture information at the
individual image sensing positions obtained from the posture
detector 4.
[0149] FIG. 5 is a block diagram showing the arrangement of the
image processor 220 in detail, and mainly shows the arrangement
portion associated with extraction of depth information in the
image processor 220.
[0150] The image processor 220 extracts depth information from
stereoscopic images 110 consisting of right and left images (R and
L images) stored in the predetermined image memories.
[0151] As shown in FIG. 5, the image processor 220 comprises edge
extractors 111 (111R, 111L) for extracting edge images from the
stereoscopic images 110, a stereoscopic corresponding point
extractor 112 for extracting the correspondence among pixels in the
stereoscopic images 110, a corresponding edge extractor 113 for
extracting the correspondence among pixels in two edge images
extracted by the edge extractors 111, an inconsistency eliminating
unit or eliminator 114 for detecting inconsistent portions from the
correspondences extracted by the stereoscopic corresponding point
extractor 112 and the corresponding edge extractor 113, and
eliminating the inconsistent portions, an occlusion determining
unit 115 for determining the occlusion region based on the
extracted corresponding points and an index indicating the degree
of correlation used during corresponding point extraction, e.g., a
residual, a depth information distribution processor 116 for
calculating the depth information distribution by the principle of
trigonometric measurements on the basis of the relationship among
the corresponding points, characteristic point extractors 117
(117R, 117L) for identifying characteristic points of a background
plane portion, and a correction data calculation unit 118 for
acquiring the image sensing parameters, posture, and movement
relationship using the characteristic points of the background
plane portion.
[0152] FIG. 6 is a block diagram showing the arrangement of the
image processor 220 in more detail, and mainly shows the
arrangement portion associated with unification of depth
information of the object 2 in the image processor 220. Note that
"unification" means conversion of images sensed at different
positions to image data associated with a single unified coordinate
system. More specifically, "unification" is to convert a plurality
of depth information of the object obtained from at least two
arbitrary positions into depth data viewed from a single coordinate
system. "Unification" in this embodiment also implies coordinate
interpolation processing (to be described later).
[0153] In order to attain unification processing of depth
information of the object 2, as shown in FIG. 6, the image
processor 220 comprises a coordinate system converter 121 for
converting two depth information data (Z^t(i, j) and
Z^{t+δt}(i, j)) 120 from a pair of stereoscopic images 110
obtained by the individual units onto a unified coordinate system,
a depth information unificator 122 for unifying depth information
120' converted onto the unified coordinate system, and a display
unit 124 for displaying the unified depth information.
[0154] Also, the image processor 220 comprises a unit for
outputting occlusion region information 123 to the unificator 122
and the display unit 124, and a unit for detecting the moving
amount and direction of the image sensing head device 1, and the
like.
[0155] Operation
[0156] The operation of the three-dimensional information
processing apparatus of the first embodiment with the above
arrangement will be described below.
[0157] The operation of the three-dimensional shape extractor 12
will be described in detail below with reference to FIG. 3.
[0158] In the three-dimensional shape extractor 12, images of the
object 2 are input via the image sensing lenses 100R and 100L. The
input object images are converted into electrical signals by the
image sensors 102R and 102L. Furthermore, the converted signals are
converted from analog signals into digital signals by the A/D
converters 103R and 103L, and the digital signals are supplied to
the image signal processors 104R and 104L.
[0159] The image signal processors 104R and 104L convert the
digital signals of the object 2 into luminance and chrominance
signals in an appropriate format. The image separators 105R and
105L measure depth information in the object to be sensed on the
basis of the signals obtained from the image signal processors 104R
and 104L, thereby separating the principal object 2 from the
background plane 3.
[0160] As one separation method, an image of the background plane 3
is sensed in advance, and is stored in a predetermined memory.
Thereafter, the principal object 2 is placed on the background
plane, and its image is sensed. The sensed image and the stored
image of the background plane 3 are subjected to matching and
differential processing, thereby separating the background plane
region. Note that the separation method is not limited to such a
specific method, and the background plane region may be separated
on the basis of color or texture information.
[0161] The separated image data of the principal object 2 are
supplied to the image processor 220, which executes
three-dimensional shape extraction processing on the basis of
various image sensing parameters obtained upon image sensing.
[0162] The image sensing parameters upon image sensing include,
e.g., a focal length, which can be set by the following method.
[0163] Distance information Z is given by the following equation (1):

Z = fB / d  (1)

[0164] where Z: the distance; f: the focal length; B: the base line
distance; and d: the parallax.
[0165] In order to precisely recognize the three-dimensional shape
by image processing, the resolution of the distance Z corresponding
to the parallax is important. The resolution of Z is defined by the
following equation:

∂Z/∂d = -fB / d²  (2)
[0166] Accordingly, the focal length f is written as follows using
the distance resolution determined by the parallax as a parameter:
f = -(d² / B)·(∂Z/∂d)  (3)
[0167] Hence, the resolution is set, e.g., at the operation unit 11
via the I/F 760, and the focal length can be set based on this
value.
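By way of illustration, equations (1) to (3) can be exercised with a few lines of Python. This is a minimal sketch; the function names and the sample values (a 60 mm base line, 0.36 mm parallax) are illustrative assumptions, not values taken from the disclosure:

# Depth from parallax, equation (1): Z = f*B/d
def depth_from_parallax(f, B, d):
    return f * B / d

# Depth resolution per unit parallax, equation (2): dZ/dd = -f*B/d**2
def depth_resolution(f, B, d):
    return -f * B / d ** 2

# Focal length for a requested depth resolution, equation (3):
# f = -(d**2 / B) * (dZ/dd)
def focal_length_for_resolution(d, B, dz_dd):
    return -(d ** 2 / B) * dz_dd

# Illustrative example: with B = 60 mm and d = 0.36 mm, a requested
# resolution of about -2777.8 mm per mm of parallax gives f = 6 mm,
# and the sensed point lies at Z = 6 * 60 / 0.36 = 1000 mm.
f = focal_length_for_resolution(d=0.36, B=60.0, dz_dd=-2777.8)
print(f, depth_from_parallax(f, B=60.0, d=0.36))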
[0168] The method of extracting depth information Z from
stereoscopic images 110R and 110L by the image processor 220 will
be described below with reference to FIG. 5.
[0169] Two processing operations are done for the stereoscopic
images 110R and 110L read out from the predetermined image
memories.
[0170] In one processing, the stereoscopic corresponding point
extractor 112 extracts the correspondence among pixels in the
stereoscopic images 110R and 110L on the basis of their luminance
values.
[0171] In the other processing, the corresponding edge extractor
113 extracts the correspondence among pixels in two stereoscopic
edge images 110R' and 110L' (obtained as edge images by the edge
extractors 111).
[0172] The inconsistency eliminator 114 detects inconsistent
portions in the correspondences on the basis of the outputs from
the above-mentioned corresponding point extractors (112 and 113).
If the correspondence obtained based on the luminance values does
not coincide with that obtained based on the edge images, it is
determined that their reliability is low, and it is proper to
eliminate such correspondences. Alternatively, the individual
correspondences may be weighted, and inconsistent portions may be
detected.
[0173] The occlusion determining unit 115 determines the occlusion
region on the basis of the obtained corresponding points and an
index (e.g., a residual R) indicating the degree of correlation
between corresponding points used during calculations of the
corresponding points. This processing is to add reliability to the
results of the corresponding point processing, although the
corresponding point processing yields tentative results. As the
index indicating the degree of correlation, a correlation
coefficient or residual is used. If the residual is very large, or
if the correlation coefficient is low, it is determined that the
reliability of the correspondence is low. The low-reliability
portion is processed as an occlusion region or a region without any
correspondence.
[0174] Using the correspondence obtained via the above-mentioned
processing, the depth information Z of the object 2 is calculated
according to equation (1) using the principle of trigonometric
measurements.
[0175] The template matching method as a typical corresponding
point extraction method executed in the above-mentioned
stereoscopic corresponding point extractor 112 will be explained
below.
[0176] In the template matching method, a template image T
consisting of N×N pixels is extracted from, e.g., the image 110L
obtained by the left image sensing system, as shown in FIG. 7.
Using this template, the search given by equation (4) below is
performed (M−N+1)² times in a search region having a size of M×M
pixels (N < M) in the image 110R obtained by the right image
sensing system. That is, as shown in FIG. 7, a position (a, b) is
defined as the upper left position of the template T_L to be set,
and a residual R(a, b) given by equation (4) below is calculated
while placing the template T_L at a certain position (a, b):

R(a, b) = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} |I_{R(a,b)}(i, j) − T_L(i, j)|  (4)
[0177] This operation is repeated by moving the position (a, b)
within the image to be searched (in this example, the right image
110R) to obtain the position (a, b) corresponding to the minimum
residual R(a, b). The central pixel position of the template image
T_L(i, j) when it is located at the position (a, b) corresponding
to the minimum residual R(a, b) is determined as a corresponding
point. In the above equation, I_{R(a,b)}(i, j) represents a partial
image of the right image 110R when the upper left point of the
template is located at the position (a, b).
[0178] The stereoscopic corresponding point extractor 112 applies
the above template matching method to the stereoscopic images 110
to obtain corresponding points for luminance level.
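The following minimal Python sketch (using numpy) illustrates the exhaustive search of equation (4). It assumes a square search image and a square template; it is an illustration of the described method, not code from the disclosure:

import numpy as np

def match_template_sad(search_img, template):
    """Exhaustive search of equation (4): slide an N x N template over
    an M x M search region and return the centre of the window with the
    minimum residual R(a, b), i.e., the corresponding point.
    Illustrative sketch; assumes square arrays."""
    M, N = search_img.shape[0], template.shape[0]
    best_r, best_ab = None, None
    for a in range(M - N + 1):          # (M - N + 1)**2 placements in all
        for b in range(M - N + 1):
            window = search_img[a:a + N, b:b + N].astype(float)
            r = np.abs(window - template).sum()   # residual R(a, b)
            if best_r is None or r < best_r:
                best_r, best_ab = r, (a, b)
    a, b = best_ab
    return (a + N // 2, b + N // 2), best_r   # corresponding point, residual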
[0179] In corresponding point extraction for edge level, the
above-mentioned template matching is done for edge-extracted
stereoscopic images 110L' and 110R'.
[0180] As pre-processing for corresponding point extraction for
edge level, the edge extractors (111) emphasize the edge portions
using, e.g., a Roberts filter or a Sobel filter.
[0181] More specifically, when the Roberts filter is used, the edge
extractors 111R and 111L receive the input images 110R and 110L
(f(i, j) represents each input image), and output the output image
data (g(i, j) represents each output image) expressed by the
following equation:

g(i, j) = sqrt({f(i, j) − f(i+1, j+1)}² + {f(i+1, j) − f(i, j+1)}²)  (5)

or

g(i, j) = abs{f(i, j) − f(i+1, j+1)} + abs{f(i+1, j) − f(i, j+1)}  (6)
[0182] When the Sobel filter is used, an x-filter f_x and a
y-filter f_y are defined by:

f_x = [ −1 0 1 ; −2 0 2 ; −1 0 1 ]  (7)

f_y = [ −1 −2 −1 ; 0 0 0 ; 1 2 1 ]  (8)

[0183] and the tilt θ of the edge is given by:

θ = tan⁻¹(f_y / f_x)  (9)
[0184] The edge extractors perform binarization of such
edge-emphasized images to extract edge components. The binarization
is performed using an appropriate threshold value.
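As a rough sketch of this pre-processing, the Roberts operator of equation (6), the Sobel kernels of equations (7)/(8), and the binarization step might be written as follows in Python; the threshold value and the helper names are assumptions:

import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # eq. (7)
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)  # eq. (8)

def roberts_edges(f, threshold=30.0):
    """Edge emphasis by the Roberts operator (equation (6) form),
    followed by binarization with an assumed fixed threshold."""
    g = (np.abs(f[:-1, :-1] - f[1:, 1:]) +      # |f(i,j) - f(i+1,j+1)|
         np.abs(f[1:, :-1] - f[:-1, 1:]))       # |f(i+1,j) - f(i,j+1)|
    return (g > threshold).astype(np.uint8)

def sobel_tilt(f):
    """Edge tilt theta per equation (9), with a naive sliding-window
    correlation against the Sobel kernels."""
    h, w = f.shape
    fx = np.zeros((h - 2, w - 2))
    fy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = f[i:i + 3, j:j + 3]
            fx[i, j] = (patch * SOBEL_X).sum()
            fy[i, j] = (patch * SOBEL_Y).sum()
    return np.arctan2(fy, fx)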
[0185] The time-series unification processing of depth information
obtained as described above will be described below with reference
to FIG. 6.
[0186] FIG. 6 shows the process of time-serially generating the
depth information Z 120 obtained from the stereoscopic images 110
by the above-mentioned processing. More specifically, depth
information Z^t(i, j) obtained at time t is input to the coordinate
system converter 121, and thereafter, depth information
Z^{t+δt}(i, j) obtained at time t+δt is input.
[0187] On the other hand, the posture detector 4 for detecting the
moving amount, direction, and the like of the image sensing head
device 1 sends that information to the coordinate system converter
121. The coordinate system converter 121 converts the depth
information Z onto the unified coordinate system using such
position information by the processing method to be described
below. By converting the coordinate system of the depth
information, the time-serially obtained image information can be
easily unified. As the coordinate conversion method in the
coordinate system converter 121, for example, affine transformation
is used, and identical Euler's angles are set.
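A minimal sketch of this conversion step follows, assuming the posture detector supplies a 3×3 rotation matrix R and a translation vector t; the exact affine parameterization used in the apparatus is not spelled out here:

import numpy as np

def convert_to_unified(depth, R, t):
    """Turn a depth map Z(i, j) into 3-D points (i, j, Z) and apply the
    rigid/affine transform P' = R @ P + t onto the unified coordinate
    system, as the coordinate system converter 121 is described to do.
    R and t are assumed to come from the posture detector 4."""
    h, w = depth.shape
    i, j = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pts = np.stack([i.ravel(), j.ravel(), depth.ravel()], axis=1).astype(float)
    return pts @ R.T + t    # one row per converted point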
[0188] Unification of Depth Information
[0189] The processing for unifying the depth information converted
onto the unified coordinate system in the depth information
unificator 122 will be described below with reference to FIGS. 8A
to 10A. FIGS. 8A and 8B to FIGS. 10A and 10B are views for
explaining the procedure for combining depth information.
[0190] FIG. 8A is a graph showing changes in depth information
Z^t(i, j) detected at certain time t in a (Zij) space. Note that i
and j represent the coordinate axes i and j perpendicular to the
depth direction Z of the object 2.

[0191] FIG. 8B is a graph showing changes in Z'^{t+δt}(i, j)
obtained by viewing depth information Z^{t+δt}(i, j) detected at
time t+δt from the unified direction again in the (Zij) space.
[0192] FIG. 9A is a graph showing changes in luminance information
I_R^t(i, j) in the (Zij) space. FIG. 9B is a graph showing changes
in luminance information I'_R^{t+δt}(i, j) viewed from the unified
direction again.
[0193] FIG. 10A shows shifts in depth information Z^t(i, j) from
time t to time t+δt. In FIG. 10A, (i_0, j_0) represents the changes
in the i and j directions. That is, superposition of the graphs in
FIGS. 8A and 8B gives the graph in FIG. 10A.

[0194] FIG. 10B shows the state wherein Z'^{t+δt}(i, j) in FIG. 10A
is shifted by (i_0, j_0), and is superposed on Z^t(i, j).
[0195] As shown in FIG. 10B, upon superposing depth information,
the superposing degree Q is calculated using, e.g., the following
equation (10):

Q = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} |I_R^t(i, j) − I'_R^{t+δt}(i, j)|
  + Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} |Z_R^t(i, j) − Z'_R^{t+δt}(i, j)|  (10)
[0196] Subsequently, (i_0, j_0) that yields the minimum superposing
degree Q is calculated.
[0197] Since a bright point (having a luminance I) on the object at
the depth Z is the identical bright point at both time t and time
t+δt, the depth information Z and luminance information I from that
bright point must assume identical values at time t and time t+δt.
Hence, the shift (i_0, j_0) at which Z'^{t+δt}(i, j) coincides with
Z^t(i, j) minimizes the evaluation function Q.
[0198] Using the calculated (i_0, j_0), the depth information Z is
shifted by (i_0, j_0) and is superposed on the other depth
information, as shown in FIG. 10B.
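A minimal Python sketch of this shift search follows. Note that np.roll wraps around at the borders, which is a simplification of the actual superposition, and the search radius is an assumed parameter:

import numpy as np

def best_shift(I_t, Z_t, I_next, Z_next, max_shift=8):
    """Find the shift (i0, j0) minimising the superposing degree Q of
    equation (10): summed absolute luminance plus depth differences.
    Wrap-around via np.roll is a simplification for illustration."""
    best_q, best = None, (0, 0)
    for i0 in range(-max_shift, max_shift + 1):
        for j0 in range(-max_shift, max_shift + 1):
            I_s = np.roll(I_next, (i0, j0), axis=(0, 1))
            Z_s = np.roll(Z_next, (i0, j0), axis=(0, 1))
            q = np.abs(I_t - I_s).sum() + np.abs(Z_t - Z_s).sum()
            if best_q is None or q < best_q:
                best_q, best = q, (i0, j0)
    return best, best_q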
[0199] Identical Point Removal
[0200] Subsequently, identical point removal and intermediate point
interpolation are performed. The identical point removal is
performed to reduce the information volume in each depth
information.
[0201] Assume that two corresponding points (x_0, y_0, z_0) and
(x_1, y_1, z_1) are obtained from images at time t and time t+δt.
Whether or not these two corresponding points are identical points
is determined based on the relation below. That is, if the
following relation holds for an infinitesimal constant δ_1, the two
points are determined as identical points, and one of them is
removed:

(x_0 − x_1)² + (y_0 − y_1)² + (z_0 − z_1)² < δ_1  (11)
[0202] In place of relation (11), the following relation may be
used:

a(x_0 − x_1)² + b(y_0 − y_1)² + c(z_0 − z_1)² < δ_2  (12)
[0203] where a, b, and c are appropriate coefficients. For example,
if a = b = 1 and c = 2, i.e., the weighting coefficient in the
z-direction is set to be larger than those in the x- and
y-directions, the difference between two points in the z-direction
can be discriminated more sensitively.
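For illustration, relation (12) and the removal step might be sketched as follows; the quadratic-time loop and the default coefficient values are assumptions suited to a small point list:

def is_identical(p0, p1, delta, a=1.0, b=1.0, c=2.0):
    """Relation (12): a weighted squared distance below the
    infinitesimal constant delta marks two corresponding points as
    identical (c = 2 weights the z-direction more heavily, as in the
    text)."""
    return (a * (p0[0] - p1[0]) ** 2 +
            b * (p0[1] - p1[1]) ** 2 +
            c * (p0[2] - p1[2]) ** 2) < delta

def remove_identical_points(points, delta):
    """Keep only one representative of each group of identical points
    (O(n**2) sketch for a small list)."""
    kept = []
    for p in points:
        if not any(is_identical(p, q, delta) for q in kept):
            kept.append(p)
    return kept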
[0204] Interpolation with Intermediate Point
[0205] As the intermediate point interpolation method, a method of
calculating an intermediate point, as shown in, e.g., FIG. 11, may
be used.
[0206] Note that the Zij three-dimensional space is projected onto
a Z-i plane in FIG. 11 for the sake of simplicity.
[0207] In FIG. 11, a point A (denoted by ○) on the graph indicates
the extracted depth information Z^t(i, j), and a point B (denoted
by ●) indicates Z'^{t+δt}(i+i_0, j+j_0) obtained by shifting
Z'^{t+δt}(i, j) by (i_0, j_0). Also, a point C (denoted by □)
indicates the interpolated intermediate point, i.e., new depth
information Z_new. As the interpolation method, for example, linear
interpolation, spline interpolation, or the like is used.
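The linear-interpolation case can be sketched in one short function; the even 0.5 weighting is an assumption here (the second modification below replaces it with a luminance-driven weight):

def interpolate_intermediate(point_a, point_b, t=0.5):
    """New intermediate point C between the measured point A (Z^t) and
    the shifted point B (Z'^{t+dt}); t = 0.5 gives the plain linear
    midpoint (assumed default)."""
    return tuple(t * a + (1.0 - t) * b for a, b in zip(point_a, point_b))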
[0208] Unified Coordinate System
[0209] The "unified coordinate system" used in the above-mentioned
unification processing will be described below with reference to
FIGS. 12 and 13.
[0210] In FIG. 12, reference numeral 2 denotes an object; 3, a
background plane formed by a pad; and 1800 to 1804, imaginary
projection planes used for registering depth information. Also,
reference numerals 1810 to 1814 denote central axes (optical axes)
of the imaginary projection plane.
[0211] The "unified coordinate system" used in this embodiment
means five sets of reference coordinate systems each of which is
defined by (x, y, z). That is, as shown in, e.g., FIG. 12, five
sets of coordinate systems that form the imaginary projection
planes 1800 to 1804 are present.
[0212] The depth information Z^t(i, j) obtained by the above
processing is projected onto the individual projection planes (five
planes). Upon projection, conversions such as rotation,
translation, and the like are performed in accordance with the
individual reference coordinates. This state is shown in FIG.
13.
[0213] In FIG. 13, the intersections between the projection plane
1803 and straight lines that connect the central point O on the
optical axis 1813 and the individual points S on the object are
points P converted onto the unified coordinate systems.
[0214] Note that FIG. 13 exemplifies the projection plane 1803, and
the same applies to the other projection planes. The same also
applies to the next depth information Z^{t+δt}(i, j). In this case,
each depth information is sequentially overwritten on the
previously written one. Accordingly, depth information along five
reference axes is obtained for a certain object 2. For example, one
point is expressed by five points (x_0, y_0, z_0), (x_1, y_1, z_1),
(x_2, y_2, z_2), (x_3, y_3, z_3), and (x_4, y_4, z_4) on the
projection planes 1800 to 1804.
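The projection of FIG. 13 is an ordinary ray-plane intersection. The following sketch assumes each projection plane is given by a point on the plane and its normal (the optical-axis direction), which is one plausible encoding rather than the disclosure's own:

import numpy as np

def project_onto_plane(S, O, plane_point, plane_normal):
    """Intersect the ray from the centre O through the object point S
    with one imaginary projection plane; the intersection is the
    converted point P of FIG. 13. Returns None for a ray parallel to
    the plane. Plane encoding (point + normal) is an assumption."""
    direction = S - O
    denom = float(np.dot(plane_normal, direction))
    if abs(denom) < 1e-12:
        return None
    s = float(np.dot(plane_normal, plane_point - O)) / denom
    return O + s * direction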
[0215] As described above, according to the first embodiment, upon
unifying depth information, since a plurality of depth information
are converted into a plurality of unified coordinate systems on the
basis of displacement information of the luminance information and
distance information of the object 2, the present invention can
flexibly cope with dynamic image sensing which is done while moving
around the object 2, and can process an image to various image
formats.
[0216] <Modification of First Embodiment> . . . First
Modification
[0217] The first modification of the first embodiment will be
described below. Note that the arrangement of the image sensing
device and the image sensing method are the same as those in the
first embodiment, and a detailed description thereof will be
omitted. Hence, a unificator different from that in the first
embodiment will be described below.
[0218] In the first modification, a correlation calculation is made
using the obtained depth information alone, as shown in the
following equation (13):

Q = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} |Z_R^t(i, j) − Z'_R^{t+δt}(i, j)|  (13)
[0219] More specifically, the first modification does not use the
luminance information I that appears in equation (10) of the first
embodiment. Such a method shortens the correlation calculation
time, albeit slightly.
[0220] <Modification of First Embodiment> . . . Second
Modification
[0221] The second modification will be described below. Note that
the arrangement of the image sensing device and the image sensing
method are the same as those in the first embodiment, and a
detailed description thereof will be omitted. Hence, a unificator
different from that in the first embodiment will be described
below.
[0222] In the second modification, as the method of interpolation,
weighting is performed using the luminance level, as given by
equation (14) below.

[0223] For example, equation (14) is used as a weighting
coefficient t:

t = (1/2)·tanh(I_R^t(i, j) − I'_R^{t+δt}(i, j)) + 1/2  (14)
[0224] When calculating new depth information Z by interpolation,
weighting is performed as follows as a kind of linear
interpolation:

Z_new = t·Z_1 + (1 − t)·Z_2  (15)
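A one-function sketch of equations (14) and (15); feeding the raw luminance difference to tanh as written is assumed here, though a scaled difference would behave similarly:

import math

def weighted_interpolation(z1, z2, i_t, i_next):
    """Equation (14): t = 0.5*tanh(I^t - I'^{t+dt}) + 0.5, then
    equation (15): Z_new = t*Z1 + (1 - t)*Z2.
    Raw (unscaled) luminance difference is an assumption."""
    t = 0.5 * math.tanh(i_t - i_next) + 0.5
    return t * z1 + (1.0 - t) * z2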
[0225] <Advantages of First Embodiment>
[0226] As described above, according to the first embodiment and
its modifications, upon unifying depth information, since a
plurality of depth information are converted into a plurality of
unified coordinate systems on the basis of displacement information
of the luminance information and distance information of the object
2, the present invention can flexibly cope with dynamic image
sensing which is done while moving around a certain object, and can
process an image to various image formats.
[0227] <Second Embodiment>
[0228] In the first embodiment mentioned above, depth information
is converted onto the unified coordinate systems on the basis of
the displacements of the luminance information and distance
information of the object. A three-dimensional information
extraction apparatus of the second embodiment has as its object to
improve reliability in three-dimensional information processing.
Accordingly, the system of the second embodiment has many elements
in common with the system of the first embodiment. That is,
the second embodiment directly uses, as its hardware arrangement,
the elements of the first embodiment shown in FIGS. 2 to 4.
[0229] That is, the system of the second embodiment has
substantially the same image processor 220 as in the first
embodiment, except that the image processor 220 has a distance
information distribution processor 116' and a reliability
determining unit 130, as shown in FIG. 14. Note that "distance
information" is information having the same concept as "depth
information". Hence, the arrangement and operation of the distance
information distribution processor 116' of the second embodiment
will be understood by reference to those associated with the depth
information distribution processor 116 of the first embodiment.
[0230] Elements different from those in the first embodiment in
FIG. 14 will be described below. The distance information
distribution processor 116' calculates the distance information
distribution using the principle of trigonometric measurements on
the basis of the relationship among corresponding points. The
reliability determining unit 130 determines reliability.
[0231] Note that the reliability determining unit 130 determines
the reliability level of the calculated distance information on the
basis of the output from the occlusion determining unit 115, the
processing result of the distance information distribution
processor 116', and the image sensing parameters and position
information from the correction data calculation unit 118, and adds
reliability information corresponding to the reliability level to
the calculated distance information.
[0232] FIG. 15 is a block diagram showing the image processor 220
in more detail, and mainly shows the arrangement portion associated
with unification of distance information of the object 2 in the
image processor 220.
[0233] In order to perform unification processing of the distance
information of the object 2, as shown in FIG. 15, the image
processor 220 comprises a coordinate system converter 121 for
converting distance information (Z.sup.t(i, j)) from a pair of
stereoscopic images 110 calculated by the individual units onto a
unified coordinate system, a distance information unificator 122
for unifying the distance information converted onto the unified
coordinate system, and a display unit 124 for displaying the
unified distance information. The image processor 220 also
comprises a unit for outputting occlusion region information to the
unificator 122 and the display unit 124, and a unit for detecting
the moving amount and direction of the image sensing head device 1,
and the like.
[0234] Note that "unification" is to set identical points so as to
convert each distance information 120 into one viewed from a single
coordinate system on the basis of displacement information between
two distance information data 120 of the object 2 obtained from at
least two arbitrary positions. Also, "unification" implies
interpolation processing of coordinates (to be described later),
determining the reliability of coordinates of a point or area on
the basis of a reliability coefficient obtained from reliability
information of the distance information, and the like.
[0235] Reliability Determination
[0236] The processing sequence of the image sensing head device 1 of the
three-dimensional information processing apparatus according to the
second embodiment will be described below with reference to the
flow chart in FIG. 16.
[0237] When the power supply is turned on (step S1) and image
signals are input, the controller 210 integrates the image signals
obtained from the image separators 105R and 105L using the image
processing section 920 to calculate the luminance level of the
principal object 2 (step S2). If it is determined that the
calculated luminance level is insufficient for three-dimensional
shape extraction, the controller 210 turns on the illumination unit
200 (step S3). At this time, the illumination intensity level may
be varied in correspondence with the calculated luminance
level.
[0238] Subsequently, in-focus points are adjusted using the
individual image signals set at appropriate luminance level (step
S5). At this time, the lenses 100R and 100L are moved to form focal
points on both the principal object 2 and the background plane 3,
and the iris diaphragms 101R and 101L are adjusted. At that time,
when the luminance level changes by a given amount or more, the
intensity of the illumination unit 200 is changed to compensate for
that change in luminance level. Alternatively, an AGC (auto-gain
control) circuit may be incorporated to attain electrical level
correction. The focusing state is detected by the focusing state
detector 270. As a detection method for this purpose, a method of
detecting the sharpness of an edge, or the defocus amount may be
used.
[0239] After the in-focus points are adjusted, zoom ratio
adjustment is done (step S6).
[0240] FIG. 17 shows the outline of zoom ratio adjustment in the
system of the second embodiment.
[0241] In the state wherein the principal object 2 roughly falls
within the focal depth, images obtained from the individual image
sensing systems 100R and 100L are held in the memory 910 of the
controller 210, and the image processing section 920 detects the
overlapping region. In this case, correlation calculation
processing, template matching processing, or the like is used as
the detection method.
[0242] As shown in FIG. 17, an overlapping region 500 is detected
in the initial state, and thereafter, the controller 210 sets the
zoom ratio in a direction to increase the area of the region in the
frames of the two image sensing systems and outputs control signals
to the zoom controllers 106R and 106L.
[0243] FIG. 18 shows changes in overlapping region in the frame by
a series of zoom ratio adjustment processes. In FIG. 18, the image
processing section 920 of the controller 210 calculates a focal
length f at which the overlapping region has a peak area P in FIG.
18, and control signals are supplied to the zoom controllers 106R
and 106L.
[0244] When the focal length f changes by the above-mentioned
operation, and consequently, the focal depth range changes by a
given amount or more, control signals are supplied to the iris
diaphragm controllers 108R and 108L in accordance with step S200
(steps S1 to S5) of readjusting parameters in FIG. 16.
[0245] After step S100 (including a series of adjustment steps S1
to S7), readjustment of parameters and adjustment of an R-L
difference in step S200 are performed. In the adjustment of the R-L
difference, the R-L difference discriminator 260 detects the
exposure amounts, in-focus points, and zoom ratios from image
signals. Based on the detected signals, the controller 210 supplies
control signals to the zoom controllers 106R and 106L, focus
controllers 107R and 107L, and iris diaphragm controllers 108R and
108L.
[0246] Note that various image sensing parameters upon image
sensing include, e.g., the focal length, which can be set by the
method (equations (1) to (3)) described in the first
embodiment.
[0247] After the image sensing parameters are adjusted in steps
S100 and S200, the controller 210 supplies a signal to the display
unit 240 to inform the user of the end of parameter setting (step
S8). Note that the display unit 240 may comprise a display such as
a CRT, an LCD, or the like, or may perform simplified indication
using an LED or the like. Also, a sound may be produced as well as
visual information.
[0248] Upon completion of parameter setting, the user presses the
release button at appropriate intervals while moving the image
sensing head device 1 to input images (steps S9 to S11). In this
case, the moving speed, position, and the like of the image sensing
head device 1 are also detected (steps S12 to S14).
[0249] The method of extracting distance information from
stereoscopic images 110 by the image processor 220 is substantially
the same as extraction of depth information in the first
embodiment.
[0250] The corresponding point extraction processing in the second
embodiment uses the template matching method as in the first
embodiment.
[0251] In this manner, edge-emphasized images are subjected to
binarization to extract edge components. Note that the binarization
is made using an appropriate threshold value.
[0252] In the next image extraction processing step, the occlusion
region is determined by the occlusion determining unit 115
on the basis of the calculated corresponding points and an index
(e.g., a residual) indicating the degree of correlation used in the
process of calculating the corresponding points.
[0253] This processing is to add reliability to the results of the
corresponding point processing, although the corresponding point
processing yields tentative results. Reliability information is
added using a correlation coefficient or residual as an index
indicating the degree of correlation. If the residual is very
large, or if the correlation coefficient is low, it is determined
that the reliability of the correspondence is low. The
low-reliability portion is processed as an occlusion region or a
region without any correspondence.
[0254] More specifically, as shown in FIG. 19, if the residual per
pixel falls within the range from 0 to 2, the reliability
coefficient is 3; if the residual per pixel falls within the range
from 2 to 4, the reliability coefficient is 2; and if the residual
per pixel is 4 or more, the reliability coefficient is 0. When the
reliability coefficient is 0, the corresponding pixel is
deleted.
[0255] Via the above-mentioned processing steps, the distance
information of the object is calculated using the calculated
correspondence and the principle of trigonometric measurements. The
trigonometric measurements are attained as described above using
equation (1).
[0256] Subsequently, since the position and image sensing direction
of the image sensing head device 1 upon image sensing can be
detected from the output from the correction data calculation unit
118, the reliability determining unit 130 determines reliability of
the distance information based on the calculation result from the
unit 118. The calculated distance information is expressed as a
point group on the coordinate system determined by the data of the
background plane 3. At this time, when a region between the edge
portions as the outputs from the edge extractors 111 undergoes an
abrupt change in distance, the corresponding distance information
is deleted. This is because when the distance changes abruptly, it
is very likely that such portion is recognized as an edge
portion.
[0257] The distances from the image sensing plane to the individual
points are calculated, and the tilt of an area defined by adjacent
three points with respect to the image sensing plane is calculated.
The tilts of neighboring areas are checked, and if the difference
between their tilts is negligibly small, the area is extended until
all the areas having the same tilt are combined. Thereafter,
reliability information is added to each area. In this case, the
area is not extended to an occlusion portion or a portion from
which the distance information is deleted. At this time,
information as a point group may be held, but is preferably deleted
to compress the information volume.
[0258] The reliability information is determined and added in
correspondence with the angle with respect to the image sensing
plane and the residual, as shown in FIG. 20.
[0259] In the case of FIG. 20, when the angle with respect to the
image sensing plane falls within the range from 0° to 30° and the
residual falls within the range from 0 to 2, the reliability
coefficient is 3, which indicates the highest reliability. On the
other hand, when the angle with respect to the image sensing plane
falls within the range from 80° to 90° and the residual falls
within the range from 2 to 4, the reliability coefficient is 0,
which indicates the lowest reliability. The data of the area with
the reliability coefficient = 0 may be deleted as unreliable data.
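As an illustration only, the two cells of FIG. 20 that the text spells out can be encoded as below; every other cell is an assumed placeholder, since FIG. 20 itself is not reproduced here:

def reliability_coefficient(angle_deg, residual):
    """2-bit reliability coefficient from the angle to the image
    sensing plane and the matching residual. Only the returns marked
    'stated' follow the text; the rest is an assumed middle grade."""
    if 0 <= angle_deg < 30 and 0 <= residual < 2:
        return 3                    # stated: highest reliability
    if 80 <= angle_deg <= 90 and 2 <= residual < 4:
        return 0                    # stated: lowest reliability
    # remaining cells of FIG. 20 are not given in the text (assumed):
    return 1 if residual >= 4 or angle_deg >= 80 else 2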
[0260] In this manner, reliability data is added to each area as
2-bit information having different reliability coefficients 3, 2,
1, and 0 in correspondence with the angle of the area. Thereafter,
three-dimensional shape information is recorded in the recorder 250
after it is converted into an appropriate format.
[0261] As described above, since image sensing is performed at a
plurality of positions A_0 to A_n, the sensed images do not always
include the background plane 3 with a size large enough
to precisely obtain characteristic points. For this reason,
reliability information is added in correspondence with the ratio
of the background plane 3 to the image sensing region. The
background plane 3 can be detected by the image separator 105. For
example, when the ratio falls within the range from 100 to 30%, the
reliability coefficient is 3; when the ratio falls within the range
from 30 to 15%, the reliability coefficient is 2; and when the
ratio is 15% or less, the reliability coefficient is 1. When the
image sensing region includes almost no pad image of the background
plane 3, since the reference coordinate system cannot be
determined, distance information must be unified using, e.g.,
texture information. Accordingly, in such case, a low reliability
coefficient is set since reliability may be impaired otherwise. The
reliability coefficient determined based on the angle with respect
to the image sensing plane and the residual is changed in
correspondence with that reliability coefficient, and the changed
coefficient is added to the distance information as a new
reliability coefficient.
[0262] A distance image obtained from the right and left images can
be displayed on the monitor 8. The image displayed at that time can
be selected from a natural image, line image, and polygon image, as
described above, and in any one of the display patterns,
reliability information can be displayed at the same time. A
natural image is displayed while the luminance of each region is
changed in correspondence with the reliability coefficient. On the
other hand, a line image is displayed while changing the thickness
or type of lines (e.g., a solid line, broken line, chain line, and
the like). Also, a polygon image is displayed by changing the
colors of polygons. In this manner, the reliability information can
be displayed at the same time.
[0263] The time-series unification processing of the distance
information obtained as described above will be described below
with reference to FIG. 15.
[0264] Distance information 120 is time-serially generated based on
the obtained stereoscopic images 110, while the unit for detecting
the moving amount, direction, and the like of the image sensing
head device 1 sends that information. The coordinate system
converter 121 converts the distance information onto a unified
coordinate system using such information by the processing method
(to be described later). Converting the distance information allows
easy unification of information obtained time-serially.
[0265] Subsequently, a plurality of distance information converted
onto the unified coordinate system are unified.
[0266] Upon unification, the reliability information is used. For
example, assuming that two distance information data are obtained,
and they have different reliability coefficients in their
overlapping portion, the information with a higher reliability is
selected. Alternatively, information may be unified while being weighted in
correspondence with their reliability coefficients. When three or
more overlapping region data are present, unification is similarly
done in correspondence with the reliability coefficients.
Thereafter, the reliability coefficient is added to the unified
distance information. Since data with higher reliability is
selected upon unification, the reliability of the unified distance
information can be improved.
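A minimal sketch of this selection rule follows, representing each distance map as a dict from a region key to a (distance, reliability) pair; the dict encoding and the tie-handling average are assumptions:

def unify_distance_maps(map_a, map_b):
    """Where the maps overlap, keep the entry with the higher
    reliability coefficient; on a tie, average the distances and keep
    the shared coefficient (a stand-in for the weighting variant
    mentioned in the text). Non-overlapping entries pass through."""
    unified = {}
    for key in set(map_a) | set(map_b):
        va, vb = map_a.get(key), map_b.get(key)
        if va is None or vb is None:
            unified[key] = va if va is not None else vb
        elif va[1] != vb[1]:
            unified[key] = va if va[1] > vb[1] else vb
        else:
            unified[key] = ((va[0] + vb[0]) / 2.0, va[1])
    return unified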
[0267] As shown in FIG. 15, the unificator 122 of the second
embodiment executes processing for removing identical points and
intermediate point correction processing as in the first
embodiment.
[0268] In the system of the second embodiment, the "unified
coordinate system" used in the above-mentioned unification
processing is the same as that explained with reference to FIGS. 12
and 13 for the first embodiment, and a detailed description thereof
will be omitted.
[0269] The unified distance information can be displayed on the
monitor 8. The three-dimensional shape of the object viewed from an
arbitrary view point can be observed by operating the operation
unit 11. At this time, the reliability information can be displayed
at the same time as in the case wherein the distance information
obtained from the right and left images is displayed. With this
display, since a low-reliability region can be determined at a
glance, the user can recognize the region to be additionally
sensed, and can perform additional image sensing.
[0270] <Modification of Second Embodiment> . . . Third
Modification
[0271] The third modification of the second embodiment will be
explained below.
[0272] FIG. 21 shows the outline of the third modification.
[0273] Referring to FIG. 21, reference numeral 2101 denotes a
principal object; 2100, a three-dimensional shape extractor of the
three-dimensional information processing apparatus; 100, an image
sensing lens; and 200, an illumination unit. Also, reference
numeral 2102 denotes a calibration pad. The three-dimensional shape
extractor detects the posture based on the image of this pad. Note
that letters A, B, C, and D on the pad 2102 serve as markers used
for detecting the posture of the extractor 2100. The posture of the
camera can be calculated based on the directions of these markers,
distortions of marker images, and the like.
[0274] FIG. 22 is a block diagram showing the three-dimensional
shape extractor 2100 according to the third modification in detail.
Note that the components denoted by the same reference numerals in
FIG. 22 except for symbols R and L have the same functions and
operations as those in the second embodiment, and a detailed
description thereof will be omitted. As shown in FIG. 22, the
three-dimensional shape extractor 2100 has substantially the same
functions and operations as those in the second embodiment, except
that it has a single-lens arrangement.
[0275] The operation in the third modification will be explained
below.
[0276] Since the apparatus of the third modification attains
posture detection in combination with the pad 2102, the image of
the pad 2102 must be obtained within an appropriate range upon
image sensing. The image separator 105 performs calculations or
template matching between the pre-stored feature portions (the four
corners A, B, C, and D in FIG. 21) and an image which is being
currently sensed, and outputs the detection signal to the system
controller 210. The system controller 210 sets the focal length so
that the image of the pad 2102 falls within an appropriate range in
the field of view. At the same time, the system controller 210
holds the focal length information in its memory 910.
[0277] With this processing, since the image of the entire pad is
kept within the field of view, the posture can always be detected
based on the shapes of the markers. Also, since the image of the
entire pad always falls within the field of view, reliability can
be improved in the corresponding point extraction processing. Since
the principal object 2101 is present in front of the pad, if the
calculated distance information indicates a point beyond the pad,
that calculation result can be deleted. Also, since the pad region can be
determined, the search region for extracting corresponding points
can be limited, and consequently, a large template size can be used
to improve precision for corresponding point extraction.
[0278] FIG. 23 is a flow chart showing the operation of the
three-dimensional information processing apparatus according to the
third modification.
[0279] As shown in the flow chart in FIG. 23, when the power supply
is turned on, and various parameters of the optical system such as
an exposure condition, in-focus point adjustment, and the like are
set (steps S21 to S25), an LED of the display unit 240 is turned on
(step S26) to inform the user of the input ready state. In response
to this indication, the user starts input (step S27) and presses
the release button 230 at appropriate intervals while moving the
extractor 2100 so as to input images (step S28). At this time, the
system controller 210 sets the focal length on the basis of
information from the image separator 105 so that the characteristic
portions of the pad 2102 including the principal object fall within
an appropriate range in the field of view. At the same time, the
system controller 210 stores image sensing parameter information
including the focal lengths at the individual image sensing
positions in the memory 910. The posture detector 4 detects the
posture based on the states of the characteristic portions (step
S29).
[0280] The image processor 220 reads out a plurality of image
signals held in image memories 73 and 75, and converts and corrects
images into those with an identical focal length on the basis of
the image sensing parameter information held in the memory 910 of
the system controller. Furthermore, the image processor 220
extracts the object shape using the corrected image signals and the
posture signal detected by the posture detector 4.
[0281] Thereafter, reliability information is added to the obtained
three-dimensional shape information. In the third modification, the
reliability information is determined and added in correspondence
with the angle with respect to the image sensing plane and the
distance from the image sensing plane, as shown in FIG. 24.
[0282] In the case of FIG. 24, when the angle with respect to the
image sensing plane falls within the range from 0° to 30° and the
object distance falls within the range from 10 cm to 30 cm, the
reliability coefficient is 3, and this value indicates the highest
reliability. On the other hand, when the angle with respect to the
image sensing plane falls within the range from 80° to 90° and the
object distance is 60 cm, the reliability coefficient is 0, and
this value indicates the lowest reliability. The data of an area
with the reliability coefficient = 0 may be deleted. In this
manner, reliability data is added as 2-bit information to each
area.
[0283] The three-dimensional shape information added with the
reliability information is supplied to the recorder 250. The
recorder 250 converts the input signal into an appropriate format,
and records the converted signal.
[0284] <Advantages of Second Embodiment>
[0285] As described in detail above, according to the second
embodiment, since the reliability of the extracted
three-dimensional shape information is determined on the basis of
the angle of the object with respect to the image sensing plane,
the object distance, and the image correspondence that can be
discriminated from the residual or correlation, the reliability of
the obtained three-dimensional shape information can be improved.
When the three-dimensional shape information is processed and
displayed in correspondence with the reliability, the user can be
visually informed of the reliability.
[0286] In the second embodiment and third modification, the
reliability is determined using the residual or correlation upon
extracting corresponding points, the angle of the object with
respect to the image sensing plane and object distance, the ratio
of the pad image with respect to the image sensing region, and the
position information of the pad. In addition to them, the
reliability of the obtained three-dimensional shape can also be
determined using light emitted by a light source and reflected by
the object and the angle of the pad with respect to the image
sensing plane.
[0287] A case using light emitted by a light source and reflected
by the object will be explained below.
[0288] Light reflected by the object can be discriminated to some
extent on the basis of the luminance information of image signals.
This is because when the reflectance of the object is high, the
luminance becomes very high over a certain range at the position
where the reflected light enters the lens. The portion with the
high luminance is removed as that obtained by reflection. More
specifically, threshold values are determined in correspondence
with the respective luminance levels, and the reliability
coefficients of 0 to 3 are determined in accordance with the
threshold values.
[0289] A case using the angle of the pad with respect to the image
sensing plane will be explained below.
[0290] In this case, the reliability coefficients are added in
correspondence with the angle of the pad like in a case wherein the
reliability coefficients are set in correspondence with the angle
of the object with respect to the image sensing plane. This
utilizes the fact that if the reliability of the reference
coordinate system is low, the three-dimensional shape on the
reference coordinate system also has low reliability since the
reference coordinate system is obtained from the pad. For example,
when the angle of the pad falls within the range from 0° to 60°,
the reliability coefficient is 3; when the angle of the pad falls
within the range from 60° to 75°, the reliability coefficient is 2;
when the angle of the pad falls within the range from 75° to 85°,
the reliability coefficient is 1; and when the angle of the pad
falls within the range from 85° to 90°, the reliability coefficient
is
0. The reason why the pad angle detection is set to have higher
reliability than the object angle detection is that the angle can
be precisely calculated from a plurality of data by, e.g., the
method of least squares since the pad is recognized as a plane in
advance.
[0291] In the above description, the reliability coefficient is
2-bit information, but the number of bits may be increased as
needed.
[0292] As described above, according to the second embodiment,
since the reliability of the extracted three-dimensional shape
information is determined on the basis of the angle of the object
with respect to the image sensing plane, the object distance, and
the image correspondence that can be discriminated from the
residual or correlation, the reliability of the obtained
three-dimensional shape information can be improved. When the
three-dimensional shape information is processed and displayed in
correspondence with the reliability, the user can be visually
informed of the reliability.
[0293] <Third Embodiment>
[0294] The third embodiment aims at improving the image sensing
timing.
[0295] FIG. 25 is a diagram showing the arrangement and use state
of an automatic image sensing apparatus 1100 as an image sensing
apparatus according to the third embodiment of the present
invention. In FIG. 25, the same reference numerals denote the same
parts as in the previously described prior art shown in FIG. 1. The
differences in FIG. 25 from FIG. 1 are that a posture sensor 1128,
a process controller 1129, and an object recognition circuit 1130
are added to the arrangement shown in FIG. 1. In FIG. 25, reference
numerals 1142 and 1143 denote signal lines.
[0296] In the automatic image sensing apparatus 1100 of the present
invention, a plurality of means can be used as image sensing
condition detection means and, for example, the posture sensor
1128, an image sensing parameter detection circuit 1123, the object
recognition circuit 1130, and a corresponding point extraction
circuit 1122 in FIG. 25 correspond to such means.
[0297] The operation when these constituting elements are used will
be explained below.
[0298] The operation in the simultaneous processing mode will be
exemplified below.
[0299] In the automatic image sensing apparatus 1100, the posture
sensor 1128 always detects the rotation angle and moving amount of
the apparatus 1100, and the process controller 1129 controls the
process so that image signals are input to the storage circuits
1120 and 1121 every time the automatic image sensing apparatus 1100
moves by a predetermined amount or rotates by a predetermined
angle.
When the posture sensor 1128 detects that the apparatus 1100 has
completed one revolution around an object 1101, the process
controller 1129 reads out images from the image signal storage
circuits 1120 and 1121, and starts simultaneous processing of the
corresponding point extraction circuit 1122, the image sensing
parameter detection circuit 1123, and a three-dimensional
information unifying circuit 1125.
[0300] FIG. 26 shows the arrangement of the posture sensor 1128 in
detail. As shown in FIG. 26, three small vibration gyros 1201,
1202, and 1203 are arranged so that their axes extend in directions
perpendicular to each other, and independently detect the rotation
angular velocities (pitch, yaw, and roll) of the automatic image
sensing apparatus 1100. Integrators 1204, 1205, and 1206
respectively integrate the detected values, and convert them into
rotation angles of the automatic image sensing apparatus 1100. When
the photographer performs image sensing so that an object 1101
always falls within the frame, the rotation angles of the automatic
image sensing apparatus 1100 itself substantially match information
indicating the degree of revolution of the automatic image sensing
apparatus 1100 around the object 1101. Based on such information,
when the pitch or yaw angle has changed by a predetermined angle,
the process controller 1129 causes images to be stored. Although
changes in the roll direction are not directly used in process
control, if the automatic image sensing apparatus 1100 rolls
considerably and both pitch information and yaw information are
mixed and output, the roll information is used for accurately
separating these outputs. The merits of the arrangement using the
angular velocity sensors are a very compact arrangement, high
sensor sensitivity, and very high precision, since only a single
integration is required.
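
For illustration, this single integration might be sketched as
follows, assuming a fixed sampling period and ignoring gyro drift.

    class GyroPosture:
        """Single integration of three orthogonal vibration-gyro
        outputs (rad/s) into pitch, yaw and roll angles.  The fixed
        sampling period dt is an assumption; drift compensation is
        omitted."""

        def __init__(self, dt=0.01):
            self.dt = dt
            self.pitch = self.yaw = self.roll = 0.0

        def update(self, rate_pitch, rate_yaw, rate_roll):
            # One integration step per angular velocity sample.
            self.pitch += rate_pitch * self.dt
            self.yaw += rate_yaw * self.dt
            self.roll += rate_roll * self.dt
            return self.pitch, self.yaw, self.roll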
[0301] The posture sensor 1128 may be constituted by acceleration
sensors to detect accelerations.
[0302] FIG. 27 shows the layout of acceleration sensors 1301, 1302,
1303, 1304, 1305, and 1306 that make up the posture sensor 1128. In
general, since an acceleration sensor detects linear vibrations, a
pair of sensors are arranged parallel to each other. Reference
numerals 1310 to 1315 respectively denote integrators each for
performing integration twice. Each integrator integrates the
corresponding acceleration sensor output twice to calculate the
position moving amount. When the integral outputs from a channel
consisting of a pair of acceleration sensors are added to each
other, translation components (X, Y, Z) in the attachment direction
of the pair of acceleration sensors can be obtained; when the
outputs are subjected to subtraction, rotation components (.alpha.,
.beta., .gamma.) can be obtained. To attain such calculations,
adders 1320, 1321, and 1322, and subtractors 1330, 1331, and 1332
are arranged.
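
For illustration, the processing of one such channel might be
sketched as follows; the zero initial conditions and the sensor
separation (baseline) are assumptions.

    import numpy as np

    def translation_and_rotation(acc_a, acc_b, dt, baseline):
        """Process one channel made up of a parallel pair of
        acceleration sensors: integrate each output twice to obtain
        displacement, then take the sum for the translation component
        and the difference (scaled by the sensor separation) for the
        rotation component.  Zero initial velocity/position and the
        baseline value are assumptions."""
        def double_integrate(acc):
            vel = np.cumsum(np.asarray(acc)) * dt  # accel -> velocity
            return np.cumsum(vel) * dt             # velocity -> position

        disp_a = double_integrate(acc_a)
        disp_b = double_integrate(acc_b)
        translation = (disp_a + disp_b) / 2.0    # common-mode motion
        rotation = (disp_a - disp_b) / baseline  # small-angle rotation (rad)
        return translation, rotation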
[0303] The process controller 1129 checks the moving amount of the
automatic image sensing apparatus 1100 relative to the object 1101
to control the image input timings to the image signal storage
circuits 1120 and 1121. Although this detection method requires a
complicated sensor arrangement, since all the degrees of freedom
(horizontal X, vertical Y, back-and-forth Z, pitch .alpha., yaw
.beta., and roll .gamma.) of the automatic image sensing apparatus
1100 can be detected at the same time, changes in view point with
respect to the object 1101 can be accurately detected.
[0304] Furthermore, regarding methods for detecting the relative
positional relationship between two objects in a non-contact
manner, "Survey of helmet tracking technologies," SPIE Vol. 1456,
Large-Screen-Projection, Avionics, and Helmet-Mounted Displays
(1991), p. 86 (to be referred to as a reference hereinafter) has
descriptions about the principles, characteristics, and the like of
the individual methods.
[0305] Such principles can be applied to the posture sensor 1128 of
the automatic image sensing apparatus 1100. This reference
describes the principle of analyzing the relative position on the
basis of bright point images sensed by a camera. When such a
technique is
applied to the automatic image sensing apparatus 1100, the image
sensing parameter detection circuit 1123 is controlled to operate
all the time using the signal lines 1142 and 1143 in FIG. 25
without going through the image signal storage circuits 1120 and
1121. The image sensing parameter detection circuit 1123 analyzes
an image of a known bright point pattern, and detects the moving
amount and posture of the automatic image sensing apparatus
1100.
[0306] FIG. 28 shows an example of the image storage timings of the
automatic image sensing apparatus 1100.
[0307] In FIG. 28, reference numeral 1400 denotes a path formed
when the photographer manually holds and moves the automatic image
sensing apparatus 1100 around the object 1101. Reference numeral
1401 denotes an image sensing start position, which corresponds to
the storage timing of the first image.
[0308] Also, reference numerals 1402, 1403, 1404, 1405, . . . ,
1409 denote the detection timings of changes, by a predetermined
amount, in X- or Y-direction or in rotation angle .alpha. or .beta.
under the assumption that the image sensing system points in the
direction of the object 1101, and images are stored at the timings
of these positions 1402, 1403, 1404, 1405, . . . , 1409.
[0309] At the timing of the position 1409 corresponding to the end
of one revolution, the coordinate X and the rotation angle .beta.
assume values equal to those at the position 1401, but other values
(Y, Z, .alpha., .gamma.) do not always match those at the position
1401. However, in the automatic image sensing apparatus 1100, the
start and end points need not always strictly match; when the
deviations of Y, Z, and .alpha. from the start point are smaller
than predetermined values, image input is terminated as soon as X
and .beta. match those at the start point.
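
For illustration, the storage-timing decision and the
end-of-revolution test might be sketched as follows; the step
sizes and tolerances are assumptions, .alpha. and .beta. are taken
as pitch and yaw as defined above, and angles are assumed to be
wrapped modulo 360.degree.

    def should_store(pose, last_stored, step_angle=15.0, step_pos=0.05):
        """Store an image when the pitch/yaw angle or the X/Y position
        has changed by a predetermined amount since the last stored
        frame.  The step sizes are illustrative assumptions."""
        return (abs(pose["pitch"] - last_stored["pitch"]) >= step_angle or
                abs(pose["yaw"] - last_stored["yaw"]) >= step_angle or
                abs(pose["x"] - last_stored["x"]) >= step_pos or
                abs(pose["y"] - last_stored["y"]) >= step_pos)

    def revolution_complete(pose, start, tol_angle=2.0, tol_pos=0.02):
        """Terminate image input when X and the yaw angle (beta) return
        to their start values while the deviations of Y, Z and the
        pitch angle (alpha) stay below small tolerances.  Angles are
        assumed to be wrapped modulo 360 degrees."""
        return (abs(pose["x"] - start["x"]) < tol_pos and
                abs(pose["yaw"] - start["yaw"]) < tol_angle and
                abs(pose["y"] - start["y"]) < tol_pos and
                abs(pose["z"] - start["z"]) < tol_pos and
                abs(pose["pitch"] - start["pitch"]) < tol_angle)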
[0310] The automatic image sensing apparatus 1100 need not always be
moved in a plane (e.g., the path 1400 in FIG. 28) parallel to the
ground to perform image sensing. For example, an image sensing
method of moving the apparatus 1100 above the object 1101 may be
used.
[0311] In FIG. 28, reference numeral 1410 denotes a path when the
photographer manually holds and moves the automatic image sensing
apparatus 1100 above the object 1101 to perform image sensing.
Reference numerals 1411, 1412, 1413, . . . , 1419 denote storage
timing positions. In this image sensing mode, the values Y and
.alpha. are detected in place of X and .beta. to perform image
input control, and when .alpha. has changed by 180.degree., the image input
is stopped.
[0312] FIGS. 29A to 29C show an example of input images obtained
when the image input is made at the timings of the positions 1401
to 1405. As can be seen from FIGS. 29A to 29C, time-serial images
obtained by viewing the object 1101 in turn from slightly different
view points are obtained.
[0313] FIGS. 30A to 30C show an example of the image input timings
different from those in FIGS. 29A to 29C. In an image sensing mode
of this example, an image sensing unit set with a large image
sensing magnification is used, so that the object 1101 extends
beyond the frame. In this mode, the automatic image sensing
apparatus 1100 is moved roughly in the X-direction to perform image
sensing. In such an image sensing mode, when the overlapping region
with the previously sensed image in each frame reaches a
predetermined area, i.e., at the timing at which each hatched
portion in FIGS. 30A to 30C reaches the predetermined area, the
sensed images are stored in the image signal storage circuits 1120
and 1121. In this mode, since a large image sensing magnification
is set, the image and shape of the object 1101 can be analyzed in
detail, and continuous images can be stably input under the control
of the process controller 1129.
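
For illustration, such overlap-based triggering might be sketched
as follows, reducing the overlap test to an estimated horizontal
shift; the target overlap ratio is an assumption.

    def overlap_ratio(shift_px, frame_width):
        """Fraction of the current frame that still overlaps the
        previously stored frame, given a horizontal shift in pixels
        (the shift is assumed to be estimated elsewhere, e.g. by
        correlation)."""
        return max(0.0, 1.0 - abs(shift_px) / float(frame_width))

    def trigger_by_overlap(shift_px, frame_width, target=0.5):
        """Store the frame when the overlapping region has shrunk to
        the predetermined area; the target ratio is an assumption."""
        return overlap_ratio(shift_px, frame_width) <= target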
[0314] In the above description, signal storage and process control
operations are attained based on the position and angle of the
automatic image sensing apparatus 1100. Also, the storage and
process control operations may be attained by analyzing the image
itself of the object 1101, as will be described below.
[0315] For example, the object recognition circuit 1130 shown in
FIG. 25 is used. The object recognition circuit 1130 detects
changes in object image from changes over time in image signal. For
example, the difference from a past image is detected, and when
the difference reaches a predetermined value, image signals are
input. Since this method does not directly detect the movement of
the automatic image sensing apparatus 1100, the processing timing
precision is low, but since the processing is simple and no extra
sensor is required, the entire automatic image sensing apparatus
1100 can be rendered compact.
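
For illustration, this difference-based control might be sketched
as follows; the mean-absolute-difference measure and the threshold
are assumptions.

    import numpy as np

    def difference_trigger(current, past, threshold=12.0):
        """Trigger image input when the mean absolute difference from
        the previously stored image reaches a predetermined value.
        The threshold is an illustrative assumption."""
        diff = np.abs(current.astype(np.float32) - past.astype(np.float32))
        return float(diff.mean()) >= threshold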
[0316] Furthermore, the corresponding point extraction circuit 1122
in FIG. 25 may operate all the time, and distance image data output
from this corresponding point extraction circuit 1122 may be
analyzed to attain process control. When the automatic image
sensing apparatus 1100 has moved by a predetermined amount, the
detected distance image changes accordingly. When time changes in
distance image reach a predetermined amount, image input can be
performed. In this method, when the object 1101 has a large uneven
portion, a large signal is output even when changes in position of
the image sensing system are small. For this reason, image sensing
is performed at short intervals for such an uneven portion;
otherwise, image sensing is performed at long intervals. In
general, since the shape of such uneven portion is to be analyzed
in detail, images can be input more efficiently according to this
method.
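
For illustration, this distance-image-based control might be
sketched as follows; the change measure and the threshold are
assumptions.

    import numpy as np

    def distance_change_trigger(depth_now, depth_prev, threshold=0.05):
        """Trigger image input when the temporal change of the
        distance image reaches a predetermined amount.  Invalid
        (non-finite) depth values are ignored; the threshold is an
        assumption."""
        mask = np.isfinite(depth_now) & np.isfinite(depth_prev)
        if not mask.any():
            return True  # no common valid region: view changed greatly
        change = float(np.abs(depth_now[mask] - depth_prev[mask]).mean())
        return change >= threshold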
[0317] Similarly, a method of using an error signal output from the
corresponding point extraction circuit 1122 in FIG. 25 is also
available. Note that the error signal is information which
indicates a pixel position where corresponding points cannot be
normally detected upon detecting corresponding points in image
signals obtained from the right and left image sensing units in
units of pixels. Such a phenomenon occurs when so-called occlusion
has occurred, i.e., when a portion that can be viewed from one
image sensing unit cannot be viewed from the other; when the
illumination conditions of the right and left image sensing units
differ considerably from each other, e.g., when directly reflected
light from the illumination unit enters only one image sensing
unit; or when the surface of the object 1101 is flat and has no
texture, so that corresponding points cannot be detected. However,
such image sensing conditions may allow corresponding point
extraction without errors if the view point of the image sensing
apparatus is changed.
[0318] In the automatic image sensing apparatus 1100, process
control is performed at a timing at which the error output of the
corresponding point extraction circuit 1122 changes time-serially,
so as to input images. In this method, since characteristic
information of the object 1101 that cannot be accurately detected
at a certain timing due to, e.g., occlusion can be compensated for
by an image sensed at another timing, the three-dimensional shape
of even an object with large unevenness can be efficiently
extracted.
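
For illustration, this error-signal-based control might be
sketched as follows; representing the error signal as a per-pixel
failure map and the threshold value are assumptions.

    import numpy as np

    def error_signal_trigger(errors_now, errors_prev, threshold=0.02):
        """Trigger image input when the error output changes
        time-serially, here measured as the change in the fraction of
        pixels whose correspondence failed (occlusion, one-sided
        reflections, or lack of texture)."""
        rate_now = np.count_nonzero(errors_now) / errors_now.size
        rate_prev = np.count_nonzero(errors_prev) / errors_prev.size
        return abs(rate_now - rate_prev) >= threshold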
[0319] The processing flow controlled by the process controller
1129 will be described below.
[0320] In the above description, the image input timings to the
image signal storage circuits 1120 and 1121 in the simultaneous
processing mode have been explained. The automatic image sensing
apparatus 1100 also has a sequential processing mode for performing
shape extraction processing while sensing images of the object
1101. In this case as well, when an unnecessarily large number of
images are to be processed, the calculation volumes of the
corresponding point extraction circuit 1122, the image sensing
parameter detection circuit 1123, and the three-dimensional
information unifying circuit 1125 increase, and output data from
the signal lines 1140 and 1141 become large. As a consequence, the
buffer circuits 1126 and 1127 require a large storage capacity.
[0321] In view of this problem, using the output from the
above-mentioned posture sensor 1128 and information of the sensed
images, the process controller 1129 controls the processing start
timings of the corresponding point extraction circuit 1122, the
image sensing parameter detection circuit 1123, and the
three-dimensional information unifying circuit 1125. More
specifically, when the automatic image sensing apparatus 1100 is
moved along the path 1400 in FIG. 28, the images sensed at the
position 1401 are processed by the corresponding point extraction
circuit 1122 and the image sensing parameter detection circuit 1123
to extract a distance image. Subsequently, even when the processing
has ended before the apparatus 1100 is moved to the position 1402,
images acquired during the movement are not processed, and the
corresponding point extraction circuit 1122 and the image sensing
parameter detection circuit 1123 are stopped. During this interval,
image signals obtained from image sensing elements 1114 and 1115
are discarded or shutters 1112 and 1113 are closed to stop scanning
of the image sensing elements 1114 and 1115. With this control, the
consumption power of the image processing circuits and the
peripheral circuits of the image sensing elements 1114 and 1115,
which consume large electric power, can be reduced. Subsequently,
when the automatic image sensing apparatus 1100 is located at the
position 1402 in FIG. 28, the processing of the corresponding point
extraction circuit 1122, the image sensing parameter detection
circuit 1123, and the three-dimensional information unifying
circuit 1125 is started in synchronism with the beginning of
vertical scanning of the image sensing elements 1114 and 1115.
[0322] When the moving speed of the automatic image sensing
apparatus 1100 is high and processing cannot be done within a given
period, images that cannot be processed are sequentially stored in
the image signal storage circuits 1120 and 1121. The process
controller 1129 transfers the next images from the image signal
storage circuits 1120 and 1121 by detecting the end of processing
in the corresponding point extraction circuit 1122 and the image
sensing parameter detection circuit 1123.
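
For illustration, this start/stop control and queueing behavior
might be sketched as follows; the structure of the controller and
the process callback are assumptions, not the embodiment's circuit
arrangement.

    from collections import deque

    class SequentialProcessController:
        """Queue frames captured at the trigger positions and start
        the processing stage only when it is idle, so that the
        sensing front end and processing circuits can be stopped
        between triggers to save power.  The process callback, which
        stands in for the corresponding point extraction, parameter
        detection and unification circuits, is hypothetical."""

        def __init__(self, process):
            self.queue = deque()   # stands in for the storage circuits
            self.process = process
            self.busy = False

        def on_trigger(self, stereo_pair):
            if self.busy:
                self.queue.append(stereo_pair)  # processing lags behind
            else:
                self.busy = True
                self.process(stereo_pair)

        def on_processing_done(self):
            if self.queue:
                self.process(self.queue.popleft())  # next stored pair
            else:
                self.busy = False  # idle until the next trigger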
[0323] The above-mentioned embodiment has exemplified a case using
two image sensing units. The image input timing control of the
present invention can be similarly applied to an apparatus which
analyzes the three-dimensional shape using a single image sensing
unit.
[0324] As described above, according to the image sensing apparatus
of the third embodiment, since the image input control and
processing start control are done in correspondence with the
position/angular relationship between the object 1101 and the image
sensing apparatus 1100 and changes in object image, the capacities
of the image signal storage circuits 1120 and 1121 and the buffer
circuits 1126 and 1127 can be minimized, and complicated image
processes can be attained within a minimum required time.
[0325] <Modification of Third Embodiment> . . . Fourth
Modification
[0326] The fourth modification of the third embodiment of the
present invention will be described with reference to FIGS. 31 and
32.
[0327] The fourth modification is applied to a system in which a
plurality of image information sensed by moving around an object
are directly stored, and the input image is selected and displayed
as it is in place of a CG image. The following description will
exemplify a case wherein two image sensing units are used to easily
obtain a sense of reality, and a stereoscopic image is displayed on a
stereoscopic display. However, the image input timing control of
the present invention can also be applied to a system using a
single image sensing unit.
[0328] FIG. 31 is a diagram showing the arrangement and operation
principle upon acquisition of images of an image sensing apparatus
1700 according to the fourth modification, and the same reference
numerals in FIG. 31 denote the same parts as in FIG. 25 of the
third embodiment described above. The differences in FIG. 31 from
FIG. 25 are that circuits associated with stereoscopic image
analysis such as the corresponding point extraction circuit 1122,
image sensing parameter detection circuit 1123, ROM 1124,
three-dimensional information unifying circuit 1125, buffer
circuits 1126 and 1127, and the like are omitted from the
arrangement shown in FIG. 25, and an image sensing condition
storage circuit 1702 is added to the arrangement in FIG. 25.
[0329] In the fourth modification, the image signal storage circuits
1120 and 1121 are housed in a storage unit 1701, which is
detachable from the image sensing apparatus 1700, together with the
image sensing condition storage circuit 1702, and upon completion
of image sensing, the storage unit 1701 can be detached and
carried.
[0330] In the image sensing apparatus 1700 according to the fourth
modification, the image sensing positions and angles detected by
the posture sensor 1128 are stored in the image sensing condition
storage circuit 1702 simultaneously with the sensed images.
[0331] Since other arrangements and operations in the image sensing
apparatus 1700 according to the fourth modification are the same as
those in the image sensing apparatus 1100 according to the third
embodiment, a detailed description thereof will be omitted.
[0332] FIG. 32 shows the arrangement of an image display means for
displaying an image sensed by the image sensing apparatus 1700.
[0333] In FIG. 32, reference numeral 1801 denotes an image
reproduction unit; 1802, a stereoscopic display; 1803, a
three-dimensional mouse; and 1804, a coordinate comparison circuit.
The storage unit 1701 stores images around the object 1101, and
their image sensing directions and positions. When the operator
designates the observation direction of the object 1101 using the
three-dimensional mouse 1803, the coordinate comparison circuit
1804 checks if an image in the designated observation direction is
stored in the image sensing condition storage circuit 1702. If an
image in the designated observation direction is stored, image data
are read out from the image signal storage circuits 1120 and 1121
and are displayed on the stereoscopic display 1802. On the other
hand, if an image in the designated observation direction is not
stored, the image whose sensing direction is closest to the
designated direction is retrieved, and is displayed on the
stereoscopic display 1802.
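
For illustration, the retrieval performed by the coordinate
comparison circuit 1804 might be sketched as a nearest-direction
search, as follows; representing each stored image sensing
direction as a 3-D unit vector is an assumption.

    import numpy as np

    def nearest_view(stored_directions, designated):
        """Return the index of the stored image whose sensing
        direction is closest to the direction designated with the
        3-D mouse.  The stored directions are assumed to come from
        the image sensing condition storage circuit as unit
        vectors."""
        dirs = np.asarray(stored_directions, dtype=np.float64)
        q = np.asarray(designated, dtype=np.float64)
        q = q / np.linalg.norm(q)
        return int(np.argmax(dirs @ q))  # max cosine = smallest angle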
[0334] Since such an image sensing/display system does not calculate a
stereoscopic image as numerical value information but selects and
displays an image in the view point direction desired by the
operator, an object image which is discrete but is viewed virtually
from an arbitrary direction can be instantaneously displayed.
Hence, the operator can feel as if an actual object were present
there.
[0335] As described above, according to the image sensing apparatus
1700 of the fourth modification, even when the operator does not
move the apparatus at a constant speed around the object, images
can be properly input at appropriate positions. Accordingly, a
display image relatively close to that in the direction designated
by the observer can always be presented.
[0336] As many apparently widely different embodiments of the
present invention can be made without departing from the spirit and
scope thereof, it is to be understood that the invention is not
limited to the specific embodiments thereof except as defined in
the appended claims.
* * * * *