U.S. patent application number 13/968961 was filed with the patent office on 2013-12-12 for 3d geometric modeling and 3d video content creation.
This patent application is currently assigned to MantisVision Ltd.. The applicant listed for this patent is MantisVision Ltd.. Invention is credited to Gur Arie BITTAN, Eyal GORDON.
Application Number | 20130329018 13/968961 |
Document ID | / |
Family ID | 39111025 |
Filed Date | 2013-12-12 |
United States Patent
Application |
20130329018 |
Kind Code |
A1 |
GORDON; Eyal ; et
al. |
December 12, 2013 |
3D GEOMETRIC MODELING AND 3D VIDEO CONTENT CREATION
Abstract
A system, apparatus and method of obtaining data from a 2D image
in order to determine the 3D shape of objects appearing in said 2D
image, said 2D image having distinguishable epipolar lines, said
method comprising: (a) providing a predefined set of types of
features, giving rise to feature types, each feature type being
distinguishable according to a unique bi-dimensional formation; (b)
providing a coded light pattern comprising multiple appearances of
said feature types; (c) projecting said coded light pattern on said
objects such that the distance between epipolar lines associated
with substantially identical features is less than the distance
between corresponding locations of two neighboring features; (d)
capturing a 2D image of said objects having said projected coded
light pattern projected thereupon, said 2D image comprising
reflected said feature types; and (e) extracting: (i) said
reflected feature types according to the unique bi-dimensional
formations; and (ii) locations of said reflected feature types on
respective said epipolar lines in said 2D image.
Inventors: |
GORDON; Eyal; (Tel Aviv,
IL) ; BITTAN; Gur Arie; (Shoham, IL) |
|
Applicant: |
Name |
City |
State |
Country |
Type |
MantisVision Ltd. |
Petah Tikwa |
|
IL |
|
|
Assignee: |
MantisVision Ltd.
Petah Tikwa
IL
|
Family ID: |
39111025 |
Appl. No.: |
13/968961 |
Filed: |
August 16, 2013 |
Related U.S. Patent Documents
|
|
|
|
|
|
Application
Number |
Filing Date |
Patent Number |
|
|
12515715 |
May 20, 2009 |
8538166 |
|
|
PCT/IL2007/001432 |
Nov 20, 2007 |
|
|
|
13968961 |
|
|
|
|
60935427 |
Aug 13, 2007 |
|
|
|
60929835 |
Jul 13, 2007 |
|
|
|
60924206 |
May 3, 2007 |
|
|
|
60907495 |
Apr 4, 2007 |
|
|
|
60860209 |
Nov 21, 2006 |
|
|
|
Current U.S.
Class: |
348/48 |
Current CPC
Class: |
G06T 2207/20228
20130101; G01B 11/2513 20130101; G06T 15/205 20130101; G06T 7/536
20170101; G01B 11/25 20130101; G06T 7/521 20170101; G01C 11/025
20130101; G06K 9/2036 20130101; G06T 2200/04 20130101 |
Class at
Publication: |
348/48 |
International
Class: |
G01C 11/02 20060101
G01C011/02 |
Claims
1. A method, comprising: providing a first bi-dimensional coded
light pattern having a first plurality of feature types and a
second bi-dimensional coded light pattern having a second plurality
of feature types, and each feature type from the first and the
second plurality of feature types is distinguishable according to a
unique bi-dimensional formation of varying light intensities, and
each one of the unique bi-dimensional formations of varying light
intensities is associated with a unique combination of a plurality
of elements comprises: at least one maximum or minimum non-saddle
element and a plurality of saddle elements, and wherein each
feature type from the second plurality of feature types corresponds
to a respective feature type from the first plurality of feature
types, and wherein the at least one non-saddle element of each
feature type from the second plurality of feature types is inverted
relative to the at least one non-saddle element of the respective
feature type from the first plurality of feature types; projecting
said first bi-dimensional coded light pattern and said second
bi-dimensional coded light pattern on objects; capturing a first
image of said objects, having said first bi-dimensional coded light
pattern projected thereupon, and a second image of said objects,
having said second bi-dimensional coded light pattern projected
thereupon; and processing the first and the second images.
2. The method according to claim 1, wherein the saddle point is
neither maximum nor minimum.
3. The method according to claim 1, wherein said processing
comprises obtaining a resultant image from a subtraction of said
second image from said first image.
4. The method according to claim 3, further comprising: determining
a location of the at least one non-saddle element and of the
plurality of saddle elements of said reflected feature types in
said resultant image; and determining feature type locations in the
first or second image according to locations of the at least one
non-saddle element and of the plurality of saddle elements of said
reflected feature types in said resultant image.
5. The method according to claim 1, wherein said processing
comprises obtaining a resultant image from the addition of said
second image to said first image.
6. The method according to claim 5, wherein the resultant image
comprises accentuated saddle points of said first and said second
plurality of feature types, and further comprising determining
saddle identity information of a saddle point of said first and/or
said second plurality of feature types according to intensity
values of a respective accentuated saddle point in said resultant
image.
7. The method according to claim 1, wherein said processing
comprises obtaining a resultant image from the absolute value of a
subtraction of said second captured image from said first captured
image.
8. The method according to claim 7, wherein the resultant image
comprises borderlines between non-saddle points in the first or
second image.
9. The method according to claim 1, wherein said projecting
comprises one of the following: projecting the first and second
bi-dimensional coded light patterns temporally; projecting the
first and second bi-dimensional coded light patterns with different
spectral values of light; or projecting the first and second
bi-dimensional coded light patterns with differing
polarization.
10. The method according to claim 1, wherein said capturing
comprises capturing the first image and second image temporally,
giving rise to temporal imaging.
11. The method according to claim 10, further comprising carrying
out the temporal imaging over non-uniform time intervals.
12. The method according to claim 1, wherein said capturing further
comprises capturing the first image and the second image
simultaneously using spectral or polar differentiation.
13. An apparatus, comprising: a first bi-dimensional coded light
pattern having a first plurality of feature types and a second
bi-dimensional coded light pattern having a second plurality of
feature types, and each feature type from the first and the second
plurality of feature types is distinguishable according to a unique
bi-dimensional formation of varying light intensities, and each one
of the unique bi-dimensional formations of varying light
intensities is associated with a unique combination of a plurality
of elements comprises: at least one maximum or minimum non-saddle
element and a plurality of saddle elements, and wherein each
feature type from the second plurality of feature types corresponds
to a respective feature type from the first plurality of feature
types, and wherein the at least one non-saddle element of each
feature type from the second plurality of feature types is inverted
relative to the at least one non-saddle element of the respective
feature type from the first plurality of feature types; a
projection module capable of projecting said first bi-dimensional
coded light pattern and said second bi-dimensional coded light
pattern on objects; at least one imaging module capable of
capturing a first image of said objects, having said first
bi-dimensional coded light pattern projected thereupon, and a
second image of said objects, having said second bi-dimensional
coded light pattern projected thereupon; and an image processing
module capable of processing the first and the second images.
14. The apparatus according to claim 13, wherein the saddle point
is neither maximum nor minimum.
15. The apparatus according to claim 13, wherein said image
processing module is capable of obtaining a resultant image from a
subtraction of said second image from said first image.
16. The apparatus according to claim 15, wherein said image
processing module is capable of: determining a location of the at
least one non-saddle element and of the plurality of saddle
elements of said reflected feature types in said resultant image;
and of determining feature type locations in the first or second
image according to locations of the at least one non-saddle element
and of the plurality of saddle elements of said reflected feature
types in said resultant image.
17. The apparatus according to claim 13, wherein said image
processing module is capable of obtaining a resultant image from
the addition of said second image to said first image.
18. The apparatus according to claim 17, wherein the resultant
image comprises accentuated saddle points of said first and said
second plurality of feature types, and, wherein said image
processing module is capable of determining saddle identity
information of a saddle point of said first and/or said second
plurality of feature types according to intensity values of a
respective accentuated saddle point in said resultant image.
19. The apparatus according to claim 13, wherein said image
processing module is capable of obtaining a resultant image from
the absolute value of a subtraction of said second captured image
from said first captured image.
20. The apparatus according to claim 19, wherein the resultant
image comprises borderlines between non-saddle points in the first
or second image.
21. The apparatus according to claim 13, wherein said projection
module is capable of any one of the following: projecting the first
and second bi-dimensional coded light patterns temporally;
projecting the first and second bi-dimensional coded light patterns
with different spectral values of light; or projecting the first
and second bi-dimensional coded light patterns with differing
polarization.
22. The apparatus according to claim 13, wherein said at least one
imaging module is capable of capturing the first image and second
image temporally, giving rise to temporal imaging.
23. The apparatus according to claim 13, wherein said at least one
imaging module is capable of carrying out the temporal imaging over
non-uniform time intervals.
24. The apparatus according to claim 13, wherein said at least one
imaging module is capable of capturing the first image and the
second image simultaneously using spectral or polar
differentiation.
25. A program storage device readable by machine, tangibly
embodying a program of instructions executable by the machine to
perform a method, comprising: providing a first bi-dimensional
coded light pattern having a first plurality of feature types and a
second bi-dimensional coded light pattern having a second plurality
of feature types, and each feature type from the first and the
second plurality of feature types is distinguishable according to a
unique bi-dimensional formation of varying light intensities, and
each one of the unique bi-dimensional formations of varying light
intensities is associated with a unique combination of a plurality
of elements comprises: at least one maximum or minimum non-saddle
element and a plurality of saddle elements, and wherein each
feature type from the second plurality of feature types corresponds
to a respective feature type from the first plurality of feature
types, and wherein the at least one non-saddle element of each
feature type from the second plurality of feature types is inverted
relative to the at least one non-saddle element of the respective
feature type from the first plurality of feature types; projecting
said first bi-dimensional coded light pattern and said second
bi-dimensional coded light pattern on objects; capturing a first
image of said objects, having said first bi-dimensional coded light
pattern projected thereupon, and a second image of said objects,
having said second bi-dimensional coded light pattern projected
thereupon; and processing the first and the second images.
26. A computer program product comprising a computer useable medium
having computer readable program code embodied therein, the
computer program product comprising: computer readable program code
for causing the computer to provide a first bi-dimensional coded
light pattern having a first plurality of feature types and a
second bi-dimensional coded light pattern having a second plurality
of feature types, and each feature type from the first and the
second plurality of feature types is distinguishable according to a
unique bi-dimensional formation of varying light intensities, and
each one of the unique bi-dimensional formations of varying light
intensities is associated with a unique combination of a plurality
of elements comprises: at least one maximum or minimum non-saddle
element and a plurality of saddle elements, and wherein each
feature type from the second plurality of feature types corresponds
to a respective feature type from the first plurality of feature
types, and wherein the at least one non-saddle element of each
feature type from the second plurality of feature types is inverted
relative to the at least one non-saddle element of the respective
feature type from the first plurality of feature types; computer
readable program code for causing said first bi-dimensional coded
light pattern and said second bi-dimensional coded light pattern to
be projected on objects; computer readable program code for causing
the computer to obtain a first image of said objects, having said
first bi-dimensional coded light pattern projected thereupon, and a
second image of said objects, having said second bi-dimensional
coded light pattern projected thereupon; and computer readable
program code for causing the computer to process the first and the
second images.
Description
RELATIONSHIP TO EXISTING APPLICATIONS
[0001] The present application is a Division of U.S. patent
application Ser. No. 12/515,715 filed May 20, 2009, which is the
U.S. National Phase of PCT/IL2007/001432 filed Nov. 20, 2007, the
contents of which are hereby incorporated by reference.
[0002] The present application claims priority from U.S.
provisional patent 60/860,209 filed on Nov. 21, 2006, the contents
of which are hereby incorporated by reference.
[0003] The present application also claims priority from U.S.
provisional patent 60/907,495 filed on Apr. 4, 2007, the contents
of which are hereby incorporated by reference.
[0004] The present application also claims priority from U.S.
provisional patent 60/924,206 filed on May 3, 2007, the contents of
which are hereby incorporated by reference.
[0005] The present application also claims priority from U.S.
provisional patent 60/929,835 filed on Jul. 13, 2007, the contents
of which are hereby incorporated by reference.
[0006] The present application also claims priority from U.S.
provisional patent 60/935,427 filed on Aug. 13, 2007, the contents
of which are hereby incorporated by reference.
FIELD AND BACKGROUND OF THE INVENTION
[0007] The present invention relates to a system and a method for
three dimensional imaging and depth measurement of objects using
active triangulation methods, and, more particularly, but not
exclusively to three dimensional imaging of both objects at rest
and in motion.
[0008] Three dimensional sensor systems are used in a wide array of
applications. These sensor systems determine the shape and or
surface features of an object positioned in a scene of the sensor
system's view. In recent years, many methods have been proposed for
implementing 3D modeling systems that are capable of rapid
acquisition of accurate high resolution 3D images of objects for
various applications.
[0009] The precise configuration of such 3D imaging systems may be
varied. Many current triangulation-based systems use an array of at
least two or more cameras to determine the depth values by what is
known as passive stereo correspondence. Such a method is dependent
on the imaged surfaces being highly textured and therefore error
prone and non-robust. Furthermore, automatic correspondence
algorithms often contain an abundance of errors in matching between
shots from different cameras.
[0010] Other methods utilize LIDAR (Light Imaging Detection and
Ranging) systems to determine range and/or other information of a
distant target. By way of light pulses, the distance to an object
is determined by measuring the time delay between transmission of
the light pulse and detection of the reflected signal. Such
methods, referred to as time-of-flight, are generally immune to
occlusions typical of triangulation methods, but the accuracy and
resolution are inherently inferior to that obtained in
triangulation methods.
[0011] Active triangulation-based 3D sensor systems and methods
typically have one or more projectors as a light source for
projecting onto a surface and one or more cameras at a defined,
typically rectified relative position from the projector for
imaging the lighted surface. The camera and the projector therefore
have different optical paths, and the distance between them is
referred to as the baseline. Through knowledge of the baseline
distance as well as projection and imaging angles, known
geometric/triangulation equations are utilized to determine
distance to the imaged object. The main differences among the
various triangulation methods known in the art lie in the method of
projection as well as the type of light projected, typically
structured light, and in the process of image decoding to obtain
three dimensional data.
[0012] Methods of light projection vary from temporal methods to
spatial coded structured light. Examples in the art of various
forms of projected light include "laser fans" and line coded
light.
[0013] Once a 2D image of the object is captured upon which a light
source is projected as described above, image processing software
generally analyzes the image to extract the three dimensional
geometry of the object and possibly the three dimensional movement
of the object through space. This is generally done by comparison
of features in the captured image with previously captured images
and/or with known characteristics and features of the projected
light. The implementation of this step varies widely among
currently known methods, typically a function of the method used to
project light onto the object. Whatever the method used, the
outcome of the process is generally a type of
disparity/displacement map of identified features in the captured
image. The final step of 3D spatial location and/or 3D motion
capture involves the translation of the above mentioned disparity
map into depth data, according to well known geometric equations,
particularly triangulation equations.
[0014] The very fact that hundreds of methods and systems exist
today hints at the underlying problem of a lack of a sufficiently
effective and reliable method for 3D imaging. Furthermore, most of
the systems that utilize active triangulation methods today are
restricted to non dynamic imaging of objects. That is to say, even
at high frame rates and shutter speeds, the imaged object must
remain static during image acquisition. For example, a building may
be imaged, but not a person riding a bike or cars moving on the
street. This limitation on three dimensional imaging is a direct
result of the need in most triangulation based 3D imaging systems
to obtain a series of images while changing the characteristics of
the light source over time. For example, many methods utilize a
number of light patterns projected over a time interval, known as
temporal coding.
[0015] Nonetheless, many methods have been introduced over the
years for the three dimensional imaging of moving objects, most of
which are based on the projection of a single pattern of light on
the imaged object, thus enabling reconstruction of the depth
measurements from one or more simultaneous images rather than
multiple images over a time interval. These single pattern methods
can be broken down into two main classes. The first is assisted
stereo methods wherein a single pattern is projected and a
comparison is made between two or more images from two or more
imaging systems to compute depth data.
[0016] The second is structured light methods, and in particular
coded structured light methods. These methods often use only one
imaging system or camera. Coded structured light methods can
further be broken down into several encoding types. One such type
using coded structured light is spatial coding, which suffers from
a wide range of problems of precision and reliability, particularly
regarding feature identification, and other serious performance
limitations. As a result, spatial single pattern systems have been
implemented commercially only in a very limited manner. A further
structured coding technique in the art is spectral or color coding,
which requires a color neutral surface and usually requires
expensive imaging systems.
[0017] Therefore, there is an unmet need for, and it would be
highly useful to have, a system and a method that overcomes the
above drawbacks.
SUMMARY OF THE INVENTION
[0018] According to an embodiment of the present invention, a
method of obtaining data from a 2D (two-dimensional) image in order
to determine the 3D (three-dimensional) shape of objects appearing
in the 2D image, said 2D image having distinguishable epipolar
lines, said method comprising:
[0019] providing a predefined set of types of features, giving rise
to feature types, each feature type being distinguishable according
to a unique bi-dimensional formation,
[0020] providing a coded light pattern comprising multiple
appearances of said feature types;
[0021] projecting said coded light pattern on said objects such
that the distance between epipolar lines associated with
substantially identical features is less than the distance between
corresponding locations of two neighboring features;
[0022] capturing a 2D image of said objects having said projected
coded light pattern projected thereupon, said 2D image comprising
reflected said feature types; and
[0023] extracting: [0024] a) said reflected feature types according
to the unique bi-dimensional formations; and [0025] b) locations of
said reflected feature types on respective said epipolar lines in
said 2D image.
[0026] According to another embodiment of the present invention, a
method of obtaining data from a 2D image in order to determine the
3D (three-dimensional) shape of objects appearing in said 2D image,
said 2D image having distinguishable epipolar lines, said method
comprising:
[0027] providing a predefined set of types of features, giving rise
to feature types, each feature type being distinguishable according
to a unique bi-dimensional formation;
[0028] providing a coded light pattern comprising multiple
appearances of said feature types;
[0029] projecting said coded light pattern on said objects at an
angle in relation to said epipolar lines, said angle being in
accordance with a determined limiting distance between said
distinguishable epipolar lines;
[0030] capturing a 2D image of said objects having said projected
coded light pattern projected thereupon, said 2D image comprising
reflected said feature types; and
[0031] extracting: [0032] a) said reflected feature types according
to the unique bi-dimensional formations; and [0033] b) locations of
said reflected feature types on respective said epipolar lines in
said 2D image.
[0034] According to still another embodiment of the present
invention, the method further comprises determining corresponding
3D spatial coordinates of the objects by means of the locations of
the reflected feature types on the respective epipolar lines.
[0035] According to still another embodiment of the present
invention, the method further comprises providing a 3D point cloud
by a compilation of the 3D spatial coordinates.
[0036] According to still another embodiment of the present
invention, the method further comprises processing the 3D point
cloud to form a 3D surface.
[0037] According to still another embodiment of the present
invention, the method further comprises adding texture data of the
objects.
[0038] According to a further embodiment of the present invention,
the method further comprises providing the coded light pattern as
one or more of the following:
[0039] a) a repeating periodic pattern; and
[0040] b) a non-periodic pattern.
[0041] According to still a further embodiment of the present
invention, the method further comprises projecting the coded light
pattern such that each of the feature types appears at most once on
one or more predefined sections of the distinguishable epipolar
lines.
[0042] According to still a further embodiment of the present
invention, the method further comprises providing the entire length
of epipolar lines in the 2D image within the predefined sections of
the epipolar lines.
[0043] According to still a further embodiment of the present
invention, the method further comprises providing the predefined
sections as sections of the length of the epipolar lines in the 2D
image.
[0044] According to an embodiment of the present invention, the
method further comprises projecting the coded light pattern at an
angle in relation to the epipolar lines in accordance with the
parameters of a Sin.sup.-1 (P/YC) equation.
[0045] According to an embodiment of the present invention,
extracting the reflected feature types according to the unique
bi-dimensional formations comprises determining: [0046] a) elements
that comprise said reflected feature types; and [0047] b) epipolar
distances between said elements of said reflected feature
types.
[0048] According to another embodiment of the present invention,
the method further comprises providing the elements of the
reflected feature types as bi-dimensional light intensity
formations on a pixel area of a sensor.
[0049] According to still another embodiment of the present
invention, the method further comprises identifying the
bi-dimensional light intensity formations by locating critical
points of light intensity among adjacent sampled pixel values.
[0050] According to still another embodiment of the present
invention, the method further comprises providing the elements as
predefined values.
[0051] According to still another embodiment of the present
invention, the method further comprises expressing locations of the
elements as two-dimensional coordinates in the 2D image.
[0052] According to still another embodiment of the present
invention, the method further comprises providing the epipolar
distances between the elements as one or more of the following:
[0053] a) constant distances; and [0054] b) varying distances.
[0055] According to still another embodiment of the present
invention, the method further comprises providing a spectral
encoding within the unique bi-dimensional formation.
[0056] According to still another embodiment of the present
invention, the method further comprises capturing the 2D image at
each of successive time frames.
[0057] According to a further embodiment of the present invention,
the method further comprises determining, for each of the 2D
images, corresponding 3D spatial coordinates of the objects by
means of the locations of the reflected feature types on the
respective epipolar lines for each of the 2D images.
[0058] According to still a further embodiment of the present
invention, the method further comprises forming a video stream
providing 3D motion capture by means of the 3D spatial
coordinates.
[0059] According to still a further embodiment of the present
invention, the method further comprises computing the 3D spatial
coordinates substantially in real-time to provide a substantially
real-time video stream of the 3D motion capture.
[0060] According to still a further embodiment of the present
invention, the method further comprises generating the
bi-dimensional coded light pattern by superimposing a plurality of
values on a predefined pattern, said values derived from one or
more of the following: [0061] a) Debruijn sequences; [0062] b)
Perfect Maps; [0063] c) M-Arrays; and [0064] d) Pseudo-random
codes.
[0065] According to an embodiment of the present invention, an
apparatus is configured to obtain data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said apparatus
comprising:
[0066] a predefined set of types of features, giving rise to
feature types, each feature type being distinguishable according to
a unique bi-dimensional formation;
[0067] a coded light pattern comprising multiple appearances of
said feature types;
[0068] a projector configured to project said coded light pattern
on said objects such that the distance between epipolar lines
associated with substantially identical features is less than the
distance between corresponding locations of two neighboring
features;
[0069] an imaging device configured to capture a 2D image of said
objects having said projected coded light pattern projected
thereupon, said 2D image comprising reflected said feature types;
and
[0070] an image processing device configured to extract: [0071] a)
said reflected feature types according to the unique bi-dimensional
formations; and [0072] b) locations of said reflected feature types
on respective said epipolar lines in said 2D image.
[0073] According to an embodiment of the present invention, the
locations on the respective epipolar lines of the reflected feature
types determine corresponding 3D spatial coordinates of the
objects.
[0074] According to another embodiment of the present invention,
the projector is further configured such that the projecting of the
coded light pattern is at an angle in relation to the epipolar
lines in accordance with the parameters of a Sin.sup.-1 (P/YC)
equation.
[0075] According to still another embodiment of the present
invention, the apparatus is further configured to move in
three-dimensions in relation to the objects during the imaging of
said objects.
[0076] According to still another embodiment of the present
invention, the apparatus is further configured to capture the
images of the objects that are moving in three dimensions in
relation to said apparatus.
[0077] According to still another embodiment of the present
invention, a method of obtaining data from a 2D video sequence,
comprising two or more frames, each frame being a 2D image, in
order to determine the 3D shape of moving objects appearing in said
2D video sequence, each frame having distinguishable epipolar
lines, said method comprising:
[0078] providing a bi-dimensional coded light pattern having a
plurality of types of features, giving rise to feature types, each
feature type being distinguishable according to a unique
bi-dimensional formation,
[0079] projecting said coded light pattern on said moving objects
such that the distance between epipolar lines associated with
substantially identical features is less than the distance between
corresponding locations of two neighboring features;
[0080] capturing said 2D video sequence having said moving objects
having said projected coded light pattern projected thereupon, said
2D video sequence comprising reflected said feature types, and
[0081] extracting: [0082] a) said reflected feature types according
to the unique bi-dimensional formations; and [0083] b) locations of
said feature types on respective said epipolar lines.
[0084] According to another embodiment of the present invention, an
apparatus is configured to obtain data from 2D video sequence,
comprising two or more frames, each frame being a 2D image, in
order to determine the 3D shape of moving objects appearing in said
2D video sequence, each frame having distinguishable epipolar
lines, said apparatus comprising:
[0085] a bi-dimensional coded light pattern having a plurality of
types of features, giving rise to feature types, each feature type
being distinguishable according to a unique bi-dimensional
formation;
[0086] a projector configured to project said coded light pattern
on said moving objects such that the distance between epipolar
lines associated with substantially identical features is less than
the distance between corresponding locations of two neighboring
features;
[0087] an imaging device configured to capture said 2D video
sequence having said moving object having said projected coded
light pattern projected thereupon, said 2D video sequence
comprising reflected feature types; and
[0088] an image processing device configured to extract: [0089] a)
said reflected feature types according to the unique bi-dimensional
formations; and [0090] b) locations of said reflected feature types
on respective said epipolar lines.
[0091] According to still another embodiment of the present
invention, a method of obtaining distance data from a scene
comprising one or more objects, the method comprising:
[0092] projecting a bi-dimensional coded light pattern having a
plurality of feature types onto said scene such that each feature
of said light pattern is reflected from a different respective
reflection location in said scene, giving rise to a reflected
feature, said feature types being distinguishable according to a
unique bi-dimensional formation, said projecting performed such
that the distance between epipolar lines associated with
substantially identical features is less than the distance between
corresponding locations of two neighboring features;
[0093] capturing a 2D image of said scene comprising the reflected
features;
[0094] determining for each said reflected feature appearing in
said 2D image: [0095] a) a respective feature type according to
said unique bi-dimensional formation; and [0096] b) a location of
said each reflected feature on the respective epipolar line;
and
[0097] deriving for said each reflected feature, in accordance with
the respective determined feature type and with the respective
determined location on said respective epipolar line, a respective
distance of a respective associated reflection location.
[0098] According to an embodiment of the present invention, the
method further comprises providing a 3D geometric shape of the one
or more objects by a compilation of the respective distances.
[0099] According to another embodiment of the present invention,
determining of the respective feature type and the respective
epipolar line comprises effecting a correspondence between the
respective determined feature type in the 2D image and the location
of the same feature type in a projected pattern.
[0100] According to an embodiment of the present invention, a
system is configured to obtain scene data from a target scene, the
system comprising:
[0101] a pattern-projecting apparatus operative to project a
bi-dimensional coded light pattern having a plurality of feature
types onto said scene such that each feature reflects from a
respective reflection location, giving rise to a reflected feature,
said feature types being distinguishable according to a unique
bi-dimensional formation, wherein said coded light pattern are
projected such that the distance between epipolar lines associated
with substantially identical features is less than the distance
between corresponding locations of two neighboring features;
[0102] an image capture apparatus operative to capture a 2D image
of said scene comprising the reflected features; and
[0103] an image processing element operative, for each said
reflected feature of said 2D image, to: [0104] i) determine, for
said each reflected feature: [0105] A. a feature type according to
said unique bi-dimensional formation; and [0106] B. a location of
said each reflected feature on the respective epipolar line; and
[0107] ii) derive, for each said reflected feature, in accordance
with the respective determined feature type and with the respective
determined location on the respective epipolar line, a respective
distance of a respective associated reflection location.
[0108] According to still another embodiment of the present
invention, a method of determining the 3D shape of imaged objects
appearing in at least two obtained 2D images, said obtained images
related to each other by defined epipolar fields, said method
comprising:
[0109] providing a bi-dimensional coded light pattern having a
plurality of types of features, giving rise to feature types, each
feature type being distinguishable according to a unique
bi-dimensional formation;
[0110] projecting said coded light pattern on said imaged objects
such that the distance between epipolar lines associated with
substantially identical features is less than the distance between
corresponding locations of two neighboring features;
[0111] capturing a first 2D image of said imaged objects;
[0112] capturing a second 2D image of said imaged objects;
[0113] selecting a pixel area PX1 of said first 2D image, the
content appearing on PX1 being constrained to appear on a specific
epipolar line EPm in said second 2D image;
[0114] finding said content of said pixel area PX1 in said second
2D image along said epipolar line EPm; and
[0115] determining locations of appearances of said content of said
pixel area PX1 between said first and second 2D images.
[0116] According to still another embodiment of the present
invention, a method of determining the 3D shape of imaged objects
appearing in at least two obtained 2D images, said obtained images
related to each other by defined epipolar fields, each epipolar
field comprising one or more epipolar lines, said method
comprising:
[0117] providing a bi-dimensional coded light pattern having a
plurality of types of features, giving rise to feature types, each
feature type being distinguishable according to a unique
bi-dimensional formation;
[0118] projecting said coded light pattern on said imaged objects
at an angle in relation to at least one of said epipolar fields,
said angle being in accordance with a determined limiting distance
between distinguishable epipolar lines;
[0119] capturing a first 2D image of said imaged objects;
[0120] capturing a second 2D image of said imaged objects;
[0121] selecting a pixel area PX1 of said first 2D image, the
content appearing on PX1 being constrained to appear on a specific
epipolar line EPm in said second 2D image;
[0122] finding said content of said pixel area PX1 in said second
2D image along said epipolar line EPm; and
[0123] determining locations of appearances of said content of said
pixel area PX1 between said first and second 2D images.
[0124] According to an embodiment of the present invention, the
method further comprises determining corresponding 3D spatial
coordinates of the imaged objects by means of the locations of the
appearances of the content of the pixel area PX1.
[0125] According to an embodiment of the present invention, the
method further comprises extracting for each of said 2D images,
independently, reflected feature types, said feature types being
reflected from the imaged objects, according to the unique
bi-dimensional formations and the feature type locations on the
respective epipolar lines.
[0126] According to still another embodiment of the present
invention, an apparatus is configured to determine the 3D shape of
imaged objects appearing in two or more obtained 2D images, said
two or more 2D images being obtained from at least two imaging
devices, said two or more 2D images being related to each other by
defined epipolar fields, said apparatus comprising:
[0127] a bi-dimensional coded light pattern having a plurality of
types of features, giving rise to feature types, each feature type
being distinguishable according to a unique bi-dimensional
formation;
[0128] a projector configured to project said coded light pattern
on said imaged objects such that the distance between epipolar
lines associated with substantially identical features is less than
the distance between corresponding locations of two neighboring
features;
[0129] a first imaging device configured to capture a first 2D
image of said imaged objects;
[0130] a second imaging device configured to capture a second 2D
image of said imaged objects;
[0131] an image processing configured to: [0132] a) select a pixel
area PX1 of said first 2D image, the content appearing on said
pixel area PX1 being constrained to appear on a specific epipolar
line EPm in the second 2D image; [0133] b) find said content of
said pixel area PX1 in said second 2D image along said epipolar
line EPm; and [0134] c) determine locations of appearances of said
content of said pixel area PX1 between said first and second 2D
images.
[0135] According to an embodiment of the present invention, the
locations of the appearances of the content of the pixel area PX1
determine corresponding 3D spatial coordinates of the imaged
objects.
[0136] According to another embodiment of the present invention,
the image processing device is further configured to extract in
each image, independently, the reflected feature types, said
feature types being reflected from the imaged objects, according to
the unique bi-dimensional formations and the feature type locations
on the respective epipolar lines.
[0137] According to still another embodiment of the present
invention, a method of obtaining data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said method
comprising:
[0138] providing a bi-dimensional coded light pattern having a
plurality of feature types, each of said feature types being
distinguishable according to a unique bi-dimensional formation;
[0139] providing the inverse of said bi-dimensional coded light
pattern, giving rise to an inverse coded light pattern;
[0140] projecting said coded light pattern and said inverse coded
light pattern on said objects;
[0141] capturing:
[0142] i) a first 2D image of said objects, having said projected
coded light pattern projected thereupon, said first 2D image
comprising reflected said feature types; and
[0143] ii) a second 2D image of said objects, having said inverse
coded light pattern projected thereupon, said second 2D image
comprising reflected said feature types;
[0144] obtaining a resultant image from the subtraction of said
second 2D image from said first 2D image; and
[0145] extracting said reflected feature types according to the
unique bi-dimensional formations and the feature type locations on
respective said epipolar lines in said resultant image.
[0146] According to an embodiment of the present invention, the
method further comprises projecting the light patterns by one or
more of the following: [0147] a) temporally; [0148] b) with
differing spectral values of light; and [0149] c) with differing
polarization.
[0150] According to another embodiment of the present invention,
the method further comprises capturing the first 2D image and
second 2D images temporally, giving rise to temporal imaging.
[0151] According to still another embodiment of the present
invention, the method further comprises carrying out the temporal
imaging over non-uniform time intervals.
[0152] According to still another embodiment of the present
invention, the method further comprises capturing the first and
second 2D images simultaneously by using spectral
differentiation.
[0153] According to a further embodiment of the present invention,
the method further comprises carrying out the extraction of the
feature types in the resultant image by determining intensity
values of sample points.
[0154] According to still another embodiment of the present
invention, an apparatus is configured to obtain data from a 2D
image in order to determine the 3D shape of objects appearing in
said 2D image, said 2D image having distinguishable epipolar lines,
said apparatus comprising:
[0155] a predefined set of feature types, each feature type being
distinguishable according to a unique bi-dimensional formation;
[0156] a bi-dimensional coded light pattern comprising multiple
appearances of the feature types;
[0157] an inverse bi-dimensional coded light pattern comprising
multiple appearances of the inversed feature types;
[0158] a projector configured to project said bi-dimensional coded
light pattern and said inverse bi-dimensional light pattern on said
objects;
[0159] a first imaging device configured to capture a first 2D
image of said objects, having said projected coded light pattern
projected thereupon, said first 2D image comprising reflected said
feature types;
[0160] a second imaging device configured to capture a second 2D
image of said objects, having said inverse coded light pattern
projected thereupon, said second 2D image comprising reflected said
feature types; and
[0161] an image processing device configured to: [0162] i) obtain a
resultant image from the subtraction of said second 2D image from
said first 2D image; and [0163] ii) extract said reflected feature
types according to the unique bi-dimensional formations and the
feature type locations on respective said epipolar lines in said
resultant image.
[0164] According to a further embodiment of the present invention,
a method of obtaining texture data from two 2D images, each of said
images containing a reflected code used to obtain depth data of
imaged objects independently for each image, said method
comprising:
[0165] providing a bi-dimensional coded light pattern having a
plurality of feature types, each feature type being distinguishable
according to a unique bi-dimensional formation,
[0166] providing the inverse of said bi-dimensional coded light
pattern, giving rise to an inverse coded light pattern;
[0167] projecting said coded light pattern and said inverse coded
light pattern on said objects;
[0168] capturing:
[0169] i) a first 2D image of said objects, having said projected
coded light pattern projected thereupon, said first 2D image
comprising reflected said feature types,
[0170] ii) a second 2D image of said objects, having said inverse
coded light pattern projected thereupon, said second 2D image
comprising reflected said feature types,
[0171] obtaining a resultant image from the addition of said second
2D image with said first 2D image, said resultant image providing
texture information of said imaged objects.
[0172] According to still a further embodiment of the present
invention, a method of obtaining data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said method
comprising:
[0173] providing a bi-dimensional coded light pattern having a
plurality of feature types, each feature type being distinguishable
according to a unique bi-dimensional formation;
[0174] providing the inverse of said bi-dimensional coded light
pattern, giving rise to an inverse coded light pattern;
[0175] projecting said coded light pattern and said inverse coded
light pattern on said objects;
[0176] capturing: [0177] i) a first 2D image of said objects,
having said projected coded light pattern projected thereupon, said
first 2D image comprising reflected said feature types; and [0178]
ii) a second 2D image of said objects, having said inverse coded
light pattern projected thereupon, said second 2D image comprising
reflected said feature types;
[0179] obtaining a resultant image from the absolute value of the
subtraction of said second 2D image from said first 2D image, said
resultant image comprising outlines of said reflected feature
types; and
[0180] extracting said outlines and outline locations on respective
said epipolar lines in said resultant image.
[0181] According to an embodiment of the present invention, the
method further comprises determining corresponding 3D spatial
coordinates of the objects by means of the outline locations on the
respective epipolar lines.
[0182] According to still a further embodiment of the present
invention, a method of obtaining data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said method
comprising:
[0183] providing a first bi-dimensional coded light pattern having
a plurality of feature types, each feature type being
distinguishable according to a unique bi-dimensional formation and
each feature type comprising a plurality of elements having varying
light intensity, wherein said plurality of elements comprises
non-maximum and non-minimum elements and comprises one or more of
the following: [0184] a) at least one maximum element; [0185] b) at
least one minimum element; and [0186] c) at least one saddle
element;
[0187] providing a second bi-dimensional coded light pattern
comprising said first bi-dimensional coded light pattern having the
one or more maximum and/or minimum inverted elements;
[0188] projecting said first bi-dimensional coded light pattern and
said second bi-dimensional coded light pattern on said objects;
[0189] capturing: [0190] i) a first 2D image of said objects,
having said first projected bi-dimensional coded light pattern
projected thereupon, said first 2D image comprising reflected said
feature types; and [0191] ii) a second 2D image of said objects,
having said second bi-dimensional coded light pattern projected
thereupon, said second 2D image comprising reflected said feature
types;
[0192] obtaining a resultant image from the subtraction of said
second 2D image from said first 2D image; and
[0193] extracting from said resultant image one or more of the
following: at least one maximum element, at least one minimum
element and at least one saddle element, and extracting their
locations respective to said epipolar lines in said resultant
image.
[0194] According to an embodiment of the present invention, the
method further comprises determining the corresponding feature
types by determining their locations and/or by determining values
of the one or more of the following: at least one maximum element,
at least one minimum element, at least one non-maximum element, at
least one non-minimum element and at least one saddle element.
[0195] According to another embodiment of the present invention,
the method further comprises determining corresponding 3D spatial
coordinates of the objects by determining locations of the one or
more of the following: at least one maximum element, at least one
minimum element, and at least one saddle element and/or by
determining locations of the non-maximum and non-minimum elements
on the respective epipolar lines.
[0196] According to still a further embodiment of the present
invention, a method of obtaining data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said method
comprising:
[0197] providing a first bi-dimensional coded light pattern having
a plurality of feature types, each feature type being
distinguishable according to a unique bi-dimensional formation and
each feature type comprising a plurality of elements having varying
light intensity, wherein said plurality of elements comprising
non-maximum and non-minimum elements comprising one or more of the
following: [0198] a) at least one maximum element; [0199] b) at
least one minimum element; and [0200] c) at least one saddle
element;
[0201] providing a second bi-dimensional coded light pattern
comprising said first bi-dimensional coded light pattern having the
one or more maximum and/or minimum inverted elements;
[0202] projecting said first bi-dimensional coded light pattern and
said second bi-dimensional coded light pattern on said objects;
[0203] capturing: [0204] i) a first 2D image of said objects,
having said first projected coded light pattern projected
thereupon, said first 2D image comprising reflected said feature
types; and [0205] ii) a second 2D image of said objects, having
said second coded light pattern projected thereupon, said second 2D
image comprising reflected said feature types;
[0206] obtaining a resultant image from the addition of said second
2D image with said first 2D image; and
[0207] extracting from said resultant image one or more said
non-maximum and non-minimum elements and extracting locations of
said non-maximum and non-minimum elements on one or more respective
epipolar lines in said resultant image.
[0208] According to still a further embodiment of the present
invention, a method of obtaining data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said method
comprising:
[0209] providing a first bi-dimensional coded light pattern having
a plurality of feature types, each feature type being
distinguishable according to a unique bi-dimensional formation and
each feature type comprising a plurality of elements having varying
light intensity, wherein said plurality of elements comprises
non-maximum and non-minimum elements and comprises one or more of
the following:
[0210] a) at least one maximum element;
[0211] b) at least one minimum element; and
[0212] c) at least one saddle element;
[0213] providing a second bi-dimensional coded light pattern
comprising said first bi-dimensional coded light pattern having the
one or more maximum and/or minimum inverted elements;
[0214] projecting said first bi-dimensional coded light pattern and
said second bi-dimensional coded light pattern on said objects;
[0215] capturing: [0216] i) a first 2D image of said objects,
having said first projected bi-dimensional coded light pattern
projected thereupon, said first 2D image comprising reflected said
feature types; and [0217] ii) a second 2D image of said objects,
having said second bi-dimensional coded light pattern projected
thereupon, said second 2D image comprising reflected said feature
types;
[0218] obtaining a resultant image from the absolute value of the
subtraction of said second 2D image from said first 2D image, said
resultant image comprising borderlines of said reflected feature
types; and
[0219] extracting said borderlines and borderline locations on
respective said epipolar lines in said resultant image.
[0220] According to an embodiment of the present invention, the
method further comprises determining corresponding 3D spatial
coordinates of the objects by means of the borderline locations on
the respective epipolar
[0221] According to still a further embodiment of the present
invention, a method of obtaining data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said method
comprising:
[0222] providing a first bi-dimensional coded light pattern having
a plurality of feature types, each feature type being
distinguishable according to a unique bi-dimensional formation;
[0223] providing a second bi-dimensional coded light pattern,
different from said first pattern, having a plurality of feature
types, each feature type being distinguishable according to a
unique bi-dimensional formation;
[0224] projecting said first coded light pattern and said second
coded light pattern on said objects;
[0225] capturing: [0226] i) a first 2D image of said objects having
said first projected coded light pattern projected thereupon, said
first 2D image comprising reflected said feature types; and [0227]
ii) a second 2D image of said objects having said second coded
light pattern projected thereupon, said second 2D image comprising
reflected said feature types;
[0228] extracting, in each image independently, said reflected
feature types according to the unique bi-dimensional formations and
the feature type locations on respective said epipolar lines;
and
[0229] comparing regions of epipolar lines in said second 2D image
to similar regions along same epipolar lines in said first 2D image
to verify feature identities in said first 2D image.
[0230] According to still a further embodiment of the present
invention, a method of obtaining data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said method
comprising:
[0231] providing a bi-dimensional coded light pattern having a
plurality of feature types, each feature type being distinguishable
according to a unique bi-dimensional formation;
[0232] projecting said bi-dimensional coded light pattern on said
objects;
[0233] capturing: [0234] i) a first 2D image of said objects,
having said projected coded light pattern projected thereupon, said
first 2D image comprising reflected said feature types; and) [0235]
ii) a second 2D image of said objects, having ambient light
projected thereupon;
[0236] obtaining a resultant image from the subtraction of said
second 2D image from said first 2D image; and
[0237] extracting said reflected feature types and locations of
said reflected feature types on respective said epipolar lines in
said resultant image.
[0238] According to still a further embodiment of the present
invention, a method of obtaining data from a 2D image in order to
determine the 3D shape of objects appearing in said 2D image, said
2D image having distinguishable epipolar lines, said method
comprising:
[0239] providing a bi-dimensional coded light pattern having a
plurality of feature types, each feature type being distinguishable
according to a unique bi-dimensional formation;
[0240] projecting said coded light pattern on said objects;
[0241] capturing: [0242] i) a first 2D image of said objects,
having said projected coded light pattern projected thereupon, said
first 2D image comprising reflected said feature types; and [0243]
ii) a second 2D image of said objects, having uniform light
projected thereupon;
[0244] obtaining a resultant image from the division of said first
2D image by said second 2D image; and
[0245] extracting said reflected feature types and the locations of
said reflected feature types on respective said epipolar lines in
said resultant image.
[0246] Unless otherwise defined, all technical and scientific terms
used herein have the same meaning as commonly understood by one of
ordinary skill in the art to which this invention belongs. The
materials, methods, and examples provided herein are illustrative
only and not intended to be limiting.
[0247] Implementation of the method and system of the present
invention involves performing or completing certain selected tasks
or steps manually, automatically, or a combination thereof.
Moreover, according to actual instrumentation and equipment of
preferred embodiments of the method and system of the present
invention, several selected steps could be implemented by hardware
or by software on any operating system of any firmware or a
combination thereof. For example, as hardware, selected steps of
the invention could be implemented as a chip or a circuit. As
software, selected steps of the invention could be implemented as a
plurality of software instructions being executed by a computer
using any suitable operating system. In any case, selected stages
of the method and system of the invention could be described as
being performed by a data processor, such as a computing platform
for executing a plurality of instructions.
BRIEF DESCRIPTION OF THE DRAWINGS
[0248] The invention is herein described, by way of example only,
with reference to the accompanying drawings. With specific
reference now to the drawings in detail, it is stressed that the
particulars shown are by way of example and for purposes of
illustrative discussion of the preferred embodiments of the present
invention only, and are presented in order to provide what is
believed to be the most useful and readily understood description
of the principles and conceptual aspects of the invention. In this
regard, no attempt is made to show structural details of the
invention in more detail than is necessary for a fundamental
understanding of the invention, the description taken with the
drawings making apparent to those skilled in the art how the
several forms of the invention may be embodied in practice.
[0249] In the drawings:
[0250] FIG. 1A-1G are simplified diagrams illustrating one
embodiment of the present invention showing how bi-dimensional
light pattern imaging is used together with the principles of
epipolar geometry.
[0251] FIG. 2 is a simplified flow chart showing the steps in the
process of three dimensional image and/or motion-image capture of
the present embodiments.
[0252] FIG. 3 illustrates an exemplary spatially-periodic 2D
pattern P1 projected to obtain epipolar separation and
corresponding image `PI of the reflected pattern.
[0253] FIG. 4 is a simplified flow chart and diagram illustrating
how the corresponding 3D spatial locations of identified features
in the captured 2D image are derived.
[0254] FIG. 5 shows further exemplary point clouds derived through
the methods of embodiments of the present invention.
[0255] FIG. 6A is a simplified schematic representation of a
reflected pattern after being projected in accordance with the
epipolar separation principle.
[0256] FIG. 6B again shows a simplified schematic representation of
a reflected pattern, but now includes multiple appearances of the
same feature on a given epipolar line in the image.
[0257] FIG. 7 is a simplified illustration of a captured image of a
preferred light pattern projected in accordance with a particular
embodiment of the present invention.
[0258] FIG. 8 is a simplified schematic representation of a pattern
now projected at a smaller rotation angle with respect to the
direction of the epipolar field.
[0259] FIG. 9 is a simplified schematic diagram of a non ordered
and non periodic bi dimensional spatial pattern that may be used in
certain embodiments.
[0260] FIG. 10 shows illustrations of exemplary feature types that
comprise preferred encoded pattern P1.
[0261] FIG. 11 is an illustration of the exemplary pattern P1 as
seen an image I'.sub.P1 after being reflected from an imaged
cube.
[0262] FIGS. 12A and 12B illustrate the construction process and
the encoding scheme inherent in pattern P1.
[0263] FIG. 12C contains simplified illustrations of the sample
points in pattern P1 after being reflected from an imaged object
and viewed as part of a captured image in an imaging apparatus.
[0264] FIG. 13A illustrates a preferred projection and imaging
method using pattern P1 that enable epipolar separation
techniques.
[0265] FIG. 13B1 illustrates a preferred pattern P2 that provides
an increased number of sample points.
[0266] FIG. 13B2 illustrate the construction process and the
encoding scheme inherent in pattern P2.
[0267] FIG. 13C illustrates two preferred patterns, T1 and T2, for
a temporal encoding embodiment.
[0268] FIG. 13D illustrates a close-up of preferred pattern T2 for
a temporal encoding embodiment.
[0269] FIG. 14 shows exemplary characters or feature types along
with image I.sub.P1 of exemplary pattern P1.
[0270] FIG. 15 is an illustration of a preferred dual projection
and dual imaging embodiment of the present invention involving P1
and -P.sub.1.
[0271] FIG. 16 is an illustration of the addition of the two
images, P.sub.1 and -P.sub.1.
[0272] FIG. 17 is a preferred embodiment of a dual projection and
dual imaging method of the present invention.
[0273] FIG. 18 shows the addition of the two imaged patterns of
FIG. 17.
[0274] FIG. 19 shows an illustration of the resultant image
obtained from the absolute value of the subtraction of image
-P.sub.1 from image P.sub.1.
[0275] FIG. 20 is an illustration of the resultant image obtained
from the absolute value of the subtraction of image -C from image
P.sub.1.
[0276] FIG. 21 is an illustration of a particular embodiment of
dual pattern projection and dual pattern imaging.
[0277] FIG. 22 is an illustration of a particular embodiment of
dual projection using uniform light.
[0278] FIG. 23 is a simplified illustration showing how an epipolar
field between two images is related.
[0279] FIG. 24 is a simplified illustration showing how separate
epipolar fields exist for respective imaging apparatuses and a
projector in a particular embodiment.
[0280] FIG. 25 is a simplified flow chart of a particular
embodiment of the present invention showing the steps in the
process of generating a 3D map of an urban region.
[0281] FIG. 26 is simplified flow chart of a particular embodiment
of the present invention showing the steps in the process of 3D
imaging of an anatomical part.
[0282] FIG. 27 is a typical flow chart of a particular embodiment
of the present invention showing the steps in the process of
determining distances of obstacles in the path of a reversing motor
vehicle.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0283] The present embodiments provide an apparatus and a method
for three dimensional imaging of static and moving objects. In
particular, it is possible to determine three-dimensional spatial
shape data of a given object(s) by (a) projecting an encoded
bi-dimensional light pattern on the object(s); and (b) analyzing a
2D image of the reflected bi-dimensional light pattern utilizing
the optical constraints and principles associated with epipolar
geometry and triangulation.
[0284] The principles and operation of an apparatus and method
according to the present invention may be better understood with
reference to the drawings and accompanying description.
[0285] Before explaining at least one embodiment of the invention
in detail, it is to be understood that the invention is not limited
in its application to the details of construction and the
arrangement of the components set forth in the following
description or illustrated in the drawings. The invention is
capable of other embodiments or of being practiced or carried out
in various ways. Also, it is to be understood that the phraseology
and terminology employed herein is for the purpose of description
and should not be regarded as limiting.
[0286] FIGS. 1A-1G: Determining a 3D Depth Map of an Object or
Objects
[0287] Reference is now made to FIG. 1A-1D, which are simplified
diagrams illustrating one embodiment of the present invention. In
the particular example of FIGS. 1A-1D, system 10 comprises
projector 12 and imaging apparatus 14 found at respective locations
(x.sub.1.sup.p, y.sub.1.sup.p, z.sub.1.sup.p) and
(x.sub.1.sup.i,z.sub.1.sup.i). An encoded bi-dimensional light
pattern 16, comprising a predefined array of a finite set of
identifiable feature types, is projected by the projector 12 onto
surfaces in a scene containing imaged objects 18a, 18b, and 18c.
Exemplary features in the projected pattern are denoted as 20a,
20b, 20d, and 20f. In this particular example, the projected
pattern takes the form of an array of monochromatic light beams of
varying intensity, wherein combinations of adjacent light beams
comprise encoded features or letters having bi-dimensional spatial
formations. These features intersect the imaged objects' surfaces
at various reflection points 19a, 19b, 19d, and 19f and are then
observed in 2D image 22 captured on sensor 24 of imaging apparatus
14. The projected features 20a, 20b, 20d, and 20f thus appear as
reflected features 28a, 28b, 28d, and 28f in the captured image 22.
Imaging apparatus 14 may be for example a CCD or CMOS digital video
camera, or any other type of array of photo-detector elements.
[0288] The relative position 30 between projector 12 and sensor 24
is fixed, thus imparting a geometric constraint to system 10. This
constraint restricts the location of any given reflected feature
28a, 28b, 28d, and 28f in the captured image 22 to a unique set of
points in the image called an epipolar line. Epipolar lines .beta.,
.phi., and .rho. are shown in the figure. That is to say, for a
system 10 with a given light pattern and fixed relative position
between projector and sensor, for any object or group of objects
upon which the pattern is projected, each feature of the projected
pattern is always observed on a particular, singular epipolar line
in the captured image 22. Moreover, as will be explained below, the
above holds true whether the imaged objects are fixed or in
relative motion to the projector/imaging apparatus. The
relationship between each feature in the pattern and a constant
epipolar line is thus irrespective of the reflection point in 3D
space from which the feature is reflected.
[0289] The reflection points of the exemplary projected features
are denoted as 19a, 19b, 19d, and 19f and have associated
respective reflection distances D.sub.A, D.sub.B, D.sub.D, and
D.sub.F. The reflection distance is the distance from the sensor 24
of the imaging apparatus to the reflection point location in 3D
space on the imaged object 18a, 18b, 18c upon which a given feature
is reflected. Reflection distance is sometimes referred to as
depth.
[0290] Thus, as illustrated in FIGS. 1A-1D, each reflected feature
28a, 28b, 28d, and 28f found in image 22 is constrained to a
respective specific epipolar line, independent of the point in
space from which the feature was reflected. Feature 28a is thus
constrained to epipolar line .phi. independent of the 3D location
of 19a, or alternatively independent of reflection distance
D.sub.A. Features 28b and 28d are two reflected features from the
pattern that share the same epipolar line .beta.. Features 28b and
28d are always found on this epipolar line irrespective of
reflection distances D.sub.B and D. Similarly, feature 28f is
always reflected onto epipolar line .rho. independent of the
reflection distance D.sub.F.
[0291] However, the reflection distance, or alternatively the point
in 3D space from which the feature is reflected, does indeed
determine where along that epipolar line the feature appears. Thus,
the reflection distance does not affect the particular epipolar
line upon which a given reflected feature appears, but rather only
the precise location along that line.
[0292] To illustrate the change in position of a feature along the
epipolar line as a function of reflection distance, we now turn to
FIGS. 1E-1G. In each of FIGS. 1E, 1F, and 1G, the projector and
imaging apparatus, although always at a fixed relative distance to
one another, move in relation to the object 18a. In FIG. 1E,
projector 12 and imaging apparatus 14 are at locations
(x.sub.1.sup.p, y.sub.1.sup.p, z.sub.1.sup.p) and (x.sub.1.sup.i,
y.sub.1.sup.i, z.sub.1.sup.i) respectively. The reflection distance
of reflected feature 28a (projected feature 20a) on epipolar line
cp is seen as D.sub.A, after being reflected from reflection point
19a on object 18a. In FIG. 1F, the projector and imaging apparatus
move to respective points in space (x.sub.2.sup.p, y.sub.2.sup.p,
z.sub.2.sup.p) and (x.sub.2.sup.i, y.sub.2.sup.i, z.sub.2.sup.i).
The reflection distance of reflected feature 28a is now D.sub.A'
after being reflected from a different reflection point 19a' on
object 18a. As a result, feature 28a is now reflected onto a lower
part of epipolar line .phi.. In FIG. 1G, the projector and imaging
apparatus move to a third set of respective coordinates in space
(x.sub.3.sup.p, y.sub.3.sup.p, z.sub.3.sup.p) and (x.sub.3.sup.i,
y.sub.3.sup.i, z.sub.3.sup.i). The reflection distance of reflected
feature 28a is now D.sub.A'' after being reflected from a still
third reflection point 19a'' on object 18a. As a result, feature
28a is now reflected onto a higher part of epipolar line .phi..
However, no matter what the reflection distance, reflected feature
28a (projected feature 20a) must always appear on epipolar line
.phi. and only on .phi.. For purposes of clarity, each feature is
associated with only one epipolar line in the present embodiment.
It is understood that, as features comprise spatial formations of
light intensities in the captured image, elements of each feature
may lie on separately distinguishable epipolar lines.
[0293] It is understood that any relative motion between the system
10 and an imaged object causes the reflection distance of a
particular feature to change. This relative motion may result from
motion of the imaged object, from motion of the projector/imaging
system, or from motion of both the imaged object and
projector/imaging system. Any change in the reflection distance
causes a change in the precise location of a given reflected
feature along that feature's associated epipolar line. However, the
particular epipolar line upon which that feature appears remains
constant.
[0294] Therefore, we can conclude that the principles of epipolar
geometry dictate a mapping between a set of 2D coordinates in the
image and three dimensional coordinates in space that are viewed by
the imaging apparatus. Again, the precise point at which a given
captured feature appears along a particular epipolar line is a
function of the feature's reflection distance, or alternatively
stated, the point in 3D space from which the feature is
reflected.
[0295] For each image, or frame, each reflected feature is
identified according to feature type and the reflected feature's
location is determined along the feature's associated epipolar line
in the image, preferably by an image processing device 36. The
precise position of an identified feature along the feature's
epipolar line is then corresponded to that feature's position in
the original projected pattern. This correspondence of features
between the image and projected pattern allows for triangulation
based computations to determine the 3D spatial coordinate from
which the feature was reflected. Pre-computed triangulation tables
may be utilized to determine the three dimensional spatial location
of the point on the object from which the feature is reflected. In
some embodiments, these triangulation based calculations are
carried out by the image processing device 36.
[0296] This process may be repeated for a plurality of features of
the 2D image, where each feature is reflected off of a different
respective location on the surface of the 3D imaged object(s). For
any given image frame, each such identified feature in the captured
2D image leads to a three dimensional spatial location, the
compilation of all such spatial locations comprising a point cloud
of locations in 3D space. This 3D point cloud gives a three
dimensional mapping of the imaged object(s). Further processing of
the point cloud may yield a 3D mesh which essentially fuses the
points of the 3D cloud into 3D surfaces. This mesh may also be
given graphics texture based on additional texture capture of the
objects in the scene. For 3D mapping of objects in motion, the
above process described for a single image is carried out over a
series of images leading to 3D video.
[0297] FIG. 2: Simplified Flow Chart of 3D Image Capture
[0298] Reference is made to FIG. 2, which is a simplified flow
chart showing the steps in the process of three dimensional image
and/or motion-image capture of the present embodiments. Each step
of the flow chart is discussed in further detail in the following
figures. The flow chart, together with diagrams 1A-1G, gives the
reader a simplified and overall intuitive understanding of the
three dimensional imaging process described herein.
[0299] Step 70 is the provision of a predefined coded light
pattern. This coded light pattern, as exemplified in FIGS. 1A-1D
and denoted there as 16, is an array of a finite set of feature
types in the form of spatial formations of varying intensity light
beams. Preferred embodiments of patterns and their characteristics
are discussed in figures below. Step 72 is the projection of that
pattern on an object(s). Several preferred projection methods are
discussed in figures below as well. In step 74, a 2D image is
captured that contains features reflected off of the object(s) upon
which the pattern was projected. The image is analyzed to identify
features and their locations along respective epipolar lines, step
76. The locations of the features along their epipolar lines are
then associated with 3D coordinates in space from which the
features were reflected, step 78. This process of correlating
between feature locations along epipolar lines and 3D spatial
coordinates, carried out through triangulation techniques, is
discussed below. For each identified feature in the 2D image, a
corresponding 3D coordinate is thus derived indicating the point in
space at which that feature reflected off of an imaged object.
Through a compilation of all such 3D coordinates, a 3D point cloud
is derived that gives a three dimensional mapping of the imaged
object(s), step 80. Further processing of the point cloud may yield
a 3D mesh which essentially fuses the points of the 3D cloud into
3D surfaces. This mesh may also be given graphics texture based on
additional texture capture of the objects in the scene.
[0300] In the case where objects or the camera are in motion, steps
74-80 may be continuously repeated to obtain 3D motion capture. In
such a case of 3D motion capture, a series of 2D captured images of
the reflected light pattern off of the moving object(s) comprises
frames in a video sequence. This 2D video sequence can be processed
frame by frame in the manner discussed in the flow chart to derive
the 3D spatial coordinates for each frame. The result is a series
of point clouds, one for each frame of the video sequence, that
together comprise a dynamic 3D mapping over time.
[0301] FIG. 3: Pattern Projection and Epipolar Separation
[0302] FIG. 3 provides an exemplary spatially-periodic 2D pattern
P1 which may be projected onto the object. The 2D pattern includes
a plurality of different feature types 20, which appear within the
pattern at various locations. For clarity, in the present figure,
each unique feature type is arbitrarily assigned an alphabetic
character. As will be explained in further detail below in FIG. 10,
each pattern feature is composed of a codified spatial formation of
black and white points. Features are repeated in a cyclical fashion
in the code. Specifically, identical feature types repeat
themselves every other row in the vertical direction of the pattern
and every sixteen columns in the horizontal direction of the
pattern. The black and white points of the pattern features
correspond to projections of either a high or low illumination
intensity of monochromatic light on an imaged object. Each
character in the code is thus a bidimensional spatial formation of
projected light intensities. In the pattern of the present
embodiment, P1, the formations are continuous. Other patterns
having non-continuous formations may also be implemented.
[0303] After reflection off of the 3D objects, subsection 48 of the
projected periodic pattern P1 generates image I.sub.P1, containing
the reflected pattern. Epipolar lines 52, 54, and 56 are shown.
Thus image I.sub.P1 is a simplified illustration of the reflected
pattern being viewed through an imaging apparatus upon being
reflected off of an imaged object. For simplicity, only the
reflected pattern is shown and not any reflected imaged objects
appearing in I.sub.P1 as well. In this particular embodiment, the
projector and imaging apparatus are positioned vertically from one
another, thus causing the epipolar lines in the captured image to
be in a substantially vertical direction. In another possible
embodiment, the projector and imaging apparatus may be positioned
horizontally from each other, in which case the epipolar lines in
the captured image would be substantially horizontal.
[0304] As seen and mentioned above, any given feature type repeats
itself cyclically many times in the vertical direction of the
pattern, and were the pattern to be projected without rotation,
many features of the same type would be observed in the captured
image on the same vertical epipolar line. However, through slight
rotation of the projected pattern, features of the same type are
reflected onto separate epipolar lines in image I.sub.P1. For
example, features 58 and 60, both of type (A), are now reflected
onto two adjacent but distinguishably separate epipolar lines 52
and 54. As will be explained further below, the separation of all
identical pattern feature types onto separate epipolar lines in the
captured image, called epipolar separation herein, enables
identification of each feature without ambiguity. To understand why
this is so, a brief discussion of system calibration follows to
understand the relationship between features in the original
projected pattern and epipolar lines in the captured image.
[0305] As explained above, given a particular pattern such as that
described in the current embodiment, and assuming the geometric
constraint of a fixed relative position between the projector and
imaging apparatus, any given pattern feature appears on a constant
pre-determined epipolar line in an image containing the reflected
pattern. This pre-determined relationship between each feature in
the projected pattern and an associated epipolar line in an image
of the reflected pattern may be determined in many ways, including
brute force. The process of determining this relationship may be
referred to as the epipolar field calibration process, and is
carried out a single time for a given pattern and given
configuration of the imaging device and projector. The result of
such a mapping is an epipolar-field table as seen below.
TABLE-US-00001 Feature Type and Location in Pattern Epipolar Line
In Captured linage (A.sub.1) at X.sub.1, Y.sub.1 E.sub.1 (A.sub.2)
at X.sub.2, Y.sub.2 E.sub.2 . . . . . . (A.sub.N) at X.sub.3,
Y.sub.3 E.sub.N (B.sub.1) at X.sub.4, Y.sub.4 E.sub.1 (B.sub.2) at
X.sub.5, Y.sub.5 E.sub.2 . . . . . . (B.sub.m) at X.sub.6, Y.sub.6
E.sub.N . . . . . .
[0306] As seen from the table, each occurrence of a feature type in
the pattern is associated with a different epipolar line: (A.sub.1)
with E.sub.1, (A.sub.2) with B.sub.2, and so on for as many (A)
type appearances in the projected pattern that appear in the image.
Of course features of different types, such as (A.sub.1) and
(B.sub.1), can fall on the same epipolar line, such as E.sub.1.
Thus, all epipolar line locations and the specific features in the
original pattern that fall on these epipolar lines are known from
the calibration process.
[0307] As mentioned, any given image containing the reflected
pattern contains feature types of a set of a certain size. In the
pattern of the current embodiment this set contains 32 types. The
decoding engine used to decode the captured image decodes each
feature according to feature type and location coordinate in the
image, including epipolar line. This decoded image feature is then
corresponded with a particular feature in the original projected
pattern. For instance, if a feature type (A) is found on epipolar
line E.sub.2, then from the epipolar field calibration table we
know that this image feature uniquely corresponds to feature
(A.sub.2) in the original pattern, and not any other feature in the
pattern. Again, were the cyclical pattern to be projected without
rotation, many features of the same type would be observed in the
captured image on the same epipolar line. Since reflected features
are identified based on feature type, there would be no way of
corresponding on a one-to-one basis between identical image feature
types and their corresponding features in the original pattern. In
other words, there would be no indication of which identified (A)
feature type in the image corresponds to which (A) type feature in
the projected pattern.
[0308] However, by limiting each feature type to a singular
appearance on a particular epipolar line in the captured image, the
association of that particular image feature and the feature's
appearance in the pattern can be made. Therefore each epipolar line
has at most one feature of a given feature type. So each (A) type
feature in the pattern P1 falls on a different epipolar line in the
image, each (B) type feature falls on an independent epipolar line,
and so on for all pattern features. Similarly, each epipolar line
preferably has multiple features so long as they are of
distinguishable types. For instance, with reference to FIGS. 1A-1D,
it is noted that features 28b and 28d are both found on the same
epipolar line .beta., yet may be compared to respective pattern
appearances since they are of different, distinguishable feature
types.
[0309] The unambiguous matching between a particular image feature
and that feature's appearance in the pattern leads to a correct
triangulation that determines a precise 3D location in space from
which the feature is reflected.
[0310] It is appreciated that the present embodiments utilize a
rectified imaging system, meaning that the plane of the imaging
apparatus and the plane of the projector lie on the same spatial
plane. As such, only a vertical or horizontal shift exists between
the projector and the imaging apparatus. In such a rectified
system, the epipolar lines in the captured image together comprise
a unidirectional epipolar field. Since the epipolar lines in the
image are thus parallel to each other, the rotation of the
repeating pattern places repeating feature types onto separate
epipolar
[0311] So in summation, as a result of the internal structure of
the code and the projection approach, each epipolar line has at
most one feature of a given feature type. Again, although the
pattern is cyclical and therefore feature types appear many times
in the pattern, the above described placement of same feature types
onto separate epipolar lines means there is no confusion between
multiple appearances of that feature type in the image. For each
image in a stream of images, the location of each feature
determines a new set of 3D spatial coordinates that provide a 3D
map of the currently imaged object location.
[0312] With reference to FIGS. 1A-1G, image processing device 36
preferably contains at least the image processing engine used to
identify (1) features and their associated epipolar lines in the
captured 2D image and (2) feature locations upon their epipolar
lines. The image processing device typically also has a database
containing the coordinates for all features in the projected
pattern. The processing device matches each identified feature in
the image to that feature's corresponding feature in the pattern.
Triangulation-based computations then are used to assign three
dimensional coordinates in space to each identified feature to
derive a 3D point cloud of the imaged object. These
triangulations-based computations are preferably stored in a table
in memory to aid in fast processing of 3D spatial coordinates.
Further processing of the point cloud may be optionally desired.
Such processing may include determining a 3D surface or mesh from
the 3D point cloud for a richer, life-like 3D image/video. Texture
data may be added as well. Such processing of the point cloud may
utilize known processing engines for three dimensional point cloud
processing.
[0313] As such, through novel techniques described herein combining
the above principles of epipolar geometry together with structured
light projection and encoding/decoding schemes, the present
embodiments provide for high resolution, high accuracy 3D imaging,
such as that prescribed for HDTV, 4CIF, mega-pixel imaging and
similar applications. In preferred embodiments, for short
distances, accuracy is obtained of less than one millimeter. That
is to say, the triangulation error is +-1 millimeter.
[0314] As stated, this system and method may be preferably
implemented even where the object(s) is (are) in motion, as
preferred embodiments of the invention utilize the projection of a
single coded pattern. At each time interval determined by the
imaging apparatus, only a single projection of the encoded light
pattern is needed to capture surface points of objects, thus
enabling successive images to capture changing surface points of
objects in motion. That is to say, objects in the scene and/or the
camera and projector system may move in a three dimensional space
relative to each other, thus enabling the dynamic depth capture of
three dimensional objects in motion. As the object moves, the three
dimensional coordinates of the object in a Euclidian space
dynamically change. Thus, depth change over a series of captured
images is reflected in the movement or displacement of features
along their respective epipolar lines for each captured image. As a
result, different point clouds are derived for each image
frame.
[0315] In preferred embodiments, the imaging apparatus and the
projector are typically contained in a single device housing. The
projected two dimensional light pattern may be projected by any
light source, including but not limited to a video projector, a
slide projector in the case of a printed pattern, or laser/LED
projectors. The use of single pattern projectors in preferred
embodiments allows for the straight forward use of diffractive
projection, which leads to low power consumption, compactness,
noise filtering due to narrowband type signals, and strong
suitability to invisible NIR radiation applications. The imaging
apparatus may be a proprietary or off the shelf video or still
camera typically with CCD or CMOS sensor. It is further understood
that the epipolar lines in a captured image may be of nonlinear
shape and the linear lines in FIGS. 1A-1G and FIG. 3 are for
illustration purposes only.
[0316] Reference is made to FIG. 4, which is a simplified flow
chart and diagram illustrating how the corresponding 3D spatial
locations of identified features in the captured 2D image are
derived. The flow chart of FIG. 4 begins from the 2D image capture
stage, stage 74, also seen in FIG. 2. Three different exemplary
feature types, (A), (B), and (C), share an epipolar line in the
captured 2D image 84 which includes both the imaged object in 2D
and the reflected pattern. The exemplary feature types in the coded
pattern are projected onto an imaged object and all are
subsequently observed along a single epipolar line E.sub.1 in 2D
image 84, seen as step 76. The next step involves corresponding
between each identified feature along the epipolar line and the
originating feature in the projected 2D coded light pattern (step
78a). This correspondence is used to calculate, through
triangulation methods, the 3D point in space, or depth coordinate,
from which the feature was reflected (step 78b). Notably, steps 78a
and 78b are combined and seen as step 78 in the flow chart of FIG.
2. The sum of all such triangulated features gives rise to a 3D
point cloud (step 80) of the imaged object. In preferred
embodiments, the point cloud 80 is further processed to obtain
texture and surface for the 3D image/video image, step 82.
[0317] Since the feature types shown are distinct, all three may be
readily corresponded to respective features in the projected
pattern. If there were more than a single appearance of say (B) on
the epipolar line, then ambiguity would result when attempting to
associate each appearance of (B) to the corresponding appearance in
the projected pattern. Ambiguity as a result of multiple
appearances of feature types on a single epipolar line would
perforce lead to triangulation errors and false measurements of
spatial locations on the imaged object. The embodiments of the
present invention therefore ensure at most a singular appearance of
any given feature type on an epipolar line, herein referred to as
epipolar separation, to ensure a one to one matching process
between an observed feature and the feature's appearance in the
pattern.
[0318] Again, in the present embodiments, the imaged object may be
still or in motion. If the imaged object is in motion, then the
steps 74-82 are repeated to provide 3D video. Alternatively, the
imaging and projector apparatus may be in motion relative to the
object. In dynamic applications, triangulation equations are
computed at video frame rates to give real time three dimensional
coordinates of the object(s) in motion.
[0319] Furthermore, as will be shown in the figures below, as the
total number of distinguishable epipolar lines in the captured
image increases, the encoded light may be projected in a manner
that allows for more appearances of features in the captured image.
The total number of distinguishable epipolar lines may increase due
to many factors, including but not limited to feature structure or
shape, a low noise level as a result of a high resolution sensor,
optical characteristics of the system, or more accurate coordinate
detection algorithms.
[0320] FIG. 5 shows further exemplary point clouds derived through
the methods of embodiments of the present invention. Illustration
86 shows a frontal view of a point cloud of person waving. Each
point in the cloud has a three dimensional (x, y, z) coordinate in
space. Illustration 88 shows a side view that provides another
depth perspective.
[0321] Reference is made to FIG. 6A. FIG. 6A now shows a schematic
representation 500 of the appearance in an image of a reflected
spatially encoded pattern after being projected in accordance with
the epipolar separation principle discussed above. That is, a light
pattern is constructed and projected in such a manner that any
given feature type appears at most once on any given epipolar line
in the captured image.
[0322] In the present embodiment, the singular appearance of any
given feature type on an epipolar line is achieved through the
orientation of the projected periodic light code at a suitable tilt
angle with respect to the direction of the epipolar field. The
epipolar field is vertical in this example and denoted by epipolar
lines 102A, 102B, and 102C. The schematic diagram 500 of the
reflected light pattern shows a periodical or cyclic tiling of
identical 10.times.2 matrices 90. Each matrix has twenty feature
types A-T. The matrix pattern repeats itself over the length and
width of the image.
[0323] Now, when the pattern projection is rotated at an angle
specified as 94 in the figure, the periodic pattern of feature
types in the image repeats itself on any given epipolar line over
length H.sub.1. Thus, for each distinguishable epipolar line over
the imaging apparatus vertical field of view H.sub.1, any feature
type appears at most once. By tilting the projection angle of the
pattern, the separation of same feature types onto distinguishable
epipolar lines is achieved. For example, features 96, 98, and 100,
which repeat in the y-direction of the pattern every other feature,
are separated onto separate distinguishable epipolar lines 102A,
102B, and 1020, in the captured image. Resultantly a significant
number of features can be identified over the entire image area.
The total number of identified features in a captured image is
referred to as image capacity or coverage.
[0324] Thus, epipolar separation of the tight periodic code allows
for a large number of identifiable features to be captured in the
image leading to high resolution. Furthermore, the periodic nature
of the pattern allows for a coded pattern containing a small number
of repeating features types, thereby enabling each feature to
remain small in size and further contributing to high resolution
images. As an example, on a typical sensor with 640.times.480 pixel
resolution, each feature may have an approximate area of 10 pixels.
This translates into an image capacity of approximately 31 thousand
features for every frame, and thus 31 thousand 3D spatial
coordinates.
[0325] In the above and current embodiments, for simplicity and
clarity of explanation, width of pattern features in the 2D image
is not considered. As such, each feature is associated with an
epipolar line as if the feature has infinitely small size, and each
feature is only distinguishable by one epipolar line. In actual
practice, rather than a single epipolar line being associated with
an individual feature, the spatial formations that comprise the
feature lie on distinguishable, substantially adjacent epipolar
lines.
[0326] Returning to the FIG. 6A, distances denoted as D.sub.1 and
D.sub.2 are equidistant distances between distinguishable epipolar
lines. The known horizontal distance between distinguishable
epipolar lines is a function of the imaging apparatus and image
processing device and aids in the definitive identification and
verification of reflected features in the image. For example, a
reflected feature type (A) 98 is detected in the captured image at
a certain y coordinate along epipolar line 102B. Thus, the nearest
feature type (A), if indeed captured in the obtained image, can
only be found on the nearest distinguishable epipolar lines to the
right and left of epipolar line 102B, namely 102A and 102C.
Although all epipolar lines are precisely vertical in the figure
for illustration purposes, other embodiments are possible wherein
the epipolar lines may be lines or curves with varying x and/or y
coordinates.
[0327] The limiting epipolar line separation factor is the minimum
horizontal distance necessary to distinguish between separate
epipolar lines in the captured image. The ability to differentiate
between features on separate epipolar lines is based on several
factors. Firstly, a feature's shape often determines whether the
feature has a detectable coordinate location after reflection from
an imaged object. A code having features that can be assigned
coordinates with greater exactitude, allows for differentiation
between features on ever closer epipolar lines. The object surface
type also may have an effect, as features from metallic surfaces,
glass, and other transparent or very dark surfaces, for example,
are reflected with less exactitude. Low projector SNR and inferior
sensor optics may limit the distance between distinguishable
epipolar lines.
[0328] Reference is now made to FIG. 6B. As has been explained
until now, as long as identical feature types fall on separate
distinguishable epipolar lines, each feature type may be identified
in the captured image without concern of confusion between them.
However, under certain circumstances now to be discussed, where
reflected features are known to be found only on pre-defined
sections of an epipolar line, then separation of same feature types
is even possible along the same epipolar line. Two individual
features of the same type on the same epipolar line may appear on
only limited sections thereupon in the case where the imaged object
moves within a very limited depth range over a series of images.
That is to say, if the imaged object movement is confined to a
particular distance range from the imaging system, then multiple
occurrences of the same feature type may occur over a single
epipolar line. If the movement of an imaged object in 3D space is
slight, then the corresponding movement of reflected features along
their respective epipolar lines is small. In such instances, it is
possible to separate multiple appearances of a given feature type
on a single epipolar line, where each feature type appears in a
pre-defined restricted range along the epipolar line.
[0329] To illustrate, we now turn to FIG. 6B, showing the same
schematic representation 500 as in FIG. 6A. Epipolar lines .rho., ,
O, .theta., .phi., and .beta. are shown and the image height is now
larger, denoted H.sub.1', to include multiple appearances of the
same feature on a given epipolar line in the image. On epipolar
line , for instance, features 96 and 506 are two individual
features that appear on the same epipolar line and are of identical
type, namely (A). If each of these features is free to appear on
any part of the epipolar line in the image, then confusion may
arise between the two features during feature identification. As
explained, the correspondence between each of features 96 and 506
and the feature's unique appearance in the projected pattern would
be subject to error. But in the current embodiment, the appearance
of feature 96 is guaranteed to appear only on subsection 96A of
epipolar line , while the appearance of feature 506 is guaranteed
to appear only on subsection 506A of the same epipolar line . In
such a case, the distance between the two subsections, D, is large
enough so that no concern of confusion between same feature types
exists. During the feature identification process, an (A) type
feature found on section 506A of epipolar line in a given image is
definitively identified as feature 506, while an (A) type feature
found on section 96A is definitively identified as feature 96.
Regions 96A and 506A are two well-defined, spaced apart regions.
Likewise, the appearance of features 98 and 504 upon epipolar line
O is only within ranges 98A and 504A respectively. Finally,
features 100 and 502 also only appear on respective ranges 100A and
502A along epipolar line .theta.. In all of the previous three
cases, a large enough distance exists between the ranges along a
single epipolar line upon which two identical feature types may
fall. Therefore, identification of two separate features of the
same type is possible.
[0330] One way of having more than a single appearance of the same
feature type on an epipolar line is by increasing the image height
to H.sub.1' from H.sub.1. The increase of the image height results
in multiple appearances of the same feature type along epipolar
lines in the image. The multiple appearance of same feature types
is due to the periodic cycle of the pattern, which repeats itself
along the length of the enlarged H.sub.1, or H.sub.1'.
Alternatively, if the resolution of the image sensor is increased
and the pattern features are decreased in size, then feature types
may repeat themselves along epipolar lines. As long as the depth
measurement is limited to a certain range over which the imaged
object moves, any given feature along the epipolar line only
displaces a short distance for every frame. Thus, even two
identical features may be differentiated on a single epipolar line,
so long as the sections of the epipolar line upon which they shift
are limited and spaced far enough apart. The division of the
epipolar line into predefined sections is a function of the depth
range limitation. Through such depth range limitations, more points
are sampled in the image and resolution is further increased as the
number of depth readings in the captured image is greater.
[0331] We now refer to FIG. 7, which is a simplified illustration
of a captured image of a preferred light pattern projected in
accordance with epipolar separation to ensure that same feature
types fall on distinguishably separate epipolar lines. The figure
illustrates the geometric principles that provide a significantly
high number of identifiable features through projection of
preferred patterns at angles to the epipolar field. The angling of
the projected periodic pattern ensures that same feature types are
captured on separate distinguishable epipolar lines. The features
are preferably comprised of unique combinations of spatially
arranged sample points, and the number of sampled points in the
captured image is referred to as the image capacity or coverage
factor.
[0332] The question that could be asked is: Why not construct a
code having as many feature types as needed to cover the image
frame, denoted by sensor height H, without the need for epipolar
separation? The answer lies in the fact that single pattern imaging
systems strive to have feature types as small as possible, thus
providing higher sampling density per image. To this end, in
spatially encoded methods and systems, it is highly desirable to
encode the projected light with a minimal amount of feature types,
since each feature type is represented by a spatial code of some
finite area. That is to say, the less features in the code, the
smaller pixel area needed to encode a feature, as each feature
appears on the image sensor as light intensity formations over a
certain number of square pixels. By limiting the number of feature
types through a repeating code, the number of pixels needed to
represent a given feature type is minimized. As such, the number of
identifiable features in the 2D image is increased, and thus the
number of corresponding point cloud points, or point cloud density,
is increased as well. A higher number of point cloud points leads
to higher resolution 3D images.
[0333] As such, it would not be effective or desirable to encode a
large amount of feature types in order to obviate the epipolar line
separation techniques employed in the present embodiments. Through
the techniques of epipolar separation described in the embodiments,
a rather small matrix code can provide "inflation" of the image
capacity, as more uniquely identifiable features may be observed
for each captured image frame.
[0334] FIG. 7 illustrates a schematic image of a projected coded
pattern 104 comprised of a series of 10.times.2 matrices of feature
types, wherein the projection is again at an angle to the epipolar
field. Each feature of the pattern has square pixel size C.times.C.
As such, the length of each periodic matrix in the pattern is
XC=10C, while the height is YC=2C. The distance P represents the
horizontal distance in pixels between distinguishable epipolar
lines in the image. As seen, the projected pattern 104 is rotated
at an angle 106 with respect to the epipolar field. It is shown
that every other feature in the Y direction of the pattern, or in
other words every reoccurrence of the same feature, falls on a
distinguishable epipolar line. It is appreciated that the matrix
may take on various sizes, and this size is only an example for
illustration purposes. Using the principles known in geometry of
similar triangles, in can be proven that the triangle with sides
H-U-V and the triangle with sides YC-h-P are similar triangles, and
thus the following general equation holds:
H/U=YC/P where U=XC.fwdarw.H=XYC.sup.2/P
[0335] H is thus the number of pixels appearing on epipolar line
108. As the epipolar separation P is decreased, angle 106 becomes
smaller. From the equation we see that, for a constant pattern area
size, if only the epipolar line separation P decreases, then H
grows larger.
[0336] For example, feature 110, an (A) type feature, occurs only
once, as desired, over the epipolar line 108. Feature 112
represents the next occurrence of feature type (A) after feature
110, and represents the upper limit of the image height. If the
distance P between distinguishable epipolar lines is decreased,
angle 106 is decreased in turn, and the same coded pattern with the
same number of features is thus rotated less from the epipolar
field. Identical features are now separated by closer epipolar
lines that are still distinguishable, and the effect of rotating
the projected pattern to obtain epipolar separation becomes more
apparent. Feature 110 repeats itself over an ever greater distance
on epipolar line 108, thus expanding the value of H and thus the
number of uniquely identifiable features, or resolution, in the
image. The lower limit of angle 106 occurs when epipolar lines are
no longer distinguishable, that is to say, the same feature type
falls on two adjacent epipolar lines that are too close to
accurately be distinguished.
[0337] From the equation, we see that so long as the overall matrix
size does not change any combination of X and Y dimensions of the
matrix has no effect on the image coverage. For instance, if the Y
component of the matrix code is increased and the X component
decreased, say from a 10.times.2 rectangle, as shown, to a
5.times.4 square-like shape, we now have a matrix shape closer to a
square rather than a long skinny rectangle. However, the overall
matrix area (XY) remains the same. The angle 106 is decreased and
the pattern is rotated less from the epipolar field to ensure
epipolar separation on each distinguishable epipolar line. As seen
from the equation H remains the same, and thus the image coverage
is not increased.
[0338] Notably, the matrix structure is such that the length of
periodicity in the X direction is much greater than that in the Y
direction. Such a matrix code structure is referred to herein as
preferred direction coding. It is advisable for X to be much larger
then Y in terms of practical optic assembly, calibration,
operational physical and sensitivity. For example, vibrations or
other shaking of the imaging system can cause a very small rotation
angle 106 to become too small for epipolar differentiation.
Furthermore, a smaller angle of rotation as a result of non
preferred direction encoded patterns requires decreasing the image
field of view by a larger safety factor in order to ensure epipolar
separation.
[0339] Reference is now made to FIG. 8, which is an illustration of
the captured image of FIG. 6A with reflected light pattern 500 now
projected at a smaller rotation angle 114 with respect to the
direction of epipolar field. The distance between distinguishable
epipolar lines is decreased to d.sub.1. Features 116, 118, and 120,
are identical feature types that repeat themselves every other
feature in the y-direction of the pattern. These features now fall
on epipolar lines 122A, 122B, and 122C. According to the equation
described in FIG. 7, since the pattern has not changed, meaning X,
Y, and C are constant but P decreases, H becomes larger. Therefore,
H.sub.2 is a larger vertical image height than H.sub.1 of FIG. 6A,
providing more sample points and thus higher resolution in the
captured image. The imaging system can now sample more points in
the captured image before encountering ambiguity, meaning without
possibly encountering the same feature type along the same epipolar
line. Alternatively, instead of increasing the image field of view,
a pattern with smaller features may be projected using a high
resolution sensor.
[0340] However, when angle 114 is decreased too far, undesirable
system ambiguity occurs. When angle 114 becomes too close to 0
degrees, any slight movement or shaking in the calibrated
projector/imaging apparatus may cause a distortion and obstruct
clear epipolar line distinction. In addition, as the periodic
matrix in the pattern decreases in the X direction and increases in
the Y direction, such as the almost-square shape mentioned above,
the projection angle needed to ensure that H is not decreased,
requires that P decrease to a point which is too small to provide a
safe horizontal distance for epipolar line distinction. As a result
of the above geometric factors and additional factors to account
for possibility of error, the code preferably utilizes a preferred
directional code in the X direction. Although the present
embodiment utilizes a rotated pattern to achieve the result of
epipolar separation, other embodiments including but not limited to
skewing the projected pattern exist.
[0341] Although in the previous embodiments the projected pattern
is periodic, it is understood that in alternate embodiments other
coding schemes are possible, including but not limited to
non-ordered and/or non-periodic coded patterns that may be
projected to achieve epipolar separation. The previous embodiments
have described spatial coding techniques using a uniquely
structured spatially encoded pattern P1. Nonetheless, it is
appreciated that further embodiments may include temporal coding,
spectral coding, a combination thereof, or any other two
dimensional coding scheme or combination thereof, including
combinations involving spatial coding embodiments such as those
described herein, that enable differentiation of a plurality of
feature types along distinguishable epipolar lines.
[0342] Reference is made to FIG. 9, which is schematic diagram of a
non ordered and non periodic pattern. Each feature type of a
bi-dimensional spatial formation in a non-periodic code is
represented in the pattern as a number. It is appreciated that each
feature type, numbers 1 to 6 in the figure, appears not more than
once on any give epipolar line. Exemplary epipolar lines are 124,
126, and 128. An exemplary spatial coding shown is a pattern
similar to dominoes. The spatial code may, of course, take on many
forms, possibly similar or identical to the code seen in FIG. 3 and
FIG. 10 below. Other structured light coding techniques may include
bi-tonal, grey level, and/or multi-spectral techniques.
[0343] Reference is now made to FIG. 10, which are illustrations of
exemplary feature types that comprise preferred encoded pattern P1.
Features A-J are examples of binary spatially coded feature types.
Each feature is comprised of a spatial combination or formation of
five black and white points, made up of a black (local minimum) or
white (local maximum) center point 130 and four black or white
peripheral or "saddle" points 132. All possible combinations of the
five black and white points lead to an alphabet of 32 unique
characters or feature types. The exemplary feature types in FIG. 10
are arbitrarily named A-J, each corresponding to a different
combination of a center point and four saddle points. A feature
type having a white center point corresponds to a local maximum
illumination intensity in the projected pattern, while a feature
type having a black center point corresponds to a local minimum
illumination intensity in the projected pattern. The peripheral
points of each letter correspond to illumination intensities that
are neither maximum nor minimum, where a white peripheral point is
closer to maximum intensity than minimum intensity and a black
peripheral point is closer to minimum than maximum intensity. It is
noted that other embodiments may use features having combinations
of more or less saddle points. For instance, if each feature
contained six saddle points, an alphabet now exists of 128 unique
characters or feature types and thus a larger pattern period. Such
encoded features would be suitable for applications of larger
patterns with more sampling points, such as high-resolution
mega-pixel imaging.
[0344] To illustrate further, reference is made to FIG. 11, which
is an illustration of the exemplary pattern P1 as seen in an image
I'.sub.P1 after being reflected from an imaged cube 134. The
periodic pattern comprises the features types described in FIG. 10.
A close-up of the image showing one part of the cube is shown at
the bottom of the figure. White center point 136 represents a
maximum intensity reflection within the immediate surrounding pixel
area of the image. Black center point 138 represents a minimum
intensity reflection within the immediate surrounding pixel area of
the image. White saddle point 140 shows a reflection intensity that
is closer to a maximum than a minimum. Black saddle point 142, is
the opposite, and shows a reflection intensity that is closer to a
minimum than a maximum.
[0345] When features of the projected pattern are reflected from an
imaged object onto an image sensor, each of the five points of any
given feature becomes a sample point in the captured image. These
sample points are contained on adjacent sensor pixels upon which
the reflected feature has been imaged. It is understood that
whether a particular sample point is a maximum, minimum, or saddle
is dependent on the projected light intensity on that point as well
as the reflectance properties of the material of the imaged object
at that point.
[0346] The maxima, minima, and saddle points are preferably
extracted through identifying their derivatives in the image. This
means that they are determined through local variations in
illumination. For example, a white center point of a projected
feature, representing a local maximum illumination intensity in the
projected pattern, receives a low pixel(s) intensity or grey-value
on the imaging sensor if reflected off of a dark surface. If such a
center point were decoded independent of the center point's
surrounding sample points, the decoding engine may mistakenly
interpret the low intensity value as indicating a local minimum, or
black center point. However, when analyzing the surrounding pixels
of a white sample point reflected from a dark surface, the decoding
engine will see pixel values of even lower intensity, and thus a
correct identification of the sample point as a local maximum
occurs.
[0347] Likewise, a black center point of a projected feature,
representing a local minimum illumination intensity in the
projected pattern, receives a high pixel(s) intensity on the
imaging sensor if reflected off of a bright surface. Again, a
direct intensity evaluation of such a sample point would falsely
identify a feature containing a local maximum, or white center
point. However, by measuring the changes in local intensity around
such a point, the decoding engine will recognize pixel values of
even higher values. As a result, a correct identification of the
sample point as a local minimum occurs. Therefore, when analyzing
an image of the reflection of a single projected pattern critical
point detection is used. Critical point detection means that
changes in local illumination intensity by known derivative
analysis methods are used rather than direct intensity evaluation
to ensure correct feature detection in the decoding process.
[0348] Furthermore, in the periodic repeating code of the previous
embodiments, the relative horizontal distances between saddles and
minima or maxima of features is constant. During the decoding
process, the vector, or set of distances between elements or
components of features is determined and a constant vector is
expected. Such a vector enables verification of accurate feature
readings. False readings that arise from noise and textural
differences on the imaged object can be minimized through the
verification process of distances between points of the feature.
If, for instance, a distance between a saddle and a minimum is
larger than the expected difference, then the decoding system may
decide that an identified feature element is merely noise rather
than a part of a feature.
[0349] Known values of neighboring saddles and maxima are further
utilized for the feature validation process and error correction,
possibly by the cross referencing of adjacent point grey values.
For instance, in the identification and validation process, certain
characters within the known alphabet may be eliminated if their
spatial value arrangement contradicts those of known neighboring
characters. Another method for decoding feature identity is by the
integration of semi validated characters to known groups of
characters. An example case is where one of four saddles is not
clearly identified, but the other three have been validated. It is
understood that many different implementations of character
recognition are possible, including wavelength analysis or any
other methods known in the art.
[0350] It is appreciated that the encoding scheme is not limited to
a binary code, nor to any preferred optical implementations of
binary code, such as bi-tonal, monochromatic, or bi-polar
projection, and as in several examples given in previous
embodiments above may be implemented in other ways, including but
not limited to spectral coding, non periodic coding, and temporal
coding.
[0351] Reference is now made to FIGS. 12A and 12B, which illustrate
the construction process and the encoding scheme inherent in
pattern P1. First referring to FIG. 12A, the process begins with a
Debruijn sequence 144. The sequence 144 is created over a character
space S .SIGMA.={0, 1, 2, 3}. The series has a length of two,
meaning that every sequence of two numbers appears at most once.
Therefore, the length of the DeBruijn sequence is
|S|=|.SIGMA.|.sup.2=4.sup.2=16.
[0352] Matrix 146 is generated from matrix 144 by replacing each
number in matrix 144 by the matrix's binary representation, where
each binary number is written column wise. Matrix 148 is obtained
from matrix 146 by flipping even columns, in other words binary
values in the upper row of even columns of the matrix are moved to
the bottom row and visa versa. The flipping of even columns in
matrix 146 is performed to avoid a situation of code duplicity,
meaning to ensure that no letter appears more than once in any
given cycle of the matrix in the pattern.
[0353] We now turn to FIG. 12B. The matrix 148 of FIG. 12A is
mapped into a two dimensional pattern of rhombi 150, having two
colors of grey, each color representing a 1 or 0 in matrix 148.
This two color grey pattern is repeatedly mapped onto a pattern 152
of black, white, and grey rhombi. The resulting pattern 154 is
formed, having four colors of grey. Every grey rhombus of pattern
150 colors one of the grey rhombi in pattern 152 one of two shades
of grey. The black and white rhombi remain unchanged. In the next
two steps, the binarization of rhombus pattern 154 is carried out,
possibly by Gaussian smoothing followed by obtaining what is
referred to as a sharp intensity threshold, although other
techniques are possible. The resulting pattern P1, as shown
initially in FIG. 3, provides a bi-tonal sharpened pattern for
projection that allows the epipolar separation techniques described
above.
[0354] It is understood that the encoding described herein is a
mere single way of constructing a viable two dimensional spatial
formation-based light code having multiple feature types to be used
in preferred embodiments of the present invention, and other code
types, such as Perfect Maps, M Arrays, and pseudo random codes are
possible.
[0355] Reference is made to FIG. 12C, which are simplified
illustrations of the sample points in the pattern P1 after being
reflected from an imaged object and viewed as part of a captured
image in an imaging apparatus. Pattern 164 is a close up of pattern
154 of the previous figure. The thirty two binary values of matrix
150 are seen as thirty two saddle points 168. Each of these thirty
two saddles points becomes a sample point. Each group of 4 saddle
points has a black dot 170 or white dot 172 in the center. The
black and white dots are black and white rhombi in pattern 154, and
become sample points as well. Altogether, there are 64 sample
points in every cycle of the code in the pattern. These points are
shown in the brightened area 173. Pattern 166 is a close up of
pattern P1. In pattern 166, the sample points are seen after the
pattern is processed with Gaussian smoothing and sharp intensity
threshold.
[0356] Reference is made to FIG. 13A, which illustrates a preferred
projection and imaging method that enable epipolar separation
techniques described above. Once again, the encoded light pattern
P1 of previous embodiments is shown. The encoded light pattern is
seen here, as in FIG. 3, rotated a certain angle in relation to the
epipolar line 174 in the captured image, thus placing same feature
types of the code on separate epipolar lines. The angle of rotation
is described by the following equation.
Sin.sup.-1(P/YC)
[0357] By selecting the angle of rotation, the epipolar separation
techniques described above may be carried out. Calibration methods
known in the art are used to determine the exact location of the
epipolar lines in the captured image. Furthermore, to ensure that
no two identical features of the code repeat themselves on a given
epipolar line, the projected pattern height H.sub.P is limited in
the projector apparatus, as exemplified by the brightened area 175
in the pattern. As a result, each diagonal column for every period
is cut by a different epipolar line. Each epipolar line in the
captured image therefore contains one and only one feature of each
type. If the measured depth range is limited, then as discussed
above in FIG. 6B, more than one of the same feature type can appear
on each epipolar line in the image. The image height H in pixels
can be described by the equation in FIG. 7 above. Since the aspect
ratio of the shaded area is 4/3, the total surface area in terms of
sampled pixels is defined by the following equation:
A=H.sup.2(4/3)=X.sup.2Y.sup.2C.sup.4/P.sup.2(4/3)
[0358] Reference is made to FIG. 13B1. For certain 3D imaging
applications, a larger number of sample points is needed to obtain
the desired resolution. For such a purpose, a variant coding scheme
of the pattern P1 in above embodiments, named P2 herein, may be
utilized. The P2 pattern is shown in FIG. 13B1. As in P1, P2 is a
binary spatially coded periodic pattern containing feature types
comprised of spatial combinations of five black and white elements
or points. Likewise, as in P1, the black (local minimum) or white
(local maximum) center points are surrounded by 4 saddle points.
However, in pattern P2, two geometrical variants of each letter are
contained over each cycle of the code, and thus the population of
letters is doubled from 32 to 64. That is to say, each letter has
one variation Y1 in which the center point is shifted towards the
left and falls closer to the left saddle points, and another
variation X1 in which the center point is shifted towards the right
and falls closer to the right saddle point. The alternating maxima
and minima do not fall one on top of the other in the vertical
direction as in the coded pattern P1, but rather each maximum is
shifted slightly to the right or left of the minima directly above
and below. Similarly, each minimum is shifted slightly to the right
or left of the maxima directly above and below. Such a pattern
creates a "zig-zag" type effect, where alternating rows are shifted
either slightly to the right or to the left. In all, in a given
cycle, there are 32 black minimum, 32 white minimum, and 64 saddles
that in total comprise 128 sample points.
[0359] Notably, the relative horizontal distances between points of
each feature in the image, as discussed above, now allow for a
vector that represents an encoding. That is to say, in addition to
the five points of the feature which allow for a 2.sup.5=32 feature
code, the varying horizontal distances between points of a given
feature allow for additional feature types to be present in each
pattern, thus adding to the number of sample points possible for a
given image. Various combinations of horizontal distances of the
same five feature points are possible.
[0360] We now turn to FIG. 13B2. Pattern P2 is constructed in a
similar way as described above in FIG. 12A, 12B using a Debruijn
sequence. A binary matrix based on a slightly modified Debruijn
sequence is mapped into a two dimensional pattern of rhombi 282
having two colors of grey. This two color grey rhombi pattern is
repeatedly mapped onto a pattern 284 of black, white, and grey
rhombi. The resulting pattern 286 is formed, having four colors of
grey. Every grey rhombus of pattern 282 colors one of the grey
rhombi in pattern 284 one of two shades of grey. The black and
white rhombi remain unchanged. In the next two steps, the
binarization of rhombus pattern 286 is carried out, possibly by
Gaussian smoothing followed by obtaining what is referred to as a
sharp intensity threshold, although other techniques are possible.
The resulting pattern P2, as shown initially in FIG. 13B1, provides
a bi-tonal sharpened pattern for projection that allows the
epipolar separation techniques described above.
[0361] Graphical processing known in the art is carried out on the
rhombus array to obtain the "zig-zag" pattern that allows for two
shifted variants of each letter as described above.
[0362] Again, the rotation of the grid pattern ensures that every
two identical features that appear in an alternating fashion on
each diagonal column do not fall on the same epipolar line.
[0363] The process of decoding is characterized by comparing
relative horizontal distances between the various sample points,
meaning either a minimum, maximum, or saddle point. As described
above, the decoding process attempts to identify locations of
maxima, minima, and saddle points. The horizontal distances between
such points serve to validate the existence of a feature at that
location in the 2D image. In the present embodiment, these
horizontal distances may be utilized for identification of features
in such a zig-zag pattern. For example, by determining relative
horizontal distances between an identified maximum and the
maximum's associated saddle points, the identity of the letter
comprising such an array of points may be determined.
[0364] Reference is made to FIGS. 13C and 13D. In FIG. 13C, two
patterns T1 and T2 are presented. T1 is simply the pattern P1 in
previous embodiments doubled in height. Pattern T1, like P1,
contains 32 features and has a periodic cycle of 16. A captured
image containing projected pattern T1 is shown as I.sub.T1 and
allows for two cycles of the pattern in the vertical direction.
Thus, each feature appears twice on a given epipolar line in image
I.sub.T1. Noticeably, these features are at a distance from each
other that is equal to the height of pattern P1.
[0365] T2, being equal in height to pattern T1, is a periodic
spatially encoded pattern having unique bi-dimensional formations
different than T1. The pattern T2 contains 32 types of features.
Each feature type appears twice consecutively in the cycle (other
than 4 unique feature types that appear once in each cycle). Each
pair of adjacent identical feature types shares common saddles. For
example, in FIG. 13D, features 350a and 350b are of the same type
and appear adjacent to each other. Each feature is comprised of an
extremum and four saddles. Two of the four saddles for each feature
are common to both features. Therefore, each pair of adjacent
features contains two extremas and six saddle points. Each sequence
of two identical features repeats itself every 32 lines in the
pattern.
[0366] As a result of such a repetition within the pattern T2
cycle, when the pattern is projected at a rotation to the angle of
the imaging sensor, feature types reappear within close proximity
on the same epipolar line in the resultant obtained image.
Referring to FIG. 13D, such an example is seen on epipolar line 354
where identical features 350a and 350c appear on the line in close
proximity. As seen in the figure, identical features 350a and 350b
are adjacent to one another in the cycle. Therefore, the rotation
places like feature types in close proximity on the epipolar line.
Likewise, identical features 360a, 360b, and 360c appear in a group
of three as seen in the cycle. These three features thus appear
along epipolar line 362 in close proximity as shown. The location
range of each feature type along the epipolar line is thus
known.
[0367] Returning to FIG. 13C, we see that an advantage of enlarged
pattern T1 is the enlarged number of sample points that may be
imaged and correlated to depth points in space. However, the
appearance of like feature types on the same epipolar line
presumably causes ambiguity in the decoding process as discussed
above. For instance, identical features 366a and 366b are on the
same epipolar line 370 in I.sub.T1. To resolve such ambiguity, a
second pattern T2 may be projected intermittently with T1. Since T2
has a cycle twice the length of T1, then in an image I.sub.T2
having the same height as image I.sub.T1, an identical feature
group 364 or 368 does not repeat itself over the length of the
image. Each feature group only appears once along the length of the
image I.sub.T2.
[0368] Now, when patterns T1 and T2 are aligned in the projector
apparatus and projected temporally, a comparison can be made
between observed features within a range along parallel epipolar
lines in successive images. The matching of features from a first
image of I.sub.T1 to sister features in a second image of I.sub.T2
can allow for feature verification in the first image As an
example, feature 366a is seen in a particular range along a given
epipolar line 370 in a first image of reflected pattern I.sub.T1.
Likewise, feature 366b appears along that same epipolar line in the
first image I.sub.T1, albeit at a distance from 366a. Such a
situation may lead to ambiguity in I.sub.T1.
[0369] A second image I.sub.T2 having reflected pattern T2 captured
in succession to the first image contains identical feature set
364a, 364b, and 364c on the same epipolar line, 370, as feature
366a appeared in image I.sub.T1. Also, in a different region of
that same epipolar line in image I.sub.T2 appears identical feature
set 368a, 368b, and 368c. Since these feature sets are distinct on
the epipolar line, each set can be respectively compared to
features 366a and 366b. That is to say, by correlating feature 366a
from I.sub.T2 with features set 364a, 364b, 364c in I.sub.T2 and
feature 366b from I.sub.T1 with features set 368a, 368b, 368c in
I.sub.T2, ambiguity can be annulled. This is because each of
features 366a and 366b in the first image is matched to a unique
feature set in the second image.
[0370] The process of projection and decoding is as follows. T1 and
T2 are projected alternately, synchronized with the camera. The
images are decoded separately at first. The decoding of I.sub.T1 is
carried out as described in above embodiments, although each
feature carries two image coordinates as it appears twice on the
epipolar line. I.sub.T2 likewise is decoded in the same manner as
above embodiments, and like features are found vertical to each
other along the same epipolar line. Therefore, if no movement in
the imaged objects occurred between frames, each feature within the
image sequence I.sub.T1, I.sub.T2, I.sub.T1 can be verified by
correlating between the two images. If I.sub.T1 and I.sub.T1 are
identical, then the imaged scene is static for the imaging time
frame. I.sub.T2 acts as a resolving pattern, as each feature at a
specific location in image I.sub.T2 is different than the feature
at the corresponding location in image I.sub.T1. Since multiple
feature types of the same type reappear on I.sub.T2 in close
proximity, even if movement occurs between frames, the matching
process can still identify associated features between the two
images. In essence, the above process combines principles from both
temporal and spatial encoding methods.
[0371] In temporal coding techniques, wherein more than one pattern
is projected on the object for each captured image, it is
understood that limitations may have to be placed on the speed of
motion of the object being imaged.
[0372] Reference is made to FIG. 14, which shows image I.sub.P1 and
exemplary characters or feature types of pattern P1. Characters 176
represent features encoded over every period of the pattern, the
nature of such features being described above in FIG. 10. It is
noted that the assignment of letters to feature types in the
current figure is independent of FIG. 10. The length of each period
of the pattern P1 is 16 and the height is 2. That is to say, same
feature types belong to either the same diagonal column, for
instance identical features A and E, or to columns modulo 16, for
instance identical features A and F. Modulo 16 means that same
features are distanced periodically every 16 diagonal columns.
However, features from diagonal columns that are not modulo 16 are
always of different types, such as features A, B, C, and D. On
every diagonal column only two different feature types appear in an
alternating fashion, one feature having a black center value and
one with a white center value, for example A and G. As explained,
features A, E, and F are individual features appearing in the
pattern that are of the same type. For purposes of clarity, these
individual features are only referred to as feature type (A) in
FIG. 3 and other figures where P1 is shown, while in the current
figure, each separate feature is assigned a unique letter, A, E,
and F respectively.
[0373] Again, the rotation of the grid pattern ensures that every
two identical features that appear in an alternating fashion on
each diagonal column do not fall on the same epipolar line. As an
example, features A and E are two individual features in the
pattern that are of the same type. Feature A falls on epipolar line
178. Feature E, identical to feature A, is found four features
below A on the same diagonal line. On this diagonal line, the A/E
feature type occurs every other feature. Feature E falls on
epipolar line 180, horizontally differentiable from 178. We see
that each feature in the image can be expressed in terms of feature
type and the feature's associated epipolar line.
[0374] The above embodiments discuss the projection of a single
pattern onto an imaged object. The projection of a second pattern
can provide further valuable information in determining the third
dimension or depth. In particular, dual projection together with
dual imaging may significantly increase the accuracy and
reliability of feature type identification as well as provide new
ways of reaching extremely high resolution in depth readings.
Moreover, in certain embodiments, texture information may be
extracted from the information obtained from the dual projection
methods discussed herein.
[0375] Dual projection methods may include both temporal methods
and/or spectral methods. Temporal methods are such where two or
more patterns are projected over separate time intervals. Spectral
methods are such where two or more patterns are projected with
differing wavelengths of light. Temporal methods may utilize a
spatial light modulator such as an LCD, DLP, or DMD. Spectral
methods may, for example, be carried out by an optical
multiplexer.
[0376] In the embodiments below, in the case of spectral-separation
implementations, spectral separation is preferably small so as to
allow for direct comparison of reflection intensities. That is to
say, two identical patterns projected onto an imaged object at
different but adjacent wavelengths will reflect with substantially
identical reflection intensities. So for instance, in a first image
a high intensity or white spot is projected onto a particular
bright surface at a certain wavelength and is mostly reflected. In
a second inverse image I'.sub.-P1 a low intensity or black spot is
projected onto the same bright surface at an adjacent wavelength
and is mostly absorbed. Since the black spot in I'.sub.-P1 is
projected at an adjacent wavelength, the black spot behaves in an
opposite manner to the projected white spot in I'.sub.P1 for the
given surface, and is mostly absorbed. Likewise, white spots
projected at the two adjacent wavelengths in the images both
reflect in almost identical manners. Black spots projected at the
two adjacent wavelengths in the images both absorb in almost
identical manners.
[0377] Reference is made to FIG. 15, which is an illustration of a
preferred dual projection and dual imaging embodiment of the
present invention. I'.sub.P1 is an image of a Rubik's Cube having
pattern P1 of the previous embodiments, as explained in detail in
FIGS. 3 and 11, projected upon the cube. Image I'.sub.P1 is a
second image of the cube having the negative of the pattern P.sub.1
projected thereupon. Finally, the image I'.sub.P1-I'.sub.-P1 is the
resultant image obtained from the subtraction of I'.sub.-P1 from
I'.sub.P1. Close-ups 182 and 184 are the respective pattern
reflections from the bottom right four squares of the cube in image
I'.sub.P1 and I.sub.P1. In both images I'.sub.P1 and I'.sub.-P1, an
imaged white point is a result of a projection and thereafter
reflection of a maximum light intensity on that surface location of
the imaged cube. Likewise, an imaged black point is a result of a
projection and thereafter reflection of a minimum light intensity
on that surface location of the imaged cube. So, if in I'.sub.P1, a
reflected white point is seen at a particular surface location on
the cube, then in I'.sub.-P1 a reflected black point is observed at
the same surface location on the cube. For instance, white maximum
point 186 seen in image I'.sub.P1 is seen as black point 188 in
image I'.sub.-P1. Likewise, white saddle point 190 in image
I'.sub.P1 is replaced by black saddle point 192 in image
I'.sub.-P1.
[0378] As the Rubik's Cube contains squares of varying colors, the
reflected pattern shows high contrast when reflected from squares
of bright colors, such as 194, and less contrast when reflected
from squares of darker colors, such as 196. This is a result of the
fact that the energy of projected white spots, or light intensity
maxima, is reflected from bright surfaces to a much larger extent
than the energy's reflection from dark surfaces, where much of the
energy is absorbed. A projected white spot, reflected from a bright
colored surface appears white in the image, signifying an area of
the imaged object with high intensity reflection. In contrast, when
a projected white spot is reflected from a dark colored surface,
the reflected white spot appears much closer to black in the image,
signifying an area of the imaged object with low intensity
reflection. In either case, however, a projected white spot will
always reflect from any given imaged object surface point at a
higher intensity than a projected black spot on that same
point.
[0379] Therefore, no matter what the imaged surface reflectance
characteristics are, an imaged black point will always have a
reflection intensity lower than an imaged white point at the same
location on the imaged object surface. Thus, the subtraction of a
black point intensity value 188 at a particular location in image
I'.sub.-P1 from a white point intensity value 186 at the same
location in image I'.sub.P1 will always yield a positive intensity
value 200. Likewise, the subtraction of a white point 202 in image
I'.sub.-P1 from a black point 204 in image I'.sub.P1, both at the
same image location, will always yield a negative intensity value
206. As such, a value of each pixel in the resultant image is
signed, either positive or negative. For visualization purposes, as
display screens and prints cannot understand negative pixel values,
a normalized scale may be used to represent both positive and
negative resultant intensity values.
[0380] By projecting a pattern and the pattern negative on an
object, and subtracting the image of the reflected negative pattern
from the image of the reflected original pattern, maxima and minima
locations in image I'.sub.P1 may be determined directly by a
measure of the grey-level or intensity of the same image locations
in resultant image I'.sub.P1-I'.sub.-P1. All positive intensity
values in the I'.sub.P1-I'.sub.-P1 image indicate a white spot
while all negative intensity values in the I'.sub.P1-I'.sub.-P1
image indicate a black spot. Again, since negative intensity values
may not be understood by display screens, a normalized scale may be
used for image display. This direct measure of maxima and minima
with such a dual projection is unlike the single pattern projection
case described above in FIG. 11. In the case of single pattern
projection, the additional analysis of local intensity values of
surrounding points to the maxima and/or minima is preferred to
ensure a higher level of correct feature type identifications.
Therefore, the subtraction method of the present embodiment allows
for a more robust pattern recognition analysis and thus a more
reliable and powerful engine for derivation of 3D spatial
coordinates from identified features in the captured 2D image.
[0381] Furthermore, the dynamic range of the intensity values in
the I'.sub.P1-I'.sub.-P1 image is doubled by the subtraction of the
two patterns. This gives rise to a more accentuated pattern and
thus aids in feature type recognition. Another advantage is the
cancellation of the effect of ambient light on the resultant image
I'.sub.P1-I'.sub.-P1. The resultant image shows an accentuated
pattern with less interference of imaged object texture.
[0382] It is understood that the two images may be taken temporally
or at the same time through use of spectral differentiation.
Likewise, the two patterns may be projected with spectral or
polarity separation, at the same time or temporally, or both
patterns may have the same wavelength or polarity but the
projections are separated in time. So long as the spectral
separation is close, the reflection intensities of the two
reflected patterns will be virtually exact negatives at any given
imaged object surface location.
[0383] Reference is made to FIG. 16, which is an illustration of
the addition of the two images, I'.sub.P1 and I'.sub.-P1. The
addition of the two images with opposite pattern illumination
intensities projected thereupon leads to a cancellation of the
pattern in the I'.sub.P1+P'.sub.-P1 image. This is because the
addition of any two reflection intensities for any given identical
location in the two images leads to a resultant image showing the
reflection of maximum light intensity from all points on the
object. The variations in reflectance intensity are thus only a
function of the texture of the imaged object. That is to say, the
changing texture of the imaged object affects the reflection
intensities with which a white point is reflected from the various
points of the imaged object. The resultant image separates the
texture of the depth imaged object from the imaged pattern observed
in images I'.sub.P1 and I'.sub.-P1.
[0384] Reference is now made to FIG. 17, which is a preferred
embodiment of a dual projection and dual imaging method of the
present invention. I'.sub.P1, as in previous figures, is an image
of a Rubin Cube having the pattern P1 projected upon the cube.
Image I'.sub.-C is a second image of the cube having the same
pattern as in I'.sub.P1 projected thereupon only the maxima and
minima are switched. That is to say, a maximum 186 in image
I'.sub.P1 is seen as minimum 210 in image I'.sub.-C. However, a
saddle point, for instance white saddle 190 in image I'.sub.P1
remains white saddle point 212 in image I'.sub.-C. Thus, a maximum
is replaced by a minimum and visa versa, but the saddle points
remain unchanged in both images. Finally, image I'.sub.P1-I'.sub.-C
is the resultant image from a subtraction of image I'.sub.-C from
image I'.sub.P1.
[0385] As seen in close up 214, what remains in the image
I'.sub.P1-I'.sub.-C are the maxima and minima, sometimes referred
to as the carriers of the code. Again, for any given location on
the image sensor, a reflected local intensity maximum point in
image I'.sub.P1 is replaced by a reflected local intensity minimum
point in image I'.sub.-C. The local intensity maximum appears as a
white spot in the image, while the local intensity minimum appears
as a black spot. The saddles are identical in both images. So, for
instance, when the intensity value of location 210 in image
I'.sub.-C is subtracted from the intensity value of the same
location 186 in image I'.sub.P1, white spot intensity value 216 is
obtained in the resultant image I'.sub.P1-I'.sub.-C. Similarly,
when the intensity value of location 218 in image I'.sub.-C is
subtracted from the intensity value of the same location 220 in
image I'.sub.P1 black spot intensity value 222 is obtained in the
resultant image I'.sub.P1-I'.sub.-C. In contrast, when saddle point
212 in image I'.sub.-C is subtracted from the saddle point 190 of
the same intensity and same image location, the identity of the
saddle points as white or black disappears. On the other hand, the
locations of the saddle points in the I'.sub.P1 image are now
sharpened and accentuated by the subtraction. The subtraction of
the two images thus neutralizes the saddle values in the image but
clearly delineates their locations. Moreover, the subtraction of
the two images leads to a cancellation of ambient light,
[0386] The resulting image I'.sub.P1-I'.sub.-C can now easily be
scanned for maxima, minima, and saddle point locations. As
explained above, in FIG. 10, the maxima and minima in the imaged
pattern represent the center points of reflected features in the
pattern. As the resultant image shows only the saddles' location,
but not value, the search for maxima and minima is thus simplified,
and becomes faster and more error-proof. These maxima and minima
then indicate the existence of a feature and the feature's epipolar
line location in the image may be determined. What remains is to
determine feature type by identification of the saddle points. This
may be done by either analysis of I'.sub.P1, as in the single
pattern case, or as described next.
[0387] Reference is made to FIG. 18, which shows the addition of
the two imaged patterns of FIG. 17, I'.sub.P1 and I'.sub.-C. The
addition of the two images leads to a resultant image that shows
the reflection of maximum light intensity for all carrier
locations. In other words, all former carrier positions become
white in the resultant image. Only the saddle points remain. The
addition of the two images leads to the addition of two opposite
carrier reflection intensities at identical locations in the two
images. In contrast, the saddles are identical in both images, so
the addition of the two images leads to an accentuation of the
saddle points.
[0388] For instance, in close-up 224 of the I'.sub.P1 image, a
black feature with center point 226 is shown. The feature's two
upper saddles 228 and 230 are black. The bottom left saddle 232 is
also black while the bottom right saddle 234 is white. In close up
236 of the same place on the Rubik's Cube in the I'.sub.-C image,
the same feature now has white center point 238 instead of black.
The saddles 240-246 remain the same. Therefore, the resultant image
close-up 248 shows the four saddles 250-256. Since all carriers
become white in the resultant image I'.sub.P1+U'.sub.-C, the
saddles may be clearly identified without confusion with local
maxima and minima. FIGS. 17 and 18 show that the separation of the
carriers from the saddle points allows for a more robust decoding
process as the detection of carriers and saddles is simplified.
Image I'.sub.P1-I'.sub.-C provides the carrier and saddle detection
and location information while image I'.sub.P1+I'.sub.-C provides
the saddle identity information.
[0389] Reference is made to FIG. 19, which is an illustration of
the resultant image obtained from the absolute value of the
subtraction of image I'.sub.-P1 from image I'.sub.P1. The resultant
image 270 is an outline of the pattern. Since the two images are
inverses of each other, as discussed above in FIGS. 15 and 16, the
individual patterns can be expressed as two opposite sinusoidal
curves. Graph 260 is a cross section of the sinusoidal curves that
comprise each of the two inverse patterns. Each local maximum and
white saddle location in the I'.sub.P1 image is replaced,
respectively, by a local minimum location and black saddle in the
I'.sub.-P1 image. Conversely, each local minimum and black saddle
location in the I'.sub.P1 image is replaced, respectively, by a
local maximum location and white saddle in the I'.sub.-P1 image.
The absolute subtraction of any two extrema or saddles from
identical locations in the two images leads to a bright intensity
spot in the resultant image |I'.sub.P1-I'.sub.-P1|. Therefore, any
maximum, minimum, or saddle point is bright in the image
|I'.sub.P1-I'.sub.-P1|. This is seen in close up 272 of the
resultant image.
[0390] The dashed curved lines 262 and 264 indicate the border
areas between black and white spots of the respective images
I'.sub.P1 and I'.sub.-P1. On the sinusoidal graph, these dashed
curved lines are represented by the meeting point, or the 0 point,
between the respective graphs of images I'.sub.P1 and I'.sub.-P1.
Where these points meet, the result of the absolute subtraction
|I'.sub.P1-I'.sub.-P1| is 0. These meeting points occur in a two
dimensional plane in the resultant image, creating the black
outline 266 of the pattern. The black outline represents a minimum
in the image 270 and provides a sub-pixel resolution of a
continuity of highly identifiable curves in the resultant image.
Points along this pattern outline can be associated with spatial
coordinates through various triangulation techniques.
[0391] Reference is made to FIG. 20, which is an illustration of
the resultant image obtained from the absolute value of the
subtraction of image I'.sub.-C from image I'.sub.P1 As discussed in
FIGS. 17 and 18 above, the I'.sub.-C image of the cube has the same
pattern as in I'.sub.P1 projected thereupon only the maxima and
minima are switched. That is to say, a maximum 186 in image
I'.sub.P1 is seen as minimum 210 in image I'.sub.-C. However, white
saddle point 190 in image I'.sub.P1 remains white saddle point 212
in image I'.sub.-C. Thus, a maximum is replaced by a minimum and
visa versa, but the saddle points remain unchanged in both
images.
[0392] The absolute value of the subtraction of image I'.sub.-C
from image I'.sub.P1 yields image 274. The absolute subtraction of
either a black or white carrier in image I'.sub.-C from either a
black or white carrier in image I'.sub.P1 leads to all white
carriers, seen in close up image 276. The black crossings represent
the borderline areas between black and white carriers in the
I'.sub.P1 and I'.sub.-C images. The borderline areas are the areas
where the white carrier begins to turn black and visa versa. At the
precise borderline, the pixels are neither black nor white, as the
value is precisely between a maximum and a minimum. The absolute
subtraction of these pixels leads to 0 value, or a black or minimum
point in image 274. The resulting image has less ambient light and
allows for clearer decoding and accurate localization of
extrema.
[0393] Reference is made to FIG. 21, which is an illustration of a
particular embodiment of dual pattern projection and dual pattern
imaging. The first image is I'.sub.P1, and the second is an image
of the cube with ambient light illuminating the image. The
subtraction of the image of the cube with ambient light from the
image of the cube I'.sub.P1 having the pattern P1 projected
thereupon, provides a resultant image substantially free of ambient
light. As a result, the pattern in the resultant image is
accentuated and thus more readily decoded. This is seen in close up
of resultant image 278 where the black and white points of the
image are clearer than seen in the original I'.sub.P1 image.
[0394] Reference is made to FIG. 22, which is an illustration of a
particular embodiment of dual pattern projection and dual pattern
imaging. The first image is I'.sub.P1, and the second is an image
of the cube with uniform or relatively uniform light illuminating
the image. The division of the image of the cube I'.sub.P1 by the
image of the cube with uniform light projected thereupon, provides
a resultant image substantially free of object texture. As a
result, the pattern in the resultant image is accentuated and thus
more readily decoded. This is seen in close up of resultant image
280 where the black and white points of the image are clearer than
seen in the original I'P1 image.
[0395] In all of the dual projection and imaging embodiments of
FIGS. 15-22, the alignment, or relative spatial position, between
the imaged object in both images is limited somewhat. The degree to
which movement between imaged frames is limited is a typically a
function of the imaging strategy and the speed of the imaging
device. Both temporal and spectral imaging devices may be used with
the previous embodiments to capture dual images.
[0396] For instance, the simplest implementation of temporal
imaging of the dual patterns in the previous embodiments is through
two "snapshots" of the imaging device over two time intervals. One
advanced method of temporal dual imaging, supported by modern
camera boards, is non-uniform-exposure timing. Modern camera
sensors enable control over the exposure time of the camera sensor.
So for example, it is possible to provide pairs of interval
exposure times placed close to each other, rather than uniformly
spaced in time. For instance, image I1 may be taken at t=0
milliseconds, image I2 at milliseconds, image I3 at 100
milliseconds, image I4 at 105 milliseconds, and so on. As a result,
a series of images is obtained at an average rate of 20 images per
second. However, pairs of images are provided that are taken at
consecutive time intervals rather than spaced apart in time As a
result, if the movement occurring between the first and second time
interval is slight enough, or alternatively if the imaging device
is fast enough, comparison between the dual images may be carried
out as in previous embodiments.
[0397] The above strategy of two consecutive images at adjacent
time intervals is dependent on fast projection speeds. The imaging
apparatus sensor should ideally not be exposed to any time beyond
the projection time, as this would lead to ambient light
disruptions in the imaged pattern. Other sensor designs to increase
imaging speed for the embodiments of the present invention include
certain CMOS per-pixel-based computation sensors. These specially
designed sensors carry out computations immediately at each pixel
location upon the pixel receiving light intensity. This is in
contrast to computing pixel values subsequent to the emptying of
the sensor to computation modules.
[0398] Furthermore, the above dual imaging strategy may be
implemented to obtain texture images in parallel to the captured
depth information. In preferred embodiments, the pattern is
projected every other image frame. In the remaining image frames,
uniform illumination is projected onto the object and the object is
imaged. Assuming minimal relative motion between the two frames, it
is possible to apply the texture information obtained while
projecting uniform light onto the object to depth information
captured in the images containing the projected pattern.
[0399] It is understood that spectral methods of capturing two
patterns are also possible. Such implementations may include the
use of multiple CCDs.
[0400] Reflectance characteristics of certain high texture surfaces
may lead to feature identification error in single imaging device
systems. Therefore, in order to ensure that correct 3D spatial
coordinates are derived, an additional imaging device may be added
to the system. The additional imaging apparatus allows for what is
referred to as stereo depth decoding in addition to the single
image decoding methods described previously herein.
[0401] For example, a dual imaging system using methods and
apparatuses of certain embodiments may be conveniently mounted on a
moving vehicle in an urban environment to capture the geometric
constructs in three dimensions of buildings, landmarks, and other
objects both stationary and moving in an urban scene.
[0402] To further understand stereo depth decoding using the
methods and apparatuses of preferred embodiments, we refer to FIG.
23. In stereo-based depth imaging and decoding systems, two imaging
devices placed near to each other capture a given scene, preferably
simultaneously. The two images, denoted as image A and image B, are
nearly the same. However, since the two images are taken from
slightly different angles, imaged objects appear at slightly
different pixel locations in the two images. Given a fixed distance
between the imaging devices for every frame, any object that ever
appears on a certain pixel location in image A must appear on a
specific epipolar line in image B. Likewise, any object that ever
appears on a certain pixel location in image B appears on a
specific epipolar line in image A. The distance of the imaged
object appearing at a certain pixel location in one image affects
where along the corresponding epipolar line the object appears in
the second image.
[0403] Now, when a projector is added in between the two imaging
devices, or cameras, a structured light pattern such as discussed
above may be projected onto the imaged urban objects. In each of
the two images, reflected pattern features appear together with the
texture of imaged objects, such as seen in FIG. 11. A given pixel
in either image may contain part of a reflected feature and thus
any randomly chosen square pixel area of the image may contain
features or parts of features in addition to the imaged object
texture. The features, or parts of features, together with the
imaged object texture comprise the totality of information on any
given pixel. The information appearing at a square pixel location
in image A, will always appear on a unique set of coordinates (a
specific set of epipolar lines) in image B. Likewise any
information appearing at a square pixel location in image B, will
always appear on a unique set of epipolar lines in image A.
Further, if epipolar separation techniques associated with the
pattern structure and projection are utilized, feature types appear
only once on each epipolar line. Therefore, each small area along
epipolar lines in each image is unique and the features provide
non-repeating "artificial texture" for each image. The set of
points comprising epipolar fields between images is determined
generally through stereo calibration or stereo rig calibration
techniques known in the art.
[0404] The epipolar line in image B, EP1B, comprises the totality
of points upon which the information contained on pixel area PX1 in
image A may be found in image B. Likewise, the epipolar line in
image A, EP1A, comprises the totality of points upon which the
information contained on pixel area PX2 in image B may be found in
image A. The relationship between pixel locations in image A to
pixel locations in image B is referred to as an epipolar field.
[0405] Referring to FIG. 24, two additional epipolar fields exist.
These fields are: 1) the epipolar field between the projector and
image A and 2) the epipolar field between the projector and image
B. These fields behave as described in previous embodiments above
and are not expounded on further here. Noticeably, in light of the
above discussion, feature F1 may appear in Image A at pixel area
PX1 in addition to the texture of an imaged object. In such a case,
both feature F1 and the imaged object appear at the same point on
EP1'' in image B.
[0406] Stereo correspondence can then be carried out as follows. A
particular group or "window" of pixels is chosen at a certain
location in either image, say image A. A similar group of pixels to
that chosen in image A will only be found along a certain set of
epipolar lines in image B. Image B is scanned along these epipolar
lines for a matching group of pixels. When a matching pixel set is
found, the location in image B is noted. We now know matching pixel
windows and their corresponding locations in each of image A and B.
The distance between the two imaging apparatuses is known, along
with the angles of projection and image capture for each
camera/image apparatus. Triangulation methods can now be used to
determine the 3D spatial coordinates of the objects imaged on those
pixel locations in each image. This process is continued for each
pixel group location to obtain a 3D point cloud of the imaged
object(s) for each frame.
[0407] In stereo depth decoding of two images containing a
reflected pattern, the features are not decoded as in the single
image case. Rather, the pattern features act as a very dense and
high contrast "artificial texture". "Windows" of pixels are
compared between the two images to find matching pixel areas.
Furthermore, since feature types do not repeat themselves along the
epipolar lines when using projection techniques of the present
embodiments, each small "window" of pixels is unique along
respective epipolar lines. Again, these windows may not include
full pattern features, but only parts of features. In either case,
the features act as unique "artificial texture" along the epipolar
line on top of the texture of the imaged objects.
[0408] Surfaces that are likely to lead to feature identification
error in single imaging systems include for example high textured
surfaces, transparent and semitransparent surfaces, and dark
surfaces. This is because features of the encoded light projected
onto high-textured surfaces are likely to become deformed and thus
not decodable as a result of reflectance from the high-textured
surface. Although reflected features may not be decipherable, they
still add unique texture along epipolar lines to the already
textured image. The comparison between two captured 2D images is
made easier as unique "windows" can be matched to each other. The
addition of the reflected pattern aids in the stereo matching
process. In non-textured or low textured surfaces, such as walls or
grass, features may be extracted from one image, as described in
previous embodiments.
[0409] The additional imaging device thus, firstly, provides
"another opportunity" to derive the depth coordinates of a given
scene of a given image frame. In total, 3D coordinates of a given
scene using the configuration of the present embodiment are
computed three times. Once in each image separately through
reflected feature decoding and a third time from the comparison of
correspondences between the first and second image.
[0410] The system geometric constraints that are to be imposed to
enable the stereo depth decoding in addition to single image
decoding are as follows. A suitable displacement, sometimes termed
baseline, is set between a first imaging device and the projector
and between a second imaging device and the projector. The two
imaging devices are ideally on opposite sides of the projector,
each at 180 degrees from the projector. Thus, the two baselines are
parallel to each other and the three optical centers (projector and
two imaging devices) are on a single line or almost single line.
The order of projectors and imaging devices may vary. However, the
epipolar separation techniques impose the constraint that all
projectors and imaging devices be situated in a substantially
straight line in relation to each other. The projected pattern is
projected at a suitable tilt angle, as described in above
embodiments, to ensure epipolar separation both along epipolar
fields between each imaging device and the projector as well as the
epipolar fields between the two imaging devices.
[0411] Practically, it may further be desirable to position the
projector closer to one of the imaging devices, say the first
imaging device. The smaller distance between the first imaging
device and the projector is particularly suited for decoding of
features reflected from close objects. The medium displacement
between the second imaging device and the projector is particularly
suited for decoding of features reflected from objects still
farther away. Finally, the large displacement between the first and
second imaging device is suitable for stereo matching of imaged
distant objects. Since the projector has limited energy, the
projected pattern may often not reach distant objects. In such a
case, the task of extracting depth is carried out through stereo
matching without reflected features, what is referred to as passive
stereo matching. Therefore, the largest baseline is reserved for
the task of stereo matching.
[0412] Reference is now made to FIG. 25, which is a simplified flow
chart showing the steps in the process of generating a 3-D map of
an urban region. In the example of FIG. 25, the urban region is not
limited to a single "scheme" or landscape. Instead, multiple 3D
images are captured from different locations and different angles,
for example, by mounting a projector and/or image apparatus on a
moving vehicle and capturing images when the moving vehicle is at
different locations.
[0413] The flow chart shows the example of a projector and two
imaging apparatuses but other configurations with more projectors
and imaging apparatuses are possible. The reason for use of at
least two image apparatuses in the current embodiment is, as
explained above, to enable stereoscopic depth measurement in
addition to the active triangulation methods described until now
using a single imaging apparatus.
[0414] The process is as follows. First, in step s1, a two
dimensional coded light pattern is generated as in previous
embodiments. This two dimensional pattern should be structured so
that the epipolar separation techniques discussed above are capable
of being implemented. The pattern P1 in previous embodiments is
especially suited for urban modeling as the pattern is scalable and
can be made rather large for urban objects. Second, in step s2, the
pattern is projected onto objects in urban scenes from a
projector(s) mounted on a moving vehicle. These urban objects may
include buildings, cars, greenery, street signs, people, or any
other object. Projection techniques are as discussed above to
ensure epipolar separation.
[0415] In step s3, images are captured from each of the two imaging
apparatuses which are located at preferably slightly different
angles to the scene. The images are preferably captured
simultaneously and show a combination of the original texture and
the pattern. In step s4, each of the two captured images is
analyzed independently, as in previous embodiments, according to
features and their locations along respective epipolar lines. In
addition in step s4, a comparison using stereo correspondence as
discussed above is carried out between the two images. In this
comparison, similar pixel areas in both images are identified. As
discussed, similar pixel areas indicate where the same imaged
location appears in each of the images. A given pixel area in one
image is searched for along certain epipolar lines in the other
image and a correspondence is found.
[0416] Now, correspondences between features in each image and the
original pattern are found out in step s5, as described in
embodiments above. In addition, in step 5, relative locations
between corresponding pixel "windows" in the two images is
computed. From correspondences, 3D spatial coordinates are found
using triangulation, seen step s6.
[0417] As discussed above, the comparison between images allows for
the derivation of 3D coordinates from richly textured surfaces,
dark surfaces, locally unsmooth surfaces, transparent surfaces,
shiny surfaces, and the like, that under single-pattern conditions
would likely be identified erroneously or not at all. Taken
together, the three 3D coordinate mappings of the imaged objects
allow for the construction of a 3D point cloud of the imaged scene,
seen as step s7. This point cloud may be further processed to
obtain a mesh or 3D surface, seen as step s8. The 3D surfaces may
then be further processed together with texture data from
additional cameras or CCDs. Aerial data of the urban scene may also
be added. Such texture and aerial data may complement the 3D depth
map obtained through steps s1-s8.
[0418] Steps s4 through s9 are typically carried out by an image
processing device or devices of various kinds known in the art.
Texture data may be obtained in a number of ways. One way is by
adding an additional CCD. Another is by adding a tunable filter to
a single "switchable" CCD, such that at certain time intervals the
CCD captures the pattern, and thus depth information, while at
other intervals it captures texture information. Still another
method is simply to add a texture-dedicated camera that works in
parallel to the depth capturing system.
[0419] As stated, the single pattern method employed in preferred
embodiments allows for relative motion between the
imaging/projector system and imaged urban objects. Therefore, the
method and apparatus are particularly suited for an implementation
where the imaging apparatus is mounted on a ground-vehicle moving
at various speeds throughout an urban environment.
[0420] To obtain dynamic scene modeling in three dimensions, steps
s2 through s8 are repeated and a sequence of three dimensional
images is obtained. Processing of data from moving objects may occur
in real time through appropriate processing software and hardware
to derive surfaces and shapes from the 3D point clouds of the
objects and their movement in 3D space over time.
[0421] As the present embodiment in FIG. 25 may be implemented
through projection of a single-pattern structured light code, the
speed of image capture is typically limited only by the speed of
the imaging apparatus, and thus rapid 3D modeling of a city
landscape is possible. The above embodiment of urban scene modeling
is not limiting, and other embodiments utilizing the methods and
apparatuses for stereo decoding are possible.
[0422] Stereo matching, whether passive or active, provides a much
denser sampling of the imaged scene. This is because the content of
any given pixel of, say, a first image provides a disparity
measurement relative to the appearance of the same content in a
second image, and this disparity measurement yields a depth value
for that pixel. So if the CCD sensor contains X pixels, X depth
coordinates are derived.
[0423] In contrast, in the decoding of single images as explained
above, each decoded feature occupies a certain spatial area on the
sensor, typically of the order of 10 square pixels, so a decoded 2D
image from which 3D spatial coordinates are triangulated has X/10
depth coordinates. A stereo matching implementation thus has 10
times more depth coordinates, and therefore denser sampling.
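The sampling-density comparison can be made concrete with the
figures quoted above; the sensor size chosen below is an arbitrary
illustration.

    # Toy depth-sample count comparison using the figures above;
    # the 1920x1080 sensor size is an arbitrary illustration.
    sensor_pixels = 1920 * 1080        # X pixels on the CCD
    stereo_samples = sensor_pixels     # one depth value per pixel
    feature_area_px = 10               # ~10 px^2 per decoded feature
    single_pattern_samples = sensor_pixels // feature_area_px  # X/10
    print(stereo_samples // single_pattern_samples)  # -> 10x denser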
[0424] Moreover, in stereo matching, sub-pixel resolution is
possible by performing the matching process between up-sampled
versions of the two compared images. For example, each 4×4 block of
pixels is up-sampled by a factor of 4 to a 16×16 pixel block in
each of the two images. A certain up-sampled block is chosen in a
first image, and a similar block of pixels is then searched for in
the second image. The correspondence between pixels is now in terms
of up-sampled pixels. So if a disparity of, say, 21 pixels is found
between the two images, this is equal to 21/4 = 5.25 original
pixels. The remaining quarter pixel is the sub-pixel resolution in
terms of the original image resolution.
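A minimal sketch of this arithmetic, using the 4x up-sampling
example from the paragraph above:

    # Sub-pixel disparity from up-sampled block matching, following
    # the example above; the disparity value is the one quoted in
    # the text.
    upsample_factor = 4
    disparity_upsampled_px = 21   # match found in up-sampled images
    disparity_original_px = disparity_upsampled_px / upsample_factor
    print(disparity_original_px)  # 5.25 original pixels
    print(1 / upsample_factor)    # 0.25 px sub-pixel resolution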
[0425] The method and apparatus of motion based three dimensional
image capture and depth measurement of the present embodiments have
particular application to the medical imaging field. Surface images
that determine the shape and dimensions of an anatomical surface
part in three dimensions may be computed in real time for modeling
of both stationary and moving surface anatomy. The 3D point clouds,
determined through the embodiments above, may be further processed
to obtain surfaces and shapes of the objects and their movement in
3D space over time. These surfaces and shapes are referred to as
triangulation meshes.
[0426] Such 3D modeling data may be particularly useful for various
medical applications including but not limited to external three
dimensional imaging for tumor analysis, skin growth analysis, bone
structure analysis, dental surgery applications, maxillofacial and
other osteotomic surgeries, skin cancer and melanoma diagnosis,
prosthetic applications, rehabilitation purposes, sports medicine
applications, and plastic surgery and other aesthetic medical
procedures.
[0427] In the fields of reconstructive, plastic, and dental
surgery, to name a few, biomechanical modeling may utilize dynamic
3D modeling of anthropometric data in surgery planning and post
surgery review. The expected aesthetic result of surgery is
important to a patient, and the imaging system of the current
embodiments allows for soft tissue prediction in three dimensions.
Post-surgery review of anatomical changes, such as breast implants
and reductions, dermal fillers, and face and neck lifts, can be carried
out through use of the 3D imaging system of the current
embodiments.
[0428] For rehabilitation applications, skeleton modeling of the
anatomical part, such as discussed in U.S. Pat. No. 6,133,921, may
be carried out based on analysis of the 3D point cloud. Motion
analysis of the skeleton model may then be performed. In the area
of dental applications, 3D maxillary modeling and other facial bone
analysis is possible. Other applications include measurement of
tumor shape and size, reconstructive surgery planning and
diagnosis, dental and plastic surgery, prosthetics, rehabilitation,
and skin cancer analysis.
[0429] Reference is now made to FIG. 26, which is a simplified flow
chart showing the steps in the process of three dimensional image capture
of surface anatomy according to the present embodiments. In step
300, a two dimensional coded light pattern is generated. Next, in
step 302, the generated pattern is projected onto an external
anatomical part. As skin surfaces are naturally low texture
surfaces, the structured light pattern of the current embodiments
is naturally suited for skin surface imaging. In step 304, a 2D
image of the anatomical part and the reflected pattern is captured
by an imaging device. In step 306, the captured image is sent to a
processor, for extracting the reflected feature types and their
locations along respective epipolar lines in the captured image.
The locations of the features along their epipolar lines are then
associated with 3D coordinates on the imaged anatomical part from
which the features were reflected, step 308. This process of
correspondence between feature locations along epipolar lines and
3D spatial coordinates determines the anatomical shape. The process
is carried out through triangulation techniques, as discussed
above. For each identified feature in the 2D image, a corresponding
3D coordinate is thus derived indicating the point in space at
which that feature was reflected off of the anatomical part. Through a
compilation of all such 3D coordinates, a 3D point cloud is derived
that gives a three dimensional map of the imaged anatomical
part(s), step 310. If the anatomical part is moving, then steps 304
through 310 are repeated and a sequence of three dimensional
coordinates in space is obtained. This sequence comprises a point
cloud data set over time from which a skeleton in motion may be
generated, step 312.
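As an illustrative sketch of the loop over steps 304 through 310,
the function below accumulates one point cloud per captured frame;
extract_features and triangulate are hypothetical stand-ins for the
epipolar decoding and triangulation stages described above.

    # Hypothetical sketch of the per-frame loop over steps 304-310;
    # extract_features and triangulate stand in for the decoding
    # and triangulation stages described in the text.
    from typing import Callable, Iterable, List, Tuple

    Point3D = Tuple[float, float, float]

    def build_cloud_sequence(frames: Iterable,
                             extract_features: Callable,
                             triangulate: Callable) -> List[List[Point3D]]:
        """One point cloud per frame; the sequence over time is the
        data set from which a skeleton in motion may be generated
        (step 312)."""
        clouds = []
        for frame in frames:                    # step 304: 2D image
            features = extract_features(frame)  # step 306: decoding
            clouds.append([triangulate(f)       # steps 308-310
                           for f in features])
        return clouds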
[0430] In preferred embodiments, this stationary or dynamic
skeleton model is further processed and preferably output to a 2D
or 3D screen for viewing. As mentioned, the three dimensional
coordinates may be processed in real time for either stationary or
moving anatomy. Through post processing, for example, the point
cloud and/or skeleton model is transformed into a 3D surface
representing the dimensions of the imaged anatomical part. The 3D
imaged surface, whether in motion or stationary, can be used in the
applications shown at the bottom of the figure and discussed
above.
[0431] In an alternative embodiment, particularly in prosthetic or
rehabilitation applications, the anatomical part may be covered
with clothing or other material. Furthermore, the above embodiments
may be implemented together with additional imaging devices, in
particular one or more texture based imaging devices to obtain
textural information together with the depth information. Examples
of such implementations include RGBZ, splats, color voxels, and
textured mesh. Typically, these implementations involve the
addition of a texture-dedicated camera working in parallel to the
depth capturing system. This texture based information may allow
for additional diagnosis of ailments associated with skin color as
well as increased visibility by avoiding occlusions. Moreover, such
an additional texture-based camera allows for the reconstruction of
high-contrast, high-textured, non-smooth surfaces such as hair.
[0432] Furthermore, other embodiments may preferably utilize a
second imaging device as discussed above. Stereo active
triangulation techniques offer several advantages. Fewer occlusions
occur in the obtained 3D image, as the object or anatomical element
is now imaged from more than one viewpoint. For the high-contrast,
highly textured surfaces mentioned above, such as body parts
covered by hair, freckles, or melanoma, stereo imaging is preferred
as discussed
above. Finally, higher resolution is obtained with stereo matching
as discussed above as well. As the human body is generally a low
textured surface, the single image decoding and 3D spatial
coordinate derivation is utilized for most applications requiring
geometric shape of stationary or moving body parts.
[0433] Many medical applications require 3D motion capture for
anatomy motion analysis. Furthermore, the human body is not static
even at rest. As a result, dense sampling is necessary for each
frame.
[0434] Still further applications of motion based three dimensional
image capture and depth measurement of the present embodiments are
in the automotive arena. One important application is a 3D backup
camera for motor vehicles. The ability to measure depth of
potential obstacles in the path of a vehicle in reverse and warn
the driver in time is a critical safety need. For any given
identified object in the path of a reversing vehicle, it is
desirable to determine the classification of the object, such as
its size, precise location, object-vehicle relative velocity, and,
importantly, whether the object is animate or inanimate.
[0435] A typical flow chart for the automotive application is shown
in FIG. 27. As mentioned above, at least one projector and at least
one sensor apparatus are installed on the rear side of the vehicle.
The projector projects the 2D coded light pattern, step 324,
possibly using IR light from a laser LED projector. The pattern is
reflected from both stationary and moving objects in a defined
lateral and depth range behind the vehicle. The sensor then
captures the reflected pattern and texture information, seen as
step 326. This data is then processed, step 328, by an onboard
processing unit to identify features and their locations in the 2D
image along respective epipolar lines. These feature locations are
then corresponded to distances of identified objects, step 332.
These distances are then used to warn the driver, either audibly or
visually, for instance on a display screen on the dashboard. This
process is repeated continuously over time to provide continuous
real time information to the driver.
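A hypothetical sketch of the warning decision applied to each
frame's measured distances; the threshold and all names here are
illustrative assumptions, not system parameters.

    # Hypothetical per-frame warning check; the 2.0 m threshold and
    # all names are illustrative assumptions, not system parameters.
    from typing import List

    def obstacles_to_warn(distances_m: List[float],
                          warn_threshold_m: float = 2.0) -> List[float]:
        """Return distances of objects close enough to warrant an
        audible or visual warning to the driver."""
        return [d for d in distances_m if d < warn_threshold_m]

    print(obstacles_to_warn([5.2, 1.4, 3.0]))  # -> [1.4]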
[0436] Preferred embodiments utilize narrowband monochromatic
imaging, which is simpler and cheaper than capturing color images.
Monochromatic imaging allows the projection of low-cost, invisible
IR light.
[0437] It is expected that during the life of this patent many
relevant devices and systems will be developed and the scope of the
terms herein is intended to include all such new technologies a
priori.
[0438] The terms "light pattern", "light code", and "light code
pattern", are used herein to refer to any light encoding technique,
including, but not limited to, structured light, coded light, and
other equivalents known in the art.
[0439] The term "epipolar co-linear" simply means that two
independent features appear on the same epipolar line in a given 2D
image.
[0440] The term "reflection point distance:image point location
relation" refers to a one to one relationship between each point in
the captured 2D image and a point in 3D space. This relationship
can be provided in, for instance, non-volatile memory such as flash
memory.
[0441] The term "feature-type-inhomogeneous plurality" refers to a
set of features wherein each feature is unique.
[0442] The object(s) imaged by the method and apparatus described
herein can be alternatively referred to as the "target scene" or
"calibration scene" in the case of calibration.
[0443] In addition to other image capture apparatuses, a 2D array
of photodetectors may be used as well.
[0444] The term "insensitive to epipolar line ordering" is
equivalent to saying "insensitive to scene distance".
[0445] It is appreciated that certain features of the invention,
which are, for clarity, described in the context of separate
embodiments, may also be provided in combination in a single
embodiment. Conversely, various features of the invention, which
are, for brevity, described in the context of a single embodiment,
may also be provided separately or in any suitable
subcombination.
[0446] Although the invention has been described in conjunction
with specific embodiments thereof, it is evident that many
alternatives, modifications and variations will be apparent to
those skilled in the art. Accordingly, it is intended to embrace
all such alternatives, modifications and variations that fall
within the spirit and broad scope of the appended claims. All
publications, patents, and patent applications mentioned in this
specification are herein incorporated in their entirety by
reference into the specification, to the same extent as if each
individual publication, patent or patent application was
specifically and individually indicated to be incorporated herein
by reference. In addition, citation or identification of any
reference in this application shall not be construed as an
admission that such reference is available as prior art to the
present invention.
* * * * *