U.S. patent application number 13/246821 was filed with the patent office on 2012-03-29 for image capture using three-dimensional reconstruction.
This patent application is currently assigned to Apple Inc. Invention is credited to Brett Bilbrey, Michael F. Culbert, Rich DeVaul, David S. Gere, Mushtaq Sarwar, David I. Simon.
Application Number: 20120075432 13/246821
Document ID: /
Family ID: 44947179
Filed Date: 2012-03-29
United States Patent Application 20120075432
Kind Code: A1
Bilbrey; Brett; et al.
March 29, 2012
IMAGE CAPTURE USING THREE-DIMENSIONAL RECONSTRUCTION
Abstract
Embodiments may take the form of three-dimensional image sensing
devices configured to capture an image including one or more
objects. In one embodiment, the three-dimensional image sensing
device includes a first imaging device configured to capture a first
image and extract depth information for the one or more objects.
Additionally, the image sensing device includes a second imaging
device configured to capture a second image and determine an
orientation of a surface of the one or more objects.
Inventors: Bilbrey; Brett; (Sunnyvale, CA); Culbert; Michael F.; (Monte Sereno, CA); Simon; David I.; (San Francisco, CA); DeVaul; Rich; (Mountain View, CA); Sarwar; Mushtaq; (San Jose, CA); Gere; David S.; (Palo Alto, CA)
Assignee: Apple Inc., Cupertino, CA
Family ID: 44947179
Appl. No.: 13/246821
Filed: September 27, 2011
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
61386865 | Sep 27, 2010 |
Current U.S. Class: 348/48; 348/E13.074
Current CPC Class: H04N 13/25 20180501; G01J 4/00 20130101; G06K 9/00255 20130101; H04N 13/271 20180501; G02B 30/25 20200101; G06T 7/557 20170101; G07C 9/37 20200101; G06T 7/593 20170101; G06T 7/596 20170101; H04N 13/243 20180501
Class at Publication: 348/48; 348/E13.074
International Class: H04N 13/02 20060101 H04N013/02
Claims
1. A three-dimensional imaging apparatus configured to capture at
least one image including one or more objects, comprising: a first
sensor for capturing a polarized image, the first sensor including
a first imaging device and a polarized filter associated with the
first imaging device; a second sensor for capturing a first
non-polarized image; a third sensor for capturing a second
non-polarized image; and at least one processing module for
deriving depth information for the one or more objects utilizing at
least the first non-polarized image and the second non-polarized
image, the processing module further operative to combine the
polarized image, the first non-polarized image, and the second
non-polarized image to form a composite three-dimensional
image.
2. The three-dimensional imaging apparatus of claim 1, wherein the
first sensor is positioned between the second and third sensors
such that a blind region of the first sensor is between blind
regions of the second and third sensors.
3. The three-dimensional imaging apparatus of claim 1, wherein: the
first sensor is a luminance sensor; the second sensor is a first
chrominance sensor; and the third sensor is a second chrominance
sensor.
4. The three-dimensional imaging apparatus of claim 3, wherein a
field of view of the first sensor is offset from both a field of
view of the second sensor and a field of view of the third
sensor.
5. The three-dimensional imaging apparatus of claim 3, wherein: the
polarized image is a polarized luminance image; the first
non-polarized image is a first chrominance image; the second
non-polarized image is a second chrominance image; the at least one
processing module is configured to generate a stereo disparity map
from at least the first and second chrominance images; and the at
least one processing module is configured to derive depth
information at least partially from the stereo disparity map.
6. The three-dimensional imaging apparatus of claim 5, further
comprising a fourth sensor configured to capture a second luminance
image; wherein the at least one processing module is further
configured to refine the stereo disparity map based on the second
luminance image.
7. The three-dimensional imaging apparatus of claim 1, wherein the
polarized filter comprises an array of polarizing subfilters.
8. The three-dimensional imaging apparatus of claim 7, wherein: the
first sensor comprises at least one pixel; and a first polarized
subfilter of the array of polarizing subfilters overlays the at
least one pixel.
9. The three-dimensional imaging apparatus of claim 8, wherein the
first polarized subfilter of the array of polarizing subfilters has
a different type of polarization than a second polarized subfilter
of the array of polarizing subfilters.
10. The three-dimensional imaging apparatus of claim 8, wherein:
the at least one pixel receives polarized light reflected from an
imaged object, the polarized light corresponding to a polarization
type of the first polarized subfilter; and the first sensor
determines a surface normal of the imaged object by measuring a
polarization of the light received by the at least one pixel.
11. The three-dimensional imaging apparatus of claim 10, further
comprising: at least a second pixel adjacent to the at least one
pixel; wherein the second polarized subfilter overlays the at least
a second pixel; and the first sensor determines a surface normal of
the imaged object by measuring a polarization of the light received
by the at least a second pixel and comparing it to the polarization
of the light received by the at least one pixel.
12. The three-dimensional imaging apparatus of claim 8, wherein the
luminance imaging device includes at least one additional pixel
that corresponds to an unpolarized area of the polarized
filter.
13. The three-dimensional imaging apparatus of claim 12, wherein
the polarized luminance image is a high dynamic range image created
from a first luminance image recorded at least by the at least one
pixel and a second luminance image recorded at least by the at
least one additional pixel.
14. The three-dimensional imaging apparatus of claim 8, wherein:
the first sensor includes a microlens array; and at least one
microlens of the microlens array corresponds to the at least one
pixel and is configured to focus light onto the at least one
pixel.
15. The three-dimensional imaging apparatus of claim 1, wherein the
at least one processing module is further configured to identify at
least one face in the composite three-dimensional image utilizing
at least one of the surface information or the depth
information.
16. A three-dimensional imaging apparatus configured to capture at
least one image including one or more objects, comprising: a first
sensor for capturing a polarized chrominance image and determining
surface information for the one or more objects, the first sensor
including a color imaging device and a polarized filter associated
with the color imaging device; a second sensor for capturing a
first luminance image; a third sensor for capturing a second
luminance image; and at least one processing module for deriving
depth information for the one or more objects utilizing at least
the first luminance image and the second luminance image and
combining the polarized chrominance image, the first luminance
image, and the second luminance image to form a composite
three-dimensional image utilizing the surface information and the
depth information.
17. A method for capturing at least one image of an object,
comprising: capturing a polarized image of the object; capturing a
first non-polarized image of the object; capturing a second
non-polarized image of the object; deriving depth information for
the object from at least the first non-polarized image and the
second non-polarized image; determining a plurality of surface
normals for the object, the plurality of surface normals derived
from the polarized image; and creating a three-dimensional image from
the depth information and the plurality of surface normals.
18. The method of claim 17, wherein the operation of deriving depth
information for the object comprises creating a stereo disparity map
from the first non-polarized image and the second non-polarized
image.
19. The method of claim 17, further comprising: determining, based
on the surface normals, a simulated lighting of the object; and
altering the three-dimensional image to insert the simulated
lighting of the object.
20. The method of claim 17, wherein the operation of determining a
plurality of surface normals for the object comprises: grouping
each pixel of a pixel array into a subarray with at least one other
pixel; evaluating the polarized light received by the subarray; and
based on the evaluation, assigning a surface normal to a portion of
the image recorded by the subarray.
21. The method of claim 17, wherein the polarized image and first
non-polarized image are captured by a single sensor.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit under 35 U.S.C.
§ 119(e) to 61/386,865, filed Sep. 27, 2010 and titled "Image
Capture Using Three-Dimensional Reconstruction," the disclosure of
which is hereby incorporated herein in its entirety.
BACKGROUND
[0002] I. Technical Field
[0003] The disclosed embodiments relate generally to image sensing
devices and, more particularly, to image sensing devices that
utilize three-dimensional reconstruction to form a
three-dimensional image.
[0004] II. Background Discussion
[0005] Existing three-dimensional image capture devices, such as
digital cameras and video recorders, can derive limited
three-dimensional visual information for objects located within a
captured area. For example, some imaging devices can extract
approximate depth information relating to objects located within
the captured area, but are incapable of obtaining detailed
geometric information relating to the surfaces of these objects.
Such sensors may be able to approximate the distances of objects
within the captured area, but cannot accurately reproduce the
three-dimensional shape of the objects. Alternatively, other imaging
devices can obtain and reproduce surface detail information for
objects within the captured area, but are incapable of extracting
depth information. Accordingly, these sensors may be incapable of
differentiating between a small object positioned close to the
sensor and a large object positioned far away from the sensor.
SUMMARY
[0006] Embodiments described herein relate to systems, apparatuses
and methods for capturing a three-dimensional image using one or
more dedicated imaging devices. One embodiment may take the form of
a three-dimensional imaging apparatus configured to capture at
least one image including one or more objects, comprising: a first
sensor for capturing a polarized image, the first sensor including
a first imaging device and a polarized filter associated with the
first imaging device; a second sensor for capturing a first
non-polarized image; a third sensor for capturing a second
non-polarized image; and at least one processing module for
deriving depth information for the one or more objects utilizing at
least the first non-polarized image and the second non-polarized
image, the processing module further operative to combine the
polarized image, the first non-polarized image, and the second
non-polarized image to form a composite three-dimensional
image.
[0007] Another embodiment may take the form of a three-dimensional
imaging apparatus configured to capture at least one image
including one or more objects, comprising: a first sensor for
capturing a polarized chrominance image and determining surface
information for the one or more objects, the first sensor including
a color imaging device and a polarized filter associated with the
color imaging device; a second sensor for capturing a first
luminance image; a third sensor for capturing a second luminance
image; and at least one processing module for deriving depth
information for the one or more objects utilizing at least the
first luminance image and the second luminance image and combining
the polarized chrominance image, the first luminance image, and the
second luminance image to form a composite three-dimensional image
utilizing the surface information and the depth information.
[0008] Still another embodiment may take the form of a method for
capturing at least one image of an object, comprising: capturing a
polarized image of the object; capturing a first non-polarized
image of the object; capturing a second non-polarized image of the
object; deriving depth information for the object from at least the
first non-polarized image and the second non-polarized image;
determining a plurality of surface normals for the object, the
plurality of surface normals derived from the polarized image; and
creating a three-dimensional image from the depth information and
the plurality of surface normals.
[0009] This summary is provided to introduce a selection of
concepts in a simplified form that are further described below in
the detailed description. This summary is not intended to identify
key features or essential features of the claimed subject matter,
nor is it intended to be used to limit the scope of the claimed
subject matter. Other features, details, utilities, and advantages
will be apparent from the following more particular written
description of various embodiments, as further illustrated in the
accompanying drawings and defined in the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] FIG. 1A is a functional block diagram that illustrates
certain components of one embodiment of a three-dimensional imaging
apparatus;
[0011] FIG. 1B is a close-up view of one embodiment of the second
imaging device shown in FIG. 1A;
[0012] FIG. 1C is a close-up view of another embodiment of the
second imaging device shown in FIG. 1A;
[0013] FIG. 1D is a close-up view of another embodiment of the
second imaging device shown in FIG. 1A;
[0014] FIG. 2 is a functional block diagram that illustrates
certain components of another embodiment of a three-dimensional
imaging apparatus;
[0015] FIG. 3 is a functional block diagram that illustrates
certain components of another embodiment of a three-dimensional
imaging apparatus;
[0016] FIG. 4 depicts a sample polarization filter that may be used
in accordance with embodiments discussed herein, including the
imaging apparatuses of FIGS. 1A-3; and
[0017] FIG. 5 depicts a second sample polarization filter that may
be used in accordance with embodiments discussed herein, including
the imaging apparatuses of FIGS. 1A-3.
DETAILED DESCRIPTION
[0018] One embodiment may take the form of a three-dimensional
imaging apparatus, including a first and second imaging device. The
first imaging device may have two unique imaging devices that may
be used in concert to derive depth data for objects within the
field of detection of the sensors. Alternatively, the first imaging
device may have a single imaging device that provides depth data.
The second imaging device may be at least partially overlaid with a
polarizing filter in order to obtain polarization data of light
impacting the device, and thus the surface orientation of any
objects reflecting such light.
[0019] The first imaging device may derive approximate depth
information relating to objects within its field of detection and
supply the depth information to an image processing device. The
second imaging device may capture surface detail information
relating to objects within its field of detection and supply the
surface detail information to the image processing device. The
image processing device may combine the depth information with the
surface detail information in order to create a three-dimensional
image that includes both surface detail and accurate depth
information for objects in the image.
[0020] In the following discussion of illustrative embodiments, the
term "image sensing device" includes, without limitation, any
electronic device that can capture still or moving images. The
image sensing device may utilize analog or digital sensors, or a
combination thereof, for capturing the image. In some embodiments,
the image sensing device may be configured to convert or facilitate
converting the captured image into digital image data. The image
sensing device may be hosted in various electronic devices
including, but not limited to, digital cameras, personal computers,
personal digital assistants (PDAs), mobile telephones, or any other
devices that can be configured to process image data. Sample image
sensing devices include charge-coupled device (CCD) sensors,
complementary metal-oxide-semiconductor sensors, infrared sensors,
light detection and ranging sensors, and the like. Further, the
image sensing devices may be sensitive to a range of colors and/or
luminances, and may employ various color separation mechanisms such
as Bayer arrays, Foveon X3 configurations, multiple CCD devices,
dichroic prisms and the like.
[0021] FIG. 1A is a functional block diagram of one embodiment of a
three-dimensional imaging apparatus for capturing and storing image
data. In one embodiment, the three-dimensional imaging apparatus
may be a component within an electronic device. For example, the
three-dimensional imaging apparatus may be employed in a standalone
digital camera, a laptop computer, a media player, a mobile phone,
and so on and so forth.
[0022] As shown in FIG. 1A, the three-dimensional imaging apparatus
100 may include a first imaging device 102, a second imaging device
104, and an image processing module 106. The first imaging device
102 may include a first imaging device and the second imaging
device 104 may include a second imaging device and a polarizing
filter 108 associated with the second imaging device. As will be
further discussed below, the first imaging device 102 may be
configured to derive approximate depth information relating to
objects in the image, and the second imaging device 104 may be
configured to derive surface orientation information relating to
objects in the image.
[0023] In one embodiment, the fields of view 112, 114 of the first and
second imaging devices may be offset so that the received images are
slightly different. For example, the field of view 112 of the first
imaging device 102 may be vertically, diagonally, or horizontally
offset from the field of view 114 of the second imaging device 104,
or may be closer to or farther from a reference plane or point. As
will be further discussed below, offsetting the fields of view 112,
114 of the first and second imaging devices may provide data useful
for generating stereo disparity maps, as well as for extracting depth
information. However, in other embodiments, the fields of view 112,
114 of the first and second imaging devices may be substantially
the same.
[0024] The first and second imaging devices 102, 104 may each be
formed from an array of light-sensitive pixels. That is, each pixel
of the imaging devices may detect at least one of the various
wavelengths that make up visible light. The signal generated by
each such pixel may vary depending on the wavelength of light
impacting it so that the array may thus reproduce a composite image
of the object. In one embodiment, the first and second imaging
devices 102, 104 may have substantially identical pixel array
configurations. For example, the first and second imaging devices
may have the same number of pixels, the same pixel aspect ratio,
the same arrangement of pixels, and/or the same size of pixels.
However, in other embodiments, the first and second imaging devices
may have different numbers of pixels, pixel sizes, and/or layouts.
For example, in one embodiment, the first imaging device 102 may
have a smaller number of pixels than the second imaging device 104,
or vice versa, or the arrangement of pixels may be different
between the sensors.
[0025] The first imaging device 102 may be configured to capture a
first image and process the image to detect depth or distance
information relating to objects in the image. For example, the
first imaging device 102 may be configured to derive an approximate
relative distance of an object 110 by measuring properties of
electromagnetic waves as they are reflected off or scattered by the
object and captured by the first imaging device. In one embodiment,
the first imaging device may be a Light Detection And Ranging
(LIDAR) sensor. The LIDAR sensor may emit laser pulses that are
reflected off of the surfaces of objects in the image and detect
the reflected signal. The LIDAR sensor may then calculate the
distance of an object from the sensor by measuring the time delay
between transmission of a laser pulse and the detection of the
reflected signal. Other embodiments may utilize other types of
depth-detection techniques, such as infrared reflection, RADAR,
laser detection and ranging, and the like.
[0026] Alternatively, a stereo disparity map may be generated to
derive depth or distance information relating to objects present in
the image. In one embodiment, a stereo disparity map may be formed
from the first image captured by the first imaging device and a
second image captured by the second imaging device. Various methods
and processes for creating stereo disparity maps from two offset
images are known to those skilled in the art and thus are not
discussed further herein. Generally, the stereo disparity map is a
depth map in which depth information for objects shown in the
images is derived from the offset first and second images. For
example, the second image may include some or all of the objects
captured in the first image, but with the position of the objects
being shifted in one direction (typically, although not
necessarily, horizontally). This shift may be measured and used to
calculate the distance of the objects from the first and second
imaging devices.
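For illustration, the measured shift (disparity) can be converted into an approximate distance using the usual pinhole-stereo relation. The Python sketch below uses hypothetical focal-length and baseline values that are not taken from this application.

```python
# Illustrative sketch only: convert a measured pixel shift (disparity) into an
# approximate object distance. Focal length and baseline are assumed values.

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Approximate distance (in meters) of an object producing the given disparity."""
    if disparity_px <= 0:
        return float("inf")  # no measurable shift implies a very distant object
    return focal_length_px * baseline_m / disparity_px

# Example: imaging devices 6 cm apart, 1400-pixel focal length, 20-pixel shift.
print(depth_from_disparity(20, 1400.0, 0.06))  # roughly 4.2 m
```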
[0027] The second imaging device 104 may be configured to capture a
second image and derive detailed surface information for objects in
the image. As shown in FIG. 1A, in one embodiment, a polarizing
filter 108 may be positioned between the second imaging device and
an object 110, such that light reflected off the object passes
through the polarizing filter to produce polarized light. The
polarized light is then transmitted by the filter 108 to the second
imaging device 104. The second imaging device 104 may be any
electronic sensor capable of detecting various wavelengths of
light, such as those commonly used in digital cameras, digital
video cameras, mobile telephones and personal digital assistants,
web cameras, and so on and so forth. For example, the second
imaging device 104 may be, but is not limited to, a charge-coupled
device (CCD) imaging device or a complementary
metal-oxide-semiconductor (CMOS) sensor.
[0028] In one embodiment, a polarizing filter may overlay the
second imaging device. As shown in FIG. 1B, the polarizing filter
108 may include an array of polarizing subfilters 120. Each of the
polarizing subfilters 122 within the array may overlay one or more
pixels 124 of the second imaging device 104. In one embodiment, the
polarizing filter 108 may be overlaid over the second imaging
device 104 so that each polarizing subfilter 122 in the array 120
is aligned with a corresponding pixel 124. The polarizing
subfilters 122 may have different types of polarizations. For
example, a first polarizing subfilter may have a horizontal
polarization, a second subfilter may have a vertical polarization,
a third may have +45 degree polarization, a fourth may have a -45
degree polarization, and so on and so forth. In some embodiments,
left and right-hand circular polarizations may be used.
Accordingly, the polarized light that is transmitted from the
polarizing filter 108 to the second imaging device 104 may be
polarized differently for some of the pixels than for others.
[0029] Another embodiment, shown in FIG. 1C, may include a
microlens array 130 overlaying the polarization filter 108. Each of
the microlenses 132 in the microlens array 130 may overlay one or
more polarizing subfilters 122 to focus polarized light onto a
corresponding pixel 124 of the second imaging device. The
microlenses 132 in the array 130 may each be configured to refract
light impacting on the second imaging device, as well as transmit
light to an underlying polarizing subfilter 122. Accordingly, each
microlens 132 may correspond to one of the pixels 124 of the second
imaging device 104. The microlenses 132 can be formed from any
suitable material for transmitting and diffusing light, including
plastic, acrylic, silica, glass, and so on and so forth.
Additionally, the microlens array may include
combinations of reflective material, highly transparent material,
light absorbing material, opaque material, metallic material, optic
material, and/or any other functional material to provide extra
modification of optical performance. In another embodiment, shown
in FIG. 1D, the microlenses 134 of the microlens array 136 may be
polarized. In this embodiment, the polarized microlens array 136
may overlay the pixels 124 of the second imaging device 104 so that
polarized light is focused onto the pixels 124 of the second
imaging device.
[0030] In one embodiment, the microlenses 136 may be convex and
have a substantially rounded configuration. Other embodiments may
have different configurations. For example, in one embodiment, the
microlenses 136 may have a conical configuration, in which the top
end of each microlens is pointed. In other embodiments, the
microlenses 136 may define truncated cones, in which the tops of
the microlenses form a substantially flat surface. Additionally, in
some embodiments, the microlenses 136 may be concave surfaces,
rather than convex. As is known, the microlenses may be formed
using a variety of techniques, including laser-cutting techniques,
and/or micro-machining techniques, such as diamond turning. After
the microlenses 136 are formed, an electrochemical finishing
technique may be used to coat and/or finish the microlenses to
increase their longevity and/or enhance or add any desired optical
properties. Other methods for forming the microlenses may entail
the use of other techniques and/or machinery, as is known.
[0031] Unpolarized light that is reflected off of the surfaces of
objects in the image may be fully or partially polarized according
to Fresnel's laws. Generally, the polarization may be correlated to
the plane angle of incidence on the surface, as well as to the
physical properties of the material. For example, light reflecting
off highly reflective materials, such as polished metal, may be
less polarized than light reflecting off of a dull surface. In one
embodiment, light that is reflected off the surfaces of objects in
the image may be passed through the array of polarization filters.
The resulting polarized light may be captured by the pixels of the
second imaging device so that each such pixel of the second imaging
device receives light only if that light is polarized according to
the polarization scheme of its corresponding filter. The second
imaging device 104 may then measure the polarization of the light
impacting on each pixel and derive the surface geometry of the
object. For example, the second imaging device 104 may determine
the orientation and/or curvature of the surface of an object. In
one embodiment, the orientation and/or curvature of the surface may
be determined for each pixel of the second imaging device 104 and
combined to obtain the surface geometry for all of the surfaces of
the object.
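One common way to turn such per-pixel polarization measurements into orientation cues is through the Stokes parameters. The sketch below is an assumption for illustration, since the application does not specify a formula; recovering a full surface normal additionally requires a reflection model to resolve the zenith angle.

```python
import math

def polarization_cues(i0, i45, i90, i135):
    """Angle and degree of linear polarization estimated from four pixel
    intensities measured behind 0/45/90/135-degree subfilters. This is a
    standard Stokes-parameter estimate, not a formula from the application."""
    s0 = (i0 + i45 + i90 + i135) / 2.0
    s1 = i0 - i90
    s2 = i45 - i135
    aolp = 0.5 * math.atan2(s2, s1)            # relates to the surface azimuth
    dolp = math.hypot(s1, s2) / max(s0, 1e-9)  # relates to the surface zenith
    return aolp, dolp
```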
[0032] The first and second images may be transmitted to the image
processing module 106, which may combine the first image captured
by and transmitted from the first imaging device 102 with the
second image captured by and transmitted from the second imaging
device 104, to output a composite three-dimensional image. This may
be accomplished by aligning the first and second images and
overlaying one of the images on top of the other using a variety of
techniques, including warping the first and second images,
selectively cropping at least one of these images, using
calibration data for the image processing module, and so on and so
forth. As discussed above, the first image may supply depth
information relating to the objects in the image, while the second
image may supply surface geometry information for the objects in
the image. Accordingly, the combined three-dimensional image may
include accurate depth information for each object, while also
providing accurate object surface detail.
[0033] In one embodiment, the first image supplying the depth
information may have a lower or coarser resolution (e.g., lower
pixel count per unit area), than the second image supplying the
surface geometry information. In this embodiment, the composite
three-dimensional image may include high resolution surface detail
for objects in the image, but the amount of overall processing by
the image processing module may be reduced due to the lower
resolution of the first image. As discussed above, other
embodiments may produce first and second images having
substantially the same resolution, or the first image supplying the
depth information may have a higher resolution than the second
image.
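As a rough illustration of pairing a coarse depth image with a finer surface-detail image, the sketch below simply resamples the depth map onto the denser grid; the integer scale factor and nearest-neighbor resampling are assumptions, not details from the application.

```python
import numpy as np

def upsample_depth(coarse_depth, scale):
    """Nearest-neighbor upsampling of a coarse depth map onto the (denser)
    grid of the surface-detail image. Scale factor is assumed to be an integer."""
    return np.repeat(np.repeat(coarse_depth, scale, axis=0), scale, axis=1)
```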
[0034] Another embodiment of a three-dimensional imaging apparatus
200 is shown in FIG. 2. The imaging apparatus 200 generally
includes a first chrominance sensor 202, a luminance sensor 204, a
second chrominance sensor 206, and an image processing module 208.
The luminance sensor 204 may be configured to capture a luminance
component of incoming light. Additionally, each of the chrominance
sensors 202, 206 may be configured to capture color components of
incoming light. In one embodiment, the chrominance sensors 202, 206
may sense the R (Red), G (Green), and B (Blue) components of an
image and process these components to derive chrominance
information. Other embodiments may be configured to sense other
color components, such as yellow, cyan, magenta, and so on.
Further, in some embodiments, two luminance sensors and a single
chrominance sensor may be used. That is, certain embodiments may
employ a first luminance sensor, a first chrominance sensor and a
second luminance sensor, such that a stereo disparity (e.g., stereo
depth) map may be generated based on the offsets of the two
luminance images. Each luminance sensor captures one of the two
luminance images in this embodiment. Further, in such an
embodiment, the chrominance sensor may be used to capture color
information for a picture, while one or both luminance sensors
capture luminance information. In this embodiment, both of the
luminance sensors may be overlaid, fitted, or otherwise associated
with one or more polarizing filters to receive and capture surface
normal information for a surface, as described in more detail
herein. Multiple luminance sensors with polarizing filters may be
used, for example, in low light conditions where chrominance
information may be lost or muted.
[0035] "Chrominance sensors" 202, 206 may be implemented in a
variety of fashions and may sense/capture more than just
chrominance. For example, the chrominance sensor(s) 202, 206 may be
implemented as a Bayer array, an RGB sensor, a CMOS sensor, and so
on and so forth. Accordingly, it should be appreciated that a
chrominance sensor may also capture luminance information;
chrominance is typically derived from the RGB sensor data.
[0036] Returning to an embodiment having two chrominance sensors
202, 206 and a single luminance sensor 204, the first chrominance
sensor 202 may take the form of a first color imaging device. The
luminance sensor may take the form of a luminance imaging device
that is overlaid by a polarizing filter. The second chrominance
sensor 206 may take the form of a second color imaging device. In
one embodiment, the luminance sensor 204 and two chrominance
sensors 202, 206 may be separate integrated circuits. However, in
other embodiments, the luminance and chrominance sensors may be
formed on the same circuit and/or formed on a single board or other
element. In alternative embodiments, the polarizing filter 210 may
be placed over either of the chrominance sensors instead of (or in
addition to) the luminance sensor.
[0037] As shown in FIG. 2, the polarizing filter 210 may be
positioned between the luminance sensor 204 and an object 211, such
that light reflected off the object passes through the polarizing
filter and impacts the corresponding luminance sensor. The
luminance sensor 204 may be any electronic sensor capable of
detecting various wavelengths of light, such as those commonly used
in digital cameras, digital video cameras, mobile telephones and
personal digital assistants, web cameras, and so on and so
forth.
[0038] As discussed above with respect to FIGS. 1A and 1B, the
luminance and chrominance sensors 202, 204, 206 may be formed from
an array of color-sensitive pixels. The pixel arrangement may vary
between sensors or may be identical, in a manner similar to that
previously discussed.
[0039] In one embodiment, respective color filters may overlay the
first and second color sensors and allow the sensors to capture the
color portions of a sensed image as chrominance images. Similarly,
an additional filter may overlay the luminance sensor and allow
the imaging device to capture the luminance portion of a sensed
image as a luminance image. The luminance image, along with the
chrominance images, may be transmitted to the image processing
module 208. As will be further described below, the image
processing module 208 may combine the luminance image captured by
and transmitted from the luminance sensor 204 with the chrominance
images captured by and transmitted from the chrominance sensors, to
output a composite image.
[0040] It should be appreciated that the luminance of an image may
be expressed as a weighted sum of red, green and blue wavelengths
of the image, in the following manner:
L = 0.59G + 0.3R + 0.11B
[0041] Where L is luminance, G is detected green light, R is
detected red light, and B is detected blue light. The chrominance
portion of an image may be the difference between the full color
image and the luminance image. Accordingly, the full color image
may be the chrominance portion of the image combined with the
luminance portion of the image. The chrominance portion may be
derived by mathematically processing the R, G, and B components of
an image, and may be expressed as two signals or a two dimensional
vector for each pixel of an imaging device. For example, the
chrominance portion may be defined by two separate components Cr
and Cb, where Cr may be proportional to detected red light less
detected luminance, and where Cb may be proportional to detected
blue light less detected luminance. In some embodiments, the first
and second chrominance sensors 202, 206 may be configured to detect
red and blue light and not green light, for example, by covering
pixel elements of the color imaging devices with a red and blue
filter array. This may be done in a checkerboard pattern of red and
blue filter portions. In other embodiments, the filters may include
a Bayer-pattern filter array, which includes red, blue, and green
filters. Alternatively, the filter may be a CYGM (cyan, yellow,
green, magenta) or RGBE (red, green, blue, emerald) filter.
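The relationships above can be summarized in a short sketch. The scaling and offsets used by practical YCbCr encodings are omitted, and the sample values are arbitrary.

```python
def luminance(r, g, b):
    # Weighted sum given in the description: L = 0.59 G + 0.3 R + 0.11 B
    return 0.59 * g + 0.3 * r + 0.11 * b

def chrominance(r, g, b):
    """Cr and Cb as described above: detected red/blue less detected luminance.
    Practical encodings add scaling and offsets, which are omitted here."""
    l = luminance(r, g, b)
    return r - l, b - l  # (Cr, Cb)

cr, cb = chrominance(200, 120, 40)  # e.g., a reddish pixel
```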
[0042] As discussed above, the luminance portion of a color image
may have a greater influence on the overall image resolution than
the chrominance portions of a color image. In some embodiments, the
luminance sensor 204 may be an imaging device that has a higher
pixel count than that of the chrominance sensors 202, 206.
Accordingly, the luminance image generated by the luminance sensor
204 may be a higher resolution image than the chrominance images
generated by the chrominance sensors 202, 206. In other
embodiments, the luminance image may be stored at a higher
resolution or transmitted at higher bandwidth than the chrominance
images.
[0043] In some embodiments, the fields of view of any two of the
luminance and chrominance sensors may be offset so that the
produced images are slightly different. As discussed above, the
image processing module may combine the high resolution luminance
image captured by and transmitted from luminance sensor 204 with
the first and second chrominance images captured by and transmitted
from the first and second chrominance sensors 202, 206 to output a
composite three-dimensional image. As will be further discussed
below, the image processing module 208 may use a variety of
techniques to account for differences between the high-resolution
luminance image and first and second chrominance images to form the
composite three-dimensional image.
[0044] Depth information for the composite image may be derived
from the two chrominance images. In this embodiment, the fields of
view 212, 216 of the first and second chrominance sensors 202, 206
may be offset from one another and the image processing module 208
may be configured to compute depth information for objects in the
image by comparing the first chrominance image with the second
chrominance image. The pixel offsets may be used to form a stereo
disparity map between the two chrominance images. As discussed
above, the stereo disparity map may be a depth map in which depth
information for objects in the images is derived from the offset
first and second chrominance images.
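A disparity map of the kind described here can be sketched with brute-force block matching; the block size, search range, and matching cost below are illustrative choices rather than details from the application.

```python
import numpy as np

def disparity_map(left, right, block=7, max_disp=32):
    """Brute-force sum-of-absolute-differences block matching between two
    offset grayscale images (2-D float arrays of equal shape)."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
                cost = np.abs(patch - cand).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```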
[0045] In some embodiments, depth information for the composite
image may be derived from the two chrominance images in conjunction
with the luminance image. In this embodiment, the image processing
module may further compare the luminance image with one or both of
the chrominance images to form further stereo disparity maps
between the luminance image and the chrominance images.
Alternatively, the image processing module may be configured to
refine the accuracy of the stereo disparity map generated initially
using only the two chrominance sensors 202, 206.
[0046] Surface detail information may be derived from the luminance
sensor 204. As previously mentioned, the luminance sensor 204 may
include a luminance imaging device and an associated polarizing
filter 210. The polarizing filter 210 may include an array of
polarizing subfilters, with each of the polarizing subfilters
within the array corresponding to a pixel of the luminance imaging
device. In one embodiment, the polarizing filter may be overlaid
over the luminance imaging device so that each polarizing subfilter
in the array is aligned with a corresponding pixel. In some
embodiments, the polarizing filters in the array may have different
types of polarizations. However, in other embodiments, the
polarizing subfilters in the array may have the same type of
polarization.
[0047] Light reflected off the surfaces of objects in the image may
be passed through the array of polarization subfilters. The
resulting polarized light may be captured by the pixels of the
luminance imaging device so that each pixel of the luminance
imaging device may receive light that is polarized according to the
polarization scheme of its corresponding subfilter. The luminance
imaging device may then measure the polarization of the light
impacting on the pixels and derive the surface geometry of the
object. In one embodiment, the orientation and/or curvature of the
surface may be determined for each pixel of the luminance imaging
device and combined to obtain the surface geometry for all of the
surfaces of the object 211.
[0048] As discussed above with respect to FIG. 1C, in some
embodiments, the polarizing filter 210 may be overlaid by a
corresponding microlens array 130 to focus the light onto the
pixels of the luminance imaging device. In other embodiments, such
as that shown in FIG. 1D, the microlens array 134 may be polarized,
so that a separate polarizing filter overlaying the luminance
imaging device is not needed.
[0049] The luminance and first and second chrominance images may
then be transmitted to the image processing module. The image
processing module 208 may combine the luminance image captured by
and transmitted from the luminance imaging device 205, with the
first and second chrominance images captured by and transmitted
from the chrominance imaging devices 203, 207, to output a
composite three-dimensional image. In one embodiment, this may be
accomplished by warping the luminance and two chrominance images,
such as to compensate for depth of field effects or stereo effects,
and substantially aligning the images to form the composite
three-dimensional image. Other techniques for aligning the
luminance and chrominance images include selectively cropping at
least one of these images by identifying fiducials in the fields of
view of the first and second chrominance images and/or luminance
images, or by using calibration data for the image processing
module 208. As discussed above, the stereo disparity map generated
between the two chrominance images may supply depth information
relating to the objects in the image, while the luminance image
supplies surface geometry information for the objects in the image.
Accordingly, the combined three-dimensional image may include
accurate depth information for each object, while also providing
accurate object surface detail.
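A minimal sketch of the recombination step follows, assuming the chrominance planes have already been upsampled to the luminance resolution and that alignment reduces to an integer shift; the warping, cropping, and calibration steps mentioned above are omitted.

```python
import numpy as np

def recombine(l_img, cr_img, cb_img, dx=0, dy=0):
    """Hypothetical final step: shift the chrominance planes by a previously
    estimated integer offset, then invert the weighted-sum model above
    (Cr = R - L, Cb = B - L) to recover RGB. All arrays share one shape."""
    cr = np.roll(cr_img, shift=(dy, dx), axis=(0, 1))
    cb = np.roll(cb_img, shift=(dy, dx), axis=(0, 1))
    r = cr + l_img
    b = cb + l_img
    g = (l_img - 0.3 * r - 0.11 * b) / 0.59
    return np.stack([r, g, b], axis=-1)
```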
[0050] In one embodiment, the first and second chrominance images
supplying the depth information may each have a lower pixel count
than the luminance image supplying the surface geometry
information. As discussed above, this may result in a composite
three-dimensional image that has high resolution surface detail of
objects in the image and approximated depth information.
Accordingly, the amount of overall processing by the image
processing module may be reduced due to the lower resolution of the
chrominance images.
[0051] Each of the luminance and chrominance sensors can have a
blind region due to a near field object 211 that may partially or
fully obstruct the fields of view of the sensors. For example, the
near field object may block the field of view of the sensors to
prevent the sensors from detecting part or all of a background or a
far field object that is positioned further from the sensors than
the near-field object 211. In one embodiment, the chrominance
sensors 202, 206 may be positioned such that the blind regions of
the chrominance sensors do not overlap. Accordingly, chrominance
information that is missing from one of the chrominance sensors due
to a near field object may, in many cases, be captured by the other
chrominance sensor of the three-dimensional imaging apparatus. The
captured color information may then be combined with the luminance
information from the luminance sensor and incorporated into the
final image, as previously described. Due to the offset blind
regions of the chrominance sensors 202, 206, stereo imaging
artifacts may be reduced in the final image by ensuring that color
information is supplied by at least one of the chrominance sensors
where needed. In other words, color information for each of the
pixels of the luminance sensor 204 may be supplied by at least one
of the chrominance sensors 202, 206.
[0052] Still with respect to FIG. 2, the luminance sensor 204 may
be positioned between the chrominance sensors 202, 206 so that the
blind region of the luminance sensor may be between the blind
regions of the first and second chrominance sensors. This
configuration may prevent or reduce overlap between the blind
regions of the first and second chrominance sensors, while also
allowing for a more compact arrangement of sensors within the
three-dimensional imaging apparatus. However, in other embodiments,
the chrominance sensors may be positioned directly adjacent one
another and the luminance sensor may be positioned on either side
of the chrominance sensors, rather than in-between the sensors.
[0053] As alluded to above, other embodiments may utilize two
luminance sensors and a single chrominance sensor positioned
between the luminance sensors. The chrominance sensor may include a
polarizing filter and a color imaging device associated with the
polarizing filter. In this embodiment, the chrominance sensor may
be configured to derive surface geometry information for objects in
the image and the luminance images generated by the luminance
sensors may be processed to extract depth information for objects
in the image.
[0054] Another embodiment of a three-dimensional imaging apparatus
300 may include four or more sensors. As shown in FIG. 3, one
embodiment may include two chrominance sensors 302, 306, two
luminance sensors 304, 310, and an image processing module 308.
Similar to the embodiment shown in FIG. 2, this embodiment may
include two chrominance sensors 302, 306 positioned on either side
of a first luminance sensor 304. In contrast to the embodiment
shown in FIG. 2, however, the surface detail information may be
supplied by a second luminance sensor 310 that includes a
polarizing filter 312 and a luminance imaging device associated
with the polarizing filter. In one embodiment, the second luminance
sensor may be positioned on top of or below the first luminance
sensor. In other embodiments, the second luminance sensor 310 may
be horizontally or diagonally offset from the first luminance
sensor 304. Additionally, the second luminance sensor 310 may be
positioned in front of or behind the first luminance sensor 304.
Each of the luminance and chrominance sensors generally (although
not necessarily) interacts directly with the image processing
module 308.
[0055] In this embodiment, depth information may be obtained by
forming a stereo disparity map between the first luminance sensor
304 and the two chrominance sensors 302, 306, and the surface
detail information may be obtained by the second luminance sensor
310. This embodiment may allow for generating a more accurate
stereo disparity map, since the map may be formed from the
luminance image generated by the first luminance sensor 304 and the
two chrominance images generated by the chrominance sensors 302,
306, and is not limited to the data provided by the two chrominance
images. This may allow for more accurate depth calculation, as well
as for better alignment of the produced images to form the
composite three-dimensional image. In another embodiment, the
stereo disparity map may be generated from the two luminance images
generated by the first and second luminance sensors 304, 310, as
well as the two chrominance images generated by the chrominance
sensors 302, 306.
[0056] As previously mentioned, embodiments discussed herein may
employ a polarized filter 312 that is placed atop, above, or
otherwise between a light source and an imaging sensor. For
example, polarized filters are shown in FIG. 1C as a separate layer
beneath microlenses and in FIG. 1D as integrated with microlenses.
In any of the embodiments disclosed herein, the polarized filter
312 may be patterned in the manner shown in FIG. 4. That is, the
polarization filter 312 may take the form of a set of individually
polarized elements 314, each of which passes a different
polarization of light. As shown in FIG. 4, the individually
polarized elements 314 may be vertically polarized 316,
horizontally polarized 318, +45 degree polarized 320 and -45 degree
polarized 322. These four individually polarized elements may be
arranged in a two-by-two array in certain embodiments.
[0057] In such embodiments, each individually polarized element may
overlay a single pixel (or, in some embodiments, a group of
pixels). The pixel beneath the individually polarized element thus
receives and senses light having a single polarity. As discussed
above, the surface orientation of an object reflecting light onto a
pixel may be determined through the polarization of the light
impacting that pixel. Thus, each group of pixels (here, each group
of four pixels) may cooperate to determine the surface orientation
(e.g., surface normal) of a portion of an object reflecting light
onto the group of pixels. Accordingly, resolution of an image may
be traded in exchange for the ability to detect surface detail and
curvature of objects.
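A sketch of reading such a mosaic follows, assuming each two-by-two group is laid out as 0/45/90/135 degrees in row-major order (the application does not fix an ordering) and reusing the Stokes-style estimate from the earlier sketch.

```python
import numpy as np

def superpixel_cues(mosaic):
    """Group a sensor plane overlaid by the two-by-two pattern of FIG. 4 into
    2x2 superpixels and compute one polarization cue per group."""
    h, w = (mosaic.shape[0] // 2) * 2, (mosaic.shape[1] // 2) * 2
    m = mosaic[:h, :w].astype(np.float32)
    i0, i45 = m[0::2, 0::2], m[0::2, 1::2]
    i90, i135 = m[1::2, 0::2], m[1::2, 1::2]
    s0 = (i0 + i45 + i90 + i135) / 2.0
    s1, s2 = i0 - i90, i45 - i135
    aolp = 0.5 * np.arctan2(s2, s1)
    dolp = np.hypot(s1, s2) / np.maximum(s0, 1e-9)
    return aolp, dolp  # resolution is traded for per-group surface orientation
```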
[0058] FIG. 5 is a top-down view of an alternative arrangement of
polarized subfilters 330 (e.g., individually polarized elements)
that may overlay the pixels of a digital image sensor. In FIG. 5,
each element of the filter array marked with an "X" indicates a
group of polarized subfilters 330 arranged, for example, in the
two-by-two grid previously mentioned. Alternative arrangements are
possible. Each element of the filter that is unmarked has no
polarized subfilters. Rather, light impinges upon pixels beneath
these portions of the filter without any filtering at all. Thus,
the filter shown in FIG. 5 is a partial filter; certain groups of
pixels sense polarized light while others are not so filtered and
sense unpolarized light.
[0059] It can be seen from FIG. 5 that the groups of polarized
subfilters 330 and the non-filtered sections of the filter
generally alternate. Other patterns of polarized and nonpolarized
areas may be formed. For example, a filter may have half as many
polarized areas as nonpolarized or twice as many. Accordingly, the
pattern shown in FIG. 5 is illustrative only.
[0060] As one example of another polarizing filter pattern, the
polarized filter may be configured to provide three different light
levels, designated A, B, and C for purposes of this discussion.
Each pixel (or pixel group) may be placed beneath or otherwise
adjacent to a portion of a filter that is polarized to permit light
of levels A, B, or C therethrough. Thus, the image sensor, when
taken as a whole, could be considered to simultaneously capture
three images that are not physically offset (or are very minimally
physically offset, such as by the width of a few pixels) but have
different luminance levels. It should be appreciated that the
varying light levels may be achieved by changing the polarization
patterns and/or degree of polarization. Further, it should be
appreciated that any number of different, distinct levels of light
may be created by the filter, and thus any number of images having
varying light levels may be captured by the associated sensor.
Thus, embodiments may create multiple images that may be employed
to create a single high dynamic range image, as described in more
detail below.
[0061] It should be appreciated that each group of pixels beneath a
group of polarized subfilters 330 receives less light than a group
of pixels beneath an unpolarized area of the filter. Essentially,
the unpolarized groups of pixels may record an image having a first
light level while the polarized pixel groups record the same image
but at a darker light level. The images are recorded from the same
vantage and at the same time since the pixels are interlaced with
one another. Thus, there is practically no offset for the images
captured by the polarized and unpolarized pixel groups.
Essentially, this replicates capturing two images simultaneously
from the exact same vantage point, but at two different exposure
times. The embodiment trades resolution for the ability to capture
an image at two different light levels simultaneously and without
displacement.
[0062] Given the foregoing, it should be appreciated that the image
having a higher light level (e.g., the "first image") and the image
having a lower light level (e.g., the "second image") may be
combined to achieve a number of effects. As but one example, the
two images may be used to create high dynamic range (HDR) images.
As known in the art, HDR images are generally created by overlaying
and merging images that are captured at near-identical or identical
locations, temporally close to one another, and at different
lighting levels. The variance in lighting level is typically
achieved by changing the exposure time of the image capture device.
Since the present embodiment captures two images effectively with
different exposure times, it should be appreciated that these
images may be combined to create HDR images.
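A toy fusion of the two co-located exposures might look like the following. The attenuation factor of the polarized pixel groups and the hat-shaped weighting are assumptions for illustration, not details from the application.

```python
import numpy as np

def merge_hdr(bright, dark, gain):
    """Toy exposure fusion: 'bright' from unpolarized pixel groups, 'dark'
    from polarized groups, 'gain' an assumed attenuation factor of the
    polarizing subfilters. Well-exposed pixels are favored in each source."""
    bright = bright.astype(np.float32)
    dark_lin = dark.astype(np.float32) * gain  # bring both onto one radiance scale
    w_bright = 1.0 - 2.0 * np.abs(bright / 255.0 - 0.5)
    w_dark = 1.0 - 2.0 * np.abs(dark.astype(np.float32) / 255.0 - 0.5)
    return (w_bright * bright + w_dark * dark_lin) / np.maximum(w_bright + w_dark, 1e-6)
```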
[0063] Further, the polarization pattern of the filter may be
varied to effectively create three or more images at three or more
total exposures (for example, by suitably varying and/or
interleaving various groups of polarized subfilters and unpolarized
areas). This may permit an even wider range of images to be
combined to create an HDR image. "Total exposure" may be achieved by
varying any, all, or a combination of f-stop, exposure time and
filtering.
[0064] Likewise, these images may be combined to create a variety
of effects. As one example, the variance in light intensity between
images may cause certain objects in the first image to be more
detailed than those objects are in the second image, and vice
versa. As one example, if the images show a person standing in
front of a sunlit window, the person may be dark and details of the
person may be lost in the first image, since the light from the
window will overpower the person. However, in the second image, the
person may be more visible and detailed but the window may appear
dull and/or details of the space through the window may be lost.
The two images may easily be combined, such that the portion of the
second image showing the person is overlaid into the identical
space in the first image. Thus, a single image with both the person
and background in detail may be created. Because there is little or
no pixel offset, these types of substitutions may be relatively
easily accomplished.
[0065] Further, since the embodiment may determine image depths and
surface orientations, HDR images may be further enhanced. Many HDR
images suffer from "halos" or bleeding effects around objects. This
is typically caused by blending together the multiple images into
the HDR image and attempting to normalize for abrupt changes in
color or luminance caused by the boundary between an object and the
background, or between two objects. Because traditional images lack
depth and surface orientation data, they cannot distinguish between
objects. Thus, the HDR process creates a visual artifact around
high contrast boundaries.
[0066] Since embodiments disclosed herein can effectively map
objects in space and obtain surface information about these
objects, boundaries of the objects may be determined prior to the
HDR process. Thus, the HDR process may be performed not on a
per-image basis, but on a per-object basis within the image.
Likewise, backgrounds may be treated and processed separately from
any objects in the image. Thus, the halo/bleeding effects may be
reduced or removed entirely, since the HDR process is not
attempting to blend color and/or luminance across two discrete
elements in an image.
[0067] Further, it should be appreciated that certain embodiments
may employ a polarizing filter having non-checkered and/or
asymmetric patterns. As one example, a polarizing filter may be
associated with an image sensor such that the majority of pixels
receive polarized light. This permits the image sensor to gather
additional surface normal data but at a cost of luminance, and
potentially chrominance, information. Non-polarized sections of the
filter may overlay a certain number of pixels. For example, every
fifth, 10th, 20th, and so on, pixel may receive unpolarized
light. In this fashion, the data captured by the pixels receiving
unpolarized light may be used to estimate and enhance the
luminance/chrominance information captured by the pixels underlying
polarized portions of the filter. Essentially, the unpolarized
image data may be used to correct the polarized image data. An
embodiment may, for example, create a curve fitted to the image data
captured by the unpolarized pixels. This curve may be applied to
data captured by pixels underlying the polarized sections of the
filter and the corresponding polarized data may be fitted to the
curve. This may improve the luminance and/or chrominance of the
overall image.
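A minimal sketch of that correction follows, assuming a first-order (gain and offset) curve fitted to paired polarized/unpolarized samples; the application does not specify the curve model.

```python
import numpy as np

def fit_correction(polarized_samples, unpolarized_samples):
    """Fit a simple gain/offset curve mapping polarized readings to nearby
    unpolarized readings (a first-order polynomial is an assumption)."""
    return np.polyfit(polarized_samples, unpolarized_samples, deg=1)

def apply_correction(coeffs, polarized_plane):
    """Apply the fitted curve to every pixel behind a polarized subfilter."""
    return np.polyval(coeffs, polarized_plane)
```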
[0068] Likewise, the polarization filter may vary in any fashion
that facilitates processing image data captured by the pixels of
the sensor array. For example, the following array shows a sample
polarization filter, where the first letter indicates if the pixel
receives polarized or clear/unpolarized light (designated by a "P"
or a "C," respectively) while the second letter indicates the
wavelength to which the particular pixel is sensitive/filtered ("R"
for red, "G" for green and "B" for blue):
TABLE-US-00001
PR PG PR PG CR CG PR PG PR PG
PG PB PG PB CG CB PG PB PG PB
CR CG CR CG CR CG CR CG CR CG
CG CB CG CB CG CB CG CB CG CB
PR PG PR PG CR CG PR PG PR PG
PG PB PG PB CG CB PG PB PG PB
[0069] The exact pattern may be varied to enhance, facilitate,
and/or speed up image processing. Thus, the polarization pattern
may vary according to the purpose of the image sensor, the software
with which it is used, and the like. It should be appreciated that
the foregoing array is an example only. Checkerboard or other
repeating patterns need not be used; certain embodiments may use
differing patterns depending on the end application or result
desired.
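Purely for illustration, the sample filter array above could be encoded programmatically so that downstream processing knows, for each pixel, whether it received polarized light and which color channel it samples; the names and tiling factor below are hypothetical and the pattern could be replaced with any of the variants described above.

```python
import numpy as np

# Minimal encoding of the sample filter array from TABLE-US-00001.
# First letter: 'P' = polarized, 'C' = clear; second letter: color channel.
PATTERN = [
    "PR PG PR PG CR CG PR PG PR PG",
    "PG PB PG PB CG CB PG PB PG PB",
    "CR CG CR CG CR CG CR CG CR CG",
    "CG CB CG CB CG CB CG CB CG CB",
    "PR PG PR PG CR CG PR PG PR PG",
    "PG PB PG PB CG CB PG PB PG PB",
]

cells = np.array([row.split() for row in PATTERN])     # (6, 10) cell codes
polarized_mask = np.char.startswith(cells, "P")        # True where pixels see polarized light
color_channel = np.char.strip(cells, "PC")             # 'R', 'G', or 'B' per pixel

# Tile the pattern across a hypothetical 60 x 100 pixel sensor region.
sensor_polarized = np.tile(polarized_mask, (10, 10))
```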
[0070] The ability to extract both surface detail and depth
information for objects in an image can extend the performance of
the three-dimensional imaging apparatus in a number of ways. For
example, the depth information may be used to derive size
information for the objects in the image. Accordingly, the
three-dimensional imaging apparatus may be capable of
differentiating a large object positioned far away from the camera
from a small object positioned close to the camera and having the
same shape as the large object.
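A minimal sketch of this size-from-depth reasoning, assuming a simple pinhole-camera model with a known focal length expressed in pixel units (the function name and numbers are illustrative):

```python
def object_width_from_depth(pixel_width, depth_m, focal_length_px):
    """Estimate the physical width of an object from its apparent width in
    pixels and its recovered depth, using a pinhole-camera model."""
    return pixel_width * depth_m / focal_length_px

# Two objects that each span 200 pixels look identical in the image, but the
# recovered depths reveal a tenfold difference in physical size.
near = object_width_from_depth(200, 1.0, 1000.0)   # 0.2 m wide at 1 m
far = object_width_from_depth(200, 10.0, 1000.0)   # 2.0 m wide at 10 m
```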
[0071] In another embodiment, the three-dimensional imaging
apparatus may be used for gaze detection or eye tracking. Existing
gaze detection techniques require transmitting infrared (IR) waves
to an individual's retina and sensing the reflected infrared waves
with a camera to determine the location of the individual's pupil
and lens. Such techniques can be inaccurate, since an individual's
retina is located behind the lens and cornea and may not be readily
detectable. This technique can be improved by utilizing surface
detail and depth information to measure the flatness of an
individual's lens and determine the location of the lens with
respect to the individual's eye.
[0072] In another embodiment, the three-dimensional imaging
apparatus may be used to enhance imaging of partially obscured
scenes. For example, the three-dimensional sensing device may
determine the surface detail associated with an unobscured object,
and save this information, for example, to a memory device. The
saved surface information can be retrieved and superimposed into
images in which the object is otherwise obscured. One example of
such a situation is when an object is partially obscured by fog. In
this situation, the image can be artificially enhanced by
superimposing the object into the scene using the saved surface
detail information.
[0073] In another embodiment, the three-dimensional imaging
apparatus may be used to artificially reposition the light source
within an image. Alternatively or in addition, the intensity of the
light source may be adjusted to brighten or darken objects in the
image. In this embodiment, the surface detail information, which
can include the curvature and orientation of the surfaces of an
object, may be used to calculate the position and/or intensity of
the light source in the image. Accordingly, the light source may be
virtually moved to a different position, or the intensity of the
light source can be changed. Additionally, the surface detail
information may further be used to calculate shadow positions that
would naturally appear due to repositioning or changing the
intensity of a light source.
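One hedged way to illustrate virtual relighting from surface orientation data is a simple Lambertian shading model, in which brightness follows the angle between each surface normal and the new light direction; the sketch below assumes per-pixel unit normals and a reflectance map are available, and it is not the specific method of this disclosure.

```python
import numpy as np

def relight(albedo, normals, light_dir, intensity=1.0, ambient=0.1):
    """Re-shade an image under a virtually repositioned light source using a
    Lambertian model.

    albedo:    per-pixel reflectance, shape (H, W) or (H, W, 3)
    normals:   per-pixel unit surface normals, shape (H, W, 3)
    light_dir: new light direction as a 3-vector (pointing toward the light)
    """
    l = np.asarray(light_dir, dtype=np.float64)
    l = l / np.linalg.norm(l)
    n_dot_l = np.clip(normals @ l, 0.0, None)        # cosine term per pixel
    shading = ambient + intensity * n_dot_l
    if albedo.ndim == 3:
        shading = shading[..., None]                 # broadcast over color channels
    return np.clip(albedo * shading, 0.0, 1.0)
```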
[0074] In related embodiments, the three-dimensional imaging
apparatus may be used to alter the balance of light within the
image from different light sources. For example, the image may
include light from two different light sources, including, but not
limited to, natural light, florescent light, incandescent light,
and so on and so forth. The different types of light may have
different polarization signatures as they are reflected off the
surfaces of objects. This may allow for calculating both the
position of the light sources, as discussed above, as well as for
identifying the light sources that are impacting on surfaces of
objects in an image. Accordingly, the intensity of each light
source, as well as the positions of the light sources, may be
manipulated by a user to balance the light sources in the image
according to the user's preference.
[0075] Similarly, the three-dimensional imaging apparatus may be
configured to remove unwanted visual effects caused by light
sources within the image, such as, but not limited to, glare and
gloss. Glare and gloss may be caused by the presence of a large
luminance ratio between the surface of a captured object and the
glare source, which may be sunlight or an artificial light source.
To remove areas of glare and gloss in the image, the
three-dimensional imaging apparatus may use the surface detail
information to determine the position and/or intensity of the light
source in the image, and alter the parameters of the light source
to remove or reduce the amount of glare and gloss caused by the
light source. For example, the light source may be virtually
repositioned or dimmed so that the amount of light impacting on a
surface is reduced.
[0076] In another related embodiment, the three-dimensional imaging
apparatus may further be used to artificially reposition and/or
remove objects within an image. In this embodiment, the surface
detail information may be used to calculate the position and/or
intensity of the light source in the image. Once the position of
the light source is obtained, the imaging apparatus may calculate
shadow positions of the objects after they have been moved based on
the calculated position of the light source, as well as the surface
detail information. The depth information may be used to calculate
the size of objects within the image, so that the objects may be
appropriately sized as they are virtually positioned further or
closer to the camera.
[0077] In a further embodiment, the three-dimensional imaging
apparatus may be used to artificially modify the shape of objects
within the image. For example, the surface detail information may
be calculated and modified according to various parameters input by
a user. As discussed above, the surface detail information may be
used to calculate the position and/or intensity of the light source
within the image, and corresponding shadow positions may be
calculated according to the modified surface orientations.
[0078] In another embodiment, the three-dimensional imaging
apparatus may be used for recognizing facial gestures. Facial
gestures may include, but are not limited to, smiling, grimacing,
frowning, winking, and so on and so forth. In one embodiment, this
may be accomplished by using surface geometry data to detect the
orientation of various facial features, such as the mouth, eyes,
nose, forehead, and cheeks, and correlating the detected
orientations with various gestures. The gestures may then be
correlated to various emotions associated with the gestures to
determine the emotion of an individual in an image.
[0079] In another embodiment, the three-dimensional imaging
apparatus may be used to scan an object, for example, to create a
three-dimensional model of the object. This may be accomplished by
taking multiple photographs or video of the object while the object
is rotated. As the object is rotated, the image
sensing device may capture more of the surface geometry and use the
geometry to create a three-dimensional model of the object. In
another related embodiment, multiple photographs or video may be
taken while the image sensing device is moved relative to the
object, and used to construct a three-dimensional model of the
objects within the captured image(s). For example, a user may take
video of a home while walking through the home and the image
sensing device could use the calculated depth and surface detail
information to create a three-dimensional model of the home. The
depth and surface detail information of multiple photographs or
video stills may then be matched to construct a seamless composite
three-dimensional model that combines the surface detail and depth
from each of the photos or video.
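A minimal sketch of combining per-view geometry into one model, assuming each view supplies the 3-D points recovered from its depth data together with a camera pose obtained, for example, by matching the overlapping surface detail between views; the data layout and function name are illustrative.

```python
import numpy as np

def merge_scans(scans):
    """Fuse per-view depth data into a single point cloud in a shared world
    frame.

    scans: iterable of (points, R, t) where points is (N, 3) in the camera
           frame, R is a (3, 3) rotation, and t is a (3,) translation giving
           that view's camera-to-world pose
    """
    world_points = []
    for points, R, t in scans:
        world_points.append(points @ R.T + t)   # camera frame -> world frame
    return np.concatenate(world_points, axis=0)
```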
[0080] In another embodiment, the three-dimensional imaging
apparatus may be used to correct geometric distortions in a two- or
three-dimensional image. For example, if an image is taken using a
fish-eye lens, the image processing module may warp the image so
that it appears undistorted. In one embodiment, this may be
accomplished by using the surface detail information to recognize
various objects in the distorted image and warping the image based
on saved surface detail information of the undistorted object.
Accordingly, the three-dimensional imaging apparatus may recognize
an object in the image, such as a table, using image recognition
techniques, and calculate the distance of the table in the captured
scene from the imaging apparatus. The three-dimensional imaging
apparatus may then substitute a saved image of the object from
another image, position the image of the object into the image of
the scene at the calculated depth, and modify the image of the
object so that it is properly scaled within the image of the
scene.
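For context only, a conventional lens-model-based correction, distinct from the object-substitution approach described above, remaps each output pixel back through an assumed equidistant fish-eye model; the model choice, the focal-length parameter, and the nearest-neighbor sampling are simplifying assumptions.

```python
import numpy as np

def undistort_fisheye(image, focal_px):
    """Warp an equidistant fish-eye image toward a rectilinear projection by
    remapping each output pixel back to its fish-eye source location
    (nearest-neighbor sampling for brevity)."""
    H, W = image.shape[:2]
    cy, cx = (H - 1) / 2.0, (W - 1) / 2.0
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    dx, dy = xs - cx, ys - cy
    r_rect = np.hypot(dx, dy)                  # radius in the undistorted output
    theta = np.arctan2(r_rect, focal_px)       # viewing angle for each output pixel
    r_fish = focal_px * theta                  # equidistant model: r = f * theta
    scale = np.ones_like(r_rect)
    np.divide(r_fish, r_rect, out=scale, where=r_rect > 0)
    src_x = np.clip(np.round(cx + dx * scale), 0, W - 1).astype(int)
    src_y = np.clip(np.round(cy + dy * scale), 0, H - 1).astype(int)
    return image[src_y, src_x]
```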
[0081] In still another embodiment, the surface normal data may be
used to construct a stereo disparity map. Most stereo disparity
mapping systems look for repeating patterns of pixels and, based on
the offset of the repeating pattern between images, assign the
pattern a particular distance from the sensor. This may be crude
since objects may be differently aligned with respect to one
another when the image is captured by a first sensor and a second
offset sensor. That is, the offset of the first and second sensors
may cause objects to appear to have a different spatial
relationship to one another, and thus the pixels representing those
objects may vary between images.
[0082] However, the surface normals of each object should appear
the same to each image sensor, so long as the sensors are coplanar.
Thus, by comparing surface normals (as received and recorded
through the polarized filters) detected by each image sensor
against one another, the stereo mapping may be enhanced and
refined. Once a pixel match or near-match is determined, the
surface normals may be compared. If the surface normals differ,
then the pixels may represent different objects in each image and
depth information may be difficult or impossible to assign. If the
surface normals match, then the pixels represent the same object(s)
in each image and a depth may be more definitively determined and
assigned.
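A hedged sketch of this normal-checked matching, assuming rectified grayscale images and per-pixel unit normals recorded through each sensor's polarized filter; the brute-force block matching, window size, and agreement tolerance are illustrative parameters, not values given in this disclosure.

```python
import numpy as np

def normal_checked_disparity(left, right, normals_left, normals_right,
                             max_disparity=64, patch=3, normal_tol=0.95):
    """Block-matching disparity search that accepts a candidate match only if
    the surface normals seen by the two coplanar sensors also agree, helping
    reject matches between different objects that merely share a pixel
    pattern. Returns -1 where no trustworthy depth can be assigned."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    H, W = left.shape
    half = patch // 2
    disparity = np.full((H, W), -1, dtype=np.int32)
    for y in range(half, H - half):              # brute-force loops for clarity
        for x in range(half, W - half):
            ref = left[y - half:y + half + 1, x - half:x + half + 1]
            best_d, best_cost = -1, np.inf
            for d in range(0, min(max_disparity, x - half) + 1):
                cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
                cost = np.sum((ref - cand) ** 2)
                # Require agreeing surface normals before trusting the match.
                agree = np.dot(normals_left[y, x], normals_right[y, x - d])
                if agree >= normal_tol and cost < best_cost:
                    best_d, best_cost = d, cost
            disparity[y, x] = best_d
    return disparity
```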
[0083] The foregoing represent certain embodiments, systems and
methods and are intended to be examples only. Accordingly, the
proper scope of protection should not be limited by any of the
foregoing examples.
* * * * *