U.S. patent application number 12/274233 was filed with the patent office on 2008-11-19 and published on 2013-05-23 as publication number 20130128056 for estimating sensor sensitivity. The applicants listed for this patent are Jason C. Chuang and Holger Winnemoeller. Invention is credited to Jason C. Chuang and Holger Winnemoeller.
United States Patent Application 20130128056
Kind Code: A1
Chuang; Jason C.; et al.
May 23, 2013
ESTIMATING SENSOR SENSITIVITY
Abstract
A method, system, and computer-readable storage medium for
determining an estimate of sensor sensitivity associated with an
image. A noise level of an image is determined, and an estimate of
the sensor sensitivity associated with the image is then
automatically determined, e.g., by a trained classifier based on the
determined noise level. Additionally, the sensor sensitivity estimate can be
used to determine scene brightness.
Inventors: Chuang; Jason C. (Stanford, CA); Winnemoeller; Holger (Seattle, WA)
Applicant:
Name | City | State | Country
Chuang; Jason C. | Stanford | CA | US
Winnemoeller; Holger | Seattle | WA | US
Family ID: 48426461
Appl. No.: 12/274233
Filed: November 19, 2008
Related U.S. Patent Documents
Application Number: 61098542
Filing Date: Sep 19, 2008
Current U.S. Class: 348/187; 348/E17.001; 382/100; 382/305
Current CPC Class: G06K 9/6267 20130101; H04N 17/002 20130101; G06K 9/40 20130101; H04N 5/2351 20130101; G06K 9/209 20130101
Class at Publication: 348/187; 382/100; 382/305; 348/E17.001
International Class: H04N 17/00 20060101 H04N017/00; G06K 9/54 20060101 G06K009/54; G06K 9/00 20060101 G06K009/00
Claims
1. A computer-implemented method, comprising: de-noising an image
to generate a de-noised image from the image; determining a noise
signature of the image, wherein said determining the noise
signature is based on differences between the image and the
de-noised image; and automatically determining an estimate of
sensor sensitivity associated with the image based on the
determined noise signature, wherein the sensor sensitivity is of a
sensor used to capture the image.
2. The computer-implemented method of claim 1, wherein said
determining a noise signature of the image comprises: de-noising
the image, thereby generating a de-noised image; decomposing each
of the image and the de-noised image into respective and
corresponding pluralities of patches; determining at least one
respective patch descriptor for each patch of the image based on
the patch of the image and the corresponding patch of the de-noised
image, wherein the patch descriptor characterizes noise in the
patch of the image by summarizing differences between the patch of
the image and the corresponding patch of the de-noised image; and
creating the noise signature for the image based on the patch
descriptors.
3. The computer-implemented method of claim 2, wherein said
decomposing each of the image and the de-noised image into
respective and corresponding pluralities of patches comprises
partitioning each image multiple times with respective different
patch sizes.
4. The computer-implemented method of claim 2, wherein said
determining at least one respective patch descriptor for each patch
of the image comprises determining a patch descriptor for each
patch per color channel.
5. The computer-implemented method of claim 2, wherein the
respective patch descriptor further includes a texture descriptor
that characterizes texture of the patch.
6. The computer-implemented method of claim 5, wherein said
creating the noise signature for the image based on the patch
descriptors comprises: discarding descriptors of patches with high
frequency texture, thereby leaving a subset of the patches with
little or no high frequency texture; and creating the noise
signature based on the subset of the patches.
7. The computer-implemented method of claim 2, wherein said
creating the noise signature for the image based on the patch
descriptors comprises: sorting the patch descriptors into bins
based on intensity, color channel, and/or texture; sorting the
patch descriptors within each bin based on amount of pixel
alteration; selecting the patch descriptors with the lowest pixel
alteration within each bin, wherein the amount of pixel alteration
indicates the amount of noise in the corresponding patch; and
creating the noise signature based on a weighted average pixel
alteration of the selected patch descriptors.
8. The computer-implemented method of claim 2, wherein said
automatically determining an estimate of sensor sensitivity
associated with the image based on the determined noise signature
comprises: providing the noise signature for the image to a trained
classifier as input; and the trained classifier generating the
estimate of sensor sensitivity based on the noise signature.
9. The computer-implemented method of claim 8, further comprising
training the classifier prior to said trained classifier generating
the estimate of sensor sensitivity, wherein said training the
classifier comprises: determining noise signatures for each of a
plurality of photographs for which sensor sensitivity is known; and
providing the determined noise signatures and corresponding sensor
sensitivity values to the classifier as training input.
10. The computer-implemented method of claim 8, wherein the
classifier comprises a plurality of classifiers, each directed to a
respective camera model or make.
11. The computer-implemented method of claim 1, further comprising:
automatically determining an estimate of scene brightness based on
the estimate of sensor sensitivity and metadata of the image,
wherein the metadata comprises aperture information, exposure time
information, and intensity information for the image; and storing
the estimate of scene brightness, wherein the estimate of scene
brightness is useable to categorize the image.
12. The computer-implemented method of claim 11, further
comprising: categorizing the image based on the estimate of scene
brightness, thereby determining a category for the image, wherein
the category for the image is useable to perform semantic based
image operations.
13. The computer-implemented method of claim 12, further
comprising: determining one or more keywords or tags for the image
based on the determined category, wherein the one or more keywords
or tags are useable to perform search, retrieval, and matching
operations with respect to the image.
14. A non-transitory computer-readable storage medium that stores
program instructions computer-executable to implement: de-noising
an image to generate a de-noised image from the image; determining
a noise signature of the image, wherein said determining the
noise signature is based on differences between the image and the
de-noised image; and automatically determining an estimate of
sensor sensitivity associated with the image based on the
determined noise signature, wherein the sensor sensitivity is of a
sensor used to capture the image.
15. The non-transitory computer-readable storage medium of claim
14, wherein said determining a noise signature of the image
comprises: de-noising the image, thereby generating a de-noised
image; decomposing each of the image and the de-noised image into
respective and corresponding pluralities of patches; determining at
least one respective patch descriptor for each patch of the image
based on the patch of the image and the corresponding patch of the
de-noised image, wherein the patch descriptor characterizes noise
in the patch of the image by summarizing differences between the
patch of the image and the corresponding patch of the de-noised
image; and creating the noise signature for the image based on the
patch descriptors.
16. The non-transitory computer-readable storage medium of claim
15, wherein said decomposing each of the image and the de-noised
image into respective and corresponding pluralities of patches
comprises partitioning each image multiple times with respective
different patch sizes.
17. The non-transitory computer-readable storage medium of claim
15, wherein said determining at least one respective patch
descriptor for each patch of the image comprises determining a
patch descriptor for each patch per color channel.
18. The non-transitory computer-readable storage medium of claim
15, wherein the respective patch descriptor further includes a
texture descriptor that characterizes texture of the patch.
19. The non-transitory computer-readable storage medium of claim
18, wherein said creating the noise signature for the image based
on the patch descriptors comprises: discarding descriptors of
patches with high frequency texture, thereby leaving a subset of
the patches with little or no high frequency texture; and creating the noise
signature based on the subset of the patches.
20. The non-transitory computer-readable storage medium of claim
15, wherein said creating the noise signature for the image based
on the patch descriptors comprises: sorting the patch descriptors
into bins based on intensity, color channel, and/or texture;
sorting the patch descriptors within each bin based on amount of
pixel alteration; selecting the patch descriptors with the lowest
pixel alteration within each bin, wherein the amount of pixel
alteration indicates the amount of noise in the corresponding
patch; and creating the noise signature based on a weighted average
pixel alteration of the selected patch descriptors.
21. The non-transitory computer-readable storage medium of claim
15, wherein said automatically determining an estimate of sensor
sensitivity associated with the image based on the determined noise
signature comprises: providing the noise signature for the image to
a trained classifier as input; and the trained classifier
generating the estimate of sensor sensitivity based on the noise
signature.
22. The non-transitory computer-readable storage medium of claim
21, wherein the program instructions are further
computer-executable to train the classifier prior to said trained
classifier generating the estimate of sensor sensitivity, wherein
said training the classifier comprises: determining noise
signatures for each of a plurality of photographs for which sensor
sensitivity is known; and providing the determined noise signatures
and corresponding sensor sensitivity values to the classifier as
training input.
23. The non-transitory computer-readable storage medium of claim
21, wherein the classifier comprises a plurality of classifiers,
each directed to a respective camera model or make.
24. The non-transitory computer-readable storage medium of claim
14, wherein the program instructions are further
computer-executable to implement: automatically determining an
estimate of scene brightness based on the estimate of sensor
sensitivity and metadata of the image, wherein the metadata
comprises aperture information, exposure time information, and
intensity information for the image; and storing the estimate of
scene brightness, wherein the estimate of scene brightness is
useable to categorize the image.
25. The non-transitory computer-readable storage medium of claim
24, wherein the program instructions are further
computer-executable to implement: categorizing the image based on
the estimate of scene brightness, thereby determining a category
for the image, wherein the category for the image is useable to
perform semantic based image operations.
26. The non-transitory computer-readable storage medium of claim
25, wherein the program instructions are further
computer-executable to implement: determining one or more keywords
or tags for the image based on the determined category, wherein the
one or more keywords or tags are useable to perform search,
retrieval, and matching operations with respect to the image.
27. A system, comprising: at least one processor; and a memory
coupled to the at least one processor, wherein the memory stores
program instructions, wherein the program instructions are
executable by the at least one processor to: de-noise an image to
generate a de-noised image from the image; determine a noise
signature of the image, wherein said determining the noise
signature is based on differences between the image and the
de-noised image; and automatically determine an estimate of sensor
sensitivity associated with the image based on the determined noise
signature, wherein the sensor sensitivity is of a sensor used to
capture the image.
28. The system of claim 27, wherein to determine a noise
signature of the image, the program instructions are
computer-executable to: de-noise the image, thereby generating a
de-noised image; decompose each of the image and the de-noised
image into respective and corresponding pluralities of patches;
determine at least one respective patch descriptor for each patch
of the image based on the patch of the image and the corresponding
patch of the de-noised image, wherein the patch descriptor
characterizes noise in the patch of the image by summarizing
differences between the patch of the image and the corresponding
patch of the de-noised image; and create the noise signature for
the image based on the patch descriptors.
29. The system of claim 28, wherein said decomposing each of the
image and the de-noised image into respective and corresponding
pluralities of patches comprises partitioning each image multiple
times with respective different patch sizes.
30. The system of claim 28, wherein to determine at least one
respective patch descriptor for each patch of the image, the
program instructions are computer-executable to determine a patch
descriptor for each patch per color channel.
31. The system of claim 28, wherein the respective patch descriptor
further includes a texture descriptor that characterizes texture of
the patch.
32. The system of claim 31, wherein to create the noise signature
for the image based on the patch descriptors, the program
instructions are computer-executable to: discard descriptors of
patches with high frequency texture, thereby leaving a subset of
the patches with little or no high frequency texture; and create the noise
signature based on the subset of the patches.
33. The system of claim 28, wherein to create the noise signature
for the image based on the patch descriptors, the program
instructions are computer-executable to: sort the patch descriptors
into bins based on intensity, color channel, and/or texture; sort
the patch descriptors within each bin based on amount of pixel
alteration; select the patch descriptors with the lowest pixel
alteration within each bin, wherein the amount of pixel alteration
indicates the amount of noise in the corresponding patch; and
create the noise signature based on a weighted average pixel
alteration of the selected patch descriptors.
34. The system of claim 28, wherein to automatically determine an
estimate of sensor sensitivity associated with the image based on
the determined noise signature, the program instructions are
computer-executable to: provide the noise signature for the image
to a trained classifier as input, wherein the trained classifier
generates the estimate of sensor sensitivity based on the noise
signature.
35. The system of claim 34, wherein the program instructions are
further executable to train the classifier prior to said trained
classifier generating the estimate of sensor sensitivity, wherein
to train the classifier the program instructions are
computer-executable to: determine noise signatures for each of a
plurality of photographs for which sensor sensitivity is known; and
provide the determined noise signatures and corresponding sensor
sensitivity values to the classifier as training input.
36. The system of claim 34, wherein the classifier comprises a
plurality of classifiers, each directed to a respective camera
model or make.
37. The system of claim 27, wherein the program instructions are
further executable to: automatically determine an estimate of scene
brightness based on the estimate of sensor sensitivity and metadata
of the image, wherein the metadata comprises aperture information,
exposure time information, and intensity information for the image;
and store the estimate of scene brightness, wherein the estimate of
scene brightness is useable to categorize the image.
38. The system of claim 37, wherein the
program instructions are further executable to: categorize the
image based on the estimate of scene brightness, thereby
determining a category for the image, wherein the category for the
image is useable to perform semantic based image operations.
39. The system of claim 38, wherein the
program instructions are further executable to: determine one or
more keywords or tags for the image based on the determined
category, wherein the one or more keywords or tags are useable to
perform search, retrieval, and matching operations with respect to
the image.
40. A computer-implemented method, comprising: executing
instructions on a computing platform, the instructions for
de-noising an image so that a de-noised image is generated from the
image; executing instructions on the computing platform so that
binary digital electronic signals representing a noise signature of
the image are determined, wherein said determined noise signature
is based on differences between the image and the de-noised image;
executing instructions on the computing platform so that binary
digital electronic signals representing an estimate of sensor
sensitivity associated with the image are determined based on the
noise signature, wherein the sensor sensitivity is of a sensor used
to capture the image; and storing an indication of the determined
estimate of sensor sensitivity.
Description
BACKGROUND
[0001] 1. Field of the Invention
[0002] The present invention is directed generally to digital image
processing, and more particularly to estimating sensor sensitivity,
e.g., from a single image. The application of the estimated sensor
sensitivity to estimating scene brightness is also described.
[0003] 2. Description of the Related Art
[0004] Digital images may include raster graphics, vector graphics,
or a combination thereof. Raster graphics data (also referred to
herein as bitmaps) may be stored and manipulated as a grid of
individual picture elements called pixels. A bitmap may be
characterized by its width and height in pixels and also by the
number of bits per pixel. Commonly, a color bitmap defined in the
RGB (red, green, blue) color space may comprise between one and 16
bits per pixel for each of the red, green, and blue channels. An
alpha channel may be used to store additional data such as
per-pixel transparency values. Vector graphics data may be stored
and manipulated as one or more geometric objects built with
geometric primitives. The geometric primitives (e.g., points,
lines, polygons, Bezier curves, and text characters) may be based
upon mathematical equations to represent parts of digital
images.
[0005] FIG. 1 illustrates the general operation of a digital camera
at a high level. One or more sources of illumination, i.e., an
illuminant, emit light, which bounces off objects in a scene. Some
of this light enters the lens of the camera, passing through the
camera's aperture. The aperture is open (with fixed aperture size) for a
finite amount of time, as indicated in FIG. 1. During this time,
the light passing through the aperture reaches the camera's image
sensor, which has a specified sensor size, as shown, where it is
integrated and converted into a digital image, depending on the
sensor's sensitivity setting (ISO 12232:2006 value, referred to
herein as "ISO" or "ISO value" for convenience).
[0006] Scene brightness is the amount of light illuminating a given
scene. Consideration of scene brightness may allow one to guess
whether a picture was taken indoors vs. outdoors because the
intensity of sunlight is several orders of magnitude greater than
that of any artificial light. Knowing whether a picture was taken
indoors vs. outdoors, together with other content-dependent
information, can help determine the more specific environment for
the photograph.
[0007] Scene brightness can be directly measured at the time a
picture is taken by placing a light meter on a subject. For most
nonprofessional photographs, however, such measurement is not
performed. Instead, scene brightness is commonly calculated a
posteriori from a photograph, based on several camera parameters at
the time of exposure. More specifically, current scene brightness
(B) computation relies on four parameters: aperture (f), exposure
time (t), sensor sensitivity (S), and pixel intensity in the image
(I). While it is possible to reconstruct scene brightness a
posteriori, a significant portion of real-life photographs is
missing the required sensor sensitivity information (S). Modern
digital cameras store shooting parameters in the EXIF metadata
fields of a digital photograph. The parameters aperture (f) and
exposure time (t) are readily available for almost all images with
EXIF data. Pixel intensity (I) is stored in the pixel data and is
thus always available.
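To make this computation concrete, the following minimal sketch derives a brightness estimate from the four parameters. It assumes an APEX-style reflected-light exposure relation, B ≈ K·f²·I/(t·S); the formula, the calibration constant K, and the function name are illustrative assumptions, not a specification from this application.

```python
def estimate_scene_brightness(f_number, exposure_time, iso, mean_intensity, k=12.5):
    """Illustrative scene brightness estimate.

    Assumes the standard reflected-light exposure relation
    B ~ K * f^2 / (t * S), scaled by the normalized mean pixel
    intensity I in [0, 1]; the exact formula is an assumption,
    not taken from this application.
    """
    return k * (f_number ** 2) * mean_intensity / (exposure_time * iso)

# Example: f/4, 1/60 s, ISO 400, mid-gray image (I = 0.5)
brightness = estimate_scene_brightness(4.0, 1.0 / 60, 400, 0.5)
```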
[0008] Unfortunately, the same is not true for the sensor
sensitivity (S) parameter. Various camera manufacturers record
sensor sensitivity information (commonly referred to as the ISO
value) in proprietary formats and in nonstandard EXIF fields, which
can be corrupted or discarded by third party image editing or
transfer software. For example, only a portion of photographs in
typical photo databases or archives have ISO values recorded in
their metadata. Current algorithms either cannot proceed with the
calculation of scene brightness at all, or insert a constant for
the ISO value in their equations, which may lead to low-quality
brightness estimates that are not sufficient for many
applications.
SUMMARY
[0009] Various embodiments of systems, methods, and
computer-readable storage media for estimating sensor sensitivity
from a single image are presented.
[0010] A noise signature of an image may be determined. In other
words, the image (i.e., image data) may be analyzed to determine
the degree of noise in the image, and a signature or metric
characterizing the degree of noise determined for the image. In one
embodiment, the image may be de-noised, thereby generating a
de-noised image. In other words, a digital signal recorded from a
camera's image sensor is assumed to be corrupted by noise, thus
deviating from a hypothetical ideal (noiseless) image. To
approximate this ideal image, small deviations in the input signal
may be removed or eliminated via an appropriate filter. However, it
should be noted that this is but one exemplary way to produce the
de-noised image. More generally, the idea of de-noising the image
is to produce an ideal image free of noise (or an approximation of
one), i.e., a substantially noise-free image, which may be
accomplished via any of a variety of ways, e.g., via a generative
model. In other words, the de-noised image may be produced
independently of the original image. Note, however, that for
simplicity and clarity, the term "de-noised image" is used herein
to refer to such a substantially noise-free image, regardless of
the manner in which it was obtained or generated.
[0011] In one embodiment, both the original image and the de-noised
image may be decomposed into respective and corresponding
pluralities of patches. In other words, each image may be
decomposed into a respective plurality of pixel patches, where each
patch from the original image has a corresponding patch from the
de-noised image, thus forming corresponding pairs of patches. In
one embodiment, patches of various sizes may be generated, e.g.,
square patches of sizes 4×4, 8×8, and 16×16
pixels, and so each image may be partitioned or decomposed multiple
times, once per patch size. Said another way, decomposing each of
the image and the de-noised image into respective and corresponding
pluralities of patches may include partitioning each image multiple
times with respective different patch sizes. Computing patches at
several different rectangle sizes enables the method to obtain
cumulative information at multiple resolutions. In some
embodiments, neighboring patches may overlap, e.g., by half the
patch width, and so each portion of the image may be represented in
more than one patch of a given size, and may also be represented in
multiple patches of different size. Note that the spatial
arrangement of patches is irrelevant to the present method, and so
all patches (of all sizes) may be treated as a single long list of
patches. Note further that, generally, decomposing the image(s) may
produce a large number of patches, e.g., on the order of 800,000
patches for typical digital images of 2-5 Megapixels, although with
smaller or larger images, and different sized patches, different
numbers of patches may be produced.
[0012] A descriptor may be determined for each patch in the image
based on the patch of the image and the corresponding patch of the
de-noised image, where the descriptor characterizes noise in the
patch of the image by summarizing the patch's pixel intensity and
pixel alteration due to de-noising. In one embodiment, a descriptor
(for each patch) may be determined for each of a plurality of image
attributes. For example, in some embodiments, a separate descriptor
may be determined per color channel of the image, e.g., for each of
R(ed), G(reen), and B(lue) color channels of the image, although it
should be noted that other color models may be used as desired,
e.g., CMYK (Cyan, Magenta, Yellow, Black), HSV (Hue, Saturation,
and Value), and/or HSL (Hue, Saturation, and Lightness (or
Luminance)), among others.
[0013] Moreover, in some embodiments, texture may also be included
as a characterizing feature of the images, e.g., of the image
patches. As used herein, "texture" refers to random, stochastic, or
uncorrelated high-frequency variations in an image, e.g.,
exemplified by images of snow, sand, grass, forests, etc. Note that
sometimes such textures may be confused with noise, and so
including texture information in the characterization of a patch
may improve the quality of the patch descriptors.
[0014] Thus, a plurality of patch descriptors may be determined
that characterize differences between corresponding patches of the
original image and the de-noised image, and thus that characterize
noise in the patches of the (original) image. Each patch descriptor
may describe the noise in its patch, including, but not limited to,
the average difference of pixels between the noisy and de-noised
image. Additional entries in such a descriptor (e.g.,
vector) may include a texture descriptor that characterizes or
classifies texture of the patch, amongst other metrics.
[0015] Then, a noise signature for the image may be created based
on the plurality of patch descriptors. There are numerous ways the
noise signature may be built. For example, in one embodiment, a
subset of the patches, or more accurately, patch descriptors, may
be selected and used to build a noise signature for the image.
[0016] The patches (or patch descriptors) may then be divided or
sorted by their intensity, color channel, and/or texture. Each
patch descriptor may be assigned to a bin with corresponding
intensity level, color channel, and/or texture. Within each bin,
the patches may be sorted by the amount of pixel alteration, and
the descriptors with the lowest noise values selected. For example,
in one embodiment, multiple groups of various sizes of the lowest
noise descriptors may be selected to avoid being influenced unduly
by misleading statistical deviations in the image data. For
example, in one exemplary embodiment, the 25, 40, and 80 lowest
noise valued descriptors may be selected, although in other
embodiments, other numbers of groups (including one) and thresholds
may be used as desired. The noise signature may be defined (and
created) as the weighted average pixel alteration of these 25, 40,
and 80 patches, weighted so that patches with less noise are given
greater weight. Any of a number of different approaches may be
taken in weighting the patches (or descriptors), the point being to
capture the minimum pixel alteration in each of the bins.
[0017] Thus, in some embodiments, the noise signature may comprise
a set of discrete noise values or descriptors correlated to one or
more image attributes or parameters, e.g., intensity level, color
channel, and/or texture, and thus may be or include a (possibly
multi-dimensional) noise profile with respect to these
parameters.
[0018] An estimate of sensor sensitivity associated with the image
may be automatically determined based on the determined noise
signature. For example, the noise signature may be provided as an
input to a trained classifier, e.g., a support vector machine,
neural network, etc., which may then operate to estimate the sensor
sensitivity. In other words, a trained classifier may take the
noise signature as input, e.g., as input features, and may generate
a corresponding sensor sensitivity value, where the sensor
sensitivity value is a "prediction" or estimate of the sensor
sensitivity of the camera that produced the image. In one
embodiment, the sensor sensitivity value may take the form of an
ISO sensitivity value, e.g., in accordance with ISO 12232:2006,
although any other representation of sensor sensitivity may be used
as desired. The classifier may be implemented in any of various
ways, e.g., as a support vector machine (SVM), a neural network,
etc., as desired. The predicted or estimated sensor sensitivity
value may be stored, e.g. in a memory medium, and/or output to a
display device or an external device, e.g. over a network. The
sensor sensitivity estimate may be used to perform further analysis
of the image.
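The overall flow just described can be summarized in a short sketch. This is a hedged outline, not the application's reference implementation; `denoise`, `compute_noise_signature`, and the classifier object with a scikit-learn-style `predict` method are hypothetical placeholders for the steps described above.

```python
import numpy as np

def estimate_iso(image: np.ndarray, denoise, compute_noise_signature, classifier):
    """End-to-end sketch: image -> noise signature -> estimated sensor sensitivity.

    The three callables are placeholders for the de-noising filter,
    the signature-building step, and the trained classifier.
    """
    denoised = denoise(image)  # substantially noise-free approximation
    signature = compute_noise_signature(image, denoised)
    return classifier.predict([signature])[0]  # predicted ISO value
```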
[0019] In one embodiment, a classifier may be built for each camera
model (and possibly camera make) using a collection of photographs
(e.g., random photographs taken with the same camera and retrieved
from a photographic archive). In other words, the classifier may
include a plurality of classifiers, each directed to a respective
camera model or make. When the sensitivity (e.g., ISO) value of an
image is missing, the classifier may then predict the image's
sensitivity (e.g., ISO) value based on its noise signature, as
described above.
[0020] In some embodiments, an estimate of scene brightness for the
image may be automatically determined based on the estimate of
sensor sensitivity and metadata of the image, where the metadata
includes aperture information, exposure time information, and
intensity information (for the image). The estimate for scene
brightness may then be stored in a memory medium, or output to an
external device, e.g., a monitor, printer, or other computer
system, e.g., over a network. The estimate of scene brightness for
the image may be used, e.g., in conjunction with other aspects of
the image, in any of numerous applications, one example of which is
to categorize or classify the image (or scene), e.g., for
subsequent use by search engines, e.g., for search and retrieval of
images based on semantic content of the image(s), among other
applications. For example, images may be categorized according to
scenario, e.g., beach, mountain, sea, mall, etc., where a
particular image may be identified as representing a particular
category.
[0021] Thus, various embodiments of the above methods may be used
to estimate sensor sensitivity associated with an image, e.g., to
generate an estimate of sensor sensitivity (e.g., ISO value) of the
camera that produced the image.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] FIG. 1 illustrates a process of capturing a digital
photograph;
[0023] FIG. 2 is a block diagram of an exemplary system configured
to implement embodiments of the present invention;
[0024] FIG. 3 illustrates an exemplary image analysis module,
according to one embodiment;
[0025] FIG. 4 is a flowchart illustrating a method for estimating
sensor sensitivity from a single image, according to one
embodiment;
[0026] FIG. 5 is a flowchart illustrating another embodiment of a
method for estimating sensor sensitivity from a single image;
[0027] FIG. 6 illustrates an exemplary process flow diagram
illustrating one embodiment of the method of FIG. 5; and
[0028] FIG. 7 is a flowchart of a method for determining scene
brightness based on an estimate of sensor sensitivity associated
with the image, according to one embodiment.
[0029] While the invention is susceptible to various modifications
and alternative forms, specific embodiments are shown by way of
example in the drawings and are herein described in detail. It
should be understood, however, that drawings and detailed
description thereto are not intended to limit the invention to the
particular form disclosed, but on the contrary, the invention is to
cover all modifications, equivalents and alternatives falling
within the spirit and scope of the present invention as defined by
the appended claims.
DETAILED DESCRIPTION OF EMBODIMENTS
[0030] Embodiments of the systems and methods described herein may
be used to estimate light sensor sensitivity from a single image.
Examples of applications for such an estimation of sensor
sensitivity include, but are not limited to, image classification
and camera classification, as well as scene or image decomposition
and analysis, e.g., for scientific or security applications, e.g.,
monitoring, surveillance, etc.
[0031] In the following detailed description, numerous specific
details are set forth to provide a thorough understanding of
claimed subject matter. However, it will be understood by those
skilled in the art that claimed subject matter may be practiced
without these specific details. In other instances, methods,
apparatuses or systems that would be known by one of ordinary skill
have not been described in detail so as not to obscure claimed
subject matter. Some portions of the detailed description are
presented in terms of algorithms or symbolic representations of
operations on data bits or binary digital signals stored within a
computing system memory, such as a computer memory. These
algorithmic descriptions or representations are examples of
techniques used by those of ordinary skill in the data processing
arts to convey the substance of their work to others skilled in the
art. An algorithm is here, and is generally, considered to be a
self-consistent sequence of operations or similar processing
leading to a desired result. In this context, operations or
processing involve physical manipulation of physical quantities.
Typically, although not necessarily, such quantities may take the
form of electrical or magnetic signals capable of being stored,
transferred, combined, compared or otherwise manipulated. It has
proven convenient at times, principally for reasons of common
usage, to refer to such signals as bits, data, values, elements,
symbols, characters, terms, numbers, numerals or the like. It
should be understood, however, that all of these and similar terms
are to be associated with appropriate physical quantities and are
merely convenient labels. Unless specifically stated otherwise, as
apparent from the following discussion, it is appreciated that
throughout this specification discussions utilizing terms such as
"processing," "computing," "calculating," "determining" or the like
refer to actions or processes of a computing platform, such as a
computer or a similar electronic computing device, that manipulates
or transforms data represented as physical electronic or magnetic
quantities within memories, registers, or other information storage
devices, transmission devices, or display devices of the computing
platform.
[0032] FIG. 2 is a block diagram illustrating constituent elements
of a computer system 200 that is configured to implement
embodiments of the systems and methods described herein. The
computer system 200 may include one or more processors 210
implemented using any desired architecture or chip set, such as the
SPARC™ architecture, an x86-compatible architecture from Intel
Corporation or Advanced Micro Devices, or any other architecture or
chipset capable of processing data. Any desired operating system(s)
may be run on the computer system 200, such as various versions of
Unix, Linux, Windows® from Microsoft Corporation, MacOS®
from Apple Inc., or any other operating system that enables the
operation of software on a hardware platform. The processor(s) 210
may be coupled to one or more of the other illustrated components,
such as a memory 220, by at least one communications bus.
[0033] In one embodiment, a specialized graphics card or other
graphics component 256 may be coupled to the processor(s) 210. The
graphics component 256 may include a graphics processing unit (GPU)
270, which in some embodiments may be used to perform at least a
portion of the techniques described below. Additionally, the
computer system 200 may include one or more imaging devices 252.
The one or more imaging devices 252 may include various types of
raster-based imaging devices such as monitors and printers. In one
embodiment, one or more display devices 252 may be coupled to the
graphics component 256 for display of data provided by the graphics
component 256.
[0034] In one embodiment, program instructions 240 that may be
executable by the processor(s) 210 to implement aspects of the
techniques described herein may be partly or fully resident within
the memory 220 at the computer system 200 at any point in time. The
memory 220 may be implemented using any appropriate medium such as
any of various types of ROM or RAM (e.g., DRAM, SDRAM, RDRAM, SRAM,
etc.), or combinations thereof. The program instructions may also
be stored on a storage device 260 accessible from the processor(s)
210. Any of a variety of storage devices 260 may be used to store
the program instructions 240 in different embodiments, including
any desired type of persistent and/or volatile storage devices,
such as individual disks, disk arrays, optical devices (e.g.,
CD-ROMs, CD-RW drives, DVD-ROMs, DVD-RW drives), flash memory
devices, various types of RAM, holographic storage, etc. The
storage 260 may be coupled to the processor(s) 210 through one or
more storage or I/O interfaces. In some embodiments, the program
instructions 240 may be provided to the computer system 200 via any
suitable computer-readable storage medium including the memory 220
and storage devices 260 described above.
[0035] The computer system 200 may also include one or more
additional I/O interfaces, such as interfaces for one or more user
input devices 250. In addition, the computer system 200 may include
one or more network interfaces 254 providing access to a network.
It should be noted that one or more components of the computer
system 200 may be located remotely and accessed via the network.
The program instructions may be implemented in various embodiments
using any desired programming language, scripting language, or
combination of programming languages and/or scripting languages,
e.g., C, C++, C#, Java™, Perl, etc. The computer system 200 may
also include numerous elements not shown in FIG. 2, as illustrated
by the ellipsis.
[0036] FIG. 3 illustrates an exemplary image analysis module that
may implement embodiments of a method for estimating sensor
sensitivity from a single image, as described below with reference
to FIG. 4. In one embodiment, module 300 may provide a user
interface 302 that includes one or more user interface elements via
which a user may initiate, interact with, direct, and/or control
the method performed by module 300. Module 300 may be operable to
obtain digital image data for a digital image 310, receive user
input 312 specifying attributes of the process, and determine and
output an estimate of sensor sensitivity for the image data 320,
e.g., using a classifier 304. In various embodiments, the
classifier may or may not be included in the module itself.
Moreover, in some embodiments, the module 300 may be operable to
determine an estimate of scene brightness for the image based on
the estimated sensor sensitivity and other parameters such as
aperture (f), exposure time (t), and pixel intensity (I), as will
be discussed in more detail below.
[0037] Image analysis module 300 may be implemented as or in a
stand-alone application or as a module of or plug-in for an image
analysis and/or processing application. Examples of types of
applications in which embodiments of module 300 may be implemented
may include, but are not limited to, image analysis and editing,
processing, and/or presentation applications, as well as
applications in security or defense, educational, scientific,
medical, publishing, digital photography, digital films, games,
animation, marketing, and/or other applications in which digital
image analysis, editing or presentation may be performed. Specific
examples of applications in which embodiments may be implemented
include, but are not limited to, Adobe® Photoshop® and
Adobe® Illustrator®. In addition to generating an estimate
of sensor sensitivity, module 300 may be used to display,
manipulate, modify, classify, and/or store the image, for example
to a memory medium such as a storage device or storage medium.
Overview
[0038] Embodiments of the techniques disclosed herein may provide
for the estimation of light sensor sensitivity from a single image,
based on the insight that the sensor sensitivity, e.g., the ISO
value, can be reconstructed with relatively little error from
images for which the sensitivity is available, allowing for
accurate scene brightness estimation for photographs where sensor
sensitivity information was not stored. More specifically, since
increasing sensor sensitivity also amplifies the noise in a
photograph, noise in a photograph (i.e. variations in pixel
intensity not caused by corresponding variations in the scene) can
be used to deduce the amount of amplification applied to the image.
To estimate the missing sensitivity value of an image, the method
takes advantage of the fact that there is a strong correlation
between the sensitivity (e.g., ISO setting) of an image sensor and
the noise the sensor records: the higher the sensitivity, the more
noise is recorded. This relationship allows the determination of an
approximate or estimated sensitivity (e.g., ISO value) from the
amount of noise in an image.
[0039] Additionally, in some embodiments, this estimated sensor
sensitivity may be used in combination with other parameters, e.g.,
image metadata, such as aperture (f) information, exposure time (t)
information, and pixel intensity (I) information, to determine an
estimate of scene brightness for the image, which may be useful in
numerous applications.
[0040] For example, in one exemplary application, the estimate of
scene brightness may be used (possibly in combination with other
image attributes) to classify the image. More specifically, the
estimate of scene brightness may be used for scene classification.
Scene classification is the process of determining the category of
the environment (e.g. mall, street, beach, park, forest, ocean,
etc.) in which a photograph is taken. In image acquisition and
image processing, understanding where a photograph is taken allows
for more accurate reconstruction of the color, tone-mapping, etc.
of the image. Moreover, as part of a broader goal, scene category
(classification) information may be used in further applications,
e.g., to improve the quality of photograph annotation and to allow
for content based semantic image search, among others.
FIG. 4--Method for Estimating Light Sensor Sensitivity Based on an
Image
[0041] FIG. 4 is a flowchart illustrating a method for estimation
of light sensor sensitivity based on a single image. The method
shown in FIG. 4 may be used in conjunction with embodiments of the
computer system shown in FIG. 2, among other devices. In various
embodiments, some of the method elements shown may be performed
concurrently, in a different order than shown, or may be omitted.
Additional method elements may also be performed as desired. Any of
the method elements described may be performed automatically (i.e.,
without user intervention). As shown, this method may operate as
follows.
[0042] First, in 402, a noise signature of an image may be
determined. In other words, the image (i.e., image data) may be
analyzed to determine the degree of noise in the image, and a
signature or metric characterizing the degree of noise determined
for the image. Note that there are numerous ways in which the noise
signature may be determined. One particular approach to determining
the noise signature is discussed below with respect to FIG. 5.
[0043] In 404, an estimate of sensor sensitivity associated with
the image may be automatically determined based on the determined
noise signature. For example, the noise signature may be provided
as an input to a trained classifier, e.g., a support vector
machine, neural network, etc., which may then operate to estimate
the sensor sensitivity.
[0044] The estimate of the sensor sensitivity may then be stored,
e.g. in a memory medium, and/or output to a display device or an
external device, e.g. over a network. As will be discussed below,
the sensor sensitivity estimate may be used to perform further
analysis of the image. A more detailed embodiment of the method of
FIG. 4 is described below with reference to FIG. 5.
FIG. 5--Another Method for Estimating Light Sensor Sensitivity
Based on an Image
[0045] FIG. 5 is a flowchart illustrating a more detailed
embodiment of the method of FIG. 4. The method shown in FIG. 5 may
be used in conjunction with embodiments of the computer system
shown in FIG. 2, among other devices. In various embodiments, some
of the method elements shown may be performed concurrently, in a
different order than shown, or may be omitted. Additional method
elements may also be performed as desired. Any of the method
elements described may be performed automatically (i.e., without
user intervention). As shown, this method may operate as
follows.
[0046] First, in 502, an image may be de-noised, thereby generating
a de-noised image. In other words, a digital signal recorded from a
camera's image sensor is assumed to be corrupted by noise, thus
deviating from a hypothetical ideal (noiseless) image. To
approximate this ideal image, small deviations in the input signal
may be removed or eliminated via an appropriate filter. Any of
various filters can be used, including, but not limited to,
Gaussian, Median, and Bilateral filters, among others. Thus, the
original image is assumed to be noisy, and the de-noised image is
assumed to be (substantially) noiseless. However, it should be
noted that this is but one exemplary way to produce the de-noised
image. More generally, the idea of de-noising the image is to
produce an ideal (or an approximation of an) image free of noise,
i.e., a substantially noise-free image, which may be accomplished
via any of a variety of ways, e.g., via a generative model. In
other words, the de-noised image may be produced independently of
the original image. Note, however, that for simplicity and clarity,
the term "de-noised image" is used herein to refer to such a
substantially noise-free image, regardless of the manner in which
it was obtained or generated.
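As one concrete possibility, a median filter can serve as the de-noising step. The sketch below uses SciPy and is only one of the filter choices (Gaussian, median, bilateral) named above; the function name and the channel-wise handling are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import median_filter

def denoise(image: np.ndarray, size: int = 3) -> np.ndarray:
    """Approximate the hypothetical noiseless image by removing small deviations.

    Applies a median filter; any smoothing filter (Gaussian,
    bilateral, ...) could be substituted.
    """
    if image.ndim == 2:  # grayscale
        return median_filter(image, size=size)
    # Filter each color channel of an H x W x C image independently.
    return np.stack(
        [median_filter(image[..., c], size=size) for c in range(image.shape[-1])],
        axis=-1,
    )
```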
[0047] FIG. 6 is an exemplary process flow diagram illustrating
various elements corresponding to the method elements and features
of the method of FIG. 5. As may be seen, item 1 (so labeled) in
FIG. 6 illustrates the above-described de-noising of the original
input image to generate the de-noised image.
[0048] In 504, both the original image and the de-noised image may
be decomposed into respective and corresponding pluralities of
patches. In other words, each image may be decomposed into a
respective plurality of pixel patches, where each patch from the
original image has a corresponding patch from the de-noised image,
thus forming corresponding pairs of patches. In one embodiment,
patches of various sizes may be generated, e.g., square patches of
sizes 4×4, 8×8, and 16×16 pixels, and so each
image may be partitioned or decomposed multiple times, once per
patch size. Said another way, decomposing each of the image and the
de-noised image into respective and corresponding pluralities of
patches may include partitioning each image multiple times with
respective different patch sizes. Computing patches at several
different rectangle sizes enables the method to obtain cumulative
information at multiple resolutions. In some embodiments,
neighboring patches may overlap, e.g., by half the patch width, and
so each portion of the image may be represented in more than one
patch of a given size, and may also be represented in multiple
patches of different size. Note that the spatial arrangement of
patches is irrelevant to the present method, and so all patches (of
all sizes) may be treated as a single long list of patches. Note
further that, generally, decomposing the image(s) may produce a
large number of patches, e.g., on the order of 800,000 patches for
typical digital images of 2-5 Megapixels, although with smaller or
larger images, and different sized patches, different numbers of
patches may be produced. The decomposition of the images is
illustrated by item 2 in FIG. 6.
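A minimal decomposition sketch follows. The patch sizes and half-width overlap match the examples above; the function itself is an illustrative assumption rather than code from this application.

```python
import numpy as np

def decompose_into_patches(image: np.ndarray, sizes=(4, 8, 16)):
    """Partition an image multiple times, once per patch size.

    Neighboring patches overlap by half the patch width; spatial
    arrangement is discarded, so all patches of all sizes are
    returned as one flat list.
    """
    patches = []
    h, w = image.shape[:2]
    for s in sizes:
        step = s // 2  # half-width overlap
        for y in range(0, h - s + 1, step):
            for x in range(0, w - s + 1, step):
                patches.append(image[y:y + s, x:x + s])
    return patches

# Applied identically to the original and de-noised images, patches at
# the same list index form corresponding pairs {P_i, P_d}.
```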
[0049] In 506, a descriptor may be determined for each patch in the
image based on the patch of the image and the corresponding patch
of the de-noised image, where the descriptor characterizes noise in
the patch of the image by summarizing the patch's pixel intensity
and pixel alteration due to de-noising. In one embodiment, a
descriptor (for each patch) may be determined for each of a
plurality of image attributes. For example, in some embodiments, a
separate descriptor may be determined per color channel of the
image, e.g., for each of R(ed), G(reen), and B(lue) color channels
of the image, although it should be noted that other color models
may be used as desired, e.g., CMYK (Cyan, Magenta, Yellow, Black),
HSV (Hue, Saturation, and Value), and/or HSL (Hue, Saturation, and
Lightness (or Luminance)), among others.
[0050] Moreover, in some embodiments, texture may also be included
as a characterizing feature of the images, e.g., of the image
patches. As used herein, "texture" refers to random, stochastic, or
uncorrelated high-frequency variations in an image, e.g.,
exemplified by images of snow, sand, grass, forests, etc. Note that
sometimes such textures may be confused with noise, and so
including texture information in the characterization of a patch
may improve the quality of the patch descriptors.
[0051] Thus, in one embodiment, for every corresponding pair of
patches $\{P_i, P_d\}$, where $P_i$ is a patch from the input
image and $P_d$ is a patch from the de-noised image, a patch
descriptor may be determined. In one embodiment, the patch
descriptor may be of the form:

$$P(P_i, P_d) = \big( I(P_i),\; n(P_i, P_d) \big) \qquad (1)$$

[0052] where $I(P_i) \in [0, 1]$ is the intensity of the patch, computed
as the mean pixel intensity of the patch from the original image,
$I(P_i) = \overline{P_i}$, and where $n(P_i, P_d)$ is the pixel
alteration due to the de-noise process of 502 in each patch, which in
some embodiments may be defined as the average (absolute) pixel difference
(e.g., in the R, G, B channels) between the original and the de-noised
patch:

$$n(P_i, P_d) = \frac{1}{N} \sum_{j=1}^{N} \left| P_i^{\,j} - P_d^{\,j} \right| \qquad (2)$$
[0053] Note, however, that the average difference is just one
metric that can be used to assign a value to the noise-difference
between the noisy and de-noised image. Other distance metrics may
be used that bias the result to maximum or minimum deviations.
Additionally, the value need not necessarily be normalized by
division by N. Thus, the expression of equation (2) is just one
embodiment of a way to establish a difference value for the
corresponding noisy and de-noised image patch. For simplicity, the
"average difference" approach is used in the remainder of this
document, although it should be noted that this is but one exemplary
approach, and that any other measures may be used as
desired.
[0054] In embodiments where each patch descriptor is based on pixel
intensity and texture per color channel, the patch descriptor may
be in the form:
$$P(P_i, P_d) = \{\, I(P_i),\; n(P_i, P_d),\; t(P_i, P_d) \,\}_c \qquad (3)$$

[0055] where $t(P_i, P_d)$ is a texture classifier. The texture
classifier $t$ may include a small class of textures that occur
frequently and occupy a significant area of a photograph, e.g., skin,
sand, leaves, sky, snow, etc. Here $c$ denotes a color channel, where
in some embodiments $c \in \{\text{Red}, \text{Green}, \text{Blue}\}$,
i.e., $c$ may iterate over each of the RGB color channels in an image.
As noted above, in other embodiments, other color channels (models)
may be used as desired. $I(P_i) \in [0, 1]$ is the average intensity
of the R, G, or B pixels in the patch specified by $c$, and
$n(P_i, P_d)$ is defined similarly to equation (2), but for pixels
specifically belonging to color channel $c$.
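In code, equations (1)-(3) might look like the following sketch. The mean absolute difference follows equation (2); the texture classifier is stubbed out with a variance threshold, since its exact form is left open above, and all names here are hypothetical.

```python
import numpy as np

def classify_texture(patch: np.ndarray) -> str:
    """Hypothetical stand-in for the texture classifier t."""
    return "smooth" if patch.std() < 0.05 else "textured"

def patch_descriptor(p_i: np.ndarray, p_d: np.ndarray, channel: int):
    """Descriptor for one corresponding patch pair and one color channel.

    Returns (intensity, pixel alteration, texture class) per equation (3);
    pixel values are assumed normalized to [0, 1].
    """
    pi = p_i[..., channel].astype(float)
    pd = p_d[..., channel].astype(float)
    intensity = pi.mean()                # I(P_i), equation (1)
    alteration = np.abs(pi - pd).mean()  # n(P_i, P_d), equation (2)
    texture = classify_texture(pi)       # t(P_i, P_d), equation (3)
    return intensity, alteration, texture
```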
[0056] Thus, a plurality of patch descriptors may be determined
that characterize differences between corresponding patches of the
original image and the de-noised image, and thus that characterize
noise in the patches of the (original) image. The generation of the
patch descriptors of 506 is illustrated by item 3 of FIG. 6, where
the symbol ⊗ represents a generic operator
that takes as input corresponding patches from the noisy and
de-noised image and produces a (possibly vector) descriptor of the
noise in that patch which includes, but is not limited to, the
average difference of pixels in the noisy and de-noised image.
Additional entries in such a descriptor (e.g., vector) may include
a texture descriptor that characterizes or classifies texture of
the patch, amongst other metrics.
[0057] In 508, a noise signature for the image may be created based
on the plurality of patch descriptors. There are numerous ways the
noise signature may be built. For example, in one embodiment, a
subset of the patches, or more accurately, patch descriptors, may
be selected and used to build a noise signature for the image. The
noise signature is illustrated by item 4 in FIG. 6.
[0058] High frequency changes in neighboring pixel values commonly
arise from two sources: high frequency texture or noise in the
digital image. Noise may be assumed to be present in all patches to
varying degrees, but virtually every photograph contains at least
some areas (hence some patches) where the image is smooth and free
of high frequency texture. In one embodiment, patches with high
frequency texture may be discarded (or not considered). The
remaining patches may be assumed to be free of high frequency
texture, and thus it may also be assumed that their high frequency
pixel alteration values arise from noise only. In other words,
descriptors of patches with high frequency texture may be
discarded, thereby leaving a subset of the patches with little or
no high frequency texture, where the noise signature may be created
based on the subset of the patches.
[0059] In an alternative embodiment, the patch may be classified as
having a texture of a certain characteristic (e.g., expressed as a
texture class, such as "sand", "waves", "leaves", or, more
generally, as a parametric descriptor, based on, e.g. a
power-spectrum). Using this second approach, the textured patches
are not discarded, but instead used to create a higher-dimensional
noise signature (one that includes texture as a dimension). Thus,
for example, a noise signature $S = \{S(I, t, c)\}_{I,t,c}$ (where t
refers to texture class) may record or represent the amount of
noise in the image for pixels corresponding to different levels of
intensity, to different texture types, and for each color channel.
Each entry in the noise signature, S(I, t, c), may denote the
estimated amount of noise for pixels with intensity I, texture t,
and in color channel c. The patch intensity may be converted from a
continuous variable into a discrete variable by binning it into K
levels (bins). K should be sufficiently large to ensure enough
resolution, although large K incurs additional computational cost
without significant gain in the quality of the noise signature. In
one embodiment, K=20 levels, although in other embodiments, K may
be set to any other value desired.
[0060] The patches (or patch descriptors) may then be divided or
sorted by their intensity level, color channel, and/or texture.
Each patch descriptor may be assigned to a bin with corresponding
intensity level, color channel, and/or texture. Within each bin,
the patches may be sorted based on the amount of pixel alteration,
and the descriptors with the lowest noise values selected. For
example, in one embodiment, multiple groups of various sizes of the
lowest noise descriptors may be selected to avoid being influenced
unduly by misleading statistical deviations in the image data. For
instance, in one exemplary embodiment, the 25, 40, and 80
lowest-noise descriptors may be selected, although in other
embodiments, other numbers of groups (including one) and thresholds
may be used as desired. Note that the amount of pixel alteration
may indicate the amount of noise in the corresponding patch. The
noise signature S(I, t, c) may be defined as the weighted average
pixel alteration of these 25, 40, and 80 patches, where patches
with less noise are given greater weight. Any of a number of
different approaches may be taken in weighting the patches (or
descriptors), the point being to capture the minimum pixel
alteration in each of the (I, t, c) bins.
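Continuing the illustrative sketch, the following assembles a noise
signature along the lines just described: descriptors are binned by
discretized intensity level and color channel (texture is omitted
for brevity), and each bin records a weighted average over the 25,
40, and 80 lowest-noise descriptors, with the smaller, less noisy
groups weighted more heavily. The particular weights are an
assumption; as noted above, any weighting that captures the minimum
pixel alteration per bin may be used.

    K = 20  # number of intensity bins, as in the embodiment above

    def noise_signature(descriptors, groups=(25, 40, 80)):
        # Bin descriptors by (intensity level, color channel); within
        # each bin, average the lowest-noise descriptors over several
        # group sizes so that a few statistical outliers in the image
        # data cannot dominate the estimate.
        bins = {}
        for intensity, c, noise in descriptors:
            level = min(int(intensity * K), K - 1)  # assumes I in [0, 1]
            bins.setdefault((level, c), []).append(noise)
        signature = {}
        for key, noises in bins.items():
            noises.sort()  # least noisy first
            estimates = [np.mean(noises[:n]) for n in groups]
            weights = [1.0 / (i + 1) for i in range(len(groups))]
            signature[key] = float(np.average(estimates, weights=weights))
        return signature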
[0061] Thus, in some embodiments, the noise signature may comprise
a set of discrete noise values or descriptors correlated to one or
more image attributes or parameters, e.g., intensity level, color
channel, and/or texture, and thus may be or include a (possibly
multi-dimensional) noise profile with respect to these
parameters.
[0062] Finally, in 510, a trained classifier, denoted by item 5 of
FIG. 6, may take the noise signature as input, e.g., as input
features, and may generate a corresponding sensor sensitivity
value, where the sensor sensitivity value is a "prediction" or
estimate of the sensor sensitivity of the camera that produced the
image. In one embodiment, the sensor sensitivity value may take the
form of an ISO sensitivity value, e.g., in accordance with ISO
12232:2006, denoted by item 6 in FIG. 6, although any other
representation of sensor sensitivity may be used as desired. As
noted above, the classifier may be implemented in any of various
ways, e.g., as a support vector machine (SVM), a neural network,
etc., as desired. As also noted above, the predicted or estimated
sensor sensitivity value may be stored, e.g. in a memory medium,
and/or output to a display device or an external device, e.g. over
a network. As also mentioned above, the sensor sensitivity estimate
may be used to perform further analysis of the image, as described
below.
[0063] Note that the trained classifier must have been trained at
some point prior to the above generation of the sensor sensitivity
value. In one embodiment, the training may include determining
noise signatures as described above for each of a plurality of
photographs for which sensor sensitivity is known, and providing
these noise signatures (as input features) and the corresponding
sensor sensitivity values (e.g., ISO values, as labels) to the
classifier as training input, the classifier then being trained
accordingly. In
one embodiment, a classifier may be built for each camera model
(and possibly camera make) using a collection of photographs (e.g.,
random photographs taken with the same camera and retrieved from a
photographic archive). In other words, the classifier may include a
plurality of classifiers, each directed to a respective camera
model or make.
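A minimal sketch of such per-model training and prediction, using
scikit-learn's support vector machine with a linear kernel and
continuing the example above, is shown below. The flattening of the
signature into a fixed-length feature vector, and the names
labeled_photos_by_model and new_signature, are hypothetical
scaffolding for the example, not part of the embodiments.

    from sklearn.svm import SVC

    def to_vector(signature, channels=3):
        # Flatten a {(level, channel): noise} signature into a fixed
        # length feature vector, with zeros for empty bins.
        v = np.zeros(K * channels)
        for (level, c), noise in signature.items():
            v[level * channels + c] = noise
        return v

    # One classifier per camera model; labeled_photos_by_model maps a
    # model name to (signature, iso) training pairs (hypothetical).
    classifiers = {}
    for model, examples in labeled_photos_by_model.items():
        X = [to_vector(sig) for sig, iso in examples]
        y = [iso for sig, iso in examples]
        classifiers[model] = SVC(kernel="linear").fit(X, y)

    # Predicting a missing ISO value from a new image's noise
    # signature (new_signature is likewise hypothetical):
    iso_estimate = classifiers["Nikon D80"].predict(
        [to_vector(new_signature)])[0]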
[0064] When the sensitivity (e.g., ISO) value of an image is
missing, the classifier may then predict the image's sensitivity
(e.g., ISO) value based on its noise signature, as described above.
Note that for camera models whose sensitivity (e.g., ISO) value is
encoded in a proprietary format, experiments may be manually
conducted and test images acquired in order to generate a
collection of example photographs to train the classifier(s). A
generic classifier may then be built (trained) for multiple camera
models by including photographs from similar models from the same
camera maker.
[0065] Thus, various embodiments of the above methods may be used
to estimate sensor sensitivity associated with an image, e.g., to
generate an estimate of sensor sensitivity (e.g., ISO value) of the
camera that produced the image.
Determination of Scene Brightness
[0066] Note that sensor sensitivity (S) does not affect how much
light enters a camera. Rather, the amount of light that reaches the
sensor, Q=Bt/f.sup.2, is completely determined by scene brightness
(B), aperture (f), and exposure time (t). Given identical scene and
camera setup, a photograph taken at ISO 100 is generated from the
same amount of light as a photograph taken at ISO 200. The only
difference is that pixel intensity in the latter image is amplified
by a factor of two. As noted above, increasing sensor sensitivity
also amplifies the noise in the photograph, and so noise in a
photograph (i.e. variations in pixel intensity not caused by
corresponding variations in the scene) can be used to deduce the
amount of amplification applied to the image, as discussed above in
detail.
[0067] In the absence of sensor sensitivity information,
embodiments of the present method may first estimate the amount of
light (Q') captured by the camera sensor based on pixel intensity
(I) information alone, then scene brightness (B') may be computed
based on aperture (f) and exposure time (t) information.
[0068] The conventional scene brightness calculation can be
described as follows, where Q is a function of I and S:
B=(f.sup.2/t)Q(I, S) (4)
[0069] Embodiments of the present approach may compute scene
brightness based on estimated Q', as a function of I only:
B'=(f.sup.2/t)Q'(I) (5)
[0070] The following factors therefore influence the digital
photographic exposure of a scene, i.e. the brightness of image
pixels as a consequence of environmental factors and camera
parameters:
[0071] 1. Overall scene brightness--the light bouncing off all
objects in the scene and entering the camera;
[0072] 2. Aperture size--the area through which light can stream to
reach the image sensor;
[0073] 3. Image sensor size--the area across which light photons are
detected;
[0074] 4. Exposure time--the duration during which light can stream
through the aperture; and
[0075] 5. Sensor sensitivity--determines the probability that sensor
events are triggered by photons instead of random thermal events,
i.e. noise--higher sensitivity results in higher noise.
[0076] In the description below, the following symbols may be used
to represent the above measures:
[0077] (L)--luminance; (B)--scene brightness; (f)--aperture
(quotient of aperture area and image sensor area); (t)--exposure
time; (S)--sensor sensitivity; (I)--intensity of pixels in the
final digital photograph.
Scene Brightness Calculation Based on f, t, S, I
[0078] Strictly speaking, the brightness of a scene refers to the
average scene luminance (L) in physical units of lux (light energy
per second per unit area). For photographic purposes, it is not
necessary to know the absolute physical luminance. Photographers
are generally more interested in knowing how much brighter or
darker a scene is relative to another (i.e. standard) scene. The
definition of a standard scene differs slightly between camera
makers. A standard scene, as defined by Canon and Nikon, is one lit
by 0.0125 lux of light, although it should be noted that other
"standard" values may be used as desired.
[0079] Scene brightness (B) in photography, therefore, is a
dimensionless quantity and expresses scene luminance (L) relative
to a reference luminance (L.sub.0):
B=L/L.sub.0. (6)
[0080] Exposure step (EV) expresses scene brightness on a
logarithmic scale. In photography literature, a change in
brightness by a factor two is frequently referred to as a change of
1 exposure step:
EV=log.sub.2(B) (7)
[0081] To determine scene brightness from pixel intensity (I) in a
photographic image, camera parameters that affect how a digital
still camera responds to light must be taken into account.
[0082] The part of the ISO 12232:2006 standard relevant to the
present method specifies that in a scene lit by standard luminance
L.sub.0, photographing a grey card that reflects 18% of incident
light with an aperture of f/1 (f.sub.0=1.0) with an exposure time
of 1 second (t.sub.0=1.0) and with a sensor sensitivity of ISO 100
(S.sub.0=100) should produce pixels that are 18% saturated
(I.sub.0=0.18) in the final photographic image.
[0083] Scene brightness (B) and the pixel intensity (I) are,
therefore, correlated through aperture (f), exposure time (t), and
sensor sensitivity (S). The relationship is given by the following
equation.
B(f.sub.0/f).sup.2(t/t.sub.0)=(S.sub.0/S)(I/I.sub.0)
which, substituting f.sub.0=1.0, t.sub.0=1.0, S.sub.0=100, and
I.sub.0=0.18, reduces to
Bt/f.sup.2=Q=556I/S (8)
[0084] The left hand side of equation (8), Bt/f.sup.2, describes
the amount of light (Q) that enters the camera and is captured by
the sensor. The right hand side of equation (8), 556I/S, determines
how light energy (Q) is converted to pixel intensity in the final
photographic image.
[0085] Aperture (f) is the opening in a lens that allows light to
pass through. A photographer can limit the amount of light entering
a camera by reducing the size of the aperture. This size is
expressed in f numbers where "an f number of 2.8" is written as
"f/2.8". The radius of an aperture f/2 is half that of an aperture
f/1. Consequently, an aperture f/2 has an opening that is a quarter
in area and lets in one fourth as much light as an aperture
f/1.
[0086] Exposure time (t) describes the duration for which the
shutter is opened. A photographer can increase or decrease the
amount of light arriving at the sensor by exposing a photograph for
longer or shorter periods of time. The relationship is
linear--exposing for twice as long lets in twice as much light.
[0087] Pixel intensity (I) in a photograph is linearly proportional
to the amount of light captured by the sensor.
[0088] The rate of conversion from light energy to pixel intensity
is given by sensor sensitivity (S) which is expressed in ISO speed
ratings. When exposed to the same amount of light, a sensor set at
ISO 200 produces pixels that have twice the intensity as a sensor
set at ISO 100.
[0089] In other words, given the intensity of pixels in a
photograph corresponding to an object that reflects 18% of incident
light, and knowing the aperture, exposure time, and sensor
sensitivity used to compose the photograph, we can compute the
brightness of the original scene:
B=(f.sup.2/t)(556I/S) (9)
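As a worked example, equation (9), together with equation (7) for
exposure steps, translates directly into code; the sample camera
parameters below are arbitrary.

    import math

    def scene_brightness(f_number, exposure_time, iso, intensity):
        # Equation (9): B = (f^2 / t) * (556 * I / S)
        return (f_number ** 2 / exposure_time) * (556.0 * intensity / iso)

    # Example: f/2.8, 1/60 s, ISO 100 (e.g., as estimated above),
    # and pixels that are 18% saturated.
    B = scene_brightness(2.8, 1.0 / 60, 100, 0.18)
    EV = math.log2(B)  # equation (7): brightness in exposure steps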
FIG. 7--Method for Estimating Scene Brightness Based on an Estimate
of Image Sensor Sensitivity
[0090] FIG. 7 is a flowchart of a method for determining scene
brightness based on an estimate of sensor sensitivity associated
with the image. The method shown in FIG. 7 may be used in
conjunction with embodiments of the computer system shown in FIG.
2, among other devices. In various embodiments, some of the method
elements shown may be performed concurrently, in a different order
than shown, or may be omitted. Additional method elements may also
be performed as desired. Any of the method elements described may
be performed automatically (i.e., without user intervention). As
shown, this method may operate as follows.
[0091] First, in 702, a sensor sensitivity associated with an image
may be estimated. In other words, an estimate of sensor sensitivity
associated with an image may be determined, e.g., in accordance
with an embodiment of the methods described above.
[0092] In 704, an estimate of scene brightness for the image may be
automatically determined based on the estimate of sensor
sensitivity and metadata of the image, where the metadata includes
aperture information, exposure time information, and intensity
information (for the image), e.g., via equation (9), derived above.
The estimate for scene brightness may then be stored in a memory
medium, or output to an external device, e.g., a monitor, printer,
or other computer system, e.g., over a network.
[0093] As noted above, the estimate of scene brightness for the
image may be used, e.g., in conjunction with other aspects of the
image, in any of numerous applications, one example of which is to
categorize or classify the image (or scene), e.g., for subsequent
use by search engines, e.g., for search and retrieval of images
based on semantic content of the image(s), among other
applications. For example, images may be categorized according to
scenario, e.g., beach, mountain, sea, mall, etc., where a
particular image may be identified as representing a particular
category, e.g., one of ~20 categories.
[0094] Thus, in some embodiments, the method may further include
categorizing the image based on the estimate of scene brightness,
thereby generating a category for the image, where, as mentioned
above, the category for the image may be usable to perform semantic
based image operations. For example, suitable keywords or tags
representing the identified category may be determined or
suggested, where the keywords or tags may be usable to perform
search, retrieval, and matching operations with respect to the
image, e.g., where search tools can then locate the image based on
the keywords or tags.
Exemplary Preliminary Results
[0095] Applying the above techniques using photographs taken with
Nikon D80 cameras by different users, an average prediction error
of 1.0 EV has been achieved. However, it should be noted that these
results are preliminary and can be improved with further refinements
to the classification process (which used an SVM with a linear
separator) and with a better or more detailed noise signature.
[0096] Consumer digital cameras currently available can operate
under varying lighting conditions, e.g., from dark environments lit
by candle light to bright environments under direct sunlight, where
the brightness can vary by factors of over 1,000,000 (i.e. over 20
exposure steps). Used in combination, the three camera parameters
(f, t, S) may allow a photographer to adapt to these different
lighting conditions (the exposure-step figures are verified in the
sketch following this list):
[0097] Aperture: By closing down (f/28) or opening up (f/1) the
aperture, a photographer can adjust the rate at which light enters
the camera by a factor of up to 784 (a change of 9.6 EV).
[0098] Exposure time: By exposing for a shorter (1 ms) or longer
(30 s) period of time, a photographer can adapt to lighting
conditions that differ by a factor of over 30,000 (a change of 14.9
EV).
[0099] Sensor sensitivity: The latest digital cameras can adjust
the sensor sensitivity by a factor of up to 256 (a change of 8.0
EV).
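The exposure-step figures quoted in the list above follow directly
from equation (7), as the following lines verify.

    import math

    print(math.log2((28 / 1) ** 2))  # f/1 to f/28: factor 784, ~9.6 EV
    print(math.log2(30 / 0.001))     # 1 ms to 30 s: factor 30,000, ~14.9 EV
    print(math.log2(256))            # sensitivity factor 256: 8.0 EV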
[0100] Note that without sensor sensitivity information, there can
be significant error in calculating scene brightness. Thus,
embodiments of the above method(s) may facilitate estimation of
scene brightness with an average error of 1.0 exposure step or
better.
Exemplary Applications
[0101] One direct application of scene brightness is in predicting
environmental scenes. "Indoors vs. outdoors" is often considered to
be the top-level category of environmental scenes, and the ability
to predict it accurately can impact how well more specific
environmental scenes can be predicted. Scene brightness is
currently one of the most reliable parameters for determining
whether a photograph is taken indoors versus outdoors.
[0102] Some existing work uses the noise signature to synthesize
noise uniformly across the image. The above-described extended
noise signature that models texture may allow adaptive noise
synthesis where different amounts of noise are generated for each
region of an image based on its texture. Another potential
application of the noise signature disclosed herein is noise
removal that treats different texture regions differently.
[0103] Embodiments of the above methods may be applied to digital
scans of film images, e.g., of digitized images of "analog" or
film-based photographs, e.g., film negatives, although additional
analysis and/or processing may be required to take into account the
effects of the scanning operation, as well as any peculiarities or
idiosyncrasies of film-based photography.
[0104] Thus, various embodiments of the systems and methods
disclosed herein may be used to estimate sensor sensitivity for an
image, and may further facilitate estimation of scene brightness
using the estimated sensor sensitivity and additional metadata
associated with the image.
[0105] Although the embodiments above have been described in
detail, numerous variations and modifications will become apparent
to those skilled in the art once the above disclosure is fully
appreciated. It is intended that the following claims be
interpreted to embrace all such variations and modifications.
* * * * *