U.S. patent application number 11/536513 was filed with the patent office on 2006-09-28 and published on 2008-04-03 for salience preserving image fusion.
This patent application is currently assigned to Microsoft Corporation. Invention is credited to Xiaoou Tang, Chao Wang, Qiong Yang, Zhongfu Ye.
Application Number | 20080080787 11/536513 |
Family ID | 39230531 |
Publication Date | 2008-04-03 |
United States Patent Application | 20080080787 |
Kind Code | A1 |
Yang; Qiong ; et al. | April 3, 2008 |
Salience Preserving Image Fusion
Abstract
Salience-preserving image fusion is described. In one aspect,
multi-channel images are fused into a single image. The fusing
operations are based on importance-weighted gradients. The
importance weighted gradients are measured using respective
salience maps for each channel in the multi-channel images.
Inventors: | Yang; Qiong; (Beijing, CN) ; Wang; Chao; (Hefei, CN) ; Tang; Xiaoou; (Beijing, CN) ; Ye; Zhongfu; (Hefei, CN) |
Correspondence Address: | LEE & HAYES PLLC, 421 W RIVERSIDE AVENUE SUITE 500, SPOKANE, WA 99201, US |
Assignee: | Microsoft Corporation, Redmond, WA |
Family ID: | 39230531 |
Appl. No.: | 11/536513 |
Filed: | September 28, 2006 |
Current U.S. Class: | 382/284 |
Current CPC Class: | G06T 5/50 20130101; G06T 2207/20221 20130101 |
Class at Publication: | 382/284 |
International Class: | G06K 9/36 20060101 G06K009/36 |
Claims
1. A method at least partially implemented by a computer, the
method comprising: fusing multi-channel images into a single image
based on importance-weighted gradients, each multi-channel image
comprising multiple channels; and wherein the importance weighted
gradients are measured using respective salience maps for each
channel in the multi-channel images.
2. The method of claim 1, wherein fusing the multi-channel images
further comprises: for each channel in each image of the
multi-channel images: adjusting the channel to have a same mean
gradient as a maximum mean gradient channel; measuring local
salience for respective pixels in the channel; and assigning a
normalized weight to each pixel of the pixels based on respective
ones of the local salience measurements; generating an importance
weighted contrast form based on normalized weights associated with
respective ones of the pixels in the channels; constructing, using
the importance weighted contrast form, a target gradient based on
pixels in each of the channels; and creating the single image from
the target gradient.
3. The method of claim 2, wherein constructing the target gradient
is accomplished using pixel-based Eigen decomposition operations in
view of the importance weighted contrast form.
4. The method of claim 1, wherein the method further comprises
removing halo effects associated with the target gradient.
5. The method of claim 4, wherein removing the halo effects further
comprises implementing dynamic range compression operations on a
pixel-by-pixel basis.
6. The method of claim 1, wherein fusing the multi-channel images
removes color from the single image.
7. The method of claim 1, wherein the method further comprises
presenting the single image to a viewer.
8. A computer-readable medium comprising computer-program
instructions executable by a processor, the computer-program
instructions, when executed, for performing operations comprising:
for each image in a set of source images, each image comprising
multiple channels of pixel information, calculating a respective
salience map of multiple salience maps for a gradient of each
channel in the image; weighting, based on respective ones of
multiple salience maps, respective contributions of each channel of
the multiple channels to global statistics; and highlighting, based
on weighted contribution of each channel to global statistics,
gradients with high saliency in a target gradient to preserve
salient features in a single image.
9. The computer-readable medium of claim 8, wherein before
calculating the respective salience map, the method further
comprises adjusting each channel of the multiple channels in each
image of the source images to have a same mean gradient as a
maximum mean gradient channel so that the channels can be compared
and computed at a same gradient level.
10. The computer-readable medium of claim 8, wherein the gradients
are represented in an importance weighted contrast form, and
wherein highlighting the gradients with high saliency in the target
gradient further comprises: constructing, using the importance
weighted contrast form, the target gradient based on pixels in each
of the channels; and fusing the single image from the target
gradient.
11. The computer-readable medium of claim 8, wherein the
computer-program instructions further comprise instructions for
removing halo effects from the target gradient.
12. The computer-readable medium of claim 8, wherein the
computer-program instructions further comprise instructions for
implementing dynamic range compression operations to remove the
halo effects.
13. The computer-readable medium of claim 8, wherein the
computer-program instructions for calculating the salience maps,
weighting the respective contributions, and highlighting the
gradients with high saliency are implemented responsive to
receiving a request to remove color from respective ones of the
source images.
14. The computer-readable medium of claim 8, wherein the
computer-program instructions further comprise instructions for
presenting the single image to a viewer.
15. A computing device comprising: a processor; and a memory
coupled to the processor, the memory comprising computer-program
instructions executable by the processor for implementing
operations comprising: receiving multiple multi-channel images,
each multi-channel image comprising multiple channels of pixel
information; and responsive to receiving the multi-channel images,
fusing gradients from the multiple channels associated with each of
the multi-channel images into a salience-preserving fused image,
operations for fusing the gradients being based on local salience
measurements for each pixel of the pixel information.
16. The computing device of claim 15, further comprising presenting
the salience-preserving fused image to a viewer.
17. The computing device of claim 15, further comprising: for each
channel of the multiple channels: adjusting the channel to have a
same mean gradient as a maximum mean gradient channel; and
calculating the local salience measurements for respective pixels
in each of the channels.
18. The computing device of claim 16, further comprising for each
of the multi-channel images, assigning a normalized weight to each
pixel in each channel of the multiple channels based on respective
ones of the local salience measurements.
19. The computing device of claim 18, further comprising:
generating an importance weighted contrast form based on normalized
weights associated with respective ones of the pixels in the
channels; and constructing, using the importance weighted contrast
form, a target gradient based on pixels in each of the
channels.
20. The computing device of claim 19 further comprising reducing,
using dynamic range compression operations, halo(s) associated with
the target gradient.
Description
BACKGROUND
[0001] Multi-channel image fusion and visualization operations are
useful for many applications. For example, multi-channel image
fusion can be used to fuse multiple images with different
respective exposures, and therefrom, generate a single image with a
high dynamic range (HDR). In image fusion, one of the most
important issues is how to preserve the salience from the sources.
Since gradients convey important salient features, we conduct the
fusion on gradients. However, traditional fusion methods based on
gradient treat gradients from multi-channels as a multi-valued
vector, and compute associated statistics under the assumption of
identical distribution. In fact, different source channels may
reflect different important salient features, and their gradients
are basically non-identically distributed. This prevents existing
methods from successful salience preservation.
SUMMARY
[0002] Salience-preserving image fusion is described. In one
aspect, multi-channel images are fused into a single image. The
fusing operations are based on importance-weighted gradients. The
importance weighted gradients are measured using respective
salience maps for each channel in the multi-channel images.
[0003] This Summary is provided to introduce a selection of
concepts in a simplified form that are further described below in
the detailed description. This Summary is not intended to identify
key features or essential features of the claimed subject matter,
nor is it intended to be used as an aid in determining the scope of
the claimed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] In the Figures, the left-most digit of a component reference
number identifies the particular Figure in which the component
first appears.
[0005] FIG. 1 shows an exemplary illustration of how identical
distribution assumptions may cause salient features from certain
channels to be obliterated by other channels, according to one
embodiment.
[0006] FIG. 2 shows an exemplary system for salience preserving
image fusion, according to one embodiment.
[0007] FIG. 3 shows an exemplary set of graphs illustrating
origination of halo effects in gradient visualization, according to
one embodiment.
[0008] FIG. 4 shows an exemplary procedure for salience preserving
image fusion, according to one embodiment.
DETAILED DESCRIPTION
Overview
[0009] Conventional multi-channel image fusion techniques based on
gradients generally address gradients from multiple channels as a
multi-valued vector, and compute associated statistics under an
assumption of identical distribution. However, different source
channels can reflect different important salient features, and
corresponding gradients are non-identically distributed. This
prevents existing image fusion techniques from successfully
preserving salience. For example, when a minority of channels
embody important features, such features may be obliterated by
unimportant features embodied by the majority of channels.
[0010] FIG. 1 shows an exemplary illustration of how identical
distribution assumptions may cause salient features from certain
channels to be obliterated by other channels, according to one
embodiment. More particularly, FIG. 1 shows use of principal
component analysis (PCA) to find a linear mapping on i.i.d. and
non-i.i.d. samples. Referring to FIG. 1, dots 102 through 106 represent
class 1 samples, and dots 108 through 110 represent class 2
samples. Lines through respective ones of the class samples
represent component axes. FIG. 1(a) represents the PCA on i.i.d.
samples. FIG. 1(b) represents PCA on double-distribution samples.
FIG. 1(c) represents the weighted PCA on the double-distribution
samples in (b). Referring to these aspects of FIG. 1, and with
respect to how identical distribution assumptions may cause salient
features from certain channels to be obliterated by other channels,
a conventional target gradient can be regarded, up to a normalization
factor and when using the Euclidean metric, as the principal component
of all source channels' gradients. Assume that the samples are
gradients at a given pixel from multiple channels. In FIG. 1(a),
all channels manifest salient features, and PCA finds their
principal component, thereby providing a good representation of the
salient feature. But when some channels have small
gradients (class 2 dots 108 of FIG. 1(b)), the PCA results become
meaningless (FIG. 1(b)).
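For purposes of exemplary illustration only, the effect described with respect to FIG. 1 can be reproduced numerically. The following is a minimal sketch, not the described implementation: the sample values are hypothetical, the function name `principal_axis` is an assumption, and the principal axis is taken from the uncentered second-moment matrix of the two-dimensional samples, with each weight multiplying its sample's outer product.

```python
import math

def principal_axis(samples, weights=None):
    """Unit vector along the dominant axis of (weighted) 2-D samples."""
    if weights is None:
        weights = [1.0] * len(samples)
    # Entries of the symmetric second-moment matrix [[a, b], [b, c]].
    a = sum(w * x * x for w, (x, y) in zip(weights, samples))
    b = sum(w * x * y for w, (x, y) in zip(weights, samples))
    c = sum(w * y * y for w, (x, y) in zip(weights, samples))
    # Orientation of the dominant eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    return (math.cos(theta), math.sin(theta))

# Hypothetical data: two salient gradients near the 45-degree direction
# (class 1) and forty small gradients along the x axis (class 2).
class1 = [(10.0, 10.5), (9.5, 10.0)]
class2 = [(3.0, 0.0)] * 40
samples = class1 + class2

plain = principal_axis(samples)
# Weight each sample by the square of its l2-norm.
weighted = principal_axis(samples, [x * x + y * y for x, y in samples])

plain_deg = math.degrees(math.atan2(plain[1], plain[0]))
weighted_deg = math.degrees(math.atan2(weighted[1], weighted[0]))
print(plain_deg, weighted_deg)
```

With these hypothetical samples, the unweighted axis is pulled toward the many small class 2 gradients, while the weighted axis stays near the direction of the two salient class 1 gradients.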
[0011] In view of the above, systems and methods for salience
preserving image fusion, which are described with respect to FIGS.
1(c) through 4, assign different weights to different channels; the
more salient the features a channel conveys, the larger the weight it
is assigned. This protects important information from being
obliterated by unimportant information, and thereby properly preserves
salience in the channels during the fusion process. This is shown in FIG.
1(c), where the principal axis is quite close to that in FIG. 1(a)
when each sample is weighted by the square of its l.sup.2-norm.
Specifically, the systems and methods for salience preserving image
fusion preserve salient features in source images by first
measuring a respective salience map for each source channel. The
respective saliency measures are then used to weight each source
channel's contribution to compute the statistical representation.
The systems and methods use the weights to highlight gradients with
high saliency in a target gradient, and thereby, preserve salient
features in a resulting image. In one implementation, the systems
and methods implement dynamic range compression on the target
gradient to reduce the undesirable halo effects.
An Exemplary System
[0012] Although not required, systems and methods for salience
preserving image fusion are described in the general context of
computer-executable instructions executed by a computing device
such as a personal computer. Program modules generally include
routines, programs, objects, components, data structures, etc.,
that perform particular tasks or implement particular abstract data
types. While the systems and methods are described in the foregoing
context, acts and operations described hereinafter may also be
implemented in hardware.
[0013] FIG. 2 shows an exemplary system 200 for salience preserving
image fusion, according to one embodiment. System 200 includes a
computing device 202. Computing device 202 may be for example a
general purpose computing device, a server, a laptop, a mobile
computing device, and/or so on. Computing device 202 includes
processor 204 coupled to system memory 206. Processor 204 may be a
microprocessor, microcomputer, microcontroller, digital signal
processor, etc. System memory 206 includes, for example, volatile
random access memory (e.g., RAM) and non-volatile read-only memory
(e.g., ROM, flash memory, etc.).
[0014] System memory 206 comprises program modules 208 and program
data 210. Program modules 208 include, for example, salience
preserving image fusion module 212 and "other program modules" 214
such as an Operating System (OS) to provide a runtime environment,
one or more applications to leverage operations of salience
preserving image fusion module 212, device drivers, and/or so on. Such
applications are arbitrary and could include any type of
application that leverages co-registered multimodal imagery from
diverse sources (e.g., magnetic resonance scanners, aerial and
earth orbiting sensors, infrared and scientific-grade grayscale
cameras, etc).
[0015] Salience preserving image fusion module 212 ("image fusion
module 212") preserves salient features in source images 216 when
fusing each channel of the source images 216 into a single
resulting image 217. In this implementation, a single channel
corresponds to one image. For example, an RGB image is treated as
three channels (one red channel, one green channel, and one blue
channel), and thus, there are three images for image fusion
operations. For the multi-exposures, there are multiple grayscale
images where each grayscale image is treated as one channel. To
preserve salient features in source images 216, image fusion module
212 first computes a respective salience map 218 for each channel
gradient in each multi-channel source image 216 (hereinafter often
referred to as "source images 216"). Each source image 216 is a
co-registered multimodal digital image such as a photograph, a
static image frame of a video sequence, etc. Digital images are
defined by pixels, and pixels are representative of combinations of
primary colors. A "channel" in this context is a grayscale image
of the same size as the digital image, representing just one of
these primary colors. For example, a digital image in RGB space
will have a red, a green and a blue channel, whereas a grayscale
image has just one channel. There can be any number of primary
colors in a digital image. Thus, a channel is a grayscale image
based on any such primary color. Channel gradient is a distribution
of color from low to high values (e.g., from black to white). In
general, gradient provides a measure of overall contrast in an
image, and to some extent, a measure of overall feature visibility
and detail in a corresponding channel.
[0016] In one implementation, source images 216 are accessed or
otherwise received by image fusion module 212 from an image source
219. Such image sources are arbitrary and may include any type of
one or more image sources (e.g., digital imaging device(s),
databases(s), etc.) that produce or otherwise store co-registered
multi-channel digital images.
[0017] Image fusion module 212 utilizes respective ones of the
saliency maps 218 (i.e., salience measures) to weight each source
channel's contribution to the target gradient. For purposes of
exemplary illustration, such weights are shown as "source channel
contribution weights 220". In view of the source channel
contribution weights 220, image fusion module 212 highlights
gradients with high saliency in a target gradient 222, and thereby,
preserves salient features in a resulting fused image 217. In one
implementation, image fusion module 212 implements dynamic range
compression operations on the target gradient 222 to reduce the
undesirable halo effects.
[0018] These and other aspects for salience preserving image fusion
are now described in greater detail.
Salience Preserving Fusion
Gradient Fusion with Salience Map
[0019] Importance weights based on salience map 218: Denote
N-channel registered source images 216 as {f.sub.k, k=1, 2, ..., N},
wherein W represents the region of a whole image 216. Image fusion
module 212 adjusts all source channels in each image 216 to have
the same mean gradient as a maximum mean gradient channel. This is so
that the source channels can be compared and computed at the same
level. For channel f.sub.k, and for pixel p, fusion module 212
measures salience (i.e., respective ones of salience measurements
218) of p in f.sub.k as follows:
S_k(p) = \operatorname{mean}_{q \in Q_p} \{ d(f_k(p), f_k(q)) \},   (1)
S_k(p) = \operatorname{rescale}\{ 1 - S_k(p) / \max_{q \in W} S_k(q) \},   (2)
where Q.sub.p is the neighborhood of pixel p; rescale is an
operation to ensure dynamic range of S.sub.k is within [0, 1],
\operatorname{rescale}(A) = (A - A_{\min}) / (A_{\max} - A_{\min}),   (3)
and d(a,b) is defined as
d(a, b) = e^{-\frac{(b - a)^2}{2 s^2}}.   (4)
("A" denotes the argument of the function rescale, and A.sub.min and
A.sub.max denote the minimum and maximum of A, so that A ranges from
A.sub.min to A.sub.max; likewise, a and b denote the arguments of
d(a,b).)
[0020] In this implementation, fusion module 212 sets Q.sub.p to be
a 5×5 neighborhood of p, and s.sup.2=100 (although different
values can also be used). S.sub.k(p) represents contrast around p,
thus measures local salience. S.sub.k(p) with a value closer to 1
means pixel p is more important within channel k. Fusion module 212
compares all the salience maps S.sub.k, and assigns a normalized
weight to each pixel in each channel as follows:
w_k(p) = \frac{S_k(p)^n}{\sqrt{\sum_{l=1}^{N} S_l(p)^{2n}}}.   (5)
Here, w.sub.k is defined to be the importance weight of channel k.
The positive parameter n indicates a degree that the fused gradient
resonates with the channel of high salience.
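For purposes of exemplary illustration only, the salience measurement and the importance weights can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the described implementation: it assumes that d(a,b) is the Gaussian similarity exp(-(b-a)^2/(2 s^2)), that the neighborhood Q.sub.p includes p itself and is clipped at the image border, and that the denominator of equation (5) is the square root of the summed S.sub.l(p) raised to the power 2n. The function names `salience_map` and `importance_weights`, and the equal-weight fallback when no channel is salient, are hypothetical.

```python
import math

def salience_map(channel, radius=2, s2=100.0):
    """Local salience of each pixel of a 2-D list of intensities, Eqs. (1)-(4)."""
    h, w = len(channel), len(channel[0])
    sim = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = []
            # radius=2 gives the 5x5 neighborhood, clipped at the border.
            for qy in range(max(0, y - radius), min(h, y + radius + 1)):
                for qx in range(max(0, x - radius), min(w, x + radius + 1)):
                    diff = channel[qy][qx] - channel[y][x]
                    vals.append(math.exp(-diff * diff / (2.0 * s2)))  # Eq. (4)
            sim[y][x] = sum(vals) / len(vals)                          # Eq. (1)
    peak = max(max(row) for row in sim)
    raw = [[1.0 - v / peak for v in row] for row in sim]  # argument of Eq. (2)
    lo = min(min(row) for row in raw)
    hi = max(max(row) for row in raw)
    if hi == lo:
        # Flat channel: no pixel is more salient than any other.
        return [[0.0] * w for _ in range(h)]
    return [[(v - lo) / (hi - lo) for v in row] for row in raw]       # Eq. (3)

def importance_weights(saliences, n=1.0):
    """Normalized weights of Eq. (5) at one pixel, given S_k(p) per channel."""
    powered = [s ** n for s in saliences]
    norm = math.sqrt(sum(v * v for v in powered))
    if norm == 0.0:
        # Assumption: fall back to equal weights when no channel is salient.
        return [1.0 / math.sqrt(len(saliences))] * len(saliences)
    return [v / norm for v in powered]
```

In this sketch a pixel beside a step edge receives a salience near 1 and a pixel in a flat region receives 0, and a larger n moves the weight of the most salient channel closer to 1.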
[0021] Importance-weighted contrast form: Fusion module 212
constructs an importance-weighted contrast form 226 as follows:
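C(p) = \begin{pmatrix}
  \sum_k \left( w_k(p) \frac{\partial f_k}{\partial x} \right)^2 &
  \sum_k w_k^2(p) \frac{\partial f_k}{\partial x} \frac{\partial f_k}{\partial y} \\
  \sum_k w_k^2(p) \frac{\partial f_k}{\partial x} \frac{\partial f_k}{\partial y} &
  \sum_k \left( w_k(p) \frac{\partial f_k}{\partial y} \right)^2
\end{pmatrix}   (6)
This is rewritten as
[0022] C(p) = \begin{pmatrix}
  w_1 \frac{\partial f_1}{\partial x} & w_2 \frac{\partial f_2}{\partial x} & \cdots & w_N \frac{\partial f_N}{\partial x} \\
  w_1 \frac{\partial f_1}{\partial y} & w_2 \frac{\partial f_2}{\partial y} & \cdots & w_N \frac{\partial f_N}{\partial y}
\end{pmatrix}
\begin{pmatrix}
  w_1 \frac{\partial f_1}{\partial x} & w_1 \frac{\partial f_1}{\partial y} \\
  w_2 \frac{\partial f_2}{\partial x} & w_2 \frac{\partial f_2}{\partial y} \\
  \vdots & \vdots \\
  w_N \frac{\partial f_N}{\partial x} & w_N \frac{\partial f_N}{\partial y}
\end{pmatrix}   (7)
From a statistical perspective, if
[0023] \left( \frac{\partial f_k}{\partial x}, \frac{\partial f_k}{\partial y} \right)^T, \quad k = 1, 2, \ldots, N
are used as the samples, the contrast form C(p) is exactly a
weighted covariance matrix of the samples with the weight
w.sub.k(p), that is, for each pixel p, if we deem the gradients
from N channels as N samples
\left\{ \left( \frac{\partial f_k}{\partial x}, \frac{\partial f_k}{\partial y} \right)^T, \quad k = 1, 2, \ldots, N \right\}
with some distribution, and we weight these samples by w.sub.k,
then the covariance matrix of weighted samples is exactly C(p).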
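For purposes of exemplary illustration only, the two forms of the contrast form at a single pixel can be checked against each other. This is a minimal sketch: the function names `contrast_form` and `contrast_form_product` are hypothetical, each channel's gradient is passed as a pair (fx, fy) of partial derivatives, and the weights are the per-channel values w.sub.k(p).

```python
def contrast_form(gradients, weights):
    """2x2 importance-weighted contrast form C(p), written entry by entry."""
    cxx = sum((w * fx) ** 2 for w, (fx, fy) in zip(weights, gradients))
    cxy = sum(w * w * fx * fy for w, (fx, fy) in zip(weights, gradients))
    cyy = sum((w * fy) ** 2 for w, (fx, fy) in zip(weights, gradients))
    return ((cxx, cxy), (cxy, cyy))

def contrast_form_product(gradients, weights):
    """Same matrix as a product G^T G, where row k of G is (w_k fx, w_k fy)."""
    rows = [(w * fx, w * fy) for w, (fx, fy) in zip(weights, gradients)]
    return tuple(
        tuple(sum(r[i] * r[j] for r in rows) for j in range(2))
        for i in range(2)
    )

# Hypothetical gradients from three channels at one pixel.
grads = [(1.0, 2.0), (3.0, -1.0), (0.5, 0.5)]
wts = [0.8, 0.5, 0.33]
c_direct = contrast_form(grads, wts)
c_product = contrast_form_product(grads, wts)
```

Both functions return the same symmetric matrix up to floating-point rounding, which is the sense in which the contrast form is the second-moment matrix of the weighted gradient samples.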
[0024] Target gradient field: Fusion module 212 constructs target
gradient V(p)--i.e., target gradient 222--at pixel p, using
eigen-decomposition on C(p) as follows:
C(p) u(p) = \lambda(p) u(p), \quad \text{s.t.} \quad
\begin{cases} \| u(p) \|_2 = 1 \\ u(p)^T \left( \sum_k \nabla f_k(p) \right) > 0 \end{cases}
V(p) = \sqrt{\lambda(p)} \, u(p)   (8)
where λ(p) is the largest eigenvalue of C(p), and u(p) is the
corresponding eigenvector. The target gradient V (222) is the
principal component of the source channels' gradients weighted by
their importance, which is the optimal representation for the
weighted gradients in the sense of least-mean-square-error. Fusion
module 212, by applying the importance weight as equation (5) on
the contrast form 226, preserves the salience of the sources in the
field of the target gradient 222.
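For purposes of exemplary illustration only, the per-pixel eigen-decomposition can be written in closed form because C(p) is a symmetric 2x2 matrix. This is a minimal sketch, not the described implementation: the function name `target_gradient` is hypothetical, and the sketch assumes that the target gradient is the unit eigenvector scaled by the square root of the largest eigenvalue, so that it has the units of a gradient.

```python
import math

def target_gradient(c, gradient_sum):
    """Target gradient V(p) from a symmetric 2x2 contrast form.

    c is ((a, b), (b, d)); gradient_sum is the sum over channels of the
    source gradients at p, used only to fix the sign of the eigenvector.
    """
    (a, b), (_, d) = c
    # Largest eigenvalue of a symmetric 2x2 matrix.
    lam = 0.5 * (a + d) + math.hypot(0.5 * (a - d), b)
    # Unit eigenvector for that eigenvalue via the principal-axis angle.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    ux, uy = math.cos(theta), math.sin(theta)
    # An eigenvector is defined only up to sign; choose the direction that
    # agrees with the summed source gradients.
    if ux * gradient_sum[0] + uy * gradient_sum[1] < 0:
        ux, uy = -ux, -uy
    scale = math.sqrt(max(lam, 0.0))
    return (scale * ux, scale * uy)
```

As a sanity check on the sketch, a single channel with gradient (3, 4) and weight 1 gives the contrast form ((9, 12), (12, 16)), and the function returns that same gradient.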
[0025] Dynamic Range Compression for Halo Reduction: FIG. 3 shows
an exemplary set of graphs illustrating origination of halo effects
in gradient visualization, according to one embodiment. Since
source images 216 are fused in the gradient field, and not directly
by intensity, rendering the resulting images 217 to a display
device with a dynamic range that is less than the dynamic range of
the resulting image(s) 217 may cause visualization issues such as
undesired image "halos". A one-dimensional example is shown in FIG.
3. Assume the dynamic range is [0,1]. f.sub.1 and f.sub.2 are two
source signals, with the same step at different locations. After
their gradients df.sub.1 and df.sub.2 are fused by a max operation
into a fused gradient df, direct reconstruction leads to a result f
exceeding the range [0,1]. The dynamic range limitation then acts as
a hard constraint, causing halos to occur at
sharp jumps in the result f.sub.d. Such halos degrade
visual perception. Thus, in one implementation, fusion
module 212 controls the range of the target gradient 222 for halo
reduction.
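For purposes of exemplary illustration only, the one-dimensional situation of FIG. 3 can be reproduced with two short signals. This is a minimal sketch with hypothetical signal lengths and step locations; the helper names `forward_diff` and `integrate` are assumptions, and the max operation is taken to select the gradient of larger magnitude at each sample.

```python
def forward_diff(signal):
    """Discrete gradient: difference between consecutive samples."""
    return [b - a for a, b in zip(signal, signal[1:])]

def integrate(start, diffs):
    """Direct reconstruction: cumulative sum of the gradient from a start value."""
    out = [start]
    for d in diffs:
        out.append(out[-1] + d)
    return out

# Two source signals in [0, 1] with the same unit step at different locations.
f1 = [0.0] * 3 + [1.0] * 7
f2 = [0.0] * 6 + [1.0] * 4

# Fuse the gradients by keeping, per sample, the one of larger magnitude.
df = [max(a, b, key=abs) for a, b in zip(forward_diff(f1), forward_diff(f2))]

# The fused gradient contains both steps, so the reconstruction climbs twice.
f = integrate(0.0, df)
print(f)
```

The reconstructed signal rises by one unit at each of the two step locations and therefore leaves the range [0,1], which is the condition under which a hard range constraint produces halos.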
[0026] In view of the above, a gradient V may be over-enlarged, and
thus, may result in halo artifacts. Since a large gradient can
still be easily perceived after slight attenuation, and a slight boost
of a small gradient can help improve weak detail, in one
implementation fusion module 212 modifies the target gradient 222
as follows:
V^*(p) = \left( \frac{a}{|V(p)|} \right)^{1-b} V(p)   (8)
This is a two-parameter family of functions. The parameter b is
within (0,1); the parameter controls strength of the attenuation
for large gradients (also the strength of boost for small
gradients). Parameter a determines which gradient magnitudes remain
unchanged (multiplied by a scale factor of 1). Gradients of
magnitude larger than a are attenuated, while gradients of
magnitude smaller than a are slightly magnified. In one
implementation, b is set to 0.8, and a=0.8×mean{|V|},
although in different implementations different parameter values
can be used.
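For purposes of exemplary illustration only, the gradient modification can be sketched as a per-pixel scaling. This is a minimal sketch, not the described implementation: it assumes the modification scales V(p) by (a/|V(p)|) raised to the power 1-b, which matches the behavior described for a and b; the function name `compress_gradient` and the handling of zero gradients are hypothetical.

```python
import math

def compress_gradient(v, a, b=0.8):
    """Scale the gradient v = (vx, vy) by (a / |v|) ** (1 - b)."""
    mag = math.hypot(v[0], v[1])
    if mag == 0.0:
        # Assumption: a zero gradient stays zero (also avoids division by zero).
        return (0.0, 0.0)
    scale = (a / mag) ** (1.0 - b)
    return (v[0] * scale, v[1] * scale)

def magnitude(v):
    return math.hypot(v[0], v[1])

# With a = 5: a gradient of magnitude 5 is unchanged, a larger one is
# attenuated, and a smaller one is boosted.
same = compress_gradient((3.0, 4.0), a=5.0)
large = compress_gradient((30.0, 40.0), a=5.0)
small = compress_gradient((0.3, 0.4), a=5.0)
```

Because the scale factor depends only on the magnitude, the direction of each gradient is preserved; only its length moves toward a.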
[0027] A modified gradient V* may result in halo reduction in
general. Since the strong edges can also be easily observed, the
target V* preserves the salience, and fusion module 212 uses V* as
the final target gradient 222, as now described.
[0028] Reconstruction from target gradient: Given a target gradient
V* (222), a fused result (i.e., a resulting fused image 217) is a
two-dimensional (2D) function g which minimizes the following:
\int_W | \nabla g - V^* |^2 \, dW, \quad g(x, y) \in [0, 255]   (9)
Eq. (9) is the objective function: the goal is to find an
image whose gradient has the least mean square error with respect to
the target gradient V*. Fusion module 212 solves function (9)
iteratively as follows:
g^{t+1/2}(p) = g^t(p) + \frac{1}{4} \left( \Delta g^t(p) - \operatorname{div} V^*(p) \right)
g^{t+1}(p) = \max\left( 0, \min\left( 255, g^{t+1/2}(p) \right) \right)   (10)
Eq. (10) is a classical numerical algorithm to implement the
optimization process. When the iteration stops, the final g(p) is
the resulting fused image 217. For purposes of exemplary
illustration, such a final fused image is shown as a respective
portion of "other program data" 224. In general, operation of (9)
and (10) fuse the information (salient features) from multiple
source images in the computation of final target gradient V*. In
this step, the image is reconstructed from a gradient. Note
that this approach is presented in a general mathematical form, so
it can be applied to an arbitrary number of channels in
multi-channel source images 216.
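For purposes of exemplary illustration only, the iterative reconstruction can be sketched on a small grid. This is a minimal sketch under stated assumptions, not the described implementation: gradients use forward differences and the divergence uses backward differences so that their composition is the five-point Laplacian, values outside the image are treated as replicated border pixels, and the second step of each iteration clamps the update to [0, 255]. The function names, the fixed iteration count, and the choice of initial image are hypothetical.

```python
def gradient(img):
    """Forward differences; zero in the last column (x) and last row (y)."""
    h, w = len(img), len(img[0])
    gx = [[img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
           for x in range(w)] for y in range(h)]
    gy = [[img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
           for x in range(w)] for y in range(h)]
    return gx, gy

def divergence(vx, vy):
    """Backward differences, so divergence(gradient(g)) is the Laplacian of g."""
    h, w = len(vx), len(vx[0])
    return [[vx[y][x] - (vx[y][x - 1] if x > 0 else 0.0)
             + vy[y][x] - (vy[y - 1][x] if y > 0 else 0.0)
             for x in range(w)] for y in range(h)]

def reconstruct(vx, vy, init, iterations=500):
    """Iterate g <- clamp(g + (laplacian(g) - div V) / 4) from the image init."""
    div_v = divergence(vx, vy)
    g = [row[:] for row in init]
    for _ in range(iterations):
        lap = divergence(*gradient(g))
        g = [[max(0.0, min(255.0, g[y][x] + 0.25 * (lap[y][x] - div_v[y][x])))
              for x in range(len(g[0]))] for y in range(len(g))]
    return g
```

A gradient field fixes an image only up to an additive constant, so in this sketch the mean of the initial image determines the overall brightness of the result whenever the clamp is not triggered.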
An Exemplary Procedure
[0029] FIG. 4 shows an exemplary procedure 400 for salience
preserving image fusion, according to one embodiment. For purposes
of discussion, the operations of FIG. 4 are described in reference
to FIG. 2. For instance, in the description, the left-most digit of
a component (computer program modules, devices, data, etc.) or
operation reference number identifies the particular figure in
which the component or operation first appears. For example, with
respect to fusion module 212, the leftmost digit of the reference
number of fusion module 212 is a "2", indicating that fusion module
212 is first presented and described with respect to FIG. 2. In one
implementation, fusion module 212 implements at least a subset of
the operations of procedure 400. For example, although fusion
module 212 may implement operations associated with blocks 402
through 416, in one implementation, a different application
implements the operations of block 416 to render or present the
salience-preserved fused image(s) generated by fusion module
212.
[0030] Referring to FIG. 4, at block 402, for each channel in each
image in a set of multi-channel images 216, adjust the channel to
have the same mean gradient as a maximum mean gradient channel.
Operations of block 404, for each of the multi-channel images,
measure local salience for each pixel in each channel of the
image. Such local salience measurements are shown by salience
measurements/maps 218 of FIG. 2. Operations of block 406, for each
of the multi-channel images, assign a normalized weight to each
pixel in each channel based on respective ones of the local
salience measurements. Operations of block 408, based on the
weights assigned to respective ones of the pixels in each channel
(each of the images), generate an importance weighted contrast
form 226. Operations of block 410 construct a target gradient 222
based on each pixel in each of the multi-channel images 216. This
is accomplished, for each pixel, using Eigen decomposition in view
of the pixel and importance weighted contrast form 226.
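For purposes of exemplary illustration only, the adjustment of block 402 can be sketched as follows. This is a minimal sketch under stated assumptions, not the described implementation: it assumes that the mean gradient of a channel is the mean forward-difference gradient magnitude, and that a channel is adjusted by multiplying its intensities by a single scale factor. The function names `mean_gradient` and `equalize_mean_gradients` are hypothetical.

```python
import math

def mean_gradient(channel):
    """Mean forward-difference gradient magnitude of a 2-D list of intensities."""
    h, w = len(channel), len(channel[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            gx = channel[y][x + 1] - channel[y][x] if x + 1 < w else 0.0
            gy = channel[y + 1][x] - channel[y][x] if y + 1 < h else 0.0
            total += math.hypot(gx, gy)
    return total / (h * w)

def equalize_mean_gradients(channels):
    """Scale every channel so its mean gradient matches the largest one."""
    means = [mean_gradient(c) for c in channels]
    target = max(means)
    out = []
    for c, m in zip(channels, means):
        # Assumption: a completely flat channel (m == 0) is left unchanged.
        k = target / m if m > 0 else 1.0
        out.append([[v * k for v in row] for row in c])
    return out
```

Scaling intensities by a constant scales every gradient by the same constant, which is why a single factor per channel is enough to equalize the mean gradients in this sketch.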
[0031] Operations of block 412 reduce, using dynamic range
compression operations on a pixel-by-pixel basis, the halo effects
associated with the target gradient 222. Operations of block 414
generate a salience-preserved fused image from the target gradient
222. Operations of block 416 present the salience-preserved
fused image to a viewer.
Alternate Implementations
[0032] Color removal operations are similar to image fusion in the
sense that the information from RGB channels is integrated into a
single channel. As a simple example, and in one implementation,
fusion module 212 implements the above described image fusion
operations to perform the task of color removal. For example, in
one implementation, fusion module 212 implements RGB fusion using
the above described techniques to remove color from source images
216. In one implementation, these color removal operations are
responsive to receiving a request from an application 214 to remove
color from a set of multi-channel source images 216.
CONCLUSION
[0033] Although salience preserving image fusion has been described
in language specific to structural features and/or methodological
operations or actions, it is understood that the implementations
defined in the appended claims are not necessarily limited to the
specific features or actions described. Rather, the specific
features and operations discussed above with respect to FIGS. 1(c)
through 4 are disclosed as exemplary forms of implementing the
claimed subject matter.
* * * * *