U.S. patent application number 13/616321 was filed with the patent office on 2012-09-14 and published on 2014-02-20 for data-driven distributionally robust optimization.
This patent application is currently assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. The applicant listed for this patent is Martin Mevissen, Emanuele Ragnoli, Jia Yuan Yu. Invention is credited to Martin Mevissen, Emanuele Ragnoli, Jia Yuan Yu.
Publication Number | 20140052409 |
Application Number | 13/616321 |
Family ID | 50100656 |
Filed Date | 2012-09-14 |
United States Patent Application | 20140052409 |
Kind Code | A1 |
Mevissen; Martin; et al. | February 20, 2014 |
DATA-DRIVEN DISTRIBUTIONALLY ROBUST OPTIMIZATION
Abstract
Embodiments of the disclosure include a method for providing
data-driven distributionally robust optimization. The method
includes receiving a plurality of samples of one or more uncertain
parameters for a complex system and calculating a distribution
uncertainty set for the one or more uncertain parameters. The
method also includes receiving a deterministic problem model
associated with the complex system that includes an objective and
one or more constraints and creating a distributionally robust
counterpart (DRC) model based on the distribution uncertainty set
and the deterministic problem model. The method further includes
formulating the DRC as a generalized problem of moments (GPM),
applying a semi-definite programming (SDP) relaxation to the GPM and
generating an approximation for a globally optimal distributionally
robust solution to the complex system.
Inventors: | Mevissen; Martin (Dublin, IE); Ragnoli; Emanuele (Dublin, IE); Yu; Jia Yuan (Quebec, CA) |
Applicant: |
Name | City | State | Country | Type |
Mevissen; Martin | Dublin | | IE | |
Ragnoli; Emanuele | Dublin | | IE | |
Yu; Jia Yuan | Quebec | | CA | |
Assignee: | INTERNATIONAL BUSINESS MACHINES CORPORATION (Armonk, NY) |
Family ID: | 50100656 |
Appl. No.: | 13/616321 |
Filed: | September 14, 2012 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13587980 | Aug 17, 2012 | |
13616321 | | |
Current U.S. Class: | 702/181 |
Current CPC Class: | G06F 17/11 20130101 |
Class at Publication: | 702/181 |
International Class: | G06F 17/18 20060101; G06F017/18 |
Claims
1. A method for providing data-driven distributionally robust
optimization, the method comprising: receiving a plurality of
samples of one or more uncertain parameters for a complex system;
calculating a distribution uncertainty set for the one or more
uncertain parameters; receiving a deterministic problem model
associated with the complex system that includes an objective and
one or more constraints, wherein the plurality of samples is
described by an unknown distribution; creating a distributionally
robust counterpart (DRC) model based on the distribution
uncertainty set and the deterministic problem model; formulating
the DRC as a generalized problem of moments (GPM); applying a
semi-definite programming (SDP) relaxation to the GPM; and
generating an approximation for a globally optimal distributionally
robust solution to the complex system.
2. The method of claim 1, wherein formulating the DRC as the GPM
comprises: calculating a dual minimization problem of an inner
maximization problem; transforming a feasible set of an inner
minimization problem to match a structure of the feasible set of an
outer minimization problem; and reducing a
minimization-minimization problem to a minimization problem, which
constitutes the GPM.
3. The method of claim 1, wherein formulating the DRC as the GPM
comprises: calculating a dual maximization problem of an inner
minimization problem; transforming a feasible set of an inner
maximization problem to match the structure of the feasible set of
an outer maximization problem; and reducing a
maximization-maximization problem to a maximization problem, which
constitutes the GPM.
4. The method of claim 1, wherein calculating the distribution
uncertainty set for the one or more uncertain parameters is based
on a polynomial estimate of a probability density function.
5. The method of claim 1, wherein calculating the distribution
uncertainty set for the one or more uncertain parameters is based
on statistical estimates for a plurality of moments of the unknown
distribution of the uncertain system parameters up to an arbitrary
order.
6. The method of claim 1, wherein calculating the distribution
uncertainty set for the one or more uncertain parameters is based
on histogram estimates for the unknown distribution of the
uncertain system parameters.
7. The method of claim 1, wherein the distribution uncertainty set
includes a support that is described by one or more multivariate
polynomial inequality constraints.
8. The method of claim 1, wherein the objective is described as
a multivariate polynomial.
9. The method of claim 1, wherein the equality and/or inequality
constraints are described as multivariate polynomials.
10. The method of claim 1, wherein the approximation for a
distributionally robust solution includes a precision level.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] The present application is a continuation of and claims
priority from U.S. patent application Ser. No. 13/587,980 filed on
Aug. 17, 2012, the entire contents of which are incorporated herein
by reference.
BACKGROUND
[0002] The present invention relates to optimization of complex
systems under uncertainty, and more specifically, to data-driven
distributionally robust optimization of complex systems.
[0003] Many of today's complex systems require decision making that
is affected by uncertainty in one or more system parameters, such
as usage or demand. Currently, data relating to the uncertain
aspects of these systems is periodically collected by one or more
sensors or meters. Current robust optimization models only exploit
the collected usage data for support information of distributions
for the uncertain parameters, which often leads to overly
conservative models. Current distributionally robust optimization
systems only exploit the observed data to construct distributional
uncertainty sets consistent with the first two moments of the
observed data and/or handle only restrictive classes of objective
functions and constraints. On the other hand, current stochastic
optimization models require highly accurate knowledge of
distribution of uncertain system parameters.
[0004] Furthermore, in large-scale systems, characterizing the
uncertainty of system parameters based on collected data can
be challenging.
[0005] Optimization models of complex real-world systems often
involve nonlinear functionalities in objective and constraints.
However, current distributionally robust optimization models
cannot take into account broad classes of nonlinearities in objective
and constraints.
SUMMARY
[0006] Embodiments include a method for providing data-driven
distributionally robust optimization. The method includes receiving
a plurality of samples of one or more uncertain parameters for a
complex system, the plurality of samples being described by an
unknown distribution, and calculating a distribution uncertainty
set for the one or more uncertain parameters. The method also
includes receiving a deterministic problem model associated with
the complex system that includes an objective and one or more
constraints and creating a distributionally robust counterpart
(DRC) model based on the distributional uncertainty set and the
deterministic problem model. The method further includes
formulating the DRC as a generalized problem of moments (GPM),
applying a semi-definite programming (SDP) relaxation to the GPM and
generating an approximation for a globally optimal distributionally
robust solution to the complex system.
[0007] Additional features and advantages are realized through the
techniques of the present invention. Other embodiments and aspects
of the invention are described in detail herein and are considered
a part of the claimed invention. For a better understanding of the
invention with the advantages and the features, refer to the
description and to the drawings.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0008] The subject matter which is regarded as the invention is
particularly pointed out and distinctly claimed in the claims at
the conclusion of the specification. The foregoing and other
features and advantages of the invention are apparent from the
following detailed description taken in conjunction with the
accompanying drawings in which:
[0009] FIG. 1 illustrates a block diagram of a computer system for
use in practicing the teachings herein;
[0010] FIG. 2 illustrates a flow diagram of a method for providing
data-driven distributionally robust optimization in accordance with
an embodiment; and
[0011] FIG. 3 illustrates a block diagram of a system for providing
data-driven distributionally robust optimization in accordance with
an embodiment.
DETAILED DESCRIPTION
[0012] In accordance with exemplary embodiments, an optimization
model is provided that is robust with respect to uncertain input
data, but not too conservative due to extreme scenarios, and is
able to handle a variety of nonlinear objectives and
constraints.
[0013] In accordance with exemplary embodiments, systems and
computer program products for deriving approximations of globally
optimal, distributionally robust solutions to optimization problems
with parameter uncertainty in the presence of samples of the
uncertain parameters are provided. In exemplary embodiments, the
system takes into account uncertainty in the input data and employs
convex optimization solvers to provide globally optimal
distributionally robust solutions for a broad class of models for
real-world systems.
[0014] FIG. 1 illustrates a block diagram of a computer system 100
for use in practicing the teachings herein. The methods described
herein can be implemented in hardware, software (e.g., firmware),
or a combination thereof. In an exemplary embodiment, the methods
described herein are implemented in hardware, and may be part of
the microprocessor of a special or general-purpose digital
computer, such as a personal computer, workstation, minicomputer,
or mainframe computer. The computer system 100 therefore includes
general-purpose computer 101.
[0015] In an exemplary embodiment, in terms of hardware
architecture, as shown in FIG. 1, the computer 101 includes a
processor 105, memory 110 coupled to a memory controller 115, and
one or more input and/or output (I/O) devices 140, 145 (or
peripherals) that are communicatively coupled via a local
input/output controller 135. The input/output controller 135 can
be, for example but not limited to, one or more buses or other
wired or wireless connections, as is known in the art. The
input/output controller 135 may have additional elements, which are
omitted for simplicity, such as controllers, buffers (caches),
drivers, repeaters, and receivers, to enable communications.
Further, the local interface may include address, control, and/or
data connections to enable appropriate communications among the
aforementioned components.
[0016] The processor 105 is a hardware device for executing
hardware instructions or software, particularly that stored in
memory 110. The processor 105 can be any custom made or
commercially available processor, a central processing unit (CPU),
an auxiliary processor among several processors associated with the
computer 101, a semiconductor based microprocessor (in the form of
a microchip or chip set), a macroprocessor, or generally any device
for executing instructions. The processor 105 includes a cache 170,
which may include, but is not limited to, an instruction cache to
speed up executable instruction fetch, a data cache to speed up
data fetch and store, and a translation lookaside buffer (TLB) used
to speed up virtual-to-physical address translation for both
executable instructions and data. The cache 170 may be organized as
a hierarchy of more cache levels (L1, L2, etc.).
[0017] The memory 110 can include any one or combination of
volatile memory elements (e.g., random access memory (RAM, such as
DRAM, SRAM, SDRAM, etc.)) and nonvolatile memory elements (e.g.,
ROM, erasable programmable read only memory (EPROM), electronically
erasable programmable read only memory (EEPROM), programmable read
only memory (PROM), tape, compact disc read only memory (CD-ROM),
disk, diskette, cartridge, cassette or the like, etc.). Moreover,
the memory 110 may incorporate electronic, magnetic, optical,
and/or other types of storage media. Note that the memory 110 can
have a distributed architecture, where various components are
situated remote from one another, but can be accessed by the
processor 105.
[0018] The instructions in memory 110 may include one or more
separate programs, each of which comprises an ordered listing of
executable instructions for implementing logical functions. In the
example of FIG. 1, the instructions in the memory 110 include a
suitable operating system (OS) 111. The operating system 111
essentially controls the execution of other computer programs and
provides scheduling, input-output control, file and data
management, memory management, and communication control and
related services.
[0019] In an exemplary embodiment, a conventional keyboard 150 and
mouse 155 can be coupled to the input/output controller 135. Other
I/O devices 140, 145 may include input and output devices, for
example but not limited to a printer, a scanner, a microphone, and
the like. Finally, the I/O devices 140, 145 may
further include devices that communicate both inputs and outputs,
for instance but not limited to, a network interface card (NIC) or
modulator/demodulator (for accessing other files, devices, systems,
or a network), a radio frequency (RF) or other transceiver, a
telephonic interface, a bridge, a router, and the like. The system
100 can further include a display controller 125 coupled to a
display 130. In an exemplary embodiment, the system 100 can further
include a network interface 160 for coupling to a network 165. The
network 165 can be an IP-based network for communication between
the computer 101 and any external server, client and the like via a
broadband connection. The network 165 transmits and receives data
between the computer 101 and external systems. In an exemplary
embodiment, network 165 can be a managed IP network administered by
a service provider. The network 165 may be implemented in a
wireless fashion, e.g., using wireless protocols and technologies,
such as WiFi, WiMax, etc. The network 165 can also be a
packet-switched network such as a local area network, wide area
network, metropolitan area network, Internet network, or other
similar type of network environment. The network 165 may be a fixed
wireless network, a wireless local area network (LAN), a wireless
wide area network (WAN), a personal area network (PAN), a virtual
private network (VPN), intranet or other suitable network system
and includes equipment for receiving and transmitting signals.
[0020] If the computer 101 is a PC, workstation, intelligent device
or the like, the instructions in the memory 110 may further include
a basic input output system (BIOS) (omitted for simplicity). The
BIOS is a set of essential routines that initialize and test
hardware at startup, start the OS 111, and support the transfer of
data among the hardware devices. The BIOS is stored in ROM so that
the BIOS can be executed when the computer 101 is activated. When
the computer 101 is in operation, the processor 105 is configured
to execute instructions stored within the memory 110, to
communicate data to and from the memory 110, and to generally
control operations of the computer 101 pursuant to the
instructions.
[0021] In exemplary embodiments, a generalized problem of moments
(GPM) with polynomial data is an optimization problem of the
form:
$$
\begin{aligned}
\min_{\mu \in \mathcal{M}_+(K)} \quad & \int p(x)\, d\mu(x) \\
\text{s.t.} \quad & \int f_i(x)\, d\mu(x) = b_i, \quad i \in I, \\
& \int f_j(x)\, d\mu(x) \le b_j, \quad j \in J,
\end{aligned}
\tag{1}
$$
where $K = \{x \in \mathbb{R}^n \mid g_1(x) \ge 0, \ldots, g_m(x) \ge 0\}$,
$p$, $f_i$ ($i \in I$), $f_j$ ($j \in J$), $g_j$ ($j = 1, \ldots, m$)
$\in \mathbb{R}[x]$, $I$ and $J$ are finite or infinite sets of
indices, and $\mathcal{M}_+(K)$ is the set of finite Borel measures
supported on $K$. The GPM is an infinite-dimensional linear program.
It can be approximated by Lasserre's hierarchy of SDP relaxations:
$$
\begin{aligned}
\min_{y} \quad & L_y(p) \\
\text{s.t.} \quad & M_d(y) \succeq 0, \\
& M_{d-d_j}(g_j y) \succeq 0, \quad j = 1, \ldots, m, \\
& L_y(f_i) = b_i, \quad i \in I, \\
& L_y(f_j) \le b_j, \quad j \in J, \\
& y_0 \le \text{Const.},
\end{aligned}
\tag{2}
$$
where $d_j := \lceil \deg(g_j)/2 \rceil$ and $d \in \mathbb{N}$ is
the order of a single SDP relaxation. Let $\min(\mathrm{GPM})$ denote
the minimum of (1), $\min(\mathrm{SDP}_d)$ the minimum and $y_d^*$ the
minimizer of (2). If $K$ is compact and its quadratic module is
Archimedean, $\min(\mathrm{SDP}_d) \to \min(\mathrm{GPM})$ for
$d \to \infty$. Moreover, if (1) has a unique measure as minimal
solution, then $y_d^*$ converges to the moment vector of the optimal
measure for (1) as well. In exemplary embodiments, $M_d(y)$ and
$M_{d-d_j}(g_j y)$ are linear combinations of the components
$y_\alpha$ of $y$ with real-valued, symmetric matrices as
multipliers, and $M_d(y) \succeq 0$ stands for a constraint where the
matrix $M_d(y)$ is required to be positive semidefinite. In
exemplary embodiments, $L_y(p)$ is a linear operator that maps a
polynomial $p(x) = \sum_\alpha p_\alpha x^\alpha$ to a linear
combination of the components $y_\alpha$ of $y$ such that
$L_y(p) = \sum_\alpha p_\alpha y_\alpha$. Note, if
$\mu \in \mathcal{M}_+(K)$ is additionally restricted to be a
probability measure, the GPM (1) becomes a Polynomial Optimization
Problem (POP). Thus, in exemplary embodiments, a POP can be
approximated by Lasserre's hierarchy of SDP relaxations (2) as well.
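The two building blocks of the relaxation, the linear operator L_y and the moment matrix M_d(y), can be illustrated for a single variable. The following is a minimal sketch under stated assumptions, not part of the original disclosure: the function names `riesz`, `moment_matrix` and `is_positive_definite` are hypothetical, polynomials are stored as `{exponent: coefficient}` dictionaries, and the Cholesky-based test checks strict positive definiteness, which is slightly stronger than the positive semidefiniteness required in the relaxation.

```python
def riesz(poly, y):
    """L_y(p) = sum over alpha of p_alpha * y_alpha (univariate case)."""
    return sum(coeff * y[alpha] for alpha, coeff in poly.items())


def moment_matrix(y, d):
    """Univariate moment matrix M_d(y): entry (i, j) equals y_{i+j}."""
    return [[y[i + j] for j in range(d + 1)] for i in range(d + 1)]


def is_positive_definite(matrix, tol=1e-12):
    """Try a Cholesky factorisation; it succeeds only for positive definite input."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:  # non-positive pivot: not positive definite
                    return False
                lower[i][j] = s ** 0.5
            else:
                lower[i][j] = s / lower[j][j]
    return True


if __name__ == "__main__":
    # Moments of the Lebesgue measure on [0, 1]: y_k = 1 / (k + 1).
    y = [1.0 / (k + 1) for k in range(5)]
    p = {0: 1.0, 2: 3.0}  # p(x) = 1 + 3 x^2
    print("L_y(p) =", riesz(p, y))
    print("M_2(y) positive definite:", is_positive_definite(moment_matrix(y, 2)))
```

A sequence y that comes from an actual measure always yields a positive semidefinite moment matrix; an SDP solver searches over all y satisfying this necessary condition, which is why (2) is a relaxation of (1).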
[0022] In exemplary embodiments, when no knowledge of the
distribution of an uncertain parameter over the uncertainty set is
assumed, the optimization problem can be represented as:
$$
\min_{x \in X} \max_{\mu \in \mathcal{D}} \mathbb{E}_{\mu(\xi)}[h(x,\xi)]
= \min_{x \in X} \max_{\mu \in \mathcal{D}} \int_S h(x,\xi)\, d\mu(\xi)
= \min_{x \in X} \max_{\mu \in \mathcal{D}} F(x,\mu)
\tag{3}
$$
where $\mathcal{D}$ is some set of finite Borel measures supported on
$S \subset \mathbb{R}^m$, as the distributionally robust counterpart
(DRC) of the problem:
$$
\min_{x \in X} h(x, \xi) \tag{4}
$$
for some given $\xi \in S$. In the data-driven case, $\mathcal{D}$ is
constructed from a given sample $\xi^{(1)}, \ldots, \xi^{(N)}$
of the actual, unknown distribution $\mu^*$ of the uncertain
parameter $\xi$.
[0023] In exemplary embodiments, an uncertainty set is calculated
from a given sample of collected data. In exemplary embodiments,
uncertainty sets can be defined by small deviations around the
statistical moments. The statistical moments of $\mu^*$ are
estimated from the sample by:
$$
m_\alpha^N = \frac{1}{N} \sum_{i=1}^{N} \left(\xi^{(i)}\right)^\alpha,
$$
for all $\alpha \in \mathbb{N}^m$. The uncertainty set
$\mathcal{D}_{d,\epsilon,N}$ can be represented as:
$$
\mathcal{D}_{d,\epsilon,N} = \Bigl\{ \mu \in \mathcal{M}_+(S) \;\Big|\;
\int_S \xi^\alpha\, d\mu(\xi) \le m_\alpha^N + \epsilon_\alpha,\;
\int_S \xi^\alpha\, d\mu(\xi) \ge m_\alpha^N - \epsilon_\alpha
\;\; \forall\, |\alpha| \le d \Bigr\},
$$
for some $d \in \mathbb{N}$ and $\epsilon_\alpha > 0$, where
$\mathcal{M}_+(S)$ is the set of finite Borel measures supported on
the set $S$. $\epsilon$ needs to be chosen such that
$\mu^* \in \mathcal{D}_{d,\epsilon,N}$. Furthermore, it is assumed
that the support $S := \{\xi \mid g_j(\xi) \ge 0\ (j = 1, \ldots, r)\}$
of $\mu^*$, where $g_j \in \mathbb{R}[\xi]$ are multivariate real
polynomials, is compact, and that $h \in \mathbb{R}[x,\xi]$, i.e. $h$
is a multivariate polynomial in $x$ and $\xi$.
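The moment-based uncertainty set is fully described by a lower and an upper bound for each monomial moment up to degree d. The following is a minimal sketch of how those bounds could be computed from samples; the function names are hypothetical, and a single tolerance `eps` is assumed for every multi-index, whereas the text allows a separate tolerance for each one.

```python
from itertools import product


def multi_indices(m, d):
    """All exponent tuples alpha in N^m with |alpha| <= d."""
    return [a for a in product(range(d + 1), repeat=m) if sum(a) <= d]


def monomial(xi, alpha):
    """Evaluate xi^alpha = prod_l xi_l ** alpha_l."""
    value = 1.0
    for component, power in zip(xi, alpha):
        value *= component ** power
    return value


def empirical_moments(samples, d):
    """Sample averages of all monomials of total degree at most d."""
    m = len(samples[0])
    n = len(samples)
    return {a: sum(monomial(xi, a) for xi in samples) / n
            for a in multi_indices(m, d)}


def moment_box(samples, d, eps):
    """Bounds (m_alpha - eps, m_alpha + eps) that define the uncertainty set."""
    return {a: (mom - eps, mom + eps)
            for a, mom in empirical_moments(samples, d).items()}


if __name__ == "__main__":
    data = [(0.0, 1.0), (2.0, 3.0)]  # two samples of a 2-dimensional parameter
    for alpha, bounds in sorted(moment_box(data, d=2, eps=0.1).items()):
        print(alpha, bounds)
```

How to choose the tolerance so that the true distribution lies in the set is a statistical question (for example via concentration bounds) that this sketch does not address.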
[0024] In exemplary embodiments, the distributionally robust
counterpart model can be reformulated as a generalized problem of
moments. Utilizing the dual of the inner maximization problem, (3)
can be re-written as:
$$
\begin{aligned}
\min_{x \in X,\ \lambda^+ \le 0,\ \lambda^- \le 0} \quad &
\sum_\alpha \lambda_\alpha^+ b_\alpha^+ + \sum_\alpha \lambda_\alpha^- b_\alpha^- \\
\text{s.t.} \quad &
\sum_\alpha \lambda_\alpha^+ h_\alpha^+(\xi) + \sum_\alpha \lambda_\alpha^- h_\alpha^-(\xi)
\le h(x,\xi) \quad \forall \xi \in S,
\end{aligned}
\tag{5}
$$
where $h_\alpha^+(\xi) := -\xi^\alpha$,
$h_\alpha^-(\xi) := \xi^\alpha$,
$b_\alpha^+ := -m_\alpha^N - \epsilon_\alpha$ and
$b_\alpha^- := m_\alpha^N - \epsilon_\alpha$,
i.e. $\int_S h_\alpha^+(\xi)\, d\mu(\xi) \ge b_\alpha^+$ and
$\int_S h_\alpha^-(\xi)\, d\mu(\xi) \ge b_\alpha^-$ are the
inequalities in the definition of $\mathcal{D}_{d,\epsilon,N}$.
Assume $h(x,\xi) := \sum_\alpha h_\alpha(\xi)\, x^\alpha$
where $h_\alpha \in \mathbb{R}[\xi]$. Define
$z := (x, \lambda^+, \lambda^-)$ and
$Z := X \times \mathbb{R}_-^{2k}$. Then, (5) can be rewritten as
$$
\begin{aligned}
\min_{z \in Z} \quad & \sum_\alpha \tilde{b}_\alpha z^\alpha \\
\text{s.t.} \quad & \sum_\beta \tilde{h}_\beta(\xi)\, z^\beta \ge 0 \quad \forall \xi \in S,
\end{aligned}
\tag{6}
$$
where $\tilde{b}$ and $\tilde{h}_\beta$ are defined based on $b$,
$h_\alpha^+$, $h_\alpha^-$ and $h$. Then, the polynomial constraint
in (6) can be tightened by introducing $r$ matrix variables $A_j$:
$$
\begin{aligned}
\min_{z \in Z,\ A_j \in \mathbb{S}^{s(d)}} \quad & \sum_\alpha \tilde{b}_\alpha z^\alpha \\
\text{s.t.} \quad & \sum_\beta \tilde{h}_\beta(\xi)\, z^\beta
= \sum_{j=0}^{r} g_j(\xi)\, u_d(\xi)^T A_j\, u_d(\xi) \quad \forall \xi \in \mathbb{R}^m, \\
& A_j \succeq 0 \quad \forall j \in \{0, \ldots, r\},
\end{aligned}
\tag{7}
$$
where $g_0(\xi) := 1$ and $u_d(\xi) = (1, \xi_1, \ldots, \xi_m^d)$ is
the standard basis of $\mathbb{R}[\xi]_d$.
[0025] The set of polynomial equality constraints in (7) can be
rewritten as a set of constraints linear in $z^\beta$ and the
components of $A_j$. Since each of the constraints
$A_j \in \mathbb{S}_+^{s(d)}$ can be written as a finite number
of scalar, polynomial inequality constraints, (7) is equivalent to
a polynomial optimization problem, which can be approximated by a
sequence of semi-definite programming (SDP) relaxations.
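As a small illustration of the certificate used in (7), consider a hypothetical univariate case that does not appear in the original text: the support is the interval [-1, 1], described by a single polynomial g_1, and the polynomial to be certified as nonnegative on that interval is q. With relaxation order d = 0 the basis vector reduces to the constant 1, and both matrix variables are 1-by-1.

```latex
% Assumed toy data (illustrative only):
%   S = \{\xi \in \mathbb{R} \mid g_1(\xi) \ge 0\}, \quad g_1(\xi) = 1 - \xi^2, \quad g_0(\xi) = 1,
%   q(\xi) = 2 - \xi^2, \quad u_0(\xi) = (1).
\begin{aligned}
q(\xi) &= g_0(\xi)\, u_0(\xi)^T A_0\, u_0(\xi) + g_1(\xi)\, u_0(\xi)^T A_1\, u_0(\xi) \\
       &= 1 \cdot A_0 + (1 - \xi^2) \cdot A_1
        \quad \text{with } A_0 = (1) \succeq 0,\ A_1 = (1) \succeq 0 \\
       &= 1 + (1 - \xi^2) = 2 - \xi^2 .
\end{aligned}
```

Because both matrices are positive semidefinite and g_1 is nonnegative on S, the identity proves that q is nonnegative on S. Matching coefficients on both sides gives constraints that are linear in the entries of the matrix variables, which is the structure exploited in the text.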
[0026] In exemplary embodiments, the uncertainty set $\mathcal{D}$
of (3) is a set of Borel measures defined by their densities with
respect to the Lebesgue measure supported on the same set $S$ as
before. The statistical estimate for the density of $\mu^*$ based on
the sample $\xi^{(1)}, \ldots, \xi^{(N)}$ can be denoted by
$f^N \in L^2(S)$. Any one of numerous known methods including, but
not limited to, Kernel Density Estimation or wavelets, can be used
to estimate densities of an unknown measure given a finite sample.
In exemplary embodiments, $f^N$ is a multivariate polynomial, in
order to guarantee that the set $\mathcal{D}_{d,\epsilon,N}$
constructed below is nonempty when $d$ is fixed.
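One way to obtain a polynomial density estimate is an orthogonal-series estimator. The following is a minimal sketch under assumptions that are not in the original: the parameter is univariate, the support is the interval [-1, 1], and Legendre polynomials are used as the basis; the function names are hypothetical. The text does not prescribe this estimator, and the resulting polynomial integrates to one but is not guaranteed to be nonnegative everywhere.

```python
def legendre_values(x, degree):
    """Return [P_0(x), ..., P_degree(x)] using the three-term recurrence."""
    values = [1.0, x]
    for k in range(1, degree):
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return values[: degree + 1]


def fit_polynomial_density(samples, degree):
    """Coefficients c_k of f^N = sum_k c_k P_k on [-1, 1].

    Since the integral of P_k^2 over [-1, 1] is 2 / (2k + 1), the
    projection coefficient is c_k = (2k + 1) / 2 * mean(P_k(xi_i)).
    """
    n = len(samples)
    sums = [0.0] * (degree + 1)
    for xi in samples:
        for k, value in enumerate(legendre_values(xi, degree)):
            sums[k] += value
    return [(2 * k + 1) / 2.0 * s / n for k, s in enumerate(sums)]


def evaluate_density(coeffs, x):
    """Evaluate the fitted polynomial density estimate at x."""
    basis = legendre_values(x, len(coeffs) - 1)
    return sum(c * v for c, v in zip(coeffs, basis))


if __name__ == "__main__":
    data = [-0.8, -0.1, 0.0, 0.2, 0.3, 0.9]
    coeffs = fit_polynomial_density(data, degree=3)
    print("coefficients:", coeffs)
    print("estimate at 0:", evaluate_density(coeffs, 0.0))
```

Only P_0 has a nonzero integral over [-1, 1], so the estimate integrates to 2 * c_0 = 1 regardless of the sample.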
[0027] In an exemplary embodiment, the uncertainty set may be
represented as:
$$
\mathcal{D}_{\epsilon,N} = \Bigl\{ \mu \in \mathcal{M}_+(S) \;\Big|\;
\int_S d\mu(\xi) = \int_S f(\xi)\, d\xi,\;
\int_S \bigl(f(\xi) - f^N(\xi)\bigr)^2 d\xi \le \epsilon \Bigr\}.
$$
[0028] In order to derive a tractable distributionally robust
counterpart model (DRC), we consider the following truncated,
polynomial approximation for $\mathcal{D}_{\epsilon,N}$:
$$
\mathcal{D}_{d,\epsilon,N} = \Bigl\{ f \in \mathbb{R}[\xi]_d \;\Big|\;
f(\xi) \ge 0\ \forall \xi \in S,\;
\int_S \bigl(f(\xi) - f^N(\xi)\bigr)^2 d\xi \le \epsilon \Bigr\},
\tag{8}
$$
i.e. the inner maximization problem is equivalent to:
$$
\begin{aligned}
\max_{f \in \mathbb{R}[\xi]_d} \quad & \sum_\alpha x^\alpha \int_S h_\alpha(\xi)\, f(\xi)\, d\xi \\
\text{s.t.} \quad & f(\xi) \ge 0 \quad \forall \xi \in S, \\
& \int_S \bigl(f(\xi) - f^N(\xi)\bigr)^2 d\xi \le \epsilon.
\end{aligned}
\tag{9}
$$
[0029] For a fixed $x \in X$, (9) is a polynomial optimization
problem. Its dual is a minimization problem involving polynomials,
moment expressions and the closed, semialgebraic set $S$ as well.
Therefore, the DRC (3) can be reformulated as a minimization
problem involving moments and polynomials, which can be
approximated by a converging sequence of SDP relaxations.
[0030] In another exemplary embodiment, the uncertainty set may be
represented as:
$$
\mathcal{D}_{\epsilon,N} = \Bigl\{ \mu \in \mathcal{M}_+(S) \;\Big|\;
\int_S d\mu(\xi) = \int_S f(\xi)\, d\xi,\;
\max_{\xi \in S} |f(\xi) - f^N(\xi)| \le \epsilon \Bigr\}.
$$
In order to derive a tractable DRC, $\mathcal{D}_{\epsilon,N}$ can be
approximated as:
$$
\mathcal{D}_{d,\epsilon,N} = \Bigl\{ f \in \mathbb{R}[\xi]_d \;\Big|\;
f(\xi) \ge 0\ \forall \xi \in S,\;
\max_{\xi \in S} |f(\xi) - f^N(\xi)| \le \epsilon \Bigr\},
\tag{10}
$$
i.e. the inner maximization problem is equivalent to:
$$
\begin{aligned}
\max_{f \in \mathbb{R}[\xi]_d} \quad & \sum_\alpha x^\alpha \int_S h_\alpha(\xi)\, f(\xi)\, d\xi \\
\text{s.t.} \quad & f(\xi) \ge 0 \quad \forall \xi \in S, \\
& -f(\xi) + f^N(\xi) - \epsilon \ge 0 \quad \forall \xi \in S, \\
& f(\xi) - f^N(\xi) + \epsilon \ge 0 \quad \forall \xi \in S.
\end{aligned}
\tag{11}
$$
With $f(\xi) = \sum_\beta f_\beta\, \xi^\beta$, (11) can
be rewritten as:
$$
\begin{aligned}
\max_{(f_\beta)_\beta \in \mathbb{R}^q} \quad & \sum_\beta f_\beta\, h_\beta(x) \\
\text{s.t.} \quad & -\sum_\beta f_\beta\, \xi^\beta \le 0 \quad \forall \xi \in S, \\
& \sum_\beta f_\beta\, \xi^\beta \le f^N(\xi) - \epsilon \quad \forall \xi \in S, \\
& -\sum_\beta f_\beta\, \xi^\beta \le -f^N(\xi) + \epsilon \quad \forall \xi \in S,
\end{aligned}
\tag{12}
$$
where
$h_\beta(x) := \sum_\alpha x^\alpha h_{\alpha,\beta}
:= \sum_\alpha x^\alpha \int_S h_\alpha(\xi)\, \xi^\beta\, d\xi$.
Taking the dual of (12), we are able to reformulate
the min-max problem (3) as the minimization problem:
$$
\begin{aligned}
\min_{x \in X,\ \mu_1, \mu_2, \mu_3 \in \mathcal{M}_+(S)} \quad &
\int_S 0\, d\mu_1(\xi) + \int_S \bigl(f^N(\xi) - \epsilon\bigr)\, d\mu_2(\xi)
+ \int_S \bigl(-f^N(\xi) + \epsilon\bigr)\, d\mu_3(\xi) \\
\text{s.t.} \quad &
-\int_S \xi^\beta\, d\mu_1(\xi) + \int_S \xi^\beta\, d\mu_2(\xi)
- \int_S \xi^\beta\, d\mu_3(\xi) = h_\beta(x) \quad \forall \beta.
\end{aligned}
\tag{13}
$$
[0031] Assuming $X := \{x \mid k_j(x) \ge 0,\ j = 1, \ldots, t\}$ is
compact, where $k_j \in \mathbb{R}[x]$, (13) is equivalent to:
$$
\begin{aligned}
\min_{\nu \in \mathcal{M}_+(X),\ \mu_1, \mu_2, \mu_3 \in \mathcal{M}_+(S)} \quad &
\int_S 0\, d\mu_1(\xi) + \int_S \bigl(f^N(\xi) - \epsilon\bigr)\, d\mu_2(\xi)
+ \int_S \bigl(-f^N(\xi) + \epsilon\bigr)\, d\mu_3(\xi) \\
\text{s.t.} \quad &
-\int_S \xi^\beta\, d\mu_1(\xi) + \int_S \xi^\beta\, d\mu_2(\xi)
- \int_S \xi^\beta\, d\mu_3(\xi) = \int_X h_\beta(x)\, d\nu(x) \quad \forall \beta, \\
& \int_X d\nu(x) = 1.
\end{aligned}
\tag{14}
$$
Problem (14) is a Generalized Problem of Moments (GPM) with
polynomial data, whose minimum can be approximated up to arbitrary
precision by a hierarchy of SDP relaxations, if $X$ and $S$ are
compact and Archimedean. Moreover, in the case that (14) has a unique
minimizer, the sequence of optimal solutions of the hierarchy of SDPs
converges to this minimizer for increasing relaxation order.
[0032] In exemplary embodiments, it can be assumed that samples
$\xi^{(1)}, \ldots, \xi^{(N)}$ are given for the uncertain parameter
$\xi$, which takes values in a given bounded interval
$[A, B] \subset \mathbb{R}$. The interval can be broken into $K$
intervals:
$$
u_0, \ldots, u_{K-1},
$$
such that:
$$
|u_k| = |B - A| / K \quad \text{for all } k = 0, \ldots, K-1.
$$
Let $m_0, \ldots, m_{K-1}$ denote the midpoints of the
respective intervals. The empirical distribution $\hat{F}_{N,K}$
can be defined as:
$$
\hat{F}_{N,K}(k) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\bigl[\xi^{(i)} \in u_k\bigr]
\quad \text{for all } k = 0, \ldots, K-1.
$$
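The histogram construction above can be sketched in a few lines. This is a minimal illustration, assuming equal-width bins that are closed on the left, with the right endpoint B assigned to the last bin; the function name `empirical_distribution` is hypothetical.

```python
def empirical_distribution(samples, a, b, k_bins):
    """Return the bin midpoints m_k and the frequencies F_hat(k), k = 0..K-1."""
    width = (b - a) / k_bins
    counts = [0] * k_bins
    for xi in samples:
        if not a <= xi <= b:
            raise ValueError("sample lies outside [A, B]")
        # The right endpoint B would give index K, so clamp it to the last bin.
        index = min(int((xi - a) / width), k_bins - 1)
        counts[index] += 1
    n = len(samples)
    midpoints = [a + (k + 0.5) * width for k in range(k_bins)]
    return midpoints, [c / n for c in counts]


if __name__ == "__main__":
    mids, freqs = empirical_distribution([0.1, 0.2, 0.6, 1.0], 0.0, 1.0, 2)
    print(mids, freqs)
```

The frequencies sum to one by construction, so they form a probability vector over the K intervals.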
[0033] In exemplary embodiments, the distributionally robust problem
over a histogram-based uncertainty set may be represented by the
following optimization problem:
$$
\min_{x \in X} \max_{\gamma \in U_\epsilon} \int_{\xi \in S} h(x,\xi)\, \gamma(\xi)\, d\xi,
\tag{15}
$$
where:
$$
U_\epsilon = \Bigl\{ \gamma : [A,B] \to \mathbb{R}_+ \;:\;
\int \gamma(\xi)\, d\xi = 1,\;
\Bigl| \gamma(z) - \sum_{k=0}^{K-1} \hat{F}_{N,K}(k)\, \mathbf{1}[z \in u_k] \Bigr| \le \epsilon
\ \text{ for all } z \Bigr\}.
$$
Let $d = (d_0, \ldots, d_{K-1})$ denote a vector in $\mathbb{R}^K$.
The optimization problem (15) can be approximated by the following:
$$
\min_{x \in X} \max_{d \in W_\epsilon} \sum_{k=0}^{K-1} h(x, m_k)\, d_k,
\tag{16}
$$
where the uncertainty set is:
$$
W_\epsilon = \Bigl\{ d \in \Delta_K \;:\;
\bigl| d_k - \hat{F}_{N,K}(k) \bigr| \le \epsilon\, \frac{B - A}{K}
\ \text{ for all } k \Bigr\}.
$$
[0034] In exemplary embodiments, the optimization problem can be
reformulated as a polynomial optimization problem. Observe that the
inner maximization of (16) can be written as:
$$
\begin{aligned}
\max_{d \in \mathbb{R}^K} \quad & \sum_{k=0}^{K-1} h(x, m_k)\, d_k \\
\text{s.t.} \quad & d_k - \hat{F}_{N,K}(k) \le \epsilon, \quad k = 0, \ldots, K-1; \\
& \hat{F}_{N,K}(k) - d_k \le \epsilon, \quad k = 0, \ldots, K-1; \\
& \sum_{k=0}^{K-1} d_k \le 1; \\
& d_j \ge 0, \quad j = 0, \ldots, K-1.
\end{aligned}
\tag{M1}
$$
The dual of the maximization (M1) is:
$$
\begin{aligned}
\min_{y \in \mathbb{R}^{2K+1}} \quad &
\sum_{k=0}^{K-1} y_k \bigl(\epsilon + \hat{F}_{N,K}(k)\bigr)
+ \sum_{\ell=K}^{2K-1} y_\ell \bigl(\epsilon - \hat{F}_{N,K}(\ell - K)\bigr) + y_{2K} \\
\text{s.t.} \quad & y_k - y_{K+k} + y_{2K} \ge h(x, m_k), \quad k = 0, \ldots, K-1, \\
& y_j \ge 0, \quad j = 0, \ldots, 2K.
\end{aligned}
$$
By the Duality Theorem for linear programs, the primal and dual
have the same optimal value; hence, the optimization (16) can be
written as:
$$
\begin{aligned}
\min_{x \in X,\ y \in \mathbb{R}^{2K+1}} \quad &
\sum_{k=0}^{K-1} y_k \bigl(\epsilon + \hat{F}_{N,K}(k)\bigr)
+ \sum_{\ell=K}^{2K-1} y_\ell \bigl(\epsilon - \hat{F}_{N,K}(\ell - K)\bigr) + y_{2K} \\
\text{s.t.} \quad & y_k - y_{K+k} + y_{2K} \ge h(x, m_k), \quad k = 0, \ldots, K-1, \\
& y_j \ge 0, \quad j = 0, \ldots, 2K.
\end{aligned}
\tag{A1}
$$
(A1) is a polynomial optimization problem (POP) of dimension
$n + 2K + 1$; its degree coincides with the degree of $h$.
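For a fixed decision x, the inner maximization (M1) is a linear program with box constraints and one budget constraint, so its optimal value can also be computed by a greedy fill. The following sketch uses that swapped-in technique instead of the dual, and it replaces the SDP relaxation of the outer minimization with a plain search over a finite list of candidate decisions. Both simplifications, the function names, and the assumption that the frequencies sum to at most one are illustrative and not part of the original method.

```python
def worst_case_expectation(costs, freqs, eps):
    """Greedy solution of (M1): maximise sum_k costs[k] * d[k] subject to
    |d[k] - freqs[k]| <= eps, d[k] >= 0 and sum_k d[k] <= 1."""
    lower = [max(0.0, f - eps) for f in freqs]
    upper = [f + eps for f in freqs]
    d = lower[:]                      # start from the smallest feasible masses
    budget = 1.0 - sum(lower)         # mass that may still be distributed
    # Give the remaining mass to the bins with the largest positive cost first.
    for k in sorted(range(len(costs)), key=lambda i: costs[i], reverse=True):
        if costs[k] <= 0.0 or budget <= 0.0:
            break
        step = min(upper[k] - lower[k], budget)
        d[k] += step
        budget -= step
    return sum(c * dk for c, dk in zip(costs, d)), d


def robust_decision(h, candidates, midpoints, freqs, eps):
    """Pick the candidate x with the smallest worst-case expected cost."""
    def worst(x):
        return worst_case_expectation([h(x, m) for m in midpoints], freqs, eps)[0]
    return min(candidates, key=worst)


if __name__ == "__main__":
    cost = lambda x, xi: (x - xi) ** 2   # hypothetical polynomial objective
    best = robust_decision(cost, [0.0, 0.5, 1.0], [0.25, 0.75], [0.5, 0.5], 0.1)
    print("robust decision:", best)
```

The greedy step is optimal here because every unit of probability mass has the same weight in the budget constraint, so moving mass to the bin with the highest cost can never reduce the objective. For a continuous decision set, the dual reformulation (A1) and its SDP relaxations described in the text would take the place of the candidate search.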
[0035] If $h$ is a multivariate polynomial in the arguments $x$ and
$\xi$, then the optimization problem:
$$
\min_{x \in X} \max_{d \in W_\epsilon} \sum_{k=0}^{K-1} h(x, m_k)\, d_k,
\tag{17}
$$
can be approximated by a sequence of SDP relaxations.
[0036] In exemplary embodiments, $\xi$ denotes the uncertain
parameter, which is a random variable that takes values in a set
$S \subset [A,B]^d \subset \mathbb{R}^d$, and
$\xi^{(1)}, \ldots, \xi^{(N)}$ denotes a sequence of random variables
with the same probability distribution as $\xi$. The set $[A,B]^d$
can be partitioned into a regular grid of $K^d$ hypercubes of equal
volume:
$$
\{ v_a : a \in [K]^d \}.
$$
Let $m_a$ denote the center of the hypercube $v_a$ for every $a$.
The partition of $[A,B]$ can be defined as $K$ intervals of
equal length:
$$
u_0, \ldots, u_{K-1}.
$$
Let $\xi_l^{(i)}$ denote the $l$-th component of the sample
$\xi^{(i)}$. For every $i$ and every pair $l \ne l'$, the random
variables $\xi_l^{(i)}$ and $\xi_{l'}^{(i)}$ are independent. For
fixed $N$ and $K$, the marginal empirical frequencies $\hat{F}_l$
can be defined as:
$$
\hat{F}_l(k) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\bigl[\xi_l^{(i)} \in u_k\bigr]
\quad \text{for all } k = 0, \ldots, K-1 \text{ and } l = 1, \ldots, d.
$$
Let $a_1, \ldots, a_d$ denote the components of
$a \in [K]^d$. The joint empirical frequencies can be defined
as:
$$
\hat{G}(a) = \prod_{l=1}^{d} \hat{F}_l(a_l) \quad \text{for all } a \in [K]^d.
$$
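Under the independence assumption, the joint frequency of a grid cell is the product of the marginal frequencies of its coordinates. The following is a minimal sketch; the function names are hypothetical, all coordinates are assumed to share the same interval [A, B] and the same number K of bins, and cells are indexed by tuples with entries in 0..K-1.

```python
from itertools import product


def marginal_frequencies(samples, a, b, k_bins):
    """F_hat_l(k) for every coordinate l and bin k, as a list of lists."""
    dim = len(samples[0])
    width = (b - a) / k_bins
    n = len(samples)
    freqs = [[0.0] * k_bins for _ in range(dim)]
    for xi in samples:
        for l, component in enumerate(xi):
            # Clamp so that the right endpoint B falls into the last bin.
            index = min(int((component - a) / width), k_bins - 1)
            freqs[l][index] += 1.0 / n
    return freqs


def joint_frequencies(marginals):
    """G_hat(cell) = product over l of F_hat_l(cell[l]) for every grid cell."""
    k_bins = len(marginals[0])
    joint = {}
    for cell in product(range(k_bins), repeat=len(marginals)):
        value = 1.0
        for l, k in enumerate(cell):
            value *= marginals[l][k]
        joint[cell] = value
    return joint


if __name__ == "__main__":
    data = [(0.1, 0.9), (0.2, 0.8), (0.7, 0.6), (0.9, 0.1)]
    g_hat = joint_frequencies(marginal_frequencies(data, 0.0, 1.0, 2))
    for cell, value in sorted(g_hat.items()):
        print(cell, value)
```

The product form needs only d histograms with K bins each, rather than counts for all K^d cells, which is what makes the grid usable when the dimension grows; the price is the independence assumption stated in the text.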
[0037] Consider the robust optimization problem with an uncertainty
set centered on the empirical density:
$$
\min_{x \in X} \max_{\gamma \in U_\epsilon} \int_{\xi \in S} h(x,\xi)\, \gamma(\xi)\, d\xi,
\tag{18}
$$
where:
$$
U_\epsilon = \Bigl\{ \gamma : S \to \mathbb{R}_+ \;:\;
\int \gamma(z)\, dz = 1,\;
\Bigl| \gamma(z) - \sum_{a \in [K]^d} \hat{G}(a)\, \mathbf{1}[z \in v_a] \Bigr| \le \epsilon
\ \text{ for all } z \Bigr\}.
$$
The optimization problem (18) can be approximated by the
following:
$$
\min_{x \in X} \max_{p \in W_\epsilon} \sum_{a \in [K]^d} h(x, m_a)\, p(a),
\tag{19}
$$
where the uncertainty set is:
$$
W_\epsilon = \Bigl\{ p : [K]^d \to \mathbb{R}_+ \;:\;
\sum_{a} p(a) \le 1,\; |p(a) - \hat{G}(a)| \le \epsilon \ \text{ for all } a \Bigr\}.
$$
[0038] In exemplary embodiments, the optimization problem can be
reformulated as a polynomial optimization problem. With the inner
maximization of (19) written as a linear program in $p$, the problem
reads:
$$
\begin{aligned}
\min_{x} \max_{p} \quad & \sum_{a \in [K]^d} h(x, m_a)\, p(a) \\
\text{s.t.} \quad & p(a) - \hat{G}(a) \le \epsilon, && \forall a \in [K]^d; \\
& \hat{G}(a) - p(a) \le \epsilon, && \forall a \in [K]^d; \\
& \sum_{a \in [K]^d} p(a) \le 1; \\
& p(a) \ge 0, && \forall a \in [K]^d.
\end{aligned} \qquad (M2)
$$
The dual of the maximization (M2) is:
$$
\begin{aligned}
\min_{x} \min_{y, y', y''} \quad & \sum_{a \in [K]^d} y_a \big( \epsilon + \hat{G}(a) \big) + \sum_{a' \in [K]^d} y'_{a'} \big( \epsilon - \hat{G}(a') \big) + y'' \\
\text{s.t.} \quad & y_a - y'_a + y'' \ge h(x, m_a), && \forall a \in [K]^d, \\
& y_a \ge 0,\ y'_a \ge 0,\ y'' \ge 0, && \forall a \in [K]^d.
\end{aligned}
$$
By the Duality Theorem for linear programs, the primal and dual
have the same optimal value. Problem (19) therefore reduces to a
single minimization over $(x, y, y', y'')$, which is a polynomial
optimization problem whenever $h$ is polynomial in $x$.
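For a fixed decision x, the inner problem (M2) is a small linear program, so the duality statement can be checked numerically. The sketch below is an illustration under stated assumptions, not the method of the disclosure. It solves the primal with a greedy rule, which is valid here because the constraints are box bounds on each p(a) plus a single budget constraint. It evaluates the dual after eliminating y and y' in closed form for each value of y''. The names `worst_case_primal` and `worst_case_dual` and all numbers are hypothetical.

```python
def worst_case_primal(cost, G, eps):
    """max sum_a cost[a]*p[a]  s.t. |p[a]-G[a]| <= eps, sum p <= 1, p >= 0."""
    lo = [max(g - eps, 0.0) for g in G]   # lower bound on p(a)
    hi = [g + eps for g in G]             # upper bound on p(a)
    p = lo[:]                             # start from the lowest feasible mass
    budget = 1.0 - sum(lo)                # mass still allowed by sum p <= 1
    # Raise the most expensive cells first; cells with cost <= 0 stay at lo.
    for a in sorted(range(len(G)), key=lambda a: -cost[a]):
        if cost[a] <= 0 or budget <= 0:
            break
        step = min(hi[a] - lo[a], budget)
        p[a] += step
        budget -= step
    return sum(c * pa for c, pa in zip(cost, p)), p


def worst_case_dual(cost, G, eps):
    """Dual optimal value, minimizing over y'' with y and y' eliminated."""
    lo = [max(g - eps, 0.0) for g in G]
    hi = [g + eps for g in G]

    def dual_objective(t):  # t plays the role of y''
        # Optimal y_a = max(c - t, 0); y'_a = max(t - c, 0) only where
        # eps - G(a) < 0, which contributes -lo[a] * max(t - c, 0).
        return t + sum(h * max(c - t, 0.0) - l * max(t - c, 0.0)
                       for c, l, h in zip(cost, lo, hi))

    # The objective is convex and piecewise linear in t, so its minimum over
    # t >= 0 is attained at t = 0 or at a breakpoint t = cost[a] > 0.
    candidates = [0.0] + [c for c in cost if c > 0]
    return min(dual_objective(t) for t in candidates)


# Hypothetical data: three cells, with cost[a] = h(x, m_a) at a fixed x.
cost = [1.0, 4.0, 2.0]
G = [0.5, 0.3, 0.2]
primal_value, p_star = worst_case_primal(cost, G, eps=0.1)
dual_value = worst_case_dual(cost, G, eps=0.1)
```

On this toy instance both values equal 2.4 up to rounding, with the worst-case masses shifted toward the cell with the largest cost.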
[0039] Referring now to FIG. 2, a flow diagram illustrating a
method 200 for providing data-driven distributionally robust
optimization in accordance with an embodiment is shown. As shown at
block 202, the method 200 includes receiving samples of uncertain
parameters for a system. In exemplary embodiments, the samples may
be received from one or more sensors or meters in the system. Next,
as shown at block 204, the method 200 includes calculating a
distribution uncertainty set using statistical tools based on
polynomial probability density functions, moments up to arbitrary
order, or histogram estimates. In exemplary embodiments, the
distributional uncertainty set may be described by polynomial
inequality constraints. As shown at block 206, the method 200 also
includes receiving a deterministic problem model that includes an
objective and one or more constraints. In exemplary embodiments,
the objective and the equality and inequality constraints may be
described as multivariate real-valued polynomials. As shown at
block 208, the method 200 includes creating a distributionally
robust counterpart (DRC) model based on the distributional
uncertainty set and the deterministic problem model. Next, as shown
at block 210, the method 200 includes formulating the DRC as a
Generalized Problem of Moments (GPM) with polynomial data. As shown
at block 212, a Semi-Definite Programming (SDP) relaxation is
applied to the GPM. The method 200 concludes at block 214 by
generating an approximation for a distributionally robust
solution.
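The flow of method 200 can be illustrated end to end on a one-dimensional toy problem. The sketch below is a simplification and not the disclosed method: it uses a histogram-based uncertainty set for block 204, and it replaces the GPM formulation and SDP relaxation of blocks 210 and 212 with a brute-force search over a grid of candidate decisions. The function names, the cost h(x, xi) = (x - xi)^2, the sample values, and the radius eps are assumptions chosen for the example.

```python
def histogram(samples, A, B, K):
    """Block 204: empirical bin frequencies on K equal bins of [A, B]."""
    counts = [0] * K
    for s in samples:
        counts[min(int((s - A) / (B - A) * K), K - 1)] += 1
    return [c / len(samples) for c in counts]


def worst_case_expectation(cost, G, eps):
    """Inner maximization over {p >= 0, sum p <= 1, |p - G| <= eps}."""
    p = [max(g - eps, 0.0) for g in G]
    budget = 1.0 - sum(p)
    # Greedy: move the remaining mass to the bins with the largest cost.
    for a in sorted(range(len(G)), key=lambda a: -cost[a]):
        if cost[a] <= 0 or budget <= 0:
            break
        step = min(G[a] + eps - p[a], budget)
        p[a] += step
        budget -= step
    return sum(c * pa for c, pa in zip(cost, p))


def robust_solution(samples, h, x_grid, A, B, K, eps):
    """Blocks 208-214, with grid search standing in for the GPM/SDP step."""
    G = histogram(samples, A, B, K)
    centers = [A + (k + 0.5) * (B - A) / K for k in range(K)]
    return min(x_grid, key=lambda x: worst_case_expectation(
        [h(x, m) for m in centers], G, eps))


def h(x, xi):
    """Block 206: toy deterministic objective (an assumption)."""
    return (x - xi) ** 2


samples = [0.1, 0.15, 0.2, 0.3, 0.9]   # block 202: toy sensor readings
x_grid = [i / 20 for i in range(21)]
x_nominal = robust_solution(samples, h, x_grid, 0.0, 1.0, K=4, eps=0.0)
x_robust = robust_solution(samples, h, x_grid, 0.0, 1.0, K=4, eps=0.2)
```

With eps = 0 the search reduces to minimizing the expected cost under the empirical histogram. With eps = 0.2 the adversary can move probability mass toward the outlying bin, and the robust decision shifts in that direction.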
[0040] In exemplary embodiments, the approximation for a
distributionally robust solution includes a precision level that
can be evaluated after the approximation for the distributionally
robust solution is created. In exemplary embodiments, the
distributional uncertainty set includes a support that is described
by one or more multivariate polynomial inequality constraints.
[0041] Referring now to FIG. 3, a block diagram of a system 300 for
providing data-driven distributionally robust optimization is
shown. As illustrated, the system 300 includes a complex system 310
and an optimization system 304, which may be a computer system
similar to the one shown in FIG. 1. The complex system 310 includes
one or more sensors 312 that are configured to monitor one or more
parameters of the complex system 310. In addition, the complex
system 310 includes one or more controls 314 that are configured to
control one or more operational characteristics of the complex
system 310. In exemplary embodiments, the optimization system 304
receives data samples from the one or more sensors 312 and
calculates an approximation for a distributionally robust solution.
Based on the approximation for the distributionally robust
solution, the controls 314 of the complex system 310 are
configured to optimize the operation of the complex system 310.
[0042] In an exemplary embodiment, the complex system 310 may be a
water distribution network characterized by a connected graph
$G(N, E)$, where $N$ is the set of nodes and $E$ is the set of pipes
connecting the nodes. In addition, $p_i$ denotes the pressure,
$e_i$ the elevation, and $d_i$ the demand at node $i \in N$;
$q_{i,j}$ denotes the flow from $i$ to $j$; and $hl_{i,j}$ denotes
the headloss caused by friction in case of flow from $i$ to $j$ for
$(i,j) \in E$. The optimization goal is to minimize the overall
pressure in the water distribution network while adhering to the
mass and energy conservation laws for flow and pressure:
$$
\begin{aligned}
\min \quad & \sum_{i \in N} p_i + \sum_{j \in N} \Big( d_j - \sum_{k \neq j} q_{k,j} + \sum_{l \neq j} q_{j,l} \Big)^2 \\
\text{s.t.} \quad & p_{\min} \le p_i \le p_{\max} && \forall i \in N, \\
& q_{\min} \le q_{i,j} \le q_{\max} && \forall (i,j) \in E, \\
& q_{i,j} \big( p_j + e_j - p_i - e_i + hl_{i,j}(q_{i,j}) \big) \le 0 && \forall (i,j) \in E, \\
& p_j + e_j - p_i - e_i + hl_{i,j}(q_{i,j}) \ge 0 && \forall (i,j) \in E.
\end{aligned} \qquad (20)
$$
[0043] Assume that the headloss $hl_{i,j}$ is a quadratic function
of the flow $q_{i,j}$ and that the vector of demands
$d = (d_1, \ldots, d_{|N|})$ is not known exactly and is therefore
affected by uncertainty. A sample $d^{(1)}, \ldots, d^{(K)}$ of
measurements of the demands at $K$ discrete, equidistant time
points is given. Let $p = (p_1, \ldots, p_{|N|}) \in \mathbb{R}^{|N|}$
and $q = (q_{1,2}, q_{2,1}, \ldots) \in \mathbb{R}^{2|E|}$, and define
$$h(p,q,d) := \sum_{i \in N} p_i + \sum_{j \in N} \Big( d_j - \sum_{k \neq j} q_{k,j} + \sum_{l \neq j} q_{j,l} \Big)^2,$$
$$
\begin{aligned}
X := \big\{ (p,q) \in \mathbb{R}^{|N| + 2|E|} \;\big|\; & p_{\min} \le p_i \le p_{\max}\ \forall i,\quad q_{\min} \le q_{i,j} \le q_{\max}, \\
& q_{i,j} \big( p_j + e_j - p_i - e_i + hl_{i,j}(q_{i,j}) \big) \le 0, \\
& p_j + e_j - p_i - e_i + hl_{i,j}(q_{i,j}) \ge 0\ \ \forall (i,j) \big\}.
\end{aligned}
$$
Problem (20) is then equivalent to:
$$\min_{(p,q) \in X} h(p,q,d) \qquad (21)$$
for a given demand profile $d$, i.e. it is of the form (4). In
addition to the sampled demand, lower and upper bounds for demand
at each node are given. Therefore,
$$d \in S := \big\{ \tilde{d} \in \mathbb{R}^{|N|} \;\big|\; d_i^{\min} \le \tilde{d}_i \le d_i^{\max} \big\}.$$
The uncertainty set can be defined as:
$$\mathcal{D}_{t,\epsilon,K} = \big\{ f \in \mathbb{R}[d]_t \;\big|\; f(d) \ge 0\ \forall d \in S,\ \max_{d \in S} |f(d) - f^K(d)| \le \epsilon \big\}. \qquad (22)$$
The distributionally robust counterpart of (20) can be represented
as:
$$\min_{(p,q) \in X} \ \max_{f \in \mathcal{D}_{t,\epsilon,K}} \int_S h(p,q,d)\, f(d)\, \mathrm{d}d,$$
i.e. it falls into the class (3).
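The objective h(p, q, d), the constraints defining X, and the demand box S can be evaluated directly for a small network. The following sketch is illustrative only. It assumes a headloss of the form hl(q) = c * q**2 with a single hypothetical coefficient `c` for all pipes, uniform bounds, and a two-node network in which node 1 supplies node 2 (so its demand is negative). The function names and all numbers are assumptions made for the example.

```python
def objective(p, q, d):
    """h(p, q, d) = sum_i p_i + sum_j (d_j - inflow_j + outflow_j)^2.

    p: dict node -> pressure, d: dict node -> demand,
    q: dict (i, j) -> flow from node i to node j.
    """
    total = sum(p.values())
    for j in p:
        inflow = sum(f for (k, t), f in q.items() if t == j and k != j)
        outflow = sum(f for (s, l), f in q.items() if s == j and l != j)
        total += (d[j] - inflow + outflow) ** 2
    return total


def in_feasible_set(p, q, e, c, p_bounds, q_bounds):
    """Check the constraints defining X, assuming hl_ij(q) = c * q**2."""
    p_min, p_max = p_bounds
    q_min, q_max = q_bounds
    if not all(p_min <= v <= p_max for v in p.values()):
        return False
    for (i, j), flow in q.items():
        gap = p[j] + e[j] - p[i] - e[i] + c * flow ** 2
        if not (q_min <= flow <= q_max and gap >= 0 and flow * gap <= 0):
            return False
    return True


def in_demand_box(d, d_min, d_max):
    """Membership test for S = {d : d_min_i <= d_i <= d_max_i for all i}."""
    return all(d_min[i] <= d[i] <= d_max[i] for i in d)


# Hypothetical two-node network: node 1 feeds node 2 through one pipe,
# modelled as the two directed flows q[(1, 2)] and q[(2, 1)].
p = {1: 3.0, 2: 2.0}
e = {1: 0.0, 2: 0.0}
q = {(1, 2): 1.0, (2, 1): 0.0}
d = {1: -1.0, 2: 1.0}

value = objective(p, q, d)
feasible = in_feasible_set(p, q, e, c=1.0,
                           p_bounds=(0.0, 10.0), q_bounds=(0.0, 5.0))
inside = in_demand_box(d, d_min={1: -2.0, 2: 0.0}, d_max={1: 0.0, 2: 2.0})
```

In this example the mass balance holds exactly at both nodes, so the objective reduces to the sum of the pressures. The pair of inequality constraints acts as a complementarity condition: on the pipe carrying flow, the pressure drop equals the headloss.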
[0044] As will be appreciated by one skilled in the art, aspects of
the present invention may be embodied as a system, method or
computer program product. Accordingly, aspects of the present
invention may take the form of an entirely hardware embodiment, an
entirely software embodiment (including firmware, resident
software, micro-code, etc.) or an embodiment combining software and
hardware aspects that may all generally be referred to herein as a
"circuit," "module" or "system." Furthermore, aspects of the
present invention may take the form of a computer program product
embodied in one or more computer readable medium(s) having computer
readable program code embodied thereon.
[0045] Any combination of one or more computer readable medium(s)
may be utilized. The computer readable medium may be a computer
readable signal medium or a computer readable storage medium. A
computer readable storage medium may be, for example, but not
limited to, an electronic, magnetic, optical, electromagnetic,
infrared, or semiconductor system, apparatus, or device, or any
suitable combination of the foregoing. More specific examples (a
non-exhaustive list) of the computer readable storage medium would
include the following: an electrical connection having one or more
wires, a portable computer diskette, a hard disk, a random access
memory (RAM), a read-only memory (ROM), an erasable programmable
read-only memory (EPROM or Flash memory), an optical fiber, a
portable compact disc read-only memory (CD-ROM), an optical storage
device, a magnetic storage device, or any suitable combination of
the foregoing. In the context of this document, a computer readable
storage medium may be any tangible medium that can contain, or
store a program for use by or in connection with an instruction
execution system, apparatus, or device.
[0046] A computer readable signal medium may include a propagated
data signal with computer readable program code embodied therein,
for example, in baseband or as part of a carrier wave. Such a
propagated signal may take any of a variety of forms, including,
but not limited to, electro-magnetic, optical, or any suitable
combination thereof. A computer readable signal medium may be any
computer readable medium that is not a computer readable storage
medium and that can communicate, propagate, or transport a program
for use by or in connection with an instruction execution system,
apparatus, or device.
[0047] Program code embodied on a computer readable medium may be
transmitted using any appropriate medium, including but not limited
to wireless, wireline, optical fiber cable, RF, etc., or any
suitable combination of the foregoing.
[0048] Computer program code for carrying out operations for
aspects of the present invention may be written in any combination
of one or more programming languages, including an object oriented
programming language such as Java, Smalltalk, C++ or the like and
conventional procedural programming languages, such as the "C"
programming language or similar programming languages. The program
code may execute entirely on the user's computer, partly on the
user's computer, as a stand-alone software package, partly on the
user's computer and partly on a remote computer or entirely on the
remote computer or server. In the latter scenario, the remote
computer may be connected to the user's computer through any type
of network, including a local area network (LAN) or a wide area
network (WAN), or the connection may be made to an external
computer (for example, through the Internet using an Internet
Service Provider).
[0049] Aspects of the present invention are described below with
reference to flowchart illustrations and/or block diagrams of
methods, apparatus (systems) and computer program products
according to embodiments of the invention. It will be understood
that each block of the flowchart illustrations and/or block
diagrams, and combinations of blocks in the flowchart illustrations
and/or block diagrams, can be implemented by computer program
instructions. These computer program instructions may be provided
to a processor of a general purpose computer, special purpose
computer, or other programmable data processing apparatus to
produce a machine, such that the instructions, which execute via
the processor of the computer or other programmable data processing
apparatus, create means for implementing the functions/acts
specified in the flowchart and/or block diagram block or
blocks.
[0050] These computer program instructions may also be stored in a
computer readable medium that can direct a computer, other
programmable data processing apparatus, or other devices to
function in a particular manner, such that the instructions stored
in the computer readable medium produce an article of manufacture
including instructions which implement the function/act specified
in the flowchart and/or block diagram block or blocks.
[0051] The computer program instructions may also be loaded onto a
computer, other programmable data processing apparatus, or other
devices to cause a series of operational steps to be performed on
the computer, other programmable apparatus or other devices to
produce a computer implemented process such that the instructions
which execute on the computer or other programmable apparatus
provide processes for implementing the functions/acts specified in
the flowchart and/or block diagram block or blocks.
[0052] The flowchart and block diagrams in the Figures illustrate
the architecture, functionality, and operation of possible
implementations of systems, methods and computer program products
according to various embodiments of the present invention. In this
regard, each block in the flowchart or block diagrams may represent
a module, segment, or portion of code, which comprises one or more
executable instructions for implementing the specified logical
function(s). It should also be noted that, in some alternative
implementations, the functions noted in the block may occur out of
the order noted in the figures. For example, two blocks shown in
succession may, in fact, be executed substantially concurrently, or
the blocks may sometimes be executed in the reverse order,
depending upon the functionality involved. It will also be noted
that each block of the block diagrams and/or flowchart
illustration, and combinations of blocks in the block diagrams
and/or flowchart illustration, can be implemented by special
purpose hardware-based systems that perform the specified functions
or acts, or combinations of special purpose hardware and computer
instructions.
[0053] The terminology used herein is for the purpose of describing
particular embodiments only and is not intended to be limiting of
the invention. As used herein, the singular forms "a", "an" and
"the" are intended to include the plural forms as well, unless the
context clearly indicates otherwise. It will be further understood
that the terms "comprises" and/or "comprising," when used in this
specification, specify the presence of stated features, integers,
steps, operations, elements, and/or components, but do not preclude
the presence or addition of one or more other features, integers,
steps, operations, elements, components, and/or groups thereof.
[0054] The corresponding structures, materials, acts, and
equivalents of all means or step plus function elements in the
claims below are intended to include any structure, material, or
act for performing the function in combination with other claimed
elements as specifically claimed. The description of the present
invention has been presented for purposes of illustration and
description, but is not intended to be exhaustive or limited to the
invention in the form disclosed. Many modifications and variations
will be apparent to those of ordinary skill in the art without
departing from the scope and spirit of the invention. The
embodiment was chosen and described in order to best explain the
principles of the invention and the practical application, and to
enable others of ordinary skill in the art to understand the
invention for various embodiments with various modifications as are
suited to the particular use contemplated.
[0055] The flow diagrams depicted herein are just one example.
There may be many variations to this diagram or the steps (or
operations) described therein without departing from the spirit of
the invention. For instance, the steps may be performed in a
differing order or steps may be added, deleted or modified. All of
these variations are considered a part of the claimed
invention.
[0056] While the preferred embodiment of the invention has been
described, it will be understood that those skilled in the art,
both now and in the future, may make various improvements and
enhancements which fall within the scope of the claims which
follow. These claims should be construed to maintain the proper
protection for the invention first described.
* * * * *