U.S. patent application number 11/376322 was filed with the patent office on 2006-03-16 and published on 2007-09-20 for system and method for optimizing data in value-based storage system.
This patent application is currently assigned to International Business Machines Corporation. Invention is credited to Nikhil Bansal, Frederick Douglis, Lisa Karen Fleischer, Kirsten Weale Hildrum, Akshay Kumar Reddy Katta, John Davis Palmer, Elizabeth Suzanne Richards, David Tao, William Harold Tetzlaff, Joel Leonard Wolf, Philip Shi-lung Yu.
Application Number | 20070220219 11/376322 |
Family ID | 38519312 |
Filed Date | 2006-03-16 |
United States Patent
Application |
20070220219 |
Kind Code |
A1 |
Bansal; Nikhil ; et
al. |
September 20, 2007 |
System and method for optimizing data in value-based storage
system
Abstract
A method (and system) of storing data in a value-based storage
system, includes optimizing a value of data stored in the
value-based storage system.
Inventors: |
Bansal; Nikhil; (Yorktown
Heights, NY) ; Douglis; Frederick; (Basking Ridge,
NJ) ; Fleischer; Lisa Karen; (Ossining, NY) ;
Hildrum; Kirsten Weale; (Hawthorne, NY) ; Katta;
Akshay Kumar Reddy; (New York, NY) ; Palmer; John
Davis; (San Jose, CA) ; Richards; Elizabeth
Suzanne; (Columbia, MD) ; Tao; David; (Glen
Burnie, MD) ; Tetzlaff; William Harold; (Mount Kisco,
NY) ; Wolf; Joel Leonard; (Katonah, NY) ; Yu;
Philip Shi-lung; (Chappaqua, NY) |
Correspondence
Address: |
MCGINN INTELLECTUAL PROPERTY LAW GROUP, PLLC
8321 OLD COURTHOUSE ROAD
SUITE 200
VIENNA
VA
22182-3817
US
|
Assignee: |
International Business Machines
Corporation
Armonk
NY
|
Family ID: |
38519312 |
Appl. No.: |
11/376322 |
Filed: |
March 16, 2006 |
Current U.S.
Class: |
711/159 |
Current CPC
Class: |
H04L 49/90 20130101;
H04L 49/9084 20130101; H04L 47/2433 20130101; H04L 47/2416
20130101 |
Class at
Publication: |
711/159 |
International
Class: |
G06F 13/00 20060101
G06F013/00 |
Government Interests
GOVERNMENT RIGHTS
[0001] This invention was made with Government support under
Contract No.: H98230-04-3-001 awarded by the U.S. Dept. of Defense.
The Government has certain rights in this invention.
Claims
1. A method of storing data in a value-based storage system,
comprising: optimizing a value of data stored in the value-based
storage system.
2. The method according to claim 1, wherein said optimizing a value
comprises: estimating a rate and a value function of data object
production during an interval of time; computing an optimal
decision for allocating new data to at least one data vat of said
storage system, deleting existing data from said at least one data
vat and for moving existing data from a first data vat to another
data vat; and implementing said optimal decision during said
interval of time.
3. The method according to claim 2, wherein said interval of time
comprises a fixed, projected interval of time.
4. The method according to claim 2, further comprising: repeating
said estimating a rate, said computing an optimal decision and said
implementing said optimal decision.
5. The method according to claim 1, wherein said optimizing a value
comprises: computing an optimal decision for allocating new data to
at least one data vat of said storage system and for moving
existing data from a first data vat to another data vat.
6. The method according to claim 5, wherein said computing an
optimal decision comprises: limiting an amount of said existing
data that is moved from said first vat to said another vat.
7. The method according to claim 5, wherein said computing an
optimal decision comprises: eliminating said moving existing data
from said first vat to said another vat.
8. The method according to claim 1, wherein said storage system
comprises a distributed computer system.
9. The method according to claim 1, wherein data input into said
storage system comprises streaming data.
10. The method according to claim 2, wherein said implementing
comprises automatically allocating new data, deleting existing data
and moving existing data.
11. The method according to claim 2, wherein said optimal decision
comprises allocating an amount of said new data to said at least
one data vat that is substantially equal to an amount of said
existing data deleted from said at least one data vat.
12. The method according to claim 1, wherein said storage system
comprises a plurality of loosely coupled data vats.
13. The method according to claim 1, wherein said optimizing
comprises maximizing a value of retained data in the storage
system.
14. The method according to claim 2, further comprising:
periodically gathering information about data being written and
about a state of the storage system; and revising said optimal
decision based on gathered information.
15. The method according to claim 2, wherein said optimal decision
comprises maintaining an equal value function in each of said at
least one data vat.
16. A signal-bearing medium tangibly embodying a program of machine
readable instructions executable by a digital processing apparatus
to perform a method of storing data in a storage system, said
method comprising: optimizing a value of data stored in the storage
system.
17. A method for deploying computing infrastructure, comprising
integrating computer-readable code into a computing system, wherein
the computer readable code in combination with the computing system
is capable of performing a method of storing data in a storage
system, said method of storing data in a storage system comprising:
optimizing a value of data stored in the storage system.
18. A system for storing data in a storage system, comprising: an
optimizing unit that optimizes a value of data stored in the
storage system.
19. The system according to claim 18, wherein said optimizing unit
comprises: an estimating unit that estimates a rate and a value
function of data object production during an interval of time; a
computing unit that computes an optimal decision for allocating new
data to at least one data vat of said storage system, deleting
existing data from said at least one data vat and for moving
existing data from a first data vat to another data vat; and an
implementing unit that implements said optimal solution during said
interval of time.
20. The system according to claim 18, wherein said optimizing unit
comprises: a computing unit that computes an optimal decision for
allocating new data to at least one data vat of said storage system
and for moving existing data from a first data vat to another data
vat.
Description
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] The present invention generally relates to a method and
apparatus for optimizing storage in a stream-based distributed
computer system, and more particularly to a method and apparatus
for maximizing the value of retained data in a storage system
incorporating retention function-based data deletion.
[0004] 2. Description of the Related Art
[0005] Computer storage systems for storing inputted data are
commonly known. However, not all commonly known computer data
storage systems are designed to handle streaming data applications.
Distributed computer systems, which for purposes of the present
application refer to storage systems including multiple storage
units (e.g., "vats") that are coupled together, have been
specifically designed to handle streaming data applications.
However, distributed computer systems designed to handle very
large-scale workloads (e.g., on the scale of hundreds of thousands
of incoming streams of data) are in their infancy.
[0006] Highly scalable distributed computer systems that may handle
complex applications involving large quantities of streaming data
are possible. In particular, distributed computer systems,
including tens of thousands of processing nodes, may have the
capability of concurrently supporting hundreds of thousands of
incoming and derived data streams and having storage subsystems
with a capacity of multiple petabytes.
[0007] Even at these large sizes (e.g., a storage capacity of
multiple petabytes), the distributed computer systems will not be
able to handle all of the streaming data. That is, the processors
cannot handle all of the streaming data and will be fully utilized.
Additionally, the offered load will far exceed the processing power
capabilities of the systems and the storage systems will be over
capacity.
[0008] FIG. 1 depicts a conventional distributed storage subsystem
100 for a streaming data system described above. Incoming and
derived streams 102 of data are processed by interconnected
applications on a distributed set 104 of processing nodes
106a-106c. These processing nodes 106a-106c are interconnected via
a network 108, and connected, via a storage network, to a
collection 110 of storage vats 112a-112c. Each vat 112a-112c may
include an individual file system.
[0009] Storing streaming data presents a challenge that is
qualitatively different from that of conventional systems (i.e.,
systems including non-streaming input data), because of the huge
quantities of primal (incoming) and processed data, which needs to
be written to disk. The storage subsystem of a conventional
computer system is typically configured with sufficient capacity to
handle the data. Deletion of data is typically done manually. But
in a streaming environment, massive amounts of data are being
written constantly. No reasonable amount of storage will be able to
keep up with the incoming and derived streaming data, and therefore
very little of the data can be kept permanently. In fact, one can
assume that in steady state, the storage subsystem will constantly
be more or less fully allocated.
[0010] Thus, as new data arrives, an equivalent amount of old data
must be flushed (deleted). Since the deletion operations will
happen at great rates, they cannot be done manually (as is done in
conventional systems). Given that a typical distributed computer
system for streaming data applications will run continuously, there
will be no `down` time to fix problems. Therefore, any attempt to
optimize the storage of the streaming data must be done in real
time. Therefore, conventional storage techniques, where data is
deleted manually, are not ideal for a streaming data system.
[0011] Stored data objects in streaming systems are typically
regarded as immutable once created. Thus, the storage subsystem has
the roles of handling initial writes, potentially multiple reads,
and, finally, deletion of the data.
[0012] One solution to the automatic deletion of data might be to
keep the most recent data, displacing the oldest data first. This
is commonly known as the first in, first out (FIFO) approach.
Another idea is to retain data based on the time of its last usage
(initial write or subsequent read). This is commonly known as the
least recently used (LRU) approach, effectively treating the entire
storage subsystem as though it were a huge cache. Each of these
techniques is a conventional technique that has been used in
non-streaming data systems.
[0013] However, neither of these concepts will work well for
streaming data applications, because these approaches do not
optimize the value of data being retained.
[0014] Accordingly, there is a need for a more sophisticated
approach. A conventional approach for handling streaming data has
been developed that treats data differently based on its current
importance to the overall system. For example, the headlines of
news articles from CNN might be worth storing for longer periods of
time than the actual body of the news articles.
[0015] The approach is to define for each data object to be written
to disk a function describing its projected value over time (i.e.,
a so-called time value of information objects). This retention
value function is typically non-increasing, within a range from 0
to 100, though neither of these properties is strictly required.
The storage subsystem then deletes the data with the lowest current
retention function values as space is needed. This design results
in a relative rather than absolute notion of value. That is, the
retention function value at a given time does not guarantee the
amount of time the data object has left before being deleted. The
overhead associated with such a deletion method is manageable, at
least as long as the number of such functions is not too large.
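The deletion policy described above can be illustrated with a
minimal sketch. The names DataObject, current_value and
delete_lowest, the time unit (hours) and the two example retention
functions are assumptions made for illustration only; they do not
appear in the source.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class DataObject:
    name: str
    size: int                             # bytes
    created: float                        # creation time in hours (assumed unit)
    retention: Callable[[float], float]   # maps age to retention value


def current_value(obj: DataObject, now: float) -> float:
    """Retention value of the object at time `now`."""
    return obj.retention(now - obj.created)


def delete_lowest(objects: List[DataObject], bytes_needed: int,
                  now: float) -> Tuple[List[DataObject], List[DataObject]]:
    """Delete objects in increasing order of current retention value
    until at least `bytes_needed` bytes are freed.
    Returns (kept, deleted)."""
    ordered = sorted(objects, key=lambda o: current_value(o, now))
    kept, deleted, freed = [], [], 0
    for obj in ordered:
        if freed < bytes_needed:
            deleted.append(obj)
            freed += obj.size
        else:
            kept.append(obj)
    return kept, deleted


# Hypothetical retention classes: a headline decays slowly, a body quickly.
headline = DataObject("headline", 10, 0.0, lambda age: max(0.0, 100.0 - age))
body = DataObject("body", 100, 0.0, lambda age: max(0.0, 100.0 - 10.0 * age))
kept, deleted = delete_lowest([headline, body], bytes_needed=50, now=5.0)
```

In this sketch the value is relative: the body is deleted not
because its value reached zero, but because it is the lowest-valued
data at the moment space is needed.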
[0016] The creation of the retention value functions is generally
the responsibility of the application, and defined at a much
coarser level than that of the data objects themselves. Each data
object belongs to a so-called retention class. All data objects in
a particular retention class have retention values determined by
the same retention value function. Thus, retention classes are the
atomic unit on which retention value functions are defined.
Different data objects within a retention class can have varying
ages, and therefore have different values at any given time.
[0017] Occasionally, it may be useful to modify a particular
retention value function, or to remove certain data objects from a
retention class and add them to another, thus changing the
retention value functions for those objects. Storage class
retention function assignments and data object retention value
function modifications are the job of analytics, and these are
orthogonal to the present embodiment.
[0018] The above-described technique has been used in an
environment including a single storage unit (e.g., vat). In an
individual vat, space is essentially fluid, and deleting existing
data frees up space for a comparable amount of new data. As a
practical implementation, one can approximate this flow balance
concept via a waterline. The waterline is defined for a given vat
and time, so that data whose value is below this waterline will be
deleted. Data whose value is at or above this waterline will be
retained. The waterline rises and falls over time, depending on the
amount of new data that must be added to the vat.
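As a rough illustration of the waterline for a single vat, the
sketch below returns the smallest value level such that deleting
all data strictly below it frees enough space for the incoming
data. The function name, the (value, size) representation and the
use of infinity when the whole vat must be emptied are assumptions
of this sketch.

```python
from itertools import groupby


def waterline(items, bytes_needed):
    """items: list of (value, size_in_bytes) pairs stored in one vat.
    Returns the smallest value w such that deleting all data with
    value < w frees at least bytes_needed bytes, or infinity if even
    deleting everything is not enough."""
    freed = 0
    # Walk the distinct value levels from lowest to highest.
    for value, group in groupby(sorted(items), key=lambda it: it[0]):
        if freed >= bytes_needed:
            return value
        freed += sum(size for _, size in group)
    return float("inf")


vat = [(10, 5), (10, 5), (20, 5), (30, 5)]
low = waterline(vat, 5)     # little new data: waterline at 20
high = waterline(vat, 12)   # more new data: waterline rises to 30
```

The two calls show the waterline rising as the amount of new data
to be accommodated grows.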
[0019] However, the notion of waterlines takes on a much different
character when there are multiple vats (e.g., as in the distributed
storage system 100 depicted in FIG. 1). Absent a global
optimization strategy, the waterlines of the various vats may drift
and become quite different over time. This may result in the
deletion of higher valued data than would be removed in a scenario
with one global vat with a single waterline. It would therefore
clearly be useful if the waterlines of the various vats were
identical, or, more precisely, as close as possible to equal, given
the other constraints in the system.
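The cost of unequal waterlines can be seen in a small numerical
sketch. The vat contents below are invented for illustration, and
each list entry stands for one byte with the given value.

```python
def value_lost(values_per_byte, bytes_needed):
    """Total value of the bytes_needed lowest-valued bytes."""
    return sum(sorted(values_per_byte)[:bytes_needed])


vat_a = [10, 20, 30]   # a vat holding low-valued data
vat_b = [70, 80, 90]   # a vat holding high-valued data

# Each vat independently frees one byte for its own new data.
local = value_lost(vat_a, 1) + value_lost(vat_b, 1)

# One global vat with a single waterline frees the same two bytes.
pooled = value_lost(vat_a + vat_b, 2)
```

Here the independent vats delete data worth 80 in total, while a
single global waterline would delete data worth only 30.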
[0020] Therefore, it is clear that a novel and very effective
optimization method is necessary for a storage component of a
distributed computer system to handle large scale stream processing
applications.
SUMMARY OF THE INVENTION
[0021] In view of the foregoing and other exemplary problems,
drawbacks, and disadvantages of the conventional methods and
structures, an exemplary feature of the present invention is to
provide a method (and system) for optimizing storage in a
stream-based distributed computer system by maximizing the value of
retained data in the storage system.
[0022] It is another exemplary feature to minimize the total value
of all data removed (e.g., deleted) from the storage system. In
other words, it is an exemplary feature of the present invention to
maximize the total value of data retained in the storage
system.
[0023] In accordance with a first exemplary aspect of the present
invention, a method (and system) of storing data in a value based
storage system includes optimizing a value of stored data in the
value based storage system. The value may be optimized by computing
an optimal decision for allocating new data to at least one data
vat in the storage system, deleting existing data from at least one
data vat and for moving existing data from a first data vat to
another data vat in the storage system.
[0024] In accordance with a second exemplary aspect of the present
invention a signal-bearing medium tangibly embodies a program of
machine-readable instructions executable by a digital processing
apparatus to perform a method of storing data in a storage system,
where the method includes optimizing a value of stored data in the
storage system.
[0025] In accordance with a third exemplary aspect of the present
invention a method for deploying computing infrastructure includes
integrating computer-readable code into a computing system, wherein
the computer readable code in combination with the computing system
is capable of performing a method of storing data in a storage
system, where the method of storing data in a storage system
includes optimizing a value of stored data in the storage
system.
[0026] In accordance with a fourth exemplary aspect of the present
invention a system for storing data in a storage system includes an
optimizing unit that optimizes a value of stored data in the
storage system.
[0027] As indicated above, a distributed storage subsystem may be
used in a computer system running a plurality of applications. Each
application has a choice of one of the vats in the distributed
storage system. The inventors have discovered that it is important
to ensure, with minimal communication, that applications make
decisions that are good for the system as a whole. To ensure that
the applications make good decisions, periodically, the optimizer
of the present invention will gather information about the data
being written and the state of the storage system, and then
instruct the applications to revise their choice of vats.
[0028] The problem of optimizing or balancing of the vats in the
storage system is somewhat similar to traditional file assignment
problems (FAPs). However, the large majority of FAPs have had the
goal of trying to balance load across the storage subsystem.
Balancing waterlines, as in certain exemplary embodiments of the
present invention, instead presents a different challenge.
[0029] Traditional FAPs have generally made decisions about initial
data placement and periodic data movement. Proper initial placement
is relatively more critical in a streaming system such as described
above. That is because data movement is less useful from a
cost/benefit analysis perspective in a system as depicted in FIG.
1.
[0030] That is, data may only be read a few times before being
deleted, so the overhead of movement is high relative to its
expected utility. Furthermore, movement of data is simply more
expensive in a distributed storage system. Thus, one is forced to
make very careful initial placement decisions, and treat data
movement as expensive (and consequently limited), or even
prohibited. In accordance with one exemplary aspect of the present
invention, the method should behave as well, or almost as well,
when data movement is not allowed at all.
[0031] Therefore, in accordance with an aspect of the present
invention, the method (and system) minimizes the total value of all
data deleted, subject to reasonable and practical constraints, such
as local and global movement constraints. Minimizing the total
values of the deleted data is equivalent to maximizing the total
values of the data retained. This may be achieved by making optimal
decisions about where to write newly created data, and also how to
move data around within the storage subsystem, provided such
movement is within the limits allowed and justified.
[0032] Therefore, certain exemplary aspects of the present
invention propose optimizing the value of stored data in a
value-based storage system by estimating the rates and value
functions of data object production during a fixed projected
interval of time, computing optimal decisions for allocating new
data to the vats and moving the existing data from one vat to
another, and implementing the decisions in a dynamic fashion during
a fixed interval of time. Periodically, information will be
gathered about the data being written and the state of the storage
system, and the decisions concerning the placement and deletion of
data from the vats may be revised. Accordingly, the method will
make decisions that are good for the system as a whole.
[0033] With the above and other unique and unobvious exemplary
aspects of the present invention, it is possible to maximize the
total value of data retained in the storage system by making
optimal decisions concerning where to write newly created data,
deleting existing data and how to relocate data within the storage
system. Additionally, certain aspects of the present invention are
directed to maintaining identical (or as close to identical as
possible) waterlines in the plurality of vats in the distributed
storage system.
BRIEF DESCRIPTION OF THE DRAWINGS
[0034] The foregoing and other exemplary purposes, aspects and
advantages will be better understood from the following detailed
description of an exemplary embodiment of the invention with
reference to the drawings, in which:
[0035] FIG. 1 illustrates a conceptual view of an exemplary
conventional storage subsystem 100;
[0036] FIG. 2 illustrates a method 200 of optimizing data in a
value-based storage system in accordance with an exemplary
embodiment of the present invention;
[0037] FIG. 3A depicts an example of a value function 300a for a
retention class and vat in accordance with the exemplary embodiment
illustrated in FIG. 2;
[0038] FIG. 3B depicts an example of a total value function 300b
for a retention class and vat in accordance with the exemplary
embodiment illustrated in FIG. 2;
[0039] FIG. 4 illustrates an exemplary process 400 for computing
the total value function 300b for a retention class and vat in
accordance with the exemplary embodiment illustrated in FIG. 2;
[0040] FIG. 5 depicts an exemplary conceptual flow graph 500 for a
linear program used in accordance with an exemplary embodiment of
the present invention;
[0041] FIG. 6 illustrates a block diagram of a system 600 for
optimizing data in a value-based storage system in accordance with
an exemplary embodiment of the present invention;
[0042] FIG. 7 illustrates a block diagram of the environment and
configuration of an exemplary system 700 for incorporating the
present invention; and
[0043] FIG. 8 illustrates a storage medium 800 for storing steps of
the program for optimizing data in a value-based storage system
according to an exemplary embodiment of the present invention.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS OF THE INVENTION
[0044] Referring now to the drawings, and more particularly to
FIGS. 1-8, there are shown exemplary embodiments of the method and
structures according to the present invention.
[0045] Prior to describing the method and system of the present
invention, it is important to examine the constraints of the
problem presented. The first constraint corresponds to a key
rationale for the vats themselves. That is, different vats
typically have different properties, and not all retention classes
will be suitable for all vats.
[0046] For example, vats may have availability properties (e.g.,
redundant array of inexpensive disks (RAID) level), performance
properties (e.g., nominal latency), security properties (e.g., some
vats may be more secure than others), different locations in the
distributed network (e.g., a distance metric might be appropriate)
and qualitative properties (e.g., some vats might be reserved for
DB2 data).
[0047] Each retention class may have specific requirements with
respect to these properties, and thus be allowed only on a subset
of the vats. (The acceptable vats are those that meet all of the
requirements.) The optimization method allocates newly created data
to an acceptable vat. Furthermore, the optimization method
may move existing data from one acceptable vat to another
acceptable vat. In accordance with an exemplary embodiment of the
present invention, the optimization method will only allocate newly
created data and move existing data to an acceptable vat. Second,
the optimization method obeys a variety of constraints describing
(at either a local or a global level) the maximum amount of allowed
movement. Finally, the method ensures that no vat receives too many
requests for reads and writes.
[0048] In accordance with an exemplary aspect of the present
invention, the optimization method (and system) requires minimal
centralized control and direction. The method is epoch-based,
gathering input, solving and implementing the computed solution
entirely automatically. The exact length of an epoch is not
crucial, as long as the length is sufficient to complete the
optimization method. For example, an epoch may be fixed at a length
of half-an-hour to one full hour.
[0049] FIG. 2 illustrates a method of storing data in a storage
system according to an exemplary embodiment of the present
invention. As indicated above, the optimization method is
epoch-based, and the length of an epoch, say E, may be chosen by
the system administrator.
[0050] During each epoch, each of the following steps may be
executed. The time T since the current epoch started is initialized
to 0, and the clock starts (step 201). (Such timers are a standard
facility of computer systems.) An input assembler or module then
generates and assembles the input required for the method (step
202). The output is fed to a linear program (LP) assembler (step
203), which generates the specific instance of the LP employed in
the method. The LP represents the optimization problem to be
solved.
[0051] The LP is then solved (step 204) by any of a variety of
commercially available LP solvers. The solution obtained in step
204 indicates which new data is input into each vat, which existing
data is removed and retained in each vat, and which data is moved
to another vat. Ideally, the amount of data added to a vat equals
the amount of data removed from a vat, and the amount of existing
data moved between the vats is minimized or eliminated. It is ideal
to minimize or eliminate the amount of existing data that is moved
from one vat to another vat because movement of data between vats
incurs significant overhead and is therefore generally not
practical.
[0052] Then, the amount of elapsed time T since the start of the
current epoch is checked (step 205) to determine if it is less than
E (the length of an epoch). If the amount of elapsed time T is less
than E, then the method checks to see if refined or corrected input
data has now become available (step 206). If no refined or
corrected input data is now available, then the method again checks
to determine if the amount of elapsed time T since the start of the
current epoch is less than E (e.g., by returning to step
205). If new refined or corrected data has become available, then
the input assembler or module again generates and assembles the
input required for the method (e.g., by returning to step 202),
starting the process of creating a new LP solution with the changed
input data.
[0053] If, however, the amount of time T is greater than or equal
to E, then the method implements a solution for all retention
classes and vats during the next epoch (step 207). Then, the method
is automatically repeated.
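The control flow of steps 201 through 207 can be sketched as a
single function. All callables passed in (assemble_input, build_lp,
solve_lp, refined_input_available, implement) are hypothetical
placeholders for the modules described above, and the optional poll
delay is an addition of this sketch rather than part of the
described method.

```python
import time


def run_epoch(E, assemble_input, build_lp, solve_lp,
              refined_input_available, implement,
              clock=time.monotonic, poll=0.0):
    """Run one epoch of length E (in the units returned by `clock`)."""
    start = clock()                                     # step 201: T = 0
    solution = solve_lp(build_lp(assemble_input()))     # steps 202-204
    while clock() - start < E:                          # step 205: T < E?
        if refined_input_available():                   # step 206
            # Re-assemble the input and re-solve with the changed data.
            solution = solve_lp(build_lp(assemble_input()))
        elif poll:
            time.sleep(poll)                            # avoid a busy loop
    implement(solution)                                 # step 207
    return solution
```

Calling run_epoch repeatedly, once per epoch, would correspond to
the automatic repetition of the method. Only the most recent
solution is implemented, so refined input that arrives late in an
epoch still influences the next epoch.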
[0054] To further understand the input generator module, consider a
finite collection of M retention classes indexed by r. These
retention classes may correspond to existing data on a disk, to new
data being written to a disk, or to both. There is also a finite
collection of N vats indexed by v. For ease of notation, an
exemplary embodiment also employs a vat 0 corresponding to new data
(e.g., data not yet assigned to an `actual` vat).
[0055] Furthermore, Z[r][v] represents the estimated amount (in
bytes) of retention class r data in vat v. In particular, Z[r][0]
is the amount of new data in retention class r. C[v] represents the
capacity (in bytes) of vat v. A[r][v] represents 1 if the retention
class r is allowed in vat v, and 0 otherwise. The M × (N+1)
matrix A is called the "acceptability matrix". c[v][v'] represents
the (per byte) cost of moving data from vat v to vat v'. k[v][v']
represents the maximum amount of data (in bytes) that can be moved
in one epoch from vat v to vat v'. K represents the maximum amount
of data (in bytes) that can be moved between vats in one epoch.
d[r] represents the expected access rate for data in retention
class r. D[v] represents the maximum access rate threshold for vat
v. α is a number between 0 and 1, and will weight the degree
to which waterline optimization matters relative to load
balancing.
[0056] Z[r][v] and d[r] can be estimated based on the current state
of the system, via any standard forecasting techniques. C[v] is a
property (e.g., vat storage capacity) of the storage devices in vat
v, and may be measured by the number of bytes. A[r][v] can be
computed as the conjunction of the required criteria for retention
class r based on the properties of vat v.
[0057] The computation of A[r][v] in an exemplary embodiment of the
present invention involves checking the availability, performance,
security, location and other qualitative requirements, and setting
the acceptability matrix to be 1 if all constraints are met, 0
otherwise. The constants c[v][v'], k[v][v'], K and α are
specified by the user.
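The conjunction that defines the acceptability matrix can be
sketched as follows. The property names (raid, latency_ms, secure)
and the representation of requirements as predicate functions are
assumptions made for illustration. The sketch covers only the N
actual vats; the column for vat 0 (new data) is omitted.

```python
def acceptable(requirements, properties):
    """Return 1 if the vat meets every requirement of the class, else 0."""
    return int(all(check(properties) for check in requirements))


def acceptability_matrix(class_requirements, vat_properties):
    """Rows are retention classes, columns are actual vats."""
    return [[acceptable(reqs, props) for props in vat_properties]
            for reqs in class_requirements]


# Hypothetical vats and retention-class requirements.
vats = [
    {"raid": 5, "latency_ms": 4, "secure": True},
    {"raid": 0, "latency_ms": 20, "secure": False},
]
classes = [
    [lambda p: p["raid"] >= 5, lambda p: p["secure"]],  # strict class
    [lambda p: p["latency_ms"] <= 25],                  # lenient class
]
A = acceptability_matrix(classes, vats)
```

In this example the strict class is allowed only on the first vat,
while the lenient class is allowed on both.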
[0058] For purposes of the present description of an exemplary
embodiment of the present invention, it is assumed that all vats in
the storage system are full. In most situations, all of the vats in
the storage system will be full. However, it will be easily
understood, by one skilled in the art, how to apply the method of
the present invention to a storage system in which all of the vats
are not completely full.
[0059] The method constructs a function V[r][v] for each retention
class r and vat v. The independent variable of V[r][v] represents
the amount of data (in bytes) from retention class r, which will be
deleted from vat v to accommodate new or existing data entering the
vat. (If v=0 it will represent new data that is deleted
immediately, and never stored.) The dependent variable of V[r][v]
represents the total value of the data deleted.
[0060] Because the bulk delete function removes data of smallest
value, an exemplary embodiment of the invention starts by ordering
the data in terms of increasing value per byte for each retention
class r and vat v. This gives rise to a function W[r][v] defined as
the value W[r][v](w) of the (last) object of data removed if a
total of w bytes are deleted. W[r][v] is a step function with one
step for each different value of data in the vat (this is
exemplarily depicted in FIG. 3A).
[0061] The function V[r][v] is the integral of this function
between 0 and w. Because of the nature of W[r][v], the function
V[r][v] is an increasing and piecewise linear convex function of w
(this is exemplarily depicted in FIG. 3B). This convexity is an
important part of the current invention, because if the convexity
is not present, the solution described need not be optimal.
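The convexity can be made explicit with a short derivation. The
symbols b_p (the number of bytes in step p) and y_p (the value per
byte of step p) are notation introduced here for illustration, and
the indices [r][v] are dropped for readability.

```latex
% Data is deleted in increasing order of value, so y_1 \le y_2 \le \dots \le y_P.
V(w) = \int_0^w W(x)\,dx
     = \sum_{q=1}^{p-1} b_q y_q + \Bigl(w - \sum_{q=1}^{p-1} b_q\Bigr) y_p,
\qquad \sum_{q=1}^{p-1} b_q \le w \le \sum_{q=1}^{p} b_q .

% The slope of V on piece p is y_p. Since the slopes are nondecreasing, V is convex.
% Example: b = (2, 3), y = (1, 4) gives V(2) = 2, V(3) = 2 + 1 \cdot 4 = 6, V(5) = 14.
```

If the data were not ordered by increasing value, the slopes could
decrease from one piece to the next and V would no longer be
convex.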
[0062] FIG. 4 depicts a flowchart of the method of creation of the
total value function V[r][v] for each retention class r and each
vat v. First, r is set to 1 (step 401). Then, v is set to 0 (step
402). Next, p is set to 1, VX[r][v][0] and VY[r][v][0] are set to 0
(step 403). The data associated with the value function W is then
ordered by value (step 404); any standard sorting scheme may be
used. The output includes P[r][v] points
(WX[r][v][1],WY[r][v][1]), ...,
(WX[r][v][P[r][v]],WY[r][v][P[r][v]]).
[0063] The value VX[r][v][p] is then computed as
VX[r][v][p-1]+WX[r][v][p], and the value VY[r][v][p] is computed as
VY[r][v][p-1]+WX[r][v][p]*WY[r][v][p] (step 405). p is incremented
by 1 (step 406). The value of p is then tested to determine if
p<P (step 407).
[0064] If p<P, then the values VX[r][v][p] and VY[r][v][p] are
again computed (e.g., the method returns to step 405). If p
is not less than P, then v is incremented by 1.
[0065] The value of v is then tested to determine if v<=N. If
v<=N, then the method returns to step 403. If v is not
<=N, then r is incremented by 1 (step 410).
[0066] The value of r is tested to determine if r<=M. If
r<=M, then the method returns to step 402. If r is not
<=M, then the method terminates (step 411). The line segments
from (VX[r][v][p-1],VY[r][v][p-1]) to (VX[r][v][p],VY[r][v][p])
represent the pieces of the total value function.
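By way of illustration only, the construction of the breakpoints can be sketched in Python as follows. The sketch assumes that WX[r][v][p] is the number of bytes in step p and WY[r][v][p] is the value per byte of that step, that the steps for each (r,v) are supplied as a list of (WX, WY) pairs in a dictionary, and that the loop visits every one of the P[r][v] points. The function names and the data layout are hypothetical.

```python
def build_total_value(steps):
    """Return the breakpoints (VX, VY) of the total value function for one (r, v).

    steps: list of (WX, WY) pairs, WX = bytes in the step, WY = value per byte.
    """
    ordered = sorted(steps, key=lambda s: s[1])  # step 404: order by value
    vx, vy = [0.0], [0.0]                        # step 403: VX[0] = VY[0] = 0
    for wx, wy in ordered:                       # steps 405 to 407
        vx.append(vx[-1] + wx)
        vy.append(vy[-1] + wx * wy)
    return vx, vy


def build_all(W):
    """Apply build_total_value to every retention class r and vat v.

    W: dict mapping (r, v) to the step list of that class and vat (assumed layout).
    """
    return {key: build_total_value(steps) for key, steps in W.items()}


W = {(1, 0): [(4, 3.0), (2, 1.0), (3, 2.0)], (1, 1): [(5, 0.5)]}
V = build_all(W)
```

Consecutive pairs in the returned lists are the end points of the line segments of the total value function for the corresponding retention class and vat.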
[0067] In accordance with the exemplary embodiment of the method
depicted in FIG. 2, once the total value function is created, the
linear program (LP) is formulated, which is solvable by any of
several commercially available LP solvers. The intuition for this
LP comes from a flow graph composed of nodes and arcs. In certain
special cases, the problem will actually be solvable via network
flow solvers, which are also available commercially. Consider the
flow graph shown in FIG. 5. The flow graph depicted in FIG. 5
includes three types of nodes.
[0068] There is a first column (501) of (source) nodes (r,v)
(501a-501d) for each retention class r and each vat v for which
retention class r is relevant. The nodes are blocked into N+1
groups, one group for the new data and N groups for the actual
vats. The group for vat 0 has nodes (501a) for each retention
class. The group for vat v has non-trivial nodes (501b-501d) for
each retention class r with A[r][v]=1. These nodes introduce
Z[r][v] units of flow into the graph.
[0069] Furthermore, there is a second column (502) of nodes v, one
for each actual vat (502a-502c). There is also a sink node (503) on
the right of FIG. 5.
[0070] As shown in FIG. 5, the nodes (501, 502 and 503) are
connected by arcs. There are two types of arcs from the nodes in
the first column (501) to the nodes in the second column (502). One
type (504) (e.g., the solid arcs in FIG. 5) corresponds to movement
of data in the retention classes between distinct and actual vats
(e.g., from (r,v) to (r,v')). Such an arc exists only if
A[r][v]=A[r][v']=1. The capacity of the arc is defined as k[v][v'],
and the cost along the arc is defined as c[v][v']. The other type
of arc (505) (e.g., the dashed arcs) corresponds either to movement
(initial assignment) from vat 0, or to leaving existing data for a
retention class on the same vat. In the first case v=0, and in the
second case v=v'. Again, an arc from retention class r of vat 0 to
vat v exists only if A[r][v]=1. The cost along these arcs is 0 and
the capacity is infinite.
[0071] There is an additional type of arc (506) (e.g., the dotted
arcs in FIG. 5) from nodes in the first column (501) to the sink
node (503). An arc from node (r,0) to the sink node (503)
corresponds to deleting new data belonging to retention class r,
while an arc from node (r,v) to the sink node corresponds to
deleting existing retention class r data from vat v. The capacity
of an arc from (r,v) to the sink is infinite. The cost along this
arc is V[r][v].
[0072] Furthermore, there is an additional type of arc (507) (e.g.,
solid arcs in FIG. 5) from nodes in the second column (502) to
the sink node (503). These arcs (507) represent retaining data in
the vat. The capacity of the arc from vat v is C[v]. The cost along
this arc is 0.
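By way of illustration only, the arcs of a flow graph of the kind described above can be enumerated in Python. The sketch assumes that A, k, c and C are supplied as nested dictionaries indexed from 1, that vat 0 denotes new data, and that the cost of a deletion arc is recorded as a symbolic reference to V[r][v]. The Arc record, the kind labels and the function name are hypothetical.

```python
import math
from collections import namedtuple

Arc = namedtuple("Arc", "tail head capacity cost kind")


def build_flow_graph(M, N, A, k, c, C):
    """Enumerate the arcs between source nodes (r, v), vat nodes v and the sink.

    A[r][v] is 1 if retention class r may reside on vat v (v = 1..N).
    k[v][v2], c[v][v2]: capacity and cost of moving data from vat v to vat v2.
    C[v]: capacity of vat v.
    """
    arcs = []
    for r in range(1, M + 1):
        # Vat 0 represents new data, so every retention class has a source node there.
        sources = [0] + [v for v in range(1, N + 1) if A[r][v] == 1]
        for v in sources:
            for v2 in range(1, N + 1):
                if A[r][v2] != 1:
                    continue
                if v == 0 or v == v2:
                    # Initial assignment from vat 0, or data left in place (505).
                    arcs.append(Arc((r, v), v2, math.inf, 0.0, "assign_or_stay"))
                else:
                    # Movement between distinct actual vats (504).
                    arcs.append(Arc((r, v), v2, k[v][v2], c[v][v2], "move"))
            # Deletion arc (506); its cost is the convex function V[r][v].
            arcs.append(Arc((r, v), "sink", math.inf, ("V", r, v), "delete"))
    for v in range(1, N + 1):
        # Retention arc (507) from vat v to the sink, limited by the vat capacity.
        arcs.append(Arc(v, "sink", C[v], 0.0, "retain"))
    return arcs


A = {1: {1: 1, 2: 1}}
k = {1: {2: 10}, 2: {1: 10}}
c = {1: {2: 1.0}, 2: {1: 2.0}}
C = {1: 100, 2: 50}
arcs = build_flow_graph(1, 2, A, k, c, C)
```

For one retention class and two vats, the example yields three source nodes, each with a deletion arc, plus one retention arc per vat.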
[0073] The LP may use several types of decision variables (e.g.,
three in an exemplary embodiment). First, y[r][v][v']
is the amount of data from retention class r that will be moved
from vat v to vat v'. This data will be retained, and represents
the flow from a node in the first column (501) of FIG. 5 to a node
in the second column (502). Second, w[r][v] is the amount of new or
existing data from retention class r that will be deleted from vat
v. This represents flow from the first column (501) of FIG. 5 to
the sink (503). Third, γ is a bound on the degree to which load
balancing goals cannot be achieved.
[0074] The optimization formulation, which can be submitted to any
commercially available LP solver, is as follows:
[0075] Minimize α (sum over r,v of V[r][v](w[r][v]) + sum over v sum
over v' of c[v][v'] sum over r of y[r][v][v']) + (1-α) γ
subject to the following:
1. w[r][v] + sum over {v': A[r][v']=1} of y[r][v][v'] = Z[r][v] for
all (r,v);
2. sum over {r: A[r][v]=A[r][v']=1} sum over v' of y[r][v'][v] = C[v]
for all v;
3. sum over {r: A[r][v]=1} of y[r][v][v'] <= k[v][v'] for all v neq
0, v' neq v;
4. sum over r, v neq 0, v' of y[r][v][v'] <= K;
5. sum over r,v' of y[r][v'][v'] d[r] <= γ D[v] for all v;
and
6. w[r][v], y[r][v][v'], γ >= 0 for all r,v,v'.
[0076] The objective function includes summands for the value of
deleted data, for the cost of moving data from vat to vat, and for
load balancing. By scaling the cost coefficient c[v][v'] and the
constant α, the optimization method can easily vary the
importance of value of deleted data relative to the cost of moving
data from vat to vat.
[0077] Equations 1 represent the flow conservation constraints for
the source nodes (r,v) in the first column (501) of FIG. 5.
Equations 2 represent the flow conservation constraints for the
nodes v in the second column (502). Inequalities 3 represent the
local movement constraints, and inequality 4 represents the global
movement constraint. Inequality 5 uses γ to bound the factor by
which the load balancing goals will be missed, and inequalities 6
represent the non-negativity constraints. Constraints 3, 4 and 5
turn the optimization problem into a linear program rather than a
network flow problem.
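By way of illustration only, the objective function and the flow conservation constraints 1 can be evaluated for a candidate solution in Python. The sketch does not solve the linear program and does not check constraints 2 through 5. It assumes that w, y, Z, c and V are dictionaries keyed by tuples, that each V[(r,v)] holds breakpoint lists (VX, VY), and that movement costs absent from c are 0. All function names are hypothetical.

```python
from bisect import bisect_right


def piecewise_value(vx, vy, w):
    """Evaluate the piecewise linear function with breakpoints (vx, vy) at w."""
    if len(vx) < 2:
        return 0.0  # no data in this class and vat, so nothing can be deleted
    i = max(1, min(bisect_right(vx, w), len(vx) - 1))
    slope = (vy[i] - vy[i - 1]) / (vx[i] - vx[i - 1])
    return vy[i - 1] + slope * (w - vx[i - 1])


def objective(alpha, gamma, w, y, V, c):
    """alpha * (deleted value + movement cost) + (1 - alpha) * gamma."""
    deleted = sum(piecewise_value(*V[rv], amount) for rv, amount in w.items())
    moved = sum(c.get((v, v2), 0.0) * amount for (r, v, v2), amount in y.items())
    return alpha * (deleted + moved) + (1 - alpha) * gamma


def conserves_flow(w, y, Z, tol=1e-9):
    """Constraints 1: deleted plus retained data equals Z[r][v] at each source."""
    for (r, v), supply in Z.items():
        kept = sum(a for (r2, v1, _), a in y.items() if (r2, v1) == (r, v))
        if abs(w.get((r, v), 0.0) + kept - supply) > tol:
            return False
    return True


V = {(1, 0): ([0, 2, 5, 9], [0, 2, 8, 20]), (1, 1): ([0, 5], [0, 2.5])}
Z = {(1, 0): 9, (1, 1): 5}
w = {(1, 0): 3, (1, 1): 0}
y = {(1, 0, 1): 6, (1, 1, 1): 3, (1, 1, 2): 2}
c = {(1, 2): 1.5}
cost = objective(0.5, 2.0, w, y, V, c)
feasible = conserves_flow(w, y, Z)
```

In the example, deleting 3 bytes of new data costs 4, moving 2 bytes from vat 1 to vat 2 costs 3, and with α=0.5 and γ=2 the objective evaluates to 4.5.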
[0078] FIG. 6 illustrates a block diagram of a system 600 for
optimizing data in a value-based storage system in accordance with
an exemplary embodiment of the present invention. The system 600
includes an estimating unit 601, a computing unit 602 and an
implementing unit 603. The estimating unit 601 estimates the rates
and value functions of data object production during a fixed
interval of time (e.g., the epoch E). The computing unit 602
computes the optimal decisions for allocating the new data to the
vats, deleting existing data from the vats and moving the existing
data from one vat to another. The implementing unit 603 dynamically
implements the optimal decisions during the fixed interval of
time.
[0079] FIG. 7 shows a typical hardware configuration of an
information handling/computer system in accordance with the
invention that preferably has at least one processor or central
processing unit (CPU) 711. The CPUs 711 are interconnected via a
system bus 712 to a random access memory (RAM) 714, read-only
memory (ROM) 716, input/output adapter (I/O) 718 (for connecting
peripheral devices such as disk units 721 and tape drives 740 to
the bus 712), user interface adapter 722 (for connecting a keyboard
724, mouse 726, speaker 728, microphone 732, and/or other user
interface devices to the bus 712), communication adapter 734 (for
connecting an information handling system to a data processing
network, the Internet, an Intranet, a personal area network (PAN),
etc.), and a display adapter 736 for connecting the bus 712 to a
display device 738 and/or printer 739 (e.g., a digital printer or
the like).
[0080] As shown in FIG. 7, in addition to the hardware and process
environment described above, a different aspect of the invention
includes a computer implemented method of performing the inventive
method. As an example, this method may be implemented in the
particular hardware environment discussed above.
[0081] Such a method may be implemented, for example, by operating
a computer, as embodied by a digital data processing apparatus, to
execute a sequence of machine-readable instructions. These
instructions may reside in various types of signal-bearing
media.
[0082] Thus, this aspect of the present invention is directed to a
programmed product, comprising signal-bearing media tangibly
embodying a program of machine-readable instructions executable by
a digital data processor incorporating the CPU 711 and hardware
above, to perform the method of the present invention.
[0083] This signal-bearing media may include, for example, a RAM
(not shown) contained within the CPU 711, as represented by the
fast-access storage, for example. Alternatively, the instructions
may be contained in other signal-bearing media, such as a
magnetic data storage diskette or CD disk 800 (FIG. 8), directly or
indirectly accessible by the CPU 711.
[0084] Whether contained in the diskette 800, the computer/CPU 711,
or elsewhere, the instructions may be stored on a variety of
machine-readable data storage media, such as DASD storage (e.g., a
conventional "hard drive" or a RAID array), magnetic tape,
electronic read-only memory (e.g., ROM, EPROM, or EEPROM), an
optical storage device (e.g., CD-ROM, WORM, DVD, digital optical
tape, etc.), or other suitable signal-bearing media including
transmission media such as digital and analog communication
links and wireless links. In an illustrative embodiment of the invention,
the machine-readable instructions may comprise software object
code, compiled from a language such as "C", etc.
[0085] Additionally, it should also be evident to one of skill in
the art, after taking the present application as a whole, that the
instructions for the technique described herein can be downloaded
through a network interface from a remote storage facility.
[0086] While the invention has been described in terms of several
exemplary embodiments, those skilled in the art will recognize that
the invention can be practiced with modification within the spirit
and scope of the appended claims.
[0087] Further, it is noted that Applicants' intent is to
encompass equivalents of all claim elements, even if amended later
during prosecution.
* * * * *