U.S. patent application number 14/643763 was filed with the patent office on 2015-03-10 for managing array computations during programmatic run-time in a distributed computing environment.
The applicant listed for this patent is Hewlett-Packard Development Company, L.P. Invention is credited to Nathan L Binkert, Parthasarathy Ranganathan, Indrajit Roy, Robert Schreiber, Mehul A Shah, and Shivaram Venkataraman.
Application Number: 14/643763
Publication Number: 20150186189
Family ID: 49996308
Publication Date: 2015-07-02
United States Patent Application 20150186189
Kind Code: A1
Inventors: Venkataraman; Shivaram; et al.
Publication Date: July 2, 2015

MANAGING ARRAY COMPUTATIONS DURING PROGRAMMATIC RUN-TIME IN A DISTRIBUTED COMPUTING ENVIRONMENT
Abstract
A plurality of array partitions are defined for use by a set of
tasks of the program run-time. The array partitions can be
determined from one or more arrays that are utilized by the program
at run-time. Each of the plurality of computing devices is assigned to perform one or more tasks in the set of tasks. By
assigning each of the plurality of computing devices to perform one
or more tasks, an objective to reduce data transfer amongst the
plurality of computing devices can be implemented.
Inventors: Venkataraman; Shivaram (Berkeley, CA); Roy; Indrajit (Mountain View, CA); Shah; Mehul A (Saratoga, CA); Schreiber; Robert (Palo Alto, CA); Binkert; Nathan L (Redwood City, CA); Ranganathan; Parthasarathy (San Jose, CA)
Applicant: Hewlett-Packard Development Company, L.P. (Houston, TX, US)
Family ID: 49996308
Appl. No.: 14/643763
Filed: March 10, 2015
Related U.S. Patent Documents

Application Number | Filing Date  | Patent Number
13562248           | Jul 30, 2012 | 9015721
14643763           |              |
Current U.S. Class: 718/105
Current CPC Class: G06F 9/5088 20130101; G06F 9/4881 20130101; G06F 9/5066 20130101
International Class: G06F 9/50 20060101 G06F009/50; G06F 9/48 20060101 G06F009/48
Claims
1-15. (canceled)
16. A method for executing a program comprising a set of tasks, the
method performed by one or more processors of a master node and
comprising: assigning a plurality of the set of tasks to each
respective one of a plurality of worker nodes; sequencing the
plurality of tasks for execution by the respective worker node by
(i) identifying a first task in the plurality of tasks that yields
an array, (ii) identifying a second task in the plurality of tasks
that utilizes the array, and (iii) assigning the first task to be
performed by the respective worker node before the second task.
17. The method of claim 16, further comprising: identifying an
update to the array; and implementing a callback on the respective
node to perform the second task utilizing the updated array.
18. The method of claim 16, further comprising: maintaining, in a
memory resource, metadata defining a current execution state of the
program.
19. The method of claim 18, further comprising: backing up the
respective worker node after completion of each of the plurality of
tasks.
20. The method of claim 18, wherein the metadata further defines an
execution state for each of the plurality of worker nodes.
21. The method of claim 20, further comprising: transmitting a
periodic message to the plurality of worker nodes to determine the
current execution state for each of the plurality of worker
nodes.
22. The method of claim 21, further comprising: determining, based
on the periodic message, that the respective worker node has
failed; in response to determining that the respective node has
failed, identifying, from the metadata, a previous execution state
for the respective worker node; and restarting the respective
worker node based on the previous execution state.
23. A master node comprising: at least one processor; and at least
one memory resource storing instructions for executing a program
comprising a set of tasks, wherein the instructions, when executed
by the at least one processor, cause the master node to: assign a
plurality of the set of tasks to each respective one of a plurality
of worker nodes; sequence the plurality of tasks for execution by
the respective worker node by (i) identifying a first task in the
plurality of tasks that yields an array, (ii) identifying a second
task in the plurality of tasks that utilizes the array, and (iii)
assigning the first task to be performed by the respective worker
node before the second task.
24. The master node of claim 23, wherein the instructions, when
executed by the at least one processor, further cause the master
node to: identify an update to the array; and implement a callback
on the respective node to perform the second task utilizing the
updated array.
25. The master node of claim 23, wherein the instructions, when
executed by the at least one processor, further cause the master
node to: maintain, in the at least one memory resource, metadata
defining a current execution state of the program.
26. The master node of claim 25, wherein the instructions, when
executed by the at least one processor, further cause the master
node to: back up the respective worker node after completion of
each of the plurality of tasks.
27. The master node of claim 25, wherein the metadata further
defines an execution state for each of the plurality of worker
nodes.
28. The master node of claim 27, wherein the instructions, when
executed by the at least one processor, further cause the master
node to: transmit a periodic message to the plurality of worker
nodes to determine the current execution state for each of the
plurality of worker nodes.
29. The master node of claim 28, wherein the instructions, when
executed by the at least one processor, further cause the master
node to: determine, based on the periodic message, that the
respective worker node has failed; in response to determining that
the respective node has failed, identify, from the metadata, a
previous execution state for the respective worker node; and
restart the respective worker node based on the previous execution
state.
30. A non-transitory computer readable medium storing instructions
for executing a program comprising a set of tasks, wherein the
instructions, when executed by at least one processor of a master
node, cause the master node to: assign a plurality of the set of
tasks to each respective one of a plurality of worker nodes;
sequence the plurality of tasks for execution by the respective
worker node by (i) identifying a first task in the plurality of
tasks that yields an array, (ii) identifying a second task in the
plurality of tasks that utilizes the array, and (iii) assigning the
first task to be performed by the respective worker node before the
second task.
31. The non-transitory computer readable medium of claim 30,
wherein the instructions, when executed by at least one processor
of a master node, further cause the master node to: identify an
update to the array; and implement a callback on the respective
node to perform the second task utilizing the updated array.
32. The non-transitory computer readable medium of claim 30,
wherein the instructions, when executed by at least one processor
of a master node, further cause the master node to: maintain, in a
memory resource, metadata defining a current execution state of the
program and an execution state for each of the plurality of worker
nodes.
33. The non-transitory computer readable medium of claim 32,
wherein the instructions, when executed by at least one processor
of a master node, further cause the master node to: back up the
respective worker node after completion of each of the plurality of
tasks.
34. The non-transitory computer readable medium of claim 33,
wherein the instructions, when executed by at least one processor
of a master node, further cause the master node to: transmit a
periodic message to the plurality of worker nodes to determine the
current execution state for each of the plurality of worker
nodes.
35. The non-transitory computer readable medium of claim 34,
wherein the instructions, when executed by at least one processor
of a master node, further cause the master node to: determine,
based on the periodic message, that the respective worker node has
failed; in response to determining that the respective node has
failed, identify, from the metadata, a previous execution state for
the respective worker node; and restart the respective worker node
based on the previous execution state.
Description
BACKGROUND
[0001] Linear algebra, the study of vector spaces, has many
programming applications, such as in the areas of data mining,
image processing, and graph analysis. Linear algebra operations can
involve matrix operations, linear transformations, solutions to linear equations, and other such computations. There are many
applications that require large-scale data analysis using linear
algebra. Such applications can sometimes require extensive
computing resources, such as provided by parallel computing
environments.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] FIG. 1 illustrates an example system for managing array
computations during programmatic run-time in a distributed
computing environment, according to an embodiment.
[0003] FIG. 2 illustrates an example method for managing array
computations during programmatic run-time in a distributed
computing environment, according to an embodiment.
[0004] FIG. 3 illustrates an example computing system to implement
functionality such as provided by an example system of FIG. 1
and/or an example method of FIG. 2.
DETAILED DESCRIPTION
[0005] Example embodiments described herein provide a distributed computing environment that is configured to enhance processing of arrays during programmatic run-time. In such embodiments, the distributed arrays provide a shared, in-memory view of multi-dimensional data stored across multiple machines.
[0006] Example embodiments provided herein include a system and method for managing array computations during programmatic run-time in a distributed computing environment. In some implementations, a
plurality of array partitions are defined for use by a set of tasks
of the program run-time. The array partitions can be determined
from one or more arrays that are utilized by the program at
run-time. Each of the plurality of computing devices is assigned to perform one or more tasks in the set of tasks. By assigning each
of the plurality of computing devices to perform one or more tasks,
an objective to reduce data transfer amongst the plurality of
computing devices can be implemented. In particular, the objective
can be implemented by (i) determining two or more array partitions
that are accessed or modified by one or more tasks in the set of
tasks during run-time, and (ii) locating the two or more array
partitions on a first computing device of the plurality of
computing devices during the run-time.
[0007] An example computing system includes a master node, a plurality of worker nodes, and a storage resource that maintains
array data. The master node and the plurality of worker nodes
implement a plurality of tasks that are implemented as part of a
program during run-time. During run-time, the master node
determines, for multiple arrays that are utilized in a program at
run-time, a plurality of array partitions that are used by a set of
tasks of the program run-time. The master node also assigns each of
the plurality of workers to perform one or more tasks in the set of
tasks. The master node may also determine two or more array
partitions that are accessed or modified from the storage resource
by one or more tasks in the set of tasks during run-time.
Additionally, the master node selects to locate the two or more
array partitions on a first worker of the plurality of workers
during the run-time.
[0008] Still further, example embodiments provide an array in a
distributed computing environment in which multiple computing
devices are used. The array can be processed in connection with
performance of a plurality of tasks that are handled concurrently
by the multiple computing devices. A set of data transfer reduction
objectives can be implemented in processing the array. The set of
data transfer reduction objectives includes (i) determining an array partition of the array that is accessed and modified by at
least two of the plurality of tasks, and (ii) assigning a same one
of the multiple computing devices to perform the at least two of
the plurality of tasks.
[0009] Among other benefits, examples such as described herein
enable efficient implementation of large-scale data analysis that requires the use of array computations (e.g., linear algebra).
[0010] An array refers to a multi-dimensional data element,
represented as m×n. An array partition refers to a portion of
an array, such as a row, column or block of an array.
[0011] One or more embodiments described herein provide that
methods, techniques and actions performed by a computing device
(e.g., node of a distributed file system) are performed
programmatically, or as a computer-implemented method.
Programmatically means through the use of code, or
computer-executable instructions. A programmatically performed step
may or may not be automatic.
[0012] With reference to FIG. 1 or FIG. 2, one or more embodiments
described herein may be implemented using programmatic modules or
components. A programmatic module or component may include a
program, a subroutine, a portion of a program, or a software
component or a hardware component capable of performing one or more
stated tasks or functions. As used herein, a module or component
can exist on a hardware component independently of other modules or
components. Alternatively, a module or component can be a shared
element or process of other modules, programs or machines.
[0013] System Description
[0014] FIG. 1 illustrates an example system for managing array
computations during programmatic run-time in a distributed
computing environment, according to an embodiment. In an
embodiment, a system 10 is implemented with a distributed computing
environment that includes a master node 100 and one or more worker
nodes 110. The distributed computing environment can also include
storage resources 128. The worker nodes 110 can correspond to network-connected computing devices that implement tasks and other
functions. The master node 100 can include a server or other
computing device that provides control (e.g., task assignment,
coordination) and fault tolerance for the worker nodes 110. In some
implementations, the storage resources 128 of the distributed
computing environment can include a distributed or shared storage
driver 120, and a storage layer 130.
[0015] In system 10, tasks 119 for executing program 102 can be
distributed to the worker nodes 110 in parallel. In this way,
parallel loops can execute, for example, functions of the program
102 that access and modify arrays (or array portions). While the
program 102 can enable programmers to specify array partitions and
their placement, system 10 can be implemented to programmatically
determine placement of arrays, array portions, and functions and
tasks 119 that utilize arrays, based on pre-defined data transfer
reduction objectives and other considerations. The data transfer reduction objectives can reduce the amount of array data that must be transferred, as compared to, for example, distributions in which arrays, array partitions, or functions/tasks are randomly assigned to worker nodes 110, or assigned based on alternative considerations such as load-balancing.
[0016] During run-time, the computing environment performs array
computations, using one or more array data sources. In some
examples, the program 102 is constructed to implement a set of data
transfer reduction objectives that minimize data movement of the program's array data sources during the run-time. Among other
benefits, the reduction in data movement can significantly enhance
the performance of the distributed computing environment in running
the program 102.
[0017] The construct of program 102 can enable specific directives
103 and arguments 105 (or other semantic combinations) which
promote data transfer reduction objectives that control the manner
in which arrays and array partitions are processed. In particular,
the data transfer reduction objectives can co-locate array
partitions that are processed together, such as subjected to the
same functions (e.g., array multiplication). The program 102 can
enable arguments that specify arrays, or portions thereof, on which
operations such as linear algebra and array multiplication are
performed. The directives 103 can include a partition directive that results in the creation of a distributed array, partitioned into blocks whose dimensions can be specified or determined. The program 102 can be constructed to enable other
directives that promote the data transfer reduction objectives for
arrays.
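For purposes of illustration, the following Python sketch shows what such a partition directive could amount to at run-time. The names (DistributedArray, ArrayPartition) and their parameters are hypothetical stand-ins invented here, not constructs defined by this application:

```python
# Hypothetical sketch: declaring a distributed array that the runtime
# splits into block partitions whose dimensions are specified (or could
# be determined programmatically).

class ArrayPartition:
    """A row, column, or block portion of a distributed array."""
    def __init__(self, array_id, index, row_range, col_range):
        self.array_id = array_id    # owning distributed array
        self.index = index          # i-th partition of that array
        self.row_range = row_range  # (start_row, end_row), end exclusive
        self.col_range = col_range  # (start_col, end_col), end exclusive
        self.version = 0            # bumped whenever the partition changes

class DistributedArray:
    """Declares an m x n array partitioned into blocks."""
    def __init__(self, name, m, n, block_rows, block_cols):
        self.name = name
        self.partitions = []
        index = 0
        for r in range(0, m, block_rows):
            for c in range(0, n, block_cols):
                self.partitions.append(ArrayPartition(
                    name, index,
                    (r, min(r + block_rows, m)),
                    (c, min(c + block_cols, n))))
                index += 1

# A 1000 x 1000 array split into four 250-row partitions.
A = DistributedArray("A", 1000, 1000, block_rows=250, block_cols=1000)
assert len(A.partitions) == 4
```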
[0018] The storage layer 130 can include one or more data stores,
such as provided by HBase, Vertica, or MySQL Cluster. At run-time,
each of the master node 100 or worker nodes 110 can access the
storage layer 130 to construct arrays and array partitions. The
storage driver 120 supports parallel loading of array partitions.
In some implementations, the storage driver 120 also supports
callbacks that are registered with arrays and array partitions.
When a callback is registered, a callback notification 129 can be
communicated, from an entity that processes the array, to the
storage driver 120 (or storage layer 130) in response to, for
example, designated change events. The callback notification 129
can identify a change in a particular array or array partition. In
examples provided, the changes that trigger callbacks can be
specified by the program 102. For example, the programmer can
specify what changes register callbacks to the storage driver 120
during run-time.
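The callback mechanism can be pictured as a simple publish/subscribe registration against the storage driver. The sketch below is illustrative only; register_callback and notify_change are invented names, not an API defined for the storage driver 120:

```python
# Sketch of callback registration: the program registers interest in
# designated change events on an array partition, and the driver delivers
# the callback notification (129) when such a change occurs.

class StorageDriver:
    def __init__(self):
        self._callbacks = {}  # (array_id, partition_index) -> [callables]

    def register_callback(self, array_id, partition_index, fn):
        self._callbacks.setdefault((array_id, partition_index), []).append(fn)

    def notify_change(self, array_id, partition_index, new_version):
        # Called by whichever entity modified the partition.
        for fn in self._callbacks.get((array_id, partition_index), []):
            fn(array_id, partition_index, new_version)

driver = StorageDriver()
driver.register_callback(
    "A", 0,
    lambda aid, idx, v: print(f"partition {idx} of {aid} is at version {v}"))
driver.notify_change("A", 0, new_version=1)  # delivers the notification
```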
[0019] During run-time, the master node 100 can access the storage
resources 128 to receive array data 111 for arrays and array
partitions that are to be utilized in the performance of program
102. Likewise, the worker nodes 110 can access the storage
resources to receive array data 113 corresponding to array
partitions that are to reside with the particular worker during
run-time. The master node 100 can further include logic to
implement one or more data transfer reduction objectives. The data
transfer reduction objectives can be implemented in part by, for
example, a partition co-location component 112 that operates to
determine a same worker node 110 for residence of array partitions
that are accessed and modified together in, for example, the same
function or task.
[0020] In addition to the co-location component 112, the master
node 100 can implement the objective by scheduling tasks 119 as
part of the run-time execution of the program 102. In particular, a
task scheduling component 114 can assign tasks 119 to worker nodes
110, based on, for example, the presence of a needed array
partition for that task on one of the worker nodes 110. The
scheduling of tasks can include (i) location determination, where one of the worker nodes 110 from the set is selected to implement the individual tasks 119, and/or (ii) sequencing or timing the
performance of tasks (e.g., using considerations such as
dependencies amongst tasks for completion of other tasks). Other
considerations, such as described with examples of FIG. 2, can also
be included in the logic of the task scheduling component 114.
[0021] As described by an example of FIG. 1, data transfer
reduction objectives can be implemented to reduce or minimize
transfer of array data amongst the worker nodes 110 of the
distributed computing environment. For example, the partition
co-location component 112 can operate to maintain array partitions
together when functions such as multiplication of the array partitions are needed. Likewise, the task scheduling component 114
can schedule tasks 119 to reduce or otherwise minimize the data
flow between nodes. In some examples, the worker nodes 110 can be
linked across networks, and communicate with one another using
remote direct access (RDA) operations 115. In some variations, the
RDA operations 115 can be used to transfer array data 113,
corresponding to array partitions. For example, a task executing on
one worker node 110 can access an array partition from another
worker node 110 via one or more RDA operations 115. As noted,
reduction in the number of RDA operations 115 can increase efficiency and performance when the program 102 is executed at run-time. Additionally, the worker nodes 110 can access array
data 113 from the storage resource 128, and the implementation of
objectives such as described can reduce the amount of data that is
transferred from the remote storage resources 128 to the individual worker nodes 110.
[0022] An example of FIG. 1 illustrates data transfer reduction
objectives that consider partition co-location and/or task scheduling. Other implementations can promote data transfer
reduction objectives. For example, as described with an example of
FIG. 2, array partitions can be cached on each worker node 110,
using respective worker memory 118, when certain conditions are
present that permit use of cached array data. In particular, the
caching of the array partitions and data can be implemented for
instances when array partitions are unchanged. The use of caching
can further limit, for example, the need for worker nodes 110 to
access remote data sources for array data 113.
[0023] In some implementations, the system 10 can assign versions
to array partitions. When, for example, the worker node 110 changes
the array partition, the master node 100 or worker node 110 can
change the version of the array partition. When the array partition
is unchanged, the version of the array partition is unchanged, and
the cached version of the array partition can be used.
[0024] In some variations, the worker memory 118 can also be used
to maintain task queues. Task queues can also be implemented in
order to enable efficient implementation of tasks with dependencies,
with minimal transfer of array data. For example, task queues can
store handles to functions that have embedded dependents, as a
mechanism to enable those functions to be queued until their
respective dependents are updated.
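One way to picture such a queue is sketched below; DependencyQueue and its methods are hypothetical names. The sketch assumes a queued function handle becomes runnable exactly when its last outstanding dependent array has been updated:

```python
# Sketch of a task queue keyed on array dependents (names invented): a
# function handle stays queued until every array it depends on has been
# updated, then becomes runnable.

class DependencyQueue:
    def __init__(self):
        self._pending = []  # (fn, set of array ids still awaiting update)

    def enqueue(self, fn, dependents):
        self._pending.append((fn, set(dependents)))

    def on_update(self, array_id):
        """Called when an array is updated; returns now-runnable handles."""
        runnable, still_pending = [], []
        for fn, waiting in self._pending:
            waiting.discard(array_id)
            (still_pending if waiting else runnable).append((fn, waiting))
        self._pending = still_pending
        return [fn for fn, _ in runnable]

q = DependencyQueue()
q.enqueue(lambda: print("run task B"), dependents={"A"})
for fn in q.on_update("A"):  # the update to A releases task B
    fn()
```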
[0025] In one implementation, each worker 110 communicates a
notification to the master node 100 when an array has been changed.
For example, the arrays can register the callback notifications 129
which cause the workers 110 to communicate notifications of new
array versions when changes occur. The master node 100 can respond to the changes and send notifications to other worker nodes when a new array version exists. As described with an example of FIG. 2,
sending notifications of array versions to worker nodes 110 in
connection with tasks can facilitate data transfer reduction
objectives for arrays, such as caching.
[0026] The master node 100 can also utilize one or more
programmatic mechanisms for detecting and handling arrays and array
partitions that are dependent on other arrays at run-time. For
example, the directives 103 of the program 102 can include commands that identify dependencies amongst arrays or array partitions. As additional examples, the directives 103 of the program 102 can also
specify when arrays need to be updated, particularly in the context
of incremental functions or functions which are dependent on other
functions. In an implementation, when a first array is dependent on
other arrays, those other arrays are updated before a particular
operation is performed on the first array. Tasks can also be queued
pending completion of other tasks that update required arrays or
array partitions.
[0027] In some examples, task queues can be provided with storage
resources 128. The task queues can include handles to functions
that are dependents of individual task clauses. The program 102 can
be implemented to register callbacks for arrays that are specified
in clauses with dependencies. When such an array is updated, the
clause that specifies that array as a dependency receives a
notification signaling the occurrence of the array update. The
notification can signal a version of the array partition that is
updated. The version can be used to retrieve array data 113
corresponding to the update from the particular worker where the
change was made, or from storage resource 128.
[0028] In some variations, the master node 100 includes a fault tolerance component 116. The fault tolerance component 116 can be
used to perform operations for backing up the master node 100 and
the worker nodes 110. The master node 100 can maintain metadata
information in a metadata store 126, for use in performing some or
all of the backup operations. For the master node 100, the metadata
123 that is stored identifies the execution state of program 102,
as well as the worker information (e.g., state of tasks or workers)
and the symbol table. The master node 100 can, for example, be backed up after completion of each task. The metadata can also include
the version 139 of each array partition handled on each worker node
110. For example, the worker node 110 can communicate a new version
139 for an array partition to the master node 100 when that array
partition is changed, and the master node 100 can communicate the
array version 139 to other workers that need the partition. In
addition to sending notifications of new array versions 139, the master node 100 can restart (or re-assign) the worker node 110 to implement the task on the correct version of the array partition if a failure occurs.
[0029] Additionally, the master node 100 can send periodic
heartbeat messages to determine the progress or state of the worker
nodes 110. If a worker node 110 fails, the worker can be restarted. Metadata for the particular worker can be maintained in the metadata store 126 and used to return the worker to a most
recent state (that was backed up). The metadata for the workers can
reflect the particular task and/or function that was running on the
worker node 110, the inputs used by the worker node 110, and the
version of the arrays that were in use on the failed worker node
110.
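The heartbeat-and-restart cycle might look roughly like the following sketch; check_workers, probe, and restart are invented names, and the timeout and period are arbitrary assumptions:

```python
# Sketch of heartbeat-driven failure handling (all names invented): the
# master probes each worker; a worker that has not answered within the
# timeout is restarted from its most recently backed-up state, which
# records the task, inputs, and array versions in use on that worker.

import time

def check_workers(workers, metadata_store, probe, restart, timeout=30.0):
    """One heartbeat round; intended to be called periodically."""
    now = time.time()
    for worker_id in workers:
        if probe(worker_id):                  # heartbeat answered
            metadata_store[worker_id]["last_seen"] = now
        elif now - metadata_store[worker_id]["last_seen"] > timeout:
            # Presumed failed: restore the worker to its backed-up state.
            restart(worker_id, metadata_store[worker_id]["state"])
            metadata_store[worker_id]["last_seen"] = now

# Example round with stub probe/restart functions:
meta = {"w0": {"last_seen": time.time() - 60.0, "state": {"task": "T1"}}}
check_workers(["w0"], meta,
              probe=lambda w: False,
              restart=lambda w, s: print(f"restarting {w} from {s}"))
```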
[0030] As described, processes such as those described with the fault tolerance component 116 enable, for example, long-running array processes to be implemented and efficiently restored, even in the face of failure by the master node or a worker node.
[0031] Methodology
[0032] FIG. 2 illustrates an example method for managing array
computations during programmatic run-time in a distributed
computing environment, according to an embodiment. A method such as
described by an example of FIG. 2 can be implemented using, for
example, components described with an example of FIG. 1.
Accordingly, reference is made to elements of FIG. 1 for the purpose of illustrating a suitable component for performing a step or sub-step being described.
[0033] Array partitions are identified which are accessed and/or modified together (210). The directives 103 of the program 102 can include commands that enable the programmer to specify arrays and array partitions, with size parameters that are either manually (e.g., by the programmer) or programmatically identified. In one
example, the program construct can support array partitioning. More
specifically, array partitioning can be specified by directives
included in the program 102. A directive can, for example, specify an array in an argument, and specify the dimensions of the array partition, including whether the array partition is a row, column, or block. Additionally, array partitioning can be
dynamically determined.
[0034] At run-time, a set of data transfer reduction objectives can
be implemented to reduce or minimize the instances in which array
data is transferred between nodes of the computing environment
(220). In some implementations, the set of data transfer reduction
objectives can seek to co-locate arrays and array partitions on
worker nodes 110. For example, the master node 100 may seek to
co-locate array partitions which are accessed and modified together
in a same function, or as part of a same task (222). Embodiments
recognize that linear algebra programs often rely on structured
processing, and array partitions can often be co-located by placing
the ith partition of the corresponding arrays on the same worker
node 110. In some variations, the directives of the program 102 can
enable the programmer to specify a command that identifies which
arrays are related and suitable for co-location on the same worker
node 110.
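A minimal sketch of this index-based placement follows; colocate and its parameters are hypothetical names, and the round-robin assignment is merely one possible policy:

```python
# Sketch of index-based co-location (names invented): for arrays declared
# as related, the i-th partition of each array is placed on the same
# worker, so structured work on matching partitions needs no cross-node
# transfer.

def colocate(related_arrays, num_partitions, worker_ids):
    """Map (array_name, partition_index) -> worker id."""
    placement = {}
    for i in range(num_partitions):
        worker = worker_ids[i % len(worker_ids)]  # same worker for each i
        for name in related_arrays:
            placement[(name, i)] = worker
    return placement

placement = colocate(["A", "B"], num_partitions=4, worker_ids=["w0", "w1"])
assert placement[("A", 2)] == placement[("B", 2)]  # co-located by index
```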
[0035] As an addition or alternative, the master node 100 can also
schedule tasks to minimize array data movement amongst the worker
nodes (224). The task scheduling can include determining the
location (e.g., worker node 110) on which the task is to be
performed (225). In one implementation, the master node 100 can
calculate the amount of remote data copy required when a task
is assigned to one of the worker nodes 110. The tasks can then be
scheduled to minimize data movement. For example, the task
scheduling component 114 of the master node 100 can select worker nodes 110 to perform tasks based on, for example, the calculated
amount of remote data copy required for individual tasks, as well
as directives that specify relations amongst arrays in the program
102.
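The remote-copy calculation can be sketched as follows; schedule_task and its inputs are invented names, assuming partition sizes and current residences are known to the master:

```python
# Sketch of data-movement-aware task placement (names invented): for each
# candidate worker, sum the bytes of needed partitions that are NOT
# already resident there, and assign the task to the worker with the
# smallest total remote copy.

def schedule_task(needed, partition_sizes, residence, worker_ids):
    """
    needed:          partition keys the task reads or writes
    partition_sizes: partition key -> size in bytes
    residence:       partition key -> worker currently holding it
    """
    def remote_copy_bytes(worker):
        return sum(partition_sizes[p] for p in needed
                   if residence.get(p) != worker)
    return min(worker_ids, key=remote_copy_bytes)

residence = {("A", 0): "w0", ("B", 0): "w0", ("C", 0): "w1"}
sizes = {("A", 0): 8_000_000, ("B", 0): 8_000_000, ("C", 0): 1_000}
# The task needs A0, B0, and C0: w0 only has to copy the tiny C0, so it wins.
assert schedule_task(sizes.keys(), sizes, residence, ["w0", "w1"]) == "w0"
```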
[0036] The task scheduling can also determine and accommodate the
sequence in which tasks, or functions of tasks are performed (227).
For example, the task scheduling component 114 can include
determining dependencies of tasks on one another, using, for
example, the directives specified in the program 102. Task queues
can also be implemented in order to enable efficient implementation
of tasks with dependence, with minimal transfer of array data. In
implementation, dependent functions can be identified, and a
function requiring other updates can be queued pending updates to
such arrays. For example, when all dependence of a particular task
are implemented and updated, then the particular task is
performed.
[0037] As a variation, the sequencing of tasks can also include
implementing callbacks between nodes, so that when array transfers
do occur, they are efficiently implemented (e.g., minimized in
occurrences). As an example, if Task A performs array
multiplication to yield Array A, and Task B utilizes Array A, then
the task scheduling component 114 can sequence Task A and Task B so
that the tasks can be co-located on the same worker node. As a
variation, callbacks can be utilized between different nodes that
carry Task A and Task B so that Task B is triggered with updates to
the Array A.
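This yield/use ordering amounts to a topological sort of tasks over the arrays they produce and consume; a sketch under that assumption follows, with sequence_tasks and its inputs as invented names:

```python
# Sketch of yield/use sequencing (names invented): a task that yields an
# array is ordered before any task that utilizes it, so a consumer never
# runs against a stale or missing array.

from graphlib import TopologicalSorter  # Python 3.9+

def sequence_tasks(produces, consumes):
    """
    produces: task -> set of arrays the task yields
    consumes: task -> set of arrays the task utilizes
    Returns tasks in an order where producers precede consumers.
    """
    deps = {t: set() for t in set(produces) | set(consumes)}
    for consumer, arrays in consumes.items():
        for producer, yielded in produces.items():
            if producer != consumer and yielded & arrays:
                deps[consumer].add(producer)
    return list(TopologicalSorter(deps).static_order())

order = sequence_tasks(
    produces={"TaskA": {"ArrayA"}, "TaskB": {"ArrayB"}},
    consumes={"TaskA": set(), "TaskB": {"ArrayA"}})
assert order.index("TaskA") < order.index("TaskB")  # A runs before B
```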
[0038] The task scheduling can also utilize array caching (229) to
reduce array data transfer. For example, the task scheduling
component 114 can automatically cache and reuse arrays and array
partitions whose versions have not changed. For example, at
run-time, the worker node 110 can keep the reference to an array
alive after it is cached, and await notification from the master node 100 when another worker node or task has a new version of that
array available. The worker node 110 can then access and receive
the new version of the array from the storage resources 128. The
worker node 110 thus accesses the storage resources 128 when the
new version is available, rather than at each calculation when the array version has not changed.
[0039] Hardware Diagram
[0040] FIG. 3 illustrates an example computing system to implement
functionality such as provided by an example system of FIG. 1
and/or an example method of FIG. 2. For example, computing system
300 can be used to implement master node 100, or any one of the
worker nodes 110. In one implementation, computer system 300
includes at least one processor 305 for processing instructions.
Computer system 300 also includes a memory 306, such as a random
access memory (RAM) or other dynamic storage device, for storing
information and instructions to be executed by processor 305. The
memory 306 can include a persistent storage device, such as a
magnetic disk or optical disk. The memory 306 can also include
read-only-memory (ROM). The communication interface 318 enables the
computer system 300 to communicate with one or more networks
through use of the network link 320.
[0041] Computer system 300 can include display 312, such as a
cathode ray tube (CRT), an LCD monitor, or a television set, for
displaying information to a user. An input device 314, including
alphanumeric and other keys, is coupled to computer system 300 for
communicating information and command selections to processor 305.
The computer system 300 may be operable to implement functionality
described with master node 100 or worker node 110 of system 10 (see
FIG. 1). Accordingly, computer system 300 may be operated to
implement a programmatic construct using a language such as, for
example, R in order to implement functionality that provides for
efficient management and use of arrays. The communication interface
318 can be used to, for example: (i) receive instructions 311 that
are part of the program construct, (ii) exchange array data 313
with the storage resource 128 (see FIG. 1), and/or (iii) send or
receive master-worker communications, such as array update
notifications 315 that identify array versions and state
information.
[0042] Embodiments described herein are related to the use of
computer system 300 for implementing the techniques described
herein. According to one embodiment, those techniques are performed
by computer system 300 in response to processor 305 executing one
or more sequences of one or more instructions contained in memory
306. Such instructions may be read into memory 306 from another
machine-readable medium, such as a storage device. Execution of the
sequences of instructions contained in main memory 306 causes
processor 305 to perform the process steps described herein. In
alternative embodiments, hard-wired circuitry may be used in place
of or in combination with software instructions to implement
embodiments described herein. Thus, embodiments described are not
limited to any specific combination of hardware circuitry and
software.
[0043] Although illustrative embodiments have been described in
detail herein with reference to the accompanying drawings,
variations to specific embodiments and details are encompassed by
this disclosure. It is intended that the scope of embodiments
described herein be defined by claims and their equivalents.
Furthermore, it is contemplated that a particular feature
described, either individually or as part of an embodiment, can be
combined with other individually described features, or parts of
other embodiments. Thus, absence of describing combinations should
not preclude the inventor(s) from claiming rights to such
combinations.
* * * * *