U.S. patent application number 12/756019 was filed with the patent office on 2010-08-05 for method and system for executing a task and medium storing a program therefor.
This patent application is currently assigned to International Business Machines Corporation. Invention is credited to David Francis Bacon, Perry Sze-Din Cheng, David Paul Grove, Daniel J. Spoonhower.
Application Number | 20100198885 12/756019 |
Document ID | / |
Family ID | 38823182 |
Filed Date | 2010-08-05 |
United States Patent
Application |
20100198885 |
Kind Code |
A1 |
Bacon; David Francis ; et
al. |
August 5, 2010 |
METHOD AND SYSTEM FOR EXECUTING A TASK AND MEDIUM STORING A PROGRAM
THEREFOR
Abstract
A method of executing a task includes executing, by using a
processor, a first task including a low-frequency task in which
garbage is collected using a garbage collector, initializing a
second task including a high-frequency task by constructing an
instance of a class that implements a standard runnable thread
interface, and creating a data structure for supporting
communication between the second task and lower priority threads,
the data structure being accessible by a thread running in a
garbage-collected heap in the first task to communicate data
between the high-frequency task and the low-frequency task,
validating the second task to ensure that the second task is
executable without synchronizing with the first task, instantiating
the second task to create a class for executing the second task;
and after the instantiating the second task, executing the second
task, the garbage collector being preemptable by the second
task.
Inventors: |
Bacon; David Francis;
(Sleepy Hollow, NY) ; Cheng; Perry Sze-Din; (New
York City, NY) ; Grove; David Paul; (Ridgefield,
CT) ; Spoonhower; Daniel J.; (Pittsburgh,
PA) |
Correspondence
Address: |
MCGINN INTELLECTUAL PROPERTY LAW GROUP, PLLC
8321 OLD COURTHOUSE ROAD, SUITE 200
VIENNA
VA
22182-3817
US
|
Assignee: |
International Business Machines
Corporation
Armonk
NY
|
Family ID: |
38823182 |
Appl. No.: |
12/756019 |
Filed: |
April 7, 2010 |
Related U.S. Patent Documents
|
|
|
|
|
|
Application
Number |
Filing Date |
Patent Number |
|
|
11449774 |
Jun 9, 2006 |
|
|
|
12756019 |
|
|
|
|
Current U.S.
Class: |
707/813 ;
707/E17.001 |
Current CPC
Class: |
G06F 12/0269 20130101;
Y10S 707/99957 20130101 |
Class at
Publication: |
707/813 ;
707/E17.001 |
International
Class: |
G06F 12/00 20060101
G06F012/00; G06F 17/30 20060101 G06F017/30 |
Claims
1. A method of executing a task, comprising: executing, by using a
processor, a first task comprising a low-frequency task in which
garbage is collected using a garbage collector; initializing a
second task comprising a high-frequency task by: constructing an
instance of a class that implements a standard runnable thread
interface; and creating a data structure for supporting
communication between said second task and lower priority threads,
said data structure being accessible by a thread running in a
garbage-collected heap in said first task to communicate data
between said high-frequency task and said low-frequency task;
validating said second task to ensure that said second task is
executable without synchronizing with said first task;
instantiating said second task to create a class for executing said
second task; and after said instantiating said second task,
executing said second task, said garbage collector being
preemptable by said second task.
2. The method of claim 1, wherein said validating said second task
comprises determining a set of objects (O), a set of field
signatures (F), a set of methods (M), and a set of virtual method
signatures (V) which are accessible by said second task.
3. The method of claim 2, wherein said determining said set of
objects comprises a data-sensitive analysis.
4. The method of claim 1, wherein said instantiating said second
task comprises: creating an executable thread that will repeatedly
invoke said second task; tracking said set of objects which are
accessible by said second task; pinning said objects such that said
objects are immovable by said first task; extending an interface to
a run-time system with second task run-time support code for
supporting said pinning of said objects; performing another pinning
of said objects to determine if any pointers in said objects are
stale and, if stale pointers are determined, updating said stale
pointers; forwarding pointers for objects associated with said
second task; and installing a set of objects which are accessible
by said second task into run-time data structures such that said
set of objects is external to said second task.
5. The method of claim 4, wherein only said second task run-time
support code has access to said pinning said objects and unpinning
said pinned objects.
6. The method of claim 1, wherein said executing said second task
comprises: performing an allocation while handling an exceptional
condition during said executing said second task.
7. The method of claim 1, wherein said executing said second task
further comprises: adding a set of pinned objects to a global table
for a duration of executing said second task to ensure that any
object accessible to said second task is accessible from another
root.
8. The method of claim 1, wherein said second task is implemented
using standard programming language constructs and wherein objects
associated with said second task are automatically discovered by
the run-time system.
9. The method of claim 1, wherein objects associated with said
second task reside in a garbage-collected heap, are referenceable
by other objects in said garbage-collected heap, are pinned during
a life of said second task, and are subject to garbage collection
upon termination of said second task.
10. The method of claim 1, wherein said executing said second task
is performed without executing a run-time memory access check.
11. The method of claim 1, wherein said validating said second task
comprises adding an identifier to a header of an object.
12. The method of claim 1, wherein in said executing said second
task, a priority of said second task is higher than a thread that
schedules said first task.
13. The method of claim 1, wherein a thread for said executing said
second task is distinguished from a thread for executing said first
task by using a flag in a thread-specific data structure, said flag
being used by a run-time scheduler to determine which threads
should reach a safe point before a collection quantum begins.
14. The method of claim 1, wherein said second task is terminatable
at any instruction without negatively effecting a remainder of a
system in which said second task is executed.
15. The method of claim 1, wherein said frequency of said second
task comprises a sampling frequency.
16. The method of claim 1, wherein said low-frequency task
comprises a first frequency and said high-frequency comprises a
second frequency greater than said first frequency.
17. The method of claim 1, wherein said low-frequency and
high-frequency tasks are instantiated and initialized with a shared
channel queue, and the high-frequency task is validated before
being passed to a scheduler, wherein said scheduler periodically
invokes the high-frequency task, and wherein said validating said
second task is performed while an application is running and before
said second task is invoked.
18. The method of claim 1, wherein said validating said second task
comprises creating a data-specific call graph for the data
structure, and performing a set of checks on bytecodes of resulting
methods.
19. A system for executing a task, comprising: a first executing
module for executing a first task comprising a low-frequency task
in which garbage is collected using a garbage collector; an
initializing module for initializing a second task comprising a
high-frequency task by: constructing an instance of a class that
implements a standard runnable thread interface; and creating a
data structure for supporting communication between said second
task and lower priority threads, said data structure being
accessible by a thread running in a garbage-collected heap in said
first task to communicate data between said high-frequency task and
said low-frequency task; a validating module for validating said
second task to ensure that said second task is executable without
synchronizing with said first task; an instantiating module for
instantiating said second task to create a class for executing said
second task; and a second executing module for, after said
instantiating said second task, executing said second task, said
garbage collector being preemptable by said second task.
20. A programmable storage medium tangibly embodying a program of
machine-readable instructions executable by a digital processing
apparatus to perform a method of executing a task, the method
comprising: executing, by using a processor, a first task
comprising a low-frequency task in which garbage is collected using
a garbage collector; initializing a second task comprising a
high-frequency task by: constructing an instance of a class that
implements a standard runnable thread interface; and creating a
data structure for supporting communication between said second
task and lower priority threads, said data structure being
accessible by a thread running in a garbage-collected heap in said
first task to communicate data between said high-frequency task and
said low-frequency task; validating said second task to ensure that
said second task is executable without synchronizing with said
first task; instantiating said second task to create a class for
executing said second task; and after said instantiating said
second task, executing said second task, said garbage collector
being preemptable by said second task.
Description
RELATED APPLICATIONS
[0001] This Application is a Continuation Application of U.S.
patent application Ser. No. 11/449,774 filed on Jun. 9, 2006.
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] The present invention relates to a method (and system) for
executing a task and, more particularly, to a method of executing a
task (e.g., a low latency task) which includes executing a second
task, a garbage collector in a first task being preemptable by the
second task.
[0004] 2. Description of the Related Art
[0005] The complexity of real-time systems is increasing rapidly as
the isolated real-time controllers of the past give way to complex
systems in which many coordinated systems interact to create an
integrated multi-level real-time system, such as a car or a ship.
In the face of this huge increase in complexity, traditional
real-time methodologies based on low-level coding styles and fixed
data structures are no longer tenable.
[0006] Java's software engineering benefits have made it a
compelling choice for modular application development: memory
safety, a standardized model of concurrency, and a host of common
libraries make it possible to develop large and complex
applications in a piecewise fashion. With the development of
Metronome real-time garbage collection technology and the Real-Time
Specification for Java (RTSJ) standard, Java has become a viable
platform for the creation of such complex real-time systems.
However, there are many problems remaining before a dynamic,
garbage collected language like Java can provide a complete
real-time solution down to the lowest-level, highest-frequency
applications.
[0007] Real-time garbage collection, as exemplified by the
Metronome system, is currently able to achieve worst-case latencies
on the order of a millisecond. This latency bound may scale down by
another factor of two to four as a result of improved
implementation techniques. However, latency is limited by the need
for atomic processing of certain data structures, and by the
overhead of context switching, especially since the collector tends
to evict much of the application data from the cache.
[0008] As a result, real-time Java programmers with response time
requirements below a millisecond are currently forced to resort to
the RTSJ's NoHeapRealtimeThread (NHRT) construct. As its rather
verbose moniker implies, code executed by an NHRT is isolated from
the garbage-collected heap and may therefore be run concurrently
with the collector or interrupt it arbitrarily. However, NHRTs may
only create and access objects in the manually-managed Immortal and
Scoped memory areas.
[0009] Unfortunately, this requires fundamental changes to the
semantics of the Java programming language. Scopes essentially
provide region-based memory management with neither explicit region
qualifiers nor implicit region inference. As a result, the region
restrictions of a method are not documented in its interface and
may vary dynamically based on its inputs. This greatly lowers the
level of abstraction of the language, requires expensive run-time
checks, and makes it difficult to reason about the behavior of an
object without understanding its implementation.
[0010] Most perniciously, NHRTs require dynamic checks on both
reads and writes of reference fields to guarantee that region
constraints are not violated. These dynamic checks will throw
exceptions when they fail, while reading or writing reference
fields of an object in the heap will never throw an exception. This
represents a deep change to the language semantics and a
significant obstacle to code re-use.
SUMMARY OF THE INVENTION
[0011] In view of the foregoing and other exemplary problems,
drawbacks, and disadvantages of the conventional systems and
methods, a purpose of the exemplary aspects of the present
invention is to provide a method (and system) of executing a task
(e.g., a low latency task) that may be simpler to program and
implement, more reliable and more efficient than other
approaches.
[0012] In an exemplary embodiment of the present invention, a
method for executing a task (e.g., a low latency task) includes
executing a first task in which garbage is collected using a
garbage collector, and executing a second task, the garbage
collector being preemptable (e.g., preempted) by the second
task.
[0013] In an exemplary aspect of the present invention, a second
task (which can preempt the garbage collector) may have a lower
latency (e.g., higher frequency) than a first task.
[0014] In another embodiment of the present invention, a method of
executing a task includes executing a first task in which garbage
is collected using a garbage collector, initializing a second task
by: constructing an instance of a class that implements a standard
runnable thread interface, and creating a data structure for
supporting communication and synchronization between the second
task and lower priority threads (e.g., threads having a lower
priority than the second task), validating the second task to
ensure that the second task is executable without synchronizing
with the first task, instantiating the second task to create a
class for the executing the second task by one of: creating an
executable thread that will repeatedly invoke the second task, and
associating the second task with an interrupt handler and executing
a second task, the garbage collector being preemptable (e.g.,
preempted) by the second task.
[0015] In another exemplary embodiment of the present invention, a
system for executing a task includes a first executing module for
executing a first task in which garbage is collected using a
garbage collector, and a second executing module for executing a
second task which preempts said garbage collector.
[0016] In another exemplary embodiment of the present invention, an
interrupt handler for an operating system includes an interrupt
module for interrupting an operation in the operating system, and a
system for executing a task including a first executing module for
executing a first task in which garbage is collected using a
garbage collector, and a second executing module for executing a
second task which is associated with the interrupt module and the
garbage collector being preemptable (e.g., preempted) by the second
task.
[0017] Another exemplary embodiment of the present invention
includes a programmable storage medium tangibly embodying a program
of machine-readable instructions executable by a digital processing
apparatus to perform a method of executing a task according to the
present invention.
[0018] In another exemplary embodiment, the method according to the
present invention may include deploying computing infrastructure in
which computer-readable code is integrated into a computing system,
such that the code and the computing system combine to perform the
executing of the first task in which garbage is collected using a
garbage collector, and the executing a second task, the garbage
collector being preemptable (e.g., preempted) by the second
task.
[0019] With its unique and novel features, the present invention
provides a method (and system) of executing a task (e.g., a low
latency task) which may be simpler to program and implement, more
reliable and more efficient than other approaches.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] The foregoing and other exemplary purposes, aspects and
advantages will be better understood from the following detailed
description of an exemplary embodiment of the invention with
reference to the drawings, in which:
[0021] FIG. 1 illustrates a method 100 of executing a task
according to an exemplary aspect of the present invention;
[0022] FIG. 2 illustrates a method 200 of executing a task
according to an exemplary aspect of the present invention;
[0023] FIG. 3 illustrates a system 300 for executing a task
according to an exemplary aspect of the present invention;
[0024] FIG. 4 illustrates code for implementing a low-frequency
task according to an exemplary aspect of the present invention;
according to an exemplary aspect of the present invention;
[0025] FIG. 5 illustrates code for implementing a high-frequency
task according to an exemplary aspect of the present invention;
[0026] FIG. 6 illustrates code for implementing task integration
according to an exemplary aspect of the present invention;
[0027] FIG. 7 illustrates a system 700 for executing a task
according to an exemplary aspect of the present invention;
according to an exemplary aspect of the present invention;
[0028] FIGS. 8A and 8B provide code for operations that may be
illegal according to an exemplary aspect of the present
invention;
[0029] FIG. 9 illustrates how an interface to run-time support
operations may be added to a virtual machine to support the
Construct according to an exemplary aspect of the present
invention;
[0030] FIG. 10 provides Table 1 which includes Validation times for
several exemplary Constructs according to an exemplary aspect of
the present invention;
[0031] FIGS. 11-14 provide histogram plots which present the
results from experiments conducted by the inventors;
[0032] FIG. 15 illustrates an oscilloscope view of a Construct
according to an exemplary aspect of the present invention;
[0033] FIGS. 16-17 provide histogram plots which present the
results from experiments conducted by the inventors;
[0034] FIG. 18 illustrates a typical hardware configuration which
may be used for implementing the system and method according to the
exemplary aspects of the present invention; and
[0035] FIG. 19 illustrates a programmable storage medium 1900
tangibly embodying a program of machine-readable instructions
executable by a digital processing apparatus to perform the method
according to the exemplary aspects of the present invention.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS OF THE INVENTION
[0036] Referring now to the drawings, and more particularly to
FIGS. 1-19, there are shown exemplary embodiments of the method and
structures of the present invention.
[0037] FIG. 1 illustrates a method 100 for executing a task
according to an exemplary aspect of the present invention. The
method 100 includes executing (110) a first task in which garbage
is collected using a garbage collector, and executing (120) a
second task, the garbage collector being preemptable (e.g.,
preempted) by the second task.
[0038] In an exemplary aspect of the present invention, a second
task (which can preempt the garbage collector) may have a lower
latency (e.g., higher frequency) than a first task.
[0039] FIG. 2 illustrates method 200 for executing a task according
to another exemplary aspect of the present invention. The method
200 includes executing (210) a first task in which garbage is
collected using a garbage collector, initializing (220) a second
task by constructing an instance of a class that implements a
standard Runnable thread interface, and creating a data structure
for supporting communication and synchronization between the second
task and lower priority threads (e.g., threads having a lower
priority than the second task), validating (230) the second task to
ensure that the second task is executable without synchronizing
with the first task, instantiating (240) the second task to create
a class for the executing the second task by one of creating an
executable thread that will repeatedly invoke the second task, and
associating the second task with an interrupt handler, and
executing (250) a second task, the garbage collector being
preemptable (e.g., preempted) by the second task.
[0040] FIG. 3 illustrates a system 300 (e.g., a software
implemented system) for executing a task (e.g., a low latency
task), according to an exemplary aspect of the present invention.
The system 300 includes a first executing module 310 for executing
a first task in which garbage is collected using a garbage
collector 311, and a second executing module 320 (e.g., a new
programming construct) for executing a second task having a higher
frequency than the first task, the garbage collector being
preemptable (e.g., preempted) by the second task.
[0041] As illustrated in FIG. 3, the first executing module 310 may
receive data for executing a first task and output a result of
executing a first task, and the second executing module 320 may
receive data for executing a second task and output a result of
executing a second task.
[0042] The system 300 may also include an initializing module 330
for initializing the second task by constructing an instance of a
class that implements a standard Java Runnable interface, and
creating a data structure for supporting communication and
synchronization between the second task and low-frequency threads,
and a validating module 340 for validating the second task to
ensure that the second task is executable without synchronizing
with the first task, and an instantiating module 350 for
instantiating the second task to create a class for the executing
the second task.
[0043] Further, the instantiating module 350 may include a tracker
351 for tracking a set of objects which are accessible by the
second task, and a pinner 352 for pinning the objects such that the
objects are immovable by the first task.
Overview
[0044] While real-time garbage collection has achieved worst-case
latencies on the order of a millisecond, this technology is
approaching its practical limits. For tasks requiring extremely low
latency, and especially periodic tasks with frequencies above 1
KHz, Java programmers must currently resort to the
NoHeapRealtimeThread construct of the Real-Time Specification for
Java. This technique requires expensive run-time checks, can result
in unpredictable low-level exceptions, and inhibits communication
with the rest of the garbage-collected application.
[0045] The present invention, on the other hand, includes a new
programming construct that can arbitrarily preempt the garbage
collector, and may guarantee safety and allow its data to be
visible to the garbage-collected heap. The programming construct
according to the present invention may be a strict subset of Java,
and require no run-time memory access checks. Safety may be
enforced using a data-sensitive analysis and simple run-time
support with extremely low overhead.
[0046] The inventors have implemented the programming construct in
a Java virtual machine, in experiments in which the construct was
run at frequencies up to 22 KHz (a 45 .mu.s period). Across 10
million periods, 99.997% of the executions ran within 10 .mu.s of
their deadline, compared to 99.999% of the executions of the
equivalent program written in C.
Detailed Discussion
[0047] The present invention includes a programming construct
(e.g., "Construct") for tasks that can arbitrarily preempt a
garbage collector and yet may be comprised entirely of standard
Java code without extra run-time memory access checks. This
programming construct (e.g., the second executing module 310 in
FIG. 3) according to the present invention may be simple to design,
develop, and understand. The programming construct can also be
terminated in the event of cost over-run without causing deadlocks
or data structure corruption.
[0048] It should be noted that the inventors implemented an
exemplary aspect of the present invention using a Construct which
they named "Eventron". Thus, there may be referenced to the term
Eventron herein. However, this term should in no way be considered
limiting and, in fact, basically any term may be substituted for
the term Eventron herein.
[0049] Specifically, other terms may be substituted for the term
"Eventron" in the programming code discussed herein. Thus, for
example, the Construct according to an exemplary aspect of the
present invention may be entitled "Construct X" in which case the
term "ConstructXReadChannelofShort" may be substituted for the term
"EventronReadChannelOfShort" discussed herein, and so forth.
[0050] The new programming construct may be both more and less
restrictive than NHRTs and Scopes. Allocation within the new
programming construct may not be permitted, and neither may be
modification of pointers. These restrictions may be enforced at
validation time using a simple bytecode analysis. Once the
Construct is validated, it may not throw any exceptions except
those resulting from normal Java semantics such as array bounds
checking.
[0051] The Construct may be less restrictive than Scopes because
the construct's data structures can be accessed by normal Java
threads running in the garbage collected heap. As a result,
communication of data between high-frequency tasks, and lower
frequency, garbage-collected tasks is greatly simplified.
[0052] Validation occurs at run-time, examining both the code and
the data of the construct, but must be performed before the
Construct may be scheduled. By choosing this intermediate point
between fully static and fully dynamic checking, we are able to
perform a simple analysis whose restrictions are easy for
programmers to understand, and yet obtain a fairly precise call
graph since it is based on a known set of previously instantiated
objects. The separation of the program into an "initialization" and
a "mission" phase is also natural to most real-time
applications.
[0053] The Construct is well suited to tasks like sensor
processing, in which sensor data must be sampled at a very high
frequency, buffered, and then processed en masse by a
lower-frequency task.
[0054] The Construct is also well-suited to simple high-frequency
control loops that read a limited amount of data, perform some
simple processing (such as FFTs) and then produce actuator control
values. The ability to share data between high-and low-frequency
tasks allows the use of more complex, dynamically allocated,
garbage-collected data structures in the low-frequency tasks that
make higher-level planning decisions requiring more complex
processing.
[0055] The combination of the Construct for high-frequency tasks
and real-time garbage collection for medium-to low-frequency tasks
provides an integrated programming methodology that enables the use
of heap-allocated data structures in even the most demanding
real-time applications, and yet does not change the Java
programming model in any fundamental way.
Example: Music Generation
[0056] We begin with an example to illustrate the use of the
Construct to implement an integrated real-time application. The
example combines both low- and high-frequency tasks to generate and
play back music without relying on the operating system or
underlying hardware to buffer the output. A low-frequency task
performs the computation necessary to compose a musical score and
generate the raw audio data, while a high-frequency task sends the
raw audio data to the unbuffered output device, one sample at a
time. To provide seamless audio playback at the standard frequency
of 22.05 KHz, the high-frequency task in this example must be run
every 45 .mu.s.
[0057] Implementing a Low-Frequency Task
[0058] FIG. 4 illustrates a program for executing a low-frequency
task. This task generates the next note of a musical score,
converts the note to an array of samples, and then writes the
samples to a shared channel.
[0059] That is, the task in FIG. 4 may use a music composition
library (not shown) to generate a musical score one note at a time.
Each note is then converted to a set of samples and written to a
shared channel. The low-frequency task (including any libraries it
uses) may rely on a garbage collector to automatically manage
memory; the code used to implement this task may only be
constrained at the point where it interacts with the high-frequency
task.
[0060] In this example, that interaction may be managed by an
EventronReadChannelOfShort. This general purpose data structure may
provide a form of impedance matching between the low-and
high-frequency tasks and can be configured to hide the latency and
jitter of the low-frequency task. While the code shown in FIG. 4
may allocate an array from which the raw audio samples are copied
into the channel queue, the channel queue also provides a mechanism
to avoid this copy.
[0061] Implementing a High-Frequency Task
[0062] FIG. 5 illustrates a program for a high-frequency task,
according to an exemplary aspect of the present invention. This
task may reads a single 16-bit sample from the shared channel and
play it using an interface to the raw audio device.
[0063] The high-frequency task in FIG. 5 may be implemented using
the Construct abstraction. During an iteration (e.g., each
iteration), the task may read a single 16-bit sample from the
shared channel and write that sample out to the audio device.
[0064] A number of constraints may be enforced on the Construct
code during initialization using an on-line analysis. For example,
note that reference fields may be declared final on lines 2-6. The
inventors found that the use of final may be a natural and familiar
way to ensure that the Construct data structure remains fixed over
time.
[0065] The high-frequency task may use the non-blocking interface
of the EventronChannel to read individual samples. This operation
may return either a sample or a value indicating that the channel
is empty or closed. Alternatively, programmers can avoid reserving
special indicator values and instead use exceptions. However, all
exceptions should (e.g., must) be preallocated, as also illustrated
in the example.
[0066] This task in FIG. 5 also uses a native method (lines 9 and
19) to communicate with the audio playback device. Many
applications of the Construct may require native methods to
communicate with hardware. The Construct run-time takes steps to
ensure the relative safety of Java native interface (JNI) code that
is invoked by a Construct.
[0067] Notice that the high-frequency task does not necessarily
contain a looping construct. Instead, the scheduler may invoke this
task periodically as indicated during initialization. The Construct
may continue to be invoked until it is explicitly removed from the
scheduling queue (by some other thread), or when an exception is
thrown from the Construct run( ) method.
[0068] Integrating Tasks
[0069] FIG. 6 illustrates a program for task integration, according
to an exemplary aspect of the present invention. The two tasks may
be instantiated, initialized with a shared channel queue, and the
high-frequency task may be validated before being passed to the
scheduler.
[0070] That is, FIG. 6 provides an example of integration of the
low-and high-frequency tasks. On line 11 in FIG. 6, the programmer
may instantiate a shared data structure which is the sole mechanism
for interaction between the low-and high-frequency tasks. The
low-frequency task may be scheduled on lines 18 and 19 by
instantiating an ordinary Java thread.
[0071] Before the high-frequency task can be executed, it may be
first be validated (line 16) to ensure that it can safely preempt
the collector. While validation may fail and cause a checked
exception to be thrown in the main( ) method, memory access checks
are not necessarily performed while executing the Construct.
[0072] Thus errors may occur during the initialization phase of the
application instead of during the execution of the Construct
itself. This is a natural time to detect any problems in the
configuration of the Construct, as the application should also
ensure that any required hardware is properly configured and that
sufficient resources are available to run the Construct.
[0073] Validation may also cause the run-time system to pin down
any data structures accessible by the Construct: these objects may
not be collected or even moved to another location in the heap
while the Construct is active. However, once the Construct is
disabled, these objects can be moved and, if they are otherwise
unreachable, reclaimed.
The Life and Times of the Construct
[0074] FIG. 7 provides an example of a system 700 (e.g., a
software-implemented system) in which the Construct 701 may
interact with the heap 710, according to an exemplary aspect of the
present invention. In addition, the heap 710 may be accessible by
the stack 705 and global 715. Further, the Construct 701 may
include buffers 703 which may store values in the construct, to
provide efficient access to the values.
[0075] Objects 702 associated with a Construct 701 may reside in
the garbage-collected heap 710 but may be pinned for the life of
the Construct. The objects 702 may be referenced by other heap
objects 711, and may be subject to garbage collection once the
Construct 701 terminates.
[0076] The Construct 701 may defined by a set of Java methods that
may be executed as a high-frequency task, the data structures
manipulated by those methods, and a thread that will execute them.
In the present invention, Constructs 701 can safely preempt the
virtual machine, and in particular the garbage collector, at any
point by eliminating any synchronization between Constructs 701 and
the collector.
[0077] First, it is ensured that the Constructs 701 cannot change
the reachability of any object managed by the collector. While this
condition might be enforced by additional run-time checks, the
present invention may instead impose constraints on the code and
data structures defining a Construct 701 that can be verified
before the Construct 701 is executed. These constraints ensure that
the set of objects accessible to a Construct 701 is fixed for the
duration of its execution, while allowing objects used by a
Construct 701 to be accessed by ordinary Java threads. These
constraints may be enforced using a run-time static analysis: an
analysis of the Construct bytecode may be performed while the
program is executing, but before the Construct itself is invoked.
This may allow an efficient and precise analysis.
[0078] Secondly, the need for synchronization between the garbage
collector and a Construct 701 may be eliminated by precluding any
collector operations that, if preempted by a Construct 701, might
expose the heap 710 in an inconsistent state. Specifically, objects
accessible to a Construct 701 may be pinned. Thus, the collector
may never be interrupted while moving one of the Construct's
objects.
[0079] The lifetime of a Construct 701 may be split into five
phases. First, a Construct 701 may be initialized in the same way
an ordinary Java structure is constructed. During the second phase,
validation, the present invention may verify that the Construct 701
code can safely preempt the collector and discover (e.g.,
simultaneously discover) the set of objects that should be
pinned.
[0080] The third phase is instantiation, in which the Construct's
data structure may be pinned and an executable thread may be
created. The fourth phase is execution, during which the Construct
701 may be invoked one or more times, either at specific times,
with a predetermined period, or in response to certain external
events. Finally, a Construct may be destroyed, at which point the
objects in its data structure may be unpinned and eventually
garbage collected, unless the objects are reachable from the heap
710 or other Constructs 701.
Initialization
[0081] To create a Construct, the programmer may construct an
instance of a class that implements the standard Java Runnable
interface. This object may form the root of the Construct data
structure, and its run( ) method may form the entrypoint for the
execution of the Construct.
[0082] The programmer may also build the data structures necessary
to support communication and synchronization between the Construct
and low-frequency threads. Typically, this may involve the use of
one or more channels.
Validation
[0083] During the validation phase, the present invention may
ensure that a Construct may be safely executed without
synchronizing with the garbage collector (i.e., the collector).
Validation may take place while the application is running but
before the Construct is invoked. This may allow the validation
process to take advantage of information that is present at
run-time and to detect potentially misbehaving code while the task
is being set up rather than during the execution of the Construct
itself.
[0084] Validation should (e.g., must) ensure three features of the
Construct code: first, that the Construct code does not allocate
any memory, second, that the Construct code does not perform any
blocking operations, and third, that the extent of the heap that it
accesses remains fixed.
[0085] These restrictions may be enforced by creating a
data-specific call graph for the particular Construct being
validated, and performing a set of checks on the bytecodes of the
resulting methods. This process can be viewed as a kind of partial
evaluation.
[0086] Enforcement of Restrictions
[0087] FIGS. 5A and 5B provides a list of operations that may be
illegal in Construct code. Specifically, FIG. 5A provides a table
which includes illegal bytecodes in Construct code and FIG. 5B
provides a list of illegal method invocations in Construct
code.
[0088] Operations that may be illegal within a Construct are
summarized in FIG. 5A. When any of these operations are
encountered, a ConstructValidationException is thrown reporting the
source code location of the offending instruction.
[0089] As shown in FIG. 5A, bytecodes (e.g., all bycodes) that
perform allocation may be prohibited, as well as bytecodes that
perform synchronization either explicitly (via monitor bytecodes)
or implicitly (via synchronized methods). As further shown in FIG.
5B, reflective access to such operations may also be prohibited, as
is the use of the wait and notify methods of Object (which would
necessarily generate exceptions since the Construct thread is never
able to enter a monitor).
[0090] The last group of restrictions in FIG. 5A may ensures that
Constructs neither read nor update mutable heap references. Writes
to object fields, static variables, and array elements may only be
permitted for primitive types. Reads are permitted for
non-primitive types, but only if the field is declared final.
[0091] There are at least two complications with respect to final
fields: first, Java does not provide any mechanism to specify the
immutability of array elements. To this end, the present invention
may provide a special class, ImmutableArray<T>, which stores
an array of immutable references to objects of class T.
[0092] An ImmutableArray may be initialized by passing it an array
of references, which is copied into a private instance array of the
ImmutableArray object. The only methods provided are get(index) and
length( ). Since the finality of the array elements within an
ImmutableArray is enforced programmatically, the validator allows
their use within a Construct.
[0093] The reflective analogues may get and set of the proscribed
pointer-accessing bytecodes may also be prohibited. Note that
unlike the byte-code operations, the reflective operations may be
more restricted, since at validation time the fields or methods
being accessed may not be known. In particular, neither reflective
access to final pointer fields nor reflective method invocation may
be permitted within a Construct.
[0094] The second complication with respect to finality has to do
with exposure of incompletely constructed objects. This problem may
be solved in the present invention by using a simple run-time
assisted mechanism.
[0095] In practice, the inventors found that rejecting all methods
that perform allocation may inhibit the reuse of some existing
library methods inside the Construct. This is often due to the
common Java idiom of allocating exception objects ("throw new
FooException"). In many cases, the programmer knows that the
offending code will never be executed but the validator
conservatively assumes otherwise. An optional feature of the
validator allows programmers to exempt certain methods from the
normal check for allocating bytecodes. To keep this feature from
compromising safety, the run-time allocator may throw an immediate
OutOfMemoryError whenever an allocation occurs on a Construct
thread.
[0096] Data-Sensitive Call Graph Analysis
[0097] The validator's analysis may explore the set of reachable
methods and the set of reachable objects (e.g., simultaneously),
letting each set inform the other. Rather than build a complete
call graph, the validator may analyze only those methods that could
be invoked given the set of objects reachable by the Construct. In
the course of its analysis, the validator may conservatively
determine sets of objects O, field signatures F, methods M, and
virtual method signatures V reachable to the Construct. These are
the structures which may be accessed by the Construct during its
execution.
[0098] Initially, O may contain only the object implementing the
Runnable interface, and the only method in M is its run( ) method.
For each method added to the call graph, the validator may consider
each byte-code instruction of that method in turn. Note that each
method may be analyzed at most one time, and therefore the
validator may terminate in time linear in the size of the program
and the size of the heap. [0099] getfield--The signature of target
field f is added to the validation set F. In addition, the
validator may consider every object in O that defines a field
matching that signature. For every such object, the validator may
use reflection to determine the referent of its field f; this
referent may then added to O.
[0100] As objects are added to O, each one should (e.g., must) be
inspected for fields that are already found in F, and the referents
of these fields should be added to O. Therefore, in the course of
processing a single getfield instruction, only one field signature
may be added to F, but many objects may be added to O. [0101]
getstatic--The validator may add (e.g., immediately add) the
referent of the static field to O. Any fields of the referent that
appear in F are considered as above. [0102] invokestatic,
invokespecial--For method invocations whose targets can be
statically determined, the validator may add the target method to
M. [0103] invokeinterface, invokevirtual--For virtual method
invocations, the signature of the target method may be added to V.
The validator may then consider each object in O implementing that
signature. For each such object, its implementation may be added to
M and subsequently validated.
[0104] Just as the fields of objects added to O may (e.g., must) be
compared to the signatures in F, the methods of those objects may
(e.g., must) be compared to the signatures in V. That is, for an
object (e.g., each object) added to O, and for a method (e.g., each
method) m of that object, if m matches a signature in V, then m may
be added to M.
[0105] Finally, the restrictions described above may be enforced as
the method (e.g., each method) is processed. If a prohibited
bytecode instruction or method invocation appears in an analyzed
method body, the analysis may terminate immediately, throwing an
exception that indicates where the offending code can be found.
[0106] If the validation succeeds, the objects in O may be those
that may be accessed by the Construct and therefore should (e.g.,
must) be pinned.
[0107] Recall that any reference field (instance or static)
accessed by a Construct may (e.g., must) be declared final. This
restriction may be used not only to ensure that synchronization
with the collector can be safely omitted, it may also be helpful
(e.g., necessary) for the soundness of the above analysis. Without
this restriction, the validator may not be able to safely assume
which objects may occupy a given field or which implementations of
a given method signature might be invoked.
[0108] Because the present invention may require only those fields
that might be referenced by a Construct to be declared final, some
reference fields in the Construct data structure may not be final.
As a result, pinned objects may hold (mutable) references to
unpinned objects, as shown in FIG. 7. This is safe since the
inventors' analysis has shown that these fields may not be accessed
by the Construct.
[0109] Data-Sensitive Analysis versus Other Analyses
[0110] Note that the analysis utilized by the present invention may
be neither flow-sensitive nor context-sensitive. It may be however,
data-sensitive: the call graph may be constructed for the
invocation of the run( ) method of a particular concrete Construct
instance. In general, flow- and context-sensitivity can increase
the precision of call graph analysis. However, even fairly
aggressive algorithms still make some conservative approximations,
especially about the contents of instance fields, which can induce
imprecision in the call graph.
[0111] JIT compilers may often compensate for the imprecision of
static call graph construction algorithms by exploiting profile
data, online class hierarchy analysis, and other optimistic and
speculative optimizations. This often works well in practice, but
if the assumption turns out to be invalid, the code may need to be
(e.g., must be) recompiled (an operation that may not necessarily
mesh well with real-time execution).
[0112] In principle, a data-sensitive analysis may be more or less
precise than any of the above techniques, depending on the
particular class hierarchies, call graphs, and execution orders.
However, the inventors believe that a data-sensitive analysis may
often yield the most precise result in practice.
[0113] Regardless of the precision, however, there is at least one
more compelling reason to use data-sensitive analysis: it may be
simple enough that it can be easily described to a programmer. More
complex analyses are easily perturbed by small changes in the code,
and a seemingly trivial change can cause two variables to become
possibly aliased.
[0114] The analysis used by the present invention may avoid this
form of speculation and rely on two forms of information which may
be specified (e.g., specified directly) by the user: the object
graph and the use of final fields. In contrast to other analyses,
the data-sensitive analysis in the exemplary aspects of the present
invention may use mechanisms that may be familiar to Java
programmers and may yield an induced call graph that may be quite
stable in the face of changes to the program.
[0115] Implementation
[0116] The present invention may implement the validation phase
using a combination of standard Java reflection (to explore the
graph of reachable objects) and specialized VM-specific code to
consult the bytecodes and constant pool of already loaded and
validated classes (validation may (e.g., must) run against the
in-memory classes to prevent security attacks that change the class
files).
[0117] It may be straightforward to implement the same validation
algorithm using a class file inspection package, with or without a
static analysis framework (for example, an earlier prototype used
the Domo program analysis library). However, such a validator may
only be safe to use for development-time exploration of Construct
validity. A run-time validator may (e.g., must) ensure that it is
working with the actual bytecodes that have been loaded into the
running VM.
[0118] Preventing Violations of Finality
[0119] To help ensure complete safety of Construct execution, some
additional checks may be required for code that executes outside of
the Construct threads.
[0120] Exposure of Incompletely Constructed Objects
[0121] Unfortunately, simply using final may not guarantee that
heap structures will be immutable. A constructor may expose the
pointer before the constructor's fields are completely initialized,
for instance by storing this into a static field. If a partially
constructed object is present in a Construct data structure, the
validator may see a final field as null even though it will be
subsequently initialized to a non-null value. In this case, the
validator may have based its analysis on incomplete information,
and the safety of the Construct could be compromised.
[0122] Attempts to prevent such operations via static analyses tend
to be grossly over-conservative because the targets of virtual
calls are not available.
[0123] An exemplary aspect of the present invention, however,
provides a solution which may include adding a "fully constructed"
flag to the header of an object (e.g., every object). The
interpreter may set this flag as a constructor returns (the return
of any constructor for a given class is sufficient to guarantee
that final fields of that class and its superclasses are
initialized). The present invention may provide a run-time
interface allowing the flag to be queried, and the validator may
throw an exception whenever it encounters an object for which this
flag is not set. This may result in a simple programming model and
an efficient implementation.
[0124] The specification to the programmer may be simple: all
Construct objects must be completely constructed.
[0125] Reflection and JNI
[0126] It may also be possible to undermine the restrictions of
final using reflection. The method invocation restrictions on
Construct code (FIG. 8B) may already rule out any reflective access
to pointers in the Construct data structure from the Construct
itself. However, other threads may have full access to reflection,
and in Java 1.5, reflection may even allow the modification of
final fields (!) after a call to setAccessible( ).
[0127] According to the specification, this functionality may only
be intended to be used during deserialization and other uses may
have "unpredictable effects." However, any violation of finality
may render Constructs unsafe, which the inventors do not consider
to be an acceptable "unpredictable effect," so this should be
prevented.
[0128] In an exemplary aspect of the present invention , a
preferred approach to this problem may include modifying the
Field.set( ) implementation of java.lang.reflect to check whether
the field being set is a final field of an object pinned by a
Construct. In that case, a run-time exception may be thrown.
Similarly, Array.set( ) can be modified to prevent reflective
modification of pointers inside the arrays that implement the
ImmutableArray class.
[0129] Restrictions on reflective field access should not cause
runtime exceptions in code that uses the interface as intended,
since Construct data structures may be constructed before
validation, and so by definition are not necessarily being
de-serialized.
[0130] Similar to reflection, JNI code can circumvent Construct
restrictions and violate validation assumptions. For example,
native code can execute Java methods that were not part of the
validation-time call graph, read and write reference fields that
the validator did not know were to be read or written, enter
monitors, or allocate objects. Despite this, many Constructs must
use native code to interface with hardware devices.
[0131] One could argue that JNI code is inherently unsafe since it
is capable of directly modifying any word in memory (indeed, no JNI
access control restrictions are enforced in the Java programming
language as a deliberate design decision). However, the inventors
judged that JNI code that was completely safe to execute in other
contexts ought not to produce arbitrary results simply because it
was invoked by a Construct.
[0132] The present invention may catch native code violations with
inexpensive dynamic checks in those JNI helper functions that
should not be called by Constructs. Each helper function that would
be unsafe to execute on a Construct thread may immediately throw an
exception if executed on a Construct thread. This allows problems
to be caught early in the testing phase (albeit not necessarily at
validation time).
Instantiation
[0133] Once a Construct has been validated, it should (e.g., must)
be instantiated. This may involve (e.g., requires) either creating
an executable thread that will invoke the Construct repeatedly at
high frequency, or associating the Construct with an interrupt
handler. In either case, the run-time system should (e.g., must)
also track the set of objects accessible to the Construct and pin
these objects so that they are not moved by the garbage collector.
The set of Construct-reachable objects may be created during the
validation phase (to avoid redundant object traversals) and may be
now installed into the run-time data structures. This set may be
external to the Construct itself and persist until the Construct is
destroyed.
[0134] Pinning of Construct Objects
[0135] For Constructs to work properly in conjunction with a
collector that moves objects (whether or not the collector is
incremental), the objects should (e.g., must) be pinned before the
Construct is allowed to execute.
[0136] FIG. 9 illustrates how Interfaces to run-time support
operations may be added to the virtual machine to support the
Constructs (e.g., Eventrons).
[0137] There are various collector architectures that move objects,
among them page-based defragmenting collectors, generational
collectors, and semi-space copying collectors, as well as their
concurrent variants (although the present invention is not
necessarily limited to any particular type of garbage collector).
There are also various approaches to pinning objects suited to the
different kinds of collector architectures. For instance, in a
generational collector, objects should not be pinned in the
nursery, so a nursery evacuation should be forced as part of the
Construct instantiation procedure.
[0138] The present invention is described herein based on the
design used in the inventors' actual implementation that runs
together with the Metronome collector, a real-time incremental
collector that organizes memory into page-based segregated free
lists (although the present invention is not limited thereto).
Defragmentation may be performed on an as-needed basis and compact
objects within a size class.
[0139] To minimize changes to the Java run-time system, the
inventors' chose to implement the run-time support for Constructs
in Java whenever possible. One exception includes support for
pinning objects. In an exemplary aspect, the inventors extended the
interface to the run-time system with the functions shown in FIG.
9.
[0140] Invoking pinObjects may prevent the objects from being moved
by the collector. Invoking unpinObjects may revert this state
change. Only the Construct run-time support code may have access to
the pinning and unpinning methods. The native support may be
wrapped by Java code that aggregates pinning information from
multiple Constructs. These native methods may be called (e.g., are
only called) with collections of objects the first time those
objects are used by a Construct or once all Constructs using those
objects have terminated.
[0141] This interface may be similar to the Create/DestroyGlobalRef
functions found in the Java Native Interface. While those functions
may ensure that a given object will not be freed, they do not
necessarily guarantee that the referent will not be relocated by
the collector. Though no standard support for pinned objects may
exist in Java, several other languages may already provide similar
functionality. For example, the above interface can be easily
implemented using existing run-time support in either the Common
Language Runtime or the Haskell Foreign Function Interface.
[0142] Since Metronome uses a page-based memory architecture, the
implementation in an exemplary aspect of the present invention of
pinning may maintain a count of the number of pin operations
applied to the objects of each page. At instantiation time,
pinObjects( ) may be invoked with the set of objects that are
accessible to the Construct but that are not already pinned. Since
pinning may be atomic with respect to interleavings with the
garbage collector, once the entire data structure has been pinned,
it may be guaranteed that the collector will not move any of the
objects.
[0143] Compensating for Races with Compaction
[0144] However, this does not necessarily guarantee collector
independence: there may be pointers among the objects that were
pinned early in this stage that refer to objects that were later
pinned but moved by the collector in the intervening time. Thus,
there may be stale pointers in the Construct data structure. For
the garbage-collected heap, this may not be a problem since those
pointers will be fixed during the subsequent tracing phase of the
collector before the old versions of objects are reaped.
[0145] Unlike ordinary Java threads, the Construct's stack is not
necessarily scanned to identify roots or fix stale pointers once it
begins execution. Thus, a Construct could potentially load a stale
pointer into a stack variable and dereference it after the old
object (with its forwarding pointer) had been reaped by the
collector and re-allocated.
[0146] This problem may be addressed by performing a second
iteration over the Construct objects after pinning is completed.
The two-phase nature of object pinning and forwarding may be part
of the implementation of pinObjects. The second iteration may check
whether any pointers in the object are stale and immediately update
them to the forwarded copy of the referent.
[0147] This second pass may guarantee that no stale pointers remain
in any Construct object. After this point, no Construct object may
be moved by the collector nor may any Construct object contain
pointers that will be modified by the collector. Therefore, the
Construct stack may never contain stale references and does not
need to be examined, and at no point in the Construct's execution
should (e.g., must) it synchronize with the collector.
[0148] Once all Construct objects are pinned and their pointers are
forwarded, a Construct thread may be created. The run-time may
ensure that becomeEventronThread is invoked for each Construct so
that it may be distinguished from ordinary Java threads.
Execution
[0149] Thread Priorities
[0150] Once a Construct is released, it may arbitrarily preempt the
garbage collector. Constructs by default may be created with a
priority just higher than the thread that schedules the garbage
collector (which itself may run at a priority just higher than the
collector threads themselves).
[0151] Construct priorities may be set to any value allowed by the
underlying system. Constructs may co-exist with RTSJ
NoHeapRealTimeThreads, and their relative scheduling may be handled
by the operating system based on their respective priorities.
[0152] Distinguishing Construct Threads
[0153] Construct threads should (e.g., must) not yield to garbage
collection threads at the beginning of each collector quantum.
Therefore, an exemplary aspect of the present invention may
distinguish Construct threads using a flag in a thread-specific
data structure. This flag may be used by the run-time scheduler to
determine which threads must reach a safe point before a collection
quantum can begin. In addition, as the collector is scanning the
stack of each application thread, it should (e.g., must) ignore the
stacks of Construct threads since these threads are not necessarily
at safe points.
[0154] Because the stacks of Construct threads are not necessarily
scanned during collection, it should (e.g, must) be ensured that
any object accessible to a Construct is also reachable from some
other root. The set of pinned objects associated with each
Construct may serve to maintain this invariant: each set of pinned
objects may be added to a global table for the duration of the
execution of the Construct.
[0155] Construct threads may share some characteristics with
NoHeapRealtimeThreads, for example, neither may be suspended during
the execution of the collector. However, these two types of threads
may be distinguished, as Constructs are not necessarily subject to
the same checks on heap access.
[0156] Run-Time Exceptions
[0157] While the inventors' analysis may rule out any allocation
that may occur because of code intentionally written by the
programmer, the present invention may allow the programmer to relax
this restriction for code included from libraries, meaning that
unanticipated allocations may occur, causing an immediate
OutOfMemoryError. In addition, the system may perform allocation
while handling an exceptional condition that arises during normal
execution (e.g., NullPointerException).
[0158] The inventors found it impractical to change all of the code
paths involved, and typically the run-time does much more than
allocate an exception object (e.g., it also allocates a string of
message text). In an exemplary aspect (e.g., the inventors'
implementation), these allocations may also cause an immediate
OutOfMemoryError. Despite this, the inventors' implementation may
ensure that all such exceptions contain valid traceback
information, which simplifies debugging.
[0159] Exceptions may be caught by a Construct, including both
those exceptions that are preallocated and those that arise from
allocation. Unhandled exceptions may result in the termination of
the execution phase. The Construct may (e.g., will) not be re-run
until it is explicitly rescheduled.
Termination
[0160] Once a Construct has run its course, either after it has
been disabled by a call from another thread, or because it threw or
failed to catch an exception, its data structures may be unpinned
and the Construct thread may revert to the ordinary state for a
brief interval before terminating. During this brief interval, the
Construct thread may store into reference fields, and this may
explain how exceptions thrown during the execution phase may be
preserved for inspection by other threads.
[0161] When a Construct is destroyed, the destruction routine may
unpin the associated data structure, so that Construct objects that
remain live due to pointers from the mutable heap may be no longer
pinned, and there may be no "pinning leak."
[0162] Killing, Restarting, and Cost Enforcement
[0163] Since Constructs do not necessarily modify pointer
structures and do not necessarily hold locks (because all
operations should (e.g., must) be non-blocking), it should be
possible to terminate a Construct at any instruction without
negative effects on the rest of the system.
[0164] This avoids another major problem with other approaches to
high-frequency tasks in Java, namely that they may hold resources
which may make the semantics of early termination poorly
defined.
[0165] The only circumstance under which killing Constructs is
unsafe may be when the program is using a custom synchronization
protocol in which the Construct might expose partially updated
state non-atomically. However, as long as the channel abstractions
are used this should (e.g., will) not happen.
[0166] This capability can be used to implement either asynchronous
termination of the entire Construct or of one particular execution.
This may mesh well with cost enforcement, since it may allow the
early termination of an iteration that misses its deadline, and may
immediately reset it to a state in which it can handle the next
event in a timely fashion.
Construct Channels
[0167] To aid programmers in transferring data between tasks of
different priorities and frequencies, an exemplary aspect of the
present invention may provide an implementation of wait-free,
allocation-free channels. These channels may allow data to be
streamed to and from Construct tasks. Though the inventors intended
that EventronChannels may be common utility classes, they may still
be checked by the validation process described above.
[0168] An exemplary aspect of the present invention may also
provide the lower level Notifier object (discussed below as part of
the implementation of EventronChannels). The Notifier, along with
the properties of the Java memory model and/or the facilities of
the java.util.concurrent.atomic package, can be used to implement
other wait-free utilities.
[0169] EventronChannels should (e.g., must) be wait-free to avoid
priority inversion. An exemplary aspect of the present invention
may provide two versions of the channel abstraction: one that
provides wait-free access to the reader and a blocking interface to
the writer (called a EventronReadChannel) and another that provides
a symmetric interface (EventronWriteChannel). (The present
invention may also provide wait-free access for the low-priority
half of these channels.) This may allow for relatively
straightforward implementations of both high- and low-frequency
tasks: high-frequency tasks may be driven by the scheduler, and if
they avoid looping and use only wait-free structures such as
ConstructChannels, they should have predictable execution times.
Low-frequency tasks can be designed to produce or process data as
quickly as possible and can rely on the channel to throttle their
execution.
[0170] Channels should (e.g., must) also be free of allocation and
pointer mutation to avoid synchronization with the collector. Like
other Construct data structures, they should (e.g., must) perform
all allocation during initialization. An exemplary aspect of the
present invention may specialize EventronChannels for the single
reader/single writer case to allow efficient implementation in the
context of these constraints.
[0171] Each EventronChannel may provide a wait-free communication
mechanism by implementing a circular queue or ring of primitive
values. These values may be grouped together into buffers to
provide more efficient access. EventronChannels may come in seven
"flavors," one for each of the each of the primitive Java
types.
[0172] Programming Interface
[0173] At their core, ConstructChannels may be similar to input and
output streams, providing methods to read and write individual
elements and arrays of them. In the most basic interface, reading
(respectively, writing) an array of values may cause the elements
of the array to be copied from (respectively, to) the internal data
structure.
[0174] However, the present invention may also offer a slightly
more complex, but lower overhead, alternative. A reader or writer
may also acquire access to the internal buffer array. This may
allow data to be transferred between (e.g., directly between) tasks
without any additional copies. Programmers should (e.g., must) be
careful to relinquish references (e.g., all references) to a buffer
once it has been released for use by the other task. Failure to do
so may lead to unexpected values being read from the channel.
[0175] In many applications, when a channel overrun or underrun
occurs, the high-frequency task should (e.g., must) continue to
execute either by discarding data (in the former case) or using a
default value (in the latter), while the error itself should be
handled by some other task. EventronChannels may provide a
mechanism to automatically communicate these exceptional conditions
back to the low-frequency task: a subsequent write (read) performed
by the low-frequency task may throw an UnderrunException
(OverrunException).
[0176] Configuration
[0177] Two important (e.g., the most important) channel
configuration parameters include the number of buffers and the size
of each buffer. By dividing the channel storage into a fixed set of
buffers, the present invention may allow the programmer to control
the rate of synchronization between the reader and the writer. Each
buffer may be owned by at most one of the two processes, allowing
efficient access in the common case.
[0178] Only when moving from one buffer to the next may atomic
memory accesses (e.g., compare-and-swap) be required. The size and
number of buffers should (e.g., must) be determined given the
relative jitter of the two tasks. The size of each buffer may
determine the minimum latency between when values are written to
and read from the channel, and the number of buffers may determine
the maximum such latency.
[0179] In the audio playback example, a primary source of latency
for the low-frequency task is the garbage collector. Suppose in
this example that one is able to guarantee a maximum pause time of
5 ms and a minimum mutator utilization of 50% over any 10 ms
window. Then, the proper configuration may include three buffers
each containing 5 ms of data or, equivalently, 111 samples (recall
that our example uses 22.05 kHz audio).
[0180] Initially the (low-frequency) writer may (e.g., will) be 10
ms ahead of the reader, but due to interruptions by the garbage
collector, it may lag behind to the point where it is only 5 ms
ahead. Such a scenario may (e.g., will) only occur if the collector
has consumed a significant portion of the previous scheduling
quanta: by the present invention's minimum utilization guarantee,
the collector may (e.g., will) consume less of the subsequent
quanta and allow the synthesizer to speed ahead once again.
[0181] The EventronWriteChannel may have one additional
configuration parameter. In the case of an overrun, some data will
be lost: either the oldest data should (e.g., must) be overwritten,
or the newest should (e.g., must be discarded. An exemplary aspect
of the present invention (e.g., the inventors' implementation) may
support both, leaving the programmer to determine which is most
appropriate for a given application.
[0182] Implementation
[0183] Each EventronChannel may be implemented using an
ImmutableArray of BufferState objects, where such objects may
encapsulate a buffer and an integer state variable. The state of
each buffer may determine both the state of the content (whether
the buffer contains data or not) and who owns the buffer. For
example, a buffer (e.g., each buffer) in a EventronReadChannel may
be empty, full, reading, or writing, the last two indicating
ownership by the reader or writer, respectively.
[0184] There may be two additional states, underrun and closed,
that may be used to communicate changes in the status of the
channel as a whole. When an underrun occurs, the reader may wrest
ownership of the current buffer from the writer, setting its state
to underrun. The next access by the writer may (e.g., will)
discover the state change and throw exception. Analogously, the
writer may indicate the end of the data stream by setting the state
of its current buffer to closed.
[0185] EventronChannels may depend on atomic operations provided by
the Java.util.concurrent.atomic package. Atomic compare-and-swap
operations may be used to update the state of each buffer. For
example, an underrun may occur at the same moment that the writer
finishes writing to a buffer. Both tasks may attempt to update the
state of this buffer (to underrun and full, respectively). An
atomic update may ensure that both tasks agree on the winner of the
race.
[0186] Further, the implementation of channels may require a
mechanism to allow the high-frequency task to notify the
low-frequency one when a buffer becomes available. The standard
Java lock-notify-unlock may be unsuitable and while the
implementation of locks in java.util.concurrent.locks may include
tryLock and signal operations that avoid synchronization, they may
not perform allocation and pointer mutation, rendering them
unusable by Constructs. (This is partly because these locks may
support an unbounded number of waiters.) Instead, an exemplary
aspect of the present invention may provide a Notifier class that
wraps any Object. That is, an exemplary aspect of the present
invention may provide a notifyIfWaiting method, implemented
internally to the virtual machine, which notifies threads (if any)
waiting on the wrapped object's lock without blocking, allocating,
or updating any reference field.
Evaluation
[0187] The present inventors have tested the performance of both
validation and execution of Constructs, in particular comparing
them to equivalent C programs and evaluating the interference from
other (garbage collected) JVM threads.
[0188] All experiments were run on an IBM Intellistation Z Pro with
2 Pentium 4 Xeon processors (at 2.4 GHz and with 2-way
hyperthreading) and 2 GB of physical memory. The operating system
is Linux with the 2.6.14 kernel. Additionally, various patches to
improve real-time performance were applied including
high-resolution timers (HRT) and robust mutexes for priority
inheritance.
[0189] The base for the implementation is IBM's Real-Time edition
of the J9 virtual machine, which includes the Metronome garbage
collector, RTSJ, and an ahead-of-time compiler. The language level
is 1.5 (with generics) using the J2SE libraries.
[0190] The inventors' prototype implementation of Constructs
included the "fully constructed" check and safety checks for
reflective field updates. Although, in the inventor's experiments
the aspects of the Construct prototype were not integrated into the
production VM used to gather the results reported below, based on
earlier experiments with the prototype implementation, the
inventors do not believe this materially impacts the results
reported below.
[0191] All Constructs were run at 22.05 KHz (a 45.35 .mu.s period),
a standard frequency for digital audio. Tests were run until
10,000,000 samples were obtained. These runs were sufficiently
large to encompass periodic operating system activity which could
perturb the results and to provide a statistically meaningful
comparison. For example, the operating system flushes data to the
disk every 30 seconds and performs timer interrupts every 1
millisecond.
[0192] Validation
[0193] FIG. 10 provides Table 1 which includes Validation times for
several exemplary Constructs. The Null Construct has a trivial run(
) method, Audible is the music generation example, and FFT computes
a fast Fourier transform.
[0194] The inventors measured the time required to validate this
small set of exemplary Contructs. Table 1 shows the size of these
examples and the time required to validate them. Each example
includes any library code that may be executed by the Construct,
for example, some methods of EventronReadChannelOfShort in the
Audible example. While these validation times are much longer than
the execution of the Constructs themselves, validation must only be
performed once per instantiation of a Construct.
[0195] Base Performance: Constructs versus C
[0196] To establish the base timeliness results of Constructs, we
implemented a C program which uses the same OS function calls to
establish a periodic task of the same 22.05 KHz frequency. Other
than computation to establish the scheduling behavior, neither
version performs any work. The inventors measured the timeliness of
both versions by measuring the interarrival times of the Construct
and C handler.
[0197] Noting that the y-axis is logarithmic, FIG. 11 shows that
nearly all of the data is centered tightly around the target period
time of 45.3 .mu.s. More precisely, 99.64% of the samples lie
within 5 .mu.s of the target time. Within 10 .mu.s of the target
time, Constructs include 99.997% of the data while C includes
99.999% of the samples.
[0198] Both the C version and the Construct versions have nearly
the same distribution, both in terms of the extent and count of
both the outlier clusters and the main region. This curve
represents the degree to which this particular version of Linux is
real-time and represents the best we can achieve from user-level
code.
[0199] In this graph, the histogram's x-axis ends at 120 .mu.s
although there is one additional group of outliers centered at 4.0
ms. In fact, all runs, regardless of whether it is C or Constructs,
which version of Constructs, and the load under which it is run,
suffer from the same effect. Thus, the inventors believe that the
effect stems from the operating system effect such as sync-ing the
disk. In this particular figure, the C version has 12 outliers
ranging from 3.93 to 4.06 ms while the Construct version has 7
outliers ranging from 3.99 to 4.05 ms. This 4.0 ms effect is
pervasive and stems from the operating system.
[0200] Timeliness in the Presence of GC
[0201] Having established that Constructs can potentially achieve
latency similar to that of a C application, we now test the main
design feature of Constructs by measuring the effect that garbage
collection has on Construct interarrival times. To do this, the
Construct was run together with a lower-priority Java thread that
continually allocates, thus triggering periodic garbage
collections. The Construct's scheduling is potentially affected by
both the mutator thread and the garbage collection thread.
[0202] As seen in FIG. 12, these two effects combine to slightly
worsen the behavior of the Construct. The distribution is not as
tight as before so that at 10 .mu.s, 99.990% of the data is
included. However, nearly all of the distribution lies within 25
.mu.s of the target time. In fact, the tail of the distribution is
better when the mutator thread and the garbage collection threads
are enabled.
[0203] To demonstrate the effectiveness with which Constructs
address the latency problem, the the Construct was replaced with an
equivalent Java version which runs at high priority. FIG. 13 shows
that the interarrival times of a regular Java thread are far worse
than that of a Construct. In fact, the outliers are so distant that
we extended the x-axis from 130 .mu.s to 1 ms. The number of
outliers increases by a factor of 20, and these outliers are
centered around 550 .mu.s which is slightly greater than the time
the garbage collection thread typically runs for. For non-realtime
garbage collectors, the interarrival times would be even worse.
[0204] Performance Under Load: Constructs with CPU-bound
Threads
[0205] After having established that the base performance of
Constructs may be similar to that of a similar C application, the
inventors tested the performance of Constructs when run in
conjunction with regular Java threads in a single Java VM. To
stress the system, the inventors chose four benchmarks from the
SPECjvm98 suite (db, jack, javac, and mpegaudio) which are
CPU-bound batch applications so that the threads are almost always
eligible to run.
[0206] As FIG. 14 shows, the Constructs are slightly less timely
when the system is under load and the deviation is dependent on the
load. With db, the deviation is small, and 99.990% of the data
falls within a 10 .mu.s window. On the other hand, javac causes
greater perturbations, and a 10 .mu.s window includes only 99.986%
of the samples. While these numbers are still very high, the
differences are significant and show that other threads do cause
definite perturbations. Because this version of Linux does not have
the pre-emption patches (which at the time were too unstable),
various system calls can cause the kernel to enter critical
sections in which it will not yield. For example, a lower-priority
thread can request a non-blocking I/O operation which nonetheless
will block an unrelated Construct thread.
[0207] FIG. 15 shows the passage of time as a series of strips or
"scan lines" moving from left to right and then top to bottom. A
Construct (dark gray) was executed every 45 .mu.s. The width of
oscilloscope view is exactly four times the period of the
Construct, thus the Construct executions line up vertically. The
alarm task (part of the GC scheduler) is shown in black and runs
with a period of 200 .mu.s, slightly longer than the width of the
view, resulting in the staggering effect. Finally the GC execution
is shown in light gray. Note that the Construct continues to run at
fixed intervals despite the execution of the other tasks.
[0208] Interaction with Low-Frequency Tasks: Audible Constructs
[0209] The inventors tested the performance of the audible
Construct that was described above. In addition to showing
perturbation from the lower-priority low-frequency task, this
example demonstrates that the wait-free channels perform as
described. Since the audible Construct writes to /dev/dsp, timing
anomalies may arise from operating system interactions. In fact,
despite using non-blocking I/O, the system call write periodically
takes quite a while.
[0210] Because this anomaly is not necessarily central to the
Construct design but deals with lower-level aspects of specific
devices, we first present performance result without the write in
FIG. 16. There are no significant changes to the tail of the
distribution although the central portion of the distribution grows
slightly wider similar to effect of garbage collection in FIG. 12.
The similarity suggests that the operating system does not always
promptly schedule high-priority threads even when there is no
interaction (via mutexes, for example) between the high- and
low-priority threads.
[0211] Further, the inventors showed the pronounced effect of
including the call to write in the audible Construct in FIG. 17.
There is clearly a cluster of data centering around 200 .mu.s.
Examination of the raw data shows that every 93 ms, the call to
write takes up to 220 .mu.s instead of the usual 2 .mu.s. The audio
card performs double buffering and each buffer is 2048 samples in
size. At 22050 Hz, this corresponds to 93 ms so we believe that,
despite opening /dev/dsp with non-blocking I/O (which did improve
matters), certain calls are still affected by buffering operations
within the kernel.
[0212] Advantages over Related Art
[0213] Construct validation may involve constructing a call graph
and analyzing it to detect violations of the programming model. A
wide range of call graph construction algorithms have been
described. Static analyses have also been adapted to operate
on-line and handle Java language features such as reflection,
dynamic class loading, and JNI. However, unlike all of this
previous work, the present invention may include a Construct
validator which may analyze the code in the context of the exact
object instances on which it will operate. This may significantly
simplify analysis since the actual object instance can be
reflectively examined to determine the types and contents of its
fields.
[0214] The Real-Time Specification for Java (RTSJ) is a standard
that defines a number of extensions and changes to the virtual
machine and standard Java libraries to support real-time
application development. The RTSJ defines NoHeapRealtimeThreads
(NHRTs) as the proper high frequency tasks. NHRTs are forbidden
from accessing any object allocated in the garbage-collected heap.
Instead, NHRTs may allocate from either the ImmortalMemoryArea or
one of the ScopedMemoryAreas. Objects allocated in the former will
be preserved for the remainder of the life of the virtual machine;
thus the ImmortalMemoryArea is suitable for only those objects that
are allocated during the initialization phase of an
application.
[0215] Programming effectively and safely with RTSJ scoped memory
is widely recognized as being quite challenging.
[0216] Static type systems have been proposed to make it easier to
write correct programs using RTSJ's scoped memory abstractions. In
these systems, correctly typed programs cannot cause memory access
violations, therefore all of the RTSJ mandated dynamic memory
checks can be eliminated. Such systems statically partition heap
connectivity, but allow real time tasks to dynamically allocate
memory (using RTSJ scoped memory) within their heap partition. In a
Construct programming model according to an exemplary aspect of the
present invention, however, heap connectivity may not be restricted
but dynamic memory allocation by the Construct may be
forbidden.
[0217] Other related art has reported on a detailed performance
comparison of a jRate RTSJ implementation and the RTSJ reference
implementation (RI) from TimeSys. One relevant data point from
these experiments is that it was observed that the RTSJ RI Periodic
Thread implementation was only able to achieve predictable behavior
for tasks with periods greater than or equal to 30 ms. Higher
frequency tasks suffered significant jitter. In contrast, a
Construct Implementation according to an exemplary aspect of the
present invention may be capable of achieving highly predictable
behavior for tasks with periods as small as 45 .mu.s.
Conclusion
[0218] Constructs according to the exemplary aspect of the present
invention may provide a simple and elegant means of writing tasks
requiring extremely low latency while co-existing with a
garbage-collected language. By taking advantage of Java's existing
final mechanism, augmented with some run-time support to eliminate
loopholes that could compromise safety, a simple data-sensitive
analysis may be able to guarantee safety before Constructs begin
executing. Constructs may be simpler to program, simpler to
implement, more reliable, and more efficient than other approaches
like scoped memory.
[0219] Constructs may make it feasible to program almost any
real-time system entirely in pure Java code. Further, Constructs
may be used to directly generate CD-quality audio waveforms and to
write device drivers. The inventors believe that Constructs may
make it possible to replace even the lowest levels of system
software with safe code in a managed run-time system.
[0220] Another exemplary embodiment of the present invention
includes an interrupt handler (e.g., in an operating system) which
may be in a garbage collected language. The interrupt handler
includes an interrupt module for interrupting an operation in the
operating system, and a system for executing a task, including a
first executing module for executing a first task in which garbage
is collected using a garbage collector, and a second executing
module for executing a second task which is associated with the
interrupt module, the second task having a higher frequency than
the first task, and the garbage collector being preemptable by the
second task.
[0221] Referring again to the drawings, FIG. 18 illustrates a
typical hardware configuration which may be used for implementing
the computer system and method according to the exemplary aspects
of the present invention. The configuration has preferably at least
one processor or central processing unit (CPU) 1811. The CPUs 1811
are interconnected via a system bus 1812 to a random access memory
(RAM) 1814, read-only memory (ROM) 1816, input/output (I/O) adapter
1818 (for connecting peripheral devices such as disk units 1821 and
tape drives 1840 to the bus 1812), user interface adapter 1822 (for
connecting a keyboard 1824, mouse 1826, speaker 1828, microphone
1832, and/or other user interface device to the bus 1812), a
communication adapter 1834 for connecting an information handling
system to a data processing network, the Internet, and Intranet, a
personal area network (PAN), etc., and a display adapter 1836 for
connecting the bus 1812 to a display device 1838 and/or printer
1839. Further, an automated reader/scanner 1841 may be included.
Such readers/scanners are commercially available from many
sources.
[0222] In addition to the system described above, a different
aspect of the invention includes a computer-implemented method for
performing the above method. As an example, this method may be
implemented in the particular environment discussed above.
[0223] Such a method may be implemented, for example, by operating
a computer, as embodied by a digital data processing apparatus, to
execute a sequence of machine-readable instructions. These
instructions may reside in various types of signal-bearing
media.
[0224] Thus, this aspect of the present invention is directed to a
programmed product, including signal-bearing media tangibly
embodying a program of machine-readable instructions executable by
a digital data processor to perform the above method.
[0225] Such a method may be implemented, for example, by operating
the CPU 1811 to execute a sequence of machine-readable
instructions. These instructions may reside in various types of
signal bearing media.
[0226] Thus, this aspect of the present invention is directed to a
programmed product, including signal-bearing media tangibly
embodying a program of machine-readable instructions executable by
a digital data processor incorporating the CPU 1811 and hardware
above, to perform the method of the invention.
[0227] This signal-bearing media may include, for example, a RAM
contained within the CPU 1811, as represented by the fast-access
storage for example. Alternatively, the instructions may be
contained in another signal-bearing media, such as a magnetic data
storage diskette 1900 (FIG. 19), directly or indirectly accessible
by the CPU 1811.
[0228] Whether contained in the computer server/CPU 1811, or
elsewhere, the instructions may be stored on a variety of
machine-readable data storage media, such as DASD storage (e.g, a
conventional "hard drive" or a RAID array), magnetic tape,
electronic read-only memory (e.g., ROM, EPROM, or EEPROM), an
optical storage device (e.g., CD-ROM, WORM, DVD, digital optical
tape, etc.), paper "punch" cards, or other suitable signal-bearing
media. In an illustrative embodiment of the invention, the
machine-readable instructions may comprise software object code,
complied from a language such as "C" etc.
[0229] With its unique and novel features, the present invention
provides a method (and system) of executing a task (e.g., a low
latency task) which may be simpler to program and implement, more
reliable and more efficient than other approaches.
[0230] It should be noted that while an exemplary aspect of the
present invention has been described as being implemented using the
Java programming language, the present invention is not necessarily
limited to Java. That is, the present invention may be implemented
using any garbage collected programming language.
[0231] While the invention has been described in terms of one or
more exemplary embodiments, those skilled in the art will recognize
that the invention can be practiced with modification within the
spirit and scope of the appended claims. Specifically, one of
ordinary skill in the art will understand that the drawings herein
are meant to be illustrative, and the design of the inventive
assembly is not limited to that disclosed herein but may be
modified within the spirit and scope of the present invention.
[0232] Further, Applicant's intent is to encompass the equivalents
of all claim elements, and no amendment to any claim the present
application should be construed as a disclaimer of any interest in
or right to an equivalent of any element or feature of the amended
claim.
* * * * *