U.S. patent application number 14/726316 was filed with the patent office on 2015-05-29 and published on 2015-09-17 for data collisions in concurrent programs.
This patent application is currently assigned to Microsoft Technology Licensing, LLC. The applicant listed for this patent is Microsoft Technology Licensing, LLC. Invention is credited to John Erickson, Madan Musuvathi.
Application Number | 14/726316 |
Publication Number | 20150261654 |
Family ID | 45329764 |
Filed Date | 2015-05-29 |
Publication Date | 2015-09-17 |
United States Patent Application | 20150261654 |
Kind Code | A1 |
Erickson; John; et al. | September 17, 2015 |
DATA COLLISIONS IN CONCURRENT PROGRAMS
Abstract
Described are techniques for detecting data collisions between a
first portion and a second portion of an application executing on a
computer, the first portion and the second portion executing
concurrently with respect to each other. While the first portion
and second portion are executing, before the first portion accesses
a memory location shared by the first portion and the second
portion, a value stored in the memory location is captured and the
first portion is delayed. The second portion continues to
execute while the first portion is delayed. After a period of the first
portion having been paused or slowed, the current content of the
memory location is compared with the captured content to determine
if there is a data collision. The first and second portions may be
threads, and the capturing, delaying, and determining may be
performed by code inserted into the application after it has been
compiled.
Inventors: | Erickson; John (Redmond, WA); Musuvathi; Madan (Redmond, WA) |
Applicant: |
Name | City | State | Country | Type |
Microsoft Technology Licensing, LLC | Redmond | WA | US | |
Assignee: | Microsoft Technology Licensing, LLC |
Family ID: | 45329764 |
Appl. No.: | 14/726316 |
Filed: | May 29, 2015 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
12819069 | Jun 18, 2010 | 9069894 |
14726316 | | |
Current U.S. Class: | 717/128 |
Current CPC Class: | G06F 11/3644 20130101; G06F 11/3632 20130101; G06F 11/3636 20130101; G06F 11/366 20130101 |
International Class: | G06F 11/36 20060101 G06F011/36 |
Claims
1. A method of testing for potential data collisions between a
first thread and a second thread of an application executing on a
computer, the first thread and the second thread configured to
execute concurrently with respect to each other, the method
comprising: automatically identifying a variable as a test target
by determining that the variable is accessed by the first thread
and by determining that the same variable is also accessed by the
second thread; based on the variable having been identified as a
test target, automatically inserting delaying code and comparing
code into the application, the delaying and comparing code inserted
within application code of the first thread and at an access
operation within application code of the first thread that accesses
the variable, the delaying and comparing code located according to
the location of the access operation within the application code of
the first thread; while the first thread and second thread are
executing, before the access operation accesses the identified
variable shared by the first thread and the second thread,
capturing, by the comparing code, a value stored in the variable,
and delaying, by the delaying code, the first thread, wherein the
second thread continues to execute while the first thread is
delayed; and after a period of the first thread having been paused
or slowed, comparing, by the comparing code, a current content of
the variable with the captured content, and taking a debugging
action in response to a determination that the current content and
the captured content differ.
2. A method according to claim 1, wherein the debugging action
comprises storing an indication corresponding to the determination
that the current content and the captured content differ.
3. A method according to claim 1, wherein the first thread and the
second thread comprise respective execution threads.
4. A method according to claim 1, wherein the second thread updates
the variable with new content after the capturing such that the
current content differs from the captured content.
5. A method according to claim 1, further comprising setting a
debug register to trigger when the variable is accessed, and
unsetting the debug register after such trigger.
6. A method according to claim 1, wherein the comparing code
executes as part of the first thread without exiting the first
thread.
7. A method according to claim 6, wherein the comparing code
determines if the variable has changed during a period of delay
caused by the delaying code.
8. Storage hardware storing information to enable a computer to
perform a process, the process comprising: automatically inserting
test code into native code of a thread at an access operation of a
program by identifying a location of the access operation in the
thread, the access operation accessing a shared memory location;
and executing the application, the application comprising the
native code and the test code, wherein when the test code executes
the test code: captures a value from the shared memory location,
then induces a delay of the thread of the application, and then
determines whether the captured value is equal to content read from
the shared memory location after the delay.
9. Storage hardware according to claim 8, wherein the access
operation comprises a read operation that reads the shared memory
location.
10. Storage hardware according to claim 9, wherein another thread
of the application is executing concurrently during the induced
delay, and wherein the other thread updates the content of the
shared memory location during the induced delay and the test code
detects the update.
11. Storage hardware according to claim 8, wherein the native code
and the test code comprise binary processor instructions.
12. Storage hardware according to claim 8, wherein the test code
does not cause an exit from the thread.
13. Storage hardware according to claim 12, wherein multiple shared
memory locations are tested for updates that occurred during the
induced delay.
14. Storage hardware according to claim 8, wherein the determining
determines that the captured value and the content are not equal
and in response a message is sent to an interactive debugger.
15. A method of detecting a data collision over data shared by
concurrently executing first and second threads of an application,
the first thread and the second thread both having code that
accesses shared memory units, the method comprising: automatically
locating access operations of the first thread that access the
shared memory units, and in response automatically inserting test
code portions into the first thread at the respective identified
access operations of the shared memory units; executing the first
thread, including executing the test code portions, each test
code portion, when executed in the first thread, inducing a pause
or slow-down of the first thread and also checking whether content
of a corresponding shared memory unit has changed; and when
determined by a test code portion that a corresponding shared
memory unit's content changed, initiating a debugging or testing
action by a corresponding test code portion.
16. A method according to claim 15, wherein the debugging or
testing action comprises storing information identifying the shared
memory unit.
17. A method according to claim 16, wherein a test code portion is
inserted according to an annotation of corresponding source
code.
18. A method according to claim 15, wherein a symbol table is used
to locate the access operations.
19. A method according to claim 15, wherein the pause or slow-down
is induced by at least one of a sleep operation, a prioritization
operation, or by executing another thread.
20. A method according to claim 15, wherein the debugging action
comprises capturing context information that comprises a thread
identifier, a register, a stack snapshot, or contents of a shared
memory including a content captured before a pause or delay and a
content after the pause or delay.
Description
REFERENCE TO RELATED INVENTION
[0001] This is a continuation patent application of copending
application Ser. No. 12/819,069 (allowed), filed Jun. 18, 2010,
entitled "DATA COLLISIONS IN CONCURRENT PROGRAMS". The
aforementioned application is hereby incorporated herein by
reference in its originally filed form.
BACKGROUND
[0002] With increased use of multithreaded applications and
multicore processors, data collisions between concurrent threads
have become an increasingly significant problem for program
developers and testers. Generally, when two threads are constructed
to share memory or data in a systematic way (e.g., using thread
locks or the like), data collisions do not occur. However, when two
threads freely access shared memory, for example a shared object or
variable, either might alter the shared object or variable without
visibility to the other thread, and the other thread may obtain
unexpectedly altered or inconsistent data from the shared object or
variable, which is a data collision. Moreover, the occurrence of
such a collision may not be immediately apparent to the program. In
general, data collisions may cause nondeterministic program
behavior.
[0003] To explain further, a data collision may be thought of as
analogous to a traffic intersection without a traffic light, where
the intersection is some shared memory (e.g., a shared register,
object, variable, memory barrier, etc.) and cars passing through the
intersection are like threads freely accessing and updating the
shared memory. Eventually two cars may collide at the intersection,
just as one thread may update the shared memory during a time when
another thread is accessing the shared memory.
[0004] Because such data collisions often manifest only later when
the corrupt shared memory is inconsistent with program state
expected by the program, data collisions have been difficult to
detect. Approaches have tended to use complex analysis to predict
when a data collision could occur. However, these approaches are
often unreliable and may lead to false positive detection, that is,
"detection" of a data collision when one has not occurred or when a
data collision (an actual corruption of data) cannot occur due
perhaps to some mutually exclusive program semantics. In addition,
it has not been possible to perform data collision detection at the
kernel level.
[0005] Techniques related to data collision detection are discussed
below.
SUMMARY
[0006] The following summary is included only to introduce some
concepts discussed in the Detailed Description below. This summary
is not comprehensive and is not intended to delineate the scope of
the claimed subject matter, which is set forth by the claims
presented at the end.
[0007] Data collisions between a first portion and a second portion
of an application executing on a computer may be detected. The
first portion and the second portion execute concurrently with
respect to each other. While the first portion and second portion
are executing, before the first portion accesses (reads or writes)
a memory location shared by the first portion and the second
portion, a value stored in the memory location is captured and the
first portion is delayed. The second portion continues to
execute while the first portion is delayed. After a period of the first
portion having been paused or slowed, the current content of the
memory location is compared with the captured content to determine
if there is a data collision. The first and second portions may be
threads, and the capturing, delaying, and determining may be
performed by code inserted into the application after it has been
compiled.
[0008] Many of the attendant features will be explained below with
reference to the following detailed description considered in
connection with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The present description will be better understood from the
following detailed description read in light of the accompanying
drawings, wherein like reference numerals are used to designate
like parts in the accompanying description.
[0010] FIG. 1 shows an application executing a first thread and a
second thread.
[0011] FIG. 2 shows an example implementation of a data collision
detection technique.
[0012] FIG. 3 shows an embodiment for augmenting an application
program with code for detecting data collisions.
[0013] FIG. 4 shows execution of the modified program.
[0014] FIG. 5 shows an example modified program.
[0015] FIG. 6 shows an example that uses a global active list.
[0016] FIG. 7 shows a computer including storage and
processor(s).
DETAILED DESCRIPTION
[0017] Embodiments discussed below relate to detecting data
collisions in programs with concurrently executing parts. An
explanation of data collisions will be provided first, followed by
explanation of a general approach involving a thread, before
accessing a shared memory location, capturing the content of a
shared memory location, pausing, and then after pausing checking to
see if the current shared memory differs from what was captured,
thus indicating whether the shared memory location was changed by
some other thread as the shared memory location was about to be
accessed. This will be followed by detailed description of how a
program may be modified and run to detect data collisions. The
makeup and behavior of such a modified program will be explained
further with reference to program code.
[0018] FIG. 1 shows an application 100 executing a first thread 102
and a second thread 104. The application 100 and threads 102, 104 may
be presumed to be executing on a computer, an example of which is
described below with reference to FIG. 7. The threads 102, 104 are
shown as operation stacks, including various known operations 106
executing in order from the top of FIG. 1. The threads 102, 104
both have read and write access to a portion of shared memory 108
of the computer, which, programmatically, may store objects,
variables, registers, or the like.
[0019] The threads 102, 104 may have a potential data collision. A
shared memory location 110 may be used by both first thread 102 and
second thread 104. For example, first thread 102 might have a read
operation 112 to access memory location 110, for example in the
form of "somevariable=foo" (assign the content of foo to the
content of somevariable). Memory reads can take many forms and
occur in numerous ways known to those skilled in the art. A data
collision may occur when the following events occur. The first
thread 102 has some operations 106 near the read operation 112. As
these execute and as the read operation 112 executes, the content
stored in shared memory location 110 may be updated by the second
thread 104, which, during or immediately before the read operation
112 may issue a write operation 114 to write new content ("A2") to
shared memory location 110. First thread 102 may be executing under
the semantic presumption that shared memory location 110 had a
value (e.g., "A1") consistent with data state of first thread 102.
Thus, when first thread 102 issues read operation 112 and obtains
the content of shared memory location 110, it unknowingly obtains
some extraneous content such as "A2", which may be inconsistent
with the data state of first thread 102. First thread 102 may be
able to continue executing without immediate failure, yet may
eventually crash or store erroneous output. For example, if the
shared memory location 110 stores a bank transaction amount, an
associated account may be over credited or over debited.
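The read-write collision described above can be reproduced in a few
lines. The following Python sketch is illustrative only: the
dictionary `shared`, the key "foo", and the function names are
assumptions introduced here, and an event is used to force the
unlucky interleaving that would otherwise occur only occasionally.

```python
import threading


def run_read_write_collision():
    """Force the interleaving in which the second thread writes "A2"
    just before the first thread reads the shared location."""
    shared = {"foo": "A1"}             # stands in for shared memory location 110
    write_done = threading.Event()
    result = {}

    def second_thread():
        shared["foo"] = "A2"           # write operation 114
        write_done.set()

    def first_thread():
        expected = "A1"                # value the first thread presumes
        write_done.wait()              # make the write land before the read
        somevariable = shared["foo"]   # read operation 112
        result["value"] = somevariable
        result["consistent"] = (somevariable == expected)

    t1 = threading.Thread(target=first_thread)
    t2 = threading.Thread(target=second_thread)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return result
```

The first thread completes without any error even though the value
it read no longer matches the state it presumed, which is why such
collisions tend to surface only later.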
[0020] Though FIG. 1 shows a read-write data collision, write-write
data collisions are also possible. In such a case, if it is assumed
that the first thread 102 issues a write operation instead of a
read operation 112, corruption or the like may occur because the
second thread 104 may overwrite the content stored into the shared
memory location 110 by the first thread 102. For example, first
thread 102 might write "A1", and second thread 104 might
immediately overwrite it by storing "A2" in the shared memory
location 110. Generally, two threads concurrently reading from
shared memory do not give rise to a data collision, but two
threads concurrently reading while another thread concurrently
writes could. As used herein, "access" will refer to either read or
write access, and concurrent accesses of two threads will refer to
either read-write access or write-write access.
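The definition of concurrent access given above can be stated as a
small predicate. This is a sketch in Python; the function name
`may_collide` and the string labels "read" and "write" are
assumptions made for illustration.

```python
def may_collide(first_access, second_access):
    """Return True when two concurrent accesses to the same location
    can produce a data collision (at least one of them is a write)."""
    valid = {"read", "write"}
    if first_access not in valid or second_access not in valid:
        raise ValueError("access must be 'read' or 'write'")
    return "write" in (first_access, second_access)
```

Read-read pairs are excluded, so only read-write and write-write
pairs would be candidates for instrumentation.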
[0021] FIG. 2 shows an example implementation of a data collision
detection technique. In general, the technique involves
artificially delaying a thread (slowing, pausing, sleeping,
suspending, lowering execution priority, etc.) when the thread
comes to a shared memory access operation, and monitoring the value
stored in that memory location to determine whether content of the
shared memory location is externally changed (i.e., not changed by
the thread being monitored) while the thread is delayed. The term
"artificial" refers to an action that is not part of the program
logic of the application program running the thread. In other
words, given an optimized or "clean" compilation of the application
program's source code, with no instrumentation or other extraneous
interference, the same thread would execute without the artificial
delay. The delay is "artificial" in that the normal operation of
the application program and thread is altered such that it runs
slower or with delay that would not occur if the application
program were simply compiled and executed. During normal operation
of the application program, the chance of two threads that are
capable of colliding actually doing so may be small. By expanding
the collision window, the chance of an actual collision occurring
is increased. Moreover, by testing the shared data while a thread
is thus delayed, an actual collision can be detected while it is in
progress.
[0022] Returning to FIG. 2, where the following numerals correspond
to steps in FIG. 2, in one embodiment, a thread may be artificially
delayed by augmenting the program so that when it executes, code to
test for data collisions is executed as part of the program in the
memory space of the program. To do so, the program may be analyzed
to identify potential data collisions, as shown in block 140. For
example, accesses to data (a memory location) identified as shared
may be identified. Then, instrumentation or test code corresponding
to the identified accesses may be inserted into the program as
shown in block 142. As shown in the next block 144, at runtime
(when the program is running), a first thread may begin executing.
At block 145, the first thread comes to a previously identified
data or memory access, and at block 148 the content of the shared
data is copied to a temporary register or variable and the first
thread is slowed/paused (artificially delayed). At block 150, as
(or just before) the thread resumes normal execution (i.e.,
executing native code, not test code) to perform the data access
identified at block 140, the content of the data is compared
against the temporary copy, and at block 152 if the current content
and copied content do not match, a data collision is identified,
and some debugging action is taken, such as entering an interactive
debug mode, displaying an alert, updating a database, logging
information about the collision such as the line number in the
program or an object or symbol name corresponding to the shared
data, or other known actions for testing or debugging software.
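The runtime steps of FIG. 2 (capture, delay, compare, act) can be
sketched as follows. This is a minimal Python illustration under
stated assumptions, not the instrumented binary code the
description refers to: `checked_read`, `demo`, the dictionary
`shared`, and the delay durations are all hypothetical names and
values chosen for the example.

```python
import threading
import time


def checked_read(shared, key, delay, on_collision):
    """Read shared[key] with capture, artificial delay, and comparison."""
    captured = shared[key]                    # block 148: copy the content
    delay()                                   # block 148: artificial delay
    current = shared[key]                     # block 150: content after delay
    if current != captured:                   # block 152: collision detected
        on_collision(key, captured, current)  # debugging action
    return shared[key]                        # the native access itself


def demo():
    """Run a reader whose delay window is long enough for a writer to hit."""
    shared = {"foo": "A1"}
    log = []

    def second_thread():
        time.sleep(0.05)          # assumed timing: write lands inside the window
        shared["foo"] = "A2"

    writer = threading.Thread(target=second_thread)
    writer.start()
    value = checked_read(shared, "foo",
                         delay=lambda: time.sleep(0.3),
                         on_collision=lambda *info: log.append(info))
    writer.join()
    return value, log
```

With these timings `demo()` would normally return the new value
together with one logged collision, although the outcome depends on
thread scheduling. Passing the delay as a callable keeps the sketch
open to the other delay mechanisms mentioned above, such as
yielding to another thread.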
[0023] Note that in block 148 the copying of the content can occur
before or after the thread is delayed. Other operational orders can
vary. For example, normal execution can resume before comparing the
content of the shared memory with the copied content. Generally,
the order of operations is not significant, as long as there is
some delay before the current content is compared to the captured
content.
[0024] FIG. 3 shows an embodiment for augmenting an application
program with code for detecting data collisions. A binary
instrumentation tool 170 performs the actions to be described. It
may be assumed that a compiler 171 or the like has produced or
compiled a program 172 (e.g., "prog.exe") from source code 173,
where the program 172 is in the form of binary code, machine
instructions, bytecode, platform-independent intermediate-code, or
the like. This program 172 may be referred to as the native or
unaugmented program. In sum, program 172 may be any ordinary
program compiled or built using any number of widely known
programming tools. The binary instrumentation tool 170 may in
practice be a combination of executable tools or modules, such as a
disassembler, a code parser, symbol table reader and analyzer, etc.
A suite such as Vulcan, published by Microsoft Corporation, may
serve as a base for the binary instrumentation tool 170; a
Vulcan-like suite may be modified to implement various embodiments
and techniques described herein.
[0025] Accordingly, the instrumentation tool 170 is run and is
passed: the program 172, and possibly also program symbols 174
(e.g., a symbol table), any library 176 used by the program 172,
instrumentation symbols 178, or any other information (perhaps
output by compiler 171) that may be useful in teasing apart program
172 and inserting test code therein. For example, the binary
instrumentation tool 170 may perform a process 180 such as
receiving the program 172. The program 172 is disassembled using
well known tools. The program 172 may thus be reproduced in the
form of assembly code or intermediate code which is amenable to
analysis and manipulation (for discussion, assembly language will
be used as an example). The assembly language program is then
analyzed to find suitable memory access operations (described in
detail below), and the instrumentation tool 170 then augments the
assembly code by inserting test assembly code to perturb or delay
execution near the data accesses and to test for data changes (such
testing is also described in further detail below). Note that a
thread need not be stalled a specific amount of time; delay may be
introduced by other techniques such as swapping to a different
thread, reducing execution priority of the thread or an
encompassing process, etc. The augmented assembly code is then
assembled into an executable modified program 182, which is shown
in FIG. 3 as "modified-program.exe". Similarly, library 176 may be
augmented and assembled as a modified library 184.
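The analyze-and-insert step of process 180 can be illustrated on a
toy instruction list. The Python sketch below is not the interface
of Vulcan or of any real instrumentation tool; the tuple
representation and the operation names "load", "store", "capture",
"delay", and "compare" are assumptions made for the example.

```python
SHARED_ACCESS_OPS = ("load", "store")


def instrument(instructions, shared_symbols):
    """Return a copy of `instructions` with test operations inserted
    before every load or store that touches a shared symbol."""
    modified = []
    for op, operand in instructions:
        if op in SHARED_ACCESS_OPS and operand in shared_symbols:
            modified.append(("capture", operand))   # copy current content
            modified.append(("delay", operand))     # perturb the thread
            modified.append(("compare", operand))   # check for a change
        modified.append((op, operand))              # original operation
    return modified
```

For instance, instrumenting a program that loads the shared symbol
"foo" would place the three test operations immediately before that
load and leave accesses to unshared operands untouched.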
[0026] The symbol table or symbols 178 may be helpful for
embodiments that allow a tester to specify various options. For
instance, one embodiment may allow a tester to specify which types of
memory operations to examine, which data types to consider, whether
only data types or symbols updated since a prior compilation are to
be considered, and so on. Regardless of any such options, it is
possible to produce modified program 182 without requiring a user
to manually annotate source code 173 or otherwise interact with the
semantic content of source code 173 or program 182; a modified
version able to self-detect and self-report actual data collisions
may be automatically produced.
[0027] FIG. 4 shows execution of the modified program 182. A test
tool 210 may be used to coordinate execution of the modified
program 182, although the test tool 210 is not required. The test
tool may also take into account the modified library 184, the
source code 173, annotations 186 of known data collisions, and so
on (note that any reference to an item herein as being optional
should not be understood to imply that other items described herein
are required). As the modified program 182 executes it may generate
data collision information 212, such as information about a data
collision to be stored in a log file 214 or a message to an
interactive debugger 216, etc. Any thread context information may
be captured, such as a thread ID, registers, a stack snapshot,
before/after values, etc.
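One possible shape for the data collision information 212 is a
record that is serialized to one line of a log file. The Python
sketch below is an assumption about how such a record could look;
the class `CollisionInfo`, its field names, and the JSON format are
not taken from the description.

```python
import json
import threading
import traceback
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class CollisionInfo:
    location: str          # symbol or name of the shared data
    before: str            # content captured before the delay
    after: str             # content found after the delay
    thread_id: int         # identifier of the detecting thread
    stack: List[str]       # stack snapshot at the point of detection


def make_collision_info(location, before, after):
    """Capture thread context at the point where a collision is detected."""
    return CollisionInfo(
        location=location,
        before=repr(before),
        after=repr(after),
        thread_id=threading.get_ident(),
        stack=traceback.format_stack(),
    )


def to_log_line(info):
    """Serialize one record as a single JSON line for a log file."""
    return json.dumps(asdict(info))
```

Storing the before and after values as `repr` strings keeps the
record serializable regardless of the type held in the shared
location.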
[0028] FIG. 5 shows an example modified program 182. A thread 230
or some portion of modified program 182 capable of concurrency may
have included therein both native or non-test operations, as well
as instrumentation or test operations 234 for delaying the thread
230. Such operations may include an observation or capture
operation 236 that obtains a copy of the content of the shared data
to be inspected. A delay operation 238 may perturb the thread 230
by causing it to slow, pause, sleep, loop, etc. (note that a delay
can be designed so that, for a given platform or environment, the
delay will not result in an execution that could not occur in the
unaugmented version of the program). A native access operation 240
may access the shared data. A test or comparison operation 242 may
compare the previously captured content of the shared data with the
current content of the shared data (e.g., the content of "_testvar"
is compared to the content "foo"). An action operation 244 may be
executed conditionally if the shared data is found to have changed.
Note that the order of the various test operations 234 may vary.
For example, in one embodiment, the delay operation 238 may occur
before the capture operation 236. In another embodiment, the
comparison operation 242 occurs before the native access operation
240.
[0029] An embodiment may also be implemented by using debug
registers to catch a collision as it is occurring. An operation
after operation 240 may set an available debug register to trigger
upon access of "foo", and an operation near operation 246 can unset
the debug register.
[0030] In yet another embodiment, a resume operation 246 may be
performed to cause the thread 230 to execute at normal speed (for
example, raising the execution priority if the thread 230 has been
slowed by lowering its execution priority). In general, the test
operations may be understood to operate near or in proximity to the
native access operation 240, thereby expanding the time (and
therefore probability) during which a data collision may be
detected. One of ordinary skill in the art of programming will
appreciate that the precise timing and order of the test operations
can vary while still achieving the desired detection effect.
Moreover, test operations, such as delay and detection, can occur
with varying proximity to the native access operation 240 if other
native operations of the thread 230 are unlikely to interfere with
the delay and detection. Moreover, a window of delay in combination
with data checking can occur anywhere in the thread 230, regardless
of proximity to a native data access 240. However, where delay and
detection are associated with a particular identified access
operation to a shared memory location, portions of test code 234
may be placed in the modified program 182 in locations based on and
relative to the location of the particular access operation.
[0031] Test code 234 fragments inserted into the modified program
182 may in practice include references to a test library 248, which
may have functions to detect and act on collisions. The test
library 248 may be linked to the modified program 182. When a
capture operation 236, for example, is called, a corresponding part
of test library 248 may be executed to perform multiple operations
such as creating a register, reading from a register, storing to a
register, etc.
[0032] FIG. 6 shows an example that uses a global active list 270.
Code for a thread 272 may be modified to cause the thread to add a
variable or object reference to the global active list 270. Another
thread 274 may have similar watch list code. The watch list code
may include a test 276 to see if the data location under inspection
(e.g., "foo" at the read operation "myvar=foo") is in active use in
another thread, that is, there is a check to see if "foo" is
already in the global active list 270. If not, then a write
operation 278 creates an entry 277 for "foo" in the global active
list. If "foo" is in the global active list 270, then a data
collision is deemed to have occurred (two threads have indicated,
via the global active list 270, that they are both using the shared
data "foo"). If a collision has occurred, then some appropriate
test or debug action 280 may be executed. When the thread 272 is
done accessing the shared data "foo", it performs a remove
operation 282 to remove "foo" from the global active list. The
global active list may preferably be used via calls to functions in
test library 248.
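The global active list of FIG. 6 can be sketched as a lock-protected
set. The Python class below is an illustration under assumptions:
the names `GlobalActiveList` and `watching`, the use of a context
manager, and the choice to record a collision without adding a
second entry are all introduced here and are not specified by the
description.

```python
import threading
from contextlib import contextmanager


class GlobalActiveList:
    """Names of shared data currently inside some thread's watch window."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()
        self.collisions = []

    @contextmanager
    def watching(self, name):
        with self._lock:
            already_active = name in self._active   # test 276
            if already_active:
                # debug action 280: another user has this data in active use
                self.collisions.append((name, threading.get_ident()))
            else:
                self._active.add(name)              # write operation 278
        try:
            yield not already_active
        finally:
            if not already_active:
                with self._lock:
                    self._active.discard(name)      # remove operation 282

    def active(self):
        """Return a snapshot of the names currently in the list."""
        with self._lock:
            return set(self._active)
```

A thread would wrap its access in `with active_list.watching("foo"):`
and could place an artificial delay inside the block to widen the
window during which another thread's entry attempt is noticed.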
[0033] In practice, each thread will most likely have only one
active entry in the global active list at any given time. In this
embodiment, the window of delay may begin when the entry 277 is
added to the global active list 270 and end when the entry 277 is
removed. An artificial delay may be initiated after the entry 277
has been added.
[0034] Regarding the duration that a thread may be delayed, an
order of magnitude of milliseconds may be sufficient for some
applications, although the amount of time may be varied as needed.
A tester may use trial and error or experience to find suitable
delay times under varying conditions. In one embodiment, a single
delay may be used to find multiple data collisions by performing
the delay for an extended window that covers multiple accesses to
shared data. When the delay finishes, each of the accessed
locations may be checked for updates.
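The single-delay variant can be sketched by capturing several
locations, delaying once, and then checking each of them. The
function name `check_many` and the dictionary standing in for
shared memory are assumptions made for this Python illustration.

```python
def check_many(shared, keys, delay):
    """Capture several shared locations, delay once, and return the
    keys whose content changed during that single delay window."""
    captured = {key: shared[key] for key in keys}
    delay()                                   # one extended delay window
    return [key for key in keys if shared[key] != captured[key]]
```

A call such as `check_many(shared, ["a", "b", "c"], delay)` returns
the changed keys in the order they were listed, so one delay can
report several collisions.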
[0035] Both static and dynamic approaches may be used to select
which data accesses are to be identified and instrumented for data
collision detection. A test user may provide the instrumentation
tool 170 with static information about what data is to be
monitored. For example, the user may indicate specific data types
to be included and/or ignored, whether read-write collisions,
write-write collisions, or both, are to be considered, and so on.
The user may provide information about delay lengths, such as
whether to use random delays, whether to consider only recently
updated data types, and so on. The instrumentation tool 170 may
implement heuristics to dynamically select which data accesses
should be instrumented and how to do so. For example, frequency
information about execution paths in the program may be used to
determine where instrumentation should be placed and how long
delays should be; infrequently executed code paths may benefit
from larger windows of delay to compensate for the lower
probability (due to less frequent execution) of a collision being
detected.
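One way such a frequency heuristic might be expressed is to scale
the delay inversely with how often a path executes. The formula,
the function name `delay_for_path`, and the default base and cap
values in this Python sketch are assumptions chosen for
illustration; the description does not prescribe a particular
rule.

```python
def delay_for_path(executions, max_executions, base_ms=1.0, cap_ms=100.0):
    """Scale the delay inversely with how often a code path runs.

    `executions` is the observed count for this path and
    `max_executions` the count for the most frequently executed path.
    """
    if executions <= 0 or max_executions <= 0:
        raise ValueError("execution counts must be positive")
    scaled = base_ms * (max_executions / executions)
    return min(scaled, cap_ms)                # keep rare paths from stalling
```

Under these assumed defaults, the hottest path receives the base
delay, and a path executed one tenth as often receives ten times
that delay, up to the cap.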
[0036] In another embodiment, a modified compiler may insert the
test code into the application program during compile time, for
example, in response to a compile flag setting.
[0037] With use of the read-delay-compare method described above,
it may be possible to ignore mappings from virtual address space to
physical address space. That is, the technique may be used without
regard for the address space of the memory location. Moreover it
may not be necessary to find a complete set of locking mechanisms.
Thus, instrumented code may even be executed within the kernel.
Furthermore, it is possible to detect collisions with
uninstrumented code, by using another process, by direct memory
access (DMA) mechanisms, etc. That is to say, external code may
perform the same functionality for non-instrumented code.
[0038] FIG. 7 shows a computer 300 including storage 302 and
processor(s) 304. The storage 302 may include random-access memory
(RAM) such as dynamic RAM, non-volatile storage, recordable storage
media, and/or other current or future storage devices. The storage
302 and processor(s) 304 may cooperate to perform various of the
techniques described above, as configured by known and readily
available programming tools.
CONCLUSION
[0039] Embodiments and features discussed above can be realized in
the form of information stored in volatile or non-volatile computer
or device readable media. This is deemed to include at least media
such as optical storage (e.g., CD-ROM), magnetic media, flash ROM,
or any current or future means of storing digital information. The
stored information can be in the form of machine executable
instructions (e.g., compiled executable binary code), source code,
bytecode, or any other information that can be used to enable or
configure computing devices to perform the various embodiments
discussed above. This is also deemed to include at least volatile
memory such as RAM and/or virtual memory storing information such
as CPU instructions during execution of a program carrying out an
embodiment, as well as non-volatile media storing information that
allows a program or executable to be loaded and executed. The
embodiments and features can be performed on any type of computing
device, including portable devices, workstations, servers, mobile
wireless devices, and so on.
* * * * *