U.S. patent application number 12/242505 was filed with the patent office on 2008-09-30 and published on 2009-12-10 as publication number 20090307669 for memory management for closures.
The invention is credited to Gerald Blaine Garst, JR. and Benjamin C. Trumbull.
Application Number | 12/242505 |
Publication Number | 20090307669 |
Family ID | 41401352 |
Filed Date | 2008-09-30 |
Publication Date | 2009-12-10 |
United States Patent Application | 20090307669 |
Kind Code | A1 |
Inventors | Garst, JR.; Gerald Blaine; et al. |
Published | December 10, 2009 |
MEMORY MANAGEMENT FOR CLOSURES
Abstract
Methods, software media, compilers and programming techniques
are described for binding data to a function using thunk synthesis.
In one exemplary method, a computing system executes a program
having a function with a first set of arguments. In response to the
function being called, a function pointer of the function is
synthesized to recover an extra argument for the function in
addition to the first set of arguments.
Inventors: | Garst, JR.; Gerald Blaine; (Los Altos, CA); Trumbull; Benjamin C.; (San Jose, CA) |
Correspondence Address: | APPLE INC./BSTZ; BLAKELY SOKOLOFF TAYLOR & ZAFMAN LLP, 1279 OAKMEAD PARKWAY, SUNNYVALE, CA 94085-4040, US |
Family ID: |
41401352 |
Appl. No.: |
12/242505 |
Filed: |
September 30, 2008 |
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number |
61059724 | Jun 6, 2008 | |
Current U.S. Class: | 717/130; 711/154; 711/E12.002 |
Current CPC Class: | G06F 12/023 20130101; G06F 11/362 20130101; G06F 9/4484 20180201; G06F 8/443 20130101 |
Class at Publication: | 717/130; 711/154; 711/E12.002 |
International Class: | G06F 9/44 20060101 G06F009/44; G06F 12/02 20060101 G06F012/02 |
Claims
1. A computer-implemented method comprising: executing a program
having a function with a first set of arguments in a computing
system; and in response to the function being called, synthesizing
a function pointer of the function to recover an extra argument for
the function in addition to the first set of arguments.
2. The method of claim 1, wherein synthesizing the function pointer
comprises: determining if a first and a second set of memory have
been pre-allocated for synthesizing the function pointer; and
causing an operating system (O/S) of the computing system to
allocate the first and the second sets of memory if the first and
the second sets of memory have not been pre-allocated yet.
3. The method of claim 2, wherein synthesizing the function pointer
further comprises: accessing a first location in the first set of
memory associated with the function; using a first pointer at the
first location to find a second location in the second set of
memory; and using a second pointer at the second location to load
an extra argument.
4. The method of claim 3, wherein synthesizing the function pointer
further comprises: pushing the extra argument onto a stack.
5. The method of claim 3, wherein synthesizing the function pointer
further comprises: jumping to a third location referenced by a
third pointer in the second set of memory to access executable code
of the function.
6. A computer-implemented method comprising: allocating a first set
and a second set of memory in a computing system; writing a
plurality of references into the first set of memory to reference
corresponding locations in the second set of memory; marking the
first set of memory to be non-writable and executable after writing
the plurality of references into the first set of memory; and
marking the second set of memory to be writable, wherein the second
set of memory is usable to store at least one of a reference to
data and a reference to executable code during execution of a
program.
7. The method of claim 6, further comprising: associating a
function in the program with a pair of a first location in the
first set of memory and a second location in the second set of
memory; retrieving a first pointer from the first location in
response to the program calling the function; using the first
pointer to access the second location in the second set of memory
to retrieve a second pointer; and loading an extra argument to the
function using the second pointer.
8. The method of claim 7, further comprising: disassociating the
pair from the function after the function has been completed.
9. The method of claim 8, wherein disassociating the pair from the
function comprises: using a garbage collector in the computing
system to free up the pair.
10. An apparatus comprising: a first set of memory; a second set of
memory; an operating system to write references to locations in the
second set of memory into the first set of memory, and then to
change the first set of memory from writable to non-writable and
executable; and a compiler to generate a thunk in response to a
call of a function, wherein the thunk is executable to load an
extra argument for the function using data in the first set of
memory and the second set of memory.
11. The apparatus of claim 10, further comprising: a stack on which
the thunk pushes the extra argument.
12. The apparatus of claim 10, further comprising: a library having
a coordinator function to associate the function with a first
location in the first set of memory.
13. The apparatus of claim 12, wherein the coordinator function is
further operable to disassociate the function from the first
location in the first set of memory.
14. A machine readable storage medium storing executable program
instructions which when executed by a data processing system cause
the data processing system to perform a method comprising:
executing a program having a function with a first set of arguments
in a computing system; and in response to the function being
called, synthesizing a function pointer of the function to recover
an extra argument for the function in addition to the first set of
arguments.
15. The machine readable storage medium of claim 14, wherein
synthesizing the function pointer comprises: determining if a
first and a second set of memory have been pre-allocated for
synthesizing the function pointer; and causing an operating system
(O/S) of the computing system to allocate the first and the second
sets of memory if the first and the second sets of memory have not
been pre-allocated yet.
16. The machine readable storage medium of claim 15, wherein
synthesizing the function pointer further comprises: accessing a
first location in the first set of memory associated with the
function; using a first pointer at the first location to find a
second location in the second set of memory; and using a second
pointer at the second location to load an extra argument.
17. The machine readable storage medium of claim 16, wherein
synthesizing the function pointer further comprises: pushing the
extra argument onto a stack.
18. The machine readable storage medium of claim 16, wherein
synthesizing the function pointer further comprises: jumping to a
third location referenced by a third pointer in the second set of
memory to access executable code of the function.
19. A machine readable storage medium storing executable program
instructions which when executed by a data processing system cause
the data processing system to perform a method comprising:
allocating a first set and a second set of memory in a computing
system; writing a plurality of references into the first set of
memory to reference corresponding locations in the second set of
memory; marking the first set of memory to be non-writable and
executable after writing the plurality of references into the first
set of memory; and marking the second set of memory to be writable,
wherein the second set of memory is usable to store at least one of
a reference to data and a reference to executable code during
execution of a program.
20. The machine readable storage medium of claim 19, further
comprising: associating a function in the program with a pair of a
first location in the first set of memory and a second location in
the second set of memory; retrieving a first pointer from the first
location in response to the program calling the function; using the
first pointer to access the second location in the second set of
memory to retrieve a second pointer; and loading an extra argument
to the function using the second pointer.
21. The machine readable storage medium of claim 20, further
comprising: disassociating the pair from the function after the
function has been completed.
22. The machine readable storage medium of claim 21, wherein
disassociating the pair from the function comprises: using a
garbage collector in the computing system to free up the pair.
Description
RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional
Patent Application No. 61/059,724, filed on Jun. 6, 2008, which is
herein incorporated by reference.
BACKGROUND
[0002] This disclosure relates to memory management and memory
allocation of data structures and functions.
[0003] The run-time organization of memory for a computer program
often divides a system's memory into regions to store data used by
the program. For example, a portion of the memory stores the
executable software or code and another portion stores the
variables, arguments, data, etc. used by the executable software or
code. Often, local or automatic variables are stored in a stack
memory structure and global variables are stored at fixed locations
in a so-called global memory, and a heap memory structure can be
used to store variables and other data. In programs written in C,
C++, Objective-C, or other C-like procedural languages, including
Java, a run-time stack holds the local variables for the currently
executing function or functions; each execution of a function may
be referred to as an activation. The run-time stack holds the local
variables for the currently executing activation as well as those
of the activation or function that called the currently executing
function. The currently executing function F1 has its data (e.g.,
local variables within the scope of function F1) at the top of the
stack and just below the top of the stack is the data for the
function F2 which called F1, and so on. Further information about
stack usage can be found at pages 230-240 in the book Mac OS X
Internals--A Systems Approach by Amit Singh (Addison-Wesley, 2007,
Pearson Education, Inc.); these pages are incorporated herein by
reference. A function's stack frame is "lost" when the function
completes/exits and returns control to its caller. Hence, the local
variables in the scope of the function are not retained valid in
the stack after the function returns control to the caller of the
function. A programmer can decide to avoid use of the stack by
defining variables as global variables or by using a call, such as
malloc, to allocate space for data in the heap memory structure; in
this case, the stack is avoided (but can still be used for
functions which use local variables that do not need to persist
outside of their respective scope).
[0004] Programmers often desire to use a function or data structure
known as a closure. A closure is a function that is evaluated in an
environment containing zero or more bound variables. When called,
the function can access these variables. In some languages, a
closure may occur when a function is defined within another
function, and the inner function refers to local variables of the
outer function. At run-time, when the outer function executes, a
closure is formed and consists of the inner function's code and
references to any variables of the outer function required by the
closure. Memory allocation in the prior art causes the closure and
all bound local variables to be initially stored in the heap memory
structure (and in this case the closure is always on the heap),
although certain compilers attempt to determine whether a closure
will never need to be stored on the heap, in which case it is
allocated, at run-time, on the stack (so in this case the closure
is always on the stack). In other cases, a compiler can create
run-time code which will automatically migrate a closure from an
initial position in the stack to the heap in response to an escape
from the closure's lexical scope. In the prior art, recovery of the
heap-based storage is done through a garbage collector, which is
uncommon for C or C-based languages.
[0005] Furthermore, there is a common need and use of compiler
generated thunks to recover or initialize extra data before calling
specific functions. This allows the functions to appear to take
fewer arguments than are necessary yet be supplied with the correct
and complete amount of data. In GNU C Compiler (GCC), an inner
function is written (e.g., a pre-compare function with reference to
existing stack address) to recover the extra data. The inner
function is passed in as a thunk. The problem with this approach is
that the processor has to execute instructions on the stack to make
this possible. This creates a grave security risk because malicious
programs can exploit it to take over the process. Currently, some
processors disallow executable code on the stack, so the above GCC
technique does not work with these processors.
[0006] Another conventional technique is to allocate a page and
make it writable. Code is written on this page, including
references to the stack (which contains private data). Then the
operating system is asked to mark this page not writable, but only
executable. Sometimes, the execution cache of the processor may
need to be flushed to make this approach work. This approach is
expensive and has poor performance because a page has to be
specifically allocated.
SUMMARY OF THE DESCRIPTION
[0007] In one embodiment of the invention, a method for executing
software written in a language, which uses a stack memory structure
to store local or automatic variables, includes writing a data
structure of a block or closure to the stack memory structure and
then executing a block copy process, caused or invoked by a
programmer in the creation of the software, to copy the block to a
heap memory structure which is configured to store global
variables. The block includes a function pointer that references a
function which uses data in the block. In at least certain
implementations of this method, the block is always initialized to
be stored in the stack and a programmer is required to explicitly
copy the closure to the heap; this explicit copy may be caused by
writing a "block copy" call in the software, and this call at run
time will invoke a block copy process or subroutine. Variables in
the lexical scope of the closure or block are copied to the heap
such that they still work after the lexical scope of the closure's
creation is destroyed by returning from the function which created
the closure. As a further optimization, certain variables are
imported or appended to the closure in the heap as constants. The
use of the heap's space may be managed by recovering space through
explicit program instructions (e.g. a "block-dispose" call or
instruction written by the programmer to match the "block-copy"
call or instruction for a block) or through garbage collection
techniques or through a combination of both.
[0008] Another aspect of this description relates to debugging when
blocks or closures are present in the software being debugged.
Source-level debugging is a process of examining a program and
providing data, where possible and on command, about where the
instruction counter(s) are with respect to the original source as
well as, potentially, the current data values of variables within
that source program. A block, in one embodiment, includes a data
block with a specialized function pointer that references a
function that knows and uses the layout of that data block for
computation according to the syntax of the block. A block can
appear to a debugger as an opaque data structure that can be
invoked like a function. The debugger may not be able to display
the data in the opaque data structure, and it is desirable to
provide a way for the debugger to display this data. In at least
certain embodiments, one way for the debugger to display this data
is to associate the specialized layout of the block with the
specialized function that is referenced by the block. The data
layout definitions are keyed to the specialized function referenced
by the block. The debugger finds the specialized function within
the otherwise opaque data structure of the block and can then use
conventional debugger lookup functions to find and display the
specialized layout information for that block.
[0009] Another aspect of this description relates to defenses
against viruses and other malicious code. In order to prevent
viruses and other malicious code from harming a computation it is
necessary to not allow writeable data to be used as executable
instructions. There is a common need and use of compiler generated
thunks to recover or initialize extra data before calling specific
functions. A thunk as used herein broadly refers to a piece of code
to perform some delayed computation. This allows the functions to
appear to take fewer arguments than are necessary yet be supplied
with the correct and complete amount of data. The prior art often
requires that each such thunk be allocated on its own page of
memory such that it can start with writeable permission, be written
upon with instructions to recover specific data, and then have the
page marked as not writeable but executable; depending on the
architecture of the processor, the processor's instruction cache
must also be flushed. These operations in the prior art are
computationally expensive. Since each thunk requires a page of
memory, these thunks need to be tracked and deallocated in many
implementations. Uses of such thunks include "inner functions" in
GCC and certain implementations of closures. At least certain
embodiments of the invention provide an efficient way to use thunks
by allocating paired chunks of memory and on one chunk of the pair
write a series of small thunks that dereference a data area
counterpart on the other chunk in the pair. Once written, the
memory is protected as execute only and each thunk-data pair in the
series is separately allocated and restored. Each thunk-data pair
is provided as part of a thunk allocation which also takes the
extra data that needs to be recovered. The extra data is stored in
the appropriate location in the writeable chunk of memory. The
instructions that are written vary according to the needs of the
compiler or client and can be as simple as pushing the extra data
as an extra stack argument.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The present invention is illustrated by way of example and
not limitation in the figures of the accompanying drawings in which
like references indicate similar elements.
[0011] FIG. 1 is a flow chart of an example of a method performed
by a programmer according to one embodiment of the invention.
[0012] FIG. 2 is a flow chart of an example of a method performed
by a compiler according to one embodiment of the invention.
[0013] FIG. 3 is a flow chart showing an example of the run-time
execution of software containing blocks according to one embodiment
of the invention.
[0014] FIGS. 4A, 4B, and 4C show representations of stack and heap
memory structures at run-time at different times during run-time
according to certain embodiments of the invention.
[0015] FIG. 5 shows a representation of a block according to an
embodiment of the invention which supports debugging.
[0016] FIG. 6 shows an example of a method for debugging software
which includes blocks according to one embodiment of the
invention.
[0017] FIG. 7A illustrates one embodiment of a method to set up
memory for thunk synthesis.
[0018] FIG. 7B illustrates one embodiment of a method to use thunk
synthesis to bind extra data to a function.
[0019] FIG. 8 illustrates one embodiment of two chunks of memory
allocated for thunk synthesis.
[0020] FIG. 9 shows an example of a data processing system which
may be used in at least some embodiments of the invention.
DETAILED DESCRIPTION
[0021] Various embodiments and aspects of the inventions will be
described with reference to details discussed below, and the
accompanying drawings will illustrate the various embodiments. The
following description and drawings are illustrative of the
invention and are not to be construed as limiting the invention.
Numerous specific details are described to provide a thorough
understanding of various embodiments of the present invention.
However, in certain instances, well-known or conventional details
are not described in order to provide a concise discussion of
embodiments of the present inventions.
[0022] Reference in the specification to "one embodiment" or "an
embodiment" means that a particular feature, structure or
characteristic described in connection with the embodiment is
included in at least one embodiment of the invention. The
appearances of the phrase "in one embodiment" in various places in
the specification do not necessarily all refer to the same
embodiment.
[0023] Examples of methods for writing, compiling, and executing
software which includes one or more block structures are provided
in this description in the context of a language which uses a stack
memory structure to store local or automatic variables. Examples of
such languages include C, C++, Objective-C, and other C-like
procedural languages, including Java. The block
includes a function pointer that references a function which uses
data in the block. In at least certain embodiments, the block is
always initialized to be stored in the stack and the programmer is
required to explicitly copy the block to the program's heap, and
this explicit copy operation may be caused by writing an
indication, such as a call or a directive or an instruction, to
cause a block copying operation to be performed at run-time. A heap
memory structure may be implemented in a variety of different ways
including a tree structure, as is known in the art.
[0024] FIG. 1 shows an example of a method performed by a
programmer during the creation of software which includes at least
one block structure. While the software is being created in
operation 101, the programmer writes an indication, in operation
103, which causes a block copy to occur, at run-time, of the block
from a stack to the program's heap. This indication may be
implemented in a variety of different ways. For example, it may be
a call to a shared run-time routine which is accessed from the
program via a normal subroutine call. Another implementation may
have the compiler recognize the directive and compile the block
copy code in a so-called "inline" manner. There are other choices
known to those skilled in the art. In at least certain embodiments,
the programmer may also be required to write an indication to
cause, at run-time, the removal of the block from the program's
heap. This is shown in operation 105 in FIG. 1. In alternative
embodiments, the compiler may provide a warning or error message if
no such indication is found in the software for the corresponding
block. In at least certain embodiments, there should be a match,
for each block, of the indication to cause a block copy and an
indication to cause, at run-time, the removal of the block from the
program's heap when the processing of the block has been completed.
After the software has been completed, the programmer causes the
compiler to process the software in order to create an executable
version of the software which can run at run-time. FIG. 2
represents an example of a method for compiling such software, and
FIG. 3 shows an example of the execution at run-time of such
software.
[0025] The operations shown in FIG. 2 may be performed by a
compiler in the sequence shown in FIG. 2 or in an alternative
sequence which has the operations in a different order or sequence.
Further, the compiler may perform fewer operations or additional
operations. Moreover, some of the operations may be performed
previously to create shared run-time routines or libraries which
are linked in at run-time, and hence these operations may not need
to be performed when compiling the software. It will be appreciated
that the block and the function it references may contain only
local or automatic variables and yet the compiler can perform the
necessary operations to create executable software which operates
properly at run-time. It will also be understood that in certain
embodiments, several of the operations may be merged.
[0026] In operation 201 of FIG. 2, the software is compiled to
initially allocate memory for the block on a respective stack frame
for the block. The compiler may also allocate space on the stack
for a shared variable data structure by recognizing "by reference"
variables used within a block, and for each, preparing memory space
for each within a shared variable data structure. The shared
variable data structure, which may be referred to as a by_ref (or
byref) data structure, may include a forwarding pointer, initially
set to refer to itself, for the variable, a count of the number of
active uses of the shared variable when not running in a garbage
collection system, and data used to assist a shared subroutine
implementation of a block copy process in determining if the data
structure is on the heap or is on the stack, and also additional
information regarding any supplementary actions that need to be
performed upon the variable when it is copied or disposed, as
necessary, depending on the language or if a garbage collection
system is not present and the variable requires such memory
management support, such as calls to adjust shared variable
reference counts or other memory management instructions. For each
block that uses a variable in a shared variable data structure,
there is data within the block on the stack with a reference to the
stack based variable structures that it will use.
[0027] In operation 203, the compiler processes any indication,
entered by the programmer, to cause a block copy operation at
run-time, which will copy the block from a respective stack to a
heap and to cause an updating of stack based pointers held within
the byref variable structures to point to the location on the heap.
In one embodiment, the compiler may process a call to a block copy
subroutine in a shared run-time library which performs the block
copy of the block from the stack to the program's heap. It will be
understood that, in a typical embodiment, only the first indication
within the program will cause the copy of the block from the stack
to the heap and further indications do not cause additional copies
of the block to be created on the heap. In operation 205, the
compiler processes a block release indication to cause, at
run-time, removal of the block from the heap. In one embodiment,
the block release indication was entered by the programmer in
writing the software according to a rule which requires that for
every block copy indication for a particular block, there should be
a block release indication. In alternative embodiments, the
compiler may display an error message should such block release
indication not be present. The block release indication should be
placed in an appropriate location within the software such that the
block is released after the completion of execution of the
block.
[0028] In operation 207, the compiler creates software that
references shared byref variables indirectly through a pointer held
within a structure holding the variables. In operation 211, the
compiler creates software (e.g. a call to a run-time routine) to
remove a copy of the shared byref variable data structure from the
heap at every lexical escape of a shared variable, when the
run-time system is not using a garbage collector. The code
generator in operation 211 may, in one embodiment, be a call to a
"byref_block_release" function that actually recovers the storage
in the heap if after decrementing a reference count it finds the
count at zero. Operation 213 performs an optional method which can
be used to copy variables as constants for use within a block. In
another embodiment, these variables may not be treated as constants
("const"), but although they may be changed the effects of the
change would not be shared. It will be understood that this is an
optimization procedure in which certain variables are imported into
the block as constants. This can be performed at the programmer's
discretion by having the variable named by value which causes the
software created in operation 213 to append the constant to the
data block structure on the heap during run-time (as shown in
operation 307).
[0029] The method shown in FIG. 3 occurs when the compiled software
is executed. Operation 301 is performed when a function containing
a block is called or when a new lexical scope within a function is
entered that itself contains a block. In one embodiment, a shared
run-time library or routine may be called to perform operation 301
to write the block to the stack. This can be seen in FIGS. 4A and
4B which show a stack 403 for a program and a heap 405 for the
program. The stack, as shown in FIG. 4A, is empty and then, as
shown in FIG. 4B, is written to in order to store the block 407
into the stack 403. In operation 303, the block copy instructions
are executed in order to copy the block from its stack frame to the
program's heap. This executing of the block copy operation
typically occurs before the escape from a lexical scope from the
block and is not performed in response to an escape from that
scope. In operation 305, referenced variables, such as shared
variable data structures, are copied from the stack to the heap and
the forwarding pointers for those reference variables are also
changed to point to the referenced variables on the heap. For
example, as shown in FIG. 4C, the by_ref data 413 has been copied
to the heap and the forwarding pointer 415 points to that data on
the heap. The by_ref data 413 is a copy of the shared variable data
structure 409A within the stack 403.
[0030] As noted above, an optional operation may be performed to
append a constant, which replaces a variable, by appending the
constant to represent the variable to the block's data structure on
the heap.
[0031] Operations 309 and 311 are performed at run-time in order to
release memory from the heap for both the block data structure and
the shared variable data structure. The execution of the block
dispose instructions, which may be called by the block release
indication specified as part of operation 205, causes the removal
of the block from the heap. The programmer will typically place this
call at the appropriate point in the program when the block is no
longer needed or else rely on garbage collection if present to
recover the heap memory. Operation 311 may be automatically
performed to remove a copy of the shared variable data structure;
this operation may be caused to occur by a compiler inserting calls
to a routine at every place where the code which uses the shared
variable data structure exits or where the variable goes out of
scope.
[0032] One run-time embodiment of the invention may manage memory
recovery by requiring the programmer to include calls to the block
dispose subroutine. Hence, memory recovery is self-managed rather
than being managed by an automatic garbage collection system.
However, in other embodiments, garbage collection routines may be
utilized, in which case the garbage collection system manages the
block release instructions.
[0033] Another aspect of the present invention relates to debugging
when blocks or closures are present in software being debugged. A
block in one embodiment includes a data block with a specialized
function pointer that references a function that knows and uses the
layout of that data block for computation according to the syntax
of the block. A block can appear to a debugger as an opaque data
structure that can be invoked like a function. Hence, the debugger
may not be able to display the data in the opaque data structure,
and it may be desirable to provide a way for the debugger to
display this data. In at least certain embodiments, one way for the
debugger to display this data is to associate the specialized
layout of the block with the specialized function that is
referenced by the block. One association is shown in FIG. 5 in
which a function pointer is stored with a pointer to the layout
description table. In this case, the block 501 includes a block
layout description table 503 which defines the data layout
definitions of the data within the block. The function pointer 505
includes a pointer to the layout description table. This allows for
an association between the specialized function referenced by the
block and the data layout definitions which are key to the
specialized function.
[0034] FIG. 6 shows an example of a debugging method that uses the association between a function and a block layout description table. In operation 601, the function pointer can be retrieved from the block by the debugger; the pointer can then be used to retrieve the block layout description table using conventional debugger lookup functions in operation 605. This will permit the debugger to find and display the specialized layout information for the block so that the programmer can understand how the block is used relative to the function referenced by the block, etc.
[0035] Following is an example of a formal specification of blocks
according to one implementation.
The Block Type
[0036] A new type is introduced to C and, by extension, Objective-C, C++, and Objective-C++. The type is a pair consisting of the result value type and a list of parameter types, very similar to a function type.
[0037] The string "int (char, float)" describes the type of a Block
that has a result value of type int and two parameters, the first
of type char and the second of type float.
Block Declarations
[0038] A Block type is declared using function pointer style notation but substituting ^ for *. The following are all valid Block types:
TABLE-US-00001
    void (^)(void)
    int (^)(int, char)
[0039] Variadic `...` arguments are supported. A Block that takes no arguments must specify void in the argument list. An empty parameter list does not represent, as it would in K&R C, an unspecified argument list.
Operations
[0040] There is one operation on Block types: invoking them with a type-checked set of parameters and simultaneously extracting a result value.
[0041] Blocks are invoked with a list of expression parameters of
types corresponding to the declaration.
[0042] Objective-C extends the definition of a Block type so that it is also an id. A variable or expression of Block type may be
messaged or used as a parameter wherever an id may be. The converse
is also true.
[0043] All Blocks are constructed to be Objective-C objects
regardless of whether the Objective-C run-time is operational in
the program or not.
Implementation
[0044] A Block is implemented as a structure that starts with the
following fields.
TABLE-US-00002
    enum {
        BLOCK_NEEDS_FREE       = (1 << 24),
        BLOCK_HAS_COPY_DISPOSE = (1 << 25),
        BLOCK_IS_GC            = (1 << 27),
    };
    struct Block_basic {
        void *isa;          // initialized to &_NSConcreteStackBlock
        int Block_flags;    // int32_t
        int Block_size;     // XXX should be packed into Block_flags
        void (*Block_invoke)(void *);  // really a function pointer returning the
                                       // correct type and taking the appropriate args
        void (*Block_copy)(void *dst, void *src);
        void (*Block_dispose)(void *);
    };
[0045] Compiler generated code for invoking a Block can extract the
invoke function pointer and call it passing the Block data
structure and all additional parameters.
TABLE-US-00003
    (x->Block_invoke)(x, 'a');
    ((*y)->Block_invoke)(*y, 'a');
[0046] Note that if the Block returns a value, such as a structure,
that is passed via a hidden argument in the ABI (Application Binary
Interface), the normal ABI conventions are followed. Thus, a raw
Block pointer, just like a raw function pointer, cannot be
correctly invoked without knowing its return value type.
Block Literal
[0047] A Block literal is created by the new use of the ^ token as a unary operator. The form is, generally, the ^ symbol followed by the parenthesized list of expression parameters, and a code body. The return type is inferred from the type of the return statement expression. The code body is that of a compound statement. The list must appear before the first statement (if any) and is itself enclosed by | tokens. Examples:
TABLE-US-00004
    int x, y;
    __block int z;
    ^ ( void ) { z = x + y; }
[0048] Local automatic (stack) variables referenced within the compound statement of a Block are imported and captured by the block. Global variables and references to global functions are treated normally. A variable declared with the __block storage specifier is moved by the compiler into a special on-stack structure that can, if needed, be copied to a heap based memory location while still being shared by both the function that defines it and all Blocks that reference it. Local variables not marked with __block are imported as const copies of their values at the time the Block expression is formed.
[0049] The compound statement body of a Block establishes a new
lexical scope such that new local variables may be defined. Other
variable references are to the definition point in the closest
enclosing lexical scope. A local variable defined in a Block may be
referenced as either a const import or by_reference in a subsequent
Block.
[0050] Objective-C extends the definition to allow the use of the
names of instance variables when a Block expression is formed in an
appropriate instance method (Class methods have no access to
instance variables). If instance variables are referenced then a
const import of the self variable is made and all accesses within
the Block are via the imported const version of self. There is no
similar provision for C++ because it is not likely to be desirable
to form a const zero-argument constructor copy of this.
Constant Imports
[0051] In the example above, the values x and y are implicitly imported into the compound statement as const variables. The basic
structure is augmented as
TABLE-US-00005
    struct {
        struct Block_basic base;
        const int x;
        const int y;
        ...;  // byref reference to z
    }
    void _Block_copy_assign(struct Block_basic **dest, const struct Block_basic *src);
    void _Block_destroy(const struct Block_basic *src);
[0052] This structure is allocated on the stack, and the fields x and y are initialized to the values of the x and y variables at the point of declaration. C scalars, structures, references, and Blocks are copied with simple initialization assignment.
[0053] Blocks support persistence by way of the run-time helper function _Block_copy(). The compiler provides assistance in copying for variables of types Objective-C objects, Blocks, by_reference variables, and C++ stack objects. There are two additional helper routines synthesized for use by _Block_copy, if needed, and if present, Block_flags is marked with BLOCK_HAS_COPY_DISPOSE. The first is the copy helper, which takes the new and the existing Block data structures. Objective-C object pointers are sent the -retain message unless the -fobjc-gc-only flag is set, and they are assigned using the objc_assignStrongCast() operation if the -fobjc-gc or -fobjc-gc-only flag is set. Block variables are copied using the specific objc run-time helper function _Block_copy_assign. C++ objects are copied using the default copy constructor.
[0054] Similarly, the compiler provides a destruction helper function that is passed the Block data structure. Objective-C objects are sent the -release message unless the -fobjc-gc-only flag is set. Block variables are passed to a support routine, _Block_destroy, and C++ objects have their appropriate destructors synthesized.
By Reference Parameters
[0055] By-reference parameters (those marked with __block) are limited to automatic variables of an enclosing scope. Conceptually, every
local variable that is imported as a by_reference variable in any
block in that scope, including that of the function/method, is
actually allocated on the stack as a member of its own unique
structure. This structure will be copied to a heap if one of the
Blocks that references this variable is copied using Block_copy( ).
To support continued access to this variable as it is copied, the
structure contains a forwarding pointer that is initially set to
the start of this structure, and all accesses to that variable are
made indirectly through the forwarding pointer.
[0056] After the point of last use and before escape from the enclosing scope, a call to a run-time dispose function, Block_destroy_byref(), is made upon the structure.
[0057] The layout of the shared storage is
TABLE-US-00006
    struct Block_byref {
        long reserved;
        struct Block_byref *forwarding;
        int flags;
        int size;
        void (*byref_keep)(struct Block_byref *dst, struct Block_byref *src);
        void (*byref_destroy)(struct Block_byref *);
        // long shared[0];
    };
[0058] The copy and destroy helper routines synthesized for a Block must also, for each such shared structure containing a shared by_reference variable, emit a copy helper function call, _Block_byref_assign_copy, to preserve (if necessary) the shared data structure by copying it to the heap. In the destroy helper, it must call _Block_byref_release() to help recover the heap reference.
[0059] Similar to the case of constant imported variables, if variables of types Objective-C objects, Blocks, or C++ stack variables are named in by_reference sections, they need support from the compiler for when they are copied to the heap. The support is identical to that of const imports, except that the destinations are not typed or treated as const variables. The flags word of the Block_byref structure should be marked with BLOCK_HAS_COPY_DISPOSE if such helper routines are present.
[0060] C++ stack objects continue to require destructors despite their enclosure in the stack-based structure.
[0061] The idea here is that all stack-based Blocks share a stack-based byref data structure. The byref data structure initially holds non-retained objects and Blocks. Upon copy, the run-time arranges to mark the copy as a copy and to properly retain its components.
[0062] A byref storage structure is conceptually required for each
variable shared in any reachable Block.
[0063] Note: All variables used as by_reference variables in the
same set of Blocks may share the same shared storage structure.
Control Flow
[0064] The compound statement of a Block is treated much like a
function body with respect to control flow in that gotos, breaks,
and continues do not escape the Block. Exceptions are treated
"normally" in that when thrown they pop stack frames until a catch
clause is found.
Local Variables
[0065] The scope of local variables is that of a function--each activation frame contains a new copy of variables declared within the local scope of the Block. Such variable declarations should be allowed anywhere, including in for statements, rather than only with C99 or GNU-specific flags.
[0066] There are no "Block" lifetime scoped variables that persist
across multiple invocations. This would likely require a way to
specify finalization logic for when the Block was recovered.
Thunk Synthesis
[0067] Another aspect of some embodiments of the current invention
relates to defenses against viruses and other malicious code. Some
embodiments of computing systems include dynamic compilers.
Generally speaking, a dynamic compiler in a computing system allows
embedding of data addresses directly in dynamically generated code
(as opposed to a static compiler), and compiles the code into
binary code executable by a processor in the computing system. It
should be noted that, for small instruction sequences such as are needed by thunks, only a few well-known instructions need be generated; this knowledge may be embedded by the compiler into the program directly and thus does not require that an entire dynamic compiler be embedded within the program. An address of a function of interest in the binary code allows recovery of the data address. In some embodiments, this function of interest may be viewed as an address of a code instruction with a list of arguments. The address
of the function of interest is put into a stack during execution. A
function pointer is synthesized to recover additional argument(s)
using a thunk, which is substantially similar to a mini function
within the function of interest. As discussed above, a thunk as
used herein broadly refers to a piece of code to perform some
delayed computation. In some embodiments, the thunk provides a new
way to bind data to a particular function, which may be referred to
as per function runtime data. This approach is applicable to
pre-compiled instructions in various types of computing languages,
such as C, object-oriented programming languages (e.g., C++,
Objective-C), assembly languages, etc.
[0068] To further illustrate the above concept, one example is used
in the following discussion. In this example, an array of strings
in a specific format (e.g., first name in the first six (6)
characters, last name in the next six (6) characters, etc.) can be
input to a facility (e.g., a sort routine named Qsort), which calls
a compare function to compare the strings in order to sort them. A
compare function is built to look up extra data that indicates by
which piece of data the strings are supposed to be sorted (e.g., by
first name), and/or the row and column by which the strings are to
be sorted. According to one conventional approach, the extra data,
which is writable data, is used as executable instructions. But such
usage may pose a severe security risk to the computing system
because viruses or malicious code may be introduced via the
writable data.
[0069] In some embodiments, the above security problem is solved
using thunk synthesis, which provides a way to bind data to a
particular function. To create a reusable resource for thunk synthesis, the operating system first sets up a portion of the
memory in the computing system. FIG. 7A illustrates one embodiment
of a method to set up the memory for thunk synthesis. The operating
system first pre-allocates two chunks of memory for thunk synthesis
in block 700. The first chunk of memory is initially marked as
writable in block 702. Then code is written to the first chunk of
memory in block 704. The code is usable to recover a corresponding
address in the second chunk of memory (e.g., a direct pointer to
specific address in the second chunk). Then the first chunk of
memory is marked as not writable, but executable, in block 706.
Finally, the second chunk of memory is marked writable in block
708. The second chunk of memory is for storing the actual addresses
of data.
[0070] Note that the operating system performs the above operations
only once to create this resource (i.e., the two chunks of memory).
These two chunks of memory are reusable. In other words, there is
essentially no additional setup cost incurred when a second
function is executed because the resource needed has already been
set up during execution of the first function.
[0071] FIG. 7B illustrates one embodiment of a method to use thunk
synthesis to bind extra data to a function. The method may be
performed by processing logic comprising hardware, software,
firmware, or a combination thereof. Processing logic starts
execution of a program at block 710. During execution of the
program, processing logic calls a function at block 712 in response
to an instruction within the program. The pre-compiled code of the
function typically includes a list of one or more arguments.
However, the function may require one or more arguments in addition
to the arguments in the list. To recover the additional arguments
for the function, processing logic performs the following
operations.
[0072] At block 714, processing logic determines if two chunks of
memory have been set up yet. If not, processing logic calls a
coordinator function in a library to set up the two chunks of
memory at block 718. In some embodiments, the coordinator function
in the library is used to get a function slot and to return the
function slot. For example, the coordinator function may cause the
operating system to allocate two chunks of memory for thunk
synthesis. Then the coordinator function may cause the operating
system to set the two chunks of memory up by marking them as
writable or non-writable, executable, etc., as described above with
reference to FIG. 7A. Then the operating system hands back the two
chunks of memory ready for thunk synthesis.
[0073] Otherwise, if the two chunks of memory have already been set
up, then processing logic looks up a first pointer in a first
location in the first chunk of memory associated with the function
at block 720. Then processing logic uses the first pointer to look
up a second location in the second chunk of memory at block 724.
Processing logic further looks up a second pointer at the second
location in the second chunk of memory at block 726. Finally,
processing logic uses the second pointer to load an extra argument
of the function or to jump to a closure function at block 728.
[0074] Applying the above technique to the previous example, a program including the function Qsort may be processed as follows.
As previously discussed, Qsort calls a programmer supplied
function, compare, to sort an array of strings. The programmer will
supply an inner function sortByName that uses local information
about the column positions of the first and last names. Because the
compiler knows that there are secret additional parameters (namely
the column position information), the compiler writes the
sortByName function in a way that accesses a secret extra
parameter, say, held in a scratch register. It then writes code
that will allocate one of the many thunks established by the
coordinator. One embodiment of two chunks of memory 810 and 820 for
thunk synthesis is illustrated in FIG. 8. All thunks generated in
the system are written to take a piece of data stored in the second
writable chunk of memory 820. After writing code that allocates the
thunk, the compiler writes code that stores the address of the
sortByName function 824 and the address of the column position
variables 822 into the thunk. It then supplies the address of the thunk's code in the first chunk of memory 810 as the parameter to the Qsort routine. After generating the call to the Qsort routine,
the compiler generates code to inform the coordinator that it can
reuse the thunk. Alternatively, if a garbage collection system or
other memory management mechanism is present, the compiler may do
nothing.
[0075] During execution, Qsort calls the thunk code which loads the
previously stored address of the column position variables 822 into
the known scratch register and then jumps to the address 824 also
stored in the second chunk of memory 820, which is the code for the
sortByName function. It does the comparison using the extra
information and returns directly to Qsort. This operation is done
for every pair of elements that the Qsort function determines is
necessary to sort the array.
[0076] After Qsort returns, execution passes back to the recovery
code that the compiler generated and the thunk is returned to the
coordinator for some other use. Note that the thunks are small and
that many are established in the two chunks of memory 810 and 820,
where the thunks are handed out and recovered.
[0077] If the two chunks of memory 810 and 820 run out of space,
then the processor may call the coordinator function again to
extend the two chunks of memory 810 and 820 by allocating more
memory to the two chunks of memory 810 and 820.
[0078] When execution of the function compare has been completed,
get_function may be called to return (or to free up) the thunk in
the two chunks of memory 810 and 820. Alternatively, a garbage
collection system or other memory recovery mechanism may be used
instead of get_function to return the thunk.
[0079] FIG. 9 shows one example of a typical computer system, which
may be used with the present invention. Note that while FIG. 9
illustrates various components of a computer system, it is not
intended to represent any particular architecture or manner of
interconnecting the components as such details are not germane to
the present invention. It will also be appreciated that personal
digital assistants (PDAs), cellular telephones, handheld computers,
media players (e.g. an iPod), entertainment systems, devices which
combine aspects or functions of these devices (e.g. a media player
combined with a PDA and a cellular telephone in one device), an
embedded processing device within another device, network
computers, a consumer electronic device, and other data processing
systems which have fewer components or perhaps more components may
also be used with or to implement one or more embodiments of the
present invention. The computer system of FIG. 9 may, for example,
be a Macintosh computer from Apple Inc. The system may be used when
programming or when compiling or when executing the software
described.
[0080] As shown in FIG. 9, the computer system 45, which is a form
of a data processing system, includes a bus 51 which is coupled to
a processing system 47 and a volatile memory 49 and a non-volatile
memory 50. The processing system 47 may be a microprocessor from
Intel which is coupled to an optional cache 48. The bus 51
interconnects these various components together and also
interconnects these components to a display controller and display
device 52 and to peripheral devices such as input/output (I/O)
devices 53 which may be mice, keyboards, modems, network
interfaces, printers and other devices which are well known in the
art. Typically, the input/output devices 53 are coupled to the
system through input/output controllers. The volatile memory 49 is
typically implemented as dynamic RAM (DRAM) which requires power
continually in order to refresh or maintain the data in the memory.
The nonvolatile memory 50 is typically a magnetic hard drive, a
flash semiconductor memory, or a magnetic optical drive or an
optical drive or a DVD RAM or other types of memory systems which
maintain data (e.g. large amounts of data) even after power is
removed from the system. Typically, the nonvolatile memory 50 will
also be a random access memory although this is not required. While
FIG. 9 shows that the nonvolatile memory 50 is a local device
coupled directly to the rest of the components in the data
processing system, it will be appreciated that the present
invention may utilize a non-volatile memory which is remote from
the system, such as a network storage device which is coupled to
the data processing system through a network interface such as a
modem or Ethernet interface. The bus 51 may include one or more
buses connected to each other through various bridges, controllers
and/or adapters as is well known in the art.
[0081] It will be apparent from this description that aspects of
the present invention may be embodied, at least in part, in
software. That is, the techniques may be carried out in a computer
system or other data processing system in response to its
processor, such as a microprocessor, executing sequences of
instructions contained in a machine-readable storage medium such as
a memory (e.g. memory 49 and/or memory 50). In various embodiments,
hardwired circuitry may be used in combination with software
instructions to implement the present invention. Thus, the
techniques are not limited to any specific combination of hardware
circuitry and software nor to any particular source for the
instructions executed by the data processing system. In addition,
throughout this description, various functions and operations are
described as being performed by or caused by software code to
simplify description. However, those skilled in the art will
recognize what is meant by such expressions is that the functions
result from execution of the code by a processor, such as the
processing system 47.
[0082] In the foregoing specification, the invention has been
described with reference to specific exemplary embodiments thereof.
It will be evident that various modifications may be made thereto
without departing from the broader spirit and scope of the
invention as set forth in the following claims. The specification
and drawings are, accordingly, to be regarded in an illustrative
sense rather than a restrictive sense.
* * * * *