Method of reducing the number of instructions in a program code sequence Patent Grant Tyma May 11, 1 [PreEmptive Solutions, Inc.]

U.S. patent number 5,903,761 [Application Number 08/961,717] was granted by the patent office on 1999-05-11 for method of reducing the number of instructions in a program code sequence. This patent grant is currently assigned to PreEmptive Solutions, Inc. Invention is credited to Paul Tyma.

United States Patent	5,903,761
Tyma	May 11, 1999

Method of reducing the number of instructions in a program code sequence

Abstract

A method of reducing the number of instructions in a computer program. A program definition instruction and a use instruction that operate on the same program variable are identified. If the use instruction may be moved ahead of one or more other instructions in the computer program to be adjacent the definition instruction, then the use instruction and the definition instruction are removed from the computer program.

Inventors:	Tyma; Paul (Broadview Heights, OH)
Assignee:	PreEmptive Solutions, Inc. (Euclid, OH)
Family ID:	25504888
Appl. No.:	08/961,717
Filed:	October 31, 1997

Current U.S. Class:	717/148; 717/154
Current CPC Class:	G06F 8/445 (20130101); G06F 8/433 (20130101)
Current International Class:	G06F 9/45 (20060101)
Field of Search:	395/705-709

References Cited

U.S. Patent Documents


4965724	October 1990	Utsumi et al.
5287510	February 1994	Hall et al.
5396631	March 1995	Hayashi et al.
5596732	January 1997	Hosoi
5805895	September 1998	Breternitz, Jr. et al.
5835776	November 1998	Tirumalai et al.

Other References

Jason Steinhorn, Compiling Java, Embedded Systems Programming, pp. 42-56, Sep. 1998.
Instantiations, Inc., Java Speed Barrier Smashed; Key Benchmarks Indicate New JOVE Technology Produces Java Speeds up to 15 Times that of Current Technologies, Internet-WWW, Jul. 29, 1998.

Primary Examiner: Hafiz; Tariq R.
Assistant Examiner: Zhen; Wei
Attorney, Agent or Firm: Blakely, Sokoloff, Taylor & Zafman LLP

Claims

What is claimed is:

1. A method of reducing the number of instructions in a computer program, the method comprising the steps of:

identifying in the program a definition instruction and a use instruction that operate on the same program variable;

determining whether the use instruction may be moved ahead of one or more other instructions in the computer program to be adjacent the definition instruction; and

removing the use instruction and the definition instruction from the computer program if the use instruction can be moved ahead of the one or more other instructions to be adjacent the definition instruction.

2. The method of claim 1 wherein the computer program includes a sequence of instructions that can be executed in a stack-based virtual machine.

3. The method of claim 1 wherein the computer program includes a sequence of Java bytecodes that can be executed in a Java virtual machine.

4. The method of claim 1 wherein the step of determining whether the use instruction may be moved ahead of one or more other instructions to be adjacent the definition instruction includes the step of determining whether the use instruction is preceded by a stack push instruction that pushes an operand onto the stack for use in a non-commutative operation with the program variable.

5. The method of claim 1 wherein the step of determining whether the use instruction may be moved ahead of one or more other instructions to be adjacent the definition instruction includes the step of determining whether the definition instruction is within a stack-balanced region that does not include the use instruction and that cannot be altered to include the use instruction.

6. The method of claim 1 further comprising the step of generating the computer program by compiling Java source code into a sequence of Java bytecodes, the definition instruction being a Java store instruction and the use instruction being a Java load instruction.

7. The method of claim 1 wherein the step of determining whether the use instruction may be moved ahead of one or more other instructions includes the step of identifying a stack-balanced sequence of instructions.

8. The method of claim 7 wherein the step of identifying a stack-balanced sequence of instructions includes the step of identifying a sequence of instructions that, if executed, will cause an equal number of stack push and pop operations to take place in an order such that, throughout execution of the sequence of instructions, the number of completed stack pop operations does not exceed the number of completed stack push operations.

9. The method of claim 1 further comprising the step of determining whether contents of the program variable are pushed onto the stack by an instruction that succeeds the use instruction, and wherein said step of removing is performed only if contents of the program variable are not pushed onto the stack by an instruction that succeeds the use instruction.

10. A method of reducing the number of instructions in a program code sequence, the method comprising the steps of:

replacing adjacent use instructions in the program code sequence that operate on the same variable with a single use instruction and at least one instruction which, when executed, causes a value on top of a stack to be duplicated on the stack;

removing adjacent definition-use instruction pairs from the program code sequence;

reordering stack push instructions ahead of other instructions in the program code sequence; and

repeating the steps of replacing, removing and reordering until no stack push instructions are reordered by said step of reordering.

11. The method of claim 10 wherein the step of reordering stack push instructions ahead of other instructions comprises the step of moving a first stack push instruction ahead of other instructions in the program code sequence to a point at which at least one of the following criteria is satisfied:

the first stack push instruction has been moved ahead of all instructions in the program code sequence that do not cause a value to be pushed onto the stack;

the first stack push instruction cannot be moved further ahead of other instructions without being moved into a stack-balanced sequence of instructions;

the first stack push instruction cannot be moved further ahead of other instructions without being moved ahead of a second stack push instruction that supplies an operand for a non-commutative operation that also receives an operand from the first stack push instruction; and

the first stack push instruction cannot be moved further ahead of other instructions without being moved ahead of a stack pop instruction that operates on a program variable that is also operated upon by the stack push instruction.

12. An article of manufacture including one or more computer-readable media having a program code stored thereon which, when executed by a processor, causes the processor to perform the steps of:

identifying in a sequence of instructions a definition instruction and a use instruction that operate on the same program variable;

determining whether the use instruction may be moved ahead of one or more other instructions in the sequence of instructions to be adjacent the definition instruction; and

removing the use instruction and the definition instruction from the sequence of instructions if the use instruction can be moved ahead of the one or more other instructions to be adjacent the definition instruction.

13. The article of claim 12 wherein the sequence of instructions is a sequence of instructions for execution in a stack-based virtual machine.

14. The article of claim 12 wherein the sequence of instructions is a sequence of Java bytecodes for execution in a Java virtual machine.

15. An article of manufacture including one or more computer-readable media having sequences of instructions stored thereon which, when executed by a processor, cause the processor to reduce the number of instructions in a program code sequence by performing the steps of:

replacing adjacent use instructions that operate on the same variable with a single use instruction and at least one instruction which, when executed, causes a value on top of a stack to be duplicated on the stack;

removing adjacent definition-use instruction pairs from the program code sequence;

reordering stack push instructions ahead of other instructions in the program code sequence; and

repeating the steps of replacing, removing and reordering until no stack push instructions are reordered by said step of reordering.

16. The article of claim 15 wherein the step of reordering stack push instructions ahead of other instructions comprises the step of moving a first stack push instruction ahead of other instructions in the program code sequence to a point at which at least one of the following criteria is satisfied:

the first stack push instruction has been moved ahead of all instructions in the program code sequence that do not cause a value to be pushed onto the stack;

the first stack push instruction cannot be moved further ahead of other instructions without being moved into a stack-balanced sequence of instructions;

the first stack push instruction cannot be moved further ahead of other instructions without being moved ahead of a second stack push instruction that supplies an operand for a non-commutative operation that also receives an operand from the first stack push instruction; and

the first stack push instruction cannot be moved further ahead of other instructions without being moved ahead of a stack pop instruction that operates on a program variable that is also operated upon by the stack push instruction.

17. A computer data signal embodied in a carrier wave and representing sequences of instructions which, when executed by a processor, cause the processor to perform the steps of:

identifying in a computer program a definition instruction and a use instruction that operate on the same program variable;

determining whether the use instruction may be moved ahead of one or more other instructions in the computer program to be adjacent the definition instruction; and

removing the use instruction and the definition instruction from the computer program if the use instruction can be moved ahead of the one or more other instructions to be adjacent the definition instruction.

18. The computer data signal of claim 17 wherein the computer program is a sequence of Java bytecodes for execution in a Java virtual machine.

19. A method of preventing reverse-compiling a sequence of bytecodes to obtain program source code, the method comprising the steps of:

removing from the sequence of bytecodes adjacent definition and use instructions that operate on the same program variable; and

reordering push instructions within the sequence of bytecodes.

20. The method of claim 19 further comprising the step of replacing a use instruction that is adjacent another use instruction with an instruction to duplicate a value on top of a stack.

21. The method of claim 19 comprising the step of iteratively performing the steps of removing and reordering.

Description

FIELD OF THE INVENTION

The present invention relates to the field of computer science, and more particularly to a method of reducing the number of instructions in a sequence of compiled program code.

BACKGROUND OF THE INVENTION

Some modern compilers, most notably the Java compiler from Sun Microsystems, are designed to compile source code (e.g., Java programs) into sequences of instructions to be executed on a stack-based virtual machine. A key benefit of compiling source code for execution on a virtual machine is that the compiled code may be executed by any processor that can be programmed to implement the virtual machine, regardless of the processor's internal architecture.

One drawback to compiling code for execution on a virtual machine is that execution is usually much slower than if the program had been compiled into native instructions executable by the underlying processor. In a stack-based virtual machine like the Java virtual machine, the stack is usually maintained in system memory so that stack push and pop operations are relatively time consuming and contribute to the relatively slow execution rate of the virtual machine.

Another drawback to compiling code for execution on a virtual machine is that the compiled code tends to be easy to reverse compile into a version of the original source code. This is a serious concern for many software developers. After spending large amounts of time and money developing a software program, developers do not want to place the program in the public domain in a form that gives away the source code.

SUMMARY OF THE INVENTION

A method of reducing the number of instructions in a computer program is disclosed. A definition instruction and a use instruction that operate on the same variable of the computer program are identified. A determination is made as to whether the use instruction may be moved ahead of one or more other instructions in the computer program so that it is adjacent the definition instruction. If the use instruction may be moved to be adjacent the definition instruction, the use instruction and the definition instruction are removed from the computer program.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements and in which:

FIG. 1 is a Java source code listing.

FIG. 2 is a sequence of Java bytecodes that result from compilation of the source code of FIG. 1.

FIG. 3 is a modified version of the sequence of Java bytecodes in FIG. 2.

FIG. 4 is a Java source code listing.

FIG. 5 is a sequence of Java bytecodes that result from compilation of the source code of FIG. 4.

FIG. 6 is a directed-acyclic graph that corresponds to a modified version of the bytecode sequence of FIG. 5.

FIG. 7 is a method diagram according to one embodiment of the present invention.

FIG. 8A is a diagram illustrating the use of push migration to reorder instructions in the bytecode sequence of FIG. 5.

FIG. 8B is a diagram illustrating the removal of adjacent definition-use instruction pairs from the reordered bytecode sequence of FIG. 6.

FIG. 8C is a diagram illustrating the result of iteratively performing push migration and adjacent pair removal on the bytecode sequence of FIG. 5.

DETAILED DESCRIPTION

A method for reducing the number of instructions in stack-based program code is disclosed. According to one embodiment, the program code is iteratively analyzed to determine opportunities for instruction removal. The overriding requirement for instruction removal is that program output is equivalent before and after transformation. Many opportunities are present because of limitations of the original programming language and/or common programming design techniques, which are known as "good design" but translate poorly to stack-based virtual machine instructions. These opportunities manifest themselves as the ability to remove instructions that deal with storage of transient values into memory. Storing and recalling transient values from memory is a common, often necessary, practice in register-based machines. In stack-based machines, however, storing and recalling transient values from memory often becomes superfluous because the stack is inherently a place for temporary storage.

In the present invention, the rate of program execution is increased by eliminating unnecessary stack push and pop operations. In interpreted environments and in processors which execute instructions directly, fewer instructions means reduced execution time. A Just-in-Time (JIT) compiled environment converts stack-based program code into register-based program code immediately prior to execution. Complex code sequences tend to impede the expediency and effectiveness of this process. Therefore, in JIT environments code size reduction also results in faster execution times.

In one embodiment, code reordering and instruction replacement is used to increase the number of stack push and pop instructions that can be eliminated. As discussed below, one highly desirable effect of such code reordering and instruction replacement is that the modified code cannot be easily reverse compiled to obtain the original source code. This effect is referred to as obfuscating the source code. In other words, application of one or more embodiments of the present invention to reduce the number of instructions in a program code sequence not only speeds program execution, but also obfuscates the source code from which the program code sequence is generated.

The Java programming language from Sun Microsystems is used as an example throughout the following description because it is well suited to being compiled into a stack-based intermediate representation ("Java" and "Sun" are trademarks of Sun Microsystems, Inc.). The Java programming language is compiled primarily into an intermediate representation known as bytecodes. Bytecodes are instructions that can be executed by a Java virtual machine. Because the Java virtual machine is implemented by execution of a program that has been ported to most popular computer architectures, Java programs can be run on many different types of computers (e.g., Intel, Macintosh, IBM RISC, etc.) without modification. Unfortunately, this "program which runs programs" abstraction (i.e., execution of a program to implement the Java virtual machine, which then executes Java bytecodes) has taken its toll in terms of execution speed. Java programs are notoriously slow. The present invention can be used to help speed up execution of Java programs and other stack-based program code.

Bytecodes

Java bytecodes are similar to most assembly languages except that they are stack-based. For example, an expression in Java such as:

    x = y + 4;

would compile to bytecodes that look like:

    iload_2      // "_2" denotes variable y
    bipush 4     // push a constant 4
    iadd         // add the two top stack elements
    istore_1     // "_1" denotes variable x

The iload_2 instruction loads the value from variable y and pushes it on top of a stack maintained by a Java virtual machine. As indicated in the comment field, the "_2" suffix denotes variable y. The bipush instruction pushes the constant 4 onto the stack. The iadd instruction pops the top two values off the stack (the value of y and the constant), adds them, and then pushes the sum onto the stack. The istore_1 instruction pops the sum off the stack and stores it into variable x. Note that local variable names are lost after compilation; at runtime each variable is designated merely by an ordinal number (i.e., _2 denotes variable y, _1 denotes variable x). Also note that the "i" prefix of many instructions indicates an integer instruction. The examples herein are restricted to integers for clarity, but by no means is the invention limited to integers. The invention is equally effective when operating on other types, such as long integer, floating point or Boolean types (e.g., with instructions such as lload, fload, dload, etc.). The canonical stack operation "push" occurs in several different Java bytecodes including bipush, sipush, and iload. The "pop" operation is analogously performed by istore, among others.
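The push and pop behavior described above can be illustrated with a minimal sketch of a stack interpreter. It is not part of the patent: the class name StackMachineDemo, the textual instruction format, and the choice of y = 10 are assumptions made purely for illustration, and only the four instructions of the listing are supported.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackMachineDemo {
    // Executes a tiny, illustrative subset of integer bytecodes
    // against an array of local variable slots.
    static int[] run(String[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : code) {
            if (insn.startsWith("iload_")) {
                // push: copy a local variable onto the stack
                stack.push(locals[Integer.parseInt(insn.substring(6))]);
            } else if (insn.startsWith("istore_")) {
                // pop: move the top of the stack into a local variable
                locals[Integer.parseInt(insn.substring(7))] = stack.pop();
            } else if (insn.startsWith("bipush ")) {
                // push a constant
                stack.push(Integer.parseInt(insn.substring(7)));
            } else if (insn.equals("iadd")) {
                // pop two operands, push their sum
                int right = stack.pop();
                int left = stack.pop();
                stack.push(left + right);
            } else {
                throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = {0, 0, 10}; // slot 1 = x, slot 2 = y (assumed value 10)
        run(new String[] {"iload_2", "bipush 4", "iadd", "istore_1"}, locals);
        System.out.println("x = " + locals[1]); // prints "x = 14"
    }
}
```

The stack is empty both before and after the four instructions run, which is the property the later discussion of stack-balanced blocks relies on.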

Examining sequences of code is simplified by partitioning the code into basic blocks. A basic block is a program code sequence (typically assembly code) that includes no goto statements leading out of the program code sequence and that is not entered (except at the first instruction) by a goto statement that is external to the program code sequence. If a code sequence includes a goto statement that exits the code sequence, or if the code sequence is entered at some point by a goto statement that resides outside the code sequence, then that code sequence would be divided into two or more basic blocks at the points of entry and exit.

Transient Variables

FIG. 1 is a portion of Java source code that, when compiled and executed, causes the contents of two variables to be swapped. As FIG. 1 shows, a temporary (i.e., transient) variable called "temp" is needed to hold the initial value of variable "a" so that variable "a" can be overwritten without loss of its initial value. Otherwise, the initial value of variable "a" would not be available to be stored in variable "b". If the source code programmer had direct access to a stack in a stack-based machine (virtual or otherwise) the programmer could store the initial value of variable "a" on the stack. However, access to the stack of a stack-based machine is typically not available to the programmer so that the "temp" variable is required. In one embodiment of the present invention, the compiled version of the code in FIG. 1 is reordered in a manner that does not destroy the execution result of the code, but which allows instructions associated with the temp variable to be removed in favor of storage on the stack of a stack-based machine.

FIG. 2 shows a Java bytecode sequence that results from compiling the Java source code in FIG. 1. In the bytecode sequence, the source code variables "a", "b" and "temp" are indicated by the suffixes _1, _2 and _3, respectively. For example, when executed, the iload_1 instruction causes the contents of variable "a" to be pushed onto the stack of a Java virtual machine, the istore_3 instruction causes the value at the top of the stack (i.e., the contents of variable "a") to be popped off the stack and stored in "temp", and so forth.

FIG. 3 illustrates the Java bytecode sequence of FIG. 2 after it has been modified by a technique called "push migration". Push migration is the act of moving (i.e., reordering) stack push instructions (i.e., instructions such as "iload" which cause a value to be pushed onto a stack) as near to the beginning of a code sequence as possible. As discussed below, significant restrictions on instruction reordering exist and therefore FIG. 3 displays only one allowable move: iload_3 from the second to last statement to immediately after the istore_3. In another operation, called "definition-use pair removal", the legality of removing the iload_3 and istore_3 instructions is determined. Accepting for now that the iload_3 and the istore_3 instructions can be removed, the code sequence is reduced to four instructions that perform the same operation as the original six, without the overhead of the temp variable. This is significant not only for the reduced number of instructions, but also because the iload_3 and istore_3 instructions involve time-consuming memory access. Consider that to perform the istore_3 in the original version, the top of stack is read (a memory access), a stack pointer is updated (a memory access or register update), and a value is assigned to temp (another memory access). The iload_3 requires analogous effort. Thus, by removing the iload_3 and istore_3 instructions, execution time is significantly reduced.

Still referring to FIG. 3, note that the code sequence of FIG. 3 takes advantage of the last-in, first-out nature of the stack to avoid having to access the temp variable (i.e., avoiding the istore_3 and the iload_3 instructions). Even though this code is completely legal within the Java virtual machine, it cannot easily be obtained by reformulating the Java source code. More importantly, at least from a source code obfuscation standpoint, it is not a simple matter to regenerate the original source code of FIG. 1 by reverse compiling the bytecode sequence of FIG. 3. The reason for this is that the Java programming language, like many high-level programming languages, does not provide a source-level construct that allows a variable merely to be pushed onto the stack without there being an associated operation on the pushed value. Note that the source code in FIG. 1 is fairly simple. When bytecodes obtained from a larger, more complex portion of source code are reordered, it becomes substantially more difficult to regenerate the original source code. Thus, code reordering according to embodiments of the present invention provides a significant impediment to would-be copyists.
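The equivalence of the six-instruction swap and its four-instruction reduction can be checked with a small simulation. The figures themselves are not reproduced in this text, so the two sequences below are reconstructed from the description (variables "a", "b" and "temp" in slots 1, 2 and 3) and should be read as an assumption rather than a copy of FIG. 2 and FIG. 3; the class name SwapWithoutTemp and the starting values are likewise hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class SwapWithoutTemp {
    // Runs iload_N / istore_N instructions on a copy of the given local variables.
    static int[] run(String[] code, int[] initialLocals) {
        int[] locals = initialLocals.clone();
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : code) {
            if (insn.startsWith("iload_")) {
                stack.push(locals[Integer.parseInt(insn.substring(6))]);
            } else if (insn.startsWith("istore_")) {
                locals[Integer.parseInt(insn.substring(7))] = stack.pop();
            } else {
                throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] start = {0, 7, 9, 0}; // slot 1 = a, slot 2 = b, slot 3 = temp

        // Assumed compiled form of: temp = a; a = b; b = temp;
        String[] original = {"iload_1", "istore_3", "iload_2",
                             "istore_1", "iload_3", "istore_2"};

        // After moving iload_3 next to istore_3 and removing the pair.
        String[] reduced = {"iload_1", "iload_2", "istore_1", "istore_2"};

        System.out.println(Arrays.toString(run(original, start))); // prints "[0, 9, 7, 7]"
        System.out.println(Arrays.toString(run(reduced, start)));  // prints "[0, 9, 7, 0]"
    }
}
```

Both runs leave a = 9 and b = 7. They differ only in slot 3, which the reduced sequence never writes; this difference is harmless only when temp has no subsequent use.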

FIG. 4 depicts a sequence of Java source code statements that are assumed to constitute a basic block of code. That is, it is assumed that there are no goto statements into or out of this sequence (note that there may be a goto to the first instruction of the sequence). This sequence of source code statements, and the bytecodes that result from its compilation are used throughout the remainder of this description to explain various embodiments of the present invention.

Stack-Balanced Instruction Sequences

FIG. 5 illustrates a compiled version of the expression sequence of FIG. 4. FIG. 5 also illustrates a partitioning of the code sequence in two ways. The dotted arrows on the left mark code sequences called "stack-balanced blocks" (or "stack-balanced instruction sequences"). Stack-balanced blocks are instruction sequences that, after execution, leave the stack in the same state as before execution. More specifically, a stack-balanced block is a sequence of instructions that, when executed, causes an equal number of stack push and pop operations to take place in an order such that, throughout execution of the sequence of instructions, the number of completed stack pop operations does not exceed the number of completed stack push instructions. Thus, it is perfectly legal for instructions within a stack-balanced block to push values onto the stack, but the values must be popped off the stack by the end of the stack-balanced block so that the stack is unchanged. As discussed below, in at least one embodiment of the present invention, stack-balanced blocks are identified and used to determine the legality of instruction reordering.
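The definition of a stack-balanced block can be expressed as a short check. The following is a minimal sketch under stated assumptions: instructions are plain strings, only a handful of integer bytecodes are modeled, and the names StackBalanceCheck, effect and isStackBalanced are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class StackBalanceCheck {
    // Returns {values popped, values pushed} for a small, illustrative subset of bytecodes.
    static int[] effect(String insn) {
        String op = insn.split(" ")[0];
        if (op.startsWith("iload_") || op.equals("bipush") || op.equals("sipush")) {
            return new int[] {0, 1};
        }
        if (op.startsWith("istore_")) {
            return new int[] {1, 0};
        }
        if (op.equals("iadd") || op.equals("isub") || op.equals("imul")) {
            return new int[] {2, 1};
        }
        throw new IllegalArgumentException("unsupported: " + insn);
    }

    // True if pops never outnumber pushes at any point and the totals are equal at the end.
    static boolean isStackBalanced(List<String> code) {
        int depth = 0; // completed pushes minus completed pops
        for (String insn : code) {
            int[] e = effect(insn);
            depth -= e[0];
            if (depth < 0) {
                return false; // would pop a value the sequence did not push
            }
            depth += e[1];
        }
        return depth == 0;
    }

    public static void main(String[] args) {
        System.out.println(isStackBalanced(
                Arrays.asList("iload_3", "bipush 100", "imul", "istore_3"))); // prints "true"
        System.out.println(isStackBalanced(
                Arrays.asList("imul", "istore_3")));                          // prints "false"
    }
}
```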

Definition Instructions, Use Instructions and Definition-Use Pairs

According to one embodiment of the present invention, the number of instructions in stack-based program code is reduced by removing definition instructions and use instructions that constitute definition-use pairs. A definition instruction is an instruction to pop a value off a stack and into a memory location reserved for a program variable (e.g., a store instruction in a Java bytecode sequence). A use instruction is an instruction that obtains a value from a memory location reserved for a program variable and pushes the value onto the stack (e.g., a load instruction in a Java bytecode sequence). Note that for both definition and use instructions the memory location of the program variable may be transitory, as in the case of a cache memory location.

The expression "definition-use pair" refers to a definition instruction and a use instruction that operate on the same program variable. As discussed below, the candidates for instruction removal are definition-use pairs in which the constituent definition and use instructions are adjacent or can be made adjacent without destroying the integrity of the code sequence. For example, the double-headed arrow E in FIG. 5 indicates an adjacent definition-use pair. The other arrows A, B, C and D identify definition-use pairs in the listing that are not adjacent. According to one embodiment of the present invention, the instructions are reordered to the extent legally possible to create adjacent definition-use pairs. Herein, a legal instruction reordering or other change to a program code sequence refers to an instruction reordering or other change that does not affect the execution result of the code sequence, except that deeper stack usage may occur. As discussed below, one technique for ensuring the legality of instruction reordering is to prevent interlacing (i.e., partial overlapping) of stack-balanced blocks.

Removal of Adjacent Definition-Use Pairs; Subsequent Use Instructions

According to one embodiment, any adjacent definition-use pair may be removed so long as the variable referenced by the pair is not used in any subsequent instruction in the code sequence. Herein, use of a variable refers to pushing the variable onto the stack. The scope of a subsequent use determination is limited to the method, function or procedure that contains the definition-use pair. Assuming that the code sequence shown in FIG. 4 constitutes an entire function, then the pair E may legally be removed because variable 3 is not used after pair E. (Note that variable 3 is defined after pair E, but not used.) By removing pair E, the result of the isub instruction (preceding the istore_3) is cached on top of the stack until it is popped by the imul instruction in the subsequent stack-balanced block. Also, removal of the definition-use pair E causes the two stack-balanced blocks that contained its instructions to merge.

FIG. 6 illustrates the code sequence of FIG. 5 in a directed-acyclic graph (DAG). A post-order traversal of the graph yields the original code sequence, less the removal of definition-use pair E. Each sub-tree corresponds to a stack-balanced block. The dashed arrows A, B, C and D indicate the location in the DAG of the constituent instructions of definition-use pairs. The fact that the definition-use pair E has been removed from the code sequence can be seen in the rightmost two subtrees.

Overview of a Method According to One Embodiment

According to one embodiment of the present invention, the number of instructions in stack-based program code may be reduced using the method of FIG. 7. At step 61, stack-balanced blocks in the program code are identified. At step 63, definition-use instruction pairs in the program code, both adjacent and non-adjacent, are identified. For each of the definition-use pairs identified in step 63 that are adjacent, the program code is examined to determine whether there is a subsequent use instruction at step 65. Each adjacent definition-use pair for which there is no subsequent use instruction is removed from the program code at step 67. At step 69, push migration is performed to reorder push instructions ahead of other instructions in the program code to the extent legal. The determination of whether a given instruction reordering is legal is discussed below. At step 71, adjacent uses of the same program variable (i.e., adjacent use instructions) are replaced by a single use instruction and one or more instructions that cause the value at the top of the stack to be duplicated on the stack. This is a code replacement step and is discussed further below. According to one embodiment, if no code reordering or code replacement occurs in steps 69 and 71, respectively, then the method is completed; otherwise the method loops back to step 63 and steps 63, 65, 67, 69 and 71 are repeated. This is indicated by decision step 73. Thus, there may be multiple iterations of steps 63, 65, 67, 69 and 71 before the method is completed. The above-recited steps are discussed in further detail below.
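The iteration of FIG. 7 can be outlined as a loop that repeats until steps 69 and 71 change nothing. The sketch below is an assumption-laden outline, not the patented implementation: the class ReductionLoopSketch and its method names are hypothetical, the pair-removal and push-migration passes are supplied by the caller as placeholders, and only the adjacent-use replacement of step 71 is actually implemented, using the Java "dup" bytecode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

class ReductionLoopSketch {
    // Step 71: a load that immediately repeats the preceding load becomes "dup".
    static List<String> replaceAdjacentUses(List<String> code) {
        List<String> out = new ArrayList<>();
        String lastLoad = null; // the load whose value is currently on top of the stack
        for (String insn : code) {
            if (insn.startsWith("iload_") && insn.equals(lastLoad)) {
                out.add("dup");
            } else {
                out.add(insn);
                lastLoad = insn.startsWith("iload_") ? insn : null;
            }
        }
        return out;
    }

    // Repeats the passes until reordering (step 69) and replacement (step 71) change nothing.
    static List<String> reduce(List<String> code,
                               UnaryOperator<List<String>> removeAdjacentPairs,
                               UnaryOperator<List<String>> migratePushes) {
        List<String> current = code;
        while (true) {
            List<String> afterRemoval = removeAdjacentPairs.apply(current);  // steps 63-67
            List<String> afterMigration = migratePushes.apply(afterRemoval); // step 69
            List<String> afterDup = replaceAdjacentUses(afterMigration);     // step 71
            boolean changed = !afterDup.equals(afterRemoval);                // decision step 73
            current = afterDup;
            if (!changed) {
                return current;
            }
        }
    }

    public static void main(String[] args) {
        // Identity placeholders stand in for the passes that are not sketched here.
        List<String> result = reduce(
                Arrays.asList("iload_1", "iload_1", "iadd", "istore_2"),
                UnaryOperator.identity(),
                UnaryOperator.identity());
        System.out.println(result); // prints "[iload_1, dup, iadd, istore_2]"
    }
}
```

Passing the passes in as functions keeps the loop structure visible without committing to any particular legality rules for reordering.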

Identifying Stack-Balanced Blocks

According to one embodiment, stack-balanced blocks in a program code sequence are identified by examining the code sequence in reverse order. Each instruction is evaluated according to what it takes (i.e., pops) from the stack and what it gives (i.e., pushes) to the stack. Starting at the bottom instruction of the code sequence of FIG. 5 and proceeding upward:

    istore_3      // takes 1, gives 0, cum. stack depth: 1
    imul          // takes 2, gives 1, cum. stack depth: 2
    bipush 100    // takes 0, gives 1, cum. stack depth: 1
    iload_3       // takes 0, gives 1, cum. stack depth: 0

The bottom instruction of the code sequence is considered to mark the end of a stack-balanced sequence of instructions. Proceeding upward, when an instruction is reached that returns the stack depth to zero (i.e., zero stack depth relative to the starting stack depth of a basic block), that instruction is considered to mark the beginning of the stack-balanced sequence of instructions. The remainder of the code sequence is similarly examined to find other stack-balanced sequences of instructions.
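The bottom-up scan described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the `STACK_EFFECT` table, the `opcode` helper, and the string representation of instructions are assumptions made for the example, and the table covers only a handful of opcodes.

```python
# (pops, pushes) per opcode. Illustrative subset only (an assumption of this
# sketch), not a complete table of Java bytecode instructions.
STACK_EFFECT = {
    "iload": (0, 1), "aload": (0, 1), "bipush": (0, 1), "sipush": (0, 1),
    "istore": (1, 0), "astore": (1, 0),
    "iadd": (2, 1), "isub": (2, 1), "imul": (2, 1),
    "dup": (1, 2),
}

def opcode(instr):
    """'iload_3' -> 'iload', 'bipush 100' -> 'bipush'."""
    return instr.split()[0].split("_")[0]

def stack_balanced_blocks(code):
    """Return inclusive (start, end) index pairs found by scanning bottom-up."""
    blocks = []
    depth = 0
    end = None
    for i in range(len(code) - 1, -1, -1):
        pops, pushes = STACK_EFFECT[opcode(code[i])]
        if depth == 0:
            end = i                  # first instruction of a new scan marks a block end
        depth += pops - pushes       # what it takes minus what it gives
        if depth == 0:
            blocks.append((i, end))  # depth back to zero: block start found
    blocks.reverse()                 # report blocks in program order
    return blocks

code = ["iload_1", "istore_2", "iload_3", "bipush 100", "imul", "istore_3"]
print(stack_balanced_blocks(code))
```

The cumulative depth computed for the last four instructions follows the same 1, 2, 1, 0 progression as the listing above.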

Identifying Definition-Use Instruction Pairs

According to one embodiment, definition-use pairs are identified by traversing a program code sequence from beginning to end. Each time a definition (e.g., a store) of a given variable is found, it is noted and the scan continues. If a use (e.g., a load) of the same variable is found, then the store and load are marked as a definition-use pair. Single definitions, single uses, and "use-definition" pairs (e.g., a load of a variable followed by a store of the variable) are not considered to be definition-use pairs.
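A forward scan of this kind might look like the following Python sketch. The helper names and the string form of instructions are hypothetical. The sketch also assumes that each definition is paired only with the first use that follows it, so that later uses are treated as subsequent uses; the text does not spell this out.

```python
def access(instr):
    """Return ('store' | 'load', slot) for local-variable accesses, else None.

    Only the short forms such as 'istore_1' and 'aload_0' are recognized
    (an assumption of this sketch).
    """
    op, _, slot = instr.partition("_")
    if op in ("istore", "astore"):
        return "store", slot
    if op in ("iload", "aload"):
        return "load", slot
    return None

def definition_use_pairs(code):
    """Return (definition_index, use_index) pairs in program order."""
    pairs = []
    pending = {}                       # slot -> index of a store awaiting its first use
    for i, instr in enumerate(code):
        found = access(instr)
        if found is None:
            continue
        kind, slot = found
        if kind == "store":
            pending[slot] = i          # note the definition and keep scanning
        elif slot in pending:
            pairs.append((pending.pop(slot), i))
        # a load with no pending store is a single use or part of a
        # use-definition pair, and is not recorded
    return pairs

code = ["iload_1", "istore_1", "bipush 5", "iload_1", "iload_1", "istore_2"]
print(definition_use_pairs(code))
```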

Subsequent Use and Adjacent Definition-Use Pair Removal

As discussed above in reference to FIG. 5, adjacent definition-use pairs may be removed from the instruction sequence so long as there is no subsequent use of the variable. There are several techniques for determining whether there is a subsequent use, including live variable analysis, reaching definitions, use-definition chains, and others. Generally, any technique may be used to determine subsequent use without departing from the spirit and scope of the present invention. Once it is determined that the variable has no subsequent use, the adjacent definition-use pair (i.e., the store and load instructions) is removed from the code.
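As a rough illustration, the removal step can be sketched in Python with a deliberately simple subsequent-use test. The sketch assumes straight-line code with no branches and assumes the variable is dead at the end of the sequence; a real implementation would substitute one of the analyses named above. All function names are hypothetical.

```python
def access(instr):
    """Return ('store' | 'load', slot), or (None, None) for other instructions."""
    op, _, slot = instr.partition("_")
    if op in ("istore", "astore"):
        return "store", slot
    if op in ("iload", "aload"):
        return "load", slot
    return None, None

def has_subsequent_use(code, start, slot):
    """Linear scan: is `slot` loaded again before it is next stored?"""
    for instr in code[start:]:
        kind, s = access(instr)
        if s != slot:
            continue
        return kind == "load"          # a load is a use; a store kills the old value
    return False                       # assumption: variable is dead at the end

def remove_adjacent_pairs(code):
    """Drop each adjacent store/load of one variable that has no later use."""
    out = []
    i = 0
    while i < len(code):
        kind, slot = access(code[i])
        if (kind == "store" and i + 1 < len(code)
                and access(code[i + 1]) == ("load", slot)
                and not has_subsequent_use(code, i + 2, slot)):
            i += 2                     # skip both the definition and the use
            continue
        out.append(code[i])
        i += 1
    return out

print(remove_adjacent_pairs(["bipush 5", "istore_1", "iload_1", "istore_2"]))
```

Removing the pair leaves the value on the stack where the load would have put it, which is why the surrounding instructions need no adjustment.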

Push Migration

According to one embodiment, push migration is performed in an attempt to move all push instructions (e.g., load, bipush, sipush) as near to the beginning of the code sequence as possible. If a push instruction can be moved upward in the code sequence until it is adjacent to a corresponding pop of the same variable (i.e., if an adjacent definition-use pair can be created), then both the push and pop instructions can be removed from the code sequence. In one embodiment, push migration is governed by the following rules:

1. A load instruction may not be moved ahead of a store instruction which references the same variable.

2. An instruction which results in a stack push (i.e., a stack push instruction) may not be moved ahead of another stack push instruction that is in the same stack-balanced sequence of instructions. This rule prevents operands from being reversed in non-commutative operations (e.g., X-5 being illegally transformed into 5-X). This rule may be relaxed, however, for commutative operations including, but not limited to, addition and multiplication.

3. A stack push instruction may not be moved into a stack-balanced block from a location outside the stack-balanced block. (Note that a stack push instruction may be moved from one position within a stack-balanced block to another position within the stack-balanced block, and a stack push instruction may be moved between the end of one stack-balanced block and the start of the next stack-balanced block.)
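The three rules can be transcribed as a single legality predicate, sketched below in Python. The function name and the two flags `same_block` and `enters_block` are hypothetical; the caller is assumed to compute them from the stack-balanced block boundaries. The sketch encodes only the rules as stated and does not model the relaxation of rule 2 for commutative operations.

```python
PUSH_OPCODES = ("iload", "aload", "bipush", "sipush")
LOAD_OPCODES = ("iload", "aload")
STORE_OPCODES = ("istore", "astore")

def opcode(instr):
    """'iload_3' -> 'iload', 'bipush 100' -> 'bipush'."""
    return instr.split()[0].split("_")[0]

def slot(instr):
    """Local-variable slot of 'iload_1' style instructions, else None."""
    return instr.partition("_")[2] or None

def may_move_ahead(push, other, same_block, enters_block=False):
    """May `push` be moved ahead of `other`, the instruction just before it?

    same_block:   both instructions lie in the same stack-balanced sequence.
    enters_block: the move would carry `push` into a block from outside it.
    """
    if opcode(push) not in PUSH_OPCODES:
        return False                   # only push instructions migrate
    if enters_block:
        return False                   # rule 3
    if (opcode(push) in LOAD_OPCODES and opcode(other) in STORE_OPCODES
            and slot(push) == slot(other)):
        return False                   # rule 1
    if same_block and opcode(other) in PUSH_OPCODES:
        return False                   # rule 2 (commutative relaxation not modeled)
    return True

print(may_move_ahead("iload_1", "istore_1", same_block=False))  # rule 1 applies
print(may_move_ahead("iload_1", "istore_2", same_block=False))  # different variable
```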

FIG. 8A illustrates the state of the code in FIG. 5 after a first push migration has taken place (note that adjacent definition-use pair E remains in the example for clarity). One result that follows from reordering instructions according to the above-stated push migration rules is that stack-balanced blocks are prevented from interlacing. This is indicated by the double-headed arrows on the left side of the code sequence. According to one embodiment, a stack-balanced block may encompass another stack-balanced block, but may not include only part of another stack-balanced block (i.e., stack-balanced blocks may not be interlaced). By reordering instructions as discussed above, a deeper use of the stack may at times occur (i.e., a larger portion of stack memory may be used than without reordering), but the instruction reordering does not otherwise change the result achieved by executing the program.

As shown in FIG. 8A, definition-use pairs B, C, D and E are adjacent definition-use pairs. By removing these adjacent definition-use pairs, the code sequence in FIG. 8B is obtained. Note that definition-use pair A has not been removed because it is not an adjacent pair. Note also that removal of adjacent definition-use pairs B, C, D and E has caused the six stack-balanced regions of the FIG. 6 code listing to be merged into two stack-balanced regions. This is indicated by the two double-headed arrows on the left side of FIG. 8B.

Adjacent Use Replacement

Because of push migration, code sequences tend to be left in states where pushes are generally at the beginning of the code sequence and pops are at the end. In some cases, use instructions that have been made adjacent (e.g., as a result of push migration), and that push the same value onto the stack, can be replaced by a single use instruction and one or more stack-specific instructions that cause the value at the top of the stack to be duplicated. Examples of such instructions include the Java bytecode "dup", which duplicates the top stack element, and the Java bytecode "dup2", which duplicates the top two stack elements.

As an example of replacing adjacent-use instructions, consider the following instruction sequence that might result from push migration:

aload_0

aload_0

aload_0

aload_0

This code sequence includes four adjacent uses of variable 0 and may be replaced by the following:

aload_0

dup

dup2
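The replacement just shown can be generalized to a run of any length, as in the following Python sketch. The function names are hypothetical. The sketch assumes single-word values (int or reference loads), for which dup2 copies the top two stack entries; two-word long and double loads are left untouched.

```python
def is_single_word_load(instr):
    """True for short-form int/reference loads such as 'iload_1' or 'aload_0'."""
    return instr.split("_")[0] in ("iload", "aload")

def replace_adjacent_uses(code):
    """Replace each run of identical adjacent loads with one load plus dup/dup2."""
    out = []
    i = 0
    while i < len(code):
        instr = code[i]
        j = i
        while j < len(code) and code[j] == instr:
            j += 1
        run = j - i                    # number of identical adjacent instructions
        out.append(instr)
        if is_single_word_load(instr):
            copies = 1                 # copies of the value on the stack so far
            while copies < run:
                if copies >= 2 and run - copies >= 2:
                    out.append("dup2") # top two entries are equal: adds two copies
                    copies += 2
                else:
                    out.append("dup")  # adds one copy
                    copies += 1
        else:
            out.extend([instr] * (run - 1))   # leave other repeats alone
        i = j
    return out

print(replace_adjacent_uses(["aload_0"] * 4))
```

For a run of four loads the sketch emits the same three instructions as the example above: one load, one dup, one dup2.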

Not only are dup and dup2 instructions smaller than most load instructions (therefore consuming less code space), but in stack-based machines they usually execute faster than memory-accessing load instructions. Further, after adjacent use replacement, additional opportunities for adjacent pair removal may be created. As an example, consider the following code sequence:

istore_1

// . . . other instructions

iload_1

// . . . other instructions

iload_1

After push migration, the following code sequence is obtained:

istore_1

iload_1

iload_1

// . . . other instructions

Note that the istore instruction and the two iload instructions now form an adjacent definition-use pair followed by another use instruction. As discussed above, the definition-use pair removal rules of one embodiment do not permit removal of the adjacent definition-use pair because of the subsequent use. However, because the two iload instructions operate on the same variable (i.e., variable 1), they constitute adjacent use instructions that can be replaced as follows:

istore_1

iload_1

dup

Another iteration of the subsequent use determination step will now find that there are no further uses of variable 1 beyond the adjacent definition-use pair. The definition-use pair can now be removed. The dup instruction will remain to duplicate the value present at the top of the stack prior to execution of the now-removed istore_1 instruction.
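The equivalence of the three forms can be checked with a toy stack interpreter, sketched here in Python. The interpreter, its name `run`, and the starting stack value 7 are illustrative assumptions; only the three opcodes used in this example are supported.

```python
def run(code, stack, local_vars=None):
    """Execute a tiny subset of stack instructions and return the final stack."""
    stack = list(stack)
    local_vars = dict(local_vars or {})
    for instr in code:
        op, _, slot = instr.partition("_")
        if op == "istore":
            local_vars[slot] = stack.pop()     # pop the top value into the variable
        elif op == "iload":
            stack.append(local_vars[slot])     # push the variable's value
        elif op == "dup":
            stack.append(stack[-1])            # duplicate the top of the stack
        else:
            raise ValueError("unsupported instruction: " + instr)
    return stack

after_migration = ["istore_1", "iload_1", "iload_1"]
after_replacement = ["istore_1", "iload_1", "dup"]
after_removal = ["dup"]

for sequence in (after_migration, after_replacement, after_removal):
    print(sequence, "->", run(sequence, [7]))
```

All three sequences leave two copies of the starting value on the stack. They differ only in whether variable 1 is written, which does not matter when the variable has no subsequent use.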

Iteration

Some definition-use pairs are not removed in a first execution of the above-described steps of definition-use pair removal, push migration and adjacent use replacement (e.g., definition-use pair A in FIG. 8B). However, such pairs may be removed in subsequent iterations of the removal, migration and replacement steps. One reason for this is that adjacent definition-use pair removal merges stack-balanced blocks and creates opportunities for further push migration. FIG. 8B, for example, shows that adjacent pair removal has caused the bipush 55 instruction to be in the middle of an expanded stack-balanced sequence so that it may be moved upward during push migration. Migration of the bipush 55 instruction causes definition-use pair A to become an adjacent pair. Consequently, in a second iteration of the definition-use pair removal step, pair A is removed. This is shown in FIG. 8C. As indicated by the double-headed arrow on the left side of FIG. 8C, all of the stack-balanced blocks of the original code sequence have been merged into a single stack-balanced code sequence. This complete merger of stack-balanced blocks is not necessary, or even possible, in every case.

According to one embodiment, the steps of adjacent definition-use pair removal, push migration and adjacent use replacement are repeated until no further push migration or adjacent use replacement is possible. At that point, other techniques for reducing the number of instructions in the code sequence may be applied.
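The repeat-until-no-change structure can be sketched as a small driver in Python. This is a simplified sketch under stated assumptions: it stops when a full round of passes leaves the code unchanged, push migration is omitted, and the two passes are minimal stand-ins that handle only int variables in straight-line code. All names are hypothetical.

```python
def remove_pair(code):
    """Drop one adjacent 'istore_N, iload_N' that has no later 'iload_N'.

    Conservative: any later load of N counts as a subsequent use.
    """
    for i in range(len(code) - 1):
        definition, use = code[i], code[i + 1]
        if (definition.startswith("istore_")
                and use == "iload_" + definition[len("istore_"):]
                and use not in code[i + 2:]):
            return code[:i] + code[i + 2:]
    return code

def replace_adjacent_use(code):
    """Turn the second of two identical adjacent iloads into 'dup'."""
    for i in range(len(code) - 1):
        if code[i].startswith("iload_") and code[i + 1] == code[i]:
            return code[:i + 1] + ["dup"] + code[i + 2:]
    return code

def reduce_instructions(code, passes):
    """Apply the passes in order, repeating until a whole round changes nothing."""
    while True:
        before = code
        for apply_pass in passes:
            code = apply_pass(code)
        if code == before:
            return code

result = reduce_instructions(["istore_1", "iload_1", "iload_1"],
                             [remove_pair, replace_adjacent_use])
print(result)
```

The loop terminates because every change either shortens the sequence or replaces a load with a dup, and neither can happen indefinitely.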

Having described a method for practicing the present invention, it should be noted that the individual steps therein may be performed by a processor programmed with instructions that cause the processor to perform the recited steps, specific hardware components that contain hard-wired logic for performing the recited steps, or any combination of programmed computer components and custom hardware components. Nothing disclosed herein should be construed as limiting the present invention to a single embodiment wherein the recited steps are performed by a specific combination of hardware components. Moreover, in the case of a programmed processor implementation, sequences of instructions which may be executed by a processor to carry out the method of the present invention may be stored and distributed on a computer readable medium or may be transmitted on a transmission medium via a carrier wave.

In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

* * * * *