Parameterized States In Symbolic Execution For Software Testing Li; Guodong ; et al. [Fujitsu Limited]

Parameterized States In Symbolic Execution For Software Testing

Li; Guodong ; et al.

Patent Application Summary

U.S. patent application number 14/253342 was filed with the patent office on 2015-10-15 for parameterized states in symbolic execution for software testing. This patent application is currently assigned to Fujitsu Limited. The applicant listed for this patent is Fujitsu Limited. Invention is credited to Indradeep Ghosh, Guodong Li.

Application Number	20150293831 14/253342
Document ID	/
Family ID	54265170
Filed Date	2015-10-15

United States Patent Application	20150293831
Kind Code	A1
Li; Guodong ; et al.	October 15, 2015

PARAMETERIZED STATES IN SYMBOLIC EXECUTION FOR SOFTWARE TESTING

Abstract

Methods and systems for symbolic execution of software under test include the use of parametric states to losslessly represent a group of concrete execution states. Mathematical abstractions may represent differences between execution states and may define a parametric constraint for a parametric state. The parametric states may be usable for symbolic execution to reduce an amount of memory resources consumed and/or reduce a computational load during symbolic execution. Using parametric states, a larger state space and more program behaviors may be testable using symbolic execution.

Inventors:

Li; Guodong; (San Jose, CA) ; Ghosh; Indradeep; (Cupertino, CA)

Applicant:

Name	City	State	Country	Type
Fujitsu Limited	Kawasaki-shi		JP

Assignee:

Fujitsu Limited
Kawasaki-shi
JP

Family ID:

54265170

Appl. No.:

14/253342

Filed:

April 15, 2014

Current U.S. Class:	717/124
Current CPC Class:	G06F 11/36 20130101; G06F 11/3676 20130101; G06F 11/3608 20130101
International Class:	G06F 11/36 20060101 G06F011/36

Claims

1. A computer-implemented method for symbolic execution of executable code using parametric states, comprising: identifying shared state elements and different state elements in execution states associated with the executable code; representing the different state elements using a state abstraction; based on the state abstraction, defining a parametric state having a parametric constraint, the parametric state losslessly representing a group of the execution states, wherein an execution state in the group of execution states is an instance of the parametric state; and symbolically executing the executable code using a processor having access to memory media, including using the parametric state.

2. The method of claim 1, wherein representing the different state elements using a state abstraction includes: simplifying at least some of the different state elements; and identifying common patterns in the different state elements.

3. The method of claim 2, further comprising: matching the identified common patterns with stored common patterns.

4. The method of claim 2, wherein defining the parametric state includes: defining the parametric state based on the identified common patterns.

5. The method of claim 1, wherein the group of execution states represents iterations of a loop, further comprising: defining a path condition and a value store of the parametric state from a loop body of the loop, wherein defining the parametric state includes defining the parametric constraint further based on a loop header of the loop.

6. The method of claim 1, wherein symbolically executing the executable code using a processor having access to memory media, including using the parametric state, further comprises: producing test cases using the parametric state; and using the test cases, performing bug checking on the group of states.

7. The method of claim 1, further comprising: replacing the group of execution states with the parametric state to reduce a number of the execution states associated with the executable code.

8. The method of claim 7, wherein replacing the group of execution states with the parametric state results in a reduction of memory resources consumed by the symbolic execution.

9. The method of claim 1, wherein the state abstraction is selected to minimize a complexity of the parametric state.

10. One or more non-transitory computer-readable media storing processor-executable instructions for symbolic execution of executable code using parametric states, the instructions, when executed by a processor, for causing the processor to: identify shared state elements and different state elements in execution states associated with the executable code; represent the different state elements using a state abstraction; based on the state abstraction, define a parametric state having a parametric constraint, the parametric state losslessly representing a group of the execution states, wherein an execution state in the group of execution states is an instance of the parametric state; and use the parametric state for symbolic execution.

11. The one or more computer-readable media of claim 10, wherein the instructions to represent the different state elements using a state abstraction includes instructions to: simplify at least some of the different state elements; and identify common patterns in the different state elements.

12. The one or more computer-readable media of claim 11, further comprising instructions to: match the identified common patterns with stored common patterns.

13. The one or more computer-readable media of claim 11, wherein the instructions to define the parametric state include instructions to: define the parametric state based on the identified common patterns.

14. The one or more computer-readable media of claim 10, wherein the group of execution states represents iterations of a loop, further comprising instructions to: define a path condition and a value store of the parametric state from a loop body of the loop, wherein the instructions to define the parametric state include instructions to define the parametric constraint further based on a loop header of the loop.

15. The one or more computer-readable media of claim 10, wherein the instructions to use the parametric state for symbolic execution, further comprise instructions to: produce test cases using the parametric state; and using the test cases, perform bug checking on the group of states.

16. The one or more computer-readable media of claim 10, further comprising instructions to: replace the group of execution states with the parametric state to reduce a number of the execution states associated with the executable code.

17. The one or more computer-readable media of claim 16, wherein the instructions to replace the group of execution states with the parametric state, when executed, result in a reduction of memory resources consumed by the symbolic execution.

18. The one or more computer-readable media of claim 16, wherein the instructions to replace the group of execution states with the parametric state, when executed, result in a reduction of a computational load of the processor for the symbolic execution.

19. The one or more computer-readable media of claim 10, wherein the state abstraction is selected to minimize a complexity of the parametric state.

Description

BACKGROUND

[0001] 1. Field of the Disclosure

[0002] The present disclosure relates to symbolic execution for software testing and, specifically, to parameterized states in symbolic execution.

[0003] 2. Description of the Related Art

[0004] With the ubiquitous presence of software programs permeating many aspects of daily life, software testing by validating and verifying software code to provide robust and reliable executable programs has become an important part of the software development process. Traditionally, software quality has been assured through manual software testing, which may be tedious, difficult, and may often give poor coverage of the software code. More recently, automated techniques for formal software testing have been developed. One such technique is symbolic execution.

[0005] Symbolic execution is a non-explicit state model-checking technique that treats input to a software program as symbol variables. Symbolic execution includes creation of complex equations by executing finite paths in the software program with symbolic variables and then solving the complex equations with a solver, typically known as a decision procedure, to obtain error scenarios, if any exist. In contrast to explicit state model checking, symbolic execution may be able to work out possible input values and possible use cases of the possible input values in a software program under analysis. Symbolic execution may be used to automatically generate test inputs with high structural coverage for the software under test. However, during symbolic execution, a very large number of execution paths and/or states may be generated that may limit the applicability of symbolic execution due to high consumption of computational resources.

SUMMARY

[0006] In one aspect, a disclosed method for symbolic execution of executable code using parametric states includes identifying shared state elements and different state elements in execution states associated with the executable code. The method may include representing the different state elements using a state abstraction. Based on the state abstraction, the method may include defining a parametric state having a parametric constraint, the parametric state losslessly representing a group of the execution states. An execution state in the group of execution states may be an instance of the parametric state. The method may further include symbolically executing the executable code using a processor having access to memory media, including using the parametric state.

[0007] Additional disclosed aspects for symbolic execution of executable code using parametric states may include a non-transitory computer-readable memory media including instructions executable by a processor and a system including a processor having access to memory media.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] FIG. 1 is a block diagram of selected elements of an embodiment of a symbolic execution system;

[0009] FIG. 2 is a block diagram of selected elements of an embodiment of a parameterization process;

[0010] FIG. 3 is a flow chart depicting selected elements of an embodiment of a method for parameterized state merging;

[0011] FIGS. 4A, 4B, 4C, and 4D are control flow graphs of selected elements of an embodiment of an example of parameterized state merging;

[0012] FIG. 5 is a block diagram of selected elements of an embodiment of a symbolic execution system; and

[0013] FIG. 6 is a flow chart depicting selected elements of an embodiment of a method for symbolic execution using parameterized states.

DESCRIPTION OF PARTICULAR EMBODIMENT(S)

[0014] In the following description, details are set forth by way of example to facilitate discussion of the disclosed subject matter. It should be apparent to a person of ordinary skill in the field, however, that the disclosed embodiments are exemplary and not exhaustive of all possible embodiments.

[0015] Throughout this disclosure, a hyphenated form of a reference numeral refers to a specific instance of an element and the un-hyphenated form of the reference numeral refers to the element generically or collectively. Thus, as an example (not shown in the drawings), widget "12-1" refers to an instance of a widget class, which may be referred to collectively as widgets "12" and any one of which may be referred to generically as a widget "12". In the figures and the description, like numerals are intended to represent like elements.

[0016] As used herein, "symbolic execution" refers to methods and operations for software testing that simulates (or emulates) the execution of software under test using symbolic rather than concrete inputs. Symbolic execution may compute the effect of these symbolic inputs in the software under test using symbolic expressions. Symbolic execution may be used to formally test and validate various types of software programs and/or modules. As used herein, an "executor" refers to an execution engine that performs symbolic execution on a "software under test (SUT)". The software under test may be a complete application program or may be one or more selected code modules.

[0017] In particular embodiments, symbolic execution characterizes each execution path in the software under test with a path condition that may be defined as a conjunction of boolean expressions. Each boolean expression in the path condition may denote one branching decision made during the symbolic execution of a distinct path of the software under test. When the symbolic execution is finished, multiple path conditions may be generated, each path condition corresponding to a feasible execution path of the program code with respect to the symbolic inputs. The solutions to the path conditions may be used as test inputs so that the software under test is executed along a particular defined execution path. In particular embodiments, a decision procedure, such as a SMT (Satisfiability Modulo Theory) solver, is used to find the solutions to the path conditions and to remove false paths. When such solutions are obtained for actual execution paths in the software under test, then exhaustive testing of the program or module may be possible. In addition, some so-called "sanity properties" may be checked during symbolic execution, such as memory out-of-bound access, divide-by-zero, and certain types of user-defined assertions.

[0018] Although symbolic execution may validate the software under test and may achieve greater test coverage of program behaviors than traditional testing approaches, symbolic execution may be computationally intensive and may consume a significant amount of computational resources, such as processor time, memory space, number of processors, etc., which may adversely impact the economic feasibility of symbolic execution. For example, symbolic execution may suffer from a path and/or state explosion problem, in which the software under test may result in a computationally intractable symbolic model for processing by the executor. As a result of the path/state explosion problem, the executor may process a large number of execution paths and associated states and may fail to output a set of useful test cases in a reasonable time. Thus, a too large number of execution paths and associated states may represent a primary bottleneck for symbolic execution, and may limit scalability and/or applicability of symbolic execution to certain software under test, which may be undesirable.

[0019] As will be presented herein, the execution states and corresponding paths of a software under test may be represented using a state tree, which may include a number of nodes representing states and a number of edges between the nodes representing path conditions. Each node in the state tree may represent an execution state containing an instruction counter, a path condition, and a value store. Each edge may represent a constraint connecting one node to another.

[0020] In U.S. Patent Publication No. 2012/0311545, certain problems associated with excessive execution paths during symbolic execution are addressed with methods and systems for losslessly reducing a number of execution paths and a number of states of a software under test comprising a basic code block absent any intermediate execution jumps or jump targets. As will be described in further detail herein, the inventors of the present disclosure have discovered methods and systems for parameterized state management in symbolic execution that may provide state merging techniques using parameterized methods. The methods and systems for parameterized state management in symbolic execution described herein may be usable to improve resource consumption of symbolic execution and, thus, may result in a faster and more efficient generation of useful test cases using symbolic execution, which may be beneficial for automated software testing. It is noted that the methods and systems described herein may be usable with previous improvements and solutions used in symbolic execution.

[0021] As used herein, a parametric state ("pstate") in symbolic execution may define a group of concrete states ("estate") that are represented using symbolic parameters. In this manner, a pstate may be used to reduce complexity in symbolic execution by avoiding enumeration of individual cstates. A pstate may represent an exact encoding of a full set of enumerated cstates. Pstates may be particularly applicable to loops and switch statements (or case statements) in symbolic execution where iterations and cases are independent of one another.

[0022] Turning now to the drawings, FIG. 1 illustrates a block diagram of selected elements of an embodiment of symbolic execution system 100. As shown, symbolic execution system 100 may represent a collection of tools for performing symbolic execution, as described herein. In certain embodiments, at least certain portions of symbolic execution system 100, such as executor 120, may represent executable code and/or instructions executable by a processor to perform symbolic execution (see also FIG. 5). In FIG. 1, executor 120 may represent an execution engine that performs symbolic execution on executable code (SUT) 110, which represents software under test. It is noted that symbolic execution system 100 may be used for various kinds of executable code (SUT) 110, including executable code in different languages and/or executable code compiled using different types or versions of compilers. As shown in FIG. 1, executor 120 may generate test cases 150 and statistics 152 as output of the symbolic execution. Test cases 150 may represent results of actual executable code that may be used to analyze and evaluate executable code (SUT) 110, for example, for bug checking and/or for other run-time errors. In particular embodiments, symbolic expressions obtained by symbolically executing executable code (SUT) 110 by executor 120 may be solved using, for example, an SMT solver. The solutions, if any, may be used to generate test cases 150 for testing and validating executable code (SUT) 110. Statistics 152 may represent statistical values describing test cases 150, such as statistics for tracking coverage of specific execution paths within executable code (SUT) 110.

[0023] As shown in FIG. 1, executor 120 may include runtime environment 122, optimizer 124, executor manager 126, state manager 128, symbolic expression handler 130, and concrete interpreter 130. Runtime environment 122 may be specific to a particular type of programming language or compiler and may enable simulation of runtime execution of executable code (SUT) 110. Optimizer 124 may include functionality to optimize states and/or paths used in symbolic execution. Executor manager 126 may be responsible for managing an execution environment of executor 120. State manager 128 may include functionality for generating and managing instances of states in symbolic execution, including cstates and pstates. Symbolic expression handler 130 may be used to process and solve symbolic expressions in symbolic execution. Concrete interpreter 130 may be used to analyze execution properties of concrete states in symbolic execution.

[0024] In operation of symbolic execution system 100, parametric states may be used to represent a set of concrete states without losing information. In other words, a set of concrete states may be losslessly merged into one or more parametric states for symbolic execution. The generation of the parametric states may include determining whether the merging of the cstates is possible, including determining whether the differences among the cstates are suitable for parametric formulation. In certain instances, simplifications and optimizations may be applied to the cstates to enable merging into pstates. Also, from many possible abstraction schemes for defining the pstates, a best abstraction scheme may be determined. Finally, the parameterization of cstates into pstates may be performed in a manner that improves overall computational efficiency of symbolic execution such that the benefits from using a pstate are greater than the costs of parameterization of the pstate.

[0025] Referring now to FIG. 2, a block diagram of selected elements of an embodiment of parameterization process 200 is depicted. As shown, parameterization process 200 may represent operations and data used for parameterized state merging in symbolic execution, as described herein. In particular embodiments, parameterization process 200 may be performed using symbolic execution system 100 (see FIG. 1), for example, by state manager 128. Specifically, parameterization process 200 may be used to generate pstate 204 representing one or more parameterized states that fully describe a set of cstates 202.

[0026] In conventional symbolic execution, a concrete execution state (i.e., a cstate) may include an instruction pointer value (IP), a path condition (PC), and a value store (not shown in the drawings). The IP may mark a current execution point during symbolic execution. It will be assumed herein that two cstates are merged into a pstate only when they have the same IP. The PC may represent a set of symbolic constraints that an execution state satisfies. The value store may map addresses to values, which can be concrete or symbolic values. A parametric state or pstate may represent a group of cstates having shared patterns. The differences in the group of cstates that may be combined into a pstate may be described using a symbolic formula. The differences in the group of cstates used to form a pstate may be differences in path conditions and/or differences in symbolic values. Accordingly, a pstate produced by parametric state merging, as described herein, may include an additional portion, namely a parametric constraint ("pconstraint"), which is a symbolic boolean formula that refers to the characteristics of the group of cstates. In other words, the pconstraint summarizes the differences in the group of cstates that are used to form a pstate. Thus, a cstate may be viewed as an instance of a corresponding pstate, while a pstate may completely describe all associated cstates. From the perspective of the executor, after parameterized state merging, a given pstate may behave exactly the same as each cstate in the group of cstates.

[0027] In FIG. 2, parameterization process 200 describes operations in a particular embodiment of parameterized state merging of cstates 202 into pstate 204. Cstates 202 are represented by a set of N number of cstates 202, shown as cstate 202-1, cstate 202-2, and so on, up to cstate 202-N. Cstates 202 may be received by simplification 210, which may analyze cstates 202 to determine opportunities for expression simplification and normalization. In one embodiment, simplification 210 may attempt to match at least certain portions of state trees for cstates 202 to ascertain whether the differences may be abstracted into a generalized and simplified form. Simplification 210 may employ symbolic reasoning to solve and simplify expressions associated with cstates 202. Then pattern identification 212 may access pattern database 250 to search for patterns in expressions associated with cstates 202 after simplification 210. In some embodiment, pattern identification 212 may attempt to find invariants among PCs and value stores of cstates 202 that may indicate possibilities for state merging. Pattern database 250 may represent a knowledge store of common invariant patterns that are accessible for matching by pattern identification 212. After pattern identification 212, parameterized merging 214 may then use the patterns identified by pattern identification 212 and/or known state abstraction schemes (not shown) to generate pstate 204, which may represent one or more execution states representing all cstates 202 for purposes related to symbolic execution.

[0028] Turning now to FIG. 3, a block diagram of selected elements of an embodiment of method 300 for parameterized state merging is depicted in flow-chart form. Method 300 may be performed by executor 120 to implement parameterization process 200 (see FIGS. 1 and 2). It is noted that certain operations described in method 300 may be optional or may be rearranged in different embodiments.

[0029] Method 300 may begin (operation 301) with a decision whether any instruction pointers (IP) are left for execution (operation 302). The IP in operation 302 may represent instructions associated with a software under test using symbolic execution. When the result of operation 302 is NO, method 300 may end (operation 304). When the result of operation 302 is YES, a next IP may be executed and produce (operation 306) a new cstate S. A path condition (PC) and/or symbolic expressions of cstate S may be simplified (operation 308). Operation 308 may include normalization of expressions in cstate S. Then, a decision may be made whether a merge point has been reached (operation 310) When the result of operation 310 is NO, method 300 may return to operation 302. When the result of operation 310 is YES, a decision may be made whether cstate S can be absorbed into an existing pstate P (operation 312). Pstate P may absorb cstate S when pstate P is extendable to accommodate cstate S by extending P's parametric condition. When the result of operation 312 is YES, the parametric condition of pstate P may be updated (operation 316). When the result of operation 312 is NO, a decision may be made whether a state P' exists having the same IP as cstate S (operation 314). The state P' may be a cstate or a pstate. When the result of operation 314 is NO, method 300 may return to operation 302. When the result of operation 314 is YES, a decision may be made whether cstate S can be merged with P' (operation 320). The condition in operation 320 may be satisfied when PCs of cstate S and state P' may be abstracted to a common PC, and when variables in cstate S and state P' may be abstracted into a common variable or described with logical evaluations, such as if-then-else statements. When the result of operation 320 is NO, method 300 may return to operation 314. When the result of operation 320 is YES, cstate S and state P' are merged to produce a new parametric state P.sub.n. (operation 322). In certain embodiments, state P' may be reused for the new parametric state P.sub.n. Upon merging in operation 322, the new parametric state P.sub.n may include an abstraction of cstate S. After operation 322 and after operation 316, cstate S and state P, when present, may be removed (operation 324). After operation 324, method 300 may return to operation 302.

[0030] A first example of parametric state merging will now be described in further detail. In the first example, consider the following Expressions 1 and 2 that may appear as cstates in a software under test:

2a+b+c+1 Expression 1

3+b+c+6a Expression 2

[0031] In a first method, common patterns and/or subpatterns may result in identification of a pconstraint for Expressions 1 and 2. For example, using expression simplification and normalization, differences in the expressions may be identified and evaluated for possible abstraction. For example, Expression 2 may be normalized according to variable order to result in Expression 3.

6a+b+c+3 Expression 3

[0032] The differences between Expressions 1 and 3 may now be identified as the terms 2a, 6a and 1, 3. From these differences, a symbolic variable i may be introduced where i has values of (1, 3) to obtain the abstraction in Expression 4, which may represent the pconstraint for a pstate that includes both original Expressions 1 and 2.

(2i*a)+b+c+i Expression 4

[0033] An expression may be absorbed into another expression by augmenting a variable value domain. For example, the following Expression 5 is an instance of Expression 4 for i=5, showing that the abstraction format in Expression 4 may also absorb Expression 5.

10a+b+c+5 Expression 5

[0034] Instead of the first method in the first example, a second method to obtain Expression 4 may involve using an invariant tool and refining to constrain possible values of i. Instead of the first method and/or the second method in the first example, a third method may involve using a pattern library to compare Expressions 1 and 2 with common invariant patterns, such as first order polynomials, to match with Expression 4.

[0035] Turning now to FIGS. 4A, 4B, 4C, and 4D, selected elements of embodiments of a second example of parametric state merging are shown as respective state trees 400, 401, 402, and 403. The second example illustrated in FIGS. 4A-D is based on symbolic execution of the following executable code under test, in which the symbolic inputs are the variables: n, a, and b. In FIGS. 4A-D, state trees for the symbolic execution are illustrated to show an execution state after a complete iteration of the executable code under test for values of n from 1 to 1000.

TABLE-US-00001 int k = 0, v = 0; for (int i = 1; i <= 1000; i++) { if (i == n) { if (k > f (i * b)) v = a + i * b; } k += 2; } assert(v + k != 500);

Executable Code Under Test for the Second Example.

[0036] In FIG. 4A, state tree 400 represents completed execution states when a first iteration of the for-loop is complete. State S1 may represent the initial code int k=0, v=0, while state S2 may represent the first for-loop entry when the condition "i==n" specifies that the path condition is n=1. The case with the condition "i.noteq.n" is not shown. States S3 and S4 may represent the two possibilities of the conditional if-statement code. Thus, after the completed execution state for n=1, for state S3, 0>f(b), v=a+b and k=2, while for state S4, 0.ltoreq.f(b), v=0 and k=2. Because this is the first iteration of the symbolic execution, no pconstraints have yet been identified and no pstates exist yet.

[0037] In FIG. 4B, state tree 401 represents completed execution states when the second iteration of the for-loop is complete, where the states for n=2 are produced. State S1 may represent the initial code int k=0, v=0, while states S2 and S5 may represent the for-loop entry code respectively for n=1 and n=2. States S3 and S4 may represent the two possibilities of the conditional if-statement code for n=1. When S3 and S4 execute the second loop iteration with n=2, since condition "2==n" conflicts with "n=1", states S3 and S4 skip executing the if-statement, and increase variable k's value by 2 to be 4. Similarly, states S6 and S7 may represent the two possibilities of the conditional if-statement code for n=2. Thus, after the completed execution state for n=2, for state S6, 2>f(b), v=a+2b and k=4, while for state S7, 2.ltoreq.f(b), v=0 and k=4. At this time, state S6 is not an instance of any existing pstate, but does share the same IP as states S3 and S4. Thus, states S3 and S6 may be abstracted to 2(n-1)>f(b) with v=a+n*b. State S3 may be reused for the parametric state, and state S6 may be removed. At this time, state S7 is not an instance of any existing pstate, but does share the same IP as states S3 and S4. Thus, states S4 and S7 may be abstracted to 2(n-1).ltoreq.f(b) with v=0. State S4 may be reused for the parametric state, and state S7 may be removed. State S5 may also be removed when states S6 and S7 are removed since state S2 corresponds to state S5.

[0038] In FIG. 4C, state tree 402 represents completed execution states when the third iteration of the for-loop is complete, where the states for n=3 are produced. State S1 may represent the initial code int k=0, v=0, while states S2 and S8 may represent the for-loop entry code respectively for n=1, 2 and n=3. States S3 and S4 may now represent two pstates for n.epsilon.[1,2], and states S9 and S10 may represent the two possibilities of the conditional if-statement code for n=3. Thus, after the completed execution state for n=3, for state S9, 4>f(3b), v=a+3b and k=6, while for state S10, 4.ltoreq.f(3b), v=0 and k=6. At this time, state S9 is an instance of pstate S3 and state S10 is an instance of pstate S4. Thus, pstates S3 and S4 may absorb states S9 and S10 respectively and states S8, S9, and S10 may be removed.

[0039] In FIG. 4D, state tree 403 represents completed execution states when the 1000.sup.th iteration of the for-loop is complete, where the states for n=1000 are produced. State S1 may represent the initial code int k=0, v=0, while states S2 and S11 may represent the for-loop entry code respectively for n=1, 2 . . . , 999 and n=1000. States S3 and S4 may now represent two pstates for n.epsilon.[1,999], and states S12 and S13 may represent the two possibilities of the conditional if-statement code for n=1000. Thus, after the completed execution state for n=1000, for state S12, 1998>f(1000b), v=a+1000b and k=2000, while for state S13, 1998.ltoreq.f(1000b), v=0 and k=2000. At this time, state S12 is an instance of pstate S3 and state S13 is an instance of pstate S4. Thus, pstates S3 and S4 may absorb states S12 and S13 respectively and states S11, S12, and S13 may be removed.

[0040] Although iterations for certain values of n have been omitted from the second example in FIGS. 4A-D, it will be evident that in FIG. 4D, pstates S3 and S4 now represent pstates for n c [1,1000] without any loss of information. After merging into pstates S3 and S4, the assertion in the SUT may be checked accurately using the two pstates S3 and S4 rather than iterating through the original 2000 cstates, resulting in a substantial reduction of the computational cost for bug finding, as well as reducing the number of generated tests. Thus, in the second example, 2000 cstates have been replaced with 2 pstates, resulting in a substantial reduction of states for symbolic execution, with a corresponding reduction in computational resource consumption associated with the symbolic execution, the computational resource consumption including a memory consumption and/or a computational load of a processor performing the symbolic execution.

[0041] As shown in the second example above, parametric state merging, as described herein, may be particularly effective for loop structures during symbolic execution. The iterations of a loop may often exhibit similar behavior and may differ mainly in how the loop index is used. This behavior may be used to define pconstraints, for example, in the loop header, while common patterns, including PCs and value stores, may be identified in the loop body.

[0042] In various embodiments, SUT including the following loop structure may be investigated statically to infer pconstraints and common patterns. The two loop headers imply that a pconstraint candidate is i.epsilon.[k1, . . . ].LAMBDA.j.epsilon.[i, . . . ] for symbolic values i and j. The loop body implies that the parametric path condition is c(i,j) and the parametric value of v is f(i,j).

TABLE-US-00002 for (int i = k1; ...) { ... for (int j = i, ...) { ... if (c(i,j)) { v = f(i,j); ... } } }

Executable Loop Structure for Inferring Parametric Patterns.

[0043] Referring now to FIG. 5, a block diagram showing selected elements of an embodiment of symbolic execution system 500 is illustrated. As depicted in FIG. 5, symbolic execution system 500 includes processor 501 and memory media 510, which may communicate using system bus 502. Also shown accessible via system bus 502 is network adapter 520 that may provide connectivity to a network.

[0044] As shown in FIG. 5, memory media 510 may represent volatile, non-volatile, fixed, and/or removable media, and may be implemented using magnetic and/or semiconductor memory. Memory media 510 is capable of storing instructions and/or data. As shown, memory media 510 stores instructions (e.g., code executable by processor 501, such as a computer processor or microprocessor) including operating system (OS) 512, and executor 120 (see FIG. 1). Operating system 512 may be any of a variety of operating systems, such as a UNIX variant, LINUX, a Microsoft Windows.RTM. operating system, or a different operating system. As noted previously, executor 120 may include state manager 128, which may provide functionality for generating and managing instances of states in symbolic execution, including cstates and pstates, as described herein.

[0045] Turning now to FIG. 6, a block diagram of selected elements of an embodiment of method 600 for symbolic execution using parameterized states is depicted in flow chart form. Method 600 may be performed by executor 120 to implement parameterization process 200 (see FIGS. 1 and 2). It is noted that certain operations described in method 600 may be optional or may be rearranged in different embodiments.

[0046] Method 600 may begin by identifying (operation 602) shared state elements and different state elements in execution states associated with the executable code. The different state elements may be represented (operation 604) using a state abstraction. The state abstraction may be selected to minimize a complexity of the parametric state. Operation 604 may include mathematically simplifying at least some of the different state elements, and identifying common patterns in the different state elements. The identified common patterns may be matched with stored common patterns. Based on the state abstraction, method 300 may include defining (operation 606) a parametric state having a parametric constraint, the parametric state losslessly representing a group of the execution states. Operation 606 may include defining the parametric state based on the common patterns. An execution state in the group of execution states may be an instance of the parametric state. The parametric state may be used (operation 608) for symbolic execution. Operation 608 may include producing test cases using the parametric state, and using the test cases, performing bug checking on the group of states.

[0047] In certain embodiments, the group of execution states represents iterations of a loop and method 600 includes defining the parametric constraint based on a loop header of the loop, and defining a path condition and a value store of the parametric state from a loop body of the loop (not shown in FIG. 6). In various embodiments, method 600 includes replacing the group of execution states with the parametric state to reduce a number of the execution states associated with the executable code (not shown in FIG. 6), which may include reducing memory resources consumed by the symbolic execution.

[0048] As disclosed herein, methods and systems for symbolic execution of software under test include the use of parametric states to losslessly represent a group of concrete execution states. Mathematical abstractions may represent differences between execution states and may define a parametric constraint for a parametric state. The parametric states may be usable for symbolic execution to reduce an amount on computational resources. Using parametric states, a larger state space and more program behaviors may be testable using symbolic execution.

[0049] The above disclosed subject matter is to be considered illustrative, and not restrictive, and the appended claims are intended to cover all such modifications, enhancements, and other embodiments which fall within the true spirit and scope of the present disclosure. Thus, to the maximum extent allowed by law, the scope of the present disclosure is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description.

* * * * *