Prediction-based distributed parallel simulation method

Patent Grant 8781808

U.S. patent number 8,781,808 [Application Number 12/987,481] was granted by the patent office on 2014-07-15 for prediction-based distributed parallel simulation method. The grantee listed for this patent is Sei Yang Yang. Invention is credited to Sei Yang Yang.

United States Patent	8,781,808
Yang	July 15, 2014

Abstract

The simulation consists of a front-end simulation and a back-end simulation. The front-end simulation can use an equivalent model at a different abstraction level, or the simulation model used for the back-end simulation. The back-end simulation uses the simulation result of the front-end simulation so that it can execute one or more simulation runs sequentially or in parallel. Alternatively, models at a lower level of abstraction are simulated together with a model at a higher level of abstraction in parallel using two or more simulators.

Inventors:

Yang; Sei Yang (Busan, KR)

Applicant:

Name	City	State	Country	Type
Yang; Sei Yang	Busan	N/A	KR

Family ID:

44309617

Appl. No.:

12/987,481

Filed:

January 10, 2011

Prior Publication Data


	Document Identifier	Publication Date
	US 20110184713 A1	Jul 28, 2011

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number	Issue Date
12089665	Apr 9, 2008
PCT/KR2006/004059	Oct 10, 2006

Foreign Application Priority Data


Oct 10, 2005 [KR]			10-2005-0095803
Oct 18, 2005 [KR]			10-2005-0098941
Dec 12, 2005 [KR]			10-2005-0122926
Dec 19, 2005 [KR]			10-2005-0126636
Jan 20, 2006 [KR]			10-2006-0006079
Mar 1, 2006 [KR]			10-2006-0019738
Apr 25, 2006 [KR]			10-2006-0037412
Apr 27, 2006 [KR]			10-2006-0038426
May 15, 2006 [KR]			10-2006-0043611
May 29, 2006 [KR]			10-2006-0048394
Jul 23, 2006 [KR]			10-2006-0068811
Sep 22, 2006 [KR]			10-2006-0092573
Sep 22, 2006 [KR]			10-2006-0092574

Current U.S. Class:	703/13
Current CPC Class:	G06F 30/33 (20200101)
Current International Class:	G06F 17/50 (20060101); G06F 7/62 (20060101)
Field of Search:	703/13

References Cited

U.S. Patent Documents


5801938	September 1998	Kalantery
6134514	October 2000	Liu et al.
6182247	January 2001	Hermann et al.
6182258	January 2001	Hollander
6247147	June 2001	Beenstra et al.
6286114	September 2001	Veenstra et al.
6345240	February 2002	Havens
6389558	May 2002	Hermann et al.
6457162	September 2002	Stanion
6460148	October 2002	Veenstra et al.
6701491	March 2004	Yang
6704889	March 2004	Veenstra et al.
6748352	June 2004	Yuen et al.
6760898	July 2004	Sanchez et al.
6816828	November 2004	Ikegami
6826717	November 2004	Draper et al.
6856950	February 2005	Abts et al.
7020722	March 2006	Sivier et al.
7036046	April 2006	Rally et al.
2002/0116694	August 2002	Fournier et al.
2002/0133325	September 2002	Hoare et al.
2003/0093569	May 2003	Sivier et al.
2006/0004862	January 2006	Fisher et al.
2006/0117274	June 2006	Tseng et al.
2008/0306721	December 2008	Yang

Foreign Patent Documents


2001256267	Sep 2001	JP
1020010057800	Jul 2001	KR
1020040093310	Jan 2006	KR
1020050116706	Jun 2006	KR
2005093575	Oct 2005	WO

Primary Examiner: Silver; David
Attorney, Agent or Firm: Cantor Colburn LLP

Parent Case Text

CROSS-REFERENCE TO RELATED PATENT APPLICATIONS

This application is a continuation-in-part of application Ser. No. 12/089,665, filed Apr. 9, 2008, which is a National Phase entry of PCT Application PCT/KR2006/004059, filed on Oct. 10, 2006.

Also, this application claims the benefit of Korean Patent Application No. 10-2005-0095803, Korean Patent Application No. 10-2005-0098941, Korean Patent Application No. 10-2005-0122926, Korean Patent Application No. 10-2005-0126636, Korean Patent Application No. 10-2006-0006079, Korean Patent Application No. 10-2006-0019738, Korean Patent Application No. 10-2006-0037412, Korean Patent Application No. 10-2006-0038426, Korean Patent Application No. 10-2006-0043611, Korean Patent Application No. 10-2006-0048394, Korean Patent Application No. 10-2006-0068811, Korean Patent Application No. 10-2006-0092573, and Korean Patent Application No. 10-2006-0092574, filed in the Korean Intellectual Property Office, the disclosures of which are incorporated herein in their entirety by reference.

Claims

What is claimed is:

1. A distributed parallel simulation method for a predetermined model at a specific abstraction level comprising:
(a) obtaining expected inputs and expected outputs for at least one local simulation among a plurality of local simulations in the distributed parallel simulation for the predetermined model, wherein the local simulation is defined as each of simulations for local design objects in case of executing a parallel simulation by spatially dividing the predetermined model into a plurality of local design objects; and
(b) for at least one part of entire simulation time period, executing said at least one local simulation in the distribution parallel simulation for the predetermined model by using the expected inputs and the expected outputs,
wherein the expected inputs and the expected outputs for the at least one local simulation of the step (a) is obtained by:
(a1) while executing the distributed parallel simulation for the predetermined model at the specific abstraction level, in said at least one local simulation, simultaneously simulating a first model and a local design object corresponding to the local simulation, wherein the first model is same to the predetermined model except that an abstraction level of the first model is higher than an abstraction level of the predetermined model,
and the step (b) comprises:
(b1) determining whether actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which said at least one local simulation among the plurality of local simulations is executed independently with other local simulations by using the expected inputs as inputs of simulation for local design object corresponding to the local simulation while not performing a communication and a synchronization, which maintains causality relationship between the plurality of local simulations, with other local simulations;
(b2) if it is determined that the actual outputs do not match with the expected outputs in the step (b1), executing at least one local simulation among the plurality of local simulations by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the at least one local simulation is executed while performing the communication and the synchronization with other local simulations;
(b3) while executing the at least one local simulation according to the step (b2), determining whether actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which the at least one local simulation among the plurality of local simulations is executed by using the actual inputs, which is obtained through the communication and the synchronization with other local simulations, as inputs of simulation for local design object corresponding to the local simulation; and
(b4) if it is determined that, in the step (b3), the actual outputs match with the expected outputs at least for a predetermined times in a specific time point, executing the at least one local simulation among the plurality of local simulations by using again the expected inputs and the expected outputs of the step (a) from the specific time point while not performing the communication and the synchronization with other local simulations.
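
For illustration only, the switching between expected inputs and actual inputs recited in steps (b1) to (b4) may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `run_local_simulation`, `dut`, `actual_inputs` and `rematch_threshold` are hypothetical, the local design object is modelled as a stateless function (so no roll-back is needed to redo a step), and communication with other local simulations is modelled as reading from a list.

```python
from typing import Callable, List, Tuple

def run_local_simulation(
    dut: Callable[[int], int],     # hypothetical stand-in for the local design object
    expected_inputs: List[int],    # predictions obtained in step (a)
    expected_outputs: List[int],
    actual_inputs: List[int],      # what communication/synchronization would deliver
    rematch_threshold: int = 2,    # assumed value for the "predetermined times" of (b4)
) -> Tuple[List[int], List[str]]:
    """Return the outputs and the input mode used at each time step."""
    outputs: List[int] = []
    modes: List[str] = []
    predicted = True               # start without communication/synchronization
    matches = 0
    for t in range(len(expected_inputs)):
        if predicted:
            out = dut(expected_inputs[t])
            if out == expected_outputs[t]:
                modes.append("expected")
            else:
                # (b1)/(b2): mismatch, so redo this step with actual inputs
                predicted, matches = False, 0
                out = dut(actual_inputs[t])
                modes.append("actual")
        else:
            out = dut(actual_inputs[t])
            modes.append("actual")
            # (b3): keep comparing actual outputs with the expected outputs
            matches = matches + 1 if out == expected_outputs[t] else 0
            if matches >= rematch_threshold:
                predicted = True   # (b4): resume using the expected inputs
        outputs.append(out)
    return outputs, modes
```

In this sketch a single wrong prediction forces the local simulation into the synchronized mode, and it leaves that mode only after `rematch_threshold` consecutive matches.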

2. The distributed parallel simulation method of claim 1, wherein, through the step (b), at least one local simulation among said plurality of local simulations is executed while performing at least one checkpoint, the checkpoint being performed for a roll-back which can happen in a future, wherein the step (b2) comprises: (c1) if the actual outputs do not match with the expected outputs in the step (b1), informing other local simulations of an occurrence of mismatch and a corresponding simulation time at which the mismatch occurred, (c2) in each of the local simulations, determining whether the roll-back is needed by comparing the corresponding simulation time informed by the local simulation in which the mismatch occurred in the step (c1) to a current simulation time, and performing the roll-back if it is determined that the roll-back is needed, and (c3) executing at least one local simulation among said plurality of local simulations by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the at least one local simulation is executed while performing the communication and the synchronization with other local simulations, at the same time, output of the local design object corresponding to the local simulation is being compared with the expected output for returning to the local simulation with the expected input, which could eliminate the communication and the synchronization with other local simulations.

3. The distributed parallel simulation method of claim 1, wherein, through the step (b), at least one local simulation among said plurality of local simulations is executed while performing at least one checkpoint, the checkpoint being performed for a roll-back which can happen in a future, wherein the step (b2) comprises: (d1) if the actual outputs do not match with the expected outputs in the step (b1), informing other local simulations of an occurrence of mismatch and a simulation time t_d, wherein the simulation time t_d is a time at which the mismatch occurred, (d2) in each of the local simulations, comparing the simulation time t_d, which is informed by the local simulation in which the mismatch occurred in the step (d1), to a simulation time t_c of each of local simulations, wherein the simulation time t_c is a current simulation time of the each of local simulations, (d3) setting the current simulation time of every local simulation identical by either determining that a roll-back is needed if the simulation time t_c is later than the simulation time t_d (i.e., t_c>t_d) and performing a roll-back, or determining that a roll-forward is needed if the simulation time t_c is earlier than the simulation time t_d (i.e., t_c<t_d) and performing a roll-forward, and (d4) executing said every local simulation by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which said every local simulation is executed while performing the communication and the synchronization with other local simulations, at the same time, output of the local design object corresponding to the local simulation is being compared with the expected output for returning to the local simulation with the expected input, which could eliminate the communication and the synchronization with other local simulations.
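
For illustration only, the time alignment of steps (d2) and (d3) may be sketched in Python as follows. The class `LocalSim`, the function `align`, and the assumption that time 0 is always a restorable checkpoint are hypothetical and not taken from the patent; state saving and re-simulation are reduced to updating a time value.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LocalSim:
    """Hypothetical local simulation: current time plus checkpointed times."""
    name: str
    t_c: int                                              # current simulation time
    checkpoints: List[int] = field(default_factory=list)  # times with saved state

def align(sim: LocalSim, t_d: int) -> str:
    """Bring one local simulation to the mismatch time t_d, as in step (d3)."""
    if sim.t_c > t_d:
        # Roll-back: restore the latest checkpoint at or before t_d
        # (time 0 is assumed to be restorable), then re-simulate up to t_d.
        restore = max((c for c in sim.checkpoints if c <= t_d), default=0)
        sim.t_c = restore
        sim.t_c = t_d
        return f"roll-back via checkpoint {restore}"
    if sim.t_c < t_d:
        # Roll-forward: this simulation is behind, so it advances to t_d.
        sim.t_c = t_d
        return "roll-forward"
    return "no-op"
```

After `align` has been applied to every local simulation, all of them share the same current simulation time, which is the precondition for step (d4).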

4. The distributed parallel simulation method of claim 1, wherein said specific abstraction level of said predetermined model is GL (Gate Level), and the abstraction level of said first model is RTL (Register Transfer Level) or mixed level of RTL and GL (Gate level).

5. The distributed parallel simulation method of claim 1, wherein said specific abstraction level of said predetermined model is RTL, and the abstraction level of said first model is ESL (Electronic System Level) or mixed level of ESL and RTL.

6. A distributed parallel simulation method for a predetermined model at a specific abstraction level comprising:
(a) obtaining expected inputs and expected outputs for at least one local simulation among a plurality of local simulations in the distributed parallel simulation for the predetermined model, wherein the local simulation is defined as each of simulations for local design objects in case of executing a parallel simulation by spatially dividing the predetermined model into a plurality of local design objects; and
(b) for at least one part of entire simulation time period, executing said at least one local simulation in the distribution parallel simulation for the predetermined model by using the expected inputs and the expected outputs,
wherein the expected inputs and the expected outputs for the at least one local simulation of the step (a) is obtained by:
prior to execution of the distributed parallel simulation for the predetermined model at the specific abstraction level, while performing a simulation with a first model which is same to the predetermined model except that an abstraction level of the first model is higher than an abstraction level of the predetermined model, storing input information and output information for at least one design object, which exist in the first model, as the expected inputs and the expected outputs,
and the step (b) comprises:
(b1) determining whether actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which said at least one local simulation among the plurality of local simulations is executed independently with other local simulations by using the expected inputs as inputs of simulation for local design object corresponding to the local simulation while not performing a communication and a synchronization, which maintains causality relationship between the plurality of local simulations, with other local simulations;
(b2) if it is determined that the actual outputs do not match with the expected outputs in the step (b1), executing at least one local simulation among the plurality of local simulations by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the at least one local simulation is executed while performing the communication and the synchronization with other local simulations;
(b3) while executing the at least one local simulation according to the step (b2), determining whether actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which the at least one local simulation among the plurality of local simulations is executed by using the actual inputs, which is obtained through the communication and the synchronization with other local simulations, as inputs of simulation for local design object corresponding to the local simulation; and
(b4) if it is determined that, in the step (b3), the actual outputs match with the expected outputs at least for a predetermined times in a specific time point, executing the at least one local simulation among the plurality of local simulations by using again the expected inputs and the expected outputs of the step (a) from the specific time point while not performing the communication and the synchronization with other local simulations.
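
For illustration only, the storing of input and output information during a prior simulation with the first (higher-abstraction) model may be sketched in Python as follows. The function `record_expected_io` and the modelling of design objects as a simple chain of functions are assumptions made for this sketch; they are not part of the claim.

```python
from typing import Callable, Dict, List, Tuple

def record_expected_io(
    high_level_blocks: Dict[str, Callable[[int], int]],  # hypothetical design objects
    stimulus: List[int],
) -> Dict[str, Tuple[List[int], List[int]]]:
    """Front-end run: simulate a chain of blocks and store per-block I/O."""
    trace: Dict[str, Tuple[List[int], List[int]]] = {
        name: ([], []) for name in high_level_blocks
    }
    for value in stimulus:
        # Blocks are evaluated in insertion order; each output feeds the next block.
        for name, block in high_level_blocks.items():
            out = block(value)
            trace[name][0].append(value)  # expected input of this design object
            trace[name][1].append(out)    # expected output of this design object
            value = out
    return trace
```

Each entry of the returned dictionary can then serve as the expected inputs and expected outputs of the local simulation for the corresponding lower-level design object.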

7. The distributed parallel simulation method of claim 6, wherein said specific abstraction level of said predetermined model is GL (Gate Level), and the abstraction level of said first model is RTL (Register Transfer Level) or mixed level of RTL and GL (Gate level).

8. The distributed parallel simulation method of claim 6, wherein said specific abstraction level of said predetermined model is RTL, and the abstraction level of said first model is ESL (Electronic System Level) or mixed level of ESL and RTL.

9. The distributed parallel simulation method of claim 6, wherein, through the step (b), at least one local simulation among said plurality of local simulations is executed while performing at least one checkpoint, the checkpoint being performed for a roll-back which can happen in a future, wherein the step (b2) comprises: (c1) if the actual outputs do not match with the expected outputs in the step (b1), informing other local simulations of an occurrence of mismatch and a corresponding simulation time at which the mismatch occurred, (c2) in each of the local simulations, determining whether the roll-back is needed by comparing the corresponding simulation time informed by the local simulation in which the mismatch occurred in the step (c1) to a current simulation time, and performing the roll-back if it is determined that the roll-back is needed, and (c3) executing at least one local simulation among said plurality of local simulations by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the at least one local simulation is executed while performing the communication and the synchronization with other local simulations, at the same time, output of the local design object corresponding to the local simulation is being compared with the expected output for returning to the local simulation with the expected input, which could eliminate the communication and the synchronization with other local simulations.

10. The distributed parallel simulation method of claim 6, wherein, through the step (b), at least one local simulation among said plurality of local simulations is executed while performing at least one checkpoint, the checkpoint being performed for a roll-back which can happen in a future, wherein the step (b2) comprises: (d1) if the actual outputs do not match with the expected outputs in the step (b1), informing other local simulations of an occurrence of mismatch and a simulation time t_d, wherein the simulation time t_d is a time at which the mismatch occurred, (d2) in each of the local simulations, comparing the simulation time t_d, which is informed by the local simulation in which the mismatch occurred in the step (d1), to a simulation time t_c of each of local simulations, wherein the simulation time t_c is a current simulation time of the each of local simulations, (d3) setting the current simulation time of every local simulation identical by either determining that a roll-back is needed if the simulation time t_c is later than the simulation time t_d (i.e., t_c>t_d) and performing a roll-back, or determining that a roll-forward is needed if the simulation time t_c is earlier than the simulation time t_d (i.e., t_c<t_d) and performing a roll-forward, and (d4) executing said every local simulation by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which said every local simulation is executed while performing the communication and the synchronization with other local simulations, at the same time, output of the local design object corresponding to the local simulation is being compared with the expected output for returning to the local simulation with the expected input, which could eliminate the communication and the synchronization with other local simulations.

11. A distributed parallel simulation method for a predetermined model at a specific abstraction level comprising:
(a) obtaining expected inputs and expected outputs for at least one local simulation among a plurality of local simulations in the distributed parallel simulation for the predetermined model, wherein the local simulation is defined as each of simulations for local design objects in case of executing a parallel simulation by spatially dividing the predetermined model into a plurality of local design objects; and
(b) for at least one part of entire simulation time period, executing said at least one local simulation in the distribution parallel simulation for the predetermined model by using the expected inputs and the expected outputs,
wherein the expected inputs and the expected outputs for the at least one local simulation of the step (a) is obtained by:
prior to execution of the distributed parallel simulation for the predetermined model at the specific abstraction level, while performing a simulation with a first model which is same to the predetermined model except that at least one design modification exists in at least one design object in the predetermined model, storing input information and output information for at least one design object, which exist in the first model, as the expected inputs and the expected outputs,
and the step (b) comprises:
(b1) determining whether actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which said at least one local simulation among the plurality of local simulations is executed independently with other local simulations by using the expected inputs as inputs of simulation for local design object corresponding to the local simulation while not performing a communication and a synchronization, which maintains causality relationship between the plurality of local simulations, with other local simulations;
(b2) if it is determined that the actual outputs do not match with the expected outputs in the step (b1), executing at least one local simulation among the plurality of local simulations by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the at least one local simulation is executed while performing the communication and the synchronization with other local simulations;
(b3) while executing the at least one local simulation according to the step (b2), determining whether actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which the at least one local simulation among the plurality of local simulations is executed by using the actual inputs, which is obtained through the communication and the synchronization with other local simulations, as inputs of simulation for local design object corresponding to the local simulation; and
(b4) if it is determined that, in the step (b3), the actual outputs match with the expected outputs at least for a predetermined times in a specific time point, executing the at least one local simulation among the plurality of local simulations by using again the expected inputs and the expected outputs of the step (a) from the specific time point while not performing the communication and the synchronization with other local simulations.

12. The distributed parallel simulation method of claim 11, wherein, through the step (b), at least one local simulation among said plurality of local simulations is executed while performing at least one checkpoint, the checkpoint being performed for a roll-back which can happen in a future, wherein the step (b2) comprises: (c1) if the actual outputs do not match with the expected outputs in the step (b1), informing other local simulations of an occurrence of mismatch and a corresponding simulation time at which the mismatch occurred, (c2) in each of the local simulations, determining whether the roll-back is needed by comparing the corresponding simulation time informed by the local simulation in which the mismatch occurred in the step (c1) to a current simulation time, and performing the roll-back if it is determined that the roll-back is needed, and (c3) executing at least one local simulation among said plurality of local simulations by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the at least one local simulation is executed while performing the communication and the synchronization with other local simulations, at the same time, output of the local design object corresponding to the local simulation is being compared with the expected output for returning to the local simulation with the expected input, which could eliminate the communication and the synchronization with other local simulations.

13. The distributed parallel simulation method of claim 11, wherein, through the step (b), at least one local simulation among said plurality of local simulations is executed while performing at least one checkpoint, the checkpoint being performed for a roll-back which can happen in a future, wherein the step (b2) comprises: (d1) if the actual outputs do not match with the expected outputs in the step (b1), informing other local simulations of an occurrence of mismatch and a simulation time t_d, wherein the simulation time t_d is a time at which the mismatch occurred, (d2) in each of the local simulations, comparing the simulation time t_d, which is informed by the local simulation in which the mismatch occurred in the step (d1), to a simulation time t_c of each of local simulations, wherein the simulation time t_c is a current simulation time of the each of local simulations, (d3) setting the current simulation time of every local simulation identical by either determining that a roll-back is needed if the simulation time t_c is later than the simulation time t_d (i.e., t_c>t_d) and performing a roll-back, or determining that a roll-forward is needed if the simulation time t_c is earlier than the simulation time t_d (i.e., t_c<t_d) and performing a roll-forward, and (d4) executing said every local simulation by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which said every local simulation is executed while performing the communication and the synchronization with other local simulations, at the same time, output of the local design object corresponding to the local simulation is being compared with the expected output for returning to the local simulation with the expected input, which could eliminate the communication and the synchronization with other local simulations.

Description

BACKGROUND OF THE INVENTION

(1) Field of the Invention

The present invention aims to increase the performance and efficiency of systematically verifying digital systems of several million gates or more by using simulation and prototyping from the Electronic System Level (ESL) down to the Gate Level (GL) through the Register Transfer Level (RTL).

(2) Description of the Related Art

In design verification, simulation means building a pair of computer-executable models, consisting of the DUV (Design Under Verification), or one or more design objects (to be defined later) inside the DUV, and the TB (testbench) that drives it, translating the pair into a sequence of machine instructions through a simulation compilation process, and executing that sequence on a computer. Simulation execution is therefore basically accomplished by the sequential execution of machine instructions, and there are many simulation methods (event-driven simulation, cycle-based simulation, compiled simulation, interpreted simulation, co-simulation, algorithmic-level simulation, instruction-level simulation, transaction-level simulation, RTL simulation, gate-level simulation, transistor-level simulation, circuit-level simulation, etc.). In other words, simulation covers a variety of processes in which the DUV and TB, which are executable SW models built by a modeling process at a proper abstraction level (IC design has many abstraction levels, such as circuit-level, transistor-level, gate-level, RTL, transaction-level, instruction-level (if the design object is a processor), algorithmic-level, etc.), are executed on a computer to realize their functional specification or functional characteristics in SW. The advantages of simulation are that the functional specification or functional characteristics of a design object can be evaluated virtually before the design object is actually implemented and fabricated, that it provides high flexibility owing to its SW nature, and that it gives high visibility and controllability over the DUV or TB, which is critical for debugging. Its shortcoming is low performance, which comes from the fact that simulation execution is a sequential execution of a machine instruction sequence. 
If the design complexity is large, as in modern designs having 100 million or more gates, the simulation speed becomes extremely slow (for example, it takes about 3.2 years to simulate a 100-million-gate design for 100,000,000 cycles with an event-driven simulation whose speed is 1 cycle/sec). In the present invention, simulation is defined as any SW modeling and SW execution method for the DUV and TB at a proper abstraction level. More specifically, in the present invention, simulation is defined as the process of implementing the behavior of the DUV and TB at a specific abstraction level as a specific computer data structure with well-defined operations on it, so that it is computer-executable, and of performing a series of computations and operations on that data structure with input values on a computer. (Therefore, in the present invention, simulation can be carried out not only by any commercial simulator but also by internally built simulators. Also, any process that performs a series of computations or operations on the data structure with input values on a computer is considered simulation if the process meets the above definition.)
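
The 3.2-year figure quoted above follows from simple arithmetic, which the short Python sketch below reproduces. The function name `simulation_years` and the use of a 365-day year are assumptions made for the illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000 s, leap years ignored

def simulation_years(cycles: int, cycles_per_second: float) -> float:
    """Wall-clock years needed to simulate `cycles` at the given speed."""
    return cycles / cycles_per_second / SECONDS_PER_YEAR

years = simulation_years(100_000_000, 1.0)
print(f"{years:.1f} years")   # prints "3.2 years"
```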

In contrast, traditional prototyping builds a system on a PCB (Printed Circuit Board) by using manufactured semiconductor chips (for example, sample chips) or FPGA (Field Programmable Gate Array) chips that implement the DUV, together with the other components necessary to construct the entire system (in simulation, these other components are modeled as the TB), and verifies the DUV in either an in-circuit or an in-system environment while the entire system runs at real or almost real operating speed. Because the DUV and TB are not modeled virtually in SW but are physically implemented, verification can proceed at extremely high speed. However, because visibility and controllability are very low in the prototyping environment, debugging is very difficult when the system operates incorrectly.

The design size of digital circuits or digital systems are growing to tens of million or hundreds of million gates and their functionality is becoming very complex as the IC (Integrated Circuit) design and fabrication technology has been being developed rapidly. Especially, system-level ICs so called SOC (System On Chip) has usually one or more embedded processor cores (RISC core or DSP core, and specific examples are ARM11 core from ARM or Teak DSP core from CEVA), and the large part of its functionality is realized in SW. The reduction of design time is very critical to the related products success because of short time to market due to the growing competition in the market. Therefore, there is a growing interest from the industry about ESL design methodology for designing chips. Chips that are designed by using ESL design methodology, which exists at the higher level abstraction level than traditional RTL (Register Transfer Level), need the SW developments that drives them as well as the HW designs. Therefore, in recent development trend the Virtual Platform which is a SW model of a real HW (we will call it VP hereafter) is built as a system level model (ESL model) for architecture exploration, SW development, HW/SW co-verification, and system verification (whereas, traditional prototyping is a physical platform (we will call it PP hereafter)). VP can be also used as an executable specification, i.e. a golden reference model. As VP is made of at higher abstraction level, its development time is short. Also, it can be used to verify TB before DUV is available. VP also plays a critical role in platform-based design (PBS), which is widely adopted in SOC designs, because VP can be made of transaction-level on-chip bus models and other transaction-level component models (these are called TLM models), which can be simulated at much higher simulation speed (about 100 to 10,000 times faster than RTL model). 
Currently, there are many commercial tools for creating and executing a VP, such as MaxSim from ARM, ConvergenSC from CoWare, Incisive from Cadence, VisualElite from Summit Design, VSP from Vast Systems Technology, SystemStudio from Synopsys, Platform Express from Mentor Graphics, VTOC from TenisonEDA, VSP from Carbon Design Systems, VirtualPlatform from Virtutech, etc. Therefore, a VP can provide many benefits in SOC designs. In SOC designs, the most important property of a VP is an execution speed fast enough for software development, so it is modeled not at RTL using Verilog or VHDL, but at a higher abstraction level, such as transaction level or algorithmic level, using SystemC or C/C++. The abstraction level, which is the most important concept in system-level design, is the level of representation detail of the corresponding design object (explained in detail later). Digital systems can be classified, from low to high level of abstraction, into layout level, transistor level, gate level, RTL, transaction level, algorithmic level, etc. That is, gate level is a lower abstraction than RTL, RTL is a lower abstraction than transaction level, and transaction level is a lower abstraction than algorithmic level. Therefore, if the abstraction level of a specific design object A is transaction level and the abstraction level of a design object B refined from A is RTL, then design object A is defined to be at a higher level of abstraction than design object B. Also, if a design object X consists of design objects A and C, and a design object Y consists of design object B, which is refined from A, and C, then design object X is defined to be at a higher level of abstraction than design object Y. Moreover, the accuracy of the delay model determines the level of abstraction within the same gate level or the same RTL.
That is, even when they are at the same gate level, a net-list with a zero-delay model is at a higher abstraction than a net-list with a unit-delay model, and a net-list with a unit-delay model is at a higher abstraction than a net-list with a full timing model using SDF (Standard Delay Format). Recent SOC designs can be thought of as a progressive refinement process of an initial design object, which must eventually be implemented as a chip, from the initial abstraction level, e.g. transaction level, to the final abstraction level, e.g. gate level (refer to FIG. 14). The core of a design methodology using the progressive refinement process is to progressively refine the design blocks inside a design object MODEL_DUV(HIGH) modeled at a high level of abstraction so that a refined design object MODEL_DUV(LOW) modeled at a low level of abstraction is obtained automatically (for example, through logic synthesis or high-level synthesis), manually, or by both. As a detailed example, in the refinement from ESL to RTL, which obtains an implementable RTL model from an ESL model (currently carried out by hand, by high-level synthesis, or by both), the ESL model is MODEL_DUV(HIGH) and the implementable RTL model is MODEL_DUV(LOW). In the refinement from RTL to GL (Gate Level), which obtains a GL model, i.e. a gate-level netlist, from an implementable RTL model (currently carried out by logic synthesis), the RTL model is MODEL_DUV(HIGH) and the GL model is MODEL_DUV(LOW). The GL model can become a timing-accurate GL model if the delay information in SDF, extracted from placement and routing, is back-annotated.
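As a non-limiting illustration, the ordering of abstraction levels described above, including the refinement by delay model within one level, may be sketched as follows. The list names, the delay-model labels, and the helper functions are hypothetical and are introduced only for this sketch.

```python
# Illustrative ordering only; the list contents follow the text above,
# while every identifier here is a hypothetical name for this sketch.
LEVELS = ["layout", "transistor", "gate", "RTL", "transaction", "algorithmic"]  # low -> high
DELAY_MODELS = ["full-timing-sdf", "unit-delay", "zero-delay"]                  # low -> high


def abstraction_rank(level, delay_model="zero-delay"):
    """Return a sortable (level, delay) rank; a larger tuple means higher abstraction."""
    return (LEVELS.index(level), DELAY_MODELS.index(delay_model))


def is_higher(a, b):
    """True if description a is at a higher level of abstraction than description b."""
    return abstraction_rank(*a) > abstraction_rank(*b)


print(is_higher(("RTL",), ("gate",)))                                   # True
print(is_higher(("gate", "zero-delay"), ("gate", "unit-delay")))        # True
print(is_higher(("gate", "full-timing-sdf"), ("gate", "unit-delay")))   # False
```

The tuple comparison makes the level dominate and uses the delay model only to order descriptions at the same level, which is one possible reading of the definition above.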

There is one thing to mention. It is not absolutely necessary that all design objects in an ESL model are at system level. This is also true for an RTL model. In an ESL model, a few design objects may be at RTL, surrounded by abstraction wrappers that make the abstraction of the RTL objects the same as that of the other ESL objects. Likewise, in an RTL model, a few design objects may be at GL, surrounded by abstraction wrappers that make the abstraction of the GL objects the same as that of the other RTL objects. For the same reason, in a GL model a few design objects, e.g. a memory block for which logic synthesis does not produce a gate-level net-list, can be at RTL. Therefore, in the present invention "a model at the specific level of abstraction" is a model at any level of abstraction (not only ESL, RTL, and GL, but also any mixed level of abstraction such as mixed ESL/RTL, mixed RTL/GL, mixed ESL/RTL/GL, etc.) that can exist in a refinement process from ESL to GL. Also, the "abstraction level" includes not only ESL, RTL, and GL, but also any such mixed level of abstraction. For example, if a DUV consists of four design objects A, B, C, and D, where A and B are at ESL, C is at RTL, and D is at GL, the DUV is a model at a mixed ESL/RTL/GL level of abstraction and can be called a model at the specific level of abstraction (it can also be called a model at the mixed ESL/RTL/GL level of abstraction). From now on, we will call a model "a model at mixed levels of abstraction" when we must state clearly that it is represented at mixed levels of abstraction (an arbitrary design object, such as the DUV or TB, can be called a model, but unless specifically mentioned, a model is defined as a design object including the DUV (Design Under Verification) and the TB (Testbench)).
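As a non-limiting illustration, the four-object example above may be expressed as a small classification routine. The function name, the dictionary layout, and the label format are assumptions made for this sketch only.

```python
# Hypothetical helper: names and label format are illustrative only.
LEVEL_ORDER = ["ESL", "RTL", "GL"]  # from high to low level of abstraction


def model_level(object_levels):
    """Return the abstraction label of a model from each design object's level."""
    used = set(object_levels.values())
    unknown = used - set(LEVEL_ORDER)
    if unknown:
        raise ValueError(f"unknown abstraction level(s): {sorted(unknown)}")
    present = [lvl for lvl in LEVEL_ORDER if lvl in used]
    if not present:
        raise ValueError("model has no design objects")
    # A single level is reported as-is; several levels give a mixed label.
    return present[0] if len(present) == 1 else "mixed " + "/".join(present)


duv = {"A": "ESL", "B": "ESL", "C": "RTL", "D": "GL"}
print(model_level(duv))  # mixed ESL/RTL/GL
```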

A transaction, which is the most important concept at ESL, represents information defined over multiple logically related signals or pins as a single unit, and uses function calls to communicate among design objects. By contrast, the information on the signals or pins at RTL is represented by bits or bit vectors only. A transaction can be defined cycle by cycle (we will call this type a cycle-accurate transaction, or ca-transaction for short), over multiple cycles (we will call this type a timed transaction, cycle-count transaction, or PV-T transaction, or timed-transaction for short), or without the concept of cycles (we will call this type an untimed-transaction for short). A timed-transaction is represented as Transaction_name(start_time, end_time, other_attributes). In fact, there is no standard definition of a transaction, but it is most common to define and classify transactions into the untimed-transaction, timed-transaction, and ca-transaction explained above. Among transactions, the untimed-transaction is at the highest level of abstraction but the least accurate in timing, and the ca-transaction is at the lowest level of abstraction but the most accurate in timing. The timed-transaction lies in between.
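As a non-limiting illustration, the three kinds of transaction may be represented by one record type that mirrors the form Transaction_name(start_time, end_time, other_attributes). The class, the use of cycle counts as time units, and the rule that a span of at most one cycle denotes a ca-transaction are assumptions of this sketch, not definitions taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Transaction:
    """Hypothetical record for Transaction_name(start_time, end_time, other_attributes)."""
    name: str
    start_time: Optional[int] = None   # in cycles; None means no notion of cycles
    end_time: Optional[int] = None     # in cycles, exclusive (assumption)
    other_attributes: dict = field(default_factory=dict)

    def kind(self) -> str:
        # Assumed rule: no times -> untimed; one-cycle span -> ca; longer -> timed.
        if self.start_time is None or self.end_time is None:
            return "untimed-transaction"
        if self.end_time < self.start_time:
            raise ValueError("end_time precedes start_time")
        if self.end_time - self.start_time <= 1:
            return "ca-transaction"
        return "timed-transaction"


print(Transaction("READ").kind())                         # untimed-transaction
print(Transaction("READ", 10, 11).kind())                 # ca-transaction
print(Transaction("BURST", 10, 18, {"len": 8}).kind())    # timed-transaction
```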

The refinement process is incremental, so the design objects at TL (Transaction Level) in the VP are progressively refined into design objects at RTL, which have at least signal-level cycle accuracy. At the end of the transformation, the design objects at TL are translated into design objects at RTL, and therefore the transaction-level VP is refined into the implementable RTL model. Also, the design objects at RTL (Register Transfer Level) in the RTL model are progressively refined into design objects at GL, which have at least signal-level timing accuracy. At the end of the transformation, the design objects at RTL are translated into design objects at GL, and therefore the RTL model is refined into a GL model. FIG. 14 shows an example of the refinement process explained above.

There are two objects to be designed in SOC designs: the first is the DUV (Design Under Verification) and the second is the TB (Testbench). The DUV is the design entity that should eventually be manufactured as a chip, and the TB is a SW model representing the environment in which the chip is mounted and operated. The TB is used to simulate the DUV. During simulation, the TB generally provides stimuli to the DUV and processes the output from the DUV. In general, the DUV and TB have a hierarchy, so there may be one or more lower modules inside, each of which can be called a design block. A design block may contain one or more design modules, and a design module may contain one or more submodules. In the present invention, we will call any of the design blocks, design modules, submodules, DUV, and TB, any part of them, or any combination of them, a "design object" (for example, any module or part of a module in Verilog is a design object, any entity or part of an entity in VHDL is a design object, and any sc_module or part of an sc_module in SystemC is a design object). Therefore, a VP can be seen as a design object. So can a part of the VP, one or more design blocks in the VP, a part of a design block, some design modules in a design block, some submodules in a design module, a part of a design module, a part of a submodule, etc. (in short, the entire DUV and TB, or some part of the DUV and TB, can be seen as a design object).

In a design process using progressive refinement, simulation at a high level of abstraction runs fast, but simulation at a low level of abstraction is relatively slow. Therefore, the simulation speed decreases dramatically as the refinement process goes down to lower levels of abstraction. In contrast to conventional single simulation (in the present invention, single simulation includes not only using one simulator, but also using more than one simulator, e.g. one Verilog simulator and one Vera simulator, and running these simulators on a single CPU), there is a distributed parallel simulation method that uses two or more simulators to increase the simulation speed. Examples of simulators are HDL (Hardware Description Language) simulators (such as NC-Verilog/Verilog-XL and X-sim from Cadence, VCS from Synopsys, ModelSim from Mentor, Riviera/Active-HDL from Aldec, FinSim from Fintronic, etc.), HVL (Hardware Verification Language) simulators (such as the e simulator from Cadence, the Vera simulator from Synopsys, etc.), SDL (System Description Language) simulators (e.g. a SystemC simulator such as the Incisive simulator from Cadence), and ISSs (Instruction-Set Simulators) (such as the ARM RealView Development Suite Instruction Set Simulator). Under another classification, there are event-driven simulators and cycle-based simulators. The simulators in the present invention include any of these simulators. Therefore, when two or more simulators are used in the present invention, each simulator can be any of the simulators mentioned above. Distributed parallel simulation (or parallel distributed simulation, or parallel simulation for short), which performs a simulation in a distributed processing environment, is the most general parallel simulation technique, in which the DUV and TB, i.e. a model at a specific level of abstraction, are partitioned into two or more design objects, and each design object is distributed to a simulator and executed on it (see FIG. 5). Therefore, distributed parallel simulation requires a partitioning step that divides a simulation model into two or more design objects. In the present invention, we will call a design object that the partition assigns to a specific local simulation (to be defined later) a "local design object".

Recently, distributed parallel simulation has become possible by connecting two or more computers with a high-speed computer network such as gigabit Ethernet and running a simulator on each computer, or by using a multiprocessor computer that has two or more CPU cores (in the present invention, a local simulation is the simulation executed by each of those simulators, which is called a local simulator, in the distributed parallel simulation). However, the performance of traditional distributed parallel simulation suffers severely from the communication and synchronization overhead among local simulators. Two basic synchronization methods are known: one conservative (or pessimistic), the other optimistic. Conservative synchronization guarantees the causality relation among simulation events, so there is no need to roll back, but the speed of the distributed parallel simulation is dictated by the slowest local simulation and there are too many synchronizations. Optimistic synchronization temporarily allows violations of the causality relation and corrects them later by roll-back, so reducing roll-backs is critical for simulation speed. However, because current distributed parallel simulation using optimistic synchronization does not try to minimize roll-backs by maximizing the simulation periods during which a local simulation requires no synchronization with other local simulations, the simulation performance degrades significantly due to excessive roll-backs. Distributed parallel simulation using the conventional optimistic approach and using the conventional pessimistic approach are well known from many documents and papers, so a detailed explanation is omitted here.
One more thing to mention is that, to maximize simulation performance, it is desirable for a distributed parallel simulation to have as many processors as local simulations, but a distributed parallel simulation can still be performed as long as two or more processors are available, even if the number of local simulations is larger than the number of processors. In summary, the synchronization and communication methods of both the optimistic approach and the pessimistic approach greatly limit the performance of distributed parallel simulation using two or more simulators.

Moreover, during the progressive refinement process it is very important to maintain the model consistency between a model at high level of abstraction and a model at low level of abstraction because the model at high level of abstraction serves as a reference model for the model at low level of abstraction. However, in the current progressive refinement process there is no efficient method to maintain the model consistency between two models existing at two different abstraction levels.

Moreover, because a design process using progressive refinement has no systematic method for the debugging process, in which design errors are identified and removed, a large amount of time is consumed in debugging.

PRIOR ART DOCUMENTS

Korean Laid-open Patent Publication No. 10-2006-0066634, which corresponds to Korean Patent Application No. 10-2005-116706
Korean Laid-open Patent Publication No. 10-0921314, which corresponds to Korean Patent Application No. 10-2004-93310
Korean Laid-open Patent Publication No. 10-2001-0057800
Japanese Laid-open Patent Publication No. JP2001-256267
U.S. Pat. No. 6,182,247
U.S. Pat. No. 6,247,147
U.S. Pat. No. 6,286,114
U.S. Pat. No. 6,389,558
U.S. Pat. No. 6,460,148
U.S. Pat. No. 6,704,889
U.S. Pat. No. 6,760,898
U.S. Pat. No. 6,826,717
U.S. Pat. No. 7,036,046
U.S. Pat. No. 6,345,240
U.S. Pat. No. 5,801,938
U.S. Pat. No. 6,701,491

SUMMARY OF THE INVENTION

(1) Technical Problem

The object of present invention is to provide a systematic verification method through the progressive refinement from the system level to the gate level.

Another object of present invention is to provide a systematic verification method which can solve the degradation of verification performance as the progressive refinement goes down to the low level of abstraction.

Still, another object of present invention is to allow the entire design and verification process using progressive refinement from the high level of abstraction to the low level of abstraction in a systematic and automatic way.

Still, another object of present invention is to provide a verification method in which the model consistency is effectively maintained among two or more models existed at different levels of abstraction.

Still, another object of present invention is to provide an efficient verification method through progressive refinement, in which a model at the low level of abstraction is efficiently verified using a model at the high level of abstraction as a reference model.

Still, another object of present invention is to provide a method for increasing the speed of distributed parallel simulation by eliminating synchronization overhead and communication overhead.

Still, another object of present invention is to provide a systematic and consistent fast debugging method for correcting design errors (these design errors are not only HW design errors, but also SW design errors) in the entire verification phase from simulation-based verification to physical prototype-based verification.

Still, another object of present invention is to provide a high visibility and controllability throughout virtual prototypes or simulators for debugging the incorrect behavior of physical prototype in which DUV is operated in the in-circuit or in-system environment where DUV has one or more user clocks (in the case of two or more user clocks, these are asynchronous with no phase relation).

In the design verification of complex system-level designs such as embedded systems, the verification includes not only HW-centric verification, but also SW-centric verification so that it must be a system-level design verification. Therefore, in the present invention, the design verification covers not only traditional HW verification, but also HW/SW co-verification verifying SW as well as HW.

(2) Technical Solution

According to an embodiment of the present invention, a distributed parallel simulation method for a predetermined model which has a specific abstraction level comprising:

(a) obtaining expected inputs and expected outputs for at least one local simulation among a plurality of local simulations in the distributed parallel simulation for the predetermined model, wherein the local simulation is defined as each of simulations for local design objects in case of executing a parallel simulation by spatially dividing the predetermined model into a plurality of local design objects; and

(b) executing said at least one local simulation in the distributed parallel simulation for the predetermined model by using the expected inputs and the expected outputs, for at least a part of the entire simulation time period.

An example of this embodiment is shown in FIGS. 6, 7, and so on. Also, an example of flow chart of a distributed parallel simulation is shown in FIG. 24.

More preferably, the step (b) comprises:

(c1) determining whether said at least one local simulation among the plurality of local simulations can be executed independently with other local simulations by using the expected inputs and the expected outputs while omitting a communication and a synchronization with other local simulations; and

(c2) executing said at least one local simulation independently with other local simulations if it is determined that said at least one local simulation can be independently executed in the step (c1), or executing said at least one local simulation by using actual inputs which are obtained through the communication and the synchronization with the other local simulations if it is determined that said at least one local simulation cannot be independently executed in the step (c1).

An example of this embodiment is shown in FIGS. 6, 7, and so on.

More preferably, the step (b) comprises:

(d1) determining whether the actual outputs match with the expected outputs obtained in the step (a), wherein said actual outputs are outputs obtained from an execution in which said at least one local simulation among the plurality of local simulations is executed independently with other local simulations by using the expected inputs as inputs of simulation for local design object corresponding to the local simulation while omitting a communication and a synchronization with other local simulations; and

(d2) if it is determined that the actual outputs do not match with the expected outputs in the step (d1), executing at least one local simulation among the plurality of local simulations by using the actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the at least one local simulation is executed while performing a communication and a synchronization with other local simulations.

Flow charts for this embodiment are shown in FIG. 25.
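As a non-limiting illustration, steps (d1) and (d2) may be sketched for one local simulation as follows. The local design object is modeled as a hypothetical callable local_step, and communication with other local simulations is modeled as a hypothetical callable fetch_actual_input; both are assumptions of this sketch. Roll-back is deliberately left out here.

```python
def run_local_simulation(local_step, expected_inputs, expected_outputs, fetch_actual_input):
    """Run one local simulation, preferring expected inputs over communication.

    local_step and fetch_actual_input are hypothetical callables standing in for
    the local design object and for communication with other local simulations.
    """
    outputs = []
    mismatch_time = None   # first simulation time at which the prediction failed
    for t, (exp_in, exp_out) in enumerate(zip(expected_inputs, expected_outputs)):
        if mismatch_time is None:
            out = local_step(exp_in)           # independent run, no synchronization
            if out != exp_out:                 # (d1) actual output differs from expected
                mismatch_time = t              # (d2) use actual inputs from now on
        else:
            out = local_step(fetch_actual_input(t))
        outputs.append(out)
    return outputs, mismatch_time


double = lambda x: 2 * x
actual_inputs = [1, 2, 3, 5]
outs, t_mismatch = run_local_simulation(
    double, [1, 2, 3, 4], [2, 4, 7, 8], lambda t: actual_inputs[t])
print(outs, t_mismatch)  # [2, 4, 6, 10] 2
```

In this toy run the expected output at time 2 is wrong, so the local simulation stops trusting the expected values and obtains its input for time 3 through the stand-in for communication.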

More preferably, through the step (b), at least one local simulation among said plurality of local simulations is executed while performing at least one checkpoint, the checkpoint being performed for roll-back which can happen in the future,

wherein the step (b) comprises:

(e1) determining whether the actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which said at least one local simulation among the plurality of local simulations is executed independently with other local simulations by using the expected inputs as inputs of simulation for local design object corresponding to the local simulation while omitting a communication and a synchronization with other local simulations;

(e2) if the actual outputs do not match with the expected outputs in the step (e1), informing other local simulations of an occurrence of mismatch and a corresponding simulation time at which the mismatch occurred,

(e3) in each of the local simulations, determining whether the roll-back is needed by comparing the corresponding simulation time informed by the local simulation in which the mismatch occurred in the step (e2) to a current simulation time, and performing the roll-back if it is determined that the roll-back is needed, and

(e4) executing at least one local simulation among said plurality of local simulations by using the actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the at least one local simulation is executed while performing a communication and a synchronization with other local simulations.

An example of checkpointing is shown in FIG. 12 as X marks. Also, flow charts for this embodiment are shown in FIGS. 26-29, especially FIGS. 26 and 27.
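As a non-limiting illustration, the checkpointing of step (b) and the roll-back decision of step (e3) may be sketched as follows. The class, its toy integer state, and the choice of restoring the latest checkpoint taken at or before the reported mismatch time are assumptions of this sketch.

```python
import copy


class LocalSimulation:
    """Toy local simulation with checkpoints (all names are hypothetical)."""

    def __init__(self, state=0):
        self.time = 0
        self.state = state
        self.checkpoints = {}      # simulation time -> saved state
        self.checkpoint()          # a checkpoint at time 0 always exists

    def checkpoint(self):
        self.checkpoints[self.time] = copy.deepcopy(self.state)

    def advance(self, value):
        self.state += value        # toy state update for one simulation step
        self.time += 1

    def on_mismatch(self, mismatch_time):
        """(e3) Roll back only if this simulation has run past mismatch_time."""
        if self.time <= mismatch_time:
            return False
        restore_time = max(t for t in self.checkpoints if t <= mismatch_time)
        self.state = copy.deepcopy(self.checkpoints[restore_time])
        self.time = restore_time
        # Checkpoints taken after the restore point are no longer valid.
        self.checkpoints = {t: s for t, s in self.checkpoints.items() if t <= restore_time}
        return True


sim = LocalSimulation()
for v in [1, 2, 3, 4, 5]:
    sim.advance(v)
    if sim.time % 2 == 0:
        sim.checkpoint()
rolled_back = sim.on_mismatch(3)
print(rolled_back, sim.time, sim.state)  # True 2 3
```

After restoring the checkpoint at time 2, the simulation would still have to be re-executed up to the mismatch time before continuing with actual inputs as in step (e4); that part is omitted here.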

More preferably, the step (b) comprises:

(f1) determining whether the actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which said at least one local simulation among the plurality of local simulations is executed by using the expected inputs as inputs of simulation for local design object corresponding to the local simulation while omitting a communication and a synchronization with other local simulations;

(f2) if the actual outputs do not match with the expected outputs in the step (f1), informing other local simulations of an occurrence of mismatch and a corresponding simulation time t_d at which the mismatch occurred,

(f3) in each of the local simulations, comparing the corresponding simulation time t_d, which is informed by the local simulation in which the mismatch occurred in the step (f2), to a current simulation time t_c of each of the local simulations,

(f4) setting the current simulation time of every local simulation identical by either determining that a roll-back is needed if the time t_c is later than the time t_d (i.e., t_c>t_d) and performing a roll-back, or determining that a roll-forward is needed if the time t_c is earlier than the time t_d (i.e., t_c<t_d) and performing a roll-forward, and

(f5) executing every local simulation by using actual inputs as inputs of simulation for local design object corresponding to the local simulation, wherein the actual inputs are inputs obtained from an execution in which the every local simulation is executed while performing a communication and a synchronization with other local simulations.

Flow charts for this embodiment are shown in FIGS. 26-29, especially FIGS. 26 and 27.
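As a non-limiting illustration, the comparison of steps (f3) and (f4) may be written as a small planning routine that decides, for each local simulation, whether a roll-back or a roll-forward brings its current time t_c to the mismatch time t_d. The function name, the simulation names, and the "none" label for t_c equal to t_d are assumptions of this sketch.

```python
def align_local_simulations(current_times, t_d):
    """Decide per local simulation how to reach the mismatch time t_d.

    current_times maps a (hypothetical) local simulation name to its t_c.
    """
    plan = {}
    for name, t_c in current_times.items():
        if t_c > t_d:
            plan[name] = "roll-back"       # ran past the mismatch time
        elif t_c < t_d:
            plan[name] = "roll-forward"    # has not yet reached the mismatch time
        else:
            plan[name] = "none"            # already at t_d
    return plan


plan = align_local_simulations({"sim0": 120, "sim1": 80, "sim2": 100}, t_d=100)
print(plan)  # {'sim0': 'roll-back', 'sim1': 'roll-forward', 'sim2': 'none'}
```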

More preferably, the step (b) comprises:

(g1) determining whether the actual outputs match with the expected outputs obtained in the step (a), wherein the actual outputs are outputs obtained from an execution in which said at least one local simulation among the plurality of local simulations is executed by using said actual inputs, which is obtained through a communication and a synchronization with other local simulations, as inputs of simulation for local design object corresponding to the local simulation; and

(g2) executing said at least one local simulation among the plurality of local simulations by using again the expected inputs and the expected outputs of the step (a) from a specific time point while omitting a communication and a synchronization with other local simulations, wherein the specific time point is when the actual outputs have matched the expected outputs at least a predetermined number of times.

Flow charts for this embodiment are shown in FIGS. 26-29. An example of the predetermined number of times is three. For details, please refer to the description of FIGS. 26 and 28.
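As a non-limiting illustration, the return to independent execution in steps (g1) and (g2) may be sketched with a match counter. The sketch assumes that the matches must be consecutive and that independent execution resumes at the time point following the last required match; neither detail is stated in the steps above, and the function name is hypothetical.

```python
def resume_point(actual_outputs, expected_outputs, required_matches=3):
    """Return the time index at which independent execution may resume, or None.

    Assumption: the required matches must be consecutive.
    """
    streak = 0
    for t, (actual, expected) in enumerate(zip(actual_outputs, expected_outputs)):
        streak = streak + 1 if actual == expected else 0   # (g1) compare outputs
        if streak >= required_matches:
            return t + 1                                   # (g2) switch back from here
    return None


actual = [9, 1, 2, 9, 4, 5, 6, 7]
expected = [0, 1, 2, 3, 4, 5, 6, 7]
print(resume_point(actual, expected))  # 7
```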

More preferably, the step of obtaining the expected inputs and the expected outputs for said at least one local simulation is achieved by:

while executing the distributed parallel simulation for the predetermined model which has the specific abstraction level,

in said at least one local simulation, simultaneously simulating a first model and a local design object corresponding to the local simulation, wherein the first model is the same as the predetermined model except that an abstraction level of the first model is higher than an abstraction level of the predetermined model.

An example of abstraction levels is shown in FIG. 13. Also, this embodiment relates to FIG. 12.

More preferably, the step of obtaining the expected inputs and the expected outputs for said at least one local simulation comprises:

prior to execution of the distributed parallel simulation for the predetermined model which has the specific abstraction level,

while performing a simulation with a first model which is the same as the predetermined model except that an abstraction level of the first model is higher than an abstraction level of the predetermined model,

storing input information and output information for at least one design object which exist in the first model.

An example of abstraction levels is shown in FIG. 13. Also, this embodiment relates to FIG. 12.
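As a non-limiting illustration, storing input information and output information for design objects while simulating the first model may be sketched as follows. The design objects are modeled as hypothetical callables connected in a simple chain, which is an assumption of this sketch; real models have arbitrary connectivity.

```python
def front_end_simulation(design_objects, stimuli):
    """Simulate a chain of design objects and record each object's inputs and outputs.

    design_objects maps a (hypothetical) object name to a callable; the output of
    one object feeds the next in insertion order.
    """
    trace = {name: {"inputs": [], "outputs": []} for name in design_objects}
    for value in stimuli:
        for name, behave in design_objects.items():
            trace[name]["inputs"].append(value)    # later usable as expected input
            value = behave(value)
            trace[name]["outputs"].append(value)   # later usable as expected output
    return trace


first_model = {"A": lambda x: x + 1, "B": lambda x: x * 2}
trace = front_end_simulation(first_model, [1, 2, 3])
print(trace["B"])  # {'inputs': [2, 3, 4], 'outputs': [4, 6, 8]}
```

The recorded trace for an object can then serve as the expected inputs and expected outputs of the local simulation that executes the corresponding refined design object.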

More preferably, the abstraction level of said first model is RTL (Register Transfer Level) or mixed level of RTL and GL (Gate level) if said specific abstraction level of said predetermined model is GL (Gate Level), or

wherein the abstraction level of said first model is ESL (Electronic System Level) or mixed level of ESL and RTL if said specific abstraction level of said predetermined model is RTL.

An example of abstraction levels is shown in FIG. 13. This is simply an example, and the abstraction level is not limited thereto. Also, a combination of the abstraction levels is possible.

(3) Advantageous Effects

As already explained, an advantageous effect of the present invention is to reduce the total verification time and increase the verification efficiency by quickly executing the verification of a model at a lower level of abstraction using the simulation result of a model at a higher level of abstraction when a complex design starts at ESL.

Another advantageous effect of present invention is to provide a systematic verification method through the progressive refinement from the system level to the gate level such that the high execution speed and 100% visibility are provided from simulation-based verification to physical-prototype-based verification.

Still, another advantageous effect of present invention is to provide a systematic verification method which can solve the degradation of verification performance as the progressive refinement goes down to the low level of abstraction.

Still, another advantageous effect of present invention is to allow the entire design and verification process using progressive refinement from the high level of abstraction to the low level of abstraction in a systematic and automatic way.

Still, another advantageous effect of present invention is to provide a verification method in which the model consistency is effectively maintained among two or more models existed at different levels of abstraction.

Still, another advantageous effect of present invention is to provide an efficient verification method through progressive refinement, in which a model at the low level of abstraction is efficiently verified using a model at the high level of abstraction as a reference model.

Still, another advantageous effect of present invention is to provide a method for increasing the speed of distributed parallel simulation by eliminating synchronization overhead and communication overhead.

Still, another advantageous effect of present invention is to provide a systematic and consistent fast debugging method for correcting design errors in the entire verification phase from simulation-based verification to physical prototype-based verification.

Still, another advantageous effect of present invention is to provide a high visibility and controllability throughout virtual prototypes or simulators for debugging the incorrect behavior of physical prototype in which DUV is operated in the in-circuit or in-system environment where DUV has one or more user clocks.

As the present invention may be embodied in several forms without departing from the spirit or essential characteristics thereof, it should also be understood that the above-described embodiments are not limited by any of the details of the foregoing description, unless otherwise specified, but rather should be construed broadly within its spirit and scope as defined in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of an example of the design verification apparatus in this present invention.

FIG. 2 is another schematic diagram of an example of the design verification apparatus in this present invention.

FIG. 3 is a schematic diagram of an example of the hierarchy of an ESL model and its corresponding hierarchy of a RTL model.

FIG. 4 is a schematic diagram of an example of the hierarchy of a RTL model and its corresponding hierarchy of a GL model.

FIG. 5 is a schematic diagram of an example of the execution of a distributed parallel simulation whose environment consists of two or more computers connected by a computer network.

FIG. 6 is a schematic diagram of an example of the execution of a time-sliced parallel simulation, where t-DCP is obtained at the front-end simulation with a model of higher level of abstraction, and the back-end simulation is executed temporally in parallel.

FIG. 7 is a schematic diagram of an example of the execution of a distributed-processing-based parallel simulation, where s-DCP is obtained at the front-end simulation with a model of higher level of abstraction, and the back-end simulation is executed spatially in parallel.

FIG. 8 is a schematic diagram of an example of the components constituting the instrumentation code added for a parallel-processing-based distributed simulation in this present invention.

FIG. 9 is a schematic diagram of an example of a cycle-accurate bus operation in the unit of signal at RTL and its corresponding cycle-accurate bus operation in the unit of transaction at TL.

FIG. 10 is a schematic diagram of an example showing design objects in the ESL model and its corresponding design objects in the RTL model depicted in FIG. 3.

FIG. 11 is a schematic diagram of an example of a generation of design objects DO_t_mixed(i) at mixed level of abstraction such that each of the design objects in the ESL model depicted in FIG. 10 is replaced by a corresponding design object in the RTL model.

FIG. 12 is a schematic diagram of an example of an execution of a distributed-processing-based parallel simulation with a RTL model as back-end simulation by using the design state information collected at one or more simulation times and periods when six independent parallel front-end simulations with six mixed design objects DO_t_mixed(1), DO_t_mixed(2), . . . , DO_t_mixed(6) depicted in FIG. 11 are being executed.

FIG. 13 is a schematic diagram of an example of the design and verification process using progressive refinement from the initial level of abstraction to the final level of abstraction.

FIG. 14 is a schematic diagram of an example of a progressive refinement process from a RTL model to a GL model.

FIG. 15 is a schematic diagram of an example of a distributed-processing-based parallel simulation or time-sliced parallel simulation with a model at lower level of abstraction using s-DCP or t-DCP when the verification progresses from the verification with a TL model to the verification with a GL model through the verification with a RTL model by progressive refinement.

FIG. 16 is a schematic diagram of an example of a part of a model for the simulation method in this present invention.

FIG. 17 is a schematic diagram of an example of a part of the instrumentation code added to the model partially depicted in FIG. 16 for a distributed-processing-based parallel simulation by the verification software in this present invention.

FIG. 18 is a schematic diagram of an example of another part of the instrumentation code added to the model partially depicted in FIG. 16 for a distributed-processing-based parallel simulation by the verification software in this present invention.

FIG. 19 is a schematic diagram of an example of another part of the instrumentation code added to the model partially depicted in FIG. 16 for a distributed-processing-based parallel simulation by the verification software in this present invention.

FIG. 20 is a schematic diagram of an example of a combined method of distributed-processing-based parallel execution/singular execution.

FIG. 21 is a schematic diagram of an example of the situation in which the synchronization overhead and communication overhead between a simulator and a hardware-based verification platform of simulation acceleration is reduced by distributed-processing-based parallel execution in this present invention.

FIG. 22 is a schematic diagram of an example of logical connection topology among two or more local simulators installed in two or more computers for a distributed-processing-based parallel simulation in this present invention.

FIG. 23 is a schematic diagram of an example of a distributed parallel simulation environment which consists of two or more computers and two or more simulators.

FIG. 24 is an example of the overall flow diagram of the conventional distributed parallel simulation.

FIG. 25 is an example of the overall flow diagram of the distributed-processing-based parallel simulation in this present invention.

FIG. 26 is an example of the overall flow diagram for the execution of the local simulation for the execution of distributed-processing-based parallel simulation in this present invention.

FIG. 27 is another example of the overall flow diagram for the execution of the local simulation for the execution of distributed-processing-based parallel simulation in this present invention.

FIG. 28 is an example of the overall flow diagram for the execution of a local simulation by a local simulator in the star connection topology.

FIG. 29 is an example of the overall flow diagram of the SW server module in a central computer in the star connection topology.

FIG. 30 is a schematic diagram of an example of pseudo code for the behavior of some components in FIG. 8.

FIG. 31 is a schematic diagram of an example of pseudo code for the behavior of the other components in FIG. 8.

FIG. 32 is a schematic diagram of another example of components for the behavior of the instrumentation code of the distributed-processing-based parallel simulation in this present invention.

FIG. 33 is another example of the entire flow diagram for a distributed-processing-based parallel simulation in this present invention.

FIG. 34 is a schematic diagram of an example of the design verification apparatus in this present invention.

FIG. 35 is a schematic diagram of the system structure of ChipScope and ILA core from Xilinx.

FIG. 36 is a schematic diagram of an example of the instrumentation circuit for debugging or instrumentation code for debugging including a parallel-load/serial-scanout register added to a user design.

FIG. 37 is a schematic diagram of an example of the instrumentation circuit for debugging or instrumentation code for debugging including a parallel-load register added to a user design.

FIG. 38 is a schematic diagram of an example of the instrumentation circuit for debugging or instrumentation code for debugging including a two-level parallel-load/serial-scanout register added to a user design.

FIG. 39 is a schematic diagram of an example of the instrumentation circuit for debugging or instrumentation code for debugging including CAPTURE_VIRTEX primitive for using the readback capture capability of Xilinx.

FIG. 40 is a schematic diagram of another example of the design verification apparatus in this present invention.

FIG. 41 is a schematic diagram of another example of the design verification apparatus in this present invention.

FIG. 42 is a schematic diagram of an example of situations in which a debugging reference point in time is located in a debugging window.

FIG. 43 is a schematic diagram of another example of the instrumentation circuit for debugging or instrumentation code for debugging including a parallel-load/serial-scanout register added to a user design.

EXPLANATION OF SYMBOL NUMBERS IN THE FIGURES

32: Verification software
34: HDL simulator
35: Computer
37: ESL model
38: Design object for design block
39: Design objects for design module
40: RTL model
42: On-chip bus
50: Expected input
52: Expected output
53: Design object containing DUV and TB at higher level of abstraction
54: Control module of run-with-expected-input&output/run-with-actual-input&output
56: Selection module of expected-input/actual-input
58: Compare module of expected-output/actual-output
59: Compare module of expected-input/actual-input
60: s-DCP generation/save module
62: Instrumentation code added to a design under verification by the verification software
64: Communication and synchronization module for distributed parallel simulation
333: SW server module existed in a central computer, which is responsible for controlling and connecting the local simulations of distributed parallel simulation in the star connection topology
343: Simulator executing a local simulation in an environment of distributed parallel simulation
353: central computer
354: peripheral computer
370: GL model
380: A specific design object in a RTL model
381: Another specific design object in a RTL model
382: Still another specific design object in a RTL model
383: Still another specific design object in a RTL model
384: Still another specific design object in a RTL model
385: Still another specific design object in a RTL model
387: Design object representing a design module existed in a GL model, but not in a RTL model
404: A part of a model for design verification executed in a local simulator
420: On-chip bus design object including a bus arbiter and an address decoder in a RTL model
606: s-DCP save buffer
644: Local simulation run-time module for distributed parallel simulation
646: Communication and synchronization module for simulation acceleration
648: Hardware-based verification platform
650: Simulation acceleration run-time module
660: Design object in a model partitioned to be executed in a local simulator of distributed parallel simulation
670: VPI/PLI/FLI
674: Socket API
676: TCP/IP socket
678: Device API
680: Device Driver
682: HAL (Hardware Abstraction Layer)
684: Giga-bit LAN card
827: Target board
828: Debugging interface cable
832: In-circuit/in-system debugging software
834: HDL simulator
835: Computer
838: FPGA or non-memory IC chip
840: Other devices on board
842: DUV
843: Design object
844: Trigger module
846: CAPTURE_VIRTEX primitive module
848: Flipflop or latch in a design object that requires 100% visibility
850: Parallel-load/serial-scanout register
852: Parallel-load register
854: Two-level parallel-load/serial-scanout register
855: Model checker
856: Interface between a target board and a computer
857: Software executed in a server computer in simulation acceleration mode/mixed simulation acceleration mode
858: Design block implemented in an FPGA
860: Computer network
868: Transactor
880: State information and input information saving and obtaining controller including a controller for loading time of a two-level parallel-load/serial-scanout register
882: Clock domain of user clock 1 in a two-level parallel-load/serial-scanout register
884: Clock domain of user clock 2 in a two-level parallel-load/serial-scanout register
886: Clock domain of user clock m in a two-level parallel-load/serial-scanout register
890: State information and input information saving and obtaining controller including a controller for loading time of a parallel-load register
892: Clock domain of user clock 1 in a parallel-load register
894: Clock domain of user clock 2 in a parallel-load register
896: Clock domain of user clock m in a parallel-load register
900: State information and input information saving and obtaining controller including a controller for loading time of a parallel-load register with a single clock
902: Instrumentation circuit for debugging, or instrumentation code for debugging
904: Embedded memory for saving the input information
906: Two-level parallel-load/serial-scanout register
908: Serial-scanout register of two-level parallel-load/serial-scanout register
910: Binary counter
912: Control FSM for saving and obtaining the state information and input information
914: JTAG macro

DETAILED DESCRIPTION OF THE INVENTION

The accompanying drawings, which are included to provide a further understanding of the invention and which constitute a part of the specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention.

FIG. 1 is a schematic diagram of an example of the design verification apparatus in this present invention.

In FIG. 1, reference numeral 32 denotes verification software, reference numeral 34 denotes HDL simulator, and reference numeral 35 denotes computer.

FIG. 2 is another schematic diagram of an example of the design verification apparatus in this present invention.

In FIG. 2, a plurality of computers 35 are connected to a computer network.

FIG. 3 is a schematic diagram of an example of the hierarchy of an ESL model and its corresponding hierarchy of a RTL model.

The design verification apparatus that can be used for applying the design verification method in the present invention can consist of verification software 32 and one or more computers 35 on which one or more simulators 34 are installed. Another design verification apparatus that can be used for applying the design verification method in the present invention can consist of verification software, one or more computers on which one or more simulators are installed, and one or more simulation accelerators (FPGA boards having simulation acceleration capability are regarded as simulation accelerators), hardware emulators, or physical prototyping boards having one or more FPGA chips or ASIC chips (hereafter, prototyping boards in short). We will call simulation accelerators, hardware emulators, and prototyping boards hardware-based verification platforms. The verification software runs on the computer, and if there are two or more computers, they are connected by a network (for example, the Internet or gigabit Ethernet) so that files or data are transferred among them through the network. The one or more simulators for design verification can consist of the various simulators mentioned before. For example, they can be made of event-driven simulators only (in this case, the distributed parallel simulation becomes PDES (Parallel Discrete Event Simulation)), cycle-based simulators and event-driven simulators, cycle-based simulators and transaction-based simulators, transaction-based simulators only, event-driven simulators and transaction-based simulators, event-driven simulators and cycle-based simulators and transaction-based simulators, instruction-level simulators and event-driven simulators, instruction-level simulators and cycle-based simulators, instruction-level simulators and event-driven simulators and cycle-based simulators, etc.

If said two or more simulators consist of event-driven simulators and cycle-based simulators, the distributed parallel simulation runs in co-simulation mode in such a way that some design objects are run by event-driven simulation and other design objects are run by cycle-based simulation. Or, if said two or more simulators consist of event-driven simulators, cycle-based simulators, and transaction-based simulators, the distributed parallel simulation runs in co-simulation mode in such a way that some design objects are run by event-driven simulation, some other design objects are run by cycle-based simulation, and the remaining design objects are run by transaction-based simulation. In other words, the distributed parallel simulation runs in co-simulation mode if said two or more simulators consist of different kinds of simulators. Moreover, one or more hardware-based verification platforms can be used together with different kinds of simulators in the distributed parallel simulation for running in co-simulation mode (in this case, we will call this co-simulation too, even though it can also be called co-emulation).

In FIG. 3, ESL model 37 and RTL model 40 are shown.

In the case of ESL model 37, design objects 38 for design blocks are connected to on-chip bus 42.

In the case of RTL model 40, design objects 39 for design modules are shown. Another specific design object 381 in a RTL model, still another specific design object 382 in a RTL model, still another specific design object 383 in a RTL model, still another specific design object 384 in a RTL model, and still another specific design object 385 in a RTL model are shown. Each of the specific design objects 381, 382, 383, 384, 385 consists of design objects 39 for design modules. Reference numeral 420 denotes an on-chip bus design object including a bus arbiter and an address decoder in a RTL model.

FIG. 4 is a schematic diagram of an example of the hierarchy of a RTL model and its corresponding hierarchy of a GL model. In this example, a design object 387, which has boundary scan cells, shows the additional hierarchy in the GL model.

In FIG. 4, GL model 370 is shown. Reference numeral 387 denotes a design object representing a design module that exists in a GL model but not in a RTL model.

Regarding FIG. 3 and FIG. 4, in this present invention, the progressive refinement process from ESL to GL is considered as a two-step process: at the first step an implementable RTL model (hereafter it will be called a RTL model) is obtained from a transaction-level model (hereafter it will be called an ESL model), and at the second step a GL model (a GL model is a gate-level netlist which represents an interconnection structure of cells in a specific implementation library with which the placement and routing can be carried out) is obtained from a RTL model. We will call the first refinement step an ESL-to-RTL design, and the second refinement step a RTL-to-GL design. Also, we will call each of the various models existing at different abstraction levels in the progressive refinement process an "equivalent model at different abstraction level."

In general, it is important to have the same or a similar hierarchical structure between a model at higher level of abstraction, MODEL_DUV(HIGH), and a model at lower level of abstraction, MODEL_DUV(LOW) (refer to FIG. 3 and FIG. 4). In SOC design, as the complexity of DUV is high, models at different levels of abstraction naturally have the same or a similar hierarchical structure from the top hierarchy down to a certain hierarchy. In this situation, there are corresponding design objects among the models. We will call this a partial hierarchy matching relation among the models. Therefore, a design by progressive refinement can be thought of as the process in which one or more design objects in a model at higher level of abstraction are replaced by their corresponding design objects in a model at lower level of abstraction that have said partial hierarchy matching relation. At the final stage of the refinement process for a specific design object B(i)_refined, verification of the correct refinement of B(i)_refined is needed. But it is possible that other design objects are not refined yet. In such a case, the design object B(i)_refined replaces the corresponding design object B(i)_abst in a model at high level of abstraction MODEL_DUV(HIGH) to make a model at mixed level of abstraction MODEL_DUV(MIXED) (we will call this kind of progressive refinement "partial refinement", whereas the refinement process from MODEL_DUV(HIGH) to MODEL_DUV(LOW) is called "complete refinement"), and MODEL_DUV(MIXED) is executed for comparing its result with that of MODEL_DUV(HIGH). In a model MODEL_DUV(MIXED) there are the already refined design object B(i)_refined and un-refined design objects B(k)_abst. But as the input/output ports of B(i)_refined have a different abstraction level from the ports of B(k)_abst, an additional interface may be needed to connect those ports between B(i)_refined and B(k)_abst.
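The partial refinement step described above can be illustrated with a minimal Python sketch. The dictionary representation of a model, the function names make_mixed_model and connections_needing_interface, and the example connections are assumptions made for illustration only; they do not appear in the specification.

```python
def make_mixed_model(model_high, refined_object, i):
    """Build MODEL_DUV(MIXED) by replacing B(i)_abst with B(i)_refined.

    model_high maps a design-object index to its abstract design object.
    The original model is left untouched so that MODEL_DUV(HIGH) can still
    be simulated and compared against the mixed model.
    """
    if i not in model_high:
        raise KeyError(f"no design object B({i}) in MODEL_DUV(HIGH)")
    mixed = dict(model_high)
    mixed[i] = refined_object
    return mixed


def connections_needing_interface(connections, refined_ids):
    """Return the connections that cross the abstraction boundary.

    A connection (a, b) needs an additional interface when exactly one of
    its two endpoints has been refined.
    """
    return [(a, b) for a, b in connections
            if (a in refined_ids) != (b in refined_ids)]


# Hypothetical three-object model in which only B(2) has been refined.
model_high = {1: "B(1)_abst", 2: "B(2)_abst", 3: "B(3)_abst"}
model_mixed = make_mixed_model(model_high, "B(2)_refined", 2)
connections = [(1, 2), (2, 3), (1, 3)]
boundary = connections_needing_interface(connections, {2})
print(model_mixed)
print(boundary)
```

In this sketch the connection between B(1) and B(3) needs no interface because both endpoints remain at the higher level of abstraction.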

For example, in the case of ESL-to-RTL refinement, transactors are needed because the port at ESL operates at transaction level on transactions, whereas the port at RTL operates at cycle level on pins or signals. The transactors can differ depending on the degree of abstraction of the transaction; for example, if a transaction at ESL is cycle accurate, the transactor may be simple, and if a transaction is cycle-count accurate, then the transactor may be relatively complex. Also, even though there is no need to have an extra interface between the input/output port at RTL and the input/output port at GL because both are pins or signals, some timing adjustor may be needed to generate some signals with correct timing at the port boundaries if the verification at GL is to verify the timing (the delay values used in the timing adjustor can be obtained by analyzing SDF or delay parameters in the library cells, by performing a very short gate-level timing simulation using SDF or a static timing analysis, or both).
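As a rough illustration of what a simple transactor does, the following Python sketch expands a write transaction into per-cycle signal values and rebuilds the transaction from those cycles. The bus protocol (one data beat per cycle, signals named valid, addr and wdata, one trailing idle cycle) is entirely hypothetical and is not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WriteTransaction:
    """Transaction-level view of a burst write (hypothetical format)."""
    address: int
    data: List[int]


def expand_write(txn: WriteTransaction) -> List[Dict[str, int]]:
    """Transaction level to cycle level: one signal record per clock cycle."""
    cycles = []
    for beat, word in enumerate(txn.data):
        cycles.append({"valid": 1, "addr": txn.address + beat, "wdata": word})
    # One idle cycle marks the end of the burst in this assumed protocol.
    cycles.append({"valid": 0, "addr": 0, "wdata": 0})
    return cycles


def collect_write(cycles: List[Dict[str, int]]) -> WriteTransaction:
    """Cycle level to transaction level: rebuild the burst from active cycles."""
    active = [c for c in cycles if c["valid"]]
    if not active:
        raise ValueError("no active cycle found")
    return WriteTransaction(address=active[0]["addr"],
                            data=[c["wdata"] for c in active])


txn = WriteTransaction(address=0x40, data=[7, 8, 9])
cycles = expand_write(txn)
rebuilt = collect_write(cycles)
print(cycles)
print(rebuilt)
```

A transactor for a cycle-count-accurate transaction would additionally have to decide in which cycle each beat occurs, which is one reason it may be more complex.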

The correctness of the design can be verified by comparing the simulation result of a model at lower level of abstraction, MODEL_DUV(LOW), obtained by a complete refinement process, with the simulation result of a model at higher level of abstraction, MODEL_DUV(HIGH), or with the simulation result of a model at mixed level of abstraction, MODEL_DUV(MIXED), if necessary. However, the simulation speed of a model at mixed level of abstraction MODEL_DUV(MIXED) is lower than that of a model at higher level of abstraction MODEL_DUV(HIGH), and the simulation speed of a model at lower level of abstraction MODEL_DUV(LOW) is even lower than that of a model at mixed level of abstraction MODEL_DUV(MIXED). This simulation speed degradation is one of the main problems of verification in the progressive refinement process.

During the partial refinement or complete refinement, the speed of simulation with MODEL_DUV(MIXED) or MODEL_DUV(LOW) drops significantly compared to the speed of simulation with MODEL_DUV(HIGH), and this results in an increase of the total verification time. For example, the speed of a RTL model is 10 to 10,000 times slower than that of an ESL model, and the speed of a GL model is 100 to 300 times slower than that of a RTL model.
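To give a feel for the combined effect of the two slowdown ranges quoted above, the short Python calculation below multiplies them. It assumes that the two factors simply compose multiplicatively and that the extremes of both ranges can coincide; the specification does not state this, so the result is only an order-of-magnitude illustration.

```python
def combined_slowdown(first_range, second_range):
    """Multiply two (low, high) slowdown ranges end to end."""
    low = first_range[0] * second_range[0]
    high = first_range[1] * second_range[1]
    return low, high


ESL_TO_RTL = (10, 10_000)   # RTL is 10 to 10,000 times slower than ESL
RTL_TO_GL = (100, 300)      # GL is 100 to 300 times slower than RTL

gl_vs_esl = combined_slowdown(ESL_TO_RTL, RTL_TO_GL)
print(gl_vs_esl)  # (1000, 3000000)
```

Under these assumptions a GL simulation could be roughly a thousand to a few million times slower than the corresponding ESL simulation.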

In the systematically progressive refinement verification method (hereafter, it will be called SPR in short) proposed in this present invention, a RTL verification run with an implementable RTL model at RTL can be executed in parallel or partially (partial execution is possible by the incremental simulation method which will be explained later) by using the result of ESL verification runs with an ESL model at ESL or the result of ESL/RTL verification runs with a mixed ESL/RTL model MODEL_DUV(MIXED)_i at mixed ESL/RTL level of abstraction, which is made in the progressive refinement process. Moreover, in the SPR method proposed in this present invention, a GL verification run with a GL model at GL can be executed in parallel or partially (partial execution is possible by the incremental simulation method which will be explained later) by using the result of RTL verification runs with a RTL model at RTL or the result of RTL/GL verification runs with a mixed RTL/GL model MODEL_DUV(MIXED)_i at mixed RTL/GL level of abstraction, which is made in the progressive refinement process. Also, in the SPR method proposed in this present invention, an ESL verification run with an ESL model at a specific transaction level can be executed in parallel or partially (partial execution is possible by the incremental simulation method which will be explained later) by using the result of ESL verification runs with an ESL model at a higher transaction level or the result of ESL verification runs with a mixed ESL model MODEL_DUV(MIXED)_i at mixed high-transaction and low-transaction level of abstraction, which is made in the progressive refinement process.

The verification runs mentioned above are basically executed by simulation using one or more simulators, but it is also possible to execute them by simulation acceleration using one or more simulation accelerators, hardware emulators, or prototyping boards together with simulators. As simulation acceleration simply increases the speed of simulation by using a hardware-based verification platform such as simulation accelerators, hardware emulators, or prototyping boards (in this case, the prototyping boards are controlled by software to operate in simulation acceleration mode, and are not in the in-circuit or in-system environment), we will include simulation acceleration in simulation too in this present invention. Also, as we do not consider any formal verification techniques in this present invention, verification in this present invention actually means simulation. Therefore, in this present invention, verification can be thought of as a synonym for simulation.

FIG. 5 is a schematic diagram of an example of the execution of a distributed parallel simulation whose environment consists of two or more computers connected to a computer network.

In FIG. 5, a plurality of computers 35 are connected to a computer network. Reference numeral 343 denotes a simulator executing a local simulation in an environment of distributed parallel simulation. Computer 35 includes simulator 343 and verification software 32. Simulator 343 may include specific design objects 380, 381, 382, 383, 384, 385 in a RTL model.

FIG. 6 is a schematic diagram of an example of the execution of a time-sliced parallel simulation, where t-DCP is obtained at the front-end simulation with a model of higher level of abstraction, and the back-end simulation is executed temporally in parallel.

FIG. 6 shows an example of a simulation method for a predetermined model which has a specific abstraction level.

A first model (i.e., the front-end simulation in FIG. 6) is set; the first model is the same as the predetermined model except that the abstraction level of the first model is higher than the abstraction level of the predetermined model.

In other words, the target to be actually verified is the predetermined model (i.e., the back-end simulation in FIG. 6), but the first model is set in order to efficiently verify the predetermined model.

Dynamic information is obtained while simulating the first model. At this time, a plurality of simulations may be executed as shown in FIG. 12 which will be described later, although one simulation for the first model is shown in FIGS. 6 and 7.

The local simulation is, for example, each of the simulations for local design objects in the case of executing a parallel simulation by spatially dividing the predetermined model into a plurality of local design objects. From the dynamic information obtained for the first model, expected inputs and expected outputs might be obtained.

By using the expected inputs and the expected outputs, a plurality of local simulations regarding the predetermined model might be executed. In other words, the result of front-end simulation may be used in the back-end simulation.

Although the words `front-end` and `back-end` are used, they do not always mean that the front-end simulation is executed earlier than the back-end simulation. For example, the front-end simulation and the back-end simulation might be executed almost at the same time.

Using the expected inputs and the expected outputs makes the simulation faster because it may allow a local simulation to skip the communication and the synchronization with the other local simulations, which are the simulations for the other local design objects.

Regarding FIG. 6 (also regarding FIG. 7), the embodiment of the present invention may also have the following procedures.

The procedure includes:

determining whether at least one local simulation among the plurality of local simulations can be executed independently of other local simulations by using the expected inputs and the expected outputs while omitting a communication and a synchronization with other local simulations; and

executing said at least one local simulation independently of other local simulations if it is determined that said at least one local simulation can be independently executed in said step, or executing said at least one local simulation by using actual inputs which are obtained through the communication and the synchronization with the other local simulations if it is determined that said at least one local simulation cannot be independently executed in said step.

If the local simulations can be executed independently of other local simulations, the overall speed of the simulation increases because it means that no communication and no synchronization might be needed.
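The two steps above can be sketched for a single local simulation as follows. This is a simplified illustration under stated assumptions, not the procedure of the specification: the design object is modeled as a hypothetical step function returning a new state and an output, independence is judged per cycle by comparing the produced output with the expected output, and fetch_actual_input stands in for the communication and synchronization with other local simulations. All names are invented for the example.

```python
def run_local_simulation(step, initial_state, expected_inputs,
                         expected_outputs, fetch_actual_input):
    """Run one local simulation, staying independent while predictions hold.

    Returns the produced outputs and the number of cycles that needed
    communication with other local simulations.
    """
    state = initial_state
    independent = True
    outputs = []
    sync_count = 0
    for t, (exp_in, exp_out) in enumerate(zip(expected_inputs,
                                              expected_outputs)):
        if independent:
            next_state, out = step(state, exp_in)
            if out == exp_out:
                # Prediction confirmed: commit without any communication.
                state = next_state
                outputs.append(out)
                continue
            # Prediction failed: discard the uncommitted cycle and switch
            # to actual inputs for this and all later cycles.
            independent = False
        actual_in = fetch_actual_input(t)
        sync_count += 1
        state, out = step(state, actual_in)
        outputs.append(out)
    return outputs, sync_count


def accumulator(state, value):
    """Hypothetical local design object: a running sum."""
    new_state = state + value
    return new_state, new_state


actual_inputs = [1, 2, 3, 4]
fetch = lambda t: actual_inputs[t]

# Prediction matches the local behavior: no synchronization is needed.
outputs_ok, syncs_ok = run_local_simulation(
    accumulator, 0, [1, 2, 3, 4], [1, 3, 6, 10], fetch)

# Prediction deviates from cycle 2 on: fall back to actual inputs.
outputs_bad, syncs_bad = run_local_simulation(
    accumulator, 0, [1, 2, 3, 4], [1, 3, 7, 11], fetch)
print(outputs_ok, syncs_ok)
print(outputs_bad, syncs_bad)
```

The speed benefit comes from the first case: as long as the predictions hold, the loop never calls fetch_actual_input.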

In the SPR verification method in the present invention, the parallel or partial run of simulation at the low level of abstraction is carried out by using the simulation results at the high level of abstraction or the high/low mixed level of abstraction in the progressive refinement process, or by using the simulation results at the same level of abstraction which are obtained from the previous earlier simulation runs. Rarely, in the SPR verification method in the present invention, the parallel or partial run of simulation at the high level of abstraction can be carried out by using the simulation results at the low level of abstraction, too (this is the case when a design iteration occurs). In summary, the important point in this present invention is to perform a present simulation fast by using the result of a previous earlier simulation. Normally, the present simulation is carried out at a lower level of abstraction than that of the previous earlier simulation. But, in rare cases, the present simulation is carried out at a higher level of abstraction than that of the previous earlier simulation. Moreover, there can be one or more design modifications between the present simulation and the previous earlier simulation.

In the case of the previous earlier simulation at the higher level of abstraction than that of the present simulation, there are four methods which are explained below in detail. First, the design state information (defined later, and hereafter state information in short) of some design objects in a model at the high level of abstraction saved at one or more specific simulation times or periods during the simulation run with a model at the high level of abstraction is used in the simulation at the low level of abstraction (we will call this "usage method-1"). Second, the design state information of some design objects in one or more models at the mixed high/low level of abstraction saved at one or more specific simulation times or periods during two or more simulation runs with models at the mixed high/low level of abstraction is used in the simulation at the low level of abstraction (we will call this "usage method-2"). Third, the input/output information (defined later) of one or more design objects in a model at the high level of abstraction saved at the entire or partial simulation time during the simulation run with a model at the high level of abstraction is used in the simulation at the low level of abstraction (we will call this "usage method-3"). Fourth, the input/output information of one or more design objects in one or more models at the mixed high/low level of abstraction saved at the entire or partial simulation time during two or more simulation runs with models at the mixed high/low level of abstraction is used in the simulation at the low level of abstraction (we will call this "usage method-4").

Also, in the unique distributed-processing-based parallel execution method for distributed parallel simulation in this present invention, each of the local simulations executes not only its local design object, but also a complete model of DUV and TB at a higher level of abstraction or a complete model of DUV and TB optimized for faster simulation (for example, a model for cycle-based simulation is optimized for 10 times faster simulation than a model for event-driven simulation) on each of the local computers (by contrast, in traditional distributed parallel simulation, each of the local simulations executes its local design object only). The complete model of DUV and TB provides the dynamic information that is used as the expected inputs and expected outputs of the local design object, in order to eliminate the synchronization overhead and communication overhead with other local simulations of the distributed parallel simulation and thereby to increase the speed of each of the local simulations (a more detailed explanation will be given later). The dynamic information of a model or design object is the logic values of one or more signals, values of one or more variables, or constants at one or more specific simulation times or periods (the period can be the entire simulation time) in the model or design object during the simulation. An example of how to get the dynamic information during the simulation is to use Verilog built-in system tasks, $dumpvars, $dumpports, $dumpall, $readmemb, $readmemh, etc., or user-defined system tasks (more detail can be found in Korean patent application 10-2005-116706). The dynamic information can be saved in VCD, SHM, VCD+, FSDB, or a user-defined binary or text format.

The state information of a model is the dynamic information containing values of all flipflop output signals or variables, all latch output signals or variables and all combinational feedback signals or variables if there are any closed combinational feedback loops in the model at a specific simulation time (for example, at 29,100,511 nano-second simulation time) or for a specific simulation period (for example, 1 nano-second period from 29,100,200 nano-second to 29,100,201 nano-second). The state information of a design object is the dynamic information containing values of all flipflop output signals or variables, all latch output signals or variables and all combinational feedback signals or variables if there are any closed combinational feedback loops in the design object at a specific simulation time or for a specific simulation period.
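The definition above amounts to selecting a subset of the dynamic information. A minimal Python sketch of that selection follows; the signal names, the kind labels ("flipflop", "latch", "comb_feedback", "combinational") and the dictionary layout of the dynamic information are assumptions chosen for illustration.

```python
# Signal kinds whose values make up the state information.
STATE_KINDS = {"flipflop", "latch", "comb_feedback"}


def extract_state_information(signal_kinds, dynamic_info, time_ns):
    """Pick the state-holding signals out of the dynamic information.

    signal_kinds maps a signal name to its kind.
    dynamic_info maps a simulation time (in ns) to {signal name: value}.
    """
    values = dynamic_info[time_ns]
    return {name: values[name]
            for name, kind in signal_kinds.items()
            if kind in STATE_KINDS}


# Hypothetical design object with five signals.
signal_kinds = {
    "q0": "flipflop",
    "q1": "flipflop",
    "lat": "latch",
    "sum": "combinational",    # plain combinational logic, not state
    "loop": "comb_feedback",   # member of a closed combinational loop
}
dynamic_info = {
    29_100_511: {"q0": 1, "q1": 0, "lat": 1, "sum": 0, "loop": 1},
}

state = extract_state_information(signal_kinds, dynamic_info, 29_100_511)
print(state)
```

The purely combinational signal is left out because its value can be recomputed from the saved state and the inputs.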

The input information of a design object is the values of all inputs and inouts of the design object for a specific simulation time interval (this simulation time interval can be the entire simulation time). The output information of a design object is the values of all outputs and inouts of the design object for a specific simulation time interval (this simulation time interval can be the entire simulation time). The input/output information of a design object is the values of all inputs, outputs and inouts of the design object for a specific simulation time interval (this simulation time interval can be the entire simulation time).
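The following Python sketch shows one way such information could be selected from recorded port activity. It assumes that the third class of port is the bidirectional (inout) port, and the trace layout, port names, and the helpers `window` and `port_information` are hypothetical, introduced only for this example.

```python
def window(changes, t_start, t_end):
    """Restrict time-sorted (time, value) records to [t_start, t_end],
    keeping the value already in effect when the interval opens."""
    held, inside = None, []
    for t, v in changes:
        if t <= t_start:
            held = (t_start, v)      # latest value at or before t_start
        elif t <= t_end:
            inside.append((t, v))    # change that occurs inside the interval
    return ([held] if held is not None else []) + inside

def port_information(trace, directions, kinds, t_start, t_end):
    """Collect the windowed values of every port whose direction is in kinds."""
    return {port: window(trace[port], t_start, t_end)
            for port, kind in directions.items() if kind in kinds}

# Hypothetical design object with one port of each direction.
directions = {"a": "input", "d": "inout", "y": "output"}
trace = {
    "a": [(0, 0), (10, 1), (30, 0)],
    "d": [(0, 1), (50, 0)],
    "y": [(0, 0), (12, 1)],
}

input_info = port_information(trace, directions, {"input", "inout"}, 15, 40)
output_info = port_information(trace, directions, {"output", "inout"}, 15, 40)
print(input_info)
print(output_info)
```

Passing the full simulation range as the interval yields the information for the entire simulation time.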

The parallel simulation execution using a model at a specific level of abstraction in the present invention includes both distributed-processing-based parallel execution (hereafter called DPE) and time-sliced parallel execution (hereafter called TPE). In other words, DPE and TPE are the parallel simulation methods unique to the present invention. For the detailed explanation, t-DCP (Temporal Design Check Point) is defined first.

t-DCP (Temporal Design Check Point) is defined as the dynamic information of DUV, or of one or more design objects in DUV, which is necessary for starting the simulation of DUV or of those design objects from an arbitrary simulation time Ta rather than from simulation time 0. Therefore, the state information of a design object is a specific example of t-DCP. However, a model for simulation must have both DUV and TB. Therefore, to start the simulation at an arbitrary simulation time Ta other than simulation time 0, it is necessary to consider not only DUV but also TB. There are three ways to do this. First, TB is executed from simulation time 0, and DUV from Ta. To do so, if TB is reactive, the output information of DUV (which may need to have been saved in a previous simulation run) drives TB so that TB alone runs from simulation time 0 to Ta, and both DUV and TB are simulated together from Ta. If TB is non-reactive, it is possible to execute TB alone from simulation time 0 to Ta and to execute both TB and DUV from Ta. Second, TB is saved so that it can be restarted at simulation time Ta. That is, the TB state, which is the values of all variables and constants in TB at a specific simulation time or period, or the simulation state, is saved and restored later. However, to restart the execution from the saved TB state, the description style of TB must be restricted (for example, to a synthesizable TB), or some manual modification of TB may be needed. Third, an algorithmic input generation subcomponent in TB, for which it is difficult to start the execution at Ta, may be replaced with a pattern-based input generation subcomponent, for which it is easy to start the execution at Ta by using a pattern pointer.
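The third approach, combined with a saved DUV state, can be illustrated with a small Python sketch. The toy DUV (an 8-bit accumulator), the cycle-indexed pattern list, the choice of cycle 40 as Ta, and the function names `step` and `run` are all assumptions made for illustration; real t-DCP handling in an HDL simulator would differ.

```python
def step(state, stimulus):
    """Toy DUV: an 8-bit accumulator advancing by one clock cycle."""
    return (state + stimulus) & 0xFF

def run(patterns, start_cycle, end_cycle, state, checkpoints=()):
    """Simulate cycles [start_cycle, end_cycle).

    The pattern pointer is simply the cycle index, so the pattern-based
    TB can begin at any cycle without replaying the earlier ones.
    """
    saved = {}
    for cycle in range(start_cycle, end_cycle):
        if cycle in checkpoints:
            saved[cycle] = state      # t-DCP: DUV state just before this cycle
        state = step(state, patterns[cycle])
    return state, saved

# Pattern-based input generation: stimuli are stored, not computed on the fly.
patterns = [(7 * i + 3) % 256 for i in range(100)]

# Full run from cycle 0, saving a t-DCP at Ta = cycle 40.
full_final, tdcp = run(patterns, 0, 100, state=0, checkpoints={40})

# Restart at Ta: restore the DUV state and set the pattern pointer to 40.
restart_final, _ = run(patterns, 40, 100, state=tdcp[40])

print(restart_final == full_final)
```

Because the stimulus for each cycle is looked up by index, the restarted run needs no information about cycles 0 through 39 other than the saved DUV state.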

To apply one of these three methods, instrumentation code may need to be added to the model for simulation or to the simulation environment. Such instrumentation code can be inserted automatically by the verification software of the present invention (specific examples of such instrumentation code are given in FIGS. 16-18, which will be shown later).

The detailed simulation method using t-DCP, such as the state information of a design object, is given in Korean patent application 10-2005-116706.

FIG. 7 is a schematic diagram of an example of the execution of a distributed-processing-based parallel simulation, in which s-DCP is obtained in the front-end simulation with a model at a higher level of abstraction, and the back-end simulation is executed in parallel spatially.

Regarding S-DCP, s-DCP is defined as the dynamic information of the equivalent model at different abstraction level of DUV or TB, the dynamic information of one or more design objects in the equivalent model at different abstraction level, the dyna