System and method for continuous data analysis of an ongoing clinical trial

Ikeguchi, Edward F. ; et al.

Patent Application Summary

U.S. patent application number 10/667848 was filed with the patent office on 2003-09-22 and published on 2005-04-07 for system and method for continuous data analysis of an ongoing clinical trial. Invention is credited to deVries, Glen M., Ikeguchi, Edward F., Sherif, Tarek A.

Application Number	20050075832 10/667848
Document ID	/
Family ID	34393396
Filed Date	2003-09-22

United States Patent Application	20050075832
Kind Code	A1
Ikeguchi, Edward F. ; et al.	April 7, 2005

System and method for continuous data analysis of an ongoing clinical trial

Abstract

System and method of continuously analyzing trial data of an ongoing clinical trial is provided. A statistical analysis is performed on a trial database containing subject trial data. If the result of the statistical analysis does not exceed a predetermined threshold value, then the statistical analysis is repeated while the clinical trial is ongoing. In a blinded clinical trial, a grouped database is generated from the trial database and a blinding database prior to performing the statistical analysis. The grouped database groups the subject trial data according to the study groups. The ability to continuously monitor and analyze the trial data for statistical significance in tandem with data collection while the trial is ongoing provides many benefits to the researchers because the trial database no longer becomes the bottleneck in obtaining useful results and statistical analysis can be conducted on a near real-time basis without having to wait until completion of the trial.

Inventors:	Ikeguchi, Edward F.; (New York, NY) ; deVries, Glen M.; (New York, NY) ; Sherif, Tarek A.; (New York, NY)
Correspondence Address:	William H. Dippert Reed Smith, LLP 29th Floor 599 Lexington Avenue New York NY 10022 US
Family ID:	34393396
Appl. No.:	10/667848
Filed:	September 22, 2003

Current U.S. Class:	702/179 ; 705/2; 707/999.001; 707/999.1
Current CPC Class:	G06Q 10/10 20130101; G16H 70/20 20180101; G16H 50/70 20180101; G16H 10/20 20180101
Class at Publication:	702/179 ; 707/001; 707/100; 705/002
International Class:	G06F 007/00; G06F 017/30; G06F 017/60; G06F 017/00; G06F 015/00; G06F 017/18; G06F 101/14

Claims

What is claimed is:

1. A method of continuously analyzing trial data of an ongoing clinical trial, the method comprising: accessing a trial database containing trial data of subjects in a clinical trial; performing a statistical analysis on the accessed trial database; determining whether the result of the statistical analysis exceeds a predetermined threshold value; and if it is determined that the result of the statistical analysis does not exceed the predetermined threshold value, then repeating the steps of accessing, performing and determining while the clinical trial is ongoing.

2. The method according to claim 1, prior to the step of performing a statistical analysis, further comprising: reading a user defined criteria that defines the level of cleanliness of the trial data for statistical analysis; and retrieving only those trial data that meet the user defined criteria from the trial database.

3. The method according to claim 1, wherein if it is determined that the result of the statistical analysis does not exceed the predetermined threshold value, then waiting for a predetermined time period prior to the repeating step.

4. The method according to claim 1, wherein the clinical trial is a blinded clinical trial, further comprising: accessing a blinding database containing subject identifiers and associated study group identifiers, each study group identifier identifying to which study group an associated subject belongs; and producing a grouped database from the trial database and the blinding database for statistical analysis, the grouped database grouping the trial data according to the study group.

5. The method according to claim 4, wherein the grouped database is stored in a memory device that is inaccessible by any user.

6. The method according to claim 1, wherein the step of performing a statistical analysis is executed without locking the trial database.

7. The method according to claim 1, wherein the clinical trial is a blinded clinical trial, further comprising: reading a predefined criteria that defines the level of cleanliness of trial data required for analysis; retrieving only those trial data that meet the predefined criteria from the trial database; accessing a blinding database containing subject identifiers and an associated study group identifier for each subject, each study group identifier identifying to which study group each subject belongs; and producing a grouped database from the retrieved trial data and the blinding database for statistical analysis, the grouped database grouping the trial data according to the study group.

8. The method according to claim 7, wherein the grouped database is stored in a memory device that is inaccessible by any user to preserve the blindness of the clinical trial.

9. The method according to claim 1, further comprising alerting a user if it is determined that the result of the statistical analysis exceeds the predetermined threshold value.

10. The method according to claim 9, wherein the predetermined threshold value includes a predetermined statistical significance value.

11. The method according to claim 10, wherein the step of performing a statistical analysis comprises: retrieving a user defined statistical model; and running the retrieved user defined statistical model on the trial database.

12. A method of continuously analyzing trial data of an ongoing blinded clinical trial, the method comprising: accessing a trial database containing blinded trial data of subjects in an ongoing blinded clinical trial; accessing a blinding database containing subject identifiers and associated study group identifiers, each study group identifier identifying to which study group an associated subject belongs; producing a grouped database from the trial database and the blinding database, the grouped database grouping the trial data according to the study group; performing a statistical analysis on the produced grouped database; determining whether the result of the statistical analysis exceeds a predetermined threshold value; and if it is determined that the result of the statistical analysis does not exceed the predetermined threshold value, then repeating the above steps of: accessing a trial database, producing a grouped database, performing a statistical analysis, and determining while the clinical trial is ongoing.

13. The method according to claim 12, prior to the step of performing a statistical analysis, further comprising: reading a user defined criteria that defines the level of cleanliness of trial data for statistical analysis; and retrieving only those trial data that meet the user defined criteria from the trial database for statistical analysis.

14. The method according to claim 12, wherein the produced grouped database is stored in a memory device that is inaccessible by any user.

15. The method according to claim 12, wherein the step of performing a statistical analysis is executed without locking the trial database.

16. The method according to claim 12, further comprising alerting a user if it is determined that the result of the statistical analysis exceeds the predetermined threshold value.

17. The method according to claim 16, wherein the predetermined threshold value includes a predetermined statistical significance value.

18. A system for continuously analyzing an ongoing clinical trial comprising: a storage device operable to store a trial database containing trial data of subjects in an ongoing clinical trial; a processor coupled to the storage device; and an analysis program executable by the processor and operable to: perform a statistical analysis on the trial database; determine whether the output result of the statistical analysis exceeds a predetermined threshold value; and repeat the statistical analysis while the clinical trial is ongoing if it is determined that the result of the statistical analysis does not exceed the predetermined threshold value.

19. The system according to claim 18, wherein the analysis program is further operable to: read a user defined criteria that defines the level of cleanliness of trial data for statistical analysis; and retrieve only those trial data that meet the user defined criteria from the trial database.

20. The system according to claim 18, wherein if the analysis program determines that the result of the statistical analysis does not exceed the predetermined threshold value, then the analysis program waits for a predetermined time period prior to repeating the statistical analysis.

21. The system according to claim 18, wherein the clinical trial is a blinded clinical trial, the analysis program further operable to: access a blinding database containing subject identifiers and associated study group identifiers, each study group identifier identifying to which study group an associated subject belongs; and produce a grouped database from the trial database and the blinding database for statistical analysis, the grouped database grouping the trial data according to the study group.

22. The system according to claim 21, further comprising a memory device coupled to the processor and being inaccessible to any user, wherein the grouped database is stored only in the memory device.

23. The system according to claim 18, wherein the analysis program performs the statistical analysis without locking the trial database.

24. The system according to claim 18, wherein the analysis program is further operable to alert a user if it determines that the result of the statistical analysis exceeds the predetermined threshold value.

Description

TECHNICAL FIELD OF THE INVENTION

[0001] This application relates to data processing of clinical trial data and more specifically to a system and method for statistically analyzing the clinical trial data.

BACKGROUND OF THE INVENTION

[0002] In the United States, the Food and Drug Administration (FDA) oversees the protection of consumers exposed to health-related products ranging from food and cosmetics to drugs, gene therapies and medical devices. Under the FDA guidance, clinical trials are performed to test the safety and efficacy of new drugs, medical devices or other treatments to ultimately ascertain whether or not a new medical therapy is appropriate for widespread human consumption.

[0003] More specifically, once a new drug or medical device has undergone studies in animals, and results appear favorable, it can be studied in humans. Before human testing is begun, findings of animal studies are reported to the FDA to obtain approval to do so. This report to the FDA is called an application for an Investigational New Drug (IND).

[0004] The process of experimentation is referred to as a clinical trial, which involves four phases. In Phase I, a few research participants (approximately 5 to 10), referred to as subjects, are used to determine toxicity of a new treatment. In Phase II, more subjects (10-20) are used to determine efficacy and further ascertain safety. Doses are stratified to try to gain information about the optimal dose. A treatment may be compared to either a placebo or another existing therapy. In Phase III, efficacy is determined. For this phase, more subjects, on the order of hundreds to thousands of patients, are needed to perform a meaningful statistical analysis. A treatment may be compared to either a placebo or another existing therapy. In Phase IV (post-approval study), the treatment has already been approved by the FDA, but more testing is performed to evaluate long-term effects and to evaluate other indications.

[0005] During clinical trials, patients are seen at medical clinics and asked to participate in a clinical research project by their doctor, known as an investigator. After the patients sign an informed consent form, they are considered enrolled in the study, and are subsequently referred to as study subjects. A study sponsor, generally considered to be the company developing a new medical treatment and supporting the research, develops a study protocol. The study protocol is a document describing the reason for the experiment, the rationale for the number of subjects required, the methods used to study the subjects, and any other guidelines or rules for how the study is to be conducted. Prior to usage, the study protocol is reviewed and approved by an Institutional Review Board (IRB). An IRB serves as a peer review group, which evaluates a protocol to determine its scientific soundness and ethics for the protection of the subjects and investigator.

[0006] Creation of Study Groups (Study Arms)

[0007] Subjects enrolled in a clinical study are stratified into groups that allow data to be assessed in a comparative fashion. In a common example, one study arm, known as a control group (or "control"), will use a placebo, whereby a pill containing no active chemical ingredient is administered. In doing so, comparisons can be made between subjects receiving actual medication versus placebo.

[0008] Randomization

[0009] Subjects enrolled into a clinical study are assigned to a study arm in a random fashion, which is done to avoid biases that may occur in the selection of subjects for a trial. For example, a subject who is a particularly good candidate to respond to a new medication might be intentionally entered into the study arm to receive real medication and not a placebo. This could skew the data and outcome of the clinical trial to favor the medication under study, by the selection of subjects who are most likely to perform well with the medication. In instances where only one study group is present, randomization is not performed.

[0010] Blinding

[0011] Blinding is a process by which the study arm assignment for subjects in a clinical trial is not revealed to the subject (single blind) or to both the subject and the investigator (double blind). This minimizes the risk of data bias. Virtually all randomized trials are blinded by definition. In instances where only one study group is present, blinding is not performed.

[0012] Statistical Analysis of Trial Data

[0013] Generally, at the end of the trial, the database containing the completed trial data is shipped to a statistician for analysis. If particular occurrences, such as adverse events, are seen with an incidence that is greater in one group than in another to a degree that is unlikely to arise from chance alone, then it can be stated that statistical significance has been reached. Using statistical calculations, the comparative incidence of any given occurrence between groups can be described by a numeric value, referred to as a "p-value". The p-value is the probability, ranging from 0.0 to 1.0, of observing a difference between groups at least as large as the one actually seen if chance alone were responsible. A p-value near 1.0 indicates that the observed difference is entirely consistent with chance, whereas a p-value near 0.0 indicates that such a difference would very rarely arise by chance alone. Generally, values of p<0.05 are considered to be "statistically significant", and values of p<0.01 are considered "highly statistically significant".
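As an illustration of how a p-value for the comparative incidence of an occurrence between two study groups might be computed, the following is a minimal sketch using a pooled two-proportion z-test. The choice of test, the function name `two_proportion_p_value`, and the event counts are assumptions made for illustration only; the normal approximation is rough for groups this small, and a statistician may well prefer an exact test.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for a difference in event rates (pooled z-test)."""
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # Every subject (or no subject) had the event: the groups cannot differ.
        return 1.0
    z = (events_a / n_a - events_b / n_b) / std_err
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: 8 of 13 subjects with the event in one arm, 2 of 12 in the other.
p = two_proportion_p_value(8, 13, 2, 12)
print(f"p = {p:.3f}, significant at 0.05: {p < 0.05}")
```

With these hypothetical counts the p-value falls below 0.05 but not below 0.01, so the difference would be called statistically significant but not highly so.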

[0014] In some clinical trials, multiple study arms, or even a control group, may not be utilized. In such cases, only a single study group exists with all subjects receiving the same treatment. This is typically performed when historical data about the medical treatment, or a competing treatment is already known from prior clinical trials, and may be utilized for the purpose of making comparisons.

[0015] The creation of study arms, randomization, and blinding are techniques that are used in most clinical trials where scientific rigor is of high importance. However, these methods lead to several challenges, since they prevent the clinical trial sponsor from tracking key information related to safety and efficacy.

[0016] Regarding safety, the objective of any clinical trial is to document the safety of a new treatment. However, in clinical trials where randomization is conducted between two or more study arms, this can be determined only as a result of analyzing and comparing the safety parameters of one study group to another. Unfortunately, because the study arm assignments are blinded, there is no way to separate out subjects and their data into corresponding groups for purposes of performing comparisons while the trial is being conducted. Since many clinical trials may last for time periods extending for years, it is conceivable to have a treatment toxicity go unnoticed for prolonged periods without intervention.

[0017] Regarding efficacy, any clinical trial seeking to document efficacy will incorporate key variables that are followed during the course of the trial to draw the desired conclusion. In addition, studies will define certain outcomes, or endpoints, at which point a study subject is considered to have completed the protocol. These parameters, including both key variables and study endpoints, cannot be analyzed by comparison between study arms while the subjects are randomized and blinded. This poses potential problems in ethics and statistical analysis.

[0018] When new medications or other health-related treatments are of superior efficacy to anything else, it is ethical to allow usage of the treatment for those in imminent need, even prior to final government approval. Conversely, when available, it is considered unethical to withhold such treatments. For example, if a medication were to be identified that eradicated the Human Immunodeficiency Virus (HIV), it would be unethical to allow diseased patients to continue suffering and even die of the illness, while the medication was being clinically tested for purposes of government approval. Ideally, in such situations, identification of effective treatments should occur early in the project. Under these circumstances, non-treatment arms (i.e., those taking placebos) could be construed as unethical and should be eliminated. At present, when clinical trials are randomized and blinded, identification of a particularly effective treatment may not be realized until the entire clinical trial is completed.

[0019] Another related problem is statistical power. By definition, statistical power refers to the probability that a test correctly rejects the null hypothesis, that is, the hypothesis that an experiment's outcome is the result of chance alone. Clinical research protocols are engineered to prove a certain hypothesis about a medical treatment's safety and efficacy, and disprove the null hypothesis. To do so, statistical power is required, which can be achieved by obtaining a large enough sample size of subjects in each study arm. When too few subjects are enrolled into the study arms, there is the risk of the study not accruing enough subjects to enable the null hypothesis to be rejected, and thus not reaching statistical significance. Because clinical trials that are randomized are blinded, the actual number of subjects distributed throughout study arms is not defined until the end of the project. Although this maintains data collection integrity, there are inherent inefficiencies in the system, regardless of the outcome.
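The dependence of statistical power on the number of subjects per study arm can be sketched with a normal approximation for comparing two event rates. The function name `approx_power`, the assumed event rates, and the arm sizes below are hypothetical, and the approximation ignores the negligible chance of rejecting in the wrong direction.

```python
import math
from statistics import NormalDist


def approx_power(rate_a, rate_b, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two event rates."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = 0.05
    std_err = math.sqrt(
        rate_a * (1 - rate_a) / n_per_arm + rate_b * (1 - rate_b) / n_per_arm
    )
    return normal.cdf(abs(rate_a - rate_b) / std_err - z_alpha)


# Hypothetical true event rates of 30% and 20% in the two arms.
for n in (50, 200, 500):
    print(n, round(approx_power(0.30, 0.20, n), 2))
```

Under these assumed rates, power rises steadily with the number of subjects per arm, which is why an under-enrolled study may fail to reach significance even when the treatment effect is real.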

[0020] In a case where the study data reaches statistical significance, as accrual of subjects continues, and data is received, an optimal time to close a clinical study would be at the very moment when statistical significance is achieved. While that moment may arrive earlier in the course of a clinical trial, there is no way of knowing this, and therefore time and money are lost. Moreover, study subjects are enrolled above and beyond what is needed to reach the goals of the study, thus placing human subjects under experimentation unnecessarily.

[0021] In a case where the study data nearly reaches statistical significance, while the study data falls short of statistical significance, there is reason to believe that this is due to a shortage of enrollment in the study. Frequently, to develop more supportive data, clinical trials will be extended. These "extension studies", however, can only begin after a full closure of the parent study, frequently requiring months to years before starting again.

[0022] In a case where the study data does not reach statistical significance, there is no trend toward significance, and there is little chance of reaching the desired conclusion. In that case, an optimal time to close a study is as early as possible once the conclusion can be established that the treatment under investigation does not work, and study data has little chance of reaching statistical significance (i.e., it is futile). In randomized and blinded clinical trials, this conclusion is difficult to arrive at until data analysis can be conducted. In these situations, time and money are lost. Moreover, an excess of human subjects are placed under study unnecessarily.

[0023] Data Safety Monitoring

[0024] To mitigate some of the risks related to the conduct of randomized and blinded clinical trials, a Data Safety Monitoring Board (DSMB) may be formed at the beginning of each protocol. In general, a DSMB is recommended for clinical trials that involve a potentially serious outcome (e.g., death, heart attack, etc.), are randomized and blinded, and extend for prolonged periods of time. In addition, a DSMB is required for trials that are sponsored by the United States government, namely, the National Institutes of Health (NIH).

[0025] A DSMB generally consists of members who are domain experts in the field of study, such as physicians, as well as bio-statisticians. It is important that DSMB members be separate from personnel of the sponsor organization, and financial disclosure for all members is performed to minimize conflicts of interest. Prior to start of a clinical trial, standard operating procedures are established for the DSMB, including the frequency of meetings, initiation of interim analyses, conduct during interim analyses and criteria for discontinuation of the clinical trial. As it relates to the safety of study subjects, the DSMB functions to examine trends of adverse occurrences rather than investigate specific reports, which are generally left to each IRB responsible for the activities of any given investigator.

[0026] Data Collection

[0027] A typical method of collecting and analyzing patient data is illustrated in the flow chart shown in FIG. 1. Patient data or charts 10 from the clinical trial are collected manually in paper forms. Using a technology called Electronic Data Capture (EDC) or Remote Data Entry (RDE), a computer (not shown) displays a Case Report Form (CRF) to a clinical research coordinator (CRC) 12, typically a nurse or doctor. The CRC 12 then enters the patient data 10 through the computer display which is received in block 14 by an EDC system which executes all of the steps included in a box 11. The received data is stored in a clinical trial database 38 through a link 20 which can be an electronic link such as a telephone line or Internet link. In block 18, it is determined whether the data inputted by the CRC 12 is clean using one or more rules. The rules may be implemented by simple range checking scripts, or by an inference rule engine or deterministic rule engine in order to identify potential problems with the data.

[0028] In addition to the software programs, block 18 may also involve research personnel known as monitors or Clinical Research Associates (CRA) who travel to the various research sites to perform source document verification (SDV) whereby the data in the database 38 is reconciled against individual patient charts to the degree required in the protocol.

[0029] If it is determined that the data entered is not clean, then block 22 generates a query which is then sent over the link 20 to the CRC 12. The blocks 14, 18 and 22 are repeated until all of the subject data 10 are entered. This is an iterative process that continues until resolution of all queries in the database 38.
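The cleaning check of block 18 and the query generation of block 22 can be sketched as a simple range-checking script. The field names, the permitted ranges, and the function name `check_record` are hypothetical and serve only to illustrate the idea; a real EDC system would draw its rules from the study protocol.

```python
# Hypothetical range rules: field name -> (lowest allowed, highest allowed).
RULES = {"systolic_bp": (60, 260), "heart_rate": (20, 250)}


def check_record(record, rules):
    """Return a list of query messages; an empty list means the record is clean."""
    queries = []
    for field, (low, high) in rules.items():
        value = record.get(field)
        if value is None:
            queries.append(f"{field}: value is missing")
        elif not low <= value <= high:
            queries.append(f"{field}: {value} is outside the range {low} to {high}")
    return queries


# A record with an implausible blood pressure produces one query for the CRC.
for message in check_record({"systolic_bp": 400, "heart_rate": 72}, RULES):
    print(message)
```

Each returned message corresponds to a query that would be sent back to the CRC, and the record would be checked again after the CRC responds.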

[0030] Once all data 10 are entered, block 24 determines whether the clinical trial is over. If no, then the EDC system continues to receive the patient trial data 10 through block 14 as the trial continues. If the trial is over, control passes to block 26 where the entire database is locked from any changes, deletions or insertions of the data in the database 38. In one embodiment, locking involves turning the database 38 into a "read-only" state.

[0031] In block 28, blinding data is retrieved from a blinding database. A simplified example blinding database 40 is shown in FIG. 4. The blinding database 40 is a database table having two columns. The first column contains a patient subject ID (subject identifier) and the second column contains an associated study arm or group the patient belongs to. In the table 40, 13 subjects belong to Study Arm "A" and 12 subjects belong to Study Arm "B". Because the database 40 is not associated with actual trial data, the table 40 by itself is relatively uninformative.

[0032] A simplified example trial database 38 is shown in FIG. 5. The embodiment shown is a database table containing two columns. The first column contains a patient subject ID and the second column is a database field called "Heart Attack" which specifies whether the subject had a heart attack. An entry of 0 means NO and an entry of 1 means YES. As can be seen from the trial database 38, due to blinding of the subjects in the study groups, there is no way of knowing whether or not any discrepancy exists in the number of heart attacks seen in Group A versus B. Because the trial is randomized, without the blinding data 40, the table 38 by itself is relatively uninformative.

[0033] In block 28, an unblinded database is produced from the trial database 38 and the retrieved blinding database 40 in which the subject ID is used as a common key. The result of the unblinding process of block 28 is shown in FIG. 6 as the unblinded database 41. In the embodiment shown, one database table is produced. The table 41 contains subject identifiers, Study Arm of the subjects, and Heart Attack data of those subjects. As can be appreciated by a person of ordinary skill in the art, there is a direct traceability from study data and subject ID to Study Arm.
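The unblinding step amounts to a join of the two tables on the subject ID. The following is a minimal sketch that represents each table as a list of dictionaries; the column names, the sample rows, and the function name `unblind` are illustrative assumptions and do not reproduce the contents of the figures.

```python
def unblind(trial_rows, blinding_rows):
    """Join trial data to study arm assignments using the subject ID as the key."""
    arm_by_subject = {row["subject_id"]: row["study_arm"] for row in blinding_rows}
    return [
        {
            "subject_id": row["subject_id"],
            "study_arm": arm_by_subject[row["subject_id"]],
            "heart_attack": row["heart_attack"],
        }
        for row in trial_rows
        if row["subject_id"] in arm_by_subject  # skip subjects with no assignment
    ]


# Hypothetical rows, not taken from FIG. 4 or FIG. 5.
blinding = [{"subject_id": 1, "study_arm": "A"}, {"subject_id": 2, "study_arm": "B"}]
trial = [{"subject_id": 1, "heart_attack": 0}, {"subject_id": 2, "heart_attack": 1}]
for row in unblind(trial, blinding):
    print(row)
```

Every row of the result ties a subject ID to both a study arm and an outcome, which is the direct traceability noted above.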

[0034] In block 30, statistical analysis is performed on the unblinded database 41 to find out the efficacy and safety of the completed clinical trial.

[0035] During the course of any given randomized and blinded clinical trial, an interim analysis may be conducted. An interim analysis may result from urging of the DSMB for cause, or be a pre-planned event as described in the study protocol.

[0036] Conducting an interim analysis involves a process where the available data is verified and cleaned. The verification process generally involves a process by which trained personnel travel to the various research sites to reconcile submitted data against source documents, which generally implies the patient's chart, laboratory reports, radiographic readings, and others. The data cleaning process may involve a series of documented communications between the research site and a central data coordinating personnel to resolve inconsistencies or other conflicting data.

[0037] The refined database must then be sent to an impartial third party for statistical analysis. To conduct the analysis, the statistician must un-blind the clinical trial database by combining both the study data with the blinding key of which subjects are assigned to particular study arms. Since the clinical study is expected to continue beyond the interim analysis, the process of un-blinding must be conducted with great caution, so as not to reveal the blind status of subjects to any personnel involved in the execution of the clinical trial. Once a statistician has completed the interim analysis, a report is issued to the trial sponsor and DSMB.

[0038] Inclusive of the data cleaning, verification, un-blinding and statistical analysis processes, as well as the administrative resources for coordinating several groups of personnel for the un-blinding process, an interim analysis is often arduous, time-consuming and expensive.

[0039] In spite of the latest technological advancements made in the area of data collection through electronic systems, there is still a disadvantage in that it is very difficult to draw conclusions about a medical treatment while the data is being collected during the trial. This limitation stems primarily from the fact that statistical analysis cannot begin until the trial data has been fully cleaned and processed. At present, statistical analysis can only be conducted upon data in an "en bloc" fashion. This creates a situation where the ability to draw conclusions about a medical therapy inevitably lags behind the process of simply obtaining data in a database.

[0040] Regardless of how efficient the data collection process may be made through automation, the ability to acquire the information needed for critical decision-making is still suspended by the requirement to obtain a locked database in order for statistical work to advance.

[0041] Therefore, it is desirable to provide a method and system for conducting statistical analysis on the clinical data collected while the trial is ongoing.

[0042] In the case of a randomized clinical trial where maintaining confidentiality is important, it is also desirable to provide a secure system in which the blinding information is integrated in such a way that the clinical trial data and blinding data are stored securely to prevent users from accessing the data and yet allow the execution of programs for performing statistical comparisons between study arms while the trial is ongoing.

SUMMARY OF THE INVENTION

[0043] According to the present invention, a system and method of continuously analyzing trial data of an ongoing clinical trial is provided. A trial database containing subject trial data in a clinical trial is accessed, and a statistical analysis is performed on the accessed trial database. If the result of the statistical analysis does not exceed a predetermined threshold value, then the step of statistical analysis is repeated while the clinical trial is ongoing.

[0044] In another aspect of the invention, the present method uses a user definable criteria that defines the level of cleanliness of subject data for statistical analysis. In that case, only those subject data that meet the user defined criteria are selected from the trial database for statistical analysis.
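One way such a cleanliness criterion might be applied is sketched below, assuming that each record carries a status field and that the statuses form an ordered scale. The status names, their ordering, and the function name `select_clean` are hypothetical; the criterion could equally be an exact status match.

```python
# Hypothetical cleanliness statuses, ordered from least to most clean.
CLEANLINESS_LEVELS = ["entered", "queries_resolved", "source_verified"]


def select_clean(records, minimum_level):
    """Keep only the records whose status is at least as clean as minimum_level."""
    threshold = CLEANLINESS_LEVELS.index(minimum_level)
    return [
        record
        for record in records
        if CLEANLINESS_LEVELS.index(record["status"]) >= threshold
    ]


records = [
    {"subject_id": 1, "heart_attack": 0, "status": "entered"},
    {"subject_id": 2, "heart_attack": 1, "status": "source_verified"},
]
print(select_clean(records, "queries_resolved"))
```

A stricter criterion yields fewer but more reliable records, so the user trades earliness of the analysis against confidence in the underlying data.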

[0045] In another aspect of the invention, when the result of the statistical analysis does not exceed the predetermined threshold value, then the analysis program waits for a predetermined time period prior to repeating the statistical analysis step. This is done so that additional subject data are added to the trial database.
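The repeat-and-wait behavior described in the preceding paragraphs can be sketched as a simple loop. The function name `monitor_trial` and its callable parameters are hypothetical placeholders for the database access and the statistical model; the sketch also assumes that the result is a p-value, so that "exceeding the threshold" is read as the p-value falling below the chosen significance level.

```python
import time


def monitor_trial(load_trial_data, run_analysis, trial_is_ongoing,
                  significance_level=0.05, wait_seconds=86400, sleep=time.sleep):
    """Repeat the analysis while the trial is ongoing; return the first significant p-value."""
    while trial_is_ongoing():
        p_value = run_analysis(load_trial_data())  # reads the live, unlocked database
        if p_value < significance_level:
            return p_value  # threshold reached: the caller can alert a user
        sleep(wait_seconds)  # wait so that additional subject data can accumulate
    return None  # the trial ended without reaching the threshold


# Stand-in analysis that yields a smaller p-value on each successive pass.
p_values = iter([0.40, 0.20, 0.03])
result = monitor_trial(lambda: [], lambda data: next(p_values), lambda: True,
                       wait_seconds=0)
print(result)
```

Passing the `sleep` function as a parameter is a design convenience that lets the loop be exercised without real delays. Repeatedly testing accumulating data also raises the chance of a false positive, which a practical threshold would need to account for.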

[0046] In another aspect, the clinical trial is blinded. Accordingly, in addition to the trial database, a blinding database containing subject identifiers and associated study group identifiers is accessed. Each study group identifier identifies which study group an associated subject belongs to. Then a grouped database is produced from the trial database and the blinding database for statistical analysis in which the grouped database groups trial data according to the study group the subjects belong to. Preferably, one data table is created for each study group and contains all trial data for those subjects that belong to that study group.
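A grouped database with one table per study group might be built as follows. This is a sketch only: the tables are represented as lists of dictionaries, and the column names and the function name `build_grouped_database` are assumptions made for illustration.

```python
from collections import defaultdict


def build_grouped_database(trial_rows, blinding_rows):
    """Return a mapping of study group -> list of trial data rows for that group."""
    group_by_subject = {row["subject_id"]: row["study_group"] for row in blinding_rows}
    grouped = defaultdict(list)
    for row in trial_rows:
        group = group_by_subject.get(row["subject_id"])
        if group is not None:  # ignore subjects with no group assignment
            grouped[group].append(row)
    return dict(grouped)


# Hypothetical rows for three subjects in two study groups.
blinding = [
    {"subject_id": 1, "study_group": "A"},
    {"subject_id": 2, "study_group": "B"},
    {"subject_id": 3, "study_group": "A"},
]
trial = [
    {"subject_id": 1, "heart_attack": 0},
    {"subject_id": 2, "heart_attack": 1},
    {"subject_id": 3, "heart_attack": 1},
]
for group, rows in build_grouped_database(trial, blinding).items():
    print(group, sum(row["heart_attack"] for row in rows), "of", len(rows))
```

Holding the result only in memory that users cannot read, as the following aspect describes, would allow per-group statistics to be computed without disclosing any individual assignment.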

[0047] In yet another aspect of the invention, the unblinded database is stored in a memory device that is inaccessible by any user in order to preserve the blindness of the trial.

[0048] In another aspect of the invention, the statistical analysis is performed without locking the trial database.

[0049] In another aspect of the invention, if the result of the statistical analysis exceeds the predetermined threshold value, a user is alerted. The predetermined threshold value may include a predetermined statistical significance value.

[0050] In another aspect of the invention, there are many statistical models to choose from. A user selectable statistical model is retrieved and the retrieved model is run on the trial database.

BRIEF DESCRIPTION OF THE DRAWINGS

[0051] FIG. 1 is a flow diagram of a method of collecting and analyzing clinical trial data using an EDC system.

[0052] FIG. 2 is a functional block diagram of a clinical trial management system according to an exemplary embodiment of the present invention.

[0053] FIG. 3 is a flow diagram of a software routine that continuously analyzes the trial data while the trial is ongoing according to the present invention.

[0054] FIG. 4 is an example of a blinding database.

[0055] FIG. 5 is an example of a trial database containing subject trial data.

[0056] FIG. 6 is an example of an unblinded database derived from the blinding database of FIG. 4 and the trial database of FIG. 5.

[0057] FIG. 7A is an example of a trial database containing a status field that represents the levels of cleanliness of the subject data records.

[0058] FIG. 7B is a filtered trial database containing a subset of the records of the trial database of FIG. 7A, which have been selected according to a user-specified status.

[0059] FIG. 8 is an example of a grouped database derived from the blinding database of FIG. 4 and the filtered trial database of FIG. 7B.

DETAILED DESCRIPTION OF THE INVENTION

[0060] As shown in FIG. 2, a clinical trial data management system 100 of the present invention is an Internet-enabled application solution framework that automates data collection, data cleaning, grouping if needed (as will be explained more fully later herein) and statistical data analysis while the trial is ongoing. The system 100 is connected to a computer network such as the Internet 120 through, for example, an I/O interface 102, which receives information from and sends information to Internet users over a communication link 20 and to one or more operators using a work station 117. The Internet users are typically CRCs located at various trial sites who transcribe the subjects' charts to the system 100. The system 100 includes, for example, a volatile memory 104, a processor (CPU) 106, a program storage 108, and a data storage device 118, all commonly connected to each other through a bus 112. The program storage 108 stores, among others, a clinical trial analysis program or module 114 and one or more mathematical models 116 that are used to analyze the subject data and obtain the p-value for statistical significance. The data storage device 118 stores a clinical trial database 38 and a blinding database 40. Any of the software program modules in the program storage 108 and data from the data storage 110 are transferred to the memory 104 as needed, and the modules are executed by the processor 106.

[0061] The system 100 can be any computer such as a WINDOWS-based or UNIX-based personal computer, server, workstation, minicomputer or a mainframe, or a combination thereof. While the system 100 is illustrated as a single computer unit for purposes of clarity, persons of ordinary skill in the art will appreciate that the system may comprise a group of computers which can be scaled depending on the processing load and database size.

[0062] FIG. 3 illustrates a flow diagram of a software routine 50 that continuously analyzes the trial data while the trial is ongoing according to the present invention. The routine 50 is stored in the storage device 108 and works with the EDC system 11 of FIG. 1 while the system 11 continuously collects and cleans the trial data.

[0063] In block 52, the routine 50 connects to a trial database 56 through a log-in procedure. A simplified trial database 56 is shown in FIG. 7A. The database 56 contains three columns comprising a patient subject ID field, a data status field, which specifies the level of cleanliness, and a "Heart Attack" field similar to FIG. 5.

[0064] FIG. 7A illustrates simplified trial data records that are at different levels of cleanliness. In the example shown in FIG. 7A, there are five levels of status. Level 1 indicates that there is an outstanding query that needs to be answered by the CRC 12 (see step 22 in FIG. 1). Level 2 indicates that the record is pending a review by another reviewer such as the sponsor of the trial. Level 3 indicates that the record is pending a review by a clinical research associate (CRA), who travels to a research site to perform what is known as source document verification (SDV). This typically involves verifying the trial record against an actual patient chart. Level 4 indicates that the record is pending a lock, barring any intervention by a reviewer. Finally, Level 5 indicates that the record is locked, which represents the highest level of clean data.

[0065] In the "Heart Attack" field, an entry of 0 means NO and an entry of 1 means YES. The "Heart Attack" field also includes some erroneous data such as "don't know" for subject 118 or "Y" for subject 107. Accordingly, the status for those records is "1", indicating that queries are outstanding.
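As a hedged illustration of how erroneous "Heart Attack" entries could be tied to status level 1, the Python sketch below flags any entry other than the integers 0 and 1. The record layout, the field names, the starting status values, and the flag_invalid_entries helper are assumptions made for illustration and are not taken from the specification; only the erroneous entries for subjects 107 and 118 come from the text above.

```python
# Illustrative sketch only: field names, starting statuses, and the helper
# are hypothetical and not part of the described system.
VALID_ENTRIES = (0, 1)  # 0 means NO, 1 means YES


def flag_invalid_entries(records):
    """Return copies of the records, with status set to 1 (outstanding
    query) wherever the Heart Attack entry is not the integer 0 or 1."""
    result = []
    for rec in records:
        value = rec["heart_attack"]
        # Require an actual int so that strings such as "Y" are rejected.
        is_valid = type(value) is int and value in VALID_ENTRIES
        result.append(rec if is_valid else {**rec, "status": 1})
    return result


records = [
    {"subject_id": 101, "status": 4, "heart_attack": 0},  # made-up record
    {"subject_id": 107, "status": 4, "heart_attack": "Y"},
    {"subject_id": 118, "status": 4, "heart_attack": "don't know"},
]
checked = flag_invalid_entries(records)
```

The helper returns new records and leaves the input list unmodified, which is a design choice of this sketch only.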

[0066] Once connected, the routine 50 retrieves in block 60 user-specified criteria 54 stored in the storage device 108, which specify the required status or level of cleanliness of the trial data, and in block 61 retrieves the trial database 56, which is filtered for those database records that satisfy the retrieved criteria. For example, if the retrieved user-specified criterion is 3, block 61 selects only those records that have a status of 3 or better. Such a filtered database 58 is shown in FIG. 7B. While the database 58 has a relatively higher level of cleanliness, it has fewer records. This filtering is useful since, at any given point in time during the data collection process, the clinical trial database 56 may contain any combination of data pending SDV, data containing outstanding queries, data that has completed SDV but is awaiting lock, and so on. Depending upon the operating procedures defined for any such clinical trial, only certain subsets of data may be suitable for inclusion in an analysis.
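The filtering of block 61 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the in-memory list of dictionaries stands in for the trial database 56, the field names and the filter_by_status helper are hypothetical, and all records other than the erroneous entries for subjects 107 and 118 are made up.

```python
# Illustrative sketch only: a list of dicts stands in for trial database 56.
def filter_by_status(records, min_status):
    """Keep only records whose status is min_status or better
    (a higher status number means cleaner data)."""
    return [rec for rec in records if rec["status"] >= min_status]


trial_records = [
    {"subject_id": 101, "status": 5, "heart_attack": 0},  # made-up record
    {"subject_id": 102, "status": 3, "heart_attack": 1},  # made-up record
    {"subject_id": 103, "status": 2, "heart_attack": 0},  # made-up record
    {"subject_id": 107, "status": 1, "heart_attack": "Y"},
    {"subject_id": 118, "status": 1, "heart_attack": "don't know"},
]

# A user-specified criterion of 3 keeps only status 3, 4 and 5 records.
filtered = filter_by_status(trial_records, min_status=3)
```

With a criterion of 3, the records with outstanding queries (status 1) never reach the analysis, so the erroneous entries do not need special handling downstream.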

[0067] Once the database 56 is filtered according to the user-specified criteria, block 62 is executed. In block 62, the blinding database 40 is retrieved into the memory 104. In block 64, the filtered trial database 58 and the blinding database 40 are used to produce a grouped database 42. In the embodiment shown, two database tables 66, 68, one for each study group and neither identifying subjects, are produced. One table 66 groups the Heart Attack data of subjects that belong to a control group (Study Arm A) while the other table 68 groups the Heart Attack data of subjects that belong to a non-control group (Study Arm B). As can be appreciated by persons of ordinary skill in the art, there is no way to trace any given data point in either table 66 or table 68 to its original subject, and therefore either table, by itself, is relatively uninformative. Taken together, however, the tables suggest that considerably more heart attacks are occurring in Study Arm B.
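The grouping of block 64 can be sketched in Python as follows. The data, the field names, and the build_grouped_tables helper are hypothetical. Sorting each pool so that row order cannot be matched back to subject order is an added assumption of this sketch, not something the specification states.

```python
# Illustrative sketch only: dicts stand in for the blinding database 40 and
# the filtered trial database 58. All values below are made up.
def build_grouped_tables(filtered_records, blinding):
    """Pool outcome values per study arm without carrying subject IDs."""
    grouped = {}
    for rec in filtered_records:
        arm = blinding.get(rec["subject_id"])
        if arm is None:
            continue  # subject has no arm assignment, so it is skipped
        grouped.setdefault(arm, []).append(rec["heart_attack"])
    # Assumption: sort each pool so positions reveal nothing about subjects.
    return {arm: sorted(values) for arm, values in grouped.items()}


blinding = {101: "A", 102: "B", 103: "A", 104: "B"}
filtered_records = [
    {"subject_id": 101, "status": 5, "heart_attack": 0},
    {"subject_id": 102, "status": 3, "heart_attack": 1},
    {"subject_id": 103, "status": 4, "heart_attack": 0},
    {"subject_id": 104, "status": 5, "heart_attack": 1},
]
grouped = build_grouped_tables(filtered_records, blinding)
```

The returned tables hold only outcome values keyed by study arm, so no single structure contains the subject identifier, study group, and outcome together.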

[0068] In the embodiment shown in FIG. 3, the trial is a randomized clinical trial. In that embodiment, the system 100 maintains the clinical trial database 38 and the blinding database 40 as separate physical and digital entities, in order to maintain their distinct nature. In other words, the trial data and blind data remain as two separate data tables and no table is created where the subject identifier, study group and heart attack status would all be found in that same table. Furthermore, system communication with the blinding database table occurs only by virtue of the machine programs executing specified actions to sort the clinical trial data. The clinical trial data in this scenario is preferably segregated into generic pools of data and remains de-identified to both the subject and the study arm, and thus indecipherable from the standpoint of the ability to trace a particular data item back to a specific subject. At no point during system function is the blinding table 40 transferred into an area of the system where the table is accessible by any user or displayed in any form of output. This maintains the integrity and confidentiality of the blinding information.

[0069] In block 70, the routine retrieves a user-defined analysis method 72 stored in the storage device 108 and retrieves the corresponding model from the mathematical models 116 stored in the storage device. The model is then run to analyze the grouped database 42. Preferably, a measure of the statistical significance of the safety and efficacy data in the unblinded database, known as the p-value, is obtained. The mathematical model may include one or more formulas, representing mathematical calculations, whereby one or more variables in the clinical trial database are identified and a numeric result may be obtained. Such formulas might include calculations of: mean, median, mode, range, average deviation, standard deviation, and variance. In addition, an administrator may enter mathematical formulas to further analyze the data to make comparisons between groups of data, as defined by the study arms, to determine statistical metrics and significance by methods including Chi-square analysis, t-test, F-test, one-tailed test, two-tailed test, and Analysis of Variance (ANOVA).
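As one hedged example of such a model, the Python sketch below runs a Pearson Chi-square test (without continuity correction) on two pools of 0/1 outcomes, one per study arm, and derives a p-value. The chi_square_2x2 helper and the sample pools are hypothetical. The specification does not prescribe this particular formula, and a production system would more likely rely on a validated statistics package.

```python
import math


def chi_square_2x2(group_a, group_b):
    """Pearson chi-square test, no continuity correction, on two pools of
    0/1 outcomes. Returns (chi2, p_value) for one degree of freedom."""
    a = sum(group_a)            # events in arm A
    b = len(group_a) - a        # non-events in arm A
    c = sum(group_b)            # events in arm B
    d = len(group_b) - c        # non-events in arm B
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0         # degenerate table: no evidence of a difference
    chi2 = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom the upper-tail probability of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up pools: 2 events among 20 subjects versus 10 events among 20.
arm_a = [1] * 2 + [0] * 18
arm_b = [1] * 10 + [0] * 10
chi2, p = chi_square_2x2(arm_a, arm_b)
```

For these made-up pools the statistic is roughly 7.6 and the p-value falls below 0.05, so the comparison would count as statistically significant under the typical threshold discussed in the specification.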

[0070] Once the mathematical analysis is completed, a user-defined p-value 74 stored in the storage device 108 is retrieved in block 76. In block 78, it is determined whether the derived p-value exceeds, in terms of statistical significance, the retrieved user-defined p-value, that is, whether the derived p-value is numerically lower than the user-defined p-value. As discussed in detail previously, a typical user-defined p-value is 0.05, below which the difference between the control group and non-control group is considered statistically significant. Thus, if the derived value is less than 0.05, the decision in block 78 is YES. The routine then sends an alert in block 80 without displaying the actual output value(s). The alert can be in the form of a flashing display, alarm, a change in the system output display to the user by virtue of color-coding, fonts, icons or text, or an automated system-generated message to the user by way of email, facsimile, telephone or pager.
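The decision of block 78 and the value-free alert of block 80 can be sketched in Python as follows. The function names and the alert wording are hypothetical, and the strict "less than" comparison follows the text above.

```python
# Illustrative sketch only: names and alert text are hypothetical.
def significance_reached(derived_p, user_defined_p=0.05):
    """Block 78: YES when the derived p-value is below the threshold."""
    return derived_p < user_defined_p


def build_alert(derived_p, user_defined_p=0.05):
    """Block 80: return an alert message that never contains the derived
    p-value, or None when no alert is due."""
    if significance_reached(derived_p, user_defined_p):
        return "ALERT: a monitored parameter has reached the pre-set level of statistical significance."
    return None


alert = build_alert(0.0058)    # made-up derived p-value
no_alert = build_alert(0.20)   # made-up derived p-value
```

Because the message is a fixed string, the exact numeric result cannot leak through the alert channel.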

[0071] In block 82, the routine, as an option according to a user defined output mode 84, can also create other outputs such as the generic data tables 66, 68 created in block 64. The output data could take various formats including plain text, American Standard Code for Information Interchange (ASCII), and SAS. Where appropriate, this would allow for more customized statistical analysis to be performed. These outputs may also be integrated with other software packages for creation of customized graphical reports.

[0072] If the trial is a randomized clinical trial, it is preferable to execute only block 80, which provides a Boolean output as to whether a particular study parameter has reached the desired level of statistical significance. Block 82 in that case is skipped. The benefit of such a mode is to maintain the blinding information as securely as possible and to minimize the ability to infer the study arm of any given subject. If the exact numeric value of statistical significance for a given clinical trial variable were monitored, it is conceivable that the accession of new data could cause statistical metrics for a particular study arm to change in such a manner that an inference could be made regarding the blinding status of the subject whose data was most recently added, thus compromising the statistical veil.

[0073] Block 82 may be useful in non-randomized trials because there is a benefit to displaying the specific numeric value corresponding to statistical significance and, since there is no blinding information to protect, it would be offered as a second mode of operation in the system. Alternatively, a third mode could be provided, whereby numeric ranges of statistical significance could be defined into groups that would be output to the user of the system.
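The third mode, in which only a range of statistical significance is reported, can be sketched in Python as follows. The cut points 0.01, 0.05 and 0.10, the labels, and the significance_range helper are assumptions chosen for illustration; the specification does not define particular ranges.

```python
import bisect

# Hypothetical cut points and labels; the specification defines no ranges.
CUT_POINTS = [0.01, 0.05, 0.10]
RANGE_LABELS = [
    "p < 0.01",
    "0.01 <= p < 0.05",
    "0.05 <= p < 0.10",
    "p >= 0.10",
]


def significance_range(derived_p):
    """Map a derived p-value to the label of the range it falls in, so the
    exact value is never shown to the user."""
    return RANGE_LABELS[bisect.bisect_right(CUT_POINTS, derived_p)]


label = significance_range(0.03)  # made-up derived p-value
```

Coarser ranges disclose less about how each newly added record shifts the result, at the cost of less detail for the researcher.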

[0074] If, however, the derived p-value is higher than the user-defined p-value, then the derived value does not exceed the user-defined threshold value in terms of statistical significance. In that case, the decision is NO and the routine 50 executes block 86. In block 86, the routine 50 waits for a predetermined amount of time and control passes to block 52, where the process of analyzing the trial data while the trial is ongoing is repeated. In other words, the system 100 is active throughout the data collection phase of the clinical trial, sending alerts when key parameters reach the pre-set level of statistical measure.
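The overall loop of routine 50 (analyze, compare, then wait and repeat) can be sketched in Python as follows. All names are hypothetical, the analysis is passed in as a callable, and the wait is injectable so the example runs instantly. Stopping after the first alert is an assumption of this sketch; the specification does not say whether monitoring continues after an alert.

```python
import time


def monitor(run_analysis, user_defined_p, wait_seconds, send_alert,
            trial_ongoing, sleep=time.sleep):
    """Repeat the analysis while the trial is ongoing.

    Returns True once an alert has been sent, or False if the trial ends
    without the derived p-value dropping below the threshold.
    """
    while trial_ongoing():
        derived_p = run_analysis()      # blocks 52 to 70: connect and analyze
        if derived_p < user_defined_p:  # block 78: decision is YES
            send_alert()                # block 80: alert without the value
            return True
        sleep(wait_seconds)             # block 86: wait for more data
    return False


# Simulated run with made-up p-values from three successive passes.
p_values = iter([0.40, 0.12, 0.03])
waits, alerts = [], []
reached = monitor(
    run_analysis=lambda: next(p_values),
    user_defined_p=0.05,
    wait_seconds=3600,
    send_alert=lambda: alerts.append("alert"),
    trial_ongoing=lambda: True,
    sleep=waits.append,                 # records the wait instead of sleeping
)
```

In the simulated run, the first two passes trigger a wait and the third triggers the alert.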

[0075] As can be appreciated by persons of ordinary skill in the art, the ability of the present clinical trial system 100 to continuously and confidentially monitor and analyze the trial data for statistical significance in tandem with data collection while the trial is ongoing is a tremendous benefit to the researchers. The trial database no longer becomes the bottleneck in obtaining useful results and statistical analysis can be conducted on a near real-time basis.

[0076] This continuous near real-time statistical analysis feature in turn has far-reaching implications. Specifically, by providing researchers with an early indication of the outcome of the clinical trial, the present invention shortens the time frame required to reach critical decisions about a new medical therapy. Still another advantage is that the present system improves patient safety by setting thresholds for triggering alerts for adverse events. A related advantage is that a futile trial can be ended early, thereby saving the substantial cost of conducting the trial. Conversely, for a successful medical treatment, a trial can be ended early or the placebo arm can be eliminated. The present invention also provides the ability to more accurately identify the need to perform a full-scale interim analysis.

[0077] Various omissions, modifications, substitutions and changes in the forms and details of the device illustrated and in its operation can be made by those skilled in the art without departing in any way from the spirit of the present invention. Accordingly, the scope of the invention is not limited to the foregoing specification, but instead is given by the appended claims along with their full range of equivalents.

* * * * *