U.S. patent application number 10/667848 was filed with the patent office on 2003-09-22 and published on 2005-04-07 for system and method for continuous data analysis of an ongoing clinical trial.
Invention is credited to deVries, Glen M., Ikeguchi, Edward F., Sherif, Tarek A..
Application Number | 20050075832 10/667848 |
Family ID | 34393396 |
Filed Date | 2003-09-22 |
United States Patent
Application |
20050075832 |
Kind Code |
A1 |
Ikeguchi, Edward F. ; et
al. |
April 7, 2005 |
System and method for continuous data analysis of an ongoing
clinical trial
Abstract
System and method of continuously analyzing trial data of an
ongoing clinical trial is provided. A statistical analysis is
performed on a trial database containing subject trial data. If the
result of the statistical analysis does not exceed a predetermined
threshold value, then the statistical analysis is repeated while
the clinical trial is ongoing. In a blinded clinical trial, a
grouped database is generated from the trial database and a
blinding database prior to performing the statistical analysis. The
grouped database groups the subject trial data according to the
study groups. The ability to continuously monitor and analyze the
trial data for statistical significance in tandem with data
collection while the trial is ongoing provides many benefits to the
researchers because the trial database no longer becomes the
bottleneck in obtaining useful results and statistical analysis can
be conducted on a near real-time basis without having to wait until
completion of the trial.
Inventors: |
Ikeguchi, Edward F.; (New
York, NY) ; deVries, Glen M.; (New York, NY) ;
Sherif, Tarek A.; (New York, NY) |
Correspondence
Address: |
William H. Dippert
Reed Smith, LLP
29th Floor
599 Lexington Avenue
New York
NY
10022
US
|
Family ID: |
34393396 |
Appl. No.: |
10/667848 |
Filed: |
September 22, 2003 |
Current U.S.
Class: |
702/179 ; 705/2;
707/999.001; 707/999.1 |
Current CPC
Class: |
G06Q 10/10 20130101;
G16H 70/20 20180101; G16H 50/70 20180101; G16H 10/20 20180101 |
Class at
Publication: |
702/179 ;
707/001; 707/100; 705/002 |
International
Class: |
G06F 007/00; G06F
017/30; G06F 017/60; G06F 017/00; G06F 015/00; G06F 017/18; G06F
101/14 |
Claims
What is claimed is:
1. A method of continuously analyzing trial data of an ongoing
clinical trial, the method comprising: accessing a trial database
containing trial data of subjects in a clinical trial; performing a
statistical analysis on the accessed trial database; determining
whether the result of the statistical analysis exceeds a
predetermined threshold value; and if it is determined that the
result of the statistical analysis does not exceed the
predetermined threshold value, then repeating the steps of
accessing, performing and determining while the clinical trial is
ongoing.
2. The method according to claim 1, prior to the step of performing
a statistical analysis, further comprising: reading a user defined
criteria that defines the level of cleanliness of the trial data
for statistical analysis; and retrieving only those trial data that
meet the user defined criteria from the trial database.
3. The method according to claim 1, wherein if it is determined
that the result of the statistical analysis does not exceed the
predetermined threshold value, then waiting for a predetermined
time period prior to the repeating step.
4. The method according to claim 1, wherein the clinical trial is a
blinded clinical trial, further comprising: accessing a blinding
database containing subject identifiers and associated study group
identifiers, each study group identifier identifying to which study
group an associated subject belongs; and producing a grouped
database from the clinical database and the blinding database for
statistical analysis, the grouped database grouping the study data
according to the study group.
5. The method according to claim 4, wherein the grouped database is
stored in a memory device that is inaccessible by any user.
6. The method according to claim 1, wherein the step of performing
a statistical analysis is executed without locking the trial
database.
7. The method according to claim 1, wherein the clinical trial is a
blinded clinical trial, further comprising: reading a predefined
criteria that defines the level of cleanliness of trial data
required for analysis; retrieving only those trial data that meet
the predefined criteria from the trial database; accessing a
blinding database containing subject identifiers and an associated
study group identifier for each subject, each study group
identifier identifying to which study group each subject belongs;
and producing a grouped database from the retrieved trial data and
the blinding database for statistical analysis, the grouped
database grouping the trial data according to the study group.
8. The method according to claim 7, wherein the grouped database is
stored in a memory device that is inaccessible by any user to
preserve the blindness of the clinical trial.
9. The method according to claim 1, further comprising alerting a
user if it is determined that the result of the statistical
analysis exceeds the predetermined threshold value.
10. The method according to claim 9, wherein the predetermined
threshold value includes a predetermined statistical significance
value.
11. The method according to claim 10, wherein the step of
performing a statistical analysis comprises: retrieving a user
defined statistical model; and running the retrieved user defined
statistical model on the trial database.
12. A method of continuously analyzing trial data of an ongoing
blinded clinical trial, the method comprising: accessing a trial
database containing blinded trial data of subjects in an ongoing
blinded clinical trial; accessing a blinding database containing
subject identifiers and associated study group identifiers, each
study group identifier identifying to which study group an
associated subject belongs; producing a grouped database from the
trial database and the blinding database, the grouped database
grouping the trial data according to the study group; performing a
statistical analysis on the produced grouped database; determining
whether the result of the statistical analysis exceeds a
predetermined threshold value; and if it is determined that the
result of the statistical analysis does not exceed the
predetermined threshold value, then repeating the above steps of:
accessing a trial database, producing a grouped database,
performing a statistical analysis, and determining while the
clinical trial is ongoing.
13. The method according to claim 12, prior to the step of
performing a statistical analysis, further comprising: reading a
user defined criteria that defines the level of cleanliness of
trial data for statistical analysis; and retrieving only those
trial data that meet the user defined criteria from the trial
database for statistical analysis.
14. The method according to claim 12, wherein the produced grouped
database is stored in a memory device that is inaccessible by any
user.
15. The method according to claim 12, wherein the step of
performing a statistical analysis is executed without locking the
trial database.
16. The method according to claim 12, further comprising alerting a
user if it is determined that the result of the statistical
analysis exceeds the predetermined threshold value.
17. The method according to claim 16, wherein the predetermined
threshold value includes a predetermined statistical significance
value.
18. A system for continuously analyzing an ongoing clinical trial
comprising: a storage device operable to store a trial database
containing trial data of subjects in an ongoing clinical trial; a
processor coupled to the storage device; and an analysis program
executable by the processor and operable to: perform a statistical
analysis on the trial database; determine whether the output result
of the statistical analysis exceeds a predetermined threshold
value; and repeat the statistical analysis while the clinical trial
is ongoing if it is determined that the result of the statistical
analysis does not exceed the predetermined threshold value.
19. The system according to claim 18, wherein the analysis program
is further operable to: read a user defined criteria that defines
the level of cleanliness of trial data for statistical analysis;
and retrieve only those trial data that meet the user defined
criteria from the trial database.
20. The system according to claim 18, wherein if the analysis
program determines that the result of the statistical analysis does
not exceed the predetermined threshold value, then the analysis
program waits for a predetermined time period prior to repeating
the statistical analysis.
21. The system according to claim 18, wherein the clinical trial is
a blinded clinical trial, the analysis program further operable to:
access a blinding database containing subject identifiers and
associated study group identifiers, each study group identifier
identifying to which study group an associated subject belongs; and
produce a grouped database from the trial database and the blinding
database for statistical analysis, the grouped database grouping
the trial data according to the study group.
22. The system according to claim 21, further comprising a memory
device coupled to the processor and being inaccessible to any user,
wherein the grouped database is stored only in the memory
device.
23. The system according to claim 18, wherein the analysis program
performs the statistical analysis without locking the trial
database.
24. The system according to claim 18, wherein the analysis program
is further operable to alert a user if it determines that the
result of the statistical analysis exceeds the predetermined
threshold value.
Description
TECHNICAL FIELD OF THE INVENTION
[0001] This application relates to data processing of clinical
trial data and more specifically to a system and method for
statistically analyzing the clinical trial data.
BACKGROUND OF THE INVENTION
[0002] In the United States, the Food and Drug Administration (FDA)
oversees the protection of consumers exposed to health-related
products ranging from food and cosmetics to drugs, gene therapies and
medical devices. Under FDA guidance, clinical trials are
performed to test the safety and efficacy of new drugs, medical
devices or other treatments to ultimately ascertain whether or not
a new medical therapy is appropriate for widespread human
consumption.
[0003] More specifically, once a new drug or medical device has
undergone studies in animals, and results appear favorable, it can
be studied in humans. Before human testing is begun, findings of
animal studies are reported to the FDA to obtain approval to do so.
This report to the FDA is called an application for an
Investigational New Drug (IND).
[0004] The process of experimentation is referred to as a clinical
trial, which involves four phases. In Phase I, a few research
participants, referred to as subjects, (approximately 5 to 10) are
used to determine toxicity of a new treatment. In Phase II, more
subjects (10-20) are used to determine efficacy and further
ascertain safety. Doses are stratified to try to gain information
about the optimal portion. A treatment may be compared to either a
placebo or another existing therapy. In Phase III, efficacy is
determined. For this phase, more subjects on the order of hundreds
to thousands of patients are needed to perform a meaningful
statistical analysis. A treatment may be compared to either a
placebo or another existing therapy. In Phase IV (post-approval
study), the treatment has already been approved by the FDA, but
more testing is performed to evaluate long-term effects and to
evaluate other indications.
[0005] During clinical trials, patients are seen at medical clinics
and asked to participate in a clinical research project by their
doctor, known as an investigator. After the patients sign an
informed consent form, they are considered enrolled in the study,
and are subsequently referred to as study subjects. A study
sponsor, generally considered to be the company developing a new
medical treatment and supporting the research, develops a study
protocol. The study protocol is a document describing the reason
for the experiment, the rationale for the number of subjects
required, the methods used to study the subjects, and any other
guidelines or rules for how the study is to be conducted. Prior to
usage, the study protocol is reviewed and approved by an
Institutional Review Board (IRB). An IRB serves as a peer review
group, which evaluates a protocol to determine its scientific
soundness and ethics for the protection of the subjects and
investigator.
[0006] Creation of Study Groups (Study Arms)
[0007] Subjects enrolled in a clinical study are stratified into
groups that allow data to be assessed in a comparative fashion. In
a common example, one study arm, known as a control group (or
"control"), will use a placebo, whereby a pill containing no active
chemical ingredient is administered. In doing so, comparisons can
be made between subjects receiving actual medication versus
placebo.
[0008] Randomization
[0009] Subjects enrolled into a clinical study are assigned to a
study arm in a random fashion, which is done to avoid biases that
may occur in the selection of subjects for a trial. For example, a
subject who is a particularly good candidate to respond to a new
medication might be intentionally entered into the study arm to
receive real medication and not a placebo. This could skew the data
and outcome of the clinical trial to favor the medication under
study, by the selection of subjects who are most likely to perform
well with the medication. In instances where only one study group
is present, randomization is not performed.
[0010] Blinding
[0011] Blinding is a process by which the study arm assignment for
subjects in a clinical trial is not revealed to the subject (single
blind) or to both the subject and the investigator (double blind).
This minimizes the risk of data bias. Virtually all randomized
trials are blinded by definition. In instances where only one study
group is present, blinding is not performed.
[0012] Statistical Analysis of Trial Data
[0013] Generally, at the end of the trial, the database containing
the completed trial data is shipped to a statistician for analysis.
If particular occurrences, such as adverse events, are seen with an
incidence that is greater in one group over another such that it
exceeds the likelihood of pure chance alone, then it can be stated
that statistical significance has been reached. Using statistical
calculations, the comparative incidence of any given occurrence
between groups can be described by a numeric value, referred to as
a "p-value". A p-value of 1.0 indicates that there is a 100%
likelihood that an incident occurred as the result of chance alone.
Conversely, a p-value of 0.0 indicates that there is a 0%
likelihood that an incident occurred as a result of chance alone.
Generally, values of p<0.05 are considered to be "statistically
significant", and values of p<0.01 are considered "highly
statistically significant".
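As a worked illustration of the group comparison described in paragraph [0013], the following Python sketch computes a two-sided p-value for a 2x2 table of event counts. The use of Fisher's exact test, the function name fisher_exact_two_sided, and the counts are assumptions made for illustration only; the text above does not prescribe a particular test, and the counts are hypothetical rather than trial data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    a, b: subjects with / without the event in group A.
    c, d: subjects with / without the event in group B.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in group A, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table at least as unlikely as the observed
    # one; the small tolerance guards against floating-point ties.
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p)


# Hypothetical counts: 6 of 13 subjects with the event in group A, 1 of 12 in group B.
p_value = fisher_exact_two_sided(6, 7, 1, 11)
print(f"p = {p_value:.4f}")
```

With these hypothetical counts the p-value is roughly 0.07, which falls short of the conventional p<0.05 threshold despite the apparent difference between the groups.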
[0014] In some clinical trials, multiple study arms, or even a
control group, may not be utilized. In such cases, only a single
study group exists with all subjects receiving the same treatment.
This is typically performed when historical data about the medical
treatment, or a competing treatment is already known from prior
clinical trials, and may be utilized for the purpose of making
comparisons.
[0015] The creation of study arms, randomization, and blinding are
techniques that are used in most clinical trials where scientific
rigor is of high importance. However, these methods lead to several
challenges, since they prevent the clinical trial sponsor from
tracking key information related to safety and efficacy.
[0016] Regarding safety, the objective of any clinical trial is to
document the safety of a new treatment. However, in clinical trials
where randomization is conducted between two or more study arms,
this can be determined only as a result of analyzing and comparing
the safety parameters of one study group to another. Unfortunately,
because the study arm assignments are blinded, there is no way to
separate out subjects and their data into corresponding groups for
purposes of performing comparisons while the trial is being
conducted. Since many clinical trials may last for time periods
extending for years, it is conceivable to have a treatment toxicity
go unnoticed for prolonged periods without intervention.
[0017] Regarding efficacy, any clinical trial seeking to document
efficacy will incorporate key variables that are followed during
the course of the trial to draw the desired conclusion. In
addition, studies will define certain outcomes, or endpoints, at
which point a study subject is considered to have completed the
protocol. These parameters, including both key variables and study
endpoints, cannot be analyzed by comparison between study arms
while the subjects are randomized and blinded. This poses potential
problems in ethics and statistical analysis.
[0018] When new medications or other health-related treatments are
of superior efficacy to anything else, it is ethical to allow usage
of the treatment for those in imminent need, even prior to final
government approval. Conversely, when available, it is considered
unethical to withhold such treatments. For example, if a medication
were to be identified that eradicated the Human Immunodeficiency
Virus (HIV), it would be unethical to allow diseased patients to
continue suffering and even die of the illness, while the
medication was being clinically tested for purposes of government
approval. Ideally, in such situations, identification of effective
treatments should occur early in the project. Under these
circumstances, non-treatment arms (i.e., those taking placebos)
could be construed as unethical and should be eliminated. At
present, when clinical trials are randomized and blinded,
identification of a particularly effective treatment may not be
realized until the entire clinical trial is completed.
[0019] Another related problem is statistical power. By definition,
statistical power refers to the probability of a test appropriately
rejecting the null hypothesis, that is, the hypothesis that an
experiment's outcome is the result of chance alone. Clinical research
protocols are engineered to prove a certain hypothesis about a
medical treatment's safety and efficacy, and disprove the null
hypothesis. To do so, statistical power is required, which can be
achieved by obtaining a large enough sample size of subjects in
each study arm. When too few subjects are enrolled into the study
arms, there is the risk of the study not accruing enough subjects
to enable the null hypothesis to be rejected, and thus not reaching
statistical significance. Because clinical trials that are
randomized are blinded, the actual number of subjects distributed
throughout study arms is not defined until the end of the project.
Although this maintains data collection integrity, there are
inherent inefficiencies in the system, regardless of the
outcome.
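The relationship between statistical power and sample size described in paragraph [0019] can be made concrete with a short calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions, n = (z_(1-alpha/2) + z_power)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2 subjects per arm. The formula choice, the function name subjects_per_arm, and the event rates are assumptions for illustration and do not come from the text above.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate subjects needed in each arm to detect a difference in event rates.

    Uses the normal approximation for a two-sided test of two proportions.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical event rates: 20% in the control arm, 10% in the treatment arm.
n_80 = subjects_per_arm(0.20, 0.10)
n_90 = subjects_per_arm(0.20, 0.10, power=0.90)
print(n_80, n_90)
```

Raising the desired power, or shrinking the difference to be detected, increases the required enrollment in each arm.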
[0020] In a case where the study data reaches statistical
significance, as accrual of subjects continues, and data is
received, an optimal time to close a clinical study would be at the
very moment when statistical significance is achieved. While that
moment may arrive earlier in the course of a clinical trial, there
is no way of knowing this, and therefore time and money are lost.
Moreover, study subjects are enrolled above and beyond what is
needed to reach the goals of the study, thus placing human subjects
under experimentation unnecessarily.
[0021] In a case where the study data nearly reaches statistical
significance, while the study data falls short of statistical
significance, there is reason to believe that this is due to a
shortage of enrollment in the study. Frequently, to develop more
supportive data, clinical trials will be extended. These "extension
studies", however, can only begin after a full closure of the
parent study, frequently requiring months to years before starting
again.
[0022] In a case where the study data does not reach statistical
significance, there is no trend toward significance, and there is
little chance of reaching the desired conclusion. In that case, an
optimal time to close a study is as early as possible once the
conclusion can be established that the treatment under
investigation does not work, and study data has little chance of
reaching statistical significance (i.e., it is futile). In
randomized and blinded clinical trials, this conclusion is
difficult to arrive at until data analysis can be conducted. In
these situations, time and money are lost. Moreover, an excess of
human subjects are placed under study unnecessarily.
[0023] Data Safety Monitoring
[0024] To mitigate some of the risks related to the conduct of
randomized and blinded clinical trials, a Data Safety Monitoring
Board (DSMB) may be formed at the beginning of each protocol. In
general, a DSMB is recommended for clinical trials that involve a
potentially serious outcome (e.g., death, heart attack, etc.), are
randomized and blinded, and extend for prolonged periods of time.
In addition, a DSMB is required for trials that are sponsored by
the United States government, namely, the National Institutes of
Health (NIH).
[0025] A DSMB generally consists of members who are domain experts
in the field of study, such as physicians, as well as
bio-statisticians. It is important that DSMB members be separate
from personnel of the sponsor organization, and financial
disclosure for all members is performed to minimize conflicts of
interest. Prior to start of a clinical trial, standard operating
procedures are established for the DSMB, including the frequency of
meetings, initiation of interim analyses, conduct during interim
analyses and criteria for discontinuation of the clinical trial. As
it relates to the safety of study subjects, the DSMB functions to
examine trends of adverse occurrences rather than investigate
specific reports, which are generally left to each IRB responsible
for the activities of any given investigator.
[0026] Data Collection
[0027] A typical method of collecting and analyzing patient data is
illustrated in the flow chart shown in FIG. 1. Patient data or
charts 10 from the clinical trial are collected manually in paper
forms. Using a technology called Electronic Data Capture (EDC) or
Remote Data Entry (RDE), a computer (not shown) displays a Case
Report Form (CRF) to a clinical research coordinator (CRC) 12,
typically a nurse or doctor. The CRC 12 then enters the patient
data 10 through the computer display which is received in block 14
by an EDC system which executes all of the steps included in a box
11. The received data is stored in a clinical trial database 38
through a link 20 which can be an electronic link such as a
telephone line or Internet link. In block 18, it is determined
whether the data inputted by the CRC 12 is clean using one or more
rules. The rules may be implemented by simple range checking
scripts, or by an inference rule engine or deterministic rule
engine in order to identify potential problems with the data.
[0028] In addition to the software programs, block 18 may also
involve research personnel known as monitors or Clinical Research
Associates (CRA) who travel to the various research sites to
perform source document verification (SDV) whereby the data in the
database 38 is reconciled against individual patient charts to the
degree required in the protocol.
[0029] If it is determined that the data entered is not clean, then
block 22 generates a query which is then sent over the link 20 to
the CRC 12. The blocks 14, 18 and 22 are repeated until all of the
subject data 10 are entered. This is an iterative process that
continues until resolution of all queries in the database 38.
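The cleanliness check of block 18 and the query generation of block 22 can be sketched with simple range-checking rules. This is a minimal Python illustration, not the EDC system's actual implementation; the Query class, the RANGE_RULES table, the field names, and the limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Query:
    """A request sent back to the coordinator to correct or confirm a value."""
    subject_id: str
    field: str
    message: str


# Hypothetical range rules: field name -> (minimum, maximum) allowed value.
RANGE_RULES = {
    "systolic_bp": (60, 250),
    "heart_attack": (0, 1),
}


def check_record(subject_id, record, rules=RANGE_RULES):
    """Return a list of queries; an empty list means the record passed every rule."""
    queries = []
    for field, (low, high) in rules.items():
        value = record.get(field)
        if value is None:
            queries.append(Query(subject_id, field, "value is missing"))
        elif not low <= value <= high:
            queries.append(
                Query(subject_id, field, f"value {value} outside {low}-{high}")
            )
    return queries


clean = check_record("S001", {"systolic_bp": 120, "heart_attack": 0})
dirty = check_record("S002", {"systolic_bp": 400})
print(clean, dirty)
```

A record is treated as clean only when no queries remain, which mirrors the iterative entry, check, and query cycle described above.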
[0030] Once all data 10 are entered, block 24 determines whether
the clinical trial is over. If no, then the EDC system continues to
receive the patient trial data 10 through block 14 as the trial
continues. If the trial is over, control passes to block 26 where
the entire database is locked from any changes, deletions or
insertions of the data in the database 38. In one embodiment,
locking involves turning the database 38 into a "read-only"
state.
[0031] In block 28, blinding data is retrieved from a blinding
database. A simplified example blinding database 40 is shown in
FIG. 4. The blinding database 40 is a database table having two
columns. The first column contains a patient subject ID (subject
identifier) and the second column contains an associated study arm
or group the patient belongs to. In the table 40, 13 subjects
belong to Study Arm "A" and 12 subjects belong to Study Arm "B".
Because the database 40 is not associated with actual trial data,
the table 40 by itself is relatively uninformative.
[0032] A simplified example trial database 38 is shown in FIG. 5.
The embodiment shown is a database table containing two columns.
The first column contains a patient subject ID and the second
column is a database field called "Heart Attack" which specifies
whether the subject had a heart attack. An entry of 0 means NO and
an entry of 1 means YES. As can be seen from the trial database 38,
due to blinding of the subjects in the study groups, there is no
way of knowing whether or not any discrepancy exists in the number
of heart attacks seen in Group A versus B. Because the trial is
randomized, without the blinding data 40, the table 38 by itself is
relatively uninformative.
[0033] In block 28, an unblinded database is produced from the
trial database 38 and the retrieved blinding database 40 in which
the subject ID is used as a common key. The result of the
unblinding process of block 28 is shown in FIG. 6 as the unblinded
database 41. In the embodiment shown, one database table is
produced. The table 41 contains subject identifiers, Study Arm of
the subjects, and Heart Attack data of those subjects. As can be
appreciated by a person of ordinary skill in the art, there is a
direct traceability from study data and subject ID to Study
Arm.
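The unblinding step of paragraph [0033] amounts to a join of the two tables on the subject ID. The following Python sketch assumes rows are held as dictionaries; the function name unblind, the field names subject_id, study_arm and heart_attack, and the sample rows are hypothetical and are not the contents of FIGS. 4 to 6.

```python
def unblind(trial_rows, blinding_rows):
    """Join trial data to study arm assignments, using subject ID as the common key."""
    arm_by_subject = {row["subject_id"]: row["study_arm"] for row in blinding_rows}
    return [
        {**row, "study_arm": arm_by_subject[row["subject_id"]]}
        for row in trial_rows
        if row["subject_id"] in arm_by_subject
    ]


# Hypothetical rows for illustration.
trial_rows = [
    {"subject_id": "S001", "heart_attack": 0},
    {"subject_id": "S002", "heart_attack": 1},
]
blinding_rows = [
    {"subject_id": "S001", "study_arm": "A"},
    {"subject_id": "S002", "study_arm": "B"},
]
unblinded = unblind(trial_rows, blinding_rows)
print(unblinded)
```

Each output row carries the subject ID, the study arm, and the trial data together, which is what makes the study arm directly traceable from the study data.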
[0034] In block 30, statistical analysis is performed on the
unblinded data 42 to find out the efficacy and safety of the
completed clinical trial.
[0035] During the course of any given randomized and blinded
clinical trial, an interim analysis may be conducted. An interim
analysis may result from urging of the DSMB for cause, or be a
pre-planned event as described in the study protocol.
[0036] Conducting an interim analysis involves a process where the
available data is verified and cleaned. The verification process
generally involves a process by which trained personnel travel to
the various research sites to reconcile submitted data against
source documents, which generally implies the patient's chart,
laboratory reports, radiographic readings, and others. The data
cleaning process may involve a series of documented communications
between the research site and a central data coordinating personnel
to resolve inconsistencies or other conflicting data.
[0037] The refined database must then be sent to an impartial third
party for statistical analysis. To conduct the analysis, the
statistician must un-blind the clinical trial database by combining
both the study data with the blinding key of which subjects are
assigned to particular study arms. Since the clinical study is
expected to continue beyond the interim analysis, the process of
un-blinding must be conducted with great caution, so as not to
reveal the blind status of subjects to any personnel involved in
the execution of the clinical trial. Once a statistician has
completed the interim analysis, a report is issued to the trial
sponsor and DSMB.
[0038] Inclusive of the data cleaning, verification, un-blinding
and statistical analysis processes, as well as the administrative
resources for coordinating several groups of personnel for the
un-blinding process, an interim analysis is often arduous,
time-consuming and expensive.
[0039] In spite of the latest technological advancements made in
the area of data collection through electronic systems, there is
still a disadvantage in that it is very difficult to draw
conclusions about a medical treatment while the data is being
collected during the trial. This limitation stems primarily from
the fact that statistical analysis cannot begin until the trial
data has been fully cleaned and processed. At present, statistical
analysis can only be conducted upon data in an "en bloc" fashion.
This creates a situation where the ability to draw conclusions
about a medical therapy inevitably lags behind the process of
simply obtaining data in a database.
[0040] Regardless of how efficient the data collection process may
be made through automation, the ability to acquire the information
needed for critical decision-making is still suspended by the
requirement to obtain a locked database in order for statistical
work to advance.
[0041] Therefore, it is desirable to provide a method and system
for conducting statistical analysis on the clinical data collected
while the trial is ongoing.
[0042] In the case of a randomized clinical trial where maintaining
confidentiality is important, it is also desirable to provide a
secure system in which the blinding information is integrated in
such a way that the clinical trial data and blinding data are
stored securely to prevent users from accessing the data and yet
allow the execution of programs for performing statistical
comparisons between study arms while the trial is ongoing.
SUMMARY OF THE INVENTION
[0043] According to the present invention, a system and method of
continuously analyzing trial data of an ongoing clinical trial is
provided. A trial database containing subject trial data in a
clinical trial is accessed, and a statistical analysis is performed
on the accessed trial database. If the result of the statistical
analysis does not exceed a predetermined threshold value, then the
step of statistical analysis is repeated while the clinical trial
is ongoing.
[0044] In another aspect of the invention, the present method uses
a user definable criteria that defines the level of cleanliness of
subject data for statistical analysis. In that case, only those
subject data that meet the user defined criteria are selected from
the trial database for statistical analysis.
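The selection by cleanliness described in paragraph [0044] can be sketched as a filter over a status field. In this minimal Python illustration, the status names, their ordering, and the interpretation of the user defined criteria as a minimum required level are assumptions; the text does not specify them.

```python
# Hypothetical cleanliness levels, ordered from least to most clean.
STATUS_LEVELS = ["entered", "queries_resolved", "source_verified"]


def filter_by_cleanliness(trial_rows, minimum_status):
    """Keep only the rows whose status is at least as clean as minimum_status."""
    threshold = STATUS_LEVELS.index(minimum_status)
    return [
        row for row in trial_rows
        if STATUS_LEVELS.index(row["status"]) >= threshold
    ]


# Hypothetical rows for illustration.
rows = [
    {"subject_id": "S001", "heart_attack": 0, "status": "entered"},
    {"subject_id": "S002", "heart_attack": 1, "status": "queries_resolved"},
    {"subject_id": "S003", "heart_attack": 0, "status": "source_verified"},
]
selected = filter_by_cleanliness(rows, "queries_resolved")
print([row["subject_id"] for row in selected])
```

A stricter criterion yields fewer records but cleaner data, so the user can trade timeliness against data quality.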
[0045] In another aspect of the invention, when the result of the
statistical analysis does not exceed the predetermined threshold
value, then the analysis program waits for a predetermined time
period prior to repeating the statistical analysis step. This is
done so that additional subject data are added to the trial
database.
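The repeat-and-wait behavior of paragraphs [0043] and [0045] can be sketched as a polling loop. This is a minimal Python illustration under stated assumptions: the function name monitor_trial and its parameters are hypothetical, the caller supplies the functions that load data, run the analysis, and report whether the trial is ongoing, and the analysis result is assumed to be a number for which larger values indicate stronger evidence.

```python
import time


def monitor_trial(load_data, analyze, threshold, trial_is_ongoing,
                  wait_seconds=0.0, sleep=time.sleep):
    """Repeat the analysis until its result exceeds the threshold or the trial ends.

    Returns the first result that exceeds the threshold, or None if the
    trial ended before the threshold was exceeded.
    """
    while trial_is_ongoing():
        result = analyze(load_data())
        if result > threshold:
            return result
        # Wait so that additional subject data can accumulate before retrying.
        sleep(wait_seconds)
    return None


# Simulated usage: each call to load_data returns a larger snapshot of the data,
# and the stand-in "analysis" is simply the number of records available.
snapshots = iter([[1], [1, 2], [1, 2, 3]])
waits = []
outcome = monitor_trial(lambda: next(snapshots), len, 2, lambda: True,
                        wait_seconds=0, sleep=waits.append)
print(outcome, waits)
```

The comparison result > threshold mirrors the wording "exceeds the predetermined threshold value". When the analysis yields a p-value, for which smaller values indicate significance, the analyze function could instead return 1 - p; that convention is an assumption, not something the text states.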
[0046] In another aspect, the clinical trial is blinded.
Accordingly, in addition to the trial database, a blinding database
containing subject identifiers and associated study group
identifiers is accessed. Each study group identifier identifies
which study group an associated subject belongs to. Then a grouped
database is produced from the trial database and the blinding
database for statistical analysis in which the grouped database
groups trial data according to the study group the subjects belong
to. Preferably, one data table is created for each study group and
contains all trial data for those subjects that belong to that
study group.
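The preferred arrangement of paragraph [0046], with one data table per study group, can be sketched as follows. This minimal Python illustration assumes rows are held as dictionaries; the function name build_grouped_database, the field names, and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_grouped_database(trial_rows, blinding_rows):
    """Return a mapping of study group identifier -> list of that group's trial rows."""
    group_by_subject = {
        row["subject_id"]: row["study_group"] for row in blinding_rows
    }
    grouped = defaultdict(list)
    for row in trial_rows:
        group = group_by_subject.get(row["subject_id"])
        if group is not None:  # skip subjects that have no group assignment
            grouped[group].append(row)
    return dict(grouped)


# Hypothetical rows for illustration.
trial = [
    {"subject_id": "S001", "heart_attack": 0},
    {"subject_id": "S002", "heart_attack": 1},
    {"subject_id": "S003", "heart_attack": 0},
]
blinding = [
    {"subject_id": "S001", "study_group": "A"},
    {"subject_id": "S002", "study_group": "B"},
    {"subject_id": "S003", "study_group": "A"},
]
grouped = build_grouped_database(trial, blinding)
print({group: len(rows) for group, rows in grouped.items()})
```

Keeping one table per group lets the statistical analysis compare groups directly without consulting the blinding database again.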
[0047] In yet another aspect of the invention, the unblinded database
is stored in a memory device that is inaccessible by any user in
order to preserve the blindness of the trial.
[0048] In another aspect of the invention, the statistical analysis is
performed without locking the trial database.
[0049] In another aspect of the invention, if the result of the
statistical analysis exceeds the predetermined threshold value, a
user is alerted. The predetermined threshold value may include a
predetermined statistical significance value.
[0050] In another aspect of the search, there are many statistical
models to choose from. A user selectable statistical model is
retrieved and the retrieved model is run on the trial database.
BRIEF DESCRIPTION OF THE DRAWINGS
[0051] FIG. 1 is a flow diagram of a method of collecting and
analyzing clinical trial data using an EDC system.
[0052] FIG. 2 is a functional block diagram of a clinical trial
management system according to an exemplary embodiment of the
present invention.
[0053] FIG. 3 is a flow diagram of a software routine that
continuously analyzes the trial data while the trial is ongoing
according to the present invention.
[0054] FIG. 4 is an example of a blinding database.
[0055] FIG. 5 is an example of a trial database containing subject
trial data.
[0056] FIG. 6 is an example of an unblinded database derived from
the blinding database of FIG. 4 and the trial database of FIG.
5.
[0057] FIG. 7A is an example of a trial database containing a
status field that represents the levels of cleanliness of the
subject data records.
[0058] FIG. 7B is a filtered trial database containing a subset of
the trial database of FIG. 7A which have been selected according to
a user specified status.
[0059] FIG. 8 is an example of a grouped database derived from the
blinding database of FIG. 4 and the filtered trial database of FIG.
7B.
DETAILED DESCRIPTION OF THE INVENTION
[0060] As shown in FIG. 2, a clinical trial data management system
100 of the present invention is an Internet-enabled application
solution framework that automates data collection, data cleaning,
grouping if needed (as will be explained more fully later herein)
and statistical data analysis while the trial is ongoing. The
system 100 is connected to a computer network such as the Internet
120 through, for example, an I/O interface 102, which receives
information from and sends information to Internet users over a
communication link 20 and to one or more operators using a work
station 117. The Internet users are typically CRCs located at
various trial sites who transcribe the subjects' charts to the
system 100. The system 100 includes, for example, memory 104 which
is volatile, processor (CPU) 106, program storage 108, and data
storage device 118, all commonly connected to each other through a
bus 112. The program storage 108 stores, among others, a clinical
trial analysis program or module 114 and one or more mathematical
models 116 that are used to analyze the subject data and obtain the
p-value for statistical significance. The data storage device 118
stores a clinical trial database 38 and blinding database 40.
Software program modules in the program storage 108 and data from
the data storage device 118 are transferred to the memory 104 as
needed, and the modules are executed by the processor 106.
[0061] The system 100 can be any computer such as a WINDOWS-based
or UNIX-based personal computer, server, workstation, minicomputer
or a mainframe, or a combination thereof. While the system 100 is
illustrated as a single computer unit for purposes of clarity,
persons of ordinary skill in the art will appreciate that the
system may comprise a group of computers which can be scaled
depending on the processing load and database size.
[0062] FIG. 3 illustrates a flow diagram of a software routine 50
that continuously analyzes the trial data while the trial is
ongoing according to the present invention. The routine 50 is
stored in the storage device 108 and works with the EDC system 11
of FIG. 1 while the system 11 continuously collects and cleans the
trial data.
[0063] In block 52, the routine 50 connects to a trial database 56
through a log-in procedure. A simplified trial database 56 is shown
in FIG. 7A. The database 56 contains three columns comprising a
patient subject ID field, a data status field, which specifies the
level of cleanliness, and a "Heart Attack" field similar to FIG.
5.
[0064] FIG. 7A illustrates simplified trial data records that are
at different levels of cleanliness. In the example shown in FIG.
7A, there are five levels of status. Level 1 indicates that there
is an outstanding query that needs to be answered by the CRC 12
(see step 22 in FIG. 1). Level 2 indicates that the record is
pending a review by another reviewer such as the sponsor of the
trial. Level 3 indicates that it is pending a review by a clinical
research associate (CRA) to travel to a research site to perform
what is known as a source document verification (SDV). This
typically involves a verification of the trial record with an
actual patient chart. Level 4 indicates that it is pending a lock
barring any intervention by any reviewer. Finally, Level 5
indicates that the record is locked which represents the highest
level of clean data.
[0065] In the "Heart Attack" field, an entry of 0 means NO and
entry of 1 means YES. The "Heart Attack" field also includes some
erroneous data such as "don't know" for subject 118 or "Y" for
subject 107. Accordingly, the status for those records indicates a
"1" in which queries are outstanding.
[0066] Once connected, the routine 50 retrieves in block 60 a user
specified criterion 54 stored in the storage device 108 which
specifies the required status or level of cleanliness of the trial
data, and in block 61 retrieves the trial database 56, which is
filtered for those records that satisfy the retrieved criterion.
For example, if the retrieved user specified criterion is 3, block
61 selects only those records that have a status of 3 or better.
Such a filtered database 58 is shown in FIG. 7B. While the database
58 has a relatively higher level of cleanliness, it has fewer
records. This filtering is useful since, at any given point in time
during the data collection process, the clinical trial database 56
may have data that has any combination of data pending SDV,
containing outstanding queries, completed SDV but awaiting lock,
and so on. Depending upon the operating procedures defined for any
such clinical trial, only certain subsets of data may be suitable
for inclusion in an analysis.
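By way of illustration only, the filtering of blocks 60 and 61 might be sketched in Python as follows. The names Status, TrialRecord and filter_by_status, and the sample records, are hypothetical and do not appear in the figures; only the five status levels and the erroneous entries for subjects 107 and 118 are taken from the description above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    """Cleanliness levels described for FIG. 7A (member names are hypothetical)."""
    QUERY_OUTSTANDING = 1
    PENDING_REVIEW = 2
    PENDING_SDV = 3
    PENDING_LOCK = 4
    LOCKED = 5


@dataclass(frozen=True)
class TrialRecord:
    subject_id: int
    status: Status
    heart_attack: str  # raw entry: "0" = NO, "1" = YES, anything else is unclean


def filter_by_status(records, minimum):
    """Keep only the records whose status is at or above the requested level."""
    return [r for r in records if r.status >= minimum]


# Illustrative records only; these are not the contents of FIG. 7A.
trial_db = [
    TrialRecord(101, Status.LOCKED, "0"),
    TrialRecord(107, Status.QUERY_OUTSTANDING, "Y"),
    TrialRecord(112, Status.PENDING_SDV, "1"),
    TrialRecord(118, Status.QUERY_OUTSTANDING, "don't know"),
    TrialRecord(123, Status.PENDING_REVIEW, "0"),
]

# A user specified criterion of 3 selects records with a status of 3 or better.
filtered_db = filter_by_status(trial_db, Status.PENDING_SDV)
print([r.subject_id for r in filtered_db])
```

Because the status levels are ordered, a single comparison against the user specified criterion is enough in this sketch to exclude records with outstanding queries.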
[0067] Once the trial database 56 is filtered according to the user
specified criteria, block 62 is executed. In block 62, the blinding
database 40 is retrieved in the memory 104. In block 64, the
filtered trial database 58 and the blinding database 40 are used to
produce a grouped database 42. In the embodiment shown, two
database tables 66, 68, one for each study group without
identifying subjects, are produced. One table 66 groups the Heart
Attack data of subjects that belong to a control group (Study Arm
A) while the other table 68 groups the Heart Attack data of
subjects that belong to a non-control group (Study Arm B). As can
be appreciated by persons of ordinary skill in the art, there is no
way to trace any given data point in either table 66 or table 68
back to its original subject, and therefore either table, by
itself, is relatively uninformative. Taken together, however, the
tables suggest that many more heart attacks are occurring in Study
Arm B.
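The grouping of block 64 can be illustrated with a minimal Python sketch, offered under stated assumptions rather than as the actual implementation. The function name build_grouped_tables and the sample data are hypothetical. Sorting each output table is an added assumption, made here so that row order cannot be matched back to the order of the input records.

```python
def build_grouped_tables(filtered_records, blinding):
    """Return {study_arm: [outcome, ...]} with subject identifiers removed.

    filtered_records: iterable of (subject_id, outcome) pairs.
    blinding: mapping of subject_id to study arm identifier.
    """
    tables = {}
    for subject_id, outcome in filtered_records:
        arm = blinding[subject_id]  # the look-up stays inside this function
        tables.setdefault(arm, []).append(outcome)
    # Assumption: sort each table so that row order reveals nothing
    # about which subject contributed which value.
    return {arm: sorted(values) for arm, values in tables.items()}


# Illustrative data only; not the contents of FIG. 4, FIG. 7B or FIG. 8.
filtered_records = [(101, 0), (102, 1), (103, 0), (104, 1), (105, 1), (106, 0)]
blinding = {101: "A", 102: "B", 103: "A", 104: "B", 105: "B", 106: "A"}

grouped = build_grouped_tables(filtered_records, blinding)
print(grouped)
```

The returned tables carry only outcome values per study arm, so no single structure in the output holds the subject identifier, study group and outcome together.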
[0068] In the embodiment shown in FIG. 3, the trial is a randomized
clinical trial. In that embodiment, the system 100 maintains the
clinical trial database 38 and the blinding database 40 as separate
physical and digital entities, in order to maintain their distinct
nature. In other words, the trial data and blind data remain as two
separate data tables and no table is created where the subject
identifier, study group and heart attack status would all be found
in that same table. Furthermore, system communication with the
blinding database table occurs only by virtue of the machine
programs executing specified actions to sort the clinical trial
data. The clinical trial data in this scenario is preferably
segregated into generic pools of data and remains de-identified to
both the subject and the study arm, and thus indecipherable from
the standpoint of the ability to trace a particular data item back
to a specific subject. At no point during system function is the
blinding table 40 transferred into an area of the system where the
table is accessible by any user or displayed in any form of output.
This maintains the integrity and confidentiality of the blinding
information.
[0069] In block 70, the routine retrieves a user defined analysis
method 72 stored in the storage device 108 and retrieves the method
from the mathematical models 116 stored in the storage device. The
model is then run to analyze the grouped database 42. Preferably, a
measure of the statistical significance of the safety and efficacy
data in the grouped database 42, known as the p-value, is obtained.
The mathematical model may include one or more formulas,
representing mathematical calculations, whereby one or more
variables in the clinical trial database are identified, and a
numeric result may be obtained. Such
formulas might include calculations of: mean, median, mode, range,
average deviation, standard deviation, and variance. In addition,
an administrator may enter mathematical formulas to further analyze
the data to make comparisons between groups of data, as defined by
the study arms, to determine statistical metrics and significance
by methods including Chi-square analysis, t-test, f-test,
one-tailed test, two-tailed test, and Analysis of Variance
(ANOVA).
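As one hedged example of such a comparison between study arms, the following Python sketch computes a p-value from a Pearson Chi-square test on a 2x2 table of event counts, without continuity correction. The function name chi_square_p_value and the counts are hypothetical; the description does not prescribe this particular test or these numbers.

```python
import math


def chi_square_p_value(events_a, total_a, events_b, total_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table."""
    observed = [
        [events_a, total_a - events_a],
        [events_b, total_b - events_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [observed[0][0] + observed[1][0],
                  observed[0][1] + observed[1][1]]
    grand_total = total_a + total_b

    # If a row or column is empty the statistic is undefined;
    # this sketch reports "no evidence of a difference" in that case.
    if 0 in row_totals or 0 in col_totals:
        return 1.0

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: 5 of 50 events in Study Arm A, 15 of 50 in Study Arm B.
p = chi_square_p_value(events_a=5, total_a=50, events_b=15, total_b=50)
print(round(p, 4))
```

For these counts the statistic is 6.25, which corresponds to a p-value of roughly 0.012.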
[0070] Once the mathematical analysis is completed, a user-defined
p-value 74 stored in the storage device 108 is retrieved in block
76. In block 78, it is determined whether the derived p-value
exceeds the threshold set by the retrieved user defined p-value,
that is, whether the derived p-value is lower than the user defined
p-value. As discussed in detail previously, a typical user defined
p-value may be 0.05, meaning that a derived p-value below 0.05
indicates that the difference between the control group and
non-control group is statistically significant. Thus, if the
derived value is less than 0.05, the decision in block 78 is YES.
Then, the routine sends
an alert in block 80 without displaying the actual output value(s).
The alert can be in the form of a flashing display, alarm, a change
in the system output display to the user by virtue of color-coding,
fonts, icons or text, or an automated system generated message to
the user by way of email, facsimile, telephone or pager.
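The decision of block 78 and the alert of block 80 might be sketched as follows. This is a minimal illustration; the function names threshold_reached and make_alert and the wording of the message are hypothetical. The point shown is that the alert states only that the threshold was reached and never includes the derived p-value.

```python
def threshold_reached(derived_p, user_p=0.05):
    """Block 78 analogue: YES when the derived p-value is below the user's."""
    return derived_p < user_p


def make_alert(parameter, derived_p, user_p=0.05):
    """Block 80 analogue: return an alert text without the numeric value,
    or None when the threshold has not been reached."""
    if threshold_reached(derived_p, user_p):
        return f"ALERT: '{parameter}' has reached the pre-set significance level."
    return None


print(make_alert("Heart Attack", 0.012))
print(make_alert("Heart Attack", 0.20))
```

How the message is delivered (display change, email, pager and so on) is left out of the sketch, since the delivery channel does not affect the decision logic.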
[0071] In block 82, the routine, as an option according to a user
defined output mode 84, can also create other outputs such as the
generic data tables 66, 68 created in block 64. The output data
could take various formats including plain text, American Standard
Code for Information Interchange (ASCII), and SAS. Where
appropriate, this would allow for more customized statistical
analysis to be performed. These outputs may also be integrated with
other software packages for creation of customized graphical
reports.
[0072] If the trial is a randomized clinical trial, it is
preferable to execute only block 80 which provides a Boolean output
as to whether a particular study parameter has reached the
desired level of statistical significance. Block 82 in that
case is then skipped. The benefit of such a mode is to maintain the
blinding information as securely as possible, and minimize the
ability for inference to be made about the study arm of any given
subject. In monitoring the exact numeric determination of
statistical significance for any given clinical trial variable, it
is conceivable that the accession of new data could cause
statistical metrics for a particular study arm to change in such a
manner that inference could be made regarding the blinding status
of the subject whose data was most recently added, thus
compromising the statistical veil.
[0073] Block 82 may be useful in non-randomized trials because
there is a benefit to displaying the specific numeric value
corresponding to statistical significance, and since there is no
blinding information to protect, it would be offered as a second
mode of operation in the system. Alternatively, a third mode could
be provided, whereby numeric ranges of statistical significance
could be defined into groups that would be output to the user of
the system.
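The three reporting modes discussed above (a Boolean result, the exact numeric value, and a grouped range) can be contrasted in a short Python sketch. The names OutputMode and report, and the particular range boundaries in RANGE_EDGES, are assumptions made for illustration; the description does not specify how the ranges would be defined.

```python
from enum import Enum


class OutputMode(Enum):
    BOOLEAN = "boolean"  # first mode: threshold reached or not
    NUMERIC = "numeric"  # second mode: exact p-value
    RANGE = "range"      # third mode: p-value reported as a range


# Hypothetical range boundaries, in ascending order.
RANGE_EDGES = [0.001, 0.01, 0.05]


def report(derived_p, mode, user_p=0.05):
    """Return the output for the selected mode of operation."""
    if mode is OutputMode.BOOLEAN:
        return derived_p < user_p
    if mode is OutputMode.NUMERIC:
        return derived_p
    # RANGE mode: report the first boundary the p-value falls below.
    for edge in RANGE_EDGES:
        if derived_p < edge:
            return f"p < {edge}"
    return f"p >= {RANGE_EDGES[-1]}"


print(report(0.012, OutputMode.BOOLEAN))
print(report(0.012, OutputMode.NUMERIC))
print(report(0.012, OutputMode.RANGE))
```

The range mode reveals less than the exact value, so small shifts caused by a single new record are less likely to be visible to the user.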
[0074] If, however, the p-value derived is higher than the
user-defined p-value, then the derived value does not exceed the
user defined threshold value. In that case, the decision is NO and
the routine 50 executes block 86. In block 86, the routine 50 waits
for a predetermined amount of time and control passes to block 52
where the process of analyzing the trial data while the trial is
ongoing is repeated. In other words, the system 100 is active
throughout the data collection phase of the clinical trial, sending
alerts when key parameters reach the pre-set level of statistical
measure.
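The repeat-and-wait behavior of block 86 can be sketched as a simple loop. This is an illustration under assumptions: the function name monitor, the max_cycles limit (standing in for the end of data collection) and the injectable sleep function are hypothetical conveniences and are not part of the routine 50 as described.

```python
import time


def monitor(analyze, user_p=0.05, wait_seconds=3600.0,
            max_cycles=None, sleep=time.sleep):
    """Repeat the analysis until the threshold is reached.

    analyze: callable that reconnects, filters, groups and returns a p-value.
    Returns the number of cycles run when the alert condition is met,
    or None if max_cycles is exhausted first.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        cycles += 1
        derived_p = analyze()
        if derived_p < user_p:
            return cycles          # decision YES: an alert would be sent here
        sleep(wait_seconds)        # decision NO: wait while more data accrue
    return None


# Simulated analyses whose p-value falls as data accumulate (made-up values).
p_values = iter([0.40, 0.20, 0.03])
result = monitor(lambda: next(p_values), sleep=lambda seconds: None)
print(result)
```

Passing a no-op sleep function lets the loop be exercised without delay; in use, the default time.sleep would provide the predetermined waiting period.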
[0075] As can be appreciated by persons of ordinary skill in the
art, the ability of the present clinical trial system 100 to
continuously and confidentially monitor and analyze the trial data
for statistical significance in tandem with data collection while
the trial is ongoing is a tremendous benefit to the researchers.
The trial database no longer becomes the bottleneck in obtaining
useful results and statistical analysis can be conducted on a near
real-time basis.
[0076] This continuous near real-time statistical analysis feature
in turn has far reaching implications. Specifically, by providing
researchers with an early indication of the results of the clinical
trial, the present invention shortens the time frame required to reach
critical decisions about a new medical therapy. Still another
advantage is that the present system improves patient safety by
setting thresholds for triggering alerts for adverse events. A
related advantage is that a futile trial can be ended early,
thereby saving the substantial cost of conducting the trial.
Conversely, for a successful medical treatment, a trial can be
ended early or the placebo arm can be eliminated. The present
invention also provides the ability to more accurately identify the
need to perform a full-scale interim analysis.
[0077] Various omissions, modifications, substitutions and changes
in the forms and details of the device illustrated and in its
operation can be made by those skilled in the art without departing
in any way from the spirit of the present invention. Accordingly,
the scope of the invention is not limited to the foregoing
specification, but instead is given by the appended claims along
with their full range of equivalents.
* * * * *