U.S. patent application number 13/889046 was filed with the patent office on 2013-09-12 for compositions for use in identification of orthopoxviruses.
This patent application is currently assigned to IBIS BIOSCIENCES, INC. The applicants listed for this patent are David J. Ecker, Thomas A. Hall, Steven A. Hofstadler, and Rangarajan Sampath. Invention is credited to David J. Ecker, Thomas A. Hall, Steven A. Hofstadler, and Rangarajan Sampath.
Application Number | 20130236884 13/889046 |
Family ID | 37494546 |
Filed Date | 2013-09-12 |
United States Patent Application | 20130236884 |
Kind Code | A1 |
Sampath; Rangarajan; et al. | September 12, 2013 |
COMPOSITIONS FOR USE IN IDENTIFICATION OF ORTHOPOXVIRUSES
Abstract
Oligonucleotide primers and compositions and kits containing the
same for rapid identification of orthopoxviruses by amplification
of a segment of viral nucleic acid followed by molecular mass
analysis are provided.
Inventors: | Sampath; Rangarajan (San Diego, CA); Hall; Thomas A. (Oceanside, CA); Ecker; David J. (Encinitas, CA); Hofstadler; Steven A. (Vista, CA) |
Applicant: |
Name | City | State | Country | Type |
Sampath; Rangarajan | San Diego | CA | US | |
Hall; Thomas A. | Oceanside | CA | US | |
Ecker; David J. | Encinitas | CA | US | |
Hofstadler; Steven A. | Vista | CA | US | |
Assignee: | IBIS BIOSCIENCES, INC. (Carlsbad, CA) |
Family ID: | 37494546 |
Appl. No.: | 13/889046 |
Filed: | May 7, 2013 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13451216 | Apr 19, 2012 | |
13889046 | | |
11210516 | Aug 24, 2005 | 8163895 |
13451216 | | |
10728486 | Dec 5, 2003 | 7718354 |
11210516 | | |
60604329 | Aug 24, 2004 | |
Current U.S. Class: | 435/5 |
Current CPC Class: | C12Q 1/6888 20130101; C12Q 1/701 20130101 |
Class at Publication: | 435/5 |
International Class: | C12Q 1/70 20060101 C12Q001/70 |
Government Interests
STATEMENT OF GOVERNMENT SUPPORT
[0002] This invention was made with United States Government
support under DARPA/SPO contract BAA00-09. The United States
Government may have certain rights in the invention.
Claims
1-18. (canceled)
19. A method for identification of an unknown orthopoxvirus
comprising: amplifying nucleic acid from said orthopoxvirus using
an oligonucleotide primer 13 to 35 nucleobases in length comprising
at least 70% sequence identity with SEQ ID NO: 1 and an
oligonucleotide primer 15 to 35 nucleobases in length comprising at
least 70% sequence identity with SEQ ID NO: 24 to obtain an
amplification product; measuring the molecular mass of said
amplification product; optionally, determining the base composition
of said amplification product from said molecular mass; and
comparing said molecular mass or base composition with a plurality
of molecular masses or base compositions of known orthopoxvirus
bioagent identifying amplicons, wherein a match between said
molecular mass or base composition and a member of said plurality
of molecular masses or base compositions identifies said unknown
orthopoxvirus.
20. A method of determining the presence or absence of an
orthopoxvirus species in a sample comprising: amplifying nucleic
acid from said sample using the composition of an oligonucleotide
primer 13 to 35 nucleobases in length comprising at least 70%
sequence identity with SEQ ID NO: 1 and an oligonucleotide primer
15 to 35 nucleobases in length comprising at least 70% sequence
identity with SEQ ID NO: 24 to obtain an amplification product;
determining the molecular mass of said amplification product;
optionally, determining the base composition of said amplification
product from said molecular mass; and comparing said molecular mass
or base composition of said amplification product with the known
molecular masses or base compositions of one or more known
orthopoxvirus species bioagent identifying amplicons, wherein a
match between said molecular mass or base composition of said
amplification product and the molecular mass or base composition of
one or more known orthopoxvirus species bioagent identifying
amplicons indicates the presence of said orthopoxvirus species in
said sample.
21. A method for determination of the quantity of an unknown
orthopoxvirus in a sample comprising: contacting said sample with
an oligonucleotide primer 13 to 35 nucleobases in length comprising
at least 70% sequence identity with SEQ ID NO: 1 and an
oligonucleotide primer 15 to 35 nucleobases in length comprising at
least 70% sequence identity with SEQ ID NO: 24 and a known quantity
of a calibration polynucleotide comprising a calibration sequence;
concurrently amplifying nucleic acid from said orthopoxvirus in
said sample with an oligonucleotide primer 13 to 35 nucleobases in
length comprising at least 70% sequence identity with SEQ ID NO: 1
and an oligonucleotide primer 15 to 35 nucleobases in length
comprising at least 70% sequence identity with SEQ ID NO: 24 and
amplifying nucleic acid from said calibration polynucleotide in
said sample with an oligonucleotide primer 13 to 35 nucleobases in
length comprising at least 70% sequence identity with SEQ ID NO: 1
and an oligonucleotide primer 15 to 35 nucleobases in length
comprising at least 70% sequence identity with SEQ ID NO: 24 to
obtain a first amplification product comprising an orthopoxvirus
bioagent identifying amplicon and a second amplification product
comprising a calibration amplicon; determining the molecular mass
and abundance for said orthopoxvirus bioagent identifying amplicon
and said calibration amplicon; and distinguishing said
orthopoxvirus bioagent identifying amplicon from said calibration
amplicon based on molecular mass, wherein comparison of
orthopoxvirus bioagent identifying amplicon abundance and
calibration amplicon abundance indicates the quantity of
orthopoxvirus in said sample.
22. The method of claim 21 further comprising repeating said steps,
wherein a different primer pair is used, wherein each member of
said different primer pair is of a length of 13 to 35 nucleobases
and comprises 70% to 100% sequence identity with the corresponding
member of any of the pairs of primers of SEQ ID NOs: 2:25, 3:26,
5:28, 6:29, or 7:30.
23. The method of claim 21 further comprising repeating said steps,
wherein two different primer pairs are used, wherein each member of
said two different primer pairs is of a length of 13 to 35
nucleobases and comprises 70% to 100% sequence identity with the
corresponding member of any of the pairs of primers of SEQ ID NOs:
2:25, 3:26, 5:28, 6:29, or 7:30.
24. The method of claim 21 further comprising repeating said steps,
wherein three different primer pairs are used, wherein each member
of said three different primer pairs is of a length of 13 to 35
nucleobases and comprises 70% to 100% sequence identity with the
corresponding member of any of the pairs of primers of SEQ ID NOs:
2:25, 3:26, 5:28, 6:29, or 7:30.
25. The method of claim 21 further comprising repeating said steps,
wherein four different primer pairs are used, wherein each member
of said four different primer pairs is of a length of 13 to 35
nucleobases and comprises 70% to 100% sequence identity with the
corresponding member of any of the pairs of primers of SEQ ID NOs:
2:25, 3:26, 5:28, 6:29, or 7:30.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation-in-part of U.S. patent
application Ser. No. 10/728,486 filed Dec. 5, 2003, and claims the
benefit of priority to U.S. Provisional Application Ser. No.
60/604,329 filed Aug. 24, 2004, each of which is incorporated
herein by reference in its entirety.
FIELD OF THE INVENTION
[0003] The present invention relates generally to the field of
genetic identification and quantification of orthopoxviruses and
provides methods, compositions and kits useful for this purpose, as
well as others, when combined with molecular mass analysis.
BACKGROUND OF THE INVENTION
A. Orthopoxviruses
[0004] The poxviruses comprise a large family of complex DNA
viruses that infect both vertebrate and invertebrate hosts. General
properties of the Poxvirus family include (a) a large complex
virion containing enzymes for mRNA synthesis, (b) a genome composed
of a single linear double-strand DNA molecule of 130 to 300
kilobases, and (c) the ability to replicate within the cytoplasmic
compartment of the cell. The vertebrate poxviruses have been placed
into six genera: Orthopoxvirus, Parapoxvirus, Capripoxvirus,
Leporipoxvirus, Suipoxvirus, and Avipoxvirus.
[0005] Three members of the Orthopoxvirus genus are known to cause
disease in humans. The most notorious member of the Poxvirus family
is the variola virus which, before its eradication, was responsible
for smallpox. Cowpox virus and Monkeypox virus also cause disease
in humans. Additional members of the Orthopoxvirus genus include:
Buffalopox virus, Camelpox virus, Rabbitpox virus, Raccoonpox
virus, Volepox virus and Ectromelia virus.
B. Bioagent Detection
[0006] A problem in determining the cause of a natural infectious
outbreak or a bioterrorist attack is the sheer variety of organisms
that can cause human disease. There are over 1400 organisms
infectious to humans; many of these have the potential to emerge
suddenly in a natural epidemic or to be used in a malicious attack
by bioterrorists (Taylor et al., Philos. Trans. R. Soc. London B.
Biol. Sci., 2001, 356, 983-989). This number does not include
numerous strain variants, bioengineered versions, or pathogens that
infect plants or animals.
[0007] Much of the new technology being developed for detection of
biological weapons incorporates a polymerase chain reaction (PCR)
step based upon the use of highly specific primers and probes
designed to selectively detect individual pathogenic organisms.
Although this approach is appropriate for the most obvious
bioterrorist organisms, like smallpox and anthrax, experience has
shown that it is very difficult to predict which of hundreds of
possible pathogenic organisms might be employed in a terrorist
attack. Likewise, naturally emerging human disease that has caused
devastating consequence in public health has come from unexpected
families of bacteria, viruses, fungi, or protozoa. Plants and
animals also have their natural burden of infectious disease agents
and there are equally important biosafety and security concerns for
agriculture.
[0008] An alternative to single-agent tests is to perform
broad-range consensus priming of a gene target conserved across
groups of bioagents. Broad-range priming has the potential to
generate amplification products across entire genera, families, or,
as with bacteria, an entire domain of life. This strategy has been
successfully employed using consensus 16S ribosomal RNA primers for
determining bacterial diversity, both in environmental samples
(Schmidt et al., J. Bact., 1991, 173, 4371-4378) and in natural
human flora (Kroes et al., Proc. Nat. Acad. Sci. (USA), 1999, 96,
14547-14552). One drawback of this approach for unknown bioagent
detection and epidemiology is that analysis of the PCR products
requires cloning and sequencing of hundreds to thousands of
colonies per sample, which is impractical to perform rapidly or on
a large number of samples.
[0009] Conservation of sequence is not as universal for viruses.
Large groups of viral species, however, share conserved
protein-coding regions, such as regions encoding viral polymerases
or helicases. As with bacteria, consensus priming has also been
described for detection of several viral families, including
coronaviruses (Stephensen et al., Vir. Res., 1999, 60, 181-189),
enteroviruses (Oberste et al., J. Virol., 2002, 76, 1244-51;
Oberste et al., J. Clin. Virol., 2003, 26, 375-7; and Oberste et
al., Virus Res., 2003, 91, 241-8), retroid viruses (Mack et al.,
Proc. Natl. Acad. Sci. U.S.A., 1988, 85, 6977-81; Seifarth et al.,
AIDS Res. Hum. Retroviruses, 2000, 16, 721-729; and Donehower et
al., J. Vir. Methods, 1990, 28, 33-46), and adenoviruses
(Echavarria et al., J. Clin. Micro., 1998, 36, 3323-3326). However,
as with bacteria, there is no adequate analytical method other than
sequencing to identify the viral bioagent present.
[0010] In contrast to PCR-based methods, mass spectrometry provides
detailed information about the molecules being analyzed, including
high mass accuracy. It is also a process that can be easily
automated. DNA chips with specific probes can only determine the
presence or absence of specifically anticipated organisms. Because
there are hundreds of thousands of species of benign pathogens,
some very similar in sequence to threat organisms, even arrays with
10,000 probes lack the breadth needed to identify a particular
organism.
[0011] There is a need for a method for identification of bioagents
which is both specific and rapid, and in which no culture or
nucleic acid sequencing is required.
[0012] The present invention provides, inter alia, methods of
identifying unknown viruses, including viruses of the Orthopoxvirus
genus. Also provided are oligonucleotide primers, compositions, and
kits containing the oligonucleotide primers, which define
orthopoxvirus identifying amplicons and, upon amplification,
produce corresponding amplification products whose molecular masses
provide the means to identify orthopoxviruses at the species and
sub-species or strain level.
SUMMARY OF THE INVENTION
[0013] The present invention provides, inter alia, primers and
compositions comprising pairs of primers, and kits containing the
same for use in identification of orthopoxviruses. The primers are
designed to produce orthopoxvirus identifying amplicons of DNA
encoding genes essential to orthopoxvirus replication. The
invention further provides compositions comprising one or more
pairs of primers and kits containing the same, which are designed
to provide species and sub-species or strain level characterization
of orthopoxviruses.
[0014] The individual orthopoxvirus primers of the invention are
primers that are 13 to 35 nucleobases in length comprising at least
70% sequence identity with any of SEQ ID NOs: 1, 2, 3, 4, 5, 6, 24,
25, 26, 27, 28, and 29. The primer pairs of the invention comprise
these same individual primers in the following combinations: SEQ ID
NOs: 1:24, 2:25, 3:26, 4:27, 5:28, and 6:29. The kits of the
invention can comprise any combination of the same primer
pairs.
[0015] The invention also provides methods of using the primer
pairs and kits comprising the same for identification of
orthopoxviruses and also for determining the presence or absence of
an orthopoxvirus in a sample by using the primer pairs to obtain
orthopoxvirus bioagent identifying amplicons, determining their
molecular masses or base compositions and comparing the molecular
masses or base compositions with molecular masses or base
compositions of known orthopoxvirus bioagent identifying
amplicons.
[0016] The invention also provides orthopoxvirus bioagent
identifying amplicons obtained by amplification of a segment of a
genome of an orthopoxvirus with any of the primer pairs listed
above. The orthopoxvirus genomes from which orthopoxvirus bioagent
identifying amplicons are obtained include, but are not limited to,
the GenBank Accession numbers given in Table 3 (vide infra).
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] FIG. 1 is a representative process diagram illustrating a
representative primer design process.
[0018] FIG. 2 is a representative process diagram for
identification and determination of the quantity of a bioagent in a
sample.
[0019] FIG. 3 is a pseudo 4-D plot of base compositions of
orthopoxviruses obtained with primer pair number 299.
[0020] FIG. 4 is a pseudo 4-D plot of base compositions of
orthopoxviruses obtained with primer pair number 297.
DETAILED DESCRIPTION OF EMBODIMENTS
[0021] The present invention provides, inter alia, methods for
detection and identification of orthopoxviruses in an unbiased
manner using orthopoxvirus identifying amplicons. Intelligent
primers are selected to hybridize to conserved sequence regions of
nucleic acids derived from an orthopoxvirus and which bracket or
flank variable sequence regions to yield an orthopoxvirus
identifying amplicon. The orthopoxvirus identifying amplicon can be
amplified and is amenable to molecular mass determination. The
molecular mass then provides a means to uniquely identify the
orthopoxvirus without a requirement for prior knowledge of the
possible identity of the orthopoxvirus. The molecular mass or
corresponding base composition signature (BCS) of the amplification
product is then matched against a database of molecular masses or
base composition signatures. Furthermore, the method can be applied
to rapid parallel multiplex analyses, the results of which can be
employed in a triangulation identification strategy. The present
method provides rapid throughput and does not require nucleic acid
sequencing of the amplified target sequence for orthopoxvirus
detection and identification.
[0022] In the context of the present invention, a "bioagent" is any
organism, cell, or virus, living or dead, or a nucleic acid derived
from such an organism, cell or virus. Examples of bioagents
include, but are not limited to, cells (including but not limited
to human clinical samples, cell cultures, bacterial cells and other
pathogens), viruses, viroids, fungi, protists, parasites, and
pathogenicity markers (including but not limited to: pathogenicity
islands, antibiotic resistance genes, virulence factors, toxin
genes and other bioregulating compounds). Samples may be alive or
dead or in a vegetative state (for example, vegetative bacteria or
spores) and may be encapsulated or bioengineered. In the context of
this invention, a "pathogen" is a bioagent which causes a disease
or disorder.
[0023] As used herein, "intelligent primers" are primers that are
designed to bind to highly conserved sequence regions of a bioagent
identifying amplicon that flank an intervening variable region and
yield amplification products which ideally provide enough
variability to distinguish each individual bioagent, and which are
amenable to molecular mass analysis. By the term "highly
conserved," it is meant that the sequence regions exhibit between
about 80-100%, or between about 90-100%, or between about 95-100%
identity among all or at least 70%, at least 80%, at least 90%, at
least 95%, or at least 99% of species or strains.
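The "highly conserved" criterion above can be illustrated with a minimal sketch. It assumes the candidate priming region has already been aligned across strains so that all sequences have equal length; the function names, the default thresholds, and the toy sequences are hypothetical and are not taken from this application.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def is_highly_conserved(reference: str, others: list,
                        min_identity: float = 80.0,
                        min_fraction: float = 0.70) -> bool:
    """True if at least `min_fraction` of `others` show at least
    `min_identity` percent identity to `reference` (illustrative thresholds)."""
    if not others:
        raise ValueError("need at least one sequence to compare")
    passing = sum(1 for s in others
                  if percent_identity(reference, s) >= min_identity)
    return passing / len(others) >= min_fraction


# Toy, made-up aligned region from four hypothetical strains.
region = "ACGTACGTAC"
strains = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT", "TTTTTTTTTT"]
print([percent_identity(region, s) for s in strains])
print(is_highly_conserved(region, strains))
```

Other identity and coverage thresholds named in the definition can be explored by changing the two keyword arguments.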
[0024] As used herein, "broad range survey primers" are intelligent
primers designed to identify an unknown bioagent at the genus
level. In some cases, broad range survey primers are able to
identify unknown bioagents at the species or sub-species level. As
used herein, "division-wide primers" are intelligent primers
designed to identify a bioagent at the species level and
"drill-down" primers are intelligent primers designed to identify a
bioagent at the sub-species level. As used herein, the
"sub-species" level of identification includes, but is not limited
to, strains, subtypes, variants, and isolates.
[0025] As used herein, a "bioagent division" is defined as a group of
bioagents above the species level and includes but is not limited
to, orders, families, classes, clades, genera or other such
groupings of bioagents above the species level.
[0026] As used herein, a "sub-species characteristic" is a genetic
characteristic that provides the means to distinguish two members
of the same bioagent species. For example, one viral strain could
be distinguished from another viral strain of the same species by
possessing a genetic change (for example, a nucleotide
deletion, addition or substitution) in one of the viral genes, such
as the RNA-dependent RNA polymerase. In this case, the sub-species
characteristic that can be identified using the methods of the
present invention is the genetic change in the viral
polymerase.
[0027] As used herein, the term "bioagent identifying amplicon"
refers to a polynucleotide that is amplified from a bioagent in an
amplification reaction and 1) whose sequence ideally provides base
composition variability to distinguish among individual bioagents
and 2) whose molecular mass is amenable to molecular mass
determination.
[0028] As used herein, a "base composition" is the exact number of
each nucleobase (A, T, C and G) in a given sequence. As used
herein, a "base composition signature" (BCS) is the exact base
composition (i.e., the number of A, T, G and C nucleobases)
determined from the molecular mass of a bioagent identifying
amplicon.
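As an illustration of the base composition definition, the following minimal sketch counts each nucleobase in a known sequence. The function names and the example sequences are hypothetical; in the methods described here the composition is derived from a measured molecular mass rather than from a known sequence.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the exact count of A, G, C and T in a DNA sequence."""
    counts = Counter(seq.upper())
    unexpected = set(counts) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return {base: counts.get(base, 0) for base in "AGCT"}


def format_composition(comp: dict) -> str:
    """Render a composition in a compact 'A# G# C# T#' form."""
    return " ".join(f"{base}{comp[base]}" for base in "AGCT")


# Made-up sequence for illustration only.
print(format_composition(base_composition("ATGGCGTACCTTAG")))

# Two different sequences can share one base composition, which is why
# a composition is less specific than a full sequence.
print(base_composition("ACGT") == base_composition("TGCA"))
```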
[0029] As used herein, a "base composition probability cloud" is a
representation of the diversity in base composition resulting from
a variation in sequence that occurs among different isolates of a
given species. The "base composition probability cloud" represents
the base composition constraints for each species and is typically
visualized using a pseudo four-dimensional plot.
[0030] As used herein, a "wobble base" is a variation in a codon
found at the third nucleotide position of a DNA triplet. Variations
in conserved regions of sequence are often found at the third
nucleotide position due to redundancy in the amino acid code.
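The wobble position can be illustrated with a short sketch. The codon table below is only a small subset of the standard genetic code, and the function names and sequences are hypothetical examples.

```python
# Subset of the standard genetic code; only the codons used in this example.
CODON_TABLE_SUBSET = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
}


def translate(seq: str) -> list:
    """Translate a coding sequence codon by codon using the subset table."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODON_TABLE_SUBSET[seq[i:i + 3]] for i in range(0, len(seq), 3)]


def differs_only_at_wobble(a: str, b: str) -> bool:
    """True if two in-frame sequences differ only at third codon positions."""
    if len(a) != len(b) or len(a) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    return all(x == y or i % 3 == 2
               for i, (x, y) in enumerate(zip(a.upper(), b.upper())))


seq1 = "GGTGCTCTT"  # Gly-Ala-Leu
seq2 = "GGCGCACTG"  # same amino acids, different third bases
print(translate(seq1) == translate(seq2))
print(differs_only_at_wobble(seq1, seq2))
```

Both sequences encode the same three amino acids even though every third nucleotide differs, which is the kind of variation the definition refers to.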
[0031] In the context of the present invention, the term "unknown
bioagent" may mean either: (i) a bioagent whose existence is known
(such as the well known bacterial species Staphylococcus aureus for
example) but which is not known to be in a sample to be analyzed,
or (ii) a bioagent whose existence is not known (for example, the
SARS coronavirus was unknown prior to April 2003). For example, if
the method for identification of coronaviruses disclosed in
commonly owned U.S. patent Ser. No. 10/829,826 (incorporated herein
by reference in its entirety) was to be employed prior to April
2003 to identify the SARS coronavirus in a clinical sample, both
meanings of "unknown" bioagent are applicable since the SARS
coronavirus was unknown to science prior to April, 2003 and since
it was not known what bioagent (in this case a coronavirus) was
present in the sample. On the other hand, if the method of U.S.
patent Ser. No. 10/829,826 was to be employed subsequent to April
2003 to identify the SARS coronavirus in a clinical sample, only
the first meaning (i) of "unknown" bioagent would apply since the
SARS coronavirus became known to science subsequent to April 2003
and since it was not known what bioagent was present in the
sample.
[0032] As used herein, "triangulation identification" means the
employment of more than one bioagent identifying amplicon for
identification of a bioagent.
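One way to picture triangulation is as an intersection of the candidate lists produced by each amplicon. The sketch below assumes that simple set-intersection model; the primer pair labels and candidate lists are hypothetical data, not results from this application.

```python
def triangulate(candidates_per_amplicon: list) -> set:
    """Return the bioagents consistent with every amplicon's candidate list."""
    sets = [set(c) for c in candidates_per_amplicon]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical matches from two hypothetical primer pairs.
pair_a_matches = ["Cowpox virus", "Monkeypox virus"]
pair_b_matches = ["Monkeypox virus", "Camelpox virus"]

print(triangulate([pair_a_matches, pair_b_matches]))
```

Neither amplicon alone resolves the identity in this toy case, but the two together leave a single candidate.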
[0033] In the context of the present invention, "viral nucleic
acid" includes, but is not limited to, DNA, RNA, or DNA that has
been obtained from viral RNA, such as, for example, by performing a
reverse transcription reaction. Viral RNA can either be
single-stranded (of positive or negative polarity) or
double-stranded.
[0034] As used herein, the term "etiology" refers to the causes or
origins of diseases or abnormal physiological conditions.
[0035] As used herein, the term "nucleobase" is synonymous with
other terms in use in the art including "nucleotide,"
"deoxynucleotide," "nucleotide residue," "deoxynucleotide residue,"
"nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate
(dNTP)."
[0036] Despite enormous biological diversity, all forms of life on
earth share sets of essential, common features in their genomes.
Since genetic data provide the underlying basis for identification
of orthopoxvirus by the methods of the present invention, it is
desirable to select segments of nucleic acids which ideally provide
enough variability to distinguish each individual bioagent and
whose molecular mass is amenable to molecular mass
determination.
[0037] Unlike bacterial genomes, which exhibit conservation of
numerous genes (i.e. housekeeping genes) across all organisms,
viruses do not share a gene that is essential and conserved among
all virus families. Therefore, viral identification is achieved
within smaller groups of related viruses, such as members of a
particular virus family or genus. For example, RNA-dependent RNA
polymerase is present in all single-stranded RNA viruses and can be
used for broad priming as well as resolution within the virus
family.
[0038] Disclosed in U.S. Patent Application Publication Nos.
2003-0027135, 2003-0082539, 2003-0228571, 2004-0209260,
2004-0219517, and 2004-0180328, and in U.S. application Ser. Nos.
10/660,997, 10/728,486, 10/754,415, and 10/829,826, all of which
are commonly owned and incorporated herein by reference in their
entirety, are methods for identification of bioagents (any
organism, cell, or virus, living or dead, or a nucleic acid derived
from such an organism, cell or virus) in an unbiased manner by
molecular mass and base composition analysis of "bioagent
identifying amplicons" which are obtained by amplification of
segments of essential and conserved genes which are involved in,
for example, translation, replication, recombination and repair,
transcription, nucleotide metabolism, amino acid metabolism, lipid
metabolism, energy generation, uptake, secretion and the like.
Examples of these proteins include, but are not limited to,
ribosomal RNAs, ribosomal proteins, DNA and RNA polymerases,
RNA-dependent RNA polymerases, RNA capping and methylation enzymes,
elongation factors, tRNA synthetases, protein chain initiation
factors, heat shock protein groEL, phosphoglycerate kinase, NADH
dehydrogenase, DNA ligases, DNA gyrases and DNA topoisomerases,
helicases, metabolic enzymes, and the like.
[0039] To obtain bioagent identifying amplicons, primers are
selected to hybridize to conserved sequence regions which bracket
or flank variable sequence regions to yield a segment of nucleic
acid which can be amplified and which is amenable to methods of
molecular mass analysis. The variable sequence regions provide the
variability of molecular mass which is used for bioagent
identification. Upon amplification by PCR or other amplification
methods with the specifically chosen primers, an amplification
product that represents a bioagent identifying amplicon is
obtained. The molecular mass of the amplification product, obtained
by mass spectrometry for example, provides the means to uniquely
identify the bioagent without a requirement for prior knowledge of
the possible identity of the bioagent. The molecular mass of the
amplification product or the corresponding base composition (which
can be calculated from the molecular mass of the amplification
product) is compared with a database of molecular masses or base
compositions and a match indicates the identity of the bioagent.
Furthermore, the method can be applied to rapid parallel analyses
(for example, in a multi-well plate format) the results of which
can be employed in a triangulation identification strategy which is
amenable to rapid throughput and does not require nucleic acid
sequencing of the amplified target sequence for bioagent
identification.
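The comparison of a measured molecular mass with a database, and the calculation of a base composition from a mass, can be sketched as follows. This is a simplified illustration under stated assumptions: the residue masses are approximate average values for unmodified DNA, terminal groups, adducts and charge states are ignored, only one strand is considered, the strand length is assumed to be known, and the database entries are hypothetical.

```python
# Approximate average masses (Da) of DNA nucleotide residues; assumed values
# for illustration, ignoring terminal groups and adducts.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}


def strand_mass(comp: dict) -> float:
    """Approximate mass of one strand with the given base composition."""
    return sum(RESIDUE_MASS[base] * n for base, n in comp.items())


def compositions_for_mass(mass: float, length: int, tol: float = 0.5) -> list:
    """Enumerate compositions of a given length whose mass is within tol Da."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                comp = {"A": a, "G": g, "C": c, "T": length - a - g - c}
                if abs(strand_mass(comp) - mass) <= tol:
                    hits.append(comp)
    return hits


def match_database(mass: float, database: dict, tol: float = 1.0) -> list:
    """Return names of database entries whose expected mass is within tol Da."""
    return [name for name, comp in database.items()
            if abs(strand_mass(comp) - mass) <= tol]


# Hypothetical database of amplicon base compositions.
database = {
    "virus X (hypothetical)": {"A": 15, "G": 10, "C": 12, "T": 13},
    "virus Y (hypothetical)": {"A": 14, "G": 11, "C": 12, "T": 13},
}
measured = strand_mass(database["virus X (hypothetical)"])  # simulated value
print(match_database(measured, database))
print(len(compositions_for_mass(measured, 50, tol=0.05)))
```

More than one composition can fall within a given mass tolerance, so a practical implementation would need additional constraints that this sketch does not model.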
[0040] The result of determination of a previously unknown base
composition of a previously unknown bioagent (for example, a newly
evolved and heretofore unobserved virus) has downstream utility by
providing new bioagent indexing information with which to populate
base composition databases. The process of subsequent bioagent
identification analyses is, thus, greatly improved as more base
composition data for bioagent identifying amplicons becomes
available.
[0041] In some embodiments of the present invention, at least one
viral nucleic acid segment is amplified in the process of
identifying the viral bioagent. Thus, the nucleic acid segments
that can be amplified by the primers disclosed herein and that
provide enough variability to distinguish each individual bioagent
and whose molecular masses are amenable to molecular mass
determination are herein described as viral bioagent identifying
amplicons.
[0042] In some embodiments of the present invention, viral bioagent
identifying amplicons comprise from about 45 to about 200
nucleobases (i.e. from about 45 to about 200 linked nucleosides; or
up to about 200 nucleobases). One of ordinary skill in the art will
appreciate that the invention embodies viral bioagent identifying
amplicons of 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57,
58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91,
92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106,
107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119,
120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132,
133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145,
146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158,
159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171,
172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184,
185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197,
198, 199, and 200 nucleobases in length, or any range
therewithin.
[0043] It is the combination of the portions of the viral bioagent
nucleic acid segment to which the primers hybridize (hybridization
sites) and the variable region between the primer hybridization
sites that comprises the viral bioagent identifying amplicon.
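The structure described above, two primer hybridization sites plus the variable region between them, can be sketched as an in silico extraction. The sketch assumes exact primer matches on a single template strand; the function names and sequences are hypothetical, and real primers tolerate mismatches that this code does not model.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def extract_amplicon(template: str, forward: str, reverse: str):
    """Return the amplicon spanning both primer sites, or None if either
    primer site is not found (exact matching only)."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Made-up template: flank + forward site + variable region + reverse site + flank.
template = "TTTT" + "ACGTAC" + "GGATCCAA" + "TTGACC" + "AAAA"
forward_primer = "ACGTAC"
reverse_primer = "GGTCAA"  # reverse complement of the site TTGACC

print(extract_amplicon(template, forward_primer, reverse_primer))
```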
[0044] In some embodiments, viral bioagent identifying amplicons
amenable to molecular mass determination which are produced by the
primers described herein are either of a length, size or mass
compatible with the particular mode of molecular mass determination
or compatible with a means of providing a predictable fragmentation
pattern in order to obtain predictable fragments of a length
compatible with the particular mode of molecular mass
determination. Such means of providing a predictable fragmentation
pattern of an amplification product include, but are not limited
to, cleavage with restriction enzymes or cleavage primers, for
example. Thus, in some embodiments, viral bioagent identifying
amplicons are larger than 200 nucleobases and are amenable to
molecular mass determination following restriction digestion.
Methods of using restriction enzymes and cleavage primers are well
known to those with ordinary skill in the art.
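A predictable fragmentation pattern from restriction digestion can be illustrated with a minimal sketch. It models only the top strand and uses the EcoRI recognition site (GAATTC, cut after the first G) as an example; the function name and the input sequence are hypothetical, and the application does not state which enzymes would be used.

```python
def digest(seq: str, site: str, cut_offset: int) -> list:
    """Cut `seq` at every occurrence of `site`, `cut_offset` bases into the
    site, and return the resulting top-strand fragments in order."""
    seq = seq.upper()
    site = site.upper()
    fragments = []
    prev = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[prev:cut])
        prev = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[prev:])
    return fragments


# Made-up sequence containing two EcoRI sites (G^AATTC).
sequence = "AAAGAATTCTTTGAATTCGGG"
fragments = digest(sequence, "GAATTC", cut_offset=1)
print(fragments)
print([len(f) for f in fragments])
```

Each fragment length is fixed by the positions of the recognition sites, so a long amplicon yields shorter pieces of predictable size.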
[0045] In some embodiments, amplification products corresponding to
viral bioagent identifying amplicons are obtained using the
polymerase chain reaction (PCR) which is a routine method to those
with ordinary skill in the molecular biology arts. Other
amplification methods may be used such as ligase chain reaction
(LCR), low-stringency single primer PCR, and multiple strand
displacement amplification (MDA). These methods are also well known
to those with ordinary skill.
[0046] Intelligent primers are designed to bind to highly conserved
sequence regions that flank an intervening variable region and
yield viral bioagent identifying amplicons upon amplification,
which ideally provide enough variability to distinguish each
individual viral bioagent, and which are amenable to molecular mass
analysis. In some embodiments, the highly conserved sequence
regions exhibit between about 80-100%, or between about 90-100%, or
between about 95-100% identity, or between about 99-100% identity.
The molecular mass of a given amplification product provides a
means of identifying the viral bioagent from which it was obtained,
due to the variability of the variable region. Thus, design of
intelligent primers requires selection of a variable region with
appropriate variability to resolve the identity of a given
bioagent. Viral bioagent identifying amplicons are ideally specific
to the identity of the viral bioagent; however, this is not an
absolute requirement because multiple viral bioagent identifying
amplicons can be used in a triangulation strategy (vide infra).
[0047] Identification of viral bioagents can be accomplished at
different taxonomic levels using intelligent primers suited to
resolution of each individual level of identification. Broad range
survey intelligent primers are designed with the objective of
identifying a bioagent as a member of a particular division (e.g.,
an order, family, genus or other such grouping of viral bioagents
above the species level). As a non-limiting example, members of the
Orthopoxvirus genus may be identified as such by employing broad
range survey intelligent primers such as primers which target RNA
or DNA polymerases, helicases, or other viral genes. In some
embodiments, broad range survey intelligent primers are capable of
identification of bioagents at the species, sub-species or strain
level.
[0048] Division-wide intelligent primers are designed with an
objective of identifying a bioagent at the species level.
Division-wide intelligent primers are not always required for
identification at the species level because broad range survey
intelligent primers may provide sufficient identification
resolution to accomplish this identification objective.
[0049] Drill-down intelligent primers are designed with the
objective of identifying a bioagent at the sub-species level
(including strains, subtypes, variants and isolates) based on
sub-species characteristics. Drill-down intelligent primers are not
always required for identification at the sub-species level because
broad range survey intelligent primers may provide sufficient
identification resolution to accomplish this identification
objective.
[0050] A representative process flow diagram for the primer
selection and validation process is outlined in FIG. 1. For each
group of organisms, candidate target sequences are identified (200)
from which nucleotide alignments are created (210) and analyzed
(220). Primers are then designed by selecting appropriate priming
regions (230) which then enables the selection of candidate primer
pairs (240). The primer pairs are then subjected to in silico
analysis by electronic PCR (ePCR) (300) wherein bioagent
identifying amplicons are obtained from sequence databases such as
GenBank or other sequence collections (310) and checked for
specificity in silico (320). Bioagent identifying amplicons
obtained from GenBank sequences (310) can also be analyzed by a
probability model which predicts the capability of a given amplicon
to identify unknown bioagents such that the base compositions of
amplicons with favorable probability scores are then stored in a
base composition database (325). Alternatively, base compositions
of the bioagent identifying amplicons obtained from the primers and
GenBank sequences can be directly entered into the base composition
database (330). Candidate primer pairs (240) are validated by in
vitro amplification by a method such as PCR analysis (400) of
nucleic acid from a collection of organisms (410). Amplification
products thus obtained are analyzed to confirm the sensitivity,
specificity and reproducibility of the primers used to obtain the
amplification products (420).
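
The in silico portion of this workflow (300-320) can be illustrated with a minimal Python sketch. It assumes exact primer matching on a single strand, which real electronic PCR tools relax, and the function names, primers and template shown are hypothetical illustrations rather than sequences from Table 1.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def in_silico_amplicon(template: str, forward: str, reverse: str):
    """Return the predicted amplicon (primers included), or None.

    Simplification: both primers must match the template exactly.
    """
    start = template.find(forward)
    if start < 0:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is what appears on the template strand.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]


def base_composition(seq: str) -> dict:
    """Count each nucleobase (A, C, G, T) in a sequence."""
    counts = Counter(seq)
    return {base: counts.get(base, 0) for base in "ACGT"}


# Hypothetical primers and template, for illustration only.
forward = "ATCCAAG"
reverse = "CCGGGAAATC"
template = "GG" + forward + "TTTGACCGTTAACG" + reverse_complement(reverse) + "GAATT"

amplicon = in_silico_amplicon(template, forward, reverse)
print(amplicon, base_composition(amplicon))
```

In a workflow of this kind, the resulting base composition is the value that would be entered into the base composition database (325 or 330).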
[0051] Many of the important pathogens, including the organisms of
greatest concern as biological weapons agents, have been completely
sequenced. This effort has greatly facilitated the design of
primers and probes for the detection of individual bioagents. Thus,
the combination of broad-range priming with division-wide and
drill-down priming described herein is being used very successfully
in several applications of the technology, including environmental
surveillance for biowarfare threat agents and clinical sample
analysis for medically important pathogens.
[0052] Synthesis of primers is well known and routine in the art.
The primers may be conveniently and routinely made through the
well-known technique of solid phase synthesis. Equipment for such
synthesis is sold by several vendors including, for example,
Applied Biosystems (Foster City, Calif.). Any other means for such
synthesis known in the art may additionally or alternatively be
employed.
[0053] The primers are employed as, for example, compositions for
use in methods for identification of viral bioagents as follows: a
primer pair composition is contacted with nucleic acid (such as,
for example, DNA from a DNA virus, or DNA reverse transcribed from
the RNA of an RNA virus) of an unknown viral bioagent. The nucleic
acid is then amplified by a nucleic acid amplification technique,
such as PCR for example, to obtain an amplification product that
represents a viral bioagent identifying amplicon. The molecular
mass of each strand of the double-stranded amplification product is
determined by a molecular mass measurement technique such as, for
example, mass spectrometry wherein the two strands of the
double-stranded amplification product are separated during the
ionization process. In some embodiments, the mass spectrometry is
electrospray Fourier transform ion cyclotron resonance mass
spectrometry (ESI-FTICR-MS) or electrospray time of flight mass
spectrometry (ESI-TOF-MS). A list of possible base compositions can
be generated for the molecular mass value obtained for each strand
and the choice of the correct base composition from the list is
facilitated by matching the base composition of one strand with a
complementary base composition of the other strand. The molecular
mass or base composition thus determined is then compared with a
database of molecular masses or base compositions of analogous
bioagent identifying amplicons for known viral bioagents. A match
between the molecular mass or base composition of the amplification
product and the molecular mass or base composition of an analogous
bioagent identifying amplicon for a known viral bioagent indicates
the presence and/or identity of the unknown bioagent. In some
embodiments, the primer pair used is one of the primer pairs of
Table 1. In some embodiments, the method is repeated using a
different primer pair to resolve possible ambiguities in the
identification process or to improve the confidence level for the
identification assignment.
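
The step of listing possible base compositions for each strand and keeping only those with a complementary partner can be sketched as follows. This is a minimal Python illustration, not the method actually used: it assumes approximate average residue masses, ignores terminal groups, and uses a hypothetical mass tolerance; all function names are illustrative.

```python
# Approximate average masses (Da) of DNA residues within a chain.
# Terminal groups are ignored for simplicity (an assumption).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
BASES = "ACGT"


def strand_mass(comp):
    """Mass of a strand given its (A, C, G, T) counts."""
    return sum(RESIDUE_MASS[b] * n for b, n in zip(BASES, comp))


def candidate_compositions(mass, tol):
    """List every (A, C, G, T) whose computed mass is within tol of mass."""
    m = RESIDUE_MASS
    limit = mass + tol
    hits = []
    for a in range(int(limit // m["A"]) + 1):
        for c in range(int((limit - a * m["A"]) // m["C"]) + 1):
            partial = a * m["A"] + c * m["C"]
            for g in range(int((limit - partial) // m["G"]) + 1):
                rest = mass - partial - g * m["G"]
                # Pick the T count that best fills the remaining mass.
                t = max(0, round(rest / m["T"]))
                if abs(rest - t * m["T"]) <= tol:
                    hits.append((a, c, g, t))
    return hits


def complement(comp):
    """Base composition of the complementary strand."""
    a, c, g, t = comp
    return (t, g, c, a)


def consistent_pairs(fwd_mass, rev_mass, tol):
    """Keep forward-strand candidates whose complement fits the reverse strand."""
    rev = set(candidate_compositions(rev_mass, tol))
    return [f for f in candidate_compositions(fwd_mass, tol) if complement(f) in rev]


# Hypothetical amplicon strand with A7 C8 G7 T9.
true_comp = (7, 8, 7, 9)
fwd_mass = strand_mass(true_comp)
rev_mass = strand_mass(complement(true_comp))
single = candidate_compositions(fwd_mass, tol=1.0)
paired = consistent_pairs(fwd_mass, rev_mass, tol=1.0)
print(len(single), len(paired))
```

The paired list is never longer than the single-strand list, which is how the complementarity requirement narrows the choice of base composition.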
[0054] In some embodiments, a viral bioagent identifying amplicon
may be produced using only a single primer (either the forward or
reverse primer of any given primer pair), provided an appropriate
amplification method is chosen, such as, for example, low
stringency single primer PCR (LSSP-PCR). Adaptation of this
amplification method in order to produce viral bioagent identifying
amplicons can be accomplished by one with ordinary skill in the art
without undue experimentation.
[0055] In some embodiments, the oligonucleotide primers are broad
range survey primers which hybridize to conserved regions of
nucleic acid encoding DNA polymerase, RNA polymerase, DNA helicase,
RNA helicase, or thioredoxin-like gene of all (or between 80% and
100%, between 85% and 100%, between 90% and 100%, or between 95%
and 100%) known orthopoxviruses and produce orthopoxvirus
identifying amplicons. As used herein, the phrase "broad range
survey primers" refers to primers that bind to nucleic acid
encoding genes essential to orthopoxvirus replication (e.g., DNA
and RNA polymerases, DNA and RNA helicases and the
thioredoxin-like gene) of all (or between 80% and 100%, between 85%
and 100%, between 90% and 100%, or between 95% and 100%) known
species of orthopoxviruses. In some embodiments, the primer pairs
comprise oligonucleotides ranging in length from 13 to 35
nucleobases, each of which has from 70% to 100% sequence identity
with any of the primers shown in Table 1.
[0056] In some cases, the molecular mass or base composition of a
viral bioagent identifying amplicon defined by a broad range survey
primer pair does not provide enough resolution to unambiguously
identify a viral bioagent at the species level. These cases benefit
from further analysis of one or more viral bioagent identifying
amplicons generated from at least one additional broad range survey
primer pair or from at least one additional division-wide primer
pair. The employment of more than one bioagent identifying amplicon
for identification of a bioagent is herein referred to as
"triangulation identification."
[0057] In other embodiments, the oligonucleotide primers are
division-wide primers which hybridize to nucleic acid encoding
genes of species within a genus of viruses. In other embodiments,
the oligonucleotide primers are drill-down primers which enable the
identification of sub-species characteristics. Drill down primers
provide the functionality of producing bioagent identifying
amplicons for drill-down analyses such as genotyping or strain
typing when contacted with nucleic acid under amplification
conditions. Identification of such sub-species characteristics is
often critical for determining proper clinical treatment of viral
infections. In some embodiments, sub-species characteristics are
identified using only broad range survey primers, and division-wide
and drill-down primers are not used.
[0058] In some embodiments, the primers used for amplification
hybridize to and amplify genomic DNA, DNA of bacterial plasmids,
DNA of DNA viruses or DNA reverse transcribed from RNA of an RNA
virus.
[0059] In some embodiments, the primers used for amplification
hybridize directly to viral RNA and act as reverse transcription
primers for obtaining DNA from direct amplification of viral RNA.
Methods of amplifying RNA using reverse transcriptase are well
known to those with ordinary skill in the art and can be routinely
established without undue experimentation.
[0060] One with ordinary skill in the art of design of
amplification primers will recognize that a given primer need not
hybridize with 100% complementarity in order to effectively prime
the synthesis of a complementary nucleic acid strand in an
amplification reaction. Moreover, a primer may hybridize over one
or more segments such that intervening or adjacent segments are not
involved in the hybridization event (e.g., a loop
structure or a hairpin structure). The primers of the present
invention may comprise at least 70%, at least 75%, at least 80%, at
least 85%, at least 90%, at least 95% or at least 99% sequence
identity with any of the primers listed in Table 1. Thus, in some
embodiments of the present invention, an extent of variation of 70%
to 100%, or any range therewithin, of the sequence identity is
possible relative to the specific primer sequences disclosed
herein. Determination of sequence identity is described in the
following example: a primer 20 nucleobases in length which differs
in contiguous nucleobases from another 20 nucleobase primer by only
two residues has 18 of 20 identical residues (18/20=0.9 or 90%
sequence identity). In another example, a primer 15 nucleobases in
length having all residues identical to a 15 nucleobase segment of
another primer that is 20 nucleobases in length would have
15/20=0.75 or 75% sequence identity with the 20 nucleobase primer.
In yet another example, a first primer, 35 nucleobases in length
having a 20 nucleobase segment which is identical to the entire
sequence of a second primer of a length of 20 nucleobases has 100%
sequence identity with the second primer.
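
The three examples above can be reproduced with a short Python sketch. It assumes, consistent with those examples, that identity is the largest number of matching residues in any ungapped alignment divided by the length of the reference primer; the function name and sequences are hypothetical.

```python
def percent_identity(primer: str, reference: str) -> float:
    """Best ungapped match count, as a percentage of the reference length."""
    best = 0
    # Slide the primer across the reference at every possible offset.
    for offset in range(-len(primer) + 1, len(reference)):
        matches = sum(
            1
            for i, base in enumerate(primer)
            if 0 <= i + offset < len(reference) and reference[i + offset] == base
        )
        best = max(best, matches)
    return 100.0 * best / len(reference)


# Hypothetical 20 nucleobase reference primer.
reference = "ATGCGTACCTGAAGTCCGAT"

# A 20-mer differing at two residues.
two_changes = reference[:4] + "A" + reference[5:10] + "C" + reference[11:]
# A 15-mer identical to a segment of the reference.
segment = reference[3:18]
# A 35-mer containing the entire reference.
long_primer = "GGCCTTA" + reference + "TTAACCGG"

print(percent_identity(two_changes, reference))  # 90.0
print(percent_identity(segment, reference))      # 75.0
print(percent_identity(long_primer, reference))  # 100.0
```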
[0061] Percent homology, sequence identity or complementarity, can
be determined by, for example, the Gap program (Wisconsin Sequence
Analysis Package, Version 8 for UNIX, Genetics Computer Group,
University Research Park, Madison Wis.), using default settings,
which uses the algorithm of Smith and Waterman (Adv. Appl. Math.,
1981, 2, 482-489). In some embodiments, complementarity of primers
with respect to the conserved priming regions of viral nucleic acid
is between about 70% and 100%. In other embodiments, homology,
sequence identity or complementarity, is between about 80% and
100%. In yet other embodiments, homology, sequence identity or
complementarity, is at least 90%, at least 92%, at least 94%, at
least 95%, at least 96%, at least 97%, at least 98%, at least 99%
or is 100%.
[0062] In some embodiments, the primers described herein comprise
at least 70%, at least 75%, at least 80%, at least 85%, at least
90%, at least 92%, at least 94%, at least 95%, at least 96%, at
least 98%, or at least 99%, or 100% (or any range therewithin)
sequence identity with the primer sequences specifically disclosed
herein. Thus, for example, a primer may have between 70% and 100%,
between 75% and 100%, between 80% and 100%, or between 95% and
100% sequence identity with SEQ ID NO: 1. Likewise, a primer may
have similar sequence identity with any other primer whose
nucleotide sequence is disclosed in Table 1.
[0063] One with ordinary skill is able to calculate percent
sequence identity or percent sequence homology and able to
determine, without undue experimentation, the effects of variation
of primer sequence identity on the function of the primer in its
role in priming synthesis of a complementary strand of nucleic acid
for production of an amplification product of a corresponding viral
bioagent identifying amplicon.
[0064] In some embodiments of the present invention, the
oligonucleotide primers are 13 to 35 nucleobases in length (13 to
35 linked nucleotide residues; or up to 35 nucleotide residues).
These embodiments comprise oligonucleotide primers 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34 or 35 nucleobases in length, or any range therewithin.
[0065] In some embodiments, any given primer can comprise a
modification comprising the addition of a non-templated T residue
to the 5' end of the primer (i.e., the added T residue does not
necessarily hybridize to the nucleic acid being amplified). The
addition of a non-templated T residue has an effect of minimizing
the addition of non-templated adenyl residues as a result of the
non-specific enzyme activity of Taq polymerase (Magnuson et al.,
Biotechniques, 1996, 21, 700-709), an occurrence which may lead to
ambiguous results arising from molecular mass analysis.
[0066] In some embodiments of the present invention, primers may
contain one or more universal bases. Because any variation (due to
codon wobble in the 3rd position) in the conserved regions
among species is likely to occur in the third position of a DNA (or
RNA) triplet, oligonucleotide primers can be designed such that the
nucleotide corresponding to this position is a base which can bind
to more than one nucleotide, referred to herein as a "universal
nucleobase." For example, under this "wobble" pairing, inosine (I)
binds to U, C or A; guanine (G) binds to U or C, and uridine (U)
binds to A or G. Other examples of universal nucleobases include,
but are not limited to, nitroindoles such as 5-nitroindole or
3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995,
14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al.,
Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 4258-4263), an acyclic
nucleoside analog containing 5-nitroindazole (Van Aerschot et al.,
Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine
analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide
(Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).
[0067] In some embodiments, to compensate for the somewhat weaker
binding by the wobble base, the oligonucleotide primers are
designed such that the first and second positions of each triplet
are occupied by nucleotide analogs which bind with greater affinity
than the unmodified nucleotide. Examples of these analogs include,
but are not limited to, 2,6-diaminopurine which binds to thymine,
5-propynyluracil which binds to adenine and 5-propynylcytosine and
phenoxazines, including G-clamp, which binds to G. Propynylated
pyrimidines are described in U.S. Pat. Nos. 5,645,985, 5,830,653
and 5,484,908, each of which is commonly owned and incorporated
herein by reference in its entirety. Propynylated primers are
described in U.S. Patent Application Publication No. 2003-0170682,
which is also commonly owned and incorporated herein by reference
in its entirety. Phenoxazines are described in U.S. Pat. Nos.
5,502,177, 5,763,588, and 6,005,096, each of which is incorporated
herein by reference in its entirety. G-clamps are described in U.S.
Pat. Nos. 6,007,992 and 6,028,183, each of which is incorporated
herein by reference in its entirety.
[0068] In some embodiments, to enable broad priming of rapidly
evolving RNA viruses, primer hybridization is enhanced using
primers containing 5-propynyl deoxycytidine and deoxythymidine
nucleotides. These modified primers offer increased affinity and
base pairing selectivity.
[0069] In some embodiments, non-template primer tags are used to
increase the melting temperature (Tm) of a primer-template
duplex in order to improve amplification efficiency. A non-template
tag is at least three consecutive A or T nucleotide residues on a
primer which are not complementary to the template. In any given
non-template tag, A can be replaced by C or G and T can also be
replaced by C or G. Although Watson-Crick hybridization is not
expected to occur for a non-template tag relative to the template,
the extra hydrogen bond in a G-C pair relative to an A-T pair
confers increased stability of the primer-template duplex and
improves amplification efficiency for subsequent cycles of
amplification when the primers hybridize to strands synthesized in
previous cycles.
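
The effect of replacing A or T residues in a tag with C or G can be illustrated with the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C) degrees C, which is only a rough estimate for short oligonucleotides and is not a method stated in this disclosure. The sketch below is a minimal Python illustration; the core primer and tags are hypothetical, and the estimate applies to later cycles in which the tag is matched by previously synthesized strands.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature (degrees C) by the Wallace rule."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


core = "ACGTTGCAAGCT"  # hypothetical template-matching portion
print(wallace_tm("TTT" + core))  # 42, A/T tag
print(wallace_tm("GCG" + core))  # 48, tag with C and G substituted
```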
[0070] In other embodiments, propynylated tags may be used in a
manner similar to that of the non-template tag, wherein two or more
5-propynylcytidine or 5-propynyluridine residues replace template
matching residues on a primer. In other embodiments, a primer
contains a modified internucleoside linkage such as a
phosphorothioate linkage, for example.
[0071] In some embodiments, the primers contain mass-modifying
tags. Reducing the total number of possible base compositions of a
nucleic acid of specific molecular weight provides a means of
avoiding a persistent source of ambiguity in determination of base
composition of amplification products. Addition of mass-modifying
tags to certain nucleobases of a given primer will result in
simplification of de novo determination of base composition of a
given bioagent identifying amplicon from its molecular mass.
[0072] In some embodiments of the present invention, the mass
modified nucleobase comprises one or more of the following: for
example, 7-deaza-2'-deoxyadenosine-5'-triphosphate,
5-iodo-2'-deoxyuridine-5'-triphosphate,
5-bromo-2'-deoxyuridine-5'-triphosphate,
5-bromo-2'-deoxycytidine-5'-triphosphate,
5-iodo-2'-deoxycytidine-5'-triphosphate,
5-hydroxy-2'-deoxyuridine-5'-triphosphate,
4-thiothymidine-5'-triphosphate,
5-aza-2'-deoxyuridine-5'-triphosphate,
5-fluoro-2'-deoxyuridine-5'-triphosphate,
O6-methyl-2'-deoxyguanosine-5'-triphosphate,
N2-methyl-2'-deoxyguanosine-5'-triphosphate,
8-oxo-2'-deoxyguanosine-5'-triphosphate, or
thiothymidine-5'-triphosphate. In some embodiments, the
mass-modified nucleobase comprises ¹⁵N or ¹³C or both
¹⁵N and ¹³C.
[0073] In some cases, a molecular mass of a given bioagent
identifying amplicon alone does not provide enough resolution to
unambiguously identify a given bioagent. The employment of more
than one viral bioagent identifying amplicon for identification of
a bioagent is herein referred to as triangulation identification.
Triangulation identification is pursued by analyzing a plurality of
bioagent identifying amplicons selected within multiple genes. This
process is used to reduce false negative and false positive
signals, and enable reconstruction of the origin of hybrid or
otherwise engineered bioagents. For example, identification of the
three part toxin genes typical of B. anthracis (Bowen et al., J.
Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected
signatures from a representative orthopoxvirus genome would suggest
a genetic engineering event.
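
The narrowing effect of triangulation can be sketched in Python as an intersection of the candidate sets returned by each primer pair. This is a simplified illustration only; the primer pair labels and candidate lists are hypothetical and do not represent measured data.

```python
def triangulate(matches_by_primer_pair: dict) -> set:
    """Return the bioagents consistent with every primer pair's result."""
    candidate_sets = [set(m) for m in matches_by_primer_pair.values()]
    return set.intersection(*candidate_sets) if candidate_sets else set()


# Hypothetical database matches for three bioagent identifying amplicons.
matches = {
    "pair_A": {"Variola virus", "Camelpox virus", "Vaccinia virus"},
    "pair_B": {"Variola virus", "Monkeypox virus"},
    "pair_C": {"Variola virus", "Camelpox virus"},
}
print(triangulate(matches))  # {'Variola virus'}
```

An empty result would indicate that no single known bioagent accounts for all of the amplicons, which is the kind of inconsistency that could prompt further investigation.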
[0074] In some embodiments, the triangulation identification
process can be pursued by characterization of bioagent identifying
amplicons in a massively parallel fashion using the polymerase
chain reaction (PCR), such as multiplex PCR where multiple primers
are employed in the same amplification reaction mixture, or PCR in
multi-well plate format wherein a different and unique pair of
primers is used in multiple wells containing otherwise identical
reaction mixtures. Such multiplex and multi-well PCR methods are
well known to those with ordinary skill in the arts of rapid
throughput amplification of nucleic acids.
[0075] In some embodiments, the molecular mass of a given viral
bioagent identifying amplicon is determined by mass spectrometry.
Mass spectrometry has several advantages, not the least of which is
high bandwidth characterized by the ability to separate (and
isolate) many molecular peaks across a broad range of mass to
charge ratio (m/z). Thus mass spectrometry is intrinsically a
parallel detection scheme without the need for radioactive or
fluorescent labels or probes, since every amplification product is
identified by its molecular mass. The current state of the art in
mass spectrometry is such that less than femtomole quantities of
material can be readily analyzed to afford information about the
molecular contents of the sample. An accurate assessment of the
molecular mass of the material can be quickly obtained,
irrespective of whether the molecular weight of the sample is
several hundred, or in excess of one hundred thousand atomic mass
units (amu) or Daltons.
[0076] In some embodiments, intact molecular ions are generated
from amplification products using one of a variety of ionization
techniques to convert the sample to gas phase. These ionization
methods include, but are not limited to, electrospray ionization
(ESI), matrix-assisted laser desorption ionization (MALDI) and fast
atom bombardment (FAB). Upon ionization, several peaks are observed
from one sample due to the formation of ions with different
charges. Averaging the multiple readings of molecular mass obtained
from a single mass spectrum affords an estimate of molecular mass
of the bioagent identifying amplicon. Electrospray ionization mass
spectrometry (ESI-MS) is particularly useful for very high
molecular weight polymers such as proteins and nucleic acids having
molecular weights greater than 10 kDa, since it yields a
distribution of multiply-charged molecules of the sample without
causing a significant amount of fragmentation.
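
The averaging of readings from multiple charge states can be sketched as follows. This minimal Python example assumes negative-ion mode, in which an ion of charge z has lost z protons, and assumes that the charge of each peak has already been assigned; the function names and peak values are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def neutral_mass(mz: float, z: int) -> float:
    """Neutral mass from one peak, assuming the ion lost z protons."""
    return z * (mz + PROTON_MASS)


def estimate_mass(peaks) -> float:
    """Average the neutral masses implied by each (charge, m/z) peak."""
    masses = [neutral_mass(mz, z) for z, mz in peaks]
    return sum(masses) / len(masses)


# Hypothetical peaks generated from a strand of mass 9548.18 Da.
true_mass = 9548.18
peaks = [(z, true_mass / z - PROTON_MASS) for z in (5, 6, 7, 8)]
print(round(estimate_mass(peaks), 2))
```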
[0077] The mass detectors used in the methods of the present
invention include, but are not limited to, Fourier transform ion
cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight
(TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple
quadrupole.
[0078] Although the molecular mass of amplification products
obtained using intelligent primers provides a means for
identification of bioagents, conversion of molecular mass data to a
base composition signature is useful for certain analyses. As used
herein, a base composition signature (BCS) is the exact base
composition determined from the molecular mass of a bioagent
identifying amplicon. In one embodiment, a BCS provides an index of
a specific gene in a specific organism. As used herein, a base
composition is the exact number of each nucleobase (A, T, C and
G).
[0079] RNA viruses depend on error-prone polymerases for
replication and therefore their nucleotide sequences (and resultant
base compositions) drift over time within the functional
constraints allowed by selection pressure. Base composition
probability distribution of a viral species or group represents a
probabilistic distribution of the above variation in the A, C, G
and T base composition space and can be derived by analyzing base
compositions of, for example, all known isolates of that particular
species.
[0080] In some embodiments, assignment of the likelihood that a
previously unknown or un-indexed base composition corresponds to a
particular virus, or a related member of a group of viruses is
accomplished using base composition probability clouds or base
composition density polyhedrons. Base compositions, like sequences,
vary slightly from isolate to isolate within species or individual
genotypes. It is possible to manage this diversity by building base
composition probability clouds around the composition constraints
for each species. This permits identification of organisms in a
fashion similar to sequence analysis. A pseudo four-dimensional
plot can be used to visualize the concept of base composition
probability clouds. Likewise, a system of tetrahedral axes can be
used to build a polyhedron according to seven base composition
constraints. Optimal primer design requires optimal choice of
bioagent identifying amplicons and maximizes the separation between
the base composition signatures of individual bioagents. Areas
where clouds overlap indicate regions that may result in a
misclassification, a problem which is overcome by a triangulation
identification process using bioagent identifying amplicons not
affected by overlap of base composition probability clouds or
density polyhedrons.
[0081] In some embodiments, pre-calculated base composition
probability clouds provide the means for screening potential primer
pairs in order to avoid potential misclassifications of base
compositions. In other embodiments, base composition probability
clouds provide the means for predicting the identity of a bioagent
whose assigned base composition was not previously observed and/or
indexed in a bioagent identifying amplicon base composition
database due to evolutionary transitions in its nucleic acid
sequence. Thus, in contrast to probe-based techniques, mass
spectrometry determination of base composition does not require
prior knowledge of the composition or sequence in order to make the
measurement. Methods of calculating base composition probability
clouds are described in U.S. Patent Application Publication No.
2004-0209260. Likewise methods of calculating base composition
density polyhedrons are described in U.S. patent application Ser.
No. 11/073,362.
[0082] The present invention provides bioagent classifying
information similar to DNA sequencing and phylogenetic analysis at
a level sufficient to identify a given bioagent. Furthermore, the
process of determination of a previously unknown base composition
for a given bioagent (for example, in a case where sequence
information is unavailable) has downstream utility by providing
additional bioagent indexing information with which to populate
base composition databases. The process of future bioagent
identification is, thus, greatly improved as more base compositions
become available in base composition databases.
[0083] In some embodiments, the identity and quantity of an unknown
bioagent can be determined using a representative process
illustrated in FIG. 2. Primers (500) and a known quantity of a
calibration polynucleotide (505) are added to a sample containing
nucleic acid of an unknown bioagent (508). The total nucleic acid
in the sample is then subjected to an amplification reaction to
obtain amplification products (510). The amplification products
are then analyzed to obtain molecular mass and abundance data
(515). The molecular mass of the
bioagent identifying amplicon (520) provides the means for its
identification (525) and the molecular mass of the calibration
amplicon obtained from the calibration polynucleotide (530)
provides the means for its identification (535). The abundance data
of the bioagent identifying amplicon (540) is recorded and the
abundance data for the calibration amplicon (545) is recorded, both of
which are used in a calculation which determines the quantity of
unknown bioagent in the sample (550).
[0084] For concurrent identification and quantitation of an unknown
bioagent, a sample comprising the unknown bioagent is contacted
with a pair of primers which provide the means for amplification of
nucleic acid from the bioagent, and a known quantity of a
polynucleotide that comprises a calibration sequence. The nucleic
acids of the bioagent and of the calibration sequence are amplified
and the rate of amplification is reasonably assumed to be similar
for the nucleic acid of the bioagent and of the calibration
sequence. The amplification reaction then produces two
amplification products: a bioagent identifying amplicon and a
calibration amplicon. The bioagent identifying amplicon and the
calibration amplicon should be distinguishable by molecular mass
while being amplified at essentially the same rate. Effecting
differential molecular masses can be accomplished by choosing as a
calibration sequence, a representative bioagent identifying
amplicon (from a specific species of bioagent) and performing, for
example, a 2-8 nucleobase deletion or insertion within the variable
region between the two priming sites. The amplified sample
containing the bioagent identifying amplicon and the calibration
amplicon is then subjected to molecular mass analysis by, for
example, mass spectrometry. The resulting molecular mass analysis
of the nucleic acid of the bioagent and of the calibration sequence
provides molecular mass data and abundance data for the nucleic
acid of the bioagent and of the calibration sequence. The molecular
mass data obtained for the nucleic acid of the bioagent enables
identification of the unknown bioagent and the abundance data
enables calculation of the quantity of the bioagent, based on the
knowledge of the quantity of calibration polynucleotide contacted
with the sample.
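
The quantity calculation described above can be sketched in Python. The sketch assumes, as stated, that the bioagent nucleic acid and the calibration sequence amplify at essentially the same rate, so that the ratio of abundances equals the ratio of starting quantities; the function names, sequence and abundance values are hypothetical.

```python
def make_calibrant(amplicon: str, start: int, n_deleted: int) -> str:
    """Derive a calibration sequence by deleting 2-8 nucleobases."""
    if not 2 <= n_deleted <= 8:
        raise ValueError("deletion should be 2 to 8 nucleobases")
    return amplicon[:start] + amplicon[start + n_deleted:]


def estimate_quantity(bioagent_abundance: float,
                      calibrant_abundance: float,
                      calibrant_quantity: float) -> float:
    """Scale the known calibrant quantity by the observed abundance ratio."""
    if calibrant_abundance <= 0:
        raise ValueError("no measurable calibration amplicon")
    return calibrant_quantity * bioagent_abundance / calibrant_abundance


# Hypothetical example: 100 calibrant copies were spiked into the sample.
calibrant = make_calibrant("ATCCAAGTTTGACCGTTAACGGATTTCCCGG", start=10, n_deleted=5)
print(len(calibrant))
print(estimate_quantity(3000.0, 1500.0, 100))  # 200.0
```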
[0085] In some embodiments, construction of a standard curve where
the amount of calibration polynucleotide spiked into the sample is
varied provides additional resolution and improved confidence for
the determination of the quantity of bioagent in the sample. The
use of standard curves for analytical determination of molecular
quantities is well known to one with ordinary skill and can be
performed without undue experimentation.
[0086] In some embodiments, multiplex amplification is performed
where multiple bioagent identifying amplicons are amplified with
multiple primer pairs which also amplify the corresponding standard
calibration sequences. In this or other embodiments, the standard
calibration sequences are optionally included within a single
vector which functions as the calibration polynucleotide. Multiplex
amplification methods are well known to those with ordinary skill
and can be performed without undue experimentation. However, for
the purpose of measurement of bioagent identifying amplicons by
mass spectrometry, it is advantageous to ensure that no single
strand of a double stranded bioagent identifying amplicon has a
molecular mass substantially similar to another single strand
present in the multiplex amplification mixture to avoid the
presence of overlapping mass peaks in the resulting mass
spectrum.
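
A check for overlapping mass peaks in a proposed multiplex mixture can be sketched in Python. The minimum separation, strand labels and masses below are hypothetical; an appropriate separation would depend on the resolution of the mass spectrometer used.

```python
from itertools import combinations


def overlapping_strands(strand_masses: dict, min_separation: float) -> list:
    """Return pairs of strands whose masses differ by less than min_separation."""
    return [
        (name_a, name_b)
        for (name_a, mass_a), (name_b, mass_b) in combinations(strand_masses.items(), 2)
        if abs(mass_a - mass_b) < min_separation
    ]


# Hypothetical single-strand masses (Da) for two primer pairs.
strands = {
    "pair1_fwd": 9548.18,
    "pair1_rev": 9606.23,
    "pair2_fwd": 9549.00,
    "pair2_rev": 10250.40,
}
print(overlapping_strands(strands, min_separation=5.0))
# [('pair1_fwd', 'pair2_fwd')]
```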
[0087] In some embodiments, the calibrant polynucleotide is used as
an internal positive control to confirm that amplification
conditions and subsequent analysis steps are successful in
producing a measurable amplicon. Even in the absence of copies of
the genome of a bioagent, the calibration polynucleotide should
give rise to a calibration amplicon. Failure to produce a
measurable calibration amplicon indicates a failure of
amplification or subsequent analysis step such as amplicon
purification or molecular mass determination. Reaching a conclusion
that such failures have occurred is, in itself, a useful event.
[0088] In some embodiments, the calibration sequence is comprised
of DNA. In some embodiments, the calibration sequence is comprised
of RNA.
[0089] In some embodiments, the calibration sequence is inserted
into a vector which then itself functions as the calibration
polynucleotide. In some embodiments, more than one calibration
sequence is inserted into the vector that functions as the
calibration polynucleotide. Such a calibration polynucleotide is
herein termed a "combination calibration polynucleotide." The
process of inserting polynucleotides into vectors is routine to
those skilled in the art and can be accomplished without undue
experimentation. Thus, it should be recognized that the calibration
method should not be limited to the embodiments described herein.
The calibration method can be applied for determination of the
quantity of any bioagent identifying amplicon when an appropriate
standard calibrant polynucleotide sequence is designed and used.
The process of choosing an appropriate vector for insertion of a
calibrant is also a routine operation that can be accomplished by
one with ordinary skill without undue experimentation.
[0090] Bioagents that can be identified by the methods of the
present invention include RNA viruses. The genomes of RNA viruses
can be positive-sense single-stranded RNA, negative-sense
single-stranded RNA or double-stranded RNA. Examples of RNA viruses
with positive-sense single-stranded genomes include, but are not
limited to members of the Caliciviridae, Picornaviridae,
Flaviviridae, Togaviridae, Retroviridae and Coronaviridae families.
Examples of RNA viruses with negative-sense single-stranded RNA
genomes include, but are not limited to, members of the
Filoviridae, Rhabdoviridae, Bunyaviridae, Orthomyxoviridae,
Paramyxoviridae and Arenaviridae families. Examples of RNA viruses
with double-stranded RNA genomes include, but are not limited to,
members of the Reoviridae and Birnaviridae families.
[0091] In some embodiments of the present invention, RNA viruses
are identified by first obtaining RNA from an RNA virus, or a
sample containing or suspected of containing an RNA virus,
obtaining corresponding DNA from the RNA by reverse transcription,
amplifying the DNA to obtain one or more amplification products
using one or more pairs of oligonucleotide primers that bind to
conserved regions of the RNA viral genome, which flank a variable
region of the genome, determining the molecular mass or base
composition of the one or more amplification products and comparing
the molecular masses or base compositions with calculated or
experimentally determined molecular masses or base compositions of
known RNA viruses, wherein at least one match identifies the RNA
virus. Methods of isolating RNA from RNA viruses and/or samples
containing RNA viruses, and reverse transcribing RNA to DNA are
well known to those of skill in the art.
[0092] Orthopoxviruses represent DNA virus examples of viral
bioagents which can be identified by the methods of the present
invention. Orthopoxviruses are extremely diverse at the nucleotide
and protein sequence levels and are thus difficult to detect and
identify using currently available diagnostic techniques.
[0093] In some embodiments of the present invention, the
orthopoxvirus target gene is DNA polymerase, RNA polymerase, DNA
helicase, RNA helicase, or thioredoxin-like gene.
[0094] In other embodiments of the present invention, the
intelligent primers produce bioagent identifying amplicons within
stable and highly conserved regions of orthopoxvirus genomes. The
advantage to characterization of an amplicon in a highly conserved
region is that there is a low probability that the region will
evolve past the point of primer recognition, in which case, the
amplification step would fail. Such a primer set is, thus, useful
as, for example, a broad range survey-type primer. In another
embodiment of the present invention, the intelligent primers
produce bioagent identifying amplicons in a region which evolves
more quickly than the stable region described above. The advantage
of characterizing a bioagent identifying amplicon corresponding to
an evolving genomic region is that it is useful for distinguishing
emerging strain variants.
[0095] The present invention also has significant advantages as a
platform for identification of diseases caused by emerging viruses.
The present invention eliminates the need for prior knowledge of
bioagent sequence to generate hybridization probes. Thus, in
another embodiment, the present invention provides a means of
determining the etiology of a virus infection when the process of
identification of viruses is carried out in a clinical setting,
even when the virus is a new species never observed before. This is
possible because the methods are not confounded by naturally
occurring evolutionary variations (a major concern for
characterization of viruses which evolve rapidly) occurring in the
sequence acting as the template for production of the bioagent
identifying amplicon. Measurement of molecular mass and
determination of base composition is accomplished in an unbiased
manner without sequence prejudice.
[0096] Another embodiment of the present invention also provides a
means of tracking the spread of any species or strain of virus when
a plurality of samples obtained from different locations are
analyzed by the methods described above in an epidemiological
setting. In one embodiment, a plurality of samples from a plurality
of different locations is analyzed with primers which produce viral
bioagent identifying amplicons, a subset of which contains a
specific virus. The corresponding locations of the members of the
virus-containing subset indicate the spread of the specific virus
to the corresponding locations.
[0097] The present invention also provides kits for carrying out
the methods described herein. In some embodiments, the kit may
comprise a sufficient quantity of one or more primer pairs to
perform an amplification reaction on a target polynucleotide from a
bioagent to form a bioagent identifying amplicon. In some
embodiments, the kit may comprise from one to fifty primer pairs,
from one to twenty primer pairs, from one to ten primer pairs, or
from two to five primer pairs. In some embodiments, the kit may
comprise one or more, two or more, three or more, or four or more
primer pairs, wherein each member of the pair is of a length of 13
to 35 nucleobases and has 70% to 100% sequence identity with any of
the primers recited in Table 1.
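As a non-limiting sketch of how the length and sequence identity criteria above might be checked, the following Python example compares a candidate primer with a primer of Table 1. It assumes an ungapped, position-by-position comparison of equal-length sequences, which is only one possible definition of percent identity; the function names and the variant sequence are hypothetical.

```python
def percent_identity(candidate, reference):
    """Ungapped percent identity between two equal-length sequences
    (an assumed definition; other alignment-based definitions exist)."""
    if len(candidate) != len(reference):
        raise ValueError("ungapped comparison requires equal lengths")
    matches = sum(1 for a, b in zip(candidate.upper(), reference.upper()) if a == b)
    return 100.0 * matches / len(reference)


def meets_kit_criteria(candidate, reference, min_identity=70.0):
    """True if the candidate is 13 to 35 nucleobases long and has at least
    min_identity percent identity with the reference primer."""
    return 13 <= len(candidate) <= 35 and percent_identity(candidate, reference) >= min_identity


# SEQ ID NO: 1 from Table 1 (forward primer of primer pair number 308).
SEQ_ID_NO_1 = "GAAGTTGAACCGGGATCA"
# Hypothetical variant carrying a single mismatch at the 3' end.
variant = "GAAGTTGAACCGGGATCT"

print(round(percent_identity(variant, SEQ_ID_NO_1), 1))
print(meets_kit_criteria(variant, SEQ_ID_NO_1))
```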
[0098] In some embodiments, the kit may comprise one or more broad
range survey primer(s), division wide primer(s), or drill-down
primer(s), or any combination thereof. A kit may be designed so as
to comprise particular primer pairs for identification of a
particular bioagent. For example, a broad range survey primer kit
may be used initially to identify an unknown bioagent as a member
of the Orthopoxvirus genus. As another example, a division-wide kit
may be used to distinguish the Bangladesh-1975, India-1967 and
Garcia-1966 strains of variola virus from each other. A drill-down
kit may be used, for example, to distinguish different subtypes or
genotypes of orthopoxviruses. In some embodiments, any of these
kits may be combined to comprise a combination of broad range
survey primers and division-wide primers so as to be able to
identify the species of an unknown bioagent.
[0099] In some embodiments, the kit may contain standardized
calibration polynucleotides for use as internal amplification
calibrants. Internal calibrants are described in commonly owned
U.S. Patent Application Ser. No. 60/545,425, which is incorporated
herein by reference in its entirety.
[0100] In some embodiments, the kit may also comprise a sufficient
quantity of reverse transcriptase (if an RNA virus is to be
identified for example), a DNA polymerase, suitable nucleoside
triphosphates (including any of those described above), a DNA
ligase, and/or reaction buffer, or any combination thereof, for the
amplification processes described above. A kit may further include
instructions pertinent for the particular embodiment of the kit,
such instructions describing the primer pairs and amplification
conditions for operation of the method. A kit may also comprise
amplification reaction containers such as microcentrifuge tubes and
the like. A kit may also comprise reagents or other materials for
isolating bioagent nucleic acid or bioagent identifying amplicons
from amplification, including, for example, detergents, solvents,
or ion exchange resins which may be linked to magnetic beads. A kit
may also comprise a container such as a 96-well plate. A kit may
also comprise a table of measured or calculated molecular masses
and/or base compositions of bioagents using the primer pairs of the
kit.
[0101] While the present invention has been described with
specificity in accordance with certain of its embodiments, the
following examples serve only to illustrate the invention and are
not intended to limit the same. In order that the invention
disclosed herein may be more efficiently understood, examples are
provided below. It should be understood that these examples are for
illustrative purposes only and are not to be construed as limiting
the invention in any manner.
EXAMPLES
Example 1
Orthopoxvirus Identifying Amplicons
[0102] For design of primers that define orthopoxvirus identifying
amplicons, all available sequences for members of the Orthopoxvirus
genus were obtained from GenBank and the Poxvirus database (world
wide web at poxvirus.org) and aligned and scanned for regions where
pairs of PCR primers would amplify products between about 45 and
about 200 nucleotides in length and distinguish species and/or
sub-species from each other by their molecular masses or base
compositions. A typical process shown in FIG. 1 is employed.
[0103] A database of expected base compositions for each primer
region is generated using an in silico PCR search algorithm, such
as ePCR. An existing RNA structure search algorithm (Macke et
al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated
herein by reference in its entirety) has been modified to include
PCR parameters such as hybridization conditions, mismatches, and
thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci.
U.S.A., 1998, 95, 1460-1465, which is incorporated herein by
reference in its entirety). This also provides information on
primer specificity of the selected primer pairs.
[0104] Table 1 represents a collection of primers (sorted by
forward primer name) designed to identify orthopoxviruses using the
methods described herein. Primer sites were identified on five
essential genes: DNA polymerase (E9L), RNA polymerase (A24R), DNA
helicase (A18R), RNA helicase (K8R) and thioredoxin-like gene
(A25L). The forward or reverse primer name shown in Table 1
indicates the gene region of the viral genome to which the primer
hybridizes relative to a reference sequence. For example, the
forward primer name K8R_NC001611_221_238_F indicates a
forward primer "_F" that hybridizes to residues 221-238 of an
orthopoxvirus reference sequence represented by GenBank Accession
No. NC001611. In Table 1, T.sup.a=5-propynyluracil (a propynylated
version of T); and C.sup.a=5-propynylcytosine (a propynylated
version of C). The primer pair number is an in-house database index
number.
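The primer naming convention described above lends itself to a simple parser, sketched below in Python for illustration only. The regular expression and the field names are assumptions made for this sketch. In particular, reading the "P" that follows the coordinates in some names of Table 1 as a marker for the propynylated primer versions is an inference from the table and is not stated in the text.

```python
import re
from typing import NamedTuple


class PrimerName(NamedTuple):
    gene: str            # gene designation, e.g. "K8R"
    accession: str       # GenBank accession of the reference sequence
    start: int           # first residue of the hybridization region
    end: int             # last residue of the hybridization region
    propynylated: bool   # assumed meaning of the optional "P" suffix
    direction: str       # "F" (forward) or "R" (reverse)


_NAME_PATTERN = re.compile(r"^([A-Z]\d+[LR])_([A-Z]{2}\d+)_(\d+)_(\d+)(P?)_([FR])$")


def parse_primer_name(name):
    """Split a Table 1 style primer name into its component fields."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    gene, accession, start, end, p_flag, direction = match.groups()
    return PrimerName(gene, accession, int(start), int(end), p_flag == "P", direction)


parsed = parse_primer_name("K8R_NC001611_221_238_F")
print(parsed)
# Length of the region spanned by the coordinates, inclusive of both ends.
print(parsed.end - parsed.start + 1)
```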
TABLE 1 Primer Pairs for Identification of Orthopoxvirus Bioagents
Primer Pair Number | Forward Primer Name | Forward Sequence | For SEQ ID NO: | Reverse Primer Name | Reverse Sequence | Rev SEQ ID NO:
296 | A18R_NC001611_100_117P_F | GAAGT.sup.aT.sup.aGAAC.sup.aC.sup.aGGGATCA | 1 | A18R_NC001611_187_207P_R | ATTATCGGT.sup.aC.sup.aGT.sup.aT.sup.aGT.sup.aT.sup.aAATGT | 24
297 | A18R_NC001611_1348_1370P_F | CTGT.sup.aC.sup.aT.sup.aGTAGATAAAC.sup.aT.sup.aAGGATT | 2 | A18R_NC001611_1428_1445P_R | CGTTC.sup.aT.sup.aT.sup.aC.sup.aT.sup.aC.sup.aT.sup.aGGAGGAT | 25
298 | K8R_NC001611_221_238P_F | CT.sup.aC.sup.aC.sup.aTC.sup.aC.sup.aATCAC.sup.aT.sup.aAGGAA | 3 | K8R_NC001611_290_311P_R | CTATAACAT.sup.aT.sup.aC.sup.aAAAGC.sup.aT.sup.aT.sup.aATTG | 26
299 | E9L_NC001611_1119_1133P_F | CGATAC.sup.aT.sup.aAC.sup.aGGACGC | 4 | E9L_NC001611_1201_1222P_R | CTTTATGAAT.sup.aT.sup.aAC.sup.aT.sup.aT.sup.aT.sup.aACATAT | 27
300 | A25L_NC001611_28_45P_F | GTAC.sup.aT.sup.aGAAT.sup.aC.sup.aC.sup.aGC.sup.aC.sup.aTAAG | 5 | A25L_NC001611_105_127P_R | GTGAATAAAGTAT.sup.aC.sup.aGC.sup.aC.sup.aC.sup.aT.sup.aAATA | 28
301 | A24R_NC001611_795_817P_F | CGCGAT.sup.aAAT.sup.aAGATAGT.sup.aGC.sup.aT.sup.aAAAC | 6 | A24R_NC001611_860_878P_R | GCTTC.sup.aC.sup.aAC.sup.aCAGGT.sup.aCAT.sup.aTAA | 29
308 | A18R_NC001611_100_117_F | GAAGTTGAACCGGGATCA | 1 | A18R_NC001611_187_207_R | ATTATCGGTCGTTGTTAATGT | 24
309 | A18R_NC001611_1348_1370_F | CTGTCTGTAGATAAACTAGGATT | 2 | A18R_NC001611_1428_1445_R | CGTTCTTCTCTGGAGGAT | 25
310 | K8R_NC001611_221_238_F | CTCCTCCATCACTAGGAA | 3 | K8R_NC001611_290_311_R | CTATAACATTCAAAGCTTATTG | 26
311 | E9L_NC001611_1119_1133_F | CGATACTACGGACGC | 4 | E9L_NC001611_1201_1222_R | CTTTATGAATTACTTTACATAT | 27
312 | A25L_NC001611_28_45_F | GTACTGAATCCGCCTAAG | 5 | A25L_NC001611_105_127_R | GTGAATAAAGTATCGCCCTAATA | 28
313 | A24R_NC001611_795_817_F | CGCGATAATAGATAGTGCTAAAC | 6 | A24R_NC001611_860_878_R | GCTTCCACCAGGTCATTAA | 29
488 | A18R_NC001611_98_117P_F | TAGAAGT.sup.aT.sup.aGAAC.sup.aC.sup.aGGGATCA | 7 | A18R_NC001611_187_208P_R | TATTATCGGT.sup.aC.sup.aGT.sup.aT.sup.aGT.sup.aT.sup.aAATGT | 30
489 | A18R_NC001611_1347_1370P_F | TCTGT.sup.aC.sup.aT.sup.aGTAGATAAAC.sup.aT.sup.aAGGATT | 8 | A18R_NC001611_1428_1446P_R | TCGTTC.sup.aT.sup.aT.sup.aC.sup.aT.sup.aC.sup.aT.sup.aGGAGGAT | 31
490 | K8R_NC001611_220_238P_F | TCT.sup.aC.sup.aC.sup.aTC.sup.aC.sup.aATCAC.sup.aT.sup.aAGGAA | 9 | K8R_NC001611_290_312P_R | TCTATAACAT.sup.aT.sup.aC.sup.aAAAGC.sup.aT.sup.aT.sup.aATTG | 32
491 | E9L_NC001611_1118_1133P_F | TCGATAC.sup.aT.sup.aAC.sup.aGGACGC | 10 | E9L_NC001611_1201_1223P_R | TCTTTATGAAT.sup.aT.sup.aAC.sup.aT.sup.aT.sup.aT.sup.aACATAT | 33
492 | A25L_NC001611_27_45P_F | TGTAC.sup.aT.sup.aGAAT.sup.aC.sup.aC.sup.aGC.sup.aC.sup.aTAAG | 11 | A25L_NC001611_105_128P_R | TGTGAATAAAGTAT.sup.aC.sup.aGC.sup.aC.sup.aC.sup.aT.sup.aAATA | 34
493 | A24L_NC001611_794_817P_F | TCGCGAT.sup.aAAT.sup.aAGATAGT.sup.aGC.sup.aT.sup.aAAAC | 12 | A24R_NC001611_860_879P_R | TGCTTC.sup.aC.sup.aAC.sup.aCAGGT.sup.aCAT.sup.aTAA | 35
979 | A18R_NC001611_90_117_F | TGATTTCGTAGAAGTTGAACCGGGATCA | 13 | A18R_NC001611_187_217_R | TCGCGATTTTATTATCGGTCGTTGTTAATGT | 36
980 | A18R_NC001611_91_117_F | TTCTCCCTAGAAGTTGAACCGGGATCA | 14 | A18R_NC001611_187_216_R | TCCCTCCCTATTATCGGTCGTTGTTAATGT | 37
981 | E9L_NC001611_1113_1133_F | TGGTGACGATACTACGGACGC | 15 | E9L_NC001611_1201_1235_R | TCCCTCCCAATATCTTTACGAATTACTTTACATAT | 38
982 | E9L_NC001611_1112_1133_F | TCGGTGACGATACTACGGACGC | 16 | E9L_NC001611_1205_1235_R | TCCTCCCTCCCATCTTTACGAATTACTTTAC | 39
983 | E9L_NC001611_1112_1133_F | TCGGTGACGATACTACGGACGC | 17 | E9L_NC001611_1205_1238_R | TCCTCCCTCCCAATATCTTTACGAATTACTTTAC | 40
984 | K8R_NC001611_207_238_F | TGGAAAAAAAGTATCTCCTCCATCACTAGGAA | 18 | K8R_NC001611_290_324_R | TCCCTCCCGAAAACTATAACATTCAAAGCTTATTG | 41
985 | K8R_NC001611_211_242_F | TGGAAAGTATCTCCTCCATCACTAGGAAAACC | 19 | K8R_NC001611_290_322_R | TCCCTCCCTCCCTATAACATTCAAAGCTTATTG | 42
986 | K8R_NC001611_213_238_F | TCCCTCCTCTCCTCCATCACTAGGAA | 20 | K8R_NC001611_290_319_R | TCCTCCCTCCCTAACATTCAAAGCTTATTG | 43
987 | A24R_NC001611_786_818_F | TCTAGTAAACGCGATAATAGATAGTGCTAAACG | 21 | A24R_NC001611_860_884_R | TGTTCAGCTTCCACCAGGTCATTAA | 44
988 | A24R_NC001611_788_818_F | TCCTCCTCGCGATAATAGATAGTGCTAAACG | 22 | A24R_NC001611_860_886_R | TGTGTTCAGCTTCCACCAGGTCATTAA | 45
989 | A24R_NC001611_789_817_F | TCCTCCCGCGATAATAGATAGTGCTAAAC | 23 | A24R_NC001611_860_883_R | TCCCAGCTTCCACCAGGTCATTAA | 46
1066 | A18R_NC001611_90_117_F | TGATTTCGTAGAAGTTGAACCGGGATCA | 13 | A18R_NC001611_187_216_R | TCCCTCCCTATTATCGGTCGTTGTTAATGT | 37
1067 | A18R_NC001611_91_117_F | TTCTCCCTAGAAGTTGAACCGGGATCA | 14 | A18R_NC001611_187_217_R | TCGCGATTTTATTATCGGTCGTTGTTAATGT | 36
Example 2
DNA Isolation and Amplification
[0105] Genomic materials from culture samples or swabs are prepared
using the DNeasy.RTM. 96 Tissue Kit (Qiagen, Valencia, Calif.). All
PCR reactions are assembled in 50 .mu.l reactions in a 96 well
microtiter plate format using a Packard MPII liquid handling
robotic platform and MJ Dyad.RTM. thermocyclers (MJ Research,
Waltham, Mass.). The PCR reaction consists of 4 units of AmpliTaq
Gold.RTM., 1.times. buffer II (Applied Biosystems, Foster City,
Calif.), 1.5 mM MgCl.sub.2, 0.4 M betaine, 800 .mu.M of dNTP
mixture, and 250 nM of each primer.
[0106] The following PCR conditions can be used to amplify the
sequences used for mass spectrometry analysis: 95.degree. C. for 10
minutes followed by 8 cycles of 95.degree. C. for 30 seconds,
48.degree. C. for 30 seconds, and 72.degree. C. for 30 seconds,
with the 48.degree. C. annealing temperature increased 0.9.degree.
C. after each cycle. The PCR is then continued for 37 additional
cycles of 95.degree. C. for 15 seconds, 56.degree. C. for 20
seconds, and 72.degree. C. for 20 seconds.
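For illustration only, the cycling conditions above can be written out as an explicit step list in Python. The sketch assumes that the first of the 8 initial cycles anneals at 48 degrees C. and that each later cycle anneals 0.9 degrees C. higher than the one before it, which is one reading of the text; the function and step names are hypothetical.

```python
def build_cycling_program():
    """Return the PCR program as (step name, temperature in deg C, seconds)."""
    program = [("initial denaturation", 95.0, 600)]  # 95 deg C for 10 minutes

    # 8 cycles in which the annealing temperature rises by 0.9 deg C per cycle.
    for cycle in range(8):
        anneal_temp = round(48.0 + 0.9 * cycle, 1)
        program.append(("denature", 95.0, 30))
        program.append(("anneal", anneal_temp, 30))
        program.append(("extend", 72.0, 30))

    # 37 additional cycles at a fixed 56 deg C annealing temperature.
    for _ in range(37):
        program.append(("denature", 95.0, 15))
        program.append(("anneal", 56.0, 20))
        program.append(("extend", 72.0, 20))

    return program


program = build_cycling_program()
annealing_temps = [temp for name, temp, _ in program if name == "anneal"]
print(annealing_temps[:8])   # annealing temperatures of the 8 ramped cycles
print(len(annealing_temps))  # total number of cycles
```

Under this reading the last ramped cycle anneals at 54.3 degrees C., just below the 56 degrees C. used for the remaining cycles.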
Example 3
Solution Capture Purification of PCR Products for Mass Spectrometry
with Ion Exchange Resin-Magnetic Beads
[0107] For solution capture of nucleic acids with ion exchange
resin linked to magnetic beads, 25 .mu.l of a 2.5 mg/mL suspension
of BioClon amine terminated superparamagnetic beads were added to
25 to 50 .mu.l of a PCR (or RT-PCR) reaction containing
approximately 10 pM of a typical PCR amplification product. The
above suspension was mixed for approximately 5 minutes by vortexing
or pipetting, after which the liquid was removed using a
magnetic separator. The beads containing bound PCR amplification
product were then washed 3 times with 50 mM ammonium
bicarbonate/50% MeOH or 100 mM ammonium bicarbonate/50% MeOH,
followed by three more washes with 50% MeOH. The bound PCR amplicon
was eluted with 25 mM piperidine, 25 mM imidazole, 35% MeOH, plus
peptide calibration standards.
Example 4
Mass Spectrometry and Base Composition Analysis
[0108] The ESI-FTICR mass spectrometer is based on a Bruker
Daltonics (Billerica, Mass.) Apex II 70e electrospray ionization
Fourier transform ion cyclotron resonance mass spectrometer that
employs an actively shielded 7 Tesla superconducting magnet. The
active shielding constrains the majority of the fringing magnetic
field from the superconducting magnet to a relatively small volume.
Thus, components that might be adversely affected by stray magnetic
fields, such as CRT monitors, robotic components, and other
electronics, can operate in close proximity to the FTICR
spectrometer. All aspects of pulse sequence control and data
acquisition were performed on a 600 MHz Pentium II data station
running Bruker's Xmass software under the Windows NT 4.0 operating
system. Sample aliquots, typically 15 .mu.l, were extracted directly
from 96-well microtiter plates using a CTC HTS PAL autosampler
(LEAP Technologies, Carrboro, N.C.) triggered by the FTICR data
station. Samples were injected directly into a 10 .mu.l sample loop
integrated with a fluidics handling system that supplies the 100
.mu.l/hr flow rate to the ESI source. Ions were formed via
electrospray ionization in a modified Analytica (Branford, Conn.)
source employing an off axis, grounded electrospray probe
positioned approximately 1.5 cm from the metalized terminus of a
glass desolvation capillary. The atmospheric pressure end of the
glass capillary was biased at 6000 V relative to the ESI needle
during data acquisition. A counter-current flow of dry N.sub.2 was
employed to assist in the desolvation process. Ions were
accumulated in an external ion reservoir comprised of an rf-only
hexapole, a skimmer cone, and an auxiliary gate electrode, prior to
injection into the trapped ion cell where they were mass analyzed.
Ionization duty cycles >99% were achieved by simultaneously
accumulating ions in the external ion reservoir during ion
detection. Each detection event consisted of 1M data points
digitized over 2.3 s. To improve the signal-to-noise ratio (S/N),
32 scans were co-added for a total data acquisition time of 74
s.
[0109] The ESI-TOF mass spectrometer is based on a Bruker Daltonics
MicroTOF.TM.. Ions from the ESI source undergo orthogonal ion
extraction and are focused in a reflectron prior to detection. The
TOF and FTICR are equipped with the same automated sample handling
and fluidics described above. Ions are formed in the standard
MicroTOF.TM. ESI source that is equipped with the same off-axis
sprayer and glass capillary as the FTICR ESI source. Consequently,
source conditions were the same as those described above. External
ion accumulation was also employed to improve ionization duty cycle
during data acquisition. Each detection event on the TOF was
comprised of 75,000 data points digitized over 75 .mu.s.
[0110] The sample delivery scheme allows sample aliquots to be
rapidly injected into the electrospray source at high flow rate and
subsequently be electrosprayed at a much lower flow rate for
improved ESI sensitivity. Prior to injecting a sample, a bolus of
buffer was injected at a high flow rate to rinse the transfer line
and spray needle to avoid sample contamination/carryover. Following
the rinse step, the autosampler injected the next sample and the
flow rate was switched to low flow. Following a brief equilibration
delay, data acquisition commenced. As spectra were co-added, the
autosampler continued rinsing the syringe and picking up buffer to
rinse the injector and sample transfer line. In general, two
syringe rinses and one injector rinse were required to minimize
sample carryover. During a routine screening protocol a new sample
mixture was injected every 106 seconds. More recently a fast wash
station for the syringe needle has been implemented which, when
combined with shorter acquisition times, facilitates the
acquisition of mass spectra at a rate of just under one
spectrum/minute.
[0111] Raw mass spectra were post-calibrated with an internal mass
standard and deconvoluted to monoisotopic molecular masses.
Unambiguous base compositions were derived from the exact mass
measurements of the complementary single-stranded oligonucleotides.
Quantitative results are obtained by comparing the peak heights
with an internal PCR calibration standard present in every PCR well
at 500 molecules per well. Calibration methods are commonly owned
and disclosed in U.S. Provisional Patent Application Ser. No.
60/545,425.
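The comparison of peak heights with the internal calibration standard can be sketched as a simple ratio, shown below in Python for illustration only. The sketch assumes that peak height is directly proportional to starting copy number and that the bioagent and calibrant amplify with equal efficiency; neither assumption is stated in the text, and the function name is hypothetical.

```python
# Calibrant copies per PCR well, as stated in the text.
CALIBRANT_COPIES_PER_WELL = 500


def estimate_copies(analyte_peak_height, calibrant_peak_height,
                    calibrant_copies=CALIBRANT_COPIES_PER_WELL):
    """Estimate starting copies of the bioagent target from the ratio of
    its peak height to that of the internal calibrant (linear model)."""
    if calibrant_peak_height <= 0:
        # Without a measurable calibrant peak the run cannot be quantified.
        raise ValueError("no measurable calibration amplicon")
    return calibrant_copies * analyte_peak_height / calibrant_peak_height


# Hypothetical peak heights in arbitrary units.
print(estimate_copies(analyte_peak_height=2.0, calibrant_peak_height=1.0))
```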
Example 5
De Novo Determination of Base Composition of Amplification Products
Using Molecular Mass Modified Deoxynucleotide Triphosphates
[0112] Because the molecular masses of the four natural nucleobases
have a relatively narrow molecular mass range (A=313.058,
G=329.052, C=289.046, T=304.046--See Table 2), a persistent source
of ambiguity in assignment of base composition can occur as
follows: two nucleic acid strands having different base composition
may have a difference of about 1 Da when the base composition
difference between the two strands is G.revreaction.A (-15.994)
combined with C.revreaction.T (+15.000). For example, one 99-mer
nucleic acid strand having a base composition of
A.sub.27G.sub.30C.sub.21T.sub.21 has a theoretical molecular mass
of 30779.058 while another 99-mer nucleic acid strand having a base
composition of A.sub.26G.sub.31C.sub.22T.sub.20 has a theoretical
molecular mass of 30780.052. A 1 Da difference in molecular mass
may be within the experimental error of a molecular mass
measurement and thus, the relatively narrow molecular mass range of
the four natural nucleobases imposes an uncertainty factor.
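The arithmetic behind this example can be reproduced with the short Python sketch below, which sums the per-nucleobase masses given above for each base composition. The use of the decimal module (to avoid floating-point rounding) and the function name are choices made for this sketch only.

```python
from decimal import Decimal

# Per-nucleobase masses as given in the text (see Table 2).
RESIDUE_MASS = {
    "A": Decimal("313.058"),
    "G": Decimal("329.052"),
    "C": Decimal("289.046"),
    "T": Decimal("304.046"),
}


def strand_mass(composition):
    """Sum the nucleobase masses for a base composition such as
    {"A": 27, "G": 30, "C": 21, "T": 21}."""
    return sum(RESIDUE_MASS[base] * count for base, count in composition.items())


strand_a = {"A": 27, "G": 30, "C": 21, "T": 21}
strand_b = {"A": 26, "G": 31, "C": 22, "T": 20}

mass_a = strand_mass(strand_a)
mass_b = strand_mass(strand_b)
print(mass_a, mass_b, mass_b - mass_a)
```

The two 99-mers differ by one A-to-G substitution (+15.994) and one T-to-C substitution (-15.000), which nearly cancel.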
[0113] The present invention provides for a means for removing this
theoretical 1 Da uncertainty factor through amplification of a
nucleic acid with one mass-tagged nucleobase and three natural
nucleobases. The term "nucleobase" as used herein is synonymous
with other terms in use in the art including "nucleotide,"
"deoxynucleotide," "nucleotide residue," "deoxynucleotide residue,"
"nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate
(dNTP)."
[0114] Addition of significant mass to one of the 4 nucleobases
(dNTPs) in an amplification reaction, or in the primers themselves,
will result in a significant difference in mass of the resulting
amplification product (significantly greater than 1 Da), which
removes the ambiguity arising from the G.revreaction.A combined with
C.revreaction.T event (Table 2). Thus, the same G.revreaction.A
(-15.994) event combined with a 5-Iodo-C.revreaction.T (-110.900)
event would result in a molecular mass difference of 126.894. If
the molecular mass of the base composition A.sub.27G.sub.30
5-Iodo-C.sub.21T.sub.21 (33422.958) is compared with
A.sub.26G.sub.31 5-Iodo-C.sub.22T.sub.20 (33549.852), the
theoretical molecular mass difference is +126.894. The experimental
error of a molecular mass measurement is not significant with
regard to this molecular mass difference. Furthermore, the only
base composition consistent with a measured molecular mass of the
99-mer nucleic acid is A.sub.27G.sub.30 5-Iodo-C.sub.21T.sub.21. In
contrast, the analogous amplification without the mass tag has 18
possible base compositions.
TABLE 2 Molecular Masses of Natural Nucleobases and the Mass-Modified Nucleobase 5-Iodo-C and Molecular Mass Differences Resulting from Transitions
Nucleobase | Molecular Mass | Transition | .DELTA. Molecular Mass
A | 313.058 | A-->T | -9.012
A | 313.058 | A-->C | -24.012
A | 313.058 | A-->5-Iodo-C | 101.888
A | 313.058 | A-->G | 15.994
T | 304.046 | T-->A | 9.012
T | 304.046 | T-->C | -15.000
T | 304.046 | T-->5-Iodo-C | 110.900
T | 304.046 | T-->G | 25.006
C | 289.046 | C-->A | 24.012
C | 289.046 | C-->T | 15.000
C | 289.046 | C-->G | 40.006
5-Iodo-C | 414.946 | 5-Iodo-C-->A | -101.888
5-Iodo-C | 414.946 | 5-Iodo-C-->T | -110.900
5-Iodo-C | 414.946 | 5-Iodo-C-->G | -85.894
G | 329.052 | G-->A | -15.994
G | 329.052 | G-->T | -25.006
G | 329.052 | G-->C | -40.006
G | 329.052 | G-->5-Iodo-C | 85.894
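The effect of the mass tag on base composition assignment can be explored with the brute-force enumeration sketched below in Python. This is an illustration under stated assumptions rather than the method of the invention: the 1 Da tolerance is chosen arbitrarily for the sketch, masses are held as integer milli-Daltons to avoid rounding, and the function name is hypothetical. The number of compositions found depends on the tolerance, so the sketch is not expected to reproduce the counts given in the text.

```python
# Nucleobase masses from Table 2, expressed in milli-Daltons (mDa).
A_MDA, G_MDA, T_MDA = 313058, 329052, 304046
NATURAL_C_MDA = 289046
IODO_C_MDA = 414946


def consistent_compositions(measured_mda, length, c_mass_mda, tolerance_mda):
    """List every (A, G, C, T) count of the given total length whose mass
    lies within tolerance_mda of the measured mass."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                t = length - a - g - c
                mass = a * A_MDA + g * G_MDA + c * c_mass_mda + t * T_MDA
                if abs(mass - measured_mda) <= tolerance_mda:
                    hits.append((a, g, c, t))
    return hits


TOLERANCE_MDA = 1000  # assumed 1 Da measurement tolerance

# 99-mer measured with natural C (30779.058 Da) and with 5-Iodo-C (33422.958 Da).
natural_hits = consistent_compositions(30779058, 99, NATURAL_C_MDA, TOLERANCE_MDA)
tagged_hits = consistent_compositions(33422958, 99, IODO_C_MDA, TOLERANCE_MDA)

# The near-isobaric composition A26 G31 C22 T20 is a candidate only
# when natural C is used.
print((26, 31, 22, 20) in natural_hits, (26, 31, 22, 20) in tagged_hits)
```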
Example 6
Data Processing
[0115] Mass spectra of bioagent identifying amplicons are analyzed
independently using a maximum-likelihood processor, such as is
widely used in radar signal processing. This processor, referred to
as GenX, first makes maximum likelihood estimates of the input to
the mass spectrometer for each primer by running matched filters
for each base composition aggregate on the input data. This
includes the GenX response to a calibrant for each primer.
[0116] The algorithm emphasizes performance predictions culminating
in probability-of-detection versus probability-of-false-alarm plots
for conditions involving complex backgrounds of naturally occurring
organisms and environmental contaminants. Matched filters consist
of a priori expectations of signal values given the set of primers
used for each of the bioagents. A genomic sequence database is used
to define the mass base count matched filters. The database
contains the sequences of known bacterial bioagents and includes
threat organisms as well as benign background organisms. The latter
is used to estimate and subtract the spectral signature produced by
the background organisms.
[0117] A maximum likelihood detection of known background organisms
is implemented using matched filters and a running-sum estimate of
the noise covariance. Background signal strengths are estimated and
used along with the matched filters to form signatures which are
then subtracted. The maximum likelihood process is applied to this
"cleaned up" data in a similar manner employing matched filters for
the organisms and a running-sum estimate of the noise-covariance
for the cleaned up data.
[0118] The amplitudes of all base compositions of bioagent
identifying amplicons for each primer are calibrated and a final
maximum likelihood amplitude estimate per organism is made based
upon the multiple single primer estimates. Models of all system
noise are factored into this two-stage maximum likelihood
calculation. The processor reports the number of molecules of each
base composition contained in the spectra. The quantity of
amplification product corresponding to the appropriate primer set
is reported as well as the quantities of primers remaining upon
completion of the amplification reaction.
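The internals of the GenX processor are not set out here, so the following Python sketch is offered only as a generic illustration of matched-filter amplitude estimation followed by background subtraction. It assumes white noise of equal variance in every spectral bin, in which case the maximum likelihood amplitude for a known template is the normalized inner product of the spectrum with that template; the processor described above instead uses a running-sum estimate of the noise covariance. The templates, spectrum and function names are hypothetical.

```python
def matched_filter_amplitude(spectrum, template):
    """Maximum likelihood amplitude of a known template in a spectrum,
    assuming white Gaussian noise: <spectrum, template> / <template, template>."""
    if len(spectrum) != len(template):
        raise ValueError("spectrum and template must have equal length")
    energy = sum(s * s for s in template)
    if energy == 0:
        raise ValueError("template must not be all zeros")
    return sum(x * s for x, s in zip(spectrum, template)) / energy


def subtract_signature(spectrum, template, amplitude):
    """Remove an estimated signature (amplitude * template) from a spectrum."""
    return [x - amplitude * s for x, s in zip(spectrum, template)]


# Hypothetical, non-overlapping peak shapes for a background organism
# and a target organism.
background_template = [0, 1, 2, 1, 0, 0, 0]
target_template = [0, 0, 0, 0, 1, 2, 1]
# Simulated noise-free spectrum: 3 x background + 2 x target.
spectrum = [0, 3, 6, 3, 2, 4, 2]

# Stage 1: estimate and subtract the background signature.
background_amp = matched_filter_amplitude(spectrum, background_template)
cleaned = subtract_signature(spectrum, background_template, background_amp)

# Stage 2: estimate the target amplitude on the cleaned data.
target_amp = matched_filter_amplitude(cleaned, target_template)
print(background_amp, target_amp)
```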
Example 7
Identification of Members of the Viral Genus Orthopoxvirus
[0119] DNA for five different test orthopoxvirus species was obtained
from the laboratory of Dr. Chris Upton at University of Victoria,
British Columbia, Canada: monkeypox (MPXV-VR267), cowpox (BR),
rabbitpox (Utrecht), vaccinia (WR) and ectromelia (Moscow). PCR products
corresponding to orthopoxvirus identifying amplicons were generated
according to Example 2 from each of the test viruses using primer
pair nos: 296, 297, 299, 310, 312 and 313 (Table 1). PCR products
were purified according to Example 3 and analyzed by mass
spectrometry according to Example 4 with data processing according
to Example 6.
[0120] Spectra were processed by an algorithm that converts
molecular mass to base composition data. All detected masses could
be unambiguously mapped to specific base compositions, which were
compared to the pre-compiled database of expected products from
each of these viruses. FIG. 3 (primer pair number 299) and FIG. 4
(primer pair number 297) show the deconvoluted base compositions
(solid cones) of the experimentally measured spectra in a
three-dimensional plot (A, G, C axes, with the T counts represented
by the tilt of the cone), overlaid on the expected base count
distributions (hollow spheres) of the orthopoxvirus species where
sequence data was available. Compositions for the test strains are
shown as a solid cone projected onto the same plot. The
experimentally determined base compositions were compared with
compositions expected from the sequences in GenBank for all five
viruses tested.
Vaccinia and ectromelia viruses gave expected products consistent
with the database sequence entry in each primer region. In the case
of the rabbitpox virus, the sequence of the target region was
identical to vaccinia virus in all primer regions selected and not
distinguished by the primers described above.
[0121] At the time of primer design, the only strain of monkeypox
virus deposited in GenBank was the Zaire 96_I-16 strain. The
experimentally determined base compositions for the MPXV-VR267
strain were different from those for the Zaire strain. The
experimentally determined base counts were subsequently validated
by comparison to the full genome sequence for the VR267 strain
(unpublished results--Chris Upton, University of Victoria). Thus a
new variant of a known orthopoxvirus species was identified with
the same technology used for primary detection, without the
requirement of additional analysis and/or design.
[0122] A whole genome sequence for a new strain of cowpox, the GRI-90
strain, was published as these experiments were in progress.
Analysis of several conserved genes across all of the orthopoxvirus
species revealed that this strain was closer to vaccinia strains
than it was to the previously known Brighton Red strains of cowpox.
The material that was tested in the lab was clearly the BR strain
as evidenced by the perfect match to the expected base counts for
these in the database.
[0123] Table 3 shows the expected base counts of the various
orthopoxvirus species for all primer regions tested. The isolates
used in this test are indicated. In every test instance, the
experimentally measured signals matched database predicted base
compositions. While a single primer target region might not resolve
all species unambiguously, species can be clearly
identified and differentiated from one another using the
triangulation strategy with multiple orthopoxvirus identifying
amplicons obtained from priming of different genetic loci. For
example, primer pair no. 310 does not distinguish the CMS and
M-96 strains of Camelpox virus but primer pair 296 does
distinguish these two strains because it produces two distinct base
compositions.
TABLE-US-00003 TABLE 3 Orthopoxvirus Species Base Compositions [A G C T] for Primer Pair Nos: 296, 297, 299, 310, 312 and 313
Orthopoxvirus Species | Strain | GenBank Accession | Primer Pair No: 310 | Primer Pair No: 296 | Primer Pair No: 313 | Primer Pair No: 299 | Primer Pair No: 312 | Primer Pair No: 297 |
Camelpox virus | CMS | AY009089 | A38 G11 C23 T19 | A32 G20 C23 T33 | A29 G15 C14 T26 | A38 G23 C16 T30 | A30 G19 C18 T33 | A37 G17 C22 T22 |
Camelpox virus | M-96 | AF438165 | A38 G11 C23 T19 | A32 G19 C23 T34 | A29 G15 C14 T26 | A38 G23 C16 T30 | A30 G19 C18 T33 | A37 G17 C22 T22 |
Cowpox virus | Brighton Red | AF482758 | A33 G14 C18 T26 | A36 G18 C23 T31 | A29 G15 C16 T24 | A36 G25 C17 T29 | A25 G24 C21 T30 | A36 G17 C22 T20 |
Cowpox virus | GRI-90 | X94355 | A37 G11 C24 T19 | A33 G19 C24 T32 | A30 G15 C13 T26 | A36 G25 C17 T29 | A27 G23 C19 T31 | A36 G18 C22 T22 |
Ectromelia virus | Moscow | AF012825 | A34 G13 C17 T27 | A33 G19 C24 T32 | A30 G15 C13 T26 | A38 G25 C15 T29 | A27 G22 C19 T32 | A38 G16 C22 T22 |
Monkeypox virus | WR-267 | AY603973 | A34 G14 C18 T25 | A33 G20 C22 T33 | A29 G15 C15 T25 | A39 G24 C16 T28 | A28 G20 C21 T34 | A36 G17 C22 T20 |
Monkeypox virus | Zaire-96-I-16 | AF380138 | A34 G14 C18 T25 | A33 G20 C22 T33 | A28 G16 C15 T25 | A40 G24 C14 T29 | A28 G20 C21 T34 | A34 G19 C22 T20 |
Vaccinia virus | Copenhagen | M35027 | A38 G10 C24 T19 | A32 G21 C24 T31 | A30 G15 C13 T26 | A37 G25 C16 T29 | A25 G23 C20 T31 | A38 G16 C21 T23 |
Vaccinia virus | Tian Tan | AF095689 | A36 G12 C24 T19 | A32 G21 C24 T31 | A30 G15 C13 T26 | A37 G25 C16 T29 | A27 G22 C19 T31 | A38 G16 C21 T23 |
Vaccinia virus | Western Reserve | AY243312 | A36 G12 C24 T19 | A33 G20 C23 T32 | A30 G15 C13 T26 | A37 G25 C16 T29 | A27 G23 C18 T32 | A37 G17 C21 T23 |
Vaccinia virus | Ankara | U94848 | A36 G12 C24 T19 | A33 G20 C23 T32 | A30 G15 C13 T26 | A37 G25 C16 T29 | A25 G24 C20 T31 | A38 G16 C21 T23 |
Vaccinia virus | Rabbitpox Utrecht | AY484669 | A36 G12 C24 T19 | A33 G20 C23 T32 | A30 G15 C13 T26 | A37 G25 C16 T29 | A25 G24 C20 T31 | A37 G17 C21 T23 |
Variola major virus | Bangladesh-1975 | L22579 | A36 G11 C24 T20 | A33 G20 C20 T35 | A28 G16 C14 T26 | A36 G23 C15 T30 | A28 G21 C16 T35 | A36 G18 C21 T23 |
Variola major virus | India-1967 | S55844 | A36 G11 C24 T20 | A33 G20 C20 T35 | A28 G16 C14 T26 | A36 G23 C15 T30 | A28 G21 C16 T35 | A36 G18 C21 T23 |
Variola major virus | Garcia-1966 | Y16780 | A36 G11 C24 T20 | A34 G19 C21 T34 | A28 G16 C14 T26 | A36 G23 C15 T30 | A28 G21 C16 T35 | A36 G18 C21 T23 |
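The triangulation strategy described above can be illustrated with a minimal sketch. It assumes that a measured base composition is compared to the database entries by exact match, which is a simplification made for illustration and not necessarily the matching procedure used in practice. The dictionary layout and the function name `candidates` are hypothetical. The base counts are copied from Table 3 for primer pair nos. 310 and 296 only, for five of the listed isolates.

```python
# Base compositions (A, G, C, T) copied from Table 3 for primer pair
# nos. 310 and 296. Only a subset of isolates is included for brevity.
BASE_COUNTS = {
    ("Camelpox virus", "CMS"): {310: (38, 11, 23, 19), 296: (32, 20, 23, 33)},
    ("Camelpox virus", "M-96"): {310: (38, 11, 23, 19), 296: (32, 19, 23, 34)},
    ("Cowpox virus", "Brighton Red"): {310: (33, 14, 18, 26), 296: (36, 18, 23, 31)},
    ("Vaccinia virus", "Copenhagen"): {310: (38, 10, 24, 19), 296: (32, 21, 24, 31)},
    ("Vaccinia virus", "Tian Tan"): {310: (36, 12, 24, 19), 296: (32, 21, 24, 31)},
}


def candidates(measured):
    """Return the isolates consistent with every measured amplicon.

    `measured` maps a primer pair number to the (A, G, C, T) base
    composition determined for that amplicon. An isolate is kept only
    if its database entry matches all measured primer pairs exactly
    (hypothetical matching rule, chosen for illustration).
    """
    return [
        isolate
        for isolate, database in BASE_COUNTS.items()
        if all(database.get(pair) == comp for pair, comp in measured.items())
    ]


if __name__ == "__main__":
    # One primer pair alone can leave several isolates unresolved.
    print(candidates({310: (38, 11, 23, 19)}))
    # Adding a second locus narrows the candidate list.
    print(candidates({310: (38, 11, 23, 19), 296: (32, 19, 23, 34)}))
```

With only the primer pair no. 310 measurement, both Camelpox virus isolates remain as candidates. Adding the primer pair no. 296 measurement leaves M-96 alone. The converse also occurs in Table 3: the Copenhagen and Tian Tan strains of Vaccinia virus share a base composition for primer pair no. 296 but differ for primer pair no. 310.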
[0124] Various modifications of the invention, in addition to those
described herein, will be apparent to those skilled in the art from
the foregoing description. Such modifications are also intended to
fall within the scope of the appended claims. Each reference
(including, but not limited to, journal articles, U.S. and non-U.S.
patents, patent application publications, international patent
application publications, GenBank accession numbers, internet web
sites, and the like) cited in the present application is
incorporated herein by reference in its entirety. Those skilled in
the art will appreciate that numerous changes and modifications may
be made to the embodiments of the invention and that such changes
and modifications may be made without departing from the spirit of
the invention. It is therefore intended that the appended claims
cover all such equivalent variations as fall within the true spirit
and scope of the invention.
Sequence Listing
Number of SEQ ID NOs: 46. Every sequence is of type DNA, organism Artificial Sequence, with the description PCR Primer.
SEQ ID NO | Length | Sequence |
1 | 18 | gaagttgaac cgggatca |
2 | 23 | ctgtctgtag ataaactagg att |
3 | 18 | ctcctccatc actaggaa |
4 | 15 | cgatactacg gacgc |
5 | 18 | gtactgaatc cgcctaag |
6 | 23 | cgcgataata gatagtgcta aac |
7 | 20 | tagaagttga accgggatca |
8 | 24 | tctgtctgta gataaactag gatt |
9 | 19 | tctcctccat cactaggaa |
10 | 16 | tcgatactac ggacgc |
11 | 19 | tgtactgaat ccgcctaag |
12 | 24 | tcgcgataat agatagtgct aaac |
13 | 28 | tgatttcgta gaagttgaac cgggatca |
14 | 27 | ttctccctag aagttgaacc gggatca |
15 | 21 | tggtgacgat actacggacg c |
16 | 22 | tcggtgacga tactacggac gc |
17 | 22 | tcggtgacga tactacggac gc |
18 | 32 | tggaaaaaaa gtatctcctc catcactagg aa |
19 | 32 | tggaaagtat ctcctccatc actaggaaaa cc |
20 | 26 | tccctcctct cctccatcac taggaa |
21 | 33 | tctagtaaac gcgataatag atagtgctaa acg |
22 | 31 | tcctcctcgc gataatagat agtgctaaac g |
23 | 29 | tcctcccgcg ataatagata gtgctaaac |
24 | 21 | attatcggtc gttgttaatg t |
25 | 18 | cgttcttctc tggaggat |
26 | 22 | ctataacatt caaagcttat tg |
27 | 22 | ctttatgaat tactttacat at |
28 | 23 | gtgaataaag tatcgcccta ata |
29 | 19 | gcttccacca ggtcattaa |
30 | 22 | tattatcggt cgttgttaat gt |
31 | 19 | tcgttcttct ctggaggat |
32 | 23 | tctataacat tcaaagctta ttg |
33 | 23 | tctttatgaa ttactttaca tat |
34 | 24 | tgtgaataaa gtatcgccct aata |
35 | 20 | tgcttccacc aggtcattaa |
36 | 31 | tcgcgatttt attatcggtc gttgttaatg t |
37 | 30 | tccctcccta ttatcggtcg ttgttaatgt |
38 | 35 | tccctcccaa tatctttacg aattacttta catat |
39 | 31 | tcctccctcc catctttacg aattacttta c |
40 | 34 | tcctccctcc caatatcttt acgaattact ttac |
41 | 35 | tccctcccga aaactataac attcaaagct tattg |
42 | 33 | tccctccctc cctataacat tcaaagctta ttg |
43 | 30 | tcctccctcc ctaacattca aagcttattg |
44 | 25 | tgttcagctt ccaccaggtc attaa |
45 | 27 | tgtgttcagc ttccaccagg tcattaa |
46 | 24 | tcccagcttc caccaggtca ttaa |
* * * * *