U.S. patent number 7,368,242 [Application Number 11/152,460] was granted by the patent office on 2008-05-06 for method and kits for multiplex hybridization assays.
This patent grant is currently assigned to Affymetrix, Inc. Invention is credited to Thomas Matthew Daly, Dong-Jing Fu, Paul Hardenbol, Richard D. Hockett, James Ireland, Xin Miao.
United States Patent 7,368,242
Miao, et al.
May 6, 2008
Method and kits for multiplex hybridization assays
Abstract
The invention provides a method for genotyping interfering
polymorphic loci in a target polynucleotide, such as a strand of
genomic DNA, in a multiplex hybridization-based assay. The
invention also provides nucleic acid standards for validating the
performance of such hybridization-based assays. In one aspect, the
method of the invention is carried out by providing for each
interfering polymorphic locus one or more probes so that at least
one probe is capable of forming a perfectly matched duplex at the
locus regardless of the characteristic sequence of an adjacent
polymorphism.
Inventors: Miao; Xin (Menlo Park, CA), Ireland; James (San Francisco, CA), Hardenbol; Paul (San Francisco, CA), Daly; Thomas Matthew (Indianapolis, IN), Fu; Dong-Jing (Carmel, IN), Hockett; Richard D. (Fishers, IN)
Assignee: Affymetrix, Inc. (Santa Clara, CA)
Family ID: 37524502
Appl. No.: 11/152,460
Filed: June 14, 2005
Prior Publication Data
Document Identifier: US 20060281098 A1
Publication Date: Dec 14, 2006
Current U.S. Class: 435/6.14; 536/23.1; 536/24.3
Current CPC Class: C12Q 1/6827 (20130101); C12Q 1/6827 (20130101); C12Q 2521/501 (20130101)
Current International Class: C12Q 1/68 (20060101); C07H 21/02 (20060101); C07H 21/04 (20060101)
Field of Search: 435/6; 536/23.1, 24.3
References Cited
U.S. Patent Documents
Other References
US 5,962,233, 10/1999, Livak (withdrawn) cited by other .
The Stratagene Catalog p. 39 (1988). cited by examiner .
Shumaker et al., Mutation detection by solid phase primer
extension. Human Mutation 7 : 346-354 (1996). cited by examiner
.
Saiki et al. Analysis of enzymatically amplified β-globin and HLA-DQα
DNA with allele-specific oligonucleotide probes. Nature 324 :
163-166 (1986). cited by examiner .
Wen et al, "Rapid detection of the known SNPs of CYP2C9 using
oligonucleotide microarray," World Journal of Gastroenterology, 9:
1342-1346 (2003). cited by other .
Daly, "Pharmacogenetics of the major polymorphic metabolizing
enzymes," Fundamental & Clinical Pharmacology, 17: 27-41
(2003). cited by other .
Landi et al, "Evaluation of a microarray for genotyping
polymorphisms related to xenobiotic metabolism and DNA repair,"
BioTechniques, 35: 816-827 (2003). cited by other .
Chou et al, "Comparison of two CYP2D6 genotyping methods and
assessment of genotype-phenotype relationships," Clinical
Chemistry, 49: 542-551 (2003). cited by other .
Linder et al, "Pharmacogenetics: a laboratory tool for optimizing
therapeutic efficiency," Clinical Chemistry, 43: 254-266 (1997).
cited by other .
Greenspoon et al, "Validation and implementation of the PowerPlex
16 BIO system STR multiplex for forensic casework," J. Forensic
Sci., 49: 71-80 (2004). cited by other .
Hardenbol et al, "Multiplexed genotyping with sequence-tagged
molecular inversion probes," Nature Biotechnology, 21: 673-678
(2003). cited by other.
Primary Examiner: Whisenant; Ethan
Attorney, Agent or Firm: Wells; Sandra E.
Claims
We claim:
1. A kit of nucleic acid standards for determining the presence in
a probe mixture of selected probes specific for interfering
polymorphic loci, the kit comprising: a plurality of nucleic acid
standards provided in pairwise mixtures corresponding to different
homozygous and heterozygous combinations of non-interfering
polymorphic loci and different diploid combinations of haplotypes
of interfering polymorphic loci, wherein each nucleic acid standard
comprises a double stranded nucleic acid plasmid containing a
nucleic acid sequence for at least one haplotype of the interfering
polymorphic loci, wherein each nucleic acid standard is capable of
replication and wherein there is at least one nucleic acid standard
having a nucleic acid sequence complementary with each probe of the
probe mixture; and instructions that set forth a protocol
comprising the steps of: hybridizing under stringent hybridization
conditions each of said pairwise mixtures of said nucleic acid
standards with a mixture of probes, such that the mixture of probes
comprises probes specific for at least one of said nucleic acid
standards in said pairwise mixture; and detecting the presence of
probes in the mixture of probes that form stable duplexes with at
least one of said nucleic acid standards of said pairwise mixture
to determine whether such probes are present in the mixture.
2. The kit of claim 1 wherein said nucleic acid standards contain
nucleic acid sequences of said interfering polymorphic loci in
genes selected from HLA genes, tumor suppressor genes, cell cycle
control genes, oncogenes, or genes encoding xenobiotic metabolizing
enzymes.
3. The kit of claim 2 wherein said interfering polymorphic loci are
in genes encoding said xenobiotic metabolizing enzymes that are
selected from the group consisting of ABCB1, ABCC2, CYP1A2, CYP2A6,
CYP2B6, CYP2C19, CYP2C8, CYP2C9, CYP2D6, CYP3A4, CYP3A5, DPYD,
FMO3, GSTM1, NAT1, NAT2, SLC21A6, SLC22A1, SLC22A2, TPMT, and
UGT1A1.
4. The kit of claim 2 wherein said interfering polymorphic loci are
in genes encoding said xenobiotic metabolizing enzymes that are
selected from the group consisting of CYP1A1, CYP1B1, CYP2C18,
CYP3A7, GSTT1, GSTM3, GSTA1, UGT1A6, UGT1A7, UGT2B4, UGT2B7,
UGT2B15, ADH1B, ALDH2, APE1, CDKN2A, COMT, DRD2, DRD4, EPHX1,
ERCC1, ERCC2, ERCC4, ERCC5, GRPR, GSTA4, LIG3, MDM2, MGMT, MPG,
NQO1, OGG1, PCNA, POLB, SLC6A3, SOD2, TP53, XRCC1, XRCC2, XRCC3,
XRCC9, ABCB1, CYP2E1, GSTP1, SLC22A2, ABCC2, CYP2J2, NAT1, TPMT,
CDA, CYP3A4, NAT2, UGT1A1, CYP1A2, CYP3A5, CYP2C8, CYP2A6, DPYD,
CYP2C9, CYP2B6, FMO2, SLC15A2, CYP2C19, FMO3, SLC21A6, CYP2D6,
GSTM1, and SLC22A1.
Description
FIELD OF THE INVENTION
The present invention relates to methods and kits for multiplexed
hybridization-based assays, particularly for genotyping
applications.
BACKGROUND
Many high throughput approaches for analyzing genetic processes and
variation make use of complex mixtures of oligonucleotides to
detect, sort, or manipulate gene products and/or genomic fragments,
e.g. Brenner et al, Proc. Natl. Acad. Sci., 97: 1665-1670 (2000);
Church et al, Science, 240: 185-188 (1988); Chee et al, Science,
274: 610-614 (1996); Shoemaker et al, Nature Genetics, 14: 450-456
(1996); Hardenbol et al, Nature Biotechnology, 21: 673-678 (2003);
Kennedy et al, Nature Biotechnology, 21: 1233-1237 (2003); and the
like. Such techniques are starting to be employed to genotype
individuals to determine susceptibilities to a variety of
conditions, including cancer, adverse drug reactions,
responsiveness to targeted therapeutics, and the like, particularly
in clinical trial settings. As these complex hybridization-based
techniques move out of research laboratories and into medical and
diagnostic applications, there will be a critical need to ensure
that readouts based on the techniques are robust and valid, e.g.
Food and Drug Administration, "Class II special controls guidance
document: Instrumentation for clinical multiplex test systems,"
Guidance for Industry and FDA Staff (Mar. 10, 2005).
When polymorphisms are closely spaced along a gene or genome,
certain polymorphisms, particularly insertions or deletions, at one
locus may interfere with the detection of a polymorphism at
adjacent loci in hybridization-based assays because of anomalous
hybridization and/or interference among probes. This situation
makes it difficult to determine whether a lack of signal in a
readout is due to the absence of a polymorphism, probe degradation,
probe interference, or other problems, e.g. Landi et al,
BioTechniques, 35: 816-827 (2003). The difficulty of such
determinations is exacerbated when highly complex probes are used
that comprise hundreds, or even thousands, of hybridizing
components.
Such difficulties may be crucial when hybridization-based assays
are used to genotype a large set of xenobiotic metabolizing genes
to determine an effective dosage of a drug for a patient.
Metabolism of xenobiotic substances, such as drugs, is a chemical
process, by which the body structurally modifies foreign compounds
to enhance their solubility and facilitate their excretion. This
involves two distinct metabolic phases: enzymatic oxidation,
reduction, and hydrolysis reactions, which expose or add functional
groups to produce polar molecules (Phase I metabolism) and addition
of endogenous compounds to the molecules to further increase
polarity (Phase II metabolism). The bulk of responsibility for the
Phase I reactions rests on the cytochrome P450 (CYP450) superfamily
of enzymes. The CYP450 family consists of 60 to 100 different
monooxygenases that catalyze the oxidative metabolism of lipophilic
chemicals. These, together with several members of different
families of transport proteins, play a crucial role in the
disposition and elimination of a diverse array of therapeutic drugs
and other xenobiotics. It is now well established that significant
inter-individual variability exists in patient drug disposition and
response. Much of the observed heterogeneity is thought to be due
to the underlying genetic variation in the human population.
Individual differences at a single nucleotide of DNA, otherwise
known as single nucleotide polymorphisms (SNPs), are the most
abundant source of genetic variation in humans. Many SNPs with
potential for altering the activity of proteins involved in drug
metabolism, such as the CYP450s, have been found, e.g. Daly,
Fundamental & Clinical Pharmacology, 17: 27-41 (2003).
Phenotypes resulting from these genetic changes can markedly
influence a drug's pharmacokinetics or change its efficacy and/or
toxicity profile. Several examples exist where subjects carrying
certain alleles suffer from a lack of drug efficacy, due to
ultrarapid metabolism (UM) or, alternatively, adverse effects from
the drug treatment due to impaired drug clearance by poor
metabolism (PM). In current clinical practice, the suitability of a
drug for a given individual is determined by trial and error. This
practice places a significant burden on healthcare systems and
increases costs. Having an accurate genetic profile of a patient's drug
metabolizing genes would help ensure that the patient receives the
most effective treatment, while avoiding inadvertent adverse drug
reactions in poor metabolizers.
In view of this, it would be highly desirable to have available
multiplexed hybridization-based assays that could accommodate
interfering polymorphisms and methods and compositions that would
allow one to factor out specific causes for signal loss or variance
in such assays. Such assays would be especially useful in the fields
of medicine and drug development, where information from such assays
is being increasingly used in decisions about patients and
products.
SUMMARY OF THE INVENTION
The invention provides a method for detecting multiple nucleic acid
targets that occur at closely adjacent loci of the same
polynucleotide, such as a strand of genomic DNA, in a multiplex
hybridization-based assay. The invention also provides nucleic acid
standards for validating the performance of such
hybridization-based assays. In one aspect, the method of the
invention is carried out by providing for each interfering
polymorphic locus one or more probes so that at least one probe is
capable of forming a perfectly matched duplex at the locus regardless
of the characteristic sequence of an adjacent polymorphism. One
embodiment of such method is carried out by the following steps:
(i) providing for substantially every allele of each locus of
interfering polymorphic loci a probe for substantially every allele
of each adjacent locus of the interfering polymorphic loci, each
allele of the adjacent loci having a characteristic sequence, and
each such probe being capable of forming a stable duplex with the
characteristic sequence of a different allele of each such adjacent
locus; (ii) hybridizing the probes to a target polynucleotide
containing the interfering polymorphic loci under conditions that
allow stable duplexes to form whenever a probe has a sequence
complementary to a locus of the interfering polymorphic loci and a
characteristic sequence of an allele of an adjacent locus of the
interfering polymorphic loci; and (iii) detecting the presence of
probes forming stable duplexes with the target polynucleotide to
determine the genotype of the interfering polymorphic loci.
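Step (i) above can be illustrated with a minimal sketch. It assumes two hypothetical interfering loci whose allele names, characteristic sequences, and spacer are invented for illustration, and the function name `enumerate_probe_targets` is likewise an assumption rather than anything defined in this patent. The sketch enumerates one target sequence per combination of alleles, so that for every allele at one locus there is a candidate probe target for every allele at the adjacent locus.

```python
from itertools import product

# Hypothetical characteristic sequences for two closely spaced loci.
# Allele names and sequences are invented for illustration only.
LOCI = {
    "locus_1": {"A": "ACGTA", "G": "ACGTG"},
    "locus_2": {"ref": "TTCAG", "del": "TTAG"},
}


def enumerate_probe_targets(loci, spacer="CC"):
    """Return one target sequence per combination of alleles across the loci.

    The spacer stands in for the invariant sequence between the loci
    (an assumption of this sketch).
    """
    names = list(loci)
    targets = {}
    for combo in product(*(loci[name].items() for name in names)):
        alleles = tuple(allele for allele, _ in combo)
        sequence = spacer.join(seq for _, seq in combo)
        targets[alleles] = sequence
    return targets


targets = enumerate_probe_targets(LOCI)
for alleles, sequence in targets.items():
    print(alleles, sequence)
```

With two alleles at each locus the sketch yields four targets. A real probe set would be designed against the complements of such sequences and would depend on the assay chemistry.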
In another aspect, the invention provides kits comprising nucleic
acid standards for validating the performance of
hybridization-based assays in detecting genotypes at interfering
polymorphic loci. In one embodiment, such kits comprise a plurality
of nucleic acid standards, each nucleic acid standard comprising a
double stranded nucleic acid containing characteristic sequences of
polymorphisms of two or more interfering polymorphic loci. In one
aspect, nucleic acid standards are capable of replication. Kits of
the invention may further include probes for a hybridization-based
assay, wherein such assay includes probes specific for at least one
interfering polymorphic locus.
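Claim 1 describes the standards as provided in pairwise mixtures corresponding to different diploid combinations of haplotypes. The following sketch shows one way such mixtures might be enumerated. The haplotype labels and the function name `diploid_mixtures` are hypothetical, and the sketch assumes that every unordered pair of haplotypes, including homozygous pairs, is of interest.

```python
from itertools import combinations_with_replacement

# Hypothetical haplotype labels for a pair of interfering loci.
HAPLOTYPES = ["A-ref", "A-del", "G-ref", "G-del"]


def diploid_mixtures(haplotypes):
    """List unordered pairs of haplotypes, including homozygous pairs."""
    return list(combinations_with_replacement(haplotypes, 2))


mixtures = diploid_mixtures(HAPLOTYPES)
for first, second in mixtures:
    kind = "homozygous" if first == second else "heterozygous"
    print(f"{first} + {second} ({kind})")
```

For n haplotypes this gives n(n+1)/2 mixtures, so the four hypothetical haplotypes above produce ten.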
BRIEF DESCRIPTION OF THE FIGURES
FIGS. 1A-1C diagrammatically illustrate the concept of interfering
polymorphic loci for two and three polymorphic loci.
FIGS. 2A-2B illustrate a pattern of signals generated by probes
specific for a pair of interfering polymorphic loci.
FIG. 3 illustrates a molecular inversion probe that may be used
with the invention.
FIGS. 4A-4D illustrate aspects of nucleic acid standards of the
invention.
DEFINITIONS
Terms and symbols of nucleic acid chemistry, biochemistry,
genetics, and molecular biology used herein follow those of
standard treatises and texts in the field, e.g. Kornberg and Baker,
DNA Replication, Second Edition (W.H. Freeman, New York, 1992);
Lehninger, Biochemistry, Second Edition (Worth Publishers, New
York, 1975); Strachan and Read, Human Molecular Genetics, Second
Edition (Wiley-Liss, New York, 1999); Eckstein, editor,
Oligonucleotides and Analogs: A Practical Approach (Oxford
University Press, New York, 1991); Gait, editor, Oligonucleotide
Synthesis: A Practical Approach (IRL Press, Oxford, 1984); and the
like.
"Addressable" in reference to tag complements means that the
nucleotide sequence, or perhaps other physical or chemical
characteristics, of an end-attached probe, such as a tag
complement, can be determined from its address, i.e. a one-to-one
correspondence between the sequence or other property of the
end-attached probe and a spatial location on, or characteristic of,
the solid phase support to which it is attached. Preferably, an
address of a tag complement is a spatial location, e.g. the planar
coordinates of a particular region containing copies of the
end-attached probe. However, end-attached probes may be addressed
in other ways too, e.g. by microparticle size, shape, color,
frequency of micro-transponder, or the like, e.g. Chandler et al,
PCT publication WO 97/14028.
"Amplicon" means the product of a polynucleotide amplification
reaction. That is, it is a population of polynucleotides, usually
double stranded, that are replicated from one or more starting
sequences. The one or more starting sequences may be one or more
copies of the same sequence, or they may be a mixture of different
sequences. Amplicons may be produced by a variety of amplification
reactions whose products are multiple replicates of one or more
target nucleic acids. Generally, amplification reactions producing
amplicons are "template-driven" in that the reactants, either
nucleotides or oligonucleotides, have complements in a template
polynucleotide, and base pairing with those complements is required
for the creation of reaction products. In one aspect, template-driven
reactions are
primer extensions with a nucleic acid polymerase or oligonucleotide
ligations with a nucleic acid ligase. Such reactions include, but
are not limited to, polymerase chain reactions (PCRs), linear
polymerase reactions, nucleic acid sequence-based amplification
(NASBAs), rolling circle amplifications, and the like, disclosed in
the following references that are incorporated herein by reference:
Mullis et al, U.S. Pat. Nos. 4,683,195; 4,965,188; 4,683,202;
4,800,159 (PCR); Gelfand et al, U.S. Pat. No. 5,210,015 (real-time
PCR with "taqman" probes); Wittwer et al, U.S. Pat. No. 6,174,670;
Kacian et al, U.S. Pat. No. 5,399,491 ("NASBA"); Lizardi, U.S. Pat.
No. 5,854,033; Aono et al, Japanese patent publ. JP 4-262799
(rolling circle amplification); and the like. In one aspect,
amplicons of the invention are produced by PCRs. An amplification
reaction may be a "real-time" amplification if a detection
chemistry is available that permits a reaction product to be
measured as the amplification reaction progresses, e.g. "real-time
PCR" described below, or "real-time NASBA" as described in Leone et
al, Nucleic Acids Research, 26: 2150-2155 (1998), and like
references. As used herein, the term "amplifying" means performing
an amplification reaction. A "reaction mixture" means a solution
containing all the necessary reactants for performing a reaction,
which may include, but not be limited to, buffering agents to
maintain pH at a selected level during a reaction, salts,
co-factors, scavengers, and the like.
"Complementary or substantially complementary" refers to the
hybridization or base pairing or the formation of a duplex between
nucleotides or nucleic acids, such as, for instance, between the
two strands of a double stranded DNA molecule or between an
oligonucleotide primer and a primer binding site on a single
stranded nucleic acid. Complementary nucleotides are, generally, A
and T (or A and U), or C and G. Two single stranded RNA or DNA
molecules are said to be substantially complementary when the
nucleotides of one strand, optimally aligned and compared and with
appropriate nucleotide insertions or deletions, pair with at least
about 80% of the nucleotides of the other strand, usually at least
about 90% to 95%, and more preferably from about 98 to 100%.
Alternatively, substantial complementarity exists when an RNA or
DNA strand will hybridize under selective hybridization conditions
to its complement. Typically, selective hybridization will occur
when there is at least about 65% complementarity over a stretch of at
least 14 to 25 nucleotides, preferably at least about 75%, more
preferably at least about 90% complementarity. See M. Kanehisa,
Nucleic Acids Res. 12:203 (1984), incorporated herein by
reference.
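The percentage thresholds in this definition can be made concrete with a short sketch. It is a simplification: it assumes two DNA strands of equal length that pair in antiparallel orientation with no insertions or deletions, whereas the definition above allows for them. The function names and example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def percent_complementarity(strand_a, strand_b):
    """Percent of positions in strand_a that pair with antiparallel strand_b.

    Simplifying assumption: equal lengths and no gaps.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch requires equal-length strands")
    partner = reverse_complement(strand_b)
    matches = sum(a == b for a, b in zip(strand_a, partner))
    return 100.0 * matches / len(strand_a)


def is_substantially_complementary(strand_a, strand_b, threshold=80.0):
    """Apply the 'at least about 80%' criterion from the definition."""
    return percent_complementarity(strand_a, strand_b) >= threshold


probe = "ACGTACGTAC"
perfect_target = "GTACGTACGT"    # exact reverse complement of the probe
one_mismatch = "GTACGTACGA"      # same target with one altered base
print(percent_complementarity(probe, perfect_target))
print(percent_complementarity(probe, one_mismatch))
```

The default threshold of 80% can be raised to 90, 95, or 98 to mirror the narrower ranges given in the definition.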
"Duplex" means at least two oligonucleotides and/or polynucleotides
that are fully or partially complementary and undergo Watson-Crick type
base pairing among all or most of their nucleotides so that a
stable complex is formed. The terms "annealing" and "hybridization"
are used interchangeably to mean the formation of a stable duplex.
In one aspect, stable duplex means that a duplex structure is not
destroyed by a stringent wash, e.g. conditions including a temperature
of about 5° C. less than the Tm of a strand of the
duplex and low monovalent salt concentration, e.g. less than 0.2 M,
or less than 0.1 M. "Perfectly matched" in reference to a duplex
means that the poly- or oligonucleotide strands making up the
duplex form a double stranded structure with one another such that
every nucleotide in each strand undergoes Watson-Crick basepairing
with a nucleotide in the other strand. The term "duplex"
comprehends the pairing of nucleoside analogs, such as
deoxyinosine, nucleosides with 2-aminopurine bases, PNAs, and the
like, that may be employed. A "mismatch" in a duplex between two
oligonucleotides or polynucleotides means that a pair of
nucleotides in the duplex fails to undergo Watson-Crick
bonding.
"Genetic locus," or "locus" in reference to a genome or target
polynucleotide, means a contiguous subregion or segment of the
genome or target polynucleotide. As used herein, genetic locus, or
locus, may refer to the position of a nucleotide, a gene, or a
portion of a gene in a genome, including mitochondrial DNA, or it
may refer to any contiguous portion of genomic sequence whether or
not it is within, or associated with, a gene. In one aspect, a
genetic locus refers to any portion of genomic sequence, including
mitochondrial DNA, from a single nucleotide to a segment of a few
hundred nucleotides, e.g. 100-300, in length. Usually, a particular
genetic locus may be identified by its nucleotide sequence, or the
nucleotide sequence, or sequences, of one or both adjacent or
flanking regions.
"Hybridization" refers to the process in which two single-stranded
polynucleotides bind non-covalently to form a stable
double-stranded polynucleotide. The term "hybridization" may also
refer to triple-stranded hybridization. The resulting (usually)
double-stranded polynucleotide is a "hybrid" or "duplex."
"Hybridization conditions" will typically include salt
concentrations of less than about 1M, more usually less than about
500 mM and less than about 200 mM. Hybridization temperatures can
be as low as 5° C., but are typically greater than
22° C., more typically greater than about 30° C., and
preferably in excess of about 37° C. Hybridizations are
usually performed under stringent conditions, i.e. conditions under
which a probe will hybridize to its target subsequence. Stringent
conditions are sequence-dependent and are different in different
circumstances. Longer fragments may require higher hybridization
temperatures for specific hybridization. As other factors may
affect the stringency of hybridization, including base composition
and length of the complementary strands, presence of organic
solvents and extent of base mismatching, the combination of
parameters is more important than the absolute measure of any one
alone. Generally, stringent conditions are selected to be about
5° C. lower than the Tm for the specific sequence at a
defined ionic strength and pH. Exemplary stringent conditions
include salt concentration of at least 0.01 M to no more than 1 M
Na ion concentration (or other salts) at a pH 7.0 to 8.3 and a
temperature of at least 25° C. For example, conditions of
5×SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4)
and a temperature of 25-30° C. are suitable for
allele-specific probe hybridizations. For stringent conditions, see
for example, Sambrook, Fritsch and Maniatis, "Molecular Cloning: A
Laboratory Manual," 2nd Ed., Cold Spring Harbor Press (1989) and
Anderson, "Nucleic Acid Hybridization," 1st Ed., BIOS Scientific
Publishers Limited (1999), which are hereby incorporated by
reference in their entirety for all purposes above. "Hybridizing
specifically to" or "specifically hybridizing to" or like
expressions refer to the binding, duplexing, or hybridizing of a
molecule substantially to or only to a particular nucleotide
sequence or sequences under stringent conditions when that sequence
is present in a complex mixture (e.g., total cellular DNA or
RNA).
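The rule of thumb that stringent conditions sit about 5 degrees C. below the melting temperature can be sketched numerically. This text does not specify how the melting temperature is obtained, so the sketch substitutes the Wallace rule, a rough approximation for short oligonucleotides (2 degrees per A or T and 4 degrees per G or C), purely as an assumption. The function names and the example probe are hypothetical, and the approximation ignores salt concentration and mismatches.

```python
def wallace_tm(seq):
    """Rough melting temperature in degrees C for a short oligonucleotide.

    Uses the Wallace rule, 2*(A+T) + 4*(G+C), which is an approximation
    substituted here for illustration only.
    """
    seq = seq.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def stringent_temperature(seq, offset=5):
    """Temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(seq) - offset


probe = "ACGTACGTACGTACGT"  # hypothetical 16-mer
print(wallace_tm(probe))             # 48
print(stringent_temperature(probe))  # 43
```

For an actual assay the melting temperature would be determined for the specific sequence at the defined ionic strength and pH, as the definition states, rather than estimated this way.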
"Hybridization-based assay" means any assay that relies on the
formation of a stable duplex or triplex between a probe and a
target nucleotide sequence for detecting or measuring such a
sequence. In one aspect, probes of such assays anneal to (or form
duplexes with) regions of target sequences in the range of from 8
to 100 nucleotides; or in other aspects, they anneal to target
sequences in the range of from 8 to 40 nucleotides, or more
usually, in the range of from 8 to 20 nucleotides. A "probe" in
reference to a hybridization-based assay means a polynucleotide that
has a sequence that is capable of forming a stable hybrid (or
triplex) with its complement in a target nucleic acid and that is
capable of being detected, either directly or indirectly.
Hybridization-based assays include, without limitation, assays
based on use of oligonucleotides, such as polymerase chain
reactions, NASBA reactions, oligonucleotide ligation reactions,
single-base extensions of primers, circularizable probe reactions,
allele-specific oligonucleotide hybridizations, either in solution
phase or bound to solid phase supports, such as microarrays or
microbeads. There is extensive guidance in the literature on
hybridization-based assays, e.g. Hames et al, editors, Nucleic Acid
Hybridization: A Practical Approach (IRL Press, Oxford, 1985);
Tijssen, Hybridization with Nucleic Acid Probes, Parts I & II
(Elsevier Publishing Company, 1993); Hardiman, Microarray Methods
and Applications (DNA Press, 2003); Schena, editor, DNA Microarrays:
A Practical Approach (IRL Press, Oxford, 1999); and the like. In
one aspect, hybridization-based assays are solution phase assays;
that is, both probes and target sequences hybridize under
conditions that are substantially free of surface effects or
influences on reaction rate. A solution phase assay may include
circumstances where either probes or target sequences are attached
to microbeads.
"Interfering polymorphic loci" mean closely spaced loci having
sequence variants, or alleles, usually insertions, deletions, or
substitutions, that are sought to be determined by a
hybridization-based assay. In one aspect, interfering polymorphic
loci are a pair of closely spaced loci in which at least one locus
of the pair contains two or more alternative forms, each having a
characteristic sequence, such that the presence of at least one
characteristic sequence destabilizes a probe specific for the other
locus of the pair on the same DNA strand. Characteristic sequences
of alleles may be identified in conventional databases, e.g. dbSNP,
or the like. The region of a target polynucleotide or genome that
interfering polymorphic loci span depends in part on the nature of
the probes employed in a hybridization-based assay. Thus, in one
aspect, members of a pair of interfering polymorphic loci are
within 40 nucleotides of one another; or in another aspect such
members may be within 20 nucleotides of one another.
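The spacing criterion in this definition lends itself to a short sketch that flags candidate pairs of interfering loci by distance alone. The locus names, positions, and function name are hypothetical. Proximity is only a screening proxy here, since the definition also requires that an allele at one locus destabilizes a probe specific for the other.

```python
from itertools import combinations


def interfering_pairs(positions, max_distance=40):
    """Return pairs of locus names whose positions lie within max_distance.

    `positions` maps a locus name to a nucleotide coordinate on the same
    strand; both the names and coordinates are assumptions of this sketch.
    """
    pairs = []
    for (name_a, pos_a), (name_b, pos_b) in combinations(
        sorted(positions.items()), 2
    ):
        if abs(pos_a - pos_b) <= max_distance:
            pairs.append((name_a, name_b))
    return pairs


positions = {"snp_1": 1000, "snp_2": 1015, "snp_3": 1038, "snp_4": 1500}
print(interfering_pairs(positions))                   # within 40 nucleotides
print(interfering_pairs(positions, max_distance=20))  # within 20 nucleotides
```

The two calls correspond to the 40-nucleotide and 20-nucleotide aspects described above.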
"Kit" refers to any delivery system for delivering materials or
reagents for carrying out a method of the invention. In the context
of assays, such delivery systems include systems that allow for the
storage, transport, or delivery of reaction reagents (e.g., probes,
enzymes, etc. in the appropriate containers) and/or supporting
materials (e.g., buffers, written instructions for performing the
assay etc.) from one location to another. For example, kits include
one or more enclosures (e.g., boxes) containing the relevant
reaction reagents and/or supporting materials for assays of the
invention. In one aspect, kits of the invention comprise probes
specific for interfering polymorphic loci. In another aspect, kits
comprise nucleic acid standards for validating the performance of
probes specific for interfering polymorphic loci. Such contents may
be delivered to the intended recipient together or separately. For
example, a first container may contain an enzyme for use in an
assay, while a second container contains probes.
"Ligation" means to form a covalent bond or linkage between the
termini of two or more nucleic acids, e.g. oligonucleotides and/or
polynucleotides, in a template-driven reaction. The nature of the
bond or linkage may vary widely and the ligation may be carried out
enzymatically or chemically. As used herein, ligations are usually
carried out enzymatically to form a phosphodiester linkage between
a 5' carbon of a terminal nucleotide of one oligonucleotide and a 3'
carbon of another oligonucleotide. A variety of template-driven
ligation reactions are described in the following references, which
are incorporated by reference: Whitely et al, U.S. Pat. No.
4,883,750; Letsinger et al, U.S. Pat. No. 5,476,930; Fung et al,
U.S. Pat. No. 5,593,826; Kool, U.S. Pat. No. 5,426,180; Landegren
et al, U.S. Pat. No. 5,871,921; Xu and Kool, Nucleic Acids
Research, 27: 875-881 (1999); Higgins et al, Methods in Enzymology,
68: 50-71 (1979); Engler et al, The Enzymes, 15: 3-29 (1982); and
Namsaraev, U.S. patent publication 2004/0110213.
"Microarray" refers to a solid phase support having a planar
surface, which carries an array of nucleic acids, each member of
the array comprising identical copies of an oligonucleotide or
polynucleotide immobilized to a spatially defined region or site,
which does not overlap with those of other members of the array;
that is, the regions or sites are spatially discrete. Spatially
defined hybridization sites may additionally be "addressable" in
that their locations and the identities of their immobilized
oligonucleotides are known or predetermined, for example, prior to
use. Typically, the oligonucleotides or polynucleotides are
single stranded and are covalently attached to the solid phase
support, usually by a 5'-end or a 3'-end. The density of
non-overlapping regions containing nucleic acids in a microarray is
typically greater than 100 per cm², and more preferably,
greater than 1000 per cm². Microarray technology is reviewed
in the following references: Schena, Editor, Microarrays: A
Practical Approach (IRL Press, Oxford, 2000); Southern, Current
Opin. Chem. Biol., 2: 404-410 (1998); Nature Genetics Supplement,
21: 1-60 (1999). As used herein, "random microarray" refers to a
microarray whose spatially discrete regions of oligonucleotides or
polynucleotides are not spatially addressed. That is, the identity
of the attached oligonucleotides or polynucleotides is not
discernible, at least initially, from their location. In one aspect,
random microarrays are planar arrays of microbeads wherein each
microbead has attached a single kind of hybridization tag
complement, such as from a minimally cross-hybridizing set of
oligonucleotides. Arrays of microbeads may be formed in a variety
of ways, e.g. Brenner et al, Nature Biotechnology, 18: 630-634
(2000); Tulley et al, U.S. Pat. No. 6,133,043; Stuelpnagel et al,
U.S. Pat. No. 6,396,995; Chee et al, U.S. Pat. No. 6,544,732; and
the like. Likewise, after formation, microbeads, or
oligonucleotides thereof, in a random array may be identified in a
variety of ways, including by optical labels, e.g. fluorescent dye
ratios or quantum dots, shape, sequence analysis, or the like.
"Nucleoside" as used herein includes the natural nucleosides,
including 2'-deoxy and 2'-hydroxyl forms, e.g. as described in
Kornberg and Baker, DNA Replication, 2nd Ed. (Freeman, San
Francisco, 1992). "Analogs" in reference to nucleosides includes
synthetic nucleosides having modified base moieties and/or modified
sugar moieties, e.g. described by Scheit, Nucleotide Analogs (John
Wiley, New York, 1980); Uhlman and Peyman, Chemical Reviews, 90:
543-584 (1990), or the like, with the proviso that they are capable
of specific hybridization. Such analogs include synthetic
nucleosides designed to enhance binding properties, reduce
complexity, increase specificity, and the like. Polynucleotides
comprising analogs with enhanced hybridization or nuclease
resistance properties are described in Uhlman and Peyman (cited
above); Crooke et al, Exp. Opin. Ther. Patents, 6: 855-870 (1996);
Mesmaeker et al, Current Opinion in Structural Biology, 5: 343-355
(1995); and the like. Exemplary types of polynucleotides that are
capable of enhancing duplex stability include oligonucleotide
N3'→P5' phosphoramidates (referred to herein as "amidates"),
peptide nucleic acids (referred to herein as "PNAs"),
oligo-2'-O-alkylribonucleotides, polynucleotides containing C-5
propynylpyrimidines, locked nucleic acids (LNAs), and like
compounds. Such oligonucleotides are either available commercially
or may be synthesized using methods described in the
literature.
"Polymerase chain reaction," or "PCR," means a reaction for the in
vitro amplification of specific DNA sequences by the simultaneous
primer extension of complementary strands of DNA. In other words,
PCR is a reaction for making multiple copies or replicates of a
target nucleic acid flanked by primer binding sites, such reaction
comprising one or more repetitions of the following steps: (i)
denaturing the target nucleic acid, (ii) annealing primers to the
primer binding sites, and (iii) extending the primers by a nucleic
acid polymerase in the presence of nucleoside triphosphates.
Usually, the reaction is cycled through different temperatures
optimized for each step in a thermal cycler instrument. Particular
temperatures, durations at each step, and rates of change between
steps depend on many factors well-known to those of ordinary skill
in the art, e.g. exemplified by the references: McPherson et al,
editors, PCR: A Practical Approach and PCR2: A Practical Approach
(IRL Press, Oxford, 1991 and 1995, respectively). For example, in a
conventional PCR using Taq DNA polymerase, a double stranded target
nucleic acid may be denatured at a temperature >90° C.,
primers annealed at a temperature in the range 50-75° C.,
and primers extended at a temperature in the range 72-78° C.
The term "PCR" encompasses derivative forms of the reaction,
including but not limited to, RT-PCR, real-time PCR, nested PCR,
quantitative PCR, multiplexed PCR, and the like. Reaction volumes
range from a few hundred nanoliters, e.g. 200 nL, to a few hundred
µL, e.g. 200 µL. "Reverse transcription PCR," or "RT-PCR,"
means a PCR that is preceded by a reverse transcription reaction
that converts a target RNA to a complementary single stranded DNA,
which is then amplified, e.g. Tecott et al, U.S. Pat. No.
5,168,038, which patent is incorporated herein by reference.
"Real-time PCR" means a PCR for which the amount of reaction
product, i.e. amplicon, is monitored as the reaction proceeds.
There are many forms of real-time PCR that differ mainly in the
detection chemistries used for monitoring the reaction product,
e.g. Gelfand et al, U.S. Pat. No. 5,210,015 ("taqman"); Wittwer et
al, U.S. Pat. Nos. 6,174,670 and 6,569,627 (intercalating dyes);
Tyagi et al, U.S. Pat. No. 5,925,517 (molecular beacons); which
patents are incorporated herein by reference. Detection chemistries
for real-time PCR are reviewed in Mackay et al, Nucleic Acids
Research, 30: 1292-1305 (2002), which is also incorporated herein
by reference. "Nested PCR" means a two-stage PCR wherein the
amplicon of a first PCR becomes the sample for a second PCR using a
new set of primers, at least one of which binds to an interior
location of the first amplicon. As used herein, "initial primers"
in reference to a nested amplification reaction mean the primers
used to generate a first amplicon, and "secondary primers" mean the
one or more primers used to generate a second, or nested, amplicon.
"Multiplexed PCR" means a PCR wherein multiple target sequences (or
a single target sequence and one or more reference sequences) are
simultaneously amplified in the same reaction mixture, e.g.
Bernard et al, Anal. Biochem., 273: 221-228 (1999)(two-color
real-time PCR). Usually, distinct sets of primers are employed for
each sequence being amplified. "Quantitative PCR" means a PCR
designed to measure the abundance of one or more specific target
sequences in a sample or specimen. Quantitative PCR includes both
absolute quantitation and relative quantitation of such target
sequences. Quantitative measurements are made using one or more
reference sequences that may be assayed separately or together with
a target sequence. The reference sequence may be endogenous or
exogenous to a sample or specimen, and in the latter case, may
comprise one or more competitor templates. Typical endogenous
reference sequences include segments of transcripts of the
following genes: β-actin, GAPDH, β2-microglobulin,
ribosomal RNA, and the like. Techniques for quantitative PCR are
well-known to those of ordinary skill in the art, as exemplified in
the following references that are incorporated by reference:
Freeman et al, Biotechniques, 26: 112-126 (1999); Becker-Andre et
al, Nucleic Acids Research, 17: 9437-9447 (1989); Zimmerman et al,
Biotechniques, 21: 268-279 (1996); Diviacco et al, Gene, 122:
3013-3020 (1992); Becker-Andre et al, Nucleic Acids Research, 17:
9437-9446 (1989); and the like.
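The temperature ranges quoted above for a conventional PCR using Taq DNA polymerase can be captured in a small check. The following is a minimal sketch and not part of the disclosed method; the dictionary layout, the function name `check_cycle`, the 100° C. ceiling on denaturation, and the example program are assumptions made for illustration.

```python
# Temperature windows (degrees C) quoted in the text for conventional Taq PCR.
# The 100 degree ceiling on denaturation is an assumption; the text says only ">90".
TAQ_WINDOWS = {
    "denature": (90.0, 100.0),
    "anneal": (50.0, 75.0),
    "extend": (72.0, 78.0),
}


def check_cycle(program):
    """Return the names of steps whose temperature falls outside its window."""
    problems = []
    for step, temp in program.items():
        low, high = TAQ_WINDOWS[step]
        # Denaturation must be strictly above 90; the other ranges are inclusive.
        if step == "denature":
            ok = low < temp <= high
        else:
            ok = low <= temp <= high
        if not ok:
            problems.append(step)
    return problems


program = {"denature": 94.0, "anneal": 58.0, "extend": 72.0}  # hypothetical program
print(check_cycle(program))
```

An empty list means every step lies inside the quoted window; actual temperatures, durations, and ramp rates depend on the factors discussed in the cited references.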
"Polymorphism" or "genetic variant" means a substitution,
inversion, insertion, or deletion of one or more nucleotides at a
genetic locus, or a translocation of DNA from one genetic locus to
another genetic locus. In one aspect, polymorphism means one of
multiple alternative nucleotide sequences that may be present at a
genetic locus of an individual and that may comprise a nucleotide
substitution, insertion, or deletion with respect to other
sequences at the same locus in the same individual, or other
individuals within a population. An individual may be homozygous or
heterozygous at a genetic locus; that is, an individual may have
the same nucleotide sequence in both alleles, or have a different
nucleotide sequence in each allele, respectively. In one aspect,
insertions or deletions at a genetic locus comprise the addition
or the absence of from 1 to 10 nucleotides at such locus, in
comparison with the same locus in another individual of a
population (or another allele in the same individual). Usually,
insertions or deletions are with respect to a major allele at a
locus within a population, e.g. an allele present in a population
at a frequency of fifty percent or greater.
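The terms homozygous, heterozygous, and major allele defined above can be illustrated as follows. This is a minimal sketch; the function names, the string representation of alleles, and the example population are assumptions made for illustration.

```python
from collections import Counter


def zygosity(allele_a, allele_b):
    """Classify an individual from the two alleles carried at one locus."""
    return "homozygous" if allele_a == allele_b else "heterozygous"


def major_alleles(population_alleles, threshold=0.5):
    """Return alleles present at a frequency of fifty percent or greater."""
    counts = Counter(population_alleles)
    total = sum(counts.values())
    return sorted(a for a, n in counts.items() if n / total >= threshold)


# Hypothetical population of 10 chromosomes; "-" stands for a deletion allele.
population = ["AACT"] * 7 + ["-"] * 3
print(zygosity("AACT", "-"))
print(major_alleles(population))
```

Here the deletion would be described with respect to the undeleted sequence, since that sequence is the major allele in the hypothetical population.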
"Polynucleotide" or "oligonucleotide" are used interchangeably and
each mean a linear polymer of nucleotide monomers. Monomers making
up polynucleotides and oligonucleotides are capable of specifically
binding to a natural polynucleotide by way of a regular pattern of
monomer-to-monomer interactions, such as Watson-Crick type of base
pairing, base stacking, Hoogsteen or reverse Hoogsteen types of
base pairing, or the like. Such monomers and their internucleosidic
linkages may be naturally occurring or may be analogs thereof, e.g.
naturally occurring or non-naturally occurring analogs.
Non-naturally occurring analogs may include PNAs, phosphorothioate
internucleosidic linkages, bases containing linking groups
permitting the attachment of labels, such as fluorophores, or
haptens, and the like. Whenever the use of an oligonucleotide or
polynucleotide requires enzymatic processing, such as extension by
a polymerase, ligation by a ligase, or the like, one of ordinary
skill would understand that oligonucleotides or polynucleotides in
those instances would not contain certain analogs of
internucleosidic linkages, sugar moieties, or bases at any or some
positions. Polynucleotides typically range in size from a few
monomeric units, e.g. 5-40, when they are usually referred to as
"oligonucleotides," to several thousand monomeric units. Whenever a
polynucleotide or oligonucleotide is represented by a sequence of
letters (upper or lower case), such as "ATGCCTG," it will be
understood that the nucleotides are in 5'→3' order from left
to right and that "A" denotes deoxyadenosine, "C" denotes
deoxycytidine, "G" denotes deoxyguanosine, "T" denotes
thymidine, "I" denotes deoxyinosine, and "U" denotes uridine, unless
otherwise indicated or obvious from context. Unless otherwise noted
the terminology and atom numbering conventions will follow those
disclosed in Strachan and Read, Human Molecular Genetics 2
(Wiley-Liss, New York, 1999). Usually polynucleotides comprise the
four natural nucleosides (e.g. deoxyadenosine, deoxycytidine,
deoxyguanosine, deoxythymidine for DNA or their ribose counterparts
for RNA) linked by phosphodiester linkages; however, they may also
comprise non-natural nucleotide analogs, e.g. including modified
bases, sugars, or internucleosidic linkages. It is clear to those
skilled in the art that where an enzyme has specific
oligonucleotide or polynucleotide substrate requirements for
activity, e.g. single stranded DNA, RNA/DNA duplex, or the like,
then selection of appropriate composition for the oligonucleotide
or polynucleotide substrates is well within the knowledge of one of
ordinary skill, especially with guidance from treatises, such as
Sambrook et al, Molecular Cloning, Second Edition (Cold Spring
Harbor Laboratory, New York, 1989), and like references.
"Primer" means an oligonucleotide, either natural or synthetic,
that is capable, upon forming a duplex with a polynucleotide
template, of acting as a point of initiation of nucleic acid
synthesis and being extended from its 3' end along the template so
that an extended duplex is formed. The sequence of nucleotides
added during the extension process is determined by the sequence
of the template polynucleotide. Usually primers are extended by a
DNA polymerase. Primers usually have a length in the range of from
14 to 36 nucleotides.
"Readout" means a parameter, or parameters, which are measured
and/or detected that can be converted to a number or value. In some
contexts, readout may refer to an actual numerical representation
of such collected or recorded data. For example, a readout of
fluorescent intensity signals from a microarray is the address and
fluorescence intensity of a signal being generated at each
hybridization site of the microarray; thus, such a readout may be
registered or stored in various ways, for example, as an image of
the microarray, as a table of numbers, or the like.
"Solid support", "support", and "solid phase support" are used
interchangeably and refer to a material or group of materials
having a rigid or semi-rigid surface or surfaces. In many
embodiments, at least one surface of the solid support will be
substantially flat, although in some embodiments it may be
desirable to physically separate synthesis regions for different
compounds with, for example, wells, raised regions, pins, etched
trenches, or the like. According to other embodiments, the solid
support(s) will take the form of beads, resins, gels, microspheres,
or other geometric configurations. Microarrays usually comprise at
least one planar solid phase support, such as a glass microscope
slide.
"Specific" or "specificity" in reference to the binding of one
molecule to another molecule, such as a labeled target sequence for
a probe, means the recognition, contact, and formation of a stable
complex between the two molecules, together with substantially less
recognition, contact, or complex formation of that molecule with
other molecules. In one aspect, "specific" in reference to the
binding of a first molecule to a second molecule means that to the
extent the first molecule recognizes and forms a complex with
other molecules in a reaction or sample, it forms the largest
number of the complexes with the second molecule. Preferably, this
largest number is at least fifty percent. Generally, molecules
involved in a specific binding event have areas on their surfaces
or in cavities giving rise to specific recognition between the
molecules binding to each other. Examples of specific binding
include antibody-antigen interactions, enzyme-substrate
interactions, formation of duplexes or triplexes among
polynucleotides and/or oligonucleotides, receptor-ligand
interactions, and the like. As used herein, "contact" in reference
to specificity or specific binding means two molecules are close
enough that weak non-covalent chemical interactions, such as van
der Waals forces, hydrogen bonding, base-stacking interactions,
ionic and hydrophobic interactions, and the like, dominate the
interaction of the molecules.
"T.sub.m" is used in reference to "melting temperature." Melting
temperature is the temperature at which a population of
double-stranded nucleic acid molecules becomes half dissociated
into single strands. Several equations for calculating the Tm of
nucleic acids are well known in the art. As indicated by standard
references, a simple estimate of the Tm value may be calculated by
the equation Tm = 81.5 + 0.41 (% G+C), when a nucleic acid is in
aqueous solution at 1 M NaCl (see e.g., Anderson and Young,
Quantitative Filter Hybridization, in Nucleic Acid Hybridization
(1985)). Other references (e.g., Allawi, H. T. & SantaLucia, J.,
Jr., Biochemistry 36, 10581-94 (1997)) include alternative methods
of computation which take structural and environmental, as well as
sequence characteristics into account for the calculation of
Tm.
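The simple estimate above can be computed directly from a sequence. The following is a minimal sketch; the function names are assumptions, and the estimate depends only on G+C content, so it does not reflect duplex length, mismatches, or salt concentrations other than 1 M NaCl.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a sequence written with A, C, G, T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def simple_tm(seq):
    """Tm estimate in degrees C from Tm = 81.5 + 0.41 (% G+C), at 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


# Hypothetical sequence used only to exercise the formula.
print(round(simple_tm("ATGCATGCATGCATGCATGC"), 1))
```

For probe design, nearest-neighbor methods such as those of the Allawi and SantaLucia reference would normally be preferred over this estimate.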
"Sample" means a quantity of material from a biological,
environmental, medical, or patient source in which detection or
measurement of target nucleic acids is sought. On the one hand it
is meant to include a specimen or culture (e.g., microbiological
cultures). On the other hand, it is meant to include both
biological and environmental samples. A sample may include a
specimen of synthetic origin. Biological samples may be animal,
including human, fluid, solid (e.g., stool) or tissue, as well as
liquid and solid food and feed products and ingredients such as
dairy items, vegetables, meat and meat by-products, and waste.
Biological samples may include materials taken from a patient
including, but not limited to cultures, blood, saliva, cerebral
spinal fluid, pleural fluid, milk, lymph, sputum, semen, needle
aspirates, and the like. Biological samples may be obtained from
all of the various families of domestic animals, as well as feral
or wild animals, including, but not limited to, such animals as
ungulates, bear, fish, rodents, etc. Environmental samples include
environmental material such as surface matter, soil, water and
industrial samples, as well as samples obtained from food and dairy
processing instruments, apparatus, equipment, utensils, disposable
and non-disposable items. These examples are not to be construed as
limiting the sample types applicable to the present invention.
DETAILED DESCRIPTION OF THE INVENTION
In one aspect, the invention provides a method and kits for
performing hybridization-based assays that determine genotypes
and/or haplotypes at one or more interfering polymorphic loci. In
one aspect of the method of the invention, for each allele of each
locus of a pair of interfering polymorphic loci probes are provided
that form perfectly matched duplexes with each allele of the other
locus. In this manner, at least one probe will be available to form
a proper hybrid for detecting each haplotype of the interfering
polymorphic loci. Such probes are then combined under hybridization
conditions with a sample containing target polynucleotides
containing interfering polymorphic loci in accordance with the
hybridization-based assay in which they are employed.
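The probe provisioning described above can be sketched by enumerating every haplotype of a pair of loci and writing the sequence that is perfectly complementary to each. This is a minimal sketch under stated assumptions: the flanking sequences and alleles are hypothetical, a deletion allele is written as an empty string, and a single full-length complement stands in for whatever probe components (for example, a ligation pair) a particular assay uses.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def haplotype_probes(left, locus1_alleles, middle, locus2_alleles, right):
    """Map each (allele1, allele2) haplotype to a perfectly matched probe.

    A deletion allele is written as the empty string.
    """
    probes = {}
    for a1, a2 in product(locus1_alleles, locus2_alleles):
        target = left + a1 + middle + a2 + right
        probes[(a1, a2)] = reverse_complement(target)
    return probes


# Hypothetical target: a substitution (A or C) next to a 4-base deletion.
probes = haplotype_probes("GGAT", ["A", "C"], "TTC", ["AACT", ""], "GCA")
for haplotype, probe in probes.items():
    print(haplotype, probe)
```

Two alleles at each of two loci give four probes, so that whichever haplotype is present in the sample, one probe can form a perfectly matched duplex.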
In another aspect, the invention provides compositions and
associated kits for validating the performance of
hybridization-based assays containing probes specific for
interfering polymorphic loci. Compositions of the invention
comprise, either separately or as one or more mixtures, nucleic
acid standards, or reference sequences, that contain the sequences
of interfering polymorphic loci, which a corresponding
hybridization-based assay is designed to detect. When a
hybridization-based assay is employed to genotype individuals, e.g.
patients being considered for chemotherapy, assay results may be
validated by application of the assay to the nucleic acid
standards. This aspect of the invention is particularly useful for
highly multiplexed hybridization-based assays that are capable of
identifying genotypes at many tens of loci to many thousands of
loci in the same assay reaction, where a subset of the loci may be
interfering polymorphic loci. In both of the above aspects, probes
or standards for interfering polymorphic loci may be present with
probes or standards for non-interfering polymorphic loci,
respectively. In one aspect, nucleic acid standards are provided
for each haplotype of the interfering polymorphic loci being
analyzed by a hybridization-based assay. In another aspect, nucleic
acid standards may be provided for a subset of possible haplotypes
of interfering polymorphic loci.
A pair of interfering polymorphic loci is illustrated in FIGS. 1A
and 1B for the case where there are two alternative alleles at each
locus. At locus 1 a single allele, i.e. an A:T basepair, is shown.
Locus 2 is biallelic for a first sequence (100) and a deletion
variation of it (102) (indicated by the symbol "Δ2").
FIGS. 1A and 1B also show probes from a hybridization-based assay
(in this case, components of an oligonucleotide ligation assay are
represented, e.g. as described in Whiteley et al, U.S. Pat. No.
4,883,750). At locus 1, the target nucleic acid (100 or 102) has an
"A" (104), and at locus 2, the target nucleic acid is present in
two variations: sequence (100) or sequence (102) that contains a
deletion within the site at which probe (108) hybridizes. Thus, if
only probes (106) and (108) were used in the assay, the deletion
variant (102) would completely preclude probe (108) from annealing,
or at best, would severely destabilize any duplex formed by
formation of loop (110), or by the formation of other unstable
structures. In accordance with one aspect of the invention,
whenever such an interfering polymorphic locus occurs, a separate
probe is provided for each sequence variant at that locus. Thus, in
the system of FIG. 1A, an additional pair of probes would be
provided, one of which would form a perfectly matched duplex with
the deletion sequence variant (102) at locus 2.
As illustrated in FIG. 1B, in one aspect of the invention, where
one locus in a pair of interfering polymorphic loci consists of a
single base substitution, a universal nucleotide may be used in a
probe specific for a polymorphism in an adjacent interfering
polymorphic locus. Thus, a single pair of probes (112) and (114)
specific for the deletion polymorphism at locus 2 may be used even
though there are two substitution variants, "C" (117) and "A"
(119), in the hybridization site of probe (112). In this
illustration, deoxyinosine is used in probe (112) since it may
basepair with either A or C. Other such universal bases, or
corresponding universal nucleotides, may be employed, e.g.
5-nitroindole, 4-nitroindole, and the like. The following
references, which are incorporated by reference, provide guidance
for selection of universal bases, or nucleosides, for incorporation
into hybridization probes: U.S. Pat. Nos. 6,313,286 and 6,239,159;
The Glen Report, 8: 1-5 (Glen Research, Sterling, Va., 1995);
Loakes, Nucleic Acids Research, 29: 2437-2447 (2001); Ball et al,
Nucleic Acids Research, 26: 5225-5227 (1998); Loakes et al, J. Mol.
Biol., 270: 426-435 (1997). As used herein, the term "universal
base" refers to a natural or unnatural base or base analog of a
nucleoside (or nucleoside analog, as the case may be) that has the
property of being able to form Watson-Crick basepairs with two or
more natural nucleosides.
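The effect of a universal base on probe matching can be illustrated with a simple pairing check. This is a minimal sketch; the sequences are hypothetical, the function name is an assumption, and deoxyinosine ("I") is limited here to the A and C partners named in the text rather than to its full pairing behavior.

```python
# Pairing rules; "I" (deoxyinosine) is limited to the A and C partners named in the text.
PAIRS = {"A": {"T"}, "T": {"A"}, "G": {"C"}, "C": {"G"}, "I": {"A", "C"}}


def perfectly_matched(probe, target):
    """True if probe (5'->3') pairs at every position with target (5'->3')."""
    if len(probe) != len(target):
        return False
    # The strands are antiparallel, so the probe is read against the reversed target.
    return all(t in PAIRS[p] for p, t in zip(probe, reversed(target)))


targets = {"C variant": "GATCCA", "A variant": "GATACA"}  # hypothetical sequences
specific_probe = "TGGATC"   # complementary to the C variant only
universal_probe = "TGIATC"  # deoxyinosine opposite the substitution

for name, target in targets.items():
    print(name,
          perfectly_matched(specific_probe, target),
          perfectly_matched(universal_probe, target))
```

The specific probe matches only one substitution variant, while the probe containing deoxyinosine matches both, which is why a single probe pair can serve for the adjacent deletion polymorphism.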
As illustrated in FIG. 1C, interfering polymorphic loci may include
three adjacent polymorphic loci, which in FIG. 1C comprise a
deletion Δ1 (130), a base substitution (132), and a
deletion Δ2 (134). In the top diagram of FIG. 1C, probes
(138) and (140) specific for locus 2 are disrupted by the deletion
polymorphisms at locus 1 and locus 3. In the bottom diagram, the
correct-sequence probes, (142) and (144), for the two deletions are
shown. In general, for each polymorphism at locus 2, there are as
many as four alternative probes: (Δ1 present, Δ2 present),
(Δ1 present, Δ2 absent), (Δ1 absent, Δ2 present),
(Δ1 absent, Δ2 absent).
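The four alternative probe contexts for one allele at locus 2 can be enumerated as follows. This is a minimal sketch; all sequences, the function name, and the argument layout are hypothetical, and "present" is taken to mean that the deletion has occurred, so that the corresponding segment is missing from the target.

```python
from itertools import product


def locus2_contexts(flank5, seg1, spacer5, allele, spacer3, seg3, flank3):
    """Return target sequences around one locus 2 allele, keyed by
    (deletion_1_present, deletion_2_present)."""
    contexts = {}
    for del1, del2 in product((True, False), repeat=2):
        left = "" if del1 else seg1    # a present deletion removes the segment
        right = "" if del2 else seg3
        contexts[(del1, del2)] = (
            flank5 + left + spacer5 + allele + spacer3 + right + flank3
        )
    return contexts


contexts = locus2_contexts("GG", "AAC", "T", "C", "A", "GTT", "CC")  # hypothetical
for key, seq in contexts.items():
    print(key, seq)
```

A probe set for locus 2 would include a probe complementary to each context that actually occurs in the population, repeated for each allele at locus 2.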
In FIGS. 2A and 2B, all possible genotypes and signal generating
probes are illustrated for a pair of interfering polymorphic loci,
where each locus has two possible alleles (shown by the absence of
a bar (200) or as a bar (202)). Occasionally, these will be
referred to as the "major" allele and "minor" allele, respectively,
for convenience. In this illustration, the alleles are shown as if
they occur in an independent manner. However, in practice, such
closely spaced alleles only very rarely would be independent.
Typically, only three genotypes are observed in a population at
such a pair of interfering polymorphic loci, e.g. 1, 2, and 3, of
FIG. 2A. Thus, in some aspects of the invention, nucleic acid
standards and/or probes may be provided only for a subset of
genotypes, e.g. 1, 2, and 3, of FIG. 2A. As used herein,
"substantially every" in reference to genotypes at interfering
polymorphic loci means that in particular embodiments nucleic acid
standard and/or probes corresponding to very rare genotypes are not
included in a set or mixture. As used herein, "very rare" in
reference to genotypes means present in less than 0.1%, or in less
than 0.01%, of a population. Nonetheless, for the sake of illustration, all
theoretically possible genotypes are illustrated in FIG. 2B
together with probes that would generate signals in an assay where
the indicated genotype is present.
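The theoretically possible combinations for a pair of biallelic loci can be counted as follows. This is a minimal sketch; it counts unordered pairs of phased haplotypes, which is one way of enumerating genotypes and is not necessarily the enumeration used in the figures, and the function name and default threshold are assumptions based on the 0.1% figure above.

```python
from itertools import combinations_with_replacement, product

ALLELES = ("major", "minor")

# Four haplotypes: one allele at each of the two loci.
haplotypes = list(product(ALLELES, repeat=2))

# Unordered pairs of haplotypes carried by a diploid individual.
genotypes = list(combinations_with_replacement(haplotypes, 2))


def is_very_rare(frequency, threshold=0.001):
    """True if a genotype frequency is below the threshold (0.1% by default)."""
    return frequency < threshold


print(len(haplotypes), len(genotypes))
```

A set of standards or probes covering "substantially every" genotype would omit those entries for which `is_very_rare` holds in the population of interest.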
Probes of the invention are combined under appropriate
hybridization conditions with a sample to be genotyped or with a
set of nucleic acid standards for validation of an assay. Such
conditions depend on the hybridization-based assay being employed,
for which there is significant guidance available to one of
ordinary skill, as noted below. For example, in one aspect,
molecular inversion probes are provided for detecting interfering
polymorphic loci using procedures and conditions as set forth
below.
Genetic Systems
As mentioned above, methods and compositions of the invention are
particularly applicable to systems of polymorphic genes that are
responsible for or involved in a phenotypic response or
characteristic. Examples of such systems include xenobiotic
metabolizing genes, HLA genes, tumor suppressor genes, cell cycle
control genes, oncogenes, and the like. Of particular interest are
genes responsible for metabolizing drugs and other xenobiotic
substances, e.g. as described in Linder et al, Clin. Chem., 43:
254-266 (1997); Landi et al (cited above); Daly, Fundamental &
Clinical Pharmacology, 17: 27-41 (2003); and the like. In one
aspect, the invention includes hybridization-based assays to
determine the genotype of a plurality of loci in genes of
xenobiotic metabolizing enzymes selected from the tables below.
(Conventional gene nomenclature is used below. Nucleic acid
sequences and other information for the indicated genes may be
obtained in various public databases, such as dbSNP
(http://www.ncbi.nlm.nih.gov/SNP/), HUGO Gene Nomenclature Committee
website (http://www.gene.ucl.ac.uk/nomenclature/index.html),
ENSEMBL (http://www.ensembl.org), and the like.)
TABLE-US-00001 TABLE I Exemplary Xenobiotic Metabolizing Genes
ABCB1 CYP2E1 GSTP1 SLC22A2 ABCC2 CYP2J2 NAT1 TPMT CDA CYP3A4 NAT2
UGT1A1 CYP1A2 CYP3A5 CYP2C8 CYP2A6 DPYD CYP2C9 CYP2B6 FMO2 SLC15A2
CYP2C19 FMO3 SLC21A6 CYP2D6 GSTM1 SLC22A1
The following pairs of polymorphisms of genes in the above table
are interfering polymorphic loci for hybridization-based assays in
which pairs of oligonucleotides having lengths in the range of from
16 to 24 nucleotides are ligated for polymorphism detection:
TABLE-US-00002 TABLE II Exemplary Interfering Polymorphic Loci in
Xenobiotic Metabolizing Genes
Gene      Locus 1                        Locus 2                   Locus 3
ABCB1     rs1045642                      rs17149694
ABCB1     MDR1*14A61G                    rs9332385
ABCC2     rs717620                       rs8187711
CYP1A2    CYP1A2*1K, -740TG              CYP1A2*1K, -730CT
CYP2A6    CYP2A6*8, 6600G                CYP2A6*5 (rs5031017)
CYP2A6    CYP2A6*7, 6558TC (rs5031016)   rs6413474
CYP2B6    CYP2B6*8                       CYP2B6*13076GA
CYP2B6    CYP2B6*4                       CYP2B6*3
CYP2C19   CYP2C19*14, 50TC               rs17882687
CYP2C19   CYP2C19*6, 395GA               rs17882291
CYP2C19   CYP2C19*10, 680CT              CYP2C19*2, 681GA
CYP2C19   CYP2C19*9, 991AG               rs17878422
CYP2C8    CYP2C8*2, 805AT                rs1058930
CYP2C9    CYP2C9*2                       CYP2C9*8 (rs7900194)
CYP2C9    CYP2C9*10                      rs9332131
CYP2C9    CYP2C9*3                       CYP2C9*4                  CYP2C9*5
CYP2D6    CYP2D6*42, 3259insGT           rs1058172
CYP2D6    CYP2D6*44, 2950GC              CYP2D6*7
CYP2D6    CYP2D6*38                      CYP2D6*21, 2573insC
CYP2D6    CYP2D6*3A                      CYP2D6*19, 2539delAACT
CYP2D6    CYP2D6*20, 1973insG            rs3831704
CYP2D6    CYP2D6*4, 1846GA               rs11568728
CYP2D6    CYP2D6*17, 1023CT              rs1081003
CYP2D6    CYP2D6*15                      CYP2D6*12, 124GA
CYP2D6    CYP2D6*42, -1584GC             rs7511593
CYP3A4    CYP3A4*17 (rs4987161)          3A4RS4987159
CYP3A5    CYP3A5*3B, H30Y                CYP3A5*8
DPYD      DPYD*2A                        DPYD*3
FMO3      FMO3L360P                      rs2066532
GSTM1     GSTM1, AB                      rs1056806
NAT1      rs4987076                      rs4986990
NAT1      rs5030809                      NAT1*14G560A
NAT2      rs1801279                      rs1805158
NAT2      rs1799931                      NAT2, A845C
SLC21A6   rs2306283                      rs11045818
SLC21A6   OATPCA467G                     rs11045819
SLC22A1   OCT1C88R                       OCT1L85F
SLC22A1   rs2282143                      OCT1R342H
SLC22A1   OCT1G401S                      OCT1I403
SLC22A2   rs8177516                      rs8177515
SLC22A2   rs8177507                      rs8177508
TPMT      TPMT*8                         rs1800584
TPMT      TPMT*3B                        rs2842934
UGT1A1    UGT1A1*28                      rs873478
UGT1A1    UGT1A1*27                      UGT1A1*35
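Whether two polymorphisms interfere in a ligation-based assay can be approximated by asking whether one falls within the hybridization site of the probes for the other. The following is a minimal sketch under stated assumptions: the interrogated base is taken to be the 3' terminal base of the upstream oligonucleotide, both ligated oligonucleotides are given the same length, and the numbers in the allele names of Table II are read as coordinates in a common numbering; none of these is specified in the table itself.

```python
def ligation_footprint(position, probe_length):
    """Inclusive range covered by two ligated oligonucleotides.

    Assumes the interrogated base is the 3' terminal base of the upstream
    oligonucleotide and that both oligonucleotides have the same length.
    """
    return position - probe_length + 1, position + probe_length


def interferes(position, neighbor, probe_length):
    """True if a neighboring polymorphism lies inside the footprint."""
    low, high = ligation_footprint(position, probe_length)
    return neighbor != position and low <= neighbor <= high


# Coordinates read from allele names in Table II (an assumption).
print(interferes(680, 681, 16))    # CYP2C19*10 vs CYP2C19*2
print(interferes(-740, -730, 16))  # the two CYP1A2*1K positions
print(interferes(680, 780, 24))    # hypothetical distant site
```

With oligonucleotides of 16 to 24 nucleotides, polymorphisms one or ten positions apart fall within the footprint, whereas a site 100 positions away does not.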
Further genes of interest for determining an individual's ability to
metabolize a selected xenobiotic compound include those listed in
Table III.
TABLE-US-00003 TABLE III Further Exemplary Xenobiotic Metabolizing
Genes CYP1A1 CYP1B1 CYP2C18 CYP3A7 GSTT1 GSTM3 GSTA1 UGT1A6 UGT1A7
UGT2B4 UGT2B7 UGT2B15 ADH1B ALDH2 APE1 CDKN2A COMT DRD2 DRD4 EPHX1
ERCC1 ERCC2 ERCC4 ERCC5 GRPR GSTA4 LIG3 MDM2 MGMT MPO NQO1 OGG1
PCNA POLB SLC6A3 SOD2 TP53 XRCC1 XRCC2 XRCC3 XRCC9
Hybridization-Based Assays
As mentioned above, the invention relates to the use of
hybridization-based assays to detect or measure interfering
polymorphic loci. Such assays are widely used in multiplexed
formats to simultaneously genotype DNA samples at multiple loci,
e.g. allele-specific multiplex PCR, arrayed primer extension (APEX)
technology, variation detection arrays, solution phase primer
extension or ligation assays, and the like, described in the
following references: Shumaker et al, Hum. Mut., 7: 346-354 (1996);
Cronin et al, U.S. Pat. No. 6,468,744; Huang et al, U.S. Pat. Nos.
6,709,816 and 6,287,778; Fan et al, U.S. patent publication
2003/0003490; Chee et al, U.S. Pat. No. 6,355,431; Gunderson et al,
U.S. patent publication 2005/0037393; Hacia et al, U.S. Pat. No.
6,342,355; Kennedy et al, Nature Biotechnology, 21: 1233-1237
(2003); Chou et al, Clin. Chem., 49: 542-551 (2003); and the
like.
In one aspect, hybridization-based assays include circularizing
probes, such as padlock probes, rolling circle probes, molecular
inversion probes, linear amplification molecules for multiplexed
PCR, and the like, e.g. padlock probes being disclosed in U.S. Pat.
Nos. 5,871,921; 6,235,472; 5,866,337; and Japanese patent JP
4-262799; rolling circle probes being disclosed in Aono et al,
JP-4-262799; Lizardi, U.S. Pat. Nos. 5,854,033; 6,183,960;
6,344,239; molecular inversion probes being disclosed in Hardenbol
et al (cited above) and in Willis et al, U.S. patent publication
2004/0101835; and linear amplification molecules being disclosed in
Faham et al, U.S. patent publication 2003/0104459; all of which are
incorporated herein by reference. Such probes are desirable because
non-circularized probes can be digested with single stranded
exonucleases thereby greatly reducing background noise due to
spurious amplifications, and the like. In the case of molecular
inversion probes (MIPs), padlock probes, and rolling circle probes,
constructs for generating labeled target sequences are formed by
circularizing a linear version of the probe in a template-driven
reaction on a target polynucleotide followed by digestion of
non-circularized polynucleotides in the reaction mixture, such as
target polynucleotides, unligated probe, probe concatemers, and
the like, with an exonuclease, such as exonuclease I.
FIG. 3 illustrates a molecular inversion probe and how it can be
used to generate an amplicon after interacting with a target
polynucleotide in a sample. A linear version of the probe is
combined with a sample containing target polynucleotide (300) under
conditions that permit target-specific region 1 (316) and
target-specific region 2 (318) to form stable duplexes with
complementary regions of target polynucleotide (300). The ends of
the target-specific regions may abut one another (being separated
by a "nick") or there may be a gap (320) of several nucleotides
(e.g. 1-10) between them. In either case, after hybridization of
the target-specific regions, the ends of the two target specific
regions are covalently linked by way of a ligation reaction or an
extension reaction followed by a ligation reaction, i.e. a
so-called "gap-filling" reaction. The latter reaction is carried
out by extending with a DNA polymerase a free 3' end of one of the
target-specific regions so that the extended end abuts the end of
the other target-specific region, which has a 5' phosphate, or like
group, to permit ligation. In one aspect, a molecular inversion
probe has a structure as illustrated in FIG. 3. Besides
target-specific regions (316 and 318), in sequence such a probe may
include first primer binding site (302), cleavage site (304),
second primer binding site (306), first tag-adjacent sequences
(308) (usually restriction endonuclease sites and/or primer binding
sites) for tailoring one end of a labeled target sequence
containing oligonucleotide tag (310), and second tag-adjacent
sequences (314) for tailoring the other end of a labeled target
sequence. Alternatively, cleavage-site (304) may be added at a
later step by amplification using a primer containing such a
cleavage site. In operation, after specific hybridization of the
target-specific regions and their ligation (322), the reaction
mixture is treated with a single stranded exonuclease that
preferentially digests all single stranded nucleic acids, except
circularized probes. After such treatment, circularized probes are
treated (326) with a cleaving agent that cleaves the probe between
primer (302) and primer (306) so that the structure is linearized
(330). Cleavage site (304) and its corresponding cleaving agent are
design choices for one of ordinary skill in the art. In one
aspect, cleavage site (304) is a segment containing a sequence of
uracil-containing nucleotides and the cleavage agent is treatment
with uracil-DNA glycosylase followed by heating. After the
circularized probes are opened, the linear product is amplified,
e.g. by PCR using primers (332) and (334), to form amplicons (336).
A multiplexed readout may be obtained from amplicon (336) by
labeling and excising oligonucleotide tag (310) and specifically
hybridizing the labeled tags to a microarray of tag complements,
e.g. a GenFlex array (Affymetrix, Santa Clara, Calif.); a bead
array (Illumina, San Diego, Calif.); or a fluid array, e.g.
Chandler et al, U.S. Pat. No. 5,981,180 (Luminex, Austin,
Tex.).
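The rearrangement of probe elements that results from circularization followed by cleavage can be modeled as a rotation of an ordered list. This is a minimal sketch; the element names and the placement of the two target-specific regions at opposite ends of the linear probe are assumptions made for illustration, and the model ignores strand orientation and any gap-filled nucleotides.

```python
# Assumed 5'->3' order of the linear probe, following the element list in the text.
LINEAR_PROBE = [
    "target_region_1", "primer_site_1", "cleavage_site", "primer_site_2",
    "tag_adjacent_1", "tag", "tag_adjacent_2", "target_region_2",
]


def linearize_after_cleavage(segments, cleavage="cleavage_site"):
    """Join the two ends into a circle, then open it at the cleavage site."""
    i = segments.index(cleavage)
    # Everything after the cut now leads; everything before it trails.
    return segments[i + 1:] + segments[:i]


print(linearize_after_cleavage(LINEAR_PROBE))
```

In the resulting order the two primer binding sites lie at the ends and flank the tag and the joined target-specific regions, which is consistent with amplification of the opened probe by a single primer pair.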
In one aspect of the invention, probes may be selected for very
large sets of targets in multiplexed hybridization-based assays
that do not produce adequate signals for analysis because there
exist multiple regions in a genome having very similar sequences,
e.g. gene duplications, homologous families, and the like (see Strachan
and Read, Human Molecular Genetics, 3rd Edition (Garland
Science/Taylor & Francis Group, 2003)). In such cases, the
subset of targets may be pre-amplified, e.g. with PCR or like
method, prior to conducting a hybridization-based assay. That way,
the concentration of desired sections of the genome containing the
correct target regions can be increased, thereby allowing
relatively more probe to bind to the correct target regions than
the undesired regions. Such pre-amplifications may take place in a
separate reaction, the product of which may then be combined with
additional sample for application of the hybridization-based assay,
or the pre-amplification may take place as a preliminary-stage
reaction in the same reaction mixture as the hybridization-based
assay. The parameters of such pre-amplification reaction are a
matter of design choice for those skilled in the art and may
involve an amount of routine experimentation. Alternatively, in
some cases, probe signals may be increased by increasing the
relative concentration of probe directed to loci associated with
low signals, thereby promoting the formation of duplexes between
selected probes and their target loci.
In accordance with the invention, the degree of multiplexing in a
hybridization-based assay may vary widely. In one aspect, such
assays of the invention may have a degree of multiplexing greater
than 100 probes, or they may have a degree of multiplexing in a
range of from 100 to 10,000 probes, or in a range of from 100 to
100,000 probes.
Nucleic Acid Standards
In one aspect, nucleic acid standards of the invention comprise one
or more double stranded DNAs that contain sequences identical to
those of the interfering polymorphic loci that are to be detected
or measured in a hybridization-based assay. As mentioned above, the
purpose of the standards is to validate that a hybridization-based
assay is making correct determinations of the target sequences it
is designed to detect or quantify. Such validation includes
confirmation that correct determinations are being made of (i) the
genotypes of specific interfering polymorphic loci, and (ii) the
presence or absence of heterozygote and homozygote genotypes.
Standards may also be used to develop new multiplexed
hybridization-based assays by providing targets on which to test
new component probes. A set of nucleic acid standards of the
invention may be provided such that there is one standard for each
target locus, or a set of nucleic acid standards may be larger or
smaller than the number of loci being analyzed in a particular
hybridization-based assay. For example, a single set of nucleic
acid standards may be employed with a family of different
hybridization-based assays where, for example, each assay may be
directed to a different subset of loci and the nucleic acid
standard may contain sequences for all the different subsets.
Usually, nucleic acid standards are provided in kits that may come
together with, or separately from, probes for a hybridization-based
assay. In one aspect, a kit of nucleic acid standards comprises one
or more double stranded DNAs each containing a sequence of a pair
or triplet of interfering polymorphic loci. In another aspect, such
a kit contains at least one double stranded DNA containing a
sequence of every interfering polymorphic loci, including a
separate sequence for each haplotype at each such loci, being
analyzed by a particular hybridization-based assay. In some
embodiments, nucleic acid standards may be provided only for
alleles or haplotypes that are present in a population at some
minimal frequency. For example, referring to the nucleic acid
standards of the interfering polymorphic loci illustrated in FIG.
4A, if a first allele (empty sector (401)) at a first locus is a
major allele having a frequency of 98% and a first allele (empty
sector (403)) at a second locus is a major allele having a
comparable frequency, then Haplotype 4 (410) may either not exist
because of the lack of recombination between closely spaced loci or
it may exist at an extremely low frequency. In either case, such a
haplotype (or allele) may be omitted from a set of nucleic acid
standards in some embodiments.
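As a rough illustration of how a minimal-frequency cutoff might be
applied, the following Python sketch estimates each haplotype
frequency as the product of its allele frequencies and keeps only
haplotypes at or above a cutoff. The independence (linkage
equilibrium) assumption, the function names, the allele labels, and
the 0.1% cutoff are assumptions made for illustration only; for
closely spaced loci the actual frequency of the double-minor
haplotype may be lower still, or zero, as noted above.

```python
from itertools import product


def expected_haplotype_frequencies(loci):
    """Estimate haplotype frequencies as products of allele frequencies.

    loci: list of dicts, one per locus, mapping allele label -> frequency.
    Assumes the loci are independent, which is only an approximation.
    """
    frequencies = {}
    for combination in product(*(locus.items() for locus in loci)):
        label = "".join(allele for allele, _ in combination)
        frequency = 1.0
        for _, allele_frequency in combination:
            frequency *= allele_frequency
        frequencies[label] = frequency
    return frequencies


def haplotypes_to_include(loci, min_frequency):
    """Return haplotype labels whose estimated frequency meets the cutoff."""
    estimated = expected_haplotype_frequencies(loci)
    return [label for label, freq in estimated.items() if freq >= min_frequency]


# Two loci, each with a 98% major allele (A, B) and a 2% minor allele (a, b).
loci = [{"A": 0.98, "a": 0.02}, {"B": 0.98, "b": 0.02}]
kept = haplotypes_to_include(loci, min_frequency=0.001)  # hypothetical cutoff
print(kept)
```

With these inputs the double-minor haplotype has an estimated
frequency of about 0.04% and falls below the hypothetical cutoff, so
no standard would be prepared for it.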
Nucleic acid standards for interfering polymorphic loci may be used
together with or separately from nucleic acid standards for
non-interfering polymorphic loci. Preferably, alleles of nucleic
acid standards for non-interfering polymorphic loci are contained
in separate DNA strands so that the ability to detect the presence
or absence of heterozygosity or homozygosity at a locus may be
validated, as well as the ability to generate signal in the
presence of a particular allele. Likewise, for interfering
polymorphic loci, preferably each haplotype of such loci is
contained in a separate strand of a nucleic acid standard.
Nucleic acid standards may be used and maintained in a variety of
formats. In one aspect, such standards are maintained and used as
self-replicable strands or circles of DNA, such as an amplicon
containing all the elements required for replication, a plasmid, or
the like. Conventional cloning vectors may be used for constructing
replicable nucleic acid standards. For example, suitable cloning
vectors include pUC19, pNEB206A, pNEB193, and the like (available
from New England Biolabs, Beverly, Mass.). Once constructed, such
replicable nucleic acid standards may be maintained and propagated
in conventional bacterial hosts, such as JM101, JM109, and the
like. Nucleic acids having sequences corresponding to the various
genetic loci of the standards may be synthesized on a commercially
available automated DNA synthesizer, e.g. Applied Biosystems
(Foster City, Calif.) model 3400, or like instrument, purified,
combined, and inserted into a preselected site in a polylinker
region of a cloning vector. After transfection and propagation in a
host, the cloning vector may be recovered and its insert sequenced
to confirm the identity of the nucleic acid standard.
In one aspect, nucleic acid standards may be replicated either by
in vitro or in vivo techniques. Usually, nucleic acid standards are
rendered capable of replication by sandwiching them between primer
binding sites for in vitro amplification, e.g. using PCR, or by
inserting them into a plasmid so that they may be replicated in a
host cell, such as a bacterial cell. Or, a combination may be used;
namely, nucleic acid standards flanked by primer binding sites may
be stored and replicated in a plasmid, and they may also be
replicated out of the plasmid by a PCR for use. Whether nucleic
acid standards are provided in plasmids or as amplicons of linear
double stranded DNA, preferably kits of the invention comprise
appropriate amounts of such DNA in purified form in ready-to-use
tubes, vials, ampules, or the like. Typically, such DNA may be
dissolved in a TE buffer, or like solution.
The number of separate replicable nucleic acid standards employed
and the number of sequences of different loci contained in each is
a matter of design choice, depending on factors such as the nature
of the replicable DNA, e.g. PCR amplicon, or plasmid, or the like;
the convenience or expense of maintaining and handling a few versus
a large number of separate replicable standards; the maximum insert
size of the replicable DNA; and so on.
FIG. 4A illustrates nucleic acid standards for four different
haplotypes of interfering polymorphic loci wherein each locus has
two alleles (represented by filled and non-filled sectors). In this
illustration, a set of cloning vectors (400) is provided each with
different inserts (404)-(410) corresponding to the different
haplotypes. Usually, the cloning vectors (400) are identical,
except for their inserts. FIGS. 4B and 4C illustrate nucleic acid
standards for four non-interfering loci. Each locus has two alleles
(again, represented by filled and non-filled sectors) and the
alleles are located in separate vectors (402) and (462).
In one aspect, when replicable nucleic acid standards are
constructed, either in plasmids or other amplicons, sequences from
the genome flanking each locus may be included. In one embodiment,
10 to 30 nucleotides of sequence upstream and downstream of each
probe binding site are included. For interfering polymorphic loci
where there are potentially overlapping probes, the flanking
sequences are sequences upstream of the upstream-most probe
specific for that interfering polymorphic loci and downstream of
the downstream-most probe specific for that interfering polymorphic
loci. In another embodiment, such flanking sequences are about 20
nucleotides in length. This feature of the invention is illustrated
in FIG. 4D. Nucleic acid standard (421) for interfering polymorphic
loci containing deletions (420) and (422) is shown in relation to
two possible probe pairs [(424) and (426), and (428) and (430),
respectively] that may hybridize to the site. In this illustration,
probe (424) is the upstream-most probe and probe (430) is the
downstream-most probe of the two pairs. The ends of these probes
define the location of the flanking sequences (432) and (434), as
indicated by the dashed lines in FIG. 4D.
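The placement of flanking sequences described above can be sketched
in Python as follows. The sketch assumes probe binding sites are
given as 0-based, half-open (start, end) coordinates on a reference
string; the function name, the coordinate convention, and the example
sequence are hypothetical and are not part of the disclosure.

```python
def standard_insert(reference, probe_sites, flank=20):
    """Return the reference segment spanning all probe sites plus flanks.

    reference:   genomic sequence as a string
    probe_sites: list of (start, end) tuples, 0-based and half-open
    flank:       nucleotides to add upstream of the upstream-most probe
                 and downstream of the downstream-most probe
    """
    if not probe_sites:
        raise ValueError("at least one probe binding site is required")
    upstream_edge = min(start for start, _ in probe_sites)
    downstream_edge = max(end for _, end in probe_sites)
    # Clip to the ends of the reference when the flank would run past them.
    left = max(0, upstream_edge - flank)
    right = min(len(reference), downstream_edge + flank)
    return reference[left:right]


# Made-up 100-nt reference with two overlapping probe sites.
reference = "A" * 30 + "C" * 40 + "G" * 30
insert = standard_insert(reference, [(30, 55), (45, 70)], flank=20)
print(len(insert))  # 80
```

Here the two overlapping probe sites together span 40 nucleotides, so
the insert is those 40 nucleotides plus 20 nucleotides on each side.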
The nucleic acid standards may be used in a wide range of
concentrations whose selection depends on a number of factors,
including the sensitivity of the hybridization-based assay
employed, the amount of sample available for an assay, whether
pre-assay purification or amplification steps are employed, the
nature of nucleic acid extraction procedures used, and the like. In
one aspect, nucleic acid standards are used at a concentration that
is equivalent to the concentration of the loci being interrogated
in a hybridization-based assay. In some embodiments, it may be
desirable to provide concentrations of nucleic acid standards that
are higher than those expected for loci in an assay mixture. For
example, such higher concentrations may reduce time required to
perform a test of a probe set, reduce variability due to sampling
error from small reaction volumes, reduce the effects of
non-specific binding to vessel walls, and like phenomena. In
another aspect, nucleic acid standard concentration may be selected
within the range of from about 1 femtomolar to about 10 nanomolar.
In still another aspect, such concentration may be selected in the
range of from about 10 femtomolar to about 1 nanomolar.
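To give a sense of scale for these ranges, the following Python
sketch converts a molar concentration and a reaction volume into an
approximate number of standard molecules. The function name and the
10 microliter example volume are assumptions chosen for illustration;
only the concentration limits come from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_in_volume(concentration_molar, volume_liters):
    """Approximate number of molecules at a given concentration and volume."""
    return concentration_molar * volume_liters * AVOGADRO


# Lower and upper limits of the broader range, in a hypothetical 10 uL volume.
volume = 10e-6  # liters
low = copies_in_volume(1e-15, volume)   # about 1 femtomolar
high = copies_in_volume(10e-9, volume)  # about 10 nanomolar
print(round(low))  # 6022
```

Under these assumptions the range spans roughly six thousand to tens
of billions of copies per 10 microliters.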
Preferably, nucleic acid standards are used in pairwise mixtures in
order to represent the different zygosities of a locus being
detected or measured. For example, for non-interfering polymorphic
loci, a locus may have alleles "A" and "a;" thus, homozygotes AA
and aa, and heterozygote Aa, would preferably each be tested
separately. Likewise, for interfering polymorphic loci, each such
set of loci will have a plurality of haplotypes. For example,
interfering polymorphic loci containing two loci having two alleles
each may have four possible haplotypes, e.g. AB, Ab, aB, and ab.
Thus, a diploid individual may have any pairwise combination of
such haplotypes. Typically, where one allele at each locus has a
low frequency of occurrence in a population, e.g. less than 5% or
less than 2%, the haplotype containing both such minor alleles,
e.g. ab, is so rare in the population that a probe and/or nucleic
acid standard may not be included for it, in which case the
haplotypes being considered are AB, Ab, and aB. For such an
embodiment, different pairwise mixtures would be produced for the
following diploid configurations: Homozygotes: (AB, AB), (Ab, Ab),
and (aB, aB); and Heterozygotes: (AB, Ab), (AB, aB), (Ab, aB).
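The enumeration of diploid configurations above can be reproduced
with a short Python sketch. The function name is hypothetical; the
haplotype labels are those used in the text.

```python
from itertools import combinations_with_replacement


def pairwise_mixtures(haplotypes):
    """Split all unordered haplotype pairs into homozygotes and heterozygotes."""
    homozygotes, heterozygotes = [], []
    for first, second in combinations_with_replacement(haplotypes, 2):
        if first == second:
            homozygotes.append((first, second))
        else:
            heterozygotes.append((first, second))
    return homozygotes, heterozygotes


homozygotes, heterozygotes = pairwise_mixtures(["AB", "Ab", "aB"])
print(homozygotes)    # [('AB', 'AB'), ('Ab', 'Ab'), ('aB', 'aB')]
print(heterozygotes)  # [('AB', 'Ab'), ('AB', 'aB'), ('Ab', 'aB')]
```

For n haplotypes this gives n homozygous and n(n-1)/2 heterozygous
mixtures, so including the rare ab haplotype would raise the total
from six mixtures to ten.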
Hybridization-Based Assays Employing Solid Phase Supports
Methods of conducting multiplexed hybridization-based assays using
microarrays, and like platforms, suitable for the present invention
are well known in the art. Guidance for selecting conditions and
materials for applying labeled sequences to solid phase supports,
such as microarrays, may be found in the literature, e.g. Wetmur,
Crit. Rev. Biochem. Mol. Biol., 26: 227-259 (1991); DeRisi et al,
Science, 278: 680-686 (1997); Chee et al, Science, 274: 610-614
(1996); Duggan et al, Nature Genetics, 21: 10-14 (1999); Schena,
Editor, Microarrays: A Practical Approach (IRL Press, Washington,
2000); Freeman et al, Biotechniques, 29: 1042-1055 (2000); and like
references. Methods and apparatus for carrying out repeated and
controlled hybridization reactions have been described in U.S. Pat.
Nos. 5,871,928; 5,874,219; 6,045,996; 6,386,749; and 6,391,623, each
of which is incorporated herein by reference. Hybridization
conditions typically include salt concentrations of less than about
1M, more usually less than about 500 mM and less than about 200 mM.
Hybridization temperatures can be as low as 5° C., but are
typically greater than 22° C., more typically greater than
about 30° C., and preferably in excess of about 37° C.
Hybridizations are usually performed under stringent conditions,
i.e. conditions under which a probe will stably hybridize to a
perfectly complementary target sequence, but will not stably
hybridize to sequences that have one or more mismatches. The
stringency of hybridization conditions depends on several factors,
such as probe sequence, probe length, temperature, salt
concentration, concentration of organic solvents, such as
formamide, and the like. How such factors are selected is usually a
matter of design choice to one of ordinary skill in the art for any
particular embodiment. Usually, stringent conditions are selected
to be about 5° C. lower than the Tm for the specific
sequence at a particular ionic strength and pH. Exemplary
hybridization conditions include salt concentration of at least
0.01 M to no more than 1 M Na ion concentration (or other salts) at
a pH 7.0 to 8.3 and a temperature of at least 25° C.
Additional exemplary hybridization conditions include the
following: 5×SSPE (750 mM NaCl, 50 mM sodium phosphate, 5 mM
EDTA, pH 7.4).
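The buffer arithmetic implied by these conditions can be sketched in
Python as follows. The 1×SSPE composition used here is inferred by
dividing the 5×SSPE recipe above by five, and the function names are
hypothetical; the 5° C. offset is the approximate value stated above
and would be adjusted in practice.

```python
# 1x SSPE in mM, inferred from the 5x recipe in the text (an assumption).
SSPE_1X_MILLIMOLAR = {"NaCl": 150.0, "sodium phosphate": 10.0, "EDTA": 1.0}


def sspe_millimolar(fold):
    """Component concentrations (mM) of SSPE at the given fold strength."""
    return {name: fold * mm for name, mm in SSPE_1X_MILLIMOLAR.items()}


def stringent_temperature_c(tm_c, offset_c=5.0):
    """Hybridization temperature set about offset_c below the duplex Tm."""
    return tm_c - offset_c


print(sspe_millimolar(5))             # 750 mM NaCl, 50 mM phosphate, 5 mM EDTA
print(stringent_temperature_c(65.0))  # 60.0
```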
An exemplary hybridization procedure for applying labeled target
sequence to a GenFlex™ microarray (Affymetrix, Santa Clara,
Calif.) is as follows: denature labeled target sequence at
95-100° C. for 10 minutes and snap cool on ice for 2-5
minutes. The microarray is pre-hybridized with 6×SSPE-T (0.9
M NaCl, 60 mM NaH2PO4, 6 mM EDTA (pH 7.4), 0.005% Triton
X-100)+0.5 mg/ml of BSA for a few minutes, then hybridized with 120
μL hybridization solution (as described below) at 42° C.
for 2 hours on a rotisserie, at 40 RPM. Hybridization Solution
consists of 3 M TMACl (tetramethylammonium chloride), 50 mM MES
((2-[N-Morpholino]ethanesulfonic acid) Sodium Salt) (pH 6.7), 0.01%
of Triton X-100, 0.1 mg/ml of Herring Sperm DNA, optionally 50 pM
of fluorescein-labeled control oligonucleotide, 0.5 mg/ml of BSA
(Sigma) and labeled target sequences in a total reaction volume of
about 120 μL. The microarray is rinsed twice with 1×SSPE-T
for about 10 seconds at room temperature, then washed with
1×SSPE-T for 15-20 minutes at 40° C. on a rotisserie,
at 40 RPM. The microarray is then washed 10 times with
6×SSPE-T at 22° C. on a fluidic station (e.g. model
FS400, Affymetrix, Santa Clara, Calif.). Further processing steps
may be required depending on the nature of the label(s) employed,
e.g. direct or indirect. Microarrays containing labeled target
sequences may be scanned on a confocal scanner (such as available
commercially from Affymetrix) with a resolution of 60-70 pixels per
feature and filters and other settings as appropriate for the
labels employed. GeneChip™ Software (Affymetrix) or like
software may be used to convert the image files into digitized
files for further data analysis.
Sample Preparation
Samples or specimens containing target polynucleotides may come
from a wide variety of sources for use with the present invention,
including cell cultures, animal or plant tissues, patient biopsies,
environmental samples, or the like. Samples are prepared for assays
of the invention using conventional techniques, which typically
depend on the source from which a sample or specimen is taken.
Prior to carrying out reactions on a sample, it will often be
desirable to perform one or more sample preparation operations upon
the sample. Typically, these sample preparation operations will
include such manipulations as extraction of intracellular material,
e.g., nucleic acids from whole cell samples, viruses and the like.
One or more of these various operations may be readily incorporated
into the fluidly closed systems contemplated by the present
invention.
For those embodiments where whole cells, viruses or other tissue
samples are being analyzed, it will typically be necessary to
extract the nucleic acids from the cells or viruses, prior to
continuing with the various sample preparation operations.
Accordingly, following sample collection, nucleic acids may be
liberated from the collected cells, viral coat, etc., into a crude
extract, followed by additional treatments to prepare the sample
for subsequent operations, e.g., denaturation of contaminating (DNA
binding) proteins, purification, filtration, desalting, and the
like. Liberation of nucleic acids from the sample cells or viruses,
and denaturation of DNA binding proteins may generally be performed
by chemical, physical, or electrolytic lysis methods. For example,
chemical methods generally employ lysing agents to disrupt the
cells and extract the nucleic acids from the cells, followed by
treatment of the extract with chaotropic salts such as guanidinium
isothiocyanate or urea to denature any contaminating and
potentially interfering proteins. Generally, where chemical
extraction and/or denaturation methods are used, the appropriate
reagents may be incorporated within a sample preparation chamber, a
separate accessible chamber, or may be externally introduced.
Following extraction, it will often be desirable to separate the
nucleic acids from other elements of the crude extract, e.g.,
denatured proteins, cell membrane particles, salts, and the like.
Removal of particulate matter is generally accomplished by
filtration, flocculation or the like. A variety of filter types may
be readily incorporated into the device. Further, where chemical
denaturing methods are used, it may be desirable to desalt the
sample prior to proceeding to the next step. Desalting of the
sample, and isolation of the nucleic acid may generally be carried
out in a single step, e.g., by binding the nucleic acids to a solid
phase and washing away the contaminating salts or performing gel
filtration chromatography on the sample, passing salts through
dialysis membranes, and the like. Suitable solid supports for
nucleic acid binding include, e.g., diatomaceous earth, silica
(i.e., glass wool), or the like. Suitable gel exclusion media, also
well known in the art, may also be readily incorporated into the
devices of the present invention, and is commercially available
from, e.g., Pharmacia and Sigma Chemical.
In some applications, such as measuring target polynucleotides in
rare cells from a patient's blood, an enrichment step may be
carried out prior to conducting an assay, such as by immunomagnetic
isolation. Such isolation or enrichment may be carried out using a
variety of techniques and materials known in the art, as disclosed
in the following representative references that are incorporated by
reference: Terstappen et al, U.S. Pat. No. 6,365,362; Terstappen et
al, U.S. Pat. No. 5,646,001; Rohr et al, U.S. Pat. No. 5,998,224;
Kausch et al, U.S. Pat. No. 5,665,582; Kresse et al, U.S. Pat. No.
6,048,515; Kausch et al, U.S. Pat. No. 5,508,164; Miltenyi et al,
U.S. Pat. No. 5,691,208; Molday, U.S. Pat. No. 4,452,773; Kronick,
U.S. Pat. No. 4,375,407; Radbruch et al, chapter 23, in Methods in
Cell Biology, Vol. 42 (Academic Press, New York, 1994); Uhlen et
al, Advances in Biomagnetic Separation (Eaton Publishing, Natick,
1994); Safarik et al, J. Chromatography B, 722: 33-53 (1999);
Miltenyi et al, Cytometry, 11: 231-238 (1990); Nakamura et al,
Biotechnol. Prog., 17: 1145-1155 (2001); Moreno et al, Urology, 58:
386-392 (2001); Racila et al, Proc. Natl. Acad. Sci., 95: 4589-4594
(1998); Zigeuner et al, J. Urology, 169: 701-705 (2003); Ghossein
et al, Seminars in Surgical Oncology, 20: 304-311 (2001).
In one aspect, genomic DNA for analysis is obtained using standard
commercially available DNA extraction kits, e.g. PureGene® DNA
Isolation Kit (Gentra Systems, Minneapolis, Minn.). In another
aspect, for assaying human genomic DNA with a multiplex
hybridization-based assay containing from about 1000 to 50,000
probes, a DNA sample may be used having an amount within the range
of from about 200 ng to about 1 μg. When sample material is
scarce, prior to assaying, sample DNA may be amplified by whole
genome amplification, or like technique, to increase the total
amount of DNA available for performing an assay. Preferably, any
such technique used to increase sample DNA prior to assaying does
not preferentially amplify or degrade sequences to such an extent
that the sequences in the sample tested are no longer
representative of the natural genomic DNA. Several whole genome, or partial genome,
amplification techniques are known in the art, such as the
following which are incorporated by reference: Telenius et al,
Genomics, 13: 718-725 (1992); Cheung et al, Proc. Natl. Acad. Sci.,
93: 14676-14679 (1996); Dean et al, Genome Research, 11: 1095-1099
(2001); U.S. Pat. Nos. 6,124,120; 6,280,949; 6,617,137; and the
like.
EXAMPLE
Genotyping Genetic Polymorphisms in Drug Metabolic Enzyme and
Transporter Genes Using Circularizable Probes
In this example, nucleic acid standards and molecular inversion
probes (having target-specific regions of length in the range of 36
to 64 nucleotides) are prepared for 177 loci of genes in the set
listed in Table I. Of these loci, interfering polymorphic loci are
listed in Table II. Molecular inversion probes are designed as
described in U.S. Pat. No. 6,858,412 and Hardenbol et al, Nature
Biotechnology, 21: 673-678 (2003), which are incorporated herein by
reference. Labeled oligonucleotide tags generated from the assay
are hybridized to a GeneChip™ array of tag complements and are
analyzed on a GeneChip™ Scanner 3000 (Affymetrix, Santa Clara,
Calif.) using the manufacturer's recommended protocols. The
protocol of the above references is generally followed, subject to
the following modifications, described in the MegAllele™
Genotyping System User Manual, available from the manufacturer of
molecular inversion probe kits (ParAllele Bioscience, South San
Francisco, Calif.). Briefly, after separation into four separate
tubes, the oligonucleotide tags of successfully circularized probes
("chip tags") are amplified by PCR using primers that are labeled
with one of four oligonucleotide tags ("label tag"), such that
there is a unique label tag for each tube, i.e. corresponding to A,
C, G, and T, respectively. Such amplified and labeled chip tags are
hybridized to a GenFlex array as described in the above references,
after which haptenized oligonucleotides comprising complementary
sequences of the label tags are hybridized to the array. Four
different haptens (fluorescein, biotin, dinitrophenol, dansyl) are
employed such that each different label tag has a different hapten
attached. A mixture of four fluorescently labeled anti-hapten
binding compounds is added to the array, wherein each different
hapten-specific binding compound has a different fluorescent dye
attached. The identity of the polymorphism detected by each probe
is then resolved by detecting and characterizing the fluorescence
signal generated at each hybridization site on the array, using
commercially available data analysis software, e.g. from ParAllele
Bioscience, Inc. (South San Francisco, Calif.) or Affymetrix (Santa
Clara, Calif.).
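The idea of resolving a polymorphism from the four fluorescence
channels at a hybridization site can be illustrated with a very
simplified Python sketch. It is not the algorithm of the commercial
software named above; the function name, the fractional threshold,
and the signal values are all hypothetical and chosen only to show
how a homozygous site and a heterozygous site would differ.

```python
def call_bases(channel_signals, min_fraction=0.3):
    """Return the bases whose channel carries a large share of total signal.

    channel_signals: dict mapping base ("A", "C", "G", "T") -> signal
    min_fraction:    hypothetical cutoff on a channel's share of the total
    """
    total = sum(channel_signals.values())
    if total <= 0:
        return ()  # no usable signal at this site
    return tuple(
        sorted(
            base
            for base, signal in channel_signals.items()
            if signal / total >= min_fraction
        )
    )


# One dominant channel suggests a homozygote; two suggest a heterozygote.
print(call_bases({"A": 900, "C": 20, "G": 30, "T": 50}))   # ('A',)
print(call_bases({"A": 480, "C": 20, "G": 470, "T": 30}))  # ('A', 'G')
```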
More particularly, molecular inversion probe assays are carried out
in the following steps: (1) Denaturing sample and annealing probe:
Four identical reactions containing 400 ng of genomic DNA for a
sample, 12 amol each of 1644 probes, 0.0625 units Ampligase
(Epicentre) and 0.5 units Stoffel fragment DNA polymerase (Applied
Biosystems) in 9 μL of 20 mM Tris-HCl (pH 8.3), 25 mM KCl, 10 mM
MgCl2, 0.5 NAD and 0.01% Triton X-100 are incubated for 4 min
at 20° C., 5 min at 95° C. and 15 min at 60° C.
(2) Gap-fill reaction: 1 μL of each of four nucleotides is
added to the four reactions and incubated for 10 min at 60° C.
and then 1 min at 37° C. (3) Exonuclease selection: 10
units exonuclease I and 200 units exonuclease III (United States
Biochemical) in a 2-μL volume are added and the mixture
incubated for 14 min at 37° C., 2 min at 95° C. and 1
min at 37° C. (4) Uracil depurination and cleavage: 2 units
of uracil-N-glycosylase (New England Biolabs) is added in 25 μL
of 1.6 mM MgCl2, 10 mM Tris-HCl (pH 8.3), 50 mM KCl and
incubated for 9 min at 37° C. and 20 min at 95° C.
(5) First amplification: 2 units of AmpliTaq Gold (Applied
Biosystems), 16 pmol first primer specific for first primer binding
site (302), and 16 pmol second primer specific for second primer
binding site (306) in 25 μL of 1.6 mM MgCl2, 10 mM Tris-HCl
(pH 8.3), 50 mM KCl and 112 μM dNTP are preactivated for 10 min
at 95° C. and then added to the reaction mixtures. The
reactions are amplified in 28 cycles of 95° C. for 20 s,
65° C. for 45 s and 72° C. for 10 s. (6) Second
amplification: For each of the four reaction mixtures, in separate
PCRs, oligonucleotide tags of the molecular inversion probes are
amplified and labeled with a "label tag." A separate and
non-cross-hybridizing label tag is provided for each reaction
mixture, i.e. one each for "A," "C," "G," and "T." Third and fourth
primers anneal to third and fourth primer binding sites flanking
the oligonucleotide tags ("chip tags") of the molecular inversion
probes. Each third primer contains a label tag at its 5' end, and
each fourth primer contains the sequence of a Dra I recognition
site. (7) Tag processing: 20 units of exonuclease I and 10 units
Dra I (New England Biolabs) are incubated with 60 μL of each
amplification product at 37° C. for 1 h and then 80° C.
for 30 min. (8) Microarray hybridization: Approximately 1.25
pmol of each amplified and processed product are hybridized
overnight at 39° C. to a GenFlex Tag Array (Affymetrix) DNA
array with 55 μL 2×MES, 2.2 μL 50× Denhardt
buffer, 1.1 fmol (each) GenFlex control oligonucleotide
(Affymetrix). (9) Wash and Staining: After washing with SSPE, or
like buffer, the following are hybridized to the GenFlex array:
haptenized oligonucleotides complementary to each of the label tags
together with a "spacer" oligonucleotide complementary to the
non-label tag portion of the third primer. After an additional
wash, fluorescently labeled anti-hapten antibodies are applied to
the GenFlex array followed by further washing. Data analysis is
performed on the raw signal data for each array feature generated
by the Affymetrix image analysis software.
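Some of the quantities in steps (1) and (5) can be checked with a
short Python sketch. The function names are hypothetical, the
per-probe concentration treats the 9 microliter volume of step (1) as
the full reaction volume, and the cycling time counts only the stated
hold times, ignoring ramping and the 10 minute preactivation.

```python
ATTOMOLE = 1e-18    # moles
MICROLITER = 1e-6   # liters


def total_probe_amol(amount_amol_each, probe_count):
    """Total probe amount (amol) in one annealing reaction."""
    return amount_amol_each * probe_count


def per_probe_molar(amount_amol_each, volume_ul):
    """Molar concentration of each individual probe."""
    return (amount_amol_each * ATTOMOLE) / (volume_ul * MICROLITER)


def cycling_hold_seconds(cycles, steps):
    """Sum of hold times over all cycles; steps are (temp_c, seconds) pairs."""
    return cycles * sum(seconds for _temp_c, seconds in steps)


# Step (1): 12 amol each of 1644 probes in 9 uL.
print(total_probe_amol(12, 1644))  # 19728
concentration = per_probe_molar(12, 9)  # roughly 1.3 picomolar per probe

# Step (5): 28 cycles of 95 C for 20 s, 65 C for 45 s, 72 C for 10 s.
FIRST_AMPLIFICATION_STEPS = [(95, 20), (65, 45), (72, 10)]
print(cycling_hold_seconds(28, FIRST_AMPLIFICATION_STEPS))  # 2100
```

Under these assumptions each annealing reaction holds about 19.7 fmol
of probe in total, and the first amplification takes 35 minutes of
hold time.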
The above teachings are intended to illustrate the invention and do
not by their details limit the scope of the claims of the
invention. While preferred illustrative embodiments of the present
invention are described, it will be apparent to one skilled in the
art that various changes and modifications may be made therein
without departing from the invention, and it is intended in the
appended claims to cover all such changes and modifications that
fall within the true spirit and scope of the invention.
* * * * *
References