U.S. patent application number 10/254828 was filed with the patent office on 2002-09-25 and published on 2003-05-01 for parallel primer extension approach to nucleic acid sequence analysis.
This patent application is currently assigned to Baylor College of Medicine. Invention is credited to Caskey, C. Thomas, Metspalu, Andres, Shumaker, John.
Application Number | 20030082613 10/254828 |
Document ID | / |
Family ID | 35810584 |
Filed Date | 2002-09-25 |
United States Patent
Application |
20030082613 |
Kind Code |
A1 |
Caskey, C. Thomas; et al. |
May 1, 2003 |
Parallel primer extension approach to nucleic acid sequence analysis
Abstract
A method of analyzing a polynucleotide of interest, comprising
providing one or more sets of consecutive oligonucleotide primers
differing within each set by one base at the growing end thereof,
annealing a single strand of the polynucleotide or a fragment of
the polynucleotide to the oligonucleotide primers under
hybridization conditions; subjecting the primers to single base
extension reactions with a polymerase and terminating nucleotides,
the terminating nucleotides being mutually distinguishable; and
observing the location and identity of each terminating nucleotide
to thereby analyze the sequence or a part of the nucleotide
sequence of the polynucleotide of interest, is disclosed. An
apparatus comprising a solid support to which is attached at
defined locations thereon one or more sets of consecutive
oligonucleotide primers differing within each set by one base at
the growing end thereof is also described.
Inventors: |
Caskey, C. Thomas; (Houston, TX); Shumaker, John; (Houston, TX);
Metspalu, Andres; (Tartu, EE) |
Correspondence
Address: |
HAMILTON, BROOK, SMITH & REYNOLDS, P.C.
530 VIRGINIA ROAD
P.O. BOX 9133
CONCORD
MA
01742-9133
US
|
Assignee: |
Baylor College of Medicine
|
Family ID: |
35810584 |
Appl. No.: |
10/254828 |
Filed: |
September 25, 2002 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10254828 | Sep 25, 2002 | |
09711476 | Nov 13, 2000 | |
08564100 | Mar 6, 1996 | 6153379 |
PCT/US94/07086 | Jun 22, 1994 | |
Current U.S.
Class: |
435/6.14 ;
435/91.2 |
Current CPC
Class: |
C12Q 1/6837 20130101;
C12Q 1/6858 20130101; C12Q 1/6837 20130101; C12Q 1/6858 20130101;
C12Q 2535/125 20130101; C12Q 2565/537 20130101; C12Q 2535/125
20130101; C12Q 1/6874 20130101; C12Q 2525/186 20130101; C12Q
2525/186 20130101; C12Q 2563/107 20130101; C12Q 2523/107 20130101;
C12Q 2525/186 20130101; C12Q 2525/204 20130101; C12Q 2521/319
20130101; C12Q 1/6874 20130101; C12Q 2535/125 20130101; C12Q 1/6837
20130101 |
Class at
Publication: |
435/6 ;
435/91.2 |
International
Class: |
C12Q 001/68; C12P
019/34 |
Government Interests
[0001] This invention was made with Government Support under grant
number 5-R01-DK31 428-11 awarded by the National Institutes of
Health. The United States Government has certain rights in the
invention.
Foreign Application Data
Date |
Code |
Application Number |
Jun 22, 1993 |
SE |
SE9302152-5 |
Claims
What is claimed is:
1. A method of analyzing the sequence of a polynucleotide of
interest, comprising the steps of: a) annealing a polynucleotide of
interest to free oligonucleotide primers having known sequences of
N nucleotides in length to generate annealed primers; b) subjecting
the annealed primers to a single base extension reaction to extend
the annealed primers by the addition of a terminating nucleotide;
c) observing the identity of each terminating nucleotide that has
been added to the annealed primers.
2. A method of analyzing the sequence of a polynucleotide of
interest, comprising the steps of: a) annealing a polynucleotide of
interest to oligonucleotide primers having known sequences of N
nucleotides in length under hybridization conditions, to generate
annealed primers; b) subjecting the annealed primers to a single
base extension reaction which comprises providing to the annealed
primers nucleotides corresponding to each of the four bases, to
extend the annealed primers by the addition of a terminating
nucleotide; c) observing the identity and location of each
terminating nucleotide that has been added to the annealed
primers.
3. A method of analyzing the sequence of a polynucleotide of
interest, comprising the steps of: a) attaching an array of
oligonucleotide primers having known sequences of N nucleotides in
length to a solid support at known locations; b) annealing the
polynucleotide of interest to the array of oligonucleotide primers
to generate annealed primers; c) subjecting the annealed primers to
a single base extension reaction to extend the annealed primers by
the addition of a terminating nucleotide; d) observing the identity
and location of each terminating nucleotide within the array on the
solid support.
4. A method of analyzing the sequence of a polynucleotide of
interest, comprising the steps of: a) attaching an array of
oligonucleotide primers having known sequences of N nucleotides in
length to a solid support at known locations; b) annealing the
polynucleotide of interest to the array of oligonucleotide primers
to generate annealed primers; c) subjecting the annealed primers to
a single base extension reaction to extend the annealed primers by
the addition of a terminating nucleotide; d) selecting a starting
annealed primer; e) observing the identity and location of the
terminating nucleotide which has been added to the starting
annealed primer, to determine the next nucleotide in sequence; f)
selecting a second annealed primer which has the same nucleotide
sequence as nucleotides 2 through N of the starting annealed primer
nucleotide plus the next nucleotide in sequence as determined in
step (e), and g) repeating steps (e) and (f), using the second
annealed primer as the starting annealed primer for each
repetition, to determine the sequence of the polynucleotide of
interest.
5. A method of analyzing the sequence of a polynucleotide of
interest, comprising the steps of: a) attaching an array of
oligonucleotide primers, having known sequences of N nucleotides in
length to a solid support at defined locations; b) annealing the
polynucleotide of interest to the array of oligonucleotide primers
under hybridization conditions, to generate annealed primers; c)
subjecting the annealed primers to a single base extension reaction
which comprises providing to the annealed primers nucleotides
corresponding to each of the four bases, to extend the annealed
primers by the addition of a terminating nucleotide; d) observing
the identity and location of each terminating nucleotide within the
array on the solid support.
6. A method of analyzing the sequence of a polynucleotide of
interest, comprising the steps of: a) attaching an array of
oligonucleotide primers, having known sequences of N nucleotides in
length to a solid support at defined locations; b) annealing the
polynucleotide of interest to the array of oligonucleotide primers
under hybridization conditions, to generate annealed primers; c)
subjecting the annealed primers to a single base extension reaction
which comprises providing to the annealed primers nucleotides
corresponding to each of the four bases, to extend the annealed
primers by the addition of a terminating nucleotide; d) selecting a
starting annealed primer; e) observing the identity and location of
the terminating nucleotide which has been added to the starting
annealed primer, to determine the next nucleotide in sequence; f)
selecting a second annealed primer which has the same nucleotide
sequence as nucleotides 2 through N of the starting annealed primer
nucleotide plus the next nucleotide in sequence as determined in
step (e), and g) repeating steps (e) and (f), using the second
annealed primer as the starting annealed primer for each
repetition, to determine the sequence of the polynucleotide of
interest.
7. The method of any one of claims 1 to 6, wherein the single base
extension reaction comprises subjecting the annealed primers to a
reaction mixture comprising a polymerase and nucleotides
corresponding to each of the four bases.
8. The method of any one of claims 5 to 7, wherein the nucleotides
corresponding to each of the four bases are mutually
distinguishable.
9. The method of claim 8, wherein three of the four nucleotides are
differently labelled.
10. The method of claim 9, wherein the three differently labelled
nucleotides are fluorescently labelled.
11. The method of any one of claims 1 to 10, further comprising
analyzing the sequence of the complementary polynucleotide of
interest.
12. The method of any one of claims 1 to 11, wherein the
terminating nucleotides are dideoxynucleotides.
13. The method of any one of claims 1 to 12, wherein the length N
of the oligonucleotide primers is between 7 and 30 inclusive.
14. The method of any one of claims 1 to 13, wherein the length N
of the oligonucleotide primers is between 20 and 24 inclusive.
15. The method of any one of claims 1, 2, 13 or 14, wherein the
oligonucleotide primers comprise oligonucleotide primers of
different lengths.
16. The method of any one of claims 1 to 15, wherein observing the
identity and location of a terminating nucleotide comprises the use
of a charge coupled device or a photomultiplier tube.
17. The method of any one of claims 3 to 14 or 16, wherein the
terminating nucleotides are removed from the annealed primers after
completed analysis to prepare the solid support for reuse.
18. The method of any one of claims 1 to 17, wherein the
terminating nucleotides are dinucleotides.
19. An apparatus for analyzing the sequence of a polynucleotide of
interest, comprising a solid support having attached thereon at
defined locations an array of oligonucleotide primers having known
sequences.
20. The apparatus of claim 19, wherein the oligonucleotide primers
are attached to the solid support by a specific binding pair.
21. The apparatus of claim 20, wherein the specific binding pair is
biotin and a molecule selected from the group consisting of:
avidin and streptavidin.
Description
RELATED APPLICATIONS
[0002] This application is a Continuation of U.S. patent
application Ser. No. 08/564,100, which is the U.S. National Phase
Application of PCT/US94/07086, filed Jun. 22, 1994, which is a
Continuation-in-Part Application of Sweden Application No. SE
9302152-5, filed on Jun. 22, 1993. The entire teachings of the
above applications are incorporated herein by reference.
BACKGROUND OF THE INVENTION
[0003] Today, there are two predominant methods for DNA sequence
determination: the chemical degradation method (Maxam and Gilbert,
Proc. Natl. Acad. Sci., 74:560-564 (1977)), and the dideoxy chain
termination method (Sanger et al., Proc. Natl. Acad. Sci.,
74:5463-5467 (1977)). Most automated sequencers are based on the
chain termination method utilizing fluorescent detection of product
formation. There are two common variations of these systems: (1)
dye-labeled primers to which deoxynucleotides and
dideoxynucleotides are added, and (2) primers to which
deoxynucleotides and fluorescently labeled dideoxynucleotides are
added. In addition, the labeled deoxynucleotides can be used in
conjunction with unlabeled dideoxynucleotides. This method is based
upon the ability of an enzyme to add specific nucleotides onto the
3' hydroxyl end of a primer annealed to a template. The base
pairing property of nucleic acids determines the specificity of
nucleotide addition. The extension products are separated
electrophoretically on a polyacrylamide gel and detected by an
optical system utilizing laser excitation.
[0004] Although both the chemical degradation method and the
dideoxy chain termination method are in widespread use, there are
many associated disadvantages: for example, the methods require
gel-electrophoretic separation. Typically, only 400-800 base pairs
can be sequenced from a single clone. As a result, the systems are
both time- and labor-intensive. Methods avoiding gel separation
have been developed in attempts to increase the sequencing
throughput.
[0005] Methods have been proposed by Crkvenjakov (Drmanac et al.,
Genomics, 4:114 (1989); Strezoska et al., Proc. Natl. Acad. Sci.
USA, 88:10089 (1991); Drmanac et al., Science, 260: 1649 (1991))
and Bains and Smith (Bains and Smith, J. Theoretical Biol., 135:
303 (1988)). These sequencing by hybridization (SBH) methods
potentially can increase the sequence throughput because multiple
hybridization reactions are performed simultaneously. This type of
system utilizes the information obtained from multiple
hybridizations of the polynucleotide of interest, using short
oligonucleotides to determine the nucleic acid sequence (Drmanac,
U.S. Pat. No. 5,202,231). To reconstruct the sequence requires an
extensive computer search algorithm to determine the optimal order
of all fragments obtained from the multiple hybridizations.
[0006] These methods are problematic in several respects. For
example, the hybridization is dependent upon the sequence
composition of the duplex of the oligonucleotide and the
polynucleotide of interest, so that GC-rich regions are more stable
than AT-rich regions. As a result, false positives and false
negatives during hybridization detection are frequently present and
complicate sequence determination. Furthermore, the sequence of the
polynucleotide is not determined directly, but is inferred from the
sequence of the known probe, which increases the possibility for
error. A great need remains to develop efficient and accurate
methods for nucleic acid sequence determination.
SUMMARY OF THE INVENTION
[0007] The current invention pertains to methods for analyzing, and
particularly for sequencing, a polynucleotide of interest, and an
apparatus useful in analyzing a polynucleotide of interest. In one
embodiment of the current invention, the nucleotide sequence of a
polynucleotide of interest is analyzed for the presence of mutations
or alterations. In a second embodiment of the current invention,
the nucleotide sequence of a polynucleotide of interest, for which
the nucleotide sequence was not known previously, is determined. The
method comprises detecting single base extension events of a set of
specific oligonucleotide primers, such that the label and position
of each separate extension event defines a base in a polynucleotide
of interest.
[0008] In one method of the current invention, a solid support is
provided. An array of a set or several sets of consecutive
oligonucleotide primers of a specified size having known sequences
is attached at defined locations to the solid support. The
oligonucleotide primers differ within each set by one base pair.
The oligonucleotide primers either correspond to at least a part of
the nucleotide sequence of one strand of the polynucleotide of
interest, if the sequence is known, or represent a set of all
possible nucleotide sequences for oligonucleotide primers of the
specified size, if the sequence is not known. A polynucleotide of
interest, which may be DNA or RNA, or a fragment of the
polynucleotide of interest, is annealed to the array of
oligonucleotide primers under hybridization conditions, thereby
generating "annealed primers." The annealed primers are subjected
to single base extension reaction conditions, under which a nucleic
acid polymerase and terminating nucleotides, such as
dideoxynucleotides (ddNTPs) corresponding to the four known bases
(A, G, T and C), are provided to the annealed primers. The
terminating nucleotides can also comprise a terminating string of
known polynucleotides, such as dinucleotides. As a result of the
single base extension reaction, extended primers are generated, in
which a terminating nucleotide is added to each of the annealed
primers. The terminating nucleotides can be provided to the
annealed primers either simultaneously or sequentially. The
terminating nucleotides are mutually distinguishable; i.e., at
least one of the nucleotides is labeled to facilitate detection.
After addition of the terminating nucleotides, the sequence of the
polynucleotide of interest is analyzed by "reading" the
oligonucleotide array: the identity and location of each
terminating nucleotide within the array on the solid support is
observed. The label and position of each terminating nucleotide on
the solid support directly defines the sequence of the
polynucleotide of interest that is being analyzed.
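The array read-out described above, combined with the primer-to-primer walking recited in claims 4 and 6, can be sketched in Python. This is a minimal illustration under stated assumptions, not the applicants' implementation: the names `simulate_readout` and `walk_sequence` are hypothetical, hybridization is assumed to be error-free, and every sequence is written 5' to 3' in the sense of the strand being synthesized (the complement of the template), so no explicit complementing is needed.

```python
def simulate_readout(synth_strand, n):
    """Return {primer: added_base} for each N-mer primer that would anneal.

    synth_strand is the complement of the template, written 5'->3'
    (an assumption of this sketch). A primer anneals wherever its sequence
    occurs, and the terminating nucleotide added is the base that follows.
    A primer that would give conflicting signals is mapped to None.
    """
    readout = {}
    for i in range(len(synth_strand) - n):
        primer = synth_strand[i:i + n]
        added = synth_strand[i + n]
        if primer in readout and readout[primer] != added:
            readout[primer] = None  # same primer, conflicting signals
        else:
            readout[primer] = added
    return readout


def walk_sequence(readout, start_primer):
    """Walk from primer to primer, appending one observed base per step."""
    sequence = start_primer
    primer = start_primer
    visited = set()
    while readout.get(primer) is not None and primer not in visited:
        visited.add(primer)
        base = readout[primer]
        sequence += base
        # Next primer: nucleotides 2..N of the current primer plus the new base.
        primer = primer[1:] + base
    return sequence


if __name__ == "__main__":
    readout = simulate_readout("ATGCGTACCTGA", 4)
    print(walk_sequence(readout, "ATGC"))
```

The walk stops when no unambiguous signal is available for the next primer, which is where a repeated N-mer in the target would require a longer primer length or data from the complementary strand.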
[0009] In a second method of the current invention, the
polynucleotide of interest is analyzed for the presence of specific
mutations through the use of oligonucleotide primers that are not
attached to a solid support. The oligonucleotide primers are
tailored to anneal to the polynucleotide of interest at a point
immediately preceding the mutation site(s). If more than one
mutation site is examined, the oligonucleotide primers are designed
to be mutually distinguishable: in a preferred embodiment, the
oligonucleotide primers have different mobilities during gel
electrophoresis. For example, oligonucleotides of different lengths
are used. After the oligonucleotide primers are annealed to the
polynucleotide of interest, the annealed primers are subjected to
single base extension reaction conditions, resulting in extended
primers in which terminating nucleotides are added to each of the
annealed primers. As in the first method of the current invention,
the terminating nucleotides are mutually distinguishable. After
addition of the terminating nucleotides, the sequence of the
polynucleotide of interest is analyzed by eluting the extended
primers, performing gel electrophoresis, and "reading" the gel: the
identity and location of each terminating nucleotide on the gel is
observed using standard methods, such as with an automated DNA
sequencer. The label and position of each terminating nucleotide on
the gel directly defines the sequence of the polynucleotide of
interest that is being analyzed, and indicates whether a mutation
is present.
[0010] The apparatus of the current invention comprises a solid
support having an array of one or more sets of consecutive
oligonucleotide primers with known sequences attached to it at
defined locations, each oligonucleotide primer differing within
each set by one base pair. The set of oligonucleotide primers
either corresponds to at least a part of the nucleotide sequence of
one strand of the polynucleotide of interest, if the sequence is
known, or represents all possible nucleotide sequences for
oligonucleotide primers of the specified size, if the sequence is
not known.
[0011] The current invention provides both direct information, due
to the detection of a specific nucleotide addition, and indirect
information, due to the known sequence of the annealed primer to
which the specific base addition occurred, for the polynucleotide
of interest. The ability to determine nucleic acid sequences is a
critical element of understanding gene expression and regulation.
In addition, as advances in molecular medicine continue, sequence
determination will become a more important element in the diagnosis
and treatment of disease.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] FIG. 1 depicts an example of a set of oligonucleotide
primers (sense primers, SEQ ID NOS: 2-11, antisense primers, SEQ ID
NOS: 12-21), comprising consecutive primers differing by one base
pair at the growing end and capable of hybridizing successively
along the relevant part(s) or the whole of the polynucleotide of
interest (SEQ ID NOS: 1 and 22).
[0013] FIG. 2 is a schematic illustration of a single strand
template bound to a primer which is in turn attached to a solid
support.
[0014] FIG. 3 illustrates a set of consecutive oligonucleotide
primers for a part of the polynucleotide of interest following
immediately after the primer illustrated in FIG. 2.
[0015] FIG. 4 illustrates the single base pair additions to all the
primers illustrated in FIG. 3, as well as the corresponding
additions for the corresponding primers related to the
complementary strand of the polynucleotide of interest.
[0016] FIG. 5 is a graphic depiction of the length of extended
primers formed utilizing free oligonucleotide primers annealed to a
polynucleotide of interest.
[0017] FIGS. 6A, 6B and 6C are graphic depictions of
electrophoretograms demonstrating the detection of the presence of
a mutation in a polynucleotide of interest.
[0018] FIGS. 7A, 7B and 7C depict the results of a DNA chip-based
analysis for a five-base region within the third exon of the HPRT
gene.
DETAILED DESCRIPTION OF THE INVENTION
[0019] The current invention pertains to methods for analyzing the
nucleotide sequence of a polynucleotide of interest. The method
comprises hybridizing all or a fragment of a polynucleotide of
interest to oligonucleotide primers, conducting single base
extension reactions, and detecting the single base extension
events. The method can be used to analyze the sequence of a
polynucleotide of interest by examining the sequence for the presence
of mutations or alterations in the nucleotide sequence, or by
determining the nucleotide sequence of a polynucleotide of
interest.
[0020] As used herein, the term "polynucleotide of interest" refers
to the particular polynucleotide for which sequence information is
wanted. Representative polynucleotides of interest include
oligonucleotides, DNA or DNA fragments, RNA or RNA fragments, as
well as genes or portions of genes. The polynucleotide of interest
can be single- or double-stranded. The term "template
polynucleotide of interest" is used herein to refer to the strand
which is analyzed, if only one strand of a double-stranded
polynucleotide is analyzed, or to the strand which is identified as
the first strand, if both strands of a double-stranded
polynucleotide are analyzed. The term, "complementary
polynucleotide of interest" is used herein to refer to the strand
which is not analyzed, if only one strand of a double-stranded
polynucleotide is analyzed, or to the strand which is identified as
the second strand (i.e., the strand that is complementary to the
first (template) strand), if both strands of a double-stranded
polynucleotide are analyzed. Either one of the two strands can be
analyzed. In a preferred embodiment, both strands of a
double-stranded polynucleotide of interest are analyzed in order to
verify sequence information obtained from the template (first)
strand by comparison with the complementary (second) strand.
Nevertheless, it is not always necessary to analyze both strands.
For example, if the polynucleotide of interest is being analyzed
for the presence of a single base mutation, and not for the
complete base sequence in the mutation region, it is sufficient to
analyze a single strand of the polynucleotide of interest.
[0021] The methods of the current invention can be used to identify
the presence of mutations or alterations in the nucleotide sequence
of a polynucleotide of interest. To identify mutations or alterations,
the sequence of the polynucleotide of interest is compared with the
sequence of the native or normal polynucleotide. An "alteration" in
the polynucleotide of interest, as used herein, refers to a
deviation from the expected sequence (the sequence of the native or
normal polynucleotide), including deletions, insertions, point
mutations, frame-shifts, expanded oligonucleotide repeats, or other
changes. The portion of the polynucleotide of interest that
contains the alteration is known as the "altered" region. The
methods can also be used to determine the nucleotide sequence of a
polynucleotide of interest having a previously unknown nucleotide
sequence.
[0022] In one embodiment of the current invention, the
polynucleotide of interest is analyzed by annealing the
polynucleotide to an array comprising sets of oligonucleotide
primers. The oligonucleotide primers in the array have a length N,
where N is from about 7 to about 30 nucleotides, inclusive, and is
preferably from 20 to 24 nucleotides, inclusive. Each
oligonucleotide primer within each set differs by one base pair.
The oligonucleotide primers can be prepared by conventional methods
(see Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd
Ed, 1989)). The sets of oligonucleotide primers are arranged into
an array, such that the position and nucleotide content of each
oligonucleotide primer on the array is known.
[0023] The size and nucleotide content of the oligonucleotide
primers in the array depend on the polynucleotide of interest and
the region of the polynucleotide of interest for which sequence
information is desired. To analyze a polynucleotide of interest for
the presence of alterations, consecutive primers differing by one
base pair at the growing end and capable of hybridizing
successively along the relevant part(s) or the whole of the
polynucleotide are used. An example of such a primer set is shown
in FIG. 1. If only one or a few specific positions of the
polynucleotide sequence are examined for alterations, the necessary
array of oligonucleotide primers covers only the mutation regions,
and is therefore small. If the whole or a major part of the
polynucleotide of interest is to be analyzed for possible mutations
at varying positions, the necessary array is larger. For example,
the whole hypoxanthine-guanine phosphoribosyl-transferase (HPRT)
gene can be covered by 900 primers, arranged in a 30×30
array; the whole p53 gene requires 700 primers. If both strands of
a double-stranded polynucleotide of interest are analyzed for the
presence of alterations, the array comprises consecutive
oligonucleotide primers for the suspected mutation region of both
the template polynucleotide of interest and the complementary
polynucleotide of interest. If the polynucleotide of interest has
not been sequenced previously, the array includes oligonucleotide
primers comprising all possible N-mers.
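As a rough illustration of how such a primer set could be laid out, the following Python sketch tiles a known sequence with consecutive N-mer primers for both strands and maps a primer index to a grid position. It is a sketch under stated assumptions: the helper names are hypothetical, the input is taken to be the sense strand written 5' to 3', and the row-major 30-column layout is only one possible arrangement of the 30×30 array mentioned above.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def consecutive_primers(strand, n):
    """All length-n windows of the strand, each shifted by one base."""
    return [strand[i:i + n] for i in range(len(strand) - n + 1)]


def array_position(index, columns=30):
    """Map a primer index to (row, column), assuming a row-major layout."""
    return divmod(index, columns)


def count_all_nmers(n):
    """Number of distinct primers of length n over the four bases."""
    return 4 ** n


if __name__ == "__main__":
    sense = "ACGTAC"
    print(consecutive_primers(sense, 4))                      # sense primers
    print(consecutive_primers(reverse_complement(sense), 4))  # antisense primers
    print(array_position(899))
    print(count_all_nmers(7))
```

The count of all possible N-mers grows as 4 to the power N, which is why an array for a previously unsequenced polynucleotide is much larger than one tailored to a known gene.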
[0024] The array of sets of oligonucleotide primers is immobilized
to a solid support at defined locations (i.e., known positions).
The immobilized array is referred to as a "DNA chip," which is the
apparatus of the current invention. The solid support can be a
plate or chip of glass, silicon, or other material. The solid
support can also be coated, such as with gold or silver. Coating
may facilitate attachment of the oligonucleotide primers to the
surface of the solid support. The oligonucleotide primers can be
bound to the solid support by a specific binding pair, such as
biotin and avidin or biotin and streptavidin. For example, the
primers can be provided with biotin handles in connection with
their preparation, and then the biotin-labeled primers can be
attached to a streptavidin-coated support. Alternatively, the
primers can be bound by a linker arm, such as a covalently bonded
hydrocarbon chain, such as a C10-20 chain. The primers can
also be bound directly to the solid support, such as by
epoxide/amine coupling chemistry (see Eggers, M. D. et al.,
Advances in DNA sequencing Technology, SPIE conference proceedings,
Jan. 21, 1993). The solid support can be reused, as described in
greater detail below.
[0025] In another embodiment of the invention, the polynucleotide
of interest is analyzed by annealing the polynucleotide to one or
more specific oligonucleotide primers that are not attached to a
solid support; such oligonucleotide primers are referred to herein
as "free oligonucleotide primers." If free oligonucleotide primers
are used, the polynucleotide of interest can be attached to a solid
support, such as magnetic beads. The free oligonucleotide primers
have a length N, as described above, and are prepared by
conventional methods (see Sambrook et al., Molecular Cloning: A
Laboratory Manual (2nd Ed, 1989)). The size and nucleotide content
of the free oligonucleotide primers depend on the polynucleotide of
interest and the region of the polynucleotide of interest for which
sequence information is desired. To analyze a polynucleotide of
interest for the presence of alterations, primers capable of
hybridizing immediately adjacent to the relevant part(s) of the
polynucleotide are used. If more than one position of the
polynucleotide sequence is examined for alterations, the free
oligonucleotide primers are mutually distinguishable: i.e., the
oligonucleotide primers have different mobilities during gel
electrophoresis. In a preferred embodiment, oligonucleotides of
different lengths are used. For example, an oligonucleotide primer
of 10 nucleotides in length is designed to hybridize immediately
adjacent to one putative mutation, and an oligonucleotide primer of
12 nucleotides in length is designed to hybridize immediately
adjacent to a second putative mutation. Because the oligonucleotide
primers are of different lengths, they will migrate to different
positions on the gel. Thus, in this manner, the nucleotide content
of each oligonucleotide primer can be identified by the position of
the oligonucleotide primer on the gel.
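The assignment of gel signals to mutation sites can be sketched as follows. This is an illustrative Python example only: the function name `call_sites`, the site names, and the data layout are hypothetical, and the sketch assumes that each extended primer runs as a fragment exactly one nucleotide longer than its primer, ignoring any mobility shift caused by the label.

```python
def call_sites(primer_lengths, peaks, expected):
    """Assign gel peaks to mutation sites and flag deviations.

    primer_lengths: {site_name: primer length in nucleotides}
    peaks:          list of (fragment_length, labelled_base) read from the gel
    expected:       {site_name: base expected for the normal sequence}
    """
    # An extended primer is one nucleotide longer than the primer itself.
    site_by_length = {length + 1: site for site, length in primer_lengths.items()}
    observed = {}
    for fragment_length, base in peaks:
        site = site_by_length.get(fragment_length)
        if site is not None:
            observed.setdefault(site, set()).add(base)
    return {
        site: {"bases": sorted(bases), "altered": bases != {expected[site]}}
        for site, bases in observed.items()
    }


if __name__ == "__main__":
    primers = {"site_1": 10, "site_2": 12}   # lengths from the example above
    peaks = [(11, "G"), (13, "T")]           # hypothetical gel read-out
    normal = {"site_1": "G", "site_2": "C"}  # hypothetical expected bases
    print(call_sites(primers, peaks, normal))
```

Two different bases at the same fragment length are reported together and flagged as altered, which is how a mixed signal at one site would appear in this representation.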
[0026] The polynucleotide of interest is hybridized to the array of
oligonucleotide primers, or to the free oligonucleotide primers,
under high stringency conditions, so that an exact match between the
polynucleotide of interest and the oligonucleotide primers is
obtained, without any base-pair mismatches (see Sambrook et al.,
Molecular Cloning: A Laboratory Manual (2nd Ed, 1989)). For
example, a hypothetical polynucleotide of interest annealed to an
oligonucleotide primer that is attached to a solid support is shown
schematically in FIG. 2. In FIG. 2, a
part of the sequence of the polynucleotide of interest that follows
immediately after the portion of the polynucleotide that is bound
to the oligonucleotide primer on the array is shown as TGCAACTA.
Six corresponding consecutive primers are shown in FIG. 3, i.e.
primers ending with the pairing bases A, AC, ACG, etc. If the
polynucleotide of interest is double-stranded, it can be separated
into two single strands either before or after the binding of the
polynucleotide of interest to the array oligonucleotide primers.
Both the template and the complementary polynucleotide of interest
can be analyzed utilizing a single array. Thus, while not shown in
FIG. 2, appropriate primers corresponding to the complementary
polynucleotide of interest are also attached to the solid support
in known positions.
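The relationship between the template segment TGCAACTA and the growing ends of the six consecutive primers can be worked through in a few lines of Python. The function name `growing_ends` is hypothetical, and the sketch assumes the template segment is written in the direction in which the primer grows, so that pairing bases are obtained position by position without reversing the string.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def growing_ends(template_tail, count):
    """Return (primer_growing_end, next_terminating_base) pairs.

    template_tail is the template sequence immediately following the region
    bound by the first primer, written in the direction of primer growth
    (an assumption of this sketch).
    """
    paired = "".join(COMPLEMENT[base] for base in template_tail)
    # The k-th consecutive primer ends with the first k pairing bases and is
    # extended by the base pairing with template position k + 1.
    return [(paired[:k], paired[k]) for k in range(1, count + 1)]


if __name__ == "__main__":
    for end, added in growing_ends("TGCAACTA", 6):
        print(f"primer ending ...{end} is extended by dd{added}")
```

Under these assumptions the six primers end in A, AC, ACG, ACGT, ACGTT and ACGTTG, and the terminating nucleotides added to them read C, G, T, T, G and A in order.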
[0027] When the polynucleotide of interest is hybridized to the
array of sets of oligonucleotide primers, or to the free
oligonucleotide primers, under hybridization conditions, annealed
primers are formed. The term, "annealed primer," as used herein,
refers to an oligonucleotide primer (either free or attached to a
solid support) to which a polynucleotide of interest is hybridized.
The annealed primers are subjected to a single base extension
reaction. The "single base extension reaction," as used herein,
refers to a reaction in which the annealed primers are provided
with a reaction mixture comprising a DNA polymerase, such as T7
polymerase, and terminating nucleotides under conditions such that
single terminating nucleotides are added to each of the annealed
primers. The term "terminating nucleotides," as used herein, refers
to either single terminating nucleotides, or units of nucleotides,
the units preferably being dinucleotides. In a preferred
embodiment, the terminating nucleotides are single
dideoxynucleotides. The terminating nucleotides can comprise
standard nucleotides, and/or nucleotide analogues. The terminating
nucleotide added to each annealed primer is thus a base pairing
with the template base on the polynucleotide of interest, and is
added immediately adjacent to the growing end of the respective
primer. An oligonucleotide primer to which a terminating nucleotide
has been added through the single base extension reaction is termed
an "extended primer." Thus, as schematically shown for both strands
of the hypothetical polynucleotide of interest in FIG. 4, a single
nucleotide is added to each primer in the array; the primer set
related to the strand illustrated in FIG. 2 is shown to the left in
FIG. 4, and the other (complementary) strand is shown to the right.
The nucleotides added are shown in extra bold type.
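The single base extension step can be illustrated with a minimal sketch. It assumes ideal base pairing, a template written 5' to 3', and a primer that is the exact reverse complement of a stretch of that template; the function names are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: models ideal single base extension on a DNA template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def single_base_extension(template, primer):
    """Return the terminating base added to the 3' (growing) end of `primer`.

    Both arguments are written 5'->3'. The primer anneals to the template
    site that is its reverse complement, and the added base pairs with the
    template base immediately 5' of that site. Returns None if the primer
    does not anneal or if no template base remains to be copied.
    """
    site = template.find(reverse_complement(primer))
    if site <= 0:
        return None
    return COMPLEMENT[template[site - 1]]
```

Under these assumptions, the extended primer is simply the original primer plus the returned base.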
[0028] The terminating nucleotides preferably comprise dNTPs, and
particularly comprise dideoxynucleotides (ddNTPs), but other
terminating nucleotides apparent to the skilled person can also be
used. If the terminating nucleotides are single nucleotides, then
nucleotides corresponding to each of the four bases (A, T, G and C)
are utilized in the single base extension reaction. If the
terminating nucleotides are dinucleotide units, for example, then
nucleotides corresponding to each of the sixteen possible
dinucleotides are utilized.
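The counts of four single nucleotides and sixteen dinucleotide units follow from enumerating every combination of the four bases. A short sketch, in which `terminating_units` is a hypothetical helper name:

```python
from itertools import product

BASES = "ATGC"


def terminating_units(unit_length):
    """List every possible terminating unit of the given length."""
    return ["".join(unit) for unit in product(BASES, repeat=unit_length)]


single_units = terminating_units(1)        # 4 single nucleotides
dinucleotide_units = terminating_units(2)  # 4 * 4 = 16 dinucleotides
```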
[0029] The nucleotides are mutually distinguishable. For example,
if the solid support is coated with a free electron metal, such as
with gold or silver, surface plasmon resonance (SPR) microscopy
allows identification of each nucleotide, by the change of the
refractive index at the surface caused by each base extension.
Alternatively, at least one of the terminating nucleotides is
labeled by standard methods to facilitate detection. Suitable
labels include fluorescent dyes, chemiluminescent compounds, and
radionuclides. The number of nucleotides that are labeled can be
varied. It is sufficient to use three labeled terminating
nucleotides, the fourth terminating nucleotide being identified by
its "non-label," if single nucleotides are added in the base
extension reaction. For example, if one is examining the
polynucleotide of interest for the presence of a particular
alteration, and not for the complete base sequence in the altered
region, three labeled terminating nucleotides are sufficient. Fewer
than three labels can also be utilized under appropriate
circumstances. An exemplification of the use of two labeled and two
unlabelled dNTPs is described below. If a specific alteration is to
be investigated, such as a point mutation, only the native or
normal nucleotide need be labeled, as a mutation would be indicated
by the presence of the "non-label." Alternatively, the expected
mutant nucleotide can also be labeled.
[0030] After the single base extension reaction has been performed,
the identity and location of each terminating nucleotide is
observed. If free oligonucleotide primers are used, the extended
primers are eluted and separated by gel electrophoresis, and the
gel is then analyzed. If oligonucleotide primers attached to an
array are used, the array itself is analyzed. The gel or array is
analyzed by detecting the labeled, terminating nucleotides bound to
the oligonucleotide primers. The labeled, terminating nucleotides
are detected by conventional methods, such as by an optical system.
For example, a laser excitation source can be used in conjunction
with a filter set to isolate the fluorescence emission of a
particular type of terminating nucleotide. Either a photomultiplier
tube, a charge-coupled device (CCD), or another suitable
fluorescence detection method can be used to detect the emitted
light from fluorescent terminating nucleotides.
[0031] The sequence of the polynucleotide of interest can be
analyzed from the label pattern observed on the array or on the
gel, since the position of each different primer on the array or on
the gel is known, and since the identity of each terminating
nucleotide can be determined by its specific label. The label and
position of each terminating nucleotide either within the array or
on the gel will directly define the sequence of the polynucleotide
of interest that is being analyzed. Mutations or alterations in the
sequence of the polynucleotide of interest are indicated by
alterations in the expected label pattern. For example, assume that
the nucleotide sequence shown in FIG. 2 contains a mutation: the
third base C from the left is replaced by a G in the polynucleotide
of interest. The top primer in FIG. 3 will still be extended by a C
as shown in FIG. 4, whereas the next primer will be extended by a C
rather than a G. Since this new, unexpected base C can be
identified by its specific label and the respective primer location
is known, the corresponding base mutation is identified as G.
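The comparison of an observed label pattern with the expected one can be sketched as follows. The sketch assumes that the label at each primer location has already been decoded to a single added base; `find_alterations` is a hypothetical helper name.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_alterations(expected, observed):
    """Compare the added base at each primer location.

    `expected` and `observed` are strings of added bases, one per primer
    location. Returns a list of tuples
    (location, expected_added, observed_added, template_base), where
    template_base is the complement of the observed added base.
    """
    hits = []
    for location, (exp, obs) in enumerate(zip(expected, observed)):
        if exp != obs:
            hits.append((location, exp, obs, COMPLEMENT[obs]))
    return hits
```

Because each primer location is known, every mismatch maps directly to a position in the polynucleotide of interest.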
[0032] The following simple example illustrates the ability to
obtain complete sequence information and to identify a mutation in
a representative polynucleotide of interest. The example utilizes
two labeled terminating nucleotides, which give complete sequence
information.
[0033] Assume a normal polynucleotide of the following base pair
composition:
+ACTGCTTAG
-TGACGAATC
[0034] and a corresponding mutant polynucleotide having the
following base pair composition, which has a single base mutation
in the third base pair:
+ACCGCTTAG
-TGGCGAATC.
[0035] Using fluorescent labeling, for example, with a red label
("R") for terminating A and a green label ("G") for terminating G,
and no label, i.e., null ("N") for the remaining bases T and C, the
following "binary" codes allowing sequence interpretation would be
obtained for the normal, mutant and heterozygote sequences,
respectively:
+N-G-R-N-G-R-R-N-N   Normal
-R-N-N-G-N-N-N-R-G
+N-G-G-N-G-R-R-N-N   Mutant (Affected)
-R-N-N-G-N-N-N-R-G
     R               Heterozygote (Carrier)
+N-G-G-N-G-R-R-N-N
-R-N-N-G-N-N-N-R-G.
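The codes can be reproduced with a short sketch. It assumes that every primer anneals and extends ideally, so that the label read at each position is that of the base complementary to the template base. The function names, and the "G/R" notation for a position carrying both labels, are illustrative assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Red for terminating A, green for terminating G, null for T and C.
LABEL = {"A": "R", "G": "G", "T": "N", "C": "N"}


def label_code(template):
    """Return the label code read from one template strand."""
    return "-".join(LABEL[COMPLEMENT[base]] for base in template)


def heterozygote_code(template_1, template_2):
    """Return the code for a mixture of two alleles of the same strand.

    A position shows every non-null label contributed by either allele;
    it reads as null only if both alleles give no label.
    """
    positions = []
    for base_1, base_2 in zip(template_1, template_2):
        labels = {LABEL[COMPLEMENT[base_1]], LABEL[COMPLEMENT[base_2]]} - {"N"}
        positions.append("/".join(sorted(labels)) or "N")
    return "-".join(positions)
```

Note that the minus strands of the normal and mutant sequences give the same code, because both T and C are unlabeled in this scheme; the alteration is visible on the plus strand.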
[0036] The presence of such a point mutation will affect the base
pairing of the next few oligonucleotide primers to the
polynucleotide of interest, and thereby the primer extensions
obtained, such that the bases in the vicinity of the mutation
(i.e., in the altered region) may not be accurately identified. To
optimize identification of bases in the altered region, it is
preferred to analyze both strands of such a double-stranded
polynucleotide of interest. The few bases that may be difficult to
identify on the template polynucleotide of interest, as well as the
changed base, will be identified by the base extensions of the
primers for the complementary polynucleotide of interest, as the
analysis of the complementary polynucleotide of interest approaches
the mutation site from the opposite direction. In the nearest
regions on either side of the alteration, the sequence
determination is thereby provided by the oligonucleotide primers
for one of the two strands.
[0037] The sequence of a polynucleotide of interest for which the
sequence is previously unknown can be determined using methods
similar to those described above in reference to identification of
mutations utilizing an array of oligonucleotide primers. As before,
the positions of the terminating nucleotides within the array will
directly define the sequence position of each nucleotide in the
polynucleotide of interest.
[0038] To determine the nucleotide sequence, one annealed
primer is selected to be the "starting" annealed primer; it is
supposed for purposes of analysis that the sequence of the
polynucleotide of interest "starts" with this primer. The
nucleotide which has been added to the starting annealed primer is
detected using standard methods. Then, a second annealed primer
which has the same nucleotide sequence as the starting annealed
primer, minus the 5' nucleotide and with the addition of the added
nucleotide, is selected. The terminating nucleotide which has
been added to the second annealed primer is detected. These steps
are then repeated, using the second annealed primer as the
"starting" annealed primer in each repetition, until the sequence
of the polynucleotide of interest is determined. For example, if
the oligonucleotide primers are 10 nucleotides in length (N=10),
the starting annealed primer is chosen to correspond to the first
ten bases of the sequence. The terminating nucleotide of the
starting annealed primer is then determined. Next, bases 2-11
(i.e., bases 2-10 of the starting annealed primer plus the
terminating nucleotide extension) are matched to another annealed
primer. This primer is the second annealed primer. The terminating
nucleotide of the second annealed primer is then determined. These
steps are repeated to determine the complete sequence. In this
manner, the single base extension reaction automatically links
together the set of annealed primers.
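The chaining procedure can be sketched as follows. The sketch assumes that each annealed primer of length N occurs only once in the sequence and that its added base has already been decoded; `assemble` and the `extensions` mapping are hypothetical names. The result is read in the primer sense, that is, as the strand complementary to the annealed polynucleotide.

```python
def assemble(extensions, start):
    """Chain annealed primers into a sequence.

    `extensions` maps each annealed primer (5'->3') to the terminating
    base added to it. Starting from `start`, the next primer is the
    current primer minus its 5' base plus the added base.
    """
    sequence = start
    current = start
    visited = {start}
    while current in extensions:
        added = extensions[current]
        sequence += added
        current = current[1:] + added
        if current in visited:
            # A repeated primer would loop forever; stop here.
            break
        visited.add(current)
    return sequence
```

For primers of length 10, the first iteration matches bases 2-11 to the second annealed primer, as described above.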
[0039] After analysis of the nucleotide sequence of a polynucleotide
of interest, the polynucleotide of interest and the terminating
nucleotides can be removed from the DNA chip, so that the chip can
be reused. In a preferred embodiment, the added terminating
nucleotides are capable of being removed from the solid support
after analysis of the polynucleotide of interest has been
completed. Once the nucleotides are removed, the solid support with
the immobilized oligonucleotide primers can be used for a new
analysis. The nucleotides can be removed using standard methods,
such as enzymatic cleavage or chemical degradation. Enzymatic
cleavage, for example, would use a terminating nucleotide which can
be removed by an enzyme. The single base extension reaction could
result in addition to the oligonucleotide primers of RNA dideoxyTTP
or RNA dideoxyCTP by reverse transcriptase or other polymerase. A
C/T cleavage enzyme, such as RNase A, can then be used to "strip"
off the RNA dideoxynucleotides. Alternatively, sulfur-containing
dideoxy-A or dideoxy-G can be used during the single base extension
reaction; a sulfur-specific esterase, which does not cleave
phosphates, can then be used to cleave off the dideoxynucleotides.
For chemical degradation, a chemically degradable terminating
nucleotide can be used. For example, a modified ribonucleotide
having its 2'- and 3'-hydroxyl groups esterified, such as by
acetyl groups, can be used. After binding of the terminating
nucleotide to the annealed primer, the acetyl groups are removed by
treatment with a base to expose the 2'- and 3'-hydroxyl groups. The
ribose residue can then be degraded by periodate oxidation, and the
residual phosphate group removed from the annealed primer by
treatment with a base and alkaline phosphatase.
[0040] The method and apparatus of the current invention have uses
in detecting mutations, deletions, expanded oligonucleotide
repeats, and other genetic abnormalities. For example, the current
invention can be used to identify frame shifting mutations caused
by insertions or deletions. Furthermore, carrier status of
heritable diseases, such as cystic fibrosis, β-thalassemia,
α-1, Gaucher's disease, Tay-Sachs disease, or Lesch-Nyhan
syndrome, can be easily determined using the current invention,
because both the normal and the altered signals would be detected.
Furthermore, mixtures of DNA molecules such as occur in HIV
infected patients with drug resistance can be determined. The HIV
virus may develop resistance against drugs like AZT by point
mutations in the nucleotide sequence of a reverse transcriptase
(RT) gene. When mutated viruses start to appear in the virus
population, both the mutated gene and the normal (wild type) gene
can be detected. The greater the proportion of the mutant, the
greater the signal from the corresponding mutant terminating
nucleotide.
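As a rough illustration of how the relative signals might be used, the mutant proportion can be estimated from the two signals. This is a sketch only: it assumes that signal is linear in template amount and that both terminating nucleotides are incorporated and detected with equal efficiency, neither of which is stated in the disclosure; `mutant_fraction` is a hypothetical name.

```python
def mutant_fraction(wild_type_signal, mutant_signal):
    """Estimate the fraction of mutant template from two label signals."""
    total = wild_type_signal + mutant_signal
    if total <= 0:
        raise ValueError("no signal detected")
    return mutant_signal / total
```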
[0041] The current invention is further exemplified by the
following Examples.
EXAMPLE 1
Analyzing the Sequence of a Polynucleotide of Interest Utilizing
Free Oligonucleotide Primers
[0042] An analysis of the hypoxanthine-guanine
phosphoribosyl-transferase (HPRT) gene (the polynucleotide of
interest) was conducted for three individuals (Patients A, B, and
C).
[0043] A. Obtaining the Polynucleotide of Interest
[0044] The polymerase chain reaction (see Sambrook et al.,
Molecular Cloning: A Laboratory Manual (2nd Ed, 1989), especially
Chapter 14) was utilized to amplify the polynucleotide of interest.
During the reaction, one of the two PCR primers was tagged with a
biotin group. Following amplification, the single strand template
was captured with streptavidin-coated magnetic beads. For a 50
μl PCR reaction, 25 μl of Dynal M-280 paramagnetic beads
(Dynal A/S, Oslo, Norway) was used. The supernatant of the beads
was removed and replaced with 50 μl of a binding and washing
buffer (10 mM Tris-HCl (pH 7.5); 1 mM EDTA; 2 M NaCl). The PCR
product was added to the beads and incubated at room temperature
for 30 minutes for bead capture of the products. The single
stranded polynucleotide of interest was isolated by the addition of
150 μl of 0.15 M NaOH for 5 minutes. The beads were captured,
the supernatant was removed, and 150 μl of 0.15 M NaOH was again
added for five minutes. Following denaturation, the beads were
washed once with 150 μl of 0.15 M NaOH and twice with 1×
T7 annealing buffer (40 mM Tris-HCl (pH 7.5); 20 mM MgCl₂; 50
mM NaCl). The beads were finally suspended in 70 μl of water.
This process both isolates the single-stranded polynucleotide of
interest and removes any unincorporated dNTPs remaining after
PCR.
[0045] B. Analyzing the Sequence of the Polynucleotide of
Interest
[0046] After single strand isolation, the oligonucleotide primers
were annealed to the polynucleotide of interest by heating to
65° C. for approximately two minutes and cooling to room
temperature over approximately 20 minutes. The 10 μl reaction
volume consisted of 7 μl of the polynucleotide of interest
(0.5-1 pmol), 2 μl of 5× T7 annealing buffer, and 1 μl
of extension primer (3-9 pmol). The extension reaction was then
performed. For the reaction, 1 μl of DTT, 2 μl of T7
polymerase (diluted 1:8) and 1 μl of ddNTPs (final concentration
of 0.5 μM) were added. The reaction proceeded at 37° C. for
two minutes, and then was stopped by the addition of 100 μl of
washing buffer (1× SSPE, 0.1% SDS, 30% ethanol). The beads
were washed twice with 150 μl of the washing buffer. The
extension products were eluted by the addition of 5 μl of
formamide and heating to 70° C. for two minutes. The beads
were captured by the magnet and the supernatant containing the
extension products was collected and analyzed on an ABI 373 (Applied
Biosystems, Inc.). Oligonucleotide primers of lengths varying from
10 to 17 were used. As shown in FIG. 5, extension products were
formed efficiently.
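The volumes given above can be checked with simple arithmetic. The sketch below uses only the stated volumes and amounts; the variable names are illustrative, and the assumption that the extension reagents are added directly to the 10 μl annealing mixture is an interpretation of the text.

```python
def final_strength(stock_strength, stock_volume_ul, total_volume_ul):
    """Dilution: stock strength times its volume fraction in the mixture."""
    return stock_strength * stock_volume_ul / total_volume_ul


# Volumes in microliters, as stated for the annealing and extension steps.
annealing_ul = {"template": 7, "5x T7 annealing buffer": 2, "extension primer": 1}
extension_ul = {"DTT": 1, "T7 polymerase (1:8)": 2, "ddNTPs": 1}

annealing_total = sum(annealing_ul.values())
# Assumes the extension reagents are added to the annealing mixture.
extension_total = annealing_total + sum(extension_ul.values())

# The 5x buffer is diluted to 1x in the annealing mixture.
buffer_strength = final_strength(5, 2, annealing_total)

# Molar excess of primer over template: 3 pmol per 1 pmol at the low end,
# 9 pmol per 0.5 pmol at the high end.
primer_excess = (3 / 1, 9 / 0.5)
```

The annealing mixture totals 10 μl with the buffer at 1×, the extension mixture totals 14 μl under the stated assumption, and the primer is present in a 3-fold to 18-fold molar excess over the template.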
1. Deoxynucleotide Labeling--Four Fluorophores
[0047] Each ddNTP was labeled by a different fluorophore. ABI Dye
Terminator dyes designed for Taq polymerase were used: ddG is blue,
ddA is green, ddT is yellow, and ddC is red. Four fluorescent
ddNTPs were added to each reaction tube. The extension products
were purified, gel separated, and analyzed on an ABI 373. Two
different bases of exon 3 of the HPRT gene were analyzed: base
16534 (wild type is A) and base 16620 (wild type is C).
[0048] The results of the four-fluor, single-lane analysis indicated
that the presence of mutations could be identified easily. All three
patients are wild type (A) at base 16534 (data not shown).
Electrophoretograms shown in FIGS. 6A, 6B and 6C indicate that
Patient A is wild type (C) at base 16620 (FIG. 6A), Patient B is a
mutated individual (C→T) at base 16620 (FIG. 6B), and
Patient C is a carrier at base 16620 (both C and T) (FIG. 6C).
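The interpretation of a four-fluor, single-lane result can be sketched as a simple classification. The dye-to-base mapping is taken from the text; the function name, the input format (peak height per dye color) and the 20% calling threshold are illustrative assumptions, not values from the disclosure.

```python
# Dye colors as stated for the ABI Dye Terminator dyes.
DYE_TO_BASE = {"blue": "G", "green": "A", "yellow": "T", "red": "C"}


def call_genotype(peaks, wild_type, threshold=0.2):
    """Classify one base position from peak heights keyed by dye color.

    A base is called when its peak is at least `threshold` of the total
    signal (an assumed cutoff). Returns "wild type", "mutant", "carrier"
    or "no call".
    """
    total = sum(peaks.values())
    if total <= 0:
        return "no call"
    bases = {DYE_TO_BASE[color] for color, height in peaks.items()
             if height / total >= threshold}
    if bases == {wild_type}:
        return "wild type"
    if wild_type in bases:
        return "carrier"
    return "mutant"
```

A single red peak at a position whose wild type is C is called wild type, a single yellow peak is called mutant, and comparable red and yellow peaks are called carrier.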
2. Deoxynucleotide Labeling--Single Fluorophore
[0049] Each ddNTP was labeled by the same fluorophore. DuPont NEN
fluorescein dyes (NEL 400-404) were used. Each ddNTP appears blue
in the ABI 373. Only one fluorescent ddNTP is added to each
reaction tube. The extension products were purified, gel separated,
and analyzed on an ABI 373. Four lanes on the gel must be used to
analyze each base. Two different bases of exon 3 of the HPRT gene
were analyzed: base 16534 (wild type is A) and base 16620 (wild
type is C).
[0050] The single-fluor, four-lane analysis gave results that were
identical to those obtained using the four-fluor, single-lane
method described in (1), above. This type of assay
minimizes the effect of the fluorophore differences during
extension product formation and gel separation.
3. Deoxynucleotide Labeling--Biotinylated Dideoxynucleotides
[0051] The ddNTPs are labeled with a biotin group. Four separate
reactions are performed, whereby only one of the four ddNTPs is
biotinylated. Following the extension reaction, a streptavidin (or
avidin) coupled fluorescent group is attached to the biotinylated
ddNTPs. Because the biotin group is small, uniform incorporation of
the ddNTPs is expected and base-specific differences in extension
are minimized. Furthermore, the fluorescent signal can be amplified
because the biotin group can bind a streptavidin moiety coupled to
multiple fluors.
EXAMPLE 2
Analyzing the Sequence of a Polynucleotide of Interest Utilizing
Labeled Deoxynucleotides
[0052] An analysis of the hypoxanthine-guanine
phosphoribosyl-transferase (HPRT) gene (the polynucleotide of
interest) was conducted for three individuals
(Patients A, B, and C). The third exon of the HPRT gene was
examined.
[0053] Microscope glass slides were epoxysilanated at 80° C.
for eight hours using 25% 3-glycidoxypropyltriethoxysilane
(Aldrich Chemical) in dry xylene (Aldrich Chemical) with a
catalytic amount of diisopropylethylamine (Aldrich Chemical),
according to Southern (Nucl. Acids Res. 20:1679 (1992), and
Genomics 13:1008 (1992)). The DNA chips were made by placing 0.5
μl drops of 5'-amino-linked oligonucleotides (50 μM,
0.1 M NaOH) on the slides at 37° C. for six hours in a humid
environment. The chips were washed in 50° C. water for 15 minutes,
dried and used. The annealing reaction consisted of adding 2.2 μl of
single-stranded DNA (0.1 μM in T7 reaction buffer) to each grid
position, heating the chip in a humid environment to 70° C.
and then cooling slowly to room temperature. A 1 μl drop of 0.1
M DTT, 3 units of Sequenase Version 2.0 (USB), 5 μCi
α-³²P dNTP (3000 Ci/mmol) (DuPont NEN) and noncompeting
unlabeled 18.5 μM ddNTPs (Pharmacia) were added to each grid
position for three minutes. The reaction was stopped by washing in
75° C. water, and analyzed on a PhosphorImager (Molecular
Dynamics).
[0054] FIGS. 7A, 7B and 7C depict the results of a DNA chip-based
analysis for a five-base region within the third exon of the HPRT
gene. The rows correspond to a particular base under investigation,
and the columns correspond to the labeled base. FIG. 7A
demonstrates the wild type sequence (TCGAG), FIG. 7B demonstrates a
C→T mutation, and FIG. 7C demonstrates a C→T
mutation.
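Reading such a grid can be sketched as follows, with one row per base under investigation and one column per labeled base. The column order, the function name and the 30% calling threshold are assumptions made for illustration; the disclosure does not specify them.

```python
def read_chip(grid, columns="ACGT", min_fraction=0.3):
    """Call the base or bases for each row of a chip intensity grid.

    `grid` is a list of rows; each row holds one intensity per labeled
    base, in the (assumed) order given by `columns`. A base is called
    when its intensity is at least `min_fraction` of the row total.
    Rows without signal are reported as "N".
    """
    calls = []
    for row in grid:
        total = sum(row)
        called = [base for base, value in zip(columns, row)
                  if total > 0 and value / total >= min_fraction]
        calls.append("/".join(called) if called else "N")
    return calls
```

A row with a single strong column yields one base, while a row with two comparable columns (for example, C and T) yields "C/T", as would be expected for a heterozygous position.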
[0055] Those skilled in the art will recognize, or be able to
ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the invention described
specifically herein. Such equivalents are intended to be
encompassed in the scope of the following claims.
* * * * *