U.S. patent application number 08/829893, filed on 1997-04-02, was published by the patent office on 2001-11-01 for sequencing of surface immobilized polymers utilizing microfluorescence detection.
Invention is credited to DOWER, WILLIAM J; FODOR, STEPHEN P.A.
Application Number | 20010036629 08/829893 |
Document ID | / |
Family ID | 24511591 |
Filed Date | 2001-11-01 |
United States Patent
Application |
20010036629 |
Kind Code |
A1 |
DOWER, WILLIAM J ; et
al. |
November 1, 2001 |
SEQUENCING OF SURFACE IMMOBILIZED POLYMERS UTILIZING
MICROFLUORESCENCE DETECTION
Abstract
Means for simultaneous parallel sequence analysis of a large
number of biological polymer macromolecules. Apparatus and methods
may use fluorescent labels in repetitive chemistry to determine
terminal monomers on solid phase immobilized polymers. Reagents
which specifically recognize terminal monomers are used to label
polymers at defined positions on a solid substrate.
Inventors: |
DOWER, WILLIAM J; (MENLO
PARK, CA) ; FODOR, STEPHEN P.A.; (PALO ALTO,
CA) |
Correspondence
Address: |
TOWNSEND AND TOWNSEND AND CREW
TWO EMBARCADERO CENTER
EIGHTH FLOOR
SAN FRANCISCO
CA
94111-3834
US
|
Family ID: |
24511591 |
Appl. No.: |
08/829893 |
Filed: |
April 2, 1997 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
08829893 | Apr 2, 1997 | |
08679478 | Jul 12, 1996 | 5902723 |
07626730 | Dec 6, 1990 | 5547839 |
07492462 | Mar 7, 1990 | 5143854 |
07362901 | Jun 7, 1989 | |
Current U.S.
Class: |
435/6.11 ;
435/6.12; 435/91.1 |
Current CPC
Class: |
C12Q 2563/107 20130101;
C07B 2200/11 20130101; C12Q 1/6823 20130101; B01J 2219/00432
20130101; C12Q 1/6874 20130101; C12Q 2535/101 20130101; C12Q
2565/513 20130101; C12Q 2537/143 20130101; C12Q 2537/157 20130101;
C12Q 2537/157 20130101; C12Q 2565/518 20130101; C12Q 2565/537
20130101; C12Q 2537/143 20130101; C12Q 2563/107 20130101; C12Q
2537/143 20130101; C12Q 2537/157 20130101; C12Q 2533/101 20130101;
C12Q 2563/107 20130101; C12Q 2563/107 20130101; B01J 2219/00459
20130101; C12Q 2535/101 20130101; B01J 2219/00436 20130101; B01J
2219/00637 20130101; B01J 2219/0061 20130101; B01J 2219/00475
20130101; C12Q 1/6816 20130101; C12Q 1/6869 20130101; C12Q 1/6869
20130101; B01J 2219/00585 20130101; B01J 2219/00608 20130101; B01J
2219/005 20130101; B01J 2219/00695 20130101; C12Q 1/6837 20130101;
B01J 2219/00531 20130101; B01J 2219/00527 20130101; B01J 2219/00596
20130101; C07K 17/06 20130101; C12Q 1/6823 20130101; G11C 13/0014
20130101; C07K 1/045 20130101; B01J 2219/00725 20130101; C40B 40/06
20130101; B01J 2219/0059 20130101; B01J 2219/00626 20130101; B01J
2219/00648 20130101; G01N 15/1475 20130101; G03F 7/00 20130101;
G03F 7/26 20130101; B01J 19/0046 20130101; G03F 7/265 20130101;
G03F 7/38 20130101; C07K 1/042 20130101; B01J 2219/00722 20130101;
B01J 2219/00621 20130101; B01J 2219/00529 20130101; C12Q 1/6874
20130101; B01J 2219/00612 20130101; B01J 2219/00659 20130101; B01J
2219/00689 20130101; B01J 2219/00315 20130101; C12Q 1/6809
20130101; B01J 2219/00434 20130101; B01J 2219/00468 20130101; C40B
40/10 20130101; C07H 21/00 20130101; C12Q 1/6874 20130101; B01L
7/52 20130101; B82Y 30/00 20130101; C12Q 1/6869 20130101; G11C
13/0019 20130101; C12Q 1/6804 20130101; C07H 19/10 20130101; C12Q
1/6837 20130101; C07K 17/14 20130101; B01J 2219/00605 20130101;
B01J 2219/00711 20130101; B82Y 10/00 20130101; C12Q 1/6874
20130101; C40B 60/14 20130101 |
Class at
Publication: |
435/6 ;
435/91.1 |
International
Class: |
C12Q 001/68; C12P
019/34 |
Claims
What is claimed is:
1. A method for determining the positions of polymers which
terminate with a given monomer, where said polymers are attached to
a surface having a plurality of positionally distinct polymers
attached thereto, said method comprising the steps of: labeling a
terminal monomer in a monomer type specific manner; and scanning
said surface, thereby determining the positions of said label.
2. A method of claim 1, wherein said polymers are
polynucleotides.
3. A method of claim 1, wherein the labeling of the terminal marker
comprises incorporation of a labeled terminal monomer selected from
the group of nucleotides consisting of adenine, cytidine, guanidine
and thymidine.
4. A method for concurrently determining which subset of a
plurality of positionally distinct polymers attached to a solid
substrate at separable locations terminates with a given terminal
subunit, said method comprising the steps of: mixing said solid
substrate with a solution comprising a reagent, which selectively
marks positionally distinct polymers which terminate with said
given terminal subunit; and determining with a detector which
separable locations are marked, thereby determining which subset of
said positionally distinct polymers terminated with said given
terminal subunit.
5. A method of claim 4, wherein said solution comprises a reagent
which marks said positionally distinct polymer with a fluorescent
label moiety.
6. A method of claim 4, wherein said terminal subunit is selected
from the group consisting of adenosine, cytosine, guanosine, and
thymine.
7. A method for determining which subset of a plurality of primer
polynucleotides have a predetermined oligonucleotide, wherein the
polynucleotides are complementary to distinctly positioned template
strands which are attached to a solid substrate, said method
comprising the steps of: selectively marking said subset of primer
polynucleotides having the predetermined oligonucleotide; and
detecting which polynucleotides are marked.
8. A method of claim 7, wherein said oligonucleotide subunit is a
single nucleotide.
9. A method of claim 8, wherein the marking comprises elongating
said primer with a labeled nucleotide which is complementary to a
template.
10. A method of claim 7, wherein the marking step uses a polymerase
and a blocked and labeled adenine.
11. A method for concurrently obtaining sequence information on a
plurality of polynucleotides by use of a single label detector,
said method comprising the steps of: attaching a plurality of
positionally distinct polynucleotides to a solid substrate at
separable locations; labeling said plurality of polynucleotides
with a terminal nucleotide specific reagent, said label being
detectable using said label detector; determining whether said
specific labeling reagent has labeled each separable location.
12. A method of claim 11, wherein said labeling is performed with
reagents which can distinguishably label alternative possible
nucleotide monomers.
13. A method of claim 11, wherein at least four replica substrates
are produced, each of which is labeled with a specific labeling
reagent for adenine, cytosine, guanine, or thymine.
14. A method of claim 11, wherein said labeling and determining
steps are performed in succession using reagents specific for each
of adenine, cytosine, guanine, and thymine monomers.
15. A method for concurrently obtaining sequence information on a
plurality of polynucleotides, said method comprising the steps of:
attaching distinct polynucleotides to a plurality of distinct solid
substrates; labeling said plurality of solid substrates with a
terminal nucleotide specific labeling reagent; and determining
whether said specific labeling reagent has labeled each distinct
substrate.
16. A method of claim 15, wherein said labeling step is performed
in a continuous flow of distinct solid substrates through a
reaction solution.
17. A method for simultaneously sequencing a plurality of polymers
made up of monomer units, said plurality of polymers attached to a
substrate at definable positions, said method comprising the steps
of: mixing said substrate with a reagent which specifically
recognizes a terminal monomer, thereby providing identification
among various terminal monomer units; and scanning said substrate
to distinguish signals at definable positions on said substrate;
and correlating said signals at defined positions on said substrate
to provide sequential series of sequence determinations.
18. A method of claim 17, wherein said plurality of polymers are
synthesized by a plurality of separate cell colonies.
19. A method of claim 17, wherein said plurality of polymers are
attached to said substrate by a carbonyl linkage.
20. A method of claim 17, wherein said polymers are
polynucleotides.
21. A method of claim 17, wherein said substrate comprises
silicon.
22. A method of claim 17, wherein said scanning identifies a
fluorescent label.
23. A method of claim 17, wherein said reagent exhibits specificity
of removal of terminal monomers.
24. A method of claim 17, wherein said reagent exhibits specificity
of labeling of terminal monomers.
25. A method for sequencing a plurality of distinctly positioned
polynucleotides attached to a solid substrate comprising the steps
of: hybridizing complementary primers to said plurality of
polynucleotides; elongating a complementary primer hybridized to a
polynucleotide by adding a single nucleotide; and identifying which
of said complementary primers have incorporated said
nucleotide.
26. A method of claim 25, wherein said elongating step is performed
simultaneously on said plurality of polynucleotides linked to said
substrate.
27. A method of claim 25, wherein said substrate is a two
dimensional surface and said identifying results from a positional
determination of said complementary primers incorporating said
single defined nucleotide.
28. A method of claim 25, wherein said substrate is a silicon
substrate.
29. A method of claim 25, wherein said linking is by
photocrosslinking said polynucleotide to said complementary primer,
where said primer is attached to said substrate.
30. A method of claim 25, wherein said elongating is catalyzed by a
DNA dependent polymerase.
31. A method of claim 25, wherein said single defined nucleotide
has a removable blocking moiety to prevent further elongation.
32. A method of claim 31, wherein said blocking moiety is NVOC.
33. A method of claim 25, wherein said single nucleotide has both a
blocking moiety and labeling moiety.
Description
[0001] This is a continuation-in-part application of commonly
assigned patent application by Pirrung et al., U.S. Ser. No.
07/362,901 (VLSIPS parent), filed Jun. 7, 1989; and Pirrung et al.,
U.S. Ser. No. 07/492,462, "Very Large Scale Immobilized Polymer
Synthesis", filed on Mar. 7, 1990, and which is incorporated herein
by reference. Related cases Barrett et al., U.S. Ser. No.
07/435,316 (caged biotin parent), filed on Nov. 13, 1989, and
Barrett et al, U.S. Ser. No. 07/612,671 (caged biotin CIP), filed
on Nov. 13, 1990, are also incorporated herein by reference.
Additional applications Fodor et al., U.S. Ser. No. ______,
attorney docket number 11509-28 (automated VLSIPS); and Fodor et
al., U.S. Ser. No. ______, attorney docket number 11509-30
(sequencing by hybridization), are filed on the same day as this
application and incorporated herein by reference.
BACKGROUND OF THE INVENTION
[0002] The present invention relates to the determination of the
sequences of polymers immobilized to a substrate. In particular,
one embodiment of the invention provides a method and apparatus for
sequencing many nucleic acid sequences immobilized at distinct
locations on a matrix surface. The principles and apparatus of the
present invention may also be used, for example, in the
determination of sequences of peptides, polypeptides,
oligonucleotides, nucleic acids, oligosaccharides, phospholipids
and other biological polymers. It is especially useful for
determining the sequences of nucleic acids and proteins.
[0003] The structure and function of biological molecules are
closely interrelated. The structure of a biological polymer,
typically a macromolecule, is generally determined by its monomer
sequence. For this reason, biochemists historically have been
interested in the sequence characterization of biological
macromolecule polymers. With the advent of molecular biology, the
relationship between a protein sequence and its corresponding
encoding gene sequence is well understood. Thus, characterization
of the sequence of a nucleic acid encoding a protein has become
very important.
[0004] Partly for this reason, the development of technologies
providing the capability for sequencing enormous amounts of DNA has
received great interest. Such technologies are necessary, for
example, for the successful completion of the human
genome sequencing project. Structural characterization of
biopolymers is very important for further progress in many areas of
molecular and cell biology.
[0005] While sequencing of macromolecules has become extremely
important, many aspects of these technologies have not advanced
significantly over the past decade. For example, in the protein
sequencing technologies being applied today the Edman degradation
methods are still being used. See, e.g., Knight (1989)
"Microsequencers for Proteins and Oligosaccharides," BioTechnol.
7:1075-1076. Although advanced instrumentation for protein
sequencing has been developed, see, e.g., Frank et al. (1989)
"Automation of DNA Sequencing Reactions and Related Techniques: A
Work Station for Micromanipulation of Liquids," BioTechnol.
6:1211-1213, this technology utilizes a homogeneous and isolated
protein sample for determination of removed residues from that
homogeneous sample.
[0006] Likewise, in nucleic acid sequencing technology, three major
methods for sequencing have been developed, of which two are
commonly used today. See, e.g., Sambrook et al. (1989) Molecular
Cloning: A Laboratory Manual (2d Ed.) Vols. 1-3, Cold Spring Harbor
Press, New York, which is hereby incorporated herein by reference.
The first method was developed by Maxam and Gilbert. See, e.g.,
Maxam and Gilbert (1980) "Sequencing End-Labeled DNA with
Base-Specific Chemical Cleavages," Methods in Enzymol. 65:499-560,
which is hereby incorporated herein by reference. The polymer is
chemically cleaved with a series of base-specific cleavage reagents
thereby generating a series of fragments of various lengths. The
various fragments, each resulting from a cleavage at a specific
base, are run in parallel on a slab gel which resolves nucleic
acids which differ in length by single nucleotides. An end
specific label allows detection of cleavages at all nucleotides
relative to the position of the label.
[0007] This separation requires high resolution electrophoresis or
some other system for separating nucleic acids of very similar
size. Thus, the target nucleic acid to be sequenced must usually be
initially purified to near homogeneity.
[0008] Sanger and Coulson devised two alternative methods for
nucleic acid sequencing. The first method, known as the plus and
minus method, is described in Sanger and Coulson (1975) J. Mol.
Biol. 94:441-448, and has been replaced by the second method.
Subsequently, Sanger and Coulson developed another improved
sequencing method known as the dideoxy chain termination method.
See, e.g., Sanger et al. (1977) "DNA Sequencing with
Chain-Termination Inhibitors," Proc. Natl. Acad. Sci. USA
74:5463-5467, which is hereby incorporated herein by reference.
This method is based on the inability of 2', 3' dideoxy nucleotides
to be elongated by a polymerase because of the absence of a 3'
hydroxyl group on the sugar ring, thus resulting in chain
termination. Each of the separate chain terminating nucleotides is
incorporated by a DNA polymerase, and the resulting terminated
fragment is known to end with the corresponding dideoxy nucleotide.
However, both of the Sanger and Coulson sequencing techniques
usually require isolation and purification of the nucleic acid to
be sequenced and separation of nucleic acid molecules differing in
length by single nucleotides.
[0009] Both the polypeptide sequencing technology and the
oligonucleotide sequencing technologies described above suffer from
the requirement to isolate and work with distinct homogeneous
molecules in each determination.
[0010] In the polypeptide technology, the terminal amino acid is
sequentially removed and analyzed. However, the analysis is
dependent upon only one single amino acid being removed, thus
requiring the polypeptide to be homogeneous.
[0011] In the case of nucleic acid sequencing, the present
techniques typically utilize very high resolution polyacrylamide
gel electrophoresis. This high resolution separation uses both
highly toxic acrylamide for the separation of the resulting
molecules and usually very high voltages in running the
electrophoresis. Both the purification and isolation techniques are
highly tedious, time consuming and expensive processes.
[0012] Thus, a need exists for the capability to simultaneously
sequence many biological polymers without individual isolation and
purification. Moreover, dispensing with the need to individually
perform the high resolution separation of related molecules leads
to greater safety, speed, and reliability. The present invention
solves these and many other problems.
SUMMARY OF THE INVENTION
[0013] The present invention provides the means to sequence
hundreds, thousands or even millions of biological macromolecules
simultaneously and without individually isolating each
macromolecule to be sequenced. It also dispenses with the
requirement, in the case of nucleic acids, to separate the products
of the sequencing reactions on dangerous polyacrylamide gels.
Because the method is adaptable to automation, the cost and effort
required in sequence analysis will be dramatically reduced.
[0014] This invention is most applicable, but not limited, to
linear macromolecules. It also provides specific reagents for
sequencing both oligonucleotides and polypeptides. It provides an
apparatus for automating the processes described herein.
[0015] The present invention provides methods for determining the
positions of polymers which terminate with a given monomer, where
said polymers are attached to a surface having a plurality of
positionally distinct polymers attached thereto, said method
comprising the steps of:
[0016] labeling a terminal monomer in a monomer type specific
manner; and
[0017] scanning said surface, thereby determining the positions of
said label. In one embodiment, the polymers are polynucleotides,
and usually the labeling of the terminal marker comprises
incorporation of a labeled terminal monomer selected from the group
of nucleotides consisting of adenine, cytidine, guanidine and
thymidine.
[0018] An alternative embodiment provides methods for concurrently
determining which subset of a plurality of positionally distinct
polymers attached to a solid substrate at separable locations
terminates with a given terminal subunit, said method comprising
the steps of:
[0019] mixing said solid substrate with a solution comprising a
reagent, which selectively marks positionally distinct polymers
which terminate with said given terminal subunit; and
[0020] determining with a detector which separable locations are
marked, thereby determining which subset of said positionally
distinct polymers terminated with said given terminal subunit. In
one version, the solution comprises a reagent which marks the
positionally distinct polymer with a fluorescent label moiety. In
another version the terminal subunit is selected from the group
consisting of adenosine, cytosine, guanosine, and thymine.
[0021] Methods are also provided for determining which subset of a
plurality of primer polynucleotides have a predetermined
oligonucleotide, wherein the polynucleotides are complementary to
distinctly positioned template strands which are attached to a
solid substrate, said method comprising the steps of:
[0022] selectively marking said subset of primer polynucleotides
having the predetermined oligonucleotide; and
[0023] detecting which polynucleotides are marked. In one
embodiment, the oligonucleotide subunit is a single nucleotide; in
another the marking comprises elongating said primer with a labeled
nucleotide which is complementary to a template; and in a further
embodiment the marking step uses a polymerase and a blocked and
labeled adenine.
[0024] The invention embraces methods for concurrently obtaining
sequence information on a plurality of polynucleotides by use of a
single label detector, said method comprising the steps of:
[0025] attaching a plurality of positionally distinct
polynucleotides to a solid substrate at separable locations;
[0026] labeling said plurality of polynucleotides with a terminal
nucleotide specific reagent, said label being detectable using said
label detector;
[0027] determining whether said specific labeling reagent has
labeled each separable location. Often, the labeling is performed
with reagents which can distinguishably label alternative possible
nucleotide monomers. One embodiment uses four replica substrates
each of which is labeled with a specific labeling reagent for
adenine, cytosine, guanine, or thymine. Usually, the labeling and
determining steps are performed in succession using reagents
specific for each of adenine, cytosine, guanine, and thymine
monomers.
[0028] An alternative embodiment provides methods for concurrently
obtaining sequence information on a plurality of polynucleotides,
said method comprising the steps of:
[0029] attaching distinct polynucleotides to a plurality of
distinct solid substrates;
[0030] labeling said plurality of solid substrates with a terminal
nucleotide specific labeling reagent; and
[0031] determining whether said specific labeling reagent has
labeled each distinct substrate. The method can be performed
in a continuous flow of distinct solid substrates through a
reaction solution.
[0032] A method is provided for simultaneously sequencing a
plurality of polymers made up of monomer units, said plurality of
polymers attached to a substrate at definable positions, said
method comprising the steps of:
[0033] mixing said substrate with a reagent which specifically
recognizes a terminal monomer, thereby providing identification
among various terminal monomer units; and
[0034] scanning said substrate to distinguish signals at definable
positions on said substrate; and
[0035] correlating said signals at defined positions on said
substrate to provide sequential series of sequence determinations.
Often, the plurality of polymers are synthesized by a plurality of
separate cell colonies, and the polymers may be attached to said
substrate by a carbonyl linkage. In one embodiment, the polymers
are polynucleotides, and often the substrate comprises silicon. The
scanning will often identify a fluorescent label. In one
embodiment, the reagent exhibits specificity of removal of terminal
monomers; in another, the reagent exhibits specificity of labeling
of terminal monomers.
[0036] The invention also embraces methods for sequencing a
plurality of distinctly positioned polynucleotides attached to a
solid substrate comprising the steps of:
[0037] hybridizing complementary primers to said plurality of
polynucleotides;
[0038] elongating a complementary primer hybridized to a
polynucleotide by adding a single nucleotide; and
[0039] identifying which of said complementary primers have
incorporated said nucleotide. In some versions, the elongating step
is performed simultaneously on said plurality of polynucleotides
linked to said substrate. Typically, the substrate is a two
dimensional surface and the identifying results from a positional
determination of the complementary primers incorporating the single
defined nucleotide. A silicon substrate is useful in this
method.
[0040] Methods are provided where the linking is by
photocrosslinking said polynucleotide to said complementary primer,
where said primer is attached to said substrate. The elongating
will often be catalyzed by a DNA dependent polymerase. In various
embodiments, a nucleotide will have a removable blocking moiety to
prevent further elongation, e.g., NVOC.
[0041] A nucleotide with both a blocking moiety and labeling moiety
will often be used.
[0042] A further understanding of the nature and advantages of the
invention herein may be realized by reference to the remaining
portions of the specification and the attached drawings.
BRIEF DESCRIPTION OF THE FIGURES
[0043] FIG. 1 illustrates a simplified and schematized embodiment
of a degradative scheme for polymer sequencing.
[0044] FIG. 2 illustrates a simplified and schematized embodiment
of a synthetic scheme for polymer sequencing.
[0045] FIG. 3 illustrates a coordinate mapping system of a petri
plate containing colonies. Each position of a colony can be
assigned a distinct coordinate position.
[0046] FIG. 4 illustrates various modified embodiments of the
substrates.
[0047] FIG. 5 illustrates an idealized scanning result
corresponding to a particular colony position.
[0048] FIG. 6 illustrates particular linkers useful for attaching a
nucleic acid to a silicon substrate. Note that thymine may be
substituted by adenine, cytidine, guanine, or uracil.
[0049] FIG. 7 illustrates an embodiment of the scanning system and
reaction chamber.
[0050] FIG. 8 illustrates the application of the synthetic scheme
for sequencing as applied to a nucleic acid cluster localized to a
discrete identified position.
[0051] FIG. 8A illustrates schematically, at a molecular level, the
sequence of events which occur during a particular sequencing
cycle. FIG. 8B illustrates, in a logic flow chart, how the scheme
is performed.
[0052] FIG. 9 illustrates the synthesis of a representative
nucleotide analog useful in the synthetic scheme. Note that the
FMOC may be attached to adenine, cytosine, or guanine.
[0053] FIG. 10 illustrates the application of the degradative
scheme for sequencing as applied to a nucleic acid cluster
localized to a discrete identified position. FIG. 10A illustrates
schematically, at a molecular level, the sequence of events which
occur during a particular sequencing cycle. FIG. 10B illustrates in
a logic flow chart how the scheme is performed.
[0054] FIG. 11 illustrates a functionalized apparatus for
performing the scanning steps and sequencing reaction steps.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0055] I. Sequencing Procedure for a Generic Polymer
[0056] A. Overview
[0057] 1. Substrate and matrix
[0058] 2. Scanning system
[0059] 3. Synthetic/degradative cycles
[0060] 4. Label
[0061] 5. Utility
[0062] B. Substrate/Matrix
[0063] 1. Non-distortable
[0064] 2. Attachment of polymer
[0065] C. Scanning system
[0066] 1. Mapping to distinct position
[0067] 2. Detection system
[0068] 3. Digital or analog signal
[0069] D. Synthetic or degradative cycle
[0070] 1. Synthetic cycles
[0071] a. synthetic scheme
[0072] b. blocking groups
[0073] 2. Degradative cycles
[0074] 3. Conceptual principles
[0075] E. Label
[0076] 1. Attachment
[0077] 2. Mode of detection
[0078] F. Utility
[0079] II. Specific Embodiments
[0080] A. Synthetic method
[0081] B. Chain degradation method
[0082] III. Apparatus
[0083] I. Sequencing Procedure for a Generic Polymer
[0084] The present invention provides methods and apparatus for the
preparation and use of a substrate having a plurality of polymers
with various sequences where each small defined contiguous area
defines a small cluster of homogeneous polymer sequences. The
invention is described herein primarily with regard to the
sequencing of nucleic acids but may be readily adapted to the
sequencing of other polymers, typically linear biological
macromolecules. Such polymers include, for example, both linear
and cyclical polymers of nucleic acids, polysaccharides, phospholipids,
and peptides having various different amino acids, heteropolymers
in which the polymers are mixed, polyurethanes, polyesters,
polycarbonates, polyureas, polyamides, polyethyleneimines,
polyarylene sulfides, polysiloxanes, polyimides, polyacetates or
mixed polymers of various sorts. In a preferred embodiment, the
present invention is described in the use of sequencing nucleic
acids.
[0085] Various aspects of U.S. Ser. No. 07/362,901, (VLSIPS
parent); U.S. Ser. No. 07/492,462 (VLSIPS CIP); U.S. Ser. No.
07/435,316 (caged biotin parent); U.S. Ser. No. 07/612,671 (caged
biotin CIP); and simultaneously filed cases U.S. Ser. No. ______,
attorney docket number 11509-28 (automated VLSIPS); and U.S. Ser.
No. ______, attorney docket number 11509-30 (sequencing by
hybridization); each of which is hereby incorporated herein by
reference, are applicable to the substrates and matrix materials
described herein, to the apparatus used for scanning the matrix
arrays, to means for automating the scanning process, and to the
linkage of polymers to a substrate.
[0086] A. Overview
[0087] The present invention is based, in part, on the ability to
perform a step wise series of reactions which either extend or
degrade a polymer by defined units.
[0088] FIG. 1 schematizes a simplified linear two monomer polymer
made up of A type and B type subunits. A degradative scheme is
illustrated. Panel A depicts a matrix with two different polymers
located at positions 10 and 14, but with no polymer linked at
position 12. A reaction is employed to label all of these polymers
at the terminus opposite the point of attachment. Panel B
illustrates a label (designated by an asterisk) incorporated at
position 16 on the terminal monomers. A scan step is performed to
locate positions 10 and 14, where polymers have been linked, and to
confirm that no polymer is located at position 12. The entire matrix
is exposed to a reagent which is specific for removing single
terminal A monomers, which are also labeled. The reagent is selected
to remove only a single monomer; it will not remove further A
monomers. Removal of the labeled A monomer leaves a substrate as
illustrated in panel C. A scan step is performed and, when compared
with the previous scan, indicates that the polymer located at
position 10 has lost its label, i.e., that the polymer at 10
terminated with an A monomer. The entire matrix is then exposed to a
second reagent which is specific for removing terminal B monomers
which are also labeled. Note that only a single B on each polymer is
removed and that successive B monomers are not affected. Removal of
the labeled B monomer leaves
a substrate as illustrated in panel D. Another scan step is
performed, indicating that the polymer located at position 14 has
lost its label, i.e., it terminated with a B monomer. The sequence
of treatments and scans is repeated to determine the successive
monomers. It will be recognized that if the labeled A and B are
distinguishable, i.e., the label on polymers at sites 10 and 14 may
be distinguished, a single removal step can be performed to convert
the substrate as illustrated in panel B directly to that
illustrated in panel D.
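The degradative cycle of FIG. 1 can be sketched as a small simulation. This is only an illustrative model, not the apparatus or chemistry of the invention: the function name `degradative_reads`, the string encoding of each polymer (written from the attached end to the free terminus), and the example sequences are assumptions made for the sketch. Each pass labels every remaining terminus, applies one monomer-specific removal reagent at a time, and uses the difference between consecutive scans to identify which positions lost a label.

```python
from typing import Dict, List


def degradative_reads(polymers: Dict[int, str], monomer_types: str = "AB") -> Dict[int, str]:
    """Simulate label / remove / scan cycles on polymers keyed by matrix position.

    Each string is written from the attached end to the free terminus.
    Returns, per position, the monomers in the order they were read
    (free terminus first).
    """
    remaining = dict(polymers)
    reads: Dict[int, List[str]] = {pos: [] for pos in polymers}

    while any(remaining.values()):
        # Label step: every position that still carries a polymer is labeled.
        labeled = {pos for pos, seq in remaining.items() if seq}
        progressed = False
        for monomer in monomer_types:
            before = set(labeled)
            # Removal step: strip one labeled terminal monomer of this type only.
            # A newly exposed terminus is unlabeled, so it is left untouched.
            for pos in before:
                if remaining[pos].endswith(monomer):
                    remaining[pos] = remaining[pos][:-1]
                    labeled.discard(pos)
            # Scan step: positions whose label disappeared ended in `monomer`.
            for pos in before - labeled:
                reads[pos].append(monomer)
                progressed = True
        if not progressed:
            break  # a terminus that no reagent recognizes; stop rather than loop
    return {pos: "".join(r) for pos, r in reads.items()}


# Hypothetical matrix: polymers at positions 10 and 14, nothing at position 12.
result = degradative_reads({10: "BBA", 12: "", 14: "ABB"})
print(result)  # {10: 'ABB', 12: '', 14: 'BBA'}
```

Because a removal reagent acts only on labeled termini, the repeated B monomers at position 14 are read one per cycle, which mirrors the requirement that successive monomers are not removed in a single step.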
[0089] An alternative embodiment employs synthetic reactions where
a synthetic product is made at the direction of the attached
polymer. The method is useful in the synthesis of a complementary
nucleic acid strand by elongation of a primer as directed by the
attached polymer.
[0090] FIG. 2 illustrates a similar simplified polymer scheme,
where the A and B monomer provide a complementary correspondence to
A' and B' respectively. Thus, an A monomer directs synthetic
addition of an A' monomer and a B monomer directs synthetic
addition of a B' monomer. Panel A depicts polymers attached at
locations 18 and 22, but not at location 20. Each polymer already
has one corresponding complementary monomer A'. The matrix, with
polymers, is subjected to an elongation reaction which
incorporates, e.g., single labeled A' monomers 24 but not B'
monomers, as depicted in panel B. The label is indicated by the
asterisk. Note that only one A' monomer is added. A scan step is
performed to determine whether polymers located at positions 18 or
22 have incorporated the labeled A' monomers. The polymer at
position 18 has, while the polymer at position 22 has not. Another
elongation reaction which incorporates labeled B' monomers 26 is
performed resulting in a matrix as depicted in panel C. Again note
that only one, and not successive B' monomers, is added. Another
scan is performed to determine whether a polymer located at sites
18 or 22 has incorporated a labeled B' monomer, and the result
indicates that the polymer located at site 22 has incorporated the
labeled B' monomer. A next step removes all of the labels to
provide a substrate as depicted in panel D. As before, if the
polymer which incorporated a labeled A' monomer is distinguishable
from a polymer which incorporated a labeled B' monomer, the
separate elongation reactions may be combined producing a panel C
type matrix directly from a panel A type matrix and the scan
procedure can distinguish which terminal monomer was
incorporated.
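The elongation and scan sequence of FIG. 2 can be sketched in a few lines of code. The following is a minimal illustration only, assuming a two-letter alphabet and indistinguishable labels so that each monomer type needs its own elongation reaction; the function name, the dictionary layout, and the example templates are hypothetical and are not taken from the figures.

```python
# Hypothetical two-letter alphabet: A directs addition of A', B directs B'.
COMPLEMENT = {"A": "A'", "B": "B'"}

def sequence_by_synthesis(templates, monomer_order=("A'", "B'")):
    """Simulate repeated single-monomer elongation reactions with scans.

    templates maps a matrix position to the not-yet-copied template string.
    Returns, per position, the complementary monomers in order of addition.
    """
    extended = {pos: [] for pos in templates}
    longest = max(len(t) for t in templates.values())
    for cycle in range(longest):
        for labeled in monomer_order:      # one elongation reaction per monomer type
            for pos, template in templates.items():
                if cycle >= len(template) or len(extended[pos]) != cycle:
                    continue               # template finished, or already extended this cycle
                if COMPLEMENT[template[cycle]] == labeled:
                    extended[pos].append(labeled)   # scan finds the label at this position
        # all labels would be removed here before the next cycle begins
    return extended

print(sequence_by_synthesis({18: "AB", 22: "BA"}))
```

In this sketch the polymer at position 18 takes up A' in the first reaction and the polymer at position 22 takes up B' in the second, which mirrors the order of events described for panels B and C.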
[0091] It will be appreciated that the process may be applied to
more complicated polymers having more different types of monomers.
Also, the number of scan steps can be minimized if the various
possible labeled monomers can be differentiated by the detector
system.
[0092] Typically, the units will be single monomers, though under
certain circumstances the units may comprise dimers, trimers, or
longer segments of defined length. In fact, under certain
circumstances, the method may be operable in removing or adding
different sized units so long as the units are distinguishable.
However, it is very important that the reagents used do not remove
or add successive monomers. This is achieved in the degradative
method by use of highly specific reagents. In the synthetic mode,
this is often achieved with removable blocking groups which prevent
further elongation.
[0093] One important aspect of the invention is the concept of
using a substrate having homogeneous clusters of polymers attached
at distinct matrix positions. The term "cluster" refers to a
localized group of substantially homogeneous polymers which are
positionally defined as corresponding to a single sequence. For
example, a coordinate system will allow the reproducible
identification and correlation of data corresponding to distinct
homogeneous clusters of polymer locally attached to a matrix
surface. FIG. 3 illustrates a mapping system providing such a
correspondence, where transfer of polymers from a producing colony
of organisms to a matrix preserves spatial information thereby
allowing positional identification. The positional identification
allows correlation of data from successive scan steps.
[0094] In one embodiment, bacterial colonies producing polymers are
spatially separated on the media surface of a petri plate as
depicted in panel A. Alternatively, phage plaques on a bacterial
lawn can exhibit a similar distribution. A portion of panel A is
enlarged and shown in panel B. Individual colonies are labeled
C1-C7. The position of each colony can be mapped to positions on a
coordinate system, as depicted in panel C. The positions of each
colony can then be defined, as in a table shown in panel D, which
allows reproducible correlation of scan cycle results.
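The positional table of panel D can be represented as a simple lookup from colony name to coordinates. The sketch below is illustrative only: the coordinate values, the tolerance, and the function name are assumptions, since the figure's actual values are not reproduced in the text.

```python
import math

# Hypothetical colony coordinates in arbitrary units (not taken from FIG. 3).
colony_positions = {"C1": (2.0, 9.0), "C2": (5.5, 8.0), "C3": (8.0, 6.5)}

def nearest_colony(x, y, positions, tolerance=0.5):
    """Return the colony mapped within `tolerance` of (x, y), or None."""
    best = min(positions, key=lambda name: math.dist((x, y), positions[name]))
    return best if math.dist((x, y), positions[best]) <= tolerance else None

print(nearest_colony(5.4, 8.1, colony_positions))  # signal near C2's mapped position
```

A lookup of this kind lets a signal detected in any scan cycle be assigned to the same colony each time, provided the stage returns to the same coordinates within the chosen tolerance.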
[0095] Although the preferred embodiments are described with
respect to a flat matrix, the invention may also be applied using
the means for correlating detection results from multiple samples
after passage through batch or continuous flow reactions. For
example, spatially separated polymers may be held in separate wells
on a microtiter plate. The polymers will be attached to a substrate
to retain the polymers as the sequencing reagents are applied and
removed.
[0096] The entire substrate surface, with homogeneous clusters of
polymer attached at defined positions, may be subjected to batch
reactions so the entire surface is exposed to a uniform and defined
sequence of reactions. As a result, each cluster of target polymers
for sequencing will be subjected to similar reactive chemistry. By
monitoring the results of these reactions on each cluster localized
to a defined coordinate position, the sequence of the polymer which
is attached at that site will be determined.
[0097] FIG. 4, panel A illustrates solid phase attached polymers
linked to particles 32 which are individually sequestered in
separate wells 34 on a microtiter plate. The scanning system will
separately scan each well. FIG. 4 panel B illustrates marbles 36 to
which polymers are attached. The marbles are automatically fed in a
continuous stream through the reaction reagents 38 and past a
detector 40. The marbles may be carefully held in tubes or troughs
which prevent the order of the beads from being disturbed. In a
combination of the two embodiments, each polymer is attached to a
plurality of small marbles, and marbles having each polymer are
separated, but retained in a known order. Each marble is, in batch
with a number of analogous marbles having other polymers linked
individually to them, passed through a series of reagents in the
sequencing system. For example, A2, B2, and C2 are subjected to
sequencing reactions in batch, with label incorporated only for the
second monomer. A3, B3, and C3 are likewise treated to determine
the third monomer. Likewise for A_n, B_n, and C_n.
However, within each batch, the detection will usually occur in the
order A, B, and C, thereby providing for correlation of successive
detection steps for the A polymer beads, for the B polymer beads,
and for the C polymer beads.
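Because the beads pass the detector in a retained order, a flat stream of detector readings can be split back into one result list per polymer. The following sketch assumes that every batch contains exactly one bead per polymer and that the order never changes; the function name and the example readings are hypothetical.

```python
def correlate_bead_stream(readings, polymer_order):
    """Assign a flat stream of detector calls to polymers by bead order.

    readings is the sequence of calls in the order the beads passed the
    detector; polymer_order is the fixed order of polymers within each batch.
    """
    n = len(polymer_order)
    if len(readings) % n != 0:
        raise ValueError("stream length is not a whole number of batches")
    sequences = {name: [] for name in polymer_order}
    for i, call in enumerate(readings):
        sequences[polymer_order[i % n]].append(call)
    return sequences

# Two batches of three beads, detected in the order A, B, C within each batch.
print(correlate_bead_stream(["G", "T", "A", "C", "C", "T"], ("A", "B", "C")))
```

The sketch raises an error on an incomplete batch rather than guessing, since a lost or reordered bead would misassign every later reading.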
[0098] FIG. 5 illustrates a signal which might result from a
particular defined position. Panel A illustrates the position of a
given colony relative to the positions corresponding to the
positional map. The scan system will typically determine the amount
of signal, or type of signal, at each position of the matrix. The
scan system will adjust the relationship of the detector and the
substrate to scan the matrix in a controllable fashion. An optical
system with mirrors or other elements may allow the relative
positions of the substrate and detection to be fixed. The scanner
can be programmed to scan the entire substrate surface in a
reproducible manner, or to scan only those positions where polymer
clusters have been localized. A digital data map, panel B, can be
generated from the scan step.
[0099] Thus, instead of subjecting each individual and separated
polymer to the series of reactions as a homogeneous sample, a whole
matrix array of different polymers targeted for sequencing may be
exposed to a series of chemical manipulations in a batch format. A
large array of hundreds, thousands, or even millions of spatially
separated homogeneous regions may be simultaneously treated by
defined sequencing chemistry.
[0100] The use of a coordinate system which can reproducibly assay
a defined position after each reaction cycle can be advantageously
applied according to this invention. For example, a colony or plaque
lift of polymers can be transferred onto a nitrocellulose filter or
other substrate. A scanning detector system will be able to
reproducibly monitor the results of chemical reactions performed on
the target polymers located at the defined locations of particular
clones. An accurate positioning can be further ensured by
incorporating various alignment marks on the substrate.
[0101] The use of a high resolution system for monitoring the
results of successive sequencing steps provides the possibility for
correlating the scan results of each successive sequencing reaction
at each defined position.
[0102] The invention is dependent, in part, upon the stepwise
synthesis or degradation of the localized polymers as schematized
in FIGS. 1 and 2. The synthetic scheme is particularly useful on
nucleic acids which can be synthesized from a complementary strand.
Otherwise, a stepwise degradation scheme may be the preferred
method. Although single monomer cycles of synthesis or degradation
will usually be applicable, in certain cases the technology will be
workable using larger segments, e.g., dimers or trimers, in the
cyclic reactions.
[0103] The present invention also provides methods for production
or selection of monomer-specific degradative reagents based upon
catalytic antibody constructs. Antibody binding sites exhibiting
specificity for binding particular terminal monomers can be linked
to cleavage reagents or active sites of cleavage enzymes. Thus,
reagents which are specific for particular terminal nucleotides may
function to remove them in a specific fashion.
[0104] The invention also makes use of a means for detecting or
labeling the polymers. Particular sequencing chemistry can be
selected for specificity in reacting with terminal monomer units.
Alternatively, indirect labeling methods may be applied which can
distinguish between different terminal monomers. Another
alternative scheme allows for terminal labeling which is not
monomer-specific, but with the determination of the monomer based
upon specificity of post-label reagents or upon
monomer-distinguishable labels. Suitable such reagents will be
antibodies or other reagents having specificity for distinguishing
between different labeled terminal monomer residues and cleaving
only those labeled monomer residues.
[0105] Thus, although neither the reaction nor the label need
necessarily be specific, at least one of the pair must be specific.
This ensures that a comparison of label signal before and after a
reaction allows determination of the change in label signal after
monomer-specific reactions are performed, thereby providing the
means to deduce the identity of the monomer at a given
position.
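The before-and-after comparison can be illustrated for the degradative case, where a non-specific label is followed by monomer-specific removal reactions. This is a sketch under stated assumptions: each scan is reduced to a present/absent flag per position, and the function name and data layout are hypothetical.

```python
def deduce_terminal_monomers(scans, reagent_order):
    """Call the terminal monomer at each position from loss of label.

    scans[0] is taken after labeling; scans[i + 1] is taken after treatment
    with the reagent specific for reagent_order[i]. Each scan maps a
    position to True when label is detected there.
    """
    calls = {}
    for i, monomer in enumerate(reagent_order):
        before, after = scans[i], scans[i + 1]
        for pos, had_label in before.items():
            if had_label and not after[pos]:
                calls[pos] = monomer   # label lost in the reaction specific for this monomer
    return calls

scans = [
    {12: True, 14: True},    # after labeling
    {12: False, 14: True},   # after the A-specific removal
    {12: False, 14: False},  # after the B-specific removal
]
print(deduce_terminal_monomers(scans, ("A", "B")))
```

With the example data, position 12 loses its label in the A-specific step and position 14 in the B-specific step, matching the assignments described for FIG. 1.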
[0106] B. Substrate/Matrix
[0107] The substrate or matrix has relatively few constraints on
its composition. Preferably, the matrix will be inert to the
sequencing reactions to which the polymers attached thereto will be
subjected. Typically, a silicon or glass substrate will be used,
but other suitable matrix materials include ceramics, or plastics,
e.g., polycarbonate, polystyrene, delrin, and cellulose, and any
other matrix which satisfies these functional constraints.
[0108] In one embodiment, the matrix should be sufficiently
nondeformable that the scanning system can reproducibly scan the
matrix and reliably correlate defined positions with earlier and
later scan operations. However, by including alignment markings on
the substrate, the need for absolute rigidity of the substrate may
be reduced.
[0109] In an alternative embodiment, the matrix may merely be large
enough that the attached polymer may be separated from a liquid
phase containing the sequencing reagents. In this embodiment, a
single detection unit is used to analyze the label in a
multiplicity of different samples after each of the reaction steps.
Thus, different samples may be separably treated in distinct wells
of a microtiter dish.
[0110] Separate homogeneous polymers can be introduced to solid
phase beads in each microtiter well. Sequencing reagents may be
individually introduced separately into each well, or transferred
from well to well with the polymers remaining in the correct well
due to their solid phase attachments.
[0111] In an alternative approach, the solid phase matrix may be
marbles or other particularly shaped articles. Spherical shapes,
solid or hollow, are preferred because they can be easily
transported through troughs or tubing which retains their relative
orders. By feeding a succession of beads through appropriate
reaction baths and past a detector in a known and retained order, a
succession of label detection results from a bead may be correlated
and converted into a polymer sequence.
[0112] The attachment of the target homogeneous clusters of target
polymers to the substrate can be achieved by appropriate linkage
chemistry. As indicated before, the linkage should be stable and
insensitive to the sequencing reagents used. The specific linkages
will depend, of course, upon the particular combination of
substrate and polymer being used.
[0113] Typically, the most useful chemical moieties which will be
used are amines. Typical substrate derivatized groups include
aminopropyl triethoxysilane, hydroxypropylacrylate, or hydroxy
reagents, see, e.g., U.S. Ser. No. ______, attorney docket number
11509-28 (automated VLSIPS). Typical polymer derivatized groups
include nitroveratryl and nitroveratryl oxycarbonyl. Linkage types
are also illustrated and detailed in U.S. Ser. No. ______, attorney
docket number 11509-28 (automated VLSIPS), and U.S. Ser. No.
______, attorney docket number 11509-30 (sequencing by
hybridization).
[0114] FIG. 6 illustrates one preferred linkage chemistry for
nucleic acids. An NVO-derivatized nucleotide is made as described
in U.S. Ser. No. ______, attorney docket number 11509-28 (automated
VLSIPS). The specific conditions for synthesis of thymidine are
described therein and are adaptable to other nucleotides and
nucleosides. The nucleoside analog is further derivatized with an
appropriate R group at the 3' hydroxyl. Preferred R groups are
indicated in FIG. 6. The linkage produces a photosensitive blocked
nucleoside suitable for phosphoramidite synthesis of further
polynucleotides which can serve as a complementary strand for
hybridization of other polymers. The hybrids of the complementary
strands may be covalently crosslinked using acridine dyes or other
intercalative reagents, e.g., psoralen. See, e.g., Kornberg (1980)
DNA Replication Freeman, San Francisco; Wiesehahn, et al. (1978)
Proc. Natl. Acad. Sci. USA 75:2705-______, and Sheldon (1986) U.S.
Pat. No. 4,582,789 which are each incorporated herein by
reference.
[0115] The linkage should be substantially inert to the cyclic
sequencing reactions and scan cycles. Usually, the linkage will be
at a defined and homogeneous polymer position, preferably at the
end opposite where the sequencing chemistry takes place. Although
the type of linkage is dependent upon the polymer being sequenced,
various types of polymers have preferred linkages. For
polypeptides, amino terminal or carboxyl terminal linkages will be
preferred. Specific amino terminal linkages include amino butyric
acid, amino caproic acids, and similar carboxylic acids. Specific
carboxyl terminal linkages include butyric acid, caproic acid, and
other carboxylic acids, hydrocarbon, and ethers. See U.S. Ser. No.
435,316 (VLSIPS parent) and U.S. Ser. No. 492,462 (VLSIPS CIP)
which are incorporated herein by reference. For nucleic acids, the
linkages will typically be either 5' or 3' linkages. Suitable 3'
linkages include those illustrated in FIG. 6, and others described
in U.S. Ser. No. ______, attorney docket number 11509-30
(sequencing by hybridization).
[0116] Alternatively, for complementary polymers, particularly
nucleic acids, linkage may be via crosslinkage of the complementary
polymers where the complementary strand is directly attached to the
matrix. Acridine dyes, psoralen, or a similar crosslinking
agent between the strands can be used. See, e.g., Dattagupta, et
al., "Coupling of Nucleic Acids to Solid Support By Photochemical
Methods," U.S. Pat. No. 4,713,326; and U.S. Pat. No. 4,542,102; and
Chatterjee, M. et al. (1990) J. Am. Chem. Soc. 112:6397; which
describe useful crosslinking reagents, and are hereby incorporated
herein by reference.
[0117] For polynucleotides, the preferred attachment to the matrix
is through a synthetic oligomer by the 5' end of each target
sequence. This oligomer is designed to anneal to the desired target
templates used in a synthetic system or to the polynucleotide used
in the degradation approach. In one embodiment, a vector sequence
which is complementary to the immobilized oligonucleotide is
incorporated adjacent the cloning inserts, thereby providing a
common complementary sequence for each insert. In particular, a
cloning vector will be selected with a defined sequence adjacent
the insert. See, e.g., Sambrook et al., (1989) Molecular Cloning: A
Laboratory Manual, Vols. 1-3, Cold Spring Harbor Press, which is
hereby incorporated herein by reference. This defined sequence is
used, in some embodiments, as a common linker for all of the vector
inserts. The inserts, adjacent to this linker, will be transferable
by hybridization to the matrix linked complementary sequences. The
hybrids are crosslinked by addition of a suitable crosslinker under
appropriate conditions, for example, photocrosslinking by psoralen
with UV light. See, e.g., Song et al. (1979) Photochem. Photobiol.
29:1177-1197; Cimino et al. (1985) Ann. Rev. Biochem. 54:1151-1193;
and Parsons (1980) Photochem. Photobiol. 32:813-821; each of which
is incorporated herein by reference. Using these approaches, the
oligonucleotide linker serves as both the attachment linker and the
polymerization primer.
[0118] FIG. 6 illustrates a preferred 3' terminal linkage designed
for a phosphoramidite linkage of a synthetic primer and the
reactions forming it. The chemical reactions for actually
performing the linkage will be similar to those used for
oligonucleotide synthesis instruments using phosphoramidite or
similar chemistry. Applied Biosystems, Foster City, Calif. supplies
oligonucleotide synthesizers.
[0119] C. Scanning System
[0120] The scanning system should be able to reproducibly scan the
substrate. Where appropriate, e.g., for a two dimensional substrate
where the polymers are localized to positions thereon, the scanning
system should positionally define the clusters attached thereon to
a reproducible coordinate system. It is important that the
positional identification of clusters be repeatable in successive
scan steps. Functionally, the system should be able to define
physical positions to a coordinate system as described above and
illustrated in FIGS. 3 and 4.
[0121] In alternative embodiments, the system can operate on a
cruder level by separately detecting separate wells on a microtiter
plate, or by scanning marbles which pass by the detector in an
embodiment as described above and illustrated in FIG. 4.
[0122] The scanning system would be similar to those used in
electrooptical scanning devices. See, e.g., the fluorescent
detection device described in U.S. Ser. No. 07/492,462 (VLSIPS
CIP), and U.S. Ser. No. ______, attorney docket number 11509-28
(automated VLSIPS). The system could exhibit many of the features
of photographic scanners, digitizers or even compact disk reading
devices. For example, a model no. PM500-A1 x-y translation table
manufactured by Newport Corporation can be attached to a detector
unit. The x-y translation table is connected to and controlled by
an appropriately programmed digital computer such as an IBM PC/AT
or AT compatible computer. The detection system can be a model no.
R943-02 photomultiplier tube manufactured by Hamamatsu, attached to
a preamplifier, e.g., a model no. SR440 manufactured by Stanford
Research Systems, and to a photon counter, e.g., an SR430
manufactured by Stanford Research Systems, or a multichannel
detection device. Although a digital signal may usually be
preferred, there may be circumstances where analog signals would be
advantageous.
[0123] The stability and reproducibility of the positional
localization in scanning will determine, to a large extent, the
resolution for separating closely positioned polymer clusters in a
2 dimensional substrate embodiment. Since the successive monitoring
at a given position depends upon the ability to map the results of
a reaction cycle to its effect on a positionally mapped cluster of
polymers, high resolution scanning is preferred. As the resolution
increases, the upper limit to the number of possible polymers which
may be sequenced on a single matrix will also increase. Crude
scanning systems may resolve only on the order of 1000 μm; refined
scanning systems may resolve on the order of 100 μm; more refined
systems may resolve on the order of about 10 μm. With optical
magnification systems, a resolution on the order of 1.0 μm is
available, and more preferably a resolution on the order of better
than 0.01 μm is desired. The resolution may be diffraction limited,
and advantages may arise from using shorter wavelength radiation
for the photo-optical deprotection or fluorescent scanning steps.
However, with increased resolution, the time
required to fully scan a matrix will be increased and a compromise
between speed and resolution will necessarily be selected. Parallel
detection devices which will provide high resolution with shorter
scan times will be applicable where multiple detectors will be
moved in parallel.
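The trade-off between resolution and scan time can be made concrete with simple arithmetic. The sketch below assumes a square matrix, one cluster per resolvable element with no spacing between clusters, and a fixed dwell time per position; these simplifications and the function names are assumptions for illustration, and the numbers are not measured values.

```python
def max_clusters(side_um, resolution_um):
    """Upper bound on resolvable clusters on a square matrix of given side."""
    per_side = int(side_um // resolution_um)
    return per_side * per_side

def full_scan_seconds(side_um, step_um, dwell_s):
    """Time to visit every position once with a single detector."""
    return max_clusters(side_um, step_um) * dwell_s

# A 1 cm x 1 cm matrix (10,000 um per side) at two resolutions.
print(max_clusters(10_000, 100))             # 100 um resolution
print(max_clusters(10_000, 10))              # 10 um resolution
print(full_scan_seconds(10_000, 10, 0.001))  # 1 ms dwell at 10 um steps
```

Under these assumptions a tenfold improvement in resolution raises both the cluster capacity and the single-detector scan time a hundredfold, which is the reason parallel detectors become attractive at high resolution.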
[0124] With other embodiments, resolution often is not so important
and sensitivity might be emphasized. However, the reliability of a
signal may be pre-selected by counting photons and continuing to
count for a longer period at positions where intensity of signal is
lower. Although this will decrease scan speed, it can increase
reliability of the signal determination. Various signal detection
and processing algorithms may be incorporated into the detection
system, such as described in U.S. Ser. No. 07/____, attorney docket
number 11509-28 (automated VLSIPS). In one embodiment, the
distribution of signal intensities of pixels across the region of
signal is evaluated to determine whether the distribution of
intensities corresponds to a true positive signal.
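Counting longer at dim positions can be expressed as a dwell-time rule. The sketch below assumes that photon arrivals follow Poisson statistics, so that the relative uncertainty of a count of N photons is about 1/sqrt(N); the function name, the cap on dwell time, and the example rates are hypothetical.

```python
import math

def dwell_time_for_precision(rate_hz, rel_error, max_dwell_s):
    """Dwell time needed to reach a target relative counting error.

    Assumes Poisson counting, where relative error is 1/sqrt(N), so the
    required count is 1/rel_error**2. The result is capped at max_dwell_s.
    """
    needed_counts = 1.0 / rel_error ** 2
    if rate_hz <= 0:
        return max_dwell_s            # no signal: give up after the cap
    return min(needed_counts / rate_hz, max_dwell_s)

print(dwell_time_for_precision(10_000, 0.1, 1.0))  # bright position: short dwell
print(dwell_time_for_precision(10, 0.1, 1.0))      # dim position: capped dwell
```

A rule of this form pre-selects the reliability of each reading at the cost of scan speed, and the cap keeps empty positions from stalling the scan.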
[0125] The detection system for the signal or label will depend
upon the label used, which may be defined by the chemistry
available. For optical signals, an optical fiber, a charge-coupled
device (CCD), or a combination of the two may be used in the detection step.
In those circumstances where the matrix is itself transparent to
the radiation used, it is possible to have an incident light beam
pass through the substrate with the detector located opposite the
substrate from the polymers. For electromagnetic labels, various
forms of spectroscopy systems can be used. Various physical
orientations for the detection system are available and discussion
of important design parameters is provided, e.g., in Jovin, Adv. in
Biochem. Bioplyms, which is hereby incorporated herein by
reference.
[0126] Various labels which are easily detected include radioactive
labels, heavy metals, optically detectable labels, spectroscopic
labels and the like. Various photoluminescent labels include those
described in U.S. Ser. No. ______, attorney docket number 11509-30
(sequencing by hybridization). Protection and deprotection are
described, e.g., in McCray, et al. (1989) Ann. Rev. Biophysical
Chemistry 18:239-270, and U.S. Ser. No. ______, attorney docket number
11509-28 (automated VLSIPS), each of which is hereby incorporated
herein by reference.
[0127] With a processing system, the speed of scanning may be
dramatically increased with a system which only scans positions
where known clusters of polymer are attached. This allows the
scanning mechanism to skip over areas which have been determined to
lack any polymer clusters and avoids loss of time in scanning
useless regions of the matrix. Moreover, various problems with
spurious or overlapping signals may be adjusted for by appropriate
analysis.
[0128] A scanning apparatus which may be used for the presently
described uses is schematically illustrated in FIG. 7. A substrate
52 is placed on an x-y translation table 54. In a preferred
embodiment the x-y translation table is a model no. PM500-A1
manufactured by Newport Corporation. The x-y translation table is
connected to and controlled by an appropriately programmed digital
computer 56 which may be, for example, an appropriately programmed
IBM PC/AT or AT compatible computer. Of course, other computer
systems, special purpose hardware, or the like could readily be
substituted for the AT computer used herein for illustration.
Computer software for the translation and data collection functions
described herein can be provided based on commercially available
software including, for example, "Lab Windows" licensed by National
Instruments, which is incorporated herein by reference for all
purposes.
[0129] The substrate and x-y translation table are placed under a
microscope 58 which includes one or more objectives 60. Light
(about 488 nm) from a laser 62, which in some embodiments is a
model no. 2020-05 argon ion laser manufactured by Spectraphysics,
is directed at the substrate by a dichroic mirror 64 which passes
greater than about 520 nm wavelength light but reflects 488 nm
light. Dichroic mirror 64 may be, for example, a model no. FT510
manufactured by Carl Zeiss. Light reflected from the mirror then
enters the microscope 58 which may be, for example, a model no.
Axioscop 20 manufactured by Carl Zeiss. Fluorescein-marked
materials on the substrate will fluoresce at wavelengths greater
than 488 nm, and the fluoresced light will be collected by the
microscope and passed through the mirror. The fluorescent light from
the substrate is then directed through a wavelength filter 66 and,
thereafter,
through an aperture plate 68. Wavelength filter 66 may be, for
example, a model no. OG530 manufactured by Melles Griot and
aperture plate 68 may be, for example, a model no. 477352/477380
manufactured by Carl Zeiss.
[0130] The fluoresced light then enters a photomultiplier tube 70,
which in one embodiment is a model no. R943-02 manufactured by
Hamamatsu. The signal is amplified in preamplifier 72, and photons
are counted by photon counter 74. The number of photons is recorded
as a function of the location in the computer 56. Preamplifier 72 may
be, for example, a model no. SR440 manufactured by Stanford
Research Systems and photon counter 74 may be a model no. SR430
manufactured by Stanford Research Systems. The substrate is then
moved to a subsequent location and the process is repeated. In
preferred embodiments the data are acquired every 1 to 100 μm
with a data collection diameter of about 0.8 to 10 μm preferred.
In embodiments with sufficiently high fluorescence, a CCD detector
with broadfield illumination is utilized.
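The move-and-count loop can be sketched as follows. This is an illustration only: `read_photons` is a hypothetical callable standing in for the stage move and the photon counter readout, and it does not correspond to any real instrument driver or vendor API. The optional `positions` argument illustrates visiting only locations where clusters are known to be attached.

```python
def scan_substrate(read_photons, width_um, height_um, step_um, positions=None):
    """Record photon counts as a function of location.

    read_photons(x, y) is a hypothetical callable that positions the
    substrate at (x, y) and returns the photon count measured there.
    If positions is given, only those locations are visited.
    """
    if positions is None:
        nx = int(width_um // step_um) + 1
        ny = int(height_um // step_um) + 1
        positions = [(ix * step_um, iy * step_um)
                     for iy in range(ny) for ix in range(nx)]
    return {(x, y): read_photons(x, y) for (x, y) in positions}

# Stand-in for the hardware: one bright spot on a dim background.
def fake_counter(x, y):
    return 500 if (x, y) == (10, 20) else 3

full_map = scan_substrate(fake_counter, 20, 20, 10)
print(len(full_map), full_map[(10, 20)])
```

Passing a list of known cluster positions reduces the number of stage moves to the number of clusters, at the cost of relying on an earlier full scan or positional map to supply that list.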
[0131] By counting the number of photons generated in a given area
in response to the laser, it is possible to determine where
fluorescent marked molecules are located on the substrate.
Consequently, for a substrate which has a matrix of polypeptides,
for example, synthesized on the surface thereof, it is possible to
determine which of the polypeptides has incorporated a
fluorescently marked monomer.
[0132] According to preferred embodiments, the intensity and
duration of the light applied to the substrate is controlled by
varying the laser power and scan stage rate for improved
signal-to-noise ratio by maximizing fluorescence emission and
minimizing background noise. Signal analysis may improve the
resolution and reliability of the system. The time of photon
counting may be varied at various positions to provide high signal
to background or noise.
[0133] D. Synthetic or Degradative Cycle
[0134] The present invention provides a substrate with positionally
separated polymers for sequencing. The separation may be by solid
phase carriers separated in separate wells, by separately
manipulable carriers such as beads or marbles, or by physical
separation of regions on a two-dimensional substrate surface. Each
cluster region is a target for the sequencing reactions. Although
the reactions are, in various embodiments, performed on all the
clusters together, each cluster can be individually analyzed by
following the results from the sequence of reactions on polymer
clusters at positionally defined locations.
[0135] The synthetic mode, as illustrated in FIG. 2, is easily
applied to the sequencing of nucleic acids, since one target strand
may serve as the template to synthesize the complementary strand.
The nucleic acid can be DNA, RNA or mixed polymers. For the
purposes of illustration, and not by limitation, the sequencing
steps for DNA are described in detail. The synthetic mode, an
example of which is depicted in FIG. 8 for nucleotides, may also be
useful in circumstances where synthesis occurs in response to a
known polymer sequence. The synthetic scheme depends, in part, on
the stepwise elongation by small and identifiable units. A
polymerase is used to extend a primer complementary to a target
template. The primer is elongated one nucleotide at a time by use
of a particular modified nucleotide analog to which a blocking
agent is added and which prevents further elongation. This blocking
agent is analogous to the dideoxy nucleotides used in the Sanger
and Coulson sequencing procedure, but in certain embodiments here,
the blockage is reversible. This analog is also labeled with a
removable moiety, e.g., a fluorescent label, so that the scanning
system can detect the particular nucleotide incorporated after its
addition to the polymerization primer.
[0136] Panel A illustrates the cycle of sequence reactions in one
embodiment. The template polymer 82 located at a particular site
has already been linked to substrate. The template 82 and
complementary primer 84 are hybridized. Often, the primer 84 is
common to all of the target template sequences, selected by its
common occurrence on a selected cloning vector. The primer 84 is
also often covalently crosslinked to the target template 82 using
psoralen and U.V. light.
[0137] Labeled and blocked monomers 86 are shown, the label
depicted by the asterisk and the polymerization blocking groups
indicated by B. A compatible polymerase 88 which can elongate the
primer with the labeled blocked monomers 86 is used in reaction 1.
In the preferred embodiment, the separate labeled monomers can be
distinguished from one another by the wavelength of fluorescent
emission.
[0138] In the example illustrated, a labeled blocked guanosine
monomer has been incorporated into the elongated primer 90.
[0139] Step 2 is a scan, where the signal at the position
corresponding to template 82 indicates that the guanosine analog
was incorporated. Reaction 2, a removal of both the label and the
blocking group, is then performed. It will be recognized that the blocking
group prevents elongation by any more than a single nucleotide in
each reaction cycle. Reaction 3 is equivalent to reaction 1, though
the substrate primer has been elongated by one monomer.
[0140] Panel B illustrates the scheme in a logic flow chart. The
template 82 is attached to the substrate, either directly or
through the primer. Reaction 1 elongates the primer by a single
labeled blocked nucleotide. A scan step is performed and the
blocking and labeling agents are removed. The elongation reaction
is performed and the cycle repeated.
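The flow chart logic of elongate, scan, and deblock can be written as a short loop. The sketch below assumes four labeled, reversibly blocked nucleotides that are distinguishable in a single scan, standard Watson-Crick pairing, and error-free chemistry; the function name and the example template are hypothetical.

```python
# Watson-Crick pairing for DNA.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def run_cycles(template, n_cycles):
    """Return the bases called over n_cycles of elongate, scan, deblock.

    template lists the template bases in the order the primer copies them.
    """
    calls = []
    for cycle in range(min(n_cycles, len(template))):
        # Reaction 1: the polymerase adds exactly one labeled, blocked
        # nucleotide, chosen by the template base at this position.
        incorporated = PAIR[template[cycle]]
        # Scan: the label identifies which nucleotide was incorporated.
        calls.append(incorporated)
        # Reaction 2: label and blocking group are removed, which frees
        # the 3' end for the next cycle.
    return "".join(calls)

print(run_cycles("CATG", 4))
```

Because each cycle adds one nucleotide, the string of calls is the complement of the template read in the direction of primer extension; with the example template the first call is G, consistent with the guanosine incorporation described for panel A.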
[0141] For a nucleic acid, a unit for addition would typically be a
single nucleotide. Under certain circumstances, dimers or trimers
or larger segments may be utilized, but a larger number of
different possible nucleotide elements requires high
distinguishability in other steps. For example, there are only four
different nucleotide monomer possibilities, but there are sixteen
different dimer possibilities. The distinction among four
possibilities is more precise and simple than among sixteen dimer
possibilities. To prevent elongation by a unit length greater than
one monomer, the nucleotide should be blocked at the position of 3'
elongation. Usually, the nucleotide will be blocked at the 3'
hydroxyl group where successive nucleotides would be attached. In
contrast to a dideoxy nucleotide, typically the blocking agent will
be a reversible blocking agent thereby allowing for deblocking and
subsequent elongation.
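The growth in the number of distinguishable units with unit length is a simple power of four, as the short sketch below shows. The function name is hypothetical.

```python
from itertools import product

def possible_units(alphabet="ACGT", unit_length=1):
    """List every possible unit of the given length over the alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=unit_length)]

for k in (1, 2, 3):
    print(k, len(possible_units(unit_length=k)))
```

The counts are 4, 16, and 64 for monomers, dimers, and trimers respectively, so each added position multiplies by four the number of labels or reactions that must be told apart.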
[0142] Variations may be easily incorporated into the procedure. If
the labels on the monomers are not distinguishable, successive
substrate scans can be performed after each monomer is provided
under conditions allowing its incorporation. Alternatively, a small
fraction of permanently blocked but reversibly labeled monomers may
be incorporated. Those specific molecules which incorporate the
blocked monomers are permanently removed from further
polymerization, but such is acceptable if the labeling moiety is
also removed.
[0143] 1. other monomers
[0144] One important functional property of the monomers is that
the label be removable. The removal reaction will preferably be
achieved using mild conditions. Blocking groups sensitive to mild
acidic conditions, mild basic conditions, or light are preferred.
The label position may be anywhere on the molecule compatible with
appropriate polymerization, i.e., complementary to the template, by
the selected polymerase. A single polymerase for all of the
modified nucleotides is preferred, but a different polymerase for
each of the different monomers can be used.
[0145] Nucleotide analogs used as chain-terminating reagents will
typically have both a labeling moiety and a blocking agent while
remaining compatible with the elongation enzymology. As the
blocking agent will usually be on the 3' hydroxyl position of the
sugar on a nucleotide, it would be most convenient to incorporate
the label and the blocking agent at the same site, providing for a
single reaction for simultaneous removal of the label and blocking
agent. However, it is also possible to put a label on another
portion of the nucleotide analog than the 3' hydroxyl position of
the sugar, thereby requiring a two-step reaction cycle for removing
the blocking and labeling groups.
[0146] Analogs will be found by selecting for suitable combinations
of appropriate nucleotides with compatible polymerases. In
particular, it is desired that a selected polymerase be capable of
incorporating a nucleotide, with selectivity, having both the
blocking moiety and the label moiety attached. It has been observed
that RNA polymerases are less fastidious with respect to the
nucleotide analogues which will be polymerized into a growing
chain. See, e.g., Rozovskaya, T., et al. (1977) Molekulyarnaya
Biologiya, 11:598-610; Kutateladze, T., et al. (1986) Molekulyarnaya
Biologiya, 20:267-276; and Chidgeavadze, Z., et al. (1985) FEBS
Letters, 183:275-278. Moreover, those references also indicate that
rather significant chemical moieties may be attached at the 2' or
3' positions on a nucleotide, and still be correctly incorporated
at the growing chain terminus.
[0147] In particular, it is not necessary that the same nucleotide
have both the reversible blocking moiety and the removable labeling
moiety, as a combination of two separate nucleotide analogues could
be utilized, e.g., N1, which is reversibly blocked and not labeled,
and N2, which is irreversibly blocked but removably labeled. Note
that the removal of label may be effected by destruction of the
label, e.g., fluorescence destruction, or preferably by removal.
Both of these nucleotides might be, for instance, A analogues. With
the mixture, at an appropriate sequence position of a target
sequence, N1 and N2 nucleotides can be incorporated at an
appropriate ratio, and these can be polymerized by either two
separate polymerases, or preferably a single polymerase.
[0148] For example, two separate polymerases might be necessary, P1
which incorporates N1, and P2 which incorporates N2. At the given
location in the sequence, some of the growing polymers will
incorporate N1 with P1 polymerase, and others will incorporate N2
with the P2 polymerase. The proportions of N1, N2, P1, and P2 may
be titrated to get the desired fractional proportions of the N1
reversibly blocked nucleotides and the N2 labeled but irreversibly
blocked nucleotides.
[0149] As all of the growing chains have blocked nucleotides, no
elongation takes place beyond a single nucleotide. The N2
nucleotides provide a specific label, detected in the scanning
step. After determination of the incorporated label, the label may
be removed or destroyed, and those irreversibly terminated growing
chains become permanently removed from further participation in the
sequencing process. Photodestruction may be achieved by a high
intensity laser beam of the correct wavelength. See, e.g., March
(1977) Advanced Organic Chemistry: Reactions, Mechanisms and
Structure (2d Ed) McGraw; and Carey and Sundberg (1980) Advanced
Organic Chemistry: part A Structure and Mechanisms, Plenum.
[0150] Next, the reversible blocking moiety is removed, providing a
new set of slightly longer polymers ready for the next step. Of
course, the amount of label necessary to be incorporated must be
detectable, preferably with a clear, unambiguous positive signal.
The amount of label incorporated will depend, in part, upon the
conditions in the polymerizing step and the relative incorporation of
the N1 and N2 nucleotides. The proportions of the nucleotides,
polymerases, and other reagents may be adjusted to appropriately
incorporate the desired proportions of the nucleotides.
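The trade-off between detectable signal and permanent chain loss can be sketched numerically. This is a minimal illustration under stated assumptions, not a model taken from the specification: it assumes that the same fraction of active chains incorporates the irreversibly blocked N2 analog in every cycle and that incorporation is otherwise complete. The function and parameter names are hypothetical.

```python
def active_fraction(terminated_per_cycle: float, cycles_completed: int) -> float:
    """Fraction of chains in a cluster still able to elongate.

    terminated_per_cycle is the assumed fraction of active chains that
    incorporate the irreversibly blocked N2 analog in each cycle.
    """
    if not 0.0 <= terminated_per_cycle <= 1.0:
        raise ValueError("terminated_per_cycle must be between 0 and 1")
    return (1.0 - terminated_per_cycle) ** cycles_completed


def labeled_fraction(terminated_per_cycle: float, cycle: int) -> float:
    """Fraction of the original chains that carry label in a given cycle.

    cycle is 1-based. Only chains still active at the start of the
    cycle can incorporate the labeled N2 analog.
    """
    return terminated_per_cycle * active_fraction(terminated_per_cycle, cycle - 1)


if __name__ == "__main__":
    for fraction in (0.01, 0.05):
        print(fraction, active_fraction(fraction, 100), labeled_fraction(fraction, 100))
```

Under these assumptions a larger N2 fraction gives a stronger signal in early cycles but depletes the pool of extendable chains faster, so the proportions would be titrated against the intended number of cycles.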
[0151] In an embodiment where a single polymerase will incorporate
both N1 and N2, the relative proportions and conditions to get the
correct incorporation levels of the two nucleotides can be
titrated. In an alternative preferred embodiment, a single
nucleotide will have both the removable label and the reversible
blocking moiety.
[0152] A similar approach may be necessary where only some fraction
of the nucleotide analogues is labeled. Separate polymerases might
also be useful for such situations, and each polymerase may have
special conditions necessary for activity.
[0153] Procedures for selecting suitable nucleotide and polymerase
combinations will be readily adapted from Ruth et al. (1981)
Molecular Pharmacology 20:415-422; Kutateladze, T., et al. (1984)
Nuc. Acids Res., 12:1671-1686; Kutateladze, T., et al. (1986)
Molekulyarnaya Biologiya 20:267-276; Chidgeavadze, Z., et al.
(1985) FEBS Letters, 183:275-278; and Rozovskaya, T., et al. (1977)
Molekulyarnaya Biologiya 11:598-610.
[0154] The determination of termination activity is done in two
steps. First, nucleotide analogues are screened for the ability of
the compound to inhibit polymerase activity. Then the nucleotide
analogue is tested for base-specific termination as manifested by
generating a correct DNA sequencing ladder on a template of known
sequence. The appropriate reaction conditions are those used for
conventional sequencing reactions with the respective polymerases.
The conditions are then modified in the usual ways to obtain the
optimal conditions for the particular terminator compound (e.g.,
concentration of terminator, ratio of terminator to dNTP, Mg.sup.++,
and other reagents critical to proper polymerase function).
[0155] By way of example, an approach employing the polymerase
known as reverse transcriptase (AMV) will be described. The initial
conditions are essentially as described by Prober, et al. (1987)
Science 238: 336-341.
[0156] A nucleotide analogue is first selected from the group
available from a commercial source such as Amersham, New England
Nuclear, or Sigma Chemical Company. In particular, nucleotides
which are reversibly blocked from further elongation, especially at
the 5' or 3' --OH will be used.
[0157] General properties which are desired have been described.
Each of these analogs can be tested for compatibility with a
particular polymerase by testing whether such polymerase is capable
of incorporating the labeled analog. Various polymerases may be
screened, either natural forms of the mentioned types, or variants
thereof. Polymerases useful in connection with the invention
include E. coli DNA polymerase (Klenow fragment); see Klenow and
Henningsen (1970) Proc. Nat'l Acad. Sci. USA 65:168-______; and
Jacobsen et al. (1974) Eur. J. Biochem. 45:623-______; modified
and cloned versions of T7 DNA polymerase (Sequenase and Sequenase
2.0); see Tabor and Richardson (1987) Proc. Nat'l Acad. Sci. USA
84:4767-______; and Tabor and Richardson (1987) J. Biol. Chem.
262:15330-______; Taq DNA polymerase from thermostable Thermus
aquaticus; see Chien et al. (1976) J. Bacteriol. 127:1550-______;
and its cloned version Amplitaq; see Saiki and Gelfand (1989)
Amplifications 1:4-______; T4 DNA polymerase; see Nossal (1974) J.
Biol. Chem. 249:5668-______; and various reverse transcriptases,
both RNA- and DNA-dependent DNA polymerases, e.g., from avian
retroviruses; see Houts (1970) J. Virology 29:517-______; and
murine retroviruses; see Kotewicz et al. (1985) Gene 85:249-______;
Gerard et al. (1986) DNA 5:271-______; and Bst polymerase; see Ye,
S. and Hong (1987) Scientia Sinica 30:503-506.
[0158] In order to ensure that only a single nucleotide is added at
a time, a blocking agent is usually incorporated onto the 3'
hydroxyl group of the nucleotide. Optimally, the blocking agent
should be removable under mild conditions (e.g., photosensitive,
weak acid labile, or weak base labile groups), thereby allowing for
further elongation of the primer strand with a next synthetic
cycle. If the blocking agent also contains the fluorescent label,
the dual blocking and labeling functions will be achieved without
the need for separate reactions for the separate moieties.
[0159] The blocking group should have the functional properties of
blocking further elongation of the polymer. Additional desired
properties are reversibility and inertness to the sequencing
reactions. Preferably, where an enzymatic elongation step is used,
the monomers should be compatible with the selected polymerase.
Specific examples for blocking groups for the nucleic acids include
acid or base labile groups at the 3' OH position. See, e.g., Gait
(1984) Oligonucleotide Synthesis: A Practical Approach, IRL Press,
Oxford.
[0160] A DNA-dependent DNA polymerase is the polymerase of choice.
Polymerases used for conventional DNA sequencing, for example,
Klenow fragment of E. coli DNA Pol, Sequenase (modified T7 DNA
polymerase), Taq (Thermus aquaticus) DNA polymerase, Bst (Bacillus
stearothermophilus) DNA polymerase, reverse transcriptase (from
AMV, MMLV, RSV, etc.) or other DNA polymerases will be the
polymerases of choice. However, there is a functional constraint
that the polymerase be compatible with the monomer analogues
selected. Screening will be performed to determine appropriate
polymerase and monomer analog combinations.
[0161] Removal of the blocking groups may also be unnecessary if
the labels are removable. In this approach, the chains
incorporating the blocked monomers are permanently terminated and
will no longer participate in the elongation processes. So long as
these blocked monomers are also removed from the labeling process,
a small percentage of permanent loss in each cycle can also be
tolerated.
[0162] The fluorescent label may be selected from any of a number
of different moieties. The preferred moiety will be a fluorescent
group for which detection is quite sensitive. Various different
fluorescence-labeling techniques are described, for example, in
Cambara et al. (1988) "Optimization of Parameters in a DNA
Sequenator Using Fluorescence Detection," Bio/Technol. 6:816-821;
Smith et al. (1985) Nucl. Acids Res. 13:2399-2412; and Smith et al.
(1986) Nature 321:674-679, each of which is hereby incorporated
herein by reference. Fluorescent labels exhibiting particularly
high coefficients of destruction may also be useful in destroying
nonspecific background signals.
[0163] Appropriate blocking agents include, among others, light
sensitive groups such as 6-nitroveratryloxycarbonyl (NVOC),
2-nitrobenzyloxycarbonyl (NBOC),
.alpha.,.alpha.-dimethyl-dimethoxybenzyloxycarbonyl (DDZ),
5-bromo-7-nitroindolinyl, o-hydroxy-2-methyl cinnamoyl,
2-oxymethylene anthraquinone, and t-butyl oxycarbonyl (TBOC). Other
blocking reagents are discussed, e.g., in U.S. Ser. No. 07/492,462;
Patchornik (1970) J. Amer. Chem. Soc. 92:6333; and Amit et al.
(1974) J. Org. Chem. 39:192, all of which are hereby incorporated
herein by reference. Additional blocking agents attached to
particular positions may be selected according to the functional
directives provided herein.
[0164] FIG. 9 schematically illustrates the synthesis of a generic
protected nucleotide. A suitable nucleotide is labeled with the
FMOC fluorescently detectable label by reaction under the
conditions described, e.g., in U.S. Ser. No. ______, attorney
docket number 11509-30 (sequencing by hybridization), with
(TMS-Cl), FMOC-Cl, and H.sub.2O. A protection moiety will be added
using conditions also described there.
[0165] Various nucleotides possessing features useful in the
described method can be readily synthesized. Labeling moieties are
attached at appropriate sites on the nucleotide using chemistry and
conditions as described, e.g., in Gait (1984) Oligonucleotide
Synthesis. Blocking groups will also be added using conditions as
described, e.g., in U.S. Ser. No. ______, attorney docket number
11509-30 (sequencing by hybridization). FIG. 9 also outlines
various reactions which lead to useful nucleotides.
[0166] Additionally, the selected polymerases used in elongation
reactions should be compatible with nucleotide analogs intended for
polymerization to the primer. Simple screening procedures for
nucleotide and polymerase combinations may be devised to verify
that a particular combination is functional. A test using primer
with template which directs the addition of the nucleotide analog
to be incorporated will determine whether the combination is
workable. Natural polymerases or variants thereof may be used under
particular defined conditions.
[0167] The degradative scheme is generally illustrated in FIG. 1;
an example more generally applicable to biological macromolecular
polymers is depicted in FIG. 10. This method is useful for a wider
variety of polymers without the limitations imposed by the need to
replicate the polymer. The degradative sequencing technique
depends, in part, upon the ability to specifically label or
distinguish between various different terminal monomers at
particular matrix positions. Reactions for specific removal of a
defined monomer unit are important.
[0168] This monomer distinguishability can arise from an ability to
differentiate between label on the various possible monomers in the
polymer. As a second means, distinguishability can come from
specific reagents which react with particularity on different
monomers. Thus, for instance, labels may be used which generally
attach to the terminal nucleotide, but whose fluorescent signal
differs depending upon the nucleotide. As a third means, a reagent
which specifically affects the label on only one monomer may be
used, as described below.
[0169] In the first example, every polymer cluster will be labeled
at a particular end, e.g., the 5' end, without specificity for the
monomer located there. The scan step will be able to distinguish
the terminal monomers, after which each labeled terminal monomer is
specifically removed. The general label step is repeated in the
cycle as described.
[0170] In the second means for distinguishability, reagents are
used which produce a signal which is dependent upon the terminal
nucleotide. For example, a labeling molecule which binds only to
one specific terminal monomer will provide a monomer specific
label. This will provide a cycle much like the first means for
distinguishability where the properties of the label are different
depending upon the terminal nucleotide to which each specific
labeling reagent binds.
[0171] In the third means for distinguishability, an individual
reagent labels or affects only a specific terminal monomer.
Polymers susceptible to each reagent by virtue of terminating with
the corresponding monomer will have their labels specifically
affected. A scan of the matrix after each step and comparison with
the earlier scans will determine which positions correspond to
polymers ending with a susceptible monomer. Performing a removal
step with a second monomer-specific reagent followed by a scan will
identify those positional locations having polymer clusters ending
with that second monomer. A similar reagent for the other possible
monomers will further define all of the possibilities. Finally,
when all of the possible monomers have been removed, the labeling
reaction may be repeated and the succession of specific reagent and
scanning steps will also be repeated. This procedure allows for a
succession of automated steps to determine the sequence of the
polymer clusters localized to distinct positions.
[0172] Finally, a combination of both specificity of reagent and
ability to distinguish label on different monomers can be utilized.
Neither alone need be relied upon exclusively. Thus, in the case of
nucleotides, an ability to distinguish into two separate classes of
nucleotides, e.g., A and C from G and T, combined with specific
reagents for distinguishing between the indistinguishable label
pairs, e.g., in the example provided, A from C, or G from T, can
also provide sufficient information for sequencing.
[0173] Instead of performing four specific reactions on the same
substrate matrix, each of the four individual reactions can be
performed on separate parallel matrices. Four separate substrate
matrices may be made by replica plating or successive transfers,
each matrix having the same spatial distribution of polymer
clusters. Thereby, each separate substrate can be subjected to only
a single specific reagent in a highly optimized reaction. On each
cycle, one out of the four parallel substrates should show a signal
indicating the monomer at the terminal for the cluster at a given
matrix position.
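Correlating the four parallel substrates can be sketched as below. This is a hedged illustration: it assumes each substrate scan is reduced to the set of positions showing a signal, and it marks any position with a signal on more than one substrate as ambiguous ("N"), a convention introduced here and not taken from the text. The function name is hypothetical.

```python
def call_from_parallel_substrates(signals: dict) -> dict:
    """Combine one cycle of scans from parallel substrates.

    signals maps each monomer (for example "A", "C", "G", "T") to the
    set of matrix positions that showed a signal on the substrate
    treated with the reagent specific for that monomer.
    """
    hits = {}
    for monomer, positions in signals.items():
        for position in positions:
            hits.setdefault(position, []).append(monomer)
    # Exactly one substrate should show a signal at each position.
    return {
        position: monomers[0] if len(monomers) == 1 else "N"
        for position, monomers in hits.items()
    }


if __name__ == "__main__":
    cycle = {"A": {(0, 0)}, "C": {(0, 1)}, "G": {(1, 0), (0, 1)}, "T": set()}
    print(call_from_parallel_substrates(cycle))
```

Positions that show no signal on any substrate are simply absent from the result, which keeps missing data distinct from ambiguous data.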
[0174] Likewise, two parallel substrates can be provided, and each
of the parallel substrates is used to determine two of the four
possible nucleotides at each position. Instead of treating a single
matrix with four separate reactions, this approach allows treating
each of two substrates with only two separate reactions. By
minimizing the number of reactions to which each chip is exposed,
the side reactions will be minimized, the chemistry will be
optimized, and the number of cycles through which a matrix will
survive will be optimized. This provides an advantage in the number
of cycles to which a matrix can be subjected before the signal to
noise becomes indistinguishable.
[0175] E. Label
[0176] The label is important in providing a detectable signal. The
signal may be distinguishable among the various monomers by the
nature of the signal, e.g., wavelength or other characteristic, as
described in Prober et al. (1987) Science 238:336-341. A
monomer-specific reagent can allow determination of whether each
position has a particular terminal monomer by the presence or loss
of label.
[0177] The label on the monomer may be a noncovalent attachment,
but will be preferably a direct covalent attachment. The label will
typically be one which is capable of high positional resolution and
not interfere with the nucleotide-specific chemistry or enzymology.
Although many different labels may be devised including enzyme
linked immunosorbent assays (ELISA), spectrophotometric labels,
light producing or other labels, a fluorescent moiety is the
preferred form. For example, an avidin/biotin type affinity binding
may be useful for attaching a particular label. Alternatively, an
antibody may be used which is specific for binding to a particular
terminal monomer. A wide variety of other specific reagents can be
used to provide a labeling function. See, for example, U.S. Ser.
No. ______, attorney docket number 11509-30 (sequencing by
hybridization), which is hereby incorporated herein by
reference.
[0178] The means of detection utilized will be selected in
combination with various other considerations. In some
circumstances, a spectroscopic label may be most compatible with a
particular monomer. Enzyme linked assays with a spectrophotometric
detection system are a workable system. Phosphorescent or light
producing assays provide high sensitivity using charge-coupled
devices. Fluorescent systems provide the same advantages,
especially where the incident light beam is a laser. The
fluorescent label also may provide the added advantage of
fluorescing at different wavelengths for the different monomers,
providing a convenient means to distinguish between different
monomers. Other forms of label may be desired for various reasons,
for example, magnetic labels, radioactive labels, heavy metal
atoms, optically detectable labels, spectroscopically detectable
labels, and fluorescent labels.
[0179] For sequencing nucleic acids by this method, the labeled
monomers are simpler than those monomers used for the synthetic
method. The blocking group is unnecessary, but terminal specific
reagents are more difficult to produce.
[0180] The preferred attachment sites will be at the same location
as the blocking site, so a combined label and blocking moiety is
more preferred. The label will be attached as described, e.g., in
U.S. Ser. No. ______, attorney docket number 11509-30 (sequencing
by hybridization).
[0181] Two types of degradation cycles can be used, either
non-specific removal of the terminal labeled nucleotide, or a
base-specific removal. With the non-specific removal means, each of
the end monomers, when labeled, should be distinguishable from the
other three monomer possibilities. This allows for determination of
the terminal nucleotide for the cluster localized at a given matrix
position. Then the terminal, labeled nucleotides are
non-specifically removed and the newly exposed terminal nucleotides
will be again distinguishably labeled.
[0182] By this scheme, a specific label for each of the different
nucleotides may be provided. For example, fluorescent reagents
specific for each of the nucleotides may provide a signal with a
different wavelength. This will more usually occur when the
fluorescent probe is located near the base moiety of the nucleotide.
In the scanning step, the regions terminating with each of the four
different nucleotides may be determined. Then, a reaction is
performed removing the labeled terminal nucleotides from all of the
polymers. This removal may be either enzymatic, using a
phosphatase, an exonuclease or other similar enzyme, or chemical,
using acid, base, or some other, preferably mild, reagent. Again,
the reactions are performed which label each of the terminal
nucleotides and a scan step repeated in the same manner.
[0183] In the base-specific removal scheme, nucleotide-specific
removal can be performed. For example, an enzyme which will
function to remove only a single modified nucleotide, e.g., a
5'-fluorescein-dAMP-specific exonuclease, is constructed. This may
be achieved by proper construction of a catalytic antibody. Other
similar reagents may be generated for each of the other labeled
nucleotide monomers.
[0184] Catalytic or derivatized antibodies to catalyze the removal
of the 3'-end or 5'-most fluorescent base in a base-specific manner
may be constructed as follows. A recombinant antibody library or a
series of monoclonal antibodies is screened with fluorescent
donor-quencher substrates. These substrates consist of a
fluorescent labeled base (A, C, G, or T) on the 5' or 3' end joined
by a 5' to 3' phosphodiester linkage to a second base. A collection
of all four possible second bases for each of the four end bases
gives the best selection target for the required non-specificity
with respect to the second base. The second base is then tethered
to an acceptor group in sufficient proximity to quench the
fluorescence of the end group. In the presence of a catalytic
antibody with cleaving activity, a fluorescent signal occurs from
the separation of the quenching group from the terminal fluorescent
label. To assure both base and end specificity, the positive
monoclonal antibody clones are rescreened against the other
substrates.
[0185] Upon selection of an antibody exhibiting the desired
specificity (or lack thereof), the reactive group for cleavage may
be attached. This cleavage reagent may be chemical or enzymatic and
will be attached by an appropriate length linker to the antibody
binding site in an orientation which is consistent with the steric
requirements of both binding and specific cleavage.
[0186] Particularly useful specific reagents may be produced by
making antibodies specific for each of the four different modified
terminal nucleotide bases. These antibodies would then specifically
bind only to polymers terminating in the appropriate base analog.
By coupling a cleavage reagent to the specific antibody, a
terminal nucleotide specific cleavage reagent is generated.
[0187] In one example of the degradative embodiment, all of the
polymers may be uniformly labeled at a particular end. Thereafter,
a specific removal reaction which removes only a particular
nucleotide may be performed, leaving the three other nucleotides
labeled. Thereafter, a scanning step is performed through which all
regions which had incorporated that particular nucleotide will have
lost the label through specific removal. Then, the second specific
reagent will be applied which specifically removes the second
labeled nucleotide, and the scanning step following that reaction
will allow determination of all regions which lose the second
particular nucleotide. This process is repeated with reagents
specific for each of the last two remaining labeled nucleotides
interspersed with scanning steps, thereby providing information on
regions with each of the nucleotides located there. Then, the
entire process may be repeated by labeling the next terminal
nucleotides uniformly. As mentioned below, replication techniques
may allow for making four separate but identical matrix substrates.
Each substrate may be subjected to single nucleotide-specific
reactions, and the scan results correlated with each of the other
parallel substrates.
[0188] In the degradation scheme, the polynucleotide linkage to the
matrix must be more carefully selected such that the free end of
the oligonucleotide segments used for attachment will not interfere
with the determinations of the target sequence terminus.
[0189] F. Utility
[0190] The present sequencing method is useful to monitor and check
the accuracy and reliability of the synthetic processes described
in the U.S. Ser. No. 07/362,901 (VLSIPS parent) and U.S. Ser. No.
07/492,462 (VLSIPS CIP) applications. The present method can be
used to check the final products synthesized therein, or to label
each monomer as it is added stepwise to monitor the efficiency
and accuracy of those synthetic methods.
[0191] The present invention can also be used to monitor or
sequence matrix bound clusters of positionally distinct polymers.
This sequencing process provides the capability of simultaneously
sequencing a large plurality of distinct polymers which are
positionally segregated.
[0192] The method will be used to sequence extremely large
stretches of polymer, e.g., nucleic acids. A large number of
shorter segments of a large sequence can be sequenced with
alignment of overlaps either randomly generated, or in an ordered
fashion, or particular sequenceable segments of a large segment can
be generated. In one approach, a large segment is subcloned into
smaller segments and a sufficient number of the randomly generated
subclones are sequenced as described herein to provide sequence
overlap and ordering of fragments.
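Ordering fragments by sequence overlap can be illustrated with a minimal sketch. It assumes exact, error-free reads and a simple longest suffix-to-prefix match; the function name merge_by_overlap and the min_overlap parameter are hypothetical, and a practical assembly would need to tolerate sequencing errors and repeats.

```python
from typing import Optional


def merge_by_overlap(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two fragment sequences that share an end overlap.

    Looks for the longest suffix of left that equals a prefix of right
    and is at least min_overlap monomers long. Returns the merged
    sequence, or None when no such overlap exists.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


if __name__ == "__main__":
    print(merge_by_overlap("ACGTAC", "TACGGA"))
    print(merge_by_overlap("AAAA", "CCCC"))
```

Requiring a minimum overlap length reduces the chance that two unrelated fragments are joined through a short coincidental match.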
[0193] In an alternative approach, a large segment can be
successively digested to generate a succession of smaller sized
subclones with ends separated by defined numbers of monomers. The
subclones can be size sorted by a standard separation procedure and
the individual samples from a separation device manually or
automatically linked to a matrix in a defined positional map.
Fractions resulting from size separation can be spatially attached
at defined positions, often at adjacent positions. Then polymer
sequences at adjacent positions on the matrix will also be known to
have ends which differ by, e.g., approximately 25 or 50 or more
monomers, thereby providing significantly greater confidence in
overlapping sequence data.
[0194] III. Specific Embodiments
[0195] A specific series of reactions for sequencing a matrix of
polynucleotides is described.
[0196] A. Synthetic Method
[0197] This method involves annealing a primer (common to all the
attached sequences by virtue of the cloning construction) near to
the 3' end of the unknown target sequences. DNA polymerase, or a
similar polymerase, is used to extend the chains by one base by
incubation in the presence of dNTP analogs which function as both
chain terminators and fluorescent labels. This is done in a
one-step process where each of the four dNTP analogs is identified
by a distinct dye, such as described in Prober et al. Science
238:336-341, or in four steps, each time adding one of the four
bases, interspersed with a scanning identification step. When each
cluster incorporates the proper one of the four bases and the
fluorescence scanning is complete, the matrix is stripped of the
label and the chain terminators are deblocked for a next round of
base addition. Because the base addition is directed by the
template strand, the complementary sequence of the fragments at
each address of the matrix is deduced.
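The cycle of single-base extension, scanning, and deblocking can be sketched as a small simulation. This is an idealized illustration, not the disclosed apparatus: it assumes complete and error-free incorporation at every address, and the names run_cycles and deduce_template are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def run_cycles(templates: dict, cycles: int) -> dict:
    """Simulate one-base-per-cycle extension at every matrix address.

    templates maps a matrix address to the template bases in the order
    the polymerase encounters them, starting just beyond the primer.
    Returns the bases read at each address, in order of incorporation.
    """
    reads = {address: "" for address in templates}
    for cycle in range(cycles):
        # Elongation: one blocked, labeled analog is added per chain.
        # Scan: the label at each address identifies the added base.
        for address, template in templates.items():
            if cycle < len(template):
                reads[address] += COMPLEMENT[template[cycle]]
        # Label removal and deblocking would occur here; in this model
        # they simply allow the next loop iteration to proceed.
    return reads


def deduce_template(read: str) -> str:
    """Recover the template bases from the incorporated bases."""
    return "".join(COMPLEMENT[base] for base in read)


if __name__ == "__main__":
    reads = run_cycles({(0, 0): "ACGT", (0, 1): "GG"}, cycles=3)
    for address, read in reads.items():
        print(address, read, deduce_template(read))
```

Because every address is extended and scanned in the same cycle, the number of cycles depends on read length and not on the number of clusters on the matrix.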
[0198] (1) Attachment to a surface.
[0199] Both degradative and synthetic sequencing methods begin by
obtaining and immobilizing the target fragments of unknown sequence
to be determined at specific locations on the surface.
[0200] There are several strategies for photo-directed attachment
of the DNA strands to the surface in an orientation appropriate for
sequencing. A caged biotin technique, see, e.g., U.S. Ser. No.
07/435,316 (caged biotin parent); and U.S. Ser. No. 07/612,671
(caged biotin CIP), is available. Another technique that is
especially applicable for the enzymatic synthesis method is to
chemically attach a synthetic oligomer by the 5' end to the entire
surface (see FIG. 6), to activate it for photocrosslinking (with
psoralen, for example) and to anneal the complementary strands and
photocrosslink the target strand of unknown sequence (complementary
to this oligonucleotide at the 3' end) at the specific location
addressed by light. In this case, the oligonucleotide serves as
both the attachment linker and as the synthetic primer. A third
method is to physically transfer individual nucleic acid samples to
selected positions on the matrix, either manually or
automatically.
[0201] Many sequences in each step are attached by cloning the
library into a series of vectors identical except for the sequences
flanking the insert. These primers can be added at the point of
amplification of the cloned DNA with chimeric primers.
[0202] Alternatively, sequences are attached to a matrix substrate
by colony or phage immobilization. This directly transfers the
positional distribution on a petri plate to a usable substrate.
Colonies representing a shotgun collection of sequences (enough to
assure nearly complete coverage by overlap) are spread over (or in)
a nutrient surface at a density to give about 100 or more colonies
or plaques in several square centimeters, and the colonies are
allowed to grow to about 0.1 mm in diameter (the maximum possible
density of clusters at this size is .about.10,000
colonies/cm.sup.2). As described above, replica platings or
successive transfers may allow for preparation of multiple matrices
with identical positional distributions of polymers. Each separate
matrix may then be dedicated to the reactions applicable to a
single monomer.
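The density figure quoted above can be reproduced with a short calculation. The sketch assumes circular clusters arranged edge to edge on a square grid, which is one plausible reading of the stated maximum; the function name is hypothetical.

```python
def max_clusters_per_cm2(diameter_mm: float) -> float:
    """Upper bound on cluster density for a square grid arrangement.

    Assumes each cluster occupies a square cell whose side equals the
    cluster diameter, with no gap between neighboring clusters.
    """
    if diameter_mm <= 0:
        raise ValueError("diameter_mm must be positive")
    clusters_per_cm = 10.0 / diameter_mm  # 10 mm per centimeter
    return clusters_per_cm ** 2


if __name__ == "__main__":
    print(max_clusters_per_cm2(0.1))
```

For clusters 0.1 mm in diameter this gives 100 clusters along each centimeter, or about 10,000 per square centimeter, in agreement with the figure in the text.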
[0203] For example, in the use of a phage library, on a petri dish,
the transfer substrate surface is treated to release DNA from the
phage. This is done, e.g., with CHCl.sub.3 vapor, SDS-NaOH, or by
heating. Prior to release of DNA, the phage particles are often
adsorbed to the surface by way of an antibody to the coat protein
that has been immobilized on the surface. This strategy prevents
diffusion of the phage from the colonies. The matrix surface is
prepared by coating with an oligonucleotide, immobilized to the
surface by one end that has homology with the phage vector DNA
adjacent to the cloning site.
[0204] The matrix surface is juxtaposed to the growth surface, and
the phage DNA is allowed to anneal to the immobilized
oligonucleotide. The growth surface is removed, and the hybrid is
stabilized by psoralen or an equivalent crosslinking reagent.
[0205] This method provides an efficient one-step method of placing
many DNA fragments onto the detection surface in preparation for
sequencing. Although the colonies are not placed in predefined
locations, the random arrangement of the clusters still allows the
final sequence to be assembled by correlating the overlapping
sequence data derived from each of the defined positions of the
target clusters.
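The assembly of a final sequence from overlapping reads can be illustrated with a toy example. This is a minimal sketch only: it uses a simple greedy suffix-prefix merge, which is one common textbook approach and is not stated in the text to be the assembly procedure; the function names and the `min_len` parameter are hypothetical, and the sketch ignores sequencing errors, repeats, and strand orientation.

```python
from typing import List

def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the pair of reads with the longest overlap."""
    contigs = list(dict.fromkeys(reads))      # drop exact duplicates
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs                    # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]

if __name__ == "__main__":
    # Three reads, each imagined as coming from a different cluster position
    print(greedy_assemble(["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]))
```

Each read here stands for the sequence obtained at one positionally defined cluster; the position on the matrix carries no ordering information, so only the overlaps determine the assembly.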
[0206] Sequences are, in other embodiments, attached by a manual or
automated transfer technique. A few cells from each colony in a
library are toothpicked into microtiter wells. The plate is heated
to ~100° C. for a short period to lyse the cells and
release the DNA. The plate is cooled and reagents for cycled
amplification of the DNA using, e.g., PCR technology, are added,
including primers common to all the cloned sequences. See, e.g.,
Innis et al. (1990) PCR Protocols: A Guide to Methods and
Applications, Academic Press, which is hereby incorporated herein
by reference. The DNA is amplified asymmetrically, using unbalanced
primer concentrations, to yield an excess of one strand for
sequencing, and is then attached to a substrate by manual or
automated means.
[0207] An alternative form of automated localization is described
above in positioning of a succession of smaller sized polymers
which are manually or automatically linked to the substrate in a
pattern reflecting sequence overlaps.
[0208] (2) Enzymatic polymerization method.
[0209] The nucleic acid template is, in some embodiments, attached
to the surface by either the 5' or the 3' end, usually by a method
as described above. A preferred method of attachment is to anneal
the template to an oligonucleotide attached to the surface and to
crosslink the template to the oligonucleotide. Oligonucleotide
primers are usually synthesized chemically. In this case, the
immobilized oligonucleotide may also serve as a primer for
polymerization. Because polymerization proceeds 5' to 3' on the
primer, the template will be attached by its 3' end, or a site 3'
proximal to the region to be sequenced, for the purposes of the
description to follow.
[0210] Step 1: A DNA-dependent DNA polymerase such as those used
for conventional DNA sequencing, for example, Klenow fragment of E.
coli DNA Pol I, Sequenase (modified T7 DNA polymerase), Taq (Thermus
aquaticus) DNA polymerase, Bst (Bacillus stearothermophilus) DNA
polymerase, reverse transcriptase (from AMV, MMLV, RSV, etc.), or
other DNA polymerases, and the reaction components appropriate to
the particular DNA polymerase selected, are placed in the
incubation chamber in direct contact with the surface.
[0211] Step 2: Fluorescent chain terminators (analogs of dATP,
dCTP, dGTP, and TTP, each labeled with a fluorophore preferably
emitting at a distinguishable wavelength) are added to the reaction
at a sufficient concentration and under suitable reaction
conditions (time, temperature, pH, ionic species, etc., see
Sambrook et al. (1989) Molecular Cloning, vols. 1-3, and Prober et
al.) to cause essentially all of the chains on the surface to be
extended by one base and thereby terminated. Detection of the
specific label thereby incorporated into each chain identifies the
last base added at each positional address in the matrix.
[0212] Step 3: The chain termination should be reversible by some
means, such as treatment with light, heat, pH, certain other
chemical or biological (enzymatic) reagents, or some combination of
these. Typically the chain termination results from a blocking
moiety which is labile to mild treatment. By one of these means,
the blocked 3' OH of the terminating base must be made available
for chain extension in the next round of polymerization.
[0213] Step 4: There are several suitable labeled terminator
structures, as follows:
[0214] (a) The fluorophore itself functions as the chain terminator
by placement on the 3' hydroxyl through a linkage that is easily
and efficiently cleaved (removing the label and leaving the free 3'
OH) by light, heat, pH shift, etc. The surface is scanned with a
scanning system, e.g., the fluorescence detection system described
in U.S. Ser. No. 07/492,462 (VLSIPS CIP); and U.S. Ser. No. ______,
attorney docket number 11509-28 (automated VLSIPS). Then,
preferably in a single step, the fluorophore is removed and the
chain is activated for the next round of base addition.
[0215] (b) The fluorophore is placed in a position other than the
3' OH of the nucleoside, and a different group is placed on the 3'
OH of the dNTPs to function as a chain terminator. The fluorophore
and the 3' blocking group are removed by the same treatment in a
single step (preferably), or they may be removed in separate
steps.
[0216] (c) An alternative polymer stepwise synthetic strategy can
be employed. In this embodiment, the fluorophores need not be
removable and may be attached to irreversible chain terminators.
Examples of such compounds for use in sequencing DNA include, but
are not limited to, dideoxynucleotide triphosphate analogs as
described by Prober et al. (1987) Science 238:336-341. A second,
unlabeled and reversible, set of terminators is also required.
Examples of these compounds are deoxynucleotide triphosphates with
small blocking groups such as acetyl, tBOC, NBOC and NVOC on the 3'
OH. These groups are easily and efficiently removed under
conditions of high or low pH, exposure to light or heat, etc. After
each round of base addition and detection, the fluorophores are
deactivated by exposure to light under suitable conditions (these
chains have their labeling moiety destroyed and remain terminated,
taking part in no further reactions). The unlabeled, reversible
terminators are unblocked at the 3' OH by the appropriate treatment
to allow chain extension in subsequent rounds of elongation. The
proportion of chains labeled in each round can be controlled by the
concentration ratio of fluorescent to non-fluorescent terminators,
and the reaction can be driven to completion with high
concentrations of the unlabeled terminators.
[0217] (d) A single dye strategy is used where all the base analog
terminators carry the same fluorophore and are added one at a
time: A, C, G, T. The addition of each labeled base analog is
followed by scanning detection. After all four labeled analogs have
been added, reversal of the termination is performed, allowing for
the addition of the next base analog. Thus, each scanning step
determines whether the immediately preceding labeled nucleotide had
been incorporated at each distinct position.
[0218] The structures of the fluorescent labeled and reversible
terminator base analogs are selected to be compatible with
efficient incorporation into the growing chains by the particular
DNA polymerase(s) chosen to catalyze extension. For example, where
two different chain terminators are used, they may be utilized by
two different polymerases that are both present during the chain
extension step.
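The single dye strategy (d) above can be sketched as a decoding rule: because every analog carries the same fluorophore, the base at a position is identified by which of the four sequential additions first raises the signal there. The following is an illustrative sketch only; the function name, the `threshold` value, and the assumption that labels stay in place (so scans are cumulative) until termination is reversed are all assumptions made for the example.

```python
from typing import Optional, Sequence

ORDER = "ACGT"   # order in which the labeled analogs are added

def call_single_dye(scans: Sequence[float],
                    background: float = 0.0,
                    threshold: float = 0.5) -> Optional[str]:
    """Call the base added at one matrix position.

    scans[i] is the fluorescence at this position measured after the
    analog ORDER[i] was added. The called base is the one whose
    addition first raised the signal by more than `threshold`.
    Returns None if no addition produced a signal (extension failure).
    """
    previous = background
    for base, value in zip(ORDER, scans):
        if value - previous > threshold:
            return base
        previous = value
    return None

if __name__ == "__main__":
    # Signal appears only after the third addition (G)
    print(call_single_dye([0.0, 0.1, 1.0, 1.0]))
```

The called base is the analog incorporated into the growing chain, so the template base at that step is its complement.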
[0219] Step 5: An optional step is the permanent capping of chain
extension failures with high concentrations of dideoxynucleotide
triphosphates. This step serves to reduce the background
fluorescence caused by out-of-phase addition of a base to chains
that failed to extend (and therefore to terminate) at an earlier step.
[0220] Step 6: After scanning to determine fluorescence, the
fluorophore is removed or deactivated. Deactivation of the
fluorophore can be achieved by a photodestruction event. The chain
elongation block is reversed (usually by removing a blocking group
to expose the 3' OH) by suitable methods that depend on the
particular base analogs chosen; and the substrate is washed in
preparation for the next round of polymerization.
[0221] Step 7: Repeat the cycle.
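For terminator strategy (c) of Step 4, where a fraction of chains is permanently terminated by the labeled analog in every round, the signal available in later rounds falls off geometrically. The sketch below is a hedged illustration under stated assumptions: every extendable chain is extended in each round, and a constant fraction `f` of them takes up the irreversible fluorescent terminator (in practice `f` would depend on the concentration ratio and on how the polymerase discriminates between the two analogs). The function names are hypothetical.

```python
def extendable_fraction(f: float, rounds_completed: int) -> float:
    """Fraction of the original chains still extendable after some rounds."""
    if not 0.0 < f <= 1.0:
        raise ValueError("f must be in (0, 1]")
    if rounds_completed < 0:
        raise ValueError("rounds_completed must be >= 0")
    return (1.0 - f) ** rounds_completed

def labeled_fraction(f: float, round_number: int) -> float:
    """Fraction of the original chains that are newly labeled in a round.

    Rounds are numbered from 1. A chain can be labeled in round n only
    if it received unlabeled, reversible terminators in rounds 1..n-1.
    """
    if round_number < 1:
        raise ValueError("round_number must be >= 1")
    return f * extendable_fraction(f, round_number - 1)

if __name__ == "__main__":
    for n in (1, 10, 50):
        print(n, labeled_fraction(0.05, n))
```

A small `f` preserves chains for long reads at the cost of weaker signal per round, which is the trade-off controlled by the ratio of fluorescent to non-fluorescent terminators.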
[0222] B. Chain Degradation Method
[0223] This method involves labeling the last base of the chain
(distal to the surface attachment) with a fluorescent tag followed
by base-specific removal. All the polynucleotide clusters on the
matrix are labeled using a standard labeling moiety. Base-specific
removal of the last base of each chain, interspersed with
fluorescence scanning of the array, will reveal the disappearance
of fluorescence and hence the identity of the last base of each
chain. When all four labeled end bases have been removed, the
polymers attached to the matrix are relabeled and the process is
repeated, working successively on the DNA chains.
[0224] Alternatively, if the label allows distinguishing between
different monomers, simpler degradation processes may be employed.
A single scan step can distinguish between all four possible
terminal nucleotides. The four separate removal steps are then
combined into a single nonspecific terminal nucleotide removal
step.
[0225] The DNA will usually be attached to the substrate by the 3'
or 5' terminus depending on the scheme of labeling and cleavage.
Because there are well-known 5'-labeling methods, see, e.g., Gait
(1984) Oligonucleotide Synthesis: A Practical Approach, IRL Press,
Oxford, this discussion will assume the 3' end is attached to the
substrate with the 5' end free.
[0226] Step 1: All the 5'-end bases are labeled with 5'-specific
chemistry, e.g., 5' amino linkage to FITC, Nelson et al. (1989)
Nucl. Acids Res. 17:7179-7186, which is hereby incorporated herein
by reference.
[0227] Step 2: Scan the matrix to obtain the background level.
[0228] Step 3: Optional: Cap all of the labeling failures, e.g.,
polymers whose ends were not labeled.
[0229] Step 4: The terminal A's are removed with end-base,
A-specific reagents (such a reagent may be chemical or biological).
One example is a 5'-fluorescein-dAMP-specific exonuclease made as a
catalytic antibody (see the description above for a scheme of
producing this reagent).
[0230] Step 5: Scan the matrix to detect those chains that had
terminated in A (these will be reduced in fluorescence compared to
the fluorescent labeled background).
[0231] Step 6: Repeat steps 4 and 5 for each of the other three
possible bases using the appropriate fluorescein-base-specific
cleavage reagent and scan after removal of each of the C's, the
G's, and the T's. This succession of steps will allow the
determination of the terminal nucleotide of each positionally
defined cluster.
[0232] Step 7: Relabel the 5' terminal nucleotide of all the new
end bases that have been exposed by the earlier rounds of cleavage,
and repeat the stepwise removal and scanning processes.
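The degradation cycle of Steps 1 through 7 can be sketched as a small simulation. This is an idealized illustration, not a model of the actual chemistry: labeling and base-specific removal are assumed to be complete and perfectly specific, the scan returns 1.0 or 0.0, and all names (`scan`, `sequence_by_degradation`, the `order` parameter) are hypothetical.

```python
from typing import Dict, Hashable

def scan(labeled: Dict[Hashable, bool]) -> Dict[Hashable, float]:
    """Idealized scan: 1.0 where a labeled end is present, else 0.0."""
    return {pos: 1.0 if on else 0.0 for pos, on in labeled.items()}

def sequence_by_degradation(clusters: Dict[Hashable, str], cycles: int,
                            order: str = "ACGT") -> Dict[Hashable, str]:
    """clusters maps a matrix position to a sequence written 5'->3',
    with the 5' end free. Returns the bases called at each position."""
    remaining = dict(clusters)
    calls = {pos: "" for pos in clusters}
    for _ in range(cycles):
        # Steps 1 and 7: (re)label every free 5' end
        labeled = {pos: len(seq) > 0 for pos, seq in remaining.items()}
        before = scan(labeled)                    # Step 2: background scan
        for base in order:
            # Steps 4 and 6: base-specific removal of the labeled end base
            for pos in remaining:
                seq = remaining[pos]
                if labeled[pos] and seq[0] == base:
                    remaining[pos] = seq[1:]
                    labeled[pos] = False          # label leaves with the base
            after = scan(labeled)                 # Step 5: scan again
            for pos in calls:
                if before[pos] - after[pos] > 0.5:
                    calls[pos] += base            # signal lost at this step
            before = after
    return calls

if __name__ == "__main__":
    print(sequence_by_degradation({(0, 0): "GATC", (0, 1): "TTAG"}, cycles=3))
```

Each position is called from the removal step at which its fluorescence disappears, so four removal-and-scan steps are needed per residue with a single label.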
[0233] This approach can be extended to protein sequencing using 20
catalytic antibodies (or other amino acid-specific cleavage
reagents), each recognizing a terminal amino acid and removing that
terminal residue.
[0234] The process for sequencing may be summarized as follows for
enzymatic polymerization:
[0235] 1) Target DNA templates (to be sequenced) are attached at
positionally defined locations on the matrix substrate.
[0236] 2) Fluorescent chain terminators are added to a primer under
conditions where all polymer chains are terminated after addition
of the next base complementary to the template.
[0237] 3) The matrix is scanned to determine which base was added
to each location. This step correlates the added base with a
position on the matrix.
[0238] 4) Chains failing to extend (and therefore to terminate) are
capped.
[0239] 5) The fluorophores are removed or deactivated.
[0240] 6) The terminators are activated for further chain
extension, usually by removal of a blocking group.
[0241] 7) Steps 2 through 6 are repeated to obtain the base-by-base
sequence of many different positionally separated DNA fragments
simultaneously.
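The summarized cycle can be sketched as follows. This is a minimal, idealized simulation under assumptions that are not guaranteed by the text: every chain is extended by exactly one base per cycle, the four labels are perfectly distinguishable, and capping, label removal, and unblocking always succeed. The function name and the data layout are hypothetical.

```python
from typing import Dict, Hashable

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_extension(templates: Dict[Hashable, str],
                          cycles: int) -> Dict[Hashable, str]:
    """Simulate base-by-base reads at many matrix positions in parallel.

    templates maps a matrix position to the template region to be read,
    written 3'->5' starting at the base next to the primer's 3' end.
    Returns, per position, the bases added to the primer (5'->3').
    """
    reads = {pos: "" for pos in templates}
    for cycle in range(cycles):
        # Step 2: one labeled terminator, complementary to the next
        # template base, is added to every chain that can still extend
        added = {pos: COMPLEMENT[t[cycle]]
                 for pos, t in templates.items() if cycle < len(t)}
        # Step 3: the scan correlates each added base with its position
        for pos, base in added.items():
            reads[pos] += base
        # Steps 4-6 (capping, label removal, unblocking) are treated as
        # always successful here, so nothing further needs to be modeled
    return reads

if __name__ == "__main__":
    print(sequence_by_extension({(0, 0): "TACG", (3, 7): "GGCA"}, cycles=4))
```

The read at each position is the complement of its template, accumulated one base per cycle, while all positions advance simultaneously.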
[0242] C. Screening for new nucleotide analog/polymerase
combinations.
[0243] The use of a functional combination of blocked nucleotide
with a polymerase is important in the synthetic embodiment of the
present invention. It is important to ensure that only a single
nucleotide is incorporated at the appropriate step. The following
protocol describes how to screen for a functional combination.
[0244] Test 1. (test for polymerase inhibition)
[0245] In a reaction volume of 20 µl, mix
[0246] 1 µg M13mp19 single stranded DNA template
2.5 ng standard M13 primer (17-mer:
5'-GTTTTCCCAGTCACGAC-3')
[0247] 60 mM tris-Cl pH 8.5
[0248] 7.5 mM MgCl₂
[0249] 75 mM NaCl
[0250] Template and primer are annealed by heating to 95° C.,
then cooling to ~25° C.
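The template and primer amounts in the mixture above correspond to roughly equimolar quantities, which can be verified with a short calculation. The sketch assumes an average mass of about 330 g/mol per nucleotide of single stranded DNA and a template length of 7,249 nucleotides for M13mp19; both figures are assumptions introduced for the example and are not given in the text.

```python
AVG_NT_MASS = 330.0     # g/mol per nucleotide of ssDNA (assumed approximation)
M13MP19_LENGTH = 7249   # nucleotides (assumed length, not stated in the text)
PRIMER_LENGTH = 17      # the 17-mer primer listed above

def pmol_ssdna(mass_ng: float, length_nt: int) -> float:
    """Convert a mass of single stranded DNA (ng) to picomoles."""
    # ng -> g is 1e-9, mol -> pmol is 1e12, net factor 1e3
    return mass_ng * 1000.0 / (length_nt * AVG_NT_MASS)

if __name__ == "__main__":
    template = pmol_ssdna(1000.0, M13MP19_LENGTH)   # 1 µg template
    primer = pmol_ssdna(2.5, PRIMER_LENGTH)         # 2.5 ng primer
    print(f"template {template:.2f} pmol, primer {primer:.2f} pmol, "
          f"ratio {primer / template:.2f}")
```

Under these assumptions both components are present at a few tenths of a picomole, i.e. close to a 1:1 primer to template ratio.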
[0251] Extension components are added:
[0252] 50 µM (each) dATP, dCTP, dGTP, TTP
[0253] 10 µCi P32 dATP
[0254] 0.01 µM to 1 mM of the putative terminator compound;
further titrations may be desired.
[0255] 20 units AMV reverse transcriptase
water to 20 µl final volume
[0256] The reaction is run at 42° C. for about 30 minutes.
[0257] Aliquots are taken at 10, 20, and 30 minutes, and samples are
TCA precipitated after the addition of 10 µg tRNA carrier.
[0258] The precipitates are collected on filters, which are counted
for acid-precipitable radioactivity, and the mass of dATP
incorporated is calculated as a function of reaction time.
[0259] Control reactions are run in parallel, consisting of
[0260] A) no added terminator
[0261] B) 10 µM and 100 µM ddNTPs
[0262] The termination activity of the experimental samples
relative to that of ddNTPs is estimated, and a nucleotide is
appropriate for further testing if it substantially decreases the
number of acid precipitable counts at any time or relative
concentration.
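The calculation behind Test 1 can be sketched as follows. The sketch assumes that precipitated and total counts are measured on aliquots of equal size, that the mass of the radioactive tracer is negligible next to the 50 µM unlabeled dATP, and that inhibition is expressed relative to the no-terminator control; the function names are hypothetical.

```python
def datp_incorporated_pmol(precipitated_cpm: float, total_cpm: float,
                           datp_uM: float = 50.0,
                           volume_ul: float = 20.0) -> float:
    """Estimate pmol of dATP incorporated from acid-precipitable counts.

    The fraction of counts that is acid precipitable is applied to the
    total dATP in the reaction (µM x µl = pmol).
    """
    total_pmol = datp_uM * volume_ul
    return precipitated_cpm / total_cpm * total_pmol

def percent_inhibition(test_cpm: float, control_cpm: float) -> float:
    """Inhibition of incorporation relative to the no-terminator control."""
    return 100.0 * (1.0 - test_cpm / control_cpm)

if __name__ == "__main__":
    print(datp_incorporated_pmol(20_000, 1_000_000))
    print(percent_inhibition(test_cpm=5_000, control_cpm=20_000))
```

Running the same calculation for the ddNTP controls at each time point gives the reference values against which a candidate terminator is compared.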
[0263] Test 2 (test for base specific termination activity)
[0264] Reactions are run essentially as described by Prober et al.
except:
[0265] 1. Unlabelled primer is used
[0266] 2. 1 µCi P32 dATP is included
[0267] 3. No dideoxyNTPs are added to the experimental samples
(control reactions containing ddNTP at the usual concentrations,
and no test terminators, are run in parallel)
[0268] 4. The test compound is added at a concentration estimated
to give 1% and 10% inhibition of incorporation as determined by
test #1.
[0269] The reactions are run for 10 min at 42° C. dNTPs (100 µM)
are then added and the reaction is run for an additional 10 min. A
portion of the reaction is prepared and run on a sequencing gel in
the usual fashion. The ladders obtained with the test compound are
compared with that obtained in the ddNTP reactions and the fidelity
of the termination activity of the test compound is thereby
assessed.
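One way to make the ladder comparison of Test 2 quantitative is to compare the sets of band positions in a given lane. The text only says that the ladders are compared, so the scoring below is an assumption made for illustration; the function name, the representation of bands as integer fragment lengths, and the returned fields are all hypothetical.

```python
from typing import Dict, Set

def ladder_agreement(test_bands: Set[int],
                     control_bands: Set[int]) -> Dict[str, object]:
    """Compare band positions from a test terminator with a ddNTP control.

    Bands are given as fragment lengths in nucleotides for one lane
    (one base-specific terminator).
    """
    shared = test_bands & control_bands
    matched = len(shared) / len(control_bands) if control_bands else 0.0
    return {
        "matched_fraction": matched,                   # control bands reproduced
        "extra": sorted(test_bands - control_bands),   # non-specific stops
        "missing": sorted(control_bands - test_bands), # read-through sites
    }

if __name__ == "__main__":
    control = {21, 25, 30, 38}     # e.g. ddATP lane
    test = {21, 25, 33, 38}        # candidate A-terminator lane
    print(ladder_agreement(test, control))
```

Extra bands would point to termination at the wrong base, while missing bands would point to read-through past a position where the control terminates.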
[0270] IV. Apparatus
[0271] The present invention provides a new use for an apparatus
comprising a reaction chamber and a scanning apparatus which can
scan a substrate material exposed to the chamber. FIG. 11
illustrates a system and a schematized reaction chamber to which is
attached a silicon or glass substrate. The system has a detection
system 102 as illustrated, in one embodiment, in FIG. 7. A silicon
substrate 104 is attached so as to form a seal, making a
reaction chamber 106. Leading into and out of the chamber are tubes
108, with valves 110 which control the entry and exit of reagents
112 which are involved in the stepwise reactions. The chamber is
held at a constant temperature by a temperature block 114.
[0272] All publications and patent applications are herein
incorporated by reference to the same extent as if each individual
publication or patent application was specifically and individually
indicated to be incorporated by reference. The invention now being
fully described, it will be apparent to one of ordinary skill in
the art that many changes and modifications can be made thereto
without departing from the spirit or scope of the claims.
* * * * *