U.S. patent application number 15/691279 was filed with the patent office on 2017-08-30 and published on 2018-03-29 for high throughput nucleic acid sequencing by expansion and related methods.
The applicant listed for this patent is Stratos Genomics Inc. Invention is credited to Mark Stamatios Kokoris, Robert N. McRuer.
Application Number | 20180087103 15/691279 |
Document ID | / |
Family ID | 42396047 |
Publication Date | 2018-03-29 |
United States Patent
Application |
20180087103 |
Kind Code |
A1 |
Kokoris; Mark Stamatios; et al. |
March 29, 2018 |
HIGH THROUGHPUT NUCLEIC ACID SEQUENCING BY EXPANSION AND RELATED
METHODS
Abstract
Nucleic acid sequencing methods and related products and methods
for detection and presentation of the same are disclosed. Methods
for sequencing a target nucleic acid comprise providing a daughter
strand produced by a template-directed synthesis, the daughter
strand comprising a plurality of subunits coupled in a sequence
corresponding to a contiguous nucleotide sequence of all or a
portion of the target nucleic acid, wherein the individual subunits
comprise a tether, at least one probe or nucleobase residue, and at
least one selectively cleavable bond. The selectively cleavable
bond(s) is/are cleaved to yield a surrogate polymer of a length
longer than the plurality of the subunits of the daughter strand,
the surrogate polymer comprising the tethers and reporter elements
for parsing genetic information in a sequence corresponding to the
contiguous nucleotide sequence of all or a portion of the target
nucleic acid. Reporter elements of the surrogate polymer are then
detected. Disclosed methods for detecting the surrogate polymers
comprise nanopore detection and other detection methods suitable
for high-throughput DNA sequencing. Methods for presenting the
surrogate polymer to the detector comprise presenting the surrogate
polymers: 1) in flow, 2) tethered to a solid support, and 3)
aligned on a substrate surface. Corresponding products, including
surrogate polymers and oligomeric and monomeric substrate
constructs are also disclosed.
Inventors: |
Kokoris; Mark Stamatios;
(Bothell, WA) ; McRuer; Robert N.; (Mercer Island,
WA) |
|
Applicant: |
Name |
City |
State |
Country |
Type |
Stratos Genomics Inc. |
Seattle |
WA |
US |
|
|
Family ID: |
42396047 |
Appl. No.: |
15/691279 |
Filed: |
August 30, 2017 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
14590894 | Jan 6, 2015 | 9771614 |
15691279 | | |
14449912 | Aug 1, 2014 | |
14590894 | | |
13146800 | Jul 28, 2011 | |
PCT/US2010/022654 | Jan 29, 2010 | |
14449912 | | |
61148332 | Jan 29, 2009 | |
61148334 | Jan 29, 2009 | |
61148327 | Jan 29, 2009 | |
Current U.S.
Class: |
1/1 |
Current CPC
Class: |
C12P 19/34 20130101;
C12Q 1/6869 20130101; C12Q 1/6869 20130101; C12Q 2565/631 20130101;
C12Q 2525/204 20130101; C12Q 2525/197 20130101 |
International
Class: |
C12Q 1/68 20060101
C12Q001/68; C12P 19/34 20060101 C12P019/34 |
Claims
1-73. (canceled)
74. A method of presenting at least one surrogate polymer for
detection, comprising: a) providing a detector construct, wherein
the detector construct comprises at least one detector element; b)
providing the at least one surrogate polymer, wherein the at least
one surrogate polymer comprises one or more individual reporter
elements; and c) processing the at least one surrogate polymer to
obtain a uniform spatial and temporal spacing of the one or more
individual reporter elements.
75. (canceled)
76. The method of claim 75, wherein the detector construct
comprises a regular array of nanopore channels.
77. The method of claim 74, wherein processing the at least one
surrogate polymer comprises tethering an end of the at least one
surrogate polymer to a solid substrate having at least one binding
site.
78. (canceled)
79. The method of claim 75, wherein processing the at least one
surrogate polymer comprises attaching a charged, linear polymer
having a low molecular weight to an end of the at least one
surrogate polymer.
80. The method of claim 79, wherein the charged, linear polymer is
selected from polyglutamic acid and polyphosphate.
81. The method of claim 75, wherein processing the at least one
surrogate polymer comprises applying a voltage to the at least one
nanopore channel, wherein the voltage is higher than a desired
measurement voltage, and decreasing the voltage to the desired
measurement voltage when a surrogate polymer is detected in the
nanopore channel.
82. The method of claim 81, wherein the voltage is manipulated such
that only one surrogate polymer may occupy the at least one
nanopore channel at a time.
83. The method of claim 75, wherein processing the at least one
surrogate polymer comprises attaching a stop to an end of the at
least one surrogate polymer, wherein the stop prevents the at least
one surrogate polymer from passing through the at least one
nanopore channel and prevents multiple surrogate polymers from
occupying the same nanopore channel, and prefilling the at least
one surrogate polymer in the at least one nanopore channel.
84-87. (canceled)
88. The method of claim 74, wherein processing the at least one
surrogate polymer comprises controlling the flow of the at least
one surrogate polymer toward the detector construct.
89-91. (canceled)
92. The method of claim 88, wherein controlling the flow of the at
least one surrogate polymer comprises: a) providing at least one
gating construct, wherein the at least one gating construct
comprises a first, second, and third electrode; and b) manipulating
an electric field applied independently to the first, second, and
third electrodes to obtain a uniform spatial and temporal spacing
of the one or more individual reporter elements.
93. (canceled)
94. The method of claim 88, wherein controlling the flow of the at
least one surrogate polymer comprises: a) providing at least one
gating construct, wherein the at least one gating construct
comprises a first and second porous electrode and a gating element,
wherein the first and second porous electrodes are affixed to a
first and second side of the gating element, respectively; b)
applying an electric field to the first and second electrodes; and
c) transporting the at least one surrogate polymer through the gate
toward the at least one detector element.
95. The method of claim 94, wherein the gating element is selected
from a porous membrane and a nanohole.
96. The method of claim 94, wherein the electric field is
manipulated to obtain a uniform spatial and temporal spacing of the
one or more individual reporter elements.
97. (canceled)
98. The method of claim 95, wherein the gating element is a porous
membrane.
99. The method of claim 98, wherein the porous membrane comprises
pores from about 20 nm to about 100 nm in diameter.
100. The method of claim 98, wherein the porous membrane is
selected from aluminum oxide and a polymer track-formed
membrane.
101-127. (canceled)
128. The method of claim 74, wherein processing the at least one
surrogate polymer comprises affixing the at least one surrogate
polymer to a solid substrate comprising nanopore channels, wherein
affixing the at least one surrogate polymer to the solid substrate
comprises attaching a stop to an end of the at least one surrogate
polymer, wherein the stop prevents the at least one surrogate
polymer from passing through the at least one nanopore channel and
prevents multiple surrogate polymers from occupying the same
nanopore channel, and prefilling the at least one surrogate polymer
in the at least one nanopore channel.
129. The method of claim 128, wherein the stop is selected from a
bulky dendrimer and a bead.
130. The method of claim 129, wherein the bead is a magnetic bead.
Description
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. application Ser.
No. 14/590,894, filed Jan. 6, 2015, which is a continuation of U.S.
application Ser. No. 14/449,912, filed Aug. 1, 2014, which is a
continuation of U.S. application Ser. No. 13/146,800, a U.S.
national stage entry of PCT/US2010/022654 filed Jan. 29, 2010,
which claims the benefit under 35 U.S.C. § 119(e) of U.S.
Provisional Patent Application No. 61/148,332 filed on Jan. 29,
2009; U.S. Provisional Patent Application No. 61/148,334 filed on
Jan. 29, 2009; and U.S. Provisional Patent Application No.
61/148,327 filed on Jan. 29, 2009; all of which are incorporated
herein by reference in their entireties.
STATEMENT REGARDING SEQUENCE LISTING
[0002] The Sequence Listing associated with this application is
provided in text format in lieu of a paper copy, and is hereby
incorporated by reference into the specification. The name of the
text file containing the Sequence Listing is
870225_403C3_SEQUENCE_LISTING.txt. The text file is 1 KB, was
created on Dec. 5, 2017, and is being submitted electronically via
EFS-web.
BACKGROUND
Technical Field
[0003] This invention is generally related to nucleic acid
sequencing, as well as methods and products relating to the
same.
Description of the Related Art
[0004] Nucleic acid sequences encode the necessary information for
living things to function and reproduce, and are essentially a
blueprint for life. Determining such sequences is therefore a tool
useful in pure research into how and where organisms live, as well
as in applied sciences such as drug development. In medicine,
sequencing tools can be used for diagnosis and to develop
treatments for a variety of pathologies, including cancer, heart
disease, autoimmune disorders, multiple sclerosis, or obesity. In
industry, sequencing can be used to design improved enzymatic
processes or synthetic organisms. In biology, such tools can be
used to study the health of ecosystems, for example, and thus have
a broad range of utility.
[0005] An individual's unique DNA sequence provides valuable
information concerning their susceptibility to certain diseases.
The sequence will provide patients with the opportunity to screen
for early detection and to receive preventative treatment.
Furthermore, given a patient's individual blueprint, clinicians
will be capable of administering personalized therapy to maximize
drug efficacy and to minimize the risk of an adverse drug response.
Similarly, determining the blueprint of pathogenic organisms can
lead to new treatments for infectious diseases and more robust
pathogen surveillance. Whole genome DNA sequencing will provide the
foundation for modern medicine.
[0006] DNA sequencing is the process of determining the order of
the chemical constituents of a given DNA polymer. These chemical
constituents, which are called nucleotides, exist in DNA in four
common forms: deoxyadenosine (A), deoxyguanosine (G), deoxycytidine
(C), and deoxythymidine (T). Sequencing of a diploid human genome
requires determining the sequential order of approximately 6
billion nucleotides.
[0007] Currently, most DNA sequencing is performed using the chain
termination method developed by Frederick Sanger. This technique,
termed Sanger Sequencing, uses sequence specific termination of DNA
synthesis and fluorescently modified nucleotide reporter substrates
to derive sequence information. This method sequences a target
nucleic acid strand, or read length, of up to 1000 bases long by
using a modified Polymerase Chain Reaction (PCR). In this modified
reaction the sequencing is randomly interrupted at select base
types (A, C, G or T) and the lengths of the interrupted sequences
are determined by capillary gel electrophoresis. The length then
determines what base type is located at that length. Many
overlapping read lengths are produced and their sequences are
overlaid using data processing to determine the most reliable fit
of the data. This process of producing read lengths of sequence is
very laborious and expensive and is now being superseded by new
methods that have higher efficiency.
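The length-to-base principle described above can be illustrated with a toy sketch. It assumes an idealized ladder in which every terminated fragment is reported as a (length, terminating base) pair with no gaps or noise; the function name `read_ladder` and the sample data are hypothetical and are not taken from this application.

```python
def read_ladder(fragments):
    """Reconstruct a read from idealized chain-termination fragments.

    Each fragment is a (length, base) pair: the fragment length gives the
    position, and the terminating base type gives the base at that position.
    Assumes exactly one fragment per position (an idealization).
    """
    ordered = sorted(fragments)  # order fragments by length, shortest first
    lengths = [length for length, _ in ordered]
    if lengths != list(range(1, len(ordered) + 1)):
        raise ValueError("ladder has missing or duplicate fragment lengths")
    return "".join(base for _, base in ordered)


# Hypothetical ladder for the five-base read "GATTC", supplied out of order.
ladder = [(3, "T"), (1, "G"), (5, "C"), (2, "A"), (4, "T")]
print(read_ladder(ladder))  # GATTC
```

In practice many overlapping reads of this kind are overlaid by data processing, as the paragraph notes; that assembly step is outside the scope of this sketch.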
[0008] The Sanger method was used to provide most of the sequence
data in the Human Genome Project which generated the first complete
sequence of the human genome. This project took over 10 years and
nearly $3B to complete. Given these significant throughput and cost
limitations, it is clear that DNA sequencing technologies will need
to improve drastically in order to achieve the stated goals put
forth by the scientific community. To that end, a number of second
generation technologies, which far exceed the throughput and cost
per base limitations of Sanger sequencing, are gaining an
increasing share of the sequencing market. Still, these "sequencing
by synthesis" methods fall short of achieving the throughput, cost,
and quality targets required by markets such as whole genome
sequencing for personalized medicine.
[0009] For example, 454 Life Sciences is producing instruments
(e.g., the Genome Sequencer) that can process 100 million bases in
7.5 hours with an average read length of 200 nucleotides. Their
approach uses a variation of PCR to produce a homogeneous colony of
target nucleic acid, hundreds of bases in length, on the surface of
a bead. This process is termed emulsion PCR. Hundreds of thousands
of such beads are then arranged on a "picotiter plate". The plate
is then prepared for an additional sequencing whereby each nucleic
acid base type is sequentially washed over the plate. Beads with
target that incorporate the base produce a pyrophosphate byproduct
that can be used to catalyze a light producing reaction that is
then detected with a camera.
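The wash-and-detect cycle described above can be sketched as a toy simulation. This is a minimal sketch under stated assumptions, not the instrument's actual signal processing: the flow order `TACG`, the function names `flowgram` and `decode_flowgram`, and the idealization that each flow's light signal equals the number of bases incorporated are all introduced here for illustration.

```python
from itertools import cycle


def flowgram(read, flow_order="TACG"):
    """Simulate idealized per-flow signals for the strand being synthesized.

    Each flow washes one base type over the bead. The signal is assumed to
    equal the number of consecutive bases of that type incorporated (0 if none).
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by the flow order: {sorted(unknown)}")
    signals = []
    pos = 0
    flows = cycle(flow_order)
    while pos < len(read):
        base = next(flows)
        run = 0
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


def decode_flowgram(signals):
    """Rebuild the read by repeating each flowed base by its signal value."""
    return "".join(base * run for base, run in signals)


signals = flowgram("TTAG")
print(signals)                   # [('T', 2), ('A', 1), ('C', 0), ('G', 1)]
print(decode_flowgram(signals))  # TTAG
```

Real signals are noisy, so runs of identical bases are harder to count than this idealized model suggests.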
[0010] Illumina Inc. has a similar process that uses reversibly
terminating nucleotides and fluorescent labels to perform nucleic
acid sequencing. The average read length for Illumina's 1G Analyzer
is less than 40 nucleotides. Instead of using emulsion PCR to
amplify sequence targets, Illumina has an approach for amplifying
PCR colonies on an array surface. Both the 454 and Illumina
approaches use a complicated polymerase amplification to increase
signal strength, perform base measurements during the rate limiting
sequence extension cycle, and have limited read lengths because of
incorporation errors that degrade the measurement signal to noise
proportionally to the read length.
[0011] Applied Biosystems uses reversible terminating ligation
rather than sequencing-by-synthesis to read the DNA. Like 454's
Genome Sequencer, the technology uses bead-based emulsion PCR to
amplify the sample. Since the majority of the beads do not carry
PCR products, the researchers next use an enrichment step to select
beads coated with DNA. The biotin-coated beads are spread and
immobilized on a glass slide array covered with streptavidin. The
immobilized beads are then run through a process of 8-mer probe
hybridization (each labeled with four different fluorescent dyes),
ligation, and cleavage (between the 5th and 6th bases to create a
site for the next round of ligation). Each probe interrogates two
bases, at positions 4 and 5 using a 2-base encoding system, which
is recorded by a camera. Similar to Illumina's approach, the
average read length for Applied Biosystems' SOLiD platform is less
than 40 nucleotides.
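The idea of a two-base encoding can be illustrated with a short sketch. The paragraph above does not specify the mapping from base pairs to dyes, so the XOR-based four-color mapping below is an assumption (a commonly described form of two-base color encoding), and the names `encode_colors` and `decode_colors` are hypothetical.

```python
# Assumed 2-bit codes for the four bases; the color of an adjacent pair is
# the XOR of the two codes, giving four possible colors (0-3).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = {value: key for key, value in CODE.items()}


def encode_colors(seq):
    """Return one color per adjacent base pair of the sequence."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Recover the bases from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)


colors = encode_colors("ATGGC")
print(colors)                      # [3, 1, 0, 3]
print(decode_colors("A", colors))  # ATGGC
```

Because each base contributes to two adjacent colors, decoding needs one known starting base, which is why `decode_colors` takes `first_base` as an argument.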
[0012] Other approaches are being developed to avoid the time and
expense of the polymerase amplification step by measuring single
molecules of DNA directly. Visigen Biotechnologies, Inc. is
measuring fluorescently labeled bases as they are sequenced by
incorporating a second fluorophore into an engineered DNA
polymerase and using Förster Resonance Energy Transfer (FRET) for
nucleotide identification. This technique is faced with the
challenges of separating the signals of bases that are separated by
less than a nanometer and by a polymerase incorporation action that
will have very large statistical variation.
[0013] A process being developed by LingVitae sequences cDNA
inserted into immobilized plasmid vectors. The process uses a Class
IIS restriction enzyme to cleave the target nucleic acid and ligate
an oligomer into the target. Typically, one or two nucleotides in
the terminal 5' or 3' overhang generated by the restriction enzyme
determine which of a library of oligomers in the ligation mix will
be added to the sticky, cut end of the target. Each oligomer
contains "signal" sequences that uniquely identify the
nucleotide(s) it replaces. The process of cleavage and ligation is
then repeated. The new molecule is then sequenced using tags
specific for the various oligomers. The product of this process is
termed a "Design Polymer" and always consists of a nucleic acid
longer than the one it replaces (e.g., a dinucleotide target
sequence is replaced by a "magnified" polynucleotide sequence of as
many as 100 base pairs). An advantage of this process is that the
duplex product strand can be amplified if desired. A disadvantage
is that the process is necessarily cyclical and the continuity of
the template would be lost if simultaneous multiple restriction
cuts were made.
[0014] U.S. Pat. No. 7,060,440 to Kless describes a sequencing
process that involves incorporating oligomers by polymerization
with a polymerase. A modification of the Sanger method, with
end-terminated oligomers as substrates, is used to build sequencing
ladders by gel electrophoresis or capillary chromatography. While
coupling of oligomers by end ligation is well known, the use of a
polymerase to couple oligomers in a template-directed process was
utilized to new advantage.
[0015] Polymerization techniques are expected to grow in power as
modified polymerases (and ligases) become available through genetic
engineering and bioprospecting, and methods for elimination of
exonuclease activity by polymerase modification are already known.
For example, Published U.S. Patent Application 2007/0048748 to
Williams describes the use of mutant polymerases for incorporating
dye-labeled and other modified nucleotides. Substrates for these
polymerases also include γ-phosphate labeled nucleotides.
Both increased speed of incorporation and reduction in error rate
were found with chimeric and mutant polymerases.
[0016] In addition, a large effort has been made by both academic
and industrial teams to sequence native DNA using non-synthetic
methods. For example, Agilent Technologies, Inc. along with
university collaborators are developing a single molecule detection
method that threads the DNA through a nanopore to make measurements
as it passes through. As with Visigen and LingVitae, this method
must overcome the problem of efficiently and accurately obtaining
distinct signals from individual nucleobases separated by
sub-nanometer dimensions, as well as the problem of developing
reproducible pore sizes of similar size. As such, direct sequencing
of DNA by detection of its constituent parts has yet to be achieved
in a high-throughput process due to the small size of the
nucleotides in the chain (about 4 Angstroms center-to-center) and
the corresponding signal to noise and signal resolution limitations
therein. Direct detection is further complicated by the inherent
secondary structure of DNA, which does not easily elongate into a
perfectly linear polymer.
[0017] Methods which overcome the spatial resolution challenges of
high-throughput DNA sequencing have been disclosed in Published PCT
Applications WO 2008/157696 and WO 2009/055617. WO 2008/157696
describes a method of sequencing by expansion. A daughter strand
comprising internucleotide tethers, reporter groups, and cleavable
internucleotide bonds is produced by template directed synthesis.
The internucleotide bonds are then cleaved producing an oligomer
having a length longer than the length of the target nucleic acid.
The longer length of oligomer results in better detection
resolution than is possible with the shorter target nucleic
acid.
[0018] In a related method, WO 2009/055617 discloses a method
wherein a daughter strand comprising reporter groups which encode
less than the entire genomic sequence of the target nucleic acid is
produced by template directed synthesis. The reduced reporter
content results in better resolution than that obtained with higher
reporter content methods.
[0019] While significant advances have been made in the field of
DNA sequencing, there continues to be a need in the art for new and
improved methods of sequencing DNA and for related methods of
detection and presentation of DNA oligomers. The present invention
fulfills these needs and provides further related advantages.
BRIEF SUMMARY
[0020] In general terms, methods and corresponding devices,
products and kits are disclosed that overcome the spatial
resolution, presentation, detection, and other challenges presented
by existing high throughput nucleic acid sequencing techniques.
[0021] In one embodiment, this is achieved by either encoding all
the base sequence information of a target nucleic acid on a first
surrogate polymer (referred to herein as an "Xpandomer") or
encoding only a subset of the base sequence information of the
target nucleic acid on a second surrogate polymer (referred to
herein as an "S-Xpandomer"). The surrogate polymers (Xdaughter
strands and S-Xdaughter strands, respectively) are of extended
length making them easier to detect. The Xpandomers and
S-Xpandomers are formed by template dependent replication of a DNA
target in which a plurality of subunits (referred to herein as
Xmers and S-Xmers, respectively) are serially connected. Such
synthesis preserves the original genetic information of the target
nucleic acid, while also increasing linear separation of the
individual elements of the sequence data.
[0022] In one embodiment, a method is disclosed for sequencing a
target nucleic acid, comprising: a) providing an Xdaughter strand
produced by a template-directed synthesis, the daughter strand
comprising a plurality of subunits coupled in a sequence
corresponding to a contiguous nucleotide sequence of all or a
portion of the target nucleic acid, wherein the individual subunits
comprise a tether, at least one probe or nucleobase residue, and at
least one selectively cleavable bond; b) cleaving the at least one
selectively cleavable bond to yield an Xpandomer of a length longer
than the plurality of the subunits of the daughter strand, the
Xpandomer comprising the tethers and reporter elements for parsing
genetic information in a sequence corresponding to the contiguous
nucleotide sequence of all or a portion of the target nucleic acid;
and c) detecting the reporter elements of the Xpandomer.
[0023] Examples of specific embodiments as well as methods of
making and using Xpandomers are disclosed in more detail in
Published PCT WO2008/157696.
[0024] In another embodiment, a method is provided for sequencing a
target nucleic acid, comprising:
[0025] a) providing an S-Xdaughter strand produced by a
template-directed synthesis, the daughter strand comprising a
plurality of subunits coupled in a sequence corresponding to a
contiguous nucleotide sequence of all or a portion of the target
nucleic acid, wherein the individual subunits comprise a tether, at
least one probe, and at least one selectively cleavable bond, the
at least one probe comprising X nucleobase residues (with X being a
positive integer greater than one) and at least one reporter
construct that encodes the genetic information of Y nucleobase
residue(s) of the probe (with Y being a positive integer less than
X);
[0026] b) cleaving the at least one selectively cleavable bond to
yield an S-Xpandomer of a length longer than the plurality of the
subunits of the S-Xdaughter strand, the S-Xpandomer comprising the
tethers and reporter elements for determining Y nucleobase(s) every
X nucleobases; and
[0027] c) detecting the at least one reporter construct to decode
the genetic information of Y nucleobase(s) every X nucleobases of
the daughter strand.
[0028] Since Y is less than X, only a fraction of the nucleotide
bases of the target nucleic acid are detected. For example, and for
illustration only, when X is 4 and Y is 1, the reporter constructs
are detected to determine 1 nucleobase every 4 nucleobases of the
daughter strand. Since the daughter strand comprises a plurality of
subunits coupled in a sequence corresponding to a contiguous
nucleotide sequence of all or a portion of the target nucleic acid,
1 of every 4 nucleobases of the target nucleic acid is sequenced.
In many instances, detection of "Y of every X" nucleobases (e.g., 1
of every 4, or every 4th, nucleobase) in the target nucleic
acid is sufficient for sequencing purposes. Alternatively, and if
desired, template-dependent replication of the target nucleic acid
using a plurality (e.g., library) of probe constructs may be
employed to produce additional S-Xpandomers for detection, thus
identifying the remaining interlaced target nucleobases in a
similar manner.
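The "Y of every X" readout can be sketched with a toy model. The sketch assumes that probes tile the target contiguously, that the reported bases are the first Y bases of each X-base probe, and that additional S-Xpandomers correspond to shifted reading frames; these choices, and the names `reported_calls` and `interlace`, are hypothetical and are not specified in this application.

```python
def reported_calls(target, x=4, y=1, offset=0):
    """Return {position: base} for the Y reported bases of each X-base probe.

    Probes are assumed to tile the target contiguously starting at `offset`,
    with the first `y` bases of each probe being the ones that are reported.
    """
    if not 0 < y < x:
        raise ValueError("Y must be a positive integer less than X")
    calls = {}
    for start in range(offset, len(target), x):
        for i in range(start, min(start + y, len(target))):
            calls[i] = target[i]
    return calls


def interlace(length, *call_sets):
    """Merge call sets into one string, with '.' marking undetermined bases."""
    merged = {}
    for calls in call_sets:
        merged.update(calls)
    return "".join(merged.get(i, ".") for i in range(length))


target = "ACGTTGCAAGCT"  # hypothetical 12-base target
single = reported_calls(target, x=4, y=1)
print(interlace(len(target), single))  # A...T...A...

# Four interlaced frames (offsets 0-3) together cover every position.
frames = [reported_calls(target, x=4, y=1, offset=k) for k in range(4)]
print(interlace(len(target), *frames))  # ACGTTGCAAGCT
```

With X = 4 and Y = 1, a single pass determines one base in four; the second call shows how additional interlaced passes could fill in the remaining positions.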
[0029] In a further embodiment of the above method, the target
nucleic acid is produced by a template-directed rolling circle
polymerization process.
[0030] In other further embodiments, the template directed
synthesis comprises a ligation reaction. For example, the ligation
reaction may comprise an enzymatic ligation reaction.
[0031] In yet other further embodiments, the at least one reporter
construct is associated with:
[0032] the tethers of the S-Xpandomer;
[0033] the S-Xdaughter strand prior to cleavage of the at least one
selectively cleavable bond; or
[0034] the S-Xpandomer after cleavage of the at least one
selectively cleavable bond.
[0035] In further embodiments, the at least one reporter construct
is attached to the S-Xdaughter strand after template-directed
synthesis thereof.
[0036] In other further embodiments, the tether is attached to the
S-Xdaughter strand after template-directed synthesis thereof.
[0037] In other further embodiments, the S-Xpandomer further
comprises all or a portion of the at least one probe. For example,
in one embodiment, the at least one reporter construct is or is
associated with the at least one probe.
[0038] The S-Xpandomer comprises a plurality of subunits coupled in
a sequence corresponding to a contiguous nucleotide sequence of all
or a portion of the target nucleic acid. Thus, in a further
embodiment, the S-Xpandomer comprises the following structure:
##STR00001##
[0039] wherein
[0040] T represents the tether;
[0041] P¹ represents a first probe moiety;
[0042] P² represents a second probe moiety;
[0043] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than three; and
[0044] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid;
##STR00002##
[0045] wherein
[0046] T represents the tether;
[0047] P¹ represents a first probe moiety;
[0048] P² represents a second probe moiety;
[0049] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than three;
[0050] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid; and
[0051] χ represents a bond with the tether of an adjacent subunit;
##STR00003##
[0052] wherein
[0053] T represents the tether;
[0054] P¹ represents a first probe moiety;
[0055] P² represents a second probe moiety;
[0056] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than three;
[0057] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid; and
[0058] χ represents a bond with the tether of an adjacent subunit;
##STR00004##
[0059] wherein
[0060] T represents the tether;
[0061] P¹ represents a first probe moiety;
[0062] P² represents a second probe moiety;
[0063] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than three;
[0064] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid; and
[0065] χ represents a bond with the tether of an adjacent subunit;
##STR00005##
[0066] wherein
[0067] T represents the tether;
[0068] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than three;
[0069] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid; and
[0070] χ represents a bond with the tether of an adjacent subunit;
##STR00006##
[0071] wherein
[0072] T represents the tether;
[0073] N represents a nucleobase residue;
[0074] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than ten;
[0075] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid; and
[0076] χ represents a bond with the tether of an adjacent subunit;
##STR00007##
[0077] wherein
[0078] T represents the tether;
[0079] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than ten;
[0080] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid; and
[0081] χ represents a bond with the tether of an adjacent subunit;
##STR00008##
[0082] wherein
[0083] T represents the tether;
[0084] N represents a nucleobase residue;
[0085] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than ten;
[0086] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid; and
[0087] χ represents a bond with the tether of an adjacent subunit;
##STR00009##
[0088] wherein
[0089] T represents the tether;
[0090] N represents a nucleobase residue;
[0091] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than ten;
[0092] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid;
[0093] χ¹ represents a bond with the tether of an adjacent subunit;
and
[0094] χ² represents an inter-tether bond;
or
##STR00010##
[0095] wherein
[0096] T represents the tether;
[0097] n¹ and n² represent a first portion and a second portion,
respectively, of a nucleobase residue;
[0098] κ represents the κth subunit in a chain of m subunits, where m
is an integer greater than ten; and
[0099] α represents a species of a subunit motif selected from a
library of subunit motifs, wherein each of the species comprises
sequence information of the contiguous nucleotide sequence of a
portion of the target nucleic acid.
[0100] In a further embodiment, the S-Xdaughter strand is formed
from a plurality of oligomer substrate constructs having the
following structure:
##STR00011##
[0101] wherein
[0102] T represents the tether;
[0103] P¹ represents a first probe moiety;
[0104] P² represents a second probe moiety;
[0105] ~ represents the at least one selectively cleavable bond; and
[0106] R¹ and R² represent the same or different end groups for the
template directed synthesis of the daughter strand;
##STR00012##
[0107] wherein [0108] T represents the tether; [0109] P.sup.1
represents a first probe moiety; [0110] P.sup.2 represents a second
probe moiety; [0111] R.sup.1 and R.sup.2 represent the same or
different end groups for the template directed synthesis of the
daughter strand; [0112] .epsilon. represents a first linker group;
[0113] .delta. represents a second linker group; and [0114] "- - -
-" represents a cleavable intra-tether crosslink;
##STR00013##
[0115] wherein [0116] T represents the tether; [0117] P.sup.1
represents a first probe moiety; [0118] P.sup.2 represents a second
probe moiety; [0119] R.sup.1 and R.sup.2 represent the same or
different end groups for the template directed synthesis of the
daughter strand; [0120] .epsilon. represents a first linker group;
[0121] .delta. represents a second linker group; and [0122] "- - -
-" represents a cleavable intra-tether crosslink;
##STR00014##
[0123] wherein [0124] T represents the tether; [0125] P.sup.1
represents a first probe moiety; [0126] P.sup.2 represents a second
probe moiety; [0127] .about. represents the at least one
selectively cleavable bond; [0128] R.sup.1 and R.sup.2 represent
the same or different end groups for the template directed
synthesis of the daughter strand; [0129] .epsilon. represents a
first linker group; and [0130] .delta. represents a second linker
group;
##STR00015##
[0131] wherein [0132] T represents the tether; [0133] P.sup.1
represents a first probe moiety; [0134] P.sup.2 represents a second
probe moiety; [0135] .about. represents the at least one
selectively cleavable bond; [0136] R.sup.1 and R.sup.2 represent
the same or different end groups for the template directed
synthesis of the daughter strand; [0137] .epsilon. represents a
first linker group; and [0138] .delta. represents a second linker
group;
##STR00016##
[0139] wherein [0140] T represents the tether; [0141] N represents
a nucleobase residue; [0142] R.sup.1 and R.sup.2 represent the same
or different end groups for the template directed synthesis of the
daughter strand; [0143] .epsilon. represents a first linker group;
[0144] .delta. represents a second linker group; and [0145] "- - -
-" represents a cleavable intra-tether crosslink;
##STR00017##
[0146] wherein [0147] T represents the tether; [0148] N represents
a nucleobase residue; [0149] R.sup.1 and R.sup.2 represent the same
or different end groups for the template directed synthesis of the
daughter strand; [0150] .about. represents the at least one
selectively cleavable bond; [0151] .epsilon. represents a first
linker group; [0152] .delta. represents a second linker group; and
[0153] "- - - -" represents a cleavable intra-tether crosslink;
##STR00018##
[0154] wherein [0155] T represents the tether; [0156] N represents
a nucleobase residue; [0157] R.sup.1 and R.sup.2 represent the same
or different end groups for the template directed synthesis of the
daughter strand; [0158] .epsilon. represents a first linker group;
[0159] .delta. represents a second linker group; and [0160] "- - -
-" represents a cleavable intra-tether crosslink;
##STR00019##
[0161] wherein [0162] T represents the tether; [0163] N represents
a nucleobase residue; [0164] R.sup.1 and R.sup.2 represent the same
or different end groups for the template directed synthesis of the
daughter strand; [0165] .epsilon..sub.1 and .epsilon..sub.2
represent the same or different first linker groups; [0166]
.delta..sub.1 and .delta..sub.2 represent the same or different
second linker groups; and [0167] "- - - -" represents a cleavable
intra-tether crosslink; or
##STR00020##
[0168] wherein [0169] T represents the tether; [0170] N represents
a nucleobase residue; [0171] V represents an internal cleavage site
of the nucleobase residue; and [0172] R.sup.1 and R.sup.2 represent
the same or different end groups for the template directed
synthesis of the daughter strand.
[0173] In some further embodiments, R.sup.1 and R.sup.2 are
selected from hydroxyl, phosphate, and triphosphate.
[0174] In other further embodiments, the target nucleic acid is
produced by a rolling circle polymerization process.
[0175] In another embodiment, the present disclosure provides a
method for sequencing a target nucleic acid, the method
comprising:
[0176] a) providing a paired-end daughter strand produced by a
bidirectional template-directed synthesis, the paired-end daughter
strand comprising a first and second sequence region joined to a
first and second end of a primer region, respectively, each
sequence region independently comprising at least 10 nucleobase
residues coupled in a sequence corresponding to a contiguous
nucleotide sequence of all or a portion of the target nucleic
acid;
[0177] b) using the paired-end daughter strand as the analyte input
to sequence at least 10 nucleobase residues of each of the first
and second probe regions to decode the genetic information of the
target nucleic acid.
[0178] In further embodiments of the foregoing, the bidirectional
template-directed synthesis comprises a ligation reaction. For
example, in some embodiments, the ligation reaction is an enzymatic
ligation reaction. In other embodiments, the bidirectional
template-directed synthesis comprises a polymerase reaction from
the 3' end of the primer region.
[0179] In other embodiments, the present disclosure provides a
method for sequencing a target nucleic acid, the method
comprising:
[0180] a) providing a surrogate polymer paired-end daughter strand
produced by a bidirectional template-directed synthesis, the
surrogate polymer paired-end daughter strand having a first and
second probe region joined to a first and second end of a primer
region, respectively, the first and second probe regions comprising
a plurality of surrogate polymer substrates coupled in a sequence
corresponding to a contiguous nucleotide sequence of all or a
portion of the target nucleic acid, wherein the individual
surrogate polymer substrates comprise a tether, at least one probe,
and at least one selectively cleavable bond, the individual probe
comprising X nucleobase residues (with X being a positive integer
greater than one) and at least one reporter element that encodes Y
nucleobase residue(s) (with Y being a positive integer of at least
one and up to a maximum of X);
[0181] b) cleaving the at least one selectively cleavable bond to
yield a paired-end surrogate polymer of a length longer than the
plurality of the surrogate polymer substrates of the daughter
strand, the paired-end surrogate polymer comprising the tethers and
reporter elements for determining Y nucleobase(s) every X
nucleobases; and
[0182] c) detecting the reporter elements to determine Y
nucleobase(s) every X nucleobases of the paired-end daughter
strand.
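As a rough illustration of step c), the following sketch lays out a partial sequence in which Y nucleobases are determined for every X nucleobases. It assumes a hypothetical readout in which each probe contributes one already-decoded string of Y bases. The function name, the convention that the decoded bases occupy the first Y positions of each probe, and the use of "N" for undetermined positions are assumptions made for illustration and are not taken from the disclosure.

```python
def assemble_partial_sequence(reporter_reads, x, y, unknown="N"):
    """Place Y decoded bases per X-base probe; pad the rest with `unknown`.

    reporter_reads: list of strings, one per probe, each of length Y
                    (hypothetical, already-decoded reporter output).
    x: number of nucleobase residues per probe (greater than one).
    y: number of nucleobases encoded per probe (1 <= y <= x).
    """
    if x < 2:
        raise ValueError("X must be an integer greater than one")
    if not (1 <= y <= x):
        raise ValueError("Y must be at least one and at most X")
    pieces = []
    for read in reporter_reads:
        if len(read) != y:
            raise ValueError(f"expected {y} base(s) per probe, got {read!r}")
        # Assumption: the encoded bases are the leading positions of the probe.
        pieces.append(read + unknown * (x - y))
    return "".join(pieces)


if __name__ == "__main__":
    # Three probes of X = 4 nucleobases, each reporting Y = 2 of them.
    print(assemble_partial_sequence(["AC", "GT", "TT"], x=4, y=2))
```

When Y equals X, every position is determined and no padding is added.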
[0183] In further embodiments of the foregoing, the bidirectional
template-directed synthesis comprises a ligation reaction. For
example, in some embodiments, the ligation reaction is an enzymatic
ligation reaction. In other embodiments, the bidirectional
template-directed synthesis comprises a polymerase reaction from
the 3' end of the primer region.
[0184] In other embodiments, the present disclosure provides a
method for producing a paired-end nucleic acid comprising a first
and second region joined to a first and second end of a primer
region, respectively, wherein the first and second regions
independently comprise at least 4 oligonucleotides, the method
comprising:
[0185] a) providing a primer adapter, wherein the primer adapter
comprises a region complementary, or near complementary, to a
primer;
[0186] b) providing the primer, wherein the primer comprises a 5'
phosphate end and a 3' hydroxyl end;
[0187] c) duplexing the primer to the primer adapter; and
[0188] d) extending the primer from both the 5' end and the 3' end,
wherein extending comprises ligating at least 4 oligonucleotides to
the 5' end of the primer.
[0189] In other embodiments of the foregoing, the paired-end
nucleic acid is a paired-end surrogate polymer daughter strand, and
the nucleotides or oligonucleotides comprise Xprobes or S-Xprobes.
In other embodiments, the primer adapter circularizes a target
nucleic acid. In yet other embodiments, the primer adapter further
comprises a tether, wherein the tether is optionally attached to a
solid substrate. In yet other embodiments, ligating comprises an
enzymatic ligation reaction. In other embodiments, extending the
primer further comprises ligating from the 3' end of the primer. In
yet other embodiments, extending the primer further comprises a
polymerase reaction extending from the 3' end of the primer.
[0190] In other embodiments, the present disclosure provides a
paired-end surrogate polymer comprising a first probe region and a
second probe region, the first and second probe regions joined to a
first and second end of a primer region, respectively, the first
and second probe regions comprising a plurality of subunits coupled
in a sequence corresponding to a contiguous nucleotide sequence of
all or a portion of a target nucleic acid. In some embodiments, the
paired-end surrogate polymer is produced by the method described
above. In other embodiments, the first and second probe regions
independently comprise 4 or more probes.
[0191] The present disclosure also provides a kit comprising a
plurality of constructs (i.e., either Xmers or S-Xmers with the
appropriate R1/R2 end groups) for forming a surrogate polymer
daughter strand by a template-directed synthesis, wherein the kit
optionally comprises appropriate instructions for use of the same
in forming a surrogate polymer daughter strand. In some
embodiments, the kit comprises from 10 to 65000 unique members.
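One way to get a feel for the stated range of 10 to 65000 unique members is to count distinct probe sequences. The sketch below assumes, purely for illustration, that each unique member corresponds to one probe sequence over the four canonical bases, so that a probe of X nucleobases has 4 to the power X possible sequences. The disclosure does not state that the kit size is derived this way, and library_size is a hypothetical helper.

```python
CANONICAL_BASES = "ACGT"  # assumption: a four-letter nucleobase alphabet


def library_size(probe_length, alphabet=CANONICAL_BASES):
    """Count distinct sequences of `probe_length` residues over `alphabet`."""
    if probe_length < 1:
        raise ValueError("probe_length must be a positive integer")
    return len(alphabet) ** probe_length


if __name__ == "__main__":
    for x in range(2, 9):
        size = library_size(x)
        position = "within" if 10 <= size <= 65000 else "outside"
        print(f"X = {x}: {size} sequences ({position} the 10 to 65000 range)")
```

Under these assumptions, probe lengths of 2 through 7 give counts inside the stated range, while a length of 8 gives 65536, slightly above it.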
[0192] In other embodiments, the present disclosure provides a
method of reading individual reporter elements of a surrogate
polymer, comprising:
[0193] a) providing a surrogate polymer, wherein the surrogate
polymer comprises one or more individual reporter elements;
[0194] b) providing a detector construct;
[0195] c) presenting the surrogate polymer to the detector
construct;
[0196] d) reading the individual reporter elements sequentially to
determine the reporter element sequence; and
[0197] e) using the reporter sequence thus determined to decode the
genetic information of the surrogate polymer.
[0198] In some embodiments of the foregoing, the detector construct
comprises a first and a second reservoir comprising first and
second electrodes, respectively, wherein the first and second
reservoirs are separated by a nanopore substrate positioned between
the first and second reservoirs, the nanopore substrate comprising
at least one nanopore channel, and reading the individual reporter
elements comprises translocating the surrogate polymer from the
first reservoir to the second reservoir through the at least one
nanopore channel. In other embodiments, reading the individual
reporter elements further comprises measuring the impedance change
in the nanopore channel as the surrogate polymer translocates
through the nanopore channel.
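The impedance-based reading described above can be pictured with a minimal sketch in which each sample of a measured trace is assigned to the nearest nominal signal level and consecutive identical assignments are collapsed into one reporter call. The reporter labels, the nominal levels, the nearest-level rule, and the assumption that the signal returns to a baseline between consecutive reporters are all hypothetical choices for illustration and are not specified by the disclosure.

```python
def call_reporters(trace, levels, baseline=0.0):
    """Turn a sampled trace into an ordered list of reporter labels.

    trace:    iterable of numeric samples (arbitrary units).
    levels:   dict mapping a reporter label to its nominal signal change.
    baseline: nominal value when no reporter is in the channel (assumed).
    """
    candidates = dict(levels)
    candidates[None] = baseline  # None marks "no reporter present"
    calls = []
    previous = None
    for sample in trace:
        # Nearest-level classification (an illustrative assumption).
        label = min(candidates, key=lambda k: abs(candidates[k] - sample))
        # Record a call only when a new reporter run begins.
        if label is not None and label != previous:
            calls.append(label)
        previous = label
    return calls


if __name__ == "__main__":
    nominal = {"R1": 1.0, "R2": 2.0, "R3": 3.0, "R4": 4.0}
    samples = [0.0, 1.1, 0.9, 0.1, 3.2, 2.9, 0.0, 3.1, 0.0, 4.0]
    print(call_reporters(samples, nominal))
```

Two identical reporters in a row are distinguished here only because the trace is assumed to revisit the baseline between them; a real signal may need a different segmentation rule.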
[0199] In further embodiments, the individual reporter elements
comprise at least one FRET (Fluorescence Resonance Energy Transfer)
donor or acceptor fluorophore and the nanopore channel comprises at
least one FRET donor or acceptor fluorophore, provided that when
the individual reporter elements comprise a FRET donor, the
nanopore channel comprises a FRET acceptor, and when the individual
reporter elements comprise a FRET acceptor, the nanopore channel
comprises a FRET donor, and reading the individual reporter
elements further comprises:
[0200] a) exciting the donor fluorophores with a light source as
the surrogate polymer translocates the nanopore channel; and
[0201] b) detecting a fluorescent signal emitted from the acceptor
fluorophores.
[0202] In other embodiments of the foregoing, the donor
fluorophores comprise 1 to 4 excitation wavelengths.
[0203] In further embodiments, the detector construct comprises a
nanocomb detector array having at least one detector element in, or
at the end of, the nanocomb slot, and reading the individual
reporter elements comprises passing the surrogate polymer through
the end of the nanocomb slot. In some other embodiments, the
nanocomb detector array further comprises a first and a second
electrode, and the surrogate polymer is passed between the first
and the second electrodes. In yet other embodiments, the individual
reporter elements induce a change in electrolyte current as the
surrogate polymer passes between the first and the second
electrodes. In still other embodiments, the individual reporter
elements form a current path between the first and the second
electrodes as the surrogate polymer passes between the first and
the second electrodes.
[0204] In other further embodiments, the surrogate polymer is
presented to the detector construct as a linearized array, wherein
the linearized array comprises a substrate. In some other
embodiments, reading the individual reporter elements comprises
electron beam microscopy, wherein the electron beam forms a line.
In other embodiments, the individual reporter elements comprise
boron or nanogold. In another embodiment, the substrate comprises a
contrast coating for improved signal-to-noise ratio.
[0205] In other further embodiments of the foregoing, the detector
construct comprises at least one knife-edge electrode, the
individual reporter elements comprise conductive polymeric
bristles, and reading the individual reporter elements
comprises:
[0206] a) applying an electric potential between the at least one
knife-edge electrode and the substrate; and
[0207] b) measuring an electric current as the surrogate polymer
passes under the at least one knife-edge electrode.
[0208] In some other embodiments of the foregoing, the substrate
comprises a conductive film. In some embodiments, the conductive
polymeric bristles comprise polymers selected from polyacetylene,
polyaniline, or polypyrrole.
[0209] In other further embodiments, the individual reporter
elements comprise at least one fluorophore, wherein the at least
one fluorophore comprises at least one spectral type, and reading
the individual reporter elements comprises:
[0210] a) providing an excitation energy, localizing the excitation
energy to excite the at least one fluorophore of the individual
reporter elements; and
[0211] b) detecting a fluorescent signal emitted by the at least
one fluorophore.
[0212] In some embodiments of the foregoing, the excitation energy
is from a near field source, the near field source emerging from a
slit, and reading the individual reporter elements further
comprises:
[0213] a) moving the slit parallel to the surrogate polymer;
and
[0214] b) detecting the fluorescent signal of the at least one
fluorophore of the individual reporter elements.
[0215] For example, in some embodiments, the fluorescent signal is
detected in the far field.
[0216] In other embodiments, the present disclosure provides a
method of detecting an analyte, comprising:
[0217] a) providing at least one analyte;
[0218] b) providing at least one indicator moiety, wherein the
indicator moiety is not associated with the analyte;
[0219] c) providing a detector construct, wherein the detector
construct comprises a first and a second reservoir comprising first
and second electrodes, respectively, wherein the first and second
reservoirs are separated by a nanopore substrate positioned between
the first and second reservoirs;
[0220] d) providing an electric potential to the first and second
electrodes, wherein the electric potential is sufficient to
translocate the at least one analyte and the at least one indicator
moiety through the at least one nanopore channel; and
[0221] e) detecting a change in an optical signal emitted from the
at least one indicator moiety at or near the at least one nanopore
channel as the at least one analyte translocates through the at
least one nanopore channel.
[0222] In some embodiments, the foregoing method further comprises
providing an excitation wavelength, wherein the excitation
wavelength is sufficient to induce a fluorescent signal from the at
least one indicator moiety. In other embodiments, the at least one
analyte is a nucleic acid. In yet other embodiments, the at least
one analyte is a surrogate polymer. In some other embodiments, the
at least one nanopore comprises a nanopore array, and the nanopore
array shares the first and second reservoirs.
[0223] In further embodiments, the at least one indicator moiety is
a fluorophore, the first reservoir comprises a high concentration
of the fluorophores relative to the second reservoir, and detecting
a change in optical signal further comprises detecting a change in
fluorescent signal as the fluorophores translocate through the at
least one nanopore channel. In some embodiments, epifluorescence
microscopy is used for detecting the change in fluorescent signal.
In other embodiments, conoscopy is used for detecting the change in
fluorescent signal. In other embodiments, the nanopore substrate
comprises a blocking film. In yet other embodiments, the
fluorophores are fluorescein. In some other embodiments, the second
reservoir comprises a fluorescence quenching agent, and detecting a
change in optical signal further comprises detecting a change in
fluorescent signal as the fluorophores or the quenching agent
translocate through the at least one nanopore channel. For example,
in some embodiments, the quenching agent is selected from QSY7,
QSY9, and free radicals.
[0224] In some further embodiments, the method further comprises
providing two indicator moieties, wherein a first indicator moiety
is selected from indicator ions, a second indicator moiety is
selected from fluorescence indicators, the first reservoir
comprises indicator ions, the second reservoir comprises
fluorescence indicators, and detecting a change in an optical
signal further comprises detecting the change in fluorescence
signal emitted as either the indicator ions or the indicator moiety
pass through the at least one nanopore channel. In some other
embodiments, the second reservoir further comprises a
non-fluorescing absorber. In yet other embodiments, the nanopore
channel is masked to create a circular opening of about 1 .mu.m in
diameter, wherein the opening is concentric with the nanopore
channel. In other embodiments, the indicator ions are selected from
calcium ions, singlet hydrogen ions, singlet oxygen ions, potassium
ions, zinc ions, magnesium ions, chlorine ions, and sodium ions. In
some embodiments, the fluorescence indicator is selected from
Fura-3, Fluo-3, Indo-1, and Fura Red. In other embodiments, the
fluorescence indicator is a fluorescence quencher. In yet other
embodiments, the first reservoir comprises iodide ions and the
second reservoir comprises fluorescein.
[0225] In some further embodiments, the method further comprises
providing two indicator moieties, wherein the first reservoir
comprises a first indicator moiety and the second reservoir
comprises a second indicator moiety, wherein the first and second
indicator moieties are capable of combining to form a third
indicator moiety in an excited state, and detecting a change in
optical signal further comprises detecting photons which are
emitted when the third indicator moiety relaxes to a ground
state.
[0226] In other embodiments a method of presenting at least one
surrogate polymer for detection is provided, wherein the method
comprises:
[0227] a) providing a detector construct, wherein the detector
construct comprises at least one detector element;
[0228] b) providing the at least one surrogate polymer, wherein the
at least one surrogate polymer comprises one or more individual
reporter elements; and
[0229] c) processing the at least one surrogate polymer to obtain a
uniform spatial and temporal spacing of the one or more individual
reporter elements.
[0230] In other embodiments, the detector construct comprises at
least one nanopore channel. For example, in some embodiments, the
detector construct comprises a regular array of nanopore channels.
In other embodiments, processing the at least one surrogate polymer
comprises tethering an end of the at least one surrogate polymer to
a solid substrate having at least one binding site. In yet other
embodiments, processing the at least one surrogate polymer
comprises aligning the at least one surrogate polymer on a
substrate surface.
[0231] In another further embodiment, processing the at least one
surrogate polymer comprises attaching a charged, linear polymer
having a low molecular weight to an end of the at least one
surrogate polymer. For example, in one embodiment the charged,
linear polymer is selected from polyglutamic acid and
polyphosphate.
[0232] In another further embodiment, processing the at least one
surrogate polymer comprises applying a voltage to the at least one
nanopore channel, wherein the voltage is higher than a desired
measurement voltage, and decreasing the voltage to the desired
measurement voltage when a surrogate polymer is detected in the
nanopore channel. In another embodiment of the foregoing, the
voltage is manipulated such that only one surrogate polymer may
occupy the at least one nanopore channel at a time.
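The voltage step-down described above can be sketched as a simple rule: hold a higher voltage while the nanopore channel is empty and drop to the desired measurement voltage once a surrogate polymer is detected in the channel. The function name, the boolean occupancy signal, the return to the higher voltage after the channel empties, and the numeric values in the example are assumptions for illustration only.

```python
def voltage_schedule(occupancy, capture_voltage, measurement_voltage):
    """Return the voltage to apply at each time step.

    occupancy: iterable of booleans, True when a surrogate polymer is
               detected in the nanopore channel (hypothetical signal).
    capture_voltage: voltage applied while the channel is empty; must be
               higher than the measurement voltage.
    measurement_voltage: desired voltage while a polymer is being read.
    """
    if capture_voltage <= measurement_voltage:
        raise ValueError("capture voltage must exceed measurement voltage")
    return [
        measurement_voltage if occupied else capture_voltage
        for occupied in occupancy
    ]


if __name__ == "__main__":
    detected = [False, False, True, True, False]
    # Voltages are in arbitrary units chosen only for the example.
    print(voltage_schedule(detected, capture_voltage=0.3, measurement_voltage=0.1))
```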
[0233] In yet another further embodiment, processing the at least
one surrogate polymer comprises attaching a stop to an end of the
at least one surrogate polymer, wherein the stop prevents the at
least one surrogate polymer from passing through the at least one
nanopore channel and prevents multiple surrogate polymers from
occupying the same nanopore channel, and prefilling the at least
one surrogate polymer in the at least one nanopore channel. In some
embodiments, the stop is selected from a bulky dendrimer and a
bead, for example, a magnetic bead.
[0234] In other further embodiments, processing the at least one
surrogate polymer comprises attaching a linear ferrite polymer to
an end of the at least one surrogate polymer and manipulating a
magnetic field and an electric field to obtain a uniform spatial
and temporal spacing of the one or more individual reporter
elements. For example, in one embodiment, the magnetic field and
the electric field are manipulated such that only one surrogate
polymer may occupy the at least one nanopore channel at a time.
[0235] In another further embodiment, processing the at least one
surrogate polymer comprises controlling the flow of the at least
one surrogate polymer toward the detector construct. For example,
in one embodiment, controlling the flow of the at least one
surrogate polymer comprises tethering the surrogate polymer to a
substrate, wherein the tether comprises an addressable, cleavable
linkage, and selectively cleaving the linkage such that one
surrogate polymer is released from the substrate per unit of time.
In one embodiment, selectively cleaving the cleavable linkage
comprises controlling the cleavage rate such that only one
surrogate polymer may occupy the at least one detector element at a
time. In some embodiments, the linkage is selected from
photocleavable linkages, thermally cleavable linkages and
electrochemically cleavable linkages.
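As a minimal sketch of controlling the cleavage rate so that only one surrogate polymer occupies a detector element at a time, the following computes release times at which each addressable linkage could be cleaved. It assumes that the time each polymer spends in the detector element is known in advance and that polymers are released strictly in order; the function name, the optional guard interval, and the example values are hypothetical.

```python
def release_schedule(transit_times, guard_interval=0.0):
    """Compute non-overlapping release times for tethered polymers.

    transit_times: time each polymer is expected to occupy the detector
                   element (assumed known; arbitrary units).
    guard_interval: extra idle time inserted between polymers (assumed).
    """
    if guard_interval < 0:
        raise ValueError("guard_interval must not be negative")
    schedule = []
    next_release = 0.0
    for transit in transit_times:
        if transit <= 0:
            raise ValueError("transit times must be positive")
        schedule.append(next_release)
        # The next polymer is released only after this one has cleared.
        next_release += transit + guard_interval
    return schedule


if __name__ == "__main__":
    print(release_schedule([2.0, 3.0, 1.5]))
    print(release_schedule([2.0, 3.0, 1.5], guard_interval=0.5))
```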
[0236] In further embodiments, controlling the flow of the at least
one surrogate polymer comprises:
[0237] a) providing at least one gating construct, wherein the at
least one gating construct comprises a first, second, and third
electrode; and
[0238] b) manipulating an electric field applied independently to
the first, second, and third electrodes to obtain a uniform spatial
and temporal spacing of the one or more individual reporter
elements.
[0239] In other embodiments of the foregoing, the electric field is
manipulated such that only one surrogate polymer may occupy the at
least one detector element at a time.
[0240] In yet other further embodiments, controlling the flow of
the at least one surrogate polymer comprises:
[0241] a) providing at least one gating construct, wherein the at
least one gating construct comprises a first and second porous
electrode and a gating element, wherein the first and second porous
electrodes are affixed to a first and second side of the gating
element, respectively;
[0242] b) applying an electric field to the first and second
electrodes; and
[0243] c) transporting the at least one surrogate polymer through
the gate toward the at least one detector element.
[0244] In other embodiments of the foregoing, the gating element is
selected from a porous membrane and a nanohole. In some
embodiments, the electric field is manipulated to obtain a uniform
spatial and temporal spacing of the one or more individual reporter
elements. In other embodiments, the electric field is manipulated
such that only one surrogate polymer may occupy the at least one
detector element at a time. In some embodiments, the gating element
is a porous membrane. For example, in some embodiments, the porous
membrane comprises pores from about 20 nm to about 100 nm in
diameter. In other embodiments, the porous membrane is selected
from aluminum oxide and a polymer track-formed membrane. In other
embodiments, a multiplexed gating construct is provided (i.e. more
than one gating construct is provided).
[0245] In other further embodiments, controlling the flow of the at least one
surrogate polymer comprises providing at least one gating construct
selected from an affinity gel or a channel (e.g. aluminum oxide),
and processing the at least one surrogate polymer comprises:
[0246] a) attaching an affinity drag tag to an end of the surrogate
polymer; and
[0247] b) applying an electric field sufficient to translocate the
surrogate polymer through the gating construct toward the at least
one detector element.
[0248] In other further embodiments of the foregoing, the electric
field is manipulated to obtain a uniform spatial and temporal
spacing of the one or more individual reporter elements. In yet
other embodiments, the electric field is manipulated such that only
one surrogate polymer may occupy the at least one detector element
at a time. In some embodiments, the gating construct is an affinity
gel.
[0249] In other further embodiments, the solid substrate has
uniformly spaced binding sites. In some embodiments, the binding
sites are a spot about 1 .mu.m in size. In other embodiments, a
maximum of one surrogate polymer binds to each individual binding
site. In even other embodiments, the surrogate polymer further
comprises a dendrimer attached to the end of the surrogate polymer,
wherein the dendrimer sterically inhibits binding of another
surrogate polymer to the same binding site. In other embodiments,
the solid substrate is selected from flexible polyethylene
terephthalate (PET) film, float glass, a silicon wafer, and
stainless steel. In even other embodiments, the at least one
binding site comprises a line on the solid substrate. For example,
in some embodiments, the width of the line is less than the
distance between the surrogate polymers bound thereto.
[0250] In other further embodiments, the substrate, having at least
one surrogate polymer bound thereto, is rotated from normal to 180
degrees to an applied electric field, the electric field causing
the at least one surrogate polymer to lie down in a straight and
elongated orientation on the surface of the substrate. In some
embodiments, the at least one surrogate polymer is further attached
to the substrate surface in a laid down, straight and elongated
orientation. For example, in some embodiments, the at least one
surrogate polymer is attached to the substrate surface by
ultraviolet or chemical activation of the substrate surface. In
other embodiments, the solid substrate is a flexible polyethylene
terephthalate (PET) film.
[0251] In further embodiments, the substrate, having at least one
surrogate polymer bound thereto, is passed through a comb
construct, wherein the comb construct comprises a stretching
electric field at an input side, a pinning electric field at an
output side, and a comb element between the input and output sides,
and processing the at least one surrogate polymer further comprises
passing the substrate through the stretching electric field, under
the comb, and through the electric pinning field such that the at
least one surrogate polymer is laid down in a straight and
elongated orientation on the substrate surface. In some
embodiments, the at least one surrogate polymer is further attached
to the substrate surface in a laid down, straight and elongated
orientation. For example, in some embodiments, the at least one
surrogate polymer is attached to the substrate surface by
application of an electric field or by ultraviolet or chemical
activation of the substrate surface.
[0252] In other further embodiments, the substrate, having at least
one surrogate polymer bound thereto, is passed under a brush
construct, wherein the brush construct comprises bristles, the
bristles causing the at least one surrogate polymer to lie down in
a straight and elongated orientation on the substrate surface. In
other embodiments, the at least one surrogate polymer is further
attached to the substrate surface in a laid down, straight and
elongated orientation. For example, in some embodiments, the at
least one surrogate polymer is attached to the substrate surface by
application of an electric field or by ultraviolet or chemical
activation of the substrate surface. In other embodiments, the
bristles are about 10 nm in diameter. In yet other embodiments, the
bristles comprise polymers, for example ultraviolet cured or
thermal cured polymers.
[0253] In further embodiments, the substrate comprises a closed
loop of flexible film, and processing the at least one surrogate
polymer further comprises a continuous process comprising:
[0254] a) rotating the substrate through the detector
construct;
[0255] b) removing the at least one surrogate polymer from the
substrate surface after it passes through the detector element;
[0256] c) reattaching another surrogate polymer to the substrate
surface; and
[0257] d) repeating steps a and b until all surrogate polymers
are analyzed.
[0258] In other embodiments of the foregoing, the substrate is a
flexible polyethylene terephthalate (PET) film.
[0259] In yet other further embodiments, processing the at least
one surrogate polymer comprises affixing the at least one surrogate
polymer to a solid substrate comprising nanopore channels, wherein
affixing the at least one surrogate polymer to the solid substrate
comprises attaching a stop to an end of the at least one surrogate
polymer, wherein the stop prevents the at least one surrogate
polymer from passing through the at least one nanopore channel and
prevents multiple surrogate polymers from occupying the same
nanopore channel, and prefilling the at least one surrogate polymer
in the at least one nanopore channel. For example, in some
embodiments, the stop is selected from a bulky dendrimer and a
bead. In other embodiments, the bead is a magnetic bead.
[0260] These and other aspects of the invention will be apparent
upon reference to the attached drawings and following detailed
description. To this end, various references are set forth herein
which describe in more detail certain procedures, compounds and/or
compositions, and are hereby incorporated by reference in their
entirety.
BRIEF DESCRIPTION OF THE DRAWINGS
[0261] In the figures, identical reference numbers identify similar
elements. The sizes and relative positions of elements in the
figures are not necessarily drawn to scale and some of these
elements are arbitrarily enlarged and positioned to improve figure
legibility. Further, the particular shapes of the elements as drawn
are not intended to convey any information regarding the actual
shape of the particular elements, and have been solely selected for
ease of recognition in the figures.
[0262] FIGS. 1A and 1B illustrate the limited separation between
nucleobases that must be resolved in order to determine the
sequence of nucleotides in a nucleic acid target.
[0263] FIGS. 2A through 2D illustrate schematically several
representative structures of substrates useful in the
invention.
[0264] FIGS. 3A, 3B and 3C are schematics illustrating simplified
steps for synthesizing an Xpandomer from a target nucleic acid.
[0265] FIG. 4 illustrates rolling circular polymerization.
[0266] FIGS. 5A and 5B illustrate methods of making paired-end
surrogate polymers and methods for preparing target oligomers for
preparing paired-end methods, respectively.
[0267] FIG. 6 represents an exemplary nanopore detection
technique.
[0268] FIG. 7 shows a depiction of a nanopore response as 4
different reporters are passed serially through a nanopore.
[0269] FIG. 8 depicts a nanopore fluorocurrent detection
technique.
[0270] FIG. 9 shows model data of the temporal diffusion of
fluorescein into an infinite trans reservoir.
[0271] FIG. 10 is a graph showing an exemplary embodiment where
fluorophore translocation is limited in time to 5 blocking
levels.
[0272] FIG. 11 depicts an ion indicator detection method.
[0273] FIG. 12 shows a quenching fluorescence detection method.
[0274] FIGS. 13A and 13B illustrate a nanocomb detection
technique.
[0275] FIGS. 14A, 14B, and 14C show a detection method comprising a
knife-edge electrode and conductive polymer.
[0276] FIGS. 15A, 15B, and 15C illustrate different methods for
presenting nucleic acid polymers for detection.
[0277] FIG. 16 shows a porous array on a substrate.
[0278] FIGS. 17A and 17B illustrate an exemplary presentation
method.
[0279] FIGS. 18A, 18B, and 18C depict exemplary presentation
methods.
[0280] FIGS. 19A, 19B, and 19C depict exemplary presentation
methods.
[0281] FIGS. 20A through 20D depict exemplary presentation
methods.
[0282] FIG. 21 depicts an affinity stretching presentation
method.
[0283] FIGS. 22A through 22D illustrate different methods of
aligning nucleic acid polymers on substrate surfaces. FIG. 22C is
an end view of a comb.
[0284] FIG. 23 shows a typical target template (SEQ ID NO: 1) that
is duplexed with a 16-mer HEX-modified primer (SEQ ID NO: 2) and
designed with a 20 base 5' overhang.
[0285] FIGS. 24A through 24D are gels of ligation experiments.
[0286] FIG. 25 is a gel of a ligation experiment.
DETAILED DESCRIPTION
[0287] In the following description, certain specific details are
set forth in order to provide a thorough understanding of various
embodiments. However, one skilled in the art will understand that
the invention may be practiced without these details. In other
instances, well-known structures have not been shown or described
in detail to avoid unnecessarily obscuring descriptions of the
embodiments. Unless the context requires otherwise, throughout the
specification and claims which follow, the word "comprise" and
variations thereof, such as, "comprises" and "comprising" are to be
construed in an open, inclusive sense, that is, as "including, but
not limited to." Further, headings provided herein are for
convenience only and do not interpret the scope or meaning of the
claimed invention.
[0288] Reference throughout this specification to "one embodiment"
or "an embodiment" means that a particular feature, structure or
characteristic described in connection with the embodiment is
included in at least one embodiment. Thus, the appearances of the
phrases "in one embodiment" or "in an embodiment" in various places
throughout this specification are not necessarily all referring to
the same embodiment. Furthermore, the particular features,
structures, or characteristics may be combined in any suitable
manner in one or more embodiments. Also, as used in this
specification and the appended claims, the singular forms "a,"
"an," and "the" include plural referents unless the content clearly
dictates otherwise. It should also be noted that the term "or" is
generally employed in its sense including "and/or" unless the
content clearly dictates otherwise.
Definitions
[0289] As used herein, and unless the context dictates otherwise,
the following terms have the meanings as specified below.
[0290] "SBX" refers to Sequence by Expansion. SBX processes and
methods are described in detail herein.
[0291] "Nucleobase" is a heterocyclic base such as adenine,
guanine, cytosine, thymine, uracil, inosine, xanthine,
hypoxanthine, or a heterocyclic derivative, analog, or tautomer
thereof. A nucleobase can be naturally occurring or synthetic.
Non-limiting examples of nucleobases are adenine, guanine, thymine,
cytosine, uracil, xanthine, hypoxanthine, 8-azapurine, purines
substituted at the 8 position with methyl or bromine,
9-oxo-N6-methyladenine, 2-aminoadenine, 7-deazaxanthine,
7-deazaguanine, 7-deaza-adenine, N4-ethanocytosine,
2,6-diaminopurine, N6-ethano-2,6-diaminopurine, 5-methylcytosine,
5-(C3-C6)-alkynylcytosine, 5-fluorouracil, 5-bromouracil,
thiouracil, pseudoisocytosine,
2-hydroxy-5-methyl-4-triazolopyridine, isocytosine, isoguanine,
inosine, 7,8-dimethylalloxazine, 6-dihydrothymine,
5,6-dihydrouracil, 4-methyl-indole, ethenoadenine and the
non-naturally occurring nucleobases described in U.S. Pat. Nos.
5,432,272 and 6,150,510 and PCT applications WO 92/002258, WO
93/10820, WO 94/22892, and WO 94/24144, and Fasman ("Practical
Handbook of Biochemistry and Molecular Biology", pp. 385-394, 1989,
CRC Press, Boca Raton, FL), all herein incorporated by reference in
their entireties.
[0292] "Nucleobase residue" includes nucleotides, nucleosides,
fragments thereof, and related molecules having the property of
binding to a complementary nucleotide. Deoxynucleotides and
ribonucleotides, and their various analogs, are contemplated within
the scope of this definition. Nucleobase residues may be members of
oligomers and probes. "Nucleobase" and "nucleobase residue" may be
used interchangeably herein and are generally synonymous unless
context dictates otherwise.
[0293] "Polynucleotides", also called nucleic acids or nucleic acid
polymers, are covalently linked series of nucleotides. DNA
(deoxyribonucleic acid) and RNA (ribonucleic acid) are biologically
occurring polynucleotides in which the nucleotide residues are
linked in a specific sequence by phosphodiester linkages. As used
herein, the terms "polynucleotide" or "oligonucleotide" encompass
any polymer compound, including the surrogate polymers disclosed
herein, having a linear backbone of nucleotides. Oligonucleotides,
also termed oligomers, are generally shorter chained
polynucleotides.
[0294] "Complementary" generally refers to specific nucleotide
duplexing to form canonical Watson-Crick base pairs, as is
understood by those skilled in the art. However, complementary as
referred to herein also includes base-pairing of nucleotide
analogs, which include, but are not limited to, 2'-deoxyinosine and
5-nitroindole-2'-deoxyriboside, which are capable of universal
base-pairing with A, T, G or C nucleotides and locked nucleic
acids, which enhance the thermal stability of duplexes. One skilled
in the art will recognize that hybridization stringency is a
determinant in the degree of match or mismatch in the duplex formed
by hybridization.
[0295] "Nucleic acid" is a polynucleotide or an oligonucleotide. A
nucleic acid molecule can be deoxyribonucleic acid (DNA),
ribonucleic acid (RNA), or a combination of both. Nucleic acids are
generally referred to as "target nucleic acids" or "target
sequence" if targeted for sequencing. Nucleic acids can be mixtures
or pools of molecules targeted for sequencing.
[0296] "Probe" is a short strand of nucleobase residues, referring
generally to two or more contiguous nucleobase residues which are
generally single-stranded and complementary to a target sequence of
a nucleic acid. As embodied in "Substrate Members" and "Substrate
Constructs", probes can be up to 20 nucleobase residues in length.
Probes may include modified nucleobase residues and modified
intra-nucleobase bonds in any combination. Backbones of probes can
be linked together by any of a number of types of covalent bonds,
including, but not limited to, ester, phosphodiester,
phosphoramide, phosphonate, phosphorothioate, phosphorothiolate,
amide bond and any combination thereof. The probe may also have 5'
and 3' end linkages that include, but are not limited to, the
following moieties: monophosphate, triphosphate, hydroxyl,
hydrogen, ester, ether, glycol, amine, amide, and thioester.
[0297] "Selective hybridization" refers to specific complementary
binding. Polynucleotides, oligonucleotides, probes, nucleobase
residues, and fragments thereof selectively hybridize to target
nucleic acid strands, under hybridization and wash conditions that
minimize nonspecific binding. As known in the art, high stringency
conditions can be used to achieve selective hybridization
conditions favoring a perfect match. Conditions for hybridization
such as salt concentration, temperature, detergents, PEG, and GC
neutralizing agents such as betaine can be varied to increase the
stringency of hybridization, that is, the requirement for exact
matches of C to base pair with G, and A to base pair with T or U,
along a contiguous strand of a duplex nucleic acid.
[0298] "Template-directed synthesis", "template-directed assembly",
"template-directed hybridization", "template-directed binding" and
any other template-directed processes, refer to a process whereby
nucleobase residues or probes bind selectively to a complementary
target nucleic acid, and are incorporated into a nascent daughter
strand. A daughter strand produced by a template-directed synthesis
is complementary to the single-stranded target from which it is
synthesized. It should be noted that the corresponding sequence of
a target strand can be inferred from the sequence of its daughter
strand, if that is known. "Template-directed polymerization" and
"template-directed ligation" are special cases of template-directed
synthesis whereby the resulting daughter strand is polymerized or
ligated, respectively.
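By way of illustration only, the inference of a target sequence from a known daughter sequence can be sketched as follows. The sketch assumes canonical Watson-Crick pairing among the four DNA nucleobases and assumes that both strands are written 5' to 3'; the function names are hypothetical and do not form part of the disclosed methods.

```python
# Canonical Watson-Crick pairs for DNA (assumption: no analogs or universal bases).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def daughter_from_target(target_5to3: str) -> str:
    """Return the daughter strand (5'->3') complementary to a target (5'->3')."""
    # The strands are antiparallel, so the target is read in reverse.
    return "".join(COMPLEMENT[base] for base in reversed(target_5to3.upper()))


def target_from_daughter(daughter_5to3: str) -> str:
    """Infer the target strand (5'->3') from a known daughter strand (5'->3')."""
    # Taking the reverse complement twice returns the original sequence.
    return daughter_from_target(daughter_5to3)


if __name__ == "__main__":
    target = "GATTACA"
    daughter = daughter_from_target(target)
    print(daughter)
    print(target_from_daughter(daughter) == target)
```

Because the reverse-complement operation is its own inverse, the same routine serves in both directions.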
[0299] "Daughter strand" means a strand produced by a
template-directed synthesis which is complementary to the
single-stranded target from which it is synthesized. Daughter
strands include Xdaughter strands and S-Xdaughter strands, as
defined herein, as well as daughter strands of other nucleic
acids.
[0300] "Paired-end daughter strand" means a daughter strand
produced by a bidirectional synthesis. A paired end daughter strand
comprises a first and second sequence region attached to first and
second end of a primer. The first and second sequence regions
independently comprise nucleobase residues encoding the genetic
information of a target nucleic acid. Typically, the first and
second sequence regions independently comprise 10 or more decodable
nucleobase residues, although paired-end daughter strands having
first and second regions comprising less than 10 decodable
nucleobase residues are also included within the definition of
"paired-end daughter strand". Paired-end daughter strands include
Paired-end Xdaughter strands and Paired-end S-Xdaughter strands, as
defined herein, as well as Paired-end daughter strands of other
nucleic acids.
[0301] "Sequence region" means a region of a nucleic acid
(surrogate polymer or otherwise) which comprises nucleobase
residues coupled in a sequence corresponding to a contiguous
nucleotide sequence of all or a portion of a target nucleic
acid.
[0302] "Not associated with" in the context of an indicator moiety
which is not associated with an analyte, means that the indicator
moiety is not bonded (e.g. covalent bond, hydrogen bond, etc.) or
otherwise conjugated to the analyte.
[0303] "Indicator moiety" means a moiety, for example a chemical
species, which can be detected under the conditions of a particular
assay. Non-limiting examples of indicator moieties include:
fluorophores, chemiluminescent species, and any species capable of
inducing fluorescence or chemiluminescence in another species.
[0304] "Analyte nucleic acid" means a nucleic acid which is the
subject of analysis and/or detection. Analyte nucleic acids include
surrogate polymers as well as other nucleic acids.
[0305] "Primer" means a nucleic acid strand used as a template for
template-directed synthesis of a daughter strand.
[0306] "Primer adapter" means a nucleic acid strand used as a
template to produce a primer.
[0307] "Contiguous" indicates that a sequence continues without
interruption or missed nucleobase. The contiguous sequence of
nucleotides of the template strand is said to be complementary to
the contiguous sequence of the daughter strand.
[0308] "Substrates" or "substrate members" are oligomers, probes or
nucleobase residues that have binding specificity to the target
template. The substrates are generally combined with tethers to
form substrate constructs. Substrates of substrate constructs that
form the primary backbone of the daughter strand are also
substrates or substrate members of the daughter strand.
[0309] "Substrate constructs" are reagents for template-directed
synthesis of daughter strands, and are generally provided in the
form of libraries. Substrate constructs generally contain a
substrate member for complementary binding to a target template and
either a tether member or tether attachment sites to which a tether
may be bonded. Substrate constructs are provided in a variety of
forms adapted to the invention. Substrate constructs include both
"oligomeric substrate constructs" (also termed "probe substrate
constructs") and "monomeric substrate constructs" (also termed
"nucleobase substrate constructs").
[0310] "Subunit motif" or "motif" refers to a repeating subunit of
a polymer backbone, the subunit having an overall form
characteristic of the repeating subunits, but also having
species-specific elements that encode genetic information. Motifs
of complementary nucleobase residues are represented in libraries
of substrate constructs according to the number of possible
combinations of the basic complementary sequence binding nucleobase
elements in each motif. If the nucleobase binding elements are four
(e.g., A, C, G, and T), the number of possible motifs of
combinations of four elements is 4^x, where x is the number of
nucleobase residues in the motif. However, other motifs based on
degenerate pairing bases, on the substitution of uracil for
thymidine in ribonucleobase residues or other sets of nucleobase
residues, can lead to larger libraries (or smaller libraries) of
motif-bearing substrate constructs. Motifs are also represented by
species-specific reporter constructs, such as the reporters making
up a reporter tether. Generally there is a one-to-one correlation
between the reporter construct motif identifying a particular
substrate species and the binding complementarity and specificity
of the motif.
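As a non-limiting illustration of the library-size arithmetic described above, the following sketch enumerates the motifs for a given number of nucleobase residues and confirms that the count equals four raised to the power x. The function names and the default four-letter alphabet are assumptions made for the example.

```python
from itertools import product


def library_size(x: int, n_elements: int = 4) -> int:
    """Number of possible motifs with x residues drawn from n_elements bases."""
    return n_elements ** x


def motif_library(x: int, alphabet: str = "ACGT") -> list:
    """Enumerate every motif of x residues over the given alphabet."""
    return ["".join(residues) for residues in product(alphabet, repeat=x)]


if __name__ == "__main__":
    for x in (1, 2, 4, 6):
        print(x, library_size(x))
    # The enumerated library and the closed-form count agree.
    print(len(motif_library(4)) == library_size(4))
```

Passing a larger or smaller alphabet (for example, one that includes a degenerate pairing base) changes the library size accordingly.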
[0311] "Xpandomer intermediate" or "S-Xpandomer intermediate" is an
intermediate product (also referred to herein as an "Xdaughter
strand" or "S-Xdaughter strand", respectively) assembled from
substrate constructs, and is formed by a template-directed assembly
of substrate constructs using a target nucleic acid template.
Optionally, other linkages between abutted substrate constructs are
formed which may include polymerization or ligation of the
substrates, tether-to-tether linkages or tether-to-substrate
linkages. The Xpandomer intermediate or S-Xpandomer intermediate
contains two structures; namely, the constrained Xpandomer or
S-Xpandomer and the primary backbone. The constrained Xpandomer or
S-Xpandomer comprises all of the tethers in the daughter strand but
may comprise all, a portion or none of the substrate as required by
the method. The primary backbone comprises all of the abutted
substrates. Under the process step in which the primary backbone is
fragmented or dissociated, the constrained Xpandomer or S-Xpandomer
is no longer constrained and is the Xpandomer or S-Xpandomer
product which is extended as the tethers are stretched out. "Duplex
daughter strand" refers to an Xpandomer intermediate or S-Xpandomer
intermediate that is hybridized or duplexed to the target
template.
[0312] "Primary backbone" refers to a contiguous or segmented
backbone of substrates of the daughter strand. A commonly
encountered primary backbone is the ribosyl 5'-3' phosphodiester
backbone of a native polynucleotide. However, the primary backbone
of a daughter strand may contain analogs of nucleobases and
analogs of oligomers not linked by phosphodiester bonds or linked
by a mixture of phosphodiester bonds and other backbone bonds,
which include, but are not limited to, the following linkages:
phosphorothioate, phosphorothiolate, phosphonate, phosphoramidate,
and peptide nucleic acid "PNA" backbone bonds which include
phosphono-PNA, serine-PNA, hydroxyproline-PNA, and combinations
thereof. Where the daughter strand is in its duplex form (i.e.,
duplex daughter strand), and substrates are not covalently bonded
between the subunits, the substrates are nevertheless contiguous
and form the primary backbone of the daughter strand.
[0313] "Constrained Xpandomer" or "constrained S-Xpandomer" is an
Xpandomer or S-Xpandomer in a configuration before it has been
expanded. The constrained Xpandomer or S-Xpandomer comprises all
tether members of the daughter strand. It is constrained from
expanding by at least one bond or linkage per tether attaching to
the primary backbone. During the expansion process, the primary
backbone of the daughter strand is fragmented or dissociated to
transform the constrained Xpandomer or constrained S-Xpandomer into
an Xpandomer or S-Xpandomer, respectively.
[0314] "Constrained Xpandomer backbone" or "constrained S-Xpandomer
backbone" refers to the backbone of the constrained Xpandomer or
constrained S-Xpandomer, respectively. It is a synthetic covalent
backbone co-assembled along with the primary backbone in the
formation of the daughter strand. In some cases both backbones may
not be discrete but may both have the same substrate or portions of
the substrate in their composition. The constrained Xpandomer or
constrained S-Xpandomer backbone always comprises the tethers
whereas the primary backbone comprises no tether members.
[0315] "Xpandomer" or "Xpandomer product" is a synthetic molecular
construct produced by expansion of a constrained Xpandomer, which
is itself synthesized by template-directed assembly of substrate
constructs. The Xpandomer is elongated relative to the target
template it was produced from. It is composed of a concatenation of
subunits, each subunit a motif, each motif a member of a library,
comprising sequence information, a tether and optionally, a
portion, or all of the substrate, all of which are derived from the
formative substrate construct. The Xpandomer is designed to expand
to be longer than the target template thereby lowering the linear
density of the sequence information of the target template along
its length. Xpandomers comprise reporter constructs which comprise
all the sequence information of the Xpandomer. In addition, the
Xpandomer optionally provides a platform for increasing the size
and abundance of reporters which in turn improves signal to noise
for detection. Lower linear information density and stronger
signals increase the resolution and reduce sensitivity requirements
to detect and decode the sequence of the template strand.
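The reduction in linear information density can be illustrated with a short calculation. The numerical values below (inter-nucleobase spacing along the template, residues per subunit, and expanded subunit length) are hypothetical placeholders chosen only to make the arithmetic concrete; they are not taken from this disclosure.

```python
def linear_information_density(bases_per_subunit: int, subunit_length_nm: float) -> float:
    """Sequence information per unit length, in bases per nanometer."""
    if subunit_length_nm <= 0:
        raise ValueError("subunit length must be positive")
    return bases_per_subunit / subunit_length_nm


def expansion_factor(template_spacing_nm: float,
                     bases_per_subunit: int,
                     expanded_subunit_length_nm: float) -> float:
    """Ratio of expanded subunit length to the template length it encodes."""
    template_length_nm = bases_per_subunit * template_spacing_nm
    return expanded_subunit_length_nm / template_length_nm


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    spacing_nm = 0.5        # assumed spacing between adjacent nucleobases
    bases = 4               # assumed residues encoded per subunit
    expanded_nm = 20.0      # assumed length of one subunit after expansion

    before = linear_information_density(bases, bases * spacing_nm)
    after = linear_information_density(bases, expanded_nm)
    print(before, after, expansion_factor(spacing_nm, bases, expanded_nm))
```

Under these assumed values the same four residues are spread over ten times the length, so the detector must resolve correspondingly fewer bases per nanometer.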
[0316] "S-Xpandomer" or "S-Xpandomer product" is similar to the
Xpandomer defined above, except that S-Xpandomers or S-Xpandomer
products comprise reporter constructs which comprise only a portion
of the sequence information of the S-Xpandomer. The reduced
reporter content allows for reduced resolution requirements.
[0317] The term "surrogate polymer" refers to both Xpandomers and
S-Xpandomers.
[0318] The term "surrogate polymer daughter strand" or "surrogate
daughter strand" refers to both Xdaughter strands and S-Xdaughter
strands.
[0319] "Selectively cleavable bond" refers to a bond which can be
broken under controlled conditions such as, for example, conditions
for selective cleavage of a phosphorothiolate bond, a
photocleavable bond, a phosphoramide bond, a
3'-O-β-D-ribofuranosyl-2' bond, a thioether bond, a selenoether
bond, a sulfoxide bond, a disulfide bond, deoxyribosyl-5'-3'
phosphodiester bond, or a ribosyl-5'-3' phosphodiester bond, as
well as other cleavable bonds known in the art. A selectively
cleavable bond can be an intra-tether bond or between or within a
probe or a nucleobase residue or can be the bond formed by
hybridization between a probe and a template strand. Selectively
cleavable bonds are not limited to covalent bonds, and can be
non-covalent bonds or associations, such as those based on hydrogen
bonds, hydrophobic bonds, ionic bonds, pi-bond ring stacking
interactions, Van der Waals interactions, and the like.
[0320] "Moiety" is one of two or more parts into which something
may be divided, such as, for example, the various parts of a
tether, a molecule or a probe.
[0321] "Tether" or "tether member" refers to a polymer or molecular
construct having a generally linear dimension and with an end
moiety at each of two opposing ends. A tether is attached to a
substrate with a linkage in at least one end moiety to form a
substrate construct. The end moieties of the tether may be
connected to cleavable linkages to the substrate or cleavable
intra-tether linkages that serve to constrain the tether in a
"constrained configuration". After the daughter strand is
synthesized, each end moiety has an end linkage that couples
directly or indirectly to other tethers. The coupled tethers
comprise the constrained Xpandomer or S-Xpandomer that further
comprises the daughter strand. Tethers have a "constrained
configuration" and an "expanded configuration". The constrained
configuration is found in substrate constructs and in the daughter
strand. The constrained configuration of the tether is the
precursor to the expanded configuration, as found in Xpandomer
products and S-Xpandomer products. The transition from the
constrained configuration to the expanded configuration results
from cleavage of selectively cleavable bonds that may be within the
primary backbone of the daughter strand or intra-tether linkages. A
tether in a constrained configuration is also used where a tether
is added to form the daughter strand after assembly of the "primary
backbone". Tethers can optionally comprise one or more reporter
elements or reporter constructs along their length that can encode
sequence information of substrates. The tether provides a means to
expand the length of the Xpandomer or S-Xpandomer and thereby lower
the sequence information linear density.
[0322] "Tether constructs" are tethers or tether precursors
composed of one or more tether segments or other architectural
components for assembling tethers such as reporter constructs, or
reporter precursors, including polymers, graft copolymers, block
copolymers, affinity ligands, oligomers, haptens, aptamers,
dendrimers, linkage groups or affinity binding groups (e.g.,
biotin).
[0323] "Tether element" or "tether segment" is a polymer having a
generally linear dimension with two terminal ends, where the ends
form end-linkages for concatenating the tether elements. Tether
elements may be segments of tether constructs. Such polymers can
include, but are not limited to: polyethylene glycols, polyglycols,
polypyridines, polyisocyanides, polyisocyanates,
poly(triarylmethyl) methacrylates, polyaldehydes, polypyrrolinones,
polyureas, polyglycol phosphodiesters, polyacrylates,
polymethacrylates, polyacrylamides, polyvinyl esters, polystyrenes,
polyamides, polyurethanes, polycarbonates, polybutyrates,
polybutadienes, polybutyrolactones, polypyrrolidinones,
polyvinylphosphonates, polyacetamides, polysaccharides,
polyhyaluronates, polyamides, polyimides, polyesters,
polyethylenes, polypropylenes, polystyrenes, polycarbonates,
polyterephthalates, polysilanes, polyurethanes, polyethers,
polyamino acids, polyglycines, polyprolines, N-substituted
polylysine, polypeptides, side-chain N-substituted peptides,
poly-N-substituted glycine, peptoids, side-chain
carboxyl-substituted peptides, homopeptides, oligonucleotides,
ribonucleic acid oligonucleotides, deoxynucleic acid
oligonucleotides, oligonucleotides modified to prevent Watson-Crick
base pairing, oligonucleotide analogs, polycytidylic acid,
polyadenylic acid, polyuridylic acid, polythymidine, polyphosphate,
polynucleotides, polyribonucleotides, polyethylene
glycol-phosphodiesters, peptide polynucleotide analogues,
threosyl-polynucleotide analogues, glycol-polynucleotide analogues,
morpholino-polynucleotide analogues, locked nucleotide oligomer
analogues, polypeptide analogues, branched polymers, comb polymers,
star polymers, dendritic polymers, random, gradient and block
copolymers, anionic polymers, cationic polymers, polymers forming
stem-loops, rigid segments and flexible segments.
[0324] "Peptide nucleic acid" or "PNA" is a nucleic acid analog
having nucleobase residues suitable for hybridization to a nucleic
acid, but with a backbone that comprises amino acids or derivatives
or analogs thereof.
[0325] "Phosphono-peptide nucleic acid" or "pPNA" is a peptide
nucleic acid in which the backbone comprises amino acid analogs,
such as N-(2-hydroxyethyl)phosphonoglycine or
N-(2-aminoethyl)phosphonoglycine, and the linkages between
nucleobase units are through phosphonoester or phosphonoamide
bonds.
[0326] "Serine nucleic acid" or "SerNA" is a peptide nucleic acid
in which the backbone comprises serine residues. Such residues can
be linked through amide or ester linkages.
[0327] "Hydroxyproline nucleic acid" or "HypNA" is a peptide
nucleic acid in which the backbone comprises 4-hydroxyproline
residues. Such residues can be linked through amide or ester
linkages.
[0328] "Reporter element" is a signaling element, molecular
complex, compound, molecule or atom that is also comprised of an
associated "reporter detection characteristic". Reporter elements
include, but are not limited to, FRET resonant donor or acceptor,
dye, quantum dot, bead, dendrimer, up-converting fluorophore,
magnetic particle, electron scatterer (e.g., boron), mass, gold bead,
magnetic resonance, ionizable group, polar group, hydrophobic
group. Still others are fluorescent labels, such as but not limited
to, ethidium bromide, SYBR Green, Texas Red, acridine orange,
pyrene, 4-nitro-1,8-naphthalimide, TOTO-1, YOYO-1, cyanine 3 (Cy3),
cyanine 5 (Cy5), phycoerythrin, phycocyanin, allophycocyanin, FITC,
rhodamine, 5(6)-carboxyfluorescein, fluorescent proteins, DOXYL
(N-oxyl-4,4-dimethyloxazolidine), PROXYL
(N-oxyl-2,2,5,5-tetramethylpyrrolidine), TEMPO
(N-oxyl-2,2,6,6-tetramethylpiperidine), dinitrophenyl, acridines,
coumarins, Cy3 and Cy5 (Biological Detection Systems, Inc.),
erythrosine, coumaric acid, umbelliferone, Texas Red rhodamine,
tetramethylrhodamine, Rox, 7-nitrobenzo-1-oxa-1-diazole (NBD),
oxazole, thiazole, pyrene, fluorescein or lanthanides; also
radioisotopes (such as ^33P, ^3H, ^14C, ^35S,
^125I, ^32P or ^131I), ethidium, Europium, Ruthenium,
and Samarium or other radioisotopes; or mass tags, such as, for
example, pyrimidines modified at the C5 position or purines
modified at the N7 position, wherein mass modifying groups can be,
for example, halogen, ether or polyether, alkyl, ester or
polyester, or of the general type XR, wherein X is a linking group
and R is a mass-modifying group, chemiluminescent labels, spin
labels, enzymes (such as peroxidases, alkaline phosphatases,
beta-galactosidases, and oxidases), antibody fragments, and
affinity ligands (such as an oligomer, hapten, and aptamer).
Association of the reporter element with the tether can be covalent
or non-covalent, and direct or indirect. Representative covalent
associations include linker and zero-linker bonds. Included are
bonds to the tether backbone or to a tether-bonded element such as
a dendrimer or sidechain. Representative non-covalent bonds include
hydrogen bonds, hydrophobic bonds, ionic bonds, pi-bond ring
stacking, Van der Waals interactions, and the like. Ligands, for
example, are associated by specific affinity binding with binding
sites on the reporter element. Direct association can take place at
the time of tether synthesis, after tether synthesis, and before or
after Xpandomer synthesis.
[0329] A "reporter" or "reporter construct" is composed of one or
more reporter elements. Reporters include what are known as "tags"
and "labels." The probe or nucleobase residue of the Xpandomer or
S-Xpandomer can be considered a reporter. Reporters serve to parse
the genetic information of the target nucleic acid.
[0330] "Reporter construct" comprises one or more reporters that
can produce a detectable signal(s), wherein the detectable
signal(s) generally contain sequence information. This signal
information is termed the "reporter code" and is subsequently
decoded into genetic sequence data. A reporter construct may also
comprise tether segments or other architectural components
including polymers, graft copolymers, block copolymers, affinity
ligands, oligomers, haptens, aptamers, dendrimers, linkage groups
or affinity binding groups (e.g., biotin).
[0331] "Reporter detection characteristic", referred to as the
"signal", describes all possible measurable or detectable elements,
properties or characteristics used to communicate the genetic
sequence information of a reporter directly or indirectly to a
measurement device. These include, but are not limited to,
fluorescence, multi-wavelength fluorescence, emission spectrum
fluorescence quenching, FRET, emission, absorbance, reflectance,
dye emission, quantum dot emission, bead image, molecular complex
image, magnetic susceptibility, electron scattering, ion mass,
magnetic resonance, molecular complex dimension, molecular complex
impedance, molecular charge, induced dipole, impedance, molecular
mass, quantum state, charge capacity, magnetic spin state,
inducible polarity, nuclear decay, resonance, or
complementarity.
[0332] "Reporter Code" is the genetic information from a measured
signal of a reporter construct. The reporter code is decoded to
provide sequence-specific genetic information data.
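For purposes of illustration, decoding of reporter codes into sequence data can be sketched as a lookup against a one-to-one table. The two-level codes and the table below are hypothetical; an actual reporter code depends on the reporter constructs and the detection method employed.

```python
# Hypothetical one-to-one table: reporter code -> substrate species.
REPORTER_CODE_TO_SUBSTRATE = {
    (0, 0): "A",
    (0, 1): "C",
    (1, 0): "G",
    (1, 1): "T",
}


def decode_reporter_codes(codes) -> str:
    """Translate measured reporter codes, in detection order, into sequence data."""
    decoded = []
    for position, code in enumerate(codes):
        key = tuple(code)
        if key not in REPORTER_CODE_TO_SUBSTRATE:
            # An unknown code is reported rather than guessed.
            raise ValueError(f"unrecognized reporter code {key} at subunit {position}")
        decoded.append(REPORTER_CODE_TO_SUBSTRATE[key])
    return "".join(decoded)


if __name__ == "__main__":
    measured = [(1, 0), (0, 0), (1, 1), (1, 1), (0, 0), (0, 1), (0, 0)]
    print(decode_reporter_codes(measured))
```

For oligomeric substrate constructs, the same approach would apply with a larger table having one entry per probe species in the library.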
[0333] "Xprobe" or "S-Xprobe" is an expandable oligomeric substrate
construct. Each Xprobe or S-Xprobe has a probe member and a tether
member, the tether member generally having one or more reporter
constructs. Xprobes or S-Xprobes with 5'-monophosphate
modifications are compatible with enzymatic ligation-based methods
for Xpandomer or S-Xpandomer synthesis, respectively. Xprobes or
S-Xprobes with 5' and 3' linker modifications are compatible with
chemical ligation-based methods for Xpandomer or S-Xpandomer
synthesis, respectively.
[0334] "Xmer" or "S-Xmer" is an expandable oligomeric substrate
construct. Each Xmer or S-Xmer has an oligomeric substrate member
and a tether member, the tether member generally having one or more
reporter constructs. Xmers and S-Xmers are 5'-triphosphates
compatible with polymerase-based methods for synthesizing
Xpandomers and S-Xpandomers, respectively.
[0335] "RT-NTP" is an expandable, 5' triphosphate-modified
nucleotide substrate construct ("monomeric substrate") compatible
with template-dependent enzymatic polymerization. An RT-NTP has a
modified deoxyribonucleotide triphosphate ("DNTP"), ribonucleotide
triphosphate ("RNTP"), or a functionally equivalent analog
substrate, collectively referred to as the nucleotide triphosphate
substrate ("NTPS"). An RT-NTP has two distinct functional
components; namely, a nucleobase 5'-triphosphate and a tether or
tether precursor. After formation of the daughter strand the tether
is attached between each nucleotide at positions that allow for
controlled RT expansion. In one class of RT-NTP (e.g., Class IX),
the tether is attached after RT-NTP polymerization. In some cases,
the RT-NTP has a reversible end terminator and a tether that
selectively crosslinks directly to adjacent tethers. Each tether
can be uniquely encoded with reporters that specifically identify
the nucleotide to which it is tethered.
[0336] "XNTP" is an expandable, 5' triphosphate modified nucleotide
substrate compatible with template dependent enzymatic
polymerization. An XNTP has two distinct functional components;
namely, a nucleobase 5'-triphosphate and a tether or tether
precursor that is attached within each nucleotide at positions that
allow for controlled RT expansion by intra-nucleotide cleavage.
[0337] "Processive" refers to a process of coupling of substrates
which is generally continuous and proceeds with directionality.
While not bound by theory, both ligases and polymerases, for
example, exhibit processive behavior if substrates are added to a
nascent daughter strand incrementally without interruption. The
steps of hybridization and ligation, or hybridization and
polymerization, are not seen as independent steps if the net effect
is processive growth of the nascent daughter strand. Some but not
all primer-dependent processes are processive.
[0338] "Promiscuous" refers to a process of coupling of substrates
that proceeds from multiple points on a template at once, and is
not primer dependent, and indicates that chain extension occurs in
parallel (simultaneously) from more than one point of origin.
[0339] "Single-base extension" refers to a cyclical stepwise
process in which monomeric substrates are added one by one.
Generally the coupling reaction is restrained from proceeding
beyond single substrate extension in any one step by use of
reversible blocking groups.
[0340] "Single-probe extension" refers to a cyclical stepwise
process in which oligomeric substrates are added one by one.
Generally the coupling reaction is restrained from proceeding
beyond single substrate extension in any one step by use of
reversible blocking groups.
[0341] "Corresponds to" or "corresponding" is used here in
reference to a contiguous single-stranded sequence of a probe,
oligonucleotide, oligonucleotide analog, or daughter strand that is
complementary to, and thus "corresponds to", all or a portion of a
target nucleic acid sequence. The complementary sequence of a probe
can be said to correspond to its target. Unless otherwise stated,
both the complementary sequence of the probe and the complementary
sequence of the target are individually contiguous sequences.
[0342] "Nuclease-resistant" refers to a bond that is resistant
to a nuclease enzyme under conditions where a DNA or RNA
phosphodiester bond will generally be cleaved. Nuclease enzymes
include, but are not limited to, DNase I, Exonuclease III, Mung
Bean Nuclease, RNase I, and RNase H. One skilled in this field can
readily evaluate the relative nuclease resistance of a given
bond.
[0343] "Ligase" is an enzyme generally for joining 3'-OH
5'-monophosphate nucleotides, oligomers, and their analogs. Ligases
include, but are not limited to, NAD.sup.+-dependent ligases
including tRNA ligase, Taq DNA ligase, Thermus filiformis DNA
ligase, Escherichia coli DNA ligase, Tth DNA ligase, Thermus
scotoductus DNA ligase, thermostable ligase, Ampligase thermostable
DNA ligase, VanC-type ligase, 9.degree.N DNA Ligase, Tsp DNA
ligase, and novel ligases discovered by bioprospecting. Ligases
also include, but are not limited to, ATP-dependent ligases
including T4 RNA ligase, T4 DNA ligase, T7 DNA ligase, Pfu DNA
ligase, DNA ligase I, DNA ligase III, DNA ligase IV, and novel
ligases discovered by bioprospecting. These ligases include
wild-type, mutant isoforms, and genetically engineered
variants.
[0344] "Polymerase" is an enzyme generally for joining 3'-OH
5'-triphosphate nucleotides, oligomers, and their analogs.
Polymerases include, but are not limited to, DNA-dependent DNA
polymerases, DNA-dependent RNA polymerases, RNA-dependent DNA
polymerases, RNA-dependent RNA polymerases, T7 DNA polymerase, T3
DNA polymerase, T4 DNA polymerase, T7 RNA polymerase, T3 RNA
polymerase, SP6 RNA polymerase, DNA polymerase I, Klenow fragment,
Thermus aquaticus DNA polymerase, Tth DNA polymerase,
VentR.RTM. DNA polymerase (New England Biolabs), Deep VentR.RTM.
DNA polymerase (New England Biolabs), Bst DNA Polymerase Large
Fragment, Stoffel Fragment, 9.degree.N DNA Polymerase, Pfu DNA
Polymerase, Tfl DNA Polymerase, RepliPHI Phi29 Polymerase, Tli DNA
polymerase,
eukaryotic DNA polymerase beta, telomerase, Therminator.TM.
polymerase (New England Biolabs), KOD HiFi.TM. DNA polymerase
(Novagen), KOD1 DNA polymerase, Q-beta replicase, terminal
transferase, AMV reverse transcriptase, M-MLV reverse
transcriptase, Phi6 reverse transcriptase, HIV-1 reverse
transcriptase, novel polymerases discovered by bioprospecting, and
polymerases cited in US 2007/0048748, U.S. Pat. No. 6,329,178, U.S.
Pat. No. 6,602,695, and U.S. Pat. No. 6,395,524 (incorporated by
reference). These polymerases include wild-type, mutant isoforms,
and genetically engineered variants.
[0345] "Encode" and "parse" are verbs that refer to transferring
information from one format to another; here, they refer to
transferring the genetic information of the target template base
sequence into an arrangement of reporters.
[0346] "Extragenetic" refers to any structure in the daughter
strand that is not part of the primary backbone; for example, an
extragenetic reporter is not the nucleobase itself that lies in the
primary backbone.
[0347] "Hetero-copolymer" is a material formed by combining
differing units (e.g., monomer subunit species) into chains of a
"copolymer". Hetero-copolymers are built from discrete "subunit"
constructs. A "subunit" is a region of a polymer composed of a
well-defined motif, where each motif is a species and carries
genetic information. The term hetero-copolymer is also used herein
to describe a polymer in which all the blocks are blocks
constructed of repeating motifs, each motif having species-specific
elements. The daughter strand and the Xpandomer are both
hetero-copolymers whereby each subunit motif encodes 1 or more
bases of the target template sequence and the entire target
sequence is defined further with the sequence of motifs.
[0348] "Solid support" or "solid substrate" is a solid material
having a surface for attachment of molecules, compounds, cells, or
other entities. The surface of a solid support can be flat or not
flat. A solid support can be porous or non-porous. A solid support
can be a chip or array that comprises a surface, and that may
comprise glass, silicon, nylon, polymers, plastics, ceramics, or
metals. A solid support can also be a membrane, such as a nylon,
nitrocellulose, or polymeric membrane, or a plate or dish and can
be comprised of glass, ceramics, metals, or plastics, such as, for
example, polystyrene, polypropylene, polycarbonate, or polyallomer.
A solid support can also be a bead, resin or particle of any shape.
Such particles or beads can be comprised of any suitable material,
such as glass or ceramics, and/or one or more polymers, such as,
for example, nylon, polytetrafluoroethylene, TEFLON.TM.,
polystyrene, polyacrylamide, sepharose, agarose, cellulose,
cellulose derivatives, or dextran, and/or can comprise metals,
particularly paramagnetic metals, such as iron. Solid supports may
be flexible, for example, a polyethylene terephthalate (PET)
film.
[0349] "Reversibly blocking" or "terminator" refers to a chemical
group that when bound to a second chemical group on a moiety
prevents the second chemical group from entering into particular
chemical reactions. A wide range of protecting groups are known in
synthetic organic and bioorganic chemistry that are suitable for
particular chemical groups and are compatible with particular
chemical processes, meaning that they will protect particular
groups during those processes and may be subsequently removed or
modified (see, e.g., Metzker et al. Nucleic Acids Res., 22(20):
4259, 1994).
[0350] "Linker" is a molecule or moiety that joins two molecules or
moieties, and provides spacing between the two molecules or
moieties such that they are able to function in their intended
manner. For example, a linker can comprise a diamine hydrocarbon
chain that is covalently bound through a reactive group on one end
to an oligonucleotide analog molecule and through a reactive group
on another end to a solid support, such as, for example, a bead
surface. Coupling of linkers to nucleotides and substrate
constructs of interest can be accomplished through the use of
coupling reagents that are known in the art (see, e.g., Efimov et
al., Nucleic Acids Res. 27: 4416-4426, 1999). Methods of
derivatizing and coupling organic molecules are well known in the
arts of organic and bioorganic chemistry. A linker may also be
cleavable or reversible.
[0351] "Detector construct" is an apparatus used for detection of
the surrogate polymers. Detector constructs include any element
necessary for detection of the surrogate polymers, and generally
comprise at least one detector element. The detector element is
capable of detecting the reporter elements of the surrogate
polymers. Examples of detector elements include, but are not
limited to, a nanopore channel, fluorescence detectors, UV
detectors, chemical and electrochemical detectors, photoelectric
detectors, and the like.
[0352] "Gating construct" is an apparatus used for controlling the
flow of surrogate polymers. Gating constructs include all elements
necessary to control the flow of surrogate polymers, and generally
comprise at least one gating element. Examples of gating elements
include nanoholes, and porous membranes, such as an aluminum oxide
porous membrane.
[0353] "Paired-end surrogate polymer" or "paired-end daughter
strand" both refer to a surrogate polymer or daughter strand
produced by a bidirectional template-directed synthesis. A rolling
circle polymerization process is an exemplary method for making a
"paired-end surrogate polymer" or "paired-end daughter strand."
[0354] The term "reading", within the context of reading a reporter
element or reporter construct, means identifying the reporter
element or reporter construct. The identity of the reporter element
or reporter construct can then be used to decode the genetic
information of the target nucleic acid.
[0355] An "addressable, cleavable linkage" is a cleavable linkage
whose location is known and can be individually targeted for
cleavage.
[0356] A "fluorophore" is a fluorescent molecule or a component of
a molecule that causes the molecule to be fluorescent. Fluorescein
is a non-limiting example of a fluorophore.
General Overview
[0357] In general terms, methods and corresponding devices and
products are described for replicating single-molecule target
nucleic acids. Such methods utilize "Xpandomers" and "S-Xpandomers"
(collectively referred to herein as "surrogate polymers") which
permit sequencing of the target nucleic acid with increased
throughput and accuracy. A surrogate polymer encodes (parses) the
nucleotide sequence data of the target nucleic acid in a linearly
expanded format, thereby improving spatial resolution, optionally
with amplification of signal strength. These processes are referred
to herein as "Sequencing by Expansion" or "SBX".
[0358] Sequencing by expansion enables low cost, high throughput
detection methods by providing sequence targets that: (a) have high
signal-to-noise reporters engineered for the detection method; (b)
require no concurrent chemistry with detection; and/or (c) are
engineered to the resolution requirements of the instrument. These
surrogates enable high fidelity read lengths >100 bases which
reduce post processing costs. SBX is disclosed in greater detail in
Published PCT WO2008/157696, which is hereby incorporated by
reference in its entirety.
[0359] More specifically, SBX can be solution-based with reagent
costs below US$15 per 100 Gigabases of surrogate suitable for
sequence reads. It converts DNA fragments >100 bases long into
longer surrogate molecules called Xpandomers or S-Xpandomers
(surrogate polymers). The sequential measurement of DNA bases is
rescaled from discerning small molecular differences between bases
that are spaced apart by .about.4 .ANG. to differentiating
responses of large 100 .ANG. reporters that are spaced apart by
>100 .ANG.. SBX preparation of DNA reduces the resolution
requirements and increases the signal-to-noise for any detection
methods that measure DNA directly, and provides many new
measurement methods for sequencing applications.
[0360] SBX processes for synthesizing surrogate polymers are
disclosed in more detail below and generally include polymerase and
enzymatic or chemical ligation to sequentially link probes in the
formation of surrogate polymers. For purpose of illustration, the
processes described herein are enzymatic ligation processes.
However, it should be understood that the procedures disclosed
herein can be readily adapted for Xpandomers created by other SBX
processes as described in WO2008/157696.
Sequencing Methods
[0361] As shown in FIG. 1A, native duplex nucleic acids have an
extremely compact linear data density; about a 3.4 .ANG.
center-to-center separation between sequential stacked bases (2) of
each strand of the double helix (1), and are therefore tremendously
difficult to directly image or sequence with any accuracy and
speed. When the double-stranded form is denatured to form single
stranded polynucleotides (3,4), the resulting base-to-base
separation distances are similar, but the problem becomes
compounded by domains of secondary structure.
[0362] As shown in FIG. 1B, surrogate polymer (5), here illustrated
as a concatenation of short oligomers (6,7) held together by
extragenetic tethers T (8,9), is a synthetic replacement or
"surrogate" for the nucleic acid target to be sequenced. Bases
complementary to the template are incorporated into the surrogate
polymer, and the regularly spaced tethers serve to increase the
distance between the short oligomers (here each shown with four
nucleobases depicted by circles). The surrogate polymer is made by
a process in which a synthetic duplex intermediate is first formed
by replicating a template strand. The daughter strand is unique in
that it has both a linear backbone formed by the oligomers and a
constrained surrogate polymer backbone comprised of folded tethers.
The tethers are then opened up or "expanded" to transform the
product into a chain of elongated tethers. Figuratively, the
daughter strand can be viewed as having two superimposed backbones:
one linear (primary backbone) and the other with "accordion" folds
(constrained surrogate polymer). Selective cleavage of bonds in the
daughter strand allows the accordion folds to expand to produce the
surrogate polymer product. This process will be explained in more
detail below, but it should be noted that the choice of four
nucleobases per oligomer and particulars of the tether as shown in
FIG. 1B is for purpose of illustration only, and in no way should
be construed to limit the invention. It should also be noted that
for purposes of illustration only, reporter elements are not shown
in the surrogate polymer depicted in FIG. 1B.
[0363] The separation distance "D" between neighboring oligomers in
the surrogate polymer is a process-dependent variable and is
determined by the length of the tether T. As will be shown, the
length of the tether T is designed into the substrate constructs,
the building blocks from which the surrogate polymer is made. The
separation distance D can be selected to be greater than 0.5 nm, or
greater than 2 nm, or greater than 5 nm, or greater than 10 nm, or
greater than 50 nm, for example. As the separation distance
increases, the process of discriminating or "resolving" the
individual oligomers becomes progressively easier. This would also
be true if, instead of oligomers, individual nucleobases of another
surrogate polymer species were strung together on a chain of
tethers.
[0364] Referring again to FIG. 1A, native DNA replicates by a
process of semi-conservative replication; each new DNA molecule is
a "duplex" of a template strand (3) and a native daughter strand
(4). The sequence information is passed from the template to the
native daughter strand by a process of "template-directed
synthesis" that preserves the genetic information inherent in the
sequence of the base pairs. The native daughter strand in turn
becomes a template for a next generation native daughter strand,
and so forth. Surrogate polymers are formed by a similar process of
template-directed synthesis, which can be an enzymatic or a
chemical coupling process. However, unlike native DNA, once formed,
surrogate polymers cannot be replicated by a biological process of
semi-conservative replication and are not suitable for
amplification by processes such as PCR. The surrogate polymer
product is designed to limit unwanted secondary structure.
[0365] FIGS. 2A through 2D show representative surrogate polymer
substrates (20,21,22,23). These are the building blocks from which
surrogate polymers are synthesized. Other exemplary surrogate
polymer substrates are addressed in subsequent sections. The
surrogate polymer substrate constructs shown here have two
functional components; namely, a probe member (10) and a "tether"
member (11) in a loop configuration. The loop forms the elongated
tether "T" of the final product. Solely for convenience in
explanation, the probe member is again depicted with four
nucleobase residues (14,15,16,17) as shown in FIG. 2B.
[0366] These substrate constructs can be end modified with
R-groups, for example a 5'-monophosphate, 3'-OH suitable for use
with a ligase (herein termed an "Xprobe" or "S-Xprobe") or as a
5'-triphosphate, 3'-OH suitable for use with a polymerase (herein
termed an "Xmer" or "S-Xmer"). Other R groups may be of use in
various protocols. In the first example shown in FIG. 3B, we
present the synthesis of a surrogate polymer from a template strand
of a target nucleic acid by a ligase-dependent process.
[0367] The four nucleobase residues (14,15,16,17) of the probe
member (10) are selected to be complementary to a contiguous
sequence of four nucleotides of the template. Each "probe" is thus
designed to hybridize with the template at a complementary sequence
of four nucleotides. By supplying a library of many such probe
sequences, a contiguous complementary replica of the template can
be formed. This daughter strand is termed an "Xpandomer
intermediate" or "S-Xpandomer intermediate". The intermediates have
duplex or single-stranded forms.
[0368] The tether loop is joined to the probe member (10) at the
second and third nucleobase residues (15,16). The second and third
nucleobase residues (15,16) are also joined to each other by a
"selectively cleavable bond" (25) depicted by a "V". Cleavage of
this cleavable bond enables the tether loop to expand. The
linearized tether can be said to "bridge" the selectively cleavable
bond site of the primary polynucleotide backbone of a daughter
strand. Cleaving these bonds breaks up the primary backbone and
forms the longer Xpandomer.
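The expansion step described above can be sketched in a few lines of
Python. This is only an illustrative model, not the disclosed
chemistry: the function names, the "~T~" tether symbol, and the
10 nm tether length are assumptions introduced here (the 0.34 nm
base spacing follows the approximately 3.4 Angstrom figure given for
FIG. 1A). Each four-base probe is split at the selectively cleavable
bond between its second and third residues, and the tether bridges
the two halves.

```python
TETHER = "~T~"  # hypothetical placeholder for one linearized tether

def expand(probes, cut=2):
    """Linear order of elements after cleaving every probe at `cut`.

    Each probe is split into P1 (before the cleavable bond) and P2
    (after it). Ligation keeps P2 of one subunit joined to P1 of the
    next, so the expanded chain alternates oligomers and tethers.
    """
    if not probes:
        return []
    segments = [probes[0][:cut]]
    for i, probe in enumerate(probes):
        segments.append(TETHER)
        tail = probe[cut:]
        if i + 1 < len(probes):
            tail += probes[i + 1][:cut]
        segments.append(tail)
    return segments

def contour_length_nm(segments, base_nm=0.34, tether_nm=10.0):
    """Rough end-to-end length; both spacings are illustrative values."""
    return sum(tether_nm if s == TETHER else base_nm * len(s)
               for s in segments)

daughter = ["TGCA", "CCGT"]  # two ligated 4-base probes
chain = expand(daughter)
print(chain)  # oligomers separated by tethers
print(contour_length_nm(["".join(daughter)]), "nm before expansion")
print(contour_length_nm(chain), "nm after expansion")
```

In this toy model the interior oligomers of the expanded chain again
contain four bases each (the second half of one probe joined to the
first half of the next), consistent with the depiction in FIG. 1B.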
[0369] Selective cleavage of the selectively cleavable bonds (25)
can be done in a variety of ways including, but not limited to,
chemical cleavage of phosphorothiolate bonds, ribonuclease
digestion of ribosyl 5'-3' phosphodiester linkages, cleavage of
photocleavable bonds, and the like, as discussed in greater detail
below.
[0370] FIGS. 2A through 2D represent exemplary embodiments of
S-Xpandomers and Xpandomers. As mentioned above, an Xpandomer
comprises probes which further comprise one or more reporter
elements for parsing the entire genetic code of the probe, and an
S-Xpandomer comprises probes which further comprise one or more
reporter elements for parsing less than the entire genetic code of
the probe. Any representation throughout the figures of one or more
reporter elements attached to an Xprobe, Xmer, S-Xprobe or S-Xmer, or
a surrogate polymer derived therefrom, is for exemplary purposes
and, unless the content clearly dictates otherwise, is not meant to
indicate the amount of genetic information contained within the
probe or surrogate polymer (i.e. the exemplary surrogate polymers
and components thereof represented in the figures represent both
S-Xpandomers and Xpandomers and their respective components unless
clearly stated otherwise).
[0371] The substrate construct (20) shown in FIG. 2A has a single
tether segment, represented here by an ellipse (26), for attachment
of reporter elements. This segment is flanked with spacer tether
segments (12,13), all of which collectively form the tether
construct. One or more dendrimers, polymers, branched polymers, or
combinations thereof can be used, for example, to
construct the tether segment. For the substrate construct (21) of
FIG. 2B, the tether construct is composed of three tether segments
for attachment of reporter elements (27,28,29), each of which is
flanked with a spacer tether segment. The combination of reporter
elements collectively forms a "reporter construct" that produces a
unique digital reporter code (for probe sequence identification).
These reporter elements include, but are not limited to,
fluorophores, FRET tags, beads, ligands, aptamers, peptides,
haptens, oligomers, polynucleotides, dendrimers, stem-loop
structures, affinity labels, mass tags, and the like. The tether
loop (11) of the substrate construct (22) in FIG. 2C is "naked".
The genetic information encoded in this construct is not encoded on
the tether, but is associated with the probe (10), for example, in
the form of one or more tagged nucleotides. The substrate construct
(23) of FIG. 2D illustrates the general principle: as indicated by
the asterisk (*), the sequence information of the probe is encoded
or "parsed" in the substrate construct in a modified form more
readily detected in a sequencing protocol. Because the sequence
data is physically better resolved after cleavage of the
selectively cleavable bond (25) to form the linearly elongated
surrogate polymer, the asterisk (*) represents any form of encoded
genetic information for which this is a benefit. The bioinformatic
element or elements (*) of the substrate construct, whatever their
form, can be detectable directly or can be precursors to which
detectable elements are added in a post-assembly labeling step. In
some instances, the genetic information is encoded in a molecular
property of the substrate construct itself, for example a
multi-state mass tag. In other instances, the genetic information
is encoded by one or more fluorophores of FRET donor:acceptor
pairs, or a nanomolecular barcode, or a ligand or combination of
ligands, or in the form of some other labeling technique drawn from
the art. Various embodiments will be discussed in more detail
below.
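As a rough illustration of how a reporter construct with several
tether segments could yield a unique digital reporter code per probe,
the following Python sketch assigns each probe sequence a distinct
tuple of reporter states, one state per segment, and builds the
reverse lookup used for decoding. The function name, the integer
"states", and the assignment order are hypothetical and are not taken
from the disclosure; the sketch only shows that three segments need
at least seven distinguishable states each to cover all 256 four-base
probes.

```python
from itertools import product

def build_reporter_codes(probes, n_segments):
    """Assign each probe a distinct tuple of reporter states.

    Returns the smallest number of states per segment that gives
    every probe a unique code, plus encode and decode lookup tables.
    """
    if n_segments < 1:
        raise ValueError("need at least one reporter segment")
    states = 1
    while states ** n_segments < len(probes):
        states += 1
    codes = product(range(states), repeat=n_segments)
    encode = dict(zip(probes, codes))
    decode = {code: probe for probe, code in encode.items()}
    return states, encode, decode

# All 4-base probes over A, T, C, G, encoded on three reporter segments.
probes = ["".join(p) for p in product("ATCG", repeat=4)]
states, encode, decode = build_reporter_codes(probes, n_segments=3)
print(states)                  # states needed per segment
print(encode["TGCA"])          # reporter code for one probe
print(decode[encode["TGCA"]])  # decoding recovers the probe
```

A real reporter construct would map these abstract states onto
measurable properties of the reporter elements; that mapping is
outside the scope of this sketch.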
[0372] The tether generally serves a number of functions: (1) to
sequentially link, directly or indirectly, to adjacent tethers
forming the surrogate polymer intermediate; (2) to stretch out and
expand to form an elongated chain of tethers upon cleavage of
selected bonds in the primary backbone or within the tether (see
FIG. 1B); and/or (3) to provide a molecular construct for
incorporating reporter elements, also termed "tags" or "labels",
that encode the nucleobase residue sequence information of its
associated substrate. The tether can be designed to optimize the
encoding function by adjusting spatial separations, abundance,
informational density, and signal strength of its constituent
reporter elements. A broad range of reporter properties are useful
for amplifying the signal strength of the genetic information
encoded within the substrate construct. The literature directed to
reporters, molecular bar codes, affinity binding, molecular tagging
and other reporter elements is well known to one skilled in this
field.
[0373] It can be seen that if each probe of a substrate
construct contains x nucleobases, then a library representing all
possible sequential combinations of x nucleobases would contain
4.sup.x probes (when selecting the nucleobases from A, T, C or G).
Fewer or more combinations can be needed if other bases are used.
These substrate libraries are designed so that each substrate
construct contains (1) a probe (or at least one nucleobase residue)
complementary to any one of the possible target sequences of the
nucleic acid to be sequenced and (2) a unique reporter construct
that encodes the identity or partial identity of the target
sequence which that particular probe (or nucleobase) is
complementary to. A library of probes containing two nucleobases
would have 16 unique members; a library of probes containing three
nucleobases would have 64 unique members, and so forth. A
representative library would have the four individual nucleobases
themselves, but configured to accommodate a tethering means.
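The library sizes stated above follow directly from enumerating every
sequence of x nucleobases. The short Python sketch below (the
function name is illustrative only) lists such a library and confirms
the 4.sup.x count for small x.

```python
from itertools import product

BASES = "ATCG"  # the four nucleobases named in the text

def probe_library(x):
    """Return every possible probe sequence of length x over BASES."""
    return ["".join(p) for p in product(BASES, repeat=x)]

for x in (1, 2, 3, 4):
    print(x, len(probe_library(x)))  # library size equals 4**x
```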
[0374] An exemplary synthesis of an Xpandomer is illustrated in
FIGS. 3A through 3C. The substrate depicted here is an Xprobe and
the method can be described as hybridization with primer-dependent
processive ligation in free solution. S-Xpandomers can be
synthesized in an analogous manner.
[0375] Many well known molecular biological protocols, such as
protocols for fragmenting the target DNA and ligating end adaptors,
can be adapted for use in sequencing methods and are used here to
prepare the target DNA (30) for sequencing.
[0376] Here we illustrate, in broad terms that would be
familiar to those skilled in the art, processes for polishing the
ends of the fragments and blunt-ended ligation of adaptors (31,32)
designed for use with sequencing primers. These actions are shown
in Step I of FIG. 3A. In Steps II and III, the target nucleic acid
is denatured and annealed with suitable primers (33) complementary
to the adaptors.
[0377] In FIG. 3B, the primed template strand of Step III is
contacted with a library of substrate constructs (36) and ligase
(L), and in Step IV conditions are adjusted to favor hybridization
followed by ligation at a free 3'-OH of a primer-template duplex.
Optionally in Step V the ligase dissociates, and in Steps VI and
VII, the process of hybridization and ligation can be recognized to
result in extension by cumulative addition of substrates (37,38) to
the primer end. Although priming can occur from adaptors at both
ends of a single stranded template, the growth of a nascent
Xpandomer daughter strand is shown here to proceed from a single
primer, solely for simplicity. Extension of the daughter strand is
represented in Steps VI and VII, which are continuously repeated
(incrementally, without interruption). These reactions occur in
free solution and proceed until a sufficient amount of product has
been synthesized. In Step VIII, formation of a completed Xpandomer
intermediate (39) is shown.
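The cumulative extension of Steps VI and VII can be modeled, in a
highly simplified way, as repeated selection of the probe
complementary to the next unpaired stretch of template. The Python
sketch below is an assumption-laden toy: the function name is
hypothetical, the template is written 3'->5' so that the daughter
strand reads left to right, mismatches and ligase kinetics are
ignored, and extension simply stops when fewer than four template
bases remain.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def extend_daughter(template, primer_len, probe_len=4):
    """Add complementary probes one after another from the primer.

    `template` is written 3'->5', so the daughter strand (5'->3') is
    the base-by-base complement read in the same left-to-right order.
    """
    probes = []
    pos = primer_len  # first template base not covered by the primer
    while pos + probe_len <= len(template):
        window = template[pos:pos + probe_len]
        probe = "".join(COMPLEMENT[b] for b in window)  # hybridization
        probes.append(probe)                            # ligation
        pos += probe_len                                # new 3' end
    return probes

# A 4-base primer site followed by eight template bases.
print(extend_daughter("TTTTACGTGGCA", primer_len=4))
```

The returned list is the ordered set of probes making up the daughter
strand of the intermediate, prior to cleavage and expansion.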
[0378] Relatively long lengths of contiguous nucleotide sequence
can be efficiently replicated in this manner to form Xpandomer
intermediates (and S-Xpandomer intermediates analogously). It can
be seen that continuous read lengths ("contigs") corresponding to
long template strand fragments can be achieved with this
technology. It will be apparent to one skilled in the art that
billions of these single molecule SBX reactions can be done
simultaneously in an efficient batch process in a single tube.
Subsequently, the shotgun products of these syntheses can be
sequenced.
[0379] In FIG. 3C, the next steps of the SBX process are depicted.
Step IX shows denaturation of the duplex Xpandomer intermediate
followed by cleavage of selectively cleavable bonds in the
backbone, with the selectively cleavable bonds designed so that the
tether loops "open up", forming the linearly elongated Xpandomer
product (34). Such selective cleavage may be achieved by any number
of techniques known to one skilled in the art, including, but not
limited to, phosphorothiolate cleavage with metal cations as
disclosed by Mag et al. ("Synthesis and selective cleavage of an
oligodeoxynucleotide containing a bridged internucleotide
5'-phosphorothioate linkage", Nucleic Acids Research
19(7):1437-1441, 1991), acid catalyzed cleavage of phosphoramidate
as disclosed by Mag et al. ("Synthesis and selective cleavage of
oligodeoxyribonucleotides containing non-chiral internucleotide
phosphoramidate linkages", Nucleic Acids Research 17(15):
5973-5988, 1989), selective nuclease cleavage of phosphodiester
linkages as disclosed by Gut et al. ("A novel procedure for
efficient genotyping of single nucleotide polymorphisms", Nucleic
Acids Research 28(5): E13, 2000) and separately by Eckstein et al.
("Inhibition of restriction endonuclease hydrolysis by
phosphorothioate-containing DNA", Nucleic Acids Research, 25;
17(22): 9495, 1989), and selective cleavage of photocleavable
linker modified phosphodiester backbone as disclosed by Sauer et
al. ("MALDI mass spectrometry analysis of single nucleotide
polymorphisms by photocleavage and charge-tagging", Nucleic Acids
Research 31(11): e63, 2003), Vallone et al. ("Genotyping SNPs using a
UV-photocleavable oligonucleotide in MALDI-TOF MS", Methods Mol.
Bio. 297:169-78, 2005), and Ordoukhanian et al. ("Design and
synthesis of a versatile photocleavable DNA building block,
application to phototriggered hybridization", J. Am. Chem. Soc.
117, 9570-9571, 1995).
[0380] Refinements of the basic process, such as wash steps and
adjustment of conditions of stringency are well within the skill of
an experienced molecular biologist. Variants on this process
include, for example, immobilization and parsing of the target
strands, stretching and other techniques to reduce secondary
structure. Methods for preparation of Xpandomers are described in
greater detail in Published PCT WO 2008/157696, which is hereby
incorporated by reference. One skilled in the art will understand
that the methods described herein, and in Published PCT WO
2008/157696, for preparation of Xpandomers are applicable in an
analogous manner to preparation of S-Xpandomers.
[0381] The surrogate polymers comprise a plurality of subunits
coupled in a sequence corresponding to a contiguous nucleotide
sequence of all or a portion of the target nucleic acid. In one
embodiment, the surrogate polymers may be represented by the
following structures:
##STR00021##
[0382] wherein
[0383] T represents the tether;
[0384] P.sup.1 represents a first probe moiety;
[0385] P.sup.2 represents a second probe moiety;
[0386] .kappa. represents the .kappa..sup.th subunit in a chain of m
subunits, where m is an integer greater than three; and
[0387] .alpha. represents a species of a subunit motif selected
from a library of subunit motifs, wherein each of the species
comprises sequence information of the contiguous nucleotide
sequence of a portion of the target nucleic acid;
##STR00022##
[0388] wherein
[0389] T represents the tether;
[0390] P.sup.1 represents a first probe moiety;
[0391] P.sup.2 represents a second probe moiety;
[0392] .kappa. represents the .kappa..sup.th subunit in a chain of m
subunits, where m is an integer greater than three;
[0393] .alpha. represents a species of a subunit motif selected
from a library of subunit motifs, wherein each of the species
comprises sequence information of the contiguous nucleotide
sequence of a portion of the target nucleic acid; and
[0394] .chi. represents a bond with the tether of an adjacent
subunit;
##STR00023##
[0395] wherein
[0396] T represents the tether;
[0397] P.sup.1 represents a first probe moiety;
[0398] P.sup.2 represents a second probe moiety;
[0399] .kappa. represents the .kappa..sup.th subunit in a chain of m
subunits, where m is an integer greater than three;
[0400] .alpha. represents a species of a subunit motif selected
from a library of subunit motifs, wherein each of the species
comprises sequence information of the contiguous nucleotide
sequence of a portion of the target nucleic acid; and
[0401] .chi. represents a bond with the tether of an adjacent
subunit;
##STR00024##
[0402] wherein [0403] T represents the tether; [0404] P.sup.1
represents a first probe moiety; [0405] P.sup.2 represents a second
probe moiety; [0406] .kappa. represents the .kappa..sup.th subunit
in a chain of m subunits, where m is an integer greater than three;
[0407] .alpha. represents a species of a subunit motif selected
from a library of subunit motifs, wherein each of the species
comprises sequence information of the contiguous nucleotide
sequence of a portion of the target nucleic acid; and [0408] .chi.
represents a bond with the tether of an adjacent subunit;
##STR00025##
[0409] wherein [0410] T represents the tether; [0411] .kappa.
represents the .kappa..sup.th subunit in a chain of m subunits,
where m is an integer greater than three; [0412] .alpha. represents
a species of a subunit motif selected from a library of subunit
motifs, wherein each of the species comprises sequence information
of the contiguous nucleotide sequence of a portion of the target
nucleic acid; and [0413] .chi. represents a bond with the tether of
an adjacent subunit;
##STR00026##
[0414] wherein [0415] T represents the tether; [0416] N represents
a nucleobase residue; [0417] .kappa. represents the .kappa..sup.th
subunit in a chain of m subunits, where m is an integer greater
than ten; [0418] .alpha. represents a species of a subunit motif
selected from a library of subunit motifs, wherein each of the
species comprises sequence information of the contiguous nucleotide
sequence of a portion of the target nucleic acid; and [0419] .chi.
represents a bond with the tether of an adjacent subunit;
##STR00027##
[0420] wherein [0421] T represents the tether; [0422] .kappa.
represents the .kappa..sup.th subunit in a chain of m subunits,
where m is an integer greater than ten; [0423] .alpha. represents a
species of a subunit motif selected from a library of subunit
motifs, wherein each of the species comprises sequence information
of the contiguous nucleotide sequence of a portion of the target
nucleic acid; and [0424] .chi. represents a bond with the tether of
an adjacent subunit;
##STR00028##
[0425] wherein [0426] T represents the tether; [0427] N represents
a nucleobase residue; [0428] .kappa. represents the .kappa..sup.th
subunit in a chain of m subunits, where m is an integer greater
than ten; [0429] .alpha. represents a species of a subunit motif
selected from a library of subunit motifs, wherein each of the
species comprises sequence information of the contiguous nucleotide
sequence of a portion of the target nucleic acid; and [0430] .chi.
represents a bond with the tether of an adjacent subunit;
##STR00029##
[0431] wherein [0432] T represents the tether; [0433] N represents
a nucleobase residue; [0434] .kappa. represents the .kappa..sup.th
subunit in a chain of m subunits, where m is an integer greater
than ten; [0435] .alpha. represents a species of a subunit motif
selected from a library of subunit motifs, wherein each of the
species comprises sequence information of the contiguous nucleotide
sequence of a portion of the target nucleic acid; [0436]
.chi..sup.1 represents a bond with the tether of an adjacent
subunit; and [0437] .chi..sup.2 represents an inter-tether bond;
or
##STR00030##
[0438] wherein [0439] T represents the tether; [0440] n.sup.1 and
n.sup.2 represent a first portion and a second portion,
respectively, of a nucleobase residue; [0441] .kappa. represents
the .kappa..sup.th subunit in a chain of m subunits, where m is an
integer greater than ten; and [0442] .alpha. represents a species
of a subunit motif selected from a library of subunit motifs,
wherein each of the species comprises sequence information of the
contiguous nucleotide sequence of a portion of the target nucleic
acid.
[0443] In some embodiments, the surrogate polymer daughter strands
may be formed by template-directed synthesis from a plurality of
subunits having the following structure:
##STR00031##
[0444] wherein [0445] T represents the tether; [0446] P.sup.1
represents a first probe moiety; [0447] P.sup.2 represents a second
probe moiety; [0448] .about. represents the at least one
selectively cleavable bond; and [0449] R.sup.1 and R.sup.2
represent the same or different end groups for the template
directed synthesis of the daughter strand;
##STR00032##
[0450] wherein [0451] T represents the tether; [0452] P.sup.1
represents a first probe moiety; [0453] P.sup.2 represents a second
probe moiety; [0454] R.sup.1 and R.sup.2 represent the same or
different end groups for the template directed synthesis of the
daughter strand; [0455] .epsilon. represents a first linker group;
[0456] .delta. represents a second linker group; and [0457] "- - -
-" represents a cleavable intra-tether crosslink;
##STR00033##
[0458] wherein [0459] T represents the tether; [0460] P.sup.1
represents a first probe moiety; [0461] P.sup.2 represents a second
probe moiety; [0462] R.sup.1 and R.sup.2 represent the same or
different end groups for the template directed synthesis of the
daughter strand; [0463] .epsilon. represents a first linker group;
[0464] .delta. represents a second linker group; and [0465] "- - -
-" represents a cleavable intra-tether crosslink;
##STR00034##
[0466] wherein [0467] T represents the tether; [0468] P.sup.1
represents a first probe moiety; [0469] P.sup.2 represents a second
probe moiety; [0470] .about. represents the at least one
selectively cleavable bond; [0471] R.sup.1 and R.sup.2 represent
the same or different end groups for the template directed
synthesis of the daughter strand; [0472] .epsilon. represents a
first linker group; and [0473] .delta. represents a second linker
group;
##STR00035##
[0474] wherein [0475] T represents the tether; [0476] P.sup.1
represents a first probe moiety; [0477] P.sup.2 represents a second
probe moiety; [0478] .about. represents the at least one
selectively cleavable bond; [0479] R.sup.1 and R.sup.2 represent
the same or different end groups for the template directed
synthesis of the daughter strand; [0480] .epsilon. represents a
first linker group; and [0481] .delta. represents a second linker
group;
##STR00036##
[0482] wherein [0483] T represents the tether; [0484] N represents
a nucleobase residue; [0485] R.sup.1 and R.sup.2 represent the same
or different end groups for the template directed synthesis of the
daughter strand; [0486] .epsilon. represents a first linker group;
[0487] .delta. represents a second linker group; and [0488] "- - -
-" represents a cleavable intra-tether crosslink;
##STR00037##
[0489] wherein [0490] T represents the tether; [0491] N represents
a nucleobase residue; [0492] R.sup.1 and R.sup.2 represent the same
or different end groups for the template directed synthesis of the
daughter strand; [0493] .about. represents the at least one
selectively cleavable bond; [0494] .epsilon. represents a first
linker group; [0495] .delta. represents a second linker group; and
[0496] "- - - -" represents a cleavable intra-tether crosslink;
##STR00038##
[0497] wherein [0498] T represents the tether; [0499] N represents
a nucleobase residue; [0500] R.sup.1 and R.sup.2 represent the same
or different end groups for the template directed synthesis of the
daughter strand; [0501] .epsilon. represents a first linker group;
[0502] .delta. represents a second linker group; and [0503] "- - -
-" represents a cleavable intra-tether crosslink;
##STR00039##
[0504] wherein [0505] T represents the tether; [0506] N represents
a nucleobase residue; [0507] R.sup.1 and R.sup.2 represent the same
or different end groups for the template directed synthesis of the
daughter strand; [0508] .epsilon..sub.1 and .epsilon..sub.2
represent the same or different first linker groups; [0509]
.delta..sub.1 and .delta..sub.2 represent the same or different
second linker groups; and [0510] "- - - -" represents a cleavable
intra-tether crosslink; or
##STR00040##
[0511] wherein [0512] T represents the tether; [0513] N represents
a nucleobase residue; [0514] V represents an internal cleavage site
of the nucleobase residue; and [0515] R.sup.1 and R.sup.2 represent
the same or different end groups for the template directed
synthesis of the daughter strand.
[0516] R.sup.1 and R.sup.2 are end groups configured as appropriate
for the synthesis protocol in which the subunit is used. For
example, R.sup.1=5'-phosphate and R.sup.2=3'-OH, would find use in
a ligation protocol, and R.sup.1=5'-triphosphate and R.sup.2=3'-OH
for a polymerase protocol. Optionally, R.sup.2 can be configured
with a reversible blocking group for cyclical single-substrate
addition. Alternatively, R.sup.1 and R.sup.2 can be configured with
linker end groups for chemical coupling or with no linker groups
for a hybridization only protocol. R.sup.1 and R.sup.2 can be of
the general type XR, wherein X is a linking group and R is a
functional group.
[0517] Other exemplary surrogate polymer and surrogate polymer
daughter strands are disclosed in greater detail in Published PCT
WO 2008/157696.
[0518] In one embodiment, the reporter constructs are attached to
the probe or nucleobase by a polymer tether. In other embodiments,
the tether is not associated with the reporter constructs. The
tethers can be constructed of one or more durable, aqueous- or
solvent-soluble polymers including, but not limited to, the
following segment or segments: polyethylene glycols, polyglycols,
polypyridines, polyisocyanides, polyisocyanates,
poly(triarylmethyl) methacrylates, polyaldehydes, polypyrrolinones,
polyureas, polyglycol phosphodiesters, polyacrylates,
polymethacrylates, polyacrylamides, polyvinyl esters,
polystyrenes, polyamides, polyurethanes, polycarbonates,
polybutyrates, polybutadienes, polybutyrolactones,
polypyrrolidinones, polyvinylphosphonates, polyacetamides,
polysaccharides, polyhyaluranates, polyamides, polyimides,
polyesters, polyethylenes, polypropylenes, polystyrenes,
polycarbonates, polyterephthalates, polysilanes, polyurethanes,
polyethers, polyamino acids, polyglycines, polyprolines,
N-substituted polylysine, polypeptides, side-chain N-substituted
peptides, poly-N-substituted glycine, peptoids, side-chain
carboxyl-substituted peptides, homopeptides, oligonucleotides,
ribonucleic acid oligonucleotides, deoxynucleic acid
oligonucleotides, oligonucleotides modified to prevent Watson-Crick
base pairing, oligonucleotide analogs, polycytidylic acid,
polyadenylic acid, polyuridylic acid, polythymidine, polyphosphate,
polynucleotides, polyribonucleotides, polyethylene
glycol-phosphodiesters, peptide polynucleotide analogues,
threosyl-polynucleotide analogues, glycol-polynucleotide analogues,
morpholino-polynucleotide analogues, locked nucleotide oligomer
analogues, polypeptide analogues, branched polymers, comb polymers,
star polymers, dendritic polymers, random, gradient and block
copolymers, anionic polymers, cationic polymers, polymers forming
stem-loops, rigid segments and flexible segments. Such polymers can
be circularized at attachment points on a substrate construct.
[0519] The tether is generally resistant to entanglement or is
folded so as to be compact. Polyethylene glycol (PEG), polyethylene
oxide (PEO), methoxypolyethylene glycol (mPEG), and a wide variety
of similarly constructed PEG derivatives (PEGs) are broadly
available polymers that can be utilized in the practice of this
invention. Modified PEGs are available with a variety of
bifunctional and heterobifunctional end crosslinkers and are
synthesized in a broad range of lengths. PEGs are generally soluble
in water, methanol, benzene, dichloromethane, and many common
organic solvents. PEGs are generally flexible polymers that
typically do not non-specifically interact with biological
chemicals.
[0520] Other polymers that may be employed as tethers, and provide
"scaffolding" for reporters, include, for example, poly-glycine,
poly-proline, poly-hydroxyproline, poly-cysteine, poly-serine,
poly-aspartic acid, poly-glutamic acid, and the like. Side chain
functionalities can be used to build functional group-rich
scaffolds for added signal capacity or complexity.
[0521] Reducing the size and mass of the substrate construct can
also be achieved by using unlabeled tethers. By eliminating bulky
reporters (and reporter scaffolding such as dendrimers, which for
some encoding embodiments comprise over 90% of the tether mass),
hybridization and/or coupling kinetics can be enhanced.
Post-assembly tether labeling can then be employed. Reporters are
bound to one or more linkage chemistries that are distributed along
the tether constructs using spatial or combinatorial strategies to
encode the base sequence information. Post-assembly tether labeling
may be particularly advantageous in the context of S-Xpandomers due
to their reduced reporter content.
[0522] As mentioned above, the S-Xpandomers differ from the
Xpandomers in that the reporter construct(s) of the S-Xpandomers
encode only a subset of the probe sequence information. This is
beneficial in some embodiments because it simplifies the probe and
reduces its kinetic load. S-Xprobes are Xprobes that encode less
than all the base sequence information of their probes. For
example, in one embodiment, an S-Xprobe may have one 4-state
reporter that encodes one base (e.g., 5' end base) of its 6-base
probe. When assembled into an S-Xpandomer, the base information is
sampled as discrete intervals along the target. As a result,
multiple S-Xpandomers that are frame shifted with respect to the
base position are required to encode the entire target nucleic acid
sequence. Rolling circle polymerization is an exemplary method of
producing all the required S-Xpandomer sequence.
[0523] FIG. 4 shows a rolling circle polymerization process. In
Step I, a ligation reaction mix (L) is added, and the target DNA
fragment (depicted as the longer duplex) is ligated to a
double-stranded adapter oligomer (depicted as the shorter duplex)
to form a circularized target construct. In Step II, the target is
denatured into a single stranded target and a universal hairpin
primer is hybridized to its complement within the adapter portion
of the single-stranded, circularized target. The hairpin is not
phosphorylated on its 5' end and will only extend from its primer
end. In Step III, a polymerase reaction mix (P) is added.
Polymerase extension proceeds and extends from the 3' end of the
universal primer. In Step IV, polymerization of the available
nucleic acid monomers has extended the nascent 3' end around the
circularized template and displaces the hairpin to continue for a
second time around displacing the strand as it advances. Step V
illustrates continuous rolling circle replication. The reaction is
stopped when the product is of sufficient average length.
[0524] After denaturation and purification, the remaining
rolling-circle product has a series of more than R replication
units. A replication unit is the rolling-circle extension product
portion that replicates one loop of the circularized template. The
purified product (i.e. primed DNA) may then be used for an
S-Xpandomer synthesis using a method such as that shown in FIGS. 3B and
3C. In this case S-Xprobes are incorporated from the nascent 5'
end.
[0525] An S-Xpandomer, synthesized from S-Xprobes, which encode for
single bases, will encode for the whole sequence of the
circularized template provided the following condition is met: for
the replication unit length in bases, L, the S-Xprobe probe length
in bases, S, and the number of replication units R, the remainders
of L/S, 2L/S, . . . , RL/S must include the numbers 0, 1, 2, . . .
, S-1. In general when this is satisfied, the minimum R is equal to
S. Each remainder is equivalent to the frame shift (in number of
bases) that occurs in the S-Xprobe position in the subsequent
replication unit for the 1st, 2nd, . . . Rth replication unit
respectively. This is further equivalent to saying that a frame
shift of the S-Xprobe position occurs after each replication unit
and that after R replication units, these frame shifts cause an
S-Xprobe in the S-Xpandomer to have every position relative to a
replication unit reference.
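The frame-shift condition stated above can be checked numerically. The following is a minimal sketch offered for illustration only; the function name covers_all_frames and the example lengths are assumptions that do not appear in the disclosure. It forms the remainders of L, 2L, . . . , RL on division by S and tests whether they include every value from 0 to S-1.

```python
def covers_all_frames(L: int, S: int, R: int) -> bool:
    """Check the remainder condition for full coverage.

    L: replication unit length in bases
    S: S-Xprobe probe length in bases
    R: number of replication units
    (Function and parameter names are illustrative assumptions.)
    """
    # Frame shift of the S-Xprobe position after the 1st, 2nd, ... Rth unit.
    remainders = {(k * L) % S for k in range(1, R + 1)}
    # Every frame 0..S-1 must occur at least once.
    return remainders == set(range(S))
```

Under this sketch, a 1001-base replication unit with S=5 and R=5 satisfies the condition, whereas a 1000-base unit does not, because every replication unit then begins in the same frame. With R=4 the 1001-base case also fails, consistent with the minimum R being equal to S.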
[0526] In an exemplary embodiment, a 5-base S-Xprobe probe is used
to produce S-Xpandomers of .about.1000 base DNA targets. Ignoring
other error sources, for target lengths that have equally
distributed remainders of 0, 1, 2, 3, or 4 when divided by 5 (S=5)
and if R is equal to or greater than 5 then only the case with
remainder zero will not generate S-polymers that encode for the
entire sequence of the target DNA.
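The sampling behavior in this exemplary embodiment can be illustrated with a toy simulation. The sketch below rests on simplifying assumptions that are not part of the disclosure: the rolling-circle product is modeled as R verbatim copies of the template (strand complementarity is ignored), probes tile end to end without gaps or errors, and each probe reports only its first base. The function name is hypothetical.

```python
from typing import Dict, Optional


def reconstruct_from_sampled_bases(template: str, S: int, R: int) -> Optional[str]:
    """Tile S-base probes along R copies of a circular template and
    collect the single base each probe reports (its first base).

    Returns the reconstructed template if every position was sampled,
    otherwise None. Illustrative model only.
    """
    L = len(template)
    product = template * R  # idealized rolling-circle product
    seen: Dict[int, str] = {}
    # Only whole probes are placed; each starts S bases after the last.
    for start in range(0, len(product) - S + 1, S):
        seen[start % L] = product[start]  # position relative to one loop
    if len(seen) < L:
        return None  # some template positions were never sampled
    return "".join(seen[i] for i in range(L))
```

In this model an 11-base template read with S=5 and R=5 is fully recovered, while a 10-base template (remainder zero when divided by 5) yields only two sampled positions and cannot be reconstructed.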
[0527] To increase assembly efficiency of sequence reads in
redundant or low complexity regions of the genome, sequence reads
based upon paired ends may be used. A paired-end read has two read
sequences taken from opposite ends of a long target DNA. By using
the length of the DNA target, the two sequences can reference each
other to assist in their assembly positioning. Paired-end nucleic
acids, including paired-end surrogate polymers, may be produced by
ligating probes bidirectionally from the primer. This process
starts by shearing and filtering target DNA into a narrow length
range thousands of bases long, 10 kb +/- 0.5 kb for example, as
illustrated in FIG. 5A.
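The way a known target length constrains assembly positioning can be sketched as follows. This is an illustrative example only; the function name, the coordinate convention, and the numeric values are assumptions. Each read of the pair may match several candidate positions in a repetitive region, and only placements whose separation agrees with the target length (within the filtered tolerance) are retained. Read length is ignored for simplicity.

```python
from typing import List, Tuple


def consistent_pairs(
    hits_a: List[int],
    hits_b: List[int],
    target_len: int,
    tol: int,
) -> List[Tuple[int, int]]:
    """Keep candidate placements of a paired-end read whose separation
    matches the known target length within +/- tol bases.

    hits_a, hits_b: candidate start coordinates for each end of the pair.
    (Names and conventions are illustrative assumptions.)
    """
    return [
        (a, b)
        for a in hits_a
        for b in hits_b
        if abs(abs(b - a) - target_len) <= tol
    ]
```

For a 10 kb +/- 0.5 kb target, candidate positions [1200, 50300] for one end and [11150, 80000] for the other leave a single consistent placement, (1200, 11150), resolving the ambiguity that either read would have alone.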
[0528] FIG. 5A shows bidirectional synthesis of a paired-end
surrogate polymer. In step I, the DNA targets are blunt-ended,
ligated to a primer adapter, and circularized. The primer adapter
optionally comprises a tether for optional attachment to beads or
other solid substrate. The tether may use a covalent bond along
with a cleavable linkage for this attachment, or may use an oligo
to attach by hybridization. In step II, the circularized product is
denatured. A primer with a 5' phosphate and a 3' OH then duplexes
to the adaptor to initiate ligation bidirectionally. In Step III
the SBX process proceeds and S-Xprobes or Xprobes hybridize and are
ligated (step IV) along the circularized target in both directions
from both the nascent 3' and 5' ends.
[0529] The primer can also be designed in a manner similar to
S-Xprobes and Xprobes to carry information on a tether about the
reaction such as the length range of the target (e.g., 10 kb, 20
kb, 30 kb) or to identify the target itself if there is target
parsing or multiplexing or just to identify the primer relative to
each of the pair of ends. The sequence region of paired-end nucleic
acids (surrogate polymers as well as other nucleic acids) may
contain any number of nucleobase residues. For example, a sequence
region may comprise 10 or more nucleobases, but sequence regions with
fewer bases are also possible.
[0530] In step V, the paired-end surrogate polymer daughter strand
has extended a sufficient number of bases in each direction. The
product is washed and denatured from the target. In step VI the
product is filtered for the higher value longer reads, and cleaved
to open the tethers yielding the paired-end surrogate polymer. This
resulting product encodes the paired-end sequence of the
circularized target.
[0531] FIG. 5B shows generation of a paired-end DNA target in a way
similar to that described above, except that oligo probes are used
instead of S-Xprobes or Xprobes. The product in this case may be
used as the surrogate polymer DNA target, used as a DNA target for
other sequencing methods, or used as the analyte input to a
sequencing method. Referring to FIG. 5B, in step I, the DNA targets
are blunt-ended, ligated to a primer adapter, and circularized. The
primer adapter optionally comprises a tether for optional
attachment to beads or other solid substrate. The tether may use a
covalent bond along with a cleavable linkage for this attachment,
or may use an oligo to attach by hybridization. In step II, the
circularized product is denatured. A primer with a 5' phosphate and
a 3' OH then duplexes to the adaptor to initiate ligation
bidirectionally.
[0532] In Step III the oligo probes hybridize and are ligated (step
IV) along the circularized target in both directions from both the
nascent 3' and 5' ends. In step V, the daughter strand has extended
a sufficient number of bases in each direction. The product is
washed and denatured from the target.
[0533] The paired-end methods described above find utility in
surrogate polymer methods as well as methods employing other
nucleic acids (e.g., DNA). In addition to the above methods,
other variations are possible. For example, the bidirectional
synthesis may proceed from both the 3' and 5' ends of the primer
via ligation reactions. In another exemplary method, the
bidirectional synthesis proceeds from the 5' end of the primer via
a ligation reaction, and extension of the 3' end of the primer
proceeds via a polymerase reaction. One skilled in the art will
recognize that other combinations of the above methods are also
possible.
[0534] The disclosed surrogate polymers may comprise any number of
subunits which may be, for example, greater than 10, greater than
100, or greater than 1000. Further, while the reporter constructs,
C.sup.1, C.sup.2, C.sup.3, C.sup.4, C.sup.5 and C.sup.6, are
depicted above as being joined to the probes, P.sup.1, P.sup.2,
P.sup.3, P.sup.4, P.sup.5 and P.sup.6, by a bond, the reporter
constructs (also referred to herein as reporter elements) may be
joined to the tether or may be a component of the probe or tether
itself, and depiction of the reporter constructs as a separate
linked moiety is for purpose of illustration only.
[0535] The nucleobase residues of the probes may be, for example,
adenine (A), guanine (G), cytosine (C) or thymine (T), or other
heterocyclic base moieties as discussed in greater detail below,
including universal bases. The template-directed synthesis of the
daughter strand may be accomplished by any number of methods,
including techniques involving one or more enzymatic ligations,
polymerase reactions and/or chemical ligations. As noted above, the
daughter strand comprises a plurality of subunits, the number of
which can vary widely, for example, be greater than 30, or greater
than 1000.
[0536] Detection of the disclosed surrogate polymers can be
accomplished by any of a variety of techniques. For example, the
reporter constructs can be detected by passing the surrogate
polymer through a nanopore, by interrogation with an electron beam,
by scanning tunneling microscopy (STM), and/or transmission
electron microscopy (TEM). Other exemplary detection techniques are
described hereinbelow. The nature of the reporter construct will
largely depend upon the detection method employed. The reporter
construct may be joined to at least one nucleobase residue of the
probe by a covalent bond. Alternatively, or in addition, the
reporter construct may be a component of at least one nucleobase
residue of the probe. The reporter construct may also optionally be
associated with or a part of the tether.
[0537] In more specific embodiments, the reporter elements for
parsing the genetic information may be associated with the tethers
of the surrogate polymer, with the surrogate polymer prior to
cleavage of the at least one selectively cleavable bond, and/or
with the surrogate polymer after cleavage of the at least one
selectively cleavable bond. The surrogate polymer may further
comprise all or a portion of the at least one probe or nucleobase
residue, and the reporter elements for parsing the genetic
information may be associated with the at least one probe or
nucleobase residue or may be the probe or nucleobase residues
themselves. Further, the selectively cleavable bond may be a
covalent bond, an intra-tether bond, a bond between or within
probes or nucleobase residues of the daughter strand, and/or a bond
between the probes or nucleobase residues of the daughter strand
and a target template.
[0538] A broad range of suitable commercially available chemistries
(Pierce, Thermo Fisher Scientific, USA) can be adapted for
preparation of the probes comprising selectively cleavable linker
bonds. Common linker chemistries include, for example, NHS-esters
with amines, maleimides with sulfhydryls, imidoesters with amines,
EDC with carboxyls for reactions with amines, pyridyl disulfides
with sulfhydryls, and the like. Other embodiments involve the use
of functional groups like hydrazide (HZ) and 4-formylbenzoate (4FB)
which can then be further reacted to form linkages. More
specifically, a wide range of crosslinkers (hetero- and
homo-bifunctional) are broadly available (Pierce) which include,
but are not limited to, Sulfo-SMCC (Sulfosuccinimidyl
4-[N-maleimidomethyl] cyclohexane-1-carboxylate), SIA
(N-Succinimidyl iodoacetate), Sulfo-EMCS ([N-e-Maleimidocaproyloxy]
sulfosuccinimide ester), Sulfo-GMBS (N-[g-Maleimido
butyryloxy]sulfosuccinimide ester), AMAS (N-(a-Maleimidoacetoxy)
succinimide ester), BMPS (N-[.beta.-Maleimidopropyloxy] succinimide
ester), EMCA (N-e-Maleimidocaproic acid), EDC
(1-Ethyl-3-[3-dimethylaminopropyl]carbodiimide Hydrochloride),
SANPAH (N-Succinimidyl-6-[4'-azido-2'-nitrophenylamino]hexanoate),
SADP (N-Succinimidyl(4-azidophenyl)-1, 3'-dithiopropionate), PMPI
(N-[p-Maleimidophenyl]isocyanate), BMPH (N-[.beta.-Maleimidopropionic
acid] hydrazide, trifluoroacetic acid salt), EMCH
([N-e-Maleimidocaproic acid] hydrazide, trifluoroacetic acid salt),
SANH (succinimidyl 4-hydrazinonicotinate acetone hydrazone), SHTH
(succinimidyl 4-hydrazidoterephthalate hydrochloride), and C6-SFB
(C6-succinimidyl 4-formylbenzoate). Also, the method disclosed by
Letsinger et al. ("Phosphorothioate oligonucleotides having
modified internucleoside linkages", U.S. Pat. No. 6,242,589) can be
adapted to form phosphorothiolate linkages.
[0539] Further, well established protection/deprotection
chemistries are broadly available for common linker moieties
(Benoiton, "Chemistry of Peptide Synthesis", CRC Press, 2005).
Amino protecting groups include, but are not limited to,
9-Fluorenylmethyl carbamate (Fmoc-NRR'), t-Butyl carbamate
(Boc-NRR'), Benzyl carbamate (Z-NRR', Cbz-NRR'), Acetamide,
Trifluoroacetamide, Phthalimide, Benzylamine (Bn-NRR'),
Triphenylmethylamine (Tr-NRR'), Benzylideneamine, and
p-Toluenesulfonamide (Ts-NRR'). Carboxyl protecting groups include,
but are not limited to, Methyl ester, t-Butyl ester, Benzyl ester,
S-t-Butyl ester, and 2-Alkyl-1,3-oxazoline. Carbonyl protecting
groups include, but are not limited to, Dimethyl acetal,
1,3-Dioxane, 1,3-Dithiane, and N,N-Dimethylhydrazone. Hydroxyl
protecting groups include, but are not limited to, Methoxymethyl ether
(MOM-OR), Tetrahydropyranyl ether (THP-OR), t-Butyl ether, Allyl
ether, Benzyl ether (Bn-OR), t-Butyldimethylsilyl ether (TBDMS-OR),
t-Butyldiphenylsilyl ether (TBDPS-OR), Acetic acid ester, Pivalic
acid ester, and Benzoic acid ester.
[0540] While the tether is often depicted as a reporter construct
with three reporter groups, various reporter configurations can be
arrayed on the tether, and can comprise single reporters that
identify probe constituents, single reporters that identify probe
species, molecular barcodes that identify probe species, or the
tether may be naked polymer (having no reporters). In the case of
the naked polymer, the reporters may be the probe itself, or may be
on a second tether attached to the probe. In some cases, one or
more reporter precursors are arrayed on the tether, and reporters
are affinity bound or covalently bound following assembly of the
Xpandomer product.
[0541] In some embodiments, each reporter has a minimum of two
states for encoding the base sequence information. Parity or error
correction information may also be encoded in the reporters. For
example, 9 binary-state reporters could encode the 4-base sequence
(2 bits/base) of the associated probe and use the last reporter to
encode parity of the previous 8 bits. In another example, three
4-state reporters encode for a three-base probe sequence. In yet
further embodiments, template-daughter strand duplexes are
disclosed comprising a daughter strand duplexed with a template
strand, as well as methods for forming the same from the
template strand and the oligomer or monomer substrate
constructs.
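The parity example given above (nine binary-state reporters encoding a four-base probe at 2 bits/base, with the last reporter carrying parity) can be sketched as follows. The particular base-to-bit assignment and the choice of even parity are assumptions made for illustration; the disclosure does not specify either, and the function names are hypothetical.

```python
from typing import List

# Hypothetical 2-bit assignment; any one-to-one mapping would serve.
BASE_TO_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode_probe(probe: str) -> List[int]:
    """Encode a probe as binary reporter states plus one parity reporter."""
    bits = [b for base in probe for b in BASE_TO_BITS[base]]
    parity = sum(bits) % 2  # even parity over the data bits (assumed)
    return bits + [parity]


def decode_reporters(states: List[int]) -> str:
    """Decode reporter states, rejecting reads whose parity does not match."""
    *data, parity = states
    if sum(data) % 2 != parity:
        raise ValueError("parity mismatch: reporter read contains an error")
    pairs = zip(data[0::2], data[1::2])
    return "".join(BITS_TO_BASE[pair] for pair in pairs)
```

A single parity reporter of this kind detects any one misread reporter state in the probe but does not locate or correct it; richer error-correction codes would need additional reporters.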
[0542] In some embodiments, the present disclosure provides a kit
useful for SBX methods. The kit may comprise a plurality of
constructs (i.e., either Xprobes, Xmers, S-Xprobes or S-Xmers with
the appropriate R1/R2 end groups) for forming a daughter strand by
a template-directed synthesis, and may optionally comprise
appropriate instructions for use of the same in forming a daughter
strand. The number of constructs of the kit (which may also be
referred to as a "library" of constructs) will depend upon the
number of nucleobase residues/construct, as well as the number of
universal bases employed as the nucleobase residue(s). For
example, such a kit or library of constructs may contain unique
members numbering, for example, from 10 to 65000, from 50 to 5000,
or from 200 to 1200.
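The dependence of library size on probe length and universal base content can be estimated with simple counting. The sketch below assumes, for illustration only, that every non-universal position may be any of four standard bases and that universal positions add no further members; the function name is hypothetical.

```python
def library_size(n_residues: int, n_universal: int = 0, alphabet: int = 4) -> int:
    """Count unique constructs for a probe of n_residues positions,
    n_universal of which are universal bases (illustrative model)."""
    if not 0 <= n_universal <= n_residues:
        raise ValueError("n_universal must be between 0 and n_residues")
    # Each specific position contributes a factor equal to the alphabet size.
    return alphabet ** (n_residues - n_universal)
```

Under these assumptions a 6-base probe with no universal bases corresponds to 4096 unique members, and a 6-base probe with two universal bases (or a 4-base probe with none) corresponds to 256, values that fall within the exemplary ranges recited above.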
Detection Methods
[0543] Synthesis of surrogate polymers is done to facilitate the
detection and sequencing of nucleic acids, and is applicable to
nucleic acids of all kinds. The process is a method for "expanding"
or "elongating" the length of backbone elements (or subunits)
encoding the sequence, or partial sequence, information (expanded
relative to the small nucleotide-to-nucleotide distances of native
nucleic acids) and optionally also serves to increase signal
intensity (relative to the nearly indistinguishable, low-intensity
signals observed for native nucleotides). As such, the reporter
elements incorporated in the expanded synthetic backbone of the
surrogate polymers can be detected and processed using a variety of
detection methods, including detection methods well known in the
art (for example, a CCD camera, an atomic force microscope, or a
gated mass spectrometer), as well as by methods such as a massively
parallel nanopore sensor array, or a combination of methods.
Detection techniques are selected on the basis of optimal signal to
noise, throughput, cost, and like factors. The detection methods
described herein may optionally employ the fixed and linear array
presentation methods described below. Although often described in
the context of surrogate polymers for exemplary purposes, the
detection methods disclosed herein are equally applicable and
useful for detection of nucleic acids in general.
[0544] One exemplary detection method is the Coulter-like nanopore
process shown in FIG. 6. (It should be noted that these diagrams
are not to scale since reporters are anticipated to be >50 times
longer than the bases.) The much larger scale of the surrogate
polymer reporters relative to native DNA bases enables larger
scale, more readily produced nanopores. As depicted in FIG. 6, the
nanopore connects two reservoirs (40, 41) that are filled with an
aqueous electrolyte solution (typically 1 molar KCl). A potential
is applied between electrodes located in each reservoir and a
current flows through the nanopore (42). Typically, the surrogate
polymer has a negative charge density along its length, and is
drawn into the nanopore and is pulled through (translocated) by
electrophoretic forces. The nanopore current is modulated by
whatever portion of the surrogate polymer lies within the
nanopore channel. In this example, each base type is associated
with a reporter type with a unique molecular structure based upon
size and/or charge distribution. As each reporter passes through
the nanopore, its molecular characteristics alter the current in
time and amplitude so the associated base identity can be
determined. By capturing this current signal, the sequence
information encoded in the sequential reporter constructs is
decoded.
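The decoding step described above can be illustrated with a nearest-level classifier. This is a minimal sketch under stated assumptions: the current trace is taken to be already segmented into one mean current value per reporter, and the level values and their assignment to bases are hypothetical (arbitrary units) rather than taken from the disclosure.

```python
from typing import Dict, Iterable


def decode_levels(event_currents: Iterable[float], level_to_base: Dict[float, str]) -> str:
    """Assign each reporter event to the base whose expected current
    level is closest to the measured value (illustrative only)."""
    levels = list(level_to_base)
    return "".join(
        level_to_base[min(levels, key=lambda lv: abs(lv - current))]
        for current in event_currents
    )
```

For example, with hypothetical expected levels {0.8: "A", 0.6: "C", 0.4: "T", 0.2: "G"}, the measured events [0.79, 0.22, 0.41, 0.63] decode to "AGTC". A practical decoder would also use event duration and a noise model, which this sketch omits.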
[0545] Nanopore technology has the potential to serve as a low cost
approach to high throughput DNA sequencing. For example, single
molecule detection reduces reagent costs and enables long read
lengths, and minimal sample preparation eliminates the costs of
elaborate template processing and amplification. Rapid DNA
translocation rates across the detector (>1 Mbases/s) provide
extremely high throughput potential as well as simple, low cost
single molecule transport. In addition, no chemistry is required
concurrently with the detection process, which increases detection
efficiency and decreases complexity. Finally, implementation is
simple: direct electrical detection is performed with macro scale
electrodes, and low cost instrumentation may be employed that uses
solid state integration to perform both transport and detection of
the DNA sequence.
[0546] Given that ds-DNA is ~2 nm in diameter, reporters
designed with molecular cross-sectional diameters of 2, 2.8, 3.5
and 4 nm are believed to give responses in nanopores similar to
those shown in FIG. 7. More specifically, FIG. 7 shows a depiction
of a nanopore response as 4 different reporters (i.e. reporters
encoding A, C, T, and G) are passed serially through the nanopore.
(Note that these diagrams are not to scale since reporters are
anticipated to be >50 times longer than the bases.)
[0547] The surrogate polymer is expected to have higher mass with
lower average negative charge and thus will run slower than ds-DNA.
Methods to slow the reporter translocation rate include increasing
the reporter length, increasing the reporter mass, decreasing the
reporter charge density, increasing the reagent viscosity, and/or
reducing the translocation potential. Large reporter signals are
expected to provide a signal-to-noise ratio sufficient for 2-level
or higher multi-level coding at detection rates of 10 k to 100 k
reporters/s.
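As a rough illustration of how the number of coding levels relates to base throughput, the following Python sketch assumes that each reporter independently carries one of a fixed number of distinguishable levels, so that the number of reporters needed to encode one of four bases is the smallest n for which levels**n is at least 4. This coding model and the function names are assumptions made for illustration; the text does not specify how reporters are assigned to bases.

```python
def reporters_per_base(levels, alphabet_size=4):
    """Smallest n such that levels**n >= alphabet_size (integer arithmetic)."""
    if levels < 2:
        raise ValueError("at least two distinguishable levels are needed")
    n, capacity = 1, levels
    while capacity < alphabet_size:
        capacity *= levels
        n += 1
    return n


def base_rate(reporter_rate_per_s, levels):
    """Bases decoded per second for a given reporter detection rate."""
    return reporter_rate_per_s / reporters_per_base(levels)


for levels in (2, 3, 4):
    print(levels, reporters_per_base(levels), base_rate(100_000, levels))
```

Under this assumption, 2-level coding needs two reporters per base, whereas 4-level coding needs one, so the same reporter detection rate yields twice the base rate.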
[0548] A variation of nanopore detection uses optical detection of
free fluorescent ions that translocate the nanopore. The advantage
of an optical detection technique for nanopores becomes especially
relevant for a large nanopore array. An array using
Coulter-counting requires each nanopore to be electrically isolated
from the next nanopore. Making tiny isolated reservoirs is
challenging because the fluids are the conductors. The optical
detection techniques disclosed herein allow a nanopore array to
share the cis and trans reservoirs (for a common sample)
eliminating the need for an array of small reservoirs and the
associated fluidics issues. Furthermore, optical detection allows
the use of high throughput CCD or CMOS image sensors to measure
entire nanopore arrays. Optical detection methods are useful for
detection of both surrogate polymers and nucleic acids in
general.
[0549] FIG. 8 shows a nanopore (43) with a surrogate polymer (44)
that is passing through the nanopore (translocating). The cis side
of the nanopore has a high concentration of fluorophores (45), such
as fluorescein (>10 mM). Fluorescein is ~1 nm in diameter
and ionizes with a charge of ~1 elementary charge under weakly
basic conditions. In a manner similar to Coulter counting, a
voltage applied between the reservoirs and across the nanopore drives
an ionic current that is limited by the nanopore resistance. In
the fluorescein example, the fluorescein ions carry a portion of the
negative ion current that translocates the nanopore. The relative
impedance of the reservoirs is balanced by adding other electrolytes
such as KCl (at typical concentrations of ~1 mM) to provide adequate
conductivity. This electrolyte will also contribute to the ion
current and will compete with the fluorophore current if its
concentration is too high.
[0550] Instead of measuring the current flow, this detection method
measures fluorescence of those fluorophores that pass through the
nanopore. The fluorophores in the trans reservoir quickly diffuse
away from the nanopore opening. As the surrogate polymer
translocates the nanopore, it modulates the fluorophore current
which in turn modulates the trans side fluorescence.
[0551] Fluorescence measurement must limit background noise due to
the cis side fluorophores (that are in high relative
concentration). One method which uses epifluorescence microscopy
for detection limits the background noise by applying a blocking
film on the nanopore substrate (46). An exemplary blocking film is
a gold film. The film does not need to extend up to the edge of the
nanopore itself, but any holes or gaps in the film should preferably
be << λ/(2n), the half wavelength of the excitation light
in the reagent media (index n). For example, in one embodiment
using 480 nm excitation, with n ~ 1.33 (water), gaps must be
<< 180 nm. A gold film 50 nm thick with a hole 30 nm in
diameter centered around the 10 nm nanopore satisfies this criterion
and limits transmission of light in both directions across the film
and through the gap.
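The gap limit quoted above can be checked with a short calculation. The following Python sketch is illustrative only; the function name is a hypothetical helper, and the inputs are the 480 nm excitation wavelength, the refractive index of about 1.33, and the 30 nm hole diameter given in the example.

```python
def half_wavelength_in_medium_nm(vacuum_wavelength_nm, refractive_index):
    """Return lambda / (2 n), the half wavelength of light inside the medium."""
    return vacuum_wavelength_nm / (2.0 * refractive_index)


limit_nm = half_wavelength_in_medium_nm(480.0, 1.33)  # about 180 nm in water
hole_nm = 30.0                                        # hole in the gold film

print(f"half wavelength in medium: {limit_nm:.1f} nm")
print(f"hole diameter / limit:     {hole_nm / limit_nm:.2f}")
```

The 30 nm hole is roughly one sixth of the half wavelength, which is consistent with the requirement that gaps be much smaller than 180 nm.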
[0552] To measure the fluorescence modulation, it is advantageous
that the fluorophores have a lifetime in the fluorescence
collection volume that is shorter than the modulation period. This
limited lifetime can be achieved using several different methods.
3D modeling of the fluorophore diffusion has been conducted and
shows that the fluorophores diffuse ~1 micron away from the
nanopore on the order of milliseconds.
[0553] FIG. 9 plots model data of the temporal diffusion of
fluorescein into an infinite trans reservoir after translocating a
nanopore at 20 molecules/μs. The volumes are defined as the
hemispheres centered at the nanopore exit. The fluorophores
approach a steady state concentration in each successively larger
volume in a successively longer time.
[0554] The volume in which fluorescence is measured is limited to a
hemisphere centered on the nanopore with a radius of ~1 micron
or less. The surrogate polymer is translocated through the nanopore
at rates slower than 1 ms per base so that the signal level
approaches steady-state. In some embodiments, the bandwidth
performance may be increased (at the cost of signal) by quenching
the fluorophores (as represented by (47) in FIG. 8) with additives
(48) after they enter the trans reservoir. Examples of such
additives are quenchers (e.g., QSY7 or QSY9 available from
Molecular Probes/Invitrogen, Carlsbad Calif.) that may be adapted
to bind to the fluorophore. Alternatively, free radicals that oxidize
the fluorophore may be used as fluorescence quenchers. A free
radical generator such as azobisisobutyronitrile (AIBN) is a possible
source of free radicals. By controlling the concentration of these
additives, the time that a fluorophore actively fluoresces inside
the trans reservoir can be shortened and background of "old"
fluorophores (i.e. fluorophores that have already translocated the
nanopore) is reduced.
[0555] FIG. 10 shows graphs based upon an exemplary embodiment
where fluorophore translocation is limited in time to 5 blocking
levels (20, 17, 14, 11, and 8 fluorophores/μs). The fluorophores
diffuse away but are quenched randomly throughout the reservoir
with a half-life of 600 μs. This quenching reduces the
number of fluorophores available for signal but also limits the
measurement volume and establishes a faster time to steady-state
(at each level).
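The effect of quenching on the steady-state signal can be illustrated with simple first-order kinetics. The following Python sketch assumes that fluorophores enter the trans reservoir at a constant rate r and are quenched with a single exponential half-life, so that the number of unquenched fluorophores obeys dN/dt = r - N/tau with tau = half-life/ln 2. It counts every unquenched fluorophore in the reservoir and ignores diffusion out of the collection volume, so it is an upper bound on the measured population and not a reproduction of the model behind FIG. 10. The function and constant names are hypothetical.

```python
import math

HALF_LIFE_US = 600.0                   # quenching half-life from the text
TAU_US = HALF_LIFE_US / math.log(2)    # mean unquenched lifetime, about 866 us
RATES_PER_US = [20, 17, 14, 11, 8]     # the five blocking levels


def active_fluorophores(rate_per_us, t_us=None):
    """Unquenched fluorophores after t_us, or at steady state if t_us is None."""
    steady_state = rate_per_us * TAU_US
    if t_us is None:
        return steady_state
    return steady_state * (1.0 - math.exp(-t_us / TAU_US))


for rate in RATES_PER_US:
    print(rate, round(active_fluorophores(rate)))
```

In this simplified model each level settles with the same time constant tau, and the steady-state population scales linearly with the translocation rate, which is what allows the five blocking levels to be distinguished.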
[0556] In another embodiment, a method of eliminating the
fluorescent background comprises designing the detector so as to
only view a limited volume at the nanopore exit. Conoscopy is one
such exemplary method. One advantage of an optical detection
method is that the fluorophore current can be a highly amplified
signal. Also, in contrast to non-optical methods, the
nanopore array can be very high density because it does not require
reservoirs to be isolated between nanopores. In addition, the
measurement is well suited to a simple single color epifluorescent
microscope and takes advantage of the advances in high speed
cameras. The fluorescent background can be further reduced by
employing a nanopore substrate comprising a blocking film.
Exemplary blocking films for this purpose include gold films.
[0557] Another exemplary embodiment for reading the reporter
constructs of the surrogate polymers is ion indicator detection.
FIG. 11 depicts an example of ion indicator detection. This is a
nanopore variant that uses excitation light (50) and an ion
selective indicator molecule (54) to produce a fluorescent signal
(55). The signal amplitude depends upon the number of indicator
ions that translocate the nanopore (56). In FIG. 11, the indicator
ions (49) are loaded on the cis reservoir and indicator molecules
are loaded on the trans reservoir. The indicator ions translocate
the nanopore by both entropic and electric field forces. The rate
of their translocation depends upon the blockage state of the
nanopore. Surrogate polymer reporters are designed to provide
different blockage states. Once the indicator ions have
translocated (52), the indicator molecules will couple to them (53)
and alter their fluorescent emission characteristics (55).
[0558] For example, in one ion indicator detection embodiment, the
indicator ion Ca²⁺ will couple to the indicator Fura-3
(available from Molecular Probes/Invitrogen, Carlsbad Calif.), and,
under UV excitation, the emission at ~520 nm increases by
40×. As the surrogate polymer translocates through the
nanopore, each reporter limits the rate at which Ca²⁺ ions
translocate and changes the ion distribution on the trans reservoir
side. The indicator located in the trans reservoir couples to the
Ca²⁺ ion and increases fluorescence. For a given Ca²⁺
translocation rate, the measured fluorescence level reaches a
steady state because the rate at which new fluorescing
Ca²⁺/indicator compounds are created equals the rate at which they
dissociate and/or diffuse out of the measurement volume. Unlike
the fluoro-current method described above, there is no fluorophore
or absorber in the cis reservoir, which means the volume at the
nanopore exit (i.e., on the trans reservoir side) can be illuminated
from the cis side to excite the fluorophores. As with the
fluoro-current approach above, the ion/indicator couplet diffuses
away from the nanopore and steady state is established in the least
time (<1 ms) in a small volume close to the nanopore (<1
μm).
[0559] To limit the measurement volume to this small volume, two
exemplary methods may be used. A nonfluorescing absorber in the
trans reservoir will absorb the excitation light exponentially with
depth into the reservoir to limit the measurement depth. An
epi-illumination microscope can be used to spatially delineate the
lateral dimensions of the volume.
[0560] In an alternative embodiment which also uses an
epi-illumination microscope, the volume can be delineated by
masking a small opening (<1 μm diameter) centered on the
nanopore. This limits most of the fluorescence collection to the
small unshadowed volume at the nanopore exit.
[0561] Other exemplary indicators useful in this method include
Fluo-3, Indo-1, and Fura Red (available from Molecular
Probes/Invitrogen, Carlsbad Calif.). Other exemplary indicator ions
that can be used in this method include, but are not limited to,
ions of singlet hydrogen, singlet oxygen, potassium, zinc,
magnesium, chlorine and sodium, all of which have commercially
available fluorescence indicators (available from Molecular
Probes/Invitrogen, Carlsbad Calif.).
[0562] In another embodiment, quenching instead of enhancement is
used. FIG. 12 illustrates this quenching fluorescence approach. As
shown in FIG. 12, fluorophores (57) are present in one reservoir
and a quencher (58) is present in the other reservoir. As quencher
flows through the nanopore, it is in highest concentration at the
nanopore and drops to lower concentration as it diffuses away from
the nanopore. Excitation light (59) causes the fluorophores to
fluoresce (60), but this fluorescence is quenched (61) in
proportion to the concentration of quencher. The nanopore substrate
may optionally comprise a blocking film (62). A non-limiting
example of a blocking film is a gold film.
[0563] In an exemplary embodiment of the above, fluorescein may be
loaded at ~10 mMolar concentration on the trans reservoir and
iodide ion may be loaded at ~1 Molar concentration on the cis
reservoir. Iodide can translocate the nanopore in nA levels leading
to concentrations of iodide near the nanopore that act as quenchers
to the excited fluorescein (excited with 488 nm light). By using
epi-illumination with fluorescence capture from the cis reservoir
side and masking around the nanopore, fluorescence may be collected
from a small volume (<1 μm³) within the trans reservoir.
The level of blocking in the nanopore establishes the level of
fluorescence quenching and provides the signal for decoding the
sequence information.
[0564] In another embodiment, translocation blockage level can be
measured using chemiluminescence. This method employs two species,
A and B, which are capable of combining to form an excited state
compound C'. A and B may be loaded into the cis and trans
reservoirs, respectively. If either species translocates the
nanopore, it will react with the other and form C'. When C' returns
to the ground state, it emits a photon. The intensity of the photon
emission may be used as a measure of the nanopore blockage.
Non-limiting examples of chemical species useful for this method
include luminol/peroxidase and luciferin/luciferase.
[0565] In another embodiment, Fluorescence Resonance Energy
Transfer (FRET) detection may be used. An exemplary FRET detection
embodiment employs an array of pores (PXp). In this embodiment, the
surrogate polymer is assembled using the methods described herein.
Surrogate polymer reporters are loaded with FRET donor
fluorophores, for 1 to 4 excitation wavelengths. The FRET acceptor
fluorophores are tethered to the porous node entrance. As the
surrogate polymer is translocated through the porous node its donor
fluorophores are excited with a light source, and as the reporters
pass proximal to the acceptor fluorophores at the nanopore
entrance, the acceptors are excited and emit their signature
fluorescence. These emissions are decoded into the associated
nucleotide sequence. Emissions can be modulated by wavelength,
ratio of wavelength, strength of emission, length of emission or a
combination of these.
[0566] In another embodiment, a nanocomb detector array may be
employed. As described in more detail below, a nanocomb performs a
presenting function by capturing and guiding tethered surrogate
polymer into the bottom of its channels, but it also comprises a
means of detecting the surrogate polymer. An exemplary embodiment
is depicted in FIGS. 13A and 13B. The nanocomb comprises a pair of
lateral electrodes (63), between which the surrogate polymer (67)
passes, an insulating layer (64), and a spacer (66). The detector
is located in or at the end of the nanocomb slot through which the
surrogate polymer is guided (65). The detector "reads" the reporter
elements as the surrogate polymer is drawn past. Modes for reading
the reporter elements include, but are not limited to: 1) the
electrolyte current is blocked by the surrogate polymer reporters
(Coulter Mode), or 2) the reporters are conductive, and a current
path is formed between the two electrodes. In some embodiments,
reporters with conductive polymers at different densities are
employed. The reporters short the two electrodes at the slot
intersection of the electrode pair with different impedances which
can be decoded into the surrogate polymer's parent DNA
sequence.
[0567] The scale of the nanocomb detectors and the reporters
requires that the surrogate polymer position be tightly controlled.
Thus, the nanocomb must be manufactured in a manner such that the
desired control can be achieved. For example, the channels or
troughs of the nanocomb are the intersection of two crystal planes.
By using anisotropic silicon etching, the bottom of the nanocomb's
troughs can be defined very sharply to <10 nm radius. The
nanocomb detector element is preferentially located near the
junction of the wafer surface and each trough, thus, it can be
formed by use of thin films and conventional wafer processing. One
exemplary method of creating the two electrodes uses two overlaid
thin films whose intersecting edges define the asymmetric etch mask
for the silicon. Shadow coating of a conductive metal (e.g., Au) on
these films produces two electrodes that are separated by the
shadow and film thickness. In some embodiments, further masking may
be required to further define the conductive electrodes.
[0568] In some embodiments, a direct means of "reading" surrogate
polymers presented in a linearized array uses electron beam
microscopy. Electron beam microscopy (e.g. Scanning Electron
Microscopy (SEM)) is capable of ~1 nm resolution, and a large
number of different techniques have been developed for different
applications. In this embodiment, throughput is of high importance
whereas resolution can be compromised. For example, because the
surrogate polymers are ordered along a single axis in the
linearized array, the electron beam does not require resolution
normal to the surrogate polymer backbone and can be broadened in
this dimension to form a line rather than a spot focus.
[0569] With a line focus the electron beam is scanned along the
surrogate polymer backbone axis for data capture. The length of the
line electron beam is limited by the background noise it produces.
This background noise degrades the signal emitted by the surrogate
polymer (i.e., reduces signal-to-noise ratio (SNR)). The advantage
of the long line electron beam is the reduced requirement for
lateral positioning. In some embodiments, materials such as boron
or nanogold in the reporters provide large scatter cross-sections
to the electron beam for high contrast signals. In some
embodiments, the SEM beam angle can be optimized to improve the
SNR.
[0570] In some embodiments, conventional post processing, for
example, by deposition of high contrast coatings, such as gold
films, to the linearized array can provide enhanced SEM contrast.
Other thin film techniques including shadow deposition,
electrodeposition, vacuum deposition and etching can be used to
enhance the SNR of the surrogate polymer in the electron beam
measurement.
[0571] In some embodiments, knife-edge conduction may be used for
detection of the surrogate polymers. In these embodiments, the
surrogate polymers comprise one or more brush polymer reporters,
wherein the brush polymer reporters comprise conductive polymeric
bristles. FIGS. 14A-C illustrate an exemplary multichannel
knife-edge detector which slides along a linearized array substrate
(68). Surrogate polymers (69) comprising brush polymer reporters
(70) are presented on the linearized array aligned and spaced apart
by the pitch of the detector channels. They are in electrical
contact with a conductive ground plane (e.g., gold film (71)) of
the linear array substrate. A knife edge electrode (72), normal to
the surrogate polymer backbone axis, is positioned above the
substrate ground plane by a ~10 nm gap, formed by friction
spacers (73) along each side of the gap. An electric potential is
applied between the knife-edge electrode and the linearized array
ground plane. As the detector slides along the substrate, the
surrogate polymer reporters sequentially pass under the knife-edge
electrode. Under the influence of the electric field, the conductive
polymer bristles of the reporter bridge the gap to make contact
and complete an electric circuit between the knife-edge and the
ground plane. The electric current provides the measured signal.
The reporters may be selected to have different impedances or
different lengths thus generating distinguishable current
signatures for decoding the surrogate polymer information. An
exemplary method of distinguishing different impedance levels is to
use different conductive bristle densities. Non-limiting examples
of conductive polymers include polyacetylene, polyaniline and
polypyrrole.
[0572] In an alternative embodiment, fluorescence microscopy may be
employed for surrogate polymer detection. The surrogate polymer is
labeled with fluorophores of one, two or more spectral types. One
example is to use two fluorophores with different spectral
emissions, red and green for example. Each of the four nucleic base
types can be uniquely identified using four emission states: (1)
Red only, (2) Red>Green, (3) Green>Red, and (4) Green
only.
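The four emission states listed above can be distinguished by the relative red and green intensities of each reporter. The following Python sketch is a minimal illustration: the 95% threshold used to call a "red only" or "green only" state, the assignment of states to particular bases, and the function names are all assumptions, since the text defines the four states but not their thresholds or base assignments.

```python
# Hypothetical assignment of emission states to bases; the text fixes the
# four states but not which base each one encodes.
STATE_TO_BASE = {
    "red_only": "A",
    "red>green": "C",
    "green>red": "G",
    "green_only": "T",
}


def classify_emission(red, green, only_fraction=0.95):
    """Return the emission state for background-corrected intensities."""
    total = red + green
    if total <= 0:
        raise ValueError("no fluorescence signal")
    red_fraction = red / total
    if red_fraction >= only_fraction:
        return "red_only"
    if red_fraction <= 1.0 - only_fraction:
        return "green_only"
    if red_fraction == 0.5:
        raise ValueError("equal red and green intensities are ambiguous")
    return "red>green" if red_fraction > 0.5 else "green>red"


def call_base(red, green):
    """Look up the (hypothetical) base for a pair of intensities."""
    return STATE_TO_BASE[classify_emission(red, green)]


print("".join(call_base(r, g) for r, g in [(100, 0), (70, 30), (30, 70), (0, 50)]))
```

Using intensity ratios in place of absolute intensities makes the classification less sensitive to variations in illumination or labeling efficiency between reporters.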
[0573] To maintain high information density but practical
fluorescence capture, the surrogate polymer may be presented in a
dense parallel aligned packing arrangement with separations of
~1 micron. A sensor with 10 micron pixels and a 40×
objective provides 250 nm/pixel resolution (or 4 pixels of
separation between surrogate polymers). To resolve reporters, their
minimal separations are ~200 nm. This can be further reduced
by invoking near field, zero mode, STORM or FRET/Quench methods for
more localized detection.
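The sampling figures in this example follow from dividing the sensor pixel size by the objective magnification. The short Python sketch below reproduces that arithmetic; the function name is a hypothetical helper, and the calculation ignores diffraction and any additional relay optics.

```python
def object_plane_pixel_nm(sensor_pixel_um, magnification):
    """Size of one sensor pixel projected back onto the sample, in nm."""
    return sensor_pixel_um * 1000.0 / magnification


pixel_nm = object_plane_pixel_nm(10.0, 40.0)   # 10 micron pixels, 40x objective
polymer_pitch_nm = 1000.0                      # about 1 micron between polymers
pixels_between_polymers = polymer_pitch_nm / pixel_nm

print(pixel_nm, pixels_between_polymers)
```

With these inputs each pixel covers 250 nm of the sample, so adjacent surrogate polymers spaced 1 micron apart fall 4 pixels apart on the sensor.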
[0574] In another embodiment, an optical near field method using
slits instead of capillaries can be used to localize the excitation
energy to <100 nm along the surrogate polymer axis. The near
field source that emerges from the slit is used to excite
fluorophores of the surrogate polymer reporters and fluorescence is
detected in the far field. As the slit array is scanned along the
linearized array and along the axis of the surrogate polymer, the
measured fluorescence can be deconvolved to produce the surrogate
polymer sequence information.
Presentation Methods
[0575] The SBX process produces an enriched product of surrogate
polymer that is then presented to the detection instrument to
"read" the reporter sequence. Exemplary detection methods include
those discussed above and other detection methods known in the art.
To improve the performance of the detector, the surrogate polymer
product can be further processed for presentation to the detector.
For example, in some embodiments, the charge characteristics of the
surrogate polymer may be engineered to be similar to a native DNA
polymer. Exemplary presentation methods include molecular gating,
spatial confinement, flow control, channelizing, substrate bonding
and thin film processing enhancements. For exemplary purposes, the
methods disclosed herein are often illustrated and discussed with
reference to surrogate polymers, however, the disclosed
presentation methods are equally useful for nucleic acids in
general.
[0576] An important characteristic for detection and measurement of
reporters is to have uniform spatial and temporal spacing of the
reporters presented to the detector. For this to happen it is
advantageous that the surrogate polymers be extended and positioned
appropriately. A hairpin fold places a high burden on the detector
to distinguish two portions of a labeled strand simultaneously and
leads to lowered detection efficiency. In a related issue the
surrogate polymer should have either an inherent stiffness or a
tension along its length to prevent adjacent labels from bunching.
This characteristic helps to maintain the reporter-to-reporter
spacing and maintain reporter resolution. In a final related
characteristic, the speed at which the surrogate polymer is
presented to the detector should be uniform and smooth. Temporal
variation in presenting the reporter reduces the detector
efficiency because it must sample for the fastest exposure
requirement, whereas the throughput depends only on the average
exposure requirement. The embodiments disclosed herein address
these needs and provide further advantages.
[0577] Non-limiting examples of methods of presenting the surrogate
polymer to the detector include: (1) in-flow, (2) tethered to a
solid substrate, and (3) aligned on a substrate surface. FIG. 15
illustrates this concept. FIG. 15A illustrates surrogate polymers
presented in-flow in both a random and ordered (e.g. gated)
fashion. The arrows indicate the direction of surrogate polymer
flow. FIG. 15B illustrates surrogate polymers presented tethered to
a solid substrate in both a random and ordered (e.g. arrayed)
fashion. Finally, FIG. 15C illustrates surrogate polymers presented
aligned on a substrate surface in both a random and ordered (e.g.
arrayed) fashion.
[0578] An example of the "in flow" presentation is when surrogate
polymer flows to and through a nanopore detector. By this detection
technique, two to four reporter types can provide a corresponding
number of current levels with which to encode base sequence
information and can be detected at throughput rates of 10 to 1000
kReporters/s. For sequencing throughput >1 Gbases/hour,
parallelization of the nanopores is required.
[0579] FIG. 16 shows a substrate having an array of pores (called a
PXp) useful for in-flow presentation methods. The substrate
comprises a regular array of nodes each of which has one or more
pores. When the array has a single pore at each node, it is a
nanopore array (NXp). As discussed above, nanopore channels can be
configured to comprise a detector element. Thus, the array depicted
in FIG. 16 provides a possible method for parallel detection of
surrogate polymers.
[0580] In an exemplary embodiment, each nanopore channel of an NXp
is configured to allow detection of a surrogate polymer. In this
embodiment, the concentration of surrogate polymer must be
controlled to maximize the efficiency of the nanopore channel.
Surrogate polymers arrive at the detector randomly in time with 0,
1, 2 or more surrogate polymer arrivals occurring over any set time
period. According to Poisson statistics, over a given sampling
period, a maximum of .about.37% of the periods will have a single
surrogate polymer arrival. The rest of the periods will have 0, 2
or more arrivals. This percentage of single surrogate polymer per
channel is further reduced when overlapping of the adjacent
surrogate polymer is accounted for. Modeling of the case where all
molecules have equal length and velocity but have random arrival
times indicates that only 18% of the read time will produce
complete nonoverlapped reads. An ideal scenario is to have single
molecules line up head-to-tail so the detector sees no gaps and
always sees a portion of a single molecule.
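The ~37% figure follows directly from the Poisson distribution: if the mean number of arrivals per sampling period is m, the probability of exactly one arrival is m·exp(-m), which is largest at m = 1, where it equals 1/e. The Python sketch below checks this numerically; the function names and the scanned range of mean values are illustrative choices, and the 18% figure for nonoverlapped reads comes from separate modeling that is not reproduced here.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k arrivals when the expected number is mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


def single_arrival_fraction(mean):
    """Fraction of sampling periods containing exactly one arrival."""
    return poisson_pmf(1, mean)


# Scan mean arrival numbers from 0.01 to 3.00 to locate the optimum loading.
means = [i / 100 for i in range(1, 301)]
best_mean = max(means, key=single_arrival_fraction)

print(best_mean, round(single_arrival_fraction(best_mean), 4))
```

Loading the detector more lightly or more heavily than one expected arrival per period lowers the single-occupancy fraction, which is why the surrogate polymer concentration must be controlled.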
[0581] In the above embodiment, some of the surrogate polymers may
be in a folded condition which lowers the efficiency even more
(assuming the detector can only distinguish unfolded surrogate
polymer). Solubility and mobility limits of long surrogate polymers
can further limit fill efficiency because even at maximum
solubility concentrations, the surrogate polymer may not fill the
nanopores fast enough to reach 18% fill. Thus, there remains a need
in the art for in-flow presentation methods which optimize the
efficiency of nanopore array detectors. Exemplary embodiments
disclosed herein overcome the problems associated with nanopore
array detection and provide further advantages.
[0582] In one embodiment, adding a charged, long linear polymer
having a low molecular weight to the end of the surrogate polymer
can assist in threading the surrogate polymer because of the
polymer's higher mobility and charge density relative to the
surrogate polymer itself. For example, a polymer having a linear
charge density and/or the same charge state as DNA (i.e. negative)
can be attached to the end of the surrogate polymer. As depicted in
FIGS. 17A and B, the end of such a polymer (74) would be drawn into
a nanopore (75) with much higher efficiency in either a folded or
unfolded state, translocating ahead of the surrogate polymer (76)
and pulling it into the nanopore. Non-limiting examples of highly
negatively charged, low linear density polymers include:
polyglutamic acid and polyphosphate. To further increase the
probability of threading the nanopore, one embodiment employs
multiple polymers attached to one or both ends of the surrogate
polymer. The length of the polymer is optimized to maximize the
threading probability, but must not significantly interfere with
the surrogate polymer measurement. In some embodiments, the
polymers can be microns in length.
[0583] Increasing the potential across a nanopore can improve
performance of fill. Thus, in one embodiment, the nanopore current
is actively monitored, and the voltage is increased (thereby
increasing the nanopore fill efficiency) until a threaded surrogate
polymer is detected and then decreased to the desired measurement
voltage until the surrogate polymer measurement is completed. The
voltage is then increased again, until the next surrogate polymer
is threaded.
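The voltage cycling described above amounts to a simple feedback rule. The following Python sketch is a hedged illustration of that rule only: the millivolt values, the step size, the upper limit, and the function name are hypothetical, and the polymer_threaded flag stands in for whatever current-monitoring criterion is used to detect a threaded surrogate polymer.

```python
def next_voltage(voltage_mv, polymer_threaded,
                 measure_mv=100.0, step_mv=20.0, max_mv=400.0):
    """Return the voltage to apply for the next control interval.

    While the nanopore is empty the voltage is ramped up (capped at max_mv)
    to improve fill efficiency; once a polymer is threaded, the voltage is
    held at the measurement value until the read is finished.
    """
    if polymer_threaded:
        return measure_mv
    return min(voltage_mv + step_mv, max_mv)


# Simulated sequence of detector states: empty, empty, threaded, threaded, empty.
voltage = 100.0
trace = []
for threaded in [False, False, True, True, False]:
    voltage = next_voltage(voltage, threaded)
    trace.append(voltage)

print(trace)
```

In this sketch the ramp restarts from the measurement voltage after each read; other ramp profiles would fit the same description.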
[0584] In some embodiments, read efficiency is increased by
actively switching the detector (e.g. current measurement
electronics in the case of Coulter-like nanopores) to an array of
nanopores that is already filled with surrogate polymers in a
ready-to-measure state. When measurement is completed the detector
is switched to another array of prefilled surrogate polymers. In
some embodiments, prefilling may be performed offline with enough
nanopore arrays to complete the whole sequencing job. In other
embodiments, prefilling can be a real time function whereby
prefilling is occurring on one or more arrays while measurements
are being made on another array.
[0585] FIGS. 18A through 18C illustrate the prefill concept. In the
illustrated embodiment, surrogate polymers (77) are first adapted
to have a stop (78) which prevents the surrogate polymers from
passing through a nanopore but allows them to be threaded.
Non-limiting examples of stops include beads and bulky dendrimers.
To adapt to a bead stop, a linker at the end of a surrogate
polymer is reacted with appropriate chemistry on the bead surface
under low surrogate polymer concentration and high bead
concentration. Beads are collected and washed to remove unreacted
surrogate polymer. Next, unreacted beads are removed under
electrophoretic fields that filter out bead-attached surrogate
polymers. The bead-adapted surrogate polymers are now threaded into
the nanopore array by applying a voltage across the pores (depicted
in FIG. 18A as electric field E). Under dilute conditions the
surrogate polymer will thread singularly through the nanopore up to
the stop (e.g., bead). The stop is pulled against the nanopore by
the translocation force due to the electric field acting on the
charged surrogate polymer and seals the nanopore, preventing
additional surrogate polymers from threading through the same
nanopore. This continues until the nanopores each have a single
surrogate polymer (FIG. 18B). The array can now be measured by
reversing the voltage, driving the surrogate polymer in the
opposite direction and collecting the measurement sequence (e.g.,
current).
[0586] In an alternative embodiment, mechanical force may be used
for translocation. This embodiment is illustrated in FIG. 18C where
the surrogate polymers are pulled through the nanopores by their
ends being attached to a substrate (79) that is lowered.
[0587] In another embodiment, a magnetic bead stop provides
additional functionality. As illustrated in FIG. 19A, the surrogate
polymer (80) may comprise a magnetic bead (81) attached to one end.
Under a magnetic field, B, the beads are driven to the nanopore
array surface which increases the concentration and probability of
threading. After the surrogate polymer has threaded through the
nanopore and is pinned in the nanopore by the applied electric
field, E, the surrogate polymers that are not threaded into
nanopores can be drawn away by applying a low magnetic field in the
reverse direction as shown in FIG. 19B. By further increasing the
magnetic field (and adjusting the applied electric field as
required), the surrogate polymer can stretch and it will
translocate through the nanopore in either direction depending upon
which force dominates. If the magnetic force dominates, the
nanopore translocation has velocity irregularity due to Brownian
motion. Using a platform (83) that moves slower than the magnetic
bead velocity, as shown in FIG. 19C, the translocation velocity can
be uniformly controlled across the nanopore array.
[0588] In another embodiment, a linear ferrite polymer leader can
be adapted to the end of the surrogate polymer instead of a
magnetic bead. The linear ferrite polymer leader can be used to
move (or stretch) the surrogate polymer much like the magnetic bead
but can still translocate the nanopore.
[0589] Several different gating techniques are described herein and
generally share a common characteristic. In each case, single
surrogate polymer molecules are released to flow towards the
detector on a regular period. The period is chosen so that as the
detector finishes the sequential reading of reporters on one
surrogate polymer molecule another one enters the detector and
thereby maximizes the duty cycle of the detector.
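The relationship between release period and detector duty cycle can
be sketched as follows. This is a minimal illustration only; the
function name, the single-detector model, and the example times are
assumptions and do not come from the disclosure.

```python
def detector_duty_cycle(read_time, release_period):
    """Fraction of time a detector spends reading when one surrogate
    polymer is released per period (hypothetical, simplified model)."""
    if read_time <= 0 or release_period <= 0:
        raise ValueError("times must be positive")
    # Releasing faster than the detector can read cannot push the duty
    # cycle above 1; in this model the value is simply capped.
    return min(1.0, read_time / release_period)


# Hypothetical read time of 10 time units per molecule.
for period in (40, 20, 10):
    print(period, detector_duty_cycle(10, period))
```

In this model the duty cycle approaches 1 as the release period
approaches the read time, which is the condition described above.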
[0590] In one embodiment, gating of the surrogate polymers is
accomplished by timed release of the surrogate polymers from a
substrate. Referring to FIG. 20A, the surrogate polymers (84) are
tethered to a substrate (85) in a manner such that their positions
are known or can be determined. For example, filling a regular grid
with one tethered molecule per grid space is one embodiment.
Another example is to place them randomly but with suitable
separation and with a fluorescent label so they can be located. The
tether is chosen to have an addressable cleavable coupling (86) so
that the surrogate polymer can be released (87) on demand.
Exemplary methods of addressable cleaving include: (i)
photocleavable links using a directed light source such as a
focused laser beam (88), (ii) thermally released links where local
heating is achieved by addressable resistors or a laser, and/or
(iii) electrochemical links where addressable electrodes can drive
redox reactions.
[0591] FIG. 20B illustrates another embodiment referred to as
electrical trapping or electrode gating. In this embodiment
electric fields (i.e. electrodes 89, 90, and 91) are placed in
front of the detector (92) to repel the surrogate polymer from the
detection area when a molecule is in the detector and is being
"read". When the molecule is read (or near completion of the read),
the gating field is reversed and the surrogate polymer that is in
solution in front of the detector is attracted to the detector. As
soon as one enters the detector, the gate electric field is again
reversed to repel the other surrogate polymers. It is necessary to
maintain gate fields low enough so that the surrogate polymer in
the detector does not get drawn back by the repulsive gating
force.
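The feedback logic of electrode gating can be sketched as a simple
decision rule. The function and state names below are hypothetical,
and field magnitudes (which must stay low enough not to pull back the
molecule being read) are not modeled.

```python
def gate_polarity(detector_occupied, read_nearly_done):
    """Choose the gating field direction (illustrative sketch only).

    "repel"   keeps other surrogate polymers away from the detector.
    "attract" draws the next surrogate polymer toward the detector.
    """
    if detector_occupied and not read_nearly_done:
        return "repel"
    return "attract"


# One read cycle: empty detector, molecule enters, read nears its end.
for occupied, nearly_done in ((False, False), (True, False), (True, True)):
    print(occupied, nearly_done, gate_polarity(occupied, nearly_done))
```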
[0592] FIG. 20C illustrates another gating embodiment referred to
as membrane gating. This embodiment shares features of the previous
embodiments but also comprises features that help stretch out the
surrogate polymer and to filter those surrogate polymers that are
knotted or clumped. This embodiment employs a thin filter membrane
(93) that has pores on the order of 20-100 nanometers. Exemplary
membranes include an aluminum oxide porous membrane and a polymer
track-formed membrane.
[0593] Referring again to FIG. 20C, porous electrodes (94, 95)
sandwich the membrane. When a field is applied between the
electrodes, the surrogate polymer is transported by
electrophoresis. As it enters the membrane it preferentially
threads from one end due to the pore constriction of the membrane
and is pulled through. When it emerges from the membrane, the pore
provides a drag force that elongates the surrogate polymer. The
flow of surrogate polymers toward the detector (96) is controlled
by shutting the field off or applying a high frequency alternating
current field, stopping the electrophoretic transport force. In the
latter embodiment the alternating current field may assist in
elongating the surrogate polymer without transporting it. To
utilize the gating function of this embodiment a feedback mechanism
similar to those described above is used.
[0594] In yet another embodiment illustrated in FIG. 20D, two or
more addressable gate electrodes (97, 98) are used to "feed" the
detector (99). In this embodiment, each gate electrode is turned on
(gate open) until a surrogate polymer is detected and is then
shut off (gate closed). The gates (100) that are "holding" a
surrogate polymer molecule are opened on a regular period so that a
steady stream of nonoverlapped, head-to-tail, surrogate polymers
feed through the detector. Non-limiting examples of gate types
include a porous membrane using surrogate polymer-bound
fluorescence to detect the surrogate polymer in the gate and a
nanohole that uses current to determine the presence of the
surrogate polymer.
[0595] Some in flow methods of presenting the surrogate polymer
include the use of drag tags, hydrodynamic straightening, electric
field gradients and magnetic force. In one embodiment an affinity
drag tag is used to straighten and stretch the surrogate polymer as
shown in FIG. 21. A weak affinity drag tag (101) is linked to the
end of the surrogate polymer. The surrogate polymer (102) is then
electrophoretically pulled through an appropriate affinity gel or
channel (103). The weak affinity coupling creates friction on the
surrogate polymer end that allows the rest of the surrogate polymer
to unravel and stretch out in the field.
[0596] Some detection methods are best adapted to surrogate
polymers that are tethered to a substrate on one end and are "read"
by moving the detector array relative to the substrate. Such
substrates are known as fixed surrogate polymer arrays (also
referred to herein as fixed arrays). FIG. 15B shows that surrogate
polymers can be surface tethered in a spatially stochastic manner
or they can be tethered on a regular array with a single
surrogate polymer at each array point. With random spatial
attachment the efficiency of a detector is guided by
two-dimensional Poisson statistics which, for any desired square
cell size, optimally gives one molecule/cell in 14% of the cells.
By comparison, a fully loaded regular array of tethered single
surrogate polymer sites is 100%. This is a 7-fold increase in
detector efficiency at the cost of more preprocessing of the
surrogate polymer.
[0597] Other packing scenarios having intermediate efficiencies are
also possible. For example, a regular fixed array with point
attachment positions that may couple multiple tethers is governed
by 1-dimensional Poisson statistics. This strategy has an optimum
of 37% of the sites having single surrogate polymer occupancy.
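The 37% optimum quoted for one-dimensional Poisson loading can be
checked numerically. The sketch below assumes that the number of
surrogate polymers per site is Poisson distributed with a tunable
mean; the function name is illustrative. The 14% figure for random
two-dimensional attachment is taken from the text as given and is not
re-derived here.

```python
import math


def single_occupancy_fraction(mean_per_site):
    """Poisson probability that a site holds exactly one molecule."""
    return mean_per_site * math.exp(-mean_per_site)


# Scan mean loadings from 0.01 to 3.00 molecules per site.
best_fraction, best_mean = max(
    (single_occupancy_fraction(m / 100), m / 100) for m in range(1, 301)
)
print(f"optimum mean loading: {best_mean:.2f} per site")
print(f"single-occupancy fraction: {best_fraction:.3f}")

# Ratio of a fully loaded regular array (100%) to the quoted 14%.
print(f"regular array vs. 14%: {1.0 / 0.14:.1f}-fold")
```

The maximum falls at a mean of one molecule per site, where the
single-occupancy fraction equals 1/e, or about 37%.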
[0598] One embodiment of the regular fixed array employs a smooth
substrate that has small binding sites <1 micron in size placed
on a regular grid. Surrogate polymers can be adapted to bind to
these sites. If they are attached randomly, 37% of the sites are
single surrogate polymers. Exemplary substrate choices include:
tape, such as flexible polyethylene terephthalate (PET) film, float
glass, silicon wafers and stainless steel sheets. The array grid
size is chosen to correspond to the detection method to be
used.
[0599] An exemplary method of creating a fixed array that has a
single surrogate polymer per binding site uses an array of very
small spots on the substrate. These spots comprise surface bound
reactive linkers. The surrogate polymers are end adapted to a
molecular complex which has an overabundance of reactive endgroups,
wherein each molecular complex has enough relative mobility such
that it reacts with all of a spot's linker groups. Thus, each of
the surface array spots only link with a single surrogate polymer.
Exemplary molecular complexes include: a bead, a dendrimer and a
linear polymer with sidechains.
[0600] The end-adapted surrogate polymers are reacted under dilute
conditions to limit multiple surrogate polymers from reacting with
a single spot. The surface bound linkers can be chosen from many
existing well established chemistries, for example,
biotin/streptavidin. In one embodiment, biotin can be linked to the
substrate using biotinylated PEG modified to link with the
substrate (e.g., thiolation for gold film or silanization for Si or
SiO.sub.2). In a similar manner PEGylated streptavidin may be linked
to the end adapted molecular complex on the surrogate polymer.
[0601] The diameter of the linker spots is minimized to limit the
binding area, but the array must be made efficiently since each
genome preparation may require 10.sup.7 to 10.sup.9 spots covering
areas up to 100 cm.sup.2 or more. E-beam lithography to expose each
spot is time-intensive and expensive. However, use of E-beam
lithography to define masks is a reasonable approach for
preparation of the array. Molecular Imprints, Inc. has developed
imprint technology (using E-beam imprint molds) whereby <20 nm
features with high aspect relief can be defined in quartz masks.
These may be used to define contact printing stamps for direct
stamping of a linker (for example, biotin). Alternatively, a metal
mask spot array may be used as a contact mask for UV ablation of a
biotin monolayer. When the monolayer is against the metal it is
protected from the UV, whereas unmasked areas are stripped.
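To give a sense of the feature spacing implied by these spot counts,
the following sketch computes the center-to-center pitch of a square
grid. It assumes, for illustration only, that the spots fill exactly
100 cm.sup.2 on a square lattice; the function name is hypothetical.

```python
import math

UM2_PER_CM2 = 1e8  # (1e4 micrometers per centimeter) squared


def square_grid_pitch_um(area_cm2, n_spots):
    """Center-to-center spacing, in micrometers, for n_spots laid out
    on a square grid covering area_cm2."""
    return math.sqrt(area_cm2 * UM2_PER_CM2 / n_spots)


for n_spots in (1e7, 1e8, 1e9):
    pitch = square_grid_pitch_um(100, n_spots)
    print(f"{n_spots:.0e} spots: pitch {pitch:.1f} um")
```

Under these assumptions the pitch ranges from roughly 32 microns at
10.sup.7 spots to roughly 3 microns at 10.sup.9 spots.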
[0602] Lithographic techniques using photoresist liftoff or
protected etching may also be used to prepare the array. Defining
lines of linkage sites rather than spots is a compromise between 2D
random surface linkages and the one-to-one surrogate polymer
linkage of spot arrays. In this embodiment, surrogate polymers link
randomly along a line, and, provided the line width is much
narrower than the average spacing between surrogate polymers along
the line, the linked surrogate polymers will lie in a one
dimensional Poisson distribution (i.e. a 37% fill factor at optimal
Poisson statistics). The advantage of this embodiment is that
lithography is a relatively simple technique and the surrogate
polymer need only be adapted to have a single reactive site.
[0603] The complementary chemistry on the end of the surrogate
polymer is designed to link to the substrate binding site in a
manner that prevents or minimizes more than one surrogate polymer
per binding site. One embodiment employs a dendrimer attached to
the end of the surrogate polymer for linking the surrogate polymer
to the substrate. The dendrimer is designed to saturate or block
all the coupling capacity on the binding site thereby preventing
another dendrimer (on another surrogate polymer) from binding to
the same site.
[0604] In another embodiment a nanopore array is prefilled and
utilized as a fixed array. In this embodiment, the surrogate
polymers are adapted to have stops at one end which allow the
surrogate polymer to be threaded into the nanopore, but will stop
the end of the surrogate polymer from complete translocation. In
addition, the stop performs a function of limiting the nanopore
filling to one surrogate polymer per nanopore because a second
surrogate polymer cannot enter a "stopped" nanopore. After filling
the array with surrogate polymers the stops are fixed in place to
create a fixed array. Exemplary stops include beads and
dendrimers.
[0605] In another embodiment, the fixed array may be further
processed by aligning the surrogate polymers on the substrate
surface to create a linearized array as depicted in FIG. 15C. In
this embodiment, the grid length must be long enough to allow the
surrogate polymer to lie down stretched and not overlapped by other
surrogate polymers. An exemplary grid size is about 10
microns.times.about 500 microns.
[0606] In exemplary fixed array embodiments, the surrogate polymer
is fully immobilized on the substrate surface and the detector
reads the surrogate polymer reporters sequentially by moving
laterally relative to the substrate surface. This method requires
additional preprocessing of the surrogate polymer prior to
detection, but it provides new opportunities for detection and a
more readily accessible media for reread and archive functions.
[0607] For surface alignment methods, the surface area should be
used efficiently, both for efficient detection and to limit
substrate costs. Thus, the surrogate polymer must be coupled to the
substrate in a very controlled manner. One embodiment to realize a
high density regular array of aligned surrogate polymers on the
surface of the substrate is to first create a regular array of
tethered surrogate polymer as described above (i.e. a fixed array).
The next step is to lay the surrogate polymer down onto the
substrate surface and bond it thereto.
[0608] Surrogate polymer densities on the substrate surface will
depend upon process limitations and on the detection techniques. In
some embodiments, the spacing is about 1-10 .mu.m between parallel
surrogate polymers, and the pitch between sequential surrogate
polymers is about 10% to about 30% longer than the surrogate
polymer itself. For example, a 150 .mu.m surrogate polymer may be spaced
along its length from the next surrogate polymer by 30 .mu.m. To
prevent the surrogate polymer from surface bonding prematurely and
misaligning, it is advantageous to use a real time bonding
activation method that can be applied when it is needed. For
example, ultraviolet and chemical activation of the surface are
exemplary bonding activation methods. Exemplary methods of laying
the surrogate polymer down on the substrate surface are described
below.
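The spacing example above can be converted into an approximate areal
density. The sketch below assumes a simple rectangular layout in
which each surrogate polymer occupies a cell of (length + gap) by
lateral spacing; the function name and layout are illustrative
assumptions.

```python
UM2_PER_CM2 = 1e8  # (1e4 micrometers per centimeter) squared


def aligned_density_per_cm2(polymer_um, gap_um, lateral_um):
    """Aligned surrogate polymers per square centimeter for a
    rectangular cell of (polymer_um + gap_um) by lateral_um."""
    cell_area_um2 = (polymer_um + gap_um) * lateral_um
    return UM2_PER_CM2 / cell_area_um2


# 150 um polymer followed by a 30 um gap, as in the example above.
gap_fraction = 30 / 150
print(f"gap is {gap_fraction:.0%} of the polymer length")
for lateral_um in (1, 10):
    density = aligned_density_per_cm2(150, 30, lateral_um)
    print(f"lateral spacing {lateral_um} um: {density:,.0f} per cm^2")
```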
[0609] In one embodiment, the surrogate polymer is stretched under
an electric field and the field is smoothly rotated from normal to
the substrate 180 degrees through tangent to the substrate (at 90
degrees) and finally to the negative normal position. This must be
performed slowly enough for the surrogate polymer to maintain a
stretched position in the field. By rotating beyond 90 degrees the
surrogate polymer is pinned in an extended stretched position on
the substrate. When the surrogate polymer contacts the substrate it
is bonded in place. In some embodiments, a rotation smaller than
180 degrees can be used, provided that the surrogate polymer moves
freely to a stretched position (in the desired direction) and can
subsequently be pinned (e.g. by electric force) to the
substrate.
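The field rotation can be illustrated by resolving a unit field
vector into components tangent and normal to the substrate. The
convention below (angle measured from the substrate normal, negative
normal component pointing into the substrate) is an assumption made
for this sketch.

```python
import math


def field_components(angle_deg):
    """Return (tangential, normal) components of a unit field vector,
    with angle_deg measured from the substrate normal."""
    theta = math.radians(angle_deg)
    return math.sin(theta), math.cos(theta)


for angle in (0, 45, 90, 135, 180):
    tangential, normal = field_components(angle)
    # A negative normal component pushes the polymer onto the surface.
    state = "pinning" if normal < -1e-9 else "lifting or tangent"
    print(f"{angle:3d} deg  tangential={tangential:+.2f}  "
          f"normal={normal:+.2f}  {state}")
```

In this picture the normal component changes sign at 90 degrees,
which corresponds to the pinning described for rotations beyond 90
degrees.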
[0610] An exemplary embodiment is shown in FIG. 22A. In this
embodiment, flexible tape substrate (104), for example a PET film,
is processed continuously. As the tape moves and enters the turn of
the roller, the surrogate polymers on the tape surface remain
oriented normal to the tape (105). Proceeding around the first 90
degrees of the turn they remain aligned and stretched with the
field (108) but rotate relative to the tape surface until they lie
down on the tape surface (106). As the tape moves around the next
90 degrees of the turn, the field now pins the surrogate polymer
onto the tape surface (107). If the surface and/or surrogate
polymer are appropriately activated, the surrogate polymer may be
bonded to the surface. If not, it may be bonded during a subsequent
process or a pinning force may be maintained until the tape is
"read" by the detector. Another embodiment comprises rotating a
solid substrate in the field.
[0611] Another exemplary embodiment is illustrated in FIG. 22B.
Here, the surrogate polymer (112) is laid down on the substrate
using a comb (109) to guide it down to the substrate and also to
steer it laterally into a straight channel (110). The comb
comprises electrodes (114a, 114b) and a ground plane (115) which
produce a stretching field (111) at the input side and a pinning
field (113) at its exit. As the substrate is passed under the comb,
the tethered surrogate polymer is "caught" and steered into a comb
channel near the substrate linkage site. With further movement, the
steered surrogate polymer is drawn to the substrate surface. At
this point, the field is reversed and it pins the surrogate polymer
to the substrate for bonding as required.
[0612] A lithographic method of making the comb uses anisotropic
etching of Si wafers. Crystalline Si wafers cut and polished on the
face will preferentially etch relative to the face with a potassium
hydroxide etchant. The wafer is lithographically masked along one
side of a regular sawtooth pattern with 57 degree switchbacks
oriented parallel to 2 of the intersecting planes. After etching,
the wafer is cut and polished normal to the surface and parallel to
the edge of the sawtooth to form a regular pattern of notches that
form the comb. Each notch has 2 smooth faces that intersect at the
trough of the comb channel. The angle the trough makes with the top
of the wafer is .about.55 degrees and with the polished edge is
.about.35 degrees. To prevent shearing of the surrogate polymer,
small film-based runners can be defined on the substrate or the
comb.
[0613] Another embodiment which is similar to the comb described
above is the brush illustrated in FIG. 22C. This brush comprises
very tiny bristles (116) of .about.10 nm in diameter. As with the
Si comb described above, the brush is drawn across a fixed array
substrate (117). Provided the bristles sweep the surface, starting
at the substrate-to-surrogate polymer linkage, the surrogate
polymer (118) becomes entrained and stretched along the direction
of brush movement. By using a field or other capture means, the
surrogate polymer is then pinned to the substrate surface.
[0614] An exemplary method of making the brush employs an
Al.sub.2O.sub.3 porous array as a mold to form a brush of polymer
bristles. These may be based upon UV or thermal-cured polymers. An
example of this process is described by Lee et al. (H. S. Lee, D.
S. Kim, and T. H. Kwon, "UV nano embossing for polymer nano
structures with non-transparent mold insert," Microsystem
Technologies, vol. 13, 2007, pp. 593-599).
[0615] After a substrate has been processed by aligning surrogate
polymers along its surface, further processing is possible that can
serve to get more robust signal or to simplify detection. These
methods are often intimately bound with the detection process.
Coating linear arrays with gold to improve electron microscopy
contrast is a non-limiting example of this. Further, reactive sites
on the reporters can be loaded with label chemistries or contrast
agents.
[0616] A tape or film substrate provides a means for continuous
"web" processing of the surrogate polymer through the detection
process. In one embodiment, this could be a loop in which the tape
is cleaned after detection and loops back to retether new surrogate
polymer product for reading. Tapes suitable for this purpose
include PET. Commercial PET film has surface roughness of <40 nm
which with planarization processing can be reduced to <10
nm.
EXAMPLES
Example 1
Preparation of Surrogate Polymers by Template Directed Ligation
[0617] SBX has been demonstrated with different probe types. Each
modified probe is synthesized and demonstrated to extend from a
primer using template-directed ligation. Ligation of probes that
are 2, 3, 4, and 6 nucleotides in length has been investigated at
different stages of modification. These include the following types
of probes:
[0618] (1) simple oligomer probe;
[0619] (2) probe with two nucleotides modified with aliphatic amino
linkers;
[0620] (3) probe with a PEG3500 tether conjugated to the probe's
amino linkers; and
[0621] (4) probe with an internucleotide selectively cleavable
linker (that is also modified with two aliphatic amino
linkers).
[0622] Modified probes of types (1), (2), (3), and (4) have been
synthesized and have each demonstrated primer-initiated extension
using template-directed ligation. Selective cleavage of the
selectively cleavable linker has also been demonstrated. The gel
data discussed below confirms extension by processive ligation of
these modified probes and confirms selective cleavage.
[0623] FIG. 23 shows a typical target template that is duplexed
with a 16-mer HEX-modified primer and designed with a 20 base 5'
overhang. Different length templates are used to measure processive
template-directed ligation of modified probes that extend off the
primer. Ligation product is sorted by gel electrophoresis into
bands, and bands are identified using the HEX fluorescence.
Experiments with unmodified probes (probe type (1)) were conducted
to demonstrate primer extension with processive ligation. Template
dependent ligation was confirmed by negative results when
conducting probe ligation on mismatched templates.
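The duplex geometry described for FIG. 23 can be checked against the
two sequences in the sequence listing (the 36-base typical target
template and the 16-mer primer). The sketch below is illustrative;
the helper names are not from the disclosure, and the hexamer used at
the end is the unmodified base sequence C A C A C A.

```python
# Sequences as given in the sequence listing (5' to 3', lowercase).
TEMPLATE = "tgtgtgtgtg" "tgtgtgtgtg" "atctaccgtc" "cgtccc"
PRIMER = "gggacggacg" "gtagat"

COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Locate the primer binding site on the template.
site = reverse_complement(PRIMER)
start = TEMPLATE.find(site)
overhang = TEMPLATE[:start]  # single-stranded 5' portion of the template

print("binding site starts at index", start)
print("5' overhang:", overhang, "length", len(overhang))

# The template bases adjacent to the primer pair with a cacaca hexamer.
print("first probe site pairs with:", reverse_complement(overhang[-6:]))
```

The primer is complementary to the 3' end of the template, leaving
the 20 base 5' overhang available for probe ligation.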
[0624] FIG. 24A shows ligation products up to .about.100 bases in
length. These gel results demonstrate multiple, template-directed
ligations of a bis(amino-modified) hexanucleotide probe (probe type
(2)). For this example, the templates were fixed to magnetic beads
and duplexed to a HEX-labeled extension primer. Hexanucleotide
oligomeric substrates of the sequence 5' (phosphate) C A (amino)C
(amino)A C A 3' were hybridized to a range of progressively longer
complementary templates in the presence of T4 DNA ligase, ligating
and extending from the duplexed primer. The ligation product was
then denatured from its template and separated on a 20% acrylamide
gel. Ligation products in lanes 1 to 4 were produced on templates
with 18, 36, 68 and 100 bases beyond the primer. The upper rung in
the ladder for each of the four lanes corresponds to ligated
additions of 3, 6, 12 and 17 hexamers. These upper rungs are
relatively strong bands and indicate much longer ligation products
are possible.
[0625] As a point of reference, the single band indicating
extension of 100 bases is estimated to be 0.1 pmoles of ligated
product using the measurement sensitivity as a reference. This is
equivalent to 6 trillion bases or 60 genomes @ 20.times. coverage
worth of "read" material. This demonstrates why processing cost is
low and scaling is simple, and shows the advantage of size
enrichment, which sends only the longest and highest value
surrogate polymers to the detection step.
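The arithmetic behind the 6 trillion base estimate can be reproduced
as follows. The genome size used for the "60 genomes" figure is not
stated in the text, so the sketch only back-calculates the size that
figure implies; the function name is illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def bases_in_sample(picomoles, bases_per_molecule):
    """Total bases in a sample of identical-length molecules."""
    molecules = picomoles * 1e-12 * AVOGADRO
    return molecules * bases_per_molecule


total_bases = bases_in_sample(0.1, 100)
print(f"total bases: {total_bases:.2e}")

# Genome size implied by 60 genomes at 20x coverage (back-calculated,
# not a value given in the text).
implied_genome_size = total_bases / (60 * 20)
print(f"implied genome size: {implied_genome_size:.2e} bases")
```

The result is about 6 x 10.sup.12 bases, and the 60 genome figure
corresponds to an assumed genome size of about 5 x 10.sup.9 bases.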
[0626] The gel results shown in FIG. 24B demonstrate multiple,
template-directed ligations of tetranucleotide probes modified with
a PEG3500 attached at each end to two modified probe nucleotides
(probe-type (3)). The probe precursor was a bis 2,3 (amino)
tetranucleotide, 5' (phosphate) C(amino) A (amino)C A 3'. The
aliphatic amino modifiers were then converted to 4-formylbenzoate
(4FB), and bis (amino) PEG3500 converted to bis (HyNic) PEG3500
(HyNic conjugation kit purchased from Solulink, CA). Under dilute
conditions the bifunctional PEG3500 was reacted with the bis 2,3
(4FB) tetranucleotide to form a circularized PEG loop. As in the
previous example, a template was fixed to magnetic beads and
duplexed to a HEX-labeled extension primer. In this example, the
template has 20 bases beyond the primer. The PEG-circularized
tetranucleotide probes were hybridized to the complementary
template and ligated in the presence of T4 DNA ligase. The ligation
product was then denatured from its template and separated on a 20%
acrylamide gel. The gel results indicate ligation of product
polymer containing four PEG-modified probes. This demonstrates that
doubly modified probes loaded with high masses of 3500 Daltons can
be progressively ligated to a template. (Due to limitations of the
gel, it was difficult to resolve incorporations of these ligated
mass-loaded probes to longer than four probes.)
[0627] FIG. 24C shows gel results of ligated tetranucleotide probes
(probe-type (4)) with amino-modifiers that are further modified to
have a selectively cleavable bond (cb). This bond is a modification
to an internucleotide phosphodiester and will cleave under acidic
conditions. The probe was a bis 2,3 (amino) tetranucleotide, 5'
(phosphate) A (amino)C(cb)T (amino)C 3'. This template has 36 bases
beyond the primer and the gel ladder indicates ligation was
complete with a maximum of 9 tetranucleotide additions. The control
has the same reagent mix with no ligase.
[0628] FIG. 24D shows results for a test of the selective cleavage.
A 26-mer was produced with a selectively cleavable bond between the
17.sup.th and 18.sup.th nucleotide. Lane 1 shows the 26-mer
control. Lane 2 shows the 17-mer cleavage product after subjecting
the 26-mer to acid. The results indicate essentially complete
cleavage of the 26-mer into shorter segments.
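The expected fragment lengths for this test follow directly from the
position of the cleavable bond. The helper below is a hypothetical
sketch that generalizes to any number of cleavable bonds; which
fragment is visible on the gel depends on labeling and is not
modeled.

```python
def cleavage_fragment_lengths(total_length, bonds_after):
    """Fragment lengths after cleaving an oligomer.

    bonds_after lists nucleotide positions (1-based) after which a
    selectively cleavable bond is located.
    """
    lengths = []
    previous = 0
    for position in sorted(bonds_after):
        lengths.append(position - previous)
        previous = position
    lengths.append(total_length - previous)
    return lengths


# 26-mer with one cleavable bond between nucleotides 17 and 18.
print(cleavage_fragment_lengths(26, [17]))
```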
Example 2
Preparation of a Paired-End Surrogate Polymer by Rolling Circular
Polymerization
[0629] To prepare a paired-end surrogate polymer, ligation was
initiated from each end of a primer that was hybridized on a ss-DNA
template where 36 bases of the template extended beyond both the 3'
OH and the 5' phosphorylated ends of the primer. Ligation products
with 4-mer probes extending from either end of the primer were
synthesized. FIG. 25 shows the gel results of ligated
tetranucleotide probes (probe-type (4)) bis 2,3 (amino)
tetranucleotide, 5' (phosphate) A (amino)C(cb)T (amino)C 3' with 2
different templates, T -36 and T +36. Template T +36 has 36 bases
beyond the 3' OH end of the primer whereas the T +36 has 36 bases
beyond the 5' phosphate end of the primer. In each case, the probes
were incorporated and ligated. The gel shows unextended template
(119) and a maximum of 9 (120) and 8 (121) tetranucleotide
additions in the T -36 and T +36 templates, respectively.
SEQUENCE LISTING
Number of SEQ ID NOS: 2
SEQ ID NO: 1 (length 36, DNA, Artificial Sequence; typical target
template):
tgtgtgtgtg tgtgtgtgtg atctaccgtc cgtccc 36
SEQ ID NO: 2 (length 16, DNA, Artificial Sequence; 16-mer primer
sequence):
gggacggacg gtagat 16
* * * * *