U.S. patent application number 11/085679 was filed with the patent office on 2005-03-21 and published on 2005-11-03 for molecular arrays and single molecule detection.
This patent application is currently assigned to The Chancellor, Master and Scholars of The University of Oxford. Invention is credited to Mir, Kalim.
Application Number | 20050244863 11/085679 |
Family ID | 32031884 |
Publication Date | 2005-11-03 |
United States Patent
Application |
20050244863 |
Kind Code |
A1 |
Mir, Kalim |
November 3, 2005 |
Molecular arrays and single molecule detection
Abstract
Methods are provided for producing a molecular array comprising
a plurality of molecules immobilized to a solid substrate at a
density which allows individual immobilized molecules to be
individually resolved, wherein each individual molecule in the
array is spatially addressable and the identity of each molecule is
known or determined prior to immobilisation. The use of spatially
addressable low density molecular arrays in single molecule
detection techniques is also provided.
Inventors: |
Mir, Kalim; (Oxford,
GB) |
Correspondence
Address: |
PALMER & DODGE, LLP
KATHLEEN M. WILLIAMS
111 HUNTINGTON AVENUE
BOSTON
MA
02199
US
|
Assignee: |
The Chancellor, Master and Scholars
of The University of Oxford
|
Family ID: |
32031884 |
Appl. No.: |
11/085679 |
Filed: |
March 21, 2005 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11085679 | Mar 21, 2005 | |
PCT/GB03/04041 | Sep 19, 2003 | |
Current U.S.
Class: |
435/6.11 ;
427/2.11; 435/287.2; 435/7.1 |
Current CPC
Class: |
B01J 2219/00729
20130101; B01J 2219/00432 20130101; B01J 2219/00605 20130101; B01J
2219/00612 20130101; B01J 19/0046 20130101; B01J 2219/00497
20130101; B82Y 30/00 20130101; B01J 2219/00378 20130101; B01J
2219/00691 20130101; B01J 2219/00599 20130101; B82Y 5/00 20130101;
B01J 2219/00367 20130101; B01J 2219/00626 20130101; B01J 2219/00585
20130101; B01J 2219/00317 20130101; B01J 2219/00576 20130101; B01J
2219/00659 20130101; B01J 2219/00677 20130101; B01J 2219/00653
20130101; C40B 50/14 20130101; B82Y 10/00 20130101; B01J 2219/00725
20130101; B01J 2219/0043 20130101; B01J 2219/00596 20130101; B01J
2219/00592 20130101; B01J 2219/00711 20130101; B01J 2219/00722
20130101; B01J 2219/00527 20130101 |
Class at
Publication: |
435/006 ;
435/007.1; 435/287.2; 427/002.11 |
International
Class: |
C12Q 001/68; G01N
033/53; C12M 001/34 |
Foreign Application Data
Date | Code | Application Number |
Sep 19, 2002 | GB | 0221792.5 |
Sep 26, 2002 | GB | 0222412.9 |
Claims
1. A method for producing a molecular array which method comprises
immobilising to a solid phase a plurality of molecules at a density
which allows individual immobilised molecules to be individually
resolved, wherein each molecule in the array is spatially
addressable and the identity of each molecule is known or
determined prior to immobilisation.
2. A method according to claim 1 wherein the molecules are applied
to the solid phase by a method selected from printing, electronic
addressing, or in situ synthesis by light-directed synthesis, ink
jet synthesis or physical masking.
3. A method according to claim 2 wherein the molecules are applied
to the solid phase by printing of dilute solutions.
4. A method for producing a molecular array which method comprises:
(i) providing a molecular array comprising a plurality of molecules
immobilised to a solid phase at a density such that individual
immobilised molecules are not capable of being individually
resolved; and (ii) reducing the density of functional immobilised
molecules in the array such that remaining individual functional
immobilised molecules are capable of being individually resolved;
wherein each individual functional molecule in the resulting array
is spatially addressable and the identity of each molecule is known
or determined prior to the density reduction step.
5. A method according to claim 4 wherein the density of functional
molecules is reduced by cleaving all or part of the molecules from
the solid phase.
6. A method according to claim 4 wherein the density of functional
molecules is reduced by functionally inactivating the molecules in
situ.
7. A method according to claim 4 wherein the density of functional
molecules is reduced by labelling some of the plurality of
molecules such that individual immobilised labelled molecules are
capable of being individually resolved.
8. A method according to claim 1 wherein the immobilised molecules
are present within discrete spatially addressable elements.
9. A method according to claim 8 wherein the structure of probes
present in each discrete spatially addressable element is
precisely known and unintended structures are substantially
absent.
10. A method according to claim 8 wherein a plurality of molecular
species are present within one or more elements and each molecular
species in an element can be distinguished from other molecular
species in the element by means of a label.
11. A method according to claim 1 wherein the plurality of
molecules which are capable of being individually resolved are
capable of being resolved by optical means.
12. A method according to claim 1 wherein the plurality of
molecules which are capable of being individually resolved are
capable of being resolved by scanning probe microscopy.
13. A method according to claim 1 wherein the molecules are
attached to the solid phase at a single defined point.
14. A method according to claim 1 wherein the molecules are
attached to the solid phase at two or more points.
15. A method according to claim 1, wherein the molecules comprise a
detectable label.
16. A method according to claim 15 wherein the label can be read by
optical methods.
17. A method according to claim 15 wherein the label is a single
fluorescent molecule or nano-particle/rod, or a plurality of
fluorescent molecules or nano-particles/rods.
18. A method according to claim 15 wherein the label is a
non-fluorescent molecule, nanoparticle or nanorod.
19. A method according to claim 1 wherein the molecules are
selected from defined chemical entities, oligonucleotides,
polynucleotides, peptides, polypeptides, conjugated polymers, small
organic molecules or analogues, mimetics or conjugates thereof.
20. A method according to claim 19 wherein the molecules are cDNAs
and/or genomic DNA.
21. A method according to claim 19 wherein the molecules are
oligonucleotides or polynucleotides and the molecules are provided
as groups of molecules, each group of molecules selectively
hybridising to a different site within a target nucleic acid
molecule and immobilised to the solid phase such that each group is
spatially distinct from the other groups.
22. A method according to claim 21 wherein within each group,
different molecular species are immobilised in discrete spatially
addressable elements.
23. A method according to claim 22 wherein the different molecular
species selectively hybridise to different alleles.
24. A method according to claim 21 wherein the different groups of
molecules are immobilised to the solid phase such that the order of
arrangement of each group relative to the other groups on the solid
phase corresponds to the order of the corresponding sites in the
target nucleic acid molecule.
25. A method according to claim 21 wherein the different groups are
arranged along a first horizontal axis of the solid phase and
within each group the different molecular species are arranged in
discrete elements along a second horizontal axis of the solid
phase.
26. A method according to claim 1, wherein the immobilised
molecules are present within discrete spatially addressable
elements and each element comprises a distinct spatially
addressable micro electrode or nano electrode.
27. A method according to claim 26 wherein said electrodes are
formed of conducting polymers.
28. A method according to claim 27 wherein said electrodes are
produced by a method selected from inkjet printing, soft
lithography, nanoimprint lithography/lithographically induced self
assembly, VLSI methods and electron beam writing.
29. A method according to claim 1, wherein the immobilised
molecules are immobilised onto a single electrode.
30. A method according to claim 29 wherein the electrode(s)
transduce a signal when a target molecule binds to an immobilised
molecule present in the same element as an electrode.
31. A method for typing single nucleotide polymorphisms (SNPs) and
mutations in nucleic acids, comprising the steps of: a) providing a
repertoire of probes complementary to one or more nucleic acids
present in a sample, which nucleic acids may possess one or more
polymorphisms, said repertoire being presented such that molecules
in said repertoire may be individually resolved; b) exposing the
sample to the repertoire and allowing nucleic acids present in the
sample to hybridise to the probes at a desired stringency and
optionally to be processed by enzymes; c) detecting individual
hybridised nucleic acid molecules after optionally eluting the
unhybridised nucleic acids from the repertoire.
32. A method according to claim 31, wherein the repertoire is
arrayed on a solid phase.
33. A method according to claim 31, wherein the repertoire is
arrayed at a density which allows molecules in said repertoire to
be individually resolved.
34. A method according to claim 33, wherein said array is an array
according to claim 31.
35. A method according to claim 31, wherein the sample is exposed
to a second repertoire of probes, which probes bind to one or more
molecules of the sample at a different position to the probes of
the first repertoire.
36. A method according to claim 35, wherein said first and second
repertoires are differentially labelled.
37. A method for determining the complete or partial sequence of a
target nucleic acid, comprising the steps of: a) providing a first
set of probes complementary to one or more nucleic acids present in
a sample, said first set of probes being presented such that
arrayed molecules may be individually resolved; b) hybridising a
sample comprising a target nucleic acid to the first set of probes;
c) hybridising one or more further probes of defined sequence to
the target nucleic acid; d) detecting the binding of individual
further probes to the target nucleic acid; and e) detecting the
approximate distance separating each probe or the order of each
probe.
38. A method according to claim 37, wherein the first set of probes
is a repertoire of probes.
39. A method according to claim 38, wherein the repertoire is
arrayed on a solid phase.
40. A method according to claim 39, wherein the target nucleic
acids are captured to the solid phase at one or more points.
41. A method according to claim 37, wherein the repertoire is
arrayed at a density which allows molecules in said repertoire to
be individually resolved.
42. A method according to claim 37, wherein the probes are
differentially labelled.
43. A method for determining the number of sequence repeats in a
sample of nucleic acid, comprising the steps of: a) providing one
or more probes complementary to one or more nucleic acids present
in a sample, which nucleic acids may possess one or more sequence
repeats, said probes being complementary to a sequence flanking one
end of the repeats, said probes being presented such that molecules
may be individually resolved; b) contacting the nucleic acids with
labelled probes complementary to units of said sequence repeats and
a differentially labelled probe complementary to the flanking
sequence at the other end of the targeted repeats; c) contacting
the complex formed in b) with probes in a); and d) determining the
number of repeats present on each sample nucleic acid by individual
assessment of the number of labels incorporated into each molecule
and only counting those molecules with which the differentially
labelled probe complementary to the flanking sequence is also
associated.
44. A method according to claim 43, wherein the repertoire is
arrayed on a solid phase.
45. A method according to claim 43, wherein the repertoire is
arrayed at a density which allows molecules in said repertoire to
be individually resolved.
46. A method for analysing the expression of one or more genes in a
sample, comprising the steps of: a) providing a repertoire of
probes complementary to one or more nucleic acids present in a
sample, said repertoire being presented such that molecules may be
individually resolved; b) hybridising a sample comprising said
nucleic acids to the probes; and c) determining the nature and
quantity of individual nucleic acid species present in the sample
by counting single molecules which are hybridised to the
probes.
47. A method according to claim 46, wherein the repertoire is
arrayed on a solid phase.
48. A method according to claim 46, wherein the repertoire is
arrayed at a density which allows molecules in said repertoire to
be individually resolved.
49. A method according to claim 46, wherein the repertoire
comprises a plurality of probes of each given specificity.
50. A method for typing single nucleotide polymorphisms (SNPs) and
mutations in nucleic acids, comprising the steps of: a) providing a
repertoire of probes complementary to one or more nucleic acids
present in a sample, which nucleic acids may possess one or more
polymorphisms; b) arraying said repertoire such that each probe in
the repertoire is resolvable individually; c) exposing the sample to
the repertoire and allowing nucleic acids present in the sample to
hybridise to the probes at a desired stringency and optionally be
processed by enzymes such that hybridised/processed nucleic
acid/probe pairs are detectable; d) eluting the unhybridised
nucleic acids from the repertoire and detecting individual
hybridised nucleic acid/probe pairs; e) analysing the signal
derived from step (d) and computing the confidence in each
detection event to generate a PASS table of high-confidence
results; and f) displaying results from the PASS table to type
polymorphisms present in the nucleic acid sample.
51. A method according to claim 50, wherein confidence in each
detection event is computed in accordance with Table 1.
52. A method according to claim 50, wherein detection events are
generated by labelling the sample nucleic acids and/or the probe
molecules, and imaging said labels on the array using a
detector.
53. A method according to claim 50, wherein the probe and/or target
acts as a primer or ligation substrate.
54. A method according to claim 50, wherein the probe and/or target
is enzymatically processed by ligases or polymerases or
thermophilic varieties thereof.
55. A method according to claim 50, wherein the probe forms
secondary structures which facilitate or stabilise hybridisation or
improve mismatch discrimination.
56. A method for determining the sequence of all or part of a
target nucleic acid molecule which method comprises: (i)
immobilising the target molecule to a solid phase at two or more
points such that the molecule is substantially horizontal with
respect to the surface of the solid phase; (ii) straightening the
target molecule, during or after immobilisation; (iii) contacting
the target molecule with a nucleic acid probe of known sequence;
and (iv) determining the position within the target molecule to
which the probe hybridises.
57. A method according to claim 56 wherein the target molecule is
contacted with a plurality of probes.
58. A method according to claim 57 wherein each probe is labelled
with a different detectable label.
59. A method according to claim 57 wherein the target molecule is
contacted sequentially with each of the plurality of probes.
60. A method according to claim 59 wherein each probe is removed
from the target molecule prior to contacting the target molecule
with a different probe.
61. A method according to claim 57 wherein the target molecule is
contacted with all of the plurality of probes substantially
simultaneously.
62. A method according to claim 60 wherein the probes are removed
by heating, modifying the salt concentration or pH, or by applying
an appropriately biased electric field.
63. A method according to claim 56 wherein the target is
substantially a double stranded molecule and is probed by strand
invasion using PNA or LNA.
64. A method according to claim 56 wherein the target nucleic acid
molecule is a double-stranded molecule and is derived from a
single-stranded nucleic acid molecule of interest by synthesising a
complementary strand to said single-stranded nucleic acid.
65. A method for determining the sequence of all or part of a
target single-stranded nucleic acid molecule which method
comprises: (i) immobilising the target molecule to a solid phase at
two or more points such that the molecule is substantially
horizontal with respect to the surface of the solid phase; (ii)
straightening the target molecule, during or after immobilisation;
(iii) contacting the target molecule with a plurality of nucleic
acid probes of known sequence, each probe being labelled with a
different detectable label; and (iv) ligating bound probes to form
a complementary strand.
66. A method according to claim 65 wherein prior to step (iv), any
gaps between bound probes are filled by polymerisation primed by
said bound probes.
67. A method according to claim 65 wherein the solid phase is a
bead or particle.
68. A method according to claim 65 wherein the solid phase is a
substantially flat surface.
69. A method for arraying a plurality of nucleic acid molecules
which method comprises: (i) immobilising the plurality of nucleic
acid molecules randomly to a solid substrate; (ii) optionally
horizontalising and straightening the molecules, during or after
immobilisation; and (iii) contacting the plurality of nucleic acid
molecules with a plurality of probes, each probe being labelled,
such that each immobilised molecule can be identified uniquely by
detecting the probes bound to the molecule.
70. A method according to claim 69 wherein the plurality of nucleic
acid molecules are immobilised at a density such that individual
immobilised molecules in the sample can be individually
resolved.
71. A method for arraying a plurality of nucleic acid molecules
which method comprises: (i) contacting the plurality of nucleic
acid molecules with a plurality of probes, each probe being
labelled with a tag which indicates uniquely the identity of the
probe, such that each molecule can be identified uniquely by
detecting the probes bound to the molecule and determining the
identity of the corresponding tags; (ii) immobilising the plurality
of nucleic acid molecules randomly to a solid substrate; and
optionally (iii) horizontalising and straightening the molecules,
during or after immobilisation.
72. A method according to claim 71 wherein the plurality of nucleic
acid molecules are immobilised at a density such that individual
immobilised molecules in the sample can be individually
resolved.
73. A method according to claim 69 or 71 wherein the solid phase is
a substantially flat solid substrate or a
bead/particle/rod/bar.
74. A method for producing a molecular array which method comprises
immobilising to a solid phase a plurality of molecules present in a
sample, wherein the plurality of molecules are immobilised at a
density such that individual molecules in the sample can be
individually resolved.
75. A method according to claim 74 wherein the plurality of
molecules are polypeptides.
76. A method according to claim 74 wherein the plurality of
molecules comprise the genome, proteome, transcriptome or
metabolome of a cell, tissue or organism.
77. A method for identifying and/or characterising one or more
molecules of a plurality of molecules present in a sample which
method comprises: (i) producing a molecular array by a method
comprising immobilising to a solid phase a plurality of molecules
present in a sample, wherein the plurality of molecules are
immobilised at a density such that individual molecules in the
sample can be individually resolved; and (ii) identifying and/or
characterising one or more molecules immobilised to the array.
78. A method according to claim 77 wherein step (ii) comprises
contacting the array with one or more probes and determining
whether one or more of said probes interacts with one or more of
said immobilised molecules.
79. A method according to claim 77 wherein one or more of said
immobilised molecules is interrogated by an optical method.
80. A method according to claim 79 wherein the optical method is
selected from far-field optical methods, near-field optical
methods, epi-fluorescence spectroscopy, scanning confocal
microscopy, two-photon microscopy and total internal reflection
microscopy.
81. A method according to claim 78 wherein one or more of said
immobilised molecules is interrogated by scanning probe microscopy
or electron microscopy.
82. A method according to claim 77 wherein a physicochemical
property of the immobilised molecules is determined, such as shape,
size, mass, charge or hydrophobicity.
83. A method according to claim 77 wherein an electromagnetic,
electrical, optoelectronic and/or electrochemical property of the
immobilised molecules is determined.
84. A method according to claim 77 wherein a characteristic of a
complex between an immobilised molecule and a probe is
determined.
85. A method according to claim 77 wherein the plurality of
molecules are polypeptides.
86. A method according to claim 77 wherein the plurality of
molecules comprise the proteome, transcriptome or metabolome of a
cell, tissue or organism.
87. A method according to claim 77 wherein the characteristics of
individual immobilised molecules are learnt using a computational
method.
88. A method according to claim 87 wherein the computational method
is a neural network or artificial intelligence.
89. A molecular array obtained by the method of claim 87 wherein
the characteristics of a plurality of immobilised molecules and
their corresponding physical location in the array have been
determined.
90. A multiplexed array comprising a plurality of arrays, each
array comprising, immobilized to a solid phase, a plurality of
molecules at a density which allows individual immobilized
molecules to be individually resolved, wherein each molecule in the
array is spatially addressable and the identity of each molecule is
known or determined prior to immobilization.
91. A method for identifying and/or characterising one or more
molecules of a plurality of molecules present in a sample which
method comprises: (i) producing a molecular array by a method
comprising immobilising to a solid phase a plurality of molecules
present in a sample, wherein the plurality of molecules are
immobilised at a density such that individual molecules in the
sample can be individually resolved; and (ii) identifying and/or
characterising one or more molecules immobilised to the array by a
method comprising contacting the immobilised molecules with a
plurality of encoded probes.
92. A method according to claim 91 wherein each probe is encoded
by virtue of being labelled with a tag which indicates uniquely the
identity of the probe, such that an immobilised molecule can be
identified uniquely by detecting the probes bound to the molecule
and determining the identity of the corresponding tags.
93. A method according to claim 92 wherein the tagged probes are
produced using combinatorial chemistry.
94. A method according to claim 92 wherein the tag is selected from
a nanoparticle, a nanorod and a quantum dot.
95. A method according to claim 92 wherein each tag comprises
multiple molecular species.
96. A method according to claim 92 wherein the tags are detectable
by optical means.
97. A method according to claim 92 wherein the tags are particulate
and comprise surface groups.
98. A method according to claim 92 wherein the tags are particulate
and encase detectable entities, such as particles or molecules.
99. A method according to claim 92 wherein tags can be detected and
distinguished by scanning probe microscopy.
100. A method according to claim 92 wherein the solid substrate is
a bead/particle/rod/bar.
101. A method according to claim 92 wherein the solid phase
comprises channels or capillaries within which the molecules are
immobilised.
102. A method according to claim 92 wherein the solid phase
comprises a gel.
103. A biosensor comprising a molecular array comprising,
immobilized to a solid phase, a plurality of molecules at a density
which allows individual immobilized molecules to be individually
resolved, wherein each molecule in the array is spatially
addressable and the identity of each molecule is known or
determined prior to immobilization.
104. An integrated biosensor comprising a molecular array according
to claim 103, an excitation source, a detector, such as a CCD, and
optionally, signal processing means.
105. A biosensor according to claim 103 wherein the biosensor
comprises a plurality of elements, each element containing distinct
molecules, such as probe sequences.
106. A biosensor according to claim 105 wherein each element is
specific for the detection of a different target, such as different
pathogenic organisms.
107. A biosensor according to claim 103 wherein the molecular array
is formed on an optical fibre.
108. A method according to claim 26, in which: (a) the immobilised
molecule is selectively coated with a material that facilitates
detection; (b) the coating is a conducting material which allows a
circuit to form between only those electrodes which are
occupied by the target molecule by virtue of its binding to the
allelic probe present on the electrode; (c) a potential difference
is applied between electrodes in any two contiguous groups of
electrodes and the electrodes on which probes interact with target
are identified by virtue of the fact that a current flows between
them; (d) the conducting material comprises silver, gold, palladium
or conjugated polymers; or (e) where multiple single molecules span
the electrodes, the haplotype frequency is given by the amount of
current that flows between the electrodes.
109. A method according to claim 69 in which the plurality of
probes are labeled with a tag which indicates uniquely the identity
of the probe.
110. A method according to claim 69 in which the plurality of
tagged probes are hybridized substantially simultaneously or in
groups of probes.
111. A method according to claim 1 in which probes are grouped
according to their Tm.
112. A method according to claim 69, in which each of the plurality
of labeled probes are successively hybridized to the immobilized
nucleic acid and a record of those that hybridise to each molecule
can be used to identify or re-assemble the sequence of the
immobilized molecule.
113. A method according to claim 112 in which haplotype frequencies
can be determined.
114. A method according to claim 70 in which probes are 3-9 mers
in length.
115. An array comprising, immobilized to a solid phase, a plurality
of molecules at a density which allows individual immobilized
molecules to be individually resolved, wherein each molecule in the
array is spatially addressable and the identity of each molecule is
known or determined prior to immobilization.
116. A method of identifying one or more target molecules in a
sample, comprising: providing an array comprising a plurality of
molecules immobilized to a solid phase at a density which allows
individual immobilized molecules to be individually resolved,
wherein each individual immobilized molecule in the array is
spatially addressable and the identity of each immobilized molecule
is known or encoded; and contacting the array with said sample and
interrogating one or more individual immobilized molecules to
determine whether a target molecule has bound.
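As a rough numerical illustration of the density condition recited in the claims above (a density "which allows individual immobilized molecules to be individually resolved"), the following sketch estimates what fraction of molecules would have no neighbour within a given resolution distance. It assumes that molecules land independently and uniformly on the surface (a two-dimensional Poisson model). The function names and the 250 nm resolution figure are illustrative assumptions and are not taken from this application.

```python
import math


def fraction_resolvable(density_per_um2: float, resolution_um: float) -> float:
    """Probability that a molecule has no neighbour closer than resolution_um.

    Assumes a 2D Poisson (uniform, independent) placement model, for which
    P(no neighbour within r) = exp(-density * pi * r^2).
    """
    if density_per_um2 < 0 or resolution_um < 0:
        raise ValueError("density and resolution must be non-negative")
    return math.exp(-density_per_um2 * math.pi * resolution_um ** 2)


def max_density_for(fraction: float, resolution_um: float) -> float:
    """Highest density (molecules per um^2) at which at least `fraction`
    of molecules are expected to be isolated, under the same model."""
    if not 0 < fraction <= 1 or resolution_um <= 0:
        raise ValueError("need 0 < fraction <= 1 and resolution_um > 0")
    return -math.log(fraction) / (math.pi * resolution_um ** 2)


if __name__ == "__main__":
    # Hypothetical optical resolution of 0.25 um (an assumption).
    density = max_density_for(0.9, 0.25)
    print(f"about {density:.2f} molecules per um^2 keeps 90% isolated")
```

Under this model the tolerable density falls with the square of the resolution distance, which is one way to see why higher-resolution readouts such as scanning probe microscopy would permit denser arrays than far-field optics.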
Description
FIELD OF THE INVENTION
[0001] The present invention relates to single molecule analytical
approaches which are performed using molecular arrays.
[0002] In particular, the single molecule analytical approaches
according to the invention involve tagging schemes, the detection
of labels/tags and the determination of the spatial coordinates of
a single molecule on the array. The invention further involves the
direct measurement of physico-chemical properties of individual
molecules and their interaction with other molecules. The use of
the invention in a number of methods is described including SNP
typing, haplotyping, gene expression analysis, proteomics and
sequence determination, where the invention is particularly
relevant to ultra-fast, parallel DNA sequencing which is applicable
to the sequencing of whole genomes.
BACKGROUND TO THE INVENTION
[0003] The analytical methods generally in use today involve
analysing the reactions of molecules in bulk. Although bulk or
ensemble approaches have in the past proved useful, culminating in
an explosion in our understanding of molecular biology and recently
to the sequencing of the human genome, there are barriers to future
progress in a number of directions. The results generated by bulk
analysis are an average of millions of individual molecular
reactions where multiple events, multi-step events and variations
from the average cannot be resolved and detection methods that are
adapted for high frequency events are insensitive to rare
events.
[0004] The bulk nature of conventional methods does not allow
access to specific characteristics of individual molecules. One
example in genetic analysis is the need to obtain genetic phase or
haplotype information--the specific alleles associated with each
chromosome. Bulk analysis cannot resolve haplotypes in a
heterozygotic sample. The currently available molecular biology
techniques for this, such as allele-specific or single molecule
PCR, are difficult to optimise and apply on a large scale.
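The haplotype ambiguity described in paragraph [0004] can be made concrete with a small sketch. It assumes a diploid sample typed at biallelic sites, with each haplotype written as a string of alleles; the function names are hypothetical. A bulk measurement reports only the unordered pair of alleles at each site, so different haplotype pairs can give the same bulk genotype.

```python
from itertools import product


def bulk_genotype(hap_a: str, hap_b: str) -> tuple:
    """Per-site unordered allele pairs, as a bulk assay would report them."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same sites")
    return tuple(tuple(sorted(pair)) for pair in zip(hap_a, hap_b))


def compatible_haplotype_pairs(genotype: tuple) -> list:
    """All unordered haplotype pairs consistent with a bulk genotype."""
    pairs = set()
    for choice in product((0, 1), repeat=len(genotype)):
        hap_a = "".join(site[c] for site, c in zip(genotype, choice))
        hap_b = "".join(site[1 - c] for site, c in zip(genotype, choice))
        pairs.add(tuple(sorted((hap_a, hap_b))))
    return sorted(pairs)


if __name__ == "__main__":
    # Two heterozygous sites: the bulk genotype cannot tell AG/CT from AT/CG.
    g = bulk_genotype("AG", "CT")
    print(g == bulk_genotype("AT", "CG"))
    print(compatible_haplotype_pairs(g))
```

With n heterozygous sites there are 2^(n-1) candidate haplotype pairs, whereas reading both sites on the same individual molecule would identify the pair directly.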
[0005] Bulk analysis typically requires a large amount of sample
material. For example, microarray gene expression analysis using
unamplified cDNA target typically requires 10^6 cells or 100
micrograms of tissue. Furthermore, neither expression analysis nor
analysis of genetic variation can be performed directly on material
obtained from a single cell, which would be advantageous in a number
of cases, e.g. analysis of mRNA from cells in early development or
genomic DNA from sperm.
[0006] Furthermore, it would be highly desirable if the
amplification processes that are required before most biological or
genetic analysis could be avoided. This includes amplification of
molecules by cloning and the polymerase chain reaction (PCR). The
need is particularly acute in the large scale analysis of SNPs. The
cost of performing SNP detection reactions on the scale required
for high-throughput analysis of polymorphisms in a population is
prohibitive if each reaction has to be conducted separately, or if
only a limited multiplexing possibility exists. The need to design
primers and perform PCR on a large number of SNP sites presents a
major bottleneck. DNA pooling is a solution for some aspects of
genetic analysis but accurate allele frequencies must be
obtained.
[0007] Sequencing the human genome for the first time took more
than ten years and hundreds of millions of dollars. Although this
was achieved by use of the Sanger dideoxy method (Sanger F, Nicklen
S, Coulson AR. DNA sequencing with chain-terminating inhibitors.
Proc Natl Acad Sci USA. 1977 December;74(12):5463-7), the methods
involved are inherently slow and costly, relying on electrophoresis,
which is slow, has a limited separation range and is not amenable to
high degrees of parallelism.
[0008] Now, however, the need for large scale re-sequencing of
individual human genomes and de novo sequencing in pathogens and
model organisms require cheaper and faster alternatives to be
developed. Recently, several methods that would avoid gel
electrophoresis, cloning or the Polymerase-chain reaction (PCR)
have been suggested (Pray L, A cheap personal genome? The
Scientist, Daily News Oct. 4, 2002). One idea, "sequencing by
synthesis" (SbS), which is attracting wide interest, involves the
identification of each nucleotide immediately following its
incorporation by polymerase into an extending DNA strand. Today,
one SbS approach, pyrosequencing, is widely used for SNP
(single-nucleotide polymorphism) typing (Ronaghi M, Uhlen M, Nyren
P. A sequencing method based on real-time pyrophosphate. Science.
1998 Jul. 17;281(5375):363, 365.). In this case, the detection is
based on pyrophosphate (PPi) release, its conversion to ATP, and
the production of visible light by firefly luciferase. However,
because the signal is diffusible, pyrosequencing cannot take
advantage of the massive degree of parallelism that becomes
available when surface immobilised reactions are analysed.
[0009] Increasing read-lengths beyond those currently available
would be highly useful. Moreover, it would be advantageous if
sequencing runs could be on the scale of genomes, at least small
genomes or whole genes, or if thousands or millions of DNA fragments
could be sequenced in parallel. It would also be useful if the
confidence in the sequence information that is obtained could be
increased. It would also be useful if the underlying haplotype
information of the sequence could be retained. These facilities
would aid the task of functional genomics by enabling
genotype-phenotype correlations to be obtained at an unprecedented
resolution and scale and would be widely applicable to disease
genetics. If large amounts of data could be handled efficiently,
sequencing would offer a number of advantages over typing SNPs. It
would also have wide applications as a means for determining the
identity of a molecule.
[0010] Array technology offers massive parallelization, but present
implementations are limited by the constraints of bulk analysis.
Array re-sequencing (Patil N, et al. Blocks of limited haplotype
diversity revealed by high-resolution scanning of human chromosome
21. Science. 2001 Nov. 23;294(5547):1719-23) offers high parallelism
but is highly complex.
[0011] Furthermore, following the sequencing of the human genome
the emphasis has shifted to the analysis of gene products, namely
RNA and particularly proteins. The methods available for protein
analysis are not typically available in a highly parallel format.
2-D gel electrophoresis has been traditionally used to analyse
populations of proteins but this method is difficult to implement
particularly as it relies on gel electrophoresis. Recently there
have been efforts towards developing protein microarrays. However, to
date there is no established method for conducting proteomics in a
rapid and sensitive manner, which is widely applicable.
[0012] Furthermore, sensitive, high-throughput methods are needed
for analysing the interactions of proteins with small
molecules.
[0013] New techniques are being developed that forgo traditional
`bulk` biochemical methods that analyse the average signal from an
ensemble of molecules and instead examine single molecules. A
single binding event or reaction can be amplified by RCA (Lizardi P
M, Huang X, Zhu Z, Bray-Ward P, Thomas D C, and Ward D C. 1998.
Mutation detection and single-molecule counting using isothermal
rolling-circle amplification. Nat Genet 19:225-32) or by labeling
with nanoparticles (Schultz S, Smith D R, Mock J J, Schultz D A.
Single-target molecule detection with nonbleaching multicolor
optical immunolabels. Proc Natl Acad Sci USA. 2000 Feb.
1;97(3):996-1001 AND Oldenburg S J, Genick C C, Clark K A, Schultz
D A. Base pair mismatch recognition using plasmon resonant particle
labels. Anal Biochem. 2002 Oct. 1;309(1):109-116), and a number of
techniques have been developed that can start with a single
molecule and then do PCR amplification; these include MPSS (Brenner
S, Williams S R, Vermaas E H, Storck T, Moon K, McCollum C, Mao J
I, Luo S, Kirchner J J, Eletr S, DuBridge R B, Burcham T, Albrecht
G. In vitro cloning of complex mixtures of DNA on microbeads:
physical separation of differentially expressed cDNAs. Proc Natl
Acad Sci USA. 2000 Feb. 15;97(4):1665-70), Colony PCR (Mitra R D,
Church G M. In situ localized amplification and contact replication
of many individual DNA molecules. Nucleic Acids Res. 1999 Dec.
15;27(24):e34 AND Mitra R D, Butty V L, Shendure J, Williams B R,
Housman D E, Church G M. Digital genotyping and haplotyping with
polymerase colonies. Proc Natl Acad Sci USA 2003 May
13;100(10):5926-31. Epub 2003 May 2 AND Mitra R D, Shendure J,
Olejnik J, Edyta-Krzymanska-Olejnik, Church G M. Fluorescent in
situ sequencing on polymerase colonies. Anal Biochem. 2003 Sep.
1;320(1):55-65) and Digital PCR (Vogelstein B, Kinzler K W. Digital
PCR. Proc Natl Acad Sci USA. 1999 Aug. 3;96(16):9236-41). A
commercial SNP typing system based on Fluorescence Correlation
Spectroscopy (FCS) of multi-labelled single molecules has recently
been introduced (Olympus/Evotec OA). However, it is not a
significant departure from other homogeneous techniques because
even though single molecules are detected as they pass through the
small focus volume of a laser, the assay strategy still retains the
PCR step. Single molecule methods using optical laser-trapping have
been developed to study the transcription of immobilised RNA
polymerase molecules (Yin et al., 1995, Science 270:1653-56).
[0014] The signal from a single molecule does not need to be
amplified to be detected, as a single fluorophore label emits
enough photons to be detected if background noise is sufficiently
suppressed.
[0015] In recent years methods have been developed for detecting
and analysing individual molecules labeled with single dye
molecules on surfaces or in solution. Individual ATP turnover by
single myosin molecules has been visualised using evanescent wave
excitation (Funatsu et al., 1995, Nature 374: 555-59). Moreover,
analysis has been performed on single molecules in unamplified
genomic DNA (Castro A. and Williams J G K, 1997, Anal. Chem.
69:3915-3920). Coincident single molecule detection of two PNA
probes each labeled with a single fluorophore, as they hybridise to
proximal sequences on genomic DNA passing through a sheath flow
provided the specificity to detect an unamplified, single-copy
target DNA molecule within a complex genomic background in a
homogeneous assay. Methods for sequencing on single molecules are
now under development (Braslavsky I, Hebert B, Kartalov E, Quake S
R. Sequence information can be obtained from single DNA molecules.
Proc Natl Acad Sci USA. 2003 Apr. 1;100(7):3960-4.).
[0016] The methods described so far detect fluorescent signals from
single molecules but do not visualize the molecules themselves.
Other techniques visualize the DNA as a polymer on a surface. DNA
polymers on a surface have been probed at SNP sites using tagged
probes that can be detected by the AFM and by fluorescent
probes.
[0017] Technologies that permit the elimination of PCR, such as
those based on single molecule examination, would increase
throughput and bring down costs but are faced with the formidable
complexity of the human genome which impacts the specificity with
which a desired locus can be targeted. Nevertheless, encouragement
can be found in recent results from generic PCR which suggest that
the problems of genome complexity can be substantially overcome by
suppressing repetitive sequences.
[0018] Such whole human genome sequencing would be able to access
disease causing mutations directly including those to which the
common SNPs do not associate through linkage disequilibrium. It
could also open up an era of personalized medicine in which health
management is informed by an individual's genomic sequence.
[0019] The prior art pertaining to molecular arrays is not
specifically applicable to the analysis of single molecules. Where
the term "molecule" is mentioned, it is obvious that the methods
are not relevant to single molecule techniques because strategies
for detecting single molecules are not described.
[0020] In general, the methods of the prior art do not examine
single molecules individually but examine large homogeneous
populations of substantially identical molecules, wherein the
signal which is used to identify a label originates from a bulk
population of molecules rather than an individual molecule.
Conventional usage does not generally facilitate this distinction:
phrases such as "a molecule" or "a sample molecule" as used in the
prior art generally do not refer to an individual molecule
considered separately or in isolation from other molecules,
including separately from other molecules of identical composition
and structure, but to populations comprising millions or more
molecules of identical structure. Invariably, investigators are not
working with samples consisting of single molecules but rather with
samples comprising a plurality of identical molecules. In
particular, even where these investigators do not (as is consistent
with conventional usage) explicitly note this point, they take
measures which would apply only to samples of pluralities of
identical molecules, and do not take measures associated with
working with single molecules.
[0021] In WO02/074988 the present inventor described a method for
preparing and using single molecule arrays which are comprised of
individually resolvable, spatially addressable molecules, the
identity of which is known or determined prior to
immobilisation.
SUMMARY OF THE INVENTION
[0022] The present invention overcomes the above-mentioned
practical limitations associated with bulk analysis. This is
achieved by the precision, richness of information, speed and
throughput that is obtained by taking analysis to the level of
single molecules.
[0023] To date single molecule analysis has only been conducted on
very simple samples but presently the need to apply tests on larger
scales is increasing. An important aspect of any single molecule
technique for rapid analysis of large numbers of molecules will be
a system for sorting/organising individual molecules and
tracking/following individual events on them in parallel.
[0024] The approach of the present invention is set apart from
traditional bulk array technologies inter alia by the type of
information it aims to acquire, information that is based on the
analysis of single molecules as separate, individual entities. The
low density signals would not be readable by instrumentation
typically used for analysing the results of bulk analysis. The
manufacture of single molecule arrays of the invention requires
special measures as described herein.
[0025] 1. Arrays
[0026] Low Molecular Density
[0027] Arrays useful in the present invention can be produced by a
method which comprises immobilising on a solid phase a plurality of
molecules at a density which allows individual immobilised
molecules to be individually resolved. Alternatively, said method
comprises immobilising to a solid phase a plurality of defined
molecules at a density which allows individual immobilised molecules
to be individually resolved by a method of choice, wherein each
individual molecule in the array is or can become spatially
addressable.
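By way of a non-limiting numerical illustration of a density which
allows individual molecules to be resolved, the sketch below assumes
that molecules land on the surface at random (a two-dimensional
Poisson distribution) and that a molecule counts as resolvable when it
has no neighbour within the resolution distance of the detection
method. Under those assumptions the resolvable fraction is
exp(-density x pi x r^2). The function names and the 0.25 micron
resolution figure are assumptions made for the example and are not
taken from this specification.

```python
import math


def resolvable_fraction(density_per_um2, resolution_um):
    """Fraction of randomly placed molecules with no neighbour closer
    than resolution_um (2-D Poisson assumption of this sketch)."""
    return math.exp(-density_per_um2 * math.pi * resolution_um ** 2)


def max_density(target_fraction, resolution_um):
    """Highest density (molecules per square micron) at which at least
    target_fraction of the molecules remain individually resolvable."""
    return -math.log(target_fraction) / (math.pi * resolution_um ** 2)


if __name__ == "__main__":
    r = 0.25  # hypothetical optical resolution in microns
    rho = max_density(0.9, r)
    print(f"density for 90% resolvable molecules: {rho:.3f} per um^2")
    print(f"check: resolvable fraction = {resolvable_fraction(rho, r):.3f}")
```

A different method of choice (for example scanning probe microscopy)
would simply change the resolution distance passed to the functions.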
[0028] High Molecular Density
[0029] Arrays may moreover be produced by a method which
comprises:
[0030] (i) providing a molecular array comprising a plurality of
molecules immobilised to a solid phase at a density such that
individual immobilised molecules are not capable of being
individually resolved; and
[0031] (ii) reducing the density of functional immobilised
molecules in the array such that the remaining individual
functional immobilised molecules are capable of being individually
resolved.
[0032] The method may also comprise:
[0033] (i) providing a molecular array comprising a plurality of
defined spatially addressable molecules immobilised to a solid
phase at a density such that individual immobilised molecules are
not capable of being individually resolved by optical means or
another method of choice; and
[0034] (ii) reducing the density of functional immobilised
molecules in the array such that each remaining individual
functional immobilised molecule is capable of being individually
resolved.
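The density reduction step can be illustrated with a short
calculation. The sketch assumes that the functional molecules are
randomly distributed and are inactivated at random, so that the
survivors are again randomly distributed at a proportionally lower
density, and that a survivor is resolvable when no other functional
molecule lies within the resolution distance. All names and numbers
are assumptions for the example, not parameters of the invention.

```python
import math


def keep_fraction(initial_density, resolution, target_resolvable):
    """Fraction of functional molecules to retain so that at least
    target_resolvable of the survivors have no functional neighbour
    within `resolution`.

    Assumes random (Poisson) placement and random inactivation; both
    are assumptions of this sketch. Density is in molecules per unit
    area, in the same length units as `resolution`.
    """
    required_density = -math.log(target_resolvable) / (math.pi * resolution ** 2)
    # Never "keep" more than everything that is already there.
    return min(1.0, required_density / initial_density)


if __name__ == "__main__":
    # Hypothetical bulk array: 1000 molecules per square micron,
    # 0.25 micron resolution, 90% of survivors to be resolvable.
    f = keep_fraction(1000.0, 0.25, 0.9)
    print(f"retain about {f:.2e} of the functional molecules")
```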
[0035] According to the above embodiments the invention further
provides a method for producing a double stranded nucleic acid
array, whereby the sample that is arrayed is double stranded prior
to arraying. The invention provides a method for producing a single
stranded nucleic acid array, whereby the sample that is arrayed is
single stranded prior to arraying.
[0036] Alternatively, according to the above embodiments the
invention further provides a method for producing a double stranded
nucleic acid array, whereby the sample that is arrayed is not
double stranded prior to arraying but is made double stranded after
arraying. The invention provides a method for producing a single
stranded nucleic acid array, whereby the sample that is arrayed is
not single stranded prior to arraying but is made single stranded
after arraying.
[0037] Encoded Molecules
[0038] The present invention also provides a method for producing a
molecular array comprising a plurality of molecules immobilised to
a solid phase at a density which allows each individual immobilised
molecule to be individually resolved, wherein the identity of each
individual molecule is encoded and can be decoded, for example with
reference to a look up table.
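Decoding with reference to a look up table can be sketched as follows.
The code values (ordered tag colours), the identities and the array
coordinates are all hypothetical and serve only to show the mapping
from an observed code at an array location to a molecule identity.

```python
# Hypothetical look-up table: observed code -> identity of the molecule.
LOOKUP_TABLE = {
    ("red", "green"): "probe_A",
    ("red", "blue"): "probe_B",
    ("green", "blue"): "probe_C",
}


def decode_array(observed_codes, table=LOOKUP_TABLE):
    """Map each array location to a molecule identity.

    observed_codes: dict of (x, y) location -> tuple of tag colours
    read at that location. Codes absent from the table are reported
    as "unknown" rather than guessed.
    """
    return {loc: table.get(code, "unknown") for loc, code in observed_codes.items()}


if __name__ == "__main__":
    reads = {(10, 4): ("red", "green"), (55, 7): ("green", "blue"), (3, 90): ("blue", "blue")}
    for location, identity in decode_array(reads).items():
        print(location, identity)
```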
[0039] The present invention also relates to methods of arraying
pluralities of nucleic acid molecules at low density where,
although the identity of the nucleic acids may be unknown prior to
immobilisation, the array is subsequently characterised by the use
of encoded probes, such as tagged probes or by successive serial
addition and/or removal of probes from a repertoire and then
reconstructing the sequence identity from information about which
of the probes interact with which of the immobilized nucleic
acids.
[0040] In this embodiment the molecules are first placed randomly
on the surface and the decoding process is carried out to make them
spatially addressable i.e. to correlate an individual location on
the array with the identity of the molecule at that particular
location. This means that the molecules may be randomly distributed
on the array, which is a simpler, faster and cheaper way of putting
the molecules on the surface as compared to in situ synthesis or
spotting of spatially addressable arrays. The decoding process may
involve methods known in the art such as sequencing by synthesis.
In a preferred embodiment the decoding process involves interacting
the array with a repertoire of probes.
[0041] Thus, in a further aspect, the present invention provides a
method for arraying a plurality of nucleic acid molecules which
method comprises:
[0042] (i) contacting the plurality of nucleic acid molecules with
a plurality of probes, each probe being labelled with a tag which
indicates the identity of the probe, such that each molecule can be
identified by detecting the probes bound to the molecule and
determining the identity of the corresponding tags;
[0043] (ii) immobilising the plurality of nucleic acid molecules
randomly to a solid substrate; and optionally
[0044] (iii) horizontalising and optionally straightening the
molecules during or after immobilisation.
[0045] The plurality of nucleic acids are immobilised at a density
which allows individual molecules in the array to be individually
resolved.
[0046] Horizontalisation is defined as the immobilisation of the DNA
so that it lies substantially in a plane parallel to the surface.
This may be achieved by multiple interactions on the surface or by
directional fluid flow. In most cases of horizontalisation it is
preferable that the molecule is substantially straightened, as can
be assessed by far-field optical microscopy. An exception is where
different regions of a DNA polymer are deliberately directed to
particular spatial locations. This horizontalisation and
straightening can also be described as combing but the processes
used are different from those used in molecular combing.
[0047] In an alternative embodiment, the present invention provides
a method for arraying a plurality of nucleic acid molecules which
method comprises:
[0048] (i) immobilising the plurality of nucleic acid molecules
randomly to a solid substrate;
[0049] (ii) optionally horizontalising and straightening the
molecules during or after immobilisation;
[0050] (iii) contacting the plurality of nucleic acid molecules
with one or a plurality of probes, each probe being labelled with
a tag which indicates the identity of the probe, such that each
immobilised molecule can be identified by detecting the probes
bound to the molecule and determining the identity of the
corresponding tags; and
[0051] (iv) optionally repeating step (iii) until the identity of
each molecule becomes substantially established.
[0052] The plurality of nucleic acids are immobilised at a density
which allows individual molecules in the array to be individually
resolved.
[0053] In one specific embodiment the repertoire is of
polynucleotides whose identity is decoded by the following
steps:
[0054] i) adding to said array composition a first set of labelled
decoding nucleotides/probes;
[0055] ii) allowing at least one nucleotide of the molecule to
base-pair with at least one nucleotide of at least one of said
labelled decoding nucleotides/probes;
[0056] iii) detecting the presence of said label at a particular
location;
[0057] iv) optionally removing said decoding probes and repeating
steps i) to iii), wherein said nucleotide sequence being decoded
will base-pair with a different nucleotide of a second set of
decoding nucleotides/probes; and
[0058] v) compiling the sequence to provide the identity of the
molecule at particular array locations.
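The compiling step of the decoding procedure above can be sketched in
a few lines. The sketch assumes that each decoding cycle yields at
most one detected label per array location and that each label
corresponds to one called base; the label-to-base mapping, the
coordinates and the use of "N" for a cycle with no signal are
assumptions for illustration only.

```python
# Hypothetical mapping from detected label to the base it reports.
LABEL_TO_BASE = {"blue": "A", "green": "C", "yellow": "G", "red": "T"}


def compile_sequences(cycles):
    """Compile per-location sequences from successive decoding cycles.

    cycles: list with one dict per cycle, mapping an array location to
    the label detected there in that cycle. A location with no signal
    in a given cycle receives "N" at that position.
    """
    calls_by_location = {}
    for cycle_index, detections in enumerate(cycles):
        for location, label in detections.items():
            calls = calls_by_location.setdefault(location, [])
            # Pad any earlier cycles in which this location was silent.
            calls.extend("N" * (cycle_index - len(calls)))
            calls.append(LABEL_TO_BASE.get(label, "N"))
    n_cycles = len(cycles)
    return {
        loc: "".join(calls) + "N" * (n_cycles - len(calls))
        for loc, calls in calls_by_location.items()
    }


if __name__ == "__main__":
    cycles = [
        {(1, 1): "blue", (2, 2): "red"},
        {(1, 1): "green"},
        {(1, 1): "red", (2, 2): "yellow"},
    ]
    print(compile_sequences(cycles))
```

Once every location has a compiled sequence, the array is spatially
addressable in the sense used here: a location can be looked up to
obtain the identity of the molecule found there.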
[0059] The probes may be oligonucleotides which are shorter in
length than the polynucleotides of the array.
[0060] In the above embodiments, upon determination of identity, the
immobilised molecules of the array become spatially
addressable.
[0061] Iteration of this process can further sequence characterize
the molecule until a full sequence is obtained as will be described
below.
[0062] Once decoded, the array can then be used for further
investigations, for example in mRNA quantitation.
[0063] Direct Arraying the Sample
[0064] In a further aspect, the present invention can be used more
generally to produce low density arrays of molecules in a sample to
enable characterization of the molecules in the sample under
analysis. Thus the present invention also provides a method for
producing a molecular array which method comprises immobilising to
a solid phase a plurality of molecules present in a sample under
analysis, wherein the plurality of molecules are immobilised at a
density such that individual molecules in the sample can be
individually resolved.
[0065] The plurality of molecules may comprise the genome,
proteome, transcriptome or metabolome of a cell, tissue or
organism. The resulting arrays may be used in genome or proteome
analyses.
[0066] In a specific embodiment of the invention an array of
capture molecules is spread onto a surface to form a primary
array. This is followed by the formation of a secondary array by
the addition of the sample molecules to the surface under
conditions such that the sample molecules interact with the primary
array. For example, sticky ends are created on sample DNA (these
may be optionally further recessed) and bind to probes randomly
arrayed on the surface. Alternatively the surface may comprise a
spatially random set of oligonucleotide capture probes which will
bind to any regions of complementary sequence that may be
accessible. Accessibility is induced in substantially double
stranded target by partial denaturation (heating, pH etc) or by use
of the RecA protein. Alternatively, the sample may be substantially
single stranded.
[0067] Linked mRNA-Protein Arrays
[0068] A puromycin is attached to the 3' end of an mRNA coding for a
protein of interest using a synthetic linker. The mRNA-puromycin
complex is subjected to in vitro translation to generate the protein,
and the puromycin then links to the protein. Hence a protein is linked
to its coding mRNA. A spatially addressable array is then made in
which each molecular complex is individually resolvable or is
individually functionalized so that it can be individually
resolvable. Alternatively, this mRNA-protein complex is then spread
onto a surface to produce a spatially random array in which each
molecular complex is individually resolvable. This can then be made
addressable by binding of decoding sequences. A contiguous sequence
length between 10 and 16 bases will in most cases be sufficient to
identify the mRNA and thereby the protein uniquely. If the sequence
is obtained from a particular position along the mRNA the sequence
information required will be less. For example 10 or 11 bases of
sequence information from the 3' untranslated region will be
sufficient. The sequence information can be obtained by any method
known in the art, including Sequencing by Hybridisation and
Sequencing by synthesis. In both cases a primer could be provided
such as oligo dT which binds at the PolyA tail and primes synthesis
of 10 bases of sequence information. In the case of Sequencing by
Hybridisation the oligo dT will promote stacking hybridization of,
for example, 6-mers which are differentially tagged. The
characteristics and interactions of the protein can be probed by
the methods of this invention.
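The statement that 10 to 16 contiguous bases will in most cases
identify an mRNA can be made concrete with simple arithmetic. The
sketch below counts the distinct sequences of a given length and, on
the simplifying assumption that transcripts behave like independent
random sequences, estimates the chance that a read of that length is
shared with no other transcript. The figure of 30,000 transcripts is
a hypothetical round number chosen for the example.

```python
def distinct_sequences(n_bases):
    """Number of different sequences of n_bases over the 4-letter alphabet."""
    return 4 ** n_bases


def prob_unique(n_bases, n_transcripts):
    """Probability that a read of n_bases matches none of the other
    transcripts, treating sequences as independent and uniformly random
    (a simplification; real transcripts share motifs and families)."""
    return (1.0 - 4.0 ** -n_bases) ** (n_transcripts - 1)


if __name__ == "__main__":
    n_transcripts = 30_000  # hypothetical transcriptome size
    for n in (10, 11, 16):
        print(
            f"{n} bases: {distinct_sequences(n):,} possible sequences, "
            f"P(unique) ~ {prob_unique(n, n_transcripts):.4f}"
        )
```

Reading from a fixed position, such as immediately adjacent to the
polyA tail, removes the need to distinguish among all positions within
each transcript, which is consistent with fewer bases being required
in that case.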
[0069] Protein Arrays
[0070] Spatially addressable arrays of proteins or polypeptides can
be made in which each individual molecule is
individually resolvable. Alternatively, a spatially random protein
array can be made and the molecules of the array made spatially
addressable by binding of a repertoire of peptides, antibodies,
affibodies or aptamers so that they can be identified.
[0071] Directing Different Loci on a Single Polymer Molecule to
Different Spatial Locations
[0072] In one embodiment the immobilised molecules are present
within discrete spatially addressable elements. In one such
embodiment, a plurality of molecular species are present within one
or more of the discrete spatially addressable elements and each
molecular species in an element can be distinguished from other
molecular species in the element by means of a label. In another
embodiment the plurality of molecules are not distinguishable by a
label but comprise a group of sequences, for example representing
members of a gene family, according to which they may be
distinguished. In a further preferred embodiment, the probes are
oligonucleotides or polynucleotides and the molecules are provided
as groups of molecules, wherein members of each group of molecules are
complementary (and thereby each able to hybridise) to a different
site such as a locus of interest, within the target nucleic acid
molecule and immobilised to the solid phase such that each group is
spatially distinct from the other groups. In one such embodiment
the spatially addressable elements are coincident with an
electrical semi-conductor or conductor layer.
[0073] The present invention also provides a multiplexed array
comprising a plurality of molecular arrays produced by the above
methods of the invention. Methods for producing such multiplexed
arrays are also provided. The multiplexed arrays may be used in
multiplexed analysis. The multiplexing can be of arrays in which
molecules are spatially addressable or random.
[0074] Typically, in any of the above embodiments, the solid phase
is a substantially flat solid substrate or a bead/particle/bar.
"Solid phase", as used herein, refers to any material which is
isolatable from solutions and thus includes porous materials, gels
and gel-covered materials. In one embodiment the solid-phase
comprises microscopic particles which are placed on a planar solid
surface and where preferably the microscopic particles are metallic
or semiconductor particles.
[0075] In a particular embodiment, the solid phase comprises
channels or capillaries within which the molecules are immobilised.
Moreover, the molecular array can be formed on or in an optical
fibre.
[0076] The molecular array can comprise nucleic acids which form
secondary structures, said secondary structures facilitating or
stabilising hybridisation or improving mismatch discrimination.
[0077] The array can be an array of anti-tags by which tags
labeling a sample repertoire can be decoded.
[0078] The present invention also provides a molecular array
obtained by the above methods.
[0079] Spreads: Spatially Random Arrays
[0080] Spatially random arrays can be created by a method in which
the sample is placed between two flat surfaces, optionally the
surface is chemically derivatised and optionally the sample is
exposed to electromagnetic (UV) irradiation, one surface is removed
from the other by a lateral motion, optionally unadsorbed material
is removed from the surfaces and optionally the surface undergoes
further UV crosslinking. Random arrays are thereby created on both
of the two flat surfaces.
[0081] The repertoire is preferably a repertoire of probes, for
example a sample repertoire. Advantageously, the repertoire
comprises nucleic acids, proteins and/or protein-nucleic acid
hybrids.
[0082] Secondary Array
[0083] Secondary arrays can be created on a primary array. For
example, if the primary array is an arrayed repertoire of probes,
they can be used to capture a repertoire of sample molecules.
[0084] Spreads of Linearised Polymers
[0085] Spatially random arrays of linearised polymers can be created
by a method in which the sample is placed between two flat surfaces,
optionally the surface is chemically derivatised, one surface is
removed from the other by a lateral motion, and optionally excess
material is removed from the surfaces. Random arrays of linearised
polymer are thereby created on both of the two flat surfaces. This
method produces very good distributions of molecules where
typically it is difficult to produce homogeneous molecular combing.
The molecules of a secondary array can be straightened/linearised
in this way.
[0086] 2. Use of Arrays
[0087] The invention provides the use of a molecular array, for
example as described herein, to perform single molecule analysis.
In one embodiment said analysis can form part of a molecular
assay.
[0088] The present invention further provides means to analyse the
array of single molecules, wherein a physical, chemical or other
property may be determined. For example, molecules which fluoresce
at a certain tested wavelength can be directly sampled. This is
particularly applicable where the repertoire is a repertoire
created by in vitro evolution or SELEX experiments. The invention
also provides techniques for measuring the physical properties of
the molecules comprising the array or their interaction with
various types of probes.
[0089] The present invention further provides a number of
techniques for detecting interactions between sample molecules and
the constituent molecules of molecular arrays.
[0090] Accordingly, the present invention provides the use of a
molecular array, for example as described herein, in a method of
identifying one or more array molecules which interact with a
target.
[0091] The molecular array may also be used more generally in
identifying compounds which interact with one or more molecules in
the array. In this case the preferred targets would be small
molecules, RNA molecules, proteins or genomic DNA.
[0092] Typically said methods comprise contacting the array with
the sample and interrogating one or more individual immobilised
molecules to determine whether a target molecule has bound.
[0093] Preferably interrogation is by an optical method such as a
method selected from far-field optical methods, near-field optical
methods, epi-fluorescence imaging, scanning confocal microscopy,
two-photon microscopy, and total internal reflection microscopy.
Other methods of microscopy, such as scanning probe microscopy and
electron microscopy are also appropriate.
[0094] In one embodiment, the immobilised molecules are of the same
chemical class as the target molecules. In another embodiment, the
immobilised molecules are of a different chemical class to the
target molecules.
[0095] In a preferred aspect, target molecules are genomic DNA or
cDNA or mRNA. Accordingly, the molecular array may be used, for
example in gene expression studies or the detection of single
nucleotide polymorphisms (SNPs) in a sample of nucleic acids.
[0096] Thus in one preferred embodiment the immobilised molecules
of the array and the target molecules are nucleic acids and the
contacting step takes place under conditions which allow
hybridisation of the immobilised molecules to the target molecules
or the contacting step takes place under conditions which allow
annealing and template (target) directed enzymatic processing of
the immobilised molecules.
[0097] Sample nucleic acids can be fragmented prior to analysis.
Large and/or complex samples, such as genomic samples, can be
sorted prior to analysis, e.g. according to chromosome by, for
example, flow cytometry.
[0098] Sample complexity can be reduced by fragmenting the target
and pre-hybridising it to C.sub.0t=1 DNA. The sample DNA then
undergoes whole genome amplification prior to analysis.
[0099] The single molecule methods allow the use of small samples
and the detection of very small quantities of analyte in said
samples--as low as a single molecule.
[0100] Particular applications of molecular arrays according to the
invention, and of single molecule detection techniques in
general, are set forth herein. Particularly preferred uses include
nucleic acid analysis, such as in SNP typing, sequencing and the
like, in genetic and genomic analysis as well as uses for
proteomics. These uses may be carried out in a large-scale format
or in a compact biosensor device.
[0101] 3. Specific Applications
[0102] The repertoires and arrays of the invention can be used to
execute a number of different applications. These include SNP
typing, gene expression analysis, sequencing and protein expression
and characterisation.
[0103] SNP Typing
[0104] In a further aspect, the invention relates to a method for
typing single nucleotide polymorphisms (SNPs) and mutations in
nucleic acids, comprising the steps of:
[0105] a) providing a repertoire of probes complementary to one or
more nucleic acids present in a sample, which nucleic acids may
possess one or more polymorphisms, said repertoire being presented
such that molecules may be individually resolved;
[0106] b) exposing the sample to the repertoire and allowing
nucleic acids present in the sample to anneal to the probes at a
desired stringency, and optionally further processing;
[0107] c) detecting binding events or the result of processing.
[0108] The detection of binding events may be aided by eluting the
non-annealed/unprocessed nucleic acids from the repertoire and
detecting individual hybridised/processed nucleic acid molecules.
The processing includes enzyme reactions such as primer extension,
single base extension, ligation, padlock probe ligation and rolling
circle amplification.
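Because each binding or processing event is detected on an
individually resolved molecule, the detection step lends itself to
digital counting. The sketch below calls a biallelic genotype from the
number of single molecule events scored for each allele. The minimum
event count and the heterozygote band are hypothetical thresholds
chosen for the example; they are not values given in this
specification.

```python
from collections import Counter


def call_genotype(events, allele_a, allele_b, min_events=20, het_band=(0.3, 0.7)):
    """Call a genotype from per-molecule allele signals at one SNP.

    events: iterable of allele labels, one per detected molecule.
    min_events and het_band are illustrative thresholds (assumptions).
    """
    counts = Counter(events)
    a, b = counts[allele_a], counts[allele_b]
    total = a + b
    if total < min_events:
        return "no call"  # too few molecules counted to decide
    fraction_a = a / total
    low, high = het_band
    if fraction_a > high:
        return allele_a + allele_a
    if fraction_a < low:
        return allele_b + allele_b
    return allele_a + allele_b


if __name__ == "__main__":
    print(call_genotype(["A"] * 48 + ["G"] * 52, "A", "G"))
    print(call_genotype(["A"] * 95 + ["G"] * 5, "A", "G"))
```

The same counts give an allele frequency directly when the sample is a
pool of DNA from many individuals.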
[0109] In one embodiment, sequence is extended from a primer.
Extension may be of one base or a few bases (to characterise
insertions/deletions, indels).
[0110] In a preferred embodiment the repertoire of probes targets
SNPs that "tag" the haplotypes of a given region of linkage
disequilibrium, leaving out SNPs that provide redundant
information.
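One common way of choosing such tag SNPs, shown here only as an
illustrative sketch and not as the method of the invention, is a
greedy selection on pairwise linkage disequilibrium: a SNP is treated
as redundant if it is in strong LD (r^2 at or above a threshold) with
a SNP already chosen. The r^2 threshold of 0.8, the SNP names and the
r^2 values are assumptions for the example.

```python
def select_tag_snps(snps, r2, threshold=0.8):
    """Greedy tag SNP selection.

    snps: list of SNP identifiers.
    r2: dict mapping frozenset({snp1, snp2}) -> pairwise r^2 value;
        missing pairs are treated as unlinked.
    Returns a list of tag SNPs such that every SNP is either a tag or
    in LD (r^2 >= threshold) with at least one tag.
    """
    def linked(a, b):
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP that accounts for the most still-untagged SNPs.
        best = max(snps, key=lambda s: sum(linked(s, u) for u in untagged))
        tags.append(best)
        untagged -= {u for u in untagged if linked(best, u)}
    return tags


if __name__ == "__main__":
    snps = ["s1", "s2", "s3", "s4"]
    r2 = {
        frozenset(("s1", "s2")): 0.95,
        frozenset(("s2", "s3")): 0.90,
        frozenset(("s3", "s4")): 0.20,
    }
    print(select_tag_snps(snps, r2))
```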
[0111] Advantageously, the repertoire is presented as an array,
which is preferably an array as described hereinbefore.
[0112] Haplotyping
[0113] The invention is moreover applicable to haplotyping, in
which a multiallelic probe set is used to analyse each sample
molecule in a population for two or more features simultaneously.
For example, a first probe may be used to immobilise the sample
nucleic acid to the solid phase, and optionally simultaneously to
identify one polymorphism or mutation; and a second probe may be
used to interact with the immobilised sample nucleic acid and
detect a second polymorphism or mutation. Thus, the first probe (or
biallelic probe set) is arrayed on the solid phase, and the second
probe (or biallelic probe set) is provided in solution (or is also
arrayed; see below). Further probes may be used to test further SNP
sites along the DNA polymer as required. Thus, the method of the
invention may comprise a further step of hybridising the sample
nucleic acid with one or more further probes in solution.
[0114] The signals generated by the first and second probe sets may
be differentiated, for example, by the use of differentiable signal
molecules such as fluorophores emitting at different wavelengths,
as described in more detail below. Moreover, the signals may be
differentiable based on their location on the solid phase. To aid
detection of the location of signal along the molecule, molecules
may be stretched out by methods known in the art.
[0115] Within a probe set, the signals generated by two or more
allelic probes may be differentiated, for example, by the use of
differentiable signal molecules such as fluorophores emitting at
different wavelengths, as described in more detail below. Moreover,
the signals may be differentiable based on their location on the
solid phase.
[0116] In a further preferred embodiment, the probes are
oligonucleotides or polynucleotides and the molecules are provided
as groups of molecules, each group of molecules complementary (and
thereby each able to hybridise) to a different site such as a locus
of interest, or a different variant such as SNP allele, within a
target nucleic acid molecule and immobilised to the solid phase
such that each group is spatially distinct from the other groups.
In one such embodiment the spatially addressable elements are
coincident with an electrical semi-conductor or conductor
layer.
[0117] The invention also provides a method for haplotyping which
involves the detection of the identity of SNP alleles along a single
DNA polymer by binding to probes whose identity is linked to their
spatial location on a surface. The spatial location of signal
provides the read-out of the technique. This approach is
particularly advantageous as it enables in situ synthesis of probes
and does not require separate oligonucleotides to be synthesised.
[0118] In one embodiment the spatial coordinates occupied by the
single DNA polymer are detected by fluorescence staining. In another
embodiment the spatial coordinates occupied are detected by testing
electrical continuity between electrodes carrying each of the
allele combinations: the polymer, by spanning a pair of electrodes
bearing probes testing contiguous SNP sites on the sample molecules,
forms a circuit across them.
[0119] The invention also provides a method for creating electrodes
by DNA-directed coating of the microarray spots with a conducting
material, and a method of creating a nanowire by directed DNA
spanning of two or more spatially addressable probes.
[0120] Analysing Sequence Repeats
[0121] In a further embodiment, the invention provides a method for
determining the number of sequence repeats in a sample nucleic
acid, comprising the steps of:
[0122] a) providing one or more probes complementary to one or more
nucleic acids present in a sample, which nucleic acids may possess
one or more sequence repeats, said probes being presented such that
molecules may be individually resolved;
[0123] b) annealing a sample of nucleic acid comprising the repeats
to the probes;
[0124] c) contacting the nucleic acids with labelled probes
complementary to said sequence repeats optionally in the presence
of DNA ligase, or alternatively contacting with nucleotides (a
mixture of labelled and nonlabelled at a ratio that would enable
only one labelled incorporation per repeat) and a polymerase;
and
[0125] d) determining the number of repeats present on each sample
nucleic acid by individual assessment of the number of labels
incorporated into each molecule, such as by measuring the
brightness of the signal produced by the labels; wherein in a
preferred embodiment signal is only processed from molecules to
which a second solution oligonucleotide labelled with a different
label is also incorporated.
[0126] The results may be analysed in terms of intensity ratios of
the repeat probes labelled with a first colour and the second probe
labelled with a second colour.
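The intensity-ratio analysis can be sketched as follows. This is an
illustrative sketch only: the function name estimate_repeat_count and
the parameter per_label_ratio are hypothetical, and the sketch assumes
that brightness scales linearly with the number of labels and that the
second-colour probe carries a single label.

```python
def estimate_repeat_count(repeat_signal, reference_signal, per_label_ratio=1.0):
    """Estimate the number of repeats on one molecule from two colours.

    repeat_signal:    summed intensity of the first-colour repeat labels
    reference_signal: intensity of the single second-colour probe
    per_label_ratio:  assumed brightness of one first-colour label relative
                      to the second-colour label (a calibration assumption)
    """
    if reference_signal <= 0:
        # No second-colour probe detected: the molecule is not processed.
        return None
    return round(repeat_signal / (reference_signal * per_label_ratio))


# Hypothetical intensities in arbitrary units.
print(estimate_repeat_count(1230.0, 100.0))
```

Molecules lacking the second label return None and are skipped,
mirroring the preferred embodiment in which signal is only processed
from molecules carrying both labels.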
[0127] Advantageously, the repertoire is presented as an array,
which is preferably an array as described hereinbefore.
[0128] mRNA Analysis
[0129] The invention moreover provides a method for analysing the
expression of one or more genes in a sample, comprising the steps
of:
[0130] a) providing a repertoire of probes complementary to one or
more nucleic acids present in a sample, said repertoire being
presented such that molecules may be individually resolved;
[0131] b) hybridising a sample comprising said nucleic acids to the
probes;
[0132] c) determining the nature and quantity of each individual
nucleic acid species present in the sample by counting single
molecules which are hybridised to the probes.
[0133] In some cases the individual molecule may be further probed
by sequences that would differentiate alternative transcripts or
different members of a gene family.
[0134] Advantageously, the repertoire is presented as an array,
which is preferably an array as described herein.
[0135] Preferably, the probe repertoire comprises a plurality of
probes of each given specificity, thus permitting capture of more
than one of each species of nucleic acid molecule in the sample.
This enables accurate quantitation of expression levels by single
molecule counting.
[0136] Advantageously one or more mRNA populations, each population
differently labelled so that its molecules can be distinguished
from molecules from another population, are interacted
simultaneously with the repertoire of probes. This enables easy
side-by-side comparisons of the differential expression of genes
between the different populations analysed.
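A minimal sketch of such a side-by-side comparison by single molecule
counting is given below. The gene names, the label names and the
function names are hypothetical; the sketch assumes each resolved
molecule has already been assigned to a probe (gene) and a label.

```python
from collections import Counter


def count_molecules(observations):
    """Count resolved molecules per (gene, label) pair."""
    return Counter(observations)


def expression_ratios(observations, label_a, label_b):
    """Ratio of molecule counts (label_a / label_b) for every gene seen."""
    counts = count_molecules(observations)
    genes = sorted({gene for gene, _ in counts})
    ratios = {}
    for gene in genes:
        a = counts[(gene, label_a)]
        b = counts[(gene, label_b)]
        # A ratio is undefined when no molecule of population b was counted.
        ratios[gene] = a / b if b else None
    return ratios


# Hypothetical data: one tuple per individually resolved molecule.
observations = (
    [("geneA", "Cy3")] * 3
    + [("geneA", "Cy5")] * 6
    + [("geneB", "Cy3")] * 2
)
print(expression_ratios(observations, "Cy3", "Cy5"))
```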
[0137] Advantageously the probes are designed to hybridise to
specific positions on an mRNA molecule selected from the following:
polyadenylation signal (e.g. AAUAAA), poly A tail, 5' cap, or a
sequence clamped to the 5' or 3' end of the molecules of the mRNA
population.
[0138] In an alternative embodiment a sample mRNA population is
spatially randomly arrayed and the identity of the sequence is
determined by the hybridisation of decoding probes to reveal the
identity of the mRNA. Hence gene expression analysis can be
conducted by compiling the quantity of molecules of each individual
identity present on the surface.
[0139] Sequencing
[0140] In a still further aspect, the invention relates to a method
for determining the sequence of one or more target DNA molecules.
Such a method is applicable, for example, in a method for
fingerprinting a nucleic acid sample. Moreover the method may be
applied to complete or partial sequence determination of a nucleic
acid molecule or population of molecules.
[0141] Sequencing on Linearised Molecules
[0142] Genomic sequence would have much greater utility if
haplotype information (the association of alleles along a single
DNA molecule derived from a single parental chromosome) could be
obtained over a long range. This is possible by following
sequencing on a single molecule, and more preferably where the
single molecule is linearised on a surface so that the multiple sites
from which sequence information is obtained are resolved. Here each
template molecule is straightened to provide a linear display of
sequence along its length, allowing multiple foci along its
length to be resolved.
[0143] Capture and Sequence
[0144] Thus, the invention provides a method for determining the
complete or partial sequence of a target nucleic acid, comprising
the steps of:
[0145] a) providing a repertoire of probes complementary to one or
more nucleic acids present in a sample, said first repertoire being
presented such that molecules may be individually resolved;
[0146] b) hybridising a sample comprising a target nucleic acid to
the probes;
[0147] c) hybridising one or more further probes of defined
sequence to the target nucleic acid;
[0148] d) detecting the binding of individual further probes to the
target nucleic acid;
[0149] e) optionally repeating steps c) and d); and
[0150] f) reconstructing the sequence of the target
polynucleotide.
[0151] Advantageously, the further probes are labelled with labels
which are differentiable, such as different fluorophores.
[0152] Advantageously, the repertoire is presented as an array,
which is preferably an array as described hereinbefore.
[0153] General sequencing can be conducted by providing a complete
repertoire of probes of a given length. More directed sequencing
can be conducted by providing a complete repertoire of probes
covering for example a repertoire of SNPs.
[0154] Direct Immobilization of Target
[0155] The present invention also provides methods for determining
all or part of the sequence of a target nucleic acid molecule which
does not require immobilised arrays of probe molecules for
capturing the target. Instead, the target molecule is immobilized
to a solid phase, preferably being horizontalised and straightened.
Then probes are used to interrogate the immobilised target. The
immobilised target may be a repertoire of oligonucleotides. Each
oligonucleotide molecule within the repertoire is then sequenced by
hybridisation of a repertoire of shorter oligos. This sequenced and
now spatially addressable immobilised repertoire can then be used
for further array experiments e.g. SbH, gene expression analysis or
as primers for Sequencing by synthesis.
[0156] The further probes may act as primers for a variety of other
template directed enzymatic reactions for example, the synthesis of
a complementary DNA strand by the use of DNA polymerase and the
provision of nucleotides. This is compatible with further sequence
characterization by providing fluorescently tagged nucleotides
whose incorporations are monitored in a way that enables the
identity of each nucleotide to be determined.
[0157] In an advantageous embodiment, target nucleic acids are
captured and/or immobilised on the solid phase surface at multiple
points, which allows the molecule to be arranged horizontally on
the surface and optionally sites on the target where immobilisation
reaction occurs are in such locations that the target molecule is
elongated. In a further embodiment the molecule is attached by a
single point and physical measures are taken to horizontalise it.
Hybridisation of further probes may then be determined according to
position as well as or instead of according to differences in
label.
[0158] In particular, the probes may be encoded i.e. labelled with
tags whose identity can be readily determined, such as by using
single molecule detection techniques. Detection is generally used
to determine the position of the tagged probes with respect to the
ends of the target molecules or other landmarks. The use of
multiple probes then allows a sequence to be built up. When
multiple copies of each target species are present, the overlapping
sequence information that is obtained can be used to build up the
sequence by `Sequencing by Hybridisation` methods known in the
art.
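As an illustration of how overlapping probe information can be
assembled, the sketch below rebuilds a sequence from the set of k-mer
probes found to hybridise. It is a simplified sketch and not a full
Sequencing by Hybridisation algorithm: the function name is
hypothetical, and it assumes an error-free probe set in which every
(k-1)-base overlap occurs only once in the target.

```python
def reconstruct_from_kmers(kmers):
    """Rebuild a sequence by chaining k-mers that overlap by k-1 bases."""
    kmers = set(kmers)
    k = len(next(iter(kmers)))
    if k < 2 or any(len(kmer) != k for kmer in kmers):
        raise ValueError("all probes must share one length of at least 2")

    by_prefix = {}
    for kmer in kmers:
        by_prefix.setdefault(kmer[:-1], []).append(kmer)
    suffixes = {kmer[1:] for kmer in kmers}

    # The first probe is the one whose prefix is no other probe's suffix.
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting probe")

    sequence = starts[0]
    used = {starts[0]}
    while True:
        candidates = by_prefix.get(sequence[-(k - 1):], [])
        if not candidates:
            break
        if len(candidates) > 1 or candidates[0] in used:
            raise ValueError("probe set does not define a unique sequence")
        used.add(candidates[0])
        sequence += candidates[0][-1]

    if len(used) != len(kmers):
        raise ValueError("some probes could not be placed")
    return sequence


probes = ["GGCG", "ATGG", "CGTA", "TGGC", "GTAC", "GCGT", "TACC"]
print(reconstruct_from_kmers(probes))
```

Repeated overlaps raise an error rather than being guessed at; in the
methods described here, the measured position of each probe along the
straightened molecule could be used to resolve such ambiguities.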
[0159] Accordingly, the present invention provides a method for
determining the sequence of all or part of a target nucleic acid
molecule which method comprises:
[0160] (i) immobilising the target molecule to a solid phase at one
or more points such that the molecule is substantially horizontal
with respect to the surface of the solid phase;
[0161] (ii) straightening the target molecule during or after
immobilisation;
[0162] (iii) contacting the target molecule with a nucleic acid
probe of known sequence; and
[0163] (iv) determining the position within the target molecule to
which the probe hybridises.
[0164] (v) repeating steps (iii) to (iv) as necessary, and
[0165] (vi) reconstructing the sequence of the target molecule.
[0166] The target may be immobilized at one point but linearised by
fluid flow.
[0167] Preferably the target molecule is contacted with a plurality
of probes.
[0168] In one embodiment the target molecule is contacted with all of
the plurality of probes substantially simultaneously. Preferably
each probe is encoded, for example labelled with a different
detectable label or tag.
[0169] Alternatively, the target molecule may be contacted
sequentially with each of the plurality of probes.
[0170] Accordingly, the invention provides a method in which each of
the plurality of labeled probes is successively hybridized to the
immobilized nucleic acid, and a record of those that hybridise to
each molecule is used to identify or re-assemble the sequence of the
immobilized molecule.
[0171] In one embodiment the complete set of oligonucleotides of a
given length are provided as probes.
[0172] In one embodiment different sub-sets of probes are provided
in different experiments but the probes within each sub-set are
differentially labeled. In one such method the sub-sets are grouped
according to their Tm.
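Grouping probe sub-sets by Tm can be sketched as follows. The source
does not state how Tm is to be calculated; this sketch assumes the
Wallace rule for short oligonucleotides (2 degrees C per A or T,
4 degrees C per G or C), and the function names and the bin width are
hypothetical.

```python
from collections import defaultdict


def wallace_tm(probe):
    """Approximate Tm in degrees C by the Wallace rule (an assumption)."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    if at + gc != len(probe):
        raise ValueError("probe contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


def group_by_tm(probes, bin_width=4):
    """Group probes into sub-sets whose estimated Tm falls in the same bin."""
    groups = defaultdict(list)
    for probe in probes:
        bin_start = wallace_tm(probe) // bin_width * bin_width
        groups[bin_start].append(probe)
    return dict(groups)


print(group_by_tm(["ATATAT", "GCGCGC", "ATGCAT", "TTAACG"]))
```

Probes in the same sub-set could then be hybridised together under a
single stringency condition.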
[0173] In one embodiment each probe or its label/tag is removed
from the target molecule prior to contacting the target molecule
with a different probe. Typically, the probes are removed by
heating, modifying the salt concentration or pH, or by applying an
appropriately biased electrical field.
[0174] In one embodiment the target is substantially a double
stranded molecule and the probes are LNA or PNA and bind by strand
invasion under appropriate conditions. In another embodiment the
probes are Padlock Probes which bind to the target under
appropriate conditions and become fixed to the target by a ligation
reaction. In another embodiment RecA mediates the binding of the
probes to a substantially double stranded molecule.
[0175] In another embodiment the target is substantially single
stranded and is made accessible for subsequent hybridisation by
stretching out/straightening, which may be achieved by capillary
forces acting on the target in solution.
[0176] In one embodiment, where it is desired to determine the
sequence of single-stranded molecules, the target nucleic acid
molecule is a double-stranded molecule and is derived from such a
single-stranded nucleic acid molecule of interest by synthesising a
complementary strand to said single-stranded nucleic acid.
[0177] The present invention also provides a method for determining
the sequence of all or part of a target single-stranded nucleic
acid molecule which method comprises:
[0178] (i) contacting the target molecule with a plurality of
nucleic acid probes of known sequence, each probe being labelled
with a different detectable label;
[0179] (ii) ligating bound probes to form a complementary
strand;
[0180] (iii) where the probes are not bound in a contiguous manner,
optionally filling in any gaps between bound probes by
polymerisation primed by said bound probes; and
[0181] (iv) determining the position of labels along the
polymer.
[0182] In one embodiment the following steps are taken before step
(i) and in another embodiment the following steps are taken before
step (iv):
[0183] (i) immobilising the target molecule to a solid phase at one
or more points
[0184] (ii) straightening the target molecule during or after
immobilisation.
[0185] Preferably the probe of each variety is differentially
labeled.
[0186] In one embodiment the complete set of oligonucleotides of a
given length are provided as probes.
[0187] In one embodiment different sub-sets of probes are provided
in different experiments but the probes within each sub-set are
differentially labeled.
[0188] To aid detection of the location of signal along the
molecule, molecules may be stretched out by methods known in the
art.
[0189] 5. Genomics
[0190] The invention provides a method for characterizing the
physical properties or interactions of polynucleotides on a surface,
particularly polynucleotides which are linearised on a surface.
[0191] Properties which can be determined include the chemical
status, such as the state of methylation or state of depurination;
and intermolecular interactions, such as the interaction of DNA
regulatory regions with transcription factors.
[0192] In a preferred aspect, the invention provides a method where
the nucleic acid sample is composed of DNA fibres, isolated from a
cell, the method comprising substantially retaining the binding of
proteins of interest and characterizing the proteins that are
bound and their position on molecules which have been identified and
along whose length landmarks have been detected.
[0193] 6. Proteomics
[0194] The invention is applicable to proteomics, including the
measurement of the quantities of protein species present in a
sample, characterization of their properties and of the ability to
interact with various partners, including small molecules, other
proteins, carbohydrates, lipids and nucleic acids, or to catalyse
various reactions.
[0195] The invention is particularly applicable to the analysis of
the properties of protein variants created by DNA shuffling. The
array may be an array as described above. In one embodiment the
array is an array of nucleic acids to which a protein is
linked.
[0196] In another embodiment the array is an ordered array in which
each different protein is present in a different element of the
array. In a further embodiment the array is a random array. In a
further embodiment the array is composed of molecules isolated from
a particular target organism, tissue or cell. The sample is
interrogated using the following steps:
[0197] (i) producing a molecular array by a method comprising
immobilising to a solid phase a plurality of molecules present in a
sample, wherein the plurality of molecules are immobilised at a
density such that individual molecules in the sample can be
individually resolved; and
[0198] (ii) identifying and/or characterising one or more molecules
immobilised to the array.
[0199] Preferably, the method comprises challenging the array with
sample molecule(s) of interest and detecting the interaction, where
each interaction is individually resolvable. The molecules of
interest are advantageously proteins, small molecules, RNA or DNA.
[0200] Preferably, the method comprises contacting or approaching the
array with one or more probes of interest, where each interaction is
individually resolvable. Preferably, the probe is an AFM tip and the
AFM tip is coated with a molecule or material of interest. The AFM
tip can be electrically biased. Advantageously, following approach or
contact by the probe, the forces acting upon the probe are measured.
Such forces are, for example, electrostatic forces.
[0201] Preferably, the method comprises stimulating the sample with a
physical agent, where each interaction is individually resolvable.
Advantageously the physical agent may be electromagnetic radiation,
an electron source, electrical stimulation, electrochemical
stimulation etc. Following stimulation with the physical agent, a
Raman signal can be detected. Preferably, the sample is placed on
metallic surfaces, preferably colloidal metal particles, and a
surface enhanced Raman signal is detected.
[0202] Advantageously, the plurality of molecules is a polypeptide
repertoire or the proteome.
[0203] One or more of said immobilised molecules can be
interrogated by an optical method. Preferably, the optical method
is selected from far-field optical methods, near-field optical
methods, epi-fluorescence spectroscopy, scanning confocal
microscopy, two-photon microscopy and total internal reflection
microscopy. One or more of said immobilised molecules can be
interrogated by scanning probe microscopy or electron microscopy.
Preferably, a physicochemical property of the immobilised molecules
is determined, such as shape, size or mass, charge, hydrophobicity.
Alternatively, or additionally, an electromagnetic, electrical,
optoelectronic and/or electrochemical property of the immobilised
molecules is determined.
[0204] In a further embodiment, a characteristic of a complex
between an immobilised molecule and a probe is determined.
Preferably, the characteristics of individual immobilised molecules
are learnt using a computational method. The computational method
can be a neural network or artificial intelligence method such as
fuzzy logic.
[0205] The invention further provides an array wherein the
characteristics of a plurality of immobilised molecules and their
corresponding physical location in the array have been determined.
Such an array can be used in a method of identifying candidate
molecules or distinguishing them from non-candidate molecules.
[0206] 7. Sample Pooling
[0207] The present invention is particularly applicable to pooling
strategies, such as DNA pooling in SNP typing. Pooling strategies
involve mixing multiple samples together and analysing them
together to save costs and time. The present invention is also
applicable to detection of low frequency mutations in a wild type
background. The present invention is applicable for determining
haplotype frequencies in pooled DNA samples.
[0208] Labelling and Tagging Schemes for use in Single Molecule
Regimes
[0209] In one embodiment the labelling schemes involve labelling
with single fluorophores or a combination of single
fluorophores.
[0210] In another embodiment the labelling scheme involves
labelling with nanoparticles. Gold nanoparticles, which are
optically active and electronically active, can be made 1.4 nm
in diameter (Nanoprobes) and are available derivatised with
streptavidin and/or a number of fluorescent groups.
[0211] Tags can be linked to probes, such as oligonucleotide
probes, in a number of ways. Firstly, probes and tags can be
prepared separately and then manually linked together (not
combinatorially). Secondly they can be joined by combinatorial
chemistry by various means, for example, where both probe and tag
are co-synthesised. Split and mix synthesis is particularly
appropriate.
[0212] The present invention also provides a method for identifying
and/or characterising one or more molecules of a plurality of
molecules present in an array, comprising:
[0213] (i) producing a molecular array wherein the plurality of
molecules are immobilised at a density such that individual
molecules in the sample can be individually resolved; and
[0214] (ii) identifying and/or characterising one or more molecules
immobilised in the array by contacting them with encoded tagged
probes
[0215] Furthermore, the concept of using encoded probes to
characterise an array may be applied to random arrays comprising
immobilised molecules of interest from a sample.
[0216] The detectable feature may be present in the tag.
Alternatively, the detectable feature may not be present on the tag
but would be present on a partner molecule which would specifically
recognize the tag. The tag and its partner can be complementary
oligonucleotides or an antigen-antibody pair or a ligand-aptamer
pair. The advantage of such arrangements is that bulky detectable
moieties do not interfere with processing of the target molecule
and are only added once processing is completed.
[0217] Preferably each probe is encoded by virtue of being labelled
with a tag which indicates uniquely the identity of the probe, such
that an immobilised molecule can be identified uniquely by
detecting the probes bound to the molecule and determining the
identity of the corresponding tags. Consequently step (ii) may
comprise contacting the immobilised molecules with a plurality of
encoded probes.
[0218] In one or more of the above methods which use tagged probes
and relate to the identification of nucleic acids, one or more of
the tagged probes may be used to identify an individual nucleic
acid molecule. For example two distal tagged probes can be used
that define an area flanking one or more nucleotide sites of
interest such as SNPs. Repertoires of tagged probes may also be
used in methods of sequencing as described herein.
[0219] The tag repertoires may specifically be detectable by single
molecule detection regimes but may also be useful in assays not
requiring single molecule detection.
[0220] Advantageously, the plurality of probes is labeled with a
tag which indicates uniquely the identity of the probe.
[0221] Preferably, a method according to the invention is applied
in single molecule detection regimes in which the number of unique
tags required is reduced by using more than one tag for encoding
the probe. In a preferred embodiment a unique tag is provided for
each base, at each position along the sequence. Hence 24 tag
species will be sufficient to code for a complete library of
6-mers. In a second preferred embodiment a unique tag is provided
for each position, and its quantity or some measurable feature of it
is varied to encode each of the four bases. In a preferred method
according to the above, the tags are detectable by optical means.
Further, the invention provides a method wherein the tags are
particulate and comprise surface groups; a method wherein the tags
are particulate and encase detectable entities, such as particles or
molecules; and a method wherein tags can be detected and
distinguished by scanning probe microscopy.
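The arithmetic behind the first preferred embodiment (4 bases x 6
positions = 24 tag species, against 4^6 = 4096 probes in a complete
6-mer library) can be checked with the short sketch below. The
representation of a tag as a (position, base) pair and the function
names are illustrative assumptions, not part of the disclosure.

```python
from itertools import product

BASES = "ACGT"


def tag_species(length):
    """One tag species per base at each position: 4 * length in total."""
    return [(position, base) for position in range(length) for base in BASES]


def encode_probe(sequence):
    """Encode a probe as one (position, base) tag per position."""
    for base in sequence:
        if base not in BASES:
            raise ValueError("unexpected base: " + base)
    return [(position, base) for position, base in enumerate(sequence)]


def decode_tags(tags):
    """Recover the probe sequence from its tags, in any detection order."""
    return "".join(base for _, base in sorted(tags))


library = ["".join(bases) for bases in product(BASES, repeat=6)]
print(len(tag_species(6)), len(library))
```

Every probe in the library is encoded using only members of the 24
tag species, which is what reduces the number of unique tags required.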
[0222] The invention also provides a method for tagging whereby a
dendrimer is co-synthesised with the oligonucleotide sequence,
where each layer of the dendrimer encodes a different base which is
co-synthesized. The invention also provides tags comprising
nanoparticles carrying different surface or internal detectable
groups that can be quantitatively detected.
[0223] The invention also provides tags that are composed of a
string of beads, for example gold nanoparticles. The invention also
provides tags that are composed of polymers of various lengths, the
length of the polymer and optionally some other feature
distinguishing one tag from another. The tags and DNA may be
metallized. Analysis is by SPM or electron microscopy.
[0224] 8. Biosensor
[0225] The present invention further provides a biosensor or
chemosensor comprising a molecular array as defined above. The
present invention also provides an integrated sensor comprising a
molecular array as defined above, an excitation source and a
detector, such as a CCD, or alternatively an integrated biosensor
comprising a molecular array as defined above, a voltage source,
electrodes and electronic circuitry for detection. In addition, the
sensor optionally comprises means for any or all of the following:
hardware-based signal processing; software-based signal processing;
software-based processing of results; display of results;
transmission of results to a central
database on a remote computer. The present invention is
particularly applicable to biosensor applications where the amount
of sample material is small.
[0226] A biosensor according to the invention may be a biosensor
wherein the molecular array is formed on an optical fibre.
[0227] Moreover, the biosensor can comprise a plurality of
elements, each element containing distinct molecules, such as probe
sequences.
[0228] In a particular embodiment, the invention provides a
biosensor for haplotyping in which:
[0229] (a) the immobilised single molecule is selectively coated
with a material that facilitates detection;
[0230] (b) the coating is a conducting material which allows a
circuit to form between only those electrodes which are
occupied by the target molecule by virtue of its binding to the
allelic probe present on the electrode;
[0231] (c) a potential difference is applied between electrodes in
any two contiguous groups of electrodes and the electrodes on which
probes interact with target are identified by virtue of the fact
that a current flows between them;
[0232] (d) the conducting material comprises silver, gold,
palladium or conjugated polymers; or
[0233] (e) where multiple single molecules span the electrodes, the
haplotype frequency is given by the amount of current that flows
between the electrodes.
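The electrical read-out described in (c) and (e) can be sketched as
follows. This is a hedged illustration only: the data layout (a
mapping from a pair of allele electrodes in two contiguous groups to
a measured current), the threshold and the function names are
hypothetical, and the sketch assumes that current is proportional to
the number of molecules bridging a pair of electrodes.

```python
def call_bridged_pairs(currents, threshold):
    """Electrode pairs between which a current above threshold flows."""
    return [pair for pair, current in sorted(currents.items())
            if current > threshold]


def haplotype_frequencies(currents):
    """Share of total current carried by each allele combination."""
    total = sum(currents.values())
    if total <= 0:
        return {}
    return {pair: current / total for pair, current in currents.items()}


# Hypothetical currents (arbitrary units) between the allele electrodes
# of a first SNP site (A or G) and a contiguous second site (C or T).
currents = {
    ("A", "C"): 6.0,
    ("A", "T"): 0.0,
    ("G", "C"): 0.0,
    ("G", "T"): 2.0,
}
print(call_bridged_pairs(currents, threshold=1.0))
print(haplotype_frequencies(currents))
```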
[0234] Advantageously, each element of the biosensor is specific
for the detection of a different target, such as different
pathogenic organisms. Preferably, molecules within each microarray
spot are monitored.
DESCRIPTION OF THE FIGURES
[0235] FIG. 1
[0236] Hybridisation of a single DNA Molecule to an array of pads
containing allele specific probes. Labelling with a dye such as
Sybr Gold enables the path of the polymer to be assessed and
compared with the known position of the array pads. A. Illustrates
the binding pattern of a first single molecule of a particular
haplotype. B. Illustrates the binding pattern of a second single
molecule of a different haplotype. C. Illustrates the capture of a
single molecule by hybridization to capture probes situated on the
pads. D. Illustrates signal obtained from pads where hybridisation
occurs. E. Determination of alleles along a haplotype by a
spatially addressable array. The signal is detected by measuring
conductivity between adjacent electrodes, pairwise. If electrical
continuity is detected it indicates that a DNA molecule bridges
between the two tested electrodes. The DNA molecule acts as a
nucleation point for a metallization process. There is little
non-specific metal aggregation.
[0237] FIG. 2.
[0238] Illustrates interaction of differentially tagged probes with
a DNA molecule.
[0239] FIG. 3.
[0240] Illustrates the ligation of oligonucleotides bearing
differentially tagged probes, templated by the DNA strand.
[0241] FIG. 4.
[0242] Illustrates the binding of differentially tagged probes at
distal sites from one another. Each probe then acts as a primer in
a polymerisation reaction, for example using Klenow Fragment (NEB)
or Taq Polymerase. The polymerisation from a first primer continues
until the phosphorylated 5' end of a second oligonucleotide is
reached. At this point a DNA ligase such as E. coli DNA ligase
(NEB) or Tth DNA ligase (Abgene) may ligate the extended strand to
the second oligonucleotide.
[0243] FIG. 5.
[0244] Illustrates the binding of a nucleotide/oligonucleotide
which is adapted with a tag. This tag is then specifically bound by
a second partner molecule. The partners may be complementary
oligonucleotides, antibody-antigen, streptavidin-biotin or any
ligand-receptor interaction. The partners uniquely identify the
probe in the context of the reaction. This is useful for example
when addition of a bulky detectable label is to be avoided during
the course of a reaction but can be added once the reaction has
taken place.
[0245] FIG. 6.
[0246] A biosensor device is illustrated. A. A molecular beacon is
shown to emit fluorescence after binding to a target molecule. This
is situated on a surface structure/composition in which a waveguide
is created in order to excite the dye on the beacon. Below the
transparent waveguide layer is a filter and a CCD detector which
detects the fluorescence emission from the opened up molecular
beacon. B. An alternative molecular structure is illustrated in
which a DNA intercalator acts as a FRET partner with a label on the
probe. The intercalating dye is attached via a linker of sufficient
length that the dye only comes into FRET range with its partner
when it has intercalated into a double stranded region created when
the positive outcome of the assay, stable hybridisation under the
defined conditions, occurs. C. It is shown that each molecule within
each element of the array is individually resolvable.
[0247] FIG. 7
[0248] Haplotyping on an array. The first allele is defined by the
array position. The second allele is defined by the label. Each
consecutive set of array spots analyses consecutive SNPs in a
haplotype. The signal may be detected as a point source of
fluorescence.
[0249] FIG. 8.
[0250] A. and B. Microarray scanner images of single molecule
dilution series. Each DNA oligonucleotide is labelled with a single
dye molecule. A and B are different exposures. C. TIRF image of a
spot dilution where a few single molecules are resolvable. D. An
intensity profile of a few pixels covering a putative single
molecule shows a one step photobleaching which is indicative of a
single fluorescent dye molecule.
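As an illustration only (the source does not describe how the
intensity profile was analysed), a one step drop of this kind can be
looked for by splitting a trace at every frame and keeping the split
with the largest fall in mean intensity. The function name, the
example trace and the min_drop threshold are hypothetical, and the
sketch does not by itself rule out traces containing several steps.

```python
from statistics import mean

def find_single_step(trace, min_drop):
    """Return (index, drop) of the best single downward step, or None.

    The trace is split at every possible frame; the split with the
    largest difference between the mean before and the mean after
    is kept and compared with the min_drop threshold.
    """
    best = None
    for i in range(1, len(trace)):
        drop = mean(trace[:i]) - mean(trace[i:])
        if best is None or drop > best[1]:
            best = (i, drop)
    if best is None or best[1] < min_drop:
        return None
    return best

# Hypothetical intensity profile: the dye emits for 5 frames, then bleaches.
trace = [100, 98, 103, 101, 99, 10, 12, 9, 11, 10]
step = find_single_step(trace, min_drop=50)
```

A flat trace, or one whose largest drop is below min_drop, returns
None and would not be scored as a single dye molecule in this sketch.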
[0251] FIG. 9.
[0252] Oligonucleotide target labelled with a 20 nm Fluosphere
nanoparticle (Molecular Probes) hybridised to complementary
molecules within a spot of a single molecule array. Individual
nanoparticles are easily detectable and distinguishable and therefore
can easily be counted. Imaging was with a 40.times. dry Olympus
(Japan) objective focused directly on the surface of the microscope
slide with no coverslip. The image was taken with a Roper Micromax
CCD camera. The binding is specific because no binding occurred to
other spots of the array.
[0253] FIG. 10
[0254] Mismatch Discrimination on Single Molecule Arrays. TIRF
microscopy on an Olympus Microscope. Images taken from a twofold
dilution series of mismatch and perfect match probes within
microarray spots that are hybridised with a complementary Cy3
labelled oligo.
[0255] It is clear that the density of hybridisation is lower in
the mismatch probe spot compared to the perfect match spot at
dilution 5.
[0256] The molecules are far enough apart to enable single
molecules to be easily counted at dilution 5 for the mismatch
whereas they only become far enough apart at dilutions 7/8 for the
perfect match. Gamma setting 0 to 2000 for all.
[0257] FIG. 11.
[0258] Single Molecule Counting. The microarray spot image is
digitised so that individual molecules can be assessed. The number
of molecules in dilution 5 is seen to be less than in dilution 4.
Objects which the software deems as individual molecules are
coloured so that they can easily be visualized. Non-single molecule
objects are in white.
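The source does not say what criterion the software uses to deem an
object an individual molecule. As a minimal sketch, assuming a simple
intensity threshold and a maximum pixel area for a single molecule
(both hypothetical parameters), the digitised image can be segmented
into connected bright objects and each object classified by its area:

```python
def count_objects(image, threshold, max_single_area):
    """Label 4-connected bright regions and split them by pixel area.

    Returns (singles, others): regions no larger than max_single_area
    are counted as putative single molecules, the rest as non-single
    molecule objects.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    singles, others = 0, 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Flood fill one connected region and measure its area.
            stack, area = [(r, c)], 0
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if area <= max_single_area:
                singles += 1
            else:
                others += 1
    return singles, others

# Hypothetical 5x6 digitised spot image (arbitrary intensity units).
image = [
    [0, 9, 0, 0, 0, 0],
    [0, 0, 0, 8, 8, 0],
    [0, 0, 0, 8, 8, 0],
    [7, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 0, 0],
]
singles, others = count_objects(image, threshold=5, max_single_area=2)
```

In this toy image the two isolated bright pixels are counted as
single molecules and the five-pixel cluster as one non-single
molecule object.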
[0259] FIG. 12.
[0260] Illustration of the capture-combing process within a
microarray spot. A. Capture probes on a surface (primary array). B.
Capture of sample from solution. C. Combing of captured nucleic
acids (secondary array).
[0261] FIG. 13
[0262] Self-assembly and horizontalization of genomic DNA on DNA
microarrays.
[0263] Illustration of the concept of sorting and displaying the
genome by sequence specific capture of different fragments of the
genome to different locations on an array surface.
[0264] FIG. 14
[0265] Images of the experimental results of capture combing
genomic DNA to a spatially addressable array at various
magnifications. Straightened, individually resolvable single DNA
polymers are clearly seen in the 100.times. magnification images (A and
C). The DNA polymers are seen only within the spot areas. There is
no combing in areas outside the spots, where the capture probes are
not present.
[0266] FIG. 15.
[0267] A. Concatemerised Lambda DNA of >200 kb length. B.
Concatemerised genomic DNA, probed at a recurring sequence in the
concatemers.
[0268] FIG. 16.
[0269] Spread of Human Female Genomic DNA on a Matsunami coverslip.
100.times. Olympus objective.
[0270] FIG. 17.
[0271] An array of electrodes (dark areas) separated by gaps.
Adjacent electrodes are spotted with oligonucleotides complementary
to different ends of a lambda DNA molecule. At the top right hand
corner a single Lambda molecule can be seen which is bridging the
two electrodes by binding to its complementary oligonucleotide.
Hybridisation was done in 4.times. SSC/0.1% Sarkosyl.
DETAILED DESCRIPTION OF THE INVENTION
[0272] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as commonly understood by one of
ordinary skill in the art (e.g., in cell culture, molecular
genetics, nucleic acid chemistry, hybridization techniques and
biochemistry). Standard techniques are used for molecular, genetic
and biochemical methods (see generally, Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2.sup.nd ed. (1989) Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et
al., Short Protocols in Molecular Biology (1999) 4.sup.th Ed, John
Wiley & Sons, Inc.--and the full version entitled Current
Protocols in Molecular Biology, which are incorporated herein by
reference) and chemical methods. See also Genomics, The Science and
Technology Behind the Human Genome Project [1999]; Charles Cantor
and Cassandra Smith (John Wiley and Sons) for genomics technology
and methods including sequencing by hybridisation. DNA Microarray:
A Practical Approach [1999] Ed: M. Schena, (Oxford University
Press) can be referred to for array methods.
[0273] The present invention possesses many advantages over
conventional bulk analysis of molecular arrays. One of the key
advantages is that, in accordance with the present invention,
specific PCR amplification of target molecules may be dispensed
with due to the sensitivity of single molecule analysis. Thus,
there is no requirement to amplify target nucleic acids, which is a
very cumbersome task when analysis is large scale or requires rapid
turnaround and which may introduce errors due to non-linear
amplification of target strands and the under-representation of
rare molecular species often encountered with PCR. It also adds
considerable expense.
[0274] Moreover, the methods of the invention may be multiplexed to
a very high degree. Samples may comprise pooled genomes of target
and control subject populations respectively, since allele
frequencies may be accurately determined by
single molecule counting. Since more than a single site on each
molecule may be probed, haplotype information is easily determined.
There is also the possibility of obtaining haplotype frequencies.
Such methods are particularly applicable in association studies,
where SNP frequencies are correlated with diseases in a population.
The expense of single SNP typing reactions can be prohibitive when
each study requires the performance of millions of individual
reactions; the present invention permits millions of individual
reactions to be performed and analysed on a single array
surface.
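As a hedged illustration of allele frequency estimation by single
molecule counting, the sketch below converts counts of molecules
scored for each of two alleles into a frequency with a binomial
standard error. The function name and the counts are hypothetical;
the source gives no numerical data for pooled samples.

```python
from math import sqrt

def allele_frequency(count_a, count_b):
    """Frequency of allele A and its binomial standard error."""
    n = count_a + count_b
    if n == 0:
        raise ValueError("no molecules counted")
    p = count_a / n
    return p, sqrt(p * (1 - p) / n)

# Hypothetical single molecule counts from target and control pools.
case_p, case_se = allele_frequency(count_a=620, count_b=380)
control_p, control_se = allele_frequency(count_a=500, count_b=500)
```

Because the standard error falls with the square root of the number
of molecules counted, counting more molecules per array element
tightens the frequency estimate.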
[0275] A. Methods of Manufacturing Low Density Arrays.
[0276] The present invention is in one aspect concerned with the
production of molecular arrays wherein the individual molecules in
the array are at a sufficiently low density such that the
individual molecules can be individually resolved--i.e. when
visualised using the method of choice, each molecule can be
visualised separately from neighbouring molecules, regardless of
the identity of those neighbouring molecules. The required density
varies depending on the resolution of the visualisation method. As
a guide, molecules are preferably separated by a distance of
approximately at least 250, 500, 600, 700 or 800 nm in both
dimensions when the arrays are intended for use in relatively low
resolution optical detection systems (the diffraction limit for
visible light is about 300 to 500 nm). If nearest neighbour single
molecules are labelled with different fluors, or their
functionalization can be temporally resolved, then it may be
possible to obtain higher resolution by deconvolution
algorithms/image processing. Alternatively, where higher resolution
detection systems are used, such as scanning near-field optical
microscopy, then separation distances of 50 nm or more may be used.
As detection techniques improve, it may be possible to reduce
further the minimum distance. The use of non-optical methods, such
as AFM, allows the reduction of the feature-to-feature distance
effectively to zero.
[0277] Since, for example, during many immobilisation procedures or
density reduction procedures, the probability of all molecules
being at least the minimum distance apart required for resolution is low,
it is acceptable for a proportion of molecules to be closer than
that minimum distance. However, it is preferred that at least 50%,
more preferably at least 75, 90 or 95% are at the minimum
separation distance required for individual resolution.
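The relationship between surface density and the proportion of
molecules that meet the minimum separation can be sketched under one
explicit assumption that the source does not state: that molecules
land independently and uniformly at random on the surface (a
two-dimensional Poisson process). In that case the probability that a
molecule has no neighbour within a distance r is exp(-density * pi *
r^2). The function names and the 0.5 micron separation used in the
example are illustrative.

```python
from math import exp, log, pi

def fraction_resolved(density_per_um2, min_sep_um):
    """Fraction of molecules with no neighbour closer than min_sep_um,
    assuming random, independent placement (2-D Poisson process)."""
    return exp(-density_per_um2 * pi * min_sep_um ** 2)

def max_density(target_fraction, min_sep_um):
    """Highest density (molecules per square micron) at which at least
    target_fraction of molecules meet the minimum separation."""
    return -log(target_fraction) / (pi * min_sep_um ** 2)

# Densities corresponding to the preferred proportions in the text.
for target in (0.50, 0.75, 0.90, 0.95):
    rho = max_density(target, min_sep_um=0.5)
    print(f"{target:.0%} resolved -> at most {rho:.3f} molecules per um^2")
```

Real immobilisation procedures may deviate from random placement, so
these values are a guide rather than a prediction.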
[0278] Furthermore, the actual density of molecules in the array
may be higher than the maximum density allowed for individual
resolution since only a proportion of those molecules may be
detectable using the resolution method of choice. Thus where
resolution, for example, involves the use of labels, then provided
that individually labelled molecules can be resolved, the presence
of higher densities of unlabeled molecules is immaterial. The label
may be due to the sample molecules which may be low in number
compared to the probe molecules.
[0279] Molecules that may be immobilised to the array include
nucleic acids such as DNA and analogues and derivatives thereof,
such as PNA. Nucleic acids may be obtained from any source, for
example genomic DNA or cDNA or synthesised using known techniques
such as step-wise synthesis. Nucleic acids may be single or double
stranded. DNA nanostructures or other supramolecular structures may
also be immobilised. Other molecules include: compounds joined by
amide linkages such as peptides, oligopeptides, polypeptides,
proteins or complexes containing the same; defined chemical
entities, such as organic molecules; combinatorial libraries;
conjugated polymers and carbohydrates.
[0280] In several embodiments, the chemical identity of the
molecules must be known or encoded prior to manufacture of the
array by the methods of the present invention. For example, the
sequence of nucleic acids (or at least the sequence of the region
that is used to bind sample molecules) and the composition and
structure of other compounds should be known or encoded in such a
way that the sequence of molecules of interest can be determined
with reference to a look-up table. The term "spatially addressable",
as used herein, therefore signifies that the location of a molecule
specifies its identity (and in spatial combinatorial synthesis,
the identity is a consequence of location).
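A look-up table of this kind can be as simple as a mapping from array
position to molecular identity. The sketch below is illustrative
only: the coordinates and the probe sequences are hypothetical
placeholders, not sequences from the source.

```python
# Hypothetical layout: (row, column) of each array element -> probe sequence.
lookup = {
    (0, 0): "ACGTACGTACGTACGTA",
    (0, 1): "ACGTACGTACGTACGTC",
    (1, 0): "TTGCATTGCATTGCATG",
    (1, 1): "TTGCATTGCATTGCATT",
}

def identify(row, col):
    """Return the identity of the molecule at an array position."""
    try:
        return lookup[(row, col)]
    except KeyError:
        raise KeyError(f"no element recorded at ({row}, {col})") from None

probe = identify(0, 1)
```

An event detected at a given position is thereby attributed to a
known probe without any further analysis of the immobilised molecule.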
[0281] However, in alternative embodiments, arrays may be
manufactured using pluralities of unknown molecules from samples
and the arrays subsequently interrogated to characterise and
identify the immobilised molecules, particularly by using encoded
probes. The characteristics and location of individual immobilised
molecules may then be determined using encoded probes and the
results "learnt" for future use. Learning may be achieved using
computational methods such as neural networks or artificial
intelligence.
[0282] Molecules may be labelled to enable interrogation using
various methods. Suitable labels include: optically active dyes,
such as fluorescent dyes; nanoparticles such as fluorospheres and
quantum dots; and surface plasmon resonant particles (PRPs) or
resonance light scattering particles (RLSs)--particles of silver or
gold that scatter light (the size and shape of PRP/RLS particles
determines the wavelength of scattered light). (Schultz et al.,
2000, PNAS 97: 996-1001; Yguerabide, J. and Yguerabide E., 1998,
Anal Biochem 262: 137-156). Quantum dots or rods or nanobars can
also be used.
[0283] In the resulting arrays, it is preferred that molecules are
arranged in discrete elements. Generally, each element is adjacent
to another or at least 1 .mu.m apart and/or less than 10, 20, 50,
100 or 300 .mu.m apart. Each element is spatially addressable since
the identity of the molecules present in each element is known or
can be determined on the basis of a prior coding. Thus if an
element is interrogated to determine whether a given molecular
event has taken place, the identity of the immobilised molecule is
already known by virtue of its position in the array. Within each
element, only one molecule species may be present, in single or
multiple copies. Where present in multiple copies, it is preferred
that individual molecules are individually resolvable. In one
embodiment, elements in the array may comprise multiple species
that are individually resolvable. Typically, multiple species are
differentially labelled such that they can be individually
distinguished. By way of an example, an element may comprise a
number of different probes for detecting single nucleotide
polymorphism alleles, each probe having a different label such as
a different fluorescent dye.
[0284] In one embodiment, the array comprises a block of array
elements where probes specific for different SNP alleles are
grouped together, typically in separate but adjacent discrete
elements. Furthermore, groups of probes which detect different but
closely linked SNP loci may be arranged together in the block of
array elements. In this way, a block of elements may be used to
probe multiple loci in a single molecule simultaneously. The
distance between the probes for different loci will generally be
determined by the distance between the loci in the target nucleic
acid molecules. For example, if the SNP loci are 10 kb apart, then
each group of allelic probes may be spaced apart by about 3
microns. If the SNPs are about 1000 bp apart then each group of
allelic probes may be spaced apart by about 300 nm. In practice the
distance between each consecutive SNP would vary. In a highly
preferred embodiment, the various probes in the block of array
elements are arranged such that the groups of allelic probes for
the various loci are arranged in one axis and within each group,
the different allelic probes for the locus are arranged in another
axis. For example, to detect four linked biallelic SNP loci, a
block of array elements may be arranged as 8 cells in a 4 by 2
arrangement with the probes for one allele on one row and the
probes for the other allele on another row, each column having two
cells representing the two possible alleles for each locus (see
FIG. 1).
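The spacings quoted above are roughly what is obtained if the target
is taken to be fully extended B-form DNA with a rise of about 0.34 nm
per base pair. That figure is an assumption introduced here, not a
value given in the source, and stretched or combed DNA may differ.
The sketch also lays out the block of array elements for four
biallelic loci with one row per allele and one column per locus; the
function and locus names are hypothetical.

```python
NM_PER_BP = 0.34  # assumed rise per base pair of extended B-form DNA

def probe_spacing_nm(distance_bp, nm_per_bp=NM_PER_BP):
    """Physical spacing between probe groups for loci distance_bp apart."""
    return distance_bp * nm_per_bp

def block_layout(loci, alleles_per_locus=2):
    """One column per locus, one row per allele."""
    return [[(locus, allele) for locus in loci]
            for allele in range(alleles_per_locus)]

spacing_10kb = probe_spacing_nm(10_000)   # about 3.4 microns
spacing_1kb = probe_spacing_nm(1_000)     # about 340 nm
layout = block_layout(["SNP1", "SNP2", "SNP3", "SNP4"])
```

Here layout has two rows of four cells each, giving the 8 cells
described for four linked biallelic SNP loci.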
[0285] This arrangement of blocks of array elements for
interrogating individual molecules at multiple loci is not limited
to SNP detection but may also be used in other methods such as
haplotyping or sequence determination.
[0286] Molecular arrays produced by the methods of the invention
preferably comprise at least 10 distinct molecular species, more
preferably at least 50 or 100 different molecular species.
[0287] Two possible approaches for manufacturing low density arrays
for use in the present invention are outlined below.
[0288] i. De Novo Fabrication
[0289] In one embodiment of the present invention, low density
molecular arrays are produced by immobilising pluralities of
molecules of known composition to a solid phase. Typically, the
molecules are immobilised onto or in discrete regions of a solid
substrate. The substrate may be porous to allow immobilisation
within the substrate (e.g. Benoit et al., 2001, Anal. Chemistry 73:
2412-2420) or substantially non-porous, in which case the molecules
are typically immobilised on the surface of the substrate.
[0290] The solid substrate may be made of any material to which the
molecules can be bound, either directly or indirectly. Examples of
suitable solid substrates include flat glass, quartz, silicon
wafers, mica, ceramics and organic polymers such as plastics,
including polystyrene and polymethacrylate. The surface may be
configured to act as an electrode or a thermally conductive
substrate (which may enhance the hybridisation or discrimination
process). For example, micro and sub-micro electrodes can be formed
on the surface of a suitable substrate using lithographic
techniques. Smaller nanoelectrodes can be made by electron beam
writing/lithography. Electrodes may also be made using conducting
polymers which may be applied to the substrate by ink-jet printing
devices or by soft lithography. Electrodes may be provided at a
density such that each immobilised molecule has its own electrode
or at a higher density such that groups of molecules or elements
are connected to an individual electrode. Alternatively, one
electrode may be provided as a layer below the surface of the array
which forms a single electrode. Where the probe species are
arranged on individual electrodes, the current flowing between
separate electrodes can be determined. A current would be expected
to flow if certain molecules, such as double stranded DNA or
conductive substances whose growth is templated by such molecules,
span the space between the electrodes.
[0291] The solid substrate may optionally be interfaced with a
permeation layer or a buffer layer. It may also be possible to use
semi-permeable membranes such as nitrocellulose or nylon membranes,
which are widely available. The semi-permeable membranes may be
mounted on a more robust solid surface such as glass. The surfaces
may optionally be coated with a layer of metal, such as gold,
platinum or other transition metal. A particular example of a
suitable solid substrate is the commercially available SPR
BIACore.TM. chip (Pharmacia Biosensors). Heaton et al., 2001 (PNAS
98:3701-3704) have applied an electrostatic field to an SPR surface
and used the electric field to control hybridisation.
[0292] Preferably, the solid substrate is generally a material
having a rigid or semi-rigid surface. In preferred embodiments, at
least one surface of the substrate will be substantially flat,
although in some embodiments it may be desirable to physically
separate discrete elements with, for example, raised regions or
etched trenches. For example, the solid substrate may comprise
nanovials--small cavities in a flat surface e.g. 10 .mu.m in
diameter and 10 .mu.m deep. This may particularly be useful for
cleaving molecules from a surface and performing assays or other
processes such as amplification on them. The solution phase
reaction would be expected to be more efficient than the solid
phase reaction. But the result would remain spatially addressable
which is advantageous.
[0293] It is also preferred that the solid substrate is suitable
for the low density application of molecules such as nucleic acids
in discrete areas. It may also be advantageous to provide channels
to allow for capillary action since in certain embodiments this may
be used to achieve the desired straightening of individual nucleic
acid molecules. Channels may be in a 2-D arrangement (e.g. Quake S.
and Scherer A., 2000, Science 290: 1536-1540) or in a 3-D flow through
arrangement (Benoit et al., 2001, Anal. Chemistry 73: 2412-2420).
Channels could provide a higher surface area hence a larger number
of molecules could be immobilised. In the case of a 3-D flow
channel array interrogation may be by confocal microscopy which may
image multiple slices of the channels in the z direction.
[0294] In some instances array elements will be raised atop
electrodes/electrode arrays.
[0295] The solid substrate is conveniently divided up into
sections. This may be achieved by techniques such as photoetching,
or by the application of hydrophobic inks, for example Teflon-based
inks (Cel-line, USA).
[0296] Discrete positions, in which different molecules or
groups of molecular species are located, may have any convenient
shape, e.g., circular, rectangular, elliptical, wedge-shaped,
etc.
[0297] Attachment of the plurality of molecules to the substrate
may be by covalent or non-covalent (such as electrostatic) means.
The plurality of molecules may be attached to the substrate via a
layer of intermediate molecules to which the plurality of molecules
bind. For example, the plurality of molecules may be labelled with
biotin and the substrate coated with avidin and/or streptavidin. A
convenient feature of using biotinylated molecules is that the
efficiency of coupling to the solid substrate can be determined
easily. Since the plurality of molecules may bind only poorly to
some solid substrates, it may be necessary to provide a chemical
interface between the solid substrate (such as in the case of
glass) and the plurality of molecules. Examples of suitable
chemical interfaces include various silane linkers and polyethylene
glycol spacers. Another example is the use of polylysine coated
glass, the polylysine then being chemically modified if necessary
using standard procedures to introduce an affinity ligand. Nucleic
acids may be immobilised directly to a polylysine surface
(electrostatically). The density of the surface charge will
be important to immobilise molecules in a manner that allows them
to be well presented for assays and detection.
[0298] Other methods for attaching molecules to the surfaces of
solid substrate by the use of coupling agents are known in the art,
see for example WO98/49557. The molecules may also be attached to
the surface by a cleavable linker.
[0299] In one embodiment, molecules are applied to the solid
substrate by spotting (such as by the use of robotic micropipetting
techniques--Schena et al., 1995, Science 270: 467-470) or ink jet
printing using for example robotic devices equipped with either
pins or piezo electric devices as in the known art.
[0300] For example, pre-synthesised oligonucleotides dissolved in 100
mM NaOH or 2-4.times.SSC can be applied to glass slides coated with
3-glycidoxypropyltrimethoxysilane or the ethoxy derivative under
con. These can then be placed at 110 degrees for 15 minutes and
then placed at 4 degrees. Advantageously the oligos may have an
amino terminus but unmodified oligos can also be spotted.
[0301] Alternatively amino-terminated oligonucleotides can be
spotted onto 3-Aminopropyltrimethoxysilane in 100 mM 1:1 Sodium
Carbonate: Sodium Hydrogen Carbonate at pH 9. This can be followed
by 37 degrees for two hours and exposure to ammonia vapour for 1
hour.
[0302] cDNAs or other unmodified DNA can be spotted onto the above
slides or onto poly-L-lysine coated slides; 2-4.times.SSC or 1:1
DMSO:water can be used for spotting. Treatment with UV
and succinic anhydride is optional. There are a number of vendors who sell
slides with different surface modifications and appropriate buffers
e.g. Corning (USA), Quantifoil (Jena, Germany), SurModics
(USA), Mosaic (Boston, USA).
[0303] The required low density is typically achieved by using
dilute solutions. One microlitre of a 10.sup.-6 M solution spread
over a 1 cm.sup.2 area has been shown to give a mean intermolecular
separation of 12.9 nm on the surface, a distance far too small to
resolve with an optical microscope. Each factor of 10 dilution
increases the average intermolecular separation by a factor 3.16.
Thus, a 10.sup.-9 M solution gives a mean intermolecular separation
of about 400 nm and a 10.sup.-12 M gives a mean intermolecular
separation of about 12.9 .mu.m. With a mean separation of about
12.9 .mu.m, if the molecules are focused to appear to be 0.5 .mu.m
in diameter and the average distance is 5 .mu.m, then the chance of
two molecules overlapping (i.e. centre to centre distance of 5
.mu.m or less) is about 1% (based on M. Unger, E. Kartalov, C. S.
Chiu, H. Lester and S. Quake, "Single Molecule Fluorescence
Observed with Mercury Lamp Illumination", Biotechniques 27:
1008-1013 (1999)). Consequently, typical concentrations of dilute
solutions used to spot or print the array, where far field optical
methods will be used for detection will be in the order of at least
10.sup.-9 M, preferably at least 10.sup.-10 M or 10.sup.-12 M. The
concentration used will be higher with the use of superresolution
far field methods or SPM. It should also be borne in mind that only
a fraction of molecules that are spotted onto a surface will
robustly attach to the surface (0.1% to 1% for example). Depending
on the method of immobilisation, only a fraction of those molecules
that are robustly attached will be available for hybridisation or
enzymatic assays. For example with the use of aminolinked
oligonucleotides and spotting onto an aminopropyltriethoxysilane
(APTES) coated slide surface about 20% of the oligonucleotides are
available for mini-sequencing.
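The dilution figures in this paragraph can be reproduced with a short
calculation, sketched here under the assumption that the mean
intermolecular separation is taken as the square root of the surface
area available per molecule. The function name and the attached
parameter (the fraction of spotted molecules that attach) are
illustrative additions rather than terms used in the source.

```python
from math import sqrt

AVOGADRO = 6.02214076e23  # molecules per mole

def mean_separation_nm(molar, volume_l=1e-6, area_cm2=1.0, attached=1.0):
    """Mean separation (nm) of molecules spread evenly over a surface.

    molar     concentration of the spotted solution (mol/l)
    volume_l  volume applied, in litres (default: one microlitre)
    area_cm2  area over which the volume is spread
    attached  fraction of molecules that actually attach (assumed)
    """
    molecules = molar * volume_l * AVOGADRO * attached
    area_nm2 = area_cm2 * 1e14  # 1 cm^2 = 1e14 nm^2
    return sqrt(area_nm2 / molecules)

for molar in (1e-6, 1e-9, 1e-12):
    print(f"{molar:.0e} M -> {mean_separation_nm(molar):.1f} nm")
```

Each tenfold dilution increases the separation by the square root of
10 (about 3.16), and passing attached=0.01 models the case in which
only 1% of the spotted molecules attach robustly, which increases the
separation a further tenfold.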
[0304] In a second embodiment, the surface is designed in such a
way that sites of attachment (i.e. chemical linkers or surface
moieties) are dilute or that sites are selectively protected or
blocked. In this case, the concentration of the sample used for
ink jet printing or spotting is immaterial provided the attachment
is specific to these sites. In the case of in situ synthesis of
molecules, the lower number of available sites for initiating
synthesis allows more efficient synthesis providing a higher chance
of obtaining full-length products.
[0305] Polymers such as nucleic acids or polypeptides may also be
synthesised in situ using photolithography and other masking
techniques whereby molecules are synthesised in a step-wise manner
with incorporation of monomers at particular positions being
controlled by means of masking techniques and photolabile
reactants. For example, U.S. Pat. No. 5,837,832 describes a method
for producing DNA arrays immobilised to silicon substrates based on
very large scale integration technology. In particular, U.S. Pat.
No. 5,837,832 describes a strategy called "tiling" to synthesise
specific sets of probes at spatially-defined locations on a
substrate. U.S. Pat. No. 5,837,832 also provides references for
earlier techniques that may also be used. Light directed synthesis
can also be carried out by using a Digital Light Micromirror chip
(Texas Instruments) as described (Singh-Gasson et al Nature
Biotechnology 1999 17: 974-978). Instead of using
photo-deprotecting groups which are directly processed by light,
conventional deprotecting groups such as dimethoxy trityl can be
employed with light directed methods where for example a photoacid
is generated in a spatially addressable way which selectively
deprotects the DNA monomers (McGall et al PNAS 1996 93:
13555-13560). Electrochemical generation of acid is another means
that is being developed (e.g. Combimatrix Corp.).
[0306] The size of array elements is from 0.1.times.0.1 microns and
above as can be ink jet printed onto a patterned surface or created
by photolithography or physical masking.
[0307] Molecules may be attached to the solid phase at a single
point of attachment, which may be at the end of the molecule or
otherwise. Alternatively, molecules may be attached at two or more
points of attachment. In the case of nucleic acids, it may be
advantageous to use techniques that `horizontalize` the immobilised
molecule relative to the solid substrate. For example, fluid
fixation of drops of DNA has been shown previously to elongate and
fix DNA to a derivatised surface such as silane derivatised
surfaces. This may promote accessibility of the immobilised
molecules for target molecules. Spotting of sample by
quills/pins/pens under fast evaporation conditions creates
capillary forces as samples dry to elongate molecules. Means for
straightening molecules by capillary action in channels have been
described by Jong-in Hahm at the Cambridge Healthtech Institute's
Fifth Annual Meeting on Advances in Assays, Molecular Labels,
Signaling and Detection, May 17-18.sup.th, Washington D.C. Samples
may be applied through an array of channels. The density of
molecules stretched across a surface is typically constrained by
the radius of gyration of the DNA molecule.
[0308] Immobilised molecules may also serve to bind further
molecules to complete manufacture of the array. For example,
nucleic acids immobilised to the solid substrate may serve to
capture further nucleic acids by hybridisation, or polypeptides.
Similarly, polypeptides may be incubated with other compounds, such
as other polypeptides. It may be desirable to permanently "fix"
these interactions using, for example UV cross-linking and
appropriate cross-linking reagents. Capture of secondary molecules
may be achieved by binding to a single immobilised "capture"
molecule or to two or more "capture" molecules. Where secondary
molecules bind to two or more "capture" molecules, this may have
the desirable effect of containing the secondary molecule
horizontally.
[0309] ii. Density Reduction of High Density Arrays
[0310] In an alternative embodiment, the molecular array may be
obtained by providing an array produced with molecules at normal
(high) densities using a variety of methods known in the art,
followed by reduction of surface coverage.
[0311] A reduction in actual or effective surface coverage may be
achieved in a number of ways. Where molecules are attached to the
substrate by a linker, the linker may be cleaved. Instead of taking
the cleavage reaction to completion the reaction is partial, to the
level required for achieving the desired density of surface
coverage. In the case of molecules attached to glass by an epoxide
and PEG linkage, such as oligonucleotides, partial removal of
molecules can be achieved by heating in ammonia which is known to
progressively destroy the lawn.
[0312] It may also be possible to obtain a reduction in surface
coverage by functional inactivation of molecules in situ, for
example using enzymes or chemical agents. The amount of enzyme or
agent used should be sufficient to achieve the desired reduction
without inactivating all of the molecules. Although the end result
of this process will often be a substrate which has molecules per
se at the same density as before the density reduction step, the
density of functional molecules will have been reduced since many
of the original molecules will have been inactivated. For example,
phosphorylation of the 5' ends of 3' attached oligonucleotides by
Polynucleotide kinase, which renders the oligonucleotides available
for ligation assays may only be 10% efficient.
[0313] An alternative method for obtaining a reduction in molecule
density is to obtain an effective reduction in density by labelling
or tagging only a proportion of the pre-existing immobilised
molecules so that only the labelled/tagged molecules at the
required density are available for interaction and/or analysis.
This is particularly useful for analysing low target numbers on
normal density arrays where the target introduces label.
[0314] These density reduction steps can be applied conveniently to
ready-made molecular arrays which are sold by various vendors e.g.
Affymetrix, Corning, Agilent and Perkin Elmer. Alternatively,
proprietary molecular arrays may be treated as required.
[0315] The present invention also provides an "array of arrays",
wherein an array of molecular arrays (level 1) as described are
configured into arrays (level 2) for the purpose of multiplex
analysis. Multiplex analysis can be done by sealing each molecular
array (level 1) with an individual chamber that makes a seal with the
common substrate, so that a separate sample can be applied to each.
Alternatively each molecular array (level 1) can be placed at the
end of a pin (as commonly used in combinatorial chemistry) or a
fibre and can be dipped into a multi well plate such as a 384 well
microtitre plate. The fibre could be an optical fibre which can
serve to channel the signal from each array to a detector. The
molecular array (level 1) could be on a bead which self-assembles
onto a hollow optical fibre as described by Walt and co-workers
[Illumina Inc.; Karri et al Anal. Chem 1998 70: 1242-1248]. Moreover,
the array may be of arrays of randomly immobilised molecules of
known and defined type, for example a complete oligonucleotide set
of every 17 mer or genomic DNA from a particular human sample.
[0316] Biosensors
[0317] Low density molecular arrays may be used to produce a
biosensor which may be used to monitor single molecule assays on a
substrate surface, such as a chip. The array may comprise, for
example, between 1 and 100 different immobilised molecules (e.g.
probes), an excitation source and a detector such as a CCD, all
within an integrated device. Sample processing may or may not be
integrated into the device.
[0318] In one aspect, the biosensor would comprise a plurality of
elements, each element containing distinct molecules, such as probe
sequences. Each element may then be specific for the detection of,
for example, different pathogenic organisms.
[0319] In a preferred embodiment the immobilised molecules would be
in the form of molecular beacons and the substrate surface would be
such that an evanescent wave can be created at the surface. This
may be achieved by forming a grating structure on the substrate
surface or by making the array on an optical fibre (within which
light is totally internally reflected) for example. The CCD
detector may be placed below the array surface or above the array,
separated from the surface by a short distance to allow space for
the reaction volume.
[0320] Examples of biosensor configurations are given in FIG. 6
where: (a) is an integrated detection scheme based on Fluorescence
Resonance Energy Transfer (FRET). The sample is applied between two
plates, one with a CCD and the other with an LED with grating
structure on its surface. (b) is an integrated detection system
with a molecular beacon (Tyagi et al Nat Biotechnol. 1998,
16:49-53) on an optical fibre.
[0321] B. Interrogation/Detection Methods
[0322] Individual molecules in the array and their interaction with
target molecules can be detected using a number of means. Detection
may be based on measuring, for example physicochemical,
electromagnetic, electrical, optoelectronic or electrochemical
properties, or characteristics of the immobilised molecule and/or
target molecule.
[0323] There are two factors that are pertinent to single molecule
detection of molecules on a surface. The first is achieving
sufficient spatial resolution to resolve individual molecules. The
density of molecules is such that only one molecule is located in
the diffraction limit spot of the microscope which is ca. 300 nm.
Low signal intensities reduce the accuracy with which the spatial
position of a single molecule can be determined. The second is to
achieve specific detection of the desired single molecules as
opposed to background signals.
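By way of illustration only, the density requirement above can be put into rough numbers. The following is a minimal sketch, assuming that molecules are distributed at random over the surface (Poisson statistics) and that the diffraction-limited spot is a disc of 300 nm diameter. The function names and the example densities are illustrative assumptions and are not taken from this specification.

```python
import math


def spot_area_um2(diameter_nm=300.0):
    """Area of a circular diffraction-limited spot, in square micrometres."""
    radius_um = diameter_nm / 2.0 / 1000.0
    return math.pi * radius_um ** 2


def fraction_resolvable(density_per_um2, diameter_nm=300.0):
    """Among occupied spots, the fraction that hold exactly one molecule.

    Assumes molecules land independently and at random (Poisson model).
    """
    if density_per_um2 <= 0:
        raise ValueError("density must be positive")
    mean = density_per_um2 * spot_area_um2(diameter_nm)
    p_exactly_one = mean * math.exp(-mean)
    p_occupied = 1.0 - math.exp(-mean)
    return p_exactly_one / p_occupied


# Compare a sparse and a crowded surface (hypothetical densities).
for density in (0.1, 1.0, 10.0):
    print(density, round(fraction_resolvable(density), 3))
```

Under these assumptions, lowering the surface density increases the proportion of occupied spots that contain a single molecule, which is the condition for individual resolution described above.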
[0324] Scanning probe microscopy (SPM) involves bringing a probe
tip into intimate contact with molecules as the tip is scanned
across a relatively flat surface to which the molecules are
attached. Two well-known versions of this technique are scanning
tunnelling microscopy (STM) and atomic force microscopy (AFM; see
Moeller et al., 2000, NAR 28: 20, e91) in which the presence of the
molecule manifests itself as a tunnel current or a deflection in
the tip-height of the probe, respectively. AFM may be enhanced
using carbon nanotubes attached to the probe tip (Woolley et al.,
2000, Nature Biotechnology 18: 760-763). Arrays of SPM probes,
which can acquire images simultaneously, are being developed by
many groups and would speed the image acquisition process.
Gold or other material beads could be used to help scanning probe
microscopy find molecules automatically. Electron microscopy is
also a means to interrogate but this is relatively cumbersome.
[0325] Optical methods based on sensitive detection of absorption
or emission may also be used. Typically, optical excitation means are
used to interrogate the array, such as light of various
wavelengths, often produced by a laser source. A commonly used
technique is laser-induced fluorescence. Although some molecules
will be sufficiently inherently luminescent for detection,
generally molecules in the array (and/or target molecules) will
need to be labelled with a chromophore such as a dye or optically
active particle (see above). If necessary, the signal from a single
molecule assay can, for example, be amplified by labelling with dye
loaded nanoparticles, or multi-labelled dendrimers or PRPs/SPRs.
Raman spectroscopy is another means for achieving high
sensitivity.
[0326] Plasmon resonant particles (PRPs) are metallic nanoparticles
which scatter light elastically with remarkable efficiency because
of a collective resonance of the conduction electrons in the metal
(i.e. the surface plasmon resonance). PRPs can be formed that have
a scattering peak anywhere in the visible range of the spectrum. The
magnitude, peak wavelength and spectral bandwidth of the plasmon
resonance associated with a nanoparticle are dependent on the
particle's size, shape and material composition, as well as its local
environment. These particles can be used to label a molecule of
interest.
[0327] Surface-enhanced Raman scattering (SERS) on nanoparticles
exploits the Raman vibrations of the single molecules themselves on
metallic nanoparticles to amplify their spectroscopic signatures.
[0328] Further, many of these techniques may be applied to
fluorescence resonance energy transfer (FRET) methods of detecting
interactions where, for example the molecules in the array are
labelled with a fluorescent donor and the target molecules (or
reporter oligonucleotides) are labelled with a fluorescent
acceptor, a fluorescent signal being generated when the molecules
are in close proximity. Moreover, structures such as molecular
beacons where the FRET donor and acceptor (quencher) are attached
to the same molecule can be used.
[0329] The use of dye molecules encounters the problems of
photobleaching and blinking. Labelling with dye-loaded
nanoparticles or surface plasmon resonance (SPR) particles reduces
the problem. However a single dye molecule will bleach after a
period of exposure to light. The photobleaching characteristics of
a single dye molecule have been used to advantage in the single
molecule field as a means for distinguishing signal from multiple
molecules or other particles from the single molecule signal.
[0330] Spectroscopy techniques require the use of monochromatic
laser light, the wavelength of which will vary according to the
application. However, microscopy imaging techniques may use broader
spectrum electromagnetic sources.
[0331] Optical interrogation/detection techniques include
near-field scanning optical microscopy (NSOM), confocal microscopy
and evanescent wave excitation. More specific versions of these
techniques include far-field confocal microscopy, two-photon
microscopy, wide-field epi-illumination, epifluorescence microscopy
and total internal reflection (TIR) microscopy. Many of the above
techniques may also be used in a spectroscopic mode. The actual
detection means include charge coupled device (CCD) cameras and
intensified CCDs, photodiodes and photomultiplier tubes. These
means and techniques are well-known in the art. However, a brief
description of a number of these techniques is provided below.
[0332] Near-Field Scanning Optical Microscopy (NSOM)
[0333] In NSOM, subdiffraction spatial resolutions in the order of
50-100 nm are achieved by bringing a sample to within 5-10 nm of a
subwavelength-sized optical aperture. The optical signals are
detected in the far field by using an objective lens either in the
transmission or collection mode (see Barer, Cosslett, eds 1990,
Advances in Optical and Electron Microscopy. Academic; Betzig,
1992, Science 257: 189-95). The benefits of NSOM are its improved
spatial resolution and the ability to correlate spectroscopic
information with topographic data. The molecules of the array need
to either have an inherent optically detectable characteristic such
as fluorescence, or be labelled with an optically active dye or
particle, such as a fluorescent dye. It has been proposed that
resolution can be taken down to just a few nanometres by scanning
apertureless microscopy (F. Zenhausern, Y. Martin, H. K.
Wickramasinghe, "Scanning Interferometric Apertureless Microscopy:
Optical Imaging at 10 Angstrom Resolution", Science 269, p. 1083;
T. J. Yang, G. A. Lessard, and S. R. Quake, "An Apertureless
Near-Field Microscope for Fluorescence Imaging", Applied Physics
Letters 76: 378-380 (2000)).
[0334] Alternatively excitation can be limited to the near field by
a scanning probe or a narrow slit in near-field proximity to the
sample. Acquisition may be in the far field (Tegenfeldt et al.,
2001, Physical Review Letters 86: 1378-1381).
[0335] Far-Field Confocal Microscopy
[0336] In confocal microscopy, a laser beam is brought to its
diffraction-limited focus inside a sample using an oil-immersion,
high-numerical-aperture objective. The fluorescent signal emerging
from a 50-100 µl region of the sample is measured by a photon
counting system and displayed on a video system (for further
background see Pawley J. B., ed 1995, Handbook of Biological
Confocal Microscopy). Improvements to the photon-counting system
have allowed single molecule fluorescence to be followed in real
time (see Nie et al., 1994, Science 266: 1018-21). A further
development of far-field confocal microscopy is confocal two-photon
fluorescence microscopy, which can allow excitation at a single
wavelength (see for example, Mertz et al., 1995, Opt. Lett. 20:
2532-34).
[0337] Wide-Field Epi-Illumination
[0338] The optical excitation system used in this method generally
consists of a laser source, defocusing optics, a high performance
dichroic beamsplitter, and an oil-immersion, low autofluorescence
objective. Highly sensitive detection is achieved by this method
using a cooled, back-thinned charge-coupled device (CCD) camera or
an intensified CCD (ICCD). High-powered mercury lamps may also be
used to provide more uniform illumination than is possible for
existing laser sources. The use of epi-fluorescence to image single
myosin molecules is described in Funatsu et al., 1995, Nature 374:
555-59.
[0339] Evanescent Wave Excitation
[0340] At a glass-liquid or glass-air interface, the
optical electromagnetic field decays exponentially into the liquid
phase (or air). Molecules in a thin layer of about 300 nm
immediately next to this interface can still be excited by the
rapidly decaying optical field (known as an evanescent wave). A
description of the use of evanescent wave excitation to image
single molecules is provided in Hirschfeld, 1976, Appl. Opt. 15:
2965-66 and Dickson et al., 1996, Science 274: 966-69. The imaging
setup for evanescent wave excitation typically includes a
microscope configured such that total internal reflection occurs at
the glass/sample interface (Axelrod D. Methods in Cell Biology 1989
30: 245-270). Alternatively, periodic optical microstructures or
gratings can provide evanescent wave excitation at the optical
near-field of the grating structures. This serves to increase
signal around 100 fold (surface planar waveguides have been
developed by Zeptosens, Switzerland; similar technology has been
developed by Wolfgang Budach et al., Novartis, Switzerland; poster at
Cambridge Healthtech Institute's Fifth Annual meeting on "Advances
in Assays, Molecular Labels, Signalling and Detection"). Preferably
an intensified CCD is used for detection.
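For illustration, the exponential decay described above is commonly written I(z) = I0 exp(-z/d), where the penetration depth is d = wavelength / (4 pi sqrt(n1^2 sin^2(theta) - n2^2)) for total internal reflection at incidence angle theta. The following minimal sketch evaluates this standard expression. The wavelength, refractive indices and angle used in the example are assumed values chosen for illustration and do not come from this specification; the depth obtained depends strongly on the angle and grows as the critical angle is approached.

```python
import math


def critical_angle_deg(n_glass, n_sample):
    """Incidence angle above which total internal reflection occurs."""
    return math.degrees(math.asin(n_sample / n_glass))


def penetration_depth_nm(wavelength_nm, n_glass, n_sample, angle_deg):
    """1/e intensity decay depth of the evanescent field."""
    term = (n_glass * math.sin(math.radians(angle_deg))) ** 2 - n_sample ** 2
    if term <= 0:
        raise ValueError("angle is at or below the critical angle")
    return wavelength_nm / (4.0 * math.pi * math.sqrt(term))


def relative_intensity(z_nm, depth_nm):
    """Excitation intensity at distance z from the interface, relative to z = 0."""
    return math.exp(-z_nm / depth_nm)


# Assumed example: 488 nm light, glass (n = 1.518) against water (n = 1.33), 65 degrees.
depth = penetration_depth_nm(488.0, 1.518, 1.33, 65.0)
print(round(critical_angle_deg(1.518, 1.33), 1), round(depth, 1))
```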
[0341] Superresolution Far-Field Optical Methods
[0342] Superresolution far-field optical methods have been
highlighted by Weiss, 2000 (PNAS 97:8747-8749). One new approach
which merits mention is point-spread-function engineering by
stimulated emission depletion (Klar et al 2000, PNAS 97: 8206-8210)
which may improve far-field resolution by 10 fold. Distance
measurement accuracy of better than 10 nm using far field
microscopy, can be achieved by scanning a sample with nanometre
size steps using a piezo-scanner (Lacoste et al PNAS 2000 97:
9461-9466). The resulting spots are localised accurately by fitting
them to the known shape of the excitation point-spread function of
the microscope. The laboratory of Enrico Gratton is developing
similar measurement capabilities by circular scanning of the
excitation beam. Shorter distances can typically be measured by
molecular labelling strategies utilising FRET (Ha et al., Chem.
Phys. 1999 247: 107-118) or near field methods such as SPM. These
distance measurement capabilities will be useful for the sequencing
applications proposed in this invention.
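As a sketch of how a spot may be localised by fitting it to a known point-spread function, the following example approximates the point-spread function by a Gaussian of known width (an assumption made here for simplicity) and works in one dimension for brevity. The centre is found by a grid search that minimises the squared residual, with the amplitude solved in closed form for each candidate centre. The function name, pixel size and test values are hypothetical.

```python
import math


def fit_centre(positions_nm, intensities, sigma_nm, step_nm=1.0):
    """Least-squares estimate of the centre of a Gaussian spot of known width."""
    lo, hi = min(positions_nm), max(positions_nm)
    n_steps = int(round((hi - lo) / step_nm))
    best_centre, best_residual = lo, float("inf")
    for k in range(n_steps + 1):
        centre = lo + k * step_nm
        model = [math.exp(-((x - centre) ** 2) / (2.0 * sigma_nm ** 2))
                 for x in positions_nm]
        # Best-fitting amplitude for this candidate centre (linear least squares).
        amplitude = (sum(y * g for y, g in zip(intensities, model))
                     / sum(g * g for g in model))
        residual = sum((y - amplitude * g) ** 2
                       for y, g in zip(intensities, model))
        if residual < best_residual:
            best_centre, best_residual = centre, residual
    return best_centre


# Synthetic, noise-free spot sampled on 100 nm pixels (illustrative values).
pixels = [100.0 * i for i in range(11)]
true_centre, sigma = 437.0, 130.0
spot = [50.0 * math.exp(-((x - true_centre) ** 2) / (2.0 * sigma ** 2))
        for x in pixels]
print(fit_centre(pixels, spot, sigma))
```

In this sketch the centre is recovered to within the 1 nm search step even though the pixels are 100 nm apart, which illustrates why localisation accuracy can be much finer than the diffraction limit when the spot shape is known. With real data the accuracy would be limited by photon noise and background.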
[0343] Microarray Scanners
[0344] The burgeoning microarray field has introduced a plethora of
different scanners based on many of the above described optical
methods. These include scanners based on scanning confocal
microscopy, TIRF and white light for illumination and
photomultiplier tubes, avalanche photodiodes and CCDs for
detection. However, commercial array scanners in their standard
form are not sensitive enough for SMD and the analysis software is
inappropriate.
[0345] Studies have suggested that by varying the angle of
incidence in TIRF microscopy it is possible to discriminate between
fluorophores on a nanometric scale (Ajo-Franklin C M, Kam L, Boxer
S G. PNAS 2001 98 (24): 13643-8). This can lead to discrimination
of closely spaced probes. A separate method for nanometric
localization precision has been described by Thompson et al
(Thompson R E, Larson D R, Webb W W. Biophys J. 82:2775-83).
[0346] Since the molecular arrays of the invention are spatially
addressable, any immobilised molecule of interest/element of
interest can be interrogated by moving the substrate comprising the
array to the appropriate position (or moving the detection means).
In this way, as many or as few of the elements in the array as
desired can be read and the results processed. x-y stage translation mechanisms
for moving the substrate to the correct position are available for
use with microscope slide mounting systems (some have a resolution
of 100 nm). Movement of the stage can be controlled automatically
by computer if required. Ha et al (Appl.Phys. Lett. 70: 782-784
(1997)) have described a computer controlled optical system which
automatically and rapidly locates and performs spectroscopic
measurements on single molecules. A galvanometer mirror or a
digital micromirror device (Texas Instruments, Houston) may be used
to enable scanning of the image from a stationary light source.
Signals can be processed from the CCD or other imaging device and
stored digitally for subsequent data processing.
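The spatial addressing described above can be sketched as a simple coordinate calculation. The example below assumes, purely for illustration, that the array elements lie on a regular square grid with a fixed pitch and that the stage moves in whole steps of 100 nm. The function names, the pitch and the origin are hypothetical.

```python
def element_position_um(row, col, pitch_um, origin_um=(0.0, 0.0)):
    """Stage coordinates (x, y) of the array element at (row, col)."""
    x0, y0 = origin_um
    return (x0 + col * pitch_um, y0 + row * pitch_um)


def stage_steps(target_um, current_um, step_nm=100.0):
    """Whole stage steps along x and y needed to move from current to target."""
    step_um = step_nm / 1000.0
    return tuple(round((t - c) / step_um) for t, c in zip(target_um, current_um))


# Move from the stage origin to the element in row 2, column 3 of a 5 um pitch grid.
target = element_position_um(2, 3, 5.0, origin_um=(10.0, 20.0))
print(target, stage_steps(target, (0.0, 0.0)))
```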
[0347] Multicolour Imaging
[0348] Signals of different wavelength can be obtained by multiple
acquisitions or by simultaneous acquisition by splitting the
signal, using RGB detectors or analysing the whole spectrum
(Richard Levenson, Cambridge Healthtech Institute's Fifth Annual
meeting on Advances in Assays, Molecular Labels, Signaling and
Detection, May 17-18th, Washington D.C.). Several spectral
lines can be acquired by the use of a filter wheel or a monochromator.
Electronic tunable filters such as acousto-optic tunable filters
or liquid crystal tunable filters can be used to obtain
multispectral imaging (e.g. Oleg Hait, Sergey Smirnov and Chieu D.
Tran, 2001, Analytical Chemistry 73: 732-739). An alternative method
to obtain a spectrum is hyperspectral imaging (Schultz et al.,
2001, Cytometry 43:239-247).
[0349] The Problem of Background Fluorescence
[0350] Microscopy and array scanning are not typically configured
for single molecule detection. The fluorescence collection
efficiency must be maximized and this can be achieved with high
numerical aperture (NA) lenses and highly sensitive electro-optical
detectors such as avalanche diodes that reach quantum yields of
detection as high as 0.8 and CCDs that are intensified (e.g.
I-PentaMAX Gen III; Roper Scientific, Trenton, N.J., USA) or cooled
(e.g. Model ST-71; Santa Barbara Instruments Group, Calif., USA).
However, the problem is not so much the detection of fluorescence
from the desired single molecule (single fluorophores can emit
~10^8 photons/sec) but the rejection of background
fluorescence. This can be done in part by only interrogating a
minimal volume as done in confocal microscopy and TIRF. Traditional
spectral filters can be applied to reduce the contribution from
surrounding material (largely Rayleigh and Raman scattering of the
excitation laser beam by the solvent and fluorescence from
contaminants).
[0351] To reduce background fluorescence to levels which allow
legitimate signal from single molecules to be detected a pulsed
laser illumination source synchronized with a time gated low light
level CCD can be used (Enderlein et al. in: Microsystem technology:
A powerful tool for biomolecular studies; Eds.: M. Kohler, T.
Mejevaia, H. P. Saluz (Birkhäuser, Basel, 1999) 311-29). This is
based on the phenomenon that after a sufficiently short pulse of
laser excitation the decay of the analyte fluorescence is usually
much longer (1-10 ns) than the decay of the light scattering
(~10^2 ps). Pulsing of a well chosen laser can reduce the
background count rate so that individual photons from individual
fluorophores can be detected. The laser power, beam size and
repetition rate must be appropriately configured. A commercial
array scanner and its software can be customized (Fairfield
Enterprises, USA) so that robust single molecule sensitivity can be
achieved.
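The benefit of time gating can be estimated with a simple calculation. The sketch below assumes that both the fluorescence and the scattered light decay as single exponentials after the excitation pulse, so that the fraction of each remaining once the gate opens at delay t is exp(-t/tau). The gate delay of 1 ns and the fluorescence lifetime of 4 ns are assumed example values within the ranges given above; the function names are illustrative.

```python
import math


def surviving_fraction(gate_delay_ns, decay_time_ns):
    """Fraction of a single-exponential decay emitted after the gate opens."""
    return math.exp(-gate_delay_ns / decay_time_ns)


def gating_gain(gate_delay_ns, fluorescence_lifetime_ns, scatter_decay_ns):
    """Improvement in the fluorescence-to-scatter ratio produced by the gate."""
    return (surviving_fraction(gate_delay_ns, fluorescence_lifetime_ns)
            / surviving_fraction(gate_delay_ns, scatter_decay_ns))


# Assumed example: 1 ns gate delay, 4 ns fluorescence lifetime, 100 ps scatter decay.
print(surviving_fraction(1.0, 4.0))   # fluorescence kept
print(surviving_fraction(1.0, 0.1))   # scattered light kept
print(gating_gain(1.0, 4.0, 0.1))
```

With these assumed values most of the fluorescence is retained while the scattered light is suppressed by several orders of magnitude.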
[0352] In addition to these methods that combat fluorescence noise
from within the sample volume, the instrument itself can contribute
to background noise. Such thermoelectronic noise can be reduced for
example by cooling of the detector. Coupling SPM measurements with
optical measurements would be one way of correlating signals
optically detected to the targeted structures rather than those due
to other sources. Spatial or temporal correlation of signal from
two probes targeting the same molecule suggests the desired rather
than extraneous signal (e.g. Castro and Williams Anal. Chem. 1997
69: 3915-3920).
[0353] Low fluorescence immersion oils should be used, together with
substrates that are ultra-clean and of low intrinsic fluorescence.
Glass slides/coverslips should be of high quality and well cleaned
(e.g. with detergents such as Alconox and Chromerge (VWR Scientific,
USA) and high purity water). Preferably fused quartz should be
used which has a low intrinsic fluorescence. Single fluorophores
can be distinguished from contaminating particles by several
features: spectral dependence, concentration dependence, quantized
emission and blinking. Particulate contaminants usually have broad
spectrum fluorescence which is obtained in several filter sets
whereas single fluorophores are only visible in specific filter
sets.
[0354] The signal to noise ratio can also be improved by using
labels with higher signal intensities such as fluospheres
(Molecular Probes Inc.) or multilabelled dendrimers.
[0355] Label Free Detection.
[0356] A number of physical phenomena may be adapted for detection,
that rely on the physical properties of the immobilised molecules
alone or when complexed with captured targets or that modify the
activity or properties of some other elements. Measurement at terahertz
frequencies is one way that the difference between double stranded and
single stranded DNA could be detected (Brucherseifer et al., 2000, Applied
Physics Letters 77: 4049-4051). Interferometry, ellipsometry and
refraction would be other means. The modification of the signal
from a light emitting diode integrated into the surface would be
another means. The native electronic, optical (e.g. absorbance),
optoelectronic and electrochemical properties would be other means.
Various modes of the AFM could detect differences on the surface in
a label free manner. The quartz crystal microbalance would be
another means.
[0357] Coating DNA Between Electrodes with Metal and Measuring
Conductivity
[0358] In this method, the immobilised single molecule acts as a
template for fabricating a nanowire: the single molecules are
selectively coated with a material that facilitates detection. The
coating is typically a conducting material which allows a circuit
to form between only those electrodes which are occupied by the
target molecule (by virtue of the target molecule binding to the
probes present on each of the electrodes). A potential difference
is applied between electrodes in any two contiguous groups of
electrodes and the electrodes on which probes interact with target
are identified by virtue of the fact that a current flows between
them. The conducting material can be silver, gold, palladium
and/or a conjugated polymer. Where multiple single molecules span the
electrodes then the haplotype frequency is given by the amount of
current that flows between the electrodes. Protocols based on
methods described in the following articles can be used: Richter J
et al., 2001, Applied Physics Letters 78: 536-538; Braun E et al.,
1998, Nature 391: 775; Quake S and Scherer, 2000, Science 290:
1536-1540.
[0359] C. Processing of Raw Data and Means for Error Limitation
[0360] Where signals from different labels, such as different dyes,
overlap, the first stage is typically to deconvolute the signals.
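One simple way to deconvolute overlapping signals is linear unmixing. The sketch below is offered as an illustration under stated assumptions and is not the procedure prescribed by this specification: it assumes two dyes observed in two detection channels, with a known mixing matrix giving the fraction of each dye's emission that falls in each channel. The function name and the crosstalk values are hypothetical.

```python
def unmix_two_dyes(observed, mixing):
    """Recover the contributions of two dyes from two channel readings.

    observed: (channel_1, channel_2) intensities.
    mixing:   ((a, b), (c, d)) where channel_1 = a*dye_1 + b*dye_2
              and channel_2 = c*dye_1 + d*dye_2.
    """
    (a, b), (c, d) = mixing
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("dyes cannot be separated: mixing matrix is singular")
    s1, s2 = observed
    dye_1 = (s1 * d - b * s2) / det   # Cramer's rule
    dye_2 = (a * s2 - c * s1) / det
    return dye_1, dye_2


# Assumed crosstalk: 10% of dye 1 leaks into channel 2, 20% of dye 2 into channel 1.
mixing = ((0.9, 0.2), (0.1, 0.8))
print(unmix_two_dyes((100.0, 50.0), mixing))
```

The same approach extends to four dyes and four channels by solving the corresponding four-by-four linear system.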
[0361] Digital Analysis of Signals
[0362] Discrete groups of assay classification (e.g. nucleotide
base calling) can be defined by various measures. A set of unique
parameters is chosen to define each of several discrete groups.
The result of interrogation of each individual molecule can be
assigned to one of the discrete groups. One group can be assigned
to represent signals that do not fall within known patterns. For
example there may be groups for real base additions, a, c, g, and t
in extension assays.
[0363] One of the prime reasons that single molecule resolution
techniques are set apart from bulk methods is that they allow
access to individual molecules. The most basic information that can
be obtained is the frequency of occurrence of hits to a particular
group. In bulk analysis the signal is represented in analogue by an
(arbitrary) intensity value (from which a concentration may be
inferred) and this indicates the result of the assay in terms of,
say, a base call or it may indicate the level of a particular
molecule in the sample, by virtue of its calibrated interaction
profile. In contrast, the single molecule approach enables direct
counting and classification of individual events.
[0364] Moreover, digital data processing facilitates error
correction and temporal resolution of reactions at the array
surface. Thus, time-resolved microscopy techniques may be used to
differentiate between bona-fide reactions between probe and sample
and "noise" due to aberrant interactions which take place over
extended incubation times. The use of time-gated detection or
time-correlated single-photon counting is particularly preferred in
such an embodiment.
[0365] The invention accordingly provides a method for sorting
signals obtained from single molecule analysis according to the
confidence with which the signal may be treated. A high confidence
in the signal will lead to the signal being added to a PASS group
and counted; signals in which confidence is low are added to a FAIL
group and discarded, or used in error assessment.
[0366] Table 1 illustrates the processing of signals for error
analysis by example, for SNP typing by primer extension. The object
of the process represented by the flowchart is to eliminate errors
from the acquired image. The input for the process is one of the
four colours (representing each of four differentially labelled
ddNTPs) from the acquired image (after beam splitting). This
process is performed on each of the four split signals.
[0367] Signals that satisfy a number of criteria are put into a
PASS table. This PASS table is the basis for base calling after
counting the number of signals for each colour.
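Counting PASS signals per colour can be sketched as follows. The specification does not state how the counts are converted into a base call, so the rule used here (a base is called if it accounts for at least a chosen fraction of all PASS signals) and the 20% threshold are assumptions made only for illustration, as are the function name and the example counts.

```python
from collections import Counter


def call_bases(pass_signals, min_fraction=0.2):
    """Count PASS signals per base and call every base above a fraction threshold.

    pass_signals: iterable of base labels, one per PASS signal.
    Returns (counts, called_bases).
    """
    counts = Counter(pass_signals)
    total = sum(counts.values())
    if total == 0:
        return counts, ()
    called = tuple(sorted(base for base, n in counts.items()
                          if n / total >= min_fraction))
    return counts, called


# Hypothetical PASS table for one SNP site: mostly A and G, with a little noise.
signals = ["A"] * 48 + ["G"] * 45 + ["C"] * 4 + ["T"] * 3
counts, called = call_bases(signals)
print(dict(counts), called)
```

Two called bases in roughly equal numbers would be consistent with a heterozygous sample, and a single called base with a homozygous one.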
[0368] The FAIL table is made so that information about error rate
can be gathered. The five different types of errors can be
collected into separate compartments in the FAIL table so that the
occurrence of the different types of error can be recorded. This
information may aid experimental methods to reduce error, for
example it may reveal which is the most common type of error.
Alternatively, the failed signals may be discarded.
[0369] The five criteria that are used to assess errors are:
[0370] 1. If intensity is less than p, where p=a minimum threshold
intensity. This is a high-pass filter to eliminate low fluorescence
intensity artefacts.
[0371] 2. If intensity is greater than q, where q=a maximum intensity
threshold. This is a low-pass filter to eliminate high fluorescence
intensity artefacts.
[0372] 3. If time is less than x, where x=early time point. This is
to eliminate signals due to self-priming which would be expected to
occur early.
[0373] 4. If time is greater than z, where z=late time point. This is
to eliminate signals due to mis-priming of nucleotides which the
enzyme would be expected to incorporate over an extended period.
For example this may be due to priming by template on template
which would be a two-step process, involving hybridisation of the
first template to array and then hybridisation of the second
template molecule to the first template molecule.
[0374] 5. Nearest neighbour pixels are compared to eliminate those
in which signal is carried over multiple adjacent pixels which
would be indicative of signals from, for example, non-specific
adsorption of clumps or aggregates of ddNTPs.
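The five criteria can be sketched as a sorting routine that places each signal in a PASS list or in one of five FAIL compartments. This is a minimal illustration and not a definitive implementation: criterion 2 is read as rejecting intensities above the maximum threshold q, in line with its stated purpose, and the record fields (`intensity`, `time`, `adjacent_pixels_lit`), the compartment names and the threshold values are all hypothetical.

```python
from collections import defaultdict


def classify(signal, p, q, x, z):
    """Return 'PASS' or the name of the first criterion the signal fails."""
    if signal["intensity"] < p:
        return "below_min_intensity"      # criterion 1
    if signal["intensity"] > q:
        return "above_max_intensity"      # criterion 2
    if signal["time"] < x:
        return "too_early"                # criterion 3 (self-priming)
    if signal["time"] > z:
        return "too_late"                 # criterion 4 (mis-priming)
    if signal["adjacent_pixels_lit"] > 0:
        return "spans_adjacent_pixels"    # criterion 5 (clumps or aggregates)
    return "PASS"


def sort_signals(signals, p, q, x, z):
    """Split signals into a PASS list and a FAIL table keyed by error type."""
    passed, failed = [], defaultdict(list)
    for signal in signals:
        verdict = classify(signal, p, q, x, z)
        if verdict == "PASS":
            passed.append(signal)
        else:
            failed[verdict].append(signal)
    return passed, dict(failed)


signals = [
    {"intensity": 50, "time": 5.0, "adjacent_pixels_lit": 0},
    {"intensity": 5, "time": 5.0, "adjacent_pixels_lit": 0},
    {"intensity": 500, "time": 5.0, "adjacent_pixels_lit": 0},
    {"intensity": 50, "time": 0.5, "adjacent_pixels_lit": 0},
    {"intensity": 50, "time": 50.0, "adjacent_pixels_lit": 0},
    {"intensity": 50, "time": 5.0, "adjacent_pixels_lit": 2},
]
passed, failed = sort_signals(signals, p=10, q=100, x=1.0, z=20.0)
print(len(passed), {name: len(items) for name, items in failed.items()})
```

Keeping the failed signals in separate compartments allows the relative frequency of each error type to be recorded.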
[0375] The reaction is controlled by adjusting reaction components,
for example salt concentration, ddNTP concentration, temperature or
pH such that the incorporations occur within the time window
analysed.
[0376] If a single dye molecule, which photobleaches after a
defined time, is associated with each ddNTP, then an additional
sub-process may be added which eliminates signals that occur in the
same pixel over multiple time points.
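The additional sub-process can be sketched as a filter on the pixels in which signal persists. The example assumes that each detection event is recorded as a (pixel, time point) pair and that a genuine single dye is seen at no more than a chosen number of time points before it bleaches. The function name and the default limit are assumptions made for illustration.

```python
from collections import defaultdict


def drop_persistent_pixels(events, max_timepoints=1):
    """Remove events from pixels that show signal at too many time points.

    events: iterable of (pixel, timepoint) pairs, where pixel is e.g. (row, col).
    """
    events = list(events)
    timepoints_seen = defaultdict(set)
    for pixel, timepoint in events:
        timepoints_seen[pixel].add(timepoint)
    return [(pixel, timepoint) for pixel, timepoint in events
            if len(timepoints_seen[pixel]) <= max_timepoints]


events = [((3, 4), 1), ((7, 2), 1), ((7, 2), 2), ((7, 2), 3), ((9, 9), 2)]
print(drop_persistent_pixels(events))
```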
[0377] If the array is composed of elements, an additional process
can be used to organise the data into groupings representing the
array elements.
[0378] In the scheme described the system is configured such that a
single pixel measures a single molecule event (statistically, in
the large majority of cases). The system may be set up, for
example, such that several pixels are configured to interrogate a
single molecule.
[0379] Thus, in a preferred embodiment, the invention relates to a
method for typing single nucleotide polymorphisms (SNPs) and
mutations in nucleic acids, comprising the steps of:
[0380] a) providing a repertoire of probes complementary to one or
more nucleic acids present in a sample, which nucleic acids may
possess one or more polymorphisms;
[0381] b) arraying said repertoire on to a solid surface such that
each probe in the repertoire is resolvable individually;
[0382] c) exposing the sample to the repertoire and allowing
nucleic acids present in the sample to hybridise to the probes at a
desired stringency such that hybridised nucleic acid/probe pairs
are detectable;
[0383] d) imaging the array in order to detect individual
hybridised nucleic acid/probe pairs;
[0384] e) analysing the signal derived from step (d) and computing
the confidence in each detection event to generate a PASS table of
high-confidence results; and
[0385] f) displaying results from the PASS table to type
polymorphisms present in the nucleic acid sample.
[0386] Preferably, the confidence in each detection event is
computed in accordance with Table 1.
[0387] Advantageously, detection events are generated by labelling
the sample nucleic acids and/or the probe molecules, and imaging
said labels on the array using a suitable detector. Preferred
labelling and detection techniques are described herein.
[0388] Methods for Reducing Errors
[0389] Single molecule analysis allows access to specific
properties and characteristics of individual molecules and their
interactions and reactions. Specific features of the behaviour of a
particular molecular event on a single molecule may reveal
information about its origin. For enzymatic assays one example is
that there may be a slower rate of mis-incorporations than correct
incorporations. Another example is that there may be a different
rate of incorporations for self-priming compared to priming in
which the target forms the template. The rate characteristics of
self-priming are likely to be faster than from priming of sample.
This is because self-priming is a unimolecular reaction whereas
priming of sample DNA is bimolecular. Therefore if time-resolved
microscopy is performed, the time-dependence of priming can
distinguish self-priming and mis-priming from correct sample
priming. Alternatively, it might be expected that DNA priming from
the perfectly matched sample has the capacity to incorporate a
greater number of fluorescent dye NTPs in a multi-primer primer
extension approach (Dubiley et al Nucleic Acids Research 1999 27:
e19i-iv) than a mis-priming and a self-priming and so would give a
higher signal level or molecular brightness. It may be difficult
to differentiate between correct incorporation and
mis-incorporation in the mini-sequencing (multi-base approach)
because even though a wrong base may take longer to incorporate it
may be associated with the primer for the same length of time as
the correctly incorporated base. If the fluorescence intensity of a
ddNTP is quenched to some degree when it is incorporated then the
molecular brightness/fluorescence intensity may be a way of
distinguishing mis-incorporation, which may take longer to
become fixed, from correct incorporation.
[0390] Different means for reduction of errors may be engineered
into the system. For example, in genetic analysis, FRET probes can
be integrated at the allelic site. The conformation of a perfect
match allows the fluorescent energy to be quenched whereas the
conformation of a mismatch does not. The FRET probes may be placed
on a spacer, which can be configured to accentuate the distances of
FRET probes between matched and mismatched base pair sets.
[0391] Mismatch errors can be eliminated in some cases by
degradation by enzymes such as RNAse1.
[0392] In addition to false positive errors discussed above, false
negatives can be a major problem in hybridisation based assays.
This is particularly the case when hybridisation is between a short
probe and a long target, where the low stringency conditions
required to form a stable heteroduplex concomitantly promote the
formation of secondary structure in the target which masks binding
sites. The effects of this problem may be reduced by fragmenting
the target, incorporating analogue bases into target or probe,
manipulating buffers etc. Enzymes can help reduce false negatives
by trapping transient interactions and driving the hybridisation
reaction forward (Southern, Mir and Shchepinov, 1999, Nature
Genetics 21:s5-9). However, it is likely that false negatives will
remain to some level. As previously mentioned, because large-scale
SNP analysis without the need for PCR is enabled, the fact that some
SNPs do not yield data is not a major concern. For smaller scale
studies, effective probes may need to be pre-selected.
[0393] In cases where the amount of sample material is low, special
measures must be taken to prevent sample molecules from sticking to
the walls of the reaction vessel and other vessels used for
handling the material. These vessels can be silanised to reduce
sticking of sample material and/or can be treated in advance with
blocking material such as Denhardt's reagent or tRNA.
[0394] Alternative Methods for Detection and Decoding of
Results
[0395] The molecules may be detected, as mentioned above, using a
detectable label or otherwise, and correlating the position of the
label on an array with information about the nature of the arrayed
probe to which the label is bound. Further detection means may be
envisaged, in which the label itself provides information about the
probe which is bound without requiring positional information. For
example, each probe sequence may be constructed to comprise unique
fluorescent or other tags (or sets thereof), which are
representative of the probe sequence. Such encoding could be done
by stepwise co-synthesis of probe and tag by split and pool
combinatorial chemistry. Ten steps generate every 10-mer encoded
oligo (around 1 million sequences). Sixteen steps generate every
16-mer encoded oligo (around 4 billion sequences), each of which would
be expected to occur only once in the genome. Fluorescent tags that are used
for encoding could be of different colours or different fluorescent
lifetimes. Moreover, unique tags may be attached to individual
single molecule probes and used to isolate molecules on anti-tag
arrays. The anti tag arrays may be spatially addressable or
encoded.
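The sequence counts quoted above follow from there being four possible bases at each position, so that n synthesis steps give 4^n sequences. The short sketch below reproduces the arithmetic and estimates how often a given sequence would be expected to occur by chance. The genome size of 3 x 10^9 bases and the treatment of the genome as a random sequence read on one strand are assumptions made for this illustration.

```python
def number_of_n_mers(length, alphabet_size=4):
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


def expected_occurrences(length, genome_bases=3_000_000_000):
    """Expected chance occurrences of one n-mer in a random genome (one strand)."""
    return genome_bases / number_of_n_mers(length)


print(number_of_n_mers(10))        # 1048576, around 1 million
print(number_of_n_mers(16))        # 4294967296, around 4 billion
print(expected_occurrences(16))
```

Under these assumptions a given 16-mer is expected to occur less than once by chance, whereas a given 10-mer is expected to occur thousands of times.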
[0396] D. Assay Techniques and Uses
[0397] A further aspect of the present invention relates to assay
techniques based on single molecule detection. These assays may be
conducted using molecular arrays produced by the methods of the
invention or by any other suitable means.
[0398] The spatially addressable array is a way of capturing and
organizing molecules. The molecules can then be assayed in a
plethora of ways, including using any assay method which is
suitable for single molecule detection, such as those described in
WO0060114; U.S. Pat. No. 6,210,896; Watt Webb, Research Abstract:
New Optical Methods for Sequencing Individual Molecules of DNA, DOE
Human Genome Program Contractor-Grantee Workshop m, web
page: www.ornl.gov/hgmis/publicat/00santa/31.html on Feb. 5,
2001.
[0399] In general, the assay methods of the invention comprise
contacting a molecular array with a sample and interrogating all or
part of the array using the interrogation/detection methods
described above. Alternatively, the molecular array is itself the
sample and is subsequently interrogated with other molecules or
probes using the interrogation/detection methods described
above.
[0400] Many assay methods rely on detecting binding between
immobilised molecules in the array and target molecules in the
sample. However, other interactions that may be identified include,
for example, interactions that may be transient but which result in
a modification to the properties of an immobilised molecule in the
array, such as charge transfer.
[0401] Once the sample has been incubated with the array for the
desired period, the array may simply be interrogated (following an
optional wash step). However, in certain embodiments, notably
nucleic acid-based assays, the captured target molecules may be
further processed or incubated with other reactants. For example,
in the case of antibody-antigen reactions, a secondary antibody
which carries a label may be incubated with the array containing
antigen-primary antibody complexes.
[0402] Target molecules of interest in samples applied to the
arrays may include nucleic acids such as DNA and analogues and
derivatives thereof, such as PNA. Nucleic acids may be obtained
from any source, for example genomic DNA or cDNA or synthesised
using known techniques such as step-wise synthesis. Nucleic acids
may be single or double stranded. Other molecules include:
compounds joined by amide linkages such as peptides, oligopeptides,
polypeptides, proteins or complexes containing the same; defined
chemical entities, such as organic molecules; combinatorial
libraries; conjugated polymers, lipids and carbohydrates.
[0403] Due to the high sensitivity of the approach, specific
amplification steps can be eliminated if desired. Hence, in the
case of analysis of SNPs, extracted genomic DNA can be presented
directly to the array (a few rounds of whole genome amplification
may be desirable for some applications). In the case of gene
expression analysis normal cDNA synthesis methods can be employed
but the amount of starting material can be low. Genomic DNA is
typically fragmented prior to use in the methods of the present
invention. For example, the genomic DNA may be fragmented such that
substantially all of the DNA molecules are 1 Mb, 100 kb, 50 kb, 10
kb and/or 1 kb or less. Fragmentation may be achieved using
standard techniques such as passing the DNA through a narrow gauge
syringe, sonication, alkali treatment, free radical treatment,
enzymatic treatment (e.g. DNase I), or combinations thereof.
[0404] Target molecules may be presented as populations of
molecules. More than one population may be applied to the array at
the same time. In this case, the different populations are
preferably differentially labelled (e.g. cDNA populations labelled
with Cy5 or Cy3). In other cases such as analysis of pooled DNA,
each population may or may not be differently labelled.
[0405] A number of assay methods of the present invention are based
on hybridisation of analyte to the single molecules of the array
elements. The assay may stop at this point and the results of the
hybridisation analysed.
[0406] However, the hybridisation events may also form the basis of
further biochemical or chemical manipulations or hybridisation
events to enable further probing or to enable detection (e.g. a
sandwich assay). These further events include primer extension from
the immobilised molecule/captured molecule complex; hybridisation
of additional probes to the immobilised molecule/captured molecule
complex and ligation of additional nucleic acid probes to the
immobilised molecule/captured molecule complex.
[0407] For example, following specific capture (by hybridisation or
hybridisation plus enzymatic or chemical attachment) of a single
target strand by immobilised oligonucleotide, further analysis can
be performed on the target molecule. This can be done on an
end-immobilised target (or a copy thereof--see below).
Alternatively, the immobilised oligonucleotide anchors the target
strand, which is then able to interact with a second (or further)
immobilised oligonucleotide, thereby causing the
target strand to lie horizontally. Where the different immobilised
oligonucleotides are different allelic probes for different loci,
the target strand can be allelically defined at multiple loci.
[0408] The target strand can also be horizontalised and
straightened, after being anchored by immobilised oligonucleotide,
by various physical methods known in the art.
[0409] In one embodiment, following hybridisation the array
oligonucleotide can be used as a primer to produce a permanent copy
of the bound target molecule which is covalently fixed in place and
is addressable.
[0410] Single molecule counting of these assays will allow even a
rare polymorphism/mutation in a homogeneous population to be
detected.
[0411] Some specific assay configurations and uses are described
below.
[0412] Nucleic Acid Arrays and Accessing Genetic Information
[0413] To interrogate sequence, in most cases the target must be in
single stranded form. The exception would be cases such as triplex
formation, binding of proteins to duplex DNA (Taylor J R, Fang, M M
and S. Nie, 2000, Anal. Chem. 72:1979-1986), or sequence
recognition facilitated by RecA (see Seong et al., 2000, Anal.
Chem. 72: 1288-1293) or by the use of PNA probes (Bukanov et al,
1998, [PNAS] 95: 5516-5520; Cherny et al, 1998, Biophysical Journal
1015-1023). Also, the detection of mismatches in annealed duplexes
by MutS protein has been demonstrated (Sun, HBS and H Yokota, 2000,
Anal. Chem 72: 3138-3141). Long RNAs (e.g. mRNA) can form R-loops
inside linear ds DNA and this may be the basis for mapping of genes
on arrayed genomic DNA. Where a double stranded DNA target is
arrayed, it may be necessary to provide suitable conditions to
partially disrupt the native base-pairing in the duplex to enable
hybridisation to the probe to occur. This may be achieved by heating
the surface/solution of the substrate, manipulating salt
concentration/pH or applying an electric field to melt the
duplex.
[0414] One preferred method for probing sequences is by probing
double stranded DNA using strand invasion locked nucleic acid (LNA)
or peptide nucleic acid (PNA) probes under conditions where
transient breathing nodes in the duplex structure can arise, such
as at 50-65 °C in 0-100 mM monovalent cation.
[0415] There are several methods that have been described to
stretch out double stranded DNA so that it can be interrogated
along its length. Methods include optical trapping, electrostatic
trapping, Molecular Combing (Bensimon et al Science 1994 265:
2096-2098), forces within an evaporating droplet/film (Yokota et al
Anal. Biochem 1998 264:158-164; Jing et al PNAS 1998 95:
8046-8051), centrifugal force and moving the air-water interface by
a jet of air (Li et al Nucleic Acids Research 1998 26:
4785-4786).
[0416] Molecular Combing involves surface tension created by
a moving air-water interface/meniscus, and a modification to the
basic technique has been used to stretch out several hundred
haploid genomes on a glass surface (Michalet et al Science 1997
277: 1518-1523).
[0417] Relatively fewer methods have been described for
single-stranded DNA. Woolley and Kelly (Nanoletters 2001 1: 345-348)
achieve elongation of ssDNA by translating a droplet of DNA
solution linearly across a mica surface coated with positive charge.
The forces exerted on ssDNA are thought to be from a combination of
fluid flow and surface tension at the travelling air-water
interface. The forces within fluid flow can be sufficient to
stretch out a single strand in a channel. Capillary forces can be
used to move solutions within channels.
[0418] DNA can be combed onto a surface by one of the standard
methods (eg as described by Bensimon et al., above) and this is
followed by processes to acquire genetic or sequence information
from the single molecules.
[0419] In a preferred embodiment, however, the combing on the
surface is performed as follows:
[0420] Oligonucleotide probes are spread and fixed randomly on the
surface of the substrate. The DNA to be combed is then captured
from solution, by hybridisation to these oligonucleotides.
Following capture the DNA is combed onto the surface. In a
preferred embodiment the DNA is combed by flow of fluids over the
surface. The combed DNA can optionally be dehydrated and fixed on
the surface. This is shown schematically in FIG. 8.
[0421] Methods described herein for hybridisation to single or
double stranded DNA can be used. For "capture combing" of Lambda
DNA, oligonucleotides complementary to either of the "sticky" ends
of Lambda phage DNA can be used. Similarly, genomic DNA can be
captured by digesting it with specific restriction enzymes and
using oligonucleotide capture probes complementary to the overhangs
which are generated. The advantage of this capture combing for
randomly immobilising nucleic acids on a surface is that it enables
a very homogeneous spread of the DNA on substantially all of the
surface. This is in contrast to the patchy coverage typically
achieved by standard combing methods.
[0422] Moreover, specific probes can be used which allow combing of
only specific desired types of nucleic acids from a complex mixture
of nucleic acids. For example, if the capture probe is oligo d(T),
it specifically captures polyadenylated mRNA.
[0423] These methods, in addition to stretching out DNA, overcome
intermolecular secondary structures which are prevalent in ssDNA
under conditions required for hybridisation.
[0424] An alternative way of overcoming secondary structure
formation of nucleic acids on a surface would be by heating the
surface of the substrate or applying an electric field to the
surface.
[0425] The majority of the assays described below do not
require the molecules to be linearised, as positional information
along the molecule's length is not required. In the cases where
positional information is required, DNA needs to be
linearised/horizontalised. The attachment to more than one surface
immobilised probe will facilitate the process. Double stranded
targets can be immobilised to probes having sticky ends such as
those created by restriction digestion.
[0426] In one embodiment, following capture by an immobilised
oligonucleotide, a target strand is straightened. This can be done
on a flat surface by molecular combing. In one embodiment the
probes could be placed on a narrow line on the left most side of an
array element and then the captured molecules would be stretched
out in rows from the left side to the right side by moving an air-water
interface from left to right.
[0427] Alternatively the captured target can be stretched out in a
channel or capillary where the capture probes are attached to (one
or more) walls of the vessel and the physical forces within the
fluid cause the captured target to stretch out. The target
molecules could be stretched out thus by methods that do not rely
on probe capture; instead, an oligo that is 5' phosphorylated can be
made to attach to appropriately derivatised surfaces under acidic
pH conditions. These conditions may be created with fluid flow
within a channel/capillary to immobilise and stretch out a target
strand. Fluid flow may facilitate mixing and this would make
hybridisation and other processes more efficient. Reactants could
be recirculated within the channels during the reactions.
[0428] Although in a number of embodiments described below,
interrogation of multiple sites of a target nucleic acid is
achieved by separate binding of multiple copies of that target
nucleic acid to various elements in the array, in one embodiment, a
single nucleic acid molecule may be simultaneously interrogated at
multiple loci by binding to multiple elements suitably spatially
placed (the construction of arrays with a suitable layout is
described in section A). This type of detection may conveniently be
applied to SNP detection, haplotyping and sequence determination.
Various aspects are discussed below under individual headings but
are typically broadly applicable to any detection technique where
simultaneous interrogation of a single molecule at multiple sites
is desired.
[0429] 1. Resequencing and/or Typing of Single-Nucleotide
Polymorphisms (SNPs) and Mutations
[0430] a. Hybridisation
[0431] The organisation of the array would typically follow the
known art as taught by Affymetrix (e.g. Lipshutz et al., Nature
Genetics 1999 21: s20-24; Hacia et al., Nature Genetics 21: s42-47)
for SNP resequencing or typing. In short, an SNP may be analysed
with a block of array elements containing defined probes, in the
simplest form, with probes to each known or possible allele. This
may include substitutions and simple deletions or insertions.
However, whereas the Affymetrix techniques require complex tiling
paths to resolve errors, advanced versions of the single molecule
approach may suffice with simpler arrays, as other means for
distinguishing errors may be used. Transient interactions can also
be recorded.
[0432] Typically the oligonucleotides will be between about 17 and
25 nucleotides in length although longer or shorter probes may be
used in some instances.
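As an illustration of the simplest block design described above, the following Python sketch builds one probe per allele for a single SNP. It is a sketch under stated assumptions: the function names, the choice to centre the SNP in the probe, and the default length of 21 nucleotides (within the 17 to 25 range given above) are hypothetical and are not prescribed by this specification.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def allele_probes(left_flank: str, right_flank: str, alleles: str,
                  length: int = 21) -> dict:
    """Return {target allele: probe} with the SNP centred in each probe.

    left_flank and right_flank are the target-strand sequences on either
    side of the SNP; each probe is the reverse complement of the target window.
    """
    half = (length - 1) // 2
    rest = length - 1 - half
    if half < 1 or len(left_flank) < half or len(right_flank) < rest:
        raise ValueError("flanking sequence too short for requested length")
    left = left_flank[-half:]
    right = right_flank[:rest]
    return {a: reverse_complement(left + a + right) for a in alleles}


# Short probes are used here only to keep the example readable.
print(allele_probes("ACGTACGTAC", "GGATCCGGAT", "AG", length=5))
# {'A': 'CCTGT', 'G': 'CCCGT'}
```

Each returned probe would occupy its own array element in the multi-element layout, or would carry its own dye in the mixed-probe layout.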
[0433] In a different implementation, a mix of probes complementary
to all alleles is placed within a single array element. Each probe
comprising a different allele is distinguishable from the other
probes, e.g. each single molecule of a particular allele will have
a specific dye associated with it. A single molecule assay system
of the invention allows this space saving operation and would be
simple to do when pre-synthesised oligos are spotted on the
array.
[0434] The probe can be appended with sequences that would
promote its formation into a secondary structure that would
facilitate the discrimination of mismatch (e.g. a stem loop
structure where the probe sequence is in the loop).
[0435] The following are typical reaction conditions that can be
used: 1M NaCl or 3-4.4 M TMACl in Tris Buffer, target sample, 4 to
37 °C in a humid chamber for 30 mins to overnight.
[0436] It is recognised that hybridisation of rare species is
discriminated against under conventional reaction conditions,
whilst species that are rich in A-T base pairs are not able to
hybridise as effectively as G-C rich sequences. Certain buffers are
capable of equalising hybridisation of rare and A-T rich molecules,
to achieve more representative outcomes in hybridisation reactions.
The following components may be included in hybridisation buffers
to improve hybridisation with positive effects on specificity
and/or reduce the effects of base composition and/or reduce
secondary structure and/or reduce non-specific interactions and/or
facilitate enzyme reactions:
[0437] 1M Tripropylamine acetate
[0438] N,N-dimethylheptylamine
[0439] 1-Methylpiperidine
[0440] LiTCA
[0441] DTB
[0442] C-TAB
[0443] Betaine
[0444] Guanidinium isothiocyanate
[0445] Formamide
[0446] Tetramethylammonium chloride (TMACl)
[0447] Tetraethylammonium chloride (TEACl)
[0448] Sarkosyl
[0449] SDS (Sodium dodecyl sulphate)
[0450] Denhardt's reagent
[0451] Polyethylene glycol
[0452] Urea
[0453] Trehalose
[0454] Cot DNA
[0455] tRNA
[0456] N,N-dimethylisopropylamine acetate.
[0457] Buffers containing N,N-dimethylisopropylamine acetate are
very good for specificity and base composition. Related compounds
with similar structure and arrangement of charge and/or
hydrophobic groups may also be used.
[0458] Probes are chosen, where possible, to have minimal potential
for secondary structure and cross hybridisation with non-targeted
sequences.
[0459] Where the target molecules are genomic DNA and specific PCRs
are not used to enrich the SNP regions of choice, measures need to
be taken to reduce complexity. The complexity is reduced by
fragmenting the target and pre-hybridising it to C0t-1 DNA.
Other methods are described by Cantor and Smith (Genomics, The
Science and Technology Behind the Human Genome Project 1999; John
Wiley and Sons). It may also be useful to perform whole genome
amplification prior to analysis.
[0460] The probes would preferentially be morpholino, locked
nucleic acids (LNA) or peptide nucleic acids (PNA).
[0461] Molecules and their products may be immobilized and
manipulated on a charged surface such as an electrode. Applying an
appropriate bias to the electrode may speed up hybridization and
aid in overcoming secondary structure when the bulk solution is at
high stringency. Switching polarity would aid in preferentially
eliminating mismatches.
[0462] b. Stacking Hybridisation
[0463] Adding either sequence specific probes or a complete set of
probes in solution that will coaxially stack onto the immobilised
probe, templated by the target, can increase the stability and
specificity of the hybridisation. There is a stability factor
associated with stacking and this is abrogated if there is a
mismatch present between the immobilised probe and the solution
probe. Therefore mismatch events can be distinguished by use of
appropriate temperatures and sequence.
[0464] The probe can be appended with sequences that configure it
to form a secondary structure such that it provides a coaxial
stacking interface onto which the end of a target is juxtaposed.
This may be a favourable approach when the target is
fragmented.
[0465] It may be advantageous to use LNA probes as these may
provide better stacking features due to their pre-configured
"locked" structure.
[0466] The following are typical reaction conditions that can be
used: 1M NaCl in Tris Buffer; 1 to 10 nM (or higher concentration)
stacking oligonucleotide; target sample; 4-37 °C for 30 min to
overnight.
[0467] c. Primer Extension
[0468] This is a means for improving specificity at the free end of
the immobilised probe and for trapping transient interactions.
There are two ways that this can be applied. The first is the
multiprimer approach, where as described for hybridisation arrays,
there are separate array elements containing single molecules for
each allele.
[0469] The second is the multi-base approach in which a single
array contains a single species of primer whose last base is
upstream of the polymorphic site. The different alleles are
distinguished by incorporation of different bases each of which is
differentially labelled. This approach is also known as
mini-sequencing.
[0470] The following reaction mix and conditions can be used:
5× polymerase buffer, 200 mM Tris-HCl pH 7.5, 100 mM
MgCl2, 250 mM NaCl, 2.5 mM DTT; ddNTPs or dNTPs (multibase);
dNTPs (multiprimer); Sequenase V.2 (0.5 U/µl) in polymerase
dilution buffer; target sample; 37 °C for 1 hr.
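When setting up such a reaction, the concentrated buffer is diluted to its working strength. The Python sketch below shows that arithmetic. It assumes that the millimolar values listed above describe the 5-fold concentrated stock, which the text does not state explicitly, and the names used are hypothetical.

```python
# Assumed composition of the 5-fold concentrated stock, in mM.
STOCK_MM = {"Tris-HCl": 200.0, "MgCl2": 100.0, "NaCl": 250.0, "DTT": 2.5}


def working_concentrations(stock_mm: dict, fold: float = 5.0) -> dict:
    """Concentration of each component after diluting the stock `fold` times."""
    return {name: conc / fold for name, conc in stock_mm.items()}


def stock_volume_ul(final_volume_ul: float, fold: float = 5.0) -> float:
    """Volume of concentrated stock needed for a given final reaction volume."""
    return final_volume_ul / fold


print(working_concentrations(STOCK_MM))
# {'Tris-HCl': 40.0, 'MgCl2': 20.0, 'NaCl': 50.0, 'DTT': 0.5}
print(stock_volume_ul(20.0))  # 4.0
```

For example, under this assumption a 20 µl reaction would contain 4 µl of the concentrated buffer.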
[0471] d. Ligation Assay
[0472] Ligation (chemical or enzymatic) is another means for
improving specificity and for trapping transient interactions. Here
the target strand is captured by the immobilised oligonucleotide
and then a second oligonucleotide is ligated to the first, in a
target dependent manner. There are two ways that this can be
applied. In the first type of assay, the "second" oligonucleotides
that are provided in solution are complementary in the region of
the known polymorphisms under investigation. One oligo of either
the array oligos or the "second" solution oligonucleotide will
overlap the SNP site and the other will end one base upstream of
it.
[0473] In the second type of assay, the second oligonucleotides in
solution comprise the complete set, every oligonucleotide sequence
of a given length. This would allow analysis of every position in
the target. It may be preferable to use all sequences of a given
length where one or more nucleotides are LNA.
[0474] A typical ligation reaction is as follows: 5× ligation
buffer, 100 mM Tris-HCl pH 8.3, 0.5% Triton X-100, 50 mM MgCl2, 250
mM KCl, 5 mM NAD+, 50 mM DTT, 5 mM EDTA; solution oligonucleotide
5-10 pmol; Thermus thermophilus DNA ligase (Tth DNA ligase) 1 U/µl;
target sample; between 37 °C and 65 °C for 1 hr.
[0475] Alternatively, stacking hybridisation can be performed first
in high salt: 1M NaCl, 3-4.4M TMACl, 5-10 pmol solution
oligonucleotide, target sample.
[0476] After washing of excess reagents from the array under
conditions that would retain the solution oligonucleotide, the
above reaction mix minus solution oligonucleotide and target sample
is added to the array.
[0477] Combining the Power of Different Assay Methods
[0478] The power of primer extension and ligation can be combined
in a technique called gap ligation (the processivity and
discriminatory power of two enzymes combined). Here a first and a
second oligonucleotide are designed that hybridise in close
proximity on the target but with a gap of preferably a single base.
The last base of one of the oligonucleotides ends one base upstream
or downstream of the polymorphic site. In cases where it ends
downstream, the first level of discrimination is through
hybridisation. Another level of discrimination occurs through
primer extension which extends the first oligonucleotide by one
base. The extended first oligonucleotide now abuts the second
oligonucleotide. The final level of discrimination occurs where the
extended first oligonucleotide is ligated to the second
oligonucleotide.
[0479] Alternatively the ligation and primer extension reactions
described in c. and d. above can be performed simultaneously, with
some molecules of the array giving results due to ligation and
others giving results due to primer extension, within the same
array element. This would be a way to increase confidence in the
base call, being made independently by two assay/enzyme systems.
The products of ligation may be labelled differently from the
products of primer extension.
[0480] The primer or ligation oligonucleotides may be deliberately
designed to have a mismatched base at a site other than the base that
serves to interrogate the polymorphic site. This would serve to
reduce error, as a duplex with two mismatched bases is considerably less
stable than a duplex with only one mismatch.
[0481] It may be desirable to use probes that are fully or
partially composed of LNA (which have improved binding
characteristics and are compatible with enzymes) in the above
described enzymatic assays.
[0482] The invention provides a method for SNP typing which enables
the potential of genomic SNP analysis to be realised in an
acceptable time-frame and at affordable cost. The ability to type
SNPs through single-molecule recognition intrinsically reduces
errors due to inaccuracy and PCR-induced bias which are inherent in
mass-analysis techniques. Moreover, if errors occur which leave a
percentage of SNPs untyped, assuming errors are random with regard
to position of SNP in the genome, the fact that the remaining SNPs
are typed without the need to perform individual (or multiplexed)
PCR still confers an advantage. It allows large-scale association
studies to be performed in a time- and cost-effective way. Thus,
all available SNPs may be tested in parallel and data from those in
which there is confidence selected for further analysis.
[0483] There is a concern that duplicated regions of the genome may
lead to errors, where the results of an assay may be biased by DNA
from a duplicated region. The direct assay of the genome by single
molecule detection is no more susceptible to this problem than
assays utilising PCR since in most instances the way PCR is
commonly designed, a small segment surrounding the SNP site is
amplified (this is necessary to achieve multiplex PCR). However,
with the availability of the genome sequence, this would be less of
a problem as in some cases it may be possible to select
non-duplicated regions of the genome for analysis. In other cases,
the sources of bias would be known and so could be accounted
for.
[0484] If signal is obtained from probes or labels representing
only one allele then the sample is likely to be homozygous. If it
is from both, in substantially a 1:1 ratio then the sample is
likely to be heterozygous. As the assays are based on single
molecule counting, highly accurate allele frequencies can be
determined when DNA pooling strategies are used. In these cases the
ratio of molecules might be 1:100. Similarly, a rare mutant allele
in a background of the wild-type allele might be found at a
ratio of molecules of 1:1000.
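The counting logic described above can be sketched in Python as follows. This is a minimal illustration only: the function names and the threshold values (hom_margin and het_window) are hypothetical assumptions chosen for the example and are not values given in this specification.

```python
def allele_b_fraction(count_a: int, count_b: int) -> float:
    """Fraction of counted single molecules that report allele B."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("no molecules counted")
    return count_b / total


def call_genotype(count_a: int, count_b: int,
                  hom_margin: float = 0.05, het_window: float = 0.15) -> str:
    """Call an individual sample from single molecule counts.

    hom_margin and het_window are assumed, illustrative thresholds.
    """
    frac_b = allele_b_fraction(count_a, count_b)
    if frac_b <= hom_margin:
        return "homozygous A"
    if frac_b >= 1 - hom_margin:
        return "homozygous B"
    if abs(frac_b - 0.5) <= het_window:
        return "heterozygous"
    return "no call"


print(call_genotype(1000, 0))    # homozygous A
print(call_genotype(510, 490))   # heterozygous
print(call_genotype(800, 200))   # no call
```

For pooled DNA or rare mutant detection, the value returned by allele_b_fraction would be reported directly rather than a genotype call, for example a fraction close to 0.001 for a 1:1000 ratio.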
[0485] Tagging Mismatches
[0486] An alternative means for selecting SNPs or mutations is
to detect the sites of mismatches. When a heterozygous sample DNA
(one or both of which contain 2'-amine substituted nucleotides) is
denatured and re-annealed to give heteroduplexes, the mismatch sites
can be tagged by 2'-amine acylation (or, more preferably, an unknown
sample DNA can be hybridised to modified tester DNAs of known
sequence). This is made
possible by the fact that acylation occurs preferentially at flexible
positions in DNA and less preferentially in double stranded constrained
regions (John D and K Weeks, Chem. Biol. 2000, 7: 405-410). This
method could be used to place bulky tags onto sites of mismatch on
DNA that has been horizontalised. These sites may then
be detected by, for example, AFM. When this is applied genome-wide,
the genome would be sorted by array probes or the identity of
fragments obtained by use of encoded probes.
[0487] Homogeneous Assays
[0488] Low background fluorescence and the elimination of the need
for post-assay processing to remove unreacted fluorescent labels
can be achieved by two approaches. The first is the use of
Molecular Beacons (Tyagi et al Nat. Biotechnol. 1998, 16:49-53) and
other molecular structures comprising Dye-Dye interactions in which
fluorescence is only emitted in the target bound state and is
quenched when the structure is unbound by the target. In practice a
fraction of the molecular beacons will fluoresce and so an image
may need to be taken before adding targets to the array to make a
record of false positives.
[0489] The second is the analysis of fluorescence polarization of a
dye labelled molecule (Chen et al Genome Res. 1998, 9: 492-98). For
example, in a mini-sequencing assay, free and incorporated dye
labels exhibit different rotary behaviour. When the dye is linked
to a small molecule such as a ddNTP, it is able to rotate rapidly,
but when the dye is linked to a larger molecule, as it would be if
added to the primer by incorporation of the ddNTP, rotation is
constrained. A stationary molecule transmits back into a fixed
plane, but rotation depolarises the emitted light to various
degrees. An optimal set of four dye terminators is available where
different emissions can be discriminated. These approaches can be
configured within single molecule detection regimes. Other
homogeneous assays are described by Mir and Southern (Ann Rev.
Genomics and Human Genetics 2000, 1: 329-60). The principles
inherent in pyrosequencing (Ronaghi M et al Science, 1998, 363-365)
may also be applicable to single molecule assays.
[0490] 2. Haplotyping
[0491] Two or more polymorphic sites on the same DNA strand can be
analysed. This may involve hybridisation of oligos to the different
sites but each labelled with different fluorophores. As described,
the enzymatic approaches could equally be applied to these
additional sites on the captured single molecule.
[0492] In one embodiment each probe in a biallelic probe set may be
differentially labelled and these labels are distinct from the
labels associated with probes for the second site. The assay may
be read out simultaneously by splitting, by wavelength, the emission
obtained from the same foci or from a focal region
defined by the 2-D radius of projection of a DNA target molecule
immobilised at one end. This radius is defined by the distance
between the site of immobilized probe and the second probe. If the
probes from the first biallelic set are removed or their fluors
photobleached then a second acquisition can be made with the second
biallelic set, which in this case does not need labels that are
distinct from labels for the first biallelic set. In another
embodiment, haplotyping can be performed on single molecules
captured on allele-specific microarrays. Haplotype information can
be obtained for nearest neighbour SNPs by, for example, determining
the first SNP by spatially addressable allele-specific probes (see
FIG. 7a). The labelling is due to the allelic probes (which are
provided in solution) for the second SNP. The foci
colour detected within a SNP 1 allele-specific spot determines
spot determines the allele for the first SNP and then colour of
foci within the microarray spot determines the allele for the
second SNP. If the captured molecule is long enough and the array
probes are far enough apart, then further SNP allele-specific probes,
each labelled with a different colour, can be resolved by
co-localization of signal to the same foci.
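The two-step decoding described above (spot position gives the first SNP allele, foci colour gives the second) can be sketched in Python as below. The spot names, the dye-to-allele assignments and the alleles themselves are hypothetical and serve only to illustrate the lookup.

```python
# Hypothetical lookup tables for one pair of neighbouring SNPs.
SPOT_TO_SNP1_ALLELE = {"spot_1": "A", "spot_2": "G"}
COLOUR_TO_SNP2_ALLELE = {"Cy3": "C", "Cy5": "T"}


def decode_haplotype(spot: str, colour: str) -> str:
    """Return the two-SNP haplotype reported by a single focus."""
    return SPOT_TO_SNP1_ALLELE[spot] + COLOUR_TO_SNP2_ALLELE[colour]


# Each tuple is one detected focus: (spot it lies in, colour observed).
foci = [("spot_1", "Cy3"), ("spot_2", "Cy5"), ("spot_1", "Cy3")]
haplotypes = [decode_haplotype(spot, colour) for spot, colour in foci]
print(haplotypes)  # ['AC', 'GT', 'AC']
```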
[0493] More extensive haplotypes, for three or more SNPs, can be
reconstructed from analysis of overlapping nearest neighbour SNP
haplotypes (see FIG. 7b) or by further probing with differently
labeled probes on the same molecule.
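One possible way to assemble extended haplotypes from overlapping nearest neighbour pairs is sketched below in Python. The function name and the input format are assumptions made for illustration. The sketch refuses to chain when the shared SNP does not identify a unique continuation (for example when the sample is homozygous at that SNP), since in that case the pairs alone do not resolve phase.

```python
def chain_haplotypes(pair_sets):
    """Join overlapping two-SNP haplotypes into longer haplotypes.

    pair_sets[i] lists the (allele at SNP i, allele at SNP i+1) pairs
    observed for that neighbouring pair of SNPs.
    """
    haplotypes = [[left, right] for left, right in pair_sets[0]]
    for pairs in pair_sets[1:]:
        extended = []
        for hap in haplotypes:
            # Continue along the pair whose first allele matches the shared SNP.
            matches = [right for left, right in pairs if left == hap[-1]]
            if len(matches) != 1:
                raise ValueError("ambiguous or missing overlap at shared SNP")
            extended.append(hap + matches)
        haplotypes = extended
    return ["".join(hap) for hap in haplotypes]


pairs_12 = [("A", "C"), ("G", "T")]   # SNP 1 with SNP 2
pairs_23 = [("C", "T"), ("T", "A")]   # SNP 2 with SNP 3
print(chain_haplotypes([pairs_12, pairs_23]))  # ['ACT', 'GTA']
```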
[0494] Sample molecules may be pre-processed to bring distal sites
into closer vicinity. For example this can be done by appropriate
modular design of PCR or ligation probes. For example, the modular
ligation probe would have a 5' sequence that would ligate to one
site and the 3' portion would have a sequence that would ligate at
a distal site on the target. Use of such modular probes would
juxtapose two distal elements of interest and cut out the
intervening region that is not of interest.
[0495] In the case where the target has been horizontalised, the
labels associated with the first locus need not be distinct from
labels associated with subsequent loci; the position specifies the
identity.
[0496] In another embodiment of the invention, single nucleic acid
molecules may be simultaneously multiply probed by suitable spatial
placement of probes at distinct locations. For example, four SNPs
could be interrogated, each 10 kb apart along a 40 kb DNA molecule.
The ~3 micron spacing of these SNPs could be replicated in
the spacing of patches of probes on the surface that would interact
with the SNPs. If all SNPs were interrogated, which would occur every
~1000 bp, then the spacing of SNPs and probes on the surface would be
~300 nm.
Moreover, each allele of the SNP would be represented in cells, one
above the other and the series of probes against consecutive SNPs
on the target molecule would run sequentially from left to right (or
right to left) on the surface. Here the alleles (hence haplotype)
present on a single molecule may be revealed by looking at the
target strand on the surface. For example it may be complementary
to probes on the bottom for the first two SNPs but complementary to
the top positions at the third SNP and fourth SNP as shown in FIG.
1A (see FIG. 1B for an alternative path). By tracing the path taken
by the strand, which is guided by hybridisation to perfect
complement on the surface (see FIG. 1C), the haplotype can be
obtained. The target DNA strand could be directly visualised by AFM
or may be labelled with a fluorescent dye, e.g. YO-YO or TO-TO dyes
(Molecular Probes Inc.), and analysed by optical microscopy. An
alternative, and one which would be conducive to an integrated
device, would be to place each probe on a nanoelectrode, use redox
mediators in solution and then measure the change in cyclic
voltammetry or other electronic measure to indicate hybridisation.
The target strand would trace a path along upper or lower
electrodes depending on which allele is present on the strand.
Hybridisation with a single probe molecule on the electrode would
be detected through charge transfer to the nanoelectrode for
example. The footprint due to the path of the DNA strand would be
revealed by the spatial location of the electrodes that give
signal. Alternatively, and as described above, the DNA molecules
can serve as template for deposition of conducting materials and
subsequent determination of through which electrode-probe pairs
current can flow due to a circuit being made.
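By way of illustration only, the spacing and path-tracing logic
described above may be sketched as follows. The sketch assumes a
rise of about 0.34 nm per base pair for stretched B-form DNA; the
allele layout, the function names and the example footprint are
hypothetical and are not taken from the figures.

```python
# Illustrative sketch only; all names and values are assumptions.
NM_PER_BP = 0.34  # assumed rise per base pair of stretched B-form DNA


def surface_spacing_nm(separation_bp):
    """Approximate distance on the surface between two loci."""
    return separation_bp * NM_PER_BP


def trace_haplotype(footprint, alleles):
    """Read a haplotype from the row ('top' or 'bottom') that gave
    signal at each SNP column, taken from left to right."""
    return [alleles[i][row] for i, row in enumerate(footprint)]


# Hypothetical layout for four SNPs: one allele cell above the other.
alleles = [
    {"top": "A", "bottom": "G"},
    {"top": "C", "bottom": "T"},
    {"top": "G", "bottom": "A"},
    {"top": "T", "bottom": "C"},
]

# Path of the example in the text: bottom, bottom, top, top.
footprint = ["bottom", "bottom", "top", "top"]
haplotype = trace_haplotype(footprint, alleles)

print(surface_spacing_nm(10_000))  # spacing for SNPs 10 kb apart, in nm
print(haplotype)
```

Under these assumptions a 10 kb separation corresponds to roughly
3.4 microns, in line with the ~3 micron figure given above.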
[0497] Where the probes are labelled with a detectable label, such
as a fluorophore, the haplotype is given by the spatial coordinates
of the fluorescent footprint. The patch of probes may be of high
density but only the single immobilised molecule that interacts
with the single target molecule would be a functionally active
molecule of the array. It would be possible to obtain haplotype
frequencies by this method in two ways. Firstly haplotype
frequencies of nearest neighbour SNPs could be obtained where
multiple single molecules occupy the patch sets. A haplotype of
greater than two nearest neighbours would be difficult to obtain as
there may be crossover of molecules. The second way of obtaining
haplotype frequencies would be to have multiple copies of the patch
sets on the surface which each interrogate a single molecule
only.
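The second way of obtaining haplotype frequencies amounts to
counting the haplotype read from each copy of the patch set. A
minimal sketch follows, assuming that each copy has interrogated
exactly one molecule and that the haplotype has already been read
out; the function name and the observations are hypothetical.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Relative frequency of each haplotype, one observation per
    patch-set copy (each copy assumed to hold a single molecule)."""
    counts = Counter(tuple(h) for h in haplotypes)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


# Hypothetical read-outs from four copies of the patch set.
observed = [
    ("G", "T", "G", "T"),
    ("G", "T", "G", "T"),
    ("A", "C", "G", "T"),
    ("G", "T", "G", "T"),
]
freqs = haplotype_frequencies(observed)
print(freqs)
```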
[0498] A limitation of DNA pooling methods for genotyping is that
because individual genotypes are not analysed, the estimation of
haplotypes is complicated. However, in the methods described in the
present invention, DNA pooling strategies could be used to obtain
haplotype frequencies.
[0499] 3. Fingerprinting
[0500] A captured target strand can be further characterised and
uniquely identified by further probing by hybridisation or other
means. The particular oligonucleotides that associate with the
target strand provide information about the sequence of the target.
This can be done by multiple acquisitions with similarly labelled
probes (e.g. after photobleaching or removal of the first set) or
simultaneously with differentially labelled probes. A set of
oligonucleotides, which are differentially labelled could be
specifically used for simultaneous fingerprinting.
[0501] Again, individual molecules may be simultaneously multiply
probed as described for haplotyping.
[0502] 4. Nucleic Acid Sequencing
[0503] Capture of DNA molecules would be the basis for complete or
partial sequence determination of the target by various means. The
captured DNA can be sequenced by determining interactions by
Watson-Crick base pairing, serially to a complete set of sequences,
e.g. every 6-mer.
[0504] For example, a mixture of two or more probes could be placed
within the array element. The plating density would be such that
individual probe molecules would be sufficiently spaced to capture
a single molecule at defined points. Alternatively, two or more
probes could be placed at defined positions within an array
element, as a means to stretch out target DNA by hybridisation to
these probes. The horizontal molecule could then be characterised
by, for example, using fluorescent probes or tagged probes (as
described below). Each array element would address an individual
fragment from the genome. This could form the basis of resequencing
the genome using SPM or a high resolution optical method. If the
array has one million sites, then it will typically be necessary to
fragment human genomic DNA into 3000 bp lengths to cover the entire
genome. For a 50,000 element array, 60 kb fragments would cover the
entire genome. The method for sequencing and sequence
reconstruction is given in the section below.
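The relationship between the number of array elements and the
fragment length can be checked with a short calculation. The
sketch assumes a genome of about 3,000,000,000 bp, which is the
value implied by the figures above, and one fragment per array
element; the names are illustrative.

```python
# Assumed approximate genome size implied by the figures in the text.
HUMAN_GENOME_BP = 3_000_000_000


def fragment_length_bp(n_elements, genome_bp=HUMAN_GENOME_BP):
    """Fragment length needed for n_elements fragments, one per
    array element, to span the genome once (rounded up)."""
    return -(-genome_bp // n_elements)  # integer ceiling division


print(fragment_length_bp(1_000_000))  # one million sites
print(fragment_length_bp(50_000))     # 50,000 element array
```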
[0505] The target DNA may be substantially a double stranded
molecule and probing may be by strand invasion with PNA or LNA.
Hybridisation at around 50 °C would be sufficient to create
single stranded nodes within the duplex which would seed strand
invasion. A salt concentration between 0 and 1 M Na would typically
be appropriate for PNA. A salt concentration between 50 mM and 1 M
Na would typically be appropriate for LNA.
[0506] The target may be substantially single stranded but would be
made accessible to hybridisation by stretching out on a surface.
This may be achieved by passing the molecules through a channel
that makes a seal with the substrate and passing a solution of the
molecules through by capillary action.
[0507] Other methods of obtaining sequence information described
herein would be applicable to sequence analysis on probe-captured
DNA.
[0508] Sequence information could be obtained by probing along a
single molecule using blocks of probe arrays in a similar manner to
that described above for haplotyping. Multiple copies of each
sequence would typically be required and probes would typically
need to be laid out in optimal spatial locations to obtain sequence
information. The position of individual molecules over the array
containing known sequences would need to be determined.
[0509] The present invention also relates to methods of arraying
pluralities of nucleic acid molecules at low density where,
although the identity of the nucleic acids may be unknown prior to
immobilisation, the array is subsequently characterised by the use
of encoded probes, such as tagged probes, or by successive serial
hybridization/melting of each probe from a complete repertoire (e.g.
around a thousand cycles with 5 mers) and then reconstructing the
sequence from information about the probes that hybridise to each
immobilized nucleic acid. In addition to obtaining sequence from a
sample nucleic acid this could also be a way of randomly arraying
probes, e.g. 25 mers, and then making the sequences spatially
addressable by decoding their sequence by hybridisation with
shorter probes.
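The figure of around a thousand cycles follows from the size of a
complete 5 mer repertoire. The sketch below enumerates such a
repertoire and records which members would be found on a given
immobilised sequence. It assumes ideal discrimination of perfect
matches, and for simplicity each probe is represented by the target
subsequence to which it is complementary; the function names and
the example sequence are hypothetical.

```python
from itertools import product


def repertoire(k):
    """All 4**k sequences of length k over A, C, G and T."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def hybridisation_spectrum(target, k):
    """Set of k-mers present in the target, i.e. the probes that
    would hybridise under idealised, perfect-match conditions."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


cycles = len(repertoire(5))  # one hybridisation/melting cycle per probe
spectrum = hybridisation_spectrum("GATTACA", 5)
print(cycles, sorted(spectrum))
```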
[0510] 5. STR Analysis
[0511] The array oligonucleotide could probe the sequence flanking
a repetitive element. This captures a sequence containing a
repetitive element. It is then used to seed ligation of probes
complementary to the repetitive sequence, along the target strand
or to act as primer to polymerise a complementary strand to the
repetitive elements. Then the number of repeat units is determined
by quantitating the level of signal from fluorescently labelled
oligonucleotides or fluorescent nucleotides. Only completely
extended oligos which incorporate an oligo (preferably by stacking
hybridisation or ligation) complementary to the other flanking
sequence labelled with a different fluorophore, are typically
counted. It may be helpful to obtain ratios between fluorescence
intensity from the extended region and the labelled flanking
sequence. Ligation conditions described above (see lc) can be used;
a reaction temperature of 46-65 °C with a thermostable
ligase is preferable. Polymerisation conditions described above can
be employed.
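One possible way of turning the intensity ratio into a repeat
number is sketched below. The sketch is hypothetical: it assumes
that the signal from the extended region scales linearly with the
number of repeat units, that the ratio contributed by one repeat
unit has been calibrated beforehand, and that molecules lacking the
flanking label are not counted.

```python
def estimate_repeat_units(extended_signal, flank_signal, ratio_per_unit):
    """Estimate the repeat number from the ratio of extended-region
    signal to flanking-label signal.

    ratio_per_unit is an assumed calibration constant: the ratio
    contributed by a single repeat unit. Returns None when the
    flanking label is absent (incomplete extension, not counted).
    """
    if flank_signal <= 0:
        return None
    return round((extended_signal / flank_signal) / ratio_per_unit)


# Hypothetical intensities in arbitrary units.
print(estimate_repeat_units(1200.0, 100.0, 1.0))
print(estimate_repeat_units(500.0, 0.0, 1.0))
```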
[0512] A method to determine repeat lengths based on providing
probes complementary in length to the different target repeat
lengths as described (Case Green et al., p. 61-67, DNA Microarrays: A
Practical Approach, Ed. M. Schena, 1999, Oxford University Press) can
also be implemented at the single molecule level.
[0513] 6. Expression Analysis
[0514] Conventional microarray expression analysis is performed
using either synthetic oligonucleotide probes (e.g. 40-75 nt) or
longer cDNA or PCR product probes (typically 0.6 kb or more)
immobilised to a solid substrate. These types of arrays can be made
according to the present invention at low surface coverage (as
described in section A). After hybridisation, the level of gene
expression can be determined by single molecule counting using the
methods of the invention. This will give increased sensitivity and
will allow events due to noise to be distinguished from real
events. Also, as the basic unit of counting is the single molecule,
even a rare transcript can be detected. One implementation of
expression analysis involves comparison of two mRNA populations by
simultaneous analysis on the same chip by two-colour labelling.
This can also be done at the single molecule level by counting each
colour separately by, for example, beam splitting. Capture of a
target cDNA or mRNA can allow further analysis by oligonucleotide
probing. For example this could be used to distinguish
alternatively spliced transcripts.
[0515] Microarray Theory Suggests that Accurate Gene Expression
Ratios at Equilibrium can be Obtained when the Sample Material is
Low.
[0516] A permanently addressable copy of an mRNA population can be
made by primer extension of molecules separated on single molecule
arrays. Primers could be designed based on the available genome
sequence or gene fragment sequences. Alternatively, unknown
sequences could be sampled using a binary probe comprising a fixed
element that would anchor all mRNA and a variable element that
would address/sort the repertoire of mRNA species in a population.
The fixed element may be complementary to sequence motifs that are
common to all mRNA such as the Poly A sequence or the
Polyadenylation signal AAUAAA or preferably to a common clamp
sequence that is ligated to all mRNA or cDNA at 5' or 3' ends. The
copy could be used as the basis for further analysis such as
sequencing.
[0517] 7. Comparative Genomic Hybridisation (CGH).
[0518] Gridded genomic DNA or genomic DNA immobilized by spatially
addressable capture probes (or complementary copies) is probed by
genomic DNA from a different source to detect regions of
differential deletions and amplifications between the two samples.
The immobilized sample containing multiple copies of each species
may be a reference set and genomic DNA from two different sources
may be differentially labeled and compared by hybridization to the
reference.
[0519] 8. Detection of Target Binding to a Repertoire of
Oligonucleotides
[0520] A target can be hybridised to a repertoire of ligands.
Single molecule analysis would be advantageous; for example, it
would reveal binding characteristics of conformational isomers and
overcome the steric hindrance associated with binding of targets to
arrays in which molecules are tightly packed. Hybridisation would
be conducted under conditions close to those that would occur in
the intended use of any selected ligand.
[0521] For antisense oligonucleotide binding to RNA, hybridisation
would occur at 0.05 to 1 M NaCl or KCl with MgCl2
concentrations between 0 and 10 mM in for example Tris Buffer. One
picomole or less of target will be sufficient. (Refer to
EP-A-742837: Methods for discovering ligands).
[0522] The method also provides a method for randomly arraying a
combinatorial repertoire. Such a repertoire could allow billions of
molecules to be analysed in an assay that would be designed to
detect a signal from each molecule by single molecule detection
techniques. The encoding would identify the molecule. The
combinatorial repertoire, whether it is encoded or not could be
made much more simply than conventional libraries, e.g. by adding a
mixture of all four bases at every step of synthesis in DNA
synthesis as is done when generating repertoires for systematic
evolution of ligands by exponential enrichment (SELEX) (Tuerk C and
Gold L., Science 1990 249: 505-510). However, because analysis is at
the single molecule level, enrichment by PCR is not needed for
detection. In application to nucleic acid structures or aptamers,
after a functional assay such as a binding assay, the immobilized
target molecule could be probed by short oligonucleotides to
determine its sequence by sequencing by hybridisation methods.
Molecular combing would facilitate this.
[0523] 9. Protein-Nucleic Acid Interactions
[0524] Interactions between biological molecules, such as proteins,
and nucleic acids can be analysed in a number of ways. Double
stranded DNA polynucleotides (by foldback of designed sequences)
can be immobilised to a surface in which individual molecules are
resolvable to form a molecular array. Immobilised DNA is then
contacted with candidate proteins/polypeptides and any binding
determined by the methods described above. Alternatively RNA or
duplex DNA can be horizontalised and optionally straightened by any
of the methods referred to herein. The sites of protein binding may
then be identified within a particular RNA or DNA using the methods
described herein. Candidate biological molecules typically include
transcription factors, regulatory proteins and other molecules or
atoms such as calcium or iron. When binding to RNA is analysed,
meaningful secondary structure is typically retained.
[0525] The binding of labeled transcription factors or other
regulatory proteins to genomic DNA immobilized and linearised by
the methods referred to herein may be used to identify active
coding regions or the sites of genes in the genome. This would be
an experimental alternative to the bioinformatic approaches that
are typically used to find coding regions in the genome. Similarly,
methylated regions of the genome could be denoted by using
antibodies specific for 5-methylcytosine. Differential methylation
may be an important means for epigenetic control of the genome, the
study of which is becoming increasingly important. Information from
tag probes would preferably be combined with information about
methylated regions and coding regions.
[0527] An alternative means for determining the methylation status
of DNA would be by force or chemical force analysis using AFM. For
example a silicon nitride AFM tip would interact differently with
methyl cytosine in DNA, which is more hydrophobic than
non-methylated DNA.
[0528] Other applications include RNA structure analysis and
hybridisation of tags to anti-tag arrays.
[0529] Other Types of Assays
[0530] The present invention is not limited to methods of analysing
nucleic acids and interactions between nucleic acids. For example,
in one aspect of the invention, the molecules are proteins. Capture
probe may be used to bind protein. Other probes can further
interrogate protein. For example, further epitopes may be accessed
by antibodies or an active site by a small molecule drug.
[0531] Low density molecular arrays may also be used in methods of
high-throughput screening for compounds that interact with a given
molecule of interest. In this case, the plurality of molecules
represent candidate compounds (of known identity). The molecule of
interest is contacted with the array and the array interrogated to
determine where the molecule binds. Since the array is spatially
addressable, the identity of each immobilised molecule identified
as binding the molecule of interest can be readily determined. The
molecule of interest may, for example, be a polypeptide and the
plurality of immobilised molecules may be a combinatorial library
of small molecule organic compounds.
[0532] Many of the above assays involve detecting interactions
between molecules in the array and target molecules in samples
applied to the array. However, other assays include determining the
properties/characteristics of the arrayed plurality of molecules
(even though their identity is already known), for example
determining the laser induced fluorescence characteristics of
individual molecules. An advantage over bulk analysis would be that
transient processes and functional isomers would be detected.
[0533] Thus in summary, the assays of the invention and the low
density molecular arrays of the invention may be used in a variety
of applications including genetic analysis, such as SNP detection,
haplotyping, STR analysis, sequencing and gene expression studies;
identifying compounds/sequences present in a sample (including
environmental sampling, pathogen detection, genetically modified
foodstuffs and toxicology); and high throughput screening for
compounds with properties of interest. High throughput genetic
analysis will be useful in medical diagnosis as well as for
research purposes.
[0534] Advantages of the single molecule array approach can be
summarised as follows:
[0535] 1. Can resolve complex samples.
[0536] 2. Can separate correct signals from erroneous signals.
[0537] 3. Sensitivity of detection down to a single molecule in the
analyte.
[0538] 4. Sensitivity of detection of a single variant molecule
within a pool of common (e.g. wild-type) molecules.
[0539] 5. Eliminates need for sample amplification.
[0540] 6. Allows individual molecules in target sample to be sorted
to discrete array elements and to ask specific questions of said
target molecules e.g. analyse multiple polymorphic sites (i.e.
haplotyping).
[0541] 7. Can perform time-resolved microscopy of single molecular
events within array elements and hence detect transient
interactions or temporal characteristics of single molecule
processes.
[0542] 8. Due to single molecule counting can get very precise
measurements of particular events, e.g. allele frequencies or mRNA
concentration ratios.
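The precision obtainable from single molecule counting can be
illustrated with the standard binomial estimate. In the sketch
below the counts are hypothetical, and the standard error
sqrt(p(1-p)/n) assumes that each counted molecule is an independent
observation.

```python
import math


def allele_frequency(count_a, count_b):
    """Frequency of allele A from single molecule counts, with its
    binomial standard error (independent observations assumed)."""
    n = count_a + count_b
    p = count_a / n
    standard_error = math.sqrt(p * (1.0 - p) / n)
    return p, standard_error


# Hypothetical counts: 900 molecules of allele A, 100 of allele B.
p, se = allele_frequency(900, 100)
print(p, se)
```

The standard error falls in proportion to the square root of the
number of molecules counted, so counting more molecules tightens
the estimate.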
[0543] E. Alternative Assay Methods
[0544] A further aspect of the invention relates to the production
of arrays comprising randomly immobilised molecules from a sample
of interest. These arrays are then interrogated to obtain
information about the immobilised molecules in the array. This
approach is typically applied to pluralities of polypeptides or
nucleic acids obtained from, for example cells, in genomics or
proteomics approaches. Not only will characterisation of the arrays
provide useful genomic and proteomic information about the sample
which has been arrayed, but characterised arrays may then be used
in many of the methods described above.
[0545] One method for obtaining a signature identity for each
molecule within a randomly immobilised array is surface enhanced
Raman spectroscopy (SERS). A single molecule can be attached to a
colloidal gold or silver bead, and the beads spread on a surface.
This enables the Raman signal due to the single molecule to be
enhanced sufficiently for it to be detected. Raman spectroscopy is
advantageously carried out within a scanning probe microscopy
configuration. This kind of Raman spectroscopy may further provide
some structural information about the molecules under
investigation.
[0546] Moreover, Raman spectroscopic fingerprints can be used for
encoding labels for probes as required for certain aspects of the
present invention (see Cao, Y C, Rongchao, J and C A Mirkin,
Science 297:1536-1540 2002).
[0547] Where immobilization is totally random, there will be a
Poisson distribution of molecules on the surface, spaced apart at a
variety of distances. Some molecules will be too close to resolve
by optical microscopy. It can thus be advantageous to make the
distribution of a non-spatially addressable random array ordered
such that each molecule occupies a fixed distance from any other.
For example, each individual molecule can be positioned at a
pre-defined position by using for instance scanning probe
microscopy. Another method involves attaching the molecules to
binding sites created in two or three dimensional lattices of the
type described by Winfree et al. (Nature 1998 394: 539-44).
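The proportion of randomly placed molecules that lie too close to a
neighbour to be resolved can be estimated from the Poisson model.
For a two dimensional Poisson process of density rho, the
probability that the nearest neighbour lies within a distance r is
1 - exp(-rho * pi * r^2). The density and the resolution limit used
in the sketch below are assumed values chosen for illustration.

```python
import math


def fraction_unresolved(density_per_um2, resolution_um):
    """Probability that a molecule has a neighbour closer than the
    resolution limit, for a 2D Poisson distribution of molecules."""
    area = math.pi * resolution_um ** 2
    return 1.0 - math.exp(-density_per_um2 * area)


# Assumed values: 1 molecule per square micron, 0.25 micron limit.
print(fraction_unresolved(1.0, 0.25))
# A tenfold lower density leaves far fewer unresolved molecules.
print(fraction_unresolved(0.1, 0.25))
```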
[0548] Proteomics--Immobilisation of Target Molecules and
Interrogation of Physicochemical Properties
[0549] In an alternative method for characterization of the protein
content of a cell or tissue, the sample molecules are not captured
by array molecules. Instead the sample is applied to a solid phase
(that is, one not comprising an immobilised array of molecules), with
each individual molecule settling randomly on the surface. For
example, protein molecules can be adsorbed onto a variety of
surfaces, with some proteins better adapted for one surface than
another. The surface could be differentially patterned with
different surface coatings e.g. hydrophilic or hydrophobic. Then
individual protein molecules are differentiated by their size,
shape, mass or any physicochemical property, preferably by scanning
probe microscopy. They may also be differentiated due to the region
of surface attachment on an array of different surface
chemistries.
[0550] Proteins may also be recombinant. There are currently
efforts underway to make a catalogue of clones so that any protein
can be expressed off the shelf. Hence each protein (and any
variant) can be expressed individually, placed on the surface and
its characteristics determined or "learnt" by the method,
preferably based on SPM. To analyse the >40,000 protein molecules
[this figure refers to the minimum, which is due to the number of
genes; there must be a higher number of proteins due to alternative
splicing and post-translational modifications] in a high throughput
manner to learn their characteristics, an array format may be
useful, and it is likely that arrays of increasing numbers of
different proteins will become increasingly available. A method
for producing an in vitro array of proteins by cell free
synthesis from PCR products has been reported (M He and M Taussig,
Nucleic Acids Research 2001 29: e73).
[0551] Then for a complete description of the protein content of a
cell, questions such as what proteins (and variants) are present,
how many individual molecules of each are present, and which
proteins are interacting with which other proteins can be asked. If
learning has only been performed on a subset then it is only the
members of the subset that can be identified. Any protein that does
not fit the description of a "learnt" protein can be stored in
computer memory along with its determined characteristics for
future identification. Such unknown proteins may become implicated
with particular functions due to being correlated with specific
expression or prevalence patterns in different biological or
pathological situations.
[0552] This approach could be extended to look at other components
of cells or tissues such as lipids, polysaccharides or the
metabolome.
[0553] Genomics--Immobilisation of Target Molecules and
Interrogation with Tagged Probes.
[0554] In an alternative means for haplotyping and for sequencing,
the sample nucleic acids are not captured by array probes. Instead
the sample (e.g. fragmented genomic DNA) is applied to a solid phase
(that is, one not comprising an immobilised array of molecules),
with each individual molecule settling randomly on the surface and
becoming horizontalised. For example DNA molecules can be adsorbed
to mica surfaces in the presence of certain divalent cations, e.g.
nickel or cobalt or magnesium or onto polylysine coated surfaces.
The use of low pH promotes attachment of molecules only by one end.
The molecules would then preferably be straightened by methods
known in the art and as discussed above. The method of application
of the nucleic acids may also lead to straightened molecules. The
targets may be in double or single stranded form as discussed
above.
[0555] The identities of individual molecules can be determined by
probes of known sequence. Sixteen nucleotides of sequence
information are typically required to identify uniquely a DNA
fragment in the genome. It would be expected that this length of
sequence information would allow the fragment to be mapped to the
genome. Only 7 to 9 nucleotides may be sufficient to uniquely tag
mRNA. Preferably, the identity of each molecule is encoded prior to
arraying (by pre-hybridisation of the sample DNA with the
repertoire of tags).
[0556] Obtaining 16 nucleotides of sequence information from one or
more proximal points allows each molecule to be identified. For
example, four 4 mers would give the requisite information and would
require only 256 different tags. Six 3 mers would give 18 nt of
sequence information and this would require only 64 tags, although
it would be difficult to obtain stable hybrids with such short
length. These oligonucleotide sizes could be incorporated into
methods described herein for synthesizing complementary strands by
ligation. Or alternatively the short oligonucleotides could be
analogues that bind with greater strength such as PNA, LNA and
Morpholino oligos.
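The tag numbers quoted above follow directly from the number of
possible sequences of a given length. The following sketch
reproduces the arithmetic; the function names are illustrative, and
the comparison with the genome assumes a size of about
3,000,000,000 bp.

```python
def tag_scheme(k, n_probes):
    """Return (number of distinct tags needed, nucleotides of
    sequence information) for n_probes probes of length k."""
    return 4 ** k, k * n_probes


def distinct_sequences(n_nt):
    """Number of distinct sequences of n_nt nucleotides."""
    return 4 ** n_nt


print(tag_scheme(4, 4))  # four 4 mers
print(tag_scheme(3, 6))  # six 3 mers
# 16 nt distinguish more sequences than there are positions in a
# genome of about 3,000,000,000 bp (assumed size).
print(distinct_sequences(16) > 3_000_000_000)
```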
[0557] Zhong et al, 2001 (PNAS 98: 3940-3945) and Woolley et al.,
2000 (Nature Biotechnology 18: 760-763) have demonstrated analysis
of haplotypes on single molecules. The methods of the present
invention would similarly obtain haplotypes in molecules that are
not captured by array probes. However, the methods disclosed herein
differ from Zhong et al, 2001 and Woolley et al., 2000 in that the
molecules are probed with two distal tagged probes to uniquely
identify a strand and SNPs analysed in between these two tagged
probes by using dual labelled biallelic probes.
[0558] The entire sequence of individual molecules can be
determined by probing with a complete set of oligonucleotides. This
can be done sequentially with each individual oligonucleotide of
the set. Or it can be done simultaneously where each probe sequence
of the set is encoded. It is advantageous to discriminate
mismatches of one or two bases (this can be done by controlling
hybridisation and wash stringencies, use of enzymes, chemical
cleavage of mismatches etc). However, highly exacting
discrimination of mismatches is not essential and may be tolerated
depending on the approach used for sequence reconstruction. In a
population of molecules there would be many copies of each sequence
present. Each position may be probed by multiple sequences and even
mismatches, which usually behave predictably can be informative. If
the oligonucleotides are short (e.g. 6 mers) then their
complementary sequences are likely to occur at multiple positions
along the length of a single molecule. However, the positional
information and/or information of order of probes along the molecule,
that can be obtained by analysing horizontalised single molecules
would be highly useful in re-assembling the sequence. The
software/algorithms used for sequence reconstruction may be similar
to those developed for Sequencing by Hybridisation (SbH) by for
example Pavel Pevzner of the University of California.
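The value of positional information for re-assembly can be shown
with a deliberately idealised sketch. It is not the SbH software
referred to above. It is a simple majority vote which assumes that
the start position of every hybridised probe is known to the
nearest base, which is far better than would be expected in
practice. Probes are represented by the target subsequence to which
they are complementary, and all names and data are hypothetical.

```python
from collections import Counter


def reconstruct(hits, length):
    """Majority-vote consensus from (position, kmer) observations.

    position is the assumed 0-based start of the k-mer along the
    molecule; positions not covered by any probe are reported as N.
    """
    votes = [Counter() for _ in range(length)]
    for position, kmer in hits:
        for offset, base in enumerate(kmer):
            index = position + offset
            if 0 <= index < length:
                votes[index][base] += 1
    return "".join(
        v.most_common(1)[0][0] if v else "N" for v in votes
    )


target = "GATTACAGGC"  # hypothetical target sequence
k = 4
hits = [(i, target[i:i + k]) for i in range(len(target) - k + 1)]
hits.append((2, "TTAG"))  # one hypothetical mismatched hybridisation
consensus = reconstruct(hits, len(target))
print(consensus)
```

Because each position is covered by several overlapping probes, the
single mismatched observation is outvoted, which reflects the point
made above that highly exacting mismatch discrimination is not
essential.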
[0559] The process typically involves: addition of an
oligonucleotide, recording if it interacts with the target molecule
and determining its location relative to the ends of the target
molecule and relative to positions occupied by other probes;
denaturation of oligonucleotide from the target molecule (e.g. by
heating, manipulating salt concentration or pH or by applying an
appropriately biased electric field); and probing with the next
oligonucleotide. There may be many copies of each target molecule
(unless the sample material is unamplified genomic DNA from a
single cell) and so results from each copy would add to the
confidence in the reconstruction.
[0560] Application of positive charge or electric field to the chip
surface would be expected to facilitate horizontalisation. This
could be done by adding positive charges by chemical treatment of
the surface or by applying a positive bias. Hybridisation would
also be facilitated by electric fields, when much of the
hybridisation volume distal to the surface is kept at high
stringency to eliminate secondary structure in the target, but the
polarity of the electric field applied in the vicinity of the probe
would serve to attract DNA and to screen the negative backbones of
the DNA target and probe to enable hybridisation to occur. Flipping
to opposite polarity would serve to remove mismatches
preferentially.
[0561] Double stranded DNA could be analysed using strand invasion
by PNA or LNA probes as described above.
[0562] Complementary Strand Synthesis by Ligation
[0563] The target may be probed (tagged) and made double stranded
prior to immobilisation. A complementary strand is usually
synthesised by DNA polymerase or reverse transcriptase. If each
base that is incorporated could be identified then the sequence
could be obtained. However current techniques are not sensitive
enough to identify individual bases. In the method of the invention
a complementary strand to a target single strand is synthesised by
concatenation and ligation of oligonucleotides along the DNA (see
FIG. 3). The reason for doing this is to incorporate into the DNA
chain, oligonucleotides which can be individually detected and
uniquely identified as will be discussed below.
[0564] Typically around 250 nts can be efficiently synthesised by
ligating 9 mers. However, further optimisation would be expected to
increase the length that is possible to synthesize by ligation.
[0565] Gap Fill Ligation
[0566] Long stretches of DNA may be better catered for by using
"gap fill ligation". Here, rather than ligation of
adjacent/contiguous oligonucleotides, oligonucleotides are placed
apart and the gap between them is filled by template directed
polymerisation of a complementary strand, primed from the 3' side
of each oligonucleotide, and abutting and terminating at the next
oligonucleotide. The polymerised strand is then ligated to the 5'
abutting end of the next oligonucleotide. This abutting
oligonucleotide will itself have primed polymerisation toward the
next oligonucleotide along the chain and so on (see FIG. 4).
[0567] Such an approach would be useful for haplotyping. It will
also be possible to use this approach for sequencing, where, as
described previously, a complete library of oligonucleotides is
used, each oligonucleotide being used in a concentration that
allows attachment of oligonucleotides at roughly the desired
distances from each other on the DNA chain, and each target DNA
sequence being present in many copies. Sequence will be
reconstructed from obtaining tag and positional information
concerning the different sub-set of probes from the complete set
that associate with each of the copies. The reaction is optimised
so that enough overlapping sequence information is achieved by
probing each of the copies to reconstruct the sequence.
[0568] Alternatives to Immobilisation to a Flat Surface
[0569] In addition to immobilisation of the target single DNA
molecule on a flat surface, the DNA could be wrapped around a bead
or particle or a nanorod or nanobar or it could be freely floating
in solution. The tags that are associated with the beads would then
be read by a sensitive flow cytometry system. Furthermore the
strands could be flowing in channels or capillaries. Readout would
typically be by far-field optical acquisition. Excitation could be
through a near-field slit as described by Tegenfeldt et al., 2001,
Physical Review Letters 86: 1378-1381.
[0570] Tagging Schemes for Single Molecule Analysis
[0571] Analysis can be performed with one or a few different dyes
(or other tags) if each oligonucleotide is applied sequentially.
However, to maximise throughput in the system it would be desirable
to probe with many or all oligonucleotides simultaneously. Hence
there is a need to have a repertoire of tags corresponding to the
repertoire of oligonucleotide sequences. A range of dye molecules
are currently in use and these can be easily detected and can be
differentiated. However, clearly there are not enough spectrally
resolvable single dye molecules available to encode an entire
repertoire of oligonucleotide sequences, even for say every 4 mer
which numbers 256. One means to increase the available repertoire
is by measuring fluorescent lifetimes as well as wavelength. For
example Lanthanide containing fluorophores have around 5-50K fold
longer lifetimes than ordinary fluors. But this will still not give
enough unique tags to code for a useful number of oligonucleotides.
Another means is measurement of molecular brightness which varies
with number of dye molecules or particles of a particular
wavelength. One solution is to use combinations of tags (dye
molecules) to encode each oligonucleotide in the repertoire. Image
acquisition may be by a system of the type described by Harold
Garner and colleagues (Schultz et al., 2001, Cytometry 43:
239-247), where a complete emission spectrum is obtained from each
molecule, hence eliminating the complication of using filters to
acquire multiple fluorescent wavelengths. Alternatives to dyes,
especially for higher resolution analysis, are labels that can be
detected by STM or various forms of electron microscopy. These may
be conducting materials or molecules. Also there would be a wide
variety of tags that could be used if analysis was by AFM. Tags
bearing any physicochemical property that is detectable by AFM
could be used. For example, Ishino et al., (Jpn. J. Appl. Phys.
1995 33: 4718-4722) have demonstrated the discrimination of a
series of charged functional groups by AFM. For encoding, these
functional groups would need to be tethered to the oligonucleotide
probes but spatially far apart from each other so that they can be
individually detected by AFM; it would be desirable to use
sharpened AFM probes or thin-walled carbon-nanotube probes.
Alternatively and more preferably, each functional group from the
palette is used to derivatise nanobeads. The oligonucleotides of
the repertoire would be encoded by combinations of the derivatised
nanobeads. AFM can be used to produce force curves which display
the force-distance relationship of the AFM tip and substrate.
Different physico-chemical features contribute to different shapes
of the force curves. A single functional group could derivatise
nanobeads at different surface coverages, for example 5% covered to
100% covered, and this density of coverage would be reflected in
the force curves that are obtained. Different force-distance
relationships would be found if the chemical nature of either the
tip or the sample were changed. Chemical force microscopy involves
derivatising the tip with known chemistries, including specific
oligonucleotide sequences. An array of force curves can be obtained
by Force Mapping (Heinz W F and Hoh J H, 1999, Biophysical Journal
76: 528-538).
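As a rough illustration of the combinatorial argument above, the
following Python sketch counts how many codes a palette of tags can
provide and finds the smallest palette able to encode every
oligonucleotide of a given length. It assumes that each code is an
unordered set of distinct tags of fixed size (a tag is either present
or absent); the function names are hypothetical and are not part of
the described method.

```python
from math import comb


def repertoire_size(oligo_length: int) -> int:
    """Number of distinct oligonucleotides of a given length (4 bases)."""
    return 4 ** oligo_length


def codes_available(palette_size: int, tags_per_code: int) -> int:
    """Codes obtainable by choosing tags_per_code distinct tags from the
    palette, ignoring order (assumption: presence/absence encoding)."""
    return comb(palette_size, tags_per_code)


def smallest_palette(n_oligos: int, tags_per_code: int) -> int:
    """Smallest palette whose combinations cover n_oligos sequences."""
    size = tags_per_code
    while codes_available(size, tags_per_code) < n_oligos:
        size += 1
    return size


if __name__ == "__main__":
    n = repertoire_size(4)  # every 4 mer
    print(n, smallest_palette(n, 3), smallest_palette(n, 4))
```

Under these assumptions, 13 resolvable tags taken three at a time
(286 codes) would be enough to cover all 256 4 mers, whereas 12 tags
(220 codes) would not.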
[0572] The repertoire of tags could be created by using mass
encoding. Each tag would have a different mass. This may be
detected by scanning probe mass spectrometry as proposed by, for
example, workers at Lancaster and Loughborough Universities
(Pollock HM and Song M, UK SPM meeting 2001, University of Leeds
10.sup.th and 11.sup.th April, Poster). A range of mass tags and
various means for their construction have been reported by
Shchepinov et al. (e.g. Tetrahedron 56: 2713 (2000)). Compounds of
different masses can be directly coupled to the oligonucleotide by
an asymmetric synthon. To generate a large repertoire, mass tags
can be combined on beads or on the arms of dendrimers.
[0573] Dendrimers are useful structures for encoding, able to form
a repertoire of nanostructures that can be discriminated by SPM for
example or serve as arms or receptacles for holding other encoding
entities.
[0574] Electrophore tags have also been described by Aclara
Inc.
[0575] Luminex Corp. loads polystyrene beads with different ratios of
20 or more dyes for encoding. Currently around 1000 different
combinations can be produced but they aim to produce one million in
the future. However, the beads they use for this are too
large for single molecule sequencing as described above, although
they may be useful for determining haplotypes as SNPs occur roughly
every 1000-1200 bases (around 300-400 nm). However, if the
appropriate dye dynamic range could be obtained from beads with a
few nanometre dimensions, then these would be useful for encoding
oligonucleotides for single molecule analysis. Fluorescently
labelled latex nanoparticles are available from Sigma. Polystyrene
beads loaded with dyes are available from Molecular Probes in
sizes as small as 20 and 40 nm. Different
surface chemistries are available for linking oligonucleotides,
including carboxyl groups and streptavidin. A feature of good beads
would be non-leakiness of the dyes. Semiconductor nanocrystals or
quantum dots (QDs) whose spectral emission is narrow can be found
from around 1.5 nm upwards. However, the current sets of QDs that
have been used rely on size of particle to control the emission
wavelength (this is a quantum effect). This suggests that to obtain
a repertoire, some QDs may be larger than the .about.3 nm size
which will best fit with say 9 mer oligonucleotides. This would be
satisfactory for haplotype analysis but not for sequencing.
However, QDs of the same size but made of different semiconductor
materials would be expected to emit at distinct wavelengths, so a
series of sizes from 1.5 to 3 nm and a series of semiconductor
materials can be used as the palette from which to build encodings.
The QDs can be linked along a linear chain and the spatial origin of
signal from each could be determined or the spectral
characteristics of the combination could be deconvoluted. Diversity
could be increased by covering the QDs or dye loaded nanoparticles
with a different characteristic that can be measured e.g. different
types of functional groups or different densities of functional
groups as described. Detection in this case would be by dual SNOM
and AFM, various configurations of which have been described. AFM
can also be combined with Far-field optical methods (e.g. Kolodny
et al., 2001, Anal Chem. 73: 1959-1966). Similarly PRPs or SERS
nanoparticles can be employed in many of these ways. Recently
quantum nanobars have been reported. These have useful spectral
signatures and also would be good substrates for linking further
encoding schemes. The nanobars can be made of composite materials
and so have different spectral characteristics in different
regions. SurroMed Inc suggest that as many as 24,421,875 different
Nanobarcodes could be created by nanobars composed of 11 stripes of
5 different metals (Remy Cromer, at Cambridge Healthtech Institute's
Fifth Annual meeting on Advances in Assays, Molecular Labels,
Signaling and Detection, May 17-18.sup.th, Washington D.C.).
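The figure of 24,421,875 quoted above can be reproduced by a simple
count. The Python sketch below is one reading of that number, not a
method stated in the source: it assumes that a striped nanobar cannot
be told apart from its end-to-end reversal, so that each barcode and
its mirror image count once. The function name is hypothetical.

```python
def striped_barcodes(stripes: int, materials: int) -> int:
    """Count stripe patterns when a bar and its reversal are identical.

    Assumption: the bar has no marked end, so a pattern read left to
    right is the same code as the pattern read right to left.
    """
    ordered = materials ** stripes
    # Patterns that read the same in both directions (palindromes) are
    # fixed by the first half of the stripes, rounded up.
    palindromes = materials ** ((stripes + 1) // 2)
    # Non-palindromic patterns pair up with their reversal.
    return (ordered + palindromes) // 2


if __name__ == "__main__":
    print(striped_barcodes(11, 5))
```

With 11 stripes and 5 metals there are 5^11 = 48,828,125 ordered
patterns, of which 5^6 = 15,625 are palindromic, giving
(48,828,125 + 15,625) / 2 = 24,421,875 distinguishable bars.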
[0576] Alternative methods would be required to resolve spectral
lines if they numbered around a million different lines.
Interferometric techniques may be capable of resolving this high
number of spectral lines.
[0577] Another means for encoding is to use different lengths of an
aliphatic chain. These must be perpendicular to the DNA chain. For
example the DNA chain could be in a channel as has been described,
and the tag-chains could bear paramagnetic particles that are attracted
to magnets that are placed parallel to the DNA, causing the chains
to align perpendicular to the DNA. Or they could be aligned by an
electric field. AFM can distinguish lengths very accurately. The
tag chain could be electronic/metallic wires which are interrogated
with STM or they could be coated with material that is easy to
detect by AFM, e.g. by lateral force measurement. Polymerases can
incorporate bases with for example biotin attached so the chains
could be attached at appropriate positions on oligonucleotides or
even on individual nucleotides. The chain could be very narrow and
in this case if linked to individual bases they could allow base by
base sequencing. This kind of tagging of individual nucleotides
would allow linear reading of sequence without the need for
sequence reconstruction. The tagging chain could in some instances
be DNA, preferably restricted in alphabet (Mir K U. A Restricted
Genetic Alphabet for DNA Computing. In DNA Based Computers II,
Publisher: American Mathematical Society (1998)), which could be
covered specifically with cytochrome and/or metal as done in
electron microscopy.
[0578] Steric interference from tags could be reduced by using an
adaptor encoding DNA sequence. For example, an oligonucleotide
probe could be attached to a "surrogate" sequence tag which would
bind to a specific anti-tag on the encoded bead.
[0579] Building up the Repertoire of Tagged Oligonucleotides
[0580] Oligonucleotides can be linked to their tags in two ways.
Firstly, oligonucleotides and tags can be prepared separately and
then manually linked together (not combinatorially). Secondly they
can be joined by combinatorial chemistry by various means. Split
and mix synthesis would be particularly appropriate. But rather
than perform this on beads which is the usual way, this should be
done directly by using an asymmetric synthon that can initiate both
the oligonucleotide and the tag synthesis by stepwise solid-phase
synthesis, with the oligonucleotides and the tags having different
protection groups (special protected synthons may need to be
produced for the tags). Alternatively, say for the differential
loading of dye or functional groups, the nature and position of
each base would need to be encoded. Each base addition would
correspond to a different level of loading of an entity of a
particular type. For example, the identity of the base added could
be encoded in the density of surface coverage of the bead by a
particular functional group.
[0581] A=0% surface coverage
[0582] C=33% surface coverage
[0583] G=66% surface coverage
[0584] T=100% surface coverage
[0585] For the second and subsequent base additions a separate
functional group, distinct from previous functional group(s), would
be added; the identity of the functional group would indicate its
location along the chain (is it first or seventh base for example).
The density of surface coverage would indicate which base is added.
The same logic can be followed for optical encoding, mass encoding
etc.
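The coverage-based encoding just described can be sketched in Python
as follows. This is an illustrative sketch only: functional groups
are represented by their position number, and the decoder snaps a
measured coverage to the nearest of the four levels, which is an
assumption about how a noisy reading might be handled rather than
part of the scheme as described.

```python
# Coverage levels (percent) taken from the scheme above.
COVERAGE = {"A": 0, "C": 33, "G": 66, "T": 100}


def encode(sequence: str) -> list[tuple[int, int]]:
    """Return (position, coverage) pairs for a probe sequence.

    Position i stands for the i-th distinct functional group
    (hypothetical numbering); coverage encodes the base identity.
    """
    return [(i, COVERAGE[base]) for i, base in enumerate(sequence, start=1)]


def decode(code: list[tuple[int, float]]) -> str:
    """Recover the sequence from (position, measured coverage) pairs."""
    bases = []
    for _, measured in sorted(code):
        nearest = min(COVERAGE, key=lambda b: abs(COVERAGE[b] - measured))
        bases.append(nearest)
    return "".join(bases)


if __name__ == "__main__":
    code = encode("GATTACA")
    print(code)
    print(decode(code))
```

The same pair of functions would apply to optical or mass encoding
by replacing the coverage levels with the corresponding intensity or
mass levels.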
[0586] Where the encoding is inherent in beads (or chains of
beads), the beads could serve as templates for synthesis whilst
orthogonally being labelled with appropriate encoding at each step.
This would result in an encoded bead with many copies of the
probe sequence on its surface. For single molecule analysis only
one molecule would be required to interact with the target. The
remaining molecules could be left redundant or inactivated. For
example 99% of the molecules on the surface could be cleaved off
and discarded. Alternatively, bead derivatisation chemistry would be
done in a way such that the stoichiometry of bead versus functional
group (for initiating oligonucleotide synthesis) would be such that
statistically only one or very few functional groups would
associate with one bead.
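The statistical argument above can be made concrete with a short
Python sketch. It assumes, as a simplification not stated in the
source, that functional groups attach to beads independently so that
the number per bead follows a Poisson distribution with a mean set by
the stoichiometry; the function names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a bead carries exactly k functional groups."""
    return exp(-mean) * mean ** k / factorial(k)


def fraction_single(mean: float) -> float:
    """Among beads carrying at least one group, the fraction with
    exactly one."""
    occupied = 1.0 - poisson_pmf(0, mean)
    return poisson_pmf(1, mean) / occupied


if __name__ == "__main__":
    for mean in (1.0, 0.5, 0.1):
        print(mean, round(fraction_single(mean), 3))
```

Under this model, lowering the mean number of groups per bead
increases the share of occupied beads that carry a single group, at
the cost of leaving more beads without any group at all.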
[0587] Analysing Chromosomes/Fibre FISH
[0588] The tagging schemes described above could be used to
haplotype or sequence directly on metaphase chromosomes by
derivatives of FISH (Fluorescent in situ hybridisation). Here the
genome would come in a pre-fractionated state, partitioned in the
46 chromosomes of a diploid cell. Landmarks that are visible by
staining further aid the positioning analysis. Proximal probes
that cannot be resolved by location can be resolved by the
encoding. The tags are typically separated from the probes by long
linkers. Because of the condensed state of the DNA it is difficult
to get access to the DNA. DNA in interphase nuclei is more
accessible. The most accessible systems are DNA fibers (Fiber FISH)
and nuclear (Weigant) halo preparations where the DNA is in a more
extended form (Zhong et al., 2001, PNAS 98: 3940-3945). In Weigant
halo preparations nuclei are prepared and treated so that DNA is
de-proteinized and exploded from the nucleus. In these interphase
DNA or naked DNA preparations, chromosomal information is lost.
[0589] Access with PNA, LNA, DNG (Linkletter et al., 2001, Nucleic
Acids Research 29: 2370-2376) or Morpholino derivatives is
expected to be better than with DNA.
[0590] In general there is less restriction on the size of
beads/tags for haplotyping (hundreds of nm to a few microns) but
beads/tags must be of a few nm dimensions for sequencing
applications. Similarly, the resolution of optical methods for
reading fluorescent tags need not be high for haplotyping but must
be high for sequencing. When the association of adjacent
fluorescent tags to the DNA can be temporally resolved, then
resolution can be improved by deconvolution algorithms.
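The size constraints above follow from the physical length of the
DNA. The Python sketch below converts a number of bases into an
approximate length, assuming a rise of 0.34 nm per base pair for
extended B-form DNA, and checks whether two tags separated by a given
number of bases lie further apart than a chosen resolution. The
250 nm figure used in the example is an illustrative value for
far-field optics, not one taken from the source.

```python
NM_PER_BASE = 0.34  # assumption: B-form rise per base pair, fully extended


def contour_length_nm(n_bases: int) -> float:
    """Approximate end-to-end length of n_bases of stretched DNA."""
    return n_bases * NM_PER_BASE


def tags_resolvable(spacing_bases: int, resolution_nm: float) -> bool:
    """True if tags spacing_bases apart exceed the stated resolution."""
    return contour_length_nm(spacing_bases) >= resolution_nm


if __name__ == "__main__":
    print(contour_length_nm(9))         # one 9 mer probe
    print(contour_length_nm(1000))      # typical SNP spacing
    print(tags_resolvable(1000, 250.0))
    print(tags_resolvable(9, 250.0))
```

On these assumptions, SNPs about 1000 bases apart are separated by
roughly 340 nm, whereas adjacent 9 mer probes are only about 3 nm
apart, which is why tags of a few nm and higher resolution readout
are needed for sequencing.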
[0591] A number of new tagging schemes are proposed that would be
particularly useful for single molecule analysis. Such tagging
schemes can be applied to other embodiments described in this
invention and to other applications not related to single molecule
analysis.
[0592] The immobilisation/tagging/encoding procedures described
above may be used to generate randomly immobilised arrays of
nucleic acid molecules, whose identity need not be known prior to
immobilisation. The molecules are encoded such that they can be
uniquely identified by hybridising a plurality of tagged probes as
described above. Hybridisation of tagged probes may be conducted
before or after immobilisation. Typically, the nucleic acid
molecules are fragmented genomic DNAs or cDNAs.
[0593] Experimental Procedures
[0594] Primary Arrays
[0595] Primary arrays are those that carry the molecular species
that are directly involved in the molecular technique of the
invention.
[0596] Preparations for Arraying
[0597] Arrays may be made by spotting one or more probes or sample
molecules onto specific locations on a surface or by spreading
probes or sample molecules onto a surface.
[0598] Cleaning Substrates
[0599] The following procedures are preferably performed in a clean
room. The surface of a glass slide (e.g. Knittel Glazer, Germany) or
spectrosil slide is thoroughly cleaned. For example, sonicate in a
surfactant solution (2% Micro-90) for 25 minutes, wash in deionised
water, rinse thoroughly with milliQ water, and immerse in 6:4:1 milliQ
H.sub.2O:30% NH.sub.4OH:30% H.sub.2O.sub.2 or in a
H.sub.2SO.sub.4/CrO.sub.3 cleaning solution for 1.5 hr. Rinse and
store in a dust free environment, e.g. under milliQ water. The top
layer of mica substrates is cleaved by covering it with Scotch tape
and rapidly pulling the layer off.
[0600] Slides
[0601] It was found that slides from several manufacturers were
compatible with single molecule detection. It was found that slides
from different suppliers varied in the quality of evanescent field
that can be formed. We found that slides from Asper Biotech (Tartu,
Estonia) produced a good evanescent field.
[0602] Slide Surface Chemistry
[0603] Three different slide chemistries, epoxysilane, aminosilane
and enhanced aminosilane (3-Aminopropyltrimethoxysilane +
1,4-Phenylenediisothiocyanate), have been tested. Single molecule
arrays can be obtained with all three chemistries. Aminosilane
surface coating can be used both for experiments which look at
molecules as point sources of fluorescence and for experiments
which look at linearised DNA polymers, and therefore has been a
preferred substrate. Enhanced aminosilane slides or polyelectrolyte
coated slides are preferable when enzymatic reactions are
performed.
[0604] Derivatization of Glass with Polyethylenimine (PEI)
[0605] A glass slide is washed with 0.1 N acetic acid, then rinsed
with water until the water rinsed from the slide has a pH equal to
the pH of the water being used to rinse the slide. The slide is
then allowed to dry. To a 95:5 ethanol:water solution is added a
sufficient quantity of a 50% w/w solution of
trimethoxysilylpropyl-polyethylenimine (600 MW) in 2- to achieve a
2% w/w final concentration. After stirring this 2% solution for
five minutes, the glass slide is dipped into the solution, gently
agitated for 2 minutes, and then removed. The glass slide is dipped
into ethanol in order to wash away excess silylating agent. The glass
slide is then air dried. Aminated oligonucleotides are spotted in a
1 M sodium borate pH 8.3 based buffer or 50% DMSO.
[0606] Similar Polyelectrolyte coated slides may be purchased from
VBC-Genomics (Austria).
[0607] Printing
[0608] Each sequence or molecular identity is placed at a specific
spatial location on a surface so that a specific known molecular
identity can be found by going to a particular location on the
surface and conversely by determining the coordinates of a location
it is possible to determine the identity of molecules present
therein.
[0609] Spotting Pins
[0610] Capillary pins from Amersham Biotech optimized for Sodium
Thiocyanate buffer or pins optimized for DMSO buffer were used in
different spotting runs. Both types of pins enabled single molecule
arrays to be constructed. Other preferred spotting methods are the
Affymetrix ring and pin system and ink jet printing. Spotting pins
(Kaken, Japan) have also been used.
[0611] Determining Optimal Spotting Concentration for Making
Spatially Addressable Single Molecule Arrays.
[0612] The first step in the procedure for making single molecule
microarrays is to do a dilution series of fluorescent
oligonucleotides. This has been done with 13 mers and 25
mers but any appropriate length of oligonucleotide can be chosen.
These oligonucleotides may be aminated and preferably Cy3 labelled
at the 5' end.
[0613] A 10 uM solution of the oligonucleotide (this procedure is
also appropriate to proteins and chemical spotting) is placed in a
first well of the microtitre plate. For a 10 fold dilution, 1 ul is
transferred into the next well of the microtitre plate and so on
over several orders of magnitude (twelve orders of magnitude were
tested). A 1:1 volume of the 2.times. spotting buffer that is being
tested is added to each well. This gives a 5 uM concentration in the
first well, 500 nM in the second well and so on. The array is then
spotted using a microarrayer (Amersham Generation III). The
dilution series is then analysed by TIRF microscopy or AFM or other
relevant microscopy system. The morphology of the spot is analysed and
the distribution of molecules within the spot determined. The spot
range with the desired number of resolvable single molecules is
chosen. Optionally a further more focused dilution series is
created around the dilution of interest. For example two 50%
dilutions in the range 500 nM to 50 nM can be done.
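The concentrations in the dilution series can be tabulated with a
few lines of Python. This is a minimal sketch of the arithmetic only;
the function names and default arguments are hypothetical, and the
focused series simply halves the starting concentration a chosen
number of times.

```python
def dilution_series(stock_um: float = 10.0, fold: float = 10.0,
                    wells: int = 12) -> list[float]:
    """Final concentration (uM) in each well after serial dilution and
    addition of an equal volume of 2x spotting buffer."""
    finals = []
    conc = stock_um
    for _ in range(wells):
        finals.append(conc / 2.0)  # 1:1 mix with 2x buffer halves it
        conc /= fold               # carry the dilution to the next well
    return finals


def focused_series(start_nm: float, halvings: int) -> list[float]:
    """Concentrations (nM) obtained by successive 50% dilutions."""
    return [start_nm / 2 ** i for i in range(1, halvings + 1)]


if __name__ == "__main__":
    for well, conc in enumerate(dilution_series(wells=4), start=1):
        print(well, conc)
    print(focused_series(500.0, 2))
```

For a 10 uM stock the first wells come out at 5 uM, 500 nM and 50 nM,
and two 50% dilutions starting from 500 nM give 250 nM and 125 nM.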
[0614] In a first experiment a dilution series over 12 orders of
magnitude was spotted with 4 buffers to establish the range of
dilutions necessary. Subsequently more focused dilutions series are
used. It was found that concentrations between 250 nM and 67.5 nM
gave resolvable single molecules within an identifiable spot (if
there are too few molecules then it is difficult to know exactly
where the spot is; this will not be a problem when spot position and
morphology are known to be regular and movement of the translation
stage or CCD is automated rather than manual). Some spots give a
faint ring around the perimeter, which can help identify spots.
[0615] To achieve single molecule arrays, dilution series of
modified and unmodified oligonucleotides were tested in several
different spotting buffers, on three different slide chemistries, on
slides from several different manufacturers, at two different
humidities and using several different post-spotting protocols. Due to
the effects of photobleaching, the amount of pre-exposure to light
will also influence the number of single-dye labeled single
molecules that can be counted.
[0616] On enhanced aminosilane slides, QMT buffer 1, 1.5 M Betaine
3.times.SSC gave the best results. A faint ring was seen around the
spots in 1.5 M Betaine 3.times.SSC. Concentrations between 250 nM
and 67.5 nM were appropriate for single molecule counting on
relatively fresh slides. These slides should be stored at -70 degrees C. At
room temperature the ability to retain probe after spotting wanes
badly over a 2 month period.
[0617] Preparing Single Molecule Oligonucleotide Arrays
[0618] Oligonucleotide Chemistry
[0619] Unmodified DNA oligonucleotides and oligonucleotides
that were aminated at the 5' or 3' end were tested. There appears
to be no significant difference in morphology or attachment whether
the oligonucleotides are terminally modified or not. Several
different sequences of varying lengths that probe the TNF alpha
promoter have been tested. Thiol terminated nucleic acids can be
spotted onto gold surfaces or mercaptosilane coated surfaces.
[0620] For spotting from microtitre plates to slides, normal terminally
aminated phosphodiester oligonucleotides (Eurogentec, Belgium) are
used.
[0621] Make arrays as above but employ oligonucleotides in which
one or more bases are LNA bases (Proligo). (0.2 uM scale synthesis is
sufficient to print thousands of arrays; alternatively, for a large
number of elements the arrays are more economic to make by
combinatorial synthesis.) Arrays can also be made by spotting PNA
oligonucleotides (Oswel, UK or Boston Probes, USA).
[0622] Arraying Buffers
[0623] In total 11 different buffers have been tested. From the
study it has emerged that the best general buffer on the APTES
slides supplied by Asper Biotech is 50% DMSO and 50% Water. This
buffer gives far superior spot morphology than any other buffer
that was tested. Spotting humidity affects the morphology. Spotting
was tested at 43%/42% and 53-55% humidity with both conditions
giving useable arrays. However, there is a slight doughnut effect at
43% humidity compared to the almost perfect homogeneity at 55%
humidity. QMT2 (Quantifoil, Jena, Germany) buffer also gives
reasonable spots on Asper's epoxysilane slides.
[0624] After spotting, the epoxysilane slide is placed for 15 minutes
at 97 degrees C (this step may be omitted) and then stored at room
temperature for 12 to 24 hours. This is followed by storage at 4
degrees C. overnight (or preferably longer). The slides are washed
before use. Two methods of washing work well. The first is washing
3.times. in milliQ water at room temperature. The second is washing on the
Amersham Slide Processor (ASP). The following wash protocol was
used.
[0625] ASP Wash Protocol
[0626] HEAT To 25 degrees
[0627] MIX Wash 1 (1.times.SSC/0.2% SDS) 5 or 10 minutes
[0628] PRIME Prime with wash 2 (0.1.times.SSC/0.2% SDS)
[0629] FLUSH Wash 2
[0630] MIX Wash 2 30 seconds or 1 minute
[0631] FLUSH Wash 3 (0.1.times.SSC)
[0632] MIX Wash 3 30 seconds or 1 minute
[0633] PRIME Prime with wash 4 (0.1.times.SSC)
[0634] FLUSH Wash 4 (0.1.times.SSC)
[0635] PRIME Prime with isopropanol
[0636] FLUSH Flush with isopropanol
[0637] FLUSH Flush with air
[0638] AIRPUMP Dry slide
[0639] HEAT Turn off heat
[0640] The best buffers on the more expensive enhanced aminosilane
(3-Aminopropyltrimethoxysilane + 1,4-Phenylenediisothiocyanate)
slides from Asper Biotech are 50% 1.5 M Betaine 50% 3.times.SSC and
10% QMT1 spotting buffer (Quantifoil, Jena). In addition some of the
other buffers from Quantifoil (Jena, Germany) performed reasonably
well; with testing of different concentrations of these buffers
better morphology might be achievable. Detailed internal morphology
seen with epi was not good. DMSO buffer (Amersham) gave intense
"sunspots", i.e. a dot of intense fluorescence, within the spots; it
is conceivable that single molecules can be counted in the rest of
the spot, ignoring the sunspot. Spotting was tested at 43% and 55%
humidity with both conditions giving useable arrays.
[0641] For the enhanced aminosilane slides post-processing involves
an optional 2 hours at 37 degrees in a humid chamber (more molecules
stick but sometimes the spots can come out of line or merge and so
this step is preferably avoided or the spots are arrayed far enough
apart to prevent merger). This is followed by overnight (or longer)
at 4 degrees C. The slides are then dipped in 1% Ammonia solution
for 2-3 minutes. The slides are then washed 3.times. in milliQ
water and then put at 4 degrees C. overnight. There is some degree
of bleeding of dye from the spots after hybridization. This may be
addressed by more stringent or longer washing.
[0642] If the buffers in the microtitre wells dry out, they can be
resuspended again in water. The betaine buffer did not perform well
when this was done.
[0643] 50% DMSO is the best buffer for aminosilane slides. After
spotting these slides are immediately crosslinked with 300 mJoules
on a Stratagene Crosslinker. The arrays are washed in hot water
with shaking twice for two minutes and are then dunked five times
in 95% ethanol and immediately dried with forced air. Substantially
more aminated oligonucleotides stick to the surface with this slide
chemistry than other slide chemistries. Therefore less
oligonucleotide needs to be spotted to get a particular surface
density.
[0644] The spotting buffers produce significant autofluorescence in
the green range which must be removed for accurate single molecule
counting. This can be substantially removed by washing, especially
with buffers containing detergents such as SDS and Sarkosyl.
Alternatively, the green range of the spectrum is avoided, opting
for probes which fluoresce in the red range, for example.
[0645] Spreading
[0646] Arrays can be made in which the location of the molecule
does not specify the identity of the molecule until the molecules
are sequenced or an encoding is decoded.
[0647] This type of array is also characterised by the fact that
single molecules of the same identity are not necessarily found in
the same region but are arranged randomly, i.e. Sequence A may be
adjacent to Sequence B and a second occurrence of Sequence A may be
at a distal location from the first occurrence. This random
arrangement of the molecular species is due to the method used for
making the array. Although having the molecules in such a random
location does not confer any advantages, the fabrication of this
type of array is far simpler than the fabrication of an array where
many molecules of the same species are found in the same region on
the surface as is the case for DNA colonies/Polonies or DNA
microarrays.
[0648] This random aspect is a feature of many types of surface
immobilised arrays. For example, Dynamic Molecular Combing
(Michalet et al) produces random arrays, in vitro cloning
(Chetverin et al) produces random arrays and so on.
[0649] We describe particular ways of making random arrays which
are particularly suitable for single molecule applications
described in this document. Firstly it is very simple to make a
random array of any molecule of interest simply by spreading it out
on a surface to which it interacts/binds or to which it adsorbs.
For example, proteins can adsorb onto various types of surfaces. DNA
can electrostatically bind to surfaces bearing positive charges
etc., hence genomic DNA can be extracted and bound to an aminosilane
coated surface (see figure). Furthermore, it is almost as simple to
make a pool of oligonucleotides in which the sequence at one or more
positions is randomised. This can be done by providing mixes of the
nucleotides during synthesis. Such a pool can be easily spread out
on a surface to provide an array in which molecules of each species
are distributed at random locations. Spreads may be of molecules
which are viewed as a single point source of fluorescence.
[0650] Alternatively the molecules may be horizontalised and may be
visualized as polymers. A procedure for horizontalising and
substantially straightening molecules is as follows: between 10
and 100 ul of sample (e.g. Lambda DNA at a concentration of 500
ng/ml) is placed between two microscope coverslips (24.times.60 mm,
Matsunami Japan) in either TE Buffer pH 8 or HEPES/EDTA buffer pH
8. One surface is removed from the other by a lateral motion;
optionally, excess material is removed from the surfaces. Random
arrays of straightened polymer are now created on both of the two
flat surfaces. This method produces very good distributions of
molecules as compared to many other combing methods where typically
it is difficult to produce homogeneous molecular combing. The
molecules of a secondary array (see below) can also be
straightened/linearised in this way.
[0651] The following is another procedure for horizontalising and
straightening DNA. Add a 30 ul drop at one end of the slide at the
center. Use a forced air canister (Air Duster, Sapona) at an
approximately 45 degree angle from the slide surface to gently blow
the droplet from one side of the center of the slide to the other.
It is then blown off the slide. Lambda DNA is retained on
aminosilane coated slides compared to an uncoated slide.
[0652] Determining Optimal Spreading Concentration for Random
Location Arrays
[0653] For example a mix of oligonucleotides complementary to the
sticky ends of Lambda DNA (see below), each bearing a fluorescent
label, is pipetted at a concentration of 0.5 uM each in 50% DMSO
onto APTES coated slides. Antifade and a coverslip are added and the
slide is analysed to see if individual molecules are resolvable. If
not then a dilution is done, e.g. 4 fold, and then the solution is
pipetted onto the slide again and so on.
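The iterative procedure above amounts to a simple loop, sketched
here in Python. The callable `is_resolvable` is a hypothetical
stand-in for inspecting the slide under the microscope; in the
example it is replaced by an arbitrary concentration threshold
purely for illustration.

```python
from typing import Callable


def dilute_until_resolvable(start_um: float,
                            is_resolvable: Callable[[float], bool],
                            fold: float = 4.0,
                            max_steps: int = 20) -> tuple[float, int]:
    """Dilute by `fold` until the spread is judged resolvable.

    Returns the final concentration (uM) and the number of dilutions.
    """
    conc = start_um
    for step in range(max_steps + 1):
        if is_resolvable(conc):
            return conc, step
        conc /= fold
    raise ValueError("no resolvable concentration within max_steps")


if __name__ == "__main__":
    # Illustrative threshold only; the real test is visual inspection.
    print(dilute_until_resolvable(0.5, lambda c: c <= 0.01))
```

Starting from 0.5 uM with the illustrative threshold of 0.01 uM,
three 4 fold dilutions are needed, ending at about 0.0078 uM.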
[0654] Preparation of Single Molecule Chemical Arrays
[0655] Each chemical compound in the library to be tested is
synthesised with a common thiol functional group that enables
covalent attachment to the slide surface. The compounds are spotted
or spread, in DMF, onto maleimide-derivatized glass microscope
slides. Following spotting/spreading, the slides are incubated at
room temperature for 12 h and then immersed in a solution of
2-mercaptoethanol/DMF (1:99) to block remaining maleimide
functionalities. The slides are subsequently washed for 1 h each
with DMF, THF, and iPrOH, followed by a 1 h aqueous wash with MBST
(50 mM MES, 100 mM NaCl, 0.1% Tween20.RTM., pH 6.0). Slides are
rinsed with double-distilled water and dried by centrifugation. A
dilution series is done to establish optimal concentrations for
single molecule detection. The compound to be tested may further be
linked to a fluorescently detectable moiety, such as Cy3 dye.
[0656] Preparation of Single Molecule Protein Arrays
[0657] Antibody/antigen pairs are provided by BD Transduction
Laboratories (Cincinnati, Ohio), Research Genetics (Huntsville,
Ala.), and Sigma Chemical. Antibodies are chosen which are in
glycerol-free, phosphate-buffered saline (PBS) solution (137 mM
NaCl, 2.7 mM KCl, 4.3 mM Na.sub.2HPO.sub.4, 1.4 mM
KH.sub.2PO.sub.4, pH 7.4). Antibody and antigen solutions are
prepared at a concentration chosen from the range 0.0025-0.0075
mg/ml in 384-well plates, using approximately 4 .mu.l per well (a
wider range can be first tested depending on the method to be used
for analysis and the spotter that is to be used). The protein
solutions are spotted in an ordered array onto poly-L-lysine coated
microscope slides at a 375 .mu.m spacing using 16 steel tips or the
capillary tips of the Amersham Generation III spotter. The coated
slides are purchased from CEL Associates (Houston, Tex.) or are
prepared as follows.
Briefly, glass microscope slides are cleaned in 2.5 M NaOH for 2 h,
rinsed thoroughly in ultra-pure H.sub.2O, soaked for 1 hour in a 3%
poly-L-lysine solution in PBS, rinsed in ultra-pure H.sub.2O, spun
dry, and further dried for 1 h at 80.degree. C. in a vacuum oven.
The resulting microarrays are sealed in a slide box and stored at
4.degree. C. The arrays are rinsed briefly in a 3% non-fat
milk/PBS/0.1% Tween-20 solution to remove unbound protein. They are
transferred immediately to a 3% non-fat milk/PBS/0.02% sodium azide
blocking solution and allowed to sit overnight at 4.degree. C. (The
milk solution is first spun for 10 min at 10,000.times.g to remove
particulate matter). Excess milk is removed in three room
temperature PBS washes of 1 min each, and the arrays are kept in
the final wash until application of the probe solution (see
below).
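For comparison with the molar concentrations used for
oligonucleotide spotting, the protein concentrations above can be
converted to molar units. The Python sketch below assumes a molecular
weight of 150,000 g/mol, a typical value for an IgG antibody that is
not given in the source; antigens of other sizes would need their
own value. The function names are hypothetical.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
IGG_MW = 150_000.0         # g/mol; assumed typical IgG, not from the source


def mg_per_ml_to_nm(conc_mg_ml: float, mw: float = IGG_MW) -> float:
    """Convert mg/ml (numerically equal to g/L) to nanomolar."""
    return conc_mg_ml / mw * 1e9


def molecules_per_well(conc_mg_ml: float, volume_ul: float = 4.0,
                       mw: float = IGG_MW) -> float:
    """Number of protein molecules in a well of the given volume."""
    moles = conc_mg_ml / mw * volume_ul * 1e-6
    return moles * AVOGADRO


if __name__ == "__main__":
    for c in (0.0025, 0.0075):
        print(c, mg_per_ml_to_nm(c), molecules_per_well(c))
```

On this assumption the quoted range corresponds to roughly 17 to
50 nM antibody.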
[0658] Preparing Single Molecule mRNA-Polypeptide Fusion Arrays
[0659] This is a method for linking molecular genotype with
molecular phenotype. mRNAs are ligated to a sequence containing a 5'
phosphate and a 3' puromycin. The ligated products are translated in
vitro in a rabbit reticulocyte lysate kit (Ambion) for 30
minutes. The solution is adjusted to 150 mM MgCl.sub.2 and 425 mM
KCl to promote the formation of a puromycin-peptide bond. The
mRNA-polypeptide fusions are isolated by chromatography. The
fusions can then be arrayed onto a surface by any of the methods
described in this document. For example, the mRNA portion would be
able to bind to APTES surfaces by electrostatic interaction.
Alternatively the mRNA could be captured by interaction with probes
arrayed on enhanced aminosilane surface (Asper Biotech, Estonia).
This surface would help the protein to retain its functional
activity.
[0660] Making Single Molecule Arrays by In Situ Parallel
Synthesis
[0661] The glass substrate can be cleaned (and all reagents used in
the following steps should be of high purity) and then modified to
allow ON synthesis: For epoxy derivatisation the following steps
are taken. Prepare a mixture of 3-glycidoxypropyl trimethoxysilane
(98%) (Aldrich), di-isopropylethylamine, and xylene (17.8:1:69, by
volume) in a glass cylinder. Place the glass substrate in the
mixture so that it is completely immersed and incubate at 80 C for
9 hours. Remove the glass from the mixture, allow it to cool
to room temperature and wash with ethanol and ether by squirting
liquid from a wash bottle. For adding a spacer: Incubate the glass
substrates in hexaethylene glycol (neat) containing a catalytic
amount of sulphuric acid (approx. 25 ul per litre) at 80 C for 10
hours with stirring. Remove the glass substrates, allow them to
cool to room temperature and wash with ethanol and ether. Air Dry
the plates and store at -20 C.
[0662] The array of ONs complementary to for example, yeast
tRNA.sup.phe is created by coupling nucleotide residues in the
order in which they occur in the complement of the target sequence
using a reaction cell pressed against the surface of a glass
plate/slide (Knittel Glazer, Germany) which is modified (see
above).
[0663] The fluidics from an ABI 394 DNA synthesizer are coupled into
the reaction cell through inlet and outlet ports (instead of
coupling to CPG columns). The DNA synthesizer is programmed with the
following cycle (for a diamond-shaped reaction chamber with 30 mm
diagonal and 0.73 mm depth):
TABLE 1. Program for ABI394 DNA/RNA synthesizer to deliver reagents
for one coupling cycle.

Step number   Function number   Function name       Step time (s)
1             106               begin
2             103               wait                999
3             64                18 to waste         5
4             42                18 to column        25
5             2                 reverse flush       8
6             1                 block flush         5
7             101               phos prep           3
8             111               block vent          2
9             58                tet to waste        1.7
10            34                tet to column       1
11            33                B + tet to column   3
12            34                tet to column       1
13            33                B + tet to column   3
14            34                tet to column       1
15            33                B + tet to column   3
16            34                tet to column       1
17            103               wait                75/140/
18            64                18 to waste         5
19            2                 reverse flush       10
20            1                 block flush         5
21            42                18 to column        15
22            2                 reverse flush       10
23            63                15 to waste         5
24            41                15 to column        15
25            64                18 to waste         5
26            1                 block flush         5
27            103               wait                20
28            2                 reverse flush       10
29            1                 block flush         5
30            64                18 to waste         5
31            42                18 to column        15
32            2                 reverse flush       9
33            42                18 to column        15
34            2                 reverse flush       9
35            42                18 to column        15
36            2                 reverse flush       9
37            42                18 to column        15
38            2                 reverse flush       9
39            1                 block flush         3
40            62                14 to waste         5
41            40                14 to column        30
42            103               wait                20
43            1                 block flush         5
44            64                18 to waste         5
45            42                18 to column        25
46            2                 reverse flush       9
47            1                 block flush         3
48            107               end
[0664] An interrupt is set at step 1 of the next base to allow the
operator (or automated x-y stage) to move the substrate one
increment and restart the program. A long wait step at the
beginning of the program is optional and is introduced if the
operator does not wish to use the interrupt step. The operator is
also advised to consult the user's manual for the DNA synthesizer.
The operator is also advised to ensure there are enough reagents in
the reagent bottles to last the run and to check the run of fluids
through the base lines (e.g. the G line may need to be continuously
flushed with acetonitrile for several minutes to ensure clear flow
through).
[0665] The movement can be achieved by attaching the substrate to a
High Precision TST series X-Y translation stage (Newport), and the
sealing of the reaction cell is controlled in the X axis with a
stepometric stage (Newport) fitted with a load cell. These
devices can be controlled by software created in Labview (National
Instruments) on an IBM compatible personal computer.
[0666] After each base coupling, the synthesis is interrupted and the
plate is moved along by a fixed increment. The array can be made
using "reverse synthons", i.e. 5' phosphoramidites, protected at
the 3' hydroxyl, leaving 5'-ends of the ON tethered to the glass.
The first base is then added at the right-most position. The
diameter of the reaction cell is 30 mm and the offset at each step
to the left is 2.5 mm. The result is that after 12 steps, an ON
complementary to bases 1-12 of the tRNA.sup.phe has been
synthesised in a patch 2.5 mm wide, 11.times.2.5=27.5 mm from the
right of the plate, where the 12 footprints of the reaction cell
all overlap. At this point the footprint of the reaction cell
passes on and adds the 13.sup.th base, so that the next patch
contains the 12-mer corresponding to bases 2-13. The process
continues until, in this example, all 76 bases of the tRNA.sup.phe
are represented along the centre of the plate. Depending on the
shape of the reaction cell, the following shorter
oligonucleotides are also present on the array: all 11-mers are
in the cells flanking the 12-mers, the next row of cells contains
10-mers, and so on to the edge rows, which contain the 76
mononucleotides complementary to the sequence of the tRNA.sup.phe.
For functionalisation the protecting groups on the exocyclic amines
of the bases must be removed by Ammonia treatment. In addition this
process strips oligonucleotides from the surface of the array and a
long enough incubation reduces the density of probes to the level
that single molecules can be individually resolved. To reduce the
high density array to single molecule arrays, place the glass
substrate, array side up, into a chamber that can be very tightly
sealed. Add 30% high Ammonia into the chamber to cover the slides.
Tightly seal the chamber and place in a water bath at 65 C for 24
hours or at 55 C for 4 days. The temperature and incubation period
can be adjusted depending on the density of molecules that is
required (which would be defined by the method of detection, e.g. far
field or near-field). Cool before opening the chamber. The array can be
rinsed with milliQ water and is ready for use in hybridisation or
ligation experiments (after enzymatic phosphorylation) if standard
amidites are used. If, as in this example, reverse synthons are used
then the array can be used for hybridisation, ligation or primer
extension.
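The patch layout produced by stepping the reaction cell can be checked
with a short calculation. The following is a minimal sketch only: the
function name is hypothetical, and it assumes a one-dimensional strip
along the centre line of the plate, with the footprint spanning the full
30 mm there and moving 2.5 mm after every coupling.

```python
def centre_line_patches(n_bases, cell_width_mm, step_mm):
    """Return (offset_mm, first_base, last_base) for each patch on the centre line.

    Assumption: the footprint covers cell_width_mm along the direction of
    travel and is shifted by step_mm after every base coupling.
    """
    overlaps = round(cell_width_mm / step_mm)  # footprints that cover one full patch
    patches = []
    for j in range(n_bases + overlaps - 1):
        first = max(0, j - overlaps + 1)  # earliest coupling step reaching patch j
        last = min(j, n_bases - 1)        # latest coupling step reaching patch j
        patches.append((j * step_mm, first + 1, last + 1))
    return patches


patches = centre_line_patches(n_bases=76, cell_width_mm=30, step_mm=2.5)
full_length = [p for p in patches if p[2] - p[1] + 1 == 12]
print(patches[11])       # offset and base range of the first 12-mer patch
print(len(full_length))  # number of patches carrying a full 12-mer
```

Under these assumptions the first full-length patch sits 27.5 mm from the
starting edge and carries bases 1-12, in agreement with the text; patches
nearer either end of the strip carry shorter oligonucleotides.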
[0667] As an alternative to the destructive ammonia method, the
first base coupling in the array can be mixed with monomer amidite
containing a blocking group such as the base-labile protecting
group 9-fluorenylmethoxycarbonyl (Fmoc) in a 1:1000 ratio (it is
preferable to first optimise this by coupling patches on the same
surface with different ratios of mixtures to determine optimal
molecule separation for each kind of single molecule detection
experiment). As the Fmoc group is not labile to the acid which is used to
remove the dimethoxytrityl protecting group in the standard
chemistry, it will not be removed and therefore will not allow any
further chain extension. If the Fmoc amidite is in excess it will
limit the number of chains that can be synthesised. If desired the
Fmoc group can be deprotected at the end of chain synthesis and
functionalised with for example a group carrying a negative charge.
This will help repel any non specific binding of nucleic acids and
their monomers.
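The effect of the mixing ratio on molecule separation can be estimated
roughly as follows. This is a sketch under stated assumptions, not a
measured result: it assumes that the standard and Fmoc-blocked amidites
couple with equal efficiency, that the 1:1000 ratio means one extendable
amidite per 1000 blocked amidites, that the surviving chains are randomly
distributed, and that the starting site density (here a purely
hypothetical 10,000 sites per square micron) is known.

```python
import math


def extendable_fraction(extendable_parts, blocked_parts):
    """Fraction of surface sites that receive an extendable first base."""
    return extendable_parts / (extendable_parts + blocked_parts)


def mean_nearest_neighbour_nm(density_per_um2):
    """Mean nearest-neighbour distance for randomly placed points in 2D.

    For a Poisson point pattern of density d the mean distance is 1 / (2 * sqrt(d)).
    """
    return 1000.0 / (2.0 * math.sqrt(density_per_um2))


start_density = 10_000.0  # hypothetical sites per square micron before dilution
fraction = extendable_fraction(1, 1000)
spacing_nm = mean_nearest_neighbour_nm(start_density * fraction)
print(f"extendable fraction: {fraction:.6f}")
print(f"mean spacing: {spacing_nm:.0f} nm")
```

The spacing obtained this way can be compared with the resolution of the
intended detection method when choosing which ratios to test empirically.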
[0668] An in situ DNA synthesizer, geniom one (Febit, Mannheim,
Germany) is commercially available. DNA synthesis on this machine
can be modified to make single molecule arrays. Alternatively, once
the arrays are made the channels can be flushed with destructive
ammonia treatment.
[0669] Methods have also been described for preparing arrays of
peptides by spatially addressable synthesis.
[0670] Further Steps in the Preparation of Arrays
[0671] Functionalising Arrays
[0672] Molecules of an array may be at too high a density to be
individually resolvable but then the array may be functionalised so
that the molecules that are detected are far enough apart to be
individually resolved. This can be done as described above by
destructive ammonia treatment. Alternatively, only a fraction of
the molecules, each far enough from the other to be individually
resolvable may be labelled and it is only these that are detected.
This fractional labelling can be determined by the analyte
molecules, which may be labelled and are at such a concentration
that they bind to the array sparsely. Although the array
molecules are closer together than the minimum distance needed to be
individually resolvable, their interaction with the labelled analyte
molecules functionalises a fraction of the array molecules that
are far enough apart to be individually resolved.
[0673] For example, as in the example given above in which
oligonucleotides complementary to Lambda DNA are spread on a surface
at a concentration of 0.5 uM, Lambda DNA at a concentration of 10
ug/ml is found to hybridise at a density that enables each
individual Lambda molecule to be individually resolved, even as the
probe molecules themselves are too close to be individually
resolvable by standard optical techniques.
[0674] Making Double Stranded Arrays
[0675] Any of the primary arrays of this invention that are single
stranded can be made double stranded. A pool of all sequences of
target length can be hybridised to the array (Buffer: 3.5 M TMACL
at room temperature for 17 mers) to make it double stranded.
Alternatively a common sequence is included on all molecules of the
array such that a primer binds and initiates synthesis of a
complementary strand.
[0676] Making Array Copies
[0677] Once a double strand array has been made as described above,
the strand that is not linked to the surface, can be denatured
using hot 0.1 M Alkali Buffer and then transferred to another
surface to make a complementary copy array.
[0678] Secondary Arrays
[0679] Secondary arrays can be made where further molecules bind to
a primary array. The further molecules are the functional molecules
of the array. Molecules may be viewed as a single point source of
fluorescence. Alternatively the molecules may be horizontalised and
visualized as polymers. The capture process not only enables a
homogenous and reproducible spread of molecules on a surface, it
can also enrich molecular species of interest according to
sequence. For example, all molecules of the array may include a
sequence complementary to a sequence motif present in a particular
gene family or may target telomeres using probes complementary to
the short repetitive sequences, e.g. TTAGGG in humans, found
therein.
[0680] Target Capture
[0681] If a repertoire of single stranded oligonucleotide probes
are arrayed or spread out onto a surface they can serve as capture
probes either to target molecules bearing sticky ends (to which
they may become ligated) or by sequence-specifically binding along
a target single or double-stranded molecule under appropriate
conditions.
[0682] An array of "sticky" probes can be created by designing and
purchasing customized oligonucleotides (e.g. from MWG Biotech).
Firstly, a binary oligonucleotide repertoire, A, is created in which
each oligonucleotide consists partly of a fixed sequence and partly of a
randomized sequence. A second oligonucleotide, B, is provided which
binds by complementary base pairing to only the fixed sequence on
the oligonucleotides of the repertoire A. This process may be carried
out entirely in solution and then the complex spread out on the
surface. Alternatively, one of A or B is first spread out on the
surface and then the other is reacted with it. Both the above
procedures are done under conditions that enable
annealing/hybridisation, for example in 4.times.SSC 0.2%Sarkosyl or
3.5M TMA at a temperature determined by Tm. The binding of
oligonucleotide pool A with oligonucleotide B creates a repertoire
of cohesive or sticky ends. These sticky ends are able to bind the
termini of DNA molecules.
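The composition of such a repertoire can be enumerated in software before
ordering. The sketch below is illustrative only: the fixed sequence, the
length of the randomized part and its placement after the fixed part are
hypothetical choices, not values taken from this document.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def sticky_probe_repertoire(fixed, n_random):
    """Yield every A oligonucleotide: the fixed part followed by each random part."""
    for tail in product("ACGT", repeat=n_random):
        yield fixed + "".join(tail)


fixed = "GATCCGTTAGC"                # hypothetical fixed sequence shared by all of A
oligo_b = reverse_complement(fixed)  # B pairs only with the fixed part
repertoire = list(sticky_probe_repertoire(fixed, n_random=4))
print(len(repertoire), "members; oligonucleotide B:", oligo_b)
```

Once B is annealed, the randomized part of each member of A remains
single stranded and forms the sticky end; a randomized part of n bases
gives 4 to the power n distinct ends.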
[0683] In another approach, the second part of the binary
oligonucleotide does not comprise a repertoire of sequences but
instead contains a single sequence that is complementary to a
restriction digested sticky end. These sticky ends can capture
complementary sticky ends of DNA digested with the appropriate
restriction endonuclease. Hence, sample genomic DNA is digested
with for example Not 1 Restriction endonuclease, generating sticky
ends which then interact with the array capture probes.
[0684] Once a sticky end interaction has occurred, a ligation
reaction can be performed to covalently immobilise the target to the
probes which are firmly attached to the surface. For ligation to
occur between 5' and 3' termini, the desired 5' termini must bear a
terminal phosphate group or should be phosphorylated enzymatically
using T4 Polynucleotide Kinase (New England Biolabs) as described
by vendor.
[0685] The target may first be hybridised to the array in
4.times.SSC/Sarkosyl, unbound material removed or diluted and then
ligation performed. Alternatively, the target can be directly
ligated with no prior hybridisation and washing/dilution step.
[0686] Where the array comprises single stranded probes and the
sticky end is provided by the target, only one strand of the target
becomes covalently linked to the surface probe. Hence the
non-covalently linked strand can be denatured e.g. by heating or by
Alkali treatment, leaving a single stranded secondary array.
[0687] Where the array has comprised sticky probes binding to
sticky ends in the target then if the 5' termini of both sticky
ends bear a phosphate group then both strands of the target duplex
become immobilised. If desired one strand can be removed by
addition of an exonuclease that degrades 5' free termini or an
exonuclease that degrades 3' free termini, depending on which
strand is desired to be retained on the array. One set of termini
is protected from degradation due to their attachment to the
surface.
[0688] It may be desirable in some instances to enable only one
covalent link between sticky ends of the target and the sticky
probes. In this case it is ensured that one set of sticky ends does
not contain a phosphate group on the 5' termini (phosphate groups
can be removed by treatment with Shrimp Alkaline phosphatase (New
England Biolabs) according to vendor suggested protocol). Such a
structure enables complementary strand synthesis by Nick
translation, for example. The array sticky probes can also be
designed in a way that there is not a flush fit between the sticky
partners, in that there is a gap left between one strand of the
target and one strand of the sticky probes so that a ligation
reaction cannot occur between them.
[0689] It is desirable to dephosphorylate the Not1 digested DNA (as
described above) to prevent self-ligation prior to ligation to the
array.
[0690] When the array is composed of sticky probes and binds to a
single stranded target or a double stranded target which is
recessed at the end, then only one target strand terminus becomes
covalently attached by ligation.
[0691] If single stranded DNA must be captured then measures need
to be taken to make single stranded DNA e.g. by cloning the genomic
library of fragments into single stranded M13 vector (see Sambrook
et al) or by other means described elsewhere in this document. When
the target molecule is captured and is or is made single stranded,
various assays including sequence determination can be carried out
on the single stranded molecules. Where sticky probes have been
used, the synthesis of a complementary strand can be primed by an
oligonucleotide of the sticky probe. This synthesis may be by
contiguous ligation of an oligonucleotide sequence, for example, in
order to assay repetitive sequences or it may be by contiguous
ligation from a repertoire of oligonucleotides for DNA sequencing
procedures described in this document. The sticky probes ensure
that as the new strand is synthesised both it and the template
remain in the same vicinity irrespective of whether harsh
treatments that may denature hydrogen bonds, are performed. If this
was not the case certain harsh treatments may delocalise one strand
from the other and undermine the continuity of sequence
acquisition.
[0692] A typical ligation reaction on surface is described by
Gunderson et al (Genome Res 8 1142-53, 1998) and Pritchard and
Southern (Nucleic Acid Research 25:3483) have described ligation
reactions using Tth DNA ligase (Epicentre).
[0693] The addition of 10 mM MgCl.sub.2 facilitates target
capture.
[0694] Capture and Combing of Long DNA Polymers
[0695] After the above capture reactions, with or without ligation
the target molecules can be horizontalised on the surface.
[0696] Capture of Sticky Ends and Horizontalisation
[0697] Linear Lambda DNA has complementary 12 base overhangs at
each end which can anneal to circularise the DNA. The following
oligonucleotides complementary to each end overhang are used in the
following examples:
Lambda A: 5' GGG CGG CGA CCT 3'
Lambda B: 5' AGG TCG CCG CCC 3'
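That the two oligonucleotides listed above are complementary to one
another, as expected for probes against the two mutually complementary
12 base overhangs, can be confirmed with a few lines of code. The helper
name below is an illustrative choice.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


lambda_a = "GGGCGGCGACCT"  # Lambda A, spaces removed
lambda_b = "AGGTCGCCGCCC"  # Lambda B, spaces removed

print(reverse_complement(lambda_a) == lambda_b)  # True
```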
[0698] Surface immobilised probes capture a target and the target
can become stretched out on a surface. Capture probes for lambda
DNA sequence Lambda A and Lambda B, complementary to each of sticky
ends of linear lambda were spotted in microarrays or spread on a
surface. Spots containing completely unmatched sequences were
included in the microarray. One set of A and B oligonucleotides
were modified with amine and two further A and B oligonucleotides
were modified with biotin. Amersham UV Crosslinking reagent
(containing DMSO), mixed with an equal volume of
oligonucleotide dissolved in milliQ H.sub.2O, was used to spot these
probes onto an aminosilane modified slide (Asper, Estonia). After
spotting, the slides were crosslinked at 300 mJoules followed by
two washes in hot water, followed immediately by drying by blowing
with forced air from a pressurised airduster canister. The
oligonucleotides were spotted at 5 uM and 500 nM concentrations
(using spot diameter setting 255 microns, spots per dip: 72, 55%
humidity on the Amersham Pharmacia Generation III spotter). Lambda
DNA (20 ul; 40 ug/ml) was incubated with 3 ul YOYO (neat) (Molecular
Probes, Oregon). The solution was then brought up to 1 millilitre
in 4.times.SSC 0.2% Sarkosyl. 250 ul of this was added to the
Amersham Slide Processor (ASP) for a 12 hour hybridization protocol
(see ASP protocol B, below). The cycle included a series of
stringency washes, isopropanol flow and air drying. The flowing of
the solutions and the air drying contribute to the
horizontalisation and straightening out of the DNA.
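The amount of Lambda DNA applied to the slide in this protocol follows
from the stated volumes. The sketch below works through the arithmetic;
the genome length of 48,502 base pairs and the average mass of 650 g/mol
per base pair are assumptions introduced here for the estimate, and the
function names are illustrative.

```python
AVOGADRO = 6.022e23
LAMBDA_BP = 48_502        # assumed length of the Lambda genome in base pairs
G_PER_MOL_PER_BP = 650.0  # assumed average mass of one base pair


def diluted_conc_ug_per_ml(stock_ug_per_ml, stock_ul, final_ml):
    """Concentration after bringing stock_ul of stock up to final_ml."""
    return stock_ug_per_ml * (stock_ul / 1000.0) / final_ml


def molecules(mass_ug, length_bp):
    """Approximate number of double-stranded molecules in a given mass."""
    moles = (mass_ug * 1e-6) / (length_bp * G_PER_MOL_PER_BP)
    return moles * AVOGADRO


conc = diluted_conc_ug_per_ml(40, 20, 1.0)  # ug/ml in the 1 ml hybridisation mix
loaded_ug = conc * 0.250                    # mass in the 250 ul added to the ASP
print(f"{conc:.2f} ug/ml, {loaded_ug:.2f} ug loaded, "
      f"about {molecules(loaded_ug, LAMBDA_BP):.1e} molecules")
```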
[0699] An alternative for horizontalising DNA is manual flushing
with wash reagents and isopropanol or methanol, with the slide in a
vertical position. This can be done in a "Sequenza" coverplate
apparatus used for immunostaining (Shandon, USA). Alternatively,
the slide can be held at a 60 degree angle from the horizontal and
solutions can be washed over, ensuring the solution covers all the
slide.
[0700] The slide was analysed by epi-fluorescence microscope by
pipetting 30 ul Fluoromount G under a coverslip and viewing on an
upright epi-fluorescence microscope (Olympus BX51) fitted with a
Sensys CCD camera and MetaMorph imaging software (Universal Imaging
Corporation). 10.times. Objective was used for wide field viewing
and 60.times. and 100.times.1.3 NA oil immersion lenses were used
to view microarray spots. DNA fibres were clearly visible. Better
images of DNA fibres were obtained after removing the coverslip in
PBS/Tween, staining with YOYO, washing with PBS/Tween and adding
Fluoromount G.
[0701] Lambda DNA becomes immobilised and combed to spots
containing sequence A and not to non-matched sequences. Mismatch
probes bind with lower yield. It is also found that
oligonucleotides that are complementary to double stranded regions
of Lambda do not capture the lambda DNA efficiently. However, the
efficiency is improved upon addition of helper oligonucleotides
which bind elsewhere along the duplex to facilitate binding of
internal probes.
[0702] Molecules other than linear Lambda can be horizontalised and
straightened in this way. For example, sticky ends can be
generated in human genomic DNA with the infrequent base cutter Not1
(as already described), which produces fragments of an average 65
kb length, which is close to the 50 kb length of Lambda DNA. Human
genomic DNA fragmented in this or any other way can be spread on a
surface to produce a spatially random human genomic array. Prior
to Not 1 digestion repetitive sequences can be substantially
removed by the methods described elsewhere in this document.
Alternatively, after immobilisation, where the DNA is single
stranded repetitive DNA can be suppressed by hybridisation of
unlabelled Cot-1 DNA.
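The average fragment length of about 65 kb quoted for Not1 corresponds to
the spacing expected for an eight-base recognition site in random
sequence. The sketch below shows the calculation; it assumes equal
frequencies of the four bases and independent positions, which real
genomes only approximate, so observed fragment sizes may differ.

```python
def expected_fragment_length_bp(site_length):
    """Mean spacing between occurrences of a site of the given length.

    Assumes random sequence with equal base frequencies, so a specific
    site of n bases occurs on average once every 4**n bases.
    """
    return 4 ** site_length


print(expected_fragment_length_bp(8))  # eight-base site such as that of Not1
print(expected_fragment_length_bp(6))  # six-base site, for comparison
```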
[0703] Sticky ends can also be generated for capture by using the
restriction endonuclease TSPR1 (NEB), according to vendor protocol,
using vendor supplied buffer. This generates 9 base overhangs. The
recognition sequence is redundant at a number of positions. A
spatially addressable array can be made covering this sequence
space. Hybridisation of TSPR1 digested genomic DNA will enable
genomic DNA to be sorted according to the redundant sequences in
the TSPR1 recognition sequence.
[0704] Capturing Sites in Double Stranded Regions
[0705] To enable capture at internal sites in DNA one of the
following procedures can be used: List A
[0706] 1) Locked Nucleic Acids (LNA) are able to form high
stability interactions with nucleic acid targets. Custom sequences
can be ordered from Eurogentec (Belgium). Software tools for
prediction of LNA Tms are available at www.LNA-tm.com. The target
can be partially denatured by high stringency conditions known in
the art such as elevated temperature, 100 mM NaCl. Under these
conditions LNA is able to bind the target DNA, whereas a normal
DNA probe would likely be displaced again by target renaturation; LNA
is able to compete more effectively with renaturation of the target
duplex.
[0707] 2) Peptide Nucleic Acids (PNAs), which have neutral backbones,
are able to react with DNA under very low salt concentrations. The
target can be partially denatured by high stringency conditions
known in the art such as elevated temperature, 0-100 mM cation.
Under these conditions PNA is able to bind the target DNA, whereas
a normal DNA probe would likely be displaced again by target
renaturation; PNA is able to compete more effectively with
renaturation of the target duplex. Tools for the design of PNA
probes (including PNA molecular beacons) are available at
www.bostonprobes.com. For the design of PNA probes, also see Kuhn et
al, J Am Chem Soc. 2002 Feb. 13;124(6):1097-103, and Orum, H.;
Nielsen, P.; Jorgensen, M.; Larsson, C.; Stanley, C.; Koch, T.
Biotechniques 1995, 19, 472-480.
[0708] 3) Enzymatic reactions. The ligation and polymerase
reactions described in this invention aid in binding to targets, by
capturing and stabilising transient interactions.
[0709] 4) Padlock Probes. Padlock probes are DNA sequences in which
the probe segments are arranged so that, on binding to the target,
they are ligated around the target template and
become topologically locked to the target. This reaction can be
done at high temperature, enabling the padlock probe to react with
the target. Because it is locked to the target, it effectively
cannot be displaced by renaturation of the target. The padlock
probe may contain biotin linkages which can be used for
labelling.
[0710] 5) Helper Molecules
[0711] Helper Molecules are prepared by digesting the target DNA
and then adding this to non digested target DNA and renaturing and
then allowing brief annealing and optional snap-cooling. This
generates full length molecules in which internal regions are
looped out due to the binding of the digested fragments. The looped
out regions are single stranded and hence able to interact with the
array probes. Alternatively, the helper oligonucleotides may be PNA
sequences complementary to the array oligonucleotide (forming a P-D
loop). An RNA Helper molecule can also be used under appropriate
conditions (Formamide/SSC).
[0712] 6) Long Capture Probes. The capture probe may be a long
molecule and thereby able to effectively compete with renaturing of
the target DNA. For example, capture probes of up to around 100
nucleotides in length can be synthesised (Oswel,UK, Xeotron,
USA).
[0713] 7) RecA:
[0714] Double-stranded DNA (this method is not applicable when the
target is single stranded) can be probed by the RecA mediated
reaction. Aizawa and co-workers as well as others have probed
non-denatured ds DNA by using the RecA mediated strand invasion
reaction. Essentially, this published protocol [Seong G H, Niimi T,
Yanagida Y, Kobatake E, Aizawa Anal Chem 2000 Mar.
15;72(6):1288-93] can be followed with little modification.
[0715] Capture of Single Stranded Nucleic Acids
[0716] Genomic DNA comes in a double stranded form and steps have
to be taken to make it single stranded. Denaturation can be done by,
for example, putting the DNA in a boiling water bath and/or raising
the pH by adding, for example, NaOH or other alkali treatment (this
may also fragment the DNA, which may be desirable). However, in this
case renaturation will compete with the desired target-probe
interaction. Even when single stranded nucleic acids are obtained they
are problematic because they can form internal base pairings (secondary
structure) which compete with the target-probe interactions. Hence
some of the approaches described above for capturing internal sites
in double stranded DNA (List A) are useful for capturing sites in
ssDNA as well.
[0717] Making Single Stranded DNA/RNA
[0718] One method for probing when secondary array is made with
single stranded DNA.
[0719] Single strands are made e.g. by asymmetric (long range) PCR,
magnetic bead methods, selective protection of one strand from
exonuclease degradation or by in vitro RNA transcription.
[0720] Alternatively one strand can be degraded. For example, T7 gene
6 exonuclease is able to degrade the DNA from one of the DNA termini
but not the other. As one 5' end is attached to the surface it is
protected from degradation, enabling asymmetric degradation. After a
certain length of degradation, sequencing can be carried out on the
exposed single strand.
[0721] Single stranded DNA can be hybridised to the array in
4.times.SSC/0.2% Sarkosyl buffer at room temperature for 25 mers,
which may be facilitated by enzymatic reactions such as ligation
or by a coaxially stacking oligonucleotide or stacking of several
contiguous oligonucleotides. Sites that are known to remain
accessible to probing under low stringency conditions are
preferably chosen for probing (these can be selected on
oligonucleotide arrays; see Milner et al, Nat Biotechnol. 1997
June;15(6):537-41).
[0722] After hybridisation the single strand is covalently attached
at site of capture and then washed stringently to remove secondary
structure.
[0723] The captured single stranded target can then be stretched
out as described by Woolley and Kelly (Nanoletters 2001 1: 345-348)
by moving a droplet of fluid across a positively charged
surface.
[0724] If necessary, the density of positive charge on the surface
can be controlled by coating with 1 ppm poly-L-lysine. The
appropriate concentrations of other surface coatings, e.g.
aminosilane, need to be determined empirically.
[0725] ssDNA can be maintained at low ionic strength using 10 mM
Tris, 1 mM EDTA pH 8 (TE buffer).
[0726] Move droplet of fluid across the surface at a velocity of
Approx. 0.5 mm/s (within range 0.2-1 mm/s). This can be done by
fixing the slide/mica onto a TST series translation stage
(Newport), placing a droplet of fluid onto this, and translating
the fluid with respect to the surface by dipping a stationary glass
pipette onto the droplet. The glass pipette attracts the droplet by
capillary action and the droplet remains stationary as the
slide/mica is moved. After the solution evaporates, rinse the mica with
water and dry with compressed air. The dynamic
molecular combing procedure as described (Michalet et al) or the ASP
procedure described above can also be used. Optionally the single stranded
DNA can be coated with single strand binding protein (Amersham).
Single stranded DNA can be labelled with acridine dye or Sybr Gold
(Molecular Probes). The stretched out single stranded molecule can be
probed with single stranded DNA by hybridisation at 5 degrees C
below the Tm of the oligonucleotide probe. It is preferable to use
LNA oligonucleotides or PNA at 0 or up to 100 mM NaCl. The salt
concentration is kept low to minimise intrastrand base pairing.
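A hybridisation temperature 5 degrees C below the Tm can be estimated for
a short DNA probe as sketched below. The Wallace rule used here (2 degrees
per A or T, 4 degrees per G or C) is a rough approximation substituted for
illustration; it is not a method given in this document, it takes no
account of salt concentration, and it does not apply to LNA or PNA
probes, for which dedicated design tools should be used.

```python
def wallace_tm(seq):
    """Rough Tm in degrees C for a short DNA oligonucleotide (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hybridisation_temperature(seq, offset=5):
    """Temperature set 'offset' degrees C below the estimated Tm."""
    return wallace_tm(seq) - offset


probe = "GGGCGGCGACCT"  # the Lambda A 12-mer, used only as an example
print(wallace_tm(probe), hybridisation_temperature(probe))
```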
[0727] Capturing of mRNA
[0728] mRNA bearing a PolyA tail can be captured and enriched from
other nucleic acids by using oligo d(T) capture probes.
[0729] Using Arrays
[0730] Target Preparation.
[0731] Remove the Cot 1 fraction as described and/or add Cot 1 DNA to
the reaction mix. Digest the genome with Not1 restriction enzyme
(NEB) as recommended by the supplier.
[0732] Separate by affinity capture with a biotinylated probe
(preferably LNA) complementary to sticky end generated by Not1 on a
magnetic bead. Alternatively fragments are obtained using DNAse1.
Alternatively, target preparation is by the Random Primer labelling
protocol given above with the reaction optimised to give long
fragments.
[0733] Digestion may be with another restriction enzyme, for example
EcoR1, which would produce shorter DNA fragments.
[0734] Alternatively fragments are obtained using DNAse1.
Alternatively, target preparation can be by the Random Primer
labelling protocol given elsewhere in this document with the
reaction optimised to give long fragments.
[0735] If single stranded DNA is to be captured then measures need
to be taken to make single stranded DNA, e.g. by cloning the genomic
library of fragments into single stranded M13 vector (see Maniatis)
or by other means described elsewhere in this document. After any of
the above procedures, when the target molecule is captured and made
single stranded, various assays including sequence determination
can be carried out on the single stranded molecules. Where sticky
probes have been used, the synthesis of a complementary strand can
be primed by an oligonucleotide of the sticky probe. This synthesis
may be by contiguous ligation of an oligonucleotide sequence, for
example, in order to assay repetitive sequences or it may be by
contiguous ligation from a repertoire of oligonucleotides for DNA
sequencing procedures described in this document. The sticky probes
ensure that as the new strand is synthesised both it and the
template remain in the same vicinity irrespective of whether harsh
treatments that may denature hydrogen bonds, are performed. If this
was not the case certain harsh treatments may delocalise one strand
from the other and undermine the continuity of sequence
acquisition.
[0736] A typical ligation reaction on surface is described by
Gunderson et al (Genome Res 8 1142-53, 1998) and Pritchard and
Southern (Nucleic Acid Research 25:3483) have described ligation
reactions using Tth DNA ligase (Abgene).
[0737] Hybridisation Assay on Arrayed Single Nucleic Acid
Molecules
[0738] Hybridisation is a central feature of many procedures
described in this invention. When the interacting sequence is
short, hybridization requires different conditions than when the
interacting sequences are long. For example, typically DNA in the
100s of base pairs range can be hybridized at a temperature above
65 degrees C. in a variety of buffers, containing SSC and
optionally formamide. Other components known in the art (see
Molecular Cloning, Sambrook et al) may also be included.
[0739] Where one of the interacting components is short, lower
temperatures (less stringent) need to be used and problems of
target renaturation and secondary structure formation must be taken
into account.
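By way of illustration only, the dependence of hybridisation temperature on probe length and base composition can be estimated with the widely used rule of thumb Tm = 2(A+T) + 4(G+C) degrees C. for short oligonucleotides. The sketch below is not part of the described method: the function name and the example 13 mer are hypothetical, and the rule is a rough estimate that ignores salt and formamide concentration.

```python
def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature (degrees C) for a short oligonucleotide.

    Uses Tm = 2*(A+T) + 4*(G+C); intended only as a rough guide for short probes.
    """
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Hypothetical 13 mer, used only to show the calculation.
example_13mer = "ACGTTGCATGCAA"
print(wallace_tm(example_13mer))
```

An estimate of this kind gives only a starting point; the hybridisation temperature actually used would still be determined empirically.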
[0740] A simple array containing the biallelic probe set for two
SNP sequences of human TNF alpha promoter was tested. The array
probes were designed with the polymorphic base at the centre of a
13 mer sequence. The array contained a dilution series of the
biallelic probe set. One of two oligonucleotides with a Cy3 label at
the 5' end, complementary to one of the two biallelic probes, was
hybridised to the single molecule array. Spots down the dilution
series were analysed, and single molecule counting was done.
Resolution of molecules at higher concentrations is possible by
optimising the set up and by software for deconvolution. BSA,
casein, other blocking solutions, carrier DNA, tRNA or NTPs can be
added to the hybridisation mix, or a pre-hybridisation can be done, to block
non-specific binding. More detectable point source signal was seen
from the perfect match than from the mismatch.
[0741] The addition of Mg.sup.2+ can facilitate hybridisation in
some instances.
[0742] The Automated Slide Processor from Amersham Pharmacia was
used for hybridisation. The hybridisation cycle for hybridisation of
oligonucleotides to 13 mer oligonucleotides on the array is given
below.
[0743] ASP Hybridisation Protocol
[0744] PRIME Prime with wash 1
[0745] WAIT Inject probe
[0746] HEAT To 25 degrees C.
[0747] MIX Hybridisation mixing for 2-12 hrs
[0748] FLUSH Wash 1 (1.times.SSC/0.2% SDS)
[0749] HEAT To 30 degrees C.
[0750] MIX Wash 1 5 minutes
[0751] PRIME Prime with wash 2 (0.1.times.SSC/0.2% SDS)
[0752] FLUSH Wash 2
[0753] MIX Wash 2 30 seconds
[0754] FLUSH Wash 3 (0.1.times.SSC)
[0755] MIX Wash 3 30 seconds
[0756] PRIME Prime with wash 4 (0.1.times.SSC)
[0757] FLUSH Wash 4 (0.1.times.SSC)
[0758] PRIME Prime with isopropanol
[0759] FLUSH Flush with isopropanol
[0760] FLUSH Flush with air
[0761] AIRPUMP Dry slide
[0762] HEAT Turn off heat
[0763] Alternatively, a manual hybridization set up as known in the
art can be used. Briefly, a droplet of hybridization mix is
sandwiched between the array substrate and a coverslip. The
hybridization is performed in a humid chamber (edges are optionally
sealed with nail polish).
[0764] The coverslip is slid off in wash buffer and washes are done
preferably with some shaking.
[0765] The results are analysed by TIRF microscopy using oxygen
scavenging anti-fade solution.
[0766] In Situ Denaturation of Horizontalised DNA Followed by
Probing
[0767] Once a molecule is horizontalised, for many applications, it
needs to be made further available for interaction with
oligonucleotide probes.
[0768] When DNA is arrayed or captured on the surface the following
protocol (based on Zhong et al PNAS 2001 Mar. 27;98(7):3940-5.) can
be used to probe regions:
[0769] Approximately 200 ng of each probe in 20 .mu.l hybridization
mixture (50% formamide, 10% dextran sulfate, 2.times.SSC,
100 ng/.mu.l salmon sperm DNA, and 100 ng/.mu.l human Cot-1 DNA)
is denatured by boiling for 5 min. Arrayed horizontalised DNA is
denatured by incubation in 70% formamide, 2.times.SSC at 70.degree.
C. for 2 min, dehydrated through an ice-cold ethanol series (70%,
90%, and 100%) for 3 min each, and air-dried. The hybridization mixture
is applied to the arrayed horizontalised DNA and incubated
overnight at 37.degree. C. The slide is washed three times for 5
min in 2.times.SSC at 37.degree. C.
[0770] Alternatively the above protocol can be carried out using
6.2M Urea instead of the Formamide as denaturant (based on Castro
and Williams, 1997, Anal. Chem. 69:3915-3920).
[0771] The advantage of this type of protocol is that although
the DNA becomes denatured, the single strands are not able to
re-nature or form secondary structure due to the interactions that
are made with the surface.
[0772] It is preferable to apply one of the approaches provided in
List A.
[0773] The probes may be linked with labels such as 20 nm
Fluosphere nanoparticles before binding to arrayed DNA.
Alternatively, they may be biotinylated, and streptavidin linked
Semiconductor Nanoparticles can bind to them before or after the
DNA is arrayed on the surface; 45 degrees C. for 1 hour in Quantum
Dot buffer is sufficient for this. The nanoparticles can be reacted
with 1 mg/ml BSA or casein or other appropriate blocking
solution to avoid non-specific adsorption onto the glass
surface.
[0774] DNA can be array captured and probed as follows.
Dephosphorylate Lambda DNA (500 ug/ul) with calf alkaline
phosphatase (this step minimizes concatemerization and
circularization of Lambda DNA). Hybridise Lambda DNA to an array
containing probes complementary to its sticky end, using ASP
Protocol B. Optionally treat the slide with BSA or casein or other
blocking solution. Add probes and label, e.g. semiconductor
nanocrystals (Molecular Probes, Oregon), in the buffer provided by the
vendor. Wash in PBS/Tween followed by a PBS wash. Visualize the DNA and
fluorescent nanoparticles captured and horizontalised on the
array.
[0775] Probing Followed by Horizontalisation
[0776] The target DNA can be partially denatured in solution, then
probes in solution are able to bind to or invade sites in DNA,
particularly AT rich regions. LNA oligonucleotides can bind
partially denatured ds DNA in solution at temperatures, for example,
ranging from around 45 degrees C. to around 95 degrees C.,
depending on sequences and lengths. Salt concentrations higher than
100 mM can be used, e.g. 3.times.SSC or 4.times.SSC. In a similar
way PNA probes are able to hybridise, although little or no salt is
required (e.g. 40 mM NaCl or 6.2 M Urea). Once LNA or PNA probes are
bound they are able to persist on the DNA to a greater extent than
DNA probes. Alternatively, Padlock probes can be reacted onto the
DNA. These become permanently fixed. Following binding of the
probes the DNA can be combed by the methods of this invention. The
probes may be attached to labels such as 20 nm Fluospheres.
Alternatively, they may be biotinylated, and streptavidin linked
Semiconductor Nanoparticles can bind to them before or after the
DNA is arrayed on the surface; 45 degrees C. for 1 hour in Quantum
Dot buffer is sufficient for this.
[0777] Probing of Concatemerized Lambda DNA
[0778] Concatemerize Lambda DNA by mixing 2 ul Lambda DNA (500
ug/ml) with 1 ul Thermal T4 RNA ligase (Epicentre) and 8 ul
5.times.Ligase Buffer (supplied with enzyme). Incubate at 65
degrees C. for 30 minutes. Then add 8 ul of biotinylated Lambda
sequence A and B and Streptavidin coated Fluosphere mix to the
ligation reaction. Incubate for a further 30 minutes at 65 degrees
C. Incubate with YOYO for at least 20 minutes. Horizontalise the
DNA onto an untreated glass slide or dilute and incubate on an
aminosilane coated slide. Dry the slide and mount with Fluoromount G.
Horizontalisation/straightening can be done by one of a number of
different methods described in this document. Upon visualization on
an epi-fluorescence microscope a recurring sequence on the Lambda
concatemers is labelled by the Fluosphere complex (see FIG. 10b).
[0779] Ligation Assay on Single Molecule Array
[0780] Target preparation is essentially as for the SNP
typing/resequencing section and target analysis. Mix:
[0781] 5.times.ligation buffer*
[0782] Solution oligonucleotide 5-10 pmol, labelled with
fluorescent dye on the 3' end and phosphorylated on the 5' end
[0783] Thermus thermophilus DNA ligase (Tth DNA ligase) 1 U/ul,
[0784] Target sample
[0785] Add to centre of array* *
[0786] Add coverslip over the top of array area and seal edges with
rubber cement
[0787] Place at 65.degree. C. for 1 hr.
*5.times.ligation buffer is composed of 100 mM Tris-HCl pH 8.3, 0.5% Triton X-100, 50 mM MgCl.sub.2,
250 mM KCl, 5 mM NAD+, 50 mM DTT, 5 mM EDTA
[0788] ** In this example different sequences that define the
alleles of a SNP are placed in adjacent spots in the microarray, by
the spotting methods described. The last base of each of these sequences
overlaps the variant base in the target. The oligonucleotides on the
array are spotted with 5' amination. The 3' end is free for
ligation with the 5' phosphorylated solution oligonucleotide.
Alternatively the array oligonucleotide can be 3' aminated and 5'
phosphorylated. The solution oligonucleotide can be phosphorylated
and labelled on the 5' end. The solution oligonucleotide is
preferably a mixture of every 9 mer (Oswel, Southampton, UK).
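As an illustrative sketch only, the working (1.times.) concentrations implied by the 5.times.ligation buffer recipe above can be tabulated as follows. The dictionary keys, the function names and the 20 ul reaction volume are assumptions made for the example; the source does not state a reaction volume.

```python
# 5x ligation buffer components as listed in the protocol; units are in each key.
FIVE_X_LIGATION_BUFFER = {
    "Tris-HCl pH 8.3 (mM)": 100.0,
    "Triton X-100 (%)": 0.5,
    "magnesium chloride (mM)": 50.0,
    "KCl (mM)": 250.0,
    "NAD+ (mM)": 5.0,
    "DTT (mM)": 50.0,
    "EDTA (mM)": 5.0,
}


def working_concentrations(stock, fold=5.0):
    """Concentration of each component after diluting a `fold`-times stock to 1x."""
    return {name: value / fold for name, value in stock.items()}


def stock_volume_ul(final_volume_ul, fold=5.0):
    """Volume of concentrated buffer needed for a reaction of the given final volume."""
    return final_volume_ul / fold


for name, value in working_concentrations(FIVE_X_LIGATION_BUFFER).items():
    print(f"{name}: {value:g}")
# 20 ul is a hypothetical reaction volume chosen for illustration.
print("5x buffer needed (ul):", stock_volume_ul(20.0))
```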
[0789] Preparation of Sample DNA
[0790] From Amplicons
[0791] Produce amplicons by methods known in the art covering the
desired region, ethanol precipitate and bring up in 125 ul water.
Optimally the amplicons should be 100 bases or less. If they are
longer than 200 base pairs then the following fragmentation
protocol must be used. Fragment the amplicons as follows: To the
12.5 ul add 1.5 ul of Buffer(500 mM Tris-HCl. pH(0.0; 200 mM
(NH4)2SO4). Add 0.5 U (1 U/ul) of Shrimp Alkaline Phosphatase
(Amersham). Add 0.5 ul of thermolabile Uracil N-Glycosylase
(Epicentre). Incubate at 37 degrees C. for one hour and then place at 95
degrees C. for ten minutes. Check fragmentation on a gel (successful
if no intact PCR product is detected).
[0792] Genomic DNA can be extracted and purified
[0793] Fragment DNA by restriction enzyme digestion or random
fragmentation (e.g. DNase I treatment)
[0794] Restriction Digest:
[0795] DNA X ul for 1 ug
[0796] Reaction 3 10.times.Buffer 5 ul
[0797] EcoRI 2 ul (20 units)
[0798] Water Y ul to a final volume of 50 ul
[0799] Incubate at 37 degrees C. for 16 hours
[0800] Stop reaction by heating at 72 degrees C. for 10 minutes
[0801] Purify digested DNA using a commercial purification kit (Zymo
Research's DNA Clean and Concentrator) as per supplied protocol
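The volumes X and Y in the restriction digest above follow from the concentration of the DNA stock. A minimal sketch of the arithmetic is given below; the function name and the example stock concentration of 0.5 ug/ul are hypothetical.

```python
def digest_volumes(dna_conc_ug_per_ul, dna_mass_ug=1.0, buffer_ul=5.0,
                   enzyme_ul=2.0, final_ul=50.0):
    """Return the DNA volume (X) and water volume (Y) for the 50 ul digest."""
    if dna_conc_ug_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")
    dna_ul = dna_mass_ug / dna_conc_ug_per_ul
    water_ul = final_ul - buffer_ul - enzyme_ul - dna_ul
    if water_ul < 0:
        raise ValueError("DNA stock is too dilute to fit in the final volume")
    return {"dna_ul": dna_ul, "water_ul": water_ul}


# Hypothetical stock at 0.5 ug/ul.
print(digest_volumes(0.5))
```

If the DNA stock is too dilute for 1 ug to fit in 50 ul, the function raises an error rather than returning a negative water volume.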
[0802] Cot 1 DNA can be used at this stage to remove repetitive DNA
and/or can be added to array hybridisation/reactions for in situ
suppression of hybridisation of probes to repetitive DNA by
blocking the repetitive DNA by hybridisation to the Cot-1 DNA.
[0803] Ex situ depletion of repetitive sequence:
[0804] Cot-1 DNA (Gibco BRL) is labelled with biotin using the Biotin
Chem-Link kit (Boehringer Mannheim) or the Photoprobe Biotin Kit
(Vector Laboratories) as per the manufacturer's protocol and purified with
Sephadex G50 Columns (Amersham Pharmacia) as per the manufacturer's
protocol.
[0805] A 700 ng amount of source DNA is hybridised with 35 ug (50
fold excess) of biotin-labelled Cot-1 DNA.
[0806] Streptavidin magnetic particles (Boehringer Mannheim) are
prepared according to the manufacturer's instructions, 4.4 mg to a final
125 ul volume.
[0807] The Streptavidin-magnetic particles are applied to the
target DNA-biotin-labelled Cot-1 DNA mixture (100 ul). After incubation, the
magnetic bead captured Cot-1 fraction is separated to the side of
the tube with a magnet, and the supernatant containing the target
DNA is pipetted to a fresh tube. The magnetic separation is repeated,
and then the target DNA supernatant is purified using a QIAex II
kit (Qiagen).
[0808] In situ Blocking of Cot-1 Fraction
[0809] Add 25-125 ug (or 100 fold excess to target DNA) of Cot-1
DNA directly to hybridisation/reaction mix
[0810] Apply directly to the array.
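The Cot-1 quantities quoted for ex situ depletion and in situ blocking are simple fold-excess calculations. The sketch below shows the arithmetic; the function name is hypothetical, and the target masses used for the in situ case are inferred from the stated 25-125 ug range at 100 fold excess rather than given in the text.

```python
def cot1_mass_ug(target_ng, fold_excess):
    """Mass of Cot-1 DNA (ug) giving the stated fold excess over the target DNA (ng)."""
    return target_ng * fold_excess / 1000.0


# Ex situ depletion: 700 ng source DNA at 50 fold excess.
print(cot1_mass_ug(700, 50))
# In situ blocking at 100 fold excess; 250 ng and 1250 ng are inferred example targets.
print(cot1_mass_ug(250, 100), cot1_mass_ug(1250, 100))
```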
[0811] Alternatively, the DNA can be randomly amplified by random
primers using reagents for the Spectral Genomics (SG) (Houston, Texas)
Human BAC array and the BioPrime labelling kit from Gibco/BRL.
[0812] Add SG Sterile Water (orange vial) to x ul (at least 100 ng and
not more than 1 ug) of digested DNA to bring the volume to 25 ul. Add
2.5.times. random primer/reaction buffer (Gibco). Mix the samples
well, boil for 5 minutes and place the samples on ice for 5
minutes.
[0813] On ice add 2.5 ul of SG labelling Buffer (yellow vial) to each
sample
[0814] Optionally add 1.5 ul Cy5-dCTP or Cy3-dCTP to the samples
(In some sequencing embodiments, a mixture of for example Cy5-dCTP
and Cy3-dATP may be added to intrinsically label the DNA strand
with two labels; the 5 other combinations of dNTPs may also be
required in separate reactions)
[0815] Add 1 ul Klenow Fragment (Gibco) to the sample and mix well
by tapping and recollecting by centrifugation
[0816] Incubate the sample at 37 degrees C. for between 2.5 hours (enough for
one or two array hybridisations/reactions) and overnight (produces
sufficient material for several array hybridisations). The probe
will range in size between 100 and 500 bp. For sequencing
applications it may be desirable to have longer sequences, and for
this the concentration of the random primer can be reduced (the
concentration of random primer to use to get a particular random
primer product must be determined empirically).
[0817] Stop the reaction by adding 0.5 ul 0.5 M EDTA pH 8 and
incubating at 72 degrees C. for 10 minutes. Place samples on ice until use or
freeze at -20 degrees C.
[0818] If necessary the random prime labelled DNA can again be
depleted for any sequences from the Cot-1 fraction by magnetic
separation with Cot-1 DNA.
[0819] Alternatively or in addition Cot-1 DNA can be added to the
hybridisation/reaction mix.
[0820] Fragmentation Methods
[0821] Fragmentation of the genome to the desired size can be done
by DNase I treatment optimised for a particular enzyme. Fragmentation
by sonication can also be optimised to give fragments of a desired
length. DNA can be sheared by passing it through a narrow gauge
needle. Heating and UV light exposure may also fragment DNA as
appropriate for use in this invention.
[0822] Nanoparticle Bioconjugation and Purification
[0823] Oligonucleotides can be coupled to microspheres (Luminex,
Austin Tex.) or nanospheres by a one step carbodiimide coupling
method. Each coupling reaction contains 10.1 uM of
amino-substituted oligonucleotide and 1.times.10.sup.8
microspheres/ml in 0.1 M MES, pH 4.5. EDC is added at 0.5 mg/ml and the
reaction is incubated for 30 minutes at room temperature, followed
by a second EDC addition and incubation. The coupled microspheres
are washed and stored at 4 degrees C. in the same buffer.
[0824] Dendrimers are coupled to oligonucleotide-microspheres in
tetramethylammonium chloride (TMA) buffer (0.01% SDS, 50 mM Tris,
3.5 M TMA, 0.002 M EDTA) or 2-6.times. sodium citrate (SSC) buffer
(0.9 M NaCl, 0.03 M trisodium citrate). <2.times.SSC gives more
specificity of binding at 40 degrees C. Dendrimers can be
synthesised using branched phosphoramidites (MWG Biotech,
Germany).
[0825] There are two approaches for the use of streptavidin
nanoparticles (Quantum Dot Corp, USA) to label probe
oligonucleotides:
[0826] A Hybridise biotinylated oligonucleotide to DNA and then add
streptavidin coated nanoparticle or
[0827] B Complex streptavidin-nanoparticle to biotinylated
oligonucleotide and hybridise to DNA
[0828] In Method A there will be no steric hindrance by
nanoparticles to hybridisation of the oligonucleotide to the target
DNA. However, even though the oligonucleotide is coupled to the
nanoparticle before hybridisation, in method B, it is not too
different a situation to DNA binding to oligonucleotides bound to a
surface in microarrays, which obviously works. Preferably the
nanoparticles need to be coupled to the oligonucleotide probes in
advance of hybridisation, as in method B, in a
one-colour/one-allele specific way. This is so that the allele in
the target can be typed by looking at which of the two colours
localises by hybridisation to a particular SNP site. For method B,
firstly, excess biotinylated oligonucleotide can be added to the
beads so that substantially all the beads become attached with
oligonucleotide (one should estimate the amount of nanoparticle and
add oligonucleotide at e.g. 1000-10,000 excess) then unreacted
oligonucleotide needs to be separated and discarded. This
separation can be done by one of the following methods:
[0829] 1 Dialysis
[0830] 2 Chromaspin columns (e.g. Chromaspin 100).
[0831] 3 Ultra-centrifugation at its highest speed setting (e.g.
120K rpm).
[0832] 4 Streptavidin coated magnetic beads
[0833] 5 Vectrex columns (Vector Laboratories)
[0834] A nanoparticle attached oligonucleotide probe can be reacted
with sample (e.g. lambda) DNA that has already been horizontalised.
This can be done in the presence of BSA and/or other blockers.
Alternatively the nanoparticle oligonucleotide can be reacted with
lambda DNA before combing. If this is done then, before combing, the
reaction should be put through a Chromaspin 1000 column (Clontech, USA),
which can separate the long DNA target fragment from smaller
products.
[0835] Nanoparticles can be reacted with 1 mg/ml BSA/casein solution
to avoid adsorption of the beads onto the glass surface.
[0836] Genomic DNA Labeling Protocol
[0837] The following protocol is developed for microarray-based
comparative genomic hybridization but can also be used for other
applications of this invention.
[0838] Genomic DNA can be labeled with a simple random-priming
protocol based on Gibco/BRL's BioPrime DNA Labeling kit, though
nick translation protocols work too. I routinely use the BioPrime
labeling kit (Gibco/BRL) as a convenient and inexpensive source of
random octamers, reaction buffer, and high concentration Klenow (do
not use the dNTP mix provided in the kit), though other sources of
random primers and high concentration Klenow work as well.
[0839] 1. Add 2 ug DNA of the Sample to be Labeled to an Eppendorf
Tube.
[0840] Note: For high complexity DNAs (e.g. human genomic DNA), the
labeling reaction works more efficiently if the fragment size of
the DNA is first reduced. I routinely accomplish this by
restriction enzyme digestion (usually DpnII, though other 4-cutters
work as well). After digestion, the DNA should be cleaned up by
phenol/chloroform extraction/EtOH precipitation (Qiagen PCR
purification kit also works well).
[0841] 2. Add ddH.sub.2O or TE 8.0 to bring the total volume to 21
ul. Then add 20 ul of 2.5.times. random primer/reaction buffer mix.
Boil 5 min, then place on ice.
[0842] 2.5.times. random primer/reaction buffer mix:
[0843] 125 mM Tris 6.8
[0844] 12.5 mM MgCl.sub.2
[0845] 25 mM 2-mercaptoethanol
[0846] 750 ug/ml random octamers
[0847] 3. On ice, add 5 ul 10.times.dNTP mix.
[0848] 10.times.dNTP mix:
[0849] 1.2 mM each dATP, dGTP, and dTTP
[0850] 0.6 mM dCTP
[0851] 10 mM Tris 8.0, 1 mM EDTA
[0852] 4. Add 3 ul Cy5-dCTP or Cy3-dCTP (Amersham, 1 mM stocks)
[0853] Note: Cy-dCTP and Cy-dUTP work equally well. If using
Cy-dUTP, adjust 10.times.dNTP mix accordingly.
[0854] 5. Add 1 ul Klenow Fragment.
[0855] Note: High concentration Klenow (40-50 units/ul), available
through NEB or Gibco/BRL (as part of the BioPrime labeling kit),
produces better labeling.
[0856] 6. Incubate 37 degrees C for 1 to 2 hours, then stop
reaction by adding 5 ul 0.5 M EDTA pH8.0
[0857] 7. As with RNA probes, I purify the DNA probe using a
microcon 30 filter (Amicon/Millipore):
[0858] Add 450 ul TE 7.4 to the stopped labeling reaction.
[0859] Lay onto microcon 30 filter. Spin .about.10 min at 8000 g
(10,000 rpm in microcentrifuge).
[0860] Invert and spin 1 min 8000 g to recover purified probe to
new tube (.about.20-40 ul volume).
[0861] 8. For two-color array hybridizations, combine purified
probes (Cy5 and Cy3 labeled probes) in a new Eppendorf tube. Then
add:
[0862] 30-50 ug human Cot-1 DNA (Gibco/BRL; 1 mg/ml stock; blocks
hybridization to repetitive DNAs if present on array).
[0863] 100 ug yeast tRNA (Gibco/BRL; make a 5 mg/ml stock; blocks
non-specific DNA hybridization).
[0864] 20 ug poly(dA)-poly(dT) (Sigma catalog No. P9764; make a 5
mg/ml stock; blocks hybridization to polyA tails of cDNA array
elements).
[0865] 450 ul TE 7.4
[0866] Concentrate with a microcon 30 filter as above (8000 g,
.about.15 min, then check volume every 1 min until appropriate).
Collect probe mixture in a volume of 12 ul or less.
[0867] 9. Adjust volume of probe mixture to 12 ul with ddH.sub.2O.
Then add 2.55 ul 20.times.SSC (for a final conc. of 3.4.times.) and 0.45
ul 10% SDS (for a final conc. of 0.3%).
[0868] Note: The final volume of hybridization is 15 ul. This
volume is appropriate for hybridization under a 22 mm2 coverslip.
Volumes should be adjusted upwards accordingly for larger
arrays/coverslips.
[0869] 10. Denature hybridization mixture (100.degree. C., 1.5
min), incubate for 30 minutes at 37.degree. C. (Cot-1 preannealing
step), then hybridize to the array.
[0870] 11. Hybridize microarray at 65.degree. C. overnight (16-20
hrs). Note, see Human Array Hybridization protocol for details on
hybridization.
[0871] 12. Wash arrays as with mRNA labeling protocol and scan:
[0872] First wash: 2.times.SSC, 0.03% SDS, 5 min 65.degree. C.
[0873] Second wash: 1.times.SSC, 5 min RT
[0874] Third wash: 0.2.times.SSC, 5 min RT
[0875] Note: the first washing step should be performed at
65.degree. C.; this appears to significantly increase the ratio of specific
to non-specific hybridization signal.
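The volumes and concentrations quoted in the labelling protocol above can be cross-checked with the dilution relation C1 x V1 = C2 x V2. The following is a sketch of that check only; the variable and function names are hypothetical, and the component volumes are those given in steps 2 to 5 and step 9.

```python
def final_conc(stock_conc, stock_vol_ul, total_vol_ul):
    """Final concentration after diluting stock_vol_ul of stock into total_vol_ul."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Labelling reaction (steps 2 to 5), volumes in ul.
labelling_ul = {
    "DNA + water/TE": 21,
    "2.5x random primer/reaction buffer": 20,
    "10x dNTP mix": 5,
    "Cy-dCTP (1 mM stock)": 3,
    "Klenow": 1,
}
labelling_total_ul = sum(labelling_ul.values())
print("labelling reaction volume (ul):", labelling_total_ul)
print("buffer strength (x):", round(final_conc(2.5, 20, labelling_total_ul), 3))
print("dNTP mix strength (x):", round(final_conc(10, 5, labelling_total_ul), 3))
print("Cy-dCTP (mM):", round(final_conc(1.0, 3, labelling_total_ul), 3))

# Hybridisation mixture (step 9), volumes in ul.
hyb_total_ul = 12 + 2.55 + 0.45
print("SSC (x):", round(final_conc(20, 2.55, hyb_total_ul), 2))
print("SDS (%):", round(final_conc(10, 0.45, hyb_total_ul), 2))
```

Under these volumes both the primer/buffer mix and the dNTP mix end up at 1.times. in the labelling reaction, and the SSC and SDS figures agree with the stated 3.4.times. and 0.3%.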
[0876] Two methods for probing when secondary array is made with ds
DNA are given.
[0877] The problem with denaturation and probing with a single
probe for sequencing by hybridisation when the target is
double-stranded is that it is not known to which of the sense or
antisense strands each probe binds. This is overcome in the
double complementary probe strategy by probing both strands
simultaneously.
[0878] There are two problems when trying to probe and view single
molecules with oligonucleotides along a combed molecule. One is to
differentiate real signal from non-specific signal (to get sufficient
signal to be detected above background) and the second is to get
access to the DNA sequence for binding of the probe.
[0879] Specific Applications
[0880] Mini-Sequencing
[0881] The sample anneals to arrayed primers which promote DNA
polymerase extension reactions using four fluorescently labeled
dideoxynucleotides. In these examples both strands of the target
can be analysed simultaneously. But in other cases it may be chosen
to use single stranded products (e.g. by asymmetric PCR, RNA
transcription, selective degradation of one strand, or biotinylation
of the target strand and removal of the non-biotinylated other strand by,
for example, magnetic bead methods known in the art).
[0882] Wash enhanced aminosilane slides with milliQ water before
using and dry (e.g. place on a 58 degrees C. heating plate). Denature the
sample DNA for 6 minutes at 95 degrees C. Centrifuge and put on ice.
Add 5 ul of dye terminators (e.g. Texas Red-ddATP, Cy3-ddCTP,
Fluorescein-ddGTP, Cy5-ddUTP, all 50 uM) and diluted
Thermosequenase (4 U/ul), mix and pipette onto the slide, covering the
region carrying the array. Immediately cover the array area with a piece of
Parafilm if the array has been printed on a
coverslip, or place Parafilm or a coverslip over the array if it has been
printed on a slide. Lifter coverslips (Erie Scientific) are
preferably used. Incubate the slide for 25 minutes at 58 degrees C. Remove the
Parafilm/coverslip, wash the slide for 2 minutes in 95 degrees C. milliQ water,
3 minutes in 0.3% Alconox solution and 2 minutes in 95 degrees C.
milliQ water.
TABLE 3
Dye | Excitation wavelength (4 lasers) | Emission wavelength (8 position filter wheel with narrow band pass filters)
FITC | 488 nm | 530 nm
Cy3 | 543 nm | 570 nm
Texas Red | 594 nm | 630 nm
Cy5 | 633 nm | 670 nm
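As a hedged illustration of how the four dye terminators and emission filters listed above could be turned into a base call for a single spot, a minimal sketch follows. The intensity values, the `min_ratio` threshold and all names are assumptions introduced for the example; the FITC channel is taken to read the fluorescein label.

```python
# Incorporated terminator -> (dye, excitation nm, emission filter nm), from the text above.
TERMINATORS = {
    "A": ("Texas Red", 594, 630),
    "C": ("Cy3", 543, 570),
    "G": ("Fluorescein", 488, 530),
    "U": ("Cy5", 633, 670),
}
BASE_BY_EMISSION_NM = {emission: base for base, (_, _, emission) in TERMINATORS.items()}


def call_base(signal_by_emission_nm, min_ratio=2.0):
    """Return the incorporated terminator base, or None if the call is ambiguous.

    signal_by_emission_nm maps each emission filter (nm) to a measured intensity.
    min_ratio is a hypothetical threshold: the brightest channel must be at least
    this many times brighter than the second brightest.
    """
    ranked = sorted(signal_by_emission_nm.items(), key=lambda kv: kv[1], reverse=True)
    (best_nm, best), (_, second) = ranked[0], ranked[1]
    if best <= 0 or best < min_ratio * second:
        return None
    return BASE_BY_EMISSION_NM[best_nm]


# Hypothetical intensities for one spot (arbitrary units).
print(call_base({530: 10, 570: 200, 630: 12, 670: 8}))
```

The call names the terminator that was incorporated; the template base at that position is its complement.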
[0883] A droplet of SlowFade Light antifade reagent (Molecular
Probes) is added to minimize photobleaching and the array is covered with a
coverslip.
[0884] If non-specific sticking of, for example, labelled nucleotides occurs
(seen by, for example, signals outside the regions carrying the
microarray spots), then prehybridisation of the array can be done
(e.g. in a 25 ml volume in a 50 ml Falcon tube) with a buffer
containing 1% BSA, 0.1% SDS (and/or Sarkosyl) and optionally Cot-1
DNA, poly(A) DNA and tRNA.
[0885] Errors are eliminated by methods of this invention, for
example by an algorithm or by enzymatic methods such as the use of
Apyrase. For the latter, 8 mU of Apyrase (Sigma) is added to the
reaction mix on the array.
[0886] The array for this experiment can be made as in the example
above (with reduction of synthesis cell dimension and step size) or
by spotting 5' aminated oligonucleotides onto enhanced aminosilane
slides in DMSO:Water at an appropriate dilution (e.g. 50-500 nM
range).
[0887] Haplotyping
[0888] Probing a Horizontalised DNA Polymer at Multiple Loci Using
Two-Colour Probes
[0889] Each locus of interest is probed by a biallelic probe set
comprising allelic probes labelled with different fluorescent tags.
For example, one allele is labelled with a semiconductor
nanocrystal emitting at 565 nm whilst the other is labelled with one emitting at 655
nm.
[0890] The target molecule may be spread with or without the aid of
a capture molecule. Where a capture molecule is provided it may
probe the first allele of interest. The target molecule may be
captured at a second point by arrayed capture probes, which may
also be allele specific. Different allele specific array capture
probes would be placed at distinct spatial locations by the
arraying methods described in this document and known in the art.
The double capture would be done using 4.times.SSC/Sarkosyl at a
temperature determined by the Tms of the probes. Subsequent
internal probing of the captured molecule is via any of the
approaches described in this document. Each subsequent SNP site
would be probed by specific complementary allele specific probes
but, as the target molecule is horizontalised and the sites are spatially resolved, only the same two labels
need be used.
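Because the loci are resolved by position along the horizontalised molecule, a haplotype can be read as the ordered sequence of the two colours. A minimal sketch is given below; which emission wavelength marks which allele, the spot coordinates and the function name are all assumptions made for illustration.

```python
# Assumed assignment of the two nanocrystal colours to the two alleles of each SNP.
ALLELE_BY_EMISSION_NM = {565: "1", 655: "2"}


def read_haplotype(spots):
    """Return the allele string ordered along the molecule.

    spots is an iterable of (position_um, emission_nm) pairs, one per probed locus
    on a single horizontalised molecule.
    """
    ordered = sorted(spots, key=lambda spot: spot[0])
    return "".join(ALLELE_BY_EMISSION_NM[emission] for _, emission in ordered)


# Hypothetical spot positions (um from the captured end) and colours.
print(read_haplotype([(12.0, 655), (3.5, 565), (7.2, 565)]))
```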
[0891] Directing Different Loci on a Single Polymer Molecule to
Different Spatial Locations
[0892] Probes were placed at spatially distinct gold electrode pads
separated by a gap of approximately 5-10 um and DNA was bridged
over a gap between adjacent pads. The sticky ends of Lambda DNA were
reacted with complementary probes in 4.times.SSC 0.1% Sarkosyl.
Similarly probes can be spaced strategically to capture other
sequences along the same DNA polymer, the spatial location to which
the DNA polymer binds being indicative of the sequence present at
that locus on the DNA.
[0893] The intervening DNA is not substantially bound to the
surface when high salt is used (if surface is APTES coated) and
this makes the DNA available for probing by any of the methods
mentioned.
[0894] Obtaining Sequence Information by Hybridisation
[0895] Where the samples to be sequenced are oligonucleotides then
the number of different probes that need to be hybridised may
not be too large and positional information may not be
required.
[0896] There are two fundamental aspects of the single molecule
sequencing of this invention: (1) spatially addressing the genome; and (2) probing
along the DNA polymer in a manner such that information about the
position along the DNA polymer at which each probe binds is
obtained.
[0897] There are several schemes with which single molecule
sequencing by hybridisation can be achieved. The following gives a
number of strategies. Experimental steps that are common are
described under separate headings. Other methods are elsewhere in
the description of methods.
[0898] Sequencing Strategy Example A
[0899] Sequencing of spatially addressably captured genomic DNA is
done by iterative probing with 6 mer oligonucleotides. There are
4096 unique 6 mers. Each oligonucleotide is added one after the
other. The position(s) of binding of each oligonucleotide is
recorded before addition of the next oligonucleotide. The target is
preferentially in a linearised single stranded form.
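The figure of 4096 is 4.sup.6, the number of distinct 6 mers over the four bases. As an illustrative sketch, the complete probe set can be enumerated and the expected binding positions of one probe on a known single stranded sequence simulated as follows. The function names are hypothetical, and perfect-match binding to the reverse complement site is a simplifying assumption.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def all_kmers(k, alphabet="ACGT"):
    """Every oligonucleotide of length k over the given alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def binding_positions(target, probe):
    """Start positions on a single stranded target where the probe is perfectly complementary."""
    site = reverse_complement(probe)
    width = len(site)
    return [i for i in range(len(target) - width + 1) if target[i:i + width] == site]


sixmers = all_kmers(6)
print(len(sixmers))  # 4096
```

Recording `binding_positions` for each probe in turn corresponds to the position list that would be logged before the next oligonucleotide is added.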
[0900] Sequencing Strategy Example B
[0901] Sequencing of spatially addressably captured genomic DNA is
done by iterative probing with sets of 6 mer oligonucleotides.
There are 4096 unique 6 mers; these are split into 8 groups
each containing 512 oligonucleotides. Each probe is labelled via a
C12 linker arm to a dendrimer (Shchepinov et al Nucleic Acids Res.
1999 Aug. 1;27(15):3035-41) which carries many copies of this probe
sequence (this construct is made on an Expedite 8909 synthesizer or
an ABI 394 DNA synthesizer or custom made by Oswel). The 512 probe
constructs of each set are hybridised simultaneously to the
secondary genomic array. Following this the position of binding of
the probes and the identity of the probes is detected by
hybridisation of a library of microspheres, within which each
microsphere is coated with a complementary sequence to one of the
probe sequences (e.g. by first coating the microsphere with streptavidin
(Luminex) and then binding biotinylated oligonucleotides to this as
described above, or by binding aminated oligonucleotides by
carbodiimide coupling; see also Bioconjugate Techniques, Greg T.
Hermanson, Academic Press). The arms of the dendrimer form multiple
interactions with the multitude of oligonucleotide copies that coat
the microsphere in <400 mM monovalent salt (Na) at 40 degrees C.
or above. Each microsphere is one of a coded set, ratiometrically
dyed with two or more dyes (100-1000 different coded beads are
available (Lumonics)). The spectral properties of these beads that
now decorate the DNA in the secondary array and their position of
binding are recorded. The probes are then denatured, which releases
the whole complex. The array can then be probed with the 7 other
probe sets in a stepwise manner. The probe concentrations are
configured such that only some of the sites on the DNA are
occupied, but analysis of the multitude of copies of each genomic
fragment within a microarray spot enables information about all the
sites that are occupied to be worked out. The information obtained
from the experiment is fed into the sequence reconstruction
algorithm. Optionally the 8 sets can be further split and
hybridisation is done on multiple copies of the array. In this way
far fewer coding beads need be used.
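The bookkeeping for this strategy (8 sets of 512 probes, with one bead code per probe within a set) can be sketched as below. The contiguous split and the integer bead codes are assumptions made for illustration; the text does not specify how probes are assigned to sets or to bead codes.

```python
from itertools import product


def split_into_sets(probes, n_sets):
    """Split the probe list into n_sets equally sized, consecutive sets."""
    size, remainder = divmod(len(probes), n_sets)
    if remainder:
        raise ValueError("probes do not divide evenly into the requested number of sets")
    return [probes[i * size:(i + 1) * size] for i in range(n_sets)]


def assign_bead_codes(probe_set):
    """Give each probe in a set its own bead code (here simply an integer index)."""
    return {probe: code for code, probe in enumerate(probe_set)}


sixmers = ["".join(p) for p in product("ACGT", repeat=6)]
probe_sets = split_into_sets(sixmers, 8)
bead_codes = [assign_bead_codes(s) for s in probe_sets]
print(len(probe_sets), len(probe_sets[0]), max(bead_codes[0].values()) + 1)
```

Because the probes and beads are released between sets, the same 512 bead codes can be reused for every set; splitting into more sets on multiple array copies reduces the number of codes needed per hybridisation.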
[0902] Sequencing Strategy Example C
[0903] Sequencing of spatially addressably captured genomic DNA is
done by iterative probing with sets of oligonucleotides: non
overlapping or minimally overlapping sequences are added together,
whereas substantially overlapping sequences are added separately.
Non-overlapping and minimally overlapping sets of sequences from
this set of 4096 are determined algorithmically. Each set is added
one after the other. The position(s) of binding of the
oligonucleotides in each set is recorded before addition of the
next set. The target is preferentially in stretched
single stranded form.
[0904] The information that is passed onto the algorithm for
sequence reconstruction is the identity of the sequences in the non
overlapping set, the fact that they do not overlap, and the positions of binding
of probes from the set. This is preferably done with a high
resolution method such as AFM and the probe molecules need not be
labelled. In another embodiment each probe is labelled, for example,
with a streptavidin molecule separated by a linker. The draft
sequence of the genome is used to reconstruct the sequence.
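The text does not specify the algorithm used to form non-overlapping or minimally overlapping sets. One possible approach, shown purely as a sketch, is a greedy grouping in which two probes are treated as overlapping when a suffix of one (of at least `min_len` bases) equals a prefix of the other. Both this definition of overlap and the greedy procedure are assumptions introduced here, and the example runs on 4 mers only to keep it fast.

```python
from itertools import product


def overlaps(a, b, min_len):
    """True if a suffix of one probe (at least min_len bases) equals a prefix of the other."""
    shortest = min(len(a), len(b))
    for n in range(min_len, shortest):
        if a[-n:] == b[:n] or b[-n:] == a[:n]:
            return True
    return False


def greedy_sets(probes, min_len):
    """Place each probe in the first set in which it overlaps no existing member."""
    sets = []
    for probe in probes:
        for members in sets:
            if not any(overlaps(probe, other, min_len) for other in members):
                members.append(probe)
                break
        else:
            sets.append([probe])
    return sets


# Demonstration on all 256 4 mers with a 3 base overlap threshold.
fourmers = ["".join(p) for p in product("ACGT", repeat=4)]
groups = greedy_sets(fourmers, min_len=3)
print(len(groups), [len(g) for g in groups])
```

Lowering `min_len` makes the overlap criterion stricter and so yields more, smaller sets; the appropriate threshold would depend on how closely spaced probes can be resolved.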
[0905] Sequencing Strategy Example D
[0906] The 4096 oligonucleotides are grouped into sets, in this
example into sixteen sets each containing 256 oligonucleotides
(oligonucleotides in each set are chosen by algorithm to minimally
overlap in sequence). Each set is used in a series of
hybridisations to a separate copy of the secondary array. After
simultaneous hybridisation of the 256 oligonucleotides in
the set and recording of the position of their binding they are
denatured. Next one of the oligonucleotides from the set is
omitted and the resulting set of 255 oligonucleotides is
hybridised back to the array. The absence of signals from positions
where there was previously signal tells us the identity of the
oligonucleotide that bound in that position before as being the
oligonucleotide that is omitted in the present run. This is
iterated with a different oligonucleotide from the set and so on,
256 times so that information is obtained from sets in which one of
the 256 is omitted each time. The oligonucleotides are bound in
saturating concentrations. The information that is obtained is
passed onto the algorithm for sequence reconstruction.
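The leave-one-out decoding described in this strategy can be sketched as a set difference: a position that carried signal with the full set but lacks signal when a given oligonucleotide is omitted is assigned to that oligonucleotide. The data structures and names below are hypothetical, and ideal, noise-free signals at saturating probe concentration are assumed.

```python
def decode_leave_one_out(full_run_positions, omission_runs):
    """Assign probe identities to positions from leave-one-out hybridisations.

    full_run_positions: set of positions with signal when the whole set is hybridised.
    omission_runs: dict mapping each omitted probe to the set of positions that
        still show signal when that probe is left out.
    Returns a dict mapping position -> list of probes inferred to bind there.
    """
    assignment = {}
    for probe, positions_seen in omission_runs.items():
        for position in sorted(full_run_positions - positions_seen):
            assignment.setdefault(position, []).append(probe)
    return assignment


# Simulated example with a three-probe set (sequences and positions are invented).
truth = {10: "AAAAAC", 25: "ACGTAC", 40: "AAAAAC"}
probe_set = ["AAAAAC", "ACGTAC", "TTTGGG"]
full_run = set(truth)
runs = {p: {pos for pos, q in truth.items() if q != p} for p in probe_set}
print(decode_leave_one_out(full_run, runs))
```

A probe that binds nowhere, such as the third one in the simulated set, removes no signal when omitted and therefore receives no positions.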
[0907] Sequencing Strategy Example E
[0908] Sequencing of spatially addressably captured genomic DNA is
done by iterative probing with complementary pairs of 6 mer
oligonucleotides, both oligonucleotides labelled with the same
label. There are 4096 unique 6 mer complementary pairs. Each pool
is added to a separate secondary array (capture probes to which
the genomic sample array has been spatially addressably captured
and combed). After each probing step the 6 mers are denatured
and then a different complementary pair is added.
[0909] The target is preferentially double stranded in this example
and not denatured in situ. However denaturation in situ is an
alternative.
[0910] Each one of the 256 BainsProbes in each pool will be
hybridised to a secondary array. To reduce time and the effects of
attrition on the secondary array, multiple BainsProbes are annealed
at one time. In this example two will be labelled at one time and
preferentially, these will be differentially labelled, for example
each of the 2 can be labelled with Cy3 or Cy5 dyes or a red
fluorescent or green fluorescent Fluorosphere (a more complex
coding can be devised or alternatively there would be no labelling
and it would be the task of the algorithm to reconstruct the
sequence on that basis). After annealing, the position of the
probes is recorded with respect to each other and the markers. In
some embodiments the DNA probes can be denatured from the target
DNA, before another set is added (or after several sets are added)
but in the present example, the BainsProbes are not removed after
hybridisation. Instead, after recording the positions of probe
binding, the next pair of probes is added. This will need to be
iterated 128 times to go through all the probe pairs. If each
iteration is approximately 10 minutes for each addition, then the
sequencing will be complete within 24 hours. This can be speeded up
further if more than 2 oligonucleotides are added at a time, for
example 80 oligonucleotides added at a time would allow whole
genome sequencing in about an hour; each of the 80 would not need
to hybridise to every copy that is captured within a microarray
spot, for example if there are 2000 50 kb molecules captured in one
spot, then each molecule need only be labelled with, say, 8 probes.
This can help where one sequence would otherwise prevent the
binding of another by overlapping with it.
[0911] Molecular beacons can be used as probes: here there is no
fluorescence when the oligonucleotide is scanning the molecule,
only signal when it forms a stable enough duplex to unwind the stem
and release the fluorophore from quenching. Two types of molecular
beacons can be used, one based on FRET and the other based on
electron transfer (Atto-Tec, Heidelberg). It is likely that as
sequence reconstruction in this case will utilise the draft
sequence of the genome, the
[0912] Sequencing Strategy Example F
[0913] Sequencing of spatially addressably captured genomic DNA is
done by iterative probing with 8 mer oligonucleotides. Each 8 mer
contains 6 unique bases and two degenerate positions, in this
example, the central two bases are degenerate. There are 4096
different probes identified by their 6 unique positions, but each of
these carries 16 different sequences due to the degenerate positions
(these will be referred to as BainsProbes after Bains and Smith,
Journal of Theoretical Biology 135: 303-307, 1988). The 4096
BainsProbes are split into 16 pools of 256 BainsProbes (this is an
arbitrary choice and they can be split into 4 pools of 1024 if the
number of arrays is limiting) with each pool containing sequences
approximately matched for Tm. Each pool is added to a separate
secondary array (capture probes to which the genomic sample array
has been spatially addressably captured and combed).
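The probe counts above follow from simple enumeration, which the following sketch makes concrete. It is illustrative only: the function names are hypothetical, the degenerate positions are written as "N", and the GC count of the defined bases is used as a crude stand-in for Tm matching, which is an assumption of this example and not a method stated in the text.

```python
from itertools import product

BASES = "ACGT"


def bains_probe_patterns():
    """Yield the 4096 patterns: 3 defined bases, NN, 3 defined bases."""
    for six in product(BASES, repeat=6):
        yield "".join(six[:3]) + "NN" + "".join(six[3:])


def expand(pattern):
    """List every concrete sequence represented by one degenerate pattern."""
    choices = [BASES if ch == "N" else ch for ch in pattern]
    return ["".join(p) for p in product(*choices)]


def gc_count(pattern):
    """Count G and C among the defined bases (assumed proxy for Tm)."""
    return sum(ch in "GC" for ch in pattern)


probes = list(bains_probe_patterns())
probes.sort(key=gc_count)  # group probes of similar GC content together
pools = [probes[i:i + 256] for i in range(0, len(probes), 256)]
```

Sorting before slicing gives 16 pools of 256 in which neighbouring probes have similar GC content; a real design would use a proper Tm model and could also take sequence overlap into account.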
[0914] Each one of the 256 BainsProbes in each pool is hybridised
to a secondary array. To reduce time and the effects of attrition
on the secondary array, multiple BainsProbes are annealed at one
time. In this example two are labelled at one time and
preferentially, these are differentially labelled, in this example
each of the 2 is labelled with either Cy3 or Cy5 dye or a red
fluorescent or green fluorescent Fluorosphere (a more complex
coding can be devised or alternatively there would be no labelling
and it would be the task of the algorithm to reconstruct the
sequence on that basis). After annealing, the position of the
probes is recorded with respect to each other and the markers. In
some embodiments the DNA probes can be denatured from the target
DNA, before another set is added (or after several sets are added)
but in the present example, the BainsProbes are not removed after
hybridisation. Instead, after recording the positions of probe
binding, the next pair of probes is added. This will need to be
iterated 128 times to go through all the probe pairs. If each
iteration is approximately 10 minutes for each addition, then the
sequencing will be complete within 24 hours. This can be speeded up
further if more than 2 oligonucleotides are added at a time, for
example 80 oligonucleotides added at a time would allow whole
genome sequencing in about an hour; each of the 80 would not need
to hybridise to every copy that is captured within a microarray
spot, for example there may be 2000 50 kb molecules captured in one
spot, and each individual molecule copy need only be labelled with,
say, 8 probes. This can help where one sequence would otherwise
prevent the binding of another by overlapping a complementary
region.
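The timing estimate can be checked with a short calculation. The sketch below assumes that the 16 pools are processed in parallel on separate arrays, so the elapsed time is that of one pool of 256 probes; that reading, and the function name, are assumptions of this example.

```python
import math


def run_time_minutes(probes_per_pool, probes_per_addition, minutes_per_addition=10):
    """Return (number of additions, total minutes) for one pool."""
    additions = math.ceil(probes_per_pool / probes_per_addition)
    return additions, additions * minutes_per_addition


pairs = run_time_minutes(256, 2)     # two probes per addition
batches = run_time_minutes(256, 80)  # eighty probes per addition
```

With two probes per addition this gives 128 additions and 1280 minutes, a little over 21 hours and so within the 24 hours stated. With 80 probes per addition it gives 4 additions and 40 minutes, consistent with "about an hour".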
[0915] Molecular beacons can be used as probes: here there is no
fluorescence when the oligonucleotide is scanning the molecule,
only signal when it forms a stable enough duplex to unwind the stem
and release the fluorophore from quenching. Two types of molecular
beacons can be used, one based on FRET and the other based on
electron transfer (Atto-Tec, Heidelberg). It is likely that as
sequence reconstruction in this case will utilise the draft
sequence of the genome, the
[0916] Sequencing Strategy Example G
[0917] Sequencing of spatially addressably captured genomic DNA is
done by iterative probing with 13 mer oligonucleotides (this length
can form a stable duplex at room temperature). Each 13 mer contains
6 unique bases and 7 degenerate positions, for example, 7 bases at
the 5' end are degenerate (these will be called stabiliser probes).
Although we have the stability of a 13 mer we will only have the
sequence information of a 6 mer. There will be 4096 different
probes identified by their 6 unique positions but each of these
will carry ca. 16,384 different sequences due to the degenerate
positions. In this example the concentration of oligonucleotide
will be 100 to 1000 fold higher than in example A. The 4096
Stabiliser Probes will be split into 8 pools of 512 (this is an
arbitrary choice and they can be split into 4 pools of 256) with
each pool containing sequences approximately matched for Tm. Each
pool will be added to a separate secondary array (capture probes to
which the genomic sample array has been spatially addressably
captured and combed).
[0918] Each one of the 128 BainsProbes in each pool will be
hybridised to a secondary array. To reduce time and the effects of
attrition on the secondary array, multiple BainsProbes are annealed
at one time. In this example two will be labelled at one time and
preferentially, these will be differentially labelled, for example
each of the 2 can be labelled with Cy3 or Cy5 dyes or a red
fluorescent or green fluorescent Fluorosphere (a more complex
coding can be devised or alternatively there would be no labelling
and it would be the task of the algorithm to reconstruct the
sequence on that basis). After annealing, the position of the
probes is recorded with respect to each other and the markers. In
some embodiments the DNA probes can be denatured from the target
DNA, before another set is added (or after several sets are added)
but in the present example, the BainsProbes are not removed after
hybridisation. Instead, after recording the positions of probe
binding, the next pair of probes is added. This will need to be
iterated 128 times to go through all the probe pairs. If each
iteration is approximately 10 minutes for each addition, then the
sequencing will be complete within 24 hours. This can be speeded up
further if more than 2 oligonucleotides are added at a time, for
example 80 oligonucleotides added at a time would allow whole
genome sequencing in about an hour; each of the 80 would not need
to hybridise to every copy that is captured within a microarray
spot, for example if there are 2000 50 kb molecules captured in one
spot, then each molecule need only be labelled with, say, 8 probes.
This can help where one sequence would otherwise prevent the
binding of another by overlapping with it.
[0919] Molecular beacons can be used as probes: here there is no
fluorescence when the oligonucleotide is scanning the molecule,
only signal when it forms a stable enough duplex to unwind the stem
and release the fluorophore from quenching. Two types of molecular
beacons can be used, one based on FRET and the other based on
electron transfer (Atto-Tec, Heidelberg). It is likely that as
sequence reconstruction in this case will utilise the draft
sequence of the genome, the
[0920] The above examples are all done with 6 mer probes, however
the strategies can be implemented with oligonucleotides
shorter than 6 nt, in which case there will be fewer cycles but
more stabilising chemistries such as LNA will be used. Alternatively
oligonucleotides longer than 6 nt can be used in which case there
will be more cycles.
[0921] These three strategies serve as examples but methods from
any of these can be adapted from one to the other and there are
several other specific means which are apparent from the methods
and protocols described in this invention. For example, each probe
can be ligated to a random library of ligation molecules, this
would serve to stabilise the interactions and eliminate
mismatches.
[0922] Getting Additional Experimental Validating Sequence
Information
[0923] To get further information about sequence, during
preparation the DNA sample can be internally labelled with
combinations of base labelling fluors as suggested in the random
primer labelling section above. In addition where the target DNA of
the secondary array is double stranded, optical mapping in which
gaps are created at the site of restriction digest can provide
sequence and positional information.
[0924] Sequence Reconstruction, Re-mining and Validation
[0925] A first pass at reconstructing the sequence is attempted.
This will identify regions with gaps and low confidence.
[0926] As the draft human genome sequence is known, any gaps can be
filled in by probing with specific oligonucleotides targeting the
gapped/low confidence region on a further array and this process
can be reiterated (i.e. see if additional information allows
reconstruction, if not add further probes to same array or separate
array and repeat).
[0927] Sequence reconstruction can be performed on a network of
desktop computers, e.g. IBM compatible personal computers, Apple
personal computers, or Sun Microsystems computers. Such networks
can be very large.
[0928] In some instances sequence reconstruction is performed on a
supercomputer.
[0929] The results will be presented in a graphical, interactive
format.
[0930] Low confidence regions that are persistent will be indicated
as such on a macro, chromosome by chromosome report of the regions
sequenced. The confidence assigned to each base will be available,
which is not the case in present methods.
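As an illustration of how persistent low confidence regions might be extracted from per-base confidences for such a report, the following sketch scans a list of confidence values and returns the index ranges falling below a threshold. The function name, the threshold of 0.999 and the sample values are assumptions of this example.

```python
def low_confidence_regions(confidences, threshold=0.999, min_length=1):
    """Return half-open (start, end) index ranges where confidence < threshold."""
    regions = []
    start = None
    for i, c in enumerate(confidences):
        if c < threshold:
            if start is None:
                start = i  # a low confidence run begins here
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(confidences) - start >= min_length:
        regions.append((start, len(confidences)))
    return regions


per_base = [0.9999, 0.9999, 0.95, 0.90, 0.9999, 0.80]
flagged = low_confidence_regions(per_base)
```

Here bases 2 to 3 and base 5 are flagged. Raising `min_length` would report only longer runs, which may suit a chromosome-level summary better.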
[0931] Avoiding Mismatch Errors
[0932] Conditions will be stringent enough to prevent a 5 mer
mismatch from hybridising. Furthermore, markers can be used to
label mismatches or methods can be used to destroy mismatches, for
example, the mismatch repair system of Escherichia coli, provides
proteins, MutL, MutH and MutS which singly or in combination can be
used to detect the site of a mismatch; T4 endonuclease IV can also
do this. In addition treatment by tetraethylammonium
chloride/potassium permanganate, followed by hydroxylamine can
cleave the site of mismatch and this will be seen as a contraction
in the DNA. It is likely that mismatches will only occur when a 6
mer is stabilised by flanking contiguous stacking oligonucleotides.
This effect can be minimized by making oligonucleotides in which
one end is phosphorylated (disrupts intimate coaxial stacking) or
by adding a bulky group at the end. Depending on the algorithm
mismatches may be tolerated especially where there is a well
defined set of rules that describe mismatching behaviour.
[0933] The oligonucleotide probes may be detected by virtue of
Fluorescence Resonance Energy Transfer (FRET) interactions with a
DNA stain staining the DNA Polymer (see Howell W M, Jobs M, Brookes
A J 2002 Genome Res. September;12(9):1401-7). This drastically
reduces signal from non specific interactions of the probes with
the surface because only those probes which are within around 10 nm
of the DNA polymer will undergo FRET.
[0934] For complete de novo sequencing, for example of organisms
where no reference sequence is available, the experimental
procedure is exactly the same but the task of the algorithm is
greater. Supercomputers may be needed for sequence reconstruction
depending on the quality of data that is obtained.
[0935] The data is deconvoluted for ordering along the molecule and
data about order and approximate distance from other probes is
taken into account. A list with orders is then presented to a
sequencing by hybridisation algorithm. In one example of the
reconstruction strategy the algorithm then splits the regions of
the genome into a series of overlapping segments and computes the
sequencing from the hybridisation data from each area, matching to
the draft genome sequence where available, assigning probabilistic
scores to the sequence data. The data is presented (e.g. via a
colour chart) indicating regions of high certainty and regions of
lower certainty. The regions of high certainty can be used in
genetic studies.
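The ordering step can be illustrated with a toy reconstruction in which probes, already sorted by their recorded position along the molecule, are merged by their longest suffix-prefix overlap. This is a simplified sketch and not the reconstruction algorithm of the invention: the function name and data are invented, positions are in arbitrary units, and a greedy merge of this kind can be misled by repeats, which is why probabilistic scoring and the draft sequence are used in practice.

```python
def merge_ordered_probes(ordered_probes):
    """Merge probes, already ordered along the molecule, by longest overlap."""
    if not ordered_probes:
        return ""
    sequence = ordered_probes[0]
    for probe in ordered_probes[1:]:
        # Find the longest suffix of the growing sequence that is a prefix of the probe.
        for k in range(min(len(sequence), len(probe)), 0, -1):
            if sequence.endswith(probe[:k]):
                sequence += probe[k:]
                break
        else:
            # No overlap found: mark a gap of unknown length.
            sequence += "-" + probe
    return sequence


# Toy hits: (position in arbitrary units, probe sequence), deliberately unsorted.
hits = [(30, "GTACGA"), (10, "ACGTAC"), (20, "CGTACG"), (95, "TTTGGC")]
ordered = [seq for _, seq in sorted(hits)]
assembled = merge_ordered_probes(ordered)
```

The first three probes merge into one contiguous stretch, while the distant fourth probe is appended after a gap marker. In a fuller treatment the measured distances between probes would constrain the gap length.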
[0936] The results are also cross-validated by Sanger sequencing
technologies and with this comparison a heuristic or knowledge
based system will be built up over time, enabling more accurate
sequence. The aim would be to get confidences higher than error
rates for common enzymes, eg. 99.9% confidence. Ultimately the
sequencing may be run in parallel with other whole genome
sequencing technologies to further increase confidence.
[0937] With this method it is possible that, unless specific
measures are taken, algorithms can be confounded by
heterozygosity over the regions. Therefore it will be preferable to
use biallelic probes to isolate haplotype tags which seed a region
of linkage disequilibrium. This information about the haplotype
structure of the genome will soon become available through
international efforts. Two-colour gene expression analysis
[0938] The Experimental Apparatus
[0939] The edges of the area surrounding the array are raised so
that addition and removal of fluids can take place (e.g. a
microtitre set-up; low intrinsic fluorescence glass bottomed plates
are available, e.g. from Whatman Polyfiltronics or custom made
glass). Alternatively, the array substrate is sealed to a reaction
cell (e.g. Teflon or Teflon coated, which makes a good seal with
glass) with inlet and outlet ports. Where information from single
dye molecules is required, the microscopy set up will be TIRF,
preferably with pulsed lasers and time gated detection, with the
full gamut of measures taken to minimise fluorescence background.
Where the probes are labelled with Fluospheres then epi-fluorescence
microscopy and excitation with a 100 W mercury lamp can be used.
Where the analysis is with AFM, then nanoparticles of different
sizes can be used for labelling, analysis will be with tapping mode
in air and a liquid cell will be used for flowing in reagents and
washing the array.
[0940] Experimental Procedures
[0941] Hybridise target to array (ASP method as described for
lambda DNA above). Use as much target DNA as can be tolerated in
the reaction mix for example, at least 10 ug of restriction
digested DNA or if whole genome amplification by random primer
labelling has been done then the amount of DNA obtained after
amplification of as little as 500 ng of starting DNA, can be
used.
[0942] Optionally instead of ligation, the captured target is
chemically attached to the surface after hybridisation.
[0943] Preparation and Marker Labelling of Array
[0944] The digoxygenin can be added to the array oligonucleotides
during their synthesis. Once the target has hybridised a signal
amplification reaction can be performed on the digoxygenin so that
the point of array capture can be identified.
[0945] Block slide with milk protein supernatant in PBS/Tween 20
(10" at room temperature) and wash with PBS/Tween.
[0946] 1.sup.st Antibody layer: Add Mouse Anti-Digoxygenin Antibody
(Roche) diluted 1/250 in milk protein+PBS. Leave 30" at RT in
the dark, then do PBS/Tween washes.
[0947] 2.sup.nd Antibody layer: Add Goat Anti-Mouse Alexa Fluor
488/520 (Molecular Probes) at 1/50 dilution in milk protein+PBS.
Leave 30" at 37 C in the dark. Do a PBS/Tween wash followed by a PBS
wash. Dry slide (for example with gentle forced air).
[0948] The target genomic DNA is stained with YOYO-1 (Molecular
Probes) in a 1 in 1000 or 1 in 2000 dilution (other DNA labels
might be used depending on wavelength of labelling of
oligonucleotide probes and markers and the available filters and
laser lines). A CCD image of the array is taken before the
sequencing reactions begin.
[0949] Annealing of Oligonucleotide Sets and Detection
[0950] The DNA array is placed on a temperature control device such
as a thermocycler fitted with a flat block
[0951] Hybridisation can be done in 3.5 M Tetramethyl ammonium
Chloride that reduce the effects of base composition (see section D
above for a list of other possible buffers) in which case all
annealing will be done at one or two temperatures. Hybridisation of
short oligonucleotides with 4-6 SSC.
[0952] Add first set of oligonucleotide probes at a concentration
between 1 nM-1 uM depending oligonucleotide length and
chemistry
[0953] Concentrations can be adjusted so that some but not all
sample molecules give signal (for example, optimised so that 1 in
12 oligonucleotide give a signal with a particular
oligonucleotidene sequence).
[0954] This is done at a temperature that is optimal for the Tm.
For DNA oligonucleotides this may be between 0 and 10 degrees C.
For LNA/PNA oligonucleotides a higher temperature can be used e.g.
room temperature. If for example an enzymatic reaction is performed
e.g. ligation to random 9 mers then a higher reaction temperature
e.g. 65 degrees C. with Tth DNA ligase, can be used.
[0955] Rolling circle amplification can be used to amplify signal
from each probe. In this example the probes are bipartite, with
sequence complementary to the target and to a circular
oligonucleotide around which polymerisation extends using Sequenase
enzyme and single stranded binding protein (SSB) essentially as
described (Zhong et al PNAS 98: 3940-3945).
[0956] Bipartite probes may also comprise one portion which is
complementary to the target and a second portion which is a partner
to a molecule attached to a fluorescent label. The partners may be
antibody-antigen interactions or they may be complementary
oligonucleotide interactions.
[0957] Denaturing Oligonucleotides
[0958] Some sequencing strategies require oligonucleotides or at
least their labels to be removed. Oligonucleotides can be denatured
under gentle agitation by one or more of the following
treatments:
[0959] *High Stringency buffer e.g. 0.1.times.SSC or
[0960] High Stringency buffer e.g. 0.1.times.SSC followed by water
or Tris EDTA or Alkali buffer, 100 mM Sodium Carbonate/Hydrogen
carbonate, room temperature
[0961] *And/or Heat to 37 degrees C
[0962] And/or Heat to 37 to 70 degrees C
[0963] Harshness of treatment that can be tolerated is determined
by the number of cycles that need to be performed.
[0964] It is not essential to remove all probes, but it is
important to image which probes remain bound after treatment.
[0965] Less harsh treatments, labelled with an asterisk above, are
preferred.
[0966] The addition of glycerol can aid in keeping the DNA in good
condition.
[0967] Removal of oligonucleotides by enzymatic treatment would
also be preferable as this is less harsh.
[0968] The sequence can be computed from the hybridisation data
from each area, matching to the draft genome sequence where
available assigning probabilistic scores. The data is presented
with a colour chart indicating regions of high certainty and
regions of lower certainty. The regions of high certainty can be
used in genetic studies.
[0969] Sequencing may be by any of the sequencing approaches
described in this document. Alternatively the arrays of this
invention generate substrates highly suitable for sequencing by
synthesis.
[0970] Gene (mRNA) Expression Analysis
[0971] Single molecule arrays of two types can be prepared for gene
expression analysis. The first is oligonucleotide arrays, which are
either synthesised in situ or are pre-synthesised and spotted. The
second is by spotting of cDNAs or PCR product. The former can be
spotted essentially as described. For the latter the optimal
concentration to spot the oligonucleotides to get single molecule
detection with a method of choice needs to be determined
empirically, as already described. Following this cDNA arrays are
spotted essentially as described onto for example, aminosilane
slides using 50% DMSO as spotting buffer.
[0972] Preparing Fluorescently Labeled cDNA (Probe) by Brown/DeLisi
Protocol or an Adaptation Thereof:
[0973] For single molecule counting based on analysis of a single
dye molecule, the cDNA must be primer labelled where the primer
carries a single dye molecule or alternatively carries a single
biotin molecule or is aminated for attachment to single
beads/nanoparticles.
[0974] In a modification, the cDNAs are labelled with incorporation
of ddNTPs so that short fragments are created.
[0975] To anneal primer, mix 2 ug of mRNA or 50-100 .mu.g total RNA
with 4 ug of a regular or anchored oligonucleotide-dT primer in a
total volume of 15.4 ul:
TABLE 4
                                Cy3                 Cy5
mRNA (1 .gamma./.lambda.)       x .lambda.          y .lambda.
  (2 .mu.g of each if mRNA, 50-100 .mu.g if total RNA)
Oligonucleotide-dT              1 .lambda.          1 .lambda.
  (4 .gamma./.lambda.)
  (Anchored: 5'-TTT TTT TTT TTT TTT TTT TTV N-3'. This primer may
  be labelled at the 5' end with a dye molecule e.g. Cy3 or Cy5.
  This can be specified when the oligonucleotide is ordered from
  e.g. Oswel, Southampton, UK)
ddH.sub.2O (DEPC)               to 15.4 .lambda.    to 15.4 .lambda.
Total volume:                   15.4 .lambda.       15.4 .lambda.
[0976] Heat to 65.degree. C. for 10 min and cool on ice.
[0977] Add 14.6 .mu.L of reaction mixture each to Cy3 and Cy5
reactions:
TABLE 5
Reaction mixture                          .lambda.
5.times. first-strand buffer*             6.0
0.1 M DTT                                 3.0
Unlabeled dNTPs                           0.6
Cy3 or Cy5 (1 mM, Amersham)**             3.0
Superscript II (200 U/uL, Gibco BRL)      2.0
Total volume:                             14.6

Unlabeled dNTPs       Vol.       Final conc.
dATP (100 mM)         25 uL      25 mM
dCTP (100 mM)         25 uL      25 mM
dGTP (100 mM)         25 uL      25 mM
dTTP (100 mM)         10 uL      10 mM
ddH2O                 15 uL
Total volume:         100 uL

*5.times. first-strand buffer: 250 mM Tris-HCl (pH 8.3), 375 mM
KCl, 15 mM MgCl2
**Fluorescent nucleotides are omitted when a labelled primer is
included or when labelling is through a labelled ligation primer
(as described below)
Incubate at 42.degree. C. for 1 hr.
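The reaction mixture volumes can be checked, and scaled for several reactions, with a small calculation. The dictionary below copies the per-reaction volumes listed above; the scaling function and the 10% pipetting excess are assumptions added for illustration and are not part of the protocol. Because the Cy3 and Cy5 reactions differ in dye, the dye would in practice be added per reaction and not to a shared mix.

```python
# Per-reaction volumes in microlitres, as listed in the reaction mixture.
REACTION_MIX_UL = {
    "5x first-strand buffer": 6.0,
    "0.1 M DTT": 3.0,
    "Unlabeled dNTPs": 0.6,
    "Cy3 or Cy5 (1 mM)": 3.0,
    "Superscript II (200 U/uL)": 2.0,
}


def per_reaction_total(mix):
    """Sum the component volumes of one reaction."""
    return round(sum(mix.values()), 2)


def scale_mix(mix, n_reactions, excess=0.1):
    """Scale volumes for n reactions plus a pipetting excess (assumed 10%)."""
    factor = n_reactions * (1 + excess)
    return {name: round(vol * factor, 2) for name, vol in mix.items()}


total = per_reaction_total(REACTION_MIX_UL)
final_volume = round(15.4 + total, 2)  # annealed primer/RNA plus reaction mixture
scaled = scale_mix(REACTION_MIX_UL, 4)
```

The components sum to the stated 14.6 microlitres, and adding this to the 15.4 microlitre annealing mix gives a 30 microlitre reaction.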
[0978] Add 1 .lambda. SSII (RT booster) to each sample. Incubate
for an additional 0.5-1 hrs.
[0979] Degrade RNA and stop reaction by addition of 15 .mu.l of
0.1 N NaOH, 2 mM EDTA and incubate at 65-70.degree. C. for 10 min.
If starting with total RNA, degrade for 30 min instead of 10 min.
[0980] Neutralize by addition of 15 .mu.l of 0.1 N HCl.
[0981] Add 380 .mu.l of TE (10 mM Tris, 1 mM EDTA) to a Microcon
YM-30 column (Millipore). Next add the 60 .mu.l of Cy5 probe and
the 60 .mu.l of Cy3 probe to the same microcon. (Note: If
re-purification of cy dye flow-through is desired, do not combine
probes until Wash 2.)
[0982] WASH 1: Spin column for 7-8 min. at 14,000.times.g.
[0983] WASH 2: Remove flow-through and add 450 ul TE and spin for
7-8 min. at 14,000.times.g. It is a good idea to save the
flow-through for each set of reactions in a separate
microcentrifuge tube in case the Microcon membrane ruptures.
[0984] WASH 3: Remove flow-through and add 450 ul 1.times. TE, 20
.mu.g of Cot1 human DNA (20 .mu.g/.mu.l, Gibco-BRL), 20 .mu.g polyA
RNA (10 .mu.g/.mu.l, Sigma, #P9403) and 20 .mu.g tRNA (10
.mu.g/.mu.l, Gibco-BRL, #15401-011). Spin 7-10 min. at
14,000.times.g. Look for concentration of the probe in the
microcon. The probe usually has a purple color at this point.
Concentrate to a volume of less than or equal to 28 ul. These
low volumes are attained after the centre of the membrane is dry
and the probe forms a ring of liquid at the edges of the membrane.
Make sure not to dry the membrane completely!
[0985] Invert the microcon into a clean tube and spin briefly at
14,000 RPM to recover the probe. Using a 22.times.60 mm coverslip,
use a total volume of 35 ul composed of 28 ul Probe and TE, 5.95 ul
20.times.SSC and 1.05 ul 10% SDS.
[0986] *20.times.SSC: 3.0 M NaCl, 300 mM NaCitrate (pH 7.0)
[0987] Adjust the probe volume to 28 ul column above.
[0988] For final probe preparation add 4.25.lambda. 20.times.SSC
and 0.75.lambda. 10% SDS. When adding the SDS, be sure to wipe the
pipette tip with clean, gloved fingers to rid of excess SDS. Avoid
introducing bubbles and never vortex after adding SDS.
[0989] Denature probe by heating for 2 min at 100.degree. C., and
spin at 14,000 RPM for 15-20 min. Place the entire probe volume on
the array under the appropriately sized glass cover slip.
Hybridize at 65.degree. C. for 14 to 18 hours in a custom slide
chamber with humidity maintained by a small reservoir of
3.times.SSC (spot around 3-6 .lambda. 3.times.SSC at each corner of
the slide, as far away from the array as possible).
[0990] II. Washing and Scanning Arrays:
[0991] Ready washes in 250 ml chambers to 200 ml volume as
indicated in the table below. Avoid adding excess SDS. The Wash 1A
chamber and the Wash 2 chambers should each have a slide rack
ready. All washes are done at room temperature.
TABLE 6
Wash   Description                 Vol (ml)   SSC                  SDS (10%)
1A     2.times. SSC, 0.03% SDS     200        200 ml 2.times.      0.6 ml
1B     2.times. SSC                200        200 ml 2.times.      --
2      1.times. SSC                200        200 ml 1.times.      --
3      0.2.times. SSC              200        200 ml 0.2.times.    --
[0992] Blot dry chamber exterior with towels and aspirate any
remaining liquid from the water bath. Unscrew chamber; aspirate the
holes to remove last traces of water bath liquid.
[0993] Place arrays, singly, in rack, inside Wash I chamber
(maximum 4 arrays at a time). Allow cover slip to fall, or
carefully use forceps to aid cover slip removal if it remains stuck
to the array. DO NOT AGITATE until cover slip is safely removed.
Then agitate for 2 min.
[0994] Remove array by forceps, rinse in a Wash II chamber without
a rack, and transfer to the Wash II chamber with the rack. This
step minimizes transfer of SDS from Wash I to Wash II. Wash arrays
by submersion and agitation for 2 min in Wash II chamber, then for
2 min in Wash III (transfer the entire slide rack this time).
[0995] Spin dry by centrifugation in a slide rack in a Beckman GS-6
tabletop centrifuge at 600 RPM for 2 min.
[0996] Analyse arrays immediately on a single molecule sensitive
detector such as the Light station (Atto-tec).
[0997] Instead of performing step 1 in the above protocol with
labelled target cDNA, because the requirement of the assay of this
invention is a single dye molecule, a target labelling procedure
can be omitted. Hence, unlabelled cDNA or Poly A mRNA or total
RNA can be hybridised directly. This is then followed by
hybridisation of either:
[0998] A random library of n-mers (e.g. 8-10 mers) which are
5' phosphorylated and 3' labelled and are ligated to arrayed
sequence specific oligonucleotide probes (e.g. as can be made by
Febit or Xeotron, or can be spotted), templated by the target
mRNA; or
[0999] A library of sequence specific probes which are labelled as
above and are ligated to oligonucleotides in an n-mer array,
templated by the target mRNA.
[1000] Where Total RNA is used blocking sequences are used to mop
up ribosomal RNAs, small nuclear RNAs and transfer RNAs.
[1001] In the above process, several dye molecules are incorporated
into each single cDNA molecule. If the density of the array is low
enough, signals from a single species can be distinguished by their
spatial co-localization and by the fact that they are a single
colour. The single molecules will form a Poisson distribution so
there will be some molecules that cannot be resolved, but these
will be minimal if the spacing is far enough apart. In an
alternative method the oligonucleotide d(T) primer is end labelled.
This can be labelled with a single dye molecule, multilabelled with
dendrimers or labelled with a Fluosphere (Molecular Probes).
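The effect of density on resolvability can be estimated under the assumption that molecules are placed at random on the surface (a homogeneous two-dimensional Poisson process). In that case the probability that a molecule has at least one neighbour closer than a distance d is 1 - exp(-rho * pi * d^2), where rho is the surface density. The function name, the densities and the resolution of 0.25 micrometres used below are illustrative assumptions.

```python
import math


def fraction_unresolved(density_per_um2, resolution_um):
    """Fraction of molecules with at least one neighbour closer than the resolution."""
    return 1.0 - math.exp(-density_per_um2 * math.pi * resolution_um ** 2)


# Compare a few hypothetical densities (molecules per square micrometre).
for density in (0.01, 0.1, 1.0):
    print(density, round(fraction_unresolved(density, 0.25), 4))
```

The unresolved fraction falls roughly in proportion to density at low densities, which is the quantitative sense in which unresolved molecules become minimal when the spacing is large.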
[1002] The results of the assay are based on the ratio of the
number of molecules (or colocalized sets of molecules) counted for
each of the populations.
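A count-based ratio of this kind might be computed per array spot as follows. The sketch is illustrative: the gene names and counts are invented, the Cy3 population is arbitrarily treated as the reference, and returning None for a zero reference count is a choice made for this example.

```python
import math


def expression_ratio(count_cy5, count_cy3):
    """Return (ratio, log2 ratio) of molecule counts; None if the reference count is zero."""
    if count_cy3 == 0:
        return None
    ratio = count_cy5 / count_cy3
    log2_ratio = math.log2(ratio) if ratio > 0 else float("-inf")
    return ratio, log2_ratio


# Hypothetical counts per spot: (Cy5 molecules, Cy3 molecules).
spot_counts = {"geneA": (240, 120), "geneB": (50, 200), "geneC": (10, 0)}
ratios = {gene: expression_ratio(*counts) for gene, counts in spot_counts.items()}
```

Because the inputs are discrete counts and not integrated intensities, counting statistics can be attached directly to each ratio if an uncertainty estimate is needed.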
[1003] Single molecules can be counted on low density arrays when
using a small number of cells (.about.1000) and when using normal
amounts (e.g. 10.sup.6). Alternatively, arrays can be single
molecule arrays by functionalisation. In this case, small amounts
of sample material (100-1000 cells) must be used to achieve the
single molecule functional array which can be used to count single
molecules.
[1004] Determining the levels of translated proteins by analyzing
mRNA linked to polysomes as in Brown et al.
[1005] RNA is extracted by methods known in the art.
[1006] Ligand-Protein Binding Assay on Single Molecule Chemical
Arrays
[1007] Aminosilane (APTES) slides from Asper biotech (Estonia) are
derivatised (according to Gavin MacBeath, Angela N. Koehler, and
Stuart L. Schreiber J. Am. Chem. Soc., 121 (34), 7967-7968, 1999)
to give surfaces that are densely functionalized with maleimide
groups. To achieve this, one face of each slide is treated with 20
mM N-succinimidyl 3-maleimido propionate (Aldrich Chemical Co.,
Milwaukee, Wis.) in 50 mM sodium bicarbonate buffer, pH 8.5, for
three hours. (This solution is prepared by dissolving the
N-succinimidyl 3-maleimido propionate in DMF and then diluting
10-fold with buffer). After incubation, the slides are washed
several times with milliQ water, dried by centrifugation, and
stored at room temperature under vacuum until further use. A
dilution series of biotin molecules is arrayed. Upon binding of
cy3-labelled streptavidin or a 20 nm Streptavidin coated Fluosphere
to the array, the optimal dilution for detecting single molecules
is established. A single molecule binding assay can then be
conducted. Where Streptavidin is labeled with a single cy3 dye, the
single step photobleaching characteristics of the dye are
sufficient to indicate single molecules.
[1008] Protein-Ligand Binding Assay on Single Molecule Protein
Arrays
[1009] Avidin, Streptavidin and Neutravidin are arrayed on a surface,
for example onto a biotin-derivatized surface. Fluorescent
semiconductor nanocrystals coated with biotin molecules (Quantum
Dot Corp) are then interacted with the Proteins using the Quantum
Dot buffer supplied by the vendor at a temperature between room
temperature and 45 degrees. A 1 hour reaction at 45 degrees is
sufficient. Arrayed single molecules are then interrogated. In an
alternative example, the Avidin and derivatives are also previously
labelled e.g. with different dyes or Fluospheres (Molecular Probes
Corp., Oreg.) according to which they can be distinguished. The
assay can then be carried out on array spreads of the avidin and
derivatives.
[1010] Protein:Protein/Antigen:Antibody Binding on Protein
Arrays
[1011] The following is adapted from the procedure of Haab and
Brown:
[1012] Preparation of Protein Analyte Solutions
[1013] Protein solutions and NHS-ester activated Cy3 and Cy5
solutions (Amersham) are prepared in a 0.1 M pH 8.0 sodium
carbonate buffer. The protein and dye solutions are mixed together
so that the final protein concentration is 0.2-2 mg/ml and the
final dye concentration is 100-300 .mu.M. Normally approximately
15 .mu.g protein is labeled per array. The reactions are allowed to sit
in the dark for 45 min and then quenched by the addition of a tenth
volume 1 M pH 8 Tris base (a 500-fold molar excess of quencher).
The reaction solutions are brought to 0.5 ml with PBS and then
loaded into microconcentrator spin columns (Amicon Microcon 10)
with a 10,000 Da molecular weight cut off. After centrifugation to
reduce the volume to approximately 10 .mu.l (approximately 20 min),
a 3% non-fat milk blocking solution is added to each Cy5-labeled
solution such that 25 .mu.l milk is added for each array to be
generated from the mix. (The milk is first spun down as
above.) The volume is again brought to 0.5 ml with PBS and the
sample again centrifuged to .about.10 .mu.l. The Cy3-labeled
reference mix is divided equally among the Cy5-labeled mixes, and
PBS is added to each to achieve 25 .mu.l for each array. Finally,
the mixes are filtered with a 0.45 .mu.m spin filter (Millipore) by
centrifugation at 10,000.times.g for 2 min.
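The stated 500-fold molar excess of quencher can be checked with simple arithmetic: a tenth volume of 1 M Tris added to a reaction containing dye at 200 .mu.M (the midpoint of the stated range) gives 0.1/0.0002 = 500. The sketch below repeats the calculation across the stated dye range; the function and parameter names are hypothetical.

```python
def quencher_molar_excess(dye_molar, quencher_stock_molar=1.0, volume_fraction=0.1):
    """Moles of quencher added per mole of dye in the labelling reaction.

    dye_molar            -- dye concentration in the reaction (mol/l)
    quencher_stock_molar -- concentration of the Tris stock (mol/l)
    volume_fraction      -- stock volume added, as a fraction of the reaction volume
    """
    # For a reaction volume V: quencher = stock * fraction * V, dye = dye_molar * V,
    # so V cancels out of the ratio.
    return quencher_stock_molar * volume_fraction / dye_molar


for dye_uM in (100, 200, 300):
    excess = quencher_molar_excess(dye_uM * 1e-6)
    print(f"{dye_uM} uM dye: about {excess:.0f}-fold molar excess of Tris")
```

At the ends of the stated dye range the excess is therefore roughly 1000-fold and 330-fold respectively.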
[1014] Binding to Array
[1015] Without allowing the array to dry, 25 .mu.l dye-labeled
protein solution is applied to the array surface and a 24.times.30
mm cover slip is placed over the solution. The arrays are sealed in
a chamber with an under-layer of PBS to provide humidification,
after which they are left at 4.degree. C. for 2 h. The arrays are
dipped briefly in PBS to remove the protein solution and cover
slip, and are then allowed to rock gently in PBS/0.1% Tween-20
solution for 20 min. The arrays are then washed twice in PBS for
5-10 min each and twice in H.sub.2O for 5-10 min each. All washes
are at room temperature. After spinning to dryness in a centrifuge
equipped with plate carriers (Beckman) or by removing moisture by
forced air, the single molecule protein arrays are ready for
analysis.
[1016] Measuring Physico-Chemical Properties and Interactions
[1017] Scanning probe microscopes can be used to measure
physicochemical properties of molecules. An AFM tip may be made
hydrophobic and its interaction with arrayed proteins can be
measured. In addition, a chemical (Chemical Force Microscopy) or
biomolecule can be attached to the tip of an AFM and its
interactions with an arrayed protein or DNA molecule can be
analysed. A 2-dimensional array of force curves can be obtained by
using an AFM developed by Asylum Research. Different aspects of the
interactions, such as electrostatics, can be determined from these
force curves by those skilled in the art. The properties of a given
protein can be learnt and stored in a look up table. During or
after force mapping, comparisons are made with the look up table to
see if the ascertained features match those in the look up table.
Depending on the match the identity of a protein molecule can
be determined. Radmacher et al Science 1994 265: 1577 describe the use
of AFM force measurements for analysing the properties of an enzyme
molecule. The sample protein molecules can be arrayed in a manner
that those molecules with certain features lie at certain regions
of the array. For example, proteins may be immobilised on a surface
bearing a pH gradient on which different proteins bind to different
pH locations according to their corresponding isoelectric point
(Wasch-Mesthge et al Scanning 2000 22:380).
[1018] Another method for fingerprinting single protein molecules
is by taking advantage of the massive enhancement in Raman signal
due to surface enhancement by metal clusters. Colloidal gold
(Sigma) is added to a gold coated microscope slide (Erie
Scientific) and clusters are allowed to form to generate a SERS
(Surface Enhanced Raman Spectroscopy) active surface. Raman
spectra are obtained in the near infra-red wavelength range, using
a CCD camera and a spectrograph. The concentration of target
molecule required so that only a single target protein molecule is
immobilised per cluster is determined by testing a range of sample
concentrations. The spectra for each protein of interest are
obtained and stored in a look up table. Then a mixture of proteins
is arrayed onto a surface containing metallic clusters, at a
dilution that a single molecule will bind to a single cluster.
Raman spectra are then obtained from different locations on the
surface (using an X-Y translation of the sample, for example). Each
spectrum is compared to fingerprints in the look up table; if a
match is found then the presence of that particular protein in the
sample is indicated. The look up tables are stored in computer
memory and comparisons with the look up table may utilise neural
network and fuzzy logic software as known in the art.
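The look up table comparison can be illustrated with a much simpler matching rule than the neural network or fuzzy logic approaches mentioned above. The sketch below scores a measured spectrum against each stored fingerprint by cosine similarity and reports a match only above a cut-off. The similarity measure, the cut-off of 0.95, the protein names and the five-point spectra are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(spectrum, lookup_table, min_score=0.95):
    """Return (name, score) of the best fingerprint, or (None, score) if below cut-off."""
    best_name, best_score = None, 0.0
    for name, reference in lookup_table.items():
        score = cosine_similarity(spectrum, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= min_score:
        return best_name, best_score
    return None, best_score


# Hypothetical fingerprints sampled at the same five wavenumbers.
lookup_table = {
    "protein_A": [0.0, 1.0, 0.2, 0.0, 0.8],
    "protein_B": [0.9, 0.1, 0.0, 0.7, 0.0],
}
measured = [0.05, 0.95, 0.25, 0.0, 0.75]
name, score = identify(measured, lookup_table)
print(name, round(score, 3))
```

A spectrum that resembles none of the stored fingerprints returns `None`, which corresponds to no match being found at that location.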
[1019] Microscopy and Imaging
[1020] Fluorescence Detection Schemes and Instrumentation
[1021] The images of the molecules are projected onto the array of
a charge-coupled device (CCD) camera, from which they are digitized
and stored in memory. The images stored in memory are then
subjected to image analysis algorithms. These algorithms can
distinguish signal from background, monitor changes in signal
characteristics, and perform other signal processing functions. The
memory and signal processing may be performed off-line in a
computer, or in specialized digital signal processing (DSP)
circuits controlled by a microprocessor.
[1022] When individual molecules within the microarray spot are
analysed directly, then wide field CCD imaging is used. CCD imaging
enables a population of single molecules distributed
2-dimensionally on a surface to be viewed simultaneously. Although
microarray imagers based on epifluorescence illumination and wide
field imaging are available, the optics and range of stage movement
of these instruments do not enable single molecules to be
monitored across large areas of the slide surface. Typically,
wide-field illumination schemes may involve illumination with a
lamp, a defocused laser beam or by an evanescent field generated by
Total Internal Reflection of a laser beam. The field that can be
viewed is determined by the magnification of the objective, any
magnification due to the C-mount, and the size and number of pixels
of the CCD chip. Typically, a microarray spot can be viewed by
either a 40.times. or 60.times. objective depending on CCD camera
and C-mount. Therefore to view large regions of a slide (several
cm.sup.2) multiple images must be taken. A low noise high
sensitivity camera is used to capture images. Several camera
models can be used, for example a cooled Micromax camera (Roper
Scientific) controlled by MetaMorph or MetaView software (both
from Universal Imaging). MetaMorph can be run on a Dell OptiPlex
GX260 personal computer.
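The dependence of the viewed field on objective magnification, C-mount magnification and CCD geometry can be made concrete with a short calculation. The sketch below is illustrative only: the chip size (512 pixels of 24 microns) and the 1.times. C-mount are assumed values, not specifications of the cameras named above.

```python
import math


def field_of_view_um(pixels, pixel_size_um, objective_mag, c_mount_mag=1.0):
    """Width (or height) of the imaged field at the sample, in microns."""
    return pixels * pixel_size_um / (objective_mag * c_mount_mag)


def images_needed(region_um, fov_um):
    """Number of non-overlapping fields needed to span one side of a region."""
    return math.ceil(region_um / fov_um)


# Assumed example: 512 x 512 chip, 24 micron pixels, 60x objective, 1x C-mount.
fov = field_of_view_um(512, 24.0, 60)
per_side = images_needed(10_000, fov)  # one side of a 1 cm x 1 cm region
print(f"field of view: {fov:.1f} microns")
print(f"images for 1 cm.sup.2: {per_side} x {per_side} = {per_side ** 2}")
```

Under these assumptions one field is about 205 microns across, comparable to a single microarray spot, and a 1 cm.sup.2 region needs a few thousand images, which is why stage automation is required.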
[1023] The following CCD set ups can be used: intensified (e.g.
I-PentaMAX Gen III; Roper Scientific, Trenton, N.J., USA) or cooled
(e.g. Model ST-71; Santa Barbara Instruments Group, Calif., USA)
cameras; or an ISIT camera composed of a SIT camera (Hamamatsu) and
an image intensifier (VS-1845, Video Scope International, USA), with
images stored on S-VHS videotape. Videotaped images are processed
with a digital image processor (Argus-30, Hamamatsu Photonics). Gain
settings are adjusted depending on camera and brightness of signal.
[1024] The movement from one field of view to another can be done
by attaching the substrate to an X-Y translation stage (Prior
Scientific).
[1025] Feature Recognition and Single Molecule Imaging
[1026] MetaMorph's optional microarray module and a low
magnification objective are used to locate spots before taking a
CCD image of each of the spots using higher magnification.
[1027] As the signal from the spots containing singly resolvable
molecules is very low under low magnification, a marker dye, which
emits at a different wavelength to the sample emission, should be
included in the spots to help locate them. The objectives need to
be of high numerical aperture (NA) in order to obtain good
resolution and contrast. The integration of an autofocusing
capability within the procedure to maintain focus as the slide is
scanned, is useful especially when Total Internal Reflection
Fluorescence microscopy (TIRF) is employed. Software can be used to
control Z movement (integral to motorized microscopes) for the
purpose of autofocusing (e.g. MetaMorph). Images of microarray
spots can be obtained by x-y movements of the sample stage (e.g.
using Prior Scientific's Proscan stage under MetaMorph control). To
avoid photobleaching it is advisable to use a shutter (e.g. from
Prior Scientific) to shut off illumination while moving from one
spot to another. A controller (e.g. Prior Scientific ProScan) can be
used to control the X-Y stage, the filter wheels and the shutter.
[1028] Once the spots are found, their coordinates are recorded by
the software controlling the instrument and then after each base
addition, a CCD image is taken of each spot of the microarray.
[1029] In addition to the instrument being used for looking at a
microarray where template molecules have been captured by probes, a
large number of samples can be gridded (as a microarray) to form an
array of arrays and then the instrument can be used to analyse each
array. The samples may be individual nucleotide populations or a
set of differentially labeled nucleotide populations.
[1030] Two imaging set ups, Total Internal Reflection Fluorescence
microscopy (TIRF) and epi-fluorescence microscopy have been
used.
[1031] Epi-Fluorescence Imaging
[1032] Images of single molecules labeled with a single dye
molecule can be obtained using a standard epi-fluorescence
microscopy set up, using high NA objectives and a high grade CCD
camera. However, the image can be hazy. In order to obtain a
clearer image it is preferable to use deconvolution software to
remove the haze. Deconvolution modules are available as drop-ins
for MetaMorph software. When the single molecules are labeled with
nanoparticles the camera and objectives may be of a lower
grade.
[1033] Total Internal Reflection Microscopy (TIRF)
[1034] TIRF enables very clean images to be obtained by creating an
evanescent field which decays exponentially from the surface, for
example using an off-the-shelf system for objective-style TIRF (such
as those produced by Olympus or Nikon). A full description can be
found in the brochure at the following website:
www.nikon-instruments.com/uk/pdf/brochure-tirf.pdf
[1035] Objective-style TIRF can be used when the sample is on a
coverslip. However, it is not compatible with a sample on a
microscope slide. For this, prism-type TIRF must be used (see Light
Microscopy in Biology: A Practical Approach, Ed. A J Lacey, OUP). In
addition, a high NA condenser can be used to create TIRF on a
microscope slide.
[1036] There are two configurations that can be used with TIRF. The
first is the Prism method and the second is the objective
method.
[1037] The objective method is supported by Olympus Microscopes and
application notes are found at the following web site:
http://www.olympusmicro.com/primer/techniques/fluorescence/tirf/olympusaptirf.html
[1038] The Prism method below is described in Osborne et al J.
Phys. Chem. B, 105 (15), 3120-3126, 2001.
[1039] This instrument consists of an inverted optical microscope
(Nikon TE200, Japan), two color laser excitation sources, and an
Intensified Charge Coupled Device (ICCD) camera (Pentamax,
Princeton Instruments, NJ). A mode-locked frequency-doubled Nd:YAG
laser (76 MHz Antares 76-s, Coherent) is split into two beams to
provide up to 100 mW of 532-nm laser light and pump a dye laser
(700 series, Coherent) with output powers in excess of 200 mW at
630 nm (DCM, Lambda Physik). The sample chamber is inverted over a
100.times. oil immersion objective lens and a 60.degree. fused silica
dispersion prism optically coupled to the back of the slide through
a thin film of glycerol. Laser light is focused with a 20-cm focal
length lens at the prism such that at the glass/sample interface it
subtends an angle of approximately 68.degree. to the normal of the slide
and undergoes total internal reflection (TIR). The critical angle for
a glass/water interface is 66.degree.. The footprint of the TIR has a
1/e.sup.2 diameter of about 300 .mu.m. Fluorescence produced by excitation of the
sample with the surface-specific evanescent wave is collected by
the objective, passed through a dichroic beam splitter (560 DRLP,
Omega Optics), and filtered before imaging onto the ICCD camera.
Images are recorded by using synchronized 532 nm excitation
with detection at 580 nm (580DF30, Omega) for TAMRA labeled
substrates and 630 nm excitation with detection at 670 nm (670DF40,
Omega) for Cy5 labeled probes. Exposure times are set between 250
and 500 ms with the ICCD gain at maximum (1 kV). The laser powers
at the prism are adjusted to 40 mW at both laser wavelengths.
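The critical angle quoted above follows from Snell's law, and the depth of the evanescent field at a given incidence angle follows from the standard expression d = wavelength / (4 pi sqrt(n1.sup.2 sin.sup.2(theta) - n2.sup.2)). The sketch below evaluates both. The refractive indices (1.46 for fused silica, 1.33 for water) are assumed textbook values and are not given in the description.

```python
import math


def critical_angle_deg(n_glass, n_sample):
    """Critical angle for total internal reflection, in degrees from the normal."""
    return math.degrees(math.asin(n_sample / n_glass))


def penetration_depth_nm(wavelength_nm, n_glass, n_sample, incidence_deg):
    """1/e intensity decay depth of the evanescent field, in nm."""
    s = (n_glass * math.sin(math.radians(incidence_deg))) ** 2 - n_sample ** 2
    if s <= 0:
        raise ValueError("incidence angle is not above the critical angle; no TIR")
    return wavelength_nm / (4 * math.pi * math.sqrt(s))


n_silica, n_water = 1.46, 1.33  # assumed refractive indices
theta_c = critical_angle_deg(n_silica, n_water)
depth = penetration_depth_nm(532, n_silica, n_water, 68)
print(f"critical angle: {theta_c:.1f} degrees")
print(f"penetration depth at 68 degrees, 532 nm: {depth:.0f} nm")
```

With these assumed indices the critical angle comes out just under 66 degrees, in line with the value stated above, and an incidence angle of 68 degrees gives an evanescent field confined to well under a micron from the surface.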
[1040] Although the above describes use of the system on an
inverted microscope, an upright microscope can also be configured
in an appropriate way, for example as in Braslavsky I, Hebert B, Kartalov
E, Quake SR. Proc Natl Acad Sci U S A 100:3960-4 (2003).
[1041] Multi-Colour Single Molecule Imaging
[1042] When the single molecule technique involves different
fluorescent labels added sequentially, then a single CCD image can
be taken for each. However, if each nucleotide is differentially
labeled (i.e. each nucleotide type is labelled with a different
fluorophore) and added simultaneously, then the signal from each of
the different fluorophores needs to be acquired distinguishably.
This can be done by taking four separate images by switching
excitation/emission filters. Alternatively, an image (wavelength)
splitter such as the Dual View (Optical Insights, Santa Fe, N.Mex.)
or W View (Hamamatsu, Japan), which directs the light through two
separate bandpass filters with little loss of light between them,
can be used for imaging two different wavelengths onto different
portions of a CCD chip. Alternatively the light can be split into
four wavelengths and sent to the four quadrants of a CCD chip (e.g.
Quad View from Optical Insights). This obviates the need to switch
filters using a filter wheel. A MetaMorph drop-in for single image
dual emission optical splitters can also be employed.
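When four wavelengths are projected onto the four quadrants of one CCD chip, each acquired frame has to be cut back into four sub-images before analysis. The following is a minimal sketch, assuming the frame is a plain list of pixel rows and that the quadrants are equal in size; the assignment of quadrant to fluorophore depends on the splitter and is not specified here.

```python
def split_quadrants(image):
    """Split a 2-D image (list of rows) into four equal quadrants.

    Any odd trailing row or column is dropped so that all quadrants match in size.
    """
    rows, cols = len(image), len(image[0])
    half_r, half_c = rows // 2, cols // 2
    top = image[:half_r]
    bottom = image[half_r:2 * half_r]
    return {
        "top_left": [row[:half_c] for row in top],
        "top_right": [row[half_c:2 * half_c] for row in top],
        "bottom_left": [row[:half_c] for row in bottom],
        "bottom_right": [row[half_c:2 * half_c] for row in bottom],
    }


# Hypothetical 4 x 4 frame in which each quadrant carries a different value.
frame = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
channels = split_quadrants(frame)
print(channels["top_right"])    # [[2, 2], [2, 2]]
print(channels["bottom_left"])  # [[3, 3], [3, 3]]
```

In practice the four sub-images would also need to be registered to one another, since the splitter optics rarely align them to the exact pixel.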
[1043] Analysing Single Molecules Randomly Distributed on a
Surface
[1044] As an alternative to microarray spot finding prior to single
molecule imaging and for implementations where the single molecules
to be analysed are not organised within the spatially addressable
microarray spots, a series of images of the surface can be taken by
x-y translation of the slide. A super-wide field image is then
composed by stitching each of the images together.
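Composing the super-wide field image can be sketched as placing each tile at its grid position. The example below assumes equally sized tiles taken on a regular x-y grid with no overlap and perfect stage registration, which is a simplification; real stitching would usually use overlapping fields and an alignment step. The function name and the tile data are hypothetical.

```python
def stitch_tiles(tiles):
    """Assemble a mosaic from tiles keyed by (row, col) grid position.

    tiles -- dict mapping (row, col) to a 2-D tile (list of pixel rows);
             all tiles are assumed to have the same size and not to overlap.
    """
    n_rows = max(r for r, _ in tiles) + 1
    n_cols = max(c for _, c in tiles) + 1
    tile_height = len(tiles[(0, 0)])
    mosaic = []
    for r in range(n_rows):
        for y in range(tile_height):
            line = []
            for c in range(n_cols):
                line.extend(tiles[(r, c)][y])
            mosaic.append(line)
    return mosaic


# Hypothetical 2 x 2 grid of 2 x 2 pixel tiles.
tiles = {
    (0, 0): [[1, 1], [1, 1]],
    (0, 1): [[2, 2], [2, 2]],
    (1, 0): [[3, 3], [3, 3]],
    (1, 1): [[4, 4], [4, 4]],
}
for line in stitch_tiles(tiles):
    print(line)
```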
[1045] SNOM (Scanning Near-Field Optical Microscopy)
[1046] SNOM (e.g. BioLyser SNOM (Triple-O, Potsdam, Germany)) can be
used for near field optical imaging, allowing molecules at closer
spacings to be individually resolved.
[1047] Stains and Antifade
[1048] The following oxygen scavenging solution can be used to
minimise photobleaching when single molecule analysis is done in
solution: Catalase (0.2 mg/ml), Glucose oxidase (0.1 mg/ml), DTT
(20 mM), BSA (0.5 mg/ml), Glucose 3 mg/ml. This can be added to the
buffer solution that is being used in the experiment.
[1049] Adding 20-30% beta-mercaptoethanol to a solution will
attenuate photobleaching. DNA can be stained by using a variety of
dyes available from Molecular Probes (Oregon), e.g. YOYO-1, POPO-3
and SYBR Gold, used at the manufacturer's recommended
concentrations.
[1050] AFM
[1051] Images can be obtained by using a Multimode IIIa with a
NanoScope IV controller and Si cantilever tips (Veeco, Santa
Barbara, Calif.). This is placed on an active isolation system
(MOD1-M, Halcyonics, Gottingen, Germany). Typical imaging
parameters are 60-90 Hz resonant frequency, 0.5-1V oscillation
amplitude, 0.3-0.7V setpoint voltage, 1.5-2 Hz scan rate.
[1052] Image Processing, Single Molecule Counting and Error
Management
[1053] The above can be done using algorithms of any of the types
described in the detailed description of the invention. In addition,
below is an example of how to do single molecule counting using
simple commercial software.
[1054] The objective is to use image analysis to count and
determine the confidence in putative signals from single molecules
within a microarray spot. The image processing package SigmaScanPro
is used to automate single molecule counting and measurement. The
procedure described here, or modifications of it, can be used for
simple single molecule signal counting or more complex analyses of
single molecule information, multi-colour analysis and error
management.
[1055] The microarray spot or array region of interest image is
captured using a CCD camera, such as the I-PentaMAX Gen III or Gen
IV (Roper Scientific), and an off-the-shelf frame grabber board. The
single molecules are excited by laser in a TIRF configuration,
using a 100.times. objective and spots of approximately 200 microns
in diameter.
[1056] The image is spatially calibrated using the Image,
Calibrate, Distance and Area menu option. A 2-Point Rescaling
calibration is performed using micron units. Single molecule areas
will then be reported in square microns.
[1057] Increasing the contrast between single molecules and the
surrounding region will help identify the single molecules by
thresholding. Image contrast is improved by performing a Histogram
Stretch from the Image, Intensity menu. This procedure measures the
grey levels in the image. The user then "stretches" the range of
grey levels with significant magnitude over the entire 255 level
intensity range. In this case moving the Old Start line with the
mouse to an intensity of 64 will eliminate the effect of the
insignificant dark gray levels and improve the contrast.
[1058] The single molecules can be identified by thresholding the
intensity level to fill in the darkest objects. This is done by
selecting Threshold, Intensity Threshold from the Image menu.
[1059] Under certain spotting conditions (e.g. 1.5 M Betaine
3.times.SSC onto enhanced aminosilane slides as well as in 50% DMSO
buffer under certain conditions) the spot has a thin but
discernibly bright ring around the edge. This can be used to define
the area to be processed. This ring can be removed from
contributing to the data by using image overlay layer math to
intersect the single molecule signals with an overlay plane
consisting of the interior of the ring. The overlay is created by
filling light pixels in the interior of the spot and selecting out
the ring by thresholding. Set the Level to be 180 and the option to
select objects that are lighter than this level. Select the Fill
Measurement mode (paint bucket icon) and left click in the interior
of the spot to fill it. Set the source overlay to red in the
Measurements, Settings, Overlays dialog. There are "holes" in the
red overlay plane that are not filled since they contain bright
pixels from the single molecules. To fill them select Image,
Overlay Filters and select the Fill Holes option. Let both the
source and destination overlays be red. The red circular overlay
plane now contains the green single molecule signals.
[1060] The overlay math feature is used to identify the
intersection of the red and green overlay planes. From the Image
menu select Overlay Math and specify red and green to be the source
layers and blue to be the destination layer. Then AND the two
layers to obtain the intersection.
[1061] The blue pixels overlay the single molecules, which can now be
counted. Select the blue overlay plane as the source overlay from
the Overlays tab in the Measurement Settings dialog. Select
Perimeter, Area, Shape Factor, Compactness and Number of Pixels
from the Measurements tab in the Measurements Settings dialog. Then
measure the single molecule signals by using Measure Objects from
the Measurements menu. The single molecule signals can be
arbitrarily numbered and the corresponding measured quantities
placed into an Excel (Microsoft) spreadsheet.
[1062] A macro is written to perform this for each spot in the
array.
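The counting steps above (threshold, intersect with the spot interior, then measure each object) can be sketched outside SigmaScanPro as follows. This is not the SigmaScanPro macro itself but a generic analogue, assuming the image is a list of pixel rows, that the spot interior is supplied as a boolean mask, and that touching bright pixels (4-connectivity) belong to one signal. The threshold and the example image are illustrative.

```python
def count_signals(image, threshold, mask=None):
    """Return the area in pixels of each connected bright object.

    image     -- 2-D list of grey levels
    threshold -- pixels brighter than this are treated as signal
    mask      -- optional 2-D list of booleans; only pixels where the mask is
                 True are considered (the AND with the spot interior)
    """
    rows, cols = len(image), len(image[0])
    bright = [[image[r][c] > threshold and (mask is None or mask[r][c])
               for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not bright[r][c] or seen[r][c]:
                continue
            # Flood fill from this pixel to collect one connected object.
            stack, area = [(r, c)], 0
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and bright[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            areas.append(area)
    return areas


# Hypothetical 5 x 6 image: bright corner pixels stand in for the spot ring.
image = [
    [200, 0, 0, 0, 0, 200],
    [0, 0, 90, 95, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 80, 0, 0, 85, 0],
    [200, 0, 0, 0, 0, 200],
]
interior = [[0 < r < 4 and 0 < c < 5 for c in range(6)] for r in range(5)]
areas = count_signals(image, threshold=64, mask=interior)
print(len(areas), areas)  # 3 [2, 1, 1]
```

Running the same function on each spot image in turn plays the role of the per-spot macro; the returned areas could be filtered by size or shape to reject objects that are unlikely to be single molecules.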
[1063] The microarray slide is translated relative to the CCD by an
X-Y translation stage (Prior Scientific), with images taken at
approximately 100 micron spacings.
[1064] The example given here is for end-point analysis. However,
for enhanced error discrimination real time analysis may be
desirable; in this case wider field images can be taken of the
whole array by the CCD camera under lower magnification and
enhanced by image processing. However, in most cases, a time window
after the start of the reaction will have been determined within
which the image should be acquired to gate out errors, which may
occur early (non specific absorption) and late (mismatch
interactions) in the process.
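Gating on a time window can be illustrated as a simple filter over time-stamped signals. In the sketch below the event records, the field name `time_s` and the window limits of 30 and 300 seconds are hypothetical; in practice the window would be determined empirically for each reaction.

```python
def gate_events(events, window_start_s, window_end_s):
    """Keep only events first observed inside the acquisition window."""
    return [e for e in events if window_start_s <= e["time_s"] <= window_end_s]


# Hypothetical signals with the time (seconds after reaction start) at which
# each first appeared.
events = [
    {"id": 1, "time_s": 5},    # early: possibly non-specific
    {"id": 2, "time_s": 60},   # inside the window
    {"id": 3, "time_s": 600},  # late: possibly a mismatch interaction
]
kept = gate_events(events, window_start_s=30, window_end_s=300)
print([e["id"] for e in kept])  # [2]
```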
[1065] Adobe Photoshop software contains a number of image
processing facilities which can be used and more advanced plug-ins
are available. The Image Processing Toolkit, which plugs in to
Photoshop, MicroGrafx Picture Publisher, NIH Image and other
programs, is available from Quantitative Image Analysis.
[1066] Biosensors
[1067] Biosensor in Which Single Molecules are Detected by
Fluorescence
[1068] The molecular array, an excitation source and a detector CCD
are integrated into a small device. The molecular array is
synthesised on a substrate in which an evanescent field is created
by a waveguide.
[1069] Biosensor in Which Single Molecules are Detected by
Conductivity
[1070] An integrated biosensor is created in which the molecules of
the array are attached to electrodes. A voltage source for the
electrodes and electronic circuitry for detection are provided.
[1071] Additional Elements of the Integrated Sensor
[1072] In addition, means for any or all of the following can
optionally be included in the sensor: microprocessor; memory;
hardware-based signal processing (circuitry for processing the
electrical change generated by at least one sensing element into a
resulting output signal indicative of the feature analysed);
software-based signal processing; software-based processing of
results; display of results; and a transmitting antenna and
optionally a receiver for communication with a local or remote
computer carrying a central database in computer memory. The
microprocessor includes suitable memory as well as processing. The
device can further include a set of internal batteries for powering
the processor and the sensing array.
[1073] The microprocessor electronics may include an analog-digital
(A/D) converter as well as resident control and timing circuitry
which is used in conjunction with a reference crystal in order to
detect the amount of electrical change by each of the sensing
elements of the array for processing, such as comparing to a stored
look-up table and then outputting the results to an LCD or other
suitable display.
[1074] Coating of Palladium
[1075] A saturated Pd solution is prepared by dissolving Palladium
in aqueous buffer or other solvent. This is then placed on the DNA
sample and allowed to react. Then a reducing solution containing
dimethylamine borane is added. Excess reagent is washed away or
diluted out.
[1076] Coating of Silver
[1077] Being a polyanion, the DNA bridge is loaded with silver ions
by Na+/Ag+ ion exchange using 0.1 M AgNO3 basic aqueous solution
(ammonium hydroxide, pH 10.5). The silver ion is reduced using basic
hydroquinone solution (0.05 M, ammonium hydroxide, pH 10.5) to form
small silver aggregates bound to the DNA. The DNA wire is developed
using an acidic solution (pH 3.5 citrate buffer) of hydroquinone
(0.05 M) and silver ions (0.1 M) under low light conditions.
[1078] Aldehyde Mediated Metalization
[1079] Keren et al (Science 297: 72, 2002) have described a
procedure in which a reducing agent is first coupled to the DNA
polymer by incubating the DNA with glutaraldehyde. AgNO3 in ammonia
buffer is then added and the reduction of silver by the DNA bound
aldehyde leads to the formation of microscopic Ag aggregates along
the DNA polymer. The silver aggregates serve as catalysts for
subsequent gold deposition to produce continuous gold wires.
[1080] The same procedure can be used for metallization of a
microarray spot (see below).
[1081] Metallizing DNA with Zinc to Form M-DNA
[1082] A. Rakitin et al (Physical Review Letters 86:3670-3673) have
described an approach which involves substituting the imino proton
of each base pair with a metal ion, Zn.sup.2+ to obtain M-DNA with
altered electronic properties. M-DNA is prepared in 20 mM NaBO3
buffer, pH=9.0 (or 20 mM Tris, pH=7.5) with 10 mM NaCl at 20.degree. C. and
0.1 mM Zn.sup.2+. This treatment can be performed before or after
binding of the DNA to the array.
[1083] Gold can also be deposited using an evaporation procedure as
described (Quake and Scherer Science 290:1536-1540).
[1084] Metalizing Microarray Spots
[1085] A DNA array may first be made and then metallized thereby
becoming microelectrodes. This can be done by, for example, one of
the following approaches:
[1086] Glutaraldehyde treatment followed by metallization as
described above
[1087] Providing a thiol or sulfhydryl group on the array probes
such that colloidal gold particles interact with them. The gold
particles then seed silver enhancement. This can be done by an
adaptation of the strategy described by Taton et al Science 289: 1757
(2000): the gold particles (e.g. Sigma) bind to the array probes,
then silver enhancer (e.g. Sigma) is added. In this process silver
ions are reduced by hydroquinone to silver metal at the surfaces of
the gold nanoparticles.
[1088] Adding gold particles with a positively charged surface
coating, such as lysine, which bind by electrostatic attraction to
the negatively charged nucleic acid probes. This is done by
adjusting the pH of the colloidal gold particles to pH 7 and then
adding lysine molecules. In a test experiment, continuity between
two separated microarray spots due to a metalized DNA bridge can be
checked using mobile electrodes and an electrometer.
[1089] Fabrication of an Array of Microelectrodes and Deposition of
Probes
[1090] An array of microelectrodes is fabricated by electron beam
evaporation of chromium and gold onto silicon wafers or glass
surfaces previously patterned with an organic photoresist using
conventional UV light photolithography. After the photoresist is
removed the metal is annealed by heating and cleaned by reactive
ion etching. The resulting microelectrodes are connected to
separate printed circuit board tracks via gold wire bonds (which
may be fanned out). (Electrodes in the 100 nm range can also be made
by essentially the same type of procedure but using
electromagnetic waves of shorter wavelength.) The probes may be
deposited or synthesised on top of this array of microelectrodes.
[1091] The contact between the electrode and the metalized sample
DNA may be improved by engineering the interface by mixing for
example conjugated polymers such as polypyrrole with the nucleic
acid probes on the surface.
[1092] Single molecules can be viewed on stripped fused silica
optical fibres, essentially as described by Watterson et al
(Sensors and Actuators B 74: 27-36 (2001)). Molecular Beacons can be
seen in the same way (Liu et al Analytical Biochemistry 283: 56-63
(2000)). A biosensor device can be made in which single molecule
analysis of Molecular Beacons in an evanescent field can be
done.
[1093] The various features and embodiments, referred to in
individual sections above apply, as appropriate, to other sections,
mutatis mutandis. Consequently features specified in one section
may be combined with features specified in other sections, as
appropriate.
[1094] All publications mentioned in the above specification are
herein incorporated by reference. Various modifications and
variations of the described methods and system of the invention
will be apparent to those skilled in the art without departing from
the scope and spirit of the invention. Although the invention has
been described in connection with specific preferred embodiments,
it should be understood that the invention as claimed should not be
unduly limited to such specific embodiments. Indeed, various
modifications of the described modes for carrying out the invention
which are apparent to those skilled in molecular biology, single
molecule detection or combinatorial chemistry or related fields are
intended to be within the scope of the following claims.
Sequence Listing
SEQ ID NO: 1 (12 nt, DNA, Artificial; Lambda A overhang DNA): gggcggcgac ct
SEQ ID NO: 2 (12 nt, DNA, Artificial; Lambda B overhang DNA): aggtcgccgc cc
* * * * *
References