U.S. patent application number 16/312901, filed on 2017-06-23, was published on 2019-07-18 for novel nucleoside triphosphate transporter and uses thereof.
This patent application is currently assigned to The Scripps Research Institute. The applicant listed for this patent is THE SCRIPPS RESEARCH INSTITUTE. Invention is credited to Floyd E. ROMESBERG, Yorke ZHANG.
Application Number | 20190218257 16/312901 |
Document ID | / |
Family ID | 60784727 |
Publication Date | 2019-07-18 |
United States Patent Application | 20190218257 |
Kind Code | A1 |
ROMESBERG; Floyd E.; et al. | July 18, 2019 |
NOVEL NUCLEOSIDE TRIPHOSPHATE TRANSPORTER AND USES THEREOF
Abstract
Disclosed herein are proteins, methods, cells, engineered
microorganisms, and kits for generating a modified nucleoside
triphosphate transporter from Phaeodactylum tricornutum. Also
disclosed herein proteins, methods, cells, engineered
microorganisms, and kits for production of a nucleic acid molecule
that comprises an unnatural nucleotide utilizing a modified
nucleoside triphosphate transporter from Phaeodactylum
tricornutum.
Inventors: | ROMESBERG; Floyd E. (La Jolla, CA); ZHANG; Yorke (San Diego, CA) |
Applicant: |
Name | City | State | Country | Type |
THE SCRIPPS RESEARCH INSTITUTE | La Jolla | CA | US | |
Assignee: | The Scripps Research Institute, La Jolla, CA |
Family ID: | 60784727 |
Appl. No.: | 16/312901 |
Filed: | June 23, 2017 |
PCT Filed: | June 23, 2017 |
PCT NO: | PCT/US2017/039133 |
371 Date: | December 21, 2018 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
62354650 | Jun 24, 2016 | |
Current U.S. Class: | 1/1 |
Current CPC Class: | C12N 15/90 20130101; C12N 2310/20 20170501; C12N 15/11 20130101; C12N 2800/80 20130101; C12N 15/10 20130101; C07K 14/405 20130101; C12N 9/22 20130101 |
International Class: | C07K 14/405 20060101 C07K014/405; C12N 9/22 20060101 C12N009/22; C12N 15/11 20060101 C12N015/11; C12N 15/90 20060101 C12N015/90 |
Government Interests
STATEMENT AS TO FEDERALLY SPONSORED RESEARCH
[0002] The invention disclosed herein was made, at least in part,
with U.S. government support under Grant No. GM060005 by The
National Institutes of Health (NIH). Accordingly, the U.S.
Government has certain rights in this invention.
Claims
1. An isolated and modified nucleoside triphosphate transporter
from Phaeodactylum tricornutum (PtNTT2) comprising a deletion,
wherein the isolated and modified nucleoside triphosphate
transporter is obtained from an engineered cell.
2. The isolated and modified nucleoside triphosphate transporter of
claim 1, wherein the deletion is a terminal deletion or an internal
deletion.
3. The isolated and modified nucleoside triphosphate transporter of
claim 2, wherein the deletion is a terminal deletion.
4. The isolated and modified nucleoside triphosphate transporter of
claim 2, wherein the deletion is an internal deletion.
5. The isolated and modified nucleoside triphosphate transporter of
claim 3, wherein the terminal deletion is an N-terminal deletion, a
C-terminal deletion, or a deletion of both termini.
6. The isolated and modified nucleoside triphosphate transporter of
claim 3, wherein the terminal deletion is an N-terminal
deletion.
7. The isolated and modified nucleoside triphosphate transporter of
claim 2, wherein the deletion comprises about 5, 10, 15, 20, 22,
25, 30, 40, 44, 50, 60, 66, 70, or more amino acid residues.
8. The isolated and modified nucleoside triphosphate transporter of
claim 2, wherein the isolated and modified nucleoside triphosphate
transporter comprises a deletion of about 5, 10, 15, 20, 22, 25,
30, 40, 44, 50, 60, 66, 70, or more amino acid residues at the
N-terminus.
9. The isolated and modified nucleoside triphosphate transporter of
claim 8, wherein the isolated and modified nucleoside triphosphate
transporter comprises a deletion of about 66 amino acid residues at
the N-terminus.
10. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the isolated and modified nucleoside
triphosphate transporter comprises at least 80%, 85%, 90%, 95%,
96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 4.
11. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the isolated and modified nucleoside
triphosphate transporter comprises 100% sequence identity to SEQ ID
NO: 4.
12. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the isolated and modified nucleoside
triphosphate transporter comprises at least 80%, 85%, 90%, 95%,
96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 6.
13. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the isolated and modified nucleoside
triphosphate transporter comprises 100% sequence identity to SEQ ID
NO: 6.
14. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the isolated and modified nucleoside
triphosphate transporter comprises at least 80%, 85%, 90%, 95%,
96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 8.
15. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the isolated and modified nucleoside
triphosphate transporter comprises 100% sequence identity to SEQ ID
NO: 8.
16. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the isolated and modified nucleoside
triphosphate transporter further comprises a signal peptide.
17. The isolated and modified nucleoside triphosphate transporter
of claim 16, wherein the signal peptide is selected from Table
3.
18. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the engineered cell comprises a prokaryotic
cell.
19. The isolated and modified nucleoside triphosphate transporter
of claim 1, wherein the engineered cell is E. coli.
20. A nucleic acid molecule encoding an isolated and modified
nucleoside triphosphate transporter of claims 1-19.
21. Use of a modified nucleoside triphosphate transporter of claims
1-19 for the incorporation of an unnatural triphosphate during the
synthesis of a nucleic acid molecule.
22. An engineered cell comprising: a first nucleic acid molecule
encoding a modified nucleoside triphosphate transporter from
Phaeodactylum tricornutum (PtNTT2).
23. The engineered cell of claim 22, wherein the nucleic acid of
the modified nucleoside triphosphate transporter is incorporated in
the genomic sequence of the engineered cell.
24. The engineered cell of claim 22, wherein the engineered cell
comprises a plasmid comprising the modified nucleoside triphosphate
transporter.
25. The engineered cell of claim 22, wherein the modified
nucleoside triphosphate transporter is a codon optimized nucleoside
triphosphate transporter from Phaeodactylum tricornutum.
26. The engineered cell of claim 22, wherein the modified
nucleoside triphosphate transporter comprises a deletion.
27. The engineered cell of claim 26, wherein the deletion is a
terminal deletion or an internal deletion.
28. The engineered cell of claim 27, wherein the deletion is an
N-terminal truncation, a C-terminal truncation, or a truncation of
both termini.
29. The engineered cell of claim 26, wherein the modified
nucleoside triphosphate transporter comprises a deletion of about
5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more amino
acid residues.
30. The engineered cell of claim 26, wherein the modified
nucleoside triphosphate transporter comprises a deletion of about
5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more amino
acid residues at the N-terminus.
31. The engineered cell of claim 30, wherein the modified
nucleoside triphosphate transporter comprises a deletion of about
66 amino acid residues at the N-terminus.
32. The engineered cell of claim 22, wherein the isolated and
modified nucleoside triphosphate transporter comprises at least
80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ
ID NO: 4.
33. The engineered cell of claim 22, wherein the isolated and
modified nucleoside triphosphate transporter comprises 100%
sequence identity to SEQ ID NO: 4.
34. The engineered cell of claim 22, wherein the isolated and
modified nucleoside triphosphate transporter comprises at least
80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ
ID NO: 6.
35. The engineered cell of claim 22, wherein the isolated and
modified nucleoside triphosphate transporter comprises 100%
sequence identity to SEQ ID NO: 6.
36. The engineered cell of claim 22, wherein the isolated and
modified nucleoside triphosphate transporter comprises at least
80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ
ID NO: 8.
37. The engineered cell of claim 22, wherein the isolated and
modified nucleoside triphosphate transporter comprises 100%
sequence identity to SEQ ID NO: 8.
38. The engineered cell of claim 22, wherein the modified
nucleoside triphosphate transporter is under the control of a
promoter selected from an E. coli promoter or a phage promoter.
39. The engineered cell of claim 38, wherein the promoter is
selected from P.sub.bla, P.sub.lac, P.sub.lacUV5, P.sub.H207,
P.sub..lamda., P.sub.tac, or P.sub.N25.
40. The engineered cell of claim 38, wherein the modified
nucleoside triphosphate transporter is under the control of
promoter P.sub.lacUV5.
41. The engineered cell of claim 38, wherein the modified
nucleoside triphosphate transporter is under the control of a
promoter from a lac operon.
42. The engineered cell of claim 22, wherein the modified
nucleoside triphosphate transporter is encoded within a pSC
plasmid.
43. The engineered cell of claim 22, wherein the modified
nucleoside triphosphate transporter decreases doubling time of the
engineered cell.
44. The engineered cell of claim 22, wherein the modified
nucleoside triphosphate transporter enables unnatural base pair
retention of about 50%, 60%, 70%, 80%, 90%, 95%, 99% or more.
45. The engineered cell of claim 22, wherein the engineered cell
further comprises a second nucleic acid molecule encoding a Cas9
polypeptide or variants thereof, a third nucleic acid molecule
encoding a single guide RNA (sgRNA) comprising a crRNA-tracrRNA
scaffold; and a fourth nucleic acid molecule comprising an
unnatural nucleotide.
46. The engineered cell of claim 45, wherein the second nucleic
acid molecule, the third nucleic acid molecule, and the fourth
nucleic acid molecule are encoded in one or more plasmids.
47. The engineered cell of claim 46, wherein the sgRNA encoded by
the third nucleic acid molecule comprises a target motif that
recognizes a modification at the unnatural nucleotide position
within the fourth nucleic acid molecule.
48. The engineered cell of claim 47, wherein the modification at
the unnatural nucleotide position within the third nucleic acid
molecule generates a modified third nucleic acid molecule.
49. The engineered cell of claim 47, wherein the modification is a
substitution.
50. The engineered cell of claim 47, wherein the modification is a
deletion.
51. The engineered cell of claim 47, wherein the modification is an
insertion.
52. The engineered cell of claim 45, wherein the sgRNA encoded by
the third nucleic acid molecule further comprises a protospacer
adjacent motif (PAM) recognition element.
53. The engineered cell of claim 52, wherein the PAM element is
adjacent to the 3' terminus of the target motif.
54. The engineered cell of claim 45, wherein the combination of
Cas9 polypeptide or variants thereof and sgRNA modulates
replication of the modified fourth nucleic acid molecule.
55. The engineered cell of claim 45, wherein the combination of
Cas9 polypeptide or variants thereof, sgRNA and the modified
nucleoside triphosphate transporter modulates replication of the
modified fourth nucleic acid molecule.
56. The engineered cell of claim 45, wherein the combination of
Cas9 polypeptide or variants thereof, sgRNA and the modified
nucleoside triphosphate transporter decreases the replication rate
of the modified fourth nucleic acid molecule by about 80%, 85%,
95%, 99%, or higher.
57. The engineered cell of claim 45, wherein the production of the
fourth nucleic acid molecule in the engineered cell increases by
about 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or
higher.
58. The engineered cell of claim 45, wherein the Cas9 polypeptide
or variants thereof generate a double-stranded break.
59. The engineered cell of claim 45, wherein the Cas9 polypeptide
is a wild-type Cas9.
60. The engineered cell of claim 22, wherein the unnatural
nucleotide comprises an unnatural base selected from the group
consisting of 2-aminoadenin-9-yl, 2-aminoadenine, 2-F-adenine,
2-thiouracil, 2-thio-thymine, 2-thiocytosine, 2-propyl and alkyl
derivatives of adenine and guanine, 2-amino-adenine,
2-amino-propyl-adenine, 2-aminopyridine, 2-pyridone,
2'-deoxyuridine, 2-amino-2'-deoxyadenosine, 3-deazaguanine,
3-deazaadenine, 4-thio-uracil, 4-thio-thymine, uracil-5-yl,
hypoxanthin-9-yl (I), 5-methyl-cytosine, 5-hydroxymethyl cytosine,
xanthine, hypoxanthine, 5-bromo, and 5-trifluoromethyl uracils and
cytosines; 5-halouracil, 5-halocytosine, 5-propynyl-uracil,
5-propynyl cytosine, 5-uracil, 5-substituted, 5-halo, 5-substituted
pyrimidines, 5-hydroxycytosine, 5-bromocytosine, 5-bromouracil,
5-chlorocytosine, chlorinated cytosine, cyclocytosine, cytosine
arabinoside, 5-fluorocytosine, fluoropyrimidine, fluorouracil,
5,6-dihydrocytosine, 5-iodocytosine, hydroxyurea, iodouracil,
5-nitrocytosine, 5-bromouracil, 5-chlorouracil, 5-fluorouracil, and
5-iodouracil, 6-alkyl derivatives of adenine and guanine,
6-azapyrimidines, 6-azo-uracil, 6-azo cytosine, azacytosine,
6-azo-thymine, 6-thio-guanine, 7-methylguanine, 7-methyladenine,
7-deazaguanine, 7-deazaguanosine, 7-deaza-adenine,
7-deaza-8-azaguanine, 8-azaguanine, 8-azaadenine, 8-halo, 8-amino,
8-thiol, 8-thioalkyl, and 8-hydroxyl substituted adenines and
guanines; N4-ethylcytosine, N-2 substituted purines, N-6
substituted purines, O-6 substituted purines, those that increase
the stability of duplex formation, universal nucleic acids,
hydrophobic nucleic acids, promiscuous nucleic acids, size-expanded
nucleic acids, fluorinated nucleic acids, tricyclic pyrimidines,
phenoxazine cytidine([5,4-b][1,4]benzoxazin-2(3H)-one),
phenothiazine cytidine
(1H-pyrimido[5,4-b][1,4]benzothiazin-2(3H)-one), G-clamps,
phenoxazine cytidine
(9-(2-aminoethoxy)-H-pyrimido[5,4-b][1,4]benzoxazin-2(3H)-one),
carbazole cytidine (2H-pyrimido[4,5-b]indol-2-one), pyridoindole
cytidine (H-pyrido [3',2':4,5]pyrrolo [2,3-d]pyrimidin-2-one),
5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)
uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine,
5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil,
5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid, wybutoxosine, pseudouracil, queosine,
2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxacetic acid methylester,
uracil-5-oxacetic acid, 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine and those in which the purine or pyrimidine base
is replaced with a heterocycle.
61. The engineered cell of claim 60, wherein the unnatural base is
selected from the group consisting of ##STR00010##
62. The engineered cell of claim 22, wherein the unnatural
nucleotide further comprises an unnatural sugar moiety.
63. The engineered cell of claim 62, wherein the unnatural sugar
moiety is selected from the group consisting of a modification at
the 2' position: OH; substituted lower alkyl, alkaryl, aralkyl,
O-alkaryl or O-aralkyl, SH, SCH.sub.3, OCN, Cl, Br, CN, CF.sub.3,
OCF.sub.3, SOCH.sub.3, SO.sub.2 CH.sub.3, ONO.sub.2, NO.sub.2,
N.sub.3, NH.sub.2F; O-alkyl, S-alkyl, N-alkyl; O-alkenyl,
S-alkenyl, N-alkenyl; O-alkynyl, S-alkynyl, N-alkynyl;
O-alkyl-O-alkyl, 2'-F, 2'-OCH.sub.3, 2'-O(CH.sub.2).sub.2OCH.sub.3
wherein the alkyl, alkenyl and alkynyl may be substituted or
unsubstituted C.sub.1-C.sub.10, alkyl, C.sub.2-C.sub.10 alkenyl,
C.sub.2-C.sub.10 alkynyl, --O[(CH2)n O]mCH.sub.3,
--O(CH.sub.2)nOCH.sub.3, --O(CH.sub.2)n NH.sub.2, --O(CH.sub.2)n
CH.sub.3, --O(CH.sub.2)n-ONH.sub.2, and
--O(CH.sub.2)nON[(CH.sub.2)n CH.sub.3)].sub.2, where n and m are
from 1 to about 10; and/or a modification at the 5' position:
5'-vinyl, 5'-methyl (R or S), a modification at the 4' position,
4'-S, heterocycloalkyl, heterocycloalkaryl, aminoalkylamino,
polyalkylamino, substituted silyl, an RNA cleaving group, a
reporter group, an intercalator, a group for improving the
pharmacokinetic properties of an oligonucleotide, or a group for
improving the pharmacodynamic properties of an oligonucleotide, and
any combination thereof.
64. The engineered cell of claim 22, wherein the unnatural
nucleotide further comprises an unnatural backbone.
65. The engineered cell of claim 64, wherein the unnatural backbone
is selected from the group consisting of a phosphorothioate, chiral
phosphorothioate, phosphorodithioate, phosphotriester,
aminoalkylphosphotriester, C.sub.1-C.sub.10 phosphonates,
3'-alkylene phosphonate, chiral phosphonates, phosphinates,
phosphoramidates, 3'-amino phosphoramidate,
aminoalkylphosphoramidates, thionophosphoramidates,
thionoalkylphosphonates, thionoalkylphosphotriesters, and
boranophosphates.
66. The engineered cell of claim 45, wherein the sgRNA has less
than about 20%, 15%, 10%, 5%, 3%, 1%, or less off-target binding
rate.
67. The engineered cell of claim 45, further comprising an
additional nucleic acid molecule that encodes an additional single
guide RNA (sgRNA) comprising a crRNA-tracrRNA scaffold.
68. The engineered cell of claim 22, wherein the engineered cell is
a semi-synthetic organism.
69. An in vivo method of increasing the production of a nucleic
acid molecule containing an unnatural nucleotide comprising an
engineered cell of claims 22-68.
70. A nucleic acid molecule containing an unnatural nucleotide
produced by an engineered cell of claims 22-68.
71. An isolated and purified plasmid comprising: a nucleic acid
molecule encoding a modified nucleoside triphosphate transporter
from Phaeodactylum tricornutum (PtNTT2); and a promoter region
selected from a pSC plasmid or lacZYA locus.
72. The isolated and purified plasmid of claim 71, wherein the
modified nucleoside triphosphate transporter is a codon optimized
nucleoside triphosphate transporter from Phaeodactylum
tricornutum.
73. The isolated and purified plasmid of claim 71, wherein the
modified nucleoside triphosphate transporter comprises a
deletion.
74. The isolated and purified plasmid of claim 73, wherein the
deletion is a terminal deletion or an internal deletion.
75. The isolated and purified plasmid of claim 74, wherein the
deletion is an N-terminal truncation, a C-terminal truncation, or a
truncation of both termini.
76. The isolated and purified plasmid of claim 73, wherein the
modified nucleoside triphosphate transporter comprises a deletion
of about 5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more
amino acid residues.
77. The isolated and purified plasmid of claim 73, wherein the
modified nucleoside triphosphate transporter comprises a deletion
of about 5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more
amino acid residues at the N-terminus.
78. The isolated and purified plasmid of claim 77, wherein the
modified nucleoside triphosphate transporter comprises a deletion
of about 66 amino acid residues at the N-terminus.
79. The isolated and purified plasmid of claim 71, wherein the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 4.
80. The isolated and purified plasmid of claim 71, wherein the
isolated and modified nucleoside triphosphate transporter comprises
100% sequence identity to SEQ ID NO: 4.
81. The isolated and purified plasmid of claim 71, wherein the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 6.
82. The isolated and purified plasmid of claim 71, wherein the
isolated and modified nucleoside triphosphate transporter comprises
100% sequence identity to SEQ ID NO: 6.
83. The isolated and purified plasmid of claim 71, wherein the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 8.
84. The isolated and purified plasmid of claim 71, wherein the
isolated and modified nucleoside triphosphate transporter comprises
100% sequence identity to SEQ ID NO: 8.
85. The isolated and purified plasmid of claim 71, wherein the
promoter region is selected from P.sub.bla, P.sub.lac,
P.sub.lacUV5, P.sub.H207, P.sub..lamda., P.sub.tac, or
P.sub.N25.
86. The isolated and purified plasmid of claim 71, wherein the
promoter region is selected from P.sub.lacI, P.sub.bla, or
P.sub.lac.
87. The isolated and purified plasmid of claim 71, wherein the
plasmid is a prokaryotic plasmid.
88. An in vivo method of increasing the production of a nucleic
acid molecule containing an unnatural nucleotide comprising
incubating a cell with an isolated and purified plasmid of claims
71-87.
89. A kit comprising an isolated and modified nucleoside
triphosphate transporter of claims 1-19.
90. A kit comprising an engineered cell of claims 22-68.
91. A kit comprising an isolated and purified plasmid of claims
71-87.
Description
CROSS-REFERENCE
[0001] This patent application claims benefit of U.S. Patent
Application Ser. No. 62/354,650, filed Jun. 24, 2016, which is
incorporated herein by reference in its entirety.
SEQUENCE LISTING
[0003] The instant application contains a Sequence Listing which
has been submitted electronically in ASCII format and is hereby
incorporated by reference in its entirety. Said ASCII copy, created
on Jun. 22, 2017, is named 46085-707_601_SL.txt and is 101,123
bytes in size.
BACKGROUND OF THE DISCLOSURE
[0004] Oligonucleotides and their applications have revolutionized
biotechnology. However, natural oligonucleotides comprise only four
nucleotides: adenosine (A), guanosine (G), cytosine (C), and thymine
(T) in DNA, and adenosine (A), guanosine (G), cytosine (C), and
uridine (U) in RNA. This significantly restricts the potential
functions and applications of the oligonucleotides.
[0005] The ability to sequence-specifically synthesize/amplify
oligonucleotides (DNA or RNA) with polymerases, for example by PCR
or isothermal amplification systems (e.g., transcription with T7
RNA polymerase), has revolutionized biotechnology. In addition to
all of the potential applications in nanotechnology, this has
enabled a diverse range of new technologies such as the in vitro
evolution via SELEX (Systematic Evolution of Ligands by Exponential
Enrichment) of RNA and DNA aptamers and enzymes. See, for example,
Oliphant A R, Brandl C J & Struhl K (1989), Defining the
sequence specificity of DNA-binding proteins by selecting binding
sites from random-sequence oligonucleotides: analysis of yeast GCN4
proteins, Mol. Cell Biol., 9:2944-2949; Tuerk C & Gold L
(1990), Systematic evolution of ligands by exponential enrichment:
RNA ligands to bacteriophage T4 DNA polymerase, Science,
249:505-510; Ellington A D & Szostak J W (1990), In vitro
selection of RNA molecules that bind specific ligands, Nature,
346:818-822.
[0006] In some aspects, these applications are restricted by the
limited chemical/physical diversity present in the natural genetic
alphabet (the four natural nucleotides A, C, G, and T in DNA, and
the four natural nucleotides A, C, G, and U in RNA).
SUMMARY OF THE DISCLOSURE
[0007] Disclosed herein, in certain embodiments, is an isolated and
modified nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2) comprising a deletion, wherein the isolated
and modified nucleoside triphosphate transporter is obtained from
an engineered cell. In some embodiments, the deletion is a terminal
deletion or an internal deletion. In some embodiments, the deletion
is a terminal deletion. In some embodiments, the deletion is an
internal deletion. In some embodiments, the terminal deletion is an
N-terminal deletion, a C-terminal deletion, or a deletion of both
termini. In some embodiments, the terminal deletion is an N-terminal
deletion. In some embodiments, the deletion comprises about 5, 10,
15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more amino acid
residues. In some embodiments, the isolated and modified nucleoside
triphosphate transporter comprises a deletion of about 5, 10, 15,
20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more amino acid residues
at the N-terminus. In some embodiments, the isolated and modified
nucleoside triphosphate transporter comprises a deletion of about
66 amino acid residues at the N-terminus. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 4. In some embodiments,
the isolated and modified nucleoside triphosphate transporter
comprises 100% sequence identity to SEQ ID NO: 4. In some
embodiments, the isolated and modified nucleoside triphosphate
transporter comprises at least 80%, 85%, 90%, 95%, 96%, 97%, 98%,
or 99% sequence identity to SEQ ID NO: 6. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
100% sequence identity to SEQ ID NO: 6. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 8. In some embodiments, the isolated and
modified nucleoside triphosphate transporter comprises 100%
sequence identity to SEQ ID NO: 8. In some embodiments, the
engineered cell comprises a prokaryotic cell. In some embodiments,
the engineered cell is E. coli.
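As an illustration only, the two quantities recited in paragraph [0007], an N-terminal deletion of a given number of residues and a percent sequence identity to a reference sequence, can be sketched in Python. This is a minimal sketch under stated assumptions and is not the method used in the disclosure: the function names `n_terminal_deletion` and `percent_identity` are hypothetical, the sequences are toy placeholders rather than SEQ ID NO: 4, 6, or 8, and identity is computed over two sequences that are assumed to be already aligned to equal length (in practice an alignment tool would produce that alignment first).

```python
def n_terminal_deletion(protein: str, n_residues: int) -> str:
    """Return the protein with its first n_residues amino acids removed."""
    if not 0 <= n_residues <= len(protein):
        raise ValueError("deletion length must be between 0 and the protein length")
    return protein[n_residues:]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: '-' marks an alignment gap. Columns where both sequences
    have a gap are ignored, and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable alignment columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy placeholder sequence (hypothetical, 90 residues), not a sequence
# from the Sequence Listing.
toy_protein = "MKT" * 30
truncated = n_terminal_deletion(toy_protein, 66)  # remove 66 N-terminal residues
print(len(truncated))

# Two hypothetical 10-residue aligned sequences differing at one position.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

A threshold such as "at least 95% sequence identity" would then correspond to checking `percent_identity(candidate, reference) >= 95.0` for an aligned candidate and reference pair.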
[0008] Disclosed herein, in certain embodiments, is a nucleic acid
molecule encoding an isolated and modified nucleoside triphosphate
transporter described above.
[0009] Disclosed herein, in certain embodiments, is use of a
modified nucleoside triphosphate transporter described above for
the incorporation of an unnatural triphosphate during the synthesis
of a nucleic acid molecule.
[0010] Disclosed herein, in certain embodiments, is an engineered
cell comprising a first nucleic acid molecule encoding a modified
nucleoside triphosphate transporter from Phaeodactylum tricornutum
(PtNTT2). In some embodiments, the nucleic acid of the modified
nucleoside triphosphate transporter is incorporated in the genomic
sequence of the engineered cell. In some embodiments, the
engineered cell comprises a plasmid comprising the modified
nucleoside triphosphate transporter. In some embodiments, the
modified nucleoside triphosphate transporter is a codon optimized
nucleoside triphosphate transporter from Phaeodactylum tricornutum.
In some embodiments, the modified nucleoside triphosphate
transporter comprises a deletion. In some embodiments, the deletion
is a terminal deletion or an internal deletion. In some
embodiments, the deletion is an N-terminal truncation, a C-terminal
truncation, or a truncation of both termini. In some embodiments,
the modified nucleoside triphosphate transporter comprises a
deletion of about 5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66,
70, or more amino acid residues. In some embodiments, the modified
nucleoside triphosphate transporter comprises a deletion of about
5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more amino
acid residues at the N-terminus. In some embodiments, the modified
nucleoside triphosphate transporter comprises a deletion of about
66 amino acid residues at the N-terminus. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 4. In some embodiments,
the isolated and modified nucleoside triphosphate transporter
comprises 100% sequence identity to SEQ ID NO: 4. In some
embodiments, the isolated and modified nucleoside triphosphate
transporter comprises at least 80%, 85%, 90%, 95%, 96%, 97%, 98%,
or 99% sequence identity to SEQ ID NO: 6. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
100% sequence identity to SEQ ID NO: 6. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 8. In some embodiments, the isolated and
modified nucleoside triphosphate transporter comprises 100%
sequence identity to SEQ ID NO: 8. In some embodiments, the
modified nucleoside triphosphate transporter is under the control
of a promoter selected from an E. coli promoter or a phage
promoter. In some embodiments, the promoter is selected from
P.sub.bla, P.sub.lac, P.sub.lacUV5, P.sub.H207, P.sub..lamda.,
P.sub.tac, or P.sub.N25. In some embodiments, the modified
nucleoside triphosphate transporter is under the control of
promoter P.sub.lacUV5. In some embodiments, the modified nucleoside
triphosphate transporter is under the control of a promoter from a
lac operon. In some embodiments, the modified nucleoside
triphosphate transporter is encoded within a pSC plasmid. In some
embodiments, the modified nucleoside triphosphate transporter
decreases doubling time of the engineered cell. In some
embodiments, the modified nucleoside triphosphate transporter
enables unnatural base pair retention of about 50%, 60%, 70%, 80%,
90%, 95%, 99% or more. In some embodiments, the engineered cell
further comprises a second nucleic acid molecule encoding a Cas9
polypeptide or variants thereof, a third nucleic acid molecule
encoding a single guide RNA (sgRNA) comprising a crRNA-tracrRNA
scaffold; and a fourth nucleic acid molecule comprising an
unnatural nucleotide. In some embodiments, the second nucleic acid
molecule, the third nucleic acid molecule, and the fourth nucleic
acid molecule are encoded in one or more plasmids. In some
embodiments, the sgRNA encoded by the third nucleic acid molecule
comprises a target motif that recognizes a modification at the
unnatural nucleotide position within the fourth nucleic acid
molecule. In some embodiments, the modification at the unnatural
nucleotide position within the third nucleic acid molecule
generates a modified third nucleic acid molecule. In some
embodiments, the modification is a substitution. In some
embodiments, the modification is a deletion. In some embodiments,
the modification is an insertion. In some embodiments, the sgRNA
encoded by the third nucleic acid molecule further comprises a
protospacer adjacent motif (PAM) recognition element. In some
embodiments, the PAM element is adjacent to the 3' terminus of the
target motif. In some embodiments, the combination of Cas9
polypeptide or variants thereof and sgRNA modulates replication of
the modified fourth nucleic acid molecule. In some embodiments, the
combination of Cas9 polypeptide or variants thereof, sgRNA and the
modified nucleoside triphosphate transporter modulates replication
of the modified fourth nucleic acid molecule. In some embodiments,
the combination of Cas9 polypeptide or variants thereof, sgRNA and
the modified nucleoside triphosphate transporter decreases the
replication rate of the modified fourth nucleic acid molecule by
about 80%, 85%, 95%, 99%, or higher. In some embodiments, the
production of the fourth nucleic acid molecule in the engineered
cell increases by about 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%,
98%, 99%, or higher. In some embodiments, the Cas9 polypeptide or
variants thereof generate a double-stranded break. In some
embodiments, the Cas9 polypeptide is a wild-type Cas9. In some
embodiments, the unnatural nucleotide comprises an unnatural base
selected from the group consisting of 2-aminoadenin-9-yl,
2-aminoadenine, 2-F-adenine, 2-thiouracil, 2-thio-thymine,
2-thiocytosine, 2-propyl and alkyl derivatives of adenine and
guanine, 2-amino-adenine, 2-amino-propyl-adenine, 2-aminopyridine,
2-pyridone, 2'-deoxyuridine, 2-amino-2'-deoxyadenosine,
3-deazaguanine, 3-deazaadenine, 4-thio-uracil, 4-thio-thymine,
uracil-5-yl, hypoxanthin-9-yl (I), 5-methyl-cytosine,
5-hydroxymethyl cytosine, xanthine, hypoxanthine, 5-bromo, and
5-trifluoromethyl uracils and cytosines; 5-halouracil,
5-halocytosine, 5-propynyl-uracil, 5-propynyl cytosine, 5-uracil,
5-substituted, 5-halo, 5-substituted pyrimidines,
5-hydroxycytosine, 5-bromocytosine, 5-bromouracil,
5-chlorocytosine, chlorinated cytosine, cyclocytosine, cytosine
arabinoside, 5-fluorocytosine, fluoropyrimidine, fluorouracil,
5,6-dihydrocytosine, 5-iodocytosine, hydroxyurea, iodouracil,
5-nitrocytosine, 5-bromouracil, 5-chlorouracil, 5-fluorouracil, and
5-iodouracil, 6-alkyl derivatives of adenine and guanine,
6-azapyrimidines, 6-azo-uracil, 6-azo cytosine, azacytosine,
6-azo-thymine, 6-thio-guanine, 7-methylguanine, 7-methyl adenine,
7-deazaguanine, 7-deazaguanosine, 7-deaza-adenine,
7-deaza-8-azaguanine, 8-azaguanine, 8-azaadenine, 8-halo, 8-amino,
8-thiol, 8-thioalkyl, and 8-hydroxyl substituted adenines and
guanines; N4-ethylcytosine, N-2 substituted purines, N-6
substituted purines, O-6 substituted purines, those that increase
the stability of duplex formation, universal nucleic acids,
hydrophobic nucleic acids, promiscuous nucleic acids, size-expanded
nucleic acids, fluorinated nucleic acids, tricyclic pyrimidines,
phenoxazine cytidine (1H-pyrimido[5,4-b][1,4]benzoxazin-2(3H)-one),
phenothiazine cytidine
(1H-pyrimido[5,4-b][1,4]benzothiazin-2(3H)-one), G-clamps,
phenoxazine cytidine
(9-(2-aminoethoxy)-H-pyrimido[5,4-b][1,4]benzoxazin-2(3H)-one),
carbazole cytidine (2H-pyrimido[4,5-b]indol-2-one), pyridoindole
cytidine (H-pyrido [3',2':4,5]pyrrolo [2,3-d]pyrimidin-2-one),
5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetyl cytosine,
5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methyl
cytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil,
5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid,
wybutoxosine, pseudouracil, queosine, 2-thiocytosine,
5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxacetic acid methylester, uracil-5-oxacetic acid,
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil,
(acp3)w, and 2,6-diaminopurine and those in which the purine or
pyrimidine base is replaced with a heterocycle. In some
embodiments, the unnatural base is selected from the group
consisting of
##STR00001##
[0011] In some embodiments, the unnatural nucleotide further
comprises an unnatural sugar moiety. In some embodiments, the
unnatural sugar moiety is selected from the group consisting of a
modification at the 2' position: OH; substituted lower alkyl,
alkaryl, aralkyl, O-alkaryl or O-aralkyl, SH, SCH.sub.3, OCN, Cl,
Br, CN, CF.sub.3, OCF.sub.3, SOCH.sub.3, SO.sub.2 CH.sub.3,
ONO.sub.2, NO.sub.2, N.sub.3, NH.sub.2F; O-alkyl, S-alkyl, N-alkyl;
O-alkenyl, S-alkenyl, N-alkenyl; O-alkynyl, S-alkynyl, N-alkynyl;
2'-F, 2'-OCH.sub.3, 2'-O(CH.sub.2).sub.2OCH.sub.3 wherein the
alkyl, alkenyl and alkynyl may be substituted or unsubstituted
C.sub.1-C.sub.10 alkyl, C.sub.2-C.sub.10 alkenyl, C.sub.2-C.sub.10
alkynyl, --O[(CH.sub.2)n O]mCH.sub.3, --O(CH.sub.2)nOCH.sub.3,
--O(CH.sub.2)n NH.sub.2, --O(CH.sub.2)n CH.sub.3,
--O(CH.sub.2)n-ONH.sub.2, and --O(CH.sub.2)nON[(CH.sub.2)n
CH.sub.3)].sub.2, where n and m are from 1 to about 10; and/or a
modification at the 5' position: 5'-vinyl, 5'-methyl (R or S), a
modification at the 4' position, 4'-S, heterocycloalkyl,
heterocycloalkaryl, aminoalkylamino, polyalkylamino, substituted
silyl, an RNA cleaving group, a reporter group, an intercalator, a
group for improving the pharmacokinetic properties of an
oligonucleotide, or a group for improving the pharmacodynamic
properties of an oligonucleotide, and any combination thereof. In
some embodiments, the unnatural nucleotide further comprises an
unnatural backbone. In some embodiments, the unnatural backbone is
selected from the group consisting of a phosphorothioate, chiral
phosphorothioate, phosphorodithioate, phosphotriester,
aminoalkylphosphotriester, C.sub.1-C.sub.10 phosphonates,
3'-alkylene phosphonate, chiral phosphonates, phosphinates,
phosphoramidates, 3'-amino phosphoramidate,
aminoalkylphosphoramidates, thionophosphoramidates,
thionoalkylphosphonates, thionoalkylphosphotriesters, and
boranophosphates. In some embodiments, the sgRNA has less than
about 20%, 15%, 10%, 5%, 3%, 1%, or less off-target binding rate.
In some embodiments, the engineered cell further comprises an
additional nucleic acid molecule that encodes an additional single
guide RNA (sgRNA) comprising a crRNA-tracrRNA scaffold. In some
embodiments, the engineered cell is a semi-synthetic organism.
[0012] Disclosed herein, in certain embodiments, is an in vivo
method of increasing the production of a nucleic acid molecule
containing an unnatural nucleotide comprising an engineered cell
described above.
[0013] Disclosed herein, in certain embodiments, is a nucleic acid
molecule containing an unnatural nucleotide produced by an
engineered cell described above.
[0014] Disclosed herein, in certain embodiments, is an isolated and
purified plasmid comprising a nucleic acid molecule encoding a
modified nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2); and a promoter region selected from a pSC
plasmid or lacZYA locus. In some embodiments, the modified
nucleoside triphosphate transporter is a codon optimized nucleoside
triphosphate transporter from Phaeodactylum tricornutum. In some
embodiments, the modified nucleoside triphosphate transporter
comprises a deletion. In some embodiments, the deletion is a
terminal deletion or an internal deletion. In some embodiments, the
deletion is an N-terminal truncation, a C-terminal truncation, or a
truncation of both termini. In some embodiments, the modified
nucleoside triphosphate transporter comprises a deletion of about
5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more amino
acid residues. In some embodiments, the modified nucleoside
triphosphate transporter comprises a deletion of about 5, 10, 15,
20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more amino acid residues
at the N-terminus. In some embodiments, the modified nucleoside
triphosphate transporter comprises a deletion of about 66 amino
acid residues at the N-terminus. In some embodiments, the isolated
and modified nucleoside triphosphate transporter comprises at least
80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity
to SEQ ID NO: 4. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
100% sequence identity to SEQ ID NO: 4. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 6. In some embodiments, the isolated and
modified nucleoside triphosphate transporter comprises 100%
sequence identity to SEQ ID NO: 6. In some embodiments, the
isolated and modified nucleoside triphosphate transporter comprises
at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence
identity to SEQ ID NO: 8. In some embodiments, the isolated and
modified nucleoside triphosphate transporter comprises 100%
sequence identity to SEQ ID NO: 8. In some embodiments, the
promoter region is selected from P.sub.bla, P.sub.las,
P.sub.lacUV5, P.sub.H207, P.sub..lamda., P.sub.tac, or P.sub.N25.
In some embodiments, the promoter region is selected from
P.sub.lacI, P.sub.bla, or P.sub.lac. In some embodiments, the
plasmid is a prokaryotic plasmid.
[0015] Disclosed herein, in certain embodiments, is an in vivo
method of increasing the production of a nucleic acid molecule
containing an unnatural nucleotide comprising incubating a cell
with an isolated and purified plasmid described above.
[0016] Disclosed herein, in certain embodiments, is a kit
comprising an isolated and modified nucleoside triphosphate
transporter described above.
[0017] Disclosed herein, in certain embodiments, is a kit
comprising an engineered cell described above.
[0018] Disclosed herein, in certain embodiments, is a kit
comprising an isolated and purified plasmid described above.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] Various aspects of the disclosure are set forth with
particularity in the appended claims. A better understanding of the
features and advantages of the present disclosure will be obtained
by reference to the following detailed description that sets forth
illustrative embodiments, in which the principles of the disclosure
are utilized, and the accompanying drawings of which:
[0020] FIG. 1A-FIG. 1B illustrate UBPs and transporter
optimization. FIG. 1A shows the chemical structure of the
dNaM-d5SICS and dNaM-dTPT3 UBPs compared to the natural dC-dG base
pair. FIG. 1B shows comparison of fitness and
[.alpha.-.sup.32P]-dATP uptake in DM1 and the various constructed
strains: pCDF and inducible PtNTT2(1-575) (gray); pSC and
constitutive PtNTT2(66-575) (blue); integrated and constitutive
PtNTT2(66-575) (green). Open triangles denote corresponding control
strains without PtNTT2. pCDF plasmids are in E. coli C41(DE3); pSC
plasmids and integrants are in E. coli BL21(DE3). All PtNTT2
strains are non-codon optimized for plasmid-based expression and
codon-optimized for chromosomal expression unless otherwise
indicated. r.d.u.=relative decay units. Error bars represent s.d. of
the mean, n=3 cultures grown and assayed in parallel; the error
bars on some data points are smaller than their marker.
[0021] FIG. 2A-FIG. 2B illustrate increased UBP retention resulting
from transporter and UBP optimization. FIG. 2A shows UBP retentions
of plasmids pUCX1, pUCX2, and pBRX2 in strains DM1 and YZ3. Error
bars represent s.d. of the mean, n=4 transformations for pUCX1 and
pUCX2, n=3 for DM1 pBRX2 and n=5 for YZ3 pBRX2. FIG. 2B shows UBP
retentions of pUCX2 variants, wherein the UBP is flanked by all
possible combinations of natural nucleotides (NXN, where N=G, C, A,
or T and X=NaM), in strain YZ3 grown in media supplemented with
either dNaMTP and d5SICSTP (grey bars) or dNaMTP and dTPT3TP (black
bars).
[0022] FIG. 3A-FIG. 3C illustrate the Cas9-based editing system.
FIG. 3A illustrates the model for Cas9-mediated immunity to UBP
loss. FIG. 3B shows UBP retentions of pUCX2 variants in strain YZ2
with a pCas9 plasmid that expresses a non-target sgRNA (gray) or an
on-target sgRNA (black). Error bars represent s.d. of the mean, n=3
transformations for all sequences except on-target CXA and CXG,
where n=5. FIG. 3C shows UBP retentions of pAIO plasmids in strain
YZ3 (gray), which does not express Cas9, or in strain YZ4 (black)
with expression of Cas9. In FIG. 3B and FIG. 3C, the nucleotides
immediately flanking X=NaM are indicated, as is distance to the
PAM. (N) denotes the nucleotide N in the sgRNA that targets a
substitution mutation of the UBP; all pCas9 and pAIO plasmids also
express an sgRNA targeting the deletion mutation. Error bars
represent s.d. of the mean, n>3 colonies; see FIG. 11 for exact
values of n, sequences, and IPTG concentrations used to induce
Cas9 in YZ4.
[0023] FIG. 4 shows simultaneous retention of two UBPs during
extended growth. Strains YZ3 and YZ4 were transformed with pAIO2X
and plated on solid media containing dNaMTP and dTPT3TP, with or
without IPTG to induce Cas9. Single colonies were inoculated into
liquid media of the same composition and cultures were grown to an
OD.sub.600 of .about.2 (point 1). Cultures were subsequently
diluted 30,000-fold and regrown to an OD.sub.600 of .about.2 (point
2), and this dilution-regrowth process was then repeated two more
times (points 3 and 4). As a no immunity control, strain YZ3 was
grown in the absence of IPTG and two representative cultures are
indicated in gray. Strain YZ4 was grown in the presence of varying
amounts of IPTG and averages of cultures are indicated in green (0
.mu.M, n=5), blue (20 .mu.M, n=5), and red (40 .mu.M, n=4).
Retentions of the UBP in gfp and serT are indicated with solid or
dotted lines, respectively. After the fourth outgrowth, two of the
YZ4 cultures grown with 20 .mu.M IPTG were subcultured on solid
media of the same composition. Three randomly selected colonies
from each plate (n=6 total) were inoculated into liquid media of
the same composition, and each of the six cultures was grown to an
OD.sub.600 of .about.1 (point 5), diluted 300,000-fold into media
containing 0, 20, and 40 .mu.M IPTG, and regrown to an OD.sub.600
of .about.1 (point 6). This dilution-regrowth process was
subsequently repeated (point 7). pAIO2X plasmids were isolated at
each of the numbered points and analyzed for UBP retention. Cell
doublings are estimated from OD.sub.600 (see Methods) and did not
account for growth on solid media (thus making them an
underestimate of actual growth). Error bars represent s.d. of the
mean.
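As a rough illustration of how the doublings in one dilution-regrowth cycle can be estimated, the following sketch assumes that a culture diluted by a given factor and regrown to the same OD.sub.600 has undergone log2(dilution factor) doublings. The function name is hypothetical, and this is not necessarily the exact calculation described in the Methods.

```python
import math


def doublings_per_regrowth(dilution_factor: float) -> float:
    """Estimate doublings needed to regrow to the pre-dilution density.

    Assumption (not from the source): the culture is harvested at the same
    OD600 before dilution and after regrowth, so doublings = log2(dilution).
    """
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    return math.log2(dilution_factor)


if __name__ == "__main__":
    # Dilution factors taken from the FIG. 4 description.
    for fold in (30_000, 300_000):
        print(f"{fold}-fold dilution: ~{doublings_per_regrowth(fold):.1f} doublings")
```

Summing such per-cycle estimates over successive regrowths gives a lower bound on total doublings, since growth on solid media is not counted.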
[0024] FIG. 5A-FIG. 5F illustrate dATP uptake and growth of cells
expressing PtNTT2 as a function of inducer (IPTG) concentration or
promoter strength, strain background and presence of N-terminal
signal sequences. FIG. 5A and FIG. 5D show uptake of
[.alpha.-.sup.32P]-dATP. Error bars represent s.d. of the mean, n=3
cultures. r.d.u.=relative decay units, which corresponds to the
total number of radioactive counts per minute normalized to the
average OD.sub.600 across the 1 h window of uptake, with the uptake
of C41(DE3) pCDF-1b PtNTT2(1-575) (i.e. DM1) induced with 1000
.mu.M IPTG set to 1. Deletion of the N-terminal signal sequences
drastically reduces uptake activity in C41(DE3), but activity can
be restored with higher levels of expression in BL21(DE3). FIG. 5B
shows growth curves of C41(DE3) strains. Induction of PtNTT2(1-575)
is toxic. FIG. 5C shows growth curves of BL21(DE3) strains.
Induction of T7 RNAP in BL21(DE3) is toxic (see empty vector
traces), which masks the effect of deleting the N-terminal signal
sequences of PtNTT2 on cell growth. FIG. 5E shows growth curves of
plasmid-based transporter strains. Strains are expressing non
codon-optimized (co) PtNTT2(66-575) unless otherwise indicated.
FIG. 5F shows growth curves of chromosomally-integrated transporter
strains. Strains are expressing codon-optimized PtNTT2(66-575)
unless otherwise indicated. Strain YZ4 also contains a
chromosomally integrated Cas9 gene.
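The r.d.u. normalization described for FIG. 5A and FIG. 5D can be sketched as follows. This is a minimal illustration, assuming that each sample is summarized by a total counts-per-minute value and an average OD.sub.600 over the uptake window; the function and parameter names are hypothetical.

```python
def relative_decay_units(cpm: float, mean_od600: float,
                         ref_cpm: float, ref_mean_od600: float) -> float:
    """Normalize radioactive counts to cell density and to a reference strain.

    cpm / mean_od600 gives density-normalized uptake; dividing by the same
    quantity for the reference (e.g. induced DM1) sets the reference to 1.
    """
    if mean_od600 <= 0 or ref_mean_od600 <= 0 or ref_cpm <= 0:
        raise ValueError("OD600 values and reference counts must be positive")
    sample = cpm / mean_od600
    reference = ref_cpm / ref_mean_od600
    return sample / reference


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(relative_decay_units(cpm=500.0, mean_od600=0.5,
                               ref_cpm=1000.0, ref_mean_od600=0.5))
```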
[0025] FIG. 6 shows plasmid maps. Promoters and terminators are
denoted by white and gray features, respectively. * denotes the
derivative of the pMB1 origin from pUC19, which contains a mutation
that increases its copy number. Plasmids that contain a UBP are
generally indicated with the TK1 sequence (orange), but as
described in the text and indicated above, pUCX2 and pAIO variants
with other UBP-containing sequences also position the UBP in the
approximate locus shown with TK1 above. sgRNA (N) denotes the guide
RNA that recognizes a natural substitution mutation of the UBP,
with N being the nucleotide present in the guide RNA. sgRNA (.DELTA.)
denotes the guide RNA that recognizes a single base pair deletion
of the UBP; this and its associated promoter and terminator
(indicated by .dagger-dbl.) are only present in certain
experiments. The serT and gfp genes do not have promoters.
[0026] FIG. 7A-FIG. 7B illustrate biotin shift assay gels. FIG. 7A
shows biotin shift assay scheme and representative gels for FIG.
2B. Input plasmid refers to the ligation product used to transform
the SSO. * denotes a band whose mobility does not change in the
absence of streptavidin (data not shown) and does not appear in any
samples from clonally-derived cultures (data not shown). The band
likely corresponds to a fully natural plasmid derived from
non-specific priming during the PCR used to generate the insert for
ligation, and is present in very small quantities in the input
plasmid, but is enriched for during replication in vivo by
competition against challenging UBP sequences. Such bands are not
included in the calculation of retention. FIG. 7B illustrates
representative gels for FIG. 4. Each lane (excluding the
oligonucleotide controls) corresponds to a pAIO2X plasmid sample
isolated from a clonally-derived YZ4 culture, grown with the IPTG
concentration indicated, after an estimated 108 cell doublings in
liquid culture (point 7 in FIG. 4). Each plasmid sample is split
and analyzed in parallel biotin shift reactions that assay the UBP
content at the gfp and serT loci (red and blue primers,
respectively). The 80 .mu.M samples are not shown in the plot for
FIG. 4.
[0027] FIG. 8A-FIG. 8B show additional characterization of UBP
propagation. FIG. 8A shows growth curves for the experiments shown
in FIG. 2A. YZ3 and DM1 (induced with 1 mM IPTG) were transformed
with the indicated UBP-containing plasmids, or their corresponding
fully natural controls, and grown in media containing dNaMTP and
d5SICSTP. Each line represents one transformation and subsequent
growth in liquid culture. The x-axis represents time spent in
liquid culture, excluding the 1 h of recovery following
electroporation (see Methods). Growth curves terminate at the
OD.sub.600 at which cells were collected for plasmid isolation and
analysis of UBP retention. Staggering of the curves along the
x-axis for replicates within a given strain and plasmid combination
is likely due to minor variability in transformation frequencies
between transformations (and thus differences in the number of
cells inoculated into each culture), whereas differences in slope
between curves indicate differences in fitness. Growth of YZ3 is
comparable between all three UBP-containing plasmids (and between
each UBP-containing plasmid with its respective natural control),
whereas growth of DM1 is impaired by the UBP-containing plasmids,
especially for pUCX1 and pUCX2. FIG. 8B shows retentions of gfp
pUCX2 variants propagated in YZ3 by transformation, plating on
solid media, isolation of single colonies, and subsequent
inoculation and growth in liquid media, in comparison to retentions
from plasmids propagated by transformation and growth of YZ3 in
liquid media only. Cells were plated from the same transformations
described in FIG. 2B. Solid and liquid media both contained dNaMTP
and dTPT3TP. Cells were harvested at OD.sub.600.about.1. Five
colonies were inoculated for each of the pUCX2 variants indicated,
but some colonies failed to grow (indicated by a blank space in the
table). Retentions for samples isolated from transformants grown
solely in liquid media were assayed from the same samples shown in
FIG. 2B, but were assayed and normalized to an oligonucleotide
control in parallel with the plated transformant samples to
facilitate comparisons in retention. See Methods for additional
details regarding UBP retention normalization. For samples with
near zero shift, we cannot determine whether the UBP was completely
lost in vivo or if the sample came from a colony that was
transformed with a fully natural plasmid (some of which arises
during plasmid construction, specifically during the PCR used to
generate the UBP-containing insert).
[0028] FIG. 9A-FIG. 9B illustrate effect of dNaM-dTPT3 on
Cas9-mediated cleavage of DNA in vitro. Cas9-mediated in vitro
cleavage was assessed for six DNA substrates, wherein the third
base pair upstream of the PAM was either one of the four natural
base pairs or the UBP (in both strand contexts). The four sgRNAs
that are complementary to each natural template were prepared by in
vitro transcription with T7 RNAP. To account for differences in
sgRNA activity and/or minor variations in preparation, a relative
percent maximal cleavage for each sgRNA vs all six DNA substrates
is shown in parentheses. Values represent means.+-.1 s.d. (n=3
technical replicates). In several cases, the presence of an
unnatural nucleotide significantly reduced cleavage compared to DNA
complementary to the sgRNA. These data suggest that Cas9 programmed
with sgRNA(s) complementary to one or more of the natural sequences
would preferentially degrade DNA that had lost the UBP. FIG. 9B
discloses SEQ ID NOS 201 and 202, respectively, in order of
appearance.
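The relative percent maximal cleavage mentioned for FIG. 9 can be illustrated with a short sketch. It assumes that, for a single sgRNA, the cleavage measured on each DNA substrate is divided by the highest cleavage observed for that sgRNA and expressed as a percentage; the substrate labels and values below are hypothetical.

```python
from typing import Dict


def relative_percent_max(cleavage_by_substrate: Dict[str, float]) -> Dict[str, float]:
    """Scale one sgRNA's cleavage values so its best substrate equals 100%."""
    if not cleavage_by_substrate:
        raise ValueError("no cleavage values supplied")
    top = max(cleavage_by_substrate.values())
    if top <= 0:
        raise ValueError("maximal cleavage must be positive")
    return {name: 100.0 * value / top
            for name, value in cleavage_by_substrate.items()}


if __name__ == "__main__":
    # Hypothetical percent-cleavage values for one sgRNA.
    example = {"matched": 80.0, "mismatched": 40.0, "UBP": 20.0}
    print(relative_percent_max(example))
```

Normalizing per sgRNA in this way allows guides with different intrinsic activities to be compared on a common scale.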
[0029] FIG. 10A-FIG. 10D show Cas9-mediated immunity to UBP loss in
TK1. FIG. 10A shows sgRNA sequences used to enhance retention of
the UBP (SEQ ID NOS 203, 204, 204, 205, and 206, respectively, in
order of appearance). FIG. 10B shows UBP retention for pUCX2 TK1 is
enhanced by targeting Cas9 to the major mutation (dTPT3.fwdarw.dA).
As cells continue to grow in the absence of correct sgRNAs
targeting mutations, UBP retention declines. Error bars represent
s.d. of the mean, n=3 transformations. In FIG. 10A and FIG. 10B,
hEGFP is a non-target sgRNA. FIG. 10C shows Sanger sequencing
chromatogram illustrating mutation of dNaM to dT in the absence of
an sgRNA to target Cas9 nuclease activity (SEQ ID NOS 207 and 208,
respectively, in order of appearance). FIG. 10D shows Sanger
sequencing chromatogram illustrating that loss of retention in the
presence of Cas9 and a targeting sgRNA (TK1-A) is due to growth of
cells with plasmids possessing a single base pair mutation.
UBP-containing species were depleted before sequencing. FIG. 10D
discloses SEQ ID NOS 207 and 208, respectively, in order of
appearance.
[0030] FIG. 11 shows Cas9 NXN sequences. 22-nt of each
UBP-containing sequence examined in FIG. 3 is shown above. X=dNaM,
Y=dTPT3 (SEQ ID NOS 209-224, respectively, in order of appearance).
The sequence of the sgRNA targeting the substitution mutation of
the UBP (N) is the 18-nt sequence 5' to the NGG PAM with X or Y
replaced by the natural nucleotide indicated. The sequence of the
sgRNA targeting the deletion mutation of the UBP (.DELTA.) is the 19-nt
sequence 5' to the NGG PAM without X or Y. YZ3 experiments were
performed without IPTG. Retentions shown in FIG. 3C are averaged
from the values and number of colonies indicated here.
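The rule given above for deriving the two sgRNA targeting sequences can be sketched as follows. This is a minimal illustration, assuming that the sequence immediately 5' to the NGG PAM is supplied as a string with the unnatural nucleotide written as X or Y; the example sequence is invented and is not one of the sequences of FIG. 11.

```python
UNNATURAL = ("X", "Y")  # X=dNaM, Y=dTPT3, as in the FIG. 11 description


def substitution_target(upstream_of_pam: str, natural: str, length: int = 18) -> str:
    """Replace the unnatural nucleotide with a natural one; keep the PAM-proximal 18 nt."""
    if natural not in "ACGT" or len(natural) != 1:
        raise ValueError("natural must be one of A, C, G, T")
    seq = upstream_of_pam
    for u in UNNATURAL:
        seq = seq.replace(u, natural)
    if len(seq) < length:
        raise ValueError("sequence shorter than requested target length")
    return seq[-length:]


def deletion_target(upstream_of_pam: str, length: int = 19) -> str:
    """Remove the unnatural nucleotide; keep the PAM-proximal 19 nt."""
    seq = upstream_of_pam
    for u in UNNATURAL:
        seq = seq.replace(u, "")
    if len(seq) < length:
        raise ValueError("sequence shorter than requested target length")
    return seq[-length:]


if __name__ == "__main__":
    upstream = "ACGTACGTACGTACXGTACG"  # invented 20-nt example, X = unnatural position
    print(substitution_target(upstream, "T"))
    print(deletion_target(upstream))
```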
[0031] FIG. 12 shows growth curves of YZ4 replicating pAIO2X.
Growth curves for the first dilution-regrowth (point 2) in FIG. 4.
Curves terminate at the OD.sub.600 at which cultures were collected
for both plasmid isolation and dilution for the next regrowth.
Doubling times are calculated from the timepoints collected between
OD.sub.600 0.1-1.0 for each curve and averaged for each strain
and/or IPTG condition.
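One way to compute a doubling time from such timepoints is sketched below. It assumes exponential growth within the OD.sub.600 0.1-1.0 window and fits a least-squares line to log2(OD.sub.600) versus time; the fitting approach and the example data are assumptions, not necessarily the calculation used for FIG. 12.

```python
import math
from typing import Sequence


def doubling_time(times: Sequence[float], od600: Sequence[float],
                  lo: float = 0.1, hi: float = 1.0) -> float:
    """Doubling time (same units as `times`) from points with lo <= OD600 <= hi."""
    pts = [(t, math.log2(od)) for t, od in zip(times, od600) if lo <= od <= hi]
    if len(pts) < 2:
        raise ValueError("need at least two timepoints inside the OD window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    if sxx == 0 or sxy <= 0:
        raise ValueError("timepoints do not describe a growing culture")
    slope = sxy / sxx  # doublings per unit time
    return 1.0 / slope


if __name__ == "__main__":
    # Hypothetical culture that doubles every 30 minutes.
    minutes = [0, 30, 60, 90, 120]
    ods = [0.05, 0.1, 0.2, 0.4, 0.8]
    print(f"doubling time: {doubling_time(minutes, ods):.1f} min")
```

Per-curve values obtained this way can then be averaged for each strain and IPTG condition.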
DETAILED DESCRIPTION OF THE DISCLOSURE
[0032] Nucleosides are hydrophilic molecules that require
transport proteins for permeation of cell membranes. Nucleoside
transporters (NTs) are a group of membrane transport proteins that
facilitate crossing of the nucleosides through cell membranes and
vesicles. In some cases, there are two types of nucleoside
transporters: concentrative nucleoside transporters, which drive a
concentrative process by electrochemical gradient, and
equilibrative nucleoside transporters, which drive an equilibrative
bidirectional process by chemical gradient. In some instances, a
nucleoside transporter further encompasses a nucleoside
triphosphate transporter.
[0033] Natural nucleosides comprise adenine, guanine, thymine,
uracil, and cytosine; and are recognized by nucleotide transporters
for permeation of cell membranes. Unnatural nucleosides, in some
cases, are either not recognized by endogenous nucleotide
transporters or are recognized but the efficiency of transport is
low.
[0034] In some embodiments, described herein are modified
nucleotide transporters that recognize and facilitate transport of
unnatural nucleic acids into a cell. In some instances, the
modified nucleotide transporter enhances import of unnatural
nucleic acids into a cell relative to an endogenous nucleotide
transporter. In some cases, the modified nucleotide transporter
increases unnatural nucleic acid retention within a cell. In
additional cases, the modified nucleotide transporter minimizes
toxicity due to its expression, and optionally improves cell
doubling time and fitness relative to a cell in the absence of the
transporter.
Nucleoside Triphosphate Transporters
[0035] In certain embodiments, described herein are modified
nucleoside triphosphate transporters for transporting unnatural
nucleic acids into a cell. In some instances, the modified
nucleoside triphosphate transporter is from Phaeodactylum
tricornutum (PtNTT2). In some instances, the modified nucleoside
triphosphate transporter further comprises a deletion. In some
cases, the deletion is a terminal deletion (e.g., an N-terminal
deletion or a C-terminal deletion) or is an internal deletion.
[0036] In some embodiments, described herein is an isolated and
modified nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2) comprising a deletion. In some instances, the
deletion comprises about 5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60,
66, 70, 80, 90, or more amino acid residues. In some instances, the
deletion comprises about 5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60,
66, 70, or more amino acid residues. In some cases, the modified
nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2) comprises a deletion of about 5 or more amino
acid residues. In some cases, the modified nucleoside
triphosphate transporter from Phaeodactylum tricornutum (PtNTT2)
comprises a deletion of about 10 or more amino acid residues. In
some cases, the modified nucleoside triphosphate
transporter from Phaeodactylum tricornutum (PtNTT2) comprises a
deletion of about 15 or more amino acid residues. In some cases,
the modified nucleoside triphosphate transporter from
Phaeodactylum tricornutum (PtNTT2) comprises a deletion of about 20
or more amino acid residues. In some cases, the modified
nucleoside triphosphate transporter from Phaeodactylum tricornutum
(PtNTT2) comprises a deletion of about 22 or more amino acid
residues. In some cases, the modified nucleoside
triphosphate transporter from Phaeodactylum tricornutum (PtNTT2)
comprises a deletion of about 25 or more amino acid residues. In
some cases, the modified nucleoside triphosphate
transporter from Phaeodactylum tricornutum (PtNTT2) comprises a
deletion of about 30 or more amino acid residues. In some cases,
the modified nucleoside triphosphate transporter from
Phaeodactylum tricornutum (PtNTT2) comprises a deletion of about 40
or more amino acid residues. In some cases, the modified
nucleoside triphosphate transporter from Phaeodactylum tricornutum
(PtNTT2) comprises a deletion of about 44 or more amino acid
residues. In some cases, the modified nucleoside
triphosphate transporter from Phaeodactylum tricornutum (PtNTT2)
comprises a deletion of about 50 or more amino acid residues. In
some cases, the modified nucleoside triphosphate
transporter from Phaeodactylum tricornutum (PtNTT2) comprises a
deletion of about 60 or more amino acid residues. In some cases,
the modified nucleoside triphosphate transporter from
Phaeodactylum tricornutum (PtNTT2) comprises a deletion of about 66
or more amino acid residues. In some cases, the modified
nucleoside triphosphate transporter from Phaeodactylum tricornutum
(PtNTT2) comprises a deletion of about 70 or more amino acid
residues.
[0037] In some embodiments, described herein is an isolated and
modified nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2) comprising an N-terminal deletion. In some
instances, the N-terminal deletion comprises about 5, 10, 15, 20,
22, 25, 30, 40, 44, 50, 60, 66, 70, 80, 90, or more amino acid
residues. In some instances, the N-terminal deletion comprises
about 5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more
amino acid residues. In some cases, the modified nucleoside
triphosphate transporter from Phaeodactylum tricornutum (PtNTT2)
comprises an N-terminal deletion of about 5 or more amino acid
residues. In some cases, the modified nucleoside triphosphate
transporter from Phaeodactylum tricornutum (PtNTT2) comprises an
N-terminal deletion of about 10 or more amino acid residues. In
some cases, the modified nucleoside triphosphate transporter from
Phaeodactylum tricornutum (PtNTT2) comprises an N-terminal deletion
of about 15 or more amino acid residues. In some cases, the
modified nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2) comprises an N-terminal deletion of about 20 or
more amino acid residues. In some cases, the modified nucleoside
triphosphate transporter from Phaeodactylum tricornutum (PtNTT2)
comprises an N-terminal deletion of about 22 or more amino acid
residues. In some cases, the isolated and modified nucleoside
triphosphate transporter from Phaeodactylum tricornutum (PtNTT2)
comprises an N-terminal deletion of about 25 or more amino acid
residues. In some cases, the modified nucleoside triphosphate
transporter from Phaeodactylum tricornutum (PtNTT2) comprises an
N-terminal deletion of about 30 or more amino acid residues. In
some cases, the modified nucleoside triphosphate transporter from
Phaeodactylum tricornutum (PtNTT2) comprises an N-terminal deletion
of about 40 or more amino acid residues. In some cases, the
modified nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2) comprises an N-terminal deletion of about 44 or
more amino acid residues. In some cases, the modified nucleoside
triphosphate transporter from Phaeodactylum tricornutum (PtNTT2)
comprises an N-terminal deletion of about 50 or more amino acid
residues. In some cases, the modified nucleoside triphosphate
transporter from Phaeodactylum tricornutum (PtNTT2) comprises an
N-terminal deletion of about 60 or more amino acid residues. In
some cases, the modified nucleoside triphosphate transporter from
Phaeodactylum tricornutum (PtNTT2) comprises an N-terminal deletion
of about 66 or more amino acid residues. In some cases, the
modified nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2) comprises an N-terminal deletion of about 70 or
more amino acid residues.
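As an illustration of how an N-terminal deletion relates to residue-range notation such as PtNTT2(66-575), the following sketch removes a chosen number of residues from the start of a protein sequence and reports the retained range. It assumes 1-based residue numbering; the function name is hypothetical and the sequence is a 575-residue placeholder, not an actual PtNTT2 sequence or SEQ ID NO.

```python
from typing import Tuple


def n_terminal_truncation(sequence: str, n_deleted: int) -> Tuple[str, Tuple[int, int]]:
    """Drop the first n_deleted residues.

    Returns the truncated sequence and the 1-based (first, last) positions,
    in full-length numbering, of the residues that remain.
    """
    if not 0 <= n_deleted < len(sequence):
        raise ValueError("n_deleted must be between 0 and len(sequence) - 1")
    return sequence[n_deleted:], (n_deleted + 1, len(sequence))


if __name__ == "__main__":
    placeholder = "M" + "A" * 574  # placeholder 575-residue sequence
    truncated, (first, last) = n_terminal_truncation(placeholder, 65)
    print(f"retained residues {first}-{last}, length {len(truncated)}")
```

Under this numbering assumption, removing the first 65 residues of a 575-residue protein leaves residues 66-575, and removing 66 leaves residues 67-575.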
[0038] In some embodiments, the isolated and modified nucleoside
triphosphate transporter comprises at least 80%, 85%, 90%, 95%,
96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 80% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 85% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 90% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 95% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 96% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 97% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 98% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 99% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
comprises 100% sequence identity to SEQ ID NO: 4. In some
instances, the modified nucleoside triphosphate transporter
consists of 100% sequence identity to SEQ ID NO: 4.
[0039] In some embodiments, the isolated and modified nucleoside
triphosphate transporter comprises at least 80%, 85%, 90%, 95%,
96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 80% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 85% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 90% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 95% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 96% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 97% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 98% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 99% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
comprises 100% sequence identity to SEQ ID NO: 6. In some
instances, the modified nucleoside triphosphate transporter
consists of 100% sequence identity to SEQ ID NO: 6.
[0040] In some embodiments, the isolated and modified nucleoside
triphosphate transporter comprises at least 80%, 85%, 90%, 95%,
96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 80% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 85% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 90% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 95% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 96% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 97% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 98% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises at least 99% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
comprises 100% sequence identity to SEQ ID NO: 8. In some
instances, the modified nucleoside triphosphate transporter
consists of 100% sequence identity to SEQ ID NO: 8.
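By way of non-limiting illustration, the percent sequence identity thresholds recited above can be evaluated from a pairwise alignment. The following Python sketch assumes that the two sequences have already been aligned to equal length with "-" as the gap character, and that identity is counted over ungapped columns only; other conventions for the denominator exist and may give different values. The function names and the toy sequences are hypothetical and are not SEQ ID NOs: 4, 6, or 8.

```python
# Thresholds recited in the text, in percent.
THRESHOLDS = (80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns in which neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("alignment has no ungapped columns")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float):
    """Return the largest recited threshold that `identity` meets, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical 10-residue toy sequences differing at one position.
    reference = "MKTAYIAKQR"
    candidate = "MKTAYLAKQR"
    identity = percent_identity(reference, candidate)
    print(identity, highest_threshold_met(identity))
```

In practice, the alignment itself would be produced by an established alignment tool, and the counting convention would be fixed before any comparison against the thresholds.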
[0041] In some embodiments, a modified nucleoside triphosphate
transporter described herein has a specificity for an unnatural
nucleic acid that is at least about 10%, 20%, 30%, 40%, 50%, 60%,
70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the specificity of
the wild type nucleoside triphosphate transporter toward the
unnatural nucleic acid. In some embodiments, the modified
nucleoside triphosphate transporter has a specificity for an
unnatural nucleic acid comprising a modified sugar that is at least
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%,
99%, 99.5%, 99.99% the specificity of the wild type nucleoside
triphosphate transporter toward a natural nucleic acid and/or the
unnatural nucleic acid without the modified sugar. In some
embodiments, the modified nucleoside triphosphate transporter has a
specificity for an unnatural nucleic acid comprising a modified
base that is at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%,
90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the specificity of the wild
type nucleoside triphosphate transporter toward a natural nucleic
acid and/or the unnatural nucleic acid without the modified base.
In some embodiments, the modified nucleoside triphosphate
transporter has a specificity for an unnatural nucleic acid
comprising a triphosphate that is at least about 10%, 20%, 30%,
40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the
specificity of the wild type nucleoside triphosphate transporter
toward a nucleic acid comprising a triphosphate and/or the
unnatural nucleic acid without the triphosphate. For example, a
modified nucleoside triphosphate transporter can have a specificity
for an unnatural nucleic acid comprising a triphosphate that is at
least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%,
98%, 99%, 99.5%, 99.99% the specificity of the wild type nucleoside
triphosphate transporter toward the unnatural nucleic acid with a
diphosphate or monophosphate, or no phosphate, or a combination
thereof.
[0042] In some embodiments, a modified nucleoside triphosphate
transporter described herein has a specificity for an unnatural
nucleic acid and a specificity to a natural nucleic acid that is at
least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%,
98%, 99%, 99.5%, 99.99% the specificity of the wild type nucleoside
triphosphate transporter toward the natural nucleic acid. In some
embodiments, the modified nucleoside triphosphate transporter has a
specificity for an unnatural nucleic acid comprising a modified
sugar and a specificity to a natural nucleic acid that is at least
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%,
99%, 99.5%, 99.99% the specificity of the wild type nucleoside
triphosphate transporter toward the natural nucleic acid. In some
embodiments, the modified nucleoside triphosphate transporter has a
specificity for an unnatural nucleic acid comprising a modified
base and a specificity to a natural nucleic acid that is at least
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%,
99%, 99.5%, 99.99% the specificity of the wild type nucleoside
triphosphate transporter toward the natural nucleic acid.
[0043] In some embodiments, a sequence of a modified nucleoside
triphosphate transporter is further modified to improve the
expression and cellular activity. In some instances, the codon
usage is modified to introduce ribosomal pause sites to slow
translation and to improve the targeting of the modified nucleoside
triphosphate transporter polypeptide to membrane translocons
(Fluman, et al., "mRNA-programmed translation pauses in the
targeting of E. coli membrane proteins," eLife 2014; 3:e03440). In
some instances, modification of one or more transmembrane helices,
for example, modification of a first transmembrane helix and/or
generating a chimeric transporter comprising a first transmembrane
helix of a different protein (e.g., a related transporter) may
enhance expression and cellular activities (Marshall, et al., "A
link between integral membrane protein expression and simulated
integration efficiency," Cell Reports, 16(8): 2169-2177 (2016)). In
some instances, an endogenous, a modified, or a heterologous signal
peptide is incorporated into the sequence of a modified nucleoside
triphosphate transporter to improve expression and cellular
activity. In some cases, the signal peptide is optionally linked
in-frame with the sequence of the modified nucleoside triphosphate
transporter through a linker. In some cases, the linker is a
non-cleavable linker. In other cases, the linker is a cleavable
linker. Exemplary signal peptides are illustrated in Table 3. In
some cases, a signal peptide from Table 3, optionally linked to a
linker, is incorporated into the sequence of a modified nucleoside
triphosphate transporter described herein.
[0044] In some embodiments, the expression of the modified
nucleoside triphosphate transporter is tuned through modification
of the ribosomal binding site to modulate the rate of the modified
nucleoside triphosphate transporter polypeptide's synthesis. See,
e.g., Howard, et al., "Automated design of synthetic ribosome
binding sites to control protein expression," Nature Biotechnology
27: 946-950 (2009); and Mutalik, et al., "Precise and reliable gene
expression via standard transcription and translation initiation
elements," Nature Methods 10: 354-360 (2013).
[0045] In some embodiments, the expression of the modified
nucleoside triphosphate transporter is modulated by the attachment
of a tunable degradation tag. In some instances, a tunable
degradation tag comprises a small amino acid sequence that, when
fused to a target protein, marks the protein for degradation by a
cognate protease in a bacterial cell. Exemplary tunable degradation
tag and cognate protease pairs include, but are not limited to, E.
coli ssrA (ec-ssrA)/E. coli Lon (ec-Lon), and Mesoplasma florum
ssrA (mf-ssrA)/Mesoplasma florum Lon (mf-Lon). In some instances,
the tunable degradation tag comprises a modified tag that alters
expression and/or degradation dynamics relative to an unmodified
degradation tag. In some instances, a tunable degradation tag
contemplated herein comprises a degradation tag described in PCT
Patent Publication WO2014/160025A2. In some instances, a tunable
degradation tag contemplated herein comprises a degradation tag
described in Cameron, et al., "Tunable protein degradation in
bacteria," Nature Biotechnology 32: 1276-1281 (2014).
[0046] In some embodiments, the expression of the modified
nucleoside triphosphate transporter is modulated by the
availability of an endogenous or exogenous (e.g. unnatural
nucleotide triphosphate or unnatural amino acid) molecule during
translation. In some instances, the expression of the modified
nucleoside triphosphate transporter is correlated with the copy
number of rare codons, in which the rate of a ribosomal
read-through of a rare codon modulates translation of the
transporter. See, e.g., Wang, et al., "An engineered rare codon
device for optimization of metabolic pathways," Scientific Reports
6:20608 (2016).
[0047] In some instances, a modified nucleoside triphosphate
transporter is characterized according to its rate of dissociation
from a nucleic acid substrate. In some embodiments, a modified
nucleoside triphosphate transporter has a relatively low
dissociation rate for one or more natural and unnatural nucleic
acids. In some embodiments, a modified nucleoside triphosphate
transporter has a relatively high dissociation rate for one or more
natural and unnatural nucleic acids. The dissociation rate is an
activity of an isolated and modified nucleoside triphosphate
transporter that can be adjusted to tune reaction rates in methods
set forth herein.
[0048] Modified nucleoside triphosphate transporters from native
sources or variants thereof can be screened using an assay that
detects importation of an unnatural nucleic acid having a
particular structure. In one example, the modified nucleoside
triphosphate transporters can be screened for the ability to import
an unnatural nucleic acid or UBP; e.g., d5SICSTP, dNaMTP, or
d5SICSTP-dNaMTP UBP. A NTT, e.g., a heterologous transporter, can
be used that displays a modified property for the unnatural nucleic
acid as compared to the wild-type transporter. For example, the
modified property can be, e.g., K.sub.m, k.sub.cat, V.sub.max, NTT
importation in the presence of an unnatural nucleic acid (or of a
naturally occurring nucleotide), average template read-length by a
cell with the modified nucleoside triphosphate transporter in the
presence of an unnatural nucleic acid, specificity of the
transporter for an unnatural nucleic acid, rate of binding of an
unnatural nucleic acid, or rate of product release, or any
combination thereof. In one embodiment, the modified property is a
reduced K.sub.m for an unnatural nucleic acid and/or an increased
k.sub.cat/K.sub.m or V.sub.max/K.sub.m for an unnatural nucleic
acid. Similarly, the modified nucleoside triphosphate transporter
optionally has an increased rate of binding of an unnatural nucleic
acid, an increased rate of product release, and/or an increased
cell importation rate, as compared to a wild-type transporter.
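As a non-limiting illustration of how the kinetic properties named above may be compared, the following Python sketch assumes that importation follows the standard Michaelis-Menten rate form, v = V.sub.max[S]/(K.sub.m + [S]), and compares a variant to a wild-type transporter by K.sub.m and by the ratio k.sub.cat/K.sub.m. The class name, function names, units, and all numerical values are hypothetical and are not measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KineticParams:
    """Hypothetical kinetic parameters for one transporter and one substrate."""
    k_cat: float  # turnover number, assumed units of 1/s
    k_m: float    # Michaelis constant, assumed units of micromolar

    def efficiency(self) -> float:
        """Specificity constant k_cat / K_m."""
        return self.k_cat / self.k_m


def mm_rate(v_max: float, k_m: float, substrate: float) -> float:
    """Michaelis-Menten rate v = V_max * [S] / (K_m + [S])."""
    return v_max * substrate / (k_m + substrate)


def has_improved_property(modified: KineticParams, wild_type: KineticParams) -> bool:
    """True if the variant shows a reduced K_m and/or an increased k_cat/K_m."""
    return modified.k_m < wild_type.k_m or modified.efficiency() > wild_type.efficiency()


if __name__ == "__main__":
    wild_type = KineticParams(k_cat=2.0, k_m=4.0)  # illustrative values only
    variant = KineticParams(k_cat=2.0, k_m=2.0)    # illustrative values only
    print(variant.efficiency(), wild_type.efficiency())
    print(has_improved_property(variant, wild_type))
    print(mm_rate(v_max=10.0, k_m=5.0, substrate=5.0))  # rate at [S] = K_m
```

At a substrate concentration equal to K.sub.m, the rate is half of V.sub.max, which is why a reduced K.sub.m corresponds to faster importation at low substrate concentrations.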
[0049] At the same time, a modified nucleoside triphosphate
transporter can import natural nucleic acids, e.g., A, C, G, and T,
into a cell. For example, a modified nucleoside triphosphate
transporter optionally displays a specific importation activity for
a natural nucleic acid that is at least about 5% as high (e.g., 5%,
10%, 25%, 50%, 75%, 100% or higher), as a corresponding wild-type
transporter. Optionally, the modified nucleoside triphosphate
transporter displays a k.sub.cat/K.sub.m or V.sub.max/K.sub.m for a
naturally occurring nucleotide that is at least about 5% as high
(e.g., about 5%, 10%, 25%, 50%, 75% or 100% or higher) as the
wild-type NTT.
[0050] Modified nucleoside triphosphate transporters used herein
that can have the ability to import an unnatural nucleic acid of a
particular structure can also be produced using a directed
evolution approach. A nucleic acid synthesis assay can be used to
screen for transporter variants having specificity for any of a
variety of unnatural nucleic acids. For example, transporter
variants can be screened for the ability to import an unnatural
nucleic acid or UBP; e.g., d5SICSTP, dNaMTP, or d5SICSTP-dNaMTP UBP
into nucleic acids. In some embodiments, such an assay is an in
vitro assay, e.g., using a recombinant transporter variant. In some
embodiments, such an assay is an in vivo assay, e.g., expressing a
transporter variant in a cell. Such directed evolution techniques
can be used to screen variants of any suitable transporter for
activity toward any of the unnatural nucleic acids set forth
herein.
Engineered Cells
[0051] In some embodiments, described herein is an engineered cell
comprising a nucleic acid molecule encoding a modified nucleoside
triphosphate transporter. In some instances, the nucleic acid
molecule encodes a modified nucleoside triphosphate transporter
from Phaeodactylum tricornutum (PtNTT2). In some instances, the
nucleic acid of the modified nucleoside triphosphate transporter is
incorporated in the genomic sequence of the engineered cell.
[0052] The engineered cell can be any suitable prokaryote. In some
instances, the engineered cell is a Gram-negative bacterium. In
other instances, the engineered cell is a Gram-positive bacterium.
Exemplary bacteria include, but are not limited to, Bacillus
bacteria (e.g., B. subtilis, B. megaterium), Acinetobacter
bacteria, Nocardia bacteria, Xanthobacter bacteria, Escherichia
bacteria (e.g., E. coli (e.g., strains DH10B, Stbl2, DH5-alpha,
DB3, DB3.1, DB4, DB5, JDP682 and ccdA-over (e.g., U.S. application
Ser. No. 09/518,188))), Streptomyces bacteria, Erwinia bacteria,
Klebsiella bacteria, Serratia bacteria (e.g., S. marcescens),
Pseudomonas bacteria (e.g., P. aeruginosa), Salmonella bacteria
(e.g., S. typhimurium, S. typhi), and Megasphaera bacteria (e.g.,
Megasphaera elsdenii). Bacteria also include, but are not limited
to, photosynthetic bacteria (e.g., green non-sulfur bacteria (e.g.,
Chloroflexus bacteria (e.g., C. aurantiacus), Chloronema bacteria
(e.g., C. giganteum)), green sulfur bacteria (e.g., Chlorobium
bacteria (e.g., C. limicola), Pelodictyon bacteria (e.g., P.
luteolum)), purple sulfur bacteria (e.g., Chromatium bacteria (e.g.,
C. okenii)), and purple non-sulfur bacteria (e.g., Rhodospirillum
bacteria (e.g., R. rubrum), Rhodobacter bacteria (e.g., R.
sphaeroides, R. capsulatus), and Rhodomicrobium bacteria (e.g., R.
vannielii))).
[0053] In some instances, the engineered cell comprises a plasmid
comprising the modified nucleoside triphosphate transporter. In
some cases, the modified nucleoside triphosphate transporter is a
codon optimized nucleoside triphosphate transporter from
Phaeodactylum tricornutum.
[0054] In some embodiments, the modified nucleoside triphosphate
transporter comprises a deletion. In some cases, the deletion is a
terminal deletion (e.g., an N-terminal or a C-terminal deletion). In
other cases, the deletion is an internal deletion.
[0055] As described above, the modified nucleoside triphosphate
transporter comprises a deletion of about 5, 10, 15, 20, 22, 25,
30, 40, 44, 50, 60, 66, 70, or more amino acid residues. In some
cases, the deletion is an N-terminal deletion, and the modified
nucleoside triphosphate transporter comprises a deletion of about
5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70, or more amino
acid residues at the N-terminus. In some cases, the modified
nucleoside triphosphate transporter comprises a deletion of about
66 amino acid residues at the N-terminus.
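As a non-limiting illustration, the N-terminal deletions recited above can be sketched as a simple operation on a protein sequence string. The following Python sketch uses hypothetical function names and a toy sequence rather than the PtNTT2 sequence. The optional re-addition of an initiator methionine is an assumption made for illustration and is not stated in the text.

```python
# Deletion lengths recited in the text (number of amino acid residues).
DELETION_LENGTHS = (5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66, 70)


def n_terminal_deletion(protein: str, n_deleted: int, add_start_met: bool = False) -> str:
    """Remove the first `n_deleted` residues of a one-letter protein sequence.

    `add_start_met` is a hypothetical option: if set, an initiator
    methionine is prepended when the truncated sequence lacks one.
    """
    if n_deleted < 0 or n_deleted >= len(protein):
        raise ValueError("deletion length must be in [0, len(protein))")
    truncated = protein[n_deleted:]
    if add_start_met and not truncated.startswith("M"):
        truncated = "M" + truncated
    return truncated


def truncation_series(protein: str, lengths=DELETION_LENGTHS) -> dict:
    """Map each applicable deletion length to its truncated sequence."""
    return {n: n_terminal_deletion(protein, n) for n in lengths if n < len(protein)}


if __name__ == "__main__":
    toy = "MKTAYIAKQR"  # hypothetical 10-residue sequence, not PtNTT2
    print(n_terminal_deletion(toy, 5))
    print(n_terminal_deletion(toy, 5, add_start_met=True))
    print(truncation_series(toy))
```

A series generated this way gives one candidate construct per recited deletion length, each of which would still need to be expressed and assayed.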
[0056] In some instances, the isolated and modified nucleoside
triphosphate transporter comprises at least 80%, 85%, 90%, 95%,
96%, 97%, 98%, or 99% sequence identity to SEQ ID NOs: 4, 6, or 8.
In some cases, the isolated and modified nucleoside triphosphate
transporter comprises at least 80%, 85%, 90%, 95%, 96%, 97%, 98%,
or 99% sequence identity to SEQ ID NO: 4. In some cases, the
isolated and modified
nucleoside triphosphate transporter comprises 100% sequence
identity to SEQ ID NO: 4. In some cases, the isolated and modified
nucleoside triphosphate transporter comprises at least 80%, 85%,
90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 6.
In some cases, the isolated and modified nucleoside triphosphate
transporter comprises 100% sequence identity to SEQ ID NO: 6. In
some cases, the isolated and modified nucleoside triphosphate
transporter comprises at least 80%, 85%, 90%, 95%, 96%, 97%, 98%,
or 99% sequence identity to SEQ ID NO: 8. In some cases, the
isolated and modified nucleoside triphosphate transporter comprises
100% sequence identity to SEQ ID NO: 8.
[0057] In some embodiments, the modified nucleoside triphosphate
transporter is under the control of a promoter. In some instances,
the promoter is derived from an E. coli source. In other instances,
the promoter is derived from a phage source. Exemplary promoters
include, but are not limited to, P.sub.bla, P.sub.lac,
P.sub.lacUV5, P.sub.H207, P.sub..lamda., P.sub.tac, or P.sub.N25.
In some instances, the promoter replaces the lac operon. In some
cases, the modified nucleoside triphosphate transporter is under
the control of a promoter selected from P.sub.bla, P.sub.lac,
P.sub.lacUV5, P.sub.H207, P.sub..lamda., P.sub.tac, or P.sub.N25.
In some cases, the modified nucleoside triphosphate transporter is
under the control of promoter P.sub.lacUV5.
[0058] In some instances, the modified nucleoside triphosphate
transporter is encoded within a pSC plasmid.
[0059] In some embodiments, the engineered cell further comprises a
second nucleic acid molecule encoding a Cas9 polypeptide or
variants thereof, a third nucleic acid molecule encoding a single
guide RNA (sgRNA) comprising a crRNA-tracrRNA scaffold; and a
fourth nucleic acid molecule comprising an unnatural
nucleotide.
[0060] The CRISPR/Cas system involves (1) an integration of short
regions of genetic material that are homologous to a nucleic acid
molecule of interest comprising an unnatural nucleotide, called
"spacers", in clustered arrays in the host genome, (2) expression
of short guiding RNAs (crRNAs) from the spacers, (3) binding of the
crRNAs to specific portions of the nucleic acid molecule of
interest referred to as protospacers, and (4) degradation of
protospacers by CRISPR-associated nucleases (Cas). In some cases, a
Type-II CRISPR system has been described in the bacterium
Streptococcus pyogenes, in which Cas9 and two non-coding small RNAs
(pre-crRNA and tracrRNA (trans-activating CRISPR RNA)) act in
concert to target and degrade a nucleic acid molecule of interest
in a sequence-specific manner (Jinek et al., "A Programmable
Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity,"
Science 337(6096):816-821 (August 2012, epub Jun. 28, 2012)).
[0061] In some instances, a CRISPR/Cas system utilizes a Cas9
polypeptide or a variant thereof. Cas9 is a double stranded
nuclease with two active cutting sites, one for each strand of the
double helix. In some instances, the Cas9 polypeptide or variants
thereof generate a double-stranded break. In some cases, the Cas9
polypeptide is a wild-type Cas9. In some instances, the Cas9
polypeptide is an optimized Cas9 for expression in an engineered
cell described herein.
[0062] In some instances, the two noncoding RNAs are further fused
into one single guide RNA (sgRNA). In some instances, the sgRNA
comprises a target motif that recognizes a modification at the
unnatural nucleotide position within a nucleic acid molecule of
interest. In some embodiments, the modification is a substitution,
insertion, or deletion. In some cases, the sgRNA comprises a target
motif that recognizes a substitution at the unnatural nucleotide
position within a nucleic acid molecule of interest. In some cases,
the sgRNA comprises a target motif that recognizes a deletion at
the unnatural nucleotide position within a nucleic acid molecule of
interest. In some cases, the sgRNA comprises a target motif that
recognizes an insertion at the unnatural nucleotide position within
a nucleic acid molecule of interest.
[0063] In some cases, the target motif is between 10 and 30
nucleotides in length. In some instances, the target motif is
between 15 and 30 nucleotides in length. In some cases, the target
motif is about 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, or 30 nucleotides in length. In some cases,
the target motif is about 15, 16, 17, 18, 19, 20, 21, or 22
nucleotides in length.
[0064] In some cases, the sgRNA further comprises a protospacer
adjacent motif (PAM) recognition element. In some instances, PAM is
located adjacent to the 3' terminus of the target motif. In some
cases, a nucleotide within the target motif that forms Watson-Crick
base pairing with the modification at the unnatural nucleotide
position within the nucleic acid molecule of interest is located
between 3 and 22, between 5 and 20, between 5 and 18, between 5 and
15, between 5 and 12, or between 5 and 10 nucleotides from the 5'
terminus of PAM. In some cases, a nucleotide within the target
motif that forms Watson-Crick base pairing with the modification at
the unnatural nucleotide position within the nucleic acid molecule
of interest is located about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, or 15 nucleotides from the 5' terminus of PAM.
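As a non-limiting illustration of the geometry described above, the following Python sketch scans one DNA strand for a PAM, takes the target motif immediately 5' of that PAM, and reports how many nucleotides separate a position of interest from the 5' terminus of the PAM. Several details are assumptions made for illustration: the PAM pattern NGG (the canonical PAM for Streptococcus pyogenes Cas9, which is not specified in the text), the 20-nucleotide motif length, the counting convention in which the nucleotide immediately 5' of the PAM is at distance 1, and the function name. Only the given strand is searched; the reverse complement is not considered.

```python
import re


def find_target_motifs(dna: str, position: int, motif_len: int = 20,
                       pam_regex: str = "[ACGT]GG",
                       min_dist: int = 3, max_dist: int = 22) -> list:
    """Return (motif_start, motif, pam, distance) tuples for candidate motifs.

    `position` is the 0-based index of the nucleotide of interest (e.g., the
    site corresponding to the unnatural nucleotide position). A candidate is
    kept only if the motif covers `position` and the distance from
    `position` to the 5' terminus of the PAM lies in [min_dist, max_dist].
    """
    hits = []
    # Lookahead so that overlapping PAM occurrences are all reported.
    for match in re.finditer(f"(?=({pam_regex}))", dna):
        pam_start = match.start()
        motif_start = pam_start - motif_len
        if motif_start < 0:
            continue  # not enough sequence 5' of this PAM for a full motif
        distance = pam_start - position  # assumed convention: adjacent base = 1
        if motif_start <= position < pam_start and min_dist <= distance <= max_dist:
            hits.append((motif_start, dna[motif_start:pam_start], match.group(1), distance))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence with a single NGG PAM starting at index 30.
    dna = "AT" * 15 + "CGG" + "ATAT"
    print(find_target_motifs(dna, position=20))
```

Tightening `min_dist` and `max_dist` to, e.g., 5 and 10 restricts the candidates to motifs in which the position of interest lies closer to the PAM, matching the narrower ranges recited above.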
[0065] In some instances, the second nucleic acid molecule, the
third nucleic acid molecule, and the fourth nucleic acid molecule
are encoded in one or more plasmids. In some instances, the sgRNA
encoded by the third nucleic acid molecule comprises a target motif
that recognizes a modification at the unnatural nucleotide position
within the fourth nucleic acid molecule. In some cases, the
modification at the unnatural nucleotide position within the fourth
nucleic acid molecule generates a modified fourth nucleic acid
molecule. In some cases, the modification is a substitution, a
deletion, or an insertion. In some cases, the sgRNA encoded by the
third nucleic acid molecule further comprises a protospacer
adjacent motif (PAM) recognition element. In some cases, the PAM
element is adjacent to the 3' terminus of the target motif. In some
cases, the combination of Cas9 polypeptide or variants thereof and
sgRNA modulates replication of the modified fourth nucleic acid
molecule. In some cases, the combination of Cas9 polypeptide or
variants thereof, sgRNA and the modified nucleoside triphosphate
transporter modulates replication of the modified fourth nucleic
acid molecule.
[0066] In some cases, the engineered cell further comprises an
additional nucleic acid molecule that encodes an additional single
guide RNA (sgRNA) comprising a crRNA-tracrRNA scaffold.
[0067] In some instances, the combination of Cas9 polypeptide or
variants thereof, sgRNA and the modified nucleoside triphosphate
transporter decreases the replication rate of the modified fourth
nucleic acid molecule by about 80%, 85%, 95%, 99%, or higher. In
some instances, the production of the fourth nucleic acid molecule
in the engineered cell increases by about 50%, 60%, 70%, 80%, 90%,
95%, 96%, 97%, 98%, 99%, or higher.
[0068] In some embodiments, the modified nucleoside triphosphate
transporter is expressed in a modified host strain (e.g., the
engineered cell), optimized for the expression and activity of the
modified nucleoside triphosphate transporter and/or general uptake
of nucleoside triphosphates, natural or non-natural. For example,
the expression of outer membrane porins, including, but not limited
to, OmpA, OmpF, OmpC, may be modified, or the modified nucleoside
triphosphate transporter may be expressed in the host cell (e.g.,
the engineered cell) that also expresses a heterologous outer
membrane porin. Alternatively, the host cell (e.g., the engineered
cell) may be permeabilized (chemically or by genetic means) to
improve the uptake of nucleoside triphosphates. In some
embodiments, the host cell (e.g., the engineered cell) may contain
deletions of non-essential, endogenously secreted proteins to
improve the capacity of the host secretion machinery for expression
of the modified nucleoside triphosphate transporter.
[0069] In some embodiments, the modified nucleoside triphosphate
transporter decreases doubling time of the host cell (e.g., the
engineered cell).
[0070] In some cases, the modified nucleoside triphosphate
transporter enables unnatural base pair retention of about 50%,
60%, 70%, 80%, 90%, 95%, 99%, or more.
Plasmids Encoding a Modified Nucleoside Triphosphate
Transporter
[0071] In some embodiments, also described herein is an isolated
and purified plasmid comprising a nucleic acid molecule encoding a
modified nucleoside triphosphate transporter from Phaeodactylum
tricornutum (PtNTT2); and a promoter region selected from a pSC
plasmid or lacZYA locus.
[0072] In some instances, the modified nucleoside triphosphate
transporter is a codon optimized nucleoside triphosphate
transporter from Phaeodactylum tricornutum.
[0073] In some instances, the modified nucleoside triphosphate
transporter comprises a deletion. In some cases, the deletion is a
terminal deletion or an internal deletion. In some cases, the
deletion is an N-terminal truncation, a C-terminal truncation, or a
truncation of both termini.
[0074] In some embodiments, the modified nucleoside triphosphate
transporter comprises a deletion of about 5, 10, 15, 20, 22, 25,
30, 40, 44, 50, 60, 66, 70, or more amino acid residues. In some
instances, the deletion is an N-terminal deletion. In some cases,
the modified nucleoside triphosphate transporter comprises a
deletion of about 5, 10, 15, 20, 22, 25, 30, 40, 44, 50, 60, 66,
70, or more amino acid residues at the N-terminus. In some cases,
the modified nucleoside triphosphate transporter comprises a
deletion of about 66 amino acid residues at the N-terminus.
[0075] In some instances, the isolated and modified nucleoside
triphosphate transporter comprises at least 80%, 85%, 90%, 95%,
96%, 97%, 98%, or 99% sequence identity to SEQ ID NOs: 4, 6, or 8.
In some cases, the isolated and modified nucleoside triphosphate
transporter comprises at least 80%, 85%, 90%, 95%, 96%, 97%, 98%,
or 99% sequence identity to SEQ ID NO: 4. In some cases, the
isolated and modified
nucleoside triphosphate transporter comprises 100% sequence
identity to SEQ ID NO: 4. In some cases, the isolated and modified
nucleoside triphosphate transporter comprises at least 80%, 85%,
90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 6.
In some cases, the isolated and modified nucleoside triphosphate
transporter comprises 100% sequence identity to SEQ ID NO: 6. In
some cases, the isolated and modified nucleoside triphosphate
transporter comprises at least 80%, 85%, 90%, 95%, 96%, 97%, 98%,
or 99% sequence identity to SEQ ID NO: 8. In some cases, the
isolated and modified nucleoside triphosphate transporter comprises
100% sequence identity to SEQ ID NO: 8.
[0076] In some embodiments, the modified nucleoside triphosphate
transporter is under the control of a promoter. Exemplary
promoters include, but are not limited to, P.sub.bla, P.sub.lac,
P.sub.lacUV5, P.sub.H207, P.sub..lamda., P.sub.tac, or P.sub.N25.
In some instances, the promoter replaces the lac operon. In some
cases, the modified nucleoside triphosphate transporter is under
the control of a promoter selected from P.sub.bla, P.sub.lac,
P.sub.lacUV5, P.sub.H207, P.sub..lamda., P.sub.tac, or P.sub.N25.
In some cases, the modified nucleoside triphosphate transporter is
under the control of promoter P.sub.lacUV5.
[0077] In some instances, the modified nucleoside triphosphate
transporter is encoded within a pSC plasmid.
[0078] In some embodiments, also disclosed herein is an in vivo
method of increasing the production of a nucleic acid molecule
containing an unnatural nucleotide comprising incubating a cell
with an isolated and purified plasmid described supra.
Nucleic Acid Molecules
[0079] In some embodiments, a nucleic acid (e.g., also referred to
herein as nucleic acid molecule of interest) is from any source or
composition, such as DNA, cDNA, gDNA (genomic DNA), RNA, siRNA
(short inhibitory RNA), RNAi, tRNA, mRNA or rRNA (ribosomal RNA),
for example, and is in any form (e.g., linear, circular,
supercoiled, single-stranded, double-stranded, and the like). In
some embodiments, nucleic acids comprise nucleotides, nucleosides,
or polynucleotides. In some cases, nucleic acids comprise natural
and unnatural nucleic acids. In some cases, a nucleic acid also
comprises unnatural nucleic acids, such as DNA or RNA analogs
(e.g., containing base analogs, sugar analogs and/or a non-native
backbone and the like). It is understood that the term "nucleic
acid" does not refer to or imply a specific length of the
polynucleotide chain, thus polynucleotides and oligonucleotides are
also included in the definition. Exemplary natural nucleotides
include, without limitation, ATP, UTP, CTP, GTP, ADP, UDP, CDP,
GDP, AMP, UMP, CMP, GMP, dATP, dTTP, dCTP, dGTP, dADP, dTDP, dCDP,
dGDP, dAMP, dTMP, dCMP, and dGMP. Exemplary natural
deoxyribonucleotides include dATP, dTTP, dCTP, dGTP, dADP, dTDP,
dCDP, dGDP, dAMP, dTMP, dCMP, and dGMP. Exemplary natural
ribonucleotides include ATP, UTP, CTP, GTP, ADP, UDP, CDP, GDP,
AMP, UMP, CMP, and GMP. For RNA, the uracil base is uridine. A
nucleic acid sometimes is a vector, plasmid, phagemid, autonomously
replicating sequence (ARS), centromere, artificial chromosome,
yeast artificial chromosome (e.g., YAC) or other nucleic acid able
to replicate or be replicated in a host cell. In some cases, an
unnatural nucleic acid is a nucleic acid analogue. In additional
cases, an unnatural nucleic acid is from an extracellular source.
In other cases, an unnatural nucleic acid is available to the
intracellular space of an organism provided herein, e.g., a
genetically modified organism.
Unnatural Nucleic Acids
[0080] A nucleotide analog, or unnatural nucleotide, comprises a
nucleotide which contains some type of modification to either the
base, sugar, or phosphate moieties. In some embodiments, a
modification comprises a chemical modification. In some cases,
modifications occur at the 3'OH or 5'OH group, at the backbone, at
the sugar component, or at the nucleotide base. Modifications, in
some instances, optionally include non-naturally occurring linker
molecules and/or interstrand or intrastrand cross links. In one
aspect, the modified nucleic acid comprises modification of one or
more of the 3'OH or 5'OH group, the backbone, the sugar component,
or the nucleotide base, and/or addition of non-naturally occurring
linker molecules. In one aspect, a modified backbone comprises a
backbone other than a phosphodiester backbone. In one aspect, a
modified sugar comprises a sugar other than deoxyribose (in
modified DNA) or other than ribose (modified RNA). In one aspect, a
modified base comprises a base other than adenine, guanine,
cytosine or thymine (in modified DNA) or a base other than adenine,
guanine, cytosine or uracil (in modified RNA).
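The definitions of a modified backbone, sugar, and base given above
can be sketched as a simple check. This is a minimal illustration in
Python, assuming each nucleotide is described by plain-text names for
its base, sugar, and backbone linkage; the function modified_moieties
and its argument names are hypothetical and not part of the
disclosure.

```python
# Illustrative sketch only; names and string labels are hypothetical.
NATURAL_BASES = {
    "DNA": {"adenine", "guanine", "cytosine", "thymine"},
    "RNA": {"adenine", "guanine", "cytosine", "uracil"},
}
NATURAL_SUGAR = {"DNA": "deoxyribose", "RNA": "ribose"}
NATURAL_BACKBONE = "phosphodiester"


def modified_moieties(kind, base, sugar, backbone):
    """Return the moieties ("base", "sugar", "backbone") that differ
    from the natural ones for the given kind ("DNA" or "RNA")."""
    found = []
    if base not in NATURAL_BASES[kind]:
        found.append("base")
    if sugar != NATURAL_SUGAR[kind]:
        found.append("sugar")
    if backbone != NATURAL_BACKBONE:
        found.append("backbone")
    return found
```

An empty result corresponds to a nucleotide with no modification under
these definitions; a non-empty result names the modified moieties.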
[0081] In some embodiments, the nucleic acid comprises at least one
modified base. In some instances, the nucleic acid comprises 2, 3,
4, 5, 6, 7, 8, 9, 10, 15, 20, or more modified bases. In some
cases, modifications to the base moiety include natural and
synthetic modifications of A, C, G, and T/U as well as different
purine or pyrimidine bases. In some embodiments, a modification is
to a modified form of adenine, guanine, cytosine or thymine (in
modified DNA) or a modified form of adenine, guanine, cytosine or
uracil (modified RNA).
[0082] A modified base of an unnatural nucleic acid includes, but is
not limited to, uracil-5-yl, hypoxanthin-9-yl (I),
2-aminoadenin-9-yl, 5-methylcytosine (5-me-C), 5-hydroxymethyl
cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and
other alkyl derivatives of adenine and guanine, 2-propyl and other
alkyl derivatives of adenine and guanine, 2-thiouracil,
2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine,
5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine,
5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol,
8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and
guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other
5-substituted uracils and cytosines, 7-methylguanine and
7-methyladenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine and
7-deazaadenine and 3-deazaguanine and 3-deazaadenine. Certain
unnatural nucleic acids, such as 5-substituted pyrimidines,
6-azapyrimidines and N-2 substituted purines, N-6 substituted
purines, O-6 substituted purines, 2-aminopropyladenine,
5-propynyluracil, 5-propynylcytosine, 5-methylcytosine, those that
increase the stability of duplex formation, universal nucleic
acids, hydrophobic nucleic acids, promiscuous nucleic acids,
size-expanded nucleic acids, fluorinated nucleic acids,
5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6
substituted purines, including 2-aminopropyladenine,
5-propynyluracil and 5-propynylcytosine. 5-methylcytosine (5-me-C),
5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine,
6-methyl, other alkyl derivatives of adenine and guanine, 2-propyl
and other alkyl derivatives of adenine and guanine, 2-thiouracil,
2-thiothymine and 2-thiocytosine, 5-halouracil, 5-halocytosine,
5-propynyl (--C.ident.C--CH.sub.3) uracil, 5-propynyl cytosine, other
alkynyl derivatives of pyrimidine nucleic acids, 6-azo uracil,
6-azo cytosine, 6-azo thymine, 5-uracil (pseudouracil),
4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and
other 8-substituted adenines and guanines, 5-halo particularly
5-bromo, 5-trifluoromethyl, other 5-substituted uracils and
cytosines, 7-methylguanine, 7-methyl adenine, 2-F-adenine,
2-amino-adenine, 8-azaguanine, 8-azaadenine, 7-deazaguanine,
7-deazaadenine, 3-deazaguanine, 3-deazaadenine, tricyclic
pyrimidines, phenoxazine cytidine
(1H-pyrimido[5,4-b][1,4]benzoxazin-2(3H)-one), phenothiazine cytidine
(1H-pyrimido[5,4-b][1,4]benzothiazin-2(3H)-one), G-clamps,
phenoxazine cytidine (e.g.
9-(2-aminoethoxy)-H-pyrimido[5,4-b][1,4]benzoxazin-2(3H)-one),
carbazole cytidine (2H-pyrimido[4,5-b]indol-2-one), pyridoindole
cytidine (H-pyrido[3',2':4,5]pyrrolo[2,3-d]pyrimidin-2-one), those
in which the purine or pyrimidine base is replaced with other
heterocycles, 7-deaza-adenine, 7-deazaguanosine, 2-aminopyridine,
2-pyridone, azacytosine, 5-bromocytosine, bromouracil,
5-chlorocytosine, chlorinated cytosine, cyclocytosine, cytosine
arabinoside, 5-fluorocytosine, fluoropyrimidine, fluorouracil,
5,6-dihydrocytosine, 5-iodocytosine, hydroxyurea, iodouracil,
5-nitrocytosine, 5-bromouracil, 5-chlorouracil, 5-fluorouracil, and
5-iodouracil, 2-amino-adenine, 6-thio-guanine, 2-thio-thymine,
4-thio-thymine, 5-propynyl-uracil, 4-thio-uracil, N4-ethylcytosine,
7-deazaguanine, 7-deaza-8-azaguanine, 5-hydroxycytosine,
2'-deoxyuridine, 2-amino-2'-deoxyadenosine, and those described in
U.S. Pat. Nos. 3,687,808; 4,845,205; 4,910,300; 4,948,882;
5,093,232; 5,130,302; 5,134,066; 5,175,273; 5,367,066; 5,432,272;
5,457,187; 5,459,255; 5,484,908; 5,502,177; 5,525,711; 5,552,540;
5,587,469; 5,594,121; 5,596,091; 5,614,617; 5,645,985; 5,681,941;
5,750,692; 5,763,588; 5,830,653 and 6,005,096; WO 99/62923;
Kandimalla et al., (2001) Bioorg. Med. Chem. 9:807-813; The Concise
Encyclopedia of Polymer Science and Engineering, Kroschwitz, J. I.,
Ed., John Wiley & Sons, 1990, 858-859; Englisch et al.,
Angewandte Chemie, International Edition, 1991, 30, 613; and
Sanghvi, Chapter 15, Antisense Research and Applications, Crooke and
Lebleu Eds., CRC Press, 1993, 273-288. Additional base
modifications can be found, for example, in U.S. Pat. No.
3,687,808; Englisch et al., Angewandte Chemie, International
Edition, 1991, 30, 613; and Sanghvi, Chapter 15, Antisense Research
and Applications, pages 289-302, Crooke and Lebleu ed., CRC Press,
1993.
[0083] Unnatural nucleic acids comprising various heterocyclic
bases and various sugar moieties (and sugar analogs) are available
in the art, and the nucleic acid in some cases includes one or
several heterocyclic bases other than the principal five base
components of naturally-occurring nucleic acids. For example, the
heterocyclic base includes, in some cases, uracil-5-yl,
cytosin-5-yl, adenin-7-yl, adenin-8-yl, guanin-7-yl, guanin-8-yl,
4-aminopyrrolo[2,3-d]pyrimidin-5-yl, 2-amino-4-oxopyrrolo[2,3-d]
pyrimidin-5-yl, 2-amino-4-oxopyrrolo[2,3-d]pyrimidin-3-yl groups,
where the purines are attached to the sugar moiety of the nucleic
acid via the 9-position, the pyrimidines via the 1-position, the
pyrrolopyrimidines via the 7-position and the pyrazolopyrimidines
via the 1-position.
[0084] In some embodiments, a modified base of an unnatural nucleic
acid is depicted below, wherein the wavy line identifies a point of
attachment to the (deoxy)ribose or ribose.
##STR00002## ##STR00003## ##STR00004## ##STR00005## ##STR00006##
##STR00007## ##STR00008##
[0085] In some embodiments, nucleotide analogs are also modified at
the phosphate moiety. Modified phosphate moieties include, but are
not limited to, those with modification at the linkage between two
nucleotides and contain, for example, a phosphorothioate, chiral
phosphorothioate, phosphorodithioate, phosphotriester,
aminoalkylphosphotriester, methyl and other alkyl phosphonates
including 3'-alkylene phosphonate and chiral phosphonates,
phosphinates, phosphoramidates including 3'-amino phosphoramidate
and aminoalkylphosphoramidates, thionophosphoramidates,
thionoalkylphosphonates, thionoalkylphosphotriesters, and
boranophosphates. It is understood that these phosphate or modified
phosphate linkages between two nucleotides can be through a 3'-5'
linkage or a 2'-5' linkage, and the linkage can contain inverted
polarity such as 3'-5' to 5'-3' or 2'-5' to 5'-2'. Various salts,
mixed salts and free acid forms are also included. Numerous United
States patents teach how to make and use nucleotides containing
modified phosphates and include but are not limited to, U.S. Pat.
Nos. 3,687,808; 4,469,863; 4,476,301; 5,023,243; 5,177,196;
5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131;
5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677; 5,476,925;
5,519,126; 5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799;
5,587,361; and 5,625,050.
[0086] In some embodiments, unnatural nucleic acids include
2',3'-dideoxy-2',3'-didehydro-nucleosides (PCT/US2002/006460),
5'-substituted DNA and RNA derivatives (PCT/US2011/033961; Saha et
al., J. Org Chem., 1995, 60, 788-789; Wang et al., Bioorganic &
Medicinal Chemistry Letters, 1999, 9, 885-890; and Mikhailov et
al., Nucleosides & Nucleotides, 1991, 10(1-3), 339-343; Leonid
et al., 1995, 14(3-5), 901-905; and Eppacher et al., Helvetica
Chimica Acta, 2004, 87, 3004-3020; PCT/JP2000/004720;
PCT/JP2003/002342; PCT/JP2004/013216; PCT/JP2005/020435;
PCT/JP2006/315479; PCT/JP2006/324484; PCT/JP2009/056718;
PCT/JP2010/067560), or 5'-substituted monomers made as the
monophosphate with modified bases (Wang et al., Nucleosides
Nucleotides & Nucleic Acids, 2004, 23 (1 & 2),
317-337).
[0087] In some embodiments, unnatural nucleic acids include
modifications at the 5'-position and the 2'-position of the sugar
ring (PCT/US94/02993), such as 5'-CH.sub.2-substituted
2'-O-protected nucleosides (Wu et al., Helvetica Chimica Acta,
2000, 83, 1127-1143 and Wu et al., Bioconjugate Chem. 1999, 10,
921-924). In some cases, unnatural nucleic acids include amide
linked nucleoside dimers that have been prepared for incorporation into
oligonucleotides wherein the 3' linked nucleoside in the dimer (5'
to 3') comprises a 2'-OCH.sub.3 and a 5'-(S)--CH.sub.3 (Mesmaeker
et al., Synlett, 1997, 1287-1290). Unnatural nucleic acids can
include 2'-substituted 5'-CH.sub.2 (or O) modified nucleosides
(PCT/US92/01020). Unnatural nucleic acids can include
5'-methylenephosphonate DNA and RNA monomers, and dimers (Bohringer
et al., Tet. Lett., 1993, 34, 2723-2726; Collingwood et al.,
Synlett, 1995, 7, 703-705; and Hutter et al., Helvetica Chimica
Acta, 2002, 85, 2777-2806). Unnatural nucleic acids can include
5'-phosphonate monomers having a 2'-substitution (US2006/0074035)
and other modified 5'-phosphonate monomers (WO1997/35869).
Unnatural nucleic acids can include 5'-modified
methylenephosphonate monomers (EP614907 and EP629633). Unnatural
nucleic acids can include analogs of 5' or 6'-phosphonate
ribonucleosides comprising a hydroxyl group at the 5' and/or
6'-position (Chen et al., Phosphorus, Sulfur and Silicon, 2002,
177, 1783-1786; Jung et al., Bioorg. Med. Chem., 2000, 8,
2501-2509; Gallier et al., Eur. J. Org. Chem., 2007, 925-933; and
Hampton et al., J. Med. Chem., 1976, 19(8), 1029-1033). Unnatural
nucleic acids can include 5'-phosphonate deoxyribonucleoside
monomers and dimers having a 5'-phosphate group (Nawrot et al.,
Oligonucleotides, 2006, 16(1), 68-82). Unnatural nucleic acids can
include nucleosides having a 6'-phosphonate group wherein the 5'
or/and 6'-position is unsubstituted or substituted with a
thio-tert-butyl group (SC(CH.sub.3).sub.3) (and analogs thereof); a
methyleneamino group (CH.sub.2NH.sub.2) (and analogs thereof) or a
cyano group (CN) (and analogs thereof) (Fairhurst et al., Synlett,
2001, 4, 467-472; Kappler et al., J. Med. Chem., 1986, 29,
1030-1038; Kappler et al., J. Med. Chem., 1982, 25, 1179-1184;
Vrudhula et al., J. Med. Chem., 1987, 30, 888-894; Hampton et al.,
J. Med. Chem., 1976, 19, 1371-1377; Geze et al., J. Am. Chem. Soc,
1983, 105(26), 7638-7640; and Hampton et al., J. Am. Chem. Soc,
1973, 95(13), 4404-4414).
[0088] In some embodiments, unnatural nucleic acids also include
modifications of the sugar moiety. In some cases, nucleic acids
contain one or more nucleosides wherein the sugar group has been
modified. Such sugar modified nucleosides may impart enhanced
nuclease stability, increased binding affinity, or some other
beneficial biological property. In certain embodiments, nucleic
acids comprise a chemically modified ribofuranose ring moiety.
Examples of chemically modified ribofuranose rings include, without
limitation, addition of substituent groups (including 5' and/or 2'
substituent groups); bridging of two ring atoms to form bicyclic
nucleic acids (BNA); replacement of the ribosyl ring oxygen atom
with S, N(R), or C(R.sub.1)(R.sub.2) (R=H, C.sub.1-C.sub.12 alkyl or a
protecting group); and combinations thereof. Examples of chemically
modified sugars can be found in WO2008/101157, US2005/0130923, and
WO2007/134181.
[0089] In some instances, a modified nucleic acid comprises
modified sugars or sugar analogs. Thus, in addition to ribose and
deoxyribose, the sugar moiety can be pentose, deoxypentose, hexose,
deoxyhexose, glucose, arabinose, xylose, lyxose, or a sugar
"analog" cyclopentyl group. The sugar can be in a pyranosyl or
furanosyl form. The sugar moiety may be the furanoside of ribose,
deoxyribose, arabinose or 2'-O-alkylribose, and the sugar can be
attached to the respective heterocyclic bases either in [alpha] or
[beta] anomeric configuration. Sugar modifications include, but are
not limited to, 2'-alkoxy-RNA analogs, 2'-amino-RNA analogs,
2'-fluoro-DNA, and 2'-alkoxy- or amino-RNA/DNA chimeras. For
example, a sugar modification may include 2'-O-methyl-uridine or
2'-O-methyl-cytidine. Sugar modifications include
2'-O-alkyl-substituted deoxyribonucleosides and
2'-O-ethyleneglycol-like ribonucleosides. The preparation of these sugars or sugar
analogs and the respective "nucleosides" wherein such sugars or
analogs are attached to a heterocyclic base (nucleic acid base) is
known. Sugar modifications may also be made and combined with other
modifications.
[0090] Modifications to the sugar moiety include natural
modifications of the ribose and deoxy ribose as well as unnatural
modifications. Sugar modifications include, but are not limited to,
the following modifications at the 2' position: OH; F; O-, S-, or
N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or
O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be
substituted or unsubstituted C.sub.1 to C.sub.10 alkyl or C.sub.2
to C.sub.10 alkenyl and alkynyl. 2' sugar modifications also
include but are not limited to --O[(CH.sub.2).sub.nO].sub.m
CH.sub.3, --O(CH.sub.2).sub.nOCH.sub.3,
--O(CH.sub.2).sub.nNH.sub.2, --O(CH.sub.2).sub.nCH.sub.3,
--O(CH.sub.2).sub.nONH.sub.2, and
--O(CH.sub.2).sub.nON[(CH.sub.2).sub.nCH.sub.3].sub.2, where n and m
are from 1 to about 10.
[0091] Other modifications at the 2' position include but are not
limited to: C.sub.1 to C.sub.10 lower alkyl, substituted lower
alkyl, alkaryl, aralkyl, O-alkaryl, O-aralkyl, SH, SCH.sub.3, OCN,
Cl, Br, CN, CF.sub.3, OCF.sub.3, SOCH.sub.3, SO.sub.2 CH.sub.3,
ONO.sub.2, NO.sub.2, N.sub.3, NH.sub.2, heterocycloalkyl,
heterocycloalkaryl, aminoalkylamino, polyalkylamino, substituted
silyl, an RNA cleaving group, a reporter group, an intercalator, a
group for improving the pharmacokinetic properties of an
oligonucleotide, or a group for improving the pharmacodynamic
properties of an oligonucleotide, and other substituents having
similar properties. Similar modifications may also be made at other
positions on the sugar, particularly the 3' position of the sugar
on the 3' terminal nucleotide or in 2'-5' linked oligonucleotides
and the 5' position of the 5' terminal nucleotide. Modified sugars
also include those that contain modifications at the bridging ring
oxygen, such as CH.sub.2 and S. Nucleotide sugar analogs may also
have sugar mimetics such as cyclobutyl moieties in place of the
pentofuranosyl sugar. There are numerous United States patents that
teach the preparation of such modified sugar structures and which
detail and describe a range of base modifications, such as U.S.
Pat. Nos. 4,981,957; 5,118,800; 5,319,080; 5,359,044; 5,393,878;
5,446,137; 5,466,786; 5,514,785; 5,519,134; 5,567,811; 5,576,427;
5,591,722; 5,597,909; 5,610,300; 5,627,053; 5,639,873; 5,646,265;
5,658,873; 5,670,633; 4,845,205; 5,130,302; 5,134,066; 5,175,273;
5,367,066; 5,432,272; 5,457,187; 5,459,255; 5,484,908; 5,502,177;
5,525,711; 5,552,540; 5,587,469; 5,594,121; 5,596,091; 5,614,617;
5,681,941; and 5,700,920, each of which is herein incorporated by
reference in its entirety.
[0092] Examples of nucleic acids having modified sugar moieties
include, without limitation, nucleic acids comprising 5'-vinyl,
5'-methyl (R or S), 4'-S, 2'-F, 2'-OCH.sub.3, and
2'-O(CH.sub.2).sub.2OCH.sub.3 substituent groups. The substituent
at the 2' position can also be selected from allyl, amino, azido,
thio, O-allyl, O--(C.sub.1-C.sub.10 alkyl), OCF.sub.3,
O(CH.sub.2).sub.2SCH.sub.3,
O(CH.sub.2).sub.2--O--N(R.sub.m)(R.sub.n), and
O--CH.sub.2--C(.dbd.O)--N(R.sub.m)(R.sub.n), where each R.sub.m and
R.sub.n is, independently, H or substituted or unsubstituted
C.sub.1-C.sub.10 alkyl.
[0093] In certain embodiments, nucleic acids described herein
include one or more bicyclic nucleic acids. In certain such
embodiments, the bicyclic nucleic acid comprises a bridge between
the 4' and the 2' ribosyl ring atoms. In certain embodiments,
nucleic acids provided herein include one or more bicyclic nucleic
acids wherein the bridge comprises a 4' to 2' bicyclic nucleic
acid. Examples of such 4' to 2' bicyclic nucleic acids include, but
are not limited to, one of the formulae: 4'-(CH.sub.2)--O-2' (LNA);
4'-(CH.sub.2)--S-2'; 4'-(CH.sub.2).sub.2--O-2' (ENA);
4'-CH(CH.sub.3)--O-2' and 4'-CH(CH.sub.2OCH.sub.3)--O-2', and
analogs thereof (see, U.S. Pat. No. 7,399,845);
4'-C(CH.sub.3)(CH.sub.3)--O-2' and analogs thereof, (see
WO2009/006478, WO2008/150729, US2004/0171570, U.S. Pat. No.
7,427,672, Chattopadhyaya et al., J. Org. Chem., 2009, 74, 118-134,
and WO2008/154401). Also see, for example: Singh et al., Chem.
Commun., 1998, 4, 455-456; Koshkin et al., Tetrahedron, 1998, 54,
3607-3630; Wahlestedt et al., Proc. Natl. Acad. Sci. U.S.A, 2000,
97, 5633-5638; Kumar et al., Bioorg. Med. Chem. Lett., 1998, 8,
2219-2222; Singh et al., J. Org. Chem., 1998, 63, 10035-10039;
Srivastava et al., J. Am. Chem. Soc., 2007, 129(26) 8362-8379;
Elayadi et al., Curr. Opinion Invest. Drugs, 2001, 2, 558-561;
Braasch et al., Chem. Biol, 2001, 8, 1-7; Oram et al., Curr.
Opinion Mol. Ther., 2001, 3, 239-243; U.S. Pat. Nos. 4,849,513;
5,015,733; 5,118,800; 5,118,802; 7,053,207; 6,268,490; 6,770,748;
6,794,499; 7,034,133; 6,525,191; 6,670,461; and 7,399,845;
International Publication Nos. WO2004/106356, WO1994/14226,
WO2005/021570, WO2007/090071, and WO2007/134181; U.S. Patent
Publication Nos. US2004/0171570, US2007/0287831, and
US2008/0039618; U.S. Provisional Application Nos. 60/989,574,
61/026,995, 61/026,998, 61/056,564, 61/086,231, 61/097,787, and
61/099,844; and International Applications Nos. PCT/US2008/064591,
PCT/US2008/066154, PCT/US2008/068922, and PCT/DK98/00393.
[0094] In certain embodiments, nucleic acids comprise linked
nucleic acids. Nucleic acids can be linked together using any inter
nucleic acid linkage. The two main classes of inter nucleic acid
linking groups are defined by the presence or absence of a
phosphorus atom. Representative phosphorus containing inter nucleic
acid linkages include, but are not limited to, phosphodiesters,
phosphotriesters, methylphosphonates, phosphoramidate, and
phosphorothioates (P.dbd.S). Representative non-phosphorus
containing inter nucleic acid linking groups include, but are not
limited to, methylenemethylimino
(--CH.sub.2--N(CH.sub.3)--O--CH.sub.2--), thiodiester
(--O--C(O)--S--), thionocarbamate (--O--C(O)(NH)--S--); siloxane
(--O--Si(H).sub.2--O--); and N,N'-dimethylhydrazine
(--CH.sub.2--N(CH.sub.3)--N(CH.sub.3)--). In certain embodiments,
inter nucleic acid linkages having a chiral atom can be prepared
as a racemic mixture or as separate enantiomers, e.g.,
alkylphosphonates and phosphorothioates. Unnatural nucleic acids
can contain a single modification. Unnatural nucleic acids can
contain multiple modifications within one of the moieties or
between different moieties.
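The two classes of inter nucleic acid linking groups described above,
defined by the presence or absence of a phosphorus atom, can be
sketched as a lookup. This is a minimal Python illustration that
covers only a subset of the linkages named above; the function
linkage_class and the class labels it returns are hypothetical and
not part of the disclosure.

```python
# Illustrative subset of the linkages named above; not an exhaustive table.
PHOSPHORUS_LINKAGES = {
    "phosphodiester",
    "phosphotriester",
    "methylphosphonate",
    "phosphoramidate",
    "phosphorothioate",
}
NON_PHOSPHORUS_LINKAGES = {
    "methylenemethylimino",
    "thiodiester",
    "thionocarbamate",
    "siloxane",
}


def linkage_class(name):
    """Classify a linkage name by presence or absence of phosphorus.
    Names outside the illustrative subset are reported as "unknown"."""
    key = name.strip().lower()
    if key in PHOSPHORUS_LINKAGES:
        return "phosphorus-containing"
    if key in NON_PHOSPHORUS_LINKAGES:
        return "non-phosphorus"
    return "unknown"
```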
[0095] Backbone phosphate modifications to nucleic acid include,
but are not limited to, methyl phosphonate, phosphorothioate,
phosphoramidate (bridging or non-bridging), phosphotriester,
phosphorodithioate, phosphodithioate, and boranophosphate, and may
be used in any combination. Other non-phosphate linkages may also
be used.
[0096] In some embodiments, backbone modifications (e.g.,
methylphosphonate, phosphorothioate, phosphoramidate and
phosphorodithioate internucleotide linkages) can confer
immunomodulatory activity on the modified nucleic acid and/or
enhance its stability in vivo.
[0097] In some instances, a phosphorus derivative (or modified
phosphate group) is attached to the sugar or sugar analog moiety
and can be a monophosphate, diphosphate, triphosphate,
alkylphosphonate, phosphorothioate, phosphorodithioate,
phosphoramidate or the like. Exemplary polynucleotides containing
modified phosphate linkages or non-phosphate linkages can be found
in Peyrottes et al., 1996, Nucleic Acids Res. 24: 1841-1848;
Chaturvedi et al., 1996, Nucleic Acids Res. 24:2318-2323; and
Schultz et al., (1996) Nucleic Acids Res. 24:2966-2973; Matteucci,
1997, "Oligonucleotide Analogs: an Overview" in Oligonucleotides as
Therapeutic Agents, (Chadwick and Cardew, ed.) John Wiley and Sons,
New York, N.Y.; Zon, 1993, "Oligonucleoside Phosphorothioates" in
Protocols for Oligonucleotides and Analogs, Synthesis and
Properties, Humana Press, pp. 165-190; Miller et al., 1971, JACS
93:6657-6665; Jager et al., 1988, Biochem. 27:7247-7246; Nelson et
al., 1997, JOC 62:7278-7287; U.S. Pat. No. 5,453,496; and
Micklefield, 2001, Curr. Med. Chem. 8: 1157-1179.
[0098] In some cases, backbone modification comprises replacing the
phosphodiester linkage with an alternative moiety such as an
anionic, neutral or cationic group. Examples of such modifications
include: anionic internucleoside linkage; N3' to P5'
phosphoramidate modification; boranophosphate DNA;
prooligonucleotides; neutral internucleoside linkages such as
methylphosphonates; amide linked DNA; methylene(methylimino)
linkages; formacetal and thioformacetal linkages; backbones
containing sulfonyl groups; morpholino oligos; peptide nucleic
acids (PNA); and positively charged deoxyribonucleic guanidine
(DNG) oligos (Micklefield, 2001, Current Medicinal Chemistry 8:
1157-1179). A modified nucleic acid may comprise a chimeric or
mixed backbone comprising one or more modifications, e.g. a
combination of phosphate linkages such as a combination of
phosphodiester and phosphorothioate linkages.
[0099] Substitutes for the phosphate include, for example, short
chain alkyl or cycloalkyl internucleoside linkages, mixed
heteroatom and alkyl or cycloalkyl internucleoside linkages, or one
or more short chain heteroatomic or heterocyclic internucleoside
linkages. These include those having morpholino linkages (formed in
part from the sugar portion of a nucleoside); siloxane backbones;
sulfide, sulfoxide and sulfone backbones; formacetyl and
thioformacetyl backbones; methylene formacetyl and thioformacetyl
backbones; alkene containing backbones; sulfamate backbones;
methyleneimino and methylenehydrazino backbones; sulfonate and
sulfonamide backbones; amide backbones; and others having mixed N,
O, S and CH.sub.2 component parts. Numerous United States patents
disclose how to make and use these types of phosphate replacements
and include but are not limited to U.S. Pat. Nos. 5,034,506;
5,166,315; 5,185,444; 5,214,134; 5,216,141; 5,235,033; 5,264,562;
5,264,564; 5,405,938; 5,434,257; 5,466,677; 5,470,967; 5,489,677;
5,541,307; 5,561,225; 5,596,086; 5,602,240; 5,610,289; 5,608,046;
5,618,704; 5,623,070; 5,663,312; 5,633,360;
5,677,437; and 5,677,439. It is also understood in a nucleotide
substitute that both the sugar and the phosphate moieties of the
nucleotide can be replaced by, for example, an amide type linkage
(aminoethylglycine) (PNA). U.S. Pat. Nos. 5,539,082; 5,714,331; and
5,719,262 teach how to make and use PNA molecules, each of which is
herein incorporated by reference. See also Nielsen et al., Science,
1991, 254, 1497-1500. It is also possible to link other types of
molecules (conjugates) to nucleotides or nucleotide analogs to
enhance, for example, cellular uptake. Conjugates can be chemically
linked to the nucleotide or nucleotide analogs. Such conjugates
include but are not limited to lipid moieties such as a cholesterol
moiety (Letsinger et al., Proc. Natl. Acad. Sci. USA, 1989, 86,
6553-6556), cholic acid (Manoharan et al., Bioorg. Med. Chem. Let.,
1994, 4, 1053-1060), a thioether, e.g., hexyl-S-tritylthiol
(Manoharan et al., Ann. N.Y. Acad. Sci., 1992, 660, 306-309;
Manoharan et al., Bioorg. Med. Chem. Let., 1993, 3, 2765-2770), a
thiocholesterol (Oberhauser et al., Nucl. Acids Res., 1992, 20,
533-538), an aliphatic chain, e.g., dodecandiol or undecyl residues
(Saison-Behmoaras et al., EMBO J., 1991, 10, 1111-1118; Kabanov et
al., FEBS Lett., 1990, 259, 327-330; Svinarchuk et al., Biochimie,
1993, 75, 49-54), a phospholipid, e.g., di-hexadecyl-rac-glycerol
or triethylammonium 1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate
(Manoharan et al., Tetrahedron Lett., 1995, 36, 3651-3654; Shea et
al., Nucl. Acids Res., 1990, 18, 3777-3783), a polyamine or a
polyethylene glycol chain (Manoharan et al., Nucleosides &
Nucleotides, 1995, 14, 969-973), or adamantane acetic acid
(Manoharan et al., Tetrahedron Lett., 1995, 36, 3651-3654), a
palmityl moiety (Mishra et al., Biochim. Biophys. Acta, 1995, 1264,
229-237), or an octadecylamine or
hexylamino-carbonyl-oxycholesterol moiety (Crooke et al., J.
Pharmacol. Exp. Ther., 1996, 277, 923-937). Numerous United States
patents teach the preparation of such conjugates and include, but
are not limited to U.S. Pat. Nos. 4,828,979; 4,948,882; 5,218,105;
5,525,465; 5,541,313; 5,545,730; 5,552,538; 5,578,717; 5,580,731;
5,591,584; 5,109,124; 5,118,802; 5,138,045; 5,414,077;
5,486,603; 5,512,439; 5,578,718; 5,608,046; 4,587,044; 4,605,735;
4,667,025; 4,762,779; 4,789,737; 4,824,941; 4,835,263; 4,876,335;
4,904,582; 4,958,013; 5,082,830; 5,112,963; 5,214,136; 5,245,022;
5,254,469; 5,258,506; 5,262,536; 5,272,250; 5,292,873; 5,317,098;
5,371,241; 5,391,723; 5,416,203;
5,451,463; 5,510,475; 5,512,667; 5,514,785; 5,565,552; 5,567,810;
5,574,142; 5,585,481; 5,587,371; 5,595,726; 5,597,696; 5,599,923;
5,599,928 and 5,688,941.
Nucleic Acid Base Pairing Properties
[0100] In some embodiments, an unnatural nucleic acid forms a base
pair with another nucleic acid. In some embodiments, a stably
integrated unnatural nucleic acid is an unnatural nucleic acid that
can form a base pair with another nucleic acid, e.g., a natural or
unnatural nucleic acid. In some embodiments, a stably integrated
unnatural nucleic acid is an unnatural nucleic acid that can form a
base pair with another unnatural nucleic acid (unnatural nucleic
acid base pair (UBP)). For example, a first unnatural nucleic acid
can form a base pair with a second unnatural nucleic acid. For
example, one pair of unnatural nucleotide triphosphates that can
base pair when incorporated into nucleic acids includes a
triphosphate of d5SICS (d5SICSTP) and a triphosphate of dNaM
(dNaMTP). Such unnatural nucleotides can have a ribose or
deoxyribose sugar moiety. In some embodiments, an unnatural nucleic
acid does not substantially form a base pair with a natural nucleic
acid (A, T, G, C). In some embodiments, a stably integrated
unnatural nucleic acid can form a base pair with a natural nucleic
acid.
[0101] In some embodiments, a stably integrated unnatural nucleic
acid is an unnatural nucleic acid that can form a UBP, but does not
substantially form a base pair with any of the four natural
nucleic acids. In some embodiments, a stably integrated unnatural
nucleic acid is an unnatural nucleic acid that can form a UBP, but
does not substantially form a base pair with one or more natural
nucleic acids. For example, a stably integrated unnatural nucleic
acid may not substantially form a base pair with A, T, and C, but
can form a base pair with G. For example, a stably integrated
unnatural nucleic acid may not substantially form a base pair with
A, T, and G, but can form a base pair with C. For example, a
stably integrated unnatural nucleic acid may not substantially form
a base pair with C, G, and A, but can form a base pair with T. For
example, a stably integrated unnatural nucleic acid may not
substantially form a base pair with C, G, and T, but can form a
base pair with A. For example, a stably integrated unnatural
nucleic acid may not substantially form a base pair with A and T,
but can form a base pair with C and G. For example, a stably
integrated unnatural nucleic acid may not substantially form a base
pair with A and C, but can form a base pair with T and G. For
example, a stably integrated unnatural nucleic acid may not
substantially form a base pair with A and G, but can form a base
pair with C and T. For example, a stably integrated unnatural
nucleic acid may not substantially form a base pair with C and T,
but can form a base pair with A and G. For example, a stably
integrated unnatural nucleic acid may not substantially form a base
pair with C and G, but can form a base pair with A and T. For
example, a stably integrated unnatural nucleic acid may not
substantially form a base pair with T and G, but can form a base
pair with A and C. For example, a stably integrated unnatural
nucleic acid may not substantially form a base pair with G, but
can form a base pair with A, T, and C. For example, a stably
integrated unnatural nucleic acid may not substantially form a base
pair with A, but can form a base pair with G, T, and C. For
example, a stably integrated unnatural nucleic acid may not
substantially form a base pair with T, but can form a base pair
with G, A, and C. For example, a stably integrated unnatural
nucleic acid may not substantially form a base pair with C, but
can form a base pair with G, T, and A.
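The examples in the preceding paragraph each split the four natural
bases into a group with which the unnatural nucleic acid does not
substantially pair and a complementary group with which it can pair.
A minimal Python sketch, assuming each example is such a
complementary split with both groups non-empty, enumerates every
split of this kind; the function name pairing_patterns is
hypothetical and not part of the disclosure.

```python
from itertools import combinations

NATURAL = ("A", "T", "G", "C")


def pairing_patterns():
    """Return (non_pairing, pairing) tuples covering every way to split
    the four natural bases into two non-empty, complementary groups."""
    patterns = []
    for k in range(1, len(NATURAL)):
        for excluded in combinations(NATURAL, k):
            pairing = tuple(b for b in NATURAL if b not in excluded)
            patterns.append((excluded, pairing))
    return patterns
```

Under this assumption there are 4 + 6 + 4 = 14 splits, one for each
non-empty proper subset of the four natural bases.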
[0102] Exemplary unnatural nucleotides capable of forming an
unnatural DNA or RNA base pair (UBP) under conditions in vivo
include, but are not limited to, 5SICS, d5SICS, NaM, dNaM, and
combinations thereof. In some embodiments, unnatural nucleotides
include:
##STR00009##
Polymerase
[0103] A particularly useful function of a polymerase is to
catalyze the polymerization of a nucleic acid strand using an
existing nucleic acid as a template. Other functions that are
useful are described elsewhere herein. Examples of useful
polymerases include DNA polymerases and RNA polymerases.
[0104] The ability to improve specificity, processivity, or other
features of polymerases for unnatural nucleic acids would be highly
desirable in a variety of contexts where, e.g., unnatural nucleic
acid incorporation is desired, including amplification, sequencing,
labeling, detection, cloning, and many others. The present
invention provides polymerases with modified properties for
unnatural nucleic acids, methods of making such polymerases,
methods of using such polymerases, and many other features that
will become apparent upon a complete review of the following.
[0105] In some instances, disclosed herein are polymerases
that incorporate unnatural nucleic acids into a growing template
copy, e.g., during DNA amplification. In some embodiments,
polymerases can be modified such that the active site of the
polymerase is modified to reduce steric entry inhibition of the
unnatural nucleic acid into the active site. In some embodiments,
polymerases can be modified to provide complementarity with one or
more unnatural features of the unnatural nucleic acids. Such
polymerases can be expressed or engineered in cells for stably
incorporating a UBP into the cells. Accordingly, the invention
includes compositions that include a heterologous or recombinant
polymerase and methods of use thereof.
[0106] Polymerases can be modified using methods pertaining to
protein engineering. For example, molecular modeling can be carried
out based on crystal structures to identify the locations of the
polymerases where mutations can be made to modify a target
activity. A residue identified as a target for replacement can be
replaced with a residue selected using energy minimization
modeling, homology modeling, and/or conservative amino acid
substitutions, such as described in Bordo, et al. J Mol Biol 217:
721-729 (1991) and Hayes, et al. Proc Natl Acad Sci, USA 99:
15926-15931 (2002).
[0107] Any of a variety of polymerases can be used in a method or
composition set forth herein including, for example, protein-based
enzymes isolated from biological systems and functional variants
thereof. Reference to a particular polymerase, such as those
exemplified below, will be understood to include functional
variants thereof unless indicated otherwise. In some embodiments, a
polymerase is a wild type polymerase. In some embodiments, a
polymerase is a modified, or mutant, polymerase.
[0108] Polymerases, with features for improving entry of unnatural
nucleic acids into active site regions and for coordinating with
unnatural nucleotides in the active site region, can also be used.
In some embodiments, a modified polymerase has a modified
nucleotide binding site.
[0109] In some embodiments, a modified polymerase has a specificity
for an unnatural nucleic acid that is at least about 10%, 20%, 30%,
40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the
specificity of the wild type polymerase toward the unnatural
nucleic acid. In some embodiments, a modified or wild type
polymerase has a specificity for an unnatural nucleic acid
comprising a modified sugar that is at least about 10%, 20%, 30%,
40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the
specificity of the wild type polymerase toward a natural nucleic
acid and/or the unnatural nucleic acid without the modified sugar.
In some embodiments, a modified or wild type polymerase has a
specificity for an unnatural nucleic acid comprising a modified
base that is at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%,
90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the specificity of the wild
type polymerase toward a natural nucleic acid and/or the unnatural
nucleic acid without the modified base. In some embodiments, a
modified or wild type polymerase has a specificity for an unnatural
nucleic acid comprising a triphosphate that is at least about 10%,
20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%,
99.99% the specificity of the wild type polymerase toward a nucleic
acid comprising a triphosphate and/or the unnatural nucleic acid
without the triphosphate. For example, a modified or wild type
polymerase can have a specificity for an unnatural nucleic acid
comprising a triphosphate that is at least about 10%, 20%, 30%,
40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the
specificity of the wild type polymerase toward the unnatural
nucleic acid with a diphosphate or monophosphate, or no phosphate,
or a combination thereof.
[0110] In some embodiments, a modified or wild type polymerase has
a relaxed specificity for an unnatural nucleic acid. In some
embodiments, a modified or wild type polymerase has a specificity
for an unnatural nucleic acid and a specificity to a natural
nucleic acid that is at least about 10%, 20%, 30%, 40%, 50%, 60%,
70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the specificity of
the wild type polymerase toward the natural nucleic acid. In some
embodiments, a modified or wild type polymerase has a specificity
for an unnatural nucleic acid comprising a modified sugar and a
specificity to a natural nucleic acid that is at least about 10%,
20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%,
99.99% the specificity of the wild type polymerase toward the
natural nucleic acid. In some embodiments, a modified or wild type
polymerase has a specificity for an unnatural nucleic acid
comprising a modified base and a specificity to a natural nucleic
acid that is at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%,
90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the specificity of the wild
type polymerase toward the natural nucleic acid.
[0111] Absence of exonuclease activity can be a wild type
characteristic or a characteristic imparted by a variant or
engineered polymerase. For example, an exo minus Klenow fragment is
a mutated version of Klenow fragment that lacks 3' to 5'
proofreading exonuclease activity.
[0112] The method of the invention may be used to expand the
substrate range of any DNA polymerase which lacks an intrinsic 3' to
5' exonuclease proofreading activity or where a 3' to 5' exonuclease
proofreading activity has been disabled, e.g. through mutation.
Examples of DNA polymerases include polA, polB (see e.g. Patel
& Loeb, Nature Struc Biol 2001), polC, polD, polY, polX and
reverse transcriptases (RT) but preferably are processive,
high-fidelity polymerases (PCT/GB2004/004643). In some embodiments
a modified or wild type polymerase substantially lacks 3' to 5'
proofreading exonuclease activity. In some embodiments a modified
or wild type polymerase substantially lacks 3' to 5' proofreading
exonuclease activity for an unnatural nucleic acid. In some
embodiments, a modified or wild type polymerase has a 3' to 5'
proofreading exonuclease activity. In some embodiments, a modified
or wild type polymerase has a 3' to 5' proofreading exonuclease
activity for a natural nucleic acid and substantially lacks 3' to
5' proofreading exonuclease activity for an unnatural nucleic
acid.
[0113] In some embodiments, a modified polymerase has a 3' to 5'
proofreading exonuclease activity that is at least about 60%, 70%,
80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the proofreading
exonuclease activity of the wild type polymerase. In some
embodiments, a modified polymerase has a 3' to 5' proofreading
exonuclease activity for an unnatural nucleic acid that is at least
about 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the
proofreading exonuclease activity of the wild type polymerase to a
natural nucleic acid. In some embodiments, a modified polymerase
has a 3' to 5' proofreading exonuclease activity for an unnatural
nucleic acid and a 3' to 5' proofreading exonuclease activity for a
natural nucleic acid that is at least about 60%, 70%, 80%, 90%,
95%, 97%, 98%, 99%, 99.5%, 99.99% the proofreading exonuclease
activity of the wild type polymerase to a natural nucleic acid. In
some embodiments, a modified polymerase has a 3' to 5' proofreading
exonuclease activity for a natural nucleic acid that is at least
about 60%, 70%, 80%, 90%, 95%, 97%, 98%, 99%, 99.5%, 99.99% the
proofreading exonuclease activity of the wild type polymerase to
the natural nucleic acid.
[0114] In some embodiments, polymerases are characterized according
to their rate of dissociation from nucleic acids. In some
embodiments a polymerase has a relatively low dissociation rate for
one or more natural and unnatural nucleic acids. In some
embodiments a polymerase has a relatively high dissociation rate
for one or more natural and unnatural nucleic acids. The
dissociation rate is an activity of a polymerase that can be
adjusted to tune reaction rates in methods set forth herein.
[0115] In some embodiments, polymerases are characterized according
to their fidelity when used with a particular natural and/or
unnatural nucleic acid or collections of natural and/or unnatural
nucleic acid. Fidelity generally refers to the accuracy with which
a polymerase incorporates correct nucleic acids into a growing
nucleic acid chain when making a copy of a nucleic acid template.
DNA polymerase fidelity can be measured as the ratio of correct to
incorrect natural and unnatural nucleic acid incorporations when
the natural and unnatural nucleic acid are present, e.g., at equal
concentrations, to compete for strand synthesis at the same site in
the polymerase-strand-template nucleic acid binary complex. DNA
polymerase fidelity can be calculated as the ratio of
(k.sub.cat/K.sub.m) for the correct natural and unnatural nucleic acid and
(k.sub.cat/K.sub.m) for the incorrect natural and unnatural nucleic
acid; where k.sub.cat and K.sub.m are Michaelis-Menten parameters
in steady state enzyme kinetics (Fersht, A. R. (1985) Enzyme
Structure and Mechanism, 2nd ed., p 350, W. H. Freeman & Co.,
New York., incorporated herein by reference). In some embodiments,
a polymerase has a fidelity value of at least about 100, 1000,
10,000, 100,000, or 1.times.10.sup.6, with or without a
proofreading activity.
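The fidelity calculation described above can be sketched in Python as follows. This is a minimal illustration only: the function names and the kinetic values are hypothetical and are not measurements from this disclosure. It assumes that fidelity is taken as the ratio of (k.sub.cat/K.sub.m) for correct incorporation to (k.sub.cat/K.sub.m) for incorrect incorporation.

```python
def catalytic_efficiency(k_cat: float, k_m: float) -> float:
    """Return k_cat / K_m, the steady-state catalytic efficiency."""
    if k_m <= 0:
        raise ValueError("K_m must be positive")
    return k_cat / k_m


def fidelity(k_cat_correct: float, k_m_correct: float,
             k_cat_incorrect: float, k_m_incorrect: float) -> float:
    """Ratio of catalytic efficiency for correct vs. incorrect incorporation."""
    incorrect = catalytic_efficiency(k_cat_incorrect, k_m_incorrect)
    if incorrect <= 0:
        raise ValueError("efficiency for incorrect incorporation must be positive")
    return catalytic_efficiency(k_cat_correct, k_m_correct) / incorrect


# Hypothetical kinetic parameters (k_cat in 1/s, K_m in micromolar).
example_fidelity = fidelity(
    k_cat_correct=10.0, k_m_correct=2.0,
    k_cat_incorrect=0.01, k_m_incorrect=20.0,
)
```

With these illustrative inputs the fidelity value is on the order of 10,000, which falls within the range of fidelity values recited above.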
[0116] In some embodiments, polymerases from native sources or
variants thereof are screened using an assay that detects
incorporation of an unnatural nucleic acid having a particular
structure. In one example, polymerases can be screened for the
ability to incorporate an unnatural nucleic acid or UBP; e.g.,
d5SICSTP, dNaMTP, or d5SICSTP-dNaMTP UBP. A polymerase, e.g., a
heterologous polymerase, can be used that displays a modified
property for the unnatural nucleic acid as compared to the
wild-type polymerase. For example, the modified property can be,
e.g., K.sub.m, k.sub.cat, V.sub.max, polymerase processivity in the
presence of an unnatural nucleic acid (or of a naturally occurring
nucleotide), average template read-length by the polymerase in the
presence of an unnatural nucleic acid, specificity of the
polymerase for an unnatural nucleic acid, rate of binding of an
unnatural nucleic acid, rate of product (pyrophosphate,
triphosphate, etc.) release, branching rate, or any combination
thereof. In one embodiment, the modified property is a reduced
K.sub.m for an unnatural nucleic acid and/or an increased
k.sub.cat/K.sub.m or V.sub.max/K.sub.m for an unnatural nucleic
acid. Similarly, the polymerase optionally has an increased rate of
binding of an unnatural nucleic acid, an increased rate of product
release, and/or a decreased branching rate, as compared to a
wild-type polymerase.
[0117] At the same time, a polymerase can incorporate natural
nucleic acids, e.g., A, C, G, and T, into a growing nucleic acid
copy. For example, a polymerase optionally displays a specific
activity for a natural nucleic acid that is at least about 5% as
high (e.g., 5%, 10%, 25%, 50%, 75%, 100% or higher), as a
corresponding wild-type polymerase and a processivity with natural
nucleic acids in the presence of a template that is at least 5% as
high (e.g., 5%, 10%, 25%, 50%, 75%, 100% or higher) as the
wild-type polymerase in the presence of the natural nucleic acid.
Optionally, the polymerase displays a k.sub.cat/K.sub.m or
V.sub.max/K.sub.m for a naturally occurring nucleotide that is at
least about 5% as high (e.g., about 5%, 10%, 25%, 50%, 75% or 100%
or higher) as the wild-type polymerase.
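The comparisons to a wild-type polymerase described above amount to a simple ratio test. The following Python sketch shows one way such a comparison could be expressed; the function names, the default threshold of 5%, and the example numbers are assumptions made for illustration and do not come from this disclosure.

```python
def relative_activity(variant_value: float, wild_type_value: float) -> float:
    """Return the variant's value as a fraction of the wild-type value."""
    if wild_type_value <= 0:
        raise ValueError("wild-type value must be positive")
    return variant_value / wild_type_value


def meets_threshold(variant_value: float, wild_type_value: float,
                    threshold: float = 0.05) -> bool:
    """True if the variant retains at least `threshold` of wild-type activity."""
    return relative_activity(variant_value, wild_type_value) >= threshold


# Hypothetical k_cat/K_m values for a natural nucleotide (arbitrary units).
retains_natural_activity = meets_threshold(variant_value=0.6, wild_type_value=10.0)
```

The same test can be applied to any of the listed properties (specific activity, processivity, k.sub.cat/K.sub.m, or V.sub.max/K.sub.m) provided that the variant and wild-type values are measured under the same conditions.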
[0118] Polymerases used herein that can have the ability to
incorporate an unnatural nucleic acid of a particular structure can
also be produced using a directed evolution approach. A nucleic
acid synthesis assay can be used to screen for polymerase variants
having specificity for any of a variety of unnatural nucleic acids.
For example, polymerase variants can be screened for the ability to
incorporate an unnatural nucleic acid or UBP; e.g., d5SICSTP,
dNaMTP, or d5SICSTP-dNaMTP UBP into nucleic acids. In some
embodiments, such an assay is an in vitro assay, e.g., using a
recombinant polymerase variant. In some embodiments, such an assay
is an in vivo assay, e.g., expressing a polymerase variant in a
cell. Such directed evolution techniques can be used to screen
variants of any suitable polymerase for activity toward any of the
unnatural nucleic acids set forth herein.
[0119] Modified polymerases of the compositions described can
optionally be a modified and/or recombinant .PHI.29-type DNA
polymerase. Optionally, the polymerase can be a modified and/or
recombinant .PHI.29, B103, GA-1, PZA, .PHI.15, BS32, M2Y, Nf, G1,
Cp-1, PRD1, PZE, SF5, Cp-5, Cp-7, PR4, PR5, PR722, or L17
polymerase.
[0120] Nucleic acid polymerases generally useful in the invention
include DNA polymerases, RNA polymerases, reverse transcriptases,
and mutant or altered forms thereof. DNA polymerases and their
properties are described in detail in, among other places, DNA
Replication 2.sup.nd edition, Kornberg and Baker, W. H. Freeman,
New York, N.Y. (1991). Known conventional DNA polymerases useful in
the invention include, but are not limited to, Pyrococcus furiosus
(Pfu) DNA polymerase (Lundberg et al., 1991, Gene, 108: 1,
Stratagene), Pyrococcus woesei (Pwo) DNA polymerase (Hinnisdaels et
al., 1996, Biotechniques, 20:186-8, Boehringer Mannheim), Thermus
thermophilus (Tth) DNA polymerase (Myers and Gelfand 1991,
Biochemistry 30:7661), Bacillus stearothermophilus DNA polymerase
(Stenesh and McGowan, 1977, Biochim Biophys Acta 475:32),
Thermococcus litoralis (Tli) DNA polymerase (also referred to as
Vent.TM. DNA polymerase, Cariello et al, 1991, Nucleic Acids Res,
19: 4193, New England Biolabs), 9.degree. Nm.TM. DNA polymerase
(New England Biolabs), Stoffel fragment, Thermo Sequenase.RTM.
(Amersham Pharmacia Biotech UK), Therminator.TM. (New England
Biolabs), Thermotoga maritima (Tma) DNA polymerase (Diaz and
Sabino, 1998 Braz J Med. Res, 31:1239), Thermus aquaticus (Taq) DNA
polymerase (Chien et al, 1976, J. Bacteriol, 127: 1550), DNA
polymerase, Pyrococcus kodakaraensis KOD DNA polymerase (Takagi et
al., 1997, Appl. Environ. Microbiol. 63:4504), JDF-3 DNA polymerase
(from Thermococcus sp. JDF-3, Patent application WO 0132887),
Pyrococcus GB-D (PGB-D) DNA polymerase (also referred to as Deep
Vent.TM. DNA polymerase, Juncosa-Ginesta et al., 1994,
Biotechniques, 16:820, New England Biolabs), UlTma DNA polymerase
(from thermophile Thermotoga maritima; Diaz and Sabino, 1998 Braz
J. Med. Res, 31:1239; PE Applied Biosystems), Tgo DNA polymerase
(from Thermococcus gorgonarius, Roche Molecular Biochemicals), E.
coli DNA polymerase I (Lecomte and Doubleday, 1983, Nucleic Acids
Res. 11:7505), T7 DNA polymerase (Nordstrom et al, 1981, J Biol.
Chem. 256:3112), and archaeal DP1I/DP2 DNA polymerase II (Cann et
al, 1998, Proc. Natl. Acad. Sci. USA 95:14250). Both mesophilic
polymerases and thermophilic polymerases are contemplated.
Thermophilic DNA polymerases include, but are not limited to,
ThermoSequenase.RTM., 9.degree.Nm.TM., Therminator.TM., Taq, Tne,
Tma, Pfu, Tfl, Tth, Tli, Stoffel fragment, Vent.TM. and Deep
Vent.TM. DNA polymerase, KOD DNA polymerase, Tgo, JDF-3, and
mutants, variants and derivatives thereof. A polymerase that is a 3'
exonuclease-deficient mutant is also contemplated. Reverse
transcriptases useful in the invention include, but are not limited
to, reverse transcriptases from HIV, HTLV-I, HTLV-II, FeLV, FIV,
SIV, AMV, MMTV, MoMuLV and other retroviruses (see Levin, Cell
88:5-8 (1997); Verma, Biochim Biophys Acta. 473:1-38 (1977); Wu et
al, CRC Crit Rev Biochem. 3:289-347 (1975)). Further examples of
polymerases include, but are not limited to 9.degree. N DNA
Polymerase, Taq DNA polymerase, Phusion.RTM. DNA polymerase, Pfu
DNA polymerase, RB69 DNA polymerase, KOD DNA polymerase, and
VentR.RTM. DNA polymerase (Gardner et al. (2004) "Comparative
Kinetics of Nucleotide Analog Incorporation by Vent DNA Polymerase"
J. Biol. Chem., 279(12), 11834-11842; Gardner and Jack
"Determinants of nucleotide sugar recognition in an archaeon DNA
polymerase" Nucleic Acids Research, 27(12) 2545-2553). Polymerases
isolated from non-thermophilic organisms can be heat inactivatable.
Examples are DNA polymerases from phage. It will be understood that
polymerases from any of a variety of sources can be modified to
increase or decrease their tolerance to high temperature
conditions. In some embodiments, a polymerase can be thermophilic.
In some embodiments, a thermophilic polymerase can be heat
inactivatable. Thermophilic polymerases are typically useful for
high temperature conditions or in thermocycling conditions such as
those employed for polymerase chain reaction (PCR) techniques.
[0121] In some embodiments, the polymerase comprises .PHI.29, B103,
GA-1, PZA, .PHI.15, BS32, M2Y, Nf, G1, Cp-1, PRD1, PZE, SF5, Cp-5,
Cp-7, PR4, PR5, PR722, L17, ThermoSequenase.RTM., 9.degree. Nm.TM.,
Therminator.TM. DNA polymerase, Tne, Tma, Tfl, Tth, Tli, Stoffel
fragment, Vent.TM. and Deep Vent.TM. DNA polymerase, KOD DNA
polymerase, Tgo, JDF-3, Pfu, Taq, T7 DNA polymerase, T7 RNA
polymerase, PGB-D, UlTma DNA polymerase, E. coli DNA polymerase I,
E. coli DNA polymerase III, archaeal DP1I/DP2 DNA polymerase II,
9.degree. N DNA Polymerase, Taq DNA polymerase, Phusion.RTM. DNA
polymerase, Pfu DNA polymerase, SP6 RNA polymerase, RB69 DNA
polymerase, Avian Myeloblastosis Virus (AMV) reverse transcriptase,
Moloney Murine Leukemia Virus (MMLV) reverse transcriptase,
SuperScript.RTM. II reverse transcriptase, and SuperScript.RTM. III
reverse transcriptase.
[0122] In some embodiments, the polymerase is DNA polymerase
I-Klenow fragment, Vent polymerase, Phusion.RTM. DNA polymerase,
KOD DNA polymerase, Taq polymerase, T7 DNA polymerase, T7 RNA
polymerase, Therminator.TM. DNA polymerase, POLB polymerase, SP6
RNA polymerase, E. coli DNA polymerase I, E. coli DNA polymerase
III, Avian Myeloblastosis Virus (AMV) reverse transcriptase,
Moloney Murine Leukemia Virus (MMLV) reverse transcriptase,
SuperScript.RTM. II reverse transcriptase, or SuperScript.RTM. III
reverse transcriptase.
[0123] Additionally, such polymerases can be used for DNA
amplification and/or sequencing applications, including real-time
applications, e.g., in the context of amplification or sequencing
that include incorporation of unnatural nucleic acid residues into
DNA by the polymerase. In other embodiments, the unnatural nucleic
acid that is incorporated can be the same as a natural residue,
e.g., where a label or other moiety of the unnatural nucleic acid
is removed by action of the polymerase during incorporation, or the
unnatural nucleic acid can have one or more features that
distinguish it from a natural nucleic acid.
Nucleic Acid Reagents & Tools
[0124] A nucleic acid reagent for use with a method, cell, or
engineered microorganism described herein comprises one or more
ORFs. An ORF may be from any suitable source, sometimes from
genomic DNA, mRNA, reverse transcribed RNA or complementary DNA
(cDNA) or a nucleic acid library comprising one or more of the
foregoing, and is from any organism species that contains a nucleic
acid sequence of interest, protein of interest, or activity of
interest. Non-limiting examples of organisms from which an ORF can
be obtained include bacteria, yeast, fungi, human, insect,
nematode, bovine, equine, canine, feline, rat or mouse, for
example. In some embodiments, a nucleic acid reagent or other
reagent described herein is isolated or purified.
[0125] A nucleic acid reagent sometimes comprises a nucleotide
sequence adjacent to an ORF that is translated in conjunction with
the ORF and encodes an amino acid tag. The tag-encoding nucleotide
sequence is located 3' and/or 5' of an ORF in the nucleic acid
reagent, thereby encoding a tag at the C-terminus or N-terminus of
the protein or peptide encoded by the ORF. Any tag that does not
abrogate in vitro transcription and/or translation may be utilized
and may be appropriately selected by the artisan. Tags may
facilitate isolation and/or purification of the desired ORF product
from culture or fermentation media.
[0126] A nucleic acid or nucleic acid reagent can comprise certain
elements, e.g., regulatory elements, often selected according to
the intended use of the nucleic acid. Any of the following elements
can be included in or excluded from a nucleic acid reagent. A
nucleic acid reagent, for example, may include one or more or all
of the following nucleotide elements: one or more promoter
elements, one or more 5' untranslated regions (5'UTRs), one or more
regions into which a target nucleotide sequence may be inserted (an
"insertion element"), one or more target nucleotide sequences, one
or more 3' untranslated regions (3'UTRs), and one or more selection
elements. A nucleic acid reagent can be provided with one or more
of such elements and other elements may be inserted into the
nucleic acid before the nucleic acid is introduced into the desired
organism. In some embodiments, a provided nucleic acid reagent
comprises a promoter, 5'UTR, optional 3'UTR and insertion
element(s) by which a target nucleotide sequence is inserted (i.e.,
cloned) into the nucleic acid reagent. In certain embodiments, a
provided nucleic acid reagent comprises a promoter, insertion
element(s) and optional 3'UTR, and a 5' UTR/target nucleotide
sequence is inserted with an optional 3'UTR. The elements can be
arranged in any order suitable for expression in the chosen
expression system (e.g., expression in a chosen organism, or
expression in a cell free system, for example), and in some
embodiments a nucleic acid reagent comprises the following elements
in the 5' to 3' direction: (1) promoter element, 5'UTR, and
insertion element(s); (2) promoter element, 5'UTR, and target
nucleotide sequence; (3) promoter element, 5'UTR, insertion
element(s) and 3'UTR; and (4) promoter element, 5'UTR, target
nucleotide sequence and 3'UTR.
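The four 5' to 3' element arrangements enumerated above can be represented as ordered tuples. The following Python sketch is illustrative only: the dictionary, the element labels, and the function name are hypothetical conveniences, not part of the disclosed reagents. It checks whether a given ordered list of elements matches one of arrangements (1) through (4).

```python
# The four 5'-to-3' arrangements listed above, keyed by their number.
ARRANGEMENTS = {
    1: ("promoter", "5'UTR", "insertion element"),
    2: ("promoter", "5'UTR", "target sequence"),
    3: ("promoter", "5'UTR", "insertion element", "3'UTR"),
    4: ("promoter", "5'UTR", "target sequence", "3'UTR"),
}


def match_arrangement(elements):
    """Return the number of the matching arrangement, or None if none matches.

    `elements` is the reagent's element labels in 5'-to-3' order.
    """
    ordered = tuple(elements)
    for number, layout in ARRANGEMENTS.items():
        if ordered == layout:
            return number
    return None


# A reagent carrying a cloned target sequence and a 3' UTR.
arrangement = match_arrangement(["promoter", "5'UTR", "target sequence", "3'UTR"])
```

Because the elements can be arranged in any order suitable for the chosen expression system, a result of None here indicates only that the order is not one of the four enumerated examples, not that the reagent is unsuitable.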
[0127] Nucleic acid reagents, e.g., expression cassettes and/or
expression vectors, can include a variety of regulatory elements,
including promoters, enhancers, translational initiation sequences,
transcription termination sequences and other elements. A
"promoter" is generally a sequence or sequences of DNA that
function when in a relatively fixed location in regard to the
transcription start site. For example, the promoter can be upstream
of the nucleotide triphosphate transporter nucleic acid segment. A
"promoter" contains core elements required for basic interaction of
RNA polymerase and transcription factors and can contain upstream
elements and response elements. "Enhancer" generally refers to a
sequence of DNA that functions at no fixed distance from the
transcription start site and can be either 5' or 3' to the
transcription unit. Furthermore, enhancers can be within an intron
as well as within the coding sequence itself. They are usually
between 10 and 300 bp in length, and they function in cis.
Enhancers function to increase transcription from nearby promoters.
Enhancers, like promoters, also often contain response elements
that mediate the regulation of transcription. Enhancers often
determine the regulation of expression.
[0128] As noted above, nucleic acid reagents may also comprise one
or more 5' UTRs and one or more 3' UTRs. For example, expression
vectors used in prokaryotic host cells (e.g., virus, bacterium) can
contain sequences that signal for the termination of transcription
which can affect mRNA expression. These regions can be transcribed
as polyadenylated segments in the untranslated portion of the mRNA
encoding tissue factor protein. The 3' untranslated regions also
include transcription termination sites. In some preferred
embodiments, a transcription unit comprises a polyadenylation
region. One benefit of this region is that it increases the
likelihood that the transcribed unit will be processed and
transported like mRNA. The identification and use of
polyadenylation signals in expression constructs is well
established. In some preferred embodiments, homologous
polyadenylation signals can be used in the transgene
constructs.
[0129] A 5' UTR may comprise one or more elements endogenous to the
nucleotide sequence from which it originates, and sometimes
includes one or more exogenous elements. A 5' UTR can originate
from any suitable nucleic acid, such as genomic DNA, plasmid DNA,
RNA or mRNA, for example, from any suitable organism (e.g., virus,
bacterium, yeast, fungi, plant, insect or mammal). The artisan may
select appropriate elements for the 5' UTR based upon the chosen
expression system (e.g., expression in a chosen organism, or
expression in a cell free system, for example). A 5' UTR sometimes
comprises one or more of the following elements known to the
artisan: enhancer sequences (e.g., transcriptional or
translational), transcription initiation site, transcription factor
binding site, translation regulation site, translation initiation
site, translation factor binding site, accessory protein binding
site, feedback regulation agent binding sites, Pribnow box, TATA
box, -35 element, E-box (helix-loop-helix binding element),
ribosome binding site, replicon, internal ribosome entry site
(IRES), silencer element and the like. In some embodiments, a
promoter element may be isolated such that all 5' UTR elements
necessary for proper conditional regulation are contained in the
promoter element fragment, or within a functional subsequence of a
promoter element fragment.
[0130] A 5' UTR in the nucleic acid reagent can comprise a
translational enhancer nucleotide sequence. A translational
enhancer nucleotide sequence often is located between the promoter
and the target nucleotide sequence in a nucleic acid reagent. A
translational enhancer sequence often binds to a ribosome,
sometimes is an 18S rRNA-binding ribonucleotide sequence (i.e., a
40S ribosome binding sequence) and sometimes is an internal
ribosome entry sequence (IRES). An IRES generally forms an RNA
scaffold with precisely placed RNA tertiary structures that contact
a 40S ribosomal subunit via a number of specific intermolecular
interactions. Examples of ribosomal enhancer sequences are known
and can be identified by the artisan (e.g., Mignone et al., Nucleic
Acids Research 33: D141-D146 (2005); Paulous et al., Nucleic Acids
Research 31: 722-733 (2003); Akbergenov et al., Nucleic Acids
Research 32: 239-247 (2004); Mignone et al., Genome Biology 3(3):
reviews0004.1-0004.10 (2002); Gallie, Nucleic Acids Research 30:
3401-3411 (2002); Shaloiko et al., DOI: 10.1002/bit.20267; and
Gallie et al., Nucleic Acids Research 15: 3257-3273 (1987)).
[0131] A translational enhancer sequence sometimes is a eukaryotic
sequence, such as a Kozak consensus sequence or other sequence
(e.g., hydroid polyp sequence, GenBank accession no. U07128). A
translational enhancer sequence sometimes is a prokaryotic
sequence, such as a Shine-Dalgarno consensus sequence. In certain
embodiments, the translational enhancer sequence is a viral
nucleotide sequence. A translational enhancer sequence sometimes is
from a 5' UTR of a plant virus, such as Tobacco Mosaic Virus (TMV),
Alfalfa Mosaic Virus (AMV), Tobacco Etch Virus (TEV), Potato Virus
Y (PVY), Turnip Mosaic (poty) Virus and Pea Seed Borne Mosaic
Virus, for example. In certain embodiments, an omega sequence about
67 bases in length from TMV is included in the nucleic acid reagent
as a translational enhancer sequence (e.g., devoid of guanosine
nucleotides and includes a 25 nucleotide long poly (CAA) central
region).
[0132] A 3' UTR may comprise one or more elements endogenous to the
nucleotide sequence from which it originates and sometimes includes
one or more exogenous elements. A 3' UTR may originate from any
suitable nucleic acid, such as genomic DNA, plasmid DNA, RNA or
mRNA, for example, from any suitable organism (e.g., a virus,
bacterium, yeast, fungi, plant, insect or mammal). The artisan can
select appropriate elements for the 3' UTR based upon the chosen
expression system (e.g., expression in a chosen organism, for
example). A 3' UTR sometimes comprises one or more of the following
elements known to the artisan: transcription regulation site,
transcription initiation site, transcription termination site,
transcription factor binding site, translation regulation site,
translation termination site, translation initiation site,
translation factor binding site, ribosome binding site, replicon,
enhancer element, silencer element and polyadenosine tail. A 3' UTR
often includes a polyadenosine tail and sometimes does not, and if
a polyadenosine tail is present, one or more adenosine moieties may
be added or deleted from it (e.g., about 5, about 10, about 15,
about 20, about 25, about 30, about 35, about 40, about 45 or about
50 adenosine moieties may be added or subtracted).
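The addition or deletion of adenosine moieties from a polyadenosine tail can be illustrated with a short Python sketch. The function name and the example sequence are hypothetical. The sketch assumes the 3' UTR is written 5' to 3' as an uppercase string and that the tail is the run of "A" characters at its 3' end.

```python
def adjust_poly_a_tail(utr: str, delta: int) -> str:
    """Add (delta > 0) or remove (delta < 0) adenosines at the 3' end.

    The tail is taken to be the trailing run of 'A' characters; removing
    more adenosines than are present leaves no tail.
    """
    body = utr.rstrip("A")               # sequence upstream of the tail
    tail_length = len(utr) - len(body)   # current number of tail adenosines
    new_length = max(tail_length + delta, 0)
    return body + "A" * new_length


# Hypothetical 3' UTR fragment with a five-adenosine tail.
original = "GCUAAAAA"
lengthened = adjust_poly_a_tail(original, 5)    # tail of ten adenosines
shortened = adjust_poly_a_tail(original, -10)   # tail removed entirely
```

Note that the sketch treats every trailing "A" as part of the tail, so any adenosines that end the upstream sequence are counted with it.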
[0133] In some embodiments, modification of a 5' UTR and/or a 3'
UTR is used to alter (e.g., increase, add, decrease or
substantially eliminate) the activity of a promoter. Alteration of
the promoter activity can in turn alter the activity of a peptide,
polypeptide or protein (e.g., enzyme activity for example), by a
change in transcription of the nucleotide sequence(s) of interest
from an operably linked promoter element comprising the modified 5'
or 3' UTR. For example, a microorganism can be engineered by
genetic modification to express a nucleic acid reagent comprising a
modified 5' or 3' UTR that can add a novel activity (e.g., an
activity not normally found in the host organism) or increase the
expression of an existing activity by increasing transcription from
a homologous or heterologous promoter operably linked to a
nucleotide sequence of interest (e.g., homologous or heterologous
nucleotide sequence of interest), in certain embodiments. In some
embodiments, a microorganism can be engineered by genetic
modification to express a nucleic acid reagent comprising a
modified 5' or 3' UTR that can decrease the expression of an
activity by decreasing or substantially eliminating transcription
from a homologous or heterologous promoter operably linked to a
nucleotide sequence of interest, in certain embodiments.
[0134] Expression of a nucleotide triphosphate transporter from an
expression cassette or expression vector can be controlled by any
promoter capable of expression in prokaryotic cells. A promoter
element typically is required for DNA synthesis and/or RNA
synthesis. A promoter element often comprises a region of DNA that
can facilitate the transcription of a particular gene, by providing
a start site for the synthesis of RNA corresponding to a gene.
Promoters generally are located near the genes they regulate, are
located upstream of the gene (e.g., 5' of the gene), and are on the
same strand of DNA as the sense strand of the gene, in some
embodiments. In some embodiments, a promoter element can be
isolated from a gene or organism and inserted in functional
connection with a polynucleotide sequence to allow altered and/or
regulated expression. A non-native promoter (e.g., promoter not
normally associated with a given nucleic acid sequence) used for
expression of a nucleic acid often is referred to as a heterologous
promoter. In certain embodiments, a heterologous promoter and/or a
5'UTR can be inserted in functional connection with a
polynucleotide that encodes a polypeptide having a desired activity
as described herein. The terms "operably linked" and "in functional
connection with" as used herein with respect to promoters, refer to
a relationship between a coding sequence and a promoter element.
The promoter is operably linked or in functional connection with
the coding sequence when expression from the coding sequence via
transcription is regulated, or controlled by, the promoter element.
The terms "operably linked" and "in functional connection with" are
utilized interchangeably herein with respect to promoter
elements.
[0135] A promoter often interacts with an RNA polymerase. A
polymerase is an enzyme that catalyzes synthesis of nucleic acids
using a preexisting nucleic acid reagent. When the template is a
DNA template, an RNA molecule is transcribed before protein is
synthesized. Enzymes having polymerase activity suitable for use in
the present methods include any polymerase that is active in the
chosen system with the chosen template to synthesize protein. In
some embodiments, a promoter (e.g., a heterologous promoter) also
referred to herein as a promoter element, can be operably linked to
a nucleotide sequence or an open reading frame (ORF). Transcription
from the promoter element can catalyze the synthesis of an RNA
corresponding to the nucleotide sequence or ORF sequence operably
linked to the promoter, which in turn leads to synthesis of a
desired peptide, polypeptide or protein.
[0136] Promoter elements sometimes exhibit responsiveness to
regulatory control. Promoter elements also sometimes can be
regulated by a selective agent. That is, transcription from
promoter elements sometimes can be turned on, turned off,
up-regulated or down-regulated, in response to a change in
environmental, nutritional or internal conditions or signals (e.g.,
heat inducible promoters, light regulated promoters, feedback
regulated promoters, hormone influenced promoters, tissue specific
promoters, oxygen and pH influenced promoters, promoters that are
responsive to selective agents (e.g., kanamycin) and the like, for
example). Promoters influenced by environmental, nutritional or
internal signals frequently are influenced by a signal (direct or
indirect) that binds at or near the promoter and increases or
decreases expression of the target sequence under certain
conditions.
[0137] Non-limiting examples of selective or regulatory agents that
influence transcription from a promoter element used in embodiments
described herein include (1) nucleic acid
segments that encode products that provide resistance against
otherwise toxic compounds (e.g., antibiotics); (2) nucleic acid
segments that encode products that are otherwise lacking in the
recipient cell (e.g., essential products, tRNA genes, auxotrophic
markers); (3) nucleic acid segments that encode products that
suppress the activity of a gene product; (4) nucleic acid segments
that encode products that can be readily identified (e.g.,
phenotypic markers such as antibiotics (e.g., .beta.-lactamase),
.beta.-galactosidase, green fluorescent protein (GFP), yellow
fluorescent protein (YFP), red fluorescent protein (RFP), cyan
fluorescent protein (CFP), and cell surface proteins); (5) nucleic
acid segments that bind products that are otherwise detrimental to
cell survival and/or function; (6) nucleic acid segments that
otherwise inhibit the activity of any of the nucleic acid segments
described in Nos. 1-5 above (e.g., antisense oligonucleotides); (7)
nucleic acid segments that bind products that modify a substrate
(e.g., restriction endonucleases); (8) nucleic acid segments that
can be used to isolate or identify a desired molecule (e.g.,
specific protein binding sites); (9) nucleic acid segments that
encode a specific nucleotide sequence that can be otherwise
non-functional (e.g., for PCR amplification of subpopulations of
molecules); (10) nucleic acid segments that, when absent, directly
or indirectly confer resistance or sensitivity to particular
compounds; (11) nucleic acid segments that encode products that
either are toxic or convert a relatively non-toxic compound to a
toxic compound (e.g., Herpes simplex thymidine kinase, cytosine
deaminase) in recipient cells; (12) nucleic acid segments that
inhibit replication, partition or heritability of nucleic acid
molecules that contain them; and/or (13) nucleic acid segments that
encode conditional replication functions, e.g., replication in
certain hosts or host cell strains or under certain environmental
conditions (e.g., temperature, nutritional conditions, and the
like). In some embodiments, the regulatory or selective agent can
be added to change the existing growth conditions to which the
organism is subjected (e.g., growth in liquid culture, growth in a
fermenter, growth on solid nutrient plates and the like for
example).
[0138] In some embodiments, regulation of a promoter element can be
used to alter (e.g., increase, add, decrease or substantially
eliminate) the activity of a peptide, polypeptide or protein (e.g.,
enzyme activity for example). For example, a microorganism can be
engineered by genetic modification to express a nucleic acid
reagent that can add a novel activity (e.g., an activity not
normally found in the host organism) or increase the expression of
an existing activity by increasing transcription from a homologous
or heterologous promoter operably linked to a nucleotide sequence
of interest (e.g., homologous or heterologous nucleotide sequence
of interest), in certain embodiments. In some embodiments, a
microorganism can be engineered by genetic modification to express
a nucleic acid reagent that can decrease expression of an activity
by decreasing or substantially eliminating transcription from a
homologous or heterologous promoter operably linked to a nucleotide
sequence of interest, in certain embodiments.
[0139] Nucleic acids encoding heterologous proteins, e.g.,
nucleotide triphosphate transporters, can be inserted into or
employed with any suitable expression system. In some embodiments,
a nucleic acid reagent sometimes is stably integrated into the
chromosome of the host organism, or a nucleic acid reagent can be a
deletion of a portion of the host chromosome, in certain
embodiments (e.g., genetically modified organisms, where alteration
of the host genome confers the ability to selectively or
preferentially maintain the desired organism carrying the genetic
modification). Such nucleic acid reagents (e.g., nucleic acids or
genetically modified organisms whose altered genome confers a
selectable trait to the organism) can be selected for their ability
to guide production of a desired protein or nucleic acid molecule.
When desired, the nucleic acid reagent can be altered such that
codons encode for (i) the same amino acid, using a different tRNA
than that specified in the native sequence, or (ii) a different
amino acid than is normal, including unconventional or unnatural
amino acids (including detectably labeled amino acids).
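Option (i) above, recoding an ORF with synonymous codons, can be
sketched in a few lines of Python. This is a minimal illustration
only: the codon table is a small subset of the standard genetic
code, and the example sequence, the PREFERRED map, and the function
names are hypothetical rather than taken from this disclosure.

```python
# Partial standard genetic code: only the codons used in this example.
CODON_TABLE = {
    "ATG": "M", "AAA": "K", "AAG": "K",
    "CTA": "L", "CTG": "L", "GGA": "G", "GGC": "G",
    "TAA": "*",
}

# Hypothetical preference map: amino acid -> codon chosen for the host.
PREFERRED = {"M": "ATG", "K": "AAA", "L": "CTG", "G": "GGC", "*": "TAA"}


def translate(orf):
    """Translate an in-frame ORF using the partial table above."""
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return "".join(CODON_TABLE[c] for c in codons)


def recode(orf):
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(orf))


original = "ATGAAGCTAGGATAA"  # M K L G stop (illustrative only)
recoded = recode(original)
print(recoded, translate(recoded))
```

The nucleotide sequence changes while the translated peptide stays
the same, which is the property a codon-altered reagent of type (i)
relies on.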
[0140] Recombinant expression is usefully accomplished using an
expression cassette that can be part of a vector, such as a
plasmid. A vector can include a promoter operably linked to nucleic
acid encoding a nucleotide triphosphate transporter. A vector can
also include other elements required for transcription and
translation as described herein. An expression cassette, expression
vector, and sequences in a cassette or vector can be heterologous
to the cell to which the unnatural nucleotides are contacted. For
example, a nucleotide triphosphate transporter sequence can be
heterologous to the cell.
[0141] A variety of prokaryotic expression vectors suitable for
carrying, encoding and/or expressing nucleotide triphosphate
transporters can be produced. Such expression vectors include, for
example, pET, pET3d, pCR2.1, pBAD, pUC, and yeast vectors. The
vectors can be used, for example, in a variety of in vivo and in
vitro situations. Non-limiting examples of prokaryotic promoters
that can be used include SP6, T7, T5, tac, bla, trp, gal, lac, or
maltose promoters. Viral vectors that can be employed include those
relating to lentivirus, adenovirus, adeno-associated virus, herpes
virus, vaccinia virus, polio virus, AIDS virus, neuronal trophic
virus, Sindbis and other viruses. Also useful are any viral
families which share the properties of these viruses which make
them suitable for use as vectors. Retroviral vectors that can be
employed include those described in Verma, American Society for
Microbiology, pp. 229-232, Washington, (1985). For example, such
retroviral vectors can include Moloney murine leukemia virus (MMLV)
and other retroviruses that express desirable properties.
Typically, viral vectors contain nonstructural early genes,
structural late genes, an RNA polymerase III transcript, inverted
terminal repeats necessary for replication and encapsidation, and
promoters to control the transcription and replication of the viral
genome. When engineered as vectors, viruses typically have one or
more of the early genes removed and a gene or gene/promoter
cassette is inserted into the viral genome in place of the removed
viral nucleic acid.
[0142] Cloning
[0143] Any convenient cloning strategy known in the art may be
utilized to incorporate an element, such as an ORF, into a nucleic
acid reagent. Known methods can be utilized to insert an element
into the template independent of an insertion element, such as (1)
cleaving the template at one or more existing restriction enzyme
sites and ligating an element of interest and (2) adding
restriction enzyme sites to the template by hybridizing
oligonucleotide primers that include one or more suitable
restriction enzyme sites and amplifying by polymerase chain
reaction (described in greater detail herein). Other cloning
strategies take advantage of one or more insertion sites present or
inserted into the nucleic acid reagent, such as an oligonucleotide
primer hybridization site for PCR, for example, and others
described herein. In some embodiments, a cloning strategy can be
combined with genetic manipulation such as recombination (e.g.,
recombination of a nucleic acid reagent with a nucleic acid
sequence of interest into the genome of the organism to be
modified, as described further herein). In some embodiments, the
cloned ORF(s) can produce (directly or indirectly) modified or
wild-type nucleotide triphosphate transporters and/or polymerases, by
engineering a microorganism with one or more ORFs of interest,
which microorganism comprises altered activities of nucleotide
triphosphate transporter activity or polymerase activity.
[0144] A nucleic acid may be specifically cleaved by contacting the
nucleic acid with one or more specific cleavage agents. Specific
cleavage agents often will cleave specifically according to a
particular nucleotide sequence at a particular site. Examples of
enzyme specific cleavage agents include without limitation
endonucleases (e.g., DNase (e.g., DNase I, II); RNase (e.g., RNase
E, F, H, P); Cleavase.TM. enzyme; Taq DNA polymerase; E. coli DNA
polymerase I; murine FEN-1 endonucleases; type I, II or III
restriction endonucleases such as Acc I, Afl III, Alu I, Alw44 I,
Apa I, Asn I, Ava I, Ava II, BamH I, Ban II, Bcl I, Bgl I, Bgl II,
Bln I, Bsa I, Bsm I, BsmB I, BssH II, BstE II, Cfo I, Cla I, Dde I,
Dpn I, Dra I, EclX I, EcoR I, EcoR II, EcoR V, Hae II,
Hind II, Hind III, Hpa I, Hpa II, Kpn I, Ksp I, Mlu I, MluN I,
Msp I, Nci I, Nco I, Nde I, Nde II, Nhe I, Not I, Nru I, Nsi I, Pst
I, Pvu I, Pvu II, Rsa I, Sac I, Sal I, Sau3A I, Sca I, ScrF I, Sfi
I, Sma I, Spe I, Sph I, Ssp I, Stu I, Sty I, Swa I, Taq I, Xba I,
Xho I); glycosylases (e.g., uracil-DNA glycosylase (UDG),
3-methyladenine DNA glycosylase, 3-methyladenine DNA glycosylase
II, pyrimidine hydrate-DNA glycosylase, FaPy-DNA glycosylase,
thymine mismatch-DNA glycosylase, hypoxanthine-DNA glycosylase,
5-Hydroxymethyluracil DNA glycosylase (HmUDG),
5-Hydroxymethylcytosine DNA glycosylase, or 1,N6-etheno-adenine DNA
glycosylase); exonucleases (e.g., exonuclease III); ribozymes, and
DNAzymes. Sample nucleic acid may be treated with a chemical agent,
or synthesized using modified nucleotides, and the modified nucleic
acid may be cleaved. In non-limiting examples, sample nucleic acid
may be treated with (i) alkylating agents such as methylnitrosourea
that generate several alkylated bases, including N3-methyladenine
and N3-methylguanine, which are recognized and cleaved by alkyl
purine DNA-glycosylase; (ii) sodium bisulfite, which causes
deamination of cytosine residues in DNA to form uracil residues
that can be cleaved by uracil N-glycosylase; and (iii) a chemical
agent that converts guanine to its oxidized form, 8-hydroxyguanine,
which can be cleaved by formamidopyrimidine DNA N-glycosylase.
Examples of chemical cleavage processes include without limitation
alkylation, (e.g., alkylation of phosphorothioate-modified nucleic
acid); cleavage based on the acid lability of
P3'-N5'-phosphoramidate-containing nucleic acid; and osmium
tetroxide and piperidine treatment of nucleic acid.
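Sequence-specific cleavage by type II restriction endonucleases can
be illustrated by locating recognition sites in a template. The
sketch below is an assumption-laden example: the recognition
sequences are the standard published ones for four of the enzymes
listed above (they are not given in this text), and the template
sequence and function name are made up for illustration.

```python
# Recognition sequences (5'->3') for a few of the enzymes listed above.
# All four sites are palindromic, so scanning one strand is sufficient.
SITES = {
    "EcoR I": "GAATTC",
    "BamH I": "GGATCC",
    "Hind III": "AAGCTT",
    "Xho I": "CTCGAG",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site found."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


template = "ttgaattcgcatggatccaaagaattcc"  # made-up sequence
print(find_sites(template))
```

Enzymes with no site in the template are simply omitted from the
result, which makes it easy to pick an enzyme that cuts once or not
at all.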
[0145] In some embodiments, the nucleic acid reagent includes one
or more recombinase insertion sites. A recombinase insertion site
is a recognition sequence on a nucleic acid molecule that
participates in an integration/recombination reaction by
recombination proteins. For example, the recombination site for Cre
recombinase is loxP, which is a 34 base pair sequence comprised of
two 13 base pair inverted repeats (serving as the recombinase
binding sites) flanking an 8 base pair core sequence (e.g., Sauer,
Curr. Opin. Biotech. 5:521-527 (1994)). Other examples of
recombination sites include attB, attP, attL, and attR sequences,
and mutants, fragments, variants and derivatives thereof, which are
recognized by the recombination protein .lamda. Int and by the
auxiliary proteins integration host factor (IHF), FIS and
excisionase (Xis) (e.g., U.S. Pat. Nos. 5,888,732; 6,143,557;
6,171,861; 6,270,969; 6,277,608; and 6,720,140; U.S. patent
application Ser. Nos. 09/517,466, and 09/732,914; U.S. Patent
Publication No. US2002/0007051; and Landy, Curr. Opin. Biotech.
3:699-707 (1993)).
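The loxP architecture described above (two 13 base pair inverted
repeats flanking an 8 base pair core) can be checked directly on a
sequence. The snippet below uses the commonly cited wild-type loxP
sequence, which is an assumption here because this text does not
reproduce it; the helper name is likewise illustrative.

```python
# Commonly cited wild-type loxP sequence (not reproduced in the text).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


# Split into left repeat (13 bp), core (8 bp), and right repeat (13 bp).
left, core, right = LOXP[:13], LOXP[13:21], LOXP[21:]

print(len(LOXP), left, core, right)
# Inverted repeats: the right arm is the reverse complement of the left.
print(right == reverse_complement(left))
```

The core is not its own reverse complement, which is what gives the
site an orientation.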
[0146] Examples of recombinase cloning nucleic acids are in
Gateway.RTM. systems (Invitrogen, California), which include at
least one recombination site for cloning desired nucleic acid
molecules in vivo or in vitro. In some embodiments, the system
utilizes vectors that contain at least two different site-specific
recombination sites, often based on the bacteriophage lambda system
(e.g., att1 and att2), and are mutated from the wild-type (attO)
sites. Each mutated site has a unique specificity for its cognate
partner att site (i.e., its binding partner recombination site) of
the same type (for example attB1 with attP1, or attL1 with attR1)
and will not cross-react with recombination sites of the other
mutant type or with the wild-type attO site. Different site
specificities allow directional cloning or linkage of desired
molecules thus providing desired orientation of the cloned
molecules. Nucleic acid fragments flanked by recombination sites
are cloned and subcloned using the Gateway.RTM. system by replacing
a selectable marker (for example, ccdB) flanked by att sites on the
recipient plasmid molecule, sometimes termed the Destination
Vector. Desired clones are then selected by transformation of a
ccdB sensitive host strain and positive selection for a marker on
the recipient molecule. Similar strategies for negative selection
(e.g., use of toxic genes) can be used in other organisms such as
thymidine kinase (TK) in mammals and insects.
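The pairing rule stated above (a mutated att site recombines only
with its cognate partner of the same type, e.g., attB1 with attP1 or
attL1 with attR1) can be written as a small predicate. This is a
sketch under assumptions: the site-name format and the treatment of
an unnumbered name as a wild-type site are conventions chosen for
this example, not an API of any cloning software.

```python
import re

# Partner classes stated in the text: attB pairs with attP, attL with attR.
PARTNER = {"B": "P", "P": "B", "L": "R", "R": "L"}
SITE_PATTERN = r"att([BPLR])(\d+)"


def can_recombine(site_a, site_b):
    """True if two att sites are cognate partners of the same mutant type."""
    a = re.fullmatch(SITE_PATTERN, site_a)
    b = re.fullmatch(SITE_PATTERN, site_b)
    if not (a and b):
        return False  # e.g., a wild-type site written without a number
    same_type = a.group(2) == b.group(2)
    return same_type and PARTNER[a.group(1)] == b.group(1)


print(can_recombine("attB1", "attP1"))
print(can_recombine("attB1", "attP2"))
```

Because two different mutant types never pair with each other, an
insert flanked by type 1 and type 2 sites can enter the recipient
vector in only one orientation.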
[0147] A nucleic acid reagent sometimes contains one or more origin
of replication (ORI) elements. In some embodiments, a template
comprises two or more ORIs, where one functions efficiently in one
organism (e.g., a bacterium) and another functions efficiently in
another organism (e.g., a eukaryote, like yeast for example). In
some embodiments, an ORI may function efficiently in one species
(e.g., S. cerevisiae, for example) and another ORI may function
efficiently in a different species (e.g., S. pombe, for example). A
nucleic acid reagent also sometimes includes one or more
transcription regulation sites.
[0148] A nucleic acid reagent, e.g., an expression cassette or
vector, can include nucleic acid sequence encoding a marker
product. A marker product is used to determine if a gene has been
delivered to the cell and once delivered is being expressed.
Example marker genes include the E. coli lacZ gene, which encodes
.beta.-galactosidase, and the gene encoding green fluorescent
protein. In some embodiments,
the marker can be a selectable marker. When such selectable markers
are successfully transferred into a host cell, the transformed host
cell can survive if placed under selective pressure. There are two
widely used distinct categories of selective regimes. The first
category is based on a cell's metabolism and the use of a mutant
cell line which lacks the ability to grow independent of a
supplemented media. The second category is dominant selection which
refers to a selection scheme used in any cell type and does not
require the use of a mutant cell line. These schemes typically use
a drug to arrest growth of a host cell. Those cells which have a
novel gene would express a protein conveying drug resistance and
would survive the selection. Examples of such dominant selection
use the drugs neomycin (Southern et al., J. Molec. Appl. Genet. 1:
327 (1982)), mycophenolic acid, (Mulligan et al., Science 209: 1422
(1980)) or hygromycin, (Sugden, et al., Mol. Cell. Biol. 5: 410-413
(1985)).
[0149] A nucleic acid reagent can include one or more selection
elements (e.g., elements for selection of the presence of the
nucleic acid reagent, and not for activation of a promoter element
which can be selectively regulated). Selection elements often are
utilized using known processes to determine whether a nucleic acid
reagent is included in a cell. In some embodiments, a nucleic acid
reagent includes two or more selection elements, where one
functions efficiently in one organism, and another functions
efficiently in another organism. Examples of selection elements
include, but are not limited to, (1) nucleic acid segments that
encode products that provide resistance against otherwise toxic
compounds (e.g., antibiotics); (2) nucleic acid segments that
encode products that are otherwise lacking in the recipient cell
(e.g., essential products, tRNA genes, auxotrophic markers); (3)
nucleic acid segments that encode products that suppress the
activity of a gene product; (4) nucleic acid segments that encode
products that can be readily identified (e.g., phenotypic markers
such as antibiotics (e.g., .beta.-lactamase), .beta.-galactosidase, green
fluorescent protein (GFP), yellow fluorescent protein (YFP), red
fluorescent protein (RFP), cyan fluorescent protein (CFP), and cell
surface proteins); (5) nucleic acid segments that bind products
that are otherwise detrimental to cell survival and/or function;
(6) nucleic acid segments that otherwise inhibit the activity of
any of the nucleic acid segments described in Nos. 1-5 above (e.g.,
antisense oligonucleotides); (7) nucleic acid segments that bind
products that modify a substrate (e.g., restriction endonucleases);
(8) nucleic acid segments that can be used to isolate or identify a
desired molecule (e.g., specific protein binding sites); (9)
nucleic acid segments that encode a specific nucleotide sequence
that can be otherwise non-functional (e.g., for PCR amplification
of subpopulations of molecules); (10) nucleic acid segments that,
when absent, directly or indirectly confer resistance or
sensitivity to particular compounds; (11) nucleic acid segments
that encode products that either are toxic or convert a relatively
non-toxic compound to a toxic compound (e.g., Herpes simplex
thymidine kinase, cytosine deaminase) in recipient cells; (12)
nucleic acid segments that inhibit replication, partition or
heritability of nucleic acid molecules that contain them; and/or
(13) nucleic acid segments that encode conditional replication
functions, e.g., replication in certain hosts or host cell strains
or under certain environmental conditions (e.g., temperature,
nutritional conditions, and the like).
[0150] A nucleic acid reagent can be of any form useful for in vivo
transcription and/or translation. A nucleic acid sometimes is a
plasmid, such as a supercoiled plasmid, sometimes is a yeast
artificial chromosome (e.g., YAC), sometimes is a linear nucleic
acid (e.g., a linear nucleic acid produced by PCR or by restriction
digest), sometimes is single-stranded and sometimes is
double-stranded. A nucleic acid reagent sometimes is prepared by an
amplification process, such as a polymerase chain reaction (PCR)
process or transcription-mediated amplification process (TMA). In
TMA, two enzymes are used in an isothermal reaction to produce
amplification products detected by light emission (e.g.,
Biochemistry 1996 June 25; 35(25):8429-38). Standard PCR processes
are known (e.g., U.S. Pat. Nos. 4,683,202; 4,683,195; 4,965,188;
and 5,656,493), and generally are performed in cycles. Each cycle
includes heat denaturation, in which hybrid nucleic acids
dissociate; cooling, in which primer oligonucleotides hybridize;
and extension of the oligonucleotides by a polymerase (e.g., Taq
polymerase). An example of a PCR cyclical process is treating the
sample at 95.degree. C. for 5 minutes; repeating forty-five cycles
of 95.degree. C. for 1 minute, 59.degree. C. for 1 minute 10
seconds, and 72.degree. C. for 1 minute 30 seconds; and then
treating the sample at 72.degree. C. for 5 minutes. Multiple cycles
frequently are performed using a commercially available thermal
cycler. PCR amplification products sometimes are stored for a time
at a lower temperature (e.g., at 4.degree. C.) and sometimes are
frozen (e.g., at -20.degree. C.) before analysis.
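The example cycling program above can be written out as data to see
how long it holds at temperature. The sketch below reads the
annealing step as 1 minute 10 seconds (70 s) and ignores ramp times
between temperatures, both of which are assumptions; the variable
and function names are illustrative.

```python
# (label, temperature in deg C, hold time in seconds)
INITIAL = ("initial denaturation", 95, 5 * 60)
CYCLE = [
    ("denaturation", 95, 60),
    ("annealing", 59, 70),
    ("extension", 72, 90),
]
N_CYCLES = 45
FINAL = ("final extension", 72, 5 * 60)


def total_hold_seconds():
    """Sum of all hold times, excluding ramping between steps."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE)
    return INITIAL[2] + N_CYCLES * per_cycle + FINAL[2]


total = total_hold_seconds()
print(f"{total} s = {total // 3600} h {total % 3600 // 60} min")
```

Each cycle holds for 220 seconds, so the forty-five cycles account
for 165 of the 175 minutes of total hold time.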
Kits/Article of Manufacture
[0151] Disclosed herein, in certain embodiments, are kits and
articles of manufacture for use with one or more methods described
herein. Such kits include a carrier, package, or container that is
compartmentalized to receive one or more containers such as vials,
tubes, and the like, each of the container(s) comprising one of the
separate elements to be used in a method described herein. Suitable
containers include, for example, bottles, vials, syringes, and test
tubes. In one embodiment, the containers are formed from a variety
of materials such as glass or plastic.
[0152] In some embodiments, a kit includes a suitable packaging
material to house the contents of the kit. In some cases, the
packaging material is constructed by well-known methods, preferably
to provide a sterile, contaminant-free environment. The packaging
materials employed herein can include, for example, those
customarily utilized in commercial kits sold for use with nucleic
acid sequencing systems. Exemplary packaging materials include,
without limitation, glass, plastic, paper, foil, and the like,
capable of holding within fixed limits a component set forth
herein.
[0153] The packaging material can include a label which indicates a
particular use for the components. The use for the kit that is
indicated by the label can be one or more of the methods set forth
herein as appropriate for the particular combination of components
present in the kit. For example, a label can indicate that the kit
is useful for a method of synthesizing a polynucleotide or for a
method of determining the sequence of a nucleic acid.
[0154] Instructions for use of the packaged reagents or components
can also be included in a kit. The instructions will typically
include a tangible expression describing reaction parameters, such
as the relative amounts of kit components and sample to be admixed,
maintenance time periods for reagent/sample admixtures,
temperature, buffer conditions, and the like.
[0155] It will be understood that not all components necessary for
a particular reaction need be present in a particular kit. Rather
one or more additional components can be provided from other
sources. The instructions provided with a kit can identify the
additional component(s) that are to be provided and where they can
be obtained.
[0156] In some embodiments, a kit is provided that is useful for
stably incorporating an unnatural nucleic acid into a cellular
nucleic acid, e.g., using the methods provided by the present
invention for preparing genetically engineered cells. In one
embodiment, a kit described herein includes a genetically
engineered cell and one or more unnatural nucleic acids. In another
embodiment, a kit described herein includes an isolated and
purified plasmid comprising a sequence selected from SEQ ID NOs:
1-9. In a further embodiment, a kit described herein includes an
isolated and purified plasmid comprising a sequence selected from
SEQ ID NOs: 2, 3, 5, or 7.
[0157] In additional embodiments, the kit described herein provides
a cell and a nucleic acid molecule containing a heterologous gene
for introduction into the cell to thereby provide a genetically
engineered cell, such as expression vectors comprising the nucleic
acid of any of the embodiments hereinabove described in this
paragraph.
Certain Terminology
[0158] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as is commonly understood by one
of skill in the art to which the claimed subject matter belongs. It
is to be understood that the foregoing general description and the
following detailed description are exemplary and explanatory only
and are not restrictive of any subject matter claimed. In this
application, the use of the singular includes the plural unless
specifically stated otherwise. It must be noted that, as used in
the specification and the appended claims, the singular forms "a,"
"an" and "the" include plural referents unless the context clearly
dictates otherwise. In this application, the use of "or" means
"and/or" unless stated otherwise. Furthermore, use of the term
"including" as well as other forms, such as "include", "includes,"
and "included," is not limiting.
[0159] As used herein, ranges and amounts can be expressed as
"about" a particular value or range. About also includes the exact
amount. Hence "about 5 .mu.L" means "about 5 .mu.L" and also "5
.mu.L." Generally, the term "about" includes an amount that would
be expected to be within experimental error.
[0160] The section headings used herein are for organizational
purposes only and are not to be construed as limiting the subject
matter described.
EXAMPLES
[0161] These examples are provided for illustrative purposes only
and not to limit the scope of the claims provided herein.
Example 1
[0162] All natural organisms store genetic information in a
four-letter, two-base-pair genetic alphabet. In some instances, a
mutant form of Escherichia coli, grown in the presence of the
unnatural nucleoside triphosphates dNaMTP and d5SICSTP and provided
with the means to import them via expression of a plasmid-borne
nucleoside triphosphate transporter, replicates DNA containing a
single dNaM-d5SICS UBP. In some cases, the organism grew poorly and
was unable to store the unnatural information indefinitely, which
is a prerequisite for true semi-synthetic life. Described below is
an engineered transporter that, coupled with a chemically optimized
UBP, is used to generate a semi-synthetic organism (SSO).
Methods
[0163] Unless otherwise stated, liquid bacterial cultures were
grown in 2.times.YT (casein peptone 16 g/L, yeast extract 10 g/L,
NaCl 5 g/L) supplemented with potassium phosphate (50 mM, pH 7),
referred to hereafter as "media", and incubated at 37.degree. C. in
a 48-well flat bottomed plate (CELLSTAR, Greiner Bio-One) with
shaking at 200 rpm. Solid growth media was prepared with 2% agar.
Antibiotics were used, as appropriate, at the following
concentrations: carbenicillin, 100 .mu.g/mL; streptomycin, 50
.mu.g/mL; kanamycin, 50 .mu.g/mL; zeocin, 50 .mu.g/mL;
chloramphenicol, 33 .mu.g/mL for plasmids, 5 .mu.g/mL for
chromosomal integrants. All selective agents were purchased
commercially. Cell growth, indicated as OD.sub.600, was measured
using a Perkin Elmer Envision 2103 Multilabel Reader with a 590/20
nm filter.
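The working antibiotic concentrations listed above translate into
stock volumes through C1*V1 = C2*V2. The following sketch encodes
the listed working concentrations; the stock concentration and
culture volume passed to the helper are hypothetical inputs, since
the text does not state them.

```python
# Working concentrations from the Methods, in micrograms per mL.
WORKING_UG_PER_ML = {
    "carbenicillin": 100,
    "streptomycin": 50,
    "kanamycin": 50,
    "zeocin": 50,
    "chloramphenicol (plasmid)": 33,
    "chloramphenicol (chromosomal integrant)": 5,
}


def stock_volume_ul(antibiotic, culture_ml, stock_mg_per_ml):
    """Microlitres of stock to add, from C1*V1 = C2*V2."""
    working = WORKING_UG_PER_ML[antibiotic]   # ug/mL
    stock = stock_mg_per_ml * 1000            # mg/mL -> ug/mL
    return working * culture_ml * 1000 / stock  # mL -> uL


# Hypothetical example: 5 mL culture, 100 mg/mL carbenicillin stock.
print(stock_volume_ul("carbenicillin", 5, 100))
```

Keeping the two chloramphenicol entries separate reflects the lower
concentration used for chromosomal integrants than for plasmids.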
[0164] Unless otherwise stated, all molecular biology reagents were
obtained from New England Biolabs (NEB) and were used according to
the manufacturer's protocols. PCRs for cloning and strain
construction were performed with Q5 DNA polymerase. Thermocycling
was performed using a PTC-200 thermocycler (MJ Research), except
for the PCRs used to generate UBP-containing Golden Gate inserts
and the PCRs used in the biotin shift assay, which were performed
with a CFX-Connect Real-Time Thermal Cycler (Bio-Rad) to monitor
product amplification with SYBR Green (Thermo Fisher). Where
necessary, primers were phosphorylated using T4 polynucleotide
kinase. Plasmids linearized by PCR were treated with DpnI to remove
the plasmid template, and ligations were performed with T4 DNA
ligase. PCRs and Golden Gate assembled plasmids were purified by
spin column (DNA Clean and Concentrator-5, Zymo Research). DNA
fragments isolated by agarose gel electrophoresis were purified
using the Zymoclean Gel DNA recovery kit (Zymo Research). Colony
PCRs were performed with Taq DNA polymerase. Natural DNA fragments
and plasmids were quantified by A.sub.260/280 using a NanoDrop 2000
(Thermo Fisher) or an Infinite M200 Pro (Tecan). DNA fragments and
plasmids that contain UBP(s), which were typically <20 ng/.mu.L,
were quantified using the Qubit dsDNA HS Assay Kit (Thermo
Fisher).
[0165] The sequences of all DNA oligonucleotides used in this study
are provided in Table 2. Natural oligonucleotides were purchased
from IDT (San Diego, Calif., USA) with standard purification and
desalting. Gene synthesis of the codon optimized PtNTT2 and GFP
gene sequences was performed by GeneArt Gene Synthesis (Thermo
Fisher) and GenScript, respectively, and kindly provided by
Synthorx. Sequencing was performed by Eton Biosciences (San Diego,
Calif., USA) or Genewiz (San Diego, Calif., USA). Plasmids were
isolated using commercial miniprep kits (QIAprep, Qiagen or ZR
Plasmid Miniprep Classic, Zymo Research).
[0166] [.alpha.-.sup.32P]-dATP (3000 Ci/mmol, 10 mCi/mL) was
purchased from PerkinElmer (Shelton, Conn., USA). Triphosphates of
dNaM, d5SICS, dTPT3, and dMMO2.sup.bio were synthesized as
described in Li et al., "Natural-like replication of an unnatural
base pair for the expansion of the genetic alphabet and
biotechnology applications," J. Am. Chem. Soc. 136, 825-829 (2014);
or kindly provided by Synthorx (San Diego, Calif., USA). The
dNaM-containing TK1 oligonucleotide was described in Malyshev, et
al., "A semi-synthetic organism with an expanded genetic alphabet,"
Nature, 509, 385-388 (2014). All other unnatural oligonucleotides
containing dNaM were synthesized by Biosearch Technologies
(Petaluma, Calif., USA) with purification by reverse phase
cartridge.
[0167] The C41(DE3) E. coli strain was kindly provided by J. P.
Audia (University of South Alabama, USA). pKIKOarsBKm was a gift
from Lars Nielsen & Claudia Vickers (Addgene plasmid #46766).
pRS426 was kindly provided by Richard Kolodner (University of
California San Diego, USA).
Construction of PtNTT2 Plasmids
[0168] Construction of pCDF-1b-PtNTT2 was described in Malyshev, et
al., "A semi-synthetic organism with an expanded genetic alphabet,"
Nature, 509, 385-388 (2014). To create pCDF-1b-PtNTT2(66-575),
phosphorylated primers YZ552 and pCDF-1b-fwd were used to linearize
pCDF-1b-PtNTT2 by PCR and the resulting product was
intramolecularly ligated. Plasmids from single clones were isolated
and confirmed by sequencing the PtNTT2 gene using primers T7 seq
and T7 term seq.
[0169] To create plasmids pSC-P.sub.(lacI, bla, lac,
lacUV5)PtNTT2(66-575)-T.sub.0, phosphorylated primers YZ581 and
YZ576 were used to amplify the PtNTT2(66-575) gene, and its
corresponding ribosomal binding sequence and terminator, from a
version of pCDF-1b-PtNTT2(66-575) that replaces the T7 terminator
with a .lamda. T.sub.0 terminator. This insert was ligated into
plasmid pHSG576 linearized with primers DM002 and YZ580. A single
clone of the resulting plasmid pSC-PtNTT2(66-575)-T.sub.0 was
verified by sequencing the PtNTT2 gene using primers DM052 and
YZ50. pSC-PtNTT2(66-575)-T.sub.0 was then linearized with primers
YZ580 and YZ581, and ligated to a phosphorylated primer duplex
corresponding to the P.sub.lacI, P.sub.bla, P.sub.lac or
P.sub.lacUV5 promoter (YZ584/YZ585, YZ582/YZ583, YZ599/YZ600, and
YZ595/YZ596, respectively) to yield plasmids pSC-P.sub.(lacI, bla,
lac, lacUV5)PtNTT2(66-575)-T.sub.0. Correct promoter orientation
and promoter-gene sequences were again confirmed by sequencing
using primers DM052 and YZ50. pSC-P.sub.blaPtNTT2(66-575
co)-T.sub.0 was generated analogously to
pSC-P.sub.blaPtNTT2(66-575)-T.sub.0 by using a .lamda. T.sub.0
terminator version of pCDF-1b-PtNTT2(66-575) containing a codon
optimized PtNTT2 sequence (see Table 2).
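The "respectively" pairing of promoters and primer duplexes in the
paragraph above is easy to misread, so it is restated here as a
lookup table. The plain-ASCII plasmid naming in plasmid_name is an
assumption made for this sketch and is not the notation used in the
text.

```python
# Promoter -> primer duplex, in the order given in the paragraph above.
PROMOTER_DUPLEX = {
    "lacI": ("YZ584", "YZ585"),
    "bla": ("YZ582", "YZ583"),
    "lac": ("YZ599", "YZ600"),
    "lacUV5": ("YZ595", "YZ596"),
}


def plasmid_name(promoter):
    """ASCII rendering of the plasmid name (format is an assumption)."""
    return f"pSC-P({promoter})PtNTT2(66-575)-T0"


for promoter, (oligo_a, oligo_b) in PROMOTER_DUPLEX.items():
    print(f"{plasmid_name(promoter)}: duplex {oligo_a}/{oligo_b}")
```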
Construction of PtNTT2 and Cas9 Strains
[0170] The PtNTT2(66-575) expression cassette and its
chloramphenicol resistance marker Cm.sup.R in the pSC plasmids is
.about.2.8 kb, a size that is prohibitive for chromosomal
integration with the small (.about.50 bp) stretches of homology
that can be introduced via primers during PCR, as is traditionally
done in recombineering. Homologous recombination in S. cerevisiae
was used to construct a series of integration template plasmids
with the PtNTT2(66-575) expression cassette and Cm.sup.R flanked by
.about.1 kb of sequence 5' to lacZ and .about.1 kb of sequence 3'
to 246 bp downstream of the lacA start codon. The lacZYA locus was
chosen so that integration of the transporter would also knock out
the lactose permease lacY, thus creating a BL21(DE3) strain that
allows for uniform cellular entry of IPTG, and thereby homogeneous,
finely titratable induction of promoters containing lac
operators.
[0171] To create the integration template plasmids, pRS426 was
digested with PvuI-HF and the resulting 3810-bp plasmid fragment
was isolated by agarose gel electrophoresis and purification. This
fragment was then gap repaired in the S. cerevisiae strain BY4741
via lithium acetate mediated chemical transformation of the plasmid
fragment and PCR products of the following primer/primer/template
combinations: YZ7/YZ12/pBR322, YZ613/YZ580/E. coli genomic DNA,
YZ614/YZ615/E. coli genomic DNA, and
DM052/YZ612/pSC-P.sub.lacUV5PtNTT2(66-575)-T.sub.0. The resulting
plasmid, 426.lacZYA::P.sub.lacUV5PtNTT2(66-575)-T.sub.0 Cm.sup.R,
was isolated (Zymoprep Yeast Plasmid Miniprep, Zymo Research),
digested with PvuI-HF and XbaI (to reduce background during
integration, since the pRS426 shuttle plasmid also contains an E.
coli pMB1 origin), and used as the template to generate a linear
integration fragment via PCR with primers YZ616 and YZ617.
Integration of this fragment into BL21(DE3) to generate strain YZ2
was performed using pKD46 as described in Datsenko, et al.,
"One-step inactivation of chromosomal genes in Escherichia coli
K-12 using PCR products," PNAS 97, 6640-6645 (2000). Integrants
were confirmed by colony PCR of the 5' and 3' junctions using
primers YZ618 and YZ587 (1601-bp product) and YZ69 and YZ619
(1402-bp product), respectively, detection of lacZ deletion via
growth on plates containing X-gal (80 .mu.g/mL) and IPTG (100
.mu.M), and PCR and sequencing of the transporter with primers
DM053 and YZ50.
[0172] Plasmid 426.lacZYA::Cm.sup.R was generated from the
linearization of 426.lacZYA::P.sub.lacUV5PtNTT2(66-575)-T.sub.0
Cm.sup.R using phosphorylated primers YZ580 and pCDF-1b-rev, and
subsequent intramolecular ligation. 426.lacZYA::Cm.sup.R was then
integrated into BL21(DE3) to create an isogenic, transporter-less
control strain for dATP uptake assays.
[0173] To create plasmids 426.lacZYA::P.sub.(bla, lac,
lacUV5)PtNTT2(66-575 co)-T.sub.0 Cm.sup.R, plasmid
426.lacZYA::PtNTT2(66-575)-T.sub.0 Cm.sup.R (a promoter-less
plasmid generated analogously to
426.lacZYA::P.sub.lacUV5PtNTT2(66-575)-T.sub.0 Cm.sup.R using
pSC-PtNTT2(66-575)-T.sub.0), referred to hereafter as 426.trunc,
was digested with PvuI-HF and AvrII, and the resulting 5969-bp
plasmid fragment was isolated by agarose gel electrophoresis and
purification, and gap repaired using PCR products of the following
primer/primer/template combinations: 1, YZ12/YZ580/426.trunc; 2,
DM053/YZ610/pSC-P.sub.(bla, lac, lacUV5)PtNTT2(66-575)-T.sub.0; 3,
YZ581/YZ50/pSC-P.sub.blaPtNTT2(66-575 co)-T.sub.0. Plasmids
426.lacZYA::P.sub.(tac, N25, .lamda., H207)PtNTT2(66-575
co)-T.sub.0 Cm.sup.R were generated analogously except fragments 2
were replaced with fragments corresponding to the promoters
P.sub.tac, P.sub.N25, P.sub..lamda. and P.sub.H207, which were
generated by annealing and extension of primer pairs YZ703/YZ704,
YZ707/YZ708, YZ709/YZ710, and YZ711/YZ712, respectively, with
Klenow fragment. Plasmids 426.lacZYA::P.sub.(bla, lac, lacUV5,
tac, N25, .lamda., H207)PtNTT2(66-575 co)-T.sub.0 Cm.sup.R were
then used to integrate the transporter into BL21(DE3) using primers
YZ616 and YZ617, and recombineering, as described above. Strain YZ3
denotes BL21(DE3) integrated with lacZYA::P.sub.lacUV5PtNTT2(66-575
co)-T.sub.0 Cm.sup.R.
[0174] To create strain YZ4, the 4362-bp fragment of SpeI and AvrII
digested pCas9-Multi was ligated into SpeI digested pKIKOarsBKm and
the resulting plasmid, pKIKOarsB::P.sub.lacO-Cas9-T.sub.rrnB
Km.sup.R, was used as the template to generate a linear integration
fragment via PCR with primers YZ720 and YZ721. The fragment was
then integrated into BL21(DE3) as described above, and confirmed by
colony PCR with primers YZ720 and YZ721 and sequencing of the
product with primers TG1-TG6. P.sub.lacUV5PtNTT2(66-575 co)-T.sub.0
Cm.sup.R was subsequently integrated into this strain, as described
above, to generate strain YZ4.
dATP Uptake Assay
[0175] Radioactive uptake assays were conducted as described in
Haferkamp, et al., "Tapping the nucleotide pool of the host: novel
nucleotide carrier proteins of Protochlamydia amoebophila," Mol.
Microbiol. 60, 1534-1545 (2006) with the following modifications:
C41(DE3) and BL21(DE3) strains carrying plasmid-based transporters
and their appropriate empty plasmid controls, as well as BL21(DE3)
chromosomal transporter integrants and their appropriate isogenic
transporter-less control, were grown overnight with appropriate
antibiotics (streptomycin for pCDF plasmids and chloramphenicol for
pSC plasmids and integrants) in 500 .mu.L of media. Cultures were
diluted to an OD.sub.600 of 0.02 in 500 .mu.L of fresh media, grown
for 2.5 h, induced with IPTG (0-1 mM, pCDF strains only) or grown
(all other strains) for 1 h, and incubated with dATP spiked with
[.alpha.-.sup.32P]-dATP (final concentration=250 .mu.M (0.5
.mu.Ci/mL)) for .about.1 h. This experimental scheme is analogous
to the protocol used to prepare cells for transformation with
UBP-containing plasmids, with the 1 h of dATP incubation simulating
the 1 h of recovery in the presence of unnatural triphosphates
following electroporation. A duplicate 48-well plate without
[.alpha.-.sup.32P]-dATP was grown in parallel to monitor
growth.
[0176] Following incubation with dATP, 200 .mu.L of each culture
was collected through a 96-well 0.65 .mu.m glass fiber filter plate
(MultiScreen, EMD Millipore) under vacuum and washed with cold
potassium phosphate (3.times.200 .mu.L, 50 mM, pH 7) and cold
ddH.sub.2O (1.times.200 .mu.L). Filters were removed from the plate
and exposed overnight to a storage phosphor screen (BAS-IP MS, GE
Healthcare Life Sciences), which was subsequently imaged using a
flatbed laser scanner (Typhoon 9410, GE Healthcare Life Sciences).
The resulting image was quantified by densitometric analysis using
Image Studio Lite (LI-COR). Raw image intensities of each sample
were normalized to the length of time and average OD.sub.600 during
dATP incubation (i.e. normalized to an estimate of the area under
the growth curve corresponding to the window of uptake), followed
by subtracting the normalized signals of the appropriate negative,
no transporter controls.
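The normalization described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming the "average OD.sub.600" is the arithmetic mean of the readings taken during the dATP incubation; the function names and the numbers in the usage note are hypothetical and are not taken from the disclosure.

```python
def normalized_uptake(raw_intensity, od_readings, duration_h):
    """Divide a raw densitometry signal by (incubation time x mean OD600).

    The product duration_h * mean OD600 is a simple estimate of the area
    under the growth curve during the uptake window.
    """
    if not od_readings or duration_h <= 0:
        raise ValueError("need at least one OD600 reading and a positive duration")
    mean_od = sum(od_readings) / len(od_readings)
    return raw_intensity / (duration_h * mean_od)


def background_subtracted_uptake(sample, control):
    """Subtract the normalized no-transporter control from the normalized sample.

    Each argument is a (raw_intensity, od_readings, duration_h) tuple.
    """
    return normalized_uptake(*sample) - normalized_uptake(*control)
```

For example, a hypothetical sample with raw intensity 1200 and OD.sub.600 readings of 0.2 and 0.4 over 1 h, paired with a control of intensity 150 and readings of 0.25 and 0.35, gives 4000 - 500 = 3500 arbitrary units.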
[0177] Doubling times for strains grown in the dATP uptake assay
were calculated as
(t.sub.2-t.sub.1)/log.sub.2(OD.sub.600,2/OD.sub.600,1), averaging
across three .about.30 min time intervals roughly corresponding to
30 min prior to dATP uptake and 60 min during dATP uptake.
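As an illustration of the doubling-time expression above, the following Python sketch evaluates it for a single interval and averages it over consecutive intervals. The function names and the example readings are hypothetical; the sketch assumes the culture grows between readings (OD.sub.600,2 greater than OD.sub.600,1), since the expression is undefined otherwise.

```python
import math


def doubling_time(t1, t2, od1, od2):
    """Doubling time over one interval: (t2 - t1) / log2(od2 / od1)."""
    return (t2 - t1) / math.log2(od2 / od1)


def mean_doubling_time(timepoints, ods):
    """Average the doubling time over each pair of consecutive readings."""
    intervals = zip(zip(timepoints, ods), zip(timepoints[1:], ods[1:]))
    values = [doubling_time(ta, tb, oa, ob) for (ta, oa), (tb, ob) in intervals]
    return sum(values) / len(values)
```

With four hypothetical readings at 0, 30, 60, and 90 min in which the OD.sub.600 doubles every interval (0.1, 0.2, 0.4, 0.8), the three intervals each yield 30 min and so does their average.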
Construction of Golden Gate Destination Plasmids for pUCX1, pUCX2,
and pBRX2
[0178] Although the UBP was cloned into plasmids via circular
polymerase extension cloning (CPEC).sup.4,27, the method results in
a doubly-nicked plasmid that cannot be treated with T5 exonuclease
to degrade unincorporated linear plasmid and inserts, and thus
makes it difficult to accurately quantify the yield of the cloning
reaction and control the amount of input plasmid used to transform
cells during an in vivo replication experiment. Furthermore, the
unincorporated linear plasmid and inserts of a CPEC reaction can
also template PCR reactions with the primers used in the biotin
shift assay, and thus biotin shift assays on CPEC products do not
truly reflect the UBP content of the plasmids that are actually
transformed into cells. To circumvent these complications, the UBP
was incorporated into plasmids using Golden Gate Assembly.
[0179] To create pUCX1 GG and pUCX2 GG, the Golden Gate destination
plasmids for pUCX1 and pUCX2, respectively, pUC19 was linearized
with phosphorylated primers pUC19-lin-fwd and pUC19-lin-rev, and
the resulting product was intramolecularly ligated to delete the
natural 75-nt TK1 sequence. The resulting plasmid was then
linearized with phosphorylated primers YZ51 and YZ52, and the
resulting product was intramolecularly ligated to mutate the BsaI
recognition site within the ampicillin resistance marker Amp.sup.R.
This plasmid was then linearized with primers pUC19-lin-fwd and
pUC19-lin-rev (for pUCX1), or primers YZ95 and YZ96 (for pUCX2),
and ligated to an insert generated from PCR with phosphorylated
primers YZ93 and YZ94 and template pCas9-Multi, to introduce two
BsaI recognition sites (for cloning by Golden Gate Assembly) and a
zeocin resistance marker (a stuffer cassette used to differentiate
between plasmids with or without an insert) into pUC19.
[0180] To create pBRX2 GG, the Golden Gate destination plasmid for
pBRX2, the 2934-bp fragment of AvaI and EcoRI-HF digested pBR322
was end-filled with Klenow fragment and intramolecularly ligated to
delete the tetracycline resistance cassette. The BsaI recognition
site within Amp.sup.R was mutated as described above. The plasmid
was then linearized with primers YZ95 and YZ96, and ligated to the
BsaI-zeo.sup.R-BsaI cassette as described above. Thus, pBRX2 is a
lower copy analog of pUCX2.
Golden Gate Assembly of UBP-Containing Plasmids
[0181] Plasmids containing UBP(s) were generated by Golden Gate
Assembly. Inserts containing the UBP were generated by PCR of
chemically synthesized oligonucleotides containing dNaM, using
dTPT3TP and dNaMTP, and primers that introduce terminal BsaI
recognition sites that, when digested, produce overhangs compatible
with an appropriate destination plasmid; see Table 2 for a full
list of primers, templates and their corresponding Golden Gate
destination plasmids. Template oligonucleotides (0.025 ng per 50
.mu.L reaction) were PCR amplified using reagent concentrations and
equipment under the following thermocycling conditions (times
denoted as mm:ss): [96.degree. C. 1:00|20.times.[96.degree. C.
0:15|60.degree. C. 0:15|68.degree. C. 4:00]].
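The bracketed thermocycling notation used throughout this section (steps separated by "|", with "N.times.[...]" denoting a repeated block) maps naturally onto a nested list. The Python sketch below encodes the program given above in that form and sums the programmed hold times. The data layout and function names are assumptions made for illustration, and ramp times between temperatures are ignored.

```python
def seconds(mmss):
    """Convert an 'mm:ss' string to seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


def total_hold_seconds(program):
    """Sum hold times for a program.

    Each step is either (temperature_c, "mm:ss") or (repeats, [nested steps]).
    """
    total = 0
    for first, second in program:
        if isinstance(second, list):
            total += first * total_hold_seconds(second)  # repeated block
        else:
            total += seconds(second)  # single hold
    return total


# [96 C 1:00 | 20 x [96 C 0:15 | 60 C 0:15 | 68 C 4:00]]
UBP_INSERT_PCR = [
    (96, "1:00"),
    (20, [(96, "0:15"), (60, "0:15"), (68, "4:00")]),
]
```

For this program the hold times total 60 s + 20 x 270 s = 5460 s, or 91 min, excluding ramping.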
[0182] To assemble the UBP-containing plasmids, destination plasmid
(200-400 ng), PCR insert(s) (3:1 insert:plasmid molar ratio), T4
DNA ligase (200 U), BsaI-HF (20 U), and ATP (1 mM) were combined in
1.times.NEB CutSmart buffer (final volume 30 .mu.L) and
thermocycled under the following conditions: [37.degree. C.
20:00|40.times.[37.degree. C. 5:00|16.degree. C. 5:00|22.degree. C.
2:30] 37.degree. C. 20:00|55.degree. C. 15:00|80.degree. C. 30:00].
Following the Golden Gate reaction, T5 exonuclease (10 U) and
additional BsaI-HF (20 U) were added, and the reaction was
incubated (37.degree. C., 1 h) to digest unincorporated plasmid and
insert fragments. Assembled plasmids were quantified by Qubit.
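The 3:1 insert:plasmid molar ratio can be converted into a mass of insert once the fragment lengths are known. The Python sketch below uses the usual approximation that double-stranded DNA has the same mass per base pair regardless of sequence, so the molar ratio scales with length. The function name and the lengths in the usage note are hypothetical and are not taken from the disclosure.

```python
def insert_mass_ng(plasmid_ng, plasmid_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:plasmid molar ratio.

    Assumes equal average mass per base pair for plasmid and insert, so
    moles are proportional to mass divided by length.
    """
    if plasmid_bp <= 0 or insert_bp <= 0:
        raise ValueError("lengths must be positive")
    return molar_ratio * plasmid_ng * insert_bp / plasmid_bp
```

For instance, 300 ng of a hypothetical 2700-bp destination plasmid and a hypothetical 150-bp insert would call for 3 x 300 x 150 / 2700 = 50 ng of insert.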
Construction of Golden Gate Destination Plasmids for pCas9 and
pAIO
[0183] To create pCas9-Multi, the Golden Gate destination plasmid
for cloning sgRNA cassettes alongside Cas9, pPDAZ.sup.29 and a PCR
amplified Cas9 gene (Primers JL126 and JL128, template Addgene
plasmid #41815) were digested with KpnI and XbaI, and ligated to
create pCas9(-). This plasmid and a PCR amplified GFPT2-sgRNA
cassette (template Addgene plasmid #41820, which contains the sgRNA
sequence; the ProK promoter and terminator were introduced by PCR)
were digested with SalI and ligated to create pCas9-GFPT2. This
plasmid was then linearized with primers BL557 and BL558 (to remove
the BsmBI recognition sites within Cas9) and circularized via
Gibson Assembly. The resulting plasmid was then linearized with
primers BL559 and BL560 (to reintroduce two BsmBI sites in the
plasmid backbone), and circularized via Gibson Assembly to yield
pCas9-Multi, which was confirmed by sequencing with primers
TG1-TG6. Digestion of pCas9-Multi with BsmBI results in a
linearized plasmid with overhangs that allow for the simultaneous
cloning of one or more sgRNAs by Golden Gate Assembly (see section
below).
[0184] To create pAIO-Multi, pCas9-Multi was linearized with
primers BL731 and BL732 (to remove Cas9 and introduce BsaI
recognition sites for UBP cloning), phosphorylated, and
intramolecularly ligated, and confirmed by sequencing with primer
BL450. Digestion of pAIO-Multi with BsaI results in a linearized
plasmid with overhangs identical to the ones produced by BsaI
digestion of the pUCX2 destination plasmid, and thus PCR-generated
inserts for cloning the UBP into pUCX2 can also be used to clone
the UBP into pAIO-Multi and its derivatives. After the sgRNA
cassettes were cloned into pAIO-Multi (see next section below), the
Golden Gate Assembly protocol for cloning in a UBP was identical to
the one described above for pUCX2, except the product of pAIO-Multi
(with sgRNAs) amplified with BL731 and BL732 was used in place of
the plasmid itself.
sgRNA Cloning into pCas9 and pAIO
[0185] Dual sgRNA cassettes were cloned into pCas9-Multi or
pAIO-Multi via Golden Gate Assembly. To generate the first sgRNA
cassette of each pair, pCas9-GFPT2 (1 ng) was PCR amplified with
primers 1.sup.st sgRNA GG (200 nM) and BL562 (200 nM), and OneTaq
DNA polymerase, under the following thermocycling conditions:
[30.times.[94.degree. C. 0:30|52.degree. C. 0:15|68.degree. C.
0:30]]. PCR products were purified by agarose gel electrophoresis
and purification. The 1.sup.st sgRNA GG primer is a 70-nt primer
that possesses (from 5' to 3') a BsmBI restriction site, 10-nt of
homology with the ProK promoter, an 18-nt variable guide (spacer)
complementary to a UBP-mutation, and 25-nt of homology to the
non-variable sgRNA scaffold. To generate the second sgRNA cassette,
pCas9-GFPT2 was PCR amplified with primers BL563 and 2.sup.nd sgRNA
Rev, and primers BL566 and 2.sup.nd sgRNA Fwd, and the resulting
two products were combined and amplified by overlap extension PCR
using primers BL563 and BL566, followed by agarose gel
electrophoresis and purification.
[0186] To assemble the guide plasmids, pCas9-Multi (40 ng) or
pAIO-Multi (20 ng), purified DNA of the first sgRNA cassette (4.5
ng) and second sgRNA cassette (8 ng), T4 DNA ligase (200 U), BsmBI
(5 U), and ATP (1 mM) were combined in 1.times.NEB CutSmart
reaction buffer (final volume 20 .mu.L) and thermocycled under the
following conditions: [5.times.[37.degree. C. 6:00|16.degree. C.
8:00] 15.times.[55.degree. C. 6:00|16.degree. C. 8:00]]. Assembled
plasmids were transformed into electrocompetent cells for
subsequent sequencing and testing.
[0187] To assemble pCas9-TK1-A, a plasmid containing only one sgRNA
cassette, pCas9-GFPT2 was amplified with primers BL566 and BL567,
and the resulting product was ligated into pCas9-Multi by Golden
Gate Assembly as described above.
[0188] To assemble pCas9-hEGFP, a plasmid containing a non-target
sgRNA cassette for TK1 experiments, primers BL514 and BL515 were
annealed and ligated, by Gibson Assembly, into pCas9-GFPT2
linearized with primers BL464 and BL465.
Construction of pAIO2X
[0189] pAIO2X GG, the Golden Gate destination plasmid for pAIO2X,
is derived from three plasmids, using PCR-generated inserts and
multiple steps of cloning by restriction enzyme digest and
ligation. Inserts from pSYN36, which contains a codon-optimized
superfolder gfp with a Golden Gate entry site for cloning in
sequences that correspond to nucleotides 409-483 of gfp, and
pET-22b-ESerGG, which contains an E. coli serT gene with a Golden
Gate entry site for cloning in sequences that correspond to
nucleotides 10-65 of serT, were cloned into pAIO dual guide BsmBI,
a version of pAIO-Multi that contains two sgRNA cassettes, with the
targeting guide (spacer) sequences replaced by two orthogonal pairs
of BsmBI recognition sites that enable guide cloning using annealed
primer duplexes.
[0190] To create pAIO2X-GFP151/Eser-69 GG, annealed primer duplexes
of YZ310/YZ316 and YZ359/YZ360 were ligated into pAIO2X GG using
the same Golden Gate Assembly reagents and thermocycling conditions
used for UBP cloning, with the exception that BsaI was replaced by
BsmBI, each primer duplex was used at a 50:1 insert:plasmid molar
ratio with 30 fmol of destination plasmid, and the reaction was
scaled by one third to 10 .mu.L. Following assembly, the reaction
was not digested with additional enzymes or purified, and was
directly transformed into chemically competent E. coli DH5.alpha..
Following isolation of single plasmid clones and confirmation of
the guides by sequencing using primer BL450, the UBPs were cloned
into the plasmid by Golden Gate assembly with BsaI, as described in
the section, Golden Gate Assembly of UBP-containing plasmids.
Cas9 In Vitro Cleavage Assay
[0191] To generate the DNA substrates for in vitro Cas9 cleavage
assays, templates BL408, BL409, BL410, BL487, BL488, and BL489 (1
ng per 50 .mu.L reaction) were PCR amplified with primers BL415
(400 nM) and BL416 (400 nM), and OneTaq DNA polymerase in 1.times.
OneTaq standard reaction buffer supplemented with dNaMTP (100
.mu.M), dTPT3TP (100 .mu.M), and MgCl.sub.2 (1.5 mM), under the
following thermocycling conditions: [25.times.[95.degree. C.
0:15|56.degree. C. 0:15|68.degree. C. 1:30]].
[0192] To generate the DNA templates for in vitro transcription of
sgRNAs, templates BL318, BL484, BL485, and BL486 (1 ng per 50 .mu.L
reaction), which contain the T7 promoter and a CRISPR RNA (crRNA)
spacer sequence, were PCR amplified with primers BL472 (200 nM) and
BL473 (200 nM), and OneTaq DNA polymerase in 1.times. OneTaq
standard reaction buffer supplemented with MgCl.sub.2 (6 mM), under
the following thermocycling conditions: [20.times.[95.degree. C.
0:15|60.degree. C. 0:15|68.degree. C. 1:30]]. DNA from this first
PCR reaction (0.5 .mu.L) was then transferred into a second PCR
reaction (100 .mu.L) containing primers BL472 (400 nM), BL439 (500
nM), and BL440 (600 nM), and thermocycled under the following
conditions: [4.times.[95.degree. C. 0:15|68.degree. C.
0:15|68.degree. C. 1:30] 20.times.[95.degree. C. 0:15|60.degree. C.
0:15|68.degree. C. 1:30]]. In vitro transcription of the PCR
products with T7 RNA polymerase was performed, and transcribed
sgRNAs were purified by PAGE, band excision, and extraction
(37.degree. C., overnight) into an aqueous solution of NaCl (200
mM) and EDTA (1 mM, pH 7), followed by concentration and
purification by ethanol precipitation.
[0193] For in vitro cleavage reactions, Cas9 nuclease (125 nM) was
incubated with each transcribed sgRNA (125 nM) in 1.times.Cas9
nuclease reaction buffer for 5 min, then DNA substrate was added
and the reaction was incubated (37.degree. C., 10 min). The
reaction was quenched with SDS-PAGE loading buffer (62 mM Tris-HCl,
2.5% SDS, 0.002% bromophenol blue, 0.7 M .beta.-mercaptoethanol,
and 10% glycerol), heat denatured (95.degree. C., 10 min), and then
loaded onto an SDS-PAGE gel. The resulting cleavage bands were
quantified by densitometric analysis using ImageJ.sup.31. For each
sgRNA, raw cleavage efficiencies were divided by the maximum
cleavage observed for that sgRNA across the set of the six DNA
substrates, to account for differences in sgRNA activity and/or
minor variations in preparation. Experiments were performed in
technical triplicate, and reported values are the average of three in
vitro cleavage reactions performed in parallel.
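The per-sgRNA normalization described above amounts to dividing each raw cleavage efficiency by the largest value observed for that sgRNA. A minimal Python sketch follows; the function name, substrate labels, and values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_cleavage(raw_by_substrate):
    """Scale one sgRNA's raw cleavage efficiencies to its own maximum.

    raw_by_substrate maps a substrate label to the raw cleaved fraction
    measured for a single sgRNA.
    """
    peak = max(raw_by_substrate.values())
    if peak <= 0:
        raise ValueError("no cleavage observed; cannot normalize")
    return {name: value / peak for name, value in raw_by_substrate.items()}
```

A hypothetical sgRNA with raw efficiencies of 0.8, 0.4, and 0.2 on three substrates would be reported as 1.0, 0.5, and 0.25, which makes sgRNAs of different intrinsic activity comparable.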
In Vivo Plasmid Replication Experiments
[0194] Electrocompetent YZ3 cells were prepared by overnight growth
in .about.5 mL of media supplemented with chloramphenicol, dilution
to OD.sub.600 of 0.02 in the same media (variable volumes,
.about.10 mL of media per transformation), and growth to OD.sub.600
of .about.0.3-0.4. Cells were then rapidly chilled in an ice water
bath with shaking, pelleted (2500.times.g, 10 min), and washed
twice with one culture volume of ice-cold ddH.sub.2O.
Electrocompetent cells were then resuspended in ice-cold ddH.sub.2O
(50 .mu.L per transformation), mixed with a Golden Gate assembled
plasmid (.about.1 .mu.L, .about.1 ng) containing the UBP, and
transferred to a pre-chilled 0.2 cm gap electroporation cuvette.
Cells were electroporated (Gene Pulser II, Bio-Rad) according to
the manufacturer's recommendations (voltage 2.5 kV, capacitor 25
.mu.F, resistor 200.OMEGA.) then immediately diluted with 950 .mu.L
of pre-warmed media supplemented with chloramphenicol. An aliquot
(10-40 .mu.L) of this dilution was then immediately diluted 5-fold
with the same pre-warmed media, but additionally supplemented with
dNaMTP (250 .mu.M) and d5SICSTP (250 .mu.M). The samples were
incubated (37.degree. C., 1 h) and then .about.15% of the sample
was used to inoculate media (final volume 250-300 .mu.L)
supplemented with chloramphenicol, carbenicillin, dNaMTP (250
.mu.M) and d5SICSTP (250 .mu.M). Cells were then monitored for
growth, collected at the density (OD.sub.600) indicated in the main
text, and subjected to plasmid isolation. Dilutions of the recovery
mixture were also spread onto solid media with chloramphenicol and
carbenicillin to ascertain transformation efficiencies. Experiments
with dNaMTP (150 .mu.M) and dTPT3TP (37.5 .mu.M) were performed
analogously.
[0195] Experiments with DM1 were performed analogously using media
supplemented with streptomycin, with the additional step of
inducing transporter expression with IPTG (1 mM, 1 h) prior to
pelleting the cells. All media following electrocompetent cell
preparation was also supplemented with streptomycin and IPTG (1 mM)
to maintain expression of the transporter.
In Vivo Plasmid Replication Experiments with Cas9 (Liquid Culture
Only)
[0196] Electrocompetent YZ2 cells were transformed with various
pCas9 guide plasmids and single clones were used to inoculate
overnight cultures. Cells were then grown, prepared and
electroporated as described above for YZ3, with the following
modifications: all media was additionally supplemented with zeocin
(to select for pCas9) and 0.2% glucose, electrocompetent cells were
stored in 10% (v/v ddH.sub.2O) DMSO at -80.degree. C. until use,
and recovery and growth media were supplemented with dNaMTP (250
.mu.M) and dTPT3TP (75 .mu.M). Varying concentrations of IPTG
(0-100 .mu.M) were added to the growth media (but not the recovery
media) to induce Cas9 expression. The sgRNAs corresponding to the
d(AXT) sequence are the non-target guides for all sequences except
for the d(AXT)-containing sequence itself, the non-target guides
for which correspond to the d(GXT) sequence. All experiments
with non-target sgRNAs were conducted with the addition of IPTG (10
.mu.M) to the growth media. For growth and regrowth experiments,
cells were grown to an OD.sub.600 of 3.5-4.0, then diluted 1:250
and regrown to an OD.sub.600 of 3.5-4.0, after which plasmids were
isolated.
In Vivo Plasmid Replication Experiments with Cas9 (Plating and
Liquid Culture)
[0197] Electrocompetent YZ4 cells were grown, prepared and
electroporated as described above for YZ2, with the following
modifications: media for growing cells prior to electroporation
only contained chloramphenicol (i.e. no zeocin), zeocin was used to
select for pAIO (i.e. no carbenicillin), and recovery and growth
media were supplemented with dNaMTP (150 .mu.M) and dTPT3TP (37.5
.mu.M). Following transformation with pAIO, dilutions of the
recovery mixture were spread onto solid media containing
chloramphenicol, zeocin, dNaMTP (150 .mu.M), dTPT3TP (37.5 .mu.M),
0.2% glucose, and various concentrations of IPTG (0-50 .mu.M).
Following overnight growth (37.degree. C., .about.14 h), individual
colonies were used to inoculate liquid media of the same
composition as the solid media. Experiments performed with pAIO2X
were conducted as described above for YZ4 without using frozen
electrocompetent cells or glucose. The second plating depicted in
FIG. 4 was performed by streaking cells from liquid culture onto
solid media of the same composition as the liquid media, and growth
at 37.degree. C. (.about.14 h). Six random colonies were selected
to continue propagation in liquid culture.
Cell Doubling Calculation
[0198] Cell doublings for liquid culture growth-dilution-regrowth
experiments were calculated by log.sub.2 of the dilution factor
(30,000 or 300,000) between growths, except for growths inoculated
from a plated colony, the cell doublings for which were calculated
by averaging, for each individual clone, the time from inoculation
to target OD.sub.600 (9.4.+-.1.1 h (1 SD) for the first plating
inoculation, 10.2.+-.3 h for the second plating inoculation) and
dividing these averages by an estimated doubling time of 40 min.
Growth times varied for each clone because colonies were isolated
when they were barely visible to the naked eye, and thus it was not
attempted to control for variability in the number of cells
inoculated into the liquid cultures. Note that the reported number of
cell doublings was only an estimate of doublings in liquid culture,
which underreported the total number of cell doublings, as it was
not attempted to estimate the number of cell doublings that
occurred during each of the growths on solid media.
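The two ways of estimating cell doublings described above can be sketched as follows. The sketch assumes the dilution-based count uses a base-2 logarithm, as the definition of a doubling implies, and takes the 40 min doubling time from the text as a default. The function names are hypothetical.

```python
import math


def doublings_from_dilution(dilution_factor):
    """Doublings needed to regrow to the same density after a dilution."""
    return math.log2(dilution_factor)


def doublings_from_growth_time(hours, doubling_time_min=40.0):
    """Doublings estimated from elapsed growth time and a fixed doubling time."""
    return hours * 60.0 / doubling_time_min
```

Under these assumptions a 30,000-fold dilution corresponds to roughly 14.9 doublings, and 9.4 h of growth at 40 min per doubling corresponds to 14.1 doublings.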
Biotin Shift Assay
[0199] The retention of the UBP(s) in isolated plasmids was
determined and validated as follows: plasmid minipreps or Golden
Gate assembled plasmids (0.5 .mu.L, 0.5-5 ng/.mu.L), or dNaM-containing
oligonucleotides (0.5 fmol), were PCR amplified with dNTPs (400
.mu.M), 1.times. SYBR Green, MgSO.sub.4 (2.2 mM), primers (10 nM
each), d5SICSTP (65 .mu.M), dMMO2.sup.BioTP (65 .mu.M), OneTaq DNA
polymerase (0.018 U/.mu.L), and DeepVent DNA polymerase (0.007
U/.mu.L) in 1.times. OneTaq standard reaction buffer (final volume
15 .mu.L), under the following thermocycling conditions:
[20.times.[95.degree. C. 0:15|x.degree. C. 0:15|68.degree. C.
4:00]]; see Table 2 for a list of primers and their corresponding
annealing temperatures used in this assay. After amplification, 1
.mu.L of each reaction was mixed with streptavidin (2.5 .mu.L, 2
.mu.g/.mu.L, Promega) and briefly incubated at 37.degree. C. After
incubation, samples were mixed with loading buffer and run on a 6%
polyacrylamide (29:1 acrylamide:bis-acrylamide) TBE gel, at 120 V
for .about.30 min. Gels were then stained with 1.times.SYBR Gold
dye (Thermo Fisher) and imaged using a Molecular Imager Gel Doc XR+
(Bio-Rad) equipped with a 520DF30 filter (Bio-Rad).
Calculation of UBP Retention
[0200] UBP retention was assessed by densitometric analysis of the
gels (ImageJ or Image Studio Lite, LI-COR) from the biotin shift
assay and calculation of a percent raw shift, which equals the
intensity of the streptavidin-shifted band divided by the sum of
the intensities of the shifted and unshifted bands. See FIG. 7 for
representative gels. Reported UBP retentions are normalized
values.
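The percent raw shift defined above is a simple ratio of band intensities. A minimal Python sketch, with a hypothetical function name and illustrative intensities in arbitrary densitometry units:

```python
def percent_raw_shift(shifted, unshifted):
    """Percent of total band intensity found in the streptavidin-shifted band."""
    total = shifted + unshifted
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * shifted / total
```

For example, a lane with a shifted band of intensity 90 and an unshifted band of intensity 10 has a raw shift of 90%.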
[0201] Unless otherwise indicated, for experiments not involving
plating on solid media, UBP retention was normalized by dividing
the percent raw shift of each propagated plasmid sample by the
percent raw shift of the Golden Gate assembled input plasmid. It
was assumed that the starting UBP content of the cellular plasmid
population was equivalent to the UBP content of the input plasmid,
based on direct inoculation of the transformation into liquid
culture. Thus, in these experiments, normalized UBP retention was a
relative value that related the UBP content of the propagated
plasmid population to the UBP content of the starting population,
which was not 100% due to loss during the PCR used to generate the
insert for input plasmid assembly (FIG. 7).
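Normalization of the raw shift to a reference can be sketched as below. The function name is hypothetical, and the reference value is whichever raw shift the experiment is normalized to (here, that of the Golden Gate assembled input plasmid).

```python
def normalized_retention(sample_raw_shift, reference_raw_shift):
    """UBP retention (%) of a propagated sample relative to a reference raw shift."""
    if reference_raw_shift <= 0:
        raise ValueError("reference raw shift must be positive")
    return 100.0 * sample_raw_shift / reference_raw_shift
```

As an illustration with made-up numbers, a propagated plasmid with an 81% raw shift and an input plasmid with a 90% raw shift give a normalized retention of 90%, so the value expresses loss during propagation rather than loss during insert PCR.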
[0202] For experiments involving plating on solid media, UBP
retention was normalized by dividing the percent raw shift of each
propagated plasmid sample by the percent raw shift of the
dNaM-containing oligonucleotide template used in the assembly of
the input plasmid. Plating enabled clonal isolation of
UBP-containing plasmids from fully natural plasmids that arose
during plasmid construction (some of which may contain sequences
that were not recognized by the sgRNA(s) employed). Because there
was no PCR-mediated loss of the UBP in the oligonucleotide
template, normalization to the oligonucleotide template was a
better indicator of absolute UBP retention than normalization to
the input plasmid. Under the conditions used in the biotin shift
assay, most oligonucleotide templates and sequence contexts gave
>90% raw shift, with <2% shift for a cognate fully natural
template (i.e. UBP misincorporation during the biotin shift assay
was negligible).
[0203] Plating allowed for the differentiation of UBP loss
that occurred in vivo from loss that occurred in vitro, with the
exception of clonally-derived samples that gave <2% shift, for
which it was not possible to determine whether the UBP was
completely lost in vivo or the sample came from a transformant
that originally received a fully natural plasmid. Such samples were
excluded from reported average values when other samples from the
same transformation gave higher shifts.
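The exclusion rule above can be expressed as a small filter applied before averaging. The Python sketch below is one possible reading of that rule, operating on the raw shifts of clones from a single transformation; the function name and the treatment of the case in which every clone falls below the cutoff (all values are kept) are assumptions.

```python
def average_shift(raw_shifts, cutoff=2.0):
    """Average raw shifts (%) from one transformation.

    Clones below the cutoff are dropped only when at least one other clone
    from the same transformation is at or above it.
    """
    if not raw_shifts:
        raise ValueError("no samples to average")
    informative = [s for s in raw_shifts if s >= cutoff]
    kept = informative if informative else list(raw_shifts)
    return sum(kept) / len(kept)
```

With illustrative raw shifts of 90%, 1%, and 80%, the 1% clone is excluded and the average is 85%; if all clones were below 2%, none would be excluded.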
Biotin Shift Depletion and In Vivo Mutation Analysis
[0204] To determine the mutational spectrum of the UBP in isolated
plasmid samples, biotin shift assays were performed as described
above. Non-shifted bands, which corresponded to natural mutations
of the UBP-containing sequences, were excised and extracted
(37.degree. C., overnight) into a minimal amount of an aqueous
solution of NaCl (200 mM) and EDTA (1 mM, pH 7), followed by
concentration and purification by ethanol precipitation. A sample
of extract (1 .mu.L) was PCR amplified under standard conditions
(natural dNTPs only), with OneTaq DNA polymerase and the same
primers used for the biotin shift PCR, and the resulting products
were sequenced by Sanger sequencing.
Functional Characterization of a Mutant PtNTT2 Transporter
[0205] Expression of the nucleoside triphosphate transporter from
Phaeodactylum tricornutum (PtNTT2) in E. coli enabled the import of
dNaMTP and d5SICSTP and the subsequent replication of the
dNaM-d5SICS UBP (FIG. 1A), but its expression was also toxic (FIG.
1B). In the SSO referred to herein as DM1, the transporter was
expressed from a T7 promoter on a multicopy plasmid (pCDF-1b) in E.
coli C41(DE3), and its induction was controlled due to the
associated toxicity. In its native algal cell, PtNTT2's N-terminal
signal sequences direct its subcellular localization and are
removed by proteolysis. In some cases in the E. coli system, the
N-terminal signal was retained, and contributed to the observed
toxicity. Removal of amino acids 1-65 and expression of the
resulting N-terminally truncated variant PtNTT2(66-575) in E. coli
C41(DE3) resulted in lower toxicity relative to the full length
PtNTT2, but also reduced uptake of [.alpha.-.sup.32P]-dATP (FIG. 5A
and FIG. 5B), possibly due to reduced expression. Expression of
PtNTT2(66-575) in E. coli BL21(DE3) resulted in increased levels of
[.alpha.-.sup.32P]-dATP uptake with little increase in toxicity
relative to an empty vector control (FIG. 5A and FIG. 5C), but the
higher level of T7 RNAP in this strain was itself toxic (FIG. 5A
and FIG. 5C).
[0206] Constitutive expression of PtNTT2(66-575) from a low copy
plasmid or a chromosomal locus was explored, to eliminate the need
to produce toxic levels of T7 RNAP, and to impart the SSO with
greater autonomy, more homogeneous transporter expression and
triphosphate uptake across a population of cells, and ultimately
improve UBP retention. Expression of PtNTT2(66-575) in E. coli
BL21(DE3) was explored with the E. coli promoters P.sub.lacI,
P.sub.bla, and P.sub.lac from a pSC plasmid, and with P.sub.bla,
P.sub.lac, P.sub.lacUV5, P.sub.H207, P.sub..lamda., P.sub.tac, and
P.sub.N25 from the chromosomal lacZYA locus (see Table 2). The use
of a codon-optimized variant of the truncated transporter was also
explored (see Table 2). Although uptake of [.alpha.-.sup.32P]-dATP
was negatively correlated with doubling time, each strain exhibited
an improved ratio of uptake to fitness compared to DM1 (FIG. 1B).
Strain YZ3, which expressed the codon-optimized, chromosomally
integrated PtNTT2(66-575) from the P.sub.lacUV5 promoter, exhibited
both robust growth (<20% increased doubling time relative to the
isogenic strain without the transporter), and reasonable levels of
[.alpha.-.sup.32P]-dATP uptake, and was selected for further
characterization.
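The two growth metrics referred to above can be sketched as follows. The percent increase in doubling time is computed relative to the isogenic strain without the transporter. The "ratio of uptake to fitness" is not defined in the text, so the definition used here (uptake divided by relative doubling time) is an assumption made only for illustration, and all numeric values are hypothetical.

```python
def doubling_time_increase(dt_strain: float, dt_reference: float) -> float:
    """Percent increase in doubling time relative to a reference strain."""
    if dt_reference <= 0:
        raise ValueError("reference doubling time must be positive")
    return 100.0 * (dt_strain - dt_reference) / dt_reference


def uptake_to_fitness(uptake: float, dt_strain: float, dt_reference: float) -> float:
    """Uptake divided by relative doubling time (assumed definition).

    A relative doubling time of 1.0 means no growth penalty, so the ratio
    equals the raw uptake; slower growth lowers the ratio.
    """
    if dt_strain <= 0 or dt_reference <= 0:
        raise ValueError("doubling times must be positive")
    return uptake / (dt_strain / dt_reference)


# Hypothetical numbers: reference doubles in 30 min, transporter strain in 36 min.
print(doubling_time_increase(36.0, 30.0))    # percent increase in doubling time
print(uptake_to_fitness(100.0, 36.0, 30.0))  # arbitrary uptake units
```

Under these assumed values the strain would sit exactly at a 20% increase in doubling time, which is the threshold below which YZ3 is described as growing robustly.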
[0207] To determine whether the optimized transporter system of YZ3
facilitates high UBP retention, three plasmids that position the
UBP within the 75-nt TK1 sequence were constructed (with a local
sequence context of d(A-NaM-T)). These include two high copy
pUC19-derived plasmids, pUCX1 and pUCX2, as well as one low copy
pBR322-derived plasmid, pBRX2 (FIG. 6). In addition to allowing the
effect of copy number on UBP retention to be examined, these plasmids positioned
the UBP at proximal (pUCX1) and distal (pUCX2 and pBRX2) positions
relative to the origin of replication. E. coli YZ3 and DM1 were
transformed with pUCX1, pUCX2, or pBRX2 and directly cultured in
liquid growth media supplemented with dNaMTP and d5SICSTP (and IPTG
for DM1 to induce the transporter), and growth and UBP retention
were characterized (at an OD.sub.600 of .about.1) (see Methods and
FIG. 7A). While DM1 showed variable levels of retention and reduced
growth with the high copy plasmids, YZ3 showed uniformly high
levels of UBP retention and robust growth (FIG. 2A and FIG.
8A).
[0208] To explore the effect of local sequence context on UBP
retention in YZ3, sixteen pUCX2 variants were constructed in which
the UBP was flanked by each possible combination of natural base
pairs within a fragment of gfp (see Table 2). Under the same growth
conditions as above, a wide range of UBP retentions was observed,
with some sequence contexts showing complete loss of the UBP (FIG.
2B). However, since the development of DM1 with the dNaM-d5SICS
UBP, it was determined that ring contraction and sulfur
derivatization of d5SICS, yielding the dNaM-dTPT3 UBP (FIG. 1A),
resulted in more efficient replication in vitro. To explore the in
vivo use of dNaM-dTPT3, the experiments were repeated with YZ3 and
each of the sixteen pUCX2 plasmids but with growth in media
supplemented with dNaMTP and dTPT3TP. UBP retentions were clearly
higher with dNaM-dTPT3 than with dNaM-d5SICS (FIG. 2B).
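The sixteen local sequence contexts follow from placing each of the four natural nucleotides on either side of the unnatural nucleotide (4 x 4 = 16). A minimal sketch of this enumeration is given below, using the d(A-NaM-T) notation of the text. The function name `ubp_contexts` is hypothetical.

```python
from itertools import product


def ubp_contexts(unnatural: str = "NaM", bases: str = "ACGT") -> list[str]:
    """List every context d(5'-flank - unnatural - 3'-flank) with natural flanks."""
    return [
        f"d({five}-{unnatural}-{three})"
        for five, three in product(bases, repeat=2)
    ]


contexts = ubp_contexts()
print(len(contexts))  # number of distinct flanking combinations
print(contexts[:4])   # the four contexts with a 5' dA flank
```

The context d(A-NaM-T) used in the TK1 plasmids is one member of this list.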
[0209] While dNaM-dTPT3 is a more optimal UBP for the SSO than
dNaM-d5SICS, its retention is still moderate to poor in some
sequences. Moreover, several sequences that show good retention in
YZ3 cultured in liquid media show poor retention when growth
includes culturing on solid media (FIG. 8B). To further increase
UBP retention with even these challenging sequences and/or growth
conditions, plasmids that lose the UBP were selectively
eliminated. In prokaryotes, the clustered regularly interspaced
short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system
provides adaptive immunity against viruses and plasmids. In type II
CRISPR-Cas systems, such as that from Streptococcus pyogenes, the
endonuclease Cas9 utilizes encoded RNAs (or their artificial mimics
known as single-guide RNAs (sgRNAs)) to introduce double-strand
breaks into complementary DNA upstream of a 5'-NGG-3' protospacer
adjacent motif (PAM) (FIG. 3A), which then results in DNA
degradation by exonucleases. In vitro, it was found that the
presence of a UBP in the target DNA generally reduces Cas9-mediated
cleavage relative to sequences that are fully complementary to the
provided sgRNA (FIG. 9). In some instances within a cell, Cas9
programmed with sgRNA(s) complementary to natural sequences that
arise from UBP loss would enforce retention in a population of
plasmids, which is referred to as immunity to UBP loss. To test this,
a p15A plasmid was used to construct pCas9, which expresses Cas9
via an IPTG-inducible LacO promoter, as well as an 18-nt sgRNA that
is complementary to the TK1 sequence containing the most common
dNaM-dTPT3 mutation (dT-dA) via the constitutive ProK promoter
(FIGS. 6 and 10A). Strain YZ2 (a forerunner of YZ3 with slightly
less optimal transporter performance; FIG. 1A, FIG. 5D, and FIG.
5F) carrying the pCas9 plasmid was transformed with the
corresponding pUCX2 plasmid (i.e. the pUCX2 variant with the UBP
embedded within the TK1 sequence such that loss of the UBP produces
a sequence targeted by the sgRNA encoded on pCas9), grown to an
OD.sub.600 of .about.4, diluted 250-fold, and regrown to the same
OD.sub.600. UBP retention in control experiments with a non-target
sgRNA dropped to 17% after the second outgrowth; in contrast, UBP
retention in the presence of the correct sgRNA was 70% (FIG. 10B).
Sequencing revealed that the majority of plasmids lacking a UBP
when the correct sgRNA was provided contained a single nucleotide
deletion in its place, which was not observed with the non-target
sgRNA (FIG. 10C and FIG. 10D). With a pCas9 plasmid that expresses
two sgRNAs, one targeting the most common substitution mutation and
one targeting the single nucleotide deletion mutation (FIG. 6), and
the same growth and regrowth assay, loss of the UBP was
undetectable (FIG. 10B).
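The targeting rule described above, in which Cas9 cleaves DNA matching the guide immediately upstream of a 5'-NGG-3' PAM, can be sketched as a simple scan for candidate target sites. This is only an illustration: it scans the given strand alone (the opposite strand would require the reverse complement), uses the 18-nt guide length mentioned for pCas9 as a default, and applies no scoring of any kind. The function name `find_protospacers` and the example sequence are hypothetical.

```python
import re


def find_protospacers(seq: str, spacer_len: int = 18) -> list[tuple[int, str, str]]:
    """Return (start, spacer, pam) for each window lying directly 5' of an NGG PAM.

    Only the strand given in `seq` is scanned. A lookahead is used so that
    overlapping PAMs (for example within GGG) are all reported.
    """
    seq = seq.upper()
    hits = []
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= spacer_len:  # need a full-length spacer upstream of the PAM
            start = pam_start - spacer_len
            hits.append((start, seq[start:pam_start], match.group(1)))
    return hits


# Hypothetical 22-nt sequence: an 18-nt spacer, a TGG PAM, and one trailing base.
example = "ACGTACGTACGTACGTACTGGA"
for start, spacer, pam in find_protospacers(example):
    print(start, spacer, pam)
```

In the scheme of the text, the guide would be chosen to match the natural sequence produced by UBP loss rather than the UBP-containing sequence itself.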
[0210] To more broadly explore Cas9-mediated immunity to UBP loss,
retention was examined using sixteen pUCX2 variants with sequences
that flank the UBP with each possible combination of natural base
pairs, but also vary its position relative to the PAM, and vary
which unnatural nucleotide is present in the strand recognized by
the sgRNAs (FIG. 11). A corresponding set of sixteen pCas9 plasmids
was also constructed that express two sgRNAs, one targeting a
substitution mutation and one targeting the single nucleotide
deletion mutation, for each pUCX2 variant. Strain YZ2 carrying a
pCas9 plasmid was transformed with its corresponding pUCX2 variant
and grown in the presence of the unnatural triphosphates and IPTG
(to induce Cas9), and UBP retention was assessed after cells
reached an OD.sub.600 of .about.1. As a control, the sixteen pUCX2
plasmids were also propagated in YZ2 carrying a pCas9 plasmid with
a non-target sgRNA. For four of the sixteen sequences explored, UBP
loss was already minimal without immunity (non-target sgRNA), but
was undetectable with expression of the correct sgRNA (FIG. 3B).
The remaining sequences showed moderate to no retention without
immunity, and significantly higher retention with it, including at
positions up to 15 nts from the PAM.
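The two classes of UBP-loss products targeted by the paired sgRNAs, a substitution of the unnatural nucleotide by a natural one and a single nucleotide deletion, can be sketched as below. The marker `X` for the unnatural nucleotide is an assumption chosen to resemble the labels used in Table 2, and the function name `loss_products` and the toy template are hypothetical.

```python
NATURAL_BASES = "ACGT"


def loss_products(template: str, marker: str = "X") -> tuple[dict[str, str], str]:
    """Return the natural sequences that could arise when the UBP is lost.

    `template` must contain exactly one `marker` at the unnatural position.
    The result is (substitutions, deletion): one sequence per natural base
    replacing the marker, plus the sequence with the marker removed.
    """
    if template.count(marker) != 1:
        raise ValueError("template must contain exactly one unnatural position")
    i = template.index(marker)
    substitutions = {
        base: template[:i] + base + template[i + 1:] for base in NATURAL_BASES
    }
    deletion = template[:i] + template[i + 1:]
    return substitutions, deletion


# Toy template with the unnatural nucleotide flanked by A and T.
subs, deletion = loss_products("GGAXTCC")
print(subs["T"])  # substitution product with dT at the unnatural position
print(deletion)   # single nucleotide deletion product
```

A guide matching one substitution product and a second guide matching the deletion product would correspond to the two-sgRNA arrangement described for each pUCX2 variant.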
[0211] To further simplify and streamline the SSO, strain YZ4 was
constructed by integrating an IPTG-inducible Cas9 gene at the arsB
locus of the YZ3 chromosome, which allows for the use of a single
plasmid that both carries a UBP and expresses the sgRNAs that
enforce its retention. Sixteen such "all in one" plasmids (pAIO)
were constructed by replacing the Cas9 gene in each of the pCas9
variants with a UBP sequence from the corresponding pUCX2 variant
(Extended Data FIG. 2). YZ4 and YZ3 (included as a no Cas9 control
due to leaky expression of Cas9 in YZ4) were transformed with a
single pAIO plasmid and cultured on solid growth media supplemented
with the unnatural triphosphates and with or without IPTG to induce
Cas9. Single colonies were used to inoculate liquid media of the
same composition, and UBP retention was assessed after cells
reached an OD.sub.600 of .about.1-2 (FIG. 3C). Despite variable
levels of retention in the absence of Cas9 (YZ3), with induction of
Cas9 expression in YZ4, loss was minimal to undetectable in 13 of
the 16 sequences. While retention with the three problematic
sequences, d(C-NaM-C), d(C-NaM-A), and d(C-NaM-G), might be
optimized, for example, through alterations in Cas9 or sgRNA
expression, the undetectable loss of the UBP with the majority of
the sequences after a regimen that included growth both on solid
and in liquid media, which was not possible with our previous SSO
DM1, attests to the vitality of YZ4.
[0212] Finally, a pAIO plasmid, pAIO2X, was constructed containing
two UBPs: dNaM paired opposite dTPT3 at position 451 of the sense
strand of the gfp gene and dTPT3 paired opposite dNaM at position
35 of the sense strand of the serT tRNA gene, as well as encoding
the sgRNAs targeting the most common substitution mutation expected
in each sequence (FIG. 6). YZ4 and YZ3 were transformed with pAIO2X
and subjected to the challenging growth regime depicted in FIG. 4,
which included extensive high-density growth in liquid and on solid
growth media. Plasmids were recovered and analyzed for UBP
retention (FIG. 7B) when the OD.sub.600 reached 1-2 during each
liquid outgrowth. In YZ3, which does not express Cas9, or in the
absence of Cas9 induction (no IPTG) in YZ4, UBP retention steadily
declined with extended growth (FIG. 4). With induction of immunity
(20 or 40 .mu.M IPTG), only a marginal reduction in growth rate (less
than 14% increase in doubling time; FIG. 12) was observed, and UBP
retention was about 100% (no detectable loss) in both genes.
Table 1 illustrates sequences described herein.
TABLE-US-00001 SEQ ID Sequence NO: PtNTT2 MRPYPTIALI SVFLSAATRI
SATSSHQASA LPVKKGTHVP 1 (full length) DSPKLSKLYI
MAKTKSVSSSFDPPRGGSTV APTTPLATGG ALRKVRQAVF PIYGNQEVTK FLLIGSIKFF
IILALTLTRD TKDTLIVTQC GAEAIAFLKI YGVLPAATAF IALYSKMSNA MGKKMLFYST
CIPFFTFFGLFDVFIYPNAE RLHPSLEAVQ AILPGGAASG GMAVLAKIAT HWTSALFYVM
AEIYSSVSVG LLFWQFANDV VNVDQAKRFY PLFAQMSGLAPVLAGQYVVR FASKAVNFEA
SMHRLTAAVTFAGIMICIFY QLSSSYVERT ESAKPAADNE QSIKPKKKKP KMSMVESGKF
LASSQYLRLI AMLVLGYGLS INFTEIMWKS LVKKQYPDPL DYQRFMGNFS SAVGLSTCIV
IFFGVHVIRLLGWKVGALAT PGIMAILALP FFACILLGLD SPARLEIAVI FGTIQSLLSK
TSKYALFDPT TQMAYIPLDD ESKVKGKAAI DVLGSRIGKS GGSLIQQGLV FVFGNIINAA
PVVGVVYYSVLVAWMSAAGR LSGLFQAQTE MDKADKMEAK TNKEK PtNTT2 (66-575)*
ATGGGTGGTAGCACCGTTGCACCGACCACACCGCTGGCAAC 2 DNA (codon
CGGTGGTGCACTGCGTAAAGTTCGTCAGGCAGTTTTTCCGA optimized)
TTTATGGCAATCAAGAAGTGACCAAATTTCTGCTGATTGGC (*66-575 denotes
AGCATCAAATTCTTTATTATCCTGGCACTGACCCTGACCCG that DNA encoding
TGATACCAAAGATACCCTGATTGTTACCCAGTGTGGTGCAG the first 65 amino
AAGCAATTGCATTTCTGAAAATCTATGGTGTTCTGCCTGCA acid residues have
GCAACCGCATTTATTGCACTGTATAGCAAAATGAGCAACGC been deleted
AATGGGCAAAAAAATGCTGTTTTATAGCACCTGTATCCCGT relative to the full
TCTTTACCTTTTTTGGTCTGTTCGATGTGTTCATTTATCCG length PtNTT2)
AATGCCGAACGTCTGCATCCGAGCCTGGAAGCAGTTCAGGC
AATTCTGCCTGGTGGTGCCGCAAGCGGTGGTATGGCAGTTC
TGGCAAAAATTGCAACCCATTGGACCAGCGCACTGTTTTAT
GTTATGGCAGAAATCTATAGCAGCGTTAGCGTTGGTCTGCT
GTTTTGGCAGTTTGCAAATGATGTTGTTAATGTGGATCAGG
CCAAACGTTTTTATCCGCTGTTTGCACAGATGAGCGGTCTG
GCACCGGTTCTGGCAGGTCAGTATGTTGTTCGTTTTGCAAG
CAAAGCCGTTAATTTTGAAGCAAGCATGCATCGTCTGACCG
CAGCAGTTACCTTTGCAGGTATTATGATCTGCATCTTTTAT
CAGCTGAGCAGCTCATATGTTGAACGTACCGAAAGCGCAAA
ACCGGCAGCAGATAATGAACAGAGCATTAAACCGAAGAAAA
AAAAACCGAAAATGTCGATGGTGGAAAGCGGTAAATTTCTG
GCAAGCAGCCAGTATCTGCGTCTGATTGCAATGCTGGTTCT
GGGTTATGGTCTGAGCATTAACTTTACCGAAATCATGTGGA
AAAGCCTGGTGAAAAAACAGTATCCGGATCCGCTGGATTAT
CAGCGTTTTATGGGTAATTTTAGCAGCGCAGTTGGTCTGAG
TACCTGCATTGTTATCTTTTTTGGCGTGCATGTTATTCGTC
TGCTGGGTTGGAAAGTTGGTGCCCTGGCAACACCGGGTATT
ATGGCCATTCTGGCACTGCCGTTTTTTGCATGTATTCTGCT
GGGCCTGGATAGTCCGGCACGTCTGGAAATTGCAGTTATTT
TTGGCACCATTCAGAGCCTGCTGAGCAAAACCAGCAAATAT
GCACTGTTTGATCCGACCACCCAGATGGCATATATCCCGCT
GGATGATGAAAGCAAAGTTAAAGGCAAAGCAGCCATTGATG
TTCTGGGTAGCCGTATTGGTAAATCAGGTGGTAGCCTGATT
CAGCAGGGTCTGGTTTTTGTTTTTGGCAATATTATCAATGC
CGCACCGGTTGTTGGTGTTGTGTATTATAGCGTTCTGGTTG
CATGGATGAGTGCAGCAGGTCGTCTGAGTGGTCTGTTTCAG
GCACAGACCGAAATGGATAAAGCAGATAAAATGGAAGCCAA AACCAACAAAGAAAAATGA
PtNTT2 (66-575) ATGGGAGGCAGTACTGTTGCACCAACTACACCGTTGGCAAC 3 DNA
(non codon CGGCGGTGCGCTCCGCAAAGTGCGACAAGCCGTCTTTCCCA optimized)
TCTACGGAAACCAAGAAGTCACCAAATTTCTGCTCATCGGA
TCCATTAAATTCTTTATAATCTTGGCACTCACGCTCACGCG
TGATACCAAGGACACGTTGATTGTCACGCAATGTGGTGCCG
AAGCGATTGCCTTTCTCAAAATATACGGGGTGCTACCCGCA
GCGACCGCATTTATCGCGCTCTATTCCAAAATGTCCAACGC
CATGGGCAAAAAAATGCTATTTTATTCCACTTGCATTCCTT
TCTTTACCTTTTTCGGGCTGTTTGATGTTTTCATTTACCCG
AACGCGGAGCGACTGCACCCTAGTTTGGAAGCCGTGCAGGC
AATTCTCCCGGGCGGTGCCGCATCTGGCGGCATGGCGGTTC
TGGCCAAGATTGCGACACACTGGACATCGGCCTTATTTTAC
GTCATGGCGGAAATATATTCTTCCGTATCGGTGGGGCTATT
GTTTTGGCAGTTTGCGAACGACGTCGTCAACGTGGATCAGG
CCAAGCGCTTTTATCCATTATTTGCTCAAATGAGTGGCCTC
GCTCCAGTTTTAGCGGGCCAGTATGTGGTACGGTTTGCCAG
CAAAGCGGTCAACTTTGAGGCATCCATGCATCGACTCACGG
CGGCCGTAACATTTGCTGGTATTATGATTTGCATCTTTTAC
CAACTCAGTTCGTCATATGTGGAGCGAACGGAATCAGCAAA
GCCAGCGGCAGATAACGAGCAGTCTATCAAACCGAAAAAGA
AGAAACCCAAAATGTCCATGGTTGAATCGGGGAAATTTCTC
GCGTCAAGTCAGTACCTGCGTCTAATTGCCATGCTGGTGCT
GGGATACGGCCTCAGTATTAACTTTACCGAAATCATGTGGA
AAAGCTTGGTGAAGAAACAATATCCAGACCCGCTAGATTAT
CAACGATTTATGGGTAACTTCTCGTCAGCGGTTGGTTTGAG
CACATGCATTGTTATTTTCTTCGGTGTGCACGTGATCCGTT
TGTTGGGGTGGAAAGTCGGAGCGTTGGCTACACCTGGGATC
ATGGCCATTCTAGCGTTACCCTTTTTTGCTTGCATTTTGTT
GGGTTTGGATAGTCCAGCACGATTGGAGATCGCCGTAATCT
TTGGAACAATTCAGAGTTTGCTGAGCAAAACCTCCAAGTAT
GCCCTTTTCGACCCTACCACACAAATGGCTTATATTCCTCT
GGACGACGAATCAAAGGTCAAAGGAAAAGCGGCAATTGATG
TTTTGGGATCGCGGATTGGCAAGAGTGGAGGCTCACTGATC
CAGCAGGGCTTGGTCTTTGTTTTTGGAAATATCATTAATGC
CGCACCTGTAGTAGGGGTTGTCTACTACAGTGTCCTTGTTG
CGTGGATGAGCGCAGCTGGCCGACTAAGTGGGCTTTTTCAA
GCACAAACAGAAATGGATAAGGCCGACAAAATGGAGGCAAA GACCAACAAAGAAAAGTAG
PtNTT2 (66-575) MGGSTVAPTTPLATGGALRKVRQAVFPIYGNQEVTKFLLIG 4 protein
SIKFFIILALTLTRDTKDTLIVTQCGAEMAFLKIYGVLPAA
TAFIALYSKMSNAMGKKMLFYSTCIPFFTFFGLFDVFIYPN
AERLHPSLEAVQAILPGGAASGGMAVLAKIATHWTSALFYV
MAEIYSSVSVGLLFWQFANDVVNVDQAKRFYPLFAQMSGLA
PVLAGQYVVRFASKAVNFEASMHRLTAAVTFAGIMICIFYQ
LSSSYVERTESAKPAADNEQSIKPKKKKPKMSMVESGKFLA
SSQYLRLIAMLVLGYGLSINFTEIMWKSLVKKQYPDPLDYQ
RFMGNFSSAVGLSTCIVIFFGVHVIRLLGWKVGALATPGIM
AILALPFFACILLGLDSPARLEIAVIFGTIQSLLSKTSKYA
LFDPTTQMAYIPLDDESKVKGKAAIDVLGSRIGKSGGSLIQ
QGLVFVFGNIINAAPVVGVVYYSVLVAWMSAAGRLSGLFQA QTEMDKADKMEAKTNKEK PtNTT2
(1-22, ATGAGACCATTTCCGACGATTGCCTTGATTTCGGTTTTTCT 5 66-575)*
TTCGGCGGCGACTCGCATTTCGGCAGGAGGCAGTACTGTTG DNA
CACCAACTACACCGTTGGCAACCGGCGGTGCGCTCCGCAAA (*1-22,66-575
GTGCGACAAGCCGTCTTTCCCATCTACGGAAACCAAGAAGT denotes that DNA
CACCAAATTTCTGCTCATCGGATCCATTAAATTCTTTATAA encoding amino
TCTTGGCACTCACGCTCACGCGTGATACCAAGGACACGTTG acid residues 23-65
ATTGTCACGCAATGTGGTGCCGAAGCGATTGCCTTTCTCAA have been deleted
AATATACGGGGTGCTACCCGCAGCGACCGCATTTATCGCGC relative to the full-
TCTATTCCAAAATGTCCAACGCCATGGGCAAAAAAATGCTA length PtNTT2)
TTTTATTCCACTTGCATTCCTTTCTTTACCTTTTTCGGGCT
GTTTGATGTTTTCATTTACCCGAACGCGGAGCGACTGCACC
CTAGTTTGGAAGCCGTGCAGGCAATTCTCCCGGGCGGTGCC
GCATCTGGCGGCATGGCGGTTCTGGCCAAGATTGCGACACA
CTGGACATCGGCCTTATTTTACGTCATGGCGGAAATATATT
CTTCCGTATCGGTGGGGCTATTGTTTTGGCAGTTTGCGAAC
GACGTCGTCAACGTGGATCAGGCCAAGCGCTTTTATCCATT
ATTTGCTCAAATGAGTGGCCTCGCTCCAGTTTTAGCGGGCC
AGTATGTGGTACGGTTTGCCAGCAAAGCGGTCAACTTTGAG
GCATCCATGCATCGACTCACGGCGGCCGTAACATTTGCTGG
TATTATGATTTGCATCTTTTACCAACTCAGTTCGTCATATG
TGGAGCGAACGGAATCAGCAAAGCCAGCGGCAGATAACGAG
CAGTCTATCAAACCGAAAAAGAAGAAACCCAAAATGTCCAT
GGTTGAATCGGGGAAATTTCTCGCGTCAAGTCAGTACCTGC
GTCTAATTGCCATGCTGGTGCTGGGATACGGCCTCAGTATT
AACTTTACCGAAATCATGTGGAAAAGCTTGGTGAAGAAACA
ATATCCAGACCCGCTAGATTATCAACGATTTATGGGTAACT
TCTCGTCAGCGGTTGGTTTGAGCACATGCATTGTTATTTTC
TTCGGTGTGCACGTGATCCGTTTGTTGGGGTGGAAAGTCGG
AGCGTTGGCTACACCTGGGATCATGGCCATTCTAGCGTTAC
CCTTTTTTGCTTGCATTTTGTTGGGTTTGGATAGTCCAGCA
CGATTGGAGATCGCCGTAATCTTTGGAACAATTCAGAGTTT
GCTGAGCAAAACCTCCAAGTATGCCCTTTTCGACCCTACCA
CACAAATGGCTTATATTCCTCTGGACGACGAATCAAAGGTC
AAAGGAAAAGCGGCAATTGATGTTTTGGGATCGCGGATTGG
CAAGAGTGGAGGCTCACTGATCCAGCAGGGCTTGGTCTTTG
TTTTTGGAAATATCATTAATGCCGCACCTGTAGTAGGGGTT
GTCTACTACAGTGTCCTTGTTGCGTGGATGAGCGCAGCTGG
CCGACTAAGTGGGCTTTTTCAAGCACAAACAGAAATGGATA
AGGCCGACAAAATGGAGGCAAAGACCAACAAAGAAAAGTAG PtNTT2 (1-22,
MRPFPTIALISVFLSAATRISAGGSTVAPTTPLATGGALRK 6 66-575)
VRQAVFPIYGNQEVTKFLLIGSIKFFIILALTLTRDTKDTL protein
IVTQCGAEATAFLKIYGVLPAATAFIALYSKMSNAMGKKML
FYSTCIPFFTFFGLFDVFIYPNAERLHPSLEAVQAILPGGA
ASGGMAVLAKIATHWTSALFYVMAEIYSSVSVGLLFWQFAN
DVVNVDQAKRFYPLFAQMSGLAPVLAGQYVVRFASKAVNFE
ASMHRLTAAVTFAGIMICIFYQLSSSYVERTESAKPAADNE
QSIKPKKKKPKMSMVESGKFLASSQYLRLIAMLVLGYGLSI
NFTEIMWKSLVKKQYPDPLDYQRFMGNFSSAVGLSTCIVIF
FGVHVIRLLGWKVGALATPGIMAILALPFFACILLGLDSPA
RLEIAVIFGTIQSLLSKTSKYALFDPTTQMAYIPLDDESKV
KGKAAIDVLGSRIGKSGGSLIQQGLVFVFGNIINAAPVVGV
VYYSVLVAWMSAAGRLSGLFQAQTEMDKADKMEAKTNKEK PtNTT2 (23-575)*
ATGACTTCCTCTCATCAAGCAAGTGCACTTCCTCTCAAAAA 7 DNA
GGGAACGCATGTCCCGGACTCTCCGAAGTTGTCAAAGCTAT (*23-575 denotes
ATATCATGGCCAAAACCAAGAGTGTATCCTCGTCCTTCGAC that DNA encoding
CCCCCTCGGGGAGGCAGTACTGTTGCACCAACTACACCGTT amino acid residues
GGCAACCGGCGGTGCGCTCCGCAAAGTGCGACAAGCCGTCT 1-22 have been
TTCCCATCTACGGAAACCAAGAAGTCACCAAATTTCTGCTC deleted relative to
ATCGGATCCATTAAATTCTTTATAATCTTGGCACTCACGCT the full-length
CACGCGTGATACCAAGGACACGTTGATTGTCACGCAATGTG PtNTT2)
GTGCCGAAGCGATTGCCTTTCTCAAAATATACGGGGTGCTA
CCCGCAGCGACCGCATTTATCGCGCTCTATTCCAAAATGTC
CAACGCCATGGGCAAAAAAATGCTATTTTATTCCACTTGCA
TTCCTTTCTTTACCTTTTTCGGGCTGTTTGATGTTTTCATT
TACCCGAACGCGGAGCGACTGCACCCTAGTTTGGAAGCCGT
GCAGGCAATTCTCCCGGGCGGTGCCGCATCTGGCGGCATGG
CGGTTCTGGCCAAGATTGCGACACACTGGACATCGGCCTTA
TTTTACGTCATGGCGGAAATATATTCTTCCGTATCGGTGGG
GCTATTGTTTTGGCAGTTTGCGAACGACGTCGTCAACGTGG
ATCAGGCCAAGCGCTTTTATCCATTATTTGCTCAAATGAGT
GGCCTCGCTCCAGTTTTAGCGGGCCAGTATGTGGTACGGTT
TGCCAGCAAAGCGGTCAACTTTGAGGCATCCATGCATCGAC
TCACGGCGGCCGTAACATTTGCTGGTATTATGATTTGCATC
TTTTACCAACTCAGTTCGTCATATGTGGAGCGAACGGAATC
AGCAAAGCCAGCGGCAGATAACGAGCAGTCTATCAAACCGA
AAAAGAAGAAACCCAAAATGTCCATGGTTGAATCGGGGAAA
TTTCTCGCGTCAAGTCAGTACCTGCGTCTAATTGCCATGCT
GGTGCTGGGATACGGCCTCAGTATTAACTTTACCGAAATCA
TGTGGAAAAGCTTGGTGAAGAAACAATATCCAGACCCGCTA
GATTATCAACGATTTATGGGTAACTTCTCGTCAGCGGTTGG
TTTGAGCACATGCATTGTTATTTTCTTCGGTGTGCACGTGA
TCCGTTTGTTGGGGTGGAAAGTCGGAGCGTTGGCTACACCT
GGGATCATGGCCATTCTAGCGTTACCCTTTTTTGCTTGCAT
TTTGTTGGGTTTGGATAGTCCAGCACGATTGGAGATCGCCG
TAATCTTTGGAACAATTCAGAGTTTGCTGAGCAAAACCTCC
AAGTATGCCCTTTTCGACCCTACCACACAAATGGCTTATAT
TCCTCTGGACGACGAATCAAAGGTCAAAGGAAAAGCGGCAA
TTGATGTTTTGGGATCGCGGATTGGCAAGAGTGGAGGCTCA
CTGATCCAGCAGGGCTTGGTCTTTGTTTTTGGAAATATCAT
TAATGCCGCACCTGTAGTAGGGGTTGTCTACTACAGTGTCC
TTGTTGCGTGGATGAGCGCAGCTGGCCGACTAAGTGGGCTT
TTTCAAGCACAAACAGAAATGGATAAGGCCGACAAAATGGA GGCAAAGACCAACAAAGAAAAGTAG
PtNTT2 (23-575) MTSSHQASALPLKKGTHVPDSPKLSKLYIMAKTKSVSSSFD 8 protein
PPRGGSTVAPTTPLATGGALRKVRQAVFPIYGNQEVTKFLL
IGSIKFFIILALTLTRDTKDTLIVTQCGAEMAFLKIYGVLP
AATAFIALYSKMSNAMGKKMLFYSTCIPFFTFFGLFDVFIY
PNAERLHPSLEAVQAILPGGAASGGMAVLAKIATHWTSALF
YVMAEIYSSVSVGLLFWQFANDVVNVDQAKRFYPLFAQMSG
LAPVLAGQYVVRFASKAVNFEASMHRLTAAVTFAGIMICIF
YQLSSSYVERTESAKPAADNEQSIKPKKKKPKMSMVESGKF
LASSQYLRLIAMLVLGYGLSINFTEIMWKSLVKKQYPDPLD
YQRFMGNFSSAVGLSTCIVIFFGVHVIRLLGWKVGALATPG
IMAILALPFFACILLGLDSPARLEIAVIFGTIQSLLSKTSK
YALFDPTTQMAYIPLDDESKVKGKAAIDVLGSRIGKSGGSL
IQQGLVFVFGNIINAAPVVGVVYYSVLVAWMSAAGRLSGLF QAQTEMDKADKMEAKTNKEK
pBRX2* AACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTG 9 (*N denotes the
CTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTG position of the
ATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGAT UBP)
ACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGT
GAGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCC
TTACGCATCTGTGCGGTATTTCACACCGCATATGGTGCACT
CTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTA
TACACTCCGCTATCGCTACGTGACTGGGTCATGGCTGCGCC
CCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTT
GTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTC
TCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACC
GAAACGCGCGAGGCAGCTGCGGTAAAGCTCATCAGCGTGGT
CGTGAAGCGATTCACAGATGTCTGCCTGTTCATCCGCGTCC
AGCTCGTTGAGTTTCTCCAGAAGCGTTAATGTCTGGCTTCT
GATAAAGCGGGCCATGTTAAGGGCGGTTTTTTCCTGTTTGG
TCACTGATGCCTCCGTGTAAGGGGGATTTCTGTTCATGGGG
GTAATGATACCGATGAAACGAGAGAGGATGCTCACGATACG
GGTTACTGATGATGAACATGCCCGGTTACTGGAACGTTGTG
AGGGTAAACAACTGGCGGTATGGATGCGGCGGGACCAGAGA
AAAATCACTCAGGGTCAATGCCAGCGCTTCGTTAATACAGA
TGTAGGTGTTCCACAGGGTAGCCAGCAGCATCCTGCGATGC
AGATCCGGAACATAATGGTGCAGGGCGCTGACTTCCGCGTT
TCCAGACTTTACGAAACACGGAAACCGAAGACCATTCATGT
TGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGCTTC
ACGTTCGCTCGCGTATCGGTGATTCATTCTGCTAACCAGTA
AGGCAACCCCGCCAGCCTAGCCGGGTCCTCAACGACAGGAG
CACGATCATGCGCACCCGTGGCCAGGACCCAACGCTGCCCG
AAATTCTTGAAGACGAAAGGGCCTCGTGATACGCCTATTTT
TATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCA
GGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTG
TTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGA
GACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGG
AAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCC
CTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAG
AAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGT
GCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAA
GATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGA
TGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCC
CGTGTTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACA
CTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAG
AAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGC
AGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTT
ACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTT
TTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGT
TGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCG
TGACACCACGATGCCTGCAGCAATGGCAACAACGTTGCGCA
AACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAA
CAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACC
ACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTG
ATAAATCTGGAGCCGGTGAGCGTGGCTCTCGCGGTATCATT
GCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGT
TATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAA
ATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCAT
TGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGAT
TGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGA
AGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGT
GAGTTTTCGTTGGCTTTACACTTTATGCTTCCGGCTCGTAT
GTTGTGTGGAANTGTGAGCGGATAACAATTTCACACAGGAA
ACAGCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGA
TCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTT
GCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGC
CGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGC
TTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTA
GCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGC
CTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCT
GCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAG
ACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGG
GGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTAC
ACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGC
CACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAA
GCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCA
GGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCG
CCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAG GGGGGCGGAGCCTATGGAAA
TABLE-US-00002 TABLE 2 SEQ ID Primer Application Sequence NO:
Transporter plasmid cloning and chromosomal integration YZ552
Cloning GGGAGGCAGTACTGTTGCAC 10 PtNTT2(66-575) pCDF- Cloning
GGTATATCTCCTTATTAAAGTTAAACAAAATTATTTCT 11 1b-fwd PtNTT2(66-575)
ACAGGGG T7 seq sequencing TAATACGACTCACTATAGGG 12 pCDF plasmids T7
term sequencing GCTAGTTATTGCTCAGCGG 13 seq pCDF plasmids YZ580
transporter TTACATTAATTGCGTTGCGCTC 14 cloning DM002 transporter
TTTTGGCGGATGGCATTTGAGAAGCACACGG 15 cloning YZ576 transporter
ATTCTCACCAATAAAAAACGCCCGG 16 cloning YZ581 transporter
CCTGTAGAAATAATTTTGTTTAACTTTAATAAGGAG 17 cloning DM052 transporter
CCCCGCGCGTTGGCCGATTC 18 sequencing DM053 transporter
GAAGGGCAATCAGCTGTTG 19 sequencing YZ50 transporter
CAGGGCAGGGTCGTTAAATAG 20 sequencing YZ584 lacI promoter
GACACCATCGAATGGCGCAAAACCTTTCGCGGTATGG 21 CATGATAGCGCCCGG YZ585 lacI
promoter CCGGGCGCTATCATGCCATACCGCGAAAGGTTTTGCG 22 CCATTCGATGGTGTC
YZ582 bla promoter ATTTTTCTAAATACATTCAAATATGTATCCGCTCATGA 23
GACAATAACCCTG YZ583 bla promoter
CAGGGTTATTGTCTCATGAGCGGATACATATTTGAAT 24 GTATTTAGAAAAAT YZ599 lac
promoter TTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTG 25 GTATGTTGTGTGGA
YZ600 lac promoter TCCACACAACATACCAGCCGGAAGCATAAAGTGTAAA 26
GCCTGGGGTGCCTAA YZ595 lacUV5 CTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTG
27 promoter GTATAATGTGTGGA YZ596 lacUV5
TCCACACATTATACCAGCCGGAAGCATAAAGTGTAAA 28 promoter GCCTGGGGTGCCTAG
YZ7 transporter GGGGATAACGCAGGAAAGAACATG 29 integration cloning
YZ12 transporter GCACTTTTCGGGGAAATGTGCG 30 integration cloning
YZ612 transporter AATTGCGGCCTATATGGATGTTGGAACCGTAAGAGAA 31
integration ATAGACAGGCGGTCCTGTGACGGAAGATCACTTCGCA cloning G YZ613
transporter TGCTCACATGTTCTTTCCTGCGTTATCCCCGCGTGGTG 32 integration
AACCAGGC cloning YZ614 transporter ACCGCCTGTCTATTTCTCTTACGGTTCC 33
integration cloning YZ615 transporter
CGCGCTTAATGCGCCGCTACAGGGCGCGTCGATTGGT 34 integration GCCAGCGCGCAG
cloning YZ610 transporter GGTATATCTCCTTATTAAAGTTAAACAAAATTATTTCT 35
integration ACAGG cloning YZ616 lacZYA:: CAGCCACGTTTCTGCGAAAAC 36
transporter integration YZ617 lacZYA:: TACAGCGGTTCCTTACTGGC 37
transporter integration YZ618 transporter
GGGTGGTGAATGTGAAACCAGTAACG 38 integration colony PCR YZ619
transporter CTGGGTGTTTACTTCGGTCTG 39 integration colony PCR YZ69
transporter GGCCGTAATATCCAGCTGAAC 40 integration colony PCR YZ587
transporter ACTAGGGTGCAGTCGCTCCG 41 integration colony PCR pCDF-
integration TTAACCTAGGCTGCTGCCACCG 42 1b-rev cloning YZ703 tac
promoter GCGCAACGCAATTAATGTAATTCTGAAATGAGCTGTT 43
GACAATTAATCATCGGCTCG YZ704 tac promoter
AACAAAATTATTTCTACAGGTCCACACATTATACGAG 44 CCGATGATTAATTGTCAAC YZ707
N25 promoter GCGCAACGCAATTAATGTAATCATAAAAAATTTATTT 45
GCTTTCAGGAAAATTTTTCTG YZ708 N25 promoter
AACAAAATTATTTCTACAGGTGAATCTATTATACAGA 46 AAAATTTTCCTGAAAGCAAATA
YZ709 .lamda. promoter GCGCAACGCAATTAATGTAATTATCTCTGGCGGTGTT 47
GACATAAATACCACTGGCG YZ710 .lamda. promoter
AACAAAATTATTTCTACAGGTGTGCTCAGTATCACCG 48 CCAGTGGTATTTATGTCAAC YZ711
H207 promoter GCGCAACGCAATTAATGTAATTTTAAAAAATTCATTT 49
GCTAAACGCTTCAAATTCTCG YZ712 H207 promoter
AACAAAATTATTTCTACAGGTGAAGTATATTATACGA 50 GAATTTGAAGCGTTTAGC pUCX
and pBRX Golden Gate destination plasmid cloning pUC19- TK1 site
TGGGGTGCCTAATGAGTGAGC 51 lin-fwd removal/pBRX1 linearization pUC19-
TK1 site CTATGACCATGATTACGCCAAGCTTG 52 lin-rev removal/pBRX1
linearization YZ51 bla BsaI site TCTCGCGGTATCATTGCAGCACTG 53
mutation YZ52 bla BsaI site GCCACGCTCACCGGCTCC 54 mutation YZ95
pUCX2/pBRX2 AACGAAAACTCACGTTAAGGG 55 linearization YZ96 pUCX2/pBRX2
CCACTGAGCGTCAGACC 56 linearization YZ93 BsaI zeo.sup.R stuffer
GAGACCCGTCGTTGACAATTAATCATCGGC 57 cassette YZ94 BsaI zeo.sup.R
stuffer GAGACCATTCTCACCAATAAAAAACGCCCGG 58 cassette pCas9 and pAIO
cloning, Cas9 chromosomal integration JL126 pCas9 cloning
CGGGGTACCATGGACAAGAAGTACTCCATT 59 JL128 pCas9 cloning
CTAGTCTAGATTACACCTTCCTCTTCTTCTTGGG 60 pCas9 BsmBI
CTCCGGGGAAACCGCCGAAGCCACGCGGCTCAA 61 BL557 site removal BL558 pCas9
BsmBI CTTCGGCGGTTTCCCCGGAGTCGAACAGGAGGGCGCC 62 site removal AATGAGG
BL559 pCas9-Multi AGGAAGAAGACGTCTCACGCATCTTACTGCGCAGATA 63 cloning
CGC BL560 pCas9-Multi AAGATGCGTGAGACGTCTTCTTCCTCGTCTCGGTCGAC 64
cloning AGTTCATAGGTGATTGCTCAGG YZ720 arsB::Cas9
GTCCCAAATCGCAGCCAATCACATTG 65 integration YZ721 arsB::Cas9
GTCCTGACCATCGTATTGGTTATCTGGC 66 integration TG1 Cas9 sequencing
ATTTAGAGGGCAGTGCCAGCTCGTTA 67 TG2 Cas9 sequencing
CTGCATTCAGGTAGGCATCATGCGCA 68 TG3 Cas9 sequencing
CTGGGCTACCTGCAAGATTAGCGATG 69 TG4 Cas9 sequencing
TGAAGGACTGGGCAGAGGCCCCCTT 70 TG5 Cas9 sequencing
CGTAGGTGTCTTTGCTCAGTTGAAGC 71 TG6 Cas9 sequencing
TAGCCATCTCATTACTAAAGATCTCCT 72 BL731 pAIO-Multi
CGATATCGTTGGTCTCAACGACACAATTGTAAAGGTT 73 cloning (.DELTA.Cas9,
AGATCT introduce BsaI) BL732 pAIO-Multi
CAACGATATCGGTCTCACACTGACTGGGCCTTTCGTTT 74 cloning (.DELTA.Cas9,
TATCT introduce BsaI) BL450 pAIO guide GCAATCACCTATGAACTGTCGAC 75
sequencing Cas9 guide cloning upper case denotes the BsmBI
recognition sequence (6 nt), the BsmBI restriction site overhang (4
nt), or variable target (spacer) sequence (18-20 nt) BL691 1st
sgRNA GGA aggaggaaggaCGTCTCaTGCGccccgcattCACACAAT 76 GXG-T
GTAGTGATCAgttttagagctagaaatagc BL627 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCCAGGATG 77 GXA-T
GGTACCACCCgttttagagctagaaatagc BL707 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCACACAAT 78 GXC-T
GTAGTCATCAgttttagagctagaaatagc BL642 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCCAGGATG 79 GXT-G
GGCACCAGCCgttttagagctagaaatagc BL659 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCCATGATG 80 AXG-T/CXT-T
GGCACCACCCgttttagagctagaaatagc BL623 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCACACAAT 81 AXA-T
GTATAGATCAgttttagagctagaaatagc BL628 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCCAGGATG 82 AXC-T
GGCACCATCCgttttagagctagaaatagc BL567 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattGTTGTGTG 83 AXT-A
GAAATGTGAGgttttagagctagaaatagc BL593 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattTGTCACTA 84 CXG-G
CTCTGACCAGgttttagagctagaaatagc
BL639 1st sgRNA GGA aggaggaaggaCGTCTCaTGCGccccgcattTGTCACTA 85
CXA-A CTCTGACCAAgttttagagctagaaatagc BL693 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCACACAAT 86 CXC-T
GTACTCATCAgttttagagctagaaatagc BL660 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCCAAGATG 87 CXT-A
GGCACCACCCgttttagagctagaaatagc BL695 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCACACAAT 88 TXG-T
GTATTGATCAgttttagagctagaaatagc BL629 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCACACAAT 89 TXA-A
GTATAAATCAgttttagagctagaaatagc BL657 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattCCAGGATG 90 TXC-G
GGGACCACCCgttttagagctagaaatagc BL620 1st sgRNA GGA
aggaggaaggaCGTCTCaTGCGccccgcattTTCACAAT 91 TXT-T
ACTTTCTTTAgttttagagctagaaatagc BL701 2nd sgRNA,
gcattTCACACAATGTAGGATCAgttttagagctagaaa 92 Fwd G.DELTA.G tagc BL702
2nd sgRNA, TGATCCTACATTGTGTGAaatgcggggcgcatcttact 93 Rev G.DELTA.G
BL617 2nd sgRNA, gcattACCAGGATGGGACCACCCgttttagagctagaa 94 Fwd
G.DELTA.A atagc BL618 2nd sgRNA,
GGGTGGTCCCATCCTGGTaatgcggggcgcatcttact 95 Rev G.DELTA.A BL705 2nd
sgRNA, gcattTCACACAATGTAGCATCAgttttagagctaga 96 Fwd G.DELTA.C
aatagc BL706 2nd sgRNA, TGATGCTACATTGTGTGAaatgcggggcgcatcttact 97
Rev G.DELTA.C BL614 2nd sgRNA,
gcattACCAGGATGGGCACCACCgttttagagctaga 98 Fwd G.DELTA.T aatagc BL615
2nd sgRNA, GGTGGTGCCCATCCTGGTaatgcggggcgcatcttact 99 Rev G.DELTA.T
BL682 2nd sgRNA, gcattACCAGATGGGCACCACCCgttttagagctag 100 Fwd
A.DELTA.G aaatagc BL683 2nd sgRNA,
GGGTGGTGCCCATCTGGTaatgcggggcgcatcttact 101 Rev A.DELTA.G BL575 2nd
sgRNA, gcattTCACACAATGTAAGATCAgttttagagctaga 102 Fwd A.DELTA.A
aatagc BL576 2nd sgRNA, TGATCTTACATTGTGTGAaatgcggggcgcatcttact 103
Rev A.DELTA.A BL564 2nd sgRNA, gcattTGTTGTGTGGAATGTGAGgttttagagcta
104 Fwd A.DELTA.T gaaatagc BL565 2nd sgRNA,
CTCACATTCCACACAACAaatgcggggcgcatcttact 105 Rev A.DELTA.T BL675 2nd
sgRNA, gcattTTGTCACTACTCTGACCGgttttagagctag 106 Fwd C.DELTA.G
aaatagc BL676 2nd sgRNA, CGGTCAGAGTAGTGACAAaatgcggggcgcatcttact 107
Rev C.DELTA.G BL673 2nd sgRNA, gcattTTGTCACTACTCTGACCAgttttagagctag
108 Fwd C.DELTA.A aaatagc BL674 2nd sgRNA,
TGGTCAGAGTAGTGACAAaatgcggggcgcatcttact 109 Rev C.DELTA.A BL703 2nd
sgRNA, gcattTCACACAATGTACCATCAgttttagagctag 110 Fwd C.DELTA.C
aaatagc BL704 2nd sgRNA, TGATGGTACATTGTGTGAaatgcggggcgcatcttact 111
Rev C.DELTA.C BL697 2nd sgRNA, gcattTCACACAATGTATGATCAgttttagagctag
112 Fwd T.DELTA.G aaatagc BL698 2nd sgRNA,
TGATCATACATTGTGTGAaatgcggggcgcatcttact 113 Rev T.DELTA.G BL679 2nd
sgRNA, gcattTCACACAATGTATAATCAgttttagagcta 114 Fwd T.DELTA.A
gaaatagc BL680 2nd sgRNA, TGATTATACATTGTGTGAaatgcggggcgcatcttact
115 Rev T.DELTA.A BL620 2nd sgRNA,
gcattATTCACAATACTTCTTTAgttttagagctag 116 Fwd T.DELTA.T aaatagc
BL621 2nd sgRNA, TAAAGAAGTATTGTGAATaatgcggggcgcatcttact 117 Rev
T.DELTA.T BL562 1st sgRNA agaaggaagaCGTCTCaCTGTcgaccaaaaaagcct 118
construction gctcgttgagc BL563 2nd sgRNA
aagaaggaCGTCTCaACAGtagtggcagcggctaactaag 119 construction BL566
Terminating aggagaggaCGTCTCtCGACcaaaaaagcctgctcgttg 120 sgRNA gagca
construction BL514 natural hEGFP
agtaagatgcgccccgcattGACCAGGATGGGCACCACCC 121 guide cloning
gttttagagctagaaatag BL515 natural hEGFP
ctatttctagctctaaaacGGGTGGTGCCCATCCTGGTC 122 guide cloning
aatgcggggcgcatcttact BL464 natural hEGFP
GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGT 123 guide cloning BL465
natural hEGFP AATGCGGGGCGCATCTTACT 124 guide cloning YZ310 ESer-69
guide cattGGCACCGGTCTACTAAAC 125 fwd YZ316 ESer-69 guide
aaacGTTTAGTAGACCGGTGCC 126 rev YZ359 GFP151-69
cacattCACACAATGTAAGTATCAgtttt 127 guide fwd YZ360 GFP151-69
ctctaaaacTGATACTTACATTGTGTGaa 128 guide rev In vitro cleavage (IVC)
DNA templates and sgRNAs BL487 IVC DNA
GTTTACGTCGCCGTCCAGCTCGACCAGGATGGGCACC 129 Template
AACCCGGTGAACAGCTCCTCGCC BL488 IVC DNA
GTTTACGTCGCCGTCCAGCTCGACCAGGATGGGCACC 130 Template
AGCCCGGTGAACAGCTCCTCGCC BL489 IVC DNA
GTTTACGTCGCCGTCCAGCTCGACCAGGATGGGCACC 131 Template
ATCCCGGTGAACAGCTCCTCGCC BL408 IVC DNA
GTTTACGTCGCCGTCCAGCTCGACCAGGATGGGCACC 132 Template
ACCCCGGTGAACAGCTCCTCGCC BL415 IVC Template
GCATCGCCCTCGCCCTCGCCGGACACGCTGAACTTGT 133 Extension
GGCCGTTTACGTCGCCGTCCAGC BL416 IVC Template
TGCAGTTTCATTTGATGCTCGATGAGTTATGGTGAGCA 134 Extension
AGGGCGAGGAGCTGTTCACCGG BL484 IVC crRNA
TTAATACGACTCACTATAGGGACCAGGATGGGCACCA 135
ACCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG CTAGT BL485 IVC crRNA
TTAATACGACTCACTATAGGGACCAGGATGGGCACCA 136
GCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG CTAGT BL486 IVC crRNA
TTAATACGACTCACTATAGGGACCAGGATGGGCACCA 137
TCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGG CTAGT BL318 IVC crRNA
TTAATACGACTCACTATAGGGACCAGGATGGGCACCA 138 CCCGTTTTAGAGCTATGCTGTTTTG
BL439 IVC Conversion AAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATA 139
crRNA to ACGGACTAGCCTTATTTTAACTTGCTATTTCTAGCTCT sgRNA AAAACGG BL472
PCR sgRNA AAGAGGAAGAGGTTAATACGACTCACTATAGGGAC 140 BL440 PCR sgRNA
AAAAGCACCGACTCGGTGCC 141 BL473 PCR sgRNA
ACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAA 142 CGG
Primers for UBP-containing Golden Gate inserts. Italics denote the BsaI
recognition sequence; underline denotes the BsaI restriction site overhang
sequence. Insert primers for pUCX2 are also compatible with pBRX2 and
pAIO GG plasmids. # denotes primers that were also used for biotin shift
(the corresponding dedicated biotin shift primers are identical in
annealing sequence). YZ401 and YZ403 are highly sensitive to annealing
temperature; the optimal annealing temperature is 54 °C.
BL528 hEGFP insert for
AAGAAGGAGAAGGTCTCTAGTGGAGCAAGGGCGAGG 143 pUCX2 GG fwd AGCTGTTCACCG
BL529 hEGFP insert for AAGAGAAGAGAGGTCTCATCGTGTTTACGTCGCCGTC 144
pUCX2 GG rev CAGCTC YZ148 GFP66 insert for
ATGGGTCTCCAGTGGGGCCAACACTTGTCACTAC 145 pUCX2 GG fwd YZ149 GFP66
insert for ATGGGTCTCTTCGTTTCCGGATAACGGGAAAAGC 146 pUCX2 GG rev
YZ150 GFP151 insert ATGGGTCTCCAGTGGCTCGAGTACAACTTTAACTCACAC 147 for
pUCX2 GG fwd YZ151 GFP151 insert
ATGGGTCTCTTCGTTGATTCCATTCTTTTGTTTGTCTGC 148 for pUCX2 GG rev YZ97
TK1 insert for ATGGGTCTCTCATAGCTGTTTCCTGTGTGAAATTGTTA 149 pUCX1 GG
fwd TCC YZ98 TK1 insert for ATGGGTCTCACCCCAGGCTTTACACTTTATGCTTCCG
150 pUCX1 GG rev YZ99# TK1 insert for
ATGGGTCTCCAGTGGCTGTTTCCTGTGTGAAATTGTTA 151 pUCX2 GG fwd TCC
YZ100# TK1 insert for ATGGGTCTCTTCGTTGGCTTTACACTTTATGCTTCCG
152 pUCX2 GG rev YZ118 D8 insert for
ATGGGTCTCCAGTGGCACACAGGAAACAGCTATGAC 153 pUCX2 GG fwd YZ119 D8
insert for ATGGGTCTCTTCGTTGGGTTAAGCTTAACTTTAAGAAG 154 pUCX2 GG rev
GAG YZ73# GFP151 insert ATGGGTCTCACACAAACTCGAGTACAACTTTAACTCA
155 for pAIO2X GG fwd CAC YZ74# GFP151 insert
ATGGGTCTCGATTCCATTCTTTTGTTTGTCTGC 156 for pAIO2X GG rev YZ401#
ESer insert for ATTGGTCTCGGCCGAGCGGTTGAAGGCAC 157 pAIO2X GG fwd
YZ403# ESer insert for ATTGGTCTCTCTGGAACCCTTTCGGGTCG 158
pAIO2X GG rev
Biotin shift primers. Annealing temperature (°C) is denoted in parentheses.
BL745 hEGFP fwd GGCGAGGAGCTGTTCACCG 159
(48 × 3 cycles, 54 × 20 cycles) BL744 hEGFP rev
GTTTACGTCGCCGTCCAGCTC 160
(48 × 3 cycles, 54 × 20 cycles) BL750 GFP66 fwd (50)
GGCCAACACTTGTCACTACT 161 BL751 GFP66 rev (50) TCCGGATAACGGGAAAAGC
162 YZ351 GFP151 fwd (50) CTCGAGTACAACTTTAACTCACAC 163 YZ352 GFP151
rev (50) GATTCCATTCTTTTGTTTGTCTGC 164 BL748 TK1 fwd (50)
CTGTTTCCTGTGTGAAATTGTTATCC 165 BL749 TK1 rev (50)
GGCTTTACACTTTATGCTTCCG 166 BL774 D8 fwd (50)
CCCGGGTTATTACATGCGCTAGCACT 167 BL775 D8 rev (50)
GAAATTAATACGACTCACTATAGGGTTAAGCTTAACT 168 TTAAGAAGGAG YZ17 pUCX2
fwd (60) TGCAAGCAGCAGATTACGCGC 169 YZ18 pUCX2 rev (60)
GTAACTGTCAGACCAAGTTTACTC 170
UBP template oligonucleotides. * denotes sequences used in Cas9
experiments; X denotes dNaM.
Name Sequence Type Sequence
BL410* hEGFP GGCGAGGAGCTGTTCACCGGGXTGGTGCCCATCCTGG
171 TCGAGCTGGACGGCGACGTAAAC BL413* hEGFP
GTTTACGTCGCCGTCCAGCTCGACCAGGATGGGXACC 172 ACCCCGGTGAACAGCTCCTCGCC
BL411* hEGFP GTTTACGTCGCCGTCCAGCTCGACCAXGATGGGCACC 173
ACCCCGGTGAACAGCTCCTCGCC BL409* hEGFP
GTTTACGTCGCCGTCCAGCTCGACCAGGATGGGCACC 174 AXCCCGGTGAACAGCTCCTCGCC
TK1* TK1 CTGTTTCCTGTGTGAAATTGTTATCCGCTCACAXTTCC 175
ACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCC DM510-16* GFP151
CTCGAGTACAACTTTAACTCACACAATGTAXAGATCA 176
CGGCAGACAAACAAAAGAATGGAATC DM510-13* GFP66
CCGGATAACGGGAAAAGCATTGAACACCGCXGGTCA 177 GAGTAGTGACAAGTGTTGGCCA
DM510-11* GFP66 GGCCAACACTTGTCACTACTCTGACCXAGGGTGTTCA 178
ATGCTTTTCCCGTTATCCGGA BL412* hEGFP
GGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCXTGG 179 TCGAGCTGGACGGCGACGTAAAC
BL414* hEGFP GGCGAGGAGCTGTTCACCGGGGTGGTXCCCATCCTGG 180
TCGAGCTGGACGGCGACGTAAAC DM510-20* GFP151
GATTCCATTCTTTTGTTTGTCTGCCGTGATTXATACATT 181
GTGTGAGTTAAAGTTGTACTCGAGT D8* D8
CACACAGGAAACAGCTATGACCCGGGTTATTACATGC 182
GCTAGCACTTGGAATTCACAATACTXTCTTTAAGGAA
ACCATAGTAAATCTCCTTCTTAAAGTTAAGCTTAACCC TATAGTGAGTCGTATTAATTTC
GFP151-33 GFP151 CTCGAGTACAACTTTAACTCACACAATGTAAXAATCA 183
CGGCAGACAAACAAAAGAATGGAATC GFP151-35 GFP151
CTCGAGTACAACTTTAACTCACACAATGTAAXCATCA 184
CGGCAGACAAACAAAAGAATGGAATC GFP151-37 GFP151
CTCGAGTACAACTTTAACTCACACAATGTAAXGATCA 185
CGGCAGACAAACAAAAGAATGGAATC GFP151-39 GFP151
CTCGAGTACAACTTTAACTCACACAATGTCAXTGTCA 186
CGGCAGACAAACAAAAGAATGGAATC GFP151-41 GFP151
CTCGAGTACAACTTTAACTCACACAATGTACXAATCA 187
CGGCAGACAAACAAAAGAATGGAATC GFP151-43* GFP151
CTCGAGTACAACTTTAACTCACACAATGTACXCATCA 188
CGGCAGACAAACAAAAGAATGGAATC DM510-19 GFP151
CTCGAGTACAACTTTAACTCACACAATGTACXGATCA 189
CGGCAGACAAACAAAAGAATGGAATC GFP151-47 GFP151
CTCGAGTACAACTTTAACTCACACAATGTACXTATCA 190
CGGCAGACAAACAAAAGAATGGAATC GFP151-49 GFP151
CTCGAGTACAACTTTAACTCACACAATGTAGXAATCA 191
CGGCAGACAAACAAAAGAATGGAATC GFP151-51* GFP151
CTCGAGTACAACTTTAACTCACACAATGTAGXCATCA 192
CGGCAGACAAACAAAAGAATGGAATC GFP151-53* GFP151
CTCGAGTACAACTTTAACTCACACAATGTAGXGATCA 193
CGGCAGACAAACAAAAGAATGGAATC GFP151-55 GFP151
CTCGAGTACAACTTTAACTCACACAATGTAGXTATCA 194
CGGCAGACAAACAAAAGAATGGAATC GFP151-57 GFP151
CTCGAGTACAACTTTAACTCACACAATGTATXAATCA 195
CGGCAGACAAACAAAAGAATGGAATC GFP151-59 GFP151
CTCGAGTACAACTTTAACTCACACAATGTATXCATCA 196
CGGCAGACAAACAAAAGAATGGAATC GFP151-61* GFP151
CTCGAGTACAACTTTAACTCACACAATGTATXGATCA 197
CGGCAGACAAACAAAAGAATGGAATC GFP151-63 GFP151
CTCGAGTACAACTTTAACTCACACAATGTATXTATCAC 198
GGCAGACAAACAAAAGAATGGAATC GFP151-69* GFP151
CTCGAGTACAACTTTAACTCACACAATGTAAGXATCA 199
CGGCAGACAAACAAAAGAATGGAATC ESer-69* ESer
CTCTGGAACCCTTTCGGGTCGCCGGTTTAGXAGACCG 200 GTGCCTTCAACCGCTCGGC
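By way of illustration only, the conventions used in the primer and template tables above (the BsaI recognition sequence in the Golden Gate insert primers, and X denoting dNaM in the UBP template oligonucleotides) may be checked with a short script. The following is a minimal sketch and not part of the disclosed methods. It assumes the commonly described BsaI geometry, in which the enzyme recognizes GGTCTC and cuts one nucleotide downstream on the top strand to leave a 4-nucleotide 5' overhang. The function names `bsai_overhang` and `unnatural_positions` are hypothetical and introduced here only for the example.

```python
from typing import List

# Assumption: BsaI recognizes GGTCTC and cuts 1 nt downstream on the top
# strand, so the 4-nt overhang starts 1 nt after the recognition sequence.
BSAI_SITE = "GGTCTC"


def bsai_overhang(primer: str) -> str:
    """Return the 4-nt overhang implied by the first BsaI site in a primer."""
    seq = "".join(primer.split()).upper()  # drop wrap spaces from the table
    start = seq.find(BSAI_SITE)
    if start < 0:
        raise ValueError("no BsaI recognition sequence found")
    cut = start + len(BSAI_SITE) + 1  # skip the single spacer nucleotide
    overhang = seq[cut:cut + 4]
    if len(overhang) < 4:
        raise ValueError("primer ends before a full 4-nt overhang")
    return overhang


def unnatural_positions(template: str, symbol: str = "X") -> List[int]:
    """Return 1-based positions of the unnatural nucleotide symbol."""
    seq = "".join(template.split()).upper()
    return [i + 1 for i, base in enumerate(seq) if base == symbol]


# Primers and template copied from the tables above.
YZ148 = "ATGGGTCTCCAGTGGGGCCAACACTTGTCACTAC"   # GFP66 insert, pUCX2 GG fwd
YZ149 = "ATGGGTCTCTTCGTTTCCGGATAACGGGAAAAGC"   # GFP66 insert, pUCX2 GG rev
GFP151_69 = "CTCGAGTACAACTTTAACTCACACAATGTAAGXATCA CGGCAGACAAACAAAAGAATGGAATC"

print(bsai_overhang(YZ148), bsai_overhang(YZ149))
print(unnatural_positions(GFP151_69))
```

Under these assumptions, primers designed for the same destination plasmid would be expected to report the same forward and reverse overhangs, which offers a quick consistency check when transcribing the table.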
[0213] Table 3 illustrates signal peptide sequences described
herein.
TABLE-US-00003
Signal Peptide ID | Sequence | SEQ ID NO:
pelB-SP1 | MKYLLPTAEAGLLLLAAQPAIA | 225
malE_SP2 | MKIKTGARILALSELTTMMFSASALA | 226
phoA_SP3 | MKQSTIALALLPLLFTPVTKA | 227
treA_SP4 | MKSPAPSRPQKMALIPACIFLCFAALSVQA | 228
pcoE_SP5 | MKKILVSFVAIMAVASSAMA | 229
Chitosanase (Csn)-SP6 | MKISMQKADFWKKAAISLLVFTMFFTLMMSETVFAAGLNK | 230
OmpA_SP7 | MKKTAIAIAVALAGFATVAQASAGLNKD | 231
DsbAss | Protein: MKKIWLALAGLVLAFSASA | 232
DsbAss | DNA: ATGAAAAAGATTTGGCTGGCGCTGGCTGGTTTAGTTTTAGCGTTTAGCGCATCGGCG | 233
PelBss | Protein: MKYLLPTAAAGLLLLAAQPAMA | 234
PelBss | DNA: ATGAAATACCTGCTGCCGACCGCTGCTGCTGGTCTGCTGCTCCTCGCTGCCCAGCCGGCGATGGCG | 235
PhoAss | Protein: MKQSTIALALLPLLFTPVTKA | 227
PhoAss | DNA: ATGAAACAAAGCACTATTGCACTGGCACTCTTACCGTTACTGTTTACCCCTGTGACAAAAGCG | 236
NTss | Protein: MKTHIVSSVTTTLLLGSILMNPVANA | 237
NTss | DNA: ATGAAAACACATATAGTCAGCTCAGTAACAACAACACTATTGCTAGGTTCCATATTAATGAATCCTGTCGCTAATGCC | 238
NSP1 | Protein: MKYLLPWLALAGLVLAFSASA | 239
NSP1 | DNA: ATGAAATACCTGCTGCCGTGGCTGGCGCTGGCTGGTTTAGTTTTAGCGTTTAGCGCATCGGCG | 240
NSP2 | Protein: MKKITAAAGLLLLAAFSASA | 241
NSP2 | DNA: ATGAAAAAGATTACCGCTGCTGCTGGTCTGCTGCTCCTCGCTGCGTTTAGCGCATCGGCG | 242
NSP3 | Protein: MKKIWLALAGLVLAQPAMA | 243
NSP3 | DNA: ATGAAAAAGATTTGGCTGGCGCTGGCTGGTTTAGTTTTAGCCCAGCCGGCGATGGCG | 244
NSP3a | Protein: MKKILVLGALALWAQPAMA | 245
NSP3a | DNA: ATGAAAAAGATTTTAGTTTTAGGTGCTCTGGCGCTGTGGGCCCAGCCGGCGATGGCG | 246
NSP3b | Protein: MKKIWLALVLLAGAQPAMA | 247
NSP3b | DNA: ATGAAAAAGATTTGGCTGGCGTTAGTTTTACTGGCTGGTGCCCAGCCGGCGATGGCG | 248
NSP3c | Protein: MKKILAGWLALVLAQPAMA | 249
NSP3c | DNA: ATGAAAAAGATTCTGGCTGGTTGGCTGGCGTTAGTTTTAGCCCAGCCGGCGATGGCG | 250
NSP3d | Protein: MKKILVLLAGWLAAQPAMA | 251
NSP3d | DNA: ATGAAAAAGATTTTAGTTTTACTGGCTGGTTGGCTGGCGGCCCAGCCGGCGATGGCG | 252
NSP4 | Protein: MKKITAAAGLLLLAAQPAMA | 253
NSP4 | DNA: ATGAAAAAGATTACCGCTGCTGCTGGTCTGCTGCTCCTCGCTGCCCAGCCGGCGATGGCG | 254
NSP4a | Protein: MKKILLLLGTAAAAAQPAMA | 255
NSP4a | DNA: ATGAAAAAGATTCTGCTGCTCCTCGGTACCGCTGCTGCTGCTGCCCAGCCGGCGATGGCG | 256
NSP4b | Protein: MKKILLLLLLLLLLAQPAMA | 257
NSP4b | DNA: ATGAAAAAGATTCTGCTGCTCCTCCTGCTGCTCCTCCTGCTCGCCCAGCCGGCGATGGCG | 258
NSP4c | Protein: MKKIAAAAAAAAAAAQPAMA | 259
NSP4c | DNA: ATGAAAAAGATTGCTGCTGCTGCTGCGGCGGCGGCGGCTGCGGCCCAGCCGGCGATGGCG | 260
NSP5 | Protein: MKYLLPWLALAGLVLAQPAMA | 261
NSP5 | DNA: ATGAAATACCTGCTGCCGTGGCTGGCGCTGGCTGGTTTAGTTTTAGCCCAGCCGGCGATGGCG | 262
NSP6 | Protein: MKYLLPTAAAGLLLLAAFSASA | 263
NSP6 | DNA: ATGAAATACCTGCTGCCGACCGCTGCTGCTGGTCTGCTGCTCCTCGCTGCGTTTAGCGCATCGGCG | 264
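As an illustrative check only, the paired protein and DNA entries of Table 3 may be compared by translating each DNA sequence with the standard genetic code. The sketch below is not part of the disclosure. The helper name `translate` and the compact codon table layout are assumptions made for the example, and whitespace inside a sequence is treated as a line-wrap artifact of the table.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    seq = "".join(dna.split()).upper()  # remove wrap spaces from the table
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# DsbAss and PhoAss entries copied from Table 3 (SEQ ID NOs: 233 and 236).
DSBA_DNA = "ATGAAAAAGATTTGGCTGGCGCTGGCT GGTTTAGTTTTAGCGTTTAGCGCATCG GCG"
PHOA_DNA = "ATGAAACAAAGCACTATTGCACTGGCA CTCTTACCGTTACTGTTTACCCCTGTG ACAAAAGCG"

print(translate(DSBA_DNA))
print(translate(PHOA_DNA))
```

The translated strings can then be compared against the corresponding protein entries (SEQ ID NOs: 232 and 227) to catch transcription errors in either column.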
[0214] While preferred embodiments of the present disclosure have
been shown and described herein, it will be obvious to those
skilled in the art that such embodiments are provided by way of
example only. Numerous variations, changes, and substitutions will
now occur to those skilled in the art without departing from the
disclosure. It should be understood that various alternatives to
the embodiments of the disclosure described herein may be employed
in practicing the disclosure. It is intended that the following
claims define the scope of the disclosure and that methods and
structures within the scope of these claims and their equivalents
be covered thereby.
Sequence CWU 1
1
2641575PRTPhaeodactylum tricornutum 1Met Arg Pro Tyr Pro Thr Ile
Ala Leu Ile Ser Val Phe Leu Ser Ala1 5 10 15Ala Thr Arg Ile Ser Ala
Thr Ser Ser His Gln Ala Ser Ala Leu Pro 20 25 30Val Lys Lys Gly Thr
His Val Pro Asp Ser Pro Lys Leu Ser Lys Leu 35 40 45Tyr Ile Met Ala
Lys Thr Lys Ser Val Ser Ser Ser Phe Asp Pro Pro 50 55 60Arg Gly Gly
Ser Thr Val Ala Pro Thr Thr Pro Leu Ala Thr Gly Gly65 70 75 80Ala
Leu Arg Lys Val Arg Gln Ala Val Phe Pro Ile Tyr Gly Asn Gln 85 90
95Glu Val Thr Lys Phe Leu Leu Ile Gly Ser Ile Lys Phe Phe Ile Ile
100 105 110Leu Ala Leu Thr Leu Thr Arg Asp Thr Lys Asp Thr Leu Ile
Val Thr 115 120 125Gln Cys Gly Ala Glu Ala Ile Ala Phe Leu Lys Ile
Tyr Gly Val Leu 130 135 140Pro Ala Ala Thr Ala Phe Ile Ala Leu Tyr
Ser Lys Met Ser Asn Ala145 150 155 160Met Gly Lys Lys Met Leu Phe
Tyr Ser Thr Cys Ile Pro Phe Phe Thr 165 170 175Phe Phe Gly Leu Phe
Asp Val Phe Ile Tyr Pro Asn Ala Glu Arg Leu 180 185 190His Pro Ser
Leu Glu Ala Val Gln Ala Ile Leu Pro Gly Gly Ala Ala 195 200 205Ser
Gly Gly Met Ala Val Leu Ala Lys Ile Ala Thr His Trp Thr Ser 210 215
220Ala Leu Phe Tyr Val Met Ala Glu Ile Tyr Ser Ser Val Ser Val
Gly225 230 235 240Leu Leu Phe Trp Gln Phe Ala Asn Asp Val Val Asn
Val Asp Gln Ala 245 250 255Lys Arg Phe Tyr Pro Leu Phe Ala Gln Met
Ser Gly Leu Ala Pro Val 260 265 270Leu Ala Gly Gln Tyr Val Val Arg
Phe Ala Ser Lys Ala Val Asn Phe 275 280 285Glu Ala Ser Met His Arg
Leu Thr Ala Ala Val Thr Phe Ala Gly Ile 290 295 300Met Ile Cys Ile
Phe Tyr Gln Leu Ser Ser Ser Tyr Val Glu Arg Thr305 310 315 320Glu
Ser Ala Lys Pro Ala Ala Asp Asn Glu Gln Ser Ile Lys Pro Lys 325 330
335Lys Lys Lys Pro Lys Met Ser Met Val Glu Ser Gly Lys Phe Leu Ala
340 345 350Ser Ser Gln Tyr Leu Arg Leu Ile Ala Met Leu Val Leu Gly
Tyr Gly 355 360 365Leu Ser Ile Asn Phe Thr Glu Ile Met Trp Lys Ser
Leu Val Lys Lys 370 375 380Gln Tyr Pro Asp Pro Leu Asp Tyr Gln Arg
Phe Met Gly Asn Phe Ser385 390 395 400Ser Ala Val Gly Leu Ser Thr
Cys Ile Val Ile Phe Phe Gly Val His 405 410 415Val Ile Arg Leu Leu
Gly Trp Lys Val Gly Ala Leu Ala Thr Pro Gly 420 425 430Ile Met Ala
Ile Leu Ala Leu Pro Phe Phe Ala Cys Ile Leu Leu Gly 435 440 445Leu
Asp Ser Pro Ala Arg Leu Glu Ile Ala Val Ile Phe Gly Thr Ile 450 455
460Gln Ser Leu Leu Ser Lys Thr Ser Lys Tyr Ala Leu Phe Asp Pro
Thr465 470 475 480Thr Gln Met Ala Tyr Ile Pro Leu Asp Asp Glu Ser
Lys Val Lys Gly 485 490 495Lys Ala Ala Ile Asp Val Leu Gly Ser Arg
Ile Gly Lys Ser Gly Gly 500 505 510Ser Leu Ile Gln Gln Gly Leu Val
Phe Val Phe Gly Asn Ile Ile Asn 515 520 525Ala Ala Pro Val Val Gly
Val Val Tyr Tyr Ser Val Leu Val Ala Trp 530 535 540Met Ser Ala Ala
Gly Arg Leu Ser Gly Leu Phe Gln Ala Gln Thr Glu545 550 555 560Met
Asp Lys Ala Asp Lys Met Glu Ala Lys Thr Asn Lys Glu Lys 565 570
57521536DNAArtificial SequenceDescription of Artificial Sequence
Synthetic polynucleotide 2atgggtggta gcaccgttgc accgaccaca
ccgctggcaa ccggtggtgc actgcgtaaa 60gttcgtcagg cagtttttcc gatttatggc
aatcaagaag tgaccaaatt tctgctgatt 120ggcagcatca aattctttat
tatcctggca ctgaccctga cccgtgatac caaagatacc 180ctgattgtta
cccagtgtgg tgcagaagca attgcatttc tgaaaatcta tggtgttctg
240cctgcagcaa ccgcatttat tgcactgtat agcaaaatga gcaacgcaat
gggcaaaaaa 300atgctgtttt atagcacctg tatcccgttc tttacctttt
ttggtctgtt cgatgtgttc 360atttatccga atgccgaacg tctgcatccg
agcctggaag cagttcaggc aattctgcct 420ggtggtgccg caagcggtgg
tatggcagtt ctggcaaaaa ttgcaaccca ttggaccagc 480gcactgtttt
atgttatggc agaaatctat agcagcgtta gcgttggtct gctgttttgg
540cagtttgcaa atgatgttgt taatgtggat caggccaaac gtttttatcc
gctgtttgca 600cagatgagcg gtctggcacc ggttctggca ggtcagtatg
ttgttcgttt tgcaagcaaa 660gccgttaatt ttgaagcaag catgcatcgt
ctgaccgcag cagttacctt tgcaggtatt 720atgatctgca tcttttatca
gctgagcagc tcatatgttg aacgtaccga aagcgcaaaa 780ccggcagcag
ataatgaaca gagcattaaa ccgaagaaaa aaaaaccgaa aatgtcgatg
840gtggaaagcg gtaaatttct ggcaagcagc cagtatctgc gtctgattgc
aatgctggtt 900ctgggttatg gtctgagcat taactttacc gaaatcatgt
ggaaaagcct ggtgaaaaaa 960cagtatccgg atccgctgga ttatcagcgt
tttatgggta attttagcag cgcagttggt 1020ctgagtacct gcattgttat
cttttttggc gtgcatgtta ttcgtctgct gggttggaaa 1080gttggtgccc
tggcaacacc gggtattatg gccattctgg cactgccgtt ttttgcatgt
1140attctgctgg gcctggatag tccggcacgt ctggaaattg cagttatttt
tggcaccatt 1200cagagcctgc tgagcaaaac cagcaaatat gcactgtttg
atccgaccac ccagatggca 1260tatatcccgc tggatgatga aagcaaagtt
aaaggcaaag cagccattga tgttctgggt 1320agccgtattg gtaaatcagg
tggtagcctg attcagcagg gtctggtttt tgtttttggc 1380aatattatca
atgccgcacc ggttgttggt gttgtgtatt atagcgttct ggttgcatgg
1440atgagtgcag caggtcgtct gagtggtctg tttcaggcac agaccgaaat
ggataaagca 1500gataaaatgg aagccaaaac caacaaagaa aaatga
153631536DNAPhaeodactylum tricornutum 3atgggaggca gtactgttgc
accaactaca ccgttggcaa ccggcggtgc gctccgcaaa 60gtgcgacaag ccgtctttcc
catctacgga aaccaagaag tcaccaaatt tctgctcatc 120ggatccatta
aattctttat aatcttggca ctcacgctca cgcgtgatac caaggacacg
180ttgattgtca cgcaatgtgg tgccgaagcg attgcctttc tcaaaatata
cggggtgcta 240cccgcagcga ccgcatttat cgcgctctat tccaaaatgt
ccaacgccat gggcaaaaaa 300atgctatttt attccacttg cattcctttc
tttacctttt tcgggctgtt tgatgttttc 360atttacccga acgcggagcg
actgcaccct agtttggaag ccgtgcaggc aattctcccg 420ggcggtgccg
catctggcgg catggcggtt ctggccaaga ttgcgacaca ctggacatcg
480gccttatttt acgtcatggc ggaaatatat tcttccgtat cggtggggct
attgttttgg 540cagtttgcga acgacgtcgt caacgtggat caggccaagc
gcttttatcc attatttgct 600caaatgagtg gcctcgctcc agttttagcg
ggccagtatg tggtacggtt tgccagcaaa 660gcggtcaact ttgaggcatc
catgcatcga ctcacggcgg ccgtaacatt tgctggtatt 720atgatttgca
tcttttacca actcagttcg tcatatgtgg agcgaacgga atcagcaaag
780ccagcggcag ataacgagca gtctatcaaa ccgaaaaaga agaaacccaa
aatgtccatg 840gttgaatcgg ggaaatttct cgcgtcaagt cagtacctgc
gtctaattgc catgctggtg 900ctgggatacg gcctcagtat taactttacc
gaaatcatgt ggaaaagctt ggtgaagaaa 960caatatccag acccgctaga
ttatcaacga tttatgggta acttctcgtc agcggttggt 1020ttgagcacat
gcattgttat tttcttcggt gtgcacgtga tccgtttgtt ggggtggaaa
1080gtcggagcgt tggctacacc tgggatcatg gccattctag cgttaccctt
ttttgcttgc 1140attttgttgg gtttggatag tccagcacga ttggagatcg
ccgtaatctt tggaacaatt 1200cagagtttgc tgagcaaaac ctccaagtat
gcccttttcg accctaccac acaaatggct 1260tatattcctc tggacgacga
atcaaaggtc aaaggaaaag cggcaattga tgttttggga 1320tcgcggattg
gcaagagtgg aggctcactg atccagcagg gcttggtctt tgtttttgga
1380aatatcatta atgccgcacc tgtagtaggg gttgtctact acagtgtcct
tgttgcgtgg 1440atgagcgcag ctggccgact aagtgggctt tttcaagcac
aaacagaaat ggataaggcc 1500gacaaaatgg aggcaaagac caacaaagaa aagtag
15364511PRTPhaeodactylum tricornutum 4Met Gly Gly Ser Thr Val Ala
Pro Thr Thr Pro Leu Ala Thr Gly Gly1 5 10 15Ala Leu Arg Lys Val Arg
Gln Ala Val Phe Pro Ile Tyr Gly Asn Gln 20 25 30Glu Val Thr Lys Phe
Leu Leu Ile Gly Ser Ile Lys Phe Phe Ile Ile 35 40 45Leu Ala Leu Thr
Leu Thr Arg Asp Thr Lys Asp Thr Leu Ile Val Thr 50 55 60Gln Cys Gly
Ala Glu Ala Ile Ala Phe Leu Lys Ile Tyr Gly Val Leu65 70 75 80Pro
Ala Ala Thr Ala Phe Ile Ala Leu Tyr Ser Lys Met Ser Asn Ala 85 90
95Met Gly Lys Lys Met Leu Phe Tyr Ser Thr Cys Ile Pro Phe Phe Thr
100 105 110Phe Phe Gly Leu Phe Asp Val Phe Ile Tyr Pro Asn Ala Glu
Arg Leu 115 120 125His Pro Ser Leu Glu Ala Val Gln Ala Ile Leu Pro
Gly Gly Ala Ala 130 135 140Ser Gly Gly Met Ala Val Leu Ala Lys Ile
Ala Thr His Trp Thr Ser145 150 155 160Ala Leu Phe Tyr Val Met Ala
Glu Ile Tyr Ser Ser Val Ser Val Gly 165 170 175Leu Leu Phe Trp Gln
Phe Ala Asn Asp Val Val Asn Val Asp Gln Ala 180 185 190Lys Arg Phe
Tyr Pro Leu Phe Ala Gln Met Ser Gly Leu Ala Pro Val 195 200 205Leu
Ala Gly Gln Tyr Val Val Arg Phe Ala Ser Lys Ala Val Asn Phe 210 215
220Glu Ala Ser Met His Arg Leu Thr Ala Ala Val Thr Phe Ala Gly
Ile225 230 235 240Met Ile Cys Ile Phe Tyr Gln Leu Ser Ser Ser Tyr
Val Glu Arg Thr 245 250 255Glu Ser Ala Lys Pro Ala Ala Asp Asn Glu
Gln Ser Ile Lys Pro Lys 260 265 270Lys Lys Lys Pro Lys Met Ser Met
Val Glu Ser Gly Lys Phe Leu Ala 275 280 285Ser Ser Gln Tyr Leu Arg
Leu Ile Ala Met Leu Val Leu Gly Tyr Gly 290 295 300Leu Ser Ile Asn
Phe Thr Glu Ile Met Trp Lys Ser Leu Val Lys Lys305 310 315 320Gln
Tyr Pro Asp Pro Leu Asp Tyr Gln Arg Phe Met Gly Asn Phe Ser 325 330
335Ser Ala Val Gly Leu Ser Thr Cys Ile Val Ile Phe Phe Gly Val His
340 345 350Val Ile Arg Leu Leu Gly Trp Lys Val Gly Ala Leu Ala Thr
Pro Gly 355 360 365Ile Met Ala Ile Leu Ala Leu Pro Phe Phe Ala Cys
Ile Leu Leu Gly 370 375 380Leu Asp Ser Pro Ala Arg Leu Glu Ile Ala
Val Ile Phe Gly Thr Ile385 390 395 400Gln Ser Leu Leu Ser Lys Thr
Ser Lys Tyr Ala Leu Phe Asp Pro Thr 405 410 415Thr Gln Met Ala Tyr
Ile Pro Leu Asp Asp Glu Ser Lys Val Lys Gly 420 425 430Lys Ala Ala
Ile Asp Val Leu Gly Ser Arg Ile Gly Lys Ser Gly Gly 435 440 445Ser
Leu Ile Gln Gln Gly Leu Val Phe Val Phe Gly Asn Ile Ile Asn 450 455
460Ala Ala Pro Val Val Gly Val Val Tyr Tyr Ser Val Leu Val Ala
Trp465 470 475 480Met Ser Ala Ala Gly Arg Leu Ser Gly Leu Phe Gln
Ala Gln Thr Glu 485 490 495Met Asp Lys Ala Asp Lys Met Glu Ala Lys
Thr Asn Lys Glu Lys 500 505 51051599DNAArtificial
SequenceDescription of Artificial Sequence Synthetic polynucleotide
5atgagaccat ttccgacgat tgccttgatt tcggtttttc tttcggcggc gactcgcatt
60tcggcaggag gcagtactgt tgcaccaact acaccgttgg caaccggcgg tgcgctccgc
120aaagtgcgac aagccgtctt tcccatctac ggaaaccaag aagtcaccaa
atttctgctc 180atcggatcca ttaaattctt tataatcttg gcactcacgc
tcacgcgtga taccaaggac 240acgttgattg tcacgcaatg tggtgccgaa
gcgattgcct ttctcaaaat atacggggtg 300ctacccgcag cgaccgcatt
tatcgcgctc tattccaaaa tgtccaacgc catgggcaaa 360aaaatgctat
tttattccac ttgcattcct ttctttacct ttttcgggct gtttgatgtt
420ttcatttacc cgaacgcgga gcgactgcac cctagtttgg aagccgtgca
ggcaattctc 480ccgggcggtg ccgcatctgg cggcatggcg gttctggcca
agattgcgac acactggaca 540tcggccttat tttacgtcat ggcggaaata
tattcttccg tatcggtggg gctattgttt 600tggcagtttg cgaacgacgt
cgtcaacgtg gatcaggcca agcgctttta tccattattt 660gctcaaatga
gtggcctcgc tccagtttta gcgggccagt atgtggtacg gtttgccagc
720aaagcggtca actttgaggc atccatgcat cgactcacgg cggccgtaac
atttgctggt 780attatgattt gcatctttta ccaactcagt tcgtcatatg
tggagcgaac ggaatcagca 840aagccagcgg cagataacga gcagtctatc
aaaccgaaaa agaagaaacc caaaatgtcc 900atggttgaat cggggaaatt
tctcgcgtca agtcagtacc tgcgtctaat tgccatgctg 960gtgctgggat
acggcctcag tattaacttt accgaaatca tgtggaaaag cttggtgaag
1020aaacaatatc cagacccgct agattatcaa cgatttatgg gtaacttctc
gtcagcggtt 1080ggtttgagca catgcattgt tattttcttc ggtgtgcacg
tgatccgttt gttggggtgg 1140aaagtcggag cgttggctac acctgggatc
atggccattc tagcgttacc cttttttgct 1200tgcattttgt tgggtttgga
tagtccagca cgattggaga tcgccgtaat ctttggaaca 1260attcagagtt
tgctgagcaa aacctccaag tatgcccttt tcgaccctac cacacaaatg
1320gcttatattc ctctggacga cgaatcaaag gtcaaaggaa aagcggcaat
tgatgttttg 1380ggatcgcgga ttggcaagag tggaggctca ctgatccagc
agggcttggt ctttgttttt 1440ggaaatatca ttaatgccgc acctgtagta
ggggttgtct actacagtgt ccttgttgcg 1500tggatgagcg cagctggccg
actaagtggg ctttttcaag cacaaacaga aatggataag 1560gccgacaaaa
tggaggcaaa gaccaacaaa gaaaagtag 15996532PRTArtificial
SequenceDescription of Artificial Sequence Synthetic polypeptide
6Met Arg Pro Phe Pro Thr Ile Ala Leu Ile Ser Val Phe Leu Ser Ala1 5
10 15Ala Thr Arg Ile Ser Ala Gly Gly Ser Thr Val Ala Pro Thr Thr
Pro 20 25 30Leu Ala Thr Gly Gly Ala Leu Arg Lys Val Arg Gln Ala Val
Phe Pro 35 40 45Ile Tyr Gly Asn Gln Glu Val Thr Lys Phe Leu Leu Ile
Gly Ser Ile 50 55 60Lys Phe Phe Ile Ile Leu Ala Leu Thr Leu Thr Arg
Asp Thr Lys Asp65 70 75 80Thr Leu Ile Val Thr Gln Cys Gly Ala Glu
Ala Ile Ala Phe Leu Lys 85 90 95Ile Tyr Gly Val Leu Pro Ala Ala Thr
Ala Phe Ile Ala Leu Tyr Ser 100 105 110Lys Met Ser Asn Ala Met Gly
Lys Lys Met Leu Phe Tyr Ser Thr Cys 115 120 125Ile Pro Phe Phe Thr
Phe Phe Gly Leu Phe Asp Val Phe Ile Tyr Pro 130 135 140Asn Ala Glu
Arg Leu His Pro Ser Leu Glu Ala Val Gln Ala Ile Leu145 150 155
160Pro Gly Gly Ala Ala Ser Gly Gly Met Ala Val Leu Ala Lys Ile Ala
165 170 175Thr His Trp Thr Ser Ala Leu Phe Tyr Val Met Ala Glu Ile
Tyr Ser 180 185 190Ser Val Ser Val Gly Leu Leu Phe Trp Gln Phe Ala
Asn Asp Val Val 195 200 205Asn Val Asp Gln Ala Lys Arg Phe Tyr Pro
Leu Phe Ala Gln Met Ser 210 215 220Gly Leu Ala Pro Val Leu Ala Gly
Gln Tyr Val Val Arg Phe Ala Ser225 230 235 240Lys Ala Val Asn Phe
Glu Ala Ser Met His Arg Leu Thr Ala Ala Val 245 250 255Thr Phe Ala
Gly Ile Met Ile Cys Ile Phe Tyr Gln Leu Ser Ser Ser 260 265 270Tyr
Val Glu Arg Thr Glu Ser Ala Lys Pro Ala Ala Asp Asn Glu Gln 275 280
285Ser Ile Lys Pro Lys Lys Lys Lys Pro Lys Met Ser Met Val Glu Ser
290 295 300Gly Lys Phe Leu Ala Ser Ser Gln Tyr Leu Arg Leu Ile Ala
Met Leu305 310 315 320Val Leu Gly Tyr Gly Leu Ser Ile Asn Phe Thr
Glu Ile Met Trp Lys 325 330 335Ser Leu Val Lys Lys Gln Tyr Pro Asp
Pro Leu Asp Tyr Gln Arg Phe 340 345 350Met Gly Asn Phe Ser Ser Ala
Val Gly Leu Ser Thr Cys Ile Val Ile 355 360 365Phe Phe Gly Val His
Val Ile Arg Leu Leu Gly Trp Lys Val Gly Ala 370 375 380Leu Ala Thr
Pro Gly Ile Met Ala Ile Leu Ala Leu Pro Phe Phe Ala385 390 395
400Cys Ile Leu Leu Gly Leu Asp Ser Pro Ala Arg Leu Glu Ile Ala Val
405 410 415Ile Phe Gly Thr Ile Gln Ser Leu Leu Ser Lys Thr Ser Lys
Tyr Ala 420 425 430Leu Phe Asp Pro Thr Thr Gln Met Ala Tyr Ile Pro
Leu Asp Asp Glu 435 440 445Ser Lys Val Lys Gly Lys Ala Ala Ile Asp
Val Leu Gly Ser Arg Ile 450 455 460Gly Lys Ser Gly Gly Ser Leu Ile
Gln Gln Gly Leu Val Phe Val Phe465 470 475 480Gly Asn Ile Ile Asn
Ala Ala Pro Val Val Gly Val Val Tyr Tyr Ser 485 490 495Val Leu Val
Ala Trp Met Ser Ala Ala Gly Arg Leu Ser Gly Leu Phe 500 505 510Gln
Ala Gln Thr Glu Met Asp Lys Ala Asp Lys Met Glu Ala Lys Thr 515 520
525Asn Lys Glu Lys 53071665DNAArtificial SequenceDescription of
Artificial Sequence Synthetic polynucleotide 7atgacttcct ctcatcaagc
aagtgcactt
cctctcaaaa agggaacgca tgtcccggac 60tctccgaagt tgtcaaagct atatatcatg
gccaaaacca agagtgtatc ctcgtccttc 120gacccccctc ggggaggcag
tactgttgca ccaactacac cgttggcaac cggcggtgcg 180ctccgcaaag
tgcgacaagc cgtctttccc atctacggaa accaagaagt caccaaattt
240ctgctcatcg gatccattaa attctttata atcttggcac tcacgctcac
gcgtgatacc 300aaggacacgt tgattgtcac gcaatgtggt gccgaagcga
ttgcctttct caaaatatac 360ggggtgctac ccgcagcgac cgcatttatc
gcgctctatt ccaaaatgtc caacgccatg 420ggcaaaaaaa tgctatttta
ttccacttgc attcctttct ttaccttttt cgggctgttt 480gatgttttca
tttacccgaa cgcggagcga ctgcacccta gtttggaagc cgtgcaggca
540attctcccgg gcggtgccgc atctggcggc atggcggttc tggccaagat
tgcgacacac 600tggacatcgg ccttatttta cgtcatggcg gaaatatatt
cttccgtatc ggtggggcta 660ttgttttggc agtttgcgaa cgacgtcgtc
aacgtggatc aggccaagcg cttttatcca 720ttatttgctc aaatgagtgg
cctcgctcca gttttagcgg gccagtatgt ggtacggttt 780gccagcaaag
cggtcaactt tgaggcatcc atgcatcgac tcacggcggc cgtaacattt
840gctggtatta tgatttgcat cttttaccaa ctcagttcgt catatgtgga
gcgaacggaa 900tcagcaaagc cagcggcaga taacgagcag tctatcaaac
cgaaaaagaa gaaacccaaa 960atgtccatgg ttgaatcggg gaaatttctc
gcgtcaagtc agtacctgcg tctaattgcc 1020atgctggtgc tgggatacgg
cctcagtatt aactttaccg aaatcatgtg gaaaagcttg 1080gtgaagaaac
aatatccaga cccgctagat tatcaacgat ttatgggtaa cttctcgtca
1140gcggttggtt tgagcacatg cattgttatt ttcttcggtg tgcacgtgat
ccgtttgttg 1200gggtggaaag tcggagcgtt ggctacacct gggatcatgg
ccattctagc gttacccttt 1260tttgcttgca ttttgttggg tttggatagt
ccagcacgat tggagatcgc cgtaatcttt 1320ggaacaattc agagtttgct
gagcaaaacc tccaagtatg cccttttcga ccctaccaca 1380caaatggctt
atattcctct ggacgacgaa tcaaaggtca aaggaaaagc ggcaattgat
1440gttttgggat cgcggattgg caagagtgga ggctcactga tccagcaggg
cttggtcttt 1500gtttttggaa atatcattaa tgccgcacct gtagtagggg
ttgtctacta cagtgtcctt 1560gttgcgtgga tgagcgcagc tggccgacta
agtgggcttt ttcaagcaca aacagaaatg 1620gataaggccg acaaaatgga
ggcaaagacc aacaaagaaa agtag 16658554PRTArtificial
SequenceDescription of Artificial Sequence Synthetic polypeptide
8Met Thr Ser Ser His Gln Ala Ser Ala Leu Pro Leu Lys Lys Gly Thr1 5
10 15His Val Pro Asp Ser Pro Lys Leu Ser Lys Leu Tyr Ile Met Ala
Lys 20 25 30Thr Lys Ser Val Ser Ser Ser Phe Asp Pro Pro Arg Gly Gly
Ser Thr 35 40 45Val Ala Pro Thr Thr Pro Leu Ala Thr Gly Gly Ala Leu
Arg Lys Val 50 55 60Arg Gln Ala Val Phe Pro Ile Tyr Gly Asn Gln Glu
Val Thr Lys Phe65 70 75 80Leu Leu Ile Gly Ser Ile Lys Phe Phe Ile
Ile Leu Ala Leu Thr Leu 85 90 95Thr Arg Asp Thr Lys Asp Thr Leu Ile
Val Thr Gln Cys Gly Ala Glu 100 105 110Ala Ile Ala Phe Leu Lys Ile
Tyr Gly Val Leu Pro Ala Ala Thr Ala 115 120 125Phe Ile Ala Leu Tyr
Ser Lys Met Ser Asn Ala Met Gly Lys Lys Met 130 135 140Leu Phe Tyr
Ser Thr Cys Ile Pro Phe Phe Thr Phe Phe Gly Leu Phe145 150 155
160Asp Val Phe Ile Tyr Pro Asn Ala Glu Arg Leu His Pro Ser Leu Glu
165 170 175Ala Val Gln Ala Ile Leu Pro Gly Gly Ala Ala Ser Gly Gly
Met Ala 180 185 190Val Leu Ala Lys Ile Ala Thr His Trp Thr Ser Ala
Leu Phe Tyr Val 195 200 205Met Ala Glu Ile Tyr Ser Ser Val Ser Val
Gly Leu Leu Phe Trp Gln 210 215 220Phe Ala Asn Asp Val Val Asn Val
Asp Gln Ala Lys Arg Phe Tyr Pro225 230 235 240Leu Phe Ala Gln Met
Ser Gly Leu Ala Pro Val Leu Ala Gly Gln Tyr 245 250 255Val Val Arg
Phe Ala Ser Lys Ala Val Asn Phe Glu Ala Ser Met His 260 265 270Arg
Leu Thr Ala Ala Val Thr Phe Ala Gly Ile Met Ile Cys Ile Phe 275 280
285Tyr Gln Leu Ser Ser Ser Tyr Val Glu Arg Thr Glu Ser Ala Lys Pro
290 295 300Ala Ala Asp Asn Glu Gln Ser Ile Lys Pro Lys Lys Lys Lys
Pro Lys305 310 315 320Met Ser Met Val Glu Ser Gly Lys Phe Leu Ala
Ser Ser Gln Tyr Leu 325 330 335Arg Leu Ile Ala Met Leu Val Leu Gly
Tyr Gly Leu Ser Ile Asn Phe 340 345 350Thr Glu Ile Met Trp Lys Ser
Leu Val Lys Lys Gln Tyr Pro Asp Pro 355 360 365Leu Asp Tyr Gln Arg
Phe Met Gly Asn Phe Ser Ser Ala Val Gly Leu 370 375 380Ser Thr Cys
Ile Val Ile Phe Phe Gly Val His Val Ile Arg Leu Leu385 390 395
400Gly Trp Lys Val Gly Ala Leu Ala Thr Pro Gly Ile Met Ala Ile Leu
405 410 415Ala Leu Pro Phe Phe Ala Cys Ile Leu Leu Gly Leu Asp Ser
Pro Ala 420 425 430Arg Leu Glu Ile Ala Val Ile Phe Gly Thr Ile Gln
Ser Leu Leu Ser 435 440 445Lys Thr Ser Lys Tyr Ala Leu Phe Asp Pro
Thr Thr Gln Met Ala Tyr 450 455 460Ile Pro Leu Asp Asp Glu Ser Lys
Val Lys Gly Lys Ala Ala Ile Asp465 470 475 480Val Leu Gly Ser Arg
Ile Gly Lys Ser Gly Gly Ser Leu Ile Gln Gln 485 490 495Gly Leu Val
Phe Val Phe Gly Asn Ile Ile Asn Ala Ala Pro Val Val 500 505 510Gly
Val Val Tyr Tyr Ser Val Leu Val Ala Trp Met Ser Ala Ala Gly 515 520
525Arg Leu Ser Gly Leu Phe Gln Ala Gln Thr Glu Met Asp Lys Ala Asp
530 535 540Lys Met Glu Ala Lys Thr Asn Lys Glu Lys545
55093013DNAArtificial SequenceDescription of Artificial Sequence
Synthetic polynucleotidemodified_base(2349)..(2349)Any unnatural
nucleotide 9aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt
tgctcacatg 60ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt
tgagtgagct 120gataccgctc gccgcagccg aacgaccgag cgcagcgagt
cagtgagcga ggaagcggaa 180gagcgcctga tgcggtattt tctccttacg
catctgtgcg gtatttcaca ccgcatatgg 240tgcactctca gtacaatctg
ctctgatgcc gcatagttaa gccagtatac actccgctat 300cgctacgtga
ctgggtcatg gctgcgcccc gacacccgcc aacacccgct gacgcgccct
360gacgggcttg tctgctcccg gcatccgctt acagacaagc tgtgaccgtc
tccgggagct 420gcatgtgtca gaggttttca ccgtcatcac cgaaacgcgc
gaggcagctg cggtaaagct 480catcagcgtg gtcgtgaagc gattcacaga
tgtctgcctg ttcatccgcg tccagctcgt 540tgagtttctc cagaagcgtt
aatgtctggc ttctgataaa gcgggccatg ttaagggcgg 600ttttttcctg
tttggtcact gatgcctccg tgtaaggggg atttctgttc atgggggtaa
660tgataccgat gaaacgagag aggatgctca cgatacgggt tactgatgat
gaacatgccc 720ggttactgga acgttgtgag ggtaaacaac tggcggtatg
gatgcggcgg gaccagagaa 780aaatcactca gggtcaatgc cagcgcttcg
ttaatacaga tgtaggtgtt ccacagggta 840gccagcagca tcctgcgatg
cagatccgga acataatggt gcagggcgct gacttccgcg 900tttccagact
ttacgaaaca cggaaaccga agaccattca tgttgttgct caggtcgcag
960acgttttgca gcagcagtcg cttcacgttc gctcgcgtat cggtgattca
ttctgctaac 1020cagtaaggca accccgccag cctagccggg tcctcaacga
caggagcacg atcatgcgca 1080cccgtggcca ggacccaacg ctgcccgaaa
ttcttgaaga cgaaagggcc tcgtgatacg 1140cctattttta taggttaatg
tcatgataat aatggtttct tagacgtcag gtggcacttt 1200tcggggaaat
gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta
1260tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa
ggaagagtat 1320gagtattcaa catttccgtg tcgcccttat tccctttttt
gcggcatttt gccttcctgt 1380ttttgctcac ccagaaacgc tggtgaaagt
aaaagatgct gaagatcagt tgggtgcacg 1440agtgggttac atcgaactgg
atctcaacag cggtaagatc cttgagagtt ttcgccccga 1500agaacgtttt
ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg
1560tgttgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga
atgacttggt 1620tgagtactca ccagtcacag aaaagcatct tacggatggc
atgacagtaa gagaattatg 1680cagtgctgcc ataaccatga gtgataacac
tgcggccaac ttacttctga caacgatcgg 1740aggaccgaag gagctaaccg
cttttttgca caacatgggg gatcatgtaa ctcgccttga 1800tcgttgggaa
ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc
1860tgcagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta
ctctagcttc 1920ccggcaacaa ttaatagact ggatggaggc ggataaagtt
gcaggaccac ttctgcgctc 1980ggcccttccg gctggctggt ttattgctga
taaatctgga gccggtgagc gtggctctcg 2040cggtatcatt gcagcactgg
ggccagatgg taagccctcc cgtatcgtag ttatctacac 2100gacggggagt
caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc
2160actgattaag cattggtaac tgtcagacca agtttactca tatatacttt
agattgattt 2220aaaacttcat ttttaattta aaaggatcta ggtgaagatc
ctttttgata atctcatgac 2280caaaatccct taacgtgagt tttcgttggc
tttacacttt atgcttccgg ctcgtatgtt 2340gtgtggaant gtgagcggat
aacaatttca cacaggaaac agccactgag cgtcagaccc 2400cgtagaaaag
atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt
2460gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag
agctaccaac 2520tctttttccg aaggtaactg gcttcagcag agcgcagata
ccaaatactg tccttctagt 2580gtagccgtag ttaggccacc acttcaagaa
ctctgtagca ccgcctacat acctcgctct 2640gctaatcctg ttaccagtgg
ctgctgccag tggcgataag tcgtgtctta ccgggttgga 2700ctcaagacga
tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac
2760acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc
gtgagctatg 2820agaaagcgcc acgcttcccg aagggagaaa ggcggacagg
tatccggtaa gcggcagggt 2880cggaacagga gagcgcacga gggagcttcc
agggggaaac gcctggtatc tttatagtcc 2940tgtcgggttt cgccacctct
gacttgagcg tcgatttttg tgatgctcgt caggggggcg 3000gagcctatgg aaa
30131020DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 10gggaggcagt actgttgcac 201145DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
11ggtatatctc cttattaaag ttaaacaaaa ttatttctac agggg
451220DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 12taatacgact cactataggg 201319DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
13gctagttatt gctcagcgg 191422DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 14ttacattaat tgcgttgcgc tc
221531DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 15ttttggcgga tggcatttga gaagcacacg g
311625DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 16attctcacca ataaaaaacg cccgg 251736DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
17cctgtagaaa taattttgtt taactttaat aaggag 361820DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
18ccccgcgcgt tggccgattc 201919DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 19gaagggcaat cagctgttg
192021DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 20cagggcaggg tcgttaaata g 212152DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
21gacaccatcg aatggcgcaa aacctttcgc ggtatggcat gatagcgccc gg
522252DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 22ccgggcgcta tcatgccata ccgcgaaagg ttttgcgcca
ttcgatggtg tc 522351DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 23atttttctaa atacattcaa atatgtatcc
gctcatgaga caataaccct g 512451DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 24cagggttatt gtctcatgag
cggatacata tttgaatgta tttagaaaaa t 512552DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
25ttaggcaccc caggctttac actttatgct tccggctggt atgttgtgtg ga
522652DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 26tccacacaac ataccagccg gaagcataaa gtgtaaagcc
tggggtgcct aa 522752DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 27ctaggcaccc caggctttac actttatgct
tccggctggt ataatgtgtg ga 522852DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 28tccacacatt ataccagccg
gaagcataaa gtgtaaagcc tggggtgcct ag 522924DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
29ggggataacg caggaaagaa catg 243022DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
30
gcacttttcg gggaaatgtg cg 22
31 75 DNA Artificial Sequence
Description of Artificial Sequence Synthetic primer
31
aattgcggcc tatatggatg ttggaaccgt aagagaaata gacaggcggt cctgtgacgg 60
aagatcactt cgcag 75
32 46 DNA Artificial Sequence
Description of Artificial Sequence Synthetic primer
32
tgctcacatg ttctttcctg cgttatcccc gcgtggtgaa
ccaggc 463328DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 33accgcctgtc tatttctctt acggttcc
283449DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 34cgcgcttaat gcgccgctac agggcgcgtc gattggtgcc
agcgcgcag 493543DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 35ggtatatctc cttattaaag ttaaacaaaa
ttatttctac agg 433621DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 36cagccacgtt tctgcgaaaa c
213720DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 37tacagcggtt ccttactggc 203826DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
38gggtggtgaa tgtgaaacca gtaacg 263921DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
39ctgggtgttt acttcggtct g 214021DNAArtificial SequenceDescription
of Artificial Sequence Synthetic primer 40ggccgtaata tccagctgaa c
214120DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 41actagggtgc agtcgctccg 204222DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
42ttaacctagg ctgctgccac cg 224357DNAArtificial SequenceDescription
of Artificial Sequence Synthetic primer 43gcgcaacgca attaatgtaa
ttctgaaatg agctgttgac aattaatcat cggctcg 574456DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
44aacaaaatta tttctacagg tccacacatt atacgagccg atgattaatt gtcaac
564558DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 45gcgcaacgca attaatgtaa tcataaaaaa tttatttgct
ttcaggaaaa tttttctg 584659DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 46aacaaaatta tttctacagg
tgaatctatt atacagaaaa attttcctga aagcaaata 594756DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
47gcgcaacgca attaatgtaa ttatctctgg cggtgttgac ataaatacca ctggcg
564857DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 48aacaaaatta tttctacagg tgtgctcagt atcaccgcca
gtggtattta tgtcaac 574958DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 49gcgcaacgca attaatgtaa
ttttaaaaaa ttcatttgct aaacgcttca aattctcg 585055DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
50aacaaaatta tttctacagg tgaagtatat tatacgagaa tttgaagcgt ttagc
555121DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 51tggggtgcct aatgagtgag c 215226DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
52ctatgaccat gattacgcca agcttg 265324DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
53tctcgcggta tcattgcagc actg 245418DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
54gccacgctca ccggctcc 185521DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 55aacgaaaact cacgttaagg g
215617DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 56ccactgagcg tcagacc 175730DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
57gagacccgtc gttgacaatt aatcatcggc 305831DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
58gagaccattc tcaccaataa aaaacgcccg g 315930DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
59cggggtacca tggacaagaa gtactccatt
306034DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 60ctagtctaga ttacaccttc ctcttcttct tggg
346133DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 61ctccggggaa accgccgaag ccacgcggct caa
336244DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 62cttcggcggt ttccccggag tcgaacagga gggcgccaat gagg
446340DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 63aggaagaaga cgtctcacgc atcttactgc gcagatacgc
406460DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 64aagatgcgtg agacgtcttc ttcctcgtct cggtcgacag
ttcataggtg attgctcagg 606526DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 65gtcccaaatc gcagccaatc acattg
266628DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 66gtcctgacca tcgtattggt tatctggc
286726DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 67atttagaggg cagtgccagc tcgtta 266826DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
68ctgcattcag gtaggcatca tgcgca 266926DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
69ctgggctacc tgcaagatta gcgatg 267025DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
70tgaaggactg ggcagaggcc ccctt 257126DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
71cgtaggtgtc tttgctcagt tgaagc 267227DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
72tagccatctc attactaaag atctcct 277343DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
73cgatatcgtt ggtctcaacg acacaattgt aaaggttaga tct
437443DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 74caacgatatc ggtctcacac tgactgggcc tttcgtttta tct
437523DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 75gcaatcacct atgaactgtc gac 237669DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
76aggaggaagg acgtctcatg cgccccgcat tcacacaatg tagtgatcag ttttagagct
60agaaatagc 697769DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 77aggaggaagg acgtctcatg cgccccgcat
tccaggatgg gtaccacccg ttttagagct 60agaaatagc 697869DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
78aggaggaagg acgtctcatg cgccccgcat tcacacaatg tagtcatcag ttttagagct
60agaaatagc 697969DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 79aggaggaagg acgtctcatg cgccccgcat
tccaggatgg gcaccagccg ttttagagct 60agaaatagc 698069DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
80aggaggaagg acgtctcatg cgccccgcat tccatgatgg gcaccacccg ttttagagct
60agaaatagc 698169DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 81aggaggaagg acgtctcatg cgccccgcat
tcacacaatg tatagatcag ttttagagct 60agaaatagc 698269DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
82aggaggaagg acgtctcatg cgccccgcat tccaggatgg gcaccatccg ttttagagct
60agaaatagc 698369DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 83aggaggaagg acgtctcatg cgccccgcat
tgttgtgtgg aaatgtgagg ttttagagct 60agaaatagc 698469DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
84aggaggaagg acgtctcatg cgccccgcat ttgtcactac tctgaccagg ttttagagct
60agaaatagc 698569DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 85aggaggaagg acgtctcatg cgccccgcat
ttgtcactac tctgaccaag ttttagagct 60agaaatagc 698669DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
86aggaggaagg acgtctcatg cgccccgcat tcacacaatg tactcatcag ttttagagct
60agaaatagc 698769DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 87aggaggaagg acgtctcatg cgccccgcat
tccaagatgg gcaccacccg ttttagagct 60agaaatagc 698869DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
88aggaggaagg acgtctcatg cgccccgcat tcacacaatg tattgatcag ttttagagct
60agaaatagc 698969DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 89aggaggaagg acgtctcatg cgccccgcat
tcacacaatg tataaatcag ttttagagct 60agaaatagc 699069DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
90aggaggaagg acgtctcatg cgccccgcat tccaggatgg ggaccacccg ttttagagct
60agaaatagc 699169DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 91aggaggaagg acgtctcatg cgccccgcat
tttcacaata ctttctttag ttttagagct 60agaaatagc 699243DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
92gcatttcaca caatgtagga tcagttttag agctagaaat agc
439338DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 93tgatcctaca ttgtgtgaaa tgcggggcgc atcttact
389443DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 94gcattaccag gatgggacca cccgttttag agctagaaat agc
439538DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 95gggtggtccc atcctggtaa tgcggggcgc atcttact
389643DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 96gcatttcaca caatgtagca tcagttttag agctagaaat agc
439738DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 97tgatgctaca ttgtgtgaaa tgcggggcgc atcttact
389843DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 98gcattaccag gatgggcacc accgttttag agctagaaat agc
439938DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 99ggtggtgccc atcctggtaa tgcggggcgc atcttact
3810043DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 100gcattaccag atgggcacca cccgttttag agctagaaat agc
4310138DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 101gggtggtgcc catctggtaa tgcggggcgc atcttact
3810243DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 102gcatttcaca caatgtaaga tcagttttag agctagaaat agc
4310338DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 103tgatcttaca ttgtgtgaaa tgcggggcgc atcttact
3810443DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 104gcatttgttg tgtggaatgt gaggttttag agctagaaat agc
4310538DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 105ctcacattcc acacaacaaa tgcggggcgc atcttact
3810643DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 106gcattttgtc actactctga ccggttttag agctagaaat agc
4310738DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 107cggtcagagt agtgacaaaa tgcggggcgc atcttact
3810843DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 108gcattttgtc actactctga ccagttttag agctagaaat agc
4310938DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 109tggtcagagt agtgacaaaa tgcggggcgc atcttact
3811043DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 110gcatttcaca caatgtacca tcagttttag agctagaaat agc
4311138DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 111tgatggtaca ttgtgtgaaa tgcggggcgc atcttact
3811243DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 112gcatttcaca caatgtatga tcagttttag agctagaaat agc
4311338DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 113tgatcataca ttgtgtgaaa tgcggggcgc atcttact
3811443DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 114gcatttcaca caatgtataa tcagttttag agctagaaat agc
4311538DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 115tgattataca ttgtgtgaaa tgcggggcgc atcttact
3811643DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 116gcattattca caatacttct ttagttttag agctagaaat agc
4311738DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 117taaagaagta ttgtgaataa tgcggggcgc atcttact
3811847DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 118agaaggaaga cgtctcactg tcgaccaaaa aagcctgctc
gttgagc 4711940DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 119aagaaggacg tctcaacagt agtggcagcg
gctaactaag 4012044DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 120aggagaggac gtctctcgac caaaaaagcc
tgctcgttga gcag 4412159DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 121agtaagatgc gccccgcatt
gaccaggatg ggcaccaccc gttttagagc tagaaatag 5912259DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
122ctatttctag ctctaaaacg ggtggtgccc atcctggtca atgcggggcg catcttact
5912339DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 123gttttagagc tagaaatagc aagttaaaat aaggctagt
3912420DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 124aatgcggggc gcatcttact 2012522DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
125cattggcacc ggtctactaa ac 2212622DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
126aaacgtttag tagaccggtg cc 2212729DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
127cgcattcaca caatgtaagt atcagtttt 2912829DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
128ctctaaaact gatacttaca ttgtgtgaa 2912960DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
129gtttacgtcg ccgtccagct cgaccaggat gggcaccaac ccggtgaaca
gctcctcgcc 6013060DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 130gtttacgtcg ccgtccagct cgaccaggat
gggcaccagc ccggtgaaca gctcctcgcc 6013160DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
131gtttacgtcg ccgtccagct cgaccaggat gggcaccatc ccggtgaaca
gctcctcgcc 6013260DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 132gtttacgtcg ccgtccagct cgaccaggat
gggcaccacc ccggtgaaca gctcctcgcc 6013360DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
133gcatcgccct cgccctcgcc ggacacgctg aacttgtggc cgtttacgtc
gccgtccagc 6013460DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 134tgcagtttca tttgatgctc gatgagttat
ggtgagcaag ggcgaggagc tgttcaccgg 60
135 79 DNA Artificial Sequence
Description of Artificial Sequence Synthetic primer
135
ttaatacgac tcactatagg gaccaggatg ggcaccaacc gttttagagc tagaaatagc 60
aagttaaaat aaggctagt 79
136 79 DNA Artificial Sequence
Description of Artificial Sequence Synthetic primer
136
ttaatacgac tcactatagg gaccaggatg ggcaccagcc gttttagagc tagaaatagc 60
aagttaaaat aaggctagt 79
137 79 DNA Artificial Sequence
Description of Artificial Sequence Synthetic primer
137
ttaatacgac tcactatagg gaccaggatg ggcaccatcc gttttagagc tagaaatagc 60
aagttaaaat aaggctagt 79
138 62 DNA Artificial Sequence
Description of Artificial Sequence Synthetic primer
138
ttaatacgac tcactatagg gaccaggatg ggcaccaccc gttttagagc tatgctgttt 60
tg 62
139 82 DNA Artificial Sequence
Description of Artificial Sequence Synthetic primer
139
aaaagcaccg actcggtgcc actttttcaa gttgataacg gactagcctt attttaactt 60
gctatttcta gctctaaaac gg 82
140 35 DNA Artificial Sequence
Description of Artificial Sequence Synthetic primer
140
aagaggaaga ggttaatacg actcactata gggac
3514120DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 141aaaagcaccg actcggtgcc 2014241DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
142actagcctta ttttaacttg ctatttctag ctctaaaacg g
4114348DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 143aagaaggaga aggtctctag tggagcaagg gcgaggagct
gttcaccg 4814443DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 144aagagaagag aggtctcatc gtgtttacgt
cgccgtccag ctc 4314534DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 145atgggtctcc agtggggcca
acacttgtca ctac 3414634DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 146atgggtctct tcgtttccgg
ataacgggaa aagc 3414739DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 147atgggtctcc agtggctcga
gtacaacttt aactcacac 3914839DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 148atgggtctct tcgttgattc
cattcttttg tttgtctgc 3914941DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 149atgggtctct catagctgtt
tcctgtgtga aattgttatc c 4115037DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 150atgggtctca ccccaggctt
tacactttat gcttccg 3715141DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 151atgggtctcc agtggctgtt
tcctgtgtga aattgttatc c 4115237DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 152atgggtctct tcgttggctt
tacactttat gcttccg 3715336DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 153atgggtctcc agtggcacac
aggaaacagc tatgac 3615441DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 154atgggtctct tcgttgggtt
aagcttaact ttaagaagga g 4115540DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 155atgggtctca cacaaactcg
agtacaactt taactcacac 4015633DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 156atgggtctcg attccattct
tttgtttgtc tgc 3315729DNAArtificial SequenceDescription of
Artificial Sequence Synthetic primer 157attggtctcg gccgagcggt
tgaaggcac 2915829DNAArtificial SequenceDescription of Artificial
Sequence Synthetic primer 158attggtctct ctggaaccct ttcgggtcg
2915919DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 159ggcgaggagc tgttcaccg 1916021DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
160gtttacgtcg ccgtccagct c 2116120DNAArtificial SequenceDescription
of Artificial Sequence Synthetic primer 161ggccaacact tgtcactact
2016219DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 162tccggataac gggaaaagc 1916324DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
163ctcgagtaca actttaactc acac 2416424DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
164gattccattc ttttgtttgt ctgc 2416526DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
165ctgtttcctg tgtgaaattg ttatcc 2616622DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
166ggctttacac tttatgcttc cg 2216726DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
167cccgggttat tacatgcgct agcact 2616848DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
168gaaattaata cgactcacta tagggttaag cttaacttta agaaggag
4816921DNAArtificial SequenceDescription of Artificial Sequence
Synthetic primer 169tgcaagcagc agattacgcg c 2117024DNAArtificial
SequenceDescription of Artificial Sequence Synthetic primer
170gtaactgtca gaccaagttt actc 2417160DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotidemodified_base(22)..(22)dNaM 171ggcgaggagc tgttcaccgg
gntggtgccc atcctggtcg agctggacgg cgacgtaaac 6017260DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotidemodified_base(34)..(34)dNaM 172gtttacgtcg ccgtccagct
cgaccaggat gggnaccacc ccggtgaaca gctcctcgcc 6017360DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotidemodified_base(27)..(27)dNaM 173gtttacgtcg ccgtccagct
cgaccangat gggcaccacc ccggtgaaca gctcctcgcc 6017460DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotidemodified_base(39)..(39)dNaM 174gtttacgtcg ccgtccagct
cgaccaggat gggcaccanc ccggtgaaca gctcctcgcc 6017575DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotidemodified_base(34)..(34)dNaM 175ctgtttcctg tgtgaaattg
ttatccgctc acanttccac acaacatacg agccggaagc 60ataaagtgta aagcc
7517663DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(31)..(31)dNaM 176ctcgagtaca
actttaactc acacaatgta nagatcacgg cagacaaaca aaagaatgga 60atc
6317758DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(31)..(31)dNaM 177ccggataacg
ggaaaagcat tgaacaccgc nggtcagagt agtgacaagt gttggcca
5817858DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(27)..(27)dNaM 178ggccaacact
tgtcactact ctgaccnagg gtgttcaatg cttttcccgt tatccgga
5817960DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(34)..(34)dNaM 179ggcgaggagc
tgttcaccgg ggtggtgccc atcntggtcg agctggacgg cgacgtaaac
6018060DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(27)..(27)dNaM 180ggcgaggagc
tgttcaccgg ggtggtnccc atcctggtcg agctggacgg cgacgtaaac
6018164DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 181gattccattc
ttttgtttgt ctgccgtgat tnatacattg tgtgagttaa agttgtactc 60gagt
64
182 134 DNA Artificial Sequence
Description of Artificial Sequence Synthetic polynucleotide
modified_base (63)..(63) dNaM
182
cacacaggaa acagctatga cccgggttat tacatgcgct agcacttgga attcacaata 60
ctntctttaa ggaaaccata gtaaatctcc ttcttaaagt taagcttaac cctatagtga 120
gtcgtattaa tttc 134
183 63 DNA Artificial Sequence
Description of
Artificial Sequence Synthetic
oligonucleotidemodified_base(32)..(32)dNaM 183ctcgagtaca actttaactc
acacaatgta anaatcacgg cagacaaaca aaagaatgga 60atc
6318463DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 184ctcgagtaca
actttaactc acacaatgta ancatcacgg cagacaaaca aaagaatgga 60atc
6318563DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 185ctcgagtaca
actttaactc acacaatgta angatcacgg cagacaaaca aaagaatgga 60atc
6318663DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 186ctcgagtaca
actttaactc acacaatgtc antgtcacgg cagacaaaca aaagaatgga 60atc
6318763DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 187ctcgagtaca
actttaactc acacaatgta cnaatcacgg cagacaaaca aaagaatgga 60atc
6318863DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 188ctcgagtaca
actttaactc acacaatgta cncatcacgg cagacaaaca aaagaatgga 60atc
6318963DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 189ctcgagtaca
actttaactc acacaatgta cngatcacgg cagacaaaca aaagaatgga 60atc
6319063DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 190ctcgagtaca
actttaactc acacaatgta cntatcacgg cagacaaaca aaagaatgga 60atc
6319163DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 191ctcgagtaca
actttaactc acacaatgta gnaatcacgg cagacaaaca aaagaatgga 60atc
6319263DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 192ctcgagtaca
actttaactc acacaatgta gncatcacgg cagacaaaca aaagaatgga 60atc
6319363DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 193ctcgagtaca
actttaactc acacaatgta gngatcacgg cagacaaaca aaagaatgga 60atc
6319463DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 194ctcgagtaca
actttaactc acacaatgta gntatcacgg cagacaaaca aaagaatgga 60atc
6319563DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 195ctcgagtaca
actttaactc acacaatgta tnaatcacgg cagacaaaca aaagaatgga 60atc
6319663DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 196ctcgagtaca
actttaactc acacaatgta tncatcacgg cagacaaaca aaagaatgga 60atc
6319763DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 197ctcgagtaca
actttaactc acacaatgta tngatcacgg cagacaaaca aaagaatgga 60atc
6319863DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(32)..(32)dNaM 198ctcgagtaca
actttaactc acacaatgta tntatcacgg cagacaaaca aaagaatgga 60atc
6319963DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(33)..(33)dNaM 199ctcgagtaca
actttaactc acacaatgta agnatcacgg cagacaaaca aaagaatgga 60atc
6320056DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(31)..(31)dNaM 200ctctggaacc
ctttcgggtc gccggtttag nagaccggtg ccttcaaccg ctcggc
56
201 23 DNA Artificial Sequence
Description of Artificial Sequence Synthetic oligonucleotide
modified_base (18)..(18) dC, dT, dA, dG, dTPT3 or dNaM
See specification as filed for detailed description of substitutions and preferred embodiments
201
ctggtcctac ccgtggtngg tcc 23
202 20 DNA Artificial Sequence
Description of Artificial Sequence Synthetic oligonucleotide
Description of Combined DNA/RNA Molecule Synthetic oligonucleotide
modified_base (18)..(18) g, a, u or c
See specification as filed for detailed description of substitutions and preferred embodiments
202
gaccaggatg ggcaccancc
2020320RNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotide 203gaccaggaug ggcaccaccc
2020418RNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotide 204guugugugga aaugugag
1820518RNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotide 205uguugugugg aaugugag
18
206 22 DNA Artificial Sequence
Description of Artificial Sequence Synthetic oligonucleotide
Description of Combined DNA/RNA Molecule Synthetic oligonucleotide
modified_base (16)..(16) dTPT3
206
guatgttgtg tggaantgtg ag 22
207 13 DNA Artificial Sequence
Description of Artificial Sequence Synthetic oligonucleotide
CDS (1)..(12)
207
tca cat ttc cac a 13
Ser His Phe His
1
208 4 PRT Artificial Sequence
Description of Artificial Sequence Synthetic peptide
208
Ser His Phe His
1
209 22 DNA Artificial Sequence
Description of Artificial Sequence Synthetic oligonucleotide
modified_base (14)..(14) dNaM
209tcacacaatg tagngatcac gg 2221022DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotidemodified_base(12)..(12)dNaM 210accaggatgg gnaccacccc
gg 2221122DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(14)..(14)dNaM 211tcacacaatg
tagncatcac gg 2221222DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(17)..(17)dTPT3 212accaggatgg
gcaccanccc gg 2221322DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(5)..(5)dNaM 213accangatgg gcaccacccc
gg 2221422DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(13)..(13)dNaM 214tcacacaatg
tanagatcac gg 2221522DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(17)..(17)dNaM 215accaggatgg gcaccanccc
gg 2221622DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(13)..(13)dTPT3 216tgttgtgtgg
aantgtgagc gg 2221722DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(18)..(18)dTPT3 217ttgtcactac
tctgaccngc gg 2221822DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(18)..(18)dNaM 218ttgtcactac tctgaccnag
gg 2221922DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(14)..(14)dNaM 219tcacacaatg
tacncatcac gg 2222022DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(5)..(5)dTPT3 220accangatgg gcaccacccc
gg 2222122DNAArtificial SequenceDescription of Artificial Sequence
Synthetic oligonucleotidemodified_base(14)..(14)dNaM 221tcacacaatg
tatngatcac gg 2222222DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(14)..(14)dTPT3 222tcacacaatg
tatnaatcac gg 2222322DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(12)..(12)dTPT3 223accaggatgg
gnaccacccc gg 2222422DNAArtificial SequenceDescription of
Artificial Sequence Synthetic
oligonucleotidemodified_base(13)..(13)dNaM 224attcacaata ctntctttaa
gg 22
225 22 PRT Artificial Sequence
Description of Artificial Sequence Synthetic peptide
225
Met Lys Tyr Leu Leu Pro Thr Ala Glu Ala Gly Leu Leu Leu Leu Ala
1               5                   10                  15
Ala Gln Pro Ala Ile Ala
            20
226 26 PRT Artificial Sequence
Description of Artificial Sequence Synthetic peptide
226
Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Glu Leu Thr
1               5                   10                  15
Thr Met Met Phe Ser Ala Ser Ala Leu Ala
            20                  25
227 21 PRT Artificial Sequence
Description of Artificial Sequence Synthetic peptide
227
Met Lys Gln Ser Thr Ile Ala Leu Ala Leu Leu Pro Leu Leu Phe Thr
1               5                   10                  15
Pro Val Thr Lys Ala
            20
228 30 PRT Artificial Sequence
Description of Artificial Sequence Synthetic polypeptide
228
Met Lys Ser Pro Ala Pro Ser Arg Pro Gln Lys Met Ala Leu Ile Pro
1               5                   10                  15
Ala Cys Ile Phe Leu Cys Phe Ala Ala Leu Ser Val Gln Ala
            20                  25                  30
229 20 PRT Artificial Sequence
Description of Artificial Sequence Synthetic peptide
229
Met Lys Lys Ile Leu Val Ser Phe Val Ala Ile Met Ala Val Ala Ser
1               5                   10                  15
Ser Ala Met Ala
            20
230 40 PRT Artificial Sequence
Description of Artificial Sequence Synthetic polypeptide
230
Met Lys Ile Ser Met Gln Lys Ala Asp Phe Trp Lys Lys Ala Ala Ile
1               5                   10                  15
Ser Leu Leu Val Phe Thr Met Phe Phe Thr Leu Met Met Ser Glu Thr
            20                  25                  30
Val Phe Ala Ala Gly Leu Asn Lys
        35                  40
231 28 PRT Artificial Sequence
Description of Artificial Sequence Synthetic peptide
231
Met Lys Lys Thr Ala Ile Ala Ile Ala Val Ala Leu Ala Gly Phe Ala
1               5                   10                  15
Thr Val Ala Gln Ala Ser Ala Gly Leu Asn Lys Asp
            20                  25
232 19 PRT Artificial Sequence
Description of Artificial Sequence Synthetic peptide
232
Met Lys Lys Ile Trp Leu Ala Leu Ala Gly Leu Val Leu Ala Phe Ser
1               5                   10                  15
Ala Ser Ala
233 57 DNA Artificial
SequenceDescription of Artificial Sequence Synthetic
oligonucleotide 233atgaaaaaga tttggctggc gctggctggt ttagttttag
cgtttagcgc atcggcg 5723422PRTArtificial SequenceDescription of
Artificial Sequence Synthetic peptide 234Met Lys Tyr Leu Leu Pro
Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala1 5 10 15Ala Gln Pro Ala Met
Ala 20

SEQ ID NO: 235; Length: 66; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 235:
atgaaatacc tgctgccgac cgctgctgct ggtctgctgc tcctcgctgc ccagccggcg 60
atggcg 66

SEQ ID NO: 236; Length: 63; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 236:
atgaaacaaa gcactattgc actggcactc ttaccgttac tgtttacccc tgtgacaaaa 60
gcg 63

SEQ ID NO: 237; Length: 26; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 237:
Met Lys Thr His Ile Val Ser Ser Val Thr Thr Thr Leu Leu Leu Gly
1               5                   10                  15
Ser Ile Leu Met Asn Pro Val Ala Asn Ala
            20                  25

SEQ ID NO: 238; Length: 78; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 238:
atgaaaacac atatagtcag ctcagtaaca acaacactat tgctaggttc catattaatg 60
aatcctgtcg ctaatgcc 78

SEQ ID NO: 239; Length: 21; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 239:
Met Lys Tyr Leu Leu Pro Trp Leu Ala Leu Ala Gly Leu Val Leu Ala
1               5                   10                  15
Phe Ser Ala Ser Ala
            20

SEQ ID NO: 240; Length: 63; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 240:
atgaaatacc tgctgccgtg gctggcgctg gctggtttag ttttagcgtt tagcgcatcg 60
gcg 63

SEQ ID NO: 241; Length: 20; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 241:
Met Lys Lys Ile Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala Ala Phe
1               5                   10                  15
Ser Ala Ser Ala
            20

SEQ ID NO: 242; Length: 60; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 242:
atgaaaaaga ttaccgctgc tgctggtctg ctgctcctcg ctgcgtttag cgcatcggcg 60

SEQ ID NO: 243; Length: 19; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 243:
Met Lys Lys Ile Trp Leu Ala Leu Ala Gly Leu Val Leu Ala Gln Pro
1               5                   10                  15
Ala Met Ala

SEQ ID NO: 244; Length: 57; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 244:
atgaaaaaga tttggctggc gctggctggt ttagttttag cccagccggc gatggcg 57

SEQ ID NO: 245; Length: 19; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 245:
Met Lys Lys Ile Leu Val Leu Gly Ala Leu Ala Leu Trp Ala Gln Pro
1               5                   10                  15
Ala Met Ala

SEQ ID NO: 246; Length: 57; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 246:
atgaaaaaga ttttagtttt aggtgctctg gcgctgtggg cccagccggc gatggcg 57

SEQ ID NO: 247; Length: 19; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 247:
Met Lys Lys Ile Trp Leu Ala Leu Val Leu Leu Ala Gly Ala Gln Pro
1               5                   10                  15
Ala Met Ala

SEQ ID NO: 248; Length: 57; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 248:
atgaaaaaga tttggctggc gttagtttta ctggctggtg cccagccggc gatggcg 57

SEQ ID NO: 249; Length: 19; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 249:
Met Lys Lys Ile Leu Ala Gly Trp Leu Ala Leu Val Leu Ala Gln Pro
1               5                   10                  15
Ala Met Ala

SEQ ID NO: 250; Length: 57; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 250:
atgaaaaaga ttctggctgg ttggctggcg ttagttttag cccagccggc gatggcg 57

SEQ ID NO: 251; Length: 19; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 251:
Met Lys Lys Ile Leu Val Leu Leu Ala Gly Trp Leu Ala Ala Gln Pro
1               5                   10                  15
Ala Met Ala

SEQ ID NO: 252; Length: 57; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 252:
atgaaaaaga ttttagtttt actggctggt tggctggcgg cccagccggc gatggcg 57

SEQ ID NO: 253; Length: 20; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 253:
Met Lys Lys Ile Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala Ala Gln
1               5                   10                  15
Pro Ala Met Ala
            20

SEQ ID NO: 254; Length: 60; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 254:
atgaaaaaga ttaccgctgc tgctggtctg ctgctcctcg ctgcccagcc ggcgatggcg 60

SEQ ID NO: 255; Length: 20; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 255:
Met Lys Lys Ile Leu Leu Leu Leu Gly Thr Ala Ala Ala Ala Ala Gln
1               5                   10                  15
Pro Ala Met Ala
            20

SEQ ID NO: 256; Length: 60; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 256:
atgaaaaaga ttctgctgct cctcggtacc gctgctgctg ctgcccagcc ggcgatggcg 60

SEQ ID NO: 257; Length: 20; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 257:
Met Lys Lys Ile Leu Leu Leu Leu Leu Leu Leu Leu Leu Leu Ala Gln
1               5                   10                  15
Pro Ala Met Ala
            20

SEQ ID NO: 258; Length: 60; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 258:
atgaaaaaga ttctgctgct cctcctgctg ctcctcctgc tcgcccagcc ggcgatggcg 60

SEQ ID NO: 259; Length: 20; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 259:
Met Lys Lys Ile Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Gln
1               5                   10                  15
Pro Ala Met Ala
            20

SEQ ID NO: 260; Length: 60; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 260:
atgaaaaaga ttgctgctgc tgctgcggcg gcggcggctg cggcccagcc ggcgatggcg 60

SEQ ID NO: 261; Length: 21; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 261:
Met Lys Tyr Leu Leu Pro Trp Leu Ala Leu Ala Gly Leu Val Leu Ala
1               5                   10                  15
Gln Pro Ala Met Ala
            20

SEQ ID NO: 262; Length: 63; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 262:
atgaaatacc tgctgccgtg gctggcgctg gctggtttag ttttagccca gccggcgatg 60
gcg 63

SEQ ID NO: 263; Length: 22; Type: PRT; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic peptide
Sequence 263:
Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1               5                   10                  15
Ala Phe Ser Ala Ser Ala
            20

SEQ ID NO: 264; Length: 66; Type: DNA; Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic oligonucleotide
Sequence 264:
atgaaatacc tgctgccgac cgctgctgct ggtctgctgc tcctcgctgc gtttagcgca 60
tcggcg 66
* * * * *