U.S. patent application number 16/746428 was filed with the patent office on 2020-12-10 for site-specific incorporation of phosphoserine into proteins in Escherichia coli.
The applicant listed for this patent is Yale University. Invention is credited to Hee-Sung Park, Dieter Soll.
Application Number | 20200385742 16/746428 |
Document ID | / |
Family ID | 1000005034186 |
Filed Date | 2020-12-10 |
United States Patent Application | 20200385742 |
Kind Code | A1 |
Park; Hee-Sung; et al. | December 10, 2020 |
SITE-SPECIFIC INCORPORATION OF PHOSPHOSERINE INTO PROTEINS IN
ESCHERICHIA COLI
Abstract
Nucleic acids encoding mutant elongation factor proteins
(EF-Sep), phosphoseryl-tRNA synthetase (SepRS), and
phosphoseryl-tRNA (tRNA.sup.Sep) and methods of use in site
specific incorporation of phosphoserine into a protein or
polypeptide are described. Typically, SepRS preferentially
aminoacylates tRNA.sup.Sep with O-phosphoserine and the tRNA.sup.Sep
recognizes at least one codon such as a stop codon. Due to the
negative charge of the phosphoserine, Sep-tRNA.sup.Sep does not
bind elongation factor Tu (EF-Tu). However, mutant EF-Sep proteins
are disclosed that bind Sep-tRNA.sup.Sep and protect
Sep-tRNA.sup.Sep from deacylation. In a preferred embodiment the
nucleic acids are on vectors and are expressed in cells such as
bacterial cells, archaebacterial cells, and eukaryotic cells.
Proteins or polypeptides containing phosphoserine produced by the
methods described herein can be used for a variety of applications
such as research, antibody production, protein array manufacture
and development of cell-based screens for new drug discovery.
Inventors: | Park; Hee-Sung (Daejeon, KR); Soll; Dieter (Guilford, CT) |
Applicant: |
Name | City | State | Country | Type |
Yale University | New Haven | CT | US | |
Family ID: | 1000005034186 |
Appl. No.: | 16/746428 |
Filed: | January 17, 2020 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
15439449 | Feb 22, 2017 | 10538773 |
16746428 | | |
14992542 | Jan 11, 2016 | 9580716 |
15439449 | | |
14795434 | Jul 9, 2015 | 9567594 |
14992542 | | |
13877628 | Apr 3, 2013 | 9090928 |
PCT/US2011/055414 | Oct 7, 2011 | |
14795434 | | |
61390853 | Oct 7, 2010 | |
61470332 | Mar 31, 2011 | |
Current U.S. Class: | 1/1 |
Current CPC Class: | C12P 21/02 20130101; C12Y 601/01027 20130101; C07K 14/435 20130101; C12N 15/67 20130101; C12N 9/93 20130101; C12P 21/00 20130101; C07K 14/245 20130101 |
International Class: | C12N 15/67 20060101 C12N015/67; C12P 21/02 20060101 C12P021/02; C12P 21/00 20060101 C12P021/00; C12N 9/00 20060101 C12N009/00; C07K 14/245 20060101 C07K014/245; C07K 14/435 20060101 C07K014/435 |
Government Interests
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
[0002] This invention was made with Government Support under
Agreement R01 GM022854 awarded by the National Institutes of Health
and Agreement 0654283 awarded by the National Science Foundation.
The Government has certain rights in the invention.
Claims
1. A method of making a target protein, comprising expressing a
messenger RNA (mRNA) encoding the target protein in a system
comprising: an O-phosphoseryl-tRNA synthetase (SepRS) that
preferentially aminoacylates a tRNA (tRNA.sup.Sep) with
phosphoserine; a tRNA.sup.Sep that can be aminoacylated with
phosphoserine by SepRS to form a Sep-tRNA.sup.Sep and recognize at
least one codon in the mRNA encoding the target protein; and a
mutant elongation factor (EF-Sep) that binds Sep-tRNA.sup.Sep;
wherein the SepRS preferentially aminoacylates the tRNA.sup.Sep
with phosphoserine and the resulting Sep-tRNA.sup.Sep recognizes at
least one codon such that phosphoserine is incorporated during
translation to form the target protein.
2. The method of claim 1, wherein the EF-Sep comprises an amino
acid sequence at least 90% identical to any one of SEQ ID NOS:1-4,
wherein (i) the amino acid sequence comprises the amino acids of
any one of SEQ ID NOS:1-4 at positions corresponding to amino acid
number 67, 216, 217, 219, 229, and 274 of any one of SEQ ID
NOS:1-4; or (ii) the amino acid sequence comprises the amino acids
of any one of SEQ ID NOS:1-4 at positions corresponding to amino
acid number 67, 217, 219, 229, and 274 of any one of SEQ ID NOS:1-4
and a substitution mutation corresponding to amino acid number 216
of any one of SEQ ID NOS:1-4.
3. The method of claim 1, wherein the EF-Sep comprises an amino
acid binding pocket for aminoacylated tRNA, wherein the binding
pocket comprises (i) the binding pocket for aminoacylated tRNA of
any one of SEQ ID NOS: 1-4, or any one of SEQ ID NO:1-4; or (ii)
the binding pocket for aminoacylated tRNA of SEQ ID NO:3 with a
substitution at amino acid residue 216, optionally wherein the
substitution is an asparagine-to-valine substitution (N216V).
4. The method of claim 1, wherein the tRNA.sup.Sep is
cysteinyl-tRNA from Methanocaldococcus jannaschii.
5. The method of claim 4, wherein the tRNA.sup.Sep is encoded by
the nucleic acid sequence SEQ ID NO:41.
6. The method of claim 1, wherein the SepRS is the
phosphoseryl-tRNA synthetase from Methanococcus maripaludis or
Methanocaldococcus jannaschii.
7. The method of claim 6, wherein the SepRS comprises an amino
acid sequence at least 85% identical to SEQ ID NO:43 or 46.
8. (canceled)
9. The method of claim 1, wherein the nucleic acid encoding a gene
with tRNA.sup.Sep activity and the nucleic acid encoding a gene
with SepRS activity are on one or more vectors.
10. The method of claim 9, wherein the vector is an expression
vector selected from the group consisting of a plasmid, a virus, a
naked polynucleotide, and a conjugated polynucleotide.
11. The method of claim 9, wherein the vector is expressed in cells
selected from the group consisting of bacterial cells,
archaebacterial cells, and eukaryotic cells.
12. The method of claim 9, wherein the vector is expressed in an in
vitro transcription/translation system.
13. The method of claim 12, wherein the vector is transcribed and
translated prior to or along with nucleic acids encoding one or
more proteins or polypeptides.
14. The method of claim 1, wherein the nucleic acids are expressed
in an organism.
15. The method of claim 1, wherein the nucleic acids are under
control of a promoter selected from the group consisting of
constitutive, inducible and tissue-specific.
16. A kit for producing a target protein containing phosphoserine,
comprising a polynucleotide encoding EF-Sep, a polynucleotide
encoding tRNA.sup.Sep, and a polynucleotide encoding SepRS.
17. The kit of claim 16, wherein the kit further comprises
phosphoserine.
18. The kit of claim 16, further comprising a host system for
expressing a polynucleotide encoding the protein, the
polynucleotide encoding EF-Sep, the polynucleotide encoding
tRNA.sup.Sep, and the polynucleotide encoding SepRS.
19. A plurality of a target protein produced according to the
method of claim 1.
20. The plurality of the target protein of claim 19 in a lysate of
host cells used to produce the target protein.
21. A method comprising screening candidate drugs for activity
against a protein, wherein the protein is produced according to a
process comprising the method of claim 1.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims benefit of U.S. Provisional
Application No. 61/390,853, filed on Oct. 7, 2010, and U.S.
Provisional Application No. 61/470,332, filed on Mar. 31, 2011.
FIELD OF THE INVENTION
[0003] The field of the present invention generally relates to
methods for the site specific phosphorylation of proteins in vitro
and in vivo.
BACKGROUND OF THE INVENTION
[0004] Signal transduction is any process by which a cell converts
one kind of signal or stimulus into another. Processes referred to
as signal transduction often involve a sequence of biochemical
reactions inside the cell, which are carried out by enzymes and
linked through second messengers. Signal transduction is often
accomplished by the activation of enzymes that can act upon other
enzymes and change their catalytic activity. This may lead to
increases or decreases in the activity of certain metabolic
pathways, or may lead to even large intracellular changes, for
example, the initiation of specific patterns of gene expression
and/or changes in cell proliferation.
[0005] The most common covalent modification used in signal
transduction processes is phosphorylation, which results in the
alteration of the activity of those enzymes which become
phosphorylated. Phosphorylation is the addition of a phosphate
(PO.sub.4) group to a protein or a small molecule. Any of several
amino acids in a protein may be phosphorylated. Phosphorylation on
serine is the most common, followed by threonine. Tyrosine
phosphorylation is relatively rare. However, since tyrosine
phosphorylated proteins are relatively easy to purify using
antibodies, tyrosine phosphorylation sites are relatively well
understood. Histidine and aspartate phosphorylation occurs in
prokaryotes as part of two-component signaling. Other types of
phosphorylation include oxidative phosphorylation. Adenosine
triphosphate (ATP), the "high-energy" exchange medium in the cell,
is synthesized in the mitochondrion by addition of a third
phosphate group to Adenosine diphosphate (ADP) in a process
referred to as oxidative phosphorylation. ATP is also synthesized
by substrate level phosphorylation during glycolysis. ATP is
synthesized at the expense of solar energy by photophosphorylation
in the chloroplasts of plant cells.
[0006] In eukaryotes, protein phosphorylation is probably the most
important regulatory event. Many enzymes and receptors are switched
"on" or "off" by phosphorylation and dephosphorylation.
Phosphorylation is catalyzed by enzymes known as ATP-dependent
phosphotransferases which are often simply referred to as
"kinases." These include, among others, protein kinases, lipid
kinases, inositol kinases, non-classical protein kinases, histidine
kinases, aspartyl kinases, nucleoside kinases, and polynucleotide
kinases.
[0007] Phosphorylation regulates protein function, for example, by
affecting conformation. This in turn regulates such processes as
enzyme activity, protein-protein interactions, subcellular
distribution, and stability and degradation. The stoichiometry of
phosphorylation of a given site is controlled by the relative
activities of a cell's repertoire of protein kinases and
phosphatases. Thus phosphorylation can often generate extremely
rapid and reversible changes in the activity of target proteins.
The ability to assay the state of phosphorylation of specific
proteins is of great utility in the quest to establish the function
of a given protein. Such assays are also critical for the
identification of drugs that can influence the phosphorylation, and
hence the function, of specific proteins.
[0008] In general, phosphoproteins are highly unstable and
difficult to produce, both in terms of specific phosphorylation of
biologically relevant amino acids and subsequent purification of
protein. A means to specify and drive a targeted phosphorylation
event with a high degree of certainty and efficiency is needed.
This is particularly important for recombinant proteins expressed
in bacterial or fungal expression systems which do not
phosphorylate proteins in the same way as mammalian cells.
[0009] Therefore, it is an object of the present invention to
provide a method for the site specific phosphorylation of
proteins.
[0010] It is further an object of the present invention to provide
a method for the site specific phosphorylation of proteins in
vivo.
[0011] In particular, it is an object of the present invention to
provide a method for the site specific incorporation of
phosphoserine into a protein.
SUMMARY OF THE INVENTION
[0012] Mutant elongation factor proteins (EF-Sep) are described for
use with phosphoseryl-tRNA synthetase (SepRS) and phosphoseryl-tRNA
(tRNA.sup.Sep) in site specific incorporation of phosphoserine into
a protein or polypeptide. Typically, SepRS preferentially
aminoacylates tRNA.sup.Sep with O-phosphoserine and the
tRNA.sup.Sep recognizes at least one codon such as a stop codon.
Due to the negative charge of the phosphoserine, Sep-tRNA.sup.Sep
does not bind elongation factor Tu (EF-Tu). However, the disclosed
EF-Sep proteins can bind Sep-tRNA.sup.Sep and protect
Sep-tRNA.sup.Sep from deacylation and catalyze the covalent
transfer of the phosphoserine amino acid onto the polypeptide.
[0013] In some embodiments, EF-Sep is a mutant form of bacterial
EF-Tu having a mutation at one or more of amino acid residues
corresponding to His67, Asp216, Glu217, Phe219, Thr229, and Asn274
in E. coli EF-Tu, which are located in the amino acid binding
pocket for aminoacylated tRNA. In some embodiments, EF-Sep is a
mutant form of eukaryotic elongation factor 1A (eEF1A) with
mutations in positions equivalent to those of the bacterial counterpart. In
preferred embodiments, the EF-Sep has the amino acid sequence SEQ
ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, or a conservative
variant thereof. Nucleic acids encoding EF-Sep are also disclosed.
For example, in some embodiments, the nucleic acid sequence
encoding EF-Sep has the nucleic acid sequence SEQ ID NO:5, SEQ ID
NO:6, SEQ ID NO:7, SEQ ID NO:8, or a conservative variant
thereof.
[0014] In a preferred embodiment, "tRNA.sup.Sep" and "SepRS" refer
to the cysteinyl-tRNA from Methanocaldococcus jannaschii and the
phosphoseryl-tRNA synthetase from Methanococcus maripaludis,
respectively and variants thereof having conservative
substitutions, additions, and/or deletions therein not affecting
the structure or function. Typically, SepRS preferentially
aminoacylates tRNA.sup.Sep with O-phosphoserine and the
tRNA.sup.Sep recognizes at least one codon. In a preferred
embodiment, the tRNA.sup.Sep recognizes a stop codon or an
unconventional or non-native codon.
[0015] Methods for producing target proteins that contain at least
one phosphoserine are described. The method results in proteins
that have a phosphoserine incorporated into a protein in a manner
indistinguishable from the phosphorylation of a serine by a
kinase. Nucleic acids encoding genes with SepRS and tRNA.sup.Sep
activity are provided, preferably on vectors, such as cloning
vectors and expression vectors. These vectors can be in the form of
a plasmid, a bacterium, a virus, a naked polynucleotide, or a
conjugated polynucleotide. In one embodiment, the vectors are
expressed in cells such as bacterial cells (e.g., Escherichia
coli), archaebacterial cells, and eukaryotic cells (e.g., yeast
cells, mammalian cells, plant cells, insect cells, fungal cells).
The cells preferably lack a protein with Sep-tRNA:Cys-tRNA synthase
(SepCysS) activity that converts tRNA-bound phosphoserine to
cysteine. In an alternative embodiment, the vectors are expressed
in an in vitro transcription/translation system. In this embodiment
the vectors are transcribed and translated prior to or along with
nucleic acids encoding one or more proteins or polypeptides.
[0016] In some embodiments, the target protein containing
phosphoserine is produced and modified in a cell-dependent manner.
This provides for the production of proteins that are stably
folded, glycosylated, or otherwise modified by the cell.
[0017] Kits for producing polypeptides and/or proteins containing
phosphoserine are also provided.
[0018] The proteins or polypeptides containing phosphoserine and
antibodies to such polypeptides or proteins have a variety of uses
including the study of kinases, phosphatases, and target proteins
in signal transduction pathways, antibody production, protein array
manufacture and development of cell-based screens for new drug
discovery and the development of therapeutic agents, agricultural
products, or peptide-based libraries such as phage display
libraries.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] FIG. 1A is a diagram showing the indirect pathway for the
synthesis of Cys-tRNA.sup.Cys in methanogenic archaea. FIG. 1B is a
diagram showing the secondary structure of Methanocaldococcus
jannaschii tRNA.sup.Cys (SEQ ID NO:41) shown in clover leaf form.
Mutations introduced to form tRNA.sup.Sep are indicated with
arrows. FIG. 1C is a graph showing percent phosphoserine (Sep)
acceptance in M. jannaschii tRNA as a function of time (min) for
unfractionated total tRNA from E. coli (triangle) or E. coli
strains expressing tRNA.sup.Cys (closed circle) or tRNA.sup.Sep
(open circle).
[0020] FIG. 2 is a graph showing chloramphenicol resistance (IC50,
.mu.g/ml) for E. coli containing 1) a chloramphenicol
acetyltransferase (CAT) gene with an amber stop codon (UAG) at a
permissive site and 2) combinations of tRNA.sup.Sep, [SepRS or
CysRS (Mmp)], SepCysS, and [EF-Sep or EF-Tu (wt)]. The
suppressor tRNA.sup.Sep was coexpressed with the indicated enzymes
in E. coli Top 10.DELTA.serB. Selection was carried out on LB agar
plates containing 2 mM Sep and various concentrations of
chloramphenicol.
[0021] FIG. 3 is a graph showing deacylation of
[.sup.14C]Sep-tRNA.sup.Cys (percent Sep-tRNA.sup.Cys remaining) as
a function of time following incubation in the presence and absence
of bovine serum albumin control (open circle), wild type EF-Tu
(closed circle), or EF-Sep (square).
[0022] FIG. 4 is a graph showing kinase activity (phosphate
incorporation into MyBP (pmol/min)) as a function of MEK1
concentration (.mu.g/assay) for wild type (triangle) and mutant
(closed and open circles) MEK1. Human MEK1 was produced as a
maltose-binding protein (MBP) fusion-protein in E. coli. Residues
Ser218 and Ser222, which are targets of phosphorylation by MEK1
activators were either mutated to Glu218/Glu222 (closed circle) or
to Sep218/Glu222 (open circle) to produce active MEK1 variants.
Various amounts of MBP-MEK1 were used to phosphorylate inactive
ERK2 in vitro. ERK2 activity was then measured in a radiometric
assay using [.sup.32P]-.gamma.ATP and myelin basic protein as
substrates.
[0023] FIGS. 5A and 5B are graphs showing that EF-Tu protects
Cys-tRNA.sup.Cys (FIG. 5A) but not Sep-tRNA.sup.Cys (FIG. 5B) from
deacylation. Hydrolysis of M. jannaschii [.sup.35S]Cys-tRNA.sup.Cys
or [.sup.14C]Sep-tRNA.sup.Cys was determined at pH 8.2 and room
temperature in the presence or absence of EF-Tu.
DETAILED DESCRIPTION OF THE INVENTION
I. Definitions
[0024] The term "transfer RNA (tRNA)" refers to a set of
genetically encoded RNAs that act during protein synthesis as
adaptor molecules, matching individual amino acids to their
corresponding codon on a messenger RNA (mRNA). In higher eukaryotes
such as mammals, there is at least one tRNA for each of the 20
naturally occurring amino acids. The 3' end of a tRNA is
aminoacylated by a tRNA synthetase so that an amino acid is
attached to the 3' end of the tRNA. This amino acid is delivered to
a growing polypeptide chain as the anticodon sequence of the tRNA
reads a codon triplet in an mRNA.
[0025] The term "aminoacyl tRNA synthetase (AARS)" refers to an
enzyme that catalyzes the esterification of a specific amino acid
or its precursor to one of all its compatible cognate tRNAs to form
an aminoacyl-tRNA. These charged aminoacyl tRNAs then participate
in mRNA translation and protein synthesis. The AARS show high
specificity for charging a specific tRNA with the appropriate amino
acid. In general, there is at least one AARS for each of the twenty
amino acids.
[0026] The term "tRNA.sup.Sep" refers to a tRNA that can be
aminoacylated with O-phosphoserine (Sep) and recognize at least one
codon such that the phosphoserine is incorporated into a protein or
polypeptide. In some embodiments, the tRNA.sup.Sep is a
tRNA.sup.Cys from Methanocaldococcus jannaschii containing a C20U
mutation that improves aminoacylation by SepRS without affecting
CysRS recognition. In some embodiments, the tRNA.sup.Sep contains
an anticodon that binds a stop codon.
[0027] The term "Sep-tRNA.sup.Sep" refers to a tRNA.sup.Sep that
has been aminoacylated with O-phosphoserine (Sep).
[0028] The term "O-phosphoseryl-tRNA synthetase (SepRS)" refers to
an O-phosphoseryl-tRNA synthetase that preferentially aminoacylates
tRNA.sup.Sep with O-phosphoserine (Sep) to form Sep-tRNA.
[0029] The term "EF-Sep" refers to a mutant elongation factor
protein that binds Sep-tRNA.sup.Sep and catalyzes the covalent
transfer of the phosphoserine amino acid onto the polypeptide. Due
to the negative charge of the phosphoserine, Sep-tRNA.sup.Sep does
not bind elongation factor Tu (EF-Tu). EF-Sep proteins can bind
Sep-tRNA.sup.Sep, protect Sep-tRNA.sup.Sep from deacylation, and
catalyze the covalent transfer of the phosphoserine amino acid onto
the polypeptide.
[0030] As used herein "suppressor tRNA" refers to a tRNA that
alters the reading of a messenger RNA (mRNA) in a given translation
system. For example, a suppressor tRNA can read through a stop
codon.
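The read-through behavior of a suppressor tRNA can be illustrated with a minimal sketch. The toy mRNA, the reduced codon table, and the `suppressed` parameter below are hypothetical and chosen only for illustration; they are not taken from the disclosure. The sketch assumes that translation stops at any stop codon unless a suppressor tRNA is available for that codon.

```python
# Reduced codon table: only the sense codons used in the toy mRNA below.
CODON_TABLE = {"AUG": "Met", "GGU": "Gly", "AAA": "Lys", "UCU": "Ser"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna, suppressed=None):
    """Translate an mRNA string codon by codon.

    suppressed: optional dict mapping a stop codon to the residue that a
    suppressor tRNA inserts there (hypothetical illustration parameter).
    """
    suppressed = suppressed or {}
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in suppressed:
            residues.append(suppressed[codon])  # suppressor tRNA reads through
        elif codon in STOP_CODONS:
            break  # ordinary termination
        else:
            residues.append(CODON_TABLE[codon])
    return residues


toy_mrna = "AUGGGUUAGAAAUAA"  # AUG GGU UAG AAA UAA
print(translate(toy_mrna))                  # no suppressor: stops at UAG
print(translate(toy_mrna, {"UAG": "Sep"}))  # amber suppressor inserts Sep
```

In this sketch only the UAG codon is reassigned, so the downstream UAA codon still terminates translation.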
[0031] The term "anticodon" refers to a unit made up of three
nucleotides that correspond to the three bases of a codon on the
mRNA. Each tRNA contains a specific anticodon triplet sequence that
can base-pair to one or more codons for an amino acid or "stop
codon." Known stop codons include but are not limited to, the three
codon bases UAA (known as ochre), UAG (known as amber), and UGA
(known as opal), which do not code for an amino acid but act as
signals for the termination of protein synthesis.
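The relationship between an anticodon and the codon it reads can be sketched as a reverse complement. The function names below are hypothetical, and the sketch assumes strict Watson-Crick pairing only; wobble pairing and modified bases, which allow one anticodon to read more than one codon, are deliberately ignored.

```python
# RNA Watson-Crick complements.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
STOP_CODON_NAMES = {"UAA": "ochre", "UAG": "amber", "UGA": "opal"}


def anticodon_for(codon):
    """Return the strictly complementary anticodon, written 5'->3',
    for a codon written 5'->3' (codon and anticodon pair antiparallel)."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def pairs_with(anticodon, codon):
    """True if the anticodon is the exact Watson-Crick partner of the codon."""
    return anticodon_for(codon) == anticodon


for codon, name in STOP_CODON_NAMES.items():
    print(codon, name, anticodon_for(codon))
```

Under these assumptions, a tRNA intended to read the amber codon UAG would carry the anticodon CUA.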
[0032] The term "protein," "polypeptide," or "peptide" refers to a
natural or synthetic molecule comprising two or more amino acids
linked by the carboxyl group of one amino acid to the alpha amino
group of another.
[0033] The term "residue" as used herein refers to an amino acid
that is incorporated into a protein. The amino acid may be a
naturally occurring amino acid and, unless otherwise limited, may
encompass known analogs of natural amino acids that can function in
a similar manner as naturally occurring amino acids.
[0034] The term "polynucleotide" or "nucleic acid sequence" refers
to a natural or synthetic molecule comprising two or more
nucleotides linked by a phosphate group at the 3' position of one
nucleotide to the 5' end of another nucleotide. The polynucleotide
is not limited by length, and thus the polynucleotide can include
deoxyribonucleic acid (DNA) or ribonucleic acid (RNA).
[0035] The term "gene" refers to a polynucleotide that encodes a
protein or functional RNA molecule.
[0036] The term "vector" or "construct" refers to a polynucleotide
capable of transporting into a cell another polynucleotide to which
the vector sequence has been linked. The term "expression vector"
includes any vector, (e.g., a plasmid, cosmid or phage chromosome)
containing a gene construct in a form suitable for expression by a
cell (e.g., linked to a transcriptional control element). "Plasmid"
and "vector" are used interchangeably, as a plasmid is a commonly
used form of vector.
[0037] The term "operatively linked to" refers to the functional
relationship of a nucleic acid with another nucleic acid sequence.
Promoters, enhancers, transcriptional and translational stop sites,
and other signal sequences are examples of nucleic acid sequences
operatively linked to other sequences. For example, operative
linkage of a gene to a transcriptional control element refers to the
physical and functional relationship between the gene and promoter
such that the transcription of the gene is initiated from the
promoter by an RNA polymerase that specifically recognizes, binds
to and transcribes the DNA.
[0038] The terms "transformation" and "transfection" refer to the
introduction of a polynucleotide, e.g., an expression vector, into
a recipient cell including introduction of a polynucleotide to the
chromosomal DNA of the cell.
[0039] The term "variant" refers to an amino acid or nucleic acid
sequence having conservative substitutions, non-conservative
substitutions (i.e. a degenerate variant), substitutions within the
wobble position of a codon encoding an amino acid, amino acids
added to the C-terminus of a peptide, or a peptide having 60%, 70%,
80%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to an amino
acid sequence.
[0040] The term "conservative variant" refers to a particular
nucleic acid sequence that encodes identical or essentially
identical amino acid sequences. Conservative substitution tables
providing functionally similar amino acids are well known in the
art. The following sets forth exemplary groups which contain
natural amino acids that are "conservative substitutions" for one
another.
Conservative Substitution Groups
1 Alanine (A) Serine (S) Threonine (T);
2 Aspartic acid (D) Glutamic acid (E);
3 Asparagine (N) Glutamine (Q);
4 Arginine (R) Lysine (K);
5 Isoleucine (I) Leucine (L) Methionine (M) Valine (V); and
6 Phenylalanine (F) Tyrosine (Y) Tryptophan (W).
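The six exemplary groups listed above can be encoded directly as sets of one-letter codes. The following is a minimal sketch; the function name is hypothetical, and treating any substitution outside these six groups as non-conservative is an assumption made for illustration, since the groups are described only as exemplary.

```python
# The six exemplary conservative substitution groups, as one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1 Alanine, Serine, Threonine
    set("DE"),    # 2 Aspartic acid, Glutamic acid
    set("NQ"),    # 3 Asparagine, Glutamine
    set("RK"),    # 4 Arginine, Lysine
    set("ILMV"),  # 5 Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6 Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative_substitution(original, replacement):
    """True if the two residues differ and fall in the same group."""
    if original == replacement:
        return False  # no substitution took place
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative_substitution("D", "E"))  # same group (2)
print(is_conservative_substitution("N", "V"))  # groups 3 and 5 differ
```

By this test an aspartic acid to glutamic acid change is conservative, whereas an asparagine to valine change is not.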
[0041] The term "percent (%) sequence identity" or "homology"
refers to the percentage of nucleotides or amino acids in a
candidate sequence that are identical with the nucleotides or amino
acids in a reference nucleic acid sequence, after aligning the
sequences and introducing gaps, if necessary, to achieve the
maximum percent sequence identity. Alignment for purposes of
determining percent sequence identity can be achieved in various
ways that are within the skill in the art, for instance, using
publicly available computer software such as BLAST, BLAST-2, ALIGN,
ALIGN-2 or Megalign (DNASTAR) software. Appropriate parameters for
measuring alignment, including any algorithms needed to achieve
maximal alignment over the full-length of the sequences being
compared can be determined by known methods.
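Once two sequences have been aligned, the percent identity calculation itself is simple arithmetic. The sketch below assumes the alignment has already been produced by one of the programs named above and is supplied as two equal-length strings with "-" marking gaps. The function name and the choice of denominator (the number of residues in the candidate sequence) are assumptions for illustration; alignment programs differ in how they normalize.

```python
def percent_identity(aligned_candidate, aligned_reference):
    """Percent of candidate residues identical to the aligned reference.

    Both inputs must be pre-aligned, equal-length strings in which "-"
    denotes a gap. Gap positions never count as matches.
    """
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    candidate_residues = sum(1 for c in aligned_candidate if c != "-")
    if candidate_residues == 0:
        raise ValueError("candidate sequence contains no residues")
    matches = sum(
        1
        for c, r in zip(aligned_candidate, aligned_reference)
        if c == r and c != "-"
    )
    return 100.0 * matches / candidate_residues


# Toy alignment (hypothetical sequences): one mismatch in eight positions.
print(percent_identity("MKLKAHRD", "MKLKLHRD"))
```

A candidate meeting a threshold such as "at least 90% identical" would then be one for which this value is 90.0 or greater under the chosen alignment and normalization.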
[0042] The term "translation system" refers to the components
necessary to incorporate an amino acid into a growing polypeptide
chain (protein). Components of a translation system generally
include amino acids, ribosomes, tRNAs, synthetases, and mRNA. The
components described herein can be added to a translation system,
in vivo or in vitro, to incorporate phosphoserine into a
protein.
[0043] The term "transgenic organism" refers to any organism, in
which one or more of the cells of the organism contains
heterologous nucleic acid introduced by way of human intervention,
such as by transgenic techniques well known in the art. The nucleic
acid is introduced into the cell, directly or indirectly by
introduction into a precursor of the cell, by way of deliberate
genetic manipulation, such as by microinjection or by infection
with a recombinant virus. Suitable transgenic organisms include,
but are not limited to, bacteria, cyanobacteria, fungi, plants and
animals. The nucleic acids described herein can be introduced into
the host by methods known in the art, for example infection,
transfection, transformation or transconjugation.
[0044] The term "eukaryote" or "eukaryotic" refers to organisms or
cells or tissues derived from these organisms belonging to the
phylogenetic domain Eukarya such as animals (e.g., mammals,
insects, reptiles, and birds), ciliates, plants (e.g., monocots,
dicots, and algae), fungi, yeasts, flagellates, microsporidia, and
protists.
[0045] The term "prokaryote" or "prokaryotic" refers to organisms
including, but not limited to, organisms of the Eubacteria
phylogenetic domain, such as Escherichia coli, Thermus
thermophilus, and Bacillus stearothermophilus, or organisms of the
Archaea phylogenetic domain such as, Methanococcus jannaschii,
Methanobacterium thermoautotrophicum, Halobacterium such as
Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus
fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, and
Aeropyrum pernix.
II. Compositions
A. Aminoacyl-tRNA Synthetases
[0046] A tRNA that can be aminoacylated with O-phosphoserine
("tRNA.sup.Sep") is disclosed for use in incorporating
phosphoserine into a protein. The tRNA.sup.Sep recognizes at least
one codon in the mRNA for the protein such that a phosphoserine can
be incorporated into the protein. For example, the tRNA.sup.Sep can
contain an anticodon that binds a stop codon or an unconventional
or non-native codon. In some embodiments, the tRNA.sup.Sep is a
tRNA.sup.Cys from an archaeon, such as Methanocaldococcus jannaschii
or Methanococcus maripaludis. tRNA.sup.Cys is also found in
Methanopyrus kandleri, Methanococcoides burtonii, Methanospirillum
hungatei, Methanocorpusculum labreanum, Methanoregula boonei,
Methanococcus aeolicus, Methanococcus vannielii, Methanosarcina
mazei, Methanosarcina barkeri, Methanosarcina acetivorans,
Methanosaeta thermophila, Methanoculleus marisnigri,
Methanocaldococcus vulcanius, Methanocaldococcus fervens, and
Methanosphaerula palustris. In preferred embodiments, the
tRNA.sup.Sep contains a mutation (e.g., C20U mutation) that
improves aminoacylation by SepRS without affecting CysRS
recognition. In particularly preferred embodiments, the
tRNA.sup.Sep is encoded by the nucleic acid sequence SEQ ID NO:41,
or a conservative variant thereof.
[0047] tRNA.sup.Sep from Methanocaldococcus jannaschii (FIG.
1B):
TABLE-US-00001 (SEQ ID NO: 41)
GCCGGGGTAGTCTAGGGGTTAGGCAGCGGACTGCAGATCCGCCTTACGTG
GGTTCAAATCCCACCCCCGGCT
[0048] A phosphoseryl-tRNA synthetase (SepRS) that preferentially
aminoacylates tRNA.sup.Sep with phosphoserine is also disclosed for
use in incorporating phosphoserine into a protein. In some
embodiments, the SepRS is a phosphoseryl-tRNA synthetase from an
archaeon, such as Methanococcus maripaludis or Methanocaldococcus
jannaschii. SepRS is also found in Methanopyrus kandleri,
Methanococcoides burtonii, Methanospirillum hungatei,
Methanocorpusculum labreanum, Methanoregula boonei, Methanococcus
aeolicus, Methanococcus vannielii, Methanosarcina mazei,
Methanosarcina barkeri, Methanosarcina acetivorans, Methanosaeta
thermophila, Methanoculleus marisnigri, Methanocaldococcus
vulcanius, Methanocaldococcus fervens, and Methanosphaerula
palustris.
[0049] In particularly preferred embodiments, the SepRS has the
amino acid sequence SEQ ID NO:43 or 46, or a conservative variant
thereof. For example, the SepRS can be encoded by the nucleic acid
sequence SEQ ID NO:42 or 45, or a variant thereof.
TABLE-US-00002 SepRS from Methanocaldococcus jannaschii:
(SEQ ID NO: 42)
ATGAAATTAAAACATAAAAGGGATGATAAAATGAGATTTGATATAAAAAA
GGTTTTAGAGTTAGCAGAGAAGGATTTTGAGACGGCATGGAGAGAGACAA
GGGCATTAATAAAGGATAAACATATTGACAATAAATATCCAAGATTAAAG
CCTGTCTATGGAAAGCCACATCCAGTGATGGAGACGATAGAGAGATTAAG
ACAAGCTTATCTAAGAATGGGATTTGAAGAGATGATTAATCCAGTTATCG
TTGATGAGATGGAGATTTATAAGCAATTTGGACCAGAAGCAATGGCAGTT
TTAGATAGATGTTTTTACTTGGCTGGATTACCAAGGCCAGATGTTGGTTT
AGGAAATGAGAAGGTTGAGATTATAAAAAATTTGGGCATAGATATAGATG
AGGAGAAAAAAGAGAGGTTGAGAAGTTTTACATTTATACAAAAAAGGAGC
TATAGATGGGGATGATTTAGTCTTTGQAGATTGCCAAAGCTTTAAATGTG
AGTAATGAAATGGGATTGAAGGTTTTAGAAACTGCATTTCCTGAATTTAA
AGATTTGAAGCCAGAATCAACAACTCTAACTTTAAGAAGCCACATGACAT
CTGGGTGGTTTATAACTCTAAGCAGTTTAATAAAGAAGAGAAAACTGCTT
TAAAGTTATTCTCTATAGATAGATGTTTTAGAAGGGAGCAAAGAGAGGAT
AGAAGCCATTTAATGAGTTATCACTCTGCATCTTGTGTAGTTGTTGGTGA
AGATGTTAGTGTAGATGATGGAAAGGTAGTTGCTGAAGGATTGTTGGCTC
AATTTGGATTTACAAAATTTAAGTTTAAGCCAGATGAGAAAAAGAGTAAG
TATTATACACCAGAAACTCAAACAGAGGTTTATGCCTATCATCCAAAGTT
GGGAGAGTGGATTGAAGTAGCAACCTTTGGAGTTTATTCACCAATTGCAT
TAGCTAAATATAACATAGATGTGCCAGTTATGAACCTTGGCCTTAGGAGT
TGAGAGGTTGGCAATGATTATTTACGGCTATGAGGATGTTAGGGCAATGG
TTTATCCTCAATTTTATGAATACAGGTTGAGTGATAGAGATATAGCTGGG
ATGATAAGAGTTGATAAAGTTCCTATATTGGATGAATTCTACAACTTTGC
AAATGAGCTTATTGATATATGCATAGCAAATAAAGATAAGGAAAGCCCAT
GTTCAGTTGAAGTTAAAAGGGAATTCAATTTCAATGGGGAGAGAAGAGTA
ATTAAAGTAGAAATATTTGAGAATGAACCAAATAAAAAGCTTTTAGGTCC
TTCTGTGTTAAATGAGGTTTATGTCTATGATGGAAATATATATGGCATTC
CGCCAACGTTTGAAGGGGTTAAAGAACAGTATATCCCAATTTTAAAGAAA
GCTAAGGAAGAAGGAGTTTCTACAAACATTAGATACATAGATGGGATTAT
CTATAAATTAGTAGCTAAGATTGAAGAGGCTTTAGTTTCAAATGTGGATG
AATTTAAGTTCAGAGTCCCAATAGTTAGAAGTTTGAGTGACATAAACCTA
AAAATTGATGAATTGGCTTTAAAACAGATAATGGGGGAGAATAAGGTTAT
AGATGTTAGGGGACCAGTTTTCTTAAATGCAAAGGTTGAGATAAAATAG;
(SEQ ID NO: 43)
MKLKLHRDDKMRFDIKKVLELAEKDFETAWRETRALIKDKHIDNKYPRLK
PVYGKPHPVMETIERLRQAYLRMGFEEMINPVIVDEMEIYKQFGPEAMAV
LDRVFYLAGLPRDVGLGNEKVEIIKNLGIDIDEEKKERLREVLHLYKKGA
IDGDDLVFEIAKALNVSNEMGLKVLETAFPEFKDLKPESTTLTLRSHMTS
GWFTTLSSLIKKRKLPLKLFSIDRCFRREQREDRSHLMSYHSASCVVVGE
DVSVDDGKVVAEGLLAQFGFTKFKFKPDEKKSKYYTPETQTEVYAYHPKL
GEWIEVATFGVYSPIALAKYNIDVPVMNLGLGVERLAMIIYGYEDVRAMV
YPQFYEYRLSDRDIAGMIRVDKVPILDEFYNFANELIDICIANKDKESPC
SVEVKREFNFNGERRVIKVEIFENEPNKKLLGPSVLNEVYVYDGNIYGIP
PTFEGVKEQYIPILKKAKEEGVSTNIRYIDGIIYKLVAKIEEALVSNVDE
FKFRVPIVRSLSDINLKIDELALKQIMGENKVIDVRGPVFLNAKVEIK. SepRS from
Methanococcus maripaludis: (SEQ ID NO: 45)
ATGTTTAAAAGAGAAGAAATCATTGAAATGGCCAATAAGGACTTTGAAAA
AGCATGGATCGAAACTAAAGACCTTATAAAAGCTAAAAAGATAAACGAAA
GTTACCCAAGAATAAAACCAGTTTTGGAAAAACACACCCTGTAAATGACA
CTATTGAAAATTTAAGACAGGCATATCTTAGAATGGGTTTTGAAGAATAT
AAACCCAGTAATTGTCGATGAAAGAGATATTTATAAACAATTCGGCCCAG
AAGCTATGGCAGTTTTGGATAGATGCTTTTATTTAGCGGGACTTCCAAGA
CCTGACGTTGGTTTGAGCGATGAAAAAATTTCACAGATTGAAAAACTTGG
AATTAAAGTTTCTGAGCACAAAGAAAGTTTACAAAAAATACTTCACGGAT
ACAAAAAAGGAACTCTTGATGGTGACGATTTAGTTTTAGAAATTTCAAAT
GCACTTGAAATTTCAAGCGAGATGGGTTTAAAAATTTTAGAAGATGTTTT
CCCAGAATTTAAGGATTTAACCGCAGTTTCTTCAAAATTAACTTTAAGAA
GCCACATGACTTCAGGATGGTTCCTTACTGTTTCAGACCTCATGAACAAA
AAACCCTTGCCATTTAAACTCTTTTCAATCGATAGATGTTTTAGAAGAGA
ACAAAAAGAAGATAAAAGCCACTTAATGACATACCACTCTGCATCCTGTG
CAATTGCAGGTGAAGGCGTGGATATTAATGATGGAAAAGCAATTGCAGAA
GGATTATTATCCCAATTTGGCTTTACAAACTTTAAATTCATTCCTGATGA
AAAGAAAAGTAAATACTACACCCCTGAAACACAGACTGAAGTTTACGCAT
ACCACCCAAAATTAAAAGAATGGCTCGAAGTTGCTACATTTGGAGTATAT
TCGCCAGTTGCATTAAGCAAATACGGAATAGATGTACCTGTAATGAATTT
GGGTCTTGGTGTTGAAAGACTTGCAATGATTTCTGGAAATTTCGCAGATG
TTCGAGAAATGGTATATCCTCAGTTTTACGAACACAAACTTAATGACCGG
AATGTCGCTTCAATGGTAAAACTCGATAAAGTTCCAGTAATGGATGAAAT
TTACGATTTAACAAAAGAATTAATTGAGTCATGTGTTAAAAACAAAGATT
TAAAATCCCCTTGTGAATTAGCTATTGAAAAAACGTTTTCATTTGGAAAA
ACCAAGAAAAATGTAAAAATAAACATTTTTGAAAAAGAAGAAGGTAAAAA
TTTACTCGGACCTTCAATTTTAAACGAAATCTACGTTTACGATGGAAATG
TAATTGGAATTCCTGAAAGCTTTGACGGAGTAAAAGAAGAATTTAAAGAC
TTCTTAGAAAAAGGAAAATCAGAAGGGGTAGCAACAGGCATTCGATATAT
CGATGCGCTTTGCTTTAAAATTACTTCAAAATTAGAAGAAGCATTTGTGT
CAAACACTACTGAATTCAAAGTTAAAGTTCCAATTGTCAGAAGTTTAAGC
GACATTAACTTAAAAATCGATGATATCGCATTAAAACAGATCATGAGCAA
AAATAAAGTAATCGACGTTAGAGGCCCAGTCTTTTAAATGTCGAAGTAAA AATTGAATAA; (SEQ
ID NO: 46) MFKREEIIEMANKDFEKAWIETKDLIKAKKINESYPRIKPVFGKTHPVND
TIENLRQAYLRMGFEEYINPVIVDERDIYKQFGPEAMAVLDRCFYLAGLP
RPDVGLSDEKISQIEKLGIKVSEHKESLQKILHGYKKGTLDGDDLVLEIS
NALEISSEMGLKILEDVFPEFKDLTAVSSKLTLRSHMTSGWFLTVSDLMN
KKPLPFKLFSIDRCFRREQKEDKSHLMTYHSASCAIAGEGVDINGDKAIA
EGLLSQFGFTNFKFIPDEKKSKYYTPETQTEVYAYHPKLKEWLEVATFGV
YSPVALSKYGIDVPVMNLGLGVERLAMISGNFADVREMVYPQFYEHKLND
RNVASMVKLDKVPVMDEIYDLTKELIESCVKNKDLKSPCELAIEKTFSFG
KTKKVKINIFEKEEGKNLLGPSILNEIYVYDGNVIGIPESFDGVKEEFKD
FLEKGKSEGVATGIRYIDALVGKITSKLEEAFVSNTTEFKVKVPIVRSLS
DINLKIDDIALKQIMSKNKVIDVRGPVFLNVEVKIE.
B. Elongation Factor Proteins
[0050] Nucleic acid sequences encoding mutant elongation factor
proteins (EF-Sep) are described for use with phosphoseryl-tRNA
synthetase (SepRS) and phosphoseryl-tRNA (tRNA.sup.Sep) in site
specific incorporation of phosphoserine into a protein or
polypeptide. Typically, SepRS preferentially aminoacylates
tRNA.sup.Sep with O-phosphoserine and the tRNA.sup.Sep recognizes
at least one codon such as a stop codon. Due to the negative charge
of the phosphoserine, Sep-tRNA.sup.Sep does not bind elongation
factor Tu (EF-Tu). However, the disclosed EF-Sep proteins can bind
Sep-tRNA.sup.Sep and protect Sep-tRNA.sup.Sep from deacylation.
[0051] In some embodiments, EF-Sep is a mutant form of bacterial
EF-Tu having a mutation at one or more of amino acid residues
corresponding to His67, Asp216, Glu217, Phe219, Thr229, and Asn274
in E. coli EF-Tu, which are located in the amino acid binding
pocket for aminoacylated tRNA. In some embodiments, EF-Sep is a
mutant form of eukaryotic elongation factor 1A (eEF1A) with
mutations in positions equivalent to those of the bacterial counterpart.
[0052] In preferred embodiments, EF-Sep has the amino acid sequence
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, or a
conservative variant thereof. For example, in some embodiments, the
nucleic acid sequence encoding EF-Sep has the nucleic acid sequence
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, or a
conservative variant thereof.
C. Variants
[0053] Also disclosed are variants of the disclosed proteins and
polynucleotides that include conservative substitutions, additions,
and deletions therein not affecting the structure or function. For
example, biologically active sequence variants of tRNA.sup.Sep,
SepRS, and EF-Sep and in vitro generated covalent derivatives of
tRNA.sup.Sep, SepRS, and EF-Sep that demonstrate tRNA.sup.Sep,
SepRS, and EF-Sep activity are disclosed.
[0054] Various types of mutagenesis can be used to modify a nucleic
acid. They include, but are not limited to, site-directed, random
point mutagenesis, homologous recombination (DNA shuffling),
mutagenesis using uracil containing templates,
oligonucleotide-directed mutagenesis, phosphorothioate-modified DNA
mutagenesis, and mutagenesis using methods such as gapped duplex
DNA. Additional suitable methods include point mismatch repair,
mutagenesis using repair-deficient host strains,
restriction-selection and restriction-purification, deletion
mutagenesis, mutagenesis by total gene synthesis and double-strand
break repair.
[0055] Sequence variants of tRNA.sup.Sep, SepRS, and EF-Sep fall
into one or more of three classes: substitutional, insertional
and/or deletional variants. Sequence variants of tRNA.sup.Sep
include nucleotide variants, while sequence variants of SepRS and
EF-Sep include nucleotide and/or amino acid variants. Insertions
include amino and/or carboxyl terminal fusions as well as
intrasequence insertions of single or multiple residues.
tRNA.sup.Sep, SepRS, and EF-Sep include, for example, hybrids of
mature tRNA.sup.Sep, SepRS, and EF-Sep with nucleotides or
polypeptides that are homologous with tRNA.sup.Sep, SepRS, and
EF-Sep. tRNA.sup.Sep, SepRS, and EF-Sep also include hybrids of
tRNA.sup.Sep, SepRS, and EF-Sep with nucleotides or polypeptides
homologous to the host cell but not to tRNA.sup.Sep, SepRS, and
EF-Sep, as well as nucleotides or polypeptides heterologous to both
the host cell and tRNA.sup.Sep, SepRS, and EF-Sep. Fusions include
amino or carboxy terminal fusions with either prokaryotic
nucleotides or peptides or signal peptides of prokaryotic, yeast,
viral or host cell signal sequences.
[0056] Insertions can also be introduced within the mature coding
sequence of tRNA.sup.Sep, SepRS, and EF-Sep. These, however,
ordinarily will be smaller insertions than those of amino or
carboxyl terminal fusions, on the order of one to four residues.
Insertional sequence variants of tRNA.sup.Sep, SepRS, and EF-Sep
are those in which one or more residues are introduced into a
predetermined site in the target tRNA.sup.Sep, SepRS, and
EF-Sep.
[0057] Deletion variants are characterized by the removal of one or
more nucleotides or amino acid residues from the tRNA.sup.Sep,
SepRS, and EF-Sep sequence. For SepRS and EF-Sep, deletions or
substitutions of cysteine or other labile residues may be
desirable, for example in increasing the oxidative stability or
selecting the preferred disulfide bond arrangement of SepRS or
EF-Sep. Deletions or substitutions of potential proteolysis sites,
e.g. Arg Arg, are accomplished, for example, by deleting one of the
basic residues or substituting one by glutaminyl or histidyl
residues. Variants ordinarily are prepared by site specific
mutagenesis of nucleotides in the DNA encoding the tRNA.sup.Sep,
SepRS, and EF-Sep, thereby producing DNA encoding the variant, and
thereafter expressing the DNA in recombinant cell culture. Variant
tRNA.sup.Sep, SepRS, and EF-Sep fragments may also be prepared by
in vitro synthesis. The variants typically exhibit the same
qualitative biological activity as the naturally-occurring
analogue, although variants also are selected in order to modify
the characteristics of tRNA.sup.Sep, SepRS, and EF-Sep.
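As an illustration of the proteolysis-site substitutions described
above, the following minimal sketch locates Arg-Arg pairs in a protein
sequence and generates variants in which one basic residue is replaced
by glutamine or histidine. The function names and the toy sequence are
hypothetical and are not taken from the disclosed sequences.

```python
def find_dibasic_sites(protein: str, motif: str = "RR") -> list[int]:
    """Return 1-based start positions of every (overlapping) motif occurrence."""
    width = len(motif)
    return [i + 1 for i in range(len(protein) - width + 1)
            if protein[i:i + width] == motif]


def substitute_residue(protein: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position with one new residue."""
    return protein[:position - 1] + new_residue + protein[position:]


# Hypothetical toy sequence, not one of the disclosed SEQ ID NOs.
toy = "MKRRAL"
sites = find_dibasic_sites(toy)
# Substitute the first Arg of each pair with Gln (Q) or His (H).
variants = [substitute_residue(toy, pos, aa) for pos in sites for aa in "QH"]
```

Whether a given substitution preserves activity would still have to be
determined experimentally; the sketch only enumerates candidates.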
[0058] Substitutional variants are those in which at least one
residue has been removed and a different residue inserted
in its place. Owing to the degeneracy of the genetic code, "silent
substitutions" (i.e., substitutions in a nucleic acid sequence
which do not result in an alteration in an encoded polypeptide) are
an implied feature of every nucleic acid sequence which encodes an
amino acid. Similarly, conservative amino acid substitutions are
also readily identified. Such conservative variations are a feature
of each disclosed sequence. The substitutions which in general are
expected to produce the greatest changes in SepRS or EF-Sep protein
properties are those in which (a) a hydrophilic residue, e.g. seryl
or threonyl, is substituted for (or by) a hydrophobic residue, e.g.
leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or
proline is substituted for (or by) any other residue; (c) a residue
having an electropositive side chain, e.g., lysyl, arginyl, or
histidyl, is substituted for (or by) an electronegative residue,
e.g., glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g., phenylalanine, is substituted for (or by) one not
having a side chain, e.g., glycine.
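The notion of a silent substitution can be sketched in a few lines of
code. The example below assumes the standard genetic code and
classifies a single-codon change as silent, missense, or nonsense. The
function name classify_codon_change is a hypothetical helper introduced
only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_codon_change(before: str, after: str) -> str:
    """Classify a codon replacement by its effect on the encoded residue."""
    old = CODON_TABLE[before.upper()]
    new = CODON_TABLE[after.upper()]
    if old == new:
        return "silent"      # encoded residue is unchanged
    if new == "*":
        return "nonsense"    # sense codon converted to a stop codon
    return "missense"        # a different residue is encoded
```

For example, AGA to AGG leaves arginine unchanged, whereas AGA to AGT
replaces arginine with serine.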
[0059] While the site for introducing a nucleotide or amino acid
sequence variation is predetermined, the mutation per se need not
be predetermined. For example, in order to optimize the performance
of a mutation at a given site, random mutagenesis may be conducted
at the target codon or region and the expressed tRNA.sup.Sep,
SepRS, and EF-Sep variants screened for the optimal combination of
desired activity. Techniques for making substitution mutations at
predetermined sites in DNA having a known sequence are well
known.
[0060] Substitutions are typically of single residues; insertions
usually will be on the order of about from 1 to 10 residues; and
deletions will range about from 1 to 30 residues. Substitutions,
deletions, insertions or any combination thereof may be combined to
arrive at a final construct. The mutations that will be made in the
DNA encoding the variant SepRS and EF-Sep must not place the
sequence out of reading frame and preferably will not create
complementary regions that could produce secondary mRNA
structure.
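The reading-frame requirement can be checked with simple arithmetic: the
net change in length of the coding DNA must be a multiple of three
nucleotides. The following is a minimal sketch under that assumption;
the function name is hypothetical.

```python
def preserves_reading_frame(reference: str, variant: str) -> bool:
    """True if the net length change between two coding DNAs is a multiple of 3."""
    return (len(variant) - len(reference)) % 3 == 0


reference = "ATGAAATAG"          # hypothetical toy coding sequence
in_frame = "ATGAAAGGGTAG"        # three nucleotides inserted
out_of_frame = "ATGAAAGTAG"      # one nucleotide inserted
```

This check considers only the net length change. Two offsetting indels
at different positions would pass even though the codons between them
are shifted, so a full check would also compare the translated
sequences.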
[0061] A DNA isolate is understood to mean chemically synthesized
DNA, cDNA or genomic DNA with or without the 3' and/or 5' flanking
regions. DNA encoding tRNA.sup.Sep, SepRS, and EF-Sep can be
obtained from other sources than Methanocaldococcus jannaschii by
screening a cDNA library from cells containing mRNA using
hybridization with labeled DNA encoding Methanocaldococcus
jannaschii tRNA.sup.Sep, SepRS, and EF-Sep, or fragments thereof
(usually, greater than 10 bp).
[0062] The precise percentage of similarity between sequences that
is useful in establishing sequence identity varies with the nucleic
acid and protein at issue, but as little as 25% sequence similarity
is routinely used to establish sequence identity. Higher levels of
sequence similarity, e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%
or 99% or more can also be used to establish sequence identity.
Methods for determining sequence similarity percentages (e.g.,
BLASTP and BLASTN using default parameters) are generally
available.
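For illustration, percent identity over a pair of sequences that have
already been aligned can be computed as sketched below. This is a
simplified assumption about how identity is counted (identical columns
divided by aligned columns, with "-" marking gaps); BLASTP and BLASTN
report identity over their own local alignments and may differ. The
function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences have a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)
```

With this definition, the toy alignment "MKLK" against "MKLR" has three
identical columns out of four.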
[0063] Alignment of sequences for comparison can be conducted by
many well-known methods in the art, for example, by the local
homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482
(1981), by the homology alignment algorithm of Needleman &
Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity
method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444
(1988), by computerized implementations of these algorithms (GAP,
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software
Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.),
by the Gibbs sampling method (Chatterji and Pachter, J Comput Biol.
12(6):599-608 (2005)), by PSI-BLAST-ISS (Margelevicius and
Venclovas, BMC Bioinformatics 21; 6:185 (2005)), or by visual
inspection. One algorithm that is suitable for determining percent
sequence identity and sequence similarity is the BLAST algorithm,
which is described in Altschul et al., J. Mol. Biol. 215:403-410
(1990). Software for performing BLAST analyses is publicly
available through the National Center for Biotechnology Information
(www.ncbi.nlm.nih.gov).
[0064] In addition to calculating percent sequence identity, the
BLAST algorithm also performs a statistical analysis of the
similarity between two sequences (see, e.g., Karlin & Altschul,
Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of
similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability
by which a match between two nucleotide or amino acid sequences
would occur by chance. For example, a nucleic acid is considered
similar to a reference sequence if the smallest sum probability in
a comparison of the test nucleic acid to the reference nucleic acid
is less than about 0.1, more preferably less than about 0.01, and
most preferably less than about 0.001.
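The thresholds above can be expressed as a simple classification of a
smallest sum probability value. The sketch below treats the "about"
cutoffs as exact values, which is an assumption made only for
illustration; the function name and tier labels are hypothetical.

```python
def similarity_tier(smallest_sum_probability: float) -> str:
    """Map a BLAST smallest sum probability P(N) onto the tiers described above."""
    p = smallest_sum_probability
    if p < 0.001:
        return "most preferred"
    if p < 0.01:
        return "more preferred"
    if p < 0.1:
        return "similar"
    return "not similar"
```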
D. Expression or Translation Systems
[0065] Also disclosed are expression or translation systems for
incorporating phosphoserine into a growing polypeptide chain
(protein). Components of a translation system generally include
amino acids, ribosomes, tRNAs, synthetases, and mRNA. The disclosed
tRNA.sup.Sep, SepRS, and EF-Sep can be added to a translation
system, in vivo or in vitro to incorporate phosphoserine into a
protein.
[0066] In some embodiments, a cell-based (in vivo) expression
system is used. In these embodiments, nucleic acids encoding one or
more of tRNA.sup.Sep, SepRS, and EF-Sep are delivered to cells
under conditions suitable for translation and/or transcription of
tRNA.sup.Sep, SepRS, EF-Sep, or a combination thereof. The cells
can in some embodiments be prokaryotic, e.g., an E. coli cell, or
eukaryotic, e.g., a yeast, mammalian, plant, or insect cell.
[0067] In some embodiments, a cell-free (in vitro) expression
system is used. The most frequently used cell-free translation
systems involve extracts containing all the macromolecular
components (70 S or 80 S ribosomes, tRNAs, aminoacyl-tRNA
synthetases, initiation, elongation and termination factors, etc.)
required for translation of exogenous RNA. To ensure efficient
translation, each extract is supplemented with amino acids, energy
sources (ATP, GTP), energy regenerating systems (creatine phosphate
and creatine phosphokinase for eukaryotic systems, and phosphoenol
pyruvate and pyruvate kinase for the E. coli lysate), and other
co-factors (Mg.sup.2+, K.sup.+, etc.).
i) Promoters and Enhancers
[0068] Nucleic acids that are delivered to cells typically contain
expression controlling systems. For example, the inserted genes in
viral and retroviral systems usually contain promoters, and/or
enhancers to help control the expression of the desired gene
product. A promoter is generally a sequence or sequences of DNA
that function when in a relatively fixed location in regard to the
transcription start site. A promoter contains core elements
required for basic interaction of RNA polymerase and transcription
factors, and may contain upstream elements and response
elements.
[0069] Therefore, also disclosed is a polynucleotide encoding one
or more of tRNA.sup.Sep, SepRS, and EF-Sep, operably linked to an
expression control sequence.
[0070] Suitable promoters are generally obtained from viral genomes
(e.g., polyoma, Simian Virus 40 (SV40), adenovirus, retroviruses,
hepatitis-B virus, and cytomegalovirus) or heterologous mammalian
genes (e.g. beta actin promoter). Enhancer generally refers to a
sequence of DNA that functions at no fixed distance from the
transcription start site and can be either 5' or 3' to the
transcription unit. Furthermore, enhancers can be within an intron
as well as within the coding sequence itself. They are usually
between 10 and 300 bp in length, and they function in cis.
Enhancers function to increase transcription from nearby promoters.
Enhancers also often contain response elements that mediate the
regulation of transcription. Many enhancer sequences are now known
from mammalian genes (globin, elastase, albumin,
.alpha.-fetoprotein and insulin). However, enhancers from a
eukaryotic cell virus are preferably used for general expression.
Suitable examples include the SV40 enhancer on the late side of the
replication origin, the cytomegalovirus early promoter enhancer,
the polyoma enhancer on the late side of the replication origin,
and adenovirus enhancers.
[0071] In certain embodiments the promoter and/or enhancer region
can act as a constitutive promoter and/or enhancer to maximize
expression of the region of the transcription unit to be
transcribed. In certain constructs the promoter and/or enhancer
region is active in all eukaryotic cell types, even if it is only
expressed in a particular type of cell at a particular time. A
preferred promoter of this type is the CMV promoter. In other
embodiments, the promoter and/or enhancer is tissue or cell
specific.
[0072] In certain embodiments the promoter and/or enhancer region
is inducible. Induction can occur, e.g., as the result of a
physiologic response, a response to outside signals, or as the
result of artificial manipulation. Such promoters are well known to
those of skill in the art. For example, in some embodiments, the
promoter and/or enhancer may be specifically activated either by
light or specific chemical events which trigger their function.
Systems can be regulated by reagents such as tetracycline and
dexamethasone. There are also ways to enhance viral vector gene
expression by exposure to irradiation, such as gamma irradiation,
or alkylating chemotherapy drugs.
[0073] Expression vectors used in eukaryotic host cells (yeast,
fungi, insect, plant, animal, human or nucleated cells) may also
contain sequences necessary for the termination of transcription
which may affect mRNA expression. These regions are transcribed as
polyadenylated segments in the untranslated portion of the mRNA
encoding tissue factor protein. The 3' untranslated regions also
include transcription termination sites. It is preferred that the
transcription unit also contains a polyadenylation region. One
benefit of this region is that it increases the likelihood that the
transcribed unit will be processed and transported like mRNA. The
identification and use of polyadenylation signals in expression
constructs is well established. It is preferred that homologous
polyadenylation signals be used in the transgene constructs.
ii) Cell Delivery Systems
[0074] There are a number of compositions and methods which can be
used to deliver nucleic acids to cells, either in vitro or in vivo.
These methods and compositions can largely be broken down into two
classes: viral based delivery systems and non-viral based delivery
systems. For example, nucleic acids can be delivered through a
number of direct delivery systems such as electroporation,
lipofection, calcium phosphate precipitation, plasmids, viral
vectors, viral nucleic acids, phage nucleic acids, phages, cosmids,
or via transfer of genetic material in cells or carriers such as
cationic liposomes. Appropriate means for transfection, including
viral vectors, chemical transfectants, or physico-mechanical
methods such as electroporation and direct diffusion of DNA, are
well known in the art and readily adaptable for use with the
compositions and methods described herein.
[0075] Transfer vectors can be any nucleotide construction used to
deliver genetic material into cells. In some embodiments the
vectors are derived from either a virus or a retrovirus. Viral
vectors include, for example, Adenovirus, Adeno-associated virus,
Herpes virus, Vaccinia virus, Polio virus, AIDS virus, neuronal
trophic virus, Sindbis and other RNA viruses, including these
viruses with the HIV backbone.
[0076] Typically, viral vectors contain nonstructural early genes,
structural late genes, an RNA polymerase III transcript, inverted
terminal repeats necessary for replication and encapsidation, and
promoters to control the transcription and replication of the viral
genome. When engineered as vectors, viruses typically have one or
more of the early genes removed and a gene or gene/promoter
cassette inserted into the viral genome in place of the removed
viral DNA. The necessary functions of the removed early genes are
typically supplied by cell lines which have been engineered to
express the gene products of the early genes in trans.
[0077] Nucleic acids can also be delivered through electroporation,
sonoporation, lipofection, or calcium phosphate precipitation.
Lipofection involves the use of liposomes, including cationic
liposomes (e.g., DOTMA, DOPE, DC-cholesterol) and anionic
liposomes, to deliver genetic material to a cell. Commercially
available liposome preparations include LIPOFECTIN, LIPOFECTAMINE
(GIBCO-BRL, Inc., Gaithersburg, Md.), SUPERFECT (Qiagen, Inc.,
Hilden, Germany), and TRANSFECTAM (Promega Biotec, Inc., Madison,
Wis.).
[0078] Nucleic acids that are delivered to cells and are to be
integrated into the host cell genome typically contain integration
sequences. These sequences are often viral related sequences,
particularly when viral based systems are used. These viral
integration systems can also be incorporated into nucleic acids
which are to be delivered using a non-nucleic acid based system of
delivery, such as a liposome, so that the nucleic acid contained in
the delivery system can become integrated into the host genome.
Techniques for integration of genetic material into a host genome
are also known and include, for example, systems designed to
promote homologous recombination with the host genome. These
systems typically rely on sequence flanking the nucleic acid to be
expressed that has enough homology with a target sequence within
the host cell genome that recombination between the vector nucleic
acid and the target nucleic acid takes place, causing the delivered
nucleic acid to be integrated into the host genome. These systems
and the methods necessary to promote homologous recombination are
known to those of skill in the art.
iii) Markers
[0079] The vectors used to deliver the disclosed nucleic acids to
cells can further include nucleic acid sequence encoding a marker
product. This marker product is used to determine if the gene has
been delivered to the cell and once delivered is being expressed.
In some embodiments the marker is a detectable label. Exemplary
labels include the E. coli lacZ gene, which encodes
.beta.-galactosidase, and green fluorescent protein (GFP).
[0080] In some embodiments the marker may be a selectable marker.
Examples of suitable selectable markers for mammalian cells are
dihydrofolate reductase (DHFR), thymidine kinase, neomycin,
neomycin analog G418, hygromycin, and puromycin. When such
selectable markers are successfully transferred into a mammalian
host cell, the transformed mammalian host cell can survive if
placed under selective pressure. There are two widely used distinct
categories of selective regimes. The first category is based on a
cell's metabolism and the use of a mutant cell line which lacks the
ability to grow independent of a supplemented media. The second
category is dominant selection which refers to a selection scheme
used in any cell type and does not require the use of a mutant cell
line. These schemes typically use a drug to arrest growth of a host
cell. Those cells which have a novel gene would express a protein
conveying drug resistance and would survive the selection.
III. Methods
A. Site-Specific Phosphorylation of Proteins
[0081] Methods for incorporating phosphoserine into polypeptides
are disclosed. The method involves the use of tRNA.sup.Sep, SepRS,
and EF-Sep in the translation process for a target polypeptide from
mRNA. SepRS preferentially aminoacylates tRNA.sup.Sep with
O-phosphoserine. The resulting Sep-tRNA.sup.Sep recognizes at least
one codon in the mRNA for the target protein, such as a stop codon.
EF-Sep mediates the entry of the Sep-tRNA.sup.Sep into a free site
of the ribosome. If the codon-anticodon pairing is correct, EF-Sep
hydrolyzes guanosine triphosphate (GTP) into guanosine diphosphate
(GDP) and inorganic phosphate, and changes in conformation to
dissociate from the tRNA molecule. The Sep-tRNA.sup.Sep then fully
enters the A site, where its amino acid is brought near the P
site's polypeptide and the ribosome catalyzes the covalent transfer
of the amino acid onto the polypeptide.
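The decoding step described above can be sketched as a toy translation
routine in which one stop codon is reassigned to phosphoserine. The
choice of UAG as the reassigned codon is an assumption made for
illustration (the text says only "such as a stop codon"), and the
function name translate_with_sep is hypothetical. The sketch models
codon reading only, not EF-Sep binding or GTP hydrolysis.

```python
from itertools import product

# Standard genetic code for RNA, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_with_sep(mrna: str, sep_codon: str = "UAG") -> list[str]:
    """Translate an mRNA, reading sep_codon as phosphoserine ("Sep")."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper()
        if codon == sep_codon:
            residues.append("Sep")      # reassigned codon read by Sep-tRNA
        elif CODON_TABLE[codon] == "*":
            break                        # remaining stop codons terminate
        else:
            residues.append(CODON_TABLE[codon])
    return residues
```

For the toy message AUGUAGAAAUAA, the routine reads Met, Sep, Lys and
then stops at UAA.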
[0082] In preferred embodiments, the tRNA.sup.Sep is a tRNA.sup.Cys
from a methanogenic archaeon, such as Methanocaldococcus jannaschii,
containing a mutation (e.g., C20U) that improves aminoacylation of
the tRNA by SepRS without affecting CysRS recognition. In some
embodiments, the tRNA.sup.Sep contains an anticodon that binds a
codon other than a Cys codon, such as a stop codon. In some
embodiments, the tRNA.sup.Sep is encoded by the nucleic acid sequence
SEQ ID NO:41, or a conservative variant thereof.
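The relationship between a target codon and the anticodon that reads it
can be illustrated by taking the reverse complement of the codon. The
sketch below assumes strict Watson-Crick pairing and ignores wobble and
modified bases; the function name is hypothetical.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon_for(codon: str) -> str:
    """Return the anticodon (written 5'->3') pairing with an RNA codon (5'->3')."""
    return codon.upper().translate(COMPLEMENT)[::-1]
```

Under these assumptions the Cys codon UGC pairs with anticodon GCA,
while the stop codon UAG pairs with anticodon CUA.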
[0083] In some embodiments, the SepRS is any tRNA synthetase that
preferentially aminoacylates tRNA.sup.Sep with a phosphoserine. In
preferred embodiments, the SepRS is a tRNA synthetase from a
methanogenic archaeon, such as Methanococcus maripaludis or
Methanocaldococcus jannaschii. In some embodiments, the SepRS has
the amino acid sequence SEQ ID NO:43 or 46, or a conservative
variant thereof.
[0084] In some embodiments, the EF-Sep is any elongation factor
protein that binds Sep-tRNA.sup.Sep and catalyzes the covalent
transfer of the phosphoserine amino acid onto the polypeptide.
EF-Sep proteins can bind Sep-tRNA.sup.Sep and can preferably
protect Sep-tRNA.sup.Sep from deacylation. In some embodiments,
EF-Sep is a mutant form of bacterial EF-Tu having a mutation at one
or more of amino acid residues corresponding to His67, Asp216,
Glu217, Phe219, Thr229, and Asn274 in E. coli EF-Tu, which are
located in the amino acid binding pocket for aminoacylated tRNA. In
some embodiments, EF-Sep has the amino acid sequence SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, or a conservative variant
thereof. In some embodiments, EF-Sep is a mutant form of eukaryotic
elongation factor 1A (eEF1A) with mutations in positions equivalent
to those of the bacterial counterpart.
i) In Vitro Transcription/Translation
[0085] In one embodiment, the nucleic acids encoding tRNA.sup.Sep
and SepRS activity are synthesized prior to translation of the
target protein and are used to incorporate phosphoserine into a
target protein in a cell-free (in vitro) protein synthesis
system.
[0086] In vitro protein synthesis systems involve the use of crude
extracts containing all the macromolecular components (70 S or 80 S
ribosomes, tRNAs, aminoacyl-tRNA synthetases, initiation,
elongation and termination factors, etc.) required for translation
of exogenous RNA. For the current method, the tRNAs, aminoacyl-tRNA
synthetases, and elongation factors in the crude extract are
supplemented with tRNA.sup.Sep, SepRS, and EF-Sep. To ensure
efficient translation, each extract must be supplemented with amino
acids, energy sources (ATP, GTP), energy regenerating systems
(creatine phosphate and creatine phosphokinase for eukaryotic
systems, and phosphoenol pyruvate and pyruvate kinase for the E.
coli lysate), and other co-factors (Mg.sup.2+, K.sup.+, etc.).
[0087] In vitro protein synthesis does not depend on having a
polyadenylated RNA, but if having a poly(A) tail is essential for
some other purpose, a vector may be used that has a stretch of
about 100 A residues incorporated into the polylinker region. That
way, the poly(A) tail is "built in" by the synthetic method. In
addition, eukaryotic ribosomes read RNAs that have a 5' methyl
guanosine cap more efficiently. RNA caps can be incorporated by
initiation of transcription using a capped base analogue, or adding
a cap in a separate in vitro reaction post-transcriptionally.
[0088] Suitable in vitro transcription/translation systems include,
but are not limited to, the rabbit reticulocyte system, the E. coli
S-30 transcription-translation system, and the wheat germ based
translational system. Combined transcription/translation systems
are available, in which both phage RNA polymerases (such as T7 or
SP6) and eukaryotic ribosomes are present. One example of a kit is
the TNT.RTM. system from Promega Corporation.
ii) In Vivo Methods
[0089] Host cells and organisms can also incorporate phosphoserine
into proteins or polypeptides via nucleic acids encoding
tRNA.sup.Sep, SepRS, and EF-Sep. Nucleic acids encoding tRNA.sup.Sep,
SepRS, and EF-Sep, operably linked to one or more expression
control sequences, are introduced into cells or organisms using a
cell delivery system. These cells also contain a gene encoding the
target protein operably linked to an expression control
sequence.
[0090] Suitable organisms include, but are not limited to,
microorganisms such as bacteria transformed with recombinant
bacteriophage, plasmid, or cosmid DNA expression vectors; yeast
transformed with yeast expression vectors; insect cell systems
infected with viral expression vectors (e.g., baculovirus); plant
cell systems transformed with viral expression vectors (e.g.,
cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or
with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or
animal cell systems.
[0091] It will be understood by one of ordinary skill in the art
that regardless of the system used (i.e. in vitro or in vivo),
expression of genes encoding tRNA.sup.Sep, SepRS, and EF-Sep
activity will result in site specific incorporation of
phosphoserine into the target polypeptides or proteins that are
translated in the system. Host cells are genetically engineered
(e.g., transformed, transduced or transfected) with the vectors
encoding tRNA.sup.Sep, SepRS, and EF-Sep, which can be, for
example, a cloning vector or an expression vector. The vector can
be, for example, in the form of a plasmid, a bacterium, a virus, a
naked polynucleotide, or a conjugated polynucleotide. The vectors
are introduced into cells and/or microorganisms by standard methods
including electroporation, infection by viral vectors, high
velocity ballistic penetration by small particles with the nucleic
acid either within the matrix of small beads or particles, or on
the surface. Such vectors can optionally contain one or more
promoters. A "promoter" as used herein is a DNA regulatory region
capable of initiating transcription of a gene of interest.
[0092] Kits are commercially available for the purification of
plasmids from bacteria (see, e.g., GFX.TM. Micro Plasmid Prep Kit
from GE Healthcare; StrataPrep.RTM. Plasmid Miniprep Kit and
StrataPrep.RTM. EF Plasmid Midiprep Kit from Stratagene;
GenElute.TM. HP Plasmid Midiprep and Maxiprep Kits from Sigma-Aldrich;
and Qiagen plasmid prep kits and QIAfilter.TM. kits from Qiagen).
The isolated and purified plasmids are then further manipulated to
produce other plasmids, used to transfect cells or incorporated
into related vectors to infect organisms. Typical vectors contain
transcription and translation terminators, transcription and
translation initiation sequences, and promoters useful for
regulation of the expression of the particular target nucleic acid.
The vectors optionally comprise generic expression cassettes
containing at least one independent terminator sequence, sequences
permitting replication of the cassette in eukaryotes, or
prokaryotes, or both, (e.g., shuttle vectors) and selection markers
for both prokaryotic and eukaryotic systems.
[0093] Prokaryotes useful as host cells include, but are not
limited to, gram negative or gram positive organisms such as E.
coli or Bacilli. In a prokaryotic host cell, a polypeptide may
include an N-terminal methionine residue to facilitate expression
of the recombinant polypeptide in the prokaryotic host cell. The
N-terminal Met may be cleaved from the expressed recombinant
polypeptide. Promoter sequences commonly used for recombinant
prokaryotic host cell expression vectors include lactamase and the
lactose promoter system.
[0094] Expression vectors for use in prokaryotic host cells
generally comprise one or more phenotypic selectable marker genes.
A phenotypic selectable marker gene is, for example, a gene
encoding a protein that confers antibiotic resistance or that
supplies an auxotrophic requirement. Examples of useful expression
vectors for prokaryotic host cells include those derived from
commercially available plasmids such as the cloning vector pBR322
(ATCC 37017). pBR322 contains genes for ampicillin and tetracycline
resistance and thus provides simple means for identifying
transformed cells. To construct an expression vector using pBR322,
an appropriate promoter and a DNA sequence are inserted into the
pBR322 vector. Other commercially available vectors include, for
example, T7 expression vectors from Invitrogen, pET vectors from
Novagen and pALTER.RTM. vectors and PinPoint.RTM. vectors from
Promega Corporation.
[0095] Yeasts useful as host cells include, but are not limited to,
those from the genus Saccharomyces, Pichia, K. Actinomycetes and
Kluyveromyces. Yeast vectors will often contain an origin of
replication sequence, an autonomously replicating sequence (ARS), a
promoter region, sequences for polyadenylation, sequences for
transcription termination, and a selectable marker gene. Suitable
promoter sequences for yeast vectors include, among others,
promoters for metallothionein, 3-phosphoglycerate kinase (Hitzeman
et al., Biol. Chem. 255:2073, (1980)) or other glycolytic enzymes
(Holland et al., Biochem. 17:4900, (1978)) such as enolase,
glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate
decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase,
3-phosphoglycerate mutase, pyruvate kinase, triosephosphate
isomerase, phosphoglucose isomerase, and glucokinase. Other
suitable vectors and promoters for use in yeast expression are
further described in Fleer et al., Gene, 107:285-295 (1991), in Li,
et al., Lett Appl Microbiol. 40(5):347-52 (2005), Jansen, et al.,
Gene 344:43-51 (2005) and Daly and Hearn, J. Mol. Recognit.
18(2):119-38 (2005). Other suitable promoters and vectors for yeast
and yeast transformation protocols are well known in the art.
[0096] Mammalian or insect host cell culture systems well known in
the art can also be employed to express recombinant tRNA.sup.Sep,
SepRS, and EF-Sep for producing proteins or polypeptides containing
phosphoserine. Commonly used promoter sequences and enhancer
sequences are derived from Polyoma virus, Adenovirus 2, Simian
Virus 40 (SV40), and human cytomegalovirus. DNA sequences derived
from the SV40 viral genome may be used to provide other genetic
elements for expression of a structural gene sequence in a
mammalian host cell, e.g., SV40 origin, early and late promoter,
enhancer, splice, and polyadenylation sites. Viral early and late
promoters are particularly useful because both are easily obtained
from a viral genome as a fragment which may also contain a viral
origin of replication. Exemplary expression vectors for use in
mammalian host cells are well known in the art.
B. Purifying Proteins Containing Phosphoserine
[0097] Proteins or polypeptides containing phosphoserine can be
purified, either partially or substantially to homogeneity,
according to standard procedures known to and used by those of
skill in the art including, but not limited to, ammonium sulfate or
ethanol precipitation, acid or base extraction, column
chromatography, affinity column chromatography, anion or cation
exchange chromatography, phosphocellulose chromatography,
hydrophobic interaction chromatography, hydroxylapatite
chromatography, lectin chromatography, and gel electrophoresis.
Protein refolding steps can be used, as desired, in making
correctly folded mature proteins. High performance liquid
chromatography (HPLC), affinity chromatography or other suitable
methods can be employed in final purification steps where high
purity is desired. In one embodiment, antibodies made against
proteins containing phosphoserine are used as purification
reagents, e.g., for affinity-based purification of proteins
containing phosphoserine. Once purified, partially or to
homogeneity, as desired, the polypeptides may be used as assay
components, therapeutic reagents, immunogens for antibody
production, etc.
[0098] Those of skill in the art will recognize that, after
synthesis, expression and/or purification, proteins can possess
conformations different from the desired conformations of the
relevant polypeptides. For example, polypeptides produced by
prokaryotic systems often are optimized by exposure to chaotropic
agents to achieve proper folding. During purification from lysates
derived from E. coli, the expressed protein is optionally denatured
and then renatured. This is accomplished by solubilizing the
proteins in a chaotropic agent such as guanidine HCl.
[0099] It is occasionally desirable to denature and reduce
expressed polypeptides and then to cause the polypeptides to
re-fold into the preferred conformation. For example, guanidine,
urea, DTT, DTE, and/or a chaperonin can be added to a translation
product of interest. Methods of reducing, denaturing and renaturing
proteins are well known to those of skill in the art. Refolding
reagents can be flowed or otherwise moved into contact with the one
or more polypeptide or other expression product, or vice-versa.
C. Using Phosphoserine Containing Peptides
[0100] Proteins or polypeptides containing phosphoserine and
antibodies that bind to such proteins produced by the methods
described herein can be used for research involving phosphoproteins
such as the study of kinases, phosphatases, and target proteins in
signal transduction pathways. Proteins or polypeptides containing
phosphoserine produced by the methods described herein can also be
used for antibody production, protein array manufacture and
development of cell-based screens for new drug discovery.
IV. Kits
[0101] Kits for producing polypeptides and/or proteins containing
phosphoserine are also provided. For example, a kit for producing a
protein that contains phosphoserine in a cell is provided, where
the kit includes a polynucleotide sequence encoding tRNA.sup.Sep, a
polynucleotide sequence encoding SepRS, and a polynucleotide
sequence encoding EF-Sep. In one embodiment, the kit further
includes phosphoserine. In another embodiment, the kit further
comprises instructional materials for producing the protein. In
another embodiment, a kit for producing a protein that contains
phosphoserine in vitro is provided, where the kit includes a
polynucleotide sequence encoding tRNA.sup.Sep, a polynucleotide
sequence encoding SepRS, a polynucleotide sequence encoding EF-Sep,
and phosphoserine. In another embodiment, the kit further comprises
instructional materials for producing the protein in vitro.
[0102] The present invention will be further understood by
reference to the following non-limiting examples.
EXAMPLES
Example 1: SepRS and tRNA.sup.Sep are an Orthogonal Pair in E.
coli
Materials and Methods
Constructions of Strains
[0103] To prevent possible enzymatic dephosphorylation of
O-phospho-L-serine (Sep) in vivo, the gene encoding phosphoserine
phosphatase (serB), which catalyzes the last step in serine
biosynthesis, was deleted from Escherichia coli strains Top 10 and
BL21. Markerless gene deletions were carried out using a
.lamda.-red and FLP recombinase-based gene knockout strategy as
described by Datsenko K A, et al. Proc Natl Acad Sci USA. 97:6640
(2000). E. coli strains Top 10.DELTA.serB and BL21.DELTA.serB were
used for EF-Tu library construction and MEK1 expression
experiments.
Construction of Plasmids
[0104] To construct plasmid pSepT, the full-length gene encoding
tRNA.sup.Sep was constructed from overlapping oligonucleotides and
ligated immediately downstream of the lpp promoter in pTECH (Bunjun
S, et al. Proc Natl Acad Sci USA. 97:12997 (2000)) using EcoRI and
BamHI restriction sites. pCysT, encoding the wild type tRNA.sup.Cys
gene from Methanocaldococcus jannaschii was constructed in the same
way.
[0105] The gene fragment encoding .beta.-lactamase was
PCR-amplified from plasmid pUC18 using primers PBLAF (5'-TGC GCA
ATG CGG CCG CCC GTA GCG CCG ATG GTA GTG T-3', SEQ ID NO:9) and
PBLAR (5'-ACA CGG AGA TCT CTA AAG TAT ATA TGA GTA AAC-3', SEQ ID
NO:10), and ligated with a NotI and BglII digested PCR product
which was constructed from pSepT using primers PSEPF (5'-TGC GCA
ATG CGG CCG CCC GGG TCG AAT TTG CTT TCG A-3', SEQ ID NO:11) and
PSEPR (5'-ACA CGG AGA TCT ATG CCC CGC GCC CAC CGG AAG-3', SEQ ID
NO:12).
[0106] pKD was derived from pKK223-3 (Pharmacia). The ampicillin
resistance gene was replaced with a kanamycin resistance gene by
combining two PCR products generated from pKK223-3 and pET28a. The
following PCR primers were used: PKF (5'-TGC AGCA ATG CGG CCG CTT
TCA CCG TCA TCA CCG AAA C-3', SEQ ID NO:13) and PKR (5'-GGG ACG CTA
GCA AAC AAA AAG AGT TTG TAG AA-3', SEQ ID NO:14) for pKK223-3
amplification and PKNF (5'-GGG ACG CTA GCT TTT CTC TGG TCC CGC CGC
AT-3', SEQ ID NO:15) and PKNR (5'-TGC GCA ATG CGG CCG CGG TGG CAC
TTT TCG GGG AAA T-3', SEQ ID NO:16) for Kan.sup.R gene
amplification.
[0107] The original multiple cloning site (MCS; NcoI-EcoRI-SacI)
was modified by adding an additional ribosome binding site (RBS)
and a second MCS (NdeI-BamHI-SalI-HindIII), thus enabling
simultaneous protein expression from two genes, both under the
control of the same tac promoter. The Methanococcus maripaludis
SepRS gene was cloned into pKD using NcoI and SacI sites to produce
pKD-SepRS. The E. coli EF-Tu gene (tufB) was ligated into pKD-SepRS
using BamHI and SalI sites resulting in pKD-SepRS-EFTu. The M.
maripaludis pscS gene encoding SepCysS was cloned into pKD-SepRS
using BamHI and SalI sites to produce pKD-SepRS-SepCysS and the M.
maripaludis CysRS gene was cloned into pKD using EcoRI and SalI to
yield pKD-CysRS.
[0108] pCAT112TAG-SepT was created from pACYC184. The gene encoding
chloramphenicol acetyltransferase (CAT) was modified by quickchange
mutagenesis to introduce an amber stop codon at position Asp112
(Wang L, et al. Science 292:498 (2001)). The resulting plasmid was
PCR amplified using primers PBLAF (5'-TGC GCA ATG CGG CCG CCC GTA
GCG CCG ATG GTA GTG T-3', SEQ ID NO:9) and PBLAR (5'-ACA CGG AGA
TCT CTA AAG TAT ATA TGA GTA AAC-3', SEQ ID NO:10) and ligated with
a PCR product containing a tRNA.sup.Sep expression cassette from
pSepT, created with primers TSEPF (5'-GCA TGC GCC GCC AGC TGT TGC
CCG TCT CGC-3', SEQ ID NO:17) and TSEPR (5'-GCA TAG ATC TTC AGC TGG
CGA AAG GGG GAT G-3', SEQ ID NO:18).
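The codon replacement described above (an amber TAG codon substituted at a chosen amino acid position) can be illustrated in silico. The following is a minimal sketch, not the mutagenesis procedure itself: the function name introduce_amber, the toy coding sequence, and the numbering convention (position 1 is the initiator Met codon) are assumptions made for illustration only.

```python
def introduce_amber(cds: str, codon_position: int) -> str:
    """Return the coding sequence with one codon replaced by the amber codon TAG.

    codon_position is 1-based and counts codons from the start of cds
    (an assumed numbering convention; check it against the real gene).
    """
    cds = cds.replace(" ", "").upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= codon_position <= n_codons:
        raise ValueError("codon position out of range")
    start = 3 * (codon_position - 1)
    # Splice TAG in place of the original three nucleotides.
    return cds[:start] + "TAG" + cds[start + 3:]


# Hypothetical toy gene: Met-Asp-Lys-stop (not a sequence from this document).
toy_cds = "ATGGATAAATAA"
mutant = introduce_amber(toy_cds, 2)
print(mutant)
```

For a real reporter such as CAT, the same call would take the full open reading frame and the target position (e.g., 112), provided the numbering of the supplied sequence matches the numbering used for the protein.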
[0109] Plasmid pCcdB was created by adding a ccdB gene under the
control of a lac promoter into pTECH using NotI and BglII sites
(Wang L, et al. Science 292:498 (2001)). Two amber stop codons were
introduced at positions 13 and 44 based on the crystal structure
and mutagenesis study of the CcdB protein (Bernard, P., et al. Gene
148:71 (1994); Bajaj K. et al. Proc Natl Acad Sci USA 102:16221
(2005)).
[0110] Plasmid pL11C-SepT encodes tRNA.sup.Sep and the C-terminal
domain of the ribosomal protein L11 under control of lpp promoters.
Part of the rplK gene was PCR amplified from genomic E. coli DNA
using primers L11C-F (5'-GGA ATT CCA TAT GAC CAA GAC CCC GCC GGC
AGC AGT T-3', SEQ ID NO:38) and L11C-R (5'-AGG CGC GCC TTA GTC CTC
CAC TAC-3', SEQ ID NO:39). The PCR product was digested with NdeI
and AscI and was ligated into NdeI and AscI digested
pMYO127TAG-SepT to replace the myoglobin gene.
[0111] To construct pMAL-EFTu and pMAL-EFSep, E. coli tufB or the
gene encoding EF-Sep, respectively, was cloned between the NdeI
and BamHI sites in the pET20b plasmid (Novagen) to add a C-terminal
His.sub.6 tag. This fusion construct was then PCR-amplified using
primers adding MfeI and PstI restriction sites. The PCR product was
cloned in-frame between EcoRI and PstI in pMAL-c2x (New England
Biolabs) to add an N-terminal maltose binding protein (MBP)
tag.
Aminoacylation of tRNA and EF-Sep Binding Assays
[0112] In vitro transcript of Methanocaldococcus jannaschii
tRNA.sup.Cys was prepared and acylated with [.sup.14C]Sep (55
mCi/mmol) using recombinant Methanococcus maripaludis SepRS as
described previously by Hohn M J, et al. Proc Natl Acad Sci USA.
2006 Nov. 28; 103(48):18095-100. Sep-tRNA.sup.Cys was
phenol/chloroform extracted, and the aqueous phase was passed over
Sephadex G25 Microspin columns (GE Healthcare) equilibrated with
water.
[0113] Protection of Sep-tRNA.sup.Cys by EF-Tu was assayed as
described earlier with slight modifications (Ling J, et al. RNA.
2007 November; 13(11):1881-6). Briefly, EF-Tu or EF-Sep (both
purified as maltose binding protein fusion proteins) were activated
for 20 min. at 37.degree.C. in buffer containing 100 mM Tris-HCl
(pH 8.2), 120 mM NH.sub.4Cl, 7 mM MgCl.sub.2, 5 mM DTT, 5 mM
phosphoenolpyruvate, 1.5 mM GTP, and 0.12 .mu.g/.mu.l pyruvate
kinase. Hydrolysis of 2 .mu.M [.sup.14C]Sep-tRNA.sup.Cys was then
monitored at 25.degree.C. in the presence of 40 .mu.M EF-Tu (wt),
EF-Sep, or BSA, respectively. Aliquots were taken from the reaction
mix at indicated time points and spotted on 3 MM filter discs
presoaked with 10% trichloroacetic acid. Filters were washed with
5% trichloroacetic acid, dried, and radioactivity was measured by
liquid scintillation counting.
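Time-course data from a hydrolysis protection assay of this kind are commonly summarized as a deacylation half-life. The sketch below assumes simple first-order decay of the acid-precipitable counts and fits it by log-linear least squares. The source does not state how the data were analyzed, so the model, the function name first_order_half_life, and the synthetic numbers are all illustrative assumptions.

```python
import math


def first_order_half_life(times_min, counts):
    """Estimate a half-life (in minutes) from counts assumed to decay as exp(-k*t).

    Fits ln(counts) against time by ordinary least squares and returns ln(2)/k.
    """
    logs = [math.log(c) for c in counts]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx  # decay rate constant; positive when counts fall over time
    return math.log(2) / k


# Synthetic, noise-free example (hypothetical values, not measured data).
times = [0, 5, 10, 20, 30]
counts = [1000 * math.exp(-0.1 * t) for t in times]
print(round(first_order_half_life(times, counts), 3))
```

A longer fitted half-life in the presence of an elongation factor than with the BSA control would indicate protection of the aminoacyl ester bond.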
Results
[0114] The Sep-insertion strategy was based on the discovery that
most methanogens form Cys-tRNA.sup.Cys by an unusual pathway
required for cysteine synthesis in these archaea (Sauerwald A. et
al. Science 307, 1969 (2005)). In this route (FIG. 1A),
tRNA.sup.Cys first becomes acylated with O-phosphoserine (Sep) by
O-phosphoseryl-tRNA synthetase (SepRS), an unusual aminoacyl-tRNA
synthetase specific solely for the substrates Sep and tRNA.sup.Cys
(Hohn, M J., et al. Proc Natl Acad Sci USA 103, 18095 (2006)). The
resulting product Sep-tRNA.sup.Cys is then converted to
Cys-tRNA.sup.Cys by the enzyme SepCysS in the presence of a
sulfur-donor (Sauerwald A. et al., Science 307, 1969 (2005)). The
exclusive recognition of Sep by SepRS was further confirmed by the
structural elucidation of this enzyme and by the biochemical
analysis of its catalytic site (Kamtekar S. et al., Proc Natl Acad
Sci USA 104, 2620 (2007); Fukunaga, R. et al., Nat Struct Mol Biol
14, 272 (2007)). The molecular basis of Methanocaldococcus
jannaschii (Mj) tRNA.sup.Cys recognition by SepRS and CysRS from
Methanococcus maripaludis (Mmp) was also explored, yielding the
SepRS-specific tRNA identity elements (Hohn, M J., et al., Proc
Natl Acad Sci USA 103, 18095 (2006)). Based on these results it was
decided to test the applicability of Mj tRNA.sup.Cys and Mmp SepRS
as an orthogonal pair for UAG-directed translational incorporation
of Sep into proteins expressed in Escherichia coli. A scheme was
sought for co-translational insertion of phosphoserine (Sep) into
proteins in E. coli in response to the amber codon UAG. Methanogens
utilize an aminoacyl-tRNA synthetase (SepRS) that acylates
tRNA.sup.Cys with Sep during the biosynthesis of Cys-tRNA.sup.Cys
(FIG. 1A).
[0115] A tRNA (tRNA.sup.Sep) was designed that could be
aminoacylated with phosphoserine (FIG. 1B). tRNA.sup.Sep is a tRNA
derived from Mj tRNA.sup.Cys containing a C20U change that improves
2.5-fold the aminoacylation by SepRS without affecting CysRS
recognition. In addition, tRNA.sup.Sep was modified to be an amber
suppressor by including two mutations in the anticodon (FIG.
1B).
[0116] Both tRNA.sup.Sep and tRNA.sup.Cys were overexpressed in E.
coli. In vitro aminoacylation by Mmp SepRS showed (FIG. 1C) that the
anticodon change lowered (to about 40%) the activity of
tRNA.sup.Sep when compared to tRNA.sup.Cys. Total E. coli tRNA
could not be charged with Sep (FIG. 1C). Based on these in vitro
data, Mj tRNA.sup.Sep and Mmp SepRS appear to be an orthogonal
pair.
[0117] Efficient and selective addition of Sep to the E. coli
genetic repertoire requires exclusive interaction of SepRS with
tRNA.sup.Sep for Sep-tRNA.sup.Sep formation without interfering in
the host translation system as well as a sufficient intracellular
concentration of Sep. As E. coli has a Sep-compatible transporter
(Wanner, B L. FEMS Microbiol Lett 79, 133 (1992)), Sep (2 mM) was
added to the growth medium, and the endogenous serB gene encoding
phosphoserine phosphatase was deleted in the E. coli test strain.
To assess whether the Mj tRNA.sup.Sep/Mmp SepRS pair is functional
and orthogonal in E. coli, a suppression assay was performed that
employed a chloramphenicol acetyltransferase (CAT) gene with a UAG
stop codon at the permissive position 112 (wild-type amino acid:
Asp) to produce chloramphenicol (Cm) acetyltransferase; then cell
survival was measured in the presence of Sep and varying amounts of
Cm. The different IC.sub.50 values (FIG. 2) relate to suppression
efficiency (i.e., amount of CAT made dependent on the various
transformed genes). When only tRNA.sup.Sep is expressed (FIG. 2,
second bar) Cm resistance increases about 3.3-fold over background
(FIG. 2, first bar). Thus, tRNA.sup.Sep can be aminoacylated to a
certain degree by an unknown E. coli aminoacyl-tRNA synthetase (Gln
is being incorporated at the amber stop codon). In contrast,
simultaneous expression of tRNA.sup.Sep and SepRS does not provide
Cm resistance (FIG. 2, third bar). This may indicate that SepRS can
out-compete any endogenous aminoacyl-tRNA synthetase and form
Sep-tRNA.sup.Sep; however, this aminoacyl-tRNA is not delivered to
the ribosome or not accommodated on the ribosome. Providing
additional EF-Tu does not improve the result (FIG. 2, fifth bar).
Co-expression of tRNA.sup.Sep, SepRS and SepCysS should result in
formation of Sep-tRNA.sup.Sep and subsequent SepCysS-mediated
conversion to Cys-tRNA.sup.Sep (A. Sauerwald et al., Science 307,
1969 (2005)). Indeed, a 2.3-fold increase in Cm resistance is
observed (FIG. 2, sixth bar). This further supports the notion that
while Sep-tRNA.sup.Sep is synthesized, it cannot be used properly
by the E. coli protein biosynthesis machinery. On the other hand,
co-expression of tRNA.sup.Sep and Mmp CysRS generates a 12.3-fold
increase in Cm resistance (FIG. 2, eighth bar), demonstrating that
Cys-tRNA.sup.Sep can be readily used for amber codon suppression in
the CAT gene.
[0118] Given that EF-Tu is a component of quality control in
protein synthesis (LaRiviere, F J., et al. Science 294, 165
(2001)), it is highly plausible that Sep-tRNA.sup.Sep may be
rejected by EF-Tu in order not to interfere with the complicated
cellular mechanism of phosphoprotein production. Chemically
synthesized Sep-tRNA.sup.Gln was a poor substrate for in vitro
protein synthesis (Rothman D M. et al., J Am Chem Soc 127, 846
(2005)). tRNAs carrying negatively charged amino acids are bound
poorly by EF-Tu (Dale, T., et al. Biochemistry 43, 6159 (2004)),
and molecular dynamics simulations suggested that Sep-tRNA.sup.Cys
may not be bound by EF-Tu (Eargle, J., et al. J Mol Biol 377, 1382
(2008)). This assumption was tested in EF-Tu mediated Sep-tRNA
hydrolysis protection experiments (J. Ling et al., Proc Natl Acad
Sci USA 104, 15299 (2007)), in which recombinant E. coli EF-Tu was
incubated with the Mj tRNA.sup.Cys in vitro transcript acylated with
either [.sup.35S]Cys or [.sup.14C]Sep at pH 8.2. While EF-Tu protected
[.sup.35S]Cys-tRNA.sup.Cys from deacylation (FIG. 5A),
Sep-tRNA.sup.Cys was significantly deacylated irrespective of the
presence of EF-Tu (FIG. 3 and FIG. 5B). Thus, insufficient binding
of Sep-tRNA.sup.Sep to EF-Tu may explain the lack of Sep insertion
into protein.
Example 2: Development of EF-Sep
Materials and Methods
Library Construction and Selection of Sep-tRNA Specific EF-Tu
[0119] Six residues, His67, Asp216, Glu217, Phe219, Thr229, and
Asn274, located in the amino acid binding pocket of the E. coli
elongation factor EF-Tu were selected for randomization based on
the crystal structure of the E. coli EF-Tu:Phe-tRNA.sup.Phe complex
(protein data base accession number 1OB2). Multiple rounds of
overlap PCR were carried out to incorporate random codons (NNK) at
these positions by using the following primers described in Park
H-S et al., Science 311:535-538 (2006):
TABLE-US-00003
67XF (SEQ ID NO: 19): 5'-GT ATC ACC ATC AAC ACT TCT NNK GTT GAA TAC GAC ACC CCG-3';
H67R (SEQ ID NO: 20): 5'-AGA AGT GTT GAT GGT GAT AC-3';
216XF (SEQ ID NO: 21): 5'-CCG TTC CTG CTG CCG ATC NNK NNK GTA NNK TCC ATC TCC GGT CGT GGT-3';
216R (SEQ ID NO: 22): 5'-GAT CGG CAG CAG GAA CGG-3';
229XF (SEQ ID NO: 23): 5'-GGT CGT GGT ACC GTT GTT NNK GGT CGT GTA GAA CGC GG-3';
229R (SEQ ID NO: 24): 5'-AAC AAC GGT ACC ACG ACC-3';
274XF (SEQ ID NO: 25): 5'-GAA GGC CGT GCT GGT GAG NNK GTA GGT GTT CTG CTG CG-3'; and
274R (SEQ ID NO: 26): 5'-CTC ACC AGC ACG GCC TTC-3'.
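In overlap PCR, each reverse primer anneals to the constant region at the 5' end of its paired mutagenic forward primer, so the reverse primer should be the reverse complement of that region. The following sketch checks this for the four primer pairs listed above. The helper names and the pairing of primers by position are assumptions made for illustration; the primer sequences are copied from the table.

```python
# Complement map including the degenerate bases used in NNK codons (K = G/T, M = A/C).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N", "K": "M", "M": "K"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a primer, ignoring spaces."""
    seq = seq.replace(" ", "").upper()
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def reverse_matches_forward_start(forward: str, reverse: str) -> bool:
    """True if the reverse primer is complementary to the 5' end of the forward primer."""
    return forward.replace(" ", "").upper().startswith(reverse_complement(reverse))


# (forward, reverse) pairs from SEQ ID NOs: 19-26, grouped by randomized position.
PRIMER_PAIRS = {
    "67": ("GT ATC ACC ATC AAC ACT TCT NNK GTT GAA TAC GAC ACC CCG",
           "AGA AGT GTT GAT GGT GAT AC"),
    "216": ("CCG TTC CTG CTG CCG ATC NNK NNK GTA NNK TCC ATC TCC GGT CGT GGT",
            "GAT CGG CAG CAG GAA CGG"),
    "229": ("GGT CGT GGT ACC GTT GTT NNK GGT CGT GTA GAA CGC GG",
            "AAC AAC GGT ACC ACG ACC"),
    "274": ("GAA GGC CGT GCT GGT GAG NNK GTA GGT GTT CTG CTG CG",
            "CTC ACC AGC ACG GCC TTC"),
}

results = {name: reverse_matches_forward_start(f, r) for name, (f, r) in PRIMER_PAIRS.items()}
print(results)
```

A check of this kind is a quick way to catch transcription errors in primer lists before ordering oligonucleotides.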
[0120] The final PCR products were purified and digested with BamHI
and SalI, and ligated into pKD-SepRS to generate the EF-Tu library.
The ligated vectors were transformed into E. coli Top10.DELTA.serB
containing pCAT112-SepT to generate a library of 3.times.10.sup.8
mutants. The unbiased mutation of the library was confirmed by
selecting twenty random clones and sequencing each mutant tufB
insert.
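The theoretical size of a library with six NNK-randomized positions can be compared with the reported 3.times.10.sup.8 mutants. The sketch below enumerates the NNK codons (N = A/C/G/T, K = G/T), translates them with the standard genetic code, and counts the possible DNA-level and protein-level variants. Treating the reported library size as the number of independent transformants is an assumption made for illustration.

```python
import itertools

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO)}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]
encoded_amino_acids = {CODON_TABLE[c] for c in nnk_codons} - {"*"}
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

positions = 6  # His67, Asp216, Glu217, Phe219, Thr229, Asn274
dna_variants = len(nnk_codons) ** positions
protein_variants = len(encoded_amino_acids) ** positions
library_size = 3e8  # reported library size, assumed to count independent clones

print(len(nnk_codons), len(encoded_amino_acids), stop_codons)
print(dna_variants, protein_variants)
print(library_size / dna_variants)  # upper bound on DNA-level coverage
```

NNK randomization keeps all 20 amino acids while admitting only the amber codon TAG as a stop, which is why it is a common choice for saturation mutagenesis.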
[0121] The mutant EF-Tu library was subjected to a first round of
selection, in which clones suppressing the amber stop codon in the
CAT gene can survive on LB plates supplemented with 10 mg/ml
tetracycline (Tc), 25 mg/ml Kan, 50 mg/ml chloramphenicol (Cm), 2
mM Sep, and 0.05 mM isopropyl-.beta.-D-thiogalactopyranoside
(IPTG). After 48 h incubation at 30.degree.C., a pool of 10.sup.4
colonies was collected from the plates for plasmid preparation. The
pKD-SepRS-EFTu plasmids were separated from the reporter plasmid by
agarose gel electrophoresis and isolated using the Qiagen gel
purification kit.
[0122] There is a possibility that mutations in the amino acid
binding site of EF-Tu could induce incorporation of natural amino
acids in response to the amber codon in the CAT gene, resulting in
false positive clones. To select against these EF-Tu mutants, the
pKD-SepRS-EFTu plasmids from the first positive selection were
transformed into E. coli Top10.DELTA.serB harboring pCcdB. The
cells were plated onto LB agar supplemented with 25 mg/ml Kan, 25
mg/ml Cm, and 0.1 mM IPTG. After 48 h incubation at 30.degree.C.,
twenty individual clones were picked and subjected to plasmid
purification to isolate pKD-SepRS-EFTu as described above. The
EF-Tu mutant genes were digested from the plasmid and recloned into
pKD-SepRS.
[0123] Resulting pKD-SepRS-EFTu plasmids were transformed into E.
coli Top10.DELTA.serB containing pCAT112-SepT for a third round of
selection which was carried out under the same conditions as the
first. This time, individual colonies were isolated from agar
plates and clones were tested for their ability to grow on Cm over
a concentration range from 5 to 100 mg/ml. Total plasmid was
purified from isolates showing strong Cm resistance, and
pKD-SepRS-EFTu plasmids were subjected to sequencing.
[0124] To confirm that the observed Cm resistance is dependent on
the presence of both mutant EF-Tu and SepRS, EF-Tu mutant genes
were excised from their plasmids, recloned into pKD, and
retransformed into E. coli Top10.DELTA.serB containing
pCAT112-SepT. Cells were then tested for Cm resistance as described
above.
Expression and Purification of M. maripaludis SepRS and CysRS
[0125] SepRS and CysRS were produced in E. coli and purified as
described by Hohn, M. J. et al., Proc Natl Acad Sci USA 103, 18095
(2006).
Expression and Purification of EF-Tu and EF-Sep
[0126] pMAL-EFTu or pMAL-EFSep were transformed into E. coli BL21
(DE3) codon plus (Stratagene). A pre-culture was used to inoculate
1000 ml of LB broth with 100 .mu.g/ml of Amp, 34 .mu.g/ml Cm, 5052
solution, and phosphate buffer for autoinduction as described by
Studier, F W. Protein Expr Purif 41, 207 (2005). The cells were
grown for 6 h at 37.degree.C. and continued at 20.degree.C. for 18
h.
[0127] The cells were pelleted and lysed by shaking for 20 min. in
BugBuster (Novagen) reagent supplemented with 50 mM Tris-HCl (pH
7.6), 60 mM NH.sub.4Cl, 7 mM MgCl.sub.2, 14.3 mM
2-mercapto-ethanol, 50 .mu.M GDP, 10% glycerol, 25 U ml.sup.-1
Benzonase, 1 mg ml.sup.-1 lysozyme, and Protease inhibitor cocktail
(Roche).
[0128] The extract was clarified by ultracentrifugation and applied
to a Ni.sup.2+-NTA resin (Qiagen) and purified according to the
manufacturer's instructions.
[0129] The eluted enzymes were dialyzed into 20 mM Hepes-KOH (pH
7.0), 40 mM KCl, 1 mM MgCl.sub.2, 5 mM DTT, 50 .mu.M GDP, and 30%
glycerol. SDS-PAGE electrophoresis followed by staining with
Coomassie blue revealed greater than 95% purity.
Results
[0130] Guided by the structure of the E. coli EF-Tu:Phe-tRNA.sup.Phe
complex (P. Nissen et al., Science 270, 1464 (1995)) it was decided
to randomize certain positions in the amino acid binding pocket to
evolve EF-Tu variants that bind Sep-tRNA and promote its delivery
to the ribosome. Six residues (His67, Asp216, Glu217, Phe219,
Thr229, and Asn274) were selected for complete randomization
generating a library of 3.times.10.sup.8 EF-Tu mutants. To select
in vivo variants that permit Sep incorporation in the presence of
SepRS and tRNA.sup.Sep, three rounds of selections (positive,
negative, positive) were performed that yielded several clones with
the desired phenotype. One clone, designated EF-Sep, was tested
further in detail. While the combination of SepRS and EF-Sep was
not active in the CAT suppression assay (FIG. 2, lane G), the
further inclusion of tRNA.sup.Sep led to a 10-fold increase in Cm
resistance (FIG. 2, lane H). Thus, it appeared that EF-Sep could
bind Sep-tRNA.sup.Sep, a fact that was ascertained in the
hydrolysis protection assay (FIG. 3). The DNA sequence of the
EF-Sep gene revealed the nature of the mutations.
[0131] EF-Tu mutants (EF-Sep) that could bind Sep-tRNA include
those having the following amino acid sequences:
TABLE-US-00004 (EFSep-M6, SEQ ID NO: 1)
MSKEKFERTKPHVNVGIGHVDHGKTTLTAAITTVLAKTYGGTARAFDQID
NAPEEKARGITINTSRVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDG
AILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELVE
MEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWEAKILILAGFLDSYIP
EPERAIDKPFLLPITRVYSISGRGTVVSGRVERGIIKVGEEVIEVGIKET
QKSTCTGVEMFRKLLDEGRAGEFVGVLLRGIKREEIERGQVLAKPGTIKP
HTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEMV
MPGDNIKMVVTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLRDPNSSSV DKLAAALE
(EFSep-M7, SEQ ID NO: 2)
MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARAFDQI
DNAPEEKARGITINTSRVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMD
GAILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELV
EMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWEAKILELAGFLDSYI
PEPERAIDKPFLLPITYVYSISGRGTVVSGRVERGIIKVGEEVEIVGINE
TQKSTCTGVEMFRKLLDEGRAGEAVGVLLRGIKREEIERGQVLAKPGTIK
PHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEM
VMPGDNIKMVVTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLRDPNSSS VDKLAAALE
(EFSep-M8, SEQ ID NO: 3)
MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARAFDQI
DNAPEEKARGITINTSRVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMD
GAILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELV
EMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWEAKILELAGFLDSYI
PEPERAIDKPFLLPINGVYSISGRGTVVSGRVERGIIKVGEEVEIVGIKE
TQKSTCTGVEMFRKLLDEGRAGEWVGVLLRGIKREEIERGQVLAKPGTIK
PHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEM
VMPGDNIKMVVTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLRDPNSSS VDKLAAALE
(EFSep-M9, SEQ ID NO: 4)
MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARARFQI
DNAPEEKARGITINTSRVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMD
GAILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELV
EMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWEAKILELAGFLDSYI
PEPERAIDKPFLLPITAVYSISGRGTVVSGRVERGIIKVGEEVEIVGIKE
TQKSTCTGVEMFRKLLDEGRAGEAVGVLLRGIKREEIERGQVLAKPGTIK
PHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEM
VMPGDNIKMVVTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLRDPNSSS VDKLAAALE
[0132] Nucleic acids encoding EF-Tu mutants (EF-Sep) that could bind
Sep-tRNA include those having the following nucleic acid
sequences:
TABLE-US-00005 (EFSep-M6, SEQ ID NO: 5)
ATGTCTAAAGAAAAGTTTGAACGTACAAAACCGCACGTTAACGTCGGTAC
TATCGGCCACGTTGACCATGGTAAAACAACGCTGACCGCTGCAATCACTA
CCGTACTGGCTAAAACCTACGGCGGTACTGCTCGCGCATTCGACCAGATC
GATAACGCGCCGGAAGAAAAAGCTCGTGGTATCACCATCAACACTTCTCG
GGTTGAATACGACACCCCGACCCGTCACTACGCACACGTAGACTGCCCGG
GGCACGCCGACTATGTTAAAAACATGATCACCGGTGCTGCGCAGATGGAC
GGCGCGATCCTGGTAGTTGCTGCGACTGACGGCCCGATGCCGCAGACTCG
TGAGCACATCCTGCTGGGTCGTCAGGTAGGCGTTCCGTACATCATCGTGT
TCCTGAACAAATGCGACATGGTTGATGACGAAGAGCTGCTGGAACTGGTT
GAAATGGAAGTTCGTGAACTTCTGTCTCAGTACGACTTCCCGGGCGACGA
CACTCCGATCGTTCGTGGTTCTGCTCTGAAAGCGCTGGAAGGCGACGCAG
AGTGGGAAGCGAAAATCCTGGAACTGGCTGGCTTCCTGGATTCTTACATT
CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCACCCG
GGTATACTCCATCTCCGGTCGTGGTACCGTTGTTTCGGGTCGTGTAGAAC
GCGGTATCATCAAAGTTGGTGAAGAAGTTGAAATCGTTGGTATCAAAGAG
ACTCAGAAGTCTACCTGTACTGGCGTTGAAATGTTCCGCAAACTGCTGGA
CGAAGGCGTGCTGGTGAGTTCGTAGGTGTTCTGCTGCGTGGTATCAAACG
TGAAGAAATCGAACGTGGTCAGGTACTGGCTAAGCCGGGCACCATCAAGC
CGCACACCAAGTTCGAATCTGAAGTGTACATTCTGTCCAAAGATGAAGGC
GGCCGTCATACTCCGTTCTTCAAAGGCTACCGTCCGCAGTTCTACTTCCG
TACTACTGACGTGACTGGTACCATCGAACTGCCGGAAGGCGTAGAGATGG
TAATGCCGGGCGACAACATCAAAATGGTTGTTACCCTGATCCACCCGATC
GCGATGGACGACGGTCTGCGTTTCGCAATCCGTGAAGGCGGCCGTACCGT
TGGCGCGGGCGTTGTAGCAAAAGTTCTGAGGGATCCGAATTCGAGCTCCG
TCGACAAGCTTGCGGCCGCACTCGAG (EFSep-M7, SEQ ID NO: 6)
ATGTCTAAAGAAAAGTTTGAACGTACAAAACCGCACGTTAACGTCGGTAC
TATCGGCCACGTTGACCATGGTAAAACAACGCTGACCGCTGCAATCACTA
CCGTACTGGCTAAAACCTACGGCGGTGCTGCTCGCGCATTCGACCAGATC
GATAACGCGCCGGAAGAAAAAGCTCGTGGTATCACCATCAACACTTCTAG
GGTTGAATACGACACCCCGACCCGTCACTACGCACACGTAGACTGCCCGG
GGCACGCCGACTATGTTAAAAACATGATCACCGGTGCTGCGCAGATGGAC
GGCGCGATCCTGGTAGTTGCTGCGACTGACGGCCCGATGCCGCAGACTCG
TGAGCACATCCTGCTGGGTCGTCAGGTAGGCGTTCCGTACATCATCGTGT
TCCTGAACAAATGCGACATGGTTGATGACGAAGAGCTGCTGGAACTGGTT
GAAATGGAAGTTCGTGAACTTCTGTCTCAGTACGACTTCCCGGGCGACGA
CACTCCGATCGTTCGTGGTTCTGCTCTGAAAGCGCTGGAAGGCGACGCAG
AGTGGGAAGCGAAAATCCTGGAACTGGCTGGCTTCCTGGATTCTTACATT
CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCACCTA
CGTATACTCCATCTCCGGTCGTGGTACCGTTGTTTCGGGTCGTGTAGAAC
GCGGTATCATCAAAGTTGGTGAAGAAGTTGAAATCGTTGGTATCAATGAG
ACTCAGAAGTCTACCTGTACTGGCGTTGAAATGTTCCGCAAACTGCTGGA
CGAAGGCCGTGCTGGTGAGGCGGTAGGTGTTCTGCTGCGTGGTATCAAAC
GTGAAGAAATCGAACGTGGTCAGGTACTGGCTAAGCCGGGCACCATCAAG
CCGCACACCAAGTTCGAATCTGAAGTGTACATTCTGTCCAAAGATGAAGG
CGGCCGTCATACTCCGTTCTTCAAAGGCTACCGTCCGCAGTTCTACTTCC
GTACTACTGACGTGACTGGTACCATCGAACTGCCGGAAGGCGTAGAGATG
GTAATGCCGGGCGACAACATCAAAATGGTTGTTACCCTGATCCACCCGAT
CGCGATGGACGACGGTCTGCGTTTCGCAATCCGTGAAGGCGGCCGTACCG
TTGGCGCGGGCGTTGTAGCAAAAGTTCTGAGGGATCCGAATTCGAGCTCC
GTCGACAAGCTTGCGGCCGCACTCGAG (EFSep-M8, SEQ ID NO: 7)
ATGTCTAAAGAAAAGTTTGAACGTACAAAACCGCACGTTAACGTCGGTAC
TATCGGCCACGTTGACCATGGTAAAACAACGCTGACCGCTGCAATCACTA
CCGTACTGGCTAAAACCTACGGCGGTGCTGCTCGCGCATTCGACCAGATC
GATAACGCGCCGGAAGAAAAAGCTCGTGGTATCACCATCAACACTTCTCG
GGTTGAATACGACACCCCGACCCGTCACTACGCACACGTAGACTGCCCGG
GGCACGCCGACTATGTTAAAAACATGATCACCGGTGCTGCGCAGATGGAC
GGCGCGATCCTGGTAGTTGCTGCGACTGACGGCCCGATGCCGCAGACTCG
TGAGCACATCCTGCTGGGTCGTCAGGTAGGCGTTCCGTACATCATCGTGT
TCCTGAACAAATGCGACATGGTTGATGACGAAGAGCTGCTGGAACTGGTT
GAAATGGAAGTTCGTGAACTTCTGTCTCAGTACGACTTCCCGGGCGACGA
CACTCCGATCGTTCGTGGTTCTGCTCTGAAAGCGCTGGAAGGCGACGCAG
AGTGGGAAGCGAAAATCCTGGAACTGGCTGGCTTCCTGGATTCTTACATT
CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCAACGG
GGTATACTCCATCTCCGGTCGTGGTACCGTTGTTTCGGGTCGTGTAGAAC
GCGGTATCATCAAAGTTGGTGAAGAAGTTGAAATCGTTGGTATCAAAGAG
ACTCAGAAGTCTACCTGTACTGGCGTTGAAATGTTCCGCAAACTGCTGGA
CGAAGGCCGTGCTGGTGAGTGGGTAGGTGTTCTGCTGCGTGGTATCAAAC
GTGAAGAAATCGAACGTGGTCAGGTACTGGCTAAGCCGGGCACCATCAAG
CCGCACACCAAGTTCGAATCTGAAGTGTACATTCTGTCCAAAGATGAAGG
CGGCCGTCATACTCCGTTCTTCAAAGGCTACCGTCCGCAGTTCTACTTCC
GTACTACTGACGTGACTGGTACCATCGAACTGCCGGAAGGCGTAGAGATG
GTAATGCCGGGCGACAACATCAAAATGGTTGTTACCCTGATCCACCCGAT
CGCGATGGACGACGGTCTGCGTTTCGCAATCCGTGAAGGCGGCCGTACCG
TTGGCGCGGGCGTTGTAGCAAAAGTTCTGAGGGATCCGAATTCGAGCTCC
GTCGACAAGCTTGCGGCCGCACTCGAG (EFSep-M9, SEQ ID NO: 8)
ATGTCTAAAGAAAAGTTTGAACGTACAAAACCGCACGTTAACGTCGGTAC
TATCGGCCACGTTGACCATGGTAAAACAACGCTGACCGCTGCAATCACTA
CCGTACTGGCTAAAACCTACGGCGGTGCTGCTCGCGCATTCGACCAGATC
GATAACGCGCCGGAAGAAAAAGCTCGTGGTATCACCATCAACACTTCTCG
GGTTGAATACGACACCCCGACCCGTCACTACGCACACGTAGACTGCCCGG
GGCACGCCGACTATGTTAAAAACATGATCACCGGTGCTGCGCAGATGGAC
GGCGCGATCCTGGTAGTTGCTGCGACTGACGGCCCGATGCCGCAGACTCG
TGAGCACATCCTGCTGGGTCGTCAGGTAGGCGTTCCGTACATCATCGTGT
TCCTGAACAAATGCGACATGGTTGATGACGAAGAGCTGCTGGAACTGGTT
GAAATGGAAGTTCGTGAACTTCTGTCTCAGTACGACTTCCCGGGCGACGA
CACTCCGATCGTTCGTGGTTCTGCTCTGAAAGCGCTGGAAGGCGACGCAG
AGTGGGAAGCGAAAATCCTGGAACTGGCTGGCTTCCTGGATTCTTACATT
CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCACCGC
GGTATACTCCATCTCCGGTCGTGGTACCGTTGTTTCGGGTCGTGTAGAAC
GCGGTATCATCAAAGTTGGTGAAGAAGTTGAAATCGTTGGTATCAAAGAG
ACTCAGAAGTCTACCTGTACTGGCGTTGAAATGTTCCGCAAACTGCTGGA
CGAAGGCCGTGCTGGTGAGGCCGTAGGTGTTCTGCTGCGTGGTATCAAAC
GTGAAGAAATCGAACGTGGTCAGGTACTGGCTAAGCCGGGCACCATCAAG
CCGCACACCAAGTTCGAATCTGAAGTGTACATTCTGTCCAAAGATGAAGG
CGGCCGTCATACTCCGTTCTTCAAAGGCTACCGTCCGCAGTTCTACTTCC
GTACTACTGACGTGACTGGTACCATCGAACTGCCGGAAGGCGTAGAGATG
GTAATGCCGGGCGACAACATCAAAATGGTTGTTACCCTGATCCACCCGAT
CGCGATGGACGACGGTCTGCGTTTCGCAATCCGTGAAGGCGGCCGTACCG
TTGGCGCGGGCGTTGTAGCAAAAGTTCTGAGGGATCCGAATTCGAGCTCC
GTCGACAAGCTTGCGGCCGCACTCGAG
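The EFSep-M8 and EFSep-M9 coding sequences listed above are identical over most of their length, so the substitutions that distinguish them are easiest to see in a codon-by-codon comparison. The following is a minimal Python sketch and is not part of the disclosed methods. It compares only the 51-nucleotide window covering codons 201-217 of SEQ ID NO: 7 and SEQ ID NO: 8, counting the ATG start codon as codon 1; the choice of window and the helper name `codon_diff` are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Nucleotides 601-651 (codons 201-217) copied from the listings above.
M8_WINDOW = "CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCAACGGG"  # SEQ ID NO: 7
M9_WINDOW = "CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCACCGCG"  # SEQ ID NO: 8
FIRST_CODON = 201  # codon number of the first triplet in the window


def codon_diff(seq_a, seq_b, first_codon=1):
    """List (codon number, codon_a, aa_a, codon_b, aa_b) for differing codons."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    changes = []
    for i in range(0, len(seq_a), 3):
        a, b = seq_a[i:i + 3], seq_b[i:i + 3]
        if a != b:
            changes.append(
                (first_codon + i // 3, a, CODON_TABLE[a], b, CODON_TABLE[b])
            )
    return changes


DIFFS = codon_diff(M8_WINDOW, M9_WINDOW, FIRST_CODON)
for number, codon_a, aa_a, codon_b, aa_b in DIFFS:
    print(f"codon {number}: {codon_a} ({aa_a}) -> {codon_b} ({aa_b})")
```

Within this window the two variants differ at codons 216 and 217, which agrees with the Asn-Gly and Thr-Ala pairs at those positions in the corresponding protein sequences of the sequence listing.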
Example 3: Demonstration of Sep Incorporation into Myoglobin
Materials and Methods
Construction of Plasmids
[0133] pMYO127TAG-SepT was constructed by cloning a codon-optimized
and C-terminally His.sub.6-tagged sperm whale myoglobin gene under
the control of the lpp promoter between NotI and BglII in pSepT. An
amber stop codon was introduced to the myoglobin gene at position
Asp127 by QuikChange mutagenesis. The nucleotide sequence of the
codon-optimized myoglobin gene is as follows:
TABLE-US-00006 (SEQ ID NO: 44)
ATGGTTCTGTCTGAAGGTGAATGGCAGCTGGTTCTGCACGTTTGGGCTAA
AGTTGAAGCTGACGTTGCTGGTCACGGTCAGGACATCCTGATCCGTCTGT
TCAAATCTCACCCGGAAACCCTGGAAAAATTCGACCGTTTCAAACACCTG
AAAACCGAAGCTGAAATGAAGGCTTCTGAAGACCTGAAAAAACACGGTGT
TACCGTTCTGACCGCTCTGGGTGCTATCCTGAAGAAAAAGGGTCACCACG
AAGCTGAACTGAAACCGCTGGCTCAGTCTCACGCTACCAAACACAAAATC
CCGATCAAATACCTGGAGTTCATCTCTGAAGCTATCATCCACGTTCTGCA
CTCTCGTCATCCGGGTAACTTCGGTGCTGACGCTCAGGGTGCTATGAACA
AAGCTCTGGAACTGTTCCGTAAAGACATCGCTGCTAAATACAAAGAACTG
GGTTACCAGGGTGGTTCTGGTCATCACCATCACCATCACTAA.
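The amber substitution described in paragraph [0133] can be illustrated directly on SEQ ID NO: 44. The Python sketch below is illustrative only. It assumes that "position 127" counts the initiating ATG as codon 1 (under that assumption codon 127 is an Asp codon), and the function name `introduce_amber` is hypothetical.

```python
# SEQ ID NO: 44 as printed above (codon-optimized myoglobin, C-terminal His6).
MYO_GENE = "".join([
    "ATGGTTCTGTCTGAAGGTGAATGGCAGCTGGTTCTGCACGTTTGGGCTAA",
    "AGTTGAAGCTGACGTTGCTGGTCACGGTCAGGACATCCTGATCCGTCTGT",
    "TCAAATCTCACCCGGAAACCCTGGAAAAATTCGACCGTTTCAAACACCTG",
    "AAAACCGAAGCTGAAATGAAGGCTTCTGAAGACCTGAAAAAACACGGTGT",
    "TACCGTTCTGACCGCTCTGGGTGCTATCCTGAAGAAAAAGGGTCACCACG",
    "AAGCTGAACTGAAACCGCTGGCTCAGTCTCACGCTACCAAACACAAAATC",
    "CCGATCAAATACCTGGAGTTCATCTCTGAAGCTATCATCCACGTTCTGCA",
    "CTCTCGTCATCCGGGTAACTTCGGTGCTGACGCTCAGGGTGCTATGAACA",
    "AAGCTCTGGAACTGTTCCGTAAAGACATCGCTGCTAAATACAAAGAACTG",
    "GGTTACCAGGGTGGTTCTGGTCATCACCATCACCATCACTAA",
])

ASP_CODONS = {"GAT", "GAC"}  # the two codons that encode aspartate
AMBER = "TAG"                # amber stop codon (UAG in the mRNA)


def introduce_amber(gene, codon_number):
    """Return (original codon, gene with that codon replaced by TAG).

    codon_number is 1-based and counts the start codon as codon 1
    (a numbering assumption, see the text above).
    """
    start = 3 * (codon_number - 1)
    original = gene[start:start + 3]
    return original, gene[:start] + AMBER + gene[start + 3:]


ORIGINAL_CODON, MYO_127TAG = introduce_amber(MYO_GENE, 127)
print(ORIGINAL_CODON, ORIGINAL_CODON in ASP_CODONS)  # GAC True
```

The mutated sequence keeps the reading frame and the C-terminal tag, so full-length His.sub.6-tagged protein is expected only when the UAG codon is suppressed.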
Results
[0134] To prove that the observed suppression is due to Sep
incorporation, a myoglobin variant with an amber codon in position
127 (normally Asp) and a C-terminal His.sub.6-tag was expressed.
The expected full-length protein was synthesized (yield: 2 mg/L of
culture) only when EF-Sep, SepRS and tRNA.sup.Sep were
co-expressed. The amino acid incorporated via EF-Sep in response to
the amber codon was identified by analyzing both the intact and
trypsin-digested Myo-His.sub.6 mutant protein. MS-TOF and MS/MS
analysis show that Sep is present at the position specified by
UAG.
Example 4: Active MEK Synthesis In Vivo
Materials and Methods
Construction of Plasmids
[0135] pET15-ERK2 encodes N-terminally His.sub.6-tagged
mitogen-activated protein kinase (Erk2) under the control of a T7
promoter. The human Erk2 gene was PCR amplified from plasmid
BC017832 (ATCC) using primers ERK2-F (5'-GGA ATT CCA TAT GGC GGC
GGC GGC GGC G-3', SEQ ID NO:27) and ERK2-R (5'-CCG CTC GAG TTA AGA
TCT GTA TCC TGG-3', SEQ ID NO:28). The PCR product was cloned
between NdeI and XhoI in vector pET15b (Novagen).
[0136] pET20-MBPMEK1 encodes a fusion protein consisting of human
MEK1 with an N-terminal maltose binding protein (MBP) tag and a
C-terminal His.sub.6-tag. The gene encoding human MEK1 which was
codon-optimized for E. coli and custom-synthesized in vitro
(Genscript), was cloned between EcoRI and PstI into pMALc2x (New
England Biolabs). The resulting MBP-MEK1 fusion construct was then
amplified with primers ET20MEKF (5'-AAG GAA ATT AAT GAA AAT CGA AGA
AGG TAA-3', SEQ ID NO:29) and ET20MEKR (5'-CTA GAG GAT CCG GCG
CGC-3', SEQ ID NO:30) adding AseI and BamHI restriction sites, and
the PCR product was ligated between NdeI and BamHI into pET20b.
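The restriction sites carried by the cloning primers in paragraphs [0135] and [0136] can be checked with a short script. This is a minimal Python sketch and not part of the described procedure. The recognition sequences used are the standard ones for NdeI (CATATG), XhoI (CTCGAG), AseI (ATTAAT), and BamHI (GGATCC); the helper name `sites_in` is hypothetical.

```python
# Recognition sequences (5'->3'); all four are palindromic, so scanning
# one strand of the primer is sufficient.
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
    "AseI": "ATTAAT",
    "BamHI": "GGATCC",
}

# Primer sequences as given in the text (spaces are removed before scanning).
PRIMERS = {
    "ERK2-F": "GGA ATT CCA TAT GGC GGC GGC GGC GGC G",
    "ERK2-R": "CCG CTC GAG TTA AGA TCT GTA TCC TGG",
    "ET20MEKF": "AAG GAA ATT AAT GAA AAT CGA AGA AGG TAA",
    "ET20MEKR": "CTA GAG GAT CCG GCG CGC",
}


def sites_in(primer):
    """Return the sorted names of enzymes whose site occurs in the primer."""
    sequence = primer.replace(" ", "").upper()
    return sorted(
        name for name, site in RECOGNITION_SITES.items() if site in sequence
    )


FOUND = {name: sites_in(seq) for name, seq in PRIMERS.items()}
for name, enzymes in FOUND.items():
    print(name, enzymes)
```

AseI and NdeI both leave a 5'-TA overhang, which is consistent with the AseI-tagged PCR product being ligated into the NdeI site of pET20b.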
Nucleotide Sequence of Codon-optimized MEK1
TABLE-US-00007 [0137] (SEQ ID NO: 31)
ATGCCGAAGAAGAAACCGACCCCGATCCAGCTGAACCCGGCTCCGGACGG
TTCTGCTGGTTAACGGCACCTCTTCTGCTGAAACCAACCTGGAAGCTCTG
CAAAAGAAACTGGAAGAACTGGAACTGGACGAACAGCAGCGTAAACGTCT
GGAAGCGTTCCTGACCCAGAAACAGAAAGTTGGTGAACTGAAAGACGACG
ACTTCGAAAAAATCTCTGAACTGGGTGCTGGTAACGGTGGTGTTGTTTTC
AAAGTTTCTCACAAACCGTCCGGTCTGGTTATGGCTCGTAAACTGATCCA
CCTGGAAATCAAACCGGCTATCCGTAACCAGATCATCCGTGAACTGCAAG
TTCTGCACGAATGCAACTCTCCGTACATCGTTGGTTTCTACGGTGCTTTC
TACTCTGACGGTGAAATCTCTATCTGCATGGAACACATGGACGGTGGTTC
TCTGGACCAGGTTCTGAAAAAAGCTGGTCGTATCCCGGAACAGATCCTGG
GTAAAGTTTCTATCGCTGTTATCAAAGGTCTGACCTACCTGCGTGAAAAA
CACAAAATCATGCACCGTGACGTTAAACCGTCTAACATCCTGGTTAACTC
TCGTGGTGAAATCAAACTGTGCGACTTCGGTGTTTCTGGTCAGCTGATCG
ACTCTATGGCTAACTCTTTCGTTGGCACCCGTTCTTACATGTCTCCGGAA
CGTCTGCAAGGCACCCACTACTCTGTTCAGTCTGACATCTGGTCTATGGG
TCTGTCTCTGGTTGAAATGGCTGTTGGTCGTTACCCGATCCCGCCGCCGG
ACGCTAAAGAACTGGAACTGATGTTCGGTTGCCAGGTTGAAGGTGACGCT
GCTGAAACCCCGCCGCGTCCGCGTACTCCGGGTCGTCCGCTGTCTTCTTA
CGGTATGGACTCTCGTCCGCCGATGGCTATCTTCGAACTGCTGGACTACA
TCGTTAACGAACCGCCGCCGAAACTGCCGTCTGGTGTTTTCTCTCTGGAG
TTCCAGGACTTCGTTAACAAATGCCTGATCAAAAACCCGGCTGAACGTGC
TGACCTGAAACAGCTGATGGTTCACGCTTTCATCAAACGTTCTGACGCTG
AAGAAGTTGACTTCGCTGGTTGGCTGTGCTCTACCATCGGTCTGAACCAG
CCGTCTACCCCGACCCACGCTGCTGGTGTGGCAGCCGCAGCTGCGCATCA
TCACCACCATCACTAA.
[0138] pCG-MBPMEK1SS was generated by the ligation of three PCR
products. One PCR product was derived from pGFIB (Normanly J, et
al. Nature 321:213 (1986)) using primers GFIB-F (5'-ATA AGA ATG CGG
CCG CGC CGC AGC CGA ACG ACC GAG-3', SEQ ID NO:32) and GFIB-R
(5'-CTA GCT AGC GTC TGA CGC TCA GTG GAA CG-3', SEQ ID NO:33). The
second PCR product was generated from pCDFDuet-1 (Novagen) using
primers CDF-F (5'-CTA GCT AGC TCA CTC GGT CGC TAC GCT-3', SEQ ID
NO:34) and CDF-R (5'-ATA AGA ATG CGG CCG CTG AAA TCT AGA GCG GTT
CAG-3', SEQ ID NO:35). Both PCR products were digested with NheI
and NotI and ligated to form plasmid pCG. The third PCR product,
encoding an expression cassette for MBP-MEK1-His.sub.6 under the
control of T7 promoter and T7 terminator, was generated from
pET20-MBPMEK1 using primers ETCDGFF (5'-AAA AGG CGC CGC CAG CCT AGC
CGG GTC CTC AAC G-3', SEQ ID NO:36) and ETCDGFR (5'-AAC TGC AGC
CAA TCC GGA TAT AGT TC-3', SEQ ID NO:37). This PCR product was
cloned between the NarI and PstI sites of pCG.
[0139] The codon for Ser 222 in MEK1 was then replaced by a GAA
codon (encoding Glu) using QuikChange mutagenesis (Stratagene). In
the same way, the codon for Ser 218 was changed either to GAA,
generating pCG-MBPMEK1EE, or to an amber stop codon, resulting in
pCG-MBPMEK1XE. In pCG-MBPMEK1XS only the codon for Ser218 was
changed to UAG, and in pCG-MBPMEK1XX the codons for both Ser218 and
Ser222 were changed to amber.
Expression and Purification of Myoglobin
[0140] To express mutant myoglobin, pKD-SepRS-EF-Sep and pKD-SepRS
were transformed into E. coli Top10.DELTA.serB containing
pMYO127TAG-SepT. E. coli Top10.DELTA.serB with pMYO, encoding the
wild type myoglobin gene, was used as a control. Cultures were grown
in LB medium supplemented with 2 mM Sep. When A.sub.600 reached 0.6,
protein expression was induced with 0.05 mM IPTG for 12 h at
25.degree. C. The cells were harvested, resuspended in lysis buffer
(50 mM Tris-HCl (pH 7.8), 300 mM NaCl, 14.3 mM 2-mercaptoethanol)
supplemented with protease inhibitor cocktail (Roche), and
subjected to sonication. The lysate was centrifuged at
10,000.times.g for 30 min and the supernatant was applied to
Ni.sup.2+-NTA agarose (Qiagen) purification according to the
manufacturer's instruction.
Expression and Purification of MEK1
[0141] To express MEK1 (as a maltose binding protein fusion
protein), E. coli BL21.DELTA.serB was transformed with plasmids
pKD-SepRS-EFSep, pCAT112TAG-SepT, and one of pCG-MBPMEK1SS,
pCG-MBPMEK1EE, pCG-MBPMEK1XE, pCG-MBPMEK1XS, or pCG-MBPMEK1XX.
Plasmid pCAT112TAG-SepT was replaced by pL11C-SepT in
the strain used to produce MBP-MEK1(Sep218,Ser222)-His.sub.6 for
mass spectrometry analysis.
[0142] Cells were grown at 30.degree. C. in 1 liter of LB
supplemented with 100 .mu.g/ml of Amp, 50 .mu.g/ml Kan, 12 .mu.g/ml
Tc, 2 mM Sep, 5052 solution, and phosphate buffer for
autoinduction. When A.sub.600 reached 0.6, the temperature was changed
to 16.degree. C. and incubation continued for 18 h. After harvesting,
cells were lysed in 20 ml BugBuster reagent containing 50 mM
Tris-HCl (pH 7.8), 500 mM NaCl, 0.5 mM EGTA, 0.5 mM EDTA, 14.3 mM
2-mercaptoethanol, 10% glycerol, 0.03% Brij-35, protease
inhibitors, 25 U ml.sup.-1 Benzonase, and 1 mg ml.sup.-1 lysozyme.
The lysate was clarified by ultracentrifugation, and applied to a
0.4 ml Ni.sup.2+-NTA agarose column. The column was washed with 15
ml wash buffer (50 mM Tris-HCl (pH 7.8), 150 mM NaCl, 0.5 mM EGTA,
0.5 mM EDTA, 14.3 mM 2-mercaptoethanol, 10% glycerol, 0.03%
Brij-35, and 20 mM imidazole). Proteins were eluted in 0.8 ml of
wash buffer supplemented with 300 mM imidazole, dialyzed against 50
mM Tris-HCl (pH 7.8), 150 mM NaCl, 0.1 mM EGTA, 5 mM DTT, 30%
glycerol, and 0.03% Brij-35, and stored at -20.degree. C. Purified
proteins were analyzed by SDS-PAGE.
Expression and Purification of Erk2
[0143] E. coli BL21 (DE3) codon plus cells were transformed with
pET15-ERK2 and grown at 37.degree. C. in 1 liter LB broth
supplemented with 100 .mu.g/ml Amp and 34 .mu.g/ml Cm. When the
cultures reached an A.sub.600 of 0.6, 0.2 mM IPTG was added and
expression was induced for 19 h at 16.degree. C.
[0144] Cell lysis, Ni.sup.2+ purification, and dialysis of Erk2 were
carried out as described for MEK1. Erk2 was 99% pure, as judged by
Coomassie brilliant blue staining after SDS-PAGE.
Preparation and Aminoacylation of tRNA.
[0145] Total tRNA from E. coli Top10 or from E. coli Top10
complemented with pCysT or pSepT, respectively, was purified by
standard procedures and acylated with [.sup.14C]Sep by M.
maripaludis SepRS as described previously. In vivo synthesized tRNA
was used for this experiment to ensure that nucleoside modifications
introduced into tRNA by E. coli modifying enzymes do not affect
tRNA recognition by SepRS. M. jannaschii tRNA.sup.Cys contains
m.sup.1G37 when isolated from M. jannaschii. Since the E. coli
methylase TrmD is known to methylate G37 of archaeal tRNA.sup.Pro,
it is believed that the in vivo expressed tRNA.sup.Sep also carries
the m.sup.1G37 modification. In vitro transcript of M. jannaschii
tRNA.sup.Cys was prepared and acylated with [.sup.14C]Sep or
[.sup.35S]Cys using recombinant M. maripaludis SepRS or CysRS. M.
jannaschii tRNA.sup.Cys transcript was chosen for these experiments
because of the poor folding properties of in vitro transcribed M.
maripaludis tRNA.sup.Cys (Hohn, M. J. Proc Natl Acad Sci USA 103,
18095 (2006)).
EF-Tu Hydrolysis Protection Assays
[0146] To assay hydrolysis protection of acylated tRNA.sup.Cys by
EF-Tu, Mmp tRNA.sup.Cys in vitro transcripts acylated with
[.sup.14C]Sep or [.sup.35S]Cys, respectively, were
phenol/chloroform extracted, and the aqueous phase was passed over
Sephadex.RTM. G25 Microspin columns (GE Healthcare) equilibrated
with water. Protection of aminoacylated tRNA by EF-Tu was assayed
as described earlier with slight modifications (Ling J. et al.,
Proc Natl Acad Sci USA 104, 15299 (2007)). Briefly, EF-Tu or EF-Sep
(both purified as maltose binding protein fusion proteins) were
activated for 20 min. at 37.degree. C. in buffer containing 100 mM
Tris-HCl (pH 8.2), 120 mM NH.sub.4Cl, 7 mM MgCl.sub.2, 5 mM DTT, 5 mM
phosphoenolpyruvate, 1.5 mM GTP, and 0.12 .mu.g/.mu.l pyruvate
kinase. Hydrolysis of 2 .mu.M [.sup.14C]Sep-tRNA.sup.Cys was then
monitored at 25.degree. C. in the presence of 40 .mu.M EF-Tu (wt),
EF-Sep, or BSA, respectively. Aliquots were taken from the reaction
mix at indicated time points and spotted on 3 MM filter discs
presoaked with 10% trichloroacetic acid. Filters were washed with
5% trichloroacetic acid, dried, and radioactivity was measured by
liquid scintillation counting.
MEK Activity Assays
[0147] Recombinant MEK1 variants were assayed (as maltose binding
protein (MBP) fusion-proteins). Briefly, in a first reaction,
various amounts (2.5-5000 ng) of recombinant MBP-MEK1 variants were
used to phosphorylate (and activate) bacterially expressed MAP
kinase (Erk2) for 15 min. at 30.degree. C. in 35 .mu.l kinase assay
buffer containing 12 mM MOPS pH 7.2, 20 mM MgCl.sub.2, 3 mM EGTA,
15 mM .beta.-glycerol phosphate, 0.6 mM DTT, 140 .mu.M ATP, and 1
.mu.g Erk2.
[0148] After 15 min, a 5 .mu.l aliquot was transferred to a second
reaction in which activated Erk2 phosphorylates myelin basic
protein (MyBP; 570 .mu.g ml.sup.-1) in kinase assay buffer in the
presence of [.gamma.-.sup.32P]ATP. After 15 min. incubation at
30.degree. C., 25 .mu.l aliquots were transferred onto p81
phospho-cellulose filters (Whatman). The filters were washed three
times with 180 mM phosphoric acid and then rinsed with acetone.
Phosphorylation was quantitated by scintillation counting and the
specific activity of MEK1 was calculated from the amount of
[.sup.32P]phosphate incorporated into MyBP.
LC and MS/MS Conditions for Multiple Reaction Monitoring (MRM)
[0149] Purified MEK1 proteins were separated by SDS-PAGE,
visualized with Coomassie stain, excised, washed in 50% acetonitrile
(ACN)/50 mM NH.sub.4HCO.sub.3, crushed, and digested at 37.degree.
C. in a 20 .mu.g/ml trypsin (Promega) solution in 10 mM
NH.sub.4HCO.sub.3. Digested peptides in solution were dried and
dissolved in 3 .mu.l of 70% formic acid (FA), and then diluted to
10 .mu.l with 0.1% TFA. Peptides for MRM were synthesized at the
KECK peptide synthesis facility at Yale. The human MEK peptide
LCDFGVSGQLIDS*MANSFVGTR (SEQ ID NO:40) (*phospho-Ser; YPED peptide
ID, SOL14075) was synthesized to permit the development of a
specific method for quantitative MRM. Crude synthetic peptides were
directly infused at a concentration of .about.10 pmol/.mu.l and
Collision Energy and Declustering Potentials of the transitions
were optimized. LC-MRM was performed on an ABI 5500 QTRAP triple
quadrupole mass spectrometer interfaced with a Waters nanoAcquity
UPLC system running Analyst 1.5 software. Peptides were resolved
for MRM (LC step) by loading 4 .mu.l of sample onto a Symmetry C18
nanoAcquity trapping column (180 .mu.m.times.20 mm 5 .mu.m) with
100% water at 15 .mu.l per minute for 1 minute. After trapping,
peptides were resolved on a BEH130 C18 nanoAcquity column (75
.mu.m.times.50 mm 1.7 .mu.m) with a 30 minute, 2-40% ACN/0.1% FA
linear gradient (0.5 .mu.l/min flow rate). MRM scanning was
carried out with 18 transitions and a cycle time of 1.44 seconds
with a 40 millisecond dwell time per transition. An MRM Initiated
Detection and Sequencing (MIDAS) was performed. The IDA method
consisted of the most intense peak using rolling collision energy.
The target ions were excluded after 3 occurrences for 30 seconds.
The EPI scan had a scan rate of 20,000 Da/sec with a sum of 3 scans
and mass range of 100-1000 Da and a cycle time of 1.4 msec. Files
were searched using Mascot version 2.3 with the Swissprot database
(08/2010) selected (human taxonomic restriction). Phosphorylated S
and T, and propionamide C were variable modifications. Peptide and
fragment mass tolerance was 0.6 Da, with 1 missed cleavage.
Quantification was performed using MultiQuant 2.0.
Results
[0150] To further demonstrate the usefulness of the disclosed
strategy for the synthesis of a protein that is naturally
phosphorylated at a serine residue, recombinant, Sep-containing
mitogen-activated ERK activating kinase 1 (MEK1) was produced. This
key enzyme of the mitogen-activated signaling cascade in eukaryotic
cells plays crucial roles in cell proliferation, cell development
and differentiation, cell cycle control and oncogenesis
(Sebolt-Leopold, J. S., et al. Nat Rev Cancer 4, 937 (2004)).
Activation of MEK1 requires post-translational phosphorylation of
Ser218 and Ser222 by MEK activating kinases (e.g., Raf-1, MEKK, or
MOS). Change of both Ser residues to Glu yields a constitutively
active enzyme albeit with lower activity (Alessi D R. et al., EMBO
J 13, 1610 (1994)).
[0151] To improve expression of this human protein in the E. coli
BL21 .DELTA.serB strain and to allow purification by
Ni.sup.2+-affinity chromatography, a MEK1 clone was designed to
generate an N-terminal fusion with maltose binding protein (MBP)
and with a C-terminal His.sub.6-tag. Position 222 was changed to
Glu and the codon for Ser218 was replaced by UAG to encode Sep.
After expression in the presence of SepRS, tRNA.sup.Sep and EF-Sep,
25 .mu.g of full-length MBP-MEK1(Sep218,Glu222) were isolated from
1 L of culture. The presence of Sep in this recombinant MEK1-fusion
protein was demonstrated by its activity in phosphorylating ERK2.
The assay requires the additional component, myelin basic protein
(MyBP) which will be phosphorylated by activated ERK2 in the
presence of [.gamma.-.sup.32P]ATP; the amount of [.sup.32P]MyBP
relates to the specific activity of MEK1. As FIG. 4 shows,
MBP-MEK1(Sep218,Glu222) had a 2,500-fold higher specific activity
than non-phosphorylated MBP-MEK1(Ser218,Ser222), and a 70-fold
higher specific activity than the constitutively active
MBP-MEK1(Glu218,Glu222) mutant.
[0152] To demonstrate the incorporation of Sep at position 218, an
assay was developed utilizing multiple reaction monitoring (MRM)
and a triple-quadrupole mass spectrometer. The MRM assay was
designed to detect an intact tryptic phosphopeptide ion (m/z
823.4.sup.+3) derived from MBP-MEK1(Sep218,Ser222) and 4 fragment
ions produced by collision-induced dissociation of this intact
phosphopeptide (Table 1). The MRM method included an Information
Dependent Acquisition (IDA) step that triggered a full MS/MS scan
once the 823.4.sup.+3 ion, and associated fragment ions, were
detected. The IDA MS/MS spectrum confirmed the incorporation of Sep
at position 218 and Ser at 222 in MBP-MEK1 (Sep218, Ser222).
TABLE-US-00008 TABLE 1 Peptide information for MRM
Peptide (SEQ ID NO: 40) | Precursor/product ion | CE | DP |
LC*DFGVSGQLIDS.sup.PMANSFVGTR | 823.4.sup.( .sup.)/333.2.sup.( .sup.)[y3] | 30.85 | 160.9 |
LC*DFGVSGQLIDS.sup.PMANSFVGTR | 823.4.sup.( .sup.)/666.35.sup.( .sup.)[y6] | 38.26 | 160.9 |
LC*DFGVSGQLIDS.sup.PMANSFVGTR | 823.4.sup.( .sup.)/780.4.sup.( .sup.)[y7] | 38.62 | 160.9 |
LC*DFGVSGQLIDS.sup.PMANSFVGTR | 823.4.sup.( .sup.)/851.4.sup.( .sup.)[y8] | 38.12 | 160.9 |
S.sup.P, phosphoserine; C*, propionamide; CE, Collision Energy; DP,
Declustering Potential
( .sup.) indicates data missing or illegible when filed
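The precursor and product ion m/z values in Table 1 can be cross-checked from the peptide sequence. The sketch below is a hedged illustration and not the method used to build the assay. It uses standard monoisotopic residue masses, treats propionamide on Cys as +71.037114 Da and phosphorylation on Ser as +79.966331 Da, and takes the precursor as triply protonated, as stated for the m/z 823.4 ion. Because the charge states in the table are illegible, it assumes that the y ions are singly protonated. All function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
PHOSPHO = 79.966331       # HPO3 added to Ser
PROPIONAMIDE = 71.037114  # acrylamide adduct on Cys

PEPTIDE = "LCDFGVSGQLIDSMANSFVGTR"
# 0-based positions of the modified residues: Cys at index 1, Ser at index 12.
MODS = {1: PROPIONAMIDE, 12: PHOSPHO}


def fragment_mass(seq, mods, start, end):
    """Sum of residue masses (plus modifications) for seq[start:end]."""
    return sum(RESIDUE_MASS[seq[i]] + mods.get(i, 0.0) for i in range(start, end))


def precursor_mz(seq, mods, charge):
    """m/z of the intact peptide carrying `charge` protons."""
    neutral = fragment_mass(seq, mods, 0, len(seq)) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(seq, mods, n, charge=1):
    """m/z of the y ion made of the last n residues (assumed charge state)."""
    neutral = fragment_mass(seq, mods, len(seq) - n, len(seq)) + WATER
    return (neutral + charge * PROTON) / charge


PRECURSOR = precursor_mz(PEPTIDE, MODS, 3)
Y_IONS = {n: y_ion_mz(PEPTIDE, MODS, n) for n in (3, 6, 7, 8)}

print(f"precursor (3+): {PRECURSOR:.2f}")
for n, mz in Y_IONS.items():
    print(f"y{n} (1+): {mz:.2f}")
```

Under these assumptions the calculated values agree with the tabulated 823.4, 333.2, 666.35, 780.4, and 851.4 to within rounding. All four y ions lie C-terminal to the phosphoserine, so they do not carry the phosphate group.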
[0153] To determine if our E. coli expression system would allow
the simultaneous insertion of two Sep residues into the protein,
the Ser codons in positions 218 and 222 were changed to UAG. As
expected, the expression efficiency of MBP-MEK1(Sep218,Sep222) was
dramatically reduced compared to wild-type MBP-MEK1 (only about 1
.mu.g of full-length protein was obtained from 1 L culture). The
presence of Sep at both active site positions of MEK1 was tested by
Western blot analysis using a monoclonal antibody specific to the
phosphorylated active site of human MEK2. Only recombinant
MBP-MEK1(Sep218,Sep222), and to a weaker extent
MBP-MEK1(Sep218,Ser222), was detected in this experiment, while
none of MBP-MEK1(Ser218,Ser222), MBP-MEK1(Sep218,Glu222), or
MBP-MEK1(Glu218,Glu222) was recognized by this antibody. The
presence of full-length MBP-fusion proteins was confirmed by
Coomassie staining and by Western hybridization with an
MBP-specific antibody. This demonstrates that the addition of
SepRS, tRNA.sup.Sep and EF-Sep endows E. coli with the ability to
read UAG as a phosphoserine codon.
[0154] Unless defined otherwise, all technical and scientific terms
used herein have the same meanings as commonly understood by one of
skill in the art to which the disclosed invention belongs.
Sequence CWU 1
1
461409PRTArtificial SequenceSynthetic Construct 1Met Ser Lys Glu
Lys Phe Glu Arg Thr Lys Pro His Val Asn Val Gly1 5 10 15Thr Ile Gly
His Val Asp His Gly Lys Thr Thr Leu Thr Ala Ala Ile 20 25 30Thr Thr
Val Leu Ala Lys Thr Tyr Gly Gly Thr Ala Arg Ala Phe Asp 35 40 45Gln
Ile Asp Asn Ala Pro Glu Glu Lys Ala Arg Gly Ile Thr Ile Asn 50 55
60Thr Ser Arg Val Glu Tyr Asp Thr Pro Thr Arg His Tyr Ala His Val65
70 75 80Asp Cys Pro Gly His Ala Asp Tyr Val Lys Asn Met Ile Thr Gly
Ala 85 90 95Ala Gln Met Asp Gly Ala Ile Leu Val Val Ala Ala Thr Asp
Gly Pro 100 105 110Met Pro Gln Thr Arg Glu His Ile Leu Leu Gly Arg
Gln Val Gly Val 115 120 125Pro Tyr Ile Ile Val Phe Leu Asn Lys Cys
Asp Met Val Asp Asp Glu 130 135 140Glu Leu Leu Glu Leu Val Glu Met
Glu Val Arg Glu Leu Leu Ser Gln145 150 155 160Tyr Asp Phe Pro Gly
Asp Asp Thr Pro Ile Val Arg Gly Ser Ala Leu 165 170 175Lys Ala Leu
Glu Gly Asp Ala Glu Trp Glu Ala Lys Ile Leu Glu Leu 180 185 190Ala
Gly Phe Leu Asp Ser Tyr Ile Pro Glu Pro Glu Arg Ala Ile Asp 195 200
205Lys Pro Phe Leu Leu Pro Ile Thr Arg Val Tyr Ser Ile Ser Gly Arg
210 215 220Gly Thr Val Val Ser Gly Arg Val Glu Arg Gly Ile Ile Lys
Val Gly225 230 235 240Glu Glu Val Glu Ile Val Gly Ile Lys Glu Thr
Gln Lys Ser Thr Cys 245 250 255Thr Gly Val Glu Met Phe Arg Lys Leu
Leu Asp Glu Gly Arg Ala Gly 260 265 270Glu Phe Val Gly Val Leu Leu
Arg Gly Ile Lys Arg Glu Glu Ile Glu 275 280 285Arg Gly Gln Val Leu
Ala Lys Pro Gly Thr Ile Lys Pro His Thr Lys 290 295 300Phe Glu Ser
Glu Val Tyr Ile Leu Ser Lys Asp Glu Gly Gly Arg His305 310 315
320Thr Pro Phe Phe Lys Gly Tyr Arg Pro Gln Phe Tyr Phe Arg Thr Thr
325 330 335Asp Val Thr Gly Thr Ile Glu Leu Pro Glu Gly Val Glu Met
Val Met 340 345 350Pro Gly Asp Asn Ile Lys Met Val Val Thr Leu Ile
His Pro Ile Ala 355 360 365Met Asp Asp Gly Leu Arg Phe Ala Ile Arg
Glu Gly Gly Arg Thr Val 370 375 380Gly Ala Gly Val Val Ala Lys Val
Leu Arg Asp Pro Asn Ser Ser Ser385 390 395 400Val Asp Lys Leu Ala
Ala Ala Leu Glu 4052409PRTArtificial SequenceSynthetic Construct
2Met Ser Lys Glu Lys Phe Glu Arg Thr Lys Pro His Val Asn Val Gly1 5
10 15Thr Ile Gly His Val Asp His Gly Lys Thr Thr Leu Thr Ala Ala
Ile 20 25 30Thr Thr Val Leu Ala Lys Thr Tyr Gly Gly Ala Ala Arg Ala
Phe Asp 35 40 45Gln Ile Asp Asn Ala Pro Glu Glu Lys Ala Arg Gly Ile
Thr Ile Asn 50 55 60Thr Ser Arg Val Glu Tyr Asp Thr Pro Thr Arg His
Tyr Ala His Val65 70 75 80Asp Cys Pro Gly His Ala Asp Tyr Val Lys
Asn Met Ile Thr Gly Ala 85 90 95Ala Gln Met Asp Gly Ala Ile Leu Val
Val Ala Ala Thr Asp Gly Pro 100 105 110Met Pro Gln Thr Arg Glu His
Ile Leu Leu Gly Arg Gln Val Gly Val 115 120 125Pro Tyr Ile Ile Val
Phe Leu Asn Lys Cys Asp Met Val Asp Asp Glu 130 135 140Glu Leu Leu
Glu Leu Val Glu Met Glu Val Arg Glu Leu Leu Ser Gln145 150 155
160Tyr Asp Phe Pro Gly Asp Asp Thr Pro Ile Val Arg Gly Ser Ala Leu
165 170 175Lys Ala Leu Glu Gly Asp Ala Glu Trp Glu Ala Lys Ile Leu
Glu Leu 180 185 190Ala Gly Phe Leu Asp Ser Tyr Ile Pro Glu Pro Glu
Arg Ala Ile Asp 195 200 205Lys Pro Phe Leu Leu Pro Ile Thr Tyr Val
Tyr Ser Ile Ser Gly Arg 210 215 220Gly Thr Val Val Ser Gly Arg Val
Glu Arg Gly Ile Ile Lys Val Gly225 230 235 240Glu Glu Val Glu Ile
Val Gly Ile Asn Glu Thr Gln Lys Ser Thr Cys 245 250 255Thr Gly Val
Glu Met Phe Arg Lys Leu Leu Asp Glu Gly Arg Ala Gly 260 265 270Glu
Ala Val Gly Val Leu Leu Arg Gly Ile Lys Arg Glu Glu Ile Glu 275 280
285Arg Gly Gln Val Leu Ala Lys Pro Gly Thr Ile Lys Pro His Thr Lys
290 295 300Phe Glu Ser Glu Val Tyr Ile Leu Ser Lys Asp Glu Gly Gly
Arg His305 310 315 320Thr Pro Phe Phe Lys Gly Tyr Arg Pro Gln Phe
Tyr Phe Arg Thr Thr 325 330 335Asp Val Thr Gly Thr Ile Glu Leu Pro
Glu Gly Val Glu Met Val Met 340 345 350Pro Gly Asp Asn Ile Lys Met
Val Val Thr Leu Ile His Pro Ile Ala 355 360 365Met Asp Asp Gly Leu
Arg Phe Ala Ile Arg Glu Gly Gly Arg Thr Val 370 375 380Gly Ala Gly
Val Val Ala Lys Val Leu Arg Asp Pro Asn Ser Ser Ser385 390 395
400Val Asp Lys Leu Ala Ala Ala Leu Glu 4053409PRTArtificial
SequenceSynthetic Construct 3Met Ser Lys Glu Lys Phe Glu Arg Thr
Lys Pro His Val Asn Val Gly1 5 10 15Thr Ile Gly His Val Asp His Gly
Lys Thr Thr Leu Thr Ala Ala Ile 20 25 30Thr Thr Val Leu Ala Lys Thr
Tyr Gly Gly Ala Ala Arg Ala Phe Asp 35 40 45Gln Ile Asp Asn Ala Pro
Glu Glu Lys Ala Arg Gly Ile Thr Ile Asn 50 55 60Thr Ser Arg Val Glu
Tyr Asp Thr Pro Thr Arg His Tyr Ala His Val65 70 75 80Asp Cys Pro
Gly His Ala Asp Tyr Val Lys Asn Met Ile Thr Gly Ala 85 90 95Ala Gln
Met Asp Gly Ala Ile Leu Val Val Ala Ala Thr Asp Gly Pro 100 105
110Met Pro Gln Thr Arg Glu His Ile Leu Leu Gly Arg Gln Val Gly Val
115 120 125Pro Tyr Ile Ile Val Phe Leu Asn Lys Cys Asp Met Val Asp
Asp Glu 130 135 140Glu Leu Leu Glu Leu Val Glu Met Glu Val Arg Glu
Leu Leu Ser Gln145 150 155 160Tyr Asp Phe Pro Gly Asp Asp Thr Pro
Ile Val Arg Gly Ser Ala Leu 165 170 175Lys Ala Leu Glu Gly Asp Ala
Glu Trp Glu Ala Lys Ile Leu Glu Leu 180 185 190Ala Gly Phe Leu Asp
Ser Tyr Ile Pro Glu Pro Glu Arg Ala Ile Asp 195 200 205Lys Pro Phe
Leu Leu Pro Ile Asn Gly Val Tyr Ser Ile Ser Gly Arg 210 215 220Gly
Thr Val Val Ser Gly Arg Val Glu Arg Gly Ile Ile Lys Val Gly225 230
235 240Glu Glu Val Glu Ile Val Gly Ile Lys Glu Thr Gln Lys Ser Thr
Cys 245 250 255Thr Gly Val Glu Met Phe Arg Lys Leu Leu Asp Glu Gly
Arg Ala Gly 260 265 270Glu Trp Val Gly Val Leu Leu Arg Gly Ile Lys
Arg Glu Glu Ile Glu 275 280 285Arg Gly Gln Val Leu Ala Lys Pro Gly
Thr Ile Lys Pro His Thr Lys 290 295 300Phe Glu Ser Glu Val Tyr Ile
Leu Ser Lys Asp Glu Gly Gly Arg His305 310 315 320Thr Pro Phe Phe
Lys Gly Tyr Arg Pro Gln Phe Tyr Phe Arg Thr Thr 325 330 335Asp Val
Thr Gly Thr Ile Glu Leu Pro Glu Gly Val Glu Met Val Met 340 345
350Pro Gly Asp Asn Ile Lys Met Val Val Thr Leu Ile His Pro Ile Ala
355 360 365Met Asp Asp Gly Leu Arg Phe Ala Ile Arg Glu Gly Gly Arg
Thr Val 370 375 380Gly Ala Gly Val Val Ala Lys Val Leu Arg Asp Pro
Asn Ser Ser Ser385 390 395 400Val Asp Lys Leu Ala Ala Ala Leu Glu
4054409PRTArtificial SequenceSynthetic Construct 4Met Ser Lys Glu
Lys Phe Glu Arg Thr Lys Pro His Val Asn Val Gly1 5 10 15Thr Ile Gly
His Val Asp His Gly Lys Thr Thr Leu Thr Ala Ala Ile 20 25 30Thr Thr
Val Leu Ala Lys Thr Tyr Gly Gly Ala Ala Arg Ala Phe Asp 35 40 45Gln
Ile Asp Asn Ala Pro Glu Glu Lys Ala Arg Gly Ile Thr Ile Asn 50 55
60Thr Ser Arg Val Glu Tyr Asp Thr Pro Thr Arg His Tyr Ala His Val65
70 75 80Asp Cys Pro Gly His Ala Asp Tyr Val Lys Asn Met Ile Thr Gly
Ala 85 90 95Ala Gln Met Asp Gly Ala Ile Leu Val Val Ala Ala Thr Asp
Gly Pro 100 105 110Met Pro Gln Thr Arg Glu His Ile Leu Leu Gly Arg
Gln Val Gly Val 115 120 125Pro Tyr Ile Ile Val Phe Leu Asn Lys Cys
Asp Met Val Asp Asp Glu 130 135 140Glu Leu Leu Glu Leu Val Glu Met
Glu Val Arg Glu Leu Leu Ser Gln145 150 155 160Tyr Asp Phe Pro Gly
Asp Asp Thr Pro Ile Val Arg Gly Ser Ala Leu 165 170 175Lys Ala Leu
Glu Gly Asp Ala Glu Trp Glu Ala Lys Ile Leu Glu Leu 180 185 190Ala
Gly Phe Leu Asp Ser Tyr Ile Pro Glu Pro Glu Arg Ala Ile Asp 195 200
205Lys Pro Phe Leu Leu Pro Ile Thr Ala Val Tyr Ser Ile Ser Gly Arg
210 215 220Gly Thr Val Val Ser Gly Arg Val Glu Arg Gly Ile Ile Lys
Val Gly225 230 235 240Glu Glu Val Glu Ile Val Gly Ile Lys Glu Thr
Gln Lys Ser Thr Cys 245 250 255Thr Gly Val Glu Met Phe Arg Lys Leu
Leu Asp Glu Gly Arg Ala Gly 260 265 270Glu Ala Val Gly Val Leu Leu
Arg Gly Ile Lys Arg Glu Glu Ile Glu 275 280 285Arg Gly Gln Val Leu
Ala Lys Pro Gly Thr Ile Lys Pro His Thr Lys 290 295 300Phe Glu Ser
Glu Val Tyr Ile Leu Ser Lys Asp Glu Gly Gly Arg His305 310 315
320Thr Pro Phe Phe Lys Gly Tyr Arg Pro Gln Phe Tyr Phe Arg Thr Thr
325 330 335Asp Val Thr Gly Thr Ile Glu Leu Pro Glu Gly Val Glu Met
Val Met 340 345 350Pro Gly Asp Asn Ile Lys Met Val Val Thr Leu Ile
His Pro Ile Ala 355 360 365Met Asp Asp Gly Leu Arg Phe Ala Ile Arg
Glu Gly Gly Arg Thr Val 370 375 380Gly Ala Gly Val Val Ala Lys Val
Leu Arg Asp Pro Asn Ser Ser Ser385 390 395 400Val Asp Lys Leu Ala
Ala Ala Leu Glu 40551227DNAArtificial SequenceSynthetic Construct
5atgtctaaag aaaagtttga acgtacaaaa ccgcacgtta acgtcggtac tatcggccac
60gttgaccatg gtaaaacaac gctgaccgct gcaatcacta ccgtactggc taaaacctac
120ggcggtactg ctcgcgcatt cgaccagatc gataacgcgc cggaagaaaa
agctcgtggt 180atcaccatca acacttctcg ggttgaatac gacaccccga
cccgtcacta cgcacacgta 240gactgcccgg ggcacgccga ctatgttaaa
aacatgatca ccggtgctgc gcagatggac 300ggcgcgatcc tggtagttgc
tgcgactgac ggcccgatgc cgcagactcg tgagcacatc 360ctgctgggtc
gtcaggtagg cgttccgtac atcatcgtgt tcctgaacaa atgcgacatg
420gttgatgacg aagagctgct ggaactggtt gaaatggaag ttcgtgaact
tctgtctcag 480tacgacttcc cgggcgacga cactccgatc gttcgtggtt
ctgctctgaa agcgctggaa 540ggcgacgcag agtgggaagc gaaaatcctg
gaactggctg gcttcctgga ttcttacatt 600ccggaaccag agcgtgcgat
tgacaagccg ttcctgctgc cgatcacccg ggtatactcc 660atctccggtc
gtggtaccgt tgtttcgggt cgtgtagaac gcggtatcat caaagttggt
720gaagaagttg aaatcgttgg tatcaaagag actcagaagt ctacctgtac
tggcgttgaa 780atgttccgca aactgctgga cgaaggccgt gctggtgagt
tcgtaggtgt tctgctgcgt 840ggtatcaaac gtgaagaaat cgaacgtggt
caggtactgg ctaagccggg caccatcaag 900ccgcacacca agttcgaatc
tgaagtgtac attctgtcca aagatgaagg cggccgtcat 960actccgttct
tcaaaggcta ccgtccgcag ttctacttcc gtactactga cgtgactggt
1020accatcgaac tgccggaagg cgtagagatg gtaatgccgg gcgacaacat
caaaatggtt 1080gttaccctga tccacccgat cgcgatggac gacggtctgc
gtttcgcaat ccgtgaaggc 1140ggccgtaccg ttggcgcggg cgttgtagca
aaagttctga gggatccgaa ttcgagctcc 1200gtcgacaagc ttgcggccgc actcgag
122761227DNAArtificial SequenceSynthetic Construct 6atgtctaaag
aaaagtttga acgtacaaaa ccgcacgtta acgtcggtac tatcggccac 60gttgaccatg
gtaaaacaac gctgaccgct gcaatcacta ccgtactggc taaaacctac
120ggcggtgctg ctcgcgcatt cgaccagatc gataacgcgc cggaagaaaa
agctcgtggt 180atcaccatca acacttctag ggttgaatac gacaccccga
cccgtcacta cgcacacgta 240gactgcccgg ggcacgccga ctatgttaaa
aacatgatca ccggtgctgc gcagatggac 300ggcgcgatcc tggtagttgc
tgcgactgac ggcccgatgc cgcagactcg tgagcacatc 360ctgctgggtc
gtcaggtagg cgttccgtac atcatcgtgt tcctgaacaa atgcgacatg
420gttgatgacg aagagctgct ggaactggtt gaaatggaag ttcgtgaact
tctgtctcag 480tacgacttcc cgggcgacga cactccgatc gttcgtggtt
ctgctctgaa agcgctggaa 540ggcgacgcag agtgggaagc gaaaatcctg
gaactggctg gcttcctgga ttcttacatt 600ccggaaccag agcgtgcgat
tgacaagccg ttcctgctgc cgatcaccta cgtatactcc 660atctccggtc
gtggtaccgt tgtttcgggt cgtgtagaac gcggtatcat caaagttggt
720gaagaagttg aaatcgttgg tatcaatgag actcagaagt ctacctgtac
tggcgttgaa 780atgttccgca aactgctgga cgaaggccgt gctggtgagg
cggtaggtgt tctgctgcgt 840ggtatcaaac gtgaagaaat cgaacgtggt
caggtactgg ctaagccggg caccatcaag 900ccgcacacca agttcgaatc
tgaagtgtac attctgtcca aagatgaagg cggccgtcat 960actccgttct
tcaaaggcta ccgtccgcag ttctacttcc gtactactga cgtgactggt
1020accatcgaac tgccggaagg cgtagagatg gtaatgccgg gcgacaacat
caaaatggtt 1080gttaccctga tccacccgat cgcgatggac gacggtctgc
gtttcgcaat ccgtgaaggc 1140ggccgtaccg ttggcgcggg cgttgtagca
aaagttctga gggatccgaa ttcgagctcc 1200gtcgacaagc ttgcggccgc actcgag
122771227DNAArtificial SequenceSynthetic Construct 7atgtctaaag
aaaagtttga acgtacaaaa ccgcacgtta acgtcggtac tatcggccac 60gttgaccatg
gtaaaacaac gctgaccgct gcaatcacta ccgtactggc taaaacctac
120ggcggtgctg ctcgcgcatt cgaccagatc gataacgcgc cggaagaaaa
agctcgtggt 180atcaccatca acacttctcg ggttgaatac gacaccccga
cccgtcacta cgcacacgta 240gactgcccgg ggcacgccga ctatgttaaa
aacatgatca ccggtgctgc gcagatggac 300ggcgcgatcc tggtagttgc
tgcgactgac ggcccgatgc cgcagactcg tgagcacatc 360ctgctgggtc
gtcaggtagg cgttccgtac atcatcgtgt tcctgaacaa atgcgacatg
420gttgatgacg aagagctgct ggaactggtt gaaatggaag ttcgtgaact
tctgtctcag 480tacgacttcc cgggcgacga cactccgatc gttcgtggtt
ctgctctgaa agcgctggaa 540ggcgacgcag agtgggaagc gaaaatcctg
gaactggctg gcttcctgga ttcttacatt 600ccggaaccag agcgtgcgat
tgacaagccg ttcctgctgc cgatcaacgg ggtatactcc 660atctccggtc
gtggtaccgt tgtttcgggt cgtgtagaac gcggtatcat caaagttggt
720gaagaagttg aaatcgttgg tatcaaagag actcagaagt ctacctgtac
tggcgttgaa 780atgttccgca aactgctgga cgaaggccgt gctggtgagt
gggtaggtgt tctgctgcgt 840ggtatcaaac gtgaagaaat cgaacgtggt
caggtactgg ctaagccggg caccatcaag 900ccgcacacca agttcgaatc
tgaagtgtac attctgtcca aagatgaagg cggccgtcat 960actccgttct
tcaaaggcta ccgtccgcag ttctacttcc gtactactga cgtgactggt
1020accatcgaac tgccggaagg cgtagagatg gtaatgccgg gcgacaacat
caaaatggtt 1080gttaccctga tccacccgat cgcgatggac gacggtctgc
gtttcgcaat ccgtgaaggc 1140ggccgtaccg ttggcgcggg cgttgtagca
aaagttctga gggatccgaa ttcgagctcc 1200gtcgacaagc ttgcggccgc actcgag 1227
8 1227 DNA Artificial Sequence Synthetic Construct
8 atgtctaaag
aaaagtttga acgtacaaaa ccgcacgtta acgtcggtac tatcggccac 60gttgaccatg
gtaaaacaac gctgaccgct gcaatcacta ccgtactggc taaaacctac
120ggcggtgctg ctcgcgcatt cgaccagatc gataacgcgc cggaagaaaa
agctcgtggt 180atcaccatca acacttctcg ggttgaatac gacaccccga
cccgtcacta cgcacacgta 240gactgcccgg ggcacgccga ctatgttaaa
aacatgatca ccggtgctgc gcagatggac 300ggcgcgatcc tggtagttgc
tgcgactgac ggcccgatgc cgcagactcg tgagcacatc 360ctgctgggtc
gtcaggtagg cgttccgtac atcatcgtgt tcctgaacaa atgcgacatg
420gttgatgacg aagagctgct ggaactggtt gaaatggaag ttcgtgaact
tctgtctcag 480tacgacttcc cgggcgacga cactccgatc gttcgtggtt
ctgctctgaa agcgctggaa 540ggcgacgcag agtgggaagc gaaaatcctg
gaactggctg gcttcctgga ttcttacatt 600ccggaaccag agcgtgcgat
tgacaagccg ttcctgctgc cgatcaccgc ggtatactcc 660atctccggtc
gtggtaccgt tgtttcgggt cgtgtagaac gcggtatcat caaagttggt
720gaagaagttg aaatcgttgg tatcaaagag actcagaagt ctacctgtac
tggcgttgaa 780atgttccgca aactgctgga cgaaggccgt gctggtgagg
ccgtaggtgt tctgctgcgt 840ggtatcaaac gtgaagaaat cgaacgtggt
caggtactgg ctaagccggg caccatcaag 900ccgcacacca agttcgaatc
tgaagtgtac attctgtcca aagatgaagg cggccgtcat 960actccgttct
tcaaaggcta ccgtccgcag ttctacttcc gtactactga cgtgactggt
1020accatcgaac tgccggaagg
cgtagagatg gtaatgccgg gcgacaacat caaaatggtt 1080gttaccctga
tccacccgat cgcgatggac gacggtctgc gtttcgcaat ccgtgaaggc
1140ggccgtaccg ttggcgcggg cgttgtagca aaagttctga gggatccgaa
ttcgagctcc 1200gtcgacaagc ttgcggccgc actcgag 1227
9 36 DNA Artificial Sequence Synthetic Construct
9 tgcgcaatgc ggccgcccgt agcgccgatg gtagtg 36
10 33 DNA Artificial Sequence Synthetic Construct
10 acacggagat ctctaaagta tatatgagta aac 33
11 37 DNA Artificial Sequence Synthetic Construct
11 tgcgcaatgc ggccgcccgg gtcgaatttg ctttcga 37
12 33 DNA Artificial Sequence Synthetic Construct
12 acacggagat ctatgccccg cgcccaccgg aag 33
13 38 DNA Artificial Sequence Synthetic Construct
13 tgcagcaatg cggccgcttt caccgtcatc accgaaac 38
14 32 DNA Artificial Sequence Synthetic Construct
14 gggacgctag caaacaaaaa gagtttgtag aa 32
15 32 DNA Artificial Sequence Synthetic Construct
15 gggacgctag cttttctctg gtcccgccgc at 32
16 37 DNA Artificial Sequence Synthetic Construct
16 tgcgcaatgc ggccgcggtg gcacttttcg gggaaat 37
17 30 DNA Artificial Sequence Synthetic Construct
17 gcatgcgccg ccagctgttg cccgtctcgc 30
18 31 DNA Artificial Sequence Synthetic Construct
18 gcatagatct tcagctggcg aaagggggat g 31
19 41 DNA Artificial Sequence Synthetic Construct
misc_feature (21)..(23) n is a, c, g, or t
19 gtatcaccat caacacttct nnngttgaat acgacacccc g 41
20 20 DNA Artificial Sequence Synthetic Construct
20 agaagtgttg atggtgatac 20
21 48 DNA Artificial Sequence Synthetic Construct
misc_feature (19)..(24) n is a, c, g, or t
misc_feature (28)..(30) n is a, c, g, or t
21 ccgttcctgc tgccgatcnn nnnngtannn tccatctccg gtcgtggt 48
22 18 DNA Artificial Sequence Synthetic Construct
22 gatcggcagc aggaacgg 18
23 38 DNA Artificial Sequence Synthetic Construct
misc_feature (19)..(21) n is a, c, g, or t
23 ggtcgtggta ccgttgttnn nggtcgtgta gaacgcgg 38
24 18 DNA Artificial Sequence Synthetic Construct
24 aacaacggta ccacgacc 18
25 38 DNA Artificial Sequence Synthetic Construct
misc_feature (19)..(21) n is a, c, g, or t
25 gaaggccgtg ctggtgagnn ngtaggtgtt ctgctgcg 38
26 18 DNA Artificial Sequence Synthetic Construct
26 ctcaccagca cggccttc 18
27 28 DNA Artificial Sequence Synthetic Construct
27 ggaattccat atggcggcgg cggcggcg 28
28 27 DNA Artificial Sequence Synthetic Construct
28 ccgctcgagt taagatctgt atcctgg 27
29 30 DNA Artificial Sequence Synthetic Construct
29 aaggaaatta atgaaaatcg aagaaggtaa 30
30 18 DNA Artificial Sequence Synthetic Construct
30 ctagaggatc cggcgcgc 18
31 1215 DNA Artificial Sequence Synthetic Construct
31atgccgaaga agaaaccgac cccgatccag ctgaacccgg ctccggacgg ttctgctgtt
60aacggcacct cttctgctga aaccaacctg gaagctctgc aaaagaaact ggaagaactg
120gaactggacg aacagcagcg taaacgtctg gaagcgttcc tgacccagaa
acagaaagtt 180ggtgaactga aagacgacga cttcgaaaaa atctctgaac
tgggtgctgg taacggtggt 240gttgttttca aagtttctca caaaccgtcc
ggtctggtta tggctcgtaa actgatccac 300ctggaaatca aaccggctat
ccgtaaccag atcatccgtg aactgcaagt tctgcacgaa 360tgcaactctc
cgtacatcgt tggtttctac ggtgctttct actctgacgg tgaaatctct
420atctgcatgg aacacatgga cggtggttct ctggaccagg ttctgaaaaa
agctggtcgt 480atcccggaac agatcctggg taaagtttct atcgctgtta
tcaaaggtct gacctacctg 540cgtgaaaaac acaaaatcat gcaccgtgac
gttaaaccgt ctaacatcct ggttaactct 600cgtggtgaaa tcaaactgtg
cgacttcggt gtttctggtc agctgatcga ctctatggct 660aactctttcg
ttggcacccg ttcttacatg tctccggaac gtctgcaagg cacccactac
720tctgttcagt ctgacatctg gtctatgggt ctgtctctgg ttgaaatggc
tgttggtcgt 780tacccgatcc cgccgccgga cgctaaagaa ctggaactga
tgttcggttg ccaggttgaa 840ggtgacgctg ctgaaacccc gccgcgtccg
cgtactccgg gtcgtccgct gtcttcttac 900ggtatggact ctcgtccgcc
gatggctatc ttcgaactgc tggactacat cgttaacgaa 960ccgccgccga
aactgccgtc tggtgttttc tctctggagt tccaggactt cgttaacaaa
1020tgcctgatca aaaacccggc tgaacgtgct gacctgaaac agctgatggt
tcacgctttc 1080atcaaacgtt ctgacgctga agaagttgac ttcgctggtt
ggctgtgctc taccatcggt 1140ctgaaccagc cgtctacccc gacccacgct
gctggtgtgg cagccgcagc tgcgcatcat 1200caccaccatc actaa 1215
32 36 DNA Artificial Sequence Synthetic Construct
32 ataagaatgc ggccgcgccg cagccgaacg accgag 36
33 29 DNA Artificial Sequence Synthetic Construct
33 ctagctagcg tctgacgctc agtggaacg 29
34 27 DNA Artificial Sequence Synthetic Construct
34 ctagctagct cactcggtcg ctacgct 27
35 36 DNA Artificial Sequence Synthetic Construct
35 ataagaatgc ggccgctgaa atctagagcg gttcag 36
36 34 DNA Artificial Sequence Synthetic Construct
36 aaaaggcgcc gccagcctag ccgggtcctc aacg 34
37 26 DNA Artificial Sequence Synthetic Construct
37 aactgcagcc aatccggata tagttc 26
38 37 DNA Artificial Sequence Synthetic Construct
38 ggaattccat atgaccaaga ccccgccggc agcagtt 37
39 24 DNA Artificial Sequence Synthetic Construct
39 aggcgcgcct tagtcctcca ctac 24
40 22 PRT Artificial Sequence Synthetic Construct
MOD_RES (13)..(13) PHOSPHORYLATION
40 Leu Cys Asp Phe Gly Val Ser Gly Gln Leu Ile Asp Ser Met Ala Asn
1 5 10 15
Ser Phe Val Gly Thr Arg
20
41 72 DNA Artificial Sequence Synthetic Construct
41 gccggggtag tctaggggtt aggcagcgga ctgcagatcc gccttacgtg ggttcaaatc 60
ccacccccgg ct 72
42 1650 DNA Methanocaldococcus jannaschii
42 atgaaattaa aacataaaag
ggatgataaa atgagatttg atataaaaaa ggttttagag 60ttagcagaga aggattttga
gacggcatgg agagagacaa gggcattaat aaaggataaa 120catattgaca
ataaatatcc aagattaaag cctgtctatg gaaagccaca tccagtgatg
180gagacgatag agagattaag acaagcttat ctaagaatgg gatttgaaga
gatgattaat 240ccagttatcg ttgatgagat ggagatttat aagcaatttg
gaccagaagc aatggcagtt 300ttagatagat gtttttactt ggctggatta
ccaaggccag atgttggttt aggaaatgag 360aaggttgaga ttataaaaaa
tttgggcata gatatagatg aggagaaaaa agagaggttg 420agagaagttt
tacatttata caaaaaagga gctatagatg gggatgattt agtctttgag
480attgccaaag ctttaaatgt gagtaatgaa atgggattga aggttttaga
aactgcattt 540cctgaattta aagatttgaa gccagaatca acaactctaa
ctttaagaag ccacatgaca 600tctgggtggt ttataactct aagcagttta
ataaagaaga gaaaactgcc tttaaagtta 660ttctctatag atagatgttt
tagaagggag caaagagagg atagaagcca tttaatgagt 720tatcactctg
catcttgtgt agttgttggt gaagatgtta gtgtagatga tggaaaggta
780gttgctgaag gattgttggc tcaatttgga tttacaaaat ttaagtttaa
gccagatgag 840aaaaagagta agtattatac accagaaact caaacagagg
tttatgccta tcatccaaag 900ttgggagagt ggattgaagt agcaaccttt
ggagtttatt caccaattgc attagctaaa 960tataacatag atgtgccagt
tatgaacctt ggcttaggag ttgagaggtt ggcaatgatt 1020atttacggct
atgaggatgt tagggcaatg gtttatcctc aattttatga atacaggttg
1080agtgatagag atatagctgg gatgataaga gttgataaag ttcctatatt
ggatgaattc 1140tacaactttg caaatgagct tattgatata tgcatagcaa
ataaagataa ggaaagccca 1200tgttcagttg aagttaaaag ggaattcaat
ttcaatgggg agagaagagt aattaaagta 1260gaaatatttg agaatgaacc
aaataaaaag cttttaggtc cttctgtgtt aaatgaggtt 1320tatgtctatg
atggaaatat atatggcatt ccgccaacgt ttgaaggggt taaagaacag
1380tatatcccaa ttttaaagaa agctaaggaa gaaggagttt ctacaaacat
tagatacata 1440gatgggatta tctataaatt agtagctaag attgaagagg
ctttagtttc aaatgtggat 1500gaatttaagt tcagagtccc aatagttaga
agtttgagtg acataaacct aaaaattgat 1560gaattggctt taaaacagat
aatgggggag aataaggtta tagatgttag gggaccagtt 1620ttcttaaatg
caaaggttga gataaaatag 1650
43 549 PRT Methanocaldococcus jannaschii
43 Met Lys Leu Lys His Lys Arg Asp Asp Lys Met Arg Phe Asp Ile Lys1
5 10 15Lys Val Leu Glu Leu Ala Glu Lys Asp Phe Glu Thr Ala Trp Arg
Glu 20 25 30Thr Arg Ala Leu Ile Lys Asp Lys His Ile Asp Asn Lys Tyr
Pro Arg 35 40 45Leu Lys Pro Val Tyr Gly Lys Pro His Pro Val Met Glu
Thr Ile Glu 50 55 60Arg Leu Arg Gln Ala Tyr Leu Arg Met Gly Phe Glu
Glu Met Ile Asn65 70 75 80Pro Val Ile Val Asp Glu Met Glu Ile Tyr
Lys Gln Phe Gly Pro Glu 85 90 95Ala Met Ala Val Leu Asp Arg Cys Phe
Tyr Leu Ala Gly Leu Pro Arg 100 105 110Pro Asp Val Gly Leu Gly Asn
Glu Lys Val Glu Ile Ile Lys Asn Leu 115 120 125Gly Ile Asp Ile Asp
Glu Glu Lys Lys Glu Arg Leu Arg Glu Val Leu 130 135 140His Leu Tyr
Lys Lys Gly Ala Ile Asp Gly Asp Asp Leu Val Phe Glu145 150 155
160Ile Ala Lys Ala Leu Asn Val Ser Asn Glu Met Gly Leu Lys Val Leu
165 170 175Glu Thr Ala Phe Pro Glu Phe Lys Asp Leu Lys Pro Glu Ser
Thr Thr 180 185 190Leu Thr Leu Arg Ser His Met Thr Ser Gly Trp Phe
Ile Thr Leu Ser 195 200 205Ser Leu Ile Lys Lys Arg Lys Leu Pro Leu
Lys Leu Phe Ser Ile Asp 210 215 220Arg Cys Phe Arg Arg Glu Gln Arg
Glu Asp Arg Ser His Leu Met Ser225 230 235 240Tyr His Ser Ala Ser
Cys Val Val Val Gly Glu Asp Val Ser Val Asp 245 250 255Asp Gly Lys
Val Val Ala Glu Gly Leu Leu Ala Gln Phe Gly Phe Thr 260 265 270Lys
Phe Lys Phe Lys Pro Asp Glu Lys Lys Ser Lys Tyr Tyr Thr Pro 275 280
285Glu Thr Gln Thr Glu Val Tyr Ala Tyr His Pro Lys Leu Gly Glu Trp
290 295 300Ile Glu Val Ala Thr Phe Gly Val Tyr Ser Pro Ile Ala Leu
Ala Lys305 310 315 320Tyr Asn Ile Asp Val Pro Val Met Asn Leu Gly
Leu Gly Val Glu Arg 325 330 335Leu Ala Met Ile Ile Tyr Gly Tyr Glu
Asp Val Arg Ala Met Val Tyr 340 345 350Pro Gln Phe Tyr Glu Tyr Arg
Leu Ser Asp Arg Asp Ile Ala Gly Met 355 360 365Ile Arg Val Asp Lys
Val Pro Ile Leu Asp Glu Phe Tyr Asn Phe Ala 370 375 380Asn Glu Leu
Ile Asp Ile Cys Ile Ala Asn Lys Asp Lys Glu Ser Pro385 390 395
400Cys Ser Val Glu Val Lys Arg Glu Phe Asn Phe Asn Gly Glu Arg Arg
405 410 415Val Ile Lys Val Glu Ile Phe Glu Asn Glu Pro Asn Lys Lys
Leu Leu 420 425 430Gly Pro Ser Val Leu Asn Glu Val Tyr Val Tyr Asp
Gly Asn Ile Tyr 435 440 445Gly Ile Pro Pro Thr Phe Glu Gly Val Lys
Glu Gln Tyr Ile Pro Ile 450 455 460Leu Lys Lys Ala Lys Glu Glu Gly
Val Ser Thr Asn Ile Arg Tyr Ile465 470 475 480Asp Gly Ile Ile Tyr
Lys Leu Val Ala Lys Ile Glu Glu Ala Leu Val 485 490 495Ser Asn Val
Asp Glu Phe Lys Phe Arg Val Pro Ile Val Arg Ser Leu 500 505 510Ser
Asp Ile Asn Leu Lys Ile Asp Glu Leu Ala Leu Lys Gln Ile Met 515 520
525Gly Glu Asn Lys Val Ile Asp Val Arg Gly Pro Val Phe Leu Asn Ala
530 535 540Lys Val Glu Ile Lys 545
44 492 DNA Artificial Sequence Synthetic Construct
44 atggttctgt ctgaaggtga atggcagctg
gttctgcacg tttgggctaa agttgaagct 60gacgttgctg gtcacggtca ggacatcctg
atccgtctgt tcaaatctca cccggaaacc 120ctggaaaaat tcgaccgttt
caaacacctg aaaaccgaag ctgaaatgaa ggcttctgaa 180gacctgaaaa
aacacggtgt taccgttctg accgctctgg gtgctatcct gaagaaaaag
240ggtcaccacg aagctgaact gaaaccgctg gctcagtctc acgctaccaa
acacaaaatc 300ccgatcaaat acctggagtt catctctgaa gctatcatcc
acgttctgca ctctcgtcat 360ccgggtaact tcggtgctga cgctcagggt
gctatgaaca aagctctgga actgttccgt 420aaagacatcg ctgctaaata
caaagaactg ggttaccagg gtggttctgg tcatcaccat 480caccatcact aa 492
45 1614 DNA Methanococcus maripaludis
45 atgtttaaaa gagaagaaat
cattgaaatg gccaataagg actttgaaaa agcatggatc 60gaaactaaag accttataaa
agctaaaaag ataaacgaaa gttacccaag aataaaacca 120gtttttggaa
aaacacaccc tgtaaatgac actattgaaa atttaagaca ggcatatctt
180agaatgggtt ttgaagaata tataaaccca gtaattgtcg atgaaagaga
tatttataaa 240caattcggcc cagaagctat ggcagttttg gatagatgct
tttatttagc gggacttcca 300agacctgacg ttggtttgag cgatgaaaaa
atttcacaga ttgaaaaact tggaattaaa 360gtttctgagc acaaagaaag
tttacaaaaa atacttcacg gatacaaaaa aggaactctt 420gatggtgacg
atttagtttt agaaatttca aatgcacttg aaatttcaag cgagatgggt
480ttaaaaattt tagaagatgt tttcccagaa tttaaggatt taaccgcagt
ttcttcaaaa 540ttaactttaa gaagccacat gacttcagga tggttcctta
ctgtttcaga cctcatgaac 600aaaaaaccct tgccatttaa actcttttca
atcgatagat gttttagaag agaacaaaaa 660gaagataaaa gccacttaat
gacataccac tctgcatcct gtgcaattgc aggtgaaggc 720gtggatatta
atgatggaaa agcaattgca gaaggattat tatcccaatt tggctttaca
780aactttaaat tcattcctga tgaaaagaaa agtaaatact acacccctga
aacacagact 840gaagtttacg cataccaccc aaaattaaaa gaatggctcg
aagttgctac atttggagta 900tattcgccag ttgcattaag caaatacgga
atagatgtac ctgtaatgaa tttgggtctt 960ggtgttgaaa gacttgcaat
gatttctgga aatttcgcag atgttcgaga aatggtatat 1020cctcagtttt
acgaacacaa acttaatgac cggaatgtcg cttcaatggt aaaactcgat
1080aaagttccag taatggatga aatttacgat ttaacaaaag aattaattga
gtcatgtgtt 1140aaaaacaaag atttaaaatc cccttgtgaa ttagctattg
aaaaaacgtt ttcatttgga 1200aaaaccaaga aaaatgtaaa aataaacatt
tttgaaaaag aagaaggtaa aaatttactc 1260ggaccttcaa ttttaaacga
aatctacgtt tacgatggaa atgtaattgg aattcctgaa 1320agctttgacg
gagtaaaaga agaatttaaa gacttcttag aaaaaggaaa atcagaaggg
1380gtagcaacag gcattcgata tatcgatgcg ctttgcttta aaattacttc
aaaattagaa 1440gaagcatttg tgtcaaacac tactgaattc aaagttaaag
ttccaattgt cagaagttta 1500agcgacatta acttaaaaat cgatgatatc
gcattaaaac agatcatgag caaaaataaa 1560gtaatcgacg ttagaggccc
agtcttttta aatgtcgaag taaaaattga ataa 1614
46 537 PRT Methanococcus maripaludis
46 Met Phe Lys Arg Glu Glu Ile Ile Glu Met Ala Asn Lys
Asp Phe Glu1 5 10 15Lys Ala Trp Ile Glu Thr Lys Asp Leu Ile Lys Ala
Lys Lys Ile Asn 20 25 30Glu Ser Tyr Pro Arg Ile Lys Pro Val Phe Gly
Lys Thr His Pro Val 35 40 45Asn Asp Thr Ile Glu Asn Leu Arg Gln Ala
Tyr Leu Arg Met Gly Phe 50 55 60Glu Glu Tyr Ile Asn Pro Val Ile Val
Asp Glu Arg Asp Ile Tyr Lys65 70 75 80Gln Phe Gly Pro Glu Ala Met
Ala Val Leu Asp Arg Cys Phe Tyr Leu 85 90 95Ala Gly Leu Pro Arg Pro
Asp Val Gly Leu Ser Asp Glu Lys Ile Ser 100 105 110Gln Ile Glu Lys
Leu Gly Ile Lys Val Ser Glu His Lys Glu Ser Leu 115 120 125Gln Lys
Ile Leu His Gly Tyr Lys Lys Gly Thr Leu Asp Gly Asp Asp 130 135
140Leu Val Leu Glu Ile Ser Asn Ala Leu Glu Ile Ser Ser Glu Met
Gly145 150 155 160Leu Lys Ile Leu Glu Asp Val Phe Pro Glu Phe Lys
Asp Leu Thr Ala 165 170 175Val Ser Ser Lys Leu Thr Leu Arg Ser His
Met Thr Ser Gly Trp Phe 180 185 190Leu Thr Val Ser Asp Leu Met Asn
Lys Lys Pro Leu Pro Phe Lys Leu 195 200 205Phe Ser Ile Asp Arg Cys
Phe Arg Arg Glu Gln Lys Glu Asp Lys Ser 210 215 220His Leu Met Thr
Tyr His Ser Ala Ser Cys Ala Ile Ala Gly Glu Gly225 230 235 240Val
Asp Ile Asn Asp Gly Lys Ala Ile Ala Glu Gly Leu Leu Ser Gln 245 250
255Phe Gly Phe Thr Asn Phe Lys Phe Ile Pro Asp Glu Lys Lys Ser Lys
260 265 270Tyr Tyr Thr Pro Glu Thr Gln Thr Glu Val Tyr Ala Tyr His
Pro Lys 275 280 285Leu Lys Glu Trp Leu Glu Val Ala Thr Phe Gly Val
Tyr Ser Pro Val 290 295 300Ala Leu Ser Lys Tyr Gly Ile Asp Val Pro
Val Met Asn Leu Gly Leu305 310 315 320Gly Val Glu Arg Leu Ala Met
Ile Ser Gly Asn Phe Ala Asp Val Arg 325 330 335Glu Met Val Tyr Pro
Gln Phe Tyr Glu His Lys Leu Asn Asp Arg Asn 340 345 350Val Ala Ser
Met Val Lys Leu Asp Lys Val Pro Val Met Asp Glu Ile 355 360 365Tyr
Asp Leu Thr Lys Glu Leu Ile Glu Ser Cys Val Lys Asn Lys Asp 370 375
380Leu Lys Ser Pro Cys Glu Leu Ala Ile Glu Lys Thr Phe Ser Phe
Gly385 390 395 400Lys Thr Lys Lys Asn Val Lys Ile Asn Ile Phe Glu
Lys Glu Glu Gly 405 410 415Lys Asn Leu Leu Gly Pro Ser Ile Leu Asn
Glu Ile Tyr Val Tyr Asp
420 425 430Gly Asn Val Ile Gly Ile Pro Glu Ser Phe Asp Gly Val Lys
Glu Glu 435 440 445Phe Lys Asp Phe Leu Glu Lys Gly Lys Ser Glu Gly
Val Ala Thr Gly 450 455 460Ile Arg Tyr Ile Asp Ala Leu Cys Phe Lys
Ile Thr Ser Lys Leu Glu465 470 475 480Glu Ala Phe Val Ser Asn Thr
Thr Glu Phe Lys Val Lys Val Pro Ile 485 490 495Val Arg Ser Leu Ser
Asp Ile Asn Leu Lys Ile Asp Asp Ile Ala Leu 500 505 510Lys Gln Ile
Met Ser Lys Asn Lys Val Ile Asp Val Arg Gly Pro Val 515 520 525Phe
Leu Asn Val Glu Val Lys Ile Glu 530 535