Site-specific Incorporation Of Phosphoserine Into Proteins In Escherichia Coli Park; Hee-Sung ; et al. [Yale University]

Patent Application Summary

U.S. patent application number 16/746428 was filed with the patent office on 2020-01-17 and published on 2020-12-10 for site-specific incorporation of phosphoserine into proteins in Escherichia coli. The applicant listed for this patent is Yale University. Invention is credited to Hee-Sung Park, Dieter Soll.

Application Number	20200385742 16/746428
Document ID	/
Family ID	1000005034186
Filed Date	2020-01-17

United States Patent Application	20200385742
Kind Code	A1
Park; Hee-Sung ; et al.	December 10, 2020

SITE-SPECIFIC INCORPORATION OF PHOSPHOSERINE INTO PROTEINS IN ESCHERICHIA COLI

Abstract

Nucleic acids encoding mutant elongation factor proteins (EF-Sep), phosphoseryl-tRNA synthetase (SepRS), and phosphoseryl-tRNA (tRNA.sup.Sep) and methods of use in site specific incorporation of phosphoserine into a protein or polypeptide are described. Typically, SepRS preferentially aminoacylates tRNA.sup.Sep with O-phosphoserine and the tRNA.sup.Sep recognizes at least one codon such as a stop codon. Due to the negative charge of the phosphoserine, Sep-tRNA.sup.Sep does not bind elongation factor Tu (EF-Tu). However, mutant EF-Sep proteins are disclosed that bind Sep-tRNA.sup.Sep and protect Sep-tRNA.sup.Sep from deacylation. In a preferred embodiment the nucleic acids are on vectors and are expressed in cells such as bacterial cells, archaebacterial cells, and eukaryotic cells. Proteins or polypeptides containing phosphoserine produced by the methods described herein can be used for a variety of applications such as research, antibody production, protein array manufacture and development of cell-based screens for new drug discovery.

Inventors:

Park; Hee-Sung; (Daejeon, KR) ; Soll; Dieter; (Guilford, CT)

Applicant:

Name	City	State	Country	Type
Yale University	New Haven	CT	US

Family ID:

1000005034186

Appl. No.:

16/746428

Filed:

January 17, 2020

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
15439449	Feb 22, 2017	10538773
16746428
14992542	Jan 11, 2016	9580716
15439449
14795434	Jul 9, 2015	9567594
14992542
13877628	Apr 3, 2013	9090928
PCT/US2011/055414	Oct 7, 2011
14795434
61390853	Oct 7, 2010
61470332	Mar 31, 2011

Current U.S. Class:	1/1
Current CPC Class:	C12P 21/02 20130101; C12Y 601/01027 20130101; C07K 14/435 20130101; C12N 15/67 20130101; C12N 9/93 20130101; C12P 21/00 20130101; C07K 14/245 20130101
International Class:	C12N 15/67 20060101 C12N015/67; C12P 21/02 20060101 C12P021/02; C12P 21/00 20060101 C12P021/00; C12N 9/00 20060101 C12N009/00; C07K 14/245 20060101 C07K014/245; C07K 14/435 20060101 C07K014/435

Government Interests

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] This invention was made with Government Support under Agreement R01 GM022854 awarded by the National Institutes of Health and Agreement 0654283 awarded by the National Science Foundation. The Government has certain rights in the invention.

Claims

1. A method of making a target protein, comprising expressing a messenger RNA (mRNA) encoding the target protein in a system comprising: an O-phosphoseryl-tRNA synthetase (SepRS) that preferentially aminoacylates a tRNA (tRNA.sup.Sep) with phosphoserine; a tRNA.sup.Sep that can be aminoacylated with phosphoserine by SepRS to form a Sep-tRNA.sup.Sep and recognize at least one codon in the mRNA encoding the target protein; and a mutant elongation factor (EF-Sep) that binds Sep-tRNA.sup.Sep; wherein the SepRS preferentially aminoacylates the tRNA.sup.Sep with phosphoserine and the resulting Sep-tRNA.sup.Sep recognizes at least one codon such that phosphoserine is incorporated during translation to form the target protein.

2. The method of claim 1, wherein the EF-Sep comprises an amino acid sequence at least 90% identical to any one of SEQ ID NOS:1-4, wherein (i) the amino acid sequence comprises the amino acids of any one of SEQ ID NOS:1-4 at positions corresponding to amino acid number 67, 216, 217, 219, 229, and 274 of any one of SEQ ID NOS:1-4; or (ii) the amino acid sequence comprises the amino acids of any one of SEQ ID NOS:1-4 at positions corresponding to amino acid number 67, 217, 219, 229, and 274 of any one of SEQ ID NOS:1-4 and a substitution mutation corresponding to amino acid number 216 of any one of SEQ ID NOS:1-4.

3. The method of claim 1, wherein the EF-Sep comprises an amino acid binding pocket for aminoacylated tRNA, wherein the binding pocket comprises (i) the binding pocket for aminoacylated tRNA of any one of SEQ ID NOS: 1-4, or any one of SEQ ID NO:1-4; or (ii) the binding pocket for aminoacylated tRNA of SEQ ID NO:3 with a substitution at amino acid residue 216, optionally wherein the substitution is an asparagine-to-valine substitution (N216V).

4. The method of claim 1, wherein the tRNA.sup.Sep is cysteinyl-tRNA from Methanocaldococcus jannaschii.

5. The method of claim 4, wherein the tRNA.sup.Sep is encoded by the nucleic acid sequence SEQ ID NO:41.

6. The method of claim 1, wherein the SepRS is the phosphoseryl-tRNA synthetase from Methanococcus maripaludis or Methanocaldococcus jannaschii.

7. The method of claim 6, wherein the SepRS comprises an amino acid sequence at least 85% identical to SEQ ID NO:43 or 46.

8. (canceled)

9. The method of claim 1, wherein the nucleic acid encoding a gene with tRNA.sup.Sep activity and the nucleic acid encoding a gene with SepRS activity are on one or more vectors.

10. The method of claim 9, wherein the vector is an expression vector selected from the group consisting of a plasmid, a virus, a naked polynucleotide, and a conjugated polynucleotide.

11. The method of claim 9, wherein the vector is expressed in cells selected from the group consisting of bacterial cells, archaebacterial cells, and eukaryotic cells.

12. The method of claim 9, wherein the vector is expressed in an in vitro transcription/translation system.

13. The method of claim 12, wherein the vector is transcribed and translated prior to or along with nucleic acids encoding one or more proteins or polypeptides.

14. The method of claim 1, wherein the nucleic acids are expressed in an organism.

15. The method of claim 1, wherein the nucleic acids are under control of a promoter selected from the group consisting of constitutive, inducible and tissue-specific.

16. A kit for producing a target protein containing phosphoserine, comprising a polynucleotide encoding EF-Sep, a polynucleotide encoding tRNA.sup.Sep, and a polynucleotide encoding SepRS.

17. The kit of claim 16, wherein the kit further comprises phosphoserine.

18. The kit of claim 16, further comprising a host system for expressing a polynucleotide encoding the protein, the polynucleotide encoding EF-Sep, the polynucleotide encoding tRNA.sup.Sep, and the polynucleotide encoding SepRS.

19. A plurality of a target protein produced according to the method of claim 1.

20. The plurality of the target protein of claim 19 in a lysate of host cells used to produce the target protein.

21. A method comprising screening candidate drugs for activity against a protein, wherein the protein is produced according to a process comprising the method of claim 1.

Description

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit of U.S. Provisional Application No. 61/390,853, filed on Oct. 7, 2010, and U.S. Provisional Application No. 61/470,332, filed on Mar. 31, 2011.

FIELD OF THE INVENTION

[0003] The field of the present invention generally relates to methods for the site specific phosphorylation of proteins in vitro and in vivo.

BACKGROUND OF THE INVENTION

[0004] Signal transduction is any process by which a cell converts one kind of signal or stimulus into another. Processes referred to as signal transduction often involve a sequence of biochemical reactions inside the cell, which are carried out by enzymes and linked through second messengers. Signal transduction is often accomplished by the activation of enzymes that can act upon other enzymes and change their catalytic activity. This may lead to increases or decreases in the activity of certain metabolic pathways, or may even lead to large intracellular changes, for example, the initiation of specific patterns of gene expression and/or changes in cell proliferation.

[0005] The most common covalent modification used in signal transduction processes is phosphorylation, which results in the alteration of the activity of those enzymes which become phosphorylated. Phosphorylation is the addition of a phosphate (PO.sub.4) group to a protein or a small molecule. Any of several amino acids in a protein may be phosphorylated. Phosphorylation on serine is the most common, followed by threonine. Tyrosine phosphorylation is relatively rare. However, since tyrosine phosphorylated proteins are relatively easy to purify using antibodies, tyrosine phosphorylation sites are relatively well understood. Histidine and aspartate phosphorylation occurs in prokaryotes as part of two-component signaling. Other types of phosphorylation include oxidative phosphorylation. Adenosine triphosphate (ATP), the "high-energy" exchange medium in the cell, is synthesized in the mitochondrion by addition of a third phosphate group to Adenosine diphosphate (ADP) in a process referred to as oxidative phosphorylation. ATP is also synthesized by substrate level phosphorylation during glycolysis. ATP is synthesized at the expense of solar energy by photophosphorylation in the chloroplasts of plant cells.

[0006] In eukaryotes, protein phosphorylation is probably the most important regulatory event. Many enzymes and receptors are switched "on" or "off" by phosphorylation and dephosphorylation. Phosphorylation is catalyzed by enzymes known as ATP-dependent phosphotransferases which are often simply referred to as "kinases." These include, among others, protein kinases, lipid kinases, inositol kinases, non-classical protein kinases, histidine kinases, aspartyl kinases, nucleoside kinases, and polynucleotide kinases.

[0007] Phosphorylation regulates protein function, for example, by affecting conformation. This in turn regulates such processes as enzyme activity, protein-protein interactions, subcellular distribution, and stability and degradation. The stoichiometry of phosphorylation of a given site is controlled by the relative activities of a cell's repertoire of protein kinases and phosphatases. Thus phosphorylation can often generate extremely rapid and reversible changes in the activity of target proteins. The ability to assay the state of phosphorylation of specific proteins is of great utility in the quest to establish the function of a given protein. Such assays are also critical for the identification of drugs that can influence the phosphorylation, and hence the function, of specific proteins.

[0008] In general, phosphoproteins are highly unstable and difficult to produce, both in terms of specific phosphorylation of biologically relevant amino acids and subsequent purification of protein. A means to specify and drive a targeted phosphorylation event with a high degree of certainty and efficiency is needed. This is particularly important for recombinant proteins expressed in bacterial or fungal expression systems which do not phosphorylate proteins in the same way as mammalian cells.

[0009] Therefore, it is an object of the present invention to provide a method for the site specific phosphorylation of proteins.

[0010] It is further an object of the present invention to provide a method for the site specific phosphorylation of proteins in vivo.

[0011] In particular, it is an object of the present invention to provide a method for the site specific incorporation of phosphoserine into a protein.

SUMMARY OF THE INVENTION

[0012] Mutant elongation factor proteins (EF-Sep) are described for use with phosphoseryl-tRNA synthetase (SepRS) and phosphoseryl-tRNA (tRNA.sup.Sep) in site specific incorporation of phosphoserine into a protein or polypeptide. Typically, SepRS preferentially aminoacylates tRNA.sup.Sep with O-phosphoserine and the tRNA.sup.Sep recognizes at least one codon such as a stop codon. Due to the negative charge of the phosphoserine, Sep-tRNA.sup.Sep does not bind elongation factor Tu (EF-Tu). However, the disclosed EF-Sep proteins can bind Sep-tRNA.sup.Sep, protect Sep-tRNA.sup.Sep from deacylation, and catalyze the covalent transfer of the phosphoserine amino acid onto the polypeptide.

[0013] In some embodiments, EF-Sep is a mutant form of bacterial EF-Tu having a mutation at one or more of the amino acid residues corresponding to His67, Asp216, Glu217, Phe219, Thr229, and Asn274 in E. coli EF-Tu, which are located in the amino acid binding pocket for aminoacylated tRNA. In some embodiments, EF-Sep is a mutant form of eukaryotic elongation factor 1A (eEF1A) with mutations in positions equivalent to those of its bacterial counterpart. In preferred embodiments, the EF-Sep has the amino acid sequence SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, or a conservative variant thereof. Nucleic acids encoding EF-Sep are also disclosed. For example, in some embodiments, the nucleic acid sequence encoding EF-Sep has the nucleic acid sequence SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, or a conservative variant thereof.

[0014] In a preferred embodiment, "tRNA.sup.Sep" and "SepRS" refer to the cysteinyl-tRNA from Methanocaldococcus jannaschii and the phosphoseryl-tRNA synthetase from Methanococcus maripaludis, respectively, and variants thereof having conservative substitutions, additions, and/or deletions therein not affecting the structure or function. Typically, SepRS preferentially aminoacylates tRNA.sup.Sep with O-phosphoserine and the tRNA.sup.Sep recognizes at least one codon. In a preferred embodiment, the tRNA.sup.Sep recognizes a stop codon or an unconventional or non-native codon.

[0015] Methods for producing target proteins that contain at least one phosphoserine are described. The method results in proteins that have a phosphoserine incorporated into a protein in a manner indistinguishable from the phosphorylation of a serine by a kinase. Nucleic acids encoding genes with SepRS and tRNA.sup.Sep activity are provided, preferably on vectors, such as cloning vectors and expression vectors. These vectors can be in the form of a plasmid, a bacterium, a virus, a naked polynucleotide, or a conjugated polynucleotide. In one embodiment, the vectors are expressed in cells such as bacterial cells (e.g., Escherichia coli), archaebacterial cells, and eukaryotic cells (e.g., yeast cells, mammalian cells, plant cells, insect cells, fungal cells). The cells preferably lack a protein with Sep-tRNA:Cys-tRNA synthase (SepCysS) activity that converts tRNA-bound phosphoserine to cysteine. In an alternative embodiment, the vectors are expressed in an in vitro transcription/translation system. In this embodiment the vectors are transcribed and translated prior to or along with nucleic acids encoding one or more proteins or polypeptides.

[0016] In some embodiments, the target protein containing phosphoserine is produced and modified in a cell-dependent manner. This provides for the production of proteins that are stably folded, glycosylated, or otherwise modified by the cell.

[0017] Kits for producing polypeptides and/or proteins containing phosphoserine are also provided.

[0018] The proteins or polypeptides containing phosphoserine and antibodies to such polypeptides or proteins have a variety of uses including the study of kinases, phosphatases, and target proteins in signal transduction pathways, antibody production, protein array manufacture and development of cell-based screens for new drug discovery and the development of therapeutic agents, agricultural products, or peptide-based libraries such as phage display libraries.

BRIEF DESCRIPTION OF THE DRAWINGS

[0019] FIG. 1A is a diagram showing the indirect pathway for the synthesis of Cys-tRNA.sup.Cys in methanogenic archaea. FIG. 1B is a diagram showing the secondary structure of Methanocaldococcus jannaschii tRNA.sup.Cys (SEQ ID NO:41) in clover leaf form. Mutations introduced to form tRNA.sup.Sep are indicated with arrows. FIG. 1C is a graph showing percent phosphoserine (Sep) acceptance in M. jannaschii tRNA as a function of time (min) for unfractionated total tRNA from E. coli (triangle) or E. coli strains expressing tRNA.sup.Cys (closed circle) or tRNA.sup.Sep (open circle).

[0020] FIG. 2 is a graph showing chloramphenicol resistance (IC50, .mu.g/ml) for E. coli containing 1) a chloramphenicol acetyltransferase (CAT) gene with an amber stop codon (UAG) at a permissive site and 2) combinations of tRNA.sup.Sep, [SepRS or CysRS (Mmp)], SepCysS, and [EF-Sep or EF-Tu (wt)]. The suppressor tRNA.sup.Sep was coexpressed with the indicated enzymes in E. coli Top 10.DELTA.serB. Selection was carried out on LB agar plates containing 2 mM Sep and various concentrations of chloramphenicol.

[0021] FIG. 3 is a graph showing deacylation of [.sup.14C]Sep-tRNA.sup.Cys (percent Sep-tRNA.sup.Cys remaining) as a function of time following incubation in the presence of a bovine serum albumin control (open circle), wild type EF-Tu (closed circle), or EF-Sep (square).

[0022] FIG. 4 is a graph showing kinase activity (phosphate incorporation into MyBP (pmol/min)) as a function of MEK1 concentration (.mu.g/assay) for wild type (triangle) and mutant (closed and open circles) MEK1. Human MEK1 was produced as a maltose-binding protein (MBP) fusion protein in E. coli. Residues Ser218 and Ser222, which are targets of phosphorylation by MEK1 activators, were mutated either to Glu218/Glu222 (closed circle) or to Sep218/Glu222 (open circle) to produce active MEK1 variants. Various amounts of MBP-MEK1 were used to phosphorylate inactive ERK2 in vitro. ERK2 activity was then measured in a radiometric assay using [.sup.32P]-.gamma.ATP and myelin basic protein as substrates.

[0023] FIGS. 5A and 5B are graphs showing that EF-Tu protects Cys-tRNA.sup.Cys (FIG. 5A) but not Sep-tRNA.sup.Cys (FIG. 5B) from deacylation. Hydrolysis of M. jannaschii [.sup.35S]Cys-tRNA.sup.Cys or [.sup.14C]Sep-tRNA.sup.Cys was determined at pH 8.2 and room temperature in the presence or absence of EF-Tu.

DETAILED DESCRIPTION OF THE INVENTION

I. Definitions

[0024] The term "transfer RNA (tRNA)" refers to a set of genetically encoded RNAs that act during protein synthesis as adaptor molecules, matching individual amino acids to their corresponding codon on a messenger RNA (mRNA). In higher eukaryotes such as mammals, there is at least one tRNA for each of the 20 naturally occurring amino acids. The 3' end of a tRNA is aminoacylated by a tRNA synthetase so that an amino acid is attached to the 3' end of the tRNA. This amino acid is delivered to a growing polypeptide chain as the anticodon sequence of the tRNA reads a codon triplet in an mRNA.

[0025] The term "aminoacyl tRNA synthetase (AARS)" refers to an enzyme that catalyzes the esterification of a specific amino acid or its precursor to one of all its compatible cognate tRNAs to form an aminoacyl-tRNA. These charged aminoacyl tRNAs then participate in mRNA translation and protein synthesis. The AARS show high specificity for charging a specific tRNA with the appropriate amino acid. In general, there is at least one AARS for each of the twenty amino acids.

[0026] The term "tRNA.sup.Sep" refers to a tRNA that can be aminoacylated with O-phosphoserine (Sep) and recognize at least one codon such that the phosphoserine is incorporated into a protein or polypeptide. In some embodiments, the tRNA.sup.Sep is a tRNA.sup.Cys from Methanocaldococcus jannaschii containing a C20U mutation that improves aminoacylation by SepRS without affecting CysRS recognition. In some embodiments, the tRNA.sup.Sep contains an anticodon that binds a stop codon.

[0027] The term "Sep-tRNA.sup.Sep" refers to a tRNA.sup.Sep that has been aminoacylated with O-phosphoserine (Sep).

[0028] The term "O-phosphoseryl-tRNA synthetase (SepRS)" refers to an O-phosphoseryl-tRNA synthetase that preferentially aminoacylates tRNA.sup.Sep with O-phosphoserine (Sep) to form Sep-tRNA.

[0029] The term "EF-Sep" refers to a mutant elongation factor protein that binds Sep-tRNA.sup.Sep and catalyzes the covalent transfer of the phosphoserine amino acid onto the polypeptide. Due to the negative charge of the phosphoserine, Sep-tRNA.sup.Sep does not bind elongation factor Tu (EF-Tu). EF-Sep proteins can bind Sep-tRNA.sup.Sep, protect Sep-tRNA.sup.Sep from deacylation, and catalyze the covalent transfer of the phosphoserine amino acid onto the polypeptide.

[0030] As used herein "suppressor tRNA" refers to a tRNA that alters the reading of a messenger RNA (mRNA) in a given translation system. For example, a suppressor tRNA can read through a stop codon.
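The read-through behavior of a suppressor tRNA can be illustrated with a toy translation routine. The following is a minimal sketch, not the disclosed method: the four-codon table, the `translate` function, and the example mRNA are hypothetical and chosen only to show a UAG codon being read as a residue (here labeled "Sep") instead of terminating the chain.

```python
# Toy illustration only: a tiny, hypothetical codon table, not the full genetic code.
CODON_TABLE = {
    "AUG": "M", "GGU": "G", "AAA": "K", "UCU": "S",
    "UAA": "*", "UAG": "*", "UGA": "*",  # "*" marks a stop codon
}

def translate(mrna, amber_suppressor=None):
    """Translate an mRNA string codon by codon.

    If amber_suppressor is given (e.g. "Sep"), the amber codon UAG is read
    as that residue instead of terminating translation.
    """
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and amber_suppressor is not None:
            residues.append(amber_suppressor)  # read-through by the suppressor
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break  # ordinary termination
        residues.append(amino_acid)
    return residues

example_mrna = "AUGGGUUAGAAAUAA"  # AUG GGU UAG AAA UAA (hypothetical)
print(translate(example_mrna))         # chain stops at UAG
print(translate(example_mrna, "Sep"))  # UAG is read as Sep, chain stops at UAA
```

Without a suppressor the chain ends at the in-frame UAG; with one, translation continues to the next stop codon.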

[0031] The term "anticodon" refers to a unit made up of three nucleotides that correspond to the three bases of a codon on the mRNA. Each tRNA contains a specific anticodon triplet sequence that can base-pair to one or more codons for an amino acid or "stop codon." Known stop codons include but are not limited to, the three codon bases UAA (known as ochre), UAG (known as amber), and UGA (known as opal), which do not code for an amino acid but act as signals for the termination of protein synthesis.
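Codon-anticodon pairing can be made concrete with a short sketch. The code below assumes strict Watson-Crick pairing and ignores wobble pairing and modified bases; the function name `anticodon_for` is hypothetical. It reports the anticodon, written 5' to 3', that would pair with each of the three stop codons named above.

```python
# Watson-Crick complements for RNA bases (wobble pairing is ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the anticodon (5'->3') that pairs with a codon written 5'->3'."""
    # Pairing is antiparallel, so the codon is reversed before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(codon))

STOP_CODONS = {"UAA": "ochre", "UAG": "amber", "UGA": "opal"}

for codon, name in STOP_CODONS.items():
    print(f"{name}: codon {codon}, anticodon {anticodon_for(codon)}")
```

Under these assumptions, a tRNA that reads the amber codon UAG carries the anticodon CUA.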

[0032] The term "protein," "polypeptide," or "peptide" refers to a natural or synthetic molecule comprising two or more amino acids linked by the carboxyl group of one amino acid to the alpha amino group of another.

[0033] The term "residue" as used herein refers to an amino acid that is incorporated into a protein. The amino acid may be a naturally occurring amino acid and, unless otherwise limited, may encompass known analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.

[0034] The term "polynucleotide" or "nucleic acid sequence" refers to a natural or synthetic molecule comprising two or more nucleotides linked by a phosphate group at the 3' position of one nucleotide to the 5' end of another nucleotide. The polynucleotide is not limited by length, and thus the polynucleotide can include deoxyribonucleic acid (DNA) or ribonucleic acid (RNA).

[0035] The term "gene" refers to a polynucleotide that encodes a protein or functional RNA molecule.

[0036] The term "vector" or "construct" refers to a polynucleotide capable of transporting into a cell another polynucleotide to which the vector sequence has been linked. The term "expression vector" includes any vector, (e.g., a plasmid, cosmid or phage chromosome) containing a gene construct in a form suitable for expression by a cell (e.g., linked to a transcriptional control element). "Plasmid" and "vector" are used interchangeably, as a plasmid is a commonly used form of vector.

[0037] The term "operatively linked to" refers to the functional relationship of a nucleic acid with another nucleic acid sequence. Promoters, enhancers, transcriptional and translational stop sites, and other signal sequences are examples of nucleic acid sequences operatively linked to other sequences. For example, operative linkage of a gene to a transcriptional control element refers to the physical and functional relationship between the gene and promoter such that the transcription of the gene is initiated from the promoter by an RNA polymerase that specifically recognizes, binds to and transcribes the DNA.

[0038] The terms "transformation" and "transfection" refer to the introduction of a polynucleotide, e.g., an expression vector, into a recipient cell including introduction of a polynucleotide to the chromosomal DNA of the cell.

[0039] The term "variant" refers to an amino acid or nucleic acid sequence having conservative substitutions, non-conservative substitutions (i.e. a degenerate variant), substitutions within the wobble position of a codon encoding an amino acid, amino acids added to the C-terminus of a peptide, or a peptide having 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to an amino acid sequence.

[0040] The term "conservative variant" refers to a particular nucleic acid sequence that encodes identical or essentially identical amino acid sequences. Conservative substitution tables providing functionally similar amino acids are well known in the art. The following sets forth exemplary groups which contain natural amino acids that are "conservative substitutions" for one another. Conservative Substitution Groups 1 Alanine (A) Serine (S) Threonine (T); 2 Aspartic acid (D) Glutamic acid (E); 3 Asparagine (N) Glutamine (Q); 4 Arginine (R) Lysine (K); 5 Isoleucine (I) Leucine (L) Methionine (M) Valine (V); and 6 Phenylalanine (F) Tyrosine (Y) Tryptophan (W).
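The six exemplary groups listed above lend themselves to a simple lookup. The sketch below is illustrative only; the helper name `is_conservative` is hypothetical, and treating an unchanged residue as "not a substitution" is an assumption made here for clarity.

```python
# The six exemplary conservative substitution groups, in one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1: Alanine, Serine, Threonine
    set("DE"),    # 2: Aspartic acid, Glutamic acid
    set("NQ"),    # 3: Asparagine, Glutamine
    set("RK"),    # 4: Arginine, Lysine
    set("ILMV"),  # 5: Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6: Phenylalanine, Tyrosine, Tryptophan
]

def is_conservative(original, replacement):
    """True if both residues differ and fall within the same listed group."""
    if original == replacement:
        return False  # assumption: an unchanged residue is not a substitution
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)

print(is_conservative("D", "E"))  # both in group 2
print(is_conservative("N", "V"))  # groups 3 and 5, so not conservative
```

By this table, an asparagine-to-valine change such as N216V would not count as a conservative substitution.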

[0041] The term "percent (%) sequence identity" or "homology" refers to the percentage of nucleotides or amino acids in a candidate sequence that are identical with the nucleotides or amino acids in a reference nucleic acid sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. Appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared can be determined by known methods.
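As a rough illustration of the calculation, the sketch below computes percent identity for two sequences that have already been aligned, with gaps written as "-". It is not a substitute for BLAST or the other tools named above. The function name `percent_identity`, the example strings, and the choice of the full alignment length as the denominator are assumptions; alignment programs differ on that convention.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding identical, non-gap residues.

    Assumes the inputs are already aligned (equal length, gaps as `gap`)
    and uses the alignment length as the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)

# Hypothetical aligned fragments: 6 identical columns out of 8.
print(percent_identity("MKLK-HRD", "MKLKLHRE"))
```

A threshold such as "at least 90% identical" would then be a comparison of this value against 90.0, under whichever denominator convention is adopted.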

[0042] The term "translation system" refers to the components necessary to incorporate an amino acid into a growing polypeptide chain (protein). Components of a translation system generally include amino acids, ribosomes, tRNAs, synthetases, and mRNA. The components described herein can be added to a translation system, in vivo or in vitro, to incorporate phosphoserine into a protein.

[0043] The term "transgenic organism" refers to any organism, in which one or more of the cells of the organism contains heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus. Suitable transgenic organisms include, but are not limited to, bacteria, cyanobacteria, fungi, plants and animals. The nucleic acids described herein can be introduced into the host by methods known in the art, for example infection, transfection, transformation or transconjugation.

[0044] The term "eukaryote" or "eukaryotic" refers to organisms or cells or tissues derived from these organisms belonging to the phylogenetic domain Eukarya such as animals (e.g., mammals, insects, reptiles, and birds), ciliates, plants (e.g., monocots, dicots, and algae), fungi, yeasts, flagellates, microsporidia, and protists.

[0045] The term "prokaryote" or "prokaryotic" refers to organisms including, but not limited to, organisms of the Eubacteria phylogenetic domain, such as Escherichia coli, Thermus thermophilus, and Bacillus stearothermophilus, or organisms of the Archaea phylogenetic domain such as Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, and Aeropyrum pernix.

II. Compositions

A. Aminoacyl-tRNA Synthetases

[0046] A tRNA that can be aminoacylated with O-phosphoserine ("tRNA.sup.Sep") is disclosed for use in incorporating phosphoserine into a protein. The tRNA.sup.Sep recognizes at least one codon in the mRNA for the protein such that a phosphoserine can be incorporated into the protein. For example, the tRNA.sup.Sep can contain an anticodon that binds a stop codon or an unconventional or non-native codon. In some embodiments, the tRNA.sup.Sep is a tRNA.sup.Cys from an archaeon, such as Methanocaldococcus jannaschii or Methanococcus maripaludis. tRNA.sup.Cys is also found in Methanopyrus kandleri, Methanococcoides burtonii, Methanospirillum hungatei, Methanocorpusculum labreanum, Methanoregula boonei, Methanococcus aeolicus, Methanococcus vannielii, Methanosarcina mazei, Methanosarcina barkeri, Methanosarcina acetivorans, Methanosaeta thermophila, Methanoculleus marisnigri, Methanocaldococcus vulcanius, Methanocaldococcus fervens, and Methanosphaerula palustris. In preferred embodiments, the tRNA.sup.Sep contains a mutation (e.g., C20U mutation) that improves aminoacylation by SepRS without affecting CysRS recognition. In particularly preferred embodiments, the tRNA.sup.Sep is encoded by the nucleic acid sequence SEQ ID NO:41, or a conservative variant thereof.

[0047] tRNA.sup.Sep from Methanocaldococcus jannaschii (FIG. 1B):

TABLE-US-00001 (SEQ ID NO: 41) GCCGGGGTAGTCTAGGGGTTAGGCAGCGGACTGCAGATCCGCCTTACGTG GGTTCAAATCCCACCCCCGGCT

[0048] A phosphoseryl-tRNA synthetase (SepRS) that preferentially aminoacylates tRNA.sup.Sep with phosphoserine is also disclosed for use in incorporating phosphoserine into a protein. In some embodiments, the SepRS is a phosphoseryl-tRNA synthetase from an archaeon, such as Methanococcus maripaludis or Methanocaldococcus jannaschii. SepRS is also found in Methanopyrus kandleri, Methanococcoides burtonii, Methanospirillum hungatei, Methanocorpusculum labreanum, Methanoregula boonei, Methanococcus aeolicus, Methanococcus vannielii, Methanosarcina mazei, Methanosarcina barkeri, Methanosarcina acetivorans, Methanosaeta thermophila, Methanoculleus marisnigri, Methanocaldococcus vulcanius, Methanocaldococcus fervens, and Methanosphaerula palustris.

[0049] In particularly preferred embodiments, the SepRS has the amino acid sequence SEQ ID NO:43 or 46, or a conservative variant thereof. For example, the SepRS can be encoded by the nucleic acid sequence SEQ ID NO:42 or 45, or a variant thereof.

TABLE-US-00002 SepRS from Methanocaldococcus jannaschii: (SEQ ID NO: 42) ATGAAATTAAAACATAAAAGGGATGATAAAATGAGATTTGATATAAAAAA GGTTTTAGAGTTAGCAGAGAAGGATTTTGAGACGGCATGGAGAGAGACAA GGGCATTAATAAAGGATAAACATATTGACAATAAATATCCAAGATTAAAG CCTGTCTATGGAAAGCCACATCCAGTGATGGAGACGATAGAGAGATTAAG ACAAGCTTATCTAAGAATGGGATTTGAAGAGATGATTAATCCAGTTATCG TTGATGAGATGGAGATTTATAAGCAATTTGGACCAGAAGCAATGGCAGTT TTAGATAGATGTTTTTACTTGGCTGGATTACCAAGGCCAGATGTTGGTTT AGGAAATGAGAAGGTTGAGATTATAAAAAATTTGGGCATAGATATAGATG AGGAGAAAAAAGAGAGGTTGAGAAGTTTTACATTTATACAAAAAAGGAGC TATAGATGGGGATGATTTAGTCTTTGQAGATTGCCAAAGCTTTAAATGTG AGTAATGAAATGGGATTGAAGGTTTTAGAAACTGCATTTCCTGAATTTAA AGATTTGAAGCCAGAATCAACAACTCTAACTTTAAGAAGCCACATGACAT CTGGGTGGTTTATAACTCTAAGCAGTTTAATAAAGAAGAGAAAACTGCTT TAAAGTTATTCTCTATAGATAGATGTTTTAGAAGGGAGCAAAGAGAGGAT AGAAGCCATTTAATGAGTTATCACTCTGCATCTTGTGTAGTTGTTGGTGA AGATGTTAGTGTAGATGATGGAAAGGTAGTTGCTGAAGGATTGTTGGCTC AATTTGGATTTACAAAATTTAAGTTTAAGCCAGATGAGAAAAAGAGTAAG TATTATACACCAGAAACTCAAACAGAGGTTTATGCCTATCATCCAAAGTT GGGAGAGTGGATTGAAGTAGCAACCTTTGGAGTTTATTCACCAATTGCAT TAGCTAAATATAACATAGATGTGCCAGTTATGAACCTTGGCCTTAGGAGT TGAGAGGTTGGCAATGATTATTTACGGCTATGAGGATGTTAGGGCAATGG TTTATCCTCAATTTTATGAATACAGGTTGAGTGATAGAGATATAGCTGGG ATGATAAGAGTTGATAAAGTTCCTATATTGGATGAATTCTACAACTTTGC AAATGAGCTTATTGATATATGCATAGCAAATAAAGATAAGGAAAGCCCAT GTTCAGTTGAAGTTAAAAGGGAATTCAATTTCAATGGGGAGAGAAGAGTA ATTAAAGTAGAAATATTTGAGAATGAACCAAATAAAAAGCTTTTAGGTCC TTCTGTGTTAAATGAGGTTTATGTCTATGATGGAAATATATATGGCATTC CGCCAACGTTTGAAGGGGTTAAAGAACAGTATATCCCAATTTTAAAGAAA GCTAAGGAAGAAGGAGTTTCTACAAACATTAGATACATAGATGGGATTAT CTATAAATTAGTAGCTAAGATTGAAGAGGCTTTAGTTTCAAATGTGGATG AATTTAAGTTCAGAGTCCCAATAGTTAGAAGTTTGAGTGACATAAACCTA AAAATTGATGAATTGGCTTTAAAACAGATAATGGGGGAGAATAAGGTTAT AGATGTTAGGGGACCAGTTTTCTTAAATGCAAAGGTTGAGATAAAATAG; (SEQ ID NO: 43) MKLKLHRDDKMRFDIKKVLELAEKDFETAWRETRALIKDKHIDNKYPRLK PVYGKPHPVMETIERLRQAYLRMGFEEMINPVIVDEMEIYKQFGPEAMAV LDRVFYLAGLPRDVGLGNEKVEIIKNLGIDIDEEKKERLREVLHLYKKGA IDGDDLVFEIAKALNVSNEMGLKVLETAFPEFKDLKPESTTLTLRSHMTS 
GWFTTLSSLIKKRKLPLKLFSIDRCFRREQREDRSHLMSYHSASCVVVGE DVSVDDGKVVAEGLLAQFGFTKFKFKPDEKKSKYYTPETQTEVYAYHPKL GEWIEVATFGVYSPIALAKYNIDVPVMNLGLGVERLAMIIYGYEDVRAMV YPQFYEYRLSDRDIAGMIRVDKVPILDEFYNFANELIDICIANKDKESPC SVEVKREFNFNGERRVIKVEIFENEPNKKLLGPSVLNEVYVYDGNIYGIP PTFEGVKEQYIPILKKAKEEGVSTNIRYIDGIIYKLVAKIEEALVSNVDE FKFRVPIVRSLSDINLKIDELALKQIMGENKVIDVRGPVFLNAKVEIK. SepRS from Methanococcus maripaludis: (SEQ ID NO: 45) ATGTTTAAAAGAGAAGAAATCATTGAAATGGCCAATAAGGACTTTGAAAA AGCATGGATCGAAACTAAAGACCTTATAAAAGCTAAAAAGATAAACGAAA GTTACCCAAGAATAAAACCAGTTTTGGAAAAACACACCCTGTAAATGACA CTATTGAAAATTTAAGACAGGCATATCTTAGAATGGGTTTTGAAGAATAT AAACCCAGTAATTGTCGATGAAAGAGATATTTATAAACAATTCGGCCCAG AAGCTATGGCAGTTTTGGATAGATGCTTTTATTTAGCGGGACTTCCAAGA CCTGACGTTGGTTTGAGCGATGAAAAAATTTCACAGATTGAAAAACTTGG AATTAAAGTTTCTGAGCACAAAGAAAGTTTACAAAAAATACTTCACGGAT ACAAAAAAGGAACTCTTGATGGTGACGATTTAGTTTTAGAAATTTCAAAT GCACTTGAAATTTCAAGCGAGATGGGTTTAAAAATTTTAGAAGATGTTTT CCCAGAATTTAAGGATTTAACCGCAGTTTCTTCAAAATTAACTTTAAGAA GCCACATGACTTCAGGATGGTTCCTTACTGTTTCAGACCTCATGAACAAA AAACCCTTGCCATTTAAACTCTTTTCAATCGATAGATGTTTTAGAAGAGA ACAAAAAGAAGATAAAAGCCACTTAATGACATACCACTCTGCATCCTGTG CAATTGCAGGTGAAGGCGTGGATATTAATGATGGAAAAGCAATTGCAGAA GGATTATTATCCCAATTTGGCTTTACAAACTTTAAATTCATTCCTGATGA AAAGAAAAGTAAATACTACACCCCTGAAACACAGACTGAAGTTTACGCAT ACCACCCAAAATTAAAAGAATGGCTCGAAGTTGCTACATTTGGAGTATAT TCGCCAGTTGCATTAAGCAAATACGGAATAGATGTACCTGTAATGAATTT GGGTCTTGGTGTTGAAAGACTTGCAATGATTTCTGGAAATTTCGCAGATG TTCGAGAAATGGTATATCCTCAGTTTTACGAACACAAACTTAATGACCGG AATGTCGCTTCAATGGTAAAACTCGATAAAGTTCCAGTAATGGATGAAAT TTACGATTTAACAAAAGAATTAATTGAGTCATGTGTTAAAAACAAAGATT TAAAATCCCCTTGTGAATTAGCTATTGAAAAAACGTTTTCATTTGGAAAA ACCAAGAAAAATGTAAAAATAAACATTTTTGAAAAAGAAGAAGGTAAAAA TTTACTCGGACCTTCAATTTTAAACGAAATCTACGTTTACGATGGAAATG TAATTGGAATTCCTGAAAGCTTTGACGGAGTAAAAGAAGAATTTAAAGAC TTCTTAGAAAAAGGAAAATCAGAAGGGGTAGCAACAGGCATTCGATATAT CGATGCGCTTTGCTTTAAAATTACTTCAAAATTAGAAGAAGCATTTGTGT CAAACACTACTGAATTCAAAGTTAAAGTTCCAATTGTCAGAAGTTTAAGC GACATTAACTTAAAAATCGATGATATCGCATTAAAACAGATCATGAGCAA 
AAATAAAGTAATCGACGTTAGAGGCCCAGTCTTTTAAATGTCGAAGTAAA AATTGAATAA; (SEQ ID NO: 46) MFKREEIIEMANKDFEKAWIETKDLIKAKKINESYPRIKPVFGKTHPVND TIENLRQAYLRMGFEEYINPVIVDERDIYKQFGPEAMAVLDRCFYLAGLP RPDVGLSDEKISQIEKLGIKVSEHKESLQKILHGYKKGTLDGDDLVLEIS NALEISSEMGLKILEDVFPEFKDLTAVSSKLTLRSHMTSGWFLTVSDLMN KKPLPFKLFSIDRCFRREQKEDKSHLMTYHSASCAIAGEGVDINGDKAIA EGLLSQFGFTNFKFIPDEKKSKYYTPETQTEVYAYHPKLKEWLEVATFGV YSPVALSKYGIDVPVMNLGLGVERLAMISGNFADVREMVYPQFYEHKLND RNVASMVKLDKVPVMDEIYDLTKELIESCVKNKDLKSPCELAIEKTFSFG KTKKVKINIFEKEEGKNLLGPSILNEIYVYDGNVIGIPESFDGVKEEFKD FLEKGKSEGVATGIRYIDALVGKITSKLEEAFVSNTTEFKVKVPIVRSLS DINLKIDDIALKQIMSKNKVIDVRGPVFLNVEVKIE.

B. Elongation Factor Proteins

[0050] Nucleic acid sequences encoding mutant elongation factor proteins (EF-Sep) are described for use with phosphoseryl-tRNA synthetase (SepRS) and phosphoseryl-tRNA (tRNA.sup.Sep) in site specific incorporation of phosphoserine into a protein or polypeptide. Typically, SepRS preferentially aminoacylates tRNA.sup.Sep with O-phosphoserine and the tRNA.sup.Sep recognizes at least one codon such as a stop codon. Due to the negative charge of the phosphoserine, Sep-tRNA.sup.Sep does not bind elongation factor Tu (EF-Tu). However, the disclosed EF-Sep proteins can bind Sep-tRNA.sup.Sep and protect Sep-tRNA.sup.Sep from deacylation.

[0051] In some embodiments, EF-Sep is a mutant form of bacterial EF-Tu having a mutation at one or more of amino acid residues corresponding to His67, Asp216, Glu217, Phe219, Thr229, and Asn274 in E. coli EF-Tu, which are located in the amino acid binding pocket for aminoacylated tRNA. In some embodiments, EF-Sep is a mutant form of eukaryotic elongation factor 1A (eEF1A) with mutations at positions equivalent to those of the bacterial counterpart.
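As an illustration of how residue-level mutations of this kind can be specified and checked, the following Python sketch applies single-residue substitutions, written as original residue, 1-based position, and replacement residue, to a protein sequence. The function name, the toy sequence, and the example substitutions are hypothetical and serve only to show the bookkeeping; they are not the E. coli EF-Tu sequence or any substitution disclosed herein.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Return a copy of `sequence` with 1-based point substitutions applied.

    Each substitution uses one-letter codes, e.g. "H3R" means that the
    His at position 3 is replaced by Arg (hypothetical example).
    """
    residues = list(sequence)
    for sub in substitutions:
        parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if parsed is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        original = parsed.group(1)
        position = int(parsed.group(2))
        replacement = parsed.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        # Guard against numbering errors: the named residue must be present.
        if sequence[position - 1] != original:
            raise ValueError(
                f"expected {original} at position {position}, "
                f"found {sequence[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


toy_sequence = "MSHEDF"  # hypothetical six-residue sequence
print(apply_substitutions(toy_sequence, ["H3R", "E4N"]))  # MSRNDF
```

Checking the original residue before replacing it catches off-by-one numbering errors, which matter when positions are defined by correspondence to a reference sequence such as E. coli EF-Tu.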

[0052] In preferred embodiments, EF-Sep has the amino acid sequence SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, or a conservative variant thereof. For example, in some embodiments, the nucleic acid sequence encoding EF-Sep has the nucleic acid sequence SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, or a conservative variant thereof.

C. Variants

[0053] Also disclosed are variants of the disclosed proteins and polynucleotides that include conservative substitutions, additions, and deletions therein not affecting the structure or function. For example, biologically active sequence variants of tRNA.sup.Sep, SepRS, and EF-Sep and in vitro generated covalent derivatives of tRNA.sup.Sep, SepRS, and EF-Sep that demonstrate tRNA.sup.Sep, SepRS, and EF-Sep activity are disclosed.

[0054] Various types of mutagenesis can be used to modify a nucleic acid. They include, but are not limited to, site-directed, random point mutagenesis, homologous recombination (DNA shuffling), mutagenesis using uracil containing templates, oligonucleotide-directed mutagenesis, phosphorothioate-modified DNA mutagenesis, and mutagenesis using methods such as gapped duplex DNA. Additional suitable methods include point mismatch repair, mutagenesis using repair-deficient host strains, restriction-selection and restriction-purification, deletion mutagenesis, mutagenesis by total gene synthesis and double-strand break repair.

[0055] Sequence variants of tRNA.sup.Sep, SepRS, and EF-Sep fall into one or more of three classes: substitutional, insertional and/or deletional variants. Sequence variants of tRNA.sup.Sep include nucleotide variants, while sequence variants of SepRS and EF-Sep include nucleotide and/or amino acid variants. Insertions include amino and/or carboxyl terminal fusions as well as intrasequence insertions of single or multiple residues. tRNA.sup.Sep, SepRS, and EF-Sep include, for example, hybrids of mature tRNA.sup.Sep, SepRS, and EF-Sep with nucleotides or polypeptides that are homologous with tRNA.sup.Sep, SepRS, and EF-Sep. tRNA.sup.Sep, SepRS, and EF-Sep also include hybrids of tRNA.sup.Sep, SepRS, and EF-Sep with nucleotides or polypeptides homologous to the host cell but not to tRNA.sup.Sep, SepRS, and EF-Sep, as well as nucleotides or polypeptides heterologous to both the host cell and tRNA.sup.Sep, SepRS, and EF-Sep. Fusions include amino or carboxy terminal fusions with either prokaryotic nucleotides or peptides or signal peptides of prokaryotic, yeast, viral or host cell signal sequences.

[0056] Insertions can also be introduced within the mature coding sequence of tRNA.sup.Sep, SepRS, and EF-Sep. These, however, ordinarily will be smaller insertions than those of amino or carboxyl terminal fusions, on the order of one to four residues. Insertional sequence variants of tRNA.sup.Sep, SepRS, and EF-Sep are those in which one or more residues are introduced into a predetermined site in the target tRNA.sup.Sep, SepRS, and EF-Sep.

[0057] Deletion variants are characterized by the removal of one or more nucleotides or amino acid residues from the tRNA.sup.Sep, SepRS, and EF-Sep sequence. For SepRS and EF-Sep, deletions or substitutions of cysteine or other labile residues may be desirable, for example in increasing the oxidative stability or selecting the preferred disulfide bond arrangement of SepRS or EF-Sep. Deletions or substitutions of potential proteolysis sites, e.g. Arg Arg, are accomplished, for example, by deleting one of the basic residues or substituting one by glutaminyl or histidyl residues. Variants ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA encoding the tRNA.sup.Sep, SepRS, and EF-Sep, thereby producing DNA encoding the variant, and thereafter expressing the DNA in recombinant cell culture. Variant tRNA.sup.Sep, SepRS, and EF-Sep fragments may also be prepared by in vitro synthesis. The variants typically exhibit the same qualitative biological activity as the naturally-occurring analogue, although variants also are selected in order to modify the characteristics of tRNA.sup.Sep, SepRS, and EF-Sep.

[0058] Substitutional variants are those in which at least one residue in the sequence has been removed and a different residue inserted in its place. Owing to the degeneracy of the genetic code, "silent substitutions" (i.e., substitutions in a nucleic acid sequence which do not result in an alteration in an encoded polypeptide) are an implied feature of every nucleic acid sequence which encodes an amino acid. Similarly, conservative amino acid substitutions are also readily identified. Such conservative variations are a feature of each disclosed sequence. The substitutions which in general are expected to produce the greatest changes in SepRS or EF-Sep protein properties are those in which (a) a hydrophilic residue, e.g. seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
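The notion of a silent substitution can be illustrated with a short Python sketch that builds the standard genetic code and tests whether two codons encode the same amino acid. The helper names are hypothetical, and the sketch assumes the standard code in DNA notation; it is offered only as an illustration, not as a tool used in the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_silent(original_codon, variant_codon):
    """True if both codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[original_codon.upper()] == CODON_TABLE[variant_codon.upper()]


def synonymous_codons(codon):
    """All codons, including `codon` itself, that encode the same residue."""
    target = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target)


print(is_silent("TCT", "AGC"))   # True: both encode serine
print(is_silent("GAA", "GAT"))   # False: Glu -> Asp changes the residue
print(synonymous_codons("GAA"))  # ['GAA', 'GAG']
```

The Glu to Asp change in the second call is not silent, although it would commonly be regarded as a conservative amino acid substitution.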

[0059] While the site for introducing a nucleotide or amino acid sequence variation is predetermined, the mutation per se need not be predetermined. For example, in order to optimize the performance of a mutation at a given site, random mutagenesis may be conducted at the target codon or region and the expressed tRNA.sup.Sep, SepRS, and EF-Sep variants screened for the optimal combination of desired activity. Techniques for making substitution mutations at predetermined sites in DNA having a known sequence are well known.

[0060] Substitutions are typically of single residues; insertions usually will be on the order of about from 1 to 10 residues; and deletions will range about from 1 to 30 residues. Substitutions, deletions, insertions or any combination thereof may be combined to arrive at a final construct. The mutations that will be made in the DNA encoding the variant SepRS and EF-Sep must not place the sequence out of reading frame and preferably will not create complementary regions that could produce secondary mRNA structure.
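The reading-frame requirement can be checked with a minimal Python sketch, shown here under the assumption that the variant differs from the original coding sequence by a single insertion or deletion. The function name and the toy sequences are hypothetical.

```python
def preserves_reading_frame(original_cds, variant_cds):
    """True if the net length change is a whole number of codons.

    For a single insertion or deletion this means the downstream
    reading frame is unchanged.
    """
    return (len(variant_cds) - len(original_cds)) % 3 == 0


original = "ATGAAACGTTAA"              # hypothetical 4-codon coding sequence
in_frame = "ATGAAAGCTCGTTAA"           # one codon (GCT) inserted
frameshift = "ATGAACGTTAA"             # one base deleted

print(preserves_reading_frame(original, in_frame))    # True
print(preserves_reading_frame(original, frameshift))  # False
```

This length test is necessary but not sufficient when several insertions and deletions are combined, because two compensating frameshifts would pass it while still altering the residues between them.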

[0061] A DNA isolate is understood to mean chemically synthesized DNA, cDNA or genomic DNA with or without the 3' and/or 5' flanking regions. DNA encoding tRNA.sup.Sep, SepRS, and EF-Sep can be obtained from other sources than Methanocaldococcus jannaschii by screening a cDNA library from cells containing mRNA using hybridization with labeled DNA encoding Methanocaldococcus jannaschii tRNA.sup.Sep, SepRS, and EF-Sep, or fragments thereof (usually, greater than 10 bp).

[0062] The precise percentage of similarity between sequences that is useful in establishing sequence identity varies with the nucleic acid and protein at issue, but as little as 25% sequence similarity is routinely used to establish sequence identity. Higher levels of sequence similarity, e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or 99% or more can also be used to establish sequence identity. Methods for determining sequence similarity percentages (e.g., BLASTP and BLASTN using default parameters) are generally available.

[0063] Alignment of sequences for comparison can be conducted by many well-known methods in the art, for example, by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), by the Gibbs sampling method (Chatterji and Pachter, J Comput Biol. 12(6):599-608 (2005)), by PSI-BLAST-ISS (Margelevicius and Venclovas, BMC Bioinformatics 21; 6:185 (2005)), or by visual inspection. One algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov).
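For illustration, a global alignment in the style of Needleman & Wunsch, followed by a percent identity calculation, can be sketched in Python as shown below. The scoring values (match +1, mismatch -1, linear gap -1) and the choice to divide by the full alignment length are assumptions made for this example; the packaged implementations named above use substitution matrices and affine gap penalties and will generally give different alignments.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of alignment length."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


aligned = needleman_wunsch("ACGT", "ACT")
print(aligned)                     # ('ACGT', 'AC-T')
print(percent_identity(*aligned))  # 75.0
```

Other denominators are also in use, such as the length of the shorter sequence or the number of aligned (non-gap) columns, so the convention should be stated whenever a percent identity is reported.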

[0064] In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5787 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
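The tiered thresholds described above can be expressed as a small Python sketch. The function name and the tier labels are hypothetical, and the sketch treats the "about" thresholds as exact cutoffs, which is a simplifying assumption.

```python
def similarity_tier(smallest_sum_probability):
    """Map a BLAST smallest sum probability P(N) to the tiers in the text."""
    p = smallest_sum_probability
    if p < 0.001:
        return "similar (most preferred)"
    if p < 0.01:
        return "similar (more preferred)"
    if p < 0.1:
        return "similar"
    return "not considered similar"


for p_n in (0.5, 0.05, 0.005, 1e-20):
    print(p_n, "->", similarity_tier(p_n))
```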

D. Expression or Translation Systems

[0065] Also disclosed are expression or translation systems for incorporating phosphoserine into a growing polypeptide chain (protein). Components of a translation system generally include amino acids, ribosomes, tRNAs, synthetases, and mRNA. The disclosed tRNA.sup.Sep, SepRS, and EF-Sep can be added to a translation system, in vivo or in vitro, to incorporate phosphoserine into a protein.

[0066] In some embodiments, a cell-based (in vivo) expression system is used. In these embodiments, nucleic acids encoding one or more of tRNA.sup.Sep, SepRS, and EF-Sep are delivered to cells under conditions suitable for translation and/or transcription of tRNA.sup.Sep, SepRS, EF-Sep, or a combination thereof. The cells can in some embodiments be prokaryotic, e.g., an E. coli cell, or eukaryotic, e.g., a yeast, mammalian, plant, or insect cell.

[0067] In some embodiments, a cell-free (in vitro) expression system is used. The most frequently used cell-free translation systems involve extracts containing all the macromolecular components (70 S or 80 S ribosomes, tRNAs, aminoacyl-tRNA synthetases, initiation, elongation and termination factors, etc.) required for translation of exogenous RNA. To ensure efficient translation, each extract is supplemented with amino acids, energy sources (ATP, GTP), energy regenerating systems (creatine phosphate and creatine phosphokinase for eukaryotic systems, and phosphoenol pyruvate and pyruvate kinase for the E. coli lysate), and other co-factors (Mg.sup.2+, K.sup.+, etc.).

i) Promoters and Enhancers

[0068] Nucleic acids that are delivered to cells typically contain expression controlling systems. For example, the inserted genes in viral and retroviral systems usually contain promoters, and/or enhancers to help control the expression of the desired gene product. A promoter is generally a sequence or sequences of DNA that function when in a relatively fixed location in regard to the transcription start site. A promoter contains core elements required for basic interaction of RNA polymerase and transcription factors, and may contain upstream elements and response elements.

[0069] Therefore, also disclosed is a polynucleotide encoding one or more of tRNA.sup.Sep, SepRS, and EF-Sep, operably linked to an expression control sequence.

[0070] Suitable promoters are generally obtained from viral genomes (e.g., polyoma, Simian Virus 40 (SV40), adenovirus, retroviruses, hepatitis-B virus, and cytomegalovirus) or heterologous mammalian genes (e.g. beta actin promoter). Enhancer generally refers to a sequence of DNA that functions at no fixed distance from the transcription start site and can be either 5' or 3' to the transcription unit. Furthermore, enhancers can be within an intron as well as within the coding sequence itself. They are usually between 10 and 300 bp in length, and they function in cis. Enhancers function to increase transcription from nearby promoters. Enhancers also often contain response elements that mediate the regulation of transcription. Many enhancer sequences are now known from mammalian genes (globin, elastase, albumin, .alpha.-fetoprotein and insulin). However, enhancers from a eukaryotic cell virus are preferably used for general expression. Suitable examples include the SV40 enhancer on the late side of the replication origin, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

[0071] In certain embodiments the promoter and/or enhancer region can act as a constitutive promoter and/or enhancer to maximize expression of the region of the transcription unit to be transcribed. In certain constructs the promoter and/or enhancer region is active in all eukaryotic cell types, even if it is only expressed in a particular type of cell at a particular time. A preferred promoter of this type is the CMV promoter. In other embodiments, the promoter and/or enhancer is tissue or cell specific.

[0072] In certain embodiments the promoter and/or enhancer region is inducible. Induction can occur, e.g., as the result of a physiologic response, a response to outside signals, or as the result of artificial manipulation. Such promoters are well known to those of skill in the art. For example, in some embodiments, the promoter and/or enhancer may be specifically activated either by light or specific chemical events which trigger their function. Systems can be regulated by reagents such as tetracycline and dexamethasone. There are also ways to enhance viral vector gene expression by exposure to irradiation, such as gamma irradiation, or alkylating chemotherapy drugs.

[0073] Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human or nucleated cells) may also contain sequences necessary for the termination of transcription which may affect mRNA expression. These regions are transcribed as polyadenylated segments in the untranslated portion of the mRNA encoding tissue factor protein. The 3' untranslated regions also include transcription termination sites. It is preferred that the transcription unit also contains a polyadenylation region. One benefit of this region is that it increases the likelihood that the transcribed unit will be processed and transported like mRNA. The identification and use of polyadenylation signals in expression constructs is well established. It is preferred that homologous polyadenylation signals be used in the transgene constructs.

ii) Cell Delivery Systems

[0074] There are a number of compositions and methods which can be used to deliver nucleic acids to cells, either in vitro or in vivo. These methods and compositions can largely be broken down into two classes: viral based delivery systems and non-viral based delivery systems. For example, nucleic acids can be delivered through a number of direct delivery systems such as electroporation, lipofection, calcium phosphate precipitation, plasmids, viral vectors, viral nucleic acids, phage nucleic acids, phages, cosmids, or via transfer of genetic material in cells or carriers such as cationic liposomes. Appropriate means for transfection, including viral vectors, chemical transfectants, or physico-mechanical methods such as electroporation and direct diffusion of DNA, are well known in the art and readily adaptable for use with the compositions and methods described herein.

[0075] Transfer vectors can be any nucleotide construction used to deliver genetic material into cells. In some embodiments the vectors are derived from either a virus or a retrovirus. Viral vectors include, for example, Adenovirus, Adeno-associated virus, Herpes virus, Vaccinia virus, Polio virus, AIDS virus, neuronal trophic virus, Sindbis and other RNA viruses, including these viruses with the HIV backbone.

[0076] Typically, viral vectors contain nonstructural early genes, structural late genes, an RNA polymerase III transcript, inverted terminal repeats necessary for replication and encapsidation, and promoters to control the transcription and replication of the viral genome. When engineered as vectors, viruses typically have one or more of the early genes removed and a gene or gene/promoter cassette is inserted into the viral genome in place of the removed viral DNA. The necessary functions of the removed early genes are typically supplied by cell lines which have been engineered to express the gene products of the early genes in trans.

[0077] Nucleic acids can also be delivered through electroporation, sonoporation, lipofection, or calcium phosphate precipitation. Lipofection involves the use of liposomes, including cationic liposomes (e.g., DOTMA, DOPE, DC-cholesterol) and anionic liposomes, to deliver genetic material to a cell. Commercially available liposome preparations include LIPOFECTIN, LIPOFECTAMINE (GIBCO-BRL, Inc., Gaithersburg, Md.), SUPERFECT (Qiagen, Inc., Hilden, Germany), and TRANSFECTAM (Promega Biotec, Inc., Madison, Wis.).

[0078] Nucleic acids that are delivered to cells and are to be integrated into the host cell genome typically contain integration sequences. These sequences are often viral related sequences, particularly when viral based systems are used. These viral integration systems can also be incorporated into nucleic acids which are to be delivered using a non-nucleic acid based system of delivery, such as a liposome, so that the nucleic acid contained in the delivery system can become integrated into the host genome. Techniques for integration of genetic material into a host genome are also known and include, for example, systems designed to promote homologous recombination with the host genome. These systems typically rely on sequence flanking the nucleic acid to be expressed that has enough homology with a target sequence within the host cell genome that recombination between the vector nucleic acid and the target nucleic acid takes place, causing the delivered nucleic acid to be integrated into the host genome. These systems and the methods necessary to promote homologous recombination are known to those of skill in the art.

iii) Markers

[0079] The vectors used to deliver the disclosed nucleic acids to cells can further include nucleic acid sequence encoding a marker product. This marker product is used to determine if the gene has been delivered to the cell and once delivered is being expressed. In some embodiments the marker is a detectable label. Exemplary labels include the E. coli lacZ gene, which encodes .beta.-galactosidase, and green fluorescent protein (GFP).

[0080] In some embodiments the marker may be a selectable marker. Examples of suitable selectable markers for mammalian cells are dihydrofolate reductase (DHFR), thymidine kinase, neomycin, neomycin analog G418, hygromycin, and puromycin. When such selectable markers are successfully transferred into a mammalian host cell, the transformed mammalian host cell can survive if placed under selective pressure. There are two widely used distinct categories of selective regimes. The first category is based on a cell's metabolism and the use of a mutant cell line which lacks the ability to grow independent of a supplemented media. The second category is dominant selection which refers to a selection scheme used in any cell type and does not require the use of a mutant cell line. These schemes typically use a drug to arrest growth of a host cell. Those cells which have a novel gene would express a protein conveying drug resistance and would survive the selection.

III. Methods

A. Site-Specific Phosphorylation of Proteins

[0081] Methods for incorporating phosphoserine into polypeptides are disclosed. The method involves the use of tRNA.sup.Sep, SepRS, and EF-Sep in the translation process for a target polypeptide from mRNA. SepRS preferentially aminoacylates tRNA.sup.Sep with O-phosphoserine. The resulting Sep-tRNA.sup.Sep recognizes at least one codon in the mRNA for the target protein, such as a stop codon. EF-Sep mediates the entry of the Sep-tRNA.sup.Sep into a free site of the ribosome. If the codon-anticodon pairing is correct, EF-Sep hydrolyzes guanosine triphosphate (GTP) into guanosine diphosphate (GDP) and inorganic phosphate, and changes in conformation to dissociate from the tRNA molecule. The Sep-tRNA.sup.Sep then fully enters the A site, where its amino acid is brought near the P site's polypeptide and the ribosome catalyzes the covalent transfer of the amino acid onto the polypeptide.

[0082] In preferred embodiments, the tRNA.sup.Sep is a tRNA.sup.Cys from a methanogenic archaea, such as Methanocaldococcus jannaschii, containing a mutation (e.g., C20U) that improves aminoacylation of the tRNA by SepRS without affecting CysRS recognition. In some embodiments, the tRNA.sup.Sep contains an anticodon that binds a codon other than a Cys codon, such as a stop codon. In some embodiments, the tRNA.sup.Sep is encoded by the nucleic acid sequence SEQ ID NO:41, or a conservative variant thereof.
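The effect of reassigning a stop codon can be illustrated with a short Python sketch that translates a coding sequence with and without a recoded codon. The choice of the amber codon (UAG) as the recoded stop codon, the function name, and the toy sequence are assumptions made for this example; the text above requires only that the tRNA.sup.Sep recognize at least one codon, such as a stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence, recoded=None):
    """Translate DNA or RNA, reading any codon in `recoded` as the given residue.

    Translation stops at the first stop codon that has not been recoded.
    Returns a list of residue labels so that "Sep" can appear as one item.
    """
    recoded = recoded or {}
    seq = coding_sequence.upper().replace("U", "T")
    residues = []
    for start in range(0, len(seq) - 2, 3):
        codon = seq[start:start + 3]
        if codon in recoded:
            residues.append(recoded[codon])
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return residues


mrna = "AUGUAGAAAUAA"  # hypothetical message: Met, amber codon, Lys, stop
print(translate(mrna))                            # ['M']
print(translate(mrna, recoded={"TAG": "Sep"}))    # ['M', 'Sep', 'K']
```

Without recoding, translation of the toy message terminates at the amber codon; with recoding, phosphoserine is placed at that position and translation continues to the next stop codon.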

[0083] In some embodiments, the SepRS is any tRNA synthetase that preferentially aminoacylates tRNA.sup.Sep with a phosphoserine. In preferred embodiments, the SepRS is a tRNA synthetase from a methanogenic archaea, such as Methanococcus maripaludis or Methanocaldococcus jannaschii. In some embodiments, the SepRS has the amino acid sequence SEQ ID NO:43 or 46, or a conservative variant thereof.

[0084] In some embodiments, the EF-Sep is any elongation factor protein that binds Sep-tRNA.sup.Sep and catalyzes the covalent transfer of the phosphoserine amino acid onto the polypeptide. EF-Sep proteins can bind Sep-tRNA.sup.Sep and can preferably protect Sep-tRNA.sup.Sep from deacylation. In some embodiments, EF-Sep is a mutant form of bacterial EF-Tu having a mutation at one or more of amino acid residues corresponding to His67, Asp216, Glu217, Phe219, Thr229, and Asn274 in E. coli EF-Tu, which are located in the amino acid binding pocket for aminoacylated tRNA. In some embodiments, EF-Sep has the amino acid sequence SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, or a conservative variant thereof. In some embodiments, EF-Sep is a mutant form of eukaryotic elongation factor 1A (eEF1A) with mutations at positions equivalent to those of the bacterial counterpart.

i) In Vitro Transcription/Translation

[0085] In one embodiment, the nucleic acids encoding tRNA.sup.Sep and SepRS activity are synthesized prior to translation of the target protein and are used to incorporate phosphoserine into a target protein in a cell-free (in vitro) protein synthesis system.

[0086] In vitro protein synthesis systems involve the use of crude extracts containing all the macromolecular components (70 S or 80 S ribosomes, tRNAs, aminoacyl-tRNA synthetases, initiation, elongation and termination factors, etc.) required for translation of exogenous RNA. For the current method, the tRNAs, aminoacyl-tRNA synthetases, and elongation factors in the crude extract are supplemented with tRNA.sup.Sep, SepRS, and EF-Sep. To ensure efficient translation, each extract must be supplemented with amino acids, energy sources (ATP, GTP), energy regenerating systems (creatine phosphate and creatine phosphokinase for eukaryotic systems, and phosphoenol pyruvate and pyruvate kinase for the E. coli lysate), and other co-factors (Mg.sup.2+, K.sup.+, etc.).

[0087] In vitro protein synthesis does not depend on having a polyadenylated RNA, but if having a poly(A) tail is essential for some other purpose, a vector may be used that has a stretch of about 100 A residues incorporated into the polylinker region. That way, the poly(A) tail is "built in" by the synthetic method. In addition, eukaryotic ribosomes read RNAs that have a 5' methyl guanosine cap more efficiently. RNA caps can be incorporated by initiation of transcription using a capped base analogue, or adding a cap in a separate in vitro reaction post-transcriptionally.

[0088] Suitable in vitro transcription/translation systems include, but are not limited to, the rabbit reticulocyte system, the E. coli S-30 transcription-translation system, and the wheat germ based translational system. Combined transcription/translation systems are available, in which both phage RNA polymerases (such as T7 or SP6) and eukaryotic ribosomes are present. One example of a kit is the TNT.RTM. system from Promega Corporation.

ii) In Vivo Methods

[0089] Host cells and organisms can also incorporate phosphoserine into proteins or polypeptides via nucleic acids encoding tRNA.sup.Sep, SepRS, and EF-Sep. Nucleic acids encoding tRNA.sup.Sep, SepRS, and EF-Sep, operably linked to one or more expression control sequences, are introduced into cells or organisms using a cell delivery system. These cells also contain a gene encoding the target protein operably linked to an expression control sequence.

[0090] Suitable organisms include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems.

[0091] It will be understood by one of ordinary skill in the art that regardless of the system used (i.e. in vitro or in vivo), expression of genes encoding tRNA.sup.Sep, SepRS, and EF-Sep activity will result in site specific incorporation of phosphoserine into the target polypeptides or proteins that are translated in the system. Host cells are genetically engineered (e.g., transformed, transduced or transfected) with the vectors encoding tRNA.sup.Sep, SepRS, and EF-Sep, which can be, for example, a cloning vector or an expression vector. The vector can be, for example, in the form of a plasmid, a bacterium, a virus, a naked polynucleotide, or a conjugated polynucleotide. The vectors are introduced into cells and/or microorganisms by standard methods including electroporation, infection by viral vectors, high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface. Such vectors can optionally contain one or more promoters. A "promoter" as used herein is a DNA regulatory region capable of initiating transcription of a gene of interest.

[0092] Kits are commercially available for the purification of plasmids from bacteria (see, e.g., GFX.TM. Micro Plasmid Prep Kit from GE Healthcare; Strataprep.RTM. Plasmid Miniprep Kit and StrataPrep.RTM. EF Plasmid Midiprep Kit from Stratagene; GenElute.TM. HP Plasmid Midiprep and Maxiprep Kits from Sigma-Aldrich; and Qiagen plasmid prep kits and QIAfilter.TM. kits from Qiagen). The isolated and purified plasmids are then further manipulated to produce other plasmids, used to transfect cells or incorporated into related vectors to infect organisms. Typical vectors contain transcription and translation terminators, transcription and translation initiation sequences, and promoters useful for regulation of the expression of the particular target nucleic acid. The vectors optionally comprise generic expression cassettes containing at least one independent terminator sequence, sequences permitting replication of the cassette in eukaryotes, or prokaryotes, or both (e.g., shuttle vectors), and selection markers for both prokaryotic and eukaryotic systems.

[0093] Prokaryotes useful as host cells include, but are not limited to, gram negative or gram positive organisms such as E. coli or Bacilli. In a prokaryotic host cell, a polypeptide may include an N-terminal methionine residue to facilitate expression of the recombinant polypeptide in the prokaryotic host cell. The N-terminal Met may be cleaved from the expressed recombinant polypeptide. Promoter sequences commonly used for recombinant prokaryotic host cell expression vectors include the .beta.-lactamase and lactose promoter systems.

[0094] Expression vectors for use in prokaryotic host cells generally comprise one or more phenotypic selectable marker genes. A phenotypic selectable marker gene is, for example, a gene encoding a protein that confers antibiotic resistance or that supplies an auxotrophic requirement. Examples of useful expression vectors for prokaryotic host cells include those derived from commercially available plasmids such as the cloning vector pBR322 (ATCC 37017). pBR322 contains genes for ampicillin and tetracycline resistance and thus provides simple means for identifying transformed cells. To construct an expression vector using pBR322, an appropriate promoter and a DNA sequence are inserted into the pBR322 vector. Other commercially available vectors include, for example, T7 expression vectors from Invitrogen, pET vectors from Novagen and pALTER.RTM. vectors and PinPoint.RTM. vectors from Promega Corporation.

[0095] Yeasts useful as host cells include, but are not limited to, those from the genus Saccharomyces, Pichia, K. Actinomycetes and Kluyveromyces. Yeast vectors will often contain an origin of replication sequence, an autonomously replicating sequence (ARS), a promoter region, sequences for polyadenylation, sequences for transcription termination, and a selectable marker gene. Suitable promoter sequences for yeast vectors include, among others, promoters for metallothionein, 3-phosphoglycerate kinase (Hitzeman et al., Biol. Chem. 255:2073, (1980)) or other glycolytic enzymes (Holland et al., Biochem. 17:4900, (1978)) such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. Other suitable vectors and promoters for use in yeast expression are further described in Fleer et al., Gene, 107:285-195 (1991), in Li, et al., Lett Appl Microbiol. 40(5):347-52 (2005), Jansen, et al., Gene 344:43-51 (2005) and Daly and Hearn, J. Mol. Recognit. 18(2):119-38 (2005). Other suitable promoters and vectors for yeast and yeast transformation protocols are well known in the art.

[0096] Mammalian or insect host cell culture systems well known in the art can also be employed to express recombinant tRNA.sup.Sep, SepRS, and EF-Sep for producing proteins or polypeptides containing phosphoserine. Commonly used promoter sequences and enhancer sequences are derived from Polyoma virus, Adenovirus 2, Simian Virus 40 (SV40), and human cytomegalovirus. DNA sequences derived from the SV40 viral genome may be used to provide other genetic elements for expression of a structural gene sequence in a mammalian host cell, e.g., SV40 origin, early and late promoter, enhancer, splice, and polyadenylation sites. Viral early and late promoters are particularly useful because both are easily obtained from a viral genome as a fragment which may also contain a viral origin of replication. Exemplary expression vectors for use in mammalian host cells are well known in the art.

B. Purifying Proteins Containing Phosphoserine

[0097] Proteins or polypeptides containing phosphoserine can be purified, either partially or substantially to homogeneity, according to standard procedures known to and used by those of skill in the art including, but not limited to, ammonium sulfate or ethanol precipitation, acid or base extraction, column chromatography, affinity column chromatography, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, hydroxylapatite chromatography, lectin chromatography, and gel electrophoresis. Protein refolding steps can be used, as desired, in making correctly folded mature proteins. High performance liquid chromatography (HPLC), affinity chromatography or other suitable methods can be employed in final purification steps where high purity is desired. In one embodiment, antibodies made against proteins containing phosphoserine are used as purification reagents, e.g., for affinity-based purification of proteins containing phosphoserine. Once purified, partially or to homogeneity, as desired, the polypeptides may be used as assay components, therapeutic reagents, immunogens for antibody production, etc.

[0098] Those of skill in the art will recognize that, after synthesis, expression and/or purification, proteins can possess conformations different from the desired conformations of the relevant polypeptides. For example, polypeptides produced by prokaryotic systems often are optimized by exposure to chaotropic agents to achieve proper folding. During purification from lysates derived from E. coli, the expressed protein is optionally denatured and then renatured. This is accomplished by solubilizing the proteins in a chaotropic agent such as guanidine HCl.

[0099] It is occasionally desirable to denature and reduce expressed polypeptides and then to cause the polypeptides to re-fold into the preferred conformation. For example, guanidine, urea, DTT, DTE, and/or a chaperonin can be added to a translation product of interest. Methods of reducing, denaturing and renaturing proteins are well known to those of skill in the art. Refolding reagents can be flowed or otherwise moved into contact with the one or more polypeptide or other expression product, or vice-versa.

C. Using Phosphoserine Containing Peptides

[0100] Proteins or polypeptides containing phosphoserine and antibodies that bind to such proteins produced by the methods described herein can be used for research involving phosphoproteins such as the study of kinases, phosphatases, and target proteins in signal transduction pathways. Proteins or polypeptides containing phosphoserine produced by the methods described herein can also be used for antibody production, protein array manufacture and development of cell-based screens for new drug discovery.

IV. Kits

[0101] Kits for producing polypeptides and/or proteins containing phosphoserine are also provided. For example, a kit for producing a protein that contains phosphoserine in a cell is provided, where the kit includes a polynucleotide sequence encoding tRNA.sup.Sep, a polynucleotide sequence encoding SepRS, and a polynucleotide sequence encoding EF-Sep. In one embodiment, the kit further includes phosphoserine. In another embodiment, the kit further comprises instructional materials for producing the protein. In another embodiment, a kit for producing a protein that contains phosphoserine in vitro is provided, where the kit includes a polynucleotide sequence encoding tRNA.sup.Sep, a polynucleotide sequence encoding SepRS, a polynucleotide sequence encoding EF-Sep, and phosphoserine. In another embodiment, the kit further comprises instructional materials for producing the protein in vitro.

[0102] The present invention will be further understood by reference to the following non-limiting examples.

EXAMPLES

Example 1: SepRS and tRNA.sup.Sep are an Orthogonal Pair in E. coli

Materials and Methods

Constructions of Strains

[0103] To prevent possible enzymatic dephosphorylation of O-phospho-L-serine (Sep) in vivo, the gene encoding phosphoserine phosphatase (serB), which catalyzes the last step in serine biosynthesis, was deleted from Escherichia coli strains Top 10 and BL21. Markerless gene deletions were carried out using a .lamda.-red and FLP recombinase-based gene knockout strategy as described by Datsenko K A, et al. Proc Natl Acad Sci USA. 97:6640 (2000). E. coli strains Top 10.DELTA.serB and BL21.DELTA.serB were used for EF-Tu library construction and MEK1 expression experiments.

Construction of Plasmids

[0104] To construct plasmid pSepT, the full-length gene encoding tRNA.sup.Sep was constructed from overlapping oligonucleotides and ligated immediately downstream of the lpp promoter in pTECH (Bunjun S, et al. Proc Natl Acad Sci USA. 97:12997 (2000)) using EcoRI and BamHI restriction sites. pCysT, encoding the wild type tRNA.sup.Cys gene from Methanocaldococcus jannaschii was constructed in the same way.

[0105] The gene fragment encoding .beta.-lactamase was PCR-amplified from plasmid pUC18 using primers PBLAF (5'-TGC GCA ATG CGG CCG CCC GTA GCG CCG ATG GTA GTG T-3', SEQ ID NO:9) and PBLAR (5'-ACA CGG AGA TCT CTA AAG TAT ATA TGA GTA AAC-3', SEQ ID NO:10), and ligated with a NotI and BglII digested PCR product which was constructed from pSepT using primers PSEPF (5'-TGC GCA ATG CGG CCG CCC GGG TCG AAT TTG CTT TCG A-3', SEQ ID NO:11) and PSEPR (5'-ACA CGG AGA TCT ATG CCC CGC GCC CAC CGG AAG-3', SEQ ID NO:12).
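The primer design in the preceding paragraph can be sanity-checked computationally. The following is a minimal Python sketch, not part of the original disclosure, that scans the four primers quoted above for the NotI (GCGGCCGC) and BglII (AGATCT) recognition sequences used for the ligation. The function name `find_sites` and the dictionary layout are illustrative assumptions; the primer sequences are copied from the text with spaces removed.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"NotI": "GCGGCCGC", "BglII": "AGATCT"}

# Primer sequences from paragraph [0105], written 5'->3' without spaces.
PRIMERS = {
    "PBLAF": "TGCGCAATGCGGCCGCCCGTAGCGCCGATGGTAGTGT",
    "PBLAR": "ACACGGAGATCTCTAAAGTATATATGAGTAAAC",
    "PSEPF": "TGCGCAATGCGGCCGCCCGGGTCGAATTTGCTTTCGA",
    "PSEPR": "ACACGGAGATCTATGCCCCGCGCCCACCGGAAG",
}


def find_sites(primer):
    """Return {enzyme: 0-based offset} for every recognition site found in the primer."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

In this sketch each forward primer carries only a NotI site and each reverse primer only a BglII site, which is consistent with directional ligation of the two NotI/BglII-digested PCR products.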

[0106] pKD was derived from pKK223-3 (Pharmacia). The ampicillin resistance gene was replaced with a kanamycin resistance gene by combining two PCR products generated from pKK223-3 and pET28a. The following PCR primers were used: PKF (5'-TGC AGCA ATG CGG CCG CTT TCA CCG TCA TCA CCG AAA C-3', SEQ ID NO:13) and PKR (5'-GGG ACG CTA GCA AAC AAA AAG AGT TTG TAG AA-3', SEQ ID NO:14) for pKK223-3 amplification and PKNF (5'-GGG ACG CTA GCT TTT CTC TGG TCC CGC CGC AT-3', SEQ ID NO:15) and PKNR (5'-TGC GCA ATG CGG CCG CGG TGG CAC TTT TCG GGG AAA T-3', SEQ ID NO:16) for Kan.sup.R gene amplification.

[0107] The original multiple cloning site (MCS; NcoI-EcoRI-SacI) was modified by adding an additional ribosome binding site (RBS) and a second MCS (NdeI-BamHI-SalI-HindIII), thus enabling simultaneous protein expression from two genes, both under the control of the same tac promoter. The Methanococcus maripaludis SepRS gene was cloned into pKD using NcoI and SacI sites to produce pKD-SepRS. The E. coli EF-Tu gene (tufB) was ligated into pKD-SepRS using BamHI and SalI sites resulting in pKD-SepRS-EFTu. The M. maripaludis pscS gene encoding SepCysS was cloned into pKD-SepRS using BamHI and SalI sites to produce pKD-SepRS-SepCysS and the M. maripaludis CysRS gene was cloned into pKD using EcoRI and SalI to yield pKD-CysRS.

[0108] pCAT112TAG-SepT was created from pACYC184. The gene encoding chloramphenicol acetyltransferase (CAT) was modified by quickchange mutagenesis to introduce an amber stop codon at position Asp112 (Wang L, et al. Science 292:498 (2001)). The resulting plasmid was PCR amplified using primers PBLAF (5'-TGC GCA ATG CGG CCG CCC GTA GCG CCG ATG GTA GTG T-3', SEQ ID NO:9) and PBLAR (5'-ACA CGG AGA TCT CTA AAG TAT ATA TGA GTA AAC-3', SEQ ID NO:10) and ligated with a PCR product containing a tRNA.sup.Sep expression cassette from pSepT, created with primers TSEPF (5'-GCA TGC GCC GCC AGC TGT TGC CCG TCT CGC-3', SEQ ID NO:17) and TSEPR (5'-GCA TAG ATC TTC AGC TGG CGA AAG GGG GAT G-3', SEQ ID NO:18).

[0109] Plasmid pCcdB was created by adding a ccdB gene under the control of a lac promoter into pTECH using NotI and BglII sites (Wang L, et al. Science 292:498 (2001)). Two amber stop codons were introduced at positions 13 and 44 based on the crystal structure and mutagenesis study of the CcdB protein (Bernard, P., et al. Gene 148:71 (1994); Bajaj K. et al. Proc Natl Acad Sci USA 102:16221 (2005)).

[0110] Plasmid pL11C-SepT encodes tRNA.sup.Sep and the C-terminal domain of the ribosomal protein L11 under control of lpp promoters. Part of the rplK gene was PCR amplified from genomic E. coli DNA using primers L11C-F (5'-GGA ATT CCA TAT GAC CAA GAC CCC GCC GGC AGC AGT T-3', SEQ ID NO:38) and L11C-R (5'-AGG CGC GCC TTA GTC CTC CAC TAC-3', SEQ ID NO:39). The PCR product was digested with NdeI and AscI and was ligated into NdeI and AscI digested pMYO127TAG-SepT to replace the myoglobin gene.

[0111] To construct pMAL-EFTu and pMAL-EFSep, E. coli tufB, or the gene encoding EF-Sep, respectively, was cloned between the NdeI and BamHI sites in the pET20b plasmid (Novagen) to add a C-terminal His.sub.6 tag. This fusion construct was then PCR-amplified using primers adding MfeI and PstI restriction sites. The PCR product was cloned in-frame between EcoRI and PstI in pMAL-c2x (New England Biolabs) to add an N-terminal maltose binding protein (MBP) tag.

Aminoacylation of tRNA and EF-Sep Binding Assays

[0112] In vitro transcript of Methanocaldococcus jannaschii tRNA.sup.Cys was prepared and acylated with [.sup.14C]Sep (55 mCi/mmol) using recombinant Methanococcus maripaludis SepRS as described previously by Hohn M J, et al. Proc Natl Acad Sci USA. 2006 Nov. 28; 103(48):18095-100. Sep-tRNA.sup.Cys was phenol/chloroform extracted, and the aqueous phase was passed over Sephadex G25 Microspin columns (GE Healthcare) equilibrated with water.

[0113] Protection of Sep-tRNA.sup.Cys by EF-Tu was assayed as described earlier with slight modifications (Ling J, et al. RNA. 2007 November; 13(11):1881-6). Briefly, EF-Tu or EF-Sep (both purified as maltose binding protein fusion proteins) were activated for 20 min. at 37.degree.C. in buffer containing 100 mM Tris-HCl (pH 8.2), 120 mM NH.sub.4Cl, 7 mM MgCl.sub.2, 5 mM DTT, 5 mM phosphoenolpyruvate, 1.5 mM GTP, and 0.12 .mu.g/.mu.l pyruvate kinase. Hydrolysis of 2 .mu.M [.sup.14C]Sep-tRNA.sup.Cys was then monitored at 25.degree.C. in the presence of 40 .mu.M EF-Tu (wt), EF-Sep, or BSA, respectively. Aliquots were taken from the reaction mix at indicated time points and spotted on 3 MM filter discs presoaked with 10% trichloroacetic acid. Filters were washed with 5% trichloroacetic acid, dried, and radioactivity was measured by liquid scintillation counting.
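One way the filter-binding counts from such a time course could be reduced to a single number is sketched below. This is an illustrative Python example only: it assumes that deacylation follows first-order kinetics, which the text does not state, and the function name `fit_first_order` and all numerical values are hypothetical. A smaller fitted rate constant in the presence of an elongation factor than in the BSA control would correspond to protection.

```python
import math


def fit_first_order(times_min, counts):
    """Least-squares fit of ln(counts) = ln(c0) - k*t.

    Returns (k, half_life) with k in 1/min. Assumes first-order decay
    of acid-precipitable radioactivity (an assumption, not from the text).
    """
    ys = [math.log(c) for c in counts]
    n = len(times_min)
    t_mean = sum(times_min) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_min, ys))
    k = -sxy / sxx
    return k, math.log(2) / k


if __name__ == "__main__":
    # Hypothetical, noise-free time course generated with k = 0.05 per minute.
    times = [0, 5, 10, 20, 30]
    cpm = [10000 * math.exp(-0.05 * t) for t in times]
    k, t_half = fit_first_order(times, cpm)
    print(f"k = {k:.3f} per min, half-life = {t_half:.1f} min")
```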

Results

[0114] The Sep-insertion strategy was based on the discovery that most methanogens form Cys-tRNA.sup.Cys by an unusual pathway required for cysteine synthesis in these archaea (Sauerwald A. et al. Science 307, 1969 (2005)). In this route (FIG. 1A), tRNA.sup.Cys first becomes acylated with O-phosphoserine (Sep) by O-phosphoseryl-tRNA synthetase (SepRS), an unusual aminoacyl-tRNA synthetase specific solely for the substrates Sep and tRNA.sup.Cys (Hohn, M J., et al. Proc Natl Acad Sci USA 103, 18095 (2006)). The resulting product Sep-tRNA.sup.Cys is then converted to Cys-tRNA.sup.Cys by the enzyme SepCysS in the presence of a sulfur-donor (Sauerwald A. et al., Science 307, 1969 (2005)). The exclusive recognition of Sep by SepRS was further confirmed by the structural elucidation of this enzyme and by the biochemical analysis of its catalytic site (Kamtekar S. et al., Proc Natl Acad Sci USA 104, 2620 (2007); Fukunaga, R. et al., Nat Struct Mol Biol 14, 272 (2007)). The molecular basis of Methanocaldococcus jannaschii (Mj) tRNA.sup.Cys recognition by SepRS and CysRS from Methanococcus maripaludis (Mmp) was also explored, yielding the SepRS-specific tRNA identity elements (Hohn, M J., et al., Proc Natl Acad Sci USA 103, 18095 (2006)). Based on these results it was decided to test the applicability of Mj tRNA.sup.Cys and Mmp SepRS as an orthogonal pair for UAG-directed translational incorporation of Sep into proteins expressed in Escherichia coli. A scheme was sought for co-translational insertion of phosphoserine (Sep) into proteins in E. coli in response to the amber codon UAG. Methanogens utilize an aminoacyl-tRNA synthetase (SepRS) that acylates tRNA.sup.Cys with Sep during the biosynthesis of Cys-tRNA.sup.Cys (FIG. 1A).

[0115] A tRNA (tRNA.sup.Sep) was designed that could be aminoacylated with phosphoserine (FIG. 1B). tRNA.sup.Sep is a tRNA derived from Mj tRNA.sup.Cys containing a C20U change that improves 2.5-fold the aminoacylation by SepRS without affecting CysRS recognition. In addition, tRNA.sup.Sep was modified to be an amber suppressor by including two mutations in the anticodon (FIG. 1B).

[0116] Both tRNA.sup.Sep and tRNA.sup.Cys were overexpressed in E. coli. In vitro aminoacylation by Mmp SepRS showed (FIG. 1C) that the anticodon change lowered (to about 40%) the activity of tRNA.sup.Sep when compared to tRNA.sup.Cys. Total E. coli tRNA could not be charged with Sep (FIG. 1C). Based on these in vitro data, Mj tRNA.sup.Sep and Mmp SepRS appear to be an orthogonal pair.

[0117] Efficient and selective addition of Sep to the E. coli genetic repertoire requires exclusive interaction of SepRS with tRNA.sup.Sep for Sep-tRNA.sup.Sep formation without interfering with the host translation system, as well as a sufficient intracellular concentration of Sep. As E. coli has a Sep-compatible transporter (Wanner, B L. FEMS Microbiol Lett 79, 133 (1992)), Sep (2 mM) was added to the growth medium, and the endogenous serB gene encoding phosphoserine phosphatase was deleted in the E. coli test strain. To assess whether the Mj tRNA.sup.Sep/Mmp SepRS pair is functional and orthogonal in E. coli, a suppression assay was performed that employed a chloramphenicol acetyltransferase (CAT) gene with a UAG stop codon at the permissive position 112 (wild-type amino acid: Asp) to produce chloramphenicol (Cm) acetyltransferase; cell survival was then measured in the presence of Sep and varying amounts of Cm. The different IC.sub.50 values (FIG. 2) relate to suppression efficiency (i.e., amount of CAT made dependent on the various transformed genes). When only tRNA.sup.Sep is expressed (FIG. 2, second bar) Cm resistance increases about 3.3-fold over background (FIG. 2, first bar). Thus, tRNA.sup.Sep can be aminoacylated to a certain degree by an unknown E. coli aminoacyl-tRNA synthetase (Gln is being incorporated at the amber stop codon). In contrast, simultaneous expression of tRNA.sup.Sep and SepRS does not provide Cm resistance (FIG. 2, third bar). This may indicate that SepRS can out-compete any endogenous aminoacyl-tRNA synthetase and form Sep-tRNA.sup.Sep; however, this aminoacyl-tRNA is not delivered to the ribosome or not accommodated on the ribosome. Providing additional EF-Tu does not improve the result (FIG. 2, fifth bar). Co-expression of tRNA.sup.Sep, SepRS and SepCysS should result in formation of Sep-tRNA.sup.Sep and subsequent SepCysS-mediated conversion to Cys-tRNA.sup.Sep (A. Sauerwald et al., Science 307, 1969 (2005)). Indeed, a 2.3-fold increase in Cm resistance is observed (FIG. 2, sixth bar). This further supports the notion that while Sep-tRNA.sup.Sep is synthesized, it cannot be used properly by the E. coli protein biosynthesis machinery. On the other hand, co-expression of tRNA.sup.Sep and Mmp CysRS generates a 12.3-fold increase in Cm resistance (FIG. 2, eighth bar), demonstrating that Cys-tRNA.sup.Sep can be readily used for amber codon suppression in the CAT gene.

[0118] Given that EF-Tu is a component of quality control in protein synthesis (LaRiviere, F J., et al. Science 294, 165 (2001)), it is highly plausible that Sep-tRNA.sup.Sep may be rejected by EF-Tu in order not to interfere with the complicated cellular mechanism of phosphoprotein production. Chemically synthesized Sep-tRNA.sup.Gln was a poor substrate for in vitro protein synthesis (Rothman D M. et al., J Am Chem Soc 127, 846 (2005)). tRNAs carrying negatively charged amino acids are bound poorly by EF-Tu (Dale, T., et al. Biochemistry 43, 6159 (2004)), and molecular dynamics simulations suggested that Sep-tRNA.sup.Cys may not be bound by EF-Tu (Eargle, J., et al. J Mol Biol 377, 1382 (2008)). This assumption was tested in EF-Tu mediated Sep-tRNA hydrolysis protection experiments (J. Ling et al., Proc Natl Acad Sci USA 104, 15299 (2007)), in which recombinant E. coli EF-Tu was incubated with the Mj tRNA.sup.Cys in vitro transcript acylated with either [.sup.35S]Cys or [.sup.14C]Sep at pH 8.2. While EF-Tu protected [.sup.35S]Cys-tRNA.sup.Cys from deacylation (FIG. 5A), Sep-tRNA.sup.Cys was significantly deacylated irrespective of the presence of EF-Tu (FIG. 3 and FIG. 5B). Thus, insufficient binding of Sep-tRNA.sup.Sep to EF-Tu may explain the lack of Sep insertion into protein.

Example 2: Development of EF-Sep

Materials and Methods

Library Construction and Selection of Sep-tRNA Specific EF-Tu

[0119] Six residues, His67, Asp216, Glu217, Phe219, Thr229, and Asn274, located in the amino acid binding pocket of the E. coli elongation factor EF-Tu were selected for randomization based on the crystal structure of the E. coli EF-Tu:Phe-tRNA.sup.Phe complex (Protein Data Bank accession number 1OB2). Multiple rounds of overlap PCR were carried out to incorporate random codons (NNK) at these positions by using the following primers described in Park H-S et al., Science 311:535-538 (2006):

TABLE-US-00003
67XF (SEQ ID NO: 19) 5'-GT ATC ACC ATC AAC ACT TCT NNK GTT GAA TAC GAC ACC CCG-3';
H67R (SEQ ID NO: 20) 5'-AGA AGT GTT GAT GGT GAT AC-3';
216XF (SEQ ID NO: 21) 5'-CCG TTC CTG CTG CCG ATC NNK NNK GTA NNK TCC ATC TCC GGT CGT GGT-3';
216R (SEQ ID NO: 22) 5'-GAT CGG CAG CAG GAA CGG-3';
229XF (SEQ ID NO: 23) 5'-GGT CGT GGT ACC GTT GTT NNK GGT CGT GTA GAA CGC GG-3';
229R (SEQ ID NO: 24) 5'-AAC AAC GGT ACC ACG ACC-3';
274XF (SEQ ID NO: 25) 5'-GAA GGC CGT GCT GGT GAG NNK GTA GGT GTT CTG CTG CG-3'; and
274R (SEQ ID NO: 26) 5'-CTC ACC AGC ACG GCC TTC-3'.
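For overlap PCR, each reverse primer is expected to anneal to the fixed 5' arm of its partner forward primer. The Python sketch below, which is an illustration and not part of the original disclosure, checks that relationship for the four primer pairs listed above and counts the NNK codons they introduce. The helper names `reverse_complement` and `overlap_ok` are illustrative assumptions; the sequences are copied from the table with spaces removed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# (forward mutagenic primer, reverse primer), both written 5'->3'.
PAIRS = {
    "67": ("GTATCACCATCAACACTTCTNNKGTTGAATACGACACCCCG", "AGAAGTGTTGATGGTGATAC"),
    "216": ("CCGTTCCTGCTGCCGATCNNKNNKGTANNKTCCATCTCCGGTCGTGGT", "GATCGGCAGCAGGAACGG"),
    "229": ("GGTCGTGGTACCGTTGTTNNKGGTCGTGTAGAACGCGG", "AACAACGGTACCACGACC"),
    "274": ("GAAGGCCGTGCTGGTGAGNNKGTAGGTGTTCTGCTGCG", "CTCACCAGCACGGCCTTC"),
}


def overlap_ok(forward, reverse):
    """True if the reverse primer is the reverse complement of the forward primer's 5' arm."""
    arm = forward[: len(reverse)]
    return reverse_complement(reverse) == arm


if __name__ == "__main__":
    for name, (fwd, rev) in PAIRS.items():
        print(name, "overlap:", overlap_ok(fwd, rev), "NNK codons:", fwd.count("NNK"))
```

In this sketch all four reverse primers match the 5' arm of their forward partners, and the forward primers together carry six NNK codons, one for each of the six randomized residues.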

[0120] The final PCR products were purified and digested with BamHI and SalI, and ligated into pKD-SepRS to generate the EF-Tu library. The ligated vectors were transformed into E. coli Top10.DELTA.serB containing pCAT112-SepT to generate a library of 3.times.10.sup.8 mutants. The unbiased mutation of the library was confirmed by selecting twenty random clones and sequencing each mutant tufB insert.
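The reported library size can be put in context by comparing it with the theoretical diversity of six NNK codons. The following Python sketch is illustrative only; it uses the standard genetic code and assumes uniform, independent sampling of codon combinations (a Poisson approximation that is not stated in the text). The variable names are hypothetical.

```python
import math
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}

# NNK: N = any base, K = G or T.
NNK_CODONS = [a + b + c for a, b, c in product("ACGT", "ACGT", "GT")]
amino_acids = {CODON_TABLE[c] for c in NNK_CODONS} - {"*"}
stop_codons = [c for c in NNK_CODONS if CODON_TABLE[c] == "*"]

POSITIONS = 6            # six randomized residues, from the text
LIBRARY_SIZE = 3e8       # reported number of transformants, from the text

codon_space = len(NNK_CODONS) ** POSITIONS
protein_space = len(amino_acids) ** POSITIONS

# Assumption: each transformant is an independent uniform draw from codon space.
expected_fraction = 1 - math.exp(-LIBRARY_SIZE / codon_space)

if __name__ == "__main__":
    print("NNK codons:", len(NNK_CODONS), "amino acids:", len(amino_acids), "stops:", stop_codons)
    print("codon combinations:", codon_space, "protein variants:", protein_space)
    print(f"expected fraction of codon combinations sampled: {expected_fraction:.2f}")
```

Under these assumptions the 3.times.10.sup.8 transformants would sample roughly a quarter of the possible codon combinations, so the library is large but not exhaustive at the DNA level.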

[0121] The mutant EF-Tu library was subjected to a first round of selection, in which clones suppressing the amber stop codon in the CAT gene can survive on LB plates supplemented with 10 mg/ml tetracycline (Tc), 25 mg/ml Kan, 50 mg/ml chloramphenicol (Cm), 2 mM Sep, and 0.05 mM isopropyl-.beta.-D-thiogalactopyranoside (IPTG). After 48 h incubation at 30.degree.C., a pool of 10.sup.4 colonies was collected from the plates for plasmid preparation. The pKD-SepRS-EFTu plasmids were separated from the reporter plasmid by agarose gel electrophoresis and isolated using the Qiagen gel purification kit.

[0122] There is a possibility that mutations in the amino acid binding site of EF-Tu could induce incorporation of natural amino acids in response to the amber codon in the CAT gene, resulting in false positive clones. To select against these EF-Tu mutants, the pKD-SepRS-EFTu plasmids from the first positive selection were transformed into E. coli Top10.DELTA.serB harboring pCcdB. The cells were plated onto LB agar supplemented with 25 mg/ml Kan, 25 mg/ml Cm, and 0.1 mM IPTG. After 48 h incubation at 30.degree.C., twenty individual clones were picked and subjected to plasmid purification to isolate pKD-SepRS-EFTu as described above. The EF-Tu mutant genes were digested from the plasmid and recloned into pKD-SepRS.

[0123] Resulting pKD-SepRS-EFTu plasmids were transformed into E. coli Top10.DELTA.serB containing pCAT112-SepT for a third round of selection which was carried out under the same conditions as the first. This time, individual colonies were isolated from agar plates and clones were tested for their ability to grow on Cm over a concentration range from 5 to 100 mg/ml. Total plasmid was purified from isolates showing strong Cm resistance, and pKD-SepRS-EFTu plasmids were subjected to sequencing.

[0124] To confirm that the observed Cm resistance is dependent on the presence of both mutant EF-Tu and SepRS, EF-Tu mutant genes were excised from their plasmids, recloned into pKD, and retransformed into E. coli Top10.DELTA.serB containing pCAT112-SepT. Cells were then tested for Cm resistance as described above.

Expression and Purification of M. maripaludis SepRS and CysRS

[0125] SepRS and CysRS were produced in E. coli and purified as described by Hohn, M. J. et al., Proc Natl Acad Sci USA 103, 18095 (2006).

Expression and Purification of EF-Tu and EF-Sep

[0126] pMAL-EFTu or pMAL-EFSep were transformed into E. coli BL21 (DE3) codon plus (Stratagene). A pre-culture was used to inoculate 1000 ml of LB broth with 100 .mu.g/ml of Amp, 34 .mu.g/ml Cm, 5052 solution, and phosphate buffer for autoinduction as described by Studier, F W. Protein Expr Purif 41, 207 (2005). The cells were grown for 6 h at 37.degree.C. and continued at 20.degree.C. for 18 h.

[0127] The cells were pelleted and lysed by shaking for 20 min. in BugBuster (Novagen) reagent supplemented with 50 mM Tris-HCl (pH 7.6), 60 mM NH.sub.4Cl, 7 mM MgCl.sub.2, 14.3 mM 2-mercaptoethanol, 50 .mu.M GDP, 10% glycerol, 25 U ml.sup.-1 Benzonase, 1 mg ml.sup.-1 lysozyme, and protease inhibitor cocktail (Roche).

[0128] The extract was clarified by ultracentrifugation and applied to a Ni.sup.2+-NTA resin (Qiagen) and purified according to the manufacturer's instructions.

[0129] The eluted enzymes were dialyzed into 20 mM Hepes-KOH (pH 7.0), 40 mM KCl, 1 mM MgCl.sub.2, 5 mM DTT, 50 .mu.M GDP, and 30% glycerol. SDS-PAGE electrophoresis followed by staining with Coomassie blue revealed greater than 95% purity.

Results

[0130] Guided by the structure of the E. coli EF-Tu:Phe-tRNA.sup.Phe complex (P. Nissen et al., Science 270, 1464 (1995)), it was decided to randomize certain positions in the amino acid binding pocket to evolve EF-Tu variants that bind Sep-tRNA and promote its delivery to the ribosome. Six residues (His67, Asp216, Glu217, Phe219, Thr229, and Asn274) were selected for complete randomization, generating a library of 3.times.10.sup.8 EF-Tu mutants. To select in vivo variants that permit Sep incorporation in the presence of SepRS and tRNA.sup.Sep, three rounds of selections (positive, negative, positive) were performed that yielded several clones with the desired phenotype. One clone, designated EF-Sep, was tested further in detail. While the combination of SepRS and EF-Sep was not active in the CAT suppression assay (FIG. 2, lane G), the further inclusion of tRNA.sup.Sep led to a 10-fold increase in Cm resistance (FIG. 2, lane H). Thus, it appeared that EF-Sep could bind Sep-tRNA.sup.Sep, a fact that was ascertained in the hydrolysis protection assay (FIG. 3). The DNA sequence of the EF-Sep gene revealed the nature of the mutations.

[0131] EF-Tu mutants (EF-Sep) that could bind Sep-tRNA include those having the following amino acid sequences:

TABLE-US-00004 (EFSep-M6, SEQ ID NO: 1) MSKEKFERTKPHVNVGIGHVDHGKTTLTAAITTVLAKTYGGTARAFDQID NAPEEKARGITINTSRVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDG AILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELVE MEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWEAKILILAGFLDSYIP EPERAIDKPFLLPITRVYSISGRGTVVSGRVERGIIKVGEEVIEVGIKET QKSTCTGVEMFRKLLDEGRAGEFVGVLLRGIKREEIERGQVLAKPGTIKP HTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEMV MPGDNIKMVVTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLRDPNSSSV DKLAAALE (EFSep-M7, SEQ ID NO: 2) MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARAFDQI DNAPEEKARGITINTSRVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMD GAILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELV EMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWEAKILELAGFLDSYI PEPERAIDKPFLLPITYVYSISGRGTVVSGRVERGIIKVGEEVEIVGINE TQKSTCTGVEMFRKLLDEGRAGEAVGVLLRGIKREEIERGQVLAKPGTIK PHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEM VMPGDNIKMVVTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLRDPNSSS VDKLAAALE (EFSep-M8, SEQ ID NO: 3) MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARAFDQI DNAPEEKARGITINTSRVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMD GAILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELV EMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWEAKILELAGFLDSYI PEPERAIDKPFLLPINGVYSISGRGTVVSGRVERGIIKVGEEVEIVGIKE TQKSTCTGVEMFRKLLDEGRAGEWVGVLLRGIKREEIERGQVLAKPGTIK PHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEM VMPGDNIKMVVTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLRDPNSSS VDKLAAALE (EFSep-M9, SEQ ID NO: 4) MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARARFQI DNAPEEKARGITINTSRVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMD GAILVVAATDGPMPQTREHILLGRQVGVPYIIVFLNKCDMVDDEELLELV EMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWEAKILELAGFLDSYI PEPERAIDKPFLLPITAVYSISGRGTVVSGRVERGIIKVGEEVEIVGIKE TQKSTCTGVEMFRKLLDEGRAGEAVGVLLRGIKREEIERGQVLAKPGTIK PHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEM VMPGDNIKMVVTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLRDPNSSS VDKLAAALE

[0132] Nucleic acids encoding EF-Tu mutants (EF-Sep) that could bind Sep-tRNA include those having the following nucleotide sequences:

TABLE-US-00005 (EFSep-M6, SEQ ID NO: 5) ATGTCTAAAGAAAAGTTTGAACGTACAAAACCGCACGTTAACGTCGGTAC TATCGGCCACGTTGACCATGGTAAAACAACGCTGACCGCTGCAATCACTA CCGTACTGGCTAAAACCTACGGCGGTACTGCTCGCGCATTCGACCAGATC GATAACGCGCCGGAAGAAAAAGCTCGTGGTATCACCATCAACACTTCTCG GGTTGAATACGACACCCCGACCCGTCACTACGCACACGTAGACTGCCCGG GGCACGCCGACTATGTTAAAAACATGATCACCGGTGCTGCGCAGATGGAC GGCGCGATCCTGGTAGTTGCTGCGACTGACGGCCCGATGCCGCAGACTCG TGAGCACATCCTGCTGGGTCGTCAGGTAGGCGTTCCGTACATCATCGTGT TCCTGAACAAATGCGACATGGTTGATGACGAAGAGCTGCTGGAACTGGTT GAAATGGAAGTTCGTGAACTTCTGTCTCAGTACGACTTCCCGGGCGACGA CACTCCGATCGTTCGTGGTTCTGCTCTGAAAGCGCTGGAAGGCGACGCAG AGTGGGAAGCGAAAATCCTGGAACTGGCTGGCTTCCTGGATTCTTACATT CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCACCCG GGTATACTCCATCTCCGGTCGTGGTACCGTTGTTTCGGGTCGTGTAGAAC GCGGTATCATCAAAGTTGGTGAAGAAGTTGAAATCGTTGGTATCAAAGAG ACTCAGAAGTCTACCTGTACTGGCGTTGAAATGTTCCGCAAACTGCTGGA CGAAGGCGTGCTGGTGAGTTCGTAGGTGTTCTGCTGCGTGGTATCAAACG TGAAGAAATCGAACGTGGTCAGGTACTGGCTAAGCCGGGCACCATCAAGC CGCACACCAAGTTCGAATCTGAAGTGTACATTCTGTCCAAAGATGAAGGC GGCCGTCATACTCCGTTCTTCAAAGGCTACCGTCCGCAGTTCTACTTCCG TACTACTGACGTGACTGGTACCATCGAACTGCCGGAAGGCGTAGAGATGG TAATGCCGGGCGACAACATCAAAATGGTTGTTACCCTGATCCACCCGATC GCGATGGACGACGGTCTGCGTTTCGCAATCCGTGAAGGCGGCCGTACCGT TGGCGCGGGCGTTGTAGCAAAAGTTCTGAGGGATCCGAATTCGAGCTCCG TCGACAAGCTTGCGGCCGCACTCGAG (EFSep-M7, SEQ ID NO: 6) ATGTCTAAAGAAAAGTTTGAACGTACAAAACCGCACGTTAACGTCGGTAC TATCGGCCACGTTGACCATGGTAAAACAACGCTGACCGCTGCAATCACTA CCGTACTGGCTAAAACCTACGGCGGTGCTGCTCGCGCATTCGACCAGATC GATAACGCGCCGGAAGAAAAAGCTCGTGGTATCACCATCAACACTTCTAG GGTTGAATACGACACCCCGACCCGTCACTACGCACACGTAGACTGCCCGG GGCACGCCGACTATGTTAAAAACATGATCACCGGTGCTGCGCAGATGGAC GGCGCGATCCTGGTAGTTGCTGCGACTGACGGCCCGATGCCGCAGACTCG TGAGCACATCCTGCTGGGTCGTCAGGTAGGCGTTCCGTACATCATCGTGT TCCTGAACAAATGCGACATGGTTGATGACGAAGAGCTGCTGGAACTGGTT GAAATGGAAGTTCGTGAACTTCTGTCTCAGTACGACTTCCCGGGCGACGA CACTCCGATCGTTCGTGGTTCTGCTCTGAAAGCGCTGGAAGGCGACGCAG AGTGGGAAGCGAAAATCCTGGAACTGGCTGGCTTCCTGGATTCTTACATT CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCACCTA 
CGTATACTCCATCTCCGGTCGTGGTACCGTTGTTTCGGGTCGTGTAGAAC GCGGTATCATCAAAGTTGGTGAAGAAGTTGAAATCGTTGGTATCAATGAG ACTCAGAAGTCTACCTGTACTGGCGTTGAAATGTTCCGCAAACTGCTGGA CGAAGGCCGTGCTGGTGAGGCGGTAGGTGTTCTGCTGCGTGGTATCAAAC GTGAAGAAATCGAACGTGGTCAGGTACTGGCTAAGCCGGGCACCATCAAG CCGCACACCAAGTTCGAATCTGAAGTGTACATTCTGTCCAAAGATGAAGG CGGCCGTCATACTCCGTTCTTCAAAGGCTACCGTCCGCAGTTCTACTTCC GTACTACTGACGTGACTGGTACCATCGAACTGCCGGAAGGCGTAGAGATG GTAATGCCGGGCGACAACATCAAAATGGTTGTTACCCTGATCCACCCGAT CGCGATGGACGACGGTCTGCGTTTCGCAATCCGTGAAGGCGGCCGTACCG TTGGCGCGGGCGTTGTAGCAAAAGTTCTGAGGGATCCGAATTCGAGCTCC GTCGACAAGCTTGCGGCCGCACTCGAG (EFSep-M8, SEQ ID NO: 7) ATGTCTAAAGAAAAGTTTGAACGTACAAAACCGCACGTTAACGTCGGTAC TATCGGCCACGTTGACCATGGTAAAACAACGCTGACCGCTGCAATCACTA CCGTACTGGCTAAAACCTACGGCGGTGCTGCTCGCGCATTCGACCAGATC GATAACGCGCCGGAAGAAAAAGCTCGTGGTATCACCATCAACACTTCTCG GGTTGAATACGACACCCCGACCCGTCACTACGCACACGTAGACTGCCCGG GGCACGCCGACTATGTTAAAAACATGATCACCGGTGCTGCGCAGATGGAC GGCGCGATCCTGGTAGTTGCTGCGACTGACGGCCCGATGCCGCAGACTCG TGAGCACATCCTGCTGGGTCGTCAGGTAGGCGTTCCGTACATCATCGTGT TCCTGAACAAATGCGACATGGTTGATGACGAAGAGCTGCTGGAACTGGTT GAAATGGAAGTTCGTGAACTTCTGTCTCAGTACGACTTCCCGGGCGACGA CACTCCGATCGTTCGTGGTTCTGCTCTGAAAGCGCTGGAAGGCGACGCAG AGTGGGAAGCGAAAATCCTGGAACTGGCTGGCTTCCTGGATTCTTACATT CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCAACGG GGTATACTCCATCTCCGGTCGTGGTACCGTTGTTTCGGGTCGTGTAGAAC GCGGTATCATCAAAGTTGGTGAAGAAGTTGAAATCGTTGGTATCAAAGAG ACTCAGAAGTCTACCTGTACTGGCGTTGAAATGTTCCGCAAACTGCTGGA CGAAGGCCGTGCTGGTGAGTGGGTAGGTGTTCTGCTGCGTGGTATCAAAC GTGAAGAAATCGAACGTGGTCAGGTACTGGCTAAGCCGGGCACCATCAAG CCGCACACCAAGTTCGAATCTGAAGTGTACATTCTGTCCAAAGATGAAGG CGGCCGTCATACTCCGTTCTTCAAAGGCTACCGTCCGCAGTTCTACTTCC GTACTACTGACGTGACTGGTACCATCGAACTGCCGGAAGGCGTAGAGATG GTAATGCCGGGCGACAACATCAAAATGGTTGTTACCCTGATCCACCCGAT CGCGATGGACGACGGTCTGCGTTTCGCAATCCGTGAAGGCGGCCGTACCG TTGGCGCGGGCGTTGTAGCAAAAGTTCTGAGGGATCCGAATTCGAGCTCC GTCGACAAGCTTGCGGCCGCACTCGAG (EFSep-M9, SEQ ID NO: 8) ATGTCTAAAGAAAAGTTTGAACGTACAAAACCGCACGTTAACGTCGGTAC TATCGGCCACGTTGACCATGGTAAAACAACGCTGACCGCTGCAATCACTA 
CCGTACTGGCTAAAACCTACGGCGGTGCTGCTCGCGCATTCGACCAGATC GATAACGCGCCGGAAGAAAAAGCTCGTGGTATCACCATCAACACTTCTCG GGTTGAATACGACACCCCGACCCGTCACTACGCACACGTAGACTGCCCGG GGCACGCCGACTATGTTAAAAACATGATCACCGGTGCTGCGCAGATGGAC GGCGCGATCCTGGTAGTTGCTGCGACTGACGGCCCGATGCCGCAGACTCG TGAGCACATCCTGCTGGGTCGTCAGGTAGGCGTTCCGTACATCATCGTGT TCCTGAACAAATGCGACATGGTTGATGACGAAGAGCTGCTGGAACTGGTT GAAATGGAAGTTCGTGAACTTCTGTCTCAGTACGACTTCCCGGGCGACGA CACTCCGATCGTTCGTGGTTCTGCTCTGAAAGCGCTGGAAGGCGACGCAG AGTGGGAAGCGAAAATCCTGGAACTGGCTGGCTTCCTGGATTCTTACATT CCGGAACCAGAGCGTGCGATTGACAAGCCGTTCCTGCTGCCGATCACCGC GGTATACTCCATCTCCGGTCGTGGTACCGTTGTTTCGGGTCGTGTAGAAC GCGGTATCATCAAAGTTGGTGAAGAAGTTGAAATCGTTGGTATCAAAGAG ACTCAGAAGTCTACCTGTACTGGCGTTGAAATGTTCCGCAAACTGCTGGA CGAAGGCCGTGCTGGTGAGGCCGTAGGTGTTCTGCTGCGTGGTATCAAAC GTGAAGAAATCGAACGTGGTCAGGTACTGGCTAAGCCGGGCACCATCAAG CCGCACACCAAGTTCGAATCTGAAGTGTACATTCTGTCCAAAGATGAAGG CGGCCGTCATACTCCGTTCTTCAAAGGCTACCGTCCGCAGTTCTACTTCC GTACTACTGACGTGACTGGTACCATCGAACTGCCGGAAGGCGTAGAGATG GTAATGCCGGGCGACAACATCAAAATGGTTGTTACCCTGATCCACCCGAT CGCGATGGACGACGGTCTGCGTTTCGCAATCCGTGAAGGCGGCCGTACCG TTGGCGCGGGCGTTGTAGCAAAAGTTCTGAGGGATCCGAATTCGAGCTCC GTCGACAAGCTTGCGGCCGCACTCGAG

Example 3: Demonstration of Sep Incorporation into Myoglobin

Materials and Methods

Construction of Plasmids

[0133] pMYO127TAG-SepT was constructed by cloning a codon-optimized and C-terminally His.sub.6-tagged sperm whale myoglobin gene under the control of the lpp promoter between NotI and BglII in pSepT. An amber stop codon was introduced into the myoglobin gene at position Asp127 by QuikChange mutagenesis. The nucleotide sequence of the codon-optimized myoglobin gene is as follows:

TABLE-US-00006 (SEQ ID NO: 44) ATGGTTCTGTCTGAAGGTGAATGGCAGCTGGTTCTGCACGTTTGGGCTAA AGTTGAAGCTGACGTTGCTGGTCACGGTCAGGACATCCTGATCCGTCTGT TCAAATCTCACCCGGAAACCCTGGAAAAATTCGACCGTTTCAAACACCTG AAAACCGAAGCTGAAATGAAGGCTTCTGAAGACCTGAAAAAACACGGTGT TACCGTTCTGACCGCTCTGGGTGCTATCCTGAAGAAAAAGGGTCACCACG AAGCTGAACTGAAACCGCTGGCTCAGTCTCACGCTACCAAACACAAAATC CCGATCAAATACCTGGAGTTCATCTCTGAAGCTATCATCCACGTTCTGCA CTCTCGTCATCCGGGTAACTTCGGTGCTGACGCTCAGGGTGCTATGAACA AAGCTCTGGAACTGTTCCGTAAAGACATCGCTGCTAAATACAAAGAACTG GGTTACCAGGGTGGTTCTGGTCATCACCATCACCATCACTAA.
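The amber substitution described for the myoglobin gene can be illustrated with a minimal sketch. The helper name `introduce_amber` is hypothetical, and the example uses only the first ten codons of SEQ ID NO: 44 with an arbitrary toy position rather than Asp127. It models the sequence change only, not the mutagenesis procedure itself.

```python
def introduce_amber(orf: str, position: int) -> str:
    """Return the ORF with the codon at the 1-based `position` replaced by TAG.

    Hypothetical helper for illustration; it edits the sequence string only.
    """
    orf = orf.replace(" ", "").upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    start = 3 * (position - 1)
    if not 0 <= start < len(orf):
        raise ValueError("codon position outside the ORF")
    return orf[:start] + "TAG" + orf[start + 3:]


# First ten codons of SEQ ID NO: 44 (Met Val Leu Ser Glu Gly Glu Trp Gln Leu).
fragment = "ATGGTTCTGTCTGAAGGTGAATGGCAGCTG"

# Toy position 5 (a Glu codon, GAA); the construct in the text targets Asp127.
mutant = introduce_amber(fragment, 5)
print(mutant)
```

Applied to the full open reading frame, the same call with the appropriate codon index would give the TAG-containing template; whether that index counts the initiator Met has to be checked against the sequence.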

Results

[0134] To prove that the observed suppression is due to Sep incorporation, a myoglobin variant with an amber codon at position 127 (normally Asp) and a C-terminal His.sub.6-tag was expressed. The expected full-length protein was synthesized (yield of 2 mg/L of culture) only when EF-Sep, SepRS and tRNA.sup.Sep were co-expressed. The amino acid incorporated via EF-Sep in response to the amber codon was identified by analyzing both the intact and trypsin-digested Myo-His.sub.6 mutant protein. MS-TOF and MS/MS analyses showed that Sep is present at the position specified by UAG.

Example 4: Active MEK Synthesis In Vivo

Materials and Methods

Construction of Plasmids

[0135] pET15-ERK2 encodes N-terminally His.sub.6-tagged mitogen-activated protein kinase (Erk2) under the control of a T7 promoter. The human Erk2 gene was PCR amplified from plasmid BC017832 (ATCC) using primers ERK2-F (5'-GGA ATT CCA TAT GGC GGC GGC GGC GGC G-3', SEQ ID NO:27) and ERK2-R (5'-CCG CTC GAG TTA AGA TCT GTA TCC TGG-3', SEQ ID NO:28). The PCR product was cloned between NdeI and XhoI in vector pET15b (Novagen).

[0136] pET20-MBPMEK1 encodes a fusion protein consisting of human MEK1 with an N-terminal maltose binding protein (MBP) tag and a C-terminal His.sub.6-tag. The gene encoding human MEK1, which was codon-optimized for E. coli and custom-synthesized in vitro (Genscript), was cloned between EcoRI and PstI into pMALc2x (New England Biolabs). The resulting MBP-MEK1 fusion construct was then amplified with primers ET20MEKF (5'-AAG GAA ATT AAT GAA AAT CGA AGA AGG TAA-3', SEQ ID NO:29) and ET20MEKR (5'-CTA GAG GAT CCG GCG CGC-3', SEQ ID NO:30), adding AseI and BamHI restriction sites, and the PCR product was ligated between NdeI and BamHI into pET20b.
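An AseI-cut PCR product can be ligated into an NdeI-cut vector because both enzymes leave the same 5'-TA overhang. The sketch below illustrates that reasoning using the standard recognition sequences and cut positions for these enzymes; the dictionary layout and function names are illustrative assumptions, and the model covers only palindromic sites that leave 5' overhangs.

```python
# Recognition site and number of bases 5' of the top-strand cut.
# Only palindromic enzymes leaving 5' overhangs are modelled here.
ENZYMES = {
    "AseI": ("ATTAAT", 2),   # AT^TAAT
    "NdeI": ("CATATG", 2),   # CA^TATG
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
}


def overhang(enzyme: str) -> str:
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(a: str, b: str) -> bool:
    """Two ends can anneal if their (self-complementary) overhangs are identical."""
    return overhang(a) == overhang(b)


print(overhang("AseI"), overhang("NdeI"), compatible("AseI", "NdeI"))
print(overhang("BamHI"), compatible("AseI", "BamHI"))
```

The same check shows why BamHI and BglII ends are also interchangeable (both leave GATC), whereas AseI and BamHI ends are not.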

Nucleotide Sequence of Codon-optimized MEK1

TABLE-US-00007 [0137] (SEQ ID NO: 31) ATGCCGAAGAAGAAACCGACCCCGATCCAGCTGAACCCGGCTCCGGACGG TTCTGCTGGTTAACGGCACCTCTTCTGCTGAAACCAACCTGGAAGCTCTG CAAAAGAAACTGGAAGAACTGGAACTGGACGAACAGCAGCGTAAACGTCT GGAAGCGTTCCTGACCCAGAAACAGAAAGTTGGTGAACTGAAAGACGACG ACTTCGAAAAAATCTCTGAACTGGGTGCTGGTAACGGTGGTGTTGTTTTC AAAGTTTCTCACAAACCGTCCGGTCTGGTTATGGCTCGTAAACTGATCCA CCTGGAAATCAAACCGGCTATCCGTAACCAGATCATCCGTGAACTGCAAG TTCTGCACGAATGCAACTCTCCGTACATCGTTGGTTTCTACGGTGCTTTC TACTCTGACGGTGAAATCTCTATCTGCATGGAACACATGGACGGTGGTTC TCTGGACCAGGTTCTGAAAAAAGCTGGTCGTATCCCGGAACAGATCCTGG GTAAAGTTTCTATCGCTGTTATCAAAGGTCTGACCTACCTGCGTGAAAAA CACAAAATCATGCACCGTGACGTTAAACCGTCTAACATCCTGGTTAACTC TCGTGGTGAAATCAAACTGTGCGACTTCGGTGTTTCTGGTCAGCTGATCG ACTCTATGGCTAACTCTTTCGTTGGCACCCGTTCTTACATGTCTCCGGAA CGTCTGCAAGGCACCCACTACTCTGTTCAGTCTGACATCTGGTCTATGGG TCTGTCTCTGGTTGAAATGGCTGTTGGTCGTTACCCGATCCCGCCGCCGG ACGCTAAAGAACTGGAACTGATGTTCGGTTGCCAGGTTGAAGGTGACGCT GCTGAAACCCCGCCGCGTCCGCGTACTCCGGGTCGTCCGCTGTCTTCTTA CGGTATGGACTCTCGTCCGCCGATGGCTATCTTCGAACTGCTGGACTACA TCGTTAACGAACCGCCGCCGAAACTGCCGTCTGGTGTTTTCTCTCTGGAG TTCCAGGACTTCGTTAACAAATGCCTGATCAAAAACCCGGCTGAACGTGC TGACCTGAAACAGCTGATGGTTCACGCTTTCATCAAACGTTCTGACGCTG AAGAAGTTGACTTCGCTGGTTGGCTGTGCTCTACCATCGGTCTGAACCAG CCGTCTACCCCGACCCACGCTGCTGGTGTGGCAGCCGCAGCTGCGCATCA TCACCACCATCACTAA.

[0138] pCG-MBPMEK1SS was generated by the ligation of three PCR products. One PCR product was derived from pGFIB (Normanly J, et al. Nature 321:213 (1986)) using primers GFIB-F (5'-ATA AGA ATG CGG CCG CGC CGC AGC CGA ACG ACC GAG-3', SEQ ID NO:32) and GFIB-R (5'-CTA GCT AGC GTC TGA CGC TCA GTG GAA CG-3', SEQ ID NO:33). The second PCR product was generated from pCDFDuet-1 (Novagen) using primers CDF-F (5'-CTA GCT AGC TCA CTC GGT CGC TAC GCT-3', SEQ ID NO:34) and CDF-R (5'-ATA AGA ATG CGG CCG CTG AAA TCT AGA GCG GTT CAG-3', SEQ ID NO:35). Both PCR products were digested with NheI and NotI and ligated to form plasmid pCG. The third PCR product, encoding an expression cassette for MBP-MEK1-His.sub.6 under the control of T7 promoter and T7 terminator, was generated from pET20-MBPMEK1 using primers ETCDGFF (5'-AAA AGG CGC CGC CAG CCT AGC CGG GTC CTC AAC G-3', SEQ ID NO:36) and ETCDGFR (5'-AAC TGC AGC CAA TCC GGA TAT AGT TC-3', SEQ ID NO:37). This PCR product was cloned between the NarI and PstI sites of pCG.
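Each primer used for pCG carries the restriction site used in the corresponding cloning step. The sketch below scans the primers for the four recognition sequences as a simple consistency check. The primer sequences are copied from the text; the function and variable names are illustrative assumptions.

```python
# Recognition sequences of the enzymes named in the cloning steps.
SITES = {
    "NotI": "GCGGCCGC",
    "NheI": "GCTAGC",
    "NarI": "GGCGCC",
    "PstI": "CTGCAG",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "GFIB-F": "ATA AGA ATG CGG CCG CGC CGC AGC CGA ACG ACC GAG",
    "GFIB-R": "CTA GCT AGC GTC TGA CGC TCA GTG GAA CG",
    "CDF-F": "CTA GCT AGC TCA CTC GGT CGC TAC GCT",
    "CDF-R": "ATA AGA ATG CGG CCG CTG AAA TCT AGA GCG GTT CAG",
    "ETCDGFF": "AAA AGG CGC CGC CAG CCT AGC CGG GTC CTC AAC G",
    "ETCDGFR": "AAC TGC AGC CAA TCC GGA TAT AGT TC",
}


def sites_in(primer: str) -> list:
    """Names of the enzymes whose recognition sequence occurs in the primer."""
    seq = primer.replace(" ", "").upper()
    return sorted(name for name, site in SITES.items() if site in seq)


found = {name: sites_in(seq) for name, seq in PRIMERS.items()}
for name, sites in found.items():
    print(name, sites)
```

The scan looks only at the primer strand as written, which is sufficient here because all four recognition sequences are palindromic.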

[0139] The codon for Ser222 in MEK1 was then replaced by a GAA codon (encoding Glu) using QuikChange mutagenesis (Stratagene). In the same way, the codon for Ser218 was changed either to GAA, generating pCG-MBPMEK1EE, or to an amber stop codon, resulting in pCG-MBPMEK1XE. In pCG-MBPMEK1XS only the codon for Ser218 was changed to UAG, and in pCG-MBPMEK1XX the codons for both Ser218 and Ser222 were changed to amber.

Expression and Purification of Myoglobin

[0140] To express mutant myoglobin, pKD-SepRS-EF-Sep and pKD-SepRS were transformed into E. coli Top10.DELTA.serB containing pMYO127TAG-SepT. E. coli Top10.DELTA.serB with pMYO, encoding the wild-type myoglobin gene, was used as a control. Cultures were grown in LB medium supplemented with 2 mM Sep. When A.sub.600 reached 0.6, protein expression was induced with 0.05 mM IPTG for 12 h at 25.degree. C. The cells were harvested, resuspended in lysis buffer (50 mM Tris-HCl (pH 7.8), 300 mM NaCl, 14.3 mM 2-mercaptoethanol) supplemented with protease inhibitor cocktail (Roche), and subjected to sonication. The lysate was centrifuged at 10,000.times.g for 30 min and the protein in the supernatant was purified on Ni.sup.2+-NTA agarose (Qiagen) according to the manufacturer's instructions.

Expression and Purification of MEK1

[0141] To express MEK1 (as a maltose binding protein fusion protein), E. coli BL21.DELTA.serB was transformed with plasmids pKD-SepRS-EFSep, pCAT112TAG-SepT, and pCG-MBPMEK1SS, pCG-MBPMEK1EE, pCG-MBPMEK1XE, pCG-MBPMEK1XS, or pCG-MBPMEK1XX, respectively. Plasmid pCAT112TAG-SepT was replaced by pL11C-SepT in the strain used to produce MBP-MEK1(Sep218,Ser222)-His.sub.6 for mass spectrometry analysis.

[0142] Cells were grown at 30.degree. C. in 1 liter of LB supplemented with 100 .mu.g/ml of Amp, 50 .mu.g/ml Kan, 12 .mu.g/ml Tc, 2 mM Sep, 5052 solution, and phosphate buffer for autoinduction. When A.sub.600 reached 0.6, the temperature was changed to 16.degree. C. and incubation continued for 18 h. After harvesting, cells were lysed in 20 ml BugBuster reagent containing 50 mM Tris-HCl (pH 7.8), 500 mM NaCl, 0.5 mM EGTA, 0.5 mM EDTA, 14.3 mM 2-mercaptoethanol, 10% glycerol, 0.03% Brij-35, protease inhibitors, 25 U ml.sup.-1 Benzonase, and 1 mg ml.sup.-1 lysozyme. The lysate was clarified by ultracentrifugation, and applied to a 0.4 ml Ni.sup.2+-NTA agarose column. The column was washed with 15 ml wash buffer (50 mM Tris-HCl (pH 7.8), 150 mM NaCl, 0.5 mM EGTA, 0.5 mM EDTA, 14.3 mM 2-mercaptoethanol, 10% glycerol, 0.03% Brij-35, and 20 mM imidazole). Proteins were eluted in 0.8 ml of wash buffer supplemented with 300 mM imidazole, dialyzed against 50 mM Tris-HCl (pH 7.8), 150 mM NaCl, 0.1 mM EGTA, 5 mM DTT, 30% glycerol, and 0.03% Brij-35, and stored at -20.degree. C. Purified proteins were analyzed by SDS-PAGE.

Expression and Purification of Erk2

[0143] E. coli BL21 (DE3) codon plus cells were transformed with pET15-ERK2 and grown at 37.degree. C. in 1 liter LB broth supplemented with 100 .mu.g/ml Amp and 34 .mu.g/ml Cm. When the cultures reached an A.sub.600 of 0.6, 0.2 mM IPTG was added and expression was induced for 19 h at 16.degree. C.

[0144] Cell lysis, Ni.sup.2+ purification, and dialysis of Erk2 were carried out as described for MEK1. Erk2 was 99% pure, as judged by Coomassie brilliant blue staining after SDS-PAGE.

Preparation and Aminoacylation of tRNA.

[0145] Total tRNA from E. coli Top10 or from E. coli Top10 complemented with pCysT or pSepT, respectively, was purified by standard procedures and acylated with [.sup.14C]Sep by M. maripaludis SepRS as described previously. In vivo synthesized tRNA was used for this experiment to ensure that nucleoside modifications introduced into tRNA by E. coli modifying enzymes do not affect tRNA recognition by SepRS. M. jannaschii tRNA.sup.Cys contains m.sup.1G37 when isolated from M. jannaschii. Since the E. coli methylase TrmD is known to methylate G37 of archaeal tRNA.sup.Pro, it is believed that the in vivo expressed tRNA.sup.Sep also carries the m.sup.1G37 modification. In vitro transcript of M. jannaschii tRNA.sup.Cys was prepared and acylated with [.sup.14C]Sep or [.sup.35S]Cys using recombinant M. maripaludis SepRS or CysRS. M. jannaschii tRNA.sup.Cys transcript was chosen for these experiments because of the poor folding properties of in vitro transcribed M. maripaludis tRNA.sup.Cys (Hohn, M. J. Proc Natl Acad Sci USA 103, 18095 (2006)).

EF-Tu Hydrolysis Protection Assays

[0146] To assay hydrolysis protection of acylated tRNA.sup.Cys by EF-Tu, Mmp tRNA.sup.Cys in vitro transcripts acylated with [.sup.14C]Sep or [.sup.35S]Cys, respectively, were phenol/chloroform extracted, and the aqueous phase was passed over Sephadex.RTM. G25 Microspin columns (GE Healthcare) equilibrated with water. Protection of aminoacylated tRNA by EF-Tu was assayed as described earlier with slight modifications (Ling J. et al., Proc Natl Acad Sci USA 104, 15299 (2007)). Briefly, EF-Tu or EF-Sep (both purified as maltose binding protein fusion proteins) were activated for 20 min at 37.degree. C. in buffer containing 100 mM Tris-HCl (pH 8.2), 120 mM NH.sub.4Cl, 7 mM MgCl.sub.2, 5 mM DTT, 5 mM phosphoenolpyruvate, 1.5 mM GTP, and 0.12 .mu.g/.mu.l pyruvate kinase. Hydrolysis of 2 .mu.M [.sup.14C]Sep-tRNA.sup.Cys was then monitored at 25.degree. C. in the presence of 40 .mu.M EF-Tu (wt), EF-Sep, or BSA, respectively. Aliquots were taken from the reaction mix at indicated time points and spotted on 3 MM filter discs presoaked with 10% trichloroacetic acid. Filters were washed with 5% trichloroacetic acid, dried, and radioactivity was measured by liquid scintillation counting.
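Time courses from a hydrolysis protection assay of this kind are often summarized as a half-life of the aminoacyl-tRNA. The sketch below assumes first-order decay of the acid-precipitable counts and fits the rate constant by least squares on the log-transformed fractions. The first-order model, the function name, and the data points are illustrative assumptions; the points are synthetic values generated from a known half-life, not measurements from this work.

```python
import math


def fit_half_life(times_min, fraction_remaining):
    """Fit ln(fraction) = -k * t through the origin and return ln(2) / k.

    Assumes simple first-order deacylation; fractions are counts at time t
    divided by counts at time zero.
    """
    num = sum(t * math.log(f) for t, f in zip(times_min, fraction_remaining))
    den = sum(t * t for t in times_min)
    k = -num / den  # least-squares slope for a line constrained through (0, 0)
    return math.log(2) / k


# Synthetic time course generated with a 20 min half-life (illustration only).
times = [0, 5, 10, 20, 40]
synthetic = [0.5 ** (t / 20.0) for t in times]

half_life = fit_half_life(times, synthetic)
print(round(half_life, 3))
```

Comparing the fitted half-lives obtained with EF-Tu, EF-Sep, and BSA would then express the degree of protection as a single number per condition.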

MEK Activity Assays

[0147] Recombinant MEK1 variants were assayed as maltose binding protein (MBP) fusion proteins. Briefly, in a first reaction, various amounts (2.5-5000 ng) of recombinant MBP-MEK1 variants were used to phosphorylate (and activate) bacterially expressed MAP kinase (Erk2) for 15 min at 30.degree. C. in 35 .mu.l kinase assay buffer containing 12 mM MOPS (pH 7.2), 20 mM MgCl.sub.2, 3 mM EGTA, 15 mM .beta.-glycerol phosphate, 0.6 mM DTT, 140 .mu.M ATP, and 1 .mu.g Erk2.

[0148] After 15 min, a 5 .mu.l aliquot was transferred to a second reaction in which activated Erk2 phosphorylates myelin basic protein (MyBP; 570 .mu.g ml.sup.-1) in kinase assay buffer in the presence of [.gamma.-.sup.32P]ATP. After 15 min incubation at 30.degree. C., 25 .mu.l aliquots were transferred onto p81 phosphocellulose filters (Whatman). The filters were washed three times with 180 mM phosphoric acid and then rinsed with acetone. Phosphorylation was quantitated by scintillation counting and the specific activity of MEK1 was calculated from the amount of [.sup.32P]phosphate incorporated into MyBP.
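The conversion from scintillation counts to a specific activity is not spelled out in the text. One common way to do it is sketched below: background-corrected counts are converted to pmol of phosphate using the specific radioactivity of the ATP, then normalized to reaction time and enzyme mass. The function name, parameters, and all numbers are illustrative assumptions, and the sketch does not correct for the aliquot fractions carried between the two reactions.

```python
def specific_activity(cpm_sample, cpm_blank, cpm_per_pmol_atp, minutes, enzyme_mg):
    """Return pmol phosphate incorporated per minute per mg of enzyme.

    All inputs are assumptions supplied by the user; cpm_per_pmol_atp is the
    specific radioactivity of the labelled ATP in the reaction.
    """
    pmol_phosphate = (cpm_sample - cpm_blank) / cpm_per_pmol_atp
    return pmol_phosphate / minutes / enzyme_mg


# Hypothetical numbers for illustration only (5 ng enzyme, 15 min reaction).
activity = specific_activity(
    cpm_sample=52000, cpm_blank=2000, cpm_per_pmol_atp=500,
    minutes=15, enzyme_mg=5e-6,
)
print(f"{activity:.3e} pmol/min/mg")
```

Fold differences between MEK1 variants can then be taken as ratios of the values returned for each variant, provided the same normalization is used throughout.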

LC and MS/MS Conditions for Multiple Reaction Monitoring (MRM)

[0149] Purified MEK1 proteins were separated by SDS-PAGE, visualized with Coomassie stain, excised, washed in 50% acetonitrile (ACN)/50 mM NH.sub.4HCO.sub.3, crushed, and digested at 37.degree. C. in a 20 .mu.g/ml trypsin (Promega) solution in 10 mM NH.sub.4HCO.sub.3. Digested peptides in solution were dried and dissolved in 3 .mu.l of 70% formic acid (FA), and then diluted to 10 .mu.l with 0.1% TFA. Peptides for MRM were synthesized at the KECK peptide synthesis facility at Yale. The human MEK peptide LCDFGVSGQLIDS*MANSFVGTR (SEQ ID NO:40) (*phospho-Ser; YPED peptide ID, SOL14075) was synthesized to permit the development of a specific method for quantitative MRM. Crude synthetic peptides were directly infused at a concentration of .about.10 pmol/.mu.l, and the Collision Energy and Declustering Potentials of the transitions were optimized. LC-MRM was performed on an ABI 5500 QTRAP triple quadrupole mass spectrometer interfaced with a Waters nanoAcquity UPLC system running Analyst 1.5 software. Peptides were resolved for MRM (LC step) by loading 4 .mu.l of sample onto a Symmetry C18 nanoAcquity trapping column (180 .mu.m.times.20 mm, 5 .mu.m) with 100% water at 15 .mu.l per minute for 1 minute. After trapping, peptides were resolved on a BEH130 C18 nanoAcquity column (75 .mu.m.times.50 mm, 1.7 .mu.m) with a 30 minute, 2-40% ACN/0.1% FA linear gradient (0.5 .mu.l/min flow rate). MRM scanning was carried out with 18 transitions and a cycle time of 1.44 seconds with a 40 millisecond dwell time per transition. An MRM Initiated Detection and Sequencing (MIDAS) experiment was performed. The IDA method selected the most intense peak using rolling collision energy. The target ions were excluded after 3 occurrences for 30 seconds. The EPI scan had a scan rate of 20,000 Da/sec with a sum of 3 scans and mass range of 100-1000 Da and a cycle time of 1.4 msec. Files were searched using Mascot version 2.3 with the Swissprot database (08/2010) selected (human taxonomic restriction). Phosphorylated S and T, and propionamide C were variable modifications. Peptide and fragment mass tolerance was 0.6 Da, with 1 missed cleavage. Quantification was performed using MultiQuant 2.0.
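The MRM timing figures can be cross-checked with simple arithmetic: 18 transitions at a 40 millisecond dwell time account for 0.72 seconds of the stated 1.44 second cycle. The sketch below assumes that the remainder is a fixed per-transition overhead (for example a pause or settling time), which the text does not specify; the function name is illustrative.

```python
def per_transition_overhead_ms(n_transitions, dwell_ms, cycle_ms):
    """Time per transition not spent dwelling, assuming equal overhead per transition."""
    per_transition = cycle_ms / n_transitions
    return per_transition - dwell_ms


# Values stated in the text: 18 transitions, 40 ms dwell, 1.44 s cycle.
overhead = per_transition_overhead_ms(n_transitions=18, dwell_ms=40, cycle_ms=1440)
dwell_total_s = 18 * 40 / 1000

print(dwell_total_s, overhead)
```

Under this assumption each transition occupies 80 ms of the cycle, half of which is dwell time.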

Results

[0150] To further demonstrate the usefulness of the disclosed strategy for the synthesis of a protein that is naturally phosphorylated at a serine residue, recombinant, Sep-containing mitogen-activated ERK activating kinase 1 (MEK1) was produced. This key enzyme of the mitogen-activated signaling cascade in eukaryotic cells plays crucial roles in cell proliferation, cell development and differentiation, cell cycle control and oncogenesis (Sebolt-Leopold, J. S., et al. Nat Rev Cancer 4, 937 (2004)). Activation of MEK1 requires post-translational phosphorylation of Ser218 and Ser222 by MEK activating kinases (e.g., Raf-1, MEKK, or MOS). Changing both Ser residues to Glu yields a constitutively active enzyme, albeit with lower activity (Alessi D R. et al., EMBO J 13, 1610 (1994)).

[0151] To improve expression of this human protein in the E. coli BL21 .DELTA.serB strain and to allow purification by Ni.sup.2+-affinity chromatography, a MEK1 clone was designed to generate an N-terminal fusion with maltose binding protein (MBP) and with a C-terminal His.sub.6-tag. Position 222 was changed to Glu and the codon for Ser218 was replaced by UAG to encode Sep. After expression in the presence of SepRS, tRNA.sup.Sep and EF-Sep, 25 .mu.g of full-length MBP-MEK1(Sep218,Glu222) were isolated from 1 L of culture. The presence of Sep in this recombinant MEK1-fusion protein was demonstrated by its activity in phosphorylating ERK2. The assay requires an additional component, myelin basic protein (MyBP), which is phosphorylated by activated ERK2 in the presence of [.gamma.-.sup.32P]ATP; the amount of [.sup.32P]MyBP relates to the specific activity of MEK1. As FIG. 4 shows, MBP-MEK1(Sep218,Glu222) had a 2,500-fold higher specific activity than non-phosphorylated MBP-MEK1(Ser218,Ser222), and a 70-fold higher specific activity than the constitutively active MBP-MEK1(Glu218,Glu222) mutant.

[0152] To demonstrate the incorporation of Sep at position 218, an assay was developed utilizing multiple reaction monitoring (MRM) and a triple-quadrupole mass spectrometer. The MRM assay was designed to detect an intact tryptic phosphopeptide ion (m/z 823.4.sup.+3) derived from MBP-MEK1(Sep218,Ser222) and 4 fragment ions produced by collision-induced dissociation of this intact phosphopeptide (Table 1). The MRM method included an Information Dependent Acquisition (IDA) step that triggered a full MS/MS scan once the 823.4.sup.+3 ion, and associated fragment ions, were detected. The IDA MS/MS spectrum confirmed the incorporation of Sep at position 218 and Ser at 222 in MBP-MEK1(Sep218,Ser222).

TABLE-US-00008 TABLE 1 Peptide information for MRM
Peptide (SEQ ID NO: 40)	Precursor/product ion	CE	DP
LC*DFGVSGQLIDS.sup.PMANSFVGTR	823.4.sup.(?)/333.2.sup.(?) [y3]	30.85	160.9
LC*DFGVSGQLIDS.sup.PMANSFVGTR	823.4.sup.(?)/666.35.sup.(?) [y6]	38.26	160.9
LC*DFGVSGQLIDS.sup.PMANSFVGTR	823.4.sup.(?)/780.4.sup.(?) [y7]	38.62	160.9
LC*DFGVSGQLIDS.sup.PMANSFVGTR	823.4.sup.(?)/851.4.sup.(?) [y8]	38.12	160.9
S.sup.P, phosphoserine; C*, propionamide; CE, collision energy; DP, declustering potential. (?) indicates data missing or illegible when filed.
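The precursor and fragment m/z values listed for the MRM assay can be checked against standard monoisotopic residue masses. The sketch below computes the triply protonated precursor and the y3, y6, y7, and y8 ions of the peptide, with propionamide on Cys (position 2) and phosphate on Ser (position 13). The charge of the product ions is illegible in the table, so singly charged y ions are an assumption here; the function names are illustrative.

```python
# Monoisotopic residue masses (Da) for the residues occurring in the peptide.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "M": 131.04049,
    "F": 147.06841, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728
PROPIONAMIDE = 71.03711  # acrylamide adduct on Cys
PHOSPHO = 79.96633       # HPO3 on Ser

PEPTIDE = "LCDFGVSGQLIDSMANSFVGTR"
MODS = {2: PROPIONAMIDE, 13: PHOSPHO}  # 1-based positions within the peptide


def residue_masses():
    """Residue masses in sequence order, including the two modifications."""
    return [MONO[aa] + MODS.get(i, 0.0) for i, aa in enumerate(PEPTIDE, start=1)]


def mz(neutral_mass, charge):
    return (neutral_mass + charge * PROTON) / charge


def precursor_mz(charge):
    return mz(sum(residue_masses()) + WATER, charge)


def y_ion_mz(n, charge=1):
    """m/z of the y ion containing the n C-terminal residues (charge assumed)."""
    return mz(sum(residue_masses()[-n:]) + WATER, charge)


print(round(precursor_mz(3), 2))
for n in (3, 6, 7, 8):
    print(f"y{n}", round(y_ion_mz(n), 2))
```

With these masses the computed values agree with the tabulated m/z to within the 0.6 Da tolerance used in the database search, which is consistent with a 3+ precursor and singly charged y ions.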

[0153] To determine whether the E. coli expression system would allow the simultaneous insertion of two Sep residues into the protein, the Ser codons in positions 218 and 222 were changed to UAG. As expected, the expression efficiency of MBP-MEK1(Sep218,Sep222) was dramatically reduced compared to wild-type MBP-MEK1 (only about 1 .mu.g of full-length protein was obtained from 1 L of culture). The presence of Sep at both active site positions of MEK1 was tested by Western blot analysis using a monoclonal antibody specific to the phosphorylated active site of human MEK2. Only recombinant MBP-MEK1(Sep218,Sep222), and to a weaker extent MBP-MEK1(Sep218,Ser222), were detected in this experiment, while none of MBP-MEK1(Ser218,Ser222), MBP-MEK1(Sep218,Glu222), or MBP-MEK1(Glu218,Glu222) was recognized by this antibody. The presence of full-length MBP-fusion proteins was confirmed by Coomassie staining and by Western hybridization with an MBP-specific antibody. This demonstrates that the addition of SepRS, tRNA.sup.Sep and EF-Sep endows E. coli with the ability to read UAG as a phosphoserine codon.

[0154] Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs.

Sequence CWU 1

1

461409PRTArtificial SequenceSynthetic Construct 1Met Ser Lys Glu Lys Phe Glu Arg Thr Lys Pro His Val Asn Val Gly1 5 10 15Thr Ile Gly His Val Asp His Gly Lys Thr Thr Leu Thr Ala Ala Ile 20 25 30Thr Thr Val Leu Ala Lys Thr Tyr Gly Gly Thr Ala Arg Ala Phe Asp 35 40 45Gln Ile Asp Asn Ala Pro Glu Glu Lys Ala Arg Gly Ile Thr Ile Asn 50 55 60Thr Ser Arg Val Glu Tyr Asp Thr Pro Thr Arg His Tyr Ala His Val65 70 75 80Asp Cys Pro Gly His Ala Asp Tyr Val Lys Asn Met Ile Thr Gly Ala 85 90 95Ala Gln Met Asp Gly Ala Ile Leu Val Val Ala Ala Thr Asp Gly Pro 100 105 110Met Pro Gln Thr Arg Glu His Ile Leu Leu Gly Arg Gln Val Gly Val 115 120 125Pro Tyr Ile Ile Val Phe Leu Asn Lys Cys Asp Met Val Asp Asp Glu 130 135 140Glu Leu Leu Glu Leu Val Glu Met Glu Val Arg Glu Leu Leu Ser Gln145 150 155 160Tyr Asp Phe Pro Gly Asp Asp Thr Pro Ile Val Arg Gly Ser Ala Leu 165 170 175Lys Ala Leu Glu Gly Asp Ala Glu Trp Glu Ala Lys Ile Leu Glu Leu 180 185 190Ala Gly Phe Leu Asp Ser Tyr Ile Pro Glu Pro Glu Arg Ala Ile Asp 195 200 205Lys Pro Phe Leu Leu Pro Ile Thr Arg Val Tyr Ser Ile Ser Gly Arg 210 215 220Gly Thr Val Val Ser Gly Arg Val Glu Arg Gly Ile Ile Lys Val Gly225 230 235 240Glu Glu Val Glu Ile Val Gly Ile Lys Glu Thr Gln Lys Ser Thr Cys 245 250 255Thr Gly Val Glu Met Phe Arg Lys Leu Leu Asp Glu Gly Arg Ala Gly 260 265 270Glu Phe Val Gly Val Leu Leu Arg Gly Ile Lys Arg Glu Glu Ile Glu 275 280 285Arg Gly Gln Val Leu Ala Lys Pro Gly Thr Ile Lys Pro His Thr Lys 290 295 300Phe Glu Ser Glu Val Tyr Ile Leu Ser Lys Asp Glu Gly Gly Arg His305 310 315 320Thr Pro Phe Phe Lys Gly Tyr Arg Pro Gln Phe Tyr Phe Arg Thr Thr 325 330 335Asp Val Thr Gly Thr Ile Glu Leu Pro Glu Gly Val Glu Met Val Met 340 345 350Pro Gly Asp Asn Ile Lys Met Val Val Thr Leu Ile His Pro Ile Ala 355 360 365Met Asp Asp Gly Leu Arg Phe Ala Ile Arg Glu Gly Gly Arg Thr Val 370 375 380Gly Ala Gly Val Val Ala Lys Val Leu Arg Asp Pro Asn Ser Ser Ser385 390 395 400Val Asp Lys Leu Ala Ala Ala Leu Glu 4052409PRTArtificial SequenceSynthetic 
Construct 2Met Ser Lys Glu Lys Phe Glu Arg Thr Lys Pro His Val Asn Val Gly1 5 10 15Thr Ile Gly His Val Asp His Gly Lys Thr Thr Leu Thr Ala Ala Ile 20 25 30Thr Thr Val Leu Ala Lys Thr Tyr Gly Gly Ala Ala Arg Ala Phe Asp 35 40 45Gln Ile Asp Asn Ala Pro Glu Glu Lys Ala Arg Gly Ile Thr Ile Asn 50 55 60Thr Ser Arg Val Glu Tyr Asp Thr Pro Thr Arg His Tyr Ala His Val65 70 75 80Asp Cys Pro Gly His Ala Asp Tyr Val Lys Asn Met Ile Thr Gly Ala 85 90 95Ala Gln Met Asp Gly Ala Ile Leu Val Val Ala Ala Thr Asp Gly Pro 100 105 110Met Pro Gln Thr Arg Glu His Ile Leu Leu Gly Arg Gln Val Gly Val 115 120 125Pro Tyr Ile Ile Val Phe Leu Asn Lys Cys Asp Met Val Asp Asp Glu 130 135 140Glu Leu Leu Glu Leu Val Glu Met Glu Val Arg Glu Leu Leu Ser Gln145 150 155 160Tyr Asp Phe Pro Gly Asp Asp Thr Pro Ile Val Arg Gly Ser Ala Leu 165 170 175Lys Ala Leu Glu Gly Asp Ala Glu Trp Glu Ala Lys Ile Leu Glu Leu 180 185 190Ala Gly Phe Leu Asp Ser Tyr Ile Pro Glu Pro Glu Arg Ala Ile Asp 195 200 205Lys Pro Phe Leu Leu Pro Ile Thr Tyr Val Tyr Ser Ile Ser Gly Arg 210 215 220Gly Thr Val Val Ser Gly Arg Val Glu Arg Gly Ile Ile Lys Val Gly225 230 235 240Glu Glu Val Glu Ile Val Gly Ile Asn Glu Thr Gln Lys Ser Thr Cys 245 250 255Thr Gly Val Glu Met Phe Arg Lys Leu Leu Asp Glu Gly Arg Ala Gly 260 265 270Glu Ala Val Gly Val Leu Leu Arg Gly Ile Lys Arg Glu Glu Ile Glu 275 280 285Arg Gly Gln Val Leu Ala Lys Pro Gly Thr Ile Lys Pro His Thr Lys 290 295 300Phe Glu Ser Glu Val Tyr Ile Leu Ser Lys Asp Glu Gly Gly Arg His305 310 315 320Thr Pro Phe Phe Lys Gly Tyr Arg Pro Gln Phe Tyr Phe Arg Thr Thr 325 330 335Asp Val Thr Gly Thr Ile Glu Leu Pro Glu Gly Val Glu Met Val Met 340 345 350Pro Gly Asp Asn Ile Lys Met Val Val Thr Leu Ile His Pro Ile Ala 355 360 365Met Asp Asp Gly Leu Arg Phe Ala Ile Arg Glu Gly Gly Arg Thr Val 370 375 380Gly Ala Gly Val Val Ala Lys Val Leu Arg Asp Pro Asn Ser Ser Ser385 390 395 400Val Asp Lys Leu Ala Ala Ala Leu Glu 4053409PRTArtificial SequenceSynthetic Construct 3Met Ser Lys Glu Lys Phe Glu Arg 
Thr Lys Pro His Val Asn Val Gly1 5 10 15Thr Ile Gly His Val Asp His Gly Lys Thr Thr Leu Thr Ala Ala Ile 20 25 30Thr Thr Val Leu Ala Lys Thr Tyr Gly Gly Ala Ala Arg Ala Phe Asp 35 40 45Gln Ile Asp Asn Ala Pro Glu Glu Lys Ala Arg Gly Ile Thr Ile Asn 50 55 60Thr Ser Arg Val Glu Tyr Asp Thr Pro Thr Arg His Tyr Ala His Val65 70 75 80Asp Cys Pro Gly His Ala Asp Tyr Val Lys Asn Met Ile Thr Gly Ala 85 90 95Ala Gln Met Asp Gly Ala Ile Leu Val Val Ala Ala Thr Asp Gly Pro 100 105 110Met Pro Gln Thr Arg Glu His Ile Leu Leu Gly Arg Gln Val Gly Val 115 120 125Pro Tyr Ile Ile Val Phe Leu Asn Lys Cys Asp Met Val Asp Asp Glu 130 135 140Glu Leu Leu Glu Leu Val Glu Met Glu Val Arg Glu Leu Leu Ser Gln145 150 155 160Tyr Asp Phe Pro Gly Asp Asp Thr Pro Ile Val Arg Gly Ser Ala Leu 165 170 175Lys Ala Leu Glu Gly Asp Ala Glu Trp Glu Ala Lys Ile Leu Glu Leu 180 185 190Ala Gly Phe Leu Asp Ser Tyr Ile Pro Glu Pro Glu Arg Ala Ile Asp 195 200 205Lys Pro Phe Leu Leu Pro Ile Asn Gly Val Tyr Ser Ile Ser Gly Arg 210 215 220Gly Thr Val Val Ser Gly Arg Val Glu Arg Gly Ile Ile Lys Val Gly225 230 235 240Glu Glu Val Glu Ile Val Gly Ile Lys Glu Thr Gln Lys Ser Thr Cys 245 250 255Thr Gly Val Glu Met Phe Arg Lys Leu Leu Asp Glu Gly Arg Ala Gly 260 265 270Glu Trp Val Gly Val Leu Leu Arg Gly Ile Lys Arg Glu Glu Ile Glu 275 280 285Arg Gly Gln Val Leu Ala Lys Pro Gly Thr Ile Lys Pro His Thr Lys 290 295 300Phe Glu Ser Glu Val Tyr Ile Leu Ser Lys Asp Glu Gly Gly Arg His305 310 315 320Thr Pro Phe Phe Lys Gly Tyr Arg Pro Gln Phe Tyr Phe Arg Thr Thr 325 330 335Asp Val Thr Gly Thr Ile Glu Leu Pro Glu Gly Val Glu Met Val Met 340 345 350Pro Gly Asp Asn Ile Lys Met Val Val Thr Leu Ile His Pro Ile Ala 355 360 365Met Asp Asp Gly Leu Arg Phe Ala Ile Arg Glu Gly Gly Arg Thr Val 370 375 380Gly Ala Gly Val Val Ala Lys Val Leu Arg Asp Pro Asn Ser Ser Ser385 390 395 400Val Asp Lys Leu Ala Ala Ala Leu Glu 4054409PRTArtificial SequenceSynthetic Construct 4Met Ser Lys Glu Lys Phe Glu Arg Thr Lys Pro His Val Asn Val Gly1 5 10 
15Thr Ile Gly His Val Asp His Gly Lys Thr Thr Leu Thr Ala Ala Ile 20 25 30Thr Thr Val Leu Ala Lys Thr Tyr Gly Gly Ala Ala Arg Ala Phe Asp 35 40 45Gln Ile Asp Asn Ala Pro Glu Glu Lys Ala Arg Gly Ile Thr Ile Asn 50 55 60Thr Ser Arg Val Glu Tyr Asp Thr Pro Thr Arg His Tyr Ala His Val65 70 75 80Asp Cys Pro Gly His Ala Asp Tyr Val Lys Asn Met Ile Thr Gly Ala 85 90 95Ala Gln Met Asp Gly Ala Ile Leu Val Val Ala Ala Thr Asp Gly Pro 100 105 110Met Pro Gln Thr Arg Glu His Ile Leu Leu Gly Arg Gln Val Gly Val 115 120 125Pro Tyr Ile Ile Val Phe Leu Asn Lys Cys Asp Met Val Asp Asp Glu 130 135 140Glu Leu Leu Glu Leu Val Glu Met Glu Val Arg Glu Leu Leu Ser Gln145 150 155 160Tyr Asp Phe Pro Gly Asp Asp Thr Pro Ile Val Arg Gly Ser Ala Leu 165 170 175Lys Ala Leu Glu Gly Asp Ala Glu Trp Glu Ala Lys Ile Leu Glu Leu 180 185 190Ala Gly Phe Leu Asp Ser Tyr Ile Pro Glu Pro Glu Arg Ala Ile Asp 195 200 205Lys Pro Phe Leu Leu Pro Ile Thr Ala Val Tyr Ser Ile Ser Gly Arg 210 215 220Gly Thr Val Val Ser Gly Arg Val Glu Arg Gly Ile Ile Lys Val Gly225 230 235 240Glu Glu Val Glu Ile Val Gly Ile Lys Glu Thr Gln Lys Ser Thr Cys 245 250 255Thr Gly Val Glu Met Phe Arg Lys Leu Leu Asp Glu Gly Arg Ala Gly 260 265 270Glu Ala Val Gly Val Leu Leu Arg Gly Ile Lys Arg Glu Glu Ile Glu 275 280 285Arg Gly Gln Val Leu Ala Lys Pro Gly Thr Ile Lys Pro His Thr Lys 290 295 300Phe Glu Ser Glu Val Tyr Ile Leu Ser Lys Asp Glu Gly Gly Arg His305 310 315 320Thr Pro Phe Phe Lys Gly Tyr Arg Pro Gln Phe Tyr Phe Arg Thr Thr 325 330 335Asp Val Thr Gly Thr Ile Glu Leu Pro Glu Gly Val Glu Met Val Met 340 345 350Pro Gly Asp Asn Ile Lys Met Val Val Thr Leu Ile His Pro Ile Ala 355 360 365Met Asp Asp Gly Leu Arg Phe Ala Ile Arg Glu Gly Gly Arg Thr Val 370 375 380Gly Ala Gly Val Val Ala Lys Val Leu Arg Asp Pro Asn Ser Ser Ser385 390 395 400Val Asp Lys Leu Ala Ala Ala Leu Glu 40551227DNAArtificial SequenceSynthetic Construct 5atgtctaaag aaaagtttga acgtacaaaa ccgcacgtta acgtcggtac tatcggccac 60gttgaccatg gtaaaacaac gctgaccgct gcaatcacta 
ccgtactggc taaaacctac 120ggcggtactg ctcgcgcatt cgaccagatc gataacgcgc cggaagaaaa agctcgtggt 180atcaccatca acacttctcg ggttgaatac gacaccccga cccgtcacta cgcacacgta 240gactgcccgg ggcacgccga ctatgttaaa aacatgatca ccggtgctgc gcagatggac 300ggcgcgatcc tggtagttgc tgcgactgac ggcccgatgc cgcagactcg tgagcacatc 360ctgctgggtc gtcaggtagg cgttccgtac atcatcgtgt tcctgaacaa atgcgacatg 420gttgatgacg aagagctgct ggaactggtt gaaatggaag ttcgtgaact tctgtctcag 480tacgacttcc cgggcgacga cactccgatc gttcgtggtt ctgctctgaa agcgctggaa 540ggcgacgcag agtgggaagc gaaaatcctg gaactggctg gcttcctgga ttcttacatt 600ccggaaccag agcgtgcgat tgacaagccg ttcctgctgc cgatcacccg ggtatactcc 660atctccggtc gtggtaccgt tgtttcgggt cgtgtagaac gcggtatcat caaagttggt 720gaagaagttg aaatcgttgg tatcaaagag actcagaagt ctacctgtac tggcgttgaa 780atgttccgca aactgctgga cgaaggccgt gctggtgagt tcgtaggtgt tctgctgcgt 840ggtatcaaac gtgaagaaat cgaacgtggt caggtactgg ctaagccggg caccatcaag 900ccgcacacca agttcgaatc tgaagtgtac attctgtcca aagatgaagg cggccgtcat 960actccgttct tcaaaggcta ccgtccgcag ttctacttcc gtactactga cgtgactggt 1020accatcgaac tgccggaagg cgtagagatg gtaatgccgg gcgacaacat caaaatggtt 1080gttaccctga tccacccgat cgcgatggac gacggtctgc gtttcgcaat ccgtgaaggc 1140ggccgtaccg ttggcgcggg cgttgtagca aaagttctga gggatccgaa ttcgagctcc 1200gtcgacaagc ttgcggccgc actcgag 122761227DNAArtificial SequenceSynthetic Construct 6atgtctaaag aaaagtttga acgtacaaaa ccgcacgtta acgtcggtac tatcggccac 60gttgaccatg gtaaaacaac gctgaccgct gcaatcacta ccgtactggc taaaacctac 120ggcggtgctg ctcgcgcatt cgaccagatc gataacgcgc cggaagaaaa agctcgtggt 180atcaccatca acacttctag ggttgaatac gacaccccga cccgtcacta cgcacacgta 240gactgcccgg ggcacgccga ctatgttaaa aacatgatca ccggtgctgc gcagatggac 300ggcgcgatcc tggtagttgc tgcgactgac ggcccgatgc cgcagactcg tgagcacatc 360ctgctgggtc gtcaggtagg cgttccgtac atcatcgtgt tcctgaacaa atgcgacatg 420gttgatgacg aagagctgct ggaactggtt gaaatggaag ttcgtgaact tctgtctcag 480tacgacttcc cgggcgacga cactccgatc gttcgtggtt ctgctctgaa agcgctggaa 540ggcgacgcag agtgggaagc 
gaaaatcctg gaactggctg gcttcctgga ttcttacatt 600
ccggaaccag agcgtgcgat tgacaagccg ttcctgctgc cgatcaccta cgtatactcc 660
atctccggtc gtggtaccgt tgtttcgggt cgtgtagaac gcggtatcat caaagttggt 720
gaagaagttg aaatcgttgg tatcaatgag actcagaagt ctacctgtac tggcgttgaa 780
atgttccgca aactgctgga cgaaggccgt gctggtgagg cggtaggtgt tctgctgcgt 840
ggtatcaaac gtgaagaaat cgaacgtggt caggtactgg ctaagccggg caccatcaag 900
ccgcacacca agttcgaatc tgaagtgtac attctgtcca aagatgaagg cggccgtcat 960
actccgttct tcaaaggcta ccgtccgcag ttctacttcc gtactactga cgtgactggt 1020
accatcgaac tgccggaagg cgtagagatg gtaatgccgg gcgacaacat caaaatggtt 1080
gttaccctga tccacccgat cgcgatggac gacggtctgc gtttcgcaat ccgtgaaggc 1140
ggccgtaccg ttggcgcggg cgttgtagca aaagttctga gggatccgaa ttcgagctcc 1200
gtcgacaagc ttgcggccgc actcgag 1227

SEQ ID NO: 7; Length: 1227; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
atgtctaaag aaaagtttga acgtacaaaa ccgcacgtta acgtcggtac tatcggccac 60
gttgaccatg gtaaaacaac gctgaccgct gcaatcacta ccgtactggc taaaacctac 120
ggcggtgctg ctcgcgcatt cgaccagatc gataacgcgc cggaagaaaa agctcgtggt 180
atcaccatca acacttctcg ggttgaatac gacaccccga cccgtcacta cgcacacgta 240
gactgcccgg ggcacgccga ctatgttaaa aacatgatca ccggtgctgc gcagatggac 300
ggcgcgatcc tggtagttgc tgcgactgac ggcccgatgc cgcagactcg tgagcacatc 360
ctgctgggtc gtcaggtagg cgttccgtac atcatcgtgt tcctgaacaa atgcgacatg 420
gttgatgacg aagagctgct ggaactggtt gaaatggaag ttcgtgaact tctgtctcag 480
tacgacttcc cgggcgacga cactccgatc gttcgtggtt ctgctctgaa agcgctggaa 540
ggcgacgcag agtgggaagc gaaaatcctg gaactggctg gcttcctgga ttcttacatt 600
ccggaaccag agcgtgcgat tgacaagccg ttcctgctgc cgatcaacgg ggtatactcc 660
atctccggtc gtggtaccgt tgtttcgggt cgtgtagaac gcggtatcat caaagttggt 720
gaagaagttg aaatcgttgg tatcaaagag actcagaagt ctacctgtac tggcgttgaa 780
atgttccgca aactgctgga cgaaggccgt gctggtgagt gggtaggtgt tctgctgcgt 840
ggtatcaaac gtgaagaaat cgaacgtggt caggtactgg ctaagccggg caccatcaag 900
ccgcacacca agttcgaatc tgaagtgtac attctgtcca aagatgaagg cggccgtcat 960
actccgttct tcaaaggcta ccgtccgcag ttctacttcc gtactactga cgtgactggt 1020
accatcgaac tgccggaagg cgtagagatg gtaatgccgg gcgacaacat caaaatggtt 1080
gttaccctga tccacccgat cgcgatggac gacggtctgc gtttcgcaat ccgtgaaggc 1140
ggccgtaccg ttggcgcggg cgttgtagca aaagttctga gggatccgaa ttcgagctcc 1200
gtcgacaagc ttgcggccgc actcgag 1227

SEQ ID NO: 8; Length: 1227; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
atgtctaaag aaaagtttga acgtacaaaa ccgcacgtta acgtcggtac tatcggccac 60
gttgaccatg gtaaaacaac gctgaccgct gcaatcacta ccgtactggc taaaacctac 120
ggcggtgctg ctcgcgcatt cgaccagatc gataacgcgc cggaagaaaa agctcgtggt 180
atcaccatca acacttctcg ggttgaatac gacaccccga cccgtcacta cgcacacgta 240
gactgcccgg ggcacgccga ctatgttaaa aacatgatca ccggtgctgc gcagatggac 300
ggcgcgatcc tggtagttgc tgcgactgac ggcccgatgc cgcagactcg tgagcacatc 360
ctgctgggtc gtcaggtagg cgttccgtac atcatcgtgt tcctgaacaa atgcgacatg 420
gttgatgacg aagagctgct ggaactggtt gaaatggaag ttcgtgaact tctgtctcag 480
tacgacttcc cgggcgacga cactccgatc gttcgtggtt ctgctctgaa agcgctggaa 540
ggcgacgcag agtgggaagc gaaaatcctg gaactggctg gcttcctgga ttcttacatt 600
ccggaaccag agcgtgcgat tgacaagccg ttcctgctgc cgatcaccgc ggtatactcc 660
atctccggtc gtggtaccgt tgtttcgggt cgtgtagaac gcggtatcat caaagttggt 720
gaagaagttg aaatcgttgg tatcaaagag actcagaagt ctacctgtac tggcgttgaa 780
atgttccgca aactgctgga cgaaggccgt gctggtgagg ccgtaggtgt tctgctgcgt 840
ggtatcaaac gtgaagaaat cgaacgtggt caggtactgg ctaagccggg caccatcaag 900
ccgcacacca agttcgaatc tgaagtgtac attctgtcca aagatgaagg cggccgtcat 960
actccgttct tcaaaggcta ccgtccgcag ttctacttcc gtactactga cgtgactggt 1020
accatcgaac tgccggaagg cgtagagatg gtaatgccgg gcgacaacat caaaatggtt 1080
gttaccctga tccacccgat cgcgatggac gacggtctgc gtttcgcaat ccgtgaaggc 1140
ggccgtaccg ttggcgcggg cgttgtagca aaagttctga gggatccgaa ttcgagctcc 1200
gtcgacaagc ttgcggccgc actcgag 1227

SEQ ID NO: 9; Length: 36; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
tgcgcaatgc ggccgcccgt agcgccgatg gtagtg 36

SEQ ID NO: 10; Length: 33; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
acacggagat ctctaaagta tatatgagta aac 33

SEQ ID NO: 11; Length: 37; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
tgcgcaatgc ggccgcccgg gtcgaatttg ctttcga 37

SEQ ID NO: 12; Length: 33; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
acacggagat ctatgccccg cgcccaccgg aag 33

SEQ ID NO: 13; Length: 38; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
tgcagcaatg cggccgcttt caccgtcatc accgaaac 38

SEQ ID NO: 14; Length: 32; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
gggacgctag caaacaaaaa gagtttgtag aa 32

SEQ ID NO: 15; Length: 32; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
gggacgctag cttttctctg gtcccgccgc at 32

SEQ ID NO: 16; Length: 37; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
tgcgcaatgc ggccgcggtg gcacttttcg gggaaat 37

SEQ ID NO: 17; Length: 30; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
gcatgcgccg ccagctgttg cccgtctcgc 30

SEQ ID NO: 18; Length: 31; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
gcatagatct tcagctggcg aaagggggat g 31

SEQ ID NO: 19; Length: 41; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
Feature: misc_feature (21)..(23); n is a, c, g, or t
gtatcaccat caacacttct nnngttgaat acgacacccc g 41

SEQ ID NO: 20; Length: 20; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
agaagtgttg atggtgatac 20

SEQ ID NO: 21; Length: 48; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
Feature: misc_feature (19)..(24); n is a, c, g, or t
Feature: misc_feature (28)..(30); n is a, c, g, or t
ccgttcctgc tgccgatcnn nnnngtannn tccatctccg gtcgtggt 48

SEQ ID NO: 22; Length: 18; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
gatcggcagc aggaacgg 18

SEQ ID NO: 23; Length: 38; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
Feature: misc_feature (19)..(21); n is a, c, g, or t
ggtcgtggta ccgttgttnn nggtcgtgta gaacgcgg 38

SEQ ID NO: 24; Length: 18; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
aacaacggta ccacgacc 18

SEQ ID NO: 25; Length: 38; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
Feature: misc_feature (19)..(21); n is a, c, g, or t
gaaggccgtg ctggtgagnn ngtaggtgtt ctgctgcg 38

SEQ ID NO: 26; Length: 18; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ctcaccagca cggccttc 18

SEQ ID NO: 27; Length: 28; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ggaattccat atggcggcgg cggcggcg 28

SEQ ID NO: 28; Length: 27; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ccgctcgagt taagatctgt atcctgg 27

SEQ ID NO: 29; Length: 30; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
aaggaaatta atgaaaatcg aagaaggtaa 30

SEQ ID NO: 30; Length: 18; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ctagaggatc cggcgcgc 18

SEQ ID NO: 31; Length: 1215; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
atgccgaaga agaaaccgac cccgatccag ctgaacccgg ctccggacgg ttctgctgtt 60
aacggcacct cttctgctga aaccaacctg gaagctctgc aaaagaaact ggaagaactg 120
gaactggacg aacagcagcg taaacgtctg gaagcgttcc tgacccagaa acagaaagtt 180
ggtgaactga aagacgacga cttcgaaaaa atctctgaac tgggtgctgg taacggtggt 240
gttgttttca aagtttctca caaaccgtcc ggtctggtta tggctcgtaa actgatccac 300
ctggaaatca aaccggctat ccgtaaccag atcatccgtg aactgcaagt tctgcacgaa 360
tgcaactctc cgtacatcgt tggtttctac ggtgctttct actctgacgg tgaaatctct 420
atctgcatgg aacacatgga cggtggttct ctggaccagg ttctgaaaaa agctggtcgt 480
atcccggaac agatcctggg taaagtttct atcgctgtta tcaaaggtct gacctacctg 540
cgtgaaaaac acaaaatcat gcaccgtgac gttaaaccgt ctaacatcct ggttaactct 600
cgtggtgaaa tcaaactgtg cgacttcggt gtttctggtc agctgatcga ctctatggct 660
aactctttcg ttggcacccg ttcttacatg tctccggaac gtctgcaagg cacccactac 720
tctgttcagt ctgacatctg gtctatgggt ctgtctctgg ttgaaatggc tgttggtcgt 780
tacccgatcc cgccgccgga cgctaaagaa ctggaactga tgttcggttg ccaggttgaa 840
ggtgacgctg ctgaaacccc gccgcgtccg cgtactccgg gtcgtccgct gtcttcttac 900
ggtatggact ctcgtccgcc gatggctatc ttcgaactgc tggactacat cgttaacgaa 960
ccgccgccga aactgccgtc tggtgttttc tctctggagt tccaggactt cgttaacaaa 1020
tgcctgatca aaaacccggc tgaacgtgct gacctgaaac agctgatggt tcacgctttc 1080
atcaaacgtt ctgacgctga agaagttgac ttcgctggtt ggctgtgctc taccatcggt 1140
ctgaaccagc cgtctacccc gacccacgct gctggtgtgg cagccgcagc tgcgcatcat 1200
caccaccatc actaa 1215

SEQ ID NO: 32; Length: 36; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ataagaatgc ggccgcgccg cagccgaacg accgag 36

SEQ ID NO: 33; Length: 29; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ctagctagcg tctgacgctc agtggaacg 29

SEQ ID NO: 34; Length: 27; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ctagctagct cactcggtcg ctacgct 27

SEQ ID NO: 35; Length: 36; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ataagaatgc ggccgctgaa atctagagcg gttcag 36

SEQ ID NO: 36; Length: 34; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
aaaaggcgcc gccagcctag ccgggtcctc aacg 34

SEQ ID NO: 37; Length: 26; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
aactgcagcc aatccggata tagttc 26

SEQ ID NO: 38; Length: 37; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
ggaattccat atgaccaaga ccccgccggc agcagtt 37

SEQ ID NO: 39; Length: 24; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
aggcgcgcct tagtcctcca ctac 24

SEQ ID NO: 40; Length: 22; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Construct
Feature: MOD_RES (13)..(13); PHOSPHORYLATION
Leu Cys Asp Phe Gly Val Ser Gly Gln Leu Ile Asp Ser Met Ala Asn
1               5                   10                  15
Ser Phe Val Gly Thr Arg
            20

SEQ ID NO: 41; Length: 72; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
gccggggtag tctaggggtt aggcagcgga ctgcagatcc gccttacgtg ggttcaaatc 60
ccacccccgg ct 72

SEQ ID NO: 42; Length: 1650; Type: DNA; Organism: Methanocaldococcus jannaschii
atgaaattaa aacataaaag ggatgataaa atgagatttg atataaaaaa ggttttagag 60
ttagcagaga aggattttga gacggcatgg agagagacaa gggcattaat aaaggataaa 120
catattgaca ataaatatcc aagattaaag cctgtctatg gaaagccaca tccagtgatg 180
gagacgatag agagattaag acaagcttat ctaagaatgg gatttgaaga gatgattaat 240
ccagttatcg ttgatgagat ggagatttat aagcaatttg gaccagaagc aatggcagtt 300
ttagatagat gtttttactt ggctggatta ccaaggccag atgttggttt aggaaatgag 360
aaggttgaga ttataaaaaa tttgggcata gatatagatg aggagaaaaa agagaggttg 420
agagaagttt tacatttata caaaaaagga gctatagatg gggatgattt agtctttgag 480
attgccaaag ctttaaatgt gagtaatgaa atgggattga aggttttaga aactgcattt 540
cctgaattta aagatttgaa gccagaatca acaactctaa ctttaagaag ccacatgaca 600
tctgggtggt ttataactct aagcagttta ataaagaaga gaaaactgcc tttaaagtta 660
ttctctatag atagatgttt tagaagggag caaagagagg atagaagcca tttaatgagt 720
tatcactctg catcttgtgt agttgttggt gaagatgtta gtgtagatga tggaaaggta 780
gttgctgaag gattgttggc tcaatttgga tttacaaaat ttaagtttaa gccagatgag 840
aaaaagagta agtattatac accagaaact caaacagagg tttatgccta tcatccaaag 900
ttgggagagt ggattgaagt agcaaccttt ggagtttatt caccaattgc attagctaaa 960
tataacatag atgtgccagt tatgaacctt ggcttaggag ttgagaggtt ggcaatgatt 1020
atttacggct atgaggatgt tagggcaatg gtttatcctc aattttatga atacaggttg 1080
agtgatagag atatagctgg gatgataaga gttgataaag ttcctatatt ggatgaattc 1140
tacaactttg caaatgagct tattgatata tgcatagcaa ataaagataa ggaaagccca 1200
tgttcagttg aagttaaaag ggaattcaat ttcaatgggg agagaagagt aattaaagta 1260
gaaatatttg agaatgaacc aaataaaaag cttttaggtc cttctgtgtt aaatgaggtt 1320
tatgtctatg atggaaatat atatggcatt ccgccaacgt ttgaaggggt taaagaacag 1380
tatatcccaa ttttaaagaa agctaaggaa gaaggagttt ctacaaacat tagatacata 1440
gatgggatta tctataaatt agtagctaag attgaagagg ctttagtttc aaatgtggat 1500
gaatttaagt tcagagtccc aatagttaga agtttgagtg acataaacct aaaaattgat 1560
gaattggctt taaaacagat aatgggggag aataaggtta tagatgttag gggaccagtt 1620
ttcttaaatg caaaggttga gataaaatag 1650

SEQ ID NO: 43; Length: 549; Type: PRT; Organism: Methanocaldococcus jannaschii
Met Lys Leu Lys His Lys Arg Asp Asp Lys Met Arg Phe Asp Ile Lys
1               5                   10                  15
Lys Val Leu Glu Leu Ala Glu Lys Asp Phe Glu Thr Ala Trp Arg Glu
            20                  25                  30
Thr Arg Ala Leu Ile Lys Asp Lys His Ile Asp Asn Lys Tyr Pro Arg
        35                  40                  45
Leu Lys Pro Val Tyr Gly Lys Pro His Pro Val Met Glu Thr Ile Glu
    50                  55                  60
Arg Leu Arg Gln Ala Tyr Leu Arg Met Gly Phe Glu Glu Met Ile Asn
65                  70                  75                  80
Pro Val Ile Val Asp Glu Met Glu Ile Tyr Lys Gln Phe Gly Pro Glu
                85                  90                  95
Ala Met Ala Val Leu Asp Arg Cys Phe Tyr Leu Ala Gly Leu Pro Arg
            100                 105                 110
Pro Asp Val Gly Leu Gly Asn Glu Lys Val Glu Ile Ile Lys Asn Leu
        115                 120                 125
Gly Ile Asp Ile Asp Glu Glu Lys Lys Glu Arg Leu Arg Glu Val Leu
    130                 135                 140
His Leu Tyr Lys Lys Gly Ala Ile Asp Gly Asp Asp Leu Val Phe Glu
145                 150                 155                 160
Ile Ala Lys Ala Leu Asn Val Ser Asn Glu Met Gly Leu Lys Val Leu
                165                 170                 175
Glu Thr Ala Phe Pro Glu Phe Lys Asp Leu Lys Pro Glu Ser Thr Thr
            180                 185                 190
Leu Thr Leu Arg Ser His Met Thr Ser Gly Trp Phe Ile Thr Leu Ser
        195                 200                 205
Ser Leu Ile Lys Lys Arg Lys Leu Pro Leu Lys Leu Phe Ser Ile Asp
    210                 215                 220
Arg Cys Phe Arg Arg Glu Gln Arg Glu Asp Arg Ser His Leu Met Ser
225                 230                 235                 240
Tyr His Ser Ala Ser Cys Val Val Val Gly Glu Asp Val Ser Val Asp
                245                 250                 255
Asp Gly Lys Val Val Ala Glu Gly Leu Leu Ala Gln Phe Gly Phe Thr
            260                 265                 270
Lys Phe Lys Phe Lys Pro Asp Glu Lys Lys Ser Lys Tyr Tyr Thr Pro
        275                 280                 285
Glu Thr Gln Thr Glu Val Tyr Ala Tyr His Pro Lys Leu Gly Glu Trp
    290                 295                 300
Ile Glu Val Ala Thr Phe Gly Val Tyr Ser Pro Ile Ala Leu Ala Lys
305                 310                 315                 320
Tyr Asn Ile Asp Val Pro Val Met Asn Leu Gly Leu Gly Val Glu Arg
                325                 330                 335
Leu Ala Met Ile Ile Tyr Gly Tyr Glu Asp Val Arg Ala Met Val Tyr
            340                 345                 350
Pro Gln Phe Tyr Glu Tyr Arg Leu Ser Asp Arg Asp Ile Ala Gly Met
        355                 360                 365
Ile Arg Val Asp Lys Val Pro Ile Leu Asp Glu Phe Tyr Asn Phe Ala
    370                 375                 380
Asn Glu Leu Ile Asp Ile Cys Ile Ala Asn Lys Asp Lys Glu Ser Pro
385                 390                 395                 400
Cys Ser Val Glu Val Lys Arg Glu Phe Asn Phe Asn Gly Glu Arg Arg
                405                 410                 415
Val Ile Lys Val Glu Ile Phe Glu Asn Glu Pro Asn Lys Lys Leu Leu
            420                 425                 430
Gly Pro Ser Val Leu Asn Glu Val Tyr Val Tyr Asp Gly Asn Ile Tyr
        435                 440                 445
Gly Ile Pro Pro Thr Phe Glu Gly Val Lys Glu Gln Tyr Ile Pro Ile
    450                 455                 460
Leu Lys Lys Ala Lys Glu Glu Gly Val Ser Thr Asn Ile Arg Tyr Ile
465                 470                 475                 480
Asp Gly Ile Ile Tyr Lys Leu Val Ala Lys Ile Glu Glu Ala Leu Val
                485                 490                 495
Ser Asn Val Asp Glu Phe Lys Phe Arg Val Pro Ile Val Arg Ser Leu
            500                 505                 510
Ser Asp Ile Asn Leu Lys Ile Asp Glu Leu Ala Leu Lys Gln Ile Met
        515                 520                 525
Gly Glu Asn Lys Val Ile Asp Val Arg Gly Pro Val Phe Leu Asn Ala
    530                 535                 540
Lys Val Glu Ile Lys
545

SEQ ID NO: 44; Length: 492; Type: DNA; Organism: Artificial Sequence; Other information: Synthetic Construct
atggttctgt ctgaaggtga atggcagctg gttctgcacg tttgggctaa agttgaagct 60
gacgttgctg gtcacggtca ggacatcctg atccgtctgt tcaaatctca cccggaaacc 120
ctggaaaaat tcgaccgttt caaacacctg aaaaccgaag ctgaaatgaa ggcttctgaa 180
gacctgaaaa aacacggtgt taccgttctg accgctctgg gtgctatcct gaagaaaaag 240
ggtcaccacg aagctgaact gaaaccgctg gctcagtctc acgctaccaa acacaaaatc 300
ccgatcaaat acctggagtt catctctgaa gctatcatcc acgttctgca ctctcgtcat 360
ccgggtaact tcggtgctga cgctcagggt gctatgaaca aagctctgga actgttccgt 420
aaagacatcg ctgctaaata caaagaactg ggttaccagg gtggttctgg tcatcaccat 480
caccatcact aa 492

SEQ ID NO: 45; Length: 1614; Type: DNA; Organism: Methanococcus maripaludis
atgtttaaaa gagaagaaat cattgaaatg gccaataagg actttgaaaa agcatggatc 60
gaaactaaag accttataaa agctaaaaag ataaacgaaa gttacccaag aataaaacca 120
gtttttggaa aaacacaccc tgtaaatgac actattgaaa atttaagaca ggcatatctt 180
agaatgggtt ttgaagaata tataaaccca gtaattgtcg atgaaagaga tatttataaa 240
caattcggcc cagaagctat ggcagttttg gatagatgct tttatttagc gggacttcca 300
agacctgacg ttggtttgag cgatgaaaaa atttcacaga ttgaaaaact tggaattaaa 360
gtttctgagc acaaagaaag tttacaaaaa atacttcacg gatacaaaaa aggaactctt 420
gatggtgacg atttagtttt agaaatttca aatgcacttg aaatttcaag cgagatgggt 480
ttaaaaattt tagaagatgt tttcccagaa tttaaggatt taaccgcagt ttcttcaaaa 540
ttaactttaa gaagccacat gacttcagga tggttcctta ctgtttcaga cctcatgaac 600
aaaaaaccct tgccatttaa actcttttca atcgatagat gttttagaag agaacaaaaa 660
gaagataaaa gccacttaat gacataccac tctgcatcct gtgcaattgc aggtgaaggc 720
gtggatatta atgatggaaa agcaattgca gaaggattat tatcccaatt tggctttaca 780
aactttaaat tcattcctga tgaaaagaaa agtaaatact acacccctga aacacagact 840
gaagtttacg cataccaccc aaaattaaaa gaatggctcg aagttgctac atttggagta 900
tattcgccag ttgcattaag caaatacgga atagatgtac ctgtaatgaa tttgggtctt 960
ggtgttgaaa gacttgcaat gatttctgga aatttcgcag atgttcgaga aatggtatat 1020
cctcagtttt acgaacacaa acttaatgac cggaatgtcg cttcaatggt aaaactcgat 1080
aaagttccag taatggatga aatttacgat ttaacaaaag aattaattga gtcatgtgtt 1140
aaaaacaaag atttaaaatc cccttgtgaa ttagctattg aaaaaacgtt ttcatttgga 1200
aaaaccaaga aaaatgtaaa aataaacatt tttgaaaaag aagaaggtaa aaatttactc 1260
ggaccttcaa ttttaaacga aatctacgtt tacgatggaa atgtaattgg aattcctgaa 1320
agctttgacg gagtaaaaga agaatttaaa gacttcttag aaaaaggaaa atcagaaggg 1380
gtagcaacag gcattcgata tatcgatgcg ctttgcttta aaattacttc aaaattagaa 1440
gaagcatttg tgtcaaacac tactgaattc aaagttaaag ttccaattgt cagaagttta 1500
agcgacatta acttaaaaat cgatgatatc gcattaaaac agatcatgag caaaaataaa 1560
gtaatcgacg ttagaggccc agtcttttta aatgtcgaag taaaaattga ataa 1614

SEQ ID NO: 46; Length: 537; Type: PRT; Organism: Methanococcus maripaludis
Met Phe Lys Arg Glu Glu Ile Ile Glu Met Ala Asn Lys Asp Phe Glu
1               5                   10                  15
Lys Ala Trp Ile Glu Thr Lys Asp Leu Ile Lys Ala Lys Lys Ile Asn
            20                  25                  30
Glu Ser Tyr Pro Arg Ile Lys Pro Val Phe Gly Lys Thr His Pro Val
        35                  40                  45
Asn Asp Thr Ile Glu Asn Leu Arg Gln Ala Tyr Leu Arg Met Gly Phe
    50                  55                  60
Glu Glu Tyr Ile Asn Pro Val Ile Val Asp Glu Arg Asp Ile Tyr Lys
65                  70                  75                  80
Gln Phe Gly Pro Glu Ala Met Ala Val Leu Asp Arg Cys Phe Tyr Leu
                85                  90                  95
Ala Gly Leu Pro Arg Pro Asp Val Gly Leu Ser Asp Glu Lys Ile Ser
            100                 105                 110
Gln Ile Glu Lys Leu Gly Ile Lys Val Ser Glu His Lys Glu Ser Leu
        115                 120                 125
Gln Lys Ile Leu His Gly Tyr Lys Lys Gly Thr Leu Asp Gly Asp Asp
    130                 135                 140
Leu Val Leu Glu Ile Ser Asn Ala Leu Glu Ile Ser Ser Glu Met Gly
145                 150                 155                 160
Leu Lys Ile Leu Glu Asp Val Phe Pro Glu Phe Lys Asp Leu Thr Ala
                165                 170                 175
Val Ser Ser Lys Leu Thr Leu Arg Ser His Met Thr Ser Gly Trp Phe
            180                 185                 190
Leu Thr Val Ser Asp Leu Met Asn Lys Lys Pro Leu Pro Phe Lys Leu
        195                 200                 205
Phe Ser Ile Asp Arg Cys Phe Arg Arg Glu Gln Lys Glu Asp Lys Ser
    210                 215                 220
His Leu Met Thr Tyr His Ser Ala Ser Cys Ala Ile Ala Gly Glu Gly
225                 230                 235                 240
Val Asp Ile Asn Asp Gly Lys Ala Ile Ala Glu Gly Leu Leu Ser Gln
                245                 250                 255
Phe Gly Phe Thr Asn Phe Lys Phe Ile Pro Asp Glu Lys Lys Ser Lys
            260                 265                 270
Tyr Tyr Thr Pro Glu Thr Gln Thr Glu Val Tyr Ala Tyr His Pro Lys
        275                 280                 285
Leu Lys Glu Trp Leu Glu Val Ala Thr Phe Gly Val Tyr Ser Pro Val
    290                 295                 300
Ala Leu Ser Lys Tyr Gly Ile Asp Val Pro Val Met Asn Leu Gly Leu
305                 310                 315                 320
Gly Val Glu Arg Leu Ala Met Ile Ser Gly Asn Phe Ala Asp Val Arg
                325                 330                 335
Glu Met Val Tyr Pro Gln Phe Tyr Glu His Lys Leu Asn Asp Arg Asn
            340                 345                 350
Val Ala Ser Met Val Lys Leu Asp Lys Val Pro Val Met Asp Glu Ile
        355                 360                 365
Tyr Asp Leu Thr Lys Glu Leu Ile Glu Ser Cys Val Lys Asn Lys Asp
    370                 375                 380
Leu Lys Ser Pro Cys Glu Leu Ala Ile Glu Lys Thr Phe Ser Phe Gly
385                 390                 395                 400
Lys Thr Lys Lys Asn Val Lys Ile Asn Ile Phe Glu Lys Glu Glu Gly
                405                 410                 415
Lys Asn Leu Leu Gly Pro Ser Ile Leu Asn Glu Ile Tyr Val Tyr Asp
            420                 425                 430
Gly Asn Val Ile Gly Ile Pro Glu Ser Phe Asp Gly Val Lys Glu Glu
        435                 440                 445
Phe Lys Asp Phe Leu Glu Lys Gly Lys Ser Glu Gly Val Ala Thr Gly
    450                 455                 460
Ile Arg Tyr Ile Asp Ala Leu Cys Phe Lys Ile Thr Ser Lys Leu Glu
465                 470                 475                 480
Glu Ala Phe Val Ser Asn Thr Thr Glu Phe Lys Val Lys Val Pro Ile
                485                 490                 495
Val Arg Ser Leu Ser Asp Ile Asn Leu Lys Ile Asp Asp Ile Ala Leu
            500                 505                 510
Lys Gln Ile Met Ser Lys Asn Lys Val Ile Asp Val Arg Gly Pro Val
        515                 520                 525
Phe Leu Asn Val Glu Val Lys Ile Glu
    530                 535

* * * * *

US20200385742A1 – US 20200385742 A1