Polypeptide having activity of aminoacyl-tRNA synthetase and use thereof Yokoyama; Shigeyuki ; et al. [RIKEN]

Patent Application Summary

U.S. patent application number 12/322782 was filed with the patent office on 2009-02-05 and published on 2009-09-10 for polypeptide having activity of aminoacyl-tRNA synthetase and use thereof. This patent application is currently assigned to RIKEN. Invention is credited to Takahito Mukai, Kenji Oki, Kensaku Sakamoto, Shigeyuki Yokoyama.

Publication Number	20090226966
Application Number	12/322782
Family ID	41054004
Publication Date	2009-09-10

United States Patent Application	20090226966
Kind Code	A1
Yokoyama; Shigeyuki ; et al.	September 10, 2009

Polypeptide having activity of aminoacyl-tRNA synthetase and use thereof

Abstract

A polypeptide according to the present invention includes: an altered polypeptide obtained by altering an ArgRS, a CysRS, a MetRS, a GlnRS, a GluRS, a LysRS, a TyrRS, or a TrpRS so that an unnatural amino acid is recognized; and an editing polypeptide derived from a PheRS, a LeuRS, an IleRS, a ValRS, an AlaRS, a ProRS, or a ThrRS, the editing polypeptide having been either inserted between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or bound to an N terminal of the altered polypeptide. Thus provided are a new aaRS that exhibits high substrate specificity to an unnatural amino acid and a technique that involves the use of such an aaRS.

Inventors:	Yokoyama; Shigeyuki; (Yokohama-Shi, JP) ; Sakamoto; Kensaku; (Yokohama-Shi, JP) ; Oki; Kenji; (Yokohama-Shi, JP) ; Mukai; Takahito; (Yokohama-Shi, JP)
Correspondence Address:	EDWARDS ANGELL PALMER & DODGE LLP P.O. BOX 55874 BOSTON MA 02205 US
Assignee:	RIKEN Saitama JP
Family ID:	41054004
Appl. No.:	12/322782
Filed:	February 5, 2009

Current U.S. Class:	435/69.1 ; 435/183; 536/23.2
Current CPC Class:	C12N 9/93 20130101
Class at Publication:	435/69.1 ; 435/183; 536/23.2
International Class:	C12N 9/00 20060101 C12N009/00; C12N 15/52 20060101 C12N015/52; C12P 21/02 20060101 C12P021/02

Foreign Application Data

Date	Code	Application Number
Feb 8, 2008	JP	29236/2008

Claims

1. A polypeptide having aminoacyl-tRNA synthetase activity, the polypeptide comprising: an altered polypeptide obtained by altering an arginyl-tRNA synthetase, a cysteinyl-tRNA synthetase, a methionyl-tRNA synthetase, a glutaminyl-tRNA synthetase, a glutamyl-tRNA synthetase, a lysyl-tRNA synthetase, a tyrosyl-tRNA synthetase, or a tryptophanyl-tRNA synthetase so that an unnatural amino acid is recognized; and an editing polypeptide containing an editing reaction active site derived from a phenylalanyl-tRNA synthetase, a leucyl-tRNA synthetase, an isoleucyl-tRNA synthetase, a valyl-tRNA synthetase, an alanyl-tRNA synthetase, a prolyl-tRNA synthetase, or a threonyl-tRNA synthetase, the editing polypeptide having been either inserted between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or bound to an N terminal of the altered polypeptide.

2. The polypeptide as set forth in claim 1, wherein the editing polypeptide has been inserted into a CP1 domain that lies between the Rossman-fold N domain and the Rossman-fold C domain.

3. The polypeptide as set forth in claim 1, wherein a linker polypeptide that connects the editing polypeptide with the altered polypeptide has been further inserted.

4. The polypeptide as set forth in claim 1, wherein the altered polypeptide is a tyrosyl-tRNA synthetase altered so as to recognize an unnatural amino acid.

5. The polypeptide as set forth in claim 1, wherein the tyrosyl-tRNA synthetase is derived from eukaryotic organisms or eubacteria.

6. The polypeptide as set forth in claim 5, wherein the eubacteria are Escherichia coli.

7. The polypeptide as set forth in claim 1, wherein the tyrosyl-tRNA synthetase is a polypeptide altered so as to recognize a tyrosine derivative.

8. The polypeptide as set forth in claim 7, wherein the tyrosine derivative is 3-iodotyrosine.

9. The polypeptide as set forth in claim 1, wherein the altered polypeptide is a polypeptide as set forth in either of (a) and (b): (a) a polypeptide consisting of an amino-acid sequence represented by SEQ ID NO: 1; and (b) a polypeptide (i) consisting of an amino-acid sequence, represented by SEQ ID NO: 1 with a deletion, insertion, substitution, or addition of one or several amino acids and (ii) having activity to bind 3-iodotyrosine to tRNA.

10. The polypeptide as set forth in claim 1, wherein the editing polypeptide is derived from a phenylalanyl-tRNA synthetase.

11. The polypeptide as set forth in claim 1, wherein the phenylalanyl-tRNA synthetase is derived from eukaryotic organisms or archaebacteria.

12. The polypeptide as set forth in claim 11, wherein the archaebacteria belong to the genus Pyrococcus.

13. The polypeptide as set forth in claim 11, wherein the archaebacteria are Pyrococcus horikoshii.

14. The polypeptide as set forth in claim 1, wherein the editing polypeptide is a polypeptide as set forth in either of (c) and (d): (c) a polypeptide consisting of an amino-acid sequence represented by SEQ ID NO: 2; and (d) a polypeptide (i) consisting of an amino-acid sequence, represented by SEQ ID NO: 2 with a deletion, insertion, substitution, or addition of one or several amino acids and (ii) having activity to degrade binding of tyrosine to tRNA or activity to degrade tyrosyl adenylate intermediate into tyrosine and an inorganic phosphoric acid.

15. The polypeptide as set forth in claim 1, wherein the polypeptide is a polypeptide as set forth in either of (e) and (f): (e) a polypeptide consisting of an amino-acid sequence represented by SEQ ID NO: 3; and (f) a polypeptide (i) consisting of an amino-acid sequence, represented by SEQ ID NO: 3 with a deletion, insertion, substitution, or addition of one or several amino acids and (ii) having activity to degrade binding of tyrosine to tRNA or activity to degrade tyrosyl adenylate intermediate into tyrosine and an inorganic phosphoric acid and activity to bind an unnatural amino acid to tRNA.

16. The polypeptide as set forth in claim 1, wherein: the altered polypeptide is a tyrosyl-tRNA synthetase altered so as to recognize an unnatural amino acid and the editing polypeptide contains an editing reaction active site derived from a phenylalanyl-tRNA synthetase; or the altered polypeptide is a methionyl-tRNA synthetase altered so as to recognize an unnatural amino acid and the editing polypeptide contains an editing reaction active site derived from a leucyl-tRNA synthetase.

17. A polynucleotide coding for a polypeptide having aminoacyl-tRNA synthetase activity as set forth in claim 1.

18. A method for producing a polypeptide having aminoacyl-tRNA synthetase activity, the method comprising: a preparing step of preparing a polynucleotide in which a polynucleotide coding for an editing polypeptide containing an editing reaction active site derived from a phenylalanyl-tRNA synthetase, a leucyl-tRNA synthetase, an isoleucyl-tRNA synthetase, a valyl-tRNA synthetase, an alanyl-tRNA synthetase, a prolyl-tRNA synthetase, or a threonyl-tRNA synthetase has been introduced into a polynucleotide coding for an altered polypeptide obtained by altering an arginyl-tRNA synthetase, a cysteinyl-tRNA synthetase, a methionyl-tRNA synthetase, a glutaminyl-tRNA synthetase, a glutamyl-tRNA synthetase, a lysyl-tRNA synthetase, a tyrosyl-tRNA synthetase, or a tryptophanyl-tRNA synthetase so that an unnatural amino acid is recognized; and an expressing step of expressing a polypeptide coded for by the polynucleotide obtained in the preparing step, the preparing step includes preparing either a polynucleotide in which the editing polypeptide has been introduced so as to be positioned between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or a polynucleotide in which the editing polypeptide has been introduced so as to be bound to an N terminal of the altered polypeptide.

Description

[0001] This Nonprovisional application claims priority under 35 U.S.C. § 119(a) on Patent Application No. 029236/2008 filed in Japan on Feb. 8, 2008, the entire contents of which are hereby incorporated by reference.

FIELD OF THE INVENTION

[0002] The present invention relates to polypeptides having aminoacyl-tRNA synthetase activity and use thereof. More specifically, the present invention relates to a polypeptide having aminoacyl-tRNA synthetase activity that exhibits high specificity of association between an unnatural amino acid and tRNA and use thereof.

BACKGROUND OF THE INVENTION

[0003] In recent years, the development of genetic engineering and the resulting abundance of information on three-dimensional protein structures, genome sequences, and the like have made it possible to create a protein with a new function by artificially altering a protein or, specifically, to create a protein with new activity by altering, based on a protein having certain activity, some amino-acid residues of the protein. However, when amino-acid residues are altered to other amino-acid residues, the options are limited to the natural amino acids, which may make it difficult to produce a protein having a desired function and desired activity.

[0004] Proposed in view of this as a method for expanding the functions of a protein are various methods that involve the introduction of an unnatural amino acid into a protein (e.g., see Wang, L., and Schultz, P. G., Expanding the genetic code. Angew Chem Int Ed Engl, 2005, 44, 34-66.). Among them, a method that involves the use of an aminoacyl-tRNA synthetase (hereinafter referred to sometimes as "aaRS") mutant has recently been developed as a method that can be used in living cells with high yields (International Publication No. 2003/014354 Pamphlet (published on Feb. 20, 2003), Lee, N., Bessho, Y., Wei, K., Szostak, J. W., and Suga, H., Ribozyme-catalyzed tRNA aminoacylation. Nat Struct Biol, 2000, 7, 28-33.).

[0005] An aaRS exists in all living organisms. An aaRS is responsible for faithfully translating a genetic code by accurately associating an amino acid with tRNA. A reaction catalyzed by an aaRS includes a first step of activating an amino acid with ATP and a second step of adding an activated aminoacyl adenylate intermediate to the 3'-end of tRNA. Both of the reactions are carried out at a single active site (aminoacylation active site) (Fersht, A. R., and Kaethner, M. M., Mechanism of aminoacylation of tRNA. Proof of the aminoacyl adenylate pathway for the isoleucyl- and tyrosyl-tRNA synthetases from Escherichia coli K12. Biochemistry, 1976, 15, 818-823; and Freist, W., and Sternbach, H., Tyrosyl-tRNA synthetase from baker's yeast. Order of substrate addition, discrimination of 20 amino acids in aminoacylation of tRNATyr-C-C-A and tRNATyr-C-C-A(3'NH2). Eur J Biochem, 1988, 177, 425-433.).
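The two steps described above can be written as a reaction scheme. The scheme below is a general textbook summary rather than one taken from this application: "aa" denotes the amino acid, and the by-products pyrophosphate (PP_i) and AMP are supplied here from general biochemistry as an assumption, since the text does not name them.

```latex
\begin{align*}
\text{aa} + \text{ATP} &\rightleftharpoons \text{aa-AMP} + \text{PP}_i
  && \text{(step 1: activation of the amino acid)} \\
\text{aa-AMP} + \text{tRNA} &\rightarrow \text{aa-tRNA} + \text{AMP}
  && \text{(step 2: transfer to the 3'-end of tRNA)}
\end{align*}
```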

[0006] There are two classes of aaRS, each with its own origin of evolution, and each of the classes includes 10 aaRSs (Eriani, G., Delarue, M., Poch, O., Gangloff, J., and Moras, D., Partition of tRNA synthetases into two classes based on mutually exclusive sets of sequence motifs. Nature, 1990, 347, 203-206.). A class I aaRS has an aminoacylation active site called a Rossman-fold domain. Meanwhile, a class II aaRS does not have a Rossman-fold domain, but has an aminoacylation active site surrounded by an antiparallel beta-sheet. An aaRS must associate an amino acid with tRNA accurately. tRNAs are large molecules whose molecular weight exceeds 20,000, and vary in sequence. Meanwhile, amino acids are small molecules that share an α-amino group and an α-carboxyl group as common structures, and differ only in side chain. This makes it more difficult for an aaRS to discriminate between amino acids than to discriminate between tRNAs. For example, isoleucine and valine differ by only a single methyl group, and it is believed to be difficult for an enzyme to recognize the difference (Pauling, L., The Probability of Errors in the Process of Synthesis of Protein Molecules, 1957.).

[0007] For this reason, each of seven types of natural aaRS, which amount to one third of all the natural aaRSs, namely an isoleucyl-tRNA synthetase (hereinafter referred to as "IleRS") of class I, a valyl-tRNA synthetase (hereinafter referred to as "ValRS") of class I, a leucyl-tRNA synthetase (hereinafter referred to as "LeuRS") of class I, an alanyl-tRNA synthetase (hereinafter referred to as "AlaRS") of class II, a prolyl-tRNA synthetase (hereinafter referred to as "ProRS") of class II, a threonyl-tRNA synthetase (hereinafter referred to as "ThrRS") of class II, and a phenylalanyl-tRNA synthetase (hereinafter referred to as "PheRS") of class II, is known to have an aminoacylation active site that not only recognizes a correct substrate amino acid but also misrecognizes another amino acid similar to the correct substrate amino acid (Crepin, T., Yaremchuk, A., Tukalo, M., and Cusack, S., Structures of two bacterial prolyl-tRNA synthetases with and without a cis-editing domain. Structure, 2006, 14, 1511-1525.; Dock-Bregeon, A., Sankaranarayanan, R., Romby, P., Caillet, J., Springer, M., Rees, B., Francklyn, C. S., Ehresmann, C., and Moras, D., Transfer RNA-mediated editing in threonyl-tRNA synthetase. The class II solution to the double discrimination problem. Cell, 2000, 103, 877-884.; Fukai, S., Nureki, O., Sekine, S., Shimada, A., Tao, J., Vassylyev, D. G., and Yokoyama, S., Structural basis for double-sieve discrimination of L-valine from L-isoleucine and L-threonine by the complex of tRNA (Val) and valyl-tRNA synthetase. Cell, 2000, 103, 793-803.; Fukunaga, R., and Yokoyama, S., Aminoacylation complex structures of leucyl-tRNA synthetase and tRNALeu reveal two modes of discriminator-base recognition. Nat Struct Mol Biol, 2005, 12, 915-922.; Lin, L., Hale, S. P., and Schimmel, P., Aminoacylation error correction. Nature, 1996, 384, 33-34.; Nomanbhoy, T. K., Hendrickson, T. L., and Schimmel, P., Transfer RNA-dependent translocation of misactivated amino acids to prevent errors in protein synthesis. Mol Cell, 1999, 4, 519-528.; Nureki, O., Vassylyev, D. G., Tateno, M., Shimada, A., Nakama, T., Fukai, S., Konno, M., Hendrickson, T. L., Schimmel, P., and Yokoyama, S., Enzyme structure with two catalytic sites for double-sieve selection of substrate. Science, 1998, 280, 578-582.; Roy, H., Ling, J., Irnov, M., and Ibba, M., Post-transfer editing in vitro and in vivo by the beta subunit of phenylalanyl-tRNA synthetase. EMBO J, 2004, 23, 4639-4648.; Ruan, B., and Soll, D., The bacterial YbaK protein is a Cys-tRNAPro and Cys-tRNACys deacylase. J Biol Chem, 2005, 280, 25887-25891.; Sokabe, M., Okada, A., Yao, M., Nakashima, T., and Tanaka, I., Molecular basis of alanine discrimination in editing site. Proc Natl Acad Sci USA, 2005, 102, 11669-11674.; and Swairjo, M. A., Otero, F. J., Yang, X. L., Lovato, M. A., Skene, R. J., McRee, D. E., Ribas de Pouplana, L., and Schimmel, P., Alanyl-tRNA synthetase crystal structure and design for acceptor-stem recognition. Mol Cell, 2004, 13, 829-841.). As a result of such misrecognition, an aminoacyl adenylate intermediate is produced by activation of an amino acid different from the correct substrate, or aminoacyl tRNA is produced as a product thereof.

[0008] However, each of these aaRSs has an editing reaction active site separately from the aminoacylation active site, and as such, has activity to degrade the mistakenly produced aminoacyl adenylate intermediate into an amino acid and an inorganic phosphoric acid or activity to degrade aminoacyl tRNA into an amino acid and tRNA. The editing reaction active site exists in a domain independent of a domain containing the aminoacylation active site (Fukai, S. et al., Cell, 2000, 103, 793-803.; Fukunaga, R., and Yokoyama, S., Nat Struct Mol Biol, 2005, 12, 915-922.; Nureki, O. et al., Science, 1998, 280, 578-582.; Kotik-Kogan, O., Moor, N., Tworowski, D., and Safro, M., Structural basis for discrimination of L-phenylalanine from L-tyrosine by phenylalanyl-tRNA synthetase. Structure, 2005, 13, 1799-1807.; and Ribas de Pouplana, L., and Schimmel, P., Two classes of tRNA synthetases suggested by sterically compatible dockings on tRNA acceptor stem. Cell, 2001, 104, 191-193.). The domain is called an editing reaction domain. Specifically, each of these aaRSs strictly recognizes only a single amino acid because its two reaction sites recognize different properties (size, hydrophilicity, hydrophobicity) of an amino-acid side chain (Fukai, S. et al., Cell, 2000, 103, 793-803.).

[0009] It should be noted that the aaRSs other than the aforementioned seven types do not have an editing reaction active site, and as such, each of them recognizes a single amino acid by its aminoacylation active site alone (Fersht, A. R., Shindler, J. S., and Tsui, W. C., Probing the limits of protein-amino acid side chain recognition with the aminoacyl-tRNA synthetases. Discrimination against phenylalanine by tyrosyl-tRNA synthetases. Biochemistry, 1980, 19, 5520-5524.).

[0010] Incidentally, an aaRS mutant is produced by substituting an amino-acid residue at a substrate recognition site of a wild-type aaRS. For example, a tyrosyl-tRNA synthetase (hereinafter referred to as "TyrRS") was the first aaRS to be successfully altered for introduction of an unnatural amino acid. Further, TyrRS has a large amino-acid-binding pocket for recognizing a comparatively large amino acid, tyrosine, and presently has the largest number of mutants specific to unnatural amino acids (International Publication No. 2003/014354 Pamphlet; and Wang, L., Xie, J., and Schultz, P. G., Expanding the genetic code. Annu Rev Biophys Biomol Struct, 2006, 35, 225-249.). Specifically, International Publication No. 2003/014354 Pamphlet reports TyrRS mutants that recognize 3-iodotyrosine, which is an unnatural amino acid, including an Escherichia coli-derived TyrRS mutant and a Methanocaldococcus jannaschii-derived TyrRS mutant (MjIYRS).

SUMMARY OF THE INVENTION

[0011] As mentioned above, aaRS mutants that specifically recognize unnatural amino acids have been produced so far. However, the substrate specificity of such an aaRS mutant to an unnatural amino acid is not sufficient. Therefore, there has been a demand for the development of a new aaRS mutant.

[0012] It is an object of the present invention to provide a new aaRS that exhibits high specificity to an amino acid to be recognized and a technique that involves the use of such an aaRS.

[0013] In order to attain the foregoing object, the inventors conducted diligent studies. As a result, the inventors newly found the following: when an aaRS-derived editing polypeptide having an editing reaction active site is bound to a specific domain in an altered polypeptide, obtained by altering a class I aaRS having no editing reaction active site so that an unnatural amino acid is recognized, the altered polypeptide exhibits editing reaction activity without losing its function of recognizing the unnatural amino acid. The present invention, which is based on this new finding, encompasses the following inventions.

[0014] That is, in order to attain the foregoing object, a polypeptide according to the present invention is a polypeptide having aminoacyl-tRNA synthetase activity, the polypeptide including: an altered polypeptide obtained by altering an arginyl-tRNA synthetase, a cysteinyl-tRNA synthetase, a methionyl-tRNA synthetase, a glutaminyl-tRNA synthetase, a glutamyl-tRNA synthetase, a lysyl-tRNA synthetase, a tyrosyl-tRNA synthetase, or a tryptophanyl-tRNA synthetase so that an unnatural amino acid is recognized; and an editing polypeptide containing an editing reaction active site derived from a phenylalanyl-tRNA synthetase, a leucyl-tRNA synthetase, an isoleucyl-tRNA synthetase, a valyl-tRNA synthetase, an alanyl-tRNA synthetase, a prolyl-tRNA synthetase, or a threonyl-tRNA synthetase, the editing polypeptide having been either inserted between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or bound to an N terminal of the altered polypeptide.

[0015] In the polypeptide according to the present invention, it is more preferable that the editing polypeptide have been inserted into a CP1 domain of the altered polypeptide, which lies between the Rossman-fold N domain and the Rossman-fold C domain.

[0016] In the polypeptide according to the present invention, it is more preferable that a linker polypeptide that connects the editing polypeptide with the altered polypeptide have been further inserted.
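The two fusion topologies described above (insertion between the Rossman-fold N and C domains, or attachment to the N terminal, each optionally through a linker) can be sketched at the sequence level as follows. This is a minimal illustration only: the function names, the placeholder sequences, the insertion position, and the choice to flank the inserted editing polypeptide with a linker on both sides are all assumptions made for this sketch and are not taken from SEQ ID NO: 1 to 3.

```python
def insert_between_domains(altered: str, editing: str, site: int, linker: str = "") -> str:
    """Insert the editing polypeptide after the first `site` residues of the
    altered polypeptide, flanked on both sides by an optional linker."""
    if not 0 < site < len(altered):
        raise ValueError("site must fall strictly inside the altered polypeptide")
    return altered[:site] + linker + editing + linker + altered[site:]


def fuse_to_n_terminal(altered: str, editing: str, linker: str = "") -> str:
    """Place the editing polypeptide, then an optional linker, in front of the
    N terminal of the altered polypeptide."""
    return editing + linker + altered


# Placeholder sequences (hypothetical, for illustration only).
ROSSMAN_N = "AAAA"  # stands in for the Rossman-fold N domain
CP1 = "CC"          # stands in for the CP1 domain between the two halves
ROSSMAN_C = "BBBB"  # stands in for the Rossman-fold C domain
ALTERED = ROSSMAN_N + CP1 + ROSSMAN_C
EDITING = "EEE"     # stands in for the editing polypeptide
LINKER = "G"        # stands in for a linker polypeptide

# Insertion inside the CP1 placeholder, i.e. between the N and C domains.
inserted = insert_between_domains(ALTERED, EDITING, site=len(ROSSMAN_N) + 1, linker=LINKER)
# Fusion to the N terminal.
n_fused = fuse_to_n_terminal(ALTERED, EDITING, linker=LINKER)

print(inserted)
print(n_fused)
```

In both cases the altered polypeptide is left intact apart from the insertion point, which mirrors the requirement that the altered polypeptide keep its ability to recognize the unnatural amino acid.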

[0017] In the polypeptide according to the present invention, the altered polypeptide may be a tyrosyl-tRNA synthetase altered so as to recognize an unnatural amino acid.

[0018] In the polypeptide according to the present invention, it is more preferable that the tyrosyl-tRNA synthetase be derived from eukaryotic organisms or eubacteria.

[0019] In the polypeptide according to the present invention, it is more preferable that the eubacteria be Escherichia coli.

[0020] In the polypeptide according to the present invention, the tyrosyl-tRNA synthetase may be a polypeptide altered so as to recognize a tyrosine derivative.

[0021] In the polypeptide according to the present invention, the tyrosine derivative may be 3-iodotyrosine.

[0022] In the polypeptide according to the present invention, it is more preferable that the altered polypeptide be a polypeptide as set forth in either of (a) and (b): (a) a polypeptide consisting of an amino-acid sequence represented by SEQ ID NO: 1; and (b) a polypeptide (i) consisting of an amino-acid sequence, represented by SEQ ID NO: 1 with a deletion, insertion, substitution, or addition of one or several amino acids and (ii) having activity to bind 3-iodotyrosine to tRNA.

[0023] In the polypeptide according to the present invention, the editing polypeptide may be derived from a phenylalanyl-tRNA synthetase.

[0024] In the polypeptide according to the present invention, it is more preferable that the phenylalanyl-tRNA synthetase be derived from eukaryotic organisms or archaebacteria.

[0025] In the polypeptide according to the present invention, it is more preferable that the archaebacteria belong to the genus Pyrococcus.

[0026] In the polypeptide according to the present invention, it is more preferable that the archaebacteria be Pyrococcus horikoshii.

[0027] In the polypeptide according to the present invention, it is more preferable that the editing polypeptide be a polypeptide as set forth in either of (c) and (d): (c) a polypeptide consisting of an amino-acid sequence represented by SEQ ID NO: 2; and (d) a polypeptide (i) consisting of an amino-acid sequence, represented by SEQ ID NO: 2 with a deletion, insertion, substitution, or addition of one or several amino acids and (ii) having activity to degrade binding of tyrosine to tRNA or activity to degrade tyrosyl adenylate intermediate into tyrosine and an inorganic phosphoric acid.

[0028] It is more preferable that the polypeptide according to the present invention be a polypeptide as set forth in either of (e) and (f): (e) a polypeptide consisting of an amino-acid sequence represented by SEQ ID NO: 3; and (f) a polypeptide (i) consisting of an amino-acid sequence, represented by SEQ ID NO: 3 with a deletion, insertion, substitution, or addition of one or several amino acids and (ii) having activity to degrade binding of tyrosine to tRNA or activity to degrade tyrosyl adenylate intermediate into tyrosine and an inorganic phosphoric acid and activity to bind an unnatural amino acid to tRNA.

[0029] In the polypeptide according to the present invention, it is more preferable that: the altered polypeptide be a tyrosyl-tRNA synthetase altered so as to recognize an unnatural amino acid and the editing polypeptide contain an editing reaction active site derived from a phenylalanyl-tRNA synthetase; or the altered polypeptide be a methionyl-tRNA synthetase altered so as to recognize an unnatural amino acid and the editing polypeptide contain an editing reaction active site derived from a leucyl-tRNA synthetase.

[0030] Further, the present invention encompasses a polynucleotide coding for the polypeptide according to the present invention.

[0031] Further, a production method according to the present invention is a method for producing a polypeptide having aminoacyl-tRNA synthetase activity, the method including: a preparing step of preparing a polynucleotide in which a polynucleotide coding for an editing polypeptide containing an editing reaction active site derived from a phenylalanyl-tRNA synthetase, a leucyl-tRNA synthetase, an isoleucyl-tRNA synthetase, a valyl-tRNA synthetase, an alanyl-tRNA synthetase, a prolyl-tRNA synthetase, or a threonyl-tRNA synthetase has been introduced into a polynucleotide coding for an altered polypeptide obtained by altering an arginyl-tRNA synthetase, a cysteinyl-tRNA synthetase, a methionyl-tRNA synthetase, a glutaminyl-tRNA synthetase, a glutamyl-tRNA synthetase, a lysyl-tRNA synthetase, a tyrosyl-tRNA synthetase, or a tryptophanyl-tRNA synthetase so that an unnatural amino acid is recognized; and an expressing step of expressing a polypeptide coded for by the polynucleotide obtained in the preparing step, wherein the preparing step includes preparing either a polynucleotide in which the editing polypeptide has been introduced so as to be positioned between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or a polynucleotide in which the editing polypeptide has been introduced so as to be bound to an N terminal of the altered polypeptide.

[0032] Additional objects, features, and strengths of the present invention will be made clear by the description below. Further, the advantages of the present invention will be evident from the following explanation in reference to the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0033] FIG. 1, showing an embodiment of the present invention, shows the hydrolysis activity of editing reaction peptides.

[0034] FIG. 2, showing an embodiment of the present invention, schematically shows the structures of fusion proteins.

[0035] FIG. 3, showing an embodiment of the present invention, shows the hydrolysis activity of fusion proteins.

[0036] FIG. 4, showing an embodiment of the present invention, shows the substrate recognition specificity of fusion proteins.

[0037] FIG. 5, showing an embodiment of the present invention, shows the hydrolysis activity of fusion proteins.

[0038] FIG. 6, showing an embodiment of the present invention, shows the substrate recognition specificity of fusion proteins.

[0039] FIG. 7, showing an embodiment of the present invention, shows the hydrolysis activity of fusion proteins.

[0040] FIG. 8, showing an embodiment of the present invention, is a western blotting diagram showing the results of protein synthesis in a cell-free translation system with use of fusion proteins.

[0041] FIG. 9, showing an embodiment of the present invention, is a western blotting diagram showing the results of protein synthesis in cultured mammalian cells with use of fusion proteins.

DESCRIPTION OF THE EMBODIMENTS

1. Polypeptide According to the Present Invention

[0042] A polypeptide according to the present invention only needs to be a polypeptide having aminoacyl-tRNA synthetase activity, the polypeptide including: an altered polypeptide obtained by altering an arginyl-tRNA synthetase (hereinafter referred to as "ArgRS"), a cysteinyl-tRNA synthetase (hereinafter referred to as "CysRS"), a methionyl-tRNA synthetase (hereinafter referred to as "MetRS"), a glutaminyl-tRNA synthetase (hereinafter referred to as "GlnRS"), a glutamyl-tRNA synthetase (hereinafter referred to as "GluRS"), a lysyl-tRNA synthetase (hereinafter referred to as "LysRS"), a tyrosyl-tRNA synthetase (hereinafter referred to as "TyrRS"), or a tryptophanyl-tRNA synthetase (hereinafter referred to as "TrpRS") so that an unnatural amino acid is recognized; and an editing polypeptide derived from a PheRS, a LeuRS, an IleRS, a ValRS, an AlaRS, a ProRS, or a ThrRS, the editing polypeptide having been either inserted between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or bound to an N terminal of the altered polypeptide.

[0043] This makes it possible to obtain an aaRS that exhibits excellent substrate specificity. For example, in order to cause a substrate for the tyrosyl-tRNA synthetase to be an unnatural amino acid (e.g., 3-iodotyrosine) other than tyrosine, it is necessary to alter the amino-acid recognition domain of the TyrRS. Depending on how the domain is altered, the domain may recognize an amino acid other than the unnatural amino acid after alteration. However, the polypeptide according to the present invention has the editing polypeptide. Moreover, the editing polypeptide has been either inserted between the Rossman-fold N domain and the Rossman-fold C domain, or bound to the N terminal of the altered polypeptide. Therefore, the editing polypeptide functions effectively. That is, even in the case of misrecognition of an amino acid, such misrecognition can be corrected by the editing polypeptide. This makes it possible to obtain an aminoacyl-tRNA synthetase that exhibits high specificity of association between an unnatural amino acid and tRNA.
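The correction of misrecognition described above can be pictured with a small toy model. The sketch below is an assumption-laden illustration, not a model from this application: the class name `FusionAaRS`, its two sets, and the yes/no treatment of activation and editing are hypothetical simplifications (real enzymes discriminate kinetically, not absolutely), and the example specificities are chosen only to mirror the 3-iodotyrosine/tyrosine case mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionAaRS:
    """Toy model of an altered aaRS with an optional editing polypeptide."""
    activated: frozenset  # amino acids accepted by the altered aminoacylation site
    edited: frozenset     # amino acids whose misactivated products the editing site degrades

    def charge(self, amino_acid: str) -> bool:
        """Return True if the amino acid ends up stably bound to tRNA."""
        if amino_acid not in self.activated:
            return False  # never activated in the first place
        if amino_acid in self.edited:
            return False  # misrecognized, then removed by the editing polypeptide
        return True


# Hypothetical specificities (illustration only): the altered site accepts the
# unnatural amino acid but still misrecognizes tyrosine.
ACCEPTED = frozenset({"3-iodotyrosine", "tyrosine"})
without_editing = FusionAaRS(activated=ACCEPTED, edited=frozenset())
with_editing = FusionAaRS(activated=ACCEPTED, edited=frozenset({"tyrosine"}))

for name in ("3-iodotyrosine", "tyrosine", "phenylalanine"):
    print(name, without_editing.charge(name), with_editing.charge(name))
```

In this sketch only the variant carrying the editing polypeptide rejects tyrosine while still charging 3-iodotyrosine, which is the gain in specificity the paragraph describes.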

[0044] In this specification, the "polypeptide having aminoacyl-tRNA synthetase activity" means a polypeptide having the aminoacyl-tRNA synthesis activity of an aminoacyl-tRNA synthetase. The "aminoacyl-tRNA synthesis activity" means activity that synthesizes aminoacyl-tRNA by binding an amino acid to tRNA.

[0045] Further, the "unnatural amino acid" means an amino acid other than amino acids genetically encoded by natural living organisms. It should be noted that the "amino acids genetically encoded" mean 22 types of amino acid, namely 20 types of standard amino acid universally used by living organisms, selenocysteine, and pyrrolysine.

[0046] Further, examples of the unnatural amino acid include: tyrosine derivatives having a substituent at the 3-position of tyrosine, such as 3-halogen-substituted tyrosine (e.g., 3-iodotyrosine), 3-hydroxytyrosine, and 3-azidotyrosine; phenylalanine derivatives having a substituent at the 4-position of phenylalanine, such as para-benzoylphenylalanine, 4-iodophenylalanine, and 4-azidophenylalanine; and the like.

[0047] [1-1. Altered Polypeptide]

[0048] In this specification, the "altered polypeptide" only needs to be a polypeptide obtained by altering an ArgRS, a CysRS, a MetRS, a GlnRS, a GluRS, a LysRS, a TyrRS, or a TrpRS so that an unnatural amino acid is recognized.

[0049] The ArgRS, the CysRS, the MetRS, the GlnRS, the GluRS, the LysRS, the TyrRS, and the TrpRS are not particularly limited, but are preferably derived from eubacteria such as Escherichia coli and heat-resistant bacteria, archaebacteria or eukaryotic organisms such as yeasts, animals, and plants, or in particular, from Escherichia coli or archaebacteria.

[0050] An aaRS derived from archaebacteria is not used in the cells of eukaryotic organisms to synthesize a protein containing an unnatural amino acid, but can be used in the cells of prokaryotic organisms. Therefore, an altered polypeptide obtained with use of an aaRS derived from archaebacteria can be used in the cells of prokaryotic organisms or in cell-free translation systems derived from prokaryotic organisms.

[0051] Further, an aaRS derived from Escherichia coli cannot be used in the cells of prokaryotic organisms, but functions in the cells of eukaryotic organisms (yeast cells, plant cells, insect cells, and mammalian cells), and as such, is used to synthesize a protein containing an unnatural amino acid. Therefore, an altered polypeptide obtained with use of an aaRS derived from Escherichia coli can be used for protein synthesis in the cells of eukaryotic organisms or in cell-free translation systems derived from eukaryotic organisms.

[0052] There is no particular limit on how to obtain an ArgRS or the like from such a living organism as mentioned above. However, for example, based on the following conventional publicly-known base sequences and the like, it is possible to obtain, by a nucleic-acid amplification reaction such as PCR, a polynucleotide coding for an ArgRS or the like and to express a polypeptide coded for by the polynucleotide.

[0053] For a DNA sequence of ArgRS, refer to GenBank Accession No. AP_002496, for example.

[0054] For a DNA sequence of CysRS, refer to GenBank Accession No. AP_001173, for example.

[0055] For a DNA sequence of MetRS, refer to GenBank Accession No. AP_002712, for example.

[0056] For a DNA sequence of GlnRS, refer to GenBank Accession No. P00962, for example.

[0057] For a DNA sequence of GluRS, refer to GenBank Accession No. AP_001318, for example.

[0058] For a DNA sequence of LysRS, refer to GenBank Accession No. AP_004631, for example.

[0059] For a DNA sequence of TyrRS, refer to GenBank Accession No. AP_00259, for example.

[0060] For a DNA sequence of TrpRS, refer to GenBank Accession No. AP_004406, for example.

[0061] The amino acid serving as a substrate for an aaRS may be altered with use of a conventional publicly-known method. For such a method, refer to "Lee, N. et al., Nat Struct Biol, 2000, 7, 28-33" mentioned above, for example. Alternatively, the amino acid serving as a substrate for an aaRS may be altered by introducing a mutation into the aaRS according to a PCR-based technique described in International Publication No. 2003/014354 Pamphlet. For a method for screening an aaRS mutant obtained after altering an amino-acid recognition site, refer to "Wang, L., Brock, A., Herberich, B., and Schultz, P. G., Expanding the genetic code of Escherichia coli. Science, 2001, 292, 498-500" mentioned above, for example.

[0062] Further, the altered polypeptide may be a conventional publicly-known aaRS altered in terms of the amino acid it recognizes. For example, the altered polypeptide may be an EcIYRS, a MjIYRS, or the like. Further, the altered polypeptide may be one of the aaRS mutants disclosed in "Wang, L. et al., Annu Rev Biophys Biomol Struct, 2006, 35, 225-249". This publication enumerates an Escherichia coli-derived TyrRS mutant specific to p-benzoylphenylalanine, an Escherichia coli-derived TyrRS mutant specific to p-azidophenylalanine, an Escherichia coli-derived TyrRS mutant specific to p-iodophenylalanine, an Escherichia coli-derived TyrRS mutant specific to p-acetylphenylalanine, a M. jannaschii-derived TyrRS mutant specific to p-benzoylphenylalanine, a M. jannaschii-derived TyrRS mutant specific to p-azidophenylalanine, a M. jannaschii-derived TyrRS mutant specific to p-iodophenylalanine, a M. jannaschii-derived TyrRS mutant specific to p-acetylphenylalanine, a M. jannaschii-derived TyrRS mutant specific to N-acetylgalactosamine-α-O-threonine, a M. jannaschii-derived TyrRS mutant specific to N-acetylglucosamine-α-O-serine, and the like. Further, the altered polypeptide of the present invention may be realized by a M. jannaschii-derived TyrRS mutant specific to trifluoromethyldiazirinylphenylalanine (see Tippmann, E. M. et al., Chembiochem, 2007, 8, 2210-2214), a M. jannaschii-derived TyrRS mutant specific to p-carboxymethylphenylalanine (see Xie, J., Supekova, L., Schultz, P. G., ACS Chem. Biol., 2007, 2, 474-478), a M. jannaschii-derived TyrRS mutant specific to O-allyltyrosine (see Zhang, Z. et al., Angew. Chem. Int. Ed., 2002, 41, 2840-2842), a M. jannaschii-derived TyrRS mutant specific to sulfotyrosine (see Liu, C. C., Schultz, P. G., Nature biotechnology, 2006, 24, 1436-1440), or the like. Among the examples thus enumerated, EcIYRSs derived from eubacterial aaRSs are preferred.

[0063] It should be noted that substituting histidine with alanine at the 70-position, aspartic acid with threonine at the 158-position, and isoleucine with serine at the 159-position of a Methanococcus jannaschii TyrRS makes it possible to produce a MjIYRS by altering the substrate amino acid of the TyrRS from tyrosine to the unnatural amino acid 3-iodotyrosine.
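The three substitutions named above can be written down as a small in-silico check. The following is only a minimal sketch: the function name `apply_substitutions` and the placeholder sequence are assumptions made for illustration, and the placeholder is a poly-glycine string, not the real Methanococcus jannaschii TyrRS sequence. The one-letter codes used are H (histidine), D (aspartic acid), and I (isoleucine).

```python
from typing import Iterable, Tuple

# (wild-type residue, 1-based position, replacement residue)
Substitution = Tuple[str, int, str]


def apply_substitutions(seq: str, subs: Iterable[Substitution]) -> str:
    """Return seq with each substitution applied, checking the wild-type residue first."""
    residues = list(seq)
    for wild, pos, new in subs:
        found = residues[pos - 1]
        if found != wild:
            raise ValueError(f"expected {wild} at position {pos}, found {found}")
        residues[pos - 1] = new
    return "".join(residues)


# Hypothetical placeholder: poly-Gly carrying only the three wild-type residues
# at the positions named in the text. NOT the real M. jannaschii TyrRS sequence.
toy = list("G" * 200)
toy[69], toy[157], toy[158] = "H", "D", "I"
toy_wild_type = "".join(toy)

# Substitutions described in the text: His70 -> Ala, Asp158 -> Thr, Ile159 -> Ser
IODO_TYR_SUBS = [("H", 70, "A"), ("D", 158, "T"), ("I", 159, "S")]

toy_mutant = apply_substitutions(toy_wild_type, IODO_TYR_SUBS)
print(toy_mutant[69], toy_mutant[157], toy_mutant[158])  # A T S
```

Checking the wild-type residue before replacing it guards against off-by-one numbering errors, which are easy to make when a sequence record and a publication number residues differently.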

[0064] Further, for example, the altered polypeptide for use in the polypeptide according to the present invention can be realized by a polypeptide altered so as to recognize a tyrosine derivative, or in particular, by a TyrRS mutant obtained by altering a TyrRS so that 3-iodotyrosine is recognized as a tyrosine derivative.

[0065] In this specification, the "tyrosine derivative" means tyrosine one of whose constituent atoms has a given substituent introduced thereinto, and the substituent is not limited in the position into which it is introduced. Possible examples of the tyrosine derivative include tyrosine in which a hydrogen atom bound to the carbon atom at the 3-position of the phenolic aromatic ring has been substituted by another atom or a group of atoms, for example by a halogen atom such as an iodine atom (e.g., 3-iodotyrosine). Further possible examples include tyrosine in which the 4-position of the phenolic aromatic ring has been substituted by a group of atoms such as an O-methyl group (O-methyltyrosine), an azido group (azidophenylalanine), a benzoyl group (parabenzoylphenylalanine), an acetyl group (4-acetylphenylalanine), or a diazirine group (trifluoromethyldiazirinylphenylalanine), or by an iodine atom (4-iodophenylalanine). It should be noted that iodotyrosine is useful in that it can be used for phase determination in determining the crystal structure of a protein.

[0066] Further, an example of a TyrRS mutant obtained by altering a TyrRS so that the substrate amino acid becomes a tyrosine derivative is an Escherichia coli-derived TyrRS whose tyrosine at the 37-position has been substituted by valine, leucine, isoleucine, or alanine and whose glutamine at the 195-position has been substituted by alanine, cysteine, serine, or asparagine. The introduction of the aforementioned substitutions makes it possible to produce a TyrRS whose substrate amino acid has been altered from tyrosine to the unnatural amino acid 3-iodotyrosine.
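The two sets of replacement residues described above define a small combinatorial space. As an illustrative sketch only (the naming scheme `Y37x/Q195y` and the function `variant_names` are assumptions, not notation used in this specification), the possible double mutants can be enumerated as follows:

```python
from itertools import product

# Replacement residues listed in the text (one-letter codes)
POS37_OPTIONS = ["V", "L", "I", "A"]    # replacements for tyrosine at position 37
POS195_OPTIONS = ["A", "C", "S", "N"]   # replacements for glutamine at position 195


def variant_names():
    """Enumerate every pairing of a position-37 and a position-195 replacement."""
    return [f"Y37{a}/Q195{b}" for a, b in product(POS37_OPTIONS, POS195_OPTIONS)]


names = variant_names()
print(len(names))  # 16
print(names[0])    # Y37V/Q195A
```

The enumeration says nothing about which of the sixteen combinations performs best; that would have to be determined experimentally.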

[0067] In one embodiment, it is preferable that the altered polypeptide be a polypeptide consisting of an amino-acid sequence represented by SEQ ID NO: 1. The amino-acid sequence represented by SEQ ID NO: 1 is an amino-acid sequence of a polypeptide obtained by altering an Escherichia coli-derived TyrRS so that 3-iodotyrosine is recognized. Further, the altered polypeptide may be a polypeptide (i) consisting of an amino-acid sequence, represented by SEQ ID NO: 1 with a deletion, insertion, substitution, or addition of one or several amino acids and (ii) having activity to bind 3-iodotyrosine to tRNA.

[0068] [1-2. Editing Polypeptide]

[0069] In this specification, the "editing polypeptide" means a polypeptide having activity to hydrolyze an aminoacyl adenylate intermediate and aminoacyl tRNA, each having been produced by an aaRS's misrecognizing an amino acid different from the correct substrate amino acid, into an amino acid and an inorganic phosphoric acid and into an amino acid and tRNA, respectively (such activity being referred to as "editing reaction activity"). It should be noted that a domain including the editing polypeptide is referred to sometimes as "editing reaction domain". aaRSs having editing polypeptides are an IleRS, a ValRS, a LeuRS, an AlaRS, a ProRS, a ThrRS, and a PheRS. An editing polypeptide altered so as to exhibit editing reaction activity to something other than the aminoacyl tRNA and/or aminoacyl adenylate intermediate the original aaRS is intended to edit is also encompassed in the scope of "aaRS-derived editing polypeptide". For example, an IleRS-derived editing polypeptide, which degrades valyl tRNA, may be altered so as to hydrolyze binding of an amino acid other than valine (e.g., an unnatural amino acid) to tRNA. A person skilled in the art would be able to appropriately alter an editing polypeptide by substituting some of the amino acids constituting the editing polypeptide. However, for convenience of explanation, the term "aaRS-derived editing polypeptide" used in the description of this specification means an editing polypeptide that has not been altered in terms of the aminoacyl tRNA and/or aminoacyl adenylate intermediate it is intended to edit, if not otherwise specified.

[0070] There is no particular limit on how to obtain an editing polypeptide. For example, an editing polypeptide may be obtained as follows: With use as a template of a polynucleotide coding for an IleRS, a ValRS, a LeuRS, an AlaRS, a ProRS, a ThrRS, or a PheRS, the genome DNA of a living organism having such an aaRS, or the like, a polynucleotide coding for the editing polypeptide is obtained by a nucleic-acid amplification reaction such as PCR with use of a primer designed based on the base sequence of a conventional publicly-known editing polypeptide; and with use of the polynucleotide in a cell-free translation system or a conventional publicly-known expression system such as a microorganism, a polypeptide coded for by the polynucleotide is expressed. The polypeptide to be expressed need not be limited to a site known to be an active center, but may also include an appropriate surrounding region within the editing reaction domain. For example, in the case of the after-mentioned Pyrococcus horikoshii PheRS, better editing reaction activity can be obtained by expressing a polypeptide so that a B3/4 domain is included.

[0071] The PheRS, the IleRS, the ValRS, the LeuRS, the AlaRS, the ProRS, and the ThrRS, from each of which the editing polypeptide is obtained, will be detailed below.

[0072] The PheRS is not particularly limited, but can be derived from eukaryotic organisms or prokaryotic organisms, preferably from prokaryotic organisms, or in particular, from archaebacteria, more preferably from the genus Pyrococcus, or most preferably from Pyrococcus horikoshii. For example, it is possible to use the beta-subunit of Pyrococcus horikoshii PheRS (refer to GenBank Accession No. NP_142611). It should be noted that the Pyrococcus horikoshii PheRS is disclosed, for example, in "Roy, H. et al., EMBO J, 2004, 23, 4639-4648", "Kotik-Kogan, O. et al., Structure, 2005, 13, 1799-1807", and "Sasaki, H., et al., Structural and mutational studies of the amino acid-editing domain from archaeal/eukaryal phenylalanyl-tRNA synthetase, Proc Natl Acad Sci USA, 2006, 103, 14744-14749", and the B3/4 domain has been identified as an editing reaction site. Further, the editing reaction site of the PheRS has activity to hydrolyze tyrosyl tRNA and a tyrosyl adenylate intermediate. The PheRS is composed of two types of subunit: an alpha-subunit and a beta-subunit. Among these, it is the beta-subunit that includes the editing reaction domain.

[0073] In one embodiment, it is preferable that the editing polypeptide be a polypeptide consisting of an amino-acid sequence represented by SEQ ID NO: 2. In another embodiment, it is preferable that the editing polypeptide be a polypeptide (i) consisting of an amino-acid sequence, represented by SEQ ID NO: 2 with a deletion, insertion, substitution, or addition of one or several amino acids and (ii) having editing reaction activity. SEQ ID NO: 2 represents one of the amino-acid sequences of an editing polypeptide derived from a Pyrococcus horikoshii-derived PheRS, i.e., represents the amino-acid sequence of the B3/4 domain. The polypeptide consisting of these amino-acid sequences may have another polypeptide bound thereto, as long as the editing reaction activity is not impaired. For example, the polypeptide consisting of the amino-acid sequence of the B3/4 domain may have the after-mentioned B1 and B2 domains bound to the N terminal thereof, or may have a B5 domain bound to the C terminal thereof.

[0074] It should be noted that the PheRS is an enzyme that functions when alpha/beta heterodimers combine to form a further dimer. Among the subunits, the editing reaction domain exists in the beta-subunit. The beta-subunit consists of B1, B2, B3/4, B5, B6/7, and B8 domains, and the B3/4 domain is the editing reaction domain. A eukaryotic or archaebacterial PheRS lacks the B2 and B8 domains. In some aaRSs, a polypeptide including only an editing reaction domain exists stably and retains hydrolysis activity. The inventors found that in cases where the editing polypeptide is obtained from a Pyrococcus horikoshii-derived PheRS, use of a polypeptide including a B3/4 domain makes it possible to exhibit extremely stable and satisfactory editing reaction activity. Therefore, when an editing polypeptide derived from a Pyrococcus horikoshii PheRS is employed, it is preferable to use a polypeptide including a B3/4 domain, and it is more preferable to use a polypeptide consisting of a B3/4 domain.

[0075] It should be noted that also when an editing polypeptide derived from an Escherichia coli PheRS is employed, it is preferable to use a polypeptide including a B3/4 domain, and it is more preferable to use a polypeptide consisting of a B3/4 domain.

[0076] Further, when an editing polypeptide derived from a Thermus thermophilus PheRS is employed, it is preferable to use a polypeptide consisting of a B3/4 domain, or a polypeptide spanning B1 to B3/4 in which the B1 and B2 domains have been added to the B3/4 domain, because excellent stability is obtained.

[0077] The LeuRS, the IleRS, the ValRS, the AlaRS, the ProRS, and the ThrRS are not particularly limited, but can be derived appropriately from eukaryotic organisms or prokaryotic organisms, preferably from prokaryotic organisms, and will be detailed below.

[0078] For LeuRS, refer to "Lincecum, T. L. et al., Structural and Mechanistic Basis of Pre- and Posttransfer Editing by Leucyl-tRNA Synthetase, Molecular Cell, 2003, 11, 951-963", for example, in which the crystal structure of a T. thermophilus LeuRS is disclosed and an editing reaction site (domain including Thr247 to Thr252 and Leu329 to Asp347) is specified. Further, for example, "Fukunaga, R., and Yokoyama, S., Nat Struct Mol Biol, 2005, 12, 915-922" can be referred to, in which the crystal structure of a P. horikoshii LeuRS is disclosed and an editing reaction domain (from the vicinity of Pro205 to the vicinity of Phe433) is specified.

[0079] For IleRS, refer to "Nureki, O. et al., Science, 1998, 280, 578-582", for example, in which the crystal structure of a T. thermophilus IleRS is disclosed and an editing reaction site (domain including Trp232, Phe359, His384, and Tyr386) is specified.

[0080] For ValRS, refer to "Fukai, S. et al., Cell, 2000, 103, 793-803", for example, in which the crystal structure of a T. thermophilus ValRS is disclosed and an editing reaction site (domain including Thr214, Arg216, Thr219, Lys270, Thr272, Asp276, and Asp279) is specified.

[0081] For AlaRS, refer to "Beebe, K. et al., Distinct domains of tRNA synthetase recognize the same base pair, Nature, 2008, 451, 90-93", for example, which shows that an E. coli AlaRS polypeptide including Asp553 to Ala705 has editing reaction activity.

[0082] For ProRS, refer to "Crepin, T. et al., Structures of Two Bacterial Prolyl-tRNA Synthetases with and without a cis-Editing Domain, Structure, 2006, 14, 1511-1525", for example, in which the crystal structure of an Enterococcus faecalis ProRS is disclosed and an editing reaction domain (Thr237 to Gly390) is specified.

[0083] For ThrRS, refer to "Dock-Bregeon, A.-C. et al., Transfer RNA-Mediated Editing in Threonyl-tRNA Synthetase: The Class II Solution to the Double Discrimination Problem, Cell, 2000, 103, 877-884", for example, in which the crystal structure of an E. coli ThrRS is disclosed and an editing reaction site (domain including His73, His77, Cys182, and His186) is specified. Further, in FIG. 4 of "Korencic, D. et al., A freestanding proofreading domain is required for protein synthesis quality control in Archaea, Proc Natl Acad Sci USA, 2004, 101, 10260-10265", the amino-acid sequence alignment of an archaebacterial editing reaction domain is described.

[0084] As for these aaRSs whose editing reaction sites or editing reaction domains have been specified, polynucleotides coding for polypeptides containing editing reaction sites or the like are obtained by a nucleic-acid amplification reaction such as PCR. With use of the polynucleotides in a conventional publicly-known expression system, polypeptides coded for by the polynucleotides are expressed. The polypeptides are checked for editing reaction activity, and a polypeptide having editing reaction activity is judged to be an editing polypeptide.

[0085] Further, identical aaRSs share homology with one another across species in terms of the amino-acid sequence of an editing reaction site. For this reason, the editing reaction site of an aaRS of a living organism other than those described above can be specified by alignment with an aaRS whose editing reaction site has been specified. Based on this editing reaction site, an editing polypeptide derived from another living organism can be appropriately obtained by such a technique as described above.
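The alignment-based transfer of editing-site positions described above can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: it uses a plain Needleman-Wunsch global alignment with a match/mismatch/gap score chosen arbitrarily, the function names `global_align` and `map_positions` are hypothetical, and the two sequences are toy strings rather than real aaRS sequences. In practice a substitution matrix and an established alignment tool would normally be used instead.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def map_positions(ref, target, ref_positions):
    """Map 1-based positions in ref to 1-based positions in target (None if gapped)."""
    aln_ref, aln_tgt = global_align(ref, target)
    mapping = {}
    ri = ti = 0
    for r, t in zip(aln_ref, aln_tgt):
        if r != "-":
            ri += 1
        if t != "-":
            ti += 1
        if r != "-" and ri in ref_positions:
            mapping[ri] = ti if t != "-" else None
    return mapping


# Toy sequences (hypothetical): the target carries a two-residue insertion after position 3.
reference = "MKTWHEAVLY"
target = "MKTAAWHEAVLY"
print(map_positions(reference, target, {4, 5}))  # {4: 6, 5: 7}
```

Positions that fall opposite a gap in the target are reported as `None`, which signals that the corresponding residue has no counterpart and that the alignment should be inspected by hand.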

[0086] [1-3. Positional Relationship Between the Altered Polypeptide and the Editing Polypeptide]

[0087] As described above, in the polypeptide according to the present invention, it is preferable that the editing polypeptide have been either inserted between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or bound to an N terminal of the altered polypeptide.

[0088] In this specification, the "Rossman-fold domain" means a domain that universally exists in class I aaRS proteins and forms a Rossman-fold structure. The "Rossman-fold structure" means a beta-alpha-beta-alpha-beta structural unit in which beta-alpha-beta structures each having a concatenation of beta structures and an alpha helix of a peptide chain are repeated. In a Rossman-fold domain, there are two Rossman-fold structures arranged. The Rossman-fold structure on the N terminal is called a "Rossman-fold N domain", and the Rossman-fold structure on the C terminal is called a "Rossman-fold C domain". The six beta structures in total in the Rossman-fold domain are arranged in parallel with one another. The Rossman-fold structure has activity to bind to a nucleotide.

[0089] In one embodiment, it is preferable that the editing polypeptide have been inserted into a CP1 domain that lies between the Rossman-fold N domain and the Rossman-fold C domain.

[0090] In this specification, the "CP1 (connective peptide 1) domain" means a domain consisting of the amino acids sandwiched between the N and C domains of a Rossman-fold domain on the primary sequence of an aaRS of class I.

[0091] Insertion of the editing polypeptide into the CP1 domain makes it possible to introduce the editing polypeptide while retaining the activity to bind an amino acid to tRNA.

[0092] In cases where the editing polypeptide has been bound to the N terminal of the altered polypeptide, it is preferable that the N terminal of the editing polypeptide have been altered to methionine or that an additional peptide whose N terminal is methionine have been added.

[0093] As described above, if the editing polypeptide is positioned between the Rossman-fold N domain and the Rossman-fold C domain, or preferably in the CP1 domain or at the N terminal of the altered polypeptide, the aminoacylation activity and the editing reaction activity are effectively exhibited. When inserted into one of the positions, the editing polypeptide not only can maintain its three-dimensional structure, but also will not disarray the three-dimensional structure of the altered polypeptide. Furthermore, the positional relationship between the editing polypeptide and the altered polypeptide in the present invention is appropriate from the point of view that the 3'-end of aminoacylated tRNA moves rapidly from the altered polypeptide to the editing polypeptide.

[0094] Further, the polypeptide according to the present invention may have a linker polypeptide inserted therein, the linker polypeptide being a polypeptide that connects the editing polypeptide with the altered polypeptide. The sequence and length of the linker polypeptide are not limited, and can be set appropriately by the editing polypeptide and the altered polypeptide. For example, it is preferable that a linker polypeptide having 2 to 10 serine or glycine residues be inserted into each of the N and C terminals of the editing polypeptide. Further, in cases where the polypeptide according to the present invention takes such a form that a PheRS-derived editing polypeptide has been inserted into a TyrRS mutant obtained by altering a TyrRS so that the amino acid to be recognized is an unnatural amino acid, it is only necessary that a polypeptide consisting of an amino-acid sequence represented by SEQ. ID. NO.: 50 be included as a linker at the N terminal of the editing polypeptide and a polypeptide consisting of an amino-acid sequence represented by SEQ. ID. NO.: 51 be included as a linker at the C terminal of the editing polypeptide.
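The two arrangements described in this section (insertion between the Rossman-fold N and C domains, and fusion to the N terminal) can be illustrated with simple string operations on amino-acid sequences. This is a sketch under assumptions: the helper names, the alternating Gly/Ser linker pattern, the insertion position, and the toy sequences are all hypothetical, and the linkers of SEQ ID NO: 50 and 51 are not reproduced here. Prepending a single methionine is used as the simplest case of an added peptide whose N terminal is methionine.

```python
def make_linker(length, pattern="GS"):
    """Build a Ser/Gly linker; the text suggests 2 to 10 residues."""
    if not 2 <= length <= 10:
        raise ValueError("linker length outside the 2-10 residue range given in the text")
    return (pattern * length)[:length]


def insert_editing_domain(host, editing, after_pos, n_linker, c_linker):
    """Insert n_linker + editing + c_linker after 1-based position after_pos of host."""
    if not 0 < after_pos < len(host):
        raise ValueError("insertion point must lie inside the host sequence")
    return host[:after_pos] + n_linker + editing + c_linker + host[after_pos:]


def fuse_n_terminal(host, editing, linker):
    """Fuse the editing polypeptide to the N terminal of host, ensuring an initial Met."""
    fused = editing + linker + host
    return fused if fused.startswith("M") else "M" + fused


# Toy sequences (hypothetical): 'A' residues stand in for the Rossman-fold N domain,
# 'C' residues for the Rossman-fold C domain, and 'E' residues for the editing polypeptide.
host = "MAAAACCCC"
editing = "EEEE"
linker = make_linker(4)

print(insert_editing_domain(host, editing, 5, linker, linker))  # MAAAAGSGSEEEEGSGSCCCC
print(fuse_n_terminal(host, editing, linker))                   # MEEEEGSGSMAAAACCCC
```

In a real design the insertion position would be chosen inside the CP1 domain from structural data, and the linker composition would be tuned to the particular pair of polypeptides.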

[0095] [1-4. Combination of the Altered Polypeptide and the Editing Polypeptide]

[0096] As to which of the aaRSs, namely the ArgRS, the CysRS, the MetRS, the GlnRS, the GluRS, the LysRS, the TyrRS, and the TrpRS, is selected as the altered polypeptide and which of a PheRS-derived editing polypeptide, an IleRS-derived editing polypeptide, a ValRS-derived editing polypeptide, a LeuRS-derived editing polypeptide, an AlaRS-derived editing polypeptide, a ProRS-derived editing polypeptide, and a ThrRS-derived editing polypeptide is selected as the editing polypeptide, there is no particular limit on their combinations. However, preferred combinations will be described below.

[0097] When the altered polypeptide is an aaRS obtained by altering the ArgRS, the CysRS, the MetRS, the GlnRS, the GluRS, the LysRS, the TyrRS, or the TrpRS, it is only necessary to select an editing polypeptide having activity to hydrolyze an aminoacyl adenylate intermediate and aminoacyl tRNA, each having been produced by misrecognition of an amino acid that would not be recognized if the aaRS were not altered, into an amino acid and an inorganic phosphoric acid and into an amino acid and tRNA, respectively. This is because an aaRS altered in terms of the amino acid it recognizes can be inhibited from misrecognizing the amino acid it recognized before alteration.

[0098] In fact, inhibiting the aaRS from recognizing the original substrate of the wild-type aaRS (tyrosine in the case of TyrRS) is believed to be the most important and demanding task in producing an aaRS mutant that specifically recognizes an unnatural amino acid (Kiga et al., PNAS 99, 9715-9720, 2002; Summerer et al., PNAS 103, 9785-9789, 2006).

[0099] For example, in cases where the altered polypeptide is a TyrRS mutant, it is preferable to use a PheRS-derived editing polypeptide.

[0100] The PheRS is known to misrecognize tyrosine, which is similar to phenylalanine, i.e., the original substrate of the PheRS, to produce tyrosyl tRNA(Phe). The editing reaction domain of the PheRS has activity to hydrolyze the ester bond between tyrosine and tRNA in tyrosyl tRNA(Phe), yielding tyrosine and tRNA. Meanwhile, an altered polypeptide obtained by altering the TyrRS may misrecognize tyrosine, which is the original substrate. Introduction of the editing reaction domain of the PheRS into the TyrRS-derived altered polypeptide makes it possible to degrade tyrosyl tRNA, which has been produced by misrecognition of tyrosine by the TyrRS-derived altered polypeptide, into tyrosine and tRNA. This makes it possible to prevent tyrosyl tRNA from being produced by misrecognition.

[0101] Further, in cases where the altered polypeptide is a MetRS mutant, it is preferable to use a LeuRS-derived editing polypeptide. The MetRS mutant may misrecognize methionine. Meanwhile, the LeuRS-derived editing polypeptide can hydrolyze binding of methionine to tRNA.

[0102] That is, in a preferred embodiment of the present invention, the altered polypeptide is a tyrosyl-tRNA synthetase altered so as to recognize an unnatural amino acid, and the editing polypeptide contains an editing reaction active site derived from a phenylalanyl-tRNA synthetase. Further, in a preferred embodiment of the present invention, the altered polypeptide is a methionyl-tRNA synthetase altered so as to recognize an unnatural amino acid, and the editing polypeptide contains an editing reaction active site derived from a leucyl-tRNA synthetase.
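The selection rule in this section (choose an editing polypeptide that hydrolyzes the product formed when the altered aaRS misrecognizes its original substrate) can be expressed as a simple lookup. The sketch below is illustrative only: the table and function names are hypothetical, and the editing-target table contains only the activities stated in this specification (PheRS editing of tyrosine, LeuRS editing of methionine, IleRS editing of valine), not a complete survey.

```python
# Original (cognate) amino acid of each aaRS that may serve as the altered polypeptide
COGNATE = {
    "ArgRS": "Arg", "CysRS": "Cys", "MetRS": "Met", "GlnRS": "Gln",
    "GluRS": "Glu", "LysRS": "Lys", "TyrRS": "Tyr", "TrpRS": "Trp",
}

# Amino acids removed by each editing polypeptide, limited to what the text states
EDITING_TARGETS = {
    "PheRS": {"Tyr"},
    "LeuRS": {"Met"},
    "IleRS": {"Val"},
}


def candidate_editing_sources(altered_from):
    """List editing-polypeptide sources that act on the original substrate."""
    original = COGNATE[altered_from]
    return sorted(src for src, targets in EDITING_TARGETS.items() if original in targets)


print(candidate_editing_sources("TyrRS"))  # ['PheRS']
print(candidate_editing_sources("MetRS"))  # ['LeuRS']
```

An empty list for a given aaRS means only that this specification names no matching editing activity, not that none exists; an editing polypeptide could also be altered to act on a different substrate, as noted in section 1-2.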

[0103] [1-5. Other Constituents of the Polypeptide According to the Present Invention]

[0104] The polypeptide according to the present invention may have a tag label (tag sequence) added to the N or C terminal thereof, the tag label being, for example, a peptide that facilitates purification of the polypeptide. Such a sequence can be removed before the polypeptide is finally prepared. Examples of the tag sequence include a histidine tag, an HA tag, a Myc tag, and a FLAG tag.

[0105] In one embodiment, it is preferable that the polypeptide according to the present invention be a polypeptide consisting of an amino-acid sequence represented by SEQ. ID. NO: 3.

[0106] Thus far, the constituents of the polypeptide according to the present invention have been described. In a preferred embodiment, the polypeptide according to the present invention is a polypeptide (i) consisting of an amino-acid sequence represented by SEQ ID NO: 3 with a deletion, insertion, substitution, or addition of one or several amino acids, (ii) having editing reaction activity, and (iii) having activity to bind an unnatural amino acid to tRNA. SEQ ID NO: 3 is an amino-acid sequence of Ped-CP1-IYRS described below in Examples, and represents a polypeptide that exhibits the highest recognition specificity to 3-iodotyrosine.

[0107] It should be noted that a person skilled in the art can use a well-known technique to easily mutate one or several amino acids of the amino-acid sequence of the polypeptide. For example, according to a publicly-known point mutation introduction method (mutation induction method), a deletion mutant or an addition mutant can be produced by designing a primer corresponding to a given site in a polynucleotide coding for the polypeptide.

[0108] In one aspect, it is preferable that the polynucleotide according to the present invention code for the polypeptide having aminoacyl-tRNA synthetase activity. The polynucleotide according to the present invention may be synthesized as a full-length polynucleotide by chemical synthesis, or may be synthesized by ligation of polynucleotides.

[0109] [1-6. Usage of the Polypeptide According to the Present Invention]

[0110] By introducing the polypeptide according to the present invention into a cell-free protein translation system, a eubacterium, or a eukaryotic cell (e.g., a yeast cell, a plant cell, an insect cell, or a mammalian cell) together with an unnatural amino acid, a protein containing the unnatural amino acid can be synthesized in the cell-free protein translation system, the eubacterium, or the eukaryotic cell.

2. Method According to the Present Invention for Producing a Polypeptide

[0111] A method according to the present invention for producing a polypeptide (hereinafter referred to simply as "production method according to the present invention") is a method for producing a polypeptide having aminoacyl-tRNA synthetase activity. The method includes: a preparing step of preparing a polynucleotide in which a polynucleotide coding for an editing polypeptide containing an editing reaction active site derived from a PheRS, a LeuRS, an IleRS, a ValRS, an AlaRS, a ProRS, or a ThrRS has been introduced into a polynucleotide coding for an altered polypeptide obtained by altering an ArgRS, a CysRS, a MetRS, a GlnRS, a GluRS, a LysRS, a TyrRS, or a TrpRS so that an unnatural amino acid is recognized; and an expressing step of expressing a polypeptide coded for by the polynucleotide obtained in the preparing step. In the preparing step, it is only necessary to prepare either a polynucleotide in which the editing polypeptide has been introduced so as to be positioned between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or a polynucleotide in which the editing polypeptide has been introduced so as to be bound to an N terminal of the altered polypeptide.

[0112] [2-1. Preparing Step]

[0113] The preparing step that is carried out in the present invention is a step, included in the method for producing a polypeptide having aminoacyl-tRNA synthetase activity, of preparing a polynucleotide in which a polynucleotide coding for an editing polypeptide containing an editing reaction active site derived from a PheRS, a LeuRS, an IleRS, a ValRS, an AlaRS, a ProRS, or a ThrRS has been introduced into a polynucleotide coding for an altered polypeptide obtained by altering an ArgRS, a CysRS, a MetRS, a GlnRS, a GluRS, a LysRS, a TyrRS, or a TrpRS so that an unnatural amino acid is recognized. In this step, either of the following is prepared: (i) a polynucleotide in which the polynucleotide coding for the editing polypeptide has been introduced so as to be positioned between a Rossman-fold N domain and a Rossman-fold C domain in the polynucleotide coding for the altered polypeptide that recognizes the unnatural amino acid; or (ii) a polynucleotide in which the editing polypeptide has been introduced so as to be bound to an N terminal of the altered polypeptide.

[0114] A person skilled in the art can use a well-known technique so that the polynucleotide coding for the editing polypeptide is introduced into a position corresponding to the space between the Rossman-fold N and C domains of the polynucleotide coding for the altered polypeptide or bound to a position corresponding to the N terminal of the altered polypeptide. For example, the polynucleotide coding for the editing polypeptide and the polynucleotide coding for the altered polypeptide can be bound with a ligase after being obtained by a nucleic-acid amplification reaction such as PCR.

[0115] There is no particular limit on a template for use in the nucleic-acid amplification reaction. For example, it is possible to use genome DNA, cDNA, or a plasmid containing a clone of the polynucleotide. Further, it is possible to bind fragments cut out from the plasmid by restriction enzyme digestion. Further, as will be described below in Examples, it is possible to prepare a template by overlap PCR. Further, it is possible to chemically synthesize the full length of a polynucleotide made up of the polynucleotide coding for the editing polypeptide and the polynucleotide coding for the altered polypeptide.
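The joining of fragments by overlap PCR mentioned above can be modeled in silico as merging two same-strand sequences that share an identical stretch at their junction. The following is a minimal sketch, not a primer-design tool: the function name `fuse_by_overlap`, the minimum overlap length, and the toy DNA fragments are assumptions for illustration, and real overlap PCR additionally depends on primer melting temperatures and reaction conditions.

```python
def fuse_by_overlap(left, right, min_overlap=15):
    """Join two same-strand fragments whose ends overlap, using the longest overlap."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        # The 3' end of the left fragment must equal the 5' end of the right fragment.
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError(f"no overlap of at least {min_overlap} nt found")


# Toy fragments (hypothetical) sharing the 9-nt junction AAAGGTTCT
left_fragment = "ATGGCTAGCAAAGGTTCT"
right_fragment = "AAAGGTTCTGAAGAACTG"

fused = fuse_by_overlap(left_fragment, right_fragment, min_overlap=6)
print(fused)       # ATGGCTAGCAAAGGTTCTGAAGAACTG
print(len(fused))  # 27
```

Checking that the fused length is a multiple of three, and that the junction keeps both coding sequences in the same reading frame, is a sensible follow-up before ordering primers.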

[0116] [2-2. Expressing Step]

[0117] Use of the polynucleotide obtained in the preparing step makes it possible to transform eubacteria such as Escherichia coli or eukaryotic cells (e.g., yeast cells, plant cells, insect cells, and mammalian cells) and express, in the transformed cells, a polypeptide coded for by the polynucleotide obtained in the preparing step. Further, it is possible to perform transcription and protein synthesis in a conventionally publicly-known rabbit reticulocyte, insect, or wheat cell-free system.

[0118] The polynucleotide obtained in the preparing step can be incorporated into an expression vector. There is no particular limit on the specific type of vector, and it is possible to appropriately select a vector capable of expression in a host cell for use in expression. That is, it is only necessary to select, according to the type of host cell, an appropriate promoter sequence for reliably expressing a polypeptide coded for by the polynucleotide according to the present invention and use, as an expression vector, a vector obtained by incorporating the promoter sequence and the polynucleotide according to the present invention into various plasmids. It should be noted that the vector is not limited to one with which the objective protein is constitutively synthesized, but may be one with which synthesis of the objective protein is induced by addition of IPTG. Further, the vector may include a sequence that adds a tag sequence such as a histidine tag, an HA tag, a Myc tag, or a FLAG tag to the N or C terminal of the protein to be synthesized.

[0119] After a host transformed with use of the expression vector is cultured, the objective protein can be collected from the culture or the like and purified according to a conventional technique (e.g., filtration, centrifugation, cell breakage, gel filtration chromatography, ion-exchange chromatography). Further, in cases where the tag has been added to the protein to be synthesized, the objective protein can be easily collected.

[0120] It is preferable that the expression vector include at least one selective marker. As for eukaryotic cell culture, an example of such a marker is a dihydrofolate reductase or a drug-resistant gene such as neomycin, zeocin, geneticin, blasticidin S, or hygromycin B. As for culture of Escherichia coli and other bacteria, an example of such a marker is a drug-resistant gene such as kanamycin, zeocin, actinomycin D, cefotaxime, streptomycin, carbenicillin, puromycin, tetracycline, or ampicillin.

[0121] Use of the selective marker makes it possible to confirm whether or not the polynucleotide according to the present invention has been introduced into a host cell, and to further confirm whether or not the polynucleotide according to the present invention has been reliably expressed in the host cell.

[0122] The host cell is not particularly limited, and various conventional publicly-known cells can be suitably used.

[0123] The method for introducing a polynucleotide according to the present invention or an expression vector including a polynucleotide according to the present invention into a host cell, i.e., the transformation method, is not particularly limited, and a conventional publicly-known method such as electroporation or a calcium-phosphate method can be suitably used.

[0124] The embodiments of the present invention will be further detailed below with reference to examples. The present invention is not limited to the following examples, and details of the present invention can take various aspects. Furthermore, the present invention is not limited to the description of the embodiments above, but may be altered by a skilled person within the scope of the claims. An embodiment based on a proper combination of technical means disclosed in different embodiments is encompassed in the technical scope of the present invention. Further, all the documents cited in this specification can be cited as references.

Example 1

Construction of Editing Reaction Peptide Expression Plasmids

[0125] As mentioned above, the beta-subunit of PheRS consists of B1, B2, B3/4, B5, B6/7, and B8 domains, and the B3/4 domain is an editing reaction domain.

[0126] In the beta-subunit of PheRS, the B6/7 and B8 domains are placed separately from the B3/4 domain because of the three-dimensional structure. Further, it is believed that there is no fixed structure formed between the B5 domain and the B6/7 domain.

[0127] In view of this, the present example uses, as an editing polypeptide, a polypeptide fragment (hereinafter referred to as "editing reaction peptide") including the B3/4 domain, and as such, does not include the B6/7 domain and the B8 domain.

[0128] With use of the genome DNA of E. coli, the genome DNA of T. thermophilus, and the genome DNA of P. horikoshii as templates, a DNA fragment coding for the B1 to B5 domains (B1-B5) of the beta-subunit of PheRS of each living organism, a DNA fragment coding for the B1 to B3/4 domains (B1-B3/4) of the beta-subunit, a DNA fragment coding for the B3/4 to B5 domains (B3/4-B5) of the beta-subunit, and a DNA fragment coding for only the B3/4 domain of the beta-subunit were amplified by PCR.

[0129] Specifically, the 1st to 475th amino-acid residues of the amino-acid sequence of the beta-subunit of PheRS of E. coli, the 1st to 403rd amino-acid residues of the amino-acid sequence, the 188th to 475th amino-acid residues of the amino-acid sequence, and the 188th to 403rd amino-acid residues of the amino-acid sequence were used as B1-B5, B1-B3/4, B3/4-B5, and B3/4, respectively. The amino-acid sequence is represented by SEQ. ID. NO: 52.

[0130] As for PheRS of T. thermophilus, the 1st to 473rd amino-acid residues of the amino-acid sequence of the beta-subunit of PheRS of T. thermophilus, the 1st to 401st amino-acid residues of the amino-acid sequence, the 190th to 473rd amino-acid residues of the amino-acid sequence, and the 190th to 401st amino-acid residues of the amino-acid sequence were used as B1-B5, B1-B3/4, B3/4-B5, and B3/4, respectively. The amino-acid sequence is represented by SEQ. ID. NO: 53.

[0131] As for PheRS of P. horikoshii, the 1st to 353rd amino-acid residues of the amino-acid sequence of the beta-subunit of PheRS of P. horikoshii, the 1st to 280th amino-acid residues of the amino-acid sequence, the 83rd to 353rd amino-acid residues of the amino-acid sequence, and the 83rd to 280th amino-acid residues of the amino-acid sequence were used as B1-B5, B1-B3/4, B3/4-B5, and B3/4, respectively. The amino-acid sequence is represented by SEQ. ID. NO: 54.
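The residue ranges given in paragraphs [0129] to [0131] can be collected into a small lookup table for checking fragment sizes. The following Python sketch is illustrative only: the dictionary layout and the helper name `fragment_length` are assumptions introduced here, and the nucleotide count simply assumes three bases per residue, ignoring tags, stop codons, and primer overhangs.

```python
# Residue ranges (1-based, inclusive) of the PheRS beta-subunit fragments,
# copied from paragraphs [0129]-[0131]. The data layout is hypothetical.
FRAGMENTS = {
    "E. coli": {
        "B1-B5": (1, 475), "B1-B3/4": (1, 403),
        "B3/4-B5": (188, 475), "B3/4": (188, 403),
    },
    "T. thermophilus": {
        "B1-B5": (1, 473), "B1-B3/4": (1, 401),
        "B3/4-B5": (190, 473), "B3/4": (190, 401),
    },
    "P. horikoshii": {
        "B1-B5": (1, 353), "B1-B3/4": (1, 280),
        "B3/4-B5": (83, 353), "B3/4": (83, 280),
    },
}


def fragment_length(organism, fragment):
    """Return (residues, coding nucleotides) for one fragment.

    Assumes 3 nt per residue; tags and stop codons are not counted.
    """
    start, end = FRAGMENTS[organism][fragment]
    residues = end - start + 1
    return residues, 3 * residues


if __name__ == "__main__":
    for organism, fragments in FRAGMENTS.items():
        for name in fragments:
            print(organism, name, fragment_length(organism, name))
```

Three organisms with four fragments each give the twelve editing reaction peptides that are expressed in Example 2.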

[0132] The DNA fragments coding for their respective domains of E. coli were amplified as follows. Specifically, the DNA fragment coding for B1-B5 was amplified by PCR with use of primers Ec-B1-F (SEQ. ID. NO: 55) and Ec-B5-R (SEQ. ID. NO: 56). The DNA fragment coding for B1-B3/4 was amplified by PCR with use of primers Ec-B1-F and Ec-B3/4-R (SEQ. ID. NO: 57). The DNA fragment coding for B3/4-B5 was amplified by PCR with use of primers Ec-B3/4-F (SEQ. ID. NO: 58) and Ec-B5-R. The DNA fragment coding for B3/4 was amplified by PCR with use of primers Ec-B3/4-F and Ec-B3/4-R.

[0133] The DNA fragments coding for their respective domains of T. thermophilus were amplified as follows. Specifically, the DNA fragment coding for B1-B5 was amplified by PCR with use of primers Tt-B1-F (SEQ. ID. NO: 59) and Tt-B5-R (SEQ. ID. NO: 60). The DNA fragment coding for B1-B3/4 was amplified by PCR with use of primers Tt-B1-F and Tt-B3/4-R (SEQ. ID. NO: 61). The DNA fragment coding for B3/4-B5 was amplified by PCR with use of primers Tt-B3/4-F (SEQ. ID. NO: 62) and Tt-B5-R. The DNA fragment coding for B3/4 was amplified by PCR with use of primers Tt-B3/4-F and Tt-B3/4-R.

[0134] The DNA fragments coding for their respective domains of P. horikoshii were amplified as follows. Specifically, the DNA fragment coding for B1-B5 was amplified by PCR with use of primers Ph-B1-F (SEQ. ID. NO: 63) and Ph-B5-R (SEQ. ID. NO: 64). The DNA fragment coding for B1-B3/4 was amplified by PCR with use of primers Ph-B1-F and Ph-B3/4-R (SEQ. ID. NO: 65). The DNA fragment coding for B3/4-B5 was amplified by PCR with use of primers Ph-B3/4-F (SEQ. ID. NO: 66) and Ph-B5-R. The DNA fragment coding for B3/4 was amplified by PCR with use of primers Ph-B3/4-F and Ph-B3/4-R.
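The primer assignments in paragraphs [0132] to [0134] follow a regular pattern: the forward primer is named after the first domain of the fragment and the reverse primer after the last. A minimal Python sketch of that naming rule is shown below. The function name `primer_pair` and the `PREFIX` table are assumptions made for illustration; only the primer names themselves come from the text.

```python
# Organism prefixes as used in the primer names of paragraphs [0132]-[0134].
PREFIX = {"E. coli": "Ec", "T. thermophilus": "Tt", "P. horikoshii": "Ph"}


def primer_pair(organism, fragment):
    """Return (forward, reverse) primer names for a fragment label.

    Fragment labels are "B1-B5", "B1-B3/4", "B3/4-B5", or "B3/4".
    """
    first, _, last = fragment.partition("-")
    if not last:  # single-domain fragment such as "B3/4"
        last = first
    prefix = PREFIX[organism]
    return f"{prefix}-{first}-F", f"{prefix}-{last}-R"


if __name__ == "__main__":
    for organism in PREFIX:
        for fragment in ("B1-B5", "B1-B3/4", "B3/4-B5", "B3/4"):
            print(organism, fragment, primer_pair(organism, fragment))
```

With this rule, four primers per organism (two forward, two reverse) are enough to amplify all four fragments.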

[0135] In each case, PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 54.degree. C. for 30 seconds, and 72.degree. C. for 1 minute (25 cycles). The DNA polymerase used was Phusion High-Fidelity DNA Polymerase (NEB, Inc.).
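The thermocycling conditions of paragraph [0135] can be written down as a small data structure, which makes it easy to compare the programs used in the later examples. The Python sketch below is an illustration under stated assumptions: the class names `Step` and `Program` are hypothetical, and the computed time counts only the programmed hold times, not the ramping between temperatures.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # hold time


@dataclass(frozen=True)
class Program:
    initial: Step             # initial denaturation, run once
    cycle: Tuple[Step, ...]   # steps repeated in every cycle
    cycles: int

    def hold_seconds(self):
        """Total programmed hold time, excluding ramp time."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.cycles * per_cycle


# Conditions stated in paragraph [0135].
EXAMPLE_1 = Program(
    initial=Step(98, 60),
    cycle=(Step(98, 10), Step(54, 30), Step(72, 60)),
    cycles=25,
)

if __name__ == "__main__":
    print(EXAMPLE_1.hold_seconds(), "seconds of programmed hold time")
```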

[0136] Each of the DNA fragments thus amplified was introduced between the NdeI and XhoI sites of a vector pET-26b (Novagen, Inc.) so that His-tag was added to the C terminal of the polypeptide to be expressed. Thus produced was an editing reaction peptide expression plasmid. The editing reaction peptide expression plasmid was cloned.

Example 2

Expression and Purification of Editing Reaction Peptides

[0137] With use of each of the expression plasmids thus cloned, Escherichia coli BL21 Star (DE3) (Invitrogen Corporation) was transformed, and then cultured at 37.degree. C. in an LB medium containing 50 .mu.g/ml of kanamycin. At the point of time where the O.D. 600 of the culture fluid took on a value of 0.4 to 0.6, IPTG was added so that the final concentration was 1 mM. After the addition of IPTG, the culture temperature was lowered to 20.degree. C. In this manner, the expression of each editing reaction peptide was induced. Twenty hours after the start of induction, the bacteria were harvested.

[0138] Next, the editing reaction peptide was extracted from the bacteria thus harvested, and purified. A lytic buffer solution was prepared by adding one tablet of the protease inhibitor Complete (Roche) to 200 ml of a buffer solution A (20 mM Tris-Cl (pH 8.0), 300 mM NaCl, 30 mM imidazole, 5 mM 2-mercaptoethanol). The harvested bacteria were suspended in the lytic buffer solution, and then were subjected to sonication. An insoluble fraction was precipitated by centrifugation. Each editing reaction peptide, contained in a supernatant, was adsorbed to a resin Ni Sepharose 6 Fast Flow (GE Healthcare, Inc.) equilibrated with use of the lytic buffer solution. After the resin had been washed with the lytic buffer solution, the editing reaction peptide was eluted from the resin with use of a buffer solution B obtained by increasing the imidazole concentration of the buffer solution A to 500 mM.

[0139] The eluate was diluted with use of 20 mM Tris-Cl (pH 8.0) so that the concentrations of Tris-Cl, NaCl, and imidazole in the eluate became 20 mM Tris-Cl (pH 8.0), 75 mM NaCl, and 125 mM imidazole. The protein in the eluate thus diluted was concentrated with use of Amicon-Ultra 4 (Millipore Corporation) until the volume became approximately 100 .mu.l. Glycerol was added to the concentrated protein so that the final concentration became 50%, and the mixture was mixed well and preserved at -20.degree. C.
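The target concentrations in paragraph [0139] correspond to a four-fold dilution of the eluate, which can be checked with a short calculation. The Python sketch below assumes that the eluate has the composition of the buffer solution B (the buffer solution A with imidazole raised to 500 mM); the function name `dilute` and the dictionary names are hypothetical.

```python
# Assumed eluate composition (buffer solution B) and diluent, in mM.
BUFFER_B_MM = {"Tris-Cl": 20.0, "NaCl": 300.0, "imidazole": 500.0}
DILUENT_MM = {"Tris-Cl": 20.0, "NaCl": 0.0, "imidazole": 0.0}


def dilute(stock, diluent, factor):
    """Concentrations after mixing 1 volume of stock with (factor - 1)
    volumes of diluent, i.e. a `factor`-fold dilution of the stock."""
    return {
        name: (stock[name] + (factor - 1) * diluent[name]) / factor
        for name in stock
    }


if __name__ == "__main__":
    # One volume of eluate plus three volumes of 20 mM Tris-Cl (pH 8.0).
    print(dilute(BUFFER_B_MM, DILUENT_MM, 4))
```

Because the diluent contains Tris-Cl at the same concentration as the eluate, only the NaCl and imidazole concentrations change.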

[0140] The twelve types of editing reaction peptide thus obtained varied slightly in expression level, depending on the types of living organism, but were expressed so that soluble fractions were generally obtained. Further, none of them aggregated when concentrated.

Example 3

Hydrolysis Activity of the Editing Reaction Peptides

[0141] Next, the hydrolysis activity of each of the editing reaction peptides was measured. Instead of directly measuring hydrolysis of tyrosyl tRNA.sup.Tyr, the measurement was performed by determining the extent to which tyrosylation of tRNA.sup.Tyr by EcIYRS was inhibited. EcIYRS was obtained in the following manner. That is, a pET-26b-derived plasmid (donated by Dr. Kobayashi of RIKEN) constructed so that a His-tag is attached to the C terminal of EcIYRS was used to transform Escherichia coli BL21 Star (DE3) (Invitrogen Corporation). The subsequent operations were performed in the same manner as in the aforementioned expression and purification of editing reaction peptides.

[0142] It should be noted that the base sequence of EcIYRS is represented by SEQ. ID. NO: 4. It should also be noted that the amino-acid sequence of EcIYRS is an amino-acid sequence represented by the aforementioned SEQ. ID. NO: 1.

[0143] A reaction solution was prepared in an amount of 20 .mu.l by adding 200 mM tyrosine, 1 .mu.M EcIYRS, and 2.5 .mu.M editing reaction peptide to an aminoacylation buffer solution (100 mM Tris-Cl (pH 7.5), 15 mM MgCl.sub.2, 5 .mu.M tRNA.sup.Tyr, 4 mM ATP). The reaction solution was incubated at 37.degree. C. for 25 minutes. The reaction was terminated by adding, to the reaction solution, an equal volume of a solution containing 300 mM sodium acetate (pH 5.0), 8 M urea, 0.05% bromophenol blue, and 0.05% xylene cyanol. Next, the reaction solution was subjected to electrophoresis (urea-denatured acidic PAGE) with use of 8 M urea denatured 7% (w/v) polyacrylamide gel (acrylamide: bisacrylamide=29:1) equilibrated with 0.1 M sodium acetate (pH 5.0). The electrophoretic buffer solution used was 0.1 M sodium acetate (pH 5.0). The electrophoresis was performed for 10 to 14 hours under the following conditions: a temperature of 4.degree. C.; and a current of 24 to 30 mA. After the electrophoresis, tRNA was detected by staining the gel with toluidine blue.
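For orientation, the macromolecular components of the 20 .mu.l reaction in paragraph [0143] can be converted from concentrations to absolute amounts. The Python sketch below is a simple unit conversion, not part of the described method; the function name `picomoles` is an assumption. It relies on the identity that a micromolar concentration multiplied by a volume in microliters gives an amount in picomoles.

```python
REACTION_VOLUME_UL = 20.0

# Concentrations in micromolar, as stated in paragraph [0143].
CONCENTRATIONS_UM = {
    "EcIYRS": 1.0,
    "editing reaction peptide": 2.5,
    "tRNA(Tyr)": 5.0,
}


def picomoles(conc_um, volume_ul):
    """Amount in pmol: (umol/L) x (uL) = 1e-12 mol."""
    return conc_um * volume_ul


AMOUNTS_PMOL = {
    name: picomoles(conc, REACTION_VOLUME_UL)
    for name, conc in CONCENTRATIONS_UM.items()
}

if __name__ == "__main__":
    for name, amount in AMOUNTS_PMOL.items():
        print(f"{name}: {amount:g} pmol")
```

Under these conditions the editing reaction peptide is present at 2.5 times the molar amount of EcIYRS, and tRNA.sup.Tyr at 5 times.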

[0144] FIG. 1 shows the results of electrophoresis. (a) to (c) of FIG. 1 show that with no addition of editing reaction peptides, EcIYRS tyrosylated tRNA.sup.Tyr and the bands of tRNA shifted almost completely (Lanes 2). From this, it can be judged that the decreases in shift of bands at the time of addition of the editing reaction peptides are attributed to the hydrolyses caused by the editing reaction peptides.

[0145] As for the case of addition of the E. coli-derived editing peptides, remarkable decreases in shift of bands were observed in the case of addition of the B1-B3/4, B3/4-B5, and B3/4 editing reaction peptides (Lanes 4, 5, and 6 of FIG. 1(a)). As for the case of addition of the T. thermophilus editing reaction peptides, a decrease in shift of a band was observed in the case of addition of the B1-B3/4 editing reaction peptide (Lane 4 of FIG. 1(b)). As for the case of addition of the P. horikoshii editing reaction peptides, decreases in shift of bands were observed in the case of addition of the B1-B5 and B3/4 editing reaction peptides (Lanes 3 and 6 of FIG. 1(c)).

[0146] These results show that a polypeptide fragment including a B3/4 domain retains hydrolysis activity even if it is not a full-length protein. As for the E. coli- and P. horikoshii-derived editing reaction peptides, even the shortest editing reaction peptides retain hydrolysis activity.

[0147] Meanwhile, in the case of T. thermophilus, the B1-B3/4 editing reaction peptide, in which the B1 and B2 domains accompany the B3/4 domain, exhibited far more satisfactory hydrolysis activity than the B3/4 editing reaction peptide.

Example 4

Construction of N-Terminal Fusion Protein Expression Plasmids

[0148] Plasmids for expressing fusion proteins each having an editing reaction peptide introduced at the N terminal of EcIYRS were constructed.

[0149] Among the E. coli-, P. horikoshii-, and T. thermophilus-derived editing reaction peptides, the smallest editing reaction peptide was fused with the N terminal of EcIYRS. That is, as for each of the E. coli and P. horikoshii editing reaction peptides, the editing reaction domain B3/4 was used. As for the T. thermophilus editing reaction peptide, the editing reaction domain B1-B3/4 was used.

[0150] It should be noted that the amino-acid sequence of the E. coli-derived B3/4 domain is represented by SEQ. ID. NO: 5 and the base sequence is represented by SEQ. ID. NO: 6. The amino-acid sequence of the P. horikoshii-derived B3/4 domain is represented by SEQ. ID. NO: 2 and the base sequence is represented by SEQ. ID. NO: 7. The amino-acid sequence of the T. thermophilus-derived B1-B3/4 domain is represented by SEQ. ID. NO: 8 and the base sequence is represented by SEQ. ID. NO: 9.

[0151] It should be noted that the mutants are named "Eed-N-IYRS" (E. coli), "Ted-N-IYRS" (T. thermophilus), and "Ped-N-IYRS" (P. horikoshii) after the original species and sites of fusion of the editing reaction domains. FIG. 2 schematically shows the constituents of each of the fusion proteins.

[0152] Further, from each of the fusion proteins, a mutant was produced by inactivating the editing reaction activity of the fusion protein (Eed.sup.mu-N-IYRS, Ted.sup.mu-N-IYRS, and Ped.sup.mu-N-IYRS). In each of Eed.sup.mu-N-IYRS and Ted.sup.mu-N-IYRS, Ala356 in the original PheRS beta-subunit had been substituted with Trp. In Ped.sup.mu-N-IYRS, Ala141 in the original PheRS beta-subunit had been substituted with Trp. These two types of amino-acid residue are structurally at substantially the same location. The mutation causes Trp to be embedded in a substrate-binding pocket, with the result that the editing reaction activity is lost.
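Because the positions in paragraph [0152] are numbered according to the original PheRS beta-subunit, they shift when only a fragment is used. The Python sketch below illustrates the bookkeeping and a guarded Ala-to-Trp substitution. It is an illustration under assumptions: the function names are hypothetical, the sequence in the usage example is a made-up toy string, and the index conversion assumes the fragment starts exactly at the stated residue with no preceding methionine or linker.

```python
def position_in_fragment(original_position, fragment_start):
    """Convert full-length numbering (1-based) to fragment numbering,
    assuming nothing precedes the fragment's first residue."""
    return original_position - fragment_start + 1


def substitute(sequence, position, expected, replacement):
    """Replace the residue at a 1-based position, checking first that
    the wild-type residue is the one expected."""
    index = position - 1
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


if __name__ == "__main__":
    # E. coli B3/4 starts at residue 188; P. horikoshii B3/4 at residue 83.
    print(position_in_fragment(356, 188))  # Ala356 within E. coli B3/4
    print(position_in_fragment(141, 83))   # Ala141 within P. horikoshii B3/4
    # Toy sequence, not a real protein: Ala at position 3 becomes Trp.
    print(substitute("MKAGL", 3, "A", "W"))
```

Checking the wild-type residue before substituting guards against off-by-one errors in the numbering conversion.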

[0153] The mutation that inactivates editing reaction was introduced site-specifically by amplifying the whole plasmid by PCR. The Eed.sup.mu-N-IYRS expression plasmid was produced with use of the Eed-N-IYRS expression plasmid as a template by amplifying the whole plasmid by PCR with use of primers Ec-Mu-F (SEQ. ID. NO: 67) and Ec-Mu-R (SEQ. ID. NO: 68). Further, the Ted.sup.mu-N-IYRS expression plasmid was produced with use of the Ted-N-IYRS expression plasmid as a template by amplifying the whole plasmid by PCR with use of primers Tt-Mu-F (SEQ. ID. NO: 69) and Tt-Mu-R (SEQ. ID. NO: 70). The Ped.sup.mu-N-IYRS expression plasmid was produced with use of the Ped-N-IYRS expression plasmid as a template by amplifying the whole plasmid by PCR with use of primers Ph-Mu-F (SEQ. ID. NO: 71) and Ph-Mu-R (SEQ. ID. NO: 72). In each case, PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 54.degree. C. for 30 seconds, and 72.degree. C. for 2 minutes and 30 seconds (16 cycles). The DNA polymerase used was Phusion High-Fidelity DNA Polymerase.

[0154] With use of the genome DNA of E. coli as a template, DNA coding for the B3/4 domain of E. coli was amplified by PCR with use of primers EcN1 (SEQ. ID. NO: 15) and EcN2 (SEQ. ID. NO: 16). PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 54.degree. C. for 30 seconds, and 72.degree. C. for 1 minute (25 cycles). After the last cycle, the reacted solution was preserved at 72.degree. C. for 5 minutes.

[0155] With use of a plasmid A (donated by Dr. Kobayashi of RIKEN) containing DNA coding for EcIYRS as a template, the DNA coding for EcIYRS was amplified by PCR with use of primers EcN3 (SEQ. ID. NO: 17) and EcN4 (SEQ. ID. NO: 18). PCR was carried out under the same thermocycling conditions as those under which the DNA coding for the B3/4 domain of E. coli had been amplified.

[0156] Next, a DNA fragment coding for a fusion protein in which the B3/4 domain of the E. coli PheRS beta-subunit had been added to the N terminal of the EcIYRS protein was amplified by overlap PCR with use of the PCR-amplified DNA coding for the B3/4 domain of E. coli, the PCR-amplified DNA coding for EcIYRS, and primers EcN5 (SEQ. ID. NO: 19) and EcN6 (SEQ. ID. NO: 20). First, a reaction solution containing neither of the primers EcN5 and EcN6 was prepared. With use of this reaction solution, PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 57.degree. C. for 30 seconds, and 72.degree. C. for 1 minute (8 cycles). After the 8 cycles, the primers EcN5 and EcN6 were added to the reaction solution. With use of the reaction solution to which the primers had been added, PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 57.degree. C. for 30 seconds, and 72.degree. C. for 1 minute 45 seconds (26 cycles). After the last cycle, the reacted solution was preserved at 72.degree. C. for 5 minutes. The amplified DNA was inserted between the NdeI and XhoI sites of an expression vector pET-26b (Novagen, Inc.) so that a His tag was added. Thus produced was an Eed-N-IYRS expression plasmid. The base sequence of Eed-N-IYRS is represented by SEQ. ID. NO: 10.

[0157] With use of the genome DNA of T. thermophilus as a template, DNA coding for the B1-B3/4 domain of T. thermophilus was amplified by PCR with use of primers TtN1 (SEQ. ID. NO: 21) and TtN2 (SEQ. ID. NO: 22). PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 72.2.degree. C. for 30 seconds, and 72.degree. C. for 1 minute (25 cycles). After the last cycle, the reacted solution was preserved at 72.degree. C. for 5 minutes.

[0158] With use of the plasmid A as a template, DNA coding for EcIYRS was amplified by PCR with use of primers TtN3 (SEQ. ID. NO: 23) and TtN4 (SEQ. ID. NO: 24).

[0159] Next, a DNA fragment coding for a fusion protein in which the B1-B3/4 domain of T. thermophilus had been added to the N terminal of the EcIYRS protein was amplified by overlap PCR with use of the PCR-amplified DNA coding for the B1-B3/4 domain of T. thermophilus, the PCR-amplified DNA coding for EcIYRS, and primers TtN5 (SEQ. ID. NO: 25) and TtN6 (SEQ. ID. NO: 26). First, a reaction solution containing neither of the primers TtN5 and TtN6 was prepared. With use of this reaction solution, PCR was carried out; the reaction was then paused, the primers TtN5 and TtN6 were added, and the reaction was resumed. The specific conditions for overlap PCR were the same as those under which the DNA coding for Eed-N-IYRS had been prepared. The amplified DNA was inserted between the NdeI and XhoI sites of a vector pET-26b (Novagen, Inc.) so that a His tag was added. Thus produced was a Ted-N-IYRS expression plasmid. The base sequence of Ted-N-IYRS is represented by SEQ. ID. NO: 11.

[0160] With use of the genome DNA of P. horikoshii as a template, DNA coding for the B3/4 domain of P. horikoshii was amplified by PCR with use of primers PhN1 (SEQ. ID. NO: 27) and PhN2 (SEQ. ID. NO: 28). PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 57.degree. C. for 30 seconds, and 72.degree. C. for 1 minute (25 cycles). After the last cycle, the reacted solution was preserved at 72.degree. C. for 5 minutes.

[0161] With use of the plasmid A as a template, DNA coding for EcIYRS was amplified by PCR with use of primers PhN3 (SEQ. ID. NO: 29) and PhN4 (SEQ. ID. NO: 30).

[0162] Next, a DNA fragment coding for a fusion protein in which the B3/4 domain of P. horikoshii had been added to the N terminal of the EcIYRS protein was amplified by overlap PCR with use of the PCR-amplified DNA coding for the B3/4 domain of P. horikoshii, the PCR-amplified DNA coding for EcIYRS, and primers PhN5 (SEQ. ID. NO: 31) and PhN6 (SEQ. ID. NO: 32). First, a reaction solution containing neither of the primers PhN5 and PhN6 was prepared. With use of this reaction solution, PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 57.degree. C. for 30 seconds, and 72.degree. C. for 1 minute (8 cycles). After the 8 cycles, the primers PhN5 and PhN6 were added to the reaction solution. With use of the reaction solution to which the primers had been added, PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 57.degree. C. for 30 seconds, and 72.degree. C. for 1 minute (25 cycles). After the last cycle, the reacted solution was preserved at 72.degree. C. for 5 minutes. The amplified DNA was inserted between the NdeI and XhoI sites of an expression vector pET-26b (Novagen, Inc.) so that a His tag was added. Thus produced was a Ped-N-IYRS expression plasmid. The base sequence of Ped-N-IYRS is represented by SEQ. ID. NO: 12.

[0163] In the PCR of the present example, including the after-mentioned PCR, the polymerase used was Phusion High-Fidelity DNA Polymerase (NEB, Inc.).

[0164] The expression and purification of each fusion protein were performed in the same manner as in Example 2, except that the culture temperature conditions after the induction of expression were changed to 24.degree. C. Further, the expression was confirmed by SDS-PAGE (not shown).

Example 5

Determination of the Aminoacylation Activity of the N-Terminal Fusion Proteins

[0165] Next, the aminoacylation activity (tyrosylation activity) of each of the fusion proteins was determined. An aminoacylation reaction solution was prepared by adding 30 .mu.M [.sup.14C]-L-tyrosine (GE Healthcare) and 1 .mu.M aaRS to an aminoacylation buffer solution. By incubating the reaction solution at 37.degree. C., aminoacylation was performed, with the result that [.sup.14C]-L-tyrosyl tRNA.sup.Tyr was produced. [.sup.14C]-L-tyrosyl tRNA.sup.Tyr was precipitated on a piece of filter paper with 10% trichloroacetic acid (TCA). After the piece of filter paper had been washed with 5% TCA and 100% ethanol, the radioactivity of the piece of filter paper was measured by a liquid scintillation counter. Since the piece of filter paper adsorbs only a high-molecular substance, unreacted [.sup.14C]-tyrosine is washed away, so that only tyrosylated tRNA, i.e., [.sup.14C]-tyrosyl tRNA.sup.Tyr can be measured. Such radioactivity was used to determine the extent to which each fusion protein recognized tyrosine.

[0166] FIG. 3 shows the results of determination of recognition of tyrosine. As shown in FIG. 3, the mutants, in each of which an altered editing reaction domain obtained by inactivating the editing reaction activity of an editing reaction domain had been added to the N terminal, maintained the same level of tyrosylation activity as EcIYRS, regardless of the species of editing reaction domain with which the mutants had been fused. This shows that the tyrosylation activity of EcIYRS is hardly lowered due to the fusion of an exogenous sequence with the N terminal. Therefore, it can be said that the decreases in tyrosylation activity at the time of fusion of the wild-type editing reaction domains are attributed to the hydrolyses caused by the editing reactions.

[0167] A certain level of decrease in tyrosylation was seen regardless of the species of editing reaction domain fused. In each of the cases where the editing reaction domains of the bacteria E. coli and T. thermophilus were added, it was confirmed that the amount of [.sup.14C]-tyrosyl tRNA.sup.Tyr detected was half as large as in the case where EcIYRS was used. That is, it was confirmed that the activity was half as high as that of EcIYRS. Further, in the case where the editing reaction domain of the archaebacterium P. horikoshii was added, it was confirmed that [.sup.14C]-tyrosyl tRNA.sup.Tyr was hardly detected and the tyrosylation of tRNA.sup.Tyr was inhibited.

[0168] Next, each of the fusion proteins was assayed for substrate specificity. A reaction solution was prepared by adding 200 .mu.M tyrosine or 3-iodotyrosine and 1 .mu.M fusion protein to an aminoacylation buffer solution. The reaction solution was incubated at 24.degree. C. for 30 minutes. The reaction was terminated by adding, to the reaction solution, an equal volume of a solution containing 300 mM sodium acetate (pH 5.0), 8 M urea, 0.05% bromophenol blue, and 0.05% xylene cyanol. The reaction solution was subjected to urea-denatured acidic PAGE in the same manner as in the aforementioned determination of the hydrolysis activity of the editing reaction peptides, whereby a shift in band of tRNA was observed. The assay makes it possible to confirm whether or not 3-iodotyrosine is recognized.

[0169] FIG. 4 shows the results of assaying the fusion proteins for their substrate specificity. As shown in FIG. 4, Eed-N-IYRS, Ted-N-IYRS, and Ped-N-IYRS showed no band shifts in cases where tyrosine was used (Lanes 4, 6, and 8), but showed band shifts in cases where iodotyrosine was used (Lanes 5, 7, and 9). In contrast, EcIYRS, which does not contain an editing reaction domain, showed a band shift also in cases where tyrosine was used (Lane 2). This shows that the editing reaction domain does not degrade 3-iodotyrosyl tRNA.sup.Tyr, and that only tyrosyl tRNA.sup.Tyr is recognized and degraded by the editing reaction domain.

Example 6

Search for a Place to Insert an Editing Reaction Domain

[0170] The aaRSs classified into class I share a common basic structure, i.e., a Rossman-fold domain, and each of the aaRSs further has an additional domain inserted at the N terminal of the Rossman-fold domain, at the C terminal of the Rossman-fold domain, or into the Rossman-fold domain. Locations where these insertion sequences exist are considered to be places where another sequence can be inserted without disarraying the Rossman-fold structure.

[0171] The amino-acid sequences of the 2,467 types of class I aaRS registered in Swiss-prot were aligned. The software used for alignment was MAFFT (Katoh, K., Misawa, K., Kuma, K., and Miyata T., MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res, 2002, 30, 3059-3066; Katoh, K., Kuma, K., Toh, H., and Miyata, T., MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res, 2005, 33, 511-518).

[0172] As a result of alignment of the primary sequences of the class I aaRSs, each of the class I aaRSs showed some characteristic insertion sequences. Among those insertion sequences, the most remarkable insertion sequences were the CP1 domains of ValRS, IleRS, and LeuRS. In ValRS, IleRS, and LeuRS, the CP1 domains were editing reaction domains.

[0173] A CP1 domain is an insertion sequence that every class I aaRS has inside of the Rossman fold. Each of the class I aaRSs has a CP1 domain inserted at the same site in the Rossman fold as the others. CP1 domains vary in length and function from aaRS to aaRS. The three CP1 domains of ValRS, IleRS, and LeuRS are the longest, i.e., as long as approximately 200 amino-acid residues, and serve as editing reaction domains. Meanwhile, the CP1 domain of TyrRS is as short as approximately 70 amino-acid residues. The CP1 domain of TyrRS is involved in tRNA recognition, dimerization, and substrate recognition. In this alignment, the editing reaction domains of ValRS, IleRS, and LeuRS were inserted between the 166th and 167th residues of a loop domain (163rd to 169th amino-acid residues) that exists in the CP1 domain of TyrRS. This loop is not in contact with any other domain in the CP1 domain, and as such, has no particular function. Therefore, it was supposed that the structure of EcIYRS would not be disarrayed even if an editing reaction domain was inserted. In view of this, it was decided to insert the editing reaction domain of PheRS into the loop domain.
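The insertion between the 166th and 167th residues described above can be sketched at the sequence level as follows. This Python example is an illustration only: the function name `insert_after` is an assumption, the sequences are placeholder strings rather than real protein sequences, and linkers and tags are ignored. The lengths are taken from the specification (424 residues for EcIYRS; residues 83 to 280, i.e., 198 residues, for the P. horikoshii B3/4 domain).

```python
def insert_after(host, insert, after_residue):
    """Insert `insert` into `host` immediately after the 1-based residue
    `after_residue`, keeping both flanking parts of the host intact."""
    if not 0 < after_residue < len(host):
        raise ValueError("insertion point must lie inside the host sequence")
    return host[:after_residue] + insert + host[after_residue:]


# Placeholder sequences: "N" marks host residues 1-166, "C" marks host
# residues 167-424, and "E" marks the inserted editing reaction domain.
HOST = "N" * 166 + "C" * (424 - 166)
EDITING_DOMAIN = "E" * (280 - 83 + 1)

FUSION = insert_after(HOST, EDITING_DOMAIN, 166)

if __name__ == "__main__":
    print(len(HOST), len(EDITING_DOMAIN), len(FUSION))
```

The same function with an insertion point of 304 would describe the comparative construct in which the domain is placed between the 304th and 305th residues.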

[0174] The editing reaction domain was realized by a P. horikoshii-derived editing reaction domain. The resultant EcIYRS mutant is named "Ped-CP1-IYRS" after the place where the editing reaction domain has been inserted. That is, the editing reaction domain has been inserted in the CP1 domain of Ped-CP1-IYRS.

[0175] For comparison, the following discusses an EcIYRS mutant obtained by inserting the editing reaction domain into the AC binding domain. The EcIYRS mutant is hereinafter referred to as "Ped-AC-IYRS".

[0176] FIG. 2 schematically shows the constituents of each of these fusion proteins. For ease of comparison, FIG. 2 also shows EcIYRS and the fusion proteins produced in Example 4.

Example 7

Construction of a Ped-CP1-IYRS Expression Plasmid

[0177] With use of the genome DNA of P. horikoshii as a template, DNA coding for the B3/4 domain of P. horikoshii was amplified by PCR with use of primers PhC1 (SEQ. ID. NO: 33) and PhC2 (SEQ. ID. NO: 34). PCR was carried out under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 55.degree. C. for 30 seconds, and 72.degree. C. for 1 minute (25 cycles). After the last cycle, the reacted solution was preserved at 72.degree. C. for 2 minutes.

[0178] With use of the plasmid A as a template, DNA coding for the first to 166th amino-acid residues of EcIYRS and DNA coding for the 167th to 424th amino-acid residues of EcIYRS were amplified by PCR with use of primers PhC3 (SEQ. ID. NO: 35) and PhC4 (SEQ. ID. NO: 36) and by PCR with use of primers PhC5 (SEQ. ID. NO: 37) and PhC6 (SEQ. ID. NO: 38).

[0179] Next, a DNA fragment coding for a fusion protein in which the B3/4 domain of P. horikoshii had been inserted into the CP1 domain of EcIYRS was amplified by PCR with use of the PCR-amplified DNA coding for the B3/4 domain of P. horikoshii, the PCR-amplified DNA coding for the N (1-166) and C (167-424) terminals of EcIYRS, and primers PhC7 (SEQ. ID. NO: 39) and PhC8 (SEQ. ID. NO: 40). A reaction solution containing the N-terminal DNA, the C-terminal DNA, and the primers PhC7 and PhC8 was prepared. The reaction solution was subjected to PCR under the following thermocycling conditions: 98.degree. C. for 1 minute; 98.degree. C. for 10 seconds, 52.degree. C. for 30 seconds, and 72.degree. C. for 1 minute and 10 seconds (25 cycles). After the last cycle, the reacted solution was preserved at 72.degree. C. for 2 minutes.

[0180] The amplified DNA was cloned between the NdeI and XhoI sites of an expression vector pET-26b so that a His tag is added to the C terminal. Thus produced was a Ped-CP1-IYRS expression plasmid. The base sequence of Ped-CP1-IYRS is represented by SEQ. ID. NO: 13.

[0181] The mutation that inactivates editing reaction was introduced site-specifically by amplifying the whole plasmid by PCR.

[0182] The expression and purification of the Ped-CP1-IYRS protein were performed in the same manner as in Example 4. Further, the expression was confirmed by SDS-PAGE (not shown).

[0183] It should be noted that the mutant with inactivated editing activity (Ped^mu-CP1-IYRS) was produced in the same manner. The mutation that inactivates the editing reaction domain is the same as in the case of Ped^mu-N-IYRS described in Example 4. The Ped^mu-CP1-IYRS expression plasmid was produced in the same manner as Ped^mu-N-IYRS described in Example 4, except that Ped-CP1-IYRS was used as a template instead.

Comparative Example 1

Construction of a Ped-AC-IYRS Expression Plasmid

[0184] With use of the genomic DNA of P. horikoshii as a template, DNA coding for the B3/4 domain of P. horikoshii was amplified by PCR with use of primers PhA1 (SEQ. ID. NO: 41) and PhA2 (SEQ. ID. NO: 42). The PCR was carried out under the following thermocycling conditions: 98 °C for 1 minute; 98 °C for 10 seconds, 55 °C for 30 seconds, and 72 °C for 1 minute (25 cycles). After the last cycle, the reaction mixture was held at 72 °C for 2 minutes.

[0185] With use of the plasmid A as a template, DNA coding for the first to 304th amino-acid residues of EcIYRS and DNA coding for the 305th to 424th amino-acid residues of EcIYRS were amplified by PCR with use of primers PhA3 (SEQ. ID. NO: 43) and PhA4 (SEQ. ID. NO: 44) and by PCR with use of primers PhA5 (SEQ. ID. NO: 45) and PhA6 (SEQ. ID. NO: 46).

[0186] Next, a reaction solution containing the PCR-amplified DNA coding for the B3/4 domain of P. horikoshii, the PCR-amplified DNA coding for the N (1-304) and C (305-424) terminals of EcIYRS, and primers PhA7 (SEQ. ID. NO: 47) and PhA8 (SEQ. ID. NO: 48) was prepared, and overlap PCR was carried out, whereby a DNA fragment coding for a fusion protein in which the B1, B2, and B3/4 domains of P. horikoshii had been inserted into the alpha-helical domain (AC) of EcIYRS was amplified.

[0187] The amplified DNA was cloned between the NdeI and XhoI sites of an expression vector pET-26b so that a His tag is added to the C terminal. Thus produced was a Ped-AC-IYRS expression plasmid. The base sequence of Ped-AC-IYRS is represented by SEQ. ID. NO: 14.

[0188] The mutation that inactivates editing reaction was introduced site-specifically by amplifying the whole plasmid by PCR.

[0189] The expression and purification of the Ped-AC-IYRS protein were performed in the same manner as in Example 4. Further, the expression was confirmed by SDS-PAGE (not shown).

[0190] It should be noted that the mutant with inactivated editing activity (Ped^mu-AC-IYRS) was produced in the same manner. The mutation that inactivates the editing reaction domain is the same as in the case of Ped^mu-N-IYRS described in Example 4. The Ped^mu-AC-IYRS expression plasmid was produced in the same manner as Ped^mu-N-IYRS described in Example 4, except that Ped-AC-IYRS was used as a template instead.

Example 8

Preparation of Amber Suppressor tRNA^Tyr

[0191] Amber suppressor tRNA^Tyr was prepared with use of a T7 RNA polymerase according to a method devised by Nureki et al. (Nureki O et al., J Mol Biol 236, 710-724, 1994).

[0192] The anticodon of Escherichia coli tRNA^Tyr was changed to CUA so as to correspond to an amber codon. A T7 promoter sequence (SEQ. ID. NO: 73) was added upstream of a DNA sequence serving as a template for tRNA. The sequence CCAGG was designed so that the site corresponding to the CCA terminal of tRNA could be cleaved by the restriction endonuclease Mva I. Because the sequence is as short as 120 bases, a DNA fragment was amplified by PCR with use of a combination of two primers without a template, and the amplified DNA fragment was cloned into a pUC18 vector.
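The correspondence between the CUA anticodon and the amber codon (UAG) can be checked with a short sketch. It assumes strict Watson-Crick pairing with antiparallel strands and ignores wobble pairing; the function name codon_read_by is illustrative only.

```python
# Watson-Crick partners for RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_read_by(anticodon):
    """Return the codon (5'->3') that pairs with an anticodon given 5'->3'.

    Codon and anticodon pair antiparallel, so the anticodon is reversed
    before each base is complemented. Wobble pairing is not modelled.
    """
    return "".join(PAIR[base] for base in reversed(anticodon))


print(codon_read_by("CUA"))  # UAG, the amber stop codon
```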

[0193] The DNA fragment including a cloned region was amplified by PCR with use of M13 forward and M13 reverse primers. The amplified DNA fragment was cleaved by Mva I so that one terminal is CCA. With use of the DNA fragment as a template, tRNA was transcribed with use of a T7 RNA polymerase. The transcription reaction by a T7 RNA polymerase was performed by incubation in a reaction solution of (a composition) at 37 °C for 5 hours. When 2 hours had elapsed since the start of incubation, additional T7 RNA polymerase was added in an amount one tenth of the initial amount. The reaction solution was subjected to phenol/chloroform extraction. After ammonium acetate had been added as a salt to the extract, an isopropanol precipitation was performed. The precipitate was dissolved in 2 ml of 10 mM Hepes-Na (pH 7.5). After the dissolution, the solution was passed through a PD-10 desalting column (GE Healthcare, Inc.), whereby unreacted nucleotides were removed. A further precipitation, with ethanol, was then performed. The precipitate was dissolved in a solution containing 10 mM Hepes-Na (pH 7.5) and 2 mM MgCl2. After denaturation of tRNA at 80 °C for 2 minutes, tRNA was refolded by slow cooling.

Example 9

Determination of the Aminoacylation Activity of Ped-CP1-IYRS

[0194] Next, the aminoacylation activity (tyrosylation activity) of each of the fusion proteins was determined. The determination was performed in the same manner as in Example 5. The results are shown in FIG. 5. It should be noted that the activity of each fusion protein having a mutation introduced into an editing reaction domain thereof is indicated by a change in activity of EcIYRS due to insertion of an editing reaction domain.

[0195] As shown in FIG. 5, Ped-CP1-IYRS added little tyrosine to tRNA^Tyr.

[0196] By contrast, tyrosylation was detected with Ped-AC-IYRS. It should be noted that the mutant produced in the same manner as Ped-AC-IYRS, except that an inactive editing reaction domain was inserted, showed no decrease in tyrosylation of tRNA^Tyr relative to EcIYRS, and tyrosylation was detected as with Ped-AC-IYRS. That is, in Ped-AC-IYRS, there was no hydrolysis by the editing reaction domain.

[0197] Next, each of the fusion proteins was assayed for substrate specificity. Urea-denatured acidic PAGE was performed to check whether or not each fusion protein recognized 3-iodotyrosine. The reaction was performed in the same manner as in the aforementioned determination of the N-terminal fusion proteins except that the concentration of Ped-CP1-IYRS was 1 mM or 2 mM and the reaction temperature was 24 °C or 37 °C.

[0198] FIG. 6 shows the results from the mutants. As shown in FIG. 6, Ped-CP1-IYRS was observed to recognize 3-iodotyrosine. Under a reaction condition of 37 °C for 1 minute, an almost complete shift in tRNA was observed. Further, under a reaction condition of 24 °C, a shift in band was observed even after a further prolonged period of reaction. Meanwhile, as with EcIYRS, Ped-AC-IYRS recognized 3-iodotyrosine.

Example 10

Determination of the Hydrolysis Activity of Ped-CP1-IYRS and Ped-N-IYRS with Respect to Tyrosyl tRNA

[0199] Next, Ped-CP1-IYRS, which had not tyrosylated tRNA^Tyr in the aminoacylation assay in Example 9, was checked for its hydrolysis activity with respect to tyrosyl tRNA. Further, Ped-N-IYRS, which had been produced in Example 4, was studied, too.

[0200] Under the same conditions as in the determination of the aminoacylation activity of the N-terminal fusion protein, [14C]-tyrosyl tRNA^Tyr was produced with use of a wild-type EcTyrRS as an enzyme. After 30 minutes of reaction, the protein was removed by phenol/chloroform treatment, and [14C]-tyrosyl tRNA^Tyr was precipitated by ethanol precipitation. In order to inhibit spontaneous hydrolysis, the resultant precipitate was dissolved in a 10 mM sodium citrate buffer solution (pH 4.5).

[0201] The hydrolysis reaction was performed by adding 2.2 mM [14C]-tyrosyl tRNA^Tyr and 50 nM Ped-N-IYRS or Ped-CP1-IYRS to a buffer solution (100 mM Tris-Cl (pH 7.5), 15 mM MgCl2). As in the determination of aminoacylation activity, the remaining [14C]-tyrosyl tRNA^Tyr was quantified by a liquid scintillation counter.

[0202] FIG. 7 shows the results of the hydrolysis activity of Ped-N-IYRS and Ped-CP1-IYRS.

[0203] As shown in FIG. 7, both of the mutants exhibited hydrolysis activity. Further, Ped-CP1-IYRS exhibited higher hydrolysis activity than Ped-N-IYRS. It seemed that tyrosyl tRNA can be hydrolyzed more efficiently when the editing reaction domain is inserted into the CP1 domain than when the editing reaction domain is at the N terminal.

[0204] Next, it was checked whether Ped-CP1-IYRS introduces tyrosine or 3-iodotyrosine into an amber codon in a wheat germ cell-free translation system. As the wheat germ cell-free translation system, an RTS100 wheat germ CECF kit (Roche) was used, and the protocol that came with the kit was followed. Since this reaction is performed at 24 °C, the structure of Ped-CP1-IYRS is not disrupted, so that it is expected that 3-iodotyrosine will be recognized.

[0205] By adding, to this kit, a plasmid into which a gene controlled by a T7 promoter has been incorporated, the gene can be expressed. It is known that Escherichia coli-derived EcIYRS and Escherichia coli tRNA^Tyr are orthogonal to each other in a wheat germ translation system (i.e., that EcIYRS does not react with tRNA inherent in the cells and Escherichia coli tRNA^Tyr does not react with aaRSs inherent in the cells) (Kiga, D., Sakamoto, K., Kodama, K., Kigawa, T., Matsuda, T., Yabuki, T., Shirouzu, M., Harada, Y., Nakayama, H., Takio, K., Hasegawa, Y., Endo, Y., Hirao, I., Yokoyama, S., An engineered Escherichia coli tyrosyl-tRNA synthetase for site-specific incorporation of an unnatural amino acid into proteins in eukaryotic translation and its application in a wheat germ cell-free system. Proc Natl Acad Sci USA, 2002, 99, 9715-9720).

[0206] As a reporter protein to be expressed, GST (glutathione S-transferase) into which an amber codon has been introduced was used. The GST gene has a redundant sequence added upstream thereof, and the redundant sequence has the amber codon introduced thereinto (GST (Am)) (SEQ. ID. NO: 49) (donated by Dr. Kobayashi of RIKEN). A DNA fragment coding for GST (Am) was amplified by PCR, and was cloned at the EcoRV-XhoI site of a vector pEU3-N11 (TOYOBO Co., Ltd.). When EcIYRS and Escherichia coli amber suppressor tRNA are added to the cell-free translation system, tyrosine is introduced into the amber codon, and the GST protein is translated. That is, the aminoacylation activity of EcIYRS can be estimated by the amount of the GST protein.

[0207] To a reaction solution included in the wheat germ cell-free translation system kit, 5 mM Escherichia coli amber suppressor tRNA, 1 mM 3-iodotyrosine, 2 mM aaRS, and 40 ng/µl of a plasmid coding for GST (Am) were added, and the mixture was allowed to react at 24 °C for 6 hours. The GST protein thus translated was detected by western blotting with use of anti-GST antibodies.

[0208] FIG. 8 shows the results of western blotting.

[0209] As shown in FIG. 8, Ped-CP1-IYRS did not express GST at all in the absence of 3-iodotyrosine (Lane 8). This shows that the editing reaction domain of Ped-CP1-IYRS functions appropriately in a translation system. Further, in the presence of 3-iodotyrosine, Ped-CP1-IYRS expressed the same level of GST as EcIYRS. That is, there was no decrease in aminoacylation activity due to insertion of the editing reaction domain. Therefore, Ped-CP1-IYRS could achieve specific recognition with respect to 3-iodotyrosine.

[0210] Ped.sup.mu-CP1-IYRS exhibited a band of GST in the absence of 3-iodotyrosine as with EcIYRS, and in addition, expressed the same level of GST as EcIYRS in the presence of 3-iodotyrosine.

[0211] It should be noted that Ped-CP1-IYRS has a linker, represented by SEQ. ID. NO: 50, which has been inserted between the Rossman-fold domain and the N-terminal side of the editing reaction peptide and a linker, represented by SEQ. ID. NO: 51, which has been inserted between the Rossman-fold domain and the C-terminal side. These linkers have been inserted in accordance with the sequences of the primers used in the aforementioned overlap PCR. It was found that substitution of linkers in Ped-CP1-IYRS has an influence on the activity to specifically recognize 3-iodotyrosine and synthesize aminoacylated tRNA. Among such Ped-CP1-IYRSs, a Ped-CP1-IYRS in which a linker represented by SEQ. ID. NO: 50 is used at the N-terminal side and a linker represented by SEQ. ID. NO: 51 is used at the C-terminal side exhibited the highest activity.

Example 11

Site-Specific Introduction of Iodotyrosine into Proteins with Use of Ped-CP1-IYRS in Cultured Mammalian Cells

[0212] In order to further verify the usefulness of Ped-CP1-IYRS, an experiment for introducing iodotyrosine into proteins with use of Ped-CP1-IYRS in cultured mammalian cells was conducted. The experiment was all conducted according to the procedures and methods described in "Sakamoto, K., Hayashi, A., Sakamoto, A., Kiga, D., Nakayama, H., Soma, A., Kobayashi, T., Kitabatake, M., Takio, T., Saito, K., Shirouzu, M., Hirao, I., and Yokoyama, S., Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells, Nucleic Acids Research 30, 4692-4699, 2002". In expressing Ped-CP1-IYRS in cultured mammalian cells (CHO cells), such methods for expressing TyrRS and IYRS as described in the above document were used without modification. Therefore, the expressed Ped-CP1-IYRS has a FLAG tag added to the C terminal.

[0213] The 32nd codon of the RAS gene was substituted with an amber codon (Am32), and an attempt to introduce iodotyrosine (IY) into the position was made. It should be noted that a full-length RAS protein is expressed when some sort of amino acid has been introduced into the position. Every protein thus forcibly expressed has a FLAG tag added thereto, and the expression of proteins was detected by western blotting with use of anti-FLAG antibodies. The results are shown in FIG. 9. In FIG. 9, Bands A, B, and C correspond to Ped-CP1-IYRS, IYRS or Escherichia coli TyrRS, and RAS proteins, respectively.

[0214] First, a control experiment was conducted to express a RAS gene including no amber codon (Lane 2) and confirm the position of a full-length RAS protein on a western blot (Band C). Lanes 3 and 4 show that a full-length RAS is also expressed when Escherichia coli TyrRS is expressed. TyrRS recognizes not iodotyrosine but normal tyrosine, and as such, introduces tyrosine into the amber site regardless of whether IY has been added to the medium (IY+) or not (IY-). Next, consider the expression of IYRS, a mutant of TyrRS that recognizes iodotyrosine (Lane 5). In the presence of iodotyrosine, the amino acid is introduced into the amber site, with the result that a full-length RAS is expressed. It should be noted here that, as described in the document above, the amino acid that IYRS introduces into the amber site when iodotyrosine is added is iodotyrosine. However, IYRS still has an affinity for tyrosine. For this reason, in the absence of iodotyrosine, tyrosine is introduced into the amber site, with the result that a full-length RAS is expressed after all (Lane 6).

[0215] Next, consider the expression of Ped-CP1-IYRS. When iodotyrosine was added, a full-length RAS was expressed. This shows that the mutant has activity to introduce iodotyrosine into the amber site in the mammalian cells. It is also shown that, without addition of iodotyrosine, no full-length RAS is expressed; that is, Ped-CP1-IYRS does not introduce tyrosine into the amber site. Thus, as with the results from the wheat germ cell-free translation system shown in FIG. 8, Ped-CP1-IYRS exhibits drastically improved specificity to iodotyrosine also in the cultured mammalian cells.
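The pattern of results described for FIG. 9 can be summarized by a deliberately simplified decision model. It is a sketch only: each synthetase is reduced to three yes/no properties (whether it charges tyrosine, whether it charges iodotyrosine, and whether an editing domain removes mischarged tyrosine), which is an assumption made for illustration and not a kinetic description. The function and dictionary names are hypothetical.

```python
from typing import Optional


def amber_site_residue(accepts_tyr: bool, accepts_iy: bool,
                       edits_tyr: bool, iy_added: bool) -> Optional[str]:
    """Return the residue expected at the amber site, or None if the
    amber codon is not suppressed (no full-length protein)."""
    if iy_added and accepts_iy:
        return "iodotyrosine"
    if accepts_tyr and not edits_tyr:
        return "tyrosine"
    return None


# (accepts_tyr, accepts_iy, edits_tyr): a simplified reading of paragraphs [0214]-[0215].
SYNTHETASES = {
    "TyrRS": (True, False, False),
    "IYRS": (True, True, False),
    "Ped-CP1-IYRS": (True, True, True),
}

for name, props in SYNTHETASES.items():
    for iy_added in (True, False):
        label = "IY+" if iy_added else "IY-"
        print(name, label, amber_site_residue(*props, iy_added))
```

In this model only Ped-CP1-IYRS without iodotyrosine yields no full-length product, which matches the qualitative description above.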

Example 12

Site-Specific Introduction of Iodotyrosine into Proteins with Use of Ped-CP1-IYRS in Cultured Drosophila Cells

[0216] In order to further verify the usefulness of Ped-CP1-IYRS, an experiment for introducing iodotyrosine into proteins with use of Ped-CP1-IYRS in cultured Drosophila cells was conducted.

[0217] In expressing Ped-CP1-IYRS in cultured Drosophila cells (S2 cells), a commercially available expression vector pMT/V5-His A (Invitrogen Corporation) was used. As a result, the expressed Ped-CP1-IYRS has an HA tag added to the C terminal. In expressing Bacillus stearothermophilus-derived suppressor tRNA^Tyr in the S2 cells, a U6 promoter (DU6-2 promoter) described in "Wakiyama M., Matsumoto T., Yokoyama S., Drosophila U6 promoter-driven short hairpin RNAs effectively induce RNA interference in Schneider 2 cells, Biochemical and Biophysical Research Communications 331, 1163-1170, 2005" was used upstream of a suppressor tRNA^Tyr sequence described in "Sakamoto, K. et al., Nucleic Acids Research 30, 4692-4699, 2002".

[0218] The 91st codon of the lacZ gene was substituted with an amber codon (LacZ (UAG)), and an attempt to introduce iodotyrosine (IY) into the position was made. It should be noted that when some sort of amino acid has been introduced into the position, a full-length LacZ protein is expressed to exhibit β-galactosidase activity. The results are shown in Table 1.

[0219] First, a control experiment was conducted to express IYRS and confirm the β-galactosidase activity of a full-length LacZ protein. In the case of use of LacZ (UAG) as LacZ, LacZ (UAG) exhibited activity at 17.4% of the expression level of the wild-type LacZ (LacZ WT), which proved the usefulness of an iodotyrosine introduction system in cultured Drosophila cells.

[0220] Next, consider the expression of Ped-CP1-IYRS. When iodotyrosine was added, a full-length LacZ was expressed. This shows that the mutant has activity to introduce iodotyrosine into the amber site in the S2 cells. Meanwhile, without addition of iodotyrosine, the expression level of full-length LacZ proteins decreased to approximately 4% of the level observed when iodotyrosine was added. This shows that Ped-CP1-IYRS hardly introduces tyrosine into the amber site.

[0221] That is, in addition to the results from the wheat germ cell-free translation system in FIG. 8 and the results from the cultured mammalian cells in FIG. 9, Ped-CP1-IYRS exhibits drastically improved specificity to iodotyrosine also in the Drosophila cells. It was also shown that, in an experimental system using S2 cells, Ped-CP1-IYRS is not only specific but also comparable to IYRS in iodotyrosine introduction efficiency.

TABLE 1

LacZ                                    LacZ WT   LacZ(UAG)   LacZ(UAG)      LacZ(UAG)
aaRS                                    IYRS      IYRS        Ped-CP1-IYRS   Ped-CP1-IYRS
IY                                      --        1 mM        1 mM           --
Relative β-galactosidase activity (%)   100.0     17.4        15.9           0.7
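The two figures quoted in paragraphs [0220] and [0221] follow directly from the values in Table 1, as the short calculation below illustrates. The variable names are chosen for this sketch only; the numbers are those of Table 1.

```python
# Relative beta-galactosidase activities (%) from Table 1.
IYRS_WITH_IY = 17.4      # LacZ(UAG), IYRS, 1 mM IY
PED_CP1_WITH_IY = 15.9   # LacZ(UAG), Ped-CP1-IYRS, 1 mM IY
PED_CP1_NO_IY = 0.7      # LacZ(UAG), Ped-CP1-IYRS, no IY

# Residual suppression without iodotyrosine, relative to the value with IY.
background_pct = PED_CP1_NO_IY / PED_CP1_WITH_IY * 100

# Efficiency of Ped-CP1-IYRS relative to IYRS when iodotyrosine is present.
relative_efficiency_pct = PED_CP1_WITH_IY / IYRS_WITH_IY * 100

print(round(background_pct, 1))           # 4.4
print(round(relative_efficiency_pct, 1))  # 91.4
```

The first value corresponds to the "approximately 4%" stated in paragraph [0220]; the second indicates that Ped-CP1-IYRS reaches roughly nine tenths of the IYRS signal under these conditions.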

[0222] As described above, a polypeptide according to the present invention is a polypeptide having aminoacyl-tRNA synthetase activity, the polypeptide including: an altered polypeptide obtained by altering an arginyl-tRNA synthetase, a cysteinyl-tRNA synthetase, a methionyl-tRNA synthetase, a glutaminyl-tRNA synthetase, a glutamyl-tRNA synthetase, a lysyl-tRNA synthetase, a tyrosyl-tRNA synthetase, or a tryptophanyl-tRNA synthetase so that an unnatural amino acid is recognized; and an editing polypeptide containing an editing reaction active site derived from a phenylalanyl-tRNA synthetase, a leucyl-tRNA synthetase, an isoleucyl-tRNA synthetase, a valyl-tRNA synthetase, an alanyl-tRNA synthetase, a prolyl-tRNA synthetase, or a threonyl-tRNA synthetase, the editing polypeptide having been either inserted between a Rossman-fold N domain and a Rossman-fold C domain that exist in the altered polypeptide, or bound to an N terminal of the altered polypeptide. This brings about an effect of high specificity to an amino acid to be recognized.

[0223] The present invention makes it possible to provide an aaRS that exhibits high specificity to an amino acid to be recognized, and as such, makes it possible to produce a protein with various functions by using an unnatural amino acid. Therefore, the present invention can be applied in the field of biochemistry and the drug industry such as the industry for development of new drugs.

[0224] The concrete embodiments and examples of implementation discussed in the foregoing detailed explanation serve solely to illustrate the technical details of the present invention, which should not be narrowly interpreted within the limits of such embodiments and concrete examples, but rather may be applied in many variations within the spirit of the present invention, provided such variations do not exceed the scope of the patent claims set forth below.

Sequence CWU 1

1

731424PRTEscherichia coli 1Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val1 5 10 15Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly 20 25 30Pro Ile Ala Leu Val Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His 35 40 45Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala 50 55 60Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly65 70 75 80Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr 85 90 95Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu 100 105 110Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp 115 120 125Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys 130 135 140His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg145 150 155 160Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn 165 170 175Leu Leu Gln Gly Tyr Asp Phe Ala Cys Leu Asn Lys Gln Tyr Gly Val 180 185 190Val Leu Cys Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly 195 200 205Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr 210 215 220Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu225 230 235 240Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe 245 250 255Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu 260 265 270Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu 275 280 285Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala 290 295 300Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala305 310 315 320Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser 325 330 335Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu 340 345 350Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu 355 360 365Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile 370 375 380Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu385 390 395 400Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys 405 410 415Asn Tyr Cys Leu Ile Cys 
Trp Lys 4202194PRTPyrococcus horikoshii 2Glu Val Lys Lys Ser Asn Val Thr Val Tyr Val Asp Glu Lys Leu Lys1 5 10 15Asp Ile Arg Pro Tyr Gly Val Tyr Ala Ile Val Glu Gly Leu Arg Leu 20 25 30Asp Glu Asp Ser Leu Ser Gln Met Ile Gln Leu Gln Glu Lys Ile Ala 35 40 45Leu Thr Phe Gly Arg Arg Arg Arg Glu Val Ala Ile Gly Ile Phe Asp 50 55 60Phe Asp Lys Ile Lys Pro Pro Ile Tyr Tyr Lys Ala Ala Glu Lys Thr65 70 75 80Glu Lys Phe Ala Pro Leu Gly Tyr Lys Glu Glu Met Thr Leu Glu Glu 85 90 95Ile Leu Glu Lys His Glu Lys Gly Arg Glu Tyr Gly His Leu Ile Lys 100 105 110Asp Lys Gln Phe Tyr Pro Leu Leu Ile Asp Ser Glu Gly Asn Val Leu 115 120 125Ser Met Pro Pro Ile Ile Asn Ser Glu Phe Thr Gly Arg Val Thr Thr 130 135 140Asp Thr Lys Asn Val Phe Ile Asp Val Thr Gly Trp Lys Leu Glu Lys145 150 155 160Val Met Leu Ala Leu Asn Val Met Val Thr Ala Leu Ala Glu Arg Gly 165 170 175Gly Lys Ile Arg Ser Val Arg Val Val Tyr Lys Asp Phe Glu Ile Glu 180 185 190Thr Pro3627PRTArtificial SequenceDescription of Artificial SequenceP.horikoshii-E.coli 3Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val1 5 10 15Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly 20 25 30Pro Ile Ala Leu Val Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His 35 40 45Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala 50 55 60Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly65 70 75 80Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr 85 90 95Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu 100 105 110Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp 115 120 125Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys 130 135 140His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg145 150 155 160Leu Asn Arg Glu Asp Gln Glu Val Lys Lys Ser Asn Val Thr Val Tyr 165 170 175Val Asp Glu Lys Leu Lys Asp Ile Arg Pro Tyr Gly Val Tyr Ala Ile 180 185 190Val Glu Gly Leu Arg Leu Asp Glu Asp Ser Leu Ser Gln Met Ile Gln 195 200 205Leu Gln Glu Lys Ile 
Ala Leu Thr Phe Gly Arg Arg Arg Arg Glu Val 210 215 220Ala Ile Gly Ile Phe Asp Phe Asp Lys Ile Lys Pro Pro Ile Tyr Tyr225 230 235 240Lys Ala Ala Glu Lys Thr Glu Lys Phe Ala Pro Leu Gly Tyr Lys Glu 245 250 255Glu Met Thr Leu Glu Glu Ile Leu Glu Lys His Glu Lys Gly Arg Glu 260 265 270Tyr Gly His Leu Ile Lys Asp Lys Gln Phe Tyr Pro Leu Leu Ile Asp 275 280 285Ser Glu Gly Asn Val Leu Ser Met Pro Pro Ile Ile Asn Ser Glu Phe 290 295 300Thr Gly Arg Val Thr Thr Asp Thr Lys Asn Val Phe Ile Asp Val Thr305 310 315 320Gly Trp Lys Leu Glu Lys Val Met Leu Ala Leu Asn Val Met Val Thr 325 330 335Ala Leu Ala Glu Arg Gly Gly Lys Ile Arg Ser Val Arg Val Val Tyr 340 345 350Lys Asp Phe Glu Ile Glu Thr Pro Gly Ser Ala Ser Gly Pro Ala Ser 355 360 365Ala Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn Leu Leu Gln Gly Tyr 370 375 380Asp Phe Ala Cys Leu Asn Lys Gln Tyr Gly Val Val Leu Cys Ile Gly385 390 395 400Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly Ile Asp Leu Thr Arg 405 410 415Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr Val Pro Leu Ile Thr 420 425 430Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu Gly Gly Ala Val Trp 435 440 445Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe Tyr Gln Phe Trp Ile 450 455 460Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu Lys Phe Phe Thr Phe465 470 475 480Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu Glu Asp Lys Asn Ser 485 490 495Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala Glu Gln Val Thr Arg 500 505 510Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala Lys Arg Ile Thr Glu 515 520 525Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser Glu Ala Asp Phe Glu 530 535 540Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu Met Glu Lys Gly Ala545 550 555 560Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu Gln Pro Ser Arg Gly 565 570 575Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile Thr Ile Asn Gly Glu 580 585 590Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu Glu Asp Arg Leu Phe 595 600 605Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys Asn Tyr Cys Leu Ile 610 615 620Cys Trp Lys62541272DNAEscherichia coli 4atggcaagca 
gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg 60gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcgctcgt ttgcggcttc 120gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc 180ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc 240gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg 300gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct 360gctatcgcgg cgaacaacta tgactggttc ggcaatatga atgtgctgac cttcctgcgc 420gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt 480ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gttgcagggt 540tatgacttcg cctgtctgaa caaacagtac ggtgtggtgc tgtgcattgg tggttctgac 600cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg 660tttggcctga ccgttccgct gatcactaaa gcagatggca ccaaatttgg taaaactgaa 720ggcggcgcag tctggttgga tccgaagaaa accagcccgt acaaattcta ccagttctgg 780atcaacactg cggatgccga cgtttaccgc ttcctgaagt tcttcacctt tatgagcatt 840gaagagatca acgccctgga agaagaagat aaaaacagcg gtaaagcacc gcgcgcccag 900tatgtactgg cggagcaggt gactcgtctg gttcacggtg aagaaggttt acaggcggca 960aaacgtatta ccgaatgcct gttcagcggt tctttgagtg cgctgagtga agcggacttc 1020gaacagctgg cgcaggacgg cgtaccgatg gttgagatgg aaaagggcgc agacctgatg 1080caggcactgg tcgattctga actgcaacct tcccgtggtc aggcacgtaa aactatcgcc 1140tccaatgcca tcaccattaa cggtgaaaaa cagtccgatc ctgaatactt ctttaaagaa 1200gaagatcgtc tgtttggtcg ttttacctta ctgcgtcgcg gtaaaaagaa ttactgtctg 1260atttgctgga aa 12725216PRTEscherichia coli 5Val Gln Pro Glu Ile Val Pro Val Gly Ala Thr Ile Asp Asp Thr Leu1 5 10 15Pro Ile Thr Val Glu Ala Pro Glu Ala Cys Pro Arg Tyr Leu Gly Arg 20 25 30Val Val Lys Gly Ile Asn Val Lys Ala Pro Thr Pro Leu Trp Met Lys 35 40 45Glu Lys Leu Arg Arg Cys Gly Ile Arg Ser Ile Asp Ala Val Val Asp 50 55 60Val Thr Asn Tyr Val Leu Leu Glu Leu Gly Gln Pro Met His Ala Phe65 70 75 80Asp Lys Asp Arg Ile Glu Gly Gly Ile Val Val Arg Met Ala Lys Glu 85 90 95Gly Glu Thr Leu Val Leu Leu Asp Gly Thr Glu Ala Lys Leu Asn Ala 100 105 110Asp 
Thr Leu Val Ile Ala Asp His Asn Lys Ala Leu Ala Met Gly Gly 115 120 125Ile Phe Gly Gly Glu His Ser Gly Val Asn Asp Glu Thr Gln Asn Val 130 135 140Leu Leu Glu Cys Ala Phe Phe Ser Pro Leu Ser Ile Thr Gly Arg Ala145 150 155 160Arg Arg His Gly Leu His Thr Asp Ala Ser His Arg Tyr Glu Arg Gly 165 170 175Val Asp Pro Ala Leu Gln His Lys Ala Met Glu Arg Ala Thr Arg Leu 180 185 190Leu Ile Asp Ile Cys Gly Gly Glu Ala Gly Pro Val Ile Asp Ile Thr 195 200 205Asn Glu Ala Thr Leu Pro Lys Arg 210 2156648DNAEscherichia coli 6gttcaaccgg aaatcgttcc ggttggtgcg accatcgacg acacgctgcc gattacagtc 60gaagcgccgg aagcctgccc gcgttatctt ggccgtgtgg taaaaggcat taacgttaaa 120gcgccaactc cgctgtggat gaaagaaaaa ctgcgtcgtt gcgggatccg ttctatcgat 180gcagttgttg acgtcaccaa ctatgtgctg ctcgaactgg gccagccgat gcacgctttc 240gataaagatc gcattgaagg cggcattgtg gtgcggatgg cgaaagaggg cgaaacgctg 300gtgctgctcg acggtactga agcgaagctg aatgctgaca ctctggtcat cgccgaccac 360aacaaggcgc tggcgatggg cggcatcttc ggtggcgaac actctggcgt gaatgacgaa 420acacaaaacg tgctgctgga atgcgcgttc tttagcccgc tgtctatcac cggtcgtgct 480cgtcgtcatg gcctgcatac cgatgcgtct caccgttatg agcgtggcgt tgatccggca 540ctgcagcaca aagcgatgga acgtgcgacc cgtctgctga tcgacatctg cggtggtgag 600gctggcccgg taattgatat caccaacgaa gcaacgctgc cgaagcgt 6487582DNAPyrococcus horikoshii 7gaggttaaaa agagtaacgt aacggtttac gttgatgaaa agcttaaaga tataaggcct 60tatggagttt acgcaatagt tgaaggttta aggctcgacg aagattcttt aagtcaaatg 120attcagctac aagaaaagat agcccttaca tttggaagaa gaaggagaga agtggccata 180ggaatcttcg attttgataa gattaagcca cctatttact ataaagccgc cgaaaaaact 240gaaaagtttg cccccctggg ctataaagag gaaatgactc tagaggagat ccttgaaaag 300catgaaaagg gaagggagta tgggcacctt ataaaggata aacaatttta tccactactt 360attgacagcg aggggaatgt gctctccatg ccgccaataa tcaactccga gtttacggga 420agagtaacaa cggatacgaa aaatgtcttc atagatgtca cgggatggaa gcttgagaag 480gtaatgcttg cccttaatgt catggtaact gcattagcag agcgtggagg taaaataagg 540agcgttaggg ttgtctacaa ggacttcgaa attgaaaccc ca 5828405PRTThermus thermophilus 8Met Arg Val 
Pro Phe Ser Trp Leu Lys Ala Tyr Val Pro Glu Leu Glu1 5 10 15Ser Pro Glu Val Leu Glu Glu Arg Leu Ala Gly Leu Gly Phe Glu Thr 20 25 30Asp Arg Ile Glu Arg Val Phe Pro Ile Pro Arg Gly Val Val Phe Ala 35 40 45Arg Val Leu Glu Ala His Pro Ile Pro Gly Thr Arg Leu Lys Arg Leu 50 55 60Val Leu Asp Ala Gly Arg Thr Val Glu Val Val Ser Gly Ala Glu Asn65 70 75 80Ala Arg Lys Gly Ile Gly Val Ala Leu Ala Leu Pro Gly Thr Glu Leu 85 90 95Pro Gly Leu Gly Gln Lys Val Gly Glu Arg Val Ile Gln Gly Val Arg 100 105 110Ser Phe Gly Met Ala Leu Ser Pro Arg Glu Leu Gly Val Gly Glu Tyr 115 120 125Gly Gly Gly Leu Leu Glu Phe Pro Glu Asp Ala Leu Pro Pro Gly Thr 130 135 140Pro Leu Ser Glu Ala Trp Pro Glu Glu Val Val Leu Asp Leu Glu Val145 150 155 160Thr Pro Asn Arg Pro Asp Ala Leu Gly Leu Leu Gly Leu Ala Arg Asp 165 170 175Leu His Ala Leu Gly Tyr Ala Leu Val Glu Pro Glu Ala Ala Leu Lys 180 185 190Ala Glu Ala Leu Pro Leu Pro Phe Ala Leu Lys Val Glu Asp Pro Glu 195 200 205Gly Ala Pro His Phe Thr Leu Gly Tyr Ala Phe Gly Leu Arg Val Ala 210 215 220Pro Ser Pro Leu Trp Met Gln Arg Ala Leu Phe Ala Ala Gly Met Arg225 230 235 240Pro Ile Asn Asn Val Val Asp Val Thr Asn Tyr Val Met Leu Glu Arg 245 250 255Ala Gln Pro Met His Ala Phe Asp Leu Arg Phe Val Gly Glu Gly Ile 260 265 270Ala Val Arg Arg Ala Arg Glu Gly Glu Arg Leu Lys Thr Leu Asp Gly 275 280 285Val Glu Arg Thr Leu His Pro Glu Asp Leu Val Ile Ala Gly Trp Arg 290 295 300Gly Glu Glu Ser Phe Pro Leu Gly Leu Ala Gly Val Met Gly Gly Ala305 310 315 320Glu Ser Glu Val Arg Glu Asp Thr Glu Ala Ile Ala Leu Glu Val Ala 325 330 335Cys Phe Asp Pro Val Ser Ile Arg Lys Thr Ala Arg Arg His Gly Leu 340 345 350Arg Thr Glu Ala Ser His Arg Phe Glu Arg Gly Val Asp Pro Leu Gly 355 360 365Gln Val Pro Ala Gln Arg Arg Ala Leu Ser Leu Leu Gln Ala Leu Ala 370 375 380Gly Ala Arg Val Ala Glu Ala Leu Leu Glu Ala Gly Ser Pro Lys Pro385 390 395 400Pro Glu Ala Ile Pro 40591215DNAThermus thermophilus 9atgagggtgc ccttctcctg gctaaaagcc tacgtgcccg agctggaaag ccccgaggtc 60ctggaggagc 
gcctggcggg cctggggttt gaaacggacc ggatagagcg ggtcttcccc 120atcccaagag gggtggtctt cgcccgggtc ctggaggccc accccatccc cggcacccgg 180cttaagcgcc tggtcctgga cgcgggccgg acggtggaag tggtctcggg ggcggaaaac 240gcccgaaaag gaatcggggt ggccctggcc ctccccggga cggagcttcc cggcctgggc 300caaaaggtgg gggaacgggt catccaaggg gtgcggtcct tcggcatggc cctctctccc 360cgggagctcg gggtagggga gtacggcggg gggcttctgg agttccccga ggacgccctc 420ccccccggca cccccctttc ggaggcctgg ccggaggagg tggtgctgga cctcgaggtc 480accccgaacc gcccggacgc cctgggcctt ttgggcctcg cccgggacct ccacgccctg 540ggctacgccc tggtggagcc cgaagcggcc ctgaaggcgg aggcccttcc cctccccttc 600gccctcaagg tggaggaccc ggagggcgcc ccccacttca ccctgggcta cgccttcggc 660ctaagggtgg ccccaagccc cctctggatg cagcgggccc tcttcgccgc gggcatgcgg 720cccatcaaca acgtcgtgga cgtgaccaac tacgtcatgc tggaaagggc ccagcccatg 780cacgcctttg acctgcgctt cgtaggagag gggatcgcgg tgcgccgggc gcgggaaggg 840gagcggctta agaccctgga cggggtggaa

agaaccctcc accccgagga cctggtgatc 900gccgggtggc ggggggagga gagcttcccc ttgggcctcg ccggggtcat gggcggggcg 960gagagcgagg tccgggagga cacggaggcc atcgccttgg aggtggcctg ctttgacccg 1020gtctccatcc gcaagaccgc ccgccgccac ggcctgcgca ccgaggcgag ccaccgcttt 1080gagcgggggg tggaccccct gggccaggtc cccgcccaga ggcgggcctt aagcctcctc 1140caggccctgg cgggggcccg ggtggccgag gccctcctcg aggcgggaag ccccaagccc 1200ccggaggcca tcccc 1215101926DNAEscherichia coli 10atggttcaac cggaaatcgt tccggttggt gcgaccatcg acgacacgct gccgattaca 60gtcgaagcgc cggaagcctg cccgcgttat cttggccgtg tggtaaaagg cattaacgtt 120aaagcgccaa ctccgctgtg gatgaaagaa aaactgcgtc gttgcgggat ccgttctatc 180gatgcagttg ttgacgtcac caactatgtg ctgctcgaac tgggccagcc gatgcacgct 240ttcgataaag atcgcattga aggcggcatt gtggtgcgga tggcgaaaga gggcgaaacg 300ctggtgctgc tcgacggtac tgaagcgaag ctgaatgctg acactctggt catcgccgac 360cacaacaagg cgctggcgat gggcggcatc ttcggtggcg aacactctgg cgtgaatgac 420gaaacacaaa acgtgctgct ggaatgcgcg ttctttagcc cgctgtctat caccggtcgt 480gctcgtcgtc atggcctgca taccgatgcg tctcaccgtt atgagcgtgg cgttgatccg 540gcactgcagc acaaagcgat ggaacgtgcg acccgtctgc tgatcgacat ctgcggtggt 600gaggctggcc cggtaattga tatcaccaac gaagcaacgc tgccgaagcg tatggcaagc 660agtaacttga ttaaacaatt gcaagagcgg gggctggtag cccaggtgac ggacgaggaa 720gcgttagcag agcgactggc gcaaggcccg atcgcgctcg tttgcggctt cgatcctacc 780gctgacagct tgcatttggg gcatcttgtt ccattgttat gcctgaaacg cttccagcag 840gcgggccaca agccggttgc gctggtaggc ggcgcgacgg gtctgattgg cgacccgagc 900ttcaaagctg ccgagcgtaa gctgaacacc gaagaaactg ttcaggagtg ggtggacaaa 960atccgtaagc aggttgcccc gttcctcgat ttcgactgtg gagaaaactc tgctatcgcg 1020gcgaacaact atgactggtt cggcaatatg aatgtgctga ccttcctgcg cgatattggc 1080aaacacttct ccgttaacca gatgatcaac aaagaagcgg ttaagcagcg tctcaaccgt 1140gaagatcagg ggatttcgtt cactgagttt tcctacaacc tgttgcaggg ttatgacttc 1200gcctgtctga acaaacagta cggtgtggtg ctgtgcattg gtggttctga ccagtggggt 1260aacatcactt ctggtatcga cctgacccgt cgtctgcatc agaatcaggt gtttggcctg 1320accgttccgc tgatcactaa agcagatggc 
accaaatttg gtaaaactga aggcggcgca 1380gtctggttgg atccgaagaa aaccagcccg tacaaattct accagttctg gatcaacact 1440gcggatgccg acgtttaccg cttcctgaag ttcttcacct ttatgagcat tgaagagatc 1500aacgccctgg aagaagaaga taaaaacagc ggtaaagcac cgcgcgccca gtatgtactg 1560gcggagcagg tgactcgtct ggttcacggt gaagaaggtt tacaggcggc aaaacgtatt 1620accgaatgcc tgttcagcgg ttctttgagt gcgctgagtg aagcggactt cgaacagctg 1680gcgcaggacg gcgtaccgat ggttgagatg gaaaagggcg cagacctgat gcaggcactg 1740gtcgattctg aactgcaacc ttcccgtggt caggcacgta aaactatcgc ctccaatgcc 1800atcaccatta acggtgaaaa acagtccgat cctgaatact tctttaaaga agaagatcgt 1860ctgtttggtc gttttacctt actgcgtcgc ggtaaaaaga attactgtct gatttgctgg 1920aaataa 1926112508DNAArtificial SequenceDescription of Artificial SequenceTed-N-IYRS 11atgagggtgc ccttctcctg gctaaaagcc tacgtgcccg agctggaaag ccccgaggtc 60ctggaggagc gcctggcggg cctggggttt gaaacggacc ggatagagcg ggtcttcccc 120atcccaagag gggtggtctt cgcccgggtc ctggaggccc accccatccc cggcacccgg 180cttaagcgcc tggtcctgga cgcgggccgg acggtggaag tggtctcggg ggcggaaaac 240gcccgaaaag gaatcggggt ggccctggcc ctccccggga cggagcttcc cggcctgggc 300caaaaggtgg gggaacgggt catccaaggg gtgcggtcct tcggcatggc cctctctccc 360cgggagctcg gggtagggga gtacggcggg gggcttctgg agttccccga ggacgccctc 420ccccccggca cccccctttc ggaggcctgg ccggaggagg tggtgctgga cctcgaggtc 480accccgaacc gcccggacgc cctgggcctt ttgggcctcg cccgggacct ccacgccctg 540ggctacgccc tggtggagcc cgaagcggcc ctgaaggcgg aggcccttcc cctccccttc 600gccctcaagg tggaggaccc ggagggcgcc ccccacttca ccctgggcta cgccttcggc 660ctaagggtgg ccccaagccc cctctggatg cagcgggccc tcttcgccgc gggcatgcgg 720cccatcaaca acgtcgtgga cgtgaccaac tacgtcatgc tggaaagggc ccagcccatg 780cacgcctttg acctgcgctt cgtaggagag gggatcgcgg tgcgccgggc gcgggaaggg 840gagcggctta agaccctgga cggggtggaa agaaccctcc accccgagga cctggtgatc 900gccgggtggc ggggggagga gagcttcccc ttgggcctcg ccggggtcat gggcggggcg 960gagagcgagg tccgggagga cacggaggcc atcgccttgg aggtggcctg ctttgacccg 1020gtctccatcc gcaagaccgc ccgccgccac ggcctgcgca ccgaggcgag ccaccgcttt 
1080gagcgggggg tggaccccct gggccaggtc cccgcccaga ggcgggcctt aagcctcctc 1140caggccctgg cgggggcccg ggtggccgag gccctcctcg aggcgggaag ccccaagccc 1200ccggaggcca tccccggcag cgcgccgagc ggcatggcaa gcagtaactt gattaaacaa 1260ttgcaagagc gggggctggt agcccaggtg acggacgagg aagcgttagc agagcgactg 1320gcgcaaggcc cgatcgcgct cgtttgcggc ttcgatccta ccgctgacag cttgcatttg 1380gggcatcttg ttccattgtt atgcctgaaa cgcttccagc aggcgggcca caagccggtt 1440gcgctggtag gcggcgcgac gggtctgatt ggcgacccga gcttcaaagc tgccgagcgt 1500aagctgaaca ccgaagaaac tgttcaggag tgggtggaca aaatccgtaa gcaggttgcc 1560ccgttcctcg atttcgactg tggagaaaac tctgctatcg cggcgaacaa ctatgactgg 1620ttcggcaata tgaatgtgct gaccttcctg cgcgatattg gcaaacactt ctccgttaac 1680cagatgatca acaaagaagc ggttaagcag cgtctcaacc gtgaagatca ggggatttcg 1740ttcactgagt tttcctacaa cctgttgcag ggttatgact tcgcctgtct gaacaaacag 1800tacggtgtgg tgctgtgcat tggtggttct gaccagtggg gtaacatcac ttctggtatc 1860gacctgaccc gtcgtctgca tcagaatcag gtgtttggcc tgaccgttcc gctgatcact 1920aaagcagatg gcaccaaatt tggtaaaact gaaggcggcg cagtctggtt ggatccgaag 1980aaaaccagcc cgtacaaatt ctaccagttc tggatcaaca ctgcggatgc cgacgtttac 2040cgcttcctga agttcttcac ctttatgagc attgaagaga tcaacgccct ggaagaagaa 2100gataaaaaca gcggtaaagc accgcgcgcc cagtatgtac tggcggagca ggtgactcgt 2160ctggttcacg gtgaagaagg tttacaggcg gcaaaacgta ttaccgaatg cctgttcagc 2220ggttctttga gtgcgctgag tgaagcggac ttcgaacagc tggcgcagga cggcgtaccg 2280atggttgaga tggaaaaggg cgcagacctg atgcaggcac tggtcgattc tgaactgcaa 2340ccttcccgtg gtcaggcacg taaaactatc gcctccaatg ccatcaccat taacggtgaa 2400aaacagtccg atcctgaata cttctttaaa gaagaagatc gtctgtttgg tcgttttacc 2460ttactgcgtc gcggtaaaaa gaattactgt ctgatttgct ggaaataa 2508121890DNAArtificial SequenceDescription of Artificial SequencePed-N-IYRS 12atggaggtta aaaagagtaa cgtaacggtt tacgttgatg aaaagcttaa agatataagg 60ccttatggag tttacgcaat agttgaaggt ttaaggctcg acgaagattc tttaagtcaa 120atgattcagc tacaagaaaa gatagccctt acatttggaa gaagaaggag agaagtggcc 180ataggaatct tcgattttga taagattaag ccacctattt 
actataaagc cgccgaaaaa 240actgaaaagt ttgcccccct gggctataaa gaggaaatga ctctagagga gatccttgaa 300aagcatgaaa agggaaggga gtatgggcac cttataaagg ataaacaatt ttatccacta 360cttattgaca gcgaggggaa tgtgctctcc atgccgccaa taatcaactc cgagtttacg 420ggaagagtaa caacggatac gaaaaatgtc ttcatagatg tcacgggatg gaagcttgag 480aaggtaatgc ttgcccttaa tgtcatggta actgcattag cagagcgtgg aggtaaaata 540aggagcgtta gggttgtcta caaggacttc gaaattgaaa ccccaggctc cgcctccggc 600cccgcctccg ccggcatggc aagcagtaac ttgattaaac aattgcaaga gcgggggctg 660gtagcccagg tgacggacga ggaagcgtta gcagagcgac tggcgcaagg cccgatcgcg 720ctcgtttgcg gcttcgatcc taccgctgac agcttgcatt tggggcatct tgttccattg 780ttatgcctga aacgcttcca gcaggcgggc cacaagccgg ttgcgctggt aggcggcgcg 840acgggtctga ttggcgaccc gagcttcaaa gctgccgagc gtaagctgaa caccgaagaa 900actgttcagg agtgggtgga caaaatccgt aagcaggttg ccccgttcct cgatttcgac 960tgtggagaaa actctgctat cgcggcgaac aactatgact ggttcggcaa tatgaatgtg 1020ctgaccttcc tgcgcgatat tggcaaacac ttctccgtta accagatgat caacaaagaa 1080gcggttaagc agcgtctcaa ccgtgaagat caggggattt cgttcactga gttttcctac 1140aacctgttgc agggttatga cttcgcctgt ctgaacaaac agtacggtgt ggtgctgtgc 1200attggtggtt ctgaccagtg gggtaacatc acttctggta tcgacctgac ccgtcgtctg 1260catcagaatc aggtgtttgg cctgaccgtt ccgctgatca ctaaagcaga tggcaccaaa 1320tttggtaaaa ctgaaggcgg cgcagtctgg ttggatccga agaaaaccag cccgtacaaa 1380ttctaccagt tctggatcaa cactgcggat gccgacgttt accgcttcct gaagttcttc 1440acctttatga gcattgaaga gatcaacgcc ctggaagaag aagataaaaa cagcggtaaa 1500gcaccgcgcg cccagtatgt actggcggag caggtgactc gtctggttca cggtgaagaa 1560ggtttacagg cggcaaaacg tattaccgaa tgcctgttca gcggttcttt gagtgcgctg 1620agtgaagcgg acttcgaaca gctggcgcag gacggcgtac cgatggttga gatggaaaag 1680ggcgcagacc tgatgcaggc actggtcgat tctgaactgc aaccttcccg tggtcaggca 1740cgtaaaacta tcgcctccaa tgccatcacc attaacggtg aaaaacagtc cgatcctgaa 1800tacttcttta aagaagaaga tcgtctgttt ggtcgtttta ccttactgcg tcgcggtaaa 1860aagaattact gtctgatttg ctggaaataa 1890131884DNAArtificial SequenceDescription of Artificial 
SequencePed-CP1-IYRS 13atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg 60gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcgctcgt ttgcggcttc 120gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc 180ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc 240gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg 300gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct 360gctatcgcgg cgaacaacta tgactggttc ggcaatatga atgtgctgac cttcctgcgc 420gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt 480ctcaaccgtg aagatcagga ggttaaaaag agtaacgtaa cggtttacgt tgatgaaaag 540cttaaagata taaggcctta tggagtttac gcaatagttg aaggtttaag gctcgacgaa 600gattctttaa gtcaaatgat tcagctacaa gaaaagatag cccttacatt tggaagaaga 660aggagagaag tggccatagg aatcttcgat tttgataaga ttaagccacc tatttactat 720aaagccgccg aaaaaactga aaagtttgcc cccctgggct ataaagagga aatgactcta 780gaggagatcc ttgaaaagca tgaaaaggga agggagtatg ggcaccttat aaaggataaa 840caattttatc cactacttat tgacagcgag gggaatgtgc tctccatgcc gccaataatc 900aactccgagt ttacgggaag agtaacaacg gatacgaaaa atgtcttcat agatgtcacg 960ggatggaagc ttgagaaggt aatgcttgcc cttaatgtca tggtaactgc attagcagag 1020cgtggaggta aaataaggag cgttagggtt gtctacaagg acttcgaaat tgaaacccca 1080ggctccgcct ccggccccgc ctccgccggg atttcgttca ctgagttttc ctacaacctg 1140ttgcagggtt atgacttcgc ctgtctgaac aaacagtacg gtgtggtgct gtgcattggt 1200ggttctgacc agtggggtaa catcacttct ggtatcgacc tgacccgtcg tctgcatcag 1260aatcaggtgt ttggcctgac cgttccgctg atcactaaag cagatggcac caaatttggt 1320aaaactgaag gcggcgcagt ctggttggat ccgaagaaaa ccagcccgta caaattctac 1380cagttctgga tcaacactgc ggatgccgac gtttaccgct tcctgaagtt cttcaccttt 1440atgagcattg aagagatcaa cgccctggaa gaagaagata aaaacagcgg taaagcaccg 1500cgcgcccagt atgtactggc ggagcaggtg actcgtctgg ttcacggtga agaaggttta 1560caggcggcaa aacgtattac cgaatgcctg ttcagcggtt ctttgagtgc gctgagtgaa 1620gcggacttcg aacagctggc gcaggacggc gtaccgatgg ttgagatgga aaagggcgca 1680gacctgatgc aggcactggt cgattctgaa 
ctgcaacctt cccgtggtca ggcacgtaaa 1740actatcgcct ccaatgccat caccattaac ggtgaaaaac agtccgatcc tgaatacttc 1800tttaaagaag aagatcgtct gtttggtcgt tttaccttac tgcgtcgcgg taaaaagaat 1860tactgtctga tttgctggaa ataa 1884141914DNAArtificial SequenceDescription of Artificial SequencePed-AC-IYRS 14atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg 60gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcgctcgt ttgcggcttc 120gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc 180ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc 240gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg 300gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct 360gctatcgcgg cgaacaacta tgactggttc ggcaatatga atgtgctgac cttcctgcgc 420gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt 480ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gttgcagggt 540tatgacttcg cctgtctgaa caaacagtac ggtgtggtgc tgtgcattgg tggttctgac 600cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg 660tttggcctga ccgttccgct gatcactaaa gcagatggca ccaaatttgg taaaactgaa 720ggcggcgcag tctggttgga tccgaagaaa accagcccgt acaaattcta ccagttctgg 780atcaacactg cggatgccga cgtttaccgc ttcctgaagt tcttcacctt tatgagcatt 840gaagagatca acgccctgga agaagaagat aaaaacagcg gtaaagcacc gcgcgcccag 900tatgtactgg cggaggttaa aaagagtaac gtaacggttt acgttgatga aaagcttaaa 960gatataaggc cttatggagt ttacgcaata gttgaaggtt taaggctcga cgaagattct 1020ttaagtcaaa tgattcagct acaagaaaag atagccctta catttggaag aagaaggaga 1080gaagtggcca taggaatctt cgattttgat aagattaagc cacctattta ctataaagcc 1140gccgaaaaaa ctgaaaagtt tgcccccctg ggctataaag aggaaatgac tctagaggag 1200atccttgaaa agcatgaaaa gggaagggag tatgggcacc ttataaagga taaacaattt 1260tatccactac ttattgacag cgaggggaat gtgctctcca tgccgccaat aatcaactcc 1320gagtttacgg gaagagtaac aacggatacg aaaaatgtct tcatagatgt cacgggatgg 1380aagcttgaga aggtaatgct tgcccttaat gtcatggtaa ctgcattagc agagcgtgga 1440ggtaaaataa ggagcgttag ggttgtctac aaggacttcg 
aaattgaaac cccaggctcc 1500gcctccggcc ccgcctccgc cggcgcaccg cgcgcccagt atgtactggc ggagcaggtg 1560actcgtctgg ttcacggtga agaaggttta caggcggcaa aacgtattac cgaatgcctg 1620ttcagcggtt ctttgagtgc gctgagtgaa gcggacttcg aacagctggc gcaggacggc 1680gtaccgatgg ttgagatgga aaagggcgca gacctgatgc aggcactggt cgattctgaa 1740ctgcaacctt cccgtggtca ggcacgtaaa actatcgcct ccaatgccat caccattaac 1800ggtgaaaaac agtccgatcc tgaatacttc tttaaagaag aagatcgtct gtttggtcgt 1860tttaccttac tgcgtcgcgg taaaaagaat tactgtctga tttgctggaa ataa 19141534DNAArtificial SequenceDescription of Artificial SequencePrimer EcN1 15gcacgccata tggttcaacc ggaaatcgtt ccgg 341636DNAArtificial SequenceDescription of Artificial SequencePrimer EcN2 16gttactgctt gccatacgct tcggcagcgt tgcttc 361736DNAArtificial SequenceDescription of Artificial SequencePrimer EcN3 17gaagcaacgc tgccgaagcg tatggcaagc agtaac 361831DNAArtificial SequenceDescription of Artificial SequencePrimer EcN4 18aggtgcctcg agtttccagc aaatcagaca g 311934DNAArtificial SequenceDescription of Artificial SequencePrimer EcN5 19gcacgccata tggttcaacc ggaaatcgtt ccgg 342031DNAArtificial SequenceDescription of Artificial SequencePrimer EcN6 20aggtgcctcg agtttccagc aaatcagaca g 312137DNAArtificial SequenceDescription of Artificial SequencePrimer TtN1 21gccgcccata tgagggtgcc cttctcctgg ctaaaag 372254DNAArtificial SequenceDescription of Artificial SequencePrimer TtN2 22caagttactg cttgccatgc cgctcggcgc gctgccgggg atggcctccg gggg 542354DNAArtificial SequenceDescription of Artificial SequencePrimer TtN3 23cccccggagg ccatccccgg cagcgcgccg agcggcatgg caagcagtaa cttg 542431DNAArtificial SequenceDescription of Artificial SequencePrimer TtN4 24aggtgcctcg agtttccagc aaatcagaca g 312537DNAArtificial SequenceDescription of Artificial SequencePrimer TtN5 25gccgcccata tgagggtgcc cttctcctgg ctaaaag 372631DNAArtificial SequenceDescription of Artificial SequencePrimer TtN6 26aggtgcctcg agtttccagc aaatcagaca g 312731DNAArtificial SequenceDescription of Artificial SequencePrimer PhN1 
27cccatatgga ggttaaaaag agtaacgtaa c 312860DNAArtificial SequenceDescription of Artificial SequencePrimer PhN2 28gttactgctt gccatgccgg cggaggcggg gccggaggcg gagcctgggg tttcaatttc 602960DNAArtificial SequenceDescription of Artificial SequencePrimer PhN3 29gaaattgaaa ccccaggctc cgcctccggc cccgcctccg ccggcatggc aagcagtaac 603031DNAArtificial SequenceDescription of Artificial SequencePrimer PhN4 30aggtgcctcg agtttccagc aaatcagaca g 313131DNAArtificial SequenceDescription of Artificial SequencePrimer PhN5 31cccatatgga ggttaaaaag agtaacgtaa c 313231DNAArtificial SequenceDescription of Artificial SequencePrimer PhN6 32aggtgcctcg agtttccagc aaatcagaca g 313330DNAArtificial SequenceDescription of Artificial SequencePrimer PhC1 33caaccgtgaa gatcaggagg ttaaaaagag 303431DNAArtificial SequenceDescription of Artificial SequencePrimer PhC2 34cagtgaacga aatcccggcg gaggcggggc c 313534DNAArtificial SequenceDescription of Artificial SequencePrimer PhC3 35cggcgccata tggcaagcag taacttgatt aaac 343630DNAArtificial SequenceDescription of Artificial SequencePrimer PhC4 36ctctttttaa cctcctgatc ttcacggttg 303731DNAArtificial SequenceDescription of Artificial SequencePrimer PhC5 37ggccccgcct ccgccgggat ttcgttcact g 313831DNAArtificial SequenceDescription of Artificial SequencePrimer PhC6 38aggtgcctcg agtttccagc aaatcagaca g 313934DNAArtificial SequenceDescription of Artificial SequencePrimer PhC7 39cggcgccata tggcaagcag taacttgatt aaac 344031DNAArtificial SequenceDescription of Artificial SequencePrimer PhC8 40aggtgcctcg agtttccagc aaatcagaca g 314133DNAArtificial SequenceDescription of Artificial SequencePrimer PhA1 41cagtatgtac tggcggaggt taaaaagagt aac 334230DNAArtificial SequenceDescription of Artificial SequencePrimer PhA2 42ctgggcgcgc ggtgcgccgg cggaggcggg 304334DNAArtificial SequenceDescription of Artificial SequencePrimer PhA3 43cggcgccata tggcaagcag taacttgatt aaac 344433DNAArtificial SequenceDescription of Artificial SequencePrimer PhA4 44gttactcttt ttaacctccg ccagtacata ctg 
334530DNAArtificial SequenceDescription of Artificial SequencePrimer PhA5 45cccgcctccg ccggcgcacc gcgcgcccag 304631DNAArtificial SequenceDescription of Artificial SequencePrimer PhA6 46aggtgcctcg agtttccagc aaatcagaca g 314734DNAArtificial SequenceDescription of Artificial SequencePrimer PhA7 47cggcgccata tggcaagcag taacttgatt aaac 344831DNAArtificial SequenceDescription of Artificial SequencePrimer PhA8

48aggtgcctcg agtttccagc aaatcagaca g 3149738DNAArtificial SequenceDescription of Artificial SequenceGST(Am) 49atggctagca tgactggtgg acagcaaatg ggtcgggatc cgggtgcgaa ttctggtgta 60actaagaact cttagagccc tatactaggt tattggaaaa ttaagggcct tgtgcaaccc 120actcgacttc ttttggaata tcttgaagaa aaatatgaag agcatttgta tgagcgcgat 180gaaggtgata aatggcgaaa caaaaagttt gaattgggtt tggagtttcc caatcttcct 240tattatattg atggtgatgt taaattaaca cagtctatgg ccatcatacg ttatatagct 300gacaagcaca acatgttggg tggttgtcca aaagagcgtg cagagatttc aatgcttgaa 360ggagcggttt tggatattag atacggtgtt tcgagaattg catatagtaa agactttgaa 420actctcaaag ttgattttct tagcaagcta cctgaaatgc tgaaaatgtt cgaagatcgt 480ttatgtcata aaacatattt aaatggtgat catgtaaccc atcctgactt catgttgtat 540gacgctcttg atgttgtttt atacatggac ccaatgtgcc tggatgcgtt cccaaaatta 600gtttgtttta aaaaacgtat tgaagctatc ccacaaattg ataagtactt gaaatccagc 660aagtatatag catggccttt gcagggctgg caagccacgt ttggtggtgg cgaccatcct 720ccaaaatcgg attaataa 7385015PRTArtificial SequenceDescription of Artificial SequencelinkerN 50Gln Arg Leu Asn Arg Glu Asp Gln Glu Val Lys Lys Ser Asn Val1 5 10 155122PRTArtificial SequenceDescription of Artificial SequencelinkerC 51Glu Ile Glu Thr Pro Gly Ser Ala Ser Gly Pro Ala Ser Ala Gly Ile1 5 10 15Ser Phe Thr Phe Ser Tyr 2052795PRTEscherichia coli 52Met Lys Phe Ser Glu Leu Trp Leu Arg Glu Trp Val Asn Pro Ala Ile1 5 10 15Asp Ser Asp Ala Leu Ala Asn Gln Ile Thr Met Ala Gly Leu Glu Val 20 25 30Asp Gly Val Glu Pro Val Ala Gly Ser Phe His Gly Val Val Val Gly 35 40 45Glu Val Val Glu Cys Ala Gln His Pro Asn Ala Asp Lys Leu Arg Val 50 55 60Thr Lys Val Asn Val Gly Gly Asp Arg Leu Leu Asp Ile Val Cys Gly65 70 75 80Ala Pro Asn Cys Arg Gln Gly Leu Arg Val Ala Val Ala Thr Ile Gly 85 90 95Ala Val Leu Pro Gly Asp Phe Lys Ile Lys Ala Ala Lys Leu Arg Gly 100 105 110Glu Pro Ser Glu Gly Met Leu Cys Ser Phe Ser Glu Leu Gly Ile Ser 115 120 125Asp Asp His Ser Gly Ile Ile Glu Leu Pro Ala Asp Ala Pro Ile Gly 130 135 140Thr Asp Ile Arg Glu Tyr Leu Lys Leu Asp Asp Asn 
Thr Ile Glu Ile145 150 155 160Ser Val Thr Pro Asn Arg Ala Asp Cys Leu Gly Ile Ile Gly Val Ala 165 170 175Arg Asp Val Ala Val Leu Asn Gln Leu Pro Leu Val Gln Pro Glu Ile 180 185 190Val Pro Val Gly Ala Thr Ile Asp Asp Thr Leu Pro Ile Thr Val Glu 195 200 205Ala Pro Glu Ala Cys Pro Arg Tyr Leu Gly Arg Val Val Lys Gly Ile 210 215 220Asn Val Lys Ala Pro Thr Pro Leu Trp Met Lys Glu Lys Leu Arg Arg225 230 235 240Cys Gly Ile Arg Ser Ile Asp Ala Val Val Asp Val Thr Asn Tyr Val 245 250 255Leu Leu Glu Leu Gly Gln Pro Met His Ala Phe Asp Lys Asp Arg Ile 260 265 270Glu Gly Gly Ile Val Val Arg Met Ala Lys Glu Gly Glu Thr Leu Val 275 280 285Leu Leu Asp Gly Thr Glu Ala Lys Leu Asn Ala Asp Thr Leu Val Ile 290 295 300Ala Asp His Asn Lys Ala Leu Ala Met Gly Gly Ile Phe Gly Gly Glu305 310 315 320His Ser Gly Val Asn Asp Glu Thr Gln Asn Val Leu Leu Glu Cys Ala 325 330 335Phe Phe Ser Pro Leu Ser Ile Thr Gly Arg Ala Arg Arg His Gly Leu 340 345 350His Thr Asp Ala Ser His Arg Tyr Glu Arg Gly Val Asp Pro Ala Leu 355 360 365Gln His Lys Ala Met Glu Arg Ala Thr Arg Leu Leu Ile Asp Ile Cys 370 375 380Gly Gly Glu Ala Gly Pro Val Ile Asp Ile Thr Asn Glu Ala Thr Leu385 390 395 400Pro Lys Arg Ala Thr Ile Thr Leu Arg Arg Ser Lys Leu Asp Arg Leu 405 410 415Ile Gly His His Ile Ala Asp Glu Gln Val Thr Asp Ile Leu Arg Arg 420 425 430Leu Gly Cys Glu Val Thr Glu Gly Lys Asp Glu Trp Gln Ala Val Ala 435 440 445Pro Ser Trp Arg Phe Asp Met Glu Ile Glu Glu Asp Leu Val Glu Glu 450 455 460Val Ala Arg Val Tyr Gly Tyr Asn Asn Ile Pro Asp Glu Pro Val Gln465 470 475 480Ala Ser Leu Ile Met Gly Thr His Arg Glu Ala Asp Leu Ser Leu Lys 485 490 495Arg Val Lys Thr Leu Leu Asn Asp Lys Gly Tyr Gln Glu Val Ile Thr 500 505 510Tyr Ser Phe Val Asp Pro Lys Val Gln Gln Met Ile His Pro Gly Val 515 520 525Glu Ala Leu Leu Leu Pro Ser Pro Ile Ser Val Glu Met Ser Ala Met 530 535 540Arg Leu Ser Leu Trp Thr Gly Leu Leu Ala Thr Val Val Tyr Asn Gln545 550 555 560Asn Arg Gln Gln Asn Arg Val Arg Ile Phe Glu Ser Gly Leu Arg Phe 565 570 575Val 
Pro Asp Thr Gln Ala Pro Leu Gly Ile Arg Gln Asp Leu Met Leu 580 585 590Ala Gly Val Ile Cys Gly Asn Arg Tyr Glu Glu His Trp Asn Leu Ala 595 600 605Lys Glu Thr Val Asp Phe Tyr Asp Leu Lys Gly Asp Leu Glu Ser Val 610 615 620Leu Asp Leu Thr Gly Lys Leu Asn Glu Val Glu Phe Arg Ala Glu Ala625 630 635 640Asn Pro Ala Leu His Pro Gly Gln Ser Ala Ala Ile Tyr Leu Lys Gly 645 650 655Glu Arg Ile Gly Phe Val Gly Val Val His Pro Glu Leu Glu Arg Lys 660 665 670Leu Asp Leu Asn Gly Arg Thr Leu Val Phe Glu Leu Glu Trp Asn Lys 675 680 685Leu Ala Asp Arg Val Val Pro Gln Ala Arg Glu Ile Ser Arg Phe Pro 690 695 700Ala Asn Arg Arg Asp Ile Ala Val Val Val Ala Glu Asn Val Pro Ala705 710 715 720Ala Asp Ile Leu Ser Glu Cys Lys Lys Val Gly Val Asn Gln Val Val 725 730 735Gly Val Asn Leu Phe Asp Val Tyr Arg Gly Lys Gly Val Ala Glu Gly 740 745 750Tyr Lys Ser Leu Ala Ile Ser Leu Ile Leu Gln Asp Thr Ser Arg Thr 755 760 765Leu Glu Glu Glu Glu Ile Ala Ala Thr Val Ala Lys Cys Val Glu Ala 770 775 780Leu Lys Glu Arg Phe Gln Ala Ser Leu Arg Asp785 790 79553785PRTThermus thermophilus 53Met Arg Val Pro Phe Ser Trp Leu Lys Ala Tyr Val Pro Glu Leu Glu1 5 10 15Ser Pro Glu Val Leu Glu Glu Arg Leu Ala Gly Leu Gly Phe Glu Thr 20 25 30Asp Arg Ile Glu Arg Val Phe Pro Ile Pro Arg Gly Val Val Phe Ala 35 40 45Arg Val Leu Glu Ala His Pro Ile Pro Gly Thr Arg Leu Lys Arg Leu 50 55 60Val Leu Asp Ala Gly Arg Thr Val Glu Val Val Ser Gly Ala Glu Asn65 70 75 80Ala Arg Lys Gly Ile Gly Val Ala Leu Ala Leu Pro Gly Thr Glu Leu 85 90 95Pro Gly Leu Gly Gln Lys Val Gly Glu Arg Val Ile Gln Gly Val Arg 100 105 110Ser Phe Gly Met Ala Leu Ser Pro Arg Glu Leu Gly Val Gly Glu Tyr 115 120 125Gly Gly Gly Leu Leu Glu Phe Pro Glu Asp Ala Leu Pro Pro Gly Thr 130 135 140Pro Leu Ser Glu Ala Trp Pro Glu Glu Val Val Leu Asp Leu Glu Val145 150 155 160Thr Pro Asn Arg Pro Asp Ala Leu Gly Leu Leu Gly Leu Ala Arg Asp 165 170 175Leu His Ala Leu Gly Tyr Ala Leu Val Glu Pro Glu Ala Ala Leu Lys 180 185 190Ala Glu Ala Leu Pro Leu Pro Phe Ala Leu Lys Val 
Glu Asp Pro Glu 195 200 205Gly Ala Pro His Phe Thr Leu Gly Tyr Ala Phe Gly Leu Arg Val Ala 210 215 220Pro Ser Pro Leu Trp Met Gln Arg Ala Leu Phe Ala Ala Gly Met Arg225 230 235 240Pro Ile Asn Asn Val Val Asp Val Thr Asn Tyr Val Met Leu Glu Arg 245 250 255Ala Gln Pro Met His Ala Phe Asp Leu Arg Phe Val Gly Glu Gly Ile 260 265 270Ala Val Arg Arg Ala Arg Glu Gly Glu Arg Leu Lys Thr Leu Asp Gly 275 280 285Val Glu Arg Thr Leu His Pro Glu Asp Leu Val Ile Ala Gly Trp Arg 290 295 300Gly Glu Glu Ser Phe Pro Leu Gly Leu Ala Gly Val Met Gly Gly Ala305 310 315 320Glu Ser Glu Val Arg Glu Asp Thr Glu Ala Ile Ala Leu Glu Val Ala 325 330 335Cys Phe Asp Pro Val Ser Ile Arg Lys Thr Ala Arg Arg His Gly Leu 340 345 350Arg Thr Glu Ala Ser His Arg Phe Glu Arg Gly Val Asp Pro Leu Gly 355 360 365Gln Val Pro Ala Gln Arg Arg Ala Leu Ser Leu Leu Gln Ala Leu Ala 370 375 380Gly Ala Arg Val Ala Glu Ala Leu Leu Glu Ala Gly Ser Pro Lys Pro385 390 395 400Pro Glu Ala Ile Pro Phe Arg Pro Glu Tyr Ala Asn Arg Leu Leu Gly 405 410 415Thr Ser Tyr Pro Glu Ala Glu Gln Ile Ala Ile Leu Lys Arg Leu Gly 420 425 430Cys Arg Val Glu Gly Glu Gly Pro Thr Tyr Arg Val Thr Pro Pro Ser 435 440 445His Arg Leu Asp Leu Arg Leu Glu Glu Asp Leu Val Glu Glu Val Ala 450 455 460Arg Ile Gln Gly Tyr Glu Thr Ile Pro Leu Ala Leu Pro Ala Phe Phe465 470 475 480Pro Ala Pro Asp Asn Arg Gly Val Glu Ala Pro Tyr Arg Lys Glu Gln 485 490 495Arg Leu Arg Glu Val Leu Ser Gly Leu Gly Phe Gln Glu Val Tyr Thr 500 505 510Tyr Ser Phe Met Asp Pro Glu Asp Ala Arg Arg Phe Arg Leu Asp Pro 515 520 525Pro Arg Leu Leu Leu Leu Asn Pro Leu Ala Pro Glu Lys Ala Ala Leu 530 535 540Arg Thr His Leu Phe Pro Gly Leu Val Arg Val Leu Lys Glu Asn Leu545 550 555 560Asp Leu Asp Arg Pro Glu Arg Ala Leu Leu Phe Glu Val Gly Arg Val 565 570 575Phe Arg Glu Arg Glu Glu Thr His Leu Ala Gly Leu Leu Phe Gly Glu 580 585 590Gly Val Gly Leu Pro Trp Ala Lys Glu Arg Leu Ser Gly Tyr Phe Leu 595 600 605Leu Lys Gly Tyr Leu Glu Ala Leu Phe Ala Arg Leu Gly Leu Ala Phe 610 615 620Arg Val 
Glu Ala Gln Ala Phe Pro Phe Leu His Pro Gly Val Ser Gly625 630 635 640Arg Val Leu Val Glu Gly Glu Glu Val Gly Phe Leu Gly Ala Leu His 645 650 655Pro Glu Ile Ala Gln Glu Leu Glu Leu Pro Pro Val His Leu Phe Glu 660 665 670Leu Arg Leu Pro Leu Pro Asp Lys Pro Leu Ala Phe Gln Asp Pro Ser 675 680 685Arg His Pro Ala Ala Phe Arg Asp Leu Ala Val Val Val Pro Ala Pro 690 695 700Thr Pro Tyr Gly Glu Val Glu Ala Leu Val Arg Glu Ala Ala Gly Pro705 710 715 720Tyr Leu Glu Ser Leu Ala Leu Phe Asp Leu Tyr Gln Gly Pro Pro Leu 725 730 735Pro Glu Gly His Lys Ser Leu Ala Phe His Leu Arg Phe Arg His Pro 740 745 750Lys Arg Thr Leu Arg Asp Glu Glu Val Glu Glu Ala Val Ser Arg Val 755 760 765Ala Glu Ala Leu Arg Ala Arg Gly Phe Gly Leu Arg Gly Leu Asp Thr 770 775 780Pro78554556PRTPyrococcus horikoshii 54Met Pro Lys Phe Asp Val Ser Lys Ser Asp Leu Glu Arg Leu Ile Gly1 5 10 15Arg Ser Phe Ser Ile Glu Glu Trp Glu Asp Leu Val Leu Tyr Ala Lys 20 25 30Cys Glu Leu Asp Asp Val Trp Glu Glu Asn Gly Lys Val Tyr Phe Lys 35 40 45Leu Asp Ser Lys Asp Thr Asn Arg Pro Asp Leu Trp Ser Ala Glu Gly 50 55 60Val Ala Arg Gln Ile Lys Trp Ala Leu Gly Ile Glu Lys Gly Leu Pro65 70 75 80Lys Tyr Glu Val Lys Lys Ser Asn Val Thr Val Tyr Val Asp Glu Lys 85 90 95Leu Lys Asp Ile Arg Pro Tyr Gly Val Tyr Ala Ile Val Glu Gly Leu 100 105 110Arg Leu Asp Glu Asp Ser Leu Ser Gln Met Ile Gln Leu Gln Glu Lys 115 120 125Ile Ala Leu Thr Phe Gly Arg Arg Arg Arg Glu Val Ala Ile Gly Ile 130 135 140Phe Asp Phe Asp Lys Ile Lys Pro Pro Ile Tyr Tyr Lys Ala Ala Glu145 150 155 160Lys Thr Glu Lys Phe Ala Pro Leu Gly Tyr Lys Glu Glu Met Thr Leu 165 170 175Glu Glu Ile Leu Glu Lys His Glu Lys Gly Arg Glu Tyr Gly His Leu 180 185 190Ile Lys Asp Lys Gln Phe Tyr Pro Leu Leu Ile Asp Ser Glu Gly Asn 195 200 205Val Leu Ser Met Pro Pro Ile Ile Asn Ser Glu Phe Thr Gly Arg Val 210 215 220Thr Thr Asp Thr Lys Asn Val Phe Ile Asp Val Thr Gly Trp Lys Leu225 230 235 240Glu Lys Val Met Leu Ala Leu Asn Val Met Val Thr Ala Leu Ala Glu 245 250 255Arg Gly Gly Lys Ile Arg 
Ser Val Arg Val Val Tyr Lys Asp Phe Glu 260 265 270Ile Glu Thr Pro Asp Leu Thr Pro Lys Glu Phe Glu Val Glu Leu Asp 275 280 285Tyr Ile Arg Lys Leu Ser Gly Leu Glu Leu Asn Asp Gly Glu Ile Lys 290 295 300Glu Leu Leu Glu Lys Met Met Tyr Glu Val Glu Ile Ser Arg Gly Arg305 310 315 320Ala Lys Leu Lys Tyr Pro Ala Phe Arg Asp Asp Ile Met His Ala Arg 325 330 335Asp Ile Leu Glu Asp Val Leu Ile Ala Tyr Gly Tyr Asn Asn Ile Glu 340 345 350Pro Glu Glu Pro Lys Leu Ala Val Gln Gly Arg Gly Asp Pro Phe Lys 355 360 365Asp Phe Glu Asp Ala Ile Arg Asp Leu Met Val Gly Phe Gly Leu Gln 370 375 380Glu Val Met Thr Phe Asn Leu Thr Asn Lys Glu Val Gln Phe Lys Lys385 390 395 400Met Asn Ile Pro Glu Glu Glu Ile Val Glu Ile Ala Asn Pro Ile Ser 405 410 415Gln Arg Trp Ser Ala Leu Arg Lys Trp Ile Leu Pro Ser Leu Met Glu 420 425 430Phe Leu Ser Asn Asn Thr His Glu Glu Tyr Pro Gln Arg Ile Phe Glu 435 440 445Val Gly Leu Ala Thr Leu Ile Asp Glu Ser Arg Glu Thr Lys Thr Val 450 455 460Ser Glu Pro Lys Leu Ala Val Ala Leu Ala Gly Thr Gly Tyr Thr Phe465 470 475 480Thr Asn Ala Lys Glu Ile Leu Asp Ala Leu Met Arg His Leu Gly Phe 485 490 495Glu Tyr Glu Ile Glu Glu Val Glu His Gly Ser Phe Ile Pro Gly Arg 500 505 510Ala Gly Lys Ile Ile Val Asn Gly Arg Asp Ile Gly Ile Ile Gly Glu 515 520 525Val His Pro Gln Val Leu Glu Asn Trp Asn Ile Glu Val Pro Val Val 530 535 540Ala Phe Glu Ile Phe Leu Arg Pro Leu Tyr Arg His545 550 5555534DNAArtificial SequenceDescription of Artificial SequencePrimer Ec-B1-F 55gcaccgcata tgaaattcag tgaactgtgg ttac 345634DNAArtificial SequenceDescription of Artificial SequencePrimer Ec-B5-R 56gtgcctcgag cgggatgttg ttgtagccgt aaac 345743DNAArtificial SequenceDescription of Artificial SequencePrimer Ec-B3/4-R 57gtgcctcgag acgcttcggc agcgttgctt cgttggtgat atc 435834DNAArtificial SequenceDescription of Artificial SequencePrimer Ec-B3/4-F 58gcacgccata tggttcaacc ggaaatcgtt ccgg 345937DNAArtificial SequenceDescription of Artificial SequencePrimer Tt-B1-F 59gccgcccata tgagggtgcc cttctcctgg ctaaaag 
37
60 37 DNA Artificial Sequence; Description of Artificial Sequence: Primer Tt-B5-R; 60 gctcgaagct tggggatggt ctcgtagccc tggatgc 37
61 44 DNA Artificial Sequence; Description of Artificial Sequence: Primer Tt-B3/4-R; 61 gctcgaagct tcgggggctt ggggcttccc gcctcgagga gggc 44
62 37 DNA Artificial Sequence; Description of Artificial Sequence: Primer Tt-B3/4-F; 62 gccgcccata tggccctgaa ggcggaggcc cttcccc 37
63 35 DNA Artificial Sequence; Description of Artificial Sequence: Primer Ph-B1-F; 63 ggcgcgccca tatgccaaag ttcgacgttt caaag 35
64 29 DNA Artificial Sequence; Description of Artificial Sequence: Primer Ph-B5-R; 64 agctcgaggg gctctatgtt attatatcc 29
65 30 DNA Artificial Sequence; Description of Artificial Sequence: Primer Ph-B3/4-R; 65 ggctcgagta aatctggggt ttcaatttcg 30
66 37 DNA Artificial Sequence; Description of Artificial Sequence: Primer Ph-B3/4-F; 66 gccggcccca tatggaggtt aaaaagagta acgtaac 37
67 35 DNA Artificial Sequence; Description of Artificial Sequence: Primer Ec-Mu-F; 67 ggcctgcata ccgattggtc tcaccgttat gagcg 35
68 35 DNA Artificial Sequence; Description of Artificial Sequence: Primer Ec-Mu-R; 68 cgctcataac ggtgagacca atcggtatgc aggcc 35
69 33 DNA Artificial Sequence; Description of Artificial Sequence: Tt-Mu-F; 69 ggcctgcgca ccgagtggag ccaccgcttt gag 33
70 33 DNA Artificial Sequence; Description of Artificial Sequence: Tt-Mu-R; 70 ctcaaagcgg tggctccact cggtgcgcag gcc 33
71 30 DNA Artificial Sequence; Description of Artificial Sequence: Primer Ph-Mu-F; 71 gaaggagaga agtgtggata ggaatcttcg 30
72 30 DNA Artificial Sequence; Description of Artificial Sequence: Primer Ph-Mu-R; 72 cgaagattcc tatccacact tctctccttc 30
73 17 DNA Artificial Sequence; Description of Artificial Sequence: T7 promoter; 73 taatacgact cactata 17
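
The mutagenesis primers listed as SEQ ID NOs 67 to 72 (Ec-Mu, Tt-Mu, and Ph-Mu) come in forward/reverse pairs, and each reverse primer reads as the reverse complement of its forward partner, as is common for site-directed mutagenesis. The following is a minimal sketch, not part of the application, of how that pairing can be checked. The sequences are copied from the listing with the spaces removed; the names `PRIMER_PAIRS`, `reverse_complement`, and `is_complementary_pair` are illustrative assumptions.

```python
# Forward/reverse mutagenesis primers copied from SEQ ID NOs 67-72
# (grouping spaces removed). All identifiers here are hypothetical helpers.
PRIMER_PAIRS = {
    "Ec-Mu": ("ggcctgcataccgattggtctcaccgttatgagcg",
              "cgctcataacggtgagaccaatcggtatgcaggcc"),
    "Tt-Mu": ("ggcctgcgcaccgagtggagccaccgctttgag",
              "ctcaaagcggtggctccactcggtgcgcaggcc"),
    "Ph-Mu": ("gaaggagagaagtgtggataggaatcttcg",
              "cgaagattcctatccacacttctctccttc"),
}

# Translation table mapping each lowercase base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True if `reverse` is exactly the reverse complement of `forward`."""
    return reverse_complement(forward) == reverse.lower()


if __name__ == "__main__":
    for name, (fwd, rev) in PRIMER_PAIRS.items():
        # Report the primer length and whether the pair is fully complementary.
        print(name, len(fwd), is_complementary_pair(fwd, rev))
```

The same helper could be applied to other overlapping primer pairs in the listing, although pairs that carry restriction-site overhangs would not be expected to match over their full length.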
