Recombinant Expression Of Carboxylesterases Liu; Zhongbin [TONGJI UNIVERSITY]

Patent Application Summary

U.S. patent application number 13/127652 was filed with the patent office on 2011-09-01 for recombinant expression of carboxylesterases. This patent application is currently assigned to TONGJI UNIVERSITY. Invention is credited to Zhongbin Liu.

Application Number	20110212504 13/127652
Family ID	43627215
Filed Date	2011-09-01

United States Patent Application	20110212504
Kind Code	A1
Liu; Zhongbin	September 1, 2011

RECOMBINANT EXPRESSION OF CARBOXYLESTERASES

Abstract

The present application provides a method of producing a carboxylesterase or its variant in eukaryotic cells. The present application also provides an expression vector for high level carboxylesterase expression, a eukaryotic cell comprising the expression vector, and uses thereof. The present application also provides a composition comprising a carboxylesterase or its variant produced by a method described in the present application and uses of the composition.

Inventors:	Liu; Zhongbin; (Shanghai, CN)
Assignee:	TONGJI UNIVERSITY Shanghai CN
Family ID:	43627215
Appl. No.:	13/127652
Filed:	June 2, 2010
PCT Filed:	June 2, 2010
PCT NO:	PCT/CN2010/073449
371 Date:	May 4, 2011

Current U.S. Class:	435/197 ; 435/255.5; 435/256.1
Current CPC Class:	C12N 15/815 20130101; C12N 9/18 20130101; C12P 7/40 20130101; C12Y 301/01001 20130101; C12N 15/80 20130101; C12N 15/81 20130101
Class at Publication:	435/197 ; 435/256.1; 435/255.5
International Class:	C12N 9/18 20060101 C12N009/18; C12N 1/15 20060101 C12N001/15; C12N 1/19 20060101 C12N001/19

Foreign Application Data

Date	Code	Application Number
Aug 31, 2009	CN	200910166839.1

Claims

1. A method for producing a protein, comprising: culturing a eukaryotic cell engineered to express a gene encoding a thermostable carboxylesterase from a microbe or its variant under conditions suitable for expression of the thermostable carboxylesterase or its variant.

2. The method of claim 1, wherein the eukaryotic cell is a yeast cell.

3. The method of claim 2, wherein the yeast cell is selected from the group consisting of Pichia species, Hansenula species, Saccharomyces species and Candida species.

4. The method of claim 3, wherein the yeast cell is selected from the group consisting of Pichia pastoris, Hansenula polymorpha, Saccharomyces cerevisiae and Torulopsis glabrata.

5. The method of claim 4, wherein the yeast cell is Pichia pastoris GS115.

6. The method of claim 1, wherein the eukaryotic cell is a filamentous fungal cell.

7. The method of claim 6, wherein the filamentous fungal cell is from the Aspergillus genus.

8. The method of claim 7, wherein the filamentous fungal cell is selected from the group consisting of Aspergillus niger and Aspergillus oryzae.

9. The method of claim 8, wherein the filamentous fungal cell is Aspergillus niger M54.

10. The method of claim 1, further comprising isolating the carboxylesterase or its variant from the eukaryotic cell culture.

11. The method of claim 1, further comprising introducing an expression vector containing the gene encoding for the carboxylesterase or its variant into the eukaryotic cell.

12. The method of claim 1, wherein the expressed carboxylesterase or its variant is secreted to the outside of the eukaryotic cell.

13. The method of claim 1, wherein the carboxylesterase is a bacterial carboxylesterase.

14. (canceled)

15. The method of claim 1, wherein the carboxylesterase or its variant has at least 70% sequence identity to the amino acid sequence of a carboxylesterase isolated from Geobacillus stearothermophilus.

16. The method of claim 15, wherein the carboxylesterase isolated from Geobacillus stearothermophilus has the amino acid sequence as set forth in SEQ ID NOs: 9, 11, 13 or 15.

17. The method of claim 16, wherein the carboxylesterase or its variant has at least 90% sequence identity to the amino acid sequence encoded by any of SEQ ID NOs: 10, 12, 14 or 16.

18-31. (canceled)

32. A recombinant eukaryotic cell comprising an expression vector comprising a gene encoding a microbial thermostable carboxylesterase or its variant; and a regulatory sequence capable of promoting expression of the microbial thermostable carboxylesterase or its variant in a eukaryotic cell, wherein the regulatory sequence is operably linked to the gene.

33. The eukaryotic cell of claim 32, wherein the cell is Aspergillus niger.

34. The eukaryotic cell of claim 32, wherein the cell is Pichia pastoris.

35-37. (canceled)

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] For purposes of the USPTO extra-statutory requirements, the present application claims benefit of priority of Chinese Patent Application No. 200910166839.1, entitled Recombinant Expression of Carboxylesterase, naming Zhongbin Liu as inventor, filed Aug. 31, 2009, which was filed within the twelve months preceding the filing date of the present application, or is an application of which a currently co-pending application is entitled to the benefit of the filing date.

[0002] All subject matter of the listed applications and of any and all parent, grandparent, great-grandparent, etc. applications of the Related Applications is incorporated herein by reference to the extent such subject matter is not inconsistent herewith.

BACKGROUND

[0003] A carboxylesterase can hydrolyze a carboxylic ester to produce a carboxylate and an alcohol. Carboxylesterases belong to the superfamily of hydrolases. Carboxylesterases have been identified in various species from prokaryotic cells to eukaryotic cells.

SUMMARY

[0004] In one aspect, the present disclosure provides a method for producing a protein, comprising culturing a eukaryotic cell engineered to express a gene encoding for a carboxylesterase or its variant under conditions suitable for expression of the carboxylesterase or its variant.

[0005] In another aspect, the present disclosure provides a method for producing a protein, comprising culturing a eukaryotic cell engineered to express a gene encoding for a microbial carboxylesterase or its variant under conditions suitable for expression of the microbial carboxylesterase or its variant.

[0006] In another aspect, the present disclosure provides a method for producing a protein, comprising culturing a filamentous fungal cell engineered to express a gene encoding for a carboxylesterase or its variant under conditions suitable for expression of the carboxylesterase or its variant.

[0007] In another aspect, the present disclosure provides an expression vector, comprising a gene encoding for a carboxylesterase or its variant; and a regulatory sequence capable of promoting expression of the carboxylesterase or its variant in a eukaryotic cell, wherein the regulatory sequence is operably linked to the gene.

[0008] In another aspect, the present disclosure provides a eukaryotic cell comprising an expression vector containing a gene encoding for a carboxylesterase or its variant, and a regulatory sequence capable of promoting expression of the carboxylesterase or its variant in the eukaryotic cell, wherein the regulatory sequence is operably linked to the gene.

[0009] In another aspect, the present disclosure provides a composition comprising a eukaryotic cell and a carboxylesterase or its variant expressed by the eukaryotic cell.

[0010] In another aspect, the present disclosure provides a composition comprising a filamentous fungal cell and a carboxylesterase or its variant expressed by the filamentous fungal cell.

[0011] In another aspect, the present disclosure provides a composition comprising an isolated carboxylesterase or its variant produced by a method of the present disclosure.

[0012] In another aspect, the present disclosure provides methods of using the compositions of the present disclosure.

[0013] The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.

BRIEF DESCRIPTION OF THE FIGURES

[0014] FIG. 1 shows a schematic map of plasmid pIGF.

[0015] FIG. 2 shows a schematic map of plasmid pYG1.2.

[0016] FIG. 3 shows the gel electrophoresis results of the culture media from Aspergillus niger M54 transformed with pYG1.2-CarE-His (L2), Aspergillus niger M54 transformed with pYG1.2 (L3), and non-transformed Aspergillus niger M54 (L4). The protein molecular weight markers are shown in L1. The band around 29.0 kDa shown in L2 is the band for the carboxylesterase.

[0017] FIG. 4 shows the Western blot results of the carboxylesterase expressed from Pichia pastoris GS115 transformed with pPIC9K-CarE-His (L4) and from Aspergillus niger M54 transformed with pYG1.2-CarE-His (L5). Expression of carboxylesterase is not detected in the negative controls: Pichia pastoris GS115 transformed with pPIC9K (L3), non-transformed Aspergillus niger M54 (L6), and Aspergillus niger M54 transformed with pYG1.2 (L7). The positive control (a His-tag containing protein) is shown in L1 and protein markers are shown in L2.

[0018] FIG. 5 shows the gel electrophoresis results of carboxylesterase expressed from Pichia pastoris GS115 transformed with pPIC9K-CarE-His sampled after cultivation for 24 h (L3), 48 h (L4) and 72 h (L5). Protein markers are shown in L1 and culture medium of Pichia pastoris GS115 transformed with pPIC9K is shown in L2.

[0019] FIG. 6 shows the enzymatic activities of carboxylesterases isolated from culture media at different time points of incubation.

[0020] FIG. 7 shows the relative enzymatic activities of recombinant carboxylesterase measured at different pH values.

[0021] FIG. 8 shows the relative enzymatic activities of recombinant carboxylesterase measured at 37°C after the carboxylesterase has been treated at different temperatures for 10 or 30 minutes.

[0022] FIG. 9 shows the relative enzymatic activities of recombinant carboxylesterase measured at different temperatures.

DETAILED DESCRIPTION

[0023] In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented here.

[0024] The present disclosure relates to recombinant methods for producing carboxylesterases and variants thereof, expression vectors and host cells useful for recombinantly producing carboxylesterases and variants thereof. The present disclosure also relates to compositions comprising the recombinantly produced carboxylesterases and variants thereof and methods of using the compositions.

[0025] In one aspect, the present disclosure provides a method for producing a protein comprising culturing a eukaryotic cell engineered to express a gene encoding for a carboxylesterase or its variant under conditions suitable for expression of the carboxylesterase or its variant.

[0026] In another aspect, the present disclosure provides a method for producing a protein comprising culturing a filamentous fungal cell engineered to express a gene encoding for a carboxylesterase or its variant under conditions suitable for expression of the carboxylesterase or its variant.

[0027] In another aspect, the present disclosure provides a method for producing a protein comprising culturing a eukaryotic cell engineered to express a gene encoding for a microbial carboxylesterase or its variant under conditions suitable for expression of the carboxylesterase or its variant.

Eukaryotic Cells

[0028] Eukaryotic cells are cells that are organized into complex structures enclosed within membranes, which include, inter alia, a membrane-bound nucleus containing genetic materials. The eukaryotic cells of this disclosure include, without limitation, fungal cells, protist cells, animal cells and plant cells.

[0029] Fungal cells may include, without limitation, yeast cells and filamentous fungal cells.

[0030] Yeast cells of the present disclosure may belong to the divisions Ascomycota or Basidiomycota according to their phylogenetic characteristics. Illustrative examples of yeast cells include, without limitation, Pichia species such as Pichia angusta, Pichia pastoris, Pichia anomala, Pichia stipitis, Pichia methanolica, and Pichia guilliermondii; Hansenula species such as Hansenula anomala, Hansenula polymorpha, Hansenula wingei, Hansenula jadinii, and Hansenula saturnus; Saccharomyces species such as Saccharomyces cerevisiae, Saccharomyces bayanus, and Saccharomyces boulardii; Candida species such as Candida albicans, Candida methylica, Candida boidinii, Candida tropicalis, Candida wickerhamii, Candida maltosa, Candida glabrata, and Torulopsis glabrata; and also Kluyveromyces species and Schizosaccharomyces species.

[0031] In certain embodiments, the yeast cell is one or more of Pichia pastoris, Hansenula polymorpha, Saccharomyces cerevisiae, or Torulopsis glabrata. In certain embodiments, the yeast cell is Pichia pastoris. In certain embodiments, the yeast cell is one or more of Pichia pastoris strain GS115 cell, Pichia pastoris strain KM71 cell or Pichia pastoris strain MC100-3 cell. In certain embodiments, the yeast cell is a Hansenula polymorpha strain ATCC34438 cell.

[0032] Filamentous fungi may include without limitation any species of microscopic fungi that grow in the form of multi-cellular filaments. In certain embodiments, the filamentous fungal cells include, but are not limited to, the various species of Acremonium, Aspergillus, Fusarium, Humicola, Mucor, Myceliophthora, Neurospora, Penicillium, Thielavia, Tolypocladium, and Trichoderma.

[0033] In certain embodiments, the filamentous fungal cell is an Aspergillus awamori, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger or Aspergillus oryzae cell. In an illustrative embodiment, the filamentous fungal cell is an Aspergillus niger ATCC 12049 strain cell. In another illustrative embodiment, the filamentous fungal cell is an Aspergillus oryzae RIB40 strain cell.

[0034] Protist cells may include, without limitation, protozoa cells and algae cells.

[0035] Animal cells may include, without limitation, mammalian cells, avian cells, amphibian cells, and insect cells. Illustrative examples of animal cells include pig liver cells, human embryonic kidney 293 (HEK293) cells, Chinese hamster ovary (CHO) cells, zebrafish PAC2 cells, Xenopus A6 kidney epithelial cells, Caenorhabditis elegans cells, and Drosophila cells.

[0036] Plant cells may include, without limitation, parenchyma cells, collenchyma cells, and sclerenchyma cells. Illustrative examples of plant cells are Tobacco BY-2 cells, Datura innoxia cell line, and SB-1 cell line.

[0037] In certain embodiments, the eukaryotic cells in the present disclosure may carry one or more mutations that cause phenotype changes from the wild type strains. Mutations in the eukaryotic cells may occur naturally or non-naturally. Naturally occurring mutations may form spontaneously in the course of evolution. Non-naturally occurring mutations may be artificially generated using methods known in the art. In an illustrative example, mutations may be generated by exposing cells to physical mutagens such as UV irradiation or chemical mutagens such as hydroxylamine and ethidium bromide (see, for example, Hopwood, The Isolation of Mutants in Methods in Microbiology (J. R. Norris and D. W. Ribbons, eds.) 1970, 363-433, Academic Press, New York). In another illustrative example, mutations may be generated by gene deletion techniques such as homologous recombination to disrupt the expression of one or more target genes (see, for example, Alberts et al., Chapter 5: DNA Replication, Repair, and Recombination, Molecular biology of the cell, 2002, 845, Garland Science, New York). In another illustrative example, mutations may be made by gene modification techniques such as polymerase chain reaction (PCR) (see, for example, Botstein et al., Strategies and applications of in vitro mutagenesis, Science 1985, vol 229, No. 4719, 1193-1201; Lo et al., Specific amino acid substitutions in bacterioopsin: Replacement of a restriction fragment in the structural gene by synthetic DNA fragments containing altered codons, Proc. Natl. Acad. Sci. USA 1984, vol 81, No. 8, 2285-2289; Youngman et al., Genetic transposition and insertional mutagenesis in Bacillus subtilis with Streptococcus faecalis transposon Tn917, Proc. Natl. Acad. Sci. USA 1983, vol 80, No. 8, 2305-2309).

[0038] In certain embodiments, the eukaryotic cells in the present disclosure may carry one or more mutations that render them unable to synthesize an essential substance required for cell growth. Mutations may occur in genes involved in the synthesis and/or metabolism of amino acids, nucleotides, sugars, fatty acids, vitamins and other essential substances.

[0039] In certain embodiments, the eukaryotic cells may be mutant yeast cells carrying mutations in ura, trp, ade and leu genes which are involved in the synthesis of uridine, tryptophan, adenosine, and leucine, respectively (Agaphonov et al., Isolation and characterization of the LEU2 gene of Hansenula polymorpha, Yeast 1994, vol 10, 509-513; Bogdanova et al., Plasmid reorganization during integrative transformation in Hansenula polymorpha, Yeast 1995, vol 11, 343-353; Merckelbach et al., Cloning and sequencing of the ura3 locus of the methylotrophic yeast Hansenula polymorpha and its use for the generation of a deletion by gene replacement, Appl. Microbiol. Biotechnol. 1993, vol 40, 361-364).

[0040] In certain embodiments, the eukaryotic cells may be mutant filamentous fungal cells, e.g. Aspergillus niger strains, deficient in pyrG gene function, which are unable to synthesize uridine and thus cannot grow on uridine-free culture medium (Liu, et al, Construction of pyrG auxotrophic Aspergillus niger strain, Journal of microbiology 2001, vol 21, No. 3, 15-16). In an illustrative embodiment, the eukaryotic cell is Aspergillus niger M54, which has been deposited with the China Center for Type Culture Collection (CCTCC), Wuhan University, Wuhan, China, on Jun. 14, 2009, and assigned the Accession No. CCTCC M 209121, under the terms and conditions of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purpose of Patent Procedure (the Budapest Treaty).

[0041] In another illustrative embodiment, the eukaryotic cells are auxotrophic Aspergillus oryzae mutant strains, for example, Aspergillus oryzae M-2-3 deficient in argB gene, which are unable to synthesize arginine and thus cannot grow on arginine-free culture medium (Gomi et al, Integrative transformation of Aspergillus oryzae with a plasmid containing the Aspergillus nidulans argB gene. Agric Biol. Chem. 1987, vol 51, 2549-2555), and Aspergillus oryzae deficient in niaD gene, which are deficient in nitrate reductase and thus are unable to grow on the medium with nitrate as sole source of nitrogen (Unkles et al, The development of a homologous transformation system for Aspergillus oryzae based on the nitrate assimilation pathway: A convenient and general selection system for filamentous fungal transformation, Molecular and General Genetics 1989, vol 218, No. 1, 99-104).

Carboxylesterases and Variants Thereof

[0042] The term "carboxylesterase" as used herein refers to an enzymatic polypeptide that is capable of hydrolyzing a carboxyl ester into a carboxylate and an alcohol. A carboxylesterase may be a wild type carboxylesterase or any variant thereof. A variant of a wild type carboxylesterase differs from the wild type carboxylesterase in the amino acid sequence and/or modification of the amino acids but retains the capability to hydrolyze a carboxyl ester into a carboxylate and an alcohol. The variant may have one or more amino acid substitutions, additions, deletions, insertions, truncations, modifications (e.g. phosphorylation, glycosylation, labeling, etc.), or any combination thereof, of the wild type carboxylesterase. The variant may include naturally occurring variants of the wild type carboxylesterase and artificial polypeptide sequences such as those obtained by chemical synthesis or recombinant methods. The variant may include fragments, mutants, hybrids, analogs and derivatives of wild type carboxylesterases. The variants may contain non-naturally occurring amino acid residues.

[0043] Carboxylesterases have been identified and isolated from a wide variety of species, including, without limitation, animals, insects, plants, and microbes. The nucleotide sequences and amino acid sequences of carboxylesterases from many species have been identified.

[0044] In one embodiment, the carboxylesterase is derived from microbes. The term "microbe" refers to any living organism other than humans, animals and plants. Microbes may include, without limitation, prokaryotes such as bacteria and archaea, as well as fungi, protozoa, and other protists. Illustrative examples of microbes are Escherichia coli, Geobacillus stearothermophilus, Bacillus cereus, Candida rugosa, Plasmodium falciparum, Pyrococcus furiosus, Salmonella enterica, and Aspergillus fumigatus.

[0045] Carboxylesterases have been isolated from many microbes and their corresponding nucleotide sequences and amino acid sequences have been obtained. Table 1 lists illustrative examples of microbial carboxylesterases and their nucleotide and polypeptide sequences identified by GenBank Accession Numbers.

TABLE 1 Illustrative examples of carboxylesterases of different microbes.
Species	GenBank Amino Acid Sequence Accession No.	GenBank Nucleotide Sequence Accession No.
Geobacillus kaustophilus	BAD77330 (SEQ ID NO: 1)	BA000043, Region: 3067043..3067783 (SEQ ID NO: 2)
Geobacillus thermoleovorans	AAG53982 (SEQ ID NO: 3)	AF327065 (SEQ ID NO: 4)
Salmonella enterica	YP_002245400 (SEQ ID NO: 5)	NC_011294, Region: 3548658..3549428 (SEQ ID NO: 6)
Aspergillus fumigatus	XP_755184 (SEQ ID NO: 7)	XM_750091 (SEQ ID NO: 8)

[0046] In one embodiment, the carboxylesterase is derived from bacteria. Illustrative examples of bacteria are Escherichia coli, Geobacillus stearothermophilus, Geobacillus kaustophilus, Sulfolobus solfataricus, and Bacillus thermoleovorans. In another embodiment, the carboxylesterase is derived from thermophilic bacteria. Illustrative examples of thermophilic bacteria are Geobacillus stearothermophilus, Geobacillus kaustophilus, Sulfolobus solfataricus, and Bacillus thermoleovorans. In another embodiment, the carboxylesterase is derived from Geobacillus stearothermophilus. Four carboxylesterases have been identified from Geobacillus stearothermophilus. The amino acid sequences and nucleotide sequences of the four carboxylesterases are set forth in SEQ ID NOs: 9-16 as shown in Table 2 below.

TABLE 2 Illustrative examples of carboxylesterases of Geobacillus stearothermophilus.
Species	GenBank Amino Acid Sequence Accession No.	GenBank Nucleotide Sequence Accession No.
Geobacillus stearothermophilus	AAN81911 (SEQ ID NO: 9)	AY186197.1, Region: 1742..2485 (SEQ ID NO: 10)
Geobacillus stearothermophilus	AAN81912 (SEQ ID NO: 11)	AY186197.1, Region: 1..531 (SEQ ID NO: 12)
Geobacillus stearothermophilus	AAN81910 (SEQ ID NO: 13)	AY186196.1, Region: 3549..5045 (SEQ ID NO: 14)
Geobacillus stearothermophilus	ACA01541 (SEQ ID NO: 15)	DQ146476.2, Region: 13137..13817 (SEQ ID NO: 16)

[0047] Furthermore, Table 3 lists some illustrative examples of carboxylesterases from animals, insects and plants, and the GenBank Accession Numbers of their corresponding nucleotide and amino acid sequences.

TABLE 3 Illustrative examples of carboxylesterases of different species.
Group	Species	GenBank Amino Acid Sequence Accession No.	GenBank Nucleotide Sequence Accession No.
Animal	Homo sapiens	AAA83932 (SEQ ID NO: 17)	M65261, Region: 1..1367 (SEQ ID NO: 18)
Animal	Mus musculus (mouse)	CAA73388 (SEQ ID NO: 19)	Y12887, Region: 42..1739 (SEQ ID NO: 20)
Animal	Xenopus laevis (frog)	NP_001080853 (SEQ ID NO: 21)	NM_001087384, Region: 38..1699 (SEQ ID NO: 22)
Animal	Gallus gallus (chicken)	NP_001013015 (SEQ ID NO: 23)	NM_001012997, Region: 15..1685 (SEQ ID NO: 24)
Insect	Drosophila melanogaster (fly)	AAA28520 (SEQ ID NO: 25)	M33780, Region: join(3052..4438, 4495..4742) (SEQ ID NO: 26)
Insect	Bombyx mori (silkworm)	NP_001037027 (SEQ ID NO: 27)	NM_001043562, Region: 33..1745 (SEQ ID NO: 28)
Plant	Arabidopsis thaliana	NP_176139 (SEQ ID NO: 29)	NM_104623, Region: 112..1194 (SEQ ID NO: 30)
Plant	Malus pumila (apple)	ABB89007 (SEQ ID NO: 31)	DQ279908, Region: 79..981 (SEQ ID NO: 32)
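The "Region" entries in Tables 1-3 give nucleotide coordinates within a GenBank record. As an illustrative sketch only, the following Python function computes the length of such a region, assuming the coordinates are 1-based and inclusive as in GenBank feature locations, and that a "join" entry concatenates its listed spans. The function name `region_length` is hypothetical and is not part of the present disclosure or of any GenBank tool.

```python
import re

# Matches "start..end" and also the spaced form "start . . . end".
_SPAN = re.compile(r"(\d+)\s*(?:\.\s*){2,3}(\d+)")


def region_length(region: str) -> int:
    """Return the total number of nucleotides covered by a region string.

    Assumes 1-based inclusive coordinates; spans inside join(...) are summed.
    Pass only the coordinate part, not the accession number.
    """
    spans = _SPAN.findall(region)
    if not spans:
        raise ValueError(f"no coordinate span found in {region!r}")
    total = 0
    for start_text, end_text in spans:
        start, end = int(start_text), int(end_text)
        if end < start:
            raise ValueError(f"span end precedes start in {region!r}")
        total += end - start + 1  # inclusive coordinates
    return total


if __name__ == "__main__":
    # Coordinates copied from Table 2 (SEQ ID NO: 10) and Table 3 (SEQ ID NO: 26).
    for text in ("1742..2485", "join(3052..4438, 4495..4742)"):
        n = region_length(text)
        print(text, n, "nucleotides; multiple of 3:", n % 3 == 0)
```

A length that is a multiple of three is consistent with a region covering whole codons, which is a quick sanity check when copying coordinates by hand.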

[0048] Variants of carboxylesterases may be generated by conservative substitutions to the wild type carboxylesterases, wherein a substituent amino acid has similar structural or chemical properties to the native amino acid, for example, similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues. The variants may also be made by non-conservative substitutions or other changes to the amino acid sequence of the wild type carboxylesterase as long as the variants retain the carboxylesterase activity. Guidance in determining which and how many amino acid residues may be substituted, inserted or deleted without abolishing functional or biological activity may be found using computer programs well known in the art, for example, STAR software (see Bauer et al, STAR: predicting recombination sites from amino acid sequence, BMC Bioinformatics, 2006, vol 7, 437).

[0049] In certain embodiments, the present disclosure provides carboxylesterases and variants thereof that share at least 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% amino acid sequence identity with one or more of the amino acid sequences of the carboxylesterases set forth in SEQ ID NOs: 9, 11, 13 and 15.

[0050] "Percent (%) amino acid sequence identity" with respect to the carboxylesterase polypeptide sequences identified herein is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the specific carboxylesterase polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.

[0051] In certain embodiments, the present disclosure provides nucleotide sequences encoding for a carboxylesterase or its variant that share at least 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with one or more of the nucleotide sequences of the carboxylesterase set forth in SEQ ID NOs: 10, 12, 14, and 16.

[0052] "Percent (%) nucleotide sequence identity" with respect to carboxylesterase-encoding nucleotide sequences identified herein is defined as the percentage of nucleotides in a candidate sequence that are identical with the nucleotides in the carboxylesterase nucleotide sequence of interest, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent nucleotide sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or Megalign (DNASTAR) software.

[0053] In one embodiment, percent amino acid sequence identity and percent nucleotide sequence identity may be determined using the sequence comparison program NCBI-BLAST2 (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)). The NCBI-BLAST2 sequence comparison program may be downloaded from http://www.ncbi.nlm.nih.gov. NCBI-BLAST2 uses several search parameters, wherein all of those search parameters are set to default values including, for example, unmask=yes, strand=all, expected occurrences=10, minimum low complexity length=15/5, multi-pass e-value=0.01, constant for multi-pass=25, dropoff for final gapped alignment=25 and scoring matrix=BLOSUM62. In situations where NCBI-BLAST2 is employed for amino acid (or nucleotide) sequence comparisons, the % amino acid (or nucleotide) sequence identity of a given amino acid sequence A (or a given nucleotide sequence A) to, with, or against a given amino acid sequence B (or a given nucleotide sequence B) (which can alternatively be phrased as a given amino acid sequence A (or a given nucleotide sequence A) that has or comprises a certain % amino acid (or nucleotide) sequence identity to, with, or against a given amino acid sequence B (or a given nucleotide sequence B)) is calculated as follows: 100 times the fraction X/Y, where X is the number of amino acid (or nucleotide) residues scored as identical matches by the sequence alignment program NCBI-BLAST2 in that program's alignment of A and B, and where Y is the total number of amino acid (or nucleotide) residues in B. It will be appreciated that where the length of sequence A is not equal to the length of sequence B, the % sequence identity of A to B will not equal the % sequence identity of B to A.
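The calculation of 100 times the fraction X/Y can be sketched as follows. This is a minimal illustration in Python, not the NCBI-BLAST2 program: it assumes the two sequences have already been aligned by some alignment program and are supplied as equal-length strings with "-" marking gaps. The function name `percent_identity` and the short peptide strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of sequence A to sequence B: 100 * X / Y.

    X = aligned positions where A and B carry the same residue.
    Y = total number of residues in B (gap characters excluded).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


if __name__ == "__main__":
    # Hypothetical alignment: A has 8 residues, B has 4 residues.
    a = "MKTAYIAK"
    b = "MKTA----"
    print(percent_identity(a, b))  # identity of A to B: 100.0
    print(percent_identity(b, a))  # identity of B to A: 50.0
```

Because the denominator is the length of the second sequence, swapping the arguments changes the result whenever the two sequences differ in length, which is the asymmetry noted above.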

[0054] In another embodiment, percent amino acid sequence identity and percentage nucleotide sequence identity values may also be obtained as described below by using the WU-BLAST-2 computer program (Altschul et al., Methods in Enzymology 266:460-480 (1996)). All of the WU-BLAST-2 search parameters are set to the default values. When WU-BLAST-2 is employed, the % amino acid (or nucleotide) sequence identity of a given amino acid sequence A (or nucleotide sequence A) to, with, or against a given amino acid sequence B (or a given nucleotide sequence B) is determined by dividing (a) the number of matching identical amino acid (or nucleotide) residues between the sequence A and sequence B as determined by WU-BLAST-2 by (b) the total number of residues of sequence B, and (c) multiplied by 100.

[0055] In certain embodiments, the present disclosure provides thermostable carboxylesterases. The term "thermostable carboxylesterase" as used herein refers to a carboxylesterase capable of maintaining detectable enzymatic activity to hydrolyze carboxylic ester groups after being exposed to an elevated temperature at or above about 40°C for a period of exposure time. Enzymatic activity of carboxylesterase may be detected using any method known in the art, for example, by measuring disappearance of a substrate or formation of a product under a given set of reaction conditions. Illustrative methods for detecting the enzymatic activity of carboxylesterase are spectroscopic methods, radiometric methods, colorimetric methods or high performance liquid chromatography based methods. Illustrative examples of substrates of carboxylesterases are naphthyl acetate (NA), p-nitrophenyl acetate (p-NPA), methylthiobutyrate (MtB), or 14C-labelled esters. Control enzymatic activity may be measured using the same method under the same condition but in the absence of carboxylesterase. Enzymatic activity of carboxylesterase is considered detectable if it has a numerical value larger than the control enzymatic activity.
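The comparison against the control can be illustrated with a short Python sketch. It assumes, purely for illustration, that activity is estimated as the least-squares slope of a product signal (for example, an absorbance reading) against time; the present disclosure does not prescribe this estimator. The function names and all readings below are hypothetical placeholders, not measured data.

```python
def reaction_rate(times, signals):
    """Least-squares slope of signal versus time (signal units per time unit)."""
    n = len(times)
    if n != len(signals) or n < 2:
        raise ValueError("need at least two paired time/signal readings")
    mean_t = sum(times) / n
    mean_s = sum(signals) / n
    num = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signals))
    den = sum((t - mean_t) ** 2 for t in times)
    if den == 0:
        raise ValueError("time points must not all be identical")
    return num / den


def is_detectable(sample_rate, control_rate):
    """Activity counts as detectable when it exceeds the no-enzyme control."""
    return sample_rate > control_rate


if __name__ == "__main__":
    minutes = [0, 1, 2, 3]
    with_enzyme = [0.10, 0.30, 0.50, 0.70]     # placeholder readings
    without_enzyme = [0.10, 0.11, 0.12, 0.13]  # placeholder control readings
    sample = reaction_rate(minutes, with_enzyme)
    control = reaction_rate(minutes, without_enzyme)
    print("detectable:", is_detectable(sample, control))
```

Measuring the control under the same method and conditions matters because some ester substrates hydrolyze slowly without enzyme, so the control rate is not necessarily zero.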

[0056] In certain embodiments, the elevated temperature is between about 40° C. and about 100° C., or between about 50° C. and about 90° C., or between about 50° C. and about 70° C. In certain embodiments, the elevated temperature is about 50° C., about 55° C., about 60° C., about 65° C., about 70° C., about 75° C., about 80° C., about 85° C., or about 90° C. The period of exposure time during which the carboxylesterase may be exposed to the elevated temperature may be determined by a person skilled in the art. In certain embodiments, the exposure time is up to 10 minutes, 30 minutes, 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 12 hours, 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, or 10 days. In certain embodiments, the exposure time is between 30 minutes and 10 days, or between 30 minutes and 5 days, or between 30 minutes and 1 day, or between 30 minutes and 6 hours, or between 30 minutes and 2 hours, or between 1 hour and 2 hours, or between 10 minutes and 30 minutes.

[0057] In certain embodiments, the present disclosure provides thermostable carboxylesterases of Geobacillus stearothermophilus having the amino acid sequences set forth in SEQ ID NOs: 9, 11, 13, and 15. As shown in FIG. 8, the enzymatic activity of the thermostable carboxylesterase of SEQ ID NO: 9 may be measured after the enzyme has been exposed to elevated temperatures such as, but not limited to, 40° C., 50° C., 60° C., 70° C., and 80° C., for 10 minutes or 30 minutes. The enzymatic activity of the carboxylesterase exposed and tested at 37° C. is also measured as a standard reference. The samples and the standard reference are otherwise tested and measured under the same conditions. After exposure at 60° C. for 10 minutes, the carboxylesterase may have almost 100% of the enzymatic activity of the standard reference. After exposure at 70° C. for 30 minutes, the carboxylesterase may have above 60% of the enzymatic activity of the standard reference.
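The comparison against the standard reference, together with the detectability criterion of paragraph [0055], can be sketched as follows. The function names and the numerical readings are hypothetical and chosen only for illustration; they are not data from FIG. 8.

```python
def residual_activity_percent(activity_after_heat: float, reference_activity: float) -> float:
    """Activity of the heat-exposed sample as a percentage of the reference sample."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * activity_after_heat / reference_activity


def is_detectable(activity: float, control_activity: float) -> bool:
    """Activity counts as detectable if it exceeds the no-enzyme control."""
    return activity > control_activity


# Hypothetical readings in arbitrary units (assumed values, not measured data).
reference = 2.0   # sample exposed and tested at the reference temperature
heated = 1.3      # sample exposed to an elevated temperature
control = 0.05    # same assay without carboxylesterase

print(residual_activity_percent(heated, reference))
print(is_detectable(heated, control))
```

Both samples are assumed to be assayed under otherwise identical conditions, so the ratio reflects only the effect of the heat exposure.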

Expression Vectors and Host Cells

[0058] In one aspect, the present disclosure provides eukaryotic cells engineered to express a gene encoding for a carboxylesterase or its variant.

[0059] The term "express" or "expression" as used herein includes one or more steps involved in the production of the carboxylesterase including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion. The term "engineered to express" as used herein refers to one or more steps of enabling a host cell to express a carboxylesterase or its variant in such a manner that is not naturally found in the host cell. In certain embodiments, the term "engineered to express" includes one or more steps of introducing an exogenous gene encoding for a carboxylesterase or its variant into a host cell for the purpose of expressing the carboxylesterase or its variant in the host cell. In certain embodiments, a host cell is engineered to express a mutated carboxylesterase or a carboxylesterase with mutated regulatory sequences by exposing the host cell to mutagens.

[0060] The term "gene" as used herein refers to polyribonucleotides or polydeoxyribonucleotides or mixed polyribo-polydeoxyribonucleotides that contain information encoding for a peptide or polypeptide. This includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids, as well as "protein nucleic acids" (PNA) formed by conjugating bases to an amino acid backbone. Genes include naturally-occurring polynucleotides or synthetic polynucleotides formed from naturally-occurring bases or modified bases. The term gene also encompasses the coding regions of a structural gene and sequences located adjacent to the coding regions on both the 5' and 3' ends that are useful for the transcription or translation of the RNA or polypeptide as well as intervening sequences (introns) between individual coding segments (exons).

[0061] The term "peptide" or "polypeptide" refers to amino acids linked to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and may contain modified amino acids other than the 20 naturally occurring amino acids. The term "peptide" or "polypeptide" also includes peptides or polypeptide fragments, motifs and the like, glycosylated peptides or polypeptides, and other modified peptides or polypeptides.

[0062] The term "encoding for" as used herein means being capable of being transcribed into mRNA and/or translated into a peptide or protein.

[0063] In certain embodiments, genes encoding for a carboxylesterase or its variants are inserted into expression vectors for expression by host cells.

[0064] The term "expression vector" as used herein refers to a nucleotide vehicle into which a gene encoding for a peptide or protein is operably inserted so that the encoded peptide or protein can be expressed. Illustrative examples of nucleotide vehicles that may be used to build expression vectors include, but are not limited to, plasmids, phagemids, cosmids, artificial chromosomes such as yeast artificial chromosome, bacterial artificial chromosome, or P1-derived artificial chromosome, bacteriophages such as lambda phage or M13 phage, animal viruses such as retrovirus, adenovirus or papovavirus, and plant viruses such as potato virus X. Many eukaryotic expression vectors are commercially available. Selection of appropriate expression vectors is within the knowledge of those skilled in the art.

[0065] In certain embodiments, the expression vector is a vector suitable for expression in yeast cells. Illustrative examples are pPIC3K (Invitrogen, Carlsbad, Calif.), pPIC9K (Invitrogen), pAO815 (Invitrogen), pGAPZ (Invitrogen), pYC2/CT (Invitrogen), pYD1 yeast display vector (Invitrogen), pESC vectors (Stratagene, La Jolla, Calif.), pESC-HIS vector (Stratagene), and pHIPX4 (Gietl et al, Mutational analysis of the N-terminal topogenic signal of watermelon glyoxysomal malate dehydrogenase using the heterologous host Hansenula polymorpha, Proc. Natl. Acad. Sci. USA 1994, vol 91, 3151-3155). In certain embodiments, the expression vector is a plasmid suitable for expression in yeast cells. In certain embodiments, the expression vector is a plasmid suitable for expression in Pichia pastoris. In certain embodiments, the expression vector is pPIC9K.

[0066] In certain embodiments, the expression vector is a vector suitable for expression in filamentous fungal cells. Illustrative examples are pPTR (TaKaRa Bio Inc., Shiga, Japan), pDG1 (ATCC Catalog No. 53005), pAB366 (ATCC Catalog No. 77134), pAB520 (ATCC Catalog No. 77137), plasmid pYG1.2 (Liu et al, Construction of recombinant expression plasmid for Aspergillus niger, Journal of Tongji University (Medical science) 2001, vol 22, 1-3), pTAex3 (Sakuradani et al, Δ6-Fatty acid desaturase from an arachidonic acid-producing Mortierella fungus: Gene cloning and its heterologous expression in a fungus, Aspergillus, Gene 1999, vol 238, 445-453), pSa123 (Gomi et al., Integrative transformation of Aspergillus oryzae with a plasmid containing the Aspergillus nidulans argB gene, Agric. Biol. Chem. 1987, vol 51, 2549-2555), and pNAN8142 (Hiroyuki et al., Expression of Aspergillus oryzae Phytase Gene in Aspergillus oryzae RIB40 niaD, Journal of bioscience and bioengineering, 2006, Vol 102, No. 6, 564-567). In certain embodiments, the expression vector is a plasmid suitable for expression in filamentous fungal cells. In certain embodiments, the expression vector is a plasmid suitable for expression in Aspergillus niger. In certain embodiments, the expression vector is pYG1.2. An Escherichia coli DH5α strain containing the pYG1.2 plasmid (Escherichia coli DH5α/pYG1.2) was deposited with the China Center for Type Culture Collection (CCTCC), Wuhan University, Wuhan, China, on Jul. 27, 2009, and assigned the Accession No. CCTCC M 209165, under the terms and conditions of the Budapest Treaty.

[0067] In another aspect, the present disclosure provides an expression vector comprising a gene encoding for a carboxylesterase from a microbe or its variant, and a regulatory sequence capable of promoting expression of the carboxylesterase or its variant in a eukaryotic cell, wherein the regulatory sequence is operably linked to the gene. In certain embodiments, the amino acid sequences of the carboxylesterase or its variant have at least 70% sequence identity to the amino acid sequences of SEQ ID NOs: 9, 11, 13 or 15. In certain embodiments, the amino acid sequences of the carboxylesterase or its variant have at least 90% sequence identity to the amino acid sequences of SEQ ID NOs: 9, 11, 13 or 15. In certain embodiments, the nucleotide sequences of the carboxylesterase or its variant have at least 70% sequence identity to the nucleotide sequences of SEQ ID NOs: 10, 12, 14 or 16. In certain embodiments, the nucleotide sequences of the carboxylesterase or its variant have at least 90% sequence identity to the nucleotide sequences of SEQ ID NOs: 10, 12, 14 or 16.

[0068] In another aspect, the present disclosure provides an expression vector comprising a gene encoding for a carboxylesterase or its variant, and a regulatory sequence capable of promoting expression of the carboxylesterase or its variant in a filamentous fungal cell, wherein the regulatory sequence is operably linked to the gene.

[0069] The term "regulatory sequence" includes any component which is necessary or advantageous for the expression of a carboxylesterase or its variant of the present disclosure. Such regulatory sequences may include, but are not limited to, a promoter sequence, a transcription terminator, a leader sequence, and a polyadenylation sequence. The term "operably linked" means that the gene sequence is directly or indirectly linked to or associated with one or more regulatory sequence in a manner that allows expression of the gene encoding for the carboxylesterase or its variant. The gene coding sequence and the one or more regulatory sequences may be located on the same polynucleotide molecule and positioned in such a manner that allow expression of the gene encoding for the carboxylesterase or its variant. The gene coding sequence and the one or more regulatory sequences may be located on different polynucleotide molecules but the regulatory sequences can function to affect expression of the gene encoding for the carboxylesterase or its variant.

[0070] The regulatory sequence may contain an appropriate promoter sequence. As used herein, a "promoter sequence" refers to a segment of DNA that controls transcription of a DNA sequence to which it is operably linked. The promoter sequence includes specific sequences that are sufficient for RNA polymerase recognition, binding and transcription initiation. In addition, the promoter sequence may include sequences that modulate this recognition, binding and transcription initiation activity of RNA polymerase. These sequences may affect transcription on the same molecule or a different molecule. Functions of the promoter sequences, depending upon the nature of the regulation, may be constitutive or inducible by a stimulus. Any promoter sequence suitable for transcription control in eukaryotic cells may be used. In certain embodiments, the promoter sequence is suitable for transcription control in yeast cells and/or filamentous fungal cells. Illustrative examples of suitable promoter sequences in yeast cells are TEF promoter, CYC promoter, 3-phosphoglycerate kinase promoter, glyceraldehyde-3-phosphate dehydrogenase (GAPDH or GAP) promoter, galactokinase (GAL1) promoter, galactoepimerase promoter, and alcohol dehydrogenase (ADH1) promoter. Illustrative examples of suitable promoter sequences in filamentous fungal cells are α-amylase promoter, glucoamylase promoter, and alcohol dehydrogenase promoter (Kinghorn et al., Applied molecular genetics of filamentous fungi, Springer press, 1992, p 18). In certain embodiments, the promoters are inducible promoters that can turn on or off the carboxylesterase expression in response to a chemical or physical stimulus. Illustrative examples of inducible promoters are AOX1 promoter (inducible by methanol), GAL1 promoter (inducible by galactose), CUP promoter (inducible by Cu²⁺) (Wei Xiao, Yeast protocols, Edition: 2, Humana Press, 2005, p 320), and alc A promoter (inducible by alcohols).

[0071] The regulatory sequence may contain a suitable transcription terminator sequence, which is a sequence recognized by a eukaryotic cell RNA polymerase to terminate transcription. The terminator sequence may be operably linked to the 3' terminus of the nucleotide sequence encoding for a carboxylesterase or its variants. Any terminator sequence which is functional in eukaryotic cells may be used in the present disclosure. In certain embodiments, the terminator sequence may be a nucleotide sequence having transcription termination activity in yeast cells or filamentous fungal cells.

[0072] The regulatory sequence may also contain a suitable leader sequence, which is a non-translated region of an mRNA which is important for translation by eukaryotic cells. The leader sequence may be operably linked to the 5' terminus of the nucleotide sequence encoding the carboxylesterase or its variants. Any leader sequence which is functional in eukaryotic cells may be used in the present disclosure. In certain embodiments, the leader sequence may be a nucleotide sequence functional in yeast cells or filamentous fungal cells.

[0073] The regulatory sequence may also contain a polyadenylation sequence, a sequence which may be operably linked to the 3' terminus of the carboxylesterase gene sequence and which, when transcribed, is recognized by eukaryotic cells as a signal to add polyadenosine residues to the transcribed mRNA. Any polyadenylation sequence which is functional in eukaryotic cells may be used in the present disclosure. In certain embodiments, the polyadenylation sequence is functional in yeast cells or filamentous fungal cells.

[0074] In another aspect, the present disclosure provides an expression vector optionally further comprising a marker gene for selective identification of the expression vector. A marker gene is a gene encoding for a protein that can serve as a selection marker for identifying cells comprising the gene. Typical marker genes encode proteins that have one or more of the following characteristics: i) confer resistance to antibiotics or other toxic substances, e.g., ampicillin, neomycin, methotrexate, etc.; ii) complement auxotrophic deficiencies; and iii) supply critical nutrients not available from the media. Marker genes may be inducible or non-inducible and will generally allow for positive selection. Suitable marker genes for yeast host cells include, but are not limited to, Ade2, His3, Leu2, Lys2, Met3, Trp1, Ura3, and neomycin, kanamycin, or ampicillin resistance genes. Suitable marker genes for use in filamentous fungal host cells include, but are not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase), sC (sulfate adenyltransferase), trpC (anthranilate synthase), as well as equivalents thereof.

[0075] In another aspect, the present disclosure provides a method of producing a carboxylesterase or its variant, wherein the expressed carboxylesterase or its variant is secreted to the outside of the eukaryotic cell. In certain embodiments, the gene encoding for the carboxylesterase or its variant is linked to a signal sequence that codes for a signal peptide. As used herein, the term "signal peptide" refers to an amino acid sequence that is linked to the carboxylesterase or its variant and that enables the expressed carboxylesterase or its variant to be transported/secreted outside of a cell membrane. In certain embodiments, the signal peptide may be linked to the amino terminus of the carboxylesterase or its variant. In certain embodiments, the signal peptide may be cut by a peptidase to remove the signal peptide from the carboxylesterase or its variant.

[0076] The signal sequence may be one or more naturally existing signal sequences of carboxylesterases, or foreign signal sequences added to the carboxylesterases. In certain embodiments, the signal sequences are functional in yeast cells and/or filamentous fungal cells. Illustrative examples of signal peptides functional in yeast cells are chicken lysozyme signal peptide (CLSP), signal peptide for Saccharomyces cerevisiae alpha-factor and signal peptide for Saccharomyces cerevisiae invertase. Illustrative examples of signal peptides functional in filamentous fungal cells are signal peptide for Aspergillus oryzae TAKA amylase, signal peptide for Aspergillus niger neutral amylase, signal peptide for Aspergillus niger glucoamylase, signal peptide for Rhizomucor miehei aspartic proteinase, signal peptide for Humicola insolens cellulase, and signal peptide for Humicola lanuginosa lipase.

[0077] The term "host cell" as used herein refers to a cell that is susceptible to transformation, transfection, transduction, and the like with an expression vector.

[0078] The expression vector may be introduced into the eukaryotic cells using any suitable methods known in the art, including without limitation, electroporation, calcium chloride-, lithium chloride-, lithium acetate/polyethylene glycol-, calcium phosphate-, DEAE-dextran-, liposome-mediated DNA uptake, spheroplasting, injection, microinjection, microprojectile bombardment, phage infection, viral infection, or other established methods. Alternatively, expression vectors containing the gene sequences of interest can be transcribed in vitro, and the resulting mRNA may be introduced into the host cell for transient expression by well-known methods, e.g., by injection (see, Kubo et al., Location of a region of the muscarinic acetylcholine receptor involved in selective effector coupling, FEBS Letts. 1988, vol 241, 119).

[0079] In certain embodiments, expression vectors may be introduced into yeast cells by methods such as protoplast transformation (see, e.g., Spencer et al, Genetic manipulation of non-conventional yeasts by conventional and non-conventional methods, J Basic Microbiol., 1988, vol 28, No. 5, 321-333), competent cell transformation (see, e.g., Gietz et al, Frozen competent yeast cells that can be transformed with high efficiency using the LiAc/SS carrier DNA/PEG method, Nat. Protoc. 2007; 2(1):1-4), electroporation (see, e.g., Suga et al, High-efficiency electroporation by freezing intact yeast cells with addition of calcium, Curr Genet., 2003, vol 43, No. 3, 206-211), or conjugation through cell-to-cell contact (see, e.g., Nishikawa et al, Trans-kingdom conjugation offers a powerful gene targeting tool in yeast, 1998, Genetic Analysis: Biomolecular Engineering, vol 14, No. 3, 65-73).

[0080] In certain embodiments, the expression vector may be introduced into filamentous fungal cells by protoplast transformation comprising steps of protoplast isolation, regeneration, and fusion (see, Arora et al, Handbook of fungal biotechnology, 2nd Edition, CRC Press, 2004, p 9-24). Suitable procedures for transformation of filamentous fungal cells are described in various publications (see, for example, Ruiz et al, Strategies for the transformation of filamentous fungi, J Appl Microbiol., 2002, vol 92, No. 2, 189-195; Hynes et al, Genetic transformation of filamentous fungi, Journal of Genetics, 1996, vol 75, No. 3, 297-311).

[0081] In another aspect, the present disclosure provides a eukaryotic cell comprising an expression vector, wherein the expression vector contains a gene encoding for a carboxylesterase from a microbe or its variant, and a regulatory sequence capable of promoting expression of the carboxylesterase or its variant in a eukaryotic cell, wherein the regulatory sequence is operably linked to the gene. In certain embodiments, the eukaryotic cell is a yeast cell. In certain embodiments, the eukaryotic cell is Pichia pastoris.

[0082] In another aspect, the present disclosure provides a eukaryotic cell comprising an expression vector containing a gene encoding for a carboxylesterase or its variant, and a regulatory sequence capable of promoting expression of the carboxylesterase or its variant in a filamentous fungal cell, wherein the regulatory sequence is operably linked to the gene. In certain embodiments, the filamentous fungal cell is Aspergillus niger.

Cell Culturing

[0083] The eukaryotic cells engineered to express a carboxylesterase or its variant of the present disclosure may be cultured in any suitable medium under conditions suitable for expression of the carboxylesterase or its variant. For example, the cells may be cultivated by shake flask cultivation, small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial fermentors. The cultivation may take place in a suitable nutrient medium comprising carbon and nitrogen sources and inorganic salts. Suitable media are available from commercial suppliers or may be prepared using commercially available ingredients.

[0084] The cultivation conditions such as temperature, pH, incubation time and presence of an inducer may be adjusted to allow higher expression of the carboxylesterases. Cultivation conditions may be adjusted by people skilled in the art. In certain embodiments, cultivation conditions may be determined by cultivating cells engineered to express a carboxylesterase or its variant under a wide range of conditions, measuring the expression of the carboxylesterase or its variant and selecting the cultivation conditions that allow a relatively high level expression of the carboxylesterase or its variant.
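The screening approach of paragraph [0084] can be sketched as a simple search over combinations of conditions. This is an illustrative sketch under stated assumptions: the function `select_best_condition`, the choice of three variables, and the stand-in measurement function are all hypothetical, and in practice the measurement would come from one of the expression assays described in this disclosure.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Tuple

Condition = Tuple[float, float, int]  # (temperature in deg C, pH, incubation days)


def select_best_condition(
    temperatures: Iterable[float],
    ph_values: Iterable[float],
    incubation_days: Iterable[int],
    measure_expression: Callable[[float, float, int], float],
) -> Tuple[Condition, Dict[Condition, float]]:
    """Measure expression for every combination and return the highest-scoring one."""
    results = {
        (t, ph, d): measure_expression(t, ph, d)
        for t, ph, d in product(temperatures, ph_values, incubation_days)
    }
    best = max(results, key=results.get)
    return best, results


# Stand-in for a real measurement (e.g. mg/L); this response surface is made up.
def fake_measurement(temp_c: float, ph: float, days: int) -> float:
    return 20.0 - abs(temp_c - 30.0) - 2.0 * abs(ph - 6.0) - abs(days - 4)


best, table = select_best_condition([25, 30, 35], [5.0, 6.0, 7.0], [2, 4, 6], fake_measurement)
print(best)
```

A full grid is used here only because it is the simplest design to read; a person skilled in the art may prefer a fractional or sequential design when the number of conditions is large.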

[0085] Suitable temperature, pH, and incubation time for cell cultivation usually depend on the host cells. In certain embodiments, the cultivation temperature may range from about 20° C. to about 80° C., from about 30° C. to about 70° C., from about 30° C. to about 60° C., from about 30° C. to about 50° C., or from about 30° C. to about 40° C. In certain embodiments, the cultivation temperature is at about 20° C., about 25° C., about 30° C., about 35° C., about 37° C., about 40° C., about 50° C., about 60° C., about 70° C., or about 80° C.

[0086] In certain embodiments, the cultivation pH may range from about 2 to about 8.5, from about 3 to about 8.5, from about 4 to about 8.5, from about 5 to about 8.5, from about 6 to about 8.5, or from about 7 to about 8.5. In certain embodiments, the cultivation pH is at about 2, about 3, about 4, about 5, about 6, about 7, about 8, or about 8.5.

[0087] In certain embodiments, the incubation time may be at least one day, at least 2 days, at least 3 days, or at least 4 days. In certain embodiments, the incubation time may range from 1 day to 10 days, 2 days to 9 days, 3 days to 8 days, or 4 days to 7 days. In certain embodiments, the incubation time is 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, or 10 days.

[0088] In certain embodiments, the eukaryotic cells may be cultured in the presence of an inducer that can induce the expression of the carboxylesterase or its variant. The selection of an inducer may be based on the inducible promoter operably linked to the gene encoding for the carboxylesterase or its variant. In an illustrative embodiment, the eukaryotic cells are cultivated in the presence of methanol to induce AOX1 promoter operably linked with the gene encoding the carboxylesterase or its variant. In another illustrative embodiment, the eukaryotic cells are cultivated in the presence of galactose to induce GAL1 promoter. In another illustrative embodiment, the eukaryotic cells are cultivated in the presence of Cu²⁺ to induce CUP promoter. In another illustrative embodiment, the eukaryotic cells are cultivated in the presence of one or more alcohols to induce alc A promoter. The amount of an inducer in the cell culture may be adjusted by people skilled in the art to allow a relatively higher level expression of the carboxylesterase or its variant.
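The promoter and inducer pairings named in paragraph [0088] can be kept as a small lookup table, for example when recording cultivation protocols. The table contents come from the paragraph above; the dictionary name, the function name, and the case-insensitive matching are assumptions made for illustration.

```python
# Promoter -> inducer pairs named in the text above.
INDUCERS = {
    "AOX1": "methanol",
    "GAL1": "galactose",
    "CUP": "Cu2+",
    "alcA": "alcohol",
}


def inducer_for(promoter: str) -> str:
    """Return the inducer listed for an inducible promoter (spaces and case ignored)."""
    key = promoter.replace(" ", "").lower()
    for name, inducer in INDUCERS.items():
        if name.lower() == key:
            return inducer
    raise KeyError(f"no inducer listed for promoter {promoter!r}")


print(inducer_for("AOX1"))
print(inducer_for("alc A"))
```

The lookup raises an error for promoters that are not listed as inducible, such as constitutive promoters, rather than guessing an inducer.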

[0089] The expression level of a carboxylesterase or its variant may be determined using well-established techniques known in the art. In certain embodiments, the expression level of a carboxylesterase or its variant is measured by quantifying the amount of mRNA transcribed or the amount of protein translated.

[0090] In an illustrative embodiment, mRNA levels can be determined by Northern blot analysis (Alwine et al., Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes, Proc. Natl. Acad. Sci. USA 1977, vol 74, 5350-5354; Bird, Size separation and quantification of mRNA by northern analysis, Methods Mol. Biol. 1998, vol 105, 325-36). Briefly, poly(A) RNA is isolated from cells, separated by gel electrophoresis, blotted onto a support surface (e.g., nitrocellulose or Immobilon-Ny transfer membrane (Millipore Corp., Bedford, Mass.)), and incubated with a labeled (e.g., fluorescently labeled or radiolabeled) oligonucleotide probe that is capable of hybridizing with the mRNA of interest. In another illustrative embodiment, mRNA levels can be determined by quantitative RT-PCR (for review, see Freeman et al., Quantitative RT-PCR: pitfalls and potential, Biotechniques 1999, vol 26, 112-122) or semi-quantitative RT-PCR analysis (Ren et al., Lipopolysaccharide-induced expression of IP-10 mRNA in rat brain and in cultured rat astrocytes and microglia, Mol. Brain. Res. 1998, vol 59, 256-263). In accordance with this technique, poly(A) RNA is isolated from cells, used for cDNA synthesis, and the resultant cDNA is incubated with PCR primers that are capable of hybridizing with the template and amplifying the template sequence to produce levels of the PCR products that are proportional to the cellular levels of the mRNA of interest.

[0091] In an illustrative example, the expressed carboxylesterase or its variant is detected by electrophoresis such as sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE); the density of a carboxylesterase band on the SDS-PAGE gel may be scanned to quantify the protein using a commercial scanner (for example, GS-800 densitometer of Bio-Rad). In another illustrative example, the expressed carboxylesterase or its variant is detected by Western blot analysis using antibodies specifically recognizing the carboxylesterase or its variant. In another illustrative example, the expressed carboxylesterase or its variant is detected by measuring its enzymatic activity using a substrate.

[0092] Enzymatic activity of the expressed carboxylesterase may be determined using methods known in the art. The enzymatic activity may be characterized by measuring the disappearance of a substrate or the formation of a product. The measurement may be spectroscopic, radiometric, colorimetric or based on high performance liquid chromatography. Any substrate suitable for a carboxylesterase reaction may be used. Illustrative examples of substrates of carboxylesterases are naphthyl acetate (NA), p-nitrophenyl acetate (p-NPA), methylthiobutyrate (MtB), or ¹⁴C-labelled esters. In an illustrative embodiment, the enzymatic activity of carboxylesterases may be quantified by spectroscopic measurement of a complex formed between the chromogenic reagent Fast Blue B salt and α-naphthol, which is the product of hydrolysis by a carboxylesterase of the substrate α-naphthyl acetate.
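For a spectroscopic measurement, one common way to convert an absorbance change into a rate of product formation is the Beer-Lambert law. The sketch below is an assumption about how such a conversion could be carried out and is not a procedure stated in this disclosure; the function name is hypothetical, and the extinction coefficient used in the example is a placeholder that would have to be determined for the actual chromophore under the actual assay conditions.

```python
def activity_umol_per_min(
    sample_delta_abs_per_min: float,
    control_delta_abs_per_min: float,
    extinction_coeff_per_molar_cm: float,
    path_length_cm: float,
    reaction_volume_ml: float,
) -> float:
    """Net product formation rate in micromoles per minute (Beer-Lambert law)."""
    if extinction_coeff_per_molar_cm <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    # Subtract the no-enzyme control so only enzyme-dependent change is counted.
    net_rate = sample_delta_abs_per_min - control_delta_abs_per_min
    molar_per_min = net_rate / (extinction_coeff_per_molar_cm * path_length_cm)
    # mol/L/min * L -> mol/min, then * 1e6 -> micromoles/min
    return molar_per_min * (reaction_volume_ml / 1000.0) * 1e6


# Placeholder inputs (assumed, not measured): 10000 per molar per cm is illustrative only.
rate = activity_umol_per_min(0.21, 0.01, 10000.0, 1.0, 1.0)
print(round(rate, 6))
```

Subtracting the control rate mirrors the comparison against a reaction run in the absence of carboxylesterase.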

[0093] In another aspect, the present disclosure provides a method of producing a carboxylesterase or its variant, wherein the carboxylesterase or its variant is produced in an amount of at least about 12 mg/L and up to 100 mg/L. In certain embodiments, the yield of carboxylesterase or its variant is at least 12 mg/L, or at least 15 mg/L, or at least 17 mg/L, or at least 19 mg/L. In certain embodiments, the yield of carboxylesterase or its variant is 1 mg/L to 100 mg/L, 10 mg/L to 100 mg/L, 15 mg/L to 100 mg/L, 20 mg/L to 100 mg/L, 30 mg/L to 100 mg/L, 10 mg/L to 50 mg/L, 15 mg/L to 50 mg/L, 20 mg/L to 50 mg/L, or 30 mg/L to 50 mg/L. In certain embodiments, the carboxylesterase or its variant is produced in an amount of about 1 mg/L, about 5 mg/L, about 10 mg/L, about 12 mg/L, about 15 mg/L, about 20 mg/L, about 30 mg/L, about 40 mg/L, about 50 mg/L, or about 100 mg/L.

Isolation of the Expressed Carboxylesterases

[0094] The expressed carboxylesterase or its variant may be isolated from the cell culture using standard methods known in the art including, without limitation, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation (see I. M. Rosenberg (Ed.), Protein Analysis and Purification: Benchtop Techniques, 1996, Birkhauser, Boston, Cambridge, Mass.; Janson et al, Protein Purification, 1989, VCH Publishers, New York).

[0095] The expressed carboxylesterase may be further purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing, SDS-PAGE), and differential solubility (e.g., ammonium sulfate precipitation).

[0096] In certain embodiments, antibody-based methods can be used to isolate and purify expressed carboxylesterases or variants thereof. Antibodies that can bind to the carboxylesterases or variants thereof can be produced and isolated using methods known and practiced in the art. Carboxylesterases or variants thereof can be purified from a cell lysate or from the supernatant of the culture medium by antibody-based methods such as chromatography on antibody-conjugated solid-phase matrices or immunoprecipitation (see Harlow et al, Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1999, Cold Spring Harbor, N.Y.).

[0097] In another aspect, the present disclosure provides isolated carboxylesterases or variants thereof. "Isolated carboxylesterase" as used herein refers to a carboxylesterase that is substantially free of the other cellular components with which it is associated during the production methods described herein. "Substantially free" includes a preparation of carboxylesterase having less than about 50%, 40%, 30%, 20%, 10%, 5% or 1% (by dry weight) of the other cellular components or other contaminating materials that are not the carboxylesterase or its variant of interest. In certain embodiments, the isolated carboxylesterase has less than 50%, 40%, 30%, 20%, 10%, 5%, or 1% (by dry weight) of contaminating materials that are not the carboxylesterase or its variant of interest.
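The dry-weight criterion in paragraph [0097] can be sketched as a simple check. The function names, the default threshold, and the example weights are hypothetical; the sketch assumes the dry weight of the carboxylesterase and the total dry weight of the preparation have been determined separately.

```python
def contaminant_percent(total_dry_weight_mg: float, carboxylesterase_dry_weight_mg: float) -> float:
    """Percentage (by dry weight) of material that is not the carboxylesterase."""
    if total_dry_weight_mg <= 0:
        raise ValueError("total dry weight must be positive")
    if not 0 <= carboxylesterase_dry_weight_mg <= total_dry_weight_mg:
        raise ValueError("carboxylesterase weight must lie between 0 and the total weight")
    return 100.0 * (total_dry_weight_mg - carboxylesterase_dry_weight_mg) / total_dry_weight_mg


def is_substantially_free(
    total_dry_weight_mg: float,
    carboxylesterase_dry_weight_mg: float,
    threshold_percent: float = 50.0,
) -> bool:
    """True if the contaminant fraction is below the chosen threshold."""
    return contaminant_percent(total_dry_weight_mg, carboxylesterase_dry_weight_mg) < threshold_percent


# Assumed example: 10.0 mg preparation containing 9.6 mg carboxylesterase.
print(is_substantially_free(10.0, 9.6, threshold_percent=5.0))
print(is_substantially_free(10.0, 9.6, threshold_percent=1.0))
```

The threshold is a parameter so that any of the listed levels (50%, 40%, 30%, 20%, 10%, 5% or 1%) can be applied to the same measurement.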

Compositions Comprising Expressed Carboxylesterases and Uses Thereof

[0098] In another aspect, the present disclosure provides a composition comprising a eukaryotic cell and a carboxylesterase or its variant expressed by the eukaryotic cell. In certain embodiments, the present disclosure provides a composition comprising a eukaryotic cell and a microbial carboxylesterase or its variant expressed by the eukaryotic cell. In certain embodiments, the present disclosure provides a composition comprising a filamentous fungal cell and a carboxylesterase or its variant expressed by the filamentous fungal cell. In an illustrative embodiment, the composition is directly obtained from a cell culture containing eukaryotic cells engineered to express a gene encoding for a carboxylesterase or its variant. The cell culture may be filtered or centrifuged or otherwise treated to remove the culture medium, cell debris and/or other unwanted substances. The cell culture may undergo further purification processes as deemed suitable by a person skilled in the art to increase the concentration of the carboxylesterases therein. The composition may be prepared in liquid form or dried solid form.

[0099] In another aspect, the present disclosure provides a composition comprising isolated carboxylesterases or variants thereof produced by a method described herein. In certain embodiments, the composition comprises isolated carboxylesterases or variants thereof and has no more than 50%, 40%, 30%, 20%, 10%, 5% or 1% (by dry weight) of contaminating materials that are not the carboxylesterase or its variant of interest.

[0100] The compositions of the present disclosure may be used in various biological, agricultural and pharmaceutical applications. The composition of the present disclosure may be used to convert a compound with a carboxyl ester group to a compound without a carboxyl ester group, comprising incubating the compound with a carboxyl ester group with a carboxylesterase or its variant. In an illustrative embodiment, the composition is used to convert a prodrug with a carboxyl ester group to a drug without a carboxyl ester group. In another illustrative embodiment, the composition is used to convert a pesticide with a carboxyl ester group to a detoxified pesticide without a carboxyl ester group.

[0101] In another aspect, the present disclosure provides a method of detoxifying pesticides comprising incubating pesticides with the composition provided herein. Pesticides, as used herein, refer to chemical agents used to kill, repel or act against pests such as insects, plant pathogens, and weeds. Some pesticides contain one or more carboxyl ester groups in the chemical structures. Hydrolysis of the carboxyl ester groups by a carboxylesterase can convert the toxic pesticides to non-toxic substances. This is useful for cleaning up unwanted pesticide residues on agricultural products such as vegetables and fruits or in the environment such as water and soil. It is also useful for reducing or eliminating the toxic effects of pesticides in the event of poisoning. Illustrative examples of pesticides that may be detoxified by carboxylesterases are organophosphate pesticides, carbamate pesticides and pyrethroid pesticides.

[0102] In certain embodiments, the composition may be provided in the form of cell extracts from, or culture supernatants of, host cells that are engineered to express the carboxylesterase or its variant, or may be provided in the form of isolated carboxylesterase or its variant. The amount of the composition to be used may be determined as needed by people practicing the method. In certain embodiments, the amount of the composition to be used may depend on the amount and type of pesticide contained in the sample to be detoxified, and the enzymatic activity of the composition to hydrolyze the specific pesticide. In an illustrative embodiment, 0.1 nmol carboxylesterase is incubated with a sample containing 1 nmol of a pesticide containing a carboxyl ester group; the carboxylesterase degraded about 100% of the pesticide after incubation for 4 hours. In another illustrative embodiment, 0.1 nmol carboxylesterase is incubated with a sample containing 2 nmol of a pesticide containing a carboxyl ester group; the carboxylesterase degraded about 85% of the pesticide after incubation for 6 hours.
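The two illustrative embodiments above imply how much substrate each unit of enzyme turns over. The following is a minimal sketch of that arithmetic only; the function name is hypothetical and the inputs are the figures quoted in the paragraph.

```python
def substrate_converted_per_enzyme(enzyme_nmol, pesticide_nmol, fraction_degraded):
    """Return nmol of pesticide hydrolyzed per nmol of carboxylesterase."""
    return pesticide_nmol * fraction_degraded / enzyme_nmol

# First embodiment: 0.1 nmol enzyme, 1 nmol pesticide, about 100% degraded in 4 hours
first = substrate_converted_per_enzyme(0.1, 1.0, 1.00)

# Second embodiment: 0.1 nmol enzyme, 2 nmol pesticide, about 85% degraded in 6 hours
second = substrate_converted_per_enzyme(0.1, 2.0, 0.85)

print(f"first embodiment:  about {first:.0f} nmol substrate per nmol enzyme")
print(f"second embodiment: about {second:.0f} nmol substrate per nmol enzyme")
```

Because the enzyme is present at a tenth or less of the substrate amount in both cases, each enzyme molecule must act on several substrate molecules, which is consistent with catalytic rather than stoichiometric use of the composition.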

[0103] The incubation time may be determined as needed by people practicing the method. In certain embodiments, the incubation time may range from about 1 hour to about 3 weeks, from about 3 hours to about 7 days, or from about 4 hours to about 3 days, for example, about 1 hour, about 1.5 hours, about 3 hours, about 6 hours, about 9 hours, about 1 day, about 3 days, about 7 days, about 9 days, about 12 days, about 15 days, or about 3 weeks. Other conditions for incubation, for example, temperature, pH, and presence of co-factors, may be selected by people skilled in the art to detoxify a higher percentage of the pesticide.

[0104] In certain embodiments, the amounts of pesticides in a sample under treatment may be reduced by at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90%. In certain embodiments, the amounts of pesticides in a sample under treatment may be reduced by 30% to 100%, 40% to 90%, 50% to 80%, or 60% to 70%. In certain embodiments, the amounts of pesticides in a sample under treatment may be reduced by 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, and 100%.

[0105] In another aspect, the present disclosure provides a method of converting a prodrug into a drug comprising incubating the prodrug with the composition provided herein. Some prodrugs may contain carboxyl ester groups that make the prodrugs pharmaceutically inactive. A carboxylesterase described herein can hydrolyze the carboxyl ester groups and convert the inactive prodrug into the active drug. In an illustrative example, irinotecan, an anti-cancer prodrug, is converted by carboxylesterase into the active drug compound 7-ethyl-10-hydroxycamptothecin, a topoisomerase I inhibitor (Yoon et al, Activation of a camptothecin prodrug by specific carboxylesterases as predicted by quantitative structure-activity relationship and molecular docking studies, Mol Cancer Ther 2003, vol 2, 1171). In another illustrative example, the prodrug oseltamivir is converted into the active drug oseltamivir carboxylate by carboxylesterase (Shi et al, Anti-influenza viral prodrug oseltamivir is activated by carboxylesterase HCE1 and the activation is inhibited by anti-platelet agent clopidogrel, J. Pharmacol. Exp. Ther. 2006, vol 319, 1477-1484).

EXAMPLES

[0106] The following Examples are set forth to aid in the understanding of the present disclosure, and should not be construed to limit in any way the scope of the invention as defined in the claims which follow thereafter.

[0107] Materials and Culture Media

[0108] 1% agarose gel: 400 mg agarose, 39.2 ml H.sub.2O, 0.8 ml 50.times.TAE, and 1 .mu.g ethidium bromide.

[0109] LB medium (1 L): 10 g tryptone, 5 g yeast extract, 10 g NaCl, pH 7.4.

[0110] LB plate: LB medium containing 1.5% agar.

[0111] RDB plate: 1 M sorbitol, 2% glucose, 1.34% Yeast Nitrogen Base (YNB), 2% agar, 4.times.10.sup.-5% biotin, and 0.005% amino acid.

[0112] YPD medium: 1% yeast extract, 2% tryptone and 2% glucose.

[0113] YPD plate: 1% yeast extract, 2% tryptone, 2% glucose, and 2% powdered agar.

[0114] BMMY medium: 1% yeast extract, 2% tryptone, 100 mM potassium phosphate buffer (pH 6.0), 1.34% YNB, 4.times.10.sup.-5% biotin and 0.5% methanol.

[0115] MN broth (400 ml): 16 ml MN salt solution (37.5 g sodium nitrate, 3.25 g potassium chloride, 9.5 g monopotassium phosphate in 250 ml H.sub.2O), 0.4 ml trace elements (containing 2.2 g zinc sulfate heptahydrate, 1.1 g boric acid, 0.5 g manganese chloride tetrahydrate, 0.5 g ferrous sulfate heptahydrate, 0.17 g cobalt chloride hexahydrate, 0.16 g copper sulfate pentahydrate, 0.15 g sodium molybdate, 5 g disodium ethylenediaminetetraacetate per 100 ml, pH 6.5), 6 g glucose, 0.4 g casein acid hydrolysate, 8 ml 50.times.MgSO.sub.4 solution (6.5 g MgSO.sub.4.7H.sub.2O in 250 ml H.sub.2O), pH 6.5.

[0116] MN+URI broth (400 ml): 0.4 g uridine, 400 ml MN broth.

[0117] MN+SORB agar medium (1 L): 40 ml MN salt solution, 1 ml trace elements, 10 g glucose, 218.64 g sorbitol, 15 g agar, 20 ml 50.times.MgSO.sub.4 solution, pH 6.5.

[0118] STC buffer (300 ml): 65.6 g sorbitol, 0.36 g Tris base, 2.2 g CaCl.sub.2.2H.sub.2O, pH 7.5.

[0119] PEG solution (100 ml): 60 g PEG6000, 0.12 g Tris, 0.74 g CaCl.sub.2, pH 7.5.

[0120] NM buffer (500 ml): 29.25 g NaCl, 2.132 g MES, pH 5.8.

[0121] MM buffer (200 ml): 59.15 g MgSO.sub.4.7H.sub.2O, 0.8 g MES, pH 5.8.

[0122] 6.times.SDS-PAGE loading buffer: 300 mM Tris-HCl (pH 6.8), 12% (w/v) SDS, 0.6% (w/v) Bromophenol Blue, 60% (v/v) glycerol, 6% (w/v) .beta.-Mercaptoethanol.

[0123] PBST buffer: 0.01 M phosphate-buffered saline, pH 7.2, supplemented with 0.1% (v/v) Tween 20.

[0124] MGY medium: 1.34% YNB, 1% glycerol, 4.times.10.sup.-5% biotin.

[0125] Enzyme dilution buffer (100 ml): 3.5 g NaCl, 0.11 g CaCl.sub.2, 1 g glucose, pH 5.8.

[0126] Lywallzyme: 0.2 g Lywallzyme dissolved in enzyme dilution buffer.

[0127] BSA: 12 mg/ml BSA dissolved in enzyme dilution buffer.

[0128] 12% polyacrylamide resolving gel (10 ml): 4 ml H.sub.2O, 3.3 ml 30% polyacrylamide, 2.5 ml 1.5M Tris-HCl (pH 8.8), 0.1 ml 10% SDS, 0.1 ml 10% AP, 4 .mu.l TEMED.

[0129] 5% polyacrylamide stacking gel (5 ml): 3.44 ml H.sub.2O, 0.83 ml 30% polyacrylamide, 0.63 ml 0.5M Tris-HCl (pH 6.8), 0.05 ml 10% SDS, 0.05 ml 10% AP, 4 .mu.l TEMED.

[0130] All culture media and solutions are sterilized.
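Several of the recipes above are given as weighed masses per final volume. As a rough cross-check, the sketch below converts a few of them to molar or percent concentrations. It assumes each recipe is made up to the stated final volume; the molar masses are standard reference values and the helper names are hypothetical, none of which appear in the text.

```python
# Molar masses in g/mol. Standard reference values, not figures from the text.
MOLAR_MASS = {"sorbitol": 182.17, "NaCl": 58.44, "MgSO4.7H2O": 246.47}

def molarity(grams, final_volume_ml, compound):
    """Mol/L of a compound weighed out and made up to the given final volume."""
    return grams / MOLAR_MASS[compound] / (final_volume_ml / 1000.0)

def percent_w_v(grams, final_volume_ml):
    """Weight/volume percent, i.e. grams of solute per 100 ml."""
    return 100.0 * grams / final_volume_ml

checks = {
    "sorbitol in MN+SORB agar (1 L)": molarity(218.64, 1000, "sorbitol"),
    "sorbitol in STC buffer (300 ml)": molarity(65.6, 300, "sorbitol"),
    "NaCl in NM buffer (500 ml)": molarity(29.25, 500, "NaCl"),
    "MgSO4.7H2O in MM buffer (200 ml)": molarity(59.15, 200, "MgSO4.7H2O"),
}
for name, value in checks.items():
    print(f"{name}: {value:.2f} M")

# 400 mg agarose in 39.2 ml water plus 0.8 ml 50x TAE, taken as 40 ml total
print(f"agarose gel: {percent_w_v(0.4, 40):.1f}% (w/v)")
```

Under these assumptions the sorbitol and magnesium sulfate solutions come out near 1.2 M and the NaCl solution near 1 M, which are typical osmotic stabilizer strengths for protoplast work.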

Example 1

Expression of Recombinant Carboxylesterase in Aspergillus niger M54 Strain

[0131] The carboxylesterase gene is amplified by PCR from Geobacillus stearothermophilus CICC 20156 strain (purchased from China Center of Industrial Culture Collection, Beijing, China), using the genomic DNA of the CICC 20156 strain as template and the following primers: forward primer P1 (SEQ ID NO: 33), and reverse primer P2 (SEQ ID NO: 34). The primers contain sequences from the carboxylesterase gene set forth in SEQ ID NO: 9. The forward primer contains an Xba I site and a KEX2 site, while the reverse primer includes an Hpa I site as well as nucleotides encoding a hexahistidine tag (His-tag; the coding sequence of the His-tag is marked with a wave underline), which consists of six histidine residues and can be used for affinity purification and antibody detection. The reverse primer contains the His-tag coding sequence linked to a portion of the nucleotide sequence of the 3' end of the carboxylesterase gene. The PCR product generated using primers P1 and P2 is called the CarE-his gene (SEQ ID NO: 35) and encodes a fusion protein of carboxylesterase with a His-tag fused at the C terminus.

TABLE 4 Nucleotide sequences of primers P1 and P2, and the CarE-his gene.

Name	Sequence
P1	5'CGTCTAGAAAGAGAATGATGAAAATTGTTCCGCCG 3' (SEQ ID NO: 33)
P2	5'CGGTTAACTTA CCAATCTAA CGATTCAA3' (SEQ ID NO: 34)
CarE-his	5'ATGATGAAAATTGTTCCGCCGAAGCGTTTTTCTTTGAA GCCGGGGAGCGGGCGGTGCTGCTTTTGCATGGGTTTACCG GCAATTCCGCCGACGTTCGGATGCTTGGGCGATTCTTGGA ATCGAAAGGGTATACGTGCCACGCTCCGATTTACAAAGGG CATGGCGTGCCGCCGGAAGAGCTCGTCCACACCGGACCGG ATGATTGGTGGCAAGACGTCATGAACGGCTATCAGTTTTT GAAAAACAAAGGCTACGAAAAAATTGCCGTGGCTGGATTG TCGCTTGGAGGCGTATTTTCTCTCAAATTAGGCTACACTG TACCTACACAAGGCATTGTGACGATGTGCGCGCCGATGTA CATCAAAAGCGAAGAAACGATGTACGAAGGTGTGCTCGAG TATGCGCGCGAGTATAAAAAGCGGGAAGGGAAATCAGAGG AACAAATCGAACAGGAAATGGAACGGTTCAAACAAACGCC GATGAAGACGTTGAAAGCCTTGCAAGAACTCATTGCCGAT GTGCGCGCCCACCTTGATTTGGTTTATGCACCGACGTTCG TCGTCCAAGCGCGCCATGATGAGATGATCAATCCAGACAG CGCGAACATCATTTATAACGAAATTGATCGCCGGTCAAAC AAATCAAATGGTATGAGCAATCAGGCCATGTGATTACGCT TGATCAAGAAAAAGATCAGCTGCATGAAGATATTTATGCA TTTCTTGAATCGTTAGATTGG T GA 3' (SEQ ID NO: 35)
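The primer features described for P1 and P2 can be located by a simple string search. The sketch below is illustrative only: the recognition sequences TCTAGA (Xba I) and GTTAAC (Hpa I) are standard values rather than text from this disclosure, only the 5' portion of P2 printed before the His-tag region is used, and the helper names and the three-entry codon table are hypothetical.

```python
# Standard recognition sequences (assumed from general knowledge, not from the text)
SITES = {"XbaI": "TCTAGA", "HpaI": "GTTAAC"}

P1 = "CGTCTAGAAAGAGAATGATGAAAATTGTTCCGCCG"
P2_5PRIME = "CGGTTAACTTA"  # 5' portion of P2 only, up to the stop codon complement

# Minimal codon table covering only the codons inspected below
CODONS = {"AAG": "K", "AGA": "R", "ATG": "M"}

def find_site(seq, enzyme):
    """Return the 0-based index of the enzyme's site in seq, or -1 if absent."""
    return seq.find(SITES[enzyme])

def translate(seq):
    """Translate a short in-frame stretch using the minimal table above."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))

xba_start = find_site(P1, "XbaI")
after_site = P1[xba_start + 6:]          # everything 3' of the Xba I site
kex2_residues = translate(after_site[:6])  # first two codons after the site
first_gene_codon = after_site[6:9]

print("Xba I site starts at index", xba_start)
print("residues after the site:", kex2_residues)
print("next codon:", first_gene_codon)
print("Hpa I site in P2 starts at index", find_site(P2_5PRIME, "HpaI"))
```

The two codons following the Xba I site encode Lys-Arg, the dibasic motif recognized by KEX2-type proteases, and are followed directly by the ATG of the carboxylesterase coding sequence.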

[0132] The PCR is performed in a 50 .mu.l reaction system containing the following composition: 5 .mu.l 10.times.Pfu Buffer (Tiangen biotech Co. Ltd, Beijing, China), 4 .mu.l dNTP mix (Tiangen biotech Co. Ltd, Beijing, China), 1 .mu.l forward primer P1, 1 .mu.l reverse primer P2, 1 .mu.l template DNA, 1 .mu.l Pfu DNA polymerase (Tiangen biotech Co. Ltd, Beijing, China), and 37 .mu.l double distilled H.sub.2O. The following cycles are used in PCR: 95.degree. C. for 5 minutes, followed by 30 cycles of 95.degree. C. for 45 seconds, 60.degree. C. for 45 seconds, and 72.degree. C. for 90 seconds; and then 72.degree. C. for 5 minutes. The PCR product is then subjected to electrophoresis in 1% agarose gel and the 780 bp band is cut and purified by gel extraction using a gel extraction kit (Tiangen biotech Co. Ltd, Beijing, China) according to the manufacturer's instructions.
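As a quick consistency check on the reaction set-up above, the component volumes can be summed and the programmed hold time of the cycling protocol added up. This is a sketch of the arithmetic only; the variable and function names are hypothetical, and ramping time between temperatures is not included.

```python
# Component volumes in microliters, as listed for the 50 ul reaction
reaction_ul = {
    "10x Pfu buffer": 5,
    "dNTP mix": 4,
    "forward primer P1": 1,
    "reverse primer P2": 1,
    "template DNA": 1,
    "Pfu DNA polymerase": 1,
    "double distilled H2O": 37,
}
total_ul = sum(reaction_ul.values())

def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Total programmed hold time: initial step + repeated cycle steps + final step."""
    return initial_s + cycles * sum(step_seconds) + final_s

# 95 C 5 min; 30 x (95 C 45 s, 60 C 45 s, 72 C 90 s); 72 C 5 min
hold_s = program_seconds(5 * 60, 30, (45, 45, 90), 5 * 60)

print(f"total reaction volume: {total_ul} ul")
print(f"programmed hold time: {hold_s} s ({hold_s / 60:.0f} min)")
```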

[0133] The purified PCR product is inserted into the pYG1.2 vector, which is constructed using methods previously published (Liu, Zhongbin, et al, Construction of recombinant expression plasmid for Aspergillus niger, Journal of Tongji University (medical science), 2001, 22, vol 3, 1-3). The pYG1.2 vector contains a pyrG gene from Aspergillus niger ATCC 12049 strain, a gla A coding sequence and regulatory sequences. The steps for making the pYG1.2 vector are described briefly below.

[0134] The commercially available plasmid pUC18 (Fermentas Inc., Burlington, Canada) is used to construct the pIGF vector. The pUC18 vector contains a beta-lactamase gene that confers resistance to Ampicillin. A 4.8 kb fragment, containing the gla A encoding sequence (encoding the first 498 amino acids of glucoamylase), a 2.0 kb upstream regulatory sequence (the promoter of gla A) and a 2.3 kb downstream regulatory sequence (terminator sequence of gla A), is inserted into the pUC18 vector to make the pIGF vector. The gla A coding sequence is inserted to help increase the expression level of the target protein. FIG. 1 shows a schematic diagram of the structure of the pIGF vector. The inserted gla A coding sequence includes an Xba I/Hpa I cloning site (shown in FIG. 1) that can be used to create fusion proteins with gla A.

[0135] The pyrG gene of Aspergillus niger ATCC 12049 strain is then inserted into the pIGF vector to make the pYG1.2 vector. A 600 bp fragment containing the conserved sequences of the pyrG gene is obtained by PCR using the pAB4.1 plasmid (provided by Institute of Food Research, Norwich, UK, which contains the pyrG gene from Aspergillus niger ATCC 9029 strain) as the template. The fragment is labeled with .sup.32P and is used as a probe in plaque hybridization (Sambrook, J, et al, A laboratory manual, New York: Cold Spring Harbor Laboratory Press, 1989). A 9.8 kb nucleotide fragment containing the pyrG gene is isolated from the gene library of Aspergillus niger ATCC 12049 strain, and is further digested with Xho I to obtain a 2.3 kb fragment, which is confirmed to contain the pyrG gene of ATCC 12049 strain by restriction enzyme analysis and PCR.

[0136] The obtained 2.3 kb fragment is ligated with the linearized fragment of pIGF vector digested with XhoI. The ligation product is transformed into Escherichia coli (E. coli) DH5.alpha. and clones grown on LB plates containing Ampicillin are picked. PCR is performed to identify positive clones containing the inserted pyr G gene. The identified positive clones are propagated to extract plasmids, and the recombinant plasmids are sequenced to confirm successful insertion of the pyrG gene. One of the confirmed plasmids is named as pYG1.2 and used as a vector in the following molecular cloning studies. A schematic map of pYG1.2 is shown in FIG. 2. An Escherichia coli DH5.alpha. strain containing the pYG1.2 plasmid (Escherichia coli DH5.alpha./pYG1.2) is deposited with CCTCC, Wuhan University, Wuhan, China, on Jul. 27, 2009. The deposit No. is CCTCC M 209165. The deposit will be maintained under the terms and conditions of the Budapest Treaty. The plasmid pYG1.2 may be recovered from the Escherichia coli DH5.alpha./pYG1.2 strain by conventional plasmid extraction methods.

[0137] To insert the carboxylesterase gene into the pYG1.2 vector, the purified CarE-his gene product and the pYG1.2 plasmid are separately digested using both XbaI and HpaI (New England Biolabs, Inc., Ipswich, Mass.), followed by electrophoresis in 1% agarose gel and purification by gel extraction. The digested CarE-his gene product and the digested pYG1.2 fragment are ligated using T4 ligase (Takara Bio. Inc., Japan).

[0138] E. coli DH5.alpha. cells are transformed by mixing competent E. coli DH5.alpha. cells with 20 .mu.l ligation product, incubating on ice for 30 minutes, heat shocking at 42.degree. C. for 90 seconds, and incubating on ice for 5 minutes, followed by addition of 200 .mu.l LB medium and incubation in a shaker at 37.degree. C. for 45 minutes. 50 .mu.l of the resultant bacterial culture is inoculated onto LB plates containing Ampicillin and incubated overnight at 37.degree. C.

[0139] Individual colonies of E. coli grown on the LB plates containing Ampicillin are screened by PCR using the individual E. coli colonies as templates under the same PCR reaction conditions as described above. The PCR products are characterized by electrophoresis in 1% agarose gel, and positive colonies showing a band at 780 bp are propagated overnight in LB medium with Ampicillin, followed by plasmid purification using a plasmid extraction kit (Tiangen biotech Co. Ltd, Beijing, China) according to the manufacturer's instructions.

[0140] The obtained recombinant plasmids are characterized by digestion using XbaI and HpaI followed by electrophoresis. The positive plasmids are sequenced and the results confirm successful insertion of the carboxylesterase gene into the pYG1.2 plasmid. The recombinant plasmid is named as pYG1.2-CarE-his and used in the subsequent studies.

[0141] Aspergillus niger M54 strain is used to express carboxylesterase. Aspergillus niger M54 is a pyr G deficient strain that cannot grow on uridine-free medium. Aspergillus niger M54 is obtained by exposing Aspergillus niger ATCC 12049 strain to UV irradiation and screening for strains deficient in the pyr G gene as previously described (Liu, Zhongbin et al, Construction of pyr G auxotrophic Aspergillus niger strain, Journal of microbiology, 2001, 21, vol 3, 15-16). The strain has been deposited with CCTCC, Wuhan University, Wuhan, China, on Jun. 14, 2009. The deposit No. is CCTCC M 209121. The deposit will be maintained under the terms and conditions of the Budapest Treaty.

[0142] The plasmid pYG1.2-CarE-his is used to transform the protoplasts of Aspergillus niger M54. Protoplasts are prepared as follows: a 1 L flask containing 200 ml MN+URI broth is inoculated with Aspergillus niger M54 suspension at the cell density of 4.times.10.sup.8 and incubated in a shaker at 200 rpm at 30.degree. C. for 24 hours; the cultured cells are collected and 1 g of the cells (wet weight) is suspended in 10 ml ice cold MM buffer; 1 ml Lywallzyme and 0.5 ml BSA are added to the cell suspension, followed by gentle shaking at 30.degree. C. at 80 rpm for 10 minutes and then at 50 rpm for 2 hours; the suspension is then centrifuged at 4.degree. C. at 1500 rpm for 2 minutes and the supernatant is transferred and mixed with NM buffer, followed by centrifugation at 4.degree. C. at 2500 rpm for 10 minutes; the precipitates are mixed with ice cold STC buffer to a total volume of 50 ml, followed by centrifugation at 4.degree. C. at 2000 rpm for 10 minutes to yield milky white protoplasts, which are then suspended in 1 ml ice cold STC buffer.

[0143] The protoplasts are transformed with pYG1.2-CarE-his plasmid, pYG1.2 plasmid (positive control) and blank buffer (negative control), respectively. 3.0 .mu.g plasmid DNA is mixed with 100 .mu.l protoplast suspension in a 15 ml tube and incubated at room temperature for 25 minutes. A total volume of 1250 .mu.l PEG solution is added drop by drop into the mixture of plasmid and protoplasts with gentle mixing, followed by incubation at room temperature for 20 minutes. Ice cold STC buffer is added to fill the tube, followed by gentle mixing until the PEG solution is completely diluted. After centrifugation at 4.degree. C. at 2000 rpm for 10 minutes, the cells are resuspended in 300 .mu.l STC buffer, from which 100 .mu.l is taken to be inoculated onto the uridine-free MN+SORB agar medium plates and incubated at 30.degree. C. for 3-5 days to select transformants that can grow on the selective medium. The non-transformed Aspergillus niger M54 does not grow on the uridine-free medium, while Aspergillus niger transformed with pYG1.2-CarE-his and with pYG1.2 both grow on the uridine-free medium, indicating that the plasmids are successfully expressed in Aspergillus niger M54.

[0144] The expression of carboxylesterase is characterized using SDS-PAGE. Aspergillus niger M54 transformed with pYG1.2-CarE-his is cultured in uridine-free MN broth at 30.degree. C. with shaking at 200 rpm for 5 days. Then the supernatants of the culture are taken and analyzed by electrophoresis. Supernatants of Aspergillus niger M54 transformed with pYG1.2 and supernatants of non-transformed Aspergillus niger M54 are analyzed in parallel as negative controls. 15 .mu.l supernatants are mixed with 3 .mu.l 6.times.SDS-PAGE loading buffer and boiled for 5 minutes before loading onto the SDS-PAGE gel, which is composed of 12% polyacrylamide resolving gel and 5% polyacrylamide stacking gel. The samples are subjected to constant voltage electrophoresis at 120V for 3-4 hours. Then the gel is stained with Coomassie Brilliant Blue for 30 minutes at room temperature and rinsed to show the protein bands. A distinct protein band at 29.0 KD is observed in the lane of supernatants of Aspergillus niger M54 transformed with pYG1.2-CarE-his, but it is absent in the negative controls (FIG. 3).
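The sample preparation above mixes 15 .mu.l supernatant with 3 .mu.l of a 6.times. loading buffer. The sketch below simply confirms the resulting working strength; the function name is hypothetical.

```python
def final_strength(stock_fold, stock_ul, sample_ul):
    """Fold-concentration of a stock buffer after mixing with a sample."""
    return stock_fold * stock_ul / (stock_ul + sample_ul)

# 3 ul of 6x loading buffer added to 15 ul culture supernatant
loading_strength = final_strength(6, 3, 15)
print(f"loading buffer strength in the boiled sample: {loading_strength:.1f}x")
```

The same relation gives the buffer volume for any other sample volume: one part 6.times. buffer to five parts sample.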

[0145] The expression of carboxylesterase is further confirmed by Western Blot. Supernatants of Aspergillus niger M54 transformed with pYG1.2-CarE-his are separated by SDS-PAGE electrophoresis and proteins on the gel are transferred to polyvinylidene difluoride (PVDF) membrane for 1 hour by applying an electric current at an intensity of 0.65 mA/cm.sup.2. The PVDF membrane is blocked with non-fat milk, followed by incubation with an appropriate dilution of anti-his antibody (an IgG type antibody) at 4.degree. C. overnight and incubation with a secondary antibody, an anti-IgG antibody, at room temperature for 90 minutes. After washing with PBST, the PVDF membrane is exposed to an X-ray film for an appropriate time period, followed by visualization and photography. Supernatants of non-transformed Aspergillus niger M54 and supernatants of Aspergillus niger M54 transformed with pYG1.2 vector are used as negative controls. His-tagged histones are used as a positive control. A distinct band at 29 KD is observed for the pYG1.2-CarE-his transformant while no band is detected in the negative controls (FIG. 4), confirming the expression of carboxylesterase in pYG1.2-CarE-his transformants.

Example 2

Expression of Carboxylesterase in Pichia pastoris GS115

[0146] The DNA sequence of the carboxylesterase contains an internal XhoI site CTCGAG, which is changed by a site specific silent mutation to CTCGAA (marked with double underline) using PCR. Two separate PCR reactions are performed using primers P3 and P4 shown in SEQ ID NOs: 36-37, and primers P5 and P6 shown in SEQ ID NOs:38-39, respectively, in which P3 contains an Xho I site, a KEX2 site and a KEX1 site, P4 and P5 contain the silent mutation, and P6 contains an EcoRI site and six histidine codons for a His-tag (coding sequence of His-tag is marked with a wave underline). The PCR conditions are the same as described in Example 1. The PCR products from the two separate reactions are purified by gel extraction after electrophoresis in 1% agarose gel. The PCR products from the two reactions have overlapping sequences (shown as underlined parts in P4 and P5 in Table 5) that would allow the two products to bind to each other at the 5' end of one product and the 3' end of the other product. Therefore, the two purified PCR products are mixed together for a third round of PCR to make a PCR product combining the two templates. Products of the third round of PCR are subjected to electrophoresis in 1% agarose gel and the band of about 800 bp is cut and purified by gel extraction.

TABLE 5 Nucleotide sequences of primers P3, P4, P5 and P6, and the mutated CarE-his gene.

Name	Sequence
P3	5'CGCTCGAGAAAAGAGAGGCTGAAGCTATGATGAAA ATTGTTCCG 3' (SEQ ID NO: 36)
P4	5'CGCGCGCATATTCGAGGACGCCTTCGTAC 3' (SEQ ID NO: 37)
P5	5' CGTCCTCGAATATGCGCGCGAGTATAAAA 3' (SEQ ID NO: 38)
P6	5'CGGAATTCTTA CCAATC TAACGATTCAAG 3' (SEQ ID NO: 39)
Mutated CarE-his	5'ATGATGAAAATTGTTCCGCCGAAGCCGTTTTTCTT TGAAGCCGGGGAGCGGGCGGTGCTGCTTTTGCATGGG TTTACCGGCAATTCCGCCGACGTTCGGATGCTTGGGC GATTCTTGGAATCGAAAGGGTATACGTGCCACGCTCC GATTTACAAAGGGCATGGCGTGCCGCCGGAAGAGCTC GTCCACACCGGACCGGATGATTGGTGGCAAGACGTCA TGAACGGCTATCAGTTTTTGAAAAACAAAGGCTACGA AAAAATTGCCGTGGCTGGATTGTCGCTTGGAGGCGTA TTTTCTCTCAAATTAGGCTACACTGTACCTACACAAG GCATTGTGACGATGTGCGCGCCGATGTACATCAAAAG CGAAGAAACGATGTACGAAGGCGTCCTCGAATATGCG CGCGAGTATAAAAAGCGGGAAGGGAAATCAGAGGAAC AAATCGAACAGGAAATGGAACGGTTCAAACAAACGCC GATGAAGACGTTGAAAGCCTTGCAAGAACTCATTGCC GATGTGCGCGCCCACCTTGATTTGGTTTATGCACCGA CGTTCGTCGTCCAAGCGCGCCATGATGAGATGATCAA TCCAGACAGCGCGAACATCATTTATAACGAAATTGAA TCGCCGGTCAAACAAATCAAATGGTATGAGCAATCAG GCCATGTGATTACGCTTGATCAAGAAAAAGATCAGCT GCATGAAGATATTTATGCATTTCTTGAATCGTTAGAT TGG TAA 3' (SEQ ID NO: 40)
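The overlap between the two first-round PCR products comes from primers P4 and P5, which anneal to opposite strands at the mutated site. The sketch below, using only the P4 and P5 sequences from Table 5, computes the shared stretch and confirms that the original Xho I site CTCGAG has been replaced by CTCGAA. The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def overlap_length(reverse_primer, forward_primer):
    """Longest stretch where the top-strand copy of the reverse primer
    ends with the start of the forward primer."""
    top = reverse_complement(reverse_primer)
    for n in range(min(len(top), len(forward_primer)), 0, -1):
        if top.endswith(forward_primer[:n]):
            return n
    return 0

P4 = "CGCGCGCATATTCGAGGACGCCTTCGTAC"   # reverse primer of the first reaction
P5 = "CGTCCTCGAATATGCGCGCGAGTATAAAA"   # forward primer of the second reaction

shared = overlap_length(P4, P5)
print("overlap between the two PCR products:", shared, "nt")
print("shared sequence:", P5[:shared])
print("mutated site CTCGAA present:", "CTCGAA" in P5[:shared])
print("original XhoI site CTCGAG present:", "CTCGAG" in P5[:shared])
```

GAG and GAA both encode glutamate in the standard genetic code, which is why the change removes the restriction site without altering the encoded protein.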

[0147] The purified final PCR product and the pBluescriptII-SKM plasmid (Stratagene, La Jolla, Calif.) are separately digested using both XhoI and EcoRI, followed by electrophoresis in 1% agarose gel and purification by gel extraction. pBluescriptII-SKM contains a beta-lactamase gene that confers resistance to Ampicillin. After ligation of the digested PCR product and the digested pBluescriptII-SKM fragment, E. coli DH5.alpha. cells are transformed with the ligation product following the same procedure described in Example 1. Individual colonies of E. coli grown on the LB plates containing Ampicillin are screened using PCR, and positive colonies identified by PCR are propagated, followed by plasmid purification. Target gene insertion is confirmed by digesting the purified plasmid with XhoI and EcoRI, followed by electrophoresis. The obtained recombinant plasmid is named pBluescriptII-SKM-CarE-his and used in the following molecular cloning experiments.

[0148] Plasmid pBluescriptII-SKM-CarE-his is digested with XhoI and EcoRI, and the insert is ligated to the fragment of pPIC9 plasmid (Invitrogen) digested with the same restriction enzymes. The ligation product is used to transform E. coli DH5.alpha. to obtain positive recombinant colonies identified by PCR and enzyme digestion with XhoI and EcoRI following the same procedures described in Example 1. The plasmid from the positive colonies is purified and named pPIC9-CarE-his.

[0149] Plasmid pPIC9-CarE-his is digested with BamHI and EcoRI, and the fragment containing the carboxylesterase gene is ligated to the fragment of pPIC9K plasmid (Invitrogen) digested with the same restriction enzymes. The pPIC9K plasmid carries a Kanamycin resistance gene, which confers on the host cell resistance to Kanamycin as well as to certain antibiotics that are structurally similar to Kanamycin. The ligation product is used to transform E. coli DH5.alpha. to obtain positive recombinant colonies identified by PCR and enzyme digestion with BamHI and EcoRI following the same procedures described in Example 1. Electrophoresis results show a 1100 bp target band corresponding to the digestion fragment containing both the carboxylesterase gene and the .alpha.-factor secretion signal sequence introduced from the pPIC9 plasmid. The recombinant plasmid is named pPIC9K-CarE-his and is used in subsequent studies.

[0150] pPIC9K-CarE-his plasmid is digested with BglII and then purified to obtain linearized plasmid DNA. 80 .mu.l suspension of Pichia pastoris GS115 strain is mixed with the linearized plasmid DNA and equilibrated in an ice cold cuvette for 5 minutes, followed by electroporation at 1500 V, 25 .mu.F and 200.OMEGA., and immediate addition of 1 ml ice cold 1M sorbitol. The mixture is placed on ice for 2-3 hours and then inoculated onto RDB plates. The plates are incubated at 30.degree. C. for 3-5 days.
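For an exponential-decay electroporation pulse, the capacitance and resistance settings quoted above define a nominal time constant, tau = R x C. The sketch below computes that nominal value only. It assumes an exponential-decay pulse generator, which the text does not specify, and the pulse length actually delivered may differ because the sample itself conducts; the function name is hypothetical.

```python
def nominal_time_constant_ms(resistance_ohm, capacitance_uF):
    """Nominal RC time constant in milliseconds (tau = R * C)."""
    return resistance_ohm * capacitance_uF * 1e-6 * 1e3

# Settings quoted for the Pichia pastoris GS115 transformation
tau_ms = nominal_time_constant_ms(200, 25)
print(f"nominal time constant: {tau_ms:.1f} ms")
```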

[0151] All colonies are washed from the RDB plates and diluted to about 10.sup.6 cells/ml with YPD medium. 100 .mu.l diluted Pichia pastoris cells are inoculated onto YPD plates supplemented with various concentrations (0.25 mg/ml, 1 mg/ml and 2 mg/ml) of G418 (purchased from Invitrogen), which is an aminoglycoside antibiotic similar in structure to Kanamycin, followed by incubation at 30.degree. C.

[0152] Individual colonies grown on the plates supplemented with the highest concentration of G418 are picked and inoculated separately into 3 ml MGY medium, followed by incubation at 30.degree. C. with shaking at 250 rpm until the OD600 reaches 2-6. After centrifugation at 1500 g at room temperature for 5 minutes, the precipitates are resuspended in 3 ml BMMY medium (supplemented with 5‰, i.e. 0.5%, methanol) and incubated at 30.degree. C. with shaking at 250 rpm to induce expression of the target gene. Methanol (5‰) is added to the culture every 24 hours, and a sample of the culture is taken at the same time. After 96 hours, the samples taken at the different time points are analyzed by SDS-PAGE.
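The induction step adds methanol at 5 per mille (0.5% v/v, matching the BMMY recipe) to a 3 ml culture every 24 hours. The sketch below works out the volume of each addition. It assumes pure methanol is added and ignores evaporation and the volume removed by sampling; the function name is hypothetical.

```python
def methanol_addition_ul(culture_ml, per_mille=5.0):
    """Microliters of pure methanol giving the stated per-mille (v/v) in the culture."""
    return culture_ml * 1000.0 * per_mille / 1000.0

per_feed_ul = methanol_addition_ul(3.0)
print(f"methanol per addition to a 3 ml culture: {per_feed_ul:.0f} ul")
```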

[0153] The supernatants of the samples are analyzed using SDS-PAGE electrophoresis and Western blot following the same procedures as described in Example 1. Supernatants of Pichia pastoris GS115 transformed with pPIC9K vector are analyzed in parallel as a negative control. Results obtained from electrophoresis and Western blot both show that a distinct protein band at 29.0 KD is present in the supernatants of Pichia pastoris GS115 transformed with pPIC9K-CarE-his but absent in the negative control (FIG. 4 and FIG. 5).

[0154] pPIC9K-CarE-his transformants expressing relatively higher level of carboxylesterase are picked, propagated and stored at -80.degree. C.

Example 3

Characterization of Carboxylesterase Activity

[0155] Carboxylesterase activity is measured by incubating .alpha.-naphthyl acetate with supernatants of transformed Aspergillus niger M54 and supernatants of transformed Pichia pastoris GS115, respectively, at 37.degree. C. and pH 7.0 for 10 minutes, followed by immediate termination of the reaction. Absorbance is measured at 600 nm. One enzyme activity unit is defined as the amount of enzyme needed to release 1 .mu.mol .alpha.-naphthol from 0.03 M .alpha.-naphthyl acetate solution per minute.
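The unit definition above translates directly into a calculation once the amount of .alpha.-naphthol released is known. The sketch below assumes that amount has already been obtained from the absorbance reading, for example through a standard curve, which the text does not describe. The function name and the numerical inputs are hypothetical and are not results from this disclosure.

```python
def activity_units_per_ml(naphthol_umol, reaction_minutes, supernatant_ml):
    """Enzyme units (umol alpha-naphthol released per minute) per ml of supernatant."""
    return naphthol_umol / reaction_minutes / supernatant_ml

# Hypothetical example values, chosen only to illustrate the arithmetic:
# 0.5 umol alpha-naphthol released in the 10-minute reaction by 0.1 ml supernatant
example_activity = activity_units_per_ml(0.5, 10, 0.1)
print(f"activity: {example_activity:.2f} U/ml")
```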

[0156] To determine the incubation period that gives the highest enzyme production, the culture supernatants containing carboxylesterase are sampled on the 1st day, 2nd day, 3rd day, 4th day, 5th day and 6th day of the incubation, and then measured for carboxylesterase activity. As shown in FIG. 6, recombinant carboxylesterase activity reaches its peak in transformed Aspergillus niger M54 after 5-day incubation and in transformed Pichia pastoris GS115 after 4-day incubation, and begins to decline thereafter. Carboxylesterase is produced in an amount of 15.3 mg/ml in the 5th day culture of transformed Aspergillus niger M54, and in an amount of 30.7 mg/ml in the 4th day culture of transformed Pichia pastoris GS115, as determined by a gel imaging analysis system. The results suggest that the proper incubation periods for carboxylesterase production in transformed Aspergillus niger M54 and transformed Pichia pastoris GS115 are 5 days and 4 days, respectively. The production yield of recombinant carboxylesterase in transformed Pichia pastoris GS115 is about twice that in transformed Aspergillus niger M54.

[0157] To test the pH dependence of the carboxylesterase activity, the culture supernatants containing carboxylesterases expressed from both transformed Aspergillus niger M54 and transformed Pichia pastoris GS115 are incubated with a substrate in buffers having pH values ranging from 5 to 11 at 37.degree. C. for 30 minutes, followed by enzymatic reaction and activity determination. The results show that the recombinant carboxylesterase exhibits the highest enzymatic activity at pH 8.0, which is defined as 100% relative enzymatic activity. The relative enzymatic activity remains above 75% within the pH range of 6.0 to 8.5 and decreases when the pH is below 6.0 or above 8.5 (FIG. 7).
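Relative enzymatic activity, as used above, is each measured activity expressed as a percentage of the highest one. The sketch below shows that normalization. The readings are placeholder numbers invented purely to exercise the function; they are not data from FIG. 7, and the function name is hypothetical.

```python
def relative_activity(activities):
    """Express each activity as a percentage of the maximum in the series."""
    peak = max(activities.values())
    return {condition: 100.0 * value / peak for condition, value in activities.items()}

# Placeholder readings (arbitrary units) keyed by pH, NOT measured values
placeholder_readings = {5.0: 2.0, 8.0: 4.0, 11.0: 1.0}

normalized = relative_activity(placeholder_readings)
for ph, percent in normalized.items():
    print(f"pH {ph}: {percent:.0f}% relative activity")
```

The same function applies unchanged to a temperature series, with temperatures as the dictionary keys.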

[0158] To determine the proper reaction temperature for carboxylesterase activity, the culture supernatants containing carboxylesterases are incubated with substrate solutions at temperatures ranging from 20.degree. C. to 80.degree. C. at pH 7.0 for 30 minutes followed by determination of enzymatic activities. Carboxylesterases expressed from both transformed Aspergillus niger M54 and transformed Pichia pastoris GS115 show highest enzymatic activity at 60.degree. C., which is defined as 100% relative enzymatic activity, and decreased activity at higher temperatures at 70.degree. C. and 80.degree. C. (FIG. 9).

[0159] To test the thermostability of the recombinant carboxylesterase, the culture supernatants containing carboxylesterases expressed from both transformed Aspergillus niger M54 and transformed Pichia pastoris GS115 are incubated at temperatures ranging from 40.degree. C. to 80.degree. C. for 10 minutes and 30 minutes, respectively, followed by cooling in ice-bath. The substrate solutions are added to the enzyme incubations followed by enzymatic reaction at 37.degree. C. for 10 minutes and activity determination. The results show that, after incubation at 60.degree. C. for 10 minutes, the recombinant carboxylesterase retains nearly 100% of the enzymatic activity, and exhibits about 60% relative enzymatic activity after incubation at 70.degree. C. for 30 minutes (FIG. 8).

Example 4

Expression of Carboxylesterase from Geobacillus kaustophilus HTA426 Strain in Hansenula polymorpha

[0160] The carboxylesterase from Geobacillus kaustophilus HTA426 strain (Accession number: BA000043, SEQ ID NO: 10) is cloned from the cDNA library of Geobacillus kaustophilus HTA426 strain by PCR using primers designed from the known DNA sequence. PCR products are purified and digested with appropriate restriction enzymes, and then ligated with pHIPX4 vector (Gietl et al., Mutational analysis of the N-terminal topogenic signal of watermelon glyoxysomal malate dehydrogenase using the heterologous host Hansenula polymorpha. Proc Natl Acad Sci USA 1994, vol 91, 3151-3155). The recombinant pHIPX4 plasmid with insertion of the carboxylesterase gene is propagated in E. coli DH5α and is then linearized to transform Hansenula polymorpha strain leu 1.1 (Gleeson et al., Transformation of the methylotrophic yeast Hansenula polymorpha. J Gen Microbiol 1986, vol 132, 3459-65) using electroporation.

[0161] The transformed Hansenula polymorpha are screened using leucine-free culture medium. Single clones growing on leucine-free culture plates are incubated and the expression of carboxylesterase is characterized using enzymatic assays with α-naphthyl acetate as a substrate.

Example 5

Expression of Carboxylesterase from Bacillus thermoleovorans Strain in Aspergillus oryzae

[0162] The carboxylesterase from Bacillus thermoleovorans strain (Accession number: AF327065, SEQ ID NO: 4) is cloned from the cDNA library of Bacillus thermoleovorans strain by PCR using primers designed from the known DNA sequence. PCR products are purified and digested with appropriate restriction enzymes, and then ligated with pSa123 vector, which carries the arginine synthesis gene (Gomi et al., Integrative transformation of Aspergillus oryzae with a plasmid containing the Aspergillus nidulans argB gene. Agric. Biol. Chem. 1987, vol 51, 2549-2555). The recombinant pSa123 plasmid with insertion of the carboxylesterase gene is propagated in E. coli DH5α and is then linearized to transform Aspergillus oryzae M-2-3, which is deficient in the arginine synthesis gene (Ozeki et al., Construction of a promoter probe vector autonomously maintained in Aspergillus and characterization of promoter regions derived from A. niger and A. oryzae genomes. Biosci. Biotech. Biochem. 1996, vol 60, 383-389), using Aspergillus oryzae M-2-3 protoplasts.
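As an illustration of "primers designed from the known DNA sequence", the Python sketch below takes the first and last bases of the coding sequence listed as SEQ ID NO: 4 and derives a forward primer and a reverse-complemented reverse primer. The primer lengths, the EcoRI and NotI sites, and the function names are assumptions made for illustration only. The application does not specify the primers or the restriction enzymes used.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(cds_start, cds_end, fwd_site="", rev_site=""):
    """Build a primer pair from the 5' and 3' ends of a coding sequence.

    cds_start / cds_end are the first and last bases of the coding strand.
    fwd_site / rev_site are optional restriction sites added as 5' tails.
    """
    forward = fwd_site + cds_start
    reverse = rev_site + reverse_complement(cds_end)
    return forward, reverse


# Ends of the coding sequence as listed for SEQ ID NO: 4 (first 24 and last 21 bases).
CDS_START = "atgatgaaaattgttccgccgaag"
CDS_END = "cttgaatcgttagattggtaa"

# Hypothetical restriction sites (EcoRI, NotI); the application does not name the enzymes.
fwd, rev = design_primers(CDS_START, CDS_END, fwd_site="GAATTC", rev_site="GCGGCCGC")
print("forward primer:", fwd)
print("reverse primer:", rev)
```

A practical design would also check that the chosen sites do not occur inside the insert, add a few protective bases 5' of each site, and balance the melting temperatures of the two primers. None of these steps is shown here.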

[0163] The transformed Aspergillus oryzae are screened using arginine-free culture medium. Single clones growing on arginine-free culture plates are incubated and the expression of carboxylesterase is characterized using enzymatic assays with α-naphthyl acetate as a substrate.

Example 6

Purification of the Expressed Carboxylesterase

[0164] Pichia pastoris GS115 cells transformed with pPIC9K-CarE-his are cultured at 30° C. for 4 days. The culture medium is harvested and filtered through a 0.2 µm filter, followed by addition of sodium azide to a final concentration of 0.01%. To each liter of the harvested culture medium, 100 ml glycerol, 30 ml 5 M NaCl, 10 ml 1 M imidazole and 50 ml Ni-NTA Superflow resin are added, followed by gyratory shaking at 150 rpm at room temperature for 30-40 minutes. The resin beads are spun down at 3750 rpm for 10 minutes. The beads are loaded onto a column and the column is washed with washing buffer containing 50 mM Tris (pH 8.0), 300 mM NaCl, 10% glycerol, and 10 mM imidazole, until the UV absorbance at 280 nm is stable. Then the column is eluted with elution buffer containing 50 mM Tris (pH 8.0), 300 mM NaCl, 10% glycerol, and 250 mM imidazole, until no protein is detected in the eluted solution. The eluted fractions are analyzed by electrophoresis to identify fractions with a single target protein band. Target fractions are collected and tested for protein quantity and enzymatic activity.
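The additions per liter of harvested medium can be converted into approximate final concentrations in the binding mixture with the dilution formula C_final = C_stock × V_stock / V_total. The Python sketch below assumes that the volumes are simply additive, that the resin contributes its full 50 ml, that the glycerol is added neat, and that solutes already present in the culture medium are ignored. These are simplifying assumptions and are not statements from the application.

```python
def final_concentration(stock_conc, stock_volume_ml, total_volume_ml):
    """Concentration after dilution: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_volume_ml / total_volume_ml


# Volumes per liter of harvested medium, as given in the text (ml).
medium_ml = 1000.0
glycerol_ml = 100.0
nacl_ml = 30.0        # 5 M stock
imidazole_ml = 10.0   # 1 M stock
resin_ml = 50.0       # assumed here to contribute its full volume

total_ml = medium_ml + glycerol_ml + nacl_ml + imidazole_ml + resin_ml

nacl_mM = final_concentration(5000.0, nacl_ml, total_ml)
imidazole_mM = final_concentration(1000.0, imidazole_ml, total_ml)
glycerol_pct = final_concentration(100.0, glycerol_ml, total_ml)  # % v/v, assuming neat glycerol

print(f"total volume: {total_ml:.0f} ml")
print(f"added NaCl: {nacl_mM:.0f} mM")
print(f"imidazole: {imidazole_mM:.1f} mM")
print(f"glycerol: {glycerol_pct:.1f} % v/v")
```

Under these assumptions the binding mixture contains about 8.4 mM imidazole and 8.4% glycerol, close to the 10 mM imidazole and 10% glycerol of the washing buffer, and about 126 mM added NaCl.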

General

[0165] With respect to the use of substantially any plural and/or singular terms herein, those having skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application.

[0166] It will be understood by those within the art that, in general, terms used herein, and especially in the appended claims (e.g., bodies of the appended claims) are generally intended as "open" terms (e.g., the term "including" should be interpreted as "including but not limited to," the term "having" should be interpreted as "having at least," the term "includes" should be interpreted as "includes but is not limited to," etc.). It will be further understood by those within the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases "at least one" and "one or more" to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles "a" or "an" limits any particular claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases "one or more" or "at least one" and indefinite articles such as "a" or "an" (e.g., "a" and/or "an" should be interpreted to mean "at least one" or "one or more"); the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number (e.g., the bare recitation of "two recitations," without other modifiers, means at least two recitations, or two or more recitations).

[0167] In addition, where features or aspects of the disclosure are described in terms of Markush groups, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group.

[0168] As will be understood by one skilled in the art, for any and all purposes, such as in terms of providing a written description, all ranges disclosed herein also encompass any and all possible subranges and combinations of subranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art all language such as "up to," "at least," "greater than," "less than," and the like include the number recited and refer to ranges which can be subsequently broken down into subranges as discussed above. Finally, as will be understood by one skilled in the art, a range includes each individual member. Thus, for example, a group having 1-3 cells refers to groups having 1, 2, or 3 cells and so forth.

[0169] The present disclosure is not to be limited in terms of the particular embodiments described in this application, which are intended as illustrations of various aspects. Many modifications and variations can be made without departing from its spirit and scope, as will be apparent to those skilled in the art. Functionally equivalent methods and compositions within the scope of the disclosure, in addition to those enumerated herein, will be apparent to those skilled in the art from the foregoing descriptions. Such modifications and variations are intended to fall within the scope of the appended claims. The present disclosure is to be limited only by the terms of the appended claims, along with the full scope of equivalents to which such claims are entitled. It is to be understood that this disclosure is not limited to particular methods, reagents, compounds, compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.

[0170] While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting, with the true scope and spirit being indicated by the following claims.

Sequence Listing

401246PRTGeobacillus kaustophilus 1Met Lys Ile Val Pro Pro Lys Pro Phe Phe Phe Glu Ala Gly Glu Arg1 5 10 15Ala Val Leu Leu Leu His Gly Phe Thr Gly Asn Ser Ala Asp Val Arg 20 25 30Met Leu Gly Arg Phe Leu Glu Ser Lys Gly Tyr Thr Cys His Ala Pro 35 40 45Ile Tyr Lys Gly His Gly Val Pro Pro Glu Glu Leu Val His Thr Gly 50 55 60Pro Asp Asp Trp Trp Gln Asp Val Met Asn Gly Tyr Gln Phe Leu Lys65 70 75 80Asn Lys Gly Tyr Glu Lys Ile Ala Val Ala Gly Leu Ser Leu Gly Gly 85 90 95Val Phe Ser Leu Lys Leu Gly Tyr Thr Val Pro Ile Gln Gly Ile Val 100 105 110Thr Met Cys Ala Pro Met Tyr Ile Lys Ser Glu Glu Thr Met Tyr Glu 115 120 125Gly Val Leu Glu Tyr Ala Arg Glu Tyr Lys Lys Arg Glu Gly Lys Ser 130 135 140Glu Glu Gln Ile Glu Gln Glu Met Glu Arg Phe Lys Gln Thr Pro Met145 150 155 160Lys Thr Leu Lys Ala Leu Gln Glu Leu Ile Ala Asp Val Arg Ala His 165 170 175Leu Asp Leu Val Tyr Ala Pro Thr Phe Val Val Gln Ala Arg His Asp 180 185 190Glu Met Ile Asn Pro Asp Ser Ala Asn Ile Ile Tyr Asn Glu Ile Glu 195 200 205Ser Pro Val Lys Gln Ile Lys Trp Tyr Glu Gln Ser Gly His Val Ile 210 215 220Thr Leu Asp Gln Glu Lys Asp Gln Leu His Glu Asp Ile Tyr Ala Phe225 230 235 240Leu Glu Ser Leu Asp Trp 2452741DNAGeobacillus kaustophilus 2ttaccaatct aacgattcaa gaaatgcata aatatcttca tgcagctgat ctttttcttg 60atcaagcgta atcacatggc ctgattgctc ataccatttg atttgtttga ccggcgattc 120aatttcgtta taaatgatgt tcgcgctgtc tggattgatc atctcatcat ggcgcgcttg 180gacgacgaac gtcggtgcat aaaccaaatc aaggtgggcg cgcacatcgg caatgagttc 240ttgcaaggct ttcaacgtct tcatcggcgt ttgtttgaac cgttccattt cctgttcgat 300ttgttcctct gatttccctt cccgcttttt atactcgcgc gcatactcga gcacaccttc 360gtacatcgtt tcttcgcttt tgatgtacat cggcgcgcac atcgtcacaa tgccttgtat 420aggtacagtg tagcctaatt tgagagaaaa tacgcctcca agcgacaatc cagccacggc 480aattttttcg tagcctttgt ttttcaaaaa ctgatagccg ttcatgacgt cttgccacca 540atcatccggt ccggtgtgga cgagctcttc cggcggcacg ccatgccctt tgtaaatcgg 600agcgtggcac gtataccctt tcgattccaa gaatcgccca agcatccgaa cgtcggcgga 660attgccggta aacccatgca 
aaagcagcac cgcccgctcc ccggcttcaa agaaaaacgg 720cttcggcgga acaattttca t 7413246PRTGeobacillus thermoleovorans 3Met Met Lys Ile Val Pro Pro Lys Pro Phe Phe Phe Glu Ala Gly Glu1 5 10 15Arg Ala Val Leu Leu Leu His Gly Phe Thr Gly Asn Ser Ala Asp Val 20 25 30Arg Met Leu Gly Arg Phe Leu Glu Ser Lys Gly Tyr Thr Cys His Ala 35 40 45Pro Ile Thr Lys Gly Met Val Pro Pro Glu Glu Leu Val His Thr Gly 50 55 60Pro Asp Asp Trp Trp Gln Asp Val Met Asn Gly Tyr Gln Phe Leu Lys65 70 75 80Asn Lys Gly Tyr Glu Lys Ile Ala Val Ala Gly Leu Ser Leu Gly Gly 85 90 95Val Phe Ser Leu Lys Leu Gly Tyr Thr Val Pro Ile Glu Gly Ile Val 100 105 110Thr Met Cys Ala Pro Met Tyr Ile Lys Ser Glu Glu Thr Met Tyr Glu 115 120 125Gly Val Leu Glu Tyr Ala Arg Glu Tyr Lys Lys Arg Glu Gly Lys Ser 130 135 140Glu Glu Gln Ile Glu Gln Glu Met Glu Arg Phe Lys Gln Thr Pro Met145 150 155 160Lys Thr Leu Lys Ala Leu Gln Glu Leu Ile Ala Asp Val Arg Ala His 165 170 175Leu Asp Leu Val Tyr Ala Arg Thr Phe Val Val Gln Ala Arg His Asp 180 185 190Lys Met Ile Asn Pro Asp Ser Ala Asn Ile Ile Tyr Asn Glu Ile Glu 195 200 205Ser Pro Val Lys Gln Ile Lys Trp Tyr Glu Gln Ser Gly His Val Ile 210 215 220Thr Leu Asp Gln Glu Lys Asp Gln Leu His Glu Asp Ile Tyr Ala Phe225 230 235 240Leu Glu Ser Leu Asp Trp 2454741DNAGeobacillus thermoleovorans 4atgatgaaaa ttgttccgcc gaagccgttt ttctttgaag ccggggagcg ggcggtgctg 60cttttgcatg ggtttaccgg caattccgcc gacgttcgga tgcttgggcg attcttggaa 120tcgaaagggt atacgtgcca cgctccgatt acaaagggca tggtgccgcc ggaagagctc 180gtccacaccg gaccggatga ttggtggcaa gacgtcatga acggctatca gtttttgaaa 240aacaaaggct acgaaaaaat tgccgtggct ggattgtctc ttggaggcgt attttctctc 300aaattaggct acactgtacc tatagaaggc attgtgacga tgtgcgcgcc gatgtacatc 360aaaagcgaag aaacgatgta cgaaggtgtg ctcgagtatg cgcgcgagta taaaaagcgg 420gaagggaaat cagaggaaca aatcgaacag gaaatggaac ggttcaaaca aacgccgatg 480aagacgttga aagccttgca agaactcatt gccgatgtgc gcgcccacct tgatttggtt 540tatgcacgca cgttcgtcgt ccaagcgcgc catgataaga tgatcaatcc agacagcgcg 600aacatcattt ataacgaaat 
tgaatcgccg gtcaaacaaa tcaaatggta tgagcaatca 660ggccatgtga ttacgcttga tcaagaaaaa gatcagctgc atgaagatat ttatgcattt 720cttgaatcgt tagattggta a 7415256PRTSalmonella enterica 5Met Asn Asp Ile Trp Trp Gln Thr Tyr Gly Glu Gly Asn Cys His Leu1 5 10 15Val Leu Leu His Gly Trp Gly Leu Asn Ala Glu Val Trp His Cys Ile 20 25 30Arg Glu Glu Leu Gly Ser His Phe Thr Leu His Leu Val Asp Leu Pro 35 40 45Gly Tyr Gly Arg Ser Ser Gly Phe Gly Ala Met Thr Leu Glu Glu Met 50 55 60Thr Ala Gln Val Ala Lys Asn Ala Pro Asp Gln Ala Ile Trp Leu Gly65 70 75 80Trp Ser Leu Gly Gly Leu Val Ala Ser Gln Met Ala Leu Thr His Pro 85 90 95Glu Arg Val Gln Ala Leu Val Thr Val Ala Ser Ser Pro Cys Phe Ser 100 105 110Ala Arg Glu Gly Trp Pro Gly Ile Lys Pro Glu Ile Leu Gly Gly Phe 115 120 125Gln Gln Gln Leu Ser Asp Asp Phe Gln Arg Thr Val Glu Arg Phe Leu 130 135 140Ala Leu Gln Thr Leu Gly Thr Glu Thr Ala Arg Gln Asp Ala Arg Thr145 150 155 160Leu Lys Ser Val Val Leu Ala Gln Pro Met Pro Asp Val Glu Val Leu 165 170 175Asn Gly Gly Leu Glu Ile Leu Lys Thr Val Asp Leu Arg Glu Ala Leu 180 185 190Lys Asn Val Asn Met Pro Phe Leu Arg Leu Tyr Gly Tyr Leu Asp Gly 195 200 205Leu Val Pro Arg Lys Ile Ala Pro Leu Leu Asp Thr Leu Trp Pro His 210 215 220Ser Thr Ser Gln Ile Met Ala Lys Ala Ala His Ala Pro Phe Ile Ser225 230 235 240His Pro Ala Ala Phe Cys Gln Ala Leu Met Thr Leu Lys Ser Ser Leu 245 250 2556771DNASalmonella enterica 6 ttacagcgat gattttagcg tcatcagcgc ctgacaaaac gccgccggat gcgagataaa 60cggcgcatgg gccgccttcg ccattatctg tgatgtactg tgcggccata acgtatcgag 120caaaggcgcg attttacgcg gcaccagacc gtccagataa ccatacaaac gcaaaaacgg 180catgttcaca tttttaagcg cttctcgtag atcgaccgtt tttaagattt ccagtccgcc 240attgagcacc tctacatccg gcataggctg cgccagcact acgcttttta aggtgcgggc 300atcctgacgc gccgtctccg tccctaacgt ttgcagcgcc agaaaacgct ccaccgtgcg 360ctgaaaatcg tcgctaagct gctgctggaa tccgccgagg atttctggtt ttattcccgg 420ccacccctca cgcgcgctaa agcatggcga agaggcgact gtcaccagcg cctgaacgcg 480ttcagggtgg gtgagcgcca tctgactcgc caccaggccg cccaggctcc 
agccaagcca 540gatagcctgg tccggcgcgt ttttcgctac ctgcgccgtc atctcttcaa gcgtcatggc 600gccaaacccc gagctgcgac catagcccgg caagtcgacc agatgcagcg taaaatgcga 660gccgagttcc tcgcgaatgc aatgccatac ctccgcgttc aatccccatc cgtgcagcag 720cacaagatga caatttccct cgccgtaggt ctgccaccag atgtcattca t 7717541PRTAspergillus fumigatus 7Met Val Ile Ser Thr Lys Tyr Ile Phe Ala Leu Cys Val Leu Leu Leu1 5 10 15Thr Phe Ser Leu Ser Ser Ala Tyr Glu Asp Pro Leu Val Glu Leu Asp 20 25 30Tyr Gly Gln Phe Gln Gly Lys Tyr Asp Ser Thr Tyr Asn Leu Ser Tyr 35 40 45Phe Arg Lys Ile Pro Phe Ala Ala Pro Pro Thr Gly Glu Asn Arg Phe 50 55 60Arg Ala Pro Gln Pro Pro Leu Arg Ile Thr His Gly Val Tyr Asp Thr65 70 75 80Asp Gln Asp Phe Asp Met Cys Pro Gln Arg Thr Val Asn Gly Ser Glu 85 90 95Asp Cys Leu Tyr Leu Gly Leu Phe Ser Arg Pro Trp Asp Ala Thr Val 100 105 110Ala Pro Ser Ser Ala Ser Arg Pro Val Leu Val Val Phe Tyr Gly Gly 115 120 125Gly Phe Ile Gln Gly Ser Ala Ser Phe Thr Leu Pro Pro Ser Ser Tyr 130 135 140Pro Ile Leu Asn Val Thr Glu Leu Asn Asp Tyr Val Val Ile Tyr Pro145 150 155 160Asn Tyr Arg Val Asn Ala Phe Gly Phe Leu Pro Gly Lys Ala Ile Lys 165 170 175Arg Ser Pro Thr Ser Asp Leu Asn Pro Gly Leu Leu Asp Gln Gln Tyr 180 185 190Val Leu Lys Trp Val Gln Lys Tyr Ile His His Phe Gly Gly Asp Pro 195 200 205Arg Asn Val Thr Ile Trp Gly Gln Ser Ala Gly Ala Gly Ser Val Val 210 215 220Ala Gln Val Leu Ala Asn Gly Arg Gly Asn Gln Pro Lys Leu Phe Asp225 230 235 240Lys Ala Leu Val Ser Ser Pro Phe Trp Pro Lys Thr Tyr Ala Tyr Asp 245 250 255Ala Pro Glu Ala Glu Ala Ile Tyr Asp Gln Leu Val Thr Leu Thr Gly 260 265 270Cys Ala Asn Ala Thr Asp Ser Leu Ala Cys Leu Lys Ser Val Asp Val 275 280 285Gln Thr Ile Arg Asp Ala Asn Leu Ile Ile Ser Ala Ser His Thr Tyr 290 295 300Asn Thr Ser Ser Tyr Thr Trp Ala Pro Val Ile Asp Gly Glu Phe Leu305 310 315 320Gln Glu Ser Leu Thr Thr Ala Val Ala Arg Arg Lys Ile Lys Thr His 325 330 335Phe Val Phe Gly Met Tyr Asn Thr His Glu Gly Glu Asn Phe Leu Pro 340 345 350Ser Gly Leu Gly Lys Thr Ala Thr Thr Ala Gly 
Phe Asn Ser Ser Ala 355 360 365Ala Ser Phe His Thr Trp Leu Thr Gly Phe Leu Pro Gly Leu Ser Pro 370 375 380Lys His Ile Ala Leu Leu Glu Thr Lys Tyr Tyr Pro Pro Thr Gly Glu385 390 395 400Thr Glu Thr Ile Asp Leu Tyr Asn Thr Thr Leu Val Arg Ala Gly Leu 405 410 415Val Tyr Arg Asp Leu Val Leu Ala Cys Pro Ala Tyr Trp Leu Thr Ser 420 425 430Ala Ala Arg Arg Arg Gly Tyr Leu Gly Glu Tyr Thr Ile Ser Pro Ala 435 440 445Lys His Ala Ser Asp Thr Ile Tyr Trp Asn Arg Val Asn Pro Ile Gln 450 455 460Gln Thr Asp Pro Leu Ile Tyr Asp Gly Phe Ala Gly Ala Phe Gly Ser465 470 475 480Phe Phe Gln Thr Gly Asp Pro Asn Ala His Lys Leu Thr Asn Ala Ser 485 490 495Glu Lys Gly Val Pro Val Leu Glu Lys Thr Gly Glu Glu Trp Val Ile 500 505 510Ala Pro Asp Gly Phe Ala Thr Ala Gln Val Ser Phe Leu Lys Glu Arg 515 520 525Cys Asp Phe Trp Glu Ser Val Gly Glu Arg Val Pro Val 530 535 54081626DNAAspergillus fumigatus 8atggtgatct caacgaagta tatatttgcc ctttgcgtcc tcttgctgac cttctcactc 60tctagcgcct acgaagatcc cctcgtcgag ctcgactatg ggcagttcca gggcaaatat 120gactctacgt ataatctctc atacttccgc aagatcccct ttgcggcgcc tccaacgggg 180gagaaccggt ttagagcccc tcagccacct ctaaggatca cgcatggcgt ctatgacact 240gatcaggact ttgacatgtg cccgcaacgc acggtcaatg gctccgaaga ctgcctctac 300cttggcctgt tctcacgacc gtgggatgct acagtggctc cctcgtctgc ttctagacca 360gtcctggtag tcttctacgg tggtggcttc atccaaggca gcgcttcgtt tacactacct 420ccgtcctcat atccaatcct gaacgtcacc gagctgaatg actatgtggt catctacccc 480aactaccggg tcaatgcatt tggtttcctt ccgggcaagg cgatcaagcg atctccaacg 540tctgatctca accccggcct cttggaccag cagtacgttc tcaagtgggt gcagaaatac 600attcaccact tcggcggtga ccctcgcaac gtcacgatct ggggccaatc cgccggcgcc 660ggctcagtgg ttgcgcaggt tctcgccaac ggacgaggca accaacccaa gctcttcgac 720aaagcgctcg tcagctcgcc cttctggcca aagacctacg cctacgacgc ccccgaagca 780gaagccatct acgaccagct tgtcactctc accggctgcg ccaatgccac cgactccctc 840gcctgcctga aatccgtcga cgtccagacc atccgcgacg caaacctcat catcagcgcc 900agccacacct acaacacaag ctcctacacc tgggcccccg tcatcgacgg cgaattcctc 960caagaatccc 
tcaccaccgc cgtcgcccgc cgcaaaatca aaacccactt cgtcttcggc 1020atgtacaaca cccacgaggg cgagaacttc ctcccctccg gcctgggcaa gaccgctaca 1080accgctgggt tcaactcctc tgctgctagc ttccacacct ggctgacggg cttcctgccg 1140ggtctctcgc ccaagcacat cgccctcctc gaaaccaagt actacccgcc caccggcgaa 1200acagagacaa tcgacctgta caacacgacg ctcgtccgcg cgggcctggt ctacagggat 1260cttgtcctcg cctgtccggc gtactggctt acctcggccg caagacggag gggctatcta 1320ggtgaatata cgatttcgcc ggctaagcac gcgagcgata ccatctattg gaaccgagtg 1380aacccgatcc agcagactga tccactgatt tatgacggct tcgcaggcgc ttttgggagt 1440ttcttccaga cgggcgatcc gaatgcgcat aagctgacga atgcgtcgga gaagggtgtg 1500ccggttctcg agaagaccgg ggaggagtgg gtgattgctc cggatgggtt tgcaaccgcg 1560caggtgtcgt ttttaaagga gaggtgtgat ttctgggagt cggtggggga gcgggttcct 1620gtctga 16269247PRTGeobacillus stearothermophilus 9Met Met Lys Ile Val Pro Pro Lys Pro Phe Phe Phe Glu Ala Gly Glu1 5 10 15Arg Ala Val Leu Leu Leu His Gly Phe Thr Gly Asn Ser Ala Asp Val 20 25 30Arg Met Leu Gly Arg Phe Leu Glu Ser Lys Gly Tyr Thr Cys His Ala 35 40 45Pro Ile Tyr Lys Gly His Gly Val Pro Pro Glu Glu Leu Val His Thr 50 55 60Gly Pro Asp Asp Trp Trp Gln Asp Val Met Asn Gly Tyr Glu Phe Leu65 70 75 80Lys Asn Lys Gly Tyr Glu Lys Ile Ala Val Ala Gly Leu Ser Leu Gly 85 90 95Gly Val Phe Ser Leu Lys Leu Gly Tyr Thr Val Pro Ile Glu Gly Ile 100 105 110Val Thr Met Cys Ala Pro Met Tyr Ile Lys Ser Glu Glu Thr Met Tyr 115 120 125Glu Gly Val Leu Glu Tyr Ala Arg Glu Tyr Lys Lys Arg Glu Gly Lys 130 135 140Ser Glu Glu Gln Ile Glu Gln Glu Met Glu Lys Phe Lys Gln Thr Pro145 150 155 160Met Lys Thr Leu Lys Ala Leu Gln Glu Leu Ile Ala Asp Val Arg Asp 165 170 175His Leu Asp Leu Ile Tyr Ala Pro Thr Phe Val Val Gln Ala Arg His 180 185 190Asp Glu Met Ile Asn Pro Asp Ser Ala Asn Ile Ile Tyr Asn Glu Ile 195 200 205Glu Ser Pro Val Lys Gln Ile Lys Trp Tyr Glu Gln Ser Gly His Val 210 215 220Ile Thr Leu Asp Gln Glu Lys Asp Gln Leu His Glu Asp Ile Tyr Ala225 230 235 240Phe Leu Glu Ser Leu Asp Trp 24510744DNAGeobacillus stearothermophilus 
10atgatgaaaa ttgttccgcc gaagccgttt ttctttgaag ccggggagcg ggcggtgctg 60ctgttgcatg ggtttaccgg caattccgct gatgttcgga tgctcgggcg ttttttagaa 120tccaaaggct atacgtgcca tgcgcctatt tacaaaggac acggcgtgcc gcctgaggag 180ctcgtccaca ccgggccgga tgactggtgg caagatgtca tgaacggcta cgagtttttg 240aaaaacaagg gctacgaaaa aatcgccgtc gccggactgt cgcttggagg cgtattttca 300ttgaaattag gttacactgt acctatagag ggcattgtga cgatgtgcgc gccgatgtac 360atcaaaagcg aggaaacgat gtacgaaggc gtgctcgagt atgcgcgcga gtataaaaag 420cgggaaggaa aatcagagga gcagatcgaa caggagatgg agaagttcaa gcagacgccg 480atgaagacgt taaaggcgct gcaggagctg atcgccgatg tgcgtgacca tcttgatttg 540atttatgccc cgacgtttgt tgtgcaggcg cgccatgatg agatgatcaa cccggacagc 600gcgaacatca tttataacga aattgaatcg ccggtcaaac aaatcaagtg gtatgagcaa 660tcaggccatg tgattacgct tgatcaagaa aaagatcagc tgcatgaaga tatttatgca 720tttcttgaat cgttagattg gtaa 74411176PRTGeobacillus stearothermophilus 11Lys Leu Gly Glu Lys Glu Leu Leu Asp Arg Ile Asn Arg Glu Val Gly1 5 10 15Pro Val Pro Glu Glu Ala Ile Arg Tyr Tyr Lys Glu Thr Ala Glu Pro 20 25 30Ser Ala Pro Thr Trp Gln Thr Trp Leu Arg Ile Met Thr Tyr Arg Val 35 40 45Phe Val Glu Gly Met Leu Arg Thr Ala Asp Ala Gln Ala Ala Gln Gly 50 55 60Ala Asp Val Tyr Met Tyr Arg Phe Asp Tyr Glu Thr Pro Val Phe Gly65 70 75 80Gly Gln Leu Lys Ala Cys His Ala Leu Glu Leu Pro Phe Val Phe His

85 90 95Asn Leu His Gln Pro Gly Val Ala Asn Phe Val Gly Asn Arg Pro Glu 100 105 110Arg Glu Ala Ile Ala Asn Glu Met His Tyr Ala Trp Leu Ser Phe Ala 115 120 125Arg Thr Gly Asp Pro Asn Gly Ala His Leu Pro Glu Ala Trp Pro Ala 130 135 140Tyr Thr Asn Glu Arg Lys Ala Ala Phe Val Phe Ser Ala Ala Ser His145 150 155 160Val Glu Asp Asp Pro Phe Gly Arg Glu Arg Ala Ala Trp Gln Gly Arg 165 170 17512531DNAGeobacillus stearothermophilus 12 aagcttggcg aaaaggaact tcttgaccga atcaaccggg aagtcgggcc ggtgccagaa 60gaggccatcc gctattacaa agaaacggcg gagccgtcgg cgcctacttg gcaaacgtgg 120cttcgcatca tgacgtaccg cgtatttgtc gaggggatgc tgcggacggc ggacgcccaa 180gcggcgcaag gggcggatgt gtacatgtac cgctttgact atgagacgcc ggtgttcggc 240ggccagctga aagcatgcca cgcgctcgag ctgccgtttg tgtttcacaa tctccatcag 300ccgggcgtcg cgaatttcgt cggcaaccgg ccggagcgcg aggcgatcgc caatgaaatg 360cattacgctt ggctctcgtt tgcccgcacc ggagacccga acggtgctca cttgccggaa 420gcgtggccgg cgtatacgaa cgagcgcaag gcggcctttg tcttttcggc cgccagccat 480gtcgaggacg acccgttcgg ccgcgagcgg gcggcatggc aaggacgcta g 53113498PRTGeobacillus stearothermophilus 13Met Glu Arg Thr Val Val Glu Thr Arg Tyr Gly Arg Leu Arg Gly Glu1 5 10 15Met Asn Glu Gly Val Phe Val Trp Lys Gly Ile Pro Tyr Ala Lys Ala 20 25 30Pro Val Gly Glu Arg Arg Phe Leu Pro Pro Glu Pro Pro Asp Ala Trp 35 40 45Asp Gly Val Arg Glu Ala Thr Ser Phe Gly Pro Val Val Met Gln Pro 50 55 60Ser Asp Pro Ile Phe Ser Gly Leu Leu Gly Arg Met Ser Glu Ala Pro65 70 75 80Ser Glu Asp Gly Leu Tyr Leu Asn Ile Trp Ser Pro Ala Ala Asp Gly 85 90 95Lys Lys Arg Pro Val Leu Phe Trp Ile His Gly Gly Ala Phe Leu Phe 100 105 110Gly Ser Gly Ser Ser Pro Trp Tyr Asp Gly Thr Ala Phe Ala Lys His 115 120 125Gly Asp Val Val Val Val Thr Ile Asn Tyr Arg Met Asn Val Phe Gly 130 135 140Phe Leu His Leu Gly Asp Ser Phe Gly Glu Ala Tyr Ala Gln Ala Gly145 150 155 160Asn Leu Gly Ile Leu Asp Gln Val Ala Ala Leu Arg Trp Val Lys Glu 165 170 175Asn Ile Ala Ala Phe Gly Gly Asp Pro Asp Asn Ile Thr Ile Phe Gly 180 185 190Glu Ser Ala Gly Ala Ala Ser Val 
Gly Val Leu Leu Ser Leu Pro Glu 195 200 205Ala Ser Gly Leu Phe Arg Arg Ala Met Leu Gln Ser Gly Ser Gly Ser 210 215 220Leu Leu Leu Arg Ser Pro Glu Thr Ala Met Ala Met Thr Glu Arg Ile225 230 235 240Leu Asp Lys Ala Gly Ile Arg Pro Gly Asp Arg Glu Arg Leu Leu Ser 245 250 255Ile Pro Ala Glu Glu Leu Leu Arg Ala Ala Leu Ser Leu Gly Pro Gly 260 265 270Val Met Tyr Gly Pro Val Val Asp Gly Arg Val Leu Arg Arg His Pro 275 280 285Ile Glu Ala Leu Arg Tyr Gly Ala Ala Ser Gly Ile Pro Ile Leu Ile 290 295 300Gly Val Thr Lys Asp Glu Tyr Asn Leu Phe Thr Leu Thr Asp Pro Ser305 310 315 320Trp Thr Lys Leu Gly Glu Lys Glu Leu Leu Asp Arg Ile Asn Arg Glu 325 330 335Val Gly Pro Val Pro Glu Glu Ala Ile Arg Tyr Tyr Lys Glu Thr Ala 340 345 350Glu Pro Ser Ala Pro Thr Trp Gln Thr Trp Leu Arg Ile Met Thr Tyr 355 360 365Arg Val Phe Val Glu Gly Met Leu Arg Thr Ala Asp Ala Gln Ala Ala 370 375 380Gln Gly Ala Asp Val Tyr Met Tyr Arg Phe Asp Tyr Glu Thr Pro Val385 390 395 400Phe Gly Gly Gln Leu Lys Ala Cys His Ala Leu Glu Leu Pro Phe Val 405 410 415Phe His Asn Leu His Gln Pro Gly Val Ala Asn Phe Val Gly Asn Arg 420 425 430Pro Glu Arg Glu Ala Ile Ala Asn Glu Met His Tyr Ala Trp Leu Ser 435 440 445Phe Ala Arg Thr Gly Asp Pro Asn Gly Ala His Leu Pro Glu Ala Trp 450 455 460Pro Ala Tyr Thr Asn Glu Arg Lys Ala Ala Phe Val Phe Ser Ala Ala465 470 475 480Ser His Val Glu Asp Asp Pro Phe Gly Arg Glu Arg Ala Ala Trp Gln 485 490 495Gly Arg141497DNAGeobacillus stearothermophilus 14atggagcgaa ccgttgttga aacaaggtac ggacggttgc gcggggaaat gaatgaaggc 60gtttttgttt ggaaaggaat tccgtacgcg aaagcgccgg tcggtgagcg ccggtttttg 120ccgccggagc cgcccgatgc gtgggatggg gtgcgggagg cgacatcgtt cggtcctgtc 180gtgatgcagc cgtcggatcc gattttcagc ggattgctcg ggcggatgag cgaggcgccg 240agcgaagacg ggctgtactt gaacatttgg tcgccggcgg cggatgggaa gaagcgcccg 300gtgttgtttt ggattcacgg cggcgccttt ttgttcggtt cgggttcttc gccgtggtat 360gacgggacgg cgttcgcgaa acacggcgat gtcgttgtcg tgacgatcaa ctaccgaatg 420aacgtgtttg gctttttgca tctcggtgat tcgttcggcg aagcgtacgc 
gcaagccggg 480aatctcggca ttttggacca agtggcggcg ctgcgctggg tgaaggagaa cattgcggcg 540tttggtggtg atccggacaa catcacgatt ttcggtgaat cggccggagc ggcgagcgtc 600ggcgtgctgt tgtcgcttcc ggaggccagc gggctgtttc ggcgcgccat gttgcaaagc 660ggttcgggat cgcttcttct ccgttcgccg gagaccgcga tggcgatgac cgaacgcatt 720cttgataagg ctggcatccg tccgggcgac cgcgaacggc tgttgtcgat tcctgccgag 780gagctgctgc gggcggcgct gtcgctcggt ccaggggtca tgtacggtcc ggtggtggat 840ggccgcgtat tgcgccgcca tccgatcgaa gcgctccgct acggggcggc cagcggcatt 900ccgattctca tcggcgtgac gaaagacgag tacaacttgt ttaccttgac ggatccgtca 960tggacaaagc ttggcgaaaa ggaacttctt gaccgaatca accgggaagt cgggccggtg 1020ccagaagagg ccatccgcta ttacaaagaa acggcggagc cgtcggcgcc tacttggcaa 1080acgtggcttc gcatcatgac gtaccgcgta tttgtcgagg ggatgctgcg gacggcggac 1140gcccaagcgg cgcaaggggc ggatgtgtac atgtaccgct ttgactatga gacgccggtg 1200ttcggcggcc agctgaaagc atgccacgcg ctcgagctgc cgtttgtgtt tcacaatctc 1260catcagccgg gcgtcgcgaa tttcgtcggc aaccggccgg agcgcgaggc gatcgccaat 1320gaaatgcatt acgcttggct ctcgtttgcc cgcaccggag acccgaacgg tgctcacttg 1380ccggaagcgt ggccggcgta tacgaacgag cgcaaggcgg cctttgtctt ttcggccgcc 1440agccatgtcg aggacgaccc gttcggccgc gagcgggcgg catggcaagg acgctag 149715226PRTGeobacillus stearothermophilus 15Met His Asn Asp Leu Ala Tyr Glu Tyr Asp Ile His Leu Pro Ser Gly1 5 10 15Gly Glu Ala Gly Lys Lys Tyr Pro Ala Val Phe Ala Leu His Gly Ile 20 25 30Gly Tyr Asp Glu Gln Tyr Met Leu Thr Leu Val Lys Asp Leu Lys Glu 35 40 45Glu Phe Ile Leu Ile Gly Ile Arg Gly Asp Leu Pro Tyr Glu Asp Gly 50 55 60Tyr Ala Tyr Tyr Tyr Leu Lys Glu Tyr Gly Lys Pro Glu Arg Lys Met65 70 75 80Phe Asp Asp Ser Ile Gly Lys Leu Lys His Phe Ile Glu Tyr Ala Leu 85 90 95Asn Gln Tyr Pro Ile Asp Ser Asp Arg Val Tyr Leu Ile Gly Phe Ser 100 105 110Gln Gly Ala Ile Leu Ser Met Ser Leu Ala Leu Ile Leu Gly Asp Lys 115 120 125Ile Lys Gly Ile Ala Ala Met Asn Gly Tyr Ile Pro Ser Phe Val Lys 130 135 140Glu Glu Tyr Pro Leu Gln Pro Ile Ser His Leu Ser Val Phe Leu Thr145 150 155 160Gln Gly Glu Ser Asp His Ile 
Phe Pro Leu Asn Ile Gly Gln Glu Asn 165 170 175Tyr Glu Tyr Leu Arg Gln His Ala Gly Ala Val Lys Tyr Thr Ile Tyr 180 185 190Pro Ala Gly His Glu Ile Ser Gln Asp Asn Gln Arg Asp Ile Val Ser 195 200 205Trp Leu Arg His Asp Ala Phe His His Asn Ser Asn Lys Ala Thr Asn 210 215 220Pro Ala22516681DNAGeobacillus stearothermophilus 16atgcataatg acttggcata tgaatatgac attcatcttc cttctggcgg agaagcgggg 60aaaaagtatc cggctgtttt cgcgttgcac ggcatcgggt atgacgaaca atacatgctt 120actttagtga aagatttaaa agaagaattt attttaatag gcattagagg ggatctcccg 180tatgaagatg gatatgccta ttattatttg aaagaatatg gaaagccaga acggaaaatg 240ttcgatgata gcataggcaa attaaagcac ttcattgaat atgcattaaa ccaatatccg 300attgattccg atcgagtgta tttgatcggg tttagtcaag gcgccatttt aagtatgtct 360ctcgccttga tactgggcga taaaattaaa gggattgccg caatgaacgg atatatacca 420tcgtttgtga aggaagaata tccgttgcag cctatcagtc acttgtctgt gtttctcacc 480caaggcgaat cagatcatat ttttccttta aatattgggc aggaaaatta tgaatacttg 540cgccagcatg cgggggctgt gaagtatacc atttatccgg caggacatga aatatcgcaa 600gacaaccaac gcgacatcgt ttcatggctg cgtcatgatg cattccatca caattccaat 660aaggcaacaa atcccgcatg a 68117454PRTHomo sapiens 17Glu His Cys Leu Tyr Leu Asn Ile Tyr Thr Pro Ala Asp Leu Thr Lys1 5 10 15Lys Asn Arg Leu Pro Val Met Val Trp Ile His Gly Gly Gly Leu Met 20 25 30Val Gly Ala Ala Ser Thr Tyr Asp Gly Leu Ala Leu Ala Ala His Glu 35 40 45Asn Val Val Val Val Thr Ile Gln Tyr Arg Leu Gly Ile Trp Gly Phe 50 55 60Phe Ser Thr Gly Asp Glu His Ser Arg Gly Asn Trp Gly His Leu Asp65 70 75 80Gln Val Ala Ala Leu Arg Trp Val Gln Asp Asn Ile Ala Ser Phe Gly 85 90 95Gly Asn Pro Gly Ser Val Thr Ile Phe Gly Glu Ser Ala Gly Gly Glu 100 105 110Ser Val Ser Val Leu Val Leu Ser Pro Leu Ala Lys Asn Leu Phe His 115 120 125Arg Ala Ile Ser Glu Ser Gly Val Ala Leu Thr Ser Val Leu Val Lys 130 135 140Lys Gly Asp Val Lys Pro Leu Ala Glu Gln Ile Ala Ile Thr Ala Gly145 150 155 160Cys Lys Thr Thr Thr Ser Ala Ala Met Val His Cys Leu Arg Gln Lys 165 170 175Thr Glu Glu Glu Leu Leu Glu Thr Thr Leu Lys Ile Gly Asn 
Ser Tyr 180 185 190Leu Trp Thr Tyr Arg Glu Thr Gln Arg Glu Ser Thr Leu Leu Gly Thr 195 200 205Val Ile Asp Gly Met Leu Leu Leu Lys Thr Pro Glu Glu Leu Gln Arg 210 215 220Glu Arg Asn Phe His Thr Val Pro Tyr Met Val Gly Ile Asn Lys Gln225 230 235 240Glu Phe Gly Trp Leu Ile Pro Met Gln Leu Met Ser Tyr Pro Leu Ser 245 250 255Glu Gly Gln Leu Asp Gln Lys Thr Ala Met Ser Leu Leu Gly Ser Pro 260 265 270Ile Pro Leu Phe Ala Ile Ala Lys Glu Leu Ile Pro Glu Ala Thr Glu 275 280 285Lys Tyr Leu Gly Gly Thr Asp Asp Thr Val Lys Lys Lys Asp Leu Ile 290 295 300Leu Asp Leu Ile Ala Asp Val Met Phe Gly Val Pro Ser Val Ile Val305 310 315 320Ala Arg Asn His Arg Asp Ala Gly Ala Pro Thr Tyr Met Tyr Glu Phe 325 330 335Gln Tyr Arg Pro Ser Phe Ser Ser Asp Met Lys Pro Lys Thr Val Ile 340 345 350Gly Asp His Gly Asp Glu Leu Phe Ser Val Phe Gly Ala Pro Phe Leu 355 360 365Lys Glu Gly Ala Ser Glu Glu Glu Ile Arg Leu Ser Lys Met Val Met 370 375 380Lys Phe Trp Ala Asn Phe Ala Arg Asn Gly Asn Pro Asn Gly Lys Gly385 390 395 400Leu Pro His Trp Pro Glu Tyr Asn Gln Lys Glu Gly Tyr Leu Gln Ile 405 410 415Gly Ala Asn Thr Gln Ala Ala Gln Lys Leu Lys Asp Lys Glu Val Ala 420 425 430Phe Trp Thr Asn Leu Phe Ala Lys Lys Ala Val Glu Lys Pro Pro Gln 435 440 445Thr Asp His Ile Glu Leu 450181367DNAHomo sapiens 18ctgaacactg tctttacctc aatatttaca ctcctgcaga cttgaccaag aaaaacaggc 60tgccggtgat ggtgtggatc cacggagggg ggctgatggt gggtgcggca tcaacctatg 120atgggctggc ccttgctgcc catgaaaacg tggtggtggt gaccattcaa tatcgcctgg 180gcatctgggg attcttcagc acaggggatg aacacagccg ggggaactgg ggtcacctgg 240accaggtggc tgccctgcgc tgggtccagg acaacattgc cagctttgga gggaacccag 300gctctgtgac catctttgga gagtcagcgg gaggagaaag tgtctctgtt cttgttttgt 360ctccattggc caagaacctc ttccaccggg ccatttctga gagtggcgtg gccctcactt 420ctgttctggt gaagaaaggt gatgtcaagc ccttggctga gcaaattgct atcactgctg 480ggtgcaaaac caccacctct gctgctatgg ttcactgcct gcgacagaag acggaagagg 540agctcttgga gacgacattg aaaattggaa attcttatct ctggacttac agggagaccc 600agagagagtc aacccttctg 
ggcactgtga ttgatgggat gctgctgctg aaaacacctg 660aagagcttca acgtgaaagg aatttccaca ctgtccccta catggtcgga attaacaagc 720aggagtttgg ctggttgatt ccaatgcagt tgatgagcta tccactctcc gaagggcaac 780tggaccagaa gacagccatg tcactccttg gaagtcctat ccccttgttt gccattgcta 840aggaactgat tccagaagcc actgagaaat acttaggagg aacagacgac actgtcaaaa 900agaaagacct gatcctggac ttgatagcag atgtgatgtt tggtgtccca tctgtgattg 960tggcccggaa ccacagagat gctggagcac ccacctacat gtatgagttt cagtaccgtc 1020caagcttctc atcagacatg aaacccaaga cggtgatagg agaccacggg gatgagctct 1080tctccgtctt tggggcccca tttttaaaag agggtgcctc agaagaggag atcagactta 1140gcaagatggt gatgaaattc tgggccaact ttgctcgcaa tggaaacccc aatgggaaag 1200ggctgcccca ctggccagag tacaaccaga aggaagggta tctgcagatt ggtgccaaca 1260cccaggcggc ccagaagctg aaggacaaag aagtagcttt ctggaccaac ctctttgcca 1320agaaggcagt ggagaagcca ccccagacag accacataga gctgtga 136719565PRTMus musculus 19Met Trp Leu Cys Ala Leu Ser Leu Ile Ser Leu Thr Ala Cys Leu Ser1 5 10 15Leu Gly His Pro Ser Leu Pro Pro Val Val His Thr Val His Gly Lys 20 25 30Val Leu Gly Lys Tyr Val Thr Leu Glu Gly Phe Ser Gln Pro Val Ala 35 40 45Val Phe Leu Gly Val Pro Phe Ala Lys Pro Pro Leu Gly Ser Leu Arg 50 55 60Phe Ala Pro Pro Glu Pro Ala Glu Pro Trp Ser Phe Val Lys His Thr65 70 75 80Thr Ser Tyr Pro Pro Leu Cys Tyr Gln Asn Pro Glu Ala Ala Leu Arg 85 90 95Leu Ala Glu Arg Phe Thr Asn Gln Arg Lys Ile Ile Pro His Lys Phe 100 105 110Ser Glu Asp Cys Leu Tyr Leu Asn Ile Tyr Thr Pro Ala Asp Leu Thr 115 120 125Gln Asn Ser Arg Leu Pro Val Met Val Trp Ile His Gly Gly Gly Leu 130 135 140Val Ile Asp Gly Ala Ser Thr Tyr Asp Gly Val Pro Leu Ala Val His145 150 155 160Glu Asn Val Val Val Val Val Ile Gln Tyr Arg Leu Gly Ile Trp Gly 165 170 175Phe Phe Ser Thr Glu Asp Glu His Ser Arg Gly Asn Trp Gly His Leu 180 185 190Asp Gln Val Ala Ala Leu His Trp Val Gln Asp Asn Ile Ala Asn Phe 195 200 205Gly Gly Asn Pro Gly Ser Val Thr Ile Phe Gly Glu Ser Ala Gly Gly 210 215 220Glu Ser Val Ser Val Leu Val Leu Ser Pro Leu Ala Lys Asn Leu Phe225 230 
235 240His Arg Ala Ile Ala Gln Ser Ser Val Ile Phe Asn Pro Cys Leu Phe 245 250 255Gly Arg Ala Ala Arg Pro Leu Ala Lys Lys Ile Ala Ala Leu Ala Gly 260 265 270Cys Lys Thr Thr Thr Ser Ala Ala Met Val His Cys Leu Arg Gln Lys 275 280 285Thr Glu Asp Glu Leu Leu Glu Val Ser Leu Lys Met Lys Phe Gly Thr 290 295 300Val Asp Phe Leu Gly Asp Pro Arg Glu Ser Tyr Pro Phe Leu Pro Thr305 310 315 320Val Ile Asp Gly Val Leu Leu Pro Lys Ala Pro Glu Glu Ile Leu Ala 325 330 335Glu Lys Ser Phe Asn Thr Val Pro Tyr Met Val Gly Ile Asn Lys His 340 345 350Glu Phe Gly Trp Ile Ile Pro Met Phe Leu Asp Phe Pro Leu Ser Glu 355 360 365Arg Lys Leu Glu Gln Lys Thr Ala Ala Ser Ile Leu Trp Gln Ala Tyr 370 375 380Pro Ile Leu Asn Ile Ser Glu Lys Leu Ile Pro Ala Ala Ile Glu Lys385 390 395 400Tyr Leu Gly Gly Thr Glu Asp Pro Ala Thr Met Thr Asp Leu Phe Leu 405 410 415Asp Leu Ile Gly Asp Ile Met Phe Gly Val Pro Ser Val Ile Val Ser 420 425 430Arg Ser His Arg Asp Ala Gly Ala Pro Thr Tyr Met Tyr Glu Tyr Gln 435 440 445Tyr Arg Pro Ser Phe Val Ser Asp Asp Arg Pro Gln Glu Leu Leu Gly 450 455 460Asp His Ala Asp Glu Leu Phe Ser Val Trp Gly Ala Pro Phe Leu Lys465 470 475 480Glu Gly Ala Ser Glu Glu Glu Ile Asn Leu Ser Asn Met Val Met Lys

485 490 495Phe Trp Ala Asn Phe Ala Arg Asn Gly Asn Pro Asn Gly Glu Gly Leu 500 505 510Pro His Trp Pro Glu Tyr Asp Gln Lys Glu Gly Tyr Leu Gln Ile Gly 515 520 525Val Pro Ala Gln Ala Ala His Arg Leu Lys Asp Lys Glu Val Asp Phe 530 535 540Trp Thr Glu Leu Arg Ala Lys Glu Thr Ala Glu Arg Ser Ser His Arg545 550 555 560Glu His Val Glu Leu 565201698DNAMus musculus 20atgtggctct gtgctttgag tctgatctct ctcactgctt gcttgagtct gggacaccca 60tccttaccgc ctgtggtaca caccgttcat ggcaaagtcc tggggaagta tgtcacctta 120gaaggattct cacagcctgt ggccgtcttc ctgggagtcc cctttgccaa gccccctctt 180ggatctctga ggtttgctcc accagagcct gcagagccct ggagcttcgt gaagcacacc 240acttcctacc ctcctttgtg ctaccaaaac ccagaggcag cattgaggct cgctgagcgc 300ttcaccaacc aaaggaagat cattccccac aaattttctg aggactgtct ctacctcaac 360atttatactc ctgctgactt aacacagaac agcaggttgc ccgtgatggt gtggatacat 420ggaggtggac ttgtgataga tggagcatca acctatgatg gagtgcccct ggctgtccat 480gaaaatgtgg ttgtagtggt cattcagtat cgcctgggca tctggggatt cttcagcaca 540gaggatgaac acagccgggg gaactggggt cacttggacc aggtggctgc actacattgg 600gtccaagaca acattgccaa ctttgggggc aacccaggat ctgtgactat cttcggcgag 660tcagcaggag gtgaaagtgt ctctgttctt gtgttaagcc cactggccaa gaacctcttc 720cacagggcca tcgctcagag tagtgtcatt ttcaatcctt gcctttttgg gagagctgcc 780agacccttgg ctaagaaaat tgctgctctt gctggctgta aaaccaccac ctccgctgcc 840atggttcact gcctgcgcca gaagactgaa gatgagctct tggaggtctc actgaaaatg 900aaatttggga ctgttgattt tcttggagac cccagagaga gctatccctt cctccctact 960gtgattgatg gagtgttgct gccaaaggca ccagaagaga ttctggctga gaagagtttc 1020aacactgtcc cctacatggt gggcatcaac aagcatgagt ttggctggat cattccaatg 1080tttttggact tcccactctc tgaaagaaaa ctggaacaga agacagctgc atccatcctg 1140tggcaggcct acccaattct taacatctct gaaaagctga ttccagcagc tattgaaaag 1200tatttaggag ggacagaaga ccctgccaca atgacagacc tgttcctgga cttgattgga 1260gacattatgt tcggtgtccc atctgtaatc gtgtcccgta gtcacagaga tgctggagcc 1320ccaacctaca tgtatgaata tcagtatcgc ccaagttttg tatcagacga tagaccccag 1380gaattgttag gagaccacgc tgatgaactc ttttctgtat 
ggggagcccc gtttttaaaa 1440gagggtgctt cagaagaaga gatcaacctc agcaacatgg tgatgaaatt ctgggccaac 1500tttgctcgga atgggaaccc taatggtgaa gggctgcctc attggccaga atatgaccag 1560aaggaaggat accttcagat tggagtccca gcacaggcag cccataggct gaaagacaag 1620gaagtggact tttggactga gctcagagcc aaggaaacag cagagaggtc atcccatagg 1680gaacatgttg aactgtga 169821553PRTXenopus laevis 21Met Ala Leu Trp Ala Ser Leu Ala Leu Ala Phe Ala Ser Leu Val Ala1 5 10 15Val Ser Gln Ala Ala Ser Leu Gly Val Val Tyr Thr Glu Gly Gly Phe 20 25 30Val Glu Gly Thr Ser Lys Lys Ile Gly Ile Leu Phe Pro Asn Tyr Ile 35 40 45Asp Val Phe Lys Gly Ile Pro Phe Ala Ala Pro Pro Lys Ala Phe Glu 50 55 60Lys Ala Gln Leu His Pro Gly Trp Ser Gly Thr Leu Lys Ala Thr Asn65 70 75 80Phe Lys Glu Arg Cys Leu Gln Ser Thr Leu Thr Gln Thr Asn Val Arg 85 90 95Gly Asp Leu Asp Cys Leu Tyr Leu Asn Ile Trp Val Pro Gln Thr Arg 100 105 110Ser Ser Val Ser Thr Asn Leu Pro Val Met Val Trp Ile Tyr Gly Gly 115 120 125Ala Phe Leu Leu Gly Ser Ser Gln Gly Ala Asn Val Leu Asp Asn Tyr 130 135 140Leu Tyr Asp Gly Glu Glu Leu Ala Leu Arg Gly Asn Val Ile Val Val145 150 155 160Thr Leu Asn Tyr Arg Leu Gly Pro Leu Gly Phe Leu Ser Thr Gly Asp 165 170 175Ser Asn Leu Pro Gly Asn Tyr Gly Leu Trp Asp Gln His Met Ala Ile 180 185 190Ala Trp Val Lys Arg Asn Ile Ala Ala Phe Gly Gly Asn Pro Asp Asn 195 200 205Ile Thr Ile Phe Gly Glu Ser Ala Gly Gly Ala Ser Val Ser Leu Gln 210 215 220Thr Leu Ser Pro Tyr Asn Lys Gly Leu Ile Lys Arg Ala Ile Ser Gln225 230 235 240Ser Gly Val Gly Met Ser Pro Trp Ala Leu Gln Ser Asn Pro Leu Phe 245 250 255Trp Thr Thr Lys Val Ala Glu Lys Val Gly Cys Pro Val His Asp Thr 260 265 270Ala Ala Met Ala Asn Cys Leu Arg Ile Ser Asp Pro Lys Ala Val Thr 275 280 285Leu Ala Tyr Lys Leu Asp Pro Ser Val Leu Glu Tyr Pro Ala Val Tyr 290 295 300Tyr Leu Gly Ile Ser Pro Val Ile Asp Gly Asp Phe Ile Pro Asp Glu305 310 315 320Pro Met Asn Leu Phe Ala Asn Ala Ala Asp Val Asp Tyr Met Ala Gly 325 330 335Val Asn Asn Met Asp Ala His Leu Phe Ala Gly Ile Asp Leu Pro Val 340 345 
350Ile Asn Gln Pro Leu Gln Lys Ile Ser Glu Ala Glu Val Tyr Asn Leu 355 360 365Val Gln Gly Leu Thr Leu Thr Lys Ile Ser Ser Ala Leu Glu Thr Ala 370 375 380Tyr Asn Leu Tyr Thr Ala Asn Trp Gly Pro Asn Pro Glu Gln Glu Asn385 390 395 400Met Lys Arg Thr Val Ile Asp Leu Glu Thr Asp Tyr Leu Phe Leu Val 405 410 415Pro Thr Gln Glu Ala Leu Ala Leu His Ser Met Asn Ala Arg Ser Gly 420 425 430Arg Thr Tyr Asn Tyr Val Phe Ser Leu Pro Thr Arg Met Pro Ile Tyr 435 440 445Pro Ser Trp Val Gly Ala Asp His Ala Asp Asp Leu Gln Tyr Val Phe 450 455 460Gly Lys Pro Phe Gln Thr Pro Leu Ala Tyr Arg Pro Lys Asp Arg Asp465 470 475 480Val Ser Ala Ala Met Ile Ala Tyr Trp Thr Asn Phe Ala Ala Thr Gly 485 490 495Asp Pro Asn Gln Gly Pro Ser Lys Val Pro Thr Asp Trp Leu Pro Tyr 500 505 510Thr Thr His Leu Gly Gln Tyr Leu Glu Ile Asn Asp Asn Met Ser Tyr 515 520 525Gln Ser Val Lys Gln Ser Leu Arg Ser Pro Tyr Val Lys Phe Trp Ala 530 535 540His Thr Tyr Arg Asn Met Ala Asn Val545 550221662DNAXenopus laevis 22atggctcttt gggcttctct tgccctggca tttgccagtc tggtggcagt gtcccaggct 60gcttcactgg gagtggttta caccgagggt ggatttgtgg aaggtaccag caagaaaatt 120gggatcctgt tcccgaatta cattgatgtt tttaagggca tcccgtttgc tgctccacca 180aaagcctttg agaaggcaca actgcaccca ggctggtcag gtacattaaa agccacaaac 240tttaaggaac ggtgcttaca atccacctta acccaaacaa atgtccgtgg tgatttagac 300tgcctctacc tgaacatctg ggttcctcag acccgctctt cagtgtccac caacctacca 360gtcatggttt ggatctacgg tggggccttc ttgctcggtt catctcaggg ggccaacgtg 420ttggataact atctgtatga tggagaggag ctcgctctcc gtggcaatgt cattgtggtg 480accttgaact acagactggg accattgggc tttctgagta ctggagactc taaccttcct 540ggcaactatg gactgtggga tcaacacatg gccatcgcct gggtgaaaag gaacattgct 600gcatttggtg ggaaccctga taacatcacc atatttggag agtctgctgg aggcgccagt 660gtctcccttc agaccctgtc tccatacaac aaaggactga tcaagcgagc catcagccag 720agtggggtgg gcatgtcccc ttgggcactt cagagcaacc cacttttctg gaccacaaag 780gtggctgaaa aagttggatg tcctgttcat gacacagccg ctatggcaaa ctgcttgagg 840atttcagacc ctaaggctgt cactttagcc tataaactgg acccgtctgt 
cctggagtat 900cccgctgttt actacttggg catctcccca gtcattgatg gtgatttcat tcctgatgaa 960ccaatgaatc tctttgctaa tgcggcggat gtggattaca tggcaggtgt aaacaacatg 1020gatgcacatt tgtttgcagg catagatctg ccagttatca atcagcctct tcagaagatt 1080tctgaggccg aagtctataa tctggtgcag ggtttgaccc tgactaaaat ctccagtgcc 1140ttggaaactg cctacaacct ttacacggcc aactggggac ccaaccctga gcaggagaat 1200atgaaaagaa ctgtcataga cttagagacg gactatcttt tcctggtccc tacccaagag 1260gcactggctc ttcactccat gaatgctcgg agtggacgga cttacaacta cgtgttctct 1320ttgccgactc gcatgcccat ttaccccagc tgggtcggag ccgatcatgc agatgatttg 1380cagtacgtgt tcgggaaacc cttccagact ccattggctt acagacccaa ggatagagat 1440gtctctgccg ccatgattgc ctattggacc aactttgctg caactggtga ccccaaccaa 1500ggaccctcca aagtgcccac cgattggctg ccttacacca ctcaccttgg ccagtacctg 1560gaaatcaacg acaacatgtc ttaccaatct gtaaagcaga gtctacgttc cccttatgtg 1620aaattctggg cccacactta ccgcaatatg gccaacgtgt aa 166223556PRTGallus gallus 23Met Ala His Trp Ala Ile Leu Ser Phe Ala Leu Cys Cys Cys Leu Gly1 5 10 15Val Ala Gln Ala Ala Thr Leu Gly Val Val Leu Thr Glu Gly Gly Phe 20 25 30Val Glu Gly Glu Ser Lys Arg Arg Gly Leu Phe Gly Ser Tyr Val Asp 35 40 45Ile Phe Arg Gly Ile Pro Phe Ala Ala Pro Pro Lys Ala Leu Gln Asp 50 55 60Pro Gln Pro His Pro Gly Trp Asp Gly Thr Leu Lys Ala Lys Lys Phe65 70 75 80Lys Asn Arg Cys Met Gln Met Thr Leu Thr Gln Thr Asp Val Arg Gly 85 90 95Lys Glu Asp Cys Leu Tyr Leu Asn Ile Trp Ile Pro Gln Gly Lys Arg 100 105 110Glu Val Ser Thr Asn Leu Pro Val Met Val Trp Ile Tyr Gly Gly Ala 115 120 125Phe Leu Leu Gly Gly Gly Gln Gly Ala Asn Phe Leu Asp Asn Tyr Leu 130 135 140Tyr Asp Gly Glu Glu Ile Ala Val Arg Gly Asn Val Ile Val Val Thr145 150 155 160Leu Asn Tyr Arg Val Gly Pro Leu Gly Phe Leu Ser Thr Gly Asp Pro 165 170 175Asn Met Pro Gly Asn Tyr Gly Leu Lys Asp Gln His Met Ala Ile Ala 180 185 190Trp Val Lys Arg Asn Ile Lys Ala Phe Gly Gly Asp Pro Asp Asn Ile 195 200 205Thr Ile Phe Gly Glu Ser Ala Gly Ala Ala Ser Val Ser Leu Gln Ile 210 215 220Leu Ser Pro Lys Asn Ala Gly Leu 
Phe Lys Arg Ala Ile Ser Gln Ser225 230 235 240Gly Val Ser Leu Cys Ser Trp Val Ile Gln Lys Asp Pro Leu Thr Trp 245 250 255Ala Lys Lys Val Gly Glu Gln Val Gly Cys Pro Thr Asp Asn Thr Thr 260 265 270Val Leu Ala Asn Cys Leu Arg Ala Thr Asp Pro Lys Ala Leu Thr Leu 275 280 285Ala His His Val Glu Leu Ile Ser Leu Pro Gly Pro Leu Val His Thr 290 295 300Leu Ser Ile Thr Pro Val Val Asp Gly Asp Phe Leu Pro Asp Met Pro305 310 315 320Glu Asn Leu Phe Ala Asn Ala Ala Asp Ile Asp Tyr Ile Ala Gly Val 325 330 335Asn Asn Met Asp Gly His Phe Phe Ala Gly Phe Asp Leu Pro Ala Ile 340 345 350Asn Arg Pro Leu Gln Lys Ile Thr Ala Ser Asp Val Tyr Asn Leu Val 355 360 365Lys Gly Leu Thr Ala Asp Arg Gly Glu Arg Gly Ala Asn Leu Thr Tyr 370 375 380Asp Leu Tyr Thr Glu Leu Trp Gly Asp Asn Pro Glu Gln Gln Val Met385 390 395 400Lys Arg Thr Val Val Asp Leu Ala Thr Asp Tyr Ile Phe Leu Ile Pro 405 410 415Thr Gln Trp Thr Leu Asn Leu His His Lys Asn Ala Arg Ser Gly Lys 420 425 430Thr Tyr Ser Tyr Leu Phe Ser Gln Pro Ser Arg Met Pro Ile Tyr Pro 435 440 445Ser Trp Val Gly Ala Asp His Ala Asp Asp Leu Gln Tyr Val Phe Gly 450 455 460Lys Pro Phe Ala Thr Pro Leu Gly Tyr Leu Pro Lys His Arg Thr Val465 470 475 480Ser Ser Ala Met Ile Ala Tyr Trp Thr Asn Phe Ala Arg Thr Gly Asp 485 490 495Pro Asn Ser Gly Asn Ser Glu Val Pro Ile Thr Trp Pro Pro Tyr Thr 500 505 510Thr Glu Gly Gly Tyr Tyr Leu Glu Ile Asn Asn Lys Ile Asn Tyr Asn 515 520 525Ser Val Lys Gln Asn Leu Arg Thr Pro Tyr Val Asn Tyr Trp Asn Ser 530 535 540Val Tyr Leu Asn Leu Pro Leu Ile Ala Ser Thr Ser545 550 555241671DNAGallus gallus 24atggctcact gggcgattct gagctttgcc ttgtgctgct gcctcggggt agcacaggcc 60gcaactctgg gtgtggtgct caccgaggga ggttttgtgg aaggcgagag taaacgacgg 120ggactctttg ggagctatgt ggatatcttc agagggatcc cttttgctgc cccgccaaag 180gcactgcaag acccccaacc tcatcctggc tgggacggaa cactgaaagc aaaaaaattt 240aagaatcgct gcatgcagat gacacttacc caaactgatg tccgtgggaa ggaggactgc 300ctctatctga acatctggat ccctcaaggg aagagagaag tctccaccaa cttgccagtg 360atggtctgga tctacggtgg 
tgccttcctt cttggagggg gtcaaggagc caacttcctt 420gacaactacc tctatgatgg tgaggagatc gccgtgcggg gcaatgtgat tgtggtgacc 480ctcaactatc gtgtggggcc cctgggcttc ctcagcactg gagacccaaa catgccaggg 540aactacgggc tgaaggatca gcacatggct attgcctggg tgaagaggaa tatcaaggcc 600tttggaggcg acccagacaa catcaccatc tttggggagt cagctggtgc tgccagtgtc 660tccctgcaga tattgtcccc aaagaacgca ggtctgttca agagagccat cagccaaagc 720ggtgtcagtc tgtgcagctg ggtcatccaa aaggacccac tcacttgggc taaaaaggtt 780ggagagcagg tgggctgccc cacagacaac accacggtct tggccaactg cctccgtgcc 840actgacccca aagccctgac actggcccac cacgtggaac tgatctccct gcctggtccc 900ctggttcata cactctccat cactcctgtt gttgatggag acttcctccc tgacatgcca 960gagaacctct ttgccaatgc tgctgacatc gactacattg ctggggtcaa caacatggat 1020ggacatttct ttgctggctt tgatttacct gctatcaacc gtccacttca gaaaatcact 1080gcgagcgatg tctataactt ggtcaaagga ctaactgcag acaggggtga gagaggagcc 1140aacttgacgt acgatctcta cacagagttg tggggtgaca acccagagca acaagtcatg 1200aagagaacag tggtggacct ggctaccgac tacattttcc tgattcccac acagtggaca 1260ctaaacctgc accacaagaa tgcccggagt ggcaagacat acagctactt gttctcccag 1320ccatctcgaa tgcccatcta tccaagctgg gtaggggcag accacgctga tgacttgcag 1380tacgtgtttg ggaaaccctt tgccacccct ctaggctacc tgcccaagca caggacagtc 1440tcatctgcca tgattgctta ttggaccaat tttgccagga ctggtgaccc caacagtggg 1500aattcagagg tgcccattac ctggccaccc tacaccactg agggtggtta ctacctggaa 1560atcaacaaca aaataaacta taattcagtg aaacagaatc tgagaacccc atacgtgaac 1620tactggaatt cagtctatct aaatctgcca ctgattgcca gcacatccta g 167125544PRTDrosophila melanogaster 25Met Ser Ile Phe Lys Arg Leu Leu Cys Leu Thr Leu Leu Trp Ile Ala1 5 10 15Ala Leu Glu Ser Glu Ala Asp Pro Leu Ile Val Glu Ile Thr Asn Gly 20 25 30Lys Ile Arg Gly Lys Asp Asn Gly Leu Tyr Tyr Ser Tyr Glu Ser Ile 35 40 45Pro Tyr Ala Glu His Pro Thr Gly Ala Leu Arg Phe Glu Ala Pro Gln 50 55 60Pro Tyr Ser His His Trp Thr Asp Val Phe Asn Ala Thr Gln Ser Pro65 70 75 80Val Glu Cys Met Gln Trp Asn Gln Phe Ile Asn Glu Asn Asn Lys Leu 85 90 95Met Gly Asp Glu Asp Cys Leu Thr 
Val Ser Ile Tyr Lys Pro Lys Lys 100 105 110Pro Asn Arg Ser Ser Phe Pro Val Val Val Leu Leu His Gly Gly Ala 115 120 125Phe Met Phe Gly Ser Gly Ser Ile Tyr Gly His Asp Ser Ile Met Arg 130 135 140Glu Gly Thr Leu Leu Val Val Lys Ile Ser Tyr Arg Leu Gly Pro Leu145 150 155 160Gly Phe Ala Ser Thr Gly Asp Arg His Leu Pro Gly Asn Tyr Gly Leu 165 170 175Lys Asp Gln Arg Leu Ala Leu Gln Trp Ile Lys Lys Asn Ile Ala His 180 185 190Phe Gly Gly Met Pro Asp Asn Ile Val Leu Ile Gly His Ser Ala Gly 195 200 205Gly Ala Ser Ala His Leu Gln Leu Leu His Glu Asp Phe Lys His Leu 210 215 220Ala Lys Gly Ala Ile Ser Val Ser Gly Asn Ala Leu Asp Pro Trp Val225 230 235 240Ile Gln Gln Gly Gly Arg Arg Arg Ala Phe Glu Leu Gly Arg Ile Val 245 250 255Gly Cys Gly His Thr Asn Val Ser Ala Glu Leu Lys Asp Cys Leu Lys 260 265 270Ser Lys Pro Ala Ser Asp Ile Val Ser Ala Val Arg Ser Phe Leu Val 275 280 285Phe Ser Tyr Val Pro Phe Ser Ala Phe Gly Pro Val Val Glu Pro Ser 290 295 300Asp Ala Pro Asp Ala Phe Leu Thr Glu Asp Pro Arg Ala Val Ile Lys305 310 315 320Ser Gly Lys Phe Ala Gln Val Pro Trp Ala Val Thr Tyr Thr Thr Glu 325 330 335Asp Gly Gly Tyr Asn Ala Ala Gln Leu Leu Glu Arg Asn Lys Leu Thr 340 345 350Gly Glu Ser Trp Ile Asp Leu Leu Asn Asp Arg Trp Phe Asp Trp Ala 355 360 365Pro Tyr Leu Leu Phe Tyr Arg Asp Ala Lys Lys Thr Ile Lys Asp Met 370 375 380Asp Asp Leu Ser Phe Asp Leu Arg Gln Gln Tyr Leu Ala Asp Arg Arg385 390 395 400Phe Ser Val Glu Ser Tyr Trp Asn Val Gln Arg Met Phe Thr Asp Val 405 410 415Leu Phe Lys Asn Ser Val Pro Ser Ala Ile Asp Leu His Arg Lys Tyr 420 425 430Gly Lys Ser Pro Val Tyr Ser Phe Val Tyr

Asp Asn Pro Thr Asp Ser 435 440 445Gly Val Gly Gln Leu Leu Ser Asn Arg Thr Asp Val His Phe Gly Thr 450 455 460Val His Gly Asp Asp Phe Phe Leu Ile Phe Asn Thr Ala Ala Tyr Arg465 470 475 480Ile Gly Ile Arg Pro Asp Glu Glu Val Ile Ser Lys Lys Phe Ile Gly 485 490 495Met Leu Glu Asp Phe Ala Leu Asn Asp Lys Gly Thr Leu Thr Phe Gly 500 505 510Glu Cys Asn Phe Gln Asn Asn Val Asn Ser Lys Glu Tyr Gln Val Leu 515 520 525Arg Ile Ser Arg Asn Ala Cys Lys Asn Glu Glu Tyr Ala Arg Phe Pro 530 535 540261635DNADrosophila melanogaster 26atgagtatat tcaaacggct gttgtgcctg actttgctgt ggatagcagc tttagaatct 60gaagctgatc ccttgattgt tgagataaca aatggaaaaa tccgtggcaa agataatggg 120ttgtactaca gctacgaatc gattccctat gccgagcatc caactggtgc cctccgtttt 180gaagcacctc agccgtatag tcatcattgg actgatgttt tcaatgccac gcagtctcca 240gttgagtgca tgcagtggaa tcagtttata aacgaaaaca ataagctgat gggtgatgag 300gattgcttaa cggtaagcat ctataagcca aagaaaccca atcggagcag ctttcctgtc 360gtagtactcc tgcatggagg tgctttcatg ttcggtagtg gatccatata tggacacgac 420tccattatgc gtgagggaac tttgcttgtg gtaaaaataa gctatcgtct tggaccattg 480ggttttgcaa gtaccggcga tagacacttg ccgggaaact atggtctaaa ggatcaacgt 540ctggccctac aatggatcaa gaagaacatt gctcactttg gtggaatgcc agataatatt 600gtgctcattg gtcactctgc aggcggtgct tcggctcatt tgcagctgtt gcacgaggat 660ttcaaacatt tggccaaagg agcgatttcg gtgagcggca atgcattgga tccttgggtc 720atacagcagg gtggacgacg acgtgcattt gaactgggtc gtattgtcgg ttgtggacac 780acaaatgtct ccgcagaact caaggactgc ttgaagtcta agccggctag cgatatagtc 840tctgctgtcc gaagcttcct tgtgttttcc tatgtaccct tcagtgcttt tggacctgtt 900gtggagccgt cagatgcacc agacgccttt ctaaccgagg acccaagagc agtgattaag 960agcgggaagt ttgcccaagt cccttgggct gtgacgtaca ccactgagga cgggggatac 1020aacgctgctc agctgttgga aagaaacaaa ttaactggcg agagttggat tgacctactc 1080aatgatcgat ggtttgattg ggcaccatac ttgctcttct atcgggacgc caagaaaacc 1140atcaaagata tggatgatct ttcatttgat ctcaggcagc agtatctagc agatcggcga 1200ttcagtgtgg aaagttattg gaacgtgcag cgaatgttta ctgatgttct tttcaagaat 1260agcgtgccaa gtgcaataga 
tcttcaccga aagtatggca aaagtccggt ttattctttt 1320gtctacgata atcctaccga ttccggagtg ggtcaattgc tttccaatcg aacagatgta 1380cattttggta ctgtccacgg agatgacttt ttcttgattt tcaatacagc tgcataccgt 1440atcggcattc gtccggatga agaagttatt tcaaaaaagt ttataggtat gctggaggat 1500ttcgcactca acgataaggg aacattaaca tttggagaat gtaatttcca aaataatgtg 1560aacagcaagg aatatcaagt gctgcgtatt tcacgaaacg cttgtaaaaa cgaggaatat 1620gctcggtttc cctaa 163527570PRTBombyx mori 27Met Cys Thr Lys Phe Ala Val Leu Leu Tyr Tyr Val Ile Val Gly Ser1 5 10 15Val Arg Ala Tyr Ser Ser Pro Ala Ala Ser Pro Pro Ser Ser Cys Asn 20 25 30Val Val Ala Gln Thr Glu Ser Gly Trp Val Cys Gly Arg Thr Arg Arg 35 40 45Ala Glu Ala Ser Thr Leu Tyr Ala Ser Phe Arg Gly Val Pro Tyr Ala 50 55 60Lys Gln Pro Val Gly Glu Leu Arg Phe Lys Glu Leu Gln Pro Ala Glu65 70 75 80Pro Trp Thr Asp Tyr Leu Asp Ala Thr Glu Glu Gly Pro Val Cys Tyr 85 90 95Gln Thr Asp Val Leu Tyr Gly Ser Leu Met Lys Pro His Gly Met Asp 100 105 110Glu Ala Cys Ile Tyr Ala Asn Ile His Val Pro Leu Asn Ala Leu Pro 115 120 125Ala Ala Gly Glu Thr Pro Thr Lys Pro Gly Leu Pro Ile Leu Val Phe 130 135 140Ile His Gly Gly Gly Phe Ala Phe Gly Ser Gly Asp Ala Asp Leu Tyr145 150 155 160Gly Pro Glu Tyr Leu Val Thr Arg Asn Val Val Val Ile Thr Phe Asn 165 170 175Tyr Arg Leu Asn Phe Phe Gly Phe Phe Ser Leu Asp Thr Pro Lys Val 180 185 190Pro Gly Asn Asn Gly Leu Arg Asp Met Val Thr Leu Leu Arg Trp Val 195 200 205Lys Arg Asn Ala Arg Ala Phe Gly Gly Asn Pro Asp Asn Val Thr Leu 210 215 220Ala Gly Gln Ser Ala Gly Ala Ala Ala Ala His Leu Leu Thr Leu Ser225 230 235 240Lys Ala Thr Glu Gly Leu Val Ser Arg Ala Ile Leu Met Ser Gly Ala 245 250 255Gly Thr Ser Thr Phe Phe Thr Thr Ser Pro Ile Phe Ser Gln Ser Ile 260 265 270Asn Lys Ile Leu Phe Ser Ile Leu Gly Val Asn Ser Thr Asn Pro Asp 275 280 285Glu Ile His Glu Lys Leu Val Ala Met Pro Val Glu Lys Leu Asn Glu 290 295 300Ala Asn Arg Ile Leu Ile Asp Gln Ile Gly Leu Thr Thr Phe Phe Pro305 310 315 320Val Val Glu Thr Pro His Pro Gly Ile Thr Thr Ile Leu Asp Glu Asp 
325 330 335Pro Asn Ile Leu Val Gln Gln Gly Arg Gly Lys Asp Ile Pro Leu Ile 340 345 350Ile Gly Phe Thr Asn Ser Glu Cys His Met Phe Gln His Arg Phe Glu 355 360 365Gln Ile Asp Ile Val Ser Lys Ile Asn Glu Asn Pro Ala Ile Leu Val 370 375 380Pro Ser Asn Leu Leu Tyr Ser Ser Thr Pro Glu Thr Ile Ala Leu Val385 390 395 400Ser Asn Gln Ile Ser Gln Arg Tyr Phe Asn Gly Ser Val Asp Leu Glu 405 410 415Gly Phe Ile Asn Met Cys Thr Asp Ser Tyr Tyr Lys Tyr Pro Ala Met 420 425 430Lys Leu Ala Glu Lys Arg Ser Ala Ala Gly Asp Ala Pro Val Phe Leu 435 440 445Tyr Gln Phe Ser Tyr Asp Gly Tyr Ser Val Phe Lys Gln Ala Phe His 450 455 460Leu His Phe Asn Gly Ala Gly His Ala Asp Asp Leu Thr Tyr Val Leu465 470 475 480Lys Val Asn Ser Ala Ser Gly Thr Ser Ser Ser Gln Lys Ala Asp Asp 485 490 495Glu Met Lys Tyr Trp Met Thr Thr Phe Val Thr Asn Phe Met Arg Cys 500 505 510Ser Ala Pro Met Cys Asp Glu Thr Thr Ala Trp Pro Pro Val Thr Pro 515 520 525Arg Glu Leu Gln Tyr Gln Asp Ile Ile Thr Pro Asn Leu Cys His Gln 530 535 540Thr Ser Leu Thr Lys Glu Gln Leu Glu Met Lys Asn Phe Phe Asp Lys545 550 555 560Ile His Asn Gly Gly Glu Ser Arg Leu Lys 565 570281713DNABombyx mori 28atgtgtacca aattcgctgt attactatat tacgttatcg tgggctcagt aagggcatac 60tcaagcccag cagcgtcgcc gccgtcgtcg tgcaatgtgg tcgcgcagac ggagtcaggc 120tgggtgtgtg gccgtactcg ccgggcggaa gcaagcactt tatacgccag tttccgggga 180gtgccttatg ccaagcaacc agtcggagaa cttcgattta aggaattaca accagcagag 240ccatggaccg actacctaga tgccaccgag gaaggtccag tttgctacca gacagacgtt 300ctttatggaa gtctaatgaa acctcacggc atggatgagg catgcatcta cgccaatata 360catgtgcctt tgaacgccct gccggcagct ggtgagacgc ctacgaagcc tggtcttcca 420atattagtct ttattcacgg aggtggcttc gcgtttggat ctggtgatgc tgacctatat 480ggaccggagt atcttgtcac aagaaacgtt gttgtcatca cttttaacta caggttgaat 540ttctttggat ttttctcatt ggatactcct aaagtccccg gaaacaatgg tcttagggac 600atggtgactc tgctccgttg ggtgaagagg aacgccagag cctttggagg taatcctgac 660aacgtgacct tggcgggcca gagcgctggg gccgctgctg ctcaccttct caccttatcc 720aaggccactg aaggcttagt ttcaagggct 
atattgatga gcggtgctgg aacatccact 780ttctttacaa catctcctat cttctcccag tccatcaaca aaatcttgtt ttccatcctt 840ggcgtcaact ccactaatcc tgatgagata cacgagaaac tcgtcgccat gccggttgaa 900aaactgaatg aagccaacag aatattgatt gatcaaatcg gccttaccac ttttttccca 960gtggtagaaa caccgcatcc cggaattacc actatattag atgaagatcc aaatattctg 1020gtccagcaag gccgcggtaa agacataccc ttgattatag gtttcacgaa ttctgaatgc 1080catatgttcc agcatagatt tgaacagatc gatatagtat ctaagatcaa tgaaaatcca 1140gcaatcttag ttccttccaa tctactgtac tcctcgactc ctgagacgat tgctttggtc 1200tcgaatcaaa tcagccaaag atacttcaat ggtagcgtag atctggaggg ctttatcaat 1260atgtgtaccg atagttacta caagtaccca gccatgaagt tggccgagaa gagatctgcg 1320gcaggtgatg ctccggtatt tctgtaccag ttctcttacg acggttacag cgtgttcaag 1380caagcctttc atttgcattt caacggtgct ggacacgcgg acgacttgac atacgtgctg 1440aaagtgaatt ctgcgtcagg gactagttca tcacaaaaag cagacgatga aatgaaatat 1500tggatgacga cgttcgtcac aaactttatg cgatgcagtg ctcctatgtg cgatgaaact 1560acagcgtggc caccagttac accgcgggaa ctacaatacc aagacattat tacaccaaac 1620ttatgccacc aaactagtct taccaaagaa caactcgaaa tgaagaattt cttcgataag 1680atccataatg gaggtgaaag cagacttaag taa 171329360PRTArabidopsis thaliana 29Met Trp Thr Ser Lys Thr Ile Ser Phe Thr Leu Phe Ile Thr Thr Thr1 5 10 15Leu Leu Gly Ser Cys Asn Ala Ser Ala Lys Ala Lys Thr Gln Pro Leu 20 25 30Phe Pro Ala Ile Leu Ile Phe Gly Asp Ser Thr Val Asp Thr Gly Asn 35 40 45Asn Asn Tyr Pro Ser Gln Thr Ile Phe Arg Ala Lys His Val Pro Tyr 50 55 60Gly Ile Asp Leu Pro Asn His Ser Pro Asn Gly Arg Phe Ser Asn Gly65 70 75 80Lys Ile Phe Ser Asp Ile Ile Ala Thr Lys Leu Asn Ile Lys Gln Phe 85 90 95Val Pro Pro Phe Leu Gln Pro Asn Leu Thr Asp Gln Glu Ile Val Thr 100 105 110Gly Val Cys Phe Ala Ser Ala Gly Ala Gly Tyr Asp Asp Gln Thr Ser 115 120 125Leu Thr Thr Gln Ala Ile Arg Val Ser Glu Gln Pro Asn Met Phe Lys 130 135 140Ser Tyr Ile Ala Arg Leu Lys Ser Ile Val Gly Asp Lys Lys Ala Met145 150 155 160Lys Ile Ile Asn Asn Ala Leu Val Val Val Ser Ala Gly Pro Asn Asp 165 170 175Phe Ile Leu Asn Tyr Tyr Glu Val 
Pro Ser Trp Arg Arg Met Tyr Pro 180 185 190Ser Ile Ser Asp Tyr Gln Asp Phe Val Leu Ser Arg Leu Asn Asn Phe 195 200 205Val Lys Glu Leu Tyr Ser Leu Gly Cys Arg Lys Ile Leu Val Gly Gly 210 215 220Leu Pro Pro Met Gly Cys Leu Pro Ile Gln Met Thr Ala Gln Phe Arg225 230 235 240Asn Val Leu Arg Phe Cys Leu Glu Gln Glu Asn Arg Asp Ser Val Leu 245 250 255Tyr Asn Gln Lys Leu Gln Lys Leu Leu Pro Gln Thr Gln Ala Ser Leu 260 265 270Thr Gly Ser Lys Ile Leu Tyr Ser Asp Val Tyr Asp Pro Met Met Glu 275 280 285Met Leu Gln Asn Pro Ser Lys Tyr Gly Phe Lys Glu Thr Thr Arg Gly 290 295 300Cys Cys Gly Thr Gly Phe Leu Glu Thr Ser Phe Met Cys Asn Ala Tyr305 310 315 320Ser Ser Met Cys Gln Asn Arg Ser Glu Phe Leu Phe Phe Asp Ser Ile 325 330 335His Pro Ser Glu Ala Thr Tyr Asn Tyr Ile Gly Asn Val Leu Asp Thr 340 345 350Lys Ile Arg Gly Trp Leu Lys Ala 355 360301083DNAArabidopsis thaliana 30atgtggactt ctaaaaccat aagcttcact ctcttcatca caacaacact tctcgggtcc 60tgcaacgcat ctgcaaaggc caaaacgcaa ccgctattcc cagcgattct aatctttggt 120gattcaacag tcgacacagg caacaataat tacccttcac aaacaatctt cagagctaaa 180catgttcctt acggaattga tctcccaaac cactcaccta acggaagatt ctcaaacggg 240aaaattttct ccgacataat cgcaaccaaa ctcaacatca aacagtttgt tcctcctttc 300ttacaaccaa atctcaccga ccaagaaatt gtaaccggag tctgtttcgc atcagcaggt 360gccggttacg atgaccaaac cagtctcacg acacaagcga ttcgtgtctc ggaacaacca 420aatatgttca agagttacat tgctcgtctt aagagtatcg taggagacaa gaaagccatg 480aagatcataa acaatgcttt ggtggttgtg agtgcagggc ctaatgattt catcttgaat 540tattacgagg ttccctcatg gcgtcgcatg tatcctagca tttctgatta ccaagatttt 600gttcttagta ggcttaacaa tttcgtgaag gagctttaca gcctaggttg ccggaaaatt 660ttggtcggag gtttaccgcc aatgggatgt ttaccgattc aaatgactgc tcaattccgc 720aacgtcctaa ggttttgctt ggaacaagag aacagagact ctgttttata caatcagaaa 780cttcagaagc tcttacctca gacacaagca tctcttacag gaagcaagat cctttactct 840gatgtctatg accctatgat ggagatgctc caaaacccta gcaaatacgg atttaaagag 900acgacgagag gatgttgtgg aacagggttc ttggagacga gcttcatgtg taatgcttat 960tcttccatgt gtcagaatcg 
ctcggagttt ctgttctttg actcgattca tccatctgaa 1020gctacctaca attacattgg taatgttctg gatactaaga ttcgtgggtg gcttaaggct 1080taa 108331300PRTMalus pumila 31Met Glu Pro Ile Asn Asp Glu Ile Ala Arg Glu Phe Arg Phe Phe Arg1 5 10 15Val Tyr Lys Asp Gly Arg Ile Glu Ile Phe Tyr Lys Thr Gln Lys Val 20 25 30Pro Pro Ser Thr Asp Glu Ile Thr Gly Val Gln Ser Lys Asp Ile Thr 35 40 45Ile Gln Pro Glu Pro Ala Val Ser Ala Arg Ile Phe Leu Pro Lys Ile 50 55 60His Glu Pro Ala Gln Lys Leu Pro Val Leu Leu Tyr Leu His Gly Gly65 70 75 80Gly Phe Ile Phe Glu Ser Ala Phe Ser Pro Ile Tyr His Asn Phe Val 85 90 95Gly Arg Leu Ala Ala Glu Ala His Ala Val Val Val Ser Val Glu Tyr 100 105 110Gly Leu Phe Pro Asp Arg Pro Val Pro Ala Cys Tyr Glu Asp Ser Trp 115 120 125Ala Ala Leu Lys Trp Leu Ala Ser His Ala Ser Gly Asp Gly Thr Glu 130 135 140Ser Trp Leu Asn Lys Tyr Ala Asp Phe Asp Arg Leu Phe Ile Gly Gly145 150 155 160Asp Ser Gly Gly Ala Asn Leu Ser His Tyr Leu Ala Val Arg Val Gly 165 170 175Ser Leu Gly Gln Pro Asp Leu Lys Ile Gly Gly Val Val Leu Val His 180 185 190Pro Phe Phe Gly Gly Leu Glu Glu Asp Asp Gln Met Phe Leu Tyr Met 195 200 205Cys Thr Glu Asn Gly Gly Leu Glu Asp Arg Arg Leu Arg Pro Pro Pro 210 215 220Glu Asp Phe Lys Arg Leu Ala Cys Gly Lys Met Leu Ile Phe Phe Ala225 230 235 240Ala Gly Asp His Leu Arg Gly Ala Gly Gln Leu Tyr Tyr Glu Asp Leu 245 250 255Lys Lys Ser Glu Trp Gly Gly Ser Val Asp Val Val Glu His Gly Glu 260 265 270Gly His Val Phe His Leu Phe Asn Ser Asp Cys Glu Asn Ala Ala Asp 275 280 285Leu Val Lys Lys Phe Gly Ser Phe Ile Asn Gln Lys 290 295 30032903DNAMalus pumila 32atggagccaa tcaacgacga gattgctcgt gaatttcgct tcttccgggt gtacaaagac 60ggtcgcatag aaatattcta caagacacaa aaggtccccc cttcgactga cgaaatcact 120ggtgtccaat ccaaggacat cacaattcaa cccgaacccg ccgtttctgc ccgtatcttc 180cttcccaaga tccacgagcc ggcccaaaag ctccccgttc tcctctacct ccacggcggt 240gggtttatct tcgagtctgc cttctctcct atttatcaca acttcgtcgg acgattggca 300gctgaagccc acgcagtcgt agtgtccgtc gaatacgggt tgttcccgga tcgccccgta 360cccgcttgct 
atgaagactc atgggcggcg ctcaaatggc tcgcgtccca cgctagtggg 420gatggaaccg agtcgtggtt aaacaagtat gctgactttg accggttgtt tataggcggg 480gacagcggtg gagcaaattt gtcgcactat ttggctgtcc gggtcgggtc cctcgggcaa 540ccggatttga agattggtgg agttgtgctg gtgcatccgt tctttggggg cttggaggag 600gacgaccaaa tgtttctgta catgtgtacg gagaacggtg ggttggagga tcgtaggctg 660aggccgcccc cagaggattt caaaaggcta gcttgcggga agatgttgat atttttcgcg 720gcgggagacc atctgagagg ggcgggccag ctgtactatg aggacctgaa aaagagtgag 780tggggcggga gtgtcgacgt ggtggagcat ggtgaaggac atgtgtttca cttgttcaat 840tcggactgtg agaatgctgc ggacttggtg aaaaaatttg gatccttcat caaccaaaag 900tag 9033335DNAArtificial SequencePrimer 33cgtctagaaa gagaatgatg aaaattgttc cgccg 353446DNAArtificial SequencePrimer 34cggttaactt aatggtgatg gtgatggtgc caatctaacg attcaa 4635762DNAArtificial SequenceCarE-his 35atgatgaaaa ttgttccgcc gaagccgttt ttctttgaag ccggggagcg ggcggtgctg 60cttttgcatg ggtttaccgg caattccgcc gacgttcgga tgcttgggcg attcttggaa 120tcgaaagggt atacgtgcca cgctccgatt tacaaagggc atggcgtgcc gccggaagag 180ctcgtccaca ccggaccgga tgattggtgg caagacgtca tgaacggcta tcagtttttg 240aaaaacaaag gctacgaaaa aattgccgtg gctggattgt cgcttggagg cgtattttct 300ctcaaattag gctacactgt acctacacaa ggcattgtga cgatgtgcgc gccgatgtac 360atcaaaagcg aagaaacgat gtacgaaggt gtgctcgagt atgcgcgcga gtataaaaag 420cgggaaggga aatcagagga acaaatcgaa caggaaatgg aacggttcaa acaaacgccg 480atgaagacgt tgaaagcctt gcaagaactc attgccgatg tgcgcgccca ccttgatttg 540gtttatgcac cgacgttcgt cgtccaagcg cgccatgatg agatgatcaa tccagacagc 600gcgaacatca tttataacga aattgaatcg ccggtcaaac aaatcaaatg gtatgagcaa 660tcaggccatg tgattacgct tgatcaagaa aaagatcagc tgcatgaaga tatttatgca 720tttcttgaat cgttagattg gcaccatcac catcaccatt ga 7623644DNAArtificial SequencePrimer 36cgctcgagaa aagagaggct gaagctatga tgaaaattgt tccg 443729DNAArtificial SequencePrimer 37cgcgcgcata ttcgaggacg ccttcgtac 293829DNAArtificial SequencePrimer 38cgtcctcgaa tatgcgcgcg agtataaaa

293947DNAArtificial SequencePrimer 39cggaattctt aatggtgatg gtgatggtgc caatctaacg attcaag 4740762DNAArtificial SequenceMutated CarE-his 40atgatgaaaa ttgttccgcc gaagccgttt ttctttgaag ccggggagcg ggcggtgctg 60cttttgcatg ggtttaccgg caattccgcc gacgttcgga tgcttgggcg attcttggaa 120tcgaaagggt atacgtgcca cgctccgatt tacaaagggc atggcgtgcc gccggaagag 180ctcgtccaca ccggaccgga tgattggtgg caagacgtca tgaacggcta tcagtttttg 240aaaaacaaag gctacgaaaa aattgccgtg gctggattgt cgcttggagg cgtattttct 300ctcaaattag gctacactgt acctacacaa ggcattgtga cgatgtgcgc gccgatgtac 360atcaaaagcg aagaaacgat gtacgaaggc gtcctcgaat atgcgcgcga gtataaaaag 420cgggaaggga aatcagagga acaaatcgaa caggaaatgg aacggttcaa acaaacgccg 480atgaagacgt tgaaagcctt gcaagaactc attgccgatg tgcgcgccca ccttgatttg 540gtttatgcac cgacgttcgt cgtccaagcg cgccatgatg agatgatcaa tccagacagc 600gcgaacatca tttataacga aattgaatcg ccggtcaaac aaatcaaatg gtatgagcaa 660tcaggccatg tgattacgct tgatcaagaa aaagatcagc tgcatgaaga tatttatgca 720tttcttgaat cgttagattg gcaccatcac catcaccatt aa 762
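The relationship between SEQ ID NO: 35 (CarE-his) and SEQ ID NO: 40 (Mutated CarE-his) can be checked directly from the listing. The following is a minimal sketch in Python, not part of the application: it compares nucleotides 361-420 as copied from the two entries above and reports where they differ. The variable and function names are illustrative assumptions.

```python
# Nucleotides 361-420, copied from the listing with its 10-base grouping.
# Names below (OFFSET, care_his, mutated, differences) are illustrative only.
OFFSET = 360  # number of nucleotides that precede the excerpt

care_his = ("atcaaaagcg aagaaacgat gtacgaaggt "
            "gtgctcgagt atgcgcgcga gtataaaaag").replace(" ", "")  # SEQ ID NO: 35
mutated = ("atcaaaagcg aagaaacgat gtacgaaggc "
           "gtcctcgaat atgcgcgcga gtataaaaag").replace(" ", "")   # SEQ ID NO: 40


def differences(a, b, offset=0):
    """Return (1-based position, base in a, base in b) for each mismatch."""
    return [(offset + i + 1, x, y)
            for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = differences(care_his, mutated, OFFSET)
for position, old, new in diffs:
    print(f"position {position}: {old} -> {new}")

# ctcgag is the XhoI recognition sequence; check for it in each excerpt.
print("ctcgag in SEQ ID NO: 35 excerpt:", "ctcgag" in care_his)
print("ctcgag in SEQ ID NO: 40 excerpt:", "ctcgag" in mutated)
```

Within this excerpt the two sequences differ at positions 390, 393 and 399, and the ctcgag hexamer present in SEQ ID NO: 35 is absent from SEQ ID NO: 40. Read in the frame starting at the initial atg, the affected codons change from ggt to ggc, gtg to gtc and gag to gaa, which encode the same amino acids under the standard genetic code. The two listings also differ in their final codon (tga versus taa), which lies outside the excerpt.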

* * * * *
