Process For The Production Of Gamma-aminobutyric Acid Zelder; Oskar ; et al. [Herold; Andrea]

Patent Application Summary

U.S. patent application number 14/289731 was filed with the patent office on 2014-05-29 and published on 2014-09-11 for a process for the production of gamma-aminobutyric acid. This patent application is currently assigned to BASF SE. The applicants listed for this patent are Andrea Herold, Weol Kyu Jeong, Corinna Klopprogge, Hartwig Schroder and Oskar Zelder. Invention is credited to Andrea Herold, Weol Kyu Jeong, Corinna Klopprogge, Hartwig Schroder and Oskar Zelder.

Application Number	20140256005 14/289731
Document ID	/
Family ID	40578627
Filed Date	2014-05-29

United States Patent Application	20140256005
Kind Code	A1
Zelder; Oskar ; et al.	September 11, 2014

PROCESS FOR THE PRODUCTION OF GAMMA-AMINOBUTYRIC ACID

Abstract

The present invention relates to a novel method for the fermentative production of gamma-aminobutyric acid (GABA) by cultivating a recombinant microorganism expressing an enzyme having a glutamate decarboxylase activity. The present invention also relates to corresponding recombinant hosts, recombinant vectors, expression cassettes and nucleic acids suitable for preparing such hosts, as well as to a method for preparing polyamides making use of GABA as obtained by fermentative production.

Inventors:

Zelder; Oskar; (Speyer, DE) ; Jeong; Weol Kyu; (Miryongdong Gunsan, KR) ; Klopprogge; Corinna; (Mannheim, DE) ; Herold; Andrea; (Ketsch, DE) ; Schroder; Hartwig; (Nussloch, DE)

Applicant:

Name	City	State	Country	Type
Zelder; Oskar	Speyer		DE	
Jeong; Weol Kyu	Miryongdong Gunsan		KR	
Klopprogge; Corinna	Mannheim		DE	
Herold; Andrea	Ketsch		DE	
Schroder; Hartwig	Nussloch		DE	

Assignee:

BASF SE
Ludwigshafen
DE

Family ID:

40578627

Appl. No.:

14/289731

Filed:

May 29, 2014

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
12918241	Aug 18, 2010	8771998
PCT/EP2009/001225	Feb 20, 2009
14289731

Current U.S. Class:	435/129 ; 435/128; 435/232; 435/252.32; 435/320.1; 536/23.2
Current CPC Class:	C12P 13/005 20130101; C12N 9/88 20130101; C12P 13/02 20130101
Class at Publication:	435/129 ; 536/23.2; 435/128; 435/232; 435/320.1; 435/252.32
International Class:	C12P 13/00 20060101 C12P013/00; C12P 13/02 20060101 C12P013/02; C12N 9/88 20060101 C12N009/88

Foreign Application Data

Date	Code	Application Number
Feb 21, 2008	EP	08151744.3

Claims

1. A method for the fermentative production of gamma-aminobutyric acid (GABA), comprising cultivating a recombinant microorganism derived from a parent microorganism having the ability to produce glutamate and additionally having the ability to express a heterologous glutamate decarboxylase (E.C. 4.1.1.15), wherein said microorganism is a Corynebacterium, so that glutamate is converted to GABA.

2. The method of claim 1, wherein the microorganism is Corynebacterium glutamicum.

3. The method of claim 1, wherein said heterologous glutamate decarboxylase is of eukaryotic origin.

4. The method of claim 3, wherein said heterologous glutamate decarboxylase is a plant glutamate decarboxylase or a chimeric glutamate decarboxylase comprising amino acid sequence portions derived from plant glutamate decarboxylase.

5. The method of claim 3, wherein said heterologous glutamate decarboxylase is a decarboxylase of a plant of the genus Solanum, in particular from Solanum tuberosum.

6. The method of claim 3, wherein the heterologous glutamate decarboxylase is from Solanum tuberosum and comprises an amino acid sequence from Thr94 to Leu336 of SEQ ID NO: 2 or a sequence having at least 92% identity thereto.

7. The method of claim 6, wherein the heterologous glutamate decarboxylase is N-terminally and/or C-terminally supplemented by the corresponding terminal amino acid sequences of a glutamate decarboxylase from Solanum tuberosum.

8. The method of claim 6, wherein the heterologous glutamate decarboxylase is N-terminally and/or C-terminally supplemented by the corresponding terminal amino acid sequences of a glutamate decarboxylase of a second plant different from Solanum tuberosum.

9. The method of claim 8, wherein said second plant is Solanum lycopersicum.

10. The method of claim 9, wherein said glutamate decarboxylase comprises an amino acid sequence according to SEQ ID NO: 2 or a sequence having at least 80% identity thereto.

11. The method of claim 1, wherein the enzyme is encoded by a nucleic acid sequence, which is adapted to the codon usage of said parent microorganism having the ability to produce glutamate having glutamate decarboxylase activity.

12. The method of claim 1, wherein the enzyme having glutamate decarboxylase activity is encoded by a nucleic acid sequence comprising a coding sequence selected from the group consisting of
a) position 472 to 1200 according to SEQ ID NO:1 or from position 193 to 1605 according to SEQ ID NO:1;
b) a coding sequence encoding a glutamate decarboxylase of a plant of the genus Solanum, in particular from Solanum tuberosum;
c) a coding sequence encoding a glutamate decarboxylase from Solanum tuberosum and comprising an amino acid sequence from Thr94 to Leu336 of SEQ ID NO: 2 or a sequence having at least 92% identity thereto;
d) a coding sequence encoding a glutamate decarboxylase that is N-terminally and/or C-terminally supplemented by the corresponding terminal amino acid sequences of a glutamate decarboxylase from Solanum tuberosum;
e) a coding sequence encoding a glutamate decarboxylase that is N-terminally and/or C-terminally supplemented by the corresponding terminal amino acid sequences of a glutamate decarboxylase of a second plant different from Solanum tuberosum;
f) a coding sequence encoding a glutamate decarboxylase that is N-terminally and/or C-terminally supplemented by the corresponding terminal amino acid sequences of a glutamate decarboxylase of Solanum lycopersicum;
g) a coding sequence encoding a glutamate decarboxylase comprising an amino acid sequence according to SEQ ID NO: 2 or a sequence having at least 80% identity thereto, and
h) a coding sequence encoding a glutamate decarboxylase, wherein the coding sequence is adapted to the codon usage of said parent microorganism having the ability to produce glutamate having glutamate decarboxylase activity.

13. A glutamate decarboxylase as defined in claim 5.

14. A nucleic acid sequence comprising the coding sequence for a glutamate decarboxylase as claimed in claim 13.

15. An expression cassette, comprising at least one nucleic acid sequence as claimed in claim 14, which sequence is operatively linked to at least one regulatory nucleic acid sequence.

16. A recombinant vector, comprising at least one expression cassette as claimed in claim 15.

17. A prokaryotic or eukaryotic host, transformed with at least one vector as claimed in claim 16.

18. The host of claim 17, selected from a recombinant Corynebacterium.

19. The host of claim 18, which is recombinant Corynebacterium glutamicum.

20. The method of claim 1, wherein the GABA thus produced is isolated from the fermentation broth.

21. A method of preparing a polyamide, which method comprises a) preparing GABA by the method of claim 1; b) isolating GABA; and c) polymerizing said GABA, optionally in the presence of at least one further suitable polyvalent co-monomer, selected from aminocarboxylic acids, and hydroxycarboxylic acids.

22. The method of claim 1, wherein the recombinant microorganism further comprises at least one deregulated gene selected from the group consisting of:
i) amplification of isocitrate dehydrogenase;
ii) amplification of glutamate dehydrogenase;
iii) amplification of phosphoenolpyruvate carboxylase;
iv) releasing feedback inhibition by point mutation and amplification of pyruvate carboxylase;
v) attenuation of 2-oxoglutarate dehydrogenase;
vi) attenuation of isocitrate lyase;
vii) attenuation of phosphoenolpyruvate carboxykinase;
viii) attenuation of glutamine synthetase; and
ix) attenuation of glutamate exporter.

Description

RELATED APPLICATIONS

[0001] This application is a continuation of patent application Ser. No. 12/918,241 filed Aug. 18, 2010, which is a national stage application (under 35 U.S.C. § 371) of PCT/EP2009/001225, filed Feb. 20, 2009, which claims benefit of European application 08151744.3, filed Feb. 21, 2008. The entire content of each aforementioned application is hereby incorporated by reference in its entirety.

SUBMISSION OF SEQUENCE LISTING

[0002] The Sequence Listing associated with this application is filed in electronic format via EFS-Web and hereby incorporated by reference into the specification in its entirety. The name of the text file containing the Sequence Listing is Sequence_Listing_074012_0155_01. The size of the text file is 64 KB, and the text file was created on May 29, 2014.

BRIEF SUMMARY OF INVENTION

[0003] The present invention relates to a novel method for the fermentative production of gamma-aminobutyric acid (GABA) by cultivating a recombinant microorganism expressing an enzyme having a glutamate decarboxylase activity. The present invention also relates to corresponding recombinant hosts, recombinant vectors, expression cassettes and nucleic acids suitable for preparing such hosts, as well as to a method for preparing polyamides making use of GABA as obtained by fermentative production.

BACKGROUND OF THE INVENTION

[0004] GABA (CAS number 56-12-2) is an important, ubiquitous non-protein amino acid in both prokaryotic and eukaryotic organisms. It has various biological functions: for example, it acts as a representative depressive neurotransmitter in the sympathetic nervous system, and it is effective in lowering the blood pressure of experimental animals and humans. The compound is synthesized from glutamate by glutamate decarboxylase (GAD; EC 4.1.1.15).

[0005] GABA is used in different technical fields. For example, GABA-enriched food can be used as a dietary supplement and nutraceutical to help treat sleeplessness, depression and autonomic disorders, chronic alcohol-related symptoms, and to stimulate immune cells. The compound can also be used as a raw material for the production of polyamides and of pyrrolidone.

[0006] A suitable way for the fermentative production of said commercially interesting chemical compound has not yet been described.

[0007] The object of the present invention is, therefore, to provide a suitable method for the fermentative production of GABA or corresponding salts thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] FIG. 1 depicts the comparison of contigs from potato ESTs with GAD homologs from plants.

[0009] FIG. 2 depicts the schematic drawing of a chimera GAD gene of the invention.

[0010] FIG. 3 depicts the plasmid map of the pClik5aMCS cloning vector.

SUMMARY OF THE INVENTION

[0011] The above-mentioned problem was solved by the present invention teaching the fermentative production of GABA or a salt thereof by cultivating a recombinant glutamate producing microorganism expressing GAD enzyme which enzyme converts glutamate that is formed in said microorganism to GABA.

DETAILED DESCRIPTION OF THE INVENTION

1. Preferred Embodiments

[0012] The present invention relates to a method for the fermentative production of gamma-aminobutyric acid (GABA), which method comprises the cultivation of a recombinant microorganism, which microorganism is preferably derived from a parent microorganism having the ability to produce glutamate, which recombinant microorganism, qualitatively or quantitatively, retains said ability of said parent microorganism and additionally has the ability to express heterologous glutamate decarboxylase (E.C. 4.1.1.15), so that glutamate is converted to GABA; and optionally isolating GABA from the fermentation broth. Said modified microorganism also may or may not retain its ability to produce glutamate.

[0013] In particular, said microorganism is a glutamate producing bacterium, particularly a coryneform bacterium, like a bacterium of the genus Corynebacterium, as, for example, Corynebacterium glutamicum.

[0014] Said heterologous glutamate decarboxylase is of prokaryotic or eukaryotic origin.

[0015] In one specific embodiment, said heterologous glutamate decarboxylase is a plant glutamate decarboxylase or a chimeric glutamate decarboxylase comprising at least one amino acid sequence portion derived from plant glutamate decarboxylase. Said "at least one amino acid sequence portion derived from plant glutamate decarboxylase" comprises at least ten consecutive amino acid residues of said plant enzyme. In total, there may be 1 to 10, in particular, 1 to 5, preferably 1 or 2 amino acid sequence portions derived from said plant sequence. Each of said portions may have a length of 10 to 500, 10 to 450, 10 to 400, 20 to 350, 40 to 300, 50 to 250, 60 to 200, 70 to 150 or 80 to 100 consecutive amino acid residues of said plant enzyme.

[0016] In particular, said heterologous glutamate decarboxylase is a decarboxylase of a plant of the genus Solanum, in particular from Solanum tuberosum, i.e. potato. For example, said heterologous glutamate decarboxylase is from Solanum tuberosum and comprises an amino acid sequence from Thr94 to Leu336 of SEQ ID NO: 2 or a sequence having 80% to less than 100% identity thereto, as, for example, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%.

[0017] In addition, said heterologous glutamate decarboxylase may be N-terminally and/or C-terminally supplemented by the corresponding terminal amino acid sequences of a glutamate decarboxylase from Solanum tuberosum, or N-terminally and/or C-terminally supplemented by the corresponding terminal amino acid sequences of a glutamate decarboxylase of a second plant, different from Solanum tuberosum. For example, said second plant is Solanum lycopersicum, i.e. tomato.

[0018] In a particular embodiment, said glutamate decarboxylase comprises an amino acid sequence according to SEQ ID NO: 2 or a sequence having 80% to less than 100% identity thereto, as, for example, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%.

[0019] According to another embodiment of the invention, said heterologous glutamate decarboxylase is a bacterial glutamate decarboxylase, for example from a bacterium of the genus Escherichia, in particular from E. coli.

[0020] Said E. coli glutamate decarboxylase may be selected from GadA of SEQ ID NO: 6, the GadBC complex comprising the GadB sequence of SEQ ID NO: 8 and the GadC sequence of SEQ ID NO: 9, and sequences having 80% to less than 100% identity to GadA or GadBC, respectively. Suitable sequences may have, for example, an identity of 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%.

[0021] In another embodiment, the enzyme having glutamate decarboxylase activity is encoded by a nucleic acid sequence, which is adapted to the codon usage of said parent microorganism having the ability to produce glutamate.
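Codon adaptation of this kind can be illustrated with a minimal sketch: each codon of a coding sequence is swapped for a synonymous codon preferred by the host, leaving the encoded protein unchanged. The function names and the `preferred` mapping below are hypothetical illustrations; the preference values are placeholders and are not actual Corynebacterium glutamicum codon usage data, nor the procedure used by the applicants.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence (length divisible by 3) to one-letter amino acids."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def adapt_codons(cds, preferred):
    """Replace codons by the host-preferred synonymous codon where one is given.

    `preferred` maps a one-letter amino acid to a codon. Amino acids without
    an entry keep their original codon.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    adapted = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        adapted.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(adapted)


# Hypothetical preference table, for illustration only.
EXAMPLE_PREFERRED = {"L": "CTG", "E": "GAA", "K": "AAG"}
```

With the toy input `"ATGTTAGAGAAATAA"`, `adapt_codons` changes three codons while `translate` returns the same protein for the input and the output, which is the property any codon-adapted sequence must preserve.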

[0022] In particular, the enzyme having glutamate decarboxylase activity may be encoded by a nucleic acid sequence comprising a coding sequence selected from

[0023] a) position 472 to 1200 according to SEQ ID NO:1 or from position 193 to 1605 according to SEQ ID NO:1;

[0024] b) SEQ ID NO: 5;

[0025] c) SEQ ID NO: 7;

[0026] d) or any coding sequence encoding a glutamate decarboxylase as defined above.

[0027] The present invention also relates to a glutamate decarboxylase enzyme as defined above; as well as to a nucleic acid sequence comprising the coding sequence for such a glutamate decarboxylase.

[0028] In another embodiment, the present invention provides an expression cassette, comprising at least one nucleic acid sequence as defined above, which sequence is operatively linked to at least one regulatory nucleic acid sequence; as well as a recombinant vector, comprising at least one such expression cassette.

[0029] The present invention also relates to a prokaryotic or eukaryotic host, transformed with at least one vector as defined above; in particular to hosts selected from recombinant coryneform bacteria, especially a recombinant Corynebacterium, as, for example, recombinant Corynebacterium glutamicum.

[0030] Finally, the present invention relates to a method of preparing a polymer, in particular, a polyamide, which method comprises

[0031] a) preparing GABA by a method as described above;

[0032] b) isolating GABA; and

[0033] c) polymerizing said GABA, optionally in the presence of at least one further suitable polyvalent copolymerizable co-monomer, selected, for example, from aminocarboxylic acids and hydroxycarboxylic acids.

2. Explanation of Particular Terms

[0034] Unless otherwise stated the expressions "gamma-aminobutyric acid", "gamma-aminobutyrate" and "GABA" are considered to be synonymous. The GABA product as obtained according to the present invention may be in the form of the free acid, in the form of a partial or complete salt of said acid and base functional groups or in the form of mixtures of the non-charged acid and any of its salt or mixtures.

[0035] A GABA "salt" comprises, for example, metal salts, such as mono- or di-alkali metal salts of GABA like mono-sodium, di-sodium, mono-potassium and di-potassium salts, as well as alkaline earth metal salts, for example the calcium or magnesium salts, or the protonated form of GABA.

[0036] "Deregulation" has to be understood in its broadest sense, and comprises an increase, a decrease or a complete switch-off of the activity of an enzyme (target enzyme) by different means well known to those in the art. Suitable methods comprise for example an increase or decrease of the copy number of gene and/or enzyme molecules in an organism, or the modification of another feature of the enzyme affecting its enzymatic activity, which then results in the desired effect on the metabolic pathway at issue, in particular the glutamate biosynthetic pathway or any pathway or enzymatic reaction coupled thereto. Suitable genetic manipulation can also include, but is not limited to, altering or modifying regulatory sequences or sites associated with expression of a particular gene (e.g., by removing strong promoters, inducible promoters or multiple promoters), modifying the chromosomal location of a particular gene, altering nucleic acid sequences adjacent to a particular gene such as a ribosome binding site or transcription terminator, decreasing the copy number of a particular gene, modifying proteins (e.g., regulatory proteins, suppressors, enhancers, transcriptional activators and the like) involved in transcription of a particular gene and/or translation of a particular gene product, or any other conventional means of deregulating expression of a particular gene routine in the art (including but not limited to use of antisense nucleic acid molecules, or other methods to knock-out or block expression of the target protein).

[0037] The term "heterologous" or "exogenous" refers to proteins, nucleic acids and corresponding sequences as described herein, which are introduced into or produced (transcribed or translated) by a genetically manipulated microorganism as defined herein and which microorganism prior to said manipulation did not contain or did not produce said sequence. In particular said microorganism prior to said manipulation may not contain or express said heterologous enzyme activity, or may contain or express an endogenous enzyme of comparable activity or specificity, which is encoded by a different coding sequence or by an enzyme of different amino acid sequence, and said endogenous enzyme may convert the same substrate or substrates as said exogenous enzyme.

[0038] A "parent" microorganism of the present invention is any microorganism having the ability to produce glutamate.

[0039] A microorganism "derived from a parent microorganism" refers to a microorganism modified by any type of manipulation, selected from chemical, biochemical or microbial, in particular genetic engineering techniques. Said manipulation results in at least one change of a biological feature of said parent microorganism. As an example the coding sequence of a heterologous enzyme may be introduced into said organism. By said change at least one feature may be added to, replaced in or deleted from said parent microorganism. Said change may, for example, result in an altered metabolic feature of said microorganism, so that, for example, a substrate of an enzyme expressed by said microorganism (which substrate was not utilized at all or which was utilized with different efficiency by said parent microorganism) is metabolized in a characteristic way (for example, in different amount, proportion or with different efficiency if compared to the parent microorganism), and/or a metabolic final or intermediary product is formed by said modified microorganism in a characteristic way (for example, in different amount, proportion or with different efficiency if compared to the parent microorganism).

[0040] An "intermediary product" is understood as a product, which is transiently or continuously formed during a chemical or biochemical process, in a not necessarily analytically directly detectable concentration. Said "intermediary product" may be removed from said biochemical process by a second, chemical or biochemical reaction, in particular by a reaction catalyzed by a "glutamate decarboxylase" enzyme as defined herein.

[0041] The term "glutamate decarboxylase" refers to any enzyme of any origin having the ability to convert glutamate into GABA. Such enzymes are classified as EC 4.1.1.15.
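The decarboxylation removes one molecule of CO2 per molecule of glutamate, so the mass balance and the theoretical mass yield can be worked out from molar masses. The following is a minimal sketch, written for the neutral species (glutamic acid, C5H9NO4; GABA, C4H9NO2) and using rounded standard atomic masses; the function names are illustrative only and complete 1:1 conversion is an assumption, not a result reported in this document.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Elemental compositions of the neutral species.
GLUTAMIC_ACID = {"C": 5, "H": 9, "N": 1, "O": 4}
GABA = {"C": 4, "H": 9, "N": 1, "O": 2}
CO2 = {"C": 1, "O": 2}


def molar_mass(formula):
    """Molar mass in g/mol for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def theoretical_gaba_yield(glutamic_acid_grams):
    """Grams of GABA obtainable if every glutamic acid molecule is decarboxylated."""
    return glutamic_acid_grams * molar_mass(GABA) / molar_mass(GLUTAMIC_ACID)
```

Under these assumptions the molar masses of GABA and CO2 add up to that of glutamic acid, and the upper bound on the mass yield is roughly 0.70 g of GABA per gram of glutamic acid.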

[0042] A "recombinant host" may be any prokaryotic or eukaryotic cell, which contains either a cloning vector or expression vector. This term is also meant to include those prokaryotic or eukaryotic cells that have been genetically engineered to contain the cloned gene(s) in the chromosome or genome of the host cell. For examples of suitable hosts, see Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second Edition, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989).

[0043] The term "recombinant microorganism" includes a microorganism (e.g., bacteria, yeast, fungus, etc.) or microbial strain, which has been genetically altered, modified or engineered (e.g., genetically engineered) such that it exhibits an altered, modified or different genotype and/or phenotype (e.g., when the genetic modification affects coding nucleic acid sequences of the microorganism) as compared to the naturally-occurring microorganism or "parent" microorganism which it was derived from.

[0044] As used herein, a "substantially pure" protein or enzyme means that the desired purified protein is essentially free from contaminating cellular components, as evidenced by a single band following sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The term "substantially pure" is further meant to describe a molecule, which is homogeneous by one or more purity or homogeneity characteristics used by those of skill in the art. For example, a substantially pure glutamate decarboxylase will show constant and reproducible characteristics within standard experimental deviations for parameters such as the following: molecular weight, chromatographic migration, amino acid composition, amino acid sequence, blocked or unblocked N-terminus, HPLC elution profile, biological activity, and other such parameters. The term, however, is not meant to exclude artificial or synthetic mixtures of glutamate decarboxylase with other compounds. In addition, the term is not meant to exclude glutamate decarboxylase fusion proteins optionally isolated from a recombinant host.

3. Other Embodiments of the Invention

3.1 Deregulation of Further Genes

[0045] The fermentative production of GABA with a recombinant Corynebacterium glutamate producer expressing glutamate decarboxylase may be further improved if it is combined with the deregulation of at least one further gene as listed below.

Enzyme (gene product)	Gene	Deregulation
isocitrate dehydrogenase	icd (NCgl0634)	amplification
glutamate dehydrogenase	gdh (NCgl1999)	amplification
phosphoenolpyruvate carboxylase	ppc (NCgl1523)	amplification
pyruvate carboxylase	pycA (NCgl0659)	releasing feedback inhibition by point mutation (EP1108790) and amplification
2-oxoglutarate dehydrogenase	odhA (NCgl1084)	attenuation (WO2006/028298)
isocitrate lyase	aceA (NCgl2248)	attenuation
phosphoenolpyruvate carboxykinase	pck (NCgl2765)	attenuation
glutamine synthetase	glnA (NCgl2148)	attenuation
glutamate exporter	yggB (NCgl1221)	attenuation (WO2006070944)

[0046] A preferred way of an "amplification" is an "up"-mutation which increases the gene activity e.g. by gene amplification using strong expression signals and/or point mutations which enhance the enzymatic activity.

[0047] A preferred way of an "attenuation" is a "down"-mutation which decreases the gene activity e.g. by gene deletion or disruption, using weak expression signals and/or point mutations which destroy or decrease the enzymatic activity.

3.2 Proteins According to the Invention

[0048] The present invention is not limited to the specifically mentioned proteins, but also extends to functional equivalents thereof.

[0049] "Functional equivalents" or "analogs" or "functional mutations" of the concretely disclosed enzymes are, within the scope of the present invention, various polypeptides thereof, which moreover possess the desired biological function or activity, e.g. enzyme activity.

[0050] For example, "functional equivalents" means enzymes, which, in a test used for enzymatic activity, display at least a 1 to 10%, or at least 20%, or at least 50%, or at least 75%, or at least 90% higher or lower activity of an enzyme, as defined herein.

[0051] "Functional equivalents", according to the invention, also means in particular mutants, which, in at least one sequence position of the amino acid sequences stated above, have an amino acid that is different from that concretely stated, but nevertheless possess one of the aforementioned biological activities. "Functional equivalents" thus comprise the mutants obtainable by one or more amino acid additions, substitutions, deletions and/or inversions, where the stated changes can occur in any sequence position, provided they lead to a mutant with the profile of properties according to the invention. Functional equivalence is in particular also provided if the reactivity patterns coincide qualitatively between the mutant and the unchanged polypeptide, i.e. if for example the same substrates are converted at a different rate. Examples of suitable amino acid substitutions are shown in the following table:

Original residue	Examples of substitution
Ala	Ser
Arg	Lys
Asn	Gln; His
Asp	Glu
Cys	Ser
Gln	Asn
Glu	Asp
Gly	Pro
His	Asn; Gln
Ile	Leu; Val
Leu	Ile; Val
Lys	Arg; Gln; Glu
Met	Leu; Ile
Phe	Met; Leu; Tyr
Ser	Thr
Thr	Ser
Trp	Tyr
Tyr	Trp; Phe
Val	Ile; Leu
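As a minimal sketch of how the substitution table above might be applied, the following compares a reference and a mutant sequence of equal length and reports, for each changed position, whether the change is one of the listed example substitutions. The function name, the three-letter input format and the toy sequences are assumptions made for illustration; a listed substitution is only an example of a suitable change and does not by itself establish functional equivalence.

```python
# Example substitutions transcribed from the table (original -> examples).
EXAMPLE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def classify_substitutions(reference, mutant):
    """List changed positions as (position, original, replacement, listed_in_table).

    Both inputs are equal-length lists of three-letter residue codes;
    positions are 1-based.
    """
    if len(reference) != len(mutant):
        raise ValueError("sequences must have the same length")
    changes = []
    for position, (ref, mut) in enumerate(zip(reference, mutant), start=1):
        if ref != mut:
            listed = mut in EXAMPLE_SUBSTITUTIONS.get(ref, set())
            changes.append((position, ref, mut, listed))
    return changes
```

For the toy pair `["Met", "Ala", "Lys"]` and `["Met", "Ser", "Asp"]`, the change Ala to Ser at position 2 is listed in the table, whereas Lys to Asp at position 3 is not.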

[0052] "Functional equivalents" in the above sense are also "precursors" of the polypeptides described, as well as "functional derivatives" and "salts" of the polypeptides.

[0053] "Precursors" are in that case natural or synthetic precursors of the polypeptides with or without the desired biological activity.

[0054] The expression "salts" means salts of carboxyl groups as well as salts of acid addition of amino groups of the protein molecules according to the invention. Salts of carboxyl groups can be produced in a known way and comprise inorganic salts, for example sodium, calcium, ammonium, iron and zinc salts, and salts with organic bases, for example amines, such as triethanolamine, arginine, lysine, piperidine and the like. Salts of acid addition, for example salts with inorganic acids, such as hydrochloric acid or sulfuric acid and salts with organic acids, such as acetic acid and oxalic acid, are also covered by the invention.

[0055] "Functional derivatives" of polypeptides according to the invention can also be produced on functional amino acid side groups or at their N-terminal or C-terminal end using known techniques. Such derivatives comprise for example aliphatic esters of carboxylic acid groups, amides of carboxylic acid groups, obtainable by reaction with ammonia or with a primary or secondary amine; N-acyl derivatives of free amino groups, produced by reaction with acyl groups; or O-acyl derivatives of free hydroxy groups, produced by reaction with acyl groups.

[0056] "Functional equivalents" naturally also comprise polypeptides that can be obtained from other organisms, as well as naturally occurring variants. For example, areas of homologous sequence regions can be established by sequence comparison, and equivalent enzymes can be determined on the basis of the concrete parameters of the invention.

[0057] "Functional equivalents" also comprise fragments, preferably individual domains or sequence motifs, of the polypeptides according to the invention, which for example display the desired biological function.

[0058] "Functional equivalents" are, moreover, fusion proteins, which have one of the polypeptide sequences stated above or functional equivalents derived there from and at least one further, functionally different, heterologous sequence in functional N-terminal or C-terminal association (i.e. without substantial mutual functional impairment of the fusion protein parts). Non-limiting examples of these heterologous sequences are e.g. signal peptides, histidine anchors or enzymes.

[0059] "Functional equivalents" that are also included according to the invention are homologues of the concretely disclosed proteins. These possess percent identity values as stated above. Said values refer to the identity with the concretely disclosed amino acid sequences, and may be calculated according to the algorithm of Pearson and Lipman, Proc. Natl. Acad. Sci. (USA) 85(8), 1988, 2444-2448.

[0060] The % identity values may also be calculated from BLAST alignments, algorithm blastp (protein-protein BLAST) or by applying the Clustal setting as given below.

[0061] A percentage identity of a homologous polypeptide according to the invention means in particular the percentage identity of the amino acid residues relative to the total length of one of the amino acid sequences concretely described herein.
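The definition in the preceding paragraph can be sketched as follows: identical residues are counted over an existing pairwise alignment and divided by the total (ungapped) length of the reference sequence. This is an illustrative sketch only; it assumes the alignment has already been produced by one of the tools named above, uses `-` as the gap symbol, and the function name and toy sequences are hypothetical and are not taken from SEQ ID NO: 2.

```python
def percent_identity(aligned_reference, aligned_query):
    """Percent identity relative to the total length of the reference sequence.

    Both arguments are rows of one pairwise alignment (same length, '-' for gaps).
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    # Total length of the reference, ignoring gap characters.
    reference_length = sum(1 for residue in aligned_reference if residue != "-")
    if reference_length == 0:
        raise ValueError("reference sequence is empty")
    # Count columns where a reference residue is matched by the same residue.
    matches = sum(
        1
        for ref, qry in zip(aligned_reference, aligned_query)
        if ref != "-" and ref == qry
    )
    return 100.0 * matches / reference_length
```

For the toy alignment `"MKT-LLA"` against `"MKSALL-"`, four of the six reference residues are matched, giving about 66.7%. A value computed this way could then be compared against a threshold such as the 80% or 92% figures used in the claims.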

[0062] In the case of a possible protein glycosylation, "functional equivalents" according to the invention comprise proteins of the type designated above in deglycosylated or glycosylated form as well as modified forms that can be obtained by altering the glycosylation pattern.

[0063] Such functional equivalents or homologues of the proteins or polypeptides according to the invention can be produced by mutagenesis, e.g. by point mutation, lengthening or shortening of the protein.

[0064] Such functional equivalents or homologues of the proteins according to the invention can be identified by screening combinatorial databases of mutants, for example shortening mutants. For example, a variegated database of protein variants can be produced by combinatorial mutagenesis at the nucleic acid level, e.g. by enzymatic ligation of a mixture of synthetic oligonucleotides. There are a great many methods that can be used for the production of databases of potential homologues from a degenerated oligonucleotide sequence. Chemical synthesis of a degenerated gene sequence can be carried out in an automatic DNA synthesizer, and the synthetic gene can then be ligated in a suitable expression vector. The use of a degenerated genome makes it possible to supply all sequences in a mixture, which code for the desired set of potential protein sequences. Methods of synthesis of degenerated oligonucleotides are known to a person skilled in the art (e.g. Narang, S. A. (1983) Tetrahedron 39:3; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al., (1984) Science 198:1056; Ike et al. (1983) Nucleic Acids Res. 11:477).

[0065] In the prior art, several techniques are known for the screening of gene products of combinatorial libraries, which were produced by point mutations or shortening, and for the screening of cDNA libraries for gene products with a selected property. These techniques can be adapted for the rapid screening of the gene libraries that were produced by combinatorial mutagenesis of homologues according to the invention. The techniques most frequently used for the screening of large gene libraries, which are based on a high-throughput analysis, comprise cloning of the gene library in expression vectors that can be replicated, transformation of suitable cells with the resultant vector library and expression of the combinatorial genes in conditions in which detection of the desired activity facilitates isolation of the vector that codes for the gene whose product was detected. Recursive Ensemble Mutagenesis (REM), a technique that increases the frequency of functional mutants in the libraries, can be used in combination with the screening tests, in order to identify homologues (Arkin and Youvan (1992) PNAS 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

3.3 Coding Nucleic Acid Sequences

[0066] The invention also relates to nucleic acid sequences that code for enzymes as defined herein.

[0067] The present invention also relates to nucleic acids with a certain degree of "identity" to the sequences specifically disclosed herein. "Identity" between two nucleic acids means identity of the nucleotides, in each case over the entire length of the nucleic acid.

[0068] For example the identity may be calculated by means of the Vector NTI Suite 7.1 program of the company Informax (USA) employing the Clustal Method (Higgins D G, Sharp P M. Fast and sensitive multiple sequence alignments on a microcomputer. Comput Appl. Biosci. 1989 April; 5(2):151-1) with the following settings:

[0069] Multiple Alignment Parameter:

Parameter	Setting
Gap opening penalty	10
Gap extension penalty	10
Gap separation penalty range	8
Gap separation penalty	off
% identity for alignment delay	40
Residue specific gaps	off
Hydrophilic residue gap	off
Transition weighing	0

[0070] Pairwise Alignment Parameter:

Parameter	Setting
FAST algorithm	on
K-tuple size	1
Gap penalty	3
Window size	5
Number of best diagonals	5

[0071] Alternatively the identity may be determined according to Chenna, Ramu, Sugawara, Hideaki, Koike, Tadashi, Lopez, Rodrigo, Gibson, Toby J, Higgins, Desmond G, Thompson, Julie D. Multiple sequence alignment with the Clustal series of programs. (2003) Nucleic Acids Res 31 (13): 3497-500, using the web page ebi.ac.uk/Tools/clustalw/index.html# and the following settings:

Parameter	Setting
DNA Gap Open Penalty	15.0
DNA Gap Extension Penalty	6.66
DNA Matrix	Identity
Protein Gap Open Penalty	10.0
Protein Gap Extension Penalty	0.2
Protein matrix	Gonnet
Protein/DNA ENDGAP	-1
Protein/DNA GAPDIST	4

[0072] All the nucleic acid sequences mentioned herein (single-stranded and double-stranded DNA and RNA sequences, for example cDNA and mRNA) can be produced in a known way by chemical synthesis from the nucleotide building blocks, e.g. by fragment condensation of individual overlapping, complementary nucleic acid building blocks of the double helix. Chemical synthesis of oligonucleotides can, for example, be performed in a known way by the phosphoramidite method (Voet, Voet, 2nd edition, Wiley Press, New York, pages 896-897). The annealing of synthetic oligonucleotides and filling of gaps by means of the Klenow fragment of DNA polymerase and ligation reactions as well as general cloning techniques are described in Sambrook et al. (1989), see below.

[0073] The invention also relates to nucleic acid sequences (single-stranded and double-stranded DNA and RNA sequences, e.g. cDNA and mRNA), coding for one of the above polypeptides and their functional equivalents, which can be obtained for example using artificial nucleotide analogs.

[0074] The invention relates both to isolated nucleic acid molecules, which code for polypeptides or proteins according to the invention or biologically active segments thereof, and to nucleic acid fragments, which can be used for example as hybridization probes or primers for identifying or amplifying coding nucleic acids according to the invention.

[0075] The nucleic acid molecules according to the invention can in addition contain non-translated sequences from the 3' and/or 5' end of the coding genetic region.

[0076] The invention further relates to the nucleic acid molecules that are complementary to the concretely described nucleotide sequences or a segment thereof.
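For illustration, the complementary strand of a DNA sequence can be derived with a few lines of code. This is a minimal sketch for unambiguous DNA bases only; the function name is hypothetical, and RNA or ambiguity codes would need an extended translation table.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```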

[0077] The nucleotide sequences according to the invention make possible the production of probes and primers that can be used for the identification and/or cloning of homologous sequences in other cellular types and organisms. Such probes or primers generally comprise a nucleotide sequence region which hybridizes under "stringent" conditions (see below) on at least about 12, preferably at least about 25, for example about 40, 50 or 75 successive nucleotides of a sense strand of a nucleic acid sequence according to the invention or of a corresponding antisense strand.

[0078] An "isolated" nucleic acid molecule is separated from other nucleic acid molecules that are present in the natural source of the nucleic acid and can moreover be substantially free from other cellular material or culture medium, if it is being produced by recombinant techniques, or can be free from chemical precursors or other chemicals, if it is being synthesized chemically.

[0079] A nucleic acid molecule according to the invention can be isolated by means of standard techniques of molecular biology and the sequence information supplied according to the invention. For example, cDNA can be isolated from a suitable cDNA library, using one of the concretely disclosed complete sequences or a segment thereof as hybridization probe and standard hybridization techniques (as described for example in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd edition, Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989). In addition, a nucleic acid molecule comprising one of the disclosed sequences or a segment thereof, can be isolated by the polymerase chain reaction, using the oligonucleotide primers that were constructed on the basis of this sequence. The nucleic acid amplified in this way can be cloned in a suitable vector and can be characterized by DNA sequencing. The oligonucleotides according to the invention can also be produced by standard methods of synthesis, e.g. using an automatic DNA synthesizer.

[0080] Nucleic acid sequences according to the invention or derivatives thereof, homologues or parts of these sequences, can for example be isolated by usual hybridization techniques or the PCR technique from other bacteria, e.g. via genomic or cDNA libraries. These DNA sequences hybridize in standard conditions with the sequences according to the invention.

[0081] "Hybridize" means the ability of a polynucleotide or oligonucleotide to bind to an almost complementary sequence in standard conditions, whereas nonspecific binding does not occur between non-complementary partners in these conditions. For this, the sequences can be 90-100% complementary. The property of complementary sequences of being able to bind specifically to one another is utilized for example in Northern Blotting or Southern Blotting or in primer binding in PCR or RT-PCR.

[0082] Short oligonucleotides of the conserved regions are used advantageously for hybridization. However, it is also possible to use longer fragments of the nucleic acids according to the invention or the complete sequences for the hybridization. These standard conditions vary depending on the nucleic acid used (oligonucleotide, longer fragment or complete sequence) or depending on which type of nucleic acid (DNA or RNA) is used for hybridization. For example, the melting temperatures for DNA:DNA hybrids are approx. 10 °C lower than those of DNA:RNA hybrids of the same length.

[0083] For example, depending on the particular nucleic acid, standard conditions mean temperatures between 42 and 58 °C in an aqueous buffer solution with a concentration between 0.1 and 5×SSC (1×SSC = 0.15 M NaCl, 15 mM sodium citrate, pH 7.2) or additionally in the presence of 50% formamide, for example 42 °C in 5×SSC, 50% formamide. Advantageously, the hybridization conditions for DNA:DNA hybrids are 0.1×SSC and temperatures between about 20 °C and 45 °C, preferably between about 30 °C and 45 °C. For DNA:RNA hybrids the hybridization conditions are advantageously 0.1×SSC and temperatures between about 30 °C and 55 °C, preferably between about 45 °C and 55 °C. These stated temperatures for hybridization are examples of calculated melting temperature values for a nucleic acid with a length of approx. 100 nucleotides and a G+C content of 50% in the absence of formamide. The experimental conditions for DNA hybridization are described in relevant genetics textbooks, for example Sambrook et al., 1989, and can be calculated using formulae that are known by a person skilled in the art, for example depending on the length of the nucleic acids, the type of hybrids or the G+C content. A person skilled in the art can obtain further information on hybridization from the following textbooks: Ausubel et al. (eds), 1985, Current Protocols in Molecular Biology, John Wiley & Sons, New York; Hames and Higgins (eds), 1985, Nucleic Acids Hybridization: A Practical Approach, IRL Press at Oxford University Press, Oxford; Brown (ed), 1991, Essential Molecular Biology: A Practical Approach, IRL Press at Oxford University Press, Oxford.
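The document does not state which formula is meant. As a hedged illustration only, the sketch below uses one widely cited empirical approximation for long DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%G+C) - 0.61 (%formamide) - 500/L, which is an assumption brought in from general textbook practice and not taken from this application. The sodium concentration of 1×SSC is likewise an assumption, derived from 0.15 M NaCl plus 15 mM trisodium citrate (three sodium ions per citrate). Values obtained this way need not coincide with the temperature ranges quoted above.

```python
import math

# Assumption: 1x SSC contributes 0.15 M Na+ from NaCl plus 3 x 0.015 M Na+
# from trisodium citrate, i.e. 0.195 M Na+ in total.
SODIUM_PER_1X_SSC_M = 0.15 + 3 * 0.015


def dna_dna_tm_celsius(
    length_nt: int,
    gc_percent: float,
    ssc_fold: float,
    formamide_percent: float = 0.0,
) -> float:
    """Approximate melting temperature of a long DNA:DNA hybrid (empirical formula)."""
    sodium_molar = ssc_fold * SODIUM_PER_1X_SSC_M
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)   # salt dependence
        + 0.41 * gc_percent                 # G+C content in percent
        - 0.61 * formamide_percent          # formamide lowers the Tm
        - 500.0 / length_nt                 # length correction
    )


# Hypothetical probe: 100 nucleotides, 50% G+C, 0.1x SSC, no formamide.
print(round(dna_dna_tm_celsius(100, 50.0, 0.1), 1))
```

Hybridization is usually carried out some way below the estimated melting temperature, so the function is best read as a starting point for choosing conditions, not as a prescription.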

[0084] "Hybridization" can in particular be carried out under stringent conditions. Such hybridization conditions are for example described in Sambrook, J., Fritsch, E. F., Maniatis, T., in: Molecular Cloning (A Laboratory Manual), 2nd edition, Cold Spring Harbor Laboratory Press, 1989, pages 9.31-9.57 or in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.

[0085] "Stringent" hybridization conditions mean in particular: Incubation at 42 °C overnight in a solution consisting of 50% formamide, 5×SSC (750 mM NaCl, 75 mM tri-sodium citrate), 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing of the filters with 0.1×SSC at 65 °C.

[0086] The invention also relates to derivatives of the concretely disclosed or derivable nucleic acid sequences.

[0087] Thus, further nucleic acid sequences according to the invention can be derived from the sequences specifically disclosed herein and can differ from them by addition, substitution, insertion or deletion of individual or several nucleotides, while still coding for polypeptides with the desired profile of properties.

[0088] The invention also encompasses nucleic acid sequences that comprise so-called silent mutations or have been altered, in comparison with a concretely stated sequence, according to the codon usage of a special original or host organism, as well as naturally occurring variants, e.g. splicing variants or allelic variants, thereof.

[0089] It also relates to sequences that can be obtained by conservative nucleotide substitutions (i.e. the amino acid in question is replaced by an amino acid of the same charge, size, polarity and/or solubility).

[0090] The invention also relates to the molecules derived from the concretely disclosed nucleic acids by sequence polymorphisms. These genetic polymorphisms can exist between individuals within a population owing to natural variation. These natural variations usually produce a variance of 1 to 5% in the nucleotide sequence of a gene.

[0091] Derivatives of nucleic acid sequences according to the invention mean for example allelic variants, having at least 60% homology at the level of the derived amino acid, preferably at least 80% homology, quite especially preferably at least 90% homology over the entire sequence range (regarding homology at the amino acid level, reference should be made to the details given above for the polypeptides). Advantageously, the homologies can be higher over partial regions of the sequences.

[0092] Furthermore, derivatives are also to be understood to be homologues of the nucleic acid sequences according to the invention, for example animal, plant, fungal or bacterial homologues, shortened sequences, single-stranded DNA or RNA of the coding and noncoding DNA sequence. For example, homologues have, at the DNA level, a homology of at least 40%, preferably of at least 60%, especially preferably of at least 70%, quite especially preferably of at least 80% over the entire DNA region given in a sequence specifically disclosed herein.

[0093] Moreover, derivatives are to be understood to be, for example, fusions with promoters. The promoters that are added to the stated nucleotide sequences can be modified by at least one nucleotide exchange, at least one insertion, inversion and/or deletion, though without impairing the functionality or efficacy of the promoters. Moreover, the efficacy of the promoters can be increased by altering their sequence or can be exchanged completely with more effective promoters even of organisms of a different genus.

3.4 Constructs According to the Invention

[0094] The invention also relates to expression constructs, containing, under the genetic control of regulatory nucleic acid sequences, a nucleic acid sequence coding for a polypeptide or fusion protein according to the invention; as well as vectors comprising at least one of these expression constructs.

[0095] "Expression unit" means, according to the invention, a nucleic acid with expression activity, which comprises a promoter as defined herein and, after functional association with a nucleic acid that is to be expressed or a gene, regulates the expression, i.e. the transcription and the translation of this nucleic acid or of this gene. In this context, therefore, it is also called a "regulatory nucleic acid sequence". In addition to the promoter, other regulatory elements may be present, e.g. enhancers.

[0096] "Expression cassette" or "expression construct" means, according to the invention, an expression unit, which is functionally associated with the nucleic acid that is to be expressed or the gene that is to be expressed. In contrast to an expression unit, an expression cassette thus comprises not only nucleic acid sequences which regulate transcription and translation, but also the nucleic acid sequences which should be expressed as protein as a result of the transcription and translation.

[0097] The terms "expression" or "overexpression" describe, in the context of the invention, the production or increase of intracellular activity of one or more enzymes in a microorganism, which are encoded by the corresponding DNA. For this, it is possible for example to insert a gene in an organism, replace an existing gene by another gene, increase the number of copies of the gene or genes, use a strong promoter or use a gene that codes for a corresponding enzyme with a high activity, and optionally these measures can be combined.

[0098] Preferably such constructs according to the invention comprise a promoter 5'-upstream from the respective coding sequence, and a terminator sequence 3'-downstream, and optionally further usual regulatory elements, in each case functionally associated with the coding sequence.

[0099] A "promoter", a "nucleic acid with promoter activity" or a "promoter sequence" mean, according to the invention, a nucleic acid which, functionally associated with a nucleic acid that is to be transcribed, regulates the transcription of this nucleic acid.

[0100] "Functional" or "operative" association means, in this context, for example the sequential arrangement of one of the nucleic acids with promoter activity and of a nucleic acid sequence that is to be transcribed and optionally further regulatory elements, for example nucleic acid sequences that enable the transcription of nucleic acids, and for example a terminator, in such a way that each of the regulatory elements can fulfill its function in the transcription of the nucleic acid sequence. This does not necessarily require a direct association in the chemical sense. Genetic control sequences, such as enhancer sequences, can also exert their function on the target sequence from more remote positions or even from other DNA molecules. Arrangements are preferred in which the nucleic acid sequence that is to be transcribed is positioned behind (i.e. at the 3' end) the promoter sequence, so that the two sequences are bound covalently to one another. The distance between the promoter sequence and the nucleic acid sequence that is to be expressed transgenically can be less than 200 bp (base pairs), or less than 100 bp or less than 50 bp.

[0101] Apart from promoters and terminators, examples of other regulatory elements that may be mentioned are targeting sequences, enhancers, polyadenylation signals, selectable markers, amplification signals, replication origins and the like. Suitable regulatory sequences are described for example in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990).

[0102] Nucleic acid constructs according to the invention comprise in particular sequences selected from those specifically mentioned herein or derivatives and homologues thereof, as well as the nucleic acid sequences that can be derived from amino acid sequences specifically mentioned herein, which are advantageously associated operatively or functionally with one or more regulating signals for controlling, e.g. increasing, gene expression.

[0103] In addition to these regulatory sequences, the natural regulation of these sequences can still be present in front of the actual structural genes and optionally can have been altered genetically, so that natural regulation is switched off and the expression of the genes has been increased. The nucleic acid construct can also be of a simpler design, i.e. without any additional regulatory signals being inserted in front of the coding sequence and without removing the natural promoter with its regulation. Instead, the natural regulatory sequence is silenced so that regulation no longer takes place and gene expression is increased.

[0104] A preferred nucleic acid construct advantageously also contains one or more of the aforementioned enhancer sequences, functionally associated with the promoter, which permit increased expression of the nucleic acid sequence. Additional advantageous sequences, such as other regulatory elements or terminators, can also be inserted at the 3' end of the DNA sequences. One or more copies of the nucleic acids according to the invention can be contained in the construct. The construct can also contain other markers, such as antibiotic resistances or auxotrophy-complementing genes, optionally for selection on the construct.

[0105] Examples of suitable regulatory sequences are contained in promoters such as cos-, tac-, trp-, tet-, trp-tet-, lpp-, lac-, lpp-lac-, lacIq-, T7-, T5-, T3-, gal-, trc-, ara-, rhaP (rhaPBAD)-, SP6-, lambda-PR- or in the lambda-PL promoter, which find application advantageously in Gram-negative bacteria. Other advantageous regulatory sequences are contained for example in the Gram-positive promoters ace, amy and SPO2, in the yeast or fungal promoters ADC1, MFalpha, AC, P-60, CYC1, GAPDH, TEF, rp28, ADH. Artificial promoters can also be used for regulation.

[0106] For expression, the nucleic acid construct is inserted in a host organism advantageously in a vector, for example a plasmid or a phage, which permits optimum expression of the genes in the host. In addition to plasmids and phages, vectors are also to be understood as meaning all other vectors known to a person skilled in the art, e.g. viruses, such as SV40, CMV, baculovirus and adenovirus, transposons, IS elements, phasmids, cosmids, and linear or circular DNA. These vectors can be replicated autonomously in the host organism or can be replicated chromosomally. These vectors represent a further embodiment of the invention.

[0107] Suitable plasmids are, for example, in E. coli pLG338, pACYC184, pBR322, pUC18, pUC19, pKC30, pRep4, pHS1, pKK223-3, pDHE19.2, pHS2, pPLc236, pMBL24, pLG200, pUR290, pIN-III113-B1, λgt11 or pBdCI; in nocardioform actinomycetes pJAM2; in Streptomyces pIJ101, pIJ364, pIJ702 or pIJ361; in Bacillus pUB110, pC194 or pBD214; in Corynebacterium pSA77 or pAJ667; in fungi pALS1, pIL2 or pBB116; in yeasts 2alphaM, pAG-1, YEp6, YEp13 or pEMBLYe23; or in plants pLGV23, pGHlac+, pBIN19, pAK2004 or pDH51. The aforementioned plasmids represent a small selection of the possible plasmids. Other plasmids are well known to a person skilled in the art and will be found for example in the book Cloning Vectors (Eds. Pouwels P. H. et al. Elsevier, Amsterdam-New York-Oxford, 1985, ISBN 0 444 904018).

[0108] In a further embodiment of the vector, the vector containing the nucleic acid construct according to the invention or the nucleic acid according to the invention can be inserted advantageously in the form of a linear DNA in the microorganisms and integrated into the genome of the host organism through heterologous or homologous recombination. This linear DNA can comprise a linearized vector such as plasmid or just the nucleic acid construct or the nucleic acid according to the invention.

[0109] For optimum expression of heterologous genes in organisms, it is advantageous to alter the nucleic acid sequences in accordance with the specific codon usage employed in the organism. The codon usage can easily be determined on the basis of computer evaluations of other, known genes of the organism in question.
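A computer evaluation of this kind can be sketched as follows, assuming that a set of in-frame coding sequences from known genes of the host is available as plain strings. The function name and the two toy sequences are hypothetical; a real evaluation would use many genes and would typically normalize the counts per amino acid with a genetic code table, which is omitted here for brevity.

```python
from collections import Counter


def codon_usage(coding_sequences: list[str]) -> dict[str, float]:
    """Relative frequency of each codon across a set of in-frame coding sequences."""
    counts: Counter[str] = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Toy input, not real genes: GCT is the most frequent codon here.
usage = codon_usage(["ATGGCTGCTTAA", "ATGGCGGCTTGA"])
for codon, frequency in sorted(usage.items(), key=lambda item: -item[1]):
    print(codon, round(frequency, 3))
```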

[0110] The production of an expression cassette according to the invention is based on fusion of a suitable promoter with a suitable coding nucleotide sequence and a terminator signal or polyadenylation signal. Common recombination and cloning techniques are used for this, as described for example in T. Maniatis, E. F. Fritsch and J. Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989) as well as in T. J. Silhavy, M. L. Berman and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and in Ausubel, F. M. et al., Current Protocols in Molecular Biology, Greene Publishing Assoc. and Wiley Interscience (1987).

[0111] The recombinant nucleic acid construct or gene construct is inserted advantageously in a host-specific vector for expression in a suitable host organism, to permit optimum expression of the genes in the host. Vectors are well known to a person skilled in the art and will be found for example in "Cloning Vectors" (Pouwels P. H. et al., Publ. Elsevier, Amsterdam-New York-Oxford, 1985).

3.5 Hosts that can be Used According to the Invention

[0112] Depending on the context, the term "microorganism" means the starting microorganism (wild-type) or a genetically modified microorganism according to the invention, or both.

[0113] The term "wild-type" means, according to the invention, the corresponding starting microorganism, and need not necessarily correspond to a naturally occurring organism.

[0114] By means of the vectors according to the invention, recombinant microorganisms can be produced, which have been transformed for example with at least one vector according to the invention and can be used for the fermentative production according to the invention.

[0115] Advantageously, the recombinant constructs according to the invention, described above, are inserted in a suitable host system and expressed. Preferably, common cloning and transfection methods that are familiar to a person skilled in the art are used, for example co-precipitation, protoplast fusion, electroporation, retroviral transfection and the like, in order to secure expression of the stated nucleic acids in the respective expression system. Suitable systems are described for example in Current Protocols in Molecular Biology, F. Ausubel et al., Publ. Wiley Interscience, New York 1997, or Sambrook et al. Molecular Cloning: A Laboratory Manual. 2nd edition, Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

[0116] The parent microorganisms are typically those which have the ability to produce lysine, in particular L-lysine, from glucose, saccharose, lactose, fructose, maltose, molasses, starch, cellulose or glycerol, fatty acids, plant oils or ethanol. Preferably they are coryneform bacteria, in particular of the genus Corynebacterium or of the genus Brevibacterium. In particular, the species Corynebacterium glutamicum should be mentioned.

[0117] Non-limiting examples of suitable strains of the genus Corynebacterium, and the species Corynebacterium glutamicum (C. glutamicum), are

Corynebacterium glutamicum ATCC 13032, Corynebacterium acetoglutamicum ATCC 15806, Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium thermoaminogenes FERM BP-1539 and Corynebacterium melassecola ATCC 17965

[0118] and of the genus Brevibacterium, are

Brevibacterium flavum ATCC 14067, Brevibacterium lactofermentum ATCC 13869 and Brevibacterium divaricatum ATCC 14020, or strains derived therefrom, such as Corynebacterium glutamicum KFCC 10065 and Corynebacterium glutamicum ATCC 21608.

[0119] KFCC designates the Korean Federation of Culture Collection, ATCC designates the American Type Culture Collection, and FERM BP designates the collection of the National Institute of Bioscience and Human-Technology, Agency of Industrial Science and Technology, Japan.

[0120] The host organism or host organisms according to the invention preferably contain at least one of the nucleic acid sequences, nucleic acid constructs or vectors described in this invention, which code for an enzyme activity according to the above definition.

3.5 Fermentative Production of GABA

[0121] The invention also relates to methods for the fermentative production of GABA.

[0122] The recombinant microorganisms as used according to the invention can be cultivated continuously or discontinuously in the batch process or in the fed batch or repeated fed batch process. A review of known methods of cultivation will be found in the textbook by Chmiel (Bioprozesstechnik 1. Einführung in die Bioverfahrenstechnik (Gustav Fischer Verlag, Stuttgart, 1991)) or in the textbook by Storhas (Bioreaktoren und periphere Einrichtungen (Vieweg Verlag, Braunschweig/Wiesbaden, 1994)).

[0123] The culture medium that is to be used must satisfy the requirements of the particular strains in an appropriate manner. Descriptions of culture media for various microorganisms are given in the handbook "Manual of Methods for General Bacteriology" of the American Society for Bacteriology (Washington D. C., USA, 1981).

[0124] These media that can be used according to the invention generally comprise one or more sources of carbon, sources of nitrogen, inorganic salts, vitamins and/or trace elements.

[0125] Preferred sources of carbon are sugars, such as mono-, di- or polysaccharides. Very good sources of carbon are for example glucose, fructose, mannose, galactose, ribose, sorbose, ribulose, lactose, maltose, sucrose, raffinose, starch or cellulose. Sugars can also be added to the media via complex compounds, such as molasses, or other by-products from sugar refining. It may also be advantageous to add mixtures of various sources of carbon. Other possible sources of carbon are oils and fats such as soybean oil, sunflower oil, peanut oil and coconut oil, fatty acids such as palmitic acid, stearic acid or linoleic acid, alcohols such as glycerol, methanol or ethanol and organic acids such as acetic acid or lactic acid.

[0126] Sources of nitrogen are usually organic or inorganic nitrogen compounds or materials containing these compounds. Examples of sources of nitrogen include ammonia gas or ammonium salts, such as ammonium sulfate, ammonium chloride, ammonium phosphate, ammonium carbonate or ammonium nitrate, nitrates, urea, amino acids or complex sources of nitrogen, such as corn-steep liquor, soybean flour, soybean protein, yeast extract, meat extract and others. The sources of nitrogen can be used separately or as a mixture.

[0127] Inorganic salt compounds that may be present in the media comprise the chloride, phosphate or sulfate salts of calcium, magnesium, sodium, cobalt, molybdenum, potassium, manganese, zinc, copper and iron.

[0128] Inorganic sulfur-containing compounds, for example sulfates, sulfites, dithionites, tetrathionates, thiosulfates, sulfides, but also organic sulfur compounds, such as mercaptans and thiols, can be used as sources of sulfur.

[0129] Phosphoric acid, potassium dihydrogenphosphate or dipotassium hydrogenphosphate or the corresponding sodium-containing salts can be used as sources of phosphorus.

[0130] Chelating agents can be added to the medium, in order to keep the metal ions in solution. Especially suitable chelating agents comprise dihydroxyphenols, such as catechol or protocatechuate, or organic acids, such as citric acid.

[0131] The fermentation media used according to the invention may also contain other growth factors, such as vitamins or growth promoters, which include for example biotin, riboflavin, thiamine, folic acid, nicotinic acid, pantothenate and pyridoxine. Growth factors and salts often come from complex components of the media, such as yeast extract, molasses, corn-steep liquor and the like. In addition, suitable precursors can be added to the culture medium. The precise composition of the compounds in the medium is strongly dependent on the particular experiment and must be decided individually for each specific case. Information on media optimization can be found in the textbook "Applied Microbiol. Physiology, A Practical Approach" (Publ. P. M. Rhodes, P. F. Stanbury, IRL Press (1997) p. 53-73, ISBN 0 19 963577 3). Growing media can also be obtained from commercial suppliers, such as Standard 1 (Merck) or BHI (Brain heart infusion, DIFCO) etc.

[0132] All components of the medium are sterilized, either by heating (20 min at 1.5 bar and 121 °C) or by sterile filtration. The components can be sterilized either together, or if necessary separately. All the components of the medium can be present at the start of growing, or optionally can be added continuously or by batch feed.

[0133] The temperature of the culture is normally between 15 °C and 45 °C, preferably 25 °C to 40 °C and can be kept constant or can be varied during the experiment. The pH value of the medium should be in the range from 5 to 8.5, preferably around 7.0. The pH value for growing can be controlled during growing by adding basic compounds such as sodium hydroxide, potassium hydroxide, ammonia or ammonia water or acid compounds such as phosphoric acid or sulfuric acid. Antifoaming agents, e.g. fatty acid polyglycol esters, can be used for controlling foaming. To maintain the stability of plasmids, suitable substances with selective action, e.g. antibiotics, can be added to the medium. Oxygen or oxygen-containing gas mixtures, e.g. the ambient air, are fed into the culture in order to maintain aerobic conditions. The temperature of the culture is normally from 20 °C to 45 °C. Culture is continued until a maximum of the desired product has formed. This is normally achieved within 10 hours to 160 hours.
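The ranges stated in this paragraph can be captured in a small validation helper, for example when recording planned fermentation runs. This is a minimal sketch; the class, field and function names are hypothetical, and the limits are simply the broadest ranges quoted above (15 to 45 °C, pH 5 to 8.5, 10 to 160 hours).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CultureConditions:
    temperature_c: float
    ph: float
    duration_h: float


# Broadest ranges quoted in the paragraph above (inclusive bounds).
RANGES = {
    "temperature_c": (15.0, 45.0),
    "ph": (5.0, 8.5),
    "duration_h": (10.0, 160.0),
}


def out_of_range(conditions: CultureConditions) -> list[str]:
    """Return a message for every parameter outside the quoted range."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = getattr(conditions, name)
        if not low <= value <= high:
            problems.append(f"{name}={value} outside {low}-{high}")
    return problems


print(out_of_range(CultureConditions(temperature_c=30.0, ph=7.0, duration_h=48.0)))
print(out_of_range(CultureConditions(temperature_c=50.0, ph=7.0, duration_h=48.0)))
```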

[0134] The cells can be disrupted optionally by high-frequency ultrasound, by high pressure, e.g. in a French pressure cell, by osmolysis, by the action of detergents, lytic enzymes or organic solvents, by means of homogenizers or by a combination of several of the methods listed.

3.6 GABA Isolation

[0135] The methodology of the present invention can further include a step of recovering GABA. The term "recovering" includes extracting, harvesting, isolating or purifying the compound from culture media. Recovering the compound can be performed according to any conventional isolation or purification methodology known in the art including, but not limited to, treatment with a conventional resin (e.g., anion or cation exchange resin, non-ionic adsorption resin, etc.), treatment with a conventional adsorbent (e.g., activated charcoal, silicic acid, silica gel, cellulose, alumina, etc.), alteration of pH, solvent extraction (e.g., with a conventional solvent such as an alcohol, ethyl acetate, hexane and the like), distillation, dialysis, filtration, concentration, crystallization, recrystallization, pH adjustment, lyophilization and the like. For example GABA can be recovered from culture media by first removing the microorganisms. The remaining broth is then passed through or over a cation exchange resin to remove unwanted cations and then through or over an anion exchange resin to remove unwanted inorganic anions and organic acids.

3.7 Polyamide Polymers and Pyrrolidone

[0136] In another aspect, the present invention provides a process for the production of polymers, in particular polyamides, comprising a step as mentioned above for the production of GABA. The GABA is reacted in a known manner with itself or at least one different co-monomer, selected from amino- and hydroxycarboxylic acids by applying standard methods of polymer synthesis. Suitable co-monomers are for example derived from C.sub.2-C.sub.31, preferably C.sub.4-C.sub.10-straight or branched chain monocarboxylic acids, carrying at least one reactive hydroxyl or amino group.

[0137] Such hydroxyl- or amino-substituted, copolymerizable "carboxylic acids" are derived from straight-chain or branched, saturated or mono- or poly-unsaturated C.sub.2-C.sub.30-monocarboxylic acids. In particular, said acids carry a straight-chain mono- or poly-unsaturated hydrocarbyl residue or a mixture of such residues with an average length of 1-30, preferably 3-9 carbon atoms. Particularly preferred residues are:

[0138] saturated, straight-chain residues like CH.sub.3--; C.sub.2H.sub.5--; C.sub.3H.sub.7--; C.sub.4H.sub.9--; C.sub.5H.sub.11--; C.sub.6H.sub.13--; C.sub.7H.sub.15--; C.sub.8H.sub.17--; C.sub.9H.sub.19--; C.sub.10H.sub.21--; C.sub.11H.sub.23--; C.sub.12H.sub.25--; C.sub.13H.sub.27--; C.sub.14H.sub.29--; C.sub.15H.sub.31--; C.sub.16H.sub.33--; C.sub.17H.sub.35--; C.sub.18H.sub.37--; C.sub.19H.sub.39--; C.sub.20H.sub.41--; C.sub.21H.sub.43--; C.sub.23H.sub.47--; C.sub.24H.sub.49--; C.sub.25H.sub.51--; C.sub.29H.sub.59--; C.sub.30H.sub.61--;

[0139] saturated, branched residues like iso-C.sub.3H.sub.7--; iso-C.sub.4H.sub.9--; iso-C.sub.18H.sub.37--;

[0140] mono-unsaturated, straight-chain residues like C.sub.2H.sub.3--; C.sub.3H.sub.5--; C.sub.15H.sub.29--; C.sub.17H.sub.33--; C.sub.21H.sub.41--;

[0141] two-fold unsaturated, straight-chain residues like C.sub.5H.sub.7--; C.sub.17H.sub.31--.

Those residues are modified so that they carry at least one functional substituent, selected from hydroxyl and amino groups, required for copolymerization.

[0142] In another aspect the fermentatively produced GABA may be applied for producing pyrrolidone by applying standard techniques of organic synthesis.

[0143] The following examples only serve to illustrate the invention. The numerous possible variations that are obvious to a person skilled in the art also fall within the scope of the invention.

Experimental Part

[0144] Unless otherwise stated, the following experiments have been performed by applying standard equipment, methods, chemicals, and biochemicals as used in genetic engineering, in the fermentative production of chemical compounds by cultivation of microorganisms, and in the analysis and isolation of products. See also Sambrook et al. and Chmiel et al., as cited herein above.

Example 1

Cloning of an E. coli Glutamate Decarboxylase (GAD) Gene

[0145] The PCR primer pairs WKJ95/WKJ96 and WKJ99/WKJ100 were used with chromosomal DNA of E. coli as a template to amplify the DNA fragments containing the gadBC and gadA genes, respectively. The amplified DNA fragments were purified, digested with restriction enzymes (XhoI/XbaI for gadBC and XhoI/SpeI for gadA), and ligated into the pClik5aMCS vector (SEQ ID NO:14; FIG. 3) digested with the same restriction enzymes, resulting in pClik5aMCS gadBC and pClik5aMCS gadA, respectively.

[0146] Oligonucleotide Primers Used:

Primer	SEQ ID NO	Sequence
WKJ95	10	ccgctcgagcggcccaagcttcggtaaatacttataccggag
WKJ96	11	ctagtctagactagcccaagcttgtcgatcatcgcctgttg
WKJ99	12	ccgctcgagcggcccaagcttcgtgataaattgcgtcagaaag
WKJ100	13	ctagactagtctagcccaagcttctcgaatttggcttgcatcc
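As a quick cross-check of the cloning strategy in Example 1, the recognition sites of the named restriction enzymes can be located in the four primer sequences. The following Python sketch is illustrative only. The recognition sequences are the commonly cited ones for XhoI, XbaI and SpeI; HindIII is not mentioned in the text and is included here only because its site (aagctt) occurs in all four primers. The names `SITES`, `PRIMERS` and `find_sites` are assumptions introduced for this sketch.

```python
# Recognition sequences, written in lower case to match the primer listing.
SITES = {
    "XhoI": "ctcgag",
    "XbaI": "tctaga",
    "SpeI": "actagt",
    "HindIII": "aagctt",  # assumption: not named in Example 1, shown for illustration
}

# Primer sequences as given for SEQ ID NO: 10 to 13.
PRIMERS = {
    "WKJ95": "ccgctcgagcggcccaagcttcggtaaatacttataccggag",
    "WKJ96": "ctagtctagactagcccaagcttgtcgatcatcgcctgttg",
    "WKJ99": "ccgctcgagcggcccaagcttcgtgataaattgcgtcagaaag",
    "WKJ100": "ctagactagtctagcccaagcttctcgaatttggcttgcatcc",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: 0-based position of first match} for sites present in seq."""
    return {name: seq.find(motif) for name, motif in sites.items() if motif in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Under these assumptions, the forward primers (WKJ95, WKJ99) carry an XhoI site, while the reverse primers carry an XbaI site (WKJ96) or a SpeI site (WKJ100), which agrees with the XhoI/XbaI and XhoI/SpeI digests described for gadBC and gadA.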

Example 2

Search for GAD Gene in Solanum tuberosum (Potato)

[0147] In order to find a yet unknown gene that encodes GAD in potato, the first step was to identify a GAD from a closely related organism. A query in the sequence databases Genbank, Refseq and Uniprot for "glutamate decarboxylase" in the genus "Solanum" revealed a previously characterised GAD in Solanum lycopersicum (tomato), Swissprot accession number P54767. This sequence was used as a template to perform a tblastn search in the Genbank subsections plant, EST and GSS. Among the best 100 hits (expect value <10.sup.-118), 16 sequences were from Solanum tuberosum (potato). All 16 sequences are expressed sequence tags (EST), i.e. they represent fragments of the expressed and spliced mRNAs. An assembly using VectorNTI Contig Express (settings: overlap=20, identity=0.8, cut-off score=40) revealed a contig composed of 15 sequences and a second contig made up of one sequence (BG594946).

[0148] To check the quality of the assembly and to decide which of the contigs to choose, the consensus sequences of both assemblies were generated and compared with all 100 hits from the initial blast search. The alignment (shown as guide tree in FIG. 1) revealed contig 1 to be an outlier. Since contig 2, which is composed solely of the EST with the accession BG594946, fits the tomato GAD very well, it was chosen as the best candidate to represent the potato GAD.

[0149] Since BG594946 only covers the core of the GAD gene, the flanking 5' and 3' regions were taken from the corresponding tomato gene resulting in a chimera GAD gene as shown in FIG. 2.
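The construction of the chimera can be pictured as a splice of two aligned sequences: the reference supplies the 5' and 3' flanks and the partial sequence supplies the core. The Python sketch below illustrates this idea with toy strings only. The function name `build_chimera`, the alignment offset and the example sequences are hypothetical, and an ungapped alignment is assumed, which is a simplification; the actual boundaries used for the gene of FIG. 2 are not stated in the text.

```python
def build_chimera(flank_source: str, core: str, core_start: int) -> str:
    """Replace the region of flank_source that aligns to core.

    flank_source: full-length reference sequence (the tomato gene in the text)
    core:         partial sequence covering only the middle (the potato EST)
    core_start:   0-based position in flank_source where core aligns
    Assumes an ungapped alignment between core and flank_source.
    """
    core_end = core_start + len(core)
    if core_start < 0 or core_end > len(flank_source):
        raise ValueError("core does not fit inside the reference sequence")
    # 5' flank from the reference + core + 3' flank from the reference
    return flank_source[:core_start] + core + flank_source[core_end:]


if __name__ == "__main__":
    tomato = "AAAACCCCGGGGTTTT"  # toy stand-in, not a real sequence
    potato_core = "CCTCGGAG"     # toy stand-in, assumed to align at position 4
    print(build_chimera(tomato, potato_core, 4))
```

The result keeps the length of the reference, so any coordinates defined on the reference gene remain valid for the chimera in this simplified picture.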

Example 3

Cloning of a Synthetic Chimera GAD Gene

[0150] As the codon usage of the plant-derived chimera GAD gene is quite different from that of C. glutamicum genes, expression of the chimera GAD gene may not be efficient in a C. glutamicum strain. To enhance gene expression in C. glutamicum, a synthetic GAD gene with a sequence adapted to C. glutamicum codon usage was created on the basis of the chimera GAD gene without a calmodulin binding sequence. Furthermore, the synthetic GAD gene had a C. glutamicum sodA promoter (Psod) and a groEL terminator. The synthetic GAD gene was digested with the restriction enzyme SpeI and inserted into the pClik5aMCS vector digested with the same restriction enzyme, resulting in pClik5aMCS Psod SL_gad.
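Codon adaptation of this kind can be sketched as a back-translation of the protein sequence with one preferred codon per amino acid. The table in the Python sketch below was read off the opening codons of the coding region of SEQ ID NO:1 in the sequence listing; whether the complete synthetic gene was designed with exactly one codon per amino acid, and by which procedure, is an assumption. The names `PREFERRED_CODON` and `back_translate` are introduced here for illustration.

```python
# One codon per amino acid, as observed at the start of the coding region of
# SEQ ID NO:1 (assumption: the same choice holds throughout the gene).
PREFERRED_CODON = {
    "A": "gca", "C": "tgc", "D": "gat", "E": "gaa", "F": "ttc",
    "G": "ggc", "H": "cac", "I": "atc", "K": "aag", "L": "ctg",
    "M": "atg", "N": "aac", "P": "cca", "Q": "cag", "R": "cgc",
    "S": "tcc", "T": "acc", "V": "gtg", "W": "tgg", "Y": "tac",
}


def back_translate(protein: str, table=PREFERRED_CODON) -> str:
    """Return a DNA sequence encoding protein using one fixed codon per residue."""
    try:
        return "".join(table[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unknown amino acid: {err.args[0]}") from None


if __name__ == "__main__":
    # First 13 residues of SEQ ID NO:2 (Met Val Leu Thr Thr Thr Ser Ile Arg Asp Ser Glu Glu)
    print(back_translate("MVLTTTSIRDSEE"))
```

For the first 13 residues this reproduces the corresponding codons shown in SEQ ID NO:1. A fixed-codon table is the simplest possible design choice; practical codon optimization often also considers codon frequency distributions, restriction sites and mRNA structure, none of which is modelled here.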

Example 4

GABA Production in Shake Flask Culture

[0151] To construct a GABA production strain, the glutamate-producing bacterium C. glutamicum ATCC13032 was transformed with the recombinant plasmids containing the GAD genes.

[0152] Shaking flask experiments were performed on the recombinant strains to test the GABA production. The strains were pre-cultured on CM plates (10 g/l glucose, 2.5 g/l NaCl, 2 g/l urea, 10 g/l Bacto peptone, 10 g/l yeast extract, 22 g/l agar) overnight at 30.degree. C. Cultured cells were harvested into a microtube containing 1.5 ml of 0.9% NaCl, and the cell density was determined by the absorbance at 610 nm after vortexing. For the main culture, suspended cells were inoculated to an initial OD of 1.5 into 10 ml of the production medium (60 g/l glucose, 30 g/l (NH.sub.4).sub.2SO.sub.4, 2 g/l yeast extract, 1 g/l KH.sub.2PO.sub.4, 1 g/l MgSO.sub.4.7H.sub.2O, 10 mg/l FeSO.sub.4.7H.sub.2O, 10 mg/l MnSO.sub.4.H.sub.2O, 0.2 mg/l thiamine.HCl, 2 mg/l biotin, 52 g/l ACES, pH 6.5) contained in an autoclaved 100 ml Erlenmeyer flask. The main culture was performed on a rotary shaker (Infors AJ118, Bottmingen, Switzerland) at 200 rpm for 48 hours at 30.degree. C. The GABA concentration was determined by HPLC (Agilent 1100 Series) with a Gemini C18 column (Phenomenex) and a fluorescence detector (Agilent). Pre-column derivatization with ortho-phthalaldehyde allows the quantification of GABA. Cell growth was monitored with a spectrophotometer at 610 nm.
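The volume of cell suspension needed to reach the initial OD of 1.5 in 10 ml of production medium follows from a simple dilution balance. The Python sketch below assumes that OD at 610 nm scales linearly with cell density and that the suspension is added on top of the 10 ml of medium; the function name `inoculum_volume_ml` and the suspension OD of 30 are hypothetical values chosen for illustration, since the text does not report the measured OD of the suspension.

```python
def inoculum_volume_ml(stock_od: float, target_od: float, medium_ml: float) -> float:
    """Volume of suspension to add to medium_ml so the mixture has target_od.

    Balance: stock_od * v = target_od * (medium_ml + v)
    =>       v = target_od * medium_ml / (stock_od - target_od)
    Assumes OD is proportional to cell density (linear range of the photometer).
    """
    if stock_od <= target_od:
        raise ValueError("suspension must be denser than the target OD")
    return target_od * medium_ml / (stock_od - target_od)


if __name__ == "__main__":
    # Hypothetical suspension OD of 30; target OD 1.5 and 10 ml are from the text.
    v = inoculum_volume_ml(stock_od=30.0, target_od=1.5, medium_ml=10.0)
    print(f"add {v:.3f} ml of cell suspension")
```

If the inoculum is instead counted as part of the final 10 ml, the balance simplifies to v = target_od * final_ml / stock_od; the difference is small when the suspension is much denser than the target.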

[0153] An accumulation of GABA was observed in all recombinant strains containing a GAD gene. The recombinant strain carrying the pClik5aMCS Psod SL_gad plasmid showed the highest GABA productivity. The results are summarized in the following table:

Table: GABA production in shaking flask culture

Strains	GABA (mmol/g cell)
ATCC13032	0.0
+pClik5aMCS	0.0
+pClik5aMCS gadA	0.2
+pClik5aMCS gadBC	0.4
+pClik5aMCS Psod SL_gad	1.2
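The relative performance of the strains can be read directly from the table; the short Python sketch below merely makes the arithmetic explicit. The values are those of the table above, while the names `GABA_MMOL_PER_G` and `fold_over` are introduced here for illustration. The table reports one value per strain, so no statement about variability can be derived from it.

```python
# GABA yield per strain in mmol/g cell, as listed in the table.
GABA_MMOL_PER_G = {
    "ATCC13032": 0.0,
    "+pClik5aMCS": 0.0,
    "+pClik5aMCS gadA": 0.2,
    "+pClik5aMCS gadBC": 0.4,
    "+pClik5aMCS Psod SL_gad": 1.2,
}


def fold_over(reference: str, strain: str, data=GABA_MMOL_PER_G) -> float:
    """Ratio of the yield of strain to that of reference (reference must be > 0)."""
    if data[reference] == 0:
        raise ValueError("reference strain produced no GABA; ratio undefined")
    return data[strain] / data[reference]


if __name__ == "__main__":
    best = max(GABA_MMOL_PER_G, key=GABA_MMOL_PER_G.get)
    for ref in ("+pClik5aMCS gadA", "+pClik5aMCS gadBC"):
        print(f"{best} vs {ref}: {fold_over(ref, best):.1f}-fold")
```

On these figures the strain carrying pClik5aMCS Psod SL_gad yields about six times as much GABA per gram of cells as the gadA strain and about three times as much as the gadBC strain.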

Any document cited herein is incorporated by reference.

Sequence CWU 1

1

1411666DNAArtificial sequenceGDC potato- tomato chimera 1tagctgccaa ttattccggg cttgtgaccc gctacccgat aaataggtcg gctgaaaaat 60ttcgttgcaa tatcaacaaa aaggcctatc attgggaggt gtcgcaccaa gtacttttgc 120gaagcgccat ctgacggatt ttcaaaagat gtatatgctc ggtgcggaaa cctacgaaag 180gattttttac cc atg gtg ctg acc acc acc tcc atc cgc gat tcc gaa gaa 231 Met Val Leu Thr Thr Thr Ser Ile Arg Asp Ser Glu Glu 1 5 10 tcc ctg cac tgc acc ttc gca tcc cgc tac gtg cag gaa cca ctg cca 279Ser Leu His Cys Thr Phe Ala Ser Arg Tyr Val Gln Glu Pro Leu Pro 15 20 25 aag ttc aag atc cca aag aag tcc atg cca aag gaa gca gca tac cag 327Lys Phe Lys Ile Pro Lys Lys Ser Met Pro Lys Glu Ala Ala Tyr Gln 30 35 40 45 atc gtg aac gat gaa ctg atg ctg gat ggc aac cca cgc ctg aac ctg 375Ile Val Asn Asp Glu Leu Met Leu Asp Gly Asn Pro Arg Leu Asn Leu 50 55 60 gca tcc ttc gtg tcc acc tgg atg gaa cca gaa tgc gat aag ctg atc 423Ala Ser Phe Val Ser Thr Trp Met Glu Pro Glu Cys Asp Lys Leu Ile 65 70 75 atg tcc tcc atc aac aag aac tac gtg gat atg gat gaa tac cca gtg 471Met Ser Ser Ile Asn Lys Asn Tyr Val Asp Met Asp Glu Tyr Pro Val 80 85 90 acc acc gaa ctg cag aac cgc tgc gtg aac atg ctg gca cac ctg ttc 519Thr Thr Glu Leu Gln Asn Arg Cys Val Asn Met Leu Ala His Leu Phe 95 100 105 cac gca cca gtg ggc gat gat gaa acc gca gtg ggc gtg ggc acc gtg 567His Ala Pro Val Gly Asp Asp Glu Thr Ala Val Gly Val Gly Thr Val 110 115 120 125 ggc tcc tcc gaa gca atc atg ctg gca ggc ctg gca ttc aag cgc aag 615Gly Ser Ser Glu Ala Ile Met Leu Ala Gly Leu Ala Phe Lys Arg Lys 130 135 140 tgg cag gca aag cgc aag gca gaa ggc aag cca ttc gat aag cca aac 663Trp Gln Ala Lys Arg Lys Ala Glu Gly Lys Pro Phe Asp Lys Pro Asn 145 150 155 atc gtg acc ggc gca aac gtg cag gtg tgc tgg gaa aag ttc gca cgc 711Ile Val Thr Gly Ala Asn Val Gln Val Cys Trp Glu Lys Phe Ala Arg 160 165 170 tac ttc gaa gtg gaa ctg aag gaa gtg aag ctg aag gaa ggc tac tac 759Tyr Phe Glu Val Glu Leu Lys Glu Val Lys Leu Lys Glu Gly Tyr Tyr 175 180 185 gtg atg gat cca gca aag gca gtg gaa atg gtg 
gat gaa aac acc atc 807Val Met Asp Pro Ala Lys Ala Val Glu Met Val Asp Glu Asn Thr Ile 190 195 200 205 tgc gtg gca gca atc ctg ggc tcc acc ctg acc ggc gaa ttc gaa gat 855Cys Val Ala Ala Ile Leu Gly Ser Thr Leu Thr Gly Glu Phe Glu Asp 210 215 220 gtg aag ctg ctg aac gaa ctg ctg acc aag aag aac aag gaa acc ggc 903Val Lys Leu Leu Asn Glu Leu Leu Thr Lys Lys Asn Lys Glu Thr Gly 225 230 235 tgg gat acc cca atc cac gtg gat gca gca tcc ggc ggc ttc atc gca 951Trp Asp Thr Pro Ile His Val Asp Ala Ala Ser Gly Gly Phe Ile Ala 240 245 250 cca ttc ctg tgg cca gat ctg gaa tgg gat ttc cgc ctg cca ctg gtg 999Pro Phe Leu Trp Pro Asp Leu Glu Trp Asp Phe Arg Leu Pro Leu Val 255 260 265 aag tcc atc aac gtg tcc ggc cac aag tac ggc ctg gtg tac gca ggc 1047Lys Ser Ile Asn Val Ser Gly His Lys Tyr Gly Leu Val Tyr Ala Gly 270 275 280 285 gtg ggc tgg gtg atc tgg cgc tcc aag gaa gat ctg cca gat gaa ctg 1095Val Gly Trp Val Ile Trp Arg Ser Lys Glu Asp Leu Pro Asp Glu Leu 290 295 300 gtg ttc cac atc aac tac ctg ggc tcc gat cag cca acc ttc acc ctg 1143Val Phe His Ile Asn Tyr Leu Gly Ser Asp Gln Pro Thr Phe Thr Leu 305 310 315 aac ttc tcc aag tcc tcc tac cag atc atc gca cag tac tac cag ttc 1191Asn Phe Ser Lys Ser Ser Tyr Gln Ile Ile Ala Gln Tyr Tyr Gln Phe 320 325 330 atc cgc ctg ggc ttc gaa ggc tac aag gat gtg atg aag aac tgc ctg 1239Ile Arg Leu Gly Phe Glu Gly Tyr Lys Asp Val Met Lys Asn Cys Leu 335 340 345 tcc aac gca aag gtg ctg acc gaa ggc atc acc aag atg ggc cgc ttc 1287Ser Asn Ala Lys Val Leu Thr Glu Gly Ile Thr Lys Met Gly Arg Phe 350 355 360 365 gat atc gtg tcc aag gat gtg ggc gtg cca gtg gtg gca ttc tcc ctg 1335Asp Ile Val Ser Lys Asp Val Gly Val Pro Val Val Ala Phe Ser Leu 370 375 380 cgc gat tcc tcc aag tac acc gtg ttc gaa gtg tcc gaa cac ctg cgc 1383Arg Asp Ser Ser Lys Tyr Thr Val Phe Glu Val Ser Glu His Leu Arg 385 390 395 cgc ttc ggc tgg atc gtg cca gca tac acc atg cca cca gat gca gaa 1431Arg Phe Gly Trp Ile Val Pro Ala Tyr Thr Met Pro Pro Asp Ala Glu 400 405 410 cac atc gca gtg ctg 
cgc gtg gtg atc cgc gaa gat ttc tcc cac tcc 1479His Ile Ala Val Leu Arg Val Val Ile Arg Glu Asp Phe Ser His Ser 415 420 425 ctg gca gaa cgc ctg gtg tcc gat atc gaa aag atc ctg tcc gaa ctg 1527Leu Ala Glu Arg Leu Val Ser Asp Ile Glu Lys Ile Leu Ser Glu Leu 430 435 440 445 gat acc cag cca cca cgc ctg cca acc aag gca gtg cgc gtg acc gca 1575Asp Thr Gln Pro Pro Arg Leu Pro Thr Lys Ala Val Arg Val Thr Ala 450 455 460 gaa gaa gtg cgc gat gat aag ggc gat taa agttctgtga aaaacaccgt 1625Glu Glu Val Arg Asp Asp Lys Gly Asp 465 470 ggggcagttt ctgcttcgcg gtgtttttta tttgtggggc a 16662470PRTArtificial sequenceSynthetic Construct 2Met Val Leu Thr Thr Thr Ser Ile Arg Asp Ser Glu Glu Ser Leu His 1 5 10 15 Cys Thr Phe Ala Ser Arg Tyr Val Gln Glu Pro Leu Pro Lys Phe Lys 20 25 30 Ile Pro Lys Lys Ser Met Pro Lys Glu Ala Ala Tyr Gln Ile Val Asn 35 40 45 Asp Glu Leu Met Leu Asp Gly Asn Pro Arg Leu Asn Leu Ala Ser Phe 50 55 60 Val Ser Thr Trp Met Glu Pro Glu Cys Asp Lys Leu Ile Met Ser Ser 65 70 75 80 Ile Asn Lys Asn Tyr Val Asp Met Asp Glu Tyr Pro Val Thr Thr Glu 85 90 95 Leu Gln Asn Arg Cys Val Asn Met Leu Ala His Leu Phe His Ala Pro 100 105 110 Val Gly Asp Asp Glu Thr Ala Val Gly Val Gly Thr Val Gly Ser Ser 115 120 125 Glu Ala Ile Met Leu Ala Gly Leu Ala Phe Lys Arg Lys Trp Gln Ala 130 135 140 Lys Arg Lys Ala Glu Gly Lys Pro Phe Asp Lys Pro Asn Ile Val Thr 145 150 155 160 Gly Ala Asn Val Gln Val Cys Trp Glu Lys Phe Ala Arg Tyr Phe Glu 165 170 175 Val Glu Leu Lys Glu Val Lys Leu Lys Glu Gly Tyr Tyr Val Met Asp 180 185 190 Pro Ala Lys Ala Val Glu Met Val Asp Glu Asn Thr Ile Cys Val Ala 195 200 205 Ala Ile Leu Gly Ser Thr Leu Thr Gly Glu Phe Glu Asp Val Lys Leu 210 215 220 Leu Asn Glu Leu Leu Thr Lys Lys Asn Lys Glu Thr Gly Trp Asp Thr 225 230 235 240 Pro Ile His Val Asp Ala Ala Ser Gly Gly Phe Ile Ala Pro Phe Leu 245 250 255 Trp Pro Asp Leu Glu Trp Asp Phe Arg Leu Pro Leu Val Lys Ser Ile 260 265 270 Asn Val Ser Gly His Lys Tyr Gly Leu Val Tyr Ala Gly Val Gly Trp 275 280 285 Val Ile Trp 
Arg Ser Lys Glu Asp Leu Pro Asp Glu Leu Val Phe His 290 295 300 Ile Asn Tyr Leu Gly Ser Asp Gln Pro Thr Phe Thr Leu Asn Phe Ser 305 310 315 320 Lys Ser Ser Tyr Gln Ile Ile Ala Gln Tyr Tyr Gln Phe Ile Arg Leu 325 330 335 Gly Phe Glu Gly Tyr Lys Asp Val Met Lys Asn Cys Leu Ser Asn Ala 340 345 350 Lys Val Leu Thr Glu Gly Ile Thr Lys Met Gly Arg Phe Asp Ile Val 355 360 365 Ser Lys Asp Val Gly Val Pro Val Val Ala Phe Ser Leu Arg Asp Ser 370 375 380 Ser Lys Tyr Thr Val Phe Glu Val Ser Glu His Leu Arg Arg Phe Gly 385 390 395 400 Trp Ile Val Pro Ala Tyr Thr Met Pro Pro Asp Ala Glu His Ile Ala 405 410 415 Val Leu Arg Val Val Ile Arg Glu Asp Phe Ser His Ser Leu Ala Glu 420 425 430 Arg Leu Val Ser Asp Ile Glu Lys Ile Leu Ser Glu Leu Asp Thr Gln 435 440 445 Pro Pro Arg Leu Pro Thr Lys Ala Val Arg Val Thr Ala Glu Glu Val 450 455 460 Arg Asp Asp Lys Gly Asp 465 470 31509DNASolanum lycopersicumCDS(1)..(1509) 3atg gtg tta aca acg acg tcg ata aga gat tca gaa gag agc ttg cac 48Met Val Leu Thr Thr Thr Ser Ile Arg Asp Ser Glu Glu Ser Leu His 1 5 10 15 tgt aca ttt gca tca aga tat gta cag gaa cct tta cct aag ttc aaa 96Cys Thr Phe Ala Ser Arg Tyr Val Gln Glu Pro Leu Pro Lys Phe Lys 20 25 30 atg cct aaa aaa tcc atg ccg aaa gaa gca gct tat cag att gta aac 144Met Pro Lys Lys Ser Met Pro Lys Glu Ala Ala Tyr Gln Ile Val Asn 35 40 45 gac gag ctt atg ttg gat ggt aac ccc agg ttg aat tta gct tcc ttt 192Asp Glu Leu Met Leu Asp Gly Asn Pro Arg Leu Asn Leu Ala Ser Phe 50 55 60 gtt agc aca tgg atg gag ccc gag tgc gat aag ctc atc atg tca tcc 240Val Ser Thr Trp Met Glu Pro Glu Cys Asp Lys Leu Ile Met Ser Ser 65 70 75 80 att aat aaa aac tat gtc gac atg gat gag tat cct gtc acc act gaa 288Ile Asn Lys Asn Tyr Val Asp Met Asp Glu Tyr Pro Val Thr Thr Glu 85 90 95 ctt caa aat aga tgt gtt aac atg tta gca cat ctt ttc cat gcc ccg 336Leu Gln Asn Arg Cys Val Asn Met Leu Ala His Leu Phe His Ala Pro 100 105 110 gtt ggt gat gat gag act gca gtt gga gtt ggt aca gtg ggt tca tca 384Val Gly Asp Asp Glu Thr Ala Val Gly Val 
Gly Thr Val Gly Ser Ser 115 120 125 gag gca ata atg ctt gct ggc ctt gct ttc aaa cgc aaa tgg caa tcg 432Glu Ala Ile Met Leu Ala Gly Leu Ala Phe Lys Arg Lys Trp Gln Ser 130 135 140 aaa aga aaa gca gaa ggc aaa cct ttc gat aag cct aat ata gtc act 480Lys Arg Lys Ala Glu Gly Lys Pro Phe Asp Lys Pro Asn Ile Val Thr 145 150 155 160 gga gct aat gtg cag gtc tgc tgg gaa aaa ttt gca agg tat ttt gag 528Gly Ala Asn Val Gln Val Cys Trp Glu Lys Phe Ala Arg Tyr Phe Glu 165 170 175 gtt gag ttg aag gag gtg aaa cta aaa gaa gga tac tat gta atg gac 576Val Glu Leu Lys Glu Val Lys Leu Lys Glu Gly Tyr Tyr Val Met Asp 180 185 190 cct gcc aaa gca gta gag ata gtg gat gag aat aca ata tgt gtt gct 624Pro Ala Lys Ala Val Glu Ile Val Asp Glu Asn Thr Ile Cys Val Ala 195 200 205 gca atc ctt ggt tct act ctg act ggg gag ttt gag gat gtg aag ctc 672Ala Ile Leu Gly Ser Thr Leu Thr Gly Glu Phe Glu Asp Val Lys Leu 210 215 220 cta aac gag ctc ctt aca aaa aag aac aag gaa acc gga tgg gag aca 720Leu Asn Glu Leu Leu Thr Lys Lys Asn Lys Glu Thr Gly Trp Glu Thr 225 230 235 240 ccg att cat gtc gat gct gcg agt gga gga ttt att gct cct ttc ctc 768Pro Ile His Val Asp Ala Ala Ser Gly Gly Phe Ile Ala Pro Phe Leu 245 250 255 tgg cca gat ctt gaa tgg gat ttc cgt ttg cct ctt gtg aaa agt ata 816Trp Pro Asp Leu Glu Trp Asp Phe Arg Leu Pro Leu Val Lys Ser Ile 260 265 270 aat gtc agc ggt cac aag tat ggc ctt gta tat gct ggt gtc ggt tgg 864Asn Val Ser Gly His Lys Tyr Gly Leu Val Tyr Ala Gly Val Gly Trp 275 280 285 gtg ata tgg cgg agc aag gaa gac ttg ccc gat gaa ctc gtc ttt cat 912Val Ile Trp Arg Ser Lys Glu Asp Leu Pro Asp Glu Leu Val Phe His 290 295 300 ata aac tac ctt ggg tct gat cag cct act ttt act ctc aac ttc tct 960Ile Asn Tyr Leu Gly Ser Asp Gln Pro Thr Phe Thr Leu Asn Phe Ser 305 310 315 320 aaa ggt tcc tat caa ata att gca cag tat tat cag tta ata aga ctt 1008Lys Gly Ser Tyr Gln Ile Ile Ala Gln Tyr Tyr Gln Leu Ile Arg Leu 325 330 335 ggc ttt gag ggt tat aag aac gtc atg aag aat tgc tta tca aac gca 1056Gly Phe Glu Gly Tyr Lys 
Asn Val Met Lys Asn Cys Leu Ser Asn Ala 340 345 350 aaa gta cta aca gag gga atc aca aaa atg ggg cgg ttc gat att gtc 1104Lys Val Leu Thr Glu Gly Ile Thr Lys Met Gly Arg Phe Asp Ile Val 355 360 365 tct aag gat gtg ggt gtt cct gtt gta gca ttt tct ctc agg gac agc 1152Ser Lys Asp Val Gly Val Pro Val Val Ala Phe Ser Leu Arg Asp Ser 370 375 380 agc aaa tat acg gta ttt gaa gta tct gag cat ctc aga aga ttt gga 1200Ser Lys Tyr Thr Val Phe Glu Val Ser Glu His Leu Arg Arg Phe Gly 385 390 395 400 tgg atc gtc cct gca tac aca atg cca ccg gat gct gaa cac att gct 1248Trp Ile Val Pro Ala Tyr Thr Met Pro Pro Asp Ala Glu His Ile Ala 405 410 415 gta ctg cgg gtt gtc att aga gag gat ttc agc cac agc cta gct gag 1296Val Leu Arg Val Val Ile Arg Glu Asp Phe Ser His Ser Leu Ala Glu 420 425 430 aga ctt gtt tct gac att gag aaa att ctg tca gag ttg gac aca cag 1344Arg Leu Val Ser Asp Ile Glu Lys Ile Leu Ser Glu Leu Asp Thr Gln 435 440 445 cct cct cgt ttg ccc acc aaa gct gtc cgt gtc act gct gag gaa gtg 1392Pro Pro Arg Leu Pro Thr Lys Ala Val Arg Val Thr Ala Glu Glu Val 450 455 460 cgt gat gac aag ggt gat ggg ctt cat cat ttt cac atg gat act gta 1440Arg Asp Asp Lys Gly Asp Gly Leu His His Phe His Met Asp Thr Val 465 470 475 480 gag act cag aaa gac att atc aaa cat tgg agg aaa atc gca ggg aag 1488Glu Thr Gln Lys Asp Ile Ile Lys His Trp Arg Lys Ile Ala Gly Lys 485 490 495 aag acc agc gga gtc tgc tag 1509Lys Thr Ser Gly Val Cys 500 4502PRTSolanum lycopersicum 4Met Val Leu Thr Thr Thr Ser Ile Arg Asp Ser Glu Glu Ser Leu His 1 5 10 15 Cys Thr Phe Ala Ser Arg Tyr Val Gln Glu Pro Leu Pro Lys Phe Lys 20 25 30 Met Pro Lys Lys Ser Met Pro Lys Glu Ala Ala Tyr Gln Ile Val Asn 35 40 45 Asp Glu Leu Met Leu Asp Gly Asn Pro Arg Leu Asn Leu Ala Ser Phe 50 55 60 Val Ser Thr Trp Met Glu Pro Glu Cys Asp Lys Leu Ile Met Ser Ser 65 70 75

80 Ile Asn Lys Asn Tyr Val Asp Met Asp Glu Tyr Pro Val Thr Thr Glu 85 90 95 Leu Gln Asn Arg Cys Val Asn Met Leu Ala His Leu Phe His Ala Pro 100 105 110 Val Gly Asp Asp Glu Thr Ala Val Gly Val Gly Thr Val Gly Ser Ser 115 120 125 Glu Ala Ile Met Leu Ala Gly Leu Ala Phe Lys Arg Lys Trp Gln Ser 130 135 140 Lys Arg Lys Ala Glu Gly Lys Pro Phe Asp Lys Pro Asn Ile Val Thr 145 150 155 160 Gly Ala Asn Val Gln Val Cys Trp Glu Lys Phe Ala Arg Tyr Phe Glu 165 170 175 Val Glu Leu Lys Glu Val Lys Leu Lys Glu Gly Tyr Tyr Val Met Asp 180 185 190 Pro Ala Lys Ala Val Glu Ile Val Asp Glu Asn Thr Ile Cys Val Ala 195 200 205 Ala Ile Leu Gly Ser Thr Leu Thr Gly Glu Phe Glu Asp Val Lys Leu 210 215 220 Leu Asn Glu Leu Leu Thr Lys Lys Asn Lys Glu Thr Gly Trp Glu Thr 225 230 235 240 Pro Ile His Val Asp Ala Ala Ser Gly Gly Phe Ile Ala Pro Phe Leu 245 250 255 Trp Pro Asp Leu Glu Trp Asp Phe Arg Leu Pro Leu Val Lys Ser Ile 260 265 270 Asn Val Ser Gly His Lys Tyr Gly Leu Val Tyr Ala Gly Val Gly Trp 275 280 285 Val Ile Trp Arg Ser Lys Glu Asp Leu Pro Asp Glu Leu Val Phe His 290 295 300 Ile Asn Tyr Leu Gly Ser Asp Gln Pro Thr Phe Thr Leu Asn Phe Ser 305 310 315 320 Lys Gly Ser Tyr Gln Ile Ile Ala Gln Tyr Tyr Gln Leu Ile Arg Leu 325 330 335 Gly Phe Glu Gly Tyr Lys Asn Val Met Lys Asn Cys Leu Ser Asn Ala 340 345 350 Lys Val Leu Thr Glu Gly Ile Thr Lys Met Gly Arg Phe Asp Ile Val 355 360 365 Ser Lys Asp Val Gly Val Pro Val Val Ala Phe Ser Leu Arg Asp Ser 370 375 380 Ser Lys Tyr Thr Val Phe Glu Val Ser Glu His Leu Arg Arg Phe Gly 385 390 395 400 Trp Ile Val Pro Ala Tyr Thr Met Pro Pro Asp Ala Glu His Ile Ala 405 410 415 Val Leu Arg Val Val Ile Arg Glu Asp Phe Ser His Ser Leu Ala Glu 420 425 430 Arg Leu Val Ser Asp Ile Glu Lys Ile Leu Ser Glu Leu Asp Thr Gln 435 440 445 Pro Pro Arg Leu Pro Thr Lys Ala Val Arg Val Thr Ala Glu Glu Val 450 455 460 Arg Asp Asp Lys Gly Asp Gly Leu His His Phe His Met Asp Thr Val 465 470 475 480 Glu Thr Gln Lys Asp Ile Ile Lys His Trp Arg Lys Ile Ala Gly Lys 485 490 495 Lys 
Thr Ser Gly Val Cys 500 51401DNAEscherichia coliCDS(1)..(1401) 5atg gac cag aag ctg tta acg gat ttc cgc tca gaa cta ctc gat tca 48Met Asp Gln Lys Leu Leu Thr Asp Phe Arg Ser Glu Leu Leu Asp Ser 1 5 10 15 cgt ttt ggc gca aag gcc att tct act atc gcg gag tca aaa cga ttt 96Arg Phe Gly Ala Lys Ala Ile Ser Thr Ile Ala Glu Ser Lys Arg Phe 20 25 30 ccg ctg cac gaa atg cgc gat gat gtc gca ttt cag att atc aat gat 144Pro Leu His Glu Met Arg Asp Asp Val Ala Phe Gln Ile Ile Asn Asp 35 40 45 gaa tta tat ctt gat ggc aac gct cgt cag aac ctg gcc act ttc tgc 192Glu Leu Tyr Leu Asp Gly Asn Ala Arg Gln Asn Leu Ala Thr Phe Cys 50 55 60 cag acc tgg gac gac gaa aac gtc cat aaa ttg atg gat ttg tcg atc 240Gln Thr Trp Asp Asp Glu Asn Val His Lys Leu Met Asp Leu Ser Ile 65 70 75 80 aat aaa aac tgg atc gac aaa gaa gaa tat ccg caa tcc gca gcc atc 288Asn Lys Asn Trp Ile Asp Lys Glu Glu Tyr Pro Gln Ser Ala Ala Ile 85 90 95 gac ctg cgt tgc gta aat atg gtt gcc gat ctg tgg cat gcg cct gcg 336Asp Leu Arg Cys Val Asn Met Val Ala Asp Leu Trp His Ala Pro Ala 100 105 110 ccg aaa aat ggt cag gcc gtt ggc acc aac acc att ggt tct tcc gag 384Pro Lys Asn Gly Gln Ala Val Gly Thr Asn Thr Ile Gly Ser Ser Glu 115 120 125 gcc tgt atg ctc ggc ggg atg gcg atg aaa tgg cgt tgg cgc aag cgt 432Ala Cys Met Leu Gly Gly Met Ala Met Lys Trp Arg Trp Arg Lys Arg 130 135 140 atg gaa gct gca ggc aaa cca acg gat aaa cca aac ctg gtg tgc ggt 480Met Glu Ala Ala Gly Lys Pro Thr Asp Lys Pro Asn Leu Val Cys Gly 145 150 155 160 ccg gta caa atc tgc tgg cat aaa ttc gcc cgc tac tgg gat gtg gag 528Pro Val Gln Ile Cys Trp His Lys Phe Ala Arg Tyr Trp Asp Val Glu 165 170 175 ctg cgt gag atc cct atg cgc ccc ggt cag ttg ttt atg gac ccg aaa 576Leu Arg Glu Ile Pro Met Arg Pro Gly Gln Leu Phe Met Asp Pro Lys 180 185 190 cgc atg att gaa gcc tgt gac gaa aac acc atc ggc gtg gtg ccg act 624Arg Met Ile Glu Ala Cys Asp Glu Asn Thr Ile Gly Val Val Pro Thr 195 200 205 ttc ggc gtg acc tac acc ggt aac tat gag ttc cca caa ccg ctg cac 672Phe Gly Val Thr Tyr 
Thr Gly Asn Tyr Glu Phe Pro Gln Pro Leu His 210 215 220 gat gcg ctg gat aaa ttc cag gcc gac acc ggt atc gac atc gac atg 720Asp Ala Leu Asp Lys Phe Gln Ala Asp Thr Gly Ile Asp Ile Asp Met 225 230 235 240 cac atc gac gct gcc agc ggt ggc ttc ctg gca ccg ttc gtc gcc ccg 768His Ile Asp Ala Ala Ser Gly Gly Phe Leu Ala Pro Phe Val Ala Pro 245 250 255 gat atc gtc tgg gac ttc cgc ctg ccg cgt gtg aaa tcg atc agt gct 816Asp Ile Val Trp Asp Phe Arg Leu Pro Arg Val Lys Ser Ile Ser Ala 260 265 270 tca ggc cat aaa ttc ggt ctg gct ccg ctg ggc tgc ggc tgg gtt atc 864Ser Gly His Lys Phe Gly Leu Ala Pro Leu Gly Cys Gly Trp Val Ile 275 280 285 tgg cgt gac gaa gaa gcg ctg ccg cag gaa ctg gtg ttc aac gtt gac 912Trp Arg Asp Glu Glu Ala Leu Pro Gln Glu Leu Val Phe Asn Val Asp 290 295 300 tac ctg ggt ggt caa att ggt act ttt gcc atc aac ttc tcc cgc ccg 960Tyr Leu Gly Gly Gln Ile Gly Thr Phe Ala Ile Asn Phe Ser Arg Pro 305 310 315 320 gcg ggt cag gta att gca cag tac tat gaa ttc ctg cgc ctc ggt cgt 1008Ala Gly Gln Val Ile Ala Gln Tyr Tyr Glu Phe Leu Arg Leu Gly Arg 325 330 335 gaa ggc tat acc aaa gta cag aac gcc tct tac cag gtt gcc gct tat 1056Glu Gly Tyr Thr Lys Val Gln Asn Ala Ser Tyr Gln Val Ala Ala Tyr 340 345 350 ctg gcg gat gaa atc gcc aaa ctg ggg ccg tat gag ttc atc tgt acg 1104Leu Ala Asp Glu Ile Ala Lys Leu Gly Pro Tyr Glu Phe Ile Cys Thr 355 360 365 ggt cgc ccg gac gaa ggc atc ccg gcg gtt tgc ttc aaa ctg aaa gat 1152Gly Arg Pro Asp Glu Gly Ile Pro Ala Val Cys Phe Lys Leu Lys Asp 370 375 380 ggt gaa gat ccg gga tac acc ctg tac gac ctc tct gaa cgt ctg cgt 1200Gly Glu Asp Pro Gly Tyr Thr Leu Tyr Asp Leu Ser Glu Arg Leu Arg 385 390 395 400 ctg cgc ggc tgg cag gtt ccg gcc ttc act ctc ggc ggt gaa gcc acc 1248Leu Arg Gly Trp Gln Val Pro Ala Phe Thr Leu Gly Gly Glu Ala Thr 405 410 415 gac atc gtg gtg atg cgc att atg tgt cgt cgc ggc ttc gaa atg gac 1296Asp Ile Val Val Met Arg Ile Met Cys Arg Arg Gly Phe Glu Met Asp 420 425 430 ttt gct gaa ctg ttg ctg gaa gac tac aaa gcc tcc ctg aaa tat ctc 
1344Phe Ala Glu Leu Leu Leu Glu Asp Tyr Lys Ala Ser Leu Lys Tyr Leu 435 440 445 agc gat cac ccg aaa ctg cag ggt att gcc cag cag aac agc ttt aaa 1392Ser Asp His Pro Lys Leu Gln Gly Ile Ala Gln Gln Asn Ser Phe Lys 450 455 460 cac acc tga 1401His Thr 465 6466PRTEscherichia coli 6Met Asp Gln Lys Leu Leu Thr Asp Phe Arg Ser Glu Leu Leu Asp Ser 1 5 10 15 Arg Phe Gly Ala Lys Ala Ile Ser Thr Ile Ala Glu Ser Lys Arg Phe 20 25 30 Pro Leu His Glu Met Arg Asp Asp Val Ala Phe Gln Ile Ile Asn Asp 35 40 45 Glu Leu Tyr Leu Asp Gly Asn Ala Arg Gln Asn Leu Ala Thr Phe Cys 50 55 60 Gln Thr Trp Asp Asp Glu Asn Val His Lys Leu Met Asp Leu Ser Ile 65 70 75 80 Asn Lys Asn Trp Ile Asp Lys Glu Glu Tyr Pro Gln Ser Ala Ala Ile 85 90 95 Asp Leu Arg Cys Val Asn Met Val Ala Asp Leu Trp His Ala Pro Ala 100 105 110 Pro Lys Asn Gly Gln Ala Val Gly Thr Asn Thr Ile Gly Ser Ser Glu 115 120 125 Ala Cys Met Leu Gly Gly Met Ala Met Lys Trp Arg Trp Arg Lys Arg 130 135 140 Met Glu Ala Ala Gly Lys Pro Thr Asp Lys Pro Asn Leu Val Cys Gly 145 150 155 160 Pro Val Gln Ile Cys Trp His Lys Phe Ala Arg Tyr Trp Asp Val Glu 165 170 175 Leu Arg Glu Ile Pro Met Arg Pro Gly Gln Leu Phe Met Asp Pro Lys 180 185 190 Arg Met Ile Glu Ala Cys Asp Glu Asn Thr Ile Gly Val Val Pro Thr 195 200 205 Phe Gly Val Thr Tyr Thr Gly Asn Tyr Glu Phe Pro Gln Pro Leu His 210 215 220 Asp Ala Leu Asp Lys Phe Gln Ala Asp Thr Gly Ile Asp Ile Asp Met 225 230 235 240 His Ile Asp Ala Ala Ser Gly Gly Phe Leu Ala Pro Phe Val Ala Pro 245 250 255 Asp Ile Val Trp Asp Phe Arg Leu Pro Arg Val Lys Ser Ile Ser Ala 260 265 270 Ser Gly His Lys Phe Gly Leu Ala Pro Leu Gly Cys Gly Trp Val Ile 275 280 285 Trp Arg Asp Glu Glu Ala Leu Pro Gln Glu Leu Val Phe Asn Val Asp 290 295 300 Tyr Leu Gly Gly Gln Ile Gly Thr Phe Ala Ile Asn Phe Ser Arg Pro 305 310 315 320 Ala Gly Gln Val Ile Ala Gln Tyr Tyr Glu Phe Leu Arg Leu Gly Arg 325 330 335 Glu Gly Tyr Thr Lys Val Gln Asn Ala Ser Tyr Gln Val Ala Ala Tyr 340 345 350 Leu Ala Asp Glu Ile Ala Lys Leu Gly Pro Tyr Glu Phe 
Ile Cys Thr 355 360 365 Gly Arg Pro Asp Glu Gly Ile Pro Ala Val Cys Phe Lys Leu Lys Asp 370 375 380 Gly Glu Asp Pro Gly Tyr Thr Leu Tyr Asp Leu Ser Glu Arg Leu Arg 385 390 395 400 Leu Arg Gly Trp Gln Val Pro Ala Phe Thr Leu Gly Gly Glu Ala Thr 405 410 415 Asp Ile Val Val Met Arg Ile Met Cys Arg Arg Gly Phe Glu Met Asp 420 425 430 Phe Ala Glu Leu Leu Leu Glu Asp Tyr Lys Ala Ser Leu Lys Tyr Leu 435 440 445 Ser Asp His Pro Lys Leu Gln Gly Ile Ala Gln Gln Asn Ser Phe Lys 450 455 460 His Thr 465 73092DNAEscherichia coliCDS(1)..(1398)gadB 7atg gat aag aag caa gta acg gat tta agg tcg gaa cta ctc gat tca 48Met Asp Lys Lys Gln Val Thr Asp Leu Arg Ser Glu Leu Leu Asp Ser 1 5 10 15 cgt ttt ggt gcg aag tct att tcc act atc gca gaa tca aaa cgt ttt 96Arg Phe Gly Ala Lys Ser Ile Ser Thr Ile Ala Glu Ser Lys Arg Phe 20 25 30 ccg ctg cac gaa atg cgc gac gat gtc gca ttc cag att atc aat gac 144Pro Leu His Glu Met Arg Asp Asp Val Ala Phe Gln Ile Ile Asn Asp 35 40 45 gaa tta tat ctt gat ggc aac gct cgt cag aac ctg gcc act ttc tgc 192Glu Leu Tyr Leu Asp Gly Asn Ala Arg Gln Asn Leu Ala Thr Phe Cys 50 55 60 cag acc tgg gac gac gaa aat gtc cac aaa ttg atg gat tta tcc att 240Gln Thr Trp Asp Asp Glu Asn Val His Lys Leu Met Asp Leu Ser Ile 65 70 75 80 aac aaa aac tgg atc gac aaa gaa gaa tat ccg caa tcc gca gcc atc 288Asn Lys Asn Trp Ile Asp Lys Glu Glu Tyr Pro Gln Ser Ala Ala Ile 85 90 95 gac ctg cgt tgc gta aat atg gtt gcc gat ctg tgg cat gcg cct gcg 336Asp Leu Arg Cys Val Asn Met Val Ala Asp Leu Trp His Ala Pro Ala 100 105 110 ccg aaa aat ggt cag gcc gtt ggc acc aac acc att ggt tct tcc gag 384Pro Lys Asn Gly Gln Ala Val Gly Thr Asn Thr Ile Gly Ser Ser Glu 115 120 125 gcc tgt atg ctc ggc ggg atg gcg atg aaa tgg cgt tgg cgc aag cgt 432Ala Cys Met Leu Gly Gly Met Ala Met Lys Trp Arg Trp Arg Lys Arg 130 135 140 atg gaa gct gca ggc aaa cca acg gat aaa cca aac ctg gtg tgc ggt 480Met Glu Ala Ala Gly Lys Pro Thr Asp Lys Pro Asn Leu Val Cys Gly 145 150 155 160 ccg gta caa atc tgc tgg cat aaa ttc gcc 
cgc tac tgg gat gtg gag 528Pro Val Gln Ile Cys Trp His Lys Phe Ala Arg Tyr Trp Asp Val Glu 165 170 175 ctg cgt gag atc cct atg cgc ccc ggt cag ttg ttt atg gac ccg aaa 576Leu Arg Glu Ile Pro Met Arg Pro Gly Gln Leu Phe Met Asp Pro Lys 180 185 190 cgc atg att gaa gcc tgt gac gaa aac acc atc ggc gtg gtg ccg act 624Arg Met Ile Glu Ala Cys Asp Glu Asn Thr Ile Gly Val Val Pro Thr 195 200 205 ttc ggc gtg acc tac act ggt aac tat gag ttc cca caa ccg ctg cac 672Phe Gly Val Thr Tyr Thr Gly Asn Tyr Glu Phe Pro Gln Pro Leu His 210 215 220 gat gcg ctg gat aaa ttc cag gcc gat acc ggt atc gac atc gac atg 720Asp Ala Leu Asp Lys Phe Gln Ala Asp Thr Gly Ile Asp Ile Asp Met 225 230 235 240 cac atc gac gct gcc agc ggt ggc ttc ctg gca ccg ttc gtc gcc ccg 768His Ile Asp Ala Ala Ser Gly Gly Phe Leu Ala Pro Phe Val Ala Pro 245 250 255 gat atc gtc tgg gac ttc cgc ctg ccg cgt gtg aaa tcg atc agt gct 816Asp Ile Val Trp Asp Phe Arg Leu Pro Arg Val Lys Ser Ile Ser Ala 260 265 270 tca ggc cat aaa ttc ggt ctg gct ccg ctg ggc tgc ggc tgg gtt atc 864Ser Gly His Lys Phe Gly Leu Ala Pro Leu Gly Cys Gly Trp Val Ile 275 280 285 tgg cgt gac gaa gaa gcg ctg ccg cag gaa ctg gtg ttc aac gtt gac 912Trp Arg Asp Glu Glu Ala Leu Pro Gln Glu Leu Val Phe Asn Val Asp 290 295 300 tac ctg ggt ggt caa att ggt act ttt gcc atc aac ttc tcc cgc ccg 960Tyr Leu Gly Gly Gln Ile Gly Thr Phe Ala Ile Asn Phe Ser Arg Pro 305 310 315 320 gcg ggt cag gta att gca cag tac tat gaa ttc ctg cgc ctc ggt cgt 1008Ala Gly Gln Val Ile Ala Gln Tyr Tyr Glu Phe Leu Arg Leu Gly Arg 325 330 335 gaa ggc tat acc aaa gta cag aac gcc tct tac cag gtt gcc gct tat

1056Glu Gly Tyr Thr Lys Val Gln Asn Ala Ser Tyr Gln Val Ala Ala Tyr 340 345 350 ctg gcg gat gaa atc gcc aaa ctg ggg ccg tat gag ttc atc tgt acg 1104Leu Ala Asp Glu Ile Ala Lys Leu Gly Pro Tyr Glu Phe Ile Cys Thr 355 360 365 ggt cgc ccg gac gaa ggc atc ccg gcg gtt tgc ttc aaa ctg aaa gat 1152Gly Arg Pro Asp Glu Gly Ile Pro Ala Val Cys Phe Lys Leu Lys Asp 370 375 380 ggt gaa gat ccg gga tac acc ctg tat gac ctc tct gaa cgt ctg cgt 1200Gly Glu Asp Pro Gly Tyr Thr Leu Tyr Asp Leu Ser Glu Arg Leu Arg 385 390 395 400 ctg cgc ggc tgg cag gtt ccg gcc ttc act ctc ggc ggt gaa gcc acc 1248Leu Arg Gly Trp Gln Val Pro Ala Phe Thr Leu Gly Gly Glu Ala Thr 405 410 415 gac atc gtg gtg atg cgc att atg tgt cgt cgc ggc ttc gaa atg gac 1296Asp Ile Val Val Met Arg Ile Met Cys Arg Arg Gly Phe Glu Met Asp 420 425 430 ttt gct gaa ctg ttg ctg gaa gac tac aaa gcc tcc ctg aaa tat ctc 1344Phe Ala Glu Leu Leu Leu Glu Asp Tyr Lys Ala Ser Leu Lys Tyr Leu 435 440 445 agc gat cac ccg aaa ctg cag ggt att gcc caa cag aac agc ttt aaa 1392Ser Asp His Pro Lys Leu Gln Gly Ile Ala Gln Gln Asn Ser Phe Lys 450 455 460 cat acc tgataacgtt taacggtaac ggtgtcccga aacgaacccg tttcgggaca 1448His Thr 465 atttccaaag tctgttcact ggcattagca acggaaaata ttgttctgaa tacgcttcag 1508aacaaaacag gtgcggttcc gacaggaata ccgttttagg gggataat atg gct aca 1565 Met Ala Thr tca gta cag aca ggt aaa gct aag cag ctc aca tta ctt gga ttc ttt 1613Ser Val Gln Thr Gly Lys Ala Lys Gln Leu Thr Leu Leu Gly Phe Phe 470 475 480 485 gcc ata acg gca tcg atg gta atg gct gtt tat gaa tac cct acc ttc 1661Ala Ile Thr Ala Ser Met Val Met Ala Val Tyr Glu Tyr Pro Thr Phe 490 495 500 gca aca tcg ggc ttt tca tta gtc ttc ttc ctg cta tta ggc ggg att 1709Ala Thr Ser Gly Phe Ser Leu Val Phe Phe Leu Leu Leu Gly Gly Ile 505 510 515 tta tgg ttt att ccc gtg gga ctt tgt gct gcg gaa atg gcc acc gtc 1757Leu Trp Phe Ile Pro Val Gly Leu Cys Ala Ala Glu Met Ala Thr Val 520 525 530 gac ggc tgg gaa gaa ggt ggt gtc ttc gcc tgg gta tca aat act ctg 1805Asp Gly Trp Glu Glu Gly Gly 
Val Phe Ala Trp Val Ser Asn Thr Leu 535 540 545 ggg ccg aga tgg gga ttt gca gcg atc tca ttt ggc tat ctg caa atc 1853Gly Pro Arg Trp Gly Phe Ala Ala Ile Ser Phe Gly Tyr Leu Gln Ile 550 555 560 565 gcc att ggt ttt att ccg atg ctc tat ttc gtg tta ggg gca ctc tcc 1901Ala Ile Gly Phe Ile Pro Met Leu Tyr Phe Val Leu Gly Ala Leu Ser 570 575 580 tac atc ctg aaa tgg cca gcg ctg aat gaa gac ccc att acc aaa act 1949Tyr Ile Leu Lys Trp Pro Ala Leu Asn Glu Asp Pro Ile Thr Lys Thr 585 590 595 att gca gca ctc atc att ctt tgg gcg ctg gca tta acg cag ttt ggt 1997Ile Ala Ala Leu Ile Ile Leu Trp Ala Leu Ala Leu Thr Gln Phe Gly 600 605 610 ggc acg aaa tac acg gcg cga att gct aaa gtt ggc ttc ttc gcc ggt 2045Gly Thr Lys Tyr Thr Ala Arg Ile Ala Lys Val Gly Phe Phe Ala Gly 615 620 625 atc ctg tta cct gca ttt att ttg atc gca tta gcg gct att tat ctg 2093Ile Leu Leu Pro Ala Phe Ile Leu Ile Ala Leu Ala Ala Ile Tyr Leu 630 635 640 645 cac tcc ggt gcc ccc gtt gct atc gaa atg gat tcg aag acc ttc ttc 2141His Ser Gly Ala Pro Val Ala Ile Glu Met Asp Ser Lys Thr Phe Phe 650 655 660 cct gac ttc tct aaa gtg ggc acc ctg gta gta ttt gtt gcc ttc att 2189Pro Asp Phe Ser Lys Val Gly Thr Leu Val Val Phe Val Ala Phe Ile 665 670 675 ttg agt tat atg ggc gta gaa gca tcc gca acc cac gtc aat gaa atg 2237Leu Ser Tyr Met Gly Val Glu Ala Ser Ala Thr His Val Asn Glu Met 680 685 690 agc aac cca ggg cgc gac tat ccg ttg gct atg tta ctg ctg atg gtg 2285Ser Asn Pro Gly Arg Asp Tyr Pro Leu Ala Met Leu Leu Leu Met Val 695 700 705 gcg gca atc tgc tta agc tct gtt ggt ggt ttg tct att gcg atg gtc 2333Ala Ala Ile Cys Leu Ser Ser Val Gly Gly Leu Ser Ile Ala Met Val 710 715 720 725 att ccg ggt aat gaa atc aac ctc tcc gca ggg gta atg caa acc ttt 2381Ile Pro Gly Asn Glu Ile Asn Leu Ser Ala Gly Val Met Gln Thr Phe 730 735 740 acc gtt ctg atg tcc cat gtg gca cca gaa att gag tgg acg gtt cgc 2429Thr Val Leu Met Ser His Val Ala Pro Glu Ile Glu Trp Thr Val Arg 745 750 755 gtg atc tcc gca ctg ctg ttg ctg ggt gtt ctg gcg gaa atc gcc tcc 
2477Val Ile Ser Ala Leu Leu Leu Leu Gly Val Leu Ala Glu Ile Ala Ser 760 765 770 tgg att gtt ggt cct tct cgc ggg atg tat gta aca gcg cag aaa aac 2525Trp Ile Val Gly Pro Ser Arg Gly Met Tyr Val Thr Ala Gln Lys Asn 775 780 785 ctg ctg cca gcg gca ttc gct aaa atg aac aaa aat ggc gta ccg gta 2573Leu Leu Pro Ala Ala Phe Ala Lys Met Asn Lys Asn Gly Val Pro Val 790 795 800 805 acg ctg gtc att tcg cag ctg gtg att acg tct atc gcg ttg atc atc 2621Thr Leu Val Ile Ser Gln Leu Val Ile Thr Ser Ile Ala Leu Ile Ile 810 815 820 ctc acc aat acc ggt ggc ggt aac aac atg tcc ttc ctg atc gca ctg 2669Leu Thr Asn Thr Gly Gly Gly Asn Asn Met Ser Phe Leu Ile Ala Leu 825 830 835 gcg ctg acg gtg gtg att tat ctg tgt gct tat ttc atg ctg ttt att 2717Ala Leu Thr Val Val Ile Tyr Leu Cys Ala Tyr Phe Met Leu Phe Ile 840 845 850 ggc tac att gtg ttg gtt ctt aaa cat cct gac tta aaa cgc aca ttt 2765Gly Tyr Ile Val Leu Val Leu Lys His Pro Asp Leu Lys Arg Thr Phe 855 860 865 aat atc cct ggt ggt aaa ggg gtg aaa ctg gtc gtg gca att gtc ggt 2813Asn Ile Pro Gly Gly Lys Gly Val Lys Leu Val Val Ala Ile Val Gly 870 875 880 885 ctg ctg act tca att atg gcg ttt att gtt tcc ttc ctg ccg ccg gat 2861Leu Leu Thr Ser Ile Met Ala Phe Ile Val Ser Phe Leu Pro Pro Asp 890 895 900 aac atc cag ggt gat tct acc gat atg tat gtt gaa tta ctg gtt gtt 2909Asn Ile Gln Gly Asp Ser Thr Asp Met Tyr Val Glu Leu Leu Val Val 905 910 915 agt ttc ctg gtg gta ctt gcc ctg ccc ttt att ctc tat gct gtt cat 2957Ser Phe Leu Val Val Leu Ala Leu Pro Phe Ile Leu Tyr Ala Val His 920 925 930 gat cgt aaa ggc aaa gca aat acc ggc gtc act ctg gag cca atc aac 3005Asp Arg Lys Gly Lys Ala Asn Thr Gly Val Thr Leu Glu Pro Ile Asn 935 940 945 agt cag aac gca cca aaa ggt cac ttc ttc ctg cac ccg cgt gca cgt 3053Ser Gln Asn Ala Pro Lys Gly His Phe Phe Leu His Pro Arg Ala Arg 950 955 960 965 tca cca cac tat att gtg atg aat gac aag aaa cac taa 3092Ser Pro His Tyr Ile Val Met Asn Asp Lys Lys His 970 975 8466PRTEscherichia coli 8Met Asp Lys Lys Gln Val Thr Asp Leu Arg 
Ser Glu Leu Leu Asp Ser 1 5 10 15 Arg Phe Gly Ala Lys Ser Ile Ser Thr Ile Ala Glu Ser Lys Arg Phe 20 25 30 Pro Leu His Glu Met Arg Asp Asp Val Ala Phe Gln Ile Ile Asn Asp 35 40 45 Glu Leu Tyr Leu Asp Gly Asn Ala Arg Gln Asn Leu Ala Thr Phe Cys 50 55 60 Gln Thr Trp Asp Asp Glu Asn Val His Lys Leu Met Asp Leu Ser Ile 65 70 75 80 Asn Lys Asn Trp Ile Asp Lys Glu Glu Tyr Pro Gln Ser Ala Ala Ile 85 90 95 Asp Leu Arg Cys Val Asn Met Val Ala Asp Leu Trp His Ala Pro Ala 100 105 110 Pro Lys Asn Gly Gln Ala Val Gly Thr Asn Thr Ile Gly Ser Ser Glu 115 120 125 Ala Cys Met Leu Gly Gly Met Ala Met Lys Trp Arg Trp Arg Lys Arg 130 135 140 Met Glu Ala Ala Gly Lys Pro Thr Asp Lys Pro Asn Leu Val Cys Gly 145 150 155 160 Pro Val Gln Ile Cys Trp His Lys Phe Ala Arg Tyr Trp Asp Val Glu 165 170 175 Leu Arg Glu Ile Pro Met Arg Pro Gly Gln Leu Phe Met Asp Pro Lys 180 185 190 Arg Met Ile Glu Ala Cys Asp Glu Asn Thr Ile Gly Val Val Pro Thr 195 200 205 Phe Gly Val Thr Tyr Thr Gly Asn Tyr Glu Phe Pro Gln Pro Leu His 210 215 220 Asp Ala Leu Asp Lys Phe Gln Ala Asp Thr Gly Ile Asp Ile Asp Met 225 230 235 240 His Ile Asp Ala Ala Ser Gly Gly Phe Leu Ala Pro Phe Val Ala Pro 245 250 255 Asp Ile Val Trp Asp Phe Arg Leu Pro Arg Val Lys Ser Ile Ser Ala 260 265 270 Ser Gly His Lys Phe Gly Leu Ala Pro Leu Gly Cys Gly Trp Val Ile 275 280 285 Trp Arg Asp Glu Glu Ala Leu Pro Gln Glu Leu Val Phe Asn Val Asp 290 295 300 Tyr Leu Gly Gly Gln Ile Gly Thr Phe Ala Ile Asn Phe Ser Arg Pro 305 310 315 320 Ala Gly Gln Val Ile Ala Gln Tyr Tyr Glu Phe Leu Arg Leu Gly Arg 325 330 335 Glu Gly Tyr Thr Lys Val Gln Asn Ala Ser Tyr Gln Val Ala Ala Tyr 340 345 350 Leu Ala Asp Glu Ile Ala Lys Leu Gly Pro Tyr Glu Phe Ile Cys Thr 355 360 365 Gly Arg Pro Asp Glu Gly Ile Pro Ala Val Cys Phe Lys Leu Lys Asp 370 375 380 Gly Glu Asp Pro Gly Tyr Thr Leu Tyr Asp Leu Ser Glu Arg Leu Arg 385 390 395 400 Leu Arg Gly Trp Gln Val Pro Ala Phe Thr Leu Gly Gly Glu Ala Thr 405 410 415 Asp Ile Val Val Met Arg Ile Met Cys Arg Arg Gly Phe Glu Met 
Asp 420 425 430 Phe Ala Glu Leu Leu Leu Glu Asp Tyr Lys Ala Ser Leu Lys Tyr Leu 435 440 445 Ser Asp His Pro Lys Leu Gln Gly Ile Ala Gln Gln Asn Ser Phe Lys 450 455 460 His Thr 465 9511PRTEscherichia coli 9Met Ala Thr Ser Val Gln Thr Gly Lys Ala Lys Gln Leu Thr Leu Leu 1 5 10 15 Gly Phe Phe Ala Ile Thr Ala Ser Met Val Met Ala Val Tyr Glu Tyr 20 25 30 Pro Thr Phe Ala Thr Ser Gly Phe Ser Leu Val Phe Phe Leu Leu Leu 35 40 45 Gly Gly Ile Leu Trp Phe Ile Pro Val Gly Leu Cys Ala Ala Glu Met 50 55 60 Ala Thr Val Asp Gly Trp Glu Glu Gly Gly Val Phe Ala Trp Val Ser 65 70 75 80 Asn Thr Leu Gly Pro Arg Trp Gly Phe Ala Ala Ile Ser Phe Gly Tyr 85 90 95 Leu Gln Ile Ala Ile Gly Phe Ile Pro Met Leu Tyr Phe Val Leu Gly 100 105 110 Ala Leu Ser Tyr Ile Leu Lys Trp Pro Ala Leu Asn Glu Asp Pro Ile 115 120 125 Thr Lys Thr Ile Ala Ala Leu Ile Ile Leu Trp Ala Leu Ala Leu Thr 130 135 140 Gln Phe Gly Gly Thr Lys Tyr Thr Ala Arg Ile Ala Lys Val Gly Phe 145 150 155 160 Phe Ala Gly Ile Leu Leu Pro Ala Phe Ile Leu Ile Ala Leu Ala Ala 165 170 175 Ile Tyr Leu His Ser Gly Ala Pro Val Ala Ile Glu Met Asp Ser Lys 180 185 190 Thr Phe Phe Pro Asp Phe Ser Lys Val Gly Thr Leu Val Val Phe Val 195 200 205 Ala Phe Ile Leu Ser Tyr Met Gly Val Glu Ala Ser Ala Thr His Val 210 215 220 Asn Glu Met Ser Asn Pro Gly Arg Asp Tyr Pro Leu Ala Met Leu Leu 225 230 235 240 Leu Met Val Ala Ala Ile Cys Leu Ser Ser Val Gly Gly Leu Ser Ile 245 250 255 Ala Met Val Ile Pro Gly Asn Glu Ile Asn Leu Ser Ala Gly Val Met 260 265 270 Gln Thr Phe Thr Val Leu Met Ser His Val Ala Pro Glu Ile Glu Trp 275 280 285 Thr Val Arg Val Ile Ser Ala Leu Leu Leu Leu Gly Val Leu Ala Glu 290 295 300 Ile Ala Ser Trp Ile Val Gly Pro Ser Arg Gly Met Tyr Val Thr Ala 305 310 315 320 Gln Lys Asn Leu Leu Pro Ala Ala Phe Ala Lys Met Asn Lys Asn Gly 325 330 335 Val Pro Val Thr Leu Val Ile Ser Gln Leu Val Ile Thr Ser Ile Ala 340 345 350 Leu Ile Ile Leu Thr Asn Thr Gly Gly Gly Asn Asn Met Ser Phe Leu 355 360 365 Ile Ala Leu Ala Leu Thr Val Val Ile Tyr Leu Cys 
Ala Tyr Phe Met 370 375 380 Leu Phe Ile Gly Tyr Ile Val Leu Val Leu Lys His Pro Asp Leu Lys 385 390 395 400 Arg Thr Phe Asn Ile Pro Gly Gly Lys Gly Val Lys Leu Val Val Ala 405 410 415 Ile Val Gly Leu Leu Thr Ser Ile Met Ala Phe Ile Val Ser Phe Leu 420 425 430 Pro Pro Asp Asn Ile Gln Gly Asp Ser Thr Asp Met Tyr Val Glu Leu 435 440 445 Leu Val Val Ser Phe Leu Val Val Leu Ala Leu Pro Phe Ile Leu Tyr 450 455 460 Ala Val His Asp Arg Lys Gly Lys Ala Asn Thr Gly Val Thr Leu Glu 465 470 475 480 Pro Ile Asn Ser Gln Asn Ala Pro Lys Gly His Phe Phe Leu His Pro 485 490 495 Arg Ala Arg Ser Pro His Tyr Ile Val Met Asn Asp Lys Lys His 500 505 510

SEQ ID NO: 10; length: 42; type: DNA; organism: Artificial sequence; other information: PCR Primer
ccgctcgagc ggcccaagct tcggtaaata cttataccgg ag 42

SEQ ID NO: 11; length: 41; type: DNA; organism: Artificial sequence; other information: PCR Primer
ctagtctaga ctagcccaag cttgtcgatc atcgcctgtt g 41

SEQ ID NO: 12; length: 43; type: DNA; organism: Artificial sequence; other information: PCR Primer
ccgctcgagc ggcccaagct tcgtgataaa ttgcgtcaga aag 43

SEQ ID NO: 13; length: 43; type: DNA; organism: Artificial sequence; other information: PCR Primer
ctagactagt ctagcccaag cttctcgaat ttggcttgca tcc 43

SEQ ID NO: 14; length: 5091; type: DNA; organism: Artificial sequence; other information: Plasmide
tcgatttaaa tctcgagagg cctgacgtcg ggcccggtac cacgcgtcat atgactagtt 60cggacctagg gatatcgtcg acatcgatgc tcttctgcgt taattaacaa ttgggatcct 120ctagacccgg gatttaaatc gctagcgggc tgctaaagga agcggaacac gtagaaagcc 180agtccgcaga aacggtgctg accccggatg aatgtcagct actgggctat ctggacaagg 240gaaaacgcaa gcgcaaagag aaagcaggta gcttgcagtg ggcttacatg gcgatagcta 300gactgggcgg ttttatggac agcaagcgaa ccggaattgc cagctggggc gccctctggt 360aaggttggga agccctgcaa agtaaactgg atggctttct tgccgccaag gatctgatgg 420cgcaggggat caagatctga tcaagagaca ggatgaggat cgtttcgcat gattgaacaa 480gatggattgc acgcaggttc tccggccgct tgggtggaga ggctattcgg ctatgactgg 540gcacaacaga caatcggctg ctctgatgcc gccgtgttcc ggctgtcagc gcaggggcgc 600ccggttcttt ttgtcaagac cgacctgtcc ggtgccctga atgaactgca ggacgaggca 660gcgcggctat cgtggctggc cacgacgggc gttccttgcg cagctgtgct cgacgttgtc 720actgaagcgg gaagggactg gctgctattg ggcgaagtgc cggggcagga tctcctgtca 780tctcaccttg

ctcctgccga gaaagtatcc atcatggctg atgcaatgcg gcggctgcat 840acgcttgatc cggctacctg cccattcgac caccaagcga aacatcgcat cgagcgagca 900cgtactcgga tggaagccgg tcttgtcgat caggatgatc tggacgaaga gcatcagggg 960ctcgcgccag ccgaactgtt cgccaggctc aaggcgcgca tgcccgacgg cgaggatctc 1020gtcgtgaccc atggcgatgc ctgcttgccg aatatcatgg tggaaaatgg ccgcttttct 1080ggattcatcg actgtggccg gctgggtgtg gcggaccgct atcaggacat agcgttggct 1140acccgtgata ttgctgaaga gcttggcggc gaatgggctg accgcttcct cgtgctttac 1200ggtatcgccg ctcccgattc gcagcgcatc gccttctatc gccttcttga cgagttcttc 1260tgagcgggac tctggggttc gaaatgaccg accaagcgac gcccaacctg ccatcacgag 1320atttcgattc caccgccgcc ttctatgaaa ggttgggctt cggaatcgtt ttccgggacg 1380ccggctggat gatcctccag cgcggggatc tcatgctgga gttcttcgcc cacgctagcg 1440gcgcgccggc cggcccggtg tgaaataccg cacagatgcg taaggagaaa ataccgcatc 1500aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 1560gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca 1620ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 1680ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 1740cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 1800ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 1860tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 1920gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 1980tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 2040gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 2100tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag 2160ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 2220agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa 2280gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 2340attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa ggccggccgc 2400ggccgcgcaa agtcccgctt cgtgaaaatt ttcgtgccgc gtgattttcc gccaaaaact 2460ttaacgaacg ttcgttataa tggtgtcatg accttcacga 
cgaagtacta aaattggccc 2520gaatcatcag ctatggatct ctctgatgtc gcgctggagt ccgacgcgct cgatgctgcc 2580gtcgatttaa aaacggtgat cggatttttc cgagctctcg atacgacgga cgcgccagca 2640tcacgagact gggccagtgc cgcgagcgac ctagaaactc tcgtggcgga tcttgaggag 2700ctggctgacg agctgcgtgc tcggccagcg ccaggaggac gcacagtagt ggaggatgca 2760atcagttgcg cctactgcgg tggcctgatt cctccccggc ctgacccgcg aggacggcgc 2820gcaaaatatt gctcagatgc gtgtcgtgcc gcagccagcc gcgagcgcgc caacaaacgc 2880cacgccgagg agctggaggc ggctaggtcg caaatggcgc tggaagtgcg tcccccgagc 2940gaaattttgg ccatggtcgt cacagagctg gaagcggcag cgagaattat cgcgatcgtg 3000gcggtgcccg caggcatgac aaacatcgta aatgccgcgt ttcgtgtgcc gtggccgccc 3060aggacgtgtc agcgccgcca ccacctgcac cgaatcggca gcagcgtcgc gcgtcgaaaa 3120agcgcacagg cggcaagaag cgataagctg cacgaatacc tgaaaaatgt tgaacgcccc 3180gtgagcggta actcacaggg cgtcggctaa cccccagtcc aaacctggga gaaagcgctc 3240aaaaatgact ctagcggatt cacgagacat tgacacaccg gcctggaaat tttccgctga 3300tctgttcgac acccatcccg agctcgcgct gcgatcacgt ggctggacga gcgaagaccg 3360ccgcgaattc ctcgctcacc tgggcagaga aaatttccag ggcagcaaga cccgcgactt 3420cgccagcgct tggatcaaag acccggacac ggagaaacac agccgaagtt ataccgagtt 3480ggttcaaaat cgcttgcccg gtgccagtat gttgctctga cgcacgcgca gcacgcagcc 3540gtgcttgtcc tggacattga tgtgccgagc caccaggccg gcgggaaaat cgagcacgta 3600aaccccgagg tctacgcgat tttggagcgc tgggcacgcc tggaaaaagc gccagcttgg 3660atcggcgtga atccactgag cgggaaatgc cagctcatct ggctcattga tccggtgtat 3720gccgcagcag gcatgagcag cccgaatatg cgcctgctgg ctgcaacgac cgaggaaatg 3780acccgcgttt tcggcgctga ccaggctttt tcacataggc tgagccgtgg ccactgcact 3840ctccgacgat cccagccgta ccgctggcat gcccagcaca atcgcgtgga tcgcctagct 3900gatcttatgg aggttgctcg catgatctca ggcacagaaa aacctaaaaa acgctatgag 3960caggagtttt ctagcggacg ggcacgtatc gaagcggcaa gaaaagccac tgcggaagca 4020aaagcacttg ccacgcttga agcaagcctg ccgagcgccg ctgaagcgtc tggagagctg 4080atcgacggcg tccgtgtcct ctggactgct ccagggcgtg ccgcccgtga tgagacggct 4140tttcgccacg ctttgactgt gggataccag ttaaaagcgg ctggtgagcg cctaaaagac 4200accaagggtc 
atcgagccta cgagcgtgcc tacaccgtcg ctcaggcggt cggaggaggc 4260cgtgagcctg atctgccgcc ggactgtgac cgccagacgg attggccgcg acgtgtgcgc 4320ggctacgtcg ctaaaggcca gccagtcgtc cctgctcgtc agacagagac gcagagccag 4380ccgaggcgaa aagctctggc cactatggga agacgtggcg gtaaaaaggc cgcagaacgc 4440tggaaagacc caaacagtga gtacgcccga gcacagcgag aaaaactagc taagtccagt 4500caacgacaag ctaggaaagc taaaggaaat cgcttgacca ttgcaggttg gtttatgact 4560gttgagggag agactggctc gtggccgaca atcaatgaag ctatgtctga atttagcgtg 4620tcacgtcaga ccgtgaatag agcacttaag gtctgcgggc attgaacttc cacgaggacg 4680ccgaaagctt cccagtaaat gtgccatctc gtaggcagaa aacggttccc ccgtagggtc 4740tctctcttgg cctcctttct aggtcgggct gattgctctt gaagctctct aggggggctc 4800acaccatagg cagataacgt tccccaccgg ctcgcctcgt aagcgcacaa ggactgctcc 4860caaagatctt caaagccact gccgcgactg ccttcgcgaa gccttgcccc gcggaaattt 4920cctccaccga gttcgtgcac acccctatgc caagcttctt tcaccctaaa ttcgagagat 4980tggattctta ccgtggaaat tcttcgcaaa aatcgtcccc tgatcgccct tgcgacgttg 5040gcgtcggtgc cgctggttgc gcttggcttg accgacttga tcagcggccg c 5091
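
As a rough consistency check on the PCR primer entries of this listing (SEQ ID NOs 10 to 13), the following Python sketch compares each primer's declared length with the number of bases actually listed and reports which common restriction-enzyme recognition sequences occur in it. The primer sequences and lengths are copied from the listing. The enzyme table (XhoI, HindIII, XbaI, SpeI) uses the standard recognition sequences and is an addition made for illustration; this listing does not name any enzyme, so whether these sites were used for cloning is an assumption. All function names are hypothetical.

```python
# Illustrative check of the primer entries (SEQ ID NOs 10 to 13).
# Sequences and declared lengths are copied from the sequence listing.
PRIMERS = {
    10: ("ccgctcgagc ggcccaagct tcggtaaata cttataccgg ag", 42),
    11: ("ctagtctaga ctagcccaag cttgtcgatc atcgcctgtt g", 41),
    12: ("ccgctcgagc ggcccaagct tcgtgataaa ttgcgtcaga aag", 43),
    13: ("ctagactagt ctagcccaag cttctcgaat ttggcttgca tcc", 43),
}

# Assumption: standard recognition sequences, not taken from the listing.
SITES = {
    "XhoI": "ctcgag",
    "HindIII": "aagctt",
    "XbaI": "tctaga",
    "SpeI": "actagt",
}


def normalize(seq: str) -> str:
    """Remove the 10-base grouping spaces and lowercase the sequence."""
    return "".join(seq.split()).lower()


def check_primer(seq: str, declared_length: int) -> tuple[bool, list[str]]:
    """Return (length matches declaration, sorted names of sites found)."""
    bases = normalize(seq)
    found = sorted(name for name, site in SITES.items() if site in bases)
    return len(bases) == declared_length, found


def report() -> dict[int, tuple[bool, list[str]]]:
    """Run the check for every primer, keyed by SEQ ID NO."""
    return {seq_id: check_primer(seq, n) for seq_id, (seq, n) in PRIMERS.items()}


if __name__ == "__main__":
    for seq_id, (length_ok, sites) in report().items():
        status = "length ok" if length_ok else "length mismatch"
        print(f"SEQ ID NO {seq_id}: {status}; sites found: {', '.join(sites)}")
```

The same `normalize` step could be applied to the longer entries, provided the running position numbers fused into the flattened text are stripped first.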
