Acetyl-CoA Producing Enzymes in Yeast

Muller; Ulrike Maria ;   et al.

Patent Application Summary

U.S. patent application number 12/670050 was filed with the patent office on 2010-09-30 for acetyl-CoA producing enzymes in yeast. This patent application is currently assigned to DSM IP ASSETS B.V. Invention is credited to Ulrike Maria Muller, Lourina Madeleine Raamsdonk, Aaron Adriaan Winkler, Liang Wu.

Application Number: 20100248233 12/670050
Document ID: /
Family ID: 40281027
Filed Date: 2010-09-30

United States Patent Application 20100248233
Kind Code A1
Muller; Ulrike Maria ;   et al. September 30, 2010

ACETYL-COA PRODUCING ENZYMES IN YEAST

Abstract

The present invention relates to a method of identifying a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell comprising: a) providing a mutated yeast cell comprising a deletion of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS); b) transforming said mutated yeast cell with an expression vector comprising a heterologous nucleotide sequence encoding a candidate polypeptide having potential enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA; c) testing said recombinant mutated yeast cell for its ability to grow on minimal medium containing glucose as sole carbon source, and d) identifying said candidate polypeptide as a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell when growth of said cell is observed. The invention further relates to a method of producing a fermentation product such as butanol.


Inventors: Muller; Ulrike Maria; (Linnich, DE) ; Wu; Liang; (Delft, NL) ; Raamsdonk; Lourina Madeleine; (Nootdorp, NL) ; Winkler; Aaron Adriaan; (Den Haag, NL)
Correspondence Address:
    NIXON & VANDERHYE, PC
    901 NORTH GLEBE ROAD, 11TH FLOOR
    ARLINGTON
    VA
    22203
    US
Assignee: DSM IP ASSETS B.V.
Heerlen
NL

Family ID: 40281027
Appl. No.: 12/670050
Filed: July 11, 2008
PCT Filed: July 11, 2008
PCT NO: PCT/EP08/59119
371 Date: May 7, 2010

Related U.S. Patent Documents

Application Number Filing Date Patent Number
60935031 Jul 23, 2007
61064120 Feb 19, 2008

Current U.S. Class: 435/6.13 ; 435/160; 435/254.2; 435/254.21; 435/320.1; 435/6.15
Current CPC Class: C07K 14/33 20130101; C12N 9/001 20130101; C12Q 1/32 20130101; Y02E 50/10 20130101; C12P 7/16 20130101
Class at Publication: 435/6 ; 435/320.1; 435/254.2; 435/254.21; 435/160
International Class: C12Q 1/68 20060101 C12Q001/68; C12N 15/63 20060101 C12N015/63; C12N 1/19 20060101 C12N001/19; C12P 7/16 20060101 C12P007/16

Foreign Application Data

Date Code Application Number
Jul 23, 2007 EP 07112956.3
Dec 21, 2007 EP 07123976.8
Feb 19, 2008 EP 08101747.7

Claims



1. A method of identifying a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell comprising: providing a mutated yeast cell, wherein said mutation comprises an inactivation of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS); transforming said mutated yeast cell with an expression vector comprising at least one heterologous nucleotide sequence operably linked to a promoter functional in yeast and said heterologous nucleotide sequence encoding a candidate polypeptide having potential enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl CoA; testing said recombinant mutated yeast cell for its ability to grow on minimal medium containing glucose as sole carbon source, and identifying said candidate polypeptide as a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl CoA in (the cytosol of) said yeast cell when growth of said cell is observed.

2. Method according to claim 1, wherein said yeast cell is a cell of Saccharomyces cerevisiae and wherein said heterologous nucleotide sequence is codon pair optimized for expression in Saccharomyces cerevisiae.

3. Method according to claim 2, wherein said mutation comprises an inactivation of the gene for acetyl-CoA synthetase isoform 2 (acs2).

4. Method according to claim 1, wherein said candidate polypeptide having enzymatic activity for converting acetaldehyde into acetyl-CoA is a (putative) acetylating acetaldehyde dehydrogenase (acdh).

5. A vector for the expression of heterologous polypeptides in yeast, said vector comprising a heterologous nucleotide sequence operably linked to a promoter functional in yeast and said heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell.

6. Vector according to claim 5, wherein said polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA.

7. Vector according to claim 5, wherein said polypeptide has more than 50%, preferably more than 60%, 70%, 80%, 90%, or 95% sequence identity with the amino acid sequence selected from SEQ ID NO: 19, 22, 25, 28 and 52.

8. Vector according to claim 5 for expression in Saccharomyces cerevisiae, wherein said heterologous nucleotide sequence is codon pair optimized for expression in Saccharomyces cerevisiae.

9. Expression vector according to claim 8, wherein said heterologous nucleotide sequence is selected from SEQ ID NO: 20, 23, 26, 29 and 51.

10. A recombinant yeast cell comprising a vector of claim 5.

11. A recombinant yeast cell comprising a heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell.

12. Yeast cell according to claim 10, further comprising an inactivation of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS).

13. Yeast cell according to claim 10, wherein the yeast cell comprises an inactivation of a gene encoding an acetyl-CoA synthase.

14. Yeast cell according to claim 10, wherein said cell shows growth on minimal medium containing glucose as sole carbon source.

15. Yeast cell according to claim 10, further comprising an inactivation of a gene encoding an enzyme that catalyses the conversion of acetaldehyde into ethanol, preferably an alcohol dehydrogenase.

16. Yeast cell according to claim 10, further comprising one or more introduced genes encoding a recombinant pathway for the formation of 1-butanol from acetyl-CoA.

17. Yeast cell according to claim 16, wherein said one or more introduced genes encode enzymes that produce acetoacetyl-CoA, 3-hydroxybutyryl-CoA, crotonyl-CoA, butyryl-CoA, butyraldehyde and/or 1-butanol.

18. Yeast cell according to claim 10, wherein said yeast is Saccharomyces cerevisiae.

19. A method of producing a fermentation product, comprising the steps of fermenting a suitable carbon substrate with a yeast cell according to claim 10 and recovering the fermentation product produced during said fermentation.

20. Method according to claim 19, wherein the fermentation product is butanol.

Description



FIELD OF THE INVENTION

[0001] The present invention is in the field of metabolite production in yeast using heterologous expression systems. In particular, the present invention relates to the metabolic engineering of yeast strains capable of producing metabolites that require cytosolic acetyl-CoA as a precursor, such as butanol-producing yeast strains. The present invention relates to an assay system for identifying heterologous enzymes capable of converting pyruvate, acetaldehyde or acetate into cytosolic acetyl-CoA when expressed in the cytosol in yeast.

BACKGROUND OF THE INVENTION

[0002] Acetyl-coenzyme A (CoA) is an essential intermediate in numerous metabolic pathways, and is a key precursor in the synthesis of many industrially relevant compounds, such as fatty acids, carotenoids, isoprenoids, vitamins, amino acids, lipids, wax esters, (poly)saccharides, polyhydroxyalkanoates, statins, polyketides and acetic esters (such as ethyl acetate and isoamyl acetate). In particular, acetyl-CoA is also the precursor of the industrially important bulk chemical 1-butanol.

[0003] Compared to bacteria, such as E. coli, yeast cells provide a very suitable alternative to produce the above-mentioned acetyl-CoA derived products, in that yeast is not susceptible to phage or other infection since yeast-based processes may be run at low pH. Therefore, the use of yeast does not require a sterile process, thereby lowering the cost price of the product of interest.

[0004] When natural (wild type) yeast is not able to produce the acetyl-CoA-derived product of interest, the use of metabolic engineering can provide for yeast cells expressing heterologous genes that could support such a process. In such cases, the heterologous gene products are usually targeted to the cytosolic compartment of yeast. As the biosynthesis of the acetyl-CoA-derived product will take place completely or partially in the cytosol, the supply of sufficient amounts of the precursor acetyl-CoA in the cytosolic compartment is crucial. In Saccharomyces cerevisiae, biosynthesis of acetyl-CoA takes place in two separate compartments. In mitochondria, acetyl-CoA is synthesized by oxidative decarboxylation of pyruvate catalyzed by the pyruvate dehydrogenase complex (PDH), with the following overall reaction stoichiometry:

Pyruvate (Pyr) + CoA + NAD⁺ = acetyl-CoA + CO₂ + NADH + H⁺

[0005] In cytosol, acetyl-CoA is synthesized via the pyruvate dehydrogenase (PDH) by-pass, involving the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS), with the following overall reaction stoichiometry:

Pyr + CoA + ATP + NAD(P)⁺ = acetyl-CoA + CO₂ + NAD(P)H + AMP + PPi + H⁺.

[0006] A pyruvate-decarboxylase-negative (Pdc-) mutant of the yeast S. cerevisiae does not have a functional PDH by-pass, and cannot grow on minimal medium with glucose as the sole carbon source due to its inability to supply (sufficient) cytosolic acetyl-CoA for growth (Flikweert et al., (1996) Yeast 12:247-57). The PDH by-pass is therefore essential in providing acetyl-CoA in the cytosolic compartment. However, the PDH by-pass in yeast is not optimal with respect to the energy balance, as can be seen from the overall reaction stoichiometry: 2 moles of ATP are needed per mole of acetyl-CoA synthesized via the PDH by-pass, since in the acetyl-CoA synthetase reaction ATP is hydrolyzed to AMP. In contrast, the mitochondrial pathway via the PDH requires no ATP. The additional ATP requirement of the PDH by-pass can present a problem for synthesizing the product of interest from the cytosolic acetyl-CoA precursor, as more carbon source needs to be diverted for ATP generation, via e.g. oxidative phosphorylation and/or substrate phosphorylation (e.g. glycolysis), thereby lowering the overall yield of the product on carbon.

[0007] When yeast is metabolically engineered to produce 1-butanol, heterologous biosynthetic genes of 1-butanol can be expressed in the cytosol in yeast cells (WO 2007/041269). In general, 1 mole of glucose gives rise to 2 moles of acetyl-CoA via glycolysis, which is the precursor of 1 mole of butanol; hence a maximum of 1 mole of butanol can be synthesized per mole of glucose if cell growth and maintenance are not considered. However, when the PDH by-pass is used in combination with butanol biosynthesis, this maximal theoretical yield cannot be achieved due to energy imbalance: whereas 2 moles of ATP are generated per mole of glucose converted in glycolysis, a total of 4 moles (2 times 2 moles) of ATP are needed in the PDH by-pass to form 2 moles of acetyl-CoA, which are converted to 1 mole of butanol. Thus, there is a net shortage of ATP if the PDH by-pass were used to synthesize 1 mole of 1-butanol from 1 mole of glucose.

[0008] Thus, there is a need for the identification of possible alternative metabolic routes for producing cytosolic acetyl-CoA in yeast, for the production of acetyl-CoA-derived products, in particular butanol, wherein the PDH by-pass is not required.

[0009] Butanol is an important industrial chemical and is suitable as an alternative engine fuel having improved properties over ethanol. Butanol also finds use as a solvent for a wide variety of chemical and textile processes, in the organic synthesis of plastics, as a chemical intermediate and as a solvent in the coating and food and flavor industry. Butanol can be produced from biomass (biobutanol) as well as fossil fuels (petrobutanol).

[0010] The chemical synthesis of butanol in one of its isomers can be accomplished by a variety of available methods known in the art (see e.g. Ullmann's Encyclopedia of Industrial Chemistry, 6th edition, 2003, Wiley-VCH Verlag GmbH and Co., Weinheim, Germany, Vol. 5, pp. 716-719). These processes have the disadvantage that they are based on the use of petrochemical derivatives, are generally expensive, and are not environmentally friendly.

[0011] Biological synthesis of butanol can be achieved by fermentation using the acetone-butanol-ethanol (ABE) process carried out by the bacterium Clostridium acetobutylicum or other Clostridium species. An important disadvantage of the ABE process, however, is that it results in a mixture of acetone, 1-butanol and ethanol. Moreover, the use of bacteria requires sterile process conditions and generally renders the process susceptible to bacteriophage infection. Yeast cells thus provide a very suitable alternative as described above.

SUMMARY OF THE INVENTION

[0012] The present inventors have now identified alternative metabolic routes for increasing the production of cytosolic acetyl-CoA in yeast which can overcome the problems of the PDH by-pass.

[0013] One possible route includes the direct conversion of acetaldehyde to acetyl-CoA without ATP consumption, by use of an acetylating acetaldehyde dehydrogenase (E.C. 1.2.1.10) (see FIG. 2, reaction A, ACDH). Another route includes the direct conversion of pyruvate to acetyl-CoA by an enzyme or a multi-enzyme complex without ATP consumption, for instance, by use of a pyruvate:NADP oxidoreductase (E.C. 1.2.1.51) (see FIG. 2, reaction C, PNO). In these two possible routes, the formation of 1 mole of butanol per mole of glucose would result in the formation of 2 moles of ATP. Yet another route includes the conversion of acetate to acetyl-CoA with 1 ATP consumed per acetyl-CoA formed by an alternative enzyme or a combination of enzymes, for instance, by use of acetate:CoA ligase (ADP-forming, E.C. 6.2.1.13), or by use of ATP:acetate phosphotransferase (E.C. 2.7.2.1) in combination with acetyl-CoA:Pi acetyltransferase (E.C. 2.3.1.8). In this route, the formation of 1 mole of butanol per mole of glucose is ATP-balanced, i.e. no ATP will be formed. The present inventors have now found that such an alternative to the PDH by-pass can result in acetyl-CoA synthesis in the cytosol of the yeast, and that such acetyl-CoA can be used biosynthetically to produce higher amounts of desirable fermentation products, such as butanol.
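
The ATP accounting described for the PDH by-pass and for the alternative routes can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the application as filed; it uses only the figures stated in the text (2 ATP from glycolysis per glucose, 2 acetyl-CoA per butanol, and the ATP cost per acetyl-CoA of each route). The route labels and function names are illustrative assumptions.

```python
# ATP equivalents consumed per acetyl-CoA formed, as stated in the text.
ATP_COST_PER_ACETYL_COA = {
    "PDH by-pass (PDC + ALD + ACS)": 2,  # ATP -> AMP is counted as 2 ATP
    "ACDH (E.C. 1.2.1.10)": 0,
    "PNO (E.C. 1.2.1.51)": 0,
    "ADP-forming acetate route (E.C. 6.2.1.13)": 1,
}

GLYCOLYTIC_ATP_PER_GLUCOSE = 2  # net ATP from glucose to 2 pyruvate
ACETYL_COA_PER_BUTANOL = 2      # 1 glucose -> 2 acetyl-CoA -> 1 butanol


def net_atp_per_butanol(cost_per_acetyl_coa):
    """Net ATP per mole of butanol formed from 1 mole of glucose."""
    return GLYCOLYTIC_ATP_PER_GLUCOSE - ACETYL_COA_PER_BUTANOL * cost_per_acetyl_coa


def atp_balance_table():
    """Map each route label to its net ATP per mole of butanol."""
    return {
        route: net_atp_per_butanol(cost)
        for route, cost in ATP_COST_PER_ACETYL_COA.items()
    }


if __name__ == "__main__":
    for route, net in atp_balance_table().items():
        print(f"{route}: {net:+d} ATP per butanol")
```

Under these assumptions the PDH by-pass shows a deficit, the ACDH and PNO routes show a surplus, and the ADP-forming acetate route is balanced, in line with the statements above. Growth and maintenance requirements are ignored, as in the text.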

[0014] In a first aspect, the present invention provides a method of identifying a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell comprising:

[0015] providing a mutated yeast cell, wherein said mutation comprises an inactivation of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS);

[0016] transforming said mutated yeast cell with an expression vector comprising at least one heterologous nucleotide sequence operably linked to a promoter functional in yeast and said at least one heterologous nucleotide sequence encoding at least one candidate polypeptide having potential enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA;

[0017] testing said recombinant mutated yeast cell for its ability to grow on minimal medium containing glucose as sole carbon source, and

[0018] identifying said candidate polypeptide as a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell when growth of said cell is observed.
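
The decision logic of this growth-based screen can be sketched as follows. This is an illustrative sketch only: the candidate labels are hypothetical, and the empty-vector control (a check that the mutated host does not grow without a candidate gene) is an assumption added here, not a step recited in the method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenResult:
    candidate: str          # hypothetical label of the heterologous gene
    grows_on_glucose: bool  # growth on minimal medium, glucose as sole carbon source


def identify_hits(results, empty_vector_grows=False):
    """Return the candidates whose transformants grow on glucose minimal medium.

    Assumption: if the mutated host carrying an empty vector already grows,
    growth of a transformant says nothing about the candidate, so the
    screen is treated as uninformative.
    """
    if empty_vector_grows:
        raise ValueError("host grows without a candidate gene; screen is uninformative")
    return [r.candidate for r in results if r.grows_on_glucose]


if __name__ == "__main__":
    observed = [
        ScreenResult("candidate_A", True),
        ScreenResult("candidate_B", False),
    ]
    print(identify_hits(observed))
```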

[0019] In a preferred embodiment of said method the yeast cell is a cell of Saccharomyces cerevisiae and the heterologous nucleotide sequence is codon (pair) optimized for expression in Saccharomyces cerevisiae.

[0020] In another preferred embodiment, said mutation comprises an inactivation of the gene for acetyl-CoA synthetase isoform 2 (acs2).

[0021] In another preferred embodiment, said at least one candidate polypeptide having enzymatic activity for converting acetaldehyde into acetyl-CoA is a (putative) acetylating acetaldehyde dehydrogenase.

[0022] Alternatively, said at least one heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell may consist of two or more enzymes working together to achieve the desired conversion from pyruvate, acetaldehyde or acetate into acetyl-CoA.

[0023] In another aspect, the present invention provides an integration vector for the integration in a yeast genome of a heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA, and the subsequent expression of the heterologous polypeptide therefrom.

[0024] In another aspect, the present invention provides an expression vector expressing heterologous polypeptides in yeast, said expression vector comprising a heterologous nucleotide sequence operably linked to a promoter functional in yeast and said heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell.

[0025] In a preferred embodiment of said vector the polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA is identified by a method according to the present invention as described above.

[0026] In another preferred embodiment, said polypeptide is selected from SEQ ID NO: 19, 22, 25, 28 and 52 and functional homologues thereof.

[0027] In another preferred embodiment, said expression vector is for expression in Saccharomyces cerevisiae, wherein said heterologous nucleotide sequence is codon (pair) optimized for expression in Saccharomyces cerevisiae.

[0028] In another preferred embodiment, said heterologous nucleotide sequence is selected from SEQ ID NO: 20, 23, 26 and 29.

[0029] In another aspect, the present invention provides a recombinant yeast cell comprising the expression vector of the present invention as described above.

[0030] In a preferred embodiment, the recombinant yeast cell further comprises an inactivation of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS).

[0031] Preferably, a yeast cell according to the present invention comprises an inactivation of a gene encoding an acetyl-CoA synthase.

[0032] In another preferred embodiment, the recombinant yeast cell further comprises an inactivation of a gene (nucleotide sequence) encoding an enzyme capable of catalysing the conversion of acetaldehyde to ethanol, preferably a gene encoding an alcohol dehydrogenase.

[0033] As used herein, inactivation of a gene (nucleotide sequence) encoding an enzyme may be achieved by mutation, deletion or disruption of (part of) a gene or nucleotide sequence encoding an enzyme.

[0034] Preferably a yeast cell according to the present invention shows growth on minimal medium containing glucose as sole carbon source.

[0035] In another preferred embodiment of a yeast cell of the invention, said yeast cell further comprises one or more introduced genes encoding a recombinant pathway for the formation of 1-butanol from cytosolic acetyl-CoA. Suitable recombinant pathways from acetyl-CoA to 1-butanol are known in the art. Such pathways are for instance known from WO 2007/041269. Preferably said one or more introduced genes encode enzymes that produce acetoacetyl-CoA, 3-hydroxybutyryl-CoA, crotonyl-CoA, butyryl-CoA, butyraldehyde and/or 1-butanol. Said enzymes can be:

[0036] acetyl-CoA acetyltransferase (E.C. 2.3.1.9 [Enzyme Nomenclature 1992, Academic Press, San Diego]; although enzymes with a broader substrate range (E.C. 2.3.1.16) will be functional as well), which converts 2 moles of acetyl-CoA to acetoacetyl-CoA;

[0037] NADH-dependent or NADPH-dependent 3-hydroxybutyryl-CoA dehydrogenase (E.C. 1.1.1.35 or E.C. 1.1.1.30, resp. E.C. 1.1.1.157 or E.C. 1.1.1.36), which converts acetoacetyl-CoA to 3-hydroxybutyryl-CoA;

[0038] 3-hydroxybutyryl-CoA dehydratase (also named crotonase; E.C. 4.2.1.17 or E.C. 4.2.1.55), which converts 3-hydroxybutyryl-CoA to crotonyl-CoA;

[0039] NADH-dependent or NADPH-dependent butyryl-CoA dehydrogenase (E.C. 1.3.1.44 resp. E.C. 1.3.1.38 or E.C. 1.3.99.2), which converts crotonyl-CoA to butyryl-CoA;

[0040] monofunctional NADH-dependent or NADPH-dependent aldehyde dehydrogenase (E.C. 1.2.1.10, or 1.2.1.57), which converts butyryl-CoA to butyraldehyde, and

[0041] NADH-dependent or NADPH-dependent butanol dehydrogenase (E.C. 1.1.1.-), which converts butyraldehyde to 1-butanol, or

[0042] bifunctional NADH-dependent or NADPH-dependent aldehyde/alcohol dehydrogenase (E.C. 1.1.1.1/1.2.1.10), which converts butyryl-CoA to 1-butanol via butyraldehyde.
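
The enzyme list above describes a linear chain from acetyl-CoA to 1-butanol. As a minimal sketch, the chain can be written down as data and checked for continuity; the substrate and product names are taken from the list, while the variable and function names are illustrative assumptions. The monofunctional variant (separate aldehyde and butanol dehydrogenases) is shown; the bifunctional aldehyde/alcohol dehydrogenase would replace the last two steps.

```python
# (enzyme, substrate, product) for each step, in pathway order.
BUTANOL_PATHWAY = [
    ("acetyl-CoA acetyltransferase", "acetyl-CoA", "acetoacetyl-CoA"),
    ("3-hydroxybutyryl-CoA dehydrogenase", "acetoacetyl-CoA", "3-hydroxybutyryl-CoA"),
    ("3-hydroxybutyryl-CoA dehydratase", "3-hydroxybutyryl-CoA", "crotonyl-CoA"),
    ("butyryl-CoA dehydrogenase", "crotonyl-CoA", "butyryl-CoA"),
    ("aldehyde dehydrogenase", "butyryl-CoA", "butyraldehyde"),
    ("butanol dehydrogenase", "butyraldehyde", "1-butanol"),
]


def is_linear_pathway(steps):
    """True if the product of every step is the substrate of the next step."""
    return all(
        steps[i][2] == steps[i + 1][1]
        for i in range(len(steps) - 1)
    )


def metabolites(steps):
    """Ordered list of metabolites from the first substrate to the final product."""
    if not steps:
        return []
    return [steps[0][1]] + [product for _, _, product in steps]


if __name__ == "__main__":
    print(is_linear_pathway(BUTANOL_PATHWAY))
    print(" -> ".join(metabolites(BUTANOL_PATHWAY)))
```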

[0043] In another preferred embodiment of the invention the yeast cell is a Saccharomyces cerevisiae cell.

[0044] In another aspect, the present invention provides a method of producing butanol, comprising the steps of fermenting a suitable carbon substrate with a yeast cell according to the present invention and recovering the butanol produced during said fermentation.

BRIEF DESCRIPTION OF THE DRAWINGS

[0045] FIG. 1 is a schematic presentation of the PDH by-pass showing the enzymes pyruvate decarboxylase (PDC; E.C. 4.1.1.1), acetaldehyde dehydrogenase (ALD; E.C. 1.2.1.3, E.C. 1.2.1.4 and E.C. 1.2.1.5), and acetyl-CoA synthetase (ACS; E.C. 6.2.1.1).

[0046] FIG. 2 shows a schematic metabolic route for butanol production in Saccharomyces cerevisiae. Reactions 1-6 are the butanol biosynthesis steps from Clostridium acetobutylicum introduced in yeast. A, B, and C indicate alternative reactions for acetyl-CoA biosynthesis in the cytosol. B indicates part of the pyruvate dehydrogenase by-pass (pdc, ald and acs), the natural source of cytosolic acetyl-CoA in yeast. Glc, glucose; EtOH, ethanol; Pyr, pyruvate; AA, acetaldehyde; ACT, acetate; AcCoA, acetyl-CoA; AACoA, acetoacetyl-CoA; BuCoA, butyryl-CoA; Bual, butyraldehyde; BuOH, butanol; NAD(P)(H), nicotinamide adenine dinucleotide (phosphate) (in reduced form); ATP, adenosine triphosphate; AMP, adenosine monophosphate; TCA cycle, tricarboxylic acid cycle; PDH, pyruvate dehydrogenase; pdc, pyruvate decarboxylase; adh, alcohol dehydrogenase; acdh, acetylating acetaldehyde dehydrogenase; ald, acetaldehyde dehydrogenase; acs, acetyl-CoA synthetase; pno, pyruvate:NADP oxidoreductase. Enzymatic conversions indicated by reactions 1-6 represent a heterologous butanol pathway from Clostridium acetobutylicum: thIB (or ThL) encoding acetyl-CoA acetyltransferase or thiolase [E.C. 2.3.1.9] (SEQ ID NO:30); hbd, 3-hydroxybutyryl-CoA dehydrogenase [E.C. 1.1.1.157] (SEQ ID NO:31); crt, 3-hydroxybutyryl-CoA dehydratase [E.C. 4.2.1.55] (SEQ ID NO:32); ter, trans-enoyl CoA reductase; bcd, butyryl-CoA dehydrogenase [E.C. 1.3.99.2] (SEQ ID NO:33); etf αβ, heterodimeric electron transfer flavoprotein (etf α and etf β, SEQ ID NO:38 and SEQ ID NO:39, respectively); adhE/adhE1, aldehyde/alcohol dehydrogenase E and E1 [E.C. 1.1.1.1/1.2.1.10] (SEQ ID NO:34 and 35, respectively); bdhA/bdhB, NAD(P)H-dependent butanol dehydrogenase A and B [E.C. 1.1.1.-] (SEQ ID NO:36 and 37, respectively).

[0047] FIG. 3 shows the map of plasmid YEplac112PtdhTadh. The sequence of this plasmid is provided in SEQ ID NO:40.

[0048] FIG. 4 shows an example of a similarity tree based on amino acid sequences of proteins of the types 1 to 4 as described in Example 2 and indicates the branches.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

[0049] The term "butanol" refers to n-butanol, or 1-butanol.

[0050] The term "yeast" refers to a phylogenetically diverse group of single-celled fungi, most of which are in the divisions Ascomycota and Basidiomycota. The budding yeasts ("true yeasts") are classified in the order Saccharomycetales, with Saccharomyces cerevisiae as the best-known species.

[0051] The term "recombinant yeast" as used herein, is defined as a cell which contains a nucleotide sequence and/or protein, or is transformed or genetically modified with a nucleotide sequence that does not naturally occur in the yeast, or it contains an additional copy or copies of an endogenous nucleic acid sequence (or protein), or it contains a mutation, deletion or disruption of an endogenous nucleic acid sequence.

[0052] The term "mutated" as used herein regarding proteins or polypeptides means that at least one amino acid in the wild-type or naturally occurring protein or polypeptide sequence has been replaced with a different amino acid, or deleted from the sequence via mutagenesis of nucleic acids encoding these amino acids. Mutagenesis is a well-known method in the art, and includes, for example, site-directed mutagenesis by means of PCR or via oligonucleotide-mediated mutagenesis as described in Sambrook et al., Molecular Cloning-A Laboratory Manual, 2nd ed., Vol. 1-3 (1989). The term "mutated" as used herein regarding genes means that at least one nucleotide in the nucleotide sequence of that gene or a regulatory sequence thereof, has been replaced with a different nucleotide, or has been deleted from the sequence via mutagenesis, resulting in the transcription of a non-functional protein sequence or the knock-out of that gene.

[0053] The term "gene", as used herein, refers to a nucleic acid sequence containing a template for a nucleic acid polymerase, in eukaryotes, RNA polymerase II. Genes are transcribed into mRNAs that are then translated into protein.

[0054] The term "pyruvate dehydrogenase (PDH) by-pass" refers to the enzymatic cascade from pyruvate to acetyl-CoA in the cytosol of yeast, which consists of the following enzymes: pyruvate decarboxylase (PDC; E.C. 4.1.1.1), converting pyruvate into acetaldehyde; acetaldehyde dehydrogenase (ALD; E.C. 1.2.1.3, E.C. 1.2.1.4 and E.C. 1.2.1.5), converting acetaldehyde into acetate; and acetyl-CoA synthetase (ACS; E.C. 6.2.1.1), converting acetate into acetyl-CoA.

[0055] The term "nucleic acid" as used herein, includes reference to a deoxyribonucleotide or ribonucleotide polymer, i.e. a polynucleotide, in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids). A polynucleotide can be full-length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are "polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein. It will be appreciated that a great variety of modifications have been made to DNA and RNA that serve many useful purposes known to those of skill in the art. The term polynucleotide as it is employed herein embraces such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including among other things, simple and complex cells.

[0056] The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms "polypeptide", "peptide" and "protein" are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.

[0057] Sequence identity is herein defined as a relationship between two or more amino acid (polypeptide or protein) sequences or two or more nucleic acid (polynucleotide) sequences, as determined by comparing the sequences. Usually, sequence identities are compared over the whole length of the sequences compared. In the art, "identity" also means the degree of sequence relatedness between amino acid or nucleic acid sequences, as the case may be, as determined by the match between strings of such sequences.

[0058] Preferred methods to determine identity are designed to give the largest match between the sequences tested. Methods to determine identity are codified in publicly available computer programs. Preferred computer program methods to determine identity and similarity between two sequences include BLASTP and BLASTN (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)), publicly available from NCBI and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, Md. 20894). Preferred parameters for amino acid sequence comparison using BLASTP are gap open 11.0, gap extend 1, Blosum 62 matrix.
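
To illustrate what a percent-identity figure expresses, the following minimal sketch computes identity from two sequences that have already been aligned. It is not BLASTP and does not perform the alignment itself; the convention used here (identical residues divided by the number of alignment columns, with "-" marking a gap) is an assumption, and other conventions for the denominator exist. The example sequences are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = identical, non-gap columns / total alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical aligned fragments; '-' marks a gap introduced by the alignment.
    print(round(percent_identity("MKT-AY", "MKTLAF"), 1))
```

A threshold such as "more than 50% sequence identity" would then be a simple comparison against the returned value.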

[0059] Every nucleic acid sequence herein that encodes a polypeptide also, by reference to the genetic code, describes every possible silent variation of the nucleic acid. The term "conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences due to the degeneracy of the genetic code. The term "degeneracy of the genetic code" refers to the fact that a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations" and represent one species of conservatively modified variation.
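
The alanine example can be made concrete with a short sketch that enumerates the silent variants of an RNA sequence at its alanine codons and confirms that each variant encodes the same peptide. The codon table below is deliberately partial (only the codons used in the example), and the example sequence and function names are illustrative assumptions.

```python
from itertools import product

# Partial standard codon table (RNA); only the codons needed for this sketch.
CODON_TABLE = {
    "AUG": "M",
    "GCA": "A", "GCC": "A", "GCG": "A", "GCU": "A",
    "AAA": "K", "AAG": "K",
}

ALANINE_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "A")


def split_codons(rna):
    """Split an RNA string into consecutive three-letter codons."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def translate(rna):
    """Translate an RNA string using the partial codon table above."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(rna))


def alanine_silent_variants(rna):
    """All sequences obtained by swapping each alanine codon for any alanine codon."""
    options = [
        ALANINE_CODONS if CODON_TABLE[codon] == "A" else [codon]
        for codon in split_codons(rna)
    ]
    return ["".join(choice) for choice in product(*options)]


if __name__ == "__main__":
    for variant in alanine_silent_variants("AUGGCAAAA"):
        print(variant, translate(variant))
```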

[0060] "Expression" refers to the transcription of a gene into structural RNA (rRNA, tRNA) or messenger RNA (mRNA) with subsequent translation into a protein.

[0061] As used herein, "heterologous" in reference to a nucleic acid or protein is a nucleic acid or protein that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, one or both are substantially modified from their original form. A heterologous protein may originate from a foreign species or, if from the same species, is substantially modified from its original form by deliberate human intervention.

[0062] As used herein "promoter" is a DNA sequence that directs the transcription of a (structural) gene. Typically, a promoter is located in the 5'-region of a gene, proximal to the transcriptional start site of a (structural) gene. Promoter sequences may be constitutive, inducible or repressible. If a promoter is an inducible promoter, then the rate of transcription increases in response to an inducing agent.

[0063] The term "vector" as used herein, includes reference to an autosomal expression vector and to an integration vector used for integration into the chromosome.

[0064] The term "expression vector" refers to a DNA molecule, linear or circular, that comprises a segment encoding a polypeptide of interest under the control of (i.e., operably linked to) additional nucleic acid segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and may optionally include one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are generally derived from plasmid or viral DNA, or may contain elements of both. In particular an expression vector comprises a nucleotide sequence that comprises in the 5' to 3' direction and operably linked: (a) a yeast-recognized transcription and translation initiation region, (b) a coding sequence for a polypeptide of interest, and (c) a yeast-recognized transcription and translation termination region. "Plasmid" refers to autonomously replicating extrachromosomal DNA which is not integrated into a microorganism's genome and is usually circular in nature.

[0065] An "integration vector" refers to a DNA molecule, linear or circular, that can be incorporated in a microorganism's genome and provides for stable inheritance of a gene encoding a polypeptide of interest. The integration vector generally comprises one or more segments comprising a gene sequence encoding a polypeptide of interest under the control of (i.e., operably linked to) additional nucleic acid segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and one or more segments that drive the incorporation of the gene of interest into the genome of the target cell, usually by the process of homologous recombination. Typically, the integration vector will be one which can be transferred into the target cell, but which has a replicon which is nonfunctional in that organism. Integration of the segment comprising the gene of interest may be selected if an appropriate marker is included within that segment.

[0066] As used herein, the term "operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence "operably linked" to another control sequence and/or to a coding sequence is ligated in such a way that transcription and/or expression of the coding sequence is achieved under conditions compatible with the control sequence. Generally, operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.

[0067] By "host cell" is meant a cell which contains a vector and supports the replication and/or expression of the vector. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells. Preferably, host cells are cells of the order of Actinomycetales, most preferably yeast cells, most preferably cells of Saccharomyces cerevisiae.

[0068] "Transformation" and "transforming", as used herein, refers to the insertion of an exogenous polynucleotide into a host cell, irrespective of the method used for the insertion, for example, direct uptake, transduction, f-mating or electroporation. The exogenous polynucleotide may be maintained as a non-integrated vector, for example, a plasmid, or alternatively, may be integrated into the host cell genome.

[0069] The term "oligonucleotide" refers to a short sequence of nucleotide monomers (usually 6 to 100 nucleotides) joined by phosphorous linkages (e.g., phosphodiester, alkyl and aryl-phosphate, phosphorothioate, phosphotriester), or non-phosphorous linkages (e.g., peptide, sulfamate and others). An oligonucleotide may contain modified nucleotides having modified bases (e.g., 5-methyl cytosine) and modified sugar groups (e.g., 2'-O-methyl ribosyl, 2'-O-methoxyethyl ribosyl, 2'-fluoro ribosyl, 2'-amino ribosyl, and the like). Oligonucleotides may be naturally-occurring or synthetic molecules of double- and single-stranded DNA and double- and single-stranded RNA with circular, branched or linear shapes and optionally including domains capable of forming stable secondary structures (e.g., stem-and-loop and loop-stem-loop structures).

[0070] The term "polynucleotide" as used herein refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. Thus, this term includes double- and single-stranded DNA and RNA.

[0071] The term "recombinant polynucleotide" as used herein intends a polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its origin or manipulation: (1) is not associated with all or a portion of a polynucleotide with which it is associated in nature; or (2) is linked to a polynucleotide other than that to which it is linked in nature; or (3) does not occur in nature.

[0072] The term "minimal medium" as used herein refers to a chemically defined medium, which includes only the nutrients that are required by the cells to survive and proliferate in culture. Typically, minimal medium is free of biological extracts, e.g., growth factors, serum, pituitary extract, or other substances, which are not necessary to support the survival and proliferation of a cell population in culture. For example, minimal medium generally includes as essential substances: at least one carbon source, such as glucose; at least one nitrogen source, such as ammonium, ammonium sulfate, ammonium chloride, ammonium nitrate or urea; inorganic salts, such as dipotassium hydrogenphosphate, potassium dihydrogen-phosphate and magnesium sulfate; and other nutrients, such as biotin and vitamins.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0073] The present invention provides a method for identifying heterologous enzymes capable of producing acetyl-CoA in the cytosol of a yeast cell. The heterologous enzyme may produce the acetyl-CoA using pyruvate, acetaldehyde or acetate as a substrate, preferably in a single conversion step. Preferably, the heterologous enzyme produces the acetyl-CoA from acetaldehyde. An enzyme capable of catalyzing said reaction is acetylating acetaldehyde dehydrogenase (acdh; E.C. 1.2.1.10), also referred to as acetaldehyde:NAD+ oxidoreductase (CoA-acetylating). The conversion of acetaldehyde into acetyl-CoA by acetylating acetaldehyde dehydrogenase is reversible and runs in the direction of acetyl-CoA when acetaldehyde accumulates in the cytosol. Such an accumulation may for instance be achieved by deletion of alcohol dehydrogenase (adh; E.C. 1.1.1.1).

[0074] The heterologous enzyme may also produce the acetyl-CoA from pyruvate. An enzyme capable of catalyzing said reaction is a pyruvate:NADP oxidoreductase (pno; E.C. 1.2.1.51). The reaction is stoichiometrically identical to the mitochondrial pyruvate dehydrogenase except that pno uses NADPH as a cofactor as compared to PDH that uses NADH. Compared to acdh, an important disadvantage of the pno enzyme system is that pno is oxygen sensitive, and that it is a large multimeric enzyme, and hence, its successful genetic incorporation (a 5-6 kb gene) is much more difficult than that of acdh. For this reason, the use of acdh is preferred in embodiments of the present invention.

[0075] An important feature of a test cell capable of revealing the desired enzymatic activity of a test polypeptide is that the cell is prototrophic as a result of the introduced polypeptide. With this, it is meant that the cell's nutritional requirements do not exceed those of the corresponding wild-type strain and that it will proliferate on minimal medium (in contrast to the auxotroph). In fact, the production of acetyl-CoA as supported by the test polypeptide will cancel the effect of the deletion of said at least one gene of the PDH by-pass, caused by the deletion of the gene for pyruvate decarboxylase (pdc; E.C. 4.1.1.1), acetaldehyde dehydrogenase (ald; E.C. 1.2.1.3, E.C. 1.2.1.4 or E.C. 1.2.1.5), or acetyl-CoA synthetase (acs; E.C. 6.2.1.1). Such complementation assays are well known in the art. In aspects of the present invention the assay is used to identify suitable sources of heterologous enzymes capable of sustaining cytosolic acetyl-CoA production in yeast cells.

[0076] The complementation assay is based on the provision of alternative routes to overcome the deleted enzyme activity of the PDH by-pass. Methods for effecting deletion of genes in yeast are well known in the art, and can for instance be achieved by oligonucleotide-mediated mutagenesis. Good results may be obtained with the plasmid pUG6 carrying the loxP-kanMX-loxP gene disruption cassette (Guldener et al. [1996] Nucleic Acids Res. 24(13):2519-24; GenPept accession no. P30114). Thus, the skilled person will be able to provide a yeast strain having a deleted acetaldehyde dehydrogenase and/or acetyl-CoA synthetase gene for blocking the PDH by-pass therein.

[0077] Saccharomyces cerevisiae comprises two acetyl-CoA synthetase isoforms, Acs1p and Acs2p. Both are the nuclear source of acetyl-CoA for histone acetylation. The production of cytosolic acetyl-CoA is also required for lipid production. Acs activity is essential, since an acs1 acs2 double null mutant is non-viable. An acs1 null mutant can grow with ethanol as the sole carbon source. The mutated yeast cell used in aspects of the present invention preferably has an inactivation of the acs2 gene.

[0078] Saccharomyces cerevisiae mutants carrying an inactivation of the acs2 gene are not able to grow on glucose as sole carbon source, because ACS1 is repressed and the protein is actively degraded. Complementation of such a delta acs2 mutant with a plasmid based acs gene will restore the cell's ability to grow on glucose as single carbon source. In addition, growth of such a mutant is complemented by the expression of genes supporting alternative routes for the production of sufficient cytosolic acetyl-CoA. Thus, transformation of the delta acs2 mutant with a plasmid from which a functional (heterologous) acdh or pno can be expressed will restore the mutant's ability to grow on glucose as sole carbon source. It should be understood that in addition to the removal of the ACS2 locus, one may also remove the ACS1 locus. Although it is believed that this may in some instances prevent the occurrence of revertants (mutations in the ACS1 locus leading to reversion of the delta acs2 phenotype), this was not found to be essential. Double mutants (acs1/acs2Δ strains) would be wholly dependent on the introduced acdh or pno gene for the production of cytosolic acetyl-CoA.
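The logic of the complementation read-out described above can be summarized in the following minimal sketch. It is a Boolean simplification written for illustration: the class `Strain`, its field names and the function `grows_on_glucose` are hypothetical, and the sketch ignores revertants as well as expression level and enzyme kinetics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    acs1_intact: bool = True
    acs2_intact: bool = True
    plasmid_acs: bool = False             # plasmid-based acs gene
    functional_acdh_or_pno: bool = False  # functional heterologous acdh or pno


def grows_on_glucose(strain: Strain) -> bool:
    """Predicted growth on minimal medium with glucose as sole carbon source.

    ACS1 is repressed on glucose and its protein is degraded, so an intact
    ACS1 locus is deliberately not counted as a cytosolic acetyl-CoA source.
    """
    return (
        strain.acs2_intact
        or strain.plasmid_acs
        or strain.functional_acdh_or_pno
    )


if __name__ == "__main__":
    wild_type = Strain()
    acs2_null = Strain(acs2_intact=False)
    complemented = Strain(acs2_intact=False, functional_acdh_or_pno=True)
    for name, s in [("wild type", wild_type),
                    ("acs2 null", acs2_null),
                    ("acs2 null + acdh/pno", complemented)]:
        print(f"{name}: growth on glucose = {grows_on_glucose(s)}")
```

Under these assumptions a double mutant lacking both ACS1 and ACS2 grows on glucose only when the introduced acdh or pno gene is functional, which is the dependence noted in the text.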

[0079] An important advantage of a complementation assay of the present invention is that it can be performed as a plate screening assay wherein successful complementation is observed as colony growth. This is much faster than experiments that require the analysis for the production of a desired metabolic product.

[0080] For complementation of the mutation, the yeast cell having the inactivated ald and/or acs gene is then transformed with a suitable expression vector comprising a nucleotide sequence of a heterologous test polypeptide.

[0081] Yeast expression vectors are widely available from a variety of commercial suppliers. To date, functional complementation of yeast mutations by foreign homologues has become a standard practice in engineering of Saccharomyces cerevisiae. Suitable expression vectors for heterologous gene expression may be based on artificial, inducible promoters such as the GAL promoter, but are preferably based on constitutive promoters such as the TDH3 promoter. Suitable systems are exemplified in the examples below. In certain production systems, the use of an inducible promoter may be preferred, as it would allow for temporal separation of stages for biomass production (promoter not induced) and fermentation product production (promoter induced). In another highly preferred embodiment in certain production systems, the vector is an integration vector for stably integrating the heterologous genes in the genome of the yeast production strain.

[0082] In order to achieve optimal expression in yeast, the codon (pair) usage of the heterologous gene may be optimized by using any one of a variety of synthetic gene design software packages, for instance GeneOptimizer® from Geneart AG (Regensburg, Germany) for codon usage optimization or codon pair usage optimization as described in WO2008/000632. Such adaptation of codon usage ensures that the heterologous genes, which are for instance of bacterial origin, are effectively processed by the yeast transcription and translation machinery. Optimization of codon pair usage will result in enhanced protein expression in the yeast cell.

[0083] The optimized sequences may for instance be cloned into a high copy yeast expression plasmid, operably linked to a (preferably constitutive) promoter functional in yeast. Good results have been obtained with the plasmid YEplac112 (2μ TRP1) (Gietz & Sugino [1988] Gene 74(2):527-34).

[0084] Heterologous genes that encode a candidate polypeptide having potential enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA may be identified in silico. Suitable enzymes described as possessing the capacity to convert acetaldehyde into acetyl-CoA are acetylating acetaldehyde dehydrogenases (E.C. 1.2.1.10). The nucleotide and amino acid sequences of over 200 of these enzymes from a variety of microbial origins are described in various databases (e.g. the KEGG (Kyoto Encyclopedia of Genes and Genomes) database).

[0085] The present inventors have selected several acetylating acetaldehyde dehydrogenases and tested these in the delta acs2 mutant-based assay system of the present invention. Many of these, though not all, were functional in S. cerevisiae when codon pair usage was optimized.

[0086] Functional homologues to these proteins can also be used in aspects of the present invention. The term "functional homologues" as used herein refers to a protein comprising the amino acid sequence of SEQ ID NO:19, 22, 25 or the acetaldehyde dehydrogenase part of SEQ ID NOs: 28 and 52 in which one or more amino acids are substituted, deleted, added, and/or inserted, and which protein has the same enzymatic functionality for substrate conversion, for instance an acetylating acetaldehyde dehydrogenase homologue is capable of converting acetaldehyde into acetyl-CoA. This functionality may be tested by use of an assay system comprising a recombinant yeast cell comprising an expression vector for the expression of the homologue in yeast, said expression vector comprising a heterologous nucleotide sequence operably linked to a promoter functional in yeast and said heterologous nucleotide sequence encoding the homologous polypeptide of which enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl CoA in (the cytosol of) said yeast cell is to be tested, and performing a method for identifying a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) a yeast cell as described herein using said assay system. Candidate homologues may be identified by using in silico similarity analyses. A detailed example of such an analysis is described in Example 2 below. The skilled person will be able to derive therefrom how suitable candidate homologues may be found and, optionally upon codon (pair) optimization, will be able to test the required functionality of such candidate homologues using the assay system of the present invention as described above. 
A suitable homologue represents a polypeptide having an amino acid sequence identity to an acetylating acetaldehyde dehydrogenase of more than 50%, preferably more than 60%, more preferably more than 70%, 80%, 90% or more, for instance having such an amino acid sequence identity to SEQ ID NOs:19, 22, 25, or the acetaldehyde dehydrogenase part of SEQ ID NOs:28 and 52 and having the required enzymatic functionality for converting acetaldehyde into acetyl-CoA. Similarly, enzymes described for the direct conversion of pyruvate into acetyl-CoA and the functional homologues thereof, as well as enzymes described for the conversion of acetate to acetyl-CoA and the functional homologues thereof, can also be used, similar as described for acetylating acetaldehyde dehydrogenase above.

[0087] A method of the present invention further comprises the step of testing the ability of the mutated and test-protein transformed yeast cell to grow on minimal medium containing glucose as sole carbon source. As stated earlier, this may suitably occur on solid (agar) media in Petri dishes (plates), where growth can be observed as growth of a colony; however, liquid media are equally suitable and growth may be detected by turbidity. Other methods for determining growth of the mutated and test-protein transformed yeast cell on minimal medium containing glucose as sole carbon source may also be used.

[0088] When the mutated and test-protein-transformed yeast cell is capable of growth on minimal medium with glucose, the candidate polypeptide is successfully identified as a heterologous polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell. Growth may suitably be observed as colony formation on solid growth media, in particular minimal medium containing glucose.

[0089] An expression vector for the expression of heterologous polypeptides in yeast, according to the present invention may be any expression vector suitable for transforming yeast. Innumerable examples are available in the art that can suitably be used to express heterologous nucleotide sequences in yeast. A very suitable vector in aspects of the invention is a plasmid. A highly preferred plasmid is YEplac112PtdhTadh (SEQ ID NO:40).

[0090] Generally, the heterologous nucleotide sequence encoding the polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl CoA in (the cytosol of) said yeast cell, will be placed under control of a promoter functional in yeast. Preferably the promoter is a constitutive promoter. The promoter on plasmid YEplac112PtdhTadh is the TDH3 promoter.

[0091] The heterologous nucleotide sequences incorporated in the expression vector of the present invention may encode any pno, acdh or other enzyme capable of converting pyruvate, acetaldehyde or acetate (respectively) into acetyl-CoA in the cytosol of the yeast. Preferred nucleotide sequences are those as identified herein, namely the nucleotide sequences encoding:

[0092] the ethanolamine utilization protein EutE from E. coli HS (nucleotide sequences with SEQ ID NO:18);

[0093] the hypothetical protein Lin1129 from Listeria innocua similar to ethanolamine utilization protein EutE (nucleotide sequences with SEQ ID NO:21);

[0094] the acetaldehyde dehydrogenase EDK33116 from Clostridium kluyveri DSM 555 (nucleotide sequences with SEQ ID NO:24);

[0095] the adhE homologue of S. aureus (nucleotide sequences with SEQ ID NO:27) encoding a bifunctional acetaldehyde/alcohol dehydrogenase in Staphylococcus aureus subsp. aureus N315, or the acetaldehyde dehydrogenase functional part thereof; and

[0096] the adhE homologue of Piromyces sp. E2 (nucleotide sequence SEQ ID NO: 51) encoding a bifunctional acetaldehyde/alcohol dehydrogenase, or the acetaldehyde dehydrogenase part thereof.

[0097] Also suitable are functional homologues of these nucleotide sequences, or of the polypeptides that they encode. This term means a nucleic acid sequence having more than 80%, 90% or 95% sequence identity with the nucleotide sequences encoding the above acdh enzymes, or a polypeptide having more than 50%, preferably more than 60%, 70%, 80%, 90%, or 95% sequence identity with the amino acid sequence of the above acdh enzymes, with the proviso that the polypeptides encoded by the homologous sequences exhibit functional enzymatic acdh activity.

[0098] As stated above, these nucleotide sequences can be optimized for expression in Saccharomyces cerevisiae by optimization of codon pair usage well known in the art. Codon pair optimized sequences for the SEQ ID NO:18, 21, 24, and 27 are provided in SEQ ID NO:20, 23, 26, and 29, respectively.

[0099] The expression vector of the invention may be used to transform a yeast cell. Methods of transformation include electroporation, glass bead and biolistic transformation, all of which are well known in the art and for instance described in Sambrook et al., Molecular Cloning-A Laboratory Manual, 2nd ed., Vol. 1-3 (1989).

[0100] A yeast cell according to the present invention comprises a heterologous nucleotide sequence encoding a polypeptide having enzymatic activity for converting pyruvate, acetaldehyde or acetate into acetyl-CoA in (the cytosol of) said yeast cell. Preferably, a yeast cell of the invention comprises a heterologous acdh or pno. The advantage of such a yeast cell is that it can produce acetyl-CoA by a metabolic route wherein the PDH by-pass is not required. This is energetically more favourable under anaerobic conditions, and may form the basis of any biological synthesis process using yeast cells under anaerobic conditions where acetyl-CoA is an intermediate. In addition to comprising the heterologous acdh or pno, the yeast cell of the invention may comprise various gene deletions or gene supplementations, depending on the intended use of the yeast.

[0101] Preferably a yeast cell according to the present invention comprises an inactivation of a nucleotide sequence (gene) encoding an enzyme capable of catalysing the conversion of acetaldehyde to ethanol, preferably an alcohol dehydrogenase, for instance to optimize acetaldehyde accumulation in the yeast cell.

[0102] If used in a method of screening for heterologous enzymes according to a method of the invention, the yeast cell comprises a deletion of at least one gene of the (PDH) by-pass, selected from the genes encoding the enzymes pyruvate decarboxylase (PDC), acetaldehyde dehydrogenase (ALD), and acetyl-CoA synthetase (ACS), preferably acetyl-CoA synthetase, most preferably acs2.

[0103] If used in a method of producing a fermentation product, the yeast cell may optionally comprise a number of (heterologous) gene supplementations supporting the metabolic pathway from acetyl-CoA to said fermentation product. Such a pathway may consist only of heterologous gene products, or may make use of a mixture of heterologous and endogenous gene products. In the event the fermentation product is butanol, use can be made of a yeast comprising genes encoding enzymes for the butanol pathway of e.g. Clostridium acetobutylicum as described herein and in FIG. 2. In the event the yeast cell according to the present invention comprises genes encoding enzymes for butanol production, the yeast preferably comprises a nucleotide sequence encoding a butyryl-CoA dehydrogenase and at least one nucleotide sequence encoding a heterologous electron transfer flavoprotein (ETF). It was found that a yeast cell comprising an ETF in addition to genes of the butanol pathway produces an increased amount of butanol.

[0104] A heterologous electron transfer flavoprotein in the eukaryotic cell according to the present invention may be a single protein or the ETF may comprise two or more subunits, for instance an alpha and a beta subunit. Preferably the ETF comprises an ETF alpha (SEQ ID NO: 38) and an ETF beta (SEQ ID NO: 39). The electron transfer flavoprotein may be derived from any suitable origin. Preferably, the ETF is derived from the same origin as the butyryl-CoA dehydrogenase. Preferably, the ETF is derived from prokaryotic origin preferably from a Clostridium sp., preferably a Clostridium acetobutylicum or a Clostridium beijerinckii.

[0105] A method for producing a fermentation product according to the present invention preferably comprises growing a yeast under anaerobic conditions on a suitable carbon and energy source. Suitable sources of carbon and energy are C5 and C6 sugars (monosaccharides) such as glucose and polysaccharides such as starch. Other raw materials such as sugarcane, maize, wheat, barley, sugarbeets, rapeseed, and sunflower are also suitable. In some instances the raw material may be pre-digested by enzymatic treatment. Most preferably the carbon source is lignocellulose, which is composed of mainly cellulose, hemicellulose, pectin, and lignin. Lignocellulose is found, for example, in the stems, leaves, hulls, husks, and cobs of plants. Hydrolysis of these polymers by specific enzymatic treatment releases a mixture of neutral sugars including glucose, xylose, mannose, galactose, and arabinose. Lignocellulosic materials, such as wood, herbaceous material, agricultural residues, corn fiber, waste paper, pulp and paper mill residues can be used to produce butanol. Hydrolysing enzymes are for instance enzymes acting on beta-linked glucans for the hydrolysis of cellulose (these enzymes include endoglucanases, cellobiohydrolases, glucohydrolases and beta-glucosidases); beta-glucosidases that hydrolyze cellobiose; endo-acting and exo-acting hemicellulases and cellobiases for hydrolysis of hemicellulose; and acetylesterases and esterases that hydrolyze lignin glycoside bonds. These and other methods for hydrolysis of lignocellulose are well known in the art.

[0106] Variations and modifications of the embodiments disclosed herein are possible, and practical alternatives to and equivalents of the various elements of the embodiments would be understood by those of ordinary skill in the art upon study of this patent document. These and other variations and modifications of the embodiments disclosed herein may be made without departing from the scope and spirit of the invention.

[0107] The invention will now be illustrated by way of the following non-limiting examples.

EXAMPLES

[0108] The following examples illustrate the provision of a strain of Saccharomyces cerevisiae useful in assays and methods of the present invention, for instance in methods for identifying heterologous enzymes capable of forming cytosolic acetyl-CoA in S. cerevisiae. Such methods are useful in the identification of routes/enzymes which allow the cytosolic supply of acetyl-CoA in S. cerevisiae under anaerobic conditions.

[0109] In order to enhance cytosolic acetyl-CoA formation in our butanol production strain, a selection method was set up to identify heterologous enzymes forming cytosolic acetyl-CoA in S. cerevisiae. The test system is based on a delta acs2 yeast mutant deficient in cytosolic acetyl-CoA biosynthesis on glucose; such a strain is unable to grow on glucose as sole carbon source unless cytosolic acetyl-CoA formation is complemented. Complementation studies in such a strain can reveal which heterologous enzymes are suitable for use in butanol producing strains of Saccharomyces cerevisiae.

[0110] Acetylating acetaldehyde dehydrogenase was identified to be a good candidate for cytosolic acetyl-CoA supply over the homologous PDH by-pass because no ATP is dissipated. Twelve putative acetylating acetaldehyde dehydrogenases, identified based on sequence homology, were synthesized and checked for complementation of the delta acs2 yeast.
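The statement that no ATP is dissipated can be checked with a simple bookkeeping sketch that sums reaction stoichiometries from acetaldehyde to acetyl-CoA for both routes. The ACDH reaction is the one given for E.C. 1.2.1.10; the ALD and ACS stoichiometries are the generally known textbook reactions and are not taken from this passage. Water and protons are omitted, and the names `ALD`, `ACS`, `ACDH` and `net` are chosen for this example only.

```python
# Stoichiometries as {metabolite: coefficient}; negative = consumed, positive = produced.
# Water and protons are left out to keep the bookkeeping readable.
ALD = {"acetaldehyde": -1, "NAD(P)+": -1, "acetate": 1, "NAD(P)H": 1}
ACS = {"acetate": -1, "ATP": -1, "CoA": -1, "acetyl-CoA": 1, "AMP": 1, "PPi": 1}
ACDH = {"acetaldehyde": -1, "NAD+": -1, "CoA": -1, "acetyl-CoA": 1, "NADH": 1}


def net(*reactions: dict) -> dict:
    """Sum reactions and drop metabolites whose net coefficient is zero."""
    total: dict = {}
    for reaction in reactions:
        for metabolite, coefficient in reaction.items():
            total[metabolite] = total.get(metabolite, 0) + coefficient
    return {m: c for m, c in total.items() if c != 0}


if __name__ == "__main__":
    bypass = net(ALD, ACS)   # PDH by-pass, from acetaldehyde onward
    direct = net(ACDH)       # acetylating acetaldehyde dehydrogenase
    print("by-pass ATP balance:", bypass.get("ATP", 0))
    print("ACDH ATP balance:   ", direct.get("ATP", 0))
```

Because the acetyl-CoA synthetase step converts ATP to AMP and pyrophosphate, the by-pass consumes ATP for every acetyl-CoA formed, whereas the single ACDH step does not involve ATP at all.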

[0111] The codon pair optimized genes of the eutE homologues of E. coli, L. innocua and C. kluyveri and the adhE homologue of S. aureus were able to complement the acs2 yeast mutants (4 out of 7), resulting in growth of the acs2Δ S. cerevisiae host. The aim is to improve butanol biosynthesis in yeast by expression of one or more genes so identified.

[0112] In order to test if these heterologous routes for cytosolic acetyl-CoA supply work in S. cerevisiae, a screening system was developed based on Saccharomyces cerevisiae mutants carrying a deletion of the acs2 gene. These cells are not able to grow on glucose as sole carbon source unless the delta acs2 mutant is complemented with a plasmid based acs gene or complemented with the expression of any other gene generating sufficient cytosolic acetyl-CoA. So if it were to be transformed with a plasmid leading to active expression of acdh or pno, such a mutant should be able to grow again with glucose as single carbon source. The complementation studies were performed on plates. The following experiments were performed to set up and evaluate the test system.

Example 1

Construction of Delta acs2 Strain

[0113] The S. cerevisiae acs2 deleted strain (acs2Δ strain) was produced by first performing a PCR on plasmid pUG6 (Guldener et al., 1996, supra) with the following oligonucleotides:

TABLE-US-00001
5'acs2::Kanlox  5'-tacacaaacagaatacaggaaagtaaatcaatacaataataaaacagctgaagcttcgtacgc-3'
3'acs2::Kanlox  5'-tctcattacgaaatttttctcatttaagttatttctttttttgaggcataggccactagtggatctg-3'

[0114] The resulting 1.4 kb fragment, containing the KanMX marker which confers resistance to G418, was used to transform S. cerevisiae CEN.PK113-3C (MATA trp1-289). After transformation the strain was plated on YPD (10 g/l yeast extract (BD Difco), 20 g/l peptone (BD Difco), 10 g/l glucose) with 200 mg/ml Geneticin (G418). In resistant transformants, correct integration was verified by PCR using oligonucleotides:

TABLE-US-00002
5'ACS2:        5'-gatattcggtagccgattcc-3'
3'ACS2:        5'-ccgtaaccttctcgtaatgc-3'
ACS2internal:  5'-cggattcgtcatcagcttca-3'
KanA:          5'-cgcacgtcaagactgtcaag-3'
KanB:          5'-tcgtatgtgaatgctggtcg-3'
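As an aid to the reader, basic properties of the verification oligonucleotides listed above may be tabulated as in the following minimal sketch. The sequences are copied from the table; the function names are hypothetical, and the melting temperature uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rough estimate for short oligonucleotides and is not stated to be the method used by the inventors.

```python
# Verification primers, copied from TABLE-US-00002 (5' to 3').
PRIMERS = {
    "5'ACS2": "gatattcggtagccgattcc",
    "3'ACS2": "ccgtaaccttctcgtaatgc",
    "ACS2internal": "cggattcgtcatcagcttca",
    "KanA": "cgcacgtcaagactgtcaag",
    "KanB": "tcgtatgtgaatgctggtcg",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees Celsius by the Wallace rule."""
    seq = seq.lower()
    at = seq.count("a") + seq.count("t")
    gc = seq.count("g") + seq.count("c")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name:13s} len={len(seq)} GC={gc_fraction(seq):.2f} Tm~{wallace_tm(seq)} C")
```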

[0115] The phenotype was verified by testing for growth on YP with 1% glucose (YPD) or 1% ethanol+1% glycerol (YPEG) as the carbon source.

[0116] One transformant that had the correct PCR bands and did not grow on YP with glucose, but did grow on YP with ethanol and glycerol as the carbon sources, was picked and named RWB060 (MATA trp1-289 acs2::Kanlox).

Example 2

In Silico Identification of Putative Acetylating Acetaldehyde Dehydrogenases for Direct Conversion of Acetaldehyde to Acetyl-CoA

[0117] Enzymes described for the conversion of acetaldehyde to acetyl-CoA are the so-called acetylating acetaldehyde dehydrogenases (ACDH) (E.C. 1.2.1.10) catalysing the following reaction:

Acetaldehyde (AA) + NAD+ + CoA <=> Acetyl-CoA + NADH + H+

[0118] From literature four types of proteins have been described that have this activity:

[0119] 1) Bifunctional proteins that catalyze the reversible conversion of acetyl-CoA to acetaldehyde, and the subsequent reversible conversion of acetaldehyde to ethanol. An example of this type of proteins is the AdhE protein in E. coli (GenBank No: NP_415757). AdhE appears to be the evolutionary product of a gene fusion. The NH2-terminal region of the AdhE protein is highly homologous to aldehyde:NAD+ oxidoreductases, whereas the COOH-terminal region is homologous to a family of Fe2+ dependent ethanol:NAD+ oxidoreductases (Membrillo-Hernandez et al., (2000) J. Biol. Chem. 275: 33869-33875). The E. coli AdhE is subject to metal-catalyzed oxidation and therefore oxygen-sensitive (Tamarit et al. (1998) J. Biol. Chem. 273:3027-32).

[0120] 2) Proteins that catalyze the reversible conversion of acetyl-CoA to acetaldehyde in strictly or facultative anaerobic micro-organisms but do not possess alcohol dehydrogenase activity. An example of this type of proteins has been reported in Clostridium kluyveri (Smith et al. (1980) Arch. Biochem. Biophys. 203: 663-675). An acetylating acetaldehyde dehydrogenase has been annotated in the genome of Clostridium kluyveri DSM 555 (GenBank No: EDK33116). A homologous protein AcdH is identified in the genome of Lactobacillus plantarum (GenBank No: NP.sub.--784141). Another example of this type of proteins is the ald gene product in Clostridium beijerinckii NRRL B593 (Toth et al. (1999) Appl. Environ. Microbiol. 65: 4973-4980, GenBank No: AAD31841).

[0121] 3) Proteins that are involved in ethanolamine catabolism. Ethanolamine can be utilized both as carbon and nitrogen source by many enterobacteria (Stojiljkovic et al. (1995) J. Bacteriol. 177: 1357-1366). Ethanolamine is first converted by ethanolamine ammonia lyase to ammonia and acetaldehyde, subsequently, acetaldehyde is converted by acetylating acetaldehyde dehydrogenase to acetyl-CoA. An example of this type of acetylating acetaldehyde dehydrogenase is the EutE protein in Salmonella typhimurium (Stojiljkovic et al. (1995) J. Bacteriol. 177: 1357-1366, GenBank No: AAL21357). E. coli is also able to utilize ethanolamine (Scarlett et al. (1976) J. Gen. Microbiol. 95:173-176) and has an EutE protein (GenBank No: AAG57564) which is homologous to the EutE protein in S. typhimurium.

[0122] 4) Proteins that are part of a bifunctional aldolase-dehydrogenase complex involved in 4-hydroxy-2-ketovalerate catabolism. Such bifunctional enzymes catalyze the final two steps of the meta-cleavage pathway for catechol, an intermediate in many bacterial species in the degradation of phenols, toluates, naphthalene, biphenyls and other aromatic compounds (Powlowski and Shingler (1994) Biodegradation 5, 219-236). 4-Hydroxy-2-ketovalerate is first converted by 4-hydroxy-2-ketovalerate aldolase to pyruvate and acetaldehyde; subsequently, acetaldehyde is converted by acetylating acetaldehyde dehydrogenase to acetyl-CoA. An example of this type of acetylating acetaldehyde dehydrogenase is the DmpF protein in Pseudomonas sp. CF600 (GenBank No: CAA43226) (Shingler et al. (1992) J. Bacteriol. 174:711-24). E. coli has an MhpF protein (Ferrandez et al. (1997) J. Bacteriol. 179: 2573-2581, GenBank No: NP.sub.--414885) that is homologous to the DmpF protein in Pseudomonas sp. CF600.

[0123] To identify the protein family members of acetylating acetaldehyde dehydrogenase, the amino acid sequences of the E. coli bifunctional AdhE protein (GenBank No: NP.sub.--415757), the L. plantarum AcdH protein (acetylating) (GenBank No: NP.sub.--784141), the E. coli EutE protein (GenBank No: AAG57564) and the E. coli MhpF protein (GenBank No: NP.sub.--414885) were each run as a query sequence in a BLASTp search against the GenBank non-redundant protein database using default parameters. Amino acid sequences with an E-value smaller than or equal to 1e-20 were extracted. Redundant sequences were removed, the remaining sequences were aligned, and a similarity tree was built using Genedata Phylosopher protein analyzer software, version 6.5.2. A similarity tree provides information on organism sequence similarity. The tree is created independently of the ClustalW algorithm by pairwise comparison of the amino acid sequences per residue position. At each position, the similarity is rated and summed up to an overall score for each sequence pair. Based on these pairwise scores a hierarchical clustering is performed, which arranges the sequences in a tree. Note that the ald gene product of C. beijerinckii (GenBank No: AAD31841) clustered together with the EutE proteins from E. coli and S. typhimurium. From this similarity tree four major branches could be defined; each branch contains one amino acid sequence that was used as a query for the BLASTp search. FIG. 4 shows an example of such a similarity tree, containing all sequences that are mentioned in this example.
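The scoring and clustering steps described in paragraph [0123] can be illustrated with a minimal sketch. It is not the algorithm of the Genedata software, whose scoring scheme is not given here. The sketch assumes that the sequences are already aligned to equal length, rates each position as 1 for an identical residue and 0 otherwise, and uses average-linkage clustering. The four toy sequences and their names are hypothetical.

```python
from itertools import combinations

def pairwise_score(a, b):
    """Sum a per-position similarity rating (assumed: 1 for identity, 0 otherwise)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for x, y in zip(a, b) if x == y and x != "-")

def cluster(seqs):
    """Average-linkage agglomerative clustering; returns the tree as nested tuples."""
    # Each cluster is stored as (subtree, list of member names).
    clusters = [(name, [name]) for name in seqs]
    score = {frozenset(pair): pairwise_score(seqs[pair[0]], seqs[pair[1]])
             for pair in combinations(seqs, 2)}

    def linkage(m1, m2):
        total = sum(score[frozenset((x, y))] for x in m1 for y in m2)
        return total / (len(m1) * len(m2))

    while len(clusters) > 1:
        # Merge the two clusters with the highest average pairwise score.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]][1], clusters[ij[1]][1]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Hypothetical, pre-aligned toy sequences (not taken from the sequence listing).
toy = {"A": "MKTAYIAK", "B": "MKTAYIAR", "C": "MSTLWQAK", "D": "MSTLWQGK"}
tree = cluster(toy)
print(tree)
```

In this toy case the two closely related pairs end up on separate branches, which mirrors how each BLASTp query sequence in the example falls into its own major branch.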

[0124] At least one amino acid sequence was selected from each branch for complementation tests in S. cerevisiae delta acs2. Preferably, the selected amino acid sequences have experimental evidence of their biochemical function as acetylating acetaldehyde dehydrogenase. Such evidence can be found in public databases, such as the BRENDA, UniProt and NCBI Entrez databases.

Example 3

Construction of Expression Plasmids and Complementation Test

[0125] To test whether acetylating acetaldehyde dehydrogenases (ACDH) could complement the deletion of ACS2 in S. cerevisiae, several genes coding for a (putative) ACDH were chosen from a variety of databases as described above.

[0126] To achieve optimal expression in yeast, the codon usage of all genes was adapted by codon pair optimization. These sequences were synthesized at Geneart AG (Regensburg, Germany).
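A recoded gene must still encode the same protein. The sketch below does not implement codon pair optimization; it only checks that a native and an optimized sequence translate to the same amino acids under the standard genetic code. The two fragments are the first 24 nucleotides of SEQ ID NO:18 (native E. coli eutE) and SEQ ID NO:20 (optimized sequence) from the sequence listing. The function names are illustrative.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def codons(dna):
    """Split a DNA string into complete codons, ignoring spaces and case."""
    dna = dna.upper().replace(" ", "")
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]

def translate(dna):
    """Translate DNA to a one-letter protein string ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[c] for c in codons(dna))

native = "atg aat caa cag gat att gaa cag"   # start of SEQ ID NO:18
optimized = "atgaaccaacaagatatcgaacaa"       # start of SEQ ID NO:20

same_protein = translate(native) == translate(optimized)
changed = sum(1 for n, o in zip(codons(native), codons(optimized)) if n != o)
print(translate(native), same_protein, changed)
```

Here four of the eight codons differ while the encoded peptide is unchanged, which is the property any synonymous recoding has to preserve.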

[0127] The optimized sequences were cloned into the high copy yeast expression plasmid YEplac112PtdhTadh (SEQ ID NO:40; based on YEplac112 (2.mu. TRP1) (Gietz & Sugino [1988] Gene 74(2):527-34)), allowing constitutive expression from the TDH3 promoter.

[0128] YEplac112PtdhTadh was made by cloning a KpnI-SacI fragment from p426GPD (Mumberg et al. [1995] Gene. 156(1):119-22), containing the TDH3 promoter and CYC1 terminator, into YEplac112 cut with KpnI-SacI. The resulting plasmid was cut with KpnI and SphI, and the ends were made blunt and then ligated to give YEplac112TDH. To obtain YEplac112PtdhTadh, YEplac112TDH was cut with PstI-HindIII and ligated to a 345 bp PstI-HindIII PCR fragment containing the ADH1 terminator (Tadh), thus replacing the CYC1 terminator and changing the polylinker between the promoter and terminator. The Tadh PCR fragment was generated using the following oligonucleotides:

TABLE-US-00003
MCS-5'Tadh: 5'-aaggtacctctagactagtcccgggctgcagtcgactcgagcgaatttcttatgatttatgatt-3'
Tadh1-Hind: 5'-aggaagcttaggcctgtgtggaagaacgattacaacagg-3'

[0129] PCR was done with Vent.sup.R DNA polymerase, according to the manufacturer's specifications.
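The two oligonucleotides carry the restriction sites used in the cloning steps. A minimal sketch for locating such sites is given below. The recognition sequences in `SITES` are standard published values supplied here as an assumption; they are not stated in the text, which names only PstI and HindIII as the sites used for this fragment. The function name is illustrative.

```python
# Recognition sequences (assumed standard values, not taken from the source text).
SITES = {
    "KpnI": "GGTACC", "XbaI": "TCTAGA", "SpeI": "ACTAGT", "SmaI": "CCCGGG",
    "PstI": "CTGCAG", "SalI": "GTCGAC", "XhoI": "CTCGAG",
    "HindIII": "AAGCTT", "StuI": "AGGCCT",
}

def find_sites(oligo):
    """Return the enzymes whose sites occur in the oligo, ordered 5' to 3'."""
    seq = oligo.upper()
    hits = []
    for enzyme, site in SITES.items():
        start = seq.find(site)
        while start != -1:          # record every occurrence, including overlaps
            hits.append((start, enzyme))
            start = seq.find(site, start + 1)
    return [enzyme for _, enzyme in sorted(hits)]

mcs_5_tadh = "aaggtacctctagactagtcccgggctgcagtcgactcgagcgaatttcttatgatttatgatt"
tadh1_hind = "aggaagcttaggcctgtgtggaagaacgattacaacagg"

print(find_sites(mcs_5_tadh))
print(find_sites(tadh1_hind))
```

The forward primer contributes the PstI site and the reverse primer the HindIII site, consistent with the PstI-HindIII fragment described above; the forward primer also carries the SpeI site, in line with the SpeI-PstI cloning of the ACDH constructs into the new polylinker.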

[0130] The synthetic constructs containing the ACDH genes were cut with SpeI-PstI and ligated into YEplac112PtdhTadh digested with the same enzymes, resulting in pBOL058 through to pBOL068 and pBOL082. The names of the final plasmids and the genes they contain are given in Table 1.

TABLE-US-00004 TABLE 1 Overview of putative acetylating acetaldehyde dehydrogenases tested for complementation of the delta acs2 S. cerevisiae strain. Genes which resulted in complementation are given in bold. SEQ ID NOs are provided for the DNA sequence of the wild type gene, the protein expressed therefrom, and the codon pair optimized DNA sequence.

Organism                  Name           Group*  Size (kb)  SEQ ID NO. (DNA/PRT/OPT)
Escherichia coli          adhE           1       2.6
Entamoeba histolytica     adh2           1       2.6        48/50/49
Staphylococcus aureus     adhE           1       2.6        27/28/29
Piromyces sp. E2          adhE           1       2.6        51/52
Clostridium kluyveri      EDK33116       2       1.5        24/25/26
Lactobacillus plantarum   acdH           2       1.4
Escherichia coli          EutE           3       1.4        18/19/20
Listeria innocua          Lin1129        3       1.4        21/22/23
Pseudomonas putida        YP 001268189   4       1.0

*Group refers to the group of proteins having ACDH activity as defined in Example 2. Group 1: similar to bifunctional E. coli AdhE (AdhE-type of proteins); group 2: proteins having similarity to Lactobacillus plantarum AcdH (AcdH-type of proteins); group 3: similar to E. coli EutE (EutE-type of proteins); group 4: similar to E. coli MhpF (MhpF-type of proteins).

[0131] All plasmids were used to transform the delta acs2 yeast strain RWB060. As negative control, the empty vector YEplac112 was used. Transformants were plated on mineral medium (Verduyn et al. [1992] Yeast 8: 501-517) containing either 1% glucose (MYD) or 1% ethanol+1% glycerol (MYEG) as single carbon source.

[0132] While for all constructs several transformants could be selected on minimal medium with ethanol/glycerol, this was not the case on the glucose containing plates.

TABLE-US-00005 TABLE 2 Result of a complementation experiment for putative acetylating acetaldehyde dehydrogenases in delta acs2 S. cerevisiae strain RWB060. Genes resulting in complementation are given in bold. MYEG and MYD columns indicate the number of transformants on MYEG (ethanol/glycerol) and MYD (glucose) plates.

Organism                  Gene (GenPept accession)   Plasmid     MYEG  MYD
none                                                 YEplac112   75    0
Escherichia coli          adhE                       pBOL059     6     0
Entamoeba histolytica     adh2                       pBOL061     54    0
Staphylococcus aureus     adhE (BAB41363)            pBOL064     36    39
Piromyces sp. E2          adhE                       pBOL139     32    3
Clostridium kluyveri      EDK33116 (EDK33116)        pBOL065     21    8
Lactobacillus plantarum   acdH                       pBOL058     6     0
Escherichia coli          EutE (ABV06849)            pBOL066     24    18
Listeria innocua          Lin1129 (CAC96360)         pBOL067     28    8
Pseudomonas putida        YP 001268189               pBOL068     32    0
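The colony counts of Table 2 can be summarised with a short sketch. The rule used here (more colonies on glucose than the empty-vector control) and the MYD/MYEG ratio are illustrative choices, not a metric used in the experiment itself, and counts from separate transformations are not strictly comparable.

```python
# Colony counts from Table 2: plasmid -> (MYEG, MYD).
counts = {
    "YEplac112": (75, 0),   # empty vector (negative control)
    "pBOL059": (6, 0),
    "pBOL061": (54, 0),
    "pBOL064": (36, 39),
    "pBOL139": (32, 3),
    "pBOL065": (21, 8),
    "pBOL058": (6, 0),
    "pBOL066": (24, 18),
    "pBOL067": (28, 8),
    "pBOL068": (32, 0),
}

def complementing(table, control="YEplac112"):
    """Plasmids giving more glucose (MYD) colonies than the empty-vector control."""
    background = table[control][1]
    return sorted(p for p, (_, myd) in table.items()
                  if p != control and myd > background)

def glucose_ratio(table, plasmid):
    """Ratio of MYD to MYEG colonies (illustrative measure only)."""
    myeg, myd = table[plasmid]
    return myd / myeg

hits = complementing(counts)
print(hits)
for plasmid in hits:
    print(plasmid, round(glucose_ratio(counts, plasmid), 2))
```

Five of the nine tested plasmids pass this simple rule, matching the number of complementing genes reported in the conclusions.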

[0133] On the glucose containing plates, transformants could only be selected for plasmids pBOL064, pBOL065, pBOL066, and pBOL067, not the empty vector. There was also a clear difference in colony size, depending on the plasmid used. While construct pBOL066 (E. coli eutE) resulted in the biggest colonies, colonies of pBOL067 (L. innocua lin1129) appeared a bit smaller and pBOL065 (C. kluyveri EDK33116) showed the smallest colonies. Plasmid pBOL064 (S. aureus adhE) and plasmid pBOL139 (Piromyces sp. E2 adhE) were tested at a later date, so they could not be compared directly. Colonies containing pBOL064 seemed to be similar to colonies comprising pBOL066, and colonies comprising pBOL139 seemed to be similar to colonies comprising pBOL065.

[0134] To ensure that these results did not arise from spontaneous revertants, transformation experiments were repeated for some of the plasmids, giving the same results. In addition, for almost all plasmids four transformants were selected at random from the MYEG plates and restreaked onto MYD and MYEG plates.

[0135] In all experiments no growth was ever seen on glucose with the empty vector (YEplac112), while only pBOL065, pBOL066 and pBOL067 repeatedly gave good growth on glucose. Plasmid pBOL064 was not re-tested this way after the initial very positive result.

[0136] From these results, it was concluded that the codon pair optimized genes of the eutE homologues of:

[0137] E. coli (SEQ ID NO:20) encoding the ethanolamine utilization protein EutE from E. coli HS;

[0138] L. innocua (SEQ ID NO:23) encoding a hypothetical protein from L. innocua similar to ethanolamine utilization protein EutE, and

[0139] C. kluyveri (SEQ ID NO:26) encoding acetylating acetaldehyde dehydrogenase in Clostridium kluyveri DSM 555;

and the codon pair optimized gene of the adhE homologue of

[0140] S. aureus (SEQ ID NO:29) encoding a bifunctional acetaldehyde/alcohol dehydrogenase in Staphylococcus aureus subsp. aureus N315;

and the non codon pair optimized gene of the adhE homologue of

[0141] Piromyces sp. E2 (SEQ ID NO:51) encoding a bifunctional acetaldehyde/alcohol dehydrogenase

are able to complement the acs2 yeast mutants. These genes encode an enzymatic activity allowing the formation of cytosolic acetyl-CoA from acetaldehyde in yeast.

Conclusions

[0142] The supply of cytosolic acetyl-CoA is believed to be a bottleneck in the butanol production in yeast. In order to identify heterologous genes encoding for enzymes forming cytosolic acetyl-CoA in S. cerevisiae a test system based on a delta acs2 yeast mutant was established.

[0143] Due to its deficiency in cytosolic acetyl-CoA biosynthesis on glucose, the acs2.DELTA. strain is unable to grow with glucose as sole carbon source.

[0144] 9 putative acetylating acetaldehyde dehydrogenases identified as candidates for cytosolic acetyl-CoA supply from acetaldehyde were expressed in the acs2.DELTA. yeast. In total, 5 of these 9 genes complemented growth of the acs2.DELTA. strain with glucose as single carbon source. Therewith, the use of the delta acs2 strain as pre-selection tool for feasible routes for cytosolic supply of acetyl-CoA was shown.

[0145] Four of the five acetylating acetaldehyde dehydrogenases identified thus far (the eutE homologues of E. coli, L. innocua and C. kluyveri and the adhE homologue of S. aureus, but not the adhE homologue of Piromyces sp. E2) were successfully integrated in butanol producing strains of S. cerevisiae. The effect on butanol production was investigated as described in the Examples below.

[0146] This test system may also be used to analyse whether pyruvate:NADP oxidoreductase can successfully be over-expressed in yeast. Due to the oxygen sensitivity of this enzyme, this test has to be performed anaerobically.

[0147] Examples 4-6 below describe the testing of 4 of the 5 selected ACDH genes from Example 3 for improvement of butanol production.

Example 4

Construction of a Butanol Producing Yeast Strain and Knocking Out the ADH1 and ADH2 Genes

[0148] The six Clostridium acetobutylicum genes involved in butanol biosynthesis from Acetyl-CoA are listed in Table 3. The genes were codon pair optimized for S. cerevisiae as described in WO2008/000632 and expressed from yeast promoters and terminators as listed in Table 3.

[0149] Two yeast integration vectors (pBOL34 [SEQ ID NO:41] and pBOL36 [SEQ ID NO:42]), each containing 3 of the six codon pair optimised genes from Clostridium acetobutylicum involved in butanol biosynthesis, were designed and synthesized at Geneart.

[0150] The genes ThiL, Hbd and Crt were expressed from pBOL34, which contains an AmdS selection marker. The final three genes, Bcd, BdhB and AdhE, were expressed from an integration vector with an AmdS selection marker named pBOL36.

TABLE-US-00006 TABLE 3 Genes used for butanol production in S. cerevisiae including the promoter (1000 bp) and terminator (500 bp)

Gene   Activity                                                            Promoter  Terminator
ThiL   acetyl CoA c-acetyltransferase [E.C. 2.3.1.9]                       ADH1      TDH1
Hbd    3-hydroxybutyryl-CoA dehydrogenase [E.C. 1.1.1.157]                 ENO1      PMA1
Crt    3-hydroxybutyryl-CoA dehydratase [E.C. 4.2.1.55]                    TDH1      ADH1
Bcd    butyryl-CoA dehydrogenase [E.C. 1.3.99.2]                           PDC1      TDH1
BdhB   NADH-dependent butanol dehydrogenase [E.C. 1.1.1.--]                ENO1      PMA1
adhE   alcohol/acetaldehyde CoA dehydrogenase [E.C. 1.1.1.1/1.2.1.10]      TDH1      ADH2

[0151] For integration in the ADH2 locus, pBOL36 was linearized by a BsaBI digestion. S. cerevisiae CEN.PK113-5D (MATa MAL2-8c SUC2 ura3-52) was transformed with the linear fragment and grown on plates with YCB (Difco) and 5 mM acetamide as nitrogen source.

[0152] The AmdS marker was removed by recombination by growing the transformants for 6 hours in YEPD in 2 ml tubes at 30.degree. C. Cells were subsequently plated on 1.8% agar medium containing YCB (Difco) and 40 mM fluoracetamide and 30 mM phosphate buffer pH 6.8 supporting growth only from cells that have lost the AmdS marker. Correct integration and recombination were confirmed by PCR. The correct integration of the fragment upstream was confirmed with the following primers:

TABLE-US-00007 P1: 5'-GAATTGAAGGATATCTACATCAAG-3' and P2: 5'-CCCATCTACGGAACCCTGATCAAGC-3'.

[0153] The correct integration of the fragment downstream was confirmed with the following primers:

TABLE-US-00008 P3: 5'-GATGGTGTCACCATTACCAGGTCTAG-3' and P4: 5'-GTTCTCTGGTCAAGTTGAAGTCCATTTTGATTGATTTGACTGTGTTATTTTGCGTG-3'.

[0154] The resulting strain was named BLT021.

[0155] pBOL34 was linearized by a PsiI digestion and integrated in the ADH1 locus of BLT021. The transformants were grown on plates containing YCB (Difco) and 5 mM acetamide. For removal of the AmdS selection marker, colonies were inoculated in YEPD and grown for 6 hours in 2 ml tubes at 30.degree. C. The cells were plated on YCB (Difco) and 40 mM fluoracetamide and 0.1% ammonium sulphate.

[0156] Correct integration and recombination were confirmed by PCR. The correct integration of the fragment upstream was confirmed with the following primer set:

TABLE-US-00009 P5: 5'-GAACAATAGAGCGACCATGACCTTG-3' and P6: 5'-GACATCAGCGTCACCAGCCTTGATG-3'.

[0157] The correct integration of the fragment downstream was confirmed with the following primer set:

TABLE-US-00010 P7: 5'-GATTGAAGGTTTCAAGAACAGGTGATG-3' and P8: 5'-GGCGATCAGAGTTGAAAAAAAAATG-3'.
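The primer sequences P1 to P8 given above can be cross-checked against the lengths recorded for SEQ ID NO:10 to 17 in the sequence listing. The sketch below does this and also reports GC content and the reverse complement. The helper names are illustrative, and no melting temperature is computed because the PCR conditions for these primers are not given.

```python
# Primer -> (sequence as given in the text, length stated in the sequence listing).
primers = {
    "P1": ("GAATTGAAGGATATCTACATCAAG", 24),
    "P2": ("CCCATCTACGGAACCCTGATCAAGC", 25),
    "P3": ("GATGGTGTCACCATTACCAGGTCTAG", 26),
    "P4": ("GTTCTCTGGTCAAGTTGAAGTCCATTTTGATTGATTTGACTGTGTTATTTTGCGTG", 56),
    "P5": ("GAACAATAGAGCGACCATGACCTTG", 25),
    "P6": ("GACATCAGCGTCACCAGCCTTGATG", 25),
    "P7": ("GATTGAAGGTTTCAAGAACAGGTGATG", 27),
    # P8 contains a run of nine A residues, written explicitly to avoid miscounting.
    "P8": ("GGCGATCAGAGTTG" + "A" * 9 + "TG", 25),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Fraction of G and C residues in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

mismatches = [name for name, (seq, listed) in primers.items() if len(seq) != listed]
print("length mismatches:", mismatches)
for name, (seq, _) in primers.items():
    print(name, len(seq), round(gc_fraction(seq), 2))
```

An empty mismatch list indicates that the primer text and the sequence listing agree in length; it does not verify the individual bases.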

[0158] The resulting strain was named BLT057.

Example 5

Introducing ETF.alpha. and ETF.beta. in BLT057

[0159] The ETF genes and the Acdh genes as listed in Table 4 were codon pair optimized for S. cerevisiae as described in WO2008/000632 and expressed from yeast promoters and terminators as listed in Table 4.

TABLE-US-00011 TABLE 4 Promoters and terminators used for expression of codon pair optimized ETF genes and Acdh genes in S. cerevisiae

Gene                       Promoter  Terminator
Etf.alpha. (CpO)           tef1      tdh2
Etf.beta. (CpO)            tdh2      tef1
Acdh64 (AdhE S. aureus)    tdh3      adh
Acdh65 (Clostridium)       tdh3      adh
Acdh66 (EutE E. coli)      tdh3      adh
Acdh67 (lin1129 Ec)        tdh3      adh

[0160] The integration vectors expressing ETF.alpha. and ETF.beta. only (pBOL113, [SEQ ID NO:43]) or ETF.alpha. and ETF.beta. combined with Acdh64 (pBOL115, [SEQ ID NO:44]), Acdh65 (pBOL116, [SEQ ID NO:45]), Acdh66 (pBOL118, [SEQ ID NO:46]) or Acdh67 (pBOL120, [SEQ ID NO:47]) were synthesized by Geneart AG.

[0161] The vectors, pBOL113, pBOL115, pBOL116, pBOL118 and pBOL120, were linearized with StuI and integrated in the ura3-52 locus of strain BLT057.

[0162] The transformants were grown in YNB (Difco) w/o amino acids+2% galactose to select for uracil prototrophic strains. The strains derived from strain BLT057 with pBOL113/115/116/118/120 integrated in the genome were designated strains: BLT071, BLT072, BLT073, BLT074 and BLT075, respectively.

Example 6

Improved Butanol Production by Expressing Positive Acdh Genes

[0163] Strains BLT071 through BLT075 as prepared in Example 5 were grown in Verduyn medium (Verduyn et al. (1992) Yeast 8: 501-517) in which the ammonium sulphate was replaced by 2 g/l urea and which further contained 4 wt. % galactose. Cells were grown in 100 ml shake flasks containing 50 ml of medium for 72 hours at 30.degree. C. at 180 rpm in a rotary shaker.

[0164] The butanol concentration was determined in the supernatant of the culture. Samples were analysed on a HS-GC equipped with a flame ionisation detector and an automatic injection system. The column was a J&W DB-1 (length 30 m, id 0.53 mm, df 5 .mu.m). The following conditions were used: helium as carrier gas with a flow rate of 5 ml/min. The column temperature was set at 110.degree. C., the injector at 140.degree. C. and the detector at 300.degree. C. The data were obtained using Chromeleon software. Samples were heated at 60.degree. C. for 20 min in the headspace sampler. One (1) ml of the headspace volatiles was automatically injected onto the column.

[0165] 1-Butanol production of the various strains was as follows:

[0166] BLT057: 120 mg/l

[0167] BLT071: 450 mg/l

[0168] BLT072: 500 mg/l

[0169] BLT073: 600 mg/l

[0170] BLT074: 670 mg/l

[0171] BLT075: 700 mg/l

[0172] The results show that introduction of electron transfer flavoproteins (ETF alpha and ETF beta) and/or introduction of acetylating acetaldehyde dehydrogenases, as identified by the complementation assay of Example 3, increases the butanol production level.
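The reported titers can be put on a common scale with a short calculation. The sketch below computes the fold change of each strain relative to the parent strain BLT057 and relative to the ETF-only strain BLT071. The strain-to-construct mapping in the comments follows Example 5. The values are the single figures reported above, so no statement about variability can be derived from them.

```python
# 1-Butanol titers in mg/l as reported in paragraphs [0166] to [0171].
titers = {
    "BLT057": 120,  # parent strain, butanol pathway only
    "BLT071": 450,  # + ETF alpha/beta (pBOL113)
    "BLT072": 500,  # + ETF + Acdh64 (pBOL115)
    "BLT073": 600,  # + ETF + Acdh65 (pBOL116)
    "BLT074": 670,  # + ETF + Acdh66 (pBOL118)
    "BLT075": 700,  # + ETF + Acdh67 (pBOL120)
}

def fold_change(strain, reference):
    """Titer of a strain divided by the titer of a reference strain."""
    return titers[strain] / titers[reference]

for strain in titers:
    print(strain,
          round(fold_change(strain, "BLT057"), 2),   # versus parent strain
          round(fold_change(strain, "BLT071"), 2))   # versus ETF-only strain
```

On these figures the ETF genes alone account for a 3.75-fold increase over BLT057, and the ACDH genes add a further 1.1-fold to 1.6-fold on top of the ETF-only strain.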

Sequence CWU 1

1

52164DNAArtificialprimer 5'acs2 1atacacaaac agaatacagg aaagtaaatc aatacaataa taaaacagct gaagcttcgt 60acgc 64267DNAArtificialprimer 3'acs2 2tctcattacg aaatttttct catttaagtt atttcttttt ttgaggcata ggccactagt 60ggatctg 67320DNAArtificialprobe 5'acs2 3gatattcggt agccgattcc 20420DNAArtificialprobe 3'acs2 4ccgtaacctt ctcgtaatgc 20520DNAArtificialprobe ACS2internal 5cggattcgtc atcagcttca 20620DNAArtificialprobe KanA 6cgcacgtcaa gactgtcaag 20720DNAArtificialprobe KanB 7tcgtatgtga atgctggtcg 20864DNAArtificialprimer MCS-5'Tadh 8aaggtacctc tagactagtc ccgggctgca gtcgactcga gcgaatttct tatgatttat 60gatt 64939DNAArtificialprimer Tadh1-Hind 9aggaagctta ggcctgtgtg gaagaacgat tacaacagg 391024DNAArtificialprimer P1 10gaattgaagg atatctacat caag 241125DNAArtificialprimer P2 11cccatctacg gaaccctgat caagc 251226DNAArtificialprimer P3 12gatggtgtca ccattaccag gtctag 261356DNAArtificialPrimer P4 13gttctctggt caagttgaag tccattttga ttgatttgac tgtgttattt tgcgtg 561425DNAArtificialPrimer P5 14gaacaataga gcgaccatga ccttg 251525DNAArtificialPrimer P6 15gacatcagcg tcaccagcct tgatg 251627DNAArtificialPrimer P7 16gattgaaggt ttcaagaaca ggtgatg 271725DNAArtificialPrimer P8 17ggcgatcaga gttgaaaaaa aaatg 25181404DNAEscherichia coliCDS(1)..(1404) 18atg aat caa cag gat att gaa cag gtg gtg aaa gcg gta ctg ctg aaa 48Met Asn Gln Gln Asp Ile Glu Gln Val Val Lys Ala Val Leu Leu Lys1 5 10 15atg caa agc agt gac acg ccg tcc gcc gcc gtt cat gag atg ggc gtt 96Met Gln Ser Ser Asp Thr Pro Ser Ala Ala Val His Glu Met Gly Val 20 25 30ttc gcg tcc ctg gat gac gcc gtt gcg gca gcc aaa gtc gcc cag caa 144Phe Ala Ser Leu Asp Asp Ala Val Ala Ala Ala Lys Val Ala Gln Gln 35 40 45ggg tta aaa agc gtg gca atg cgc cag tta gcc att gct gcc att cgt 192Gly Leu Lys Ser Val Ala Met Arg Gln Leu Ala Ile Ala Ala Ile Arg 50 55 60gaa gca ggc gaa aaa cac gcc aga gat tta gcg gaa ctt gcc gtc agt 240Glu Ala Gly Glu Lys His Ala Arg Asp Leu Ala Glu Leu Ala Val Ser65 70 75 80gaa acc ggc atg ggg cgc gtt gaa gat aaa ttt gca aaa aac gtc gct 288Glu Thr 
Gly Met Gly Arg Val Glu Asp Lys Phe Ala Lys Asn Val Ala 85 90 95cag gcg cgc ggc aca cca ggc gtt gag tgc ctc tct ccg caa gtg ctg 336Gln Ala Arg Gly Thr Pro Gly Val Glu Cys Leu Ser Pro Gln Val Leu 100 105 110act ggc gac aac ggc ctg acc cta att gaa aac gca ccc tgg ggc gtg 384Thr Gly Asp Asn Gly Leu Thr Leu Ile Glu Asn Ala Pro Trp Gly Val 115 120 125gtg gct tcg gtg acg cct tcc act aac ccg gcg gca acc gta att aac 432Val Ala Ser Val Thr Pro Ser Thr Asn Pro Ala Ala Thr Val Ile Asn 130 135 140aac gcc atc agc ctg att gcc gcg ggc aac agc gtc att ttt gcc ccg 480Asn Ala Ile Ser Leu Ile Ala Ala Gly Asn Ser Val Ile Phe Ala Pro145 150 155 160cat ccg gcg gcg aaa aaa gtc tcc cag cgg gcg att acg ctg ctc aac 528His Pro Ala Ala Lys Lys Val Ser Gln Arg Ala Ile Thr Leu Leu Asn 165 170 175cag gcg att gtt gcc gca ggt ggg ccg gaa aac tta ctg gtt act gtg 576Gln Ala Ile Val Ala Ala Gly Gly Pro Glu Asn Leu Leu Val Thr Val 180 185 190gca aat ccg gat atc gaa acc gcg caa cgc ttg ttc aag ttt ccg ggt 624Ala Asn Pro Asp Ile Glu Thr Ala Gln Arg Leu Phe Lys Phe Pro Gly 195 200 205atc ggc ctg ctg gtg gta acc ggc ggc gaa gcg gta gta gaa gcg gcg 672Ile Gly Leu Leu Val Val Thr Gly Gly Glu Ala Val Val Glu Ala Ala 210 215 220cgt aaa cac acc aat aaa cgt ctg att gcc gca ggc gct ggc aac ccg 720Arg Lys His Thr Asn Lys Arg Leu Ile Ala Ala Gly Ala Gly Asn Pro225 230 235 240ccg gta gtg gtg gat gaa acc gcc gac ctc gcc cgt gcc gct cag tcc 768Pro Val Val Val Asp Glu Thr Ala Asp Leu Ala Arg Ala Ala Gln Ser 245 250 255atc gtc aaa ggc gct tct ttc gat aac aac atc att tgt gcc gac gaa 816Ile Val Lys Gly Ala Ser Phe Asp Asn Asn Ile Ile Cys Ala Asp Glu 260 265 270aag gta ctg att gtt gtt gat agc gta gcc gat gaa ctg atg cgt ctg 864Lys Val Leu Ile Val Val Asp Ser Val Ala Asp Glu Leu Met Arg Leu 275 280 285atg gaa ggc cag cac gcg gtg aaa ctg acc gca gaa cag gcg cag cag 912Met Glu Gly Gln His Ala Val Lys Leu Thr Ala Glu Gln Ala Gln Gln 290 295 300ctg caa ccg gtg ttg ctg aaa aat atc gac gag cgc gga aaa ggc acc 960Leu Gln Pro Val 
Leu Leu Lys Asn Ile Asp Glu Arg Gly Lys Gly Thr305 310 315 320gtc agc cgt gac tgg gtt ggt cgc gac gca ggc aaa atc gcg gcg gca 1008Val Ser Arg Asp Trp Val Gly Arg Asp Ala Gly Lys Ile Ala Ala Ala 325 330 335atc ggc ctt aaa gtt ccg caa gaa acg cgc ctg ctg ttt gtg gaa acc 1056Ile Gly Leu Lys Val Pro Gln Glu Thr Arg Leu Leu Phe Val Glu Thr 340 345 350acc gca gaa cat ccg ttt gcc gtg act gaa ctg atg atg ccg gtg ttg 1104Thr Ala Glu His Pro Phe Ala Val Thr Glu Leu Met Met Pro Val Leu 355 360 365ccc gtc gtg cgc gtc gcc aac gtg gcg gat gcc att gcg cta gcg gtg 1152Pro Val Val Arg Val Ala Asn Val Ala Asp Ala Ile Ala Leu Ala Val 370 375 380aaa ctg gaa ggc ggt tgc cac cac acg gcg gca atg cac tcg cgc aac 1200Lys Leu Glu Gly Gly Cys His His Thr Ala Ala Met His Ser Arg Asn385 390 395 400atc gaa aac atg aac cag atg gcg aat gct att gat acc agc att ttc 1248Ile Glu Asn Met Asn Gln Met Ala Asn Ala Ile Asp Thr Ser Ile Phe 405 410 415gtt aag aac gga ccg tgc att gcc ggg ctg ggg ctg ggc ggg gaa ggc 1296Val Lys Asn Gly Pro Cys Ile Ala Gly Leu Gly Leu Gly Gly Glu Gly 420 425 430tgg acc acc atg acc atc acc acg cca acc ggt gaa ggg gta acc agc 1344Trp Thr Thr Met Thr Ile Thr Thr Pro Thr Gly Glu Gly Val Thr Ser 435 440 445gcg cgt acg ttt gtc cgt ctg cgt cgc tgt gta tta gtc gat gcg ttt 1392Ala Arg Thr Phe Val Arg Leu Arg Arg Cys Val Leu Val Asp Ala Phe 450 455 460cgc att gtt taa 1404Arg Ile Val46519467PRTEscherichia coli 19Met Asn Gln Gln Asp Ile Glu Gln Val Val Lys Ala Val Leu Leu Lys1 5 10 15Met Gln Ser Ser Asp Thr Pro Ser Ala Ala Val His Glu Met Gly Val 20 25 30Phe Ala Ser Leu Asp Asp Ala Val Ala Ala Ala Lys Val Ala Gln Gln 35 40 45Gly Leu Lys Ser Val Ala Met Arg Gln Leu Ala Ile Ala Ala Ile Arg 50 55 60Glu Ala Gly Glu Lys His Ala Arg Asp Leu Ala Glu Leu Ala Val Ser65 70 75 80Glu Thr Gly Met Gly Arg Val Glu Asp Lys Phe Ala Lys Asn Val Ala 85 90 95Gln Ala Arg Gly Thr Pro Gly Val Glu Cys Leu Ser Pro Gln Val Leu 100 105 110Thr Gly Asp Asn Gly Leu Thr Leu Ile Glu Asn Ala Pro Trp Gly Val 115 120 
125Val Ala Ser Val Thr Pro Ser Thr Asn Pro Ala Ala Thr Val Ile Asn 130 135 140Asn Ala Ile Ser Leu Ile Ala Ala Gly Asn Ser Val Ile Phe Ala Pro145 150 155 160His Pro Ala Ala Lys Lys Val Ser Gln Arg Ala Ile Thr Leu Leu Asn 165 170 175Gln Ala Ile Val Ala Ala Gly Gly Pro Glu Asn Leu Leu Val Thr Val 180 185 190Ala Asn Pro Asp Ile Glu Thr Ala Gln Arg Leu Phe Lys Phe Pro Gly 195 200 205Ile Gly Leu Leu Val Val Thr Gly Gly Glu Ala Val Val Glu Ala Ala 210 215 220Arg Lys His Thr Asn Lys Arg Leu Ile Ala Ala Gly Ala Gly Asn Pro225 230 235 240Pro Val Val Val Asp Glu Thr Ala Asp Leu Ala Arg Ala Ala Gln Ser 245 250 255Ile Val Lys Gly Ala Ser Phe Asp Asn Asn Ile Ile Cys Ala Asp Glu 260 265 270Lys Val Leu Ile Val Val Asp Ser Val Ala Asp Glu Leu Met Arg Leu 275 280 285Met Glu Gly Gln His Ala Val Lys Leu Thr Ala Glu Gln Ala Gln Gln 290 295 300Leu Gln Pro Val Leu Leu Lys Asn Ile Asp Glu Arg Gly Lys Gly Thr305 310 315 320Val Ser Arg Asp Trp Val Gly Arg Asp Ala Gly Lys Ile Ala Ala Ala 325 330 335Ile Gly Leu Lys Val Pro Gln Glu Thr Arg Leu Leu Phe Val Glu Thr 340 345 350Thr Ala Glu His Pro Phe Ala Val Thr Glu Leu Met Met Pro Val Leu 355 360 365Pro Val Val Arg Val Ala Asn Val Ala Asp Ala Ile Ala Leu Ala Val 370 375 380Lys Leu Glu Gly Gly Cys His His Thr Ala Ala Met His Ser Arg Asn385 390 395 400Ile Glu Asn Met Asn Gln Met Ala Asn Ala Ile Asp Thr Ser Ile Phe 405 410 415Val Lys Asn Gly Pro Cys Ile Ala Gly Leu Gly Leu Gly Gly Glu Gly 420 425 430Trp Thr Thr Met Thr Ile Thr Thr Pro Thr Gly Glu Gly Val Thr Ser 435 440 445Ala Arg Thr Phe Val Arg Leu Arg Arg Cys Val Leu Val Asp Ala Phe 450 455 460Arg Ile Val465201401DNAArtificialoptimised sequence 20atgaaccaac aagatatcga acaagttgtc aaggctgtct tgttgaaaat gcaatcttct 60gacactccat ctgctgctgt ccacgaaatg ggtgttttcg cttctttgga cgacgctgtt 120gctgctgcca aggttgctca acaaggtttg aaatctgttg ccatgagaca attggccatt 180gctgccatca gagaagctgg tgaaaagcat gccagagact tggctgaatt ggctgtctcc 240gaaaccggta tgggtagagt tgaagacaaa ttcgctaaga acgttgctca agctagaggt 300actccaggtg 
tcgaatgttt gtctccacaa gtcttgaccg gtgataatgg tttgactttg 360attgaaaatg ctccatgggg tgttgttgct tccgtcaccc catctaccaa cccagctgct 420actgtcatca acaacgccat ctctttgatt gctgctggta actccgttat cttcgctcca 480cacccagctg ccaagaaggt ttctcaaaga gccatcactc tattgaacca agccattgtt 540gctgctggtg gtccagaaaa cttgttggtc actgttgcca acccagatat cgaaactgct 600caaagattat tcaagttccc aggtatcggt ctattagtcg tcactggtgg tgaagctgtt 660gttgaagctg ccagaaagca caccaacaag agattgattg ctgctggtgc tggtaaccct 720cctgttgttg tcgatgaaac cgctgatttg gccagagctg ctcaatccat tgtcaagggt 780gcttctttcg acaacaacat catctgtgct gacgaaaagg ttttgattgt tgttgactcc 840gttgctgacg aattgatgag attgatggaa ggtcaacatg ccgtcaagtt gactgctgaa 900caagctcaac aattgcaacc agttttgttg aagaacatcg atgaaagagg taagggtacc 960gtctccagag actgggttgg tagagatgct ggtaagattg ctgctgccat cggtttgaag 1020gttccacaag aaaccagatt attattcgtc gaaaccaccg ctgaacaccc atttgctgtc 1080actgaattga tgatgccagt cttaccagtt gtccgtgttg ctaacgttgc tgacgctatt 1140gctttggctg tcaaattgga aggtggttgt caccacactg ctgccatgca ctccagaaac 1200atcgaaaaca tgaaccaaat ggctaacgcc attgacactt ccatctttgt caagaacggt 1260ccatgtatcg ctggtttggg tttgggtggt gaaggttgga ccaccatgac catcaccacc 1320ccaactggtg aaggtgtcac ttctgccaga actttcgtca gattacgtcg ttgtgttttg 1380gtcgatgctt tcagaattgt t 1401211410DNAListeria innocuaCDS(1)..(1410) 21atg gaa tca tta gaa ctc gaa caa ctg gta aaa aaa gtt ctc tta gaa 48Met Glu Ser Leu Glu Leu Glu Gln Leu Val Lys Lys Val Leu Leu Glu1 5 10 15aaa tta gca gaa caa aaa gaa gta cca aca aaa aca act aca caa ggc 96Lys Leu Ala Glu Gln Lys Glu Val Pro Thr Lys Thr Thr Thr Gln Gly 20 25 30gcg aaa agt ggc gtt ttt gat aca gtt gac gag gct gtt caa gca gca 144Ala Lys Ser Gly Val Phe Asp Thr Val Asp Glu Ala Val Gln Ala Ala 35 40 45gtt ata gcg cag aat tgc tat aaa gaa aaa tca ctt gaa gaa cgc cgc 192Val Ile Ala Gln Asn Cys Tyr Lys Glu Lys Ser Leu Glu Glu Arg Arg 50 55 60aat gtt gta aaa gca att cgt gaa gca ctt tat cca gaa att gaa aca 240Asn Val Val Lys Ala Ile Arg Glu Ala Leu Tyr Pro Glu Ile Glu Thr65 70 
75 80att gcg aca aga gca gtt gca gag act ggt atg gga aat gtg aca gat 288Ile Ala Thr Arg Ala Val Ala Glu Thr Gly Met Gly Asn Val Thr Asp 85 90 95aaa att ttg aaa aac acg tta gca atc gaa aaa acg cca ggg gta gaa 336Lys Ile Leu Lys Asn Thr Leu Ala Ile Glu Lys Thr Pro Gly Val Glu 100 105 110gat tta tat aca gaa gta gct aca ggt gat aac ggt atg aca cta tat 384Asp Leu Tyr Thr Glu Val Ala Thr Gly Asp Asn Gly Met Thr Leu Tyr 115 120 125gaa ctc tct ccg tat ggc gta att ggt gca gta gcg ccg agc aca aac 432Glu Leu Ser Pro Tyr Gly Val Ile Gly Ala Val Ala Pro Ser Thr Asn 130 135 140cca acg gaa aca ttg att tgt aat tca atc ggt atg ctc gca gct gga 480Pro Thr Glu Thr Leu Ile Cys Asn Ser Ile Gly Met Leu Ala Ala Gly145 150 155 160aat gcc gtt ttt tat agc cct cat cca ggg gca aaa aac att tca ctg 528Asn Ala Val Phe Tyr Ser Pro His Pro Gly Ala Lys Asn Ile Ser Leu 165 170 175tgg ttg att gaa aaa cta aac aca att gtt cgc gat agt tgt ggt ata 576Trp Leu Ile Glu Lys Leu Asn Thr Ile Val Arg Asp Ser Cys Gly Ile 180 185 190gat aat cta att gtc acc gtg gct aaa cca tcc atc caa gca gct caa 624Asp Asn Leu Ile Val Thr Val Ala Lys Pro Ser Ile Gln Ala Ala Gln 195 200 205gaa atg atg aac cat cca aaa gta ccg cta ctt gtt att aca ggt ggt 672Glu Met Met Asn His Pro Lys Val Pro Leu Leu Val Ile Thr Gly Gly 210 215 220ccg ggc gtt gtt ctc caa gcg atg caa tca ggt aaa aaa gtg att gga 720Pro Gly Val Val Leu Gln Ala Met Gln Ser Gly Lys Lys Val Ile Gly225 230 235 240gca gga gca ggg aac ccg cct tct att gtt gac gaa aca gct aat atc 768Ala Gly Ala Gly Asn Pro Pro Ser Ile Val Asp Glu Thr Ala Asn Ile 245 250 255gaa aaa gcg gct gct gac atc gta gac gga gca tct ttt gac cat aat 816Glu Lys Ala Ala Ala Asp Ile Val Asp Gly Ala Ser Phe Asp His Asn 260 265 270att tta tgt att gct gaa aaa agt gtg gta gct gtt gat agc att gct 864Ile Leu Cys Ile Ala Glu Lys Ser Val Val Ala Val Asp Ser Ile Ala 275 280 285gat ttc ttg tta ttc caa atg gaa aaa aat ggt gcc ctt cat gtt act 912Asp Phe Leu Leu Phe Gln Met Glu Lys Asn Gly Ala Leu His Val Thr 290 295 300aat 
cca agt gat att caa aaa tta gaa aaa gta gcc gtt acc gat aaa 960Asn Pro Ser Asp Ile Gln Lys Leu Glu Lys Val Ala Val Thr Asp Lys305 310 315 320ggt gta act aat aaa aaa tta gtc gga aaa agt gca act gaa atc tta 1008Gly Val Thr Asn Lys Lys Leu Val Gly Lys Ser Ala Thr Glu Ile Leu 325 330 335aaa gaa gca gga ata gct tgt gat ttt aca cca cgt tta atc att gtg 1056Lys Glu Ala Gly Ile Ala Cys Asp Phe Thr Pro Arg Leu Ile Ile Val 340 345 350gaa acg gag aaa tct cat cca ttt gca aca gta gag cta tta atg cca 1104Glu Thr Glu Lys Ser His Pro Phe Ala Thr Val Glu Leu Leu Met Pro 355 360 365atc gtt cca gtt gta agg gtg cct gat ttt gac gaa gcc ctt gaa gtg 1152Ile Val Pro Val Val Arg Val Pro Asp Phe Asp Glu Ala Leu Glu Val 370 375 380gct att gaa ctc gaa caa ggc tta cat cat aca gca aca atg cat tca 1200Ala Ile Glu Leu Glu Gln Gly Leu His His Thr Ala Thr Met His Ser385 390 395 400caa aat atc tcg aga tta aac aaa gct gca aga gat atg caa act tcc 1248Gln Asn Ile Ser Arg Leu Asn Lys Ala Ala Arg Asp Met Gln Thr Ser 405 410 415atc ttt gtc aaa aat ggt ccg tcc ttt gcg gga tta ggc ttt aga gga 1296Ile Phe Val Lys Asn Gly Pro Ser Phe Ala Gly Leu Gly Phe Arg Gly 420 425 430gaa ggt agt act act ttc act att gca acg cct act gga gaa gga aca 1344Glu Gly Ser Thr Thr Phe Thr Ile Ala Thr Pro Thr Gly Glu Gly Thr 435 440
445act aca gca cgt cat ttt gct aga cgc cgc cgc tgt gtt tta aca gat 1392Thr Thr Ala Arg His Phe Ala Arg Arg Arg Arg Cys Val Leu Thr Asp 450 455 460ggt ttt tcg att cgt taa 1410Gly Phe Ser Ile Arg46522469PRTListeria innocua 22Met Glu Ser Leu Glu Leu Glu Gln Leu Val Lys Lys Val Leu Leu Glu1 5 10 15Lys Leu Ala Glu Gln Lys Glu Val Pro Thr Lys Thr Thr Thr Gln Gly 20 25 30Ala Lys Ser Gly Val Phe Asp Thr Val Asp Glu Ala Val Gln Ala Ala 35 40 45Val Ile Ala Gln Asn Cys Tyr Lys Glu Lys Ser Leu Glu Glu Arg Arg 50 55 60Asn Val Val Lys Ala Ile Arg Glu Ala Leu Tyr Pro Glu Ile Glu Thr65 70 75 80Ile Ala Thr Arg Ala Val Ala Glu Thr Gly Met Gly Asn Val Thr Asp 85 90 95Lys Ile Leu Lys Asn Thr Leu Ala Ile Glu Lys Thr Pro Gly Val Glu 100 105 110Asp Leu Tyr Thr Glu Val Ala Thr Gly Asp Asn Gly Met Thr Leu Tyr 115 120 125Glu Leu Ser Pro Tyr Gly Val Ile Gly Ala Val Ala Pro Ser Thr Asn 130 135 140Pro Thr Glu Thr Leu Ile Cys Asn Ser Ile Gly Met Leu Ala Ala Gly145 150 155 160Asn Ala Val Phe Tyr Ser Pro His Pro Gly Ala Lys Asn Ile Ser Leu 165 170 175Trp Leu Ile Glu Lys Leu Asn Thr Ile Val Arg Asp Ser Cys Gly Ile 180 185 190Asp Asn Leu Ile Val Thr Val Ala Lys Pro Ser Ile Gln Ala Ala Gln 195 200 205Glu Met Met Asn His Pro Lys Val Pro Leu Leu Val Ile Thr Gly Gly 210 215 220Pro Gly Val Val Leu Gln Ala Met Gln Ser Gly Lys Lys Val Ile Gly225 230 235 240Ala Gly Ala Gly Asn Pro Pro Ser Ile Val Asp Glu Thr Ala Asn Ile 245 250 255Glu Lys Ala Ala Ala Asp Ile Val Asp Gly Ala Ser Phe Asp His Asn 260 265 270Ile Leu Cys Ile Ala Glu Lys Ser Val Val Ala Val Asp Ser Ile Ala 275 280 285Asp Phe Leu Leu Phe Gln Met Glu Lys Asn Gly Ala Leu His Val Thr 290 295 300Asn Pro Ser Asp Ile Gln Lys Leu Glu Lys Val Ala Val Thr Asp Lys305 310 315 320Gly Val Thr Asn Lys Lys Leu Val Gly Lys Ser Ala Thr Glu Ile Leu 325 330 335Lys Glu Ala Gly Ile Ala Cys Asp Phe Thr Pro Arg Leu Ile Ile Val 340 345 350Glu Thr Glu Lys Ser His Pro Phe Ala Thr Val Glu Leu Leu Met Pro 355 360 365Ile Val Pro Val Val Arg Val Pro Asp Phe Asp Glu Ala Leu 
Glu Val 370 375 380Ala Ile Glu Leu Glu Gln Gly Leu His His Thr Ala Thr Met His Ser385 390 395 400Gln Asn Ile Ser Arg Leu Asn Lys Ala Ala Arg Asp Met Gln Thr Ser 405 410 415Ile Phe Val Lys Asn Gly Pro Ser Phe Ala Gly Leu Gly Phe Arg Gly 420 425 430Glu Gly Ser Thr Thr Phe Thr Ile Ala Thr Pro Thr Gly Glu Gly Thr 435 440 445Thr Thr Ala Arg His Phe Ala Arg Arg Arg Arg Cys Val Leu Thr Asp 450 455 460Gly Phe Ser Ile Arg465231407DNAArtificialoptimised sequence 23atggaatctt tggaattgga acaattagtc aagaaggttt tgttggaaaa attggctgaa 60caaaaggaag ttccaaccaa gaccaccacc caaggtgcca agtccggtgt tttcgatacc 120gtcgatgaag ctgtccaagc tgccgtcatt gctcaaaact gttacaagga aaaatctttg 180gaagaaagaa gaaacgttgt caaggccatc agagaagctt tatacccaga aatcgaaacc 240attgctacca gagctgttgc tgaaaccggt atgggtaatg tcaccgataa aatcttgaag 300aacactttag ctatcgaaaa gactccaggt gttgaagact tgtacactga agttgctacc 360ggtgacaacg gtatgacttt atacgaatta tctccatacg gtgtcatcgg tgctgttgct 420ccatctacca acccaactga aactttgatc tgtaactcca tcggtatgtt ggctgctggt 480aacgccgttt tctactctcc tcacccaggt gccaagaaca tctctttatg gttgattgaa 540aagttgaaca ctatcgtcag agattcttgt ggtattgaca acttgattgt caccgttgcc 600aagccatcta tccaagctgc tcaagaaatg atgaaccacc caaaggttcc attgttggtc 660atcactggtg gtccaggtgt tgtcttgcaa gctatgcaat ctggtaagaa ggttatcggt 720gctggtgctg gtaaccctcc atccatcgtt gacgaaaccg ctaacattga aaaggctgct 780gctgacattg tcgacggtgc ttcctttgac cataatatct tgtgtatcgc tgaaaagtct 840gttgttgccg ttgactccat tgctgacttc ttgttgttcc aaatggaaaa gaacggtgct 900ttgcacgtca ctaacccatc tgatatccaa aaattggaaa aggttgccgt cactgacaag 960ggtgtcacca acaagaaatt ggttggtaag tctgccactg aaatcttgaa agaagctggt 1020attgcttgtg atttcacccc aagattgatc attgtcgaaa ctgaaaagtc ccacccattc 1080gctactgttg aattgttgat gccaattgtt ccagttgtca gagttccaga cttcgatgaa 1140gctttggaag ttgccattga attggaacaa ggtctacatc acactgctac catgcactct 1200caaaacatct ccagattgaa caaggctgcc cgtgacatgc aaacctccat ctttgtcaag 1260aacggtccat ctttcgctgg tttaggtttc agaggtgaag gttccaccac tttcaccatt 1320gctactccaa 
ctggtgaagg tactaccact gcccgtcact tcgctagaag aagaagatgt 1380gtcttgactg atggtttctc cattaga 1407241476DNAClostridium kluyveriCDS(1)..(1476) 24atg gag ata atg gat aag gac tta cag tca ata cag gaa gta aga act 48Met Glu Ile Met Asp Lys Asp Leu Gln Ser Ile Gln Glu Val Arg Thr1 5 10 15ctt ata gca aaa gca aag aaa gct caa gca gaa ttt aaa aat ttt tct 96Leu Ile Ala Lys Ala Lys Lys Ala Gln Ala Glu Phe Lys Asn Phe Ser 20 25 30caa gaa gct gta aac aag gta ata gaa aaa ata gct aag gct aca gaa 144Gln Glu Ala Val Asn Lys Val Ile Glu Lys Ile Ala Lys Ala Thr Glu 35 40 45gtt gaa gct gta aaa ctt gca aaa ttg gca tat gaa gat aca gga tat 192Val Glu Ala Val Lys Leu Ala Lys Leu Ala Tyr Glu Asp Thr Gly Tyr 50 55 60gga aaa tgg gaa gat aaa gta ata aag aat aag ttt tca agt ata gta 240Gly Lys Trp Glu Asp Lys Val Ile Lys Asn Lys Phe Ser Ser Ile Val65 70 75 80gtt tat aac tat att aaa gat ttg aaa acg gtt gga att tta aaa gaa 288Val Tyr Asn Tyr Ile Lys Asp Leu Lys Thr Val Gly Ile Leu Lys Glu 85 90 95gac aag gaa aag aaa tta ata gat ata gct gtt cca ctt gga gtt ata 336Asp Lys Glu Lys Lys Leu Ile Asp Ile Ala Val Pro Leu Gly Val Ile 100 105 110gca gga ctt ata cct tca act aac cca act tca aca gca ata ttc aag 384Ala Gly Leu Ile Pro Ser Thr Asn Pro Thr Ser Thr Ala Ile Phe Lys 115 120 125gta tta ata gca tta aag gca gga aat gca ata gta ttc tca cca cat 432Val Leu Ile Ala Leu Lys Ala Gly Asn Ala Ile Val Phe Ser Pro His 130 135 140cca aca gca gta aga agt att aca gaa act gta aag ata atg cag aaa 480Pro Thr Ala Val Arg Ser Ile Thr Glu Thr Val Lys Ile Met Gln Lys145 150 155 160gct gca gta gaa gca gga gca cca gat gga tta atc caa tgt atg tca 528Ala Ala Val Glu Ala Gly Ala Pro Asp Gly Leu Ile Gln Cys Met Ser 165 170 175ata ttg aca gta gaa ggt act gct gaa ttg atg aag aat aag gat aca 576Ile Leu Thr Val Glu Gly Thr Ala Glu Leu Met Lys Asn Lys Asp Thr 180 185 190gca ctt atc ctt gca aca ggt gga gaa gga atg gta aga gca gct tac 624Ala Leu Ile Leu Ala Thr Gly Gly Glu Gly Met Val Arg Ala Ala Tyr 195 200 205agt tca gga aca cca gct ata 
gga gtt gga cct gga aac ggc cca tgc 672Ser Ser Gly Thr Pro Ala Ile Gly Val Gly Pro Gly Asn Gly Pro Cys 210 215 220ttt att gaa aga aca gca gat att cct aca gca gta aga aaa gta ata 720Phe Ile Glu Arg Thr Ala Asp Ile Pro Thr Ala Val Arg Lys Val Ile225 230 235 240ggc agt gat act ttt gat aat gga gta ata tgt gct tca gaa caa tca 768Gly Ser Asp Thr Phe Asp Asn Gly Val Ile Cys Ala Ser Glu Gln Ser 245 250 255ata ata gca gag aca gta aag aaa gca gag ata att gaa gaa ttc aag 816Ile Ile Ala Glu Thr Val Lys Lys Ala Glu Ile Ile Glu Glu Phe Lys 260 265 270aga caa aaa gga tat ttc tta aat gca gaa gaa tca gaa aaa gta ggc 864Arg Gln Lys Gly Tyr Phe Leu Asn Ala Glu Glu Ser Glu Lys Val Gly 275 280 285aag att tta tta aga gct aat gga aca cca aac cca gca ata gta gga 912Lys Ile Leu Leu Arg Ala Asn Gly Thr Pro Asn Pro Ala Ile Val Gly 290 295 300aaa gat gtt caa gca tta gca aaa tta gca gga ata agc ata cca agc 960Lys Asp Val Gln Ala Leu Ala Lys Leu Ala Gly Ile Ser Ile Pro Ser305 310 315 320gat gcg gta ata tta ctt tca gag cag aca gat gtg agt cca aag aac 1008Asp Ala Val Ile Leu Leu Ser Glu Gln Thr Asp Val Ser Pro Lys Asn 325 330 335cct tat gca aag gaa aaa tta gct cca gta ctt gca ttc tat aca gta 1056Pro Tyr Ala Lys Glu Lys Leu Ala Pro Val Leu Ala Phe Tyr Thr Val 340 345 350gaa gac tgg cat gaa gca tgt gaa aaa tcc tta gca ctt ctt cat aac 1104Glu Asp Trp His Glu Ala Cys Glu Lys Ser Leu Ala Leu Leu His Asn 355 360 365caa gga agt gga cat aca tta ata att cac tca cag aat gaa gaa atc 1152Gln Gly Ser Gly His Thr Leu Ile Ile His Ser Gln Asn Glu Glu Ile 370 375 380ata aga gaa ttc gca ttg aag aaa cca gta tca aga ata ctt gta aat 1200Ile Arg Glu Phe Ala Leu Lys Lys Pro Val Ser Arg Ile Leu Val Asn385 390 395 400tca cct gga tca ctt gga gga ata ggt gga gct aca aat ctt gta cca 1248Ser Pro Gly Ser Leu Gly Gly Ile Gly Gly Ala Thr Asn Leu Val Pro 405 410 415tca ctt aca tta ggc tgt gga gca gta ggt gga agt gca act tca gat 1296Ser Leu Thr Leu Gly Cys Gly Ala Val Gly Gly Ser Ala Thr Ser Asp 420 425 430aac gta gga cca gaa aac 
tta ttc aac ata aga aaa gta gct tat gga 1344Asn Val Gly Pro Glu Asn Leu Phe Asn Ile Arg Lys Val Ala Tyr Gly 435 440 445act acg aca gta gaa gaa ata aga gaa gct ttt ggt gta gga gca gct 1392Thr Thr Thr Val Glu Glu Ile Arg Glu Ala Phe Gly Val Gly Ala Ala 450 455 460tca tca agt gca cca gca gaa cca gaa gat aat gaa gat gta cag gct 1440Ser Ser Ser Ala Pro Ala Glu Pro Glu Asp Asn Glu Asp Val Gln Ala465 470 475 480ata gta aaa gct ata atg gct aaa tta aat ctt taa 1476Ile Val Lys Ala Ile Met Ala Lys Leu Asn Leu 485 49025491PRTClostridium kluyveri 25Met Glu Ile Met Asp Lys Asp Leu Gln Ser Ile Gln Glu Val Arg Thr1 5 10 15Leu Ile Ala Lys Ala Lys Lys Ala Gln Ala Glu Phe Lys Asn Phe Ser 20 25 30Gln Glu Ala Val Asn Lys Val Ile Glu Lys Ile Ala Lys Ala Thr Glu 35 40 45Val Glu Ala Val Lys Leu Ala Lys Leu Ala Tyr Glu Asp Thr Gly Tyr 50 55 60Gly Lys Trp Glu Asp Lys Val Ile Lys Asn Lys Phe Ser Ser Ile Val65 70 75 80Val Tyr Asn Tyr Ile Lys Asp Leu Lys Thr Val Gly Ile Leu Lys Glu 85 90 95Asp Lys Glu Lys Lys Leu Ile Asp Ile Ala Val Pro Leu Gly Val Ile 100 105 110Ala Gly Leu Ile Pro Ser Thr Asn Pro Thr Ser Thr Ala Ile Phe Lys 115 120 125Val Leu Ile Ala Leu Lys Ala Gly Asn Ala Ile Val Phe Ser Pro His 130 135 140Pro Thr Ala Val Arg Ser Ile Thr Glu Thr Val Lys Ile Met Gln Lys145 150 155 160Ala Ala Val Glu Ala Gly Ala Pro Asp Gly Leu Ile Gln Cys Met Ser 165 170 175Ile Leu Thr Val Glu Gly Thr Ala Glu Leu Met Lys Asn Lys Asp Thr 180 185 190Ala Leu Ile Leu Ala Thr Gly Gly Glu Gly Met Val Arg Ala Ala Tyr 195 200 205Ser Ser Gly Thr Pro Ala Ile Gly Val Gly Pro Gly Asn Gly Pro Cys 210 215 220Phe Ile Glu Arg Thr Ala Asp Ile Pro Thr Ala Val Arg Lys Val Ile225 230 235 240Gly Ser Asp Thr Phe Asp Asn Gly Val Ile Cys Ala Ser Glu Gln Ser 245 250 255Ile Ile Ala Glu Thr Val Lys Lys Ala Glu Ile Ile Glu Glu Phe Lys 260 265 270Arg Gln Lys Gly Tyr Phe Leu Asn Ala Glu Glu Ser Glu Lys Val Gly 275 280 285Lys Ile Leu Leu Arg Ala Asn Gly Thr Pro Asn Pro Ala Ile Val Gly 290 295 300Lys Asp Val Gln Ala Leu Ala Lys Leu Ala 
Gly Ile Ser Ile Pro Ser305 310 315 320Asp Ala Val Ile Leu Leu Ser Glu Gln Thr Asp Val Ser Pro Lys Asn 325 330 335Pro Tyr Ala Lys Glu Lys Leu Ala Pro Val Leu Ala Phe Tyr Thr Val 340 345 350Glu Asp Trp His Glu Ala Cys Glu Lys Ser Leu Ala Leu Leu His Asn 355 360 365Gln Gly Ser Gly His Thr Leu Ile Ile His Ser Gln Asn Glu Glu Ile 370 375 380Ile Arg Glu Phe Ala Leu Lys Lys Pro Val Ser Arg Ile Leu Val Asn385 390 395 400Ser Pro Gly Ser Leu Gly Gly Ile Gly Gly Ala Thr Asn Leu Val Pro 405 410 415Ser Leu Thr Leu Gly Cys Gly Ala Val Gly Gly Ser Ala Thr Ser Asp 420 425 430Asn Val Gly Pro Glu Asn Leu Phe Asn Ile Arg Lys Val Ala Tyr Gly 435 440 445Thr Thr Thr Val Glu Glu Ile Arg Glu Ala Phe Gly Val Gly Ala Ala 450 455 460Ser Ser Ser Ala Pro Ala Glu Pro Glu Asp Asn Glu Asp Val Gln Ala465 470 475 480Ile Val Lys Ala Ile Met Ala Lys Leu Asn Leu 485 490261473DNAArtificialoptimised sequence 26atggaaatca tggacaagga tttgcaatcc atccaagaag ttagaacttt gattgccaag 60gccaagaagg ctcaagctga attcaagaac ttttcccaag aagctgttaa caaggtcatc 120gaaaagatcg ccaaggctac tgaagttgaa gctgtcaaat tggccaaatt ggcttacgaa 180gacaccggtt acggtaaatg ggaagacaag gtcatcaaga acaaattctc ctccattgtt 240gtctacaact acatcaagga tttgaagacc gttggtatct tgaaggaaga caaggaaaag 300aaattgattg acattgctgt cccattaggt gtcattgctg gtttgattcc atctaccaac 360ccaacttcca ctgccatttt caaggtcttg attgctttga aggctggtaa cgccattgtc 420ttctctccac acccaactgc tgtccgttcc atcactgaaa ccgttaagat catgcaaaag 480gctgctgttg aagctggtgc tccagatggt ttgatccaat gtatgtccat tttgaccgtt 540gaaggtactg ctgaattgat gaagaacaag gacaccgctt tgatcttggc taccggtggt 600gaaggtatgg ttagagctgc ttactcctct ggtactccag ccatcggtgt cggtccaggt 660aacggtccat gtttcatcga aagaactgct gacattccaa ctgctgttag aaaggttatc 720ggttctgaca ctttcgacaa cggtgtcatc tgtgcttctg aacaatccat cattgctgaa 780accgtcaaga aggctgaaat catcgaagaa ttcaagagac aaaagggtta cttcttgaat 840gctgaagaat ctgaaaaggt tggtaagatt ctattacgtg ccaacggtac tccaaaccca 900gccatcgttg gtaaggatgt ccaagctttg gccaaattgg ctggtatttc cattccatct 960gatgctgtta 
tcttactatc cgaacaaacc gatgtttctc ctaaaaatcc atacgctaag 1020gaaaaattgg ctccagtctt ggctttctac accgtcgaag actggcatga agcttgtgaa 1080aagtctttgg ctttattgca caaccaaggt tctggtcaca ctttgatcat ccactctcaa 1140aacgaagaaa tcattagaga atttgctttg aagaagcctg tttccagaat tttggttaac 1200tctccaggtt ctttgggtgg tatcggtggt gctaccaact tagtcccatc tttgacttta 1260ggttgtggtg ctgttggtgg ttctgccacc tctgacaacg ttggtccaga aaacttgttc 1320aacatcagaa aggttgctta cggtaccacc accgtcgaag aaatcagaga agctttcggt 1380gtcggtgctg cttcttcttc tgctccagct gaaccagaag acaacgaaga tgttcaagcc 1440attgttaagg ccatcatggc caaattgaac ttg 1473272610DNAStaphylococcus aureusCDS(1)..(2610) 27atg tta act ata cct gaa aaa gaa aat cgt gga tcg aaa gaa caa gaa 48Met Leu Thr Ile Pro Glu Lys Glu Asn Arg Gly Ser Lys Glu Gln Glu1 5 10 15gtg gca att atg att gat gct cta gct gac aaa ggg aaa aaa gca tta 96Val Ala Ile Met Ile Asp Ala Leu Ala Asp Lys Gly Lys Lys Ala Leu 20 25 30gaa gca tta tct aaa aag tca caa gaa gaa att gat cat att gtt cat 144Glu Ala Leu Ser Lys Lys Ser Gln Glu Glu Ile Asp His Ile Val His 35 40 45caa atg agc tta gca gct gtt gat caa cat atg gtg cta gca aaa tta 192Gln Met Ser Leu Ala Ala Val Asp Gln His Met Val Leu Ala Lys Leu 50 55 60gca cat gaa gaa act gga aga ggt ata tac gaa gat aaa gcg att aaa 240Ala His Glu Glu Thr Gly Arg Gly Ile Tyr Glu Asp Lys Ala Ile Lys65 70 75 80aat tta tac gct tct gaa tat ata tgg aat tca ata aaa gac aat aag 288Asn Leu Tyr Ala Ser Glu Tyr Ile Trp Asn Ser Ile Lys Asp Asn Lys 85 90 95aca gta ggg att att ggt gaa gat aaa gaa aaa gga tta acg tat gta
336Thr Val Gly Ile Ile Gly Glu Asp Lys Glu Lys Gly Leu Thr Tyr Val 100 105 110gcg gaa cca att ggt gtt att tgt ggt gtt acg cca aca aca aat cct 384Ala Glu Pro Ile Gly Val Ile Cys Gly Val Thr Pro Thr Thr Asn Pro 115 120 125acg tcg aca act att ttt aaa gcg atg att gca att aag aca gga aat 432Thr Ser Thr Thr Ile Phe Lys Ala Met Ile Ala Ile Lys Thr Gly Asn 130 135 140cca atc att ttt gca ttc cat cca agt gca caa gaa tcg tcg aag cgt 480Pro Ile Ile Phe Ala Phe His Pro Ser Ala Gln Glu Ser Ser Lys Arg145 150 155 160gca gca gaa gtt gta tta gaa gcg gca atg aag gca ggt gca cct aaa 528Ala Ala Glu Val Val Leu Glu Ala Ala Met Lys Ala Gly Ala Pro Lys 165 170 175gat att att cag tgg att gaa gtg cct tct atc gaa gca aca aaa caa 576Asp Ile Ile Gln Trp Ile Glu Val Pro Ser Ile Glu Ala Thr Lys Gln 180 185 190tta atg aat cac aaa ggt att gca tta gtt cta gca aca ggt ggt tcg 624Leu Met Asn His Lys Gly Ile Ala Leu Val Leu Ala Thr Gly Gly Ser 195 200 205ggc atg gtt aag tct gca tat tca act ggc aaa ccg gca tta ggt gtg 672Gly Met Val Lys Ser Ala Tyr Ser Thr Gly Lys Pro Ala Leu Gly Val 210 215 220gga cca ggt aac gtg ccg tct tac att gaa aaa aca gca cac att aaa 720Gly Pro Gly Asn Val Pro Ser Tyr Ile Glu Lys Thr Ala His Ile Lys225 230 235 240cgt gca gta aat gat atc att ggt tca aaa aca ttt gat aat ggt atg 768Arg Ala Val Asn Asp Ile Ile Gly Ser Lys Thr Phe Asp Asn Gly Met 245 250 255att tgt gct tct gaa caa gtt gta gtc att gat aaa gaa att tat aaa 816Ile Cys Ala Ser Glu Gln Val Val Val Ile Asp Lys Glu Ile Tyr Lys 260 265 270gat gtt act aat gaa ttt aaa gca cat caa gca tac ttt gtt aaa aaa 864Asp Val Thr Asn Glu Phe Lys Ala His Gln Ala Tyr Phe Val Lys Lys 275 280 285gat gaa tta caa cgc tta gaa aat gca att atg aat gaa caa aaa aca 912Asp Glu Leu Gln Arg Leu Glu Asn Ala Ile Met Asn Glu Gln Lys Thr 290 295 300ggt att aag cct gat att gtc ggt aaa tct gca gtt gaa ata gct gaa 960Gly Ile Lys Pro Asp Ile Val Gly Lys Ser Ala Val Glu Ile Ala Glu305 310 315 320tta gca ggt ata cct gtc ccc gaa aat aca aaa ctt atc ata gcc gaa 
1008Leu Ala Gly Ile Pro Val Pro Glu Asn Thr Lys Leu Ile Ile Ala Glu 325 330 335att agc ggt gta ggt tca gac tat ccg tta tct cgt gaa aaa tta tct 1056Ile Ser Gly Val Gly Ser Asp Tyr Pro Leu Ser Arg Glu Lys Leu Ser 340 345 350cca gta tta gcc tta gta aaa gcc caa tct aca aaa caa gca ttt caa 1104Pro Val Leu Ala Leu Val Lys Ala Gln Ser Thr Lys Gln Ala Phe Gln 355 360 365att tgt gaa gac aca cta cat ttt ggt gga tta gga cac aca gcc gtt 1152Ile Cys Glu Asp Thr Leu His Phe Gly Gly Leu Gly His Thr Ala Val 370 375 380atc cat aca gaa gat gaa aca tta caa aaa gat ttt gga cta aga atg 1200Ile His Thr Glu Asp Glu Thr Leu Gln Lys Asp Phe Gly Leu Arg Met385 390 395 400aaa gct tgt cgt gta ctt gta aat aca cca tca gcg gtt gga ggt att 1248Lys Ala Cys Arg Val Leu Val Asn Thr Pro Ser Ala Val Gly Gly Ile 405 410 415ggt gat atg tat aac gaa ttg att ccg tct tta aca tta ggt tgt ggt 1296Gly Asp Met Tyr Asn Glu Leu Ile Pro Ser Leu Thr Leu Gly Cys Gly 420 425 430tcg tac ggt aga aac tca att tca cat aat gtt agt gcg aca gat tta 1344Ser Tyr Gly Arg Asn Ser Ile Ser His Asn Val Ser Ala Thr Asp Leu 435 440 445tta aac att aaa acg att gct aaa cga cgt aat aat act caa att ttc 1392Leu Asn Ile Lys Thr Ile Ala Lys Arg Arg Asn Asn Thr Gln Ile Phe 450 455 460aag gtg cct gct caa att tat ttt gaa gaa aat gca atc atg agt cta 1440Lys Val Pro Ala Gln Ile Tyr Phe Glu Glu Asn Ala Ile Met Ser Leu465 470 475 480aca aca atg gac aag att gaa aaa gtg atg att gtc tgt gac cct ggt 1488Thr Thr Met Asp Lys Ile Glu Lys Val Met Ile Val Cys Asp Pro Gly 485 490 495atg gta gaa ttc ggt tat aca aaa aca gtt gag aat gta tta aga caa 1536Met Val Glu Phe Gly Tyr Thr Lys Thr Val Glu Asn Val Leu Arg Gln 500 505 510aga acg gaa cag cct caa att aaa ata ttt agc gaa gtc gaa ccg aac 1584Arg Thr Glu Gln Pro Gln Ile Lys Ile Phe Ser Glu Val Glu Pro Asn 515 520 525cca tca act aat aca gta tat aaa ggt ctg gaa atg atg gtt gat ttc 1632Pro Ser Thr Asn Thr Val Tyr Lys Gly Leu Glu Met Met Val Asp Phe 530 535 540caa cca gat aca atc att gca ctt ggt ggt ggt tca gcg atg 
gat gct 1680Gln Pro Asp Thr Ile Ile Ala Leu Gly Gly Gly Ser Ala Met Asp Ala545 550 555 560gca aaa gca atg tgg atg ttc ttt gaa cac cct gag aca tca ttc ttc 1728Ala Lys Ala Met Trp Met Phe Phe Glu His Pro Glu Thr Ser Phe Phe 565 570 575ggt gct aaa caa aag ttc cta gac atc ggt aaa cgt act tat aaa ata 1776Gly Ala Lys Gln Lys Phe Leu Asp Ile Gly Lys Arg Thr Tyr Lys Ile 580 585 590ggc atg cct gaa aat gcg acg ttc att tgt atc cct acg aca tca ggt 1824Gly Met Pro Glu Asn Ala Thr Phe Ile Cys Ile Pro Thr Thr Ser Gly 595 600 605aca ggt tca gaa gta aca cca ttt gca gtt atc aca gat agt gaa aca 1872Thr Gly Ser Glu Val Thr Pro Phe Ala Val Ile Thr Asp Ser Glu Thr 610 615 620aat gta aaa tat ccg ttg gct gat ttt gct tta aca cct gac gtt gca 1920Asn Val Lys Tyr Pro Leu Ala Asp Phe Ala Leu Thr Pro Asp Val Ala625 630 635 640att att gac cct caa ttt gtg atg agt gtg cca aaa agc gtt aca gca 1968Ile Ile Asp Pro Gln Phe Val Met Ser Val Pro Lys Ser Val Thr Ala 645 650 655gat aca gga atg gat gta cta acg cat gca atg gaa tca tat gta tct 2016Asp Thr Gly Met Asp Val Leu Thr His Ala Met Glu Ser Tyr Val Ser 660 665 670gta atg gct tca gac tat aca aga ggt ttg agt cta caa gcg att aaa 2064Val Met Ala Ser Asp Tyr Thr Arg Gly Leu Ser Leu Gln Ala Ile Lys 675 680 685ttg acg ttc gaa tat tta aaa tca tct gtt gaa aag ggt gat aaa gtt 2112Leu Thr Phe Glu Tyr Leu Lys Ser Ser Val Glu Lys Gly Asp Lys Val 690 695 700tca aga gag aaa atg cat aac gca tca act ttg gct ggt atg gca ttt 2160Ser Arg Glu Lys Met His Asn Ala Ser Thr Leu Ala Gly Met Ala Phe705 710 715 720gca aat gca ttc tta ggc att gca cac tca att gca cat aaa att ggt 2208Ala Asn Ala Phe Leu Gly Ile Ala His Ser Ile Ala His Lys Ile Gly 725 730 735ggc gaa tat ggt att ccg cat ggt aga gcg aat gcg ata tta cta ccg 2256Gly Glu Tyr Gly Ile Pro His Gly Arg Ala Asn Ala Ile Leu Leu Pro 740 745 750cat att atc cgt tat aat gcc aaa gac ccg caa aaa cat gca tta ttc 2304His Ile Ile Arg Tyr Asn Ala Lys Asp Pro Gln Lys His Ala Leu Phe 755 760 765cct aaa tat gag ttc ttc aga gca gat aca gat 
tat gca gat att gcc 2352Pro Lys Tyr Glu Phe Phe Arg Ala Asp Thr Asp Tyr Ala Asp Ile Ala 770 775 780aaa ttc tta gga tta aaa ggg aat acg aca gaa gca ctc gta gaa tca 2400Lys Phe Leu Gly Leu Lys Gly Asn Thr Thr Glu Ala Leu Val Glu Ser785 790 795 800tta gct aaa gct gtc tac gaa tta ggt caa tca gtc gga att gaa atg 2448Leu Ala Lys Ala Val Tyr Glu Leu Gly Gln Ser Val Gly Ile Glu Met 805 810 815aat ttg aaa tca caa ggt gtg tct gaa gaa gaa tta aat gaa tca att 2496Asn Leu Lys Ser Gln Gly Val Ser Glu Glu Glu Leu Asn Glu Ser Ile 820 825 830gat aga atg gca gag ctc gca ttt gaa gat caa tgt aca act gct aat 2544Asp Arg Met Ala Glu Leu Ala Phe Glu Asp Gln Cys Thr Thr Ala Asn 835 840 845cct aaa gaa gca cta atc agt gaa atc aaa gat atc att caa aca tca 2592Pro Lys Glu Ala Leu Ile Ser Glu Ile Lys Asp Ile Ile Gln Thr Ser 850 855 860tat gat tat aag caa taa 2610Tyr Asp Tyr Lys Gln86528869PRTStaphylococcus aureus 28Met Leu Thr Ile Pro Glu Lys Glu Asn Arg Gly Ser Lys Glu Gln Glu1 5 10 15Val Ala Ile Met Ile Asp Ala Leu Ala Asp Lys Gly Lys Lys Ala Leu 20 25 30Glu Ala Leu Ser Lys Lys Ser Gln Glu Glu Ile Asp His Ile Val His 35 40 45Gln Met Ser Leu Ala Ala Val Asp Gln His Met Val Leu Ala Lys Leu 50 55 60Ala His Glu Glu Thr Gly Arg Gly Ile Tyr Glu Asp Lys Ala Ile Lys65 70 75 80Asn Leu Tyr Ala Ser Glu Tyr Ile Trp Asn Ser Ile Lys Asp Asn Lys 85 90 95Thr Val Gly Ile Ile Gly Glu Asp Lys Glu Lys Gly Leu Thr Tyr Val 100 105 110Ala Glu Pro Ile Gly Val Ile Cys Gly Val Thr Pro Thr Thr Asn Pro 115 120 125Thr Ser Thr Thr Ile Phe Lys Ala Met Ile Ala Ile Lys Thr Gly Asn 130 135 140Pro Ile Ile Phe Ala Phe His Pro Ser Ala Gln Glu Ser Ser Lys Arg145 150 155 160Ala Ala Glu Val Val Leu Glu Ala Ala Met Lys Ala Gly Ala Pro Lys 165 170 175Asp Ile Ile Gln Trp Ile Glu Val Pro Ser Ile Glu Ala Thr Lys Gln 180 185 190Leu Met Asn His Lys Gly Ile Ala Leu Val Leu Ala Thr Gly Gly Ser 195 200 205Gly Met Val Lys Ser Ala Tyr Ser Thr Gly Lys Pro Ala Leu Gly Val 210 215 220Gly Pro Gly Asn Val Pro Ser Tyr Ile Glu Lys Thr Ala His Ile 
Lys225 230 235 240Arg Ala Val Asn Asp Ile Ile Gly Ser Lys Thr Phe Asp Asn Gly Met 245 250 255Ile Cys Ala Ser Glu Gln Val Val Val Ile Asp Lys Glu Ile Tyr Lys 260 265 270Asp Val Thr Asn Glu Phe Lys Ala His Gln Ala Tyr Phe Val Lys Lys 275 280 285Asp Glu Leu Gln Arg Leu Glu Asn Ala Ile Met Asn Glu Gln Lys Thr 290 295 300Gly Ile Lys Pro Asp Ile Val Gly Lys Ser Ala Val Glu Ile Ala Glu305 310 315 320Leu Ala Gly Ile Pro Val Pro Glu Asn Thr Lys Leu Ile Ile Ala Glu 325 330 335Ile Ser Gly Val Gly Ser Asp Tyr Pro Leu Ser Arg Glu Lys Leu Ser 340 345 350Pro Val Leu Ala Leu Val Lys Ala Gln Ser Thr Lys Gln Ala Phe Gln 355 360 365Ile Cys Glu Asp Thr Leu His Phe Gly Gly Leu Gly His Thr Ala Val 370 375 380Ile His Thr Glu Asp Glu Thr Leu Gln Lys Asp Phe Gly Leu Arg Met385 390 395 400Lys Ala Cys Arg Val Leu Val Asn Thr Pro Ser Ala Val Gly Gly Ile 405 410 415Gly Asp Met Tyr Asn Glu Leu Ile Pro Ser Leu Thr Leu Gly Cys Gly 420 425 430Ser Tyr Gly Arg Asn Ser Ile Ser His Asn Val Ser Ala Thr Asp Leu 435 440 445Leu Asn Ile Lys Thr Ile Ala Lys Arg Arg Asn Asn Thr Gln Ile Phe 450 455 460Lys Val Pro Ala Gln Ile Tyr Phe Glu Glu Asn Ala Ile Met Ser Leu465 470 475 480Thr Thr Met Asp Lys Ile Glu Lys Val Met Ile Val Cys Asp Pro Gly 485 490 495Met Val Glu Phe Gly Tyr Thr Lys Thr Val Glu Asn Val Leu Arg Gln 500 505 510Arg Thr Glu Gln Pro Gln Ile Lys Ile Phe Ser Glu Val Glu Pro Asn 515 520 525Pro Ser Thr Asn Thr Val Tyr Lys Gly Leu Glu Met Met Val Asp Phe 530 535 540Gln Pro Asp Thr Ile Ile Ala Leu Gly Gly Gly Ser Ala Met Asp Ala545 550 555 560Ala Lys Ala Met Trp Met Phe Phe Glu His Pro Glu Thr Ser Phe Phe 565 570 575Gly Ala Lys Gln Lys Phe Leu Asp Ile Gly Lys Arg Thr Tyr Lys Ile 580 585 590Gly Met Pro Glu Asn Ala Thr Phe Ile Cys Ile Pro Thr Thr Ser Gly 595 600 605Thr Gly Ser Glu Val Thr Pro Phe Ala Val Ile Thr Asp Ser Glu Thr 610 615 620Asn Val Lys Tyr Pro Leu Ala Asp Phe Ala Leu Thr Pro Asp Val Ala625 630 635 640Ile Ile Asp Pro Gln Phe Val Met Ser Val Pro Lys Ser Val Thr Ala 645 650 655Asp Thr Gly Met 
Asp Val Leu Thr His Ala Met Glu Ser Tyr Val Ser 660 665 670Val Met Ala Ser Asp Tyr Thr Arg Gly Leu Ser Leu Gln Ala Ile Lys 675 680 685Leu Thr Phe Glu Tyr Leu Lys Ser Ser Val Glu Lys Gly Asp Lys Val 690 695 700Ser Arg Glu Lys Met His Asn Ala Ser Thr Leu Ala Gly Met Ala Phe705 710 715 720Ala Asn Ala Phe Leu Gly Ile Ala His Ser Ile Ala His Lys Ile Gly 725 730 735Gly Glu Tyr Gly Ile Pro His Gly Arg Ala Asn Ala Ile Leu Leu Pro 740 745 750His Ile Ile Arg Tyr Asn Ala Lys Asp Pro Gln Lys His Ala Leu Phe 755 760 765Pro Lys Tyr Glu Phe Phe Arg Ala Asp Thr Asp Tyr Ala Asp Ile Ala 770 775 780Lys Phe Leu Gly Leu Lys Gly Asn Thr Thr Glu Ala Leu Val Glu Ser785 790 795 800Leu Ala Lys Ala Val Tyr Glu Leu Gly Gln Ser Val Gly Ile Glu Met 805 810 815Asn Leu Lys Ser Gln Gly Val Ser Glu Glu Glu Leu Asn Glu Ser Ile 820 825 830Asp Arg Met Ala Glu Leu Ala Phe Glu Asp Gln Cys Thr Thr Ala Asn 835 840 845Pro Lys Glu Ala Leu Ile Ser Glu Ile Lys Asp Ile Ile Gln Thr Ser 850 855 860Tyr Asp Tyr Lys Gln865292607DNAArtificialoptimised sequence 29atgttgacca ttccagaaaa ggaaaacaga ggttccaagg aacaagaagt tgccatcatg 60attgatgctt tagctgacaa aggtaagaag gctttggaag ctttgtccaa gaagtctcaa 120gaagaaattg accacattgt ccaccaaatg tccttggctg ctgttgacca acacatggtt 180ttggccaagt tggctcatga agaaaccggt agaggtatct acgaagacaa ggctatcaag 240aacttatacg cctctgaata catctggaac tccatcaagg acaacaagac tgttggtatc 300attggtgaag acaaagaaaa gggtttgacc tacgttgctg aaccaattgg tgtcatctgt 360ggtgtcactc caaccaccaa cccaacttct accaccatct tcaaggctat gattgccatc 420aagactggta acccaattat tttcgctttc cacccatctg ctcaagaatc ttccaagaga 480gctgctgaag ttgttttgga agctgccatg aaggctggtg ctccaaagga tatcatccaa 540tggattgaag ttccatccat tgaagctacc aagcaattga tgaaccacaa gggtattgct 600ttagtcttgg ctaccggtgg ttctggtatg gttaagtctg cttactccac tggtaaacca 660gctttgggtg ttggtccagg taacgttcca tcttacatcg aaaagactgc tcatatcaag 720cgtgctgtca acgatatcat cggttccaag actttcgata atggtatgat ctgtgcttct 780gaacaagttg ttgtcattga caaggaaatc tacaaggatg tcaccaatga attcaaggct 
840caccaagctt acttcgtcaa gaaggacgaa ttacaaagat tagaaaacgc catcatgaac 900gaacaaaaga ctggtatcaa gccagatatc gttggtaagt ctgctgttga aattgctgaa 960ttggccggta tcccagttcc agaaaacacc aaattgatca ttgctgaaat ctccggtgtc 1020ggttctgact acccattgtc cagagaaaag ttgtctccag ttttggcttt agtcaaggct 1080caatctacca agcaagcttt ccaaatctgt gaagacactt tgcacttcgg tggtttaggt 1140cacactgctg ttatccacac tgaagacgaa actttgcaaa aggatttcgg tctaagaatg 1200aaggcttgtc gtgttttggt caacactcca tctgctgttg gtggtatcgg tgacatgtac 1260aacgaattga ttccatcctt gactttgggt tgtggttctt acggtagaaa ctccatctcc 1320cacaacgtct ctgctaccga tttgttgaac atcaagacca ttgccaagag aagaaacaac 1380actcaaatct tcaaggttcc agctcaaatc tatttcgaag aaaacgctat catgtccttg 1440accaccatgg acaagattga aaaggtcatg atcgtttgtg acccaggtat ggttgaattt 1500ggttacacca aaaccgtcga aaacgtctta cgtcaaagaa ctgaacaacc tcaaatcaag 1560atcttctctg aagttgaacc aaatccatcc accaacactg tctacaaggg tttggaaatg 1620atggtcgatt tccaaccaga caccatcatt gctttgggtg gtggttctgc catggatgct 1680gccaaggcta tgtggatgtt cttcgaacat ccagaaactt ctttcttcgg tgccaagcaa 1740aaattcttgg acattggtaa gagaacctac aagattggta tgccagaaaa cgccactttc 1800atctgtattc caaccacttc tggtactggt tctgaagtca ctccatttgc tgttatcact 1860gactctgaaa ccaacgtcaa atacccattg gctgatttcg ctttgactcc agatgtcgcc 1920atcattgacc ctcaatttgt catgtccgtc ccaaaatctg tcactgctga taccggtatg 1980gacgttttga ctcacgctat ggaatcttac gtttctgtca tggcctccga ttacaccaga 2040ggtttgtccc tacaagctat caaattgacc tttgaatact tgaaatcttc cgttgaaaaa 2100ggtgacaagg tttccagaga aaagatgcac aacgcttcta ctttggccgg tatggccttt 2160gctaacgctt tcttgggtat tgctcactcc attgctcaca aaattggtgg tgaatacggt
2220attccacatg gtagagctaa cgccatcttg ttgcctcaca tcatcagata caacgccaag 2280gaccctcaaa agcacgcttt gttcccaaag tacgaattct tcagagctga caccgattac 2340gctgatatcg ccaagttctt aggtttgaaa ggtaacacca ctgaagcttt ggttgaatct 2400ttggccaagg ctgtctacga attaggtcaa tctgttggta ttgaaatgaa cttgaaatct 2460caaggtgtct ctgaagaaga attgaacgaa tccattgaca gaatggctga attggctttc 2520gaagaccaat gtaccactgc caacccaaag gaagctttga tttctgaaat caaggatatc 2580atccaaactt cttacgacta caagcag 260730392PRTClostridium acetobutylicum 30Met Lys Glu Val Val Ile Ala Ser Ala Val Arg Thr Ala Ile Gly Ser1 5 10 15Tyr Gly Lys Ser Leu Lys Asp Val Pro Ala Val Asp Leu Gly Ala Thr 20 25 30Ala Ile Lys Glu Ala Val Lys Lys Ala Gly Ile Lys Pro Glu Asp Val 35 40 45Asn Glu Val Ile Leu Gly Asn Val Leu Gln Ala Gly Leu Gly Gln Asn 50 55 60Pro Ala Arg Gln Ala Ser Phe Lys Ala Gly Leu Pro Val Glu Ile Pro65 70 75 80Ala Met Thr Ile Asn Lys Val Cys Gly Ser Gly Leu Arg Thr Val Ser 85 90 95Leu Ala Ala Gln Ile Ile Lys Ala Gly Asp Ala Asp Val Ile Ile Ala 100 105 110Gly Gly Met Glu Asn Met Ser Arg Ala Pro Tyr Leu Ala Asn Asn Ala 115 120 125Arg Trp Gly Tyr Arg Met Gly Asn Ala Lys Phe Val Asp Glu Met Ile 130 135 140Thr Asp Gly Leu Trp Asp Ala Phe Asn Asp Tyr His Met Gly Ile Thr145 150 155 160Ala Glu Asn Ile Ala Glu Arg Trp Asn Ile Ser Arg Glu Glu Gln Asp 165 170 175Glu Phe Ala Leu Ala Ser Gln Lys Lys Ala Glu Glu Ala Ile Lys Ser 180 185 190Gly Gln Phe Lys Asp Glu Ile Val Pro Val Val Ile Lys Gly Arg Lys 195 200 205Gly Glu Thr Val Val Asp Thr Asp Glu His Pro Arg Phe Gly Ser Thr 210 215 220Ile Glu Gly Leu Ala Lys Leu Lys Pro Ala Phe Lys Lys Asp Gly Thr225 230 235 240Val Thr Ala Gly Asn Ala Ser Gly Leu Asn Asp Cys Ala Ala Val Leu 245 250 255Val Ile Met Ser Ala Glu Lys Ala Lys Glu Leu Gly Val Lys Pro Leu 260 265 270Ala Lys Ile Val Ser Tyr Gly Ser Ala Gly Val Asp Pro Ala Ile Met 275 280 285Gly Tyr Gly Pro Phe Tyr Ala Thr Lys Ala Ala Ile Glu Lys Ala Gly 290 295 300Trp Thr Val Asp Glu Leu Asp Leu Ile Glu Ser Asn Glu Ala Phe Ala305 310 315 320Ala Gln Ser 
Leu Ala Val Ala Lys Asp Leu Lys Phe Asp Met Asn Lys 325 330 335Val Asn Val Asn Gly Gly Ala Ile Ala Leu Gly His Pro Ile Gly Ala 340 345 350Ser Gly Ala Arg Ile Leu Val Thr Leu Val His Ala Met Gln Lys Arg 355 360 365Asp Ala Lys Lys Gly Leu Ala Thr Leu Cys Ile Gly Gly Gly Gln Gly 370 375 380Thr Ala Ile Leu Leu Glu Lys Cys385 39031282PRTClostridium acetobutylicum 31Met Lys Lys Val Cys Val Ile Gly Ala Gly Thr Met Gly Ser Gly Ile1 5 10 15Ala Gln Ala Phe Ala Ala Lys Gly Phe Glu Val Val Leu Arg Asp Ile 20 25 30Lys Asp Glu Phe Val Asp Arg Gly Leu Asp Phe Ile Asn Lys Asn Leu 35 40 45Ser Lys Leu Val Lys Lys Gly Lys Ile Glu Glu Ala Thr Lys Val Glu 50 55 60Ile Leu Thr Arg Ile Ser Gly Thr Val Asp Leu Asn Met Ala Ala Asp65 70 75 80Cys Asp Leu Val Ile Glu Ala Ala Val Glu Arg Met Asp Ile Lys Lys 85 90 95Gln Ile Phe Ala Asp Leu Asp Asn Ile Cys Lys Pro Glu Thr Ile Leu 100 105 110Ala Ser Asn Thr Ser Ser Leu Ser Ile Thr Glu Val Ala Ser Ala Thr 115 120 125Lys Arg Pro Asp Lys Val Ile Gly Met His Phe Phe Asn Pro Ala Pro 130 135 140Val Met Lys Leu Val Glu Val Ile Arg Gly Ile Ala Thr Ser Gln Glu145 150 155 160Thr Phe Asp Ala Val Lys Glu Thr Ser Ile Ala Ile Gly Lys Asp Pro 165 170 175Val Glu Val Ala Glu Ala Pro Gly Phe Val Val Asn Arg Ile Leu Ile 180 185 190Pro Met Ile Asn Glu Ala Val Gly Ile Leu Ala Glu Gly Ile Ala Ser 195 200 205Val Glu Asp Ile Asp Lys Ala Met Lys Leu Gly Ala Asn His Pro Met 210 215 220Gly Pro Leu Glu Leu Gly Asp Phe Ile Gly Leu Asp Ile Cys Leu Ala225 230 235 240Ile Met Asp Val Leu Tyr Ser Glu Thr Gly Asp Ser Lys Tyr Arg Pro 245 250 255His Thr Leu Leu Lys Lys Tyr Val Arg Ala Gly Trp Leu Gly Arg Lys 260 265 270Ser Gly Lys Gly Phe Tyr Asp Tyr Ser Lys 275 28032261PRTClostridium acetobutylicum 32Met Glu Leu Asn Asn Val Ile Leu Glu Lys Glu Gly Lys Val Ala Val1 5 10 15Val Thr Ile Asn Arg Pro Lys Ala Leu Asn Ala Leu Asn Ser Asp Thr 20 25 30Leu Lys Glu Met Asp Tyr Val Ile Gly Glu Ile Glu Asn Asp Ser Glu 35 40 45Val Leu Ala Val Ile Leu Thr Gly Ala Gly Glu Lys Ser Phe Val Ala 50 55 
60Gly Ala Asp Ile Ser Glu Met Lys Glu Met Asn Thr Ile Glu Gly Arg65 70 75 80Lys Phe Gly Ile Leu Gly Asn Lys Val Phe Arg Arg Leu Glu Leu Leu 85 90 95Glu Lys Pro Val Ile Ala Ala Val Asn Gly Phe Ala Leu Gly Gly Gly 100 105 110Cys Glu Ile Ala Met Ser Cys Asp Ile Arg Ile Ala Ser Ser Asn Ala 115 120 125Arg Phe Gly Gln Pro Glu Val Gly Leu Gly Ile Thr Pro Gly Phe Gly 130 135 140Gly Thr Gln Arg Leu Ser Arg Leu Val Gly Met Gly Met Ala Lys Gln145 150 155 160Leu Ile Phe Thr Ala Gln Asn Ile Lys Ala Asp Glu Ala Leu Arg Ile 165 170 175Gly Leu Val Asn Lys Val Val Glu Pro Ser Glu Leu Met Asn Thr Ala 180 185 190Lys Glu Ile Ala Asn Lys Ile Val Ser Asn Ala Pro Val Ala Val Lys 195 200 205Leu Ser Lys Gln Ala Ile Asn Arg Gly Met Gln Cys Asp Ile Asp Thr 210 215 220Ala Leu Ala Phe Glu Ser Glu Ala Phe Gly Glu Cys Phe Ser Thr Glu225 230 235 240Asp Gln Lys Asp Ala Met Thr Ala Phe Ile Glu Lys Arg Lys Ile Glu 245 250 255Gly Phe Lys Asn Arg 26033379PRTClostridium acetobutylicum 33Met Asp Phe Asn Leu Thr Arg Glu Gln Glu Leu Val Arg Gln Met Val1 5 10 15Arg Glu Phe Ala Glu Asn Glu Val Lys Pro Ile Ala Ala Glu Ile Asp 20 25 30Glu Thr Glu Arg Phe Pro Met Glu Asn Val Lys Lys Met Gly Gln Tyr 35 40 45Gly Met Met Gly Ile Pro Phe Ser Lys Glu Tyr Gly Gly Ala Gly Gly 50 55 60Asp Val Leu Ser Tyr Ile Ile Ala Val Glu Glu Leu Ser Lys Val Cys65 70 75 80Gly Thr Thr Gly Val Ile Leu Ser Ala His Thr Ser Leu Cys Ala Ser 85 90 95Leu Ile Asn Glu His Gly Thr Glu Glu Gln Lys Gln Lys Tyr Leu Val 100 105 110Pro Leu Ala Lys Gly Glu Lys Ile Gly Ala Tyr Gly Leu Thr Glu Pro 115 120 125Asn Ala Gly Thr Asp Ser Gly Ala Gln Gln Thr Val Ala Val Leu Glu 130 135 140Gly Asp His Tyr Val Ile Asn Gly Ser Lys Ile Phe Ile Thr Asn Gly145 150 155 160Gly Val Ala Asp Thr Phe Val Ile Phe Ala Met Thr Asp Arg Thr Lys 165 170 175Gly Thr Lys Gly Ile Ser Ala Phe Ile Ile Glu Lys Gly Phe Lys Gly 180 185 190Phe Ser Ile Gly Lys Val Glu Gln Lys Leu Gly Ile Arg Ala Ser Ser 195 200 205Thr Thr Glu Leu Val Phe Glu Asp Met Ile Val Pro Val Glu Asn Met 210 215 
220Ile Gly Lys Glu Gly Lys Gly Phe Pro Ile Ala Met Lys Thr Leu Asp225 230 235 240Gly Gly Arg Ile Gly Ile Ala Ala Gln Ala Leu Gly Ile Ala Glu Gly 245 250 255Ala Phe Asn Glu Ala Arg Ala Tyr Met Lys Glu Arg Lys Gln Phe Gly 260 265 270Arg Ser Leu Asp Lys Phe Gln Gly Leu Ala Trp Met Met Ala Asp Met 275 280 285Asp Val Ala Ile Glu Ser Ala Arg Tyr Leu Val Tyr Lys Ala Ala Tyr 290 295 300Leu Lys Gln Ala Gly Leu Pro Tyr Thr Val Asp Ala Ala Arg Ala Lys305 310 315 320Leu His Ala Ala Asn Val Ala Met Asp Val Thr Thr Lys Ala Val Gln 325 330 335Leu Phe Gly Gly Tyr Gly Tyr Thr Lys Asp Tyr Pro Val Glu Arg Met 340 345 350Met Arg Asp Ala Lys Ile Thr Glu Ile Tyr Glu Gly Thr Ser Glu Val 355 360 365Gln Lys Leu Val Ile Ser Gly Lys Ile Phe Arg 370 37534858PRTClostridium acetobutylicum 34Met Lys Val Thr Asn Gln Lys Glu Leu Lys Gln Lys Leu Asn Glu Leu1 5 10 15Arg Glu Ala Gln Lys Lys Phe Ala Thr Tyr Thr Gln Glu Gln Val Asp 20 25 30Lys Ile Phe Lys Gln Cys Ala Ile Ala Ala Ala Lys Glu Arg Ile Asn 35 40 45Leu Ala Lys Leu Ala Val Glu Glu Thr Gly Ile Gly Leu Val Glu Asp 50 55 60Lys Ile Ile Lys Asn His Phe Ala Ala Glu Tyr Ile Tyr Asn Lys Tyr65 70 75 80Lys Asn Glu Lys Thr Cys Gly Ile Ile Asp His Asp Asp Ser Leu Gly 85 90 95Ile Thr Lys Val Ala Glu Pro Ile Gly Ile Val Ala Ala Ile Val Pro 100 105 110Thr Thr Asn Pro Thr Ser Thr Ala Ile Phe Lys Ser Leu Ile Ser Leu 115 120 125Lys Thr Arg Asn Ala Ile Phe Phe Ser Pro His Pro Arg Ala Lys Lys 130 135 140Ser Thr Ile Ala Ala Ala Lys Leu Ile Leu Asp Ala Ala Val Lys Ala145 150 155 160Gly Ala Pro Lys Asn Ile Ile Gly Trp Ile Asp Glu Pro Ser Ile Glu 165 170 175Leu Ser Gln Asp Leu Met Ser Glu Ala Asp Ile Ile Leu Ala Thr Gly 180 185 190Gly Pro Ser Met Val Lys Ala Ala Tyr Ser Ser Gly Lys Pro Ala Ile 195 200 205Gly Val Gly Ala Gly Asn Thr Pro Ala Ile Ile Asp Glu Ser Ala Asp 210 215 220Ile Asp Met Ala Val Ser Ser Ile Ile Leu Ser Lys Thr Tyr Asp Asn225 230 235 240Gly Val Ile Cys Ala Ser Glu Gln Ser Ile Leu Val Met Asn Ser Ile 245 250 255Tyr Glu Lys Val Lys Glu Glu Phe Val 
Lys Arg Gly Ser Tyr Ile Leu 260 265 270Asn Gln Asn Glu Ile Ala Lys Ile Lys Glu Thr Met Phe Lys Asn Gly 275 280 285Ala Ile Asn Ala Asp Ile Val Gly Lys Ser Ala Tyr Ile Ile Ala Lys 290 295 300Met Ala Gly Ile Glu Val Pro Gln Thr Thr Lys Ile Leu Ile Gly Glu305 310 315 320Val Gln Ser Val Glu Lys Ser Glu Leu Phe Ser His Glu Lys Leu Ser 325 330 335Pro Val Leu Ala Met Tyr Lys Val Lys Asp Phe Asp Glu Ala Leu Lys 340 345 350Lys Ala Gln Arg Leu Ile Glu Leu Gly Gly Ser Gly His Thr Ser Ser 355 360 365Leu Tyr Ile Asp Ser Gln Asn Asn Lys Asp Lys Val Lys Glu Phe Gly 370 375 380Leu Ala Met Lys Thr Ser Arg Thr Phe Ile Asn Met Pro Ser Ser Gln385 390 395 400Gly Ala Ser Gly Asp Leu Tyr Asn Phe Ala Ile Ala Pro Ser Phe Thr 405 410 415Leu Gly Cys Gly Thr Trp Gly Gly Asn Ser Val Ser Gln Asn Val Glu 420 425 430Pro Lys His Leu Leu Asn Ile Lys Ser Val Ala Glu Arg Arg Glu Asn 435 440 445Met Leu Trp Phe Lys Val Pro Gln Lys Ile Tyr Phe Lys Tyr Gly Cys 450 455 460Leu Arg Phe Ala Leu Lys Glu Leu Lys Asp Met Asn Lys Lys Arg Ala465 470 475 480Phe Ile Val Thr Asp Lys Asp Leu Phe Lys Leu Gly Tyr Val Asn Lys 485 490 495Ile Thr Lys Val Leu Asp Glu Ile Asp Ile Lys Tyr Ser Ile Phe Thr 500 505 510Asp Ile Lys Ser Asp Pro Thr Ile Asp Ser Val Lys Lys Gly Ala Lys 515 520 525Glu Met Leu Asn Phe Glu Pro Asp Thr Ile Ile Ser Ile Gly Gly Gly 530 535 540Ser Pro Met Asp Ala Ala Lys Val Met His Leu Leu Tyr Glu Tyr Pro545 550 555 560Glu Ala Glu Ile Glu Asn Leu Ala Ile Asn Phe Met Asp Ile Arg Lys 565 570 575Arg Ile Cys Asn Phe Pro Lys Leu Gly Thr Lys Ala Ile Ser Val Ala 580 585 590Ile Pro Thr Thr Ala Gly Thr Gly Ser Glu Ala Thr Pro Phe Ala Val 595 600 605Ile Thr Asn Asp Glu Thr Gly Met Lys Tyr Pro Leu Thr Ser Tyr Glu 610 615 620Leu Thr Pro Asn Met Ala Ile Ile Asp Thr Glu Leu Met Leu Asn Met625 630 635 640Pro Arg Lys Leu Thr Ala Ala Thr Gly Ile Asp Ala Leu Val His Ala 645 650 655Ile Glu Ala Tyr Val Ser Val Met Ala Thr Asp Tyr Thr Asp Glu Leu 660 665 670Ala Leu Arg Ala Ile Lys Met Ile Phe Lys Tyr Leu Pro Arg Ala Tyr 675 
680 685Lys Asn Gly Thr Asn Asp Ile Glu Ala Arg Glu Lys Met Ala His Ala 690 695 700Ser Asn Ile Ala Gly Met Ala Phe Ala Asn Ala Phe Leu Gly Val Cys705 710 715 720His Ser Met Ala His Lys Leu Gly Ala Met His His Val Pro His Gly 725 730 735Ile Ala Cys Ala Val Leu Ile Glu Glu Val Ile Lys Tyr Asn Ala Thr 740 745 750Asp Cys Pro Thr Lys Gln Thr Ala Phe Pro Gln Tyr Lys Ser Pro Asn 755 760 765Ala Lys Arg Lys Tyr Ala Glu Ile Ala Glu Tyr Leu Asn Leu Lys Gly 770 775 780Thr Ser Asp Thr Glu Lys Val Thr Ala Leu Ile Glu Ala Ile Ser Lys785 790 795 800Leu Lys Ile Asp Leu Ser Ile Pro Gln Asn Ile Ser Ala Ala Gly Ile 805 810 815Asn Lys Lys Asp Phe Tyr Asn Thr Leu Asp Lys Met Ser Glu Leu Ala 820 825 830Phe Asp Asp Gln Cys Thr Thr Ala Asn Pro Arg Tyr Pro Leu Ile Ser 835 840 845Glu Leu Lys Asp Ile Tyr Ile Lys Ser Phe 850 85535862PRTClostridium acetobutylicum 35Met Lys Val Thr Thr Val Lys Glu Leu Asp Glu Lys Leu Lys Val Ile1 5 10 15Lys Glu Ala Gln Lys Lys Phe Ser Cys Tyr Ser Gln Glu Met Val Asp 20 25 30Glu Ile Phe Arg Asn Ala Ala Met Ala Ala Ile Asp Ala Arg Ile Glu 35 40 45Leu Ala Lys Ala Ala Val Leu Glu Thr Gly Met Gly Leu Val Glu Asp 50 55 60Lys Val Ile Lys Asn His Phe Ala Gly Glu Tyr Ile Tyr Asn Lys Tyr65 70 75 80Lys Asp Glu Lys Thr Cys Gly Ile Ile Glu Arg Asn Glu Pro Tyr Gly 85 90 95Ile Thr Lys Ile Ala Glu Pro Ile Gly Val Val Ala Ala Ile Ile Pro 100 105 110Val Thr Asn Pro Thr Ser Thr Thr Ile Phe Lys Ser Leu Ile Ser Leu 115 120 125Lys Thr Arg Asn Gly Ile Phe Phe Ser Pro His Pro Arg Ala Lys Lys 130 135 140Ser Thr Ile Leu Ala Ala Lys Thr Ile Leu Asp Ala Ala Val Lys Ser145 150 155 160Gly Ala Pro Glu Asn Ile Ile Gly Trp Ile Asp Glu Pro Ser Ile Glu 165 170 175Leu Thr Gln Tyr Leu Met Gln Lys Ala Asp Ile Thr Leu Ala Thr Gly 180 185 190Gly Pro Ser Leu Val Lys Ser Ala Tyr Ser Ser Gly Lys Pro Ala Ile 195 200 205Gly Val Gly Pro Gly Asn Thr
Pro Val Ile Ile Asp Glu Ser Ala His 210 215 220Ile Lys Met Ala Val Ser Ser Ile Ile Leu Ser Lys Thr Tyr Asp Asn225 230 235 240Gly Val Ile Cys Ala Ser Glu Gln Ser Val Ile Val Leu Lys Ser Ile 245 250 255Tyr Asn Lys Val Lys Asp Glu Phe Gln Glu Arg Gly Ala Tyr Ile Ile 260 265 270Lys Lys Asn Glu Leu Asp Lys Val Arg Glu Val Ile Phe Lys Asp Gly 275 280 285Ser Val Asn Pro Lys Ile Val Gly Gln Ser Ala Tyr Thr Ile Ala Ala 290 295 300Met Ala Gly Ile Lys Val Pro Lys Thr Thr Arg Ile Leu Ile Gly Glu305 310 315 320Val Thr Ser Leu Gly Glu Glu Glu Pro Phe Ala His Glu Lys Leu Ser 325 330 335Pro Val Leu Ala Met Tyr Glu Ala Asp Asn Phe Asp Asp Ala Leu Lys 340 345 350Lys Ala Val Thr Leu Ile Asn Leu Gly Gly Leu Gly His Thr Ser Gly 355 360 365Ile Tyr Ala Asp Glu Ile Lys Ala Arg Asp Lys Ile Asp Arg Phe Ser 370 375 380Ser Ala Met Lys Thr Val Arg Thr Phe Val Asn Ile Pro Thr Ser Gln385 390 395 400Gly Ala Ser Gly Asp Leu Tyr Asn Phe Arg Ile Pro Pro Ser Phe Thr 405 410 415Leu Gly Cys Gly Phe Trp Gly Gly Asn Ser Val Ser Glu Asn Val Gly 420 425 430Pro Lys His Leu Leu Asn Ile Lys Thr Val Ala Glu Arg Arg Glu Asn 435 440 445Met Leu Trp Phe Arg Val Pro His Lys Val Tyr Phe Lys Phe Gly Cys 450 455 460Leu Gln Phe Ala Leu Lys Asp Leu Lys Asp Leu Lys Lys Lys Arg Ala465 470 475 480Phe Ile Val Thr Asp Ser Asp Pro Tyr Asn Leu Asn Tyr Val Asp Ser 485 490 495Ile Ile Lys Ile Leu Glu His Leu Asp Ile Asp Phe Lys Val Phe Asn 500 505 510Lys Val Gly Arg Glu Ala Asp Leu Lys Thr Ile Lys Lys Ala Thr Glu 515 520 525Glu Met Ser Ser Phe Met Pro Asp Thr Ile Ile Ala Leu Gly Gly Thr 530 535 540Pro Glu Met Ser Ser Ala Lys Leu Met Trp Val Leu Tyr Glu His Pro545 550 555 560Glu Val Lys Phe Glu Asp Leu Ala Ile Lys Phe Met Asp Ile Arg Lys 565 570 575Arg Ile Tyr Thr Phe Pro Lys Leu Gly Lys Lys Ala Met Leu Val Ala 580 585 590Ile Thr Thr Ser Ala Gly Ser Gly Ser Glu Val Thr Pro Phe Ala Leu 595 600 605Val Thr Asp Asn Asn Thr Gly Asn Lys Tyr Met Leu Ala Asp Tyr Glu 610 615 620Met Thr Pro Asn Met Ala Ile Val Asp Ala Glu Leu Met Met Lys 
Met625 630 635 640Pro Lys Gly Leu Thr Ala Tyr Ser Gly Ile Asp Ala Leu Val Asn Ser 645 650 655Ile Glu Ala Tyr Thr Ser Val Tyr Ala Ser Glu Tyr Thr Asn Gly Leu 660 665 670Ala Leu Glu Ala Ile Arg Leu Ile Phe Lys Tyr Leu Pro Glu Ala Tyr 675 680 685Lys Asn Gly Arg Thr Asn Glu Lys Ala Arg Glu Lys Met Ala His Ala 690 695 700Ser Thr Met Ala Gly Met Ala Ser Ala Asn Ala Phe Leu Gly Leu Cys705 710 715 720His Ser Met Ala Ile Lys Leu Ser Ser Glu His Asn Ile Pro Ser Gly 725 730 735Ile Ala Asn Ala Leu Leu Ile Glu Glu Val Ile Lys Phe Asn Ala Val 740 745 750Asp Asn Pro Val Lys Gln Ala Pro Cys Pro Gln Tyr Lys Tyr Pro Asn 755 760 765Thr Ile Phe Arg Tyr Ala Arg Ile Ala Asp Tyr Ile Lys Leu Gly Gly 770 775 780Asn Thr Asp Glu Glu Lys Val Asp Leu Leu Ile Asn Lys Ile His Glu785 790 795 800Leu Lys Lys Ala Leu Asn Ile Pro Thr Ser Ile Lys Asp Ala Gly Val 805 810 815Leu Glu Glu Asn Phe Tyr Ser Ser Leu Asp Arg Ile Ser Glu Leu Ala 820 825 830Leu Asp Asp Gln Cys Thr Gly Ala Asn Pro Arg Phe Pro Leu Thr Ser 835 840 845Glu Ile Lys Glu Met Tyr Ile Asn Cys Phe Lys Lys Gln Pro 850 855 86036389PRTClostridium acetobutylicum 36Met Leu Ser Phe Asp Tyr Ser Ile Pro Thr Lys Val Phe Phe Gly Lys1 5 10 15Gly Lys Ile Asp Val Ile Gly Glu Glu Ile Lys Lys Tyr Gly Ser Arg 20 25 30Val Leu Ile Val Tyr Gly Gly Gly Ser Ile Lys Arg Asn Gly Ile Tyr 35 40 45Asp Arg Ala Thr Ala Ile Leu Lys Glu Asn Asn Ile Ala Phe Tyr Glu 50 55 60Leu Ser Gly Val Glu Pro Asn Pro Arg Ile Thr Thr Val Lys Lys Gly65 70 75 80Ile Glu Ile Cys Arg Glu Asn Asn Val Asp Leu Val Leu Ala Ile Gly 85 90 95Gly Gly Ser Ala Ile Asp Cys Ser Lys Val Ile Ala Ala Gly Val Tyr 100 105 110Tyr Asp Gly Asp Thr Trp Asp Met Val Lys Asp Pro Ser Lys Ile Thr 115 120 125Lys Val Leu Pro Ile Ala Ser Ile Leu Thr Leu Ser Ala Thr Gly Ser 130 135 140Glu Met Asp Gln Ile Ala Val Ile Ser Asn Met Glu Thr Asn Glu Lys145 150 155 160Leu Gly Val Gly His Asp Asp Met Arg Pro Lys Phe Ser Val Leu Asp 165 170 175Pro Thr Tyr Thr Phe Thr Val Pro Lys Asn Gln Thr Ala Ala Gly Thr 180 185 190Ala Asp 
Ile Met Ser His Thr Phe Glu Ser Tyr Phe Ser Gly Val Glu 195 200 205Gly Ala Tyr Val Gln Asp Gly Ile Ala Glu Ala Ile Leu Arg Thr Cys 210 215 220Ile Lys Tyr Gly Lys Ile Ala Met Glu Lys Thr Asp Asp Tyr Glu Ala225 230 235 240Arg Ala Asn Leu Met Trp Ala Ser Ser Leu Ala Ile Asn Gly Leu Leu 245 250 255Ser Leu Gly Lys Asp Arg Lys Trp Ser Cys His Pro Met Glu His Glu 260 265 270Leu Ser Ala Tyr Tyr Asp Ile Thr His Gly Val Gly Leu Ala Ile Leu 275 280 285Thr Pro Asn Trp Met Glu Tyr Ile Leu Asn Asp Asp Thr Leu His Lys 290 295 300Phe Val Ser Tyr Gly Ile Asn Val Trp Gly Ile Asp Lys Asn Lys Asp305 310 315 320Asn Tyr Glu Ile Ala Arg Glu Ala Ile Lys Asn Thr Arg Glu Tyr Phe 325 330 335Asn Ser Leu Gly Ile Pro Ser Lys Leu Arg Glu Val Gly Ile Gly Lys 340 345 350Asp Lys Leu Glu Leu Met Ala Lys Gln Ala Val Arg Asn Ser Gly Gly 355 360 365Thr Ile Gly Ser Leu Arg Pro Ile Asn Ala Glu Asp Val Leu Glu Ile 370 375 380Phe Lys Lys Ser Tyr38537390PRTClostridium acetobutylicum 37Met Val Asp Phe Glu Tyr Ser Ile Pro Thr Arg Ile Phe Phe Gly Lys1 5 10 15Asp Lys Ile Asn Val Leu Gly Arg Glu Leu Lys Lys Tyr Gly Ser Lys 20 25 30Val Leu Ile Val Tyr Gly Gly Gly Ser Ile Lys Arg Asn Gly Ile Tyr 35 40 45Asp Lys Ala Val Ser Ile Leu Glu Lys Asn Ser Ile Lys Phe Tyr Glu 50 55 60Leu Ala Gly Val Glu Pro Asn Pro Arg Val Thr Thr Val Glu Lys Gly65 70 75 80Val Lys Ile Cys Arg Glu Asn Gly Val Glu Val Val Leu Ala Ile Gly 85 90 95Gly Gly Ser Ala Ile Asp Cys Ala Lys Val Ile Ala Ala Ala Cys Glu 100 105 110Tyr Asp Gly Asn Pro Trp Asp Ile Val Leu Asp Gly Ser Lys Ile Lys 115 120 125Arg Val Leu Pro Ile Ala Ser Ile Leu Thr Ile Ala Ala Thr Gly Ser 130 135 140Glu Met Asp Thr Trp Ala Val Ile Asn Asn Met Asp Thr Asn Glu Lys145 150 155 160Leu Ile Ala Ala His Pro Asp Met Ala Pro Lys Phe Ser Ile Leu Asp 165 170 175Pro Thr Tyr Thr Tyr Thr Val Pro Thr Asn Gln Thr Ala Ala Gly Thr 180 185 190Ala Asp Ile Met Ser His Ile Phe Glu Val Tyr Phe Ser Asn Thr Lys 195 200 205Thr Ala Tyr Leu Gln Asp Arg Met Ala Glu Ala Leu Leu Arg Thr Cys 210 215 220Ile 
Lys Tyr Gly Gly Ile Ala Leu Glu Lys Pro Asp Asp Tyr Glu Ala225 230 235 240Arg Ala Asn Leu Met Trp Ala Ser Ser Leu Ala Ile Asn Gly Leu Leu 245 250 255Thr Tyr Gly Lys Asp Thr Asn Trp Ser Val His Leu Met Glu His Glu 260 265 270Leu Ser Ala Tyr Tyr Asp Ile Thr His Gly Val Gly Leu Ala Ile Leu 275 280 285Thr Pro Asn Trp Met Glu Tyr Ile Leu Asn Asn Asp Thr Val Tyr Lys 290 295 300Phe Val Glu Tyr Gly Val Asn Val Trp Gly Ile Asp Lys Glu Lys Asn305 310 315 320His Tyr Asp Ile Ala His Gln Ala Ile Gln Lys Thr Arg Asp Tyr Phe 325 330 335Val Asn Val Leu Gly Leu Pro Ser Arg Leu Arg Asp Val Gly Ile Glu 340 345 350Glu Glu Lys Leu Asp Ile Met Ala Lys Glu Ser Val Lys Leu Thr Gly 355 360 365Gly Thr Ile Gly Asn Leu Arg Pro Val Asn Ala Ser Glu Val Leu Gln 370 375 380Ile Phe Lys Lys Ser Val385 39038336PRTClostridium acetobutylicum 38Met Asn Lys Ala Asp Tyr Lys Gly Val Trp Val Phe Ala Glu Gln Arg1 5 10 15Asp Gly Glu Leu Gln Lys Val Ser Leu Glu Leu Leu Gly Lys Gly Lys 20 25 30Glu Met Ala Glu Lys Leu Gly Val Glu Leu Thr Ala Val Leu Leu Gly 35 40 45His Asn Thr Glu Lys Met Ser Lys Asp Leu Leu Ser His Gly Ala Asp 50 55 60Lys Val Leu Ala Ala Asp Asn Glu Leu Leu Ala His Phe Ser Thr Asp65 70 75 80Gly Tyr Ala Lys Val Ile Cys Asp Leu Val Asn Glu Arg Lys Pro Glu 85 90 95Ile Leu Phe Ile Gly Ala Thr Phe Ile Gly Arg Asp Leu Gly Pro Arg 100 105 110Ile Ala Ala Arg Leu Ser Thr Gly Leu Thr Ala Asp Cys Thr Ser Leu 115 120 125Asp Ile Asp Val Glu Asn Arg Asp Leu Leu Ala Thr Arg Pro Ala Phe 130 135 140Gly Gly Asn Leu Ile Ala Thr Ile Val Cys Ser Asp His Arg Pro Gln145 150 155 160Met Ala Thr Val Arg Pro Gly Val Phe Glu Lys Leu Pro Val Asn Asp 165 170 175Ala Asn Val Ser Asp Asp Lys Ile Glu Lys Val Ala Ile Lys Leu Thr 180 185 190Ala Ser Asp Ile Arg Thr Lys Val Ser Lys Val Val Lys Leu Ala Lys 195 200 205Asp Ile Ala Asp Ile Gly Glu Ala Lys Val Leu Val Ala Gly Gly Arg 210 215 220Gly Val Gly Ser Lys Glu Asn Phe Glu Lys Leu Glu Glu Leu Ala Ser225 230 235 240Leu Leu Gly Gly Thr Ile Ala Ala Ser Arg Ala Ala Ile Glu Lys Glu 
245 250 255Trp Val Asp Lys Asp Leu Gln Val Gly Gln Thr Gly Lys Thr Val Arg 260 265 270Pro Thr Leu Tyr Ile Ala Cys Gly Ile Ser Gly Ala Ile Gln His Leu 275 280 285Ala Gly Met Gln Asp Ser Asp Tyr Ile Ile Ala Ile Asn Lys Asp Val 290 295 300Glu Ala Pro Ile Met Lys Val Ala Asp Leu Ala Ile Val Gly Asp Val305 310 315 320Asn Lys Val Val Pro Glu Leu Ile Ala Gln Val Lys Ala Ala Asn Asn 325 330 33539259PRTClostridium acetobutylicum 39Met Asn Ile Val Val Cys Leu Lys Gln Val Pro Asp Thr Ala Glu Val1 5 10 15Arg Ile Asp Pro Val Lys Gly Thr Leu Ile Arg Glu Gly Val Pro Ser 20 25 30Ile Ile Asn Pro Asp Asp Lys Asn Ala Leu Glu Glu Ala Leu Val Leu 35 40 45Lys Asp Asn Tyr Gly Ala His Val Thr Val Ile Ser Met Gly Pro Pro 50 55 60Gln Ala Lys Asn Ala Leu Val Glu Ala Leu Ala Met Gly Ala Asp Glu65 70 75 80Ala Val Leu Leu Thr Asp Arg Ala Phe Gly Gly Ala Asp Thr Leu Ala 85 90 95Thr Ser His Thr Ile Ala Ala Gly Ile Lys Lys Leu Lys Tyr Asp Ile 100 105 110Val Phe Ala Gly Arg Gln Ala Ile Asp Gly Asp Thr Ala Gln Val Gly 115 120 125Pro Glu Ile Ala Glu His Leu Gly Ile Pro Gln Val Thr Tyr Val Glu 130 135 140Lys Val Glu Val Asp Gly Asp Thr Leu Lys Ile Arg Lys Ala Trp Glu145 150 155 160Asp Gly Tyr Glu Val Val Glu Val Lys Thr Pro Val Leu Leu Thr Ala 165 170 175Ile Lys Glu Leu Asn Val Pro Arg Tyr Met Ser Val Glu Lys Ile Phe 180 185 190Gly Ala Phe Asp Lys Glu Val Lys Met Trp Thr Ala Asp Asp Ile Asp 195 200 205Val Asp Lys Ala Asn Leu Gly Leu Lys Gly Ser Pro Thr Lys Val Lys 210 215 220Lys Ser Ser Thr Lys Glu Val Lys Gly Gln Gly Glu Val Ile Asp Lys225 230 235 240Pro Val Lys Glu Ala Ala Ala Tyr Val Val Ser Lys Leu Lys Glu Glu 245 250 255His Tyr Ile405976DNAArtificialplasmid YEplac112PtdhTadh 40gcccggggga tccactagtt ctagaatccg tcgaaactaa gttctggtgt tttaaaacta 60aaaaaaagac taactataaa agtagaattt aagaagttta agaaatagat ttacagaatt 120acaatcaata cctaccgtct ttatatactt attagtcaag taggggaata atttcaggga 180actggtttca accttttttt tcagcttttt ccaaatcaga gagagcagaa ggtaatagaa 240ggtgtaagaa aatgagatag atacatgcgt gggtcaattg 
ccttgtgtca tcatttactc 300caggcaggtt gcatcactcc attgaggttg tgcccgtttt ttgcctgttt gtgcccctgt 360tctctgtagt tgcgctaaga gaatggacct atgaactgat ggttggtgaa gaaaacaata 420ttttggtgct gggattcttt ttttttctgg atgccagctt aaaaagcggg ctccattata 480tttagtggat gccaggaata aactgttcac ccagacacct acgatgttat atattctgtg 540taacccgccc cctattttgg gcatgtacgg gttacagcag aattaaaagg ctaatttttt 600gactaaataa agttaggaaa atcactacta ttaattattt acgtattctt tgaaatggcg 660agtattgata atgataaact gagctcgaat tcactggccg tcgttttaca acgtcgtgac 720tgggaaaacc ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc 780tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat 840ggcgaatggc gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc 900atatatcgga tcgtacttgt tacccatcat tgaattttga acatccgaac ctgggagttt 960tccctgaaac agatagtata tttgaacctg tataataata tatagtctag cgctttacgg 1020aagacaatgt atgtatttcg gttcctggag aaactattgc atctattgca taggtaatct 1080tgcacgtcgc atccccggtt cattttctgc gtttccatct tgcacttcaa tagcatatct 1140ttgttaacga agcatctgtg cttcattttg tagaacaaaa atgcaacgcg agagcgctaa 1200tttttcaaac aaagaatctg agctgcattt ttacagaaca gaaatgcaac gcgaaagcgc 1260tattttacca acgaagaatc tgtgcttcat ttttgtaaaa caaaaatgca acgcgagagc 1320gctaattttt caaacaaaga atctgagctg catttttaca gaacagaaat gcaacgcgag 1380agcgctattt taccaacaaa gaatctatac ttcttttttg ttctacaaaa atgcatcccg 1440agagcgctat ttttctaaca aagcatctta gattactttt tttctccttt gtgcgctcta 1500taatgcagtc tcttgataac tttttgcact gtaggtccgt taaggttaga agaaggctac 1560tttggtgtct attttctctt ccataaaaaa agcctgactc cacttcccgc gtttactgat 1620tactagcgaa gctgcgggtg cattttttca agataaaggc atccccgatt atattctata 1680ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata gcgttgatga ttcttcattg 1740gtcagaaaat tatgaacggt ttcttctatt ttgtctctat atactacgta taggaaatgt 1800ttacattttc gtattgtttt cgattcactc tatgaatagt tcttactaca atttttttgt 1860ctaaagagta atactagaga taaacataaa aaatgtagag gtcgagttta gatgcaagtt 1920caaggagcga aaggtggatg ggtaggttat atagggatat agcacagaga tatatagcaa 1980agagatactt ttgagcaatg 
tttgtggaag cggtattcgc aatattttag tagctcgtta 2040cagtccggtg cgtttttggt tttttgaaag tgcgtcttca gagcgctttt ggttttcaaa 2100agcgctctga agttcctata ctttctagct agagaatagg aacttcggaa taggaacttc 2160aaagcgtttc cgaaaacgag cgcttccgaa aatgcaacgc gagctgcgca catacagctc 2220actgttcacg tcgcacctat atctgcgtgt tgcctgtata tatatataca tgagaagaac 2280ggcatagtgc gtgtttatgc ttaaatgcgt acttatatgc gtctatttat gtaggatgaa 2340aggtagtcta gtacctcctg tgatattatc ccattccatg cggggtatcg tatgcttcct 2400tcagcactac cctttagctg ttctatatgc tgccactcct caattggatt agtctcatcc 2460ttcaatgcta tcatttcctt tgatattgga tcgatccgat gataagctgt caaacatgag 2520aattgatctt ttatgcttgc ttttcaaaag gcttgcaggc aagtgcacaa acaatactta 2580aataaatact actcagtaat
aacctatttc ttagcatttt tgacgaaatt tgctattttg 2640ttagagtctt ttacaccatt tgtctccaca cctccgctta catcaacacc aataacgcca 2700tttaatctaa gcgcatcacc aacattttct ggcgtcagtc caccagctaa cataaaatgt 2760aagctctcgg ggctctcttg ccttccaacc cagtcagaaa tcgagttcca atccaaaagt 2820tcacctgtcc cacctgcttc tgaatcaaac aagggaataa acgaatgagg tttctgtgaa 2880gctgcactga gtagtatgtt gcagtctttt ggaaatacga gtcttttaat aactggcaaa 2940ccgaggaact cttggtattc ttgccacgac tcatctccat gcagttggac gatatcaatg 3000ccgtaatcat tgaccagagc caaaacatcc tccttaggtt gattacgaaa cacgccaacc 3060aagtatttcg gagtgcctga actattttta tatgctttta caagacttga aattttcctt 3120gcaataaccg ggtcaattgt tctctttcta ttgggcacac atataatacc cagcaagtca 3180gcatcggaat ctagtgcaca ttctgcggcc tctgtgctct gcaagccgca aactttcacc 3240aatggaccag aactacctgt gaaattaata acagacatac tccaagctgc ctttgtgtgc 3300ttaatcacgt atactcacgt gctcaatagt caccaatgcc ctccctcttg gccctctcct 3360tttctttttt cgaccgaatt aattcttgaa gacgaaaggg cctcgtgata cgcctatttt 3420tataggttaa tgtcatgata ataatggttt cttagacgtc aggtggcact tttcggggaa 3480atgtgcgcgg aacccctatt tgtttatttt tctaaataca ttcaaatatg tatccgctca 3540tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt atgagtattc 3600aacatttccg tgtcgccctt attccctttt ttgcggcatt ttgccttcct gtttttgctc 3660acccagaaac gctggtgaaa gtaaaagatg ctgaagatca gttgggtgca cgagtgggtt 3720acatcgaact ggatctcaac agcggtaaga tccttgagag ttttcgcccc gaagaacgtt 3780ttccaatgat gagcactttt aaagttctgc tatgtggcgc ggtattatcc cgtattgacg 3840ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg gttgagtact 3900caccagtcac agaaaagcat cttacggatg gcatgacagt aagagaatta tgcagtgctg 3960ccataaccat gagtgataac actgcggcca acttacttct gacaacgatc ggaggaccga 4020aggagctaac cgcttttttg cacaacatgg gggatcatgt aactcgcctt gatcgttggg 4080aaccggagct gaatgaagcc ataccaaacg acgagcgtga caccacgatg cctgtagcaa 4140tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct tcccggcaac 4200aattaataga ctggatggag gcggataaag ttgcaggacc acttctgcgc tcggcccttc 4260cggctggctg gtttattgct gataaatctg gagccggtga gcgtgggtct 
cgcggtatca 4320ttgcagcact ggggccagat ggtaagccct cccgtatcgt agttatctac acgacgggga 4380gtcaggcaac tatggatgaa cgaaatagac agatcgctga gataggtgcc tcactgatta 4440agcattggta actgtcagac caagtttact catatatact ttagattgat ttaaaacttc 4500atttttaatt taaaaggatc taggtgaaga tcctttttga taatctcatg accaaaatcc 4560cttaacgtga gttttcgttc cactgagcgt cagaccccgt agaaaagatc aaaggatctt 4620cttgagatcc tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac 4680cagcggtggt ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct 4740tcagcagagc gcagatacca aatactgtcc ttctagtgta gccgtagtta ggccaccact 4800tcaagaactc tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg 4860ctgccagtgg cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata 4920aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga 4980cctacaccga actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag 5040ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg 5100agcttccagg gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac 5160ttgagcgtcg atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca 5220acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg ttctttcctg 5280cgttatcccc tgattctgtg gataaccgta ttaccgcctt tgagtgagct gataccgctc 5340gccgcagccg aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa gagcgcccaa 5400tacgcaaacc gcctctcccc gcgcgttggc cgattcatta atgcagctgg cacgacaggt 5460ttcccgactg gaaagcgggc agtgagcgca acgcaattaa tgtgagttag ctcactcatt 5520aggcacccca ggctttacac tttatgcttc cggctcgtat gttgtgtgga attgtgagcg 5580gataacaatt tcacacagga aacagctatg accatgatta cgccaagctt aggcctgtgt 5640ggaagaacga ttacaacagg tgttgtcctc tgaggacata aaatacacac cgagattcat 5700caactcattg ctggagttag catatctaca attgggtgaa atggggagcg atttgcaggc 5760atttgctcgg catgccggta gaggtgtggt caataagagc gacctcatgc tatacctgag 5820aaagcaacct gacctacagg aaagagttac tcaagaataa gaattttcgt tttaaaacct 5880aagagtcact ttaaaatttg tatacactta ttttttttat aacttattta ataataaaaa 5940tcataaatca taagaaattc gctcgagtcg actgca 59764113286DNAArtificialpBOL34 41aagcttgcat gcctgcaggt 
cgacggcgcg ccgggcccgt ttaaacggcc ggccaaggtg 60agacgcgcat aaccgctaga gtactttgaa gaggaaacag caatagggtt gctaccagta 120taaatagaca ggtacataca acactggaaa tggttgtctg tttgagtacg ctttcaattc 180atttgggtgt gcactttatt atgttacaat atggaaggga actttacact tctcctatgc 240acatatatta attaaagtcc aatgctagta gagaaggggg gtaacacccc tccgcgctct 300tttccgattt ttttctaaac cgtggaatat ttcggatatc cttttgttgt ttccgggtgt 360acaatatgga cttcctcttt tctggcaacc aaacccatac atcgggattc ctataatacc 420ttcgttggtc tccctaacat gtaggtggcg gaggggagat atacaataga acagatacca 480gacaagacat aatgggctaa acaagactac accaattaca ctgcctcatt gatggtggta 540cataacgaac taatactgta gccctagact tgatagccat catcatatcg aagtttcact 600accctttttc catttgccat ctattgaagt aataataggc gcatgcaact tcttttcttt 660ttttttcttt tctctctccc ccgttgttgt ctcaccatat ccgcaatgac aaaaaaatga 720tggaagacac taaaggaaaa aattaacgac aaagacagca ccaacagatg tcgttgttcc 780agagctgatg aggggtatct cgaagcacac gaaacttttt ccttccttca ttcacgcaca 840ctactctcta atgagcaacg gtatacggcc ttccttccag ttacttgaat ttgaaataaa 900aaaaagtttg ctgtcttgct atcaagtata aatagacctg caattattaa tcttttgttt 960cctcgtcatt gttctcgttc cctttcttcc ttgtttcttt ttctgcacaa tatttcaagc 1020tataccaagc atacaatcaa ctatctcata tacaatgaag gaagttgtta ttgcttctgc 1080tgtcagaact gccattggtt cttacggtaa gtctttgaag gacgtcccag ctgtcgactt 1140gggtgctacc gccatcaagg aagctgtcaa gaaggctggt atcaagccag aagatgttaa 1200cgaagttatc ttaggtaacg ttttgcaagc tggtttaggt caaaacccag ctcgtcaagc 1260ttctttcaag gctggtttgc cagttgaaat tccagccatg accatcaaca aggtttgtgg 1320ttctggtttg agaactgttt ctttggctgc tcaaatcatc aaggctggtg acgctgatgt 1380catcattgct ggtggtatgg aaaacatgtc cagagctcca tacttggcta acaatgctag 1440atggggttac agaatgggta acgccaagtt cgtcgatgaa atgatcactg acggtttatg 1500ggacgctttc aacgactacc acatgggtat cactgctgaa aacattgctg aaagatggaa 1560catctccaga gaagaacaag atgaatttgc tttggcttct caaaagaagg ctgaagaagc 1620catcaaatct ggtcaattca aggacgaaat tgtcccagtt gtcatcaagg gtagaaaggg 1680tgaaaccgtt gtcgacaccg atgaacaccc aagattcggt tccaccattg aaggtttggc 
1740caagttgaaa ccagctttca agaaggatgg taccgtcact gctggtaacg cttccggttt 1800gaacgactgt gctgctgttt tggttatcat gtctgctgaa aaggccaagg aattgggtgt 1860caagccattg gccaagattg tctcctacgg ttctgctggt gttgacccag ccatcatggg 1920ttacggtcct ttctacgcta ccaaggctgc tatcgaaaag gctggttgga ccgttgacga 1980attggatttg attgaatcca acgaagcttt cgctgctcaa tctttggctg ttgccaagga 2040cttgaaattc gacatgaaca aggtcaacgt taacggtggt gccattgctt tgggtcaccc 2100aattggtgct tccggtgcca gaatcttggt tactttagtc cacgctatgc aaaagcgtga 2160tgccaagaag ggtttggcta ctctatgtat cggtggtggt caaggtactg ccatcttatt 2220ggaaaagtgt taggcccggg cataaagcaa tcttgatgag gataatgatt tttttttgaa 2280tatacataaa tactaccgtt tttctgctag attttgtgaa gacgtaaata agtacatatt 2340actttttaag ccaagacaag attaagcatt aactttaccc ttttctcttc taagtttcaa 2400tactagttat cactgtttaa aagttatggc gagaacgtcg gcggttaaaa tatattaccc 2460tgaacgtggt gaattgaagt tctaggatgg tttaaagatt tttccttttt gggaaataag 2520taaacaatat attgctgcct ttgcaaaacg cacataccca caatatgtga ctattggcaa 2580agaacgcatt atcctttgaa gaggtggata ctgatactaa gagagtctct attccggctc 2640cacttttagt ccagagatta cttgtcttct tacgtatcag aacaagaaag catttccaaa 2700gtaattgcat ttgcccttga gcagtatata tatactaaga agtttaaaca tttaaacgtg 2760tgtgtgcatt atatatatta aaaattaaga attagactaa ataaagtgtt tctaaaaaaa 2820tattaaagtt gaaatgtgcg tgttgtgaat tgtgctctat tagaataatt atgacttgtg 2880tgcgtttcat attttaaaat aggaaataac caagaaagaa aaagtaccat ccagagaaac 2940caattatatc aaatcaaata aaacaaccag cttcggtgtg tgtgtgtgtg tgaagctaag 3000agttgatgcc atttaatcta aaaattttaa ggtgtgtgtg tggataaaat attagaatga 3060caattcgaga tgaaatttta agcaaactct agtaggaaat aagcggctta ttcttgttgg 3120ctcctaattc tttttagtgt atcagttccc attgataaaa aaattaaaat taaaattaga 3180aaaattaaac cagaaaaatc aagttgatta aaatgtgaca aaaattatga ttaaatgcta 3240cttcaacagg agcccgggcc tatttggagt agtcgtagaa acccttacca gactttctac 3300ctaaccaacc agctctaacg tacttcttca ataaagtgtg aggtctgtac ttagagtcac 3360cggtttcaga gtataagaca tccatgatgg ccaaacagat atccaaaccg atgaagtcac 3420ctaattccaa tggacccatt gggtggttag 
cacccaattt catggccttg tcgatatctt 3480caacagaagc aataccttca gccaaaatac cgacagcttc gttgatcatt ggaatcaaga 3540ttctgttgac aacgaaacct ggagcttcag caacttcaac tgggtcctta ccaatggcaa 3600tggaagtttc cttgacagca tcgaaagttt cttgagaggt ggcaatacct ctgatgactt 3660cgaccaactt catgactgga gctgggttga agaagtgcat accgataacc ttgtctggtc 3720tcttggtagc agaagcaact tcagtgatgg acaaagaaga agtgttggaa gccaaaatgg 3780tttctggctt acagatgttg tccaaatcag caaagatttg cttcttgatg tccattcttt 3840caacggcagc ttcaatgacc aaatcacagt cagcagccat gttcaagtca acagtaccgg 3900agattctggt caagatttcg accttggtag cttcttcaat cttacccttc ttgaccaact 3960tggacaagtt cttgttgatg aaatccaaac cacggtcaac gaattcgtcc ttgatatctc 4020tcaaaacaac ttcgaaaccc ttggcagcga aagcttgagc aataccagaa cccatggtac 4080cggcaccaat gacacaaacc ttcttcattt tgatttagtg tttgtgtgtt gataagcagt 4140tgcttggttt tttatgaaaa atagctagaa ggaataaggg attacaagag agatgttaca 4200agaaagaagt aaaataaatt tgattaatat tgccattatc aaaagctatt tatatgttga 4260aatcgtggag atcatgtgtg ccagaaaagg ccacagtttc cggggagagg cataccttga 4320ggtggctagg aatcacggag acctcttgac ttgcagggta ggctagctag aattaagtga 4380ggtgacaagg tttccataca gttttgacct tgagacgttg ctacttacga tttgcagtat 4440gcaagtctca tgctgcaaac aaaagaggac cgctcaggta atcgctcaat tagtggacgt 4500tatcaggggc gggagaggcg aaagtggttt ttggtggtgt aagtaaaggt cgtccaaata 4560tgcaggtgtt tgggtgctat cctagtggaa gctcggatca gtagataacc cgcctagaag 4620cggtattttt cttttttttt cttccttctt tttcgtcatt atttcaaacg cttttgcgtc 4680aagtaatgaa tatctggcgg ttccgcggta atgcgacaat ttgtgatatg cactcttaaa 4740accccgccac gatgatcgca cgtgccggca tttatagacg acttttctgg ttgtcccgct 4800tcacggcaca tgcatgcatc aatgaccgaa ttcaggttgc tactaaccat tgtgttgtgt 4860tattgctgtg catgaggtgc tcaagtgccc gcggcatctg actagtggta actctagacg 4920gcttcgatgc agagagttcc tcaaaatttt tcttttcaat tgtttgcctg gtttccgcgg 4980cgtatatcag tttttggcga tatggtaacg cgatactcta cggcaccttc acggtagatg 5040tcttttttaa aagtgactgt taattccagg attgaaagga agtgtcgaat agtatagtat 5100gctttctagg ccggccgttt aaatgggccc gcggcccgtt taaacggccg gcccttccct 
5160tttacagtgc ttcggaaaag cacagcgttg tccaagggaa caatttttct tcaagttaat 5220gcataagaaa tatctttttt tatgtttagc taagtaaaag cagcttggag taaaaaaaaa 5280aatgagtaaa tttctcgatg gattagtttc tcacaggtaa cataacaaaa accaagaaaa 5340gcccgcttct gaaaactaca gttgacttgt atgctaaagg gccagactaa tgggaggaga 5400aaaagaaacg aatgtatatg ctcatttaca ctctatatca ccatatggag gataagttgg 5460gctgagcttc tgatccaatt tattctatcc attagttgct gatatgtccc accagccaac 5520acttgatagt atctactcgc cattcacttc cagcagcgcc agtagggttg ttgagcttag 5580taaaaatgtg cgcaccacaa gcctacatga ctccacgtca catgaaacca caccgtgggg 5640ccttgttgcg ctaggaatag gatatgcgac gaagacgctt ctgcttagta accacaccac 5700attttcaggg ggtcgatctg cttgcttcct ttactgtcac gagcggccca taatcgcgct 5760ttttttttaa aaggcgcgag acagcaaaca ggaagctcgg gtttcaacct tcggagtggt 5820cgcagatctg gagactggat ctttacaata cagtaaggca agccaccatc tgcttcttag 5880gtgcatgcga cggtatccac gtgcagaaca acatagtctg aagaaggggg ggaggagcat 5940gttcattctc tgtagcagta agagcttggt gataatgacc aaaactggag tctcgaaatc 6000atataaatag acaatatatt ttcacacaat gagatttgta gtacagttct attctctctc 6060ttgcataaat aagaaattca tcaagaactt ggtttgatat ttcaccaaca cacacaaaaa 6120acagtacttc actaaattta cacacaaaac aaaatggaat tgaacaacgt tatcttggaa 6180aaggaaggta aggttgccgt tgtcaccatc aacagaccaa aggctttgaa tgctttgaac 6240tctgacactt tgaaggaaat ggactacgtc attggtgaaa ttgaaaacga ttctgaagtt 6300ttggctgtca tcttgaccgg tgccggtgaa aagtctttcg ttgctggtgc tgatatctct 6360gaaatgaagg aaatgaacac cattgaaggt agaaagttcg gtatcttagg taacaaggtt 6420ttcagaagat tggaattgtt ggaaaagcca gtcattgctg ctgtcaacgg tttcgctttg 6480ggtggtggtt gtgaaattgc catgtcctgt gacatcagaa ttgcttcttc taacgctcgt 6540ttcggtcaac cagaagtcgg tctaggtatc actccaggtt tcggtggtac tcaaagatta 6600tccagattgg ttggtatggg tatggccaag caattgatct tcaccgctca aaacatcaag 6660gctgacgaag ctttgagaat tggtttagtc aacaaggttg ttgaaccatc tgaattgatg 6720aacactgcca aggaaattgc taacaagatc gtctccaacg ctccagttgc tgtcaaattg 6780tccaagcaag ccatcaacag aggtatgcaa tgtgatatcg acaccgcttt ggcctttgaa 6840tctgaagctt tcggtgaatg tttctccact 
gaagaccaaa aggatgctat gaccgctttc 6900atcgaaaaga gaaagattga aggtttcaag aacaggtgat gagcccgggc gcgaatttct 6960tatgatttat gatttttatt attaaataag ttataaaaaa aataagtgta tacaaatttt 7020aaagtgactc ttaggtttta aaacgaaaat tcttattctt gagtaactct ttcctgtagg 7080tcaggttgct ttctcaggta tagcatgagg tcgctcttat tgaccacacc tctaccggca 7140tgccgagcaa atgcctgcaa atcgctcccc atttcaccca attgtagata tgctaactcc 7200agcaatgagt tgatgaatct cggtgtgtat tttatgtcct cagaggacaa cacctgttgt 7260aatcgttctt ccacacggat ccacagccta gccttcagtt gggctctatc ttcatcgtca 7320ttcattgcat ctactagccc cttacctgag cttcaagacg ttatatcgct tttatgtatc 7380atgatcttat cttgagatat gaatacataa atatatttac tcaagtgtat acgtgcatgc 7440tttttttacg gtttaaacat ttaaatgggc cgctctagag gatccccggg taccgagctc 7500gggcccagcg ctactagttc cggtaatttg aaaacaaacc cggtctcgaa gcggagatcc 7560ggcgataatt accgcagaaa taaacccata cacgagacgt agaaccagcc gcacatggcc 7620ggagaaactc ctgcgagaat ttcgtaaact cgcgcgcatt gcatctgtat ttcctaatgc 7680ggcacttcca ggcctcgaga cctctgacat gcttttgaca ggaatagaca ttttcagaat 7740gttatccata tgcctttcgg gtttttttcc ttccttttcc atcatgaaaa atctctcgag 7800accgtttatc cattgctttt ttgttgtctt tttccctcgt tcacagaaag tctgaagaag 7860ctatagtaga actatgagct ttttttgttt ctgttttcct tttttttttt tttacctctg 7920tggaaattgt tactctcaca ctctttagtt cgtttgtttg ttttgtttat tccaattatg 7980accggtgacg aaacgtggtc gatggtgggt accgcttatg ctcccctcca ttagtttcga 8040ttatataaaa aggccaaata ttgtattatt ttcaaatgtc ctatcattat cgtctaacat 8100ctaatttctc ttaaattttt tctctttctt tcctataaca ccaatagtga aaatcttttt 8160ttcttctata tctacaaaaa cttttttttt ctatcaacct cgttgataaa ttttttcttt 8220aacaatcgtt aataattaat taattggaaa ataaccattt tttctctctt ttatacacac 8280attcaaaaga aagaaaaaaa atatacccca gctagttaaa gaaaatcatt gaaaagaata 8340agaagataag aaagatttaa ttatcaaaca atatcaatat gcctcaatcc tgggaagaac 8400tggccgctga taagcgcgcc cgcctcgcaa aaaccatccc tgatgaatgg aaagtccaga 8460cgctgcctgc ggaagacagc gttattgatt tcccaaagaa atcggggatc ctttcagagg 8520ccgaactgaa gatcacagag gcctccgctg cagatcttgt gtccaagctg gcggccggag 
8580agttgacctc ggtggaagtt acgctagcat tctgtaaacg ggcagcaatc gcccagcagt 8640taacaaactg cgcccacgag ttcttccctg acgccgctct cgcgcaggca agggaactcg 8700atgaatacta cgcaaagcac aagagacccg ttggtccact ccatggcctc cccatctctc 8760tcaaagacca gcttcgagtc aagggctacg aaacatcaat gggctacatc tcatggctaa 8820acaagtacga cgaaggggac tcggttctga caaccatgct ccgcaaagcc ggtgccgtct 8880tctacgtcaa gacctctgtc ccgcagaccc tgatggtctg cgagacagtc aacaacatca 8940tcgggcgcac cgtcaaccca cgcaacaaga actggtcgtg cggcggcagt tctggtggtg 9000agggtgcgat cgttgggatt cgtggtggcg tcatcggtgt aggaacggat atcggtggct 9060cgattcgagt gccggccgcg ttcaacttcc tgtacggtct aaggccgagt catgggcggc 9120tgccgtatgc aaagatggcg aacagcatgg agggtcagga gacggtgcac agcgttgtcg 9180ggccgattac gcactctgtt gaggacctcc gcctcttcac caaatccgtc ctcggtcagg 9240agccatggaa atacgactcc aaggtcatcc ccatgccctg gcgccagtcc gagtcggaca 9300ttattgcctc caagatcaag aacggcgggc tcaatatcgg ctactacaac ttcgacggca 9360atgtccttcc acaccctcct atcctgcgcg gcgtggaaac caccgtcgcc gcactcgcca 9420aagccggtca caccgtgacc ccgtggacgc catacaagca cgatttcggc cacgatctca 9480tctcccatat ctacgcggct gacggcagcg ccgacgtaat gcgcgatatc agtgcatccg 9540gcgagccggc gattccaaat atcaaagacc tactgaaccc gaacatcaaa gctgttaaca 9600tgaacgagct ctgggacacg catctccaga agtggaatta ccagatggag taccttgaga 9660aatggcggga ggctgaagaa aaggccggga aggaactgga cgccatcatc gcgccgatta 9720cgcctaccgc tgcggtacgg catgaccagt tccggtacta tgggtatgcc tctgtgatca 9780acctgctgga tttcacgagc gtggttgttc cggttacctt tgcggataag aacatcgata 9840agaagaatga gagtttcaag gcggttagtg agcttgatgc cctcgtgcag gaagagtatg 9900atccggaggc gtaccatggg gcaccggttg cagtgcaggt tatcggacgg agactcagtg 9960aagagaggac gttggcgatt gcagaggaag tggggaagtt gctgggaaat gtggtgactc 10020cataggtcga gaatttatac ttagataagt atgtacttac aggtatattt ctatgagata 10080ctgatgtata catgcatgat aatatttaaa cggttattag tgccgattgt cttgtgcgat 10140aatgacgttc ctatcaaagc aatacactta ccacctatta catgggccaa gaaaatattt 10200tcgaacttgt ttagaatatt agcacagagt atatgatgat atccgttaga ttatgcatga 10260ttcattccta caactttttc 
gtagcataag gattaattac ttggatgcca ataaaaaaaa 10320aaaacatcga gaaaatttca gcatgctcag aaacaattgc agtgtatcaa agtaaaaaaa 10380agattttcgc tacatgttcc ttttgaagaa agaaaatcat ggaacattag atttacaaaa 10440atttaaccac cgctgattaa cgattagacc gttaagcgca caacaggtta ttagtacaga 10500gaaagcattc tgtggtgttg ccccggactt tcttttgcga cataggtaaa tcgaatacca 10560tcatactatc ttttccaatg actccctaaa gaaagactct tcttcgatgt tgtatacgtt 10620ggagcatagg gcaagaattg tggcttgaga tgaattcact ggccgtcgtt ttacaacgtc 10680gtgactggga aaaccctggc gttacccaac ttaatcgcct tgcagcacat ccccctttcg 10740ccagctggcg taatagcgaa gaggcccgca ccgatcgccc ttcccaacag ttgcgcagcc 10800tgaatggcga atggcgcctg atgcggtatt ttctccttac gcatctgtgc ggtatttcac 10860accgcatatg gtgcactctc agtacaatct gctctgatgc cgcatagtta agccagcccc 10920gacacccgcc aacacccgct gacgcgccct gacgggcttg tctgctcccg gcatccgctt 10980acagacaagc tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 11040cgaaacgcgc gagacgaaag ggcctcgtga tacgcctatt tttataggtt aatgtcatga 11100taataatggt ttcttagacg tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta 11160tttgtttatt tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat 11220aaatgcttca ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc 11280ttattccctt ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga 11340aagtaaaaga tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca 11400acagcggtaa gatccttgag agttttcgcc ccgaagaacg ttttccaatg atgagcactt 11460ttaaagttct gctatgtggc gcggtattat cccgtattga cgccgggcaa gagcaactcg 11520gtcgccgcat acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc 11580atcttacgga tggcatgaca gtaagagaat
tatgcagtgc tgccataacc atgagtgata 11640acactgcggc caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt 11700tgcacaacat gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag 11760ccataccaaa cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca 11820aactattaac tggcgaacta cttactctag cttcccggca acaattaata gactggatgg 11880aggcggataa agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg 11940ctgataaatc tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag 12000atggtaagcc ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg 12060aacgaaatag acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag 12120accaagttta ctcatatata ctttagattg atttaaaact tcatttttaa tttaaaagga 12180tctaggtgaa gatccttttt gataatctca tgaccaaaat cccttaacgt gagttttcgt 12240tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat cctttttttc 12300tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg gtttgtttgc 12360cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga gcgcagatac 12420caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac tctgtagcac 12480cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt ggcgataagt 12540cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag cggtcgggct 12600gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc gaactgagat 12660acctacagcg tgagctatga gaaagcgcca cgcttcccga agggagaaag gcggacaggt 12720atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca gggggaaacg 12780cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt cgatttttgt 12840gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc tttttacggt 12900tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc cctgattctg 12960tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg 13020agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa ccgcctctcc 13080ccgcgcgttg gccgattcat taatgcagct ggcacgacag gtttcccgac tggaaagcgg 13140gcagtgagcg caacgcaatt aatgtgagtt agctcactca ttaggcaccc caggctttac 13200actttatgct tccggctcgt atgttgtgtg gaattgtgag cggataacaa tttcacacag 13260gaaacagcta tgaccatgat tacgcc 
13286

SEQ ID NO: 42; length: 16359; type: DNA; organism: Artificial; description: pBOL36

aagcttgcat gcctgcaggt cgacggcgcg ccgggcccgt ttaaacaatg gcaaactgag 60cacaacaata ccagtccgga tcaactggca ccatctctcc cgtagtctca tctaattttt 120cttccggatg aggttccaga tataccgcaa cacctttatt atggtttccc tgagggaata 180atagaatgtc ccattcgaaa tcaccaattc taaacctggg cgaattgtat ttcgggtttg 240ttaactcgtt ccagtcagga atgttccacg tgaagctatc ttccagcaaa gtctccactt 300cttcatcaaa ttgtgggaga atactcccaa tgctcttatc tatgggactt ccgggaaaca 360cagtaccgat acttcccaat tcgtcttcag agctcattgt ttgtttgaag agactaatca 420aagaatcgtt ttctcaaaaa aattaatatc ttaactgata gtttgatcaa aggggcaaaa 480cgtaggggca aacaaacgga aaaatcgttt ctcaaatttt ctgatgccaa gaactctaac 540cagtcttatc taaaaattgc cttatgatcc gtctctccgg ttacagcctg tgtaactgat 600taatcctgcc tttctaatca ccattctaat gttttaatta agggattttg tcttcattaa 660cggctttcgc tcataaaaat gttatgacgt tttgcccgca ggcgggaaac catccacttc 720acgagactga tctcctctgc cggaacaccg ggcatctcca acttataagt tggagaaata 780agagaatttc agattgagag aatgaaaaaa aaaaaaaaaa aaaaggcaga ggagagcata 840gaaatggggt tcactttttg gtaaagctat agcatgccta tcacatataa atagagtgcc 900agtagcgact tttttcacac tcgaaatact cttactactg ctctcttgtt gtttttatca 960cttcttgttt cttcttggta aatagaatat caagctacaa aaagcataca atcaactatc 1020aactattaac tatatcgtaa tacacaggcc ggccaaaatg aaggccaaat caaggcggga 1080agggacaacc aggacgtaaa gggtagcctc cccataacat aaactcaata aaatatatag 1140tcttcaactt gaaaaaggaa caagctcatg caaagaggtg gtacccgcac gccgaaatgc 1200atgcaagtaa cctattcaaa gtaatatctc atacatgttt catgagggta acaacatgcg 1260actgggtgag catatgttcc gctgatgtga tgtgcaagat aaacaagcaa gacagaaact 1320aacttcttct tcatgtaata aacacacccc gcgtttattt acctatcttt aaacttcaac 1380accttatatc ataactaata tttcttgaga taagcacact gcacccatac cttccttaaa 1440aacgtagctt ccagtttttg gtggttctgg cttccttccc gattccgccc gctaaacgca 1500taattttgtt gcctggtggc atttgcaaaa tgcataacct atgcatttaa aagattatgt 1560atgctcttct gacttttcgt gtgatgaggc tcgtggaaaa aatgaataat ttatgaattt 1620gagaacaatt ttgtgttgtt acggtatttt actatggaat aatcaatcaa ttgaggattt 1680tatgcaaata tcgtttgaat 
atttttccga ccctttgagt acttttcttc ataattgcat 1740aatattgtcc gctgcccgtt tttctgttag acggtgtctt gatctacttg ctatcgttca 1800acaccacctt attttctaac tatttttttt ttagctcatt tgaatcagct tatggtgatg 1860gcacattttt gcataaacct agctgtcctc gttgaacata ggaaaaaaaa atatataaac 1920aaggctcttt cactctcctt ggaatcagat ttgggtttgt tccctttatt ttcatatttc 1980ttgtcatatt cttttctcaa ttattatctt ctactcataa cctcacgcaa aataacacag 2040tcaaatcaat caaaatggac ttcaacttga ccagagaaca agaattggtc agacaaatgg 2100ttagagaatt tgctgaaaac gaagttaagc caattgctgc tgaaatcgat gaaactgaaa 2160gattcccaat ggaaaacgtc aagaagatgg gtcaatacgg tatgatgggt attccattct 2220ctaaggaata cggtggtgct ggtggtgacg tcttgtctta catcattgct gtcgaagaat 2280tgtccaaggt ttgtggtacc actggtgtca tcttatctgc tcacacttct ctatgtgcct 2340ccttgatcaa cgaacacggt actgaagaac aaaagcaaaa gtacttggtt ccattggcca 2400agggtgaaaa gattggtgcc tacggtttga ctgaaccaaa cgctggtact gactctggtg 2460ctcaacaaac tgttgccgtt ttggaaggtg accactacgt catcaacggt tccaagatct 2520tcatcaccaa cggtggtgtt gctgacacct ttgtcatctt cgctatgacc gatcgtacca 2580agggtaccaa gggtatctct gctttcatta ttgaaaaggg tttcaagggt ttctccatcg 2640gtaaggtcga acaaaagttg ggtatcagag cttcctctac cactgaattg gttttcgaag 2700acatgattgt tccagttgaa aacatgatcg gtaaggaagg taagggtttc ccaattgcca 2760tgaagacttt agatggtggt agaattggta ttgctgctca agctttgggt attgctgaag 2820gtgccttcaa cgaagctaga gcttacatga aggaaagaaa gcaattcggt agatctttgg 2880acaaattcca aggtttggct tggatgatgg ctgacatgga cgttgccatc gaatctgctc 2940gttacttggt ctacaaggct gcttacttga agcaagctgg tttgccatac accgtcgatg 3000ctgccagagc taagttgcac gctgccaacg ttgccatgga tgtcaccacc aaggctgtcc 3060aattattcgg tggttacggt tacaccaagg actacccagt tgaaagaatg atgagagatg 3120ctaagatcac tgaaatctac gaaggtactt ctgaagttca aaagttggtt atctccggta 3180agatcttcag ataggcccgg gcataaagca atcttgatga ggataatgat ttttttttga 3240atatacataa atactaccgt ttttctgcta gattttgtga agacgtaaat aagtacatat 3300tactttttaa gccaagacaa gattaagcat taactttacc cttttctctt ctaagtttca 3360atactagtta tcactgttta aaagttatgg cgagaacgtc ggcggttaaa 
atatattacc 3420ctgaacgtgg tgaattgaag ttctaggatg gtttaaagat ttttcctttt tgggaaataa 3480gtaaacaata tattgctgcc tttgcaaaac gcacataccc acaatatgtg actattggca 3540aagaacgcat tatcctttga agaggtggat actgatacta agagagtctc tattccggct 3600ccacttttag tccagagatt acttgtcttc ttacgtatca gaacaagaaa gcatttccaa 3660agtaattgca tttgcccttg agcagtatat atatactaag aagtttaaac atttaaacgg 3720ccggcctaga aagcatacta tactattcga cacttccttt caatcctgga attaacagtc 3780acttttaaaa aagacatcta ccgtgaaggt gccgtagagt atcgcgttac catatcgcca 3840aaaactgata tacgccgcgg aaaccaggca aacaattgaa aagaaaaatt ttgaggaact 3900ctctgcatcg aagccgtcta gagttaccac tagtcagatg ccgcgggcac ttgagcacct 3960catgcacagc aataacacaa cacaatggtt agtagcaacc tgaattcggt cattgatgca 4020tgcatgtgcc gtgaagcggg acaaccagaa aagtcgtcta taaatgccgg cacgtgcgat 4080catcgtggcg gggttttaag agtgcatatc acaaattgtc gcattaccgc ggaaccgcca 4140gatattcatt acttgacgca aaagcgtttg aaataatgac gaaaaagaag gaagaaaaaa 4200aaagaaaaat accgcttcta ggcgggttat ctactgatcc gagcttccac taggatagca 4260cccaaacacc tgcatatttg gacgaccttt acttacacca ccaaaaacca ctttcgcctc 4320tcccgcccct gataacgtcc actaattgag cgattacctg agcggtcctc ttttgtttgc 4380agcatgagac ttgcatactg caaatcgtaa gtagcaacgt ctcaaggtca aaactgtatg 4440gaaaccttgt cacctcactt aattctagct agcctaccct gcaagtcaag aggtctccgt 4500gattcctagc cacctcaagg tatgcctctc cccggaaact gtggcctttt ctggcacaca 4560tgatctccac gatttcaaca tataaatagc ttttgataat ggcaatatta atcaaattta 4620ttttacttct ttcttgtaac atctctcttg taatccctta ttccttctag ctatttttca 4680taaaaaacca agcaactgct tatcaacaca caaacactaa atcaaaatgg tcgatttcga 4740atactctatc ccaaccagaa tcttcttcgg taaggacaag atcaacgttt tgggtagaga 4800attgaagaaa tacggttcca aggttttgat tgtctacggt ggtggttcca tcaagagaaa 4860cggtatctac gacaaggctg tctccatttt ggaaaagaac tctatcaaat tctacgaatt 4920ggctggtgtt gaaccaaacc caagagttac caccgtcgaa aagggtgtca agatctgtcg 4980tgaaaacggt gttgaagttg ttttggccat cggtggtggt tctgccattg actgtgccaa 5040ggtcattgct gctgcctgtg aatacgatgg taacccatgg gacattgtct tggatggttc 5100taagatcaag cgtgtcttac 
caattgcttc catcttgact atcgctgcta ctggttctga 5160aatggacacc tgggctgtta tcaacaacat ggacactaac gaaaagttga ttgctgctca 5220cccagatatg gccccaaagt tctctatttt ggacccaacc tacacttaca ctgttccaac 5280caaccaaact gctgctggta ctgctgatat catgtctcac atctttgaag tttacttctc 5340caacaccaag accgcttact tgcaagacag aatggctgaa gctctattaa gaacctgtat 5400caagtacggt ggtattgctt tggaaaagcc agatgactac gaagccagag ctaacttgat 5460gtgggcttcc tctttggcta tcaacggttt attgacttac ggtaaggaca ccaactggtc 5520cgttcatttg atggaacacg aattgtctgc ttactacgat atcactcacg gtgtcggttt 5580ggccatcttg actccaaact ggatggaata cattttgaac aacgacactg tctacaagtt 5640cgtcgaatac ggtgttaacg tctggggtat tgacaaggaa aagaaccact acgacattgc 5700tcaccaagcc atccaaaaga ccagagacta tttcgtcaac gttttgggtt taccatccag 5760attaagagat gttggtattg aagaagaaaa attggatatc atggctaagg aatctgtcaa 5820attgactggt ggtaccattg gtaacttgag acctgttaac gcttctgaag ttttgcaaat 5880cttcaagaaa tctgtttagg cccgggctcc tgttgaagta gcatttaatc ataatttttg 5940tcacatttta atcaacttga tttttctggt ttaatttttc taattttaat tttaattttt 6000ttatcaatgg gaactgatac actaaaaaga attaggagcc aacaagaata agccgcttat 6060ttcctactag agtttgctta aaatttcatc tcgaattgtc attctaatat tttatccaca 6120cacacacctt aaaattttta gattaaatgg catcaactct tagcttcaca cacacacaca 6180caccgaagct ggttgtttta tttgatttga tataattggt ttctctggat ggtacttttt 6240ctttcttggt tatttcctat tttaaaatat gaaacgcaca caagtcataa ttattctaat 6300agagcacaat tcacaacacg cacatttcaa ctttaatatt tttttagaaa cactttattt 6360agtctaattc ttaattttta atatatataa tgcacacaca cgtttaaatg ggcccgcggc 6420ccgtttaaac ggccggccct tcccttttac agtgcttcgg aaaagcacag cgttgtccaa 6480gggaacaatt tttcttcaag ttaatgcata agaaatatct ttttttatgt ttagctaagt 6540aaaagcagct tggagtaaaa aaaaaaatga gtaaatttct cgatggatta gtttctcaca 6600ggtaacatag caaaaaccaa gaaaagcccg cttctgaaaa ctacagttga cttgtatgct 6660aaagggccag actaatggga ggagaaaaag aaacgaatgt atatgctcat ttacactcta 6720tatcaccata tggaggataa gttgggctga gcttctgatc caatttattc tatccattag 6780ttgctgatat gtcccaccag ccaacacttg atagtatcta ctcgccattc 
acttccagca 6840gcgccagtag ggttgttgag cttagtaaaa atgtgcgcac cacaagccta catgactcca 6900cgtcacatga aaccacaccg tggggccttg ttgcgctagg aataggatat gcgacgaaga 6960cgcttctgct tagtaaccac accacatttt cagggggtcg atctgcttgc ttcctttact 7020gtcacgagcg gcccataatc gcgctttttt tttaaaaggc gcgagacagc aaacaggaag 7080ctcgggtttc aaccttcgga gtggtcgcag atctggagac tggatcttta caatacagta 7140aggcaagcca ccatctgctt cttaggtgca tgcgacggta tccacgtgca gaacaacata 7200gtctgaagaa gggggggagg agcatgttca ttctctgtag cagtaagagc ttggtgataa 7260tgaccaaaac tggagtctcg aaatcatata aatagacaat atattttcac acaatgagat 7320ttgtagtaca gttctattct ctctcttgca taaataagaa attcatcaag aacttggttt 7380gatatttcac caacacacac aaaaaacagt acttcactaa atttacacac aaaacaaaat 7440gaaggttacc aaccaaaagg aattgaagca aaagttgaac gaattgagag aagctcaaaa 7500gaagttcgct acctacactc aagaacaagt tgacaagatc ttcaagcaat gtgccattgc 7560tgctgccaag gaacgtatca acttggccaa gttggctgtc gaagaaaccg gtattggttt 7620ggttgaagac aagatcatca agaaccactt cgctgctgaa tacatctaca acaagtacaa 7680gaacgaaaag acctgtggta tcatcgacca cgatgactct ttgggtatca ccaaggttgc 7740tgaaccaatc ggtattgtcg ccgccattgt cccaaccact aacccaactt ccactgccat 7800cttcaaatct ttgatctcct tgaagaccag aaacgctatc ttcttctccc cacacccaag 7860agccaagaag tccaccattg ctgctgccaa attaatcttg gatgctgctg ttaaggctgg 7920tgccccaaag aacattattg gttggatcga tgaaccttcc attgaattgt ctcaagactt 7980gatgtctgaa gctgatatca tcttggctac cggtggtcca tccatggtca aggccgctta 8040ctcttctggt aagccagcta ttggtgttgg tgctggtaac actccagcta tcatcgatga 8100atctgctgac attgacatgg ctgtctcctc cattatcttg tccaagactt atgacaacgg 8160tgtcatctgt gcctctgaac aatccatctt ggttatgaac tctatctacg aaaaggtcaa 8220ggaagaattt gttaagagag gttcctacat cttaaaccaa aatgaaattg ccaagatcaa 8280ggaaaccatg ttcaagaacg gtgccatcaa cgctgacatt gtcggtaaat ctgcttacat 8340cattgccaag atggctggta ttgaagttcc acaaaccact aagattttga tcggtgaagt 8400tcaatctgtc gaaaagtctg aattattctc tcacgaaaag ttgtctccag tcttggctat 8460gtacaaggtc aaggatttcg acgaagcttt gaagaaggct caaagattaa ttgaattagg 8520tggttctggt cacacctctt 
ctctatacat tgactctcaa aacaacaagg acaaggtcaa 8580ggaattcggt ctagctatga agacttccag aactttcatc aacatgccat cttctcaagg 8640tgcttctggt gatttgtaca actttgccat tgctccatct ttcactttag gttgtggtac 8700ctggggtggt aactctgttt ctcaaaacgt tgaaccaaag catttgctaa acatcaagtc 8760cgttgctgaa agaagagaaa acatgttgtg gttcaaggtt ccacaaaaga tctacttcaa 8820atacggttgt ttgagatttg ctttgaagga attgaaagat atgaacaaga agcgtgcttt 8880catcgttact gacaaggatt tgttcaaatt gggttacgtt aacaagatca ctaaggtttt 8940ggatgaaatt gatatcaagt actccatctt cactgatatc aaatctgacc caaccattga 9000ctccgtcaag aagggtgcta aggaaatgtt gaacttcgaa ccagatacca ttatctccat 9060tggtggtggt tctccaatgg atgctgccaa ggttatgcat ttgttgtacg aatacccaga 9120agctgaaatc gaaaacttgg ccatcaactt catggacatc agaaagagaa tctgtaactt 9180cccaaagttg ggtaccaagg ccatttctgt tgccattcca accaccgctg gtaccggttc 9240tgaagctact ccatttgctg tcatcaccaa cgacgaaacc ggtatgaagt acccattgac 9300ctcttacgaa ttgactccaa acatggccat cattgacact gaattgatgt tgaacatgcc 9360aagaaagttg actgctgcta ccggtattga cgctttagtc cacgctatcg aagcttacgt 9420ctccgttatg gccactgact acactgacga attggctttg agagctatca agatgatctt 9480caagtacttg ccaagagctt acaagaacgg tactaacgat atcgaagctc gtgaaaagat 9540ggctcacgct tccaacattg ctggtatggc tttcgctaac gctttcttgg gtgtttgtca 9600ctccatggcc cacaagttgg gtgctatgca ccacgttcct cacggtattg cttgtgctgt 9660tttgattgaa gaagtcatca agtacaacgc tactgactgt ccaaccaagc aaactgcttt 9720cccacaatac aagtctccaa acgccaagag aaagtacgct gaaattgctg aatacttgaa 9780cttgaaaggt acttctgaca ctgaaaaggt cactgcttta atcgaagcta tctccaagtt 9840gaagattgac ttatctattc ctcaaaacat ctctgctgct ggtattaaca agaaggactt 9900ctacaacact ttagacaaga tgtccgaatt ggctttcgat gaccaatgta ccaccgctaa 9960cccaagatac ccattgatct ctgaattgaa ggatatctac atcaagtcct tttaagcccg 10020ggcgcggatc tcttatgtct ttacgattta tagttttcat tatcaagtat gcctatatta 10080gtatatagca tctttagatg acagtgttcg aagtttcacg aataaaagat aatattctac 10140tttttgctcc caccgcgttt gctagcacga gtgaacacca tccctcgcct gtgagttgta 10200cccattcctc taaactgtag acatggtagc ttcagcagtg ttcgttatgt 
acggcatcct 10260ccaacaaaca gtcggttata gtttgtcctg ctcctctgaa tcgtctccct cgatatttct 10320cattttcctt cgcatgccag cattgaaatg atcgaagttc aatgatgaaa cggtaattct 10380tctgtcattt actcatctca tctcatcaag ttatataatt ctatacggat gtaatttttc 10440acttttcgtc ttgacgtcca ccctataatt tcaattattg aaccctcaca aatgatgcac 10500tgcaatgtac acaccctcat atagtttaaa catttaaatg ggccgctcta gaggatcccc 10560gggtaccgag ctcgggccca gcgctactag ttccggtaat ttgaaaacaa acccggtctc 10620gaagcggaga tccggcgata attaccgcag aaataaaccc atacacgaga cgtagaacca 10680gccgcacatg gccggagaaa ctcctgcgag aatttcgtaa actcgcgcgc attgcatctg 10740tatttcctaa tgcggcactt ccaggcctcg agacctctga catgcttttg acaggaatag 10800acattttcag aatgttatcc atatgccttt cgggtttttt tccttccttt tccatcatga 10860aaaatctctc gagaccgttt atccattgct tttttgttgt ctttttccct cgttcacaga 10920aagtctgaag aagctatagt agaactatga gctttttttg tttctgtttt cctttttttt 10980ttttttacct ctgtggaaat tgttactctc acactcttta gttcgtttgt ttgttttgtt 11040tattccaatt atgaccggtg acgaaacgtg gtcgatggtg ggtaccgctt atgctcccct 11100ccattagttt cgattatata aaaaggccaa atattgtatt attttcaaat gtcctatcat 11160tatcgtctaa catctaattt ctcttaaatt ttttctcttt ctttcctata acaccaatag 11220tgaaaatctt tttttcttct atatctacaa aaactttttt tttctatcaa cctcgttgat 11280aaattttttc tttaacaatc gttaataatt aattaattgg aaaataacca ttttttctct 11340cttttataca cacattcaaa agaaagaaaa aaaatatacc ccagctagtt aaagaaaatc 11400attgaaaaga ataagaagat aagaaagatt taattatcaa acaatatcaa tatgcctcaa 11460tcctgggaag aactggccgc tgataagcgc gcccgcctcg caaaaaccat ccctgatgaa 11520tggaaagtcc agacgctgcc tgcggaagac agcgttattg atttcccaaa gaaatcgggg 11580atcctttcag aggccgaact gaagatcaca gaggcctccg ctgcagatct tgtgtccaag 11640ctggcggccg gagagttgac ctcggtggaa gttacgctag cattctgtaa acgggcagca 11700atcgcccagc agttaacaaa ctgcgcccac gagttcttcc ctgacgccgc tctcgcgcag 11760gcaagggaac tcgatgaata ctacgcaaag cacaagagac ccgttggtcc actccatggc 11820ctccccatct ctctcaaaga ccagcttcga gtcaagggct acgaaacatc aatgggctac 11880atctcatggc taaacaagta cgacgaaggg gactcggttc tgacaaccat gctccgcaaa 
11940gccggtgccg tcttctacgt caagacctct gtcccgcaga ccctgatggt ctgcgagaca 12000gtcaacaaca tcatcgggcg caccgtcaac ccacgcaaca agaactggtc gtgcggcggc 12060agttctggtg gtgagggtgc gatcgttggg attcgtggtg gcgtcatcgg tgtaggaacg 12120gatatcggtg gctcgattcg agtgccggcc gcgttcaact tcctgtacgg tctaaggccg 12180agtcatgggc ggctgccgta tgcaaagatg gcgaacagca tggagggtca ggagacggtg 12240cacagcgttg tcgggccgat tacgcactct gttgaggacc tccgcctctt caccaaatcc 12300gtcctcggtc aggagccatg gaaatacgac tccaaggtca tccccatgcc ctggcgccag 12360tccgagtcgg acattattgc ctccaagatc aagaacggcg ggctcaatat cggctactac 12420aacttcgacg gcaatgtcct tccacaccct cctatcctgc gcggcgtgga aaccaccgtc 12480gccgcactcg ccaaagccgg tcacaccgtg accccgtgga cgccatacaa gcacgatttc 12540ggccacgatc tcatctccca tatctacgcg gctgacggca gcgccgacgt aatgcgcgat 12600atcagtgcat ccggcgagcc ggcgattcca aatatcaaag acctactgaa cccgaacatc 12660aaagctgtta acatgaacga gctctgggac acgcatctcc agaagtggaa ttaccagatg 12720gagtaccttg agaaatggcg ggaggctgaa gaaaaggccg ggaaggaact ggacgccatc 12780atcgcgccga ttacgcctac cgctgcggta cggcatgacc agttccggta ctatgggtat 12840gcctctgtga tcaacctgct ggatttcacg agcgtggttg ttccggttac ctttgcggat 12900aagaacatcg ataagaagaa tgagagtttc aaggcggtta gtgagcttga tgccctcgtg 12960caggaagagt atgatccgga ggcgtaccat ggggcaccgg ttgcagtgca ggttatcgga 13020cggagactca gtgaagagag gacgttggcg attgcagagg aagtggggaa gttgctggga 13080aatgtggtga ctccataggt cgagaattta tacttagata agtatgtact tacaggtata 13140tttctatgag atactgatgt atacatgcat gataatattt aaacggttat tagtgccgat 13200tgtcttgtgc gataatgacg ttcctatcaa agcaatacac ttaccaccta ttacatgggc 13260caagaaaata ttttcgaact tgtttagaat attagcacag agtatatgat gatatccgtt
13320agattatgca tgattcattc ctacaacttt ttcgtagcat aaggattaat tacttggatg 13380ccaataaaaa aaaaaaacat cgagaaaatt tcagcatgct cagaaacaat tgcagtgtat 13440caaagtaaaa aaaagatttt cgctacatgt tccttttgaa gaaagaaaat catggaacat 13500tagatttaca aaaatttaac caccgctgat taacgattag accgttaagc gcacaacagg 13560ttattagtac agagaaagca ttctgtggtg ttgccccgga ctttcttttg cgacataggt 13620aaatcgaata ccatcatact atcttttcca atgactccct aaagaaagac tcttcttcga 13680tgttgtatac gttggagcat agggcaagaa ttgtggcttg agatgaattc actggccgtc 13740gttttacaac gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca 13800catccccctt tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa 13860cagttgcgca gcctgaatgg cgaatggcgc ctgatgcggt attttctcct tacgcatctg 13920tgcggtattt cacaccgcat atggtgcact ctcagtacaa tctgctctga tgccgcatag 13980ttaagccagc cccgacaccc gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc 14040ccggcatccg cttacagaca agctgtgacc gtctccggga gctgcatgtg tcagaggttt 14100tcaccgtcat caccgaaacg cgcgagacga aagggcctcg tgatacgcct atttttatag 14160gttaatgtca tgataataat ggtttcttag acgtcaggtg gcacttttcg gggaaatgtg 14220cgcggaaccc ctatttgttt atttttctaa atacattcaa atatgtatcc gctcatgaga 14280caataaccct gataaatgct tcaataatat tgaaaaagga agagtatgag tattcaacat 14340ttccgtgtcg cccttattcc cttttttgcg gcattttgcc ttcctgtttt tgctcaccca 14400gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg gtgcacgagt gggttacatc 14460gaactggatc tcaacagcgg taagatcctt gagagttttc gccccgaaga acgttttcca 14520atgatgagca cttttaaagt tctgctatgt ggcgcggtat tatcccgtat tgacgccggg 14580caagagcaac tcggtcgccg catacactat tctcagaatg acttggttga gtactcacca 14640gtcacagaaa agcatcttac ggatggcatg acagtaagag aattatgcag tgctgccata 14700accatgagtg ataacactgc ggccaactta cttctgacaa cgatcggagg accgaaggag 14760ctaaccgctt ttttgcacaa catgggggat catgtaactc gccttgatcg ttgggaaccg 14820gagctgaatg aagccatacc aaacgacgag cgtgacacca cgatgcctgt agcaatggca 14880acaacgttgc gcaaactatt aactggcgaa ctacttactc tagcttcccg gcaacaatta 14940atagactgga tggaggcgga taaagttgca ggaccacttc tgcgctcggc ccttccggct 
15000ggctggttta ttgctgataa atctggagcc ggtgagcgtg ggtctcgcgg tatcattgca 15060gcactggggc cagatggtaa gccctcccgt atcgtagtta tctacacgac ggggagtcag 15120gcaactatgg atgaacgaaa tagacagatc gctgagatag gtgcctcact gattaagcat 15180tggtaactgt cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt 15240taatttaaaa ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa 15300cgtgagtttt cgttccactg agcgtcagac cccgtagaaa agatcaaagg atcttcttga 15360gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc gctaccagcg 15420gtggtttgtt tgccggatca agagctacca actctttttc cgaaggtaac tggcttcagc 15480agagcgcaga taccaaatac tgtccttcta gtgtagccgt agttaggcca ccacttcaag 15540aactctgtag caccgcctac atacctcgct ctgctaatcc tgttaccagt ggctgctgcc 15600agtggcgata agtcgtgtct taccgggttg gactcaagac gatagttacc ggataaggcg 15660cagcggtcgg gctgaacggg gggttcgtgc acacagccca gcttggagcg aacgacctac 15720accgaactga gatacctaca gcgtgagcta tgagaaagcg ccacgcttcc cgaagggaga 15780aaggcggaca ggtatccggt aagcggcagg gtcggaacag gagagcgcac gagggagctt 15840ccagggggaa acgcctggta tctttatagt cctgtcgggt ttcgccacct ctgacttgag 15900cgtcgatttt tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc cagcaacgcg 15960gcctttttac ggttcctggc cttttgctgg ccttttgctc acatgttctt tcctgcgtta 16020tcccctgatt ctgtggataa ccgtattacc gcctttgagt gagctgatac cgctcgccgc 16080agccgaacga ccgagcgcag cgagtcagtg agcgaggaag cggaagagcg cccaatacgc 16140aaaccgcctc tccccgcgcg ttggccgatt cattaatgca gctggcacga caggtttccc 16200gactggaaag cgggcagtga gcgcaacgca attaatgtga gttagctcac tcattaggca 16260ccccaggctt tacactttat gcttccggct cgtatgttgt gtggaattgt gagcggataa 16320caatttcaca caggaaacag ctatgaccat gattacgcc 16359

SEQ ID NO: 43; length: 8684; type: DNA; organism: Artificial; description: pBOL113

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accataccac agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc 240ggtttctttg aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact 
tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat 480ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca 540aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca 720aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg 900gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct 960ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac 1020ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga accgtggatg 1080atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca 1500agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact atagggcgaa ttgggtaccg ggccccccct cgaggtcgac 
ggtatcgata 2040agcttgatat cgaattcctg cagcccgggg gatccactag ttctagagcg gcccatttaa 2100acggccggcc ctagatcaga gggtggtaaa tgaagtgtaa tagtattcat ttttcttata 2160aatcatccct tccgtgattt atacaaaaga agaggagaat atgctgaata cttggtatat 2220tactctacat tatactctta tcttgacggg tattctgagc atcttactca gtttcaagat 2280cttttaatgt ccaaaaacat ttgagccgat ctaaatactt ctgtgttttc attaatttat 2340aaattgtact cttttaagac atggaaagta ccaacatcgg ttgaaacagt ttttcattta 2400cttatggttt attggttttt ccagtgaatg attatttgtc gttacccttt cgtaaaagtt 2460caaacacgtt tttaagtatt gtttagttgc tctttcgaca tatatgatta tccctgcgcg 2520gctaaagtta aggatgcaaa aaacataaga caactgaagt taatttacgt caattaagtt 2580ttccagggta atgatgtttt gggcttccac taattcaata agtatgtcat gaaatacgtt 2640gtgaagagca tccagaaata atgaaaagaa acaacgaaac tgggtcggcc tgttgtttct 2700tttctttacc acgtgatctg cggcatttac aggaagtcgc gcgttttgcg cagttgttgc 2760aacgcagcta cggctaacaa agcctagtgg aactcgactg atgtgttagg gcctaaaact 2820ggtggtgaca gctgaagtga actattcaat ccaatcatgt catggctgtc acaaagacct 2880tgcggaccgc acgtacgaac acatacgtat gctaatatgt gttttgatag tacccagtga 2940tcgcagacct gcaatttttt tgtaggtttg gaagaatata taaaggttgc actcattcaa 3000gatagttttt ttcttgtgtg tctattcatt ttattattgt ttgtttaaat gttaaaaaaa 3060ccaagaactt agtttcaaat taaattcatc acacaaacaa acaaaacaaa atgaacattg 3120ttgtttgttt gaagcaagtt ccagacactg ctgaagtcag aattgaccca gtcaagggta 3180ctttaatcag agaaggtgtt ccatctatca tcaacccaga cgacaagaac gctttggaag 3240aagctttggt tttgaaggac aactacggtg ctcacgttac cgtcatttcc atgggtccac 3300ctcaagccaa gaacgctttg gttgaagctt tggccatggg tgctgatgaa gctgtcttat 3360tgactgacag agctttcggt ggtgctgata ctttagctac ctctcacacc attgctgctg 3420gtatcaagaa attgaaatac gatatcgtct ttgccggtcg tcaagccatc gatggtgata 3480ccgctcaagt cggtccagaa attgctgaac atttgggtat tccacaagtc acctacgttg 3540aaaaggttga agttgacggt gacactttga agatcagaaa ggcttgggaa gacggttacg 3600aagttgttga agtcaagact ccagttctat tgactgccat caaggaattg aacgttccaa 3660gatacatgtc cgttgaaaag atcttcggtg ctttcgacaa ggaagtcaag atgtggactg 3720ctgatgatat cgatgtcgac 
aaggccaact tgggtttgaa aggttctcca accaaggtca 3780agaaatcttc taccaaggaa gtcaagggtc aaggtgaagt cattgacaaa ccagtcaagg 3840aagctgccgc ttacgttgtt tccaagttga aggaagaaca ctacatctaa agcccgggcg 3900gagattgata agacttttct agttgcatat cttttatatt taaatcttat ctattagtta 3960attttttgta atttatcctt atatatagtc tggttattct aaaatatcat ttcagtatct 4020aaaaattccc ctcttttttc agttatatct taacaggcga cagtccaaat gttgatttat 4080cccagtccga ttcatcaggg ttgtgaagca ttttgtcaat ggtcgaaatc acatcagtaa 4140tagtgcctct tacttgcctc atagaatttc tttctcttaa cgtcaccgtt tggtctttta 4200tagtttcgaa atctatggtg ataccaaatg gtgttcccaa ttcatcgtta cgggcgtatt 4260ttttaccaat tgaagtattg gaatcgtcaa ttttaaagta tatctctctt ttacgtaaag 4320cctgcgagat cctcttaagt atagcgggga agccatcgtt attcgatatt gtcgtaacaa 4380atactttgat cggcgctatg tttaaatgtt taaacatgga cagatatgcg atgaaaacgc 4440taagtgatac tccaaatggt gaaaggtacg atgcttggaa acaatacttg gaaatcaccg 4500gaaacaccat atgcggcgaa aagccaatta gtgtgatact aagtgcttta tcgaaaatcc 4560gtgatgccgg tccttcaggc atcaaatttc agtggcctaa ttattcacag agttctcatg 4620tgacaagtat tgatgatagt agtgtcagtt atgcttcagg ttatgttact ataggataat 4680gatcacggct aaaacggtcg aatgtaagca tatatctttc gattgtataa ttgttcccaa 4740atactacagc atctcaagga aaaaaaaaca aaaacttcca aaaaaatcga atccctgagg 4800aatctttaat acattttcaa tctatttaag ttttataaac gtgtatatga gatgtcatga 4860gcatgaatta ttaataataa aaactaaatc attaaagtaa cttaaggagt taaagcccgg 4920gctttaattg ttagcagcct tgacttgagc aatcaattct ggaacaacct tgttgacatc 4980accgacaatg gccaaatcag caaccttcat gattggagct tcgacatctt tgttgatggc 5040aatgatgtag tcagagtctt gcataccagc caagtgttgg atggcaccag agataccaca 5100agcaatgtac aaagttggtc tgacggtctt accggtttga ccgacttgca agtccttgtc 5160aacccattcc ttttcaatgg cagctctgga agcagcaatg gtaccaccca acaaagaagc 5220taattcttcc aatttttcga agttttcctt ggaaccaaca ccacgaccac cagcaaccaa 5280aaccttggct tcaccgatat cagcaatgtc cttggccaat ttgacaacct tggaaacctt 5340ggttctgata tcagaagcag tcaatttgat ggcaaccttt tcgatcttgt catcagaaac 5400gttagcatcg ttaactggca atttttcaaa gacacctggt ctgacggtgg 
ccatttgagg 5460tctgtggtca gaacagacaa tggtagcaat caagttacca ccgaaagctg gtctggtagc 5520caacaagtca cggttttcga catcgatatc caaagaggta cagtcagcag tcaaaccagt 5580agacaatctg gcagcaattc ttggacccaa gtctctaccg atgaaagtag caccgatgaa 5640taagatttct ggctttcttt cgttgaccaa gtcacagata accttggcgt aaccgtcagt 5700ggagaaatga gctaataatt cgttgtcagc agccaaaacc ttgtcagcac cgtgggacaa 5760caagtccttg gacatctttt cagtgttgtg acccaataag acagcagtca attcaacacc 5820caatttttca gccatttcct tacccttacc tagcaattcc aaagaaacct tttgtaattc 5880accatctctt tgttcagcga aaacccagac acccttgtag tcagccttgt tcatgtttag 5940ttaattatag ttcgttgacc gtatattcta aaaacaagta ctccttaaaa aaaaaccttg 6000aagggaataa acaagtagaa tagatagaga gaaaaataga aaatgcaaga gaatttatat 6060attagaaaga gagaaagaaa aatggaaaaa aaaaaatagg aaaagccaga aatagcacta 6120gaaggagcga caccagaaaa gaaggtgatg gaaccaattt agctatatat agttaactac 6180cggctcgatc atctctgcct ccagcatagt cgaagaagaa tttttttttt cttgaggctt 6240ctgtcagcaa ctcgtatttt ttctttcttt tttggtgagc ctaaaaagtt cccacgttct 6300cttgtacgac gccgtcacaa acaaccttat gggtaatttg tcgcggtctg ggtgtataaa 6360tgtgtgggtg caggccggcc gtttaaacgg gccgccaccg cggtggagct ccagcttttg 6420ttccctttag tgagggttaa ttgcgcgctt ggcgtaatca tggtcatagc tgtttcctgt 6480gtgaaattgt tatccgctca caattccaca caacatagga gccggaagca taaagtgtaa 6540agcctggggt gcctaatgag tgaggtaact cacattaatt gcgttgcgct cactgcccgc 6600tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac gcgcggggag 6660aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 6720cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 6780atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 6840taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 6900aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 6960tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 7020gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct 7080cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 7140cgaccgctgc gccttatccg 
gtaactatcg tcttgagtcc aacccggtaa gacacgactt 7200atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 7260tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 7320ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 7380acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 7440aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 7500aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 7560tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga 7620cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc 7680catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg 7740ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat 7800aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat 7860ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg 7920caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc 7980attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa 8040agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc 8100actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt 8160ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag 8220ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt 8280gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag 8340atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac 8400cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc 8460gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 8520gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 8580ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca ttattatcat 8640gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtc 8684
44 12314 DNA Artificial pBOL115 44
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga 
gcagattgta ctgagagtgc 180accataccac agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc 240ggtttctttg aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat 480ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca 540aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca 720aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg 900gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct 960ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac 1020ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga accgtggatg 1080atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca 1500agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct attacgccag 
ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact atagggcgaa ttgggtaccg ggccccccct cgaggtcgac ggtatcgata 2040agcttgatat cgaattcctg cagcccgggg gatccactag ttctagagcg gcccatttaa 2100acggccggcc ctagatcaga gggtggtaaa tgaagtgtaa tagtattcat ttttcttata 2160aatcatccct tccgtgattt atacaaaaga agaggagaat atgctgaata cttggtatat 2220tactctacat tatactctta tcttgacggg tattctgagc atcttactca gtttcaagat 2280cttttaatgt ccaaaaacat ttgagccgat ctaaatactt ctgtgttttc attaatttat 2340aaattgtact cttttaagac atggaaagta ccaacatcgg ttgaaacagt ttttcattta 2400cttatggttt attggttttt ccagtgaatg attatttgtc gttacccttt cgtaaaagtt 2460caaacacgtt tttaagtatt gtttagttgc tctttcgaca tatatgatta tccctgcgcg 2520gctaaagtta aggatgcaaa aaacataaga caactgaagt taatttacgt caattaagtt 2580ttccagggta atgatgtttt gggcttccac taattcaata agtatgtcat gaaatacgtt 2640gtgaagagca tccagaaata atgaaaagaa acaacgaaac tgggtcggcc tgttgtttct 2700tttctttacc acgtgatctg cggcatttac aggaagtcgc gcgttttgcg cagttgttgc 2760aacgcagcta cggctaacaa agcctagtgg aactcgactg atgtgttagg gcctaaaact 2820ggtggtgaca gctgaagtga actattcaat ccaatcatgt catggctgtc acaaagacct 2880tgcggaccgc acgtacgaac acatacgtat gctaatatgt gttttgatag tacccagtga 2940tcgcagacct gcaatttttt tgtaggtttg gaagaatata taaaggttgc actcattcaa 3000gatagttttt ttcttgtgtg tctattcatt ttattattgt ttgtttaaat gttaaaaaaa 3060ccaagaactt agtttcaaat taaattcatc acacaaacaa acaaaacaaa atgaacattg 3120ttgtttgttt gaagcaagtt ccagacactg ctgaagtcag aattgaccca gtcaagggta 3180ctttaatcag agaaggtgtt ccatctatca tcaacccaga cgacaagaac gctttggaag
3240aagctttggt tttgaaggac aactacggtg ctcacgttac cgtcatttcc atgggtccac 3300ctcaagccaa gaacgctttg gttgaagctt tggccatggg tgctgatgaa gctgtcttat 3360tgactgacag agctttcggt ggtgctgata ctttagctac ctctcacacc attgctgctg 3420gtatcaagaa attgaaatac gatatcgtct ttgccggtcg tcaagccatc gatggtgata 3480ccgctcaagt cggtccagaa attgctgaac atttgggtat tccacaagtc acctacgttg 3540aaaaggttga agttgacggt gacactttga agatcagaaa ggcttgggaa gacggttacg 3600aagttgttga agtcaagact ccagttctat tgactgccat caaggaattg aacgttccaa 3660gatacatgtc cgttgaaaag atcttcggtg ctttcgacaa ggaagtcaag atgtggactg 3720ctgatgatat cgatgtcgac aaggccaact tgggtttgaa aggttctcca accaaggtca 3780agaaatcttc taccaaggaa gtcaagggtc aaggtgaagt cattgacaaa ccagtcaagg 3840aagctgccgc ttacgttgtt tccaagttga aggaagaaca ctacatctaa agcccgggcg 3900gagattgata agacttttct agttgcatat cttttatatt taaatcttat ctattagtta 3960attttttgta atttatcctt atatatagtc tggttattct aaaatatcat ttcagtatct 4020aaaaattccc ctcttttttc agttatatct taacaggcga cagtccaaat gttgatttat 4080cccagtccga ttcatcaggg ttgtgaagca ttttgtcaat ggtcgaaatc acatcagtaa 4140tagtgcctct tacttgcctc atagaatttc tttctcttaa cgtcaccgtt tggtctttta 4200tagtttcgaa atctatggtg ataccaaatg gtgttcccaa ttcatcgtta cgggcgtatt 4260ttttaccaat tgaagtattg gaatcgtcaa ttttaaagta tatctctctt ttacgtaaag 4320cctgcgagat cctcttaagt atagcgggga agccatcgtt attcgatatt gtcgtaacaa 4380atactttgat cggcgctatg tttaaatgtt taaacatgga cagatatgcg atgaaaacgc 4440taagtgatac tccaaatggt gaaaggtacg atgcttggaa acaatacttg gaaatcaccg 4500gaaacaccat atgcggcgaa aagccaatta gtgtgatact aagtgcttta tcgaaaatcc 4560gtgatgccgg tccttcaggc atcaaatttc agtggcctaa ttattcacag agttctcatg 4620tgacaagtat tgatgatagt agtgtcagtt atgcttcagg ttatgttact ataggataat 4680gatcacggct aaaacggtcg aatgtaagca tatatctttc gattgtataa ttgttcccaa 4740atactacagc atctcaagga aaaaaaaaca aaaacttcca aaaaaatcga atccctgagg 4800aatctttaat acattttcaa tctatttaag ttttataaac gtgtatatga gatgtcatga 4860gcatgaatta ttaataataa aaactaaatc attaaagtaa cttaaggagt taaagcccgg 4920gctttaattg ttagcagcct tgacttgagc 
aatcaattct ggaacaacct tgttgacatc 4980accgacaatg gccaaatcag caaccttcat gattggagct tcgacatctt tgttgatggc 5040aatgatgtag tcagagtctt gcataccagc caagtgttgg atggcaccag agataccaca 5100agcaatgtac aaagttggtc tgacggtctt accggtttga ccgacttgca agtccttgtc 5160aacccattcc ttttcaatgg cagctctgga agcagcaatg gtaccaccca acaaagaagc 5220taattcttcc aatttttcga agttttcctt ggaaccaaca ccacgaccac cagcaaccaa 5280aaccttggct tcaccgatat cagcaatgtc cttggccaat ttgacaacct tggaaacctt 5340ggttctgata tcagaagcag tcaatttgat ggcaaccttt tcgatcttgt catcagaaac 5400gttagcatcg ttaactggca atttttcaaa gacacctggt ctgacggtgg ccatttgagg 5460tctgtggtca gaacagacaa tggtagcaat caagttacca ccgaaagctg gtctggtagc 5520caacaagtca cggttttcga catcgatatc caaagaggta cagtcagcag tcaaaccagt 5580agacaatctg gcagcaattc ttggacccaa gtctctaccg atgaaagtag caccgatgaa 5640taagatttct ggctttcttt cgttgaccaa gtcacagata accttggcgt aaccgtcagt 5700ggagaaatga gctaataatt cgttgtcagc agccaaaacc ttgtcagcac cgtgggacaa 5760caagtccttg gacatctttt cagtgttgtg acccaataag acagcagtca attcaacacc 5820caatttttca gccatttcct tacccttacc tagcaattcc aaagaaacct tttgtaattc 5880accatctctt tgttcagcga aaacccagac acccttgtag tcagccttgt tcatgtttag 5940ttaattatag ttcgttgacc gtatattcta aaaacaagta ctccttaaaa aaaaaccttg 6000aagggaataa acaagtagaa tagatagaga gaaaaataga aaatgcaaga gaatttatat 6060attagaaaga gagaaagaaa aatggaaaaa aaaaaatagg aaaagccaga aatagcacta 6120gaaggagcga caccagaaaa gaaggtgatg gaaccaattt agctatatat agttaactac 6180cggctcgatc atctctgcct ccagcatagt cgaagaagaa tttttttttt cttgaggctt 6240ctgtcagcaa ctcgtatttt ttctttcttt tttggtgagc ctaaaaagtt cccacgttct 6300cttgtacgac gccgtcacaa acaaccttat gggtaatttg tcgcggtctg ggtgtataaa 6360tgtgtgggtg caggccggcc gtttaaacgg gccgccaccg cggtggagcc tgtgtggaag 6420aacgattaca acaggtgttg tcctctgagg acataaaata cacaccgaga ttcatcaact 6480cattgctgga gttagcatat ctacaattgg gtgaaatggg gagcgatttg caggcatttg 6540ctcggcatgc cggtagaggt gtggtcaata agagcgacct catgctatac ctgagaaagc 6600aacctgacct acaggaaaga gttactcaag aataagaatt ttcgttttaa aacctaagag 
6660tcactttaaa atttgtatac acttattttt tttataactt atttaataat aaaaatcata 6720aatcataaga aattcgctcg agtcgactgc agtttactgc ttgtagtcgt aagaagtttg 6780gatgatatcc ttgatttcag aaatcaaagc ttcctttggg ttggcagtgg tacattggtc 6840ttcgaaagcc aattcagcca ttctgtcaat ggattcgttc aattcttctt cagagacacc 6900ttgagatttc aagttcattt caataccaac agattgacct aattcgtaga cagccttggc 6960caaagattca accaaagctt cagtggtgtt acctttcaaa cctaagaact tggcgatatc 7020agcgtaatcg gtgtcagctc tgaagaattc gtactttggg aacaaagcgt gcttttgagg 7080gtccttggcg ttgtatctga tgatgtgagg caacaagatg gcgttagctc taccatgtgg 7140aataccgtat tcaccaccaa ttttgtgagc aatggagtga gcaataccca agaaagcgtt 7200agcaaaggcc ataccggcca aagtagaagc gttgtgcatc ttttctctgg aaaccttgtc 7260acctttttca acggaagatt tcaagtattc aaaggtcaat ttgatagctt gtagggacaa 7320acctctggtg taatcggagg ccatgacaga aacgtaagat tccatagcgt gagtcaaaac 7380gtccataccg gtatcagcag tgacagattt tgggacggac atgacaaatt gagggtcaat 7440gatggcgaca tctggagtca aagcgaaatc agccaatggg tatttgacgt tggtttcaga 7500gtcagtgata acagcaaatg gagtgacttc agaaccagta ccagaagtgg ttggaataca 7560gatgaaagtg gcgttttctg gcataccaat cttgtaggtt ctcttaccaa tgtccaagaa 7620tttttgcttg gcaccgaaga aagaagtttc tggatgttcg aagaacatcc acatagcctt 7680ggcagcatcc atggcagaac caccacccaa agcaatgatg gtgtctggtt ggaaatcgac 7740catcatttcc aaacccttgt agacagtgtt ggtggatgga tttggttcaa cttcagagaa 7800gatcttgatt tgaggttgtt cagttctttg acgtaagacg ttttcgacgg ttttggtgta 7860accaaattca accatacctg ggtcacaaac gatcatgacc ttttcaatct tgtccatggt 7920ggtcaaggac atgatagcgt tttcttcgaa atagatttga gctggaacct tgaagatttg 7980agtgttgttt cttctcttgg caatggtctt gatgttcaac aaatcggtag cagagacgtt 8040gtgggagatg gagtttctac cgtaagaacc acaacccaaa gtcaaggatg gaatcaattc 8100gttgtacatg tcaccgatac caccaacagc agatggagtg ttgaccaaaa cacgacaagc 8160cttcattctt agaccgaaat ccttttgcaa agtttcgtct tcagtgtgga taacagcagt 8220gtgacctaaa ccaccgaagt gcaaagtgtc ttcacagatt tggaaagctt gcttggtaga 8280ttgagccttg actaaagcca aaactggaga caacttttct ctggacaatg ggtagtcaga 8340accgacaccg gagatttcag caatgatcaa 
tttggtgttt tctggaactg ggataccggc 8400caattcagca atttcaacag cagacttacc aacgatatct ggcttgatac cagtcttttg 8460ttcgttcatg atggcgtttt ctaatctttg taattcgtcc ttcttgacga agtaagcttg 8520gtgagccttg aattcattgg tgacatcctt gtagatttcc ttgtcaatga caacaacttg 8580ttcagaagca cagatcatac cattatcgaa agtcttggaa ccgatgatat cgttgacagc 8640acgcttgata tgagcagtct tttcgatgta agatggaacg ttacctggac caacacccaa 8700agctggttta ccagtggagt aagcagactt aaccatacca gaaccaccgg tagccaagac 8760taaagcaata cccttgtggt tcatcaattg cttggtagct tcaatggatg gaacttcaat 8820ccattggatg atatcctttg gagcaccagc cttcatggca gcttccaaaa caacttcagc 8880agctctcttg gaagattctt gagcagatgg gtggaaagcg aaaataattg ggttaccagt 8940cttgatggca atcatagcct tgaagatggt ggtagaagtt gggttggtgg ttggagtgac 9000accacagatg acaccaattg gttcagcaac gtaggtcaaa cccttttctt tgtcttcacc 9060aatgatacca acagtcttgt tgtccttgat ggagttccag atgtattcag aggcgtataa 9120gttcttgata gccttgtctt cgtagatacc tctaccggtt tcttcatgag ccaacttggc 9180caaaaccatg tgttggtcaa cagcagccaa ggacatttgg tggacaatgt ggtcaatttc 9240ttcttgagac ttcttggaca aagcttccaa agccttctta cctttgtcag ctaaagcatc 9300aatcatgatg gcaacttctt gttccttgga acctctgttt tccttttctg gaatggtcaa 9360cattttttac tagttctaga atccgtcgaa actaagttct ggtgttttaa aactaaaaaa 9420aagactaact ataaaagtag aatttaagaa gtttaagaaa tagatttaca gaattacaat 9480caatacctac cgtctttata tacttattag tcaagtaggg gaataatttc agggaactgg 9540tttcaacctt ttttttcagc tttttccaaa tcagagagag cagaaggtaa tagaaggtgt 9600aagaaaatga gatagataca tgcgtgggtc aattgccttg tgtcatcatt tactccaggc 9660aggttgcatc actccattga ggttgtgccc gttttttgcc tgtttgtgcc cctgttctct 9720gtagttgcgc taagagaatg gacctatgaa ctgatggttg gtgaagaaaa caatattttg 9780gtgctgggat tctttttttt tctggatgcc agcttaaaaa gcgggctcca ttatatttag 9840tggatgccag gaataaactg ttcacccaga cacctacgat gttatatatt ctgtgtaacc 9900cgccccctat tttgggcatg tacgggttac agcagaatta aaaggctaat tttttgacta 9960aataaagtta ggaaaatcac tactattaat tatttacgta ttctttgaaa tggcgagtat 10020tgataatgat aaactgagct ccagcttttg ttccctttag tgagggttaa ttgcgcgctt 
10080ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 10140caacatagga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgaggtaact 10200cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct 10260gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc 10320ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 10380ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 10440agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 10500taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 10560cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 10620tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 10680gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 10740gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 10800tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 10860gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 10920cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 10980aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 11040tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 11100ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 11160attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 11220ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 11280tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 11340aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc 11400acgctcaccg gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag 11460aagtggtcct gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag 11520agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt 11580ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg 11640agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt 11700tgtcagaagt aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc 
11760tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc 11820attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa 11880taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg 11940aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc 12000caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag 12060gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt 12120cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt 12180tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc 12240acctgacgtc taagaaacca ttattatcat gacattaacc tataaaaata ggcgtatcac 12300gaggcccttt cgtc 12314
45 11180 DNA Artificial pBOL116 45
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accataccac agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc 240ggtttctttg aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat 480ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca 540aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca 720aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg 900gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct 960ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac 1020ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga accgtggatg 1080atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt 
gcaaagggaa 1140gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca 1500agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact atagggcgaa ttgggtaccg ggccccccct cgaggtcgac ggtatcgata 2040agcttgatat cgaattcctg cagcccgggg gatccactag ttctagagcg gcccatttaa 2100acggccggcc ctagatcaga gggtggtaaa tgaagtgtaa tagtattcat ttttcttata 2160aatcatccct tccgtgattt atacaaaaga agaggagaat atgctgaata cttggtatat 2220tactctacat tatactctta tcttgacggg tattctgagc atcttactca gtttcaagat 2280cttttaatgt ccaaaaacat ttgagccgat ctaaatactt ctgtgttttc attaatttat 2340aaattgtact cttttaagac atggaaagta ccaacatcgg ttgaaacagt ttttcattta 2400cttatggttt attggttttt ccagtgaatg attatttgtc gttacccttt cgtaaaagtt 2460caaacacgtt tttaagtatt gtttagttgc tctttcgaca tatatgatta tccctgcgcg 2520gctaaagtta aggatgcaaa aaacataaga caactgaagt taatttacgt caattaagtt 2580ttccagggta atgatgtttt gggcttccac taattcaata agtatgtcat gaaatacgtt 2640gtgaagagca tccagaaata atgaaaagaa acaacgaaac tgggtcggcc tgttgtttct 2700tttctttacc acgtgatctg cggcatttac aggaagtcgc gcgttttgcg cagttgttgc 2760aacgcagcta cggctaacaa agcctagtgg aactcgactg atgtgttagg gcctaaaact 2820ggtggtgaca gctgaagtga 
actattcaat ccaatcatgt catggctgtc acaaagacct 2880tgcggaccgc acgtacgaac acatacgtat gctaatatgt gttttgatag tacccagtga 2940tcgcagacct gcaatttttt tgtaggtttg gaagaatata taaaggttgc actcattcaa 3000gatagttttt ttcttgtgtg tctattcatt ttattattgt ttgtttaaat gttaaaaaaa 3060ccaagaactt agtttcaaat taaattcatc acacaaacaa acaaaacaaa atgaacattg 3120ttgtttgttt gaagcaagtt ccagacactg ctgaagtcag aattgaccca gtcaagggta 3180ctttaatcag agaaggtgtt ccatctatca tcaacccaga cgacaagaac gctttggaag 3240aagctttggt tttgaaggac aactacggtg ctcacgttac cgtcatttcc atgggtccac 3300ctcaagccaa gaacgctttg gttgaagctt tggccatggg tgctgatgaa gctgtcttat 3360tgactgacag agctttcggt ggtgctgata ctttagctac ctctcacacc attgctgctg 3420gtatcaagaa attgaaatac gatatcgtct ttgccggtcg tcaagccatc gatggtgata 3480ccgctcaagt cggtccagaa attgctgaac atttgggtat tccacaagtc acctacgttg 3540aaaaggttga agttgacggt gacactttga agatcagaaa ggcttgggaa gacggttacg 3600aagttgttga agtcaagact ccagttctat tgactgccat caaggaattg aacgttccaa 3660gatacatgtc cgttgaaaag atcttcggtg ctttcgacaa ggaagtcaag atgtggactg 3720ctgatgatat cgatgtcgac aaggccaact tgggtttgaa aggttctcca accaaggtca 3780agaaatcttc taccaaggaa gtcaagggtc aaggtgaagt cattgacaaa ccagtcaagg 3840aagctgccgc ttacgttgtt tccaagttga aggaagaaca ctacatctaa agcccgggcg 3900gagattgata agacttttct agttgcatat cttttatatt taaatcttat ctattagtta 3960attttttgta atttatcctt atatatagtc tggttattct aaaatatcat ttcagtatct 4020aaaaattccc ctcttttttc agttatatct taacaggcga cagtccaaat gttgatttat 4080cccagtccga ttcatcaggg ttgtgaagca ttttgtcaat ggtcgaaatc acatcagtaa 4140tagtgcctct tacttgcctc atagaatttc tttctcttaa cgtcaccgtt tggtctttta 4200tagtttcgaa atctatggtg ataccaaatg gtgttcccaa ttcatcgtta cgggcgtatt 4260ttttaccaat tgaagtattg gaatcgtcaa ttttaaagta tatctctctt ttacgtaaag 4320cctgcgagat cctcttaagt atagcgggga agccatcgtt attcgatatt gtcgtaacaa 4380atactttgat cggcgctatg tttaaatgtt taaacatgga cagatatgcg atgaaaacgc 4440taagtgatac tccaaatggt gaaaggtacg atgcttggaa acaatacttg gaaatcaccg 4500gaaacaccat atgcggcgaa aagccaatta gtgtgatact aagtgcttta 
tcgaaaatcc 4560gtgatgccgg tccttcaggc atcaaatttc agtggcctaa ttattcacag agttctcatg 4620tgacaagtat tgatgatagt agtgtcagtt atgcttcagg ttatgttact ataggataat 4680gatcacggct aaaacggtcg aatgtaagca tatatctttc gattgtataa ttgttcccaa 4740atactacagc atctcaagga aaaaaaaaca aaaacttcca aaaaaatcga atccctgagg 4800aatctttaat acattttcaa tctatttaag ttttataaac gtgtatatga gatgtcatga 4860gcatgaatta ttaataataa aaactaaatc attaaagtaa cttaaggagt taaagcccgg 4920gctttaattg ttagcagcct tgacttgagc aatcaattct ggaacaacct tgttgacatc 4980accgacaatg gccaaatcag caaccttcat gattggagct tcgacatctt tgttgatggc 5040aatgatgtag tcagagtctt gcataccagc caagtgttgg atggcaccag agataccaca 5100agcaatgtac aaagttggtc tgacggtctt accggtttga ccgacttgca agtccttgtc 5160aacccattcc ttttcaatgg cagctctgga agcagcaatg gtaccaccca acaaagaagc 5220taattcttcc aatttttcga agttttcctt ggaaccaaca ccacgaccac cagcaaccaa 5280aaccttggct tcaccgatat cagcaatgtc cttggccaat ttgacaacct tggaaacctt 5340ggttctgata tcagaagcag tcaatttgat ggcaaccttt tcgatcttgt catcagaaac 5400gttagcatcg ttaactggca atttttcaaa gacacctggt ctgacggtgg ccatttgagg 5460tctgtggtca gaacagacaa tggtagcaat caagttacca ccgaaagctg gtctggtagc 5520caacaagtca cggttttcga catcgatatc caaagaggta cagtcagcag tcaaaccagt 5580agacaatctg gcagcaattc ttggacccaa gtctctaccg atgaaagtag caccgatgaa 5640taagatttct ggctttcttt cgttgaccaa gtcacagata accttggcgt aaccgtcagt 5700ggagaaatga gctaataatt cgttgtcagc agccaaaacc ttgtcagcac cgtgggacaa 5760caagtccttg gacatctttt cagtgttgtg acccaataag acagcagtca attcaacacc 5820caatttttca gccatttcct tacccttacc tagcaattcc aaagaaacct tttgtaattc 5880accatctctt tgttcagcga aaacccagac
acccttgtag tcagccttgt tcatgtttag 5940ttaattatag ttcgttgacc gtatattcta aaaacaagta ctccttaaaa aaaaaccttg 6000aagggaataa acaagtagaa tagatagaga gaaaaataga aaatgcaaga gaatttatat 6060attagaaaga gagaaagaaa aatggaaaaa aaaaaatagg aaaagccaga aatagcacta 6120gaaggagcga caccagaaaa gaaggtgatg gaaccaattt agctatatat agttaactac 6180cggctcgatc atctctgcct ccagcatagt cgaagaagaa tttttttttt cttgaggctt 6240ctgtcagcaa ctcgtatttt ttctttcttt tttggtgagc ctaaaaagtt cccacgttct 6300cttgtacgac gccgtcacaa acaaccttat gggtaatttg tcgcggtctg ggtgtataaa 6360tgtgtgggtg caggccggcc gtttaaacgg gccgccaccg cggtggagcc tgtgtggaag 6420aacgattaca acaggtgttg tcctctgagg acataaaata cacaccgaga ttcatcaact 6480cattgctgga gttagcatat ctacaattgg gtgaaatggg gagcgatttg caggcatttg 6540ctcggcatgc cggtagaggt gtggtcaata agagcgacct catgctatac ctgagaaagc 6600aacctgacct acaggaaaga gttactcaag aataagaatt ttcgttttaa aacctaagag 6660tcactttaaa atttgtatac acttattttt tttataactt atttaataat aaaaatcata 6720aatcataaga aattcgctcg agtcgactgc agtttacaag ttcaatttgg ccatgatggc 6780cttaacaatg gcttgaacat cttcgttgtc ttctggttca gctggagcag aagaagaagc 6840agcaccgaca ccgaaagctt ctctgatttc ttcgacggtg gtggtaccgt aagcaacctt 6900tctgatgttg aacaagtttt ctggaccaac gttgtcagag gtggcagaac caccaacagc 6960accacaacct aaagtcaaag atgggactaa gttggtagca ccaccgatac cacccaaaga 7020acctggagag ttaaccaaaa ttctggaaac aggcttcttc aaagcaaatt ctctaatgat 7080ttcttcgttt tgagagtgga tgatcaaagt gtgaccagaa ccttggttgt gcaataaagc 7140caaagacttt tcacaagctt catgccagtc ttcgacggtg tagaaagcca agactggagc 7200caatttttcc ttagcgtatg gatttttagg agaaacatcg gtttgttcgg atagtaagat 7260aacagcatca gatggaatgg aaataccagc caatttggcc aaagcttgga catccttacc 7320aacgatggct gggtttggag taccgttggc acgtaataga atcttaccaa ccttttcaga 7380ttcttcagca ttcaagaagt aacccttttg tctcttgaat tcttcgatga tttcagcctt 7440cttgacggtt tcagcaatga tggattgttc agaagcacag atgacaccgt tgtcgaaagt 7500gtcagaaccg ataacctttc taacagcagt tggaatgtca gcagttcttt cgatgaaaca 7560tggaccgtta cctggaccga caccgatggc tggagtacca gaggagtaag cagctctaac 
7620cataccttca ccaccggtag ccaagatcaa agcggtgtcc ttgttcttca tcaattcagc 7680agtaccttca acggtcaaaa tggacataca ttggatcaaa ccatctggag caccagcttc 7740aacagcagcc ttttgcatga tcttaacggt ttcagtgatg gaacggacag cagttgggtg 7800tggagagaag acaatggcgt taccagcctt caaagcaatc aagaccttga aaatggcagt 7860ggaagttggg ttggtagatg gaatcaaacc agcaatgaca cctaatggga cagcaatgtc 7920aatcaatttc ttttccttgt cttccttcaa gataccaacg gtcttcaaat ccttgatgta 7980gttgtagaca acaatggagg agaatttgtt cttgatgacc ttgtcttccc atttaccgta 8040accggtgtct tcgtaagcca atttggccaa tttgacagct tcaacttcag tagccttggc 8100gatcttttcg atgaccttgt taacagcttc ttgggaaaag ttcttgaatt cagcttgagc 8160cttcttggcc ttggcaatca aagttctaac ttcttggatg gattgcaaat ccttgtccat 8220gatttccatt ttttactagt tctagaatcc gtcgaaacta agttctggtg ttttaaaact 8280aaaaaaaaga ctaactataa aagtagaatt taagaagttt aagaaataga tttacagaat 8340tacaatcaat acctaccgtc tttatatact tattagtcaa gtaggggaat aatttcaggg 8400aactggtttc aacctttttt ttcagctttt tccaaatcag agagagcaga aggtaataga 8460aggtgtaaga aaatgagata gatacatgcg tgggtcaatt gccttgtgtc atcatttact 8520ccaggcaggt tgcatcactc cattgaggtt gtgcccgttt tttgcctgtt tgtgcccctg 8580ttctctgtag ttgcgctaag agaatggacc tatgaactga tggttggtga agaaaacaat 8640attttggtgc tgggattctt tttttttctg gatgccagct taaaaagcgg gctccattat 8700atttagtgga tgccaggaat aaactgttca cccagacacc tacgatgtta tatattctgt 8760gtaacccgcc ccctattttg ggcatgtacg ggttacagca gaattaaaag gctaattttt 8820tgactaaata aagttaggaa aatcactact attaattatt tacgtattct ttgaaatggc 8880gagtattgat aatgataaac tgagctccag cttttgttcc ctttagtgag ggttaattgc 8940gcgcttggcg taatcatggt catagctgtt tcctgtgtga aattgttatc cgctcacaat 9000tccacacaac ataggagccg gaagcataaa gtgtaaagcc tggggtgcct aatgagtgag 9060gtaactcaca ttaattgcgt tgcgctcact gcccgctttc cagtcgggaa acctgtcgtg 9120ccagctgcat taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc 9180ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc 9240agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa 9300catgtgagca aaaggccagc aaaaggccag 
gaaccgtaaa aaggccgcgt tgctggcgtt 9360tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg 9420gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg 9480ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag 9540cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc 9600caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa 9660ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg 9720taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc 9780taactacggc tacactagaa ggacagtatt tggtatctgc gctctgctga agccagttac 9840cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 9900tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 9960gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 10020catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa 10080atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga 10140ggcacctatc tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt 10200gtagataact acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg 10260agacccacgc tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga 10320gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga 10380agctagagta agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg 10440catcgtggtg tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc 10500aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc 10560gatcgttgtc agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca 10620taattctctt actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac 10680caagtcattc tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg 10740ggataatacc gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc 10800ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg 10860tgcacccaac tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac 10920aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat 10980actcttcctt tttcaatatt attgaagcat ttatcagggt tattgtctca 
tgagcggata 11040catatttgaa tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa 11100agtgccacct gacgtctaag aaaccattat tatcatgaca ttaacctata aaaataggcg 11160tatcacgagg ccctttcgtc 11180
46 11108 DNA Artificial pBOL118 46
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accataccac agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc 240ggtttctttg aaattttttt gattcggtaa tctccgaaca gaaggaagaa cgaaggaagg 300agcacagact tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat 480ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca 540aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca 720aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg 900gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct 960ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac 1020ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga accgtggatg 1080atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca 1500agagtccact attaaagaac gtggactcca acgtcaaagg 
gcgaaaaacc gtctatcagg 1560gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact atagggcgaa ttgggtaccg ggccccccct cgaggtcgac ggtatcgata 2040agcttgatat cgaattcctg cagcccgggg gatccactag ttctagagcg gcccatttaa 2100acggccggcc ctagatcaga gggtggtaaa tgaagtgtaa tagtattcat ttttcttata 2160aatcatccct tccgtgattt atacaaaaga agaggagaat atgctgaata cttggtatat 2220tactctacat tatactctta tcttgacggg tattctgagc atcttactca gtttcaagat 2280cttttaatgt ccaaaaacat ttgagccgat ctaaatactt ctgtgttttc attaatttat 2340aaattgtact cttttaagac atggaaagta ccaacatcgg ttgaaacagt ttttcattta 2400cttatggttt attggttttt ccagtgaatg attatttgtc gttacccttt cgtaaaagtt 2460caaacacgtt tttaagtatt gtttagttgc tctttcgaca tatatgatta tccctgcgcg 2520gctaaagtta aggatgcaaa aaacataaga caactgaagt taatttacgt caattaagtt 2580ttccagggta atgatgtttt gggcttccac taattcaata agtatgtcat gaaatacgtt 2640gtgaagagca tccagaaata atgaaaagaa acaacgaaac tgggtcggcc tgttgtttct 2700tttctttacc acgtgatctg cggcatttac aggaagtcgc gcgttttgcg cagttgttgc 2760aacgcagcta cggctaacaa agcctagtgg aactcgactg atgtgttagg gcctaaaact 2820ggtggtgaca gctgaagtga actattcaat ccaatcatgt catggctgtc acaaagacct 2880tgcggaccgc acgtacgaac acatacgtat gctaatatgt gttttgatag tacccagtga 2940tcgcagacct gcaatttttt tgtaggtttg gaagaatata taaaggttgc actcattcaa 3000gatagttttt ttcttgtgtg tctattcatt ttattattgt ttgtttaaat gttaaaaaaa 3060ccaagaactt agtttcaaat taaattcatc acacaaacaa acaaaacaaa atgaacattg 3120ttgtttgttt gaagcaagtt ccagacactg ctgaagtcag aattgaccca gtcaagggta 3180ctttaatcag agaaggtgtt ccatctatca tcaacccaga cgacaagaac gctttggaag 3240aagctttggt 
tttgaaggac aactacggtg ctcacgttac cgtcatttcc atgggtccac 3300ctcaagccaa gaacgctttg gttgaagctt tggccatggg tgctgatgaa gctgtcttat 3360tgactgacag agctttcggt ggtgctgata ctttagctac ctctcacacc attgctgctg 3420gtatcaagaa attgaaatac gatatcgtct ttgccggtcg tcaagccatc gatggtgata 3480ccgctcaagt cggtccagaa attgctgaac atttgggtat tccacaagtc acctacgttg 3540aaaaggttga agttgacggt gacactttga agatcagaaa ggcttgggaa gacggttacg 3600aagttgttga agtcaagact ccagttctat tgactgccat caaggaattg aacgttccaa 3660gatacatgtc cgttgaaaag atcttcggtg ctttcgacaa ggaagtcaag atgtggactg 3720ctgatgatat cgatgtcgac aaggccaact tgggtttgaa aggttctcca accaaggtca 3780agaaatcttc taccaaggaa gtcaagggtc aaggtgaagt cattgacaaa ccagtcaagg 3840aagctgccgc ttacgttgtt tccaagttga aggaagaaca ctacatctaa agcccgggcg 3900gagattgata agacttttct agttgcatat cttttatatt taaatcttat ctattagtta 3960attttttgta atttatcctt atatatagtc tggttattct aaaatatcat ttcagtatct 4020aaaaattccc ctcttttttc agttatatct taacaggcga cagtccaaat gttgatttat 4080cccagtccga ttcatcaggg ttgtgaagca ttttgtcaat ggtcgaaatc acatcagtaa 4140tagtgcctct tacttgcctc atagaatttc tttctcttaa cgtcaccgtt tggtctttta 4200tagtttcgaa atctatggtg ataccaaatg gtgttcccaa ttcatcgtta cgggcgtatt 4260ttttaccaat tgaagtattg gaatcgtcaa ttttaaagta tatctctctt ttacgtaaag 4320cctgcgagat cctcttaagt atagcgggga agccatcgtt attcgatatt gtcgtaacaa 4380atactttgat cggcgctatg tttaaatgtt taaacatgga cagatatgcg atgaaaacgc 4440taagtgatac tccaaatggt gaaaggtacg atgcttggaa acaatacttg gaaatcaccg 4500gaaacaccat atgcggcgaa aagccaatta gtgtgatact aagtgcttta tcgaaaatcc 4560gtgatgccgg tccttcaggc atcaaatttc agtggcctaa ttattcacag agttctcatg 4620tgacaagtat tgatgatagt agtgtcagtt atgcttcagg ttatgttact ataggataat 4680gatcacggct aaaacggtcg aatgtaagca tatatctttc gattgtataa ttgttcccaa 4740atactacagc atctcaagga aaaaaaaaca aaaacttcca aaaaaatcga atccctgagg 4800aatctttaat acattttcaa tctatttaag ttttataaac gtgtatatga gatgtcatga 4860gcatgaatta ttaataataa aaactaaatc attaaagtaa cttaaggagt taaagcccgg 4920gctttaattg ttagcagcct tgacttgagc aatcaattct 
ggaacaacct tgttgacatc 4980accgacaatg gccaaatcag caaccttcat gattggagct tcgacatctt tgttgatggc 5040aatgatgtag tcagagtctt gcataccagc caagtgttgg atggcaccag agataccaca 5100agcaatgtac aaagttggtc tgacggtctt accggtttga ccgacttgca agtccttgtc 5160aacccattcc ttttcaatgg cagctctgga agcagcaatg gtaccaccca acaaagaagc 5220taattcttcc aatttttcga agttttcctt ggaaccaaca ccacgaccac cagcaaccaa 5280aaccttggct tcaccgatat cagcaatgtc cttggccaat ttgacaacct tggaaacctt 5340ggttctgata tcagaagcag tcaatttgat ggcaaccttt tcgatcttgt catcagaaac 5400gttagcatcg ttaactggca atttttcaaa gacacctggt ctgacggtgg ccatttgagg 5460tctgtggtca gaacagacaa tggtagcaat caagttacca ccgaaagctg gtctggtagc 5520caacaagtca cggttttcga catcgatatc caaagaggta cagtcagcag tcaaaccagt 5580agacaatctg gcagcaattc ttggacccaa gtctctaccg atgaaagtag caccgatgaa 5640taagatttct ggctttcttt cgttgaccaa gtcacagata accttggcgt aaccgtcagt 5700ggagaaatga gctaataatt cgttgtcagc agccaaaacc ttgtcagcac cgtgggacaa 5760caagtccttg gacatctttt cagtgttgtg acccaataag acagcagtca attcaacacc 5820caatttttca gccatttcct tacccttacc tagcaattcc aaagaaacct tttgtaattc 5880accatctctt tgttcagcga aaacccagac acccttgtag tcagccttgt tcatgtttag 5940ttaattatag ttcgttgacc gtatattcta aaaacaagta ctccttaaaa aaaaaccttg 6000aagggaataa acaagtagaa tagatagaga gaaaaataga aaatgcaaga gaatttatat 6060attagaaaga gagaaagaaa aatggaaaaa aaaaaatagg aaaagccaga aatagcacta 6120gaaggagcga caccagaaaa gaaggtgatg gaaccaattt agctatatat agttaactac 6180cggctcgatc atctctgcct ccagcatagt cgaagaagaa tttttttttt cttgaggctt 6240ctgtcagcaa ctcgtatttt ttctttcttt tttggtgagc ctaaaaagtt cccacgttct 6300cttgtacgac gccgtcacaa acaaccttat gggtaatttg tcgcggtctg ggtgtataaa 6360tgtgtgggtg caggccggcc gtttaaacgg gccgccaccg cggtggagcc tgtgtggaag 6420aacgattaca acaggtgttg tcctctgagg acataaaata cacaccgaga ttcatcaact 6480cattgctgga gttagcatat ctacaattgg gtgaaatggg gagcgatttg caggcatttg 6540ctcggcatgc cggtagaggt gtggtcaata agagcgacct catgctatac ctgagaaagc 6600aacctgacct acaggaaaga gttactcaag aataagaatt ttcgttttaa aacctaagag 6660tcactttaaa 
atttgtatac acttattttt tttataactt atttaataat aaaaatcata 6720aatcataaga aattcgctcg agtcgactgc agtttaaaca attctgaaag catcgaccaa 6780aacacaacga cgtaatctga cgaaagttct ggcagaagtg acaccttcac cagttggggt 6840ggtgatggtc atggtggtcc aaccttcacc acccaaaccc aaaccagcga tacatggacc 6900gttcttgaca aagatggaag tgtcaatggc gttagccatt tggttcatgt tttcgatgtt 6960tctggagtgc atggcagcag tgtggtgaca accaccttcc aatttgacag ccaaagcaat 7020agcgtcagca acgttagcaa cacggacaac tggtaagact ggcatcatca attcagtgac 7080agcaaatggg tgttcagcgg tggtttcgac gaataataat ctggtttctt gtggaacctt 7140caaaccgatg gcagcagcaa tcttaccagc atctctacca acccagtctc tggagacggt 7200acccttacct ctttcatcga tgttcttcaa caaaactggt tgcaattgtt gagcttgttc 7260agcagtcaac ttgacggcat gttgaccttc catcaatctc atcaattcgt cagcaacgga 7320gtcaacaaca atcaaaacct tttcgtcagc acagatgatg ttgttgtcga aagaagcacc 7380cttgacaatg gattgagcag ctctggccaa atcagcggtt tcatcgacaa caacaggagg 7440gttaccagca ccagcagcaa tcaatctctt gttggtgtgc tttctggcag cttcaacaac 7500agcttcacca ccagtgacga ctaatagacc gatacctggg aacttgaata atctttgagc 7560agtttcgata tctgggttgg caacagtgac caacaagttt tctggaccac cagcagcaac 7620aatggcttgg ttcaatagag tgatggctct ttgagaaacc ttcttggcag ctgggtgtgg 7680agcgaagata acggagttac cagcagcaat caaagagatg gcgttgttga tgacagtagc 7740agctgggttg gtagatgggg tgacggaagc aacaacaccc catggagcat tttcaatcaa 7800agtcaaacca ttatcaccgg tcaagacttg tggagacaaa cattcgacac ctggagtacc 7860tctagcttga gcaacgttct tagcgaattt gtcttcaact ctacccatac cggtttcgga 7920gacagccaat tcagccaagt ctctggcatg cttttcacca gcttctctga tggcagcaat 7980ggccaattgt ctcatggcaa cagatttcaa accttgttga gcaaccttgg cagcagcaac 8040agcgtcgtcc aaagaagcga aaacacccat ttcgtggaca gcagcagatg gagtgtcaga 8100agattgcatt ttcaacaaga cagccttgac aacttgttcg atatcttgtt ggttcatttt 8160ttactagttc tagaatccgt cgaaactaag ttctggtgtt ttaaaactaa aaaaaagact 8220aactataaaa gtagaattta agaagtttaa gaaatagatt tacagaatta caatcaatac 8280ctaccgtctt tatatactta ttagtcaagt aggggaataa tttcagggaa ctggtttcaa 8340cctttttttt cagctttttc caaatcagag agagcagaag 
gtaatagaag gtgtaagaaa 8400atgagataga tacatgcgtg ggtcaattgc cttgtgtcat catttactcc aggcaggttg 8460catcactcca ttgaggttgt gcccgttttt tgcctgtttg tgcccctgtt ctctgtagtt 8520gcgctaagag aatggaccta tgaactgatg gttggtgaag aaaacaatat tttggtgctg 8580ggattctttt tttttctgga tgccagctta aaaagcgggc tccattatat ttagtggatg 8640ccaggaataa actgttcacc cagacaccta cgatgttata tattctgtgt aacccgcccc 8700ctattttggg catgtacggg ttacagcaga attaaaaggc taattttttg actaaataaa 8760gttaggaaaa tcactactat taattattta cgtattcttt gaaatggcga gtattgataa 8820tgataaactg agctccagct tttgttccct ttagtgaggg ttaattgcgc gcttggcgta 8880atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat 8940aggagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgaggt aactcacatt 9000aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta 9060atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc 9120gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa 9180ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa 9240aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct 9300ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac 9360aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc 9420gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc 9480tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg 9540tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga 9600gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta acaggattag 9660cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta

9720cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag 9780agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg 9840caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac 9900ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc 9960aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag 10020tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc 10080agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac 10140gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc 10200accggctcca gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg 10260tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag 10320tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc 10380acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac 10440atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag 10500aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac 10560tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg 10620agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc 10680gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact 10740ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg 10800atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa 10860tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt 10920tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg 10980tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga 11040cgtctaagaa accattatta tcatgacatt aacctataaa aataggcgta tcacgaggcc 11100ctttcgtc 111084711114DNAArtificialpBOL120 47tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accataccac agcttttcaa ttcaattcat catttttttt ttattctttt ttttgatttc 240ggtttctttg aaattttttt gattcggtaa tctccgaaca 
gaaggaagaa cgaaggaagg 300agcacagact tagattggta tatatacgca tatgtagtgt tgaagaaaca tgaaattgcc 360cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga taaatcatgt 420cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct gccaagctat 480ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt cgtaccacca 540aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta aaaacacatg 600tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag gcattatccg 660ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt aatacagtca 720aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt acgaatgcac 780acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa gaagtaacaa 840aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc ctatctactg 900gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt gttatcggct 960ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg attatgacac 1020ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga accgtggatg 1080atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt gcaaagggaa 1140gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca tatttgagaa 1200gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata ctaaactcac 1260aaattagagc ttcaatttaa ttatatcagt tattacccta tgcggtgtga aataccgcac 1320agatgcgtaa ggagaaaata ccgcatcagg aaattgtaaa cgttaatatt ttgttaaaat 1380tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa atcggcaaaa 1440tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca gtttggaaca 1500agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg 1560gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg aggtgccgta 1620aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg ggaaagccgg 1680cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg gcgctggcaa 1740gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg ccgctacagg 1800gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg 1860cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga ttaagttggg 1920taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgag cgcgcgtaat 1980acgactcact atagggcgaa 
ttgggtaccg ggccccccct cgaggtcgac ggtatcgata 2040agcttgatat cgaattcctg cagcccgggg gatccactag ttctagagcg gcccatttaa 2100acggccggcc ctagatcaga gggtggtaaa tgaagtgtaa tagtattcat ttttcttata 2160aatcatccct tccgtgattt atacaaaaga agaggagaat atgctgaata cttggtatat 2220tactctacat tatactctta tcttgacggg tattctgagc atcttactca gtttcaagat 2280cttttaatgt ccaaaaacat ttgagccgat ctaaatactt ctgtgttttc attaatttat 2340aaattgtact cttttaagac atggaaagta ccaacatcgg ttgaaacagt ttttcattta 2400cttatggttt attggttttt ccagtgaatg attatttgtc gttacccttt cgtaaaagtt 2460caaacacgtt tttaagtatt gtttagttgc tctttcgaca tatatgatta tccctgcgcg 2520gctaaagtta aggatgcaaa aaacataaga caactgaagt taatttacgt caattaagtt 2580ttccagggta atgatgtttt gggcttccac taattcaata agtatgtcat gaaatacgtt 2640gtgaagagca tccagaaata atgaaaagaa acaacgaaac tgggtcggcc tgttgtttct 2700tttctttacc acgtgatctg cggcatttac aggaagtcgc gcgttttgcg cagttgttgc 2760aacgcagcta cggctaacaa agcctagtgg aactcgactg atgtgttagg gcctaaaact 2820ggtggtgaca gctgaagtga actattcaat ccaatcatgt catggctgtc acaaagacct 2880tgcggaccgc acgtacgaac acatacgtat gctaatatgt gttttgatag tacccagtga 2940tcgcagacct gcaatttttt tgtaggtttg gaagaatata taaaggttgc actcattcaa 3000gatagttttt ttcttgtgtg tctattcatt ttattattgt ttgtttaaat gttaaaaaaa 3060ccaagaactt agtttcaaat taaattcatc acacaaacaa acaaaacaaa atgaacattg 3120ttgtttgttt gaagcaagtt ccagacactg ctgaagtcag aattgaccca gtcaagggta 3180ctttaatcag agaaggtgtt ccatctatca tcaacccaga cgacaagaac gctttggaag 3240aagctttggt tttgaaggac aactacggtg ctcacgttac cgtcatttcc atgggtccac 3300ctcaagccaa gaacgctttg gttgaagctt tggccatggg tgctgatgaa gctgtcttat 3360tgactgacag agctttcggt ggtgctgata ctttagctac ctctcacacc attgctgctg 3420gtatcaagaa attgaaatac gatatcgtct ttgccggtcg tcaagccatc gatggtgata 3480ccgctcaagt cggtccagaa attgctgaac atttgggtat tccacaagtc acctacgttg 3540aaaaggttga agttgacggt gacactttga agatcagaaa ggcttgggaa gacggttacg 3600aagttgttga agtcaagact ccagttctat tgactgccat caaggaattg aacgttccaa 3660gatacatgtc cgttgaaaag atcttcggtg ctttcgacaa ggaagtcaag 
atgtggactg 3720ctgatgatat cgatgtcgac aaggccaact tgggtttgaa aggttctcca accaaggtca 3780agaaatcttc taccaaggaa gtcaagggtc aaggtgaagt cattgacaaa ccagtcaagg 3840aagctgccgc ttacgttgtt tccaagttga aggaagaaca ctacatctaa agcccgggcg 3900gagattgata agacttttct agttgcatat cttttatatt taaatcttat ctattagtta 3960attttttgta atttatcctt atatatagtc tggttattct aaaatatcat ttcagtatct 4020aaaaattccc ctcttttttc agttatatct taacaggcga cagtccaaat gttgatttat 4080cccagtccga ttcatcaggg ttgtgaagca ttttgtcaat ggtcgaaatc acatcagtaa 4140tagtgcctct tacttgcctc atagaatttc tttctcttaa cgtcaccgtt tggtctttta 4200tagtttcgaa atctatggtg ataccaaatg gtgttcccaa ttcatcgtta cgggcgtatt 4260ttttaccaat tgaagtattg gaatcgtcaa ttttaaagta tatctctctt ttacgtaaag 4320cctgcgagat cctcttaagt atagcgggga agccatcgtt attcgatatt gtcgtaacaa 4380atactttgat cggcgctatg tttaaatgtt taaacatgga cagatatgcg atgaaaacgc 4440taagtgatac tccaaatggt gaaaggtacg atgcttggaa acaatacttg gaaatcaccg 4500gaaacaccat atgcggcgaa aagccaatta gtgtgatact aagtgcttta tcgaaaatcc 4560gtgatgccgg tccttcaggc atcaaatttc agtggcctaa ttattcacag agttctcatg 4620tgacaagtat tgatgatagt agtgtcagtt atgcttcagg ttatgttact ataggataat 4680gatcacggct aaaacggtcg aatgtaagca tatatctttc gattgtataa ttgttcccaa 4740atactacagc atctcaagga aaaaaaaaca aaaacttcca aaaaaatcga atccctgagg 4800aatctttaat acattttcaa tctatttaag ttttataaac gtgtatatga gatgtcatga 4860gcatgaatta ttaataataa aaactaaatc attaaagtaa cttaaggagt taaagcccgg 4920gctttaattg ttagcagcct tgacttgagc aatcaattct ggaacaacct tgttgacatc 4980accgacaatg gccaaatcag caaccttcat gattggagct tcgacatctt tgttgatggc 5040aatgatgtag tcagagtctt gcataccagc caagtgttgg atggcaccag agataccaca 5100agcaatgtac aaagttggtc tgacggtctt accggtttga ccgacttgca agtccttgtc 5160aacccattcc ttttcaatgg cagctctgga agcagcaatg gtaccaccca acaaagaagc 5220taattcttcc aatttttcga agttttcctt ggaaccaaca ccacgaccac cagcaaccaa 5280aaccttggct tcaccgatat cagcaatgtc cttggccaat ttgacaacct tggaaacctt 5340ggttctgata tcagaagcag tcaatttgat ggcaaccttt tcgatcttgt catcagaaac 5400gttagcatcg ttaactggca 
atttttcaaa gacacctggt ctgacggtgg ccatttgagg 5460tctgtggtca gaacagacaa tggtagcaat caagttacca ccgaaagctg gtctggtagc 5520caacaagtca cggttttcga catcgatatc caaagaggta cagtcagcag tcaaaccagt 5580agacaatctg gcagcaattc ttggacccaa gtctctaccg atgaaagtag caccgatgaa 5640taagatttct ggctttcttt cgttgaccaa gtcacagata accttggcgt aaccgtcagt 5700ggagaaatga gctaataatt cgttgtcagc agccaaaacc ttgtcagcac cgtgggacaa 5760caagtccttg gacatctttt cagtgttgtg acccaataag acagcagtca attcaacacc 5820caatttttca gccatttcct tacccttacc tagcaattcc aaagaaacct tttgtaattc 5880accatctctt tgttcagcga aaacccagac acccttgtag tcagccttgt tcatgtttag 5940ttaattatag ttcgttgacc gtatattcta aaaacaagta ctccttaaaa aaaaaccttg 6000aagggaataa acaagtagaa tagatagaga gaaaaataga aaatgcaaga gaatttatat 6060attagaaaga gagaaagaaa aatggaaaaa aaaaaatagg aaaagccaga aatagcacta 6120gaaggagcga caccagaaaa gaaggtgatg gaaccaattt agctatatat agttaactac 6180cggctcgatc atctctgcct ccagcatagt cgaagaagaa tttttttttt cttgaggctt 6240ctgtcagcaa ctcgtatttt ttctttcttt tttggtgagc ctaaaaagtt cccacgttct 6300cttgtacgac gccgtcacaa acaaccttat gggtaatttg tcgcggtctg ggtgtataaa 6360tgtgtgggtg caggccggcc gtttaaacgg gccgccaccg cggtggagcc tgtgtggaag 6420aacgattaca acaggtgttg tcctctgagg acataaaata cacaccgaga ttcatcaact 6480cattgctgga gttagcatat ctacaattgg gtgaaatggg gagcgatttg caggcatttg 6540ctcggcatgc cggtagaggt gtggtcaata agagcgacct catgctatac ctgagaaagc 6600aacctgacct acaggaaaga gttactcaag aataagaatt ttcgttttaa aacctaagag 6660tcactttaaa atttgtatac acttattttt tttataactt atttaataat aaaaatcata 6720aatcataaga aattcgctcg agtcgactgc agtttatcta atggagaaac catcagtcaa 6780gacacatctt cttcttctag cgaagtgacg ggcagtggta gtaccttcac cagttggagt 6840agcaatggtg aaagtggtgg aaccttcacc tctgaaacct aaaccagcga aagatggacc 6900gttcttgaca aagatggagg tttgcatgtc acgggcagcc ttgttcaatc tggagatgtt 6960ttgagagtgc atggtagcag tgtgatgtag accttgttcc aattcaatgg caacttccaa 7020agcttcatcg aagtctggaa ctctgacaac tggaacaatt ggcatcaaca attcaacagt 7080agcgaatggg tgggactttt cagtttcgac aatgatcaat cttggggtga 
aatcacaagc 7140aataccagct tctttcaaga tttcagtggc agacttacca accaatttct tgttggtgac 7200acccttgtca gtgacggcaa ccttttccaa tttttggata tcagatgggt tagtgacgtg 7260caaagcaccg ttcttttcca tttggaacaa caagaagtca gcaatggagt caacggcaac 7320aacagacttt tcagcgatac acaagatatt atggtcaaag gaagcaccgt cgacaatgtc 7380agcagcagcc ttttcaatgt tagcggtttc gtcaacgatg gatggagggt taccagcacc 7440agcaccgata accttcttac cagattgcat agcttgcaag acaacacctg gaccaccagt 7500gatgaccaac aatggaacct ttgggtggtt catcatttct tgagcagctt ggatagatgg 7560cttggcaacg gtgacaatca agttgtcaat accacaagaa tctctgacga tagtgttcaa 7620cttttcaatc aaccataaag agatgttctt ggcacctggg tgaggagagt agaaaacggc 7680gttaccagca gccaacatac cgatggagtt acagatcaaa gtttcagttg ggttggtaga 7740tggagcaaca gcaccgatga caccgtatgg agataattcg tataaagtca taccgttgtc 7800accggtagca acttcagtgt acaagtcttc aacacctgga gtcttttcga tagctaaagt 7860gttcttcaag attttatcgg tgacattacc cataccggtt tcagcaacag ctctggtagc 7920aatggtttcg atttctgggt ataaagcttc tctgatggcc ttgacaacgt ttcttctttc 7980ttccaaagat ttttccttgt aacagttttg agcaatgacg gcagcttgga cagcttcatc 8040gacggtatcg aaaacaccgg acttggcacc ttgggtggtg gtcttggttg gaacttcctt 8100ttgttcagcc aatttttcca acaaaacctt cttgactaat tgttccaatt ccaaagattc 8160cattttttac tagttctaga atccgtcgaa actaagttct ggtgttttaa aactaaaaaa 8220aagactaact ataaaagtag aatttaagaa gtttaagaaa tagatttaca gaattacaat 8280caatacctac cgtctttata tacttattag tcaagtaggg gaataatttc agggaactgg 8340tttcaacctt ttttttcagc tttttccaaa tcagagagag cagaaggtaa tagaaggtgt 8400aagaaaatga gatagataca tgcgtgggtc aattgccttg tgtcatcatt tactccaggc 8460aggttgcatc actccattga ggttgtgccc gttttttgcc tgtttgtgcc cctgttctct 8520gtagttgcgc taagagaatg gacctatgaa ctgatggttg gtgaagaaaa caatattttg 8580gtgctgggat tctttttttt tctggatgcc agcttaaaaa gcgggctcca ttatatttag 8640tggatgccag gaataaactg ttcacccaga cacctacgat gttatatatt ctgtgtaacc 8700cgccccctat tttgggcatg tacgggttac agcagaatta aaaggctaat tttttgacta 8760aataaagtta ggaaaatcac tactattaat tatttacgta ttctttgaaa tggcgagtat 8820tgataatgat aaactgagct 
ccagcttttg ttccctttag tgagggttaa ttgcgcgctt 8880ggcgtaatca tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca 8940caacatagga gccggaagca taaagtgtaa agcctggggt gcctaatgag tgaggtaact 9000cacattaatt gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct 9060gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc 9120ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 9180ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 9240agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 9300taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 9360cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 9420tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 9480gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 9540gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 9600tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 9660gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 9720cggctacact agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 9780aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 9840tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 9900ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 9960attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 10020ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 10080tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat 10140aactacgata cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc 10200acgctcaccg gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag 10260aagtggtcct gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag 10320agtaagtagt tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt 10380ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg 10440agttacatga tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt 10500tgtcagaagt aagttggccg cagtgttatc actcatggtt 
atggcagcac tgcataattc 10560tcttactgtc atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc 10620attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa 10680taccgcgcca catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg 10740aaaactctca aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc 10800caactgatct tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag 10860gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt 10920cctttttcaa tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt 10980tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc 11040acctgacgtc taagaaacca ttattatcat gacattaacc tataaaaata ggcgtatcac 11100gaggcccttt cgtc 11114482613DNAEntamoeba histolytica 48atgtcaacac aacaaactat gactgtagat gaacatatta atcaacttgt tgctaaagca 60caagttgcac ttaaagaata tcttaaacca gaatatacac aagaaaaaat agattatatt 120gtaaagaaag catcagttgc agcacttgat caacattgtg cacttgcagc agctgcagtt 180gaagaaacag gaagaggtat ttttgaagat aaagctacta aaaatatatt tgcatgtgaa 240catgttacac atgaaatgag acatgctaaa acagttggta ttattaatgt agatccactt 300tatggaatta cagaaattgc agaaccagtt ggagttgttt gtggagttac accagttact 360aatccaacat caacagctat tttcaagtca cttatttcaa ttaaaacaag aaatccaatt 420gtattttcat tccatccatc agcacttaaa tgttctatta tggcagctaa aattgttaga 480gatgcagcta ttgcagcagg agcaccagaa aattgtattc aatggattga atttggagga 540attgaagcat caaataaatt aatgaatcat ccaggagttg ctactattct tgctacagga 600ggaaatgcta tggttaaagc agcatattca tcaggaaaac cagcacttgg agtaggagca 660ggaaatgtac caacatatat tgaaaaaaca tgtaatatta aacaagcagc aaatgatgta 720gttatgtcaa aatcatttga taatggtatg atttgtgcat cagaacaagc agcaattatt 780gataaagaaa tttatgatca agtagttgaa gaaatgaaaa cacttggagc atatttcatt 840aatgaagaag aaaaagctaa attagaaaag tttatgtttg gagttaatgc atattcagca 900gatgttaata atgcaagact taatccaaaa tgtccaggta tgtcaccaca atggtttgct 960gaacaagttg gaattaaagt tccagaagat tgtaatatta tttgtgcagt ttgtaaagaa 1020gttggaccaa atgaaccatt aacaagagaa aaattatcac cagttcttgc tattcttaaa 1080gcagaaaata cacaagatgg tattgataaa 
gctgaagcta tggttgaatt taatggtaga 1140ggacattcag cagctattca ttcgaatgat aaagcagtag ttgaaaagta tgcacttaca 1200atgaaagcat gcagaatttt acataataca ccatcatcac aaggaggaat tggatcaatt 1260tataactata tttggccatc atttacactt ggatgtggat catatggagg aaattcggta 1320tcagctaatg ttacatatca taatttatta aatattaaaa gacttgcaga tagaagaaac 1380aaccttcaat ggttcagagt tccaccaaag attttctttg aaccacattc tattagatat 1440cttgctgaac ttaaggaact tagtaaaata ttcattgttt cagatagaat gatgtataaa 1500ttaggatatg tagatagagt tatggatgta ttgaaaagaa gaagtaatga agtagaaatt 1560gaaattttca ttgatgtaga accagatcca tctattcaaa ccgttcaaaa aggacttgct 1620gttatgaata catttggacc agataatatt attgctattg gaggaggatc agctatggat 1680gcagctaaga ttatgtggtt actttatgaa catccagaag ccgatttctt tgcaatgaaa 1740caaaaattca ttgatcttag aaagagagca tttaaattcc caacaatggg taagaaagct 1800agattaattt gtattccaac aacatcagga actggatcag aagttacacc atttgcagtt 1860atttcagatc atgaaacagg taagaaatat ccacttgctg attattcact tacaccatca 1920gttgctattg ttgatccaat gtttactatg tcacttccaa agagagctat tgctgatact 1980ggacttgatg tattggttca tgcaacagaa gcatatgttt cagttatggc taatgaatat 2040actgatggac ttgctagaga agcagttaaa ttagtctttg aaaatcttct taaatcatat 2100aatggagatt tagaagcaag agaaaagatg cacaatgctg caacaattgc aggtatggca 2160tttgcatcag cattccttgg tatggaccat tccatggcac ataaagttgg agcagcattc 2220catcttccac atggtagatg tgtagcagta ttattaccac atgtcattag atataatgga 2280caaaaaccaa gaaagcttgc aatgtggcca aaatataatt tctataaggc agaccaaaga 2340tatatggaac ttgcacaaat ggttggactt aaatgtaata caccagctga aggagttgaa

2400gcatttgcta aagcatgtga agaattaatg aaagccacag agactattac tggattcaag 2460caagcaaata ttgatgaagc agcatggatg agtaaagtac cagaaatggc acttcttgca 2520tttgaagatc aatgttcacc agctaatcca agagtcccaa tggttaagga tatggaaaag 2580attctcaaag ctgcatatta tccaattgct tga 2613492610DNAArtificial sequenceadh2 E. histolytica codon pair optimised 49atgtccactc aacaaaccat gaccgttgat gaacacatta accaattggt cagaaaggct 60caagttgctt tgaaggaata cttgaaacca gaatacactc aagaaaagat cgattacatt 120gtcaagaagg cttctgttgc tgctctagac caacactgtg ctttggctgc tgctgctgtc 180gaagaaactg gtcgtggtat ctttgaagac aaagctacca agaacatttt cgcttgtgaa 240cacgtcactc acgaaatgag acacgccaag accgttggta tcatcaacgt tgatccatta 300tacggtatca ctgaaattgc tgaaccagtc ggtgttgtct gtggtgtcac cccagttacc 360aacccaactt ctactgccat tttcaaatct ttgatttcca tcaagaccag aaacccaatt 420gttttctcct tccacccatc tgctttgaaa tgttccatca tggctgccaa gatcgtcaga 480gatgctgcca ttgctgctgg tgctccagaa aactgtatcc aatggatcga atttggtggt 540attgaagctt ccaacaaatt gatgaaccat cctggtgttg ctaccatctt agctactggt 600ggtaacgcta tggtcaaggc tgcttactct tctggtaagc cagctttggg tgtcggtgct 660ggtaacgtcc caacttacat cgaaaagacc tgtaatatca agcaagctgc taacgatgtt 720gtcatgtcca agtctttcga caacggtatg atctgtgcct ccgaacaagc tgccatcatc 780gacaaagaaa tctacgacca agttgttgaa gaaatgaaga ctttgggtgc ttacttcatc 840aacgaagaag aaaaggccaa attggaaaaa ttcatgttcg gtgttaatgc ttactctgct 900gatgtcaaca acgccagatt gaacccaaag tgtccaggta tgtctccaca atggttcgct 960gaacaagtcg gtatcaaggt tccagaagac tgtaacatca tctgtgccgt ttgtaaggaa 1020gttggtccaa acgaaccatt gaccagagaa aagttgtctc cagttttggc cattttgaag 1080gctgaaaaca ctcaagatgg tattgacaag gctgaagcta tggtcgaatt caacggtcgt 1140ggtcactctg ctgccattca ctccaatgac aaggctgttg ttgaaaaata cgctttgacc 1200atgaaggctt gtcgtatctt gcacaacact ccatcttctc aaggtggtat cggttccatt 1260tacaactaca tctggccatc tttcacttta ggttgtggtt cttacggtgg taactccgtt 1320tctgccaatg ttacctacca caacttgttg aacatcaaga gattggctga cagaagaaac 1380aacttacaat ggttcagagt cccaccaaag atcttcttcg aacctcactc cattagatac 1440ttggctgaat 
tgaaggaatt gtccaagatt ttcattgtct ctgacagaat gatgtacaaa 1500ttgggttacg ttgacagagt tatggatgtc ttgaagagaa gatccaacga agttgaaatt 1560gaaatcttca tcgatgttga accagaccca tccattcaaa ccgtccaaaa gggtttggct 1620gtcatgaaca ctttcggtcc agacaacatc attgccattg gtggtggttc tgccatggat 1680gctgccaaga tcatgtggtt attatacgaa catccagaag ctgatttctt cgctatgaag 1740caaaaattca tcgatttaag aaagagagct ttcaagttcc caaccatggg taagaaggcc 1800agattaatct gtatcccaac cacttctggt accggttctg aagtcacccc attcgctgtc 1860atctctgacc acgaaactgg taagaagtat ccattggctg actactcttt gaccccatcc 1920gttgccattg ttgacccaat gtttaccatg tccttgccta agagagccat tgctgacact 1980ggtttggatg tcttagtcca cgctactgaa gcttacgttt ctgttatggc taacgaatac 2040actgacggtt tggccagaga agctgtcaaa ttggttttcg aaaacttgtt gaaatcttac 2100aacggtgact tggaagctcg tgaaaagatg cacaacgctg ctaccattgc tggtatggcc 2160tttgcttctg ctttcttggg tatggaccat tccatggctc acaaggtcgg tgctgctttc 2220catttgccac acggtagatg tgttgccgtt ttgttgcctc acgttatcag atacaacggt 2280caaaagccaa gaaagttggc catgtggcca aagtacaact tctacaaggc tgatcaaaga 2340tacatggaat tggctcaaat ggtcggtttg aagtgtaaca ccccagctga aggtgtcgaa 2400gcctttgcca aggcttgtga agaattgatg aaggctactg aaaccatcac tggtttcaag 2460aaggccaaca ttgatgaagc tgcttggatg tccaaggttc cagaaatggc tctattggct 2520ttcgaagacc aatgttctcc agctaaccca agagtcccaa tggttaagga catggaaaag 2580attttgaagg ctgcttacta cccaatcgct 261050870PRTEntamoeba histolytica 50Met Ser Thr Gln Gln Thr Met Thr Val Asp Glu His Ile Asn Gln Leu1 5 10 15Val Arg Lys Ala Gln Val Ala Leu Lys Glu Tyr Leu Lys Pro Glu Tyr 20 25 30Thr Gln Glu Lys Ile Asp Tyr Ile Val Lys Lys Ala Ser Val Ala Ala 35 40 45Leu Asp Gln His Cys Ala Leu Ala Ala Ala Ala Val Glu Glu Thr Gly 50 55 60Arg Gly Ile Phe Glu Asp Lys Ala Thr Lys Asn Ile Phe Ala Cys Glu65 70 75 80His Val Thr His Glu Met Arg His Ala Lys Thr Val Gly Ile Ile Asn 85 90 95Val Asp Pro Leu Tyr Gly Ile Thr Glu Ile Ala Glu Pro Val Gly Val 100 105 110Val Cys Gly Val Thr Pro Val Thr Asn Pro Thr Ser Thr Ala Ile Phe 115 120 125Lys Ser Leu Ile Ser Ile Lys 
Thr Arg Asn Pro Ile Val Phe Ser Phe 130 135 140His Pro Ser Ala Leu Lys Cys Ser Ile Met Ala Ala Lys Ile Val Arg145 150 155 160Asp Ala Ala Ile Ala Ala Gly Ala Pro Glu Asn Cys Ile Gln Trp Ile 165 170 175Glu Phe Gly Gly Ile Glu Ala Ser Asn Lys Leu Met Asn His Pro Gly 180 185 190Val Ala Thr Ile Leu Ala Thr Gly Gly Asn Ala Met Val Lys Ala Ala 195 200 205Tyr Ser Ser Gly Lys Pro Ala Leu Gly Val Gly Ala Gly Asn Val Pro 210 215 220Thr Tyr Ile Glu Lys Thr Cys Asn Ile Lys Gln Ala Ala Asn Asp Val225 230 235 240Val Met Ser Lys Ser Phe Asp Asn Gly Met Ile Cys Ala Ser Glu Gln 245 250 255Ala Ala Ile Ile Asp Lys Glu Ile Tyr Asp Gln Val Val Glu Glu Met 260 265 270Lys Thr Leu Gly Ala Tyr Phe Ile Asn Glu Glu Glu Lys Ala Lys Leu 275 280 285Glu Lys Phe Met Phe Gly Val Asn Ala Tyr Ser Ala Asp Val Asn Asn 290 295 300Ala Arg Leu Asn Pro Lys Cys Pro Gly Met Ser Pro Gln Trp Phe Ala305 310 315 320Glu Gln Val Gly Ile Lys Val Pro Glu Asp Cys Asn Ile Ile Cys Ala 325 330 335Val Cys Lys Glu Val Gly Pro Asn Glu Pro Leu Thr Arg Glu Lys Leu 340 345 350Ser Pro Val Leu Ala Ile Leu Lys Ala Glu Asn Thr Gln Asp Gly Ile 355 360 365Asp Lys Ala Glu Ala Met Val Glu Phe Asn Gly Arg Gly His Ser Ala 370 375 380Ala Ile His Ser Asn Asp Lys Ala Val Val Glu Lys Tyr Ala Leu Thr385 390 395 400Met Lys Ala Cys Arg Ile Leu His Asn Thr Pro Ser Ser Gln Gly Gly 405 410 415Ile Gly Ser Ile Tyr Asn Tyr Ile Trp Pro Ser Phe Thr Leu Gly Cys 420 425 430Gly Ser Tyr Gly Gly Asn Ser Val Ser Ala Asn Val Thr Tyr His Asn 435 440 445Leu Leu Asn Ile Lys Arg Leu Ala Asp Arg Arg Asn Asn Leu Gln Trp 450 455 460Phe Arg Val Pro Pro Lys Ile Phe Phe Glu Pro His Ser Ile Arg Tyr465 470 475 480Leu Ala Glu Leu Lys Glu Leu Ser Lys Ile Phe Ile Val Ser Asp Arg 485 490 495Met Met Tyr Lys Leu Gly Tyr Val Asp Arg Val Met Asp Val Leu Lys 500 505 510Arg Arg Ser Asn Glu Val Glu Ile Glu Ile Phe Ile Asp Val Glu Pro 515 520 525Asp Pro Ser Ile Gln Thr Val Gln Lys Gly Leu Ala Val Met Asn Thr 530 535 540Phe Gly Pro Asp Asn Ile Ile Ala Ile Gly Gly Gly Ser Ala Met 
Asp545 550 555 560Ala Ala Lys Ile Met Trp Leu Leu Tyr Glu His Pro Glu Ala Asp Phe 565 570 575Phe Ala Met Lys Gln Lys Phe Ile Asp Leu Arg Lys Arg Ala Phe Lys 580 585 590Phe Pro Thr Met Gly Lys Lys Ala Arg Leu Ile Cys Ile Pro Thr Thr 595 600 605Ser Gly Thr Gly Ser Glu Val Thr Pro Phe Ala Val Ile Ser Asp His 610 615 620Glu Thr Gly Lys Lys Tyr Pro Leu Ala Asp Tyr Ser Leu Thr Pro Ser625 630 635 640Val Ala Ile Val Asp Pro Met Phe Thr Met Ser Leu Pro Lys Arg Ala 645 650 655Ile Ala Asp Thr Gly Leu Asp Val Leu Val His Ala Thr Glu Ala Tyr 660 665 670Val Ser Val Met Ala Asn Glu Tyr Thr Asp Gly Leu Ala Arg Glu Ala 675 680 685Val Lys Leu Val Phe Glu Asn Leu Leu Lys Ser Tyr Asn Gly Asp Leu 690 695 700Glu Ala Arg Glu Lys Met His Asn Ala Ala Thr Ile Ala Gly Met Ala705 710 715 720Phe Ala Ser Ala Phe Leu Gly Met Asp His Ser Met Ala His Lys Val 725 730 735Gly Ala Ala Phe His Leu Pro His Gly Arg Cys Val Ala Val Leu Leu 740 745 750Pro His Val Ile Arg Tyr Asn Gly Gln Lys Pro Arg Lys Leu Ala Met 755 760 765Trp Pro Lys Tyr Asn Phe Tyr Lys Ala Asp Gln Arg Tyr Met Glu Leu 770 775 780Ala Gln Met Val Gly Leu Lys Cys Asn Thr Pro Ala Glu Gly Val Glu785 790 795 800Ala Phe Ala Lys Ala Cys Glu Glu Leu Met Lys Ala Thr Glu Thr Ile 805 810 815Thr Gly Phe Lys Lys Ala Asn Ile Asp Glu Ala Ala Trp Met Ser Lys 820 825 830Val Pro Glu Met Ala Leu Leu Ala Phe Glu Asp Gln Cys Ser Pro Ala 835 840 845Asn Pro Arg Val Pro Met Val Lys Asp Met Glu Lys Ile Leu Lys Ala 850 855 860Ala Tyr Tyr Pro Ile Ala865 870512658DNAPiromyces sp. 
E2 51atgtccggat tacaaatgtt ccaaaacctt tctctttacg gtagtctcgc cgaaatcgat 60actagcgaaa agcttaacga agctatggac aaattaactg ctgcccaaga acaattcaga 120gaatacaacc aagaacaagt tgacaaaatc ttcaaggctg ttgctttagc tgcttctcaa 180aaccgtgttg ctttcgctaa gtacgcacac gaagaaaccc aaaagggtgt tttcgaagat 240aaggttatca agaacgaatt cgctgctgat tacatttacc acaagtactg caatgacaag 300accgccggta tcattgaata tgatgaagcc aatggtctta tggaaattgc tgaaccagtt 360ggtccagttg ttggtattgc tccagttact aacccaactt ctactatcat ctacaagtct 420ttaattgcct taaagacccg taactgtatt atcttctcac cacatccagg agctcacaag 480gcctctgttt tcgttgttaa ggtcttacac caagctgctg ttaaggctgg tgccccagaa 540aactgtattc aaatcatctt cccaaagatg gatttaacta ctgaattatt acaccaccaa 600aagactcgtt tcatttgggc tactggtggt ccaggtttag ttcacgcctc ttacacttct 660ggtaagccag ctcttggtgg tggtccaggt aatgctccag ctcttattga tgaaacttgt 720gatatgaacg aagctgttgg ttctatcgtt gtttctaaga ctttcgattg tggtatgatc 780tgtgccactg aaaacgctgt tgtcgttgtc gaatctgtct acgaaaactt cgttgctacc 840atgaagaagc gtggtgccta cttcatgact ccagaagaaa ccaagaaggc ttctaacctt 900cttttcggag aaggtatgag attaaatgct aaggctgttg gtcaaactgc caagacttta 960gctgaaatgg ccggtttcga agtcccagaa aacaccgttg ttctctgtgg tgaagcttct 1020gaagttaaat tcgaagaacc aatggctcac gaaaagttaa ctactatcct cggtatctac 1080aaggctaagg actttgacga tggtgtcaga ttatgtaagg aattagttac tttcggtggt 1140aagggtcaca ctgctgttct ctacaccaac caaaacaaca aggaccgtat tgaaaagtac 1200caaaacgaag ttccagcctt ccacatctta gttgacatgc catcttccct cggttgtatt 1260ggtgatatgt acaacttccg tcttgctcca gctcttacca ttacttgtgg tactatgggt 1320ggtggttcct cctctgataa cattggtcca aagcacttac ttaacatcaa gcgtgttggt 1380atgagacgcg aaaacatgct ttggttcaag attccaaagt ctgtctactt caagcgtgct 1440atcctttctg aagctttatc tgacttacgt gacacccaca agcgtgctat cattattacc 1500gatagaacta tgactatgtt aggtcaaact gacaagatca ttaaggcttg tgaaggtcat 1560ggtatggtct gcactgtcta cgataaggtt gtcccagatc caactatcaa gtgtattatg 1620gaaggtgtta atgaaatgaa cgtcttcaag ccagatttag ctattgctct tggtggtggt 1680tctgctatgg atgccgctaa gatgatgcgt ttattctacg 
aatacccaga ccaagactta 1740caagatattg ctactcgttt cgtcgatatc cgtaagcgtg ttgttggttg tccaaagctt 1800ggtagactta ttaagactct tgtctgtatc ccaactacct ctggtactgg tgccgaagtt 1860actccattcg ctgtcgttac ctctgaagaa ggtcgtaagt acccattagt cgactacgaa 1920cttactccag atatggctat tgttgatcca gaattcgctg ttggtatgcc aaagcgttta 1980acttcttgga ctggtattga tgctcttacc cacgccattg aatcttacgt ttctattatg 2040gctactgact tcactagacc atactctctc cgtgctgttg gtcttatctt cgaatccctt 2100tcccttgctt acaacaacgg taaggatatt gaagctcgtg aaaagatgca caatgcttct 2160gctattgctg gtatggcctt tgccaacgct ttccttggtt gttgtcactc tgttgctcac 2220caacttggtt ccgtctacca cattccacac ggtcttgcca acgctttaat gctttctcac 2280atcattaagt acaacgctac tgactctcca gttaagatgg gtaccttccc acaatacaag 2340tacccacaag ctatgcgtca ctacgctgaa attgctgaac tcttattacc accaactcaa 2400gttgttaaga tgactgatgt tgataaggtt caatacttaa ttgaccgtgt tgaacaatta 2460aaggctgacg ttggtattcc aaagtctatt aaggaaactg gaatggttac tgaagaagac 2520ttcttcaaca aggttgacca agttgctatc atggccttcg atgaccaatg tactggtgct 2580aacccacgtt acccattagt ttctgaatta aaacaattaa tgattgatgc ctggaacggt 2640gttgtcccaa agctctaa 265852885PRTPiromyces sp. 
E2 52Met Ser Gly Leu Gln Met Phe Gln Asn Leu Ser Leu Tyr Gly Ser Leu1 5 10 15Ala Glu Ile Asp Thr Ser Glu Lys Leu Asn Glu Ala Met Asp Lys Leu 20 25 30Thr Ala Ala Gln Glu Gln Phe Arg Glu Tyr Asn Gln Glu Gln Val Asp 35 40 45Lys Ile Phe Lys Ala Val Ala Leu Ala Ala Ser Gln Asn Arg Val Ala 50 55 60Phe Ala Lys Tyr Ala His Glu Glu Thr Gln Lys Gly Val Phe Glu Asp65 70 75 80Lys Val Ile Lys Asn Glu Phe Ala Ala Asp Tyr Ile Tyr His Lys Tyr 85 90 95Cys Asn Asp Lys Thr Ala Gly Ile Ile Glu Tyr Asp Glu Ala Asn Gly 100 105 110Leu Met Glu Ile Ala Glu Pro Val Gly Pro Val Val Gly Ile Ala Pro 115 120 125Val Thr Asn Pro Thr Ser Thr Ile Ile Tyr Lys Ser Leu Ile Ala Leu 130 135 140Lys Thr Arg Asn Cys Ile Ile Phe Ser Pro His Pro Gly Ala His Lys145 150 155 160Ala Ser Val Phe Val Val Lys Val Leu His Gln Ala Ala Val Lys Ala 165 170 175Gly Ala Pro Glu Asn Cys Ile Gln Ile Ile Phe Pro Lys Met Asp Leu 180 185 190Thr Thr Glu Leu Leu His His Gln Lys Thr Arg Phe Ile Trp Ala Thr 195 200 205Gly Gly Pro Gly Leu Val His Ala Ser Tyr Thr Ser Gly Lys Pro Ala 210 215 220Leu Gly Gly Gly Pro Gly Asn Ala Pro Ala Leu Ile Asp Glu Thr Cys225 230 235 240Asp Met Asn Glu Ala Val Gly Ser Ile Val Val Ser Lys Thr Phe Asp 245 250 255Cys Gly Met Ile Cys Ala Thr Glu Asn Ala Val Val Val Val Glu Ser 260 265 270Val Tyr Glu Asn Phe Val Ala Thr Met Lys Lys Arg Gly Ala Tyr Phe 275 280 285Met Thr Pro Glu Glu Thr Lys Lys Ala Ser Asn Leu Leu Phe Gly Glu 290 295 300Gly Met Arg Leu Asn Ala Lys Ala Val Gly Gln Thr Ala Lys Thr Leu305 310 315 320Ala Glu Met Ala Gly Phe Glu Val Pro Glu Asn Thr Val Val Leu Cys 325 330 335Gly Glu Ala Ser Glu Val Lys Phe Glu Glu Pro Met Ala His Glu Lys 340 345 350Leu Thr Thr Ile Leu Gly Ile Tyr Lys Ala Lys Asp Phe Asp Asp Gly 355 360 365Val Arg Leu Cys Lys Glu Leu Val Thr Phe Gly Gly Lys Gly His Thr 370 375 380Ala Val Leu Tyr Thr Asn Gln Asn Asn Lys Asp Arg Ile Glu Lys Tyr385 390 395 400Gln Asn Glu Val Pro Ala Phe His Ile Leu Val Asp Met Pro Ser Ser 405 410 415Leu Gly Cys Ile Gly Asp Met Tyr Asn Phe Arg Leu 
Ala Pro Ala Leu 420 425 430Thr Ile Thr Cys Gly Thr Met Gly Gly Gly Ser Ser Ser Asp Asn Ile 435 440 445Gly Pro Lys His Leu Leu Asn Ile Lys Arg Val Gly Met Arg Arg Glu 450 455 460Asn Met Leu Trp Phe Lys Ile Pro Lys Ser Val Tyr Phe Lys Arg Ala465 470 475 480Ile Leu Ser Glu Ala Leu Ser Asp Leu Arg Asp Thr His Lys Arg Ala 485 490 495Ile Ile Ile Thr Asp Arg Thr Met Thr Met Leu Gly Gln Thr Asp Lys 500 505 510Ile Ile Lys Ala Cys Glu Gly His Gly Met Val Cys Thr Val Tyr Asp 515 520 525Lys Val Val Pro Asp Pro Thr Ile Lys Cys Ile Met Glu Gly Val Asn 530 535 540Glu Met Asn Val Phe Lys Pro Asp Leu Ala Ile Ala Leu Gly Gly Gly545 550 555 560Ser Ala Met Asp Ala Ala Lys Met Met Arg Leu Phe Tyr Glu Tyr Pro 565 570 575Asp Gln Asp Leu Gln Asp Ile Ala Thr Arg Phe Val Asp Ile Arg Lys 580 585 590Arg Val Val Gly Cys Pro Lys Leu Gly Arg Leu Ile Lys Thr Leu Val 595 600 605Cys Ile Pro Thr Thr Ser Gly Thr Gly Ala Glu Val Thr Pro Phe Ala 610 615 620Val Val Thr Ser Glu Glu Gly Arg Lys Tyr Pro Leu Val Asp Tyr Glu625 630 635 640Leu Thr Pro Asp Met Ala Ile Val Asp Pro Glu Phe Ala Val Gly Met 645 650 655Pro Lys Arg Leu Thr Ser Trp Thr Gly Ile Asp Ala Leu Thr His Ala

660 665 670Ile Glu Ser Tyr Val Ser Ile Met Ala Thr Asp Phe Thr Arg Pro Tyr 675 680 685Ser Leu Arg Ala Val Gly Leu Ile Phe Glu Ser Leu Ser Leu Ala Tyr 690 695 700Asn Asn Gly Lys Asp Ile Glu Ala Arg Glu Lys Met His Asn Ala Ser705 710 715 720Ala Ile Ala Gly Met Ala Phe Ala Asn Ala Phe Leu Gly Cys Cys His 725 730 735Ser Val Ala His Gln Leu Gly Ser Val Tyr His Ile Pro His Gly Leu 740 745 750Ala Asn Ala Leu Met Leu Ser His Ile Ile Lys Tyr Asn Ala Thr Asp 755 760 765Ser Pro Val Lys Met Gly Thr Phe Pro Gln Tyr Lys Tyr Pro Gln Ala 770 775 780Met Arg His Tyr Ala Glu Ile Ala Glu Leu Leu Leu Pro Pro Thr Gln785 790 795 800Val Val Lys Met Thr Asp Val Asp Lys Val Gln Tyr Leu Ile Asp Arg 805 810 815Val Glu Gln Leu Lys Ala Asp Val Gly Ile Pro Lys Ser Ile Lys Glu 820 825 830Thr Gly Met Val Thr Glu Glu Asp Phe Phe Asn Lys Val Asp Gln Val 835 840 845Ala Ile Met Ala Phe Asp Asp Gln Cys Thr Gly Ala Asn Pro Arg Tyr 850 855 860Pro Leu Val Ser Glu Leu Lys Gln Leu Met Ile Asp Ala Trp Asn Gly865 870 875 880Val Val Pro Lys Leu 885

* * * * *

