Biofuel Production

YOSHIKUNI; Yasuo ;   et al.

Patent Application Summary

U.S. patent application number 13/794610 was filed with the patent office on 2013-03-11 and published on 2013-09-26 for biofuel production. This patent application is currently assigned to BIO ARCHITECTURE LAB, INC. The applicant listed for this patent is BIO ARCHITECTURE LAB, INC. Invention is credited to Yuki KASHIYAMA, Yasuo YOSHIKUNI.

Application Number: 13/794610
Publication Number: 20130252312
Family ID: 40227859
Publication Date: 2013-09-26

United States Patent Application 20130252312
Kind Code A1
YOSHIKUNI; Yasuo ;   et al. September 26, 2013

BIOFUEL PRODUCTION

Abstract

Methods, enzymes, recombinant microorganisms, and microbial systems are provided for converting polysaccharides, such as those derived from biomass, into suitable monosaccharides or oligosaccharides, as well as for converting suitable monosaccharides or oligosaccharides into commodity chemicals, such as biofuels. Commodity chemicals produced by the methods described herein are also provided. Commodity chemical enriched, refinery-produced petroleum products are also provided, as well as methods for producing the same.


Inventors: YOSHIKUNI; Yasuo; (Albany, CA) ; KASHIYAMA; Yuki; (Berkeley, CA)
Applicant: BIO ARCHITECTURE LAB, INC. (Berkeley, CA, US)
Assignee: BIO ARCHITECTURE LAB, INC. (Berkeley, CA)

Family ID: 40227859
Appl. No.: 13/794610
Filed: March 11, 2013

Related U.S. Patent Documents

Application Number Filing Date Patent Number
12245537 Oct 3, 2008
13794610
60977628 Oct 4, 2007

Current U.S. Class: 435/252.33 ; 435/254.2
Current CPC Class: C12P 7/38 20130101; Y02E 50/17 20130101; C12P 7/18 20130101; Y02E 50/10 20130101; Y02T 50/678 20130101; C12P 7/04 20130101; C12P 5/02 20130101; C12P 17/10 20130101; C12P 7/02 20130101; C12N 15/70 20130101; C12P 7/26 20130101; C12P 7/06 20130101; C12P 5/026 20130101; C12P 7/22 20130101; C12P 7/16 20130101
Class at Publication: 435/252.33 ; 435/254.2
International Class: C12N 15/70 20060101 C12N015/70

Claims



1. A recombinant microorganism for production of a commodity chemical, comprising recombinant DNA encoding a transporter, wherein the transporter transports an alginate-derived polysaccharide into the recombinant microorganism, and wherein said polysaccharide is converted to said commodity chemical in said microorganism.

2. The microorganism of claim 1 wherein the transporter is a monosaccharide transporter, disaccharide transporter, trisaccharide transporter, oligosaccharide transporter, or polysaccharide transporter.

3. The microorganism of claim 1 wherein the transporter is a symporter, ABC transporter, or permease.

4. The microorganism of claim 1 wherein the transporter is a superchannel or outer membrane porin.

5. The microorganism of claim 1 wherein the transporter comprises SEQ ID NO: 8, SEQ ID NO: 24, or SEQ ID NO: 38.

6. The microorganism of claim 1 wherein the alginate-derived polysaccharide is selected from the group consisting of a dialginate, trialginate, pentalginate, hexylginate, heptalginate, octalginate, nonalginate, decalginate, undecalginate, dodecalginate, and polyalginate.

7. The microorganism of claim 1 wherein the alginate-derived polysaccharide is a saturated polysaccharide.

8. The microorganism of claim 1 wherein the alginate-derived polysaccharide is an unsaturated polysaccharide.

9. The microorganism of claim 1 wherein the alginate-derived polysaccharide is selected from the group consisting of β-D-mannuronate, α-L-guluronate, 4-deoxy-L-erythro-5-hexoseulose uronic acid, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-D-mannuronate or L-guluronate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-dialginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-trialginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-tetralginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-pentalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-hexylginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-heptalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-octalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-nonalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-undecalginate, and 4-(4-deoxy-beta-D-mann-4-enuronosyl)-dodecalginate.

10. The microorganism of claim 1 wherein the microorganism is yeast.

11. The microorganism of claim 1 wherein the microorganism is E. coli.

12. A system for the production of a commodity chemical, comprising a) an alginate-derived polysaccharide; and b) a recombinant microorganism comprising recombinant DNA encoding a transporter, wherein the transporter transports an alginate-derived polysaccharide into the recombinant microorganism and wherein said polysaccharide is converted to said commodity chemical in said microorganism.

13. The system of claim 12 wherein the transporter is a symporter, ABC transporter, or permease.

14. The system of claim 12 wherein the transporter is a superchannel or outer membrane porin.

15. The system of claim 12 wherein the transporter comprises SEQ ID NO: 8, SEQ ID NO: 24, or SEQ ID NO: 38.
Description



CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application is a continuation of copending U.S. patent application Ser. No. 12/245,537, with a filing date of Oct. 3, 2008, which claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application No. 60/977,628 filed Oct. 4, 2007, all of which are incorporated herein by reference in their entirety.

SUBMISSION OF SEQUENCE LISTING AS ASCII TEXT FILE

[0002] The content of the following submission on ASCII text file is incorporated herein by reference in its entirety: a computer readable form (CRF) of the Sequence Listing (file name: 690212000607SeqList.txt, date recorded: Mar. 11, 2013, size: 519 KB).

TECHNICAL FIELD

[0003] The present application relates generally to the use of microbial and chemical systems to convert biomass to commodity chemicals, such as biofuels/biopetrols.

BACKGROUND

[0004] Petroleum is facing declining global reserves and contributes to more than 30% of greenhouse gas emissions driving global warming. Annually 800 billion barrels of transportation fuel are consumed globally. Diesel and jet fuels account for greater than 50% of global transportation fuels.

[0005] Significant legislation has been passed, requiring fuel producers to cap or reduce the carbon emissions from the production and use of transportation fuels. Fuel producers are seeking substantially similar, low carbon fuels that can be blended and distributed through existing infrastructure (e.g., refineries, pipelines, tankers).

[0006] Due to increasing petroleum costs and reliance on petrochemical feedstocks, the chemicals industry is also looking for ways to improve margin and price stability, while reducing its environmental footprint. The chemicals industry is striving to develop greener products that are more energy, water, and CO2 efficient than current products. Fuels produced from biological sources, such as biomass, represent one aspect of this process.

[0007] Present methods for converting biomass into biofuels focus on the use of lignocellulosic biomass, and there are many problems associated with using this process. Large-scale cultivation of lignocellulosic biomass requires a substantial amount of cultivated land, which can only be achieved by replacing food crop production with energy crop production, by deforestation, and by recultivating currently uncultivated land. Other problems include a decrease in water availability and quality and an increase in the use of pesticides and fertilizers.

[0008] The degradation of lignocellulosic biomass using biological systems is a very difficult challenge due to its substantial mechanical strength and complex chemical components. Approximately thirty different enzymes are required to fully convert lignocellulose to monosaccharides. The only available alternative to this complex approach requires a substantial amount of heat, pressure, and strong acids. The art therefore needs an economic and technically simple process for converting biomass into hydrocarbons for use as biofuels or biopetrols.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] FIG. 1 shows the Vibrio splendidus genomic region of the fosmid clone described in Example 1. Genes are indicated with orange arrows. Labels show the numerical gene indices and the predicted function of the proteins.

[0010] FIG. 2 illustrates the pathways involved in certain embodiments in which E. coli may be engineered to grow on alginate as a sole source of carbon.

[0011] FIG. 3 illustrates the pathways involved in certain embodiments in which E. coli may be engineered to grow on pectin as a sole source of carbon.

[0012] FIG. 4 shows the results of engineered or recombinant E. coli growing on alginate as a sole source of carbon (see solid circles). Agrobacterium tumefaciens cells provide a positive control (see hatched circles). The well to the immediate left of the A. tumefaciens positive control contains DH10B E. coli cells, which provide a negative control.

[0013] FIG. 5 shows the growth of a recombinant strain of E. coli on galacturonates and pectin. FIG. 5A shows the growth of E. coli on various lengths of galacturonate after 24 hr. The recombinant strain in FIG. 5A is the E. coli BL21(DE3) strain harboring pTrlogl-kdgR+pBBRGal3P, and the control strain is the BL21(DE3) strain harboring pTrc99A+pBBR1MCS-2, as described in Example 2. FIG. 5B shows the growth of recombinant E. coli on pectin after 3-4 days. The recombinant strain in FIG. 5B is E. coli DH5α strain containing pPEL74 (Ctrl) and pPEL74 and pROU2, as described in Example 2.

[0014] FIG. 6 shows the degradation of alginate to form pyruvate. FIG. 6A illustrates a simplified metabolic pathway for alginate degradation and metabolism. FIG. 6B shows the results of in vitro degradation of alginate to form pyruvate by an enzymatic degradation route. FIG. 6C shows the results of in vitro degradation of alginate to form pyruvate by a chemical degradation route.

[0015] FIG. 7 shows the biological activity of various alcohol dehydrogenases isolated from Agrobacterium tumefaciens C58. FIG. 7A shows DEHU hydrogenase activity as monitored by NADPH consumption, and FIG. 7B shows mannuronate hydrogenase activity as monitored by NADPH consumption.

[0016] FIG. 8 shows the GC-MS chromatogram results for the control sample (FIG. 8A) and for isobutyraldehyde, 3-methylpentanol, and 2-methylpentanal production from pBADalsS-ilvCD-leuABCD2 and pTrcBALK (FIG. 8B).

[0017] FIG. 9 shows the GC-MS chromatogram results for the control sample (FIG. 9A) and for 4-hydroxyphenylethanol and indole-3-ethanol production from pBADtyrA-aroLAC-aroG-tktA-aroBDE and pTrcBALK (FIG. 9B).

[0018] FIG. 10 shows the mass spectrometry results for isobutanal (FIG. 10A), 3-methylpentanol (FIG. 10B), and 2-methylpentanol (FIG. 10C).

[0019] FIG. 11 shows the mass spectrometry results for phenylethanol (FIG. 11A), 4-hydroxyphenylethanol (FIG. 11B), and indole-3-ethanol (FIG. 11C).

[0020] FIG. 12 shows the biological activity of diol dehydrogenases. FIG. 12A shows the reduction of butyroin by ddh1, ddh2, and ddh3 as monitored by NADH consumption. FIG. 12B shows the oxidation activity of ddh3 towards 1,2-cyclopentanediol and 1,2-cyclohexanediol as measured by NADH production.

[0021] FIG. 13 summarizes the results of kinetic studies for various substrates in the oxidation reactions catalyzed by the DDH polypeptides. These reactions were NAD+ dependent.

[0022] FIG. 14 shows the nucleotide sequence (FIG. 14A) (SEQ ID NO:97) and polypeptide sequence (FIG. 14B) (SEQ ID NO:98) of diol dehydrogenase DDH1 isolated from Lactobacillus brevis ATCC 367.

[0023] FIG. 15 shows the nucleotide sequence (FIG. 15A) (SEQ ID NO:99) and polypeptide sequence (FIG. 15B) (SEQ ID NO:100) of diol dehydrogenase DDH2 isolated from Pseudomonas putida KT2440.

[0024] FIG. 16 shows the nucleotide sequence (FIG. 16A) (SEQ ID NO:101) and polypeptide sequence (FIG. 16B) (SEQ ID NO:102) of diol dehydrogenase DDH3 isolated from Klebsiella pneumoniae MGH78578.

[0025] FIG. 17 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of butanal into 5-hydroxy-4-octanone and then 4,5-octanediol. FIG. 17A shows the detection of butyroin (5-hydroxy-4-octanone) at 5.36 minutes, and FIG. 17B shows the detection of 4,5-octanediol at 6.49 and 6.65 minutes.

[0026] FIG. 18 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of n-pentanal into 6-hydroxy-5-decanone and then 5,6-decanediol. FIG. 18A shows the detection of valeroin (6-hydroxy-5-decanone) at 8.22 minutes, and FIG. 18B shows the detection of 5,6-decanediol at 9.22 and 9.35 minutes.

[0027] FIG. 19 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of 3-methylbutanal into 2,7-dimethyl-5-hydroxy-4-octanone and then 2,7-dimethyl-4,5-octanediol. FIG. 19A shows the detection of isovaleroin (2,7-dimethyl-5-hydroxy-4-octanone) at 6.79 minutes, and FIG. 19B shows the detection of 2,7-dimethyl-4,5-octanediol at 7.95 and 8.15 minutes.

[0028] FIG. 20 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of n-hexanal into 7-hydroxy-6-dodecanone and then 6,7-dodecanediol. FIG. 20A shows the detection of hexanoin (7-hydroxy-6-dodecanone) at 10.42 minutes, and FIG. 20B shows the detection of 6,7-dodecanediol at 10.89 and 10.95 minutes.

[0029] FIG. 21 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of 4-methylpentanal into 2,9-dimethyl-6-hydroxy-5-decanone and then 2,9-dimethyl-5,6-decanediol. FIG. 21A shows the detection of isohexanoin (2,9-Dimethyl-6-hydroxy-5-decanone) at 9.45 minutes, and FIG. 21B shows the detection of 2,9-dimethyl-5,6-decanediol at 10.38 and 10.44 minutes.

[0030] FIG. 22 shows the in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the conversion of n-octanal into 9-hydroxy-8-hexadecanone by showing the detection of octanoin (9-hydroxy-8-hexadecanone) at 12.35 minutes.

[0031] FIG. 23 shows the in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the conversion of acetaldehyde into 3-hydroxy-2-butanone by showing the detection of acetoin (3-hydroxy-2-butanone) at rt=0.91 minutes.

[0032] FIG. 24 shows the sequential in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the sequential conversion of n-propanal into 4-hydroxy-3-hexanone and then 3,4-hexanediol. FIG. 24A shows the detection of propioin (4-hydroxy-3-hexanone) at rt=2.62 minutes, and FIG. 24B shows the detection of 3,4-hexanediol at rt=3.79 minutes.

[0033] FIG. 25 shows the in vivo biological activity of a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the conversion of phenylacetaldehyde into 1,4-diphenyl-3-hydroxy-2-butanone by showing the detection of 1,4-diphenyl-3-hydroxy-2-butanone at rt=13.66 minutes.

[0034] FIG. 26 shows the sequential biological activity of a diol dehydrogenase ddh from Klebsiella pneumoniae MGH 78578 (DDH3) and a diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578. FIG. 26A shows GC-MS data which confirms the presence of 4,5-octanediol in the sample extraction, which is the expected product resulting from the reduction of butyroin by ddh3. FIG. 26B shows GC-MS data confirming the presence of 4-octanone in the sample extraction, which is the expected product resulting from the sequential reduction of butyroin and dehydration of 4,5-octanediol by ddh3 and pduCDE, respectively.

[0035] FIG. 27 shows the sequential biological activity of a diol dehydrogenase ddh from Klebsiella pneumoniae MGH 78578 (DDH3) and a diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578. FIGS. 27A and 27B show comparisons between the sample extraction gas chromatograph/mass spectrum and the 4-octanone standard gas chromatograph/mass spectrum, confirming that 4-octanone was produced from butyroin using the enzymes diol dehydrogenase (ddh3) and a diol dehydratase (pduCDE).

[0036] FIG. 28 shows the nucleotide sequence (FIG. 28A) (SEQ ID NO:103) and polypeptide sequence (FIG. 28B) (SEQ ID NO:104) of a diol dehydratase large subunit (pduC) isolated from Klebsiella pneumoniae MGH78578.

[0037] FIG. 29 shows the nucleotide sequence (FIG. 29A) (SEQ ID NO:105) and polypeptide sequence (FIG. 29B) (SEQ ID NO:106) of a diol dehydratase medium subunit isolated from Klebsiella pneumoniae MGH78578 (pduD), in addition to the nucleotide sequence (FIG. 29C) (SEQ ID NO:107) and polypeptide sequence (FIG. 29D) (SEQ ID NO:108) of a diol dehydratase small subunit isolated from Klebsiella pneumoniae MGH78578 (pduE).

[0038] FIG. 30 shows the oxidation of 4-octanol by secondary alcohol dehydrogenases as monitored by NADH production (FIG. 30A) and NADPH production (FIG. 30B).

[0039] FIG. 31 shows the oxidation of 4-octanol by secondary alcohol dehydrogenases as monitored by NADH production (FIG. 31A) and NADPH production (FIG. 31B).

[0040] FIG. 32 shows the oxidation of 2,7-dimethyl octanol by secondary alcohol dehydrogenases as monitored by NADH production (FIG. 32A) and NADPH production (FIG. 32B).

[0041] FIG. 33 shows the oxidation and reduction activity of 2ADH11 and 2ADH16. FIG. 33A shows the reduction of 2,7-dimethyl-4-octanone as measured by NADPH consumption. FIG. 33B shows the reduction of 2,7-dimethyl-4-octanone, 4-octanone, and cyclopentanone.

[0042] FIG. 34 shows the oxidation and reduction of cyclopentanol by secondary alcohol dehydrogenases. FIG. 34A shows the oxidation of cyclopentanol as monitored by NADH or NADPH formation. FIG. 34B shows the reduction of cyclopentanone as monitored by NADPH consumption.

[0043] FIG. 35 shows the calculated rate constants for the illustrated reduction reactions for each substrate catalyzed by secondary alcohol dehydrogenase ADH-16 (SEQ ID NO:138).

[0044] FIG. 36 shows the calculated rate constants for the illustrated oxidation reactions for each substrate catalyzed by secondary alcohol dehydrogenase ADH-16 (SEQ ID NO:138).

[0045] FIGS. 37A-B show a list of alginate lyase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.

[0046] FIGS. 38A-E show a list of pectate lyase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.

[0047] FIG. 39A shows a list of rhamnogalacturonan lyase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein. FIG. 39B shows a list of rhamnogalacturonate hydrolase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.

[0048] FIGS. 40A-B show a list of pectin methyl esterase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.

[0049] FIG. 41 shows a list of pectin acetyl esterase genes/proteins that may be utilized according to the methods and recombinant microorganisms described herein.

[0050] FIG. 42 shows the production of 2-phenyl ethanol (FIG. 42A), 2-(4-hydroxyphenyl)ethanol (FIG. 42B), and 2-(indole-3-)ethanol (FIG. 42C) at 24 hours from the recombinant microorganisms described in Example 4, which comprise functional 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and 2-(indole-3-)ethanol biosynthesis pathways.

[0051] FIG. 43 shows the GC-MS chromatogram results that confirm the production of 2-phenyl ethanol (FIG. 43B) at one week from the recombinant microorganisms described in Example 4 (pBADpheA-aroLAC-aroG-tktA-aroBDE and pTrcBALK). FIG. 43A shows the negative control cells (pBAD33 and pTrc99A).

[0052] FIG. 44 shows the GC-MS chromatogram results that confirm the production of 2-(4-hydroxyphenyl)ethanol (9.36 min) and 2-(indole-3-)ethanol (10.32 min) at one week from the recombinant microorganisms described in Example 4 (pBADtyrA-aroLAC-aroG-tktA-aroBDE and pTrcBALK).

[0053] FIG. 45 confirms both the formation of 1-propanal from 1,2-propanediol (FIG. 45A), and the formation of 2-butanone from meso-2,3-butanediol (FIG. 45B), both of which were catalyzed in vitro by an isolated B12-independent diol dehydratase, as described in Example 9.

[0054] FIG. 46A shows the in vivo production of 1-propanol from 1,2-propanediol. FIG. 46B shows the in vivo production of 2-butanol from meso-2,3-butanediol. FIG. 46C shows the in vivo production of cyclopentanone from trans-1,2-cyclopentanediol. These experiments were performed as described in Example 9.

[0055] FIG. 47 shows the results of the TBA assay, as performed in Example 10. The left tube in FIG. 47 represents media taken from an overnight culture of cells expressing Vs24254, showing secretion of an alginate lyase, while the right hand tube shows the TBA reaction using media from cells expressing Vs24259 (negative control). The lack of pink coloration in the negative control indicates that little or no cleavage of the alginate polymer has occurred.

[0056] FIG. 48 shows the in vivo biological activity of a C-C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized benzaldehyde lyase (BAL) catalyzed the in vivo production of 3-hydroxy-2-pentanone and 2-hydroxy-3-pentanone from a ligation reaction between acetaldehyde and propionaldehyde (FIG. 48A), and catalyzed the in vivo production of 4-hydroxy-3-heptanone and 3-hydroxy-4-heptanone from a ligation reaction between propionaldehyde and butyraldehyde (FIG. 48B).

[0057] FIG. 49 shows the in vivo biological activity of a C-C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 3-hydroxy-2-heptanone from a ligation reaction between acetaldehyde and pentanal (FIG. 49A), and catalyzed the in vivo production of 4-hydroxy-3-octanone and 3-hydroxy-4-octanone from a ligation reaction between pentanal and propionaldehyde (FIG. 49B).

[0058] FIG. 50 shows the in vivo biological activity of a C-C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 5-hydroxy-4-nonanone from a ligation reaction between butyraldehyde and pentanal (FIG. 50A), and catalyzed the in vivo production of 2-methyl-5-hydroxy-4-decanone and 2-methyl-4-hydroxy-5-decanone from a ligation reaction between hexanal and 3-methylbutyraldehyde (FIG. 50B).

[0059] FIG. 51 shows the in vivo biological activity of a C-C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 6-methyl-3-hydroxy-2-heptanone from a ligation reaction between acetaldehyde and 4-methylhexanal (FIG. 51A), and catalyzed the in vivo production of 7-methyl-4-hydroxy-3-octanone from a ligation reaction between 4-methylhexanal and propionaldehyde (FIG. 51B).

[0060] FIG. 52 shows the in vivo biological activity of a C-C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 8-methyl-5-hydroxy-4-nonanone from a ligation reaction between 4-methylhexanal and butyraldehyde (FIG. 52A), and catalyzed the in vivo production of 3-hydroxy-2-decanone from a ligation reaction between acetaldehyde and octanal (FIG. 52B).

[0061] FIG. 53 shows the in vivo biological activity of a C-C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 4-hydroxy-3-undecanone from a ligation reaction between octanal and propionaldehyde (FIG. 53A), and catalyzed the in vivo production of 5-hydroxy-4-dodecanone from a ligation reaction between octanal and butyraldehyde (FIG. 53B).

[0062] FIG. 54 shows the in vivo biological activity of a C-C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 6-hydroxy-5-tridecanone from a ligation reaction between octanal and pentanal (FIG. 54A), and catalyzed the in vivo production of 2-methyl-5-hydroxy-4-dodecanone and 2-methyl-4-hydroxy-5-dodecanone from a ligation reaction between octanal and 3-methylbutyraldehyde (FIG. 54B).

[0063] FIG. 55 shows the in vivo biological activity of a C-C ligase isolated from Pseudomonas fluorescens and cloned into E. coli. The GC-MS chromatogram results show that codon-optimized BAL catalyzed the in vivo production of 2-methyl-6-hydroxy-5-tridecanone from a ligation reaction between octanal and 4-methylpentanal.
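
The product names in the ligation figures follow a regular pattern: joining two linear aldehydes at their carbonyl carbons gives an alpha-hydroxy ketone whose chain length is the sum of the two aldehyde chain lengths, with the hydroxy and keto groups on the two carbons that were joined. The following Python sketch illustrates that bookkeeping only. The lookup tables and the function name `acyloin_names` are assumptions made for illustration and are not part of the application; the sketch covers unbranched aldehydes and lists both possible regioisomers, whether or not a given figure reports both.

```python
# Illustrative sketch only: the tables and function name are assumptions,
# not part of the application. Covers linear (unbranched) aldehydes.
CARBONS = {
    "acetaldehyde": 2,
    "propionaldehyde": 3,
    "butyraldehyde": 4,
    "pentanal": 5,
    "hexanal": 6,
    "octanal": 8,
}

# Ketone parent names for every chain length reachable from CARBONS.
KETONE_STEMS = {
    4: "butanone", 5: "pentanone", 6: "hexanone", 7: "heptanone",
    8: "octanone", 9: "nonanone", 10: "decanone", 11: "undecanone",
    12: "dodecanone", 13: "tridecanone", 14: "tetradecanone",
    16: "hexadecanone",
}


def acyloin_names(aldehyde_a, aldehyde_b):
    """Return the hydroxy-ketone name(s) expected from joining two aldehydes.

    The carbonyl carbons are joined, so with m <= n carbons the functional
    groups sit on positions m and m + 1 of an (m + n)-carbon chain.
    """
    m, n = sorted((CARBONS[aldehyde_a], CARBONS[aldehyde_b]))
    stem = KETONE_STEMS[m + n]
    names = [f"{m + 1}-hydroxy-{m}-{stem}"]
    if m != n:
        # A mixed ligation can place the keto group on either joined carbon.
        names.append(f"{m}-hydroxy-{m + 1}-{stem}")
    return names


if __name__ == "__main__":
    print(acyloin_names("acetaldehyde", "propionaldehyde"))
    print(acyloin_names("octanal", "pentanal"))
    print(acyloin_names("butyraldehyde", "butyraldehyde"))
```

For acetaldehyde and propionaldehyde the sketch gives 3-hydroxy-2-pentanone and 2-hydroxy-3-pentanone, the two products named for FIG. 48A. Branched substrates such as 3-methylbutyraldehyde would also need the substituent locants tracked, which this sketch does not attempt.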

[0064] FIG. 56 shows the growth of recombinant E. coli on alginate as a sole source of carbon (FIG. 56A), as described in Example 10. Growth on glucose (FIG. 56B) provides a positive control. The cells were transformed with either no plasmid (BL21, the negative control), one plasmid (e.g., Da or 3a), or two plasmids (e.g., Dk3a and Da3k). The plasmid backbones are indicated by the lower case letters: "a" refers to the pET-DEST42 plasmid backbone and "k" refers to the pENTR/D/TOPO backbone. "D" indicates that the plasmid contains the genomic region Vs24214-24249, while "3" indicates that the plasmid contains the genomic region Vs24189-24209. Thus, Da would be pET-DEST42-Vs24214-24249, Da3k would be pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209, and so on. These results show that the combined genomic regions Vs24214-24249 and Vs24189-24209 are sufficient to confer on E. coli the ability to grow on alginate as a sole source of carbon.
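
The strain labels used for FIG. 56 can be read as pairs of characters, each pair giving a genomic region followed by a plasmid backbone. The following Python sketch decodes such labels using only the mapping stated in the figure description. The function name `decode_strain_label` is hypothetical, and the plasmid-free BL21 control is deliberately not treated as a label.

```python
from typing import List

# Mapping taken from the FIG. 56 description; the names below are illustrative.
REGIONS = {"D": "Vs24214-24249", "3": "Vs24189-24209"}
BACKBONES = {"a": "pET-DEST42", "k": "pENTR/D/TOPO"}


def decode_strain_label(label: str) -> List[str]:
    """Expand a label such as 'Da3k' into the plasmids it denotes.

    Each pair of characters is read as <region code><backbone code>.
    """
    if not label or len(label) % 2 != 0:
        raise ValueError(f"label must be non-empty pairs of codes: {label!r}")
    plasmids = []
    for i in range(0, len(label), 2):
        region_code, backbone_code = label[i], label[i + 1]
        if region_code not in REGIONS or backbone_code not in BACKBONES:
            raise ValueError(f"unrecognized code pair in {label!r}")
        plasmids.append(f"{BACKBONES[backbone_code]}-{REGIONS[region_code]}")
    return plasmids


if __name__ == "__main__":
    for label in ("Da", "3a", "Da3k", "Dk3a"):
        print(label, "->", " and ".join(decode_strain_label(label)))
```

Under this reading, Dk3a carries the same two genomic regions as Da3k with the backbones swapped.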

[0065] FIG. 57 shows the production of ethanol by E. coli growing on alginate, as performed in Example 11. E. coli was transformed with either pBBRPdc-AdhA/B or pBBRPdc-AdhA/B+1.5 FOS and allowed to grow in M9 media containing alginate.

BRIEF SUMMARY

[0066] Embodiments of the present invention include methods for converting a polysaccharide to a commodity chemical, comprising (a) contacting the polysaccharide, wherein the polysaccharide is optionally derived from biomass, with a polysaccharide degrading or depolymerizing metabolic system, wherein the metabolic system is selected from: (i) enzymatic or chemical catalysis, and (ii) a microbial system, wherein the microbial system comprises a recombinant microorganism, wherein the recombinant microorganism comprises one or more exogenous genes that allow it to grow on the polysaccharide as a sole source of carbon, thereby converting the polysaccharide to a suitable monosaccharide or oligosaccharide; and (b) contacting the suitable monosaccharide or oligosaccharide with a commodity chemical biosynthesis pathway, wherein the commodity chemical biosynthesis pathway comprises an aldehyde or ketone biosynthesis pathway, thereby converting the polysaccharide to the commodity chemical.

[0067] In certain aspects, the biomass is selected from marine biomass and vegetable/fruit/plant biomass. In certain aspects, the marine biomass is selected from kelp, giant kelp, sargasso, seaweed, algae, marine microflora, microalgae, and sea grass. In certain aspects, the vegetable/fruit/plant biomass comprises plant peel or pomace. In certain aspects, the vegetable/fruit/plant biomass is selected from citrus, potato, tomato, grape, gooseberry, carrot, mango, sugar-beet, apple, switchgrass, wood, and stover.

[0068] In certain aspects, the polysaccharide is selected from alginate, agar, carrageenan, fucoidan, pectin, polygalacturonate, cellulose, hemicellulose, xylan, arabinan, and mannan. In certain aspects, the suitable monosaccharide or oligosaccharide is selected from 2-keto-3-deoxy D-gluconate (KDG), D-mannitol, guluronate, mannuronate, mannitol, lyxose, glycerol, xylitol, glucose, mannose, galactose, xylose, arabinose, glucuronate, galacturonates, and rhamnose.

In certain aspects, the commodity chemical is selected from methane, methanol, ethane, ethene, ethanol, n-propane, 1-propene, 1-propanol, propanal, acetone, propionate, n-butane, 1-butene, 1-butanol, butanal, butanoate, isobutanal, isobutanol, 2-methylbutanal, 2-methylbutanol, 3-methylbutanal, 3-methylbutanol, 2-butene, 2-butanol, 2-butanone, 2,3-butanediol, 3-hydroxy-2-butanone, 2,3-butanedione, ethylbenzene, ethenylbenzene, 2-phenylethanol, phenylacetaldehyde, 1-phenylbutane, 4-phenyl-1-butene, 4-phenyl-2-butene, 1-phenyl-2-butene, 1-phenyl-2-butanol, 4-phenyl-2-butanol, 1-phenyl-2-butanone, 4-phenyl-2-butanone, 1-phenyl-2,3-butandiol, 1-phenyl-3-hydroxy-2-butanone, 4-phenyl-3-hydroxy-2-butanone, 1-phenyl-2,3-butanedione, n-pentane, ethylphenol, ethenylphenol, 2-(4-hydroxyphenyl)ethanol, 4-hydroxyphenylacetaldehyde, 1-(4-hydroxyphenyl)butane, 4-(4-hydroxyphenyl)-1-butene, 4-(4-hydroxyphenyl)-2-butene, 1-(4-hydroxyphenyl)-1-butene, 1-(4-hydroxyphenyl)-2-butanol, 4-(4-hydroxyphenyl)-2-butanol, 1-(4-hydroxyphenyl)-2-butanone, 4-(4-hydroxyphenyl)-2-butanone, 1-(4-hydroxyphenyl)-2,3-butandiol, 1-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 4-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-2,3-butanonedione, indolylethane, indolylethene, 2-(indole-3-)ethanol, n-pentane, 1-pentene, 1-pentanol, pentanal, pentanoate, 2-pentene, 2-pentanol, 3-pentanol, 2-pentanone, 3-pentanone, 4-methylpentanal, 4-methylpentanol, 2,3-pentanediol, 2-hydroxy-3-pentanone, 3-hydroxy-2-pentanone, 2,3-pentanedione, 2-methylpentane, 4-methyl-1-pentene, 4-methyl-2-pentene, 4-methyl-3-pentene, 4-methyl-2-pentanol, 2-methyl-3-pentanol, 4-methyl-2-pentanone, 2-methyl-3-pentanone, 4-methyl-2,3-pentanediol, 4-methyl-2-hydroxy-3-pentanone, 4-methyl-3-hydroxy-2-pentanone, 4-methyl-2,3-pentanedione, 1-phenylpentane, 1-phenyl-1-pentene, 1-phenyl-2-pentene, 1-phenyl-3-pentene, 1-phenyl-2-pentanol, 1-phenyl-3-pentanol, 1-phenyl-2-pentanone, 1-phenyl-3-pentanone, 1-phenyl-2,3-pentanediol, 
1-phenyl-2-hydroxy-3-pentanone, 1-phenyl-3-hydroxy-2-pentanone, 1-phenyl-2,3-pentanedione, 4-methyl-1-phenylpentane, 4-methyl-1-phenyl-1-pentene, 4-methyl-1-phenyl-2-pentene, 4-methyl-1-phenyl-3-pentene, 4-methyl-1-phenyl-3-pentanol, 4-methyl-1-phenyl-2-pentanol, 4-methyl-1-phenyl-3-pentanone, 4-methyl-1-phenyl-2-pentanone, 4-methyl-1-phenyl-2,3-pentanediol, 4-methyl-1-phenyl-2,3-pentanedione, 4-methyl-1-phenyl-3-hydroxy-2-pentanone, 4-methyl-1-phenyl-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl)pentane, 1-(4-hydroxyphenyl)-1-pentene, 1-(4-hydroxyphenyl)-2-pentene, 1-(4-hydroxyphenyl)-3-pentene, 1-(4-hydroxyphenyl)-2-pentanol, 1-(4-hydroxyphenyl)-3-pentanol, 1-(4-hydroxyphenyl)-2-pentanone, 1-(4-hydroxyphenyl)-3-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanediol, 1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl)pentane, 4-methyl-1-(4-hydroxyphenyl)-2-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentene, 4-methyl-1-(4-hydroxyphenyl)-1-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentanol, 4-methyl-1-(4-hydroxyphenyl)-2-pentanol, 4-methyl-1-(4-hydroxyphenyl)-3-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-indole-3-pentane, 1-(indole-3)-1-pentene, 1-(indole-3)-2-pentene, 1-(indole-3)-3-pentene, 1-(indole-3)-2-pentanol, 1-(indole-3)-3-pentanol, 1-(indole-3)-2-pentanone, 1-(indole-3)-3-pentanone, 1-(indole-3)-2,3-pentanediol, 1-(indole-3)-2-hydroxy-3-pentanone, 1-(indole-3)-3-hydroxy-2-pentanone, 1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3-)pentane, 4-methyl-1-(indole-3)-2-pentene, 4-methyl-1-(indole-3)-3-pentene, 4-methyl-1-(indole-3)-1-pentene, 4-methyl-2-(indole-3)-3-pentanol, 4-methyl-1-(indole-3)-2-pentanol, 4-methyl-1-(indole-3)-3-pentanone, 
4-methyl-1-(indole-3)-2-pentanone, 4-methyl-1-(indole-3)-2,3-pentanediol, 4-methyl-1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3)-3-hydroxy-2-pentanone, 4-methyl-1-(indole-3)-2-hydroxy-3-pentanone, n-hexane, 1-hexene, 1-hexanol, hexanal, hexanoate, 2-hexene, 3-hexene, 2-hexanol, 3-hexanol, 2-hexanone, 3-hexanone, 2,3-hexanediol, 2,3-hexanedione, 3,4-hexanediol, 3,4-hexanedione, 2-hydroxy-3-hexanone, 3-hydroxy-2-hexanone, 3-hydroxy-4-hexanone, 4-hydroxy-3-hexanone, 2-methylhexane, 3-methylhexane, 2-methyl-2-hexene, 2-methyl-3-hexene, 5-methyl-1-hexene, 5-methyl-2-hexene, 4-methyl-1-hexene, 4-methyl-2-hexene, 3-methyl-3-hexene, 3-methyl-2-hexene, 3-methyl-1-hexene, 2-methyl-3-hexanol, 5-methyl-2-hexanol, 5-methyl-3-hexanol, 2-methyl-3-hexanone, 5-methyl-2-hexanone, 5-methyl-3-hexanone, 2-methyl-3,4-hexanediol, 2-methyl-3,4-hexanedione, 5-methyl-2,3-hexanediol, 5-methyl-2,3-hexanedione, 4-methyl-2,3-hexanediol, 4-methyl-2,3-hexanedione, 2-methyl-3-hydroxy-4-hexanone, 2-methyl-4-hydroxy-3-hexanone, 5-methyl-2-hydroxy-3-hexanone, 5-methyl-3-hydroxy-2-hexanone, 4-methyl-2-hydroxy-3-hexanone, 4-methyl-3-hydroxy-2-hexanone, 2,5-dimethylhexane, 2,5-dimethyl-2-hexene, 2,5-dimethyl-3-hexene, 2,5-dimethyl-3-hexanol, 2,5-dimethyl-3-hexanone, 2,5-dimethyl-3,4-hexanediol, 2,5-dimethyl-3,4-hexanedione, 2,5-dimethyl-3-hydroxy-4-hexanone, 5-methyl-1-phenylhexane, 4-methyl-1-phenylhexane, 5-methyl-1-phenyl-1-hexene, 5-methyl-1-phenyl-2-hexene, 5-methyl-1-phenyl-3-hexene, 4-methyl-1-phenyl-1-hexene, 4-methyl-1-phenyl-2-hexene, 4-methyl-1-phenyl-3-hexene, 5-methyl-1-phenyl-2-hexanol, 5-methyl-1-phenyl-3-hexanol, 4-methyl-1-phenyl-2-hexanol, 4-methyl-1-phenyl-3-hexanol, 5-methyl-1-phenyl-2-hexanone, 5-methyl-1-phenyl-3-hexanone, 4-methyl-1-phenyl-2-hexanone, 4-methyl-1-phenyl-3-hexanone, 5-methyl-1-phenyl-2,3-hexanediol, 4-methyl-1-phenyl-2,3-hexanediol, 5-methyl-1-phenyl-3-hydroxy-2-hexanone, 5-methyl-1-phenyl-2-hydroxy-3-hexanone, 4-methyl-1-phenyl-3-hydroxy-2-hexanone, 
4-methyl-1-phenyl-2-hydroxy-3-hexanone, 5-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)hexane, 5-methyl-1-(4-hydroxyphenyl)-1-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexene, 5-methyl-1-(4-hydroxyphenyl)-3-hexene, 4-methyl-1-(4-hydroxyphenyl)-1-hexene, 4-methyl-1-(4-hydroxyphenyl)-2-hexene, 4-methyl-1-(4-hydroxyphenyl)-3-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexanol, 5-methyl-1-(4-hydroxyphenyl)-3-hexanol, 4-methyl-1-(4-hydroxyphenyl)-2-hexanol, 4-methyl-1-(4-hydroxyphenyl)-3-hexanol, 5-methyl-1-(4-hydroxyphenyl)-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 5-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(indole-3-)hexane, 5-methyl-1-(indole-3)-1-hexene, 5-methyl-1-(indole-3)-2-hexene, 5-methyl-1-(indole-3)-3-hexene, 4-methyl-1-(indole-3)-1-hexene, 4-methyl-1-(indole-3)-2-hexene, 4-methyl-1-(indole-3)-3-hexene, 5-methyl-1-(indole-3)-2-hexanol, 5-methyl-1-(indole-3)-3-hexanol, 4-methyl-1-(indole-3)-2-hexanol, 4-methyl-1-(indole-3)-3-hexanol, 5-methyl-1-(indole-3)-2-hexanone, 5-methyl-1-(indole-3)-3-hexanone, 4-methyl-1-(indole-3)-2-hexanone, 4-methyl-1-(indole-3)-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanediol, 4-methyl-1-(indole-3)-2,3-hexanediol, 5-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 5-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 4-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 4-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanedione, 4-methyl-1-(indole-3)-2,3-hexanedione, n-heptane, 1-heptene, 1-heptanol, heptanal, heptanoate, 2-heptene, 3-heptene, 2-heptanol, 
3-heptanol, 4-heptanol, 2-heptanone, 3-heptanone, 4-heptanone, 2,3-heptanediol, 2,3-heptanedione, 3,4-heptanediol, 3,4-heptanedione, 2-hydroxy-3-heptanone, 3-hydroxy-2-heptanone, 3-hydroxy-4-heptanone, 4-hydroxy-3-heptanone, 2-methylheptane, 3-methylheptane, 6-methyl-2-heptene, 6-methyl-3-heptene, 2-methyl-3-heptene, 2-methyl-2-heptene, 5-methyl-2-heptene, 5-methyl-3-heptene, 3-methyl-3-heptene, 2-methyl-3-heptanol, 2-methyl-4-heptanol, 6-methyl-3-heptanol, 5-methyl-3-heptanol, 3-methyl-4-heptanol, 2-methyl-3-heptanone, 2-methyl-4-heptanone, 6-methyl-3-heptanone, 5-methyl-3-heptanone, 3-methyl-4-heptanone, 2-methyl-3,4-heptanediol, 2-methyl-3,4-heptanedione, 6-methyl-3,4-heptanediol, 6-methyl-3,4-heptanedione, 5-methyl-3,4-heptanediol, 5-methyl-3,4-heptanedione, 2-methyl-3-hydroxy-4-heptanone, 2-methyl-4-hydroxy-3-heptanone, 6-methyl-3-hydroxy-4-heptanone, 6-methyl-4-hydroxy-3-heptanone, 5-methyl-3-hydroxy-4-heptanone, 5-methyl-4-hydroxy-3-heptanone, 2,6-dimethylheptane, 2,5-dimethylheptane, 2,6-dimethyl-2-heptene, 2,6-dimethyl-3-heptene, 2,5-dimethyl-2-heptene, 2,5-dimethyl-3-heptene, 3,6-dimethyl-3-heptene, 2,6-dimethyl-3-heptanol, 2,6-dimethyl-4-heptanol, 2,5-dimethyl-3-heptanol, 2,5-dimethyl-4-heptanol, 2,6-dimethyl-3,4-heptanediol, 2,6-dimethyl-3,4-heptanedione, 2,5-dimethyl-3,4-heptanediol, 2,5-dimethyl-3,4-heptanedione, 2,6-dimethyl-3-hydroxy-4-heptanone, 2,6-dimethyl-4-hydroxy-3-heptanone, 2,5-dimethyl-3-hydroxy-4-heptanone, 2,5-dimethyl-4-hydroxy-3-heptanone, n-octane, 1-octene, 2-octene, 1-octanol, octanal, octanoate, 3-octene, 4-octene, 4-octanol, 4-octanone, 4,5-octanediol, 4,5-octanedione, 4-hydroxy-5-octanone, 2-methyloctane, 2-methyl-3-octene, 2-methyl-4-octene, 7-methyl-3-octene, 3-methyl-3-octene, 3-methyl-4-octene, 6-methyl-3-octene, 2-methyl-4-octanol, 7-methyl-4-octanol, 3-methyl-4-octanol, 6-methyl-4-octanol, 2-methyl-4-octanone, 7-methyl-4-octanone, 3-methyl-4-octanone, 6-methyl-4-octanone, 2-methyl-4,5-octanediol, 2-methyl-4,5-octanedione, 
3-methyl-4,5-octanediol, 3-methyl-4,5-octanedione, 2-methyl-4-hydroxy-5-octanone, 2-methyl-5-hydroxy-4-octanone, 3-methyl-4-hydroxy-5-octanone, 3-methyl-5-hydroxy-4-octanone, 2,7-dimethyloctane, 2,7-dimethyl-3-octene, 2,7-dimethyl-4-octene, 2,7-dimethyl-4-octanol, 2,7-dimethyl-4-octanone, 2,7-dimethyl-4,5-octanediol, 2,7-dimethyl-4,5-octanedione, 2,7-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyloctane, 2,6-dimethyl-3-octene, 2,6-dimethyl-4-octene, 3,7-dimethyl-3-octene, 2,6-dimethyl-4-octanol, 3,7-dimethyl-4-octanol, 2,6-dimethyl-4-octanone, 3,7-dimethyl-4-octanone, 2,6-dimethyl-4,5-octanediol, 2,6-dimethyl-4,5-octanedione, 2,6-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyl-5-hydroxy-4-octanone, 3,6-dimethyloctane, 3,6-dimethyl-3-octene, 3,6-dimethyl-4-octene, 3,6-dimethyl-4-octanol, 3,6-dimethyl-4-octanone, 3,6-dimethyl-4,5-octanediol, 3,6-dimethyl-4,5-octanedione, 3,6-dimethyl-4-hydroxy-5-octanone, n-nonane, 1-nonene, 1-nonanol, nonanal, nonanoate, 2-methylnonane, 2-methyl-4-nonene, 2-methyl-5-nonene, 8-methyl-4-nonene, 2-methyl-5-nonanol, 8-methyl-4-nonanol, 2-methyl-5-nonanone, 8-methyl-4-nonanone, 8-methyl-4,5-nonanediol, 8-methyl-4,5-nonanedione, 8-methyl-4-hydroxy-5-nonanone, 8-methyl-5-hydroxy-4-nonanone, 2,8-dimethylnonane, 2,8-dimethyl-3-nonene, 2,8-dimethyl-4-nonene, 2,8-dimethyl-5-nonene, 2,8-dimethyl-4-nonanol, 2,8-dimethyl-5-nonanol, 2,8-dimethyl-4-nonanone, 2,8-dimethyl-5-nonanone, 2,8-dimethyl-4,5-nonanediol, 2,8-dimethyl-4,5-nonanedione, 2,8-dimethyl-4-hydroxy-5-nonanone, 2,8-dimethyl-5-hydroxy-4-nonanone, 2,7-dimethylnonane, 3,8-dimethyl-3-nonene, 3,8-dimethyl-4-nonene, 3,8-dimethyl-5-nonene, 3,8-dimethyl-4-nonanol, 3,8-dimethyl-5-nonanol, 3,8-dimethyl-4-nonanone, 3,8-dimethyl-5-nonanone, 3,8-dimethyl-4,5-nonanediol, 3,8-dimethyl-4,5-nonanedione, 3,8-dimethyl-4-hydroxy-5-nonanone, 3,8-dimethyl-5-hydroxy-4-nonanone, n-decane, 1-decene, 1-decanol, decanoate, 2,9-dimethyldecane, 2,9-dimethyl-3-decene, 2,9-dimethyl-4-decene, 2,9-dimethyl-5-decanol, 
2,9-dimethyl-5-decanone, 2,9-dimethyl-5,6-decanediol, 2,9-dimethyl-6-hydroxy-5-decanone, 2,9-dimethyl-5,6-decanedione, n-undecane, 1-undecene, 1-undecanol, undecanal, undecanoate, n-dodecane, 1-dodecene, 1-dodecanol, dodecanal, dodecanoate, n-tridecane, 1-tridecene, 1-tridecanol, tridecanal, tridecanoate, n-tetradecane, 1-tetradecene, 1-tetradecanol, tetradecanal, tetradecanoate, n-pentadecane, 1-pentadecene, 1-pentadecanol, pentadecanal, pentadecanoate, n-hexadecane, 1-hexadecene, 1-hexadecanol, hexadecanal, hexadecanoate, n-heptadecane, 1-heptadecene, 1-heptadecanol, heptadecanal, heptadecanoate, n-octadecane, 1-octadecene, 1-octadecanol, octadecanal, octadecanoate, n-nonadecane, 1-nonadecene, 1-nonadecanol, nonadecanal, nonadecanoate, eicosane, 1-eicosene, 1-eicosanol, eicosanal, eicosanoate, 3-hydroxypropanal, 1,3-propanediol, 4-hydroxybutanal, 1,4-butanediol, 3-hydroxy-2-butanone, 2,3-butanediol, 1,5-pentanediol, homocitrate, homoisocitrate, b-hydroxy adipate, glutarate, glutarsemialdehyde, glutaraldehyde, 2-hydroxy-1-cyclopentanone, 1,2-cyclopentanediol, cyclopentanone, cyclopentanol, (S)-2-acetolactate, (R)-2,3-dihydroxy-isovalerate, 2-oxoisovalerate, isobutyryl-CoA, isobutyrate, isobutyraldehyde, 5-amino pentaldehyde, 1,10-diaminodecane, 1,10-diamino-5-decene, 1,10-diamino-5-hydroxydecane, 1,10-diamino-5-decanone, 1,10-diamino-5,6-decanediol, 1,10-diamino-6-hydroxy-5-decanone, phenylacetaldehyde, 1,4-diphenylbutane, 1,4-diphenyl-1-butene, 1,4-diphenyl-2-butene, 1,4-diphenyl-2-butanol, 1,4-diphenyl-2-butanone, 1,4-diphenyl-2,3-butanediol, 1,4-diphenyl-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-phenylbutane, 1-(4-hydroxyphenyl)-4-phenyl-1-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butanol, 1-(4-hydroxyphenyl)-4-phenyl-2-butanone, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol, 1-(4-hydroxyphenyl)-4-phenyl-3-hydroxy-2-butanone, 1-(indole-3)-4-phenylbutane, 
1-(indole-3)-4-phenyl-1-butene, 1-(indole-3)-4-phenyl-2-butene, 1-(indole-3)-4-phenyl-2-butanol, 1-(indole-3)-4-phenyl-2-butanone, 1-(indole-3)-4-phenyl-2,3-butanediol, 1-(indole-3)-4-phenyl-3-hydroxy-2-butanone, 4-hydroxyphenylacetaldehyde, 1,4-di(4-hydroxyphenyl)butane, 1,4-di(4-hydroxyphenyl)-1-butene, 1,4-di(4-hydroxyphenyl)-2-butene, 1,4-di(4-hydroxyphenyl)-2-butanol, 1,4-di(4-hydroxyphenyl)-2-butanone, 1,4-di(4-hydroxyphenyl)-2,3-butanediol, 1,4-di(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3-)butane, 1-(4-hydroxyphenyl)-4-(indole-3)-1-butene, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butene, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butanol, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3)-2,3-butanediol, 1-(4-hydroxyphenyl)-4-(indole-3)-3-hydroxy-2-butanone, indole-3-acetaldehyde, 1,4-di(indole-3-)butane, 1,4-di(indole-3)-1-butene, 1,4-di(indole-3)-2-butene, 1,4-di(indole-3)-2-butanol, 1,4-di(indole-3)-2-butanone, 1,4-di(indole-3)-2,3-butanediol, 1,4-di(indole-3)-3-hydroxy-2-butanone, succinate semialdehyde, hexane-1,8-dicarboxylic acid, 3-hexene-1,8-dicarboxylic acid, 3-hydroxy-hexane-1,8-dicarboxylic acid, 3-hexanone-1,8-dicarboxylic acid, 3,4-hexanediol-1,8-dicarboxylic acid, 4-hydroxy-3-hexanone-1,8-dicarboxylic acid, fucoidan, iodine, chlorophyll, carotenoid, calcium, magnesium, iron, sodium, potassium, and phosphate.

[0070] Certain embodiments of the present invention include methods for converting a polysaccharide to a suitable monosaccharide or oligosaccharide, comprising: (a) contacting the polysaccharide, wherein the polysaccharide is optionally obtained from biomass, with a microbial system for a time sufficient to convert the polysaccharide to a suitable monosaccharide or oligosaccharide, wherein the microbial system comprises, (i) at least one gene encoding and expressing an enzyme selected from a lyase and a hydrolase, wherein the lyase and/or hydrolase optionally comprises at least one signal peptide or at least one autotransporter domain; (ii) at least one gene encoding and expressing an enzyme selected from a monosaccharide transporter, a disaccharide transporter, a trisaccharide transporter, an oligosaccharide transporter, a polysaccharide transporter, and a superchannel; and (iii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase, thereby converting the polysaccharide to a suitable monosaccharide or oligosaccharide.

[0071] Certain embodiments of the present invention include methods for converting a polysaccharide to a suitable monosaccharide or oligosaccharide, comprising: (a) contacting the polysaccharide, wherein the polysaccharide is optionally obtained from biomass, with a chemical or enzymatic catalysis pathway for a time sufficient to convert the polysaccharide to a first monosaccharide or oligosaccharide; and (b) contacting the first monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the first monosaccharide or oligosaccharide to the suitable monosaccharide or oligosaccharide, wherein the microbial system comprises, (i) at least one gene encoding and expressing an enzyme selected from a lyase and a hydrolase; (ii) at least one gene encoding and expressing an enzyme selected from a monosaccharide transporter, a disaccharide transporter, a trisaccharide transporter, an oligosaccharide transporter, a polysaccharide transporter, and a superchannel; and (iii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase, thereby converting the polysaccharide to the suitable monosaccharide or oligosaccharide.

[0072] In certain aspects, the lyase is selected from an alginate lyase, a pectate lyase, a polymannuronate lyase, a polyguluronate lyase, a polygalacturonate lyase, and a rhamnogalacturonate lyase. In certain aspects, the hydrolase is selected from an alginate hydrolase, a rhamnogalacturonate hydrolase, a polymannuronate hydrolase, a pectin hydrolase, and a polygalacturonate hydrolase. In certain aspects, the transporter is selected from an ABC transporter, a symporter, and an outer membrane porin. In certain aspects, the ABC transporter is selected from Atu3021, Atu3022, Atu3023, Atu3024, algM1, algM2, AlgQ1, AlgQ2, AlgS, OG2516_05558, OG2516_05563, OG2516_05568, OG2516_05573, TogM, TogN, TogA, TogB, and functional variants thereof. In certain aspects, the symporter is selected from V12B01_24239 (SEQ ID NO:26), V12B01_24194 (SEQ ID NO:8), and TogT, and functional variants thereof. In certain aspects, the outer membrane porin comprises a porin selected from V12B01_24269, KdgM, and KdgN, and functional variants thereof.

[0073] Certain embodiments include a recombinant microorganism that is capable of growing on a polysaccharide as a sole source of carbon, wherein the polysaccharide is selected from alginate, pectin, tri-galacturonate, di-galacturonate, cellulose, and hemi-cellulose. In certain aspects, the polysaccharide is alginate. In certain aspects, the polysaccharide is pectin. In certain aspects, the polysaccharide is tri-galacturonate.

[0074] Certain embodiments include a recombinant microorganism, comprising (i) at least one gene encoding and expressing an enzyme selected from a lyase and a hydrolase, wherein the lyase or hydrolase optionally comprises at least one signal peptide or at least one autotransporter domain; (ii) at least one gene encoding and expressing an enzyme selected from a monosaccharide transporter, a disaccharide transporter, a trisaccharide transporter, an oligosaccharide transporter, a polysaccharide transporter, and a superchannel; and (iii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase. In certain aspects, the microorganism is capable of growing on a polysaccharide as a sole source of carbon. In certain aspects, the polysaccharide is selected from alginate, pectin, and tri-galacturonate.
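The three-part definition in paragraph [0074] can be read as a simple membership test: an organism qualifies only if it expresses at least one enzyme from each of groups (i), (ii), and (iii). The following Python sketch illustrates that reading only. The function name, the set-based representation, and the example inputs are hypothetical and are not part of the application; the class labels are taken from the paragraph above.

```python
# Illustrative sketch only: encodes the three enzyme groups of [0074] as sets
# and checks that a (hypothetical) list of expressed enzyme classes covers
# at least one member of each group.

GROUP_I = {"lyase", "hydrolase"}
GROUP_II = {
    "monosaccharide transporter",
    "disaccharide transporter",
    "trisaccharide transporter",
    "oligosaccharide transporter",
    "polysaccharide transporter",
    "superchannel",
}
GROUP_III = {
    "monosaccharide dehydrogenase",
    "isomerase",
    "dehydratase",
    "kinase",
    "aldolase",
}


def covers_all_three_groups(expressed_classes):
    """Return True if at least one class from each of groups (i)-(iii) is present."""
    expressed = set(expressed_classes)
    return all(expressed & group for group in (GROUP_I, GROUP_II, GROUP_III))


# Hypothetical inputs, chosen for illustration.
print(covers_all_three_groups(["lyase", "superchannel", "kinase"]))  # True
print(covers_all_three_groups(["lyase", "hydrolase", "kinase"]))     # False: no group (ii) member
```

The check is deliberately coarse: it says nothing about expression level, signal peptides, or autotransporter domains, which the paragraph treats as optional features.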

[0075] Certain embodiments include methods for converting a suitable monosaccharide or oligosaccharide to a first commodity chemical comprising, (a) contacting the suitable monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide or oligosaccharide to the first commodity chemical, wherein the microbial system comprises a recombinant microorganism, wherein the microorganism comprises a commodity chemical biosynthesis pathway, thereby converting the suitable monosaccharide or oligosaccharide to the first commodity chemical. In certain aspects, the commodity chemical biosynthesis pathway comprises one or more genes encoding an aldehyde or ketone biosynthesis pathway.

[0076] In certain aspects, the aldehyde or ketone biosynthesis pathway is selected from one or more of an acetaldehyde, a propionaldehyde, a butyraldehyde, an isobutyraldehyde, a 2-methyl-butyraldehyde, a 3-methyl-butyraldehyde, a 2-phenyl acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, a 2-indole-3-acetaldehyde, a glutaraldehyde, a 5-amino-pentaldehyde, a succinate semialdehyde, and a succinate 4-hydroxyphenyl acetaldehyde biosynthesis pathway. In certain aspects, the aldehyde or ketone biosynthesis pathway comprises an acetaldehyde biosynthesis pathway and a biosynthesis pathway selected from a propionaldehyde, a butyraldehyde, an isobutyraldehyde, a 2-methyl-butyraldehyde, a 3-methyl-butyraldehyde, a 2-phenyl acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, and a 2-indole-3-acetaldehyde biosynthesis pathway.

[0077] In certain aspects, the aldehyde or ketone biosynthesis pathway comprises a propionaldehyde biosynthesis pathway and a biosynthesis pathway selected from a butyraldehyde, an isobutyraldehyde, a 2-methyl-butyraldehyde, a 3-methyl-butyraldehyde, and a phenylacetaldehyde biosynthesis pathway. In certain aspects, the aldehyde or ketone biosynthesis pathway comprises a butyraldehyde biosynthesis pathway and a biosynthesis pathway selected from an isobutyraldehyde, a 2-methyl-butyraldehyde, a 3-methyl-butyraldehyde, a 2-phenyl acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, and a 2-indole-3-acetaldehyde biosynthesis pathway. In certain aspects, the aldehyde or ketone biosynthesis pathway comprises an isobutyraldehyde biosynthesis pathway and a biosynthesis pathway selected from a 2-methyl-butyraldehyde, a 3-methyl-butyraldehyde, a 2-phenyl acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, and a 2-indole-3-acetaldehyde biosynthesis pathway.

[0078] In certain aspects, the aldehyde or ketone biosynthesis pathway comprises a 2-methyl-butyraldehyde biosynthesis pathway and a biosynthesis pathway selected from a 3-methyl-butyraldehyde, a 2-phenyl acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, and a 2-indole-3-acetaldehyde biosynthesis pathway. In certain aspects, the aldehyde or ketone biosynthesis pathway comprises a 3-methyl-butyraldehyde biosynthesis pathway and a biosynthesis pathway selected from a 2-phenyl acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, and a 2-indole-3-acetaldehyde biosynthesis pathway. In certain aspects, the aldehyde or ketone biosynthesis pathway comprises a 2-phenyl acetaldehyde biosynthesis pathway and a biosynthesis pathway selected from a 2-(4-hydroxyphenyl)acetaldehyde and a 2-indole-3-acetaldehyde biosynthesis pathway.

[0079] In certain aspects, the aldehyde or ketone biosynthesis pathway comprises a 2-(4-hydroxyphenyl)acetaldehyde biosynthesis pathway and a 2-indole-3-acetaldehyde biosynthesis pathway. In certain aspects, the first commodity chemical is further enzymatically and/or chemically reduced and dehydrated to a second commodity chemical.

[0080] Certain embodiments include methods for converting a suitable monosaccharide or oligosaccharide to a commodity chemical comprising, (a) contacting the suitable monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide or oligosaccharide to the commodity chemical, wherein the microbial system comprises: (i) one or more genes encoding and expressing an aldehyde biosynthesis pathway, wherein the aldehyde biosynthesis pathway comprises one or more genes encoding and expressing a decarboxylase enzyme; and (ii) one or more genes encoding and expressing an aldehyde reductase, thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical. In certain aspects, the decarboxylase enzyme is an indole-3-pyruvate decarboxylase (IPDC). In certain aspects, the IPDC comprises an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO: 312. In certain aspects, the aldehyde reductase enzyme is a phenylacetaldehyde reductase (PAR). In certain aspects, the PAR comprises an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO: 313. In certain aspects, the commodity chemical is selected from 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol.
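The percent-identity thresholds recited above (80%, 90%, 95%, 98%, 99%) can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identical residues over all aligned columns that are not gap-only. That denominator is one common convention among several, and it is an assumption here, not a definition taken from the application. The function names are hypothetical, and the toy sequences are invented for illustration; they are not SEQ ID NO: 312 or 313.

```python
# Illustrative sketch only: percent identity over a pre-computed pairwise alignment.
# Assumption: identity = identical columns / columns where at least one sequence
# has a residue. Other conventions (e.g. dividing by the shorter sequence) exist.

THRESHOLDS = (80, 90, 95, 98, 99)  # percentages recited in the paragraph above


def percent_identity(aligned_a, aligned_b):
    """Percent identity of two equal-length aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def thresholds_met(identity):
    """Return the recited thresholds that a given percent identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Toy sequences (hypothetical): 9 of 10 aligned positions match.
identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQL")
print(identity, thresholds_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool, and the identity figure depends on the alignment parameters used.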

[0081] Certain embodiments include a recombinant microorganism, comprising (i) one or more genes encoding and expressing an aldehyde biosynthesis pathway, wherein the aldehyde biosynthesis pathway comprises one or more genes encoding and expressing a decarboxylase enzyme; and (ii) one or more genes encoding and expressing an aldehyde reductase. In certain aspects, the aldehyde biosynthesis pathway further comprises one or more genes encoding and expressing an enzyme selected from a CoA-linked aldehyde dehydrogenase, an aldehyde dehydrogenase, and an alcohol dehydrogenase. In certain aspects, the decarboxylase enzyme is an indole-3-pyruvate decarboxylase (IPDC). In certain aspects, the aldehyde reductase enzyme is a phenylacetaldehyde reductase (PAR). In certain aspects, the microorganism is capable of converting a suitable monosaccharide or oligosaccharide to a commodity chemical. In certain aspects, the commodity chemical is selected from 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol.

[0082] Certain embodiments include a recombinant microorganism, wherein the microorganism comprises reduced ethanol production capability compared to a wild-type microorganism. In certain aspects, the microorganism comprises a reduction or inhibition in the conversion of acetyl-CoA to ethanol. In certain aspects, the recombinant microorganism comprises a reduction of an ethanol dehydrogenase, thereby providing a reduced ethanol production capability. In certain aspects, the ethanol dehydrogenase is an adhE, or a homolog or variant thereof. In certain aspects, the microorganism comprises a deletion or knockout of an adhE, or a homolog or variant thereof. In certain aspects, the recombinant microorganism comprises one or more deletions or knockouts in a gene encoding an enzyme selected from an enzyme that catalyzes the conversion of acetyl-CoA to ethanol, an enzyme that catalyzes the conversion of pyruvate to lactate, an enzyme that catalyzes the conversion of fumarate to succinate, an enzyme that catalyzes the conversion of acetyl-CoA and phosphate to CoA and acetyl phosphate, an enzyme that catalyzes the conversion of acetyl-CoA and formate to CoA and pyruvate, and an enzyme that catalyzes the conversion of alpha-keto acid to branched chain amino acids.

[0083] In certain embodiments, the microbial systems or recombinant microorganisms described herein comprise a microorganism selected from Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus usamii, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Candida rugosa, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Saccharomyces cerevisiae, Sclerotinia libertiana, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Vibrio alginolyticus, Xanthomonas, yeast, Zygosaccharomyces rouxii, Zymomonas, and Zymomonas mobilis.

[0084] Certain embodiments include a commodity chemical produced by the methods described herein. Certain aspects include a blended commodity chemical comprising a commodity chemical produced by the methods provided herein and a refinery-produced petroleum product. In certain aspects, the commodity chemical is selected from a C10-C12 hydrocarbon, 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol. In certain aspects, the C10-C12 hydrocarbon is selected from 2,7-dimethyloctane and 2,9-dimethyldecane. In certain aspects, the refinery-produced petroleum product is selected from jet fuel and diesel fuel.

[0085] Certain embodiments include methods of producing a commodity chemical enriched refinery-produced petroleum product, comprising (a) blending the refinery-produced petroleum product with the commodity chemical produced by the methods described herein, thereby producing the commodity chemical enriched refinery-produced petroleum product.

DETAILED DESCRIPTION

[0086] Embodiments of the present invention relate to the unexpected discovery that microorganisms which are otherwise incapable of growing on certain polysaccharides derived from biomass as a sole source of carbon, can be engineered to grow on these polysaccharides as a sole source of carbon. Such microorganisms can include both prokaryotic and eukaryotic microorganisms, such as bacteria and yeast. In some aspects, certain laboratory and/or wild-type strains of E. coli can be engineered to grow on biomass derived from either alginate or pectin as a sole source of carbon to produce suitable monosaccharides or other molecules. Among other uses apparent to a person skilled in the art, the monosaccharides and other molecules produced by the growth of these engineered or recombinant microorganisms on alginate or pectin may be utilized as feedstock in the production of various commodity chemicals, such as biofuels.

[0087] Alginate and pectin provide advantages over other biomass sources in the production of biofuel feedstocks. For example, large-scale aquatic farming can generate a significant amount of biomass without replacing food crop production with energy crop production, without deforestation, and without recultivating currently uncultivated land, as most of the hydrosphere, including oceans, rivers, and lakes, remains untapped. As one particular example, the Pacific coast of North America is abundant in minerals necessary for large-scale aqua-farming. Giant kelp, which lives in the area, grows as fast as 1 m/day, the fastest among plants on earth, and grows up to 50 m. Additionally, aqua-farming has other benefits, including the prevention of red tide outbreaks and the creation of a fish-friendly environment.

[0088] As an additional advantage, and in contrast to lignocellulosic biomass, biomass derived from aquatic, fruit, plant and/or vegetable sources is easy to degrade. Such biomass typically lacks lignin and is significantly more fragile than lignocellulosic biomass and can thus be easily degraded using either enzymes or chemical catalysts (e.g., formate). As one example, aquatic biomass such as seaweed may be easily converted to monosaccharides using either enzymes or chemical catalysis, as seaweed has significantly simpler major sugar components (Alginate: 30%, Mannitol: 15%) as compared to lignocellulose (Glucose: 24.1-39%, Mannose: 0.2-4.6%, Galactose: 0.5-2.4%, Xylose: 0.4-22.1%, Arabinose: 1.5-2.8%, and Uronic acids: 1.2-20.7%, with total sugar content corresponding to 36.5-70% of dry weight).

[0089] As an additional example, biomass from plants such as fruits and/or vegetables contains pectin, a heteropolysaccharide derived from the plant cell wall. The characteristic structure of pectin is a linear chain of α-(1-4)-linked D-galacturonic acid that forms the pectin backbone, a homogalacturonan. Pectin can be easily converted to oligosaccharides or suitable monosaccharides using enzymes, chemical catalysis, and/or microbial systems designed to utilize pectin as a source of carbon, as described herein. Saccharification and fermentation using aquatic, fruit, and/or vegetable biomass is much easier than using lignocellulose.

[0090] In this regard, embodiments of the present invention also relate to the surprising discovery that certain microorganisms can be engineered to produce various commodity chemicals, such as biofuels. In certain aspects, these biofuels may include alkanes, such as medium to long chain alkanes, which provide advantages over ethanol-based biofuels. In certain aspects, the monosaccharides (e.g., 2-keto-3-deoxy D-gluconate; KDG) and other molecules produced by the growth of various engineered or recombinant microorganisms (e.g., recombinant microorganisms growing on pectin or alginate as a source of carbon) may be useful in the production of commodity chemicals, such as biofuels. As one example, suitable monosaccharides such as KDG may be utilized by recombinant microorganisms to produce alkanes, such as medium to long chain alkanes, among other chemicals. In certain aspects, such recombinant microorganisms may be utilized to produce such commodity chemicals as 2,7-dimethyloctane and 2,9-dimethyldecane, among others provided herein and known in the art.

[0091] Such processes produce biofuels with significant advantages over other biofuels. In particular, medium to long chain alkanes provide a number of important advantages over existing common biofuels such as ethanol and butanol, and are attractive long-term replacements for petroleum-based fuels such as gasoline, diesels, kerosene, and heavy oils. As one example, medium to long chain alkanes and alcohols are major components in all petroleum products, and jet fuel in particular, and hence the alkanes produced as described herein can be utilized directly by existing engines. By way of further example, medium to long chain alcohols are far better fuels than ethanol, and have an energy density nearly comparable to that of gasoline.

[0092] As another example, n-alkanes are major components of all oil products including gasoline, diesels, kerosene, and heavy oils. Microbial systems or recombinant microorganisms may be used to produce n-alkanes with different carbon lengths ranging, for example, from C7 to over C20: C7 for gasoline (e.g., motor vehicles), C10-C15 for diesels (e.g., motor vehicles, trains, and ships), and C8-C16 for kerosene (e.g., aviation and ships), and for all heavy oils.

[0093] As one aspect of the invention, the commodity chemicals produced by the methods and recombinant microorganisms described herein may be utilized by existing petroleum refineries for the purposes of blending with petroleum products produced by traditional refinery methods. To this end, as noted above, fuel producers are seeking substantially similar, low carbon fuels that can be blended and distributed through existing infrastructure (refineries, pipelines, tankers). As hydrocarbons, the commodity chemicals produced according to the methods herein are substantially similar to petroleum derived fuels, reduce greenhouse gas emissions by more than 80% from petroleum derived fuels, and are compatible with existing infrastructure in the oil and gas industry. For instance, certain of the commodity chemicals produced herein, including, for example, various C10-C12 hydrocarbons such as 2,7 dimethyloctane, 2,7 dimethyldecanone, among others, are blendable directly into refinery-produced petroleum products, such as jet and diesel fuels. By using such biologically produced commodity chemicals as a blendstock for jet and diesel fuels, refineries may reduce greenhouse gas emissions by more than 80%.

[0094] Accordingly, certain embodiments of the present invention relate generally to methods for converting biomass to a commodity chemical, comprising obtaining a polysaccharide from biomass; contacting the polysaccharide with a polysaccharide degrading or depolymerizing pathway, thereby converting the polysaccharide to a suitable monosaccharide. The suitable monosaccharide obtained from such a process may be used for any desired purpose. For instance, in certain aspects, the suitable monosaccharide may then be converted to a commodity chemical (e.g., biofuel) by contacting the suitable monosaccharide with a biofuel biosynthesis pathway, whether as part of a recombinant microorganism, an in vitro enzymatic or chemical pathway, or a combination thereof, thereby converting the monosaccharide to a commodity chemical.

[0095] In other aspects, in producing a commodity chemical such as a biofuel, a suitable monosaccharide may be obtained directly from any available source and converted to a commodity chemical by contacting the suitable monosaccharide with a biofuel biosynthesis pathway, as described herein. Among other uses apparent to a person skilled in the art, such biofuels may then be blended directly with refinery produced petroleum products, such as jet and diesel fuels, to produce commodity chemical enriched, refinery-produced petroleum products.

DEFINITIONS

[0096] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, preferred methods and materials are described. For the purposes of the present invention, the following terms are defined below. All references referred to herein are incorporated by reference in their entirety.

[0097] The articles "a" and "an" are used herein to refer to one or to more than one (i.e. to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.

[0098] By "about" is meant a quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length that varies by as much as 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1% to a reference quantity, level, value, number, frequency, percentage, dimension, size, amount, weight or length.

[0099] The term "biologically active fragment", as applied to fragments of a reference polynucleotide or polypeptide sequence, refers to a fragment that has at least about 0.1, 0.5, 1, 2, 5, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100, 110, 120, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000% or more of the activity of a reference sequence.

[0100] The term "reference sequence" refers generally to a nucleic acid coding sequence, or amino acid sequence, of any enzyme having a biological activity described herein (e.g., saccharide dehydrogenase, alcohol dehydrogenase, dehydratase, lyase, transporter, decarboxylase, hydrolase, etc.), such as a "wild-type" sequence, including those reference sequences exemplified by SEQ ID NOS:1-144, and 308-313. A reference sequence may also include naturally-occurring, functional variants (i.e., orthologs or homologs) of the sequences described herein.

[0101] Included within the scope of the present invention are biologically active fragments of at least about 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 500, 600 or more contiguous nucleotides or amino acid residues in length, including all integers in between, which comprise or encode a polypeptide having an enzymatic activity of a reference polynucleotide or polypeptide. Representative biologically active fragments generally participate in an interaction, e.g., an intra-molecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction. Examples of enzymatic interactions or activities include saccharide dehydrogenase activities, alcohol dehydrogenase activities, dehydratase activities, lyase activities, transporter activities, isomerase activities, kinase activities, among others described herein. Biologically active fragments typically comprise one or more active sites or enzymatic/binding motifs, as described herein and known in the art.

[0102] By "coding sequence" is meant any nucleic acid sequence that contributes to the code for the polypeptide product of a gene. By contrast, the term "non-coding sequence" refers to any nucleic acid sequence that does not contribute to the code for the polypeptide product of a gene.

[0103] Throughout this specification, unless the context requires otherwise, the words "comprise", "comprises" and "comprising" will be understood to imply the inclusion of a stated step or element or group of steps or elements but not the exclusion of any other step or element or group of steps or elements.

[0104] By "consisting of" is meant including, and limited to, whatever follows the phrase "consisting of." Thus, the phrase "consisting of" indicates that the listed elements are required or mandatory, and that no other elements may be present.

[0105] By "consisting essentially of" is meant including any elements listed after the phrase, and limited to other elements that do not interfere with or contribute to the activity or action specified in the disclosure for the listed elements. Thus, the phrase "consisting essentially of" indicates that the listed elements are required or mandatory, but that other elements are optional and may or may not be present depending upon whether or not they affect the activity or action of the listed elements.

[0106] The terms "complementary" and "complementarity" refer to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence "A-G-T," is complementary to the sequence "T-C-A." Complementarity may be "partial," in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be "complete" or "total" complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands.
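
The distinction between "partial" and "complete" complementarity can be illustrated with a minimal Python sketch. The function names `complement` and `complementarity` are hypothetical and introduced only for illustration. Following the "A-G-T"/"T-C-A" example above, bases are paired position by position, and strand orientation is not modeled.

```python
# Watson-Crick base pairing for DNA, applied position by position
# (mirrors the "A-G-T" / "T-C-A" example; strand orientation is ignored).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def complementarity(seq_a: str, seq_b: str) -> float:
    """Return the fraction of positions at which seq_b pairs with seq_a.

    A value of 1.0 corresponds to "complete" complementarity; any value
    between 0 and 1 corresponds to "partial" complementarity.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matched = sum(
        PAIRS.get(a) == b for a, b in zip(seq_a.upper(), seq_b.upper())
    )
    return matched / len(seq_a)
```

For example, `complement("AGT")` gives `"TCA"`, and `complementarity("AGT", "TCG")` reports that two of the three positions pair.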

[0107] By "corresponds to" or "corresponding to" is meant (a) a polynucleotide having a nucleotide sequence that is substantially identical or complementary to all or a portion of a reference polynucleotide sequence or encoding an amino acid sequence identical to an amino acid sequence in a peptide or protein; or (b) a peptide or polypeptide having an amino acid sequence that is substantially identical to a sequence of amino acids in a reference peptide or protein.

[0108] By "derivative" is meant a polypeptide that has been derived from the basic sequence by modification, for example by conjugation or complexing with other chemical moieties (e.g., pegylation) or by post-translational modification techniques as would be understood in the art. The term "derivative" also includes within its scope alterations that have been made to a parent sequence including additions or deletions that provide for functionally equivalent molecules.

[0109] By "enzyme reactive conditions" it is meant that any necessary conditions are available in an environment (i.e., such factors as temperature, pH, lack of inhibiting substances) which will permit the enzyme to function. Enzyme reactive conditions can be either in vitro, such as in a test tube, or in vivo, such as within a cell.

[0110] As used herein, the terms "function" and "functional" and the like refer to a biological or enzymatic function.

[0111] By "gene" is meant a unit of inheritance that occupies a specific locus on a chromosome and consists of transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (i.e., introns, 5' and 3' untranslated sequences).

[0112] "Homology" refers to the percentage number of amino acids that are identical or constitute conservative substitutions. Homology may be determined using sequence comparison programs such as GAP (Devereux et al., 1984, Nucleic Acids Research 12, 387-395) which is incorporated herein by reference. In this way sequences of a similar or substantially different length to those cited herein could be compared by insertion of gaps into the alignment, such gaps being determined, for example, by the comparison algorithm used by GAP.

[0113] The term "host cell" includes an individual cell or cell culture which can be or has been a recipient of any recombinant vector(s) or isolated polynucleotide of the invention. Host cells include progeny of a single host cell, and the progeny may not necessarily be completely identical (in morphology or in total DNA complement) to the original parent cell due to natural, accidental, or deliberate mutation and/or change. A host cell includes cells transfected, transformed, or infected in vivo or in vitro with a recombinant vector or a polynucleotide of the invention. A host cell which comprises a recombinant vector of the invention is a recombinant host cell, recombinant cell, or recombinant microorganism.

[0114] By "isolated" is meant material that is substantially or essentially free from components that normally accompany it in its native state. For example, an "isolated polynucleotide", as used herein, refers to a polynucleotide, which has been purified from the sequences which flank it in a naturally-occurring state, e.g., a DNA fragment which has been removed from the sequences that are normally adjacent to the fragment. Alternatively, an "isolated peptide" or an "isolated polypeptide" and the like, as used herein, refer to in vitro isolation and/or purification of a peptide or polypeptide molecule from its natural cellular environment, and from association with other components of the cell, i.e., it is not associated with in vivo substances.

[0115] By "increased" or "increasing" is meant the ability of one or more recombinant microorganisms to produce a greater amount of a given product or molecule (e.g., commodity chemical, biofuel, or intermediate product thereof) as compared to a control microorganism, such as an unmodified microorganism or a differently modified microorganism. An "increased" amount is typically a "statistically significant" amount, and may include an increase that is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30 or more times (including all integers and decimal points in between, e.g., 1.5, 1.6, 1.7, 1.8, etc.) the amount produced by an unmodified microorganism or a differently modified microorganism.

[0116] By "obtained from" is meant that a sample such as, for example, a polynucleotide extract or polypeptide extract is isolated from, or derived from, a particular source, such as a desired organism, typically a microorganism. "Obtained from" can also refer to the situation in which a polynucleotide or polypeptide sequence is isolated from, or derived from, a particular organism or microorganism. For example, a polynucleotide sequence encoding a benzaldehyde lyase enzyme may be isolated from a variety of prokaryotic or eukaryotic microorganisms, such as Pseudomonas.

[0117] The term "operably linked" as used herein means placing a gene under the regulatory control of a promoter, which then controls the transcription and optionally the translation of the gene. In the construction of heterologous promoter/structural gene combinations, it is generally preferred to position the genetic sequence or promoter at a distance from the gene transcription start site that is approximately the same as the distance between that genetic sequence or promoter and the gene it controls in its natural setting; i.e. the gene from which the genetic sequence or promoter is derived. As is known in the art, some variation in this distance can be accommodated without loss of function. Similarly, the preferred positioning of a regulatory sequence element with respect to a heterologous gene to be placed under its control is defined by the positioning of the element in its natural setting; i.e., the genes from which it is derived. "Constitutive promoters" are typically active, i.e., promote transcription, under most conditions. "Inducible promoters" are typically active only under certain conditions, such as in the presence of a given molecular factor (e.g., IPTG) or a given environmental condition (e.g., CO2 concentration, nutrient levels, light, heat). In the absence of that condition, inducible promoters typically do not allow significant or measurable levels of transcriptional activity.

[0118] The recitation "polynucleotide" or "nucleic acid" as used herein designates mRNA, RNA, cRNA, rRNA, cDNA or DNA. The term typically refers to a polymeric form of nucleotides of at least 10 bases in length, either ribonucleotides or deoxynucleotides or a modified form of either type of nucleotide. The term includes single and double stranded forms of DNA.

[0119] As will be understood by those skilled in the art, the polynucleotide sequences of this invention can include genomic sequences, extra-genomic and plasmid-encoded sequences and smaller engineered gene segments that express, or may be adapted to express, proteins, polypeptides, peptides and the like. Such segments may be naturally isolated, or modified synthetically by the hand of man.

[0120] Polynucleotides may be single-stranded (coding or antisense) or double-stranded, and may be DNA (genomic, cDNA or synthetic) or RNA molecules. Additional coding or non-coding sequences may, but need not, be present within a polynucleotide of the present invention, and a polynucleotide may, but need not, be linked to other molecules and/or support materials.

[0121] Polynucleotides may comprise a native sequence (i.e., an endogenous sequence) or may comprise a variant, or a biological functional equivalent of such a sequence. Polynucleotide variants may contain one or more substitutions, additions, deletions and/or insertions, as further described below, preferably such that the enzymatic activity of the encoded polypeptide is not substantially diminished relative to the unmodified polypeptide, and preferably such that the enzymatic activity of the encoded polypeptide is improved (e.g., optimized) relative to the unmodified polypeptide. The effect on the enzymatic activity of the encoded polypeptide may generally be assessed as described herein.

[0122] The polynucleotides of the present invention, regardless of the length of the coding sequence itself, may be combined with other DNA sequences, such as promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length may vary considerably. It is therefore contemplated that a polynucleotide fragment of almost any length may be employed, with the total length preferably being limited by the ease of preparation and use in the intended recombinant DNA protocol.

[0123] The terms "polynucleotide variant" and "variant" and the like refer to polynucleotides that display substantial sequence identity with any of the reference polynucleotide sequences or genes described herein, and to polynucleotides that hybridize with any polynucleotide reference sequence described herein, or any polynucleotide coding sequence of any gene or protein referred to herein, under low stringency, medium stringency, high stringency, or very high stringency conditions that are defined hereinafter and known in the art. These terms also encompass polynucleotides that are distinguished from a reference polynucleotide by the addition, deletion or substitution of at least one nucleotide. Accordingly, the terms "polynucleotide variant" and "variant" include polynucleotides in which one or more nucleotides have been added or deleted, or replaced with different nucleotides. In this regard, it is well understood in the art that certain alterations inclusive of mutations, additions, deletions and substitutions can be made to a reference polynucleotide whereby the altered polynucleotide retains the biological function or activity of the reference polynucleotide, or has increased activity in relation to the reference polynucleotide (i.e., optimized). Polynucleotide variants include, for example, polynucleotides having at least 50% (and at least 51% to at least 99% and all integer percentages in between) sequence identity with a reference polynucleotide described herein.

[0124] The terms "polynucleotide variant" and "variant" also include naturally-occurring allelic variants that encode these enzymes. Examples of naturally-occurring variants include allelic variants (same locus), homologs (different locus), and orthologs (different organism). Naturally occurring variants such as these can be identified and isolated using well-known molecular biology techniques including, for example, various polymerase chain reaction (PCR) and hybridization-based techniques as known in the art. Naturally occurring variants can be isolated from any organism that encodes one or more genes having a suitable enzymatic activity described herein (e.g., C--C ligase, diol dehydrogenase, pectate lyase, alginate lyase, diol dehydratase, transporter, etc.).

[0125] Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. In certain aspects, non-naturally occurring variants may have been optimized for use in a given microorganism (e.g., E. coli), such as by engineering and screening the enzymes for increased activity, stability, or any other desirable feature. The variations can produce both conservative and non-conservative amino acid substitutions (as compared to the originally encoded product). For nucleotide sequences, conservative variants include those sequences that, because of the degeneracy of the genetic code, encode the amino acid sequence of a reference polypeptide. Variant nucleotide sequences also include synthetically derived nucleotide sequences, such as those generated, for example, by using site-directed mutagenesis but which still encode a biologically active polypeptide. Generally, variants of a particular reference nucleotide sequence will have at least about 30%, 40%, 50%, 55%, 60%, 65%, 70%, generally at least about 75%, 80%, 85%, 90% to 95% or more, and even about 97% or 98% or more sequence identity to that particular nucleotide sequence as determined by sequence alignment programs described elsewhere herein using default parameters.

[0126] As used herein, the term "hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions" describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Ausubel et al., "Current Protocols in Molecular Biology", John Wiley & Sons Inc, 1994-1998, Sections 6.3.1-6.3.6. Aqueous and non-aqueous methods are described in that reference and either can be used.

[0127] Reference herein to "low stringency" conditions includes and encompasses from at least about 1% v/v to at least about 15% v/v formamide and from at least about 1 M to at least about 2 M salt for hybridization at 42 °C, and at least about 1 M to at least about 2 M salt for washing at 42 °C. Low stringency conditions also may include 1% Bovine Serum Albumin (BSA), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridization at 65 °C, and (i) 2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), 5% SDS for washing at room temperature. One embodiment of low stringency conditions includes hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45 °C, followed by two washes in 0.2×SSC, 0.1% SDS at least at 50 °C (the temperature of the washes can be increased to 55 °C for low stringency conditions).

[0128] "Medium stringency" conditions include and encompass from at least about 16% v/v to at least about 30% v/v formamide and from at least about 0.5 M to at least about 0.9 M salt for hybridization at 42 °C, and at least about 0.1 M to at least about 0.2 M salt for washing at 55 °C. Medium stringency conditions also may include 1% Bovine Serum Albumin (BSA), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridization at 65 °C, and (i) 2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), 5% SDS for washing at 60-65 °C. One embodiment of medium stringency conditions includes hybridizing in 6×SSC at about 45 °C, followed by one or more washes in 0.2×SSC, 0.1% SDS at 60 °C.

[0129] "High stringency" conditions include and encompass from at least about 31% v/v to at least about 50% v/v formamide and from about 0.01 M to about 0.15 M salt for hybridization at 42 °C, and about 0.01 M to about 0.02 M salt for washing at 55 °C. High stringency conditions also may include 1% BSA, 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridization at 65 °C, and (i) 0.2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM NaHPO4 (pH 7.2), 1% SDS for washing at a temperature in excess of 65 °C. One embodiment of high stringency conditions includes hybridizing in 6×SSC at about 45 °C, followed by one or more washes in 0.2×SSC, 0.1% SDS at 65 °C.
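
The formamide ranges recited above for hybridization at 42 °C can be collected into a small lookup, sketched here in Python. The names `FORMAMIDE_RANGES` and `classify_stringency` are hypothetical. Because the ranges are recited as "at least about" values, the hard boundaries used here are a simplifying assumption, and salt concentration and wash temperature are not taken into account.

```python
from typing import Optional

# (label, minimum % v/v formamide, maximum % v/v formamide) for
# hybridization at 42 °C, as recited for low, medium and high stringency.
# Treating the "about" boundaries as exact limits is an assumption.
FORMAMIDE_RANGES = [
    ("low", 1.0, 15.0),
    ("medium", 16.0, 30.0),
    ("high", 31.0, 50.0),
]


def classify_stringency(formamide_percent: float) -> Optional[str]:
    """Return the stringency label whose formamide range contains the value.

    Returns None when the value falls outside every recited range.
    """
    for label, minimum, maximum in FORMAMIDE_RANGES:
        if minimum <= formamide_percent <= maximum:
            return label
    return None
```

A hybridization buffer containing 50% v/v formamide would, under these assumptions, be classified as high stringency.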

[0130] One embodiment of "very high stringency" conditions includes hybridizing in 0.5 M sodium phosphate, 7% SDS at 65 °C, followed by one or more washes in 0.2×SSC, 1% SDS at 65 °C.

[0131] Other stringency conditions are well known in the art and a skilled addressee will recognize that various factors can be manipulated to optimize the specificity of the hybridization. Optimization of the stringency of the final washes can serve to ensure a high degree of hybridization. For detailed examples, see Ausubel et al., supra at pages 2.10.1 to 2.10.16 and Sambrook et al., Molecular Cloning, A Laboratory Manual (1989), at sections 1.101 to 1.104.

[0132] While stringent washes are typically carried out at temperatures from about 42 °C to 68 °C, one skilled in the art will appreciate that other temperatures may be suitable for stringent conditions. Maximum hybridization rate typically occurs at about 20 °C to 25 °C below the Tm for formation of a DNA-DNA hybrid. It is well known in the art that the Tm is the melting temperature, or temperature at which two complementary polynucleotide sequences dissociate. Methods for estimating Tm are well known in the art (see Ausubel et al., supra at page 2.10.8).

[0133] In general, the Tm of a perfectly matched duplex of DNA may be predicted as an approximation by the formula: Tm = 81.5 + 16.6 (log10 M) + 0.41 (% G+C) - 0.63 (% formamide) - (600/length), wherein: M is the concentration of Na+, preferably in the range of 0.01 molar to 0.4 molar; % G+C is the sum of guanosine and cytosine bases as a percentage of the total number of bases, within the range between 30% and 75% G+C; % formamide is the percent formamide concentration by volume; length is the number of base pairs in the DNA duplex. The Tm of a duplex DNA decreases by approximately 1 °C with every increase of 1% in the number of randomly mismatched base pairs. Washing is generally carried out at Tm - 15 °C for high stringency, or Tm - 30 °C for moderate stringency.
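
As a worked illustration of the approximation above, the following Python sketch evaluates the Tm formula and the corresponding wash temperatures. The function and parameter names are hypothetical, and the sketch does not enforce the preferred ranges for Na+ concentration or G+C content.

```python
import math


def estimate_tm(na_molar: float, gc_percent: float,
                formamide_percent: float, length_bp: int,
                mismatch_percent: float = 0.0) -> float:
    """Approximate Tm (°C) of a DNA duplex.

    Tm = 81.5 + 16.6*log10(M) + 0.41*(%G+C) - 0.63*(%formamide) - 600/length,
    lowered by about 1 °C for each 1% of randomly mismatched base pairs.
    """
    tm = (81.5
          + 16.6 * math.log10(na_molar)
          + 0.41 * gc_percent
          - 0.63 * formamide_percent
          - 600.0 / length_bp)
    return tm - mismatch_percent


def wash_temperature(tm: float, stringency: str) -> float:
    """Return the wash temperature: Tm - 15 °C (high) or Tm - 30 °C (moderate)."""
    offsets = {"high": 15.0, "moderate": 30.0}
    return tm - offsets[stringency]
```

For instance, a 100 base pair duplex with 50% G+C in 0.1 M Na+ and no formamide gives 81.5 - 16.6 + 20.5 - 0 - 6 = 79.4 °C, so a high stringency wash would be carried out near 64.4 °C and a moderate stringency wash near 49.4 °C.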

[0134] In one example of a hybridization procedure, a membrane (e.g., a nitrocellulose membrane or a nylon membrane) containing immobilized DNA is hybridized overnight at 42 °C in a hybridization buffer (50% deionized formamide, 5×SSC, 5× Denhardt's solution (0.1% Ficoll, 0.1% polyvinylpyrrolidone and 0.1% bovine serum albumin), 0.1% SDS and 200 mg/mL denatured salmon sperm DNA) containing a labeled probe. The membrane is then subjected to two sequential medium stringency washes (i.e., 2×SSC, 0.1% SDS for 15 min at 45 °C, followed by 2×SSC, 0.1% SDS for 15 min at 50 °C), followed by two sequential higher stringency washes (i.e., 0.2×SSC, 0.1% SDS for 12 min at 55 °C followed by 0.2×SSC and 0.1% SDS solution for 12 min at 65-68 °C).

[0135] Polynucleotides and fusions thereof may be prepared, manipulated and/or expressed using any of a variety of well established techniques known and available in the art. For example, polynucleotide sequences which encode polypeptides of the invention, or fusion proteins or functional equivalents thereof, may be used in recombinant DNA molecules to direct expression of a selected enzyme in appropriate host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences that encode substantially the same or a functionally equivalent amino acid sequence may be produced and these sequences may be used to clone and express a given polypeptide.

[0136] As will be understood by those of skill in the art, it may be advantageous in some instances to produce polypeptide-encoding nucleotide sequences possessing non-naturally occurring codons. For example, codons preferred by a particular prokaryotic or eukaryotic host can be selected to increase the rate of protein expression or to produce a recombinant RNA transcript having desirable properties, such as a half-life which is longer than that of a transcript generated from the naturally occurring sequence. Such nucleotides are typically referred to as "codon-optimized." Any of the nucleotide sequences described herein may be utilized in such a "codon-optimized" form. For example, the nucleotide coding sequence of the benzaldehyde lyase from Pseudomonas fluorescens may be codon-optimized for expression in E. coli.
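
The idea of codon optimization can be sketched in Python as follows: each codon is translated and then replaced with a host-preferred synonymous codon, so the encoded amino acid sequence is unchanged. The `PREFERRED` table below is an illustrative assumption covering only a few amino acids, not a measured codon-usage table for any host, and the function names are hypothetical.

```python
# Partial standard genetic code (only the codons used in this sketch).
CODON_TO_AA = {
    "ATG": "M",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "CTA": "L", "CTT": "L", "CTC": "L", "TTA": "L", "TTG": "L",
    "GCG": "A", "GCA": "A", "GCT": "A", "GCC": "A",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Assumed host-preferred codon per amino acid (illustrative values only).
PREFERRED = {"M": "ATG", "K": "AAA", "L": "CTG", "A": "GCG", "*": "TAA"}


def translate(coding_seq: str) -> str:
    """Translate a coding sequence (length divisible by 3) into amino acids."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    seq = coding_seq.upper()
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def codon_optimize(coding_seq: str) -> str:
    """Replace every codon with the assumed preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(coding_seq))
```

With these assumed preferences, `codon_optimize("ATGAAGCTAGCTTAG")` yields `"ATGAAACTGGCGTAA"`, which encodes the same polypeptide as the input.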

[0137] Moreover, the polynucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter polypeptide encoding sequences for a variety of reasons, including but not limited to, alterations which modify the cloning, processing, expression and/or activity of the gene product.

[0138] In order to express a desired polypeptide, a nucleotide sequence encoding the polypeptide, or a functional equivalent, may be inserted into an appropriate expression vector, i.e., a vector that contains the necessary elements for the transcription and translation of the inserted coding sequence. Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding a polypeptide of interest and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described in Sambrook et al., Molecular Cloning, A Laboratory Manual (1989), and Ausubel et al., Current Protocols in Molecular Biology (1989).

[0139] "Polypeptide," "polypeptide fragment," "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues and to variants and synthetic analogues of the same. Thus, these terms apply to amino acid polymers in which one or more amino acid residues are synthetic non-naturally occurring amino acids, such as a chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally-occurring amino acid polymers. In certain aspects, polypeptides may include enzymatic polypeptides, or "enzymes," which typically catalyze (i.e., increase the rate of) various chemical reactions.

[0140] The recitation polypeptide "variant" refers to polypeptides that are distinguished from a reference polypeptide sequence by the addition, deletion or substitution of at least one amino acid residue. In certain embodiments, a polypeptide variant is distinguished from a reference polypeptide by one or more substitutions, which may be conservative or non-conservative. In certain embodiments, the polypeptide variant comprises conservative substitutions and, in this regard, it is well understood in the art that some amino acids may be changed to others with broadly similar properties without changing the nature of the activity of the polypeptide. Polypeptide variants also encompass polypeptides in which one or more amino acids have been added or deleted, or replaced with different amino acid residues.

[0141] The present invention contemplates the use in the methods described herein of variants of full-length polypeptides having any of the enzymatic activities described herein, truncated fragments of these full-length polypeptides, variants of truncated fragments, as well as their related biologically active fragments. Typically, biologically active fragments of a polypeptide may participate in an interaction, for example, an intra-molecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). Biologically active fragments of a polypeptide/enzyme having an enzymatic activity described herein include peptides comprising amino acid sequences sufficiently similar to, or derived from, the amino acid sequences of a (putative) full-length reference polypeptide sequence. Typically, biologically active fragments comprise a domain or motif with at least one enzymatic activity, and may include one or more (and in some cases all) of the various active domains. A biologically active fragment of an enzyme can be a polypeptide fragment which is, for example, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400, 450, 500, 600 or more contiguous amino acids, including all integers in between, of a reference polypeptide sequence. In certain embodiments, a biologically active fragment comprises a conserved enzymatic sequence, domain, or motif, as described elsewhere herein and known in the art. Suitably, the biologically-active fragment has no less than about 1%, 10%, 25%, or 50% of an activity of the wild-type polypeptide from which it is derived.

[0142] The term "exogenous" refers generally to a polynucleotide sequence or polypeptide that does not naturally occur in a wild-type cell or organism, but is typically introduced into the cell by molecular biological techniques, i.e., engineering to produce a recombinant microorganism. Examples of "exogenous" polynucleotides include vectors, plasmids, and/or man-made nucleic acid constructs encoding a desired protein or enzyme. The term "endogenous" refers generally to naturally occurring polynucleotide sequences or polypeptides that may be found in a given wild-type cell or organism. For example, certain naturally-occurring bacterial or yeast species do not typically contain a benzaldehyde lyase gene, and, therefore, do not comprise an "endogenous" polynucleotide sequence that encodes a benzaldehyde lyase. In this regard, it is also noted that even though an organism may comprise an endogenous copy of a given polynucleotide sequence or gene, the introduction of a plasmid or vector encoding that sequence, such as to over-express or otherwise regulate the expression of the encoded protein, represents an "exogenous" copy of that gene or polynucleotide sequence. Any of the pathways, genes, or enzymes described herein may utilize or rely on an "endogenous" sequence, or may be provided as one or more "exogenous" polynucleotide sequences, and/or may be utilized according to the endogenous sequences already contained within a given microorganism.

[0143] A "recombinant" microorganism typically comprises one or more exogenous nucleotide sequences, such as in a plasmid or vector.

[0144] The recitations "sequence identity" or, for example, comprising a "sequence 50% identical to," as used herein, refer to the extent that sequences are identical on a nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis over a window of comparison. Thus, a "percentage of sequence identity" may be calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, I) or the identical amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys and Met) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity.
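
By way of illustration only, the percentage calculation described above can be sketched in Python as follows. The function name `percent_identity`, the use of "-" as a gap character, and the choice not to count gap positions as matches are assumptions made for this sketch; the two inputs are assumed to have been optimally aligned beforehand by a separate tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences over one comparison window.

    Both strings must already be aligned and of equal length; the window
    size is taken to be that common length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Count positions where both sequences carry the identical residue.
    # A gap character ("-") is never counted as a match (assumption).
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    # Matched positions divided by window size, multiplied by 100.
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 7 of 8 positions match
```

The sketch works identically for nucleotide and amino acid sequences, provided each residue is represented by a single character.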

[0145] Terms used to describe sequence relationships between two or more polynucleotides or polypeptides include "reference sequence", "comparison window", "sequence identity", "percentage of sequence identity" and "substantial identity". A "reference sequence" is at least 12 but frequently 15 to 18 and often at least 25 monomer units, inclusive of nucleotides and amino acid residues, in length. Because two polynucleotides may each comprise (1) a sequence (i.e., only a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a "comparison window" to identify and compare local regions of sequence similarity. A "comparison window" refers to a conceptual segment of at least 6 contiguous positions, usually about 50 to about 100, more usually about 100 to about 150 in which a sequence is compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. The comparison window may comprise additions or deletions (i.e., gaps) of about 20% or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by computerized implementations of algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Drive Madison, Wis., USA) or by inspection and the best alignment (i.e., resulting in the highest percentage homology over the comparison window) generated by any of the various methods selected. Reference also may be made to the BLAST family of programs as for example disclosed by Altschul et al., 1997, Nucl. Acids Res. 25:3389. 
A detailed discussion of sequence analysis can be found in Unit 19.3 of Ausubel et al., "Current Protocols in Molecular Biology", John Wiley & Sons Inc, 1994-1998, Chapter 15.
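
As a non-limiting sketch of the "comparison window" concept, the following Python function slides a fixed-size window along two sequences that are assumed to be already aligned and reports the window with the highest percentage of identical positions. The names `MIN_WINDOW` and `best_window` are hypothetical, the treatment of "-" as a gap that never matches is an assumption, and the sketch does not perform the optimal alignment itself (which would be done with a program such as those cited above).

```python
MIN_WINDOW = 6  # smallest comparison window mentioned in the text


def best_window(aligned_seq: str, aligned_ref: str, window: int):
    """Return (start, percent_identity) of the best-matching window."""
    if len(aligned_seq) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    if window < MIN_WINDOW:
        raise ValueError("window must span at least 6 contiguous positions")
    if window > len(aligned_ref):
        raise ValueError("window is longer than the aligned sequences")
    best_start, best_pct = 0, -1.0
    for start in range(len(aligned_ref) - window + 1):
        seg_a = aligned_seq[start:start + window]
        seg_b = aligned_ref[start:start + window]
        # Gap characters never count as matched positions (assumption).
        matches = sum(1 for a, b in zip(seg_a, seg_b) if a == b and a != "-")
        pct = 100.0 * matches / window
        if pct > best_pct:  # keep the first window with the highest score
            best_start, best_pct = start, pct
    return best_start, best_pct


print(best_window("AAAAAAGGGGGG", "TTTTTTGGGGGG", window=6))
```

In this toy input, only the second half of the two sequences is similar, so the highest-scoring window is the one covering that local region.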

[0146] "Transformation" refers generally to the permanent, heritable alteration in a cell resulting from the uptake and incorporation of foreign DNA into the host-cell genome; also, the transfer of an exogenous gene from one organism into the genome of another organism.

[0147] By "vector" is meant a polynucleotide molecule, preferably a DNA molecule derived, for example, from a plasmid, bacteriophage, yeast or virus, into which a polynucleotide can be inserted or cloned. A vector preferably contains one or more unique restriction sites and can be capable of autonomous replication in a defined host cell including a target cell or tissue or a progenitor cell or tissue thereof, or be integrable with the genome of the defined host such that the cloned sequence is reproducible. Accordingly, the vector can be an autonomously replicating vector, i.e., a vector that exists as an extra-chromosomal entity, the replication of which is independent of chromosomal replication, e.g., a linear or closed circular plasmid, an extra-chromosomal element, a mini-chromosome, or an artificial chromosome. The vector can contain any means for assuring self-replication. Alternatively, the vector can be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Such a vector may comprise specific sequences that allow recombination into a particular, desired site of the host chromosome. A vector system can comprise a single vector or plasmid, two or more vectors or plasmids, which together contain the total DNA to be introduced into the genome of the host cell, or a transposon. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. In the present case, the vector is preferably one which is operably functional in a bacterial cell, such as a cyanobacterial cell. The vector can include a reporter gene, such as a green fluorescent protein (GFP), which can be either fused in frame to one or more of the encoded polypeptides, or expressed separately. 
The vector can also include a selection marker such as an antibiotic resistance gene that can be used for selection of suitable transformants.

[0148] The terms "wild-type" and "naturally occurring" are used interchangeably to refer to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild type gene or gene product (e.g., a polypeptide) is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene.

[0149] Examples of "biomass" include aquatic or marine biomass, fruit-based biomass such as fruit waste, and vegetable-based biomass such as vegetable waste, among others. Examples of aquatic or marine biomass include, but are not limited to, kelp, giant kelp, seaweed, algae, marine microflora, microalgae, sea grass, and the like. In certain aspects, biomass does not include fossilized sources of carbon, such as hydrocarbons that are typically found within the top layer of the Earth's crust (e.g., natural gas, nonvolatile materials composed of almost pure carbon, like anthracite coal, etc.).

[0150] Examples of fruit and/or vegetable biomass include, but are not limited to, any source of pectin such as plant peel and pomace including citrus, orange, grapefruit, potato, tomato, grape, mango, gooseberry, carrot, sugar-beet, and apple, among others.

[0151] Examples of polysaccharides, oligosaccharides, monosaccharides or other sugar components of biomass include, but are not limited to, alginate, agar, carrageenan, fucoidan, pectin, guluronate, mannuronate, mannitol, lyxose, cellulose, hemicellulose, glycerol, xylitol, glucose, mannose, galactose, xylose, xylan, mannan, arabinan, arabinose, glucuronate, galacturonate (including di- and tri-galacturonates), rhamnose, and the like.

[0152] Certain examples of alginate-derived polysaccharides include saturated polysaccharides, such as β-D-mannuronate, α-L-guluronate, dialginate, trialginate, pentalginate, hexalginate, heptalginate, octalginate, nonalginate, decalginate, undecalginate, dodecalginate and polyalginate, as well as unsaturated polysaccharides such as 4-deoxy-L-erythro-5-hexoseulose uronic acid, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-D-mannuronate or L-guluronate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-dialginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-trialginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-tetralginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-pentalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-hexalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-heptalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-octalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-nonalginate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-undecalginate, and 4-(4-deoxy-beta-D-mann-4-enuronosyl)-dodecalginate.

[0153] Certain examples of pectin-derived polysaccharides include saturated polysaccharides, such as galacturonate, digalacturonate, trigalacturonate, tetragalacturonate, pentagalacturonate, hexagalacturonate, heptagalacturonate, octagalacturonate, nonagalacturonate, decagalacturonate, dodecagalacturonate, polygalacturonate, and rhamnopolygalacturonate, as well as unsaturated polysaccharides such as 4-deoxy-L-threo-5-hexosulose uronate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-digalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-trigalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-tetragalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-pentagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-hexagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-heptagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-octagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-nonagalacturonate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-decagalacturonate, and 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-dodecagalacturonate.

[0154] These polysaccharide or oligosaccharide components may be converted into "suitable monosaccharides" or other "suitable saccharides," such as "suitable oligosaccharides," by the microorganisms described herein which are capable of growing on such polysaccharides or other sugar components as a source of carbon (e.g., a sole source of carbon).

[0155] A "suitable monosaccharide" or "suitable saccharide" refers generally to any saccharide that may be produced by a recombinant microorganism growing on pectin, alginate, or other saccharide (e.g., galacturonate, cellulose, hemi-cellulose, etc.) as a source or sole source of carbon, and also refers generally to any saccharide that may be utilized in a biofuel biosynthesis pathway of the present invention to produce hydrocarbons such as biofuels or biopetrols. Examples of suitable monosaccharides or oligosaccharides include, but are not limited to, 2-keto-3-deoxy D-gluconate (KDG), D-mannitol, guluronate, mannuronate, mannitol, lyxose, glycerol, xylitol, glucose, mannose, galactose, xylose, arabinose, glucuronate, galacturonates, rhamnose, and the like. As noted herein, a "suitable monosaccharide" or "suitable saccharide" as used herein may be produced by an engineered or recombinant microorganism of the present invention, or may be obtained from commercially available sources.

[0156] The recitation "commodity chemical" as used herein includes any saleable or marketable chemical that can be produced either directly or as a by-product of the methods provided herein, including biofuels and/or biopetrols. General examples of "commodity chemicals" include, but are not limited to, biofuels, minerals, polymer precursors, fatty alcohols, surfactants, plasticizers, and solvents. The recitation "biofuels" as used herein includes solid, liquid, or gas fuels derived, at least in part, from a biological source, such as a recombinant microorganism.

Examples of commodity chemicals include, but are not limited to, methane, methanol, ethane, ethene, ethanol, n-propane, 1-propene, 1-propanol, propanal, acetone, propionate, n-butane, 1-butene, 1-butanol, butanal, butanoate, isobutanal, isobutanol, 2-methylbutanal, 2-methylbutanol, 3-methylbutanal, 3-methylbutanol, 2-butene, 2-butanol, 2-butanone, 2,3-butanediol, 3-hydroxy-2-butanone, 2,3-butanedione, ethylbenzene, ethenylbenzene, 2-phenylethanol, phenylacetaldehyde, 1-phenylbutane, 4-phenyl-1-butene, 4-phenyl-2-butene, 1-phenyl-2-butene, 1-phenyl-2-butanol, 4-phenyl-2-butanol, 1-phenyl-2-butanone, 4-phenyl-2-butanone, 1-phenyl-2,3-butandiol, 1-phenyl-3-hydroxy-2-butanone, 4-phenyl-3-hydroxy-2-butanone, 1-phenyl-2,3-butanedione, n-pentane, ethylphenol, ethenylphenol, 2-(4-hydroxyphenyl)ethanol, 4-hydroxyphenylacetaldehyde, 1-(4-hydroxyphenyl)butane, 4-(4-hydroxyphenyl)-1-butene, 4-(4-hydroxyphenyl)-2-butene, 1-(4-hydroxyphenyl)-1-butene, 1-(4-hydroxyphenyl)-2-butanol, 4-(4-hydroxyphenyl)-2-butanol, 1-(4-hydroxyphenyl)-2-butanone, 4-(4-hydroxyphenyl)-2-butanone, 1-(4-hydroxyphenyl)-2,3-butandiol, 1-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 4-(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-2,3-butanonedione, indolylethane, indolylethene, 2-(indole-3-)ethanol, n-pentane, 1-pentene, 1-pentanol, pentanal, pentanoate, 2-pentene, 2-pentanol, 3-pentanol, 2-pentanone, 3-pentanone, 4-methylpentanal, 4-methylpentanol, 2,3-pentanediol, 2-hydroxy-3-pentanone, 3-hydroxy-2-pentanone, 2,3-pentanedione, 2-methylpentane, 4-methyl-1-pentene, 4-methyl-2-pentene, 4-methyl-3-pentene, 4-methyl-2-pentanol, 2-methyl-3-pentanol, 4-methyl-2-pentanone, 2-methyl-3-pentanone, 4-methyl-2,3-pentanediol, 4-methyl-2-hydroxy-3-pentanone, 4-methyl-3-hydroxy-2-pentanone, 4-methyl-2,3-pentanedione, 1-phenylpentane, 1-phenyl-1-pentene, 1-phenyl-2-pentene, 1-phenyl-3-pentene, 1-phenyl-2-pentanol, 1-phenyl-3-pentanol, 1-phenyl-2-pentanone, 1-phenyl-3-pentanone, 1-phenyl-2,3-pentanediol, 
1-phenyl-2-hydroxy-3-pentanone, 1-phenyl-3-hydroxy-2-pentanone, 1-phenyl-2,3-pentanedione, 4-methyl-1-phenylpentane, 4-methyl-1-phenyl-1-pentene, 4-methyl-1-phenyl-2-pentene, 4-methyl-1-phenyl-3-pentene, 4-methyl-1-phenyl-3-pentanol, 4-methyl-1-phenyl-2-pentanol, 4-methyl-1-phenyl-3-pentanone, 4-methyl-1-phenyl-2-pentanone, 4-methyl-1-phenyl-2,3-pentanediol, 4-methyl-1-phenyl-2,3-pentanedione, 4-methyl-1-phenyl-3-hydroxy-2-pentanone, 4-methyl-1-phenyl-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl)pentane, 1-(4-hydroxyphenyl)-1-pentene, 1-(4-hydroxyphenyl)-2-pentene, 1-(4-hydroxyphenyl)-3-pentene, 1-(4-hydroxyphenyl)-2-pentanol, 1-(4-hydroxyphenyl)-3-pentanol, 1-(4-hydroxyphenyl)-2-pentanone, 1-(4-hydroxyphenyl)-3-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanediol, 1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl)pentane, 4-methyl-1-(4-hydroxyphenyl)-2-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentene, 4-methyl-1-(4-hydroxyphenyl)-1-pentene, 4-methyl-1-(4-hydroxyphenyl)-3-pentanol, 4-methyl-1-(4-hydroxyphenyl)-2-pentanol, 4-methyl-1-(4-hydroxyphenyl)-3-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-pentanedione, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone, 1-indole-3-pentane, 1-(indole-3)-1-pentene, 1-(indole-3)-2-pentene, 1-(indole-3)-3-pentene, 1-(indole-3)-2-pentanol, 1-(indole-3)-3-pentanol, 1-(indole-3)-2-pentanone, 1-(indole-3)-3-pentanone, 1-(indole-3)-2,3-pentanediol, 1-(indole-3)-2-hydroxy-3-pentanone, 1-(indole-3)-3-hydroxy-2-pentanone, 1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3-)pentane, 4-methyl-1-(indole-3)-2-pentene, 4-methyl-1-(indole-3)-3-pentene, 4-methyl-1-(indole-3)-1-pentene, 4-methyl-2-(indole-3)-3-pentanol, 4-methyl-1-(indole-3)-2-pentanol, 4-methyl-1-(indole-3)-3-pentanone, 
4-methyl-1-(indole-3)-2-pentanone, 4-methyl-1-(indole-3)-2,3-pentanediol, 4-methyl-1-(indole-3)-2,3-pentanedione, 4-methyl-1-(indole-3)-3-hydroxy-2-pentanone, 4-methyl-1-(indole-3)-2-hydroxy-3-pentanone, n-hexane, 1-hexene, 1-hexanol, hexanal, hexanoate, 2-hexene, 3-hexene, 2-hexanol, 3-hexanol, 2-hexanone, 3-hexanone, 2,3-hexanediol, 2,3-hexanedione, 3,4-hexanediol, 3,4-hexanedione, 2-hydroxy-3-hexanone, 3-hydroxy-2-hexanone, 3-hydroxy-4-hexanone, 4-hydroxy-3-hexanone, 2-methylhexane, 3-methylhexane, 2-methyl-2-hexene, 2-methyl-3-hexene, 5-methyl-1-hexene, 5-methyl-2-hexene, 4-methyl-1-hexene, 4-methyl-2-hexene, 3-methyl-3-hexene, 3-methyl-2-hexene, 3-methyl-1-hexene, 2-methyl-3-hexanol, 5-methyl-2-hexanol, 5-methyl-3-hexanol, 2-methyl-3-hexanone, 5-methyl-2-hexanone, 5-methyl-3-hexanone, 2-methyl-3,4-hexanediol, 2-methyl-3,4-hexanedione, 5-methyl-2,3-hexanediol, 5-methyl-2,3-hexanedione, 4-methyl-2,3-hexanediol, 4-methyl-2,3-hexanedione, 2-methyl-3-hydroxy-4-hexanone, 2-methyl-4-hydroxy-3-hexanone, 5-methyl-2-hydroxy-3-hexanone, 5-methyl-3-hydroxy-2-hexanone, 4-methyl-2-hydroxy-3-hexanone, 4-methyl-3-hydroxy-2-hexanone, 2,5-dimethylhexane, 2,5-dimethyl-2-hexene, 2,5-dimethyl-3-hexene, 2,5-dimethyl-3-hexanol, 2,5-dimethyl-3-hexanone, 2,5-dimethyl-3,4-hexanediol, 2,5-dimethyl-3,4-hexanedione, 2,5-dimethyl-3-hydroxy-4-hexanone, 5-methyl-1-phenylhexane, 4-methyl-1-phenylhexane, 5-methyl-1-phenyl-1-hexene, 5-methyl-1-phenyl-2-hexene, 5-methyl-1-phenyl-3-hexene, 4-methyl-1-phenyl-1-hexene, 4-methyl-1-phenyl-2-hexene, 4-methyl-1-phenyl-3-hexene, 5-methyl-1-phenyl-2-hexanol, 5-methyl-1-phenyl-3-hexanol, 4-methyl-1-phenyl-2-hexanol, 4-methyl-1-phenyl-3-hexanol, 5-methyl-1-phenyl-2-hexanone, 5-methyl-1-phenyl-3-hexanone, 4-methyl-1-phenyl-2-hexanone, 4-methyl-1-phenyl-3-hexanone, 5-methyl-1-phenyl-2,3-hexanediol, 4-methyl-1-phenyl-2,3-hexanediol, 5-methyl-1-phenyl-3-hydroxy-2-hexanone, 5-methyl-1-phenyl-2-hydroxy-3-hexanone, 4-methyl-1-phenyl-3-hydroxy-2-hexanone, 
4-methyl-1-phenyl-2-hydroxy-3-hexanone, 5-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-phenyl-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)hexane, 5-methyl-1-(4-hydroxyphenyl)-1-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexene, 5-methyl-1-(4-hydroxyphenyl)-3-hexene, 4-methyl-1-(4-hydroxyphenyl)-1-hexene, 4-methyl-1-(4-hydroxyphenyl)-2-hexene, 4-methyl-1-(4-hydroxyphenyl)-3-hexene, 5-methyl-1-(4-hydroxyphenyl)-2-hexanol, 5-methyl-1-(4-hydroxyphenyl)-3-hexanol, 4-methyl-1-(4-hydroxyphenyl)-2-hexanol, 4-methyl-1-(4-hydroxyphenyl)-3-hexanol, 5-methyl-1-(4-hydroxyphenyl)-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol, 5-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone, 4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone, 5-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione, 4-methyl-1-(indole-3-)hexane, 5-methyl-1-(indole-3)-1-hexene, 5-methyl-1-(indole-3)-2-hexene, 5-methyl-1-(indole-3)-3-hexene, 4-methyl-1-(indole-3)-1-hexene, 4-methyl-1-(indole-3)-2-hexene, 4-methyl-1-(indole-3)-3-hexene, 5-methyl-1-(indole-3)-2-hexanol, 5-methyl-1-(indole-3)-3-hexanol, 4-methyl-1-(indole-3)-2-hexanol, 4-methyl-1-(indole-3)-3-hexanol, 5-methyl-1-(indole-3)-2-hexanone, 5-methyl-1-(indole-3)-3-hexanone, 4-methyl-1-(indole-3)-2-hexanone, 4-methyl-1-(indole-3)-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanediol, 4-methyl-1-(indole-3)-2,3-hexanediol, 5-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 5-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 4-methyl-1-(indole-3)-3-hydroxy-2-hexanone, 4-methyl-1-(indole-3)-2-hydroxy-3-hexanone, 5-methyl-1-(indole-3)-2,3-hexanedione, 4-methyl-1-(indole-3)-2,3-hexanedione, n-heptane, 1-heptene, 1-heptanol, heptanal, heptanoate, 2-heptene, 3-heptene, 2-heptanol, 
3-heptanol, 4-heptanol, 2-heptanone, 3-heptanone, 4-heptanone, 2,3-heptanediol, 2,3-heptanedione, 3,4-heptanediol, 3,4-heptanedione, 2-hydroxy-3-heptanone, 3-hydroxy-2-heptanone, 3-hydroxy-4-heptanone, 4-hydroxy-3-heptanone, 2-methylheptane, 3-methylheptane, 6-methyl-2-heptene, 6-methyl-3-heptene, 2-methyl-3-heptene, 2-methyl-2-heptene, 5-methyl-2-heptene, 5-methyl-3-heptene, 3-methyl-3-heptene, 2-methyl-3-heptanol, 2-methyl-4-heptanol, 6-methyl-3-heptanol, 5-methyl-3-heptanol, 3-methyl-4-heptanol, 2-methyl-3-heptanone, 2-methyl-4-heptanone, 6-methyl-3-heptanone, 5-methyl-3-heptanone, 3-methyl-4-heptanone, 2-methyl-3,4-heptanediol, 2-methyl-3,4-heptanedione, 6-methyl-3,4-heptanediol, 6-methyl-3,4-heptanedione, 5-methyl-3,4-heptanediol, 5-methyl-3,4-heptanedione, 2-methyl-3-hydroxy-4-heptanone, 2-methyl-4-hydroxy-3-heptanone, 6-methyl-3-hydroxy-4-heptanone, 6-methyl-4-hydroxy-3-heptanone, 5-methyl-3-hydroxy-4-heptanone, 5-methyl-4-hydroxy-3-heptanone, 2,6-dimethylheptane, 2,5-dimethylheptane, 2,6-dimethyl-2-heptene, 2,6-dimethyl-3-heptene, 2,5-dimethyl-2-heptene, 2,5-dimethyl-3-heptene, 3,6-dimethyl-3-heptene, 2,6-dimethyl-3-heptanol, 2,6-dimethyl-4-heptanol, 2,5-dimethyl-3-heptanol, 2,5-dimethyl-4-heptanol, 2,6-dimethyl-3,4-heptanediol, 2,6-dimethyl-3,4-heptanedione, 2,5-dimethyl-3,4-heptanediol, 2,5-dimethyl-3,4-heptanedione, 2,6-dimethyl-3-hydroxy-4-heptanone, 2,6-dimethyl-4-hydroxy-3-heptanone, 2,5-dimethyl-3-hydroxy-4-heptanone, 2,5-dimethyl-4-hydroxy-3-heptanone, n-octane, 1-octene, 2-octene, 1-octanol, octanal, octanoate, 3-octene, 4-octene, 4-octanol, 4-octanone, 4,5-octanediol, 4,5-octanedione, 4-hydroxy-5-octanone, 2-methyloctane, 2-methyl-3-octene, 2-methyl-4-octene, 7-methyl-3-octene, 3-methyl-3-octene, 3-methyl-4-octene, 6-methyl-3-octene, 2-methyl-4-octanol, 7-methyl-4-octanol, 3-methyl-4-octanol, 6-methyl-4-octanol, 2-methyl-4-octanone, 7-methyl-4-octanone, 3-methyl-4-octanone, 6-methyl-4-octanone, 2-methyl-4,5-octanediol, 2-methyl-4,5-octanedione, 
3-methyl-4,5-octanediol, 3-methyl-4,5-octanedione, 2-methyl-4-hydroxy-5-octanone, 2-methyl-5-hydroxy-4-octanone, 3-methyl-4-hydroxy-5-octanone, 3-methyl-5-hydroxy-4-octanone, 2,7-dimethyloctane, 2,7-dimethyl-3-octene, 2,7-dimethyl-4-octene, 2,7-dimethyl-4-octanol, 2,7-dimethyl-4-octanone, 2,7-dimethyl-4,5-octanediol, 2,7-dimethyl-4,5-octanedione, 2,7-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyloctane, 2,6-dimethyl-3-octene, 2,6-dimethyl-4-octene, 3,7-dimethyl-3-octene, 2,6-dimethyl-4-octanol, 3,7-dimethyl-4-octanol, 2,6-dimethyl-4-octanone, 3,7-dimethyl-4-octanone, 2,6-dimethyl-4,5-octanediol, 2,6-dimethyl-4,5-octanedione, 2,6-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyl-5-hydroxy-4-octanone, 3,6-dimethyloctane, 3,6-dimethyl-3-octene, 3,6-dimethyl-4-octene, 3,6-dimethyl-4-octanol, 3,6-dimethyl-4-octanone, 3,6-dimethyl-4,5-octanediol, 3,6-dimethyl-4,5-octanedione, 3,6-dimethyl-4-hydroxy-5-octanone, n-nonane, 1-nonene, 1-nonanol, nonanal, nonanoate, 2-methylnonane, 2-methyl-4-nonene, 2-methyl-5-nonene, 8-methyl-4-nonene, 2-methyl-5-nonanol, 8-methyl-4-nonanol, 2-methyl-5-nonanone, 8-methyl-4-nonanone, 8-methyl-4,5-nonanediol, 8-methyl-4,5-nonanedione, 8-methyl-4-hydroxy-5-nonanone, 8-methyl-5-hydroxy-4-nonanone, 2,8-dimethylnonane, 2,8-dimethyl-3-nonene, 2,8-dimethyl-4-nonene, 2,8-dimethyl-5-nonene, 2,8-dimethyl-4-nonanol, 2,8-dimethyl-5-nonanol, 2,8-dimethyl-4-nonanone, 2,8-dimethyl-5-nonanone, 2,8-dimethyl-4,5-nonanediol, 2,8-dimethyl-4,5-nonanedione, 2,8-dimethyl-4-hydroxy-5-nonanone, 2,8-dimethyl-5-hydroxy-4-nonanone, 2,7-dimethylnonane, 3,8-dimethyl-3-nonene, 3,8-dimethyl-4-nonene, 3,8-dimethyl-5-nonene, 3,8-dimethyl-4-nonanol, 3,8-dimethyl-5-nonanol, 3,8-dimethyl-4-nonanone, 3,8-dimethyl-5-nonanone, 3,8-dimethyl-4,5-nonanediol, 3,8-dimethyl-4,5-nonanedione, 3,8-dimethyl-4-hydroxy-5-nonanone, 3,8-dimethyl-5-hydroxy-4-nonanone, n-decane, 1-decene, 1-decanol, decanoate, 2,9-dimethyldecane, 2,9-dimethyl-3-decene, 2,9-dimethyl-4-decene, 2,9-dimethyl-5-decanol, 
2,9-dimethyl-5-decanone, 2,9-dimethyl-5,6-decanediol, 2,9-dimethyl-6-hydroxy-5-decanone, 2,9-dimethyl-5,6-decanedione, n-undecane, 1-undecene, 1-undecanol, undecanal, undecanoate, n-dodecane, 1-dodecene, 1-dodecanol, dodecanal, dodecanoate, n-tridecane, 1-tridecene, 1-tridecanol, tridecanal, tridecanoate, n-tetradecane, 1-tetradecene, 1-tetradecanol, tetradecanal, tetradecanoate, n-pentadecane, 1-pentadecene, 1-pentadecanol, pentadecanal, pentadecanoate, n-hexadecane, 1-hexadecene, 1-hexadecanol, hexadecanal, hexadecanoate, n-heptadecane, 1-heptadecene, 1-heptadecanol, heptadecanal, heptadecanoate, n-octadecane, 1-octadecene, 1-octadecanol, octadecanal, octadecanoate, n-nonadecane, 1-nonadecene, 1-nonadecanol, nonadecanal, nonadecanoate, eicosane, 1-eicosene, 1-eicosanol, eicosanal, eicosanoate, 3-hydroxy propanal, 1,3-propanediol, 4-hydroxybutanal, 1,4-butanediol, 3-hydroxy-2-butanone, 2,3-butanediol, 1,5-pentanediol, homocitrate, homoisocitrate, b-hydroxy adipate, glutarate, glutarsemialdehyde, glutaraldehyde, 2-hydroxy-1-cyclopentanone, 1,2-cyclopentanediol, cyclopentanone, cyclopentanol, (S)-2-acetolactate, (R)-2,3-Dihydroxy-isovalerate, 2-oxoisovalerate, isobutyryl-CoA, isobutyrate, isobutyraldehyde, 5-amino pentaldehyde, 1,10-diaminodecane, 1,10-diamino-5-decene, 1,10-diamino-5-hydroxydecane, 1,10-diamino-5-decanone, 1,10-diamino-5,6-decanediol, 1,10-diamino-6-hydroxy-5-decanone, phenylacetoaldehyde, 1,4-diphenylbutane, 1,4-diphenyl-1-butene, 1,4-diphenyl-2-butene, 1,4-diphenyl-2-butanol, 1,4-diphenyl-2-butanone, 1,4-diphenyl-2,3-butanediol, 1,4-diphenyl-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-phenylbutane, 1-(4-hydroxyphenyl)-4-phenyl-1-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butene, 1-(4-hydroxyphenyl)-4-phenyl-2-butanol, 1-(4-hydroxyphenyl)-4-phenyl-2-butanone, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol, 1-(4-hydroxyphenyl)-4-phenyl-3-hydroxy-2-butanone, 1-(indole-3)-4-phenylbutane,
1-(indole-3)-4-phenyl-1-butene, 1-(indole-3)-4-phenyl-2-butene, 1-(indole-3)-4-phenyl-2-butanol, 1-(indole-3)-4-phenyl-2-butanone, 1-(indole-3)-4-phenyl-2,3-butanediol, 1-(indole-3)-4-phenyl-3-hydroxy-2-butanone, 4-hydroxyphenylacetoaldehyde, 1,4-di(4-hydroxyphenyl)butane, 1,4-di(4-hydroxyphenyl)-1-butene, 1,4-di(4-hydroxyphenyl)-2-butene, 1,4-di(4-hydroxyphenyl)-2-butanol, 1,4-di(4-hydroxyphenyl)-2-butanone, 1,4-di(4-hydroxyphenyl)-2,3-butanediol, 1,4-di(4-hydroxyphenyl)-3-hydroxy-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3-)butane, 1-(4-hydroxyphenyl)-4-(indole-3)-1-butene, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butene, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butanol, 1-(4-hydroxyphenyl)-4-(indole-3)-2-butanone, 1-(4-hydroxyphenyl)-4-(indole-3)-2,3-butanediol, 1-(4-hydroxyphenyl)-4-(indole-3)-3-hydroxy-2-butanone, indole-3-acetoaldehyde, 1,4-di(indole-3-)butane, 1,4-di(indole-3)-1-butene, 1,4-di(indole-3)-2-butene, 1,4-di(indole-3)-2-butanol, 1,4-di(indole-3)-2-butanone, 1,4-di(indole-3)-2,3-butanediol, 1,4-di(indole-3)-3-hydroxy-2-butanone, succinate semialdehyde, hexane-1,8-dicarboxylic acid, 3-hexene-1,8-dicarboxylic acid, 3-hydroxy-hexane-1,8-dicarboxylic acid, 3-hexanone-1,8-dicarboxylic acid, 3,4-hexanediol-1,8-dicarboxylic acid, 4-hydroxy-3-hexanone-1,8-dicarboxylic acid, fucoidan, iodine, chlorophyll, carotenoid, calcium, magnesium, iron, sodium, potassium, phosphate, and the like.

[0158] The recitation "optimized" as used herein refers to a pathway, gene, polypeptide, enzyme, or other molecule having an altered biological activity, such as by the genetic alteration of a polypeptide's amino acid sequence or by the alteration/modification of the polypeptide's surrounding cellular environment, to improve its functional characteristics in relation to the original molecule or original cellular environment (e.g., a wild-type sequence of a given polypeptide or a wild-type microorganism). Any of the polypeptides or enzymes described herein may be optionally "optimized," and any of the genes or nucleotide sequences described herein may optionally encode an optimized polypeptide or enzyme. Any of the pathways described herein may optionally contain one or more "optimized" enzymes, or one or more nucleotide sequences encoding for an optimized enzyme or polypeptide.

[0159] Typically, the improved functional characteristics of the polypeptide, enzyme, or other molecule relate to the suitability of the polypeptide or other molecule for use in a biological pathway (e.g., a biosynthesis pathway, a C-C ligation pathway) to convert a monosaccharide or oligosaccharide into a biofuel. Certain embodiments, therefore, contemplate the use of "optimized" biological pathways. An exemplary "optimized" polypeptide may contain one or more alterations or mutations in its amino acid coding sequence (e.g., point mutations, deletions, addition of heterologous sequences) that facilitate improved expression and/or stability in a given microbial system or microorganism, allow regulation of polypeptide activity in relation to a desired substrate (e.g., inducible or repressible activity), modulate the localization of the polypeptide within a cell (e.g., intracellular localization, extracellular secretion), and/or affect the polypeptide's overall level of activity in relation to a desired substrate (e.g., reduce or increase enzymatic activity). A polypeptide or other molecule may also be "optimized" for use with a given microbial system or microorganism by altering one or more pathways within that system or organism, such as by altering a pathway that regulates the expression (e.g., up-regulation), localization, and/or activity of the "optimized" polypeptide or other molecule, or by altering a pathway that minimizes the production of undesirable by-products, among other alterations. In this manner, a polypeptide or other molecule may be "optimized" with or without altering its wild-type amino acid sequence or original chemical structure. Optimized polypeptides or biological pathways may be obtained, for example, by direct mutagenesis or by natural selection for a desired phenotype, according to techniques known in the art.

[0160] In certain aspects, "optimized" genes or polypeptides may comprise a nucleotide coding sequence or amino acid sequence that is 50% to 99% identical (including all integers in between) to the nucleotide or amino acid sequence of a reference (e.g., wild-type) gene or polypeptide. In certain aspects, an "optimized" polypeptide or enzyme may have about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 100 (including all integers and decimal points in between, e.g., 1.2, 1.3, 1.4, 1.5, 5.5, 5.6, 5.7, 60, 70, etc.), or more times the biological activity of a reference polypeptide.

[0161] Certain aspects of the invention also include a commodity chemical, such as a biofuel, that is produced according to the methods and recombinant microorganisms described herein. Such a biofuel (e.g., medium to long chain alkane) may be distinguished from other fuels, such as those fuels produced by traditional refinery from crude carbon sources, by radio-carbon dating techniques. For instance, carbon has two stable, nonradioactive isotopes: carbon-12 ($^{12}$C) and carbon-13 ($^{13}$C). In addition, there are trace amounts of the unstable isotope carbon-14 ($^{14}$C) on Earth. Carbon-14 has a half-life of 5730 years, and would have long ago vanished from Earth were it not for the unremitting impact of cosmic rays on nitrogen in the Earth's atmosphere, which creates more of this isotope. The neutrons resulting from the cosmic ray interactions participate in the following nuclear reaction on the atoms of nitrogen molecules (N$_2$) in the atmospheric air:

$$ n + {}^{14}_{7}\mathrm{N} \rightarrow {}^{14}_{6}\mathrm{C} + p $$

[0162] Plants and other photosynthetic organisms take up atmospheric carbon dioxide by photosynthesis. Since many plants are ingested by animals, every living organism on Earth is constantly exchanging carbon-14 with its environment for the duration of its existence. Once an organism dies, however, this exchange stops, and the amount of carbon-14 gradually decreases over time through radioactive beta decay.

[0163] Most hydrocarbon-based fuels, such as crude oil and natural gas derived from mining operations, are the result of compression and heating of ancient organic materials (i.e., kerogen) over geological time. Formation of petroleum typically occurs from hydrocarbon pyrolysis, in a variety of mostly endothermic reactions at high temperature and/or pressure. Today's oil formed from the preserved remains of prehistoric zooplankton and algae, which had settled to a sea or lake bottom in large quantities under anoxic conditions (the remains of prehistoric terrestrial plants, on the other hand, tended to form coal). Over geological time the organic matter mixed with mud, and was buried under heavy layers of sediment resulting in high levels of heat and pressure (known as diagenesis). This process caused the organic matter to chemically change, first into a waxy material known as kerogen which is found in various oil shales around the world, and then with more heat into liquid and gaseous hydrocarbons in a process known as catagenesis. Most hydrocarbon based fuels derived from crude oil have been undergoing a process of carbon-14 decay over geological time, and, thus, will have little to no detectable carbon-14. In contrast, certain biofuels produced by the living microorganisms of the present invention will comprise carbon-14 at a level comparable to all other presently living things (i.e., an equilibrium level). In this manner, by measuring the carbon-12 to carbon-14 ratio of a hydrocarbon-based biofuel of the present invention, and comparing that ratio to a hydrocarbon based fuel derived from crude oil, the biofuels produced by the methods provided herein can be structurally distinguished from typical sources of hydrocarbon based fuels.
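
The decay argument above can be made concrete with a short calculation. The sketch below uses only the 5730-year half-life stated in the text and the standard exponential decay law; the function names are illustrative, and the calculation ignores measurement background and calibration effects that a real radiocarbon assay would have to handle.

```python
import math

HALF_LIFE_YEARS = 5730.0  # carbon-14 half-life as stated in the text


def c14_fraction_remaining(age_years: float) -> float:
    """Fraction of the original carbon-14 left after age_years without exchange."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    return 0.5 ** (age_years / HALF_LIFE_YEARS)


def apparent_age_years(fraction_of_modern: float) -> float:
    """Age implied by a measured carbon-14 level relative to the equilibrium level."""
    if not 0.0 < fraction_of_modern <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return -HALF_LIFE_YEARS * math.log2(fraction_of_modern)
```

Under these assumptions, carbon that stopped exchanging with the atmosphere a million years ago has passed through roughly 175 half-lives, leaving a fraction far too small to detect, whereas carbon fixed recently by a living microorganism remains near the equilibrium level.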

[0164] Embodiments of the present invention include methods for converting a polysaccharide to a suitable monosaccharide comprising, (a) obtaining the polysaccharide; and (b) contacting the polysaccharide with a recombinant microorganism or microbial system comprising such a microorganism for a time sufficient to convert the polysaccharide to a suitable monosaccharide, wherein the microbial system comprises, (i) at least one gene encoding and expressing an enzyme selected from a lyase and a hydrolase, wherein the lyase and/or hydrolase optionally comprises at least one signal peptide or at least one autotransporter domain; (ii) at least one gene encoding and expressing an enzyme selected from a monosaccharide transporter, a disaccharide transporter, a trisaccharide transporter, an oligosaccharide transporter, and a polysaccharide transporter; and (iii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase, thereby converting the polysaccharide to a suitable monosaccharide.

[0165] Alternatively, certain aspects may include methods for converting a polysaccharide to a suitable monosaccharide comprising, (a) obtaining the polysaccharide; and (b) contacting the polysaccharide with a microbial system for a time sufficient to convert the polysaccharide to a suitable monosaccharide, wherein the microbial system comprises, (i) at least one gene encoding and expressing an enzyme selected from a lyase and a hydrolase; (ii) at least one gene encoding and expressing a superchannel; and (iii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase, thereby converting the polysaccharide to a suitable monosaccharide.

[0166] In certain embodiments, a microbial system or isolated microorganism is capable of growing using a polysaccharide (e.g., alginate, pectin, etc.) as a sole source of carbon and/or energy. A "sole source of carbon" refers generally to the ability to grow on a given carbon source as the only carbon source in a given growth medium.

[0167] With regard to alginate, approximately 50 percent of seaweed dry-weight comprises various sugar components, among which alginate and mannitol are major components corresponding to 30 and 15 percent of seaweed dry-weight, respectively. Although microorganisms such as E. coli are generally considered as host organisms in synthetic biology, and although such microorganisms are able to metabolize mannitol, they completely lack the ability to degrade and metabolize alginate. In this regard, many laboratory or wild-type microorganisms, such as E. coli, are unable to grow on alginate as a sole source of carbon. Similarly, many organisms such as E. coli are unable to degrade and metabolize pectin, a polysaccharide found in many food waste products, and, thus, are unable to grow on pectin as a sole source of carbon. Accordingly, embodiments of the present application include engineered microorganisms, such as E. coli, or microbial systems containing such engineered microorganisms, that are capable of using polysaccharides, such as alginate and pectin, as a sole source of carbon and/or energy.
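
The dry-weight percentages quoted above imply a simple mass balance, sketched here in Python. The 30 and 15 percent figures come from the text; the function names and the idea of an "accessible share" are illustrative assumptions, and the sketch says nothing about actual conversion yields.

```python
ALGINATE_FRACTION = 0.30  # share of seaweed dry weight, from the text
MANNITOL_FRACTION = 0.15  # share of seaweed dry weight, from the text


def sugar_masses_kg(dry_weight_kg: float) -> dict:
    """Alginate and mannitol mass contained in a given dry mass of seaweed."""
    if dry_weight_kg < 0:
        raise ValueError("dry weight must be non-negative")
    return {
        "alginate_kg": dry_weight_kg * ALGINATE_FRACTION,
        "mannitol_kg": dry_weight_kg * MANNITOL_FRACTION,
    }


def accessible_share(uses_alginate: bool, uses_mannitol: bool = True) -> float:
    """Share of the alginate-plus-mannitol pool that a host can metabolize."""
    total = ALGINATE_FRACTION + MANNITOL_FRACTION
    usable = (ALGINATE_FRACTION if uses_alginate else 0.0) + (
        MANNITOL_FRACTION if uses_mannitol else 0.0
    )
    return usable / total
```

On these figures, a host that metabolizes mannitol but not alginate reaches only about one third of the combined alginate and mannitol pool, which is the motivation for engineering alginate utilization.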

[0168] Alginate is a block co-polymer of .beta.-D-mannuronate (M) and .alpha.-L-guluronate (G) (M and G are epimeric about the C5-carboxyl group). Each alginate polymer comprises regions of all M (polyM), all G (polyG), and/or a mixture of M and G (polyMG). To utilize alginate to produce one or more suitable monosaccharides, certain aspects of the present invention provide an engineered or recombinant microorganism or microbial system that is able to degrade or de-polymerize alginate and to use it as a source of carbon and/or energy. As one means of accomplishing this purpose, such recombinant microorganisms may incorporate a set of polysaccharide degrading or depolymerizing enzymes such as alginate lyases (ALs) into the microbial system.

[0169] ALs are mainly classified into two distinct subfamilies depending on their mode of catalysis: endo-acting (EC 4.2.2.3) and exo-acting (EC 4.2.2.-) ALs. Endo-acting ALs are further classified based on their catalytic specificity as M specific or G specific ALs. The endo-acting ALs randomly cleave alginate via a .beta.-elimination mechanism and mainly depolymerize alginate to di-, tri- and tetrasaccharides. The uronate at the non-reducing terminus of each oligosaccharide is converted to an unsaturated sugar uronate, 4-deoxy-.alpha.-L-erythro-hex-4-ene pyranosyl uronate. The exo-acting ALs catalyze further depolymerization of these oligosaccharides and release unsaturated monosaccharides, which may be non-enzymatically converted to monosaccharides, including the .alpha.-keto acid 4-deoxy-.alpha.-L-erythro-hexoseulose uronate (DEHU). Certain embodiments of an engineered microbial system or isolated, engineered microorganism may include endoM-, endoG- and exo-acting ALs to degrade or depolymerize aquatic or marine-biomass polysaccharides such as alginate to a monosaccharide such as DEHU.

[0170] Embodiments of the present invention may also include lyases such as alginate lyases isolated from various sources, including, but not limited to, marine algae, mollusks, and wide varieties of microbes such as genus Pseudomonas, Vibrio, and Sphingomonas. Many alginate lyases are endo-acting M specific, several are G specific, and few are exo-acting. For example, ALs isolated from Sphingomonas sp. strain A1 include five endo-acting ALs, A1-I, A1-II, A1-II', A1-III, and A1-IV' and an exo-acting AL, A1-IV.

[0171] Typically, A1-I, A1-II, and A1-III have molecular weights of 66 kDa, 25 kDa, and 40 kDa, respectively. A1-II and A1-III are self-splicing products of A1-I. A1-II may be more specific to G and A1-III may be specific to M. A1-I may have high activity for both M and G. A1-IV has a molecular weight of about 85 kDa and catalyzes exo-lytic depolymerization of oligoalginate. Although A1-II' and A1-IV' are functional homologues of A1-II and A1-IV, respectively, A1-II' has endo-lytic activity and may have no preference for M or G, and A1-IV' has primarily endo-lytic activity. In addition to these ALs, exo-lytic AL Atu3025 derived from Agrobacterium tumefaciens has high activity for depolymerization of oligoalginate, and may be used in certain embodiments of the present invention. Certain embodiments may incorporate into the microbial system or isolated microorganism the genes encoding A1-I, A1-II', A1-IV, and Atu3025, and may include optimal codon usage for the suitable host organisms, such as E. coli.
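
The text mentions optimal codon usage for a host such as E. coli without specifying a method. As a minimal sketch of the idea, the Python below back-translates a protein sequence using one codon per amino acid. The codon table is an illustrative assumption (a set of codons commonly described as frequent in E. coli), not a table taken from this application, and practical codon optimization typically weighs further factors such as mRNA structure and codon context.

```python
# Illustrative one-codon-per-residue table (assumption, not from the source).
ILLUSTRATIVE_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Encode a protein as DNA using the single illustrative codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in ILLUSTRATIVE_CODONS:
            raise ValueError(f"unsupported residue: {residue!r}")
        codons.append(ILLUSTRATIVE_CODONS[residue])
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


def translate(dna: str) -> str:
    """Decode DNA produced by back_translate, stopping at the stop codon."""
    codon_to_residue = {codon: aa for aa, codon in ILLUSTRATIVE_CODONS.items()}
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        if codon == STOP_CODON:
            break
        residues.append(codon_to_residue[codon])
    return "".join(residues)
```

Round-tripping a sequence through `back_translate` and `translate` confirms that the recoded gene still encodes the same polypeptide, which is the invariant any codon optimization has to preserve.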

[0172] Certain examples of alginate lyases or oligoalginate lyases that may be utilized herein include enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to SEQ ID NOS:67-68, which show the nucleotide (SEQ ID NO:67) and polypeptide (SEQ ID NO:68) sequences of oligoalginate lyase Atu3025 isolated from Agrobacterium tumefaciens. Certain examples of alginate lyases that may be utilized herein include enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the alginate lyase enzymes described in FIGS. 37A-B, as well as the secreted alginate lyase encoded by Vs24254 from Vibrio splendidus.

[0173] In certain embodiments, a microbial system or recombinant microorganism may be engineered to secrete or display the lyases or alginate lyases (ALs) to the culture media, such as by incorporating a signal peptide or autotransporter domain into the lyase. In this regard, it is typically understood that bacteria have at least four different types of protein secretion machinery (type I, II, III and IV). For example, in E. coli, the type II secretion machinery is used for the secretion of recombinant proteins. The type II secretion machinery may comprise a two-step process: the translocation of premature proteins tagged with signal peptides to the periplasm fraction and processing to the mature proteins followed by secretion to media.

[0174] The first process may proceed by any of three different pathways: the secB-dependent pathway, the signal recognition particle (SRP) pathway, or the twin-arginine translocation (TAT) pathway. Recombinant proteins may be secreted into the periplasm fraction. The fates of the mature proteins vary depending on the type of protein. For example, some proteins are secreted spontaneously by diffusion or passively by a secretion apparatus named the secreton, which consists of 12-16 proteins, and others stay in the periplasm fraction and are eventually degraded.

[0175] Some proteins may also be secreted by an autotransporter apparatus, such as by utilizing an autotransporter domain. The proteins secreted by autotransporter domains typically comprise an N-terminal signal peptide that plays a role in translocation to the periplasm (which may be mediated by secB or SRP pathways), a passenger domain, and/or a C-terminal translocation unit (TU) having a characteristic .beta.-barrel structure. The .beta.-barrel portion of the TU builds an aqueous pore channel across the outer membrane and helps the transportation of the passenger domain to media. Autodisplayed passenger proteins are often cleaved by the autotransporter and set free to media.

[0176] The type I secretion machinery may also be used for the secretion of recombinant proteins in E. coli. The type I secretion machinery may be used for the secretion of high-molecular-weight toxins and exoenzymes. The type I secretion machinery consists of two inner membrane proteins (HlyB and HlyD) that are members of the ATP binding cassette (ABC) transporter family, and an endogenous outer membrane protein (TolC). The secretion of recombinant proteins based on type I secretion machinery may utilize the C-terminal region of .alpha.-haemolysin (HlyA) as a signal sequence. The recombinant proteins may readily pass through the inner membrane, periplasm, and outer membrane through the type I secretion machinery.

[0177] Depending on the types of linker and signal peptides utilized by various embodiments of the present application, both autotransporter and type I secretion machinery can be adapted to serve as cell surface display machinery. Alternatively, a system specific to cell surface display may be used. For example, in this system, target proteins may be fused to PgsA protein (a poly-.gamma.-glutamate synthetase complex) that is natively displayed on the surface of Bacillus subtilis.

[0178] Certain embodiments may include lyases such as alginate lyases fused with various signal peptides and/or autotransporter domains found in proteins secreted by both type I and type II secretion machinery. Other embodiments may include lyases such as alginate lyases fused with any combination of signal peptides and/or autotransporter domains found in proteins secreted by transport machinery as described herein or known to a person skilled in the art. Embodiments may also include signal peptides or autotransporter domains that are experimentally redesigned to maximize the secretion of lyases such as alginate lyases to the culture media, and may also include the use of many different linker sequences that fuse signal peptides, lyases, and autotransporters that improve the efficiency of secretion or the cell surface presentation of lyases.

[0179] Certain embodiments may include a microbial system or isolated microorganism that comprises saccharide transporters, which are able to transport monosaccharides (e.g., DEHU) and oligosaccharides from the media to the cytosol to efficiently utilize these monosaccharides as a source of carbon and/or energy. For instance, genes encoding monosaccharide permeases (i.e., monosaccharide transporters) such as DEHU permeases may be isolated from bacteria that grow on polysaccharides such as alginate as a source of carbon and/or energy, and may be incorporated into embodiments of the present microbial system or isolated microorganism. As an additional example, embodiments may also include redesigned native permeases or transporters with altered specificity for monosaccharide (e.g., DEHU) transportation.

[0180] In this regard, E. coli contains several permeases able to transport monosaccharides, which include, but are not limited to, KdgT for 2-keto-3-deoxy-D-gluconate (KDG) transporter, ExuT for aldohexuronates such as D-galacturonate and D-glucuronate transporter, GntT, GntU, and GntP for gluconate transporter, and KgtP for proton-driven .alpha.-ketoglutarate transporter. Microbial systems or recombinant microorganisms described herein may comprise any of these permeases, in addition to those permeases known to a person of skill in the art and not mentioned herein, and may also include permease enzymes redesigned to transport other monosaccharides, such as DEHU.

[0181] A microbial system or recombinant microorganism according to the present invention may also comprise permeases/transporters/superchannels/porins that catalyze the transport of monosaccharides (e.g., D-mannuronate and D-lyxose) from media to the periplasm or cytosol of a microorganism. For example, genes encoding the permeases of D-mannuronate in soil Aeromonas may be incorporated into a microbial system as described herein.

[0182] As one alternative example, a microbial system or microorganism may comprise native permeases/transporters that are redesigned to alter their specificity for efficient monosaccharide transportation, such as for D-mannuronate and D-lyxose transportation. For instance, E. coli contains several permeases that are able to transport monosaccharides or sugars such as D-mannonate and D-lyxose, including KdgT for 2-keto-3-deoxy-D-gluconate (KDG) transporter, ExuT for aldohexuronates such as D-galacturonate and D-glucuronate transporter, GntPTU for gluconate/fructuronate transporter, uidB for glucuronide transporter, fucP for L-fucose transporter, galP for galactose transporter, yghK for glycolate transporter, dgoT for D-galactonate transporter, uhpT for hexose phosphate transporter, dctA for orotate/citrate transporter, gntUT for gluconate transporter, malEGF for maltose transporter, alsABC for D-allose transporter, idnT for L-idonate/D-gluconate transporter, KgtP for proton-driven .alpha.-ketoglutarate transporter, lacY for lactose/galactose transporter, xylEFGH for D-xylose transporter, araEFGH for L-arabinose transporter, and rbsABC for D-ribose transporter. In certain embodiments, a microbial system or recombinant microorganism may comprise permeases or transporters as described above, including those that are re-designed or optimized for improved transport of certain monosaccharides, such as D-mannuronate, DEHU, and D-lyxose.

[0183] Certain aspects may employ a recombinant microorganism that comprises a "superchannel," by which aquatic or marine-biomass polysaccharides such as alginate polymers, or fruit or vegetable biomass such as pectin polymers, may be directly incorporated into the cytosol and degraded inside the microbial system. For instance, a group of bacteria characterized as Sphingomonads have a wide range in capability of degrading environmentally hazardous compounds such as polychlorinated polycyclic aromatics (dioxin). These bacteria contain characteristic large pleat-like molecules on their cell surfaces. In this regard, certain Sphingomonads have structures characterized as "superchannels" that enable the bacteria to directly take up macromolecules.

[0184] As one particular example of a microorganism comprising a superchannel, Sphingomonas sp. strain A1 directly incorporates polysaccharides such as alginate through a superchannel. Such superchannels may consist of a pit on the outer membrane (e.g., AlgR), alginate-binding proteins in the periplasm (e.g., AlgQ1 and AlgQ2), and an ATP-binding cassette (ABC) transporter (e.g., AlgM1, AlgM2, and AlgS). Incorporated polysaccharides such as alginate may be readily depolymerized by lyases such as alginate lyases produced in the cytosol. Thus, certain embodiments may incorporate genes encoding a superchannel (e.g., ccpA, algS, algM1, algM2, algQ1, algQ2) to introduce this ability to the microbial system or recombinant microorganism. Other embodiments may include microorganisms such as Sphingomonas subarctica IFO 16058.sup.T, which harbor the plasmid containing genes that encode a superchannel, and which have significantly improved ability to utilize marine or aquatic biomass polysaccharides such as alginate as a source of carbon and/or energy. Certain recombinant microorganisms may employ these superchannel encoding plasmid sequences contained within Sphingomonas subarctica IFO 16058.sup.T.

[0185] Certain examples of alginate ABC transporters that may be utilized herein include ABC transporters Atu3021, Atu3022, Atu3023, Atu3024, algM1, algM2, AlgQ1, AlgQ2, AlgS, OG2516.sub.--05558, OG2516.sub.--05563, OG2516.sub.--05568, and OG2516.sub.--05573, including functional variants thereof. Certain examples of alginate symporters that may be utilized herein include symporters V12B01.sub.--24239 and V12B01.sub.--24194, among others, including functional variants thereof. One additional example of an alginate porin includes V12B01.sub.--24269, and variants thereof.

[0186] As noted above, certain embodiments may include recombinant microorganisms that comprise one or more monosaccharide dehydrogenases, isomerases, dehydratases, kinases, and aldolases. With regard to monosaccharide dehydrogenases, certain microbial systems or recombinant microorganisms may incorporate enzymes that reduce various monosaccharides (e.g., DEHU, mannuronate) to a monosaccharide that is suitable for biofuel biosynthesis, such as 2-keto-3-deoxy-D-gluconate (KDG) or D-mannitol. Such exemplary enzymes include, for example, DEHU hydrogenases and mannuronate hydrogenases, in addition to various alcohol dehydrogenases having DEHU hydrogenase and/or mannuronate dehydrogenase activity, such as the novel ADH1 through ADH12 enzymes isolated from Agrobacterium tumefaciens C58 (see, e.g., SEQ ID NOS:69-92).

[0187] For more detail on the ADH1 through ADH12 enzymes, SEQ ID NO:69 shows the nucleotide and SEQ ID NO:70 shows the polypeptide sequence of ADH1 Atu1557 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:71 shows the nucleotide and SEQ ID NO:72 shows the polypeptide sequence of ADH2 Atu2022 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:73 shows the nucleotide and SEQ ID NO:74 shows the polypeptide sequence of ADH3 Atu0626 isolated from Agrobacterium tumefaciens C58.

[0188] SEQ ID NO:75 shows the nucleotide and SEQ ID NO:76 shows the polypeptide sequence of ADH4 Atu5240 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:77 shows the nucleotide and SEQ ID NO:78 shows the polypeptide sequence of ADH5 Atu3163 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:79 shows the nucleotide and SEQ ID NO:80 shows the polypeptide sequence of ADH6 Atu2151 isolated from Agrobacterium tumefaciens C58.

[0189] SEQ ID NO:81 shows the nucleotide and SEQ ID NO:82 shows the polypeptide sequence of ADH7 Atu2814 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:83 shows the nucleotide and SEQ ID NO:84 shows the polypeptide sequence of ADH8 Atu5447 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:85 shows the nucleotide and SEQ ID NO:86 shows the polypeptide sequence of ADH9 Atu4087 isolated from Agrobacterium tumefaciens C58.

[0190] SEQ ID NO:87 shows the nucleotide and SEQ ID NO:88 shows the polypeptide sequence of ADH10 Atu4289 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:89 shows the nucleotide and SEQ ID NO:90 shows the polypeptide sequence of ADH11 Atu3027 isolated from Agrobacterium tumefaciens C58. SEQ ID NO:91 shows the nucleotide and SEQ ID NO:92 shows the polypeptide sequence of ADH12 Atu3026 isolated from Agrobacterium tumefaciens C58.
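
The ADH1 through ADH12 listing above follows a regular pattern: each enzyme has a nucleotide sequence at an odd SEQ ID NO and its polypeptide at the next even one, starting from SEQ ID NO:69. A small Python lookup table, sketched below, captures that pattern. The locus tags and numbering are taken from the text; the names `AdhEntry` and `build_adh_table` are hypothetical and exist only for this illustration.

```python
from typing import Dict, NamedTuple


class AdhEntry(NamedTuple):
    locus_tag: str           # Agrobacterium tumefaciens C58 locus tag
    nucleotide_seq_id: int   # SEQ ID NO of the nucleotide sequence
    polypeptide_seq_id: int  # SEQ ID NO of the polypeptide sequence


# Locus tags in ADH1..ADH12 order, as listed in the text.
ADH_LOCUS_TAGS = [
    "Atu1557", "Atu2022", "Atu0626", "Atu5240", "Atu3163", "Atu2151",
    "Atu2814", "Atu5447", "Atu4087", "Atu4289", "Atu3027", "Atu3026",
]


def build_adh_table(first_seq_id: int = 69) -> Dict[str, AdhEntry]:
    """Map 'ADH1'..'ADH12' to locus tag and paired SEQ ID NOs."""
    table = {}
    for index, locus_tag in enumerate(ADH_LOCUS_TAGS):
        nucleotide_id = first_seq_id + 2 * index
        table[f"ADH{index + 1}"] = AdhEntry(locus_tag, nucleotide_id, nucleotide_id + 1)
    return table


ADH_TABLE = build_adh_table()
```

A lookup such as `ADH_TABLE["ADH11"]` then returns the locus tag Atu3027 together with SEQ ID NOS 89 and 90, matching the listing above.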

[0191] Further examples of enzymes having dehydrogenase activity include Atu3026, Atu3027, OG2516.sub.--05543, OG2516.sub.--05538 and V12B01.sub.--24244. The microorganisms and methods of the present invention may also utilize biologically active fragments and variants of these hydrogenase enzymes, including optimized variants thereof.

[0192] As a further example, Pseudomonas grown using alginate as a sole source of carbon and energy comprises a DEHU hydrogenase enzyme that uses NADPH as a co-factor, is more stable when NADP.sup.+ is present in the solution, and is active at ambient pH. Thus, certain embodiments of a microbial system or a recombinant microorganism as described herein may incorporate genes encoding hydrogenases such as DEHU or mannuronate hydrogenase derived or obtained from various microbes, in which these microbes may be capable of growing on polysaccharides such as alginate or pectin as a source of carbon and/or energy.

[0193] Certain embodiments may incorporate components of a microbial system or isolated microorganism that is capable of efficiently growing on monosaccharides such as D-mannuronate or D-lyxose as a source of carbon and energy. For instance, both Aeromonas and Aerobacter aerogenes PRL-R3 comprise genes encoding monosaccharide dehydrogenases such as D-mannuronate hydrogenase and D-lyxose isomerase. Thus, certain microbial systems or recombinant microorganisms may comprise monosaccharide dehydrogenases such as D-mannuronate hydrogenase and D-lyxose isomerase from Aeromonas, Aerobacter aerogenes PRL-R3, or various other suitable microorganisms, including those microorganisms capable of growing on D-mannuronate or D-lyxose as a source of carbon and energy.

[0194] Certain embodiments may include a microbial system or isolated microorganism with enhanced efficiency for converting monosaccharides such as D-mannonate and D-xylulose into monosaccharides suitable for a biofuel biosynthesis pathway such as KDG. Merely by way of explanation, D-mannonate and D-xylulose are metabolites in microbes such as E. coli. D-mannonate is converted by a D-mannonate dehydratase to KDG. D-xylulose enters the pentose phosphate pathway. Thus, to increase conversion of D-mannonate to KDG, an exogenous or endogenous D-mannonate dehydratase (e.g., uxuA) gene may be over-expressed in a recombinant microorganism of the invention. Similarly, in other embodiments, suitable endogenous or exogenous genes such as kinases (e.g., kdgK), nad, as well as KDG aldolases (e.g., kdgA and eda) may be either incorporated or overexpressed in a given recombinant microorganism (see SEQ ID NOS:93-96), including biologically active variants or fragments thereof, such as optimized variants of these genes. SEQ ID NO:93 shows the nucleotide sequence and SEQ ID NO:94 shows the polypeptide sequence of a 2-keto-deoxy gluconate kinase (KdgK) from Escherichia coli DH10B. SEQ ID NO:95 shows the nucleotide sequence and SEQ ID NO:96 shows the polypeptide sequence of a 2-keto-deoxy gluconate-6-phosphate aldolase (KdgA) from Escherichia coli DH10B.

[0195] In certain aspects, as noted above, a recombinant microorganism that is capable of growing on alginate or pectin as a sole source of carbon may utilize a naturally-occurring or endogenous copy of a dehydratase, kinase, and/or aldolase. For instance, E. coli contains endogenous dehydratases, kinases, and aldolases that are capable of catalyzing the appropriate steps in the conversion of polysaccharides to a suitable monosaccharide. In these and other related aspects, the naturally-occurring dehydratase, kinase, or aldolase may also be over-expressed, such as by providing an exogenous copy of the naturally-occurring dehydratase, kinase, or aldolase operably linked to a highly constitutive or inducible promoter.

[0196] As one exemplary source of enzymes for engineering a recombinant microorganism to grow on alginate as a sole source of carbon, Vibrio splendidus is known to be able to metabolize alginate to support growth. For example, SEQ ID NO:1 shows a secretome region carrying certain Vibrio splendidus genes (V12B01.sub.--02425 to V12B01.sub.--02480), which encode a type II secretion apparatus. SEQ ID NO:2 shows the nucleotide sequence of an entire genomic region between V12B01.sub.--24189 and V12B01.sub.--24249, which was derived from Vibrio splendidus, and which when transformed into E. coli as a fosmid clone was sufficient to confer the ability to grow on alginate as a sole source of carbon. SEQ ID NOS:3-64 show the individual putative genes contained within SEQ ID NO:2. Thus, in certain aspects, a recombinant microorganism (e.g., E. coli) that is able to grow on alginate as a sole source of carbon and/or energy may comprise one or more nucleotide or polypeptide reference sequences described in SEQ ID NOS:1-64, including biologically active fragments or variants thereof, such as optimized variants.

[0197] In certain aspects, a recombinant microorganism that is able to grow on alginate as a sole source of carbon may contain certain coding nucleotide or polypeptide sequences contained within SEQ ID NO:2, such as the sequences in SEQ ID NOS:3-64, or biologically active fragments or variants thereof, including optimized variants. These sequences are described in further detail below.

[0198] SEQ ID NO:3 shows the nucleotide coding sequence of the putative protein V12B01.sub.--24184. This putative coding sequence is contained within the polynucleotide sequence of SEQ ID NO:2, and encodes a polypeptide that is similar to an autotransporter adhesion or type I secretion target ggxgxdxxx (SEQ ID NO:145) repeat. SEQ ID NO:4 shows the polypeptide sequence of putative protein V12B01.sub.--24184, encoded by the polynucleotide of SEQ ID NO:3. This putative polypeptide is similar to autotransporter adhesion or type I secretion target ggxgxdxxx (SEQ ID NO:145) repeat.

[0199] SEQ ID NO:5 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24189. SEQ ID NO:6 shows the polypeptide sequence of the putative protein V12B01.sub.--24189, which is similar to cyclohexadienyl dehydratase.

[0200] SEQ ID NO:7 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24194. SEQ ID NO:8 shows the polypeptide sequence of the putative protein V12B01.sub.--24194, which is similar to a Na/proline transporter.

[0201] SEQ ID NO:9 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24199. SEQ ID NO:10 shows the polypeptide sequence of the putative protein V12B01.sub.--24199, which is similar to a keto-deoxy-phosphogluconate aldolase.

[0202] SEQ ID NO:11 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24204. SEQ ID NO:12 shows the polypeptide sequence of the putative protein V12B01.sub.--24204, which is similar to 2-dehydro-3-deoxygluconokinase.

[0203] SEQ ID NO:13 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24209. SEQ ID NO:14 shows the polypeptide sequence of the putative protein V12B01.sub.--24209.

[0204] SEQ ID NO:15 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24214. SEQ ID NO:16 shows the polypeptide sequence of the putative protein V12B01.sub.--24214, which is similar to a chondroitin AC/alginate lyase.

[0205] SEQ ID NO:17 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24219. SEQ ID NO:18 shows the polypeptide sequence of the putative protein V12B01.sub.--24219, which is similar to a chondroitin AC/alginate lyase.

[0206] SEQ ID NO:19 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24224. SEQ ID NO:20 shows the polypeptide sequence of the putative protein V12B01.sub.--24224, which is similar to a 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase.

[0207] SEQ ID NO:21 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24229. SEQ ID NO:22 shows the polypeptide sequence of the putative protein V12B01.sub.--24229, which is similar to a GntR-family transcriptional regulator.

[0208] SEQ ID NO:23 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24234. SEQ ID NO:24 shows the polypeptide sequence of the putative protein V12B01.sub.--24234, which is similar to a Na.sup.+/proline symporter.

[0209] SEQ ID NO:25 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24239. SEQ ID NO:26 shows the polypeptide sequence of the putative protein V12B01.sub.--24239, which is similar to an oligoalginate lyase.

[0210] SEQ ID NO:27 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24244. SEQ ID NO:28 shows the polypeptide sequence of putative protein V12B01.sub.--24244, which is similar to a 3-hydroxyisobutyrate dehydrogenase.

[0211] SEQ ID NO:29 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24249. SEQ ID NO:30 shows the polypeptide sequence of the putative protein V12B01.sub.--24249, which is similar to a methyl-accepting chemotaxis protein.

[0212] SEQ ID NO:31 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24254. SEQ ID NO:32 shows the polypeptide sequence of putative protein V12B01.sub.--24254, which is similar to an alginate lyase.

[0213] SEQ ID NO:33 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24259. SEQ ID NO:34 shows the polypeptide sequence of putative protein V12B01.sub.--24259, which is similar to an alginate lyase.

[0214] SEQ ID NO:35 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24264. SEQ ID NO:36 shows the polypeptide sequence of putative protein V12B01.sub.--24264.

[0215] SEQ ID NO:37 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24269. SEQ ID NO:38 shows the polypeptide sequence of putative protein V12B01.sub.--24269, which is similar to a putative oligogalacturonate specific porin.

[0216] SEQ ID NO:39 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--24274. SEQ ID NO:40 shows the polypeptide sequence of putative protein V12B01.sub.--24274, which is similar to an alginate lyase.

[0217] FIG. 32 shows the nucleotide coding sequence and polypeptide sequence of putative protein V12B01.sub.--02425. FIG. 32A shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02425 (SEQ ID NO:41). FIG. 32B shows the polypeptide sequence of putative protein V12B01.sub.--02425 (SEQ ID NO:42), which is similar to a type II secretory pathway component EpsC.

[0218] SEQ ID NO:43 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02430. SEQ ID NO:44 shows the polypeptide sequence of putative protein V12B01.sub.--02430, which is similar to a type II secretory pathway component EpsD.

[0219] SEQ ID NO:45 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02435. SEQ ID NO:46 shows the polypeptide sequence of putative protein V12B01.sub.--02435, which is similar to a type II secretory pathway component EpsE.

[0220] SEQ ID NO:47 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02440. SEQ ID NO:48 shows the polypeptide sequence of putative protein V12B01.sub.--02440, which is similar to a type II secretory pathway component EpsF.

[0221] SEQ ID NO:49 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02445. SEQ ID NO:50 shows the polypeptide sequence of putative protein V12B01.sub.--02445, which is similar to a type II secretory pathway component EpsG.

[0222] SEQ ID NO:51 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02450. SEQ ID NO:52 shows the polypeptide sequence of putative protein V12B01.sub.--02450, which is similar to a type II secretory pathway component EpsH.

[0223] SEQ ID NO:53 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02455. SEQ ID NO:54 shows the polypeptide sequence of putative protein V12B01.sub.--02455, which is similar to a type II secretory pathway component EpsI.

[0224] SEQ ID NO:55 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02460. SEQ ID NO:56 shows the polypeptide sequence of putative protein V12B01.sub.--02460, which is similar to a type II secretory pathway component EpsJ.

[0225] SEQ ID NO:57 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02465. SEQ ID NO:58 shows the polypeptide sequence of putative protein V12B01.sub.--02465, which is similar to a type II secretory pathway component EpsK.

[0226] SEQ ID NO:59 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02470. SEQ ID NO:60 shows the polypeptide sequence of putative protein V12B01.sub.--02470, which is similar to a type II secretory pathway component EpsL.

[0227] SEQ ID NO:61 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02475. SEQ ID NO:62 shows the polypeptide sequence of putative protein V12B01.sub.--02475, which is similar to a type II secretory pathway component EpsM.

[0228] SEQ ID NO:63 shows the nucleotide sequence that encodes the putative protein V12B01.sub.--02480. SEQ ID NO:64 shows the polypeptide sequence of putative protein V12B01.sub.--02480, which is similar to a type II secretory pathway component EpsC.
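
For SEQ ID NOS:21-64 as listed above, the sequence numbers and locus tags follow a regular pattern: odd numbers are nucleotide sequences, even numbers are the corresponding polypeptides, and the locus number rises by 5 per gene. The following Python sketch is offered only as a reading aid. The function name is hypothetical, and the sketch assumes that the odd/even convention holds for every pair in this range; it does not cover SEQ ID numbers outside the range.

```python
def describe_seq_id(seq_id: int) -> tuple[str, str]:
    """Return (locus tag, sequence type) for SEQ ID NOS:21-64.

    Assumption: odd numbers are nucleotide sequences and even numbers are
    polypeptide sequences, as in the listing above.
    """
    if 21 <= seq_id <= 40:
        first_seq, first_locus = 21, 24229   # V12B01_24229 .. V12B01_24274
    elif 41 <= seq_id <= 64:
        first_seq, first_locus = 41, 2425    # V12B01_02425 .. V12B01_02480
    else:
        raise ValueError("only SEQ ID NOS:21-64 are covered by this sketch")

    # Each gene takes two consecutive SEQ ID numbers; locus numbers step by 5.
    locus_number = first_locus + 5 * ((seq_id - first_seq) // 2)
    kind = "nucleotide" if seq_id % 2 == 1 else "polypeptide"
    return f"V12B01_{locus_number:05d}", kind
```

For example, `describe_seq_id(26)` yields the tag for the putative oligoalginate lyase polypeptide described in paragraph [0209].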

[0229] As a further exemplary source of enzymes for engineering a microorganism to grow on alginate, Agrobacterium tumefaciens C58 is able to metabolize relatively small sizes of alginate molecules (.about.1000 mers) as a sole source of carbon and energy. Since A. tumefaciens C58 has long been used for plant biotechnology, the genetics of this organism has been relatively well studied, and many genetic tools are available and compatible with other gram-negative bacteria such as E. coli. Thus, certain aspects may employ this microbe, or the genes therein, for the production of suitable monosaccharides. For instance, as noted above, the present disclosure provides a series of novel ADH genes having both DEHU and mannuronate hydrogenase activity that were obtained from Agrobacterium tumefaciens C58 (see SEQ ID NOS: 67-92).

[0230] As noted above, certain aspects may include a recombinant microorganism or microbial system that is capable of growing on pectin as a sole source of carbon and/or energy. Pectin is a linear chain of .alpha.-(1-4)-linked D-galacturonic acid that forms the pectin-backbone, a homogalacturonan. Within this backbone, there are regions where galacturonic acid is replaced by (1-2)-linked L-rhamnose. From rhamnose, side chains of various neutral sugars typically branch off. This type of pectin is called rhamnogalacturonan I. Overall, up to about every 25th galacturonic acid in the main chain is exchanged with rhamnose. Some stretches consist of alternating galacturonic acid and rhamnose ("hairy regions"), while others have a lower density of rhamnose ("smooth regions"). The neutral sugars mainly comprise D-galactose, L-arabinose and D-xylose; the types and proportions of neutral sugars vary with the origin of pectin. In nature, around 80% of carboxyl groups of galacturonic acid are esterified with methanol. Some plants, like sugar-beet, potatoes and pears, contain pectins with acetylated galacturonic acid in addition to methyl esters. Acetylation prevents gel-formation but increases the stabilising and emulsifying effects of pectin. Certain pectin degradation and metabolic pathways are exemplified in FIG. 3.

[0231] In addition to the genes, enzymes, and biological pathways described above, certain recombinant microorganisms may incorporate features that are useful for growth on pectin as a sole source of carbon. For instance, to degrade and metabolize pectin as a sole source of carbon, pectin methyl and acetyl esterases first catalyze the hydrolysis of methyl and acetyl esters on pectin. Examples of pectin methyl esterases include, but are not limited to, pemA and pmeB. Examples of pectin acetyl esterases include, but are not limited to, PaeX and PaeY. Further examples of pectin methyl esterases that may be utilized herein include enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the pectate methyl esterases in FIGS. 40A-B. Further examples of pectate acetyl esterases that may be utilized herein include enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the pectate acetyl esterases described in FIG. 41.

[0232] Further to this end, pectate lyases and hydrolases may catalyze the endolytic cleavage of pectate via .beta.-elimination and hydrolysis, respectively, to produce oligopectates. Examples of pectate lyases include, but are not limited to, PelA, PelB, PelC, PelD, PelE, Pelf, PelI, PelL, and PelZ. Examples of pectate hydrolases include, but are not limited to, PehA, PehN, PehV, PehW, and PehX. Further examples of pectate lyases include polypeptides or enzymes sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the pectate lyases described in FIGS. 38A-E.

[0233] Polygalacturonases, rhamnogalacturonan lyases, and rhamnogalacturonan hydrolases may also be utilized herein to degrade and metabolize pectin. Examples of rhamnogalacturonan lyases include polypeptides or enzymes sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the rhamnogalacturonan lyases (i.e., rhamnogalacturonases) described in FIG. 39A. Examples of rhamnogalacturonate hydrolases include polypeptides or enzymes sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity (including all integers in between) to the rhamnogalacturonate hydrolases described in FIG. 39B.

[0234] Thus, to degrade and metabolize pectin, certain of the recombinant microorganisms and methods of the present invention may incorporate one or more of the above noted methyl and acetyl esterases, lyases, and/or hydrolases, among others known in the art. These enzymes may be encoded and expressed by endogenous or exogenous genes, and may also include biologically active fragments or variants thereof, such as homologs, orthologs, and/or optimized variants of these enzymes.
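
The sequence identity thresholds recited above (at least 60%, 70%, 80%, 90%, 95%, 98%, or more) can be illustrated with a minimal Python sketch. The sketch assumes that the two sequences have already been aligned, with "-" marking gaps, and that identity is counted over ungapped aligned positions only. Other conventions for the denominator exist, and the present disclosure does not prescribe this particular one. The function names are hypothetical.

```python
# Identity thresholds recited in the text, in percent.
THRESHOLDS = (60, 70, 80, 90, 95, 98)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    Assumption: both inputs come from the same alignment, so they have
    equal length, and '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float):
    """Return the largest recited threshold that the identity meets, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

With the made-up five-residue strings "MKTAY" and "MKSAY", four of five positions match, so the identity is 80% and the 60%, 70%, and 80% thresholds are met.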

[0235] To further metabolize the degradation products of pectin, oligopectates may be transported into the periplasm fraction of gram-negative bacteria by outer membrane porins, where they are further degraded into such components as di- and tri-galacturonates. Examples of outer membrane porins that can transport oligopectates into the periplasm include, but are not limited to, kdgN and kdgM. Certain recombinant microorganisms may incorporate these or similar genes.

[0236] Di- and tri-galacturonates may then be transported into the cytosol for further degradation. Bacteria contain at least two different transporter systems responsible for di- and tri-galacturonate transportation, including a symporter and an ABC transporter (e.g., TogT and TogMNAB, respectively). Thus, certain of the recombinant microorganisms provided herein may comprise one or more di- or tri-galacturonate transporter systems, such as TogT and/or TogMNAB.

[0237] Once di- and trigalacturonate are incorporated into the cytosol, short pectate or galacturonate lyases break them down to D-galacturonate and (4S)-4,6-dihydroxy-2,5-dioxohexuronate. Examples of short pectate or galacturonate lyases include, but are not limited to, PelW and Ogl, which genes may be either endogenously or exogenously incorporated into certain recombinant microorganisms provided herein. D-galacturonate and (4S)-4,6-dihydroxy-2,5-dioxohexuronate are then converted to 5-dehydro-4-deoxy-D-glucuronate and further to KDG, which steps may be catalyzed by KduI and KduD, respectively. The KduI enzyme has an isomerase activity, and the KduD enzyme has a dehydrogenase activity, such as a 2-deoxy-D-gluconate 3-dehydrogenase activity. Accordingly, certain recombinant microorganisms provided herein may comprise one or more short pectate or galacturonate lyases, such as PelW and/or Ogl, and may optionally comprise one or more isomerases, such as KduI, as well as one or more dehydrogenases, such as KduD, to convert di- and trigalacturonates into a suitable monosaccharide, such as KDG.
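
The sequence of steps from pectin to KDG described in the preceding paragraphs may be summarized as an ordered list. The Python sketch below is illustrative only: the `Step` type and the intermediate labels are hypothetical, only a subset of the example enzymes is shown, and the step that degrades oligopectates in the periplasm carries no enzyme names because none are given for it above.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    examples: tuple  # example genes/enzymes named in the text; may be empty


PECTIN_TO_KDG = [
    Step("pectin", "pectate", ("pemA", "pmeB", "PaeX", "PaeY")),
    Step("pectate", "oligopectates", ("PelA", "PelE", "PehA", "PehX")),
    Step("oligopectates", "oligopectates (periplasm)", ("kdgM", "kdgN")),
    # No specific enzymes are named in the text for this periplasmic step.
    Step("oligopectates (periplasm)", "di-/tri-galacturonates (periplasm)", ()),
    Step("di-/tri-galacturonates (periplasm)",
         "di-/tri-galacturonates (cytosol)", ("TogT", "TogMNAB")),
    Step("di-/tri-galacturonates (cytosol)",
         "D-galacturonate + (4S)-4,6-dihydroxy-2,5-dioxohexuronate",
         ("PelW", "Ogl")),
    Step("D-galacturonate + (4S)-4,6-dihydroxy-2,5-dioxohexuronate",
         "5-dehydro-4-deoxy-D-glucuronate", ("KduI",)),
    Step("5-dehydro-4-deoxy-D-glucuronate", "KDG", ("KduD",)),
]


def is_connected(steps) -> bool:
    """True if every step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def named_examples(steps) -> list:
    """Flatten the example genes/enzymes in pathway order."""
    return [name for step in steps for name in step.examples]
```

Calling `is_connected(PECTIN_TO_KDG)` confirms that the listed steps form an unbroken chain from pectin to KDG.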

[0238] In certain aspects, a recombinant microorganism, such as E. coli, that is able to grow on pectin or tri-galacturonate as a sole source of carbon and/or energy may comprise one or more of the gene sequences contained within SEQ ID NOS:65 and 66, including biologically active fragments or variants thereof, such as optimized variants. SEQ ID NO:65 shows the nucleotide sequence of the kdgF-PaeX region from Erwinia carotovora subsp. Atroseptica SCRI1043. SEQ ID NO:66 shows the nucleotide sequence of ogl-kdgR from Erwinia carotovora subsp. Atroseptica SCRI1043.

[0239] In certain aspects, a recombinant microorganism, such as E. coli, that is able to grow on pectin or tri-galacturonate as a sole source of carbon and/or energy may comprise one or more genomic regions of Erwinia chrysanthemi, comprising several genes (kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, paeX, ogl, and kdgR) encoding enzymes (kduI, kduD, ogl, pelW, and paeX), transporters (togM, togN, togA, togB, and kdgM), and regulatory proteins (kdgR) responsible for degradation of di- and trigalacturonate, as well as several genes (pelA, pelE, paeY, and pem) encoding pectate lyases (pelA and pelE), pectin acetylesterases (paeY), and pectin methylesterase (pem) (see Example 2).

[0240] Additional examples of isomerases that may be utilized herein include glucuronate isomerases, such as those in the family uxaC, as well as 4-deoxy-L-threo-5-hexylose uronate isomerases, such as those in the family KduI. Additional examples of reductases that may be utilized herein include tagaturonate reductases, such as those in the family uxaB. Additional examples of dehydratases that may be utilized herein include altronate dehydratases, such as those in the family uxaA. Additional examples of dehydrogenases that may be utilized herein include 2-deoxy-D-gluconate 3-dehydrogenases, such as those in the family kduD.

[0241] Certain aspects may also utilize recombinant microorganisms engineered to enhance the efficiency of the KDG degradation pathway. For instance, in bacteria, KDG is a common metabolic intermediate in the degradation of hexuronates such as D-glucuronate and D-galacturonate and enters the Entner-Doudoroff pathway, where it is converted to pyruvate and glyceraldehyde-3-phosphate (G3P). In this pathway, KDG is first phosphorylated by KDG kinase (KdgK), followed by its cleavage into pyruvate and G3P by 2-keto-3-deoxy-D-6-phosphate-gluconate (KDPG) aldolase (KdgA). The expression of these enzymes, concurrently with KDG permease (e.g., KdgT), is negatively regulated by KdgR and is nearly absent at the basal level. The expression is dramatically (3-5-fold) induced upon the addition of hexuronates, and a similar result has been reported in Pseudomonas grown on alginate. Hence, to increase the conversion of KDG to pyruvate and G3P, the negative regulator KdgR may be removed. To further improve the pathway efficiency, exogenous copies of KdgK and KdgA may also be incorporated into a given recombinant microorganism.
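
As an illustrative check on the two KDG degradation steps described above, the following Python sketch verifies that each step is balanced element by element. The molecular formulas are those of the fully protonated, neutral species and are supplied here as assumptions, as is the use of ATP as the phosphate donor for KdgK; none of these details is recited above. The helper names are hypothetical.

```python
from collections import Counter

# Neutral, fully protonated molecular formulas (assumed for this sketch).
FORMULAS = {
    "KDG":      {"C": 6,  "H": 10, "O": 6},
    "KDPG":     {"C": 6,  "H": 11, "O": 9, "P": 1},
    "pyruvate": {"C": 3,  "H": 4,  "O": 3},
    "G3P":      {"C": 3,  "H": 7,  "O": 6, "P": 1},
    "ATP":      {"C": 10, "H": 16, "N": 5, "O": 13, "P": 3},
    "ADP":      {"C": 10, "H": 15, "N": 5, "O": 10, "P": 2},
}


def total_atoms(species) -> Counter:
    """Sum the element counts over a list of species names."""
    total = Counter()
    for name in species:
        total.update(FORMULAS[name])  # adds the counts of each element
    return total


def is_balanced(reactants, products) -> bool:
    """True if both sides contain the same number of each element."""
    return total_atoms(reactants) == total_atoms(products)


# Step 1 (KdgK): KDG + ATP -> KDPG + ADP
kinase_ok = is_balanced(["KDG", "ATP"], ["KDPG", "ADP"])
# Step 2 (KdgA): KDPG -> pyruvate + G3P
aldolase_ok = is_balanced(["KDPG"], ["pyruvate", "G3P"])
```

Both checks pass, whereas cleaving unphosphorylated KDG directly into pyruvate and G3P would not balance, which is consistent with the phosphorylation step preceding the aldolase step.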

[0242] In certain aspects, a recombinant microorganism that is able to grow on a polysaccharide (e.g., alginate, pectin, etc.) as a sole source of carbon may be capable of producing an increased amount of a given commodity chemical (e.g., ethanol) while growing on that polysaccharide. For example, E. coli engineered to grow on alginate may be engineered to produce an increased amount of ethanol from alginate as compared to E. coli that is not engineered to grow on alginate (see Example 11). Thus, certain aspects include a recombinant microorganism that is capable of growing on alginate or pectin as a sole source of carbon, and that is capable of producing an increased amount of ethanol, such as by comprising one or more genes encoding and expressing a pyruvate decarboxylase (pdc) and/or an alcohol dehydrogenase, including functional variants thereof. In certain aspects, such a recombinant microorganism may comprise a pyruvate decarboxylase (pdc) and two alcohol dehydrogenases (adhA and adhB) obtained from Zymomonas mobilis.

[0243] Embodiments of the present invention also include methods for converting polysaccharide to a suitable monosaccharide comprising, (a) obtaining a polysaccharide; (b) contacting the polysaccharide with a chemical catalysis or enzymatic pathway, thereby converting the polysaccharide to a first monosaccharide or oligosaccharide; and (c) contacting the first monosaccharide with a microbial system for a time sufficient to convert the first monosaccharide or oligosaccharide to the suitable monosaccharide, wherein the microbial system comprises, (i) at least one gene encoding and expressing an enzyme selected from a monosaccharide transporter, a disaccharide transporter, a trisaccharide transporter, an oligosaccharide transporter, and a polysaccharide transporter; and (ii) at least one gene encoding and expressing an enzyme selected from a monosaccharide dehydrogenase, an isomerase, a dehydratase, a kinase, and an aldolase, thereby converting the polysaccharide to a suitable monosaccharide.

[0244] In certain aspects of the present invention, aquatic or marine-biomass polysaccharides such as alginate may be chemically degraded using chemical catalysts such as acids. Similarly, biomass-derived pectin may be chemically degraded. For instance, the reaction catalyzed by chemical catalysts is typically through hydrolysis, as opposed to the .beta.-elimination type of reactions catalyzed by enzymatic catalysts. Thus, certain embodiments may include boiling alginate or pectin with strong mineral acids to liberate carbon dioxide from D-mannuronate, thereby forming D-lyxose, a common sugar metabolite utilized by many microorganisms. Such embodiments may use, for example, formate, hydrochloric acid, sulfuric acid, in addition to other suitable acids known in the art as chemical catalysts.

[0245] An enzymatic pathway may utilize one or more enzymes described herein that are capable of catalyzing the degradation of polysaccharides, such as alginate or pectin.

[0246] Other embodiments may use variations of chemical catalysis similar to those described herein or known to a person skilled in the art, including improved or redesigned methods of chemical catalysis suitable for use with biomass related polysaccharides. Certain embodiments include those wherein the resulting monosaccharide uronate is D-mannuronate.

[0247] As noted above, the suitable monosaccharides or suitable oligosaccharides produced by the recombinant microorganisms and microbial systems of the present invention may be utilized as a feedstock in the production of commodity chemicals, such as biofuels, as well as commodity chemical intermediates. Thus, certain embodiments of the present invention relate generally to methods for converting a suitable monosaccharide or oligosaccharide to a commodity chemical, such as a biofuel, comprising, (a) obtaining a suitable monosaccharide or oligosaccharide; and (b) contacting the suitable monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide or oligosaccharide to the biofuel, thereby converting the suitable monosaccharide or oligosaccharide to the biofuel.

[0248] Certain aspects include methods for converting a suitable monosaccharide to a first commodity chemical such as a biofuel, comprising, (a) obtaining a suitable monosaccharide; and (b) contacting the suitable monosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide to the first commodity chemical, wherein the microbial system comprises one or more genes encoding an aldehyde or ketone biosynthesis pathway, thereby converting the suitable monosaccharide to the first commodity chemical.

[0249] In these and other related aspects, depending on the particular ketone or aldehyde biosynthesis pathway employed, the first commodity chemical may be further enzymatically and/or chemically reduced and dehydrated to a second commodity chemical. Examples of such second commodity chemicals include, but are not limited to, butene or butane; 1-phenylbutene or 1-phenylbutane; pentene or pentane; 2-methylpentene or 2-methylpentane; 1-phenylpentene or 1-phenylpentane; 1-phenyl-4-methylpentene or 1-phenyl-4-methylpentane; hexene or hexane; 2-methylhexene or 2-methylhexane; 3-methylhexene or 3-methylhexane; 2,5-dimethylhexene or 2,5-dimethylhexane; 1-phenylhexene or 1-phenylhexane; 1-phenyl-4-methylhexene or 1-phenyl-4-methylhexane; 1-phenyl-5-methylhexene or 1-phenyl-5-methylhexane; heptene or heptane; 2-methylheptene or 2-methylheptane; 3-methylheptene or 3-methylheptane; 2,6-dimethylheptene or 2,6-dimethylheptane; 3,6-dimethylheptene or 3,6-dimethylheptane; 3-methyloctene or 3-methyloctane; 2-methyloctene or 2-methyloctane; 2,6-dimethyloctene or 2,6-dimethyloctane; 2,7-dimethyloctene or 2,7-dimethyloctane; 3,6-dimethyloctene or 3,6-dimethyloctane; and cyclopentane or cyclopentene.

[0250] Certain embodiments of the present invention may also include methods for converting a suitable monosaccharide or oligosaccharide to a commodity chemical comprising (a) obtaining a suitable monosaccharide or oligosaccharide; and (b) contacting the suitable monosaccharide or oligosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide or oligosaccharide to the commodity chemical, wherein the microbial system comprises: (i) one or more genes encoding a biosynthesis pathway; (ii) one or more genes encoding and expressing a C--C ligation pathway; and (iii) one or more genes encoding and expressing a reduction and dehydration pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase, thereby converting the suitable monosaccharide or oligosaccharide to the commodity chemical.

[0251] Certain aspects also include recombinant microorganisms that comprise (i) one or more genes encoding a biosynthesis pathway; (ii) one or more genes encoding and expressing a C--C ligation pathway; and (iii) one or more genes encoding and expressing a reduction and dehydration pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase. Certain aspects also include recombinant microorganisms that comprise the above pathways individually or in certain combinations, such as a recombinant microorganism that comprises one or more genes encoding a biosynthesis pathway, as described herein. Certain aspects may also include recombinant microorganisms that comprise one or more genes encoding and expressing a C--C ligation pathway, as described herein. Certain aspects may also include recombinant microorganisms that comprise one or more genes encoding and expressing a reduction and dehydration pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase, as described herein.

[0252] As for recombinant microorganisms that comprise combinations of the above-noted pathways, certain aspects may include recombinant microorganisms that comprise (i) one or more genes encoding a biosynthesis pathway; and (ii) one or more genes encoding and expressing a C--C ligation pathway. Certain aspects may also include recombinant microorganisms that comprise (i) one or more genes encoding and expressing a C--C ligation pathway; and (ii) one or more genes encoding and expressing a reduction and dehydration pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase.

[0253] Certain aspects may also include recombinant microorganisms that comprise one or more individual components of a dehydration and reduction pathway, such as a recombinant microorganism that comprises a diol dehydrogenase, a diol dehydratase, or a secondary alcohol dehydrogenase. These and other microorganisms may be utilized, for example, to convert a suitable polysaccharide to a first commodity chemical, or an intermediate thereof, or to convert a first commodity chemical, or an intermediate thereof, to a second commodity chemical.

[0254] Merely by way of illustration, a recombinant microorganism comprising a C--C ligation pathway may be utilized to convert butanal into a first commodity chemical, or an intermediate thereof, such as 5-hydroxy-4-octanone, which can then be converted into a second commodity chemical, or intermediate thereof, by any suitable pathway. As a further example, a recombinant microorganism comprising a C--C ligation pathway and a diol dehydrogenase may be utilized for the sequential conversion of butanal into 5-hydroxy-4-octanone and then 4,5-octanediol. Examples of recombinant microorganisms that comprise these and other various combinations of the individual pathways described herein, as well as various combinations of the individual components of those pathways, will be apparent to those skilled in the art, and may also be found in the Examples.

[0255] Also included are methods of converting a polysaccharide to a first commodity chemical, or an intermediate thereof, such as by utilizing a recombinant microorganism that comprises an aldehyde or ketone biosynthesis pathway. Also included are methods of converting a first commodity chemical, or intermediate thereof, to a second commodity chemical, such as by utilizing a recombinant microorganism that optionally comprises a biosynthesis pathway, optionally comprises a C--C ligation pathway, and/or optionally comprises one or more of the individual components of a dehydration and reduction pathway. Merely by way of illustration, a recombinant microorganism comprising an exogenous C--C ligase (e.g., benzaldehyde lyase from Pseudomonas fluorescens) could be utilized in a method to convert a first commodity chemical such as 3-methylbutanal to a second commodity chemical such as 2,7-dimethyl-5-hydroxy-4-octanone. Along this line of illustration, the same or different recombinant microorganism comprising a diol dehydrogenase could be utilized in a method to convert 2,7-dimethyl-5-hydroxy-4-octanone to another commodity chemical such as 2,7-dimethyl-4,5-octanediol (see Table 2 for other examples). As an additional illustrative example, a recombinant microorganism comprising an exogenous secondary alcohol dehydrogenase could be utilized in a method to convert a first commodity chemical such as 2,7-dimethyl-4-octanone to a second commodity chemical such as 2,7-dimethyloctanol.

[0256] Embodiments of a microbial system or isolated microorganism of the present application may include a naturally-occurring biosynthesis pathway, and/or an engineered, reconstructed, or re-designed biosynthesis pathway that has been optimized for improved functionality.

[0257] Embodiments of a microbial system or recombinant microorganism of the present invention may include a natural or reconstructed biosynthesis pathway, such as a butyraldehyde biosynthesis pathway, as found in such microorganisms as Clostridium acetobutylicum and Streptomyces coelicolor. In explanation, butyrate and butanol are common fermentation products of certain bacterial species such as Clostridia, in which the production of butyrate and butanol is mediated by a synthetic thiolase-dependent pathway characteristically similar to the fatty acid degradation pathway. Such pathways may be initiated with the condensation of two molecules of acetyl-CoA to acetoacetyl-CoA, which is catalyzed by thiolase. Acetoacetyl-CoA is then reduced to .beta.-hydroxy butyryl-CoA, which is catalyzed by NAD(P)H dependent .beta.-hydroxy butyryl-CoA dehydrogenase (HBDH). Crotonase catalyzes dehydration of .beta.-hydroxy butyryl-CoA to form crotonyl-CoA. Further reduction catalyzed by NADH-dependent butyryl-CoA dehydrogenase (BCDH) saturates the double bond at C2 of crotonyl-CoA to form butyryl-CoA.

[0258] In certain embodiments, thiolase, the first enzyme in this pathway, may be overexpressed to maximize production. In certain embodiments, thiolase may be over-expressed in E. coli. In this regard, all three enzymes (e.g., HBDH, crotonase, and BCDH) catalyzing the following reaction steps are found in Clostridium acetobutylicum ATCC824. In certain embodiments, HBDH, crotonase, and BCDH may be expressed or over-expressed in a suitable microorganism such as E. coli. Alternatively, a short-chain aliphatic acyl-CoA dehydrogenase derived from Pseudomonas putida KT2440 may be utilized in other embodiments of a microbial system or isolated microorganism of the present application.

[0259] Further to this end, butyryl-CoA in Clostridia may be readily converted to butanol and/or butyrate by at least a few different pathways. In one pathway, butyryl-CoA is directly reduced to butyraldehyde, catalyzed by NADH dependent CoA-acylating aldehyde dehydrogenase (ALDH). Butyraldehyde may be further reduced to butanol by NADH-dependent butanol dehydrogenase. Although CoA-acylating ALDH catalyzes the one step reduction of butyryl-CoA to butyraldehyde, the incorporation of CoA-acylating ALDH into the microbial system may result in acetoaldehyde formation because of its promiscuous acetyl-CoA deacylating activity. In certain embodiments, the formation of acetoaldehyde may be minimized by functionally redesigning the relevant enzyme(s).
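
The thiolase-dependent route described in the preceding paragraphs can be tallied for its demand on reducing equivalents. The Python sketch below is a simplified illustration that assumes each reduction step consumes exactly one NAD(P)H per molecule converted; that stoichiometry is an assumption of the sketch and is not recited above. The function and variable names are hypothetical.

```python
# (substrate, product, enzyme, NAD(P)H consumed) for each step, in order.
# The per-step NAD(P)H count of 1 for reductions is an assumption.
STEPS = [
    ("2 acetyl-CoA", "acetoacetyl-CoA", "thiolase", 0),
    ("acetoacetyl-CoA", "beta-hydroxybutyryl-CoA", "HBDH", 1),
    ("beta-hydroxybutyryl-CoA", "crotonyl-CoA", "crotonase", 0),
    ("crotonyl-CoA", "butyryl-CoA", "BCDH", 1),
    ("butyryl-CoA", "butyraldehyde", "CoA-acylating ALDH", 1),
    ("butyraldehyde", "butanol", "butanol dehydrogenase", 1),
]


def reducing_equivalents(target: str) -> int:
    """NAD(P)H consumed from two acetyl-CoA up to and including `target`."""
    total = 0
    for _substrate, product, _enzyme, nadph in STEPS:
        total += nadph
        if product == target:
            return total
    raise KeyError(f"{target!r} is not a product in this pathway sketch")
```

Under this assumption, stopping at butyraldehyde rather than butanol saves one reducing equivalent per molecule, which is one reason an aldehyde may be an attractive end point for the biosynthesis pathway.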

[0260] Butyryl-CoA in other biosynthesis pathways is deacylated to form butyryl phosphate, catalyzed by phosphotransbutyrylase. Butyryl phosphate is then hydrolyzed by reversible butyryl phosphate kinase to form butyrate. This reaction is coupled with ATP generation from ADP. The butyrate formation through these enzymes is known to be significantly more specific. Certain embodiments may incorporate phosphotransbutyrylase and butyryl phosphate kinase into the microbial system. In other embodiments, butyrate may be directly formed from butyryl-CoA by short chain acyl-CoA thioesterase.

[0261] Butyrate in Clostridia may also be sequentially reduced to butanol, which is catalyzed by a single alcohol/aldehyde dehydrogenase. Certain embodiments may comprise short chain aldehyde dehydrogenase from other bacteria such as Pseudomonas putida to complement the production of butyraldehyde in the microbial system. One potential concern in using short chain aldehyde dehydrogenase involves the possible formation of acetoaldehyde from acetate. Certain embodiments may be directed to minimizing the acetate formation in the microbial system, for example, by deleting several genes encoding enzymes involved in the acetate production.

[0262] Moreover, there are multiple routes in E. coli to form acetate, one of which is mediated by pyruvate oxidase (POXB) from pyruvate, whereas another is mediated by phosphotransacetylase (PTA) and acetyl phosphate kinase (ACKA) from acetyl-CoA. Acetate production by E. coli mutant strains carrying poxB.sup.-, pta.sup.-, and acka.sup.- is significantly diminished. In addition, incorporation of acetyl-CoA synthase (ACS), which catalyzes acetyl-CoA formation from acetate, is also known to significantly reduce the accumulation of acetate. Certain embodiments may comprise a microbial system or isolated microorganism with deleted POXB, PTA, and/or ACKA genes, and other embodiments may also comprise, separately or together with the deleted genes, one or more genes encoding and expressing ACS.
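
The effect of the gene deletions described above on the two acetate-forming routes can be sketched as a simple lookup. The Python example below is a deliberately coarse illustration: it treats a route as closed whenever any one of its genes is deleted, and it ignores residual or alternative activities, which a real strain may retain. The route labels and function name are hypothetical; the gene names follow the mutant genotypes given above.

```python
# Genes required by each acetate-forming route, per the description above.
ACETATE_ROUTES = {
    "pyruvate -> acetate": {"poxB"},
    "acetyl-CoA -> acetyl phosphate -> acetate": {"pta", "acka"},
}


def open_routes(deleted_genes) -> list:
    """Routes to acetate that remain when the given genes are deleted.

    Simplifying assumption: deleting any gene on a route closes that route.
    """
    deleted = set(deleted_genes)
    return [
        route
        for route, genes in ACETATE_ROUTES.items()
        if not (genes & deleted)  # route survives only if no gene is hit
    ]
```

For instance, `open_routes(["pta"])` leaves only the pyruvate route, while deleting one gene from each route closes both in this simplified model.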

[0263] A microbial system or recombinant microorganism provided herein may also comprise a glutaraldehyde biosynthesis pathway. As one example, Saccharomyces cerevisiae has a lysine biosynthetic pathway in which acetyl-CoA is initially condensed to .alpha.-ketoglutarate, a common metabolite in the citric acid cycle, to form homocitrate. This reaction is catalyzed by homocitrate synthase derived from yeast, Thermus thermophilus, or Deinococcus radiodurans. Homoaconitase derived from yeast, Thermus thermophilus, or Deinococcus radiodurans catalyzes the conversion between homocitrate and homoisocitrate. Homoisocitrate is then oxidatively decarboxylated to form 2-ketoadipate, which is catalyzed by homoisocitrate dehydrogenase derived from yeast, Thermus thermophilus, or Deinococcus radiodurans. Homoisocitrate is also oxidatively decarboxylated to form glutaryl-CoA, which may be catalyzed by homoisocitrate dehydrogenase. Thus, certain embodiments may comprise a homocitrate synthase, a homoaconitase, and/or a homoisocitrate dehydrogenase.

[0264] Further to this end, in synthesizing 2-keto-adipicsemialdehyde, 2-ketoadipate is reduced to 2-keto-adipicsemialdehyde. This reaction can be catalyzed by dialdehyde dehydrogenase, which, for example, may be isolated from Agrobacterium tumefaciens C58. Thus, certain embodiments may incorporate dialdehyde dehydrogenases into a microbial system or recombinant microorganism.

[0265] In synthesizing glutaraldehyde, Acyl-CoA thioesterases (ACOT) may also catalyze the hydrolysis of glutaryl-CoA. The genes encoding .omega.-carboxylic acyl-CoA specific peroxisomal ACOTs are found in many mammalian species; both ACOT4 and ACOT8 derived from mice have been previously expressed in E. coli, and both enzymes were shown to be highly active in the hydrolysis of glutaryl-CoA to form glutarate. Certain embodiments may comprise one or more Acyl-CoA thioesterases.

[0266] Glutarate is sequentially reduced to glutaraldehyde. This reaction can be catalyzed by glutaraldehyde dehydrogenase (CpnE), which, for example, may be isolated from Comamonas sp. Strain NCIMB 9872. Certain embodiments may incorporate glutaraldehyde dehydrogenases such as CpnE into a microbial system or isolated microorganism. Other embodiments may comprise both ACOT and CpnE enzymes. Other embodiments may comprise CpnE enzymes redesigned to catalyze the reduction of 1-hydroxy propanoate and succinate to 1-hydroxy propanal and succinicaldehyde.

[0267] In certain aspects, the biosynthesis pathway may include an aldehyde biosynthesis pathway, a ketone biosynthesis pathway, or both. In certain aspects, the biosynthesis pathway may include one or more of an acetaldehyde, propionaldehyde, butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, 4-methylpentaldehyde, phenylacetaldehyde, 2-phenyl acetaldehyde, 2-(4-hydroxyphenyl)acetaldehyde, 2-indole-3-acetaldehyde, glutaraldehyde, 5-amino-pentaldehyde, succinate semialdehyde, and/or succinate 4-hydroxyphenyl acetaldehyde biosynthesis pathway, including various combinations thereof.

[0268] With regard to combinations of biosynthesis pathways, a biosynthesis pathway may comprise an acetaldehyde biosynthesis pathway in combination with at least one of a propionaldehyde, butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise a propionaldehyde biosynthesis pathway in combination with at least one of a butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise a butyraldehyde biosynthesis pathway in combination with at least one of an isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise an isobutyraldehyde biosynthesis pathway in combination with at least one of a 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise a 2-methyl-butyraldehyde biosynthesis pathway in combination with at least one of a 3-methyl-butyraldehyde or a phenylacetaldehyde biosynthesis pathway. In certain aspects, a biosynthesis pathway may comprise a 3-methyl-butyraldehyde biosynthesis pathway in combination with a phenylacetaldehyde biosynthesis pathway.
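
Merely by way of illustration, the combinations recited in this paragraph can be enumerated programmatically. The following Python sketch is not part of the disclosed methods; it assumes that the seven aldehyde pathways named above are the only candidates and that any grouping of two or more of them is a permitted combination. The names ALDEHYDE_PATHWAYS and pathway_combinations are hypothetical and introduced only for this example.

```python
from itertools import combinations

# The seven aldehyde biosynthesis pathways named in the paragraph above,
# in the order in which they are recited (illustrative list only).
ALDEHYDE_PATHWAYS = [
    "acetaldehyde",
    "propionaldehyde",
    "butyraldehyde",
    "isobutyraldehyde",
    "2-methyl-butyraldehyde",
    "3-methyl-butyraldehyde",
    "phenylacetaldehyde",
]


def pathway_combinations(pathways, min_size=2):
    """Yield every combination containing at least `min_size` pathways."""
    for size in range(min_size, len(pathways) + 1):
        yield from combinations(pathways, size)


# Pairwise combinations, e.g. ("acetaldehyde", "propionaldehyde").
pairs = list(combinations(ALDEHYDE_PATHWAYS, 2))

# All combinations of two or more pathways.
all_combos = list(pathway_combinations(ALDEHYDE_PATHWAYS))

print(len(pairs), len(all_combos))  # prints "21 120"
```

Under these assumptions there are 21 pairwise combinations and 120 combinations of two or more pathways, each of which falls under the recitation that begins with its first-listed member.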

[0269] In certain aspects, a propionaldehyde biosynthesis pathway may comprise a threonine deaminase (ilvA) gene from an organism such as Escherichia coli and a keto-isovalerate decarboxylase (kivd) gene from an organism such as Lactococcus lactis, and/or functional variants of these enzymes, including homologs or orthologs thereof, as well as optimized variants. These enzymes may be utilized generally to convert L-threonine to propionaldehyde.

[0270] In certain aspects, a butyraldehyde biosynthesis pathway may comprise at least one of a thiolase (atoB) gene from an organism such as E. coli, a .beta.-hydroxy butyryl-CoA dehydrogenase (hbd) gene, a crotonase (crt) gene, a butyryl-CoA dehydrogenase (bcd) gene, an electron transfer flavoprotein A (etfA) gene, and/or an electron transfer flavoprotein B (etfB) gene from an organism such as Clostridium acetobutylicum (e.g., ATCC 824), as well as a coenzyme A-linked butyraldehyde dehydrogenase (ald) gene from an organism such as Clostridium beijerinckii acetobutylicum ATCC 824. In certain aspects, a coenzyme A-linked alcohol dehydrogenase (adhE2) gene from an organism such as Clostridium acetobutylicum ATCC 824 may be used as an alternative to an ald gene.

[0271] In certain aspects, an isobutyraldehyde biosynthetic pathway may comprise an acetolactate synthase (alsS) from an organism such as Bacillus subtilis or an als gene from an organism such as Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage may be optimized for E. coli protein expression). Such a pathway may also comprise acetolactate reductoisomerase (ilvC) and/or 2,3-dihydroxyisovalerate dehydratase (ilvD) genes from an organism such as E. coli, as well as a keto-isovalerate decarboxylase (kivd) gene from an organism such as Lactococcus lactis.

[0272] In certain aspects, a 3-methylbutyraldehyde and 2-methylbutyraldehyde biosynthesis pathway may comprise an acetolactate synthase (alsS) gene from an organism such as Bacillus subtilis or an (als) gene from an organism such as Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage may be optimized for E. coli protein expression). Certain aspects of such a pathway may also comprise acetolactate reductoisomerase (ilvC), 2,3-dihydroxyisovalerate dehydratase (ilvD), isopropylmalate synthase (LeuA), isopropylmalate isomerase (LeuC and LeuD), and 3-isopropylmalate dehydrogenase (LeuB) genes from an organism such as E. coli, as well as a keto-isovalerate decarboxylase (kivd) from an organism such as Lactococcus lactis.

[0273] In certain aspects, a phenylacetaldehyde and 4-hydroxyphenylacetaldehyde biosynthesis pathway may comprise one or more of 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH), 3-dehydroquinate synthase (aroB), 3-dehydroquinate dehydratase (aroD), dehydroshikimate reductase (aroE), shikimate kinase II (aroL), shikimate kinase I (aroK), 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), chorismate synthase (aroC), fused chorismate mutase P/prephenate dehydratase (pheA), and/or fused chorismate mutase T/prephenate dehydrogenase (tyrA) genes from an organism such as E. coli, as well as a keto-isovalerate decarboxylase (kivd) from an organism such as Lactococcus lactis.

[0274] In certain aspects, such as for the ultimate production of 1,10-diamino-5-decanol and 1,10-dicarboxylic-5-decanol, a biosynthesis pathway may comprise one or more homocitrate synthase, homoaconitate hydratase, and/or homoisocitrate dehydrogenase genes from an organism such as Deinococcus radiodurans and/or Thermus thermophilus, as well as a keto-adipate decarboxylase gene, a 2-aminoadipate transaminase gene, and an L-2-aminoadipate-6-semialdehyde:NAD+ 6-oxidoreductase gene. Such a biosynthesis pathway would be able to convert .alpha.-ketoglutarate to 5-aminopentaldehyde.

[0275] In certain aspects, such as for one step in cyclopentanol production, an .alpha.-ketoadipate semialdehyde biosynthesis pathway may comprise homocitrate synthase (hcs), homoaconitate hydratase, and homoisocitrate dehydrogenase genes from an organism such as Deinococcus radiodurans and/or Thermus thermophilus, and an .alpha.-ketoadipate semialdehyde dehydrogenase gene. Such a biosynthesis pathway would be able to convert acetyl-CoA and .alpha.-ketoglutarate to .alpha.-ketoadipate semialdehyde.

[0276] For the production of certain commodity chemicals, such as 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol, among other similar chemicals, a biosynthesis pathway (e.g., aldehyde biosynthesis pathway) may optionally or further comprise one or more genes encoding a decarboxylase enzyme, such as an indole-3-pyruvate decarboxylase (IPDC). An IPDC may be obtained, for example, from such microorganisms as Azospirillum brasilense and Paenibacillus polymyxa E681. In this regard, an IPDC may be utilized to more efficiently catalyze the decarboxylation of various carboxylic acids to form the corresponding aldehyde, which can be further converted to a commodity chemical by a reductase or dehydrogenase, as detailed herein.

[0277] In certain aspects, a 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and 2-(indole-3-)ethanol biosynthesis pathway may comprise transketolase (tktA), 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH), 3-dehydroquinate synthase (aroB), 3-dehydroquinate dehydratase (aroD), dehydroshikimate reductase (aroE), shikimate kinase II (aroL), shikimate kinase I (aroK), 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), chorismate synthase (aroC), fused chorismate mutase P/prephenate dehydratase (pheA), and fused chorismate mutase T/prephenate dehydrogenase (tyrA) genes from E. coli, a keto-isovalerate decarboxylase (kivd) from Lactococcus lactis, an alcohol dehydrogenase (adh2) from Saccharomyces cerevisiae, an indole-3-pyruvate decarboxylase (ipdc) from Azospirillum brasilense, a phenylethanol reductase (par) from Rhodococcus sp. ST-10, and a benzaldehyde lyase (bal) from Pseudomonas fluorescens.

[0278] As for all other pathways described herein, the components for each of the biosynthesis pathways described herein may be present in a recombinant microorganism either endogenously or exogenously. To improve the efficiency of a given biosynthesis pathway, endogenous genes, for example, may be up-regulated or over-expressed, such as by introducing an additional (i.e., exogenous) copy of that endogenous gene into the recombinant microorganism. Such pathways may also be optimized by altering the endogenous version of a gene via mutagenesis to improve functionality, followed by introduction of the altered gene into the microorganism. The expression of endogenous genes may be up- or down-regulated, or even eliminated, according to techniques known in the art and described herein. Similarly, the expression levels of exogenously provided genes may be regulated as desired, such as by using various constitutive or inducible promoters. Such genes may also be "codon-optimized," as described herein and known in the art. Also included are functional naturally-occurring variants of the genes and enzymes described herein, including homologs or orthologs thereof.

[0279] Certain embodiments of a microbial system or isolated microorganism may comprise a C--C ligation pathway. In certain aspects, a C--C ligation pathway may comprise a ThDP-dependent enzyme, such as a C--C ligase, or an optimized C--C ligase. For example, eight-carbon unit molecules (butyroins) may be made by condensing together two four-carbon unit molecules (butyraldehydes). ThDP-dependent enzymes are a group of enzymes known to catalyze both the breaking and the formation of C--C bonds and have been utilized as catalysts in chemoenzymatic syntheses. The spectrum of chemical reactions that these enzymes catalyze ranges from the decarboxylation of .alpha.-keto acids, oxidative decarboxylation, and carboligation to the cleavage of C--C bonds.

[0280] To provide a few examples, benzaldehyde lyase (BAL) from Pseudomonas fluorescens, benzoylformate decarboxylase (BFD) from Pseudomonas putida, and pyruvate decarboxylase (PDC) from Zymomonas mobilis may catalyze a carboligation reaction between two aldehydes. Among these three enzymes, BAL accepts the broadest spectrum of aldehydes as substrates, ranging from substituted benzaldehyde to acetaldehyde, among others, as shown herein. BAL catalyzes a stereospecific carboligation reaction between two aldehydes and forms .alpha.-hydroxy ketones with over 99% ee for the R-configuration. The benzoin formation from two benzaldehyde molecules is a favored reaction catalyzed by BAL and proceeds as fast as 320 mmol (benzoin) mg (protein).sup.-1 min.sup.-1. The formation of .alpha.-hydroxy ketone may be carried out using many different aldehydes, including butyraldehyde.
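
By way of illustration only, the enantiomeric excess (ee) figure quoted above can be related to the proportions of the two enantiomers. The sketch below uses the conventional definition of ee as the difference between the amounts of the two enantiomers divided by their sum; the function names are hypothetical and the input amounts are example values, not measurements reported herein.

```python
def enantiomeric_excess(r_amount: float, s_amount: float) -> float:
    """Return the % ee in favor of the R enantiomer (negative if S dominates)."""
    total = r_amount + s_amount
    if total <= 0:
        raise ValueError("total amount of product must be positive")
    return 100.0 * (r_amount - s_amount) / total


def major_fraction_from_ee(ee_percent: float) -> float:
    """Return the fraction of the major enantiomer implied by a given % ee."""
    return (100.0 + ee_percent) / 200.0


# Example values only: 99.5 parts R-product and 0.5 parts S-product.
print(enantiomeric_excess(99.5, 0.5))   # 99.0
print(major_fraction_from_ee(99.0))     # 0.995
```

On this definition, a product formed with 99% ee for the R-configuration contains about 99.5% of the R enantiomer and about 0.5% of the S enantiomer.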

[0281] BFD and PDC may also catalyze the carboligation reactions between two aldehyde molecules. BFD and PDC accept relatively larger and smaller aldehyde molecules, respectively. In the presence of benzaldehyde and acetaldehyde, BFD catalyzes the formation of benzoin and (S)-.alpha.-hydroxy phenylpropanone (2S-HPP), whereas PDC catalyzes the formation of (R)-.alpha.-hydroxy phenylpropanone (2R-HPP) and (R)-.alpha.-hydroxy 2-butanone (acetoin). As detailed below, certain microbial systems or isolated microorganisms of the present application may comprise natural or optimized C--C ligases (ThDP-dependent enzymes) selected from benzaldehyde lyase (BAL) from Pseudomonas fluorescens, benzoylformate decarboxylase (BFD) from Pseudomonas putida, and pyruvate decarboxylase (PDC) from Zymomonas mobilis. Other embodiments may comprise a benzaldehyde lyase (BAL) from Pseudomonas fluorescens (see SEQ ID NOS:143-144, showing the nucleotide and polypeptide sequences, respectively) including biologically active variants thereof, such as optimized variants.

[0282] A C--C ligation pathway of the present invention typically comprises one or more C--C ligases, such as a lyase enzyme. Exemplary lyases include, but are not limited to, acetaldehyde lyases, propionaldehyde lyases, butyraldehyde lyases, isobutyraldehyde lyases, 2-methyl-butyraldehyde lyases, 3-methyl-butyraldehyde (isovaleraldehyde) lyases, phenylacetaldehyde lyases, .alpha.-keto adipate carboxylyases, pentaldehyde lyases, 4-methyl-pentaldehyde lyases, hexaldehyde lyases, heptaldehyde lyases, octaldehyde lyases, 4-hydroxyphenylacetaldehyde lyases, indoleacetaldehyde lyases, and indolephenylacetaldehyde lyases. In certain aspects, a selected C--C ligase or lyase enzyme may have one or more of the above exemplified lyase activities, such as an acetaldehyde lyase activity, a propionaldehyde lyase activity, a butyraldehyde lyase activity, and/or an isobutyraldehyde lyase activity, among others.

[0283] As noted above, a C--C ligase may comprise a benzaldehyde lyase, such as a benzaldehyde lyase isolated from Pseudomonas fluorescens (SEQ ID NOS:143-144), as well as biologically active fragments or variants of this reference sequence, such as optimized variants of a benzaldehyde lyase. In this regard, certain aspects may comprise nucleotide sequences or polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity to SEQ ID NOS:143-144, and which are capable of catalyzing a carboligation reaction, or which possess C--C lyase activity, as described herein. In certain aspects, a BAL enzyme will comprise one or more conserved amino acid residues, including G27, E50, A57, G155, P162, P234, D271, G277, G422, G447, D448, and/or G512.
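
Merely as an illustrative sketch, a sequence identity threshold and a conserved-residue requirement of the kind recited above could be checked as follows. The code assumes that the two sequences have already been aligned by a separate tool and are supplied as equal-length strings with "-" marking gaps. It counts identity over all aligned columns other than gap-gap columns, which is only one of several conventions in use. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NOS:143-144.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, skipping gap-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1  # a gap opposite a residue counts as a mismatch
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def has_conserved_residues(sequence: str, residues: list[str]) -> bool:
    """Check residues given as letter plus 1-based position, e.g. 'G27'."""
    for residue in residues:
        letter, position = residue[0], int(residue[1:])
        if position > len(sequence) or sequence[position - 1] != letter:
            return False
    return True


# Toy sequences for demonstration only.
identity = percent_identity("MGEA-K", "MGEAWR")
print(round(identity, 1))                          # 66.7
print(identity >= 80.0)                            # False
print(has_conserved_residues("MGEAWR", ["G2", "E3"]))  # True
```

A candidate variant would then be screened by comparing the computed identity against the chosen threshold (for example, 80% or 95%) and by confirming that the listed conserved positions are retained.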

[0284] Pseudomonas fluorescens is able to grow on R-benzoin as the sole carbon and energy source because it harbours the enzyme benzaldehyde lyase that cleaves the acyloin linkage using thiamine diphosphate (ThDP) as a cofactor. In the reverse reaction, as utilized herein, benzaldehyde lyase catalyses the carboligation of two aldehydes with high substrate and stereospecificity. Structure-based comparisons with other proteins show that benzaldehyde lyase belongs to a group of closely related ThDP-dependent enzymes. The ThDP cofactors of these enzymes are fixed at their two ends in separate domains, suspending a comparatively mobile thiazolium ring between them. While the residues binding the two ends of ThDP are well conserved, the lining of the active centre pocket around the thiazolium moiety varies greatly within the group. The active sites for BAL have been described, for example, in Kneen et al. (Biochimica et Biophysica Acta 1753:263-271, 2005) and Brandt et al. (Biochemistry 47:7734-43, 2008). Benzaldehyde lyase derived from Pseudomonas fluorescens has been demonstrated herein to have at least an acetaldehyde lyase activity, a propionaldehyde lyase activity, a butyraldehyde lyase activity, a 3-methyl-butyraldehyde lyase activity, a pentaldehyde lyase activity, a 4-methylpentaldehyde lyase activity, a hexaldehyde lyase activity, a phenylacetaldehyde lyase activity, and an octaldehyde lyase activity (see Table 2), among other in vivo lyase activities (see FIGS. 48-55).

[0285] In certain aspects, a C--C ligase, such as BAL derived from Pseudomonas fluorescens, BFD derived from Pseudomonas putida, or PDC derived from Zymomonas mobilis may comprise a lyase with a combination of lyase activities, such as a lyase having both a propionaldehyde lyase activity and a 3-methyl-butyraldehyde lyase activity, among other combinations and activities, such as those exemplary combinations detailed herein. Merely by way of illustration, a lyase having a combination of lyase activities may be referred to herein as a propionaldehyde/3-methyl-butyraldehyde lyase.

[0286] A dehydration and reduction pathway, comprising a diol dehydrogenase, a diol dehydratase, and a secondary alcohol dehydrogenase, may be utilized to further convert an aldehyde, ketone, or corresponding alcohol, to a commodity chemical, such as a biofuel.

[0287] To this end, a dehydration and reduction pathway may comprise one or more diol dehydrogenases. A "diol dehydrogenase" refers generally to an enzyme that catalyzes the reversible reduction and oxidation of an .alpha.-hydroxy ketone and/or its corresponding diol. Certain embodiments of a microbial system or isolated microorganism may comprise genes encoding a diol dehydrogenase that specifically catalyzes the reduction of .alpha.-hydroxy-ketones, including, for example, a 4,5-octanediol dehydrogenase. Diol dehydrogenases, such as 4,5-octanediol dehydrogenase, may be isolated from a variety of organisms and incorporated into a microbial system or isolated microorganism. A particular group of alcohol dehydrogenases has a characteristic ability to oxidize various .alpha.-hydroxy alcohols and reduce various .alpha.-hydroxy ketones and .alpha.-keto ketones. As such, the recitation "diol dehydrogenase" may also encompass such alcohol dehydrogenases.

[0288] By way of example regarding diol dehydrogenases from exemplary organisms, glycerol dehydrogenase isolated from Hansenula ofunaensis has broad substrate specificity and is capable of catalyzing the oxidation of various .alpha.-hydroxy alcohols, including 1,2-octanediol, as well as the reduction of various .alpha.-hydroxy ketones and .alpha.-keto ketones, including 3-hydroxy-2-butanone and 3,4-hexanedione, with activity comparable (40-200%) to that on its native substrates, glycerol and dihydroxyacetone, respectively. As one further example, glycerol dehydrogenase discovered in Hansenula polymorpha DI-1 works similarly. In certain embodiments, a microbial system or recombinant microorganism may comprise a glycerol dehydrogenase gene isolated from Hansenula ofunaensis, a glycerol dehydrogenase isolated from Hansenula polymorpha DI-1, and/or a meso-2,3-butanediol dehydrogenase from Klebsiella pneumoniae. In other embodiments, a microbial system or isolated microorganism may comprise a 4,5-octanediol dehydrogenase, among others detailed herein. Diol dehydrogenases may also be obtained from Lactobacillus brevis ATCC 367, Pseudomonas putida KT2440, and Klebsiella pneumoniae MGH 78578, as described herein (see Example 5).

[0289] Exemplary diol dehydrogenases include, but are not limited to, 2,3-butanediol dehydrogenase, 3,4-hexanediol dehydrogenase, 4,5-octanediol dehydrogenase, 5,6-decanediol dehydrogenase, 6,7-dodecanediol dehydrogenase, 7,8-tetradecanediol dehydrogenase, 8,9-hexadecanediol dehydrogenase, 2,5-dimethyl-3,4-hexanediol dehydrogenase, 3,6-dimethyl-4,5-octanediol dehydrogenase, 2,7-dimethyl-4,5-octanediol dehydrogenase, 2,9-dimethyl-5,6-decanediol dehydrogenase, 1,4-diphenyl-2,3-butanediol dehydrogenase, bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1,4-diindole-2,3-butanediol dehydrogenase, 1,2-cyclopentanediol dehydrogenase, 2,3-pentanediol dehydrogenase, 2,3-hexanediol dehydrogenase, 2,3-heptanediol dehydrogenase, 2,3-octanediol dehydrogenase, 2,3-nonanediol dehydrogenase, 4-methyl-2,3-pentanediol dehydrogenase, 4-methyl-2,3-hexanediol dehydrogenase, 5-methyl-2,3-hexanediol dehydrogenase, 6-methyl-2,3-heptanediol dehydrogenase, 1-phenyl-2,3-butanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1-indole-2,3-butanediol dehydrogenase, 3,4-heptanediol dehydrogenase, 3,4-octanediol dehydrogenase, 3,4-nonanediol dehydrogenase, 3,4-decanediol dehydrogenase, 3,4-undecanediol dehydrogenase, 2-methyl-3,4-hexanediol dehydrogenase, 5-methyl-3,4-heptanediol dehydrogenase, 6-methyl-3,4-heptanediol dehydrogenase, 7-methyl-3,4-octanediol dehydrogenase, 1-phenyl-2,3-pentanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydrogenase, 1-indole-2,3-pentanediol dehydrogenase, 4,5-nonanediol dehydrogenase, 4,5-decanediol dehydrogenase, 4,5-undecanediol dehydrogenase, 4,5-dodecanediol dehydrogenase, 2-methyl-3,4-heptanediol dehydrogenase, 3-methyl-4,5-octanediol dehydrogenase, 2-methyl-4,5-octanediol dehydrogenase, 8-methyl-4,5-nonanediol dehydrogenase, 1-phenyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydrogenase, 1-indole-2,3-hexanediol dehydrogenase, 5,6-undecanediol dehydrogenase, 5,6-undecanediol dehydrogenase, 
5,6-tridecanediol dehydrogenase, 2-methyl-3,4-octanediol dehydrogenase, 3-methyl-4,5-nonanediol dehydrogenase, 2-methyl-4,5-nonanediol dehydrogenase, 2-methyl-5,6-decanediol dehydrogenase, 1-phenyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydrogenase, 1-indole-2,3-heptanediol dehydrogenase, 6,7-tridecanediol dehydrogenase, 6,7-tetradecanediol dehydrogenase, 2-methyl-3,4-nonanediol dehydrogenase, 3-methyl-4,5-decanediol dehydrogenase, 2-methyl-4,5-decanediol dehydrogenase, 2-methyl-5,6-undecanediol dehydrogenase, 1-phenyl-2,3-octanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydrogenase, 1-indole-2,3-octanediol dehydrogenase, 7,8-pentadecanediol dehydrogenase, 2-methyl-3,4-decanediol dehydrogenase, 3-methyl-4,5-undecanediol dehydrogenase, 2-methyl-4,5-undecanediol dehydrogenase, 2-methyl-5,6-dodecanediol dehydrogenase, 1-phenyl-2,3-nonanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-nonanediol dehydrogenase, 1-indole-2,3-nonanediol dehydrogenase, 2-methyl-3,4-undecanediol dehydrogenase, 3-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-5,6-tridecanediol dehydrogenase, 1-phenyl-2,3-decanediol dehydrogenase, 1-(4-hydroxyphenyl)-2,3-decanediol dehydrogenase, 1-indole-2,3-decanediol dehydrogenase, 2,5-dimethyl-3,4-heptanediol dehydrogenase, 2,6-dimethyl-3,4-heptanediol dehydrogenase, 2,7-dimethyl-3,4-octanediol dehydrogenase, 1-phenyl-4-methyl-2,3-pentanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydrogenase, 1-indole-4-methyl-2,3-pentanediol dehydrogenase, 2,6-dimethyl-4,5-octanediol dehydrogenase, 3,8-dimethyl-4,5-nonanediol dehydrogenase, 1-phenyl-4-methyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydrogenase, 1-indole-4-methyl-2,3-hexanediol dehydrogenase, 2,8-dimethyl-4,5-nonanediol dehydrogenase, 1-phenyl-5-methyl-2,3-hexanediol dehydrogenase, 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydrogenase, 
1-indole-5-methyl-2,3-hexanediol dehydrogenase, 1-phenyl-6-methyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydrogenase, 1-indole-6-methyl-2,3-heptanediol dehydrogenase, 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydrogenase, 1-indole-4-phenyl-2,3-butanediol dehydrogenase, 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 1,10-diamino-5,6-decanediol dehydrogenase, 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, 2,3-hexanediol-1,6-dicarboxylic acid dehydrogenase, and the like.

[0290] In certain aspects, a selected diol dehydrogenase enzyme may have one or more of the above exemplified diol dehydrogenase activities, such as a 2,3-butanediol dehydrogenase activity, a 3,4-hexanediol dehydrogenase activity, and/or a 4,5-octanediol dehydrogenase activity, among others.

[0291] In certain aspects, a recombinant microorganism may comprise a diol dehydrogenase encoded by a nucleotide reference sequence selected from SEQ ID NO:97, 99, and 101, or an enzyme having a polypeptide sequence selected from SEQ ID NO:98, 100, and 102, including biologically active fragments or variants thereof, such as optimized variants. Certain aspects may also comprise nucleotide sequences or polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity to SEQ ID NOS:97-102.

[0292] Other embodiments may comprise re-designed diol dehydrogenases for reduction of 1-hydroxy propanal, succinicaldehyde, and glutaraldehyde to 1,3-propanediol, 1,4-butanediol, and 1,5-pentanediol, respectively, among others.

[0293] A dehydration and reduction pathway, as described herein, may comprise one or more diol dehydratases. A "diol dehydratase" refers generally to an enzyme that catalyzes the irreversible dehydration of diols. For instance, this enzyme may serve to dehydrate 4,5-octanediol to form 4-octanone. It has been recognized that there are at least two different types of diol dehydratases: one group that depends on coenzyme B12 for catalysis and one that is independent of it. Coenzyme B12 dependent diol dehydratases are known to catalyze a radical mediated dehydration reaction from .alpha.-hydroxy alcohols to aldehydes or ketones. For example, a diol dehydratase from Klebsiella pneumoniae catalyzes the dehydration of glycerol to form .beta.-hydroxypropyl aldehyde; it also accepts 2,3-butanediol as a substrate and catalyzes its dehydration to form 2-butanone.

[0294] As a further example, Clostridium butyricum contains coenzyme B12 independent diol dehydratases. FIG. 46 shows the in vivo biological activity of coenzyme B12 independent diol dehydratase (dhaB1) and activator (dhaB2) isolated from Clostridium butyricum (see Example 9). FIG. 46A shows the in vivo production of 1-propanol from 1,2-propanediol, FIG. 46B shows the in vivo production of 2-butanol from meso-2,3-butanediol, and FIG. 46C shows the in vivo production of cyclopentanone from trans-1,2-cyclopentanediol.

[0295] Thus, certain embodiments of the present invention may comprise optimized or redesigned diol dehydratases that accommodate various substrates, such as 4,5-octanediol, and may include diol dehydratases isolated and/or optimized from Klebsiella pneumoniae and Clostridium butyricum, among other organisms described herein and known in the art.

[0296] Exemplary diol dehydratases include, but are not limited to, 2,3-butanediol dehydratase, 3,4-hexanediol dehydratase, 4,5-octanediol dehydratase, 5,6-decanediol dehydratase, 6,7-dodecanediol dehydratase, 7,8-tetradecanediol dehydratase, 8,9-hexadecanediol dehydratase, 2,5-dimethyl-3,4-hexanediol dehydratase, 3,6-dimethyl-4,5-octanediol dehydratase, 2,7-dimethyl-4,5-octanediol dehydratase, 2,9-dimethyl-5,6-decanediol dehydratase, 1,4-diphenyl-2,3-butanediol dehydratase, bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1,4-diindole-2,3-butanediol dehydratase, 1,2-cyclopentanediol dehydratase, 2,3-pentanediol dehydratase, 2,3-hexanediol dehydratase, 2,3-heptanediol dehydratase, 2,3-octanediol dehydratase, 2,3-nonanediol dehydratase, 4-methyl-2,3-pentanediol dehydratase, 4-methyl-2,3-hexanediol dehydratase, 5-methyl-2,3-hexanediol dehydratase, 6-methyl-2,3-heptanediol dehydratase, 1-phenyl-2,3-butanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1-indole-2,3-butanediol dehydratase, 3,4-heptanediol dehydratase, 3,4-octanediol dehydratase, 3,4-nonanediol dehydratase, 3,4-decanediol dehydratase, 3,4-undecanediol dehydratase, 2-methyl-3,4-hexanediol dehydratase, 5-methyl-3,4-heptanediol dehydratase, 6-methyl-3,4-heptanediol dehydratase, 7-methyl-3,4-octanediol dehydratase, 1-phenyl-2,3-pentanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydratase, 1-indole-2,3-pentanediol dehydratase, 4,5-nonanediol dehydratase, 4,5-decanediol dehydratase, 4,5-undecanediol dehydratase, 4,5-dodecanediol dehydratase, 2-methyl-3,4-heptanediol dehydratase, 3-methyl-4,5-octanediol dehydratase, 2-methyl-4,5-octanediol dehydratase, 8-methyl-4,5-nonanediol dehydratase, 1-phenyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydratase, 1-indole-2,3-hexanediol dehydratase, 5,6-undecanediol dehydratase, 5,6-undecanediol dehydratase, 5,6-tridecanediol dehydratase, 2-methyl-3,4-octanediol dehydratase, 3-methyl-4,5-nonanediol dehydratase, 
2-methyl-4,5-nonanediol dehydratase, 2-methyl-5,6-decanediol dehydratase, 1-phenyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydratase, 1-indole-2,3-heptanediol dehydratase, 6,7-tridecanediol dehydratase, 6,7-tetradecanediol dehydratase, 2-methyl-3,4-nonanediol dehydratase, 3-methyl-4,5-decanediol dehydratase, 2-methyl-4,5-decanediol dehydratase, 2-methyl-5,6-undecanediol dehydratase, 1-phenyl-2,3-octanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydratase, 1-indole-2,3-octanediol dehydratase, 7,8-pentadecanediol dehydratase, 2-methyl-3,4-decanediol dehydratase, 3-methyl-4,5-undecanediol dehydratase, 2-methyl-4,5-undecanediol dehydratase, 2-methyl-5,6-dodecanediol dehydratase, 1-phenyl-2,3-nonanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-nonanediol dehydratase, 1-indole-2,3-nonanediol dehydratase, 2-methyl-3,4-undecanediol dehydratase, 3-methyl-4,5-dodecanediol dehydratase, 2-methyl-4,5-dodecanediol dehydratase, 2-methyl-5,6-tridecanediol dehydratase, 1-phenyl-2,3-decanediol dehydratase, 1-(4-hydroxyphenyl)-2,3-decanediol dehydratase, 1-indole-2,3-decanediol dehydratase, 2,5-dimethyl-3,4-heptanediol dehydratase, 2,6-dimethyl-3,4-heptanediol dehydratase, 2,7-dimethyl-3,4-octanediol dehydratase, 1-phenyl-4-methyl-2,3-pentanediol dehydratase, 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydratase, 1-indole-4-methyl-2,3-pentanediol dehydratase, 2,6-dimethyl-4,5-octanediol dehydratase, 3,8-dimethyl-4,5-nonanediol dehydratase, 1-phenyl-4-methyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydratase, 1-indole-4-methyl-2,3-hexanediol dehydratase, 2,8-dimethyl-4,5-nonanediol dehydratase, 1-phenyl-5-methyl-2,3-hexanediol dehydratase, 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydratase, 1-indole-5-methyl-2,3-hexanediol dehydratase, 1-phenyl-6-methyl-2,3-heptanediol dehydratase, 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydratase, 1-indole-6-methyl-2,3-heptanediol dehydratase, 
1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydratase, 1-indole-4-phenyl-2,3-butanediol dehydratase, 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, 1,10-diamino-5,6-decanediol dehydratase, 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydratase, 2,3-hexanediol-1,6-dicarboxylic acid dehydratase, and the like.

[0297] In certain aspects, a selected diol dehydratase enzyme may have one or more of the above exemplified diol dehydratase activities, such as a 2,3-butanediol dehydratase activity, a 3,4-hexanediol dehydratase activity, and/or a 4,5-octanediol dehydratase activity, among others.

[0298] In certain aspects, diol dehydratases may be obtained from Klebsiella pneumoniae MGH 78578, including from the pduCDE gene of this and other microorganisms. In certain aspects, a recombinant microorganism may comprise one or more diol dehydratases encoded by a nucleotide reference sequence selected from SEQ ID NO:103, 105, and 107, or an enzyme having a polypeptide sequence selected from SEQ ID NO:104, 106, and 108, including biologically active fragments or variants thereof, such as optimized variants. Certain aspects may also comprise nucleotide sequences or polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity to SEQ ID NOS:103-108. In certain aspects, polypeptides of SEQ ID NO:104 may comprise certain conserved amino acid residues, including those chosen from D149, P151, A155, A159, G165, E168, E170, A183, G189, G196, Q200, E208, G215, Y219, E221, T222, S224, Y226, G227, T228, F232, G235, D236, D237, T238, P239, S241, L245, Y249, S251, R252, G253, K255, R257, S260, E265, M268, G269, S275, Y278, L279, E280, C283, G291, Q293, G294, Q296, N297, G298, G312, E329, S341, R344, G356, D371, N372, F374, S377, R392, D393, R412, L477, A486, G499, D500, S516, N522, D523, Y524, G526, and G530.

[0299] In certain aspects, a diol dehydratase may include a polypeptide that comprises an amino acid sequence having 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity to SEQ ID NOS:308-311. SEQ ID NO:308 shows the polypeptide sequence of PduG, a diol dehydratase reactivation large subunit derived from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:309 shows the polypeptide sequence of PduH, a diol dehydratase reactivation small subunit derived from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:310 shows the polypeptide sequence of a B12-independent glycerol dehydratase from Clostridium butyricum. SEQ ID NO:311 shows the polypeptide sequence of a glycerol dehydratase activator from Clostridium butyricum. In certain aspects, a B12-independent glycerol dehydratase may comprise conserved amino acid residues, such as T36, G74, P87, E88, E97, W126, R221, A263, Q265, R287, D289, E309, R317, G335, G345, G346, N356, P374, R379, G399, G401, P403, D408, G432, C433, N452, C529, G533, G539, G540, S559, G603, N604, A654, G658, R659, D676, N702, Q735, N737, A747, P751, R760, V761, A762, G763, Q776, I780, and/or R782. In certain aspects, a B12-independent glycerol dehydratase activator may comprise certain conserved amino acid residues, including D19, G20, G22, R24, F28, G31, C32, C36, W38, C39, N41, P42, C58, C64, C96, G129, T132, G135, G136, D185, R187, N208, R222, and/or R264.

[0300] A dehydration and reduction pathway, as described herein, may comprise one or more alcohol dehydrogenases or secondary alcohol dehydrogenases. An "alcohol dehydrogenase" or "secondary alcohol dehydrogenase" that is part of a dehydration and reduction pathway refers generally to an enzyme that catalyzes the conversion of aldehyde or ketone substituents to alcohols. For instance, 4-octanone may be reduced to 4-octanol by a secondary alcohol dehydrogenase as one enzymatic step in the conversion of butyroin to a biofuel. Pseudomonads express at least one secondary alcohol dehydrogenase that oxidizes 4-octanol to 4-octanone using NAD+ as a co-factor. As another example, Rhodococcus erythropolis ATCC4277 catalyzes oxidation of medium to long chain secondary fatty alcohols using NADH as a co-factor, using an enzyme that also catalyzes the oxidation of 3-decanol and 4-decanol. In addition, Nocardia fusca AKU2123 contains an (S)-specific secondary alcohol dehydrogenase.

[0301] Genes encoding secondary alcohol dehydrogenases may be isolated from these and other organisms according to techniques known in the art and incorporated into the microbial systems or recombinant organisms described herein. In certain embodiments, a microbial system or isolated microorganism may comprise natural or optimized secondary alcohol dehydrogenases from Pseudomonads, Rhodococcus erythropolis ATCC4277, Nocardia fusca AKU2123, or other suitable organisms.

[0302] Examples of secondary alcohol dehydrogenases include, but are not limited to, 2-butanol dehydrogenase, 3-hexanol dehydrogenase, 4-octanol dehydrogenase, 5-decanol dehydrogenase, 6-dodecanol dehydrogenase, 7-tetradecanol dehydrogenase, 8-hexadecanol dehydrogenase, 2,5-dimethyl-3-hexanol dehydrogenase, 3,6-dimethyl-4-octanol dehydrogenase, 2,7-dimethyl-4-octanol dehydrogenase, 2,9-dimethyl-4-decanol dehydrogenase, 1,4-diphenyl-2-butanol dehydrogenase, bis-1,4-(4-hydroxyphenyl)-2-butanol dehydrogenase, 1,4-diindole-2-butanol dehydrogenase, cyclopentanol dehydrogenase, 2(or 3)-pentanol dehydrogenase, 2(or 3)-hexanol dehydrogenase, 2(or 3)-heptanol dehydrogenase, 2(or 3)-octanol dehydrogenase, 2(or 3)-nonanol dehydrogenase, 4-methyl-2(or 3)-pentanol dehydrogenase, 4-methyl-2(or 3)-hexanol dehydrogenase, 5-methyl-2(or 3)-hexanol dehydrogenase, 6-methyl-2(or 3)-heptanol dehydrogenase, 1-phenyl-2(or 3)-butanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase, 1-indole-2(or 3)-butanol dehydrogenase, 3(or 4)-heptanol dehydrogenase, 3(or 4)-octanol dehydrogenase, 3(or 4)-nonanol dehydrogenase, 3(or 4)-decanol dehydrogenase, 3(or 4)-undecanol dehydrogenase, 2-methyl-3(or 4)-hexanol dehydrogenase, 5-methyl-3(or 4)-heptanol dehydrogenase, 6-methyl-3(or 4)-heptanol dehydrogenase, 7-methyl-3(or 4)-octanol dehydrogenase, 1-phenyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-pentanol dehydrogenase, 1-indole-2(or 3)-pentanol dehydrogenase, 4(or 5)-nonanol dehydrogenase, 4(or 5)-decanol dehydrogenase, 4(or 5)-undecanol dehydrogenase, 4(or 5)-dodecanol dehydrogenase, 2-methyl-3(or 4)-heptanol dehydrogenase, 3-methyl-4(or 5)-octanol dehydrogenase, 2-methyl-4(or 5)-octanol dehydrogenase, 8-methyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-hexanol dehydrogenase, 1-indole-2(or 3)-hexanol dehydrogenase, 4(or 5)-undecanol dehydrogenase, 5(or 6)-undecanol dehydrogenase, 5(or 6)-tridecanol 
dehydrogenase, 2-methyl-3(or 4)-octanol dehydrogenase, 3-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-5(or 6)-decanol dehydrogenase, 1-phenyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-heptanol dehydrogenase, 1-indole-2(or 3)-heptanol dehydrogenase, 6(or 7)-tridecanol dehydrogenase, 6(or 7)-tetradecanol dehydrogenase, 2-methyl-3(or 4)-nonanol dehydrogenase, 3-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-5(or 6)-undecanol dehydrogenase, 1-phenyl-2(or 3)-octanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-octanol dehydrogenase, 1-indole-2(or 3)-octanol dehydrogenase, 7(or 8)-pentadecanol dehydrogenase, 2-methyl-3(or 4)-decanol dehydrogenase, 3-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-5(or 6)-dodecanol dehydrogenase, 1-phenyl-2(or 3)-nonanol dehydrogenase, 1-(4-hydroxyphenyl)-2 (or 3)-nonanol dehydrogenase, 1-indole-2(or 3)-nonanol dehydrogenase, 2-methyl-3(or 4)-undecanol dehydrogenase, 3-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-5(or 6)-tridecanol dehydrogenase, 1-phenyl-2(or 3)-decanol dehydrogenase, 1-(4-hydroxyphenyl)-2 (or 3)-decanol dehydrogenase, 1-indole-2(or 3)-decanol dehydrogenase, 2,5-dimethyl-3(or 4)-heptanol dehydrogenase, 2,6-dimethyl-3(or 4)-heptanol dehydrogenase, 2,7-dimethyl-3(or 4)-octanol dehydrogenase, 1-phenyl-4-methyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2(or 3)-pentanol dehydrogenase, 1-indole-4-methyl-2(or 3)-pentanol dehydrogenase, 2,6-dimethyl-4(or 5)-octanol dehydrogenase, 3,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-4-methyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2 (or 3)-hexanol dehydrogenase, 1-indole-4-methyl-2(or 3)-hexanol dehydrogenase, 2,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-5-methyl-2(or 3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-5-methyl-2(or 3)-hexanol 
dehydrogenase, 1-indole-5-methyl-2(or 3)-hexanol dehydrogenase, 1-phenyl-6-methyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-6-methyl-2(or 3)-heptanol dehydrogenase, 1-indole-6-methyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-4-phenyl-2(or 3)-butanol dehydrogenase, 1-indole-4-phenyl-2(or 3)-butanol dehydrogenase, 1-indole-4-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase, 1,10-diamino-5-decanol dehydrogenase, 1,4-di(4-hydroxyphenyl)-2-butanol dehydrogenase, 2-hexanol-1,6-dicarboxylic acid dehydrogenase, phenylethanol dehydrogenase, 4-hydroxyphenylethanol dehydrogenase, indole-3-ethanol dehydrogenase, and the like.

[0303] In certain aspects, a selected alcohol dehydrogenase or secondary alcohol dehydrogenase may have one or more of the above exemplified alcohol dehydrogenase activities, such as a 2-butanol dehydrogenase activity, 3-hexanol dehydrogenase activity, and/or a 4-octanol dehydrogenase activity, among others.

[0304] In certain aspects, a recombinant microorganism may comprise one or more secondary alcohol dehydrogenases encoded by a nucleotide reference sequence selected from SEQ ID NO:109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, and 141, or an enzyme having a polypeptide sequence selected from SEQ ID NO:110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, and 142, including biologically active fragments or variants thereof, such as optimized variants. Certain aspects may also comprise nucleotide sequences or polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, 99% sequence identity to SEQ ID NOS:109-142.
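
As a non-limiting illustration of how a sequence identity threshold such as 80% might be checked, the following Python sketch computes percent identity for two sequences that have already been aligned, with `-` marking gaps. The definition assumed here (identical columns divided by all alignment columns that are not gap-against-gap) is one common convention and is an assumption of this sketch; the function names `percent_identity` and `meets_threshold` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned sequences for illustration only (not any SEQ ID NO).
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQR"
print(percent_identity(reference, candidate), meets_threshold(reference, candidate))
```

Other conventions, for example dividing by the length of the shorter sequence, give different values for gapped alignments, so the convention should be stated whenever a threshold is applied.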

[0305] For the secondary alcohol dehydrogenase sequences referred to above, SEQ ID NO:109 is the nucleotide sequence and SEQ ID NO:110 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-1: PP_1946) isolated from Pseudomonas putida KT2440. SEQ ID NO:111 is the nucleotide sequence and SEQ ID NO:112 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-2: PP_1817) isolated from Pseudomonas putida KT2440.

[0306] SEQ ID NO:113 is the nucleotide sequence and SEQ ID NO:114 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-3: PP_1953) isolated from Pseudomonas putida KT2440. SEQ ID NO:115 is the nucleotide sequence and SEQ ID NO:116 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-4: PP_3037) isolated from Pseudomonas putida KT2440.

[0307] SEQ ID NO:117 is the nucleotide sequence and SEQ ID NO:118 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-5: PP_1852) isolated from Pseudomonas putida KT2440. SEQ ID NO:119 is the nucleotide sequence and SEQ ID NO:120 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-6: PP_2723) isolated from Pseudomonas putida KT2440.

[0308] SEQ ID NO:121 is the nucleotide sequence and SEQ ID NO:122 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-7: PP_2002) isolated from Pseudomonas putida KT2440. SEQ ID NO:123 is the nucleotide sequence and SEQ ID NO:124 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-8: PP_1914) isolated from Pseudomonas putida KT2440.

[0309] SEQ ID NO:125 is the nucleotide sequence and SEQ ID NO:126 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-9: PP_1914) isolated from Pseudomonas putida KT2440. SEQ ID NO:127 is the nucleotide sequence and SEQ ID NO:128 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-10: PP_3926) isolated from Pseudomonas putida KT2440.

[0310] SEQ ID NO:129 is the nucleotide sequence and SEQ ID NO:130 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-11: PFL_1756) isolated from Pseudomonas fluorescens Pf-5. SEQ ID NO:131 is the nucleotide sequence and SEQ ID NO:132 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-12: KPN_01694) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578.

[0311] SEQ ID NO:133 is the nucleotide sequence and SEQ ID NO:134 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-13: KPN_02061) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:135 is the nucleotide sequence and SEQ ID NO:136 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-14: KPN_00827) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578.

[0312] SEQ ID NO:137 is the nucleotide sequence and SEQ ID NO:138 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-16: KPN_01350) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:139 is the nucleotide sequence and SEQ ID NO:140 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-17: KPN_03369) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:141 is the nucleotide sequence and SEQ ID NO:142 is the polypeptide sequence of a secondary alcohol dehydrogenase (2adh-18: KPN_03363) isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578.

[0313] In certain aspects, an alcohol dehydrogenase (e.g., DEHU hydrogenase), a secondary alcohol dehydrogenase (2ADH), a fragment, variant, or derivative thereof, or any other enzyme that utilizes such an active site, may comprise at least one of a nicotinamide adenine dinucleotide (NAD+), NADH, nicotinamide adenine dinucleotide phosphate (NADP+), or NADPH binding motif. In certain embodiments, the NAD+, NADH, NADP+, or NADPH binding motif may be selected from the group consisting of Y-X-G-G-X-Y, Y-X-X-G-G-X-Y, Y-X-X-X-G-G-X-Y, Y-X-G-X-X-Y, Y-X-X-G-G-X-X-Y, Y-X-X-X-G-X-X-Y, Y-X-G-X-Y, Y-X-X-G-X-Y, Y-X-X-X-G-X-Y, and Y-X-X-X-X-G-X-Y; wherein Y is independently selected from alanine, glycine, and serine, wherein G is glycine, and wherein X is independently selected from a genetically encoded amino acid.
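
As an illustrative sketch only, the binding motifs listed above can be translated into regular expressions and used to scan a polypeptide sequence. In this Python example the motif symbol Y is a placeholder for alanine, glycine, or serine (it does not denote tyrosine), G is glycine, and X is any of the twenty genetically encoded amino acids, as stated above. The names `MOTIFS`, `TOKEN_TO_CLASS`, and `find_binding_motifs`, and the toy sequence, are hypothetical.

```python
import re

# The ten motifs recited above, written exactly as in the text.
MOTIFS = [
    "Y-X-G-G-X-Y", "Y-X-X-G-G-X-Y", "Y-X-X-X-G-G-X-Y", "Y-X-G-X-X-Y",
    "Y-X-X-G-G-X-X-Y", "Y-X-X-X-G-X-X-Y", "Y-X-G-X-Y", "Y-X-X-G-X-Y",
    "Y-X-X-X-G-X-Y", "Y-X-X-X-X-G-X-Y",
]

# Motif symbol -> regex character class (one-letter amino acid codes).
TOKEN_TO_CLASS = {
    "Y": "[AGS]",                    # alanine, glycine, or serine
    "G": "G",                        # glycine
    "X": "[ACDEFGHIKLMNPQRSTVWY]",   # any genetically encoded amino acid
}


def find_binding_motifs(sequence):
    """Return (motif, 1-based start, matched text) for every hit, including overlaps."""
    hits = []
    for motif in MOTIFS:
        body = "".join(TOKEN_TO_CLASS[token] for token in motif.split("-"))
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?=({body}))", sequence):
            hits.append((motif, match.start() + 1, match.group(1)))
    return hits


# Toy sequence for illustration only (not any SEQ ID NO).
for hit in find_binding_motifs("MKGAGGLGW"):
    print(hit)
```

Because these motifs are short and permissive, a hit found this way only flags a candidate region; it is not by itself evidence of cofactor binding.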

[0314] As one example of a step in a reduction and dehydration pathway, α-hydroxy cyclopentanone may be reduced to 1,2-cyclopentanediol. For example, the glycerol dehydrogenase isolated from Hansenula ofunaensis favors the reduction of α-hydroxy ketones and α-keto ketones, and has very broad substrate specificity. The similar alcohol dehydrogenase derived from Hansenula polymorpha and meso-2,3-butanediol dehydrogenase have similar properties. Certain embodiments may incorporate a 1,2-cyclopentanediol dehydrogenase into the microbial system or isolated microorganism. Other embodiments may incorporate a glycerol dehydrogenase from Hansenula ofunaensis, Hansenula polymorpha, Klebsiella pneumoniae, or any other suitable organism.

[0315] By way of example, a chemical or hydrocarbon such as 1,2-cyclopentanediol may be dehydrated to form cyclopentanone as one enzymatic step in a reduction and dehydration pathway. There are at least two different types of diol dehydratases that may catalyze dehydration of chemicals such as 1,2-cyclopentanediol. Certain embodiments of a microbial system comprising a reduction and dehydration pathway will comprise diol dehydratases such as 1,2-cyclopentanediol dehydratase.

[0316] In the last enzymatic step of a reduction and dehydration pathway, the conversion of such exemplary chemicals as α-hydroxy cyclopentanone to cyclopentanol may include the reduction of cyclopentanone to cyclopentanol. This step may be catalyzed by cyclopentanol dehydrogenase, which is found in Comamonas sp. strain NCIMB 9872 and whose gene (cpnA) has been isolated. Certain embodiments of a microbial system or isolated microorganism may comprise a cyclopentanol dehydrogenase, such as that expressed by cpnA in Comamonas sp. strain NCIMB 9872, among others described herein.
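
For illustration only, the three enzymatic steps described above for the cyclopentanol example (reduction, dehydration, reduction) can be written down as an ordered list and checked for continuity, so that each step's product is the next step's substrate. The Python sketch below is a bookkeeping aid, not a model of enzyme activity; the names `Step`, `CYCLOPENTANOL_PATHWAY`, and `is_connected` are hypothetical.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str
    reaction: str


# Steps as described in the text for the cyclopentanol example.
CYCLOPENTANOL_PATHWAY = [
    Step("alpha-hydroxy cyclopentanone", "1,2-cyclopentanediol",
         "1,2-cyclopentanediol dehydrogenase", "reduction"),
    Step("1,2-cyclopentanediol", "cyclopentanone",
         "1,2-cyclopentanediol dehydratase", "dehydration"),
    Step("cyclopentanone", "cyclopentanol",
         "cyclopentanol dehydrogenase", "reduction"),
]


def is_connected(pathway):
    """True when every step's product is the substrate of the following step."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


print(is_connected(CYCLOPENTANOL_PATHWAY))
for step in CYCLOPENTANOL_PATHWAY:
    print(f"{step.substrate} -> {step.product} ({step.reaction}, {step.enzyme})")
```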

[0317] As detailed below, in certain embodiments, selected C--C ligation pathways may be utilized in combination with selected components or enzymes of a reduction and dehydration pathway to produce a commodity chemical, or intermediate thereof.

[0318] For example, certain embodiments include a method wherein the C--C ligation pathway may comprise an acetoaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-butanediol dehydrogenase, a 2,3-butanediol dehydratase, and a 2-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a propionaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-hexanediol dehydrogenase, a 3,4-hexanediol dehydratase, and a 3-hexanol dehydrogenase.
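
The enzyme names in these embodiments follow a carbon-counting pattern that may be sketched as follows. Assuming two unbranched aldehydes with n and m carbons are joined at their carbonyl carbons, the product has an (n+m)-carbon chain with oxygen substituents on adjacent carbons; numbering from the end nearer those carbons places them at positions n and n+1, where n is the shorter chain. The Python sketch below reproduces the names above (2,3-butanediol and 2-butanol from two 2-carbon aldehydes; 3,4-hexanediol and 3-hexanol from two 3-carbon aldehydes). The function name `ligation_products` and the stem table are hypothetical, and branched or aromatic aldehydes are outside the scope of this sketch.

```python
# Stems for unbranched chains of 4 to 16 carbons.
ALKANE_STEMS = {
    4: "butan", 5: "pentan", 6: "hexan", 7: "heptan", 8: "octan",
    9: "nonan", 10: "decan", 11: "undecan", 12: "dodecan",
    13: "tridecan", 14: "tetradecan", 15: "pentadecan", 16: "hexadecan",
}


def ligation_products(n, m):
    """Name the vicinal diol and the secondary alcohol for linear aldehydes of n and m carbons."""
    if n < 2 or m < 2:
        raise ValueError("each aldehyde must have at least 2 carbons")
    short, long_ = sorted((n, m))
    stem = ALKANE_STEMS[short + long_]
    diol = f"{short},{short + 1}-{stem}ediol"
    if short == long_:
        # Symmetric product: both hydroxyl positions are equivalent.
        alcohol = f"{short}-{stem}ol"
    else:
        alcohol = f"{short}(or {short + 1})-{stem}ol"
    return diol, alcohol


print(ligation_products(2, 2))
print(ligation_products(3, 3))
print(ligation_products(2, 4))
```

For chains of unequal length the same rule gives the "2(or 3)" style of name used for the mixed-aldehyde embodiments.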

[0319] Additional embodiments include a method wherein the C--C ligation pathway may comprise a butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-octanediol dehydrogenase, a 4,5-octanediol dehydratase, and a 4-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5,6-decanediol dehydrogenase, a 5,6-decanediol dehydratase, and a 5-decanol dehydrogenase.

[0320] Additional embodiments include a method wherein the C--C ligation pathway may comprise a hexyldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6,7-dodecanediol dehydrogenase, a 6,7-dodecanediol dehydratase, and a 6-dodecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 7,8-tetradecanediol dehydrogenase, a 7,8-tetradecanediol dehydratase, and a 7-tetradecanol dehydrogenase.

[0321] Additional embodiments include a method wherein the C--C ligation pathway may comprise an octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of an 8,9-hexadecanediol dehydrogenase, an 8,9-hexadecanediol dehydratase, and an 8-hexadecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise an isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,5-dimethyl-3,4-hexanediol dehydrogenase, a 2,5-dimethyl-3,4-hexanediol dehydratase, and a 2,5-dimethyl-3-hexanol dehydrogenase.

[0322] Additional embodiments include a method wherein the C--C ligation pathway may comprise a 2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,6-dimethyl-4,5-octanediol dehydrogenase, a 3,6-dimethyl-4,5-octanediol dehydratase, and a 3,6-dimethyl-4-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a 3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,7-dimethyl-4,5-octanediol dehydrogenase, a 2,7-dimethyl-4,5-octanediol dehydratase, and a 2,7-dimethyl-4-octanol dehydrogenase.

[0323] Additional embodiments include a method wherein the C--C ligation pathway may comprise a 4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,9-dimethyl-5,6-decanediol dehydrogenase, a 2,9-dimethyl-5,6-decanediol dehydratase, and a 2,9-dimethyl-5-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1,4-diphenyl-2,3-butanediol dehydrogenase, a 1,4-diphenyl-2,3-butanediol dehydratase, and a 1,4-diphenyl-2-butanol dehydrogenase.

[0324] Additional embodiments include a method wherein the C--C ligation pathway may comprise a 4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a bis-1,4-(4-hydroxyphenyl)-2-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise an indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1,4-diindole-2,3-butanediol dehydrogenase, a 1,4-diindole-2,3-butanediol dehydratase, and a 1,4-diindole-2-butanol dehydrogenase.

[0325] Additional embodiments include a method wherein the C--C ligation pathway may comprise an α-keto adipate carboxylyase, and wherein the reduction and dehydration pathway may comprise at least one of a 1,2-cyclopentanediol dehydrogenase, a 1,2-cyclopentanediol dehydratase, and a cyclopentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/propionaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-pentanediol dehydrogenase, a 2,3-pentanediol dehydratase, and a 2(or 3)-pentanol dehydrogenase.

[0326] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-hexanediol dehydrogenase, a 2,3-hexanediol dehydratase, and a 2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-heptanediol dehydrogenase, a 2,3-heptanediol dehydratase, and a 2(or 3)-heptanol dehydrogenase.

[0327] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/hexyldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-octanediol dehydrogenase, a 2,3-octanediol dehydratase, and a 2(or 3)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-nonanediol dehydrogenase, a 2,3-nonanediol dehydratase, and a 2(or 3)-nonanol dehydrogenase.

[0328] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4-methyl-2,3-pentanediol dehydrogenase, a 4-methyl-2,3-pentanediol dehydratase, and a 4-methyl-2(or 3)-pentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4-methyl-2,3-hexanediol dehydrogenase, a 4-methyl-2,3-hexanediol dehydratase, and a 4-methyl-2(or 3)-hexanol dehydrogenase.

[0329] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5-methyl-2,3-hexanediol dehydrogenase, a 5-methyl-2,3-hexanediol dehydratase, and a 5-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6-methyl-2,3-heptanediol dehydrogenase, a 6-methyl-2,3-heptanediol dehydratase, and a 6-methyl-2(or 3)-heptanol dehydrogenase.

[0330] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-butanediol dehydrogenase, a 1-phenyl-2,3-butanediol dehydratase, and a 1-phenyl-2(or 3)-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase.

[0331] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an acetoaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-butanediol dehydrogenase, a 1-indole-2,3-butanediol dehydratase, and a 1-indole-2(or 3)-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-heptanediol dehydrogenase, a 3,4-heptanediol dehydratase, and a 3(or 4)-heptanol dehydrogenase.

[0332] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-octanediol dehydrogenase, a 3,4-octanediol dehydratase, and a 3(or 4)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/hexyldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-nonanediol dehydrogenase, a 3,4-nonanediol dehydratase, and a 3(or 4)-nonanol dehydrogenase.

[0333] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-decanediol dehydrogenase, a 3,4-decanediol dehydratase, and a 3(or 4)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,4-undecanediol dehydrogenase, a 3,4-undecanediol dehydratase, and a 3(or 4)-undecanol dehydrogenase.

[0334] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-hexanediol dehydrogenase, a 2-methyl-3,4-hexanediol dehydratase, and a 2-methyl-3(or 4)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5-methyl-3,4-heptanediol dehydrogenase, a 5-methyl-3,4-heptanediol dehydratase, and a 5-methyl-3(or 4)-heptanol dehydrogenase.

[0335] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6-methyl-3,4-heptanediol dehydrogenase, a 6-methyl-3,4-heptanediol dehydratase, and a 6-methyl-3(or 4)-heptanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 7-methyl-3,4-octanediol dehydrogenase, a 7-methyl-3,4-octanediol dehydratase, and a 7-methyl-3(or 4)-octanol dehydrogenase.

[0336] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-pentanediol dehydrogenase, a 1-phenyl-2,3-pentanediol dehydratase, and a 1-phenyl-2(or 3)-pentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-pentanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-pentanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-pentanol dehydrogenase.

[0337] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a propionaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-pentanediol dehydrogenase, a 1-indole-2,3-pentanediol dehydratase, and a 1-indole-2(or 3)-pentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-nonanediol dehydrogenase, a 4,5-nonanediol dehydratase, and a 4(or 5)-nonanol dehydrogenase.

[0338] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/hexyldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-decanediol dehydrogenase, a 4,5-decanediol dehydratase, and a 4(or 5)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-undecanediol dehydrogenase, a 4,5-undecanediol dehydratase, and a 4(or 5)-undecanol dehydrogenase.

[0339] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 4,5-dodecanediol dehydrogenase, a 4,5-dodecanediol dehydratase, and a 4(or 5)-dodecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-heptanediol dehydrogenase, a 2-methyl-3,4-heptanediol dehydratase, and a 2-methyl-3(or 4)-heptanol dehydrogenase.

[0340] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-octanediol dehydrogenase, a 3-methyl-4,5-octanediol dehydratase, and a 3-methyl-4(or 5)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-octanediol dehydrogenase, a 2-methyl-4,5-octanediol dehydratase, and a 2-methyl-4(or 5)-octanol dehydrogenase.

[0341] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of an 8-methyl-4,5-nonanediol dehydrogenase, an 8-methyl-4,5-nonanediol dehydratase, and an 8-methyl-4(or 5)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-hexanediol dehydrogenase, a 1-phenyl-2,3-hexanediol dehydratase, and a 1-phenyl-2(or 3)-hexanol dehydrogenase.

[0342] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-hexanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-hexanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a butyraldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-hexanediol dehydrogenase, a 1-indole-2,3-hexanediol dehydratase, and a 1-indole-2(or 3)-hexanol dehydrogenase.

[0343] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/hexyldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5,6-undecanediol dehydrogenase, a 5,6-undecanediol dehydratase, and a 5(or 6)-undecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5,6-dodecanediol dehydrogenase, a 5,6-dodecanediol dehydratase, and a 5(or 6)-dodecanol dehydrogenase.

[0344] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 5,6-tridecanediol dehydrogenase, a 5,6-tridecanediol dehydratase, and a 5(or 6)-tridecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-octanediol dehydrogenase, a 2-methyl-3,4-octanediol dehydratase, and a 2-methyl-3(or 4)-octanol dehydrogenase.

[0345] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-nonanediol dehydrogenase, a 3-methyl-4,5-nonanediol dehydratase, and a 3-methyl-4(or 5)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-nonanediol dehydrogenase, a 2-methyl-4,5-nonanediol dehydratase, and a 2-methyl-4(or 5)-nonanol dehydrogenase.

[0346] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-5,6-decanediol dehydrogenase, a 2-methyl-5,6-decanediol dehydratase, and a 2-methyl-5(or 6)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-heptanediol dehydrogenase, a 1-phenyl-2,3-heptanediol dehydratase, and a 1-phenyl-2(or 3)-heptanol dehydrogenase.

[0347] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-heptanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-heptanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-heptanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a pentaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-heptanediol dehydrogenase, a 1-indole-2,3-heptanediol dehydratase, and a 1-indole-2(or 3)-heptanol dehydrogenase.

[0348] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/heptaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6,7-tridecanediol dehydrogenase, a 6,7-tridecanediol dehydratase, and a 6(or 7)-tridecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 6,7-tetradecanediol dehydrogenase, a 6,7-tetradecanediol dehydratase, and a 6(or 7)-tetradecanol dehydrogenase.

[0349] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-nonanediol dehydrogenase, a 2-methyl-3,4-nonanediol dehydratase, and a 2-methyl-3(or 4)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-decanediol dehydrogenase, a 3-methyl-4,5-decanediol dehydratase, and a 3-methyl-4(or 5)-decanol dehydrogenase.

[0350] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-decanediol dehydrogenase, a 2-methyl-4,5-decanediol dehydratase, and a 2-methyl-4(or 5)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-5,6-undecanediol dehydrogenase, a 2-methyl-5,6-undecanediol dehydratase, and a 2-methyl-5(or 6)-undecanol dehydrogenase.

[0351] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-octanediol dehydrogenase, a 1-phenyl-2,3-octanediol dehydratase, and a 1-phenyl-2(or 3)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-octanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-octanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-octanol dehydrogenase.

[0352] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a hexyldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-octanediol dehydrogenase, a 1-indole-2,3-octanediol dehydratase, and a 1-indole-2(or 3)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/octaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 7,8-pentadecanediol dehydrogenase, a 7,8-pentadecanediol dehydratase, and a 7(or 8)-pentadecanol dehydrogenase.
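The names used above for the unbranched aldehyde pairs follow from simple carbon arithmetic: ligating a straight-chain aldehyde of m carbons to one of n carbons (m not greater than n) at their carbonyl carbons gives a chain of m+n carbons whose two hydroxyl-bearing carbons sit at positions m and m+1. The following Python sketch illustrates that bookkeeping only. The function name and both lookup tables are illustrative assumptions and are not part of the disclosure, and branched or aromatic aldehydes are not handled.

```python
# Carbon counts of the straight-chain aldehydes, keyed by the names used in
# the text above (illustrative lookup table, not part of the disclosure).
ALDEHYDE_CARBONS = {
    "pentaldehyde": 5,
    "hexyldehyde": 6,
    "heptaldehyde": 7,
    "octaldehyde": 8,
}

# Parent alkane names for the chain lengths reachable from the table above.
ALKANE_NAME = {
    10: "decane",
    11: "undecane",
    12: "dodecane",
    13: "tridecane",
    14: "tetradecane",
    15: "pentadecane",
    16: "hexadecane",
}


def ligated_diol_name(aldehyde_a: str, aldehyde_b: str) -> str:
    """Name the vicinal diol expected from ligating two linear aldehydes."""
    m, n = sorted((ALDEHYDE_CARBONS[aldehyde_a], ALDEHYDE_CARBONS[aldehyde_b]))
    parent = ALKANE_NAME[m + n]
    # Numbering from the end contributed by the shorter aldehyde, the two
    # former carbonyl carbons are carbons m and m + 1 of the new chain.
    return f"{m},{m + 1}-{parent}diol"
```

For example, `ligated_diol_name("hexyldehyde", "heptaldehyde")` returns `"6,7-tridecanediol"` and `ligated_diol_name("heptaldehyde", "octaldehyde")` returns `"7,8-pentadecanediol"`, matching the substrates named for the corresponding dehydrogenases and dehydratases.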

[0353] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-decanediol dehydrogenase, a 2-methyl-3,4-decanediol dehydratase, and a 2-methyl-3(or 4)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-undecanediol dehydrogenase, a 3-methyl-4,5-undecanediol dehydratase, and a 3-methyl-4(or 5)-undecanol dehydrogenase.

[0354] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-undecanediol dehydrogenase, a 2-methyl-4,5-undecanediol dehydratase, and a 2-methyl-4(or 5)-undecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-5,6-dodecanediol dehydrogenase, a 2-methyl-5,6-dodecanediol dehydratase, and a 2-methyl-5(or 6)-dodecanol dehydrogenase.

[0355] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-nonanediol dehydrogenase, a 1-phenyl-2,3-nonanediol dehydratase, and a 1-phenyl-2(or 3)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-nonanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-nonanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-nonanol dehydrogenase.

[0356] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a heptaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-nonanediol dehydrogenase, a 1-indole-2,3-nonanediol dehydratase, and a 1-indole-2(or 3)-nonanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/isobutyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-3,4-undecanediol dehydrogenase, a 2-methyl-3,4-undecanediol dehydratase, and a 2-methyl-3(or 4)-undecanol dehydrogenase.

[0357] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3-methyl-4,5-dodecanediol dehydrogenase, a 3-methyl-4,5-dodecanediol dehydratase, and a 3-methyl-4(or 5)-dodecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-4,5-dodecanediol dehydrogenase, a 2-methyl-4,5-dodecanediol dehydratase, and a 2-methyl-4(or 5)-dodecanol dehydrogenase.

[0358] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2-methyl-5,6-tridecanediol dehydrogenase, a 2-methyl-5,6-tridecanediol dehydratase, and a 2-methyl-5(or 6)-tridecanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-2,3-decanediol dehydrogenase, a 1-phenyl-2,3-decanediol dehydratase, and a 1-phenyl-2(or 3)-decanol dehydrogenase.

[0359] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-2,3-decanediol dehydrogenase, a 1-(4-hydroxyphenyl)-2,3-decanediol dehydratase, and a 1-(4-hydroxyphenyl)-2(or 3)-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an octaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-2,3-decanediol dehydrogenase, a 1-indole-2,3-decanediol dehydratase, and a 1-indole-2(or 3)-decanol dehydrogenase.

[0360] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/2-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,5-dimethyl-3,4-heptanediol dehydrogenase, a 2,5-dimethyl-3,4-heptanediol dehydratase, and a 2,5-dimethyl-3(or 4)-heptanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,6-dimethyl-3,4-heptanediol dehydrogenase, a 2,6-dimethyl-3,4-heptanediol dehydratase, and a 2,6-dimethyl-3(or 4)-heptanol dehydrogenase.

[0361] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,7-dimethyl-3,4-octanediol dehydrogenase, a 2,7-dimethyl-3,4-octanediol dehydratase, and a 2,7-dimethyl-3(or 4)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-4-methyl-2,3-pentanediol dehydrogenase, a 1-phenyl-4-methyl-2,3-pentanediol dehydratase, and a 1-phenyl-4-methyl-2(or 3)-pentanol dehydrogenase.

[0362] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydrogenase, a 1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydratase, and a 1-(4-hydroxyphenyl)-4-methyl-2(or 3)-pentanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of an isobutyraldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-4-methyl-2,3-pentanediol dehydrogenase, a 1-indole-4-methyl-2,3-pentanediol dehydratase, and a 1-indole-4-methyl-2(or 3)-pentanol dehydrogenase.

[0363] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/3-methyl-butyraldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,6-dimethyl-4,5-octanediol dehydrogenase, a 2,6-dimethyl-4,5-octanediol dehydratase, and a 2,6-dimethyl-4(or 5)-octanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 3,8-dimethyl-4,5-nonanediol dehydrogenase, a 3,8-dimethyl-4,5-nonanediol dehydratase, and a 3,8-dimethyl-4(or 5)-nonanol dehydrogenase.

[0364] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-4-methyl-2,3-hexanediol dehydrogenase, a 1-phenyl-4-methyl-2,3-hexanediol dehydratase, and a 1-phenyl-4-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydrogenase, a 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydratase, and a 1-(4-hydroxyphenyl)-4-methyl-2(or 3)-hexanol dehydrogenase.

[0365] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 2-methyl-butyraldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-4-methyl-2,3-hexanediol dehydrogenase, a 1-indole-4-methyl-2,3-hexanediol dehydratase, and a 1-indole-4-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 3-methyl-butyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 2,8-dimethyl-4,5-nonanediol dehydrogenase, a 2,8-dimethyl-4,5-nonanediol dehydratase, and a 2,8-dimethyl-4(or 5)-nonanol dehydrogenase.

[0366] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 3-methyl-butyraldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-5-methyl-2,3-hexanediol dehydrogenase, a 1-phenyl-5-methyl-2,3-hexanediol dehydratase, and a 1-phenyl-5-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 3-methyl-butyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydrogenase, a 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydratase, and a 1-(4-hydroxyphenyl)-5-methyl-2(or 3)-hexanol dehydrogenase.

[0367] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 3-methyl-butyraldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-5-methyl-2,3-hexanediol dehydrogenase, a 1-indole-5-methyl-2,3-hexanediol dehydratase, and a 1-indole-5-methyl-2(or 3)-hexanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 4-methyl-pentaldehyde/phenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-phenyl-6-methyl-2,3-heptanediol dehydrogenase, a 1-phenyl-6-methyl-2,3-heptanediol dehydratase, and a 1-phenyl-6-methyl-2(or 3)-heptanol dehydrogenase.

[0368] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 4-methyl-pentaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydrogenase, a 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydratase, and a 1-(4-hydroxyphenyl)-6-methyl-2(or 3)-heptanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 4-methyl-pentaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-6-methyl-2,3-heptanediol dehydrogenase, a 1-indole-6-methyl-2,3-heptanediol dehydratase, and a 1-indole-6-methyl-2(or 3)-heptanol dehydrogenase.

[0369] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a phenylacetaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydrogenase, a 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydratase, and a 1-(4-hydroxyphenyl)-4-phenyl-2(or 3)-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a phenylacetaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-4-phenyl-2,3-butanediol dehydrogenase, a 1-indole-4-phenyl-2,3-butanediol dehydratase, and a 1-indole-4-phenyl-2(or 3)-butanol dehydrogenase.

[0370] Additional embodiments include a method wherein the C--C ligation pathway may comprise at least one of a 4-hydroxyphenylacetaldehyde/indoleacetaldehyde lyase and wherein the reduction and dehydration pathway may comprise at least one of a 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a 1-indole-4-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a 5-amino-pentaldehyde lyase, and wherein the reduction and dehydration pathway may comprise at least one of a 1,10-diamino-5,6-decanediol dehydrogenase, a 1,10-diamino-5,6-decanediol dehydratase, and a 1,10-diamino-5-decanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a 4-hydroxyphenylacetaldehyde lyase, and wherein the reduction and dehydration pathway may comprise at least one of a 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a 1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a 1,4-di(4-hydroxyphenyl)-2-butanol dehydrogenase. Additional embodiments include a method wherein the C--C ligation pathway may comprise a succinate semialdehyde lyase, and wherein the reduction and dehydration pathway may comprise at least one of a 2,3-hexanediol-1,6-dicarboxylic acid dehydrogenase, a 2,3-hexanediol-1,6-dicarboxylic acid dehydratase, and a 2-hexanol-1,6-dicarboxylic acid dehydrogenase.

[0371] Certain embodiments of a microbial system or recombinant microorganism may comprise genes encoding enzymes that are able to catalyze (e.g., by reduction and dehydration) the conversion of 4-octanol to octene or octane. Other embodiments may comprise redesigned or de novo designed enzymes for this reduction and dehydration pathway. For example, three redesigned enzymes could convert 4-octanone to either 3- or 4-octene. The first step could be catalyzed by a redesigned isocitrate dehydrogenase. This enzyme could catalyze the formation of 4-hydroxy-3(or 5)-carboxylic octane. The 4-hydroxy group could then be phosphorylated by a redesigned kinase. Finally, a redesigned mevalonate diphosphate decarboxylase could catalyze the formation of 3(or 4)-octene.

[0372] In other embodiments, several redesigned enzymes could convert 4-octanone to octane. For example, the 4-hydroxy-3(or 5)-carboxylic octane is sequentially reduced and dehydrated to form 3(or 5)-carboxylic octane. Redesigned enzymes involved in fatty acid metabolism can catalyze these reactions. The 3(or 5)-carboxylic octane can be reduced to the corresponding aldehyde by an aldehyde dehydrogenase, and the product may be decarbonylated by a redesigned decarbonylase to form octane.

[0373] As noted above, for the production of certain commodity chemicals, such as 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol, among other similar chemicals, a biosynthesis pathway (e.g., aldehyde biosynthesis pathway) may optionally or further comprise one or more genes encoding a decarboxylase enzyme, such as an indole-3-pyruvate decarboxylase (IPDC), to produce an aldehyde. In certain aspects, an IPDC may comprise an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO:312. An IPDC enzyme may comprise certain conserved amino acid residues, such as G24, D25, E48, A55, R60, G75, E89, H113, G252, G405, G413, G428, G430, and/or N456.
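Whether a candidate sequence retains the conserved residues listed above can be checked mechanically. The Python sketch below is offered only as an illustration: the function name is hypothetical, and it assumes the candidate sequence has already been put into the same residue numbering as SEQ ID NO:312 (for example, by alignment), which the code itself does not do.

```python
import re

# Conserved positions as listed in the text: <one-letter residue><1-based position>.
CONSERVED = ["G24", "D25", "E48", "A55", "R60", "G75", "E89", "H113",
             "G252", "G405", "G413", "G428", "G430", "N456"]


def missing_conserved(sequence: str, labels=CONSERVED) -> list[str]:
    """Return the labels whose residue is absent or different in `sequence`.

    Assumes `sequence` uses the same numbering as the reference sequence;
    no alignment is performed here.
    """
    missing = []
    for label in labels:
        match = re.fullmatch(r"([A-Z])(\d+)", label)
        if match is None:
            raise ValueError(f"bad residue label: {label!r}")
        residue, position = match.group(1), int(match.group(2))
        # Positions are 1-based; a sequence that is too short counts as missing.
        if position > len(sequence) or sequence[position - 1] != residue:
            missing.append(label)
    return missing
```

An empty return value means every listed position carries the expected residue. Because the text recites the residues with "and/or", a non-empty result does not by itself place a sequence outside the described embodiments.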

[0374] In these and other embodiments, a recombinant microorganism may comprise an aldehyde reductase, such as a phenylacetaldehyde reductase (PAR), to convert an aldehyde to a commodity chemical. In certain aspects, a PAR may comprise an amino acid sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO:313, which shows the sequence of a PAR enzyme derived from Rhodococcus sp. ST-10. In certain aspects, a PAR enzyme may comprise at least one of a nicotinamide adenine dinucleotide (NAD+), NADH, nicotinamide adenine dinucleotide phosphate (NADP+), or NADPH binding motif. In certain embodiments, the NAD+, NADH, NADP+, or NADPH binding motif may be selected from the group consisting of Y-X-G-G-X-Y, Y-X-X-G-G-X-Y, Y-X-X-X-G-G-X-Y, Y-X-G-X-X-Y, Y-X-X-G-G-X-X-Y, Y-X-X-X-G-X-X-Y, Y-X-G-X-Y, Y-X-X-G-X-Y, Y-X-X-X-G-X-Y, and Y-X-X-X-X-G-X-Y; wherein Y is independently selected from alanine, glycine, and serine, wherein G is glycine, and wherein X is independently selected from a genetically encoded amino acid.
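The binding motifs recited above translate directly into regular expressions, which makes it straightforward to scan a sequence for them. The Python sketch below is illustrative only. The function names are hypothetical, "genetically encoded amino acid" is taken here to mean the twenty standard one-letter codes (an assumption), and the motif symbol Y stands for alanine, glycine, or serine as defined in the text, not for tyrosine.

```python
import re

# Motifs as recited in the text. Y = A, G, or S; G = glycine; X = any residue.
MOTIFS = [
    "Y-X-G-G-X-Y", "Y-X-X-G-G-X-Y", "Y-X-X-X-G-G-X-Y", "Y-X-G-X-X-Y",
    "Y-X-X-G-G-X-X-Y", "Y-X-X-X-G-X-X-Y", "Y-X-G-X-Y", "Y-X-X-G-X-Y",
    "Y-X-X-X-G-X-Y", "Y-X-X-X-X-G-X-Y",
]

# Assumption: X is restricted to the twenty standard amino acids.
SYMBOL_TO_REGEX = {"Y": "[AGS]", "G": "G", "X": "[ACDEFGHIKLMNPQRSTVWY]"}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a dash-separated motif into a compiled regular expression."""
    return re.compile("".join(SYMBOL_TO_REGEX[s] for s in motif.split("-")))


def find_motifs(sequence: str) -> list[tuple[str, int]]:
    """Return (motif, 1-based start) for every match, overlaps included."""
    hits = []
    for motif in MOTIFS:
        # A lookahead reports overlapping occurrences as well.
        pattern = re.compile(f"(?=({motif_to_regex(motif).pattern}))")
        for m in pattern.finditer(sequence):
            hits.append((motif, m.start() + 1))
    return hits
```

Several motifs can match at the same position because the patterns overlap, so the result is a list of all hits; no single best match is selected. An empty list means none of the recited motifs was found.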

[0375] In certain embodiments, such a recombinant microorganism may also or alternatively comprise a secondary alcohol dehydrogenase having an activity selected from at least one of a phenylethanol dehydrogenase activity, a 4-hydroxyphenylethanol dehydrogenase activity, and an indole-3-ethanol dehydrogenase activity, to reduce the aldehyde to its corresponding alcohol (e.g., 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol).

[0376] Embodiments of the present invention also include methods for converting a suitable monosaccharide to a commodity chemical comprising, (a) obtaining a suitable monosaccharide; (b) contacting the suitable monosaccharide with a microbial system for a time sufficient to convert the suitable monosaccharide to the biofuel, wherein the microbial system comprises, (i) one or more genes encoding and expressing a fatty acid biosynthesis pathway, an amino acid biosynthetic pathway, and/or a short chain alcohol biosynthetic pathway; (ii) one or more genes encoding and expressing a keto-acid decarboxylase, aldehyde dehydrogenase, and/or alcohol dehydrogenase; and (iii) an enzymatic reduction pathway selected from (1) an enzymatic long chain alcohol reduction pathway, (2) an enzymatic decarbonylation pathway, (3) an enzymatic decarboxylation pathway, and (4) an enzymatic reduction pathway comprising (1), (2), and/or (3), thereby converting the suitable monosaccharide to the commodity chemical.

[0377] Embodiments of the present invention may comprise one or more genes encoding and expressing enzymes in a fatty acid synthesis pathway, which may be used, as one example, to produce biofuels in the form of alkanes, such as medium to long chain alkanes. In certain embodiments, the specificity of the fatty acid biosynthesis pathway in the microbial system may be recalibrated or redesigned. Merely by way of example, microorganisms generally produce a mixture of long chain fatty acids (e.g., E. coli naturally produces large quantities of long chain fatty acids (C16-C19: <95% in whole cells) and small quantities of medium chain fatty acids (C12: 2% and C14: 5% in whole cells)).

[0378] In certain embodiments, the recalibration or re-engineering may be directed to increasing production of medium chain fatty acids, including, but not limited to, caprylate (C8), caprate (C10), laurate (C12), myristate (C14), and palmitate (C16), as alkanes produced from these fatty acids are major components of gasoline, diesel, and kerosene. In addition to these fatty acids, other embodiments may be directed to increased production of long chain fatty acids, including, but not limited to, stearate (C18), arachidonate (C20), behenate (C22) and longer fatty acids, as n-alkanes produced from these fatty acids are major components of heavy oils.

[0379] For example, Cuphea species mainly accumulate medium chain fatty acids as major components of their seed oils, and the composition varies by species. In particular, Cuphea pulcherrima accumulates caprylate (C8:0) 96%, Cuphea koehneana accumulates caprate (C10:0) 95.3%, and Cuphea polymorpha accumulates laurate (C12:0) 80.1%. Embodiments of the microbial systems or isolated microorganisms according to the present application may incorporate genes from various Cuphea species encoding enzymes involved in a fatty acid biosynthesis pathway, and these microorganisms may be directed in part to the production of medium chain fatty acids.

[0380] In other embodiments, acyl-acyl carrier protein (ACP) thioesterases (TEs) derived from various species including Cuphea hookeriana, Cuphea palustris, Umbellularia californica, and Cinnamomum camphora may be over-expressed in such microorganisms as E. coli, wherein the specific activity for the formation of each medium chain fatty acid, namely caprylate (C8), caprate (C10), laurate (C12), myristate (C14), and palmitate (C16), is improved over the wild type. Certain embodiments may include other enzyme components involved in fatty acid biosynthesis as known to a person skilled in the art, including, but not limited to, ACP and .beta.-ketoacyl ACP synthase (KAS) IV.

[0381] Microbial systems and isolated microorganisms of the present application may also incorporate fatty aldehyde dehydrogenases to reduce fatty acids to fatty aldehydes. Merely by way of explanation, the conversion of fatty acids to fatty aldehydes may be catalyzed by medium and/or long chain fatty aldehyde dehydrogenases isolated from various suitable organisms. Certain embodiments may incorporate, for example, a fatty aldehyde dehydrogenase derived from Vibrio harveyi.

[0382] Microbial systems and isolated microorganisms of the present application may also incorporate one or more enzymes that catalyze the conversion of fatty aldehydes to biofuels such as n-alkanes, including, for example, enzymes comprising an enzymatic long chain alcohol reduction pathway. Certain embodiments may incorporate genes from various other sources that encode enzymes capable of catalyzing the reduction and dehydration of fatty acids to biofuels, such as alkanes. For example, bacterial strain HD-1 is able to produce biofuels, such as n-alkanes, with various chain lengths, and also produces both odd and even numbered alkanes. Certain embodiments of the microbial systems and recombinant microorganisms provided herein may incorporate the HD-1 genes encoding the enzymes involved in this pathway.

[0383] Other embodiments may incorporate redesigned or de novo designed enzymes for this reduction pathway. For example, embodiments of the present invention may include a redesigned isocitrate dehydrogenase, which may catalyze the formation of 2-carboxy-1-alcohols. In certain embodiments, the 2-carboxy-1-alcohols may be sequentially reduced and dehydrated to form 2-carboxy-alkanes, which may be catalyzed by redesigned enzymes involved in fatty acid metabolism. The 2-carboxy-alkanes can be reduced to the corresponding aldehydes by an aldehyde dehydrogenase and then decarbonylated to form n-alkanes by the redesigned decarbonylase as discussed below. Certain embodiments of these microbial systems may produce either even numbered n-alkanes, odd numbered n-alkanes, or both.

[0384] Certain embodiments of the present application may incorporate the genes encoding enzymes catalyzing decarbonylation, or an enzymatic decarbonylation pathway. Merely by way of example, the green colonial alga Botryococcus braunii, race A, produces linear odd-numbered C27, C29, and C31 hydrocarbons that total up to 32% of the alga's dry weight. Microsomal preparations of this organism have decarbonylation activity. This decarbonylase from B. braunii culture is a cobalt-protoporphyrin IX containing enzyme. Certain microbial systems or isolated microorganisms may incorporate the gene encoding fatty aldehyde decarbonylase from Botryococcus braunii.

[0385] Other embodiments may include redesigned decarbonylase enzymes, for example, wherein the N-terminal membrane sequence is substituted. By way of explanation, the functional activity of a similar enzyme, cytochrome P450 containing Fe-protoporphyrin IX (heme), is improved by substituting the N-terminal membrane-associated sequence, and the decarbonylases of the present microbial systems may comprise similar substitutions or improvements.

[0386] Other embodiments may incorporate the genes encoding a Co-porphyrin synthase. By way of explanation, decarbonylase enzymes may use Co-protoporphyrin IX as a co-factor, and Clostridium tetanomorphum is able to incorporate cobalt into incubated protoporphyrin IX. Certain embodiments may incorporate the Co-porphyrin synthase from Clostridium tetanomorphum, or from other suitable microorganisms. Other embodiments may incorporate de novo designed decarbonylation enzymes using inorganic metals such as Co.sup.2+, Fe.sup.2+, and Ni.sup.2+ as catalysts.

[0387] Certain embodiments may comprise genes encoding the enzymes responsible for the formation of alkenes, or an enzymatic decarboxylation pathway. These genes may be derived or isolated from various sources, such as higher plants and insects. For example, higher plants such as germinating safflower (Carthamus tinctorius L.) produce a number of odd numbered 1-alkenes, including 1-pentadecene, 1-heptadecene, 1,8-heptadecadiene and 1,8,11-heptadecatriene besides about 80-90% 1,8,11,14-heptadecatetraene by decarboxylation from their corresponding fatty acids. Certain embodiments may incorporate the genes from higher plants such as Carthamus tinctorius.

[0388] Other embodiments may incorporate the genes encoding the enzymes responsible for the formation of alkenes (e.g., an enzymatic decarboxylation pathway) from microorganisms, including, but not limited to, bacterial strain HD-1. By way of explanation, bacterial strain HD-1 produces n-alkenes in addition to n-alkanes.

[0389] Other embodiments may incorporate the genes encoding de novo designed enzymes for an enzymatic decarboxylation pathway. For example, these redesigned enzymes convert .beta.-hydroxy fatty acids to n-alkenes. The first step is catalyzed by a redesigned kinase, which catalyzes the phosphorylation of the .beta.-hydroxy group. A redesigned mevalonate diphosphate decarboxylase then catalyzes the formation of n-alkenes, such as n-1-alkene.

[0390] Any microorganism may be utilized according to the present invention. In certain aspects, a microorganism is a eukaryotic or prokaryotic microorganism. In certain aspects, a microorganism is a yeast, such as S. cerevisiae. In certain aspects, a microorganism is a bacterium, such as a gram-positive bacterium or a gram-negative bacterium. Given its rapid growth rate, well-understood genetics, the variety of available genetic tools, and its capability in producing heterologous proteins, genetically modified E. coli may be used in certain embodiments of a microbial system as described herein, whether for the degradation and metabolism of a polysaccharide, such as alginate or pectin, or the formation or biosynthesis of commodity chemicals, such as biofuels.

[0391] Other microorganisms may be used according to the present invention, based in part on the compatibility of enzymes and metabolites to host organisms. For example, other organisms such as Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium, Alcaligenes, Ananas comosus (M), Arthrobacter, Aspergillus niger, Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus sojae, Aspergillus usamii, Bacillus alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus clausii, Bacillus lentus, Bacillus licheniformis, Bacillus macerans, Bacillus stearothermophilus, Bacillus subtilis, Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia, Candida cylindracea, Candida rugosa, Carica papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium acetobutylicum, Clostridium thermocellum, Corynebacterium (glutamicum), Corynebacterium efficiens, Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter, Gluconacetobacter, Haloarcula, Humicola insolens, Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kluyveromyces, Kluyveromyces fragilis, Kluyveromyces lactis, Kocuria, Lactlactis, Lactobacillus, Lactobacillus fermentum, Lactobacillus sake, Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis, Methanolobus siciliae, Methanogenium organophilum, Methanobacterium bryantii, Microbacterium imperiale, Micrococcus lysodeikticus, Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium, Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus, Pediococcus halophilus, Penicillium, Penicillium camemberti, Penicillium citrinum, Penicillium emersonii, Penicillium roqueforti, Penicillium lilacinum, Penicillium multicolor, Paracoccus pantotrophus, Propionibacterium, Pseudomonas, Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus, Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar, Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus oligosporus, Rhodococcus, Saccharomyces cerevisiae, Sclerotinia libertiana, Sphingobacterium multivorum, Sphingobium, Sphingomonas, Streptococcus, Streptococcus thermophilus Y-1, Streptomyces, Streptomyces griseus, Streptomyces lividans, Streptomyces murinus, Streptomyces rubiginosus, Streptomyces violaceoruber, Streptoverticillium mobaraense, Tetragenococcus, Thermus, Thiosphaera pantotropha, Trametes, Trichoderma, Trichoderma longibrachiatum, Trichoderma reesei, Trichoderma viride, Trichosporon penicillatum, Vibrio alginolyticus, Xanthomonas, yeast, Zygosaccharomyces rouxii, Zymomonas, and Zymomonas mobilis, may be utilized as recombinant microorganisms provided herein, and, thus, may be utilized according to the various methods of the present invention.

[0392] The following Examples are offered by way of illustration, not limitation.

EXAMPLES

Example 1

Engineering E. coli to Grow on Alginate as a Sole Source of Carbon

[0393] Wild type E. coli cannot use alginate polymer or degraded alginate as its sole carbon source (see FIG. 4). Vibrio splendidus, however, is known to be able to metabolize alginate to support growth. To generate recombinant E. coli that use degraded alginate as its sole carbon source, a Vibrio splendidus fosmid library was constructed and cloned into E. coli.

[0394] To prepare the Vibrio splendidus fosmid library, genomic DNA was isolated from Vibrio splendidus B01 (gift from Dr. Martin Polz, MIT) using the DNeasy Blood and Tissue Kit (Qiagen, Valencia, Calif.). A fosmid library was then constructed using the Copy Control Fosmid Library Production Kit (Epicentre, Madison, Wis.). This library consisted of random genomic fragments of approximately 40 kb inserted into the vector pCC1FOS (Epicentre, Madison, Wis.).

[0395] The fosmid library was packaged into phage, and E. coli DH10B cells harboring a pDONR221 plasmid (Invitrogen, Carlsbad, Calif.) carrying certain Vibrio splendidus genes (V12B01.sub.--02425 to V12B01.sub.--02480; encoding a type II secretion apparatus; see SEQ ID NO:1) were transfected with the phage library. This secretome region encodes a type II secretion apparatus derived from Vibrio splendidus, which was cloned into a pDONR221 plasmid and introduced into E. coli strain DH10B (see Example 1).

[0396] Transformants were selected for chloramphenicol resistance and then screened for their ability to grow on degraded alginate media. Degraded alginate media was prepared by incubating 2% alginate (Sigma-Aldrich, St. Louis, Mo.) in 10 mM Na-phosphate buffer, 50 mM KCl, 400 mM NaCl with alginate lyase from Flavobacterium sp. (Sigma-Aldrich, St. Louis, Mo.) at room temperature for at least one week. This degraded alginate was diluted to a concentration of 0.8% to make growth media that had a final concentration of 1.times.M9 salts, 2 mM MgSO.sub.4, 100 .mu.M CaCl.sub.2, 0.007% leucine, 0.01% casamino acids, 1.5% NaCl (this includes all sources of sodium: M9, diluted alginate and added NaCl).
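
The dilution of the 2% degraded alginate stock to the 0.8% working concentration follows the usual C1 x V1 = C2 x V2 relation. The following is a minimal Python sketch of that arithmetic; the function name and the 1000 ml batch size are illustrative assumptions and are not taken from the protocol.

```python
def stock_volume_ml(stock_pct, final_pct, final_volume_ml):
    """Volume of stock needed so the final mixture has final_pct (C1*V1 = C2*V2)."""
    if not 0 < final_pct <= stock_pct:
        raise ValueError("final concentration must be positive and no higher than the stock")
    return final_pct * final_volume_ml / stock_pct


# 2% degraded alginate stock diluted to 0.8%; the 1000 ml batch size is assumed.
volume = stock_volume_ml(stock_pct=2.0, final_pct=0.8, final_volume_ml=1000.0)
fold = 2.0 / 0.8
print(f"{volume:.0f} ml of stock per 1000 ml of medium ({fold:.1f}-fold dilution)")
```

The remaining volume would be made up by the other medium components (M9 salts, MgSO.sub.4, CaCl.sub.2, leucine, casamino acids, and NaCl) at their stated final concentrations.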

[0397] One fosmid-containing E. coli clone was isolated that grew well on this media. The fosmid DNA from this clone was isolated and prepared using the FosmidMAX DNA Purification Kit (Epicentre, Madison, Wis.). This isolated fosmid was transferred back into DH10B cells, and these cells were tested for the ability to grow on alginate.

[0398] The results are illustrated in FIG. 4, which shows that certain fosmid-containing E. coli clones are capable of growing on alginate as a sole source of carbon. Agrobacterium tumefaciens provides a positive control (see hatched circles). As a negative control, E. coli DH10B cells are not capable of growing on alginate (see immediate left of positive control).

[0399] These results also demonstrate that the sequences contained within this Vibrio splendidus derived fosmid clone are sufficient to confer on E. coli the ability to grow on degraded alginate as a sole source of carbon. Accordingly, the type II secretion machinery sequences contained within the pDONR221 vector (i.e., SEQ ID NO:1), which was harbored by the original DH10B cells, were not necessary for growth on degraded alginate.

[0400] The isolated fosmid sufficient to confer growth on alginate as a sole source of carbon was sequenced by Elim Biopharmaceuticals (Hayward, Calif.) using the following primers: Uni R3-GGGCGGCCGCAAGGGGTTCGCGTTGGCCGA (SEQ ID NO:147) and PCC1FOS_uni_F-GGAGAAAATACCGCATCAGGCG (SEQ ID NO:148). Sequencing showed that the vector contained a genomic DNA section that contained the full length genes V12B01.sub.--24189 to V12B01.sub.--24249 (see SEQ ID NOS:2-64). SEQ ID NO:2 shows the nucleotide sequence of the entire region between V12B01.sub.--24189 and V12B01.sub.--24249. SEQ ID NOS:3-64 show the individual putative genes contained within SEQ ID NO:2. In this sequence, there is a large gene before V12B01.sub.--24189 that is truncated in the fosmid clone. The large gene V12B01.sub.--24184 encodes a putative protein with similarity to autotransporters and belongs to COG3210, which is a cluster of orthologous proteins that include large exoproteins involved in heme utilization or adhesion. In the fosmid clone, V12B01.sub.--24184 is N-terminally truncated such that the first 5893 bp are missing from the predicted open reading frame (which is predicted to contain 22889 bp in total).

Example 2

Engineering E. coli to Grow on Pectin as a Sole Source of Carbon

[0401] Wild type E. coli is not capable of growing on pectin, di-, or tri-galacturonates as a sole source of carbon. To identify the minimal components to confer on E. coli the capability of growing on pectin, di- and/or tri-galacturonates as a sole source of carbon, an E. coli strain BL21(DE3) harboring both the pBBRGal3P plasmid and the pTrcogl-kdgR plasmid was engineered and tested for the ability to grow on these polysaccharides.

[0402] The pBBRGal3P plasmid was engineered to contain a certain genomic region of Erwinia carotovora subsp. Atroseptica SCRI 1043, comprising several genes (kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, and paeX) encoding certain enzymes (kduI, kduD, ogl, pelW, and paeX), transporters (togM, togN, togA, togB, and kdgM), and regulatory proteins (kdgR) responsible for the degradation of di- and trigalacturonate. SEQ ID NO:65 shows the nucleotide sequence of the kdgF-paeX region from Erwinia carotovora subsp. Atroseptica SCRI1043.

[0403] To construct this plasmid, the DNA sequence encoding kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, paeX, ogl, and kdgR of Erwinia carotovora subsp. Atroseptica SCRI 1043 was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 6 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGGGATCCAAGTTGCAGGATATGACGAAAGCG-3') (SEQ ID NO:149) and reverse (5'-GCTCTAGA AGATTATCCCTGTCTGCGGAAGCGG-3') (SEQ ID NO:150) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Erwinia carotovora subsp. Atroseptica SCRI 1043 genome (ATCC) in 50 .mu.l.

[0404] The vector pBBR1MCS-2 was then amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2.5 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-GCTCTAGA GGGGTGCCTAATGAGTGAGCTAAC-3') (SEQ ID NO:151) and reverse (5'-CGGGATCC GCGTTAATATTTTGTTAAAATTCGC-3') (SEQ ID NO:152) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBBR1MCS-2 in 50 .mu.l. Both amplified DNA fragments were digested with BamHI and XbaI and ligated.
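
As a quick consistency check on the cloning strategy described above, the primers can be scanned for the BamHI and XbaI recognition sequences. The sketch below is illustrative only: the primer sequences are copied from SEQ ID NOS:149-152 with internal spaces removed, the recognition sequences (GGATCC for BamHI, TCTAGA for XbaI) are the standard ones, and the dictionary labels and function name are hypothetical.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}

# Primers from the text, internal spaces removed; labels are illustrative.
PRIMERS = {
    "insert_forward (SEQ ID NO:149)": "CGGGATCCAAGTTGCAGGATATGACGAAAGCG",
    "insert_reverse (SEQ ID NO:150)": "GCTCTAGAAGATTATCCCTGTCTGCGGAAGCGG",
    "vector_forward (SEQ ID NO:151)": "GCTCTAGAGGGGTGCCTAATGAGTGAGCTAAC",
    "vector_reverse (SEQ ID NO:152)": "CGGGATCCGCGTTAATATTTTGTTAAAATTCGC",
}


def sites_in(primer):
    """Return the names of the enzymes whose recognition sequence occurs in the primer."""
    seq = primer.upper()
    return [name for name, site in SITES.items() if site in seq]


for label, seq in PRIMERS.items():
    print(label, sites_in(seq))
```

Both recognition sequences are palindromic, so a single-strand search is enough. Each primer carries exactly one of the two sites near its 5' end, which is consistent with directional ligation of the BamHI/XbaI-digested insert and vector fragments.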

[0405] The pTrcogl-kdgR plasmid was engineered to contain certain genomic regions of Erwinia carotovora subsp. Atroseptica SCRI 1043, comprising two genes (ogl and kdgR) encoding an enzyme (ogl) and a regulatory protein (kdgR) responsible for degradation of di- and trigalacturonate. SEQ ID NO:66 shows the nucleotide sequence of ogl-kdgR from Erwinia carotovora subsp. Atroseptica SCRI1043.

[0406] To prepare this construct, the DNA sequence encoding ogl and kdgR of Erwinia carotovora subsp. Atroseptica SCRI 1043 was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-GCTCTAGA GTTTATGTCGCACCCGCCGTTGG-3') (SEQ ID NO:153) and reverse (5'-CCCAAGC TTAGAAAGGGAAATTGTGGTAGCCC-3') (SEQ ID NO:154) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Erwinia carotovora subsp. Atroseptica SCRI 1043 genome (ATCC) in 50 .mu.l. The amplified DNA fragment was digested with XbaI and HindIII and ligated into pTrc99A pre-digested with the same restriction enzymes.

[0407] The plasmids pBBRGal3P and pTrcogl-kdgR were co-transformed into E. coli strain BL21(DE3). A single colony was inoculated into LB media containing 50 .mu.g/ml kanamycin and 100 .mu.g/ml ampicillin, and the culture was grown in an incubation shaker at 200 rpm and 37.degree. C. When the culture reached an OD.sub.600nm of 0.6, 500 .mu.l of culture was transferred to an Eppendorf tube and centrifuged to pellet the cells. The cells were resuspended in 50 .mu.l of M9 media containing 2 mM MgSO.sub.4, 100 .mu.M CaCl.sub.2, and 0.4% di- or trigalacturonate, and 5 .mu.l of this suspension was inoculated into 500 .mu.l of fresh M9 media containing 2 mM MgSO.sub.4, 100 .mu.M CaCl.sub.2, and 0.4% di- or trigalacturonate. The culture was grown in an incubation shaker at 200 rpm and 37.degree. C.

[0408] The results in FIG. 5A show that these two plasmids were sufficient to give E. coli the ability to grow on di- and trigalacturonate, but not pectin, as a sole source of carbon. In particular, these results show that the regions kdgF-paeX and ogl-kdgR were sufficient to confer this ability on E. coli.

[0409] Based on the information obtained from the above experiments, it was considered whether the introduction of pectate lyase, pectate acetylesterase, and methylesterase might confer on E. coli the capability of growing on pectin. To test this hypothesis, E. coli strain DH5.alpha. bacterial cells were engineered to contain both the pROU2 plasmid and the pPEL74 plasmid.

[0410] The pROU2 plasmid contains certain genomic regions of Erwinia chrysanthemi, comprising several genes (kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, paeX, ogl, and kdgR) encoding enzymes (kduI, kduD, ogl, pelW, and paeX), transporters (togM, togN, togA, togB, and kdgM), and regulatory proteins (kdgR) responsible for degradation of di- and trigalacturonate.

[0411] The pPEL74 plasmid contains certain genomic regions of Erwinia chrysanthemi, comprising several genes (pelA, pelE, paeY, and pem) encoding pectate lyases (pelA and pelE), pectin acetylesterases (paeY), and pectin methylesterase (pem).

[0412] As shown in FIG. 5B, E. coli DH5.alpha. engineered with pROU2 and pPEL74 was able to grow on pectin as a sole source of carbon, showing that the genes contained within these plasmids are sufficient to confer this property on an organism that is otherwise incapable of growing on pectin as a sole source of carbon.

Example 3

In Vitro Conversion of Alginate to Pyruvate and Glyceraldehyde-3-Phosphate

[0413] An enzyme mixture containing all of the enzymes required for alginate degradation and metabolism was investigated for its ability to produce pyruvate from alginate. In addition, various novel alcohol dehydrogenases (ADHs), such as ADH1-12 (see SEQ ID NOS:69-92), isolated from Agrobacterium tumefaciens, were tested for their ability to catalyze either DEHU or mannuronate hydrogenation.

[0414] A simplified metabolic pathway for alginate degradation and metabolism is shown in FIG. 2. Alginate can be degraded by at least two different methodologies: enzymatic and chemical methodologies.

[0415] In enzymatic degradation, the degradation of alginate is catalyzed by a family of enzymes called alginate lyases. For this experiment, Atu3025 was used. Atu3025 is an exolytically acting enzyme and yields DEHU from alginate polymer. DEHU is converted to the common hexuronate metabolite, KDG. This reaction is catalyzed by alcohol dehydrogenases (e.g., DEHU hydrogenases).

[0416] Chemical degradation catalyzed by an acid solution, such as formate, yields the monosaccharide mannuronate. Mannuronate is then converted to mannonate, which is catalyzed by enzymes with mannonate dehydrogenase (mannuronate reductase) activity. In bacteria, mannonate dehydratase (UxuA) catalyzes the dehydration of mannonate to form KDG.

[0417] KDG is readily metabolized to form pyruvate and glyceraldehyde-3-phosphate (G3P). KDG is first phosphorylated to KDG-6-phosphate (KDGP), which is catalyzed by KDG kinase, and then broken down to pyruvate and G3P, which is catalyzed by KDGP aldolase.
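
The enzymatic and chemical routes described above converge at KDG and share the final two steps. As an illustrative aid only, the steps can be written out as a small data structure and walked from alginate to the end products. The variable and function names below are hypothetical, and the mannonate-to-KDG step follows the mannonate dehydratase (UxuA) activity named in the text.

```python
# Each step is (substrate, products, catalyst), using the names given in the text.
ENZYMATIC_ROUTE = [
    ("alginate", ("DEHU",), "alginate lyase (e.g., Atu3025)"),
    ("DEHU", ("KDG",), "DEHU hydrogenase"),
]
CHEMICAL_ROUTE = [
    ("alginate", ("mannuronate",), "acid solution (e.g., formate)"),
    ("mannuronate", ("mannonate",), "mannuronate reductase"),
    ("mannonate", ("KDG",), "mannonate dehydratase (UxuA)"),
]
COMMON_STEPS = [
    ("KDG", ("KDGP",), "KDG kinase"),
    ("KDGP", ("pyruvate", "G3P"), "KDGP aldolase"),
]


def end_products(route):
    """Walk a route from alginate through the common steps and return the final products."""
    current = ("alginate",)
    for substrate, products, _catalyst in route + COMMON_STEPS:
        if substrate not in current:
            raise ValueError(f"{substrate} is not available at this step")
        current = products
    return current


print(end_products(ENZYMATIC_ROUTE))
print(end_products(CHEMICAL_ROUTE))
```

Under this representation both routes end at pyruvate and G3P, which is the conversion tested in vitro in this Example.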

[0418] Preparation of oligoalginate lyase Atu3025 derived from Agrobacterium tumefaciens C58. pETAtu3025 was constructed based on the pET29 plasmid backbone (Novagen). The oligoalginate lyase Atu3025 was amplified by PCR: 98.degree. C. for 10 sec, 55.degree. C. for 15 sec, and 72.degree. C. for 60 sec, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer, 2 mM dNTP, 0.5 .mu.M forward (5'-GGAATTCCATATGCGTCCCTCTGCCCCGGCC-3') (SEQ ID NO:155) and reverse (5'-CGGGATCCTTAGAACTGCTTGGGAAGGGAG-3') (SEQ ID NO:156) primers, 2.5 U Phusion DNA polymerase (Finnzymes), and an aliquot of Agrobacterium tumefaciens C58 (gift from Professor Eugene Nester, University of Washington) cells as a template in a total volume of 100 .mu.l. The amplified fragment was digested with NdeI and BamHI and ligated into pET29 pre-digested with the same enzymes using T4 DNA ligase to form pETAtu3025. The constructed plasmid was sequenced (Elim Biopharmaceuticals) and the DNA sequence of the insert was confirmed. The nucleotide sequence of the Atu3025 insert is provided in SEQ ID NO:67. The polypeptide sequence encoded by the Atu3025 insert is provided in SEQ ID NO:68.

[0419] The pETAtu3025 plasmid was transformed into Escherichia coli strain BL21(DE3). A colony of BL21(DE3) containing pETAtu3025 was inoculated into 50 ml of LB media containing 50 .mu.g/ml kanamycin (Km.sup.50). This strain was grown in an orbital shaker at 200 rpm and 37.degree. C. IPTG (0.2 mM) was added to the culture when the OD.sub.600nm reached 0.6, and the induced culture was grown in an orbital shaker at 200 rpm and 20.degree. C. Twenty-four hours after induction, the cells were harvested by centrifugation at 4,000 rpm.times.g for 10 min, and the pellet was resuspended in 2 ml of Bugbuster (Novagen) containing 10 .mu.l of Lysonase.TM. Bioprocessing Reagent (Novagen). The solution was again centrifuged at 4,000 rpm.times.g for 10 min and the supernatant was obtained.

[0420] Construction of pETADH1 through pETADH12. DNA sequences of ADH1-12 of Agrobacterium tumefaciens C58 were amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (Table 1) and reverse (Table 1) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Agrobacterium tumefaciens C58 genome in 50 .mu.l. Each amplified DNA fragment was digested with NdeI and BamHI and ligated into pET28 pre-digested with the same restriction enzymes. For DNA sequences with an internal NdeI or BamHI site, the front and bottom half sequences of each ADH were first amplified using the described method. The resulting two DNA fragments were gel purified and spliced by overlapping PCR.

TABLE 1
Primers used to amplify ADH1-12 from Agrobacterium tumefaciens C58.

Name | A. tumefaciens C58 | Forward Primer | Reverse Primer
ADH1 | Atu1557 | GGAATTCCATATGTTCACAACGTCCGCCTA (SEQ ID NO: 276) | GCTTGACGGCCATGTGGCCGAGGCCGC (SEQ ID NO: 277)
 | | GCGGCCTCGGCCACATGGCCGTCAAGC (SEQ ID NO: 278) | CGGGATCCTTAGGCGGCCTTCTGGCGCG (SEQ ID NO: 279)
ADH2 | Atu2022 | GGAATTCCATATGGCTATTGCAAGAGGTTA (SEQ ID NO: 280) | CGGGATCCTTAAGCGTCGAGCGAGGCCA (SEQ ID NO: 281)
ADH3 | Atu0626 | GGAATTCCATATGACTAAAACAATGAAGGC (SEQ ID NO: 282) | CACCGGGGCCGGGGTCCGGTATTGCCA (SEQ ID NO: 283)
 | | TGGCAATACCGGACCCCGGCCCCGGTG (SEQ ID NO: 284) | CGGGATCCTTAGGCGGCGAGATCCACGA (SEQ ID NO: 285)
ADH4 | Atu5240 | GGAATTCCATATGACCGGGGCGAACCAGCC (SEQ ID NO: 286) | ATAGCCGCTCATACGCCTCGGTTGCCT (SEQ ID NO: 287)
 | | AGGCAACCGAGGCGTATGAGCGGCTAT (SEQ ID NO: 288) | CGGGATCCTTAAGCGCCGTGCGGAAGGA (SEQ ID NO: 289)
ADH5 | Atu3163 | GGAATTCCATATGACCATGCATGCCATTCA (SEQ ID NO: 290) | CGGGATCCTTATTCGGCTGCAAATTGCA (SEQ ID NO: 291)
ADH6 | Atu2151 | GGAATTCCATATGCGCGCGCTTTATTACGA (SEQ ID NO: 292) | CGGGATCCTTATTCGAACCGGTCGATGA (SEQ ID NO: 293)
ADH7 | Atu2814 | GGAATTCCATATGCTGGCGATTTTCTGTGA (SEQ ID NO: 294) | CGGGATCCTTATGCGACCTCCACCATGC (SEQ ID NO: 295)
ADH8 | Atu5447 | GGAATTCCATATGAAAGCCTTCGTCGTCGA (SEQ ID NO: 296) | CGGGATCCTTAGGATGCGTATGTAACCA (SEQ ID NO: 297)
ADH9 | Atu4087 | GGAATTCCATATGAAAGCGATTGTCGCCCA (SEQ ID NO: 298) | CGGGATCCTTAGGAAAAGGCGATCTGCA (SEQ ID NO: 299)
ADH10 | Atu4289 | GGAATTCCATATGCCGATGGCGCTCGGGCA (SEQ ID NO: 300) | CGGGATCCTTAGAATTCGATGACTTGCC (SEQ ID NO: 301)
ADH11 | Atu3027 | GGAATTCCATATGAAACATTCTCAGGACAA (SEQ ID NO: 302) | GGGCGCCGATCATGTGGTGCGTTTCCG (SEQ ID NO: 303)
 | | CGGAAACGCACCACATGATCGGCGCCC (SEQ ID NO: 304) | CGGGATCCTTATGCCATACGTTCCATAT (SEQ ID NO: 305)
ADH12 | Atu3026 | GGAATTCCATATGCAGCGTTTTACCAACAG (SEQ ID NO: 306) | CGGGATCCTTAGGAAAACAGGACGCCGC (SEQ ID NO: 307)
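
For the genes amplified in two halves and spliced by overlapping PCR, the reverse primer of the front half and the forward primer of the back half are expected to anneal to each other. A minimal sketch of that check, using the internal primers listed in Table 1 for ADH1 and ADH3, is given below. The helper names are hypothetical, and the check shows only that the primer pairs are exact reverse complements of one another, not how the splicing reaction itself was run.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Internal primers from Table 1: (front-half reverse primer, back-half forward primer).
INTERNAL_PRIMERS = {
    "ADH1": ("GCTTGACGGCCATGTGGCCGAGGCCGC", "GCGGCCTCGGCCACATGGCCGTCAAGC"),  # SEQ ID NOS:277, 278
    "ADH3": ("CACCGGGGCCGGGGTCCGGTATTGCCA", "TGGCAATACCGGACCCCGGCCCCGGTG"),  # SEQ ID NOS:283, 284
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_overlap_pair(front_reverse, back_forward):
    """True if the two internal primers are exact reverse complements of each other."""
    return reverse_complement(front_reverse) == back_forward.upper()


for gene, (front_reverse, back_forward) in INTERNAL_PRIMERS.items():
    print(gene, is_overlap_pair(front_reverse, back_forward))
```

Neither internal primer pair contains the NdeI (CATATG) or BamHI (GGATCC) recognition sequence, which is consistent with the stated purpose of amplifying the two halves separately for genes that carry an internal site.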

Expression and Purification of ADH1-10.

[0421] All plasmids were transformed into Escherichia coli strain BL21(DE3). Single colonies of BL21(DE3) containing the respective alcohol dehydrogenase (ADH) genes were inoculated into 50 ml of LB media containing 50 .mu.g/ml kanamycin (Km.sup.50). These strains were grown in an orbital shaker at 200 rpm and 37.degree. C. IPTG (0.2 mM) was added to each culture when the OD.sub.600nm reached 0.6, and the induced culture was grown in an orbital shaker at 200 rpm and 20.degree. C. Twenty-four hours after induction, the cells were harvested by centrifugation at 4,000 rpm.times.g for 10 min, and the pellet was resuspended in 2 ml of Bugbuster (Novagen) containing 10 .mu.l of Lysonase.TM. Bioprocessing Reagent (Novagen). The solution was again centrifuged at 4,000 rpm.times.g for 10 min and the supernatant was obtained.

Preparation of .about.2% DEHU Solution by Enzymatic Degradation.

[0422] DEHU solution was enzymatically prepared. A 2% alginate solution was prepared by adding 10 g of low viscosity alginate to 500 ml of 20 mM Tris-HCl (pH7.5) solution. Approximately 10 mg of alginate lyase derived from Flavobacterium sp. (purchased from Sigma-Aldrich) was added to the alginate solution. 250 ml of this solution was then transferred to another bottle, and the E. coli cell lysate containing Atu3025 prepared in the above section was added. The alginate degradation was carried out at room temperature overnight. The resulting products were analyzed by thin layer chromatography, and DEHU formation was confirmed.

Preparation of D-Mannuronate Solution by Chemical Degradation.

[0423] D-mannuronate solution was chemically prepared based on the protocol previously described by Spoehr (Archive of Biochemistry, 14: pp 153-155). Fifty milligrams of alginate was dissolved in 800 .mu.L of 90% formate. This solution was incubated at 100.degree. C. overnight. Formate was then evaporated and the residual substances were washed with absolute ethanol twice. The residual substance was again dissolved in absolute ethanol and filtered. Ethanol was evaporated, the residual substances were resuspended in 20 mL of 20 mM Tris-HCl (pH 8.0), and the solution was filtered to make a D-mannuronate solution. This D-mannuronate solution was diluted 5-fold and used for the assay.

Assay for DEHU Hydrogenase.

[0424] To identify DEHU hydrogenase, an NADPH-dependent DEHU hydrogenation assay was performed. 20 .mu.l of prepared cell lysate containing each ADH was added to 160 .mu.l of 20-fold diluted DEHU solution prepared in the above section. 20 .mu.l of 2.5 mg/ml NADPH solution (20 mM Tris-HCl, pH 8.0) was added to initiate the hydrogenation reaction, as a preliminary study using cell lysate of A. tumefaciens C58 had shown that DEHU hydrogenation requires NADPH as a co-factor. The consumption of NADPH was monitored by absorbance at 340 nm for 30 min using the kinetic mode of a ThermoMAX 96 well plate reader (Molecular Devices). E. coli cell lysate containing alcohol dehydrogenase (ADH) 10 lacking a portion of the N-terminal domain was used in a control reaction mixture.
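
A decrease in absorbance at 340 nm can be converted into an NADPH consumption rate with the Beer-Lambert law. The sketch below is illustrative only. It uses the commonly cited molar absorptivity of NADPH at 340 nm (6220 M^-1 cm^-1); the absorbance slope and the 0.5 cm optical path are hypothetical values, since the effective path length in a 96 well plate depends on the fill volume and plate geometry and is not stated in the text.

```python
NADPH_EXTINCTION_M_CM = 6220.0  # molar absorptivity of NADPH at 340 nm, in M^-1 cm^-1


def nadph_consumption_rate_um_per_min(delta_a340_per_min, path_length_cm):
    """Convert an A340 slope (per minute) into NADPH consumed, in micromolar per minute.

    A falling absorbance (negative slope) gives a positive consumption rate.
    """
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    molar_per_min = -delta_a340_per_min / (NADPH_EXTINCTION_M_CM * path_length_cm)
    return molar_per_min * 1e6


# Hypothetical reading: A340 falls by 0.010 per minute in a well with an assumed 0.5 cm path.
rate = nadph_consumption_rate_um_per_min(delta_a340_per_min=-0.010, path_length_cm=0.5)
print(f"NADPH consumption: {rate:.2f} micromolar per minute")
```

Comparing this rate against the control lysate (ADH10 lacking a portion of its N-terminal domain) would separate enzyme-dependent NADPH consumption from background.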

Assay for D-Mannuronate Hydrogenase.

[0425] To identify D-mannuronate hydrogenase, an NADPH-dependent D-mannuronate hydrogenation assay was performed. 20 .mu.l of prepared cell lysate containing each ADH was added to 160 .mu.l of D-mannuronate solution prepared in the above section. 20 .mu.l of 2.5 mg/ml NADPH solution (20 mM Tris-HCl, pH 8.0) was added to initiate the hydrogenation reaction. The consumption of NADPH was monitored by absorbance at 340 nm for 30 min using the kinetic mode of a ThermoMAX 96 well plate reader (Molecular Devices). E. coli cell lysate containing alcohol dehydrogenase (ADH) 10 lacking a portion of the N-terminal domain was used in a control reaction mixture.

Construction of pETkdgK.

[0426] The DNA sequence of kdgK of Escherichia coli encoding 2-keto-deoxy gluconate kinase was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-AGGTACGGTGAAATAA AGGAGG ATATACAT ATGTCCAAAAAGATTGCCGT-3') (SEQ ID NO:157) and reverse (5'-TTTTCCTTTTGCGGCCGCCCCGCTGGCATCGCCTCAC-3') (SEQ ID NO:158) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B genome in 50 .mu.l. The amplified DNA fragment was digested with NdeI and NotI and ligated into pET29 pre-digested with the same restriction enzymes.

Construction of pETkdgA.

[0427] The DNA sequence of kdgA of Escherichia coli encoding 2-keto-deoxy gluconate-6-phosphate aldolase was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-GGCGATGCCAGCGTAA AGGAGG ATATACAT ATGAAAAACTGGAAAACAAG-3') (SEQ ID NO:159) and reverse (5'-TTTTCCTTTTGCGGCCGCCCCAGCTTAGCGCCTTCTA-3') (SEQ ID NO:160) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B genome in 50 .mu.l. The amplified DNA fragment was digested with NdeI and NotI and ligated into pET29 pre-digested with the same restriction enzymes.

Protein Expression and Purification.

[0428] All plasmids (pETAtu3025, pETADH11, pETADH12, pETkdgA, pETkdgK, and pETuxuA) were transformed into Escherichia coli strain BL21(DE3). Single colonies of BL21(DE3) containing the respective plasmids were inoculated into 50 ml of LB media containing 50 .mu.g/ml kanamycin (Km.sup.50). These strains were grown in an orbital shaker at 200 rpm and 37.degree. C. IPTG (0.2 mM) was added to each culture when the OD.sub.600nm reached 0.6, and the induced culture was grown in an orbital shaker at 200 rpm and 20.degree. C. Twenty-four hours after induction, the cells were harvested by centrifugation at 4,000 rpm.times.g for 10 min, and the pellet was resuspended in 2 ml of Bugbuster (Novagen) containing 10 .mu.l of Lysonase.TM. Bioprocessing Reagent (Novagen) and the suggested amount of protease inhibitor cocktail (SIGMA). The solution was again centrifuged at 4,000 rpm.times.g for 10 min and the supernatant was obtained. The supernatant was applied to a Nickel-NTA spin column (Qiagen) to purify His-tagged proteins.

[0429] The results of the assays for DEHU hydrogenase activity and D-mannuronate hydrogenase activity of ADH1-10 are shown in FIGS. 7A and 7B. These results demonstrate that the novel enzymes ADH1 and ADH2 showed significant DEHU hydrogenase activity (FIG. 7A), and that the novel enzymes ADH3, ADH4, and ADH9 showed significant mannuronate hydrogenase activity (FIG. 7B).

In Vitro Pyruvate Formation.

[0430] The reaction mixture contained 1% alginate or .about.0.5% mannuronate, .about.5 .mu.g of purified Atu3026 (ADH12) or Atu3027 (ADH11), and .about.5 .mu.g of purified oligoalginate lyase (Atu3025), UxuA, KdgK, and KdgA, 2 mM ATP, and 0.6 mM NADPH in 20 mM Tris-HCl pH7.0. The reaction was carried out overnight, and pyruvate formation was monitored with the pyruvate assay kit (BioVision, Inc).
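
The cofactor concentrations in this reaction mixture set an upper bound on pyruvate formation. The sketch below is a rough illustration under stated assumptions that are not part of the source: one NADPH (hydrogenation step) and one ATP (KDG kinase step) are consumed per pyruvate formed, as in the pathway outlined above, and neither cofactor is regenerated during the reaction. The function name is hypothetical.

```python
def max_pyruvate_mm(nadph_mm, atp_mm, substrate_mm=None):
    """Upper bound on pyruvate (mM), assuming 1 NADPH and 1 ATP per pyruvate and no regeneration.

    substrate_mm is optional; pass the monomer-equivalent substrate concentration if it is known.
    """
    limits = [nadph_mm, atp_mm]
    if substrate_mm is not None:
        limits.append(substrate_mm)
    if min(limits) < 0:
        raise ValueError("concentrations must be non-negative")
    return min(limits)


# Cofactor concentrations from the reaction mixture: 0.6 mM NADPH and 2 mM ATP.
limit = max_pyruvate_mm(nadph_mm=0.6, atp_mm=2.0)
print(f"Cofactor-limited pyruvate ceiling: {limit} mM")
```

Under these assumptions NADPH, at 0.6 mM, would be the limiting cofactor; the actual yield depends on enzyme activities and on the extent of alginate depolymerization.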

[0431] The results of in vitro pyruvate formation from alginate mediated by enzymatic and chemical degradation are shown in FIG. 6B and FIG. 6C, respectively. As can be seen in these figures, alginate was converted to pyruvate via the isolated enzymes. These results also show that each of Atu3026 (ADH12) and Atu3027 (ADH11) is capable of catalyzing both DEHU hydrogenase and mannuronate hydrogenase reactions.

Example 4

Construction and Biological Activity of Biosynthesis Pathways

Construction of Pathways:

[0432] A propionaldehyde biosynthetic pathway comprising a threonine deaminase (ilvA) gene from Escherichia coli and keto-isovalerate decarboxylase (kivd) from Lactococcus lactis is constructed and tested for the ability to convert L-threonine to propionaldehyde.

[0433] A butyraldehyde biosynthetic pathway comprising a thiolase (atoB) gene from E. coli, .beta.-hydroxy butyryl-CoA dehydrogenase (hbd), crotonase (crt), butyryl-CoA dehydrogenase (bcd), electron transfer flavoprotein A (etfA), and electron transfer flavoprotein B (etfB) genes from Clostridium acetobutylicum ATCC 824, and a coenzyme A-linked butyraldehyde dehydrogenase (ald) gene from Clostridium beijerinckii was constructed in E. coli and tested for the ability to produce butyraldehyde. Also, a coenzyme A-linked alcohol dehydrogenase (adhE2) gene from Clostridium acetobutylicum ATCC 824 was used as an alternative to ald and tested for the ability to produce butanol.
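
The pathway composition described above, with its two alternative terminal enzymes, can be summarized as a small data structure. This is only an organizational sketch: the class, variable, and function names are hypothetical, the organism names are abbreviated, and the listing order follows the text rather than any claim about operon layout or expression order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayGene:
    gene: str
    enzyme: str
    source: str


# Genes shared by both variants of the pathway, as listed in the text.
SHARED_GENES = [
    PathwayGene("atoB", "thiolase", "E. coli"),
    PathwayGene("hbd", "beta-hydroxy butyryl-CoA dehydrogenase", "C. acetobutylicum ATCC 824"),
    PathwayGene("crt", "crotonase", "C. acetobutylicum ATCC 824"),
    PathwayGene("bcd", "butyryl-CoA dehydrogenase", "C. acetobutylicum ATCC 824"),
    PathwayGene("etfA", "electron transfer flavoprotein A", "C. acetobutylicum ATCC 824"),
    PathwayGene("etfB", "electron transfer flavoprotein B", "C. acetobutylicum ATCC 824"),
]

# Terminal enzyme determines the product: ald for butyraldehyde, adhE2 for butanol.
TERMINAL_GENE = {
    "butyraldehyde": PathwayGene("ald", "CoA-linked butyraldehyde dehydrogenase", "C. beijerinckii"),
    "butanol": PathwayGene("adhE2", "CoA-linked alcohol dehydrogenase", "C. acetobutylicum ATCC 824"),
}


def gene_set(product):
    """Return the gene names needed for the requested product."""
    return [g.gene for g in SHARED_GENES] + [TERMINAL_GENE[product].gene]


print(gene_set("butyraldehyde"))
print(gene_set("butanol"))
```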

[0434] An isobutyraldehyde biosynthetic pathway comprising an acetolactate synthase (alsS) from Bacillus subtilis or (als) from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage was optimized for E. coli protein expression) and acetolactate reductoisomerase (ilvC) and 2,3-dihydroxyisovalerate dehydratase (ilvD), genes from E. coli and keto-isovalerate decarboxylase (kivd) from Lactococcus lactis was constructed and tested for the ability to produce isobutyraldehyde, as measured by isobutanal production.

[0435] 3-methylbutyraldehyde and 2-methylbutyraldehyde biosynthesis pathways comprising an acetolactate synthase (alsS) from Bacillus subtilis or (als) from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage was optimized for E. coli protein expression), acetolactate reductoisomerase (ilvC), 2,3-dihydroxyisovalerate dehydratase (ilvD), isopropylmalate synthase (LeuA), isopropylmalate isomerase (LeuC and LeuD), and 3-isopropylmalate dehydrogenase (LeuB) genes from E. coli and keto-isovalerate decarboxylase (kivd) from Lactococcus lactis were constructed and tested for the ability to produce 3-isovaleraldehyde and 2-isovaleraldehyde.

[0436] Phenylacetoaldehyde and 4-hydroxyphenylacetoaldehyde biosynthesis pathways comprising a transketolase (tktA), a 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH), 3-dehydroquinate synthase (aroB), a 3-dehydroquinate dehydratase (aroD), a dehydroshikimate reductase (aroE), a shikimate kinase II (aroL), a shikimate kinase I (aroK), a 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), a chorismate synthase (aroC), a fused chorismate mutase P/prephenate dehydratase (pheA), and a fused chorismate mutase T/prephenate dehydrogenase (tyrA) genes from E. coli, and keto-isovalerate decarboxylase (kivd) from Lactococcus lactis were constructed and tested for the ability to produce phenylacetoaldehyde and/or 4-hydroxyphenylacetoaldehyde.

[0437] A 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and 2-(indole-3)ethanol biosynthesis pathway comprising a transketolase (tktA), a 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH), 3-dehydroquinate synthase (aroB), a 3-dehydroquinate dehydratase (aroD), a dehydroshikimate reductase (aroE), a shikimate kinase II (aroL), a shikimate kinase I (aroK), a 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), a chorismate synthase (aroC), a fused chorismate mutase P/prephenate dehydratase (pheA), and a fused chorismate mutase T/prephenate dehydrogenase (tyrA) genes from E. coli, keto-isovalerate decarboxylase (kivd) from Lactococcus lactis, alcohol dehydrogenase (adh2) from Saccharomyces cerevisiae, indole-3-pyruvate decarboxylase (ipdc) from Azospirillum brasilense, phenylethanol reductase (par) from Rhodococcus sp. ST-10, and benzaldehyde lyase (bal) from Pseudomonas fluorescens was constructed and tested for the ability to produce 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol and/or 2-(indole-3)ethanol.

Construction of pBADButP.

[0438] The DNA sequence encoding hbd, crt, bcd, etfA, and etfB of Clostridium acetobutylicum ATCC 824 was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 3 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCTTAGGAGGATTAGTCATGGAAC-3') (SEQ ID NO:161) and reverse (5'-GCTCTAGA TTATTTTGAATAATCGTAGAAACC-3') (SEQ ID NO:162) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Clostridium acetobutylicum ATCC 824 genome (ATCC) in 50 .mu.l. The amplified DNA fragment was digested with BamHI and XbaI and ligated into pBAD33 pre-digested with the same restriction enzymes.

Construction of pBADButP-atoB.

[0439] The DNA sequence encoding atoB of Escherichia coli DH10B was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-GCTCTAGAGGAGGATATATATATGAAAAATTGTGTCATCGTC-3') (SEQ ID NO:163) and reverse (5'-AA CTGCAGTTAATTCAACCGTTCAATCACC-3') (SEQ ID NO:164) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B genome in 50 .mu.l. Amplified DNA fragment was digested with XbaI and PstI and ligated into pBADButP pre-digested with the same restriction enzymes.
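The cloning step above relies on restriction sites carried in the 5' tails of the primers. As a minimal sketch, the Python below scans the two atoB primers (SEQ ID NO:163 and SEQ ID NO:164) for the recognition sequences of the enzymes used in this section. The `SITES` table and the `sites_in` helper are illustrative names, not part of the disclosure; the recognition sequences are the standard ones for these enzymes, all of which are palindromic, so one strand is sufficient.

```python
# Recognition sequences (5'->3') of restriction enzymes named in this section.
# All are palindromic, so scanning a single strand is enough.
SITES = {
    "SacI": "GAGCTC",
    "XbaI": "TCTAGA",
    "PstI": "CTGCAG",
    "HindIII": "AAGCTT",
    "NotI": "GCGGCCGC",
    "KpnI": "GGTACC",
    "SphI": "GCATGC",
    "NcoI": "CCATGG",
}


def sites_in(primer: str) -> list:
    """Return the sorted names of enzymes whose site occurs in the primer."""
    seq = primer.replace(" ", "").upper()  # tolerate spaces used to set off the tail
    return sorted(name for name, site in SITES.items() if site in seq)


# Primers copied from the paragraph above (SEQ ID NO:163 and 164).
ATOB_FWD = "GCTCTAGAGGAGGATATATATATGAAAAATTGTGTCATCGTC"
ATOB_REV = "AA CTGCAGTTAATTCAACCGTTCAATCACC"

print("forward:", sites_in(ATOB_FWD))
print("reverse:", sites_in(ATOB_REV))
```

A check of this kind can be applied to any primer pair in this section to confirm that the sites in the primer tails agree with the enzymes named for the digestion.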

Construction of pBADatoB-ald.

The DNA sequence encoding atoB of Escherichia coli DH10B and ald from Clostridium beijerinckii were amplified separately by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTC AGGAGGATATATATATGAAAAATTGTGTCATCGTCAGTG-3' (SEQ ID NO:165) for atoB and 5'-GGTTGAATTAAGGAGGATATATATATGAATAAAGACACACTAATACCTAC-3' (SEQ ID NO:166) for ald) and reverse (5'-GTCTTTATTCATATATATATCCTCCTTAATTCAACCGTTCAATCACCATC-3' (SEQ ID NO:146) for atoB and 5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3' (SEQ ID NO:167) for ald) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B and Clostridium beijerinckii genome (ATCC) in 50 .mu.l, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTC AGGAGGATATATATATGAAAAATTGTGTCATCGTCAGTG-3') (SEQ ID NO:168) and reverse (5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3') (SEQ ID NO:169) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and HindIII and ligated into pBADButP pre-digested with the same restriction enzymes.

Construction of pBADButP-atoB-ALD.

[0440] The DNA fragment 1 encoding chloramphenicol acetyltransferase (CAT), P15 origin of replication, araBAD promoter, atoB of Escherichia coli DH10B and ald of Clostridium beijerinckii and the DNA fragment 2 encoding araBAD promoter, hbd, crt, bcd, etfA, and etfB of Clostridium acetobutylicum ATCC 824 were amplified separately by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3' (SEQ ID NO:170) for fragment 1 and 5'-CGGGGTACCACTTTTCATACTCCCGCCATTCAG-3' (SEQ ID NO:274) for fragment 2) and reverse (5'-CGGGGTACCGCGGATACATATTTGAATGTATTTAG-3' (SEQ ID NO:171) for fragment 1 and 5'-AAGGAAAAAAGCGGCCGCGCGGATACATATTTGAATGTATTTAG-3' (SEQ ID NO:172) for fragment 2) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBADatoB-ald and pBADButP in 50 .mu.l, respectively. The amplified DNA fragments were digested with NotI and KpnI and ligated to each other.

Construction of pBADilvCD.

[0441] The DNA fragments encoding ilvC and ilvD of Escherichia coli DH10B were amplified separately by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-GCTCTAGAGGAGGATATATATATGGCTAACTACTTCAATACAC-3') (SEQ ID NO:173) for ilvC and 5'-TGCTGTTGCGGGTTAAGGAGGATATATATATGCCTAAGTACCGTTCCGCC-3' for ilvD) (SEQ ID NO:174) and reverse (5'-AACGGTACTTAGGCATATATATATCCTCCTTAACCCGCAACAGCAATACG-3') (SEQ ID NO:175) for ilvC and 5'-ACATGCATGCTTAACCCCCCAGTTTCGATT-3') (SEQ ID NO:176) for ilvD) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli DH10B genome (ATCC) in 50 .mu.l. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-GCTCTAGAGGAGGATATATATATGGCTAACTACTTCAATACAC-3') (SEQ ID NO:177) and reverse (5'-ACATGCATGCTTAACCCCCCAGTTTCGATT-3') (SEQ ID NO:178) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with XbaI and SphI and ligated into pBAD33 pre-digested with the same restriction enzymes.
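The splicing step above depends on the reverse primer of the upstream gene and the forward primer of the downstream gene sharing a complementary junction. As a minimal sketch, the Python below takes the ilvC reverse primer (SEQ ID NO:175) and the ilvD forward primer (SEQ ID NO:174) and reports the sequence they share on the top strand. The function names are illustrative and not part of the disclosure.

```python
# Check the junction shared by two overlap-extension PCR primers.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_overlap(upstream_rev: str, downstream_fwd: str) -> str:
    """Longest suffix of the upstream fragment's top strand (the reverse
    complement of its reverse primer) that is also a prefix of the
    downstream forward primer."""
    top = reverse_complement(upstream_rev)
    for n in range(min(len(top), len(downstream_fwd)), 0, -1):
        if top.endswith(downstream_fwd[:n]):
            return downstream_fwd[:n]
    return ""


# Primers copied from the paragraph above.
ILVC_REV = "AACGGTACTTAGGCATATATATATCCTCCTTAACCCGCAACAGCAATACG"  # SEQ ID NO:175
ILVD_FWD = "TGCTGTTGCGGGTTAAGGAGGATATATATATGCCTAAGTACCGTTCCGCC"  # SEQ ID NO:174

overlap = junction_overlap(ILVC_REV, ILVD_FWD)
print(len(overlap), overlap)
```

The shared junction spans the end of ilvC, the inserted AGGAGG ribosome binding site with its spacer, and the start of ilvD, which is what allows the two first-round products to anneal and be extended into a single fragment in the second round.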

Construction of pBADals-ilvCD.

[0442] The DNA fragment encoding als of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 of its codon usage optimized for over-expression in E. coli was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGGATAAACAGTATCCGGT-3') (SEQ ID NO:179) and reverse (5'-GCTCTAGATTACAGAATTTGACTCAGGT-3') (SEQ ID NO:180) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETals in 50 .mu.l. The amplified DNA fragment was digested with SacI and XbaI and ligated into pBADilvCD pre-digested with the same restriction enzymes.

Construction of pBADalsS-ilvCD.

[0443] The DNA fragments encoding front and bottom halves of alsS of Bacillus subtilis B26 were amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 0.5 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGTTGACAAAAGCAACAAAAG-3') (SEQ ID NO:181) for front and 5'-CGGTACCCTTTCCAGAGATTTAGAG-3' (SEQ ID NO:275) for back halves, and reverse (5'-CTCTAAATCTCTGGAAAGGGTACCG-3') (SEQ ID NO:182) for front and (5'-GCTCTAGATTAGAGAGCTTTCGTTTTCATG-3' for back halves) (SEQ ID NO:183) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Bacillus subtilis B26 genome (ATCC) in 50 .mu.l. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGTTGACAAAAGCAACAAAAG-3') (SEQ ID NO:184) and reverse (5'-GCTCTAGATTAGAGAGCTTTCGTTTTCATG-3') (SEQ ID NO:185) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was internal XbaI site free and thus was digested with SacI and XbaI and ligated into pBADilvCD pre-digested with the same restriction enzymes.

Construction of pBADLeuABCD.

[0444] The DNA fragment encoding leuA, leuB, leuC, and leuD of Escherichia coli BL21(DE3) was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 3 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:186) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:187) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l. The amplified DNA fragment was digested with SacI and XbaI and ligated into pBAD33 pre-digested with the same restriction enzymes.

Construction of pBADLeuABCD2.

[0445] The DNA fragment 1 encoding leuA and leuB and the DNA fragment 2 encoding leuC and leuD of Escherichia coli BL21(DE3) were amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:188) for fragment 1 and (5'-AGGGGTGTAAGGAGGATATATATATGGCTAAGACGTTATACGAAAAATTG-3') (SEQ ID NO:189) for fragment 2 and reverse (5'-CGTCTTAGCCATATATATATCCTCCTTACACCCCTTCTGCTACATAGCGG-3') (SEQ ID NO:190) for fragment 1 and (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:191) for fragment 2 primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 3 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:192) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:193) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and XbaI and ligated into pBAD33 pre-digested with the same restriction enzymes.

Construction of pBADLeuABCD4.

[0446] The DNA fragments encoding leuA, leuB, leuC and leuD of Escherichia coli BL21(DE3) were amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:194) for leuA, (5'-GAAACCGTGTGAGGAGGATATATATATGTCGAAGAATTACCATATTGCCG-3') (SEQ ID NO:195) for leuB, (5'-AGGGGTGTAAGGAGGATATATATATGGCTAAGACGTTATACGAAAAATTG-3') (SEQ ID NO:196) for leuC, and (5'-ACATTAAATAAGGAGGATATATATATGGCAGAGAAATTTATCAAACACAC-3') (SEQ ID NO:197) for leuD and reverse (5'-ATTCTTCGACATATATATATCCTCCTCACACGGTTTCCTTGTTGTTTTCG-3') (SEQ ID NO:198) for leuA, (5'-CGTCTTAGCCATATATATATCCTCCTTACACCCCTTCTGCTACATAGCGG-3') (SEQ ID NO:199) for leuB, (5'-TTTCTCTGCCATATATATATCCTCCTTATTTAATGTTGCGAATGTCGGCG-3') (SEQ ID NO:200) for leuC, and (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:201) for leuD primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 3 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID NO:202) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:203) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and XbaI and ligated into pBAD33 pre-digested with the same restriction enzymes.

Construction of pBADals-ilvCD-leuABCD, pBADals-ilvCD-leuABCD2, pBADals-ilvCD-leuABCD4, pBADalsS-ilvCD-leuABCD, pBADalsS-ilvCD-leuABCD2, pBADalsS-ilvCD-leuABCD4.

[0447] The DNA fragments 1 (for als) and 2 (for alsS) encoding chloramphenicol acetyltransferase (CAT), P15 origin of replication, araBAD promoter, als of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 of its codon usage optimized for over-expression in E. coli or alsS of Bacillus subtilis B26 and ilvC and ilvD of E. coli DH10B were amplified separately by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3') (SEQ ID NO:204) and reverse (5'-CGGGGTACCGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID NO:205) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBADals-ilvCD and pBADalsS-ilvCD in 50 .mu.l, respectively.

[0448] To remove an internal SphI restriction enzyme site from leuC, overlap PCR was carried out. The front and bottom halves of DNA fragment 3 (for leuABCD), fragment 4 (for leuABCD2), and fragment 5 (for leuABCD4) encoding araBAD promoter, leuA, leuB, leuC, and leuD of E. coli BL21(DE3) were amplified separately by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-AAGGAAAAAAGCGGCCGCACTTTTCATACTCCCGCCATTCAG-3' (SEQ ID NO:206) for front and 5'-CAAAGGCCGTCTGCACGCGCCGAAAGGCAAA-3' (SEQ ID NO:207) for bottom halves) and reverse (5'-TTTGCCTTTCGGCGCGTGCAGACGGCCTTTG-3' (SEQ ID NO:208) for front and 5'-ACATGCATGCCGTTTGATGACGTGGACGATAGCGG-3' (SEQ ID NO:209) for bottom halves) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBADleuABCD, pBADleuABCD2, and pBADleuABCD4 in 50 .mu.l, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-AAGGAAAAAAGCGGCCGCACTTTTCATACTCCCGCCATTCAG-3') (SEQ ID NO:210) and reverse (5'-ACATGCATGCCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:211) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The resulting fragments 3, 4, and 5 were digested with SphI and NotI and ligated into both fragments 1 and 2 pre-digested with the same restriction enzymes.
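The site removal described above uses a pair of internal primers that anneal to the same position on opposite strands. As a minimal sketch, the Python below checks two properties of the primers given for this step: that the internal pair (SEQ ID NO:207 and SEQ ID NO:208) are exact reverse complements of each other, and that neither carries the SphI recognition sequence GCATGC, whereas the outer reverse primer (SEQ ID NO:209) does. The helper names are illustrative; which base was changed relative to the native leuC sequence is not stated in the text and is not inferred here.

```python
# Sanity checks on the internal primers used to remove the SphI site.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SPHI_SITE = "GCATGC"  # standard SphI recognition sequence


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_sphi(seq: str) -> bool:
    """True if the (palindromic) SphI site occurs in the sequence."""
    return SPHI_SITE in seq.upper()


# Primers copied from the paragraph above.
BOTTOM_FWD = "CAAAGGCCGTCTGCACGCGCCGAAAGGCAAA"      # SEQ ID NO:207
FRONT_REV = "TTTGCCTTTCGGCGCGTGCAGACGGCCTTTG"       # SEQ ID NO:208
OUTER_REV = "ACATGCATGCCGTTTGATGACGTGGACGATAGCGG"   # SEQ ID NO:209

internal_pair_matches = reverse_complement(FRONT_REV) == BOTTOM_FWD
print("internal primers are reverse complements:", internal_pair_matches)
print("SphI in internal primers:", has_sphi(BOTTOM_FWD), has_sphi(FRONT_REV))
print("SphI in outer reverse primer:", has_sphi(OUTER_REV))
```

With the internal site absent from the spliced product, the SphI site in the outer reverse primer becomes the only one available for the subsequent SphI and NotI digestion.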

Construction of pBADaroG-tktA-aroBDE.

[0449] The DNA fragments encoding aroG, tktA, aroB, aroD, and aroE of Escherichia coli BL21(DE3) were amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGAATTATCAGAACGACGATTTAC-3') (SEQ ID NO:212) for aroG, (5'-GCGTCGCGGGTAAGGAGGAAAATTTTATGTCCTCACGTAAAGAGCTTGCC-3') (SEQ ID NO:213) for tktA, (5'-GAACTGCTGTAAGGAGGTTAAAATTATGGAGAGGATTGTCGTTACTCTCG-3') (SEQ ID NO:214) for aroB, (5'-CAATCAGCGTAAGGAGGTATATATAATGAAAACCGTAACTGTAAAAGATC-3') (SEQ ID NO:215) for aroD, and (5'-TACACCAGGCATAAGGAGGAATTAATTATGGAAACCTATGCTGTTTTTGG-3') (SEQ ID NO:216) for aroE and reverse (5'-TACGTGAGGACATAAAATTTTCCTCCTTACCCGCGACGCGCTTTTACTGC-3') (SEQ ID NO:217) for aroG, (5'-CAATCCTCTCCATAATTTTAACCTCCTTACAGCAGTTCTTTTGCTTTCGC-3') (SEQ ID NO:218) for tktA, (5'-CAATCAGCGTAAGGAGGTATATATAATGAAAACCGTAACTGTAAAAGATC-3') (SEQ ID NO:219) for aroB, (5'-TACGGTTTTCATTATATATACCTCCTTACGCTGATTGACAATCGGCAATG-3') (SEQ ID NO:220) for aroD, and (5'-ACATGCATGCTTACGCGGACAATTCCTCCTGCAA-3') (SEQ ID NO:221) for aroE primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 3 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGAATTATCAGAACGACGATTTAC-3') (SEQ ID NO:222) and reverse (5'-ACATGCATGCTTACGCGGACAATTCCTCCTGCAA-3') (SEQ ID NO:223) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and SphI and ligated into pBAD33 pre-digested with the same restriction enzymes.

Construction of pBADpheA-aroLAC.

[0450] The DNA fragments encoding pheA, aroL, aroA, and aroC of Escherichia coli DH10 were amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGACATCGGAAAACCCGTTACTGG-3') (SEQ ID NO:224) for pheA, (5'-GATCCAACCTAAGGAGGAAAATTTTATGACACAACCTCTTTTTCTGATCG-3') (SEQ ID NO:225) for aroL, (5'-GATCAATTGTTAAGGAGGTATATATAATGGAATCCCTGACGTTACAACCC-3') (SEQ ID NO:226) for aroA, and (5'-CAGGCAGCCTAAGGAGGAATTAATTATGGCTGGAAACACAATTGGACAAC-3') (SEQ ID NO:227) for aroC and reverse (5'-AGGTTGTGTCATAAAATTTTCCTCCTTAGGTTGGATCAACAGGCACTACG-3') (SEQ ID NO:228) for pheA, (5'-CAGGGATTCCATTATATATACCTCCTTAACAATTGATCGTCTGTGCCAGG-3') (SEQ ID NO:229) for aroL, (5'-GTTTCCAGCCATAATTAATTCCTCCTTAGGCTGCCTGGCTAATCCGCGCC-3') (SEQ ID NO:230) for aroA, and (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3') (SEQ ID NO:231) for aroC primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGACATCGGAAAACCCGTTACTGG-3') (SEQ ID NO:232) and reverse (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3') (SEQ ID NO:233) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and SphI and ligated into pBAD33 pre-digested with the same restriction enzymes.

Construction of pBADtyrA-aroLAC.

[0451] The DNA fragments encoding tyrA, aroL, aroA, and aroC of Escherichia coli DH10 were amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGGTTGCTGAATTGACCGCATTAC-3') (SEQ ID NO:234) for tyrA, (5'-AATCGCCAGTAAGGAGGAAAATTTTATGACACAACCTCTTTTTCTGATCG-3') (SEQ ID NO:235) for aroL, (5'-GATCAATTGTTAAGGAGGTATATATAATGGAATCCCTGACGTTACAACCC-3') (SEQ ID NO:236) for aroA, and (5'-CAGGCAGCCTAAGGAGGAATTAATTATGGCTGGAAACACAATTGGACAAC-3') (SEQ ID NO:237) for aroC, and reverse (5'-GAGGTTGTGTCATAAAATTTTCCTCCTTACTGGCGATTGTCATTCGCCTG-3') (SEQ ID NO:238) for tyrA, (5'-CAGGGATTCCATTATATATACCTCCTTAACAATTGATCGTCTGTGCCAGG-3') (SEQ ID NO:239) for aroL, (5'-GTTTCCAGCCATAATTAATTCCTCCTTAGGCTGCCTGGCTAATCCGCGCC-3') (SEQ ID NO:240) for aroA, and (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3') (SEQ ID NO:241) for aroC primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGGTTGCTGAATTGACCGCATTAC-3') (SEQ ID NO:242) and reverse (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3') (SEQ ID NO:243) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and SphI and ligated into pBAD33 pre-digested with the same restriction enzymes.

Construction of pBADpheA-aroLAC-aroG-tktA-aroBDE and pBADtyrA-aroLAC-aroG-tktA-aroBDE.

[0452] A DNA fragment 1 (for pheA) and 2 (for tyrA) encoding chloramphenicol acetyltransferase (CAT), P15 origin of replication, araBAD promoter, pheA or tyrA, aroL, aroA, aroC of Escherichia coli DH10B and a DNA fragment 3 encoding araBAD promoter, aroG, tktA, aroB, aroD, and aroE of Escherichia coli DH10B were amplified separately by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3') (SEQ ID NO:244) for fragment 1 and 2 and (5'-GCTCTAGAACTTTTCATACTCCCGCCATTCAG-3') (SEQ ID NO:245) for fragment 3, and reverse (5'-GCTCTAGAGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID NO:246) for fragment 1 and 2 and (5'-AAGGAAAAAAGCGGCCGCGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID NO:247) for fragment 3, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pBADpheA-aroLAC, pBADtyrA-aroLAC, and pBADaroG-tktA-aroBDE in 50 .mu.l, respectively. Amplified DNA fragments 1 and 2 were digested with NotI and XbaI and ligated into fragment 3 pre-digested with the same restriction enzymes.

Construction of pTrcBAL.

[0453] A DNA sequence encoding benzaldehyde lyase (bal) of Pseudomonas fluorescens of its codon usage optimized for over-expression in E. coli was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CATGCCATGGCTATGATTACTGGTGG-3') (SEQ ID NO:248) and reverse (5'-CCCCGAGCTCTTACGCGCCGGATTGGAAATACA-3') (SEQ ID NO:249) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50 .mu.l. Amplified DNA fragment was digested with NcoI and SacI and ligated into pTrc99A pre-digested with the same restriction enzymes.

Construction of pTrcAdhE2.

[0454] A DNA sequence encoding CoA-linked alcohol/aldehyde dehydrogenase (adhE2) of Clostridium acetobutylicum ATCC 824 was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CATGCCATGGCCAAAGTTACAAATCAAAAAG-3') (SEQ ID NO:250) and reverse (5'-CGAGCTCTTAAAATGATTTTATATAGATATCC-3') (SEQ ID NO:251) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Clostridium acetobutylicum ATCC 824 genome in 50 .mu.l. The amplified DNA fragment was digested with NcoI and SacI and ligated into pTrc99A pre-digested with the same restriction enzymes.

Construction of pTrcAdh2.

[0455] A DNA sequence encoding alcohol dehydrogenase (adh2) of Saccharomyces cerevisiae was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CATGCCATGGGTATTCCAGAAACTCAAAAAG-3') (SEQ ID NO:252) and reverse (5'-CCCGAGCTCTTATTTAGAAGTGTCAACAACG-3') (SEQ ID NO:253) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng genome of Saccharomyces cerevisiae in 50 .mu.l. Amplified DNA fragment was digested with NcoI and SacI and ligated into pTrc99A pre-digested with the same restriction enzymes.

Construction of pTrcBALD.

[0456] A DNA sequence encoding CoA-linked aldehyde dehydrogenase (ald) of Clostridium beijerinckii was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCCGAGCTCAGGAGG ATATACATATGAATAAAGACACACTAATACC-3') (SEQ ID NO:254) and reverse (5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3') (SEQ ID NO:255) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50 .mu.l. The amplified DNA fragment was digested with SacI and HindIII and ligated into pTrcBAL pre-digested with the same restriction enzymes.

Construction of pTrcBALK.

[0457] A DNA sequence encoding ketoisovalerate decarboxylase (kivd) of Lactococcus lactis was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGTATACAGTAGGAGATTACC-3') (SEQ ID NO:256) and reverse (5'-GCTCTAGATTATGATTTATTTTGTTCAGCAAAT-3') (SEQ ID NO:257) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50 .mu.l. The amplified DNA fragment was digested with SacI and XbaI and ligated into pTrcBAL pre-digested with the same restriction enzymes.

Construction of pTrcAdh-Kivd.

[0458] A DNA sequence encoding ketoisovalerate decarboxylase (kivd) of Lactococcus lactis was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGTATACAGTAGGAGATTACC-3') (SEQ ID NO:258) and reverse (5'-GCTCTAGATTATGATTTATTTTGTTCAGCAAAT-3') (SEQ ID NO:259) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50 .mu.l. The amplified DNA fragment was digested with SacI and XbaI and ligated into pTrcAdh2 pre-digested with the same restriction enzymes.

Construction of pTrcBAL-DDH-2ADH.

[0459] To remove an internal NcoI site, overlap PCR was carried out. DNA fragments encoding front and bottom halves of meso-2,3-butanediol dehydrogenase (ddh) of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 and secondary alcohol dehydrogenase (2adh) of Pseudomonas fluorescens were amplified separately by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTCAGGAGGATATATATATGAAAAAAGTCGCACTTGTTACCG-3') (SEQ ID NO:260) for front half of ddh, (5'-GGCCGGCGGCCGCGCGATGGCGGTGAAAGTG-3') (SEQ ID NO:261) for bottom half of ddh, (5'-AACTAATCTAGAGGAGGATATATATATGAGCATGACGTTTTCCGGCCAGG-3') (SEQ ID NO:262) for front half of 2adh, and (5'-CCTTGCGGAGGGCTCGATGGATGAGTTCGAC-3') (SEQ ID NO:263) for bottom half of 2adh, and reverse (5'-CACTTTCACCGCCATCGCGCGGCCGCCGGCC-3') (SEQ ID NO:264) for front half of ddh, (5'-GCTCATATATATATCCTCCTCTAGATTAGTTAAACACCATCCCGCCGTCG-3') (SEQ ID NO:265) for bottom half of ddh, (5'-GTCGAACTCATCCATCGAGCCCTCCGCAAGG-3') (SEQ ID NO:266) for front half of 2adh, and (5'-CCCAAGCTTAGATCGCGGTGGCCCCGCCGTCG-3') (SEQ ID NO:267) for bottom half of 2adh primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Klebsiella pneumoniae subsp. pneumoniae MGH 78578 genome for ddh and Pseudomonas fluorescens genome for 2adh in 50 .mu.l, respectively. The amplified DNA fragments were gel purified and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA solution was combined and each DNA fragment was spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CGAGCTCAGGAGGATATATATATGAAAAAAGTCGCACTTGTTACCG-3') (SEQ ID NO:268) and reverse (5'-CCCAAGCTTAGATCGCGGTGGCCCCGCCGTCG-3') (SEQ ID NO:269) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced fragment was digested with SacI and HindIII and ligated into pTrcBAL pre-digested with the same restriction enzymes.

Construction of pBBRPduCDEGH.

[0460] A DNA sequence encoding propanediol dehydratase medium (pduD) and small (pduE) subunits and propanediol dehydratase reactivation large (pduG) and small (pduH) subunits of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-GCTCTAGAGGAGGATTTAAAAATGGAAATTAACGAAACGCTGC-3') (SEQ ID NO:270) and reverse (5'-TCCCCGCGGTTAAGCATGGCGATCCCGAAATGGAATCCCTTTGAC-3') (SEQ ID NO:271) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Klebsiella pneumoniae subsp. pneumoniae MGH 78578 in 50 .mu.l. Amplified DNA fragment was digested with SacII and XbaI and ligated into pTrc99A pre-digested with the same restriction enzymes to form pBBRPduDEGH.

[0461] A DNA sequence encoding propanediol dehydratase large subunit (pduC) of Klebsiella pneumoniae subsp. pneumoniae MGH 78578 was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCGCTCGAGGAGGATATATATATGAGATCGAAAAGATTTGAAGC-3') (SEQ ID NO:272) and reverse (5'-GCTCTAGATTAGCCAAGTTCATTGGGATCG-3') (SEQ ID NO:273) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng Klebsiella pneumoniae subsp. pneumoniae MGH 78578 in 50 .mu.l. Amplified DNA fragment was digested with XhoI and XbaI and ligated into pBBRPduDEGH pre-digested with the same restriction enzymes.

Construction of pTrcIpdc-Par.

[0462] DNA sequences encoding indole-3-pyruvate decarboxylase (ipdc) of Azospirillum brasilense and phenylethanol reductase (par) of Rhodococcus sp. ST-10 were amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward primers (5'-CATGCCATGGGACTGGCTGAGGCACTGCTGC-3' (SEQ ID NO:314) for ipdc and 5'-CGAGCTCAGGAGGATATATATATGAAAGCTATCCAGTACACCCGTAT-3' (SEQ ID NO:315) for par) and reverse primers (5'-CGAGCTCTTATTCGCGCGGTGCCGCGTGCAGG-3' (SEQ ID NO:316) for ipdc and 5'-GCTCTAGATTACAGGCCCGGAACCACAACGGCGC-3' (SEQ ID NO:317) for par), 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pTrcIpdc and pTrcPar, respectively, in 50 .mu.l. The amplified DNA fragments of ipdc and par were digested with NcoI/SacI and SacI/XbaI, respectively, and were ligated into pTrc99A pre-digested with NcoI and XbaI.

Testing and Results:

[0463] To test the butyraldehyde biosynthesis pathway, DH10B cells harboring pBADButP-atoB/pTrcBALD and pBADButP-atoB-ALD/pTrcB2DH/pBBRpduCDEGH were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm.sup.50) and 100 ug/ml ampicillin (Amp.sup.100) at 37.degree. C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm.sup.50 and Amp.sup.100 and was grown in an incubation shaker at 37.degree. C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown overnight. 700 ul of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS.
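The induction conditions above are stated as molar concentrations. As an illustrative aid, the Python sketch below converts them into mass per liter of culture. The molar masses used (150.13 g/mol for arabinose and 238.30 g/mol for IPTG) are standard reference values and are not taken from this disclosure; the function name is likewise illustrative.

```python
# Convert the stated inducer concentrations to grams per liter of culture.
# ASSUMPTION: molar masses are standard reference values, not from the text.
MOLAR_MASS_G_PER_MOL = {
    "arabinose": 150.13,
    "IPTG": 238.30,
}


def grams_per_liter(compound: str, conc_mm: float) -> float:
    """mM x (g/mol) / 1000 = g/L."""
    return conc_mm * MOLAR_MASS_G_PER_MOL[compound] / 1000.0


arabinose_g_l = grams_per_liter("arabinose", 13.3)  # 13.3 mM as stated
iptg_g_l = grams_per_liter("IPTG", 1.0)             # 1 mM as stated

print(f"arabinose: {arabinose_g_l:.2f} g/L ({arabinose_g_l / 10:.2f}% w/v)")
print(f"IPTG:      {iptg_g_l:.3f} g/L")
```

Under these assumed molar masses, 13.3 mM arabinose corresponds to roughly 2 g/L, or about 0.2% (w/v). The same induction conditions are used for each of the pathway tests in this section.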

[0464] To test the isobutyraldehyde biosynthesis pathway, DH10B cells harboring pBADals-ilvCD/pTrcBALK or pBADalsS-ilvCD/pTrcBALK were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm.sup.50) and 100 ug/ml ampicillin (Amp.sup.100) at 37.degree. C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm.sup.50 and Amp.sup.100 and was grown in an incubation shaker at 37.degree. C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown overnight. 700 ul of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS for the production of isobutyraldehyde. FIG. 8B shows the production of isobutanal from these cultures.

[0465] To test the 3-methylbutyraldehyde and 2-methylbutyraldehyde biosynthesis pathways, DH10B cells harboring pBADals-ilvCD-LeuABCD/pTrcBALK, pBADals-ilvCD-LeuABCD2/pTrcBALK, pBADals-ilvCD-LeuABCD4/pTrcBALK, pBADalsS-LeuABCD/pTrcBALK, pBADalsS-LeuABCD2/pTrcBALK, or pBADalsS-LeuABCD4/pTrcBALK were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm.sup.50) and 100 ug/ml ampicillin (Amp.sup.100) at 37.degree. C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm.sup.50 and Amp.sup.100 and was grown in an incubation shaker at 37.degree. C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown overnight. 700 ul of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS. The production of 2-isovaleralcohol (2-methylpental) and 3-isovaleralcohol (3-methylpentanal) was monitored because 3-isovaleraldehyde and 2-isovaleraldehyde are spontaneously converted to their corresponding alcohols. FIG. 8B shows the production of 2-methylpental and 3-methylpentanal from these cultures.

[0466] To test the phenylacetoaldehyde and 4-hydroxyphenylacetoaldehyde biosynthesis pathways, DH10B cells harboring pBADpheA-aroLAC/pTrcBALK, pBADtyrA-aroLAC/pTrcBALK, pBADaroG-tktA-aroBDE/pTrcBALK, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK, and pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm.sup.50) and 100 ug/ml ampicillin (Amp.sup.100) at 37.degree. C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm.sup.50 and Amp.sup.100 and was grown in an incubator shaker at 37.degree. C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown overnight. A 700 ul sample of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS for the production of phenylacetoaldehyde, 4-hydroxyphenylaldehyde, and their corresponding alcohols. FIG. 9B shows the production of 4-hydroxyphenylethanol from these cultures.

[0467] To test the 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and 2-(indole-3) ethanol biosynthesis pathways, DH10B cells harboring pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcAdh2-Kivd, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcAdh2-Kivd, pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcIpdc-Par, and pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcIpdc-Par were grown overnight in LB media containing 50 ug/ml chloramphenicol (Cm.sup.50) and 100 ug/ml ampicillin (Amp.sup.100) at 37.degree. C., 200 rpm. An aliquot of each seed culture was inoculated into fresh TB media containing Cm.sup.50 and Amp.sup.100 and was grown in an incubator shaker at 37.degree. C., 200 rpm. Three hours after inoculation, the cultures were induced with 13.3 mM arabinose and 1 mM IPTG and were grown for between overnight and one week. A 700 ul sample of each culture was extracted with an equal volume of ethyl acetate and analyzed by GC-MS. The results are detailed below.

[0468] The production of 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol and/or 2-(indole-3-)ethanol was monitored using GC-MS. FIG. 42A shows the production of 2-phenylethanol from these cultures at 24 hours. FIG. 42B shows the production of 2-(4-hydroxyphenyl)ethanol from these cultures at 24 hours. FIG. 42C shows the production of 2-(indole-3-)ethanol from these cultures at 24 hours.

[0469] FIG. 43A shows the GC-MS chromatogram for control (pBAD33 and pTrc99A) at one week. FIG. 43B shows the GC-MS chromatogram for 2-phenylethanol (5.97 min) production from pBADpheA-aroLAC-aroG-tktA-aroBDE and pTrcBALK at one week. FIG. 44 shows the GC-MS chromatogram for 2-(4-hydroxyphenyl)ethanol (9.36 min) and 2-(indole-3) ethanol (10.32 min) production from pBADtyrA-aroLAC-aroG-tktA-aroBDE and pTrcBALK at one week.

Example 5

Isolation and Biological Activity of Diol Dehydrogenases

[0470] Available substrates such as 3-hydroxy-2-butanone (acetoin), 4-hydroxy-3-hexanone (propioin), 5-hydroxy-4-octanone (butyroin), 6-hydroxy-5-decanone (valeroin), and 1,2-cyclopentanediol were used to measure the ability of diol dehydrogenases (ddh) to catalyze the reduction of large saturated .alpha.-hydroxyketones to produce a diol. All reagents were purchased from Sigma-Aldrich Co. and TCI America, unless otherwise stated.

[0471] For cloning and isolation of DDH polypeptides, genomic DNA from several species of bacteria was obtained from ATCC (Lactobacillus brevis ATCC 367, Pseudomonas putida KT2440, and Klebsiella pneumoniae MGH78578) and PCR-amplified (using Phusion polymerase with 1.times. Phusion buffer, 0.2 mM dNTP, 0.5 .mu.L Phusion enzyme, 1.5 .mu.M primers, and 20 pg template DNA in a 50 .mu.L reaction) with the following protocol: 30 cycles of 98.degree. C./10 secs (denaturing), 60.degree. C./15 secs (annealing), and 72.degree. C./30 secs (elongation). The PCR products were then digested with the restriction enzymes NdeI and BamHI and ligated into NdeI/BamHI-digested pET28 vectors. Vectors containing ddh clones were transformed into BL21(DE3) competent cells for protein expression. A single colony was inoculated into LB media, and expression of the 6.times.His-tagged proteins of interest was induced at OD.sub.600=0.6 with 0.1 mM IPTG. Expression was allowed to proceed for 15 hours at 22.degree. C. The 6.times.His-tagged enzymes were purified using Ni-NTA spin columns following the protocols suggested by QIAGEN, yielding purified protein concentrations in the range of 1.1-6.5 mg/mL (determined by Bradford assay).
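As a rough planning aid, the cycling protocol above can be turned into a total programmed hold time. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes no initial denaturation or final extension step and ignores ramping between temperatures, none of which are specified in the text.

```python
# Hypothetical helper (not part of the described method): total programmed
# hold time for a three-step PCR, ignoring ramping and any extra hold steps.
def pcr_hold_time_minutes(cycles, denature_s, anneal_s, extend_s):
    """Return the summed hold time of all cycles, in minutes."""
    return cycles * (denature_s + anneal_s + extend_s) / 60.0


# Values taken from the ddh amplification protocol:
# 30 cycles of 10 s denaturing, 15 s annealing, 30 s elongation.
ddh_program_min = pcr_hold_time_minutes(cycles=30, denature_s=10, anneal_s=15, extend_s=30)
print(f"ddh amplification hold time: {ddh_program_min:.1f} min")
```

Actual instrument run time would be longer, since thermocyclers need time to change temperature between steps.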

[0472] Diol dehydrogenase ddh1 was isolated from Lactobacillus brevis ATCC 367, diol dehydrogenase ddh2 was isolated from Pseudomonas putida KT2440, and diol dehydrogenase ddh3 was isolated from Klebsiella pneumoniae MGH78578. The nucleotide sequence encoding ddh1 and the polypeptide sequence of ddh1 are shown in SEQ ID NOS:97 and 98, respectively; the nucleotide sequence encoding ddh2 and the polypeptide sequence of ddh2 are shown in SEQ ID NOS:99 and 100, respectively; and the nucleotide sequence encoding ddh3 and the polypeptide sequence of ddh3 are shown in SEQ ID NOS: 101 and 102, respectively.

[0473] Reactions to measure biological activity of DDH polypeptides were performed in a final volume of 200 .mu.L as follows: 25 mM substrate, 0.04 mg/mL DDH polypeptide, 0.25 mg/mL nicotinamide cofactor, 200 mM imidazole, 14 mM Tris-HCl, and 1.5% by volume DMSO. Biological activity was assayed using a Molecular Devices Thermomax 96 well plate reader, monitoring absorbance at 340 nm, which corresponds to NADH or NADPH concentration. For the kinetic studies, the conditions were 0.04 mg/mL DDH polypeptide, 0.25 mg/mL NADH, 20 mM Tris HCl buffer at pH 6.5 (reduction) or pH 9.0 (oxidation), 25.degree. C., and 100 uL total volume.
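The relationship between absorbance at 340 nm and NAD(P)H concentration follows the Beer-Lambert law. The sketch below is a minimal illustration, not the analysis actually used: it assumes the commonly cited molar absorptivity of 6220 M^-1 cm^-1 for NAD(P)H at 340 nm, and the optical path length is a user-supplied assumption because it depends on well geometry and fill volume. The function names and the example readings are hypothetical.

```python
# Commonly cited molar absorptivity of NADH/NADPH at 340 nm (M^-1 cm^-1).
NADH_EPSILON_340 = 6220.0


def nadh_molar(absorbance_340, path_cm):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return absorbance_340 / (NADH_EPSILON_340 * path_cm)


def rate_um_per_min(delta_a340_per_min, path_cm):
    """Convert an absorbance slope (A340 per minute) to micromolar per minute."""
    return nadh_molar(delta_a340_per_min, path_cm) * 1e6


# Hypothetical readings with an assumed 1.0 cm path length.
# In a microplate the real path is shorter and should be measured or corrected.
print(nadh_molar(0.622, path_cm=1.0))        # concentration in mol/L
print(rate_um_per_min(0.0622, path_cm=1.0))  # rate in uM/min
```

A falling A340 slope (negative value) corresponds to cofactor consumption in the reduction direction, and a rising slope to cofactor production in the oxidation direction.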

[0474] FIG. 12A shows the biological activity of ddh1, ddh2, and ddh3 using butyroin as a substrate (triangles represent ddh3 activity). FIG. 12B shows the oxidation activity of ddh3 towards 1,2-cyclopentanediol and 1,2-cyclohexanediol as measured by NADH production. FIG. 13 summarizes the results of kinetic studies for various substrates in the oxidation reactions catalyzed by the DDH polypeptides. These reactions were NAD+ dependent.

Example 6

Sequential In Vivo Biological Activity of CC-Ligases (Lyases) and Diol Dehydrogenases

[0475] The ability of a C--C lyase and a diol dehydrogenase to perform the following sequential reaction was tested in E. coli:

##STR00001##

[0476] For .alpha.-hydroxyketone and diol production, a pathway comprising a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) and a meso-2,3-butanediol dehydrogenase (ddh) gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578 was constructed in E. coli and tested for its ability to condense the substrates detailed below in Table 2 (e.g., acetaldehyde, propionaldehyde, butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, phenylacetaldehyde, and 4-hydroxyphenylacetaldehyde, or their corresponding alcohols) to form an .alpha.-hydroxyketone and the corresponding diol in vivo. The production of various .alpha.-hydroxyketones and diols was monitored by gas chromatography-mass spectrometry (GC-MS).

TABLE-US-00002 TABLE 2 Summary of substrates and products.
Substrate | Produced .alpha.-hydroxyketone | Produced diol | FIGS.
Butanal | 5-Hydroxy-4-octanone | 4,5-Octanediol | 17A & B
n-Pentanal | 6-Hydroxy-5-decanone | 5,6-Decanediol | 18A & B
3-Methylbutanal | 2,7-Dimethyl-5-hydroxy-4-octanone | 2,7-Dimethyl-4,5-octanediol | 19A & B
n-Hexanal | 7-Hydroxy-6-dodecanone | 6,7-Dodecanediol | 20A & B
4-Methylpentanal | 2,9-Dimethyl-6-hydroxy-5-decanone | 2,9-Dimethyl-5,6-decanediol | 21A & B
n-Octanal | 9-Hydroxy-8-hexadecanone | 8,9-Hexadecanediol | 22
Acetaldehyde | 3-Hydroxy-2-butanone | 2,3-Butanediol | 23
n-Propanal | 4-Hydroxy-3-hexanone | 3,4-Hexanediol | 24A & B
Phenylacetoaldehyde | 1,4-Diphenyl-3-hydroxy-2-butanone | 1,4-Diphenyl-2,3-butanediol | 25

For Analysis of .ltoreq.C10.

[0477] E. coli harboring pTrcBAL-DDH-2ADH was grown overnight in LB media containing 50 ug/ml kanamycin (Km). This seed culture was inoculated into M9 media containing 3% (v/v) glycerol, 0.5% (g/v) and 50 ug/ml Km. 10 mL cultures were grown to O.D..sub.600=0.7, then induced with 0.5 mM IPTG. The cells were allowed to express the enzymes of interest for 3 hours before various aldehydes were added to a concentration of 5-10 mM. After addition of aldehydes, the cultures were capped and incubated at 37.degree. C. with shaking for 72 hours. Cultures were extracted with 2 mL ethyl acetate and analyzed by GC-MS using the following protocol:

[0478] 1 .mu.L injection w/ 50:1 split

[0479] Inlet temperature--150.degree. C.

[0480] Initial oven temperature--50.degree. C.

[0481] Temperature Ramp 1--10.degree. C./min to 150.degree. C.

[0482] Temperature Ramp 2--50.degree. C./min to 300.degree. C.

[0483] GC to MS transfer temp--250.degree. C.

[0484] MS detection--full scan MW 35-200

For Analysis of .gtoreq.C12.

[0485] E. coli DH10B strains harboring pTrc99A (Ctrl vector) or pTrcBAL were inoculated into 0.75.times.M9/0.5% LB containing 0.1 mM CaCl.sub.2, 2 mM MgSO.sub.4, 1 mM KCl, 1% galacturonate, 5 .mu.g/mL thiamine, and Amp. The cultures were grown to an optical density (600 nm) of 0.8 and induced with 0.25 mM IPTG. The cells were allowed to express the proteins for 2.5 hours at 37.degree. C., then aldehyde substrate was added to a concentration of 5 mM, and the culture vial was capped tightly and incubated for 72 hours at 37.degree. C. with shaking at 200 rpm. 1 mL of the final culture was extracted with 0.75 mL of ethyl acetate, centrifuged to facilitate phase separation, then analyzed by GC-MS using the following method.

[0486] 1 .mu.L injection w/50:1 split

[0487] Inlet temperature--250.degree. C.

[0488] Initial oven temperature--50.degree. C.

[0489] Temperature Ramp 1--10.degree. C./min to 125.degree. C.

[0490] Temperature Ramp 2--30.degree. C./min to 300.degree. C.

[0491] Final Temperature 300.degree. C.--1 minute

[0492] GC to MS transfer temp--250.degree. C.

[0493] MS detection--full scan MW 40-260.
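The oven programs listed above imply a minimum run time per injection, which can be worked out from the ramp rates. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, and it assumes there are no hold times other than the final 1 minute hold given for the .gtoreq.C12 method, since no other holds are stated.

```python
def oven_program_minutes(initial_c, ramps, final_hold_min=0.0):
    """Sum ramp durations for a GC oven program.

    ramps is a list of (rate in deg C per min, target temperature in deg C),
    applied in order starting from initial_c.
    """
    total = 0.0
    current = initial_c
    for rate, target in ramps:
        total += abs(target - current) / rate
        current = target
    return total + final_hold_min


# Method for products up to C10: 50 -> 150 at 10 C/min, then -> 300 at 50 C/min.
c10_method_min = oven_program_minutes(50.0, [(10.0, 150.0), (50.0, 300.0)])

# Method for products of C12 and above: 50 -> 125 at 10 C/min,
# then -> 300 at 30 C/min, with a 1 minute hold at 300.
c12_method_min = oven_program_minutes(
    50.0, [(10.0, 125.0), (30.0, 300.0)], final_hold_min=1.0
)

print(f"<=C10 method: {c10_method_min:.2f} min")
print(f">=C12 method: {c12_method_min:.2f} min")
```

Any compound with a retention time beyond these totals could not elute within the programmed run, which is a quick sanity check when comparing reported retention times against a method.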

[0494] The results are depicted in FIGS. 17 through 25. FIG. 17 shows the sequential conversion of butanal into 5-hydroxy-4-octanone and then 4,5-octanediol. FIG. 18 shows the sequential conversion of n-pentanal into 6-hydroxy-5-decanone and then 5,6-decanediol. FIG. 19 shows the conversion of 3-methylbutanal into 2,7-dimethyl-5-hydroxy-4-octanone and then 2,7-dimethyl-4,5-octanediol. FIG. 20 shows the sequential conversion of n-hexanal into 7-hydroxy-6-dodecanone and then 6,7-dodecanediol. FIG. 21 shows the conversion of 4-methylpentanal into 2,9-dimethyl-6-hydroxy-5-decanone and then 2,9-dimethyl-5,6-decanediol. FIG. 22 shows the conversion of n-octanal into 9-hydroxy-8-hexadecanone. FIG. 23 shows the conversion of acetaldehyde into 3-hydroxy-2-butanone. FIG. 24 shows the sequential conversion of n-propanal into 4-hydroxy-3-hexanone and then 3,4-hexanediol. FIG. 25 shows the conversion of phenylacetoaldehyde into 1,4-diphenyl-3-hydroxy-2-butanone.

[0495] Similar to above, a pathway comprising a benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage was optimized for E. coli protein expression) was constructed in E. coli and tested for its ability to catalyze the production of various .alpha.-hydroxyketones. The results, which show the broad spectrum of C--C ligase activity for the bal gene tested, are set forth in FIG. 48 through FIG. 55.

Example 7

Sequential Biological Activity of Diol Dehydrogenases and Diol Dehydratases

[0496] To test the sequential biological activity of diol dehydrogenases and diol dehydratases in a dehydration and reduction pathway, butyroin was used as a substrate in a sequential reaction to produce 4-octanone. The enzyme diol dehydrogenase (e.g., ddh) catalyzes the reversible interconversion of .alpha.-hydroxy ketones and their corresponding diols, such as 5-hydroxy-4-octanone and 4,5-octanediol, and the enzyme diol dehydratase (e.g., pduCDE) catalyzes the irreversible dehydration of diols, such as 4,5-octanediol.

[0497] Diol dehydrogenase ddh from Klebsiella pneumoniae MGH 78578 and diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578 were cloned into a bacterial expression vector, expressed, and purified on a Ni-NTA column, as described in Example X, except that 1 mM of 1,2-propanediol was present at all times during the expression and purification of diol dehydratase. The large, medium, and small subunits of the pduCDE polypeptide are encoded by the nucleotide sequences of SEQ ID NOs:103, 105, and 107, respectively, and the polypeptide sequences are set forth in SEQ ID NOs: 104, 106, and 108, respectively.

[0498] The ddh3 and pduCDE polypeptides were incubated with butyroin and their appropriate cofactors, then assayed using gas chromatography-mass spectrometry (GC-MS) for their ability to perform sequential reactions resulting in the product 4-octanone. Reaction conditions are given in Table 3 below. The reaction mixture was incubated at 37.degree. C. for 40 hours in a 0.6 mL Eppendorf tube with minimal head space. The reaction product was extracted with an equivalent volume of ethyl acetate, stored in a glass vial, and sent to Thermo Fisher Scientific Instruments Division for compositional analysis by GC-MS.

TABLE-US-00003 TABLE 3 Reaction Conditions
Rxn Component | Concentration
5-hydroxy-4-octanone (butyroin) | 8.4 mM
Adenosylcobalamin (coenzyme B.sub.12) | 33.5 .mu.M
KCl | 9.6 mM
NADH | 18 mM
dDH3 enzyme | 0.19 mg/mL
dDOH1 enzyme mix | 0.15 mg/mL
Reaction Buffer | 10 mM Tris HCl pH 7.0

[0499] FIG. 26A shows GC-MS data confirming the presence of 4,5-octanediol in the sample extraction. Based on their mass spectra, the peak at a retention time of 5.36 min was identified as butyroin (substrate), and the peaks at 6.01, 6.09, and 6.12 min were identified as different isomers of 4,5-octanediol. 4,5-Octanediol is the expected product resulting from the reduction of butyroin by ddh3.

[0500] FIG. 26B shows GC-MS data confirming the presence of 4-octanone in the sample extraction. Based on its mass spectrum, the peak at a retention time of 4.55 min was identified as 4-octanone. This compound is the expected product resulting from the sequential reduction of butyroin and dehydration of 4,5-octanediol by ddh3 and pduCDE, respectively.

[0501] FIGS. 27A and 27B show comparisons between the sample extraction gas chromatograph/mass spectrum and the 4-octanone standard gas chromatograph/mass spectrum. These results demonstrate that 4-octanone was produced from butyroin using the enzymes diol dehydrogenase (ddh3) and a diol dehydratase (pduCDE). GC-MS analysis of the incubated reaction mixture confirmed starting material, intermediate and product, demonstrating that these enzymes can be reappropriated for these specific substrates.

Example 8

Isolation and Biological Activity of Secondary Alcohol Dehydrogenases

[0502] Substrates such as 4-octanone, 2,7-dimethyl-4-octanone, cyclopentanone and corresponding alcohols were utilized to measure the ability of secondary alcohol dehydrogenases (2ADHs) to catalyze the reduction of large saturated ketones to secondary alcohols. An example of a reaction catalyzed by secondary alcohol dehydrogenases is illustrated below (reduction of 4-octanone to 4-octanol is shown):

##STR00002##

[0503] All enzymes and reagents were purchased from New England Biolabs and Sigma, respectively, unless otherwise stated.

[0504] Various secondary alcohol dehydrogenases (2ADHs) were isolated from Pseudomonas putida KT2440, Pseudomonas fluorescens Pf-5, and Klebsiella pneumoniae MGH 78578. All vectors were transformed in BL21(DE3) competent cells and expression of the genes encoding the proteins of interest was induced with IPTG (via the T7 promoter). The cells were lysed, proteins were extracted and then purified on Ni-NTA columns. Final protein concentration in the Ni-NTA eluate was diluted to 0.15 mg/mL prior to assays.

[0505] NADH/NADPH consumption and production assays were performed using a THERMOmax microplate reader in the kinetic mode, monitoring the NAD(P)H absorbance peak at 340 nm until the reaction reached equilibrium. 2ADH-2, 2ADH-5, 2ADH-8, and 2ADH-10 were tested for their ability to either catalyze the oxidation of 4-octanol or catalyze the reduction of 4-octanone under the reaction conditions given in Table 4 below.

TABLE-US-00004 TABLE 4 Reaction Conditions for Various Enzyme Assays
Reaction Component | Final Concentration
NADH Production Assay (30.degree. C.)
2ADH enzyme | Approx. 0.058 .mu.g/.mu.L
4-octanol | 5.55 mM
NAD+ | Approx. 1.4 .mu.g/.mu.L
Imidazole (from Elution Buffer) | Approx. 280 mM
NADH Consumption Assay (30.degree. C.)
2ADH enzyme | Approx. 0.075 .mu.g/.mu.L
4-octanone | 5.0 mM
NADH | Approx. 0.25 .mu.g/.mu.L
Imidazole (from Elution Buffer) | Approx. 250 mM
NADPH Production Assay (30.degree. C.)
2ADH enzyme | Approx. 0.058 .mu.g/.mu.L
4-octanol | 5.55 mM
NADP+ | Approx. 1.4 .mu.g/.mu.L
Imidazole (from Elution Buffer) | Approx. 280 mM

[0506] Further testing was performed, as described in Table 5 below, in which 2ADH-2, 2ADH-11, 2ADH-12, 2ADH-13, 2ADH-14, 2ADH-15, 2ADH-16, 2ADH-17, and 2ADH-18 were tested for their ability to either catalyze the oxidation of 4-octanol, 2,7-dimethyl-4-octanol, or cyclopentanol, or catalyze the reduction of 4-octanone, 2,7-dimethyl-4-octanone, or cyclopentanone.

TABLE-US-00005 TABLE 5
Rxn Component | Final Concentration
Rxn Components for NADPH Consumption Assays (Reduction)
Substrate | 25 mM
Enzyme | 0.04 mg/mL
Nicotinamide cofactor | 0.25 mg/mL
Imidazole | 200 mM
Tris HCl | 14 mM
DMSO | 1.5% by volume
Total Volume | 200 .mu.L
Rxn Components for NAD(P)H Production Assays (Oxidation)
Substrate | 5 mM
Enzyme | 0.04 mg/mL
Nicotinamide cofactor | 0.25 mg/mL
Imidazole | 200 mM
Tris HCl | 14 mM
Rxn Components for NAD(P)H Production Assay using 2,7-dimethyl-4-octanone as a substrate
Substrate | 50 mM
Enzyme | 0.08 mg/mL
Nicotinamide cofactor | 0.25 mg/mL
Imidazole | 200 mM
Tris HCl | 14 mM
DMSO | 3% by volume

[0507] FIG. 30A shows the results from the NADH Production Assay of Table 4, in which 2ADH-2 catalyzes the oxidation of 4-octanol in the presence of NAD+, as measured by NADH production. FIG. 30B shows the results of the NADPH Production Assay of Table 4, in which 2ADH-5, 2ADH-8, and 2ADH-10 catalyze the oxidation of 4-octanol in the presence of NADP+, as measured by NADPH production.

[0508] FIG. 31 shows the oxidation of 4-octanol by 2ADH-11 (FIG. 31A) and 2ADH-16 (FIG. 31B), as measured by NADH and NADPH production, respectively.

[0509] FIG. 32 shows the oxidation of 2,7-dimethyloctanol by 2ADH-11 and others (FIG. 32A) and 2ADH-16 (FIG. 32B), as measured by NADH and NADPH production, respectively.

[0510] FIG. 33A shows the reduction of 2,7-dimethyl octanol by 2ADH-11 and 2ADH-16 as monitored by NADPH consumption. FIG. 33B shows the reduction activity of both 2ADH-11 and 2ADH-16 towards various substrates. FIG. 34 shows the oxidation (FIG. 34A) and reduction (FIG. 34B) of cyclopentanol by 2ADH-16.

[0511] Similar to above, kinetic testing for both oxidation and reduction reactions was performed on various substrates using 2ADH-16. The conditions for these studies were as follows: 0.04 mg/mL enzyme, 0.25 mg/mL cofactor, 20 mM Tris HCl buffer at pH 6.5 (reduction) or pH 9.0 (oxidation), 25.degree. C., and 100 uL total volume. The calculated rate constants for the reduction reactions, along with the structures of the substrates, are summarized in FIG. 35. The calculated rate constants for the oxidation reactions, along with the structures of the substrates, are summarized in FIG. 36. These results show that 2ADH-16 is capable of catalyzing both the oxidation and reduction of a wide variety of substrates.

Example 9

Isolation and In Vitro and In Vivo Activity of Coenzyme B12 Independent Diol Dehydratases

[0512] Substrates such as 1,2-propanediol, meso-2,3-butanediol, and trans-1,2-cyclopentanediol were utilized to test both the in vitro and in vivo biological activity of a B12 independent diol dehydratase in a dehydration and reduction pathway. Diol dehydratases catalyze the irreversible dehydration of diols, such as 1,2-propanediol.

[0513] For in vitro activity, E. coli BL21(DE3) harboring pETPduCDE (diol dehydratase subunits) was inoculated into 100 mL LB media, grown to OD.sub.600=0.7, induced with 0.15 mM IPTG, and incubated for 22 hours at 22.degree. C. The cells were lysed and proteins of interest were purified on a Ni-NTA spin column. Purification of all three dehydratase subunits was accomplished by adding 5 mM 1,2-propanediol to the lysis and wash buffers. The Ni-NTA purification yielded approximately 660 .mu.L of protein mixture at a concentration of 2.2 mg/mL. Protein concentration assays were conducted using a Bradford reagent protocol.
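The purification figures above allow a quick mass balance, and the same arithmetic gives the volume of eluate needed to reach a target enzyme concentration in an assay. The Python sketch below is illustrative only: the function names are hypothetical, and the 1 mL reaction volume and the use of the 0.08 mg/mL enzyme concentration as a dilution target for this particular eluate are assumptions made for the example.

```python
def total_protein_mg(volume_ul, conc_mg_per_ml):
    """Total protein mass (mg) in a given volume of eluate."""
    return volume_ul / 1000.0 * conc_mg_per_ml


def stock_volume_ul(stock_mg_per_ml, final_mg_per_ml, final_volume_ul):
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    return final_mg_per_ml * final_volume_ul / stock_mg_per_ml


# Values stated for the Ni-NTA eluate: about 660 uL at 2.2 mg/mL.
eluate_mg = total_protein_mg(volume_ul=660.0, conc_mg_per_ml=2.2)

# Assumed example: diluting that eluate to 0.08 mg/mL in a 1000 uL reaction.
enzyme_ul = stock_volume_ul(stock_mg_per_ml=2.2, final_mg_per_ml=0.08, final_volume_ul=1000.0)

print(f"protein recovered: {eluate_mg:.3f} mg")
print(f"eluate per reaction: {enzyme_ul:.1f} uL")
```

Under these assumptions a single purification would supply enough protein for many reactions, which is consistent with running several assays and controls from one preparation.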

[0514] The purified PduCDE was used to set up in vitro diol dehydratase reactions. Three assays were conducted with 1,2-propanediol and meso-2,3-butanediol. Control reactions were also set up with elution buffer added in place of purified PduCDE. In vitro reactions were conducted under semi-anaerobic conditions in 2 mL screw cap glass vials. Reaction components and concentrations are given in Table 6.

TABLE-US-00006 TABLE 6 Reaction conditions for B.sub.12 dependent DDOH in vitro assay
Rxn Component | Concentration
Diol substrate | 10 mM
Adenosylcobalamin (B.sub.12) | 100 .mu.g/mL
KCl | 10 mM
dOH1 enzyme mix | 0.08 mg/mL
Reaction Buffer | 10 mM Tris HCl pH 7.5

[0515] After 48 hours, 1 mL of the reaction mixture was extracted with 0.5 mL of either ethyl acetate or hexanol and analyzed by GCMS.

[0516] The following GCMS protocol was used for all experiments:

[0517] 1 .mu.L injection w/50:1 split

[0518] Inlet temperature--250.degree. C.

[0519] Initial oven temperature--50.degree. C.

[0520] Temperature Ramp 1--10.degree. C./min to 125.degree. C.

[0521] Temperature Ramp 2--30.degree. C./min to 300.degree. C.

[0522] Final Temperature 300.degree. C.--1 minute

[0523] GC to MS transfer temp--250.degree. C.

[0524] MS detection--full scan MW 40-260

[0525] The results are shown in FIG. 45. FIG. 45A confirms the formation of 1-propanal from 1,2-propanediol, and FIG. 45B confirms the formation of 2-butanone from meso-2,3-butanediol, both of which were catalyzed by B12 independent diol dehydratase.

[0526] For in vivo activity, the pBBRDhaB1/2 plasmid was constructed as follows: the DNA sequence encoding B12-independent glycerol dehydratase (dhaB1) and activator (dhaB2) of Clostridium butyricum was amplified by polymerase chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2 min for dhaB1 and 1 min for dhaB2, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward primers (5'-CCGCTCGAGGAGGATATATATATGATTTCTAAAGGCTTTAGCACCC-3' (SEQ ID NO:318) for dhaB1 and 5'-ACGTGATGTAATCTAGAGGAGGATATATATATGAGCAAAGAAATTAAAGG-3' (SEQ ID NO:319) for dhaB2) and reverse primers (5'-TCTTTGCTCATATATATATCCTCCTCTAGATTACATCACGTGTTCAGTAC-3' (SEQ ID NO:320) for dhaB1 and 5'-CGAGCTCTTATTCGGCGCCAATGGTGCACGGG-3' (SEQ ID NO:321) for dhaB2), 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETdhaB1 and pETdhaB2, respectively, in 50 .mu.l. Amplified fragments were gel purified and spliced by another round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2.5 min, repeated 30 times. The reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-CCGCTCGAGGAGGATATATATATGATTTCTAAAGGCTTTAGCACCC-3') (SEQ ID NO:322) and reverse primers (5'-CGAGCTCTTATTCGGCGCCAATGGTGCACGGG-3') (SEQ ID NO:323), 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng of each fragment in 50 .mu.l. The amplified DNA fragment was digested with XhoI and SacI and ligated into pBBR1MCS-2 pre-digested with the same restriction enzymes.
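Splicing two PCR fragments in a further round of PCR generally relies on the end of one fragment being complementary to the start of the next. The text does not name the splicing technique, so treating it as overlap-based splicing is an assumption here. Under that assumption, the Python sketch below checks whether the dhaB1 reverse primer (SEQ ID NO:320) and the dhaB2 forward primer (SEQ ID NO:319) share a region; the function names are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_suffix_prefix_overlap(left, right):
    """Longest string that is both a suffix of left and a prefix of right."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return right[:k]
    return ""


DHAB1_REVERSE = "TCTTTGCTCATATATATATCCTCCTCTAGATTACATCACGTGTTCAGTAC"  # SEQ ID NO:320
DHAB2_FORWARD = "ACGTGATGTAATCTAGAGGAGGATATATATATGAGCAAAGAAATTAAAGG"  # SEQ ID NO:319

# The reverse primer anneals to the top strand, so its reverse complement
# gives the top-strand sequence at the 3' end of the dhaB1 fragment.
dhab1_end = reverse_complement(DHAB1_REVERSE)
overlap = longest_suffix_prefix_overlap(dhab1_end, DHAB2_FORWARD)

print(f"shared region ({len(overlap)} nt): {overlap}")
```

A shared region of a few tens of nucleotides between the two junction primers is what allows the gel-purified fragments to prime on each other before the outer primers amplify the joined product.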

[0527] Two strains of E. coli DH10B harboring pBBR1MCS-2 or pBBRDhaB1/2 were inoculated into TB media without glycerol. Cultures were grown to OD.sub.600=0.5 and the substrates 1,2-propanediol, meso-2,3-butanediol, and trans-1,2-cyclopentanediol were added to separate cultures to a concentration of 10 mM. The co-enzyme S-adenosylmethionine was added at 5 ug/ml before the cultures were transferred to an anaerobic environment. The cultures were incubated at 37.degree. C. for 48 hrs.

[0528] After 48 hours, 1 mL of culture was extracted with 0.5 mL of ethyl acetate or hexanol and analyzed by GCMS, as described above. The results are shown in FIG. 46. FIG. 46A shows the in vivo production of 1-propanol from 1,2-propanediol. FIG. 46B shows the in vivo production of 2-butanol from meso-2,3-butanediol. FIG. 46C shows the in vivo production of cyclopentanone from trans-1,2-cyclopentanediol.

Example 10

Identification of Secreted Alginate Lyase and Genomic Regions Sufficient for Growth on Alginate as a Sole Source of Carbon

[0529] To identify secreted or external alginate lyases, and to identify genomic regions from Vibrio splendidus that are sufficient to confer growth on alginate as a sole source of carbon, the following clones were made using the Gateway system from Invitrogen (Carlsbad, Calif.). First, entry vectors were made by TOPO cloning PCR fragments into pENTR/D/TOPO. PCR fragments were generated using Vibrio splendidus 12B01 genomic DNA as a template and amplified with the following primer pairs:

[0530] Vs24214-24249: genomic region corresponding to gene id between V12B01_24214 and V12B01_24249 (see Example 1).

TABLE-US-00007 TABLE 7
24214 F | cacc caagcgatagtttatatagcgt | (SEQ ID NO: 324)
24249 R | gaaatgaacggatattacgt | (SEQ ID NO: 325)

[0531] Vs24189-24209: genomic region corresponding to gene id between V12B01_24189 and V12B01_24209 (see Example 1).

TABLE-US-00008 TABLE 8
24189 R | cggaacaggtgattgtggt | (SEQ ID NO: 326)
24209 F | cacc gcccacttcaagatgaagctgt | (SEQ ID NO: 327)

[0532] Vs24214-24239: genomic region corresponding to gene id between V12B01_24214 and V12B01_24239 (see Example 1).

TABLE-US-00009 TABLE 9
24214 F | cacc caagcgatagtttatatagcgt | (SEQ ID NO: 328)
24239 R_1 | gtggctaagtacatgccggt | (SEQ ID NO: 329)

[0533] The entry vectors were recombined with the destination vector pET-DEST42 (Invitrogen) using the LR recombinase enzyme (Invitrogen). These destination vectors were then put into electro-competent DH10B or BL21 cells.

[0534] The alginate lyase clones were then made by digesting (using enzymes Nde I and Bam HI) the PCR products that were generated using Vibrio splendidus 12B01 genomic DNA as a template and amplified with the following primer pairs:

TABLE-US-00010 TABLE 10
24214 ndeF | GGAATTC CAT atgacaaagaatatgacgactaaac | (SEQ ID NO: 330) | forward primer for V12B01_24214
24214 bamR | CG GGATCC ttattatttcccctgccctgcagt | (SEQ ID NO: 331) | reverse primer for V12B01_24214
24219 ndeF | GGAATTC CAT atgagctatcaaccacttttac | (SEQ ID NO: 332) | forward primer for V12B01_24219
24219 bamR | CG GGATCC ttacagttgagcaaatgatcc | (SEQ ID NO: 333) | reverse primer for V12B01_24219

[0535] The digested PCR products were then ligated into cut pET28 vector. Certain of the cloned genomic regions of Vibrio splendidus 12B01 were tested for the presence of secreted alginate lyases, and the above-described constructs were tested in various combinations for the ability to confer growth on alginate as a sole source of carbon.

[0536] The Vs24254 (SEQ ID NO: 32) region of Vibrio splendidus encodes a functional external alginate lyase. BL21 cells expressing Vs24254 from the pET28 vector were capable of breaking down alginate in the growth medium. When grown on LB+2% alginate+0.1 mM isopropyl .beta.-D-1-thiogalactopyranoside (IPTG), only cells expressing the Vs24254 gene gave a positive (pink) result in the thiobarbituric acid (TBA) assay. This assay was performed by spinning down an overnight culture grown on the above-mentioned media. The media was then mixed in a 1:1 ratio with 0.8% TBA, heated for 10 min at 99 degrees Celsius, and assayed for pink coloration. FIG. 47 shows the results of this assay. The left tube in FIG. 47 represents media taken from an overnight culture of cells expressing Vs24254, while the right-hand tube shows the TBA reaction using media from cells expressing Vs24259 (negative control). The lack of pink coloration in the negative control indicates that little or no cleavage of the alginate polymer has occurred. Wild-type E. coli cells not expressing any recombinant proteins show the same coloration as the negative control Vs24259 (data not shown).

[0537] To test the ability of recombinant E. coli to grow on alginate as a sole source of carbon, transformed cells were grown for 19 hours at 30 degrees Celsius with mild shaking in a 96-well plate. Each well held 222 .mu.l of minimal media (see growth conditions for explanation of minimal media) with the 0.66% carbon source in the form of either degraded alginate or glucose (positive control for growth). All cells were either BL21 with no plasmid (BL21--negative control), one plasmid (Da or 3a), or two plasmids (Dk3a and Da3k). The plasmids are indicated by the lower case letter: "a" refers to the plasmid backbone pET-DEST42 and "k" refers to the pENTR/D/TOPO backbone. "D" indicates that the plasmid contains the genomic region Vs24214-24249, while "3" indicates that the plasmid contains the genomic region Vs24189-24209. Thus, Da would be pET-DEST42-Vs24214-24249, Da3k would be pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209 and so on.
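The two-character naming scheme for the strains can be expressed as a small lookup. The Python sketch below is only an illustration of that convention: the function name and dictionaries are hypothetical, and the mappings are taken directly from the definitions of "D", "3", "a", and "k" given above.

```python
# Genomic region codes and plasmid backbone codes as defined in the text.
REGIONS = {"D": "Vs24214-24249", "3": "Vs24189-24209"}
BACKBONES = {"a": "pET-DEST42", "k": "pENTR/D/TOPO"}


def decode_label(label):
    """Expand a strain label such as 'Da3k' into full construct names.

    Each plasmid is encoded as a region character followed by a
    backbone character, so labels are read two characters at a time.
    """
    if len(label) % 2 != 0:
        raise ValueError(f"label must be region/backbone pairs: {label!r}")
    constructs = []
    for i in range(0, len(label), 2):
        region_code, backbone_code = label[i], label[i + 1]
        constructs.append(f"{BACKBONES[backbone_code]}-{REGIONS[region_code]}")
    return constructs


for strain in ("Da", "3a", "Dk3a", "Da3k"):
    print(strain, "->", " and ".join(decode_label(strain)))
```

Reading the labels this way makes it clear that Dk3a and Da3k carry the same two genomic regions with the backbones swapped.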

[0538] As shown in FIG. 56A, the two vector constructs pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209, when combined in E. coli, confer growth on degraded alginate as the sole carbon source. The same result was observed when these genomic inserts were switched into the opposite vectors (pET-DEST42-Vs24189-24209 and pENTR/D/TOPO-Vs24214-24249). FIG. 56B shows growth on glucose as a positive control. Thus, the combined genomic regions Vs24214-24249 and Vs24189-24209 from Vibrio splendidus were sufficient to confer on E. coli the ability to grow on alginate as a sole source of carbon.

Example 11

Production of Ethanol from Alginate

[0539] The ability of recombinant E. coli to produce ethanol by growing on alginate as a source of carbon was tested. To generate recombinant E. coli, DNA sequences encoding pyruvate decarboxylase (pdc) and two alcohol dehydrogenases (adhA and adhB) of Zymomonas mobilis were amplified by polymerase chain reaction (PCR). These amplified fragments were gel purified and spliced together by another round of PCR. The final amplified DNA fragment was digested with BamHI and XbaI and ligated into pBBR1MCS-2 pre-digested with the same restriction enzymes. The resulting plasmid is referred to as pBBRPdc-AdhA/B.

[0540] E. coli was transformed with either pBBRPdc-AdhA/B or pBBRPdc-AdhA/B+1.5 FOS (fosmid clone containing the genomic region between V12B01_24189 and V12B01_24249; these sequences confer on E. coli the ability to use alginate as a sole source of carbon, see Examples 1 and 10), grown in M9 media containing alginate, and tested for the production of ethanol. The results are shown in FIG. 57, which demonstrates that the strain harboring pBBRPdc-AdhA/B+1.5 FOS showed significantly higher ethanol production when growing on alginate. These results indicate that the strain harboring pBBRPdc-AdhA/B+1.5 FOS was able to utilize alginate as a source of carbon in the production of ethanol.

[0541] The various embodiments described above can be combined to provide further embodiments. All of the U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet are incorporated herein by reference in their entirety. Aspects of the embodiments can be modified, if necessary, to employ concepts of the various patents, applications and publications to provide yet further embodiments.

[0542] These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.

[0543] The following publications are herein incorporated by reference in their entirety.
[0544] 1. T. Y. Wong, L. A. Preston, N. L. Schiller, Annu Rev Microbiol 54, 289 (2000).
[0545] 2. W. Hashimoto, O. Miyake, A. Ochiai, K. Murata, J Biosci Bioeng 99, 48 (January, 2005).
[0546] 3. M. Yamasaki, K. Ogura, W. Hashimoto, B. Mikami, K. Murata, J Mol Biol 352, 11 (Sep. 9, 2005).
[0547] 4. M. Yamasaki et al., Acta Crystallogr Sect F Struct Biol Cryst Commun 61, 288 (Mar. 1, 2005).
[0548] 5. O. Miyake, A. Ochiai, W. Hashimoto, K. Murata, J Bacteriol 186, 2891 (May, 2004).
[0549] 6. O. Miyake, W. Hashimoto, K. Murata, Protein Expr Purif 29, 33 (May, 2003).
[0550] 7. H. J. Yoon, B. Mikami, W. Hashimoto, K. Murata, J Mol Biol 290, 505 (Jul. 9, 1999).
[0551] 8. H. J. Yoon, W. Hashimoto, O. Miyake, K. Murata, B. Mikami, J Mol Biol 307, 9 (Mar. 16, 2001).
[0552] 9. W. Hashimoto, O. Miyake, K. Momma, S. Kawai, K. Murata, J Bacteriol 182, 4572 (August, 2000).
[0553] 10. H. J. Yoon et al., Protein Expr Purif 19, 84 (June, 2000).
[0554] 11. T. Osawa, Y. Matsubara, T. Muramatsu, M. Kimura, Y. Kakuta, J Mol Biol 345, 1111 (Feb. 4, 2005).
[0555] 12. A. Ochiai, W. Hashimoto, K. Murata, Res Microbiol 157, 642 (September, 2006).
[0556] 13. F. J. Mergulhao, D. K. Summers, G. A. Monteiro, Biotechnol Adv 23, 177 (May, 2005).
[0557] 14. J. H. Choi, S. Y. Lee, Appl Microbiol Biotechnol 64, 625 (June, 2004).
[0558] 15. M. P. DeLisa, D. Tullman, G. Georgiou, Proc Natl Acad Sci USA 100, 6115 (May 13, 2003).
[0559] 16. N. Blaudeck, G. A. Sprenger, R. Freudl, T. Wiegert, J Bacteriol 183, 604 (January, 2001).
[0560] 17. N. Pradel et al., Biochem Biophys Res Commun 306, 786 (Jul. 4, 2003).
[0561] 18. L. Masip et al., Science 303, 1185 (Feb. 20, 2004).
[0562] 19. C. M. Barrett, N. Ray, J. D. Thomas, C. Robinson, A. Bolhuis, Biochem Biophys Res Commun 304, 279 (May 2, 2003).
[0563] 20. R. Binet, S. Letoffe, J. M. Ghigo, P. Delepelaire, C. Wandersman, Folia Microbiol (Praha) 42, 179 (1997).
[0564] 21. I. Gentschev, G. Dietrich, W. Goebel, Trends Microbiol 10, 39 (January, 2002).
[0565] 22. V. Koronakis, FEBS Lett 555, 66 (Nov. 27, 2003).
[0566] 23. J. Jose, Appl Microbiol Biotechnol 69, 607 (February, 2006).
[0567] 24. J. Jose, D. Betscheider, D. Zangen, Anal Biochem 346, 258 (Nov. 15, 2005).
[0568] 25. M. Ashiuchi, H. Misono, Appl Microbiol Biotechnol 59, 9 (June, 2002).
[0569] 26. J. Narita et al., Appl Microbiol Biotechnol 70, 564 (May, 2006).
[0570] 27. Y. Aso et al., Nat Biotechnol 24, 188 (February, 2006).
[0571] 28. W. Hashimoto et al., Biosci Biotechnol Biochem 69, 673 (April, 2005).
[0572] 29. A. E. Lagarde, F. R. Stoeber, J Bacteriol 129, 606 (February, 1977).
[0573] 30. M. A. Mandrand-Berthelot, P. Ritzenthaler, M. Mata-Gilsinger, J Bacteriol 160, 600 (November, 1984).
[0574] 31. J. Pouyssegur, F. Stoeber, J Bacteriol 117, 641 (February, 1974).
[0575] 32. J. Preiss, G. Ashwell, J Biol Chem 237, 309 (February, 1962).
[0576] 33. J. Preiss, G. Ashwell, J Biol Chem 237, 317 (February, 1962).
[0577] 34. G. M. Bird, P. Haas, Biochemical Journal 25, 403 (1931).
[0578] 35. L. H. Cretcher, W. L. Nelson, Science 67, 537 (May 25, 1928).
[0579] 36. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 51, 1914 (1929).
[0580] 37. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 52, 2130 (1930).
[0581] 38. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 54, 3409 (1932).
[0582] 39. E. Schoeffel, K. P. Link, Journal of Biological Chemistry 95, 213 (1932).
[0583] 40. E. Schoeffel, K. P. Link, Journal of Biological Chemistry 100, 397 (1933).
[0584] 41. H. A. Spoehr, Archive of Biochemistry 14, 153 (1947).
[0585] 42. J. J. Farmer, 3rd, R. G. Eagon, J Bacteriol 97, 97 (January, 1969).
[0586] 43. R. L. Anderson, D. P. Allison, J Biol Chem 240, 2367 (June, 1965).
[0587] 44. W. J. Lennarz, R. J. Light, K. Bloch, Proc Natl Acad Sci USA 48, 840 (May, 1962).
[0588] 45. S. A. Graham, Crit Rev Food Sci Nutr 28, 139 (1989).
[0589] 46. E. Wiberg, P. Edwards, J. Byrne, S. Stymne, K. Dehesh, Planta 212, 33 (December, 2000).
[0590] 47. L. Yuan, T. A. Voelker, D. J. Hawkins, Proc Natl Acad Sci USA 92, 10639 (Nov. 7, 1995).
[0591] 48. K. Dehesh, A. Jones, D. S. Knutzon, T. A. Voelker, Plant J 9, 167 (February, 1996).
[0592] 49. K. Dehesh, P. Edwards, T. Hayes, A. M. Cranmer, J. Fillatti, Plant Physiol 110, 203 (January, 1996).
[0593] 50. K. M. Mayer, J. Shanklin, BMC Plant Biol 7, 1 (2007).
[0594] 51. J. K. Jha et al., Plant Physiol Biochem 44, 645 (November-December, 2006).
[0595] 52. B. S. Schutt, M. Brummel, R. Schuch, F. Spener, Planta 205, 263 (June, 1998).
[0596] 53. K. Dehesh, P. Edwards, J. Fillatti, M. Slabaugh, J. Byrne, Plant J 15, 383 (August, 1998).
[0597] 54. J. M. Leonard, S. J. Knapp, M. B. Slabaugh, Plant J 13, 621 (March, 1998).
[0598] 55. M. Vedadi, R. Szittner, L. Smillie, E. Meighen, Biochemistry 34, 16725 (Dec. 26, 1995).
[0599] 56. M. O. Park, J Bacteriol 187, 1426 (February, 2005).
[0600] 57. M. O. Park, K. Heguri, K. Hirata, K. Miyamoto, J Appl Microbiol 98, 324 (2005).
[0601] 58. M. O. Park, M. Tanabe, K. Hirata, K. Miyamoto, Appl Microbiol Biotechnol 56, 448 (August, 2001).
[0602] 59. M. Morikawa, T. Iwasa, S. Yanagida, T. Imanaka, Journal of Fermentation and Bioengineering 85, 243 (1998).
[0603] 60. M. Dennis, P. E. Kolattukudy, Proc Natl Acad Sci USA 89, 5306 (Jun. 15, 1992).
[0604] 61. T. M. Cheesbrough, P. E. Kolattukudy, J Biol Chem 263, 2738 (Feb. 25, 1988).
[0605] 62. M. C. Chang, R. A. Eachus, W. Trieu, D. K. Ro, J. D. Keasling, Nat Chem Biol 3, 274 (May, 2007).
[0606] 63. R. J. Porra, B. D. Ross, Biochem J 94, 557 (March, 1965).
[0607] 64. X. Chen, W. Guo, L. Zhao, Q. Fu, Y. Ma, J Phys Chem A 111, 3566 (May 10, 2007).
[0608] 65. L. Zhao, W. Guo, R. Zhang, S. Wu, X. Lu, Chemphyschem 7, 1345 (Jun. 12, 2006).
[0609] 66. L. Zhao, R. Zhang, W. Guo, S. Wu, X. Lu, Chemical Physics Letters 414, 28 (2005).
[0610] 67. G. Gorgen, W. Boland, Eur J Biochem 185, 237 (Nov. 6, 1989).
[0611] 68. P. Ney, W. Boland, Eur J Biochem 162, 203 (Jan. 2, 1987).
[0612] 69. Z. L. Boynton, G. N. Bennett, F. B. Rudolph, Appl Environ Microbiol 62, 2758 (August, 1996).
[0613] 70. R. T. Yan, J. S. Chen, Appl Environ Microbiol 56, 2591 (September, 1990).
[0614] 71. R. V. Nair, G. N. Bennett, E. T. Papoutsakis, J Bacteriol 176, 871 (February, 1994).
[0615] 72. D. P. Wiesenborn, F. B. Rudolph, E. T. Papoutsakis, Appl Environ Microbiol 55, 317 (February, 1989).
[0616] 73. D. K. Thompson, J. S. Chen, Appl Environ Microbiol 56, 607 (March, 1990).
[0617] 74. M. G. Hartmanis, J Biol Chem 262, 617 (Jan. 15, 1987).
[0618] 75. K. X. Huang, S. Huang, F. B. Rudolph, G. N. Bennett, J Mol Microbiol Biotechnol 2, 33 (January, 2000).
[0619] 76. L. Fontaine et al., J Bacteriol 184, 821 (February, 2002).
[0620] 77. B. McMahon, M. E. Gallagher, S. G. Mayhew, FEMS Microbiol Lett 250, 121 (Sep. 1, 2005).
[0621] 78. M. Li, S. Yao, S. K., Microbial Biotechnology 23, 573 (2007).
[0622] 79. T. B. Causey, S. Zhou, K. T. Shanmugam, L. O. Ingram, Proc Natl Acad Sci USA 100, 825 (Feb. 4, 2003).
[0623] 80. D. E. Chang, S. Shin, J. S. Rhee, J. G. Pan, J Bacteriol 181, 6656 (November, 1999).
[0624] 81. C. R. Dittrich, R. V. Vadali, G. N. Bennett, K. Y. San, Biotechnol Prog 21, 627 (March-April, 2005).
[0625] 82. H. Lin, N. M. Castro, G. N. Bennett, K. Y. San, Appl Microbiol Biotechnol 71, 870 (August, 2006).
[0626] 83. U. Schorken, G. A. Sprenger, Biochim Biophys Acta 1385, 229 (Jun. 29, 1998).
[0627] 84. G. A. Sprenger, M. Pohl, Journal of Molecular Catalysis B: Enzymatic 6, 145 (1999).
[0628] 85. G. A. Sprenger, M. Pohl, Journal of Molecular Catalysis B: Enzymic 6, 145 (1999).
[0629] 86. B. Gonzalez, R. Vicuna, J Bacteriol 171, 2401 (May, 1989).
[0630] 87. P. Hinrichsen, I. Gomez, R. Vicuna, Gene 144, 137 (Jun. 24, 1994).
[0631] 88. E. Janzen et al., Bioorg Chem 34, 345 (December, 2006).
[0632] 89. M. M. Kneen, I. D. Pogozheva, G. L. Kenyon, M. J. McLeish, Biochim Biophys Acta 1753, 263 (Dec. 1, 2005).
[0633] 90. K. Yamada-Onodera, A. Nakajima, Y. Tani, J Biosci Bioeng 102, 545 (December, 2006).
[0634] 91. K. Yamada-Onodera, M. Fukui, Y. Tani, J Biosci Bioeng 103, 174 (February, 2007).
[0635] 92. T. Tobimatsu, M. Azuma, S. Hayashi, K. Nishimoto, T. Toraya, Biosci Biotechnol Biochem 62, 1774 (September, 1998).
[0636] 93. T. Tobimatsu et al., J Biol Chem 271, 22352 (Sep. 13, 1996).
[0637] 94. T. Toraya, T. Shirakashi, T. Kosuga, S. Fukui, Biochem Biophys Res Commun 69, 475 (Mar. 22, 1976).
[0638] 95. M. Yamanishi et al., Eur J Biochem 269, 4484 (September, 2002).
[0639] 96. J. R. O'Brien et al., Biochemistry 43, 4635 (Apr. 27, 2004).
[0640] 97. C. Raynaud, P. Sarcabal, I. Meynial-Salles, C. Croux, P. Soucaille, Proc Natl Acad Sci USA 100, 5010 (Apr. 29, 2003).
[0641] 98. B. Ludwig, A. Akundi, K. Kendall, Appl Environ Microbiol 61, 3729 (October, 1995).
[0642] 99. S. X. Xie, J. Ogawa, S. Shimizu, Biosci Biotechnol Biochem 63, 1721 (October, 1999).
[0643] 100. T. Zelinski, J. Peters, M. R. Kula, J Biotechnol 33, 283 (Apr. 15, 1994).
[0644] 101. M. C. Hunt, A. Rautanen, M. A. Westin, L. T. Svensson, S. E. Alexson, Faseb J 20, 1855 (September, 2006).
[0645] 102. M. A. Westin, S. E. Alexson, M. C. Hunt, J Biol Chem 279, 21841 (May 21, 2004).
[0646] 103. M. A. Westin, M. C. Hunt, S. E. Alexson, J Biol Chem 280, 38125 (Nov. 18, 2005).
[0647] 104. H. Iwaki, Y. Hasegawa, S. Wang, M. M. Kayser, P. C. Lau, Appl Environ Microbiol 68, 5671 (November, 2002).

Sequence CWU 1

1

333112066DNAVibrio splendidus 1ggggacaagt ttgtacaaaa aagcaggctt gacgcttatc acatttagta gaagcttatg 60tggaggcgat tggctttttt ttcaaggaag attacaaaat agctcaggta atgccgattt 120atagatttgc tatgatatag ttcaggatct tatgctttta ataagcagga acagaattta 180tgaacaaaaa agctgatagt ttagtaggtt acagctttat tcgttataga aagggttagg 240gaacgtgaac tttttagagc tcaaacttcg catggataac tctccggtgc tgagccgatt 300tttagagaat ggatttttac tccagcagaa actgagcctt gttctttgtt gtgtgttgat 360cgcagcttct gcatggattt taggacagct tgcatggttt attgaacctg ctgagcaaac 420cgtcgtgcca tggacagcaa cggcttcctc gtcttcaacg cctcaatcga ctcttgatat 480ctcttctttg cagcagagca acatgtttgg tgcttataac ccaaccacgc ctgctgtggt 540tgagcagcaa gttatccaag atgcgccaaa gacgcgactg aacctcgttt tagtgggtgc 600agtagccagt tctaatccaa agctgagctt ggctgtgatt gccaatcgcg gcacacaagc 660aacctacggc attaatgaag agatcgaagg tacgcgagct aagttaaaag cggtattagt 720cgatcgcgtg attattgata actcaggtcg agacgaaacc ttgatgcttg aaggcattga 780gtacaagcgt ttgtctgtat cagcacctgc gccacctcgt acctcttctt ctgtgcgtgg 840caacaaccca gcttctgcag aagagaagct agatgaaatt aaagcgaaga taatgaaaga 900tccgcaacaa atcttccaat atgttcgact gtctcaggtg aaacgcgacg ataaagtgat 960tggttatcgt gtgagccctg gcaaagattc agaacttttt aactctgttg ggctccaaaa 1020cggagatatt gccactcagt taaatggaca agacctgaca gaccctgctg ctatgggcaa 1080catattccgt tctatctcag agctgacaga gctaaacctc gtcgtcgaga gagatggtca 1140acaacatgaa gtgtttattg aattttagaa ctttgcgtct aacgaaggac gaaagtgtag 1200gagaagtacg tgaagcattg gtttaagaaa agtgcatggt tattggcagg aagcttaatc 1260tgcacacccg cagccatcgc gagtgatttt agtgccagct ttaaaggcac tgatattcaa 1320gagtttatta atattgttgg tcgtaaccta gagaagacga tcatcgttga cccttcggtg 1380cgcggaaaaa tcgatgtacg cagctacgac gtactcaatg aagagcaata ctacagcttc 1440ttcctaaacg tattggaagt gtatggctac gcggttgtcg aaatggactc gggtgttctt 1500aagatcatca aggccaaaga ttcgaaaaca tcggcaattc cagtcgttgg agacagtgac 1560acgatcaaag gcgacaatgt ggtgacacgt gttgtgacgg ttcgtaatgt ctcggtgcgt 1620gaactttctc ctctgcttcg tcaactaaac gacaatgcag gcgcgggtaa cgttgtgcac 1680tacgacccag ccaacatcat 
ccttattaca ggccgagcgg cggtagtaaa ccgtttagct 1740gaaatcatca agcgtgttga ccaagcgggt gataaagaga ttgaagtcgt tgagctaaag 1800aatgcttctg cggcagaaat ggtacgtatc gttgatgcgt taagcaaaac cactgatgcg 1860aaaaacacac ctgcatttct acaacctaaa ttagttgccg atgaacgtac caatgcgatt 1920cttatctcag gcgaccctaa agtacgtagc cgtttaagaa ggctgattga acagcttgat 1980gttgaaatgg caaccaaggg caataaccaa gttatttacc ttaaatatgc aaaagccgaa 2040gatctagttg atgtgctgaa aggcgtgtcg gacaacctac aatcagagaa gcagacatca 2100accaaaggaa gttcatcgca gcgtaaccaa gtgatgatct cagctcacag tgacaccaac 2160tctttagtga ttaccgcaca gccggacatc atgaatgcgc ttcaagatgt gatcgcacag 2220ctggatattc gtcgtgctca agtattgatt gaagcactga ttgtcgaaat ggccgaaggt 2280gacggcgtta accttggtgt gcagtggggt aaccttgaaa cgggtgccat gattcagtac 2340agcaacactg gcgcttccat tggcggtgtg atggttggtt tagaagaagc gaaagacagc 2400gaaacgacaa ccgctgttta tgattcagac ggtaaattct tacgtaatga aaccacgacg 2460gaagaaggtg actattcaac attagcttcc gcactttctg gtgttaatgg tgcggcaatg 2520agtgtggtaa tgggtgactg gaccgccttg atcagtgcag tagcgaccga ttcaaattca 2580aatatcctat cttctccaag tatcaccgtg atggataacg gcgaagcgtc attcattgtg 2640ggtgaagagg tgcctgttct aaccggttct acagcaggct caagtaacga caacccattc 2700caaacagttg aacgtaaaga agtgggtatc aagcttaaag tggtgccgca aatcaatgaa 2760ggtgattcgg ttcaactgca aatagaacaa gaagtatcga acgtattagg cgccaatggt 2820gcggttgatg tgcgttttgc taagcgacag ctaaatacat cagtgattgt tcaagacggt 2880caaatgctgg tgttgggtgg cttgattgac gagcgagcat tggaaagtga atctaaggtg 2940ccgttcttgg gagatattcc tgtgcttgga cacttgttca aatcaaccag tactcaggtt 3000gagaaaaaga acctaatggt cttcatcaaa ccaaccatta ttcgtgatgg tatgacagcc 3060gatggtatca cgcagcgtaa atacaacttc atccgtgctg agcagttgta caaggctgag 3120caaggactga agttaatggc agacgataac atcccagtat tgcctaaatt tggtgccgac 3180atgaatcacc cggctgaaat tcaagccttc atcgatcaaa tggaacaaga ataatggctg 3240aattggtagg ggcggcacgt acttatcagc gcttgccgtt tagctttgcg aatcgctaca 3300agatggtgtt ggaataccaa catccagagc gcgcaccgat actttattat gttgagccac 3360tgaaatcggc ggcgatcatt gaagtgagtc gtgttgtgaa aaatggtttc 
acgccacaag 3420cgattactct cgatgagttt gataaaaaac taaccgatgc ttatcagcgt gactcgtcag 3480aagctcgtca gctcatggaa gacattggtg ctgatagtga tgatttcttc tcactagcgg 3540aagaactgcc tcaagacgaa gacttacttg aatcagaaga tgatgcacca atcatcaagt 3600taatcaatgc gatgctgggt gaggcgatca aagagggtgc ttcggatata cacatcgaaa 3660cctttgaaaa gtcactttgt atccgtttcc gagttgatgg tgtgctgcgt gatgttctag 3720cgccaagccg taaactggct ccgctattgg tttcacgtgt caaggttatg gctaaactgg 3780atattgcgga aaaacgcgtg ccacaagatg gtcgtatttc tctgcgtatt ggtggccgag 3840cggttgatgt tcgtgtttca accatgcctt cttcgcatgg tgagcgtgtg gtaatgcgtc 3900tgttggacaa aaatgccact cgtctagact tgcacagttt aggtatgaca gccgaaaacc 3960atgaaaactt ccgtaagctg attcagcgcc cacatggcat tatcttggtg accggcccga 4020caggttcagg taaatcgacg accttgtacg caggtctgca agaactcaac agcaatgaac 4080gaaacatttt aaccgttgaa gacccaatcg aattcgatat cgatggcatt ggtcaaacac 4140aagtgaaccc taaggttgat atgacctttg cgcgtggttt acgtgccatt cttcgtcaag 4200atcctgatgt tgttatgatt ggtgagatcc gtgacttgga gaccgcagag attgctgtcc 4260aggcctcttt gacaggtcac ttagttatgt cgactctgca taccaatact gccgtcggtg 4320cgattacacg tctacgtgat atgggcattg aacctttctt gatctcttct tcgctgctgg 4380gtgttttggc tcagcgcttg gttcgtactt tatgtaacga atgtaaagaa ccttatgaag 4440ccgataaaga gcagaagaaa ctgtttgggt tgaagaagaa agaaagcttg acgctttacc 4500atgccaaagg ttgtgaagag tgtggccata agggttatcg aggtcgtacg ggtattcatg 4560agctgttgat gattgatgat tcagtacaag agctgattca cagtgaagcg ggtgagcagg 4620cgattgataa agcaattcgt ggcacaacac caagtattcg agatgatggc ttgagcaaag 4680ttctgaaagg ggtaacgtcc ctagaagaag tgatgcgcgt gaccaaggaa gtctagtatg 4740gcggcatttg aatacaaagc actggatgcc aaaggcaaaa gtaaaaaagg ctcaattgaa 4800gcagataatg ctcgtcaggc tcgccaaaga ataaaagagc ttggcttgat gccggttgag 4860atgaccgagg ctaaagcaaa aacagcaaaa ggtgctcagc catcgaccag ctttaaacgc 4920ggcatcagta cgcctgatct tgcgcttatt actcgtcaaa tatccacgct cgttcaatct 4980ggtatgccgc tagaagagtg tttgaaagcc gttgccgaac agtctgagaa acctcgtatt 5040cgcaccatgc tactcgcggt gagatctaag gtgactgaag gttattcgtt agcagacagc 5100ttgtctgatt atccccatat 
cttcgatgag ctattcagag ccatggttgc tgctggtgag 5160aagtcagggc atctagatgc ggtattggaa cgattggctg actacgcaga aaaccgtcag 5220aagatgcgtt ctaagttgct gcaagcgatg atctacccca tcgtgctggt ggtgtttgcg 5280gtgacgattg tgtcgttcct actggcaacg gtagtgccga agatcgttga gcctattatc 5340caaatgggac aagagctccc tcagtcgaca caatttttat tagcatcgag tgaatttatc 5400cagaattggg gcatccaatt actggtgttg accattggtg tgattgtgtt ggttaagact 5460gcgctgaaaa agccgggcgt tcgcatgagc tgggatcgca aattattgag catcccgctg 5520ataggcaaga tagcgaaagg gatcaacacc tctcgttttg cacgaacact ttctatctgt 5580acctctagtg cgattcctat ccttgaaggg atgaaggtcg cggtagatgt gatgtcgaat 5640catcacgtga aacaacaagt attacaggca tcagatagcg ttagagaagg ggcaagcctg 5700cgtaaagcgc ttgatcaaac caaactcttt cccccgatga tgctgcatat gatcgccagt 5760ggtgagcaga gtggccaatt ggaacagatg ctgacaagag cggcagataa tcaggatcaa 5820agctttgaat cgaccgttaa tatcgcgtta ggcattttta ccccagcgct tattgcgttg 5880atggctggct tagtgctgtt tatcgtgatg gcgacgctga tgccaatgct tgaaatgaac 5940aatttaatga gtggttaacc tgccgctcat cagacgttag tttttggatt atcgagaaga 6000aggacatcat tcccctcaac tcgctatctg taatttggag aaaataatga aaaataaaat 6060gaaaaaacaa tcaggcttta ccctattaga agtcatggtt gttgtcgtta tccttggtgt 6120tctagcaagt tttgttgtac ctaacctgtt gggcaacaaa gagaaggcgg atcaacaaaa 6180agccatcact gatattgtgg cgctagagaa cgcgctcgac atgtacaaac tggataacag 6240cgtttaccca acaacggatc aaggcctgga cgggttggtg acaaagccaa gcagtccaga 6300gcctcgtaac taccgagacg gcggttacat caagcgtcta cctaacgacc catggggcaa 6360tgagtaccaa tacctaagtc ctggtgataa cggcacaatt gatatcttca ctcttggcgc 6420agatggtcaa gaaggtggtg aaggtattgc tgcagatatc ggcaactgga acatgcagga 6480cttccaataa gcttcggctt gttgtcggtt gatacgttcc tgttgtttga ttcgttatcg 6540ttgcttgata cgttattgat ggtagtacgc aaaaaatgga gtctacaagg tgaaaactaa 6600gcaaacacag ccaggtttca ccttgattga gattcttttg gtgttggtat tactgtcagt 6660atcggcggtc gcggtgatct cgaccatccc taccaatagc aaagatgttg ctaaaaaata 6720cgctcaaagc ttttatcagc gaattcagct actcaatgaa gaggctattt tgagtggctt 6780agattttggt gttcgtgttg atgaaaaaaa atcgacttac gttctgatga 
ctttgaagtc 6840tgatggctgg caagaaacgg agttcgaaaa gatcccttct tcaactgaat taccggaaga 6900actggcactg tcgctgacat taggtggtgg cgcgtgggaa gacgatgatc ggttgttcaa 6960tccaggaagc ttatttgatg aagatatgtt tgctgatctt gaagaggaaa agaagccgaa 7020accaccacag atctacatct tgtcgagtgc tgaaatgacg ccatttgtac tgtcgtttta 7080cccaaatacc ggtgacacaa tacaagatgt ttggcgcatt cgagtattgg ataatggtgt 7140gattcgatta ctcgagccgg gagaagaaga tgaagaagaa taaccgttct ccttatcgtt 7200ctcgcggtat gcctcttggt tctcgaggaa tgactctgct tgaagtattg gttgcgctgg 7260ctatcttcgc tacggcggcg atcagtgtga ttcgtgctgt cacccagcac atcaatacgc 7320tcagttatct cgaagaaaaa accttcgcgg cgatggtcgt tgataatcaa atggccctag 7380tcatgctaca tcctgagatg cttaaaaaag cgcagggcac gcaagagtta gcgggaagag 7440aatggttctg gaaggtgact cccatcgata ccagcgataa tttattaaag gcgtttgatg 7500tgagtgcggc aaccagtaag aaagcgtctc cagtcgttac ggtgcgcagt tatgtggtta 7560attaagagaa tgtggtcaat taagagcatg ttattaatta agaacagctc gctaactaag 7620agcgtgtcgc taactaagag catgtcggaa aataagcgta cgccgcgtaa acaaggtcta 7680ccttcaaaag ggagaggctt taccttaatt gaagtcttgg tctcgattgc tatctttgcc 7740acgctaagta tggcggctta tcaggtggtt aatcaggtgc agcgaagcaa cgagatctct 7800attgagcgca gtgctcgttt gaaccaactg caacgcagtt tagtcatttt agataatgat 7860tttcgccaga tggcggtgcg aaaatttcgt accaacggtg aagaagcatc atctaagctg 7920atcttaatga aagagtattt attggactcc gacagtgtag gcatcatgtt tactcgtcta 7980ggttggcaca acccacaaca gcagtttcct cgcggtgaag tcacgaaggt tggctaccgt 8040attaaagaag aaacacttga gcgtgtatgg tggcgttatc ccgatacacc ttcaggccaa 8100gaaggtgtga ttacccctct gcttgatgat gttgaaagct tggaattcga gttttatgac 8160ggaagccgct gggggaaaga gtggcaaacc gataaatcac tgccgaaagc ggtgaggctt 8220aagctgacac tgaaagacta tggtgagata gagcgtgttt atctcactcc cggtggcacc 8280ctagatcagg ccgatgattc ttcaaacagt gactcttcag gcagtagtga ggggaataat 8340gactcatcga actaataagc gtttagcgac aaggtcagcc ttgggacgta aacaacgtgg 8400tgtcgcgctg atcattattt tgatgctatt ggcgatcatg gcaaccattg ctggcagcat 8460gtccgagcgt ttgtttacgc aattcaagcg cgttggtaac caactgaatt accaacaggc 8520ttactggtac agcattggtg 
tggaagcgct tgtgcaaaac ggtattaggc aaagttacaa 8580agacagtgat accgtgaacc taagccaacc atgggcgtta gaagagcagg tatacccatt 8640ggattatggc caagttaagg gccgcattgt tgatgctcag gcatgtttta atcttaatgc 8700cttagccgga gtggcgacca cttcaagtaa ccagactcct tatttaatca cggtttggca 8760aaccttattg gaaaaccaag acgttgagcc ttatcaggct gaggttatcg caaattcaac 8820gtgggaattt gttgatgcgg atacacgaac cacctcttcg tctggtgtag aagacagcac 8880gtatgaagcg atgaagccct cttatttggc ggcgaatggc ttaatggccg atgaatccga 8940gctacgagcg gtttatcaag tcactggtga agtgatgaat aaggttcgcc cctttgtttg 9000cgctctgcca accgatgatt tccgcttgaa tgtgaatact ctcacggaaa aacaagcacc 9060gttattggaa gcgatgtttg cgccaggctt aagtgaatcg gatgccaaac agctgataga 9120taaacgccca tttgatggct gggatacggt agatgctttc atggctgaac ctgccattgt 9180tggtgtaagt gccgaagtca gcaagaaagc gaaagcatat ttaactgtag atagcgccta 9240ttttgagcta gatgcagagg tattagttga gcagtcacgt gtacgtatac ggacgctttt 9300ctatagtagt aatcgagaaa cagtgacggt agtacgccgt cgttttggag gaatcagtga 9360gcgagtttct gaccgttcga ctgagtagcg aaccacaaag ccctgtgcag tggttagttt 9420ggtcgacaag ccaacaagaa gtgatagcaa gcggtgaact gtctagctgg gaacagcttg 9480acgagttaac gccttacgct gaaaagcgca gctgtatcgc tttattgccg ggaagtgaat 9540gcttaattaa gcgtgttgag atcccgaaag gtgctgctcg ccagtttgat tctatgctgc 9600cgttcttatt agaagacgaa gtcgcacaag atatcgaaga cttacacctg actattttag 9660ataaagatgc cactcacgct accgtgtgtg gtgtggatcg tgaatggcta aaacaagctt 9720tagacctgtt tcgcgaagcc aatataatct tccgtaaggt gctaccagat acactagccg 9780tgccttttga agaacaaggc atcagtgcgt tgcagataga tcagcattgg ttattgcgcc 9840aaggtcactc tcaacgtcaa ggtcactatc aagccgtatc gatcagtgaa gcatggttac 9900cgatgttttt gcaaagtgat tgggttgtcg ctggtgagga agagcaagcg acgactatct 9960tcagctatac cgcgatgccg agcgacgacg ttcaacagca aagcggcctc gagtggcaag 10020caaagcctgc ggaattggtg atgtctttat tgagtcagca agcgatcaca agcggcgtaa 10080atttactgac tggcaccttt aaaaccaaat cttcattcag taaatattgg cgtgtttggc 10140agaaagtggc gattgctgct tgtttgctgg tggccgtgat tgtgactcag caagtgttga 10200aggttcagca atacgaagcg caagcacaag cctaccgcat ggagagtgag 
cgtatcttta 10260gagctgtgct gcctggcaaa caacgcattc cgaccgtgag ttacctcaag cgtcagatga 10320atgatgaagc taagaaatac ggtggttcag gcgaaggtga ttctttactt ggttggttag 10380ctttgctgcc tgaaacctta gggcaagtga agacgatcga agttgaaagc attcgctacg 10440atggcaaccg ttctgaggtt cgactgcagg ctaaaagttc tgacttccaa cactttgaga 10500ccgcaagggt gaagctcgaa gagaagtttg tcgttgagca agggccattg aaccgtaatg 10560gcgatgccgt atttggcagt tttactctta aaccccatca ataacctgcg taaggagatc 10620agtgatgaga aatatgattg aaccactcca agcgtggtgg gcttcaataa gtcagcggga 10680acaacgatta gtcattggtt gttctatttt attgatactg ggcgttgtct attggggatt 10740aatacaacca cttagccaac gagccgagct tgcacaaagc cgcattcaaa gtgagaagca 10800acttctggct tgggtaacgg acaaagcgaa tcaagtggtt gaactacgag gcagtggtgg 10860catcagtgcc agtcagcctt tgaaccaatc tgtgcctgct tctatgcgcc gttttaacat 10920cgagctgata cgcgtgcaac cacgcggtga gatgctgcaa gtttggatta agcctgtgcc 10980atttaataag ttcgttgact ggctgacata cctgaaagaa aagcagggtg ttgaggttga 11040gtttatggat attgatcgct ctgatagccc tggggttatt gagatcaacc gactacagtt 11100taaacgaggt taatgtgaaa cgcggtttat ctttcaaata cggcctgtta ttcagcgtca 11160tttttatcgt ttttttctcg gtaagcttgt tgctgcattt gcctgccgct tttgctctca 11220agcatgcacc cgtcgtgcgt ggtttaagca ttgaaggcgt tgagggcacc gtttggcaag 11280gtcgcgctaa caatatcgcg tggcagcgtg tcaattacgg ctcagtgcag tgggacttcc 11340agttctctaa actattccaa gccaaagcag aacttgcggt tcgctttggc cgcaacagcg 11400acatgaactt atcaggtaaa ggacgtgtcg gatatagcat gagtggtgct tacgcggaaa 11460acttagtggc atcaatgcca gccagcaacg tgatgaaata tgcgccagct atcccagtgc 11520ctgtgtctat tgcagggcaa gttgaactga cgatcaaaca tgcggttcat gctcaacctt 11580ggtgtcaatc aggtgaaggt acgcttgctt ggtctggtgc agcagtcgac tcgccagtgg 11640gttcgttaga ccttggccct gtgattgcgg acataacgtg tgaagacagc acaattgcag 11700ccaaaggcac tcagaagagc gatcaggtag acagcgagtt ctcagcgagc gtaacaccta 11760accaacgcta cacctcggca gcatggttta agccaggcgc tgaattcccg ccagcaatgc 11820agagtcagct taagtggttg ggcaatcctg atagccaagg taaataccaa tttacttatc 11880aaggccgctt ttagcccggt atttacttca gagctagtat ctgaagtaaa tttggcgatc 
11940aaatcgcgaa ctataaaaaa cgggcacctc actgaggtgc ccgttttgtt tgttctgaga 12000atctagagga tatctgacgg ttaaagagag caaactcacc cagctttctt gtacaaagtg 12060gtcccc 12066254080DNAVibrio splendidus 2gtgctttgtg acaacggggg atgtatggat attgaagttt cgcgccaggt tgcggtagtt 60gaagctacga gtggagatgt cgtcgtagtt aagccagacg gcagcgcaag aaaagtttca 120gttggcgata ccatccgtga aaatgagatc gtgattacgg ccaacaagtc agagcttgta 180ttaggcgttc agaatgattc gattccggtt gcagagaatt gcgtcggttg tgttgatgaa 240aacgctgcat gggtagatgc cccaatagct ggtgaggtta attttgactt acagcaagca 300gacgcagaaa ccttcactga agacgacctt gctgcaattc aagaagccat tttaggtggt 360gccgatccga ctcaaatctt agaagcaacg gctgctggtg gcggactagg ttctgcaaat 420gctggctttg tgacgattga ctataactac actgaaactc atccatcgac tttctttgag 480accgctggtc tagcagaaca aactgttgat gaagacagag aagaattcag atctatcact 540cgttcatcag gtggccaatc aatcagtgaa acactgactg aaggctccat atctggcaat 600acctatcccc aatctgtaac aacgacagaa acgattattg ctggtagttt agctctcgcc 660cctaactctt tcattccaga aactttatcc ctcgcttcac tacttagtga attaaacagc 720gacattactt caagtggtca gtccgttatc ttcacctatg acgcgacgac taattctatc 780gttggtgttc aagataccga cgaagtatta cgtatcgaca ttgatgccgt cagtgttggc 840aataacattg agctttctct aaccacaacg atttcccagc cgattgatca tgtaccgtcg 900gttggcggtg gtcaggtttc ttacactggc gatcaaatag atattgcctt tgatattcaa 960ggtgaagaca ccgctgggaa cccgctagca acacccgtta acgcacaagt ttcagtgttt 1020gacgggatag atccgtctgt tgaaagtgtc aatatcacta acgttgaaac tagcagcgcg 1080gcaatcgaag ggacgttctc aaatattggt agtgataacc ttcaatcagc cgtatttgat 1140gcaagtgcac tggaccagtt tgatgggttg ctcagtgata atcaaaacac gcttgcgaga 1200ctttctgatg atggaacaac gattactctg tccatccaag gtcgaggtga ggttgttctc 1260actatctctc tagataccga tggcacctat aaattcgagc agtctaatcc gatagaacaa 1320gtgggtaccg attcactgac gttcgctttg ccaatcacga ttaccgattt tgaccaagat 1380gttgtaacca atacgatcaa cattgccatt actgatggcg atagccctgt tattactaat 1440gttgacagta ttgatgttga tgaagcgggc attgttggcg gctcacaaga gggcacggcg 1500ccagtgtctg gcactggcgg tatcaccgcg gacatttttg aaagtgacat cattgaccat 
1560tatgagctag aacccactga atttaatact aatggcacct tggtttcaaa tggcgaggct 1620gtgctacttg agttgattga tgaaaccaac ggtgtaagaa cttacgaagg ttatgttgag 1680gtcaatggtt cgagaattac ggtctttgac gttaaaattg atagcccttc attgggcaac 1740tatgagttta atctttatga agaactttct catcaaggcg ctgaagatgc gctgttaact 1800tttgcattgc caatttatgc tgttgatgca gatggcgacc gttctgcact gtctggaggt 1860tcgaacacac cagaagctgc tgagatcctc gttaatgtta aagacgatgt cgttgaatta 1920gttgataagg ttgaatcagt caccgagccg accttagcgg gcgatactat tgtttcgtat 1980aacctgttca attttgaagg cgcagatggt tctacaattc aatcgtttaa ctacgacggt 2040gttgattact cactcgatca aagcctgctc cccgatgcta cccagatttt cagttttact 2100gaaggtgtcg tcactatctc attaaacggt gacttcagtt ttgaagtcgc tcgtgatatc 2160gaccactcaa gcagtgaaac tatcgtcaaa cagttctcat ttttagccga agatggtgat 2220ggggatactg atagttcgac gcttgagtta agtattaccg atggccaaga tccgatcatt 2280gatttgatcc cgcctgtgac tctctctgaa accaacctta atgacggctc tgctcccagc 2340ggaagtacag ttagcgcaac cgagacgatt acctttaccg caggcagcga cgatgtagca 2400agtttccgta ttgaaccaac agagtttaat gtgggcggtg cacttaaatc gaatggattt 2460tcggttgaga taaaagaaga ttcggctaat ccgggtactt acattggctt tattaccaac 2520ggttcgggcg ctgaaatccc agtgtttacg attgctttct ctacgagcac attgggtgaa 2580tacaccttta ctctgcttga agcgttagac catgtagatg gtttagataa gaacgatctg 2640agctttgatc tgcctattta tgcggttgat acggacggcg acgattcatt ggtgtctcag 2700cttaatgtga ctatcggtga tgatgttcaa atcatgcaag acggtacgtt agatatcacc 2760gagccaaatc ttgctgacgg tacaatcaca accaacacca ttgatgtaat gccaaatcaa 2820agtgctgatg gcgcgacgat cactcggttc acttatgacg gtgtcgtaaa cacactggat 2880caaagtattt

caggagaaca gcagttcagc ttcacagaag gcgaactgtt tatcaccctt 2940gaaggtgaag tgcgctttga gcctaatcgc gatctagacc actcagtgag tgaagatatc 3000gtgaagtcga ttgtggtgac ttcaagcgac ttcgataacg atccggtgac ttcaaccatt 3060acgctgacga tcactgatgg tgataacccg acgattgatg ttattccaag tgttacgctt 3120tctgaaatta acctgagcga tggctctgct ccaagtggca gcgcggtaag ctcgactcaa 3180actattactt ttaccaatca aagtgatgat gtggttcgtt tccgtattga gtcaacggag 3240ttcaatacta acgatgatct taaatcgaac ggtttagctg ttgagttacg tgaagacccg 3300gcagggtcgg gtgactacat tggttttacg accagtgcga cgaacgtaga aactccagta 3360ttcacattaa gctttaattc tggatcatta ggtgaataca cgttcacact catcgaagcg 3420ttggaccacc aagatgcccg tggcaacaac gacctcagtt ttgatttacc tgtttacgcg 3480gtagatagtg atggcgatga ttcattggtg tctccgttaa acgtcactat cggtgatgat 3540gttcaaatca tgcaagatag tacgttagat atcgtcgagc caaccgtcgc agatttggcc 3600gctggcacag tgacaactaa caccattgat gtgatgccaa atcaaagtgc cgatggcgca 3660acggtgacgc aattcactta tgatggccag cttcgaacac ttgaccaaaa tgacaatggt 3720gagcagcaat ttagcttcac agaaggtgaa ctgttcatca cgcttcaagg tgatgtgcgc 3780tttgagccta atcgtaatct agaccacaca ctcagcgaag acatcgtgaa atcaatcgtg 3840gtgacatcta gcgattccga taacgatgtg ttgacctcaa ccgtcactct gaccattacc 3900gatggtgata tcccaaccat tgataatgtt ccaactgtga acttgtctga aactaatctg 3960agtgatggct ctgcacctag cggaagcgcg gtgagttcaa ctcaaactat tacttacacc 4020actcaaagtg atgatgtgac aagcttccgt attgaaccga ctgaatttaa tgttggtggc 4080gctctcacat caaacggatt ggcagtcgag ttaaaagctg atccaaccac accgggtggc 4140tacatcggtt ttgtgactga tggttcgaac gttgaaacta acgtgttcac gattagcttc 4200tcagatacca atttaggcca gtacaccttc accttacttg aagcgttaga ccatgtggat 4260ggtttagcga acaatgatct gacctttgat ctgcctgttt atgcagttga tagcgatggc 4320gacgattcac tggtgtctca gttaaatgta accatcggtg atgatgttca aatcatgcaa 4380ggtggtacgt tagatatcac tgagccaaat cttgcagacg gcacaattac aaccaatacc 4440atcgatgtga tgccagagca aagcgccgat ggtgcgacga tcactcagtt cacttatgac 4500ggtcaagttc gaacactgga tcaaacggac aatggtgagc agcaatttag cttcactgaa 4560ggcgagttgt tcatcactct tcaaggtgac gtgcgtttcg 
aacccaatcg caacctagat 4620cacacagcta gcgaagatat cgtgaagtcg atagtggtga cttcaagcga tttagataac 4680gatgtggtga cgtcaacggt cactctgacg attactgatg gtgatatccc aaccattgat 4740gcagtgccaa gcgttactct gtctgaaatc aatcttagtg acggctctgc gccaagtggc 4800actgcagtta gtcaaactga gacgattacc ttcaccaatc aaagtgatga tgtgaccagt 4860ttccgtattg agccaataga gttcaatgtg ggcggtgcac tgaaatcgaa tggatttgcg 4920gttgagataa aagaagattc ggctaatccg ggtacttaca ttggctttat taccaacggt 4980tcgggcgctg aaatcccagt gtttacgatt gctttctcta cgagctcatt gggtgaatac 5040acctttactc tgcttgaagc gttagaccat gtagatggtt tagataagaa cgatctgagc 5100ttcgatctgc ctgtttatgc ggtcgatacg gacggcgatg attcattggt gtctcagcta 5160aacgtgacca tcggtgatga tgtccaaatc atgcaagacg gtacgttaga tatcatcgag 5220ccaaatctgg ctgatggaac aatcacaacc agcactattg atgtgatgcc aaaccaaagt 5280gctgatggtg cgacgatcac tcagtttact tatgacggtc agctaagaac gcttgatcaa 5340aatgacactg gcgaacagca gttcagcttc acagaaggcg agttgtttat cacccttgaa 5400ggtgaagtgc gctttgagcc aaaccgagac ctagaccaca ccgcgagtga agatattgtt 5460aagtcgattg tggtcacttc aagtgatttc gataacgact ctctgacttc taccgtaacg 5520ctgaccatta ctgatggtga taaccctacg atcgacgtca ttccaagcgt taccctttct 5580gaaactaatc tgagtgatgg ctctgctcca agtggcagcg cggtaagctc gactcaaact 5640attactttta ccaatcaaag tgatgatgtg gttcgtttcc gtattgagcc aacggagttc 5700aatactaacg atgatcttaa atcgaacggt ttagccgttg agttacgtga agacccggct 5760gggtcgggtg actacattgg ttttactact agtgcgacga atgtcgaaac cacggtattt 5820acgctgagtt tttctagcac cacattaggt gaatatacct tcactttgct tgaagcgttg 5880gaccaccaag atgcccgtgg caacaacgac ctcagttttg aactgcctgt ttatgcggta 5940gacagtgatg gcgatgattc actgatgtct ccgttaaacg tcaccatcgg cgatgatgtt 6000caaatcatgc aagacggtac gttagatatc gtcgagccaa ccgtcgcaga tttggccgct 6060ggcattgtga caactaacac cattgatgtg atgccaaatc aaagtgccga tggcgcgacg 6120atcactcaat tcacttatga tggccaactt cgaacacttg accaaaatga caatggcgaa 6180caacagttta gcttcacgga aggtgaacta ttcatcactc ttgaaggtga agtgcgcttt 6240gagcctaatc gtaatctaga ccacacgctg aacgaagaca tcgtgaaatc gatcgtggtg 6300acgtctagtg 
actccgataa cgatgtgttg acctcaaccg tcactctgac cattaccgat 6360ggtgatatcc caaccattga taatgtgcca acagtgagct tgtcagaaac aagtctgagt 6420gacggctctt caccaagtgg cagcgcagtt agctcaactc aaaccatcac ttacaccact 6480caaagtgatg atgtaaccag cttccgtatt gaaccgactg agttcaatgt tggcggtgct 6540ctcaaatcaa atggattggc ggttgagctg aaggccgatc caaccactcc gggcggctac 6600atcggctttg tgactgatgg ttcgaacgtt gaaactaacg tgttcacgat tagcttctcg 6660gataccaatt taggtcaata caccttcacc ttgcttgaag cgttggatca tgcggatagc 6720cttgcaaata acgatctgag ctttgatctg ccagtctacg ccgtcgatag tgatggcgat 6780gattcactgg tgtctcaact caatgtaacc atcggtgatg atgttcaaat catgcaaggt 6840ggtacgttag atatcactga gccaaacctt gcagacggca caaccacaac taacaccatc 6900gatgtgatgc cagaacaaag tgccgatggt gcgacgatca ctcagtttac gtatgacggg 6960caagttcgca ctctggatca aactgacaat ggtgagcagc aatttagctt cactgaaggc 7020gagttgttca tcactcttca aggtgacgtg cgtttcgaac ccaatcgcaa cctagatcac 7080acagctagcg aagacatcgt gaagtcgata gtggtgactt caagcgattc agataacgat 7140gtggtgacgt caacggtcac tctgactatt actgatggtg atctcccaac cattgatgca 7200gtgccaagcg ttactctgtc tgaaactaat cttagtgacg gctctgcgcc aagtggcagc 7260gcagtcagtc aaactgagac catcaccttt accaatcaaa gtgatgatgt ggcgagtttc 7320cgtattgagc caaccgagtt taatgtgggc ggtgcactga aatcgaatgg gtttgcggtt 7380gagataaaag aagactctgc taatccgggt acttacattg gctttattgc caatggttcg 7440agcgctgaaa tcccagtgtt cacgattgct ttctctacga gtacgttggg tgaatacacc 7500tttactctgc ttgaagcgtt agaccatgcg gatggtttag ataagaacga tctgagcttt 7560gagcttccgg tttacgcggt tgatacagac ggtgatgatt cattggtatc tcagcttaat 7620gtgaccattg gtgatgatgt tcaaatcatg caagatggta cgttagacgt tatcgagcca 7680aatcttgcag acggcacaat cacaaccaac accattgatg tgatgcccga gcaaagtgct 7740gatggtgcga cgatcactca gtttacttat gacggtcagc taagaacgct tgatcaaaat 7800gacactggtg aacagcagtt cagcttcaca gaaggcgagt tgtttatcac ccttgaaggt 7860gaagtgcgct ttgaacctaa tcgcgatcta gaccattccg ttagcgaaga catcgtgaag 7920tcgatagtag tgacttcaag cgacttcgat aacgatccgg tgacttcagc cattacgctg 7980accattactg atggtgataa tccgactatc gattcggtac 
cgagcgttgt acttgaagaa 8040gctgatttaa ctgatggctc atcgccaagt ggcagcgcgg ttagtcaaac ggaaaccatc 8100actttcacta atcaaagtga cgatgttgag aaattccgtt tagaaccaag tgaatttaat 8160actaacaacg cgctcaagtc cgatggcttg atcattgaga ttcgagagga accaacagga 8220tccggcaatt atattggttt cacgaccgat atttcgaatg tcgaaaccac tgtgtttaca 8280ctcgatttca gcagtaccac tttgggtgag tacaccttca cgcttctgga agcgattgac 8340cacacgcctg ttcaaggcaa taacgatcta acattcaact tgccagtcta cgcggttgat 8400agcgacggtg atgattcgct aatgtcatca ctatcggtga cgattactga tgatgttcaa 8460gtgatggtga gtggttcgct tagtatcgaa gagcctactg ttgccgactt ggctgcaggc 8520acgccaacaa catcagtatt tgatgtatta acatccgcga gtgctgatgg ggcgaccatt 8580actcagttca cttatgatgg tggggcggta ttaacgcttg atcaaaacga tacaggtgag 8640cagaagttcg tggttgctga tggggcatta tatatcactc tgcaaggcga tattcgtttc 8700gaaccaagtc gtaaccttga ccatactggt ggcgatatcg tcaagtcgat agtcgtaact 8760tcaagtgatt ccgatagcga tcttgtgtct tcaacggtaa cgctaaccat tactgatggc 8820gatatcccaa cgattgacac ggtgccaagc gttactctgt cagaaacgaa tctgagcgac 8880ggatctgctc cgaatgcaag tgcggtaagt tcaactcaaa ccattacctt tactaaccaa 8940agtgatgacg tgacgagttt ccgtattgaa ccgactgatt ttaatgttgg tggtgctctg 9000aaatcgaacg gattggcggt cgaactgaaa gcggacccaa ctacaccggg tggctacatc 9060ggttttgtga ctgatggttc gaacgttgaa actaacgtgt ttacgattag cttctcggat 9120accaatttag gtcaatacac cttcaccctg cttgaagcgt tggatcatgt agatggctta 9180gtgaagaatg atctgacttt tgatcttcct gtttatgcgg ttgatagcga tggtgatgat 9240tcactggtgt ctcaactgaa tgtgaccatt ggtgatgatg tacaggtcat gcaaaaccaa 9300gcgcttaata ttattgagcc aacggttgct gatttggctg caggtactcc gacgacagcc 9360actgttgatg tgatgcctag ccaaagtgcc gatggcgcga caatcactca gtttacttac 9420gatggcgggg cggcaataac actcgaccaa aacgacaccg gtgaacagaa gtttgtattt 9480actgaaggtt cactgtttat caccttgcaa ggtgaagtgc gtttcgagcc aaatcgcaat 9540ctaaaccaca cagcgagcga agacatcgtg aagtcgattg tggtgacttc aagcgattta 9600gataacgatg tactgacgtc aacggtcact ctgactatta ctgatggtga tatcccaacc 9660attgatgcag tgccaagcgt tactctgtct gaaactaatc ttagtgacgg ctcagcgcca 9720agcagcagtg 
ctgtaagtca aacagagacg attaccttca tcaatcaaag tgatgatgtg 9780gcgagtttcc gtattgagcc aacagagttc aatgtgggcg gtgcactgaa atcgaatgga 9840tttgcggttg agataaaaga agattcggct aatccgggta cttatatcgg ttttattacc 9900gatggttcga atactgaagt tcctgtattc acgattgctt tctctacaag tacgttgggc 9960gaatacacct tcaccttact tgaagcgcta gaccatgcaa atggcctaga taagaacgat 10020ctgagttttg atcttcctgt ttatgcggta gacagtgatg gcgatgattc actggtgtct 10080caactgaatg tgaccattgg tgatgatgtc caaataatgc aagacggtac gttagatatc 10140actgagccaa atcttgcaga cggaacaatc acaaccaaca ccattgatgt gatgccaaat 10200cagagtgccg atggtgcgac gatcactgaa ttctcatttg gcggtattgt caaaacactc 10260gatcaaagca tcgtaggtga gcagcagttt agtttcaccg aaggtgagct attcatcact 10320cttcaaggtc aagtgcgctt tgaaccaaat cgtgaccttg accactctgc cagcgaagac 10380atcgtgaagt cgatagtggt tacttcaagt gattttgata acgatcctgt gacttcaacc 10440gttacgctga ccattaccga tggtgatatt ccaactatcg atgcggtacc aagtgttacg 10500ctttcagaaa caaacctagc tgatggttct gcgccaagtg gtagtgcggt tagtcaaacg 10560gagacgatta cttttaccaa tcaaagtgat gatgtggttc gcttccgtct ggaaccaacc 10620gagttcaata ctaacgatgc acttaaatcg aatggcttag cggtcgaact gcgcgaagaa 10680cctcaaggct ctggtcagta cattggcttt accaccagtt cgtctaatgt tgagacaaca 10740gtatttacgt tggactttaa ctccggaacc ttaggtgaat acacatttac tttaatcgaa 10800gctctggatc atcaagatgc gcgtggcaac aacgatttaa gctttaatct acctgtgtat 10860gcggtggata gtgatggcga tgactcgtta gtctctcagc ttggcgtgac cattggcgac 10920gatgtgcagt tgatgcaaga cggcacaatc accagtcgtg agcctgcagc aagtgttgaa 10980acatcaaata cctttgatgt gatgccaaac caaagtgctg atggagccaa agtcacttca 11040tttgttttcg atggtaagac tgcagaaagt cttgatttga atgtgaatgg tgaacaagag 11100ttcgtcttca cggaaggttc ggtatttatt acgacggaag gtgagatacg attcgagccg 11160gtacgtaatc aaaatcatgc tggtggtgat attaccaagt cgattgaggt gacgtctgtt 11220gacctcgatg gcgatattgt cacatcgaca gtgacactga agattgttga tggtgacctt 11280cctactatcg accttgttcc cggaattacg ttatctgaag tggatctggc cgatggctct 11340gtgccaaccg gtaatccagt gacaatgaca caaaccatta cctacacagc gggtagtgac 11400gacgtaagcc atttcagaat 
tgaccctacg cagttcaata cttcaggggt tttgaaatcg 11460aacggcctag atgtcgaaat aaaagagcag ccagctaatt ctggtaatta cattggcttc 11520gtcaaagacg gttctaacgt agaaaccaac gtcttcacga tcagcttctc gacgagcaat 11580ttagggcaat acacgttcac actacttgaa gcgttagatc atgtagatgg attgcaaaac 11640aatatactaa gcttcgatgt ccctgtttta gcggttgatg cggatggtga tgattctgca 11700atgtcgccta tgacggttgc gatcaccgat gacgtacaag gtgttcaaga tggcaccttg 11760agtatcactg agccttcatt agctgatttg gcatcgggta cgccaccaac gacggcaatc 11820attgatgtta tgccaacgca gagtgctgat ggcgcgaaag taacacagtt tacttacgat 11880ggtggcacag ctgtaacgtt agacccaagc atcgccacag aacaagtctt taccgtaacc 11940gatggcttac tgtacatcac cattgaaggg gaggttcgtt ttgagccgag ccgagatcta 12000gaccattcat ctggcgatat cgtaagaacg attgtcgtca ccaccagtga ttttgataac 12060gatacagata ccgcggatgt cactttgacg atcaaagacg gtatcaatcc cgttatcaat 12120gtggttccag atgttaactt atcggaagtt aatctagcgg atggctcgac gccaagtggt 12180tctgcagtca gttcgactca cacaatcact tacaccgaag gaagtgatga ttttagtcac 12240tttagaattg cgaccaacga attcaatcct ggcgatctgt tgaaatcaag tggtcttgtt 12300gttcaactaa aagaagatcc tgcttctgct ggtgattaca ttggttatac cgatgatggt 12360atgggtaacg ttaccgatgt atttaccatt agctttgata gtgcaaacaa agctcagttt 12420acatttacct tgattgaggc gcttgatcac cttgatggtg tgctttacaa cgatcttacg 12480ttccgtttgc ctatctatgc tgttgataca gatgattctg aatcaacaaa gcgcgatgtg 12540gtggttacga tagaagatga catccagcaa atgcaagatg gcttcttaac cattaccgag 12600ccaaattctg gtactccaac aacaactacc gttgatgtga tgccaatacc aagtgcagac 12660ggtgcgacta ttacgcagtt cacgtatgac ggtggttctc caattactct gaatcaaagc 12720atcagcggcg aacaagagtt tgttttcact gaaggttcac tgtttgtgac actagatggt 12780gatgtaaggt ttgagccaaa tagaaacctt gatcactctg cgggcgacat tgttaaatcg 12840attgtgttca cgtcttcaga ctttgataac gacatcttct catcaaaagt cactctcacc 12900attgttgatg gtgatgggcc aacaatcgac gttgtgccgg gtgtggcatt gtcagaaagc 12960ttacttgcgg atggttcgac gcctagcgta aatcccgtga gtatgactca aaccattact 13020tcacttgcaa gtagtgatga tattgctgaa atagtggtgg aagtcgggtt gttcaatacc 13080aacggcgcgt tgaagtcgga tggtttgtca 
ctgagtttac gtgaagaccc tgtaaattca 13140ggcgactaca ttgcatttac tactaatggt tcgggtgttg agaaagttat cttcactctg 13200gattttgatg atacgaatcc gagtcaatat acgtttactc tgcttgaacg tttagaccat 13260gttgatggct taggaaataa cgatctgagt tttgatcttt ctgtttatgc agaagatacc 13320gatggtgata tttcagcgtc taaaccgctt acagtcacca tcaccgatga tgttcagctc 13380atgcaatccg gtgcgctcaa cattactgag ccaaccacag gaacaccgac tacagcagtc 13440tttgatgtga tgcctgcgca aagtgcagat ggcgcgacaa tcactaagtt tacctatggc 13500agccaacctg aagagtctct ggtacaaacc gtcacgggtg agcaagaatt tgtgttcact 13560gaaggttctc tgtttatcaa tcttgaaggt gatgtacgtt tcgaacctaa ccgtaatctc 13620gatcattcgg gtggtaacat cgttaagacc attacggtga catcggaaga taaagatggc 13680gatattgtca cttcaacagt gacgctgact attgtagatg gcgcgccacc agtaatagac 13740acagtaccaa cggttgcatt ggaagaagcg aatctggtcg acggatcttc accgggttta 13800cctgttagcc aaactgaaat cattactttc acagcaggaa gtgatgatgt gagccacttc 13860cgtattgatc cggctcaatt caacacatca ggcgatctga aagcggatgg tttggtggtt 13920cagttaaaag aagatcctct aaacagcgat aattatattg gttacgttga aagcggcggt 13980gtccaaacgg atatcttcac catcaccttt agcagcgtgg ttctaggaga gtacacattc 14040accttgttgg aagagttaga tcacctgcct gtacaaggta acaatgatca aatcttcacc 14100ttgccagtga tcgcagtcga caaagacaac actgactcag cggtgaaacc tcttacggtg 14160accattaccg atgatgttcc aaccattact gacaccaccg gcgccagtac gtttgtggtt 14220gatgaagatg atttgggcac tctggcacaa gcgacgggtt cgtttgtaac cacagaaggt 14280gcagatcaag tcgaggttta cgaactacgt aatatatcaa cgttggaagc aacgctatcg 14340tcgggcagtg aaggtattaa gatcactgag atcacaggtg ctgctaacac gaccacctac 14400caaggggcga ccgacccaag tggaacgcca attttcacat tagtgctgac tgatgatggt 14460gcctacacct ttaccttgct tggccctctc aatcacgcta cgacaccgag taacctcgat 14520acattaacaa taccatttga tgttgttgcc gttgacggtg atggcgatga ttctaaccaa 14580tatgtattgc caatcgaggt gctagatgat gtgcctgtaa tgacggcgcc gacgggtgaa 14640acggttgttg atgaagacga tcttactggc attggttccg atcaatctga agatacaatt 14700atcaatggac tgttcaccgt tgatgaaggt gcggatggcg ttgtgctgta tgagctggtt 14760gatgaagatt tggttctgac gggcttaacc tctgatggag 
aaagcttaga gtggctagct 14820gtttcacaaa acggcacaac atttacttac gttgctcaaa ctgcaacgag taatgaagcg 14880gtgttcgaga ttattttcga cacctcggat aacagctacc aatttgaatt atttaagcca 14940ctgaagcacc ctgacggtgc aaacgagaac gcgatagatc ttgatttctc aatcgttgct 15000gaagattttg atcaagacca atcggatgcg atcggtctaa aaattacggt aaccgatgat 15060gttccgttag tgacaactca atcgattact cgtcttgaag gtcaggggta tggcaactct 15120aaagtcgaca tgtttgccaa tgcaacagat gtgggggctg atggcgcggt actgagtcga 15180attgagggta tctcaaataa tggtgcagat attgttttcc gtagcgggaa caatgggcca 15240tatagtagcg gcttcgattt aaacagcggt agccaacaag ttcgagtcta cgagcaaaca 15300aatggcggtg ctgatactcg tgaacttggc cgtctacgca tcaactcaaa tggtgaggtt 15360gaattcagag ctaacggcta tctcgatcat gacggtgatg acaccatcga cttctcgatt 15420aacgtgattg ccacagatgg agatttagac acctctgaaa caccgttaga tattacgatt 15480actgataggg attctacaag aattgcgctg aaagtgacga ccttcgagga tgcgggtaga 15540gactcaacca taccttacgc aacaggtgat gagccgactc ttgagaatgt tcaagataac 15600caaaatggtt tgccgaatgc gccagcgcaa gttgcgctgc aagttagtct gtatgaccaa 15660gataacgctg aatctattgg gcagttgacg attaaaagcc cgaacggagg tgatagtcat 15720caaggtactt tttattactt tgatggtgct gactacatag aattagtgcc tgagtcaaat 15780gggagcatta tatttggctc tcctgaactc gaacaaagct tcgctccaaa cccgagtgaa 15840ccaagacaaa ctatcgcgac gatagacaac ctgttctttg ttccagacca acacgctagt 15900tcggatgaaa ctggtgggcg agttcgttat gagcttgaaa ttgagaaaaa tggcagtacg 15960gatcacaccg ttaattcaaa cttcagaatt gagattgaag ctgtagctga tattgcgact 16020tgggatgatt ccaacagcac gtatcagtat caagtcaacg aagatgaaga caatgtcacg 16080ttgcagctga acgcagagtc tcaagataac agtaatactg agacgattac ctatgaactt 16140gaagccgttc aaggcgacgg gaagtttgag ttacttgatc aaaatggcaa tgtgttaacg 16200cccgttaatg gtgtttatat catcgcatct gctgatatca atagcaccgt agttaaccct 16260attgataact tctcagggca gattgagttc aaagcgacgg caattacgga agagacgctt 16320aacccatacg atgattcaga caacggtgga gcaaacgata agacgacggc tcgttctgtg 16380gaacaaagta ttgttattga tgtgaccgca gatgcggacc ctggcacatt cagtgttagt 16440cgaattcaga tcaacgaaga caatatcgat gatccagatt acgtcgggcc 
tttggacaat 16500aaagacgcgt tcacgttaga cgaagtcatc accatgacag ggtcggtcga ttctgacagt 16560tctgaagaac tgtttgtgcg catcagtaat gttacggaag gagctgtgct ttacttctta 16620ggcaccacga cagtcgttcc gaccatcacg atcaatggtg tggattatca agaaatcgcg 16680tattccgatt tggctaacgt tgaggttgtt ccaaccaaac acagtaatgt cgatttcacc 16740ttcgatgtta cgggagtggt caaagatacg gcaaatctat ccacgggcgc ccaaatcgat 16800gaggagatac taggaactaa aaccgtcaac gttgaagtca aaggcgttgc cgatactcct 16860tatggtggaa cgaatggcac ggcttggagt gcaattacag atggcactac atctggtgtt 16920caaaccacga ttcaagagag ccaaaatggt gatacctttg ctgagcttga tttcaccgtg 16980ttgtcgggag agagaagacc agatactggc actacaccat tagctgacga tgggtcagaa 17040tcaataaccg ttattctatc gggtataccc gatggggttg ttctagaaga cggtgacggt 17100acagtgattg accttaactt tgtcggttat gaaaccggac cgggcggtag tcctgactta 17160tccaaaccta tctacgaagc gaacattact gaggcgggta aaacttcagg cattcgcatc 17220agacctgtcg actcttcaac cgagaatatt cacattcaag gtaaagtgat tgtgactgag 17280aacgatggtc acacgcttac gtttgatcaa gaaattcgag tgcttgttat acctcgaatc 17340gacacatcag caacttatgt caatacgact aacggtgatg aagatacggc tatcaatatt 17400gattggcacc ctgaaggcac ggattacatt gatgacgatg agcatttcac taagataact 17460attaatggaa taccactggg tgttactgca gtagtcaacg gtgatgtgac cgttgatgac 17520tcaaccccag gaacattgat tataacgcct aaagatgctt cccaaactcc tgaacaattt 17580actcaaattg cattagctaa taacttcatt caaatgacgc ctccggctga ttctagtgca 17640gattttacgt tgaccaccga acttaaaatg gaagagcgag atcatgagta tacgtctagc 17700ggcctagagg atgaagatgg tggttatgtc gaagccgatc cagatataac cggaatcatt 17760aacgttcaag tacgacctgt ggttgaacct ggagatgccg acaacaagat tgtcgtttca 17820aacgaagatg gctctggaga tctcactacg attacggctg atgctaatgg tgtcattaaa 17880tttacaacta acagtgataa ccaaacgact gatactaacg gagacgaaat ctgggacggt 17940gaatacgtcg
tccgatacca agaaacggat ttaagcacag tagaagagca agtcgacgaa 18000gtgattgttc agctgactaa caccgatgga agcgcgttat ctgatgatat tttagggcaa 18060cttttagtaa ctggtgcctc ttacgaaggc ggtggccgat gggttgtgac caatgaagat 18120gcctttagcg tcagtgcgcc caatggatta gatttcaccc ctgccaatga tgcggatgat 18180gtagctactg atttcaatga tatcaagatg acaattttca ctttggtctc agatcctggt 18240gatgctaaca atgaaacgtc cgcccaagtg caacgcaccg gagaagtaac gctttcttat 18300cctgaagtgc tgacggcacc tgacaaagtt gccgcagata ttgcgattgt gccagacagt 18360gttatcgacg ctgttgagga tactcagctt gatctcggcg cggcactcaa cggcattttg 18420agcttgacgg gtcgcgatga ttctactgac caagtgacgg tgatcatcga tggcactctg 18480gtcattgatg ctacaacatc attcccaatt agcctgtcgg gaacaagtga tgttgacttt 18540gtgaatggga aatatgttta cgagacgact gttgagcagg gcgtagccgt cgattcatcg 18600ggtttgttat tgaatctgcc accaaactac tctggtgact ttaggttgcc aatgaccatc 18660gtgaccaaag atttacaatc tggtgatgag aagaccttag tgactgaagt tatcatcaaa 18720gtcgcaccag atgctgagac ggatccaacg attgaggtga atgtcgtggg ttcgcttgat 18780gatgccttta atcctgttga taccgacggt caagctgggc aagatccggt gggttacgaa 18840gacacctata ttcaactcga cttcaattcg accatttcgg atcaggtttc cggcgtcgaa 18900ggcggccaag aagcgtttac gtccattact ttaacgttgg acgacccttc tataggtgca 18960ttctatgaca acacgggtac ttcattaggt acatctgtta cgtttaatca ggctgaaata 19020gcagcgggtg cactcgataa cgtgctcttt agggcaatcg aaaattaccc aacgggtaat 19080gatattaacc aagtgcaggt taatgtcagc ggtacagtca cagataccgc aacctataat 19140gatcctgctt ctcctgcggg tacggcaaca gactcagata ctttctctac gagtgtcagc 19200tttgaagtcg ttcctgtggt cgatgacgtg tctgtcactg gaccgggtag cgatcctgat 19260gttatcgaga ttactggcaa cgaagaccag ctcatttctt tgtcggggac agggcctgta 19320tcgattgcac tgactgacct tgatggttca gaacagtttg tatcgattaa gttcacagat 19380gtccctgatg gcttccaaat gcgtgcagat gctggctcga catataccgt gaaaaataat 19440ggtaatggag agtggagtgt tcaactgcct caagcttcgg ggttgtcatt cgatttaagt 19500gagatttcga tcttgccgcc taaaaacttc agtggtaccg ctgagtttgg tgtggaagtc 19560ttcactcaag aatcgttgct gggtgtgcct actgcggcgg caaacttgcc aagcttcaaa 19620ctgcatgtgg tacctgttgg 
tgacgatgtt gataccaatc cgactgattc tgtaacaggc 19680aacgaaggcc aaaacattga tatcgaaatc aatgcgacta ttttggataa agaattgtct 19740gcaacaggaa gcgggacgta taccgagaat gcgcccgaaa cgcttcgagt tgaagtggcg 19800ggtgttcctc aagatgcttc tattttctat ccagatggca cgacattggc tagctacgat 19860ccggcgacgc agctctggac tctcgatgtt ccagctcagt cgttagataa gatcgtattt 19920aactctggcg aacataatag tgatacaggc aatgtactgg gtatcaatgg tccactgcag 19980attacggtac gttcagtaga tactgatgct gataatacag agtacctagg tacgccaacc 20040agcttcgatg tcgatctggt gattgatcct attaacgatc aaccgatctt tgtgaacgta 20100acgaatattg aaacatcgga agacatcagt gttgccatcg acaactttag tatctacgac 20160gtcgacgcaa actttgataa tccagatgct ccgtatgaac tgacgcttaa agtcgaccaa 20220acactgccgg gagcgcaagg tgtgtttgag tttaccagct ctcctgacgt gacgtttgta 20280ttgcaacctg acggctcatt ggtgattacc ggtaaagaag ccgacattaa taccgcattg 20340actaatggag ctgtgacttt caaacccgac ccagaccaga actacctcaa ccagactggt 20400ttagtcacaa tcaatgcaac gctcgatgat ggtggtaata acggtttgat tgacgcggtt 20460gatccgaata ccgctcaaac caatcaaact accttcacca ttaaggtgac ggaagtgaat 20520gacgctcctg tggcgactaa cgttgattta ggctcgattg cggaagacgc tcaaatcgtg 20580attgttgaga gtgacttgat tgcagccagt tctgatctag aaaaccataa tctcacagta 20640accggtgtga ctcttactca agggcaaggt cagcttacac gctatgaaaa tgctggtggt 20700gctgatgacg cagcgattac ggggccattc tggatattca ttgcagataa tgatttcaac 20760ggcgacgtta aattcaatta ctccattatc gatgatggta ccaccaacgg tgtggatgat 20820tttaaaaccg atagcgctga aatcagcctt gtagttactg aagtcaatga ccagccagtg 20880gcatcgaaca ttgatttggg caccatgctt gaagaaggac agctggtcat taaagaggaa 20940gacctgattt ccgcaaccac tgatccggaa aacgacacga ttactgtgaa cagtttggtg 21000ctcgatcaag gtcagggcca attacaacgc tttgagaacg tgggcggtgc tgatgatgct 21060acgatcactg gcccgtactg ggtatttact gcagccaacg aatacaacgg tgatgttaag 21120ttcacttata ccgttgagga cgatggtaca accaacggcg ctgatgattt cttaacagat 21180accggcgaaa ttagcgttgt ggtaacggaa gtgaatgatc aaccggtggc aacggatatc 21240gacttaggaa acatccttga agaagggcag ttgatcatca aagaggaaga cttaattgct 21300gctacgagcg atccggaaaa cgacacgatt 
accgtgacca atctggtgct cgacgaaggc 21360caaggccagt tacagcgctt tgagaacgtg ggcggtgctg atgacgctat gattactggc 21420ccgtactgga tatttacggc tgctgatgaa tacaacggta acgttaagtt cacctatacc 21480gtcgaggatg atggtacaac caacggcgct aatgatttcc taacggatac tgcagagatc 21540acagcgattg tcgacggagt gaacgatacg cctgttgtta atggtgacag tgtcactacg 21600attgttgacg aggatgctgg tcagctattg agtggtatca atgtcagtga cccagattat 21660gtggatgcat tttctaatga cttgatgaca gtcacgctga cagtggatta cggtacattg 21720aacgtatcac ttccggcagt gacgacagtg atggtcaacg gcaacaacac tggttcggtt 21780atcttagttg gtactttgag tgacctgaat gcgctgattg atacgccaac cagtccaaac 21840ggtgtctacc tcgatgcgag cttgtctcca accaatagca ttggcttaga agtaatcgcc 21900aaagacagcg gtaacccttc tggtatcgcg attgaaactg caccagtggt ttataatatc 21960gcagtgacac cagtcgctaa tgcgccaacc ttgtctattg atccggcatt taactatgtg 22020agaaacatta cgaccagctc atctgtggtc gctaatagtg gagtcgcttt agttggaatt 22080gtcgctgcat tgacggacat tactgaagag ttaacgttga agatcagcga tgttccggat 22140ggtgttgatg taaccagtga tgtgggtacg gtttcgttgg tgggtgatac ttggatagcg 22200accgctgatg cgatcgatag tctcagactc gtagagcagt catcattagg taaaccgttg 22260accccgggta attacacctt gaaagttgag gcgctatctg aagagactga caacaacgat 22320attgcgatat ctcaaaacat cgatctgaat ctcaatattg ttgccaatcc aatagatctc 22380gatctgtctt ctgaaacaga cgatgtgcaa cttttagcga gtaactttga tactaacctc 22440actggcggaa ctggaaatga ccgacttgta ggtggagcgg gtgacgatac gctggttggc 22500ggtgacggta acgacacact cattggtggc ggcggttccg atattctaac cggtggcaat 22560ggtatggatt cgtttgtatg gctcaatatt gaagatggcg ttgaagacac cattaccgat 22620ttcagcctgt ctgaaggaga ccaaatcgac ctacgagaag tattacctga gttgaagaat 22680acatctccag acatgtctgc attgctacaa cagatagacg cgaaagtgga aggggatgat 22740attgagctta cgatcaagtc tgatggttta ggcactacgg aacaggtgat tgtggttgaa 22800gaccttgctc ctcagctaac cttaagtggc accatgcctt cggatatttt ggatgcgtta 22860gtgcaacaaa atgtcatcac tcacggttaa cgcctaattg gaggctagct attagaatct 22920aacgattaaa ctaaaagcgg accatttaac cataacgaaa gaggccagca ttgctggcct 22980cttttttgtc actgtataaa tcgtaaagag ttacttaaga 
gagttgtgga tcaggaactc 23040ttcttcgacg cctttcaatt tcatctcatc cataatgaag ttcactgtgt tcaacaagcg 23100ttgttcacct tttggtatca ggtaaccgaa ttgactgttg gtaaacggtg tttcacagcg 23160tgccgcttca agacgttcgt ccgtcacttg atagaacaga ccttcaggag tttctgtcac 23220cattacatca actttacctt ccgcaacggc ttgcggaacg tctaggttgt tctcgtaacg 23280cgtaaagctc gcgtcttgca agttagcatc cgcaaacatc tcattagtcc caccgatatt 23340gacgccaaca cgcacagaag agaggttcac tttctcaatg ctgttgtatt gttctgcttt 23400gcctttcgca actaagaaac acttgccaaa ggtcatgtaa ccttgagttt gttctgcgtt 23460taactgacgc tgcattttac gcgtgatacc gcccatcgcg atgtcgtatt tatcgctgtc 23520tagatcggtc agtagatctt tccatgtggt acgaacaatc tgtaattcaa cgcccaactg 23580ctctgcaaca tgtttggcta cgtcaatgtc ataaccagag taggttttgc cgtcgaagta 23640agaaaaaggt ttgtagtcgc ctgtggtgcc gacgcgaagt gtgcctgatt tttgaatgtc 23700ttctagctgg tcagcttgta ctacaccaga aagtgccaga gtaatggaag caagtaatag 23760tgatgttttt ttcattgtaa ttatctgttg tgtttgtgtt gttattcaaa gtaacagaaa 23820caatcagaga aagagatcaa accattggaa aggttgtaaa agaagataaa acgagggcag 23880gagataggta acgctattga tttgtgaaca ttgataaaca tgtgtttcat attccatttt 23940gataaaccgt agacaaacaa aaagcccatg ttatcgaata acatgggctt cattttggtt 24000taacttgtta gctgcttatt tagctgctta tttagctgtt tagctgttta gctgtttagc 24060tacttagcaa ctgactcgtt gttcatctta gccggagctt tagatgcgtt aaccagcagg 24120ataccaacgg tgagtaccat cgaaccacat agtaggaaca acaagcgtcc tgttggttcg 24180tttggaatca gagccattgc taggataccg aaacctgctg tgctgataag cttaccaagc 24240attgaacgct gtttagtatc taggttctgc tgctcttcac cttccgctac tagcggcgta 24300ttccagttag tgaatagttg gtcaacttct ttctcacgtt caggcgatag gcctttgtag 24360aagcgagaag ttaggatgaa gtaaccacca gtaaacacta cgtgagcagc taagctaaga 24420ccaactttca agtcgctcca ttcacggcca gtaagcgctg tttccatacc aaataggtgc 24480tcgatgtctt ctgcttgaag cgagataccg aagatgtaag aaacgaagcc accaacgatt 24540aacgtagacc aaccagccca gtcaggcgtc ttacgaatcc acataccaag tagtacaggg 24600ataagcattg ggaagccaat taacgcacct acgttcatta cgatatcgaa caagctcaaa 24660tgacgtagag agttaatgaa caagccaatc gcgatgatga taatacccat 
catgatagtg 24720gttagcttac ttacaataac cagctctttc tgagttgcgt tttgacgtag aatagggctg 24780tagaagttca ttacaaagat gccagcgtta cggttcaaac ctgaatccat agaagacatt 24840gttgcagcga acattgctga cataagaaga ccaaccatac ctgctggcat tacgttctgt 24900acgaatgcta ggtaagcagc atcaccagct ttatcaccca ttgaagcgta ctccaatgcg 24960aaatcaggca tgaatgcact tacgtaccaa ggtggtagga accagattag tgggccaaca 25020accataagga tacatgctag gcctgccgct ttacgtgcgt tttcactgtc tttcgcacat 25080aggtaacggt aagcgttgat gctgttgttc attacaccga actgcttcac gaagatgaat 25140acaacccaaa gaacgaagat gctcatgtag tttaggttat tacctaacat gaagtcgccg 25200tcgaaatttg caacgatgtt agttaggcca ccaccgtgga agtaagctgc aaccgcacaa 25260gtaatcgtaa ccgccatgat aacaagcatt tgcatgaagt cagaagcaac aaccgcccaa 25320gagccgcctg ttactgccat caatactaga accatacccg ttaccacaat ggttgcttcc 25380attgggatgt tgaataccgc tgctacgaag atagctagac catttagcca gatacccgca 25440gagataaggc tgtcaggcat acctgcccat gtgaagaact gttcagacgt tttaccaaag 25500cgctgacgaa tagcttcgat cgccgttacc acacgaagtt ggcggaactt tggagcgaag 25560tacatatagt tcatgaagta gccaaaagca ttggctaaga ataggattac aataacgaaa 25620ccgtcattga acgcgcgtcc tgcggcacct gtaaacgtcc atgctgaaaa ctgtgtcatg 25680aaggcggttg caccaaccat ccaccacaac attttgccgc cccctctgaa gtaatcacta 25740gtcgacgtgg tgaacttacg gaacatccaa ccaatagcga ttaaaaagaa gaagtaggcg 25800agaacaacaa aagtatcgat agtcatcttt tcagcctttt aaatatcata attaactggg 25860cttagattaa cgcgttcaaa ggtttatttg tactacaata tgtctttagt atgatctagg 25920tcgcattgat ttttgggtgc acacgataag ttaatttaac ctactgtttt tattgatttt 25980aattgttttt atgaattgct ctagatccaa gataaattga agttcaaatg tttatatgta 26040ttacaatata agtaatgagg ctttagttta ccttatttat aagattttaa ttataaccgt 26100aacaaatatg ctacaactga gcgtggttgt gcgacgacat tcacgttaat ttggaactct 26160attctggaaa ttcttgtatt aggatttcaa gtgtagctca ttgttttcac ttcgctattt 26220tgtgtttgtc tgcggttctg tcgcctttcc atgctattga ttaatttttt cgtgctagag 26280agacgcgtat ttggaatgtt tgtcactgag tgggcgttaa actggacgac gggacactct 26340ttcggctcac tttgtctatt gtggtcttca gtgcatgcta tgagaaatgt ttgacgacgt 
26400attgaaaagg aatattgtcg gataaaggga tgggtaagga gctggataag cggtagggag 26460ccccagtaac gcttcgctag atgcatactg aggttgcttg aaagccttac atcactcgtt 26520cttgcctgtc ttagtcacgg agctgtacga ggccataggg agaacggtga tagggtatgg 26580ggaaacagaa cgttgattga gcgtgtttta cggttagtca gcgcaataaa cgccagataa 26640taaaaagccc caccgaggtg aggctttatc acgaaatcta aaacagatta agcgttaacg 26700tgatcaactg cgtcacgaac aagcttgcct agttcgtccc acttaccttc atcgataagg 26760ttagttggaa ccatccaagt accgccacac gcaagaacag aagggatcga taggtattca 26820tcaacattct tcaagcttac gccaccagta ggcatgaatt taacagggta aactgctgtt 26880agtgctttaa gcatgccagt accgcctgaa ggctcagcag ggaagaactt caacgtgcga 26940agacccattt ccattgcttg ctcaactagg cttgggttgt taacacccgg tacgattgca 27000atacctttat cgatacagta ttgaacagta cgtgggttaa aacctgggct tacgatgaaa 27060tcaacaccag cttcgataga tgcgtcaact tgctcgttag tcagtacagt acctgaaccg 27120attagcatgt ctgggaattc tttacgcatg atgcgaatcg cttcgattgc acattctgta 27180cgtagtgtaa tttctgcaca tggcatgcca ttttcaacca acgctttacc tagagggata 27240gcgtcttcag cacggttgat cgcgattaca ggaattactt ttaggtttgc tagttgttca 27300tttaatgtcg tcatgaattc tttctcacgt taaatgtggg cctgctttca actaagcaaa 27360cccttgatta atagttaaag tgcgtaatta tagagacaga tcaggcgtcg cttctagagg 27420aatgatagca cctggatgct gaatcacggt tcctgccaca atatgacctg caaatgcagc 27480atcacgagca ctaccgccgc tcaagcgctt ggccaagaag cctgcactga acgagtcgcc 27540agcggcagtc gtatcaacga tgttgtctac agggttgggt gcaacgtatt gagcgctttg 27600gctttcaacc actaagcagt ctttcgcgcc acgtttaatg acgatctctt tcacaccaga 27660ctctgacgta cgtgtaatac attgttcaat gctttcgtcg ccgtatagct cttgctcatc 27720atcaaacgtc agcagagccg tatctgtgta cttaagcatt ttcaagtacc aagaaatcgc 27780ttcttgttgg ctttcccaaa gtttaggtcg gtagttattg tcgaagaata cttggccgcc 27840ttgagctttg aatttgtcta agaagttgaa tagctgcgtg cgaccatttt ctgtcaagat 27900tgccagcgta ataccactta agtaaatcgc gtcaaaagag aacagcttat caagaagagc 27960aggcgtgtct tcctgatcaa acatgaactt cgctgcagca tcactacgcc agtagtggaa 28020actgcgttca ccagtttcat cggtctcgat gtagtaaagc cctggttgtt tgtggtccag 
28080ctgagcaatt aagctcgtgt cgataccttc cgcttgccaa ttttttaaca tgtcggtact 28140gaatgggtca gtgcctagtg cagttacgta gctcgtgttg atatcttgct cttttgttaa 28200gcgtgacaag taaagtgcag tattcagcgt atcgccacca aaactttgct taagcccgtc 28260ttgtttcttt tgtagctcaa ccatgcactc gccaatgacc gcgatgttta atgatttcat 28320atgcttacct tagcaactga ggttgcgcta gttattattt taggaaatct tcacgcgcag 28380gattgaagat atcaagaagg atgctgtctt gttctagagc aactgcaccg tgcatcatgt 28440gtttacgagc gaagtaagca tcgccttctt taagcacttt cttctcgccg tcgatttcag 28500cttcgaagct accacgaaca acataaccga tttggtcgtg aatttcgtga gtatgagggt 28560ggccaatcgc gcccttatca aagcataggt gtactgccat tagatcgtca gtgtaagcaa 28620cgattttacg cttaatgccg ccaccaagtt cttcccatgg attttcatct aggataaaga 28680aagagttcat tgtgtatctc ctaatctgtt taaatctttt aagtgttact taacttgcat 28740ccatcataag ggaatgagtt caattgtaat acaatatatc taaatttgtg tgatattgat 28800caagcgatag tttatatagc gtaaatgaat caacaactta agaattgctt ggtatctggc 28860attagttagc tgcatcaatg gcttacggtg aattatgtga ctctactcat catttggcga 28920cgaataggta taattaaagc tcatattgta ttactttata tggagtttga aaatttaatc 28980aaagtttaag cagataaact ctttattgag ggtgacaaag aatatgacga ctaaaccagt 29040attgttgact gaagctgaaa tcgaacagct tcatcttgaa gtgggccgtt ctagcttaat 29100gggcaaaacc attgcagcga acgcgaaaga cctagaagca ttcatgcgtt tacctattga 29160tgttccaggt cacggtgaag ctgggggtta cgaacataac cgccacaagc aaaattacac 29220gtacatgaac ctagctggtc gcatgttctt gatcactaaa gagcaaaaat acgctgactt 29280tgttacagaa ttactagaag agtacgcaga caaatatcta acgtttgatt accacgtaca 29340gaaaaacacc aacccaacag gtcgtttgtt ccaccaaatc ctaaacgaac actgctggtt 29400aatgttctca agcttagctt attcttgtgt tgcttcaaca ctgacacaag atcagcgtga 29460caatattgag tctcgcattt ttgaacccat gctagaaatg ttcacggtta aatacgcaca 29520cgacttcgac cgtattcaca atcacggtat ttgggcagta gccgctgtgg gtatctgtgg 29580tcttgcttta ggcaaacgtg aatacctaga aatgtcagtg tacggcatcg accgtaatga 29640tactggcggt ttcctagcgc aagtttctca gctatttgca ccttctggct actacatgga 29700aggtccttac taccatcgtt atgcgattcg cccaacgtgt gtgttcgctg aagtgattca 
29760ccgtcatatg cctgaagttg atatctacaa ctacaaaggc ggcgtgattg gtaacacagt 29820acaagctatg cttgcgacag cgtacccgaa cggcgagttc ccggctctga atgatgcttc 29880tcgtactatg ggtatcacag acatgggtgt tcaggttgcg gtcagtgttt acagtaagca 29940ttactcttct gaaaacggtg tagaccaaaa cattctgggt atggcgaaga ttcaagacgc 30000agtatggatg catccatgtg gtcttgagct atctaaagca tacgaagccg catctgcaga 30060gaaagaaatc ggcatgcctt tctggccaag tgttgaattg aatgaaggcc ctcaaggtca 30120caacggcgcg caaggcttta tccgtatgca ggataagaaa ggcgacgttt ctcaacttgt 30180gatgaactac ggccaacacg gcatgggtca cggcaacttt gatacgctgg gtatttcttt 30240ctttaaccgc ggtcaagaag tgctacgtga atacggcttc tgtcgttggg ttaacgttga 30300gccaaaattc ggcggccgtt acctagacga aaacaaatct tacgctcgtc aaacgattgc 30360tcacaatgca gttacgattg atgaaaaatg tcagaacaac tttgacgttg aacgtgcaga 30420ctcagtacat ggtttacctc acttctttaa agtagaagac gatcaaatca acggtatgag 30480tgcatttgct aacgatcatt accaaggctt tgacatgcaa cgcagcgtgt tcatgctaaa 30540tcttgaagaa ttagaatctc cgttattgtt agacctatac cgcttagatt ctacaaaagg 30600cggcgaaggc gagcaccaat acgactattc acaccaatat gcgggtcaga ttgttcgcac 30660taacttcgaa taccaagcga acaaagagct aaacactcta ggtgacgatt tcggttacca 30720acatctatgg aacgtcgcaa gcggtgaagt gaagggcaca gcaattgtaa gttggctaca 30780aaacaacacc tactacacat ggctaggtgc aacgtctaac gataatgctg aagtaatatt 30840tactcgcact ggcgctaacg acccaagttt caatctacgt tcagagcctg cgttcattct 30900acgcagcaaa ggcgaaacaa cactgtttgc ttctgttgtt gaaacgcacg gttatttcaa 30960cgaagaattc gagcaatctg tcaatgcacg tggtgttgtg aaagacatca aagtcgtggc 31020tcacaccaat gtcggttcgg tagttgagat caccacagag aaatcaaacg tgacagtgat 31080gatcagcaac caacttggcg cgactgacag cactgaacac aaagtagaac tgaacggcaa 31140agtatacagc tggaaaggct tctactcagt agagacaact ttacaagaaa cgaattcaga 31200agaacttagc actgcagggc aggggaaata ataatgagct atcaaccact tttacttaac 31260tttgatgaag cagctgaact tcgtaaagaa cttggcaagg atagcctatt aggtaacgca 31320ctgactcgcg acattaaaca aactgacgct tacatggctg aagttggcat tgaagtacca 31380ggtcacggtg aaggcggcgg ttacgagcac aaccgtcata agcaaaacta catccatatg 
31440gatctagcag gccgtttgtt ccttatcact gaggaaacaa aataccgaga ttacatcgtt 31500gatatgctaa cagcgtacgc gacggtatac ccaacacttg aaagcaacgt aagccgtgac 31560tctaaccctc cgggtaagct gttccaccaa acgttgaacg agaacatgtg gatgctttac 31620gcttcttgtg cgtacagctg catctaccac acgatctctg aagagcaaaa gcgtctgatc 31680gaagacgatc ttcttaagca aatgatcgaa atgttcgttg tgacttacgc acacgacttc 31740gatatcgtac acaaccacgg cttatgggca gtggcagcag taggtatctg tggttacgca 31800atcaacgatc aagagtctgt agacaaagca ctatacggcc tgaaactaga caaagtcagc 31860ggcggtttct tagcgcaact agaccaactg ttttcgccag acggctacta catggaaggt 31920ccttactacc accgtttctc tctgcgtcca atctacctgt tcgcagaagc gattgaacgt 31980cgtcagcctg aagttggtat ctatgaattc aacgattcag tgatcaagac aacgtcttac 32040tctgtattca aaacggcatt cccagacggt acattgcctg ctctgaacga ttcatcgaag 32100acaatctcta tcaacgatga aggcgttatc atggcaacgt ctgtgtgtta ccaccgttac 32160gagcaaactg aaactctact tggtatggct aaccaccagc aaaacgtttg ggttcatgct 32220tcaggtaaaa cactgtctga cgcggttgat gcagcagacg acatcaaagc attcaactgg 32280ggtagcctgt ttgtaaccga cggccctgaa ggcgaaaaag gcggcgtaag catccttcgt 32340caccgtgacg aacaagatga cgacacgatg gcgttgatct ggtttggtca acacggttct 32400gatcaccagt accactctgc tctagaccac ggtcactacg atggcctgca cctaagcgta 32460tttaaccgtg gccacgaagt gctgcacgat ttcggcttcg gtcgctgggt aaacgttgag 32520cctaagtttg gcggtcgtta catcccagag aacaagtctt actgtaagca gacggttgct 32580cacaacacag taacggttga tcagaaaacg cagaacaact tcaacacagc attggctgag 32640tctaagtttg gtcagaagca cttcttcgta gcagacgacc agtctctaca aggcatgagc 32700ggcacaattt ctgagtacta cactggcgta gacatgcaac gcagcgtgat tcttgctgaa 32760cttcctgagt tcgagaagcc acttgtaatc gacgtatacc gcatcgaagc tgacgctgaa 32820caccagtacg acctacccgt tcaccactct ggtcagatca tccgtactga cttcgattac 32880aacatggaaa aaacgcttaa gccgctaggt gaagacaacg gttaccagca cttatggaac 32940gtggcttcag gcaaagtgaa cgaagaaggt tctctagtaa gctggctaca tgacagcagc 33000tactacagcc
tagtaaccag cgcgaatgcg ggcagcgaag tgatttttgc tcgcactggt 33060gctaacgatc cagacttcaa ccttaagagt gagcctgcgt tcatcttacg tcagtctggt 33120caaaaccacg tgtttgcttc tgtactagaa acgcatggtt actttaacga gtctatcgaa 33180gcctctgtag gcgctcgtgg tctagttaaa tcagtatctg ttgtgggcca taacagtgtc 33240gggactgttg ttcgcattca gactacttct ggcaacactt accactacgg tatctcaaac 33300caagctgaag acacgcagca agcaactcac actgttgagt tcgcgggtga gacatactcg 33360tgggaaggat catttgctca actgtaaatg attaacatac atgccgttta acgatggcat 33420gtattgatgt ggtgctttgc gggaacgaag catcacattg aattcagtcg tgattgcaaa 33480tcgttcgttg ataccaacaa cgactgaata catcgggaat aagtcaaacc gagtaactca 33540ctgcgagttg ctcggttttt ttatgcgtgc tgcttttata agaaggggga aagaggatgg 33600ggcaacggag cttccctttt ccttcgaatc ttacagagtg ggctaaagta taatttagga 33660tttaaaaata aagggattca aggatgaagt ggttattggc aatagttgcg atgtctggtg 33720tcgcattggc ggcagaaaat aagaatgttg aggtgagcag tgagcatttc gtccgttatc 33780aataccaaga caaaatcagc tatggaaagc tagacaatga cgcagtgtta ccggtcagcg 33840gcgatctctt tggcgaatat tcggtagcaa aaaattcgat cccgttagag tcggttgagg 33900tgttactacc gacaaaacca gagaaagtct tcgccgtcgg gatgaacttc gctagccact 33960tagcctcacc tgccgatgca ccaccgccga tgtttcttaa acttccttct tctttgattc 34020tcacgggcga agtgattcaa gtgccaccaa aagcaagaaa tgttcatttt gaaggcgagc 34080tggtggttgt gattggtaga gagctcagtc aagccagtga agaagaagcc gaacaagcga 34140tctttggcgt cacggtgggc aacgatatta ctgaaagaag ttggcaaggc gccgatttac 34200aatggctccg agcgaaagct tccgatggtt ttggcccggt tggcaacaca attgtgcgcg 34260gcattgatta caacaatatt gagttaacca ctcgtgttaa cggtaaagtg gttcaacaag 34320aaaatacttc gttcatgatc cacaagccaa gaaaagtcgt gagctatttg agctattatt 34380ttaccctcaa accgggcgat ctaattttca tgggcacgcc aggtagaact tatgctctgt 34440ccgacaaaga tcaagtgagt gtcacgattg aaggggtagg gactgtggta aatgaagtgc 34500ggttctgatg gaattgaatt agcgttggga gctacagagc ttatgtctga atttgcagta 34560cgtagacgac ttgaacctat taatttgaac taggttaact tgtgtagtga ataaactaac 34620cgtttttcgg ttccattatt ttagcccaat tgagtgatgt ttttggaagc gagcagagaa 34680aacgagaatg acgaacctac 
atgctcggcg agggttttgt tagtggtgta acacagtgtt 34740tctagctaag agaaattaga tgctttctaa gtgtttgatt aattgaataa attaacaggt 34800actatccgct ttgattttac tcaattggct gtaggtttaa atactgttat agtgttcctt 34860aaataataca taaacataac atataaataa gcgaacttat ggctagcact tttaattcaa 34920tttcgggctc gaagcgtagc ctgcacgtgc aagtagcacg cgaaatcgct cgaggaattt 34980tgtctggtga tctgccgcaa ggttctatta ttcctggtga aatggcgttg tgtgaacagt 35040ttggtatcag ccgaacggca cttcgtgaag cagttaaact actgacctct aaaggtctgt 35100tagagtctcg ccctaaaatt ggtactcgcg tagtcgaccg cgcatactgg aacttccttg 35160atcctcaact gattgaatgg atggacggac taaccgacgt agaccaattc tgttctcagt 35220ttttaggcct tcgccgtgcg atcgagcctg aagcgtgtgc actggcggca aaatttgcga 35280cagctgaaca acgtatcgag ctttcagaga tcttccaaaa gatggtcgaa gtggatgaag 35340ctgaagtgtt tgaccaagaa cgttggacag acattgatac tcgtttccat agcttgatct 35400tcaatgcgac cggtaacgac ttctatctac cgttcggtaa tattctgact actatgttcg 35460ttaacttcat agtgcattct tctgaagagg gaagcacatg catcaatgaa caccgcagaa 35520tctatgaagc tatcatggcc ggtgattgtg acaaggctag aattgcttct gctgttcact 35580tgcaagatgc caaccaccgt ttggcaacag cataatagaa atgatttaaa gcgcacctga 35640gccatctcac atcgagatga acaccctcac gttcggataa acgactttaa aaggtatgcc 35700tagtgcatgc cttttttggt ttttagaccg cgtgttgcac tatctgtagc actattttgg 35760gtcagtcttt tcgctacgtc tgttaagcta ttcttccacg ttacaacccg ccttgttttt 35820aacgtctacg taacaatccc caagcatcgt tctaaacaca tttttagact gtctgtacct 35880gacaagtagt tatgcgacag ccgggatttt tcacctctca gtattctaaa tctgggatta 35940aacaaacagg gttctcggat ttaatattta gatatttaaa tcgaattcta atgatattac 36000ccactcgatt tcgtaaaaaa cactggttta ttgtgtgatg aatgatgtgg gtttggtcaa 36060ggattctctt ttattatttt tgagaacttt atgtttatat gtgtttgatt gtatttgtta 36120ataagtgtgc aaagtctcac ttttatttta agttgttgtt tttaatgttt aatttatttt 36180gagtgtttga tcttttgggt ttttacctaa aaccctaaca atttccttaa tggattagcc 36240atattccatc ctatgtcata tatataatta acttaatcaa tcaaaataag atcaccatca 36300cttatttgga ttattgtact acaaataaag agtcgaattt cctatagtcc tcgtaacaaa 36360ttaaaacgga caaaggatac acgatggaac 
tcaacacgat tattgtcggc atttatttcc 36420tattcttgat tgcgataggt tggatgttta gaacatttac aagtactact agtgactact 36480tccgcggggg cggtaacatg ttgtggtgga tggttggtgc aaccgccttt atgacccagt 36540ttagtgcatg gacattcacc ggtgcagcag gtaaagcgta taacgatggt ttcgctgtag 36600cggtcatctt cgtagccaac gcatttggtt acttcatgaa ctacgcgtac ttcgcgccga 36660aattccgtca acttcgcgtt gttacggtaa tcgaagcgat tcgtatgcgt tttggtgcga 36720ccaacgaaca agtattcact tggtcttcaa tgccaaactc agtggtatct gcgggtgtgt 36780ggttaaacgc attggcaatc atcgcttcgg gtatcttcgg tttcgacatg aacatgacta 36840tctgggtgac tggcctagtg gtattggcaa tgtcggtaac aggtggttca tgggcggtaa 36900tcgcatctga cttcatgcag atggttatca tcatggcggt aacggtaact tgtgcggttg 36960tagcggttgt tcaaggtggc ggtgttggtg agattgttaa caacttccca gtacaagatg 37020gtggttcgtt cctttggggc aacaacatca actacctaag catctttacg atttgggcat 37080tcttcatctt cgttaagcag ttctcaatca cgaacaacat gcttaactct taccgttacc 37140tagcggctaa agactcaaag aacgctaaga aagctgcact gcttgcttgt gtgttgatgt 37200tgtgtggtgt gtttatttgg ttcatgcctt cttggttcat tgcaggccaa ggtgttgatt 37260tatcagcggc ttacccgaat gcaggtaaaa aagcgggtga ctttgcttac ctatacttcg 37320tacaagagta catgccagca ggtatggttg gtctattagt tgccgcgatg tttgcagcga 37380caatgtcttc aatggactca ggtctaaacc gtaactcagg tatttttgtt aagaacttct 37440acgaaacaat cgttcgtaaa ggtcaagcat cagagaaaga gctagtaacc gtatctaaaa 37500ttacttcagc ggtatttggt ttcgctatta tcctaatcgc acagttcatc aactcattaa 37560aaggcttaag cctgtttgat acgatgatgt acgtaggtgc gttaatcggc ttccctatga 37620cgattcctgc attccttggt ttcttcatca agaagactcc ggactgggct ggttggggaa 37680cgctagttgt tggtggtatc gtatcttatg tggttggttt tgttatcaac gcggagatgg 37740tagcagcggc gtttggtctt gatactctaa caggacgtga atggtctgat gttaaagttg 37800cgattggtct gattgctcac atcacgctaa ccggtggctt cttcgtacta tctacgatgt 37860tctacaagcc tctatcaaaa gaacgtcaag cggatgttga taagttcttt ggcaacttag 37920ataccccatt agtagctgaa tcggcagagc aaaaagtgtt ggataacaaa caacgtcaaa 37980tgcttggtaa actgattgcg gtagcgggtg ttggtattat gctgatggct cttctgacta 38040acccaatgtg ggggcgccta gtcttcatct tatgtggtgt 
gatagtgggt ggtgtcggta 38100ttctacttgt gaaagcggtc gatgacggcg gcaagcaagc gaaagcagta accgaaagct 38160aatacataga aaacgtttat aatagaatgc gacgactcga aagggcgtcg cattttttat 38220tctgcggaac tggaaaaccg tcaggtgaaa gatatctgac ctaaatcacg aaaactgtac 38280aaagtggttc aatcgaatcg aaatatattc aattgtccta caataagacg tatattgttg 38340ctaattcctt tcaatcaact tgaaaaataa gtgagttaga atgagcgacc aaaaatctct 38400tgatgcaatc aggaagatga agctggaaaa cgatacttca gcaggtaatc ttgtagacct 38460actccctatc gaagttcaaa cacgtgactt cgacctatca ttcctagaca ccttgagcga 38520agcacgtccg cgtcttcttg ttcaagctga tcagctagaa gaattcaaag caaaagtgaa 38580agctgatcaa gctcactgta tgtttgatga tttctacaac aactctaccg ttaagttcct 38640tgagactgct cctttcgaag agcctcaagc gtacccagct gagacggtag gtaaagcttc 38700tctatggcgt ccttattggc gtcaaatgta cgttgattgc caaatggcac tgaacgcgac 38760acgtaaccta gcgattgctg gtgttgtaaa agaagacgaa gcgctcattg cgaaagcaaa 38820agcttggact ctaaaactgt ctacgtacga tccagaaggc gtgacttctc gtggctataa 38880cgatgaagcg gctttccgtg ttatcgctgc tatggcttgg ggttacgatt ggctacacgg 38940ctacttcacc gatgaagaac gccagcaagt tcaagatgct ttgattgagc gtctagacga 39000aatcatgcac cacctgaaag tgacggttga tctattgaac aacccactaa atagccacgg 39060tgttcgttct atctcttctg ctatcatccc aacgtgtatc gcgctttacc acgatcaccc 39120gaaagcaggc gagtacattg catacgcgct agaatactac gcagtacatt acccaccatg 39180gggcggtgta gacggcggtt gggctgaagg tcctgattac tggaacacgc aaactgcatt 39240cctaggcgaa gcattcgacc tattgaaagc atactgtggt gtagacatgt ttaacaaaac 39300attctacgaa aacacaggtg atttcccgct ttactgcatg ccagttcact ctaagcgcgc 39360gagcttctgt gaccagtctt caatcggcga tttcccaggt ttaaaactgg cttacaacat 39420caagcactac gcaggtgtta accagaagcc tgagtacgtt tggtactata accagcttaa 39480aggccgtgat actgaagcac acaccaaatt ctacaacttc ggttggtggg acttcggtta 39540tgacgatctt cgttttaact tcctttggga tgcacctgaa gagaaagccc catcgaacga 39600tccactgttg aaagtattcc caatcacggg ttgggctgca ttccacaaca agatgactga 39660gcgtgataac catattcaca tggtattcaa atgttctccg tttggctcaa tcagccactc 39720tcacggtgac caaaacgcat ttacgcttca cgcatttggt gaaacgctag 
cgtcagtaac 39780aggttactat ggtggtttcg gtgtagacat gcacacgaaa tggcgtcgtc aaacgttctc 39840taaaaacctg ccactatttg gcggtaaagg tcagtacggc gagaacaaga acacaggcta 39900cgaaaaccac caagatcgct tttgtatcga agcgggcggc actatctctg acttcgacac 39960tgaatctgat gtgaagatgg ttgaaggtga tgcaacggca tcttacaagt acttcgttcc 40020tgaaatcgaa tcttacaagc gtaaagtctg gttcgttcaa ggtaaagtct tcgtaatgca 40080agacaaggca acgctttctg aagagaaaga catgacttgg ctaatgcaca caactttcgc 40140aaacgaagtg gcagacaagt ctttcactat ccgtggcgaa gttgcgcacc tagacgtaaa 40200cttcatcaac gagtctgctg ataacatcac gtcagttaag aacgttgaag gctttggcga 40260agttgaccca tacgagttca aagatcttga gatccaccgt cacgtggaag tggaattcaa 40320gccatcgaaa gagcacaaca tcctgacgct tcttgttcct aataagaatg aaggcgagca 40380agttgaagtg tttcacaagc ttgaaggcaa cacgctactg ctaaatgttg acggcgaaac 40440ggtttcaatc gaactgtaat ccgctgaagt aacagaagtt agatactaaa aactccgagt 40500gaaagctcgg agtttttttg tttggctagc caattaagtt ggagttggat aagtcagtta 40560agttgtatta gttgacaacg ttggcaaacc gatcaggttg aaagaaaact taattggcca 40620gagataaata gcttctcgat gccaagtcag tggctgaggg ctaaatctgg acattgatgc 40680acataaagac cggcatgtac ttagccacta tgctcaatga aatgtgcagg agtcgtataa 40740gagactcgta tatatcgctc tgttagaaga acagggcgcc aacgcctgtt tcctagcaat 40800tgttatgact tacttttccg tgaacagtct tatcactggc tgagtaaggg agtagtgaac 40860tatacatagg taaaggcgta gcttgttctt actaatcgta tgacatttaa cgtacgttat 40920tcgttattat aatgaacata taatcataca atactatatt tggagtttga acatgactaa 40980acctgtaatc ggtttcattg gcctaggtct tatgggcggc aacatggttg aaaacctaca 41040aaagcgcggc taccacgtaa acgtaatgga tctaagcgct gaagctgttg ctcgcgtaac 41100agatcgcggc aacgcaactg cattcacttc tgctaaagaa ctagctgctg caagtgacat 41160cgttcagttt tgtctgacaa cttctgctgt tgttgaaaaa atcgtttacg gcgaagacgg 41220cgttctagcg ggcatcaaag aaggcgcagt actagtagac ttcggtactt ctatccctgc 41280ttctactaag aaaatcggcg cagctcttgc tgaaaaaggc gcgggcatga tcgacgcacc 41340tctaggtcgt actcctgcac acgctaaaga tggtcttctg aacatcatgg ctgctggcga 41400catggaaact ttcaacaaag ttaaacctgt tcttgaagag caaggcgaaa acgtattcca 
41460cctaggggct ctaggttctg gtcacgtgac taagcttgta aacaacttca tgggtatgac 41520gactgttgcg actatgtctc aagctttcgc tgttgctcaa cgcgctggtg ttgatggcca 41580acaactgttt gacatcatgt ctgcaggtcc atctaactct ccgttcatgc aattctgtaa 41640gttctacgcg gtagacggcg aagagaagct aggtttctct gttgctaacg caaacaaaga 41700ccttggttac ttccttgcac tttgtgaaga gctaggtact gagtctctaa tcgctcaagg 41760tactgcaaca agcctacaag ctgctgttga tgcaggcatg ggtaacaacg acgtaccagt 41820aatcttcgac tacttcgcta aactagagaa gtaatcgacg tacgacctcg ctagggtatt 41880gcttgtcttc taggcggcga tacctcagcg aggttcgttt ttatctgcca tacccaaccc 41940tttgttccct tgttaaaatc ttctacttct acttctactt ctacttcaat ttcctcagtt 42000acacctaatc aaaactctgt ttaactctgt tactgcctca attcctattt ttttctatat 42060ctatttctaa cggtaaattc aaaaccttct agcaccaact cattcactca tttttcctcg 42120caagctcaaa ctcaacgcgc ttacatgatt gttggtgatg gcttaacacc gctcgtatat 42180cggtcctgaa aagaaagtaa aaaaaaagcc cacacagctg gtgactgtat gggcatgttc 42240ggacgagccg tctggacaaa caaatgagca atagtaagtg aaaaaacgaa taacgagatc 42300ccccgacagt ttctacgtta aacgcgttca atgaccttaa agcggctgct tcaattatca 42360ctttgaattg aacaaaagca tccagaaaga acttaagtta tgattcaaat acaccatagt 42420acaagactta ttgtattaca aataaatttt aagattgaat gcctttagtg aatggttagt 42480tggtagaagt gtgagttaag actcattttt tcactcagct gggtgaggta aagaagaaga 42540gttttcgaaa agatgttatc ggaaaaatga tgagctaatt atctaaaaat cgatctattt 42600taatgtgtta tgcgtcaatg tttaacttcg aacaaaatcc aaactcataa atgataccta 42660tgtcacaggg cggttttagc cagttttaat atatcaagat cgctcacaga atgtctggtc 42720aattaaacat acaatattaa ttaagttgat ggttgtgacg atggatcggc atgaacaagt 42780ttcgctttcc gtatcttcga aaatgtaaaa aatggccatt tcattcggat gaaaataata 42840gacataggtt gatatggatg atgagtttta tgaattcaaa attgtctcta gggtttaaag 42900gaaaattgat tttaatggta gcggtcgtca gttctagtgc tttggcattt acgaactggt 42960ttacgcttaa cttggccact gaacaggtaa accaaacgat ttataacgag attgatcact 43020cgcttacgat agaaatcaat caaatagaaa gtaccgttca gcgcaccatc gataccgtta 43080actctgttgc acaagagttc atgaaatccc cttaccaagt gccgaatgaa gcactcatgc 
43140attatgccgc taagcttggt ggcattgaca agattgtggt gggttttgac gacggccgtt 43200cttatacctc tcgcccttca gagtctttcc ctaacggtgt tggaataaaa gaaaaataca 43260atccaaccac tcgaccttgg tatcaacaag cgaaattgaa atcaggctta tcttttagtg 43320gtctgttttt cactaagagt actcaagtgc ctatgatcgg tgtgacctac tcataccaag 43380atcgtgtcat catggccgat atacgctttg acgatttgga aacgcagctt gaacagctgg 43440acagcatcta cgaagccaaa ggcattatca tcgacgaaaa ggggatggtg gtcgcttcaa 43500caatcgaaaa cgtgcttccg caaaccaata tatcttctgc agacactcaa atgaaactca 43560acagtgccat tgaacagcct gatcaattca ttgagggtgt gattgatggt aaccagagaa 43620tcttgatggc caagaaagtg gatattggca gccagaaaga gtggttcatg atctccagta 43680ttgaccctga actcgcgctc aatcagctga atggcgtgat gtcgagtgcg cgcatcctta 43740tcgtcgcttg tgtacttggc tcggtgatat tgatgatttt acttctgaat cgtttctacc 43800gcccaatcgt gtcactgcgc aaaatcgtcc acgatctatc acaaggtaac ggagacctca 43860ctcaaaggct tgctgagaag gggaatgatg acttagggca tatcgccaaa gacatcaact 43920tgttcattat cggcttacaa gagatggtta aggatgtgaa atacaagaac tcggatctcg 43980ataccaaggt actgagtatt cgcgaaggtt gtaaagaaac cagcgatgta ctgaaagttc 44040atactgatga aacggttcaa gtggtctctg cgattaacgg cttgtctgaa gcatcaaacg 44100aagtagagaa gagttctcag tcggcggcag aagcagcaag agaggccgct gtgttcagtg 44160atgagacgaa acagattaac acggtgacgg aaacctatat cagtgatctt gagaagcaag 44220tctgcaccac ttctgatgac attcgctcaa tggccaatga aacgcagagc atccagtcta 44280tcgtgtctgt gattggcgga attgcggaac aaactaattt gctggcattg aatgcgtcaa 44340ttgaagcggc gagggcgggt gaacatggtc gaggtttcgc ggtggttgct gatgaagtcc 44400gtgcgctagc caaccgaacg caaatcagta cctctgaaat tgatgaagcg ttatctggct 44460tgcagtctaa atcagatggt ttggttaaat ctattgagtt gaccaaaagt aactgtgaac 44520tgactcgcgc tcaagttgtt caagctgtaa acatgttggc gaagctaacc gagcagatgg 44580aaacagtaag tcgttttaat aatgacattt cgggttcgtc tgttgagcaa aacgccctta 44640ttcagagcat tgctaagaac atgcataaga ttgaaagctt tgttgaggag cttaataaac 44700taagccaaga tcagttaact gaatcagcag aaatcaaaac acttaacggt agcgttagtg 44760aattgatgag cagctttaag gtttaatgtt tctaatattt atacctaaaa atcaacatgt 
44820taagtttagt tgttgatctg aaggccactc aataactgtc gagtttagag tggcttttct 44880gcgttgttct tgagtctaac tctacgtaat atccgttcat ttcacttcat ttgccgcatc 44940tcacattctg ataaatagac aattgacata aaatagtaca aatatacatt gtcactctac 45000tcttatggat aagtgagata aatgtgaata agccaatctt tgtcgtcgta ctcgcttcgc 45060ttacgtatgg ctgcggtgga agcagctcca gtgactctag tgacccttct gataccaata 45120actcaggagc atcttatggt gttgttgctc cctatgatat tgccaagtat caaaacatcc 45180tttccagctc agatcttcag gtgtctgatc ctaatggaga ggagggcaat aaaacctctg 45240aagtcaaaga tggtaacttc gatggttatg tcagtgatta tttttatgct gacgaagaga 45300cggaaaatct gatcttcaaa atggcgaact acaagatgcg ctctgaagtt cgtgaaggag 45360aaaacttcga tatcaatgaa gcaggcgtaa gacgcagtct acatgcggaa ataagcctac 45420ctgatattga gcatgtaatg gcgagttctc ccgcagatca cgatgaagtg accgtgctac 45480agatccacaa taaaggtaca gacgagagtg gcacgggtta tatccctcat ccgctattgc 45540gtgtggtttg ggagcaagaa cgagatggcc tcacaggtca ctactgggca gtcatgaaaa 45600ataatgccat tgactgtagc agtgccgctg actcttcgga ttgttatgcc acttcatata 45660atcgctacga tttgggagag gcggatctcg ataacttcac caagtttgat ctttctgttt 45720atgaaaatac cctttcgatc aaagtgaacg atgaagttaa agtcgacgaa gacatcacct 45780actggcagca tctactgagt tactttaaag cgggtatcta caatcaattt gaaaatggtg 45840aagccacggc tcactttcag gcactgcgat acaccaccac acaggtcaac ggctcaaacg 45900attgggatat taatgattgg aagttgacga ttcctgcgag taaagacact tggtatggaa 45960gtgggggtga cagtgcggct gaactagaac ctgagcgctg cgaatcgagc aaagaccttc 46020tcgccaacga cagtgatgtc tacgacagcg atattggtct ttcttatttc aataccgatg 46080aagggagagt gcactttaga gcggatatgg gatatggcac ctctaccgaa aattctagct 46140atattcgctc tgagctcagg gagttgtatc aaagcagtgt tcaaccggat tgtagcacca 46200gcgatgaaga tacaagttgg tatttggacg acactagaac gaacgctacc agtcacgagt 46260taaccgcaag cttacgaatt gaagactacc cgaacattaa taaccaagac ccgaaagtgg 46320tgcttgggca aatacacggt tggaagatca atcaagcatt ggtgaagttg ttatgggaag 46380gcgagagtaa gccagtaaga gtgatactga actctgattt tgagcgcaac aaccaagact 46440gtaaccattg tgacccgttc agtgtcgagt taggtactta ttcggcaagt gaagagtggc 
46500gatatacgat tcgagccaat caagacggta tctacttagc gactcatgat ttagatggaa 46560ctaatacggt ttctcattta atcccttggg gacaagatta cacagataaa gatggggaca 46620cggtctcgtt gacgtcagat tggacatcga cagacatcgc tttctatttc aaagcgggca 46680tctacccaca atttaagcct gatagcgact atgcgggtga agtgtttgat gtgagcttta 46740gttctctaag agcagagcat aactgagttc tctgatgttt ggttagccat gtcggtaatg 46800aagaagacca tattgatgcc tacaatgtgg tctttttttg tttttggaca cttacagtga 46860tgtgttttga aggacaaatg ttctgctcga atcatgcaaa tacacacgat tacagctcgc 46920ttgttctgcc cttgctagct catttcgcat tccaaattct tatatattgt cttttatcaa 46980taggaaatgt gatccagtta aagtatggaa aaatcggaaa gtgttcctag tctcatttat 47040ccaacgaagt gttttatttg tattataaga ttacgtaata ttttcgtgtt atcgcaaata 47100ctgataggtg aatcgcctta tagctcgtgt ttgctgattt agctttcact tacgaacgct 47160gtctttgtat tataataatg gattaaatat gaaacaaatt actctaaaaa ctttactcgc 47220ttcttctatt ctacttgcgg ttggttgtgc gagcacgagc acgcctactg ctgattttcc 47280aaataacaaa gaaactggtg aagcgcttct gacgccagtt gctgtttccg ctagtagcca 47340tgatggtaac ggacctgatc gtctcgttga ccaagaccta actacacgtt ggtcatctgc 47400gggtgacggc gagtgggcaa cgctagacta tggttcagta caggagtttg acgcggttca 47460ggcatctttc agtaaaggta atcagcgcca atctaaattt gatatccaag tgagtgttga 47520tggcgaaagc tggacaacgg tactagaaaa ccaactaagc tcaggtaaag cgatcggcct 47580agagcgtttc caatttgagc cagtagtgca agcacgctac gtaagatacg ttggtcacgg 47640taacaccaaa aacggttgga acagtgtgac tggattagcg gcggttaact gtagcattaa 47700cgcatgtcct gctagccata tcatcacttc agacgtggtt gcagcagaag ccgtgattat 47760tgctgaaatg aaagcggcag aaaaagcacg taaagatgcg cgcaaagatc tacgctctgg 47820taacttcggt gtagcagcgg tttacccttg tgagacgacc gttgaatgtg acactcgcag 47880tgcacttcca gttccgacag gcctgccagc gacaccagtt gcaggtaact cgccaagcga 47940aaactttgac atgacgcatt ggtacctatc tcaaccattt gaccatgaca aaaatggcaa 48000acctgatgat gtgtctgagt ggaaccttgc aaacggttac caacaccctg aaatcttcta 48060cacagctgat
gacggcggcc tagtattcaa agcttacgtg aaaggtgtac gtacctctaa 48120aaacactaag tacgcgcgta cagagcttcg tgaaatgatg cgtcgtggtg atcagtctat 48180tagcactaaa ggtgttaata agaataactg ggtattctca agcgctcctg aatctgactt 48240agagtcggca gcgggtattg acggcgttct agaagcgacg ttgaaaatcg accatgcaac 48300aacgacgggt aatgcgaatg aagtaggtcg ctttatcatt ggtcagattc acgatcaaaa 48360cgatgaacca attcgtttgt actaccgtaa actgccaaac caagaaacgg gtgcggttta 48420cttcgcacat gaaagccaag acgcaactaa agaggacttc taccctctag tgggcgacat 48480gacggctgaa gtgggtgacg atggtatcgc gcttggcgaa gtgttcagct accgtattga 48540cgttaaaggc aacacgatga ctgtaacgct aatacgtgaa ggcaaagacg atgttgtaca 48600agtggttgat atgagcaaca gcggctacga cgcaggcggc aagtacatgt acttcaaagc 48660cggtgtttac aaccaaaaca tcagcggcga cctagacgat tactcacaag cgactttcta 48720tcagctagat gtatcgcacg atcaatacaa aaagtaatct aatcgaataa cacttaatat 48780taaaggtatt gcaatagcct ccagccttag ggtttggagg cttttttgtg cctgctgttg 48840gttgggctta agcgtatgat ttaattgagt aggagagggg tagttatcag ttgcacagag 48900tttaagacat tatcattaag ctcattcagt attaacttta gtcattatca gtcactatta 48960ccccccaagc gccgatcaca attaacctag ctcatgatta atctcagtta ccaataggct 49020agcctgtagc ggattcaaac ccaaataatg tcgtgatgtt tatcggaatc accatagctc 49080gaaaactttg accttgttct caaggctttg ccaatgcacg aacgtattat gtgcgtggtt 49140tactaataag cgttagctcg gctgactact catactgttc ttgaaaccgt tactcttggg 49200ttgtttagct agactcctag caacagccat aaatagtgct ctaactcttt cataattaga 49260agggtagggt tagccattct attggttcca atgctttatg aaatactagg cgggctcaag 49320tcgatgatca aacgactcta acagcttaag gttatgcgct tttgcgttag ttacctgcag 49380gccgtaaatg ccctgattgt agttgtacgg tgacgctgaa taataatttg taggattagt 49440atagaactga gagactttgt ctatctatga tcgatacagg ctttgagagg gctggatcag 49500tagaaagaca gaatgacaat tagcactaga gttattttgg tttttaatta gagttaataa 49560aatagatatt tggtttgtta aatttaatcg tgtcataagc tctgtgtttt aaaaaataaa 49620aaaagccata gcagttgcta tggctttgaa taagtcaggt tctaaggtaa gcaaacagca 49680agtcaacttg tctgttttga tattcttagt cttagttcaa gatattttct ttacctgccg 49740cagtgttcac tgcagatggt 
tgtgcgtaga tggctgtatt tcttatatct ttaccgttgt 49800cagctgaaag tagggttttg taaccattga acgtattgct ttcaatagtg acattacagc 49860caaggttatc atcactattc tttatgacgg cactgctaga atcaccaact ttaattccgc 49920ccacatcact accaccagag ttaatagtga atgtattatt tgtgatttgt gaaccaaatc 49980gcccgttgtc actactacag ttaatacgaa ttgcattatt ttggaaactg ccacttaaac 50040cgacaaattc gctattgtct aatgtaaagt aacctcgaga gaataaccaa cttgcttttt 50100tagtacctag atcatcttcg gtaatgccgt ttgcatcgaa ctttaggttt tcaagtgcta 50160caggatctga atctttacca attttaccaa taacgatcgc accagtttca ttatctgaag 50220taccagctga ctccctacca aaacacccgg ccaaattgtc gttagcaaaa gtcatgtttt 50280tgatacctgc accgggtgca gtgacatcaa tacaagcatc tccggtaatg gttgctaaac 50340cagcaccatc aattgtgaca gctttattta gctcaataac accggtatca aacgtacctt 50400cagatgataa atcaataatc gcgccatctt ctgctgatgc aatcgcagcg ttcacatcat 50460cgactgattt aggttttcca tcatcggcat tttctaatgc agttataaca gattcatctg 50520taatctctgt tgctgagtaa gctgtttcta cgtcatcttt ttcacaggtc atatctagct 50580gttgctcagt cactgtgtaa gtacagtcct taccttcaaa ggaaacaaca ccttcttctt 50640cagcagtata tattgaactc tcaaaagtga agccattagc tacgtctcca ctgtaaatgc 50700ttagagcacg agtaccttcc tcattattat caaagcgata tggtgaagtt ccgctgagtg 50760actgtgcagc aacagcacca cctgtcagat cccaatagac gttttctata gagtaaactt 50820caacaggttc aacagggtct gttccgcctg gatctgttgg aataggtaaa ccatcactgt 50880tacaaccaaa taataaacct gtcgaaagag cgacagctgt agcgacttta gaaatttgca 50940taaaatattc tctttatgat attaaatcca tatgtaaatc acataagaaa tagataatga 51000atagtcgtta aatatttatt aggatgaagc taattctgat tagaacatcc tattatttaa 51060aataaagtaa ttaaaatatg cccaaataaa ttacaagagg agagggctat tttatatttt 51120gactatttta ttattagaat gagtaagcaa taccaacacg gtatttagct tcacgatctt 51180ttgaagatga actgatatca gaagaccaga tttcagcaaa aggtttccaa gaaccgaact 51240tgtagtttac ctttaagcca gcatcccatt cccagtcatc actattataa agaagtacgt 51300tatccaaaga ttttacatag tttgcttcgt aagaaaggcc aagcttaggt agagattcaa 51360ttttgtaaga gcccgtcagt gtaactttag acttttgagc tgattctaaa cgctcgccag 51420tttcagaatc tttgtcgcca aattgtgtgt 
ggttacggaa gtcagcatat tcatgacggt 51480aacgaatagc agttgttaaa cccatatctg ctttatagcc aacgcggaac tgaggtttaa 51540acgtaacctt tttcatcttc cagtcgccat cgttagcatt aggctcatcc caatcccaag 51600caataggcat acccatttgt agataccaat tattgtctat tttgtatgtc gcagtgttat 51660cgatctccat accatagatg taccaattgc catcgtaaaa actctggctg tttgctgatt 51720taacagaacc tgaatcttca tcatagtaag agtcatcacc gtggaactta agttctagac 51780cagtagagtg cttccacttg tctgacagct taaagctttc acctagctta actcggtgtt 51840gatggcgagc gtctacgtga gccgtatcac cattagtctt tgtataatcc gtcgcagcac 51900gatactcgta acgataatca agagatgcac cagcagctgt gcccgctaaa agagtacatg 51960caacagctgc agcaattttt gtaacagaat tcataccttt gtctcactat tattttttta 52020ttttggatac atccaatgta cccctgactc acaaaccaat accttacatg gtattaaatt 52080aatgtatgac aaatatggta tttattccta gggtagattt ctgtgagatc tatcaaaagt 52140tccgactaat ggcctattta tatagctaaa tgttatgaat atctcaattt aaggcttacc 52200aatcaaatca atcatgactc agttctcata ttaacaaacc ttgtaagctc agttggttgt 52260atgtgttaaa ataatacaaa tataagaata ttcccacact ttcatatcga tgttctagtt 52320gttgtggttt aaacataacg gcgcatgttg agggatatag atataaacca ccgccaaatg 52380tttggtaaaa gttaaaagat ggcgaaatgt aaattctatt tattggttgg tttatttaag 52440tcgaagagaa aatatttagt actaattcgt gttcaaaagt agtttctgtg ctgagagtgt 52500actcagtatc tgttaacaat aaaggatgag tcatgtttaa gaaaaacata ttagcagtgg 52560cgttattagc gactgtgcca atggttactt tcgcaaataa cggtgtttct taccccgtac 52620ctgccgataa attcgatatg cataattgga aaataaccat accttcagat attaatgaag 52680atggtcgcgt tgatgaaata gaaggggtcg ctatgatgag ctactcacat agtgatttct 52740tccatcttga taaagacggc aaccttgtat ttgaagtgca gaaccaagcg attacgacga 52800aaaactcgaa gaatgcgcgt tctgagttac gccagatgcc aagaggcgca gatttctcta 52860tcgatacggc tgataaagga aaccagtggg cactgtcgag tcacccagcg gctagtgaat 52920acagtgctgt gggcggaaca ttagaagcga cattaaaagt gaatcacgtc tcagttaacg 52980ctaagttccc agaaaaatac ccagctcatt ctgttgtggt tggtcagatt catgctaaaa 53040aacacaacga gctaatcaaa gctggaaccg gttatgggca tggtaatgaa ccactaaaga 53100tcttctataa gaagtttcct gaccaagaaa tgggttcagt 
attctggaac tatgaacgta 53160acctagagaa aaaagatcct aaccgtgccg atatcgctta tccagtgtgg ggtaacacgt 53220gggaaaaccc tgcagagccg ggtgaagccg gtattgctct tggtgaagag tttagctaca 53280aagtggaagt gaaaggcacc atgatgtacc taacgtttga aaccgagcgt cacgataccg 53340ttaagtatga aatcgacctg agtaagggca tcgatgaact tgactcacca acgggctatg 53400ctgaagatga tttttactac aaagcgggcg catacggcca atgtagcgtg agcgattctc 53460accctgtatg ggggcctggt tgtggcggta ctggcgattt cgctgtcgat aaaaagaatg 53520gcgattacaa cagtgtgact ttctctgcgc ttaagttaaa cggtaaatag cacatagcat 53580aaccaatagt ctagctagac gcagtcctta aggaatattt tcgaagacca cttaaccgaa 53640tgttgagtgg tctttttgtt ttatatgagt tttaagatga acttggtatt aatgtgacct 53700tggtatcaat gagggtgtac gtgaagccta ccaatgaaag gtacagctaa aacaatacaa 53760ccttgtcaaa agacaaggtt gcattcagaa agcgtaggaa gattttagga cgacaactcg 53820atacggagtt tagtcataca tcaactcttt ggctttgtcg gcatcaaact ctttaagaga 53880ctttcgagcc aagtgacgga atgggaaagc tttcacgact tcttcgaatg gttggatggc 53940aaatgcccaa aagatagaac cgtctaatcc aaagatgatc aatgcacaca atggaattga 54000aattacccat tgaccagtaa agttgatttt gaagactgcg gtcgtttttc ctagggctct 54060taatacattc ccatgaaccg 54080
SEQ ID NO: 3; length: 22890; type: DNA; organism: Vibrio splendidus
gtgctttgtg acaacggggg atgtatggat attgaagttt cgcgccaggt tgcggtagtt 60gaagctacga gtggagatgt cgtcgtagtt aagccagacg gcagcgcaag aaaagtttca 120gttggcgata ccatccgtga aaatgagatc gtgattacgg ccaacaagtc agagcttgta 180ttaggcgttc agaatgattc gattccggtt gcagagaatt gcgtcggttg tgttgatgaa 240aacgctgcat gggtagatgc cccaatagct ggtgaggtta attttgactt acagcaagca 300gacgcagaaa ccttcactga agacgacctt gctgcaattc aagaagccat tttaggtggt 360gccgatccga ctcaaatctt agaagcaacg gctgctggtg gcggactagg ttctgcaaat 420gctggctttg tgacgattga ctataactac actgaaactc atccatcgac tttctttgag 480accgctggtc tagcagaaca aactgttgat gaagacagag aagaattcag atctatcact 540cgttcatcag gtggccaatc aatcagtgaa acactgactg aaggctccat atctggcaat 600acctatcccc aatctgtaac aacgacagaa acgattattg ctggtagttt agctctcgcc 660cctaactctt tcattccaga aactttatcc ctcgcttcac tacttagtga attaaacagc 720gacattactt caagtggtca 
gtccgttatc ttcacctatg acgcgacgac taattctatc 780gttggtgttc aagataccga cgaagtatta cgtatcgaca ttgatgccgt cagtgttggc 840aataacattg agctttctct aaccacaacg atttcccagc cgattgatca tgtaccgtcg 900gttggcggtg gtcaggtttc ttacactggc gatcaaatag atattgcctt tgatattcaa 960ggtgaagaca ccgctgggaa cccgctagca acacccgtta acgcacaagt ttcagtgttt 1020gacgggatag atccgtctgt tgaaagtgtc aatatcacta acgttgaaac tagcagcgcg 1080gcaatcgaag ggacgttctc aaatattggt agtgataacc ttcaatcagc cgtatttgat 1140gcaagtgcac tggaccagtt tgatgggttg ctcagtgata atcaaaacac gcttgcgaga 1200ctttctgatg atggaacaac gattactctg tccatccaag gtcgaggtga ggttgttctc 1260actatctctc tagataccga tggcacctat aaattcgagc agtctaatcc gatagaacaa 1320gtgggtaccg attcactgac gttcgctttg ccaatcacga ttaccgattt tgaccaagat 1380gttgtaacca atacgatcaa cattgccatt actgatggcg atagccctgt tattactaat 1440gttgacagta ttgatgttga tgaagcgggc attgttggcg gctcacaaga gggcacggcg 1500ccagtgtctg gcactggcgg tatcaccgcg gacatttttg aaagtgacat cattgaccat 1560tatgagctag aacccactga atttaatact aatggcacct tggtttcaaa tggcgaggct 1620gtgctacttg agttgattga tgaaaccaac ggtgtaagaa cttacgaagg ttatgttgag 1680gtcaatggtt cgagaattac ggtctttgac gttaaaattg atagcccttc attgggcaac 1740tatgagttta atctttatga agaactttct catcaaggcg ctgaagatgc gctgttaact 1800tttgcattgc caatttatgc tgttgatgca gatggcgacc gttctgcact gtctggaggt 1860tcgaacacac cagaagctgc tgagatcctc gttaatgtta aagacgatgt cgttgaatta 1920gttgataagg ttgaatcagt caccgagccg accttagcgg gcgatactat tgtttcgtat 1980aacctgttca attttgaagg cgcagatggt tctacaattc aatcgtttaa ctacgacggt 2040gttgattact cactcgatca aagcctgctc cccgatgcta cccagatttt cagttttact 2100gaaggtgtcg tcactatctc attaaacggt gacttcagtt ttgaagtcgc tcgtgatatc 2160gaccactcaa gcagtgaaac tatcgtcaaa cagttctcat ttttagccga agatggtgat 2220ggggatactg atagttcgac gcttgagtta agtattaccg atggccaaga tccgatcatt 2280gatttgatcc cgcctgtgac tctctctgaa accaacctta atgacggctc tgctcccagc 2340ggaagtacag ttagcgcaac cgagacgatt acctttaccg caggcagcga cgatgtagca 2400agtttccgta ttgaaccaac agagtttaat gtgggcggtg cacttaaatc gaatggattt 
2460tcggttgaga taaaagaaga ttcggctaat ccgggtactt acattggctt tattaccaac 2520ggttcgggcg ctgaaatccc agtgtttacg attgctttct ctacgagcac attgggtgaa 2580tacaccttta ctctgcttga agcgttagac catgtagatg gtttagataa gaacgatctg 2640agctttgatc tgcctattta tgcggttgat acggacggcg acgattcatt ggtgtctcag 2700cttaatgtga ctatcggtga tgatgttcaa atcatgcaag acggtacgtt agatatcacc 2760gagccaaatc ttgctgacgg tacaatcaca accaacacca ttgatgtaat gccaaatcaa 2820agtgctgatg gcgcgacgat cactcggttc acttatgacg gtgtcgtaaa cacactggat 2880caaagtattt caggagaaca gcagttcagc ttcacagaag gcgaactgtt tatcaccctt 2940gaaggtgaag tgcgctttga gcctaatcgc gatctagacc actcagtgag tgaagatatc 3000gtgaagtcga ttgtggtgac ttcaagcgac ttcgataacg atccggtgac ttcaaccatt 3060acgctgacga tcactgatgg tgataacccg acgattgatg ttattccaag tgttacgctt 3120tctgaaatta acctgagcga tggctctgct ccaagtggca gcgcggtaag ctcgactcaa 3180actattactt ttaccaatca aagtgatgat gtggttcgtt tccgtattga gtcaacggag 3240ttcaatacta acgatgatct taaatcgaac ggtttagctg ttgagttacg tgaagacccg 3300gcagggtcgg gtgactacat tggttttacg accagtgcga cgaacgtaga aactccagta 3360ttcacattaa gctttaattc tggatcatta ggtgaataca cgttcacact catcgaagcg 3420ttggaccacc aagatgcccg tggcaacaac gacctcagtt ttgatttacc tgtttacgcg 3480gtagatagtg atggcgatga ttcattggtg tctccgttaa acgtcactat cggtgatgat 3540gttcaaatca tgcaagatag tacgttagat atcgtcgagc caaccgtcgc agatttggcc 3600gctggcacag tgacaactaa caccattgat gtgatgccaa atcaaagtgc cgatggcgca 3660acggtgacgc aattcactta tgatggccag cttcgaacac ttgaccaaaa tgacaatggt 3720gagcagcaat ttagcttcac agaaggtgaa ctgttcatca cgcttcaagg tgatgtgcgc 3780tttgagccta atcgtaatct agaccacaca ctcagcgaag acatcgtgaa atcaatcgtg 3840gtgacatcta gcgattccga taacgatgtg ttgacctcaa ccgtcactct gaccattacc 3900gatggtgata tcccaaccat tgataatgtt ccaactgtga acttgtctga aactaatctg 3960agtgatggct ctgcacctag cggaagcgcg gtgagttcaa ctcaaactat tacttacacc 4020actcaaagtg atgatgtgac aagcttccgt attgaaccga ctgaatttaa tgttggtggc 4080gctctcacat caaacggatt ggcagtcgag ttaaaagctg atccaaccac accgggtggc 4140tacatcggtt ttgtgactga tggttcgaac 
gttgaaacta acgtgttcac gattagcttc 4200tcagatacca atttaggcca gtacaccttc accttacttg aagcgttaga ccatgtggat 4260ggtttagcga acaatgatct gacctttgat ctgcctgttt atgcagttga tagcgatggc 4320gacgattcac tggtgtctca gttaaatgta accatcggtg atgatgttca aatcatgcaa 4380ggtggtacgt tagatatcac tgagccaaat cttgcagacg gcacaattac aaccaatacc 4440atcgatgtga tgccagagca aagcgccgat ggtgcgacga tcactcagtt cacttatgac 4500ggtcaagttc gaacactgga tcaaacggac aatggtgagc agcaatttag cttcactgaa 4560ggcgagttgt tcatcactct tcaaggtgac gtgcgtttcg aacccaatcg caacctagat 4620cacacagcta gcgaagatat cgtgaagtcg atagtggtga cttcaagcga tttagataac 4680gatgtggtga cgtcaacggt cactctgacg attactgatg gtgatatccc aaccattgat 4740gcagtgccaa gcgttactct gtctgaaatc aatcttagtg acggctctgc gccaagtggc 4800actgcagtta gtcaaactga gacgattacc ttcaccaatc aaagtgatga tgtgaccagt 4860ttccgtattg agccaataga gttcaatgtg ggcggtgcac tgaaatcgaa tggatttgcg 4920gttgagataa aagaagattc ggctaatccg ggtacttaca ttggctttat taccaacggt 4980tcgggcgctg aaatcccagt gtttacgatt gctttctcta cgagctcatt gggtgaatac 5040acctttactc tgcttgaagc gttagaccat gtagatggtt tagataagaa cgatctgagc 5100ttcgatctgc ctgtttatgc ggtcgatacg gacggcgatg attcattggt gtctcagcta 5160aacgtgacca tcggtgatga tgtccaaatc atgcaagacg gtacgttaga tatcatcgag 5220ccaaatctgg ctgatggaac aatcacaacc agcactattg atgtgatgcc aaaccaaagt 5280gctgatggtg cgacgatcac tcagtttact tatgacggtc agctaagaac gcttgatcaa 5340aatgacactg gcgaacagca gttcagcttc acagaaggcg agttgtttat cacccttgaa 5400ggtgaagtgc gctttgagcc aaaccgagac ctagaccaca ccgcgagtga agatattgtt 5460aagtcgattg tggtcacttc aagtgatttc gataacgact ctctgacttc taccgtaacg 5520ctgaccatta ctgatggtga taaccctacg atcgacgtca ttccaagcgt taccctttct 5580gaaactaatc tgagtgatgg ctctgctcca agtggcagcg cggtaagctc gactcaaact 5640attactttta ccaatcaaag tgatgatgtg gttcgtttcc gtattgagcc aacggagttc 5700aatactaacg atgatcttaa atcgaacggt ttagccgttg agttacgtga agacccggct 5760gggtcgggtg actacattgg ttttactact agtgcgacga atgtcgaaac cacggtattt 5820acgctgagtt tttctagcac cacattaggt gaatatacct tcactttgct tgaagcgttg 
5880gaccaccaag atgcccgtgg caacaacgac ctcagttttg aactgcctgt ttatgcggta 5940gacagtgatg gcgatgattc actgatgtct ccgttaaacg tcaccatcgg cgatgatgtt 6000caaatcatgc aagacggtac gttagatatc gtcgagccaa ccgtcgcaga tttggccgct 6060ggcattgtga caactaacac cattgatgtg atgccaaatc aaagtgccga tggcgcgacg 6120atcactcaat tcacttatga tggccaactt cgaacacttg accaaaatga caatggcgaa 6180caacagttta gcttcacgga aggtgaacta ttcatcactc ttgaaggtga agtgcgcttt 6240gagcctaatc gtaatctaga ccacacgctg aacgaagaca tcgtgaaatc gatcgtggtg 6300acgtctagtg actccgataa cgatgtgttg acctcaaccg tcactctgac cattaccgat 6360ggtgatatcc caaccattga taatgtgcca acagtgagct tgtcagaaac aagtctgagt 6420gacggctctt caccaagtgg cagcgcagtt agctcaactc aaaccatcac ttacaccact 6480caaagtgatg atgtaaccag cttccgtatt gaaccgactg agttcaatgt tggcggtgct 6540ctcaaatcaa atggattggc ggttgagctg aaggccgatc caaccactcc gggcggctac 6600atcggctttg tgactgatgg ttcgaacgtt gaaactaacg tgttcacgat tagcttctcg 6660gataccaatt taggtcaata caccttcacc ttgcttgaag cgttggatca tgcggatagc 6720cttgcaaata acgatctgag ctttgatctg ccagtctacg ccgtcgatag tgatggcgat 6780gattcactgg tgtctcaact caatgtaacc atcggtgatg atgttcaaat catgcaaggt 6840ggtacgttag atatcactga gccaaacctt gcagacggca caaccacaac taacaccatc 6900gatgtgatgc cagaacaaag tgccgatggt gcgacgatca ctcagtttac gtatgacggg 6960caagttcgca ctctggatca aactgacaat ggtgagcagc aatttagctt cactgaaggc 7020gagttgttca tcactcttca aggtgacgtg cgtttcgaac ccaatcgcaa cctagatcac 7080acagctagcg aagacatcgt gaagtcgata gtggtgactt caagcgattc agataacgat 7140gtggtgacgt caacggtcac tctgactatt actgatggtg atctcccaac cattgatgca 7200gtgccaagcg ttactctgtc tgaaactaat cttagtgacg gctctgcgcc aagtggcagc 7260gcagtcagtc aaactgagac catcaccttt accaatcaaa gtgatgatgt ggcgagtttc 7320cgtattgagc caaccgagtt taatgtgggc ggtgcactga aatcgaatgg gtttgcggtt 7380gagataaaag aagactctgc taatccgggt acttacattg gctttattgc caatggttcg 7440agcgctgaaa tcccagtgtt cacgattgct ttctctacga gtacgttggg tgaatacacc 7500tttactctgc ttgaagcgtt agaccatgcg gatggtttag ataagaacga tctgagcttt 7560gagcttccgg tttacgcggt tgatacagac 
ggtgatgatt cattggtatc tcagcttaat 7620gtgaccattg gtgatgatgt tcaaatcatg caagatggta cgttagacgt tatcgagcca 7680aatcttgcag acggcacaat cacaaccaac accattgatg tgatgcccga gcaaagtgct 7740gatggtgcga cgatcactca gtttacttat gacggtcagc taagaacgct tgatcaaaat 7800gacactggtg aacagcagtt cagcttcaca gaaggcgagt tgtttatcac ccttgaaggt 7860gaagtgcgct ttgaacctaa tcgcgatcta gaccattccg ttagcgaaga catcgtgaag 7920tcgatagtag tgacttcaag cgacttcgat aacgatccgg tgacttcagc cattacgctg 7980accattactg atggtgataa tccgactatc gattcggtac cgagcgttgt acttgaagaa 8040gctgatttaa ctgatggctc atcgccaagt ggcagcgcgg ttagtcaaac ggaaaccatc 8100actttcacta atcaaagtga cgatgttgag aaattccgtt tagaaccaag tgaatttaat 8160actaacaacg cgctcaagtc cgatggcttg atcattgaga ttcgagagga accaacagga 8220tccggcaatt atattggttt cacgaccgat atttcgaatg tcgaaaccac tgtgtttaca 8280ctcgatttca gcagtaccac tttgggtgag tacaccttca cgcttctgga agcgattgac 8340cacacgcctg ttcaaggcaa taacgatcta acattcaact tgccagtcta cgcggttgat 8400agcgacggtg atgattcgct aatgtcatca ctatcggtga cgattactga tgatgttcaa 8460gtgatggtga gtggttcgct tagtatcgaa gagcctactg ttgccgactt ggctgcaggc 8520acgccaacaa catcagtatt tgatgtatta acatccgcga gtgctgatgg ggcgaccatt 8580actcagttca cttatgatgg tggggcggta ttaacgcttg atcaaaacga tacaggtgag 8640cagaagttcg tggttgctga tggggcatta tatatcactc tgcaaggcga tattcgtttc 8700gaaccaagtc gtaaccttga ccatactggt ggcgatatcg tcaagtcgat agtcgtaact 8760tcaagtgatt ccgatagcga tcttgtgtct tcaacggtaa cgctaaccat tactgatggc 8820gatatcccaa cgattgacac ggtgccaagc gttactctgt cagaaacgaa tctgagcgac 8880ggatctgctc cgaatgcaag tgcggtaagt tcaactcaaa ccattacctt tactaaccaa 8940agtgatgacg tgacgagttt ccgtattgaa
ccgactgatt ttaatgttgg tggtgctctg 9000aaatcgaacg gattggcggt cgaactgaaa gcggacccaa ctacaccggg tggctacatc 9060ggttttgtga ctgatggttc gaacgttgaa actaacgtgt ttacgattag cttctcggat 9120accaatttag gtcaatacac cttcaccctg cttgaagcgt tggatcatgt agatggctta 9180gtgaagaatg atctgacttt tgatcttcct gtttatgcgg ttgatagcga tggtgatgat 9240tcactggtgt ctcaactgaa tgtgaccatt ggtgatgatg tacaggtcat gcaaaaccaa 9300gcgcttaata ttattgagcc aacggttgct gatttggctg caggtactcc gacgacagcc 9360actgttgatg tgatgcctag ccaaagtgcc gatggcgcga caatcactca gtttacttac 9420gatggcgggg cggcaataac actcgaccaa aacgacaccg gtgaacagaa gtttgtattt 9480actgaaggtt cactgtttat caccttgcaa ggtgaagtgc gtttcgagcc aaatcgcaat 9540ctaaaccaca cagcgagcga agacatcgtg aagtcgattg tggtgacttc aagcgattta 9600gataacgatg tactgacgtc aacggtcact ctgactatta ctgatggtga tatcccaacc 9660attgatgcag tgccaagcgt tactctgtct gaaactaatc ttagtgacgg ctcagcgcca 9720agcagcagtg ctgtaagtca aacagagacg attaccttca tcaatcaaag tgatgatgtg 9780gcgagtttcc gtattgagcc aacagagttc aatgtgggcg gtgcactgaa atcgaatgga 9840tttgcggttg agataaaaga agattcggct aatccgggta cttatatcgg ttttattacc 9900gatggttcga atactgaagt tcctgtattc acgattgctt tctctacaag tacgttgggc 9960gaatacacct tcaccttact tgaagcgcta gaccatgcaa atggcctaga taagaacgat 10020ctgagttttg atcttcctgt ttatgcggta gacagtgatg gcgatgattc actggtgtct 10080caactgaatg tgaccattgg tgatgatgtc caaataatgc aagacggtac gttagatatc 10140actgagccaa atcttgcaga cggaacaatc acaaccaaca ccattgatgt gatgccaaat 10200cagagtgccg atggtgcgac gatcactgaa ttctcatttg gcggtattgt caaaacactc 10260gatcaaagca tcgtaggtga gcagcagttt agtttcaccg aaggtgagct attcatcact 10320cttcaaggtc aagtgcgctt tgaaccaaat cgtgaccttg accactctgc cagcgaagac 10380atcgtgaagt cgatagtggt tacttcaagt gattttgata acgatcctgt gacttcaacc 10440gttacgctga ccattaccga tggtgatatt ccaactatcg atgcggtacc aagtgttacg 10500ctttcagaaa caaacctagc tgatggttct gcgccaagtg gtagtgcggt tagtcaaacg 10560gagacgatta cttttaccaa tcaaagtgat gatgtggttc gcttccgtct ggaaccaacc 10620gagttcaata ctaacgatgc acttaaatcg aatggcttag cggtcgaact 
gcgcgaagaa 10680cctcaaggct ctggtcagta cattggcttt accaccagtt cgtctaatgt tgagacaaca 10740gtatttacgt tggactttaa ctccggaacc ttaggtgaat acacatttac tttaatcgaa 10800gctctggatc atcaagatgc gcgtggcaac aacgatttaa gctttaatct acctgtgtat 10860gcggtggata gtgatggcga tgactcgtta gtctctcagc ttggcgtgac cattggcgac 10920gatgtgcagt tgatgcaaga cggcacaatc accagtcgtg agcctgcagc aagtgttgaa 10980acatcaaata cctttgatgt gatgccaaac caaagtgctg atggagccaa agtcacttca 11040tttgttttcg atggtaagac tgcagaaagt cttgatttga atgtgaatgg tgaacaagag 11100ttcgtcttca cggaaggttc ggtatttatt acgacggaag gtgagatacg attcgagccg 11160gtacgtaatc aaaatcatgc tggtggtgat attaccaagt cgattgaggt gacgtctgtt 11220gacctcgatg gcgatattgt cacatcgaca gtgacactga agattgttga tggtgacctt 11280cctactatcg accttgttcc cggaattacg ttatctgaag tggatctggc cgatggctct 11340gtgccaaccg gtaatccagt gacaatgaca caaaccatta cctacacagc gggtagtgac 11400gacgtaagcc atttcagaat tgaccctacg cagttcaata cttcaggggt tttgaaatcg 11460aacggcctag atgtcgaaat aaaagagcag ccagctaatt ctggtaatta cattggcttc 11520gtcaaagacg gttctaacgt agaaaccaac gtcttcacga tcagcttctc gacgagcaat 11580ttagggcaat acacgttcac actacttgaa gcgttagatc atgtagatgg attgcaaaac 11640aatatactaa gcttcgatgt ccctgtttta gcggttgatg cggatggtga tgattctgca 11700atgtcgccta tgacggttgc gatcaccgat gacgtacaag gtgttcaaga tggcaccttg 11760agtatcactg agccttcatt agctgatttg gcatcgggta cgccaccaac gacggcaatc 11820attgatgtta tgccaacgca gagtgctgat ggcgcgaaag taacacagtt tacttacgat 11880ggtggcacag ctgtaacgtt agacccaagc atcgccacag aacaagtctt taccgtaacc 11940gatggcttac tgtacatcac cattgaaggg gaggttcgtt ttgagccgag ccgagatcta 12000gaccattcat ctggcgatat cgtaagaacg attgtcgtca ccaccagtga ttttgataac 12060gatacagata ccgcggatgt cactttgacg atcaaagacg gtatcaatcc cgttatcaat 12120gtggttccag atgttaactt atcggaagtt aatctagcgg atggctcgac gccaagtggt 12180tctgcagtca gttcgactca cacaatcact tacaccgaag gaagtgatga ttttagtcac 12240tttagaattg cgaccaacga attcaatcct ggcgatctgt tgaaatcaag tggtcttgtt 12300gttcaactaa aagaagatcc tgcttctgct ggtgattaca ttggttatac cgatgatggt 
12360atgggtaacg ttaccgatgt atttaccatt agctttgata gtgcaaacaa agctcagttt 12420acatttacct tgattgaggc gcttgatcac cttgatggtg tgctttacaa cgatcttacg 12480ttccgtttgc ctatctatgc tgttgataca gatgattctg aatcaacaaa gcgcgatgtg 12540gtggttacga tagaagatga catccagcaa atgcaagatg gcttcttaac cattaccgag 12600ccaaattctg gtactccaac aacaactacc gttgatgtga tgccaatacc aagtgcagac 12660ggtgcgacta ttacgcagtt cacgtatgac ggtggttctc caattactct gaatcaaagc 12720atcagcggcg aacaagagtt tgttttcact gaaggttcac tgtttgtgac actagatggt 12780gatgtaaggt ttgagccaaa tagaaacctt gatcactctg cgggcgacat tgttaaatcg 12840attgtgttca cgtcttcaga ctttgataac gacatcttct catcaaaagt cactctcacc 12900attgttgatg gtgatgggcc aacaatcgac gttgtgccgg gtgtggcatt gtcagaaagc 12960ttacttgcgg atggttcgac gcctagcgta aatcccgtga gtatgactca aaccattact 13020tcacttgcaa gtagtgatga tattgctgaa atagtggtgg aagtcgggtt gttcaatacc 13080aacggcgcgt tgaagtcgga tggtttgtca ctgagtttac gtgaagaccc tgtaaattca 13140ggcgactaca ttgcatttac tactaatggt tcgggtgttg agaaagttat cttcactctg 13200gattttgatg atacgaatcc gagtcaatat acgtttactc tgcttgaacg tttagaccat 13260gttgatggct taggaaataa cgatctgagt tttgatcttt ctgtttatgc agaagatacc 13320gatggtgata tttcagcgtc taaaccgctt acagtcacca tcaccgatga tgttcagctc 13380atgcaatccg gtgcgctcaa cattactgag ccaaccacag gaacaccgac tacagcagtc 13440tttgatgtga tgcctgcgca aagtgcagat ggcgcgacaa tcactaagtt tacctatggc 13500agccaacctg aagagtctct ggtacaaacc gtcacgggtg agcaagaatt tgtgttcact 13560gaaggttctc tgtttatcaa tcttgaaggt gatgtacgtt tcgaacctaa ccgtaatctc 13620gatcattcgg gtggtaacat cgttaagacc attacggtga catcggaaga taaagatggc 13680gatattgtca cttcaacagt gacgctgact attgtagatg gcgcgccacc agtaatagac 13740acagtaccaa cggttgcatt ggaagaagcg aatctggtcg acggatcttc accgggttta 13800cctgttagcc aaactgaaat cattactttc acagcaggaa gtgatgatgt gagccacttc 13860cgtattgatc cggctcaatt caacacatca ggcgatctga aagcggatgg tttggtggtt 13920cagttaaaag aagatcctct aaacagcgat aattatattg gttacgttga aagcggcggt 13980gtccaaacgg atatcttcac catcaccttt agcagcgtgg ttctaggaga gtacacattc 
14040accttgttgg aagagttaga tcacctgcct gtacaaggta acaatgatca aatcttcacc 14100ttgccagtga tcgcagtcga caaagacaac actgactcag cggtgaaacc tcttacggtg 14160accattaccg atgatgttcc aaccattact gacaccaccg gcgccagtac gtttgtggtt 14220gatgaagatg atttgggcac tctggcacaa gcgacgggtt cgtttgtaac cacagaaggt 14280gcagatcaag tcgaggttta cgaactacgt aatatatcaa cgttggaagc aacgctatcg 14340tcgggcagtg aaggtattaa gatcactgag atcacaggtg ctgctaacac gaccacctac 14400caaggggcga ccgacccaag tggaacgcca attttcacat tagtgctgac tgatgatggt 14460gcctacacct ttaccttgct tggccctctc aatcacgcta cgacaccgag taacctcgat 14520acattaacaa taccatttga tgttgttgcc gttgacggtg atggcgatga ttctaaccaa 14580tatgtattgc caatcgaggt gctagatgat gtgcctgtaa tgacggcgcc gacgggtgaa 14640acggttgttg atgaagacga tcttactggc attggttccg atcaatctga agatacaatt 14700atcaatggac tgttcaccgt tgatgaaggt gcggatggcg ttgtgctgta tgagctggtt 14760gatgaagatt tggttctgac gggcttaacc tctgatggag aaagcttaga gtggctagct 14820gtttcacaaa acggcacaac atttacttac gttgctcaaa ctgcaacgag taatgaagcg 14880gtgttcgaga ttattttcga cacctcggat aacagctacc aatttgaatt atttaagcca 14940ctgaagcacc ctgacggtgc aaacgagaac gcgatagatc ttgatttctc aatcgttgct 15000gaagattttg atcaagacca atcggatgcg atcggtctaa aaattacggt aaccgatgat 15060gttccgttag tgacaactca atcgattact cgtcttgaag gtcaggggta tggcaactct 15120aaagtcgaca tgtttgccaa tgcaacagat gtgggggctg atggcgcggt actgagtcga 15180attgagggta tctcaaataa tggtgcagat attgttttcc gtagcgggaa caatgggcca 15240tatagtagcg gcttcgattt aaacagcggt agccaacaag ttcgagtcta cgagcaaaca 15300aatggcggtg ctgatactcg tgaacttggc cgtctacgca tcaactcaaa tggtgaggtt 15360gaattcagag ctaacggcta tctcgatcat gacggtgatg acaccatcga cttctcgatt 15420aacgtgattg ccacagatgg agatttagac acctctgaaa caccgttaga tattacgatt 15480actgataggg attctacaag aattgcgctg aaagtgacga ccttcgagga tgcgggtaga 15540gactcaacca taccttacgc aacaggtgat gagccgactc ttgagaatgt tcaagataac 15600caaaatggtt tgccgaatgc gccagcgcaa gttgcgctgc aagttagtct gtatgaccaa 15660gataacgctg aatctattgg gcagttgacg attaaaagcc cgaacggagg tgatagtcat 
15720caaggtactt tttattactt tgatggtgct gactacatag aattagtgcc tgagtcaaat 15780gggagcatta tatttggctc tcctgaactc gaacaaagct tcgctccaaa cccgagtgaa 15840ccaagacaaa ctatcgcgac gatagacaac ctgttctttg ttccagacca acacgctagt 15900tcggatgaaa ctggtgggcg agttcgttat gagcttgaaa ttgagaaaaa tggcagtacg 15960gatcacaccg ttaattcaaa cttcagaatt gagattgaag ctgtagctga tattgcgact 16020tgggatgatt ccaacagcac gtatcagtat caagtcaacg aagatgaaga caatgtcacg 16080ttgcagctga acgcagagtc tcaagataac agtaatactg agacgattac ctatgaactt 16140gaagccgttc aaggcgacgg gaagtttgag ttacttgatc aaaatggcaa tgtgttaacg 16200cccgttaatg gtgtttatat catcgcatct gctgatatca atagcaccgt agttaaccct 16260attgataact tctcagggca gattgagttc aaagcgacgg caattacgga agagacgctt 16320aacccatacg atgattcaga caacggtgga gcaaacgata agacgacggc tcgttctgtg 16380gaacaaagta ttgttattga tgtgaccgca gatgcggacc ctggcacatt cagtgttagt 16440cgaattcaga tcaacgaaga caatatcgat gatccagatt acgtcgggcc tttggacaat 16500aaagacgcgt tcacgttaga cgaagtcatc accatgacag ggtcggtcga ttctgacagt 16560tctgaagaac tgtttgtgcg catcagtaat gttacggaag gagctgtgct ttacttctta 16620ggcaccacga cagtcgttcc gaccatcacg atcaatggtg tggattatca agaaatcgcg 16680tattccgatt tggctaacgt tgaggttgtt ccaaccaaac acagtaatgt cgatttcacc 16740ttcgatgtta cgggagtggt caaagatacg gcaaatctat ccacgggcgc ccaaatcgat 16800gaggagatac taggaactaa aaccgtcaac gttgaagtca aaggcgttgc cgatactcct 16860tatggtggaa cgaatggcac ggcttggagt gcaattacag atggcactac atctggtgtt 16920caaaccacga ttcaagagag ccaaaatggt gatacctttg ctgagcttga tttcaccgtg 16980ttgtcgggag agagaagacc agatactggc actacaccat tagctgacga tgggtcagaa 17040tcaataaccg ttattctatc gggtataccc gatggggttg ttctagaaga cggtgacggt 17100acagtgattg accttaactt tgtcggttat gaaaccggac cgggcggtag tcctgactta 17160tccaaaccta tctacgaagc gaacattact gaggcgggta aaacttcagg cattcgcatc 17220agacctgtcg actcttcaac cgagaatatt cacattcaag gtaaagtgat tgtgactgag 17280aacgatggtc acacgcttac gtttgatcaa gaaattcgag tgcttgttat acctcgaatc 17340gacacatcag caacttatgt caatacgact aacggtgatg aagatacggc tatcaatatt 
17400gattggcacc ctgaaggcac ggattacatt gatgacgatg agcatttcac taagataact 17460attaatggaa taccactggg tgttactgca gtagtcaacg gtgatgtgac cgttgatgac 17520tcaaccccag gaacattgat tataacgcct aaagatgctt cccaaactcc tgaacaattt 17580actcaaattg cattagctaa taacttcatt caaatgacgc ctccggctga ttctagtgca 17640gattttacgt tgaccaccga acttaaaatg gaagagcgag atcatgagta tacgtctagc 17700ggcctagagg atgaagatgg tggttatgtc gaagccgatc cagatataac cggaatcatt 17760aacgttcaag tacgacctgt ggttgaacct ggagatgccg acaacaagat tgtcgtttca 17820aacgaagatg gctctggaga tctcactacg attacggctg atgctaatgg tgtcattaaa 17880tttacaacta acagtgataa ccaaacgact gatactaacg gagacgaaat ctgggacggt 17940gaatacgtcg tccgatacca agaaacggat ttaagcacag tagaagagca agtcgacgaa 18000gtgattgttc agctgactaa caccgatgga agcgcgttat ctgatgatat tttagggcaa 18060cttttagtaa ctggtgcctc ttacgaaggc ggtggccgat gggttgtgac caatgaagat 18120gcctttagcg tcagtgcgcc caatggatta gatttcaccc ctgccaatga tgcggatgat 18180gtagctactg atttcaatga tatcaagatg acaattttca ctttggtctc agatcctggt 18240gatgctaaca atgaaacgtc cgcccaagtg caacgcaccg gagaagtaac gctttcttat 18300cctgaagtgc tgacggcacc tgacaaagtt gccgcagata ttgcgattgt gccagacagt 18360gttatcgacg ctgttgagga tactcagctt gatctcggcg cggcactcaa cggcattttg 18420agcttgacgg gtcgcgatga ttctactgac caagtgacgg tgatcatcga tggcactctg 18480gtcattgatg ctacaacatc attcccaatt agcctgtcgg gaacaagtga tgttgacttt 18540gtgaatggga aatatgttta cgagacgact gttgagcagg gcgtagccgt cgattcatcg 18600ggtttgttat tgaatctgcc accaaactac tctggtgact ttaggttgcc aatgaccatc 18660gtgaccaaag atttacaatc tggtgatgag aagaccttag tgactgaagt tatcatcaaa 18720gtcgcaccag atgctgagac ggatccaacg attgaggtga atgtcgtggg ttcgcttgat 18780gatgccttta atcctgttga taccgacggt caagctgggc aagatccggt gggttacgaa 18840gacacctata ttcaactcga cttcaattcg accatttcgg atcaggtttc cggcgtcgaa 18900ggcggccaag aagcgtttac gtccattact ttaacgttgg acgacccttc tataggtgca 18960ttctatgaca acacgggtac ttcattaggt acatctgtta cgtttaatca ggctgaaata 19020gcagcgggtg cactcgataa cgtgctcttt agggcaatcg aaaattaccc aacgggtaat 
19080gatattaacc aagtgcaggt taatgtcagc ggtacagtca cagataccgc aacctataat 19140gatcctgctt ctcctgcggg tacggcaaca gactcagata ctttctctac gagtgtcagc 19200tttgaagtcg ttcctgtggt cgatgacgtg tctgtcactg gaccgggtag cgatcctgat 19260gttatcgaga ttactggcaa cgaagaccag ctcatttctt tgtcggggac agggcctgta 19320tcgattgcac tgactgacct tgatggttca gaacagtttg tatcgattaa gttcacagat 19380gtccctgatg gcttccaaat gcgtgcagat gctggctcga catataccgt gaaaaataat 19440ggtaatggag agtggagtgt tcaactgcct caagcttcgg ggttgtcatt cgatttaagt 19500gagatttcga tcttgccgcc taaaaacttc agtggtaccg ctgagtttgg tgtggaagtc 19560ttcactcaag aatcgttgct gggtgtgcct actgcggcgg caaacttgcc aagcttcaaa 19620ctgcatgtgg tacctgttgg tgacgatgtt gataccaatc cgactgattc tgtaacaggc 19680aacgaaggcc aaaacattga tatcgaaatc aatgcgacta ttttggataa agaattgtct 19740gcaacaggaa gcgggacgta taccgagaat gcgcccgaaa cgcttcgagt tgaagtggcg 19800ggtgttcctc aagatgcttc tattttctat ccagatggca cgacattggc tagctacgat 19860ccggcgacgc agctctggac tctcgatgtt ccagctcagt cgttagataa gatcgtattt 19920aactctggcg aacataatag tgatacaggc aatgtactgg gtatcaatgg tccactgcag 19980attacggtac gttcagtaga tactgatgct gataatacag agtacctagg tacgccaacc 20040agcttcgatg tcgatctggt gattgatcct attaacgatc aaccgatctt tgtgaacgta 20100acgaatattg aaacatcgga agacatcagt gttgccatcg acaactttag tatctacgac 20160gtcgacgcaa actttgataa tccagatgct ccgtatgaac tgacgcttaa agtcgaccaa 20220acactgccgg gagcgcaagg tgtgtttgag tttaccagct ctcctgacgt gacgtttgta 20280ttgcaacctg acggctcatt ggtgattacc ggtaaagaag ccgacattaa taccgcattg 20340actaatggag ctgtgacttt caaacccgac ccagaccaga actacctcaa ccagactggt 20400ttagtcacaa tcaatgcaac gctcgatgat ggtggtaata acggtttgat tgacgcggtt 20460gatccgaata ccgctcaaac caatcaaact accttcacca ttaaggtgac ggaagtgaat 20520gacgctcctg tggcgactaa cgttgattta ggctcgattg cggaagacgc tcaaatcgtg 20580attgttgaga gtgacttgat tgcagccagt tctgatctag aaaaccataa tctcacagta 20640accggtgtga ctcttactca agggcaaggt cagcttacac gctatgaaaa tgctggtggt 20700gctgatgacg cagcgattac ggggccattc tggatattca ttgcagataa tgatttcaac 
20760ggcgacgtta aattcaatta ctccattatc gatgatggta ccaccaacgg tgtggatgat 20820tttaaaaccg atagcgctga aatcagcctt gtagttactg aagtcaatga ccagccagtg 20880gcatcgaaca ttgatttggg caccatgctt gaagaaggac agctggtcat taaagaggaa 20940gacctgattt ccgcaaccac tgatccggaa aacgacacga ttactgtgaa cagtttggtg 21000ctcgatcaag gtcagggcca attacaacgc tttgagaacg tgggcggtgc tgatgatgct 21060acgatcactg gcccgtactg ggtatttact gcagccaacg aatacaacgg tgatgttaag 21120ttcacttata ccgttgagga cgatggtaca accaacggcg ctgatgattt cttaacagat 21180accggcgaaa ttagcgttgt ggtaacggaa gtgaatgatc aaccggtggc aacggatatc 21240gacttaggaa acatccttga agaagggcag ttgatcatca aagaggaaga cttaattgct 21300gctacgagcg atccggaaaa cgacacgatt accgtgacca atctggtgct cgacgaaggc 21360caaggccagt tacagcgctt tgagaacgtg ggcggtgctg atgacgctat gattactggc 21420ccgtactgga tatttacggc tgctgatgaa tacaacggta acgttaagtt cacctatacc 21480gtcgaggatg atggtacaac caacggcgct aatgatttcc taacggatac tgcagagatc 21540acagcgattg tcgacggagt gaacgatacg cctgttgtta atggtgacag tgtcactacg 21600attgttgacg aggatgctgg tcagctattg agtggtatca atgtcagtga cccagattat 21660gtggatgcat tttctaatga cttgatgaca gtcacgctga cagtggatta cggtacattg 21720aacgtatcac ttccggcagt gacgacagtg atggtcaacg gcaacaacac tggttcggtt 21780atcttagttg gtactttgag tgacctgaat gcgctgattg atacgccaac cagtccaaac 21840ggtgtctacc tcgatgcgag cttgtctcca accaatagca ttggcttaga agtaatcgcc 21900aaagacagcg gtaacccttc tggtatcgcg attgaaactg caccagtggt ttataatatc 21960gcagtgacac cagtcgctaa tgcgccaacc ttgtctattg atccggcatt taactatgtg 22020agaaacatta cgaccagctc atctgtggtc gctaatagtg gagtcgcttt agttggaatt 22080gtcgctgcat tgacggacat tactgaagag ttaacgttga agatcagcga tgttccggat 22140ggtgttgatg taaccagtga tgtgggtacg gtttcgttgg tgggtgatac ttggatagcg 22200accgctgatg cgatcgatag tctcagactc gtagagcagt catcattagg taaaccgttg 22260accccgggta attacacctt gaaagttgag gcgctatctg aagagactga caacaacgat 22320attgcgatat ctcaaaacat cgatctgaat ctcaatattg ttgccaatcc aatagatctc 22380gatctgtctt ctgaaacaga cgatgtgcaa cttttagcga gtaactttga tactaacctc 
22440actggcggaa ctggaaatga ccgacttgta ggtggagcgg gtgacgatac gctggttggc 22500ggtgacggta acgacacact cattggtggc ggcggttccg atattctaac cggtggcaat 22560ggtatggatt cgtttgtatg gctcaatatt gaagatggcg ttgaagacac cattaccgat 22620ttcagcctgt ctgaaggaga ccaaatcgac ctacgagaag tattacctga gttgaagaat 22680acatctccag acatgtctgc attgctacaa cagatagacg cgaaagtgga aggggatgat 22740attgagctta cgatcaagtc tgatggttta ggcactacgg aacaggtgat tgtggttgaa 22800gaccttgctc ctcagctaac cttaagtggc accatgcctt cggatatttt ggatgcgtta 22860gtgcaacaaa atgtcatcac tcacggttaa 22890

4 7629 PRT Vibrio splendidus 4

Met Leu Cys Asp Asn Gly Gly Cys Met Asp Ile Glu Val Ser Arg Gln1 5 10 15 Val Ala Val Val Glu Ala Thr Ser Gly Asp Val Val Val Val Lys Pro 20 25 30 Asp Gly Ser Ala Arg Lys Val Ser Val Gly Asp Thr Ile Arg Glu Asn 35 40 45 Glu Ile Val Ile Thr Ala Asn Lys Ser Glu Leu Val Leu Gly Val Gln 50 55 60 Asn Asp Ser Ile Pro Val Ala Glu Asn Cys Val Gly Cys Val Asp Glu65 70 75 80 Asn Ala Ala Trp Val Asp Ala Pro Ile Ala Gly Glu Val Asn Phe Asp 85 90 95 Leu Gln Gln Ala Asp Ala Glu Thr Phe Thr Glu Asp Asp Leu Ala Ala 100 105 110 Ile Gln Glu Ala Ile Leu Gly Gly Ala Asp Pro Thr Gln Ile Leu Glu 115 120 125 Ala Thr Ala Ala Gly Gly Gly Leu Gly Ser Ala Asn Ala Gly Phe Val 130 135 140 Thr Ile Asp Tyr Asn Tyr Thr Glu Thr His Pro Ser Thr Phe Phe Glu145 150 155 160 Thr Ala Gly Leu Ala Glu Gln Thr Val Asp Glu Asp

Arg Glu Glu Phe 165 170 175 Arg Ser Ile Thr Arg Ser Ser Gly Gly Gln Ser Ile Ser Glu Thr Leu 180 185 190 Thr Glu Gly Ser Ile Ser Gly Asn Thr Tyr Pro Gln Ser Val Thr Thr 195 200 205 Thr Glu Thr Ile Ile Ala Gly Ser Leu Ala Leu Ala Pro Asn Ser Phe 210 215 220 Ile Pro Glu Thr Leu Ser Leu Ala Ser Leu Leu Ser Glu Leu Asn Ser225 230 235 240 Asp Ile Thr Ser Ser Gly Gln Ser Val Ile Phe Thr Tyr Asp Ala Thr 245 250 255 Thr Asn Ser Ile Val Gly Val Gln Asp Thr Asp Glu Val Leu Arg Ile 260 265 270 Asp Ile Asp Ala Val Ser Val Gly Asn Asn Ile Glu Leu Ser Leu Thr 275 280 285 Thr Thr Ile Ser Gln Pro Ile Asp His Val Pro Ser Val Gly Gly Gly 290 295 300 Gln Val Ser Tyr Thr Gly Asp Gln Ile Asp Ile Ala Phe Asp Ile Gln305 310 315 320 Gly Glu Asp Thr Ala Gly Asn Pro Leu Ala Thr Pro Val Asn Ala Gln 325 330 335 Val Ser Val Phe Asp Gly Ile Asp Pro Ser Val Glu Ser Val Asn Ile 340 345 350 Thr Asn Val Glu Thr Ser Ser Ala Ala Ile Glu Gly Thr Phe Ser Asn 355 360 365 Ile Gly Ser Asp Asn Leu Gln Ser Ala Val Phe Asp Ala Ser Ala Leu 370 375 380 Asp Gln Phe Asp Gly Leu Leu Ser Asp Asn Gln Asn Thr Leu Ala Arg385 390 395 400 Leu Ser Asp Asp Gly Thr Thr Ile Thr Leu Ser Ile Gln Gly Arg Gly 405 410 415 Glu Val Val Leu Thr Ile Ser Leu Asp Thr Asp Gly Thr Tyr Lys Phe 420 425 430 Glu Gln Ser Asn Pro Ile Glu Gln Val Gly Thr Asp Ser Leu Thr Phe 435 440 445 Ala Leu Pro Ile Thr Ile Thr Asp Phe Asp Gln Asp Val Val Thr Asn 450 455 460 Thr Ile Asn Ile Ala Ile Thr Asp Gly Asp Ser Pro Val Ile Thr Asn465 470 475 480 Val Asp Ser Ile Asp Val Asp Glu Ala Gly Ile Val Gly Gly Ser Gln 485 490 495 Glu Gly Thr Ala Pro Val Ser Gly Thr Gly Gly Ile Thr Ala Asp Ile 500 505 510 Phe Glu Ser Asp Ile Ile Asp His Tyr Glu Leu Glu Pro Thr Glu Phe 515 520 525 Asn Thr Asn Gly Thr Leu Val Ser Asn Gly Glu Ala Val Leu Leu Glu 530 535 540 Leu Ile Asp Glu Thr Asn Gly Val Arg Thr Tyr Glu Gly Tyr Val Glu545 550 555 560 Val Asn Gly Ser Arg Ile Thr Val Phe Asp Val Lys Ile Asp Ser Pro 565 570 575 Ser Leu Gly Asn Tyr Glu Phe Asn Leu Tyr Glu Glu Leu Ser 
His Gln 580 585 590 Gly Ala Glu Asp Ala Leu Leu Thr Phe Ala Leu Pro Ile Tyr Ala Val 595 600 605 Asp Ala Asp Gly Asp Arg Ser Ala Leu Ser Gly Gly Ser Asn Thr Pro 610 615 620 Glu Ala Ala Glu Ile Leu Val Asn Val Lys Asp Asp Val Val Glu Leu625 630 635 640 Val Asp Lys Val Glu Ser Val Thr Glu Pro Thr Leu Ala Gly Asp Thr 645 650 655 Ile Val Ser Tyr Asn Leu Phe Asn Phe Glu Gly Ala Asp Gly Ser Thr 660 665 670 Ile Gln Ser Phe Asn Tyr Asp Gly Val Asp Tyr Ser Leu Asp Gln Ser 675 680 685 Leu Leu Pro Asp Ala Thr Gln Ile Phe Ser Phe Thr Glu Gly Val Val 690 695 700 Thr Ile Ser Leu Asn Gly Asp Phe Ser Phe Glu Val Ala Arg Asp Ile705 710 715 720 Asp His Ser Ser Ser Glu Thr Ile Val Lys Gln Phe Ser Phe Leu Ala 725 730 735 Glu Asp Gly Asp Gly Asp Thr Asp Ser Ser Thr Leu Glu Leu Ser Ile 740 745 750 Thr Asp Gly Gln Asp Pro Ile Ile Asp Leu Ile Pro Pro Val Thr Leu 755 760 765 Ser Glu Thr Asn Leu Asn Asp Gly Ser Ala Pro Ser Gly Ser Thr Val 770 775 780 Ser Ala Thr Glu Thr Ile Thr Phe Thr Ala Gly Ser Asp Asp Val Ala785 790 795 800 Ser Phe Arg Ile Glu Pro Thr Glu Phe Asn Val Gly Gly Ala Leu Lys 805 810 815 Ser Asn Gly Phe Ser Val Glu Ile Lys Glu Asp Ser Ala Asn Pro Gly 820 825 830 Thr Tyr Ile Gly Phe Ile Thr Asn Gly Ser Gly Ala Glu Ile Pro Val 835 840 845 Phe Thr Ile Ala Phe Ser Thr Ser Thr Leu Gly Glu Tyr Thr Phe Thr 850 855 860 Leu Leu Glu Ala Leu Asp His Val Asp Gly Leu Asp Lys Asn Asp Leu865 870 875 880 Ser Phe Asp Leu Pro Ile Tyr Ala Val Asp Thr Asp Gly Asp Asp Ser 885 890 895 Leu Val Ser Gln Leu Asn Val Thr Ile Gly Asp Asp Val Gln Ile Met 900 905 910 Gln Asp Gly Thr Leu Asp Ile Thr Glu Pro Asn Leu Ala Asp Gly Thr 915 920 925 Ile Thr Thr Asn Thr Ile Asp Val Met Pro Asn Gln Ser Ala Asp Gly 930 935 940 Ala Thr Ile Thr Arg Phe Thr Tyr Asp Gly Val Val Asn Thr Leu Asp945 950 955 960 Gln Ser Ile Ser Gly Glu Gln Gln Phe Ser Phe Thr Glu Gly Glu Leu 965 970 975 Phe Ile Thr Leu Glu Gly Glu Val Arg Phe Glu Pro Asn Arg Asp Leu 980 985 990 Asp His Ser Val Ser Glu Asp Ile Val Lys Ser Ile Val Val Thr Ser 
995 1000 1005 Ser Asp Phe Asp Asn Asp Pro Val Thr Ser Thr Ile Thr Leu Thr Ile 1010 1015 1020 Thr Asp Gly Asp Asn Pro Thr Ile Asp Val Ile Pro Ser Val Thr Leu1025 1030 1035 1040 Ser Glu Ile Asn Leu Ser Asp Gly Ser Ala Pro Ser Gly Ser Ala Val 1045 1050 1055 Ser Ser Thr Gln Thr Ile Thr Phe Thr Asn Gln Ser Asp Asp Val Val 1060 1065 1070 Arg Phe Arg Ile Glu Ser Thr Glu Phe Asn Thr Asn Asp Asp Leu Lys 1075 1080 1085 Ser Asn Gly Leu Ala Val Glu Leu Arg Glu Asp Pro Ala Gly Ser Gly 1090 1095 1100 Asp Tyr Ile Gly Phe Thr Thr Ser Ala Thr Asn Val Glu Thr Pro Val1105 1110 1115 1120 Phe Thr Leu Ser Phe Asn Ser Gly Ser Leu Gly Glu Tyr Thr Phe Thr 1125 1130 1135 Leu Ile Glu Ala Leu Asp His Gln Asp Ala Arg Gly Asn Asn Asp Leu 1140 1145 1150 Ser Phe Asp Leu Pro Val Tyr Ala Val Asp Ser Asp Gly Asp Asp Ser 1155 1160 1165 Leu Val Ser Pro Leu Asn Val Thr Ile Gly Asp Asp Val Gln Ile Met 1170 1175 1180 Gln Asp Ser Thr Leu Asp Ile Val Glu Pro Thr Val Ala Asp Leu Ala1185 1190 1195 1200 Ala Gly Thr Val Thr Thr Asn Thr Ile Asp Val Met Pro Asn Gln Ser 1205 1210 1215 Ala Asp Gly Ala Thr Val Thr Gln Phe Thr Tyr Asp Gly Gln Leu Arg 1220 1225 1230 Thr Leu Asp Gln Asn Asp Asn Gly Glu Gln Gln Phe Ser Phe Thr Glu 1235 1240 1245 Gly Glu Leu Phe Ile Thr Leu Gln Gly Asp Val Arg Phe Glu Pro Asn 1250 1255 1260 Arg Asn Leu Asp His Thr Leu Ser Glu Asp Ile Val Lys Ser Ile Val1265 1270 1275 1280 Val Thr Ser Ser Asp Ser Asp Asn Asp Val Leu Thr Ser Thr Val Thr 1285 1290 1295 Leu Thr Ile Thr Asp Gly Asp Ile Pro Thr Ile Asp Asn Val Pro Thr 1300 1305 1310 Val Asn Leu Ser Glu Thr Asn Leu Ser Asp Gly Ser Ala Pro Ser Gly 1315 1320 1325 Ser Ala Val Ser Ser Thr Gln Thr Ile Thr Tyr Thr Thr Gln Ser Asp 1330 1335 1340 Asp Val Thr Ser Phe Arg Ile Glu Pro Thr Glu Phe Asn Val Gly Gly1345 1350 1355 1360 Ala Leu Thr Ser Asn Gly Leu Ala Val Glu Leu Lys Ala Asp Pro Thr 1365 1370 1375 Thr Pro Gly Gly Tyr Ile Gly Phe Val Thr Asp Gly Ser Asn Val Glu 1380 1385 1390 Thr Asn Val Phe Thr Ile Ser Phe Ser Asp Thr Asn Leu Gly Gln Tyr 1395 
1400 1405 Thr Phe Thr Leu Leu Glu Ala Leu Asp His Val Asp Gly Leu Ala Asn 1410 1415 1420 Asn Asp Leu Thr Phe Asp Leu Pro Val Tyr Ala Val Asp Ser Asp Gly1425 1430 1435 1440 Asp Asp Ser Leu Val Ser Gln Leu Asn Val Thr Ile Gly Asp Asp Val 1445 1450 1455 Gln Ile Met Gln Gly Gly Thr Leu Asp Ile Thr Glu Pro Asn Leu Ala 1460 1465 1470 Asp Gly Thr Ile Thr Thr Asn Thr Ile Asp Val Met Pro Glu Gln Ser 1475 1480 1485 Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly Gln Val Arg 1490 1495 1500 Thr Leu Asp Gln Thr Asp Asn Gly Glu Gln Gln Phe Ser Phe Thr Glu1505 1510 1515 1520 Gly Glu Leu Phe Ile Thr Leu Gln Gly Asp Val Arg Phe Glu Pro Asn 1525 1530 1535 Arg Asn Leu Asp His Thr Ala Ser Glu Asp Ile Val Lys Ser Ile Val 1540 1545 1550 Val Thr Ser Ser Asp Leu Asp Asn Asp Val Val Thr Ser Thr Val Thr 1555 1560 1565 Leu Thr Ile Thr Asp Gly Asp Ile Pro Thr Ile Asp Ala Val Pro Ser 1570 1575 1580 Val Thr Leu Ser Glu Ile Asn Leu Ser Asp Gly Ser Ala Pro Ser Gly1585 1590 1595 1600 Thr Ala Val Ser Gln Thr Glu Thr Ile Thr Phe Thr Asn Gln Ser Asp 1605 1610 1615 Asp Val Thr Ser Phe Arg Ile Glu Pro Ile Glu Phe Asn Val Gly Gly 1620 1625 1630 Ala Leu Lys Ser Asn Gly Phe Ala Val Glu Ile Lys Glu Asp Ser Ala 1635 1640 1645 Asn Pro Gly Thr Tyr Ile Gly Phe Ile Thr Asn Gly Ser Gly Ala Glu 1650 1655 1660 Ile Pro Val Phe Thr Ile Ala Phe Ser Thr Ser Ser Leu Gly Glu Tyr1665 1670 1675 1680 Thr Phe Thr Leu Leu Glu Ala Leu Asp His Val Asp Gly Leu Asp Lys 1685 1690 1695 Asn Asp Leu Ser Phe Asp Leu Pro Val Tyr Ala Val Asp Thr Asp Gly 1700 1705 1710 Asp Asp Ser Leu Val Ser Gln Leu Asn Val Thr Ile Gly Asp Asp Val 1715 1720 1725 Gln Ile Met Gln Asp Gly Thr Leu Asp Ile Ile Glu Pro Asn Leu Ala 1730 1735 1740 Asp Gly Thr Ile Thr Thr Ser Thr Ile Asp Val Met Pro Asn Gln Ser1745 1750 1755 1760 Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly Gln Leu Arg 1765 1770 1775 Thr Leu Asp Gln Asn Asp Thr Gly Glu Gln Gln Phe Ser Phe Thr Glu 1780 1785 1790 Gly Glu Leu Phe Ile Thr Leu Glu Gly Glu Val Arg Phe Glu Pro Asn 1795 1800 
1805 Arg Asp Leu Asp His Thr Ala Ser Glu Asp Ile Val Lys Ser Ile Val 1810 1815 1820 Val Thr Ser Ser Asp Phe Asp Asn Asp Ser Leu Thr Ser Thr Val Thr1825 1830 1835 1840 Leu Thr Ile Thr Asp Gly Asp Asn Pro Thr Ile Asp Val Ile Pro Ser 1845 1850 1855 Val Thr Leu Ser Glu Thr Asn Leu Ser Asp Gly Ser Ala Pro Ser Gly 1860 1865 1870 Ser Ala Val Ser Ser Thr Gln Thr Ile Thr Phe Thr Asn Gln Ser Asp 1875 1880 1885 Asp Val Val Arg Phe Arg Ile Glu Pro Thr Glu Phe Asn Thr Asn Asp 1890 1895 1900 Asp Leu Lys Ser Asn Gly Leu Ala Val Glu Leu Arg Glu Asp Pro Ala1905 1910 1915 1920 Gly Ser Gly Asp Tyr Ile Gly Phe Thr Thr Ser Ala Thr Asn Val Glu 1925 1930 1935 Thr Thr Val Phe Thr Leu Ser Phe Ser Ser Thr Thr Leu Gly Glu Tyr 1940 1945 1950 Thr Phe Thr Leu Leu Glu Ala Leu Asp His Gln Asp Ala Arg Gly Asn 1955 1960 1965 Asn Asp Leu Ser Phe Glu Leu Pro Val Tyr Ala Val Asp Ser Asp Gly 1970 1975 1980 Asp Asp Ser Leu Met Ser Pro Leu Asn Val Thr Ile Gly Asp Asp Val1985 1990 1995 2000 Gln Ile Met Gln Asp Gly Thr Leu Asp Ile Val Glu Pro Thr Val Ala 2005 2010 2015 Asp Leu Ala Ala Gly Ile Val Thr Thr Asn Thr Ile Asp Val Met Pro 2020 2025 2030 Asn Gln Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly 2035 2040 2045 Gln Leu Arg Thr Leu Asp Gln Asn Asp Asn Gly Glu Gln Gln Phe Ser 2050 2055 2060 Phe Thr Glu Gly Glu Leu Phe Ile Thr Leu Glu Gly Glu Val Arg Phe2065 2070 2075 2080 Glu Pro Asn Arg Asn Leu Asp His Thr Leu Asn Glu Asp Ile Val Lys 2085 2090 2095 Ser Ile Val Val Thr Ser Ser Asp Ser Asp Asn Asp Val Leu Thr Ser 2100 2105 2110 Thr Val Thr Leu Thr Ile Thr Asp Gly Asp Ile Pro Thr Ile Asp Asn 2115 2120 2125 Val Pro Thr Val Ser Leu Ser Glu Thr Ser Leu Ser Asp Gly Ser Ser 2130 2135 2140 Pro Ser Gly Ser Ala Val Ser Ser Thr Gln Thr Ile Thr Tyr Thr Thr2145 2150 2155 2160 Gln Ser Asp Asp Val Thr Ser Phe Arg Ile Glu Pro Thr Glu Phe Asn 2165 2170 2175 Val Gly Gly Ala Leu Lys Ser Asn Gly Leu Ala Val Glu Leu Lys Ala 2180 2185 2190 Asp Pro Thr Thr Pro Gly Gly Tyr Ile Gly Phe Val Thr Asp Gly Ser 2195 2200 2205 
Asn Val Glu Thr Asn Val Phe Thr Ile Ser Phe Ser Asp Thr Asn Leu 2210 2215 2220 Gly Gln Tyr Thr Phe Thr Leu Leu Glu Ala Leu Asp His Ala Asp Ser2225 2230 2235 2240 Leu Ala Asn Asn Asp Leu Ser Phe Asp Leu Pro Val Tyr Ala Val Asp 2245 2250 2255 Ser Asp Gly Asp Asp Ser Leu Val Ser Gln Leu Asn Val Thr Ile Gly 2260 2265 2270 Asp Asp Val Gln Ile Met Gln Gly Gly Thr Leu Asp Ile Thr Glu Pro 2275 2280 2285 Asn Leu Ala Asp Gly Thr Thr Thr Thr Asn Thr Ile Asp Val Met Pro 2290 2295 2300 Glu Gln Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly2305 2310 2315 2320 Gln Val Arg Thr Leu Asp Gln Thr Asp Asn Gly Glu Gln Gln Phe Ser 2325 2330 2335 Phe Thr Glu Gly Glu Leu Phe Ile Thr Leu Gln Gly Asp Val Arg Phe 2340 2345 2350 Glu Pro Asn Arg Asn Leu Asp His Thr Ala Ser Glu Asp Ile Val Lys 2355 2360 2365 Ser Ile Val Val Thr Ser Ser Asp Ser Asp Asn Asp Val Val Thr Ser 2370 2375 2380 Thr Val Thr Leu Thr Ile Thr Asp Gly Asp Leu Pro Thr Ile Asp Ala2385 2390 2395 2400 Val Pro Ser Val Thr Leu Ser Glu Thr Asn Leu Ser Asp Gly Ser Ala 2405 2410 2415 Pro Ser Gly Ser Ala Val Ser Gln Thr Glu Thr Ile Thr Phe Thr Asn 2420 2425 2430 Gln Ser Asp Asp Val Ala Ser Phe Arg Ile Glu Pro Thr Glu Phe Asn 2435 2440 2445 Val Gly Gly Ala Leu Lys Ser Asn Gly Phe Ala Val Glu Ile Lys Glu 2450 2455 2460 Asp Ser Ala Asn Pro Gly Thr Tyr Ile Gly Phe Ile Ala Asn Gly Ser2465 2470 2475 2480 Ser Ala Glu Ile Pro

Val Phe Thr Ile Ala Phe Ser Thr Ser Thr Leu 2485 2490 2495 Gly Glu Tyr Thr Phe Thr Leu Leu Glu Ala Leu Asp His Ala Asp Gly 2500 2505 2510 Leu Asp Lys Asn Asp Leu Ser Phe Glu Leu Pro Val Tyr Ala Val Asp 2515 2520 2525 Thr Asp Gly Asp Asp Ser Leu Val Ser Gln Leu Asn Val Thr Ile Gly 2530 2535 2540 Asp Asp Val Gln Ile Met Gln Asp Gly Thr Leu Asp Val Ile Glu Pro2545 2550 2555 2560 Asn Leu Ala Asp Gly Thr Ile Thr Thr Asn Thr Ile Asp Val Met Pro 2565 2570 2575 Glu Gln Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly 2580 2585 2590 Gln Leu Arg Thr Leu Asp Gln Asn Asp Thr Gly Glu Gln Gln Phe Ser 2595 2600 2605 Phe Thr Glu Gly Glu Leu Phe Ile Thr Leu Glu Gly Glu Val Arg Phe 2610 2615 2620 Glu Pro Asn Arg Asp Leu Asp His Ser Val Ser Glu Asp Ile Val Lys2625 2630 2635 2640 Ser Ile Val Val Thr Ser Ser Asp Phe Asp Asn Asp Pro Val Thr Ser 2645 2650 2655 Ala Ile Thr Leu Thr Ile Thr Asp Gly Asp Asn Pro Thr Ile Asp Ser 2660 2665 2670 Val Pro Ser Val Val Leu Glu Glu Ala Asp Leu Thr Asp Gly Ser Ser 2675 2680 2685 Pro Ser Gly Ser Ala Val Ser Gln Thr Glu Thr Ile Thr Phe Thr Asn 2690 2695 2700 Gln Ser Asp Asp Val Glu Lys Phe Arg Leu Glu Pro Ser Glu Phe Asn2705 2710 2715 2720 Thr Asn Asn Ala Leu Lys Ser Asp Gly Leu Ile Ile Glu Ile Arg Glu 2725 2730 2735 Glu Pro Thr Gly Ser Gly Asn Tyr Ile Gly Phe Thr Thr Asp Ile Ser 2740 2745 2750 Asn Val Glu Thr Thr Val Phe Thr Leu Asp Phe Ser Ser Thr Thr Leu 2755 2760 2765 Gly Glu Tyr Thr Phe Thr Leu Leu Glu Ala Ile Asp His Thr Pro Val 2770 2775 2780 Gln Gly Asn Asn Asp Leu Thr Phe Asn Leu Pro Val Tyr Ala Val Asp2785 2790 2795 2800 Ser Asp Gly Asp Asp Ser Leu Met Ser Ser Leu Ser Val Thr Ile Thr 2805 2810 2815 Asp Asp Val Gln Val Met Val Ser Gly Ser Leu Ser Ile Glu Glu Pro 2820 2825 2830 Thr Val Ala Asp Leu Ala Ala Gly Thr Pro Thr Thr Ser Val Phe Asp 2835 2840 2845 Val Leu Thr Ser Ala Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr 2850 2855 2860 Tyr Asp Gly Gly Ala Val Leu Thr Leu Asp Gln Asn Asp Thr Gly Glu2865 2870 2875 2880 Gln Lys Phe Val Val Ala 
Asp Gly Ala Leu Tyr Ile Thr Leu Gln Gly 2885 2890 2895 Asp Ile Arg Phe Glu Pro Ser Arg Asn Leu Asp His Thr Gly Gly Asp 2900 2905 2910 Ile Val Lys Ser Ile Val Val Thr Ser Ser Asp Ser Asp Ser Asp Leu 2915 2920 2925 Val Ser Ser Thr Val Thr Leu Thr Ile Thr Asp Gly Asp Ile Pro Thr 2930 2935 2940 Ile Asp Thr Val Pro Ser Val Thr Leu Ser Glu Thr Asn Leu Ser Asp2945 2950 2955 2960 Gly Ser Ala Pro Asn Ala Ser Ala Val Ser Ser Thr Gln Thr Ile Thr 2965 2970 2975 Phe Thr Asn Gln Ser Asp Asp Val Thr Ser Phe Arg Ile Glu Pro Thr 2980 2985 2990 Asp Phe Asn Val Gly Gly Ala Leu Lys Ser Asn Gly Leu Ala Val Glu 2995 3000 3005 Leu Lys Ala Asp Pro Thr Thr Pro Gly Gly Tyr Ile Gly Phe Val Thr 3010 3015 3020 Asp Gly Ser Asn Val Glu Thr Asn Val Phe Thr Ile Ser Phe Ser Asp3025 3030 3035 3040 Thr Asn Leu Gly Gln Tyr Thr Phe Thr Leu Leu Glu Ala Leu Asp His 3045 3050 3055 Val Asp Gly Leu Val Lys Asn Asp Leu Thr Phe Asp Leu Pro Val Tyr 3060 3065 3070 Ala Val Asp Ser Asp Gly Asp Asp Ser Leu Val Ser Gln Leu Asn Val 3075 3080 3085 Thr Ile Gly Asp Asp Val Gln Val Met Gln Asn Gln Ala Leu Asn Ile 3090 3095 3100 Ile Glu Pro Thr Val Ala Asp Leu Ala Ala Gly Thr Pro Thr Thr Ala3105 3110 3115 3120 Thr Val Asp Val Met Pro Ser Gln Ser Ala Asp Gly Ala Thr Ile Thr 3125 3130 3135 Gln Phe Thr Tyr Asp Gly Gly Ala Ala Ile Thr Leu Asp Gln Asn Asp 3140 3145 3150 Thr Gly Glu Gln Lys Phe Val Phe Thr Glu Gly Ser Leu Phe Ile Thr 3155 3160 3165 Leu Gln Gly Glu Val Arg Phe Glu Pro Asn Arg Asn Leu Asn His Thr 3170 3175 3180 Ala Ser Glu Asp Ile Val Lys Ser Ile Val Val Thr Ser Ser Asp Leu3185 3190 3195 3200 Asp Asn Asp Val Leu Thr Ser Thr Val Thr Leu Thr Ile Thr Asp Gly 3205 3210 3215 Asp Ile Pro Thr Ile Asp Ala Val Pro Ser Val Thr Leu Ser Glu Thr 3220 3225 3230 Asn Leu Ser Asp Gly Ser Ala Pro Ser Ser Ser Ala Val Ser Gln Thr 3235 3240 3245 Glu Thr Ile Thr Phe Ile Asn Gln Ser Asp Asp Val Ala Ser Phe Arg 3250 3255 3260 Ile Glu Pro Thr Glu Phe Asn Val Gly Gly Ala Leu Lys Ser Asn Gly3265 3270 3275 3280 Phe Ala Val Glu Ile Lys Glu 
Asp Ser Ala Asn Pro Gly Thr Tyr Ile 3285 3290 3295 Gly Phe Ile Thr Asp Gly Ser Asn Thr Glu Val Pro Val Phe Thr Ile 3300 3305 3310 Ala Phe Ser Thr Ser Thr Leu Gly Glu Tyr Thr Phe Thr Leu Leu Glu 3315 3320 3325 Ala Leu Asp His Ala Asn Gly Leu Asp Lys Asn Asp Leu Ser Phe Asp 3330 3335 3340 Leu Pro Val Tyr Ala Val Asp Ser Asp Gly Asp Asp Ser Leu Val Ser3345 3350 3355 3360 Gln Leu Asn Val Thr Ile Gly Asp Asp Val Gln Ile Met Gln Asp Gly 3365 3370 3375 Thr Leu Asp Ile Thr Glu Pro Asn Leu Ala Asp Gly Thr Ile Thr Thr 3380 3385 3390 Asn Thr Ile Asp Val Met Pro Asn Gln Ser Ala Asp Gly Ala Thr Ile 3395 3400 3405 Thr Glu Phe Ser Phe Gly Gly Ile Val Lys Thr Leu Asp Gln Ser Ile 3410 3415 3420 Val Gly Glu Gln Gln Phe Ser Phe Thr Glu Gly Glu Leu Phe Ile Thr3425 3430 3435 3440 Leu Gln Gly Gln Val Arg Phe Glu Pro Asn Arg Asp Leu Asp His Ser 3445 3450 3455 Ala Ser Glu Asp Ile Val Lys Ser Ile Val Val Thr Ser Ser Asp Phe 3460 3465 3470 Asp Asn Asp Pro Val Thr Ser Thr Val Thr Leu Thr Ile Thr Asp Gly 3475 3480 3485 Asp Ile Pro Thr Ile Asp Ala Val Pro Ser Val Thr Leu Ser Glu Thr 3490 3495 3500 Asn Leu Ala Asp Gly Ser Ala Pro Ser Gly Ser Ala Val Ser Gln Thr3505 3510 3515 3520 Glu Thr Ile Thr Phe Thr Asn Gln Ser Asp Asp Val Val Arg Phe Arg 3525 3530 3535 Leu Glu Pro Thr Glu Phe Asn Thr Asn Asp Ala Leu Lys Ser Asn Gly 3540 3545 3550 Leu Ala Val Glu Leu Arg Glu Glu Pro Gln Gly Ser Gly Gln Tyr Ile 3555 3560 3565 Gly Phe Thr Thr Ser Ser Ser Asn Val Glu Thr Thr Val Phe Thr Leu 3570 3575 3580 Asp Phe Asn Ser Gly Thr Leu Gly Glu Tyr Thr Phe Thr Leu Ile Glu3585 3590 3595 3600 Ala Leu Asp His Gln Asp Ala Arg Gly Asn Asn Asp Leu Ser Phe Asn 3605 3610 3615 Leu Pro Val Tyr Ala Val Asp Ser Asp Gly Asp Asp Ser Leu Val Ser 3620 3625 3630 Gln Leu Gly Val Thr Ile Gly Asp Asp Val Gln Leu Met Gln Asp Gly 3635 3640 3645 Thr Ile Thr Ser Arg Glu Pro Ala Ala Ser Val Glu Thr Ser Asn Thr 3650 3655 3660 Phe Asp Val Met Pro Asn Gln Ser Ala Asp Gly Ala Lys Val Thr Ser3665 3670 3675 3680 Phe Val Phe Asp Gly Lys Thr Ala 
Glu Ser Leu Asp Leu Asn Val Asn 3685 3690 3695 Gly Glu Gln Glu Phe Val Phe Thr Glu Gly Ser Val Phe Ile Thr Thr 3700 3705 3710 Glu Gly Glu Ile Arg Phe Glu Pro Val Arg Asn Gln Asn His Ala Gly 3715 3720 3725 Gly Asp Ile Thr Lys Ser Ile Glu Val Thr Ser Val Asp Leu Asp Gly 3730 3735 3740 Asp Ile Val Thr Ser Thr Val Thr Leu Lys Ile Val Asp Gly Asp Leu3745 3750 3755 3760 Pro Thr Ile Asp Leu Val Pro Gly Ile Thr Leu Ser Glu Val Asp Leu 3765 3770 3775 Ala Asp Gly Ser Val Pro Thr Gly Asn Pro Val Thr Met Thr Gln Thr 3780 3785 3790 Ile Thr Tyr Thr Ala Gly Ser Asp Asp Val Ser His Phe Arg Ile Asp 3795 3800 3805 Pro Thr Gln Phe Asn Thr Ser Gly Val Leu Lys Ser Asn Gly Leu Asp 3810 3815 3820 Val Glu Ile Lys Glu Gln Pro Ala Asn Ser Gly Asn Tyr Ile Gly Phe3825 3830 3835 3840 Val Lys Asp Gly Ser Asn Val Glu Thr Asn Val Phe Thr Ile Ser Phe 3845 3850 3855 Ser Thr Ser Asn Leu Gly Gln Tyr Thr Phe Thr Leu Leu Glu Ala Leu 3860 3865 3870 Asp His Val Asp Gly Leu Gln Asn Asn Ile Leu Ser Phe Asp Val Pro 3875 3880 3885 Val Leu Ala Val Asp Ala Asp Gly Asp Asp Ser Ala Met Ser Pro Met 3890 3895 3900 Thr Val Ala Ile Thr Asp Asp Val Gln Gly Val Gln Asp Gly Thr Leu3905 3910 3915 3920 Ser Ile Thr Glu Pro Ser Leu Ala Asp Leu Ala Ser Gly Thr Pro Pro 3925 3930 3935 Thr Thr Ala Ile Ile Asp Val Met Pro Thr Gln Ser Ala Asp Gly Ala 3940 3945 3950 Lys Val Thr Gln Phe Thr Tyr Asp Gly Gly Thr Ala Val Thr Leu Asp 3955 3960 3965 Pro Ser Ile Ala Thr Glu Gln Val Phe Thr Val Thr Asp Gly Leu Leu 3970 3975 3980 Tyr Ile Thr Ile Glu Gly Glu Val Arg Phe Glu Pro Ser Arg Asp Leu3985 3990 3995 4000 Asp His Ser Ser Gly Asp Ile Val Arg Thr Ile Val Val Thr Thr Ser 4005 4010 4015 Asp Phe Asp Asn Asp Thr Asp Thr Ala Asp Val Thr Leu Thr Ile Lys 4020 4025 4030 Asp Gly Ile Asn Pro Val Ile Asn Val Val Pro Asp Val Asn Leu Ser 4035 4040 4045 Glu Val Asn Leu Ala Asp Gly Ser Thr Pro Ser Gly Ser Ala Val Ser 4050 4055 4060 Ser Thr His Thr Ile Thr Tyr Thr Glu Gly Ser Asp Asp Phe Ser His4065 4070 4075 4080 Phe Arg Ile Ala Thr Asn Glu Phe Asn 
Pro Gly Asp Leu Leu Lys Ser 4085 4090 4095 Ser Gly Leu Val Val Gln Leu Lys Glu Asp Pro Ala Ser Ala Gly Asp 4100 4105 4110 Tyr Ile Gly Tyr Thr Asp Asp Gly Met Gly Asn Val Thr Asp Val Phe 4115 4120 4125 Thr Ile Ser Phe Asp Ser Ala Asn Lys Ala Gln Phe Thr Phe Thr Leu 4130 4135 4140 Ile Glu Ala Leu Asp His Leu Asp Gly Val Leu Tyr Asn Asp Leu Thr4145 4150 4155 4160 Phe Arg Leu Pro Ile Tyr Ala Val Asp Thr Asp Asp Ser Glu Ser Thr 4165 4170 4175 Lys Arg Asp Val Val Val Thr Ile Glu Asp Asp Ile Gln Gln Met Gln 4180 4185 4190 Asp Gly Phe Leu Thr Ile Thr Glu Pro Asn Ser Gly Thr Pro Thr Thr 4195 4200 4205 Thr Thr Val Asp Val Met Pro Ile Pro Ser Ala Asp Gly Ala Thr Ile 4210 4215 4220 Thr Gln Phe Thr Tyr Asp Gly Gly Ser Pro Ile Thr Leu Asn Gln Ser4225 4230 4235 4240 Ile Ser Gly Glu Gln Glu Phe Val Phe Thr Glu Gly Ser Leu Phe Val 4245 4250 4255 Thr Leu Asp Gly Asp Val Arg Phe Glu Pro Asn Arg Asn Leu Asp His 4260 4265 4270 Ser Ala Gly Asp Ile Val Lys Ser Ile Val Phe Thr Ser Ser Asp Phe 4275 4280 4285 Asp Asn Asp Ile Phe Ser Ser Lys Val Thr Leu Thr Ile Val Asp Gly 4290 4295 4300 Asp Gly Pro Thr Ile Asp Val Val Pro Gly Val Ala Leu Ser Glu Ser4305 4310 4315 4320 Leu Leu Ala Asp Gly Ser Thr Pro Ser Val Asn Pro Val Ser Met Thr 4325 4330 4335 Gln Thr Ile Thr Ser Leu Ala Ser Ser Asp Asp Ile Ala Glu Ile Val 4340 4345 4350 Val Glu Val Gly Leu Phe Asn Thr Asn Gly Ala Leu Lys Ser Asp Gly 4355 4360 4365 Leu Ser Leu Ser Leu Arg Glu Asp Pro Val Asn Ser Gly Asp Tyr Ile 4370 4375 4380 Ala Phe Thr Thr Asn Gly Ser Gly Val Glu Lys Val Ile Phe Thr Leu4385 4390 4395 4400 Asp Phe Asp Asp Thr Asn Pro Ser Gln Tyr Thr Phe Thr Leu Leu Glu 4405 4410 4415 Arg Leu Asp His Val Asp Gly Leu Gly Asn Asn Asp Leu Ser Phe Asp 4420 4425 4430 Leu Ser Val Tyr Ala Glu Asp Thr Asp Gly Asp Ile Ser Ala Ser Lys 4435 4440 4445 Pro Leu Thr Val Thr Ile Thr Asp Asp Val Gln Leu Met Gln Ser Gly 4450 4455 4460 Ala Leu Asn Ile Thr Glu Pro Thr Thr Gly Thr Pro Thr Thr Ala Val4465 4470 4475 4480 Phe Asp Val Met Pro Ala Gln Ser Ala Asp 
Gly Ala Thr Ile Thr Lys 4485 4490 4495 Phe Thr Tyr Gly Ser Gln Pro Glu Glu Ser Leu Val Gln Thr Val Thr 4500 4505 4510 Gly Glu Gln Glu Phe Val Phe Thr Glu Gly Ser Leu Phe Ile Asn Leu 4515 4520 4525 Glu Gly Asp Val Arg Phe Glu Pro Asn Arg Asn Leu Asp His Ser Gly 4530 4535 4540 Gly Asn Ile Val Lys Thr Ile Thr Val Thr Ser Glu Asp Lys Asp Gly4545 4550 4555 4560 Asp Ile Val Thr Ser Thr Val Thr Leu Thr Ile Val Asp Gly Ala Pro 4565 4570 4575 Pro Val Ile Asp Thr Val Pro Thr Val Ala Leu Glu Glu Ala Asn Leu 4580 4585 4590 Val Asp Gly Ser Ser Pro Gly Leu Pro Val Ser Gln Thr Glu Ile Ile 4595 4600 4605 Thr Phe Thr Ala Gly Ser Asp Asp Val Ser His Phe Arg Ile Asp Pro 4610 4615 4620 Ala Gln Phe Asn Thr Ser Gly Asp Leu Lys Ala Asp Gly Leu Val Val4625 4630 4635 4640 Gln Leu Lys Glu Asp Pro Leu Asn Ser Asp Asn Tyr Ile Gly Tyr Val 4645 4650 4655 Glu Ser Gly Gly Val Gln Thr Asp Ile Phe Thr Ile Thr Phe Ser Ser 4660 4665 4670 Val Val Leu Gly Glu Tyr Thr Phe Thr Leu Leu Glu Glu Leu Asp His 4675 4680 4685 Leu Pro Val Gln Gly Asn Asn Asp Gln Ile Phe Thr Leu Pro Val Ile 4690 4695 4700 Ala Val Asp Lys Asp Asn Thr Asp Ser Ala Val Lys Pro Leu Thr Val4705 4710 4715 4720 Thr Ile Thr Asp Asp Val Pro Thr Ile Thr Asp Thr Thr Gly Ala Ser 4725 4730 4735 Thr Phe Val Val Asp Glu Asp Asp Leu Gly Thr Leu Ala Gln Ala Thr 4740 4745 4750 Gly Ser Phe Val Thr Thr Glu Gly Ala Asp Gln Val Glu Val Tyr Glu 4755 4760 4765 Leu Arg Asn Ile Ser Thr Leu Glu Ala Thr Leu Ser Ser Gly Ser Glu 4770 4775 4780 Gly Ile Lys Ile Thr Glu Ile Thr Gly Ala Ala Asn Thr Thr Thr Tyr4785 4790 4795
4800 Gln Gly Ala Thr Asp Pro Ser Gly Thr Pro Ile Phe Thr Leu Val Leu 4805 4810 4815 Thr Asp Asp Gly Ala Tyr Thr Phe Thr Leu Leu Gly Pro Leu Asn His 4820 4825 4830 Ala Thr Thr Pro Ser Asn Leu Asp Thr Leu Thr Ile Pro Phe Asp Val 4835 4840 4845 Val Ala Val Asp Gly Asp Gly Asp Asp Ser Asn Gln Tyr Val Leu Pro 4850 4855 4860 Ile Glu Val Leu Asp Asp Val Pro Val Met Thr Ala Pro Thr Gly Glu4865 4870 4875 4880 Thr Val Val Asp Glu Asp Asp Leu Thr Gly Ile Gly Ser Asp Gln Ser 4885 4890 4895 Glu Asp Thr Ile Ile Asn Gly Leu Phe Thr Val Asp Glu Gly Ala Asp 4900 4905 4910 Gly Val Val Leu Tyr Glu Leu Val Asp Glu Asp Leu Val Leu Thr Gly 4915 4920 4925 Leu Thr Ser Asp Gly Glu Ser Leu Glu Trp Leu Ala Val Ser Gln Asn 4930 4935 4940 Gly Thr Thr Phe Thr Tyr Val Ala Gln Thr Ala Thr Ser Asn Glu Ala4945 4950 4955 4960 Val Phe Glu Ile Ile Phe Asp Thr Ser Asp Asn Ser Tyr Gln Phe Glu 4965 4970 4975 Leu Phe Lys Pro Leu Lys His Pro Asp Gly Ala Asn Glu Asn Ala Ile 4980 4985 4990 Asp Leu Asp Phe Ser Ile Val Ala Glu Asp Phe Asp Gln Asp Gln Ser 4995 5000 5005 Asp Ala Ile Gly Leu Lys Ile Thr Val Thr Asp Asp Val Pro Leu Val 5010 5015 5020 Thr Thr Gln Ser Ile Thr Arg Leu Glu Gly Gln Gly Tyr Gly Asn Ser5025 5030 5035 5040 Lys Val Asp Met Phe Ala Asn Ala Thr Asp Val Gly Ala Asp Gly Ala 5045 5050 5055 Val Leu Ser Arg Ile Glu Gly Ile Ser Asn Asn Gly Ala Asp Ile Val 5060 5065 5070 Phe Arg Ser Gly Asn Asn Gly Pro Tyr Ser Ser Gly Phe Asp Leu Asn 5075 5080 5085 Ser Gly Ser Gln Gln Val Arg Val Tyr Glu Gln Thr Asn Gly Gly Ala 5090 5095 5100 Asp Thr Arg Glu Leu Gly Arg Leu Arg Ile Asn Ser Asn Gly Glu Val5105 5110 5115 5120 Glu Phe Arg Ala Asn Gly Tyr Leu Asp His Asp Gly Asp Asp Thr Ile 5125 5130 5135 Asp Phe Ser Ile Asn Val Ile Ala Thr Asp Gly Asp Leu Asp Thr Ser 5140 5145 5150 Glu Thr Pro Leu Asp Ile Thr Ile Thr Asp Arg Asp Ser Thr Arg Ile 5155 5160 5165 Ala Leu Lys Val Thr Thr Phe Glu Asp Ala Gly Arg Asp Ser Thr Ile 5170 5175 5180 Pro Tyr Ala Thr Gly Asp Glu Pro Thr Leu Glu Asn Val Gln Asp Asn5185 5190 5195 5200 
Gln Asn Gly Leu Pro Asn Ala Pro Ala Gln Val Ala Leu Gln Val Ser 5205 5210 5215 Leu Tyr Asp Gln Asp Asn Ala Glu Ser Ile Gly Gln Leu Thr Ile Lys 5220 5225 5230 Ser Pro Asn Gly Gly Asp Ser His Gln Gly Thr Phe Tyr Tyr Phe Asp 5235 5240 5245 Gly Ala Asp Tyr Ile Glu Leu Val Pro Glu Ser Asn Gly Ser Ile Ile 5250 5255 5260 Phe Gly Ser Pro Glu Leu Glu Gln Ser Phe Ala Pro Asn Pro Ser Glu5265 5270 5275 5280 Pro Arg Gln Thr Ile Ala Thr Ile Asp Asn Leu Phe Phe Val Pro Asp 5285 5290 5295 Gln His Ala Ser Ser Asp Glu Thr Gly Gly Arg Val Arg Tyr Glu Leu 5300 5305 5310 Glu Ile Glu Lys Asn Gly Ser Thr Asp His Thr Val Asn Ser Asn Phe 5315 5320 5325 Arg Ile Glu Ile Glu Ala Val Ala Asp Ile Ala Thr Trp Asp Asp Ser 5330 5335 5340 Asn Ser Thr Tyr Gln Tyr Gln Val Asn Glu Asp Glu Asp Asn Val Thr5345 5350 5355 5360 Leu Gln Leu Asn Ala Glu Ser Gln Asp Asn Ser Asn Thr Glu Thr Ile 5365 5370 5375 Thr Tyr Glu Leu Glu Ala Val Gln Gly Asp Gly Lys Phe Glu Leu Leu 5380 5385 5390 Asp Gln Asn Gly Asn Val Leu Thr Pro Val Asn Gly Val Tyr Ile Ile 5395 5400 5405 Ala Ser Ala Asp Ile Asn Ser Thr Val Val Asn Pro Ile Asp Asn Phe 5410 5415 5420 Ser Gly Gln Ile Glu Phe Lys Ala Thr Ala Ile Thr Glu Glu Thr Leu5425 5430 5435 5440 Asn Pro Tyr Asp Asp Ser Asp Asn Gly Gly Ala Asn Asp Lys Thr Thr 5445 5450 5455 Ala Arg Ser Val Glu Gln Ser Ile Val Ile Asp Val Thr Ala Asp Ala 5460 5465 5470 Asp Pro Gly Thr Phe Ser Val Ser Arg Ile Gln Ile Asn Glu Asp Asn 5475 5480 5485 Ile Asp Asp Pro Asp Tyr Val Gly Pro Leu Asp Asn Lys Asp Ala Phe 5490 5495 5500 Thr Leu Asp Glu Val Ile Thr Met Thr Gly Ser Val Asp Ser Asp Ser5505 5510 5515 5520 Ser Glu Glu Leu Phe Val Arg Ile Ser Asn Val Thr Glu Gly Ala Val 5525 5530 5535 Leu Tyr Phe Leu Gly Thr Thr Thr Val Val Pro Thr Ile Thr Ile Asn 5540 5545 5550 Gly Val Asp Tyr Gln Glu Ile Ala Tyr Ser Asp Leu Ala Asn Val Glu 5555 5560 5565 Val Val Pro Thr Lys His Ser Asn Val Asp Phe Thr Phe Asp Val Thr 5570 5575 5580 Gly Val Val Lys Asp Thr Ala Asn Leu Ser Thr Gly Ala Gln Ile Asp5585 5590 5595 5600 Glu 
Glu Ile Leu Gly Thr Lys Thr Val Asn Val Glu Val Lys Gly Val 5605 5610 5615 Ala Asp Thr Pro Tyr Gly Gly Thr Asn Gly Thr Ala Trp Ser Ala Ile 5620 5625 5630 Thr Asp Gly Thr Thr Ser Gly Val Gln Thr Thr Ile Gln Glu Ser Gln 5635 5640 5645 Asn Gly Asp Thr Phe Ala Glu Leu Asp Phe Thr Val Leu Ser Gly Glu 5650 5655 5660 Arg Arg Pro Asp Thr Gly Thr Thr Pro Leu Ala Asp Asp Gly Ser Glu5665 5670 5675 5680 Ser Ile Thr Val Ile Leu Ser Gly Ile Pro Asp Gly Val Val Leu Glu 5685 5690 5695 Asp Gly Asp Gly Thr Val Ile Asp Leu Asn Phe Val Gly Tyr Glu Thr 5700 5705 5710 Gly Pro Gly Gly Ser Pro Asp Leu Ser Lys Pro Ile Tyr Glu Ala Asn 5715 5720 5725 Ile Thr Glu Ala Gly Lys Thr Ser Gly Ile Arg Ile Arg Pro Val Asp 5730 5735 5740 Ser Ser Thr Glu Asn Ile His Ile Gln Gly Lys Val Ile Val Thr Glu5745 5750 5755 5760 Asn Asp Gly His Thr Leu Thr Phe Asp Gln Glu Ile Arg Val Leu Val 5765 5770 5775 Ile Pro Arg Ile Asp Thr Ser Ala Thr Tyr Val Asn Thr Thr Asn Gly 5780 5785 5790 Asp Glu Asp Thr Ala Ile Asn Ile Asp Trp His Pro Glu Gly Thr Asp 5795 5800 5805 Tyr Ile Asp Asp Asp Glu His Phe Thr Lys Ile Thr Ile Asn Gly Ile 5810 5815 5820 Pro Leu Gly Val Thr Ala Val Val Asn Gly Asp Val Thr Val Asp Asp5825 5830 5835 5840 Ser Thr Pro Gly Thr Leu Ile Ile Thr Pro Lys Asp Ala Ser Gln Thr 5845 5850 5855 Pro Glu Gln Phe Thr Gln Ile Ala Leu Ala Asn Asn Phe Ile Gln Met 5860 5865 5870 Thr Pro Pro Ala Asp Ser Ser Ala Asp Phe Thr Leu Thr Thr Glu Leu 5875 5880 5885 Lys Met Glu Glu Arg Asp His Glu Tyr Thr Ser Ser Gly Leu Glu Asp 5890 5895 5900 Glu Asp Gly Gly Tyr Val Glu Ala Asp Pro Asp Ile Thr Gly Ile Ile5905 5910 5915 5920 Asn Val Gln Val Arg Pro Val Val Glu Pro Gly Asp Ala Asp Asn Lys 5925 5930 5935 Ile Val Val Ser Asn Glu Asp Gly Ser Gly Asp Leu Thr Thr Ile Thr 5940 5945 5950 Ala Asp Ala Asn Gly Val Ile Lys Phe Thr Thr Asn Ser Asp Asn Gln 5955 5960 5965 Thr Thr Asp Thr Asn Gly Asp Glu Ile Trp Asp Gly Glu Tyr Val Val 5970 5975 5980 Arg Tyr Gln Glu Thr Asp Leu Ser Thr Val Glu Glu Gln Val Asp Glu5985 5990 5995 6000 Val Ile 
Val Gln Leu Thr Asn Thr Asp Gly Ser Ala Leu Ser Asp Asp 6005 6010 6015 Ile Leu Gly Gln Leu Leu Val Thr Gly Ala Ser Tyr Glu Gly Gly Gly 6020 6025 6030 Arg Trp Val Val Thr Asn Glu Asp Ala Phe Ser Val Ser Ala Pro Asn 6035 6040 6045 Gly Leu Asp Phe Thr Pro Ala Asn Asp Ala Asp Asp Val Ala Thr Asp 6050 6055 6060 Phe Asn Asp Ile Lys Met Thr Ile Phe Thr Leu Val Ser Asp Pro Gly6065 6070 6075 6080 Asp Ala Asn Asn Glu Thr Ser Ala Gln Val Gln Arg Thr Gly Glu Val 6085 6090 6095 Thr Leu Ser Tyr Pro Glu Val Leu Thr Ala Pro Asp Lys Val Ala Ala 6100 6105 6110 Asp Ile Ala Ile Val Pro Asp Ser Val Ile Asp Ala Val Glu Asp Thr 6115 6120 6125 Gln Leu Asp Leu Gly Ala Ala Leu Asn Gly Ile Leu Ser Leu Thr Gly 6130 6135 6140 Arg Asp Asp Ser Thr Asp Gln Val Thr Val Ile Ile Asp Gly Thr Leu6145 6150 6155 6160 Val Ile Asp Ala Thr Thr Ser Phe Pro Ile Ser Leu Ser Gly Thr Ser 6165 6170 6175 Asp Val Asp Phe Val Asn Gly Lys Tyr Val Tyr Glu Thr Thr Val Glu 6180 6185 6190 Gln Gly Val Ala Val Asp Ser Ser Gly Leu Leu Leu Asn Leu Pro Pro 6195 6200 6205 Asn Tyr Ser Gly Asp Phe Arg Leu Pro Met Thr Ile Val Thr Lys Asp 6210 6215 6220 Leu Gln Ser Gly Asp Glu Lys Thr Leu Val Thr Glu Val Ile Ile Lys6225 6230 6235 6240 Val Ala Pro Asp Ala Glu Thr Asp Pro Thr Ile Glu Val Asn Val Val 6245 6250 6255 Gly Ser Leu Asp Asp Ala Phe Asn Pro Val Asp Thr Asp Gly Gln Ala 6260 6265 6270 Gly Gln Asp Pro Val Gly Tyr Glu Asp Thr Tyr Ile Gln Leu Asp Phe 6275 6280 6285 Asn Ser Thr Ile Ser Asp Gln Val Ser Gly Val Glu Gly Gly Gln Glu 6290 6295 6300 Ala Phe Thr Ser Ile Thr Leu Thr Leu Asp Asp Pro Ser Ile Gly Ala6305 6310 6315 6320 Phe Tyr Asp Asn Thr Gly Thr Ser Leu Gly Thr Ser Val Thr Phe Asn 6325 6330 6335 Gln Ala Glu Ile Ala Ala Gly Ala Leu Asp Asn Val Leu Phe Arg Ala 6340 6345 6350 Ile Glu Asn Tyr Pro Thr Gly Asn Asp Ile Asn Gln Val Gln Val Asn 6355 6360 6365 Val Ser Gly Thr Val Thr Asp Thr Ala Thr Tyr Asn Asp Pro Ala Ser 6370 6375 6380 Pro Ala Gly Thr Ala Thr Asp Ser Asp Thr Phe Ser Thr Ser Val Ser6385 6390 6395 6400 Phe Glu Val 
Val Pro Val Val Asp Asp Val Ser Val Thr Gly Pro Gly 6405 6410 6415 Ser Asp Pro Asp Val Ile Glu Ile Thr Gly Asn Glu Asp Gln Leu Ile 6420 6425 6430 Ser Leu Ser Gly Thr Gly Pro Val Ser Ile Ala Leu Thr Asp Leu Asp 6435 6440 6445 Gly Ser Glu Gln Phe Val Ser Ile Lys Phe Thr Asp Val Pro Asp Gly 6450 6455 6460 Phe Gln Met Arg Ala Asp Ala Gly Ser Thr Tyr Thr Val Lys Asn Asn6465 6470 6475 6480 Gly Asn Gly Glu Trp Ser Val Gln Leu Pro Gln Ala Ser Gly Leu Ser 6485 6490 6495 Phe Asp Leu Ser Glu Ile Ser Ile Leu Pro Pro Lys Asn Phe Ser Gly 6500 6505 6510 Thr Ala Glu Phe Gly Val Glu Val Phe Thr Gln Glu Ser Leu Leu Gly 6515 6520 6525 Val Pro Thr Ala Ala Ala Asn Leu Pro Ser Phe Lys Leu His Val Val 6530 6535 6540 Pro Val Gly Asp Asp Val Asp Thr Asn Pro Thr Asp Ser Val Thr Gly6545 6550 6555 6560 Asn Glu Gly Gln Asn Ile Asp Ile Glu Ile Asn Ala Thr Ile Leu Asp 6565 6570 6575 Lys Glu Leu Ser Ala Thr Gly Ser Gly Thr Tyr Thr Glu Asn Ala Pro 6580 6585 6590 Glu Thr Leu Arg Val Glu Val Ala Gly Val Pro Gln Asp Ala Ser Ile 6595 6600 6605 Phe Tyr Pro Asp Gly Thr Thr Leu Ala Ser Tyr Asp Pro Ala Thr Gln 6610 6615 6620 Leu Trp Thr Leu Asp Val Pro Ala Gln Ser Leu Asp Lys Ile Val Phe6625 6630 6635 6640 Asn Ser Gly Glu His Asn Ser Asp Thr Gly Asn Val Leu Gly Ile Asn 6645 6650 6655 Gly Pro Leu Gln Ile Thr Val Arg Ser Val Asp Thr Asp Ala Asp Asn 6660 6665 6670 Thr Glu Tyr Leu Gly Thr Pro Thr Ser Phe Asp Val Asp Leu Val Ile 6675 6680 6685 Asp Pro Ile Asn Asp Gln Pro Ile Phe Val Asn Val Thr Asn Ile Glu 6690 6695 6700 Thr Ser Glu Asp Ile Ser Val Ala Ile Asp Asn Phe Ser Ile Tyr Asp6705 6710 6715 6720 Val Asp Ala Asn Phe Asp Asn Pro Asp Ala Pro Tyr Glu Leu Thr Leu 6725 6730 6735 Lys Val Asp Gln Thr Leu Pro Gly Ala Gln Gly Val Phe Glu Phe Thr 6740 6745 6750 Ser Ser Pro Asp Val Thr Phe Val Leu Gln Pro Asp Gly Ser Leu Val 6755 6760 6765 Ile Thr Gly Lys Glu Ala Asp Ile Asn Thr Ala Leu Thr Asn Gly Ala 6770 6775 6780 Val Thr Phe Lys Pro Asp Pro Asp Gln Asn Tyr Leu Asn Gln Thr Gly6785 6790 6795 6800 Leu Val Thr Ile 
Asn Ala Thr Leu Asp Asp Gly Gly Asn Asn Gly Leu 6805 6810 6815 Ile Asp Ala Val Asp Pro Asn Thr Ala Gln Thr Asn Gln Thr Thr Phe 6820 6825 6830 Thr Ile Lys Val Thr Glu Val Asn Asp Ala Pro Val Ala Thr Asn Val 6835 6840 6845 Asp Leu Gly Ser Ile Ala Glu Asp Ala Gln Ile Val Ile Val Glu Ser 6850 6855 6860 Asp Leu Ile Ala Ala Ser Ser Asp Leu Glu Asn His Asn Leu Thr Val6865 6870 6875 6880 Thr Gly Val Thr Leu Thr Gln Gly Gln Gly Gln Leu Thr Arg Tyr Glu 6885 6890 6895 Asn Ala Gly Gly Ala Asp Asp Ala Ala Ile Thr Gly Pro Phe Trp Ile 6900 6905 6910 Phe Ile Ala Asp Asn Asp Phe Asn Gly Asp Val Lys Phe Asn Tyr Ser 6915 6920 6925 Ile Ile Asp Asp Gly Thr Thr Asn Gly Val Asp Asp Phe Lys Thr Asp 6930 6935 6940 Ser Ala Glu Ile Ser Leu Val Val Thr Glu Val Asn Asp Gln Pro Val6945 6950 6955 6960 Ala Ser Asn Ile Asp Leu Gly Thr Met Leu Glu Glu Gly Gln Leu Val 6965 6970 6975 Ile Lys Glu Glu Asp Leu Ile Ser Ala Thr Thr Asp Pro Glu Asn Asp 6980 6985 6990 Thr Ile Thr Val Asn Ser Leu Val Leu Asp Gln Gly Gln Gly Gln Leu 6995 7000 7005 Gln Arg Phe Glu Asn Val Gly Gly Ala Asp Asp Ala Thr Ile Thr Gly 7010 7015 7020 Pro Tyr Trp Val Phe Thr Ala Ala Asn Glu Tyr Asn Gly Asp Val Lys7025 7030 7035 7040 Phe Thr Tyr Thr Val Glu Asp Asp Gly Thr Thr Asn Gly Ala Asp Asp 7045 7050 7055 Phe Leu Thr Asp Thr Gly Glu Ile Ser Val Val Val Thr Glu Val Asn 7060 7065 7070 Asp Gln Pro Val Ala Thr Asp Ile Asp Leu Gly Asn Ile Leu Glu Glu 7075 7080 7085 Gly Gln Leu Ile Ile Lys Glu Glu Asp Leu Ile Ala Ala Thr Ser Asp 7090 7095 7100 Pro Glu Asn Asp Thr Ile Thr Val Thr Asn Leu Val Leu Asp Glu Gly7105 7110
7115 7120 Gln Gly Gln Leu Gln Arg Phe Glu Asn Val Gly Gly Ala Asp Asp Ala 7125 7130 7135 Met Ile Thr Gly Pro Tyr Trp Ile Phe Thr Ala Ala Asp Glu Tyr Asn 7140 7145 7150 Gly Asn Val Lys Phe Thr Tyr Thr Val Glu Asp Asp Gly Thr Thr Asn 7155 7160 7165 Gly Ala Asn Asp Phe Leu Thr Asp Thr Ala Glu Ile Thr Ala Ile Val 7170 7175 7180 Asp Gly Val Asn Asp Thr Pro Val Val Asn Gly Asp Ser Val Thr Thr7185 7190 7195 7200 Ile Val Asp Glu Asp Ala Gly Gln Leu Leu Ser Gly Ile Asn Val Ser 7205 7210 7215 Asp Pro Asp Tyr Val Asp Ala Phe Ser Asn Asp Leu Met Thr Val Thr 7220 7225 7230 Leu Thr Val Asp Tyr Gly Thr Leu Asn Val Ser Leu Pro Ala Val Thr 7235 7240 7245 Thr Val Met Val Asn Gly Asn Asn Thr Gly Ser Val Ile Leu Val Gly 7250 7255 7260 Thr Leu Ser Asp Leu Asn Ala Leu Ile Asp Thr Pro Thr Ser Pro Asn7265 7270 7275 7280 Gly Val Tyr Leu Asp Ala Ser Leu Ser Pro Thr Asn Ser Ile Gly Leu 7285 7290 7295 Glu Val Ile Ala Lys Asp Ser Gly Asn Pro Ser Gly Ile Ala Ile Glu 7300 7305 7310 Thr Ala Pro Val Val Tyr Asn Ile Ala Val Thr Pro Val Ala Asn Ala 7315 7320 7325 Pro Thr Leu Ser Ile Asp Pro Ala Phe Asn Tyr Val Arg Asn Ile Thr 7330 7335 7340 Thr Ser Ser Ser Val Val Ala Asn Ser Gly Val Ala Leu Val Gly Ile7345 7350 7355 7360 Val Ala Ala Leu Thr Asp Ile Thr Glu Glu Leu Thr Leu Lys Ile Ser 7365 7370 7375 Asp Val Pro Asp Gly Val Asp Val Thr Ser Asp Val Gly Thr Val Ser 7380 7385 7390 Leu Val Gly Asp Thr Trp Ile Ala Thr Ala Asp Ala Ile Asp Ser Leu 7395 7400 7405 Arg Leu Val Glu Gln Ser Ser Leu Gly Lys Pro Leu Thr Pro Gly Asn 7410 7415 7420 Tyr Thr Leu Lys Val Glu Ala Leu Ser Glu Glu Thr Asp Asn Asn Asp7425 7430 7435 7440 Ile Ala Ile Ser Gln Asn Ile Asp Leu Asn Leu Asn Ile Val Ala Asn 7445 7450 7455 Pro Ile Asp Leu Asp Leu Ser Ser Glu Thr Asp Asp Val Gln Leu Leu 7460 7465 7470 Ala Ser Asn Phe Asp Thr Asn Leu Thr Gly Gly Thr Gly Asn Asp Arg 7475 7480 7485 Leu Val Gly Gly Ala Gly Asp Asp Thr Leu Val Gly Gly Asp Gly Asn 7490 7495 7500 Asp Thr Leu Ile Gly Gly Gly Gly Ser Asp Ile Leu Thr Gly Gly Asn7505 7510 7515 
7520 Gly Met Asp Ser Phe Val Trp Leu Asn Ile Glu Asp Gly Val Glu Asp 7525 7530 7535 Thr Ile Thr Asp Phe Ser Leu Ser Glu Gly Asp Gln Ile Asp Leu Arg 7540 7545 7550 Glu Val Leu Pro Glu Leu Lys Asn Thr Ser Pro Asp Met Ser Ala Leu 7555 7560 7565 Leu Gln Gln Ile Asp Ala Lys Val Glu Gly Asp Asp Ile Glu Leu Thr 7570 7575 7580 Ile Lys Ser Asp Gly Leu Gly Thr Thr Glu Gln Val Ile Val Val Glu7585 7590 7595 7600 Asp Leu Ala Pro Gln Leu Thr Leu Ser Gly Thr Met Pro Ser Asp Ile 7605 7610 7615 Leu Asp Ala Leu Val Gln Gln Asn Val Ile Thr His Gly 7620 7625
<210> 5 <211> 765 <212> DNA <213> Vibrio splendidus <400> 5
atgaaaaaaa catcactatt acttgcttcc attactctgg cactttctgg tgtagtacaa 60gctgaccagc tagaagacat tcaaaaatca ggcacacttc gcgtcggcac cacaggcgac 120tacaaacctt tttcttactt cgacggcaaa acctactctg gttatgacat tgacgtagcc 180aaacatgttg cagagcagtt gggcgttgaa ttacagattg ttcgtaccac atggaaagat 240ctactgaccg atctagacag cgataaatac gacatcgcga tgggcggtat cacgcgtaaa 300atgcagcgtc agttaaacgc agaacaaact caaggttaca tgacctttgg caagtgtttc 360ttagttgcga aaggcaaagc agaacaatac aacagcattg agaaagtgaa cctctcttct 420gtgcgtgttg gcgtcaatat cggtgggact aatgagatgt ttgcggatgc taacttgcaa 480gacgcgagct ttacgcgtta cgagaacaac ctagacgttc cgcaagccgt tgcggaaggt 540aaagttgatg taatggtgac agaaactcct gaaggtctgt tctatcaagt gacggacgaa 600cgtcttgaag cggcacgctg tgaaacaccg tttaccaaca gtcaattcgg ttacctgata 660ccaaaaggtg aacaacgctt gttgaacaca gtgaacttca ttatggatga gatgaaattg 720aaaggcgtcg aagaagagtt cctgatccac aactctctta agtaa 765
<210> 6 <211> 254 <212> PRT <213> Vibrio splendidus <400> 6
Met Lys Lys Thr Ser Leu Leu Leu Ala Ser Ile Thr Leu Ala Leu Ser1 5 10 15 Gly Val Val Gln Ala Asp Gln Leu Glu Asp Ile Gln Lys Ser Gly Thr 20 25 30 Leu Arg Val Gly Thr Thr Gly Asp Tyr Lys Pro Phe Ser Tyr Phe Asp 35 40 45 Gly Lys Thr Tyr Ser Gly Tyr Asp Ile Asp Val Ala Lys His Val Ala 50 55 60 Glu Gln Leu Gly Val Glu Leu Gln Ile Val Arg Thr Thr Trp Lys Asp65 70 75 80 Leu Leu Thr Asp Leu Asp Ser Asp Lys Tyr Asp Ile Ala Met Gly Gly 85 90 95 Ile Thr Arg Lys Met Gln Arg Gln Leu Asn Ala Glu Gln Thr Gln Gly 100 105 110 Tyr 
Met Thr Phe Gly Lys Cys Phe Leu Val Ala Lys Gly Lys Ala Glu 115 120 125 Gln Tyr Asn Ser Ile Glu Lys Val Asn Leu Ser Ser Val Arg Val Gly 130 135 140 Val Asn Ile Gly Gly Thr Asn Glu Met Phe Ala Asp Ala Asn Leu Gln145 150 155 160 Asp Ala Ser Phe Thr Arg Tyr Glu Asn Asn Leu Asp Val Pro Gln Ala 165 170 175 Val Ala Glu Gly Lys Val Asp Val Met Val Thr Glu Thr Pro Glu Gly 180 185 190 Leu Phe Tyr Gln Val Thr Asp Glu Arg Leu Glu Ala Ala Arg Cys Glu 195 200 205 Thr Pro Phe Thr Asn Ser Gln Phe Gly Tyr Leu Ile Pro Lys Gly Glu 210 215 220 Gln Arg Leu Leu Asn Thr Val Asn Phe Ile Met Asp Glu Met Lys Leu225 230 235 240 Lys Gly Val Glu Glu Glu Phe Leu Ile His Asn Ser Leu Lys 245 250
<210> 7 <211> 1764 <212> DNA <213> Vibrio splendidus <400> 7
atgactatcg atacttttgt tgttctcgcc tacttcttct ttttaatcgc tattggttgg 60atgttccgta agttcaccac gtcgactagt gattacttca gagggggcgg caaaatgttg 120tggtggatgg ttggtgcaac cgccttcatg acacagtttt cagcatggac gtttacaggt 180gccgcaggac gcgcgttcaa tgacggtttc gttattgtaa tcctattctt agccaatgct 240tttggctact tcatgaacta tatgtacttc gctccaaagt tccgccaact tcgtgtggta 300acggcgatcg aagctattcg tcagcgcttt ggtaaaacgt ctgaacagtt cttcacatgg 360gcaggtatgc ctgacagcct tatctctgcg ggtatctggc taaatggtct agctatcttc 420gtagcagcgg tattcaacat cccaatggaa gcaaccattg tggtaacggg tatggttcta 480gtattgatgg cagtaacagg cggctcttgg gcggttgttg cttctgactt catgcaaatg 540cttgttatca tggcggttac gattacttgt gcggttgcag cttacttcca cggtggtggc 600ctaactaaca tcgttgcaaa tttcgacggc gacttcatgt taggtaataa cctaaactac 660atgagcatct tcgttctttg ggttgtattc atcttcgtga agcagttcgg tgtaatgaac 720aacagcatca acgcttaccg ttacctatgt gcgaaagaca gtgaaaacgc acgtaaagcg 780gcaggcctag catgtatcct tatggttgtt ggcccactaa tctggttcct accaccttgg 840tacgtaagtg cattcatgcc tgatttcgca ttggagtacg cttcaatggg tgataaagct 900ggtgatgctg cttacctagc attcgtacag aacgtaatgc cagcaggtat ggttggtctt 960cttatgtcag caatgttcgc tgcaacaatg tcttctatgg attcaggttt gaaccgtaac 1020gctggcatct ttgtaatgaa cttctacagc cctattctac gtcaaaacgc aactcagaaa 1080gagctggtta ttgtaagtaa gctaaccact atcatgatgg gtattatcat 
catcgcgatt 1140ggcttgttca ttaactctct acgtcatttg agcttgttcg atatcgtaat gaacgtaggt 1200gcgttaattg gcttcccaat gcttatccct gtactacttg gtatgtggat tcgtaagacg 1260cctgactggg ctggttggtc tacgttaatc gttggtggct tcgtttctta catcttcggt 1320atctcgcttc aagcagaaga catcgagcac ctatttggta tggaaacagc gcttactggc 1380cgtgaatgga gcgacttgaa agttggtctt agcttagctg ctcacgtagt gtttactggt 1440ggttacttca tcctaacttc tcgcttctac aaaggcctat cgcctgaacg tgagaaagaa 1500gttgaccaac tattcactaa ctggaatacg ccgctagtag cggaaggtga agagcagcag 1560aacctagata ctaaacagcg ttcaatgctt ggtaagctta tcagcacagc aggtttcggt 1620atcctagcaa tggctctgat tccaaacgaa ccaacaggac gcttgttgtt cctactatgt 1680ggttcgatgg tactcaccgt tggtatcctg ctggttaacg catctaaagc tccggctaag 1740atgaacaacg agtcagttgc taag 1764
<210> 8 <211> 588 <212> PRT <213> Vibrio splendidus <400> 8
Met Thr Ile Asp Thr Phe Val Val Leu Ala Tyr Phe Phe Phe Leu Ile1 5 10 15 Ala Ile Gly Trp Met Phe Arg Lys Phe Thr Thr Ser Thr Ser Asp Tyr 20 25 30 Phe Arg Gly Gly Gly Lys Met Leu Trp Trp Met Val Gly Ala Thr Ala 35 40 45 Phe Met Thr Gln Phe Ser Ala Trp Thr Phe Thr Gly Ala Ala Gly Arg 50 55 60 Ala Phe Asn Asp Gly Phe Val Ile Val Ile Leu Phe Leu Ala Asn Ala65 70 75 80 Phe Gly Tyr Phe Met Asn Tyr Met Tyr Phe Ala Pro Lys Phe Arg Gln 85 90 95 Leu Arg Val Val Thr Ala Ile Glu Ala Ile Arg Gln Arg Phe Gly Lys 100 105 110 Thr Ser Glu Gln Phe Phe Thr Trp Ala Gly Met Pro Asp Ser Leu Ile 115 120 125 Ser Ala Gly Ile Trp Leu Asn Gly Leu Ala Ile Phe Val Ala Ala Val 130 135 140 Phe Asn Ile Pro Met Glu Ala Thr Ile Val Val Thr Gly Met Val Leu145 150 155 160 Val Leu Met Ala Val Thr Gly Gly Ser Trp Ala Val Val Ala Ser Asp 165 170 175 Phe Met Gln Met Leu Val Ile Met Ala Val Thr Ile Thr Cys Ala Val 180 185 190 Ala Ala Tyr Phe His Gly Gly Gly Leu Thr Asn Ile Val Ala Asn Phe 195 200 205 Asp Gly Asp Phe Met Leu Gly Asn Asn Leu Asn Tyr Met Ser Ile Phe 210 215 220 Val Leu Trp Val Val Phe Ile Phe Val Lys Gln Phe Gly Val Met Asn225 230 235 240 Asn Ser Ile Asn Ala Tyr Arg Tyr Leu Cys Ala Lys Asp Ser Glu Asn 245 250 255 Ala Arg Lys Ala Ala 
Gly Leu Ala Cys Ile Leu Met Val Val Gly Pro 260 265 270 Leu Ile Trp Phe Leu Pro Pro Trp Tyr Val Ser Ala Phe Met Pro Asp 275 280 285 Phe Ala Leu Glu Tyr Ala Ser Met Gly Asp Lys Ala Gly Asp Ala Ala 290 295 300 Tyr Leu Ala Phe Val Gln Asn Val Met Pro Ala Gly Met Val Gly Leu305 310 315 320 Leu Met Ser Ala Met Phe Ala Ala Thr Met Ser Ser Met Asp Ser Gly 325 330 335 Leu Asn Arg Asn Ala Gly Ile Phe Val Met Asn Phe Tyr Ser Pro Ile 340 345 350 Leu Arg Gln Asn Ala Thr Gln Lys Glu Leu Val Ile Val Ser Lys Leu 355 360 365 Thr Thr Ile Met Met Gly Ile Ile Ile Ile Ala Ile Gly Leu Phe Ile 370 375 380 Asn Ser Leu Arg His Leu Ser Leu Phe Asp Ile Val Met Asn Val Gly385 390 395 400 Ala Leu Ile Gly Phe Pro Met Leu Ile Pro Val Leu Leu Gly Met Trp 405 410 415 Ile Arg Lys Thr Pro Asp Trp Ala Gly Trp Ser Thr Leu Ile Val Gly 420 425 430 Gly Phe Val Ser Tyr Ile Phe Gly Ile Ser Leu Gln Ala Glu Asp Ile 435 440 445 Glu His Leu Phe Gly Met Glu Thr Ala Leu Thr Gly Arg Glu Trp Ser 450 455 460 Asp Leu Lys Val Gly Leu Ser Leu Ala Ala His Val Val Phe Thr Gly465 470 475 480 Gly Tyr Phe Ile Leu Thr Ser Arg Phe Tyr Lys Gly Leu Ser Pro Glu 485 490 495 Arg Glu Lys Glu Val Asp Gln Leu Phe Thr Asn Trp Asn Thr Pro Leu 500 505 510 Val Ala Glu Gly Glu Glu Gln Gln Asn Leu Asp Thr Lys Gln Arg Ser 515 520 525 Met Leu Gly Lys Leu Ile Ser Thr Ala Gly Phe Gly Ile Leu Ala Met 530 535 540 Ala Leu Ile Pro Asn Glu Pro Thr Gly Arg Leu Leu Phe Leu Leu Cys545 550 555 560 Gly Ser Met Val Leu Thr Val Gly Ile Leu Leu Val Asn Ala Ser Lys 565 570 575 Ala Pro Ala Lys Met Asn Asn Glu Ser Val Ala Lys 580 585
<210> 9 <211> 627 <212> DNA <213> Vibrio splendidus <400> 9
atgacgacat taaatgaaca actagcaaac ctaaaagtaa ttcctgtaat cgcgatcaac 60cgtgctgaag acgctatccc tctaggtaaa gcgttggttg aaaatggcat gccatgtgca 120gaaattacac tacgtacaga atgtgcaatc gaagcgattc gcatcatgcg taaagaattc 180ccagacatgc taatcggttc aggtactgta ctgactaacg agcaagttga cgcatctatc 240gaagctggtg ttgatttcat cgtaagccca ggttttaacc cacgtactgt tcaatactgt 300atcgataaag gtattgcaat cgtaccgggt gttaacaacc caagcctagt 
tgagcaagca 360atggaaatgg gtcttcgcac gttgaagttc ttccctgctg agccttcagg cggtactggc 420atgcttaaag cactaacagc agtttaccct gttaaattca tgcctactgg tggcgtaagc 480ttgaagaatg ttgatgaata cctatcgatc ccttctgttc ttgcgtgtgg cggtacttgg 540atggttccaa ctaaccttat cgatgaaggt aagtgggacg aactaggcaa gcttgttcgt 600gacgcagttg atcacgttaa cgcttaa 627
<210> 10 <211> 208 <212> PRT <213> Vibrio splendidus <400> 10
Met Thr Thr Leu Asn Glu Gln Leu Ala Asn Leu Lys Val Ile Pro Val1 5 10 15 Ile Ala Ile Asn Arg Ala Glu Asp Ala Ile Pro Leu Gly Lys Ala Leu 20 25 30 Val Glu Asn Gly Met Pro Cys Ala Glu Ile Thr Leu Arg Thr Glu Cys 35 40 45 Ala Ile Glu Ala Ile Arg Ile Met Arg Lys Glu Phe Pro Asp Met Leu 50 55 60 Ile Gly Ser Gly Thr Val Leu Thr Asn Glu Gln Val Asp Ala Ser Ile65 70 75 80 Glu Ala Gly Val Asp Phe Ile Val Ser Pro Gly Phe Asn Pro Arg Thr 85 90 95 Val Gln Tyr Cys Ile Asp Lys Gly Ile Ala Ile Val Pro Gly Val Asn 100 105 110 Asn Pro Ser Leu Val Glu Gln Ala Met Glu Met Gly Leu Arg Thr Leu 115 120 125 Lys Phe Phe Pro Ala Glu Pro Ser Gly Gly Thr Gly Met Leu Lys Ala 130 135 140 Leu Thr Ala Val Tyr Pro Val Lys Phe Met Pro Thr Gly Gly Val Ser145 150 155 160 Leu Lys Asn Val Asp Glu Tyr Leu Ser Ile Pro Ser Val Leu Ala Cys 165 170 175 Gly Gly Thr Trp Met Val Pro Thr Asn Leu Ile Asp Glu Gly Lys Trp 180 185 190 Asp Glu Leu Gly Lys Leu Val Arg Asp Ala Val Asp His Val Asn Ala 195 200 205
<210> 11 <211> 933 <212> DNA <213> Vibrio splendidus <400> 11
atgaaatcat taaacatcgc ggtcattggc gagtgcatgg ttgagctaca aaagaaacaa 60gacgggctta agcaaagttt tggtggcgat acgctgaata ctgcacttta cttgtcacgc 120ttaacaaaag agcaagatat caacacgagc tacgtaactg cactaggcac tgacccattc 180agtaccgaca tgttaaaaaa ttggcaagcg gaaggtatcg acacgagctt aattgctcag 240ctggaccaca aacaaccagg gctttactac atcgagaccg atgaaactgg tgaacgcagt 300ttccactact ggcgtagtga tgctgcagcg aagttcatgt ttgatcagga agacacgcct 360gctcttcttg ataagctgtt ctcttttgac gcgatttact taagtggtat tacgctggca 420atcttgacag aaaatggtcg cacgcagcta ttcaacttct tagacaaatt caaagctcaa 480ggcggccaag tattcttcga caataactac cgacctaaac tttgggaaag ccaacaagaa 540gcgatttctt ggtacttgaa 
aatgcttaag tacacagata cggctctgct gacgtttgat 600gatgagcaag agctatacgg cgacgaaagc attgaacaat gtattacacg tacgtcagag 660tctggtgtga aagagatcgt cattaaacgt ggcgcgaaag actgcttagt ggttgaaagc 720caaagcgctc aatacgttgc acccaaccct gtagacaaca tcgttgatac gactgccgct 780ggcgactcgt tcagtgcagg cttcttggcc aagcgcttga gcggcggtag tgctcgtgat 840gctgcatttg caggtcatat tgtggcagga accgtgattc agcatccagg tgctatcatt 900cctctagaag cgacgcctga tctgtctcta taa 933
<210> 12 <211> 310 <212> PRT <213> Vibrio splendidus <400> 12
Met Lys Ser Leu Asn Ile Ala Val Ile Gly Glu Cys Met Val Glu Leu1 5 10 15 Gln Lys Lys Gln Asp Gly Leu Lys Gln Ser Phe Gly Gly Asp Thr Leu 20 25 30 Asn Thr Ala Leu Tyr Leu Ser Arg Leu Thr Lys Glu Gln Asp Ile Asn 35 40 45 Thr Ser Tyr Val Thr Ala Leu Gly Thr Asp Pro Phe Ser Thr Asp Met 50 55 60 Leu Lys Asn Trp Gln Ala Glu Gly Ile Asp Thr Ser Leu Ile Ala Gln65 70
75 80 Leu Asp His Lys Gln Pro Gly Leu Tyr Tyr Ile Glu Thr Asp Glu Thr 85 90 95 Gly Glu Arg Ser Phe His Tyr Trp Arg Ser Asp Ala Ala Ala Lys Phe 100 105 110 Met Phe Asp Gln Glu Asp Thr Pro Ala Leu Leu Asp Lys Leu Phe Ser 115 120 125 Phe Asp Ala Ile Tyr Leu Ser Gly Ile Thr Leu Ala Ile Leu Thr Glu 130 135 140 Asn Gly Arg Thr Gln Leu Phe Asn Phe Leu Asp Lys Phe Lys Ala Gln145 150 155 160 Gly Gly Gln Val Phe Phe Asp Asn Asn Tyr Arg Pro Lys Leu Trp Glu 165 170 175 Ser Gln Gln Glu Ala Ile Ser Trp Tyr Leu Lys Met Leu Lys Tyr Thr 180 185 190 Asp Thr Ala Leu Leu Thr Phe Asp Asp Glu Gln Glu Leu Tyr Gly Asp 195 200 205 Glu Ser Ile Glu Gln Cys Ile Thr Arg Thr Ser Glu Ser Gly Val Lys 210 215 220 Glu Ile Val Ile Lys Arg Gly Ala Lys Asp Cys Leu Val Val Glu Ser225 230 235 240 Gln Ser Ala Gln Tyr Val Ala Pro Asn Pro Val Asp Asn Ile Val Asp 245 250 255 Thr Thr Ala Ala Gly Asp Ser Phe Ser Ala Gly Phe Leu Ala Lys Arg 260 265 270 Leu Ser Gly Gly Ser Ala Arg Asp Ala Ala Phe Ala Gly His Ile Val 275 280 285 Ala Gly Thr Val Ile Gln His Pro Gly Ala Ile Ile Pro Leu Glu Ala 290 295 300 Thr Pro Asp Leu Ser Leu305 310
<210> 13 <211> 336 <212> DNA <213> Vibrio splendidus <400> 13
atgaactctt tctttatcct agatgaaaat ccatgggaag aacttggtgg cggcattaag 60cgtaaaatcg ttgcttacac tgacgatcta atggcagtac acctatgctt tgataagggc 120gcgattggcc accctcatac tcacgaaatt cacgaccaaa tcggttatgt tgttcgtggt 180agcttcgaag ctgaaatcga cggcgagaag aaagtgctta aagaaggcga tgcttacttc 240gctcgtaaac acatgatgca cggtgcagtt gctctagaac aagacagcat ccttcttgat 300atcttcaatc ctgcgcgtga agatttccta aaataa 336
<210> 14 <211> 111 <212> PRT <213> Vibrio splendidus <400> 14
Met Asn Ser Phe Phe Ile Leu Asp Glu Asn Pro Trp Glu Glu Leu Gly1 5 10 15 Gly Gly Ile Lys Arg Lys Ile Val Ala Tyr Thr Asp Asp Leu Met Ala 20 25 30 Val His Leu Cys Phe Asp Lys Gly Ala Ile Gly His Pro His Thr His 35 40 45 Glu Ile His Asp Gln Ile Gly Tyr Val Val Arg Gly Ser Phe Glu Ala 50 55 60 Glu Ile Asp Gly Glu Lys Lys Val Leu Lys Glu Gly Asp Ala Tyr Phe65 70 75 80 Ala Arg Lys His Met Met His Gly Ala Val Ala Leu Glu Gln Asp Ser 85 90 95 Ile Leu Leu 
Asp Ile Phe Asn Pro Ala Arg Glu Asp Phe Leu Lys 100 105 110
<210> 15 <211> 2208 <212> DNA <213> Vibrio splendidus <400> 15
atgacgacta aaccagtatt gttgactgaa gctgaaatcg aacagcttca tcttgaagtg 60ggccgttcta gcttaatggg caaaaccatt gcagcgaacg cgaaagacct agaagcattc 120atgcgtttac ctattgatgt tccaggtcac ggtgaagctg ggggttacga acataaccgc 180cacaagcaaa attacacgta catgaaccta gctggtcgca tgttcttgat cactaaagag 240caaaaatacg ctgactttgt tacagaatta ctagaagagt acgcagacaa atatctaacg 300tttgattacc acgtacagaa aaacaccaac ccaacaggtc gtttgttcca ccaaatccta 360aacgaacact gctggttaat gttctcaagc ttagcttatt cttgtgttgc ttcaacactg 420acacaagatc agcgtgacaa tattgagtct cgcatttttg aacccatgct agaaatgttc 480acggttaaat acgcacacga cttcgaccgt attcacaatc acggtatttg ggcagtagcc 540gctgtgggta tctgtggtct tgctttaggc aaacgtgaat acctagaaat gtcagtgtac 600ggcatcgacc gtaatgatac tggcggtttc ctagcgcaag tttctcagct atttgcacct 660tctggctact acatggaagg tccttactac catcgttatg cgattcgccc aacgtgtgtg 720ttcgctgaag tgattcaccg tcatatgcct gaagttgata tctacaacta caaaggcggc 780gtgattggta acacagtaca agctatgctt gcgacagcgt acccgaacgg cgagttcccg 840gctctgaatg atgcttctcg tactatgggt atcacagaca tgggtgttca ggttgcggtc 900agtgtttaca gtaagcatta ctcttctgaa aacggtgtag accaaaacat tctgggtatg 960gcgaagattc aagacgcagt atggatgcat ccatgtggtc ttgagctatc taaagcatac 1020gaagccgcat ctgcagagaa agaaatcggc atgcctttct ggccaagtgt tgaattgaat 1080gaaggccctc aaggtcacaa cggcgcgcaa ggctttatcc gtatgcagga taagaaaggc 1140gacgtttctc aacttgtgat gaactacggc caacacggca tgggtcacgg caactttgat 1200acgctgggta tttctttctt taaccgcggt caagaagtgc tacgtgaata cggcttctgt 1260cgttgggtta acgttgagcc aaaattcggc ggccgttacc tagacgaaaa caaatcttac 1320gctcgtcaaa cgattgctca caatgcagtt acgattgatg aaaaatgtca gaacaacttt 1380gacgttgaac gtgcagactc agtacatggt ttacctcact tctttaaagt agaagacgat 1440caaatcaacg gtatgagtgc atttgctaac gatcattacc aaggctttga catgcaacgc 1500agcgtgttca tgctaaatct tgaagaatta gaatctccgt tattgttaga cctataccgc 1560ttagattcta caaaaggcgg cgaaggcgag caccaatacg actattcaca ccaatatgcg 1620ggtcagattg ttcgcactaa cttcgaatac 
caagcgaaca aagagctaaa cactctaggt 1680gacgatttcg gttaccaaca tctatggaac gtcgcaagcg gtgaagtgaa gggcacagca 1740attgtaagtt ggctacaaaa caacacctac tacacatggc taggtgcaac gtctaacgat 1800aatgctgaag taatatttac tcgcactggc gctaacgacc caagtttcaa tctacgttca 1860gagcctgcgt tcattctacg cagcaaaggc gaaacaacac tgtttgcttc tgttgttgaa 1920acgcacggtt atttcaacga agaattcgag caatctgtca atgcacgtgg tgttgtgaaa 1980gacatcaaag tcgtggctca caccaatgtc ggttcggtag ttgagatcac cacagagaaa 2040tcaaacgtga cagtgatgat cagcaaccaa cttggcgcga ctgacagcac tgaacacaaa 2100gtagaactga acggcaaagt atacagctgg aaaggcttct actcagtaga gacaacttta 2160caagaaacga attcagaaga acttagcact gcagggcagg ggaaataa 220816735PRTVibrio splendidus 16Met Thr Thr Lys Pro Val Leu Leu Thr Glu Ala Glu Ile Glu Gln Leu1 5 10 15 His Leu Glu Val Gly Arg Ser Ser Leu Met Gly Lys Thr Ile Ala Ala 20 25 30 Asn Ala Lys Asp Leu Glu Ala Phe Met Arg Leu Pro Ile Asp Val Pro 35 40 45 Gly His Gly Glu Ala Gly Gly Tyr Glu His Asn Arg His Lys Gln Asn 50 55 60 Tyr Thr Tyr Met Asn Leu Ala Gly Arg Met Phe Leu Ile Thr Lys Glu65 70 75 80 Gln Lys Tyr Ala Asp Phe Val Thr Glu Leu Leu Glu Glu Tyr Ala Asp 85 90 95 Lys Tyr Leu Thr Phe Asp Tyr His Val Gln Lys Asn Thr Asn Pro Thr 100 105 110 Gly Arg Leu Phe His Gln Ile Leu Asn Glu His Cys Trp Leu Met Phe 115 120 125 Ser Ser Leu Ala Tyr Ser Cys Val Ala Ser Thr Leu Thr Gln Asp Gln 130 135 140 Arg Asp Asn Ile Glu Ser Arg Ile Phe Glu Pro Met Leu Glu Met Phe145 150 155 160 Thr Val Lys Tyr Ala His Asp Phe Asp Arg Ile His Asn His Gly Ile 165 170 175 Trp Ala Val Ala Ala Val Gly Ile Cys Gly Leu Ala Leu Gly Lys Arg 180 185 190 Glu Tyr Leu Glu Met Ser Val Tyr Gly Ile Asp Arg Asn Asp Thr Gly 195 200 205 Gly Phe Leu Ala Gln Val Ser Gln Leu Phe Ala Pro Ser Gly Tyr Tyr 210 215 220 Met Glu Gly Pro Tyr Tyr His Arg Tyr Ala Ile Arg Pro Thr Cys Val225 230 235 240 Phe Ala Glu Val Ile His Arg His Met Pro Glu Val Asp Ile Tyr Asn 245 250 255 Tyr Lys Gly Gly Val Ile Gly Asn Thr Val Gln Ala Met Leu Ala Thr 260 265 270 Ala Tyr Pro Asn Gly Glu Phe Pro Ala 
Leu Asn Asp Ala Ser Arg Thr 275 280 285 Met Gly Ile Thr Asp Met Gly Val Gln Val Ala Val Ser Val Tyr Ser 290 295 300 Lys His Tyr Ser Ser Glu Asn Gly Val Asp Gln Asn Ile Leu Gly Met305 310 315 320 Ala Lys Ile Gln Asp Ala Val Trp Met His Pro Cys Gly Leu Glu Leu 325 330 335 Ser Lys Ala Tyr Glu Ala Ala Ser Ala Glu Lys Glu Ile Gly Met Pro 340 345 350 Phe Trp Pro Ser Val Glu Leu Asn Glu Gly Pro Gln Gly His Asn Gly 355 360 365 Ala Gln Gly Phe Ile Arg Met Gln Asp Lys Lys Gly Asp Val Ser Gln 370 375 380 Leu Val Met Asn Tyr Gly Gln His Gly Met Gly His Gly Asn Phe Asp385 390 395 400 Thr Leu Gly Ile Ser Phe Phe Asn Arg Gly Gln Glu Val Leu Arg Glu 405 410 415 Tyr Gly Phe Cys Arg Trp Val Asn Val Glu Pro Lys Phe Gly Gly Arg 420 425 430 Tyr Leu Asp Glu Asn Lys Ser Tyr Ala Arg Gln Thr Ile Ala His Asn 435 440 445 Ala Val Thr Ile Asp Glu Lys Cys Gln Asn Asn Phe Asp Val Glu Arg 450 455 460 Ala Asp Ser Val His Gly Leu Pro His Phe Phe Lys Val Glu Asp Asp465 470 475 480 Gln Ile Asn Gly Met Ser Ala Phe Ala Asn Asp His Tyr Gln Gly Phe 485 490 495 Asp Met Gln Arg Ser Val Phe Met Leu Asn Leu Glu Glu Leu Glu Ser 500 505 510 Pro Leu Leu Leu Asp Leu Tyr Arg Leu Asp Ser Thr Lys Gly Gly Glu 515 520 525 Gly Glu His Gln Tyr Asp Tyr Ser His Gln Tyr Ala Gly Gln Ile Val 530 535 540 Arg Thr Asn Phe Glu Tyr Gln Ala Asn Lys Glu Leu Asn Thr Leu Gly545 550 555 560 Asp Asp Phe Gly Tyr Gln His Leu Trp Asn Val Ala Ser Gly Glu Val 565 570 575 Lys Gly Thr Ala Ile Val Ser Trp Leu Gln Asn Asn Thr Tyr Tyr Thr 580 585 590 Trp Leu Gly Ala Thr Ser Asn Asp Asn Ala Glu Val Ile Phe Thr Arg 595 600 605 Thr Gly Ala Asn Asp Pro Ser Phe Asn Leu Arg Ser Glu Pro Ala Phe 610 615 620 Ile Leu Arg Ser Lys Gly Glu Thr Thr Leu Phe Ala Ser Val Val Glu625 630 635 640 Thr His Gly Tyr Phe Asn Glu Glu Phe Glu Gln Ser Val Asn Ala Arg 645 650 655 Gly Val Val Lys Asp Ile Lys Val Val Ala His Thr Asn Val Gly Ser 660 665 670 Val Val Glu Ile Thr Thr Glu Lys Ser Asn Val Thr Val Met Ile Ser 675 680 685 Asn Gln Leu Gly Ala Thr Asp Ser Thr Glu His 
Lys Val Glu Leu Asn 690 695 700 Gly Lys Val Tyr Ser Trp Lys Gly Phe Tyr Ser Val Glu Thr Thr Leu705 710 715 720 Gln Glu Thr Asn Ser Glu Glu Leu Ser Thr Ala Gly Gln Gly Lys 725 730 735 172154DNAVibrio splendidus 17atgagctatc aaccactttt acttaacttt gatgaagcag ctgaacttcg taaagaactt 60ggcaaggata gcctattagg taacgcactg actcgcgaca ttaaacaaac tgacgcttac 120atggctgaag ttggcattga agtaccaggt cacggtgaag gcggcggtta cgagcacaac 180cgtcataagc aaaactacat ccatatggat ctagcaggcc gtttgttcct tatcactgag 240gaaacaaaat accgagatta catcgttgat atgctaacag cgtacgcgac ggtataccca 300acacttgaaa gcaacgtaag ccgtgactct aaccctccgg gtaagctgtt ccaccaaacg 360ttgaacgaga acatgtggat gctttacgct tcttgtgcgt acagctgcat ctaccacacg 420atctctgaag agcaaaagcg tctgatcgaa gacgatcttc ttaagcaaat gatcgaaatg 480ttcgttgtga cttacgcaca cgacttcgat atcgtacaca accacggctt atgggcagtg 540gcagcagtag gtatctgtgg ttacgcaatc aacgatcaag agtctgtaga caaagcacta 600tacggcctga aactagacaa agtcagcggc ggtttcttag cgcaactaga ccaactgttt 660tcgccagacg gctactacat ggaaggtcct tactaccacc gtttctctct gcgtccaatc 720tacctgttcg cagaagcgat tgaacgtcgt cagcctgaag ttggtatcta tgaattcaac 780gattcagtga tcaagacaac gtcttactct gtattcaaaa cggcattccc agacggtaca 840ttgcctgctc tgaacgattc atcgaagaca atctctatca acgatgaagg cgttatcatg 900gcaacgtctg tgtgttacca ccgttacgag caaactgaaa ctctacttgg tatggctaac 960caccagcaaa acgtttgggt tcatgcttca ggtaaaacac tgtctgacgc ggttgatgca 1020gcagacgaca tcaaagcatt caactggggt agcctgtttg taaccgacgg ccctgaaggc 1080gaaaaaggcg gcgtaagcat ccttcgtcac cgtgacgaac aagatgacga cacgatggcg 1140ttgatctggt ttggtcaaca cggttctgat caccagtacc actctgctct agaccacggt 1200cactacgatg gcctgcacct aagcgtattt aaccgtggcc acgaagtgct gcacgatttc 1260ggcttcggtc gctgggtaaa cgttgagcct aagtttggcg gtcgttacat cccagagaac 1320aagtcttact gtaagcagac ggttgctcac aacacagtaa cggttgatca gaaaacgcag 1380aacaacttca acacagcatt ggctgagtct aagtttggtc agaagcactt cttcgtagca 1440gacgaccagt ctctacaagg catgagcggc acaatttctg agtactacac tggcgtagac 1500atgcaacgca gcgtgattct tgctgaactt cctgagttcg agaagccact 
tgtaatcgac 1560gtataccgca tcgaagctga cgctgaacac cagtacgacc tacccgttca ccactctggt 1620cagatcatcc gtactgactt cgattacaac atggaaaaaa cgcttaagcc gctaggtgaa 1680gacaacggtt accagcactt atggaacgtg gcttcaggca aagtgaacga agaaggttct 1740ctagtaagct ggctacatga cagcagctac tacagcctag taaccagcgc gaatgcgggc 1800agcgaagtga tttttgctcg cactggtgct aacgatccag acttcaacct taagagtgag 1860cctgcgttca tcttacgtca gtctggtcaa aaccacgtgt ttgcttctgt actagaaacg 1920catggttact ttaacgagtc tatcgaagcc tctgtaggcg ctcgtggtct agttaaatca 1980gtatctgttg tgggccataa cagtgtcggg actgttgttc gcattcagac tacttctggc 2040aacacttacc actacggtat ctcaaaccaa gctgaagaca cgcagcaagc aactcacact 2100gttgagttcg cgggtgagac atactcgtgg gaaggatcat ttgctcaact gtaa 215418717PRTVibrio splendidus 18Met Ser Tyr Gln Pro Leu Leu Leu Asn Phe Asp Glu Ala Ala Glu Leu1 5 10 15 Arg Lys Glu Leu Gly Lys Asp Ser Leu Leu Gly Asn Ala Leu Thr Arg 20 25 30 Asp Ile Lys Gln Thr Asp Ala Tyr Met Ala Glu Val Gly Ile Glu Val 35 40 45 Pro Gly His Gly Glu Gly Gly Gly Tyr Glu His Asn Arg His Lys Gln 50 55 60 Asn Tyr Ile His Met Asp Leu Ala Gly Arg Leu Phe Leu Ile Thr Glu65 70 75 80 Glu Thr Lys Tyr Arg Asp Tyr Ile Val Asp Met Leu Thr Ala Tyr Ala 85 90 95 Thr Val Tyr Pro Thr Leu Glu Ser Asn Val Ser Arg Asp Ser Asn Pro 100 105 110 Pro Gly Lys Leu Phe His Gln Thr Leu Asn Glu Asn Met Trp Met Leu 115 120 125 Tyr Ala Ser Cys Ala Tyr Ser Cys Ile Tyr His Thr Ile Ser Glu Glu 130 135 140 Gln Lys Arg Leu Ile Glu Asp Asp Leu Leu Lys Gln Met Ile Glu Met145 150 155 160 Phe Val Val Thr Tyr Ala His Asp Phe Asp Ile Val His Asn His Gly 165 170 175 Leu Trp Ala Val Ala Ala Val Gly Ile Cys Gly Tyr Ala Ile Asn Asp 180 185 190 Gln Glu Ser Val Asp Lys Ala Leu Tyr Gly Leu Lys Leu Asp Lys Val 195 200 205 Ser Gly Gly Phe Leu Ala Gln Leu Asp Gln Leu Phe Ser Pro Asp Gly 210 215 220 Tyr Tyr Met Glu Gly Pro Tyr Tyr His Arg Phe Ser Leu Arg Pro Ile225 230 235 240 Tyr Leu Phe Ala Glu Ala Ile Glu Arg Arg Gln Pro Glu Val Gly Ile 245 250 255 Tyr Glu Phe Asn Asp Ser Val Ile Lys Thr Thr Ser Tyr Ser 
Val Phe 260 265 270 Lys Thr Ala Phe Pro Asp Gly Thr Leu Pro Ala Leu Asn Asp Ser Ser 275 280 285 Lys Thr Ile Ser Ile Asn Asp Glu Gly Val Ile Met Ala Thr Ser Val 290 295 300 Cys Tyr His Arg Tyr Glu Gln Thr Glu Thr Leu Leu Gly Met Ala Asn305 310 315 320 His Gln Gln Asn Val Trp Val His Ala Ser Gly Lys Thr Leu Ser Asp 325 330 335 Ala Val Asp Ala Ala Asp Asp Ile Lys Ala Phe Asn Trp Gly Ser Leu 340 345 350 Phe Val Thr Asp Gly Pro Glu Gly Glu Lys Gly Gly Val Ser Ile Leu 355 360 365 Arg His Arg Asp Glu Gln Asp Asp Asp Thr Met Ala Leu Ile Trp Phe 370 375 380 Gly Gln His Gly Ser Asp His Gln Tyr His Ser Ala Leu Asp His Gly385 390 395 400 His Tyr Asp Gly Leu His Leu Ser Val Phe Asn Arg Gly His Glu Val 405 410 415 Leu His Asp Phe Gly Phe Gly Arg Trp Val Asn Val Glu Pro Lys Phe 420 425 430 Gly Gly Arg Tyr Ile Pro Glu Asn Lys Ser Tyr Cys Lys Gln Thr Val 435 440 445 Ala His Asn Thr Val Thr Val Asp Gln Lys Thr Gln Asn Asn Phe Asn 450 455 460 Thr Ala Leu Ala Glu Ser Lys Phe Gly Gln Lys His Phe Phe Val Ala465 470 475

480 Asp Asp Gln Ser Leu Gln Gly Met Ser Gly Thr Ile Ser Glu Tyr Tyr 485 490 495 Thr Gly Val Asp Met Gln Arg Ser Val Ile Leu Ala Glu Leu Pro Glu 500 505 510 Phe Glu Lys Pro Leu Val Ile Asp Val Tyr Arg Ile Glu Ala Asp Ala 515 520 525 Glu His Gln Tyr Asp Leu Pro Val His His Ser Gly Gln Ile Ile Arg 530 535 540 Thr Asp Phe Asp Tyr Asn Met Glu Lys Thr Leu Lys Pro Leu Gly Glu545 550 555 560 Asp Asn Gly Tyr Gln His Leu Trp Asn Val Ala Ser Gly Lys Val Asn 565 570 575 Glu Glu Gly Ser Leu Val Ser Trp Leu His Asp Ser Ser Tyr Tyr Ser 580 585 590 Leu Val Thr Ser Ala Asn Ala Gly Ser Glu Val Ile Phe Ala Arg Thr 595 600 605 Gly Ala Asn Asp Pro Asp Phe Asn Leu Lys Ser Glu Pro Ala Phe Ile 610 615 620 Leu Arg Gln Ser Gly Gln Asn His Val Phe Ala Ser Val Leu Glu Thr625 630 635 640 His Gly Tyr Phe Asn Glu Ser Ile Glu Ala Ser Val Gly Ala Arg Gly 645 650 655 Leu Val Lys Ser Val Ser Val Val Gly His Asn Ser Val Gly Thr Val 660 665 670 Val Arg Ile Gln Thr Thr Ser Gly Asn Thr Tyr His Tyr Gly Ile Ser 675 680 685 Asn Gln Ala Glu Asp Thr Gln Gln Ala Thr His Thr Val Glu Phe Ala 690 695 700 Gly Glu Thr Tyr Ser Trp Glu Gly Ser Phe Ala Gln Leu705 710 715 19825DNAVibrio splendidus 19atgaagtggt tattggcaat agttgcgatg tctggtgtcg cattggcggc agaaaataag 60aatgttgagg tgagcagtga gcatttcgtc cgttatcaat accaagacaa aatcagctat 120ggaaagctag acaatgacgc agtgttaccg gtcagcggcg atctctttgg cgaatattcg 180gtagcaaaaa attcgatccc gttagagtcg gttgaggtgt tactaccgac aaaaccagag 240aaagtcttcg ccgtcgggat gaacttcgct agccacttag cctcacctgc cgatgcacca 300ccgccgatgt ttcttaaact tccttcttct ttgattctca cgggcgaagt gattcaagtg 360ccaccaaaag caagaaatgt tcattttgaa ggcgagctgg tggttgtgat tggtagagag 420ctcagtcaag ccagtgaaga agaagccgaa caagcgatct ttggcgtcac ggtgggcaac 480gatattactg aaagaagttg gcaaggcgcc gatttacaat ggctccgagc gaaagcttcc 540gatggttttg gcccggttgg caacacaatt gtgcgcggca ttgattacaa caatattgag 600ttaaccactc gtgttaacgg taaagtggtt caacaagaaa atacttcgtt catgatccac 660aagccaagaa aagtcgtgag ctatttgagc tattatttta ccctcaaacc gggcgatcta 
720attttcatgg gcacgccagg tagaacttat gctctgtccg acaaagatca agtgagtgtc 780acgattgaag gggtagggac tgtggtaaat gaagtgcggt tctga 82520274PRTVibrio splendidus 20Met Lys Trp Leu Leu Ala Ile Val Ala Met Ser Gly Val Ala Leu Ala1 5 10 15 Ala Glu Asn Lys Asn Val Glu Val Ser Ser Glu His Phe Val Arg Tyr 20 25 30 Gln Tyr Gln Asp Lys Ile Ser Tyr Gly Lys Leu Asp Asn Asp Ala Val 35 40 45 Leu Pro Val Ser Gly Asp Leu Phe Gly Glu Tyr Ser Val Ala Lys Asn 50 55 60 Ser Ile Pro Leu Glu Ser Val Glu Val Leu Leu Pro Thr Lys Pro Glu65 70 75 80 Lys Val Phe Ala Val Gly Met Asn Phe Ala Ser His Leu Ala Ser Pro 85 90 95 Ala Asp Ala Pro Pro Pro Met Phe Leu Lys Leu Pro Ser Ser Leu Ile 100 105 110 Leu Thr Gly Glu Val Ile Gln Val Pro Pro Lys Ala Arg Asn Val His 115 120 125 Phe Glu Gly Glu Leu Val Val Val Ile Gly Arg Glu Leu Ser Gln Ala 130 135 140 Ser Glu Glu Glu Ala Glu Gln Ala Ile Phe Gly Val Thr Val Gly Asn145 150 155 160 Asp Ile Thr Glu Arg Ser Trp Gln Gly Ala Asp Leu Gln Trp Leu Arg 165 170 175 Ala Lys Ala Ser Asp Gly Phe Gly Pro Val Gly Asn Thr Ile Val Arg 180 185 190 Gly Ile Asp Tyr Asn Asn Ile Glu Leu Thr Thr Arg Val Asn Gly Lys 195 200 205 Val Val Gln Gln Glu Asn Thr Ser Phe Met Ile His Lys Pro Arg Lys 210 215 220 Val Val Ser Tyr Leu Ser Tyr Tyr Phe Thr Leu Lys Pro Gly Asp Leu225 230 235 240 Ile Phe Met Gly Thr Pro Gly Arg Thr Tyr Ala Leu Ser Asp Lys Asp 245 250 255 Gln Val Ser Val Thr Ile Glu Gly Val Gly Thr Val Val Asn Glu Val 260 265 270 Arg Phe21717DNAVibrio splendidus 21atggctagca cttttaattc aatttcgggc tcgaagcgta gcctgcacgt gcaagtagca 60cgcgaaatcg ctcgaggaat tttgtctggt gatctgccgc aaggttctat tattcctggt 120gaaatggcgt tgtgtgaaca gtttggtatc agccgaacgg cacttcgtga agcagttaaa 180ctactgacct ctaaaggtct gttagagtct cgccctaaaa ttggtactcg cgtagtcgac 240cgcgcatact ggaacttcct tgatcctcaa ctgattgaat ggatggacgg actaaccgac 300gtagaccaat tctgttctca gtttttaggc cttcgccgtg cgatcgagcc tgaagcgtgt 360gcactggcgg caaaatttgc gacagctgaa caacgtatcg agctttcaga gatcttccaa 420aagatggtcg aagtggatga agctgaagtg tttgaccaag 
aacgttggac agacattgat 480actcgtttcc atagcttgat cttcaatgcg accggtaacg acttctatct accgttcggt 540aatattctga ctactatgtt cgttaacttc atagtgcatt cttctgaaga gggaagcaca 600tgcatcaatg aacaccgcag aatctatgaa gctatcatgg ccggtgattg tgacaaggct 660agaattgctt ctgctgttca cttgcaagat gccaaccacc gtttggcaac agcataa 71722238PRTVibrio splendidus 22Met Ala Ser Thr Phe Asn Ser Ile Ser Gly Ser Lys Arg Ser Leu His1 5 10 15 Val Gln Val Ala Arg Glu Ile Ala Arg Gly Ile Leu Ser Gly Asp Leu 20 25 30 Pro Gln Gly Ser Ile Ile Pro Gly Glu Met Ala Leu Cys Glu Gln Phe 35 40 45 Gly Ile Ser Arg Thr Ala Leu Arg Glu Ala Val Lys Leu Leu Thr Ser 50 55 60 Lys Gly Leu Leu Glu Ser Arg Pro Lys Ile Gly Thr Arg Val Val Asp65 70 75 80 Arg Ala Tyr Trp Asn Phe Leu Asp Pro Gln Leu Ile Glu Trp Met Asp 85 90 95 Gly Leu Thr Asp Val Asp Gln Phe Cys Ser Gln Phe Leu Gly Leu Arg 100 105 110 Arg Ala Ile Glu Pro Glu Ala Cys Ala Leu Ala Ala Lys Phe Ala Thr 115 120 125 Ala Glu Gln Arg Ile Glu Leu Ser Glu Ile Phe Gln Lys Met Val Glu 130 135 140 Val Asp Glu Ala Glu Val Phe Asp Gln Glu Arg Trp Thr Asp Ile Asp145 150 155 160 Thr Arg Phe His Ser Leu Ile Phe Asn Ala Thr Gly Asn Asp Phe Tyr 165 170 175 Leu Pro Phe Gly Asn Ile Leu Thr Thr Met Phe Val Asn Phe Ile Val 180 185 190 His Ser Ser Glu Glu Gly Ser Thr Cys Ile Asn Glu His Arg Arg Ile 195 200 205 Tyr Glu Ala Ile Met Ala Gly Asp Cys Asp Lys Ala Arg Ile Ala Ser 210 215 220 Ala Val His Leu Gln Asp Ala Asn His Arg Leu Ala Thr Ala225 230 235 231779DNAVibrio splendidus 23atggaactca acacgattat tgtcggcatt tatttcctat tcttgattgc gataggttgg 60atgtttagaa catttacaag tactactagt gactacttcc gcgggggcgg taacatgttg 120tggtggatgg ttggtgcaac cgcctttatg acccagttta gtgcatggac attcaccggt 180gcagcaggta aagcgtataa cgatggtttc gctgtagcgg tcatcttcgt agccaacgca 240tttggttact tcatgaacta cgcgtacttc gcgccgaaat tccgtcaact tcgcgttgtt 300acggtaatcg aagcgattcg tatgcgtttt ggtgcgacca acgaacaagt attcacttgg 360tcttcaatgc caaactcagt ggtatctgcg ggtgtgtggt taaacgcatt ggcaatcatc 420gcttcgggta tcttcggttt cgacatgaac atgactatct 
gggtgactgg cctagtggta 480ttggcaatgt cggtaacagg tggttcatgg gcggtaatcg catctgactt catgcagatg 540gttatcatca tggcggtaac ggtaacttgt gcggttgtag cggttgttca aggtggcggt 600gttggtgaga ttgttaacaa cttcccagta caagatggtg gttcgttcct ttggggcaac 660aacatcaact acctaagcat ctttacgatt tgggcattct tcatcttcgt taagcagttc 720tcaatcacga acaacatgct taactcttac cgttacctag cggctaaaga ctcaaagaac 780gctaagaaag ctgcactgct tgcttgtgtg ttgatgttgt gtggtgtgtt tatttggttc 840atgccttctt ggttcattgc aggccaaggt gttgatttat cagcggctta cccgaatgca 900ggtaaaaaag cgggtgactt tgcttaccta tacttcgtac aagagtacat gccagcaggt 960atggttggtc tattagttgc cgcgatgttt gcagcgacaa tgtcttcaat ggactcaggt 1020ctaaaccgta actcaggtat ttttgttaag aacttctacg aaacaatcgt tcgtaaaggt 1080caagcatcag agaaagagct agtaaccgta tctaaaatta cttcagcggt atttggtttc 1140gctattatcc taatcgcaca gttcatcaac tcattaaaag gcttaagcct gtttgatacg 1200atgatgtacg taggtgcgtt aatcggcttc cctatgacga ttcctgcatt ccttggtttc 1260ttcatcaaga agactccgga ctgggctggt tggggaacgc tagttgttgg tggtatcgta 1320tcttatgtgg ttggttttgt tatcaacgcg gagatggtag cagcggcgtt tggtcttgat 1380actctaacag gacgtgaatg gtctgatgtt aaagttgcga ttggtctgat tgctcacatc 1440acgctaaccg gtggcttctt cgtactatct acgatgttct acaagcctct atcaaaagaa 1500cgtcaagcgg atgttgataa gttctttggc aacttagata ccccattagt agctgaatcg 1560gcagagcaaa aagtgttgga taacaaacaa cgtcaaatgc ttggtaaact gattgcggta 1620gcgggtgttg gtattatgct gatggctctt ctgactaacc caatgtgggg gcgcctagtc 1680ttcatcttat gtggtgtgat agtgggtggt gtcggtattc tacttgtgaa agcggtcgat 1740gacggcggca agcaagcgaa agcagtaacc gaaagctaa 177924592PRTVibrio splendidus 24Met Glu Leu Asn Thr Ile Ile Val Gly Ile Tyr Phe Leu Phe Leu Ile1 5 10 15 Ala Ile Gly Trp Met Phe Arg Thr Phe Thr Ser Thr Thr Ser Asp Tyr 20 25 30 Phe Arg Gly Gly Gly Asn Met Leu Trp Trp Met Val Gly Ala Thr Ala 35 40 45 Phe Met Thr Gln Phe Ser Ala Trp Thr Phe Thr Gly Ala Ala Gly Lys 50 55 60 Ala Tyr Asn Asp Gly Phe Ala Val Ala Val Ile Phe Val Ala Asn Ala65 70 75 80 Phe Gly Tyr Phe Met Asn Tyr Ala Tyr Phe Ala Pro Lys Phe Arg Gln 85 90 
95 Leu Arg Val Val Thr Val Ile Glu Ala Ile Arg Met Arg Phe Gly Ala 100 105 110 Thr Asn Glu Gln Val Phe Thr Trp Ser Ser Met Pro Asn Ser Val Val 115 120 125 Ser Ala Gly Val Trp Leu Asn Ala Leu Ala Ile Ile Ala Ser Gly Ile 130 135 140 Phe Gly Phe Asp Met Asn Met Thr Ile Trp Val Thr Gly Leu Val Val145 150 155 160 Leu Ala Met Ser Val Thr Gly Gly Ser Trp Ala Val Ile Ala Ser Asp 165 170 175 Phe Met Gln Met Val Ile Ile Met Ala Val Thr Val Thr Cys Ala Val 180 185 190 Val Ala Val Val Gln Gly Gly Gly Val Gly Glu Ile Val Asn Asn Phe 195 200 205 Pro Val Gln Asp Gly Gly Ser Phe Leu Trp Gly Asn Asn Ile Asn Tyr 210 215 220 Leu Ser Ile Phe Thr Ile Trp Ala Phe Phe Ile Phe Val Lys Gln Phe225 230 235 240 Ser Ile Thr Asn Asn Met Leu Asn Ser Tyr Arg Tyr Leu Ala Ala Lys 245 250 255 Asp Ser Lys Asn Ala Lys Lys Ala Ala Leu Leu Ala Cys Val Leu Met 260 265 270 Leu Cys Gly Val Phe Ile Trp Phe Met Pro Ser Trp Phe Ile Ala Gly 275 280 285 Gln Gly Val Asp Leu Ser Ala Ala Tyr Pro Asn Ala Gly Lys Lys Ala 290 295 300 Gly Asp Phe Ala Tyr Leu Tyr Phe Val Gln Glu Tyr Met Pro Ala Gly305 310 315 320 Met Val Gly Leu Leu Val Ala Ala Met Phe Ala Ala Thr Met Ser Ser 325 330 335 Met Asp Ser Gly Leu Asn Arg Asn Ser Gly Ile Phe Val Lys Asn Phe 340 345 350 Tyr Glu Thr Ile Val Arg Lys Gly Gln Ala Ser Glu Lys Glu Leu Val 355 360 365 Thr Val Ser Lys Ile Thr Ser Ala Val Phe Gly Phe Ala Ile Ile Leu 370 375 380 Ile Ala Gln Phe Ile Asn Ser Leu Lys Gly Leu Ser Leu Phe Asp Thr385 390 395 400 Met Met Tyr Val Gly Ala Leu Ile Gly Phe Pro Met Thr Ile Pro Ala 405 410 415 Phe Leu Gly Phe Phe Ile Lys Lys Thr Pro Asp Trp Ala Gly Trp Gly 420 425 430 Thr Leu Val Val Gly Gly Ile Val Ser Tyr Val Val Gly Phe Val Ile 435 440 445 Asn Ala Glu Met Val Ala Ala Ala Phe Gly Leu Asp Thr Leu Thr Gly 450 455 460 Arg Glu Trp Ser Asp Val Lys Val Ala Ile Gly Leu Ile Ala His Ile465 470 475 480 Thr Leu Thr Gly Gly Phe Phe Val Leu Ser Thr Met Phe Tyr Lys Pro 485 490 495 Leu Ser Lys Glu Arg Gln Ala Asp Val Asp Lys Phe Phe Gly Asn Leu 500 505 510 Asp 
Thr Pro Leu Val Ala Glu Ser Ala Glu Gln Lys Val Leu Asp Asn 515 520 525 Lys Gln Arg Gln Met Leu Gly Lys Leu Ile Ala Val Ala Gly Val Gly 530 535 540 Ile Met Leu Met Ala Leu Leu Thr Asn Pro Met Trp Gly Arg Leu Val545 550 555 560 Phe Ile Leu Cys Gly Val Ile Val Gly Gly Val Gly Ile Leu Leu Val 565 570 575 Lys Ala Val Asp Asp Gly Gly Lys Gln Ala Lys Ala Val Thr Glu Ser 580 585 590 252079DNAVibrio splendidus 25atgagcgacc aaaaatctct tgatgcaatc aggaagatga agctggaaaa cgatacttca 60gcaggtaatc ttgtagacct actccctatc gaagttcaaa cacgtgactt cgacctatca 120ttcctagaca ccttgagcga agcacgtccg cgtcttcttg ttcaagctga tcagctagaa 180gaattcaaag caaaagtgaa agctgatcaa gctcactgta tgtttgatga tttctacaac 240aactctaccg ttaagttcct tgagactgct cctttcgaag agcctcaagc gtacccagct 300gagacggtag gtaaagcttc tctatggcgt ccttattggc gtcaaatgta cgttgattgc 360caaatggcac tgaacgcgac acgtaaccta gcgattgctg gtgttgtaaa agaagacgaa 420gcgctcattg cgaaagcaaa agcttggact ctaaaactgt ctacgtacga tccagaaggc 480gtgacttctc gtggctataa cgatgaagcg gctttccgtg ttatcgctgc tatggcttgg 540ggttacgatt ggctacacgg ctacttcacc gatgaagaac gccagcaagt tcaagatgct 600ttgattgagc gtctagacga aatcatgcac cacctgaaag tgacggttga tctattgaac 660aacccactaa atagccacgg tgttcgttct atctcttctg ctatcatccc aacgtgtatc 720gcgctttacc acgatcaccc gaaagcaggc gagtacattg catacgcgct agaatactac 780gcagtacatt acccaccatg gggcggtgta gacggcggtt gggctgaagg tcctgattac 840tggaacacgc aaactgcatt cctaggcgaa gcattcgacc tattgaaagc atactgtggt 900gtagacatgt ttaacaaaac attctacgaa aacacaggtg atttcccgct ttactgcatg 960ccagttcact ctaagcgcgc gagcttctgt gaccagtctt caatcggcga tttcccaggt 1020ttaaaactgg cttacaacat caagcactac gcaggtgtta accagaagcc tgagtacgtt 1080tggtactata accagcttaa aggccgtgat actgaagcac acaccaaatt ctacaacttc 1140ggttggtggg acttcggtta tgacgatctt cgttttaact tcctttggga tgcacctgaa 1200gagaaagccc catcgaacga tccactgttg aaagtattcc caatcacggg ttgggctgca 1260ttccacaaca agatgactga gcgtgataac catattcaca tggtattcaa atgttctccg 1320tttggctcaa tcagccactc tcacggtgac caaaacgcat ttacgcttca cgcatttggt 
1380gaaacgctag cgtcagtaac aggttactat ggtggtttcg gtgtagacat gcacacgaaa 1440tggcgtcgtc aaacgttctc taaaaacctg ccactatttg gcggtaaagg tcagtacggc 1500gagaacaaga acacaggcta cgaaaaccac caagatcgct tttgtatcga agcgggcggc 1560actatctctg acttcgacac tgaatctgat gtgaagatgg ttgaaggtga tgcaacggca 1620tcttacaagt acttcgttcc tgaaatcgaa tcttacaagc gtaaagtctg gttcgttcaa 1680ggtaaagtct tcgtaatgca agacaaggca acgctttctg aagagaaaga catgacttgg 1740ctaatgcaca caactttcgc aaacgaagtg gcagacaagt ctttcactat ccgtggcgaa 1800gttgcgcacc tagacgtaaa cttcatcaac gagtctgctg ataacatcac gtcagttaag 1860aacgttgaag gctttggcga agttgaccca tacgagttca aagatcttga gatccaccgt 1920cacgtggaag tggaattcaa gccatcgaaa gagcacaaca tcctgacgct tcttgttcct 1980aataagaatg aaggcgagca agttgaagtg tttcacaagc ttgaaggcaa cacgctactg 2040ctaaatgttg acggcgaaac ggtttcaatc gaactgtaa 207926692PRTVibrio splendidus 26Met Ser Asp Gln Lys Ser Leu Asp Ala Ile Arg Lys Met Lys Leu Glu1 5 10 15 Asn Asp Thr Ser Ala Gly Asn Leu Val Asp Leu Leu Pro Ile Glu Val 20 25 30 Gln Thr Arg Asp Phe Asp Leu Ser Phe Leu Asp Thr Leu Ser Glu Ala 35 40 45 Arg Pro Arg Leu Leu Val Gln Ala Asp Gln Leu Glu Glu Phe Lys Ala 50 55 60 Lys Val Lys Ala Asp Gln Ala His Cys Met Phe Asp Asp Phe Tyr Asn65 70 75 80 Asn Ser Thr Val Lys Phe Leu Glu Thr Ala Pro Phe Glu Glu Pro Gln 85 90 95 Ala Tyr Pro

Ala Glu Thr Val Gly Lys Ala Ser Leu Trp Arg Pro Tyr 100 105 110 Trp Arg Gln Met Tyr Val Asp Cys Gln Met Ala Leu Asn Ala Thr Arg 115 120 125 Asn Leu Ala Ile Ala Gly Val Val Lys Glu Asp Glu Ala Leu Ile Ala 130 135 140 Lys Ala Lys Ala Trp Thr Leu Lys Leu Ser Thr Tyr Asp Pro Glu Gly145 150 155 160 Val Thr Ser Arg Gly Tyr Asn Asp Glu Ala Ala Phe Arg Val Ile Ala 165 170 175 Ala Met Ala Trp Gly Tyr Asp Trp Leu His Gly Tyr Phe Thr Asp Glu 180 185 190 Glu Arg Gln Gln Val Gln Asp Ala Leu Ile Glu Arg Leu Asp Glu Ile 195 200 205 Met His His Leu Lys Val Thr Val Asp Leu Leu Asn Asn Pro Leu Asn 210 215 220 Ser His Gly Val Arg Ser Ile Ser Ser Ala Ile Ile Pro Thr Cys Ile225 230 235 240 Ala Leu Tyr His Asp His Pro Lys Ala Gly Glu Tyr Ile Ala Tyr Ala 245 250 255 Leu Glu Tyr Tyr Ala Val His Tyr Pro Pro Trp Gly Gly Val Asp Gly 260 265 270 Gly Trp Ala Glu Gly Pro Asp Tyr Trp Asn Thr Gln Thr Ala Phe Leu 275 280 285 Gly Glu Ala Phe Asp Leu Leu Lys Ala Tyr Cys Gly Val Asp Met Phe 290 295 300 Asn Lys Thr Phe Tyr Glu Asn Thr Gly Asp Phe Pro Leu Tyr Cys Met305 310 315 320 Pro Val His Ser Lys Arg Ala Ser Phe Cys Asp Gln Ser Ser Ile Gly 325 330 335 Asp Phe Pro Gly Leu Lys Leu Ala Tyr Asn Ile Lys His Tyr Ala Gly 340 345 350 Val Asn Gln Lys Pro Glu Tyr Val Trp Tyr Tyr Asn Gln Leu Lys Gly 355 360 365 Arg Asp Thr Glu Ala His Thr Lys Phe Tyr Asn Phe Gly Trp Trp Asp 370 375 380 Phe Gly Tyr Asp Asp Leu Arg Phe Asn Phe Leu Trp Asp Ala Pro Glu385 390 395 400 Glu Lys Ala Pro Ser Asn Asp Pro Leu Leu Lys Val Phe Pro Ile Thr 405 410 415 Gly Trp Ala Ala Phe His Asn Lys Met Thr Glu Arg Asp Asn His Ile 420 425 430 His Met Val Phe Lys Cys Ser Pro Phe Gly Ser Ile Ser His Ser His 435 440 445 Gly Asp Gln Asn Ala Phe Thr Leu His Ala Phe Gly Glu Thr Leu Ala 450 455 460 Ser Val Thr Gly Tyr Tyr Gly Gly Phe Gly Val Asp Met His Thr Lys465 470 475 480 Trp Arg Arg Gln Thr Phe Ser Lys Asn Leu Pro Leu Phe Gly Gly Lys 485 490 495 Gly Gln Tyr Gly Glu Asn Lys Asn Thr Gly Tyr Glu Asn His Gln Asp 500 505 510 Arg Phe Cys Ile Glu 
Ala Gly Gly Thr Ile Ser Asp Phe Asp Thr Glu 515 520 525 Ser Asp Val Lys Met Val Glu Gly Asp Ala Thr Ala Ser Tyr Lys Tyr 530 535 540 Phe Val Pro Glu Ile Glu Ser Tyr Lys Arg Lys Val Trp Phe Val Gln545 550 555 560 Gly Lys Val Phe Val Met Gln Asp Lys Ala Thr Leu Ser Glu Glu Lys 565 570 575 Asp Met Thr Trp Leu Met His Thr Thr Phe Ala Asn Glu Val Ala Asp 580 585 590 Lys Ser Phe Thr Ile Arg Gly Glu Val Ala His Leu Asp Val Asn Phe 595 600 605 Ile Asn Glu Ser Ala Asp Asn Ile Thr Ser Val Lys Asn Val Glu Gly 610 615 620 Phe Gly Glu Val Asp Pro Tyr Glu Phe Lys Asp Leu Glu Ile His Arg625 630 635 640 His Val Glu Val Glu Phe Lys Pro Ser Lys Glu His Asn Ile Leu Thr 645 650 655 Leu Leu Val Pro Asn Lys Asn Glu Gly Glu Gln Val Glu Val Phe His 660 665 670 Lys Leu Glu Gly Asn Thr Leu Leu Leu Asn Val Asp Gly Glu Thr Val 675 680 685 Ser Ile Glu Leu 690 27882DNAVibrio splendidus 27atgactaaac ctgtaatcgg tttcattggc ctaggtctta tgggcggcaa catggttgaa 60aacctacaaa agcgcggcta ccacgtaaac gtaatggatc taagcgctga agctgttgct 120cgcgtaacag atcgcggcaa cgcaactgca ttcacttctg ctaaagaact agctgctgca 180agtgacatcg ttcagttttg tctgacaact tctgctgttg ttgaaaaaat cgtttacggc 240gaagacggcg ttctagcggg catcaaagaa ggcgcagtac tagtagactt cggtacttct 300atccctgctt ctactaagaa aatcggcgca gctcttgctg aaaaaggcgc gggcatgatc 360gacgcacctc taggtcgtac tcctgcacac gctaaagatg gtcttctgaa catcatggct 420gctggcgaca tggaaacttt caacaaagtt aaacctgttc ttgaagagca aggcgaaaac 480gtattccacc taggggctct aggttctggt cacgtgacta agcttgtaaa caacttcatg 540ggtatgacga ctgttgcgac tatgtctcaa gctttcgctg ttgctcaacg cgctggtgtt 600gatggccaac aactgtttga catcatgtct gcaggtccat ctaactctcc gttcatgcaa 660ttctgtaagt tctacgcggt agacggcgaa gagaagctag gtttctctgt tgctaacgca 720aacaaagacc ttggttactt ccttgcactt tgtgaagagc taggtactga gtctctaatc 780gctcaaggta ctgcaacaag cctacaagct gctgttgatg caggcatggg taacaacgac 840gtaccagtaa tcttcgacta cttcgctaaa ctagagaagt aa 88228293PRTVibrio splendidus 28Met Thr Lys Pro Val Ile Gly Phe Ile Gly Leu Gly Leu Met Gly Gly1 5 10 15 Asn Met Val 
Glu Asn Leu Gln Lys Arg Gly Tyr His Val Asn Val Met 20 25 30 Asp Leu Ser Ala Glu Ala Val Ala Arg Val Thr Asp Arg Gly Asn Ala 35 40 45 Thr Ala Phe Thr Ser Ala Lys Glu Leu Ala Ala Ala Ser Asp Ile Val 50 55 60 Gln Phe Cys Leu Thr Thr Ser Ala Val Val Glu Lys Ile Val Tyr Gly65 70 75 80 Glu Asp Gly Val Leu Ala Gly Ile Lys Glu Gly Ala Val Leu Val Asp 85 90 95 Phe Gly Thr Ser Ile Pro Ala Ser Thr Lys Lys Ile Gly Ala Ala Leu 100 105 110 Ala Glu Lys Gly Ala Gly Met Ile Asp Ala Pro Leu Gly Arg Thr Pro 115 120 125 Ala His Ala Lys Asp Gly Leu Leu Asn Ile Met Ala Ala Gly Asp Met 130 135 140 Glu Thr Phe Asn Lys Val Lys Pro Val Leu Glu Glu Gln Gly Glu Asn145 150 155 160 Val Phe His Leu Gly Ala Leu Gly Ser Gly His Val Thr Lys Leu Val 165 170 175 Asn Asn Phe Met Gly Met Thr Thr Val Ala Thr Met Ser Gln Ala Phe 180 185 190 Ala Val Ala Gln Arg Ala Gly Val Asp Gly Gln Gln Leu Phe Asp Ile 195 200 205 Met Ser Ala Gly Pro Ser Asn Ser Pro Phe Met Gln Phe Cys Lys Phe 210 215 220 Tyr Ala Val Asp Gly Glu Glu Lys Leu Gly Phe Ser Val Ala Asn Ala225 230 235 240 Asn Lys Asp Leu Gly Tyr Phe Leu Ala Leu Cys Glu Glu Leu Gly Thr 245 250 255 Glu Ser Leu Ile Ala Gln Gly Thr Ala Thr Ser Leu Gln Ala Ala Val 260 265 270 Asp Ala Gly Met Gly Asn Asn Asp Val Pro Val Ile Phe Asp Tyr Phe 275 280 285 Ala Lys Leu Glu Lys 290 291872DNAVibrio splendidus 29atggtagcgg tcgtcagttc tagtgctttg gcatttacga actggtttac gcttaacttg 60gccactgaac aggtaaacca aacgatttat aacgagattg atcactcgct tacgatagaa 120atcaatcaaa tagaaagtac cgttcagcgc accatcgata ccgttaactc tgttgcacaa 180gagttcatga aatcccctta ccaagtgccg aatgaagcac tcatgcatta tgccgctaag 240cttggtggca ttgacaagat tgtggtgggt tttgacgacg gccgttctta tacctctcgc 300ccttcagagt ctttccctaa cggtgttgga ataaaagaaa aatacaatcc aaccactcga 360ccttggtatc aacaagcgaa attgaaatca ggcttatctt ttagtggtct gtttttcact 420aagagtactc aagtgcctat gatcggtgtg acctactcat accaagatcg tgtcatcatg 480gccgatatac gctttgacga tttggaaacg cagcttgaac agctggacag catctacgaa 540gccaaaggca ttatcatcga cgaaaagggg atggtggtcg 
cttcaacaat cgaaaacgtg 600cttccgcaaa ccaatatatc ttctgcagac actcaaatga aactcaacag tgccattgaa 660cagcctgatc aattcattga gggtgtgatt gatggtaacc agagaatctt gatggccaag 720aaagtggata ttggcagcca gaaagagtgg ttcatgatct ccagtattga ccctgaactc 780gcgctcaatc agctgaatgg cgtgatgtcg agtgcgcgca tccttatcgt cgcttgtgta 840cttggctcgg tgatattgat gattttactt ctgaatcgtt tctaccgccc aatcgtgtca 900ctgcgcaaaa tcgtccacga tctatcacaa ggtaacggag acctcactca aaggcttgct 960gagaagggga atgatgactt agggcatatc gccaaagaca tcaacttgtt cattatcggc 1020ttacaagaga tggttaagga tgtgaaatac aagaactcgg atctcgatac caaggtactg 1080agtattcgcg aaggttgtaa agaaaccagc gatgtactga aagttcatac tgatgaaacg 1140gttcaagtgg tctctgcgat taacggcttg tctgaagcat caaacgaagt agagaagagt 1200tctcagtcgg cggcagaagc agcaagagag gccgctgtgt tcagtgatga gacgaaacag 1260attaacacgg tgacggaaac ctatatcagt gatcttgaga agcaagtctg caccacttct 1320gatgacattc gctcaatggc caatgaaacg cagagcatcc agtctatcgt gtctgtgatt 1380ggcggaattg cggaacaaac taatttgctg gcattgaatg cgtcaattga agcggcgagg 1440gcgggtgaac atggtcgagg tttcgcggtg gttgctgatg aagtccgtgc gctagccaac 1500cgaacgcaaa tcagtacctc tgaaattgat gaagcgttat ctggcttgca gtctaaatca 1560gatggtttgg ttaaatctat tgagttgacc aaaagtaact gtgaactgac tcgcgctcaa 1620gttgttcaag ctgtaaacat gttggcgaag ctaaccgagc agatggaaac agtaagtcgt 1680tttaataatg acatttcggg ttcgtctgtt gagcaaaacg cccttattca gagcattgct 1740aagaacatgc ataagattga aagctttgtt gaggagctta ataaactaag ccaagatcag 1800ttaactgaat cagcagaaat caaaacactt aacggtagcg ttagtgaatt gatgagcagc 1860tttaaggttt aa 187230623PRTVibrio splendidus 30Met Val Ala Val Val Ser Ser Ser Ala Leu Ala Phe Thr Asn Trp Phe1 5 10 15 Thr Leu Asn Leu Ala Thr Glu Gln Val Asn Gln Thr Ile Tyr Asn Glu 20 25 30 Ile Asp His Ser Leu Thr Ile Glu Ile Asn Gln Ile Glu Ser Thr Val 35 40 45 Gln Arg Thr Ile Asp Thr Val Asn Ser Val Ala Gln Glu Phe Met Lys 50 55 60 Ser Pro Tyr Gln Val Pro Asn Glu Ala Leu Met His Tyr Ala Ala Lys65 70 75 80 Leu Gly Gly Ile Asp Lys Ile Val Val Gly Phe Asp Asp Gly Arg Ser 85 90 95 Tyr Thr Ser Arg Pro Ser 
Glu Ser Phe Pro Asn Gly Val Gly Ile Lys 100 105 110 Glu Lys Tyr Asn Pro Thr Thr Arg Pro Trp Tyr Gln Gln Ala Lys Leu 115 120 125 Lys Ser Gly Leu Ser Phe Ser Gly Leu Phe Phe Thr Lys Ser Thr Gln 130 135 140 Val Pro Met Ile Gly Val Thr Tyr Ser Tyr Gln Asp Arg Val Ile Met145 150 155 160 Ala Asp Ile Arg Phe Asp Asp Leu Glu Thr Gln Leu Glu Gln Leu Asp 165 170 175 Ser Ile Tyr Glu Ala Lys Gly Ile Ile Ile Asp Glu Lys Gly Met Val 180 185 190 Val Ala Ser Thr Ile Glu Asn Val Leu Pro Gln Thr Asn Ile Ser Ser 195 200 205 Ala Asp Thr Gln Met Lys Leu Asn Ser Ala Ile Glu Gln Pro Asp Gln 210 215 220 Phe Ile Glu Gly Val Ile Asp Gly Asn Gln Arg Ile Leu Met Ala Lys225 230 235 240 Lys Val Asp Ile Gly Ser Gln Lys Glu Trp Phe Met Ile Ser Ser Ile 245 250 255 Asp Pro Glu Leu Ala Leu Asn Gln Leu Asn Gly Val Met Ser Ser Ala 260 265 270 Arg Ile Leu Ile Val Ala Cys Val Leu Gly Ser Val Ile Leu Met Ile 275 280 285 Leu Leu Leu Asn Arg Phe Tyr Arg Pro Ile Val Ser Leu Arg Lys Ile 290 295 300 Val His Asp Leu Ser Gln Gly Asn Gly Asp Leu Thr Gln Arg Leu Ala305 310 315 320 Glu Lys Gly Asn Asp Asp Leu Gly His Ile Ala Lys Asp Ile Asn Leu 325 330 335 Phe Ile Ile Gly Leu Gln Glu Met Val Lys Asp Val Lys Tyr Lys Asn 340 345 350 Ser Asp Leu Asp Thr Lys Val Leu Ser Ile Arg Glu Gly Cys Lys Glu 355 360 365 Thr Ser Asp Val Leu Lys Val His Thr Asp Glu Thr Val Gln Val Val 370 375 380 Ser Ala Ile Asn Gly Leu Ser Glu Ala Ser Asn Glu Val Glu Lys Ser385 390 395 400 Ser Gln Ser Ala Ala Glu Ala Ala Arg Glu Ala Ala Val Phe Ser Asp 405 410 415 Glu Thr Lys Gln Ile Asn Thr Val Thr Glu Thr Tyr Ile Ser Asp Leu 420 425 430 Glu Lys Gln Val Cys Thr Thr Ser Asp Asp Ile Arg Ser Met Ala Asn 435 440 445 Glu Thr Gln Ser Ile Gln Ser Ile Val Ser Val Ile Gly Gly Ile Ala 450 455 460 Glu Gln Thr Asn Leu Leu Ala Leu Asn Ala Ser Ile Glu Ala Ala Arg465 470 475 480 Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg 485 490 495 Ala Leu Ala Asn Arg Thr Gln Ile Ser Thr Ser Glu Ile Asp Glu Ala 500 505 510 Leu Ser Gly Leu Gln Ser Lys Ser 
Asp Gly Leu Val Lys Ser Ile Glu 515 520 525 Leu Thr Lys Ser Asn Cys Glu Leu Thr Arg Ala Gln Val Val Gln Ala 530 535 540 Val Asn Met Leu Ala Lys Leu Thr Glu Gln Met Glu Thr Val Ser Arg545 550 555 560 Phe Asn Asn Asp Ile Ser Gly Ser Ser Val Glu Gln Asn Ala Leu Ile 565 570 575 Gln Ser Ile Ala Lys Asn Met His Lys Ile Glu Ser Phe Val Glu Glu 580 585 590 Leu Asn Lys Leu Ser Gln Asp Gln Leu Thr Glu Ser Ala Glu Ile Lys 595 600 605 Thr Leu Asn Gly Ser Val Ser Glu Leu Met Ser Ser Phe Lys Val 610 615 620 311743DNAVibrio splendidus 31gtgaataagc caatctttgt cgtcgtactc gcttcgctta cgtatggctg cggtggaagc 60agctccagtg actctagtga cccttctgat accaataact caggagcatc ttatggtgtt 120gttgctccct atgatattgc caagtatcaa aacatccttt ccagctcaga tcttcaggtg 180tctgatccta atggagagga gggcaataaa acctctgaag tcaaagatgg taacttcgat 240ggttatgtca gtgattattt ttatgctgac gaagagacgg aaaatctgat cttcaaaatg 300gcgaactaca agatgcgctc tgaagttcgt gaaggagaaa acttcgatat caatgaagca 360ggcgtaagac gcagtctaca tgcggaaata agcctacctg atattgagca tgtaatggcg 420agttctcccg cagatcacga tgaagtgacc gtgctacaga tccacaataa aggtacagac 480gagagtggca cgggttatat ccctcatccg ctattgcgtg tggtttggga gcaagaacga 540gatggcctca caggtcacta ctgggcagtc atgaaaaata atgccattga ctgtagcagt 600gccgctgact cttcggattg ttatgccact tcatataatc gctacgattt gggagaggcg 660gatctcgata acttcaccaa gtttgatctt tctgtttatg aaaataccct ttcgatcaaa 720gtgaacgatg aagttaaagt cgacgaagac atcacctact ggcagcatct actgagttac 780tttaaagcgg gtatctacaa tcaatttgaa aatggtgaag ccacggctca ctttcaggca 840ctgcgataca ccaccacaca ggtcaacggc tcaaacgatt gggatattaa tgattggaag 900ttgacgattc ctgcgagtaa agacacttgg tatggaagtg ggggtgacag tgcggctgaa 960ctagaacctg agcgctgcga atcgagcaaa gaccttctcg ccaacgacag tgatgtctac 1020gacagcgata ttggtctttc ttatttcaat accgatgaag ggagagtgca ctttagagcg 1080gatatgggat atggcacctc taccgaaaat tctagctata ttcgctctga gctcagggag 1140ttgtatcaaa gcagtgttca accggattgt agcaccagcg atgaagatac aagttggtat 1200ttggacgaca ctagaacgaa cgctaccagt cacgagttaa ccgcaagctt acgaattgaa 1260gactacccga 
acattaataa ccaagacccg aaagtggtgc ttgggcaaat acacggttgg 1320aagatcaatc aagcattggt gaagttgtta tgggaaggcg agagtaagcc agtaagagtg 1380atactgaact ctgattttga gcgcaacaac caagactgta accattgtga cccgttcagt 1440gtcgagttag gtacttattc ggcaagtgaa gagtggcgat atacgattcg agccaatcaa 1500gacggtatct acttagcgac tcatgattta gatggaacta atacggtttc tcatttaatc 1560ccttggggac aagattacac agataaagat ggggacacgg tctcgttgac gtcagattgg 1620acatcgacag acatcgcttt ctatttcaaa gcgggcatct acccacaatt taagcctgat 1680agcgactatg cgggtgaagt gtttgatgtg agctttagtt ctctaagagc agagcataac 1740tga 174332580PRTVibrio splendidus 32Met Asn Lys Pro Ile Phe Val Val Val Leu Ala Ser Leu Thr Tyr Gly1 5 10 15 Cys Gly Gly Ser Ser Ser Ser Asp Ser Ser Asp Pro Ser Asp Thr Asn 20 25 30 Asn Ser Gly Ala Ser Tyr Gly Val Val Ala Pro Tyr Asp Ile Ala Lys 35 40 45 Tyr Gln Asn Ile Leu Ser Ser Ser Asp Leu Gln Val Ser Asp Pro Asn 50 55 60 Gly Glu Glu Gly Asn Lys Thr Ser Glu Val Lys
Asp Gly Asn Phe Asp65 70 75 80 Gly Tyr Val Ser Asp Tyr Phe Tyr Ala Asp Glu Glu Thr Glu Asn Leu 85 90 95 Ile Phe Lys Met Ala Asn Tyr Lys Met Arg Ser Glu Val Arg Glu Gly 100 105 110 Glu Asn Phe Asp Ile Asn Glu Ala Gly Val Arg Arg Ser Leu His Ala 115 120 125 Glu Ile Ser Leu Pro Asp Ile Glu His Val Met Ala Ser Ser Pro Ala 130 135 140 Asp His Asp Glu Val Thr Val Leu Gln Ile His Asn Lys Gly Thr Asp145 150 155 160 Glu Ser Gly Thr Gly Tyr Ile Pro His Pro Leu Leu Arg Val Val Trp 165 170 175 Glu Gln Glu Arg Asp Gly Leu Thr Gly His Tyr Trp Ala Val Met Lys 180 185 190 Asn Asn Ala Ile Asp Cys Ser Ser Ala Ala Asp Ser Ser Asp Cys Tyr 195 200 205 Ala Thr Ser Tyr Asn Arg Tyr Asp Leu Gly Glu Ala Asp Leu Asp Asn 210 215 220 Phe Thr Lys Phe Asp Leu Ser Val Tyr Glu Asn Thr Leu Ser Ile Lys225 230 235 240 Val Asn Asp Glu Val Lys Val Asp Glu Asp Ile Thr Tyr Trp Gln His 245 250 255 Leu Leu Ser Tyr Phe Lys Ala Gly Ile Tyr Asn Gln Phe Glu Asn Gly 260 265 270 Glu Ala Thr Ala His Phe Gln Ala Leu Arg Tyr Thr Thr Thr Gln Val 275 280 285 Asn Gly Ser Asn Asp Trp Asp Ile Asn Asp Trp Lys Leu Thr Ile Pro 290 295 300 Ala Ser Lys Asp Thr Trp Tyr Gly Ser Gly Gly Asp Ser Ala Ala Glu305 310 315 320 Leu Glu Pro Glu Arg Cys Glu Ser Ser Lys Asp Leu Leu Ala Asn Asp 325 330 335 Ser Asp Val Tyr Asp Ser Asp Ile Gly Leu Ser Tyr Phe Asn Thr Asp 340 345 350 Glu Gly Arg Val His Phe Arg Ala Asp Met Gly Tyr Gly Thr Ser Thr 355 360 365 Glu Asn Ser Ser Tyr Ile Arg Ser Glu Leu Arg Glu Leu Tyr Gln Ser 370 375 380 Ser Val Gln Pro Asp Cys Ser Thr Ser Asp Glu Asp Thr Ser Trp Tyr385 390 395 400 Leu Asp Asp Thr Arg Thr Asn Ala Thr Ser His Glu Leu Thr Ala Ser 405 410 415 Leu Arg Ile Glu Asp Tyr Pro Asn Ile Asn Asn Gln Asp Pro Lys Val 420 425 430 Val Leu Gly Gln Ile His Gly Trp Lys Ile Asn Gln Ala Leu Val Lys 435 440 445 Leu Leu Trp Glu Gly Glu Ser Lys Pro Val Arg Val Ile Leu Asn Ser 450 455 460 Asp Phe Glu Arg Asn Asn Gln Asp Cys Asn His Cys Asp Pro Phe Ser465 470 475 480 Val Glu Leu Gly Thr Tyr Ser Ala Ser Glu Glu Trp Arg Tyr 
Thr Ile 485 490 495 Arg Ala Asn Gln Asp Gly Ile Tyr Leu Ala Thr His Asp Leu Asp Gly 500 505 510 Thr Asn Thr Val Ser His Leu Ile Pro Trp Gly Gln Asp Tyr Thr Asp 515 520 525 Lys Asp Gly Asp Thr Val Ser Leu Thr Ser Asp Trp Thr Ser Thr Asp 530 535 540 Ile Ala Phe Tyr Phe Lys Ala Gly Ile Tyr Pro Gln Phe Lys Pro Asp545 550 555 560 Ser Asp Tyr Ala Gly Glu Val Phe Asp Val Ser Phe Ser Ser Leu Arg 565 570 575 Ala Glu His Asn 580 331569DNAVibrio splendidus 33atgaaacaaa ttactctaaa aactttactc gcttcttcta ttctacttgc ggttggttgt 60gcgagcacga gcacgcctac tgctgatttt ccaaataaca aagaaactgg tgaagcgctt 120ctgacgccag ttgctgtttc cgctagtagc catgatggta acggacctga tcgtctcgtt 180gaccaagacc taactacacg ttggtcatct gcgggtgacg gcgagtgggc aacgctagac 240tatggttcag tacaggagtt tgacgcggtt caggcatctt tcagtaaagg taatcagcgc 300caatctaaat ttgatatcca agtgagtgtt gatggcgaaa gctggacaac ggtactagaa 360aaccaactaa gctcaggtaa agcgatcggc ctagagcgtt tccaatttga gccagtagtg 420caagcacgct acgtaagata cgttggtcac ggtaacacca aaaacggttg gaacagtgtg 480actggattag cggcggttaa ctgtagcatt aacgcatgtc ctgctagcca tatcatcact 540tcagacgtgg ttgcagcaga agccgtgatt attgctgaaa tgaaagcggc agaaaaagca 600cgtaaagatg cgcgcaaaga tctacgctct ggtaacttcg gtgtagcagc ggtttaccct 660tgtgagacga ccgttgaatg tgacactcgc agtgcacttc cagttccgac aggcctgcca 720gcgacaccag ttgcaggtaa ctcgccaagc gaaaactttg acatgacgca ttggtaccta 780tctcaaccat ttgaccatga caaaaatggc aaacctgatg atgtgtctga gtggaacctt 840gcaaacggtt accaacaccc tgaaatcttc tacacagctg atgacggcgg cctagtattc 900aaagcttacg tgaaaggtgt acgtacctct aaaaacacta agtacgcgcg tacagagctt 960cgtgaaatga tgcgtcgtgg tgatcagtct attagcacta aaggtgttaa taagaataac 1020tgggtattct caagcgctcc tgaatctgac ttagagtcgg cagcgggtat tgacggcgtt 1080ctagaagcga cgttgaaaat cgaccatgca acaacgacgg gtaatgcgaa tgaagtaggt 1140cgctttatca ttggtcagat tcacgatcaa aacgatgaac caattcgttt gtactaccgt 1200aaactgccaa accaagaaac gggtgcggtt tacttcgcac atgaaagcca agacgcaact 1260aaagaggact tctaccctct agtgggcgac atgacggctg aagtgggtga cgatggtatc 1320gcgcttggcg aagtgttcag 
ctaccgtatt gacgttaaag gcaacacgat gactgtaacg 1380ctaatacgtg aaggcaaaga cgatgttgta caagtggttg atatgagcaa cagcggctac 1440gacgcaggcg gcaagtacat gtacttcaaa gccggtgttt acaaccaaaa catcagcggc 1500gacctagacg attactcaca agcgactttc tatcagctag atgtatcgca cgatcaatac 1560aaaaagtaa 156934522PRTVibrio splendidus 34Met Lys Gln Ile Thr Leu Lys Thr Leu Leu Ala Ser Ser Ile Leu Leu1 5 10 15 Ala Val Gly Cys Ala Ser Thr Ser Thr Pro Thr Ala Asp Phe Pro Asn 20 25 30 Asn Lys Glu Thr Gly Glu Ala Leu Leu Thr Pro Val Ala Val Ser Ala 35 40 45 Ser Ser His Asp Gly Asn Gly Pro Asp Arg Leu Val Asp Gln Asp Leu 50 55 60 Thr Thr Arg Trp Ser Ser Ala Gly Asp Gly Glu Trp Ala Thr Leu Asp65 70 75 80 Tyr Gly Ser Val Gln Glu Phe Asp Ala Val Gln Ala Ser Phe Ser Lys 85 90 95 Gly Asn Gln Arg Gln Ser Lys Phe Asp Ile Gln Val Ser Val Asp Gly 100 105 110 Glu Ser Trp Thr Thr Val Leu Glu Asn Gln Leu Ser Ser Gly Lys Ala 115 120 125 Ile Gly Leu Glu Arg Phe Gln Phe Glu Pro Val Val Gln Ala Arg Tyr 130 135 140 Val Arg Tyr Val Gly His Gly Asn Thr Lys Asn Gly Trp Asn Ser Val145 150 155 160 Thr Gly Leu Ala Ala Val Asn Cys Ser Ile Asn Ala Cys Pro Ala Ser 165 170 175 His Ile Ile Thr Ser Asp Val Val Ala Ala Glu Ala Val Ile Ile Ala 180 185 190 Glu Met Lys Ala Ala Glu Lys Ala Arg Lys Asp Ala Arg Lys Asp Leu 195 200 205 Arg Ser Gly Asn Phe Gly Val Ala Ala Val Tyr Pro Cys Glu Thr Thr 210 215 220 Val Glu Cys Asp Thr Arg Ser Ala Leu Pro Val Pro Thr Gly Leu Pro225 230 235 240 Ala Thr Pro Val Ala Gly Asn Ser Pro Ser Glu Asn Phe Asp Met Thr 245 250 255 His Trp Tyr Leu Ser Gln Pro Phe Asp His Asp Lys Asn Gly Lys Pro 260 265 270 Asp Asp Val Ser Glu Trp Asn Leu Ala Asn Gly Tyr Gln His Pro Glu 275 280 285 Ile Phe Tyr Thr Ala Asp Asp Gly Gly Leu Val Phe Lys Ala Tyr Val 290 295 300 Lys Gly Val Arg Thr Ser Lys Asn Thr Lys Tyr Ala Arg Thr Glu Leu305 310 315 320 Arg Glu Met Met Arg Arg Gly Asp Gln Ser Ile Ser Thr Lys Gly Val 325 330 335 Asn Lys Asn Asn Trp Val Phe Ser Ser Ala Pro Glu Ser Asp Leu Glu 340 345 350 Ser Ala Ala Gly Ile Asp Gly Val 
Leu Glu Ala Thr Leu Lys Ile Asp 355 360 365 His Ala Thr Thr Thr Gly Asn Ala Asn Glu Val Gly Arg Phe Ile Ile 370 375 380 Gly Gln Ile His Asp Gln Asn Asp Glu Pro Ile Arg Leu Tyr Tyr Arg385 390 395 400 Lys Leu Pro Asn Gln Glu Thr Gly Ala Val Tyr Phe Ala His Glu Ser 405 410 415 Gln Asp Ala Thr Lys Glu Asp Phe Tyr Pro Leu Val Gly Asp Met Thr 420 425 430 Ala Glu Val Gly Asp Asp Gly Ile Ala Leu Gly Glu Val Phe Ser Tyr 435 440 445 Arg Ile Asp Val Lys Gly Asn Thr Met Thr Val Thr Leu Ile Arg Glu 450 455 460 Gly Lys Asp Asp Val Val Gln Val Val Asp Met Ser Asn Ser Gly Tyr465 470 475 480 Asp Ala Gly Gly Lys Tyr Met Tyr Phe Lys Ala Gly Val Tyr Asn Gln 485 490 495 Asn Ile Ser Gly Asp Leu Asp Asp Tyr Ser Gln Ala Thr Phe Tyr Gln 500 505 510 Leu Asp Val Ser His Asp Gln Tyr Lys Lys 515 520 351230DNAVibrio splendidus 35atgcaaattt ctaaagtcgc tacagctgtc gctctttcga caggtttatt atttggttgt 60aacagtgatg gtttacctat tccaacagat ccaggcggaa cagaccctgt tgaacctgtt 120gaagtttact ctatagaaaa cgtctattgg gatctgacag gtggtgctgt tgctgcacag 180tcactcagcg gaacttcacc atatcgcttt gataataatg aggaaggtac tcgtgctcta 240agcatttaca gtggagacgt agctaatggc ttcacttttg agagttcaat atatactgct 300gaagaagaag gtgttgtttc ctttgaaggt aaggactgta cttacacagt gactgagcaa 360cagctagata tgacctgtga aaaagatgac gtagaaacag cttactcagc aacagagatt 420acagatgaat ctgttataac tgcattagaa aatgccgatg atggaaaacc taaatcagtc 480gatgatgtga acgctgcgat tgcatcagca gaagatggcg cgattattga tttatcatct 540gaaggtacgt ttgataccgg tgttattgag ctaaataaag ctgtcacaat tgatggtgct 600ggtttagcaa ccattaccgg agatgcttgt attgatgtca ctgcacccgg tgcaggtatc 660aaaaacatga cttttgctaa cgacaatttg gccgggtgtt ttggtaggga gtcagctggt 720acttcagata atgaaactgg tgcgatcgtt attggtaaaa ttggtaaaga ttcagatcct 780gtagcacttg aaaacctaaa gttcgatgca aacggcatta ccgaagatga tctaggtact 840aaaaaagcaa gttggttatt ctctcgaggt tactttacat tagacaatag cgaatttgtc 900ggtttaagtg gcagtttcca aaataatgca attcgtatta actgtagtag tgacaacggg 960cgatttggtt cacaaatcac aaataataca ttcactatta actctggtgg tagtgatgtg 1020ggcggaatta 
aagttggtga ttctagcagt gccgtcataa agaatagtga tgataacctt 1080ggctgtaatg tcactattga aagcaatacg ttcaatggtt acaaaaccct actttcagct 1140gacaacggta aagatataag aaatacagcc atctacgcac aaccatctgc agtgaacact 1200gcggcaggta aagaaaatat cttgaactaa 123036409PRTVibrio splendidus 36Met Gln Ile Ser Lys Val Ala Thr Ala Val Ala Leu Ser Thr Gly Leu1 5 10 15 Leu Phe Gly Cys Asn Ser Asp Gly Leu Pro Ile Pro Thr Asp Pro Gly 20 25 30 Gly Thr Asp Pro Val Glu Pro Val Glu Val Tyr Ser Ile Glu Asn Val 35 40 45 Tyr Trp Asp Leu Thr Gly Gly Ala Val Ala Ala Gln Ser Leu Ser Gly 50 55 60 Thr Ser Pro Tyr Arg Phe Asp Asn Asn Glu Glu Gly Thr Arg Ala Leu65 70 75 80 Ser Ile Tyr Ser Gly Asp Val Ala Asn Gly Phe Thr Phe Glu Ser Ser 85 90 95 Ile Tyr Thr Ala Glu Glu Glu Gly Val Val Ser Phe Glu Gly Lys Asp 100 105 110 Cys Thr Tyr Thr Val Thr Glu Gln Gln Leu Asp Met Thr Cys Glu Lys 115 120 125 Asp Asp Val Glu Thr Ala Tyr Ser Ala Thr Glu Ile Thr Asp Glu Ser 130 135 140 Val Ile Thr Ala Leu Glu Asn Ala Asp Asp Gly Lys Pro Lys Ser Val145 150 155 160 Asp Asp Val Asn Ala Ala Ile Ala Ser Ala Glu Asp Gly Ala Ile Ile 165 170 175 Asp Leu Ser Ser Glu Gly Thr Phe Asp Thr Gly Val Ile Glu Leu Asn 180 185 190 Lys Ala Val Thr Ile Asp Gly Ala Gly Leu Ala Thr Ile Thr Gly Asp 195 200 205 Ala Cys Ile Asp Val Thr Ala Pro Gly Ala Gly Ile Lys Asn Met Thr 210 215 220 Phe Ala Asn Asp Asn Leu Ala Gly Cys Phe Gly Arg Glu Ser Ala Gly225 230 235 240 Thr Ser Asp Asn Glu Thr Gly Ala Ile Val Ile Gly Lys Ile Gly Lys 245 250 255 Asp Ser Asp Pro Val Ala Leu Glu Asn Leu Lys Phe Asp Ala Asn Gly 260 265 270 Ile Thr Glu Asp Asp Leu Gly Thr Lys Lys Ala Ser Trp Leu Phe Ser 275 280 285 Arg Gly Tyr Phe Thr Leu Asp Asn Ser Glu Phe Val Gly Leu Ser Gly 290 295 300 Ser Phe Gln Asn Asn Ala Ile Arg Ile Asn Cys Ser Ser Asp Asn Gly305 310 315 320 Arg Phe Gly Ser Gln Ile Thr Asn Asn Thr Phe Thr Ile Asn Ser Gly 325 330 335 Gly Ser Asp Val Gly Gly Ile Lys Val Gly Asp Ser Ser Ser Ala Val 340 345 350 Ile Lys Asn Ser Asp Asp Asn Leu Gly Cys Asn Val Thr Ile Glu Ser 355 
360 365 Asn Thr Phe Asn Gly Tyr Lys Thr Leu Leu Ser Ala Asp Asn Gly Lys 370 375 380 Asp Ile Arg Asn Thr Ala Ile Tyr Ala Gln Pro Ser Ala Val Asn Thr385 390 395 400 Ala Ala Gly Lys Glu Asn Ile Leu Asn 405 37861DNAVibrio splendidus 37atgaattctg ttacaaaaat tgctgcagct gttgcatgta ctcttttagc gggcacagct 60gctggtgcat ctcttgatta tcgttacgag tatcgtgctg cgacggatta tacaaagact 120aatggtgata cggctcacgt agacgctcgc catcaacacc gagttaagct aggtgaaagc 180tttaagctgt cagacaagtg gaagcactct actggtctag aacttaagtt ccacggtgat 240gactcttact atgatgaaga ttcaggttct gttaaatcag caaacagcca gagtttttac 300gatggcaatt ggtacatcta tggtatggag atcgataaca ctgcgacata caaaatagac 360aataattggt atctacaaat gggtatgcct attgcttggg attgggatga gcctaatgct 420aacgatggcg actggaagat gaaaaaggtt acgtttaaac ctcagttccg cgttggctat 480aaagcagata tgggtttaac aactgctatt cgttaccgtc atgaatatgc tgacttccgt 540aaccacacac aatttggcga caaagattct gaaactggcg agcgtttaga atcagctcaa 600aagtctaaag ttacactgac gggctcttac aaaattgaat ctctacctaa gcttggcctt 660tcttacgaag caaactatgt aaaatctttg gataacgtac ttctttataa tagtgatgac 720tgggaatggg atgctggctt aaaggtaaac tacaagttcg gttcttggaa accttttgct 780gaaatctggt cttctgatat cagttcatct tcaaaagatc gtgaagctaa ataccgtgtt 840ggtattgctt actcattcta a 86138286PRTVibrio splendidus 38Met Asn Ser Val Thr Lys Ile Ala Ala Ala Val Ala Cys Thr Leu Leu1 5 10 15 Ala Gly Thr Ala Ala Gly Ala Ser Leu Asp Tyr Arg Tyr Glu Tyr Arg 20 25 30 Ala Ala Thr Asp Tyr Thr Lys Thr Asn Gly Asp Thr Ala His Val Asp 35 40 45 Ala Arg His Gln His Arg Val Lys Leu Gly Glu Ser Phe Lys Leu Ser 50 55 60 Asp Lys Trp Lys His Ser Thr Gly Leu Glu Leu Lys Phe His Gly Asp65 70 75 80 Asp Ser Tyr Tyr Asp Glu Asp Ser Gly Ser Val Lys Ser Ala Asn Ser 85 90 95 Gln Ser Phe Tyr Asp Gly Asn Trp Tyr Ile Tyr Gly Met Glu Ile Asp 100 105 110 Asn Thr Ala Thr Tyr Lys Ile Asp Asn Asn Trp Tyr Leu Gln Met Gly 115 120 125 Met Pro Ile Ala Trp Asp Trp Asp Glu Pro Asn Ala Asn Asp Gly Asp 130 135 140 Trp Lys Met Lys Lys Val Thr Phe Lys Pro Gln Phe Arg Val Gly Tyr145 150 155 160 
Lys Ala Asp Met Gly Leu Thr Thr Ala Ile Arg Tyr Arg His Glu Tyr 165 170 175 Ala Asp Phe Arg Asn His Thr Gln Phe Gly Asp Lys Asp Ser Glu Thr 180 185 190 Gly Glu Arg Leu Glu Ser Ala Gln Lys Ser Lys Val Thr Leu Thr Gly 195 200 205 Ser Tyr Lys Ile Glu Ser Leu Pro Lys Leu Gly Leu Ser Tyr Glu Ala 210 215 220 Asn Tyr Val Lys Ser Leu Asp Asn Val Leu Leu Tyr Asn Ser Asp Asp225 230 235 240 Trp Glu Trp Asp Ala Gly Leu Lys Val Asn Tyr Lys Phe Gly Ser Trp 245 250 255 Lys Pro Phe Ala Glu Ile Trp Ser Ser Asp Ile Ser Ser Ser Ser Lys 260 265
270 Asp Arg Glu Ala Lys Tyr Arg Val Gly Ile Ala Tyr Ser Phe 275 280 285 391038DNAVibrio splendidus 39atgtttaaga aaaacatatt agcagtggcg ttattagcga ctgtgccaat ggttactttc 60gcaaataacg gtgtttctta ccccgtacct gccgataaat tcgatatgca taattggaaa 120ataaccatac cttcagatat taatgaagat ggtcgcgttg atgaaataga aggggtcgct 180atgatgagct actcacatag tgatttcttc catcttgata aagacggcaa ccttgtattt 240gaagtgcaga accaagcgat tacgacgaaa aactcgaaga atgcgcgttc tgagttacgc 300cagatgccaa gaggcgcaga tttctctatc gatacggctg ataaaggaaa ccagtgggca 360ctgtcgagtc acccagcggc tagtgaatac agtgctgtgg gcggaacatt agaagcgaca 420ttaaaagtga atcacgtctc agttaacgct aagttcccag aaaaataccc agctcattct 480gttgtggttg gtcagattca tgctaaaaaa cacaacgagc taatcaaagc tggaaccggt 540tatgggcatg gtaatgaacc actaaagatc ttctataaga agtttcctga ccaagaaatg 600ggttcagtat tctggaacta tgaacgtaac ctagagaaaa aagatcctaa ccgtgccgat 660atcgcttatc cagtgtgggg taacacgtgg gaaaaccctg cagagccggg tgaagccggt 720attgctcttg gtgaagagtt tagctacaaa gtggaagtga aaggcaccat gatgtaccta 780acgtttgaaa ccgagcgtca cgataccgtt aagtatgaaa tcgacctgag taagggcatc 840gatgaacttg actcaccaac gggctatgct gaagatgatt tttactacaa agcgggcgca 900tacggccaat gtagcgtgag cgattctcac cctgtatggg ggcctggttg tggcggtact 960ggcgatttcg ctgtcgataa aaagaatggc gattacaaca gtgtgacttt ctctgcgctt 1020aagttaaacg gtaaatag 103840345PRTVibrio splendidus 40Met Phe Lys Lys Asn Ile Leu Ala Val Ala Leu Leu Ala Thr Val Pro1 5 10 15 Met Val Thr Phe Ala Asn Asn Gly Val Ser Tyr Pro Val Pro Ala Asp 20 25 30 Lys Phe Asp Met His Asn Trp Lys Ile Thr Ile Pro Ser Asp Ile Asn 35 40 45 Glu Asp Gly Arg Val Asp Glu Ile Glu Gly Val Ala Met Met Ser Tyr 50 55 60 Ser His Ser Asp Phe Phe His Leu Asp Lys Asp Gly Asn Leu Val Phe65 70 75 80 Glu Val Gln Asn Gln Ala Ile Thr Thr Lys Asn Ser Lys Asn Ala Arg 85 90 95 Ser Glu Leu Arg Gln Met Pro Arg Gly Ala Asp Phe Ser Ile Asp Thr 100 105 110 Ala Asp Lys Gly Asn Gln Trp Ala Leu Ser Ser His Pro Ala Ala Ser 115 120 125 Glu Tyr Ser Ala Val Gly Gly Thr Leu Glu Ala Thr Leu Lys Val Asn 130 135 140 His 
Val Ser Val Asn Ala Lys Phe Pro Glu Lys Tyr Pro Ala His Ser145 150 155 160 Val Val Val Gly Gln Ile His Ala Lys Lys His Asn Glu Leu Ile Lys 165 170 175 Ala Gly Thr Gly Tyr Gly His Gly Asn Glu Pro Leu Lys Ile Phe Tyr 180 185 190 Lys Lys Phe Pro Asp Gln Glu Met Gly Ser Val Phe Trp Asn Tyr Glu 195 200 205 Arg Asn Leu Glu Lys Lys Asp Pro Asn Arg Ala Asp Ile Ala Tyr Pro 210 215 220 Val Trp Gly Asn Thr Trp Glu Asn Pro Ala Glu Pro Gly Glu Ala Gly225 230 235 240 Ile Ala Leu Gly Glu Glu Phe Ser Tyr Lys Val Glu Val Lys Gly Thr 245 250 255 Met Met Tyr Leu Thr Phe Glu Thr Glu Arg His Asp Thr Val Lys Tyr 260 265 270 Glu Ile Asp Leu Ser Lys Gly Ile Asp Glu Leu Asp Ser Pro Thr Gly 275 280 285 Tyr Ala Glu Asp Asp Phe Tyr Tyr Lys Ala Gly Ala Tyr Gly Gln Cys 290 295 300 Ser Val Ser Asp Ser His Pro Val Trp Gly Pro Gly Cys Gly Gly Thr305 310 315 320 Gly Asp Phe Ala Val Asp Lys Lys Asn Gly Asp Tyr Asn Ser Val Thr 325 330 335 Phe Ser Ala Leu Lys Leu Asn Gly Lys 340 345 41897DNAVibrio splendidus 41atggataact ctccggtgct gagccgattt ttagagaatg gatttttact ccagcagaaa 60ctgagccttg ttctttgttg tgtgttgatc gcagcttctg catggatttt aggacagctt 120gcatggttta ttgaacctgc tgagcaaacc gtcgtgccat ggacagcaac ggcttcctcg 180tcttcaacgc ctcaatcgac tcttgatatc tcttctttgc agcagagcaa catgtttggt 240gcttataacc caaccacgcc tgctgtggtt gagcagcaag ttatccaaga tgcgccaaag 300acgcgactga acctcgtttt agtgggtgca gtagccagtt ctaatccaaa gctgagcttg 360gctgtgattg ccaatcgcgg cacacaagca acctacggca ttaatgaaga gatcgaaggt 420acgcgagcta agttaaaagc ggtattagtc gatcgcgtga ttattgataa ctcaggtcga 480gacgaaacct tgatgcttga aggcattgag tacaagcgtt tgtctgtatc agcacctgcg 540ccacctcgta cctcttcttc tgtgcgtggc aacaacccag cttctgcaga agagaagcta 600gatgaaatta aagcgaagat aatgaaagat ccgcaacaaa tcttccaata tgttcgactg 660tctcaggtga aacgcgacga taaagtgatt ggttatcgtg tgagccctgg caaagattca 720gaacttttta actctgttgg gctccaaaac ggagatattg ccactcagtt aaatggacaa 780gacctgacag accctgctgc tatgggcaac atattccgtt ctatctcaga gctgacagag 840ctaaacctcg tcgtcgagag agatggtcaa caacatgaag 
tgtttattga attttag 89742298PRTVibrio splendidus 42Met Asp Asn Ser Pro Val Leu Ser Arg Phe Leu Glu Asn Gly Phe Leu1 5 10 15 Leu Gln Gln Lys Leu Ser Leu Val Leu Cys Cys Val Leu Ile Ala Ala 20 25 30 Ser Ala Trp Ile Leu Gly Gln Leu Ala Trp Phe Ile Glu Pro Ala Glu 35 40 45 Gln Thr Val Val Pro Trp Thr Ala Thr Ala Ser Ser Ser Ser Thr Pro 50 55 60 Gln Ser Thr Leu Asp Ile Ser Ser Leu Gln Gln Ser Asn Met Phe Gly65 70 75 80 Ala Tyr Asn Pro Thr Thr Pro Ala Val Val Glu Gln Gln Val Ile Gln 85 90 95 Asp Ala Pro Lys Thr Arg Leu Asn Leu Val Leu Val Gly Ala Val Ala 100 105 110 Ser Ser Asn Pro Lys Leu Ser Leu Ala Val Ile Ala Asn Arg Gly Thr 115 120 125 Gln Ala Thr Tyr Gly Ile Asn Glu Glu Ile Glu Gly Thr Arg Ala Lys 130 135 140 Leu Lys Ala Val Leu Val Asp Arg Val Ile Ile Asp Asn Ser Gly Arg145 150 155 160 Asp Glu Thr Leu Met Leu Glu Gly Ile Glu Tyr Lys Arg Leu Ser Val 165 170 175 Ser Ala Pro Ala Pro Pro Arg Thr Ser Ser Ser Val Arg Gly Asn Asn 180 185 190 Pro Ala Ser Ala Glu Glu Lys Leu Asp Glu Ile Lys Ala Lys Ile Met 195 200 205 Lys Asp Pro Gln Gln Ile Phe Gln Tyr Val Arg Leu Ser Gln Val Lys 210 215 220 Arg Asp Asp Lys Val Ile Gly Tyr Arg Val Ser Pro Gly Lys Asp Ser225 230 235 240 Glu Leu Phe Asn Ser Val Gly Leu Gln Asn Gly Asp Ile Ala Thr Gln 245 250 255 Leu Asn Gly Gln Asp Leu Thr Asp Pro Ala Ala Met Gly Asn Ile Phe 260 265 270 Arg Ser Ile Ser Glu Leu Thr Glu Leu Asn Leu Val Val Glu Arg Asp 275 280 285 Gly Gln Gln His Glu Val Phe Ile Glu Phe 290 295 432025DNAVibrio splendidus 43gtgaagcatt ggtttaagaa aagtgcatgg ttattggcag gaagcttaat ctgcacaccc 60gcagccatcg cgagtgattt tagtgccagc tttaaaggca ctgatattca agagtttatt 120aatattgttg gtcgtaacct agagaagacg atcatcgttg acccttcggt gcgcggaaaa 180atcgatgtac gcagctacga cgtactcaat gaagagcaat actacagctt cttcctaaac 240gtattggaag tgtatggcta cgcggttgtc gaaatggact cgggtgttct taagatcatc 300aaggccaaag attcgaaaac atcggcaatt ccagtcgttg gagacagtga cacgatcaaa 360ggcgacaatg tggtgacacg tgttgtgacg gttcgtaatg tctcggtgcg tgaactttct 420cctctgcttc gtcaactaaa cgacaatgca 
ggcgcgggta acgttgtgca ctacgaccca 480gccaacatca tccttattac aggccgagcg gcggtagtaa accgtttagc tgaaatcatc 540aagcgtgttg accaagcggg tgataaagag attgaagtcg ttgagctaaa gaatgcttct 600gcggcagaaa tggtacgtat cgttgatgcg ttaagcaaaa ccactgatgc gaaaaacaca 660cctgcatttc tacaacctaa attagttgcc gatgaacgta ccaatgcgat tcttatctca 720ggcgacccta aagtacgtag ccgtttaaga aggctgattg aacagcttga tgttgaaatg 780gcaaccaagg gcaataacca agttatttac cttaaatatg caaaagccga agatctagtt 840gatgtgctga aaggcgtgtc ggacaaccta caatcagaga agcagacatc aaccaaagga 900agttcatcgc agcgtaacca agtgatgatc tcagctcaca gtgacaccaa ctctttagtg 960attaccgcac agccggacat catgaatgcg cttcaagatg tgatcgcaca gctggatatt 1020cgtcgtgctc aagtattgat tgaagcactg attgtcgaaa tggccgaagg tgacggcgtt 1080aaccttggtg tgcagtgggg taaccttgaa acgggtgcca tgattcagta cagcaacact 1140ggcgcttcca ttggcggtgt gatggttggt ttagaagaag cgaaagacag cgaaacgaca 1200accgctgttt atgattcaga cggtaaattc ttacgtaatg aaaccacgac ggaagaaggt 1260gactattcaa cattagcttc cgcactttct ggtgttaatg gtgcggcaat gagtgtggta 1320atgggtgact ggaccgcctt gatcagtgca gtagcgaccg attcaaattc aaatatccta 1380tcttctccaa gtatcaccgt gatggataac ggcgaagcgt cattcattgt gggtgaagag 1440gtgcctgttc taaccggttc tacagcaggc tcaagtaacg acaacccatt ccaaacagtt 1500gaacgtaaag aagtgggtat caagcttaaa gtggtgccgc aaatcaatga aggtgattcg 1560gttcaactgc aaatagaaca agaagtatcg aacgtattag gcgccaatgg tgcggttgat 1620gtgcgttttg ctaagcgaca gctaaataca tcagtgattg ttcaagacgg tcaaatgctg 1680gtgttgggtg gcttgattga cgagcgagca ttggaaagtg aatctaaggt gccgttcttg 1740ggagatattc ctgtgcttgg acacttgttc aaatcaacca gtactcaggt tgagaaaaag 1800aacctaatgg tcttcatcaa accaaccatt attcgtgatg gtatgacagc cgatggtatc 1860acgcagcgta aatacaactt catccgtgct gagcagttgt acaaggctga gcaaggactg 1920aagttaatgg cagacgataa catcccagta ttgcctaaat ttggtgccga catgaatcac 1980ccggctgaaa ttcaagcctt catcgatcaa atggaacaag aataa 202544674PRTVibrio splendidus 44Met Lys His Trp Phe Lys Lys Ser Ala Trp Leu Leu Ala Gly Ser Leu1 5 10 15 Ile Cys Thr Pro Ala Ala Ile Ala Ser Asp Phe Ser Ala Ser Phe Lys 20 
25 30 Gly Thr Asp Ile Gln Glu Phe Ile Asn Ile Val Gly Arg Asn Leu Glu 35 40 45 Lys Thr Ile Ile Val Asp Pro Ser Val Arg Gly Lys Ile Asp Val Arg 50 55 60 Ser Tyr Asp Val Leu Asn Glu Glu Gln Tyr Tyr Ser Phe Phe Leu Asn65 70 75 80 Val Leu Glu Val Tyr Gly Tyr Ala Val Val Glu Met Asp Ser Gly Val 85 90 95 Leu Lys Ile Ile Lys Ala Lys Asp Ser Lys Thr Ser Ala Ile Pro Val 100 105 110 Val Gly Asp Ser Asp Thr Ile Lys Gly Asp Asn Val Val Thr Arg Val 115 120 125 Val Thr Val Arg Asn Val Ser Val Arg Glu Leu Ser Pro Leu Leu Arg 130 135 140 Gln Leu Asn Asp Asn Ala Gly Ala Gly Asn Val Val His Tyr Asp Pro145 150 155 160 Ala Asn Ile Ile Leu Ile Thr Gly Arg Ala Ala Val Val Asn Arg Leu 165 170 175 Ala Glu Ile Ile Lys Arg Val Asp Gln Ala Gly Asp Lys Glu Ile Glu 180 185 190 Val Val Glu Leu Lys Asn Ala Ser Ala Ala Glu Met Val Arg Ile Val 195 200 205 Asp Ala Leu Ser Lys Thr Thr Asp Ala Lys Asn Thr Pro Ala Phe Leu 210 215 220 Gln Pro Lys Leu Val Ala Asp Glu Arg Thr Asn Ala Ile Leu Ile Ser225 230 235 240 Gly Asp Pro Lys Val Arg Ser Arg Leu Arg Arg Leu Ile Glu Gln Leu 245 250 255 Asp Val Glu Met Ala Thr Lys Gly Asn Asn Gln Val Ile Tyr Leu Lys 260 265 270 Tyr Ala Lys Ala Glu Asp Leu Val Asp Val Leu Lys Gly Val Ser Asp 275 280 285 Asn Leu Gln Ser Glu Lys Gln Thr Ser Thr Lys Gly Ser Ser Ser Gln 290 295 300 Arg Asn Gln Val Met Ile Ser Ala His Ser Asp Thr Asn Ser Leu Val305 310 315 320 Ile Thr Ala Gln Pro Asp Ile Met Asn Ala Leu Gln Asp Val Ile Ala 325 330 335 Gln Leu Asp Ile Arg Arg Ala Gln Val Leu Ile Glu Ala Leu Ile Val 340 345 350 Glu Met Ala Glu Gly Asp Gly Val Asn Leu Gly Val Gln Trp Gly Asn 355 360 365 Leu Glu Thr Gly Ala Met Ile Gln Tyr Ser Asn Thr Gly Ala Ser Ile 370 375 380 Gly Gly Val Met Val Gly Leu Glu Glu Ala Lys Asp Ser Glu Thr Thr385 390 395 400 Thr Ala Val Tyr Asp Ser Asp Gly Lys Phe Leu Arg Asn Glu Thr Thr 405 410 415 Thr Glu Glu Gly Asp Tyr Ser Thr Leu Ala Ser Ala Leu Ser Gly Val 420 425 430 Asn Gly Ala Ala Met Ser Val Val Met Gly Asp Trp Thr Ala Leu Ile 435 440 445 Ser Ala Val Ala 
Thr Asp Ser Asn Ser Asn Ile Leu Ser Ser Pro Ser 450 455 460 Ile Thr Val Met Asp Asn Gly Glu Ala Ser Phe Ile Val Gly Glu Glu465 470 475 480 Val Pro Val Leu Thr Gly Ser Thr Ala Gly Ser Ser Asn Asp Asn Pro 485 490 495 Phe Gln Thr Val Glu Arg Lys Glu Val Gly Ile Lys Leu Lys Val Val 500 505 510 Pro Gln Ile Asn Glu Gly Asp Ser Val Gln Leu Gln Ile Glu Gln Glu 515 520 525 Val Ser Asn Val Leu Gly Ala Asn Gly Ala Val Asp Val Arg Phe Ala 530 535 540 Lys Arg Gln Leu Asn Thr Ser Val Ile Val Gln Asp Gly Gln Met Leu545 550 555 560 Val Leu Gly Gly Leu Ile Asp Glu Arg Ala Leu Glu Ser Glu Ser Lys 565 570 575 Val Pro Phe Leu Gly Asp Ile Pro Val Leu Gly His Leu Phe Lys Ser 580 585 590 Thr Ser Thr Gln Val Glu Lys Lys Asn Leu Met Val Phe Ile Lys Pro 595 600 605 Thr Ile Ile Arg Asp Gly Met Thr Ala Asp Gly Ile Thr Gln Arg Lys 610 615 620 Tyr Asn Phe Ile Arg Ala Glu Gln Leu Tyr Lys Ala Glu Gln Gly Leu625 630 635 640 Lys Leu Met Ala Asp Asp Asn Ile Pro Val Leu Pro Lys Phe Gly Ala 645 650 655 Asp Met Asn His Pro Ala Glu Ile Gln Ala Phe Ile Asp Gln Met Glu 660 665 670 Gln Glu451503DNAVibrio splendidus 45atggctgaat tggtaggggc ggcacgtact tatcagcgct tgccgtttag ctttgcgaat 60cgctacaaga tggtgttgga ataccaacat ccagagcgcg caccgatact ttattatgtt 120gagccactga aatcggcggc gatcattgaa gtgagtcgtg ttgtgaaaaa tggtttcacg 180ccacaagcga ttactctcga tgagtttgat aaaaaactaa ccgatgctta tcagcgtgac 240tcgtcagaag ctcgtcagct catggaagac attggtgctg atagtgatga tttcttctca 300ctagcggaag aactgcctca agacgaagac ttacttgaat cagaagatga tgcaccaatc 360atcaagttaa tcaatgcgat gctgggtgag gcgatcaaag agggtgcttc ggatatacac 420atcgaaacct ttgaaaagtc actttgtatc cgtttccgag ttgatggtgt gctgcgtgat 480gttctagcgc caagccgtaa actggctccg ctattggttt cacgtgtcaa ggttatggct 540aaactggata ttgcggaaaa acgcgtgcca caagatggtc gtatttctct gcgtattggt 600ggccgagcgg ttgatgttcg tgtttcaacc atgccttctt cgcatggtga gcgtgtggta 660atgcgtctgt tggacaaaaa tgccactcgt ctagacttgc acagtttagg tatgacagcc 720gaaaaccatg aaaacttccg taagctgatt cagcgcccac atggcattat cttggtgacc 780ggcccgacag 
gttcaggtaa atcgacgacc ttgtacgcag gtctgcaaga actcaacagc 840aatgaacgaa acattttaac cgttgaagac ccaatcgaat tcgatatcga tggcattggt 900caaacacaag tgaaccctaa ggttgatatg acctttgcgc gtggtttacg tgccattctt 960cgtcaagatc ctgatgttgt tatgattggt gagatccgtg acttggagac cgcagagatt 1020gctgtccagg cctctttgac aggtcactta gttatgtcga ctctgcatac caatactgcc 1080gtcggtgcga ttacacgtct acgtgatatg ggcattgaac ctttcttgat ctcttcttcg 1140ctgctgggtg ttttggctca gcgcttggtt cgtactttat gtaacgaatg taaagaacct 1200tatgaagccg ataaagagca gaagaaactg tttgggttga agaagaaaga aagcttgacg 1260ctttaccatg ccaaaggttg tgaagagtgt ggccataagg gttatcgagg tcgtacgggt 1320attcatgagc tgttgatgat tgatgattca gtacaagagc tgattcacag tgaagcgggt 1380gagcaggcga ttgataaagc aattcgtggc acaacaccaa gtattcgaga tgatggcttg 1440agcaaagttc tgaaaggggt aacgtcccta gaagaagtga tgcgcgtgac caaggaagtc 1500tag 150346500PRTVibrio splendidus 46Met Ala Glu Leu Val Gly Ala Ala Arg Thr Tyr Gln Arg Leu Pro Phe1 5 10 15 Ser Phe Ala Asn Arg Tyr Lys Met Val Leu Glu Tyr Gln His Pro Glu 20 25 30 Arg Ala Pro Ile Leu Tyr Tyr Val Glu Pro Leu Lys Ser Ala Ala Ile 35 40 45 Ile Glu Val Ser Arg Val Val Lys Asn Gly Phe Thr Pro Gln Ala Ile 50 55 60 Thr Leu Asp Glu Phe Asp Lys Lys Leu Thr Asp Ala Tyr Gln Arg Asp65 70 75 80 Ser Ser Glu Ala Arg Gln Leu Met Glu Asp Ile Gly Ala Asp Ser Asp 85
90 95 Asp Phe Phe Ser Leu Ala Glu Glu Leu Pro Gln Asp Glu Asp Leu Leu 100 105 110 Glu Ser Glu Asp Asp Ala Pro Ile Ile Lys Leu Ile Asn Ala Met Leu 115 120 125 Gly Glu Ala Ile Lys Glu Gly Ala Ser Asp Ile His Ile Glu Thr Phe 130 135 140 Glu Lys Ser Leu Cys Ile Arg Phe Arg Val Asp Gly Val Leu Arg Asp145 150 155 160 Val Leu Ala Pro Ser Arg Lys Leu Ala Pro Leu Leu Val Ser Arg Val 165 170 175 Lys Val Met Ala Lys Leu Asp Ile Ala Glu Lys Arg Val Pro Gln Asp 180 185 190 Gly Arg Ile Ser Leu Arg Ile Gly Gly Arg Ala Val Asp Val Arg Val 195 200 205 Ser Thr Met Pro Ser Ser His Gly Glu Arg Val Val Met Arg Leu Leu 210 215 220 Asp Lys Asn Ala Thr Arg Leu Asp Leu His Ser Leu Gly Met Thr Ala225 230 235 240 Glu Asn His Glu Asn Phe Arg Lys Leu Ile Gln Arg Pro His Gly Ile 245 250 255 Ile Leu Val Thr Gly Pro Thr Gly Ser Gly Lys Ser Thr Thr Leu Tyr 260 265 270 Ala Gly Leu Gln Glu Leu Asn Ser Asn Glu Arg Asn Ile Leu Thr Val 275 280 285 Glu Asp Pro Ile Glu Phe Asp Ile Asp Gly Ile Gly Gln Thr Gln Val 290 295 300 Asn Pro Lys Val Asp Met Thr Phe Ala Arg Gly Leu Arg Ala Ile Leu305 310 315 320 Arg Gln Asp Pro Asp Val Val Met Ile Gly Glu Ile Arg Asp Leu Glu 325 330 335 Thr Ala Glu Ile Ala Val Gln Ala Ser Leu Thr Gly His Leu Val Met 340 345 350 Ser Thr Leu His Thr Asn Thr Ala Val Gly Ala Ile Thr Arg Leu Arg 355 360 365 Asp Met Gly Ile Glu Pro Phe Leu Ile Ser Ser Ser Leu Leu Gly Val 370 375 380 Leu Ala Gln Arg Leu Val Arg Thr Leu Cys Asn Glu Cys Lys Glu Pro385 390 395 400 Tyr Glu Ala Asp Lys Glu Gln Lys Lys Leu Phe Gly Leu Lys Lys Lys 405 410 415 Glu Ser Leu Thr Leu Tyr His Ala Lys Gly Cys Glu Glu Cys Gly His 420 425 430 Lys Gly Tyr Arg Gly Arg Thr Gly Ile His Glu Leu Leu Met Ile Asp 435 440 445 Asp Ser Val Gln Glu Leu Ile His Ser Glu Ala Gly Glu Gln Ala Ile 450 455 460 Asp Lys Ala Ile Arg Gly Thr Thr Pro Ser Ile Arg Asp Asp Gly Leu465 470 475 480 Ser Lys Val Leu Lys Gly Val Thr Ser Leu Glu Glu Val Met Arg Val 485 490 495 Thr Lys Glu Val 500 471221DNAVibrio splendidus 47atggcggcat ttgaatacaa 
agcactggat gccaaaggca aaagtaaaaa aggctcaatt 60gaagcagata atgctcgtca ggctcgccaa agaataaaag agcttggctt gatgccggtt 120gagatgaccg aggctaaagc aaaaacagca aaaggtgctc agccatcgac cagctttaaa 180cgcggcatca gtacgcctga tcttgcgctt attactcgtc aaatatccac gctcgttcaa 240tctggtatgc cgctagaaga gtgtttgaaa gccgttgccg aacagtctga gaaacctcgt 300attcgcacca tgctactcgc ggtgagatct aaggtgactg aaggttattc gttagcagac 360agcttgtctg attatcccca tatcttcgat gagctattca gagccatggt tgctgctggt 420gagaagtcag ggcatctaga tgcggtattg gaacgattgg ctgactacgc agaaaaccgt 480cagaagatgc gttctaagtt gctgcaagcg atgatctacc ccatcgtgct ggtggtgttt 540gcggtgacga ttgtgtcgtt cctactggca acggtagtgc cgaagatcgt tgagcctatt 600atccaaatgg gacaagagct ccctcagtcg acacaatttt tattagcatc gagtgaattt 660atccagaatt ggggcatcca attactggtg ttgaccattg gtgtgattgt gttggttaag 720actgcgctga aaaagccggg cgttcgcatg agctgggatc gcaaattatt gagcatcccg 780ctgataggca agatagcgaa agggatcaac acctctcgtt ttgcacgaac actttctatc 840tgtacctcta gtgcgattcc tatccttgaa gggatgaagg tcgcggtaga tgtgatgtcg 900aatcatcacg tgaaacaaca agtattacag gcatcagata gcgttagaga aggggcaagc 960ctgcgtaaag cgcttgatca aaccaaactc tttcccccga tgatgctgca tatgatcgcc 1020agtggtgagc agagtggcca attggaacag atgctgacaa gagcggcaga taatcaggat 1080caaagctttg aatcgaccgt taatatcgcg ttaggcattt ttaccccagc gcttattgcg 1140ttgatggctg gcttagtgct gtttatcgtg atggcgacgc tgatgccaat gcttgaaatg 1200aacaatttaa tgagtggtta a 122148406PRTVibrio splendidus 48Met Ala Ala Phe Glu Tyr Lys Ala Leu Asp Ala Lys Gly Lys Ser Lys1 5 10 15 Lys Gly Ser Ile Glu Ala Asp Asn Ala Arg Gln Ala Arg Gln Arg Ile 20 25 30 Lys Glu Leu Gly Leu Met Pro Val Glu Met Thr Glu Ala Lys Ala Lys 35 40 45 Thr Ala Lys Gly Ala Gln Pro Ser Thr Ser Phe Lys Arg Gly Ile Ser 50 55 60 Thr Pro Asp Leu Ala Leu Ile Thr Arg Gln Ile Ser Thr Leu Val Gln65 70 75 80 Ser Gly Met Pro Leu Glu Glu Cys Leu Lys Ala Val Ala Glu Gln Ser 85 90 95 Glu Lys Pro Arg Ile Arg Thr Met Leu Leu Ala Val Arg Ser Lys Val 100 105 110 Thr Glu Gly Tyr Ser Leu Ala Asp Ser Leu Ser Asp Tyr Pro His Ile 
115 120 125 Phe Asp Glu Leu Phe Arg Ala Met Val Ala Ala Gly Glu Lys Ser Gly 130 135 140 His Leu Asp Ala Val Leu Glu Arg Leu Ala Asp Tyr Ala Glu Asn Arg145 150 155 160 Gln Lys Met Arg Ser Lys Leu Leu Gln Ala Met Ile Tyr Pro Ile Val 165 170 175 Leu Val Val Phe Ala Val Thr Ile Val Ser Phe Leu Leu Ala Thr Val 180 185 190 Val Pro Lys Ile Val Glu Pro Ile Ile Gln Met Gly Gln Glu Leu Pro 195 200 205 Gln Ser Thr Gln Phe Leu Leu Ala Ser Ser Glu Phe Ile Gln Asn Trp 210 215 220 Gly Ile Gln Leu Leu Val Leu Thr Ile Gly Val Ile Val Leu Val Lys225 230 235 240 Thr Ala Leu Lys Lys Pro Gly Val Arg Met Ser Trp Asp Arg Lys Leu 245 250 255 Leu Ser Ile Pro Leu Ile Gly Lys Ile Ala Lys Gly Ile Asn Thr Ser 260 265 270 Arg Phe Ala Arg Thr Leu Ser Ile Cys Thr Ser Ser Ala Ile Pro Ile 275 280 285 Leu Glu Gly Met Lys Val Ala Val Asp Val Met Ser Asn His His Val 290 295 300 Lys Gln Gln Val Leu Gln Ala Ser Asp Ser Val Arg Glu Gly Ala Ser305 310 315 320 Leu Arg Lys Ala Leu Asp Gln Thr Lys Leu Phe Pro Pro Met Met Leu 325 330 335 His Met Ile Ala Ser Gly Glu Gln Ser Gly Gln Leu Glu Gln Met Leu 340 345 350 Thr Arg Ala Ala Asp Asn Gln Asp Gln Ser Phe Glu Ser Thr Val Asn 355 360 365 Ile Ala Leu Gly Ile Phe Thr Pro Ala Leu Ile Ala Leu Met Ala Gly 370 375 380 Leu Val Leu Phe Ile Val Met Ala Thr Leu Met Pro Met Leu Glu Met385 390 395 400 Asn Asn Leu Met Ser Gly 405 49444DNAVibrio splendidus 49atgaaaaata aaatgaaaaa acaatcaggc tttaccctat tagaagtcat ggttgttgtc 60gttatccttg gtgttctagc aagttttgtt gtacctaacc tgttgggcaa caaagagaag 120gcggatcaac aaaaagccat cactgatatt gtggcgctag agaacgcgct cgacatgtac 180aaactggata acagcgttta cccaacaacg gatcaaggcc tggacgggtt ggtgacaaag 240ccaagcagtc cagagcctcg taactaccga gacggcggtt acatcaagcg tctacctaac 300gacccatggg gcaatgagta ccaataccta agtcctggtg ataacggcac aattgatatc 360ttcactcttg gcgcagatgg tcaagaaggt ggtgaaggta ttgctgcaga tatcggcaac 420tggaacatgc aggacttcca ataa 44450146PRTVibrio splendidus 50Lys Asn Lys Met Lys Lys Gln Ser Gly Phe Thr Leu Leu Glu Val Met1 5 10 15 Val Val Val 
Val Ile Leu Gly Val Leu Ala Ser Phe Val Val Pro Asn 20 25 30 Leu Leu Gly Asn Lys Glu Lys Ala Asp Gln Gln Lys Ala Ile Thr Asp 35 40 45 Ile Val Ala Leu Glu Asn Ala Leu Asp Met Tyr Lys Leu Asp Asn Ser 50 55 60 Val Tyr Pro Thr Thr Asp Gln Gly Leu Asp Gly Leu Val Thr Lys Pro65 70 75 80 Ser Ser Pro Glu Pro Arg Asn Tyr Arg Asp Gly Gly Tyr Ile Lys Arg 85 90 95 Leu Pro Asn Asp Pro Trp Gly Asn Glu Tyr Gln Tyr Leu Ser Pro Gly 100 105 110 Asp Asn Gly Thr Ile Asp Ile Phe Thr Leu Gly Ala Asp Gly Gln Glu 115 120 125 Gly Gly Glu Gly Ile Ala Ala Asp Ile Gly Asn Trp Asn Met Gln Asp 130 135 140 Phe Gln145 51594DNAVibrio splendidus 51gtgaaaacta agcaaacaca gccaggtttc accttgattg agattctttt ggtgttggta 60ttactgtcag tatcggcggt cgcggtgatc tcgaccatcc ctaccaatag caaagatgtt 120gctaaaaaat acgctcaaag cttttatcag cgaattcagc tactcaatga agaggctatt 180ttgagtggct tagattttgg tgttcgtgtt gatgaaaaaa aatcgactta cgttctgatg 240actttgaagt ctgatggctg gcaagaaacg gagttcgaaa agatcccttc ttcaactgaa 300ttaccggaag aactggcact gtcgctgaca ttaggtggtg gcgcgtggga agacgatgat 360cggttgttca atccaggaag cttatttgat gaagatatgt ttgctgatct tgaagaggaa 420aagaagccga aaccaccaca gatctacatc ttgtcgagtg ctgaaatgac gccatttgta 480ctgtcgtttt acccaaatac cggtgacaca atacaagatg tttggcgcat tcgagtattg 540gataatggtg tgattcgatt actcgagccg ggagaagaag atgaagaaga ataa 59452197PRTVibrio splendidus 52Met Lys Thr Lys Gln Thr Gln Pro Gly Phe Thr Leu Ile Glu Ile Leu1 5 10 15 Leu Val Leu Val Leu Leu Ser Val Ser Ala Val Ala Val Ile Ser Thr 20 25 30 Ile Pro Thr Asn Ser Lys Asp Val Ala Lys Lys Tyr Ala Gln Ser Phe 35 40 45 Tyr Gln Arg Ile Gln Leu Leu Asn Glu Glu Ala Ile Leu Ser Gly Leu 50 55 60 Asp Phe Gly Val Arg Val Asp Glu Lys Lys Ser Thr Tyr Val Leu Met65 70 75 80 Thr Leu Lys Ser Asp Gly Trp Gln Glu Thr Glu Phe Glu Lys Ile Pro 85 90 95 Ser Ser Thr Glu Leu Pro Glu Glu Leu Ala Leu Ser Leu Thr Leu Gly 100 105 110 Gly Gly Ala Trp Glu Asp Asp Asp Arg Leu Phe Asn Pro Gly Ser Leu 115 120 125 Phe Asp Glu Asp Met Phe Ala Asp Leu Glu Glu Glu Lys Lys Pro Lys 130 135 
140 Pro Pro Gln Ile Tyr Ile Leu Ser Ser Ala Glu Met Thr Pro Phe Val145 150 155 160 Leu Ser Phe Tyr Pro Asn Thr Gly Asp Thr Ile Gln Asp Val Trp Arg 165 170 175 Ile Arg Val Leu Asp Asn Gly Val Ile Arg Leu Leu Glu Pro Gly Glu 180 185 190 Glu Asp Glu Glu Glu 195 53396DNAVibrio splendidus 53atgaagaaga ataaccgttc tccttatcgt tctcgcggta tgcctcttgg ttctcgagga 60atgactctgc ttgaagtatt ggttgcgctg gctatcttcg ctacggcggc gatcagtgtg 120attcgtgctg tcacccagca catcaatacg ctcagttatc tcgaagaaaa aaccttcgcg 180gcgatggtcg ttgataatca aatggcccta gtcatgctac atcctgagat gcttaaaaaa 240gcgcagggca cgcaagagtt agcgggaaga gaatggttct ggaaggtgac tcccatcgat 300accagcgata atttattaaa ggcgtttgat gtgagtgcgg caaccagtaa gaaagcgtct 360ccagtcgtta cggtgcgcag ttatgtggtt aattaa 39654131PRTVibrio splendidus 54Met Lys Lys Asn Asn Arg Ser Pro Tyr Arg Ser Arg Gly Met Pro Leu1 5 10 15 Gly Ser Arg Gly Met Thr Leu Leu Glu Val Leu Val Ala Leu Ala Ile 20 25 30 Phe Ala Thr Ala Ala Ile Ser Val Ile Arg Ala Val Thr Gln His Ile 35 40 45 Asn Thr Leu Ser Tyr Leu Glu Glu Lys Thr Phe Ala Ala Met Val Val 50 55 60 Asp Asn Gln Met Ala Leu Val Met Leu His Pro Glu Met Leu Lys Lys65 70 75 80 Ala Gln Gly Thr Gln Glu Leu Ala Gly Arg Glu Trp Phe Trp Lys Val 85 90 95 Thr Pro Ile Asp Thr Ser Asp Asn Leu Leu Lys Ala Phe Asp Val Ser 100 105 110 Ala Ala Thr Ser Lys Lys Ala Ser Pro Val Val Thr Val Arg Ser Tyr 115 120 125 Val Val Asn 130 55804DNAVibrio splendidus 55atgtggttaa ttaagagaat gtggtcaatt aagagcatgt tattaattaa gaacagctcg 60ctaactaaga gcgtgtcgct aactaagagc atgtcggaaa ataagcgtac gccgcgtaaa 120caaggtctac cttcaaaagg gagaggcttt accttaattg aagtcttggt ctcgattgct 180atctttgcca cgctaagtat ggcggcttat caggtggtta atcaggtgca gcgaagcaac 240gagatctcta ttgagcgcag tgctcgtttg aaccaactgc aacgcagttt agtcatttta 300gataatgatt ttcgccagat ggcggtgcga aaatttcgta ccaacggtga agaagcatca 360tctaagctga tcttaatgaa agagtattta ttggactccg acagtgtagg catcatgttt 420actcgtctag gttggcacaa cccacaacag cagtttcctc gcggtgaagt cacgaaggtt 480ggctaccgta ttaaagaaga aacacttgag 
cgtgtatggt ggcgttatcc cgatacacct 540tcaggccaag aaggtgtgat tacccctctg cttgatgatg ttgaaagctt ggaattcgag 600ttttatgacg gaagccgctg ggggaaagag tggcaaaccg ataaatcact gccgaaagcg 660gtgaggctta agctgacact gaaagactat ggtgagatag agcgtgttta tctcactccc 720ggtggcaccc tagatcaggc cgatgattct tcaaacagtg actcttcagg cagtagtgag 780gggaataatg actcatcgaa ctaa 80456267PRTVibrio splendidus 56Met Trp Leu Ile Lys Arg Met Trp Ser Ile Lys Ser Met Leu Leu Ile1 5 10 15 Lys Asn Ser Ser Leu Thr Lys Ser Val Ser Leu Thr Lys Ser Met Ser 20 25 30 Glu Asn Lys Arg Thr Pro Arg Lys Gln Gly Leu Pro Ser Lys Gly Arg 35 40 45 Gly Phe Thr Leu Ile Glu Val Leu Val Ser Ile Ala Ile Phe Ala Thr 50 55 60 Leu Ser Met Ala Ala Tyr Gln Val Val Asn Gln Val Gln Arg Ser Asn65 70 75 80 Glu Ile Ser Ile Glu Arg Ser Ala Arg Leu Asn Gln Leu Gln Arg Ser 85 90 95 Leu Val Ile Leu Asp Asn Asp Phe Arg Gln Met Ala Val Arg Lys Phe 100 105 110 Arg Thr Asn Gly Glu Glu Ala Ser Ser Lys Leu Ile Leu Met Lys Glu 115 120 125 Tyr Leu Leu Asp Ser Asp Ser Val Gly Ile Met Phe Thr Arg Leu Gly 130 135 140 Trp His Asn Pro Gln Gln Gln Phe Pro Arg Gly Glu Val Thr Lys Val145 150 155 160 Gly Tyr Arg Ile Lys Glu Glu Thr Leu Glu Arg Val Trp Trp Arg Tyr 165 170 175 Pro Asp Thr Pro Ser Gly Gln Glu Gly Val Ile Thr Pro Leu Leu Asp 180 185 190 Asp Val Glu Ser Leu Glu Phe Glu Phe Tyr Asp Gly Ser Arg Trp Gly 195 200 205 Lys Glu Trp Gln Thr Asp Lys Ser Leu Pro Lys Ala Val Arg Leu Lys 210 215 220 Leu Thr Leu Lys Asp Tyr Gly Glu Ile Glu Arg Val Tyr Leu Thr Pro225 230 235 240 Gly Gly Thr Leu Asp Gln Ala Asp Asp Ser Ser Asn Ser Asp Ser Ser 245 250 255 Gly Ser Ser Glu Gly Asn Asn Asp Ser Ser Asn 260 265 571050DNAVibrio splendidus 57atgactcatc gaactaataa gcgtttagcg acaaggtcag ccttgggacg taaacaacgt 60ggtgtcgcgc tgatcattat tttgatgcta ttggcgatca tggcaaccat tgctggcagc 120atgtccgagc gtttgtttac gcaattcaag cgcgttggta accaactgaa ttaccaacag 180gcttactggt acagcattgg tgtggaagcg cttgtgcaaa acggtattag gcaaagttac 240aaagacagtg ataccgtgaa cctaagccaa ccatgggcgt tagaagagca ggtataccca 
300ttggattatg gccaagttaa gggccgcatt gttgatgctc aggcatgttt taatcttaat 360gccttagccg gagtggcgac cacttcaagt aaccagactc cttatttaat cacggtttgg 420caaaccttat tggaaaacca agacgttgag ccttatcagg ctgaggttat cgcaaattca 480acgtgggaat ttgttgatgc ggatacacga accacctctt cgtctggtgt agaagacagc 540acgtatgaag cgatgaagcc ctcttatttg gcggcgaatg gcttaatggc cgatgaatcc 600gagctacgag cggtttatca agtcactggt gaagtgatga ataaggttcg cccctttgtt 660tgcgctctgc caaccgatga tttccgcttg aatgtgaata ctctcacgga aaaacaagca 720ccgttattgg aagcgatgtt tgcgccaggc ttaagtgaat cggatgccaa acagctgata 780gataaacgcc catttgatgg ctgggatacg gtagatgctt tcatggctga acctgccatt 840gttggtgtaa gtgccgaagt cagcaagaaa gcgaaagcat atttaactgt agatagcgcc 900tattttgagc tagatgcaga ggtattagtt gagcagtcac gtgtacgtat acggacgctt 960ttctatagta gtaatcgaga aacagtgacg gtagtacgcc gtcgttttgg aggaatcagt 1020gagcgagttt ctgaccgttc gactgagtag
105058349PRTVibrio splendidus 58Met Thr His Arg Thr Asn Lys Arg Leu Ala Thr Arg Ser Ala Leu Gly1 5 10 15 Arg Lys Gln Arg Gly Val Ala Leu Ile Ile Ile Leu Met Leu Leu Ala 20 25 30 Ile Met Ala Thr Ile Ala Gly Ser Met Ser Glu Arg Leu Phe Thr Gln 35 40 45 Phe Lys Arg Val Gly Asn Gln Leu Asn Tyr Gln Gln Ala Tyr Trp Tyr 50 55 60 Ser Ile Gly Val Glu Ala Leu Val Gln Asn Gly Ile Arg Gln Ser Tyr65 70 75 80 Lys Asp Ser Asp Thr Val Asn Leu Ser Gln Pro Trp Ala Leu Glu Glu 85 90 95 Gln Val Tyr Pro Leu Asp Tyr Gly Gln Val Lys Gly Arg Ile Val Asp 100 105 110 Ala Gln Ala Cys Phe Asn Leu Asn Ala Leu Ala Gly Val Ala Thr Thr 115 120 125 Ser Ser Asn Gln Thr Pro Tyr Leu Ile Thr Val Trp Gln Thr Leu Leu 130 135 140 Glu Asn Gln Asp Val Glu Pro Tyr Gln Ala Glu Val Ile Ala Asn Ser145 150 155 160 Thr Trp Glu Phe Val Asp Ala Asp Thr Arg Thr Thr Ser Ser Ser Gly 165 170 175 Val Glu Asp Ser Thr Tyr Glu Ala Met Lys Pro Ser Tyr Leu Ala Ala 180 185 190 Asn Gly Leu Met Ala Asp Glu Ser Glu Leu Arg Ala Val Tyr Gln Val 195 200 205 Thr Gly Glu Val Met Asn Lys Val Arg Pro Phe Val Cys Ala Leu Pro 210 215 220 Thr Asp Asp Phe Arg Leu Asn Val Asn Thr Leu Thr Glu Lys Gln Ala225 230 235 240 Pro Leu Leu Glu Ala Met Phe Ala Pro Gly Leu Ser Glu Ser Asp Ala 245 250 255 Lys Gln Leu Ile Asp Lys Arg Pro Phe Asp Gly Trp Asp Thr Val Asp 260 265 270 Ala Phe Met Ala Glu Pro Ala Ile Val Gly Val Ser Ala Glu Val Ser 275 280 285 Lys Lys Ala Lys Ala Tyr Leu Thr Val Asp Ser Ala Tyr Phe Glu Leu 290 295 300 Asp Ala Glu Val Leu Val Glu Gln Ser Arg Val Arg Ile Arg Thr Leu305 310 315 320 Phe Tyr Ser Ser Asn Arg Glu Thr Val Thr Val Val Arg Arg Arg Phe 325 330 335 Gly Gly Ile Ser Glu Arg Val Ser Asp Arg Ser Thr Glu 340 345 591248DNAVibrio splendidus 59gtgagcgagt ttctgaccgt tcgactgagt agcgaaccac aaagccctgt gcagtggtta 60gtttggtcga caagccaaca agaagtgata gcaagcggtg aactgtctag ctgggaacag 120cttgacgagt taacgcctta cgctgaaaag cgcagctgta tcgctttatt gccgggaagt 180gaatgcttaa ttaagcgtgt tgagatcccg aaaggtgctg ctcgccagtt tgattctatg 240ctgccgttct 
tattagaaga cgaagtcgca caagatatcg aagacttaca cctgactatt 300ttagataaag atgccactca cgctaccgtg tgtggtgtgg atcgtgaatg gctaaaacaa 360gctttagacc tgtttcgcga agccaatata atcttccgta aggtgctacc agatacacta 420gccgtgcctt ttgaagaaca aggcatcagt gcgttgcaga tagatcagca ttggttattg 480cgccaaggtc actctcaacg tcaaggtcac tatcaagccg tatcgatcag tgaagcatgg 540ttaccgatgt ttttgcaaag tgattgggtt gtcgctggtg aggaagagca agcgacgact 600atcttcagct ataccgcgat gccgagcgac gacgttcaac agcaaagcgg cctcgagtgg 660caagcaaagc ctgcggaatt ggtgatgtct ttattgagtc agcaagcgat cacaagcggc 720gtaaatttac tgactggcac ctttaaaacc aaatcttcat tcagtaaata ttggcgtgtt 780tggcagaaag tggcgattgc tgcttgtttg ctggtggccg tgattgtgac tcagcaagtg 840ttgaaggttc agcaatacga agcgcaagca caagcctacc gcatggagag tgagcgtatc 900tttagagctg tgctgcctgg caaacaacgc attccgaccg tgagttacct caagcgtcag 960atgaatgatg aagctaagaa atacggtggt tcaggcgaag gtgattcttt acttggttgg 1020ttagctttgc tgcctgaaac cttagggcaa gtgaagacga tcgaagttga aagcattcgc 1080tacgatggca accgttctga ggttcgactg caggctaaaa gttctgactt ccaacacttt 1140gagaccgcaa gggtgaagct cgaagagaag tttgtcgttg agcaagggcc attgaaccgt 1200aatggcgatg ccgtatttgg cagttttact cttaaacccc atcaataa 124860415PRTVibrio splendidus 60Met Ser Glu Phe Leu Thr Val Arg Leu Ser Ser Glu Pro Gln Ser Pro1 5 10 15 Val Gln Trp Leu Val Trp Ser Thr Ser Gln Gln Glu Val Ile Ala Ser 20 25 30 Gly Glu Leu Ser Ser Trp Glu Gln Leu Asp Glu Leu Thr Pro Tyr Ala 35 40 45 Glu Lys Arg Ser Cys Ile Ala Leu Leu Pro Gly Ser Glu Cys Leu Ile 50 55 60 Lys Arg Val Glu Ile Pro Lys Gly Ala Ala Arg Gln Phe Asp Ser Met65 70 75 80 Leu Pro Phe Leu Leu Glu Asp Glu Val Ala Gln Asp Ile Glu Asp Leu 85 90 95 His Leu Thr Ile Leu Asp Lys Asp Ala Thr His Ala Thr Val Cys Gly 100 105 110 Val Asp Arg Glu Trp Leu Lys Gln Ala Leu Asp Leu Phe Arg Glu Ala 115 120 125 Asn Ile Ile Phe Arg Lys Val Leu Pro Asp Thr Leu Ala Val Pro Phe 130 135 140 Glu Glu Gln Gly Ile Ser Ala Leu Gln Ile Asp Gln His Trp Leu Leu145 150 155 160 Arg Gln Gly His Ser Gln Arg Gln Gly His Tyr Gln Ala Val Ser Ile 165 
170 175 Ser Glu Ala Trp Leu Pro Met Phe Leu Gln Ser Asp Trp Val Val Ala 180 185 190 Gly Glu Glu Glu Gln Ala Thr Thr Ile Phe Ser Tyr Thr Ala Met Pro 195 200 205 Ser Asp Asp Val Gln Gln Gln Ser Gly Leu Glu Trp Gln Ala Lys Pro 210 215 220 Ala Glu Leu Val Met Ser Leu Leu Ser Gln Gln Ala Ile Thr Ser Gly225 230 235 240 Val Asn Leu Leu Thr Gly Thr Phe Lys Thr Lys Ser Ser Phe Ser Lys 245 250 255 Tyr Trp Arg Val Trp Gln Lys Val Ala Ile Ala Ala Cys Leu Leu Val 260 265 270 Ala Val Ile Val Thr Gln Gln Val Leu Lys Val Gln Gln Tyr Glu Ala 275 280 285 Gln Ala Gln Ala Tyr Arg Met Glu Ser Glu Arg Ile Phe Arg Ala Val 290 295 300 Leu Pro Gly Lys Gln Arg Ile Pro Thr Val Ser Tyr Leu Lys Arg Gln305 310 315 320 Met Asn Asp Glu Ala Lys Lys Tyr Gly Gly Ser Gly Glu Gly Asp Ser 325 330 335 Leu Leu Gly Trp Leu Ala Leu Leu Pro Glu Thr Leu Gly Gln Val Lys 340 345 350 Thr Ile Glu Val Glu Ser Ile Arg Tyr Asp Gly Asn Arg Ser Glu Val 355 360 365 Arg Leu Gln Ala Lys Ser Ser Asp Phe Gln His Phe Glu Thr Ala Arg 370 375 380 Val Lys Leu Glu Glu Lys Phe Val Val Glu Gln Gly Pro Leu Asn Arg385 390 395 400 Asn Gly Asp Ala Val Phe Gly Ser Phe Thr Leu Lys Pro His Gln 405 410 415 61489DNAVibrio splendidus 61atgagaaata tgattgaacc actccaagcg tggtgggctt caataagtca gcgggaacaa 60cgattagtca ttggttgttc tattttattg atactgggcg ttgtctattg gggattaata 120caaccactta gccaacgagc cgagcttgca caaagccgca ttcaaagtga gaagcaactt 180ctggcttggg taacggacaa agcgaatcaa gtggttgaac tacgaggcag tggtggcatc 240agtgccagtc agcctttgaa ccaatctgtg cctgcttcta tgcgccgttt taacatcgag 300ctgatacgcg tgcaaccacg cggtgagatg ctgcaagttt ggattaagcc tgtgccattt 360aataagttcg ttgactggct gacatacctg aaagaaaagc agggtgttga ggttgagttt 420atggatattg atcgctctga tagccctggg gttattgaga tcaaccgact acagtttaaa 480cgaggttaa 48962162PRTVibrio splendidus 62Met Arg Asn Met Ile Glu Pro Leu Gln Ala Trp Trp Ala Ser Ile Ser1 5 10 15 Gln Arg Glu Gln Arg Leu Val Ile Gly Cys Ser Ile Leu Leu Ile Leu 20 25 30 Gly Val Val Tyr Trp Gly Leu Ile Gln Pro Leu Ser Gln Arg Ala Glu 35 40 45 Leu Ala 
Gln Ser Arg Ile Gln Ser Glu Lys Gln Leu Leu Ala Trp Val 50 55 60 Thr Asp Lys Ala Asn Gln Val Val Glu Leu Arg Gly Ser Gly Gly Ile65 70 75 80 Ser Ala Ser Gln Pro Leu Asn Gln Ser Val Pro Ala Ser Met Arg Arg 85 90 95 Phe Asn Ile Glu Leu Ile Arg Val Gln Pro Arg Gly Glu Met Leu Gln 100 105 110 Val Trp Ile Lys Pro Val Pro Phe Asn Lys Phe Val Asp Trp Leu Thr 115 120 125 Tyr Leu Lys Glu Lys Gln Gly Val Glu Val Glu Phe Met Asp Ile Asp 130 135 140 Arg Ser Asp Ser Pro Gly Val Ile Glu Ile Asn Arg Leu Gln Phe Lys145 150 155 160 Arg Gly63780DNAVibrio splendidus 63gtgaaacgcg gtttatcttt caaatacggc ctgttattca gcgtcatttt tatcgttttt 60ttctcggtaa gcttgttgct gcatttgcct gccgcttttg ctctcaagca tgcacccgtc 120gtgcgtggtt taagcattga aggcgttgag ggcaccgttt ggcaaggtcg cgctaacaat 180atcgcgtggc agcgtgtcaa ttacggctca gtgcagtggg acttccagtt ctctaaacta 240ttccaagcca aagcagaact tgcggttcgc tttggccgca acagcgacat gaacttatca 300ggtaaaggac gtgtcggata tagcatgagt ggtgcttacg cggaaaactt agtggcatca 360atgccagcca gcaacgtgat gaaatatgcg ccagctatcc cagtgcctgt gtctattgca 420gggcaagttg aactgacgat caaacatgcg gttcatgctc aaccttggtg tcaatcaggt 480gaaggtacgc ttgcttggtc tggtgcagca gtcgactcgc cagtgggttc gttagacctt 540ggccctgtga ttgcggacat aacgtgtgaa gacagcacaa ttgcagccaa aggcactcag 600aagagcgatc aggtagacag cgagttctca gcgagcgtaa cacctaacca acgctacacc 660tcggcagcat ggtttaagcc aggcgctgaa ttcccgccag caatgcagag tcagcttaag 720tggttgggca atcctgatag ccaaggtaaa taccaattta cttatcaagg ccgcttttag 78064259PRTVibrio splendidus 64Met Lys Arg Gly Leu Ser Phe Lys Tyr Gly Leu Leu Phe Ser Val Ile1 5 10 15 Phe Ile Val Phe Phe Ser Val Ser Leu Leu Leu His Leu Pro Ala Ala 20 25 30 Phe Ala Leu Lys His Ala Pro Val Val Arg Gly Leu Ser Ile Glu Gly 35 40 45 Val Glu Gly Thr Val Trp Gln Gly Arg Ala Asn Asn Ile Ala Trp Gln 50 55 60 Arg Val Asn Tyr Gly Ser Val Gln Trp Asp Phe Gln Phe Ser Lys Leu65 70 75 80 Phe Gln Ala Lys Ala Glu Leu Ala Val Arg Phe Gly Arg Asn Ser Asp 85 90 95 Met Asn Leu Ser Gly Lys Gly Arg Val Gly Tyr Ser Met Ser Gly Ala 100 105 110 Tyr 
Ala Glu Asn Leu Val Ala Ser Met Pro Ala Ser Asn Val Met Lys 115 120 125 Tyr Ala Pro Ala Ile Pro Val Pro Val Ser Ile Ala Gly Gln Val Glu 130 135 140 Leu Thr Ile Lys His Ala Val His Ala Gln Pro Trp Cys Gln Ser Gly145 150 155 160 Glu Gly Thr Leu Ala Trp Ser Gly Ala Ala Val Asp Ser Pro Val Gly 165 170 175 Ser Leu Asp Leu Gly Pro Val Ile Ala Asp Ile Thr Cys Glu Asp Ser 180 185 190 Thr Ile Ala Ala Lys Gly Thr Gln Lys Ser Asp Gln Val Asp Ser Glu 195 200 205 Phe Ser Ala Ser Val Thr Pro Asn Gln Arg Tyr Thr Ser Ala Ala Trp 210 215 220 Phe Lys Pro Gly Ala Glu Phe Pro Pro Ala Met Gln Ser Gln Leu Lys225 230 235 240 Trp Leu Gly Asn Pro Asp Ser Gln Gly Lys Tyr Gln Phe Thr Tyr Gln 245 250 255 Gly Arg Phe6510967DNAErwinia carotovora subsp. Atroseptica SCRI1043 65aagttgcagg atatgacgaa agcgtggccg acgactatac cggccacgct ttgaggaatt 60acaggaaatc agctcgctta ggcgagaaag catcgatcag tacgctaccg tcttccagcg 120aaaccacgcc gtgcatctcg tgtttcaccg ccagataggc gtcgcccgtt ttcagggtgc 180gtttttcacc ttcgatcacg acttcaaagc tgccagcggc aacataagca atctggtcgt 240gaatctcatg gaagtgcggc gtaccaatcg cacctttatc aaagtgcacg taaaccatca 300tcagctcatc gctccatgtc atgattttac gtttaatgcc accgcccagc tcttcccatg 360gcgtttcatc atcaataaag tatcttctca tcatctctct cctctaacgc tctttttgcc 420cataccttct attgcgtcaa caaaccgtgt acgacaacga atgcatggct atggattgcg 480acattttagc cacatcagta ccagaagaaa cataaaataa gcaaaaccat gacggccctc 540aagaaataaa taaaacatta tttcattttt attgaattcg catctcatcc aaactatcat 600cccgcataac aagaaagaac cgggcatgtt gaggaacagg tgacgttgtc actgccacgc 660aacatcatct gtttcgcccg gcgctttcgc caggaacgat tcctcttctt ggaacggcgc 720ctgatttttg tttttctctg aaagagaggc taagaaatgc aagttcgtca aagcattcac 780agcgatcacg cgaagcagct agatacagca ggcctgcgtc gtgaattcct gatcgaacag 840attttttctg ccgatgccta cactatgacc tatagccaca tcgaccgaat catcgtcggt 900ggcatcatgc ccgtacacag cgccgtaacg attggcggtg aagtgggtaa acaactcggc 960gttagctatt tccttgagcg tcgcgaactc ggagccatca acattggcgg cgcgggtacc 1020gttactgtcg atggcgagcg ctatgacgtg ggtaatgaag aagcaattta tgttggcatg 
1080ggcgtgaaag acgtgcagtt taccagcact gatgccacta acccggccaa gttctactac 1140aacagcgcgc ctgcacatac gacatatcct acccgcaaga ttacccaagc tgacgcttca 1200ccacaaaccg tgggagaaga tgcaagctgt aatcgtcgca caattaacaa atacattgtt 1260cccgatgtat tgccaacctg ccagctcacc atgggattaa ccaagttagc tgaaggcagc 1320ctgtggaaca ccatgccttg tcatacgcat gagcgccgga tggaagtcta tttctatttt 1380gatatggatg aggaaacggc cgttttccac atgatggggc aaccgcagga aacccgtcac 1440atagttatta aaaacgagca ggcggtgatt tcaccgagct ggtcgattca ttccggtgtt 1500ggcaccagac gctacacctt tatctggggc atggttggcg agaatcaagt tttcggtgac 1560atggatcacg tcaaggttag cgagttacgt taatcgcttt caaccggaat taccggtgtt 1620ccctacagta acagctaacg actaagtatt gtcgcttata gagagattat tgatatgatt 1680ttaaattctt ttgatttgca aggtaaagtt gctcttatca cgggttgtga tacgggttta 1740ggtcagggta tggctatcgg tctggcacaa gctggctgtg atatcgttgg cgtcaacatc 1800gttgaaccaa aagataccat cgaaaaagtt accgcactgg gacgccgttt cctcagcctg 1860accgctgaca tgagcaacgt agcgggtcat gccgagctgg tagagaaagc cgttgctgaa 1920tttggtcacg ttgacattct ggtcaacaac gccggtatca tccgtcgtga agatgctatc 1980gagttcagcg agaaaaactg ggacgacgtc atgaatctga acattaagag cgttttcttt 2040atgtctcagg ctgttgcacg ccagtttatc aaacaaggta aaggcggcaa gatcatcaac 2100atcgcctcta tgctgtcctt ccaaggcggt atccgcgtgc cttcttacac tgcgtcaaaa 2160agcgccgtta tgggtgtaac ccgtctgctg gctaacgagt gggcaaaaca cggcatcaac 2220gttaacgcca ttgctccagg gtacatggca accaacaata ctcagcaact gcgcgccgat 2280gaagaccgca gcaaagagat tctggaccgt atcccggctg gccgttgggg tttaccacag 2340gatctgatgg gcccatccgt cttcctggca tccagcgcat ctgattacat caatggctac 2400acgattgccg ttgatggtgg ctggctggct cgctaagtgt aatttttctt agcggcattt 2460cgctaatcca cgataaaaag cacaatttag gttgtgcttt ttatttattt ttcaagttgt 2520tatttcgttt tttataattc tcttttctgc ctaaatcctt tcttaaaaaa aaatcaaaac 2580aacgttccga ctttgatcac actttcgata ttgcgtgcat gacgacaagg ttaatagcgc 2640aatataatca atcaaaacag tgtttctatt tataaggaac tgttcacgca gttccataag 2700aaggtactcc atgagtattt ttgaaaactt atacaccagc aggaaatcgc agctcgacga 2760atgggttgct gcacttgata gccacatatc 
ctgcgttcag gaaaaaggcc gcagccaaag 2820ccaaccgacg ctattactgg ccgatggttt tgatgtggaa aattatgcgc ctgcggtatg 2880gcaatttccg gatgggcaca gcgcgcctat ttctaatttt gccagccagc agaattggct 2940aagaacgctg tgcgccatga gcgtcgttac gggtaatgat agttaccaac agcacgctat 3000cgcacaaagc gaatatttcc tggatcattt cgttgatgat aatagcggcc tgttctactg 3060gggcggccat cgctttatta atctggatac gctggaaggc gaagggccag aatccaaagc 3120tcaggtgcat gaattaaagc accacctgcc ctattacgcg ctgttacatc gtgttaacgc 3180ggaaaagacg ctgaacttct ttcaggggtt ctggaacgca cacgttgaag attggaattc 3240actggatctg ggtcgtcatg gcgattacag caaaaaacgc gatcctgatg ttttcctgca 3300taaccgtcat gatgtcgtcg atccggcaca gtggcccgtt ctgccattaa cgaaaggcct 3360gacgtttgtt aatgccggca cggatctgat ttacgccgca ttcaaatatg cagaatatac 3420gggcgatagc catgccgcgg catggggtaa acacctttat cgccaatacg ttctggctcg 3480caacccagaa accggtatgc cggtgtatca attcagttca ccacagcagc gccagccagt 3540gccggaagac gataaccaga cgcagtcctg gtttggcgat cgcgctcaac gccagtttgg 3600cccagagttc ggtgaaatcg cacgtgaagc caatgtgctg ttccgcgata tgcgtccact 3660gctgattgat aacccgctgg caatgctgga tatcctccgc acacagcctg atgcagaaat 3720gctgaattgg gtaatctctg gattaaaaaa ttattaccag tacgcctacg atgtcaccag 3780caatacgttg cgcccgatgt ggaacaacgg gcaggacatg acaggctacc gttttaaacg 3840cgatggctat tacggcaaag cgggaacgga attaaaaccg ttcgcattag aaggtgatta 3900tttattacct ctggttcgtg cttatcgtct gagcggtgat gaagacctgt acgcactggt 3960taacaccatg ctgacacggc tgaataaaga agatattcag cacatcgcca gtccgctact 4020tttgttgacc gttatcgaac tggccgatca caagcaatca gaatcctggg cacattacgc 4080cgcacaactg gcgggcgtta tgtttgaaca acatttccat cgtggtttgt ttgttcgctc 4140tgcacagcat cgttatgttc gtctggatga tacctatccg ctggctttac tgactttcgt 4200tgccgcctgt cgcaacaaat taaacgatat cccgccgtat ctgacacaag gtggatatgt 4260tcacggcgat tttcacgtta acggggaaaa tagaattgtt tatgacgtgg aattaattta 4320tccagagtta ttaacagctt aattttatgt tttttttaat gattcacaat taatcaatag 4380gtaagcatta tgaatgaaaa cagaatgctg gggttagcct atatctcccc ctatattata 4440gggctgatag tttttaccgc tttccccttt atttcgtcat ttatcctcag ttttactgag 
4500tatgatttga
tgagtccgcc tgagtttacg ggtcttgaga actatcaccg tatgttcatg 4560gaggatgatc ttttttggaa atcaatgggc gtcacctttg cctatgtatt tctgaccatt 4620ccattgaaat taatcttcgc actgttaatt gcgtttgtac ttaatttcaa attacgtggt 4680atcggtttct tccgtactgc ttactatgtg ccttctattc tgggcagcag cgtggccatt 4740gccgttctgt ggcgtgccct attcgccatc gatggcttgc tgaacagctt cctcggcgta 4800tttggctttg atgccatcaa ctggctgggc gaaccttcgc tggcactgat gtcggtaacc 4860ctgctgcgcg tatggcagtt tggttccgcc atggttatct tccttgctgc attgcagaac 4920gtcccgcaat cacagtatga agcagccatg atcgacggtg catccaaatg gcaaatgttc 4980ctgaaagtaa cggttccact gattacgccg gttattttct ttaactttat catgcagacc 5040actcaggcat tccaggagtt tacggcacct tacgtcatca ctggcggcgg tccaacgcac 5100tacacctatc tgttctcgct ctatatctat gataccgcgt tcaagtattt cgatatgggc 5160tatggtgctg cgctggcatg ggttctgttc ctggttgttg cggtatttgc ggcaatctcc 5220tttaagtcgt cgaaatactg ggtgttctac tccgctgata aaggaggaaa aaatggctga 5280catgcattca aacctgacta cagcacaaga aattgctgct gcagaagtac gccgcacgct 5340gcgtaaagag aaactcagtg cctccatccg ttacgtgata ctgctgttcg ttggcttact 5400gatgctttac ccactagcgt ggatgttctc agcgtcgttc aaaccgaacc aagagatctt 5460cacgacactg ggcctgtggc cggaacacgc cacatgggac ggtttcgtta acggttggaa 5520aaccggtacg gaatacaatt tcggtcacta catgatcaat acgctcaagt tcgtgattcc 5580gaaagtgcta ctgaccatta tctcttccac cattgtcgct tacggctttg cccgtttcga 5640gattccatgg aagggcttct ggttcgggac gctgatcacc accatgctgt taccaagcac 5700cgtgttgctg attccgcagt acatcatgtt ccgtgaaatg ggcatgctga acagctatct 5760gccactgtac ttgccgatgg cgtttgcaac acaagggttc tttgtgttca tgctgatcca 5820gttcctgcgt ggtgtaccac gtgatatgga agaagccgcc cagatcgatg gctgtaactc 5880cttccaggtt ctgtggtatg tggtcgtgcc gattttgaaa ccagccatca tctctgttgc 5940gctgttccag ttcatgtggt caatgaacga cttcatcggt ccgctgattt atgtctatag 6000cgtggataaa tatccgattg cgctggcgct gaaaatgtct atcgacgtta ctgaaggcgc 6060tccgtggaat gaaatcctgg caatgtccag catctccatt ctgccatcca ttattgtttt 6120cttcctggca cagcgttact tcgtacaagg cgtgaccagc agcggaatta aaggttaata 6180gaggatttat catggctgaa gttattttca ataaactgga 
aaaagtatac accaacggct 6240tcaaagcggt tcacggcatc gacctgacca ttaaagacgg tgagttcatg gttatcgtcg 6300gcccgtcagg ctgtgcgaaa tcaacgacgc tgcgtatgtt agcgggtctg gaaaccatca 6360gcggcggtga agttcgcatc ggcgagcgcg ttgttaacaa tctggcaccg aaagagcgtg 6420ggattgcaat ggtgttccag aactatgcgc tctaccctca tatgacggta aaagagaacc 6480tggcgtttgg tctgaagctg agcaaaatgc ctaaagatca aattgaagcg caagtaacgg 6540aagcagccaa aattctggag ctggaagacc tgatggatcg tctgccacgc cagctatctg 6600gtggtcaggc gcagcgtgtg gccgtaggcc gtgccatcgt taaaaagccg gatgttttcc 6660tgtttgatga accgttatct aacctggatg ccaaactgcg tgcttccatg cgtatccgta 6720tttctgacct gcataagcag ttgaagaaaa gcggtaaagc ggcaacgacg gtatatgtta 6780cccacgacca gactgaagcc atgaccatgg gcgaccgtat ctgcgttatg aagctgggtc 6840acatcatgca ggtcgatacg ccggataacc tgtaccattt ccctgtcaac atgttcgttg 6900ctggcttcat tggctcacca gaaatgaaca ttaagccgtg caaactggtc gagaaagacg 6960gtcagattgg cgttgttgtg ggtaataacg cgctggtatt aaatactgaa aaacaagata 7020aagtgcgcag ctacgtagga caagacgtat tcttcggcgt tcgcccagac tatgtttcct 7080tgtcagatac gccatttgaa ggcagccact cacagggtga actggttcgc gtagaaaaca 7140tgggtcacga attctttatg tacattaaag tcgatggctt tgaattaacc agccgcattc 7200cttatgacga aggtcggctg attatcgaga agggactgca tcgtccggta tatttccagt 7260tcgacatgga aaaatgccat atttttgatg caaaaacaga aaaaaatatc tctctttaac 7320aggagtagta accgatgaaa aaagcgatcc tacacacgtt aatagcttca tctttggcat 7380tagttgcaat gccatctctg gcagccgatc aggttgagtt gagaatgtcc tggtggggcg 7440gcaacagccg tcaccaacag acgctcaagg cgattgaaga gttccataag cagcacccag 7500acatcaccgt gaaagcggaa tacaccggat gggatggtca cctgtctcgt ctgacaacac 7560agattgccgg taacactgag ccagatgtga tgcagactaa ctggaactgg ctgccgattt 7620tctccaaaaa cggcgatggt ttttatgatc tgaacaaagt gaaagattct ctggatctga 7680cccagttcga agcaaaagaa ctgcaaaaca ccacggttaa cggcaagctg aacggtattc 7740ctatttctgt taccgctcgc gtgttctatt tcaacaacga aagctgggca aaagcgggac 7800tggaataccc gaaaacgtgg gacgaactgc tgaacgccgg taaagtgttc aaagagaagc 7860tgggcgacca atactaccct atcgtgttgg aacaccagga ttctctggca ctgctgaact 7920cttacatggt 
tcaaaaatac aacattcctg ctattgatgt gaaaagtcag aaattcgcct 7980ataccgatgc acaatgggtt gaattctttg gcatgtataa gaaactgatc gacagccatg 8040tcatgcctga tgcgaaatac tatgcctctt tcggtaagag caacatgtat gagatgaagc 8100catggatcaa tggcgagtgg tctggtactt acatgtggaa ctccactatc actaagtact 8160ctgacaactt gcaaccacca gcaaaactgg cgttaggtaa ctacccaatg ctgcctggtg 8220caaaagatgc tggcttgttc ttcaaacctg cacaaatgct gtctatcggt aagtcaacca 8280agcatcctaa agagtctgct cagttgatca acttcctgct gaacagcaaa gaaggtgctc 8340aggctttggg tctggaacgt ggtgtaccgt tgagtaaagc ggctgtggct cagctgaccg 8400ctgatggcat catcaaagat gatgctccag cagttgccgg gttgaagctg gcgctgtctc 8460tgccgcatga agttgctgtt tctccttatt tcgacgaccc acaaatcgtt tctctgtttg 8520gtgataccat ccaatctatc gattatggtc agaaatctgt ggaagacgca gcgaaatact 8580tccagcgtca atctgagcgt gttctgaaac gcgcaatgaa ataatgtagc actcgattta 8640ccctgtaatt catccctgcc gcaccgacgg cagggatttt tcatttaaat taaaacatcc 8700tctatattca attcgatctc cctcacaatt tgaaacccta ttttactttt tgttactcaa 8760aacgatctcg atcacagaac gtaatttaat aataaataga atagaacttg tcccaaaaaa 8820cataatgcgc ctttcgaatt aaagtattaa gcacagtcct aaccaatggg gaatataaca 8880atgaaattta aattattagc tctggctgtt acatcattaa ttagtgtgaa tgcaatggct 8940gtaactatcg attaccgtca tgaaatgaaa gatacaccga aaaatgatca ccgcgatcgt 9000ttgtcaatgt cacaccgttt tgccaatggc tttggtttat ccgttgaagc aaaatggcgt 9060caatccagtg ctgacagcac accgaataaa ccatttaatg aaaccgtcag caacggtact 9120gaagttgtcg ccagctatgt ttacaacttc aacaaaactt tttctctgga gccaggtttc 9180tctttagatt caagctctac ctctaacaac tatcgccctt atctgcgcgg taaagtgaat 9240atcactgacg atctttctac ctctttacgt tatcgtcctt actacaaacg taacagcggt 9300gatgttccaa atgcatcaaa aaacaaccaa gagaatggtt ataacctaac cgccgttctc 9360agctataaat tcctgaaaga tttccaagtt gattacgaac tggactacaa aaaagcaaat 9420aaagccggtg cgtatcaata cgacaatgaa acatacaatt tcgaccatga tgtaaaattg 9480tcttataaaa tggataaaaa ctggaagcct tatatggctg taggtaatgt tgcagattcc 9540ggcaccaacg atcatcgtca aactcgttac cgtgttggtg tgcaatacag cttctaataa 9600cggccttgtt atttaaataa gcgttattag gtagcagaag 
ggatgttatt gttaatcgat 9660ttactcagat ctacttttat cattaacatc cctttattat ggtgtccgtt gtaggttaag 9720caggttagtt acgtttcttt gttgtacatg atttagttat atgcgtttta gctgctgtaa 9780ttgctgtgtc tgatttaccc tcttcgtgta tgaatgttat ttctttatta aaatttgcgg 9840ttcagggtag tcattttttc tccgatgtga tggctaccct attttttacc accgcccaac 9900gattcccccc tcattccctt tgtcaggtga tctatcatga ttgttcgttc tctgcttgtc 9960ggggccatta tgatgtctgt aaatggatta agttacgcac aacctgtttt ctctgtctgg 10020ccacacggtg aagcaccggg tgcctcttct tcaacggcac agccgcaagt ggtcgaacgg 10080agtaaagatc cttctcttcc cgatcgagcc gcaacgggta ttcgcagccc tgaaattacc 10140gtttatccgg cagagaaacc caatggcatg gcattactca ttacgccggg cggttcttat 10200cagcgcgtcg tgctagataa agaaggcagc gatctagccc ctttctttaa tcaacaaggc 10260tacacccttt tcgtgatgac ctatcgtatg cccggtgaag gccataaaga aggcgctgac 10320gctccgctag ccgatgccca acgagccatc agaacactga gagccaacgc cgaaaagtgg 10380cacattaacc cgcagcgcat cggtattatg gggttctccg ccggtggtca cgttgccgcc 10440agccttggaa cccgattcgc acagtccgtt taccccgcga tggacgccgt tgataacgta 10500agcgcacgcc ctgacttcat ggtgttgatg taccccgtaa tttctatgca ggcagatatt 10560gcgcacgccg gttcacgtaa acagttaatc ggcgagcaac cgatggaagt acaagcggta 10620cgttattctc ctgagaaaca ggttactgat cagactcccc ccacgttttt ggtgcatgcg 10680gttgacgatc cgtcagtgtc ggttgataac agcctggtga tgtttagcgc gctgcgggca 10740aagcagattc cggtcgaaat gcatctcttt gagaaaggta aacacggctt cggtctccgc 10800ggcaccaagg ggcttcctgc cgctgcctgg cctcaactgc tggacaactg gctacgcgct 10860ttacctgcaa gcaacgaatt gccgaaagcc gcgccataag gtatagcaaa catcgtaacc 10920gaaataaatc gttacgccgt caccgcttcc gcagacaggg ataatct 10967662582DNAErwinia carotovora subsp. 
Atroseptica SCRI1043 66ccaacggcgg gtgcgacata aacataagcg aatcgaagcg ctgcgctccg gtgagtatct 60gaagtaattt acgatagttt ctttccaaag gcccattcgg gcctttgtta tttcagcgtt 120tattgattca tcaaacctgc gctttctctg ctcgaatgtt ttcactagat ctgaaacagg 180tggtgaaaac atgaagaatg ttttataaaa taaaaccacg atcacggaaa aatgaaacat 240tgtttctata ataccgatat gacaggcgtc tcgcgtgaga tttgtggcct gatttttgaa 300caaccggtgt cggggtgacc gattcgtcgg acgttcagta atgtcaggtt atcgaagcgt 360atgcgtgtgt ggcgtcaaat tcttcatgat aagttctaag gatttacgga tggccaaagg 420taataagatc cccctaacgt ttcataccta ccaggatgca gcaaccggca ccgaagttgt 480gcgtttaacc ccgcccgatg ttatctgcca ccggaattat ttctaccaga agtgtttctt 540caatgacggt agcaagctgc tgtttggcgc tgcatttgat ggcccatgga actactatct 600gctggattta aaagagcaga acgccacaca gttgacggaa ggcaaaggcg acaatacttt 660tggtggtttc ctgtctccga atgacgatgc gctatattac gttaaaaata cccgtaattt 720gatgcgtgtc gatctgacta cgctggaaga gaaaacgatt tatcaggtgc ctgacgattg 780ggtcggctac ggtacttggg ttgccaactc cgattgcacc aaaatggtcg gtattgagat 840caagaaagaa gactggaagc cactgaccga ttggaaaaaa ttccaggagt tctacttcac 900taatccttgc tgtcgtctga ttcgcgtcga tttggtaacg ggcgaagcgg agactatcct 960tcaggaaaac cagtggctgg gtcacccaat ctaccgtcca ggtgatgaca acacggttgc 1020tttctgtcac gaaggcccgc atgacctggt tgatgctcgt atgtggttca tcaacgaaga 1080tggcaccaac atgcgcaaag tgaaagagca tgcagaaggc gaaagctgca cccacgaatt 1140ttgggtgccg gatggctccg cgatgattta tgtctcttat cttaaagacg ataccaaccg 1200ttatattcgc agcatcgatc ccgttacgct ggaagatcgc caactgcgtg taatgccgcc 1260gtgttctcac ctgatgagta actatgatgg cacactgttg gtcggtgatg gttccgatgc 1320accggtcgac gtgcaggatg atggtggcta caaaattgag aacgatccgt tcctgtatgt 1380tttcaacctg aaaactggca aagaacatcg tattgcgcag cacaatacat cctgggaagt 1440gttggaaggg gaccgtcagg tcactcaccc gcacccgtct ttcacgccgg ataataaaca 1500agttctgttt acttctgacg tagatggaaa acctgcgttg tatctggcga aggttcctga 1560ttcagtctgg aactaataat actaataaat ccgcgtcacg tttcatggcg cggattattt 1620taaaatattt acttacatat tattttatta agtctctgac gcggttattt ctcaaactta 1680acttgattat cgttgttgct ccattgccat 
aatcaaagcg ttccctttat actaaaacca 1740ttgttctatt ttttttaaaa caaaaaaacc tgagtagggt aaccacaaaa atggctagtg 1800cagatttaga taaacaaccc gattccgtgt cgtccgtttt aaaggttttt ggtattttgc 1860aggcattagg tgaagagaga gaaattggta ttaccgagct ttctcagcga gtcatgatgt 1920ctaagagtac cgtttaccgt ttcttgcaga cgatgaaatc cctgggctat gtcgcgcagg 1980aaggtgaatc agagaagtat tcgctaacgc tcaagttgtt tgaacttggt gcaaaagcat 2040tgcagaacgt agacttaatc cgcagtgcgg atatacagat gcgcgagttg tctgtgctga 2100cgcgggaaac gattcacctt ggcgcgttgg atgaagacgg catcgtttat atccacaaga 2160ttgattctat gtataacctg cgtatgtatt cgcgcatcgg tcgccgtaat ccactacaca 2220gtaccgcaat tggtaaagtg ttgctggctt ggcgcgatcg cggtgaagtg gaagaggttc 2280tgtcgactgt cgaattcacg cgtagtacgc cacacacatt gtgtactgct gaagatcttc 2340tcaatcaact ggatgtcgtg cgtgagcaag gctacgggga agataaagaa gagcaggaag 2400aagggctgcg ttgtatcgct gtgccagtat tcgatcgttt tggtgtggtg attgccggcc 2460tcagtatttc cttcccaacg attcgttttt cagaagaaaa caaacacgaa tatgtggcca 2520tgctgcacac cgcagctaga aatatctctg agcaaatggg ctaccacaat ttccctttct 2580ga 2582672331DNAAgrobacterium tumefaciens 67atgcgtccct ctgccccggc catctccaga cagacacttc tcgatgaacc ccgcccgggc 60tcattgacca ttggctacga gccgagcgaa gaagcacaac cgacggagaa ccctccgcgc 120ttttcatggc tacccgatat tgacgacggc gcgcgttacg tgctgcgcat ttcgaccgat 180cccggtttta cagacaaaaa aacgctcgtc ttcgaggatc tcgcctggaa tttcttcacc 240ccggatgaag cactgccgga cggccattat cactggtgtt atgcgctatg ggatcagaaa 300tccgcaacag cgcattccaa ctggagcacc gtacgcagtt tcgagatcag tgaagcactg 360ccgaaaacgc cgctgcccgg caggtctgcc cgccatgctg ccgcgcaaac cagccaccct 420cggctgtggc tcaactccga gcaattgagt gccttcgccg atgccgttgc gaaggacccc 480aaccattgtg gctgggccga gttttacgaa aaatcggtcg agccgtggct cgagcggccg 540gtcatgccgg aaccgcagcc ctatcccaac aacacgcgtg tcgccacgct ctggcggcag 600atgtatatag actgccagga agtgatctat gcgatccggc acctggccat tgccggccgc 660gtgctcggac gcgacgacct tctcgatgca tcccgcaaat ggctgctggc cgtcgccgcc 720tgggacacga aaggtgcgac ctcacgcgcc tataatgacg aggcggggtt ccgcgtcgtc 780gtcgcactcg cctggggtta tgactggctg 
tacgaccatc tgagcgaaga cgaacgcagg 840accgtgcgat ccgttcttct cgaacggacg cgggaagttg ccgatcatgt catcgcacac 900gcccgcattc acgtctttcc ctatgacagc catgcggtgc gctcgctttc ggctgtattg 960acgccggcct gcatcgcact tcagggagaa agcgacgagg ctggcgaatg gctcgactat 1020accgtcgaat tccttgccac gctctattct ccctgggcgg gaaccgatgg tggttgggcg 1080gaaggtccgc attactggat gaccggcatg gcctatctca tcgaggccgc caatctgatc 1140cgctcctata ttggttatga cctctatcaa cggccgtttt tccagaatac cggtcgcttc 1200ccgctttaca ccaaggcgcc gggaacccgc cgcgccaact tcggcgacga ctccaccctt 1260ggcgaccttc ccggcctgaa gctgggatac aacgtccggc aattcgccgg cgtcaccggc 1320aatggccatt accagtggta tttcgatcac atcaaggccg atgcgacagg cacggaaatg 1380gccttttaca attacggctg gtgggacctc aacttcgacg atctcgtcta tcgccacgat 1440tacccgcagg tggaagccgt gtctcccgcc gacctgccgg cactcgccgt tttcgatgat 1500attggttggg cgaccatcca aaaagacatg gaagacccgg accggcacct gcagttcgtc 1560ttcaaatcca gcccttacgg ttcgctcagc cacagtcacg gcgaccagaa tgcctttgtg 1620ctttatgccc atggcgagga tctggcgatc cagtccggtt attacgtggc gttcaattcg 1680cagatgcatc tgaattggcg gcgtcagaca cggtcgaaaa atgccgtgct gatcggcggc 1740aaaggccaat atgcggaaaa ggacaaggcg cttgcacgcc gcgccgccgg ccgcatcgtc 1800tcggtggagg aacagcccgg ccatgttcgt atcgtcggcg atgcaaccgc cgcctaccag 1860gttgcgaacc cgctggttca aaaggtgctg cgcgaaaccc acttcgttaa tgacagctat 1920ttcgtgattg tcgacgaagt cgaatgttcg gaaccccagg aactgcaatg gctttgccat 1980acactcggag cgccgcagac cggcaggtca agcttccgct acaatggccg gaaagccggt 2040ttctacggac agttcgttta ctcttcgggc ggcacgccgc aaatcagcgc cgtggagggt 2100tttcccgata tcgacccgaa agaattcgaa gggctcgaca tacaccacca tgtctgcgcc 2160acggttccgg ccgccacccg gcatcgcctt gtcacccttc tggtgcctta cagcctgaag 2220gagccgaagc gcattttcag cttcatcgat gatcagggtt tttccaccga catctacttc 2280agtgatgtcg atgacgagcg tttcaagctc tcccttccca agcagttcta a 233168776PRTAgrobacterium tumefaciens 68Met Arg Pro Ser Ala Pro Ala Ile Ser Arg Gln Thr Leu Leu Asp Glu1 5 10 15 Pro Arg Pro Gly Ser Leu Thr Ile Gly Tyr Glu Pro Ser Glu Glu Ala 20 25 30 Gln Pro Thr Glu Asn Pro Pro Arg Phe Ser 
Trp Leu Pro Asp Ile Asp 35 40 45 Asp Gly Ala Arg Tyr Val Leu Arg Ile Ser Thr Asp Pro Gly Phe Thr 50 55 60 Asp Lys Lys Thr Leu Val Phe Glu Asp Leu Ala Trp Asn Phe Phe Thr65 70 75 80 Pro Asp Glu Ala Leu Pro Asp Gly His Tyr His Trp Cys Tyr Ala Leu 85 90 95 Trp Asp Gln Lys Ser Ala Thr Ala His Ser Asn Trp Ser Thr Val Arg 100 105 110 Ser Phe Glu Ile Ser Glu Ala Leu Pro Lys Thr Pro Leu Pro Gly Arg 115 120 125 Ser Ala Arg His Ala Ala Ala Gln Thr Ser His Pro Arg Leu Trp Leu 130 135 140 Asn Ser Glu Gln Leu Ser Ala Phe Ala Asp Ala Val Ala Lys Asp Pro145 150 155 160 Asn His Cys Gly Trp Ala Glu Phe Tyr Glu Lys Ser Val Glu Pro Trp 165 170 175 Leu Glu Arg Pro Val Met Pro Glu Pro Gln Pro Tyr Pro Asn Asn Thr 180 185 190 Arg Val Ala Thr Leu Trp Arg Gln Met Tyr Ile Asp Cys Gln Glu Val 195 200 205 Ile Tyr Ala Ile Arg His Leu Ala Ile Ala Gly Arg Val Leu Gly Arg 210 215 220 Asp Asp Leu Leu Asp Ala Ser Arg Lys Trp Leu Leu Ala Val Ala Ala225 230 235 240 Trp Asp Thr Lys Gly Ala Thr Ser Arg Ala Tyr Asn Asp Glu Ala Gly 245 250 255 Phe Arg Val Val Val Ala Leu Ala Trp Gly Tyr Asp Trp Leu Tyr Asp 260 265 270 His Leu Ser Glu Asp Glu Arg Arg Thr Val Arg Ser Val Leu Leu Glu 275 280 285 Arg Thr Arg Glu Val Ala Asp His Val Ile Ala His Ala Arg Ile His 290 295 300 Val Phe Pro Tyr Asp Ser His Ala Val Arg Ser Leu Ser Ala Val Leu305 310 315 320 Thr Pro Ala Cys Ile Ala Leu Gln Gly Glu Ser Asp Glu Ala Gly Glu 325 330 335 Trp Leu Asp Tyr Thr Val Glu Phe Leu Ala Thr Leu Tyr Ser Pro Trp 340 345 350 Ala Gly Thr Asp Gly Gly Trp Ala Glu Gly Pro His Tyr Trp Met Thr 355 360 365 Gly Met Ala Tyr Leu Ile Glu Ala Ala Asn Leu Ile Arg Ser Tyr Ile 370 375 380 Gly Tyr Asp Leu Tyr Gln Arg Pro Phe Phe Gln Asn Thr Gly Arg Phe385 390 395 400 Pro Leu Tyr Thr Lys Ala Pro Gly Thr Arg Arg Ala Asn Phe Gly Asp 405 410 415 Asp Ser Thr Leu Gly Asp Leu Pro Gly Leu Lys Leu Gly Tyr Asn Val 420 425 430 Arg Gln Phe Ala Gly Val Thr Gly Asn Gly His Tyr Gln Trp Tyr Phe 435 440 445 Asp His Ile Lys Ala Asp Ala Thr Gly Thr Glu Met Ala Phe Tyr 
Asn 450 455 460 Tyr Gly Trp Trp Asp Leu Asn Phe Asp Asp Leu Val Tyr Arg His Asp465 470 475 480 Tyr Pro Gln Val Glu Ala Val Ser Pro Ala Asp Leu Pro Ala Leu Ala 485 490 495 Val Phe Asp Asp Ile Gly Trp Ala Thr Ile Gln Lys Asp Met Glu Asp 500 505 510 Pro Asp Arg His Leu Gln Phe Val Phe Lys Ser Ser Pro Tyr Gly Ser 515 520 525 Leu Ser His Ser His Gly Asp Gln Asn Ala Phe Val Leu Tyr
Ala His 530 535 540 Gly Glu Asp Leu Ala Ile Gln Ser Gly Tyr Tyr Val Ala Phe Asn Ser545 550 555 560 Gln Met His Leu Asn Trp Arg Arg Gln Thr Arg Ser Lys Asn Ala Val 565 570 575 Leu Ile Gly Gly Lys Gly Gln Tyr Ala Glu Lys Asp Lys Ala Leu Ala 580 585 590 Arg Arg Ala Ala Gly Arg Ile Val Ser Val Glu Glu Gln Pro Gly His 595 600 605 Val Arg Ile Val Gly Asp Ala Thr Ala Ala Tyr Gln Val Ala Asn Pro 610 615 620 Leu Val Gln Lys Val Leu Arg Glu Thr His Phe Val Asn Asp Ser Tyr625 630 635 640 Phe Val Ile Val Asp Glu Val Glu Cys Ser Glu Pro Gln Glu Leu Gln 645 650 655 Trp Leu Cys His Thr Leu Gly Ala Pro Gln Thr Gly Arg Ser Ser Phe 660 665 670 Arg Tyr Asn Gly Arg Lys Ala Gly Phe Tyr Gly Gln Phe Val Tyr Ser 675 680 685 Ser Gly Gly Thr Pro Gln Ile Ser Ala Val Glu Gly Phe Pro Asp Ile 690 695 700 Asp Pro Lys Glu Phe Glu Gly Leu Asp Ile His His His Val Cys Ala705 710 715 720 Thr Val Pro Ala Ala Thr Arg His Arg Leu Val Thr Leu Leu Val Pro 725 730 735 Tyr Ser Leu Lys Glu Pro Lys Arg Ile Phe Ser Phe Ile Asp Asp Gln 740 745 750 Gly Phe Ser Thr Asp Ile Tyr Phe Ser Asp Val Asp Asp Glu Arg Phe 755 760 765 Lys Leu Ser Leu Pro Lys Gln Phe 770 775 691068DNAAgrobacterium tumefaciens C58 69atgttcacaa cgtccgccta tgcctgcgat gacggctctt cgccgatgaa gctcgcgacc 60atcaggcgcc gcgatcccgg tccgcgcgat gtcgaaatcg agatagaatt ctgtggcgtc 120tgccactcgg acatccatac ggcccgcagc gaatggccgg gctccctcta cccttgcgtc 180cccggccacg aaatcgtcgg ccgtgtcggt cgggtgggcg cgcaagtcac ccggttcaag 240acgggtgacc gcgtcggtgt cggctgtatc gtcgatagct gccgcgaatg cgcaagctgc 300gccgaagggc tggagcaata ttgcgaaaac ggcatgaccg gcacctataa ctcccctgac 360aaggcgatgg gcggcggcgc gcatacgctt ggcggctatt ccgcccatgt ggtggtggat 420gaccgctatg tgctcaatat tcccgaaggg ctcgatccgg cggcagcagc accgctactc 480tgcgctggta tcaccaccta ctcgccgctg cgccactgga atgccggccc cggcaaacgc 540gtcggcgtcg tcggtctggg cggcctcggc catatggccg tcaagctcgc caatgccatg 600ggtgcgactg tcgtgatgat caccacctcg cccggcaagg cggaggatgc caaaaaactc 660ggcgcacacg aggtgatcat ctcccgcgat gcggagcaga tgaagaaggc tacctcgagc 
720ctcgatctca tcatcgatgc tgtcgccgcc gaccacgaca tcgacgccta tctggcgctg 780ctgaaacgcg atggcgcgct ggtgcaggtg ggcgcgccgg aaaagccact ttcggtgatg 840gccttcagcc tcatccccgg ccgcaagacc tttgccggct cgatgatcgg cggtattccc 900gagactcagg aaatgctgga tttctgcgcc gaaaaaggca tcgccggcga aatcgagatg 960atcgatatcg atcagatcaa tgacgcttat gaacgcatga taaaaagcga tgtgcgttat 1020cgtttcgtca ttgatatgaa gagcctgccg cgccagaagg ccgcctga 106870355PRTAgrobacterium tumefaciens C58 70Met Phe Thr Thr Ser Ala Tyr Ala Cys Asp Asp Gly Ser Ser Pro Met1 5 10 15 Lys Leu Ala Thr Ile Arg Arg Arg Asp Pro Gly Pro Arg Asp Val Glu 20 25 30 Ile Glu Ile Glu Phe Cys Gly Val Cys His Ser Asp Ile His Thr Ala 35 40 45 Arg Ser Glu Trp Pro Gly Ser Leu Tyr Pro Cys Val Pro Gly His Glu 50 55 60 Ile Val Gly Arg Val Gly Arg Val Gly Ala Gln Val Thr Arg Phe Lys65 70 75 80 Thr Gly Asp Arg Val Gly Val Gly Cys Ile Val Asp Ser Cys Arg Glu 85 90 95 Cys Ala Ser Cys Ala Glu Gly Leu Glu Gln Tyr Cys Glu Asn Gly Met 100 105 110 Thr Gly Thr Tyr Asn Ser Pro Asp Lys Ala Met Gly Gly Gly Ala His 115 120 125 Thr Leu Gly Gly Tyr Ser Ala His Val Val Val Asp Asp Arg Tyr Val 130 135 140 Leu Asn Ile Pro Glu Gly Leu Asp Pro Ala Ala Ala Ala Pro Leu Leu145 150 155 160 Cys Ala Gly Ile Thr Thr Tyr Ser Pro Leu Arg His Trp Asn Ala Gly 165 170 175 Pro Gly Lys Arg Val Gly Val Val Gly Leu Gly Gly Leu Gly His Met 180 185 190 Ala Val Lys Leu Ala Asn Ala Met Gly Ala Thr Val Val Met Ile Thr 195 200 205 Thr Ser Pro Gly Lys Ala Glu Asp Ala Lys Lys Leu Gly Ala His Glu 210 215 220 Val Ile Ile Ser Arg Asp Ala Glu Gln Met Lys Lys Ala Thr Ser Ser225 230 235 240 Leu Asp Leu Ile Ile Asp Ala Val Ala Ala Asp His Asp Ile Asp Ala 245 250 255 Tyr Leu Ala Leu Leu Lys Arg Asp Gly Ala Leu Val Gln Val Gly Ala 260 265 270 Pro Glu Lys Pro Leu Ser Val Met Ala Phe Ser Leu Ile Pro Gly Arg 275 280 285 Lys Thr Phe Ala Gly Ser Met Ile Gly Gly Ile Pro Glu Thr Gln Glu 290 295 300 Met Leu Asp Phe Cys Ala Glu Lys Gly Ile Ala Gly Glu Ile Glu Met305 310 315 320 Ile Asp Ile Asp Gln Ile Asn Asp Ala Tyr 
Glu Arg Met Ile Lys Ser 325 330 335 Asp Val Arg Tyr Arg Phe Val Ile Asp Met Lys Ser Leu Pro Arg Gln 340 345 350 Lys Ala Ala 355 711047DNAAgrobacterium tumefaciens C58 71atggctattg caagaggtta tgctgcgacc gacgcgtcga agccgcttac cccgttcacc 60ttcgaacgcc gcgagccgaa tgatgacgac gtcgtcatcg atatcaaata tgccggcatc 120tgccactcgg acatccacac cgtccgcaac gaatggcaca atgccgttta cccgatcgtt 180ccgggccacg aaatcgccgg tgtcgtgcgg gccgttggtt ccaaggtcac gcggttcaag 240gtcggcgacc atgtcggcgt cggctgcttt gtcgattcct gcgttggctg cgccacccgc 300gatgtcgaca atgagcagta tatgccgggt ctcgtgcaga cctacaattc cgttgaacgg 360gacggcaaga gcgcgaccca gggcggttat tccgaccata tcgtggtcag ggaagactac 420gtcctgtcca tcccggacaa cctgccgctc gatgcctccg cgccgcttct ctgcgccggc 480atcacgctct attcgccgct gcagcactgg aatgcaggcc ccggcaagaa agtggctatc 540gtcggcatgg gtggccttgg ccacatgggc gtgaagatcg gctcggccat gggcgctgat 600atcaccgttc tctcgcagac gctgtcgaag aaggaagacg gcctcaagct cggcgcgaag 660gaatattacg ccaccagcga cgcctcgacc tttgagaaac tcgccggcac cttcgacctg 720atcctgtgca cagtctcggc cgaaatcgac tggaacgcct acctcaacct gctcaaggtc 780aacggcacga tggttctgct cggcgtgccg gaacatgcga tcccggtgca cgcattctcg 840gtcattcccg cccgccgttc gctcgccggt tcgatgatcg gctcgatcaa ggaaacccag 900gaaatgctgg atttctgcgg caagcacgac atcgtttcgg aaatcgaaac gatcggcatc 960aaggacgtca acgaagccta tgagcgcgtg ctgaagagcg acgtgcgtta ccgcttcgtc 1020atcgacatgg cctcgctcga cgcttga 104772348PRTAgrobacterium tumefaciens C58 72Met Ala Ile Ala Arg Gly Tyr Ala Ala Thr Asp Ala Ser Lys Pro Leu1 5 10 15 Thr Pro Phe Thr Phe Glu Arg Arg Glu Pro Asn Asp Asp Asp Val Val 20 25 30 Ile Asp Ile Lys Tyr Ala Gly Ile Cys His Ser Asp Ile His Thr Val 35 40 45 Arg Asn Glu Trp His Asn Ala Val Tyr Pro Ile Val Pro Gly His Glu 50 55 60 Ile Ala Gly Val Val Arg Ala Val Gly Ser Lys Val Thr Arg Phe Lys65 70 75 80 Val Gly Asp His Val Gly Val Gly Cys Phe Val Asp Ser Cys Val Gly 85 90 95 Cys Ala Thr Arg Asp Val Asp Asn Glu Gln Tyr Met Pro Gly Leu Val 100 105 110 Gln Thr Tyr Asn Ser Val Glu Arg Asp Gly Lys Ser Ala Thr Gln Gly 115 
120 125 Gly Tyr Ser Asp His Ile Val Val Arg Glu Asp Tyr Val Leu Ser Ile 130 135 140 Pro Asp Asn Leu Pro Leu Asp Ala Ser Ala Pro Leu Leu Cys Ala Gly145 150 155 160 Ile Thr Leu Tyr Ser Pro Leu Gln His Trp Asn Ala Gly Pro Gly Lys 165 170 175 Lys Val Ala Ile Val Gly Met Gly Gly Leu Gly His Met Gly Val Lys 180 185 190 Ile Gly Ser Ala Met Gly Ala Asp Ile Thr Val Leu Ser Gln Thr Leu 195 200 205 Ser Lys Lys Glu Asp Gly Leu Lys Leu Gly Ala Lys Glu Tyr Tyr Ala 210 215 220 Thr Ser Asp Ala Ser Thr Phe Glu Lys Leu Ala Gly Thr Phe Asp Leu225 230 235 240 Ile Leu Cys Thr Val Ser Ala Glu Ile Asp Trp Asn Ala Tyr Leu Asn 245 250 255 Leu Leu Lys Val Asn Gly Thr Met Val Leu Leu Gly Val Pro Glu His 260 265 270 Ala Ile Pro Val His Ala Phe Ser Val Ile Pro Ala Arg Arg Ser Leu 275 280 285 Ala Gly Ser Met Ile Gly Ser Ile Lys Glu Thr Gln Glu Met Leu Asp 290 295 300 Phe Cys Gly Lys His Asp Ile Val Ser Glu Ile Glu Thr Ile Gly Ile305 310 315 320 Lys Asp Val Asn Glu Ala Tyr Glu Arg Val Leu Lys Ser Asp Val Arg 325 330 335 Tyr Arg Phe Val Ile Asp Met Ala Ser Leu Asp Ala 340 345 731029DNAAgrobacterium tumefaciens C58 73atgactaaaa caatgaaggc ggcggttgtc cgcgcatttg gaaaaccgct gaccatcgag 60gaagtggcaa taccggatcc cggccccggt gaaattctca tcaactacaa ggcgacgggc 120gtttgccaca ccgacctgca cgccgcaacg ggggattggc cggtcaagcc caacccgccc 180ttcattcccg gacatgaagg tgcaggttac gtcgccaaga tcggcgctgg cgtcaccggc 240atcaaggagg gcgaccgcgc cggcacgccc tggctctaca ccgcctgcgg atgctgcatt 300ccctgccgta ccggctggga aaccctgtgc ccgagccaga agaactcagg ttattccgtc 360aacggcagct ttgccgaata tggccttgcc gatccgaaat tcgtcggccg cctgcctgac 420aatctcgatt tcggcccagc cgcacccgtg ctctgcgccg gcgttacagt ctataagggc 480ctgaaggaaa ccgaagtcag gcccggtgaa tgggtggtca tttcaggcat tggcgggctt 540ggccacatgg ccgtgcaata tgcgaaagcc atgggcatgc atgtggttgc cgccgatatt 600ttcgacgaca agctggcgct tgccaaaaag ctcggagccg acgtcgtcgt caacggccgc 660gcgcctgacg cggtggagca agtgcaaaag gcaaccggcg gcgtccatgg cgcgctggtg 720acggcggttt caccgaaggc catggagcag gcttatggct tcctgcgctc caagggcacg 
780atggcgcttg tcggtctgcc gccgggcttc atctccattc cggtgttcga cacggtgctg 840aagcgcatca cggtgcgtgg ctccatcgtc ggcacgcggc aggatctgga ggaggcgttg 900accttcgccg gtgaaggcaa ggtggccgcc cacttctcgt gggacaagct cgaaaacatc 960aatgatatct tccatcgcat ggaagagggc aagatcgacg gccgtatcgt cgtggatctc 1020gccgcctga 102974342PRTAgrobacterium tumefaciens C58 74Met Thr Lys Thr Met Lys Ala Ala Val Val Arg Ala Phe Gly Lys Pro1 5 10 15 Leu Thr Ile Glu Glu Val Ala Ile Pro Asp Pro Gly Pro Gly Glu Ile 20 25 30 Leu Ile Asn Tyr Lys Ala Thr Gly Val Cys His Thr Asp Leu His Ala 35 40 45 Ala Thr Gly Asp Trp Pro Val Lys Pro Asn Pro Pro Phe Ile Pro Gly 50 55 60 His Glu Gly Ala Gly Tyr Val Ala Lys Ile Gly Ala Gly Val Thr Gly65 70 75 80 Ile Lys Glu Gly Asp Arg Ala Gly Thr Pro Trp Leu Tyr Thr Ala Cys 85 90 95 Gly Cys Cys Ile Pro Cys Arg Thr Gly Trp Glu Thr Leu Cys Pro Ser 100 105 110 Gln Lys Asn Ser Gly Tyr Ser Val Asn Gly Ser Phe Ala Glu Tyr Gly 115 120 125 Leu Ala Asp Pro Lys Phe Val Gly Arg Leu Pro Asp Asn Leu Asp Phe 130 135 140 Gly Pro Ala Ala Pro Val Leu Cys Ala Gly Val Thr Val Tyr Lys Gly145 150 155 160 Leu Lys Glu Thr Glu Val Arg Pro Gly Glu Trp Val Val Ile Ser Gly 165 170 175 Ile Gly Gly Leu Gly His Met Ala Val Gln Tyr Ala Lys Ala Met Gly 180 185 190 Met His Val Val Ala Ala Asp Ile Phe Asp Asp Lys Leu Ala Leu Ala 195 200 205 Lys Lys Leu Gly Ala Asp Val Val Val Asn Gly Arg Ala Pro Asp Ala 210 215 220 Val Glu Gln Val Gln Lys Ala Thr Gly Gly Val His Gly Ala Leu Val225 230 235 240 Thr Ala Val Ser Pro Lys Ala Met Glu Gln Ala Tyr Gly Phe Leu Arg 245 250 255 Ser Lys Gly Thr Met Ala Leu Val Gly Leu Pro Pro Gly Phe Ile Ser 260 265 270 Ile Pro Val Phe Asp Thr Val Leu Lys Arg Ile Thr Val Arg Gly Ser 275 280 285 Ile Val Gly Thr Arg Gln Asp Leu Glu Glu Ala Leu Thr Phe Ala Gly 290 295 300 Glu Gly Lys Val Ala Ala His Phe Ser Trp Asp Lys Leu Glu Asn Ile305 310 315 320 Asn Asp Ile Phe His Arg Met Glu Glu Gly Lys Ile Asp Gly Arg Ile 325 330 335 Val Val Asp Leu Ala Ala 340 751008DNAAgrobacterium tumefaciens C58 
75atgaccgggg cgaaccagcc ttgggaggtt caagaggttc ccgttccgaa ggcagagcca 60ggacttgtcc ttgttaaaat ccacgcctcc ggcatgtgct acacggacgt gtgggcgacg 120cagggtgccg gtggcgacat ctatccgcag acccccggcc atgaggttgt cggcgagatc 180atcgaggtcg gcgcgggcgt tcatacgcgc aaggtgggag accgggtcgg caccacctgg 240gtgcagtcct cttgtggacg atgctcctac tgccgccaga accgtccgtt gaccggccag 300acagccatga actgcgattc acccaggaca acggggttcg cgacgcaagg cgggcacgca 360gagtacatcg cgatctctgc tgaaggcaca gtgttattac ccgacgggct cgactacacg 420gatgccgcac ccatgatgtg cgcaggctac acgacctgga gcggcttgcg cgacgccgag 480cccaaacctg gtgacagaat tgcggtactt ggcatcggcg ggctggggca cgtcgccgtg 540cagttctcca aagccttggg gtttgagacc atcgcgatca cgcattcacc cgacaagcac 600aagttggcca ccgatcttgg tgcagacatc gtcgtcgccg atggcaaaga gttattggag 660gccggcggtg cggacgttct tctggttacg accaacgact tcgacaccgc cgaaaaagcg 720atggcgggcg taaggcctga cgggcgcatc gttctttgcg cgctcgactt cagcaagccg 780ttctcgatcc cgtccgacgg caagccgttc cacatgatgc gccaacgcgt ggttgggtcc 840acgcatggcg gacagcacta tctcgccgaa atcctcgatc tcgccgccaa gggcaaggtc 900aagccgattg tcgagacctt cgccctcgag caggcaaccg aggcatatga gcggctatcc 960accgggaaga tgcgcttccg gggcgtgttc cttccgcacg gcgcttga 100876335PRTAgrobacterium tumefaciens C58 76Met Thr Gly Ala Asn Gln Pro Trp Glu Val Gln Glu Val Pro Val Pro1 5 10 15 Lys Ala Glu Pro Gly Leu Val Leu Val Lys Ile His Ala Ser Gly Met 20 25 30 Cys Tyr Thr Asp Val Trp Ala Thr Gln Gly Ala Gly Gly Asp Ile Tyr 35 40 45 Pro Gln Thr Pro Gly His Glu Val Val Gly Glu Ile Ile Glu Val Gly 50 55 60 Ala Gly Val His Thr Arg Lys Val Gly Asp Arg Val Gly Thr Thr Trp65 70 75 80 Val Gln Ser Ser Cys Gly Arg Cys Ser Tyr Cys Arg Gln Asn Arg Pro 85 90 95 Leu Thr Gly Gln Thr Ala Met Asn Cys Asp Ser Pro Arg Thr Thr Gly 100 105 110 Phe Ala Thr Gln Gly Gly His Ala Glu Tyr Ile Ala Ile Ser Ala Glu 115 120 125 Gly Thr Val Leu Leu Pro Asp Gly Leu Asp Tyr Thr Asp Ala Ala Pro 130 135 140 Met Met Cys Ala Gly Tyr Thr Thr Trp Ser Gly Leu Arg Asp Ala Glu145 150 155 160 Pro Lys Pro Gly Asp Arg Ile Ala Val Leu Gly Ile 
Gly Gly Leu Gly 165 170 175 His Val Ala Val Gln Phe Ser Lys Ala Leu Gly Phe Glu Thr Ile Ala 180 185 190 Ile Thr His Ser Pro Asp Lys His Lys Leu Ala Thr Asp Leu Gly Ala 195 200 205 Asp Ile Val Val Ala Asp Gly Lys Glu Leu Leu Glu Ala Gly Gly Ala 210 215 220 Asp Val Leu Leu Val Thr Thr Asn Asp Phe Asp Thr Ala Glu Lys Ala225 230 235 240 Met Ala Gly Val Arg Pro Asp Gly Arg Ile Val Leu Cys Ala Leu Asp 245 250 255 Phe Ser Lys Pro Phe Ser Ile Pro Ser Asp Gly Lys Pro Phe His Met 260 265 270 Met Arg Gln Arg Val Val Gly Ser Thr His Gly Gly Gln His Tyr Leu 275 280 285 Ala Glu Ile Leu Asp Leu Ala Ala Lys Gly Lys Val Lys Pro Ile Val 290 295 300 Glu Thr Phe Ala Leu Glu Gln Ala Thr Glu Ala Tyr Glu Arg Leu Ser305 310 315 320 Thr Gly Lys Met Arg Phe Arg Gly Val Phe Leu Pro His Gly Ala
325 330 335 771017DNAAgrobacterium tumefaciens C58 77atgaccatgc atgccattca attcgtcgag aagggacgcg ccgtgctggc ggaactcccc 60gtcgccgatc tgccgccggg ccatgcgctc gtgcgggtca aggcttcggg gctttgccat 120accgatatcg acgtgctgca tgcgcgttat ggcgacggtg cgttccccgt cattccgggg 180catgaatatg ctggcgaagt cgcagccgtg gcttccgatg tgacagtctt caaggctggc 240gaccgggttg tcgtcgatcc caatctgccc tgtggcacct gcgccagctg caggaaaggg 300ctgaccaacc tttgcagcac attgaaagct tacggcgttt cccacaatgg cggctttgcg 360gagttcagtg tggtgcgtgc cgatcacctg cacggtatcg gttcgatgcc ctatcacgtc 420gcggcgctgg ctgagccgct tgcctgtgtt gtcaatggca tgcagagtgc gggtattggc 480gagagtggcg tggtgccgga gaatgcgctt gttttcggtg ctgggcccat cggcctgctg 540cttgccctgt cgctgaaatc acgcggcatt gcgacggtga cgatggccga tatcaatgaa 600agcaggctgg cctttgccca ggacctcggg cttcagacgg cggtatccgg ctcggaagcg 660ctctcgcggc agcggaagga gttcgatttc gtggccgatg cgacgggtat tgccccggtc 720gccgaggcga tgatcccgct ggttgcggat ggcggcacgg cgctattctt cggcgtctgc 780gcgccggatg cccgtatttc ggtggcaccg tttgaaatct tccggcgcca gctgaaactt 840gtcggctcgc attcgctgaa ccgcaacata ccgcaggcgc ttgccattct ggagacggat 900ggcgaggtca tggcgcggct cgtttcgcac cgcttgccgc tttcggagat gctgccgttc 960tttacgaaaa aaccgtctga tccggcgacg atgaaagtgc aatttgcagc cgaatga 101778338PRTAgrobacterium tumefaciens C58 78Met Thr Met His Ala Ile Gln Phe Val Glu Lys Gly Arg Ala Val Leu1 5 10 15 Ala Glu Leu Pro Val Ala Asp Leu Pro Pro Gly His Ala Leu Val Arg 20 25 30 Val Lys Ala Ser Gly Leu Cys His Thr Asp Ile Asp Val Leu His Ala 35 40 45 Arg Tyr Gly Asp Gly Ala Phe Pro Val Ile Pro Gly His Glu Tyr Ala 50 55 60 Gly Glu Val Ala Ala Val Ala Ser Asp Val Thr Val Phe Lys Ala Gly65 70 75 80 Asp Arg Val Val Val Asp Pro Asn Leu Pro Cys Gly Thr Cys Ala Ser 85 90 95 Cys Arg Lys Gly Leu Thr Asn Leu Cys Ser Thr Leu Lys Ala Tyr Gly 100 105 110 Val Ser His Asn Gly Gly Phe Ala Glu Phe Ser Val Val Arg Ala Asp 115 120 125 His Leu His Gly Ile Gly Ser Met Pro Tyr His Val Ala Ala Leu Ala 130 135 140 Glu Pro Leu Ala Cys Val Val Asn Gly Met Gln Ser Ala Gly Ile Gly145 
150 155 160 Glu Ser Gly Val Val Pro Glu Asn Ala Leu Val Phe Gly Ala Gly Pro 165 170 175 Ile Gly Leu Leu Leu Ala Leu Ser Leu Lys Ser Arg Gly Ile Ala Thr 180 185 190 Val Thr Met Ala Asp Ile Asn Glu Ser Arg Leu Ala Phe Ala Gln Asp 195 200 205 Leu Gly Leu Gln Thr Ala Val Ser Gly Ser Glu Ala Leu Ser Arg Gln 210 215 220 Arg Lys Glu Phe Asp Phe Val Ala Asp Ala Thr Gly Ile Ala Pro Val225 230 235 240 Ala Glu Ala Met Ile Pro Leu Val Ala Asp Gly Gly Thr Ala Leu Phe 245 250 255 Phe Gly Val Cys Ala Pro Asp Ala Arg Ile Ser Val Ala Pro Phe Glu 260 265 270 Ile Phe Arg Arg Gln Leu Lys Leu Val Gly Ser His Ser Leu Asn Arg 275 280 285 Asn Ile Pro Gln Ala Leu Ala Ile Leu Glu Thr Asp Gly Glu Val Met 290 295 300 Ala Arg Leu Val Ser His Arg Leu Pro Leu Ser Glu Met Leu Pro Phe305 310 315 320 Phe Thr Lys Lys Pro Ser Asp Pro Ala Thr Met Lys Val Gln Phe Ala 325 330 335 Ala Glu791044DNAAgrobacterium tumefaciens C58 79atgcgcgcgc tttattacga acgattcggc gagacccctg tagtcgcgtc cctgcctgat 60ccggcaccga gcgatggcgg cgtggtgatt gcggtgaagg caaccggcct ctgccgcagc 120gactggcatg gctggatggg acatgacacg gatatccgtc tgccgcatgt gcccggccac 180gagttcgccg gcgtcatctc cgcagtcggc agaaacgtca cccgcttcaa gacgggtgat 240cgcgttaccg tgcctttcgt ctccggctgc ggccattgcc atgagtgccg ctccggcaat 300cagcaggtct gcgaaacgca gttccagccc ggcttcaccc attggggttc cttcgccgaa 360tatgtcgcca tcgactatgc cgatcagaac ctcgtgcacc tgccggaatc gatgagttac 420gccaccgccg ccggcctcgg ttgccgtttc gccacctcct tccgggcggt gacggatcag 480ggacgcctga agggcggcga atggctggct gtccatggct gcggcggtgt cggtctctcc 540gccatcatga tcggcgccgg cctcggcgca caggtcgtcg ccatcgatat tgccgaagac 600aagctcgaac tcgcccggca actgggtgca accgcaacca tcaacagccg ctccgttgcc 660gatgtcgccg aagcggtgcg cgacatcacc ggtggcggcg cgcatgtgtc ggtggatgcg 720cttggccatc cgcagacctg ctgcaattcc atcagcaacc tgcgccggcg cggacgccat 780gtgcaggtgg ggctgatgct ggcagaccat gccatgccgg ccattcccat ggcccgggtg 840atcgctcatg agctggagat ctatggcagc cacggcatgc aggcatggcg ttacgaggac 900atgctggcca tgatcgaaag cggcaggctt gcgccggaaa agctgattgg 
ccgccatatc 960tcgctgaccg aagcggccgt cgccctgccc ggaatggata ggttccagga gagcggcatc 1020agcatcatcg accggttcga atag 104480357PRTAgrobacterium tumefaciens C58 80Met Asn Leu Arg Thr Asn Asp Glu Ala Met Met Arg Ala Leu Tyr Tyr1 5 10 15 Glu Arg Phe Gly Glu Thr Pro Val Val Ala Ser Leu Pro Asp Pro Ala 20 25 30 Pro Ser Asp Gly Gly Val Val Ile Ala Val Lys Ala Thr Gly Leu Cys 35 40 45 Arg Ser Asp Trp His Gly Trp Met Gly His Asp Thr Asp Ile Arg Leu 50 55 60 Pro His Val Pro Gly His Glu Phe Ala Gly Val Ile Ser Ala Val Gly65 70 75 80 Arg Asn Val Thr Arg Phe Lys Thr Gly Asp Arg Val Thr Val Pro Phe 85 90 95 Val Ser Gly Cys Gly His Cys His Glu Cys Arg Ser Gly Asn Gln Gln 100 105 110 Val Cys Glu Thr Gln Phe Gln Pro Gly Phe Thr His Trp Gly Ser Phe 115 120 125 Ala Glu Tyr Val Ala Ile Asp Tyr Ala Asp Gln Asn Leu Val His Leu 130 135 140 Pro Glu Ser Met Ser Tyr Ala Thr Ala Ala Gly Leu Gly Cys Arg Phe145 150 155 160 Ala Thr Ser Phe Arg Ala Val Thr Asp Gln Gly Arg Leu Lys Gly Gly 165 170 175 Glu Trp Leu Ala Val His Gly Cys Gly Gly Val Gly Leu Ser Ala Ile 180 185 190 Met Ile Gly Ala Gly Leu Gly Ala Gln Val Val Ala Ile Asp Ile Ala 195 200 205 Glu Asp Lys Leu Glu Leu Ala Arg Gln Leu Gly Ala Thr Ala Thr Ile 210 215 220 Asn Ser Arg Ser Val Ala Asp Val Ala Glu Ala Val Arg Asp Ile Thr225 230 235 240 Gly Gly Gly Ala His Val Ser Val Asp Ala Leu Gly His Pro Gln Thr 245 250 255 Cys Cys Asn Ser Ile Ser Asn Leu Arg Arg Arg Gly Arg His Val Gln 260 265 270 Val Gly Leu Met Leu Ala Asp His Ala Met Pro Ala Ile Pro Met Ala 275 280 285 Arg Val Ile Ala His Glu Leu Glu Ile Tyr Gly Ser His Gly Met Gln 290 295 300 Ala Trp Arg Tyr Glu Asp Met Leu Ala Met Ile Glu Ser Gly Arg Leu305 310 315 320 Ala Pro Glu Lys Leu Ile Gly Arg His Ile Ser Leu Thr Glu Ala Ala 325 330 335 Val Ala Leu Pro Gly Met Asp Arg Phe Gln Glu Ser Gly Ile Ser Ile 340 345 350 Ile Asp Arg Phe Glu 355 811011DNAAgrobacterium tumefaciens C58 81atgctggcga ttttctgtga cactcccggt caattaaccg ccaaggatct gccgaacccc 60gtgcgcggcg aaggtgaagt cctggtacgt attcgccgga 
ttggcgtttg cggcacggat 120ctgcacatct ttaccggcaa ccagccctat ctttcctatc cgcggatcat gggtcacgaa 180ctttccggca cggttgagga ggcacccgct ggcagccacc tttccgctgg cgatgtggtg 240accataattc cctatatgtc ctgcgggaaa tgcaatgcct gcctgaaggg taagagcaat 300tgctgccgca atatcggtgt gcttggcgtt catcgcgatg gcggcatggt ggaatatctg 360agcgtgccgc agcaattcgt gctgaaggcg gaggggctga gcctcgacca ggcagccatg 420acggaatttc tggcgatcgg tgcccatgcg gtgcgtcgcg gtgccgtcga aaaagggcaa 480aaggtcctga tcgtcggtgc cggcccgatc ggcatggcgg ttgctgtctt tgcggttctc 540gatggcacgg aagtgacgat gatcgacggt cgcaccgacc ggctggattt ctgcaaggac 600cacctcggtg tcgctcatac agtcgccctc ggcgacggtg acaaagatcg tctgtccgac 660attaccggtg gcaatttctt cgatgcggtg tttgatgcga ccggcaatcc gaaagccatg 720gagcgcggtt tctccttcgt cggtcacggc ggctcctatg ttctggtgtc catcgtcgcc 780agcgatatca gcttcaacga cccggaattt cacaagcgtg agacgacgct gctcggcagc 840cgcaacgcga cggctgatga tttcgagcgg gtgcttcgcg ccttgcgcga agggaaagtg 900ccggaggcac taatcaccca tcgcatgaca cttgccgatg ttccctcgaa gttcgccggc 960ctgaccgatc cgaaagccgg agtcatcaag ggcatggtgg aggtcgcatg a 101182336PRTAgrobacterium tumefaciens C58 82Met Leu Ala Ile Phe Cys Asp Thr Pro Gly Gln Leu Thr Ala Lys Asp1 5 10 15 Leu Pro Asn Pro Val Arg Gly Glu Gly Glu Val Leu Val Arg Ile Arg 20 25 30 Arg Ile Gly Val Cys Gly Thr Asp Leu His Ile Phe Thr Gly Asn Gln 35 40 45 Pro Tyr Leu Ser Tyr Pro Arg Ile Met Gly His Glu Leu Ser Gly Thr 50 55 60 Val Glu Glu Ala Pro Ala Gly Ser His Leu Ser Ala Gly Asp Val Val65 70 75 80 Thr Ile Ile Pro Tyr Met Ser Cys Gly Lys Cys Asn Ala Cys Leu Lys 85 90 95 Gly Lys Ser Asn Cys Cys Arg Asn Ile Gly Val Leu Gly Val His Arg 100 105 110 Asp Gly Gly Met Val Glu Tyr Leu Ser Val Pro Gln Gln Phe Val Leu 115 120 125 Lys Ala Glu Gly Leu Ser Leu Asp Gln Ala Ala Met Thr Glu Phe Leu 130 135 140 Ala Ile Gly Ala His Ala Val Arg Arg Gly Ala Val Glu Lys Gly Gln145 150 155 160 Lys Val Leu Ile Val Gly Ala Gly Pro Ile Gly Met Ala Val Ala Val 165 170 175 Phe Ala Val Leu Asp Gly Thr Glu Val Thr Met Ile Asp Gly Arg Thr 180 185 190 Asp Arg 
Leu Asp Phe Cys Lys Asp His Leu Gly Val Ala His Thr Val 195 200 205 Ala Leu Gly Asp Gly Asp Lys Asp Arg Leu Ser Asp Ile Thr Gly Gly 210 215 220 Asn Phe Phe Asp Ala Val Phe Asp Ala Thr Gly Asn Pro Lys Ala Met225 230 235 240 Glu Arg Gly Phe Ser Phe Val Gly His Gly Gly Ser Tyr Val Leu Val 245 250 255 Ser Ile Val Ala Ser Asp Ile Ser Phe Asn Asp Pro Glu Phe His Lys 260 265 270 Arg Glu Thr Thr Leu Leu Gly Ser Arg Asn Ala Thr Ala Asp Asp Phe 275 280 285 Glu Arg Val Leu Arg Ala Leu Arg Glu Gly Lys Val Pro Glu Ala Leu 290 295 300 Ile Thr His Arg Met Thr Leu Ala Asp Val Pro Ser Lys Phe Ala Gly305 310 315 320 Leu Thr Asp Pro Lys Ala Gly Val Ile Lys Gly Met Val Glu Val Ala 325 330 335 831005DNAAgrobacterium tumefaciens C58 83gtgaaagcct tcgtcgtcga caagtacaag aagaagggcc cgctgcgtct ggccgacatg 60cccaatccgg tcatcggcgc caatgatgtg ctggttcgca tccatgccac tgccatcaat 120cttctcgact ccaaggtgcg cgacggggaa ttcaagctgt tcctgcccta tcgtcctccc 180ttcattctcg gtcatgatct ggccggaacg gtcatccgcg tcggcgcgaa tgtacggcag 240ttcaagacag gcgacgaggt tttcgctcgc ccgcgtgatc accgggtcgg aaccttcgca 300gaaatgattg cggtcgatgc cgcagacctt gcgctgaagc caacgagcct gtccatggag 360caggcagcgt cgatcccgct cgtcggactg actgcctggc aggcgcttat cgaggttggc 420aaggtcaagt ccggccagaa ggttttcatc caggccggtt ccggcggtgt cggcaccttc 480gccatccagc ttgccaagca tctcggcgct accgtggcca cgaccaccag cgccgcgaat 540gccgaactgg tcaaaagcct cggcgcagat gtggtgatcg actacaagac gcaggacttc 600gaacaggtgc tgtccggcta cgatctcgtc ctgaacagcc aggatgccaa gacgctggaa 660aagtcgttga acgtgctgag accgggcgga aagctcattt cgatctccgg tccgccggat 720gttgcctttg ccagatcgtt gaaactgaat ccgctcctgc gttttgtcgt cagaatgctg 780agccgtggtg tcctgaaaaa ggcaagcaga cgcggtgtcg attactcttt cctgttcatg 840cgcgccgaag gtcagcaatt gcatgagatc gccgaactga tcgatgccgg caccatccgt 900ccggtcgtcg acaaggtgtt tcaatttgcg cagacgcccg acgccctggc ctatgtcgag 960accggacggg caaggggcaa ggttgtggtt acatacgcat cctag 100584359PRTAgrobacterium tumefaciens C58 84Met Pro Ser Leu Cys Arg Lys Pro Trp Leu Ser Ser Leu Pro Asp Leu1 5 10 15 Ile 
Asn Val Ser His Trp Arg Lys Pro Val Lys Ala Phe Val Val Asp 20 25 30 Lys Tyr Lys Lys Lys Gly Pro Leu Arg Leu Ala Asp Met Pro Asn Pro 35 40 45 Val Ile Gly Ala Asn Asp Val Leu Val Arg Ile His Ala Thr Ala Ile 50 55 60 Asn Leu Leu Asp Ser Lys Val Arg Asp Gly Glu Phe Lys Leu Phe Leu65 70 75 80 Pro Tyr Arg Pro Pro Phe Ile Leu Gly His Asp Leu Ala Gly Thr Val 85 90 95 Ile Arg Val Gly Ala Asn Val Arg Gln Phe Lys Thr Gly Asp Glu Val 100 105 110 Phe Ala Arg Pro Arg Asp His Arg Val Gly Thr Phe Ala Glu Met Ile 115 120 125 Ala Val Asp Ala Ala Asp Leu Ala Leu Lys Pro Thr Ser Leu Ser Met 130 135 140 Glu Gln Ala Ala Ser Ile Pro Leu Val Gly Leu Thr Ala Trp Gln Ala145 150 155 160 Leu Ile Glu Val Gly Lys Val Lys Ser Gly Gln Lys Val Phe Ile Gln 165 170 175 Ala Gly Ser Gly Gly Val Gly Thr Phe Ala Ile Gln Leu Ala Lys His 180 185 190 Leu Gly Ala Thr Val Ala Thr Thr Thr Ser Ala Ala Asn Ala Glu Leu 195 200 205 Val Lys Ser Leu Gly Ala Asp Val Val Ile Asp Tyr Lys Thr Gln Asp 210 215 220 Phe Glu Gln Val Leu Ser Gly Tyr Asp Leu Val Leu Asn Ser Gln Asp225 230 235 240 Ala Lys Thr Leu Glu Lys Ser Leu Asn Val Leu Arg Pro Gly Gly Lys 245 250 255 Leu Ile Ser Ile Ser Gly Pro Pro Asp Val Ala Phe Ala Arg Ser Leu 260 265 270 Lys Leu Asn Pro Leu Leu Arg Phe Val Val Arg Met Leu Ser Arg Gly 275 280 285 Val Leu Lys Lys Ala Ser Arg Arg Gly Val Asp Tyr Ser Phe Leu Phe 290 295 300 Met Arg Ala Glu Gly Gln Gln Leu His Glu Ile Ala Glu Leu Ile Asp305 310 315 320 Ala Gly Thr Ile Arg Pro Val Val Asp Lys Val Phe Gln Phe Ala Gln 325 330 335 Thr Pro Asp Ala Leu Ala Tyr Val Glu Thr Gly Arg Ala Arg Gly Lys 340 345 350 Val Val Val Thr Tyr Ala Ser 355 851032DNAAgrobacterium tumefaciens C58 85atgaaagcga ttgtcgccca cggggcaaag gatgtgcgca tcgaagaccg gccggaggaa 60aagccgggtc cgggcgaggt gcggctccgt ctggcgaggg gcgggatctg cggcagtgat 120ctgcattatt acaatcatgg cggtttcggc gccgtgcggc ttcgtgaacc catggtgctg 180ggccatgagg tttccgccgt catcgaggaa ctgggcgaag gcgttgaggg gctgaagatc 240ggcggtctgg tggcggtttc gccgtcgcgc ccatgccgaa cctgccgctt 
ctgccaggag 300ggtctgcaca atcagtgcct caacatgcgg ttttatggca gcgccatgcc tttcccgcat 360attcagggcg cgttccggga aattctggtg gcggacgccc tgcaatgcgt gccggccgat 420ggtctcagcg ccggggaagc cgccatggcg gaaccgctgg cggtgacgct gcatgccaca 480cgccgggccg gcgatttgct gggaaaacgt gtgctcgtca cgggttgcgg ccccatcggc 540attctctcca ttctggctgc gcgccgggcg ggtgctgctg aaatcgtcgc caccgacctt 600tccgatttca cgctcggcaa ggcgcgtgaa gcgggggcgg accgtgtcat caacagcaag 660gatgagcccg atgcgctcgc cgcttatggt gcaaacaagg gaaccttcga cattctctat 720gaatgctcgg gtgcggccgt ggcgcttgcc ggcggcatta cggcactgcg gccgcgcggc 780atcatcgtcc agctcgggct cggcggcgat atgagcctgc cgatgatggc gatcacagcc 840aaggaactcg acctgcgtgg ttcctttcgc ttccacgagg aattcgccac cggcgtcgag 900ctgatgcgca agggcctgat cgacgtcaaa cccttcatca cccagaccgt cgatcttgcc 960gacgccatct cggccttcga attcgcctcg gatcgcagcc gcgccatgaa ggtgcagatc 1020gccttttcct aa 103286343PRTAgrobacterium tumefaciens C58 86Met Lys Ala Ile Val Ala His Gly Ala Lys Asp Val Arg Ile Glu Asp1 5 10 15 Arg Pro Glu Glu Lys Pro Gly Pro Gly Glu Val Arg Leu Arg Leu Ala 20 25 30 Arg Gly Gly Ile Cys Gly Ser Asp Leu His Tyr Tyr Asn His Gly Gly 35 40 45 Phe Gly Ala Val Arg Leu Arg Glu Pro Met Val Leu Gly His Glu Val 50 55
60 Ser Ala Val Ile Glu Glu Leu Gly Glu Gly Val Glu Gly Leu Lys Ile65 70 75 80 Gly Gly Leu Val Ala Val Ser Pro Ser Arg Pro Cys Arg Thr Cys Arg 85 90 95 Phe Cys Gln Glu Gly Leu His Asn Gln Cys Leu Asn Met Arg Phe Tyr 100 105 110 Gly Ser Ala Met Pro Phe Pro His Ile Gln Gly Ala Phe Arg Glu Ile 115 120 125 Leu Val Ala Asp Ala Leu Gln Cys Val Pro Ala Asp Gly Leu Ser Ala 130 135 140 Gly Glu Ala Ala Met Ala Glu Pro Leu Ala Val Thr Leu His Ala Thr145 150 155 160 Arg Arg Ala Gly Asp Leu Leu Gly Lys Arg Val Leu Val Thr Gly Cys 165 170 175 Gly Pro Ile Gly Ile Leu Ser Ile Leu Ala Ala Arg Arg Ala Gly Ala 180 185 190 Ala Glu Ile Val Ala Thr Asp Leu Ser Asp Phe Thr Leu Gly Lys Ala 195 200 205 Arg Glu Ala Gly Ala Asp Arg Val Ile Asn Ser Lys Asp Glu Pro Asp 210 215 220 Ala Leu Ala Ala Tyr Gly Ala Asn Lys Gly Thr Phe Asp Ile Leu Tyr225 230 235 240 Glu Cys Ser Gly Ala Ala Val Ala Leu Ala Gly Gly Ile Thr Ala Leu 245 250 255 Arg Pro Arg Gly Ile Ile Val Gln Leu Gly Leu Gly Gly Asp Met Ser 260 265 270 Leu Pro Met Met Ala Ile Thr Ala Lys Glu Leu Asp Leu Arg Gly Ser 275 280 285 Phe Arg Phe His Glu Glu Phe Ala Thr Gly Val Glu Leu Met Arg Lys 290 295 300 Gly Leu Ile Asp Val Lys Pro Phe Ile Thr Gln Thr Val Asp Leu Ala305 310 315 320 Asp Ala Ile Ser Ala Phe Glu Phe Ala Ser Asp Arg Ser Arg Ala Met 325 330 335 Lys Val Gln Ile Ala Phe Ser 340 87939DNAAgrobacterium tumefaciens C58 87atgccgatgg cgctcgggca cgaagcggcg ggcgtcgtcg aggcattggg cgaaggcgtg 60cgcgatcttg agcccggcga tcatgtggtc atggtcttca tgcccagttg cggacattgc 120ctgccctgtg cggaaggcag gcccgctctg tgcgagccgg gcgccgccgc caatgcagca 180ggcaggctgt tgggtggcgc cacccgcctg aactatcatg gcgaggtcgt ccatcatcac 240cttggtgtgt cggcctttgc cgaatatgcc gtggtgtcgc gcaattcgct ggtcaagatc 300gaccgcgatc ttccatttgt cgaggcggca ctcttcggct gcgcggttct caccggcgtc 360ggcgccgtcg tgaatacggc aagggtcagg accggctcga ctgcggtcgt catcggactt 420ggcggtgtgg gccttgccgc ggttctcgga gcccgggcgg ccggtgccag caagatcgtc 480gccgtcgacc tttcgcagga aaagcttgca ctcgccagcg aactgggcgc gaccgccatc 
540gtgaacggac gcgatgagga tgccgtcgag caggtccgcg agctcacttc cggcggtgcc 600gattatgcct tcgagatggc agggtctatt cgcgccctcg aaaacgcctt caggatgacc 660aaacgtggcg gcaccaccgt taccgccggt ctgccaccgc cgggtgcggc cctgccgctc 720aacgtcgtgc agctcgtcgg cgaggagcgg acactcaagg gcagctatat cggcacctgt 780gtgcctctcc gggatattcc gcgcttcatc gccctttatc gcgacggccg gttgccggtg 840aaccgccttc tgagcggaag gctgaagcta gaagacatca atgaagggtt cgaccgcctg 900cacgacggaa gcgccgttcg gcaagtcatc gaattctga 93988312PRTAgrobacterium tumefaciens C58 88Met Pro Met Ala Leu Gly His Glu Ala Ala Gly Val Val Glu Ala Leu1 5 10 15 Gly Glu Gly Val Arg Asp Leu Glu Pro Gly Asp His Val Val Met Val 20 25 30 Phe Met Pro Ser Cys Gly His Cys Leu Pro Cys Ala Glu Gly Arg Pro 35 40 45 Ala Leu Cys Glu Pro Gly Ala Ala Ala Asn Ala Ala Gly Arg Leu Leu 50 55 60 Gly Gly Ala Thr Arg Leu Asn Tyr His Gly Glu Val Val His His His65 70 75 80 Leu Gly Val Ser Ala Phe Ala Glu Tyr Ala Val Val Ser Arg Asn Ser 85 90 95 Leu Val Lys Ile Asp Arg Asp Leu Pro Phe Val Glu Ala Ala Leu Phe 100 105 110 Gly Cys Ala Val Leu Thr Gly Val Gly Ala Val Val Asn Thr Ala Arg 115 120 125 Val Arg Thr Gly Ser Thr Ala Val Val Ile Gly Leu Gly Gly Val Gly 130 135 140 Leu Ala Ala Val Leu Gly Ala Arg Ala Ala Gly Ala Ser Lys Ile Val145 150 155 160 Ala Val Asp Leu Ser Gln Glu Lys Leu Ala Leu Ala Ser Glu Leu Gly 165 170 175 Ala Thr Ala Ile Val Asn Gly Arg Asp Glu Asp Ala Val Glu Gln Val 180 185 190 Arg Glu Leu Thr Ser Gly Gly Ala Asp Tyr Ala Phe Glu Met Ala Gly 195 200 205 Ser Ile Arg Ala Leu Glu Asn Ala Phe Arg Met Thr Lys Arg Gly Gly 210 215 220 Thr Thr Val Thr Ala Gly Leu Pro Pro Pro Gly Ala Ala Leu Pro Leu225 230 235 240 Asn Val Val Gln Leu Val Gly Glu Glu Arg Thr Leu Lys Gly Ser Tyr 245 250 255 Ile Gly Thr Cys Val Pro Leu Arg Asp Ile Pro Arg Phe Ile Ala Leu 260 265 270 Tyr Arg Asp Gly Arg Leu Pro Val Asn Arg Leu Leu Ser Gly Arg Leu 275 280 285 Lys Leu Glu Asp Ile Asn Glu Gly Phe Asp Arg Leu His Asp Gly Ser 290 295 300 Ala Val Arg Gln Val Ile Glu Phe305 310 891035DNAAgrobacterium 
tumefaciens C58 89atgaaacatt ctcaggacaa accacgcctg ctgattgcga tgcgtagcga gcttccagaa 60ggcttcttcg gtccgcgcga atgggcaagg ctgaatgccg tagcggacat tattccgggc 120tttccccata cggatttcga cacggcgaac ggtgccgagg ctctcgccga agcggatatt 180ctgctcgctg cctggggtac gccatccctg acacgcgaac gactttcacg cgcgccgcgg 240ctgaaaatgc tggcctatgc ggcatcatcg gtgcggatgg ttgcgcccgc agaattctgg 300gagacgtcgg atattctggt cacgacagca gcttccgcca tggccgtgcc ggttgccgaa 360ttcacctatg cggcaatcat catgtgcggc aaggatgtgt ttcgattgcg ggatgaacat 420agaacagagc gcggcaccgg cgtttttggc agcaggcgcg gcagaagcct gccctatctt 480ggcaatcatg cccgcaaggt tggcattgtc ggcgcctcgc gcatcgggcg gctggtgatg 540gagatgctgg cgcgcggcac attcgagatt gccgtttacg atccctttct gtcggcggaa 600gaggccgcat cccttggcgc gaagaaagcc gaactggacg agcttctcgc atggtccgat 660gtggtctcgc tgcacgcgcc gatcctgccg gaaacgcacc atatgatcgg cgcccgcgaa 720ctggcgctga tggcggacca tgccatcttc atcaacacgg cgcggggctg gctggtcgac 780cacgatgcat tgctgactga agcgatttcc ggacggctgc gcattctgat tgacacgccc 840gaacccgagc ccctgcccac ggacagcccg ttttacgatc tgcccaatgt cgttctaacc 900ccccatatag ccggggcgct gggcaatgaa ttgcgcgcac tttccgatct ggccattacc 960gaaattgaac gtttcgtggc gggacttgcg cccctccacc cggtccacaa gcaggatatg 1020gaacgtatgg catga 103590331PRTAgrobacterium tumefaciens C58 90Met Arg Ser Glu Leu Pro Glu Gly Phe Phe Gly Pro Arg Glu Trp Ala1 5 10 15 Arg Leu Asn Ala Val Ala Asp Ile Ile Pro Gly Phe Pro His Thr Asp 20 25 30 Phe Asp Thr Ala Asn Gly Ala Glu Ala Leu Ala Glu Ala Asp Ile Leu 35 40 45 Leu Ala Ala Trp Gly Thr Pro Ser Leu Thr Arg Glu Arg Leu Ser Arg 50 55 60 Ala Pro Arg Leu Lys Met Leu Ala Tyr Ala Ala Ser Ser Val Arg Met65 70 75 80 Val Ala Pro Ala Glu Phe Trp Glu Thr Ser Asp Ile Leu Val Thr Thr 85 90 95 Ala Ala Ser Ala Met Ala Val Pro Val Ala Glu Phe Thr Tyr Ala Ala 100 105 110 Ile Ile Met Cys Gly Lys Asp Val Phe Arg Leu Arg Asp Glu His Arg 115 120 125 Thr Glu Arg Gly Thr Gly Val Phe Gly Ser Arg Arg Gly Arg Ser Leu 130 135 140 Pro Tyr Leu Gly Asn His Ala Arg Lys Val Gly Ile Val Gly Ala Ser145 150 155 160 
Arg Ile Gly Arg Leu Val Met Glu Met Leu Ala Arg Gly Thr Phe Glu 165 170 175 Ile Ala Val Tyr Asp Pro Phe Leu Ser Ala Glu Glu Ala Ala Ser Leu 180 185 190 Gly Ala Lys Lys Ala Glu Leu Asp Glu Leu Leu Ala Trp Ser Asp Val 195 200 205 Val Ser Leu His Ala Pro Ile Leu Pro Glu Thr His His Met Ile Gly 210 215 220 Ala Arg Glu Leu Ala Leu Met Ala Asp His Ala Ile Phe Ile Asn Thr225 230 235 240 Ala Arg Gly Trp Leu Val Asp His Asp Ala Leu Leu Thr Glu Ala Ile 245 250 255 Ser Gly Arg Leu Arg Ile Leu Ile Asp Thr Pro Glu Pro Glu Pro Leu 260 265 270 Pro Thr Asp Ser Pro Phe Tyr Asp Leu Pro Asn Val Val Leu Thr Pro 275 280 285 His Ile Ala Gly Ala Leu Gly Asn Glu Leu Arg Ala Leu Ser Asp Leu 290 295 300 Ala Ile Thr Glu Ile Glu Arg Phe Val Ala Gly Leu Ala Pro Leu His305 310 315 320 Pro Val His Lys Gln Asp Met Glu Arg Met Ala 325 330 91750DNAAgrobacterium tumefaciens C58 91atgcagcgtt ttaccaacag aaccatcgtt gtcgccgggg ccggccggga tatcggccgg 60gcatgcgcca tccgtttcgc acaggaaggc gccaatgtcg ttcttaccta taatggcgcg 120gcagagggcg cggccacagc cgttgccgaa atcgaaaagc ttggtcgttc ggctctggcg 180atcaaggcgg atctcacaaa cgccgccgaa gtcgaggctg ccatatctgc ggctgcggac 240aagtttgggg agatccacgg cctcgtccat gttgccggcg gcctgatcgc ccgcaagaca 300atcgcagaaa tggatgaagc cttctggcat caggtcctcg acgtcaatct gacatcgctg 360ttcctgacgg ccaagaccgc attgccgaag atggccaagg gcggcgcgat cgtcactttc 420tcgtcgcagg ccggccgtga tggcggcggc ccgggcgctc ttgcctatgc cacttccaag 480ggtgccgtga tgaccttcac ccgcggactt gccaaagaag tcggccccaa aatccgcgtc 540aacgccgttt gccccggtat gatctccacc accttccacg ataccttcac caagccggag 600gtgcgcgaac gggtggccgg cgcgacgtcg ctcaagcgcg aagggtcgag cgaagacgtc 660gccggtctgg tggccttcct cgcgtctgac gatgccgctt atgtcaccgg cgcctgctac 720gacatcaatg gcggcgtcct gttttcctga 75092249PRTAgrobacterium tumefaciens C58 92Met Gln Arg Phe Thr Asn Arg Thr Ile Val Val Ala Gly Ala Gly Arg1 5 10 15 Asp Ile Gly Arg Ala Cys Ala Ile Arg Phe Ala Gln Glu Gly Ala Asn 20 25 30 Val Val Leu Thr Tyr Asn Gly Ala Ala Glu Gly Ala Ala Thr Ala Val 35 40 45 Ala Glu Ile Glu Lys 
Leu Gly Arg Ser Ala Leu Ala Ile Lys Ala Asp 50 55 60 Leu Thr Asn Ala Ala Glu Val Glu Ala Ala Ile Ser Ala Ala Ala Asp65 70 75 80 Lys Phe Gly Glu Ile His Gly Leu Val His Val Ala Gly Gly Leu Ile 85 90 95 Ala Arg Lys Thr Ile Ala Glu Met Asp Glu Ala Phe Trp His Gln Val 100 105 110 Leu Asp Val Asn Leu Thr Ser Leu Phe Leu Thr Ala Lys Thr Ala Leu 115 120 125 Pro Lys Met Ala Lys Gly Gly Ala Ile Val Thr Phe Ser Ser Gln Ala 130 135 140 Gly Arg Asp Gly Gly Gly Pro Gly Ala Leu Ala Tyr Ala Thr Ser Lys145 150 155 160 Gly Ala Val Met Thr Phe Thr Arg Gly Leu Ala Lys Glu Val Gly Pro 165 170 175 Lys Ile Arg Val Asn Ala Val Cys Pro Gly Met Ile Ser Thr Thr Phe 180 185 190 His Asp Thr Phe Thr Lys Pro Glu Val Arg Glu Arg Val Ala Gly Ala 195 200 205 Thr Ser Leu Lys Arg Glu Gly Ser Ser Glu Asp Val Ala Gly Leu Val 210 215 220 Ala Phe Leu Ala Ser Asp Asp Ala Ala Tyr Val Thr Gly Ala Cys Tyr225 230 235 240 Asp Ile Asn Gly Gly Val Leu Phe Ser 245 93930DNAEscherichia coli DH10B 93atgtccaaaa agattgccgt gattggcgaa tgcatgattg agctttccga gaaaggcgcg 60gacgttaagc gcggtttcgg cggcgatacc ctgaacactt ccgtctatat cgcccgtcag 120gtcgatcctg cggcattaac cgttcattac gtaacggcgc tgggaacgga cagttttagc 180cagcagatgc tggacgcctg gcacggcgag aacgttgata cttccctgac ccaacggatg 240gaaaaccgtc tgccgggcct ttactacatt gaaaccgaca gcaccggcga gcgtacgttc 300tactactggc ggaacgaagc cgccgccaaa ttctggctgg agagtgagca gtctgcggcg 360atttgcgaag agctggcgaa tttcgattat ctctacctga gcgggattag cctggcgatc 420ttaagcccga ccagccgcga aaagctgctt tccctgctgc gcgaatgccg cgccaacggc 480ggaaaagtga ttttcgacaa taactatcgt ccgcgcctgt gggccagcaa agaagagaca 540cagcaggtgt accaacaaat gctggaatgc acggatatcg ccttcctgac gctggacgac 600gaagacgcgc tgtggggtca acagccggtg gaagacgtca ttgcgcgcac ccataacgcg 660ggcgtgaaag aagtggtggt gaaacgcggg gcggattctt gcctggtgtc cattgctggc 720gaagggttag tggatgttcc ggcggtgaaa ctgccgaaag aaaaagtgat cgataccacc 780gcagctggcg actctttcag tgccggttat ctggcggtac gtctgacagg cggcagcgcg 840gaagacgcgg cgaaacgtgg gcacctgacc gcaagtaccg ttattcagta tcgcggcgcg 
900attatcccgc gtgaggcgat gccagcgtaa 93094309PRTEscherichia coli DH10B 94Met Ser Lys Lys Ile Ala Val Ile Gly Glu Cys Met Ile Glu Leu Ser1 5 10 15 Glu Lys Gly Ala Asp Val Lys Arg Gly Phe Gly Gly Asp Thr Leu Asn 20 25 30 Thr Ser Val Tyr Ile Ala Arg Gln Val Asp Pro Ala Ala Leu Thr Val 35 40 45 His Tyr Val Thr Ala Leu Gly Thr Asp Ser Phe Ser Gln Gln Met Leu 50 55 60 Asp Ala Trp His Gly Glu Asn Val Asp Thr Ser Leu Thr Gln Arg Met65 70 75 80 Glu Asn Arg Leu Pro Gly Leu Tyr Tyr Ile Glu Thr Asp Ser Thr Gly 85 90 95 Glu Arg Thr Phe Tyr Tyr Trp Arg Asn Glu Ala Ala Ala Lys Phe Trp 100 105 110 Leu Glu Ser Glu Gln Ser Ala Ala Ile Cys Glu Glu Leu Ala Asn Phe 115 120 125 Asp Tyr Leu Tyr Leu Ser Gly Ile Ser Leu Ala Ile Leu Ser Pro Thr 130 135 140 Ser Arg Glu Lys Leu Leu Ser Leu Leu Arg Glu Cys Arg Ala Asn Gly145 150 155 160 Gly Lys Val Ile Phe Asp Asn Asn Tyr Arg Pro Arg Leu Trp Ala Ser 165 170 175 Lys Glu Glu Thr Gln Gln Val Tyr Gln Gln Met Leu Glu Cys Thr Asp 180 185 190 Ile Ala Phe Leu Thr Leu Asp Asp Glu Asp Ala Leu Trp Gly Gln Gln 195 200 205 Pro Val Glu Asp Val Ile Ala Arg Thr His Asn Ala Gly Val Lys Glu 210 215 220 Val Val Val Lys Arg Gly Ala Asp Ser Cys Leu Val Ser Ile Ala Gly225 230 235 240 Glu Gly Leu Val Asp Val Pro Ala Val Lys Leu Pro Lys Glu Lys Val 245 250 255 Ile Asp Thr Thr Ala Ala Gly Asp Ser Phe Ser Ala Gly Tyr Leu Ala 260 265 270 Val Arg Leu Thr Gly Gly Ser Ala Glu Asp Ala Ala Lys Arg Gly His 275 280 285 Leu Thr Ala Ser Thr Val Ile Gln Tyr Arg Gly Ala Ile Ile Pro Arg 290 295 300 Glu Ala Met Pro Ala305 95642DNAEscherichia coli DH10B 95atgaaaaact ggaaaacaag tgcagaatca atcctgacca ccggcccggt tgtaccggtt 60atcgtggtaa aaaaactgga acacgcggtg ccgatggcaa aagcgttggt tgctggtggg 120gtgcgcgttc tggaagtgac tctgcgtacc gagtgtgcag ttgacgctat ccgtgctatc 180gccaaagaag tgcctgaagc gattgtgggt gccggtacgg tgctgaatcc acagcagctg 240gcagaagtca ctgaagcggg tgcacagttc gcaattagcc cgggtctgac cgagccgctg 300ctgaaagctg ctaccgaagg gactattcct ctgattccgg ggatcagcac tgtttccgaa 360ctgatgctgg gtatggacta 
cggtttgaaa gagttcaaat tcttcccggc tgaagctaac 420ggcggcgtga aagccctgca ggcgatcgcg ggtccgttct cccaggtccg tttctgcccg 480acgggtggta tttctccggc taactaccgt gactacctgg cgctgaaaag cgtgctgtgc 540atcggtggtt cctggctggt tccggcagat gcgctggaag cgggcgatta cgaccgcatt 600actaagctgg cgcgtgaagc tgtagaaggc gctaagctgt aa 64296213PRTEscherichia coli DH10B 96Met Lys Asn Trp Lys Thr Ser Ala Glu Ser Ile Leu Thr Thr Gly Pro1 5 10 15 Val Val Pro Val Ile Val Val Lys Lys Leu Glu His Ala Val Pro Met 20 25 30 Ala Lys Ala Leu Val Ala Gly Gly Val Arg Val Leu Glu Val Thr Leu 35 40 45 Arg Thr Glu Cys Ala Val Asp Ala Ile Arg Ala Ile Ala Lys Glu Val 50 55 60 Pro Glu Ala Ile Val Gly Ala Gly Thr Val Leu Asn Pro Gln Gln Leu65 70 75 80 Ala Glu Val Thr Glu Ala Gly Ala Gln Phe Ala Ile Ser Pro Gly Leu 85 90 95 Thr
Glu Pro Leu Leu Lys Ala Ala Thr Glu Gly Thr Ile Pro Leu Ile 100 105 110 Pro Gly Ile Ser Thr Val Ser Glu Leu Met Leu Gly Met Asp Tyr Gly 115 120 125 Leu Lys Glu Phe Lys Phe Phe Pro Ala Glu Ala Asn Gly Gly Val Lys 130 135 140 Ala Leu Gln Ala Ile Ala Gly Pro Phe Ser Gln Val Arg Phe Cys Pro145 150 155 160 Thr Gly Gly Ile Ser Pro Ala Asn Tyr Arg Asp Tyr Leu Ala Leu Lys 165 170 175 Ser Val Leu Cys Ile Gly Gly Ser Trp Leu Val Pro Ala Asp Ala Leu 180 185 190 Glu Ala Gly Asp Tyr Asp Arg Ile Thr Lys Leu Ala Arg Glu Ala Val 195 200 205 Glu Gly Ala Lys Leu 210 97780DNALactobacillus brevis ATCC 367 97atggcatcaa atggaaaagt agcaatggtt accggtggcg gacaaggaat tggtgaagcc 60atctcgaaac ggttagctaa cgacggcttt gctgtggcaa ttgctgattt gaacttggac 120aatgccaaca aggtcgtttc tgatattgaa gctgctggtg gcaaggccat tgcggtcaag 180accgatgtct ctgatcgtga tagcgtgttt gctgcggtta atgaagcggc cgacaagctg 240ggcggctttg acgttatcgt taataacgcc ggccttggcc caaccacgcc aattgacacc 300atcacccaag aacagtttga tacggtttat cacgttaacg tgggtggggt tctttggggc 360attcaagcag cccatgcgaa gttcaaggaa ttgggtcatg gtgggaagat catttccgcg 420acgtctcaag ccggggttgt tggtaacccg aacttagctc tgtacagtgg aactaagttt 480gccattcgtg gtgtgaccca agttgcggcg cgtgacttag ccgctgaagg tatcacggtc 540aatgcttatg cacccgggat tgttaagaca ccaatgatgt ttgacatcgc tcacaaggtt 600ggtcaaaatg ctggtaaaga cgacgaatgg gggatgcaaa ccttctcaaa ggacatcgct 660ttatgtcgat tgtcagaacc agaagatgtg gctaacgggg tggctttctt agccggtccc 720gattctaact acattacggg tcaaacactt gaagttgatg gtgggatgca gttccactaa 78098259PRTLactobacillus brevis ATCC 367 98Met Ala Ser Asn Gly Lys Val Ala Met Val Thr Gly Gly Gly Gln Gly1 5 10 15 Ile Gly Glu Ala Ile Ser Lys Arg Leu Ala Asn Asp Gly Phe Ala Val 20 25 30 Ala Ile Ala Asp Leu Asn Leu Asp Asn Ala Asn Lys Val Val Ser Asp 35 40 45 Ile Glu Ala Ala Gly Gly Lys Ala Ile Ala Val Lys Thr Asp Val Ser 50 55 60 Asp Arg Asp Ser Val Phe Ala Ala Val Asn Glu Ala Ala Asp Lys Leu65 70 75 80 Gly Gly Phe Asp Val Ile Val Asn Asn Ala Gly Leu Gly Pro Thr Thr 85 90 95 Pro Ile Asp Thr Ile Thr Gln 
Glu Gln Phe Asp Thr Val Tyr His Val 100 105 110 Asn Val Gly Gly Val Leu Trp Gly Ile Gln Ala Ala His Ala Lys Phe 115 120 125 Lys Glu Leu Gly His Gly Gly Lys Ile Ile Ser Ala Thr Ser Gln Ala 130 135 140 Gly Val Val Gly Asn Pro Asn Leu Ala Leu Tyr Ser Gly Thr Lys Phe145 150 155 160 Ala Ile Arg Gly Val Thr Gln Val Ala Ala Arg Asp Leu Ala Ala Glu 165 170 175 Gly Ile Thr Val Asn Ala Tyr Ala Pro Gly Ile Val Lys Thr Pro Met 180 185 190 Met Phe Asp Ile Ala His Lys Val Gly Gln Asn Ala Gly Lys Asp Asp 195 200 205 Glu Trp Gly Met Gln Thr Phe Ser Lys Asp Ile Ala Leu Cys Arg Leu 210 215 220 Ser Glu Pro Glu Asp Val Ala Asn Gly Val Ala Phe Leu Ala Gly Pro225 230 235 240 Asp Ser Asn Tyr Ile Thr Gly Gln Thr Leu Glu Val Asp Gly Gly Met 245 250 255 Gln Phe His991089DNAPseudomonas putida KT2440 99atgaatgacc tgagccacac ccacatgcgc gcggccgtct ggcatggccg ccacgatatt 60cgtgtcgaac aggtaccttt gccggccgac cctgcgccgg gctgggtgca gatcaaggtg 120gactggtgcg gcatctgcgg ctccgacctg cacgaatatg ttgccggccc ggtgttcatc 180ccggtagagg ccccgcaccc gctgaccggc attcagggcc agtgcatcct cggccacgaa 240ttctgcggcc acatcgccaa gcttggcgaa ggcgtggaag gctatgccgt aggcgacccg 300gtggcggcag acgcgtgcca gcattgtggt acctgctatt actgcaccca tggcctgtac 360aacatctgcg aacgcctggc gttcaccggc ctgatgaaca acggtgcctt cgccgagctg 420gtcaacgtgc ccgccaacct gctctaccgg ctgccgcagg gcttccctgc cgaagccggg 480gcactgatcg agccgctggc ggtgggtatg cacgcggtga aaaaggccgg cagcctgctt 540gggcaaaccg ttgtagtggt tggggccggc accatcggcc tgtgcaccat catgtgcgcc 600aaggctgcag gtgcggcaca ggtcatcgcc cttgagatgt cctctgcgcg caaagccaag 660gccaaggaag cgggcgccaa cgtggtgctg gaccccagcc agtgcgatgc cctggcggaa 720atccgcgcac tgactgctgg gctgggcgcc gatgtgagtt ttgagtgcat cggcaacaaa 780catacggcca agctggccat cgacaccatc cgcaaagcag gcaagtgcgt gctggtgggt 840attttcgaag agcccagcga gttcaacttc ttcgagctgg tgtccaccga gaagcaagtg 900ctgggggcgt tggcgtacaa cggcgagttt gctgacgtga ttgccttcat tgctgatggt 960cggctggata ttcgcccgct ggtaaccggc cggatcggat tggagcagat tgtcgagctg 1020ggcttcgagg aactggtgaa caacaaagag 
gagaacgtga agatcatcgt ttcaccaggt 1080gtgcgctga 1089100362PRTPseudomonas putida KT2440 100Met Asn Asp Leu Ser His Thr His Met Arg Ala Ala Val Trp His Gly1 5 10 15 Arg His Asp Ile Arg Val Glu Gln Val Pro Leu Pro Ala Asp Pro Ala 20 25 30 Pro Gly Trp Val Gln Ile Lys Val Asp Trp Cys Gly Ile Cys Gly Ser 35 40 45 Asp Leu His Glu Tyr Val Ala Gly Pro Val Phe Ile Pro Val Glu Ala 50 55 60 Pro His Pro Leu Thr Gly Ile Gln Gly Gln Cys Ile Leu Gly His Glu65 70 75 80 Phe Cys Gly His Ile Ala Lys Leu Gly Glu Gly Val Glu Gly Tyr Ala 85 90 95 Val Gly Asp Pro Val Ala Ala Asp Ala Cys Gln His Cys Gly Thr Cys 100 105 110 Tyr Tyr Cys Thr His Gly Leu Tyr Asn Ile Cys Glu Arg Leu Ala Phe 115 120 125 Thr Gly Leu Met Asn Asn Gly Ala Phe Ala Glu Leu Val Asn Val Pro 130 135 140 Ala Asn Leu Leu Tyr Arg Leu Pro Gln Gly Phe Pro Ala Glu Ala Gly145 150 155 160 Ala Leu Ile Glu Pro Leu Ala Val Gly Met His Ala Val Lys Lys Ala 165 170 175 Gly Ser Leu Leu Gly Gln Thr Val Val Val Val Gly Ala Gly Thr Ile 180 185 190 Gly Leu Cys Thr Ile Met Cys Ala Lys Ala Ala Gly Ala Ala Gln Val 195 200 205 Ile Ala Leu Glu Met Ser Ser Ala Arg Lys Ala Lys Ala Lys Glu Ala 210 215 220 Gly Ala Asn Val Val Leu Asp Pro Ser Gln Cys Asp Ala Leu Ala Glu225 230 235 240 Ile Arg Ala Leu Thr Ala Gly Leu Gly Ala Asp Val Ser Phe Glu Cys 245 250 255 Ile Gly Asn Lys His Thr Ala Lys Leu Ala Ile Asp Thr Ile Arg Lys 260 265 270 Ala Gly Lys Cys Val Leu Val Gly Ile Phe Glu Glu Pro Ser Glu Phe 275 280 285 Asn Phe Phe Glu Leu Val Ser Thr Glu Lys Gln Val Leu Gly Ala Leu 290 295 300 Ala Tyr Asn Gly Glu Phe Ala Asp Val Ile Ala Phe Ile Ala Asp Gly305 310 315 320 Arg Leu Asp Ile Arg Pro Leu Val Thr Gly Arg Ile Gly Leu Glu Gln 325 330 335 Ile Val Glu Leu Gly Phe Glu Glu Leu Val Asn Asn Lys Glu Glu Asn 340 345 350 Val Lys Ile Ile Val Ser Pro Gly Val Arg 355 360 101771DNAKlebsiella pneumoniae MGH78578 101atgaaaaaag tcgcacttgt taccggcgcc ggccagggga ttggtaaagc tatcgccctt 60cgtctggtga aggatggatt tgccgtggcc attgccgatt ataacgacgc caccgccaaa 120gcggtcgcct 
cggaaatcaa ccaggccggc ggacacgccg tggcggtgaa agtggatgtc 180tccgaccgcg atcaggtatt tgccgccgtt gaacaggcgc gcaaaacgct gggcggcttc 240gacgtcatcg tcaataacgc cggtgtggca ccgtctacgc cgatcgagtc cattaccccg 300gagattgtcg acaaagtcta caacatcaac gtcaaagggg tgatctgggg tattcaggcg 360gcggtcgagg cctttaagaa agaggggcac ggcgggaaaa tcatcaacgc ctgttcccag 420gccggccacg tcggcaaccc ggagctggcg gtgtatagct ccagtaaatt cgcggtacgc 480ggcttaaccc agaccgccgc tcgcgacctc gcgccgctgg gcatcacggt caacggctac 540tgcccgggga ttgtcaaaac gccaatgtgg gccgaaattg accgccaggt gtccgaagcc 600gccggtaaac cgctgggcta cggtaccgcc gagttcgcca aacgcatcac tctcggtcgt 660ctgtccgagc cggaagatgt cgccgcctgc gtctcctatc ttgccagccc ggattctgat 720tacatgaccg gtcagtcgtt gctgatcgac ggcgggatgg tatttaacta a 771102256PRTKlebsiella pneumoniae MGH78578 102Met Lys Lys Val Ala Leu Val Thr Gly Ala Gly Gln Gly Ile Gly Lys1 5 10 15 Ala Ile Ala Leu Arg Leu Val Lys Asp Gly Phe Ala Val Ala Ile Ala 20 25 30 Asp Tyr Asn Asp Ala Thr Ala Lys Ala Val Ala Ser Glu Ile Asn Gln 35 40 45 Ala Gly Gly His Ala Val Ala Val Lys Val Asp Val Ser Asp Arg Asp 50 55 60 Gln Val Phe Ala Ala Val Glu Gln Ala Arg Lys Thr Leu Gly Gly Phe65 70 75 80 Asp Val Ile Val Asn Asn Ala Gly Val Ala Pro Ser Thr Pro Ile Glu 85 90 95 Ser Ile Thr Pro Glu Ile Val Asp Lys Val Tyr Asn Ile Asn Val Lys 100 105 110 Gly Val Ile Trp Gly Ile Gln Ala Ala Val Glu Ala Phe Lys Lys Glu 115 120 125 Gly His Gly Gly Lys Ile Ile Asn Ala Cys Ser Gln Ala Gly His Val 130 135 140 Gly Asn Pro Glu Leu Ala Val Tyr Ser Ser Ser Lys Phe Ala Val Arg145 150 155 160 Gly Leu Thr Gln Thr Ala Ala Arg Asp Leu Ala Pro Leu Gly Ile Thr 165 170 175 Val Asn Gly Tyr Cys Pro Gly Ile Val Lys Thr Pro Met Trp Ala Glu 180 185 190 Ile Asp Arg Gln Val Ser Glu Ala Ala Gly Lys Pro Leu Gly Tyr Gly 195 200 205 Thr Ala Glu Phe Ala Lys Arg Ile Thr Leu Gly Arg Leu Ser Glu Pro 210 215 220 Glu Asp Val Ala Ala Cys Val Ser Tyr Leu Ala Ser Pro Asp Ser Asp225 230 235 240 Tyr Met Thr Gly Gln Ser Leu Leu Ile Asp Gly Gly Met Val Phe Asn 245 250 255 
1031665DNAKlebsiella pneumoniae MGH78578 103atgagatcga aaagatttga agcactggcg aaacgccctg tgaatcagga tggtttcgtt 60aaggagtgga ttgaagaggg ctttatcgcg atggaaagcc ctaacgatcc caaaccttct 120atccgcatcg tcaacggcgc ggtgaccgaa ctcgacgata aaccggttga gcagttcgac 180ctgattgacc actttatcgc gcgctacggc attaatctcg cccgggccga agaagtgatg 240gccatggatt cggttaagct cgccaacatg ctctgcgacc cgaacgttaa acgcagcgac 300atcgtgccgc tcactaccgc gatgaccccg gcgaaaatcg tggaagtggt gtcgcatatg 360aacgtggtcg agatgatgat ggcgatgcaa aaaatgcgcg cccgccgcac gccgtcccag 420caggcgcatg tcactaatat caaagataat ccggtacaga ttgccgccga cgccgctgaa 480ggcgcatggc gcggctttga cgagcaggag accaccgtcg ccgtggcgcg ctacgcgccg 540ttcaacgcca tcgccctgct ggtcggttca caggttggcc gccccggcgt cctcacccag 600tgttcgctgg aagaagccac cgagctgaaa ctgggcatgc tgggccacac ctgctatgcc 660gaaaccattt cggtatacgg tacggaaccg gtgtttaccg atggcgatga caccccgtgg 720tcgaaaggct tcctcgcctc ctcctacgcc tcgcgcggcc tgaaaatgcg ctttacctcc 780ggttccggct cggaggtgca gatgggctat gccgaaggca aatcgatgct ttatctcgaa 840gcgcgctgca tctacatcac caaagccgcc ggggtgcaag gcctgcagaa tggctccgtc 900agctgtatcg gcgtgccgtc cgccgtgccg tccgggatcc gcgccgtact ggcggaaaac 960ctgatctgct cagcgctgga tctggagtgc gcctccagca acgatcaaac ctttacccac 1020tcggatatgc ggcgtaccgc gcgtctgctg atgcagttcc tgccaggtac cgactttatc 1080tcctccggtt actcggcggt gccgaactac gacaacatgt tcgccggttc caacgaagat 1140gccgaagact tcgatgacta caacgtgatc cagcgcgacc tgaaggtcga tggcggcctg 1200cggccggtgc gtgaagagga cgtgatcgcc attcgcaaca aagccgcccg cgcgctgcag 1260gcggtatttg ccggcatggg tttgccgcct attacggatg aagaagtaga agccgccacc 1320tacgcccacg gttcaaaaga tatgcctgag cgcaatatcg tcgaggacat caagtttgct 1380caggagatca tcaacaagaa ccgcaacggc ctggaggtgg tgaaagccct ggcgaaaggc 1440ggcttccccg atgtcgccca ggacatgctc aatattcaga aagccaagct caccggcgac 1500tacctgcata cctccgccat cattgttggc gagggccagg tgctctcggc cgtgaatgac 1560gtgaacgatt atgccggtcc ggcaacaggc taccgcctgc aaggcgagcg ctgggaagag 1620attaaaaata tcccgggcgc gctcgatccc aatgaacttg gctaa 1665104554PRTKlebsiella 
pneumoniae MGH78578 104Met Arg Ser Lys Arg Phe Glu Ala Leu Ala Lys Arg Pro Val Asn Gln1 5 10 15 Asp Gly Phe Val Lys Glu Trp Ile Glu Glu Gly Phe Ile Ala Met Glu 20 25 30 Ser Pro Asn Asp Pro Lys Pro Ser Ile Arg Ile Val Asn Gly Ala Val 35 40 45 Thr Glu Leu Asp Asp Lys Pro Val Glu Gln Phe Asp Leu Ile Asp His 50 55 60 Phe Ile Ala Arg Tyr Gly Ile Asn Leu Ala Arg Ala Glu Glu Val Met65 70 75 80 Ala Met Asp Ser Val Lys Leu Ala Asn Met Leu Cys Asp Pro Asn Val 85 90 95 Lys Arg Ser Asp Ile Val Pro Leu Thr Thr Ala Met Thr Pro Ala Lys 100 105 110 Ile Val Glu Val Val Ser His Met Asn Val Val Glu Met Met Met Ala 115 120 125 Met Gln Lys Met Arg Ala Arg Arg Thr Pro Ser Gln Gln Ala His Val 130 135 140 Thr Asn Ile Lys Asp Asn Pro Val Gln Ile Ala Ala Asp Ala Ala Glu145 150 155 160 Gly Ala Trp Arg Gly Phe Asp Glu Gln Glu Thr Thr Val Ala Val Ala 165 170 175 Arg Tyr Ala Pro Phe Asn Ala Ile Ala Leu Leu Val Gly Ser Gln Val 180 185 190 Gly Arg Pro Gly Val Leu Thr Gln Cys Ser Leu Glu Glu Ala Thr Glu 195 200 205 Leu Lys Leu Gly Met Leu Gly His Thr Cys Tyr Ala Glu Thr Ile Ser 210 215 220 Val Tyr Gly Thr Glu Pro Val Phe Thr Asp Gly Asp Asp Thr Pro Trp225 230 235 240 Ser Lys Gly Phe Leu Ala Ser Ser Tyr Ala Ser Arg Gly Leu Lys Met 245 250 255 Arg Phe Thr Ser Gly Ser Gly Ser Glu Val Gln Met Gly Tyr Ala Glu 260 265 270 Gly Lys Ser Met Leu Tyr Leu Glu Ala Arg Cys Ile Tyr Ile Thr Lys 275 280 285 Ala Ala Gly Val Gln Gly Leu Gln Asn Gly Ser Val Ser Cys Ile Gly 290 295 300 Val Pro Ser Ala Val Pro Ser Gly Ile Arg Ala Val Leu Ala Glu Asn305 310 315 320 Leu Ile Cys Ser Ala Leu Asp Leu Glu Cys Ala Ser Ser Asn Asp Gln 325 330 335 Thr Phe Thr His Ser Asp Met Arg Arg Thr Ala Arg Leu Leu Met Gln 340 345 350 Phe Leu Pro Gly Thr Asp Phe Ile Ser Ser Gly Tyr Ser Ala Val Pro 355 360 365 Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Glu Asp Ala Glu Asp Phe 370 375 380 Asp Asp Tyr Asn Val Ile Gln Arg Asp Leu Lys Val Asp Gly Gly Leu385 390 395 400 Arg Pro Val Arg Glu Glu Asp Val Ile Ala Ile Arg Asn Lys Ala Ala 405 410 415 Arg 
Ala Leu Gln Ala Val Phe Ala Gly Met Gly Leu Pro Pro Ile Thr 420 425 430 Asp Glu Glu Val Glu Ala Ala Thr Tyr Ala His Gly Ser Lys Asp Met 435 440 445 Pro Glu Arg Asn Ile Val Glu Asp Ile Lys Phe Ala Gln Glu Ile Ile 450 455 460 Asn Lys Asn Arg Asn Gly Leu Glu Val Val Lys Ala Leu Ala Lys Gly465 470 475 480 Gly Phe Pro Asp Val Ala Gln Asp Met Leu Asn Ile Gln Lys Ala Lys 485 490 495 Leu Thr Gly Asp Tyr Leu His Thr Ser Ala Ile Ile Val Gly Glu Gly 500 505 510 Gln Val Leu Ser Ala Val Asn Asp Val Asn Asp Tyr Ala Gly Pro Ala 515 520 525 Thr Gly Tyr Arg Leu Gln Gly Glu Arg Trp Glu Glu Ile Lys Asn Ile 530 535 540 Pro Gly Ala Leu Asp Pro Asn Glu Leu Gly545 550 105690DNAKlebsiella pneumoniae MGH78578 105atggaaatta acgaaacgct gctgcgccag attatcgaag aggtgctgtc ggagatgaaa 60tcaggcgcag ataagccggt ctcctttagc gcgcctgcgg cttctgtcgc ctctgccgcg 120ccggtcgccg ttgcgcctgt gtccggcgac agcttcctga cggaaatcgg cgaagccaaa 180cccggcacgc agcaggatga agtcattatt gccgtcgggc cagcgtttgg tctggcgcaa 240accgccaata tcgtcggcat
tccgcataaa aatattctgc gcgaagtgat cgccggcatt 300gaggaagaag gcatcaaagc ccgggtgatc cgctgcttta agtcttctga cgtcgccttc 360gtggcagtgg aaggcaaccg cctgagcggc tccggcatct cgatcggtat tcagtcgaaa 420ggcaccaccg tcatccacca gcgcggcctg ccgccgcttt ccaatctgga actcttcccg 480caggcgccgc tgctgacgct ggaaacctac cgtcagattg gcaaaaacgc cgcgcgctac 540gccaaacgcg agtcgccgca gccggtgccg acgcttaacg atcagatggc tcgtcccaaa 600taccaggcga agtcggccat tttgcacatt aaagagacca aatacgtggt gacgggcaaa 660aacccgcagg aactgcgcgt ggcgctttaa 690106229PRTKlebsiella pneumoniae MGH78578 106Met Glu Ile Asn Glu Thr Leu Leu Arg Gln Ile Ile Glu Glu Val Leu1 5 10 15 Ser Glu Met Lys Ser Gly Ala Asp Lys Pro Val Ser Phe Ser Ala Pro 20 25 30 Ala Ala Ser Val Ala Ser Ala Ala Pro Val Ala Val Ala Pro Val Ser 35 40 45 Gly Asp Ser Phe Leu Thr Glu Ile Gly Glu Ala Lys Pro Gly Thr Gln 50 55 60 Gln Asp Glu Val Ile Ile Ala Val Gly Pro Ala Phe Gly Leu Ala Gln65 70 75 80 Thr Ala Asn Ile Val Gly Ile Pro His Lys Asn Ile Leu Arg Glu Val 85 90 95 Ile Ala Gly Ile Glu Glu Glu Gly Ile Lys Ala Arg Val Ile Arg Cys 100 105 110 Phe Lys Ser Ser Asp Val Ala Phe Val Ala Val Glu Gly Asn Arg Leu 115 120 125 Ser Gly Ser Gly Ile Ser Ile Gly Ile Gln Ser Lys Gly Thr Thr Val 130 135 140 Ile His Gln Arg Gly Leu Pro Pro Leu Ser Asn Leu Glu Leu Phe Pro145 150 155 160 Gln Ala Pro Leu Leu Thr Leu Glu Thr Tyr Arg Gln Ile Gly Lys Asn 165 170 175 Ala Ala Arg Tyr Ala Lys Arg Glu Ser Pro Gln Pro Val Pro Thr Leu 180 185 190 Asn Asp Gln Met Ala Arg Pro Lys Tyr Gln Ala Lys Ser Ala Ile Leu 195 200 205 His Ile Lys Glu Thr Lys Tyr Val Val Thr Gly Lys Asn Pro Gln Glu 210 215 220 Leu Arg Val Ala Leu225 107525DNAKlebsiella pneumoniae MGH78578 107atgaataccg acgcaattga atccatggta cgcgacgtgc tgagccggat gaacagccta 60caggacggga taacgcccgc gccagccgcg ccgacaaacg acaccgttcg ccagccaaaa 120gttagcgact acccgttagc gacccgccat ccggagtggg tcaaaaccgc taccaataaa 180acgctcgatg acctgacgct ggagaacgta ttaagcgatc gcgttacggc gcaggacatg 240cgcatcactc cggaaacgct gcgtatgcag gcggcgatcg cccaggatgc cggacgcgat 
300cggctggcga tgaactttga gcgggccgca gagctcaccg cggttcccga cgaccgaatc 360cttgagatct acaacgccct gcgcccatac cgttccaccc aggcggagct actggcgatc 420gctgatgacc tcgagcatcg ctaccaggca cgactctgtg ccgcctttgt tcgggaagcg 480gccgggctgt acatcgagcg taagaagctg aaaggcgacg attaa 525108174PRTKlebsiella pneumoniae MGH78578 108Met Asn Thr Asp Ala Ile Glu Ser Met Val Arg Asp Val Leu Ser Arg1 5 10 15 Met Asn Ser Leu Gln Asp Gly Ile Thr Pro Ala Pro Ala Ala Pro Thr 20 25 30 Asn Asp Thr Val Arg Gln Pro Lys Val Ser Asp Tyr Pro Leu Ala Thr 35 40 45 Arg His Pro Glu Trp Val Lys Thr Ala Thr Asn Lys Thr Leu Asp Asp 50 55 60 Leu Thr Leu Glu Asn Val Leu Ser Asp Arg Val Thr Ala Gln Asp Met65 70 75 80 Arg Ile Thr Pro Glu Thr Leu Arg Met Gln Ala Ala Ile Ala Gln Asp 85 90 95 Ala Gly Arg Asp Arg Leu Ala Met Asn Phe Glu Arg Ala Ala Glu Leu 100 105 110 Thr Ala Val Pro Asp Asp Arg Ile Leu Glu Ile Tyr Asn Ala Leu Arg 115 120 125 Pro Tyr Arg Ser Thr Gln Ala Glu Leu Leu Ala Ile Ala Asp Asp Leu 130 135 140 Glu His Arg Tyr Gln Ala Arg Leu Cys Ala Ala Phe Val Arg Glu Ala145 150 155 160 Ala Gly Leu Tyr Ile Glu Arg Lys Lys Leu Lys Gly Asp Asp 165 170 109789DNAPseudomonas putida KT2440 109atgacagtca attatgattt ttccggaaaa gtcgtgctgg ttaccggcgc tggctctggt 60attggccgtg ccactgcgct tgccttcgcg cagtcgggcg catccgttgc ggtcgcagac 120atctcgactg accacggttt gaaaaccgta gagttggtca aagccgaagg aggcgaggcg 180accttcttcc atgtcgatgt aggctctgaa cccagcgtcc agtcgatgct ggctggtgtc 240gtggcgcatt acggcggcct ggacattgcg cacaacaacg ccggcattga ggccaatatc 300gtgccgctgg ccgagctgga ctccgacaac tggcgtcgtg tcatcgatgt gaacctttcc 360tcggtgttct attgcctgaa aggtgaaatc cctctgatgc tgaaaagggg cggcggcgcc 420attgtgaata ccgcatcggc ctccgggctg attggcggct atcgcctttc cgggtatacc 480gccacgaagc acggcgtagt ggggctgact aaggctgctg ctatcgatta tgcaaaccag 540aatatccgga ttaatgccgt gtgccctggt ccagttgact ccccattcct ggctgacatg 600ccgcaaccca tgcgcgatcg acttctcttt ggcactccaa ttggacgatt ggccaccgca 660gaggagatcg cgcgttcggt tctgtggctg tgttctgacg atgcaaaata cgtggtgggc 720cattcgatgt 
cagtcgacgg tggcgtggca gtgactgcgg ttggtactcg aatggatgat 780ctcttttaa 789110262PRTPseudomonas putida KT2440 110Met Thr Val Asn Tyr Asp Phe Ser Gly Lys Val Val Leu Val Thr Gly1 5 10 15 Ala Gly Ser Gly Ile Gly Arg Ala Thr Ala Leu Ala Phe Ala Gln Ser 20 25 30 Gly Ala Ser Val Ala Val Ala Asp Ile Ser Thr Asp His Gly Leu Lys 35 40 45 Thr Val Glu Leu Val Lys Ala Glu Gly Gly Glu Ala Thr Phe Phe His 50 55 60 Val Asp Val Gly Ser Glu Pro Ser Val Gln Ser Met Leu Ala Gly Val65 70 75 80 Val Ala His Tyr Gly Gly Leu Asp Ile Ala His Asn Asn Ala Gly Ile 85 90 95 Glu Ala Asn Ile Val Pro Leu Ala Glu Leu Asp Ser Asp Asn Trp Arg 100 105 110 Arg Val Ile Asp Val Asn Leu Ser Ser Val Phe Tyr Cys Leu Lys Gly 115 120 125 Glu Ile Pro Leu Met Leu Lys Arg Gly Gly Gly Ala Ile Val Asn Thr 130 135 140 Ala Ser Ala Ser Gly Leu Ile Gly Gly Tyr Arg Leu Ser Gly Tyr Thr145 150 155 160 Ala Thr Lys His Gly Val Val Gly Leu Thr Lys Ala Ala Ala Ile Asp 165 170 175 Tyr Ala Asn Gln Asn Ile Arg Ile Asn Ala Val Cys Pro Gly Pro Val 180 185 190 Asp Ser Pro Phe Leu Ala Asp Met Pro Gln Pro Met Arg Asp Arg Leu 195 200 205 Leu Phe Gly Thr Pro Ile Gly Arg Leu Ala Thr Ala Glu Glu Ile Ala 210 215 220 Arg Ser Val Leu Trp Leu Cys Ser Asp Asp Ala Lys Tyr Val Val Gly225 230 235 240 His Ser Met Ser Val Asp Gly Gly Val Ala Val Thr Ala Val Gly Thr 245 250 255 Arg Met Asp Asp Leu Phe 260 111762DNAPseudomonas putida KT2440 111atgagcatga ccttttctgg ccaggtagcc ctggtgaccg gcgcgggtgc cggcatcggc 60cgggcaaccg ccctggcgtt cgcccacgag ggcatgaaag tggtggtggc ggacctcgac 120ccggtcggcg gcgaggccac cgtggcgcag atccacgcgg caggcggcga agcgctgttc 180attgcctgcg acgtgacccg cgacgccgag gtgcgccagt tgcatgagcg cctgatggcc 240gcctacggcc ggctggacta cgccttcaac aacgccggga tcgagatcga gcaacaccgc 300ctggccgaag gcagcgaagc ggagttcgat gccatcatgg gcgtgaacgt gaagggcgtg 360tggttgtgca tgaagtatca gttgcccttg ttgctggccc aaggcggtgg ggccatcgtc 420aataccgcgt cggtggcggg gctaggggcg gcgccaaaga tgagcatcta cagcgccagc 480aagcatgcgg tcatcggtct gaccaagtcg gcggccatcg agtacgccaa gaagggcatc 
540cgcgtgaacg ccgtgtgccc ggccgtgatc gacaccgaca tgttccgccg cgcttaccag 600gccgacccgc gcaaggccga gttcgccgca gccatgcacc cggtagggcg cattggcaag 660gtcgaggaaa tcgccagcgc cgtgctgtat ctgtgcagtg acggcgcggc gtttaccacc 720gggcattgcc tgacggtgga tggtggggct acggcgatct ga 762112253PRTPseudomonas putida KT2440 112Met Ser Met Thr Phe Ser Gly Gln Val Ala Leu Val Thr Gly Ala Gly1 5 10 15 Ala Gly Ile Gly Arg Ala Thr Ala Leu Ala Phe Ala His Glu Gly Met 20 25 30 Lys Val Val Val Ala Asp Leu Asp Pro Val Gly Gly Glu Ala Thr Val 35 40 45 Ala Gln Ile His Ala Ala Gly Gly Glu Ala Leu Phe Ile Ala Cys Asp 50 55 60 Val Thr Arg Asp Ala Glu Val Arg Gln Leu His Glu Arg Leu Met Ala65 70 75 80 Ala Tyr Gly Arg Leu Asp Tyr Ala Phe Asn Asn Ala Gly Ile Glu Ile 85 90 95 Glu Gln His Arg Leu Ala Glu Gly Ser Glu Ala Glu Phe Asp Ala Ile 100 105 110 Met Gly Val Asn Val Lys Gly Val Trp Leu Cys Met Lys Tyr Gln Leu 115 120 125 Pro Leu Leu Leu Ala Gln Gly Gly Gly Ala Ile Val Asn Thr Ala Ser 130 135 140 Val Ala Gly Leu Gly Ala Ala Pro Lys Met Ser Ile Tyr Ser Ala Ser145 150 155 160 Lys His Ala Val Ile Gly Leu Thr Lys Ser Ala Ala Ile Glu Tyr Ala 165 170 175 Lys Lys Gly Ile Arg Val Asn Ala Val Cys Pro Ala Val Ile Asp Thr 180 185 190 Asp Met Phe Arg Arg Ala Tyr Gln Ala Asp Pro Arg Lys Ala Glu Phe 195 200 205 Ala Ala Ala Met His Pro Val Gly Arg Ile Gly Lys Val Glu Glu Ile 210 215 220 Ala Ser Ala Val Leu Tyr Leu Cys Ser Asp Gly Ala Ala Phe Thr Thr225 230 235 240 Gly His Cys Leu Thr Val Asp Gly Gly Ala Thr Ala Ile 245 250 113810DNAPseudomonas putida KT2440 113atgtcttttc aaaacaaaat cgttgtgctc acaggcgcag cttctggcat cggcaaagcg 60acagcacagc tgctagtgga gcagggcgcc catgtggttg ccatggatct taaaagcgac 120ttgcttcaac aagcattcgg cagtgaggag cacgttctgt gcatccctac cgacgtcagc 180gatagcgaag ccgtgcgagc cgccttccag gcagtggacg cgaaatttgg ccgtgtcgac 240gtgattatta acgccgcggg catcaacgca cctacgcgag aagccaacca gaaaatggtt 300gatgccaacg tcgctgccct cgatgccatg aagagcgggc gggcgcccac tttcgacttc 360ctggccgata cctcggatca ggatttccgg cgcgtaatgg aagtcaattt 
gttcagccag 420ttttactgca ttcgagaggg tgttccgctg atgcgccgag cgggtggcgg cagcatcgtc 480aacatctcca gcgtggcagc gctcctgggc gtggcaatgc cactttacta ccccgcctcc 540aaggcggcgg tgctgggcct cacccgtgca gcggcagctg agttggcacc ttacaacatt 600cgtgtgaatg ccatcgctcc aggctctgtc gacacaccat tgatgcatga gcaaccaccg 660gaagtcgttc agttcctggt cagcatgcaa cccatcaagc ggctggccca acccgaggag 720cttgcccaaa gcatcctgtt ccttgccggt gagcattcgt ccttcatcac cggacagacg 780ctttctccca acggcgggat gcacatgtaa 810114269PRTPseudomonas putida KT2440 114Met Ser Phe Gln Asn Lys Ile Val Val Leu Thr Gly Ala Ala Ser Gly1 5 10 15 Ile Gly Lys Ala Thr Ala Gln Leu Leu Val Glu Gln Gly Ala His Val 20 25 30 Val Ala Met Asp Leu Lys Ser Asp Leu Leu Gln Gln Ala Phe Gly Ser 35 40 45 Glu Glu His Val Leu Cys Ile Pro Thr Asp Val Ser Asp Ser Glu Ala 50 55 60 Val Arg Ala Ala Phe Gln Ala Val Asp Ala Lys Phe Gly Arg Val Asp65 70 75 80 Val Ile Ile Asn Ala Ala Gly Ile Asn Ala Pro Thr Arg Glu Ala Asn 85 90 95 Gln Lys Met Val Asp Ala Asn Val Ala Ala Leu Asp Ala Met Lys Ser 100 105 110 Gly Arg Ala Pro Thr Phe Asp Phe Leu Ala Asp Thr Ser Asp Gln Asp 115 120 125 Phe Arg Arg Val Met Glu Val Asn Leu Phe Ser Gln Phe Tyr Cys Ile 130 135 140 Arg Glu Gly Val Pro Leu Met Arg Arg Ala Gly Gly Gly Ser Ile Val145 150 155 160 Asn Ile Ser Ser Val Ala Ala Leu Leu Gly Val Ala Met Pro Leu Tyr 165 170 175 Tyr Pro Ala Ser Lys Ala Ala Val Leu Gly Leu Thr Arg Ala Ala Ala 180 185 190 Ala Glu Leu Ala Pro Tyr Asn Ile Arg Val Asn Ala Ile Ala Pro Gly 195 200 205 Ser Val Asp Thr Pro Leu Met His Glu Gln Pro Pro Glu Val Val Gln 210 215 220 Phe Leu Val Ser Met Gln Pro Ile Lys Arg Leu Ala Gln Pro Glu Glu225 230 235 240 Leu Ala Gln Ser Ile Leu Phe Leu Ala Gly Glu His Ser Ser Phe Ile 245 250 255 Thr Gly Gln Thr Leu Ser Pro Asn Gly Gly Met His Met 260 265 115771DNAPseudomonas putida KT2440 115atgacccttg aaggcaaaac tgcactcgtc accggttcca ccagcggcat tggcctgggc 60atcgcccagg tattggcccg ggctggcgcc aacatcgtgc tcaacggctt tggtgacccg 120ggccccgcca tggcggaaat tgcccggcac ggggtgaagg ttgtgcacca 
cccggccgac 180ctgtcggatg tggtccagat cgaggctttg ttcaacctgg ccgaacgcga gttcggcggc 240gtcgacatcc tggtcaacaa cgccggtatc cagcatgtgg caccggttga gcagttcccg 300ccagaaagct gggacaagat catcgccctg aacctgtcgg ccgtattcca tggcacgcgc 360ctggcgctgc cgggcatgcg cacgcgcaac tgggggcgca tcatcaatat cgcttcggtg 420catggcctgg tcggctcgat tggcaaggca gcctacgtgg cagccaagca tggcgtgatc 480ggcctgacca aggtggtcgg cctggaaacc gccaccagtc atgtcacctg caatgccata 540tgcccgggct gggtgctgac accgctggtg caaaagcaga tcgacgatcg tgcggccaag 600ggtggcgatc ggctgcaagc gcagcacgat ctgctggcag aaaagcaacc gtcgctggct 660ttcgtcaccc ccgaacacct cggtgagctg gtactctttc tgtgcagcga ggccggtagc 720caggttcgcg gcgccgcctg gaacgtcgat ggtggctggt tggcccagtg a 771116256PRTPseudomonas putida KT2440 116Met Thr Leu Glu Gly Lys Thr Ala Leu Val Thr Gly Ser Thr Ser Gly1 5 10 15 Ile Gly Leu Gly Ile Ala Gln Val Leu Ala Arg Ala Gly Ala Asn Ile 20 25 30 Val Leu Asn Gly Phe Gly Asp Pro Gly Pro Ala Met Ala Glu Ile Ala 35 40 45 Arg His Gly Val Lys Val Val His His Pro Ala Asp Leu Ser Asp Val 50 55 60 Val Gln Ile Glu Ala Leu Phe Asn Leu Ala Glu Arg Glu Phe Gly Gly65 70 75 80 Val Asp Ile Leu Val Asn Asn Ala Gly Ile Gln His Val Ala Pro Val 85 90 95 Glu Gln Phe Pro Pro Glu Ser Trp Asp Lys Ile Ile Ala Leu Asn Leu 100 105 110 Ser Ala Val Phe His Gly Thr Arg Leu Ala Leu Pro Gly Met Arg Thr 115 120 125 Arg Asn Trp Gly Arg Ile Ile Asn Ile Ala Ser Val His Gly Leu Val 130 135 140 Gly Ser Ile Gly Lys Ala Ala Tyr Val Ala Ala Lys His Gly Val Ile145 150 155 160 Gly Leu Thr Lys Val Val Gly Leu Glu Thr Ala Thr Ser His Val Thr 165 170 175 Cys Asn Ala Ile Cys Pro Gly Trp Val Leu Thr Pro Leu Val Gln Lys 180 185 190 Gln Ile Asp Asp Arg Ala Ala Lys Gly Gly Asp Arg Leu Gln Ala Gln 195 200 205 His Asp Leu Leu Ala Glu Lys Gln Pro Ser Leu Ala Phe Val Thr Pro 210 215 220 Glu His Leu Gly Glu Leu Val Leu Phe Leu Cys Ser Glu Ala Gly Ser225 230 235 240 Gln Val Arg Gly Ala Ala Trp Asn Val Asp Gly Gly Trp Leu Ala Gln 245 250 255 117750DNAPseudomonas putida KT2440 117atgtccaagc aacttacact 
cgaaggcaaa gtggccctgg ttcagggcgg ttcccgaggc 60attggcgcag ctatcgtaag gcgcctggcc cgcgaaggcg cgcaagtggc cttcacctat 120gtcagctctg ccggcccggc tgaagaactg gctcgggaaa ttaccgagaa cggcggcaaa 180gccttggccc tgcgggctga cagcgctgat gccgcggccg tgcagctggc ggttgatgac 240accgagaaag ccttgggccg gctggatatc ctggtcaaca acgccggtgt gctggcagtg 300gccccagtga cagagttcga cctggccgac ttcgatcata tgctggccgt gaacgtacgc 360agcgtgttcg tcgccagcca ggccgcggca cgctatatgg gccagggcgg tcgtatcatc 420aacattggca gcaccaacgc cgagcgcatg ccgtttgccg gtggtgcacc gtacgccatg 480agcaagtcgg cactggttgg tctgacccgc ggcatggcac gcgacctcgg gccgcagggc 540attaccgtga acaacgtgca gcccggcccg gtggacaccg acatgaaccc ggccagtggc 600gagtttgccg agagcctgat tccgctgatg gccattgggc gatatggcga gccggaggag 660attgccagct tcgtggctta cctggcaggg cctgaagccg ggtatatcac cggggccagc 720ctgactgtag atggtgggtt tgcagcctga 750118249PRTPseudomonas putida KT2440 118Met Ser Lys Gln Leu Thr Leu Glu Gly Lys Val Ala Leu Val Gln Gly1 5 10 15 Gly Ser Arg Gly Ile Gly Ala Ala Ile Val Arg Arg Leu Ala Arg Glu
20 25 30 Gly Ala Gln Val Ala Phe Thr Tyr Val Ser Ser Ala Gly Pro Ala Glu 35 40 45 Glu Leu Ala Arg Glu Ile Thr Glu Asn Gly Gly Lys Ala Leu Ala Leu 50 55 60 Arg Ala Asp Ser Ala Asp Ala Ala Ala Val Gln Leu Ala Val Asp Asp65 70 75 80 Thr Glu Lys Ala Leu Gly Arg Leu Asp Ile Leu Val Asn Asn Ala Gly 85 90 95 Val Leu Ala Val Ala Pro Val Thr Glu Phe Asp Leu Ala Asp Phe Asp 100 105 110 His Met Leu Ala Val Asn Val Arg Ser Val Phe Val Ala Ser Gln Ala 115 120 125 Ala Ala Arg Tyr Met Gly Gln Gly Gly Arg Ile Ile Asn Ile Gly Ser 130 135 140 Thr Asn Ala Glu Arg Met Pro Phe Ala Gly Gly Ala Pro Tyr Ala Met145 150 155 160 Ser Lys Ser Ala Leu Val Gly Leu Thr Arg Gly Met Ala Arg Asp Leu 165 170 175 Gly Pro Gln Gly Ile Thr Val Asn Asn Val Gln Pro Gly Pro Val Asp 180 185 190 Thr Asp Met Asn Pro Ala Ser Gly Glu Phe Ala Glu Ser Leu Ile Pro 195 200 205 Leu Met Ala Ile Gly Arg Tyr Gly Glu Pro Glu Glu Ile Ala Ser Phe 210 215 220 Val Ala Tyr Leu Ala Gly Pro Glu Ala Gly Tyr Ile Thr Gly Ala Ser225 230 235 240 Leu Thr Val Asp Gly Gly Phe Ala Ala 245 119858DNAPseudomonas putida KT2440 119atgagcgact accctacccc tccattccca tcccaaccgc aaagcgttcc cggttcccag 60cgcaagatgg atccgtatcc ggactgcggt gagcagagct acaccggcaa caatcgcctc 120gcaggcaaga tcgccttgat aaccggtgct gacagcggca tcgggcgtgc ggtggcgatt 180gcctatgccc gagaaggcgc tgacgttgcc attgcctatc tgaatgaaca cgacgatgcg 240caggaaaccg cgcgctgggt caaagcggct ggccgccagt gcctgctgct gcccggcgac 300ctggcacaga aacagcactg ccacgacatc gtcgacaaga ccgtggcgca gtttggtcgc 360atcgatatcc tggtcaacaa cgccgcgttc cagatggccc atgaaagcct ggacgacatt 420gatgacgatg aatgggtgaa gaccttcgat accaacatca ccgccatttt ccgcatttgc 480cagcgcgctt tgccctcgat gccaaagggc ggttcgatca tcaacaccag ttcggtcaac 540tctgacgacc cgtcacccag cctgttggcc tatgccgcga ccaaaggggc tattgccaat 600ttcactgcag gccttgcgca actgctgggc aagcagggca ttcgcgtcaa cagcgtcgca 660cccggcccga tctggacccc gctgatcccg gccaccatgc ctgatgaggc ggtgagaaac 720ttcggttccg gttacccgat gggacggccg ggtcaacctg tggaggtggc gccaatctat 780gtcttgctgg ggtccgatga agccagctac 
atctcgggtt cgcgttacgc cgtgacggga 840ggcaaaccta ttctgtga 858120285PRTPseudomonas putida KT2440 120Met Ser Asp Tyr Pro Thr Pro Pro Phe Pro Ser Gln Pro Gln Ser Val1 5 10 15 Pro Gly Ser Gln Arg Lys Met Asp Pro Tyr Pro Asp Cys Gly Glu Gln 20 25 30 Ser Tyr Thr Gly Asn Asn Arg Leu Ala Gly Lys Ile Ala Leu Ile Thr 35 40 45 Gly Ala Asp Ser Gly Ile Gly Arg Ala Val Ala Ile Ala Tyr Ala Arg 50 55 60 Glu Gly Ala Asp Val Ala Ile Ala Tyr Leu Asn Glu His Asp Asp Ala65 70 75 80 Gln Glu Thr Ala Arg Trp Val Lys Ala Ala Gly Arg Gln Cys Leu Leu 85 90 95 Leu Pro Gly Asp Leu Ala Gln Lys Gln His Cys His Asp Ile Val Asp 100 105 110 Lys Thr Val Ala Gln Phe Gly Arg Ile Asp Ile Leu Val Asn Asn Ala 115 120 125 Ala Phe Gln Met Ala His Glu Ser Leu Asp Asp Ile Asp Asp Asp Glu 130 135 140 Trp Val Lys Thr Phe Asp Thr Asn Ile Thr Ala Ile Phe Arg Ile Cys145 150 155 160 Gln Arg Ala Leu Pro Ser Met Pro Lys Gly Gly Ser Ile Ile Asn Thr 165 170 175 Ser Ser Val Asn Ser Asp Asp Pro Ser Pro Ser Leu Leu Ala Tyr Ala 180 185 190 Ala Thr Lys Gly Ala Ile Ala Asn Phe Thr Ala Gly Leu Ala Gln Leu 195 200 205 Leu Gly Lys Gln Gly Ile Arg Val Asn Ser Val Ala Pro Gly Pro Ile 210 215 220 Trp Thr Pro Leu Ile Pro Ala Thr Met Pro Asp Glu Ala Val Arg Asn225 230 235 240 Phe Gly Ser Gly Tyr Pro Met Gly Arg Pro Gly Gln Pro Val Glu Val 245 250 255 Ala Pro Ile Tyr Val Leu Leu Gly Ser Asp Glu Ala Ser Tyr Ile Ser 260 265 270 Gly Ser Arg Tyr Ala Val Thr Gly Gly Lys Pro Ile Leu 275 280 285 121774DNAPseudomonas putida KT2440 121atgatcgaaa tcagcggcag caccccgggc cacaatggcc gggtagcctt ggtcacgggc 60gccgcccgcg gcatcggtct gggcattgcc gcatggctga tctgcgaagg ctggcaagtg 120gtgctgagtg atctggaccg ccagcgtggt accaaagtgg ccaaggcgtt gggcgacaac 180gcctggttca tcaccatgga cgttgccgac gaggcccagg tcagtgccgg cgtgtccgaa 240gtgctcgggc agttcggccg gctggacgcg ctggtgtgca atgcggccat tgccaacccg 300cacaaccaga cgctggaaag cctgagcctg gcacaatgga accgggtgct gggggtcaac 360ctcagcggcc ccatgctgct ggccaagcat tgtgcgccgt acctgcgtgc gcacaatggg 420gcgatcgtca acctgacctc tacccgtgct 
cggcagtccg aacccgacac cgaggcttac 480gcggcaagca agggcggcct ggtggctttg acccatgccc tggccatgag cctgggcccg 540gagattcgcg tcaatgcggt gagcccgggc tggatcgatg cccgtgatcc gtcgcagcgc 600cgtgccgagc cgttgagcga agctgaccat gcccagcatc caacgggcag ggtagggacc 660gtggaagatg tcgcggccat ggttgcctgg ttgctgtcac gccaggcggc atttgtcacc 720ggccaggagt ttgtggtcga tggcggcatg acccgcaaga tgatctatac ctga 774122257PRTPseudomonas putida KT2440 122Met Ile Glu Ile Ser Gly Ser Thr Pro Gly His Asn Gly Arg Val Ala1 5 10 15 Leu Val Thr Gly Ala Ala Arg Gly Ile Gly Leu Gly Ile Ala Ala Trp 20 25 30 Leu Ile Cys Glu Gly Trp Gln Val Val Leu Ser Asp Leu Asp Arg Gln 35 40 45 Arg Gly Thr Lys Val Ala Lys Ala Leu Gly Asp Asn Ala Trp Phe Ile 50 55 60 Thr Met Asp Val Ala Asp Glu Ala Gln Val Ser Ala Gly Val Ser Glu65 70 75 80 Val Leu Gly Gln Phe Gly Arg Leu Asp Ala Leu Val Cys Asn Ala Ala 85 90 95 Ile Ala Asn Pro His Asn Gln Thr Leu Glu Ser Leu Ser Leu Ala Gln 100 105 110 Trp Asn Arg Val Leu Gly Val Asn Leu Ser Gly Pro Met Leu Leu Ala 115 120 125 Lys His Cys Ala Pro Tyr Leu Arg Ala His Asn Gly Ala Ile Val Asn 130 135 140 Leu Thr Ser Thr Arg Ala Arg Gln Ser Glu Pro Asp Thr Glu Ala Tyr145 150 155 160 Ala Ala Ser Lys Gly Gly Leu Val Ala Leu Thr His Ala Leu Ala Met 165 170 175 Ser Leu Gly Pro Glu Ile Arg Val Asn Ala Val Ser Pro Gly Trp Ile 180 185 190 Asp Ala Arg Asp Pro Ser Gln Arg Arg Ala Glu Pro Leu Ser Glu Ala 195 200 205 Asp His Ala Gln His Pro Thr Gly Arg Val Gly Thr Val Glu Asp Val 210 215 220 Ala Ala Met Val Ala Trp Leu Leu Ser Arg Gln Ala Ala Phe Val Thr225 230 235 240 Gly Gln Glu Phe Val Val Asp Gly Gly Met Thr Arg Lys Met Ile Tyr 245 250 255 Thr123741DNAPseudomonas putida KT2440 123atgagcctgc aaggtaaagt tgcactggtt accggcgcca gccgtggcat tggccaggcc 60atcgccctcg agctgggccg ccagggcgcg accgtgatcg gtaccgccac gtcggcgtcc 120ggtgccgagc gcatcgctgc caccctgaaa gaacacggca ttaccggcac tggcatggag 180ctgaacgtga ccagcgccga atcggttgaa gccgtactgg ccgccattgg cgagcagttc 240ggcgcgccgg ccatcttggt caacaatgcc ggtatcaccc gcgacaacct 
catgctgcgc 300atgaaagacg acgagtggtt tgatgtcatc gacaccaacc tgaacagcct ctaccgtctg 360tccaagggcg tgctgcgtgg catgaccaag gcgcgttggg gtcgtatcat cagcatcggc 420tcggtcgttg gtgccatggg taacgcaggt caggccaact acgcggctgc caaggccggt 480ctggaaggtt tcagccgcgc cctggcgcgt gaagtgggtt cgcgtggtat caccgtcaac 540tcggtgaccc caggcttcat cgataccgac atgacccgcg agctgccaga agctcagcgc 600gaagccctgc agacccagat tccgctgggc cgcctgggcc aggctgacga aattgccaag 660gtggtttcgt tcctggcatc cgacggcgcc gcctacgtga ccggcgctac cgtgccggtc 720aacggcggga tgtacatgta a 741124246PRTPseudomonas putida KT2440 124Met Ser Leu Gln Gly Lys Val Ala Leu Val Thr Gly Ala Ser Arg Gly1 5 10 15 Ile Gly Gln Ala Ile Ala Leu Glu Leu Gly Arg Gln Gly Ala Thr Val 20 25 30 Ile Gly Thr Ala Thr Ser Ala Ser Gly Ala Glu Arg Ile Ala Ala Thr 35 40 45 Leu Lys Glu His Gly Ile Thr Gly Thr Gly Met Glu Leu Asn Val Thr 50 55 60 Ser Ala Glu Ser Val Glu Ala Val Leu Ala Ala Ile Gly Glu Gln Phe65 70 75 80 Gly Ala Pro Ala Ile Leu Val Asn Asn Ala Gly Ile Thr Arg Asp Asn 85 90 95 Leu Met Leu Arg Met Lys Asp Asp Glu Trp Phe Asp Val Ile Asp Thr 100 105 110 Asn Leu Asn Ser Leu Tyr Arg Leu Ser Lys Gly Val Leu Arg Gly Met 115 120 125 Thr Lys Ala Arg Trp Gly Arg Ile Ile Ser Ile Gly Ser Val Val Gly 130 135 140 Ala Met Gly Asn Ala Gly Gln Ala Asn Tyr Ala Ala Ala Lys Ala Gly145 150 155 160 Leu Glu Gly Phe Ser Arg Ala Leu Ala Arg Glu Val Gly Ser Arg Gly 165 170 175 Ile Thr Val Asn Ser Val Thr Pro Gly Phe Ile Asp Thr Asp Met Thr 180 185 190 Arg Glu Leu Pro Glu Ala Gln Arg Glu Ala Leu Gln Thr Gln Ile Pro 195 200 205 Leu Gly Arg Leu Gly Gln Ala Asp Glu Ile Ala Lys Val Val Ser Phe 210 215 220 Leu Ala Ser Asp Gly Ala Ala Tyr Val Thr Gly Ala Thr Val Pro Val225 230 235 240 Asn Gly Gly Met Tyr Met 245 125738DNAPseudomonas putida KT2440 125atgactcaga aaatagctgt cgtgaccggc ggcagtcgcg gcattggcaa gtccatcgtg 60ctggccctgg ccggcgcggg ttatcaggtt gccttcagtt atgtccgtga cgaggcgtca 120gccgctgcct tgcaggcgca ggtcgaaggg ctcggccggg actgcctggc cgtgcagtgt 180gatgtcaagg aagcgccgag cattcaggcg 
ttttttgaac gggtcgagca acgtttcgag 240cgtatcgact tgttggtcaa caacgccggt attacccgtg acggtttgct cgccacgcaa 300tcgttgaacg acatcaccga ggtcatccag accaacctgg tcggcacgtt gttgtgctgt 360cagcaggtgc tgccctgcat gatgcgccaa cgcagcgggt gcatcgtcaa cctcagttcg 420gtggccgcgc aaaagcccgg caagggccag agcaactacg ccgccgccaa aggcggtgta 480gaagcattga cacgcgcact ggcggtggag ttggcgccgc gcaacatccg ggtcaacgcg 540gtggcgcccg gcatcgtcag caccgacatg agccaagccc tggtcggcgc ccatgagcag 600gaaatccagt cgcggctgtt gatcaaacgg ttcgcccggc ctgaagaaat tgccgacgcg 660gtgctgtatc tggccgagcg cggcctgtac atcacgggcg aagtcctgtc cgtcaacggc 720ggattgaaaa tgccatga 738126245PRTPseudomonas putida KT2440 126Met Thr Gln Lys Ile Ala Val Val Thr Gly Gly Ser Arg Gly Ile Gly1 5 10 15 Lys Ser Ile Val Leu Ala Leu Ala Gly Ala Gly Tyr Gln Val Ala Phe 20 25 30 Ser Tyr Val Arg Asp Glu Ala Ser Ala Ala Ala Leu Gln Ala Gln Val 35 40 45 Glu Gly Leu Gly Arg Asp Cys Leu Ala Val Gln Cys Asp Val Lys Glu 50 55 60 Ala Pro Ser Ile Gln Ala Phe Phe Glu Arg Val Glu Gln Arg Phe Glu65 70 75 80 Arg Ile Asp Leu Leu Val Asn Asn Ala Gly Ile Thr Arg Asp Gly Leu 85 90 95 Leu Ala Thr Gln Ser Leu Asn Asp Ile Thr Glu Val Ile Gln Thr Asn 100 105 110 Leu Val Gly Thr Leu Leu Cys Cys Gln Gln Val Leu Pro Cys Met Met 115 120 125 Arg Gln Arg Ser Gly Cys Ile Val Asn Leu Ser Ser Val Ala Ala Gln 130 135 140 Lys Pro Gly Lys Gly Gln Ser Asn Tyr Ala Ala Ala Lys Gly Gly Val145 150 155 160 Glu Ala Leu Thr Arg Ala Leu Ala Val Glu Leu Ala Pro Arg Asn Ile 165 170 175 Arg Val Asn Ala Val Ala Pro Gly Ile Val Ser Thr Asp Met Ser Gln 180 185 190 Ala Leu Val Gly Ala His Glu Gln Glu Ile Gln Ser Arg Leu Leu Ile 195 200 205 Lys Arg Phe Ala Arg Pro Glu Glu Ile Ala Asp Ala Val Leu Tyr Leu 210 215 220 Ala Glu Arg Gly Leu Tyr Ile Thr Gly Glu Val Leu Ser Val Asn Gly225 230 235 240 Gly Leu Lys Met Pro 245 127768DNAPseudomonas putida KT2440 127atgtccaaga cccacctgtt cgacctcgac ggcaagattg cctttgtttc cggcgccagc 60cgtggcatcg gcgaggccat cgcccacttg ctcgcgcagc aaggggccca tgtgatcgtt 120tccagccgca agcttgacgg 
gtgccagcag gtggccgacg ccatcattgc cgccggcggc 180aaggccacgg ctgtggcctg ccacattggt gagctggaac agattcagca ggtgttcgcc 240ggcattcgcg aacagttcgg gcgactggac gtgctggtca acaatgcagc caccaacccg 300caattctgca atgtgctgga caccgaccca ggggcgttcc agaagaccgt ggacgtgaac 360atccgtggtt acttcttcat gtcggtggag gctggcaagc tgatgcgcga gaacggcggc 420ggcagcatca tcaacgtggc gtcgatcaac ggtgtttcac ccgggctgtt ccaaggcatc 480tactcggtga ccaaggcggc ggtcatcaac atgaccaagg tgttcgccaa agagtgtgca 540cccttcggta ttcgctgcaa cgcgctactg ccggggctga ccgataccaa gttcgcttcg 600gcattggtga agaacgaagc catcctcaac gccgccttgc agcagatccc cctcaaacgc 660gtggccgacc ccaaggaaat ggcgggtgcg gtgctgtacc tggccagcga tgcctccagc 720tacaccaccg gcaccacgct caatgtcgac ggtggcttcc tgtcctga 768128255PRTPseudomonas putida KT2440 128Met Ser Lys Thr His Leu Phe Asp Leu Asp Gly Lys Ile Ala Phe Val1 5 10 15 Ser Gly Ala Ser Arg Gly Ile Gly Glu Ala Ile Ala His Leu Leu Ala 20 25 30 Gln Gln Gly Ala His Val Ile Val Ser Ser Arg Lys Leu Asp Gly Cys 35 40 45 Gln Gln Val Ala Asp Ala Ile Ile Ala Ala Gly Gly Lys Ala Thr Ala 50 55 60 Val Ala Cys His Ile Gly Glu Leu Glu Gln Ile Gln Gln Val Phe Ala65 70 75 80 Gly Ile Arg Glu Gln Phe Gly Arg Leu Asp Val Leu Val Asn Asn Ala 85 90 95 Ala Thr Asn Pro Gln Phe Cys Asn Val Leu Asp Thr Asp Pro Gly Ala 100 105 110 Phe Gln Lys Thr Val Asp Val Asn Ile Arg Gly Tyr Phe Phe Met Ser 115 120 125 Val Glu Ala Gly Lys Leu Met Arg Glu Asn Gly Gly Gly Ser Ile Ile 130 135 140 Asn Val Ala Ser Ile Asn Gly Val Ser Pro Gly Leu Phe Gln Gly Ile145 150 155 160 Tyr Ser Val Thr Lys Ala Ala Val Ile Asn Met Thr Lys Val Phe Ala 165 170 175 Lys Glu Cys Ala Pro Phe Gly Ile Arg Cys Asn Ala Leu Leu Pro Gly 180 185 190 Leu Thr Asp Thr Lys Phe Ala Ser Ala Leu Val Lys Asn Glu Ala Ile 195 200 205 Leu Asn Ala Ala Leu Gln Gln Ile Pro Leu Lys Arg Val Ala Asp Pro 210 215 220 Lys Glu Met Ala Gly Ala Val Leu Tyr Leu Ala Ser Asp Ala Ser Ser225 230 235 240 Tyr Thr Thr Gly Thr Thr Leu Asn Val Asp Gly Gly Phe Leu Ser 245 250 255 129762DNAPseudomonas fluorescens 
Pf-5 129atgagcatga cgttttccgg ccaggtggcc ctagtgaccg gcgcagccaa tggtatcggc 60cgcgccaccg cccaggcatt tgccgcacaa ggcttgaagg tggtggtggc ggacctggac 120acggcggggg gcgagggcac cgtggcgctg atccgcgagg ccggtggcga ggcattgttc 180gtgccgtgca acgttaccct ggaggcggat gtgcaaagcc tcatggcccg caccatcgaa 240gcctatgggc gcctggatta cgccttcaac aatgccggta tcgagatcga aaagggccgc 300cttgcggagg gctccatgga tgagttcgac gccatcatgg gggtcaacgt caaaggggtc 360tggctgtgca tgaagtacca gttgccgctg ctgctggccc agggcggtgg ggcgatcgtc 420aacaccgcct cggtggcggg cctgggcgcg gcgccgaaga tgagcatcta tgcggcctcc 480aagcatgcgg tgatcggcct gaccaagtcg gcggccatcg aatatgcgaa gaagaaaatc 540cgcgtgaacg cggtatgccc ggcggtgatc gacaccgaca tgttccgccg tgcctacgag 600gcggacccga agaaggccga gttcgccgcg gccatgcacc cggtggggcg catcggcaag 660gtcgaggaga tcgccagtgc ggtgctctac ctgtgcagcg atggcgcggc ctttaccacc 720ggccatgcac tggcggtcga cggcggggcc accgcgatct ga
762130253PRTPseudomonas fluorescens Pf-5 130Met Ser Met Thr Phe Ser Gly Gln Val Ala Leu Val Thr Gly Ala Ala1 5 10 15 Asn Gly Ile Gly Arg Ala Thr Ala Gln Ala Phe Ala Ala Gln Gly Leu 20 25 30 Lys Val Val Val Ala Asp Leu Asp Thr Ala Gly Gly Glu Gly Thr Val 35 40 45 Ala Leu Ile Arg Glu Ala Gly Gly Glu Ala Leu Phe Val Pro Cys Asn 50 55 60 Val Thr Leu Glu Ala Asp Val Gln Ser Leu Met Ala Arg Thr Ile Glu65 70 75 80 Ala Tyr Gly Arg Leu Asp Tyr Ala Phe Asn Asn Ala Gly Ile Glu Ile 85 90 95 Glu Lys Gly Arg Leu Ala Glu Gly Ser Met Asp Glu Phe Asp Ala Ile 100 105 110 Met Gly Val Asn Val Lys Gly Val Trp Leu Cys Met Lys Tyr Gln Leu 115 120 125 Pro Leu Leu Leu Ala Gln Gly Gly Gly Ala Ile Val Asn Thr Ala Ser 130 135 140 Val Ala Gly Leu Gly Ala Ala Pro Lys Met Ser Ile Tyr Ala Ala Ser145 150 155 160 Lys His Ala Val Ile Gly Leu Thr Lys Ser Ala Ala Ile Glu Tyr Ala 165 170 175 Lys Lys Lys Ile Arg Val Asn Ala Val Cys Pro Ala Val Ile Asp Thr 180 185 190 Asp Met Phe Arg Arg Ala Tyr Glu Ala Asp Pro Lys Lys Ala Glu Phe 195 200 205 Ala Ala Ala Met His Pro Val Gly Arg Ile Gly Lys Val Glu Glu Ile 210 215 220 Ala Ser Ala Val Leu Tyr Leu Cys Ser Asp Gly Ala Ala Phe Thr Thr225 230 235 240 Gly His Ala Leu Ala Val Asp Gly Gly Ala Thr Ala Ile 245 250 131735DNAKlebsiella pneumoniae subsp. 
pneumoniae MGH78578 131atgaaacttg ccagtaaaac cgccattgtc accggcgccg cacgcggtat cggctttggc 60attgcccagg tgcttgcgcg ggaaggcgcg cgagtgatta tcgccgatcg tgatgcacac 120ggcgaagccg ccgccgcttc cctgcgcgaa tcgggcgcac aggcgctgtt tatcagctgc 180aatatcgctg aaaaaacgca ggtcgaagcc ctgtattccc aggccgaaga ggcgtttggc 240ccggtagaca ttctggtgaa taacgccgga atcaaccgcg acgccatgct gcacaaatta 300acggaagcgg actgggacac ggttatcgac gttaacctga aaggcacttt cctctgtatg 360cagcaggccg ctatccgcat gcgcgagcgc ggtgcgggcc gcattatcaa tatcgcttcc 420gccagttggc ttggcaacgt cgggcaaacc aactattcgg cgtcaaaagc cggcgtggtg 480ggaatgacca aaaccgcctg ccgcgaactg gcgaaaaaag gtgtcacggt gaatgccatc 540tgcccgggct ttatcgatac cgacatgacg cgcggcgtac cggaaaacgt ctggcaaatc 600atggtcagca aaattcccgc gggttacgcc ggcgaggcga aagacgtcgg cgagtgtgtg 660gcgtttctgg cgtccgatgg cgcgcgctat atcaatggtg aagtgattaa cgtcggcggc 720ggcatggtgc tgtaa 735132253PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 132Met Ser Met Thr Phe Ser Gly Gln Val Ala Leu Val Thr Gly Ala Ala1 5 10 15 Asn Gly Ile Gly Arg Ala Thr Ala Gln Ala Phe Ala Ala Gln Gly Leu 20 25 30 Lys Val Val Val Ala Asp Leu Asp Thr Ala Gly Gly Glu Gly Thr Val 35 40 45 Ala Leu Ile Arg Glu Ala Gly Gly Glu Ala Leu Phe Val Pro Cys Asn 50 55 60 Val Thr Leu Glu Ala Asp Val Gln Ser Leu Met Ala Arg Thr Ile Glu65 70 75 80 Ala Tyr Gly Arg Leu Asp Tyr Ala Phe Asn Asn Ala Gly Ile Glu Ile 85 90 95 Glu Lys Gly Arg Leu Ala Glu Gly Ser Met Asp Glu Phe Asp Ala Ile 100 105 110 Met Gly Val Asn Val Lys Gly Val Trp Leu Cys Met Lys Tyr Gln Leu 115 120 125 Pro Leu Leu Leu Ala Gln Gly Gly Gly Ala Ile Val Asn Thr Ala Ser 130 135 140 Val Ala Gly Leu Gly Ala Ala Pro Lys Met Ser Ile Tyr Ala Ala Ser145 150 155 160 Lys His Ala Val Ile Gly Leu Thr Lys Ser Ala Ala Ile Glu Tyr Ala 165 170 175 Lys Lys Lys Ile Arg Val Asn Ala Val Cys Pro Ala Val Ile Asp Thr 180 185 190 Asp Met Phe Arg Arg Ala Tyr Glu Ala Asp Pro Lys Lys Ala Glu Phe 195 200 205 Ala Ala Ala Met His Pro Val Gly Arg Ile Gly Lys Val Glu Glu Ile 210 215 220 Ala Ser Ala Val 
Leu Tyr Leu Cys Ser Asp Gly Ala Ala Phe Thr Thr225 230 235 240 Gly His Ala Leu Ala Val Asp Gly Gly Ala Thr Ala Ile 245 250 133750DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578 133atgttattga aagataaagt cgccattatt actggcgcgg cctccgcacg cggtttgggc 60ttcgcgactg cgaaattatt cgccgaaaac ggcgcgaaag tggtcattat cgacctcaat 120ggcgaagcca gtaaaaccgc cgcggcggca ttaggcgaag accatctcgg cctggcggcc 180aacgtcgctg atgaagtgca ggtgcaggcg gccatcgaac agatcctggc gaaatacggt 240cgggttgatg tactggtcaa taacgccggg attacccagc cgctgaagct gatggatatc 300aagcgcgcca actatgacgc ggtgcttgat gttagcctgc gcggcacgct gctgatgtcg 360caggcggtta tccccaccat gcgggcgcaa aaatccggca gcatcgtctg catctcgtcc 420gtctccgccc agcgcggcgg cggtattttc ggcggaccgc actacagcgc ggcaaaagcc 480ggggtgctgg gtctggcgcg ggcgatggcg cgcgagcttg gcccggataa cgtccgcgtt 540aactgcatca ccccggggct gattcagacc gacattaccg ccggcaagct gactgatgac 600atgacggcca acattcttgc cggcattccg atgaaccgcc ttggcgacgc gatagacatc 660gcgcgcgccg cgctgttcct cggcagcgat ctttcctcct actccaccgg catcaccctg 720gacgttaacg gcggcatgtt aattcactaa 750134249PRTKlebsiella pneumoniae subsp. 
pneumoniae MGH78578 134Met Leu Leu Lys Asp Lys Val Ala Ile Ile Thr Gly Ala Ala Ser Ala1 5 10 15 Arg Gly Leu Gly Phe Ala Thr Ala Lys Leu Phe Ala Glu Asn Gly Ala 20 25 30 Lys Val Val Ile Ile Asp Leu Asn Gly Glu Ala Ser Lys Thr Ala Ala 35 40 45 Ala Ala Leu Gly Glu Asp His Leu Gly Leu Ala Ala Asn Val Ala Asp 50 55 60 Glu Val Gln Val Gln Ala Ala Ile Glu Gln Ile Leu Ala Lys Tyr Gly65 70 75 80 Arg Val Asp Val Leu Val Asn Asn Ala Gly Ile Thr Gln Pro Leu Lys 85 90 95 Leu Met Asp Ile Lys Arg Ala Asn Tyr Asp Ala Val Leu Asp Val Ser 100 105 110 Leu Arg Gly Thr Leu Leu Met Ser Gln Ala Val Ile Pro Thr Met Arg 115 120 125 Ala Gln Lys Ser Gly Ser Ile Val Cys Ile Ser Ser Val Ser Ala Gln 130 135 140 Arg Gly Gly Gly Ile Phe Gly Gly Pro His Tyr Ser Ala Ala Lys Ala145 150 155 160 Gly Val Leu Gly Leu Ala Arg Ala Met Ala Arg Glu Leu Gly Pro Asp 165 170 175 Asn Val Arg Val Asn Cys Ile Thr Pro Gly Leu Ile Gln Thr Asp Ile 180 185 190 Thr Ala Gly Lys Leu Thr Asp Asp Met Thr Ala Asn Ile Leu Ala Gly 195 200 205 Ile Pro Met Asn Arg Leu Gly Asp Ala Ile Asp Ile Ala Arg Ala Ala 210 215 220 Leu Phe Leu Gly Ser Asp Leu Ser Ser Tyr Ser Thr Gly Ile Thr Leu225 230 235 240 Asp Val Asn Gly Gly Met Leu Ile His 245 135750DNAKlebsiella pneumoniae subsp. 
pneumoniae MGH78578 135atgttattga aagataaagt cgccattatt actggcgcgg cctccgcacg cggtttgggc 60ttcgcgactg cgaaattatt cgccgaaaac ggcgcgaaag tggtcattat cgacctcaat 120ggcgaagcca gtaaaaccgc cgcggcggca ttaggcgaag accatctcgg cctggcggcc 180aacgtcgctg atgaagtgca ggtgcaggcg gccatcgaac agatcctggc gaaatacggt 240cgggttgatg tactggtcaa taacgccggg attacccagc cgctgaagct gatggatatc 300aagcgcgcca actatgacgc ggtgcttgat gttagcctgc gcggcacgct gctgatgtcg 360caggcggtta tccccaccat gcgggcgcaa aaatccggca gcatcgtctg catctcgtcc 420gtctccgccc agcgcggcgg cggtattttc ggcggaccgc actacagcgc ggcaaaagcc 480ggggtgctgg gtctggcgcg ggcgatggcg cgcgagcttg gcccggataa cgtccgcgtt 540aactgcatca ccccggggct gattcagacc gacattaccg ccggcaagct gactgatgac 600atgacggcca acattcttgc cggcattccg atgaaccgcc ttggcgacgc gatagacatc 660gcgcgcgccg cgctgttcct cggcagcgat ctttcctcct actccaccgg catcaccctg 720gacgttaacg gcggcatgtt aattcactaa 750136249PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 136Met Leu Leu Lys Asp Lys Val Ala Ile Ile Thr Gly Ala Ala Ser Ala1 5 10 15 Arg Gly Leu Gly Phe Ala Thr Ala Lys Leu Phe Ala Glu Asn Gly Ala 20 25 30 Lys Val Val Ile Ile Asp Leu Asn Gly Glu Ala Ser Lys Thr Ala Ala 35 40 45 Ala Ala Leu Gly Glu Asp His Leu Gly Leu Ala Ala Asn Val Ala Asp 50 55 60 Glu Val Gln Val Gln Ala Ala Ile Glu Gln Ile Leu Ala Lys Tyr Gly65 70 75 80 Arg Val Asp Val Leu Val Asn Asn Ala Gly Ile Thr Gln Pro Leu Lys 85 90 95 Leu Met Asp Ile Lys Arg Ala Asn Tyr Asp Ala Val Leu Asp Val Ser 100 105 110 Leu Arg Gly Thr Leu Leu Met Ser Gln Ala Val Ile Pro Thr Met Arg 115 120 125 Ala Gln Lys Ser Gly Ser Ile Val Cys Ile Ser Ser Val Ser Ala Gln 130 135 140 Arg Gly Gly Gly Ile Phe Gly Gly Pro His Tyr Ser Ala Ala Lys Ala145 150 155 160 Gly Val Leu Gly Leu Ala Arg Ala Met Ala Arg Glu Leu Gly Pro Asp 165 170 175 Asn Val Arg Val Asn Cys Ile Thr Pro Gly Leu Ile Gln Thr Asp Ile 180 185 190 Thr Ala Gly Lys Leu Thr Asp Asp Met Thr Ala Asn Ile Leu Ala Gly 195 200 205 Ile Pro Met Asn Arg Leu Gly Asp Ala Ile Asp Ile Ala Arg Ala Ala 210 215 220 
Leu Phe Leu Gly Ser Asp Leu Ser Ser Tyr Ser Thr Gly Ile Thr Leu225 230 235 240 Asp Val Asn Gly Gly Met Leu Ile His 245 137714DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578 137atgacagcgt ttcacaacaa atcagtgctg gttttaggcg ggagtcgggg aattggcgcg 60gcgatcgtca ggcgttttgt cgccgatggc gcgtcggtgg tgtttagcta ttccggttcg 120ccggaagcgg ccgagcggct ggcggcagag accggcagca cggcggtgca ggcggacagc 180gccgatcgcg atgcggtgat aagcctggtc cgcgacagcg gcccgctgga cgtgttagtg 240gtcaatgccg ggatcgcgct tttcggtgac gctctcgagc aggacagcga tgcaatcgat 300cgcctgttcc acatcaatat tcacgccccc taccatgcct ccgtcgaagc ggcgcgccgc 360atgccggaag gcgggcgcat tattgtcatc ggctcagtca atggcgatcg catgccgttg 420ccgggaatgg cggcctatgc gctcagcaaa tcggccctgc aggggctggc gcgcggcctg 480gcgcgggatt ttggcccgcg cggcatcacg gtcaacgtcg tccagcccgg cccaattgat 540accgacgcca acccggagaa cggcccgatg aaagagctga tgcacagctt tatggccatt 600aagcgccatg gccgtccgga agaggtggcg ggaatggtgg cgtggctggc cggtccggag 660gcgtcgtttg tcactggcgc catgcacacc atcgacggag cgtttggcgc ctga 714138237PRTKlebsiella pneumoniae subsp. 
pneumoniae MGH78578 138Met Thr Ala Phe His Asn Lys Ser Val Leu Val Leu Gly Gly Ser Arg1 5 10 15 Gly Ile Gly Ala Ala Ile Val Arg Arg Phe Val Ala Asp Gly Ala Ser 20 25 30 Val Val Phe Ser Tyr Ser Gly Ser Pro Glu Ala Ala Glu Arg Leu Ala 35 40 45 Ala Glu Thr Gly Ser Thr Ala Val Gln Ala Asp Ser Ala Asp Arg Asp 50 55 60 Ala Val Ile Ser Leu Val Arg Asp Ser Gly Pro Leu Asp Val Leu Val65 70 75 80 Val Asn Ala Gly Ile Ala Leu Phe Gly Asp Ala Leu Glu Gln Asp Ser 85 90 95 Asp Ala Ile Asp Arg Leu Phe His Ile Asn Ile His Ala Pro Tyr His 100 105 110 Ala Ser Val Glu Ala Ala Arg Arg Met Pro Glu Gly Gly Arg Ile Ile 115 120 125 Val Ile Gly Ser Val Asn Gly Asp Arg Met Pro Leu Pro Gly Met Ala 130 135 140 Ala Tyr Ala Leu Ser Lys Ser Ala Leu Gln Gly Leu Ala Arg Gly Leu145 150 155 160 Ala Arg Asp Phe Gly Pro Arg Gly Ile Thr Val Asn Val Val Gln Pro 165 170 175 Gly Pro Ile Asp Thr Asp Ala Asn Pro Glu Asn Gly Pro Met Lys Glu 180 185 190 Leu Met His Ser Phe Met Ala Ile Lys Arg His Gly Arg Pro Glu Glu 195 200 205 Val Ala Gly Met Val Ala Trp Leu Ala Gly Pro Glu Ala Ser Phe Val 210 215 220 Thr Gly Ala Met His Thr Ile Asp Gly Ala Phe Gly Ala225 230 235 139750DNAKlebsiella pneumoniae subsp. 
pneumoniae MGH78578 139atgaacggcc tgctaaacgg taaacgtatt gtcgtcaccg gtgcggcgcg cggtctcggg 60taccactttg ccgaagcctg cgccgctcag ggcgcgacgg tggtgatgtg cgacatcctg 120cagggagagc tggcggaaag cgctcatcgc ctgcagcaga agggctatca ggtcgaatct 180cacgccatcg atcttgccag tcaagcatcg atcgagcagg tcttcagcgc catcggcgcg 240caggggtcta tcgatggctt agtcaataac gcagcgatgg ccaccggcgt cggcggaaaa 300aatatgatcg attacgatcc ggatctgtgg gatcgggtaa tgacggtcaa cgttaaaggc 360acctggttgg tgacccgcgc ggcggtaccg ctgctgcgcg aaggggcggc gatcgtcaac 420gtcgcttcgg ataccgcgct gtggggcgcg ccgcggctga tggcctatgt cgccagtaag 480ggcgcggtga ttgcgatgac ccgctccatg gcccgcgagc tgggtgaaaa gcggatccgt 540atcaacgcca tcgcgccggg actgacccgc gttgaggcca cggaatacgt tcccgccgag 600cgtcatcagc tgtatgagaa cggccgcgcg ctcagcggcg cgcagcagcc ggaagatgtc 660accggcagcg tggtctggct gctgagcgat ctttcgcgct ttatcaccgg ccaactgatc 720ccggtcaacg gcggttttgt ctttaactaa 750140249PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 140Met Asn Gly Leu Leu Asn Gly Lys Arg Ile Val Val Thr Gly Ala Ala1 5 10 15 Arg Gly Leu Gly Tyr His Phe Ala Glu Ala Cys Ala Ala Gln Gly Ala 20 25 30 Thr Val Val Met Cys Asp Ile Leu Gln Gly Glu Leu Ala Glu Ser Ala 35 40 45 His Arg Leu Gln Gln Lys Gly Tyr Gln Val Glu Ser His Ala Ile Asp 50 55 60 Leu Ala Ser Gln Ala Ser Ile Glu Gln Val Phe Ser Ala Ile Gly Ala65 70 75 80 Gln Gly Ser Ile Asp Gly Leu Val Asn Asn Ala Ala Met Ala Thr Gly 85 90 95 Val Gly Gly Lys Asn Met Ile Asp Tyr Asp Pro Asp Leu Trp Asp Arg 100 105 110 Val Met Thr Val Asn Val Lys Gly Thr Trp Leu Val Thr Arg Ala Ala 115 120 125 Val Pro Leu Leu Arg Glu Gly Ala Ala Ile Val Asn Val Ala Ser Asp 130 135 140 Thr Ala Leu Trp Gly Ala Pro Arg Leu Met Ala Tyr Val Ala Ser Lys145 150 155 160 Gly Ala Val Ile Ala Met Thr Arg Ser Met Ala Arg Glu Leu Gly Glu 165 170 175 Lys Arg Ile Arg Ile Asn Ala Ile Ala Pro Gly Leu Thr Arg Val Glu 180 185 190 Ala Thr Glu Tyr Val Pro Ala Glu Arg His Gln Leu Tyr Glu Asn Gly 195 200 205 Arg Ala Leu Ser Gly Ala Gln Gln Pro Glu Asp Val Thr Gly Ser Val 210 215 220 
Val Trp Leu Leu Ser Asp Leu Ser Arg Phe Ile Thr Gly Gln Leu Ile225 230 235 240 Pro Val Asn Gly Gly Phe Val Phe Asn 245 141795DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578 141atgaatgcac aaattgaagg gcgcgtcgcg gtagtcaccg gcggttcgtc aggaatcggc 60tttgaaacgc tgcgcctgct gctgggcgaa ggggcgaaag tcgccttttg cggccgcaac 120ccggatcggc ttgccagcgc ccatgcggcg ttgcaaaacg aatatccaga aggtgaggtg 180ttctcctggc gctgtgacgt actgaacgaa gctgaagttg aggcgttcgc cgccgcggtc 240gccgcgcgtt tcggcggcgt cgatatgctg attaataacg ccggccaggg ctatgtcgcc 300cacttcgccg atacgccacg tgaggcctgg ctgcacgaag ccgaactgaa actgttcggc 360gtgattaacc cggtaaaggc ctttcagtcc ctgctagagg cgtcggatat cgcctcgatt 420acctgtgtga actcgctgct ggcgttacag ccggaagagc acatgatcgc cacctctgcc 480gcccgcgccg cgctgctcaa tatgacgctg actctgtcga aagagctggt ggataaaggt 540attcgtgtga attccattct gctggggatg gtggagtccg ggcagtggca gcgccgtttt 600gagagccgaa gcgataagag ccagagttgg cagcagtgga ccgccgatat cgcccgtaag 660cgggggatcc cgatggcgcg tctcggtaag ccgcaggagc cagcgcaagc gctgctattc 720ctcgcttcgc cgctggcctc ctttaccacc ggcgcggcgc tggacgtttc cggcggtttc 780tgtcgccatc tgtaa
795142264PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 142Met Asn Ala Gln Ile Glu Gly Arg Val Ala Val Val Thr Gly Gly Ser1 5 10 15 Ser Gly Ile Gly Phe Glu Thr Leu Arg Leu Leu Leu Gly Glu Gly Ala 20 25 30 Lys Val Ala Phe Cys Gly Arg Asn Pro Asp Arg Leu Ala Ser Ala His 35 40 45 Ala Ala Leu Gln Asn Glu Tyr Pro Glu Gly Glu Val Phe Ser Trp Arg 50 55 60 Cys Asp Val Leu Asn Glu Ala Glu Val Glu Ala Phe Ala Ala Ala Val65 70 75 80 Ala Ala Arg Phe Gly Gly Val Asp Met Leu Ile Asn Asn Ala Gly Gln 85 90 95 Gly Tyr Val Ala His Phe Ala Asp Thr Pro Arg Glu Ala Trp Leu His 100 105 110 Glu Ala Glu Leu Lys Leu Phe Gly Val Ile Asn Pro Val Lys Ala Phe 115 120 125 Gln Ser Leu Leu Glu Ala Ser Asp Ile Ala Ser Ile Thr Cys Val Asn 130 135 140 Ser Leu Leu Ala Leu Gln Pro Glu Glu His Met Ile Ala Thr Ser Ala145 150 155 160 Ala Arg Ala Ala Leu Leu Asn Met Thr Leu Thr Leu Ser Lys Glu Leu 165 170 175 Val Asp Lys Gly Ile Arg Val Asn Ser Ile Leu Leu Gly Met Val Glu 180 185 190 Ser Gly Gln Trp Gln Arg Arg Phe Glu Ser Arg Ser Asp Lys Ser Gln 195 200 205 Ser Trp Gln Gln Trp Thr Ala Asp Ile Ala Arg Lys Arg Gly Ile Pro 210 215 220 Met Ala Arg Leu Gly Lys Pro Gln Glu Pro Ala Gln Ala Leu Leu Phe225 230 235 240 Leu Ala Ser Pro Leu Ala Ser Phe Thr Thr Gly Ala Ala Leu Asp Val 245 250 255 Ser Gly Gly Phe Cys Arg His Leu 260 1431795DNAPseudomonas fluorescens 143cgccaagcaa tcgggctttg gggcagaatt gggtcgcgaa gggcttgagg agtttgccca 60gtccaagatc atcaacgccg cgctataaat taaaggatcc cccatggcga tgattacagg 120cggcgaactg gttgttcgca ccctaataaa ggctggggtc gaacatctgt tcggcctgca 180cggcgcgcat atcgatacga tttttcaagc ctgtctcgat catgatgtgc cgatcatcga 240cacccgccat gaggccgccg cagggcatgc ggccgagggc tatgcccgcg ctggcgccaa 300gctgggcgtg gctggtcacg gcgggcgggg gatttaccaa tgcggtcacg cccattgcca 360acgcttggct ggatcgcaag gccggtgtat tcctcacccg ggatcgggcg cgctgcgtga 420tgatgaaacc aacacgttgc aggcggggat tgatcaggtc gccatggcgg cgcccattac 480caaatgggcg catcgggtga tggcaaccga gcatatccca cggctggtga tgcaggcgat 540ccgcgccgcg ttgagcgcgc cacgcgggcc 
ggtgttgctg gatctgccgt gggatattct 600gatgaaccag attgatgagg atagcgtcat tatccccgat ctggtcttgt ccgcgcatgg 660ggccagaccc gaccctgccg atctggatca ggctctcgcg cttttgcgca aggcggagcg 720gccggtcatc gtgctcggct cagaagcctc gcggacagcg cgcaagacgg cgcttagcgc 780cttcgtggcg gcgactggcg tgccggtgtt tgccgattat gaagggctaa gcatgctctc 840ggggctgccc gatgctatgc ggggcgggct ggtgcaaaac ctctattctt ttgccaaagc 900cgatgccgcg ccagatctcg tgctgatgct gggggcgcgc tttggcctta acaccgggca 960tggatctggg cagttgatcc cccatagcgc gcaggtcatt caggtcgacc ctgatgcctg 1020cgagctggga cgcctgcagg gcatcgctct gggcattgtg gccgatgtgg gtgggaccat 1080cgaggctttg gcgcaggcca ccgcgcaaga tgcggcttgg ccggatcgcg gcgactggtg 1140cgccaaagtg acggatctgg cgcaagagcg ctatgccagc atcgctgcga aatcgagcag 1200cgagcatgcg ctccacccct ttcacgcctc gcaggtcatt gccaaacacg tcgatgcagg 1260ggtgacggtg gtagcggatg gtgcgctgac ctatctctgg ctgtccgaag tgatgagccg 1320cgtgaaaccc ggcggttttc tctgccacgg ctatctaggc tcgatgggcg tgggcttcgg 1380cacggcgctg ggcgcgcaag tggccgatct tgaagcaggc cgccgcacga tccttgtgac 1440cggcgatggc tcggtgggct atagcatcgg tgaatttgat acgctggtgc gcaaacaatt 1500gccgctgatc gtcatcatca tgaacaacca aagctggggg gcgacattgc atttccagca 1560attggccgtc ggccccaatc gcgtgacggg cacccgtttg gaaaatggct cctatcacgg 1620ggtggccgcc gcctttggcg cggatggcta tcatgtcgac agtgtggaga gcttttctgc 1680ggctctggcc caagcgctcg cccataatcg ccccgcctgc atcaatgtcg cggtcgcgct 1740cgatccgatc ccgcccgaag aactcattct gatcggcatg gaccccttcg catga 1795144563PRTPseudomonas fluorescens 144Met Ala Met Ile Thr Gly Gly Glu Leu Val Val Arg Thr Leu Ile Lys1 5 10 15 Ala Gly Val Glu His Leu Phe Gly Leu His Gly Ala His Ile Asp Thr 20 25 30 Ile Phe Gln Ala Cys Leu Asp His Asp Val Pro Ile Ile Asp Thr Arg 35 40 45 His Glu Ala Ala Ala Gly His Ala Ala Glu Gly Tyr Ala Arg Ala Gly 50 55 60 Ala Lys Leu Gly Val Ala Gly His Gly Gly Arg Gly Ile Tyr Gln Cys65 70 75 80 Gly His Ala His Cys Gln Arg Leu Ala Gly Ser Gln Gly Arg Cys Ile 85 90 95 Pro His Pro Gly Ser Gly Ala Leu Arg Asp Asp Glu Thr Asn Thr Leu 100 105 110 Gln Ala Gly Ile Asp 
Gln Val Ala Met Ala Ala Pro Ile Thr Lys Trp 115 120 125 Ala His Arg Val Met Ala Thr Glu His Ile Pro Arg Leu Val Met Gln 130 135 140 Ala Ile Arg Ala Ala Leu Ser Ala Pro Arg Gly Pro Val Leu Leu Asp145 150 155 160 Leu Pro Trp Asp Ile Leu Met Asn Gln Ile Asp Glu Asp Ser Val Ile 165 170 175 Ile Pro Asp Leu Val Leu Ser Ala His Gly Ala Arg Pro Asp Pro Ala 180 185 190 Asp Leu Asp Gln Ala Leu Ala Leu Leu Arg Lys Ala Glu Arg Pro Val 195 200 205 Ile Val Leu Gly Ser Glu Ala Ser Arg Thr Ala Arg Lys Thr Ala Leu 210 215 220 Ser Ala Phe Val Ala Ala Thr Gly Val Pro Val Phe Ala Asp Tyr Glu225 230 235 240 Gly Leu Ser Met Leu Ser Gly Leu Pro Asp Ala Met Arg Gly Gly Leu 245 250 255 Val Gln Asn Leu Tyr Ser Phe Ala Lys Ala Asp Ala Ala Pro Asp Leu 260 265 270 Val Leu Met Leu Gly Ala Arg Phe Gly Leu Asn Thr Gly His Gly Ser 275 280 285 Gly Gln Leu Ile Pro His Ser Ala Gln Val Ile Gln Val Asp Pro Asp 290 295 300 Ala Cys Glu Leu Gly Arg Leu Gln Gly Ile Ala Leu Gly Ile Val Ala305 310 315 320 Asp Val Gly Gly Thr Ile Glu Ala Leu Ala Gln Ala Thr Ala Gln Asp 325 330 335 Ala Ala Trp Pro Asp Arg Gly Asp Trp Cys Ala Lys Val Thr Asp Leu 340 345 350 Ala Gln Glu Arg Tyr Ala Ser Ile Ala Ala Lys Ser Ser Ser Glu His 355 360 365 Ala Leu His Pro Phe His Ala Ser Gln Val Ile Ala Lys His Val Asp 370 375 380 Ala Gly Val Thr Val Val Ala Asp Gly Ala Leu Thr Tyr Leu Trp Leu385 390 395 400 Ser Glu Val Met Ser Arg Val Lys Pro Gly Gly Phe Leu Cys His Gly 405 410 415 Tyr Leu Gly Ser Met Gly Val Gly Phe Gly Thr Ala Leu Gly Ala Gln 420 425 430 Val Ala Asp Leu Glu Ala Gly Arg Arg Thr Ile Leu Val Thr Gly Asp 435 440 445 Gly Ser Val Gly Tyr Ser Ile Gly Glu Phe Asp Thr Leu Val Arg Lys 450 455 460 Gln Leu Pro Leu Ile Val Ile Ile Met Asn Asn Gln Ser Trp Gly Ala465 470 475 480 Thr Leu His Phe Gln Gln Leu Ala Val Gly Pro Asn Arg Val Thr Gly 485 490 495 Thr Arg Leu Glu Asn Gly Ser Tyr His Gly Val Ala Ala Ala Phe Gly 500 505 510 Ala Asp Gly Tyr His Val Asp Ser Val Glu Ser Phe Ser Ala Ala Leu 515 520 525 Ala Gln Ala Leu Ala His Asn 
Arg Pro Ala Cys Ile Asn Val Ala Val 530 535 540 Ala Leu Asp Pro Ile Pro Pro Glu Glu Leu Ile Leu Ile Gly Met Asp545 550 555 560 Pro Phe Ala1459PRTArtificial SequenceA polypeptide that is similar to an autotransporter adhesion or type I secretion target repeat. 145Gly Gly Xaa Gly Xaa Asp Xaa Xaa Xaa1 5 14650DNAArtificial SequencePrimer 146gtctttattc atatatatat cctccttaat tcaaccgttc aatcaccatc 5014730DNAArtificial SequencePrimer 147gggcggccgc aaggggttcg cgttggccga 3014822DNAArtificial SequencePrimer 148ggagaaaata ccgcatcagg cg 2214932DNAArtificial SequencePrimer 149cgggatccaa gttgcaggat atgacgaaag cg 3215033DNAArtificial SequencePrimer 150gctctagaag attatccctg tctgcggaag cgg 3315132DNAArtificial SequencePrimer 151gctctagagg ggtgcctaat gagtgagcta ac 3215233DNAArtificial SequencePrimer 152cgggatccgc gttaatattt tgttaaaatt cgc 3315331DNAArtificial SequencePrimer 153gctctagagt ttatgtcgca cccgccgttg g 3115432DNAArtificial SequencePrimer 154cccaagctta gaaagggaaa ttgtggtagc cc 3215531DNAArtificial SequencePrimer 155ggaattccat atgcgtccct ctgccccggc c 3115630DNAArtificial SequencePrimer 156cgggatcctt agaactgctt gggaagggag 3015750DNAArtificial SequencePrimer 157aggtacggtg aaataaagga ggatatacat atgtccaaaa agattgccgt 5015837DNAArtificial SequencePrimer 158ttttcctttt gcggccgccc cgctggcatc gcctcac 3715950DNAArtificial SequencePrimer 159ggcgatgcca gcgtaaagga ggatatacat atgaaaaact ggaaaacaag 5016037DNAArtificial SequencePrimer 160ttttcctttt gcggccgccc cagcttagcg ccttcta 3716131DNAArtificial SequencePrimer 161cccgagctct taggaggatt agtcatggaa c 3116232DNAArtificial SequencePrimer 162gctctagatt attttgaata atcgtagaaa cc 3216342DNAArtificial sequencePrimer 163gctctagagg aggatatata tatgaaaaat tgtgtcatcg tc 4216430DNAArtificial SequencePrimer 164aactgcagtt aattcaaccg ttcaatcacc 3016546DNAArtificial SequencePrimer 165cgagctcagg aggatatata tatgaaaaat tgtgtcatcg tcagtg 4616650DNAArtificial SequencePrimer 166ggttgaatta aggaggatat atatatgaat aaagacacac taatacctac 5016730DNAArtificial 
SequencePrimer 167cccaagctta gccggcaagt acacatcttc 3016846DNAArtificial SequencePrimer 168cgagctcagg aggatatata tatgaaaaat tgtgtcatcg tcagtg 4616930DNAArtificial SequencePrimer 169cccaagctta gccggcaagt acacatcttc 3017040DNAArtificial SequencePrimer 170aaggaaaaaa gcggccgccc ctgaaccgac gaccgggtcg 4017135DNAArtificial SequencePrimer 171cggggtaccg cggatacata tttgaatgta tttag 3517244DNAArtificial SequencePrimer 172aaggaaaaaa gcggccgcgc ggatacatat ttgaatgtat ttag 4417343DNAArtificial SequencePrimer 173gctctagagg aggatatata tatggctaac tacttcaata cac 4317450DNAArtificial SequencePrimer 174tgctgttgcg ggttaaggag gatatatata tgcctaagta ccgttccgcc 5017550DNAArtificial SequencePrimer 175aacggtactt aggcatatat atatcctcct taacccgcaa cagcaatacg 5017630DNAArtificial SequencePrimer 176acatgcatgc ttaacccccc agtttcgatt 3017743DNAArtificial SequencePrimer 177gctctagagg aggatatata tatggctaac tacttcaata cac 4317830DNAArtificial SequencePrimer 178acatgcatgc ttaacccccc agtttcgatt 3017943DNAArtificial SequencePrimer 179cccgagctca ggaggatata tatatggata aacagtatcc ggt 4318028DNAArtificial SequencePrimer 180gctctagatt acagaatttg actcaggt 2818145DNAArtificial SequencePrimer 181cccgagctca ggaggatata tatatgttga caaaagcaac aaaag 4518225DNAArtificial SequencePrimer 182ctctaaatct ctggaaaggg taccg 2518330DNAArtificial SequencePrimer 183gctctagatt agagagcttt cgttttcatg 3018445DNAArtificial SequencePrimer 184cccgagctca ggaggatata tatatgttga caaaagcaac aaaag 4518530DNAArtificial SequencePrimer 185gctctagatt agagagcttt cgttttcatg 3018646DNAArtificial SequencePrimer 186cgagctcagg aggatatata tatgagccag caagtcatta ttttcg 4618735DNAArtificial SequencePrimer 187aaaactgcag cgtttgatga cgtggacgat agcgg 3518846DNAArtificial SequencePrimer 188cgagctcagg aggatatata tatgagccag caagtcatta ttttcg 4618950DNAArtificial SequencePrimer 189aggggtgtaa ggaggatata tatatggcta agacgttata cgaaaaattg 5019050DNAArtificial SequencePrimer 190cgtcttagcc atatatatat cctccttaca ccccttctgc tacatagcgg 5019135DNAArtificial SequencePrimer 
191aaaactgcag cgtttgatga cgtggacgat agcgg 3519246DNAArtificial SequencePrimer 192cgagctcagg aggatatata tatgagccag caagtcatta ttttcg 4619335DNAArtificial SequencePrimer 193aaaactgcag cgtttgatga cgtggacgat agcgg 3519446DNAArtificial SequencePrimer 194cgagctcagg aggatatata tatgagccag caagtcatta ttttcg 4619550DNAArtificial SequencePrimer 195gaaaccgtgt gaggaggata tatatatgtc gaagaattac catattgccg 5019650DNAArtificial SequencePrimer 196aggggtgtaa ggaggatata tatatggcta agacgttata cgaaaaattg 5019750DNAArtificial SequencePrimer 197acattaaata aggaggatat atatatggca gagaaattta tcaaacacac 5019850DNAArtificial SequencePrimer 198attcttcgac atatatatat cctcctcaca cggtttcctt gttgttttcg 5019950DNAArtificial SequencePrimer 199cgtcttagcc atatatatat cctccttaca ccccttctgc tacatagcgg 5020050DNAArtificial SequencePrimer 200tttctctgcc atatatatat cctccttatt taatgttgcg aatgtcggcg 5020135DNAArtificial SequencePrimer 201aaaactgcag cgtttgatga cgtggacgat agcgg 3520246DNAArtificial SequencePrimer 202cgagctcagg aggatatata tatgagccag caagtcatta ttttcg 4620335DNAArtificial SequencePrimer 203aaaactgcag cgtttgatga cgtggacgat agcgg 3520440DNAArtificial SequencePrimer 204aaggaaaaaa gcggccgccc ctgaaccgac gaccgggtcg 4020535DNAArtificial SequencePrimer 205cggggtaccg cggatacata tttgaatgta tttag 3520642DNAArtificial SequencePrimer 206aaggaaaaaa gcggccgcac ttttcatact cccgccattc ag 4220731DNAArtificial SequencePrimer 207caaaggccgt ctgcacgcgc cgaaaggcaa a 3120831DNAArtificial SequencePrimer 208tttgcctttc ggcgcgtgca gacggccttt g 3120935DNAArtificial SequencePrimer 209acatgcatgc cgtttgatga cgtggacgat agcgg 3521042DNAArtificial SequencePrimer 210aaggaaaaaa gcggccgcac ttttcatact cccgccattc ag 4221135DNAArtificial SequencePrimer 211acatgcatgc cgtttgatga cgtggacgat agcgg 3521248DNAArtificial SequencePrimer 212cccgagctca ggaggatata tatatgaatt atcagaacga cgatttac 4821350DNAArtificial SequencePrimer 213gcgtcgcggg taaggaggaa aattttatgt cctcacgtaa agagcttgcc 5021450DNAArtificial SequencePrimer 214gaactgctgt aaggaggtta 
aaattatgga gaggattgtc gttactctcg 5021550DNAArtificial SequencePrimer 215caatcagcgt aaggaggtat atataatgaa aaccgtaact gtaaaagatc 5021650DNAArtificial SequencePrimer 216tacaccaggc ataaggagga attaattatg gaaacctatg ctgtttttgg 5021750DNAArtificial SequencePrimer 217tacgtgagga cataaaattt tcctccttac ccgcgacgcg cttttactgc 5021850DNAArtificial SequencePrimer 218caatcctctc cataatttta acctccttac agcagttctt ttgctttcgc 5021950DNAArtificial SequencePrimer 219caatcagcgt aaggaggtat atataatgaa aaccgtaact gtaaaagatc 5022050DNAArtificial SequencePrimer 220tacggttttc attatatata cctccttacg ctgattgaca atcggcaatg 5022134DNAArtificial SequencePrimer 221acatgcatgc ttacgcggac aattcctcct gcaa 3422248DNAArtificial SequencePrimer 222cccgagctca ggaggatata tatatgaatt atcagaacga cgatttac 4822334DNAArtificial SequencePrimer 223acatgcatgc ttacgcggac aattcctcct gcaa 3422448DNAArtificial SequencePrimer 224cccgagctca ggaggatata tatatgacat cggaaaaccc gttactgg 4822550DNAArtificial SequencePrimer 225gatccaacct aaggaggaaa attttatgac acaacctctt tttctgatcg 5022650DNAArtificial SequencePrimer 226gatcaattgt taaggaggta tatataatgg aatccctgac gttacaaccc 5022750DNAArtificial SequencePrimer 227caggcagcct aaggaggaat taattatggc tggaaacaca attggacaac 5022850DNAArtificial SequencePrimer 228aggttgtgtc ataaaatttt cctccttagg ttggatcaac aggcactacg 5022950DNAArtificial SequencePrimer 229cagggattcc attatatata cctccttaac aattgatcgt ctgtgccagg 5023050DNAArtificial SequencePrimer

230gtttccagcc ataattaatt cctccttagg ctgcctggct aatccgcgcc 5023135DNAArtificial SequencePrimer 231acatgcatgc ttaccagcgt ggaatatcag tcttc 3523248DNAArtificial SequencePrimer 232cccgagctca ggaggatata tatatgacat cggaaaaccc gttactgg 4823335DNAArtificial SequencePrimer 233acatgcatgc ttaccagcgt ggaatatcag tcttc 3523448DNAArtificial SequencePrimer 234cccgagctca ggaggatata tatatggttg ctgaattgac cgcattac 4823550DNAArtificial SequencePrimer 235aatcgccagt aaggaggaaa attttatgac acaacctctt tttctgatcg 5023650DNAArtificial SequencePrimer 236gatcaattgt taaggaggta tatataatgg aatccctgac gttacaaccc 5023750DNAArtificial SequencePrimer 237caggcagcct aaggaggaat taattatggc tggaaacaca attggacaac 5023850DNAArtificial SequencePrimer 238gaggttgtgt cataaaattt tcctccttac tggcgattgt cattcgcctg 5023950DNAArtificial SequencePrimer 239cagggattcc attatatata cctccttaac aattgatcgt ctgtgccagg 5024050DNAArtificial SequencePrimer 240gtttccagcc ataattaatt cctccttagg ctgcctggct aatccgcgcc 5024135DNAArtificial SequencePrimer 241acatgcatgc ttaccagcgt ggaatatcag tcttc 3524248DNAArtificial SequencePrimer 242cccgagctca ggaggatata tatatggttg ctgaattgac cgcattac 4824335DNAArtificial SequencePrimer 243acatgcatgc ttaccagcgt ggaatatcag tcttc 3524440DNAArtificial SequencePrimer 244aaggaaaaaa gcggccgccc ctgaaccgac gaccgggtcg 4024532DNAArtificial SequencePrimer 245gctctagaac ttttcatact cccgccattc ag 3224634DNAArtificial SequencePrimer 246gctctagagc ggatacatat ttgaatgtat ttag 3424744DNAArtificial SequencePrimer 247aaggaaaaaa gcggccgcgc ggatacatat ttgaatgtat ttag 4424826DNAArtificial SequencePrimer 248catgccatgg ctatgattac tggtgg 2624933DNAArtificial SequencePrimer 249ccccgagctc ttacgcgccg gattggaaat aca 3325031DNAArtificial SequencePrimer 250catgccatgg ccaaagttac aaatcaaaaa g 3125132DNAArtificial SequencePrimer 251cgagctctta aaatgatttt atatagatat cc 3225231DNAArtificial SequencePrimer 252catgccatgg gtattccaga aactcaaaaa g 3125331DNAArtificial SequencePrimer 253cccgagctct tatttagaag tgtcaacaac g 3125447DNAArtificial 
SequencePrimer 254ccccgagctc aggaggatat acatatgaat aaagacacac taatacc 4725530DNAArtificial SequencePrimer 255cccaagctta gccggcaagt acacatcttc 3025645DNAArtificial SequencePrimer 256cccgagctca ggaggatata tatatgtata cagtaggaga ttacc 4525733DNAArtificial SequencePrimer 257gctctagatt atgatttatt ttgttcagca aat 3325845DNAArtificial SequencePrimer 258cccgagctca ggaggatata tatatgtata cagtaggaga ttacc 4525933DNAArtificial SequencePrimer 259gctctagatt atgatttatt ttgttcagca aat 3326046DNAArtificial SequencePrimer 260cgagctcagg aggatatata tatgaaaaaa gtcgcacttg ttaccg 4626131DNAArtificial SequencePrimer 261ggccggcggc cgcgcgatgg cggtgaaagt g 3126250DNAArtificial SequencePrimer 262aactaatcta gaggaggata tatatatgag catgacgttt tccggccagg 5026331DNAArtificial SequencePrimer 263ccttgcggag ggctcgatgg atgagttcga c 3126431DNAArtificial SequencePrimer 264cactttcacc gccatcgcgc ggccgccggc c 3126550DNAArtificial SequencePrimer 265gctcatatat atatcctcct ctagattagt taaacaccat cccgccgtcg 5026631DNAArtificial SequencePrimer 266gtcgaactca tccatcgagc cctccgcaag g 3126732DNAArtificial SequencePrimer 267cccaagctta gatcgcggtg gccccgccgt cg 3226846DNAArtificial SequencePrimer 268cgagctcagg aggatatata tatgaaaaaa gtcgcacttg ttaccg 4626932DNAArtificial SequencePrimer 269cccaagctta gatcgcggtg gccccgccgt cg 3227043DNAArtificial SequencePrimer 270gctctagagg aggatttaaa aatggaaatt aacgaaacgc tgc 4327145DNAArtificial SequencePrimer 271tccccgcggt taagcatggc gatcccgaaa tggaatccct ttgac 4527244DNAArtificial SequencePrimer 272ccgctcgagg aggatatata tatgagatcg aaaagatttg aagc 4427330DNAArtificial SequencePrimer 273gctctagatt agccaagttc attgggatcg 3027433DNAArtificial SequencePrimer 274cggggtacca cttttcatac tcccgccatt cag 3327525DNAArtificial SequencePrimer 275cggtaccctt tccagagatt tagag 2527630DNAArtificial SequencePrimer 276ggaattccat atgttcacaa cgtccgccta 3027727DNAArtificial SequencePrimer 277gcttgacggc catgtggccg aggccgc 2727827DNAArtificial SequencePrimer 278gcggcctcgg ccacatggcc gtcaagc 2727928DNAArtificial 
SequencePrimer 279cgggatcctt aggcggcctt ctggcgcg 2828030DNAArtificial SequencePrimer 280ggaattccat atggctattg caagaggtta 3028128DNAArtificial SequencePrimer 281cgggatcctt aagcgtcgag cgaggcca 2828230DNAArtificial SequencePrimer 282ggaattccat atgactaaaa caatgaaggc 3028327DNAArtificial SequencePrimer 283caccggggcc ggggtccggt attgcca 2728427DNAArtificial SequencePrimer 284tggcaatacc ggaccccggc cccggtg 2728528DNAArtificial SequencePrimer 285cgggatcctt aggcggcgag atccacga 2828630DNAArtificial SequencePrimer 286ggaattccat atgaccgggg cgaaccagcc 3028727DNAArtificial SequencePrimer 287atagccgctc atacgcctcg gttgcct 2728827DNAArtificial SequencePrimer 288aggcaaccga ggcgtatgag cggctat 2728928DNAArtificial SequencePrimer 289cgggatcctt aagcgccgtg cggaagga 2829030DNAArtificial SequencePrimer 290ggaattccat atgaccatgc atgccattca 3029128DNAArtificial SequencePrimer 291cgggatcctt attcggctgc aaattgca 2829230DNAArtificial SequencePrimer 292ggaattccat atgcgcgcgc tttattacga 3029328DNAArtificial SequencePrimer 293cgggatcctt attcgaaccg gtcgatga 2829430DNAArtificial SequencePrimer 294ggaattccat atgctggcga ttttctgtga 3029528DNAArtificial SequencePrimer 295cgggatcctt atgcgacctc caccatgc 2829630DNAArtificial SequencePrimer 296ggaattccat atgaaagcct tcgtcgtcga 3029728DNAArtificial SequencePrimer 297cgggatcctt aggatgcgta tgtaacca 2829830DNAArtificial SequencePrimer 298ggaattccat atgaaagcga ttgtcgccca 3029928DNAArtificial SequencePrimer 299cgggatcctt aggaaaaggc gatctgca 2830030DNAArtificial SequencePrimer 300ggaattccat atgccgatgg cgctcgggca 3030128DNAArtificial SequencePrimer 301cgggatcctt agaattcgat gacttgcc 2830230DNAArtificial SequencePrimer 302ggaattccat atgaaacatt ctcaggacaa 3030327DNAArtificial SequencePrimer 303gggcgccgat catgtggtgc gtttccg 2730427DNAArtificial SequencePrimer 304cggaaacgca ccacatgatc ggcgccc 2730528DNAArtificial SequencePrimer 305cgggatcctt atgccatacg ttccatat 2830630DNAArtificial SequencePrimer 306ggaattccat atgcagcgtt ttaccaacag 3030728DNAArtificial SequencePrimer 
307cgggatcctt aggaaaacag gacgccgc 28308610PRTKlebsiella pneumoniae subsp. pneumoniae MGH 78578 308Met Arg Tyr Ile Ala Gly Ile Asp Ile Gly Asn Ser Ser Thr Glu Val1 5 10 15 Ala Leu Ala Thr Val Asp Asp Ala Gly Val Leu Asn Ile Arg His Ser 20 25 30 Ala Leu Ala Glu Thr Thr Gly Ile Lys Gly Thr Leu Arg Asn Val Phe 35 40 45 Gly Ile Gln Glu Ala Leu Thr Gln Ala Ala Lys Ala Ala Gly Ile Gln 50 55 60 Leu Ser Asp Ile Ser Leu Ile Arg Ile Asn Glu Ala Thr Pro Val Ile65 70 75 80 Gly Asp Val Ala Met Glu Thr Ile Thr Glu Thr Ile Ile Thr Glu Ser 85 90 95 Thr Met Ile Gly His Asn Pro Lys Thr Pro Gly Gly Val Gly Leu Gly 100 105 110 Val Gly Ile Thr Ile Thr Pro Glu Ala Leu Leu Ser Cys Ser Ala Asp 115 120 125 Thr Pro Tyr Ile Leu Val Val Ser Ser Ala Phe Asp Phe Ala Asp Val 130 135 140 Ala Ala Met Val Asn Ala Ala Thr Ala Ala Gly Tyr Gln Ile Thr Gly145 150 155 160 Ile Ile Leu Gln Gln Asp Asp Gly Val Leu Val Asn Asn Arg Leu Gln 165 170 175 Gln Pro Leu Pro Val Ile Asp Glu Val Gln His Ile Asp Arg Ile Pro 180 185 190 Leu Gly Met Leu Ala Ala Val Glu Val Ala Leu Pro Gly Lys Ile Ile 195 200 205 Glu Thr Leu Ser Asn Pro Tyr Gly Ile Ala Thr Val Phe Asp Leu Asn 210 215 220 Ala Glu Glu Thr Lys Asn Ile Val Pro Met Ala Arg Ala Leu Ile Gly225 230 235 240 Asn Arg Ser Ala Val Val Val Lys Thr Pro Ser Gly Asp Val Lys Ala 245 250 255 Arg Ala Ile Pro Ala Gly Asn Leu Leu Leu Ile Ala Gln Gly Arg Ser 260 265 270 Val Gln Val Asp Val Ala Ala Gly Ala Glu Ala Ile Met Lys Ala Val 275 280 285 Asp Gly Cys Gly Lys Leu Asp Asn Val Ala Gly Glu Ala Gly Thr Asn 290 295 300 Ile Gly Gly Met Leu Glu His Val Arg Gln Thr Met Ala Glu Leu Thr305 310 315 320 Asn Lys Pro Ala Gln Glu Ile Arg Ile Gln Asp Leu Leu Ala Val Asp 325 330 335 Thr Ala Val Pro Val Ser Val Thr Gly Gly Leu Ala Gly Glu Phe Ser 340 345 350 Leu Glu Gln Ala Val Gly Ile Ala Ser Met Val Lys Ser Asp Arg Leu 355 360 365 Gln Met Ala Leu Ile Ala Arg Glu Ile Glu His Lys Leu Gln Ile Ala 370 375 380 Val Gln Val Gly Gly Ala Glu Ala Glu Ala Ala Ile Leu Gly Ala Leu385 390 395 400 Thr 
Thr Pro Gly Thr Thr Arg Pro Leu Ala Ile Leu Asp Leu Gly Ala 405 410 415 Gly Ser Thr Asp Ala Ser Ile Ile Asn Ala Gln Gly Glu Ile Ser Ala 420 425 430 Thr His Leu Ala Gly Ala Gly Asp Met Val Thr Met Ile Ile Ala Arg 435 440 445 Glu Leu Gly Leu Glu Asp Arg Tyr Leu Ala Glu Glu Ile Lys Lys Tyr 450 455 460 Pro Leu Ala Lys Val Glu Ser Leu Phe His Leu Arg His Glu Asp Gly465 470 475 480 Ser Val Gln Phe Phe Pro Ser Ala Leu Pro Pro Ala Val Phe Ala Arg 485 490 495 Val Cys Val Val Lys Pro Asp Glu Leu Val Pro Leu Pro Gly Asp Leu 500 505 510 Pro Leu Glu Lys Val Arg Ala Ile Arg Arg Ser Ala Lys Ser Arg Val 515 520 525 Phe Val Thr Asn Ala Leu Arg Ala Leu Arg Gln Val Ser Pro Thr Gly 530 535 540 Asn Ile Arg Asp Ile Pro Phe Val Val Leu Val Gly Gly Ser Ser Leu545 550 555 560 Asp Phe Glu Ile Pro Gln Leu Val Thr Asp Ala Leu Ala His Tyr Arg 565 570 575 Leu Val Ala Gly Arg Gly Asn Ile Arg Gly Cys Glu Gly Pro Arg Asn 580 585 590 Ala Val Ala Ser Gly Leu Leu Leu Ser Trp Gln Lys Gly Gly Thr His 595 600 605 Gly Glu 610 309116PRTKlebsiella pneumoniae subsp. 
pneumoniae MGH78578 309Met Glu Ser Ser Val Val Ala Pro Ala Ile Val Ile Ala Val Thr Asp1 5 10 15 Glu Cys Ser Glu Gln Trp Arg Asp Val Leu Leu Gly Ile Glu Glu Glu 20 25 30 Gly Ile Pro Phe Val Leu Gln Pro Gln Thr Gly Gly Asp Leu Ile His 35 40 45 His Ala Trp Gln Ala Ala Gln Arg Ser Pro Leu Gln Val Gly Ile Ala 50 55 60 Cys Asp Arg Glu Arg Leu Ile Val His Tyr Lys Asn Leu Pro Ala Ser65 70 75 80 Thr Pro Leu Phe Ser Leu Met Tyr His Gln Asn Arg Leu Ala Arg Arg 85 90 95 Asn Thr Gly Asn Asn Ala Ala Arg Leu Val Lys Gly Ile Pro Phe Arg 100 105 110 Asp Arg His Ala 115 310787PRTClostridium butyricum 310Met Ile Ser Lys Gly Phe Ser Thr Gln Thr Glu Arg Ile Asn Ile Leu1 5 10 15 Lys Ala Gln Ile Leu Asn Ala Lys Pro Cys Val Glu Ser Glu Arg Ala 20 25 30 Ile Leu Ile Thr Glu Ser Phe Lys Gln Thr Glu Gly Gln Pro Ala Ile 35 40 45 Leu Arg Arg Ala Leu Ala Leu Lys His Ile Leu Glu Asn Ile Pro Ile 50 55 60 Thr Ile Arg Asp Gln Glu Leu Ile Val Gly Ser Leu Thr Lys Glu Pro65 70 75 80 Arg Ser Ser Gln Val Phe Pro Glu Phe Ser Asn Lys Trp Leu Gln Asp 85 90 95 Glu Leu Asp Arg Leu Asn Lys Arg Thr Gly Asp Ala Phe Gln Ile Ser 100 105 110 Glu Glu Ser Lys Glu Lys Leu Lys Asp Val Phe Glu Tyr Trp Asn Gly 115 120 125 Lys Thr Thr Ser Glu Leu Ala Thr Ser Tyr Met Thr Glu Glu Thr Arg 130 135 140 Glu Ala Val Asn Cys Asp Val Phe Thr Val Gly Asn Tyr Tyr Tyr Asn145 150 155 160 Gly Val Gly His Val Ser Val Asp Tyr Gly Lys Val Leu Arg Val Gly 165 170 175 Phe Asn Gly Ile Ile Asn Glu Ala Lys Glu Gln Leu Glu Lys Asn Arg 180 185 190 Ser Ile Asp Pro Asp Phe Ile Lys Lys Glu Lys Phe Leu Asn Ser Val 195 200 205 Ile Ile Ser Cys Glu Ala Ala Ile Thr Tyr Val Asn Arg Tyr Ala Lys 210 215 220 Lys Ala Lys Glu Ile Ala Asp Asn Thr Ser Asp Ala Lys Arg Lys Ala225 230 235 240 Glu Leu Asn Glu Ile Ala Lys Ile Cys Ser Lys Val Ser Gly Glu Gly 245 250 255 Ala Lys Ser Phe Tyr Glu Ala Cys Gln Leu Phe Trp Phe Ile His Ala 260 265 270 Ile Ile Asn Ile Glu Ser Asn Gly His Ser Ile Ser Pro Ala Arg Phe 275 280 285 Asp Gln Tyr Met Tyr Pro Tyr Tyr Glu Asn Asp Lys 
Asn Ile Thr Asp 290 295 300 Lys Phe Ala Gln Glu Leu Ile Asp Cys Ile Trp Ile Lys Leu Asn Asp305 310 315 320 Ile Asn Lys Val Arg Asp Glu Ile Ser Thr Lys His Phe Gly Gly Tyr 325 330 335 Pro Met Tyr Gln Asn Leu Ile Val Gly Gly Gln Asn Ser Glu Gly Lys 340 345 350 Asp Ala Thr Asn Lys Val Ser Tyr Met Ala Leu Glu Ala Ala Val His 355 360 365 Val Lys Leu Pro Gln Pro Ser Leu Ser Val Arg Ile Trp Asn Lys Thr 370 375 380 Pro Asp Glu Phe Leu Leu Arg Ala Ala Glu Leu Thr Arg Glu Gly Leu385 390 395 400 Gly Leu Pro Ala Tyr Tyr Asn Asp Glu Val Ile Ile Pro Ala Leu Val 405 410 415 Ser Arg Gly Leu Thr Leu Glu Asp Ala Arg Asp Tyr Gly Ile Ile Gly 420 425 430 Cys Val Glu Pro Gln Lys Pro Gly Lys Thr Glu Gly Trp His Asp Ser 435 440 445 Ala Phe Phe Asn Leu Ala Arg Ile Val Glu Leu Thr Ile Asn Ser Gly 450 455 460 Phe Asp Lys Asn Lys Gln Ile Gly Pro Lys Thr Gln Asn Phe Glu Glu465 470 475 480 Met Lys Ser Phe Asp Glu Phe Met Lys Ala Tyr Lys Ala Gln Met Glu 485 490

495 Tyr Phe Val Lys His Met Cys Cys Ala Asp Asn Cys Ile Asp Ile Ala 500 505 510 His Ala Glu Arg Ala Pro Leu Pro Phe Leu Ser Ser Met Val Asp Asn 515 520 525 Cys Ile Gly Lys Gly Lys Ser Leu Gln Asp Gly Gly Ala Glu Tyr Asn 530 535 540 Phe Ser Gly Pro Gln Gly Val Gly Val Ala Asn Ile Gly Asp Ser Leu545 550 555 560 Val Ala Val Lys Lys Ile Val Phe Asp Glu Asn Lys Ile Thr Pro Ser 565 570 575 Glu Leu Lys Lys Thr Leu Asn Asn Asp Phe Lys Asn Ser Glu Glu Ile 580 585 590 Gln Ala Leu Leu Lys Asn Ala Pro Lys Phe Gly Asn Asp Ile Asp Glu 595 600 605 Val Asp Asn Leu Ala Arg Glu Gly Ala Leu Val Tyr Cys Arg Glu Val 610 615 620 Asn Lys Tyr Thr Asn Pro Arg Gly Gly Asn Phe Gln Pro Gly Leu Tyr625 630 635 640 Pro Ser Ser Ile Asn Val Tyr Phe Gly Ser Leu Thr Gly Ala Thr Pro 645 650 655 Asp Gly Arg Lys Ser Gly Gln Pro Leu Ala Asp Gly Val Ser Pro Ser 660 665 670 Arg Gly Cys Asp Val Ser Gly Pro Thr Ala Ala Cys Asn Ser Val Ser 675 680 685 Lys Leu Asp His Phe Ile Ala Ser Asn Gly Thr Leu Phe Asn Gln Lys 690 695 700 Phe His Pro Ser Ala Leu Lys Gly Asp Asn Gly Leu Met Asn Leu Ser705 710 715 720 Ser Leu Ile Arg Ser Tyr Phe Asp Gln Lys Gly Phe His Val Gln Phe 725 730 735 Asn Val Ile Asp Lys Lys Ile Leu Leu Ala Ala Gln Lys Asn Pro Glu 740 745 750 Lys Tyr Gln Asp Leu Ile Val Arg Val Ala Gly Tyr Ser Ala Gln Phe 755 760 765 Ile Ser Leu Asp Lys Ser Ile Gln Asn Asp Ile Ile Ala Arg Thr Glu 770 775 780 His Val Met785 311304PRTClostridium butyricum 311Met Ser Lys Glu Ile Lys Gly Val Leu Phe Asn Ile Gln Lys Phe Ser1 5 10 15 Leu His Asp Gly Pro Gly Ile Arg Thr Ile Val Phe Phe Lys Gly Cys 20 25 30 Ser Met Ser Cys Leu Trp Cys Ser Asn Pro Glu Ser Gln Asp Ile Lys 35 40 45 Pro Gln Val Met Phe Asn Lys Asn Leu Cys Thr Lys Cys Gly Arg Cys 50 55 60 Lys Ser Gln Cys Lys Ser Ala Ala Ile Asp Met Asn Ser Glu Tyr Arg65 70 75 80 Ile Asp Lys Ser Lys Cys Thr Glu Cys Thr Lys Cys Val Asp Asn Cys 85 90 95 Leu Ser Gly Ala Leu Val Ile Glu Gly Arg Asn Tyr Ser Val Glu Asp 100 105 110 Val Ile Lys Glu Leu Lys Lys Asp Ser Val Gln Tyr Arg 
Arg Ser Asn 115 120 125 Gly Gly Ile Thr Leu Ser Gly Gly Glu Val Leu Leu Gln Pro Asp Phe 130 135 140 Ala Val Glu Leu Leu Lys Glu Cys Lys Ser Tyr Gly Trp His Thr Ala145 150 155 160 Ile Glu Thr Ala Met Tyr Val Asn Ser Glu Ser Val Lys Lys Val Ile 165 170 175 Pro Tyr Ile Asp Leu Ala Met Ile Asp Ile Lys Ser Met Asn Asp Glu 180 185 190 Ile His Arg Lys Phe Thr Gly Val Ser Asn Glu Ile Ile Leu Gln Asn 195 200 205 Ile Lys Leu Ser Asp Glu Leu Ala Lys Glu Ile Ile Ile Arg Ile Pro 210 215 220 Val Ile Glu Gly Phe Asn Ala Asp Leu Gln Ser Ile Gly Ala Ile Ala225 230 235 240 Gln Phe Ser Lys Ser Leu Thr Asn Leu Lys Arg Ile Asp Leu Leu Pro 245 250 255 Tyr His Asn Tyr Gly Glu Asn Lys Tyr Gln Ala Ile Gly Arg Glu Tyr 260 265 270 Ser Leu Lys Glu Leu Lys Ser Pro Ser Lys Asp Lys Met Glu Arg Leu 275 280 285 Lys Ala Leu Val Glu Ile Met Gly Ile Pro Cys Thr Ile Gly Ala Glu 290 295 300 312545PRTAzospirillum brasilense 312Met Lys Leu Ala Glu Ala Leu Leu Arg Ala Leu Lys Asp Arg Gly Ala1 5 10 15 Gln Ala Met Phe Gly Ile Pro Gly Asp Phe Ala Leu Pro Phe Phe Lys 20 25 30 Val Ala Glu Glu Thr Gln Ile Leu Pro Leu His Thr Leu Ser His Glu 35 40 45 Pro Ala Val Gly Phe Ala Ala Asp Ala Ala Ala Arg Tyr Ser Ser Thr 50 55 60 Leu Gly Val Ala Ala Val Thr Tyr Gly Ala Gly Ala Phe Asn Met Val65 70 75 80 Asn Ala Val Ala Gly Ala Tyr Ala Glu Lys Ser Pro Val Val Val Ile 85 90 95 Ser Gly Ala Pro Gly Thr Thr Glu Gly Asn Ala Gly Leu Leu Leu His 100 105 110 His Gln Gly Arg Thr Leu Asp Thr Gln Phe Gln Val Phe Lys Glu Ile 115 120 125 Thr Val Ala Gln Ala Arg Leu Asp Asp Pro Ala Lys Ala Pro Ala Glu 130 135 140 Ile Ala Arg Val Leu Gly Ala Ala Arg Ala Gln Ser Arg Pro Val Tyr145 150 155 160 Leu Glu Ile Pro Arg Asn Met Val Asn Ala Glu Val Glu Pro Val Gly 165 170 175 Asp Asp Pro Ala Trp Pro Val Asp Arg Asp Ala Leu Ala Ala Cys Ala 180 185 190 Asp Glu Val Leu Ala Ala Met Arg Ser Ala Thr Ser Pro Val Leu Met 195 200 205 Val Cys Val Glu Val Arg Arg Tyr Gly Leu Glu Ala Lys Val Ala Glu 210 215 220 Leu Ala Gln Arg Leu Gly Val Pro Val Val Thr 
Thr Phe Met Gly Arg225 230 235 240 Gly Leu Leu Ala Asp Ala Pro Thr Pro Pro Leu Gly Thr Tyr Ile Gly 245 250 255 Val Ala Gly Asp Ala Glu Ile Thr Arg Leu Val Glu Glu Ser Asp Gly 260 265 270 Leu Phe Leu Leu Gly Ala Ile Leu Ser Asp Thr Asn Phe Ala Val Ser 275 280 285 Gln Arg Lys Ile Asp Leu Arg Lys Thr Ile His Ala Phe Asp Arg Ala 290 295 300 Val Thr Leu Gly Tyr His Thr Tyr Ala Asp Ile Pro Leu Ala Gly Leu305 310 315 320 Val Asp Ala Leu Leu Glu Arg Leu Pro Pro Ser Asp Arg Thr Thr Arg 325 330 335 Gly Lys Glu Pro His Ala Tyr Pro Thr Gly Leu Gln Ala Asp Gly Glu 340 345 350 Pro Ile Ala Pro Met Asp Ile Ala Arg Ala Val Asn Asp Arg Val Arg 355 360 365 Ala Gly Gln Glu Pro Leu Leu Ile Ala Ala Asp Met Gly Asp Cys Leu 370 375 380 Phe Thr Ala Met Asp Met Ile Asp Ala Gly Leu Met Ala Pro Gly Tyr385 390 395 400 Tyr Ala Gly Met Gly Phe Gly Val Pro Ala Gly Ile Gly Ala Gln Cys 405 410 415 Val Ser Gly Gly Lys Arg Ile Leu Thr Val Val Gly Asp Gly Ala Phe 420 425 430 Gln Met Thr Gly Trp Glu Leu Gly Asn Cys Arg Arg Leu Gly Ile Asp 435 440 445 Pro Ile Val Ile Leu Phe Asn Asn Ala Ser Trp Glu Met Leu Arg Thr 450 455 460 Phe Gln Pro Glu Ser Ala Phe Asn Asp Leu Asp Asp Trp Arg Phe Ala465 470 475 480 Asp Met Ala Ala Gly Met Gly Gly Asp Gly Val Arg Val Arg Thr Arg 485 490 495 Ala Glu Leu Lys Ala Ala Leu Asp Lys Ala Phe Ala Thr Arg Gly Arg 500 505 510 Phe Gln Leu Ile Glu Ala Met Ile Pro Arg Gly Val Leu Ser Asp Thr 515 520 525 Leu Ala Arg Phe Val Gln Gly Gln Lys Arg Leu His Ala Ala Pro Arg 530 535 540 Glu545 313348PRTRhodococcus sp. 
ST-10 313Met Lys Ala Ile Gln Tyr Thr Arg Ile Gly Ala Glu Pro Glu Leu Thr1 5 10 15 Glu Ile Pro Lys Pro Glu Pro Gly Pro Gly Glu Val Leu Leu Glu Val 20 25 30 Thr Ala Ala Gly Val Cys His Ser Asp Asp Phe Ile Met Ser Leu Pro 35 40 45 Glu Glu Gln Tyr Thr Tyr Gly Leu Pro Leu Thr Leu Gly His Glu Gly 50 55 60 Ala Gly Lys Val Ala Ala Val Gly Glu Gly Val Glu Gly Leu Asp Ile65 70 75 80 Gly Thr Asn Val Val Val Tyr Gly Pro Trp Gly Cys Gly Asn Cys Trp 85 90 95 His Cys Ser Gln Gly Leu Glu Asn Tyr Cys Ser Arg Ala Gln Glu Leu 100 105 110 Gly Ile Asn Pro Pro Gly Leu Gly Ala Pro Gly Ala Leu Ala Glu Phe 115 120 125 Met Ile Val Asp Ser Pro Arg His Leu Val Pro Ile Gly Asp Leu Asp 130 135 140 Pro Val Lys Thr Val Pro Leu Thr Asp Ala Gly Leu Thr Pro Tyr His145 150 155 160 Ala Ile Lys Arg Ser Leu Pro Lys Leu Arg Gly Gly Ser Tyr Ala Val 165 170 175 Val Ile Gly Thr Gly Gly Leu Gly His Val Ala Ile Gln Leu Leu Arg 180 185 190 His Leu Ser Ala Ala Thr Val Ile Ala Leu Asp Val Ser Ala Asp Lys 195 200 205 Leu Glu Leu Ala Thr Lys Val Gly Ala His Glu Val Val Leu Ser Asp 210 215 220 Lys Asp Ala Ala Glu Asn Val Arg Lys Ile Thr Gly Ser Gln Gly Ala225 230 235 240 Ala Leu Val Leu Asp Phe Val Gly Tyr Gln Pro Thr Ile Asp Thr Ala 245 250 255 Met Ala Val Ala Gly Val Gly Ser Asp Val Thr Ile Val Gly Ile Gly 260 265 270 Asp Gly Gln Ala His Ala Lys Val Gly Phe Phe Gln Ser Pro Tyr Glu 275 280 285 Ala Ser Val Thr Val Pro Tyr Trp Gly Ala Arg Asn Glu Leu Ile Glu 290 295 300 Leu Ile Asp Leu Ala His Ala Gly Ile Phe Asp Ile Ser Val Glu Thr305 310 315 320 Phe Ser Leu Asp Asn Gly Ala Glu Ala Tyr Arg Arg Leu Ala Ala Gly 325 330 335 Thr Leu Ser Gly Arg Ala Val Val Val Pro Gly Leu 340 345 31431DNAArtificial SequencePrimer 314catgccatgg gactggctga ggcactgctg c 3131547DNAArtificial SequencePrimer 315cgagctcagg aggatatata tatgaaagct atccagtaca cccgtat 4731632DNAArtificial SequencePrimer 316cgagctctta ttcgcgcggt gccgcgtgca gg 3231734DNAArtificial SequencePrimer 317gctctagatt acaggcccgg aaccacaacg gcgc 3431846DNAArtificial 
SequencePrimer 318ccgctcgagg aggatatata tatgatttct aaaggcttta gcaccc 4631950DNAArtificial SequencePrimer 319acgtgatgta atctagagga ggatatatat atgagcaaag aaattaaagg 5032050DNAArtificial SequencePrimer 320tctttgctca tatatatatc ctcctctaga ttacatcacg tgttcagtac 5032132DNAArtificial SequencePrimer 321cgagctctta ttcggcgcca atggtgcacg gg 3232246DNAArtificial SequencePrimer 322ccgctcgagg aggatatata tatgatttct aaaggcttta gcaccc 4632332DNAArtificial SequencePrimer 323cgagctctta ttcggcgcca atggtgcacg gg 3232426DNAArtificial SequencePrimer 324cacccaagcg atagtttata tagcgt 2632520DNAArtificial SequencePrimer 325gaaatgaacg gatattacgt 2032619DNAArtificial SequencePrimer 326cggaacaggt gattgtggt 1932726DNAArtificial SequencePrimer 327caccgcccac ttcaagatga agctgt 2632826DNAArtificial SequencePrimer 328cacccaagcg atagtttata tagcgt 2632920DNAArtificial SequencePrimer 329gtggctaagt acatgccggt 2033035DNAArtificial SequencePrimer 330ggaattccat atgacaaaga atatgacgac taaac 3533132DNAArtificial SequencePrimer 331cgggatcctt attatttccc ctgccctgca gt 3233232DNAArtificial SequencePrimer 332ggaattccat atgagctatc aaccactttt ac 3233329DNAArtificial SequencePrimer 333cgggatcctt acagttgagc aaatgatcc 29

* * * * *

