Compositions And Methods For Producing Chemicals And Derivatives Thereof

Kambourakis; Spiros ;   et al.

Patent Application Summary

U.S. patent application number 15/377749 was filed with the patent office on 2017-03-30 for compositions and methods for producing chemicals and derivatives thereof. The applicant listed for this patent is Synthetic Genomics, Inc. Invention is credited to Benjamin M. Griffin, Spiros Kambourakis, Kevin V. Martin.

Application Number: 20170088865 15/377749
Document ID: /
Family ID: 51207986
Filed Date: 2017-03-30

United States Patent Application 20170088865
Kind Code A1
Kambourakis; Spiros ;   et al. March 30, 2017

COMPOSITIONS AND METHODS FOR PRODUCING CHEMICALS AND DERIVATIVES THEREOF

Abstract

The present invention provides methods for producing a product of one or more enzymatic pathways. The pathways used in the methods of the invention involve one or more conversion steps such as, for example, an enzymatic conversion of guluronic acid into D-glucarate (Step 7); an enzymatic conversion of 5-ketogluconate (5-KGA) into L-Iduronic acid (Step 15); an enzymatic conversion of L-Iduronic acid into Idaric acid (Step 7b); and an enzymatic conversion of 5-ketogluconate into 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step 16). In some embodiments the methods of the invention produce 2,5-furandicarboxylic acid (FDCA) as a product. The methods include both enzymatic and chemical conversions as steps. Various pathways are also provided for converting glucose into 5-dehydro-4-deoxy-glucarate (DDG), and for converting glucose into 2,5-furandicarboxylic acid (FDCA). Additional products that can be produced include metabolic products such as, but not limited to, guluronic acid, L-iduronic acid, idaric acid, and glucaric acid.


Inventors: Kambourakis; Spiros; (San Diego, CA) ; Griffin; Benjamin M.; (San Diego, CA) ; Martin; Kevin V.; (Solana Beach, CA)
Applicant: Synthetic Genomics, Inc. (La Jolla, CA, US)
Family ID: 51207986
Appl. No.: 15/377749
Filed: December 13, 2016

Related U.S. Patent Documents

Application Number Filing Date Patent Number
14222453 Mar 21, 2014 9528133
15377749
14033300 Sep 20, 2013 9506090
14222453
61704408 Sep 21, 2012

Current U.S. Class: 1/1
Current CPC Class: C12N 15/52 20130101; C07D 307/33 20130101; C07D 309/30 20130101; C12P 7/58 20130101; C12P 7/42 20130101
International Class: C12P 7/58 20060101 C12P007/58; C12P 7/42 20060101 C12P007/42

Claims



1. A method for producing a product of an enzymatic or chemical pathway from a starting substrate, the pathway comprising one or more conversion steps selected from the group consisting of: the conversion of DTHU to DDG (Step-5); the conversion of gluconic acid to guluronic acid (Step-6); the conversion of DEHU to DDH (Step 7A); and the conversion of guluronic acid to DEHU (Step 17A).

2. The method of claim 1 wherein the substrate is glucose and the product is DDG, comprising the steps of: the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to gluconic acid (Step 1a); the conversion of gluconic acid to 3-dehydro-gluconic (DHG) (Step-2); the conversion of 3-dehydro-gluconic (DHG) to 4,6-Dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step-3); the conversion of 2,5 DDH to 4-deoxy-5-threo-hexosulose uronate (DTHU) (Step 4); and the conversion of DTHU to DDG (Step-5).

3. The method of claim 1 wherein the substrate is glucose and the product is DDG, comprising the steps of: the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to gluconic acid (Step 1a); the conversion of gluconic acid to guluronic acid (Step-6); the conversion of guluronic to glucarate (Step-7); and the conversion of glucarate to DDG (Step-8).

4. The method of claim 1 wherein the substrate is glucose and the product is DDG, comprising the steps of: the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to gluconic acid (Step 1a); the conversion of gluconic acid to 5-ketogluconate (5-KGA) (Step 14); the conversion of 5-ketogluconate (5-KGA) to 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step 16); the conversion of 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) to 4-deoxy-5-threo-hexosulose uronate (DTHU) (Step 4); and the conversion of 4-deoxy-5-threo-hexosulose uronate (DTHU) to DDG (Step 5).

5. The method of claim 1 wherein the substrate is glucose and the product is DDG, comprising the steps of: the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to gluconic acid (Step 1a); the conversion of gluconic acid to 5-ketogluconate (5-KGA) (Step 14); the conversion of 5-ketogluconate (5-KGA) to L-Iduronic acid (Step 15); the conversion of L-Iduronic acid to 4-deoxy-5-threo-hexosulose uronate (DTHU) (Step 7B); and the conversion of 4-deoxy-5-threo-hexosulose uronate (DTHU) to DDG (Step 5).

6. The method of claim 1 wherein the substrate is glucose and the product is DDH, comprising the steps of: the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to guluronic acid lactone (Step 19); the conversion of guluronic acid lactone to guluronic acid (Step 1B); the conversion of guluronic acid to DEHU (Step 17A); the conversion of DEHU to DDH (Step 7A).

7. The method of claim 1 wherein the substrate is glucose and the product is DDH, comprising the steps of: the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to gluconic acid (Step 1a); the conversion of gluconic acid to guluronic acid (Step 6); the conversion of guluronic acid to DEHU (Step 17A); and the conversion of DEHU to DDH (Step 7A).

8. The method of claim 1 wherein the one or more conversion steps is the conversion of DTHU to DDG (Step-5).

9. The method of claim 1 wherein the one or more conversion steps is the conversion of gluconic acid to guluronic acid (Step-6).

10. The method of claim 1 wherein the one or more conversion steps is the conversion of DEHU to DDH (Step 7A).

11. The method of claim 1 wherein the one or more conversion steps is the conversion of guluronic acid to DEHU (Step 17A).
Description



CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a divisional application of U.S. application Ser. No. 14/222,453 filed Mar. 21, 2014, now issued as U.S. Pat. No. 9,528,133; which is a continuation-in-part of U.S. application Ser. No. 14/033,300 filed Sep. 20, 2013, now issued as U.S. Pat. No. 9,506,090; which claims the benefit under 35 U.S.C. § 119(e) to U.S. Application Ser. No. 61/704,408 filed Sep. 21, 2012, now expired. The disclosure of each of the prior applications is considered part of and is incorporated by reference in the disclosure of this application.

INCORPORATION OF SEQUENCE LISTING

[0002] The material in the accompanying Sequence Listing is hereby incorporated by reference into this application. The accompanying sequence listing text file named SGI1660-3_ST.25.txt was created on Dec. 6, 2016 and is 191 KB. The file can be accessed using Microsoft Word on a computer that uses Windows OS.

BACKGROUND OF THE INVENTION

[0003] In recent years, increasing effort has been devoted to identifying new and effective ways to use renewable feedstocks for the production of organic chemicals. Among a plethora of downstream chemical processing technologies, the conversion of biomass-derived sugars to value-added chemicals is considered very important. In particular, six-carbon carbohydrates, i.e., hexoses such as fructose and glucose, are widely recognized as the most abundant monosaccharides existing in nature, and can therefore be suitably and economically used as chemical feedstocks.

[0004] The production of furans and furan derivatives from sugars has attracted increasing attention in chemistry and in catalysis studies, and is believed to have the potential to provide one of the major routes to achieving sustainable energy supply and chemicals production. Indeed, dehydration and/or oxidation of the sugars available within biorefineries with integrated biomass conversion processes can lead to a large family of products including a wide range of furans and furan derivatives.

[0005] Among the furans of greatest commercial value, furan-2,5-dicarboxylic acid (also known as 2,5-furandicarboxylic acid, hereinafter abbreviated as FDCA) is a valuable intermediate with various uses in several industries including pharmaceuticals, pesticides, antibacterial agents, fragrances, agricultural chemicals, as well as in a wide range of manufacturing applications of polymer materials, e.g., bioplastic resins. As such, FDCA is considered a green alternative to terephthalic acid (TPA), a petroleum-based monomer that is one of the largest-volume petrochemicals produced yearly worldwide. In fact, the US Department of Energy has identified FDCA as one of the top 12 priority value-added chemicals that can be made from sugars for establishing the "green" chemistry of the future, and as such, it has been named one of the "sleeping giants" of the renewable intermediate chemicals (Werpy and Petersen, Top Value Added Chemicals from Biomass. US Department of Energy, Biomass, Vol 1, 2004).

[0006] Although various methods have been proposed for commercial scale production of FDCA (for review, see, e.g., Tong et al., Appl. Catalysis A: General, 385, 1-13, 2010), the main industrial synthesis of FDCA currently relies on a chemical dehydration of hexoses, such as glucose or fructose, to the intermediate 5-hydroxymethylfurfural (5-HMF), followed by a chemical oxidation to FDCA. However, it has been reported that current FDCA production processes via dehydration are generally nonselective unless the unstable intermediate products can be transformed to more stable materials immediately upon their formation. Thus, the primary technical barrier in the production and use of FDCA is the development of an effective and selective dehydration process from biomass-derived sugars.

[0007] It is therefore desirable to develop methods for production of this highly important compound, as well as many other chemicals and metabolites, by alternative means that not only would substitute renewable for petroleum-based feedstocks, but also use less energy and capital-intensive technologies. In particular, the selective control of sugar dehydration could be a very powerful technology, leading to a wide range of additional, inexpensive building blocks.

SUMMARY OF THE INVENTION

[0008] The present invention provides methods for producing a product of one or more enzymatic pathways. The pathways used in the methods of the invention involve one or more conversion steps such as, for example, an enzymatic conversion of guluronic acid into D-glucarate (Step 7); an enzymatic conversion of 5-ketogluconate (5-KGA) into L-Iduronic acid (Step 15); an enzymatic conversion of L-Iduronic acid into Idaric acid (Step 7b); an enzymatic conversion of 5-ketogluconate into 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step 16); and an enzymatic conversion of 1,5-gluconolactone to gulurono-lactone (Step 19). In some embodiments the methods of the invention produce 2,5-furandicarboxylic acid (FDCA) as a product. The methods can include both enzymatic and chemical conversions as steps. Various pathways are also provided for converting glucose or fructose or sucrose or galactose into 5-dehydro-4-deoxy-glucarate (DDG), and for converting the same sugars into FDCA. The methods can also involve the use of engineered enzymes that perform reactions with high specificity and efficiency.

[0009] In a first aspect the invention provides a method for producing a product of an enzymatic or chemical pathway from a starting substrate. The pathway can contain any one or more of the following conversion steps: an enzymatic conversion of guluronic acid into D-glucarate (Step 7); an enzymatic conversion of 5-ketogluconate (5-KGA) into L-Iduronic acid (Step 15); an enzymatic conversion of L-Iduronic acid into Idaric acid (Step 7b); an enzymatic conversion of 5-ketogluconate into 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step 16); and an enzymatic conversion of 1,5-gluconolactone to gulurono-lactone (Step 19).

[0010] In one embodiment the product of the enzymatic pathway is 5-dehydro-4-deoxy-glucarate (DDG). In various embodiments the substrate of the method can be glucose, and the product can be 5-dehydro-4-deoxy-glucarate (DDG). The method can involve the steps of the enzymatic conversion of D-glucose to 1,5-gluconolactone (Step 1); the enzymatic conversion of 1,5-gluconolactone to gulurono-lactone (Step 19); the enzymatic conversion of gulurono-lactone to guluronic acid (Step 1B); the enzymatic conversion of guluronic acid to D-glucarate (Step 7); and the enzymatic conversion of D-glucarate to 5-dehydro-4-deoxy-glucarate (DDG) (Step 8).

[0011] In another method of the invention the substrate is glucose and the product is DDG, and the method involves the steps of: the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to gluconic acid (Step 1a); the conversion of gluconic acid to 5-ketogluconate (5-KGA) (Step 14); the conversion of 5-ketogluconate (5-KGA) to L-Iduronic acid (Step 15); the conversion of L-Iduronic acid to Idaric acid (Step 7b); and the conversion of Idaric acid to DDG (Step 8a).

[0012] In another method of the invention the substrate is glucose and the product is DDG and the method involves the steps of the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to gluconic acid (Step 1a); the conversion of gluconic acid to 5-ketogluconate (5-KGA) (Step 14); the conversion of 5-ketogluconate (5-KGA) to 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step 16); the conversion of 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) to 4-deoxy-5-threo-hexosulose uronate (DTHU) (Step 4); and the conversion of 4-deoxy-5-threo-hexosulose uronate (DTHU) to DDG (Step 5).

[0013] In another method of the invention the substrate is glucose and the product is DDG, and the method involves the steps of: the conversion of D-glucose to 1,5-gluconolactone (Step 1); the conversion of 1,5-gluconolactone to gluconic acid (Step 1a); the conversion of gluconic acid to 5-ketogluconate (5-KGA) (Step 14); the conversion of 5-ketogluconate (5-KGA) to L-Iduronic acid (Step 15); the conversion of L-Iduronic acid to 4-deoxy-5-threo-hexosulose uronate (DTHU) (Step 7B); and the conversion of 4-deoxy-5-threo-hexosulose uronate (DTHU) to DDG (Step 5).

[0014] Any of the methods disclosed herein can further involve the step of converting the DDG to 2,5-furan-dicarboxylic acid (FDCA). Converting the DDG to FDCA in any of the methods can involve contacting DDG with an inorganic acid to convert the DDG to FDCA.
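The DDG-to-FDCA conversion described above can be checked for atom balance. The following is a minimal sketch, assuming the free-acid molecular formulas C6H8O7 for DDG and C6H4O5 for FDCA and the loss of two water molecules per DDG; these formulas and the function names are illustrative assumptions and are not stated in this text.

```python
from collections import Counter

# Assumed molecular formulas (free-acid forms); not given in the source text.
DDG = Counter({"C": 6, "H": 8, "O": 7})
FDCA = Counter({"C": 6, "H": 4, "O": 5})
WATER = Counter({"H": 2, "O": 1})


def total_atoms(species):
    """Sum atom counts over a list of (coefficient, formula) pairs."""
    total = Counter()
    for coeff, formula in species:
        for element, count in formula.items():
            total[element] += coeff * count
    return total


def is_balanced(reactants, products):
    """Return True when both sides contain the same atoms."""
    return total_atoms(reactants) == total_atoms(products)


# Assumed overall reaction: DDG -> FDCA + 2 H2O
balanced = is_balanced([(1, DDG)], [(1, FDCA), (2, WATER)])
print(balanced)
```

Under these assumed formulas the check confirms that the conversion is a net double dehydration, with all six carbon atoms of DDG retained in FDCA.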

[0015] In another aspect the invention provides a method for synthesizing derivatized (esterified) FDCA. The method involves contacting DDG with an alcohol and an inorganic acid at a temperature in excess of 60 °C to form derivatized FDCA. In different embodiments the alcohol is methanol, butanol or ethanol.

[0016] In another aspect the invention provides a method for synthesizing a derivative of FDCA. The method involves contacting DDG with an alcohol, an inorganic acid, and a co-solvent to produce a derivative of DDG; optionally purifying the derivative of DDG; and contacting the derivative of DDG with an inorganic acid to produce a derivative of FDCA. The inorganic acid can be sulfuric acid and the alcohol can be ethanol or butanol. In various embodiments the co-solvent can be any of THF, acetone, acetonitrile, an ether, butyl acetate, a dioxane, chloroform, methylene chloride, 1,2-dichloroethane, a hexane, toluene, and a xylene.

[0017] In one embodiment the derivative of DDG is di-ethyl DDG and the derivative of FDCA is di-ethyl FDCA, and in another embodiment the derivative of DDG is di-butyl DDG and the derivative of FDCA is di-butyl FDCA.

[0018] In another aspect the invention provides a method for synthesizing FDCA. The method involves contacting DDG with an inorganic acid in a gas phase.

[0019] In another aspect the invention provides a method for synthesizing FDCA. The method involves contacting DDG with an inorganic acid at a temperature in excess of 120 °C.

[0020] In another aspect the invention provides a method for synthesizing FDCA. The method involves contacting DDG with an inorganic acid under anhydrous reaction conditions.

DESCRIPTION OF THE DRAWINGS

[0021] FIG. 1 is an electrophoretic gel of crude lysates and purified enzymes of proteins 474, 475, and 476.

[0022] FIGS. 2A-H are schematic illustrations of the pathways of Routes 1, 2, 2A, 2C, 2D, 2E, 2F, respectively.

[0023] FIGS. 3A-C present schematic illustrations of the pathways of Routes 3, 4, and 5, respectively.

[0024] FIG. 4 is an HPLC-MS analysis of the dehydration of gluconate with gluconate dehydratase to produce DHG by pSGI-359.

[0025] FIG. 5 is a graphical illustration of semicarbazide assay plots for measuring the activity of gluconate dehydratases.

[0026] FIGS. 6A-6B provide Lineweaver-Burk plots for the oxidation of glucuronate and iduronate with three enzymes of the invention.

[0027] FIG. 7A shows the results of an HPLC analysis of time points for the isomerization of 5KGA and iduronate using DTHU isomerases in the EC 5.3.1.17 family. Controls: dead enzyme is a control with heat inactivated enzyme; Med Bl refers to reactions without isomerase addition. Time points, X axis: 1=0.5 h; 2=1 h; 3=2 h; 4=16 h. FIG. 7B shows an HPLC analysis of time points for the isomerization of 5KGA and iduronate using enzymes in the EC 5.3.1.17 family. Controls: dead enzyme is a control with heat inactivated enzyme; Med Bl refers to reactions without isomerase addition. Time points, X axis: 1=0 h; 2=1 h; 3=2 h; 4=17 h.

[0028] FIG. 8 shows product formation for the isomerization of 5KGA and iduronate with enzymes in the EC 5.3.1.n1 family. The data were obtained from enzymatic assays.

[0029] FIG. 9 shows HPLC analysis of the formation of 2,5-DDH and the reduction of 5KGA concentration over time. Total ion counts for 2,5-DDH are shown.

[0030] FIG. 10 is an HPLC-MS chromatogram showing the production of guluronic acid lactone from 1,5-gluconolactone. An overlay of a trace of authentic guluronic acid is shown.

[0031] FIG. 11 is a schematic illustration of the Scheme 6 reaction pathway.

[0032] FIGS. 12A and 12B are LC-MS chromatograms showing 5-KGA and DDG reaction products, respectively.

[0033] FIG. 13 is an LC-MS chromatogram showing FDCA and FDCA dibutyl ester derivative reaction products.

[0034] FIG. 14A is a GC-MS analysis of a crude reaction sample of the diethyl-FDCA synthesis from the reaction of DDG with ethanol. Single peak corresponded to diethyl-FDCA. FIG. 14B is an MS fragmentation of the major product from the reaction of DDG with ethanol.

[0035] FIG. 15A is a GC-MS analysis of a crude reaction sample of the diethyl-FDCA synthesis from the reaction of DDG with ethanol. Single peak corresponded to diethyl-FDCA. FIG. 15B is an MS fragmentation of the major product from the reaction of DDG with ethanol.

[0036] FIG. 16 is a schematic illustration of the synthesis of FDCA and its derivatives from DTHU.

[0037] FIG. 17A is a schematic illustration of Scheme 1. Cell free enzymatic synthesis of DDG from glucose. Enzymes are ST-1: glucose oxidase; ST-1A: hydrolysis-chemical; ST-14: gluconate dehydrogenase (pSGI-504); ST-15: 5-dehydro-4-deoxy-D-glucuronate isomerase (DTHU IS, pSGI-434); ST-7B: Uronate dehydrogenase (UroDH, pSGI-476); ST-8A: Glucarate dehydratase (GlucDH, pSGI-353); ST-A: NAD(P)H oxidase (NADH_OX, pSGI-431); ST-B: Catalase. FIG. 17B shows the concentration of reaction intermediates over the first 3 h as analyzed by HPLC. Formation of DDG is shown in both reactions.

DETAILED DESCRIPTION OF THE INVENTION

[0038] The present invention provides methods for producing a product of an enzymatic pathway. The methods can comprise the enzymatic conversion of a substrate into a product. By utilizing the enzymatic and chemical pathways of the invention it is possible to synthesize a wide variety of products in a highly efficient and economical manner. One product that can be produced by the methods and pathways of the invention is 2,5-furanyl dicarboxylic acid (FDCA), which can be produced at commercial scales according to the invention. The methods can comprise one or more enzymatic and/or chemical substrate-to-product conversion steps disclosed herein. In some embodiments the enzymes utilized perform enzymatic conversion steps using activities unknown for the enzymes. These novel activities can therefore be employed in the invention to perform the conversion steps and perform a substrate to product conversion as part of an enzymatic and/or chemical pathway. Any of the products of any of the pathways disclosed herein (e.g., DDG, iduronic acid, idaric acid, glucaric acid, FDCA, etc.) can be produced on a commercial scale, i.e., in quantities of at least 1 gram or at least 10 grams or at least 100 grams or at least 1 kg in a single bioreactor or reaction vessel, as disclosed herein.

[0039] The pathways of the invention are comprised of any one or more of the steps disclosed herein. It is understood that a step of a pathway of the invention can involve the forward reaction or the reverse reaction, i.e., the substrate A being converted into product B, while in the reverse reaction substrate B is converted into product A. In the methods both the forward and the reverse reactions are described as the step unless otherwise noted.
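The forward and reverse readings of a step can be sketched as a small data structure. This is an illustrative model only; the `Step` class and its `reverse` method are hypothetical names introduced here, and Step 7 is used as the example because its substrate and product are named in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One conversion step; the label is the step name used in the text."""
    label: str
    substrate: str
    product: str

    def reverse(self) -> "Step":
        """Return the same step run in the opposite direction."""
        return Step(self.label, self.product, self.substrate)


# Forward reaction: substrate A (guluronic acid) -> product B (D-glucarate)
step_7 = Step("Step 7", "guluronic acid", "D-glucarate")

# Reverse reaction: substrate B -> product A, under the same step label
step_7_reverse = step_7.reverse()
print(step_7_reverse.substrate, "->", step_7_reverse.product)
```

Keeping the same label for both directions mirrors the statement that both the forward and the reverse reactions are described as the step.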

[0040] The methods involve producing a product of a pathway, which can be an enzymatic pathway. The methods involve one or more enzymatic and/or chemical conversion steps, which convert a substrate to a product. Steps that can be included in the methods include, for example, any one or more of: an enzymatic conversion of guluronic acid into D-glucarate (Step 7); an enzymatic conversion of L-Iduronic acid to 4-deoxy-5-threo-hexosulose uronate (DTHU) (Step 17); an enzymatic conversion of 5-ketogluconate (5-KGA) into L-Iduronic acid (Step 15); an enzymatic conversion of L-Iduronic acid into Idaric acid (Step 7B); an enzymatic conversion of 5-ketogluconate into 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step 16); and an enzymatic conversion of 1,5-gluconolactone to gulurono-lactone (Step 19). Any one or more of the aforementioned steps can be included in a method or pathway of the invention. An enzymatic step or pathway is a step or pathway that requires an enzyme as a catalyst in the reaction to make the step proceed. Chemical steps can be performed without an enzyme as a catalyst in the reaction. Any one or more of the steps recited in the methods can be an enzymatic step. In some embodiments every step of the pathway is an enzymatic step, while in other embodiments one or more steps in the pathway is a chemical step.

[0041] In some embodiments any of the methods can include a step involving the addition of the substrate of the reaction to a reaction mix containing the enzyme that performs the conversion. Thus the method of converting guluronic acid into D-glucarate (Step 7) can involve the addition of guluronic acid as starting substrate to the reaction mix; the enzymatic conversion of L-Iduronic acid to Idaric acid (Step 7B) can involve the addition of L-Iduronic acid as starting substrate to the reaction mix; the enzymatic conversion of L-Iduronic acid to 4-deoxy-5-threo-hexosulose uronate (DTHU) (Step 17) can involve the addition of L-Iduronic acid as starting substrate to the reaction mix. Any of the methods can involve a step of adding glucose, fructose, galactose, sucrose, or mannose or another mono- or di-saccharide to the reaction mixture. Another step that can be included in any of the methods is a step of purifying from the reaction mixture a reaction product. Thus, a step of purifying glucaric acid/D-glucarate or L-Iduronic acid/iduronate, or Idaric acid, or 2,5-diketo hexanedioic/DKHA can be included in any of the methods described herein. Any of the methods disclosed can include a step of isolating or purifying DDG or FDCA from the reaction mixture. And any of the methods can involve a step of adding an enzyme that performs any one or more of the steps described herein to the reaction mixture. A reaction mixture is a mixture of at least one substrate and at least one enzyme and involves the conversion of at least one substrate into at least one enzyme product. Any of the methods can involve a step of adding an isolated enzyme to a reaction mix, the enzyme performing a substrate to product conversion step of a pathway of the invention, and the isolated enzyme being at least 10% purified or at least 20% purified or at least 25% purified or at least 50% purified or at least 70% purified or at least 80% purified or at least 90% purified, all w/w.

[0042] Since many sugars can be converted into other sugars any of the methods or pathways of the invention can involve the use of glucose, sucrose, fructose or galactose as the starting substrate. Thus, in any pathway or reaction disclosed herein where glucose is the starting substrate it is understood that fructose or sucrose or galactose or mannose or another starting substrate can also be a starting substrate for that pathway or reaction. In some embodiments the sugar is converted into glucose which then enters the pathway but in other embodiments the pathway begins with fructose or sucrose or galactose or mannose or another mono- or di-saccharide.

[0043] The reactions of the invention can occur in a lysate of cells or a cell-free lysate that contains one or more enzymes that perform the enzymatic conversion, but can also occur in a reaction mixture containing components added by the user to form a reaction mixture, or can contain components purified from a cell lysate, or may be contained in a whole cell biocatalyst. The reaction can also occur in a mix made of purified components that have been combined, such as in a mix where the substrate and enzyme were combined to form the reaction mix. The reactions can occur in an in vitro reaction or can occur in a recombinant cell, and therefore the product(s) can be harvested by lysing the cells or by collecting from the culture medium. The reactions can occur in a laboratory container or reaction vessel such as, for example, a centrifuge tube, a test tube, a vial, a beaker, or a glass or metal or plastic container or reactor, a fermenter or fermentation vessel or bioreactor, an algae pond, any of which can be small scale or large scale. Any of the organisms described herein can be utilized as host cells to produce the product of a step or pathway of the invention. The organisms can also be used to produce one or more enzymes of the invention for use in a method of the invention. Various types of organisms can be used. Examples include: bacteria of the family Acetobacteraceae (e.g., bacteria of the genus Acetobacter, Acidiphilium, Gluconobacter, Gluconacetobacter), or bacteria of the family Pseudomonadaceae (e.g., genus Azotobacter, Pseudomonas), or bacteria of the family Enterobacteriaceae (e.g., of the genus Escherichia (e.g., E. coli), Klebsiella). Yeast can also be used for these purposes such as yeast of the genera Saccharomyces, Ashbya, Kluyveromyces, Lachancea, Zygosaccharomyces, Candida, Pichia, Arxula or Trichosporon or Blastobotrys. 
Cyanobacteria can also be used such as those of the genus Cyanothece (e.g., Cyanothece strains ATCC 51142, PCC 7424, PCC 7425, PCC 7822, PCC 8801, PCC 8802), or Microcystis or Synechococcus (e.g., strains elongatus PCC 7942, PCC 7002, PCC 6301, CC9311, CC9605, CC9902, JA-2-3B'a(2-13), JA-3-3Ab, RCC307, WH 7803, WH 8102) or Synechocystis, or Thermosynechococcus. Thus the present invention provides recombinant host cells comprising a recombinant nucleic acid of one or more of SEQ ID NOs: 4-6, 20-32, 36-38, 47-54, 56, 62-66, 69-70, 72, and 79-84 or a codon-optimized sequence of any of SEQ ID NOs: 1-84. The host cells can also contain a vector of the invention described herein. A "codon optimized" sequence refers to changes in the codons of a sequence to those preferentially used in a particular organism so that the encoded protein is efficiently expressed in the organism carrying the sequence. The recombinant nucleic acid sequence can be comprised on a vector, as disclosed herein.
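The idea of a codon-optimized sequence can be illustrated with a toy re-encoding. The following is a minimal sketch, assuming a hypothetical host preference table (`PREFERRED_CODON` holds placeholder choices, not measured codon usage for any organism) and a deliberately partial genetic code covering only the codons in the example; practical codon optimization also weighs factors that are not modeled here.

```python
# Partial standard genetic code: only the codons used in this example.
CODON_TO_AA = {
    "ATG": "M",
    "AAA": "K", "AAG": "K",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Hypothetical host preference table (placeholder values, not measured data).
PREFERRED_CODON = {"M": "ATG", "K": "AAA", "A": "GCG", "L": "CTG", "*": "TAA"}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna):
    """Swap each codon for the preferred synonym; the protein is unchanged."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


original = "ATGAAGGCTTTATAG"  # made-up sequence for illustration
optimized = codon_optimize(original)
print(original, "->", optimized)
print(translate(original) == translate(optimized))
```

The nucleotide sequence changes while the encoded amino acid sequence stays the same, which is the property the definition above relies on.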

[0044] In various embodiments the methods of the invention are methods of converting glucose or fructose or sucrose or galactose to DDG, or glucose or fructose or sucrose or galactose to FDCA, or glucose or fructose or sucrose or galactose to DTHU or DEHU, or for converting DDG to FDCA. The methods can involve converting the starting substrate in the method into the product. The starting substrate is the chemical entity considered to begin the method and the product is the chemical entity considered to be the final end product of the method. Intermediates are those chemical entities that are created in the method (whether transiently or permanently) and that are present in the reaction pathway between the starting substrate and the product. In various embodiments the methods and pathways of the invention have about four or about five intermediates or 4-5 intermediates, or about 3 intermediates, or 3-5 intermediates, or less than 6 or less than 7 or less than 8 or less than 9 or less than 10 or less than 15 or less than 20 intermediates, meaning these values not counting the starting substrate or the final end product.
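The counting convention for intermediates can be made concrete with a short sketch. The example below uses the glucose-to-DDG pathway of Steps 1, 1a, 14, 16, 4 and 5 described earlier; the function name `intermediates` and the list representation are illustrative assumptions.

```python
def intermediates(pathway):
    """Return the compounds strictly between the starting substrate and the product."""
    return pathway[1:-1]


# Compounds in pathway order for Steps 1, 1a, 14, 16, 4 and 5.
pathway = [
    "D-glucose",                 # starting substrate
    "1,5-gluconolactone",
    "gluconic acid",
    "5-ketogluconate (5-KGA)",
    "2,5-DDH",
    "DTHU",
    "DDG",                       # final end product
]

count = len(intermediates(pathway))
print(count)
```

Because the starting substrate and the final end product are excluded, this seven-compound pathway counts as having five intermediates.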

[0045] The invention provides methods of producing FDCA and/or DDG, from glucose or fructose or sucrose or galactose that have high yields. The theoretical yield is the amount of product that would be formed if the reaction went to completion under ideal conditions. In different embodiments the methods of the invention produce DDG from glucose, fructose, or galactose with a theoretical yield of at least 50% molar, or at least 60% molar or at least 70% molar, or at least 80% molar, at least 90% molar or at least 95% molar or at least 97% molar or at least 98% molar or at least 99% molar, or a theoretical yield of 100% molar. The methods of the invention also can provide product with a carbon conservation of at least 80% or at least 90% or at least 95% or at least 97% or at least 98% or at least 99% or 100%, meaning that the particular carbon atoms present in the initial substrate are present in the end product of the method at the recited percentage. In some embodiments the methods produce DDG and/or FDCA from glucose or fructose or sucrose or galactose via dehydration reactions.
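The yield and carbon conservation figures above are simple ratios. The following is a minimal sketch, assuming a hypothetical run in which 10 mmol of glucose gives 9.5 mmol of DDG (illustrative numbers only, not data from this application) and a 1:1 molar stoichiometry; the function names are likewise assumptions.

```python
def molar_yield_percent(product_mol, substrate_mol, product_per_substrate=1.0):
    """Molar yield as a percentage of the stoichiometric maximum."""
    return 100.0 * product_mol / (substrate_mol * product_per_substrate)


def carbon_conservation_percent(carbons_in_product, carbons_in_substrate):
    """Share of substrate carbon atoms retained in the product."""
    return 100.0 * carbons_in_product / carbons_in_substrate


# Hypothetical run: 10 mmol glucose in, 9.5 mmol DDG out.
yield_pct = molar_yield_percent(product_mol=9.5e-3, substrate_mol=10e-3)

# Glucose and DDG are both six-carbon compounds.
carbon_pct = carbon_conservation_percent(6, 6)

print(round(yield_pct, 1), carbon_pct)
```

A pathway that only rearranges, oxidizes, or dehydrates a hexose without cleaving a carbon-carbon bond keeps all six carbons, which is why 100% carbon conservation is attainable for glucose to DDG.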

Example Synthesis Routes

[0046] The invention also provides specific pathways for synthesizing and producing a desired product. Any of the following described routes or pathways can begin with glucose or fructose or sucrose or galactose or mannose and flow towards a desired product. In some embodiments D-glucose is the starting substrate and the direction of the pathway towards any intermediate or final product of the pathway is considered to be in the downstream direction, while the opposite direction towards glucose is considered the upstream direction. It will be realized that routes or pathways can flow in either the downstream or upstream direction. While glucose is used as an example starting substrate for pathways described herein, it is also understood that sucrose, fructose, galactose, or mannose or any intermediate in any of the pathways can also be the starting substrate in any method of the invention, and DDG, DTHU, FDCA, or any intermediate in any of the routes or pathways of the invention can be the final end product of a method of the invention. The disclosed methods therefore include any one or more steps disclosed in any of the routes or pathways of the invention for converting any starting substrate or intermediate into any end product or intermediate in the disclosed routes or pathways using one or more of the steps in the disclosed routes or pathways. Thus, for example the methods can be methods for converting glucose or fructose or sucrose or galactose or mannose to DDG, or to guluronic acid, or to galactarate, or to DTHU, or to DEHU, or to iduronic acid, or to idaric acid, or to glucaric acid, or for converting galactarate to DDG, or for converting guluronic acid to D-glucarate, or for converting 5-KGA to L-Iduronic acid, or for converting L-Iduronic acid to Idaric acid, or for converting 5-KGA to 2,5-DDH or DTHU, or for converting DHG to DEHU. In these embodiments the methods utilize the steps disclosed in the methods and pathways of the invention from starting substrate to the relevant end product. One or more of the steps can also be utilized in methods flowing in the "opposite" or upstream direction from the pathways disclosed herein.

[0047] Route 1 is illustrated in FIG. 2A. Route 1 converts D-glucose (or any intermediate in the pathway) into 5-dehydro-4-deoxy-glucarate (DDG) via an enzymatic pathway comprising a series of indicated steps. Route 1 converts D-glucose into DDG via a pathway having 1,5-gluconolactone, gluconic acid, 3-dehydro-gluconic acid (DHG), 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH), and 4-deoxy-L-threo-hexosulose uronate (DTHU) as intermediates and DDG as the final end product. For any of the pathways additional intermediates not shown can also be present. The steps are the enzymatic conversion of D-glucose to 1,5-gluconolactone (Step 1); the enzymatic conversion of 1,5-gluconolactone to gluconic acid (Step 1A); the enzymatic conversion of gluconic acid to 3-dehydro-gluconic acid (DHG) (Step 2); the enzymatic conversion of 3-dehydro-gluconic acid (DHG) to 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step 3); the enzymatic conversion of 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) to 4-deoxy-L-threo-hexosulose uronate (DTHU) (Step 4); and the enzymatic conversion of 4-deoxy-L-threo-hexosulose uronate (DTHU) to 5-dehydro-4-deoxy-glucarate (DDG) (Step 5). Route 1 also comprises sub-routes where the glucose or any intermediate in the pathway as a substrate is converted into any other downstream intermediate as final product, and each substrate to product sub-route is considered disclosed as if each is set forth herein in full.

[0048] Route 2 is illustrated in FIG. 2B and converts D-glucose into DDG. The steps in the Route 2 pathway are the enzymatic conversion of D-glucose into 1,5-gluconolactone (Step 1); the enzymatic conversion of 1,5-gluconolactone to gluconic acid (Step 1A); the enzymatic conversion of gluconic acid to guluronic acid (Step 6); the enzymatic conversion of guluronic acid to D-glucarate (Step 7); the enzymatic conversion of D-glucarate to DDG (Step 8). Route 2 also comprises sub-routes where glucose or any intermediate in the pathway as substrate is converted into any other downstream intermediate as final product, and each sub-route is considered disclosed as if each is set forth herein in full. For example in some embodiments the methods comprise steps for the conversion of glucose or gluconic acid as substrate into guluronic acid or D-glucarate as product using one or more of the steps described in Route 2.

[0049] Route 2A is illustrated in FIG. 2C. The steps in Route 2A are the enzymatic conversion of D-glucose to 1,5-gluconolactone (Step 1); the enzymatic conversion of 1,5-gluconolactone to guluronic acid lactone (Step 19); the enzymatic conversion of guluronic acid lactone to guluronic acid (Step 1B); the enzymatic conversion of guluronic acid to D-glucarate (Step 7); the enzymatic conversion of D-glucarate to 5-dehydro-4-deoxy-glucarate (DDG) (Step 8). Route 2A also comprises sub-routes where glucose or any intermediate in the pathway as starting substrate is converted into any other downstream intermediate as final end product, and each sub-route is considered disclosed as if each is set forth herein in full. For example in some embodiments the methods comprise steps for the conversion of glucose or guluronic acid lactone as substrate into glucarate or DDG as product using one or more of the steps described in Route 2A.

[0050] Route 2B is illustrated in FIG. 2D. The steps in Route 2B are the enzymatic conversion of D-glucose into gluconic acid (Steps 1 and 1A); the enzymatic conversion of gluconic acid into 5-ketogluconate (5-KGA) (Step 14); the enzymatic conversion of 5-KGA into L-Iduronic acid (Step 15); the enzymatic conversion of L-Iduronic acid into Idaric acid (Step 7B); the enzymatic conversion of Idaric acid into DDG (Step 8A). Route 2B also comprises sub-routes where glucose or any intermediate in the pathway as starting substrate is converted into any other downstream intermediate as final end product, and each sub-route is considered disclosed as if each is set forth herein in full. For example in some embodiments the methods comprise steps for the conversion of glucose or 5-KGA as substrate into iduronic acid or idaric acid as product using one or more of the steps described in Route 2B.

[0051] Route 2C is illustrated in FIG. 2E. The steps in Route 2C are the enzymatic conversion of D-glucose to gluconic acid (Steps 1 and 1A); the enzymatic conversion of gluconic acid to 5-ketogluconate (5-KGA) (Step 14); the enzymatic conversion of 5-KGA to 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) (Step 16); the enzymatic conversion of 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) to 4-deoxy-5-threo-hexosulose uronate (DTHU) (Step 4); the enzymatic conversion of DTHU to DDG (Step 5). Route 2C also comprises sub-routes where glucose or any intermediate in the pathway as starting substrate is converted into any other downstream intermediate as final end product, and each sub-route is considered disclosed as if each is set forth herein in full. For example in some embodiments the methods comprise steps for the conversion of glucose or gluconic acid as substrate into 2,5-DDH or DTHU using one or more steps described in Route 2C.

[0052] Route 2D is illustrated in FIG. 2F. The steps in Route 2D are the enzymatic conversion of D-glucose to gluconic acid (Steps 1 and 1A); the enzymatic conversion of gluconic acid to 5-ketogluconate (5-KGA) (Step 14); the enzymatic conversion of 5-KGA to Iduronic acid (Step 15); the enzymatic conversion of L-Iduronic acid to DTHU (Step 17); the enzymatic conversion of DTHU to DDG (Step 5). Route 2D also comprises sub-routes where glucose or any intermediate in the pathway as starting substrate is converted into any other downstream intermediate as final end product, and each sub-route is considered disclosed as if each is set forth herein in full. For example in some embodiments the methods comprise steps for the conversion of glucose or 5-KGA as substrate into L-iduronic acid or DTHU using one or more of the steps described in Route 2D.

[0053] Route 2E is illustrated in FIG. 2G. The steps in Route 2E are the enzymatic conversion of D-glucose to 1,5-gluconolactone (Step 1); the enzymatic conversion of 1,5-gluconolactone to guluronic acid lactone (Step 19); the enzymatic conversion of guluronic acid lactone to guluronic acid (Step 1B); the enzymatic conversion of guluronic acid to 4-deoxy-erythro-hexosulose uronate (DEHU) (Step 17A); the enzymatic conversion of DEHU to 3-deoxy-D-erythro-2-hexulosaric acid (DDH) (Step 7A). Route 2E also comprises sub-routes where glucose or any intermediate in the pathway as starting substrate is converted into any other downstream intermediate as final end product, and each sub-route is considered disclosed as if each is set forth herein in full. For example in some embodiments the methods comprise steps for the conversion of glucose as substrate into guluronic acid or DEHU using one or more of the steps described in Route 2E.

[0054] Route 2F is illustrated in FIG. 2H. The steps in Route 2F are the enzymatic conversion of D-glucose to gluconic acid (Steps 1 and 1A); the enzymatic conversion of gluconic acid to guluronic acid (Step 6); the enzymatic conversion of guluronic acid to 4-deoxy-erythro-hexosulose uronate (DEHU) (Step 17A); the enzymatic conversion of DEHU to 3-deoxy-D-erythro-2-hexulosaric acid (DDH) (Step 7A). Route 2F also comprises sub-routes where glucose or gluconic acid or any intermediate in the pathway as starting substrate is converted into guluronic acid or DDH or any other downstream intermediate as final end product using one or more of the steps of Route 2F, and each sub-route is considered disclosed as if each is set forth herein in full.

[0055] Route 3 is illustrated in FIG. 3A. The steps in Route 3 are the enzymatic conversion of D-glucose to gluconic acid (Steps 1 and 1A); the enzymatic conversion of gluconic acid to 3-dehydro-gluconic acid (DHG) (Step 2); the enzymatic conversion of DHG to 4-deoxy-erythro-hexosulose uronate (DEHU) (Step 6A); the enzymatic conversion of DEHU to DDG (Step 7A). Route 3 also comprises sub-routes where glucose or fructose or sucrose or galactose or any intermediate in the pathway as starting substrate is converted into gluconic acid or DDH or any other downstream intermediate of Route 3 as final end product using one or more of the steps of Route 3, and each sub-route is considered disclosed as if each is set forth herein in full.

[0056] Route 4 is illustrated in FIG. 3B. The steps in Route 4 are the enzymatic conversion of D-glucose to a-D-gluco-hexodialdo-1,5-pyranose (Step 9); the enzymatic conversion of a-D-gluco-hexodialdo-1,5-pyranose to a-D-glucopyranuronic acid (Step 10); the enzymatic conversion of a-D-glucopyranuronic acid to D-glucaric acid 1,5-lactone (Step 11); the enzymatic conversion of D-glucaric acid 1,5-lactone to D-glucarate (Step 1C); the enzymatic conversion of D-glucarate to DDG (Step 8). Route 4 also comprises sub-routes where glucose or any intermediate in the pathway as starting substrate is converted into glucarate or DDG or any other downstream intermediate as final end product using one or more of the steps of Route 4, and each sub-route is considered disclosed as if each is set forth herein in full.

[0057] Route 5 is illustrated in FIG. 3C. The steps in Route 5 are the enzymatic conversion of D-galactose to D-galacto-hexodialdose (Step 9A); the enzymatic conversion of D-galacto-hexodialdose to galacturonate (Step 10A); the enzymatic conversion of galacturonate to galactarate (Step 11A); the enzymatic conversion of galactarate to DDG (Step 13). Route 5 also comprises sub-routes where galactose or any intermediate in the pathway as starting substrate is converted into any other downstream intermediate as final product, and each sub-route is considered disclosed as if each is set forth herein in full. For example in some embodiments the methods comprise steps for the conversion of galactose or another substrate into galacturonate or galactarate using the steps described in Route 5.

[0058] In various other embodiments the invention provides a method of producing a product of an enzymatic and/or chemical pathway from a starting substrate that involves performing Step 1, followed by Step 19, followed by Step 1B to produce a guluronic acid product. Optionally the pathway can continue with Step 7 to produce glucarate. In another embodiment the method involves performing Steps 1 and 1A followed by Step 14, followed by Step 15 to produce Iduronic acid. Optionally the method can continue with Step 7B to produce an Idaric acid product or with Step 17 to produce DTHU. In another embodiment the method involves performing Steps 1 and 1A, followed by Step 14 followed by Step 16 to produce a 2,5-DDH product. In another embodiment the method involves performing Step 1 followed by Step 19 to produce guluronic acid lactone.
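The route descriptions above lend themselves to a simple ordered-list representation, which may help in enumerating the substrate-to-product sub-routes that each route comprises. The following Python sketch is illustrative only: the compound and step labels are transcribed from the descriptions of Routes 1, 2, and 5, while the dictionary layout and the function name sub_routes are hypothetical conveniences and not part of the disclosure.

```python
from itertools import combinations

# Each route: (ordered compounds, ordered step labels).
# Step k converts compound k into compound k + 1.
ROUTES = {
    "1": (["D-glucose", "1,5-gluconolactone", "gluconic acid",
           "DHG", "2,5-DDH", "DTHU", "DDG"],
          ["1", "1A", "2", "3", "4", "5"]),
    "2": (["D-glucose", "1,5-gluconolactone", "gluconic acid",
           "guluronic acid", "D-glucarate", "DDG"],
          ["1", "1A", "6", "7", "8"]),
    "5": (["D-galactose", "D-galacto-hexodialdose",
           "galacturonate", "galactarate", "DDG"],
          ["9A", "10A", "11A", "13"]),
}


def sub_routes(route_id):
    """List every (substrate, product, steps) sub-route in the downstream direction."""
    compounds, steps = ROUTES[route_id]
    if len(steps) != len(compounds) - 1:
        raise ValueError("each step must link two consecutive compounds")
    result = []
    for i, j in combinations(range(len(compounds)), 2):
        # Steps i .. j-1 carry compound i to compound j.
        result.append((compounds[i], compounds[j], steps[i:j]))
    return result


if __name__ == "__main__":
    for substrate, product, steps in sub_routes("2"):
        print(substrate, "->", product, "via Steps", ", ".join(steps))
```

Only the downstream direction is enumerated here; the upstream direction contemplated in the text would require reversing each list.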

The Enzymatic Steps

[0059] There are disclosed a wide variety of enzymes (and nucleic acids that encode the enzymes) that can perform the steps of the methods outlined herein. The enzymes utilized in the enzymatic steps of the invention can be proteins or polypeptides. In addition to the families and classes of enzymes disclosed herein for performing the steps of the invention, homologs having a sequence identity to any enzyme or nucleic acid or to any of SEQ ID NOs 1-84, disclosed herein will also be useful in the invention. Enzymes and nucleic acids that are homologs of SEQ ID NOs: 1-84 have a sequence identity of at least 40% or at least 50% or at least 60% or at least 70% or at least 80% or at least 90% or at least 95% or at least 97% or at least 98% or at least 99% to any nucleic acid or enzyme of SEQ ID NO: 1-84, or to a member of an enzyme class disclosed herein. Percent sequence identity or homology with respect to amino acid or nucleotide sequences is defined herein as the percentage of amino acid or nucleotide residues in the candidate sequence that are identical with the known polypeptides, after aligning the sequences for maximum percent identity and introducing gaps, if necessary, to achieve the maximum percent identity or homology. Homology or identity at the nucleotide or amino acid sequence level may be determined using methods known in the art, including but not limited to BLAST (Basic Local Alignment Search Tool) analysis using the algorithms employed by the programs blastp, blastn, blastx, tblastn and tblastx (Altschul (1997), Nucleic Acids Res. 25, 3389-3402, and Karlin (1990), Proc. Natl. Acad. Sci. USA 87, 2264-2268), which are tailored for sequence similarity searching. Alternatively a functional fragment of any of the enzymes or nucleic acids encoding such enzymes or of any enzyme or nucleic acid of SEQ ID NOs 1-84 disclosed herein may also be used. 
The term "functional fragment" refers to a polypeptide that has an amino-terminal and/or carboxy-terminal deletion and/or internal deletion (which can be replaced to form a chimeric protein), where the remaining amino acid sequence has at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the corresponding positions in the reference sequence, and/or that retains about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the activity of the full-length polypeptide. The EC numbers provided use the enzyme nomenclature of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology. In other embodiments the functional fragment retains the requirement of the presence of a co-factor necessary for the activity of a protein or protein encoded by SEQ ID NO:1-84.
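By way of a hedged illustration of the percent identity definition given above, the following Python sketch counts identical residues across two sequences that have already been aligned (for example by a BLAST-type program) with gaps written as "-". The choice of denominator (all alignment columns, including gapped columns) is an assumption made for this sketch, since different tools use different conventions; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding identical residues.

    Both inputs must be the same length, i.e. already aligned.
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / columns


def is_homolog(aligned_a, aligned_b, threshold_percent):
    """True if identity meets a chosen threshold such as 40, 70, or 95."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical aligned fragments with one gap in the first sequence.
    print(percent_identity("MKT-LV", "MKTALV"))
    print(is_homolog("MKT-LV", "MKTALV", 70))
```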

[0060] Also disclosed is an expression vector having a sequence of SEQ ID NO: 4-6, 20-32, 36-38, 47-54, 56, 62-66, 69-70, 72, and 79-84. The vector can be a bacterial, yeast, or algal vector. Vectors designed for expression of a gene can also include a promoter active in the organism carrying the vector and operably linked to the sequence of the invention. The vector can contain a promoter or expression control sequence operatively linked to a sequence of SEQ ID NOs: 4-6, 20-32, 36-38, 47-54, 56, 62-66, 69-70, 72, and 79-84 or a codon-optimized sequence of any of them. A "promoter" refers to a nucleic acid sequence capable of binding RNA polymerase to initiate transcription of a gene in a 5' to 3' ("downstream") direction. A sequence is "operably linked" to a promoter when the binding of RNA polymerase to the promoter is the proximate cause of said gene's transcription.

[0061] Step 1--Conversion (oxidation or dehydrogenation) of glucose to 1,5-gluconolactone. This step can be performed with various enzymes, such as those of the family of oxygen-dependent glucose oxidases (EC 1.1.3.4) or NAD(P)-dependent glucose dehydrogenases (EC 1.1.1.118, EC 1.1.1.119). Gluconobacter oxydans has been shown to efficiently oxidize glucose to gluconic acid and 5-ketogluconate (5-KGA) when grown in a fermentor. Enzymes of the family of soluble and membrane-bound PQQ-dependent enzymes (EC 1.1.99.35 and EC 1.1.5.2) found in Gluconobacter and other oxidative bacteria can be used. Quinoprotein glucose dehydrogenase is another enzyme that is useful in performing this step. The specific enzyme selected will be dependent on the desired reaction conditions and necessary co-factors that will be present in the reaction, which are illustrated in Table 1.

[0062] Step 1A--Conversion (e.g., hydrolysis) of 1,5-gluconolactone to gluconate. This step can be performed chemically in aqueous media and the rate of hydrolysis is dependent on pH (Shimahara, K., Takahashi, T., Biochim. Biophys. Acta (1970), 201, 410). Hydrolysis is faster at basic pH (e.g., pH 7.5) and slower at acidic pH. Many microorganisms also contain specific 1,5-gluconolactone hydrolases, and a few of them have been cloned and characterized (EC 3.1.1.17; Shinagawa, E., Biosci. Biotechnol. Biochem. 2009, 73, 241-244).

[0063] Step 1B--Conversion of guluronic acid lactone to guluronic acid. The chemical hydrolysis of guluronic acid lactone can be done by a spontaneous reaction in aqueous solutions. An enzyme capable of catalyzing this hydrolysis is identified amongst the large number of lactonases (EC 3.1.1.XX and more specifically EC 3.1.1.17 and EC 3.1.1.25).

[0064] Step 2--Conversion of gluconic acid to 3-dehydro gluconic acid (DHG): Several enzymes, such as gluconate dehydratases, can be used in the dehydration of gluconic acid to dehydro gluconic acid (DHG). Examples include those belonging to the gluconate dehydratase family (EC 4.2.1.39). A specific example of such a dehydratase has been shown to dehydrate gluconate (Kim, S. Lee, S. B. Biotechnol. Bioprocess Eng. (2008), 13, 436). Particular examples of enzymes from this family and their cloning are shown in Example 1.

[0065] Step 3: Conversion of 3-dehydro-gluconic acid (DHG) to 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH). Enzymes for performing this conversion, namely 2-dehydro-3-deoxy-D-gluconate 5-dehydrogenases (or DHG dehydrogenases) (EC 1.1.1.127), have been described.

[0066] Step 4: Conversion of 4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH) to 4-deoxy-L-threo-hexosulose uronate (DTHU). Enzymes of the family EC 5.3.1.12 can be used in this step, and Step 15 shows that five such enzymes were cloned and shown to have activity for the isomerization of 5-KGA. These enzymes will also show activity towards 2,5-DDH and DTHU.

[0067] Step 5: Conversion of DTHU to 5-dehydro-4-deoxy-glucarate (DDG). DDG can be produced from the chemical or enzymatic oxidation of DTHU, for example with a mild chemical catalyst capable of oxidizing aldehydes in the presence of alcohols. Aldehyde oxidases can be used to catalyze this oxidation. Oxidative bacteria such as Acetobacter and Gluconobacter (Hollmann et al. Green Chem. 2011, 13, 226) will be useful in screening. Enzymes of the following families can perform this reaction: aldehyde oxidase (EC 1.2.3.1), aldehyde ferredoxin oxidoreductase (EC 1.2.7.5), and all the families of EC 1.2.1.XX. Enzymes of the family of uronate dehydrogenases (EC 1.1.1.203) (e.g., see Step 7) will also have this activity. Other enzymes with both alcohol and aldehyde oxidation activity can be used, including enzymes in the alditol oxidase family (see Steps 19 and 6). Other broad substrate oxidases include soluble and membrane bound PQQ-dependent alcohol/aldehyde oxidases. More specifically, soluble periplasmic PQQ oxidases and their homologs belonging to the Type I (EC 1.1.9.1) and Type II (EC 1.1.2.8) families, as well as membrane bound PQQ oxidases belonging to the EC 1.1.5.X families, are useful. In other embodiments aldehyde dehydrogenases/oxidases that act on DTHU can be used.

[0068] Step 5 can also be performed using a dehydrogenase from acetic acid bacteria such as Gluconobacter, Acetobacter, Gluconacetobacter, and others. Whole cell activity is identified by screening microorganisms for the oxidation of DTHU. The activity is identified and one or more of the enzymes is cloned. Enzymes with uronate dehydrogenase activity described in Steps 7 and 7B are also screened and found to have this activity. A library of soluble periplasmic and membrane bound PQQ-dependent enzymes is also cloned and several enzymes are found having this activity. Some of the enzymes found to have the activity are NAD(P)- or PQQ-dependent dehydrogenases, but others are FAD-dependent aldehyde dehydrogenases. SEQ ID NOs: 71-72 are examples of NADP-dependent dehydrogenases, and any one or a combination of them can be used to perform Step 5. SEQ ID NOs: 73-84 are examples of suitable PQQ-dependent dehydrogenases and any one or any combination of them can be used to perform Step 5.

[0069] Steps 6 and 6A: Conversion of gluconic acid to guluronic acid (6) and conversion of 3-dehydro-gluconic acid (DHG) to 4-deoxy-5-erythro-hexosulose uronate (DEHU) (6A). The enzymes described in Step 5 are useful for these conversions. Other useful enzymes include NAD(P)-dependent dehydrogenases in the EC 1.1.1.XX families and more specifically glucuronate dehydrogenase (EC 1.1.1.19) and glucuronolactone reductase (EC 1.1.1.20). In addition, a large number of O2-dependent alcohol oxidases with broad substrate range including sugars will be useful (EC 1.1.3.XX), including sorbitol/mannitol oxidases (EC 1.1.3.40), hexose oxidases (EC 1.1.3.5), alcohol oxidases (EC 1.1.3.13) and vanillin oxidase (EC 1.1.3.38). PQQ-dependent enzymes and enzymes present in oxidative bacteria can also be used for these conversions.

[0070] Steps 7 and 7B: Conversion of guluronic acid to D-glucaric acid (7) and conversion of L-Iduronic acid to Idaric acid (7B). These steps can be accomplished with enzymes of the family of uronate dehydrogenases (EC 1.1.1.203) or the oxidases, as described herein. Examples of uronate dehydrogenases include SEQ ID NO: 1-6, and any one or any combination of them can be used to perform Steps 7 and 7B.

[0071] Step 7A: Conversion of 4-deoxy-5-erythro-hexosulose uronate (DEHU) to 3-deoxy-D-erythro-2-hexulosaric acid (DDH). The same enzymes described in Step 5 will be useful for performing this conversion. Similar to Step 5, for steps 7 and 7B enzymes are identified having the stated activity, some of which are NAD(P)- or PQQ-dependent dehydrogenases, while others are FAD-dependent aldehyde dehydrogenases. Examples of NADP-dependent gluconate-5-dehydrogenases include SEQ ID NO: 71-72 and examples of PQQ-dependent dehydrogenases include SEQ ID NO: 73-84, and any one or any combination of them can be used to perform steps 7 and 7B.

[0072] Steps 8 and 8A: Conversion of D-glucaric acid to 5-dehydro-4-deoxy-glucarate (DDG) (Step 8) and conversion of Idaric acid to DDG (Step 8A). Enzymes of the family of glucarate dehydratases (EC 4.2.1.40) can be used to perform these steps. Enzymes of this family have been cloned and have been shown to efficiently convert glucarate to DDG. Two D-glucarate dehydratases (EC 4.2.1.40) were cloned as shown in the Table of cloned glucarate dehydratases below. Both enzymes showed very high activity for the dehydration of Glucarate to DDG using the semicarbazide assay, as described in Step 2.

Cloned Glucarate Dehydratases

[0073]
Organism            pSGI (Vector)    Gene ID    WT/SYN
E. coli             353 (pET28)      P0AES2     WT
Pseudomonas (SGI)   244              #8114      WT

[0074] Steps 9 and 9A: Conversion of D-glucose to a-D-gluco-hexodialdo-1,5-pyranose (9) and conversion of D-galactose to D-galacto-hexodialdose (9A). Oxidases such as those of the galactose oxidase family (EC 1.1.3.9) can be used in this step. Mutant galactose oxidases engineered to have activity on glucose have also been described (Arnold, F. H. et al. ChemBioChem, 2002, 3(2), 781). Step 9A can be performed with enzymes of the class EC 1.1.3.9.

[0075] Steps 10 and 10A: Conversion of a-D-gluco-hexodialdo-1,5-pyranose to a-D-glucopyranuronic acid (10) and conversion of D-galacto-hexodialdose to galacturonate (10A). These steps can be performed using an enzyme of the family of aldehyde dehydrogenases. Also an enzyme identified from those of Step 5 will be useful for both of these conversions.

[0076] Steps 11 and 11A: Step 11 is the conversion of a-D-glucopyranuronic acid to D-glucaric acid 1,5-lactone. Aldehyde dehydrogenases and oxidases as described in Step 5 will be useful in performing this step. Uronate dehydrogenases described in Steps 7 and 7B can also be useful in performing this step. Step 11A is the conversion of galacturonate to galactarate. The uronate dehydrogenases (EC 1.1.1.203), for example those described in Steps 7 and 7B, will be useful in performing this step.

[0077] Step 12: Conversion of fructose to glucose. Glucose and fructose isomerases (EC 5.3.1.5) will be useful in performing this step.

[0078] Step 13: Conversion of galactarate to 5-dehydro-4-deoxy-D-glucarate (DDG). Enzymes of the family of galactarate dehydratases (EC 4.2.1.42) can be used to perform this step, and additional enzymes can be engineered for performing this step.

[0079] Step 14: Conversion of gluconate to 5-ketogluconate (5-KGA). A number of enzymes of the family of NAD(P)-dependent dehydrogenases (EC 1.1.1.69) have been cloned and shown to have activity for the oxidation of gluconate or the reduction of 5-KGA. For example, the NADPH-dependent gluconate 5-dehydrogenase from Gluconobacter (Expasy P50199) was synthesized for optimal expression in E. coli as shown herein and was cloned in pET24 (pSGI-383). The enzyme was expressed and shown to have the required activities. Additional enzymes useful for performing this step include those of the family of PQQ-dependent enzymes present in Gluconobacter (Peters, B. et al. Appl. Microbiol Biotechnol., (2013), 97, 6397), as well as the enzymes described in Step 6. Enzymes from these families can also be used to synthesize 5-KGA from gluconate.

[0080] Step 15: Conversion of 5-KGA to L-Iduronic acid. This step can be performed with various enzymes from different isomerase families, as further described in Example 4. Examples include isomerases of SEQ ID NOs: 7-19 or a homolog having at least 70% sequence identity to an isomerase of SEQ ID NOs: 7-19; or by an isomerase encoded by a nucleic acid of SEQ ID NOs: 20-32 or a homolog of any of them.

[0081] Step 16: Conversion of 5-KGA to (4S)-4,6-dihydroxy 2,5-diketo hexanoate (2,5-DDH). This dehydration can be performed with enzymes in the gluconate dehydratase family (EC 4.2.1.39), such as those described in Example 5 or Step 17. Examples of gluconate dehydratases that can be used for Step 16 include SEQ ID NOs 33-35 (encoded by SEQ ID NOs: 36-38) or homologs thereof, and any one or any combination of them can be used to perform Step 16.

[0082] Steps 17 and 17A: Conversion of L-Iduronate to 4-deoxy-5-threo-hexosulose uronate (DTHU) (17) and conversion of guluronate to 4-deoxy-erythro-5-hexosulose uronate (DEHU) (17A).

[0083] Enzymes of the family of dehydratases are identified that can be used in the performance of these steps. Enzymes from the families of gluconate or glucarate dehydratases will have the desired activity for performing these steps. Furthermore, many dehydratases of the family (EC 4.2.1.X) will be useful in the performance of these steps. In particular, enzymes that dehydrate 1,2-dihydroxy acids to selectively produce 2-keto-acids will be useful, such as enzymes of the families: EC 4.2.1.6 (galactonate dehydratase), EC 4.2.1.8 (mannonate dehydratase), EC 4.2.1.25 (arabonate dehydratase), EC 4.2.1.39 (gluconate dehydratase), EC 4.2.1.40 (glucarate dehydratase), EC 4.2.1.67 (fuconate dehydratase), EC 4.2.1.82 (xylonate dehydratase), EC 4.2.1.90 (rhamnonate dehydratase) and dihydroxy acid dehydratases (EC 4.2.1.9). Since the known enzyme selectivity is for the production of an alpha-keto acid, the identified enzymes will produce DEHU and DTHU, respectively, as the reaction products.

Step 19: Conversion of 1,5-gluconolactone to guluronic acid lactone. This step can be performed by enzymes of the family of alditol oxidases (EC 1.1.3.41) or the enzymes described in Step 6. Examples of alditol oxidases that can be used for Step 19 include SEQ ID NOs 39-54 or a homolog of any of them, or by an alditol oxidase encoded by a nucleic acid of SEQ ID NOs: 47-54 or a homolog of any of them; and any one or any combination of them can be used to perform Step 19.

Methods of Converting DDG to FDCA and of Making Esterified DDG and FDCA.

[0084] The present invention also provides novel methods of converting DDG to FDCA and FDCA esters. Esters of FDCA include diethyl esters, dibutyl esters, and other esters. The methods involve converting DDG into a DDG ester by contacting DDG with an alcohol, an inorganic acid, and optionally a co-solvent to produce a derivative of DDG. The alcohol can be methanol, ethanol, propanol, isopropanol, butanol, isobutanol, pentanol, hexanol, heptanol, octanol, nonanol, decanol, undecanol, dodecanol, tridecanol, tetradecanol, pentadecanol, hexadecanol, heptadecanol, octadecanol, nonadecanol, eicosanol, dimethyl sulfoxide, dimethylformamide, polyethylene glycol, methyl isobutyl ketone, or any C1-C20 alcohol. The inorganic acid can be sulfuric acid, phosphoric acid, perchloric acid, nitric acid, hydrochloric acid, hydrofluoric acid, hydrobromic acid, or hydriodic acid. The co-solvent can be any of or any mixture of THF, acetone, acetonitrile, an ether, butyl acetate, a dioxane, chloroform, methylene chloride, 1,2-dichloroethane, a hexane, toluene, and a xylene. Any combination of the alcohols, inorganic acids, and co-solvents can be utilized in the reactions. The esterified DDG can then be converted into esterified FDCA, for example by contacting it with an acid catalyst.

DDG Purification

[0085] DDG purification for dehydration or esterification was performed by acidifying the DDG, e.g., by lowering the pH of the reaction with the addition of concentrated HCl to pH ~2.5. At this pH proteins and any residual glucarate precipitate and are removed by filtration, and the mixture is lyophilized to give a white powder consisting of DDG and the reaction salts. Alternatively, the mixture can be lyophilized at neutral pH after the enzymes have been removed by filtration. Without further purification the DDG can then be dehydrated to give 2,5-FDCA, or be esterified to dibutyl-DDG (or di-ethyl DDG) prior to dehydration. One or more steps of purifying or esterifying DDG can be added to any of the methods and pathways disclosed herein that produce DDG. Other methods for purifying DDG from the aqueous mixture can also be used. These include separations using membranes or ion exchange resins that capture salts or DDG, etc.

[0086] The invention therefore provides a method of purifying DDG that involves acidifying DDG in a solution, filtering the solution through a filter membrane, and removing water from the solution (e.g., by lyophilization or spray drying). The solution with the DDG can be acidified to a pH of 2.5-3.5 or pH of 3.0-4.0 or pH of 3.5-4.5 or pH of 4.0-5.0 or pH of 4.5-5.5 or pH of 5.0-6.0 or pH of 5.5-6.5 or pH of 6.0-7.0 or pH of 6.5-7.5 or pH of 7.0-8.0 or pH of 7.5-8.5 or pH of about 8. The amount of water removed can be greater than 80% or greater than 85% or greater than 87% or greater than 90% or greater than 95% or greater than 97% or greater than 98% or greater than 99% of the water from the solvent comprising the DDG. Yields of greater than 25% or 30% or 35% or 40% or 45% molar can be obtained. In one embodiment the method does not involve a step of ion exchange chromatography.

Methods for Synthesizing FDCA and FDCA Derivatives

[0087] The invention also provides various methods of synthesizing FDCA. One method for synthesizing FDCA involves contacting DDG with an alcohol and an inorganic acid at a high temperature to form FDCA. The alcohol can be any alcohol (e.g., any of those described above), and examples include (but are not limited to) methanol, ethanol, propanol, and butanol. Diols can also be used. The high temperature can be a temperature greater than 70° C. or greater than 80° C. or greater than 90° C. or greater than 100° C. or greater than 110° C. or greater than 120° C. or greater than 130° C. or greater than 140° C. or greater than 150° C. Reaction yields of greater than 20% or greater than 30% or greater than 35% or greater than 40% can be achieved.

[0088] The invention also provides methods for synthesizing derivatives of FDCA. The methods involve contacting a derivative of DDG with an inorganic acid to produce a derivative of FDCA. The inorganic acid can be, for example, sulfuric acid, or any inorganic acid such as those described above. Optionally, the derivative of DDG can be purified prior to contacting it with the inorganic acid. Non-limiting examples of derivatives of DDG include methyl DDG, ethyl DDG, propyl DDG, butyl DDG, isobutyl DDG, di-methyl DDG, di-ethyl DDG, di-propyl DDG, and di-butyl DDG. The derivative of FDCA produced can be, but is not limited to, methyl FDCA, ethyl FDCA, propyl FDCA, butyl FDCA, di-methyl FDCA, di-ethyl FDCA, di-propyl FDCA, di-butyl FDCA, and isobutyl FDCA. The derivative of FDCA produced corresponds to the derivative of DDG used in the method. The derivative of FDCA can then be de-esterified to produce FDCA. The method can also be conducted in the gas phase, e.g., using the parameters described below.

[0089] Another method for synthesizing FDCA or derivatives of FDCA involves contacting DDG or derivatives of DDG (any described herein) with an inorganic acid in a gas phase, which can be done with a short residence time, e.g., of less than 10 seconds or less than 8 seconds, or less than 6 seconds or less than 5 seconds or less than 4 seconds or less than 3 seconds or less than 2 seconds or less than 1 second. The residence time refers to the time that the sample is present in the reaction zone of the high temperature flow through reactor. The method can also be conducted at high temperatures, for example at temperatures greater than 150.degree. C., greater than 200.degree. C., greater than 250.degree. C., greater than 300.degree. C. or greater than 350.degree. C. Yields of greater than 25% or greater than 30% or greater than 40% or greater than 45% or greater than 50% molar are obtainable. Another method for synthesizing FDCA involves contacting DDG with an inorganic acid at a temperature in excess of 80.degree. C. or 90.degree. C. or 100.degree. C. or 110.degree. C. or 120.degree. C. Another method for synthesizing FDCA involves contacting DDG with an inorganic acid under anhydrous reaction conditions. In various embodiments the anhydrous conditions can be established by lyophilizing the DDG in any method of synthesizing FDCA disclosed herein so that the DDG contains less than 10% or less than 9% or less than 8% or less than 7% or less than 6% or less than 5% or less than 4% or less than 3% water or less than 2% water, by weight.
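As a non-limiting illustration of the residence time defined above, the following Python sketch computes residence time as the reaction-zone volume divided by the volumetric flow rate through the zone. The function names, the 5 mL zone volume, and the 100 mL/min flow rate are hypothetical and are not taken from the examples herein; the flow rate is assumed to be expressed at reaction conditions.

```python
def residence_time_s(zone_volume_ml, flow_ml_per_min):
    """Residence time in seconds: reaction-zone volume / volumetric flow rate."""
    if flow_ml_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return zone_volume_ml / flow_ml_per_min * 60.0


def within_limit(tau_s, limit_s):
    """True if the residence time is shorter than the stated limit."""
    return tau_s < limit_s


# Hypothetical reactor: 5 mL heated zone, 100 mL/min gas flow (assumed values)
tau = residence_time_s(5.0, 100.0)
for limit in (10, 5, 2, 1):
    print(f"less than {limit} s: {within_limit(tau, limit)}")
```

Under these assumed values the residence time is about 3 seconds, which satisfies the "less than 10 seconds" and "less than 5 seconds" embodiments but not the shorter ones.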

[0090] The methods of the invention for synthesizing FDCA and its derivatives as described herein provide a significantly higher yield than has been available. In different embodiments molar yields of FDCA (v. DDG) can be obtained of greater than 10% or greater than 15% or greater than 20% or greater than 25% or greater than 30% or greater than 35% or greater than 40% or greater than 45% or greater than 50% or greater than 60% or greater than 65% or from about 40% to about 70%, or from about 45% to about 65%, or from about 50% to about 60%.

EXAMPLES

Example 1

Step 2, Gluconic Acid to 3-Dehydro-Gluconic Acid (DHG)

[0091] Enzymes with natural activity for the dehydration of gluconate are useful in the invention (EC 4.2.1.39). Three enzymes from this family were cloned as shown in Table 1. Enzyme pSGI-365 was cloned and shown to be a dehydratase with broad substrate range having strong activity for the dehydration of gluconate (Kim, S.; Lee, S. B. Biotechnol. Bioprocess Eng. 2008, 13, 436).

TABLE 1 Enzymes used in this experiment and identity homology. All expressed in P. fluorescens

Organism       pSGI (Vector)  Gene ID  WT/SYN  Expression Host
Achromobacter  365 (pRANGER)  E3HJU7   Syn     P. fluorescens
Achromobacter  359 (pRANGER)  #0385    wt      P. fluorescens
Acinetobacter  360 (pRANGER)  #0336    wt      P. fluorescens

                              359_Achromob  365_E3HJU7
pSGI-360_Acinetobacter (SGI)  78            79
pSGI-359_Achromobacter (SGI)                95
pSGI-365 Achromobacter

[0092] Proteins 359, 360, and 365 (SEQ ID NOs 33-35, respectively) showed 2-5 .mu.mole/min per mg of crude enzyme lysate activity for the dehydration of gluconate (gel not shown). pSGI-359 was isolated by precipitation with ammonium sulfate, re-dissolved in buffer, and assayed by the semicarbazide assay. Activities of 46.2 U/mL or 5.3 U/mg (1 unit=1 .mu.mole/min) for the dehydration of gluconate were calculated from semicarbazide assay plots. Reaction buffer (93 mL) containing 10 mM KPi (pH 8.0), 2 mM MgCl2, and 3.5 g (0.016 mole) of sodium gluconate was mixed with 7 mL of the previous gluconate dehydratase solution. The reaction was incubated at 45.degree. C. for 16 h before one aliquot was analyzed by HPLC-MS (FIG. 4). As shown in FIG. 4, one new major product with the molecular weight of DHG was produced. The product was also shown to have activity with DHG dehydratases.
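As a non-limiting illustration, the following Python sketch checks the scale of the reaction described above. It converts 3.5 g of sodium gluconate to moles, computes the total enzyme units supplied by 7 mL of a 46.2 U/mL solution, and estimates an idealized minimum conversion time. The molar mass of sodium gluconate (218.14 g/mol) is a standard value not stated in this example, the function names are hypothetical, and the time estimate assumes the enzyme works at its assayed rate throughout, which real reactions generally do not.

```python
SODIUM_GLUCONATE_MW = 218.14  # g/mol, C6H11NaO7 (standard value, not from the example)


def moles_from_mass(mass_g, mw_g_per_mol):
    """Convert a mass in grams to moles."""
    return mass_g / mw_g_per_mol


def total_units(activity_u_per_ml, volume_ml):
    """Total enzyme units, where 1 U converts 1 umol of substrate per minute."""
    return activity_u_per_ml * volume_ml


def ideal_conversion_time_min(substrate_mol, units):
    """Minutes to convert all substrate at a constant assayed rate (idealized)."""
    return substrate_mol * 1e6 / units


substrate_mol = moles_from_mass(3.5, SODIUM_GLUCONATE_MW)
units = total_units(46.2, 7.0)
time_min = ideal_conversion_time_min(substrate_mol, units)
print(f"{substrate_mol:.3f} mol substrate, {units:.1f} U, ~{time_min:.0f} min ideal")
```

Under these assumptions the enzyme charge is in large excess relative to the 16 h incubation, which is consistent with the observation of one major product.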

[0093] All proteins were cloned on the pRANGER.TM. (Lucigen, Middleton, Wis.) expression vector and were expressed in a Pseudomonas fluorescens strain. pRANGER.TM. is a commercially available broad-host-range plasmid vector containing the pBBR1 replicon, Kanamycin resistance and a pBAD promoter for inducible expression of genes. For the enzyme assay a modification of the semicarbazide assay for the quantification of alpha keto acid was used to calculate the activity of each enzyme (Kim, S.; Lee, S. B. Biochem J. 2005, 387, 271). SEQ ID NOs: 30-32 and 33-35 show the amino acid and nucleotide sequences, respectively, of the gluconate dehydratases #0385, #0336, and E3HJU7.

Example 2

Step 3--3-Dehydro-Gluconic Acid (DHG) to (4S)-4,6-Dihydroxy 2,5-Diketo Hexanoate (2,5-DDH)

[0094] Enzymes of the family (EC 1.1.1.127) can be used to perform this step. Two examples are 2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase and DHG dehydrogenases. Five enzymes from this family were cloned as shown in Table 2 below. pRANGER.TM. vector was used in every case.

TABLE 2 Cloned DHG Oxidoreductases (or 2-dehydro-3-deoxy-D-gluconate 5-dehydrogenases)

Organism                         pSGI (Vector)  Gene ID  WT/SYN  Expression Host
Agrobacterium sp (SGI)           374            #9041    WT      P. fluorescens
Agrobacterium tumefaciens (SGI)  375            #8939    WT      P. fluorescens
E. coli                          376            P37769   WT      P. fluorescens
Sphingomonas (SGI)               395            #5112    WT      P. fluorescens
Hoeflea phototrophica (SGI)      396            #7103    WT      P. fluorescens

[0095] The product prepared from the dehydration of gluconate in Step 2 was used as substrate for assaying the lysates of Table 2. As shown in the following Table 3, enzymes were identified showing activity for the oxidation of DHG in assays measuring NADH formation (absorbance increase at 340 nm).

TABLE 3 Activity Calculations for Oxidation of DHG to 2,5-DDH using DHG Oxidoreductase. A unit = .mu.mole/min of NADH

U/mg (100 mM DHG)
ENZ       pH = 7.5  pH = 8.5 (10 mM DHG)  pH = 9.5
pSGI_395  0.012     0.070 (0.02)          0.120
pSGI_396  0.033     0.139 (0.018)         0.418
pSGI_374  0.007     0.043 (0.012)         0.091
pSGI_376  0.007     0.121 (0.01)          1.610
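As a non-limiting illustration of how a rate of absorbance increase at 340 nm can be converted into the U/mg values reported above, the following Python sketch applies the Beer-Lambert law to NADH. The extinction coefficient (6220 M^-1 cm^-1 at 340 nm) is the commonly cited literature value and is an assumption here, as are the 1 cm path length, the function name, and the example inputs; none of these are stated in this example.

```python
NADH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm (literature value; an assumption here)


def specific_activity_u_per_mg(delta_a340_per_min, assay_volume_ml,
                               protein_mg, path_cm=1.0):
    """Convert an A340 slope into umol NADH formed per minute per mg protein."""
    # Beer-Lambert: change in concentration (mol/L per min) = dA / (epsilon * path)
    molar_per_min = delta_a340_per_min / (NADH_EXT_COEFF * path_cm)
    # mol/L per min -> umol per min in the actual assay volume
    umol_per_min = molar_per_min * (assay_volume_ml / 1000.0) * 1e6
    return umol_per_min / protein_mg


# Hypothetical assay: slope 0.0622 A/min, 1 mL assay volume, 0.1 mg protein
activity = specific_activity_u_per_mg(0.0622, 1.0, 0.1)
print(f"{activity:.3f} U/mg")
```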

[0096] Further verification of the formation of 2,5-DDH by these enzymes was shown in Step 16 where the reduction of 2,5-DDH (made from the dehydration of 5KGA) with pSGI-395 at acidic pH was shown.

Example 3

Steps 7 and 7B

Conversion of Guluronic Acid to D-Glucaric Acid (7) and Conversion of L-Iduronic Acid to Idaric Acid (7B)

[0097] To demonstrate Steps 7 and 7B the following study was performed. Uronate dehydrogenases (EC 1.1.1.203) are enzymes that oxidize glucuronic and galacturonic acid. Three enzymes with sequence similarity to the known uronate dehydrogenase (Expasy: Q7CRQ0; Prather, K. J, et al., J. Bacteriol. 2009, 191, 1565) were cloned from bacterial strains as shown in Tables 4 & 5.

TABLE 4 Cloned Uronate Dehydrogenases

Organism       pSGI (pET28)  Gene ID  Expression
Agrobacterium  #474          #8807    BL21DE3
Rhizobium      #475          #8958    BL21DE3
Pseudomonas    #476          #1770    BL21DE3

TABLE 5 Sequence Identity

                   #475  #476  Q7CRQ0
474_Agrobacterium  73    49    90
475_Rhizobium            51    74
476_Pseudomonas                50

[0098] Each protein was expressed with a His tag from pET28 and was purified prior to screening. Protein gels of the crude lysates and purified enzymes are shown in FIG. 1. After purification all enzymes were tested for activity against glucuronate, as well as against guluronate and iduronate. Kinetic measurements at different substrate concentrations were performed and the calculated activities and Km values for each enzyme are shown in Table 6. All enzymes showed good activity for glucuronate, and also for L-iduronate and guluronate.

TABLE 6 Activity and Km Value for Purified Uronate Dehydrogenases

Vmax (.mu.M/min/mg); and Km (mM)
Enzyme  Glucuronate   Iduronate   Guluronate (Vm only)
474     128.2; 0.37   0.96; 29.8  0.017
475     47.4; 0.22    0.59; 42.1  0.016
476     90.9; 0.34    1.36; 29.6  0.014

[0099] Each plasmid shown in Table 4 was transformed in BL21DE3 E. coli cells. Clarified lysates were mixed with an equal volume (25 mL) of equilibration buffer and purified on an Ni NTA column. Activity of each purified enzyme was measured by mixing 0.050 mL of various dilutions of each purified enzyme with 0.95 mL of reaction buffer (100 mM TrisHCl, pH 8.0, 50 mM NaCl, 0.75 mM NAD+). The reaction progress was measured by monitoring the formation of NADH at 340 nm. FIGS. 6A and 6B provide Lineweaver-Burk plots for the oxidation of glucuronate and iduronate with all three enzymes. Clear positive slopes were obtained with all enzymes giving the activities shown in the table above. Protein sequences of the uronate dehydrogenases are shown as SEQ ID NOs: 1-3 and the genes as SEQ ID NOs: 4-6.
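As a non-limiting illustration of how Vmax and Km can be extracted from a Lineweaver-Burk plot, the following Python sketch fits a straight line to 1/v versus 1/[S] by ordinary least squares, where the intercept equals 1/Vmax and the slope equals Km/Vmax. The substrate concentrations are hypothetical, and the rates are synthetic, noise-free values generated from the Michaelis-Menten equation using the parameters reported for enzyme 474 with glucuronate (Vmax 128.2, Km 0.37 mM); they are not measured data from this example.

```python
def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate_mm, rates):
    """Least-squares line through (1/[S], 1/v); returns (Vmax, Km)."""
    xs = [1.0 / s for s in substrate_mm]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope = Km / Vmax
    intercept = mean_y - slope * mean_x    # intercept = 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Hypothetical substrate concentrations (mM) and synthetic rates
conc = [0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
rates = [michaelis_menten(s, 128.2, 0.37) for s in conc]
vmax, km = lineweaver_burk_fit(conc, rates)
print(f"Vmax = {vmax:.1f}, Km = {km:.2f} mM")
```

The double-reciprocal transform weights low-substrate points heavily, so with noisy measurements a direct nonlinear fit of the Michaelis-Menten equation may give more reliable parameters.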

[0100] Pyrroloquinoline quinone (PQQ) dependent aldehyde dehydrogenases also showed good activity for the oxidation of both guluronate and iduronate. These are soluble periplasmic enzymes that were expressed in the E. coli cytosol after their periplasmic target sequence was removed. The activities of crude lysates in units (.mu.mole/min) per milligram of total lysate protein are shown in the following Table 6A. The actual activity of each enzyme is at least 2-5.times. higher if purified (see expression in FIGS. 3A-3C).

TABLE 6A Activities of PQQ-Dependent Dehydrogenases with Iduronate and Guluronate (Unit = .mu.mole/min)

Enzyme                   Iduronate U/mg  Guluronate U/mg
P75804 (SEQ ID NO: 73)   8.7             3.2
9522 (SEQ ID NO: 74)     7.3             6.1
6926 (SEQ ID NO: 75)     9.2             4.1
7510 (SEQ ID NO: 76)     7.3             3.7
7215 (SEQ ID NO: 77)     14.2            8.3
8386 (SEQ ID NO: 78)     4.3             1.5

[0101] The activities shown in Table 6A were measured using an artificial electron acceptor, DCPIP (2,6-dichloroindophenol), according to the following protocol: to 0.95 mL of 20 mM triethanolamine (pH 8.0) containing 0.2 mM DCPIP, 0.2 mM PMS (phenazine ethosulfate) and substrate (10-40 mM), 0.050 mL of enzyme (as crude lysate or 10-100.times. diluted with buffer) is added and the reaction progress is followed by the change of DCPIP absorbance at 600 nm. Because in their natural state these enzymes transfer electrons to other proteins or cofactors in the membrane electron transport chain, the in vitro activity is measured using artificial electron acceptors, with DCPIP being the most common.

[0102] The enzymes on Table 6A were active against a number of other aldehydes including butyraldehyde, butyraldehyde and glycerol (but not glucose). Therefore, these enzymes will oxidize the aldehyde group of iduronate and guluronate to give idaric and glucaric acid, respectively. In order to confirm this selectivity, two of these enzymes, #403 and #412, were expressed in the periplasm of E. coli by fusing them with the periplasmic target sequence of #403 (a native E. coli enzyme). Both proteins were expressed in the periplasm but at lower levels compared to the cytosol. The previous recombinant cells oxidized benzaldehyde to benzoic acid in good yields and, in lower yields, produced glucaric and idaric acid from guluronate and iduronate.

Example 4

Step 15

Conversion of 5-Ketogluconate (5-KGA) to L-Iduronic Acid (15) or Guluronic Acid (15A)

[0103] This example illustrates the identification of an enzyme capable of isomerizing 5-KGA to iduronic acid (Step 15) or guluronic acid (Step 15A). Thirteen enzymes from three different isomerase families were cloned as shown in Table 7, while their % sequence identity is shown in Table 8.

TABLE 7 Isomerases Cloned

EC        Organism         pSGI (pET28)  Gene ID (Archetype .RTM. or Expasy)  WT/SYN
5.3.1.17  Rhizobium        433           #8938                                WT
5.3.1.17  E. coli          434           Q46938 (Expasy)                      WT
5.3.1.17  Rhizobium        435           #3891                                WT
5.3.1.17  Pannonibacter    436           #7102                                WT
5.3.1.n1  Lactobacillus    458           A5YBJ4 (Expasy)                      SYN
5.3.1.n1  Acidophilum      440           F0J748 (Expasy)                      SYN
5.3.1.n1  Bacillus         437           #9209                                WT
5.3.1.n1  Ochrobactrum     438           #9732                                WT
5.3.1.n1  Halomonas        439           #7403                                WT
5.3.1.12  Sphingobacteria  478           #1874                                WT
5.3.1.12  Thermotoga       479           Q9WXR9                               SYN
5.3.1.12  Bacillus         480           Q9KFI6                               SYN
5.3.1.12  Bacillus         481           O34808                               SYN

TABLE 8 % Identities of Isomerases

      EC          436  434  435  458  440  437  438  439  481  480  479  478
433   5.3.1.17     65   44   43   16   13   18   11   14    6   11   11    7
436   5.3.1.17          45   46   18   14   15   12   13    5   10   11    7
434   5.3.1.17               46   17   10   15   10   13    6   10   12    7
435   5.3.1.17                    18   16   18   14   16    9   11   13    7
458   5.3.1.n1                         37   57   41   44    6    7    9    5
440   5.3.1.n1                              40   67   50    6    6    6    5
437   5.3.1.n1                                   46   51    8    7   10    6
438   5.3.1.n1                                        52    5    5    6    4
439   5.3.1.n1                                              6    7    8    5
481   5.3.1.12                                                   7   36   54
480   5.3.1.12                                                        7    7
479   5.3.1.12                                                            37
478   5.3.1.12

[0104] As shown in Table 8, enzymes with medium homology (underlined) within each family were selected for cloning. The data demonstrated that enzymes from all families showed activity for the isomerization of 5-KGA giving L-iduronate as the main product. Two enzymes from the 5.3.1.17 family (433 & 434) were also used in the example showing the formation of DDG from 5-ketogluconate (5KGA).
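As a non-limiting illustration of a pairwise percent identity of the kind tabulated in Table 8, the following Python sketch counts matching residues between two pre-aligned sequences. This example does not state which alignment tool or identity definition was used, so the sketch assumes one common convention: sequences are already aligned with "-" marking gaps, and identity is the fraction of matching residues over ungapped aligned columns. The function name and the short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not sequences from the Sequence Listing
identity = percent_identity("MKT-LLV", "MKSALLI")
print(f"{identity:.1f}% identity")
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives different percentages for the same alignment.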

[0105] Activity for the isomerization of 5KGA and iduronate using enzymes from Table 7 was measured using an enzymatic method that detected the formation of products by their activity against two different enzymes. For example, isomerization of 5KGA was detected by measuring the activity of uronate dehydrogenase (pSGI-476) on the product iduronate. Isomerization of iduronate was detected by measuring the activity of 5KGA reductase (pSGI-383, EC 1.1.1.69) on the product 5KGA. Presence of the products was also detected by GC-MS.

[0106] Enzymes from all families showed varying activity for the isomerization of 5KGA and iduronate. Two enzymes from EC 5.3.1.17 were used in a cell free reaction to isomerize 5KGA and ultimately produce DDG as described in the example. The enzymes were purified and showed a single band by gel electrophoresis. The purified isomerases were used in reactions using lysate and buffer containing 5KGA or iduronate. Product formation was demonstrated using both HPLC and the previously described enzymatic methods. Results for 17 h of incubation using both HPLC and enzyme assays are shown in FIG. 7A. All enzymes showed good activity for the isomerization of both 5KGA and iduronate. Yields for iduronate isomerization by pSGI-433, pSGI-434, pSGI-435, and pSGI-436 were 56%, 48%, and 42% (436 not measured), respectively, when measured enzymatically and 78.8%, 78.5%, 73.3% and 76.6%, respectively, when measured by HPLC assay. Yields after 16 h for 5KGA isomerization by the same enzymes were 18%, 17%, and 19%, respectively (436 not measured), when measured by enzymatic assay, and 16.6%, 17.8%, 16.3%, and 16.9%, respectively, when measured by HPLC assay.

EC 5.3.1.12 Enzymes

[0107] Enzymes from the EC 5.3.1.12 family (glucuronate isomerases) were also purified by gel electrophoresis, isolated, and used to prepare reactions by mixing with buffer (50 mM HEPES, 1 mM ZnCl2, pH 8.0) that contained 5 mM of 5KGA or Iduronate. The reactions were incubated at 30.degree. C. and analyzed for product formation using both HPLC and enzymatic methods. Results are shown in FIG. 7B.

5.3.1.17 Enzymes

[0108] Enzymes pSGI-478 and pSGI-479 (5-dehydro-4-deoxy-D-glucuronate isomerases) showed isomerization activity for both 5KGA and iduronate. This activity was also confirmed with the enzymatic assays as above. Yields for isomerization of iduronate by pSGI-478 and -479 were 50% and 37%, respectively, when measured enzymatically, and 20% and 18% when measured by HPLC. Yields for 5KGA isomerization were 23% and 26%, respectively, when measured enzymatically, and 24% and 16%, respectively when measured by HPLC. Results are shown in FIG. 7A.

5.3.1.n1 Enzymes

[0109] Enzymes in this family were purified by gel electrophoresis. Product formation was measured using enzymatic assays as described above and the results are shown in FIG. 8. All enzymes cloned in this family were shown to have activity for the isomerization of 5KGA and iduronate.

[0110] In each case plasmids were transformed in BL21DE3 and proteins purified on a Ni NTA column.

Example 5

Step 16

5-Keto-Gluconate (5KGA) to (4S)-4,6-Dihydroxy 2,5-Diketo Hexanoate (2,5-DDH)

[0111] The three gluconate dehydratases described in Step 2 (Example 1) were expressed as described in Example 1, along with a purified glucarate dehydratase from Step 8. Enzymatic reactions for activity were performed and HPLC-MS analysis showed the formation of 2,5-DDH (FIG. 9). This was also confirmed by the fact that formation of the new product was accompanied by a decrease in 5-KGA only in the samples containing gluconate dehydratases, as well as by enzymatic assays with DHG dehydratase (pSGI-395). Good slopes at 340 nm indicating large enzyme activity were obtained when NADH, pSGI-395 lysate and aliquots of the previous reactions were mixed (data not shown). This result, in combination with the HPLC analysis, proves that the gluconate dehydratases examined dehydrate 5KGA to 2,5-DDH.

Example 6

Step 19

Conversion of 1,5-Gluconolactone to Guluronic Acid .delta.-Lactone

[0112] 1,5-gluconolactone oxidation is a side activity of enzymes from the alditol oxidases (EC 1.1.3.41) family. These enzymes oxidize various alditols such as sorbitol, xylitol, glycerol and others. Enzymes were identified having activity for the oxidation of 1,5-gluconolactone, as shown in Table 6 below.

TABLE 6 Alditol Oxidases with Activity on 1,5-Gluconolactone

                                                                        Reaction Setup
Enzyme    Source                       Sorbitol  1,5-Gluconolactone  Enzyme  Substrate  Yield
                                       U/mg      U/mg                mg      mg/mM
AO#13     Terriglobus roseus           0.23      0.02                5.3     15/85      7%
AO#22     Granulicella mallensis       0.27      0.015               7.6     15/85      9%
AO#28     Streptomyces acidiscabies    1.30      0.010               15      15/85      8%
AO#36     Actinomycetales (SGI)        1.83      0.102               25      90/35      46%
AO#51     Frankia sp                   0.59      0.019               NT      NT         NT
AO#57     Propionibacteriaceae (SGI)   1.47      0.051               40      70/57      6%
AO#76     Streptomyces sp.             1.45      0.045               8.2     15/85      23%
AO#251*   Paenibacillus sp.            0.47      0.003               24      15 8.5     ~2%

*crude lysate

[0113] Reactions were prepared using lysates of all the purified enzymes shown on Table 6. Reactions were prepared in 50 mM K-phosphate buffer, pH 7.0 with 0.5 mg/mL catalase and incubated at 30.degree. C. A new product was observed by HPLC-MS analysis showing the same retention time as guluronate after comparison with authentic standards (FIG. 10). This was confirmed by GC-MS, where the product also had the same MS fingerprint as guluronate. It is therefore clear that all the alditol oxidases described in the Table oxidize the 6-OH of 1,5-gluconolactone to produce the guluronic acid lactone. All alditol oxidases were cloned in pET28a with a HisTag and were expressed in BL21DE3 and purified on a Ni NTA column.

Example 7

Synthesis of FDCA and Other Intermediates

[0114] Purified DDG mono potassium salt was used for the dehydration to 2,5-FDCA. Sulfuric acid was added to DDG and the reaction stirred at 60.degree. C. The in situ yield was calculated (by HPLC-MS) to be .about.24% and .about.27%.

[0115] The reaction solutions were combined and then diluted by pouring into ice (to dissipate the heat). An approximately equivalent volume of THF was added, and the solution was transferred to a separation funnel. Sodium chloride salt was added until separation was achieved. The solution was agitated between additions for best possible dissolution. The aqueous layer was removed, and the THF layer was washed 3.times. more with sat. NaCl solution. Sodium sulfate was added and the solution left sitting overnight. Two layers formed again overnight. The aqueous layer was discarded and then silica gel was added to the solution. It was then concentrated down to solids via rotovap. The solids were loaded onto a silica flash column and then separated chromatographically. The fraction was concentrated and dried. The isolated yield was 173.9 mg. Corrected yield: 24.9%. .sup.1H and .sup.13C NMR and HPLC-MS analysis confirmed the product.

Dehydration of DDG to Dibutyl-2,5-FDCA in BuOH/H.sub.2SO.sub.4

[0116] Dehydration of underivatized lyophilized DDG containing the dehydration salts in BuOH was done using a Dean-Stark apparatus. Under these conditions, DDG was added to BuOH, and then H2SO4 was added and the reaction heated at 140.degree. C. After stirring for 4 h, HPLC-MS analysis showed the disappearance of DDG and the formation of dibutyl-2,5-FDCA. The in situ yield was calculated (by HPLC-MS) to be 36.5%.

[0117] The mixture was extracted with water, 1% NaOH, and again with water. Then the organic layer was concentrated to a final mass of 37.21 g. A portion of this mass (3.4423 g) was removed and 0.34 g of dibutyl-2,5-FDCA was purified using HPLC. Extrapolating the yield of the isolated product to the total amount of compound isolated from the reaction (37.21 g) and taking into account the amount of salts present in the original DDG (.about.60% pure by weight) the reaction yield was calculated to be 42%. .sup.1H and .sup.13C NMR and HPLC-MS analysis confirmed the product.
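As a non-limiting illustration of the extrapolation described above, the following Python sketch scales the 0.34 g of purified dibutyl-2,5-FDCA obtained from the 3.4423 g portion up to the full 37.21 g of concentrated organic layer, and provides a helper for a purity-corrected molar yield. The molar masses (268.31 g/mol for dibutyl-2,5-FDCA, 192.12 g/mol for DDG free acid) are calculated from the molecular formulas and are not stated in this example. The mass of crude DDG charged to the reaction is also not stated, so it is left as a caller-supplied parameter; the function names are hypothetical.

```python
DIBUTYL_FDCA_MW = 268.31  # g/mol, C14H20O5 (calculated; not stated in the example)
DDG_MW = 192.12           # g/mol, C6H8O7 free acid (calculated; not stated)


def extrapolated_product_g(purified_g, portion_g, total_g):
    """Scale the product isolated from a portion up to the whole batch."""
    return purified_g * total_g / portion_g


def molar_yield_percent(product_g, product_mw, crude_ddg_g, ddg_purity, ddg_mw):
    """Molar yield relative to the DDG actually present in the crude starting material."""
    product_mol = product_g / product_mw
    ddg_mol = crude_ddg_g * ddg_purity / ddg_mw
    return 100.0 * product_mol / ddg_mol


total_product_g = extrapolated_product_g(0.34, 3.4423, 37.21)
total_product_mmol = total_product_g / DIBUTYL_FDCA_MW * 1000.0
print(f"{total_product_g:.2f} g, {total_product_mmol:.1f} mmol dibutyl-2,5-FDCA")
```

The purity argument of `molar_yield_percent` corresponds to the approximately 60% by weight DDG content noted above; the crude DDG mass would need to be taken from the laboratory record.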

Synthesis of Dibutyl DDG

[0118] In another aspect the invention provides a method for synthesizing a derivative of DDG. The method involves contacting DDG with an alcohol, an inorganic acid, and optionally a co-solvent to produce a derivative of DDG. Optionally the derivative of DDG can be purified. The reaction can have a yield of the derivative of DDG of at least 10% molar yield or at least 15% molar yield or at least 20% molar yield or at least 25% or at least 30% or at least 35% molar yield or at least 40% molar yield. The inorganic acid can be sulfuric acid and the alcohol can be methanol, ethanol, propanol, butanol, isobutanol, or any C1-C20 alcohol. In various embodiments the co-solvent can be any of THF, acetone, acetonitrile, an ether, butyl acetate, a dioxane, chloroform, methylene chloride, 1,2-dichloroethane, a hexane, toluene, and a xylene. When the alcohol is ethanol the DDG derivative will be DDG mono-ethyl ester and/or DDG diethyl ester. When the alcohol is butanol the DDG derivative will be DDG mono-butyl ester and/or DDG dibutyl ester.

[0119] DDG mono-potassium salt was used for derivatization according to the following protocol. A 1 L Morton type indented reaction vessel equipped with a mechanical stirrer and heating mantle was charged with 60:40 DDG:KCl (31.2 mmol), BuOH, and heptane. In a separate vial, sulfuric acid was added to water, and allowed to cool after dissolution. The solution was then added to the flask. The solution was kept at 30.degree. C.

[0120] The precipitate was filtered off and the filtrate was concentrated. The remaining gel was dissolved in EtOAc, TLC plates were spotted with the solutions, and the plates were sprayed with a phosphomolybdic acid mixture and then heated to at least 150.degree. C. on a hot plate to identify the DDG-DBE fraction. Isolated yield: 4.62 g (15.2 mmol, 47% yield), >98% purity. .sup.1H and .sup.13C NMR and HPLC-MS analysis confirmed the product.

[0121] Different solvents can be used in the synthesis of DDG esters, such as mixtures of BuOH (5%-95% v/v) with co-solvents. THF, acetone, acetonitrile, ethers (dibutyl, diethyl, etc.), esters such as butyl acetate, 1,6-dioxane, chloroform, methylene chloride, 1,2-dichloroethane, hexanes, toluene, and xylenes may be used as co-solvents. Reaction catalysts can be acids (sulfuric, hydrochloric, polyphosphoric, or immobilized acids such as DOWEX), bases (pyridine, ethyl-amine, diethyl-amine, boron trifluoride), or other catalysts commonly used for the esterification of carboxylic acids.

Dehydration of dibutyl-DDG to dibutyl-FDCA in n-BuOH/H.sub.2SO.sub.4

[0122] A stock solution of DDG-DBE (di-butyl ester) was made in butanol and transferred to a clean, dry 100 mL round-bottomed flask equipped with a stir bar. To the flask, 25 mL of conc. sulfuric acid was added. The flask was sealed and then stirred at 60.degree. C. for 2 hrs. The in situ yield was calculated to be .about.56%. The reaction solution was concentrated and the residue was dissolved in MTBE and transferred to a separation funnel, and then washed with water. The recovered organic layer was concentrated and then separated via HPLC to give 250.7 mg of product (.about.90% purity), a 35% isolated yield (corrected for purity). .sup.1H and .sup.13C NMR and HPLC-MS analysis confirmed the product.

Example 8

Cell Free Synthesis of DDG and FDCA and Derivatives from 5-KGA (Route 2A)

[0123] This example illustrates the enzymatic conversion of 5KGA to DDG using purified enzymes according to Scheme 6 (a sub-Scheme of 2B), and also illustrates the DDG produced being dehydrated to FDCA using chemical steps. The Scheme involves the steps of isomerization of 5KGA (Step 15) and the subsequent oxidation to idaric acid (Step 7B). DDG was also dehydrated under differing chemical conditions to FDCA. The last step (Step-8A) was performed using glucarate dehydratase from E. coli.

[0124] Scheme 6 is illustrated in FIG. 11. The scheme was performed using a cell free enzymatic synthesis of DDG from 5-KGA. The Scheme involves the performance of steps 15, 7B and 8A (see FIG. 2D). Two additional proteins were used to complete the reaction path: NADH-oxidase (Step A), which recycles the NAD+ cofactor in the presence of oxygen, and catalase (Step B), which decomposes the peroxide produced by the action of NADH oxidase. The enzymes are shown in the following Table 7. All enzymes contained a HisTag and were purified using an Ni-NTA column. Yields for this synthesis of DDG were calculated to be at least 88-97%.

TABLE 7

STEP  Enzyme              EC         Organism
15    pSGI-433 (DTHU_IS)  5.3.1.17   Rhizobium (SGI)
15    pSGI-434 (DTHU_IS)  5.3.1.17   E. coli
7B    pSGI-476 (UroDH)    1.1.1.203  Pseudomonas (SGI)
8A    pSGI-353 (GlucDH)   4.2.1.40   E. coli
A     pSGI-431 (NADH_OX)  1.6.3.1    Thermus thermophilus
B     Catalase            1.11.1.6   Corynebacterium

[0125] 500 mL of liquid culture was purified for each isomerase for the reaction. Besides the enzymes shown on Table 7, each reaction contained 50 mM TrisHCl (pH 8.0), 50 mM NaCl, 1 mM ZnCl.sub.2 and 2 mM MgCl.sub.2, 1 mM MnCl.sub.2 and 1 mM NAD.sup.+. Reactions were analyzed by HPLC after 16 h of incubation and FIGS. 12A-12B present the chromatograms.

[0126] For dehydration to FDCA, the reaction mixtures of both samples were combined and lyophilized into a white powder, which was split into two samples and each dissolved in AcOH with 0.25M H.sub.2SO.sub.4 or in 4.5 mL BuOH with 0.25M H.sub.2SO.sub.4. Both reactions were heated in sealed vials for 2-4 h at 120.degree. C. Reaction products are shown in FIG. 13.

[0127] Samples 1 and 2 represent authentic standard and the 3 h time point from the reaction in AcOH/H.sub.2SO.sub.4, respectively. Spiking of sample 2 with sample 1 gave a single peak further verifying the FDCA product. Samples 1 and 3 (FIG. 13) represent authentic standard and the 4 h time point from the reaction in BuOH/H.sub.2SO.sub.4, respectively. The formation of FDCA from the enzymatic reactions further confirms the presence of DDG in these samples.

Example 9

Synthesis of DDG from Glucose and Gluconate

[0128] This example shows the enzymatic conversion of glucose and gluconate to DDG. The reaction was conducted with purified enzymes, and crude lysates as a catalyst. Enzymes and substrates were combined in a bio-reactor as shown in the Table below:

       Substrate          ST-1  ST-14 (pSGI-504)  ST-15 (pSGI-434)  ST-7B (pSGI-476)  ST-8A (pSGI-353)  ST-A (pSGI-431)  ST-B
Rxn-1  Glucose, 600 mg    2 mg  7 mL.sup.1        50 mL.sup.2       7.5 mL.sup.1      1 mL.sup.3        4 mL.sup.4       2 mg
Rxn-2  Gluconate, 700 mg  --    7 mL              50 mL             7.5 mL            1 mL              4 mL             2 mg

.sup.1Lysate from 500 mL liquid culture of recombinant E. coli with plasmid
.sup.2Lysate from 2 L liquid culture of BL21DE3/pSGI-434
.sup.3Purified enzyme, ~30 Units of activity (or 3 mg of purified GlucD)
.sup.4Lysate from 250 mL of culture

[0129] The reaction was incubated at 35.degree. C. and dissolved oxygen and pH were kept at 20% and 8 respectively. Time points were analyzed by HPLC-MS and the results are shown in FIG. 17B. Extracted chromatograms verified the DDG mass (not shown) and corresponding MS fragmentation. The results clearly showed production of DDG during incubation of the enzymes with either glucose or gluconate.

Example 10

Construction of Expression Cassettes for Recombinant Glucarate Dehydratases

[0130] The following example describes the creation of recombinant nucleic acid constructs that contained coding sequence of a D-glucarate dehydratase activity (GDH, EC 4.2.1.40) for heterologous expression in E. coli cells.

[0131] Genes encoding D-glucarate dehydratase from E. coli (Expasy: P0AES2), Acinetobacter ADP1 (Expasy: P0AES2), as well as a proprietary Pseudomonas bacterial strain (#8114) were PCR-amplified from genomic DNA.

[0132] Each of the PCR-amplified genes was subsequently cloned into the bacterial transformation vector pET24a(+), in which the expression of each of the GDH genes was placed under control of a T7 promoter. The nucleotide sequences of each of the PCR-amplified inserts were also verified by sequencing confirmation.

Example 11

E. coli Strains Expressing Recombinant Glucarate Dehydratases

[0133] Each of the expression vectors constructed as described in Example 10 was introduced into NovaBlue(DE3) E. coli by heat shock-mediated transformation. Putative transformants were selected on LB agar supplemented with Kanamycin (50 .mu.g/ml). Appropriate PCR primers were used in colony-PCR assays to confirm positive clones that contained each of the expression vectors.

[0134] For each expression vector, a bacterial colony was picked from transformation plates and allowed to grow at 30.degree. C. in liquid LB media supplemented with Kanamycin (50 .mu.g/ml) for two days. The culture was then transferred into vials containing 15% glycerol and stored at -80.degree. C. as a frozen pure culture.

Example 12

Demonstration of In Vitro Synthesis of DDG by Using Cell Lysate of Recombinant E. coli Cells Expressing a GDH Enzyme

[0135] This Example describes how in vitro synthesis of DDG intermediate was achieved using recombinant glucarate dehydratase (GDH) enzymes produced in E. coli cells.

Preparation of Cell Lysates

[0136] Recombinant bacterial strains constructed as described previously in Example 11 were grown individually in 3 mL of liquid LB media supplemented with Kanamycin (50 .mu.g/ml) at 30.degree. C. on a rotating shaker with rotation speed pre-set at 250 rpm for 1 day. This preculture was used to inoculate 100 mL of TB media containing Kanamycin (50 .mu.g/ml), followed by incubation at 30.degree. C. on a rotating shaker pre-set at 250 rpm for 2-3 hours until early log phase (OD.sub.600.about.0.5-0.6) before isopropyl .beta.-D-1-thiogalactopyranoside (IPTG; 0.25 mM final concentration) was added to induce protein expression. Cells were allowed to grow for another 18 hours at 30.degree. C. before they were harvested by centrifugation, resuspended in 15 mL of lysis buffer (10 mM phosphate buffer, pH 7.8, 2 mM MgCl.sub.2) and lysed by sonication. The production of recombinant enzymes in E. coli cells was quantified using a standard pre-cast SDS-PAGE gel system (BioRad), and specific activity was measured according to a procedure described by Gulick et al. (Biochemistry 39, 4590-4602, 2000). Crude cell lysates or purified enzymes (using the HisTag) were then tested for the ability to convert gram amounts of glucarate to DDG as described in greater detail below.

Enzymatic Dehydration of Glucarate

[0137] A large scale dehydration of glucarate using glucarate dehydratase was prepared. 350 mL of water, 25 g of glucaric acid sodium salt (0.1 mole), and 4.5 g of KOH (0.08 mole) were mixed in an Erlenmeyer flask. Residual solid glucarate was dissolved by the slow addition of 5M KOH solution (.about.3 mL) and the pH was adjusted to 7.4. To this solution 100 mg of purified glucarate dehydratase and 2 mM MgCl2 were added, and the mixture was placed in an orbital shaker at 30.degree. C. for 20 h. The next day the precipitate was removed by filtration. The pH of the reaction was essentially unchanged. Analysis of the reaction revealed the presence of only DDG in the solution, indicating >95% yield.
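As a non-limiting illustration of the quantities involved in this reaction, the following Python sketch converts the 4.5 g of KOH to moles and estimates the theoretical mass of DDG obtainable from 0.1 mole of glucarate, given that the dehydration removes one molecule of water per molecule of glucarate (a 1:1 mole ratio). The molar masses (56.11 g/mol for KOH, 192.12 g/mol for DDG as the free acid) are standard or calculated values not stated in this example, and expressing DDG as the free acid rather than as a salt is an assumption; the function names are hypothetical.

```python
KOH_MW = 56.11   # g/mol (standard value, not stated in the example)
DDG_MW = 192.12  # g/mol, C6H8O7 free acid (calculated; salt forms weigh more)


def moles(mass_g, mw_g_per_mol):
    """Convert a mass in grams to moles."""
    return mass_g / mw_g_per_mol


def theoretical_ddg_g(glucarate_mol):
    """Dehydration is 1:1, so moles of DDG equal moles of glucarate consumed."""
    return glucarate_mol * DDG_MW


koh_mol = moles(4.5, KOH_MW)
max_ddg_g = theoretical_ddg_g(0.1)
min_ddg_g_at_95 = 0.95 * max_ddg_g  # lower bound implied by a >95% yield
print(f"KOH: {koh_mol:.3f} mol")
print(f"DDG: up to {max_ddg_g:.1f} g, more than {min_ddg_g_at_95:.1f} g at >95% yield")
```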

Purification of DDG Product from Enzymatic Reactions

[0138] DDG produced via enzymatic dehydration was purified using either of the following two techniques. In the first, the enzymatic dehydration reactions were acidified to pH ~3.0 with 6 M HCl, filtered to eliminate precipitate, and subsequently lyophilized to produce a white powder consisting of DDG and salts. In the second, the reaction was filtered through a 10 kDa membrane to remove proteins and then lyophilized, which gave the same DDG purity but a lower amount of salts. Without any further purification, either lyophilized powder can be dehydrated to FDCA (or its esters) or esterified to dibutyl DDG, as shown in other examples of this application.

[0139] Results of HPLC-MS analyses indicated that DDG product constituted at least 95% of the total products in the samples obtained from either of the two purification techniques.

Example 13

Demonstration of In Vitro Synthesis of FDCA from DDG in One-Step Chemical Reaction

[0140] Applicants have discovered that the synthesis of FDCA (i.e., the free acid form) could be achieved by a chemical conversion of DDG to FDCA in the presence of H2SO4. The reaction was performed as follows. Approximately 20 mg of DDG acid (crude lyophilized powder with salts, previously purified as described in Example 3) and 0.25 M H2SO4 were added to an airtight sealed tube containing 1 mL of water and 1 mL of DMSO. The DDG dissolved completely in this solution. The reaction was stirred at 105 °C for 18 hours. Results of an HPLC-MS analysis performed on a crude reaction sample indicated the formation of FDCA free acid (FDCA: 2,5-furandicarboxylic acid) as the major product, together with insignificant amounts of other unidentified byproducts. As a control in the HPLC-MS analysis, commercial FDCA was analyzed under the same conditions.
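The mass balance of this dehydration can be sketched as follows. The molecular formulas used (DDG free acid as C6H8O7 and FDCA as C6H4O5) are not stated in this example and are assumptions taken from the structures of the named compounds. Under that assumption, the conversion corresponds to the loss of two molecules of water. The 20 mg charge is a crude powder containing salts, so the figure computed here is only an upper bound on the FDCA obtainable.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(formula):
    """Molar mass (g/mol) of a compound given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


# ASSUMED formulas (not given in the text of this example).
DDG = {"C": 6, "H": 8, "O": 7}   # 5-dehydro-4-deoxy-glucaric acid, free acid
FDCA = {"C": 6, "H": 4, "O": 5}  # 2,5-furandicarboxylic acid
WATER = {"H": 2, "O": 1}

# Number of water molecules lost per DDG converted to FDCA.
waters_lost = (molar_mass(DDG) - molar_mass(FDCA)) / molar_mass(WATER)

# Upper bound on FDCA mass if all 20 mg of powder were pure DDG.
max_fdca_mg = 20.0 * molar_mass(FDCA) / molar_mass(DDG)

print(f"DDG: {molar_mass(DDG):.2f} g/mol, FDCA: {molar_mass(FDCA):.2f} g/mol")
print(f"Water lost per DDG: {waters_lost:.1f}")
print(f"Upper bound on FDCA from 20 mg DDG: {max_fdca_mg:.1f} mg")
```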

Example 14

Demonstration of In Vitro Synthesis of FDCA-Esters (Dimethyl-, Diethyl-, Dibutyl-, and Isopropyl-Esters)

Synthesis of Diethyl-2,5 FDCA from Purified DDG

[0141] In an airtight sealed tube, 18 mL of EtOH, 0.2 gram (1 mmole) of DDG acid, previously purified as described in Example 11, and 0.25 M H2SO4 were combined. The DDG acid was not completely dissolved in this solution. The reaction was gently stirred at 105 °C for 18 hours. Results of a GC-MS analysis of a crude reaction sample indicated the formation of diethyl-FDCA as the major product. As a control, authentic FDCA was chemically synthesized, esterified to diethyl-FDCA, and analyzed under the same conditions.

Example 15

Synthesis of Dibutyl-2,5 FDCA from Purified DDG

[0142] In an airtight sealed tube, 18 mL of n-BuOH, 0.2 gram (1 mmole) of DDG acid, previously purified as described in Example 11, and 0.25 M H2SO4 were combined. The DDG acid was not completely dissolved in this solution. The reaction was gently stirred at 105 °C for 18 hours. As shown in FIGS. 15A-15B, results of the GC-MS analysis of a reaction sample indicated that dibutyl-FDCA (FDCA: 2,5-furandicarboxylic acid) was formed as the major product. As a control, authentic FDCA was chemically synthesized, esterified to dibutyl-FDCA, and analyzed under the same conditions.

Example 16

Synthesis of Dibutyl-2,5 FDCA from Crude DDG (Unpurified)

[0143] 0.2 gram (1 mmole) of crude DDG acid, which was an unpurified lyophilized powder obtained directly from the enzymatic dehydration of glucarate as described in Example 11, was added to an airtight sealed tube containing 18 mL of n-BuOH, followed by addition of 0.25 M H2SO4. The crude DDG acid was not completely dissolved in this solution. The reaction was gently stirred at 105 °C for 18 hours. Results of a GC-MS analysis of a crude reaction sample indicated that dibutyl-FDCA (FDCA: 2,5-furandicarboxylic acid) was formed as the major product. The GC-MS result indicated that the presence of contaminant salts in the crude/unpurified lyophilized powder did not significantly affect the reaction outcome. As a control, authentic FDCA was chemically synthesized, esterified to dibutyl-FDCA, and analyzed under the same conditions.

Example 17

In Vitro Production of FDCA and/or Esters Using Immobilized Acids

[0144] In industrial practice, immobilized acids offer many advantages for performing dehydrations, since they can typically operate in several types of solvent (aqueous, organic, mixed, etc.). In addition, they can be easily recycled and reused. The following are some examples of the synthesis of esters of FDCA using immobilized AMBERLYST® 15 (Rohm and Haas, Philadelphia, Pa.) and DOWEX® 50 WX8 (Dow Chemical Co, Midland, Mich.).

Synthesis of Dibutyl-FDCA from Crude DDG by Using DOWEX® 50 WX8

[0145] In an airtight sealed tube, 2 mL of n-butanol, 20 mg of crude DDG acid (unpurified lyophilized powder containing salts), and 200 mg of DOWEX® 50 WX8 were combined. The DDG was not completely dissolved in this solution. The reaction was gently stirred at 105 °C for 18 hours. Results of the GC-MS analysis of a crude reaction sample indicated that dibutyl-FDCA (FDCA: 2,5-furandicarboxylic acid) was formed as the major product. This GC-MS result indicated that the presence of contaminant salts (phosphate and NaCl) in the crude/unpurified lyophilized powder did not significantly affect the reaction outcome. As a control, authentic FDCA was chemically synthesized, esterified to dibutyl-FDCA, and analyzed under the same conditions.

Synthesis of Dibutyl-FDCA from Crude DDG by Using AMBERLYST® 15

[0146] In an airtight sealed tube, 2 mL of n-butanol, 20 mg of crude DDG acid (crude lyophilized powder with salts), and 200 mg of AMBERLYST® 15 (Rohm and Haas, Philadelphia, Pa.) were combined. The DDG was not completely dissolved in this solution. The reaction was gently stirred at 105 °C for 18 hours. Results of the GC-MS analysis of a crude reaction sample indicated that dibutyl-FDCA (FDCA: 2,5-furandicarboxylic acid) was formed as the major product. This GC-MS result indicated that the presence of contaminant salts (phosphate and NaCl) in the crude/unpurified lyophilized powder did not significantly affect the reaction outcome. As a control, authentic FDCA was chemically synthesized, esterified to dibutyl-FDCA, and analyzed under the same conditions.

Synthesis of Diethyl-FDCA from Crude DDG by Using AMBERLYST® 15

[0147] In an airtight sealed tube, 2 mL of ethanol, 20 mg of crude DDG acid (unpurified lyophilized powder containing salts), and 200 mg of AMBERLYST® 15 (Rohm and Haas, Philadelphia, Pa.) were combined. The DDG was not completely dissolved in this solution. The reaction was gently stirred at 105 °C for 18 hours. Results of the GC-MS analysis of a crude reaction sample indicated that diethyl-FDCA (FDCA: 2,5-furandicarboxylic acid) was formed as the major product. This GC-MS result indicated that the presence of contaminant salts (phosphate and NaCl) in the crude/unpurified lyophilized powder did not significantly affect the reaction outcome. As a control, commercial FDCA was chemically esterified to diethyl-FDCA and analyzed under the same conditions.

Synthesis of Diethyl-FDCA from Crude DDG by Using DOWEX® 50 WX8

[0148] In an airtight sealed tube, 2 mL of ethanol, 20 mg of crude DDG acid (unpurified lyophilized powder containing salts), and 200 mg of DOWEX® 50 WX8 were combined. The DDG was not completely dissolved in this solution. The reaction was gently stirred at 105 °C for 18 hours. Results of the GC-MS analysis of a crude reaction sample indicated that diethyl-FDCA (FDCA: 2,5-furandicarboxylic acid) was formed as the major product. This GC-MS result indicated that the presence of contaminant salts (phosphate and NaCl) in the crude/unpurified lyophilized powder did not significantly affect the reaction outcome. As a control, commercial FDCA was chemically esterified to diethyl-FDCA and analyzed under the same conditions.

Example 18

Production of FDCA Derivatives

[0149] The synthesis of a number of high-value FDCA derivatives is described in FIG. 16, in which dehydration of DTHU produces furfural-5-carboxylic acid (FCA), which can then be chemically or enzymatically oxidized to FDCA, reduced to FCH, or transaminated (using chemical reductive amination or a transaminase) to the amino acid AFC.

Example 19

Production of Di-Butyl FDCA in a Gas Phase Reaction

[0150] In this example, the inlet of the GC was used as a high-temperature reactor to catalyze the dehydration of di-butyl DDG to di-butyl FDCA. The resulting products were chromatographically separated and detected by mass spectrometry. A solution of di-butyl DDG (10 mM) and sulfuric acid (100 mM) in butanol was placed in a GC vial. The solution was injected into the GC, and FDCA dibutyl ester was observed. The reaction occurred in the 300 °C inlet (residence time = 4 seconds). The average yield of 6 injections was 54%.

[0151] GC Settings: Direct Liquid Inject/MS Detector
[0152] Inlet: 300 °C, total flow 29.51 mL/min, split ratio 10:1, split flow 24.1 mL/min, septum purge flow 3 mL/min.
[0153] GC Liner: 4 mm, glass wool (P/N 5183-4647)
[0154] Column Flow: 2.41 mL/min He, constant pressure control
[0155] Oven Program: At 40 °C hold for 2 min, then ramp 25 °C/min to 275 °C, then ramp 40 °C/min to 325 °C, hold for 2 min.
[0156] Column: HP-5MS, Agilent Technologies, 30 m × 0.25 mm × 0.25 µm.
[0157] Total Runtime: 14.65 minutes
[0158] MSD Transfer line: 290 °C
[0159] MS Source: 250 °C
[0160] MS Quad: 150 °C
[0161] Retention Times:
[0162] 2,3-FDCA Dibutyl ester: 9.3 min
[0163] 2,5-FDCA Dibutyl ester: 9.7 min
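The GC settings listed above are internally consistent, which can be checked with a short calculation. The sketch below assumes the usual split-inlet relationships (total flow = column flow + split flow + septum purge flow; split ratio = split flow / column flow) and sums the hold and ramp segments of the oven program. The function name and data layout are illustrative only.

```python
# Inlet flows in mL/min, as listed in the GC settings.
split_flow = 24.1
column_flow = 2.41
septum_purge_flow = 3.0

# ASSUMED split-inlet relationships (standard convention, not stated in the text).
total_flow = split_flow + column_flow + septum_purge_flow
split_ratio = split_flow / column_flow


def oven_runtime_min(start_temp_c, initial_hold_min, ramps):
    """Total oven program time in minutes.

    ramps is a list of (rate_c_per_min, target_temp_c, hold_min) tuples,
    applied in order starting from start_temp_c.
    """
    elapsed = initial_hold_min
    temp = start_temp_c
    for rate, target, hold in ramps:
        elapsed += (target - temp) / rate + hold
        temp = target
    return elapsed


# 40 C hold 2 min; 25 C/min to 275 C; 40 C/min to 325 C, hold 2 min.
runtime = oven_runtime_min(40.0, 2.0, [(25.0, 275.0, 0.0), (40.0, 325.0, 2.0)])

print(f"Total flow: {total_flow:.2f} mL/min")   # matches the listed 29.51
print(f"Split ratio: {split_ratio:.0f}:1")      # matches the listed 10:1
print(f"Oven program runtime: {runtime:.2f} min")  # matches the listed 14.65
```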

[0164] All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

[0165] No admission is made that any reference constitutes prior art. The discussion of the references states what their authors assert, and the applicants reserve the right to challenge the accuracy and pertinence of the cited documents. It will be clearly understood that although a number of prior art publications are referred to herein, this reference does not constitute an admission that any of these documents forms part of the common general knowledge in the art.

[0166] It should also be understood that the foregoing examples are offered to illustrate, but not limit, the invention.

Sequence CWU 1

1

841267PRTAgrobacterium tumefaciens 1Met Ala Met Lys Arg Leu Leu Val Thr Gly Ala Ala Gly Gln Leu Gly 1 5 10 15 Arg Val Met Arg Lys Arg Leu Ala Ser Met Ala Glu Ile Val Arg Leu 20 25 30 Ala Asp Leu Ala Pro Leu Asp Pro Ala Gly Pro Asn Glu Glu Cys Met 35 40 45 Gln Cys Asp Leu Ala Asp Ala Asp Ala Val Asp Ala Met Val Ala Gly 50 55 60 Cys Asp Gly Ile Val His Leu Gly Gly Ile Ser Val Glu Lys Pro Phe 65 70 75 80 Glu Gln Ile Leu Gln Gly Asn Ile Ile Gly Leu Tyr Asn Leu Tyr Glu 85 90 95 Ala Ala Arg Ala His Gly Gln Pro Arg Ile Ile Phe Ala Ser Ser Asn 100 105 110 His Thr Ile Gly Tyr Tyr Pro Gln Thr Glu Arg Leu Gly Pro Asp Val 115 120 125 Pro Phe Arg Pro Asp Gly Leu Tyr Gly Val Ser Lys Cys Phe Gly Glu 130 135 140 Ser Leu Ala Arg Met Tyr Phe Glu Lys Phe Gly Gln Glu Thr Ala Leu 145 150 155 160 Val Arg Ile Gly Ser Cys Thr Pro Glu Pro Leu Asn Tyr Arg Met Leu 165 170 175 Ser Thr Trp Phe Ser His Asp Asp Phe Val Ser Leu Ile Glu Ala Ala 180 185 190 Phe Arg Ala Pro Val Leu Gly Cys Pro Ile Val Trp Gly Ala Ser Ala 195 200 205 Asn Asp Ala Ser Trp Trp Asp Asn Ser His Leu Gly Phe Ile Gly Trp 210 215 220 Lys Pro Lys Asp Asn Ala Glu Ala Phe Arg Arg Lys Ile Ala Glu Thr 225 230 235 240 Thr Pro Gln Pro Asp Ala Arg Asp Pro Ile Val Arg Phe Gln Gly Gly 245 250 255 Val Phe Val Asp Asn Pro Ile Phe Lys Glu Thr 260 265 2266PRTRhizobium lupini 2Met Lys Arg Leu Leu Ile Thr Gly Ala Ala Gly Ala Leu Gly Arg Val 1 5 10 15 Met Arg Glu Arg Leu Ala Pro Met Ala Thr Ile Leu Arg Leu Ser Asp 20 25 30 Ile Ala Pro Ile Gly Ala Ala Arg Gln Asn Glu Glu Ile Val Gln Cys 35 40 45 Asp Leu Ala Asp Ala Lys Ala Val His Ala Leu Val Glu Asp Cys Asp 50 55 60 Gly Ile Val His Leu Gly Gly Val Ser Val Glu Arg Lys Phe Ser Gln 65 70 75 80 Ile Val Ala Gly Asn Ile Val Gly Leu Tyr Asn Leu Tyr Glu Ala Ala 85 90 95 Arg Ala His Arg Met Pro Arg Ile Val Phe Ala Ser Ser Asn His Thr 100 105 110 Ile Gly Phe Tyr Pro Gln Thr Glu Arg Leu Ser Val Asp His Pro Tyr 115 120 125 Arg Pro Asp Gly Leu Tyr Gly Val Ser Lys Cys Phe Gly Glu Ser Leu 130 135 140 Ala 
His Met Tyr His Glu Lys Phe Gly Gln Glu Thr Ala Leu Val Arg 145 150 155 160 Ile Gly Ser Cys Val Thr Glu Pro Val Asn His Arg Met Leu Ser Thr 165 170 175 Trp Leu Ser Tyr Asp Asp Phe Val Ser Leu Ile Glu Ala Val Phe Arg 180 185 190 Ala Pro Lys Leu Gly Cys Pro Val Ile Trp Gly Ala Ser Asn Asn Asp 195 200 205 Ala Gly Trp Trp Asp Asn Ser Ala Ala Gly Phe Leu Gly Trp Lys Pro 210 215 220 Lys Asp Asn Ala Glu Ile Phe Arg Ser Lys Ile Glu Ala Ala Cys Glu 225 230 235 240 Arg Pro Gly Ser Asp Asp Pro Ala Ala Arg Trp Gln Gly Gly Leu Phe 245 250 255 Thr Gln Asp Pro Ile Phe Pro Glu Asp Glu 260 265 3272PRTPseudomonas sp. 3Met Thr Thr Ala Tyr Thr Pro Phe Asn Arg Leu Leu Leu Thr Gly Ala 1 5 10 15 Ala Gly Gly Leu Gly Lys Val Leu Arg Glu Ser Leu Arg Pro Tyr Ala 20 25 30 Asn Val Leu Arg Val Ser Asp Ile Ala Ala Met Ser Pro Ala Thr Gly 35 40 45 Ala His Glu Glu Val Gln Val Cys Asp Leu Ala Asp Lys Ala Ala Val 50 55 60 His Gln Leu Val Glu Gly Val Asp Ala Ile Leu His Phe Gly Gly Val 65 70 75 80 Ser Val Glu Arg Pro Phe Glu Glu Ile Leu Gly Ala Asn Ile Cys Gly 85 90 95 Val Phe His Ile Tyr Glu Ala Ala Arg Arg His Gly Val Lys Arg Val 100 105 110 Ile Phe Ala Ser Ser Asn His Val Ile Gly Phe Tyr Lys Gln Asp Glu 115 120 125 Thr Ile Asp Ala Asn Cys Pro Arg Arg Pro Asp Ser Tyr Tyr Gly Leu 130 135 140 Ser Lys Ser Tyr Gly Glu Asp Met Ala Ser Phe Tyr Phe Asp Arg Tyr 145 150 155 160 Gly Ile Glu Thr Val Ser Ile Arg Ile Gly Ser Ser Phe Pro Glu Pro 165 170 175 His Asn Arg Arg Met Met Ser Thr Trp Leu Ser Phe Ala Asp Leu Thr 180 185 190 Gln Leu Leu Glu Arg Ala Leu Tyr Thr Pro Asn Val Gly His Thr Val 195 200 205 Val Tyr Gly Met Ser Ala Asn Lys Asn Val Trp Trp Asp Asn His Leu 210 215 220 Ala Ala His Leu Gly Phe Gln Pro Lys Asp Ser Ser Glu Val Phe Arg 225 230 235 240 Ala Gln Ile Asp Ala Gln Pro Met Pro Ala Ala Asp Asp Pro Ala Met 245 250 255 Val Phe Gln Gly Gly Ala Phe Val Ala Ala Gly Pro Phe Gly Asp Asp 260 265 270 4804DNAAgrobacterium tumefaciens 4atggcaatga aacggcttct tgttaccggt gctgcgggcc agcttggccg cgttatgcgc 
60aaacgccttg catcgatggc cgagatcgtt cgccttgccg atctcgcccc gctcgatccg 120gcaggcccga acgaggaatg catgcaatgc gaccttgcgg atgcagacgc cgttgacgcc 180atggttgccg gttgcgacgg catcgttcac ctcggcggca tatcggtgga gaagcctttc 240gaacaaatcc ttcagggcaa catcatcggg ctgtataatc tctatgaggc cgcccgcgcc 300cacggccagc cgcgcatcat cttcgccagt tcgaaccata cgatcggtta ttacccgcag 360acggagaggc ttggaccgga tgttcccttc cgcccggatg ggctttacgg cgtctccaaa 420tgtttcggcg agagccttgc ccgcatgtat ttcgagaaat tcggccagga gaccgcactt 480gtccgcatcg gctcctgcac gccggaaccc cttaattacc gcatgctgtc cacctggttt 540tcgcatgacg atttcgtctc gctgatcgag gcggcgttcc gcgcccccgt gctcggctgc 600cccatcgtct ggggggcgtc ggccaacgat gcgagctggt gggacaattc gcatctcggc 660tttattggat ggaaaccgaa ggacaatgcc gaggccttcc gccggaagat tgccgaaacg 720acgccgcagc cggacgcgcg cgacccgatt gtccgctttc agggtggcgt gtttgtcgac 780aacccgatct tcaaggagac gtga 8045801DNARhizobium lupini 5atgaagagac ttctgattac cggcgcagcg ggtgcactgg gccgcgtgat gcgggaaagg 60ctcgcaccca tggcaacgat tctgcgcctt tccgatatcg ccccgattgg agcggcccgc 120cagaacgagg aaatcgtcca gtgcgatctt gccgatgcca aagcagtgca tgctctggtc 180gaagattgcg acgggatcgt ccatctcggt ggcgtctcag tagagcgcaa gttctcgcag 240atcgtcgccg gcaacatcgt cggcctttac aatctctacg aagccgcacg cgcgcatcgg 300atgccgcgca tcgtctttgc aagttccaat cacaccatcg gcttttatcc gcaaaccgaa 360cggttgtcgg tggaccatcc ctatcgtccg gacgggctct acggcgtatc gaaatgtttc 420ggcgagtctc tggcgcatat gtaccatgag aagttcgggc aggagacggc actcgtgcgc 480atcgggtcct gcgtgaccga accggtcaac catcgcatgc tttccacctg gctttcctac 540gatgatttcg tctcgcttat cgaggccgta ttccgtgcgc cgaaactcgg ctgccccgtc 600atctggggcg cgtcgaacaa cgatgcagga tggtgggaca attccgccgc cggctttctc 660ggctggaagc cgaaagacaa tgccgaaatc ttccgttcga agatcgaagc cgcttgcgaa 720cgccccggtt ctgatgatcc ggccgcccgc tggcaaggcg ggctcttcac gcaggacccg 780atcttcccag aggacgagta a 8016819DNAPseudomonas sp. 
6atgaccacag cctacacccc cttcaatcgc ctgctactca ccggagcggc aggcggcctc 60ggcaaggtcc tgcgcgaaag cctgcgacct tatgccaacg tcctgcgcgt ctccgacatc 120gcggccatga gccctgccac aggcgcccat gaagaagtcc aggtctgcga cctcgccgat 180aaagcggcgg tccatcaact ggtcgaaggc gtcgacgcaa tcctgcactt cggtggcgta 240tcggtggagc ggcccttcga ggaaatcctc ggggccaata tctgcggcgt gtttcatatc 300tatgaagcgg cgcgccggca tggcgtaaag cgggtgatct tcgccagctc caaccacgtc 360atcggttttt ataagcagga cgaaaccatc gacgccaact gcccgcgccg ccccgacagc 420tactacggtc tgtccaagtc ctacggcgaa gacatggcca gcttctactt cgaccgctac 480ggcatcgaga ccgtgagcat ccgcatcggc tcctcgttcc ccgagccgca caatcgccgc 540atgatgagca cctggctgag ctttgccgac ctgacgcagc tgctcgaacg cgcgctgtac 600acccccaacg tcggccacac cgtggtctac ggcatgtccg ctaacaagaa cgtctggtgg 660gacaaccacc tggccgcgca cctgggcttc caaccgaagg acagctccga ggtgttccgt 720gcgcagatcg atgcccagcc gatgcccgcc gccgatgacc cggcgatggt ctttcaaggc 780ggcgcctttg tcgcagccgg gccgttcggc gacgactga 8197274PRTRhizobium sp. 7Met Leu Asn Val Glu Thr Arg His Ala Val His Ala Asp His Ala Arg 1 5 10 15 Ser Leu Asp Thr Glu Gly Leu Arg Arg His Phe Leu Ala Gln Gly Leu 20 25 30 Phe Ala Glu Gly Glu Ile Arg Leu Ile Tyr Thr His Tyr Asp Arg Phe 35 40 45 Val Met Gly Gly Ala Val Pro Asp Gly Ala Pro Leu Val Leu Asp His 50 55 60 Val Glu Glu Thr Lys Thr Pro Gly Phe Leu Asp Arg Arg Glu Met Gly 65 70 75 80 Ile Val Asn Ile Gly Ala Glu Gly Ser Val His Ala Gly Asn Glu Ser 85 90 95 Trp Ser Leu Asn Arg Gly Asp Val Leu Tyr Leu Gly Met Gly Ala Gly 100 105 110 Pro Val Thr Phe Glu Gly Ala Gly Arg Phe Tyr Leu Val Ser Ala Pro 115 120 125 Ala His Arg Ser Leu Pro Asn Arg Leu Val Thr Pro Ala Asp Ser Lys 130 135 140 Glu Val Lys Leu Gly Ala Leu Glu Thr Ser Asn Lys Arg Thr Ile Asn 145 150 155 160 Gln Phe Ile His Pro Leu Val Met Glu Ser Cys Gln Leu Val Leu Gly 165 170 175 Tyr Thr Thr Leu Glu Asp Gly Ser Val Trp Asn Thr Met Pro Ala His 180 185 190 Val His Asp Arg Arg Met Glu Ala Tyr Leu Tyr Phe Gly Met Asp Glu 195 200 205 Thr Ser Arg Val Leu His Leu Met Gly Glu Pro Gln Gln Thr 
Arg His 210 215 220 Leu Phe Val Ala Asn Glu Glu Gly Ala Ile Ser Pro Pro Trp Ser Ile 225 230 235 240 His Ala Gly Ala Gly Ile Gly Ser Tyr Thr Phe Ile Trp Ala Met Ala 245 250 255 Gly Asp Asn Val Asp Tyr Thr Asp Met Glu Phe Ile Gln Pro Gly Asp 260 265 270 Leu Arg 8278PRTEscherichia coli K-12 8Met Asp Val Arg Gln Ser Ile His Ser Ala His Ala Lys Thr Leu Asp 1 5 10 15 Thr Gln Gly Leu Arg Asn Glu Phe Leu Val Glu Lys Val Phe Val Ala 20 25 30 Asp Glu Tyr Thr Met Val Tyr Ser His Ile Asp Arg Ile Ile Val Gly 35 40 45 Gly Ile Met Pro Ile Thr Lys Thr Val Ser Val Gly Gly Glu Val Gly 50 55 60 Lys Gln Leu Gly Val Ser Tyr Phe Leu Glu Arg Arg Glu Leu Gly Val 65 70 75 80 Ile Asn Ile Gly Gly Ala Gly Thr Ile Thr Val Asp Gly Gln Cys Tyr 85 90 95 Glu Ile Gly His Arg Asp Ala Leu Tyr Val Gly Lys Gly Ala Lys Glu 100 105 110 Val Val Phe Ala Ser Ile Asp Thr Gly Thr Pro Ala Lys Phe Tyr Tyr 115 120 125 Asn Cys Ala Pro Ala His Thr Thr Tyr Pro Thr Lys Lys Val Thr Pro 130 135 140 Asp Glu Val Ser Pro Val Thr Leu Gly Asp Asn Leu Thr Ser Asn Arg 145 150 155 160 Arg Thr Ile Asn Lys Tyr Phe Val Pro Asp Val Leu Glu Thr Cys Gln 165 170 175 Leu Ser Met Gly Leu Thr Glu Leu Ala Pro Gly Asn Leu Trp Asn Thr 180 185 190 Met Pro Cys His Thr His Glu Arg Arg Met Glu Val Tyr Phe Tyr Phe 195 200 205 Asn Met Asp Asp Asp Ala Cys Val Phe His Met Met Gly Gln Pro Gln 210 215 220 Glu Thr Arg His Ile Val Met His Asn Glu Gln Ala Val Ile Ser Pro 225 230 235 240 Ser Trp Ser Ile His Ser Gly Val Gly Thr Lys Ala Tyr Thr Phe Ile 245 250 255 Trp Gly Met Val Gly Glu Asn Gln Val Phe Asp Asp Met Asp His Val 260 265 270 Ala Val Lys Asp Leu Arg 275 9278PRTRhizobium sp. 
9Met Thr Met Lys Ile Leu Tyr Gly Ala Gly Pro Glu Asp Val Lys Gly 1 5 10 15 Tyr Asp Thr Gln Arg Leu Arg Asp Ala Phe Leu Leu Asp Asp Leu Phe 20 25 30 Ala Asp Asp Arg Val Ser Phe Thr Tyr Thr His Val Asp Arg Leu Ile 35 40 45 Leu Gly Gly Ala Val Pro Val Thr Thr Ser Leu Thr Phe Gly Ser Gly 50 55 60 Thr Glu Ile Gly Thr Pro Tyr Leu Leu Ser Ala Arg Glu Met Gly Ile 65 70 75 80 Ala Asn Leu Gly Gly Thr Gly Thr Ile Glu Val Asp Gly Gln Arg Phe 85 90 95 Thr Leu Glu Asn Arg Asp Val Leu Tyr Val Gly Arg Gly Ala Arg Gln 100 105 110 Met Thr Ala Ser Ser Leu Ser Ala Glu Arg Pro Ala Arg Phe Tyr Met 115 120 125 Asn Ser Val Pro Ala Gly Ala Asp Phe Pro His Arg Leu Ile Thr Arg 130 135 140 Gly Glu Ala Lys Pro Leu Asp Leu Gly Asp Ala Arg Arg Ser Asn Arg 145 150 155 160 Arg Arg Leu Ala Met Tyr Ile His Pro Glu Val Ser Pro Ser Cys Leu 165 170 175 Leu Leu Met Gly Ile Thr Asp Leu Ala Glu Gly Ser Ala Trp Asn Thr 180 185 190 Met Pro Pro His Leu His Glu Arg Arg Met Glu Ala Tyr Cys Tyr Phe 195 200 205 Asp Leu Ser Pro Glu Asp Arg Val Ile His Met Met Gly Arg Pro Asp 210 215 220 Glu Thr Arg His Leu Val Val Ala Asp Gly Glu Ala Val Leu Ser Pro 225 230 235 240 Ala Trp Ser Ile His Met Gly Ala Gly Thr Gly Pro Tyr Ala Phe Val 245 250 255 Trp Gly Met Thr Gly Glu Asn Gln Glu Tyr Asn Asp Val Ala Pro Val 260 265 270 Ala Val Ala Asp Leu Lys 275 10274PRTPannonibacter phragmitetus 10Met Leu Thr Val Glu Thr Arg His Ala Ile Asp Pro Gln Thr Ala Lys 1 5 10 15 Arg Met Asp Thr Glu Glu Leu Arg Lys His Phe His Met Gly Ser Leu 20 25 30 Phe Ala Ala Gly Glu Ile Arg Leu Val Tyr Thr His Tyr Asp Arg Met 35 40 45 Ile Val Gly Ala Ala Val Pro Ser Gly Ala Pro Leu Val Leu Asp Gln 50 55 60 Val Lys Glu Cys Gly Thr Ala Ser Ile Leu Asp Arg Arg Glu Met Ala 65 70 75 80 Val Val Asn Val Gly Ala Ser Gly Lys Val Ser Ala Ala Gly Glu Thr 85 90 95 Tyr Ala Met Glu Arg Gly Asp Val Leu Tyr Leu Pro Leu Gly Ser Gly 100 105 110 Lys Val Thr Phe Glu Gly Glu Gly Arg Phe Tyr Ile Leu Ser Ala Pro 115 120 125 Ala His Ala Ala Tyr Pro Ala Arg Leu Ile Arg Ile Gly 
Glu Ala Glu 130 135 140 Lys Val Lys Leu Gly Ser Ala Glu Thr Ser Asn Asp Arg Thr Ile Tyr 145 150 155 160 Gln Phe Val His Pro Ala Val Met Thr Ser Cys Gln Leu Val Val Gly 165 170 175 Tyr Thr Gln Leu His Asn Gly Ser Val Trp Asn Thr Met Pro Ala His 180 185 190 Val His Asp Arg Arg Met Glu Ala Tyr Leu Tyr Phe Asp Met Lys Pro 195 200 205 Glu Gln Arg Val Phe His Phe Met Gly Glu Pro Gln Glu Thr Arg His 210 215 220 Leu Val Met Lys Asn Glu Asp Ala Val Val Ser Pro Pro Trp Ser Ile 225 230 235 240 His Cys Gly Ala Gly Thr Gly Ser Tyr Thr Phe

Ile Trp Ala Met Ala 245 250 255 Gly Asp Asn Val Asp Tyr Lys Asp Val Glu Met Val Ala Met Glu Asp 260 265 270 Leu Arg 11271PRTBacillus subtilis 11Met Ser Tyr Leu Leu Arg Lys Pro Gln Ser Asn Glu Val Ser Asn Gly 1 5 10 15 Val Lys Leu Val His Glu Val Thr Lys Ser Asn Ser Asp Leu Thr Tyr 20 25 30 Val Glu Phe Lys Val Leu Asp Leu Ala Ser Gly Ser Ser Tyr Ala Glu 35 40 45 Glu Leu Lys Lys Gln Glu Ile Cys Ile Val Ala Val Thr Gly Asn Ile 50 55 60 Thr Val Thr Asp His Glu Ser Thr Phe Glu Asn Ile Gly Thr Arg Glu 65 70 75 80 Ser Val Phe Glu Arg Lys Pro Thr Asp Ser Val Tyr Ile Ser Asn Asp 85 90 95 Arg Ser Phe Glu Ile Thr Ala Val Ser Asp Ala Arg Val Ala Leu Cys 100 105 110 Tyr Ser Pro Ser Glu Lys Gln Leu Pro Thr Lys Leu Ile Lys Ala Glu 115 120 125 Asp Asn Gly Ile Glu His Arg Gly Lys Phe Ser Asn Lys Arg Thr Val 130 135 140 His Asn Ile Leu Pro Asp Ser Asp Pro Ser Ala Asn Ser Leu Leu Val 145 150 155 160 Val Glu Val Tyr Thr Asp Ser Gly Asn Trp Ser Ser Tyr Pro Pro His 165 170 175 Lys His Asp Gln Asp Asn Leu Pro Glu Glu Ser Phe Leu Glu Glu Thr 180 185 190 Tyr Tyr His Glu Leu Asp Pro Gly Gln Gly Phe Val Phe Gln Arg Val 195 200 205 Tyr Thr Asp Asp Arg Ser Ile Asp Glu Thr Met Thr Val Glu Asn Glu 210 215 220 Asn Val Val Ile Val Pro Ala Gly Tyr His Pro Val Gly Val Pro Asp 225 230 235 240 Gly Tyr Thr Ser Tyr Tyr Leu Asn Val Met Ala Gly Pro Thr Arg Lys 245 250 255 Trp Lys Phe His Asn Asp Pro Ala His Glu Trp Ile Leu Glu Arg 260 265 270 12269PRTOchrobactrum anthropi 12Met Ala Asn Leu Leu Arg Lys Pro Asn Gly Thr His Gly Lys Val His 1 5 10 15 Asp Ile Thr Pro Glu Asn Ala Lys Trp Gly Tyr Val Gly Phe Gly Leu 20 25 30 Phe Arg Leu Lys Ser Gly Glu Ser Val Ser Glu Lys Thr Gly Ser Thr 35 40 45 Glu Val Ile Leu Val Leu Val Glu Gly Lys Ala Lys Ile Ser Ala Ser 50 55 60 Gly Glu Asp Phe Gly Glu Met Gly Glu Arg Leu Asn Val Phe Glu Lys 65 70 75 80 Leu Pro Pro His Cys Leu Tyr Val Pro Ala Glu Ser Asp Trp His Ala 85 90 95 Thr Ala Thr Thr Asp Cys Val Leu Ala Val Cys Thr Ala Pro Gly Lys 100 105 110 Pro Gly Arg Lys Ala Gln 
Lys Leu Gly Pro Glu Ser Leu Thr Leu Glu 115 120 125 Gln Arg Gly Lys Gly Ala Asn Thr Arg Phe Ile His Asn Ile Ala Met 130 135 140 Glu Ser Arg Asp Val Ala Asp Ser Leu Leu Val Thr Glu Val Phe Thr 145 150 155 160 Pro Gln Gly Asn Trp Ser Ser Tyr Pro Pro His Arg His Asp Glu Asp 165 170 175 Asn Phe Pro Asp Met Thr Tyr Leu Glu Glu Thr Tyr Tyr His Arg Leu 180 185 190 Asn Pro Ala Gln Gly Phe Gly Phe Gln Arg Val Phe Thr Glu Asp Gly 195 200 205 Ser Leu Asp Glu Thr Met Ala Val Ser Asp Gly Asp Val Val Leu Val 210 215 220 Pro Lys Gly His His Pro Cys Gly Ala Pro Tyr Gly Tyr Glu Met Tyr 225 230 235 240 Tyr Leu Asn Val Met Ala Gly Pro Leu Arg Lys Trp Arg Phe Lys Asn 245 250 255 His Pro Asp His Asp Trp Ile Phe Lys Arg Asp Asn Pro 260 265 13268PRTHalomonas titanicae 13Met Ala Ser Leu Leu Val Arg Pro Thr Ala Pro Asp Ala Gln Gly Thr 1 5 10 15 Val Ile Asp Val Thr Pro Glu Ser Ala Gly Trp Thr His Val Gly Phe 20 25 30 Arg Val His Lys Leu Ala Lys Gly Gln Arg Leu Glu Ala Ser Ser Asp 35 40 45 Asp Gln Glu Val Cys Leu Val Leu Leu Thr Gly Arg Ala Thr Val Thr 50 55 60 Cys Gly Glu His Arg Phe Glu Asp Ile Gly Gln Arg Met Asp Ile Phe 65 70 75 80 Glu Gln Ile Pro Pro Tyr Ala Val Tyr Leu Pro Asp His Val Ser Tyr 85 90 95 Ala Val Glu Ala Thr Thr Asp Leu Glu Leu Ala Val Cys Thr Ala Pro 100 105 110 Gly His Gly Asn His Ala Pro Arg Leu Ile Ala Pro Asp Asn Ile Lys 115 120 125 Gln Ser Thr Arg Gly Gln Gly Thr Asn Thr Arg His Val His Asp Ile 130 135 140 Leu Pro Glu Thr Glu Pro Ala Asp Ser Leu Leu Val Val Glu Val Phe 145 150 155 160 Thr Pro Ala Gly Asn Trp Ser Ser Tyr Pro Pro His Lys His Asp Val 165 170 175 Asp Asn Leu Pro His Glu Ser His Leu Glu Glu Thr Tyr Tyr His Arg 180 185 190 Ile Asn Pro Glu Gln Gly Phe Ala Phe Gln Arg Val Tyr Thr Asp Asp 195 200 205 Arg Ser Leu Asp Glu Thr Met Ala Val Glu Asn Gly Cys Cys Val Leu 210 215 220 Val Pro Lys Gly Tyr His Pro Val Gly Ala Ser His Gly Tyr Ser Leu 225 230 235 240 Tyr Tyr Leu Asn Val Met Ala Gly Pro Lys Arg Ala Trp Lys Phe His 245 250 255 Asn Asp Pro Asp His Glu Trp Leu 
Met Asn Ala Gly 260 265 14264PRTAcidiphilium multivorum 14Met Pro Asp Leu Leu Arg Lys Pro Phe Gly Thr His Gly Lys Val His 1 5 10 15 Asp Ile Thr Pro Ala Ala Ala Gly Trp Arg His Val Gly Phe Gly Leu 20 25 30 Tyr Arg Leu Arg Ala Gly Glu Phe Ala Ala Glu Ala Thr Gly Gly Asn 35 40 45 Glu Val Ile Leu Val Met Val Glu Gly Lys Ala Ser Ile Arg Ala Ala 50 55 60 Gly Arg Asp Trp Gly Val Leu Gly Glu Arg Met Ser Val Phe Glu Lys 65 70 75 80 Ser Pro Pro His Ser Leu Tyr Val Pro Asn Gly Ala Glu Trp Ala Leu 85 90 95 Val Ala Glu Thr Asp Cys Ile Val Ala Val Cys Ser Ala Pro Gly Arg 100 105 110 Gly Gly His Ala Ala Arg Arg Ile Gly Pro Glu Gly Ile Val Leu Thr 115 120 125 Ala Arg Gly Glu Gly Thr Asn Thr Arg His Ile Asn Asn Ile Ala Met 130 135 140 Glu Ala Glu Asp Tyr Cys Asp Ala Leu Leu Val Thr Glu Val Phe Thr 145 150 155 160 Pro Ala Gly His Trp Ser Ser Tyr Pro Ser His Arg His Asp Glu Asp 165 170 175 Asp Asp Pro Arg Ile Thr Tyr Leu Glu Glu Thr Tyr Tyr His Arg Leu 180 185 190 Asn Pro Ala Ser Gly Phe Gly Val Gln Arg Val Tyr Thr Asp Asp Arg 195 200 205 Ala Leu Asp Gln Thr Met Ala Val Ser Asp Gly Asp Val Val Leu Val 210 215 220 Pro Arg Gly His His Pro Cys Ala Ala Pro Tyr Gly Ile Glu Met Tyr 225 230 235 240 Tyr Leu Asn Val Met Ala Gly Pro Leu Arg Lys Trp Arg Phe Leu Pro 245 250 255 Asp Pro Glu Leu Gly Ile Ala Lys 260 15271PRTLactobacillus casei 15Met Ser Leu Leu Tyr His Lys Gln Asn Gln Glu Leu Ser Ser Gly Val 1 5 10 15 Arg Leu Ile Gln Asp Val Asn Ala Ser Asn Ser Pro Met Lys Tyr Thr 20 25 30 Ala Val Lys Val Leu Glu Phe Ser Ala Asp Ser Ser Tyr Glu Glu Thr 35 40 45 Leu Glu Ala Phe Glu Ala Gly Ile Val Val Leu Glu Gly Lys Val Thr 50 55 60 Ile Thr Ala Asp Asp Gln Thr Phe Glu Asp Val Gly Gln Arg Thr Ser 65 70 75 80 Ile Phe Asp Lys Ile Pro Thr Asp Ser Val Tyr Val Ser Thr Gly Leu 85 90 95 Ala Phe Gly Ile Arg Ala Lys Gln Ala Ala Lys Ile Leu Ile Ala Tyr 100 105 110 Ala Pro Thr Asn Gln Thr Phe Pro Val Arg Leu Ile Arg Gly Asn Ile 115 120 125 His Gln Val Glu His Arg Gly Lys Tyr Asn Asn Lys Arg Leu Val Gln 130 135 
140 Asn Ile Leu Pro Asp Asn Leu Pro Phe Ala Asp Lys Leu Leu Leu Val 145 150 155 160 Glu Val Tyr Thr Asp Ser Ala Asn Trp Ser Ser Tyr Pro Pro His Arg 165 170 175 His Asp His Asp Asp Leu Pro Ala Glu Ser Leu Leu Glu Glu Ile Tyr 180 185 190 Tyr His Glu Met Arg Pro Lys Gln Gly Phe Val Phe Gln Arg Val Tyr 195 200 205 Thr Asp Asp Leu Ser Leu Asp Glu Thr Met Ala Val Gln Asn Gln Asp 210 215 220 Val Val Val Val Pro Lys Gly Tyr His Pro Val Gly Val Pro Asp Gly 225 230 235 240 Tyr Asp Ser Tyr Tyr Leu Asn Val Met Ala Gly Pro Thr Arg Val Trp 245 250 255 His Phe His Asn Ala Pro Glu His Ala Trp Ile Ile Asp Arg Gln 260 265 270 16466PRTBacteroides sp. 16Met Lys Lys Phe Met Asp Glu Asn Phe Leu Leu Gln Thr Glu Thr Ala 1 5 10 15 Gln Lys Leu Tyr His Asn His Ala Ala Asn Met Pro Ile Phe Asp Tyr 20 25 30 His Cys His Ile Asn Pro Lys Asp Ile Ala Glu Asp Arg Met Phe Lys 35 40 45 Thr Ile Thr Glu Ile Trp Leu Tyr Gly Asp His Tyr Lys Trp Arg Ala 50 55 60 Met Arg Thr Asn Gly Val Asp Glu Arg Phe Cys Thr Gly Asp Ala Ser 65 70 75 80 Asp Trp Glu Lys Phe Glu Lys Trp Ala Glu Thr Val Pro His Thr Leu 85 90 95 Arg Asn Pro Leu Tyr His Trp Thr His Leu Glu Leu Lys Lys Phe Phe 100 105 110 Gly Ile Asn Glu Ile Leu Ser Pro Lys Asn Ala Arg Glu Ile Tyr Asp 115 120 125 Ala Cys Asn Glu Lys Leu Gln Thr Pro Ala Tyr Ser Cys Arg Asn Ile 130 135 140 Ile Arg Met Ala Asn Val His Thr Ile Cys Thr Thr Asp Asp Pro Val 145 150 155 160 Asp Thr Leu Glu Tyr His Gln Gln Ile Lys Glu Asp Gly Phe Glu Val 165 170 175 Ala Val Leu Pro Ala Trp Arg Pro Asp Lys Ala Met Met Val Glu Asp 180 185 190 Pro Lys Phe Phe Asn Asp Tyr Met Asp Gln Leu Ala Glu Ala Ala Gly 195 200 205 Ile His Ile Glu Ser Phe Glu Asp Leu Met Glu Ala Leu Asp Thr Arg 210 215 220 His Gln Tyr Phe His Asp Asn Gly Cys Arg Leu Ser Asp His Gly Leu 225 230 235 240 Asp Thr Val Phe Ala Glu Asp Tyr Thr Glu Glu Glu Ile Lys Ala Ile 245 250 255 Phe Lys Lys Ile Arg Gly Gly Ser Arg Leu Ser Glu Thr Glu Ile Leu 260 265 270 Lys Phe Lys Ser Cys Met Leu Tyr Glu Tyr Gly Val Met Asp His Ser 275 280 
285 Arg Gly Trp Thr Gln Gln Leu His Ile Gly Ala Gln Arg Asn Asn Asn 290 295 300 Thr Arg Leu Phe Lys Lys Leu Gly Pro Asp Thr Gly Phe Asp Ser Ile 305 310 315 320 Gly Asp Lys Pro Ile Ala Glu Pro Leu Ala Lys Leu Leu Asp Arg Leu 325 330 335 Asp Gln Glu Asn Lys Leu Cys Lys Thr Val Leu Tyr Asn Leu Asn Pro 340 345 350 Arg Asp Asn Glu Leu Tyr Ala Thr Met Leu Gly Asn Phe Gln Asp Gly 355 360 365 Ser Val Pro Gly Lys Ile Gln Tyr Gly Ser Gly Trp Trp Phe Leu Asp 370 375 380 Gln Lys Asp Gly Met Ile Lys Gln Met Asn Ala Leu Ser Asn Leu Gly 385 390 395 400 Leu Leu Ser Arg Phe Val Gly Met Leu Thr Asp Ser Arg Ser Phe Leu 405 410 415 Ser Tyr Thr Arg His Glu Tyr Phe Arg Arg Thr Leu Cys Asn Leu Leu 420 425 430 Gly Asn Asp Val Glu Asn Gly Glu Ile Pro Ala Asp Met Glu Leu Leu 435 440 445 Gly Ser Met Val Glu Asn Ile Cys Phe Asn Asn Ala Lys Asn Tyr Phe 450 455 460 Asn Phe 465 17451PRTThermotoga maritima MSB8 17Met Phe Leu Gly Glu Asp Tyr Leu Leu Thr Asn Arg Ala Ala Val Arg 1 5 10 15 Leu Phe Asn Glu Val Lys Asp Leu Pro Ile Val Asp Pro His Asn His 20 25 30 Leu Asp Ala Lys Asp Ile Val Glu Asn Lys Pro Trp Asn Asp Ile Trp 35 40 45 Glu Val Glu Gly Ala Thr Asp His Tyr Val Trp Glu Leu Met Arg Arg 50 55 60 Cys Gly Val Ser Glu Glu Tyr Ile Thr Gly Ser Arg Ser Asn Lys Glu 65 70 75 80 Lys Trp Leu Ala Leu Ala Lys Val Phe Pro Arg Phe Val Gly Asn Pro 85 90 95 Thr Tyr Glu Trp Ile His Leu Asp Leu Trp Arg Arg Phe Asn Ile Lys 100 105 110 Lys Val Ile Ser Glu Glu Thr Ala Glu Glu Ile Trp Glu Glu Thr Lys 115 120 125 Lys Lys Leu Pro Glu Met Thr Pro Gln Lys Leu Leu Arg Asp Met Lys 130 135 140 Val Glu Ile Leu Cys Thr Thr Asp Asp Pro Val Ser Thr Leu Glu His 145 150 155 160 His Arg Lys Ala Lys Glu Ala Val Glu Gly Val Thr Ile Leu Pro Thr 165 170 175 Trp Arg Pro Asp Arg Ala Met Asn Val Asp Lys Glu Gly Trp Arg Glu 180 185 190 Tyr Val Glu Lys Met Gly Glu Arg Tyr Gly Glu Asp Thr Ser Thr Leu 195 200 205 Asp Gly Phe Leu Asn Ala Leu Trp Lys Ser His Glu His Phe Lys Glu 210 215 220 His Gly Cys Val Ala Ser Asp His Ala Leu Leu Glu 
Pro Ser Val Tyr 225 230 235 240 Tyr Val Asp Glu Asn Arg Ala Arg Ala Val His Glu Lys Ala Phe Ser 245 250 255 Gly Glu Lys Leu Thr Gln Asp Glu Ile Asn Asp Tyr Lys Ala Phe Met 260 265 270 Met Val Gln Phe Gly Lys Met Asn Gln Glu Thr Asn Trp Val Thr Gln 275 280 285 Leu His Ile Gly Ala Leu Arg Asp Tyr Arg Asp Ser Leu Phe Lys Thr 290 295 300 Leu Gly Pro Asp Ser Gly Gly Asp Ile Ser Thr Asn Phe Leu Arg Ile 305 310 315 320 Ala Glu Gly Leu Arg Tyr Phe Leu Asn Glu Phe Asp Gly Lys Leu Lys 325 330 335 Ile Val Leu Tyr Val Leu Asp Pro Thr His Leu Pro Thr Ile Ser Thr 340 345 350 Ile Ala Arg Ala Phe Pro Asn Val Tyr Val Gly Ala Pro Trp Trp Phe 355 360 365 Asn Asp Ser Pro Phe Gly Met Glu Met His Leu Lys Tyr Leu Ala Ser 370 375 380 Val Asp Leu Leu Tyr Asn Leu Ala Gly Met Val Thr Asp Ser Arg Lys 385 390 395 400 Leu Leu Ser Phe Gly Ser Arg Thr Glu Met Phe Arg Arg Val Leu Ser 405 410 415 Asn Val Val Gly Glu Met Val Glu Lys Gly Gln Ile Pro Ile Lys Glu 420 425 430 Ala Arg Glu Leu Val Lys His Val Ser Tyr Asp Gly Pro Lys Ala Leu

435 440 445 Phe Phe Gly 450 18427PRTBacillus halodurans 18Met Ser Ile Asn Ser Arg Glu Val Leu Ala Glu Lys Val Lys Asn Ala 1 5 10 15 Val Asn Asn Gln Pro Val Thr Asp Met His Thr His Leu Phe Ser Pro 20 25 30 Asn Phe Gly Glu Ile Leu Leu Trp Asp Ile Asp Glu Leu Leu Thr Tyr 35 40 45 His Tyr Leu Val Ala Glu Val Met Arg Trp Thr Asp Val Ser Ile Glu 50 55 60 Ala Phe Trp Ala Met Ser Lys Arg Glu Gln Ala Asp Leu Ile Trp Glu 65 70 75 80 Glu Leu Phe Ile Lys Arg Ser Pro Val Ser Glu Ala Cys Arg Gly Val 85 90 95 Leu Thr Cys Leu Gln Gly Leu Gly Leu Asp Pro Ala Thr Arg Asp Leu 100 105 110 Gln Val Tyr Arg Glu Tyr Phe Ala Lys Lys Thr Ser Glu Glu Gln Val 115 120 125 Asp Thr Val Leu Gln Leu Ala Asn Val Ser Asp Val Val Met Thr Asn 130 135 140 Asp Pro Phe Asp Asp Asn Glu Arg Ile Ser Trp Leu Glu Gly Lys Gln 145 150 155 160 Pro Asp Ser Arg Phe His Ala Ala Leu Arg Leu Asp Pro Leu Leu Asn 165 170 175 Glu Tyr Glu Gln Thr Lys His Arg Leu Arg Asp Trp Gly Tyr Lys Val 180 185 190 Asn Asp Glu Trp Asn Glu Gly Ser Ile Gln Glu Val Lys Arg Phe Leu 195 200 205 Thr Asp Trp Ile Glu Arg Met Asp Pro Val Tyr Met Ala Val Ser Leu 210 215 220 Pro Pro Thr Phe Ser Phe Pro Glu Glu Ser Asn Arg Gly Arg Ile Ile 225 230 235 240 Arg Asp Cys Leu Leu Pro Val Ala Glu Lys His Asn Ile Pro Phe Ala 245 250 255 Met Met Ile Gly Val Lys Lys Arg Val His Pro Ala Leu Gly Asp Ala 260 265 270 Gly Asp Phe Val Gly Lys Ala Ser Met Asp Gly Val Glu His Leu Leu 275 280 285 Arg Glu Tyr Pro Asn Asn Lys Phe Leu Val Thr Met Leu Ser Arg Glu 290 295 300 Asn Gln His Glu Leu Val Val Leu Ala Arg Lys Phe Ser Asn Leu Met 305 310 315 320 Ile Phe Gly Cys Trp Trp Phe Met Asn Asn Pro Glu Ile Ile Asn Glu 325 330 335 Met Thr Arg Met Arg Met Glu Met Leu Gly Thr Ser Phe Ile Pro Gln 340 345 350 His Ser Asp Ala Arg Val Leu Glu Gln Leu Ile Tyr Lys Trp His His 355 360 365 Ser Lys Ser Ile Ile Ala Glu Val Leu Ile Asp Lys Tyr Asp Asp Ile 370 375 380 Leu Gln Ala Gly Trp Glu Val Thr Glu Glu Glu Ile Lys Arg Asp Val 385 390 395 400 Ala Asp Leu Phe Ser Arg Asn Phe Trp Arg 
Phe Val Gly Arg Asn Asp 405 410 415 His Val Thr Ser Val Lys Val Glu Gln Gln Thr 420 425 19473PRTBacillus subtilis 19Met Glu Pro Phe Met Gly Lys Asn Phe Leu Leu Lys Asn Glu Thr Ala 1 5 10 15 Val Ser Leu Tyr His Asn Tyr Ala Lys Asp Met Pro Ile Ile Asp Tyr 20 25 30 His Cys His Leu Ser Pro Lys Glu Ile Tyr Glu Asn Lys Thr Phe Gln 35 40 45 Asn Ile Thr Glu Ala Trp Leu Tyr Gly Asp His Tyr Lys Trp Arg Ile 50 55 60 Met Arg Ala Asn Gly Ile Glu Glu Thr Tyr Ile Thr Gly Asp Ala Pro 65 70 75 80 Asp Glu Glu Lys Phe Met Ala Trp Ala Lys Thr Val Pro Met Ala Ile 85 90 95 Gly Asn Pro Leu Tyr Asn Trp Thr His Leu Glu Leu Gln Arg Phe Phe 100 105 110 Gly Ile Tyr Glu Ile Leu Asn Glu Lys Ser Gly Ser Ala Ile Trp Lys 115 120 125 Gln Thr Asn Lys Leu Leu Lys Gly Glu Gly Phe Gly Ala Arg Asp Leu 130 135 140 Ile Val Lys Ser Asn Val Lys Val Val Cys Thr Thr Asp Asp Pro Val 145 150 155 160 Asp Ser Leu Glu Tyr His Leu Leu Leu Lys Glu Asp Lys Asp Phe Pro 165 170 175 Val Ser Val Leu Pro Gly Phe Arg Pro Asp Lys Gly Leu Glu Ile Asn 180 185 190 Arg Glu Gly Phe Pro Glu Trp Val Gln Ala Leu Glu Asp Ala Ala Ala 195 200 205 Ile Ser Ile Thr Thr Tyr Asp Glu Phe Leu Lys Ala Leu Glu Lys Arg 210 215 220 Val Arg Phe Phe His Ser Ala Gly Gly Arg Val Ser Asp His Ala Ile 225 230 235 240 Asp Thr Met Val Phe Ala Glu Thr Thr Lys Glu Glu Ala Gly Arg Ile 245 250 255 Phe Ser Asp Arg Leu Gln Gly Thr Glu Val Ser Cys Glu Asp Glu Lys 260 265 270 Lys Phe Lys Thr Tyr Thr Leu Gln Phe Leu Cys Gly Leu Tyr Ala Glu 275 280 285 Leu Asp Trp Ala Met Gln Phe His Ile Asn Ala Leu Arg Asn Thr Asn 290 295 300 Thr Lys Met Met Lys Arg Leu Gly Pro Asp Thr Gly Tyr Asp Ser Met 305 310 315 320 Asn Asp Glu Glu Ile Ala Lys Pro Leu Tyr Lys Leu Leu Asn Ser Val 325 330 335 Glu Met Lys Asn Gln Leu Pro Lys Thr Ile Leu Tyr Ser Leu Asn Pro 340 345 350 Asn Asp Asn Tyr Val Ile Ala Ser Met Ile Asn Ser Phe Gln Asp Gly 355 360 365 Ile Thr Pro Gly Lys Ile Gln Phe Gly Thr Ala Trp Trp Phe Asn Asp 370 375 380 Thr Lys Asp Gly Met Leu Asp Gln Met Lys Ala Leu Ser Asn Val 
Gly 385 390 395 400 Leu Phe Ser Arg Phe Ile Gly Met Leu Thr Asp Ser Arg Ser Phe Leu 405 410 415 Ser Tyr Thr Arg His Glu Tyr Phe Arg Arg Ile Val Cys Asn Leu Ile 420 425 430 Gly Glu Trp Val Glu Asn Gly Glu Val Pro Arg Asp Met Glu Leu Leu 435 440 445 Gly Ser Ile Val Gln Gly Ile Cys Tyr Asp Asn Ala Lys His Tyr Phe 450 455 460 Gln Phe Gln Glu Glu Lys Ala Asn Val 465 470 20825DNARhizobium sp. 20atgctcaacg tggaaacgag gcacgccgtt cacgcggatc acgcgagatc actcgacaca 60gagggcctgc gccggcactt cctggcccag ggcctgtttg cggagggcga gatacggctg 120atctatacgc attatgatcg attcgtcatg ggaggcgccg tgccggacgg cgcgccactt 180gtgctcgatc atgtcgagga gacgaaaacg ccgggctttc tcgaccgacg ggagatggga 240atcgtcaata tcggtgctga gggcagcgtg catgccggca acgaaagctg gtcgctgaac 300cgtggtgacg tactttatct cggcatgggg gcgggaccgg tcaccttcga aggggctggg 360cgcttctacc tcgtctcggc accggcgcat cgcagcctgc ccaaccggct cgtcacgccg 420gccgacagca aggaggtcaa gcttggcgct ctcgagactt ccaacaaacg caccatcaat 480cagttcattc atcccctggt catggaaagc tgccagctcg tgctgggata taccacgctg 540gaggacggct cggtctggaa taccatgccc gcgcatgtgc acgaccgacg catggaggcc 600tatctctatt tcggcatgga tgagacatcg cgggttctgc atctgatggg cgagccgcag 660caaacgaggc atctcttcgt cgccaatgag gaaggggcga tctctccgcc gtggtccatc 720catgcgggag caggcattgg cagctatacc ttcatctggg ccatggcggg cgacaatgtc 780gattataccg acatggagtt catccagccg ggagatcttc gatga 82521837DNAEscherichia coli 21atggacgtaa gacagagcat ccacagtgcg cacgcaaaaa cgctggatac ccaagggctg 60cgcaatgaat ttttggttga aaaggtattt gtcgccgatg agtacaccat ggtttacagc 120cacattgacc gaattattgt tggcggcatt atgccgataa ctaaaacggt ttccgttggc 180ggggaagttg gtaaacaact cggcgtaagc tatttccttg aacgtcgcga gttaggtgtt 240atcaatattg gcggtgccgg tacgattact gtcgatggcc aatgctatga aatcggtcac 300cgcgacgccc tgtatgttgg taaaggtgca aaagaagttg tctttgccag tattgatacc 360ggcactccgg cgaagtttta ttacaattgc gcacccgcgc atacgacgta tcccaccaaa 420aaagtcacac cggacgaagt atctccagtc acgttaggcg ataacctcac cagtaaccgt 480cgcacgatta acaaatattt tgtcccggat gtactggaaa cctgccaatt gagtatgggg 540ctgacggagc 
tggctccggg taacttgtgg aacaccatgc cgtgtcacac ccacgagcgc 600cggatggaag tttatttcta tttcaatatg gatgatgacg cctgcgtttt ccacatgatg 660gggcagccgc aagaaacgcg tcatattgtg atgcataacg agcaggcggt gatctccccg 720agctggtcga tccattccgg tgtcggaacc aaagcttata cctttatctg gggcatggtc 780ggtgaaaacc aggtctttga tgatatggac catgtggccg ttaaagattt gcgctag 83722837DNARhizobium sp. 22atgacgatga agatactcta cggcgccgga ccggaggatg tgaaagggta tgacacgcag 60cgcctgcgcg acgccttcct gctggacgac ctcttcgccg acgaccgggt cagtttcaca 120tatacccatg tcgatcgcct catcctcggc ggggccgtcc cggtgacgac gagcctcacc 180ttcggctccg gcacggagat cggaacgccc tacctgcttt ccgcccgcga gatggggatc 240gccaatctcg gcggcacggg cacgatcgag gtggatggcc agcgcttcac gctcgaaaac 300cgcgacgtgc tctatgtcgg tcgcggcgcc cggcagatga ccgcctccag cctgtcggcg 360gagaggccag cccgcttcta catgaattcc gtgcccgccg gcgccgattt cccgcaccgt 420ctgatcaccc gcggagaggc caagcccctc gatctcggcg atgcgcgccg ctcgaacagg 480cgccggctcg caatgtacat ccatccggag gtctcgccgt cctgcctgct gctcatgggc 540atcaccgatc ttgccgaggg cagcgcctgg aacaccatgc cgccgcatct gcacgagcgg 600cggatggagg cctattgcta cttcgatctc tcgcccgagg accgggtcat ccacatgatg 660ggtcggccgg acgaaacccg ccaccttgtc gtggccgacg gcgaggcggt cctctctccc 720gcctggtcga tccatatggg tgccgggacg gggccctacg ccttcgtctg gggcatgacc 780ggcgaaaacc aggaatacaa cgacgtcgct cccgtagccg tggctgatct caaatga 83723825DNAPannonibacter phragmitetus 23atgctgaccg tcgaaacccg ccacgccatt gatccgcaga ccgcaaagcg gatggacacg 60gaagagctgc gcaagcattt ccacatgggc agcctgtttg ctgccggtga aatccgcctc 120gtctacaccc actatgaccg catgatcgtc ggcgctgccg tgccctcggg cgcgccgctg 180gtgctggatc aggtcaagga atgcggcacc gccagcatcc tcgaccgccg cgagatggct 240gtcgtcaacg tcggcgccag cggcaaggtc tctgcagcag gcgaaaccta cgccatggaa 300cgcggcgacg tgctctatct gccgctgggc tccggcaagg tgaccttcga aggcgaaggc 360cgcttctaca ttctctccgc tccggcccac gctgcttacc cggcccgcct gatccgcatc 420ggcgaggccg agaaggtcaa gctcggctcg gccgagacct ccaacgaccg caccatctac 480cagttcgtgc atccggcggt gatgacttcc tgccaactcg tcgtcggcta cacccagctg 540cacaacggct ctgtctggaa 
caccatgccc gcccacgtgc atgaccggcg catggaggcc 600tatctctatt tcgacatgaa gccggagcag cgcgtgttcc acttcatggg cgagccgcag 660gaaacccgcc atctggtcat gaagaacgag gatgcggtgg tctccccgcc ctggtccatc 720cactgcggcg caggcaccgg cagctacacc ttcatctggg ccatggccgg cgacaacgtc 780gactacaagg acgtggaaat ggtcgccatg gaggatctgc ggtga 82524816DNABacillus subtilis 24atgagttatt tgttgcgtaa gccgcagtcg aatgaagtgt ctaatggggt caaactggtg 60cacgaagtaa cgaaatccaa ctctgatctc acctatgtag agtttaaagt gttagatctc 120gcttccggtt ccagctatgc agaagaattg aaaaaacagg aaatctgtat tgtcgcggta 180acgggaaaca ttacagtgac cgatcacgag tcgacttttg agaatatcgg cacgcgtgaa 240agcgtattcg aacgaaaacc gacagacagc gtctatattt caaatgaccg ttcctttgag 300atcacagcgg tcagcgacgc aagagtggcg ctttgctatt ctccatcgga aaaacagctt 360ccgacaaagc tgatcaaagc ggaagacaat ggcattgagc atcgcgggaa gttttcaaac 420aaacgtactg ttcacaacat tcttccggat tcagaccctt cagctaacag cctattagta 480gttgaagtct atacagacag cggcaactgg tccagctatc cgcctcataa acatgatcaa 540gacaatttgc cggaggaatc ttttttagaa gaaacgtact accatgagtt agacccggga 600cagggctttg tgtttcagcg tgtatacaca gatgaccgct cgattgacga gacaatgact 660gtagaaaatg aaaacgttgt catcgttcct gcaggatacc acccggtagg cgtgccggac 720ggatacacat cctactattt aaatgtcatg gcagggccga cgcggaaatg gaagtttcat 780aatgacccgg cgcatgagtg gattttagaa cgttaa 81625810DNAOchrobactrum anthropi 25atggccaatt tgttgcgcaa gcccaacggc acgcatggca aggtccacga catcactccg 60gaaaacgcca aatggggtta tgtcgggttc gggctctttc gtctcaaatc cggcgagagt 120gtctccgaaa agaccggatc gacggaggtg atccttgttc ttgtggaagg caaggcaaag 180atttccgctt ctggcgagga tttcggcgag atgggtgaac gcttaaacgt gttcgagaaa 240ctgccgccac actgcctcta tgtgcctgct gaaagcgact ggcatgcaac cgccacgaca 300gattgtgttc tggctgtttg caccgcaccg ggcaagccag gccgcaaggc acagaagctt 360gggccggaaa gcttgacact tgaacaacgc ggaaaaggtg ccaatacccg ctttatccat 420aatatcgcaa tggaaagccg cgatgttgcc gatagccttc ttgttaccga ggtattcaca 480ccgcagggaa actggtcgtc ctatccaccc cacagacacg acgaagacaa ttttccggat 540atgacctatc tggaagagac ctattatcac cgtctcaacc cggcgcaggg cttcggcttc 
600cagcgtgttt tcaccgaaga cggaagcctt gatgaaacca tggcggtctc tgacggagac 660gtcgtgcttg taccaaaagg ccaccatcca tgtggcgcgc cctatggcta cgagatgtat 720tatctcaatg tgatggccgg tcccttgcgc aaatggcgct tcaagaacca tcccgaccat 780gactggattt tcaaacgcga caatccgtaa 81026807DNAHalomonas titanicae 26atggcttccc tactggtacg ccccaccgcc ccagatgccc agggcaccgt gattgacgtt 60acccctgaat ctgctggctg gacgcacgtt ggctttcggg tgcataaact cgccaagggc 120cagcgcctgg aggccagcag cgatgatcag gaagtctgcc tggtgctgct caccggtcgc 180gccacggtaa cttgcggcga gcaccgcttt gaagatattg gccagcgtat ggatattttt 240gagcagatcc ctccctatgc ggtttaccta cctgaccatg ttagctacgc ggtggaagcg 300accacagact tagagctagc ggtgtgcacc gcccctgggc atggcaacca tgccccacgg 360ctcatcgcgc ctgacaacat caagcaaagc acccgtggcc agggcaccaa cacccgccat 420gttcacgata ttctgccgga aaccgagccc gccgatagcc tattagtagt cgaagtattc 480acacctgcgg gtaactggtc gagctacccg ccccacaaac acgatgtgga taacttaccc 540cacgaatcac atctggaaga gacctactac caccgcatta accctgaaca agggttcgcc 600ttccagcgcg tttacaccga tgaccgcagc cttgatgaaa ccatggcggt ggaaaacggc 660tgctgtgtgt tggttcccaa gggttaccat ccggtgggcg cctcccatgg ctactcgctc 720tactacttaa atgtgatggc ggggcccaag cgggcatgga aatttcacaa cgaccccgac 780cacgaatggc tgatgaacgc tggatag 80727795DNAAcidiphilium multivorum 27atgccggact tactgagaaa accgtttggc acccatggca aagtgcacga tattacccca 60gcagcagcag gttggagaca tgttggtttt ggcttatatc gcttaagagc gggcgaattt 120gcagcagaag cgacaggcgg caatgaagtt attctggtga tggttgaggg caaagcgtct 180attagagcag caggcagaga ttggggcgtt ttaggcgaac gtatgagcgt cttcgaaaaa 240agtccaccac attccctgta tgtcccgaat ggtgcagaat gggccttagt agccgaaaca 300gattgcattg tagcagtgtg tagcgctccg ggtagaggag gtcatgctgc aagaagaatt 360ggtcctgaag gtattgtgtt aaccgccaga ggtgaaggca ccaatacacg ccacatcaac 420aacatcgcca tggaagccga agattattgt gatgccctgt tagtcaccga agtgttcacc 480ccagccggcc attggagctc ttatccatct catcgtcatg atgaagacga cgatccgcgc 540atcacctatt tagaagagac ctactatcat cgcttaaatc ctgcctcggg ctttggcgtt 600caacgcgtct ataccgatga tcgcgcctta gatcaaacca tggcggtttc tgatggcgat 
660gttgttttag ttcctcgcgg ccatcatccg tgtgcagccc cgtatggtat tgaaatgtat 720tacctgaacg tcatggccgg cccgttacgt aaatggcgct ttttacctga tcctgaactt 780ggcattgcga aataa 79528816DNALactobacillus casei 28atgtctctgc tgtaccacaa gcagaaccag gaactgagta gtggtgtgcg cctgatccaa 60gatgttaatg ccagcaatag cccgatgaaa tataccgccg tgaaagtgct ggagtttagc 120gccgatagca gctatgagga aaccttagag gcctttgaag ccggcattgt tgtgttagag 180ggcaaagtga ccatcaccgc cgacgatcaa accttcgaag atgtgggtca aagaacctcg 240atcttcgaca aaatcccgac cgatagcgtt tatgtgtcta ccggtttagc cttcggtatt 300cgcgccaaac aagccgccaa aatcttaatc gcgtatgctc cgaccaatca gaccttccca 360gttcgcttaa ttcgcggcaa tatccaccag gtggaacatc gcggcaagta caacaacaaa 420cgcttagtgc agaacattct cccggataat ctcccgttcg ccgataaatt actgctggtt 480gaggtgtaca ccgatagcgc caattggagc tcctatccgc cgcatagaca tgatcacgat 540gatttaccgg ccgaaagtct gttagaggag atctactatc acgaaatgcg cccgaagcag 600ggcttcgtct ttcaacgcgt gtataccgat gatctgagtc tggatgagac catggccgtt 660caaaatcaag atgttgtcgt tgtcccgaaa ggctatcatc cggttggtgt ccccgacggc 720tatgattcgt attacctgaa cgtgatggcc ggcccgacaa gagtgtggca ttttcataat 780gctccggaac atgcctggat tattgatcgc cagtaa 816291401DNABacteroides sp. 
29atgaaaaaat ttatggatga aaattttctg ttgcaaaccg aaacagcgca gaaattgtat 60cataatcacg cggcaaacat gccgattttc gattaccact gccacattaa ccccaaagac 120atcgcggaag accggatgtt taaaaccatc accgaaatct ggttgtacgg cgatcattat 180aaatggcgcg ccatgcgtac aaacggcgtt gacgagcgct tttgcaccgg cgatgcaagc 240gattgggaaa agtttgaaaa gtgggccgaa acggttcctc ataccctgcg taatccgctt 300tatcactgga cacacctgga gctaaagaaa tttttcggga ttaacgagat cctgagtccg 360aaaaatgccc gggaaattta tgatgcctgt aacgaaaaac tgcaaacgcc cgcgtatagt 420tgccgcaaca tcatccggat ggccaatgtg catacaatct gtaccaccga cgacccggtt 480gacacactgg aatatcatca gcaaattaaa gaagacggct ttgaagtggc ggttttacct 540gcctggcgtc cggataaagc gatgatggtg gaagacccga agttctttaa cgactatatg 600gaccagttgg ccgaagctgc cggtatccat atcgaatcgt ttgaggattt gatggaagcc 660ttggatacgc gtcaccagta ttttcatgat aatggttgcc gtttgtccga ccacgggctg 720gataccgttt ttgctgaaga ttatacggag gaagaaatta aagcgatctt caaaaaaatc 780cgtggcggca gcaggcttag cgaaacggaa atcctgaaat tcaagtcctg catgttgtac 840gaatatgggg tgatggacca ttcgcgcggc tggacacaac aattgcacat tggcgcacaa 900cgcaacaaca acacccgttt gttcaaaaaa ttaggtcccg acactggttt cgattcgatt 960ggcgataagc cgatcgctga accattggcc aaattgctcg accgcctgga tcaggaaaac 1020aaattgtgca aaacggtttt gtataatctg aatccgcgtg ataacgagtt gtacgctacc 1080atgttgggca actttcagga cggatcggtt cccgggaaaa ttcaatacgg ctcgggttgg 1140tggtttctcg atcagaaaga cggcatgatt aaacagatga atgccctttc caatctgggt 1200ttgctgagcc gtttcgtagg catgctgacc gactcaagga gcttcctttc gtacacccgt

1260cacgaatatt tccgtcgtac cctttgcaac ctgcttggga atgatgttga aaacggggag 1320attccggcag atatggagct tttgggcagt atggttgaga atatttgttt taataacgcg 1380aagaactatt ttaattttta g 1401301356DNAThermotoga maritima MSB8 30atgtttctgg gcgaagacta tctgctgacc aatcgtgcgg cagttcgtct gttcaacgaa 60gtgaaagatc tgccgatcgt tgatccgcat aaccacctgg atgcgaaaga tatcgtggaa 120aacaaaccgt ggaacgacat ctgggaagtg gaaggtgcga ccgatcacta tgtgtgggaa 180ctgatgcgtc gttgtggtgt tagcgaagaa tatattaccg gctctcgtag caacaaagaa 240aaatggctgg cgctggcgaa agtgtttccg cgttttgtgg gtaatccgac gtacgaatgg 300atccacctgg atctgtggcg tcgtttcaac atcaaaaaag tcatcagcga agaaaccgcg 360gaagaaatct gggaagaaac caaaaaaaaa ctgccggaga tgaccccgca gaaactgctg 420cgcgacatga aagtggaaat cctgtgcacc accgatgatc cggtgtctac cctggaacat 480caccgtaaag cgaaagaagc cgtggaaggc gtgaccattt taccgacctg gcgtccggat 540cgtgcaatga atgttgataa agaaggttgg cgtgaatatg ttgaaaaaat gggtgaacgc 600tatggcgaag ataccagcac cctggatggt tttctgaatg ccctgtggaa aagccacgaa 660cacttcaaag aacacggctg tgtggcgagc gatcatgcgc tgctggaacc gagcgtgtac 720tacgtggatg aaaaccgcgc gcgtgcagtt catgaaaaag cattttctgg tgaaaaactg 780actcaagatg aaatcaacga ctataaagcg ttcatgatgg tgcagttcgg caaaatgaac 840caggaaacca actgggtgac ccagctgcac attggtgccc tgcgcgatta ccgcgatagc 900ctgttcaaaa ccctgggccc ggattctggt ggcgatatca gcaccaactt tctgcgtatt 960gctgaaggtc tgcgttattt tctgaacgaa tttgatggta aactgaaaat tgtgctgtac 1020gtgctggatc cgacccattt accgaccatt tcgaccattg cacgtgcgtt cccgaacgtg 1080tatgtgggtg caccgtggtg gttcaacgat agcccgttcg gcatggaaat gcacctgaaa 1140tacctggcga gcgttgatct gctgtacaat ctggctggta tggttaccga ttcacgtaaa 1200ttactgagtt ttggttctcg taccgaaatg tttcgtcgcg ttctgtctaa tgtggttggc 1260gaaatggtgg aaaaaggcca gatcccgatc aaagaagcgc gcgaactggt gaaacacgtg 1320agctacgacg gcccgaaagc cctgttcttt ggctga 1356311284DNABacillus halodurans 31atgagcatca acagccgtga agttctggcg gaaaaagtga aaaacgcggt gaacaaccag 60ccggttaccg atatgcatac ccacctgttt agcccgaact ttggcgaaat tctgctgtgg 120gacatcgatg aactgctgac ctatcactac ctggttgcgg aagttatgcg 
ttggaccgat 180gtgagcattg aagcgttttg ggcaatgagc aaacgtgaac aggccgatct gatttgggaa 240gaactgttca tcaaacgcag cccggtgagc gaagcatgtc gtggcgttct gacctgttta 300caaggtttag gtctggatcc ggcaactcgt gatttacagg tgtatcgtga atacttcgcc 360aaaaaaacca gcgaggaaca ggtggatacc gttctgcagc tggcaaatgt gagcgatgtg 420gtgatgacca atgatccgtt cgatgataat gaacgcatca gctggctgga aggcaaacag 480ccggatagcc gctttcatgc agcgttacgt ctggatccgc tgctgaatga atatgaacag 540accaaacatc gtctgcgtga ttggggttat aaagtgaacg acgaatggaa cgaaggcagc 600atccaggaag tgaaacgctt tctgaccgac tggattgaac gtatggatcc ggtgtatatg 660gcggtgagct taccgccgac cttcagcttt ccggaagaat cgaaccgtgg ccgcattatc 720cgtgattgtc tgttaccggt tgcagaaaaa cataacatcc cgtttgcaat gatgattggc 780gtgaaaaaac gcgtgcatcc ggcgttaggt gatgcaggcg attttgtggg taaagcaagt 840atggatggcg ttgaacacct gctgcgcgaa tacccgaaca acaaattcct ggtgaccatg 900ctgagccgcg aaaaccagca cgaactggtg gttctggcgc gtaaatttag taacctgatg 960atttttggtt gttggtggtt tatgaacaac ccggagatca tcaacgaaat gacccgcatg 1020cgcatggaaa tgctgggtac cagctttatc ccgcagcaca gcgatgcccg tgttctggaa 1080cagctgatct ataaatggca ccacagcaaa agcatcatcg cggaagtcct gatcgacaaa 1140tacgacgaca tcctgcaagc aggttgggaa gttaccgaag aagaaatcaa acgtgatgtg 1200gcagatctgt ttagccgcaa cttttggcgc tttgtgggcc gtaacgatca cgtgaccagc 1260gtgaaagtgg aacagcagac ctga 1284321422DNABacillus halodurans 32atggaaccgt ttatgggcaa aaacttcctg ctgaaaaacg agaccgcggt gagcctgtac 60cacaactacg cgaaagatat gccgatcatc gactaccatt gccatctgag cccgaaagaa 120atctacgaga acaaaacctt ccagaacatc accgaagcgt ggctgtacgg cgatcactac 180aaatggcgca tcatgcgtgc gaatggcatc gaagaaacct atattaccgg tgatgcaccg 240gacgaagaaa aattcatggc gtgggcgaaa accgtgccga tggccattgg taatccgctg 300tataactgga cccatctgga actgcaacgt ttttttggca tctacgaaat cctgaacgaa 360aaaagcggca gcgcgatctg gaaacagacc aacaaactgc tgaaaggcga aggctttggt 420gcgcgtgatc tgatcgtgaa aagcaacgtt aaagtggtgt gcaccaccga cgatccggtg 480gattctctgg aataccatct gctgctgaaa gaagacaaag acttcccggt tagcgtttta 540ccgggttttc gtccggataa aggtctggaa atcaaccgtg aaggctttcc 
ggaatgggtt 600caagccctgg aagatgcggc cgcaattagc attacgacct atgatgaatt tctgaaagcg 660ctggaaaaac gcgtgcgctt cttccatagt gcgggtggtc gtgttagcga tcatgcaatc 720gataccatgg ttttcgccga aaccaccaaa gaagaagcgg gtcgcatttt tagtgatcgt 780ctgcaaggca ccgaagttag ctgcgaagac gagaaaaaat tcaaaaccta caccctgcag 840tttctgtgtg gcctgtatgc cgaactggac tgggcaatgc agtttcacat caacgcgctg 900cgcaacacca acaccaaaat gatgaaacgc ctgggtccgg ataccggtta tgatagcatg 960aacgatgaag aaatcgcgaa accgctgtac aaactgctga acagcgtgga aatgaaaaac 1020caactgccga aaaccatcct gtacagcctg aacccgaacg acaactacgt gatcgcgagc 1080atgatcaaca gcttccagga tggcatcacc ccgggcaaaa ttcagtttgg caccgcatgg 1140tggttcaacg ataccaaaga tggtatgctg gatcagatga aagcactgag caatgtgggc 1200ctgtttagcc gttttattgg catgctgacc gatagccgta gctttctgag ctatacccgt 1260cacgaatact ttcgccgcat tgtgtgtaac ctgatcggcg aatgggtgga aaacggcgaa 1320gttccgcgcg atatggaact gctgggtagt attgtgcaag gtatttgcta cgataacgcg 1380aaacattact tccagttcca ggaggaaaaa gcgaacgtgt ga 142233593PRTAchromobacter piechaudii 33Met Ser Gln Thr Pro Arg Lys Leu Arg Ser Gln Lys Trp Phe Asp Asp 1 5 10 15 Pro Ala His Ala Asp Met Thr Ala Ile Tyr Val Glu Arg Tyr Leu Asn 20 25 30 Tyr Gly Leu Thr Arg Gln Glu Leu Gln Ser Gly Arg Pro Ile Ile Gly 35 40 45 Ile Ala Gln Thr Gly Ser Asp Leu Ala Pro Cys Asn Arg His His Leu 50 55 60 Ala Leu Ala Glu Arg Val Lys Ala Gly Ile Arg Asp Ala Gly Gly Ile 65 70 75 80 Pro Met Glu Phe Pro Val His Pro Leu Ala Glu Gln Gly Arg Arg Pro 85 90 95 Thr Ala Ala Leu Asp Arg Asn Leu Ala Tyr Leu Gly Leu Val Glu Ile 100 105 110 Leu His Gly Tyr Pro Leu Asp Gly Val Val Leu Thr Thr Gly Cys Asp 115 120 125 Lys Thr Thr Pro Ala Cys Leu Met Ala Ala Ala Thr Val Asp Leu Pro 130 135 140 Ala Ile Val Leu Ser Gly Gly Pro Met Leu Asp Gly Trp His Asp Gly 145 150 155 160 Gln Arg Val Gly Ser Gly Thr Val Ile Trp His Ala Arg Asn Leu Met 165 170 175 Ala Ala Gly Lys Leu Asp Tyr Glu Gly Phe Met Thr Leu Ala Thr Ala 180 185 190 Ser Ser Pro Ser Val Gly His Cys Asn Thr Met Gly Thr Ala Leu Ser 195 200 205 Met Asn Ser Leu 
Ala Glu Ala Leu Gly Met Ser Leu Pro Thr Cys Ala 210 215 220 Ser Ile Pro Ala Pro Tyr Arg Glu Arg Ala Gln Met Ala Tyr Ala Thr 225 230 235 240 Gly Met Arg Ile Cys Asp Met Val Arg Glu Asp Leu Arg Pro Ser His 245 250 255 Ile Leu Thr Arg Gln Ala Phe Glu Asn Ala Ile Val Val Ala Ser Ala 260 265 270 Leu Gly Ala Ser Thr Asn Cys Pro Pro His Leu Ile Ala Met Ala Arg 275 280 285 His Ala Gly Ile Asp Leu Ser Leu Asp Asp Trp Gln Arg Leu Gly Glu 290 295 300 Asp Val Pro Leu Leu Val Asn Cys Val Pro Ala Gly Glu His Leu Gly 305 310 315 320 Glu Gly Phe His Arg Ala Gly Gly Val Pro Ala Val Met His Glu Leu 325 330 335 Phe Ala Ala Gly Arg Leu His Pro Asp Cys Pro Thr Val Ser Gly Lys 340 345 350 Thr Ile Gly Asp Ile Ala Ala Gly Ala Lys Thr Arg Asp Ala Asp Val 355 360 365 Ile Arg Ser Cys Ala Ala Pro Leu Lys His Arg Ala Gly Phe Ile Val 370 375 380 Leu Ser Gly Asn Phe Phe Asp Ser Ala Ile Ile Lys Met Ser Val Val 385 390 395 400 Gly Glu Ala Phe Arg Arg Ala Tyr Leu Ser Glu Pro Gly Ser Glu Asn 405 410 415 Ala Phe Glu Ala Arg Ala Ile Val Phe Glu Gly Pro Glu Asp Tyr His 420 425 430 Ala Arg Ile Glu Asp Pro Ala Leu Asn Ile Asp Glu His Cys Ile Leu 435 440 445 Val Ile Arg Gly Ala Gly Thr Val Gly Tyr Pro Gly Ser Ala Glu Val 450 455 460 Val Asn Met Ala Pro Pro Ser His Leu Ile Lys Arg Gly Val Asp Ser 465 470 475 480 Leu Pro Cys Leu Gly Asp Gly Arg Gln Ser Gly Thr Ser Gly Ser Pro 485 490 495 Ser Ile Leu Asn Met Ser Pro Glu Ala Ala Val Gly Gly Gly Leu Ala 500 505 510 Leu Leu Arg Thr Gly Asp Lys Ile Arg Val Asp Leu Asn Gln Arg Ser 515 520 525 Val Thr Ala Leu Val Asp Asp Ala Glu Met Ala Arg Arg Lys Gln Glu 530 535 540 Pro Pro Tyr Gln Ala Pro Ala Ser Gln Thr Pro Trp Gln Glu Leu Tyr 545 550 555 560 Arg Gln Leu Val Gly Gln Leu Ser Thr Gly Gly Cys Leu Glu Pro Ala 565 570 575 Thr Leu Tyr Leu Lys Val Ile Glu Thr Arg Gly Asp Pro Arg His Ser 580 585 590 His 34602PRTAcinetobacter sp. 
34Met Ser Glu Arg Ile Lys Lys Met Asn Asp Gln Asn Lys Arg Ile Phe 1 5 10 15 Leu Arg Ser Gln Glu Trp Phe Asp Asp Pro Glu His Ala Asp Met Thr 20 25 30 Ala Leu Tyr Val Glu Arg Tyr Met Asn Tyr Gly Leu Thr Arg Ala Glu 35 40 45 Leu Gln Ser Gly Arg Pro Ile Ile Gly Ile Ala Gln Thr Gly Ser Asp 50 55 60 Leu Thr Pro Cys Asn Arg His His Lys Glu Leu Ala Glu Arg Val Lys 65 70 75 80 Ala Gly Ile Arg Asp Ala Gly Gly Ile Pro Met Glu Phe Pro Val His 85 90 95 Pro Ile Ala Glu Gln Thr Arg Arg Pro Thr Ala Ala Leu Asp Arg Asn 100 105 110 Leu Ala Tyr Leu Gly Leu Val Glu Ile Leu His Gly Tyr Pro Leu Asp 115 120 125 Gly Val Val Leu Thr Thr Gly Cys Asp Lys Thr Thr Pro Ala Cys Leu 130 135 140 Met Ala Ala Ala Thr Thr Asp Ile Pro Ala Ile Val Leu Ser Gly Gly 145 150 155 160 Pro Met Leu Asp Gly His Phe Lys Gly Glu Leu Ile Gly Ser Gly Thr 165 170 175 Val Leu Trp His Ala Arg Asn Leu Leu Ala Thr Gly Glu Ile Asp Tyr 180 185 190 Glu Gly Phe Met Glu Met Thr Thr Ser Ala Ser Pro Ser Val Gly His 195 200 205 Cys Asn Thr Met Gly Thr Ala Leu Ser Met Asn Ala Leu Ala Glu Ala 210 215 220 Leu Gly Met Ser Leu Pro Thr Cys Ala Ser Ile Pro Ala Pro Tyr Arg 225 230 235 240 Glu Arg Gly Gln Met Ala Tyr Met Thr Gly Lys Arg Ile Cys Glu Met 245 250 255 Val Leu Glu Asp Leu Arg Pro Ser Lys Ile Met Asn Lys Gln Ser Phe 260 265 270 Glu Asn Ala Ile Ala Val Ala Ser Ala Leu Gly Ala Ser Ser Asn Cys 275 280 285 Pro Pro His Leu Ile Ala Ile Ala Arg His Met Gly Ile Glu Leu Ser 290 295 300 Leu Glu Asp Trp Gln Arg Val Gly Glu Asn Ile Pro Leu Ile Val Asn 305 310 315 320 Cys Met Pro Ala Gly Lys Tyr Leu Gly Glu Gly Phe His Arg Ala Gly 325 330 335 Gly Val Pro Ala Val Leu His Glu Leu Gln Lys Ala Ser Val Leu His 340 345 350 Glu Gly Cys Ala Ser Val Ser Gly Lys Thr Met Gly Glu Ile Ala Lys 355 360 365 Asn Ala Lys Thr Ser Asn Val Asp Val Ile Phe Pro Tyr Glu Gln Pro 370 375 380 Leu Lys His Gly Ala Gly Phe Ile Val Leu Ser Gly Asn Phe Phe Asp 385 390 395 400 Ser Ala Ile Met Lys Met Ser Val Val Gly Glu Ala Phe Lys Lys Thr 405 410 415 Tyr Leu Ser Asp Pro 
Asn Gly Glu Asn Ser Phe Glu Ala Arg Ala Ile 420 425 430 Val Phe Glu Gly Pro Glu Asp Tyr His Ala Arg Ile Asn Asp Pro Ala 435 440 445 Leu Asp Ile Asp Glu His Cys Ile Leu Val Ile Arg Gly Ala Gly Thr 450 455 460 Val Gly Tyr Pro Gly Ser Ala Glu Val Val Asn Met Ala Pro Pro Ala 465 470 475 480 Glu Leu Ile Lys Lys Gly Ile Asp Ser Leu Pro Cys Leu Gly Asp Gly 485 490 495 Arg Gln Ser Gly Thr Ser Ala Ser Pro Ser Ile Leu Asn Met Ser Pro 500 505 510 Glu Ala Ala Val Gly Gly Gly Ile Ala Leu Leu Lys Thr Asn Asp Arg 515 520 525 Leu Arg Ile Asp Leu Asn Lys Arg Ser Val Asn Val Leu Ile Ser Asp 530 535 540 Glu Glu Leu Glu Gln Arg Arg Arg Glu Trp Lys Pro Thr Val Ser Ser 545 550 555 560 Ser Gln Thr Pro Trp Gln Glu Met Tyr Arg Asn Met Val Gly Gln Leu 565 570 575 Ser Thr Gly Gly Cys Leu Glu Pro Ala Thr Leu Tyr Met Arg Val Ile 580 585 590 Asn Gln Asp Asn Leu Pro Arg His Ser His 595 600 35593PRTAchromobacter xylosoxidans 35Met Ser Gln Thr Pro Arg Lys Leu Arg Ser Gln Lys Trp Phe Asp Asp 1 5 10 15 Pro Ala His Ala Asp Met Thr Ala Ile Tyr Val Glu Arg Tyr Leu Asn 20 25 30 Tyr Gly Leu Thr Arg Gln Glu Leu Gln Ser Gly Arg Pro Ile Ile Gly 35 40 45 Ile Ala Gln Thr Gly Ser Asp Leu Ala Pro Cys Asn Arg His His Leu 50 55 60 Ala Leu Ala Glu Arg Ile Lys Ala Gly Ile Arg Asp Ala Gly Gly Ile 65 70 75 80 Pro Met Glu Phe Pro Val His Pro Leu Ala Glu Gln Gly Arg Arg Pro 85 90 95 Thr Ala Ala Leu Asp Arg Asn Leu Ala Tyr Leu Gly Leu Val Glu Ile 100 105 110 Leu His Gly Tyr Pro Leu Asp Gly Val Val Leu Thr Thr Gly Cys Asp 115 120 125 Lys Thr Thr Pro Ala Cys Leu Met Ala Ala Ala Thr Val Asp Ile Pro 130 135 140 Ala Ile Val Leu Ser Gly Gly Pro Met Leu Asp Gly Trp His Asp Gly 145 150 155 160 Gln Arg Val Gly Ser Gly Thr Val Ile Trp His Ala Arg Asn Leu Met 165 170 175 Ala Ala Gly Lys Leu Asp Tyr Glu Gly Phe Met Thr Leu Ala Thr Ala 180 185 190 Ser Ser Pro Ser Ile Gly His Cys Asn Thr Met Gly Thr Ala Leu Ser 195 200 205 Met Asn Ser Leu Ala Glu Ala Leu Gly Met Ser Leu Pro Thr Cys Ala 210 215 220 Ser Ile Pro Ala Pro Tyr Arg Glu Arg 
Gly Gln Met Ala Tyr Ala Thr 225 230 235 240 Gly Leu Arg Ile Cys Asp Met Val Arg Glu Asp Leu Arg Pro Ser His 245 250 255 Val Leu Thr Arg Gln Ala Phe Glu Asn Ala Ile Val Val Ala Ser Ala 260 265 270 Leu Gly Ala Ser Ser Asn Cys Pro Pro His Leu Ile Ala Met Ala Arg 275 280 285 His Ala Gly Ile Asp Leu Ser Leu Asp Asp Trp Gln Arg Leu Gly Glu 290 295 300 Asp Val Pro Leu Leu Val Asn Cys Val Pro Ala Gly Glu His Leu Gly 305 310 315 320 Glu Gly Phe His Arg Ala Gly Gly Val Pro Ala Val Leu His Glu Leu 325 330 335 Ala Ala Ala Gly Arg Leu His Met Asp Cys Ala Thr Val Ser Gly Lys 340 345 350 Thr Ile Gly Glu Ile Ala Ala Ala Ala Lys Thr Asn Asn Ala Asp Val 355 360 365 Ile Arg Ser Cys Asp Ala Pro Leu Lys His Arg Ala Gly Phe Ile Val 370 375 380 Leu Ser Gly Asn Phe Phe Asp Ser Ala Ile Ile Lys Met Ser Val Val 385 390 395 400 Gly Glu Ala Phe Arg Arg Ala Tyr Leu Ser Glu Pro Gly Ser Glu Asn 405 410 415 Ala Phe Glu Ala Arg Ala Ile Val Phe Glu Gly Pro Glu Asp Tyr His 420

425 430 Ala Arg Ile Glu Asp Pro Thr Leu Asn Ile Asp Glu His Cys Ile Leu 435 440 445 Val Ile Arg Gly Ala Gly Thr Val Gly Tyr Pro Gly Ser Ala Glu Val 450 455 460 Val Asn Met Ala Pro Pro Ser His Leu Leu Lys Arg Gly Ile Asp Ser 465 470 475 480 Leu Pro Cys Leu Gly Asp Gly Arg Gln Ser Gly Thr Ser Ala Ser Pro 485 490 495 Ser Ile Leu Asn Met Ser Pro Glu Ala Ala Val Gly Gly Gly Leu Ala 500 505 510 Leu Leu Arg Thr Gly Asp Arg Ile Arg Val Asp Leu Asn Gln Arg Ser 515 520 525 Val Ile Ala Leu Val Asp Gln Thr Glu Met Glu Arg Arg Lys Leu Glu 530 535 540 Pro Pro Tyr Gln Ala Pro Glu Ser Gln Thr Pro Trp Gln Glu Leu Tyr 545 550 555 560 Arg Gln Leu Val Gly Gln Leu Ser Thr Gly Gly Cys Leu Glu Pro Ala 565 570 575 Thr Leu Tyr Leu Lys Val Val Glu Thr Arg Gly Asp Pro Arg His Ser 580 585 590 His 361782DNAAchromobacter piechaudii 36atgtctcaga caccccgcaa gttgcgcagc cagaaatggt tcgacgaccc tgcgcatgcc 60gatatgacgg cgatttacgt cgagcgttat ctgaattacg gcctgacgcg gcaagagttg 120cagtccgggc ggccgatcat cggcatcgcc cagaccggca gcgatctggc gccctgcaac 180cgccatcacc tggcgctggc cgagcgcgtc aaagcgggca tccgggacgc gggcggcatc 240ccgatggagt tccccgtgca cccgctggcc gaacaaggcc ggcggcccac ggccgcgctg 300gaccgcaacc tggcctatct gggcctggtc gaaatcctgc acggctaccc cttggacggg 360gtggtgctga cgactggctg cgacaagacc acgcctgcct gcctgatggc cgccgccacg 420gtcgacctgc ccgccatcgt gctgtccggc ggccccatgc tggacggctg gcacgacggc 480cagcgcgtcg gttccggcac cgtcatctgg cacgcgcgca acctgatggc ggccggcaag 540cttgattacg aaggcttcat gacgctggcc accgcgtctt cgccgtcggt cggccactgc 600aacaccatgg gcacggcgtt gtcgatgaat tcgctggccg aagcgctggg catgtcgctg 660cccacctgcg ccagcattcc cgccccctac cgcgaacgcg cccagatggc ctacgccacc 720ggcatgcgca tctgcgacat ggtgcgcgaa gacctgcgac cctcccacat cctgacacgg 780caggcattcg agaacgccat cgtcgtggca tcggcgctgg gcgcgtccac caattgcccg 840ccgcacctga tcgcgatggc ccgccacgcc ggcatcgacc ttagcctgga cgactggcag 900cgcctgggtg aagacgtgcc gctgctggtc aactgcgtgc cggcgggcga gcatctgggc 960gagggcttcc accgcgcggg cggcgtcccc gcggtcatgc atgaactgtt cgccgccggg 1020cgccttcacc 
ccgactgccc caccgtatcc ggcaagacca tcggggacat cgccgcgggc 1080gccaagaccc gcgacgccga cgtcatccgc agctgcgccg ccccgctgaa acaccgggca 1140ggcttcatcg tgctgtcggg caatttcttc gacagcgcca tcatcaagat gtcggtcgta 1200ggcgaagcgt tccgccgcgc ctacctgtcc gaacccggct cagagaacgc cttcgaggcc 1260cgcgccatcg tgttcgaagg ccccgaggac taccacgcgc gcatcgaaga cccggcgctg 1320aacatcgacg aacactgcat ccttgtcatc cgcggcgccg gcaccgtggg ctacccgggc 1380agcgccgaag tggtcaacat ggcgccgccg tcccacctga tcaagcgcgg cgtggattcc 1440ctgccgtgcc tgggggatgg caggcaaagc ggcacttccg gcagcccgtc cattttgaac 1500atgtcccctg aagcagcagt cgggggagga ttggcgctgc tgcgcaccgg cgacaagatc 1560cgtgtcgatc tgaaccagcg cagcgtcacc gccttggtcg acgacgcgga aatggcaaga 1620cggaagcaag aaccgcccta ccaggcaccg gcctcgcaaa cgccctggca agagctgtac 1680cggcaactgg tcggccagtt gtcgacgggc ggctgcctgg agcccgcgac gctatatctg 1740aaagtcatcg aaacgcgcgg cgatccccgg cactctcact ga 1782371809DNAAcinetobacter sp. 37atgagtgaaa ggatcaaaaa aatgaatgat caaaataaac ggattttttt acgtagccaa 60gaatggtttg atgatcctga acatgctgac atgacagcac tctatgttga gcgttatatg 120aattatggcc tgacccgtgc cgagctacaa tcaggccgcc cgattattgg tattgcacaa 180actggcagtg atttaactcc atgtaaccgt caccacaaag aacttgctga acgggttaaa 240gcaggtattc gagatgcggg aggtattccc atggaattcc ccgttcaccc gattgcagaa 300caaacccgtc gccctactgc tgcacttgat agaaatttag cttacttagg cttagttgaa 360atattgcatg gttatccgct tgatggtgtg gtgctaacca caggttgtga caaaactaca 420cctgcttgtt taatggctgc cgcaacgaca gatataccag ccattgtgtt gtctggtgga 480ccaatgctag atggtcattt taaaggtgag ttaattggtt ctgggactgt gctttggcat 540gcaagaaatt tacttgccac gggtgaaatt gattatgaag ggttcatgga aatgaccact 600tcagcatcgc cttcggtcgg acattgcaac accatgggca ctgcactttc tatgaatgcc 660ttggcagaag ctttgggcat gtctttaccg acatgtgcaa gtattccagc gccgtatcgc 720gaacgagggc aaatggccta tatgacaggc aaaagaattt gtgaaatggt tttagaagat 780ttacgccctt ctaaaatcat gaacaaacaa tcatttgaaa atgccatcgc ggtagcttca 840gcattagggg catcaagtaa ttgccctcct cacctcattg caattgcccg tcatatgggc 900attgagctca gtttagaaga ctggcaacgc gttggggaga acattcctct 
cattgtgaac 960tgtatgcctg cgggtaaata tttaggtgaa ggttttcacc gtgctggcgg tgttcctgct 1020gttttgcatg aattacaaaa ggccagcgtt ttacatgaag gctgtgcatc agtcagcggt 1080aaaacgatgg gagaaattgc taaaaatgct aaaacctcca atgtagatgt tatttttcca 1140tatgaacaac cattaaaaca tggtgcaggt tttattgtgc ttagtggcaa tttcttcgac 1200agcgccatta tgaaaatgtc tgttgtgggt gaagcattta agaaaaccta tttatctgac 1260ccaaatgggg aaaatagctt tgaagcacgg gcaatcgttt ttgaagggcc agaggactac 1320catgcacgaa ttaatgatcc agccttagac attgatgaac attgtatttt ggtcattcgt 1380ggcgctggaa cagtgggcta tccaggtagt gcagaagttg taaatatggc tccacccgca 1440gagttaatta aaaaaggcat cgattcactg ccttgcttag gagatggccg ccaaagtggt 1500acgtctgcca gcccttctat tttaaatatg tcacccgaag cggcggtagg cggtggaatt 1560gcattattaa agaccaatga ccgtttacgc attgatctca ataaacgctc cgtcaacgta 1620ctcatttctg acgaagagtt agaacaacgc cgccgtgagt ggaaaccgac ggtctcttca 1680tctcaaacac cttggcaaga aatgtatcgc aacatggtgg gtcaattatc cactggcggt 1740tgtttggaac ctgcaacttt atatatgcga gtcataaatc aagacaacct tccaagacac 1800tctcattaa 1809381782DNAAchromobacter xylosoxidans 38atgagccaaa caccgcgtaa attacgcagc cagaagtggt ttgacgatcc tgcacatgcc 60gatatgaccg ccatctatgt tgaacgctac ctgaactatg gcttaacccg ccaagaactg 120caaagtggtc gcccgattat tggtattgcc caaaccggca gcgatttagc cccgtgtaat 180cgccatcatt tagccttagc cgaacgcatt aaagcaggca ttagagatgc aggcggcatt 240cctatggaat ttcccgttca tccgctggcc gaacaaggta gacgtcctac agcagcatta 300gatcgcaatt tagcctattt aggcctggtg gaaattttac acggctatcc cctggacggt 360gtggtgctga caaccggttg cgataaaaca acaccggcgt gtttaatggc agctgcaaca 420gttgatattc cggcgatcgt gttatcaggt ggtccgatgt tagatggctg gcatgatggc 480caaagagttg gcagtggtac cgtgatttgg catgcacgca atttaatggc agcaggcaaa 540ctggattatg aaggcttcat gaccctggcg acagcctctt ctccgagtat tggacactgt 600aataccatgg gcacagcctt aagcatgaat agtctggcag aagccctggg tatgtcttta 660ccgacctgtg cgtctattcc agccccgtat agagaacgcg gtcaaatggc gtatgctact 720ggtttacgca tttgcgatat ggtgcgcgaa gatttacgcc cgtcacatgt tttaacccgc 780caagccttcg aaaatgccat tgttgttgcc tcagccttag gtgcaagctc 
taattgtccc 840cctcatttaa ttgccatggc ccgtcatgcc ggtatcgact taagcctgga tgactggcaa 900cgcttaggcg aagatgttcc gttactggtc aattgtgtgc ctgccggtga acatttaggt 960gaaggatttc atcgcgcggg tggtgttcct gctgttttac atgaattagc tgccgcaggt 1020cgtttacata tggattgtgc taccgtttct ggcaagacca tcggcgaaat tgcagctgcc 1080gcaaaaacca acaacgcaga cgtgattcgc tcgtgtgatg ccccgttaaa acatagagcc 1140ggctttattg tgttaagcgg caatttcttc gactccgcca tcatcaagat gtccgttgtg 1200ggtgaagcct ttcgcagagc ctatttaagt gaacctggca gcgaaaatgc ctttgaagcc 1260cgtgccatcg tgtttgaagg cccggaagac tatcatgccc gcattgaaga tccgaccctg 1320aatattgatg aacactgcat tctggtgatt cgcggcgcag gtaccgttgg ttatcctggt 1380agtgctgaag ttgtgaatat ggccccgccg agccatttat taaaacgcgg tattgattca 1440ttaccttgcc tgggagatgg ccgccaaagt ggtacctcag ctagtccgtc tatcctgaat 1500atgagccctg aagccgccgt tggaggaggt ttagcattat taagaaccgg tgatcgcatt 1560cgcgtcgatc tgaatcaacg ctcagtcatt gcattagtcg accagaccga aatggaacgc 1620cgcaaattag aaccaccgta tcaagcacct gaaagccaaa ccccgtggca agaactgtat 1680cgccaattag tcggtcaact gtcaacaggc ggctgcctgg aaccagccac cttatattta 1740aaagtcgtgg aaacccgtgg agatcctcgt catagccatt aa 178239451PRTTerriglobuds roseus 39Met Asp Arg Arg Glu Leu Leu Lys Thr Ser Ala Leu Leu Met Ala Ala 1 5 10 15 Ala Pro Leu Ala Arg Ala Ala Asn Val Pro Glu Asp His Ala Asn Val 20 25 30 Pro Arg Thr Asn Trp Ser Lys Asn Phe His Tyr Ser Thr Ser Arg Val 35 40 45 Tyr Ala Pro Thr Thr Pro Glu Glu Val Pro Ala Ile Val Leu Glu Asn 50 55 60 Gly His Leu Lys Gly Leu Gly Ser Arg His Cys Phe Asn Asn Ile Ala 65 70 75 80 Asp Ser Gln Tyr Ala Gln Ile Ser Met Arg Glu Val Lys Gly Ile Gln 85 90 95 Ile Asp Glu Ala Ala Gln Thr Val Thr Val Gly Ala Gly Ile Ala Tyr 100 105 110 Gly Glu Leu Ala Pro Val Leu Asp Lys Ala Gly Phe Ala Leu Ala Asn 115 120 125 Leu Ala Ser Leu Pro His Ile Ser Val Gly Gly Thr Ile Ala Thr Ala 130 135 140 Thr His Gly Ser Gly Val Gly Asn Lys Asn Leu Ser Ser Ala Thr Arg 145 150 155 160 Ala Ile Glu Ile Val Lys Ala Asp Gly Ser Ile Leu Arg Leu Ser Arg 165 170 175 Asp Thr Asp Gly Glu Arg Phe 
Arg Met Ala Val Val His Leu Gly Ala 180 185 190 Leu Gly Val Leu Thr Lys Val Thr Leu Asp Ile Val Pro Arg Phe Asp 195 200 205 Met Ser Gln Val Val Tyr Arg Asn Leu Ser Phe Asp Gln Leu Glu His 210 215 220 Asn Leu Asp Thr Ile Leu Ser Ser Gly Tyr Ser Val Ser Leu Phe Thr 225 230 235 240 Asp Trp Gln Arg Asn Arg Val Asn Gln Val Trp Ile Lys Asp Lys Ala 245 250 255 Thr Ala Asp Ala Pro Gln Lys Pro Leu Pro Pro Met Phe Tyr Gly Ala 260 265 270 Thr Leu Gln Thr Ala Lys Leu His Pro Ile Asp Asp His Pro Ala Asp 275 280 285 Ala Cys Thr Glu Gln Met Gly Ser Val Gly Pro Trp Tyr Leu Arg Leu 290 295 300 Pro His Phe Lys Met Glu Phe Thr Pro Ser Ser Gly Glu Glu Leu Gln 305 310 315 320 Thr Glu Tyr Phe Val Ala Arg Lys Asp Gly Tyr Arg Ala Ile Arg Ala 325 330 335 Val Glu Lys Leu Arg Asp Lys Ile Thr Pro His Leu Phe Ile Thr Glu 340 345 350 Ile Arg Thr Ile Ala Ala Asp Asp Leu Pro Met Ser Met Ala Tyr Gln 355 360 365 Arg Asp Ser Met Ala Ile His Phe Thr Trp Lys Pro Glu Glu Pro Thr 370 375 380 Val Arg Lys Leu Leu Pro Glu Ile Glu Ala Ala Leu Ala Pro Phe Gly 385 390 395 400 Val Arg Pro His Trp Gly Lys Ile Phe Glu Ile Pro Pro Ser Tyr Leu 405 410 415 His Lys Gln Tyr Pro Ala Leu Pro Arg Phe Arg Ala Met Ala Gln Ala 420 425 430 Leu Asp Pro Gly Gly Lys Phe Arg Asn Ala Tyr Leu Asp Arg Asn Ile 435 440 445 Phe Gly Ala 450 40450PRTGranulicella mallensis 40Met Asp Lys Arg Asp Phe Leu Lys Gly Ser Ala Thr Thr Ala Val Ala 1 5 10 15 Leu Met Met Gly Leu Asn Glu Ser Lys Ala Phe Ala Asp Asp Ser Val 20 25 30 Pro Arg Thr Asn Trp Ser Gly Asn Tyr His Tyr Ser Thr Asn Lys Val 35 40 45 Leu Gln Pro Ala Ser Val Ala Glu Thr Gln Asp Ala Val Arg Ser Val 50 55 60 Ala Gly Val Arg Ala Leu Gly Thr Arg His Ser Phe Asn Gly Ile Ala 65 70 75 80 Asp Ser Gln Ile Ala Gln Ile Ser Thr Leu Lys Leu Lys Asp Val Ser 85 90 95 Leu Asp Ala Lys Ser Ser Thr Val Thr Val Gly Ala Gly Ile Arg Tyr 100 105 110 Gly Asp Leu Ala Val Gln Leu Asp Ala Lys Gly Phe Ala Leu His Asn 115 120 125 Leu Ala Ser Leu Pro His Ile Ser Val Gly Gly Ala Cys Ala Thr Ala 130 135 140 Thr 
His Gly Ser Gly Met Gly Asn Gly Asn Leu Ala Thr Ala Val Lys 145 150 155 160 Ala Val Glu Phe Val Ala Ala Asp Gly Ser Val His Thr Leu Ser Arg 165 170 175 Asp Arg Asp Gly Asp Arg Phe Ala Gly Ser Val Val Gly Leu Gly Ala 180 185 190 Leu Gly Val Val Thr His Leu Thr Leu Gln Val Gln Pro Arg Phe Glu 195 200 205 Met Thr Gln Val Val Tyr Arg Asp Leu Pro Phe Ser Glu Leu Glu His 210 215 220 His Leu Pro Glu Ile Met Gly Ala Gly Tyr Ser Val Ser Leu Phe Thr 225 230 235 240 Asp Trp Gln Asn Gly Arg Ala Gly Glu Val Trp Ile Lys Arg Arg Val 245 250 255 Asp Gln Gly Gly Ala Ser Ala Pro Pro Ala Arg Phe Phe Asn Ala Thr 260 265 270 Leu Ala Thr Thr Lys Leu His Pro Ile Leu Asp His Pro Ala Glu Ala 275 280 285 Cys Thr Asp Gln Leu Asn Thr Val Gly Pro Trp Tyr Glu Arg Leu Pro 290 295 300 His Phe Lys Leu Asn Phe Thr Pro Ser Ser Gly Gln Glu Leu Gln Thr 305 310 315 320 Glu Phe Phe Val Pro Phe Asp Arg Gly Tyr Asp Ala Ile Arg Ala Val 325 330 335 Glu Thr Leu Arg Asp Val Ile Thr Pro His Leu Tyr Ile Thr Glu Leu 340 345 350 Arg Ala Val Ala Ala Asp Asp Leu Trp Met Ser Met Ala Tyr Gln Arg 355 360 365 Pro Ser Leu Ala Ile His Phe Thr Trp Lys Pro Glu Thr Asp Ala Val 370 375 380 Leu Lys Leu Leu Pro Gln Ile Glu Ala Lys Leu Ala Pro Phe Gly Ala 385 390 395 400 Arg Pro His Trp Ala Lys Val Phe Thr Met Lys Ser Ser His Val Ala 405 410 415 Pro Leu Tyr Pro Arg Leu Lys Asp Phe Leu Val Leu Ala Lys Ser Phe 420 425 430 Asp Pro Lys Gly Lys Phe Gln Asn Ala Phe Leu Gln Asp His Val Asp 435 440 445 Ile Ala 450 41414PRTStreptomyces acidiscabies 41Met Thr Ala Ser Val Thr Asn Trp Ala Gly Asn Ile Ser Phe Val Ala 1 5 10 15 Lys Asp Val Val Arg Pro Gly Gly Val Glu Ala Leu Arg Lys Val Val 20 25 30 Ala Gly Asn Asp Arg Val Arg Val Leu Gly Ser Gly His Ser Phe Asn 35 40 45 Arg Ile Ala Glu Pro Gly Ala Asp Gly Val Leu Val Ser Leu Asp Ala 50 55 60 Leu Pro Gln Val Ile Asp Val Asp Thr Glu Arg Arg Thr Val Arg Val 65 70 75 80 Gly Gly Gly Val Lys Tyr Ala Glu Leu Ala Arg His Val Asn Glu Ser 85 90 95 Gly Leu Ala Leu Pro Asn Met Ala Ser Leu Pro His Ile 
Ser Val Ala 100 105 110 Gly Ser Val Ala Thr Gly Thr His Gly Ser Gly Val Asn Asn Gly Pro 115 120 125 Leu Ala Thr Pro Val Arg Glu Val Glu Leu Leu Thr Ala Asp Gly Ser 130 135 140 Leu Val Thr Ile Gly Lys Asp Asp Ala Arg Phe Pro Gly Ala Val Thr 145 150 155 160 Ser Leu Gly Ala Leu Gly Val Val Val Ala Leu Thr Leu Asp Leu Glu 165 170 175 Pro Ala Tyr Gly Val Glu Gln Tyr Thr Phe Thr Glu Leu Pro Leu Glu 180 185 190 Gly Leu Asp Phe Glu Ala Val Ala Ser Ala Ala Tyr Ser Val Ser Leu 195 200 205 Phe Thr Asp Trp Arg Glu Ala Gly Phe Arg Gln Val Trp Val Lys Arg 210 215 220 Arg Ile Asp Glu Pro Tyr Ala Gly Phe Pro Trp Ala Ala Pro Ala Thr 225 230 235 240 Glu Lys Leu His Pro Val Pro Gly Met Pro Ala Glu Asn Cys Thr Asp 245 250 255 Gln Phe Gly Ala Ala Gly Pro Trp His Glu Arg Leu Pro His Phe Lys 260 265 270 Ala Glu Phe Thr Pro Ser Ser Gly Asp Glu Leu Gln Ser Glu Tyr Leu 275 280 285 Leu Pro Arg Glu His Ala Leu Ala Ala Leu Asp Ala Val Gly Asn Val 290 295 300 Arg Glu Thr Val Ser Thr Val Leu Gln Ile Cys Glu Val Arg Thr Ile 305 310 315 320 Ala Ala Asp Thr Gln Trp Leu Ser Pro Ala Tyr Gly Arg Asp Ser Val 325 330 335 Ala Leu His Phe Thr Trp Thr Asp Asp Met Asp Ala Val Leu Pro Ala 340 345 350 Val Arg Ala Val Glu Ser Ala Leu Asp Gly Phe Gly Ala Arg Pro His 355 360 365 Trp Gly Lys Val Phe Thr Thr Ala Pro Ala Ala Leu Arg
Glu Arg Tyr 370 375 380 Pro Arg Leu Asp Asp Phe Arg Thr Leu Arg Asp Glu Leu Asp Pro Ala 385 390 395 400 Gly Lys Phe Thr Asn Ala Phe Val Arg Asp Val Leu Glu Gly 405 410 42427PRTActinomycetales 42Met Thr Leu Glu Arg Asn Trp Ala Gly Thr His Thr Phe Ala Ala Pro 1 5 10 15 Arg Ile Val Asn Ala Thr Ser Ile Asp Glu Val Arg Ala Leu Val Ala 20 25 30 Glu Ala Ala Arg Thr Gly Thr Arg Val Arg Ala Leu Gly Thr Arg His 35 40 45 Ser Phe Thr Asp Leu Ala Asp Ser Asp Gly Thr Leu Ile Thr Val Leu 50 55 60 Asp Ile Pro Ala Asp Pro Val Phe Asp Glu Ala Ala Gly Ser Val Thr 65 70 75 80 Ile Gly Ala Gly Thr Arg Tyr Gly Ile Ala Ala Ala Trp Leu Ala Glu 85 90 95 His Gly Leu Ala Phe His Asn Met Gly Ser Leu Pro His Ile Ser Val 100 105 110 Gly Gly Ala Ile Ala Thr Gly Thr His Gly Ser Gly Asn Asp Asn Gly 115 120 125 Ile Leu Ser Ser Ala Val Ser Gly Leu Glu Tyr Val Asp Ala Thr Gly 130 135 140 Glu Leu Val His Val Arg Arg Gly Asp Pro Gly Phe Asp Gly Leu Val 145 150 155 160 Val Gly Leu Gly Ala Tyr Gly Ile Val Val Arg Val Thr Val Asp Val 165 170 175 Gln Pro Ala Tyr Arg Val Arg Gln Asp Val Tyr Arg Asp Val Pro Trp 180 185 190 Asp Ala Val Leu Ala Asp Phe Glu Gly Val Thr Gly Gly Ala Tyr Ser 195 200 205 Val Ser Ile Phe Thr Asn Trp Leu Gly Asp Thr Val Glu Gln Ile Trp 210 215 220 Trp Lys Thr Arg Leu Val Ala Gly Asp Asp Glu Leu Pro Val Val Pro 225 230 235 240 Glu Ser Trp Leu Gly Val Gln Arg Asp Ser Leu Thr Ala Gly Asn Leu 245 250 255 Val Glu Thr Asp Pro Asp Asn Leu Thr Leu Gln Gly Gly Val Pro Gly 260 265 270 Asp Trp Trp Glu Arg Leu Pro His Phe Arg Leu Glu Ser Thr Pro Ser 275 280 285 Asn Gly Asp Glu Ile Gln Thr Glu Tyr Phe Ile Asp Arg Ala Asp Gly 290 295 300 Pro Ala Ala Ile Thr Ala Leu Arg Ala Leu Gly Asp Arg Ile Ala Pro 305 310 315 320 Leu Leu Leu Val Thr Glu Leu Arg Thr Ala Ala Pro Asp Lys Leu Trp 325 330 335 Leu Ser Gly Ala Tyr His Arg Glu Met Leu Ala Val His Phe Thr Trp 340 345 350 Arg Asn Leu Pro Glu Glu Val Arg Ala Val Leu Pro Ala Ile Glu Glu 355 360 365 Ala Leu Ala Pro Phe Asp Ala Arg Pro His Trp Gly Lys Leu Asn 
Leu 370 375 380 Leu Thr Ala Glu Arg Ile Ala Glu Val Val Pro Arg Leu Ala Asp Ala 385 390 395 400 Arg Asp Leu Phe Glu Glu Leu Asp Pro Ala Gly Thr Phe Ser Asn Ala 405 410 415 His Leu Glu Arg Ile Gly Val Arg Leu Pro Arg 420 425 43419PRTFrankia sp. 43Met Arg Asp Ala Ala Ala Ala Asn Trp Ala Gly Asn Val Arg Phe Gly 1 5 10 15 Ala Ala Arg Val Val Ala Pro Glu Ser Val Gly Glu Leu Gln Glu Ile 20 25 30 Val Ala Gly Ser Arg Lys Ala Arg Ala Leu Gly Thr Gly His Ser Phe 35 40 45 Ser Arg Ile Ala Asp Thr Asp Gly Thr Leu Ile Ala Thr Ala Arg Leu 50 55 60 Pro Arg Arg Ile Gln Ile Asp Asp Gly Ser Val Thr Val Ser Gly Gly 65 70 75 80 Ile Arg Tyr Gly Asp Leu Ala Arg Glu Leu Ala Pro Asn Gly Trp Ala 85 90 95 Leu Arg Asn Leu Gly Ser Leu Pro His Ile Ser Val Ala Gly Ala Cys 100 105 110 Ala Thr Gly Thr His Gly Ser Gly Asp Arg Asn Gly Ser Leu Ala Thr 115 120 125 Ser Val Ala Ala Leu Glu Leu Val Thr Ala Ser Gly Glu Leu Val Ser 130 135 140 Val Arg Arg Gly Asp Glu Asp Phe Asp Gly His Val Ile Ala Leu Gly 145 150 155 160 Ala Leu Gly Val Thr Val Ala Val Thr Leu Asp Leu Val Pro Gly Phe 165 170 175 Gln Val Arg Gln Leu Val Tyr Glu Gly Leu Thr Arg Asp Thr Leu Leu 180 185 190 Glu Ser Val Gln Glu Ile Phe Ala Ala Ser Tyr Ser Val Ser Val Phe 195 200 205 Thr Gly Trp Asp Pro Glu Ser Ser Gln Leu Trp Leu Lys Gln Arg Val 210 215 220 Asp Gly Pro Gly Asp Asp Gly Glu Pro Pro Ala Glu Arg Phe Gly Ala 225 230 235 240 Arg Leu Ala Thr Arg Pro Leu His Pro Val Pro Gly Ile Asp Pro Thr 245 250 255 His Thr Thr Gln Gln Leu Gly Val Pro Gly Pro Trp His Glu Arg Leu 260 265 270 Pro His Phe Arg Leu Asp Phe Thr Pro Ser Ala Gly Asp Glu Leu Gln 275 280 285 Thr Glu Tyr Phe Val Ala Arg Glu His Ala Ala Ala Ala Ile Glu Ala 290 295 300 Leu Phe Ala Ile Gly Ala Val Val Arg Pro Ala Leu Gln Ile Ser Glu 305 310 315 320 Ile Arg Thr Val Ala Ala Asp Ala Leu Trp Leu Ser Pro Ala Tyr Arg 325 330 335 Arg Asp Val Met Ala Leu His Phe Thr Trp Ile Ser Ala Glu Gly Thr 340 345 350 Val Met Pro Ala Val Ala Ala Val Glu Arg Ala Leu Ala Pro Phe Asp 355 360 365 Pro Val 
Pro His Trp Gly Lys Val Phe Ala Leu Pro Pro Ala Ala Val 370 375 380 Arg Ala Gly Tyr Pro Arg Ala Ala Glu Phe Leu Ala Leu Ala Ala Arg 385 390 395 400 Arg Asp Pro Glu Ala Val Phe Arg Asn Gln Tyr Leu Asp Ala Tyr Leu 405 410 415 Pro Ala Ala 44413PRTPropionibacteriacaeae 44Met Thr Gln Arg Asn Trp Ala Gly Asn Val Ser Tyr Ser Ser Ser Arg 1 5 10 15 Val Ala Glu Pro Ala Ser Val Asp Asp Leu Thr Ala Leu Val Glu Ser 20 25 30 Glu Pro Arg Val Arg Pro Leu Gly Ser Arg His Cys Phe Asn Asp Ile 35 40 45 Ala Asp Thr Pro Gly Val His Val Ser Leu Ala Arg Leu Arg Gly Glu 50 55 60 Glu Pro Arg Leu Thr Ala Pro Gly Thr Leu Arg Thr Pro Ala Trp Leu 65 70 75 80 Arg Tyr Gly Asp Leu Val Pro Val Leu Arg Glu Ala Gly Ala Ala Leu 85 90 95 Ala Asn Leu Ala Ser Leu Pro His Ile Ser Val Ala Gly Ala Val Gln 100 105 110 Thr Gly Thr His Gly Ser Gly Asp Arg Ile Gly Thr Leu Ala Thr Gln 115 120 125 Val Ser Ala Leu Glu Leu Val Thr Gly Thr Gly Glu Val Leu Arg Leu 130 135 140 Glu Arg Gly Glu Pro Asp Phe Asp Gly Ala Val Val Gly Leu Gly Ala 145 150 155 160 Leu Gly Val Leu Thr His Val Glu Leu Asp Val Ser Pro Ala Arg Asp 165 170 175 Val Ala Gln His Val Tyr Glu Gly Val Arg Leu Asp Asp Val Leu Ala 180 185 190 Asp Leu Gly Ala Val Thr Gly Ala Gly Asp Ser Val Ser Met Phe Thr 195 200 205 His Trp Gln Asp Pro Ala Val Val Ser Gln Val Trp Val Lys Ser Gly 210 215 220 Gly Asp Val Asp Asp Ala Ala Ile Arg Asp Ala Gly Gly Arg Pro Ala 225 230 235 240 Asp Gly Pro Arg His Pro Ile Ala Gly Ile Asp Pro Thr Pro Cys Thr 245 250 255 Pro Gln Leu Gly Glu Pro Gly Pro Trp Tyr Asp Arg Leu Pro His Phe 260 265 270 Arg Leu Glu Phe Thr Pro Ser Val Gly Glu Glu Leu Gln Ser Glu Tyr 275 280 285 Leu Val Asp Arg Asp Asp Ala Val Asp Ala Ile Arg Ala Val Gln Asp 290 295 300 Leu Ala Pro Arg Ile Ala Pro Leu Leu Phe Val Cys Glu Ile Arg Thr 305 310 315 320 Met Ala Ser Asp Gly Leu Trp Leu Ser Pro Ala Gln Gly Arg Asp Thr 325 330 335 Val Gly Leu His Phe Thr Trp Arg Pro Asp Glu Ser Ala Val Arg Gln 340 345 350 Leu Leu Pro Glu Ile Glu Arg Ala Leu Pro Ala Ser Ala Arg Pro 
His 355 360 365 Trp Gly Lys Val Phe Thr Leu Pro Gly His Asp Val Ala Ala Arg Tyr 370 375 380 Pro Arg Trp Ala Asp Phe Val Ala Leu Arg Arg Arg Leu Asp Pro Glu 385 390 395 400 Arg Arg Phe Ala Asn Ala Tyr Leu Glu Arg Leu Gly Leu 405 410 45420PRTStreptomyces sp. 45Met Thr Pro Ala Glu Lys Asn Trp Ala Gly Asn Ile Thr Phe Gly Ala 1 5 10 15 Lys Arg Leu Cys Val Pro Arg Ser Val Arg Glu Leu Arg Glu Thr Val 20 25 30 Ala Ala Ser Gly Ala Val Arg Pro Leu Gly Thr Arg His Ser Phe Asn 35 40 45 Thr Val Ala Asp Thr Ser Gly Asp His Val Ser Leu Ala Gly Leu Pro 50 55 60 Arg Val Val Asp Ile Asp Val Pro Gly Arg Ala Val Ser Leu Ser Ala 65 70 75 80 Gly Leu Arg Phe Gly Glu Phe Ala Ala Glu Leu His Ala Arg Gly Leu 85 90 95 Ala Leu Ala Asn Leu Gly Ser Leu Pro His Ile Ser Val Ala Gly Ala 100 105 110 Val Ala Thr Gly Thr His Gly Ser Gly Val Gly Asn Arg Ser Leu Ala 115 120 125 Gly Ala Val Arg Ala Leu Ser Leu Val Thr Ala Asp Gly Glu Thr Arg 130 135 140 Thr Leu Arg Arg Thr Asp Glu Asp Phe Ala Gly Ala Val Val Ser Leu 145 150 155 160 Gly Ala Leu Gly Val Val Thr Ser Leu Glu Leu Asp Leu Val Pro Ala 165 170 175 Phe Glu Val Arg Gln Trp Val Tyr Glu Asp Leu Pro Glu Ala Thr Leu 180 185 190 Ala Ala Arg Phe Asp Glu Val Met Ser Ala Ala Tyr Ser Val Ser Val 195 200 205 Phe Thr Asp Trp Arg Pro Gly Pro Val Gly Gln Val Trp Leu Lys Gln 210 215 220 Arg Val Gly Asp Glu Gly Ala Arg Ser Val Met Pro Ala Glu Trp Leu 225 230 235 240 Gly Ala Arg Leu Ala Asp Gly Pro Arg His Pro Val Pro Gly Met Pro 245 250 255 Ala Gly Asn Cys Thr Ala Gln Gln Gly Val Pro Gly Pro Trp His Glu 260 265 270 Arg Leu Pro His Phe Arg Met Glu Phe Thr Pro Ser Asn Gly Asp Glu 275 280 285 Leu Gln Ser Glu Tyr Phe Val Ala Arg Ala Asp Ala Val Ala Ala Tyr 290 295 300 Glu Ala Leu Ala Arg Leu Arg Asp Arg Ile Ala Pro Val Leu Gln Val 305 310 315 320 Ser Glu Leu Arg Thr Val Ala Ala Asp Asp Leu Trp Leu Ser Pro Ala 325 330 335 His Gly Arg Asp Ser Val Ala Phe His Phe Thr Trp Val Pro Asp Ala 340 345 350 Ala Ala Val Ala Pro Val Ala Gly Ala Ile Glu Glu Ala Leu Ala Pro 355 
360 365 Phe Gly Ala Arg Pro His Trp Gly Lys Val Phe Ser Thr Ala Pro Glu 370 375 380 Val Leu Arg Thr Leu Tyr Pro Arg Tyr Ala Asp Phe Glu Glu Leu Val 385 390 395 400 Gly Arg His Asp Pro Glu Gly Thr Phe Arg Asn Ala Phe Leu Asp Arg 405 410 415 Tyr Phe Arg Arg 420 46419PRTPaenibacillus sp. 46Met Gly Asp Lys Leu Asn Trp Ala Gly Asn Tyr Arg Tyr Arg Ser Met 1 5 10 15 Glu Leu Leu Glu Pro Lys Ser Leu Glu Glu Val Lys Asp Leu Val Val 20 25 30 Ser Arg Thr Ser Ile Arg Val Leu Gly Ser Cys His Ser Phe Asn Gly 35 40 45 Ile Ala Asp Thr Gly Gly Ser His Leu Ser Leu Arg Lys Met Asn Arg 50 55 60 Val Ile Asp Leu Asp Arg Val Gln Arg Thr Val Thr Val Glu Gly Gly 65 70 75 80 Ile Arg Tyr Gly Asp Leu Cys Arg Tyr Leu Asn Asp His Gly Tyr Ala 85 90 95 Leu His Asn Leu Ala Ser Leu Pro His Ile Ser Val Ala Gly Ala Val 100 105 110 Ala Thr Ala Thr His Gly Ser Gly Asp Leu Asn Ala Ser Leu Ala Ser 115 120 125 Ser Val Arg Ala Ile Glu Leu Met Lys Ser Asp Gly Glu Val Thr Val 130 135 140 Leu Thr Arg Gly Thr Asp Pro Glu Phe Asp Gly Ala Val Val Gly Leu 145 150 155 160 Gly Gly Leu Gly Val Val Thr Lys Leu Lys Leu Asp Leu Val Pro Ser 165 170 175 Phe Gln Val Ser Gln Thr Val Tyr Asp Arg Leu Pro Phe Ser Ala Leu 180 185 190 Asp His Gly Ile Asp Glu Ile Leu Ser Ser Ala Tyr Ser Val Ser Leu 195 200 205 Phe Thr Asp Trp Ala Glu Pro Ile Phe Asn Gln Val Trp Val Lys Arg 210 215 220 Lys Val Gly Ile Asn Gly Glu Asp Glu Thr Ser Pro Asp Phe Phe Gly 225 230 235 240 Ala Leu Pro Ala Pro Glu Lys Arg His Met Val Leu Gly Gln Ser Val 245 250 255 Val Asn Cys Ser Glu Gln Met Gly Asp Pro Gly Pro Trp Tyr Glu Arg 260 265 270 Leu Pro His Phe Arg Met Glu Phe Thr Pro Ser Ala Gly Asn Glu Leu 275 280 285 Gln Ser Glu Tyr Phe Val Pro Arg Arg His Ala Val Glu Ala Met Arg 290 295 300 Ala Leu Gly Lys Leu Arg Asp Arg Ile Ala Pro Leu Leu Phe Ile Ser 305 310 315 320 Glu Ile Arg Thr Ile Ala Ser Asp Thr Phe Trp Met Ser Pro Cys Tyr 325 330 335 Arg Gln Asp Ser Val Gly Leu His Phe Thr Trp Lys Pro Asp Trp Glu 340 345 350 Arg Val Arg Gln Leu Leu Pro Leu Ile Glu 
Arg Glu Leu Glu Pro Phe 355 360 365 Ala Ala Arg Pro His Trp Ala Lys Leu Phe Thr Met Glu Ser Glu Met 370 375 380 Ile Gln Ala Arg Tyr Glu Arg Leu Ala Asp Phe Arg Gln Leu Leu Leu 385 390 395 400 Arg Tyr Asp Pro Ile Gly Lys Phe Arg Asn Thr Phe Leu Asp His Tyr 405 410 415 Ile Met His 471356DNATerriglobuds roseus 47atggatcgtc gtgaactgct gaaaacctct gcactgctga tggcagcagc accgttagca 60cgtgcagcaa atgttccgga agatcatgca aatgttccgc gtaccaattg gagcaaaaac 120ttccactata gcaccagccg cgtttatgca ccgactaccc cggaagaagt tccggcaatt 180gttctggaaa atggtcatct gaaaggtctg ggttctcgtc actgcttcaa caacatcgcc 240gatagccagt atgcgcagat cagcatgcgc gaagttaaag gcattcagat cgatgaagcc 300gcacaaaccg ttaccgtggg tgcaggtatt gcgtatggtg aattagcacc ggtgctggat 360aaagcgggtt ttgcactggc aaatttagca agtttaccgc atatcagcgt gggtggcacc 420attgcaaccg caacacatgg ctctggcgtt ggtaacaaaa acctgtcttc tgcaacccgt 480gcaattgaaa tcgtgaaagc ggatggcagc attctgcgtc tgtcgcgtga tactgatggt 540gaacgttttc gtatggcggt ggttcatctg ggtgcattag gtgttttaac caaagttacc 600ctggatatcg tgccgcgctt cgatatgtct caggtggtgt atcgcaacct gtcctttgat 660cagctggaac acaacctgga taccattctg agctctggct atagcgttag cctgttcacc 720gactggcagc gtaatcgtgt taatcaggtg tggatcaaag ataaagcgac cgcggatgca 780ccgcaaaaac cgttacctcc gatgttttat ggtgcgaccc tgcaaaccgc aaaactgcat 840ccgatcgatg atcatccggc agatgcatgt accgaacaaa tgggtagtgt tggtccgtgg 900tatttacgtc tgccgcattt caaaatggag tttaccccga gcagcggtga agaattacag 960accgaatact
tcgtggcgcg caaagatggc tatcgcgcaa ttcgtgccgt ggaaaaactg 1020cgcgataaaa ttaccccgca cctgtttatc accgaaatcc gcaccattgc agcagatgat 1080ctgccgatga gcatggcata tcaacgtgac agtatggcga ttcattttac ctggaaaccg 1140gaagaaccga ccgtgcgtaa attactgccg gaaatcgaag cagcactggc gccgtttggt 1200gttcgtccgc attggggcaa aatttttgaa attccgccga gctatctgca taaacagtat 1260ccggcactgc cgcgttttcg cgcaatggca caggcattag atcctggtgg caaatttcgt 1320aatgcatatc tggatcgtaa catctttggc gcgtag 1356481353DNAGranulicella mallensis 48atggacaaac gcgatttcct gaaaggtagc gcaaccaccg cagttgcact gatgatgggt 60ctgaatgaaa gcaaagcgtt tgcggatgat agcgttccgc gtaccaattg gagcggcaac 120taccattata gcaccaacaa agtgctgcag ccggcaagtg ttgcagaaac ccaagatgca 180gttcgtagtg ttgcaggtgt tcgtgcatta ggtactcgtc atagctttaa cggcatcgcg 240gatagccaga ttgcccagat tagtaccctg aaactgaaag atgtgagcct ggatgcgaaa 300agctcgaccg tgaccgttgg tgcaggtatt cgttatggtg atctggcggt tcagctggat 360gcgaaaggtt ttgctctgca taatctggca agtctgccgc atatttctgt tggtggtgca 420tgtgcaactg cgacccatgg ttcaggtatg ggtaatggta atttagcaac cgcagttaaa 480gcggtggaat ttgttgcggc ggatggtagc gtgcataccc tgtctcgtga tcgtgatggt 540gatcgttttg cgggctctgt tgttggtctg ggtgcattag gtgttgttac ccatttaacc 600ctgcaagttc agccacgttt cgaaatgacc caggtggtgt accgtgatct gccatttagt 660gaactggaac atcatctgcc ggaaattatg ggtgccggtt atagcgtgtc cctgtttacc 720gattggcaga atggtcgtgc aggtgaagtg tggatcaaac gtcgcgtgga tcaaggtggt 780gcaagtgctc ctccagctcg tttttttaat gcaaccttag caaccaccaa actgcacccg 840atcctggatc atcctgctga agcatgtacc gatcagttaa ataccgtagg tccgtggtat 900gaacgtttac cgcacttcaa actgaacttc accccgagca gtggccaaga attacagacc 960gagtttttcg tgccgttcga tcgcggctat gacgccattc gtgccgttga aactttacgt 1020gatgtgatta ccccgcacct gtatatcacc gaactgcgtg cagttgcagc tgatgattta 1080tggatgagca tggcatatca acgtccgagt ctggcaatcc attttacctg gaaaccggaa 1140accgatgcag tgctgaaatt actgccgcag attgaagcga aactggcccc gtttggtgct 1200cgtccgcatt gggcaaaagt ttttaccatg aaaagcagcc atgtggcacc gctgtatccg 1260cgcctgaaag attttctggt tctggcaaaa tcctttgatc cgaaaggcaa 
attccaaaac 1320gcgtttctgc aggaccatgt ggacatcgca tag 1353491245DNAStreptomyces acidiscabies 49atgaccgcat ctgtgaccaa ttgggcgggt aacatcagct ttgtggcgaa agatgttgtt 60cgtccgggtg gtgttgaagc actgcgtaaa gttgttgcgg gtaatgatcg tgttcgtgtt 120ctgggttctg gtcatagctt taaccgtatc gctgaaccgg gtgctgatgg tgttctggtt 180agcctggatg cattaccgca agtgattgat gttgataccg aacgtcgtac cgtgcgtgtt 240ggtggtggtg ttaaatacgc ggaactggct cgtcatgtga atgaatctgg tctggcactg 300ccgaatatgg catctctgcc gcatatttct gttgcaggtt ctgttgcaac tggtacccat 360ggttctggtg tgaataatgg cccgttagca accccggttc gtgaagttga attattaacc 420gcggatggct ctctggtgac catcggtaaa gatgatgcgc gttttccggg tgcagttact 480tctctgggtg cgctgggtgt tgttgttgca ctgaccttag atttagaacc ggcgtatggt 540gttgaacagt atacctttac cgaattaccg ctggaaggtc tggacttcga agcagttgcg 600agtgcagcat attctgttag cctgttcacc gattggcgtg aagctggttt tcgccaagtt 660tgggtgaaac gccgcattga tgaaccgtac gcgggctttc cgtgggcagc accggcaact 720gaaaaattac atccggttcc gggtatgcca gcagaaaatt gtactgatca atttggtgca 780gcaggtccat ggcatgaacg tttaccgcat tttaaagcgg aatttacccc gtctagcggt 840gatgaattac agagcgaata tctgctgccg cgtgaacatg cactggcggc actggatgca 900gtgggcaacg tgcgtgaaac cgtttctacc gtgctgcaga tttgcgaagt tcgtaccatt 960gcagcagata cccagtggtt aagtccggct tatggtcgtg atagtgttgc attacatttt 1020acttggaccg atgatatgga tgcagtttta cctgcagttc gtgccgttga aagcgcgctg 1080gatggctttg gtgctcgccc gcattggggt aaagtgttta ccaccgcacc ggcagcatta 1140cgtgaacgtt atccgcgtct ggatgatttt cgtaccctgc gtgatgaatt agatccggca 1200ggcaaattta ctaatgcatt tgttcgtgat gttctggaag gttag 1245501284DNAActinomycetales 50atgaccctgg aacgtaattg ggcaggtacc catacctttg cagcaccgcg tattgttaat 60gcaaccagca tcgatgaagt tcgtgcgtta gtggcagaag cagcacgtac cggtacccgt 120gttcgtgcat taggtactcg tcattctttt accgatctgg cagatagcga tggtaccctg 180attaccgtgc tggatattcc ggcagatcca gttttcgatg aagcagcagg tagcgttacc 240attggtgcag gtacccgtta tggtattgca gcagcatggt tagcagaaca tggtctggcg 300tttcacaaca tgggtagcct gccgcatatt agcgttggtg gtgcaattgc aaccggtacc 360catggtagtg gtaatgataa 
cggcattctg agtagcgcag ttagtggtct ggaatatgtt 420gatgcgaccg gtgaactggt tcatgtgcgt cgtggtgatc ctggttttga tggtctggtt 480gttggtttag gcgcgtatgg tattgtggtt cgtgtgacgg tggatgttca accggcatat 540cgtgttcgcc aggatgtgta tcgtgatgtt ccgtgggatg cagttctggc agattttgaa 600ggtgttacag gtggtgcgta tagcgttagc atctttacca actggctggg tgatacggtg 660gaacagattt ggtggaaaac ccgtctggtt gcaggtgatg atgaactgcc ggtggttccg 720gaaagctggc tgggtgttca acgtgattct ttaaccgcag gtaatctggt tgaaaccgat 780ccggataatt taaccctgca aggtggtgtt ccgggtgatt ggtgggaacg tttaccgcat 840tttcgtctgg aaagtacccc gtctaatggt gatgaaatcc agaccgaata cttcatcgat 900cgcgcggatg gtccggcggc aattaccgca ctgcgtgcat taggtgatcg tattgctccg 960ttactgttag ttaccgaatt acgtaccgca gctccagata aactgtggct gagtggcgca 1020tatcatcgcg aaatgttagc ggtccatttt acctggcgta atttaccgga agaagtgcgt 1080gcagttttac cagcgatcga agaagccctg gcgccgtttg atgctcgtcc gcattggggt 1140aaactgaatc tgttaaccgc agaacgtatt gcagaagttg ttccgcgtct ggctgatgca 1200cgtgatctgt ttgaagaact ggacccggct ggtacctttt ctaatgctca tctggaacgt 1260attggtgttc gtttaccgcg ttag 1284511260DNAFrankia sp. 
51atgcgtgatg cagcagcagc aaattgggca ggtaatgtgc gttttggtgc agcacgtgtt 60gttgcaccgg aaagtgttgg tgaactgcag gaaattgttg caggtagccg taaagcacgt 120gcattaggta ccggtcatag ctttagccgt attgcagata ccgatggtac cctgattgct 180accgcacgtt taccacgtcg tattcagatc gatgatggca gcgttaccgt ttctggtggt 240atccgttatg gcgatctggc ccgtgaatta gcaccgaatg gttgggcatt acgtaatctg 300ggttctttac cgcacatttc agttgcaggt gcatgtgcaa ccggtaccca tggttcaggt 360gatcgtaatg gtagtctggc aacctctgtt gcagcgttag aattagttac cgcgtctggt 420gaattagtga gcgttcgtcg tggcgatgaa gatttcgatg gccatgtgat tgcgctgggt 480gcactgggtg ttactgttgc agttaccctg gatttagttc cgggttttca ggttcgtcag 540ctggtgtatg aaggtctgac ccgtgatacc ttactggaaa gtgtgcagga aatctttgct 600gcgagctata gtgttagcgt gtttaccggt tgggacccgg aaagttctca actgtggctg 660aaacagcgcg ttgatggtcc gggcgatgat ggtgaaccac cggcagaacg ttttggtgca 720cgtttagcaa ctcgtccgtt acatccagtt ccgggtattg atccgactca tactactcaa 780caattaggtg ttccaggtcc gtggcatgaa cgtttaccgc attttcgtct ggattttacc 840ccttctgcag gtgatgaact gcaaaccgaa tacttcgtgg cccgcgaaca tgcagcggcg 900gcgattgaag cactgtttgc gattggtgcg gttgttcgtc cggcattaca aattagcgaa 960attcgtaccg ttgcagctga tgcattatgg ctgtctccgg catatcgtcg tgatgttatg 1020gcgttacatt ttacctggat tagcgcagaa ggtaccgtta tgccagcagt tgcagcagtg 1080gaacgtgcac tggcgccgtt tgatccggtt cctcattggg gtaaagtttt tgcgctgccg 1140ccagcagcag ttcgtgctgg ttatcctcgt gcagcagaat ttttagcatt agcagctcgt 1200cgtgatccgg aagcagtttt tcgtaatcag tatttagatg catatttacc ggcagcatag 1260521242DNAPropionibacteriacaeae 52atgacccagc gtaattgggc gggtaatgtg agctatagta gcagccgtgt tgcagaacca 60gcaagtgtgg atgatttaac cgcactggtt gaaagtgaac cgcgtgttcg tccgttaggt 120agtcgtcatt gcttcaacga tatcgccgat accccaggtg ttcatgtttc tctggcacgt 180ctgcgtggtg aagaaccgcg tttaacagca ccgggtacct tacgtactcc agcttggtta 240cgttatggtg atttagttcc ggttctgcgt gaagcaggtg cagcattagc aaatttagca 300tctctgccgc atattagcgt tgcaggtgca gttcaaaccg gtacccatgg ttcaggtgat 360cgtattggca ctctggcaac ccaagttagc gccctggaat tagtgaccgg caccggtgaa 420gttttacgct tagaacgtgg 
tgaacctgat tttgatggtg cggttgttgg tttaggtgcg 480ttaggtgttc tgactcatgt ggaattagat gttagtccgg cgcgtgatgt tgcacagcac 540gtgtatgaag gtgttcgtct ggatgatgtt ctggcggatt taggcgcggt tactggcgca 600ggtgattcgg tgagcatgtt tacccattgg caagatccgg cagttgttag tcaggtttgg 660gttaaaagtg gcggtgatgt ggatgatgca gcaattcgtg atgcaggtgg tcgtccggca 720gatggtccgc gtcatccaat tgcaggtatt gatccgactc catgtactcc acaattaggt 780gaaccaggtc cgtggtatga tcgtctgccg cattttcgtc tggaatttac cccgagtgtt 840ggtgaagaac tgcaaagtga atatctggtt gatcgcgatg atgccgttga tgcaattcgt 900gcggtgcagg atttagcccc gcgtattgcg ccgctgctgt ttgtttgcga aattcgtacc 960atggcaagtg atggtttatg gctgagcccg gcacaaggtc gtgataccgt tggtctgcat 1020tttacctggc gtcctgatga atctgcagtt cgtcaattat taccggaaat tgaacgtgct 1080ttaccggcaa gtgctcgtcc gcattggggt aaagtgttta ccctgccggg ccatgatgtt 1140gcagcacgtt atccgcgttg ggcagatttt gttgcattac gtcgtcgttt agatccggaa 1200cgtcgtttcg cgaatgcata cctggaacgt ttaggtctgt ag 1242531263DNAStreptomyces sp. 53atgactccgg cggaaaaaaa ttgggcgggc aacatcacct ttggtgcaaa acgtctgtgt 60gttccgcgtt ctgttcgtga actgcgtgaa accgttgcag catctggtgc agttcgtccg 120ttaggtactc gtcatagctt taataccgtt gcagatacca gtggtgatca tgttagtctg 180gcaggtttac cgcgtgttgt ggacatcgat gttccgggtc gtgcagtttc tctgtctgct 240ggtctgcgtt ttggtgaatt tgcggctgaa ttacatgcac gtggtctggc gctggcaaat 300ttaggttctc tgccgcatat tagcgttgca ggtgcagttg caaccggtac tcatggttct 360ggtgttggta atcgttcttt agcaggtgca gttcgtgctt tatctctggt aaccgccgat 420ggtgaaaccc gtaccttacg tcgtaccgat gaagattttg caggtgcagt ggtttctctg 480ggtgcactgg gtgttgttac ttctctggaa ctggatttag ttccggcgtt cgaagtgcgt 540cagtgggtgt acgaagatct gccggaagca actttagcag ctcgttttga tgaagttatg 600tcagcagcgt atagcgtgtc cgtgttcacc gattggcgtc cgggtcctgt tggtcaagtt 660tggctgaaac aacgtgttgg tgatgaaggt gctcgtagtg ttatgccagc agaatggtta 720ggtgcacgtt tagcagatgg tccgcgtcat ccagttccag gtatgcctgc aggtaattgt 780acagcacaac aaggtgttcc aggtccgtgg catgaacgtt taccgcattt tcgcatggaa 840tttaccccgt ctaacggcga tgaactgcaa agcgaatatt ttgtggcgcg tgcagatgca 
900gttgcagcgt atgaagcatt agcacgtctg cgtgatcgta ttgcgccggt tctgcaagtt 960agcgaattac gtaccgttgc agcagatgat ctgtggctga gtccggcaca tggtcgtgat 1020agtgttgcgt ttcattttac ctgggttccg gatgcagcag cagttgcacc ggttgcaggt 1080gctattgaag aagcattagc accgtttggt gcacgtccac attggggtaa agtttttagc 1140accgcaccgg aagttttacg taccttatat ccgcgttatg ccgatttcga agaactggtg 1200ggccgccatg atccggaagg cacctttcgt aatgcatttt tagatcgcta ctttcgtcgc 1260tag 1263541260DNAPaenibacillus sp. 54atgggcgata aactgaattg ggcgggcaac tatcgttatc gcagcatgga actgctggaa 60ccgaaaagcc tggaagaagt gaaagatctg gtggttagcc gtaccagcat tcgtgttctg 120ggtagctgtc atagctttaa cggcattgcg gataccggtg gtagtcatct gagtctgcgc 180aaaatgaacc gcgtgattga tctggatcgt gttcagcgta ccgttaccgt tgaaggtggt 240attcgttacg gtgatctgtg ccgctatctg aacgatcatg gttatgccct gcataatctg 300gcaagcttac cgcacatcag cgttgcaggt gcagttgcaa ccgcaaccca tggttctggt 360gatctgaatg caagtctggc aagctctgtt cgtgcaattg aactgatgaa aagcgatggc 420gaagttacgg ttctgacccg tggtaccgat ccggaatttg atggtgcagt tgttggtctg 480ggtggtttag gtgttgtgac caaactgaaa ctggatctgg ttccgagctt tcaggtgtcg 540cagaccgtgt atgatcgtct gccgtttagc gcactggatc atggcatcga tgaaattctg 600agtagtgcat atagcgttag cctgttcacc gattgggcgg aaccgatctt taatcaggtg 660tgggtgaaac gcaaagtggg cattaacggc gaagatgaaa ccagtccgga tttttttggc 720gcattaccgg caccggaaaa acgccacatg gttctgggtc agagcgtggt gaattgcagc 780gaacaaatgg gtgatcctgg tccgtggtat gaacgtttac cgcattttcg catggaattt 840accccgagtg caggcaatga attacagagc gaatattttg tgccgcgtcg tcatgcggtt 900gaagcaatgc gtgcgttagg taaactgcgt gatcgtattg caccactgct gttcatcagc 960gaaatccgca ccattgcgag cgataccttc tggatgagcc cgtgttatcg tcaggattct 1020gttggtctgc attttacctg gaaaccggat tgggaacgtg ttcgtcagtt attaccgctg 1080attgaacgtg aactggaacc gtttgcggca cgtccgcatt gggcgaaact gtttaccatg 1140gaaagcgaaa tgattcaggc gcgctatgaa cgtctggcgg attttcgtca gctgctgctg 1200cgttatgatc cgattggcaa attccgtaac acctttctgg atcactacat catgcactaa 126055205PRTThermus thermophilus 55Met Glu Ala Thr Leu Pro Val Leu Asp Ala Lys Thr Ala Ala 
Leu Lys 1 5 10 15 Arg Arg Ser Ile Arg Arg Tyr Arg Lys Asp Pro Val Pro Glu Gly Leu 20 25 30 Leu Arg Glu Ile Leu Glu Ala Ala Leu Arg Ala Pro Ser Ala Trp Asn 35 40 45 Leu Gln Pro Trp Arg Ile Val Val Val Arg Asp Pro Ala Thr Lys Arg 50 55 60 Ala Leu Arg Glu Ala Ala Phe Gly Gln Ala His Val Glu Glu Ala Pro 65 70 75 80 Val Val Leu Val Leu Tyr Ala Asp Leu Glu Asp Ala Leu Ala His Leu 85 90 95 Asp Glu Val Ile His Pro Gly Val Gln Gly Glu Arg Arg Glu Ala Gln 100 105 110 Lys Gln Ala Ile Gln Arg Ala Phe Ala Ala Met Gly Gln Glu Ala Arg 115 120 125 Lys Ala Trp Ala Ser Gly Gln Ser Tyr Ile Leu Leu Gly Tyr Leu Leu 130 135 140 Leu Leu Leu Glu Ala Tyr Gly Leu Gly Ser Val Pro Met Leu Gly Phe 145 150 155 160 Asp Pro Glu Arg Val Lys Ala Ile Leu Gly Leu Pro Ser His Ala Ala 165 170 175 Ile Pro Ala Leu Val Ala Leu Gly Tyr Pro Ala Glu Glu Gly Tyr Pro 180 185 190 Ser His Arg Leu Pro Leu Glu Arg Val Val Leu Trp Arg 195 200 205 56618DNAThermus thermophilus 56atggaagcaa ccttaccggt gttagacgcg aaaaccgcag cactgaaacg tcgtagcatt 60cgccgttatc gcaaagatcc agttccggaa ggtttactgc gcgaaattct ggaagcagca 120ttacgtgcac cgtctgcatg gaatttacaa ccgtggcgta ttgtggtggt tcgtgatccg 180gcaactaaac gtgcattacg tgaagcagca tttggtcaag cccatgtgga agaagcaccg 240gttgttctgg ttctgtacgc agatctggaa gatgcactgg cacatctgga tgaagtgatt 300catccgggcg ttcaaggtga acgtcgtgaa gcgcagaaac aagcaattca gcgtgcattt 360gcagcaatgg gtcaggaagc tcgtaaagct tgggcaagcg gtcaaagtta tattctgctg 420ggttatctgc tgctgctgct ggaagcatat ggtctgggtt ctgttccgat gctgggtttt 480gatcctgaac gtgttaaagc gattctgggc ctgccgtcac atgcagcgat tccggcatta 540gttgcactgg gttatccggc tgaagaaggt tatccgagtc atcgtttacc gctggaacgt 600gttgttttat ggcgttga 61857247PRTAgrobacterium sp. 
57Met Lys Asn Pro Phe Ser Leu Gln Gly Arg Lys Ala Leu Val Thr Gly 1 5 10 15 Ala Asn Thr Gly Leu Gly Gln Ala Ile Ala Val Gly Leu Ala Ala Ala 20 25 30 Gly Ala Glu Val Val Cys Ala Ala Arg Arg Ala Pro Asp Glu Thr Leu 35 40 45 Glu Met Ile Ala Ser Asp Gly Gly Lys Ala Ser Ala Leu Ser Ile Asp 50 55 60 Phe Ala Asp Pro Leu Ala Ala Lys Asp Ser Phe Ala Gly Ala Gly Phe 65 70 75 80 Asp Ile Leu Val Asn Asn Ala Gly Ile Ile Arg Arg Ala Asp Ser Val 85 90 95 Glu Phe Ser Glu Leu Asp Trp Asp Glu Val Met Asp Val Asn Leu Lys 100 105 110 Ala Leu Phe Phe Thr Thr Gln Ala Phe Ala Lys Glu Leu Leu Ala Lys 115 120 125 Gly Arg Ser Gly Lys Val Val Asn Ile Ala Ser Leu Leu Ser Phe Gln 130 135 140 Gly Gly Ile Arg Val Pro Ser Tyr Thr Ala Ala Lys His Gly Val Ala 145 150 155 160 Gly Leu Thr Lys Leu Leu Ala Asn Glu Trp Ala Ala Lys Gly Ile Asn 165 170 175 Val Asn Ala Ile Ala Pro Gly Tyr Ile Glu Thr Asn Asn Thr Glu Ala 180 185 190 Leu Arg Ala Asp Ala Ala Arg Asn Lys Ala Ile Leu Glu Arg Ile Pro 195 200 205 Ala Gly Arg Trp Gly Arg Ser Glu Asp Ile Ala Gly Ala Ala Val Phe 210 215 220 Leu Ser Ser Ala Ala Ala Asp Tyr Val His Gly Ala Ile Leu Asn Val 225 230 235 240 Asp Gly Gly Trp Leu Ala Arg 245 58198PRTAgrobacterium tumefaciens 58Met Ile Ala Gly Val Gly Gly Glu Ala Arg Glu Leu Ala Leu Asp Leu 1 5 10 15 Ser Asp Pro Met Ala Ala Lys Asp Val Phe Ala Glu Gly Ala Tyr Asp 20 25 30 Leu Leu Ile Asn Asn Ala Gly Ile Ile Arg Arg Ala Asp Ala Val Asp 35 40 45 Phe Ser Glu Asp Asp Trp Asp Ala Val Met Asp Val Asn Leu Lys Ala 50 55 60 Val Phe Phe Thr Ser Gln Ala Phe Ala Arg Ala Leu Met Ser Arg Asn 65 70 75 80 Ala Ser Gly Lys Ile Val Asn Ile Ala Ser Leu Leu Ser Phe Gln Gly 85 90 95 Gly Ile Arg Val Ala Ser Tyr Thr Ala Ala Lys His Gly Val Ala Gly 100 105 110 Ile Thr Arg Leu Leu Ala Asn Glu Trp Ala Ser Arg Gly Ile Asn Val 115 120 125 Asn Ala Ile Ala Pro Gly Tyr Ile Ala Thr Asn Asn Thr Glu Ala Leu 130 135 140 Arg Ala Asp Glu Glu Arg Asn Ala Ala Ile Leu Ala Arg Ile Pro Ala 145 150 155 160 Gly Arg Trp Gly Arg Ala Glu Asp Ile Ala Gly 
Thr Ala Val Tyr Leu 165 170 175 Cys Ser Pro Ala Ala Asp Tyr Val His Gly Ala Ile Leu Asn Val Asp 180 185 190 Gly Gly Trp Leu Ala Arg 195 59253PRTEscherichia coli 59Met Ile Leu Ser Ala Phe Ser Leu Glu Gly Lys Val Ala Val Val Thr 1 5 10 15 Gly Cys Asp Thr Gly Leu Gly Gln Gly Met Ala Leu Gly Leu Ala Gln 20 25 30 Ala Gly Cys Asp Ile Val Gly Ile Asn Ile Val Glu Pro Thr Glu Thr 35 40 45 Ile
Glu Gln Val Thr Ala Leu Gly Arg Arg Phe Leu Ser Leu Thr Ala 50 55 60 Asp Leu Arg Lys Ile Asp Gly Ile Pro Ala Leu Leu Asp Arg Ala Val 65 70 75 80 Ala Glu Phe Gly His Ile Asp Ile Leu Val Asn Asn Ala Gly Leu Ile 85 90 95 Arg Arg Glu Asp Ala Leu Glu Phe Ser Glu Lys Asp Trp Asp Asp Val 100 105 110 Met Asn Leu Asn Ile Lys Ser Val Phe Phe Met Ser Gln Ala Ala Ala 115 120 125 Lys His Phe Ile Ala Gln Gly Asn Gly Gly Lys Ile Ile Asn Ile Ala 130 135 140 Ser Met Leu Ser Phe Gln Gly Gly Ile Arg Val Pro Ser Tyr Thr Ala 145 150 155 160 Ser Lys Ser Gly Val Met Gly Val Thr Arg Leu Met Ala Asn Glu Trp 165 170 175 Ala Lys His Asn Ile Asn Val Asn Ala Ile Ala Pro Gly Tyr Met Ala 180 185 190 Thr Asn Asn Thr Gln Gln Leu Arg Ala Asp Glu Gln Arg Ser Ala Glu 195 200 205 Ile Leu Asp Arg Ile Pro Ala Gly Arg Trp Gly Leu Pro Ser Asp Leu 210 215 220 Met Gly Pro Ile Val Phe Leu Ala Ser Ser Ala Ser Asp Tyr Val Asn 225 230 235 240 Gly Tyr Thr Ile Ala Val Asp Gly Gly Trp Leu Ala Arg 245 250 60254PRTSphingomonas sp. 60Met Pro Gly Met Thr Thr Pro Phe Asp Leu His Gly Lys Thr Ala Ile 1 5 10 15 Val Thr Gly Ala Asn Thr Gly Ile Gly Gln Ala Ile Ala Leu Ser Leu 20 25 30 Ala Gln Ala Gly Ala Asp Ile Ala Ala Val Gly Arg Thr Pro Ala Gln 35 40 45 Asp Thr Val Asp Gln Val Arg Ala Leu Gly Arg Arg Ala Asp Ile Ile 50 55 60 Ser Ala Asp Leu Ser Thr Ile Glu Pro Val Gln Arg Val Leu Asp Glu 65 70 75 80 Thr Leu Glu Lys Leu Gly Ala Leu Asp Ile Leu Val Asn Asn Ala Gly 85 90 95 Ile Ile Arg Arg Ala Asp Ser Val Asp Phe Thr Glu Glu Asp Trp Asp 100 105 110 Ala Val Ile Asp Thr Asn Leu Lys Thr Thr Phe Phe Leu Cys Gln Ala 115 120 125 Ala Gly Arg His Met Leu Ala Gln Gly Ala Gly Lys Ile Ile Asn Ile 130 135 140 Ala Ser Leu Leu Ser Phe Gln Gly Gly Ile Arg Val Pro Ser Tyr Thr 145 150 155 160 Ala Ser Lys Ser Gly Val Ala Gly Leu Thr Lys Leu Leu Ala Asn Glu 165 170 175 Trp Ala Ala Lys Gly Val Asn Val Asn Ala Ile Ala Pro Gly Tyr Ile 180 185 190 Ala Thr Asn Asn Thr Ala Ala Leu Gln Ala Asp Glu Thr Arg Asn Arg 195 200 205 Gln Ile Gln Glu Arg Ile 
Pro Ala Gly Arg Trp Gly Asp Pro Ala Asp 210 215 220 Ile Gly Gly Ala Ala Val Phe Leu Ala Ser Ser Ala Ala Asp Tyr Ile 225 230 235 240 His Gly His Thr Leu Ala Val Asp Gly Gly Trp Leu Ala Arg 245 250 61246PRTHoeflea phototrophica 61Met Asn Pro Phe Ser Leu Glu Gly Lys Thr Ala Leu Val Thr Gly Ala 1 5 10 15 Asn Thr Gly Ile Gly Gln Ala Ile Ala Met Ala Leu Gly Arg Ala Gly 20 25 30 Ala Asp Val Ile Cys Ala Gly Arg Ser Ser Cys Ala Glu Thr Val Ala 35 40 45 Leu Ile Ala Gly Ser Lys Gly Lys Ala Arg Glu Leu Val Leu Asp Phe 50 55 60 Ala Asp Pro Met Ala Ala Arg Asp Val Phe Ala Ala Glu Pro Val Asp 65 70 75 80 Ile Leu Val Asn Asn Ala Gly Ile Ile Arg Arg Ala Asp Ala Val Asp 85 90 95 Phe Thr Glu Ala Asp Trp Asp Glu Val Met Asp Val Asn Leu Lys Ala 100 105 110 Val Phe Phe Thr Cys Gln Ala Phe Gly Lys Ala Val Leu Gly Arg Gly 115 120 125 Gly Asn Gly Lys Ile Val Asn Ile Ala Ser Leu Leu Ser Phe Gln Gly 130 135 140 Gly Ile Arg Val Pro Ser Tyr Thr Ala Ser Lys His Gly Val Ala Gly 145 150 155 160 Ile Thr Lys Leu Leu Ala Asn Glu Trp Ala Ala Lys Gly Ile Asn Val 165 170 175 Asn Ala Ile Ala Pro Gly Tyr Ile Glu Thr Asn Asn Thr Glu Ala Leu 180 185 190 Arg Ala Asp Pro Val Arg Asn Lys Ala Ile Leu Glu Arg Ile Pro Ala 195 200 205 Gly Arg Trp Gly Gln Ala Ser Asp Ile Gly Glu Ala Ala Val Phe Leu 210 215 220 Ala Ser Pro Ala Ala Asn Tyr Ile His Gly Ala Val Leu Asn Val Asp 225 230 235 240 Gly Gly Trp Leu Ala Arg 245 62744DNAAgrobacterium sp. 
62atgaagaatc ccttttcgct tcaggggcgt aaggcgctcg tcaccggcgc gaatacgggg 60cttggccagg cgattgcggt tgggctcgcc gcggccggtg cggaggtggt ctgcgccgcc 120cgccgcgcgc cggatgaaac gctggagatg atcgccagcg acggcggcaa ggccagcgca 180ttgtccatcg attttgccga tccgctggcg gcgaaggaca gttttgccgg cgccggtttc 240gatattctcg tcaacaatgc cggtatcatc cgccgtgccg attccgtcga gttctccgaa 300ctcgactggg acgaggtgat ggacgtcaat ctcaaggcgc tgtttttcac cacccaggct 360tttgcgaaag agctgctggc gaaaggccgg tccggcaagg tggtcaatat cgcttcgctc 420ctttcctttc agggcggtat tcgcgtgccg tcctatacgg cggcgaaaca tggtgtcgcc 480ggcctaacca aactcctggc gaatgaatgg gccgccaagg gcatcaatgt gaatgccatt 540gcgcccggtt atatcgaaac caacaatacc gaggcgctac gcgccgatgc ggctcgtaac 600aaggccattc tcgagcgcat cccggccggc cgctgggggc gctcggaaga catcgccggg 660gcggcggttt tcctgtcatc tgcggcggcg gactatgtgc atggcgccat tctcaacgtc 720gatggcggct ggctggcgcg ctga 74463597DNAAgrobacterium tumefaciens 63atgatcgccg gcgtgggggg agaagcaagg gagctggcgc tcgatctgtc cgatcccatg 60gcggcaaaag atgtttttgc tgaaggcgct tacgacctcc tcatcaacaa tgccggcatc 120atccgccgtg ccgatgcagt cgatttctcc gaggatgact gggacgcggt gatggacgtg 180aacctgaaag ccgtcttctt cacctcgcaa gcctttgcgc gggctctcat gtccagaaac 240gcaagcggaa agatcgttaa cattgcatcc cttctgtcgt ttcaaggcgg cattcgcgtt 300gcctcctaca cggccgccaa gcacggtgtg gcaggcatca ccagactgtt ggcaaacgaa 360tgggcgtccc gcggcatcaa cgtcaatgcg atagcgcccg gttacattgc cacgaacaac 420acggaagcgc ttcgagccga cgaggagcgc aacgcggcga tcctcgcacg cattccggct 480ggccgctggg ggcgggcgga ggatattgcg ggtactgctg tctatctttg ttcgccggca 540gccgattatg ttcatggcgc cattctaaac gtcgatggcg gttggctcgc gcgctga 59764762DNAEscherichia coli 64atgattttaa gtgcattttc tctcgaaggt aaagttgcgg tcgtcactgg ttgtgatact 60ggactgggtc aggggatggc gttggggctg gcgcaagcgg gctgtgacat tgttggcatt 120aacatcgttg aaccgactga aaccatcgag caggtcacag cgctggggcg tcgtttttta 180agcctgaccg ccgatctgcg aaagattgat ggtattccag cactgctgga tcgcgcggta 240gcggagtttg gtcatattga tatcctggtg aataacgccg gattgattcg ccgcgaagat 300gctctcgagt tcagcgaaaa ggactgggac gatgtcatga 
acctgaatat caagagcgta 360ttcttcatgt ctcaggcagc ggcgaaacac tttatcgcgc aaggcaatgg cggcaagatt 420atcaatatcg cgtcaatgct ctccttccag ggcgggatcc gtgtgccttc ttataccgca 480tcaaaaagcg gcgtgatggg tgtgacgcga ttgatggcga acgaatgggc taaacacaac 540attaatgtta atgcgatagc cccgggttac atggcgacca acaatactca acaactacgg 600gcagatgaac aacgtagcgc ggaaattctc gaccgcattc cagctggtcg ttggggactg 660ccgagtgacc tgatggggcc gatagtgttc cttgcctcca gcgcttcaga ttatgtgaat 720ggttatacca ttgccgtgga tggcggttgg ctggcgcgtt aa 76265765DNASphingomonas sp. 65atgcccggca tgaccactcc tttcgatctt catggcaaga ccgcgatcgt caccggcgcc 60aataccggca tcggccaggc cattgccctg tcgctcgcgc aggccggcgc ggatatcgcc 120gccgtcggcc gcacgcccgc acaggacacg gtcgatcagg tccgcgcgct cggccgccgg 180gcggacatta tctcggccga cctttcgacc atcgaaccgg tccagcgcgt cctcgacgaa 240acgctggaaa agcttggtgc cttggacata ctggtcaaca atgccggcat catccgccgc 300gccgacagcg tcgatttcac cgaggaggat tgggacgcgg tgatcgacac caatctcaag 360accaccttct tcctctgtca ggccgccggt cgccacatgc ttgcccaagg cgctggcaag 420atcatcaaca tcgcctcgct tctttccttc cagggcggca ttcgcgtgcc gagctacacc 480gcgtccaaaa gcggcgtcgc gggcctgacc aagctgctcg ccaacgaatg ggcggccaag 540ggcgtcaatg tgaacgccat cgcgccgggc tatatcgcca ccaacaacac cgccgcgctc 600caggccgacg aaacccgcaa ccgccagatc caggagcgca tcccggctgg ccgctggggc 660gaccccgccg acattggcgg cgcggccgtg ttcctggcgt ccagcgccgc cgattatatc 720catggccaca cgctcgccgt cgacggcggc tggctcgcgc gctga 76566741DNAHoeflea phototrophica 66atgaacccct tctcgcttga gggcaagacc gcccttgtga ccggtgccaa tacgggcatc 60ggtcaggcca tcgccatggc gcttggccgc gccggggcgg acgtcatctg cgcgggacgc 120tcgtcctgtg cggagaccgt tgccctcatc gctggcagca agggcaaggc gcgcgaactg 180gtgctcgact tcgccgaccc gatggccgcc cgtgacgtgt tcgccgccga accggtggac 240atcctcgtca acaacgcggg catcatccgg cgcgccgatg cagtggattt caccgaggcc 300gactgggatg aggtgatgga cgtgaacctg aaggccgtgt tcttcacctg ccaggccttc 360ggcaaggccg ttcttggccg tggaggaaac ggcaagatcg tcaacattgc ctcgctcctg 420tcattccagg gtggtatccg ggtgccgtcc tacacggcct cgaagcatgg tgttgcaggc 480atcaccaagc 
ttctggccaa cgaatgggcg gcgaagggca tcaatgtgaa tgccatcgcc 540cccggttaca tcgaaacgaa caataccgaa gcactgcggg cggacccggt gcgcaacaag 600gccatccttg agcgtatccc tgccggccgc tggggccagg cctcggacat cggcgaagcc 660gccgtgttcc ttgcctctcc ggctgccaat tacatccatg gtgcagtgct gaatgttgac 720ggaggctggc ttgcccgctg a 74167446PRTEscherichia coli 67Met Ser Ser Gln Phe Thr Thr Pro Val Val Thr Glu Met Gln Val Ile 1 5 10 15 Pro Val Ala Gly His Asp Ser Met Leu Met Asn Leu Ser Gly Ala His 20 25 30 Ala Pro Phe Phe Thr Arg Asn Ile Val Ile Ile Lys Asp Asn Ser Gly 35 40 45 His Thr Gly Val Gly Glu Ile Pro Gly Gly Glu Lys Ile Arg Lys Thr 50 55 60 Leu Glu Asp Ala Ile Pro Leu Val Val Gly Lys Thr Leu Gly Glu Tyr 65 70 75 80 Lys Asn Val Leu Thr Leu Val Arg Asn Thr Phe Ala Asp Arg Asp Ala 85 90 95 Gly Gly Arg Gly Leu Gln Thr Phe Asp Leu Arg Thr Thr Ile His Val 100 105 110 Val Thr Gly Ile Glu Ala Ala Met Leu Asp Leu Leu Gly Gln His Leu 115 120 125 Gly Val Asn Val Ala Ser Leu Leu Gly Asp Gly Gln Gln Arg Ser Glu 130 135 140 Val Glu Met Leu Gly Tyr Leu Phe Phe Val Gly Asn Arg Lys Ala Thr 145 150 155 160 Pro Leu Pro Tyr Gln Ser Gln Pro Asp Asp Ser Cys Asp Trp Tyr Arg 165 170 175 Leu Arg His Glu Glu Ala Met Thr Pro Asp Ala Val Val Arg Leu Ala 180 185 190 Glu Ala Ala Tyr Glu Lys Tyr Gly Phe Asn Asp Phe Lys Leu Lys Gly 195 200 205 Gly Val Leu Ala Gly Glu Glu Glu Ala Glu Ser Ile Val Ala Leu Ala 210 215 220 Gln Arg Phe Pro Gln Ala Arg Ile Thr Leu Asp Pro Asn Gly Ala Trp 225 230 235 240 Ser Leu Asn Glu Ala Ile Lys Ile Gly Lys Tyr Leu Lys Gly Ser Leu 245 250 255 Ala Tyr Ala Glu Asp Pro Cys Gly Ala Glu Gln Gly Phe Ser Gly Arg 260 265 270 Glu Val Met Ala Glu Phe Arg Arg Ala Thr Gly Leu Pro Thr Ala Thr 275 280 285 Asn Met Ile Ala Thr Asp Trp Arg Gln Met Gly His Thr Leu Ser Leu 290 295 300 Gln Ser Val Asp Ile Pro Leu Ala Asp Pro His Phe Trp Thr Met Gln 305 310 315 320 Gly Ser Val Arg Val Ala Gln Met Cys His Glu Phe Gly Leu Thr Trp 325 330 335 Gly Ser His Ser Asn Asn His Phe Asp Ile Ser Leu Ala Met Phe Thr 340 345 350 His Val 
Ala Ala Ala Ala Pro Gly Lys Ile Thr Ala Ile Asp Thr His 355 360 365 Trp Ile Trp Gln Glu Gly Asn Gln Arg Leu Thr Lys Glu Pro Phe Glu 370 375 380 Ile Lys Gly Gly Leu Val Gln Val Pro Glu Lys Pro Gly Leu Gly Val 385 390 395 400 Glu Ile Asp Met Asp Gln Val Met Lys Ala His Glu Leu Tyr Gln Lys 405 410 415 His Gly Leu Gly Ala Arg Asp Asp Ala Met Gly Met Gln Tyr Leu Ile 420 425 430 Pro Gly Trp Thr Phe Asp Asn Lys Arg Pro Cys Met Val Arg 435 440 445 68446PRTPseudomonas stutzeri 68Met Thr Thr Ala Met Ser Gly Thr Pro Arg Ile Thr Glu Leu Thr Val 1 5 10 15 Val Pro Val Ala Gly Gln Asp Ser Met Leu Met Asn Leu Ser Gly Ala 20 25 30 His Gly Pro Trp Phe Thr Arg Asn Ile Leu Ile Leu Lys Asp Ser Ala 35 40 45 Gly His Val Gly Val Gly Glu Val Pro Gly Gly Glu Ala Ile Arg Gln 50 55 60 Thr Leu Asp Asp Ala Arg Ala Leu Leu Val Gly Glu Pro Ile Gly Gln 65 70 75 80 Tyr Asn Ala Leu Leu Gly Lys Val Arg Arg Ala Phe Ala Asp Arg Asp 85 90 95 Ala Gly Gly Arg Gly Leu Gln Thr Phe Asp Leu Arg Ile Ala Ile His 100 105 110 Ala Val Thr Ala Leu Glu Ser Ala Leu Leu Asp Leu Leu Gly Gln His 115 120 125 Leu Glu Val Pro Val Ala Ala Leu Leu Gly Glu Gly Gln Gln Arg Asp 130 135 140 Glu Val Glu Met Leu Gly Tyr Leu Phe Phe Ile Gly Asp Arg Asn Arg 145 150 155 160 Thr Asp Leu Gly Tyr Arg Asp Glu Ser Asn Ser Asp Asp Ala Trp Phe 165 170 175 Arg Val Arg Asn Glu Glu Ala Met Thr Pro Glu Arg Ile Val Arg Gln 180 185 190 Ala Glu Ala Ala Tyr Glu Arg Tyr Gly Phe Lys Asp Phe Lys Leu Lys 195 200 205 Gly Gly Val Leu Arg Gly Glu Glu Glu Val Glu Ala Ile Arg Ala Leu 210 215 220 Ala Gln Arg Phe Pro Asp Ala Arg Val Thr Leu Asp Pro Asn Gly Ala 225 230 235 240 Trp Ser Leu Asp Glu Ala Ser Gly Leu Cys Arg Asp Leu His Gly Val 245 250 255 Leu Ala Tyr Ala Glu Asp Pro Cys Gly Ala Glu Asn Gly Tyr Ser Gly 260 265 270 Arg Glu Val Met Ala Glu Phe Arg Arg Ala Thr Gly Leu Pro Thr Ala 275 280 285 Thr Asn Met Ile Ala Thr Asp Trp Arg Gln Met Ser His Ala Val Cys 290 295 300 Leu His Ser Val Asp Ile Pro Leu Ala Asp Pro His Phe Trp Thr Met 305 310 315 320 Ala 
Gly Ser Val Arg Val Ala Gln Met Cys Ala Asp Phe Gly Leu Thr 325 330 335 Trp Gly Ser His Ser Asn Asn His Phe Asp Ile Ser Leu Ala Met Phe 340 345 350 Thr His Val Ala Ala Ala Ala Pro Gly Arg Val Thr Ala Ile Asp Thr 355 360 365 His Trp Ile Trp Gln Asp Gly Gln His Leu Thr Arg Glu Pro Leu Lys 370 375 380 Ile Val Ser Gly Lys Val Ala Val Pro Gln Lys Pro Gly Leu Gly Val 385 390 395 400 Glu Leu Asp Trp Asp Ala Leu Glu Gln Ala His Ala His Tyr Gln Glu 405 410 415 Lys Gly Leu Gly Ala Arg Asp Asp Ala Ile Ala Met Gln Tyr Leu Ile 420 425 430 Pro Asn Trp Thr Phe Asn Asn Lys Lys Pro Cys Met Val Arg 435 440 445 691341DNAEscherichia coli 69atgagttctc aatttacgac gcctgttgtt actgaaatgc aggttatccc ggtggcgggt 60catgacagta tgctgatgaa tctgagtggt gcacacgcac cgttctttac gcgtaatatt 120gtgattatca aagataattc tggtcacact ggcgtagggg aaattcccgg cggcgagaaa 180atccgtaaaa cgctggaaga tgcgattccg ctggtggtag gtaaaacgct gggtgaatac 240aaaaacgttc tgacgctggt gcgtaatact tttgccgatc gtgatgctgg tgggcgcggt 300ttgcagacat ttgacctacg taccactatt catgtagtta ccgggataga agcggcaatg 360ctggatctgc tggggcagca tctgggggta aacgtggcat cgctgctggg cgatggtcaa 420cagcgtagcg aagtcgaaat gctcggttat ctgttcttcg tcggtaatcg caaagccacg 480ccgctgccgt atcaaagcca gccggatgac tcatgcgact ggtatcgcct gcgtcatgaa 540gaagcgatga cgccggatgc ggtggtgcgc ctggcggaag cggcatatga aaaatatggc 600ttcaacgatt tcaaactgaa gggcggtgta ctggccgggg
aagaagaggc cgagtctatt 660gtggcactgg cgcaacgctt cccgcaggcg cgtattacgc tcgatcctaa cggtgcctgg 720tcgctgaacg aagcgattaa aatcggtaaa tacctgaaag gttcgctggc ttatgcagaa 780gatccgtgtg gtgcggagca aggtttctcc gggcgtgaag tgatggcaga gttccgtcgc 840gcgacaggtc taccgactgc aaccaatatg atcgccaccg actggcggca aatgggccat 900acgctctccc tgcaatccgt tgatatcccg ctggcggatc cgcatttctg gacaatgcaa 960ggttcggtac gtgtggcgca aatgtgccat gaatttggcc tgacctgggg ttcacactct 1020aacaaccact tcgatatttc cctggcgatg tttacccatg ttgccgccgc tgcaccgggt 1080aaaattactg ctattgatac gcactggatt tggcaggaag gcaatcagcg cctgaccaaa 1140gaaccgtttg agatcaaagg cgggctggta caggtgccag aaaaaccggg gctgggtgta 1200gaaatcgata tggatcaagt gatgaaagcc catgagctgt atcagaaaca cgggcttggc 1260gcgcgtgacg atgcgatggg aatgcagtat ctgattcctg gctggacgtt cgataacaag 1320cgcccgtgca tggtgcgtta a 1341701341DNAPseudomonas stutzeri 70atgaccaccg ccatgtcggg cacgccccgc atcaccgaac tcaccgtcgt gcccgtcgcc 60gggcaggaca gcatgctgat gaacctcagc ggcgcccatg ggccctggtt cacccgcaac 120atcctcatcc tcaaggacag cgccggccac gtcggcgtcg gcgaagtgcc gggcggcgaa 180gccatccgcc agaccctcga cgatgcccgt gccctgctgg tcggcgaacc gatcggccag 240tacaacgcgc tgctcggcaa ggtgcgccgc gccttcgccg accgtgacgc cggcggccgc 300ggcctgcaga ccttcgacct gcgcatcgcc attcacgccg tcaccgcgct ggagtcggcg 360ctgctcgacc tgctcggcca gcacctcgag gtgccggtcg ccgccttgct cggcgaaggc 420cagcagcgtg acgaagtgga aatgctcggc tacctgttct tcatcggcga tcgcaacagg 480accgacctcg gctaccgcga cgaatccaac tccgacgacg cctggtttcg cgtgcgcaac 540gaggaggcca tgacgccgga gcgcatcgtc cgccaggccg aggcggccta cgagcgctac 600ggcttcaagg acttcaagct caagggcggc gtactgcgcg gcgaagagga agtcgaggcg 660atccgcgccc tggcccagcg cttccccgac gcccgcgtga ctctggaccc caacggcgcc 720tggtcgctgg acgaagccag cggcctgtgt cgcgacctgc acggcgtgct ggcctatgcc 780gaagacccct gcggtgccga gaacggctat tccggccgcg aggtgatggc cgagttccgc 840cgcgccaccg gtctgcccac cgcgaccaac atgatcgcca ccgactggcg acagatgagt 900cacgcggtgt gcctgcactc ggtggacatc ccgctggccg acccgcactt ctggaccatg 960gccggctctg tgcgcgtggc gcagatgtgc gccgacttcg 
gcctgacctg gggttcgcac 1020tcgaacaacc acttcgacat ctccctggcg atgttcaccc acgtggcggc cgccgcgccg 1080ggtcgcgtca ccgccatcga cacccactgg atctggcagg acggccagca cctgacccgc 1140gagccgctga agatcgtcag cggcaaggtt gcggtgccgc agaagccggg gctgggcgtc 1200gagctggact gggatgccct ggagcaggcg catgcccact accaagagaa aggcctgggt 1260gcccgcgatg acgccatcgc catgcagtac ctgatcccca actggacctt caacaacaag 1320aagccgtgca tggtgcgctg a 134171256PRTGluconobacter oxydans 71Met Ser His Pro Asp Leu Phe Ser Leu Ser Gly Ala Arg Ala Leu Val 1 5 10 15 Thr Gly Ala Ser Arg Gly Ile Gly Leu Thr Leu Ala Lys Gly Leu Ala 20 25 30 Arg Tyr Gly Ala Glu Val Val Leu Asn Gly Arg Asn Ala Glu Ser Leu 35 40 45 Asp Ser Ala Gln Ser Gly Phe Glu Ala Glu Gly Leu Lys Ala Ser Thr 50 55 60 Ala Val Phe Asp Val Thr Asp Gln Asp Ala Val Ile Asp Gly Val Ala 65 70 75 80 Ala Ile Glu Arg Asp Met Gly Pro Ile Asp Ile Leu Ile Asn Asn Ala 85 90 95 Gly Ile Gln Arg Arg Ala Pro Leu Glu Glu Phe Ser Arg Lys Asp Trp 100 105 110 Asp Asp Leu Met Ser Thr Asn Val Asn Ala Val Phe Phe Val Gly Gln 115 120 125 Ala Val Ala Arg His Met Ile Pro Arg Gly Arg Gly Lys Ile Val Asn 130 135 140 Ile Cys Ser Val Gln Ser Glu Leu Ala Arg Pro Gly Ile Ala Pro Tyr 145 150 155 160 Thr Ala Thr Lys Gly Ala Val Lys Asn Leu Thr Lys Gly Met Ala Thr 165 170 175 Asp Trp Gly Arg His Gly Leu Gln Ile Asn Gly Leu Ala Pro Gly Tyr 180 185 190 Phe Ala Thr Glu Met Thr Glu Arg Leu Val Ala Asp Glu Glu Phe Thr 195 200 205 Asp Trp Leu Cys Lys Arg Thr Pro Ala Gly Arg Trp Gly Gln Val Glu 210 215 220 Glu Leu Val Gly Ala Ala Val Phe Leu Ser Ser Arg Ala Ser Ser Phe 225 230 235 240 Val Asn Gly Gln Val Leu Met Val Asp Gly Gly Ile Thr Val Ser Leu 245 250 255 72771DNAGluconobacter oxydans 72atgtctcacc cggatctgtt tagcttaagt ggcgcacgcg cattagttac tggtgcctct 60cgtggtattg gtttaaccct ggccaaaggt ttagcccgtt atggtgccga agtggtttta 120aatggccgta atgccgaaag cctggattct gcccaaagtg gctttgaagc cgaaggctta 180aaagcatcta ccgctgtgtt tgacgtgacc gatcaagatg cagtcattga cggcgtggca 240gcaattgaac gcgatatggg tccgattgat atcctgatca 
acaatgcggg cattcaacgc 300agagccccgt tagaagaatt ttctcgcaaa gactgggacg atctgatgag caccaacgtt 360aacgccgtgt tctttgtggg acaagccgtt gccagacaca tgattcctag aggtcgcggt 420aaaatcgtca acatctgttc agtgcagagc gaactggcaa gaccgggtat tgcaccttat 480accgccacaa aaggagccgt caaaaatctg accaaaggta tggccaccga ttggggtcgt 540catggtttac agattaatgg cttagcaccg ggctattttg ccaccgagat gaccgaacgc 600ttagttgccg acgaagaatt taccgactgg ttatgcaaac gcacccctgc aggcagatgg 660ggccaagttg aagaattagt aggcgcagcc gtgtttttaa gtagtagagc ctcaagcttc 720gtgaatggcc aagtcctgat ggttgatggt ggaattactg tgagcctgta a 77173371PRTEscherichia coli K-12 73Met His Arg Gln Ser Phe Phe Leu Val Pro Leu Ile Cys Leu Ser Ser 1 5 10 15 Ala Leu Trp Ala Ala Pro Ala Thr Val Asn Val Glu Val Leu Gln Asp 20 25 30 Lys Leu Asp His Pro Trp Ala Leu Ala Phe Leu Pro Asp Asn His Gly 35 40 45 Met Leu Ile Thr Leu Arg Gly Gly Glu Leu Arg His Trp Gln Ala Gly 50 55 60 Lys Gly Leu Ser Ala Pro Leu Ser Gly Val Pro Asp Val Trp Ala His 65 70 75 80 Gly Gln Gly Gly Leu Leu Asp Val Val Leu Ala Pro Asp Phe Ala Gln 85 90 95 Ser Arg Arg Ile Trp Leu Ser Tyr Ser Glu Val Gly Asp Asp Gly Lys 100 105 110 Ala Gly Thr Ala Val Gly Tyr Gly Arg Leu Ser Asp Asp Leu Ser Lys 115 120 125 Val Thr Asp Phe Arg Thr Val Phe Arg Gln Met Pro Lys Leu Ser Thr 130 135 140 Gly Asn His Phe Gly Gly Arg Leu Val Phe Asp Gly Lys Gly Tyr Leu 145 150 155 160 Phe Ile Ala Leu Gly Glu Asn Asn Gln Arg Pro Thr Ala Gln Asp Leu 165 170 175 Asp Lys Leu Gln Gly Lys Leu Val Arg Leu Thr Asp Gln Gly Glu Ile 180 185 190 Pro Asp Asp Asn Pro Phe Ile Lys Glu Ser Gly Ala Arg Ala Glu Ile 195 200 205 Trp Ser Tyr Gly Ile Arg Asn Pro Gln Gly Met Ala Met Asn Pro Trp 210 215 220 Ser Asn Ala Leu Trp Leu Asn Glu His Gly Pro Arg Gly Gly Asp Glu 225 230 235 240 Ile Asn Ile Pro Gln Lys Gly Lys Asn Tyr Gly Trp Pro Leu Ala Thr 245 250 255 Trp Gly Ile Asn Tyr Ser Gly Phe Lys Ile Pro Glu Ala Lys Gly Glu 260 265 270 Ile Val Ala Gly Thr Glu Gln Pro Val Phe Tyr Trp Lys Asp Ser Pro 275 280 285 Ala Val Ser Gly Met Ala Phe Tyr Asn 
Ser Asp Lys Phe Pro Gln Trp 290 295 300 Gln Gln Lys Leu Phe Ile Gly Ala Leu Lys Asp Lys Asp Val Ile Val 305 310 315 320 Met Ser Val Asn Gly Asp Lys Val Thr Glu Asp Gly Arg Ile Leu Thr 325 330 335 Asp Arg Gly Gln Arg Ile Arg Asp Val Arg Thr Gly Pro Asp Gly Tyr 340 345 350 Leu Tyr Val Leu Thr Asp Glu Ser Ser Gly Glu Leu Leu Lys Val Ser 355 360 365 Pro Arg Asn 370 74382PRTPseudomonas 74Met Leu Arg Gln Ala Ile Arg Thr Thr Leu Cys Gly Phe Val Ile Ala 1 5 10 15 Ala Ser Phe Gln Val Ala Ala Glu Thr Gln Arg Phe Pro Ser Glu Ala 20 25 30 Gly Gln Val Thr Val Lys Glu Ile Ala Ala Gly Leu Glu Asn Pro Trp 35 40 45 Gly Leu Ala Phe Leu Pro Asp Gly Glu His Met Leu Val Thr Glu Arg 50 55 60 Pro Gly Arg Leu Arg Leu Val Gly Leu Asp Gly Ser Arg Ser Glu Pro 65 70 75 80 Leu Ala Gly Val Pro Asp Val Phe Ala Arg Ala Gln Gly Gly Leu Leu 85 90 95 Asp Val Arg Leu Ser Pro Ala Phe Glu Gln Asp Arg Leu Val Tyr Leu 100 105 110 Ser Tyr Ala Glu Val Gly Glu Asp Gly Lys Ala Gly Thr Ala Val Gly 115 120 125 Arg Gly Arg Leu Asn Asp Asp Arg Ser Arg Leu Glu Asn Phe Glu Val 130 135 140 Ile Phe Arg Gln Leu Pro Lys Leu Ser Ser Gly Ile His Phe Gly Ser 145 150 155 160 Arg Leu Val Phe Ala Gly Asn Gly His Leu Phe Val Ala Leu Gly Glu 165 170 175 Asn Asn Gln Arg Ser Thr Ser Gln Asp Leu Asp Lys His Gln Gly Lys 180 185 190 Val Val Arg Ile Gly Leu Asp Gly Ser Val Pro Asp Asp Asn Pro Phe 195 200 205 Val Gly Arg Asp Gly Val Arg Pro Glu Ile Trp Ser Tyr Gly His Arg 210 215 220 Asn Gln Gln Gly Ala Ala Leu Asn Pro Trp Ser Gly Val Leu Trp Thr 225 230 235 240 His Glu His Gly Pro Arg Gly Gly Asp Glu Ile Asn Ile Pro Gln Ala 245 250 255 Gly Lys Asn Tyr Gly Trp Pro Leu Ala Thr His Gly Ile Asn Tyr Ser 260 265 270 Met Leu Pro Ile Pro Glu Ala Lys Gly Lys Thr Val Lys Gly Thr Glu 275 280 285 Pro Pro His His Val Trp Asp Lys Ser Pro Gly Ile Ser Gly Met Ala 290 295 300 Phe Tyr Asp Ala Glu Arg Phe Pro Ala Trp Gln His Ser Leu Phe Ile 305 310 315 320 Gly Ala Leu Val Asp Leu Ser Leu Ile Arg Leu Gln Leu Asp Gly Asp 325 330 335 Arg Ile Val Gly Glu 
Glu Arg Leu Leu Lys Asp Leu Asn Ala Arg Ile 340 345 350 Arg Asp Val Arg Val Gly Pro Asp Gly Phe Leu Tyr Leu Leu Thr Asp 355 360 365 Ala Ala Asp Gly Lys Leu Leu Gln Val Gly Leu Asp Ser Asn 370 375 380 75387PRTAchromobacter 75Met Gln Ser Arg Thr Ala Ala Ser Thr Arg Ala Ile Pro Leu Ile Leu 1 5 10 15 Ser Leu Ala Met Ala Phe Ala Ala Ala Pro Ala Val Ala Gln Ala Ala 20 25 30 Gln Glu Pro Pro Ser Ala Pro Ala Arg Val Thr Pro Val Val Gly Gly 35 40 45 Leu Asp His Pro Trp Ser Met Ala Phe Leu Pro Asp Gly Gly Ile Leu 50 55 60 Ile Thr Glu Arg Pro Gly Asn Leu Arg Leu Leu Arg Thr Pro Gly Gly 65 70 75 80 Leu Ser Lys Pro Leu Ser Gly Val Pro Gln Val Ala Ala Arg Gly Gln 85 90 95 Gly Gly Leu Leu Asp Val Ala Leu Ser Pro Asp Phe Ala Thr Asp Arg 100 105 110 Tyr Val Tyr Leu Ala Tyr Ala Glu Ser Asp Gly Asp Lys Ser Gly Thr 115 120 125 Ala Val Gly Arg Gly Arg Leu Ala Asp Asp Ala Ser Gly Leu Glu Gly 130 135 140 Phe Lys Val Leu Phe Arg Gln Glu Pro Lys Leu Ser Ser Gly Gln His 145 150 155 160 Phe Gly Ser Arg Leu Val Phe Asp Gly Lys Gly Tyr Leu Tyr Ile Ala 165 170 175 Leu Gly Glu Asn Asn Gln Arg Pro Thr Ala Gln Asp Leu Asp Lys Leu 180 185 190 Gln Gly Lys Val Val Arg Leu Lys Thr Asp Gly Ser Val Pro Ala Asp 195 200 205 Asn Pro Phe Val Gly Lys Pro Gly Ala Arg Pro Glu Ile Trp Ser Tyr 210 215 220 Gly His Arg Asn Pro Gln Gly Met Ala Leu Asn Pro Trp Thr Gly Glu 225 230 235 240 Leu Trp Glu Asn Glu His Gly Pro Arg Gly Gly Asp Glu Ile Asn Val 245 250 255 Val Lys Pro Gly Lys Asn Tyr Gly Trp Pro Leu Ala Thr Tyr Gly Ile 260 265 270 Asn Tyr Ser Gly Phe Ala Ile Pro Glu Ala Lys Gly Glu Thr Leu Pro 275 280 285 Gly Met Glu Pro Pro Ile His Trp Trp Pro Lys Ser Pro Ala Ile Ser 290 295 300 Gly Met Ala Phe Tyr Asp Ala Asp Arg Phe Pro Ala Trp Arg Asn Ser 305 310 315 320 Leu Phe Ile Gly Ala Leu Gly Asn Gln Asn Leu Ile Arg Leu Thr Val 325 330 335 Asp Gly Asn Arg Val Val Glu Lys Glu Arg Leu Leu Val Asp Arg Lys 340 345 350 Arg Arg Ile Arg Asp Val Arg Gln Gly Pro Asp Gly Tyr Val Tyr Val 355 360 365 Leu Thr Asp Ala Ser Pro Gly 
Glu Leu Leu Arg Val Ala Pro Ala Glu 370 375 380 Thr Gly Gly 385 76372PRTPseudomonas 76Met Asn Asn Pro Ile Arg Gly Leu Phe Cys Ala Leu Ala Leu Leu Ser 1 5 10 15 Ala Pro Met Leu Ala Pro Ser Ala Trp Ala Ser Ala Lys Val Glu Val 20 25 30 Leu Tyr Glu Gly Leu Glu His Pro Trp Ala Leu Ala Phe Leu Pro Asp 35 40 45 Ala Gln Gly Met Leu Ile Thr Glu Arg Arg Gly Ser Leu Arg Leu Leu 50 55 60 Asp Ala Gln Gly Lys Leu Ser Glu Pro Leu Ala Gly Val Pro Glu Val 65 70 75 80 Phe Ala Val Gly Gln Gly Gly Leu Leu Asp Val Val Leu Ser Pro Ser 85 90 95 Phe Ala Glu Asp Arg Leu Val Tyr Leu Ser Phe Ala Gln Ala Glu Gly 100 105 110 Asp Lys Ala Ala Thr Ser Val Gly Arg Gly Arg Leu Ser Glu Asp Leu 115 120 125 Arg Ser Leu Glu Asp Phe Lys Val Ile Phe Arg Gln Met Pro Ala Leu 130 135 140 Ser Ser Gly His His Phe Gly Ser Arg Leu Val Phe Asp Arg Asp Gly 145 150 155 160 Tyr Leu Phe Ile Ala Leu Gly Glu His Asn Gln Arg Pro Thr Ser Gln 165 170 175 Asp Leu Asp Lys Leu Gln Gly Lys Val Val Arg Leu Tyr Pro Asp Gly 180 185 190 Arg Ile Pro Asp Asp Asn Pro Phe Val Gly Arg Glu Gly Ala Arg Ala 195 200 205 Glu Ile Trp Ser Tyr Gly His Arg Asn Gln Gln Gly Ala Ala Leu Asn 210 215 220 Pro Trp Thr Gly Lys Leu Trp Thr His Glu His Gly Pro Arg Gly Gly 225 230 235 240 Asp Glu Val Asn Ile Pro Glu Ala Gly Lys Asn Tyr Gly Trp Pro Ile 245 250 255 Ala Thr His Gly Val Asn Tyr Ser Phe Leu Ala Ile Pro Glu Ala Glu 260 265 270 Gly Lys Glu Val Ala Gly Thr Glu Pro Pro His His Val Trp Lys Lys 275 280 285 Ser Pro Ala Ile Ser Gly Met Ala Phe Tyr Asp His Ala Arg Phe Pro 290 295 300 Ala Trp Gln His Ser Leu Phe Val Gly Ala Leu Ala Gly Ala Glu Leu 305 310 315 320 Ile Arg Leu Gln Leu Asn Gly Asp Lys Val Val Gly Glu Glu Arg Leu 325 330 335 Leu Gly Glu Arg Lys Ala Arg Ile Arg Asp Val Arg Val Gly Pro Asp 340 345 350 Gly Tyr Leu Tyr Leu Leu Thr Asp Ser Gly Lys Gly Gln Leu Leu Lys 355 360 365 Val Gly Leu Glu 370 77381PRTPseudomonas 77Met Leu Arg Ala Pro Trp Leu Val Thr Leu Thr Ala Ala Ala Leu Leu 1 5 10 15 Pro Leu Trp Ala His Ala Ala Ala Glu Gln Arg Phe Pro 
Ser Glu Glu 20 25 30 Gly Thr Leu Ile Val Asp Thr Leu Ala Asn Gly Leu Arg Asn Pro Trp 35 40 45 Ala Leu Ala Phe Leu Pro Gly Gly Lys Asp Met Leu Val Thr Glu Arg 50 55
60 Ala Gly Asn Leu Arg Leu Val Asn Ala Glu Gly Lys Val Gly Pro Ser 65 70 75 80 Ile Ser Gly Val Pro Lys Val Trp Ala Glu Gly Gln Gly Gly Leu Leu 85 90 95 Asp Val Ala Leu Ser Pro Glu Phe Gly Lys Asp Arg Thr Val Tyr Leu 100 105 110 Ser Tyr Ala Glu Glu Gly Ser Asp Gly Lys Ala Gly Thr Ala Val Gly 115 120 125 Arg Gly Gln Leu Ser Glu Asp Arg Ala Arg Leu Glu His Phe Thr Val 130 135 140 Ile Phe Arg Gln Leu Pro Lys Leu Ser Val Gly Asn His Phe Gly Ser 145 150 155 160 Arg Leu Val Phe Asp Arg Asn Gly Tyr Leu Phe Ile Ala Leu Gly Glu 165 170 175 Asn Asn Gln Arg Pro Thr Ala Gln Asp Leu Asp Lys Leu Gln Gly Lys 180 185 190 Val Val Arg Ile Leu Pro Asp Gly Glu Val Pro Lys Asp Asn Pro Phe 195 200 205 Val Gly Lys Asp Asn Val Arg Pro Glu Ile Trp Ser Tyr Gly His Arg 210 215 220 Asn Gln Gln Gly Ala Ala Leu Asn Pro Trp Thr Gly Gln Leu Trp Thr 225 230 235 240 His Glu His Gly Pro Arg Gly Gly Asp Glu Ile Asn Ile Pro Lys Pro 245 250 255 Gly Lys Asn Tyr Gly Trp Pro Ile Ala Thr His Gly Ile Asn Tyr Ser 260 265 270 Leu Leu Pro Ile Pro Glu Ala Lys Gly Glu His Val Asp Gly Met Val 275 280 285 Asp Pro His His Val Trp Glu Lys Ser Pro Gly Ile Ser Gly Met Ala 290 295 300 Phe Tyr Asp Ser Pro Thr Phe Lys Ala Trp Asp His Asn Leu Phe Ile 305 310 315 320 Gly Ala Leu Ala Thr Gln Glu Leu Ile Arg Leu Gln Leu Glu Gly Asp 325 330 335 Lys Val Val His Glu Glu Arg Leu Leu Gly Asp Leu Lys Ala Arg Ile 340 345 350 Arg Asp Val Arg Met Gly Pro Asp Gly Tyr Leu Tyr Val Leu Thr Asp 355 360 365 Asp Lys Asp Gly Ala Leu Leu Lys Val Gly Leu Ala Asp 370 375 380 78375PRTCitrobacter 78Met Arg Arg Ser Leu Ile Pro Leu Met Thr Leu Leu Ile Phe Pro Trp 1 5 10 15 Phe Ser Gln Ala Glu Thr Pro Ala Val Asn Val Glu Val Leu Gln Thr 20 25 30 Lys Leu Asp His Pro Trp Ala Leu Ala Phe Leu Pro Gly Asp Asn Gly 35 40 45 Met Leu Ile Thr Leu Arg Gly Gly Gln Leu Arg His Trp Gln Ala Asp 50 55 60 Lys Gly Leu Ser Asp Pro Ile Pro Gly Val Pro Thr Val Trp Ala Ser 65 70 75 80 Gly Gln Gly Gly Leu Leu Asp Val Ala Leu Ala Pro Asp Phe Ser Gln 85 90 95 Ser Arg Arg Val 
Trp Leu Ser Phe Ala Gln Ala Asp Ala Gln Gly Asn 100 105 110 Ala Gly Thr Val Val Gly Tyr Gly Arg Leu Ser Asp Asp Leu Ser Arg 115 120 125 Leu Glu Asn Phe Gln Thr Val Phe Arg Gln Met Pro Lys Leu Ser Thr 130 135 140 Gly Asn His Phe Gly Gly Arg Leu Val Phe Asp Gly Asn Gly Tyr Leu 145 150 155 160 Phe Ile Gly Leu Gly Glu Asn Asn Gln Arg Pro Thr Ala Gln Asp Leu 165 170 175 Asp Lys Leu Gln Gly Lys Val Val Arg Leu Thr Asp Gln Gly Lys Ile 180 185 190 Pro Pro Asp Asn Pro Phe Val Asn Gln Pro Gly Ala Arg Pro Glu Ile 195 200 205 Trp Ser Tyr Gly Ile Arg Asn Pro Gln Gly Met Ala Met Asn Pro Trp 210 215 220 Ser Asp Thr Leu Trp Leu Asn Glu His Gly Pro Arg Gly Gly Asp Glu 225 230 235 240 Ile Asn Ile Pro Glu Lys Gly Lys Asn Tyr Gly Trp Pro Leu Ala Thr 245 250 255 Trp Gly Ile Asn Tyr Ser Gly Phe Lys Ile Pro Glu Ala Gln Gly Glu 260 265 270 Lys Val Ala Gly Thr Glu Gln Pro Ile Phe Tyr Trp Gln Lys Ser Pro 275 280 285 Ala Val Ser Gly Met Ala Phe Tyr Asp His Asp Thr Phe Pro Gln Trp 290 295 300 Arg Gln Lys Leu Phe Leu Gly Ala Leu Lys Asp Gln Asn Val Ile Val 305 310 315 320 Met Asn Val Asn Gly Asn Thr Val Thr Glu Glu Gly Arg Ile Leu Gly 325 330 335 Glu Arg Lys Gln Arg Ile Arg Asp Val Arg Val Gly Pro Asp Gly Tyr 340 345 350 Leu Tyr Val Leu Thr Asp Glu Ser Asp Gly Glu Leu Leu Lys Val Ser 355 360 365 Pro Arg Ser Ala Gly Asn Pro 370 375 791059DNAEscherichia coli K-12 79atggcaccag caaccgtgaa tgtggaagtt ctgcaggata aactggatca tccgtgggca 60ctggcatttt taccggataa ccatggcatg ctgattaccc tgcgtggtgg tgaactgcgt 120cattggcaag caggtaaagg tttaagcgca ccgttaagtg gtgttccgga tgtttgggca 180catggtcaag gtggtctgtt agatgtggtt ttagcaccgg attttgcaca gtctcgtcgt 240atttggctga gctacagcga agttggcgat gatggtaaag caggtaccgc agtgggttat 300ggtcgtctga gcgatgatct gagcaaagtt accgattttc gtaccgtgtt tcgccaaatg 360ccgaaactga gcaccggcaa ccattttggc ggtcgtctgg tttttgatgg taaaggttat 420ctgtttatcg cgctgggcga aaacaatcag cgtccgaccg cacaggatct ggataaactg 480cagggcaaac tggttcgtct gaccgatcaa ggcgaaattc cggatgataa tccgttcatc 540aaagaaagcg gtgcgcgtgc 
ggaaatttgg agctatggta ttcgcaaccc gcagggtatg 600gcaatgaatc cgtggagtaa tgcattatgg ctgaacgaac atggtccgcg tggtggtgat 660gaaatcaata ttccgcagaa aggcaaaaac tacggctggc cgctggcaac ctggggtatc 720aattatagcg gctttaaaat cccggaagcg aaaggcgaaa ttgtggcagg taccgaacag 780ccggtgttct actggaaaga ttctccggcg gtttctggta tggcgtttta taatagcgac 840aaattcccgc agtggcagca gaaactgttt attggtgcgc tgaaagataa agacgtgatc 900gtgatgagcg tgaacggcga caaagtgacc gaagatggcc gcattctgac cgatcgtggt 960cagcgtattc gtgatgtgcg taccggtcca gatggttacc tgtatgtgct gaccgatgaa 1020agtagtggtg aattactgaa agtgagcccg cgcaattaa 1059801080DNAPseudomonas 80atgacccagc gttttccgag tgaagcaggt caagttaccg tgaaagaaat tgcggcaggt 60ctggaaaatc cgtggggtct ggcattttta ccggatggcg aacacatgct ggttaccgaa 120cgtccaggtc gtttacgttt agttggtctg gatggttctc gtagtgaacc gttagcaggt 180gttccggatg tttttgcacg tgcacaaggt ggtttactgg atgttcgttt aagcccggcg 240tttgaacagg atcgtctggt ttatctgagc tacgcggaag ttggcgaaga tggtaaagcg 300ggtaccgcag ttggtcgtgg tcgtctgaat gatgatcgtt ctcgtctgga aaactttgaa 360gtgattttcc gccagctgcc gaaactgagt agcggcattc attttggtag tcgtctggtt 420tttgcgggta acggccatct gtttgttgca ctgggtgaaa acaatcagcg ttctaccagc 480caggatctgg acaaacatca gggcaaagtg gtgcgcatcg gcctggatgg ttctgttccg 540gatgataacc cgtttgttgg tcgtgatggt gttcgtccgg aaatttggag ctatggtcat 600cgtaatcagc aaggtgctgc attaaatccg tggagtggtg tgttatggac ccatgaacat 660ggtccgcgtg gtggtgatga aatcaatatt ccgcaagcag gcaaaaacta cggctggccg 720ctggcaactc atggcattaa ctacagcatg ctgccgattc cagaagcgaa aggcaaaacc 780gtgaaaggta ccgaaccgcc acatcatgtt tgggataaat ctccgggtat tagcggtatg 840gcgttttatg atgcggaacg ctttccggca tggcaacatt ctctgtttat tggtgcgctg 900gttgatctga gcctgattcg tctgcagctg gatggtgatc gtattgtggg cgaagaacgt 960ctgctgaaag atctgaatgc gcgtattcgc gatgtgcgtg ttggtccaga tggtttcctg 1020tatctgctga ctgatgcagc tgatggtaaa ctgctgcagg ttggcctgga tagcaattaa 1080811074DNAAchromobacter 81atggcacaag aaccaccatc tgcaccagca cgtgttactc cagttgttgg cggtctggat 60catccatgga gtatggcatt tttaccggat ggcggtattc tgattaccga 
acgtccgggt 120aatttacgtc tgctgcgtac cccaggtggt ctgagtaaac cgttaagtgg tgttccgcaa 180gttgcagcac gtggtcaagg tggtttactg gatgttgctt taagcccgga ttttgcaacc 240gatcgctatg tgtatctggc ctatgccgaa tctgatggcg ataaatctgg taccgcagtt 300ggtcgtggtc gtttagctga tgatgcaagt ggtctggaag gcttcaaagt gctgtttcgt 360caagaaccga aactgagcag cggccagcat tttggctctc gtctggtttt cgatggtaaa 420ggctatctgt atatcgcgct gggcgaaaac aatcaacgtc cgaccgcaca ggatctggat 480aaattacagg gcaaagtggt gcgcctgaaa accgatggtt ctgttccggc agataacccg 540tttgtgggta aaccaggtgc acgtccggaa atttggtctt atggtcatcg taatccgcag 600ggtatggcgt taaatccgtg gactggtgaa ttatgggaaa acgaacatgg tccgcgtggt 660ggcgacgaaa ttaatgttgt taaaccgggc aaaaactacg gttggccgct ggcgacctat 720ggcatcaact atagcggttt cgcaattcca gaagcgaaag gcgaaacctt accgggtatg 780gaaccaccga ttcattggtg gccgaaatct ccggcaatta gtggtatggc gttttatgat 840gcagatcgct ttccggcgtg gcgtaattct ctgtttattg gtgcactggg taatcaaaac 900ctgatccgcc tgaccgtgga tggcaatcgt gtggtggaaa aagaacgttt actggtggac 960cgcaaacgcc gtattcgtga tgttcgtcaa ggtccggatg gctatgtgta tgttctgacc 1020gatgcaagtc cgggtgaatt actgcgtgtt gcaccggctg aaactggtgg ttaa 1074821047DNAPseudomonas 82atgagcgcga aagtggaagt gctgtatgaa ggcctggaac atccgtgggc attagcattt 60ctgccggatg cacaaggtat gctgattacc gaacgtcgtg gtagtttacg tctgctggat 120gcacagggta aactgagtga accgttagca ggtgttccgg aagtttttgc agttggtcaa 180ggtggtctgc tggatgttgt tttaagcccg agctttgcag aagatcgtct ggtgtatctg 240agctttgcac aggcggaagg cgataaagcc gcaacctctg ttggtcgtgg tcgtttaagt 300gaagatctgc gtagtctgga agatttcaaa gtgatctttc gccagatgcc ggcactgtct 360agtggtcatc attttggcag ccgtctggtg tttgatcgtg atggctatct gttcattgcc 420ctgggcgaac ataatcaacg tccgacctct caggacctgg ataaactgca gggcaaagtg 480gtgcgcttat atccggatgg tcgtattccg gatgataacc cgtttgttgg tcgtgaaggt 540gcacgtgcgg aaatttggag ttatggtcat cgtaatcagc agggtgcagc attaaatccg 600tggaccggta aactgtggac ccatgaacat ggtccgcgtg gtggtgatga agtgaatatt 660ccggaagcag gcaaaaacta tggttggccg attgcgaccc atggtgtgaa ttacagcttt 720ctggcgattc cggaagcaga aggcaaagaa 
gttgcaggta ccgaaccgcc gcatcatgtt 780tggaaaaaaa gtccggcgat tagtggtatg gcgttctacg atcatgcgcg ttttccggca 840tggcagcata gtctgtttgt tggtgcatta gcaggtgcag aactgattcg tctgcagctg 900aatggcgata aagtggtggg tgaagaacgt ttactgggtg aacgtaaagc gcgtatccgc 960gatgtgcgtg ttggtccaga tggttatctg tatttactga ccgatagcgg caaaggtcaa 1020ctgctgaaag tgggcctgga atgataa 1047831080DNAPseudomonas 83atggcagaac agcgttttcc gagcgaagaa ggtaccctga ttgtggatac cctggcaaat 60ggtctgcgta atccatgggc actggcattt ttaccgggtg gtaaagatat gctggtgacc 120gaacgtgcag gtaatttacg tctggtgaat gcggaaggta aagttggtcc gagcattagc 180ggtgttccga aagtatgggc agaaggtcaa ggtggtctgc tggatgttgc attaagcccg 240gaattcggca aagatcgtac cgtttatctg agctacgccg aagaaggtag cgatggcaaa 300gcaggtactg cagttggtcg tggtcagtta tctgaagatc gtgcgcgttt agaacatttt 360accgtgattt ttcgccagct gccgaaactg tctgtgggca accattttgg cagccgtctg 420gtgtttgatc gtaacggcta cctgtttatt gcgctgggtg aaaacaacca acgtccgacc 480gcacaggatc tggataaact gcagggtaaa gtggtgcgca ttctgccgga tggtgaagtt 540ccgaaagata atccgtttgt tggtaaagat aatgtgcgtc cggaaatctg gagctacggt 600catcgcaacc agcaaggtgc ggcattaaat ccgtggaccg gtcaactgtg gacccatgaa 660catggtccgc gtggtggtga tgaaatcaat attccgaaac cgggtaaaaa ctatggttgg 720ccgatcgcga cccatggcat caattattct ctgctgccga ttccagaagc aaaaggtgaa 780catgtggatg gtatggttga tccgcatcat gtgtgggaaa aaagcccggg cattagcggt 840atggcgttct acgatagccc gaccttcaaa gcgtgggatc ataacctgtt tattggcgca 900ctggcaaccc aagaactgat tcgcctgcag ctggaaggtg ataaagtggt gcatgaagaa 960cgtctgttag gtgatctgaa agcccgtatt cgtgatgttc gtatgggtcc ggatggttat 1020ctgtatgtgc tgaccgacga caaagatggt gcgctgctga aagtgggtct ggcggattaa 1080841071DNACitrobacter 84atggaaactc cggcggttaa cgtggaagtt ctgcagacca aactggatca tccgtgggca 60ctggcatttt taccgggtga taatggtatg ctgattaccc tgcgtggtgg tcaactgcgt 120cattggcaag cagataaagg cttaagcgat ccgattccgg gtgttccgac cgtttgggca 180agtggtcaag gtggtttatt agatgttgca ttagcgccgg attttagtca gagtcgtcgt 240gtttggctga gctttgcaca ggcagatgca caaggtaatg caggtaccgt tgtgggttat 300ggtcgtctga 
gcgatgattt aagccgtctg gaaaactttc agaccgtgtt ccgtcagatg 360ccgaaactga gcaccggcaa ccactttggt ggtcgtctgg tttttgatgg caacggttat 420ctgtttattg gtctgggcga aaacaatcag cgtccgaccg cacaggatct ggataaactg 480cagggtaaag ttgttcgtct gaccgatcag ggcaaaattc cgccggataa tccgtttgtg 540aatcagccgg gtgcacgtcc ggaaatttgg agctatggta ttcgtaaccc gcagggtatg 600gcgatgaatc cgtggagtga tacattatgg ctgaatgaac atggtccgcg tggtggtgat 660gaaatcaata ttccggaaaa aggcaaaaac tacggctggc cgctggcaac ctggggcatt 720aactatagcg gctttaaaat cccggaagcg cagggcgaaa aagtggcagg taccgaacaa 780ccgatctttt actggcagaa aagtccggca gttagcggta tggcgtttta tgatcatgat 840accttcccgc agtggcgtca gaaactgttt ttaggtgcac tgaaagatca gaacgtcatc 900gtgatgaacg tgaacggcaa caccgtgacc gaagaaggcc gcattctggg cgaacgtaaa 960cagcgcatcc gtgatgtccg tgttggtccg gatggttatc tgtatgtgct gaccgatgaa 1020agtgatggtg aattactgaa agtgagcccg cgttctgcag gtaatccgta a 1071

* * * * *

